# Questions tagged [machine-learning]

Implementation questions about machine learning algorithms. General questions about machine learning should be posted to their specific communities.

30,769 questions

**0**

votes

**0**answers

12 views

### sklearn SVC is training well but I can't predict

I'm training an SVM classifier using sklearn.
The features used to train the model have this shape:
[[0.04954833 0.09270993 0.08942692 ... 0.48863458 0.73213733 0.06461511]
[0.67277258 0....

**1**

vote

**0**answers

17 views

### Import a TensorFlow checkpoint and use it for predictions with Keras

I want to use a pre-trained GoogLeNet model to predict images' classes with Keras. I have downloaded the relevant TensorFlow files:
$ tree pretrained/
pretrained/
├── cifar
│ ├── checkpoint
│ ├── ...

**0**

votes

**0**answers

4 views

### Why can't Forward Stagewise Additive Modeling work with an absolute loss function?

In Forward Stagewise Additive Modeling, if the loss function is squared loss, the next weak learner fits to the residual error.
Why don't we do the same when the loss function is absolute error or ...
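To make the squared-loss statement in the excerpt concrete: with squared loss, adding a learner h to the current model F gives (y - F(x) - h(x))^2 = (r - h(x))^2, where r = y - F(x) is the residual, so the best next learner is a least-squares fit to r. The following is a minimal sketch of that idea only, not the asker's code. The one-split "stump" learner, the function names, and the toy data are all assumptions made for illustration.

```python
# Forward stagewise additive modeling with squared loss (illustrative sketch).
# Each stage fits a one-split regression "stump" to the current residuals.

def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) minimizing squared error."""
    best = None
    # Skip the largest x so that both sides of the split are non-empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    return best[1:]

def predict_stump(stump, x):
    t, lv, rv = stump
    return lv if x <= t else rv

def forward_stagewise(xs, ys, n_stages=20):
    """Add stumps one at a time, each fitted to the residuals y - F(x)."""
    preds = [0.0] * len(xs)
    stumps = []
    for _ in range(n_stages):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + predict_stump(stump, x) for p, x in zip(preds, xs)]
    return stumps, preds

def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

# Hypothetical toy data, roughly y = x with noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0]
_, preds_1 = forward_stagewise(xs, ys, n_stages=1)
_, preds_20 = forward_stagewise(xs, ys, n_stages=20)
print(mse(ys, preds_1), mse(ys, preds_20))
```

The training error shrinks as stages are added because every stage is a least-squares fit to what the previous stages left unexplained.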

**0**

votes

**0**answers

15 views

### Transforming a gzip file into npy file [duplicate]

For my ML model I need to open a gzip file and convert it to an array.
My code looks like this:

    def load_data(path):
        with np.load(path) as f:
            x_train, y_train = f['x_train'], f['...

**0**

votes

**0**answers

12 views

### Regression model / non-linear regression

I have a dataset with dimensions 96x100. The columns represent the water flow for 100 days, measured at 15-minute intervals starting at 5am. I tried running linear regression but it produces a variance ...

**1**

vote

**1**answer

10 views

### RandomizedSearchCV for XGBoost, imbalanced dataset and optimum iterations count (n_iter)

I am working on an imbalanced (9:1) binary classification problem and would like to use XGBoost & RandomizedSearchCV.
As shown in the code, there are 47,250,000 (5*7*5*5*5*5*6*4*9*10) combinations of ...
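The combination count quoted in the excerpt can be checked directly. This is a small sketch: the ten factors are copied from the product in the excerpt, while `n_iter = 100` is a purely hypothetical budget used to show how small a fraction of the grid a randomized search samples.

```python
import math

# Sizes of the ten hyperparameter value lists, as given in the excerpt.
grid_sizes = [5, 7, 5, 5, 5, 5, 6, 4, 9, 10]

total = math.prod(grid_sizes)
print(f"{total:,}")  # 47,250,000

# Hypothetical search budget: n_iter is the number of parameter settings
# that a randomized search samples from the full grid.
n_iter = 100
print(f"fraction of the grid sampled: {n_iter / total:.2e}")
```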

**0**

votes

**0**answers

8 views

### Why does using RMSE as the loss function in logistic regression give a non-convex form, but not in linear regression?

I am taking the deep learning course from Andrew Ng. In the 3rd lecture of the 2nd week of the first course, he mentions that we can use RMSE for logistic regression as well but it will take a nonconvex ...

**-1**

votes

**1**answer

12 views

### Why is PyCharm giving an error while loading a .joblib file?

I have a trained decision tree model file music-recommender.joblib. When I use Jupyter notebook, I can load this trained model successfully and make predictions. But the same code I ...

**0**

votes

**0**answers

18 views

### For a reinforcement learning model implementation, can the reward system be different for training and evaluation?

I am trying to create a reinforcement learning model for valuing a company based on its financials. In evaluation I will use the trend in the financials to reward the agent. While in ...

**1**

vote

**0**answers

16 views

### Convert the following JSON to CSV

I have 3 GB of JSON data that I want to convert to CSV format using Python. The code I have written converts the data to CSV but stores it in a single cell. I don't want the "...

**1**

vote

**1**answer

29 views

### Multiple linear regression with gradient descent

Hello,
I'm new to machine learning and Python, and I want to predict the Kaggle House Sales in King County dataset with my own gradient descent.
I'm splitting the data into 70% (15k rows) training and 30% (6k rows) ...

**-1**

votes

**0**answers

6 views

### In a multivariate time series model (sales demand prediction), how do I use the features of time t to predict the sales demand at time t?

I am currently working on sales demand prediction for a product using
time series analysis.
Here, I am taking some features such as (day of the week, day of the month, price, price_discount, sales) ...

**1**

vote

**1**answer

21 views

### How can I correct sample_weight in sklearn.naive_bayes?

I'm implementing Naive Bayes with sklearn on imbalanced data.
My data has more than 16k records and 6 output categories.
I tried to fit the model with the sample_weight calculated by sklearn.utils....

**1**

vote

**0**answers

18 views

### Extract FAQ content from websites with different domain names

Currently, I use Scrapy and bs4 to crawl the FAQ content of individual websites.
However, as different websites format their HTML structures differently, I will need to adjust the tags ...

**1**

vote

**0**answers

22 views

### Retrieve words based on a given context. For example, given a job description, I need to find the words related to the skills

I have a paragraph of job description as below:
JAVA SOLUTION ARCHITECT – CONTRACT – DALLAS, TX A leading consulting firm is looking to bring on a Java Solution Architect to help them deliver a ...

**0**

votes

**0**answers

12 views

### Backward Difference Encoding with probability [on hold]

Backward encoding for K ordinal categorical data produces k-1 binary categories and the mean of the dependent variable for a level is compared with the mean of the dependent variable for the prior ...

**0**

votes

**0**answers

26 views

### What exactly does “.fit()” do? [duplicate]

I'm pretty new to machine learning and want to know what exactly the .fit() function does.
No background information.
fitmod = mod.fit(drug2[["DUQ219", "DUQ272", "DUQ352", "DUQ410"]], drug2["DUQ352"])...

**1**

vote

**0**answers

13 views

### How to save a trained ARIMA model to use later

I am using a univariate time series dataset for forecasting, and I train an ARIMA model on it. But retraining every time is time-consuming. Is there any way to save the trained ARIMA model ...

**-1**

votes

**0**answers

17 views

### How to obtain an image by providing its meaning in deep learning? Word-to-image deep learning

I'm creating an iOS application that converts a word typed in a text field to the corresponding image.
I want to create an ML model that returns an image given its meaning (just a simple word) but ...

**-2**

votes

**0**answers

12 views

### Convert an SVM to decision tree(s), is it possible?

So, let's say I already inferred an SVM and its parameters from a given dataset, so I already have a discriminative classifier set up.
I would like to create an equivalent classifier based on a ...

**-1**

votes

**1**answer

11 views

### Pre-Modeling feature selection

I began a segmentation exercise, where we would like to cluster users with similar characteristics in their respective groups. We have 100s of features to consider, some are obvious like age and geo ...

**-1**

votes

**0**answers

8 views

### Do TFLite models usually contain fewer operations?

I was wondering if a tflite model should contain fewer operations than the original tf model. Clearly, tflite models are smaller in size, but I believe that's (mostly) because of the use of ...

**1**

vote

**0**answers

20 views

### StandardScaler vs. MinMax: StandardScaler does not work for ANN but MinMax does

I have a dataset with multiple features and a target. I am using an ANN to predict. When I scale the features using MinMax, everything works fine and during compile and fit I get good loss and ...

**1**

vote

**0**answers

19 views

### Initialize neural network weights with TensorFlow

I am developing a neural network model using TensorFlow. In the LOSO cross validation, I need to train a model for 10 folds, since I have data from 10 different subjects.
Taking this into account, I ...

**-2**

votes

**0**answers

12 views

### Recommendation to classify issues based on error messages

I would like to categorize issues based on the error messages and platform. Platform and Error Message are independent variables and Issue is the dependent variable (please see attachment). I have some ...

**4**

votes

**1**answer

43 views

### How to rewrite a TensorFlow graph to use CPU for all operations

I've trained a network on a multi-GPU and CPU setup, and saved the resulting model as a TensorFlow SavedModel. I then have another script which can load the resulting model and run the required ops to ...

**-4**

votes

**0**answers

16 views

### Simple questions about regression - ML

Could you please provide 2-3 sentences for each of the following questions?
1) Describe any optimization criterion that could be used for estimating parameters of a linear regression model.
2) Describe any ...

**1**

vote

**0**answers

14 views

### Can I use a trained ML model in Django?

How can I use a trained ML model in Django and run it on values that a user provides in a form?

**-2**

votes

**1**answer

18 views

### Odd behavior of cost over time with SGD

I am relatively new to ML/DL and have been trying to improve my skills by making a model that learns the MNIST data set without TF or Keras. I have 784 input nodes, 2 hidden layers of 16 neurons each, ...

**-1**

votes

**1**answer

33 views

### How to define the target variable for linear regression

I want to perform regression analysis on a dataset of dimensions 96x100. The columns represent the values for each of the 100 days, while the independent variable is the time. How can I perform linear ...

**1**

vote

**1**answer

17 views

### Manually computed AIC differs from statsmodels AIC

I tried to manually code a formula for the AIC. I want to use it in connection with scikit-learn. To test whether I coded it correctly, I compared the AIC values from statsmodels on the same datasets. ...
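For reference, the textbook definition is AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood. The sketch below is one possible manual computation for a least-squares fit with Gaussian errors; it is not the asker's code, and the function names and residual values are hypothetical. Manual results commonly differ from a library's because of how k is counted (with or without the intercept or the error variance) or because constant terms are dropped from the log-likelihood, so the library's convention should be checked against its documentation.

```python
import math

def gaussian_log_likelihood(residuals):
    """Maximized Gaussian log-likelihood, using the ML variance estimate RSS / n."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    sigma2 = rss / n
    return -0.5 * n * (math.log(2 * math.pi) + math.log(sigma2) + 1)

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical residuals from some fitted regression.
residuals = [0.5, -0.3, 0.1, -0.4, 0.1]
llf = gaussian_log_likelihood(residuals)

# Counting one extra parameter shifts the AIC by exactly 2.
print(aic(llf, n_params=2), aic(llf, n_params=3))
```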

**1**

vote

**0**answers

19 views

### Mismatch in the whole data when splitting data

I've just been reading up on train_test_split and have realized that I'm inadvertently leaking data with my current preprocessing setup due to my history-based function create_dataset() and ...

**1**

vote

**0**answers

16 views

### Macro metrics (recall/F1…) for multiclass CNN

I use a CNN for image classification on an unbalanced dataset. I'm totally new to the Keras backend. It's a multiclass problem (not multilabel) and I have 16 classes.
I want to compute MACRO metrics for each ...

**-3**

votes

**0**answers

16 views

### Back Propagation: How is cost applied?

For a single example from the training set, a cost function sums the output errors for all outputs, I believe. In principle, how is this cost applied?
Is the cost propagated back for each example or ...

**-1**

votes

**0**answers

9 views

### Anomaly detection algorithm performance measurements

I am developing an algorithm for an anomaly detection problem.
I have run into a problem estimating the accuracy of the algorithm.
The algorithm is working well but it recognizes the anomaly with ...

**1**

vote

**0**answers

9 views

### Generating Batches for Word Encoders

I am confused as to what skip_window does in this situation. Apparently it is not equal to the batch_size. Additionally, I don't know what the second for loop does - if skip_window is a scalar, then ...

**1**

vote

**0**answers

21 views

### Is there a lightweight Python module to load pre-fitted ML modules and perform prediction?

I am implementing a machine learning module that should run on a Raspberry Pi that at the moment is shared among different services.
My idea is to store on the device only the code in charge of ...

**1**

vote

**0**answers

19 views

### Why is 10-fold cross validation even faster than a 1-fold fit when using LGB?

I am using LGB to handle a machine learning task. But I found that when I use the sklearn API cross_val_score and set cv=10, the time cost is less than for a single-fold fit. I split the dataset using ...

**-1**

votes

**0**answers

15 views

### Is there any way I could identify sentence patterns?

I want to know if any library provides a way to identify, or set some rules on top of POC to identify, sentence patterns such as the subject, verb, object (not just these three), etc....

**2**

votes

**0**answers

14 views

### Using xgboost in Azure ML environment

I cannot get the xgboost package to work in the Azure Machine Learning Studio interpreter. I am trying to import a model that I trained using xgboost in order to deploy it here. But it seems that my ...

**1**

vote

**0**answers

32 views

### Neural network does not predict properly with Count Vectorizer

I'm trying to do a Sentiment Analysis prediction using the text and the scores of random IMDB reviews. I turned all the words into a Bag Of Words and put it all in a neural network. The prediction ...

**-2**

votes

**0**answers

5 views

### My machine learning model uses linear regression and gives a high error rate. How do I minimize the MSE?

If the MSE on the training set is 21.69 and on the test set is 17.68, how can I minimize it? Is this possible using validation or a learning curve? Or can I add more data?

**2**

votes

**0**answers

29 views

### How to handle loss function and log probabilities for neural network with multiple outputs?

I've implemented a custom environment in the style of OpenAI Gym environments in which I have shapes (circles, squares etc.) that I can move on a plane within a boundary. I want to apply reinforcement ...

**1**

vote

**0**answers

28 views

### Reset TensorFlow Adam optimizer

I am performing K-fold cross validation on a neural network model with TensorFlow, so I need to train the model k times.
I would like to know how to reset the state of the tf.train.AdamOptimizer so I ...

**3**

votes

**1**answer

40 views

### Algorithm to classify instances from a dataset similar to another smaller dataset, where this smaller dataset represents a single class

I have a dataset that represents instances from a binary class. The twist here is that there are only instances from the positive class and I have none of the negative one. Or rather, I want to ...

**-2**

votes

**1**answer

31 views

### Is it possible to create synthetic data from the existing one? [on hold]

Generally, the problem is that I have a small dataset (300 instances). I have tried most of the machine learning algorithms, and now I wanted to apply some deep learning. So the problem is the size of ...

**1**

vote

**1**answer

29 views

### Bigger batch size reduces training time

I'm using a CNN for image classification; I do data augmentation with Keras ImageDataGenerator.
I think I'm missing something.
A /// train =model.fit_generator(image_gen.flow(train_X, train_label, ...

**1**

vote

**1**answer

24 views

### How to prepare the multilevel multivalued training dataset in python

I am a beginner in machine learning. My academic project involves detecting human posture from acceleration and gyro data. I am stuck at the beginning itself. My accelerometer data has x,y,z values ...

**1**

vote

**0**answers

38 views

### How to reduce position changes after dimensionality reduction?

Disclaimer: I'm a machine learning beginner.
I'm working on visualizing high-dimensional data (text as tf-idf vectors) in 2D space. My goal is to label/modify those data points and recomputing ...

**1**

vote

**0**answers

25 views

### Unsupervised learning - clustering numpy arrays within numpy arrays

We're working with a dataset of spoken numbers. The wav files are converted to MFCC values. Each row (wav file) consists of around 20 to 40 (depending on the length of the sound file) arrays, with 13 ...