Questions tagged [machine-learning]

Implementation questions about machine learning algorithms. General questions about machine learning should be posted to their specific communities.

0
votes
0answers
12 views

sklearn SVC trains well but I can't predict

I'm training an SVM classifier using sklearn. The features used to train the model have this shape: [[0.04954833 0.09270993 0.08942692 ... 0.48863458 0.73213733 0.06461511] [0.67277258 0....
1
vote
0answers
17 views

Import a TensorFlow checkpoint and use it for predictions with Keras

I want to use a pre-trained GoogLeNet model to predict images' classes with Keras. I have downloaded the relevant TensorFlow files: $ tree pretrained/ pretrained/ ├── cifar │   ├── checkpoint │   ├── ...
0
votes
0answers
4 views

Why can't Forward Stagewise Additive Modeling work with absolute loss function?

In Forward Stagewise Additive Modeling, if the loss function is squared loss, the next weak learner fits to the residual error. Why can't we do the same when the loss function is absolute error or ...
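For reference, the squared-loss case mentioned in the excerpt can be written out as a short derivation. The notation is an assumption borrowed from the usual textbook presentation, not from the question: $f_{m-1}$ is the current model, $b(x;\gamma)$ the weak learner, $\beta$ its coefficient, and $r_{im}$ the residual of the current model on sample $i$.

```latex
\begin{aligned}
L\bigl(y_i,\; f_{m-1}(x_i) + \beta\, b(x_i;\gamma)\bigr)
  &= \bigl(y_i - f_{m-1}(x_i) - \beta\, b(x_i;\gamma)\bigr)^2 \\
  &= \bigl(r_{im} - \beta\, b(x_i;\gamma)\bigr)^2,
  \qquad r_{im} = y_i - f_{m-1}(x_i).
\end{aligned}
```

Under squared loss, the stagewise step therefore reduces to a least-squares fit of the weak learner to the current residuals.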
0
votes
0answers
15 views

Transforming a gzip file into npy file [duplicate]

For my ML model I need to open a gzip file and convert it to an array. My code looks like this: def load_data(path): with np.load(path) as f: x_train, y_train = f['x_train'], f['...
0
votes
0answers
12 views

Regression model/ non-linear regression

I have a dataset with dimension 96x100. The columns represent the water flow for 100 days at times starting at 5am with interval of 15min. I tried running linear regression but it produces a variance ...
1
vote
1answer
10 views

RandomizedSearchCV for XGBoost, imbalanced dataset and optimum iterations count (n_iter)

I am working on an imbalanced (9:1) binary classification problem and would like to use XGBoost & RandomizedSearchCV. As shown in code there are 47,250,000 (5*7*5*5*5*5*6*4*9*10) combinations of ...
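As a quick check of the search-space size quoted in the excerpt, the product of the ten per-parameter option counts can be computed directly. This is only a sketch: the `n_iter` value of 100 is a hypothetical figure, not one taken from the question, and it is there just to show how small a share of the grid a randomized search samples.

```python
import math

# Number of candidate values per hyperparameter, as listed in the excerpt.
option_counts = [5, 7, 5, 5, 5, 5, 6, 4, 9, 10]

# Size of the full grid is the product of the per-parameter counts.
total_combinations = math.prod(option_counts)
print(total_combinations)  # 47250000

# Hypothetical n_iter, chosen only for illustration.
n_iter = 100
fraction_sampled = n_iter / total_combinations
print(f"fraction of grid sampled: {fraction_sampled:.2e}")
```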
0
votes
0answers
8 views

Why does using RMSE as the loss function give a non-convex form in logistic regression but not in linear regression?

I am taking this deep learning course from Andrew Ng. In the 3rd lecture of 2nd week of the first course, he mentions that we can use RMSE for logistic regression as well but it will take a nonconvex ...
-1
votes
1answer
12 views

Why does PyCharm raise an error while loading a .joblib file?

I have a trained decision tree model file music-recommender.joblib. When I am using Jupyter notebook, I am able to load this trained model successfully and make predictions. But the same code I ...
0
votes
0answers
18 views

For a reinforcement learning model implementation can the reward system be different for training and evaluation?

I am trying to create a reinforcement learning model for valuing a company based on its financials. In evaluation I will use the financials trend to reward the agent. While in ...
1
vote
0answers
16 views

Convert the following JSON to CSV

There is 3 GB of JSON data that I want to convert to CSV format using Python. The code I have written converts the data to CSV but stores it in a single cell. I don't want the "...
1
vote
1answer
29 views

Multiple linear regression with gradient descent

Hello, I'm new to machine learning and Python and I want to predict the Kaggle House Sales in King County dataset with my gradient descent. I'm splitting 70% (15k rows) training and 30% (6k rows) ...
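A minimal sketch of batch gradient descent for multiple linear regression, in pure Python, may help frame the approach described in the excerpt. Everything here is an assumption made for illustration: the toy data (generated from y = 3*x1 - 2*x2 + 1), the learning rate, and the iteration count are hypothetical and unrelated to the Kaggle dataset, and the 70/30 split is left out.

```python
def predict(weights, bias, row):
    """Linear model: dot(weights, row) + bias."""
    return sum(w * x for w, x in zip(weights, row)) + bias


def gradient_descent(X, y, learning_rate=0.05, n_iters=5000):
    """Minimize mean squared error with full-batch gradient descent."""
    n_samples, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(n_iters):
        # Prediction error for every sample under the current parameters.
        errors = [predict(weights, bias, row) - target
                  for row, target in zip(X, y)]
        # Gradients of the MSE with respect to each weight and the bias.
        grad_w = [2.0 / n_samples * sum(e * row[j] for e, row in zip(errors, X))
                  for j in range(n_features)]
        grad_b = 2.0 / n_samples * sum(errors)
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


# Hypothetical toy data generated from y = 3*x1 - 2*x2 + 1 (no noise).
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [1.0, 2.0]]
y = [3 * a - 2 * b + 1 for a, b in X]

weights, bias = gradient_descent(X, y)
print(weights, bias)
```

On real data with features on very different scales (such as house prices and square footage), the features usually need to be standardized first, or the learning rate made much smaller, for the updates to stay stable.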
-1
votes
0answers
6 views

In a multivariate time series model (sales demand prediction), how to use the features at time t to predict the sales demand at time t?

I am currently working on a sales demand prediction of a product using time series analysis. Here, I am taking some features such as (day of the week, day of the month, price, price_discount, sales) ...
1
vote
1answer
21 views

How can I correct sample_weight in sklearn.naive_bayes?

I'm implementing Naive Bayes with sklearn on imbalanced data. My data has more than 16k records and 6 output categories. I tried to fit the model with the sample_weight calculated by sklearn.utils....
1
vote
0answers
18 views

Extract FAQ content from websites of different domain name

Currently, I have used Scrapy and bs4 to crawl the FAQ content of individual websites. However, as different websites format their HTML structures differently, I will need to adjust the tags ...
1
vote
0answers
22 views

Retrieve words based on a given context. For example, given a job description, I need to find the words related to the skills

I have a job description paragraph as below: JAVA SOLUTION ARCHITECT – CONTRACT – DALLAS, TX A leading consulting firm is looking to bring on a Java Solution Architect to help them deliver a ...
0
votes
0answers
12 views

Backward Difference Encoding with probability [on hold]

Backward encoding for K ordinal categorical data produces k-1 binary categories and the mean of the dependent variable for a level is compared with the mean of the dependent variable for the prior ...
0
votes
0answers
26 views

What exactly does “.fit()” do? [duplicate]

I'm pretty new to machine learning, I want to know what exactly the .fit() function does. No background information. fitmod = mod.fit(drug2[["DUQ219", "DUQ272", "DUQ352", "DUQ410"]], drug2["DUQ352"])...
1
vote
0answers
13 views

how to save trained ARIMA model to use later

I am using a univariate time series dataset to forecast. I am using ARIMA model to train. But it is a time-consuming process to train every time. Is there any process to save the trained ARIMA model ...
-1
votes
0answers
17 views

How to obtain images by providing their meaning in deep learning? Word to image deep learning

I'm creating an iOS application that converts a word typed in a text field to the corresponding image. I want to create an ML model that returns an image given its meaning (just a simple word) but ...
-2
votes
0answers
12 views

Convert an SVM to decision tree(s), is it possible?

So, let's say I already inferred an SVM and its parameters from a given dataset, so I already have a discriminative classifier set up. I would like to create an equivalent classifier based on a ...
-1
votes
1answer
11 views

Pre-Modeling feature selection

I began a segmentation exercise, where we would like to cluster users with similar characteristics in their respective groups. We have 100s of features to consider, some are obvious like age and geo ...
-1
votes
0answers
8 views

Do TFLite models usually contain fewer operations?

I was wondering if a tflite model should contain fewer operations than the original tf model. Clearly, tflite models are smaller in size, but I believe that's (mostly) because of the use of ...
1
vote
0answers
20 views

StandardScaler vs. MinMax: StandardScaler is not working for ANN but MinMax does

I have a dataset with multiple features and a target. I am using ANN to predict. When I scale the features using MinMax, everything works fine and during the compile and fit I get good loss and ...
1
vote
0answers
19 views

Initialize neural network weights with Tensorflow

I am developing a neural network model using Tensorflow. In the LOSO cross validation, I need to train a model for 10 folds, since I have data from 10 different subjects. Taking this into account, I ...
-2
votes
0answers
12 views

Recommendation to classify issue based on error messages

I would like to categorize issues based on the error message and platform. Platform and Error Message are the independent variables and Issue is the dependent variable (please see attachment). I have some ...
4
votes
1answer
43 views

How to rewrite a tensorflow graph to use CPU for all operations

I've trained a network on a multi GPU and CPU setup, and saved the resulting model as a tensorflow SavedModel. I then have another script which can load the resulting model and run the required ops to ...
-4
votes
0answers
16 views

Simple questions about regression - ML

Could you please provide 2-3 sentences for each of the following questions? 1) Describe any optimization criterion that could be used for estimating parameters of a linear regression model. 2) Describe any ...
1
vote
0answers
14 views

Can I use a trained ML model in Django

How can I use a trained ML model in Django and run it on values the user provides in a form?
-2
votes
1answer
18 views

Odd behavior of cost over time with SGD

I am relatively new to ML/DL and have been trying to improve my skills by making a model that learns the MNIST data set without TF or keras. I have 784 input nodes, 2 hidden layers of 16 neurons each, ...
-1
votes
1answer
33 views

How to define the target variable for linear regression

I want to perform regression analysis on a dataset of dimensions 96x100. The columns represent the values for each of the 100 days, while the independent variable is the time. How can I perform linear ...
1
vote
1answer
17 views

Manually computed AIC differs from statsmodels AIC

I tried to manually code a formula for the AIC. I want to use it in connection with scikit-learn. To test whether I coded it correctly, I compared the AIC values from statsmodels given the same datasets. ...
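To make the comparison concrete, here is a minimal sketch of the standard definition AIC = 2k - 2 ln(L) for a least-squares fit with Gaussian errors. The function names and the numbers (`rss`, `n_obs`, the parameter counts) are hypothetical. The sketch illustrates one common source of mismatches between a hand-written AIC and a library's value, namely whether the error variance is counted as a parameter. It is not a diagnosis of the code in the question, which is not shown.

```python
import math


def gaussian_log_likelihood(rss, n_obs):
    """Maximized Gaussian log-likelihood of a least-squares fit.

    Uses the maximum-likelihood variance estimate rss / n_obs.
    """
    sigma2 = rss / n_obs
    return -0.5 * n_obs * (math.log(2 * math.pi) + math.log(sigma2) + 1)


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical fit: residual sum of squares 4.0 over 4 observations,
# with a slope and an intercept as regression coefficients.
rss, n_obs = 4.0, 4
llf = gaussian_log_likelihood(rss, n_obs)

# Counting only the regression coefficients ...
aic_without_variance = aic(llf, n_params=2)
# ... versus also counting the error variance as a parameter.
aic_with_variance = aic(llf, n_params=3)

print(aic_without_variance, aic_with_variance)
```

The two conventions differ by a constant 2, so they rank models fitted to the same data identically, but the raw values will not match.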
1
vote
0answers
19 views

mismatching between whole data during splitting data

I've just been reading up on train_test_split and have realized that I'm inadvertently leaking data with my current preprocessing setup due to the history-based function create_dataset() and ...
1
vote
0answers
16 views

Macro metrics (recall/F1…) for multiclass CNN

I use a CNN for image classification on an unbalanced dataset. I'm totally new to the Keras backend. It's a multiclass problem (not multilabel) and I have 16 classes. I want to compute MACRO metrics for each ...
-3
votes
0answers
16 views

Back Propagation: How is cost applied?

For a single example from the training set, a cost function sums the output errors over all outputs, I believe. In principle, how is this cost applied? Is the cost propagated back for each example or ...
-1
votes
0answers
9 views

Anomaly detection algorithm performance measurements

I am developing an algorithm for an anomaly detection problem. I have run into a problem estimating the accuracy of the algorithm. The algorithm is working well but it recognizes the anomaly with ...
1
vote
0answers
9 views

Generating Batches for Word Encoders

I am confused as to what skip_window does in this situation. Apparently it is not equal to the batch_size. Additionally, I don't know what the second for loop does - if skip_window is a scalar, then ...
1
vote
0answers
21 views

Is there a lightweight Python module to load pre-fitted ML modules and perform prediction?

I am implementing a Machine Learning module that should run in a Raspberry Pi that at the moment is shared among different services. My idea is to store in the device only the code in charge of ...
1
vote
0answers
19 views

Why is 10-fold cross validation even faster than a 1-fold fit when using LGB?

I am using LGB to handle a machine learning task. But I found that when I use the sklearn API cross_val_score and set cv=10, the time cost is less than a single fold fit. I split the dataset using ...
-1
votes
0answers
15 views

Is there any way I could identify sentence patterns?

I want to know if any library provides me the way to identify or set some rules on top of POC to identify the sentence patterns such as identifying the subject, verb, object(not just these three), etc....
2
votes
0answers
14 views

Using xgboost in azure ML environment

I cannot get the xgboost package to work in the Azure Machine Learning Studio interpreter. I am trying to import a model using xgboost that I trained in order to deploy it here. But it seems that my ...
1
vote
0answers
32 views

Neural network does not predict properly with Count Vectorizer

I'm trying to do a Sentiment Analysis prediction using the text and the scores of random IMDB reviews. I turned all the words into a Bag Of Words and put it all in a neural network. The prediction ...
-2
votes
0answers
5 views

My linear regression model gives a high error rate. How can I minimize the MSE?

If the MSE on the train set is 21.69 and the test set error is 17.68, how can I minimize it? Is this possible by using validation or a learning curve? Or can I add more data?
2
votes
0answers
29 views

How to handle loss function and log probabilities for neural network with multiple outputs?

I've implemented a custom environment in the style of OpenAI Gym environments in which I have shapes (circles, squares etc.) that I can move on a plane within a boundary. I want to apply reinforcement ...
1
vote
0answers
28 views

Reset Tensorflow adam optimizer

I am performing K-fold cross validation on a neural network model with Tensorflow, so I need to train the model k times. I would like to know how to reset the state of the tf.train.AdamOptimizer so I ...
3
votes
1answer
40 views

Algorithm to classify instances from a dataset similar to another smaller dataset, where this smaller dataset represents a single class

I have a dataset that represents instances from a binary class. The twist here is that there are only instances from the positive class and I have none of the negative one. Or rather, I want to ...
-2
votes
1answer
31 views

Is it possible to create synthetic data from the existing one? [on hold]

Generally, the problem is that I have a small dataset (300 instances). I have tried most of the machine learning algorithms, and now I wanted to apply some deep learning. So the problem is the size of ...
1
vote
1answer
29 views

Bigger batch size reduces training time

I'm using a CNN for image classification; I do data augmentation with keras ImageDataGenerator. I think I'm missing something. A /// train =model.fit_generator(image_gen.flow(train_X, train_label, ...
1
vote
1answer
24 views

How to prepare the multilevel multivalued training dataset in python

I am a beginner in machine learning. My academic project involves detecting human posture from acceleration and gyro data. I am stuck at the beginning itself. My accelerometer data has x,y,z values ...
1
vote
0answers
38 views

How to reduce position changes after dimensionality reduction?

Disclaimer: I'm a machine learning beginner. I'm working on visualizing high dimensional data (text as TF-IDF vectors) in 2D space. My goal is to label/modify those data points and recompute ...
1
vote
0answers
25 views

unsupervised learning - clustering numpy arrays within numpy arrays

We're working with a dataset of spoken numbers. The wavefiles are converted to MFCC values. Each row (wavfile) consists of around 20 to 40 (depending on the length of the soundfile) arrays, with 13 ...
