Questions tagged [xgboost]

XGBoost is a library for constructing boosted tree models in R, Python, Java, Scala, and C++. Use this tag for issues specific to the package (i.e. input/output, installation, functionality).

1
vote
1answer
10 views

RandomizedSearchCV for XGBoost, imbalanced dataset and optimum iterations count (n_iter)

I am working on an imbalanced (9:1) binary classification problem and would like to use XGBoost & RandomizedSearchCV. As shown in the code, there are 47,250,000 (5*7*5*5*5*5*6*4*9*10) combinations of ...
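
For reference, the size of that search space can be checked with a quick product. This is only a sketch: the list restates the factors quoted in the excerpt, and the variable names are made up for illustration (the actual parameter names are not shown in the excerpt).

```python
import math

# Number of candidate values per hyperparameter, as quoted in the question
# excerpt. The parameter names are not given there, so none are assumed here.
grid_sizes = [5, 7, 5, 5, 5, 5, 6, 4, 9, 10]

# Total number of distinct combinations in the full grid.
total_combinations = math.prod(grid_sizes)
print(f"{total_combinations:,}")
```

With a grid this large, whatever value is chosen for `n_iter` in RandomizedSearchCV will sample only a small fraction of the combinations.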
0
votes
0answers
12 views

Using a different truth for evaluation in XGBoost (Python)

In XGBoost I want to use the NDCG metric to evaluate my model. My data set looks something like this: click_bool booking_bool relevance 1 1 5 0 ...
0
votes
1answer
18 views

How to fix 'DMatrix/Booster has not been intialized or has already been disposed ' error

I trained an xgboost model and saved it. Then I copied it to another system to predict results using the following code. Python 3.7, xgboost 0.8, conda The same model file can work if I ...
0
votes
0answers
12 views

Unable to import xgboost in Python

I have installed xgboost successfully using pip for Python 2.7.16 (I installed this Python version using Homebrew on macOS High Sierra). My problem is that I'm unable to import xgboost in Python, as ...
0
votes
0answers
5 views

Minimize values predicted from an XGBoost model through feature optimization

I have created an XGBoost model in Python which predicts the time taken for a given process based on key drivers. The intent now is to find the right combination of drivers which will minimize the processing ...
0
votes
0answers
7 views

Multi Class Classification using XGBClassifier

I am using XGBClassifier for multiclass classification (5 classes - [1,2,3,4,5]). I have set the objective parameter to 'multi:softmax', but when I predict using my model I am still getting continuous ...
0
votes
0answers
31 views

xgboost predicts wrong results when creating DMatrix from numpy array

I have a trained XGBoost model, and the model predicts well when I create the xgboost DMatrix from a pandas DataFrame during testing. But when I create the xgboost DMatrix from a numpy array, the model always predicts ...
0
votes
0answers
21 views

XGBoost error - Unknown objective function reg:squarederror

I am training an xgboost model for a regression task and I passed the following parameters - params = {'eta':0.4, 'max_depth':5, 'colsample_bytree':0.6, 'objective':'reg:squarederror'} num_round = 10 ...
0
votes
0answers
9 views

Is there a way to iterate through 2D C++ vector to create XGDMatrix?

I want to create an XGDMatrix from a C++ 2D vector. My data inputs are in a vector<vector<float>> and I do not want to convert them into a 2D float array to be able to use ...
0
votes
0answers
5 views

Input and output of ranking using LambdaMART in XGBoost

When performing ranking using LambdaMART in xgboost, what will be the input, and is the output a probability that one link is more relevant than another, as in RankNet, or something else? Please clarify.
0
votes
0answers
13 views

Why does the error jump from 1.35% to 9.91% after using Standard Scaler?

I am using XGBoost with default parameters; the Mean Absolute Error is 1.35% (please refer to my code), RMSE=4966.55. Then, using Standard Scaler with XGBoost, I got a Mean Absolute Error of 9.16%,...
0
votes
1answer
20 views

Rebuilding and training new Deep Learning Python model after feature importances and feature selection to reduce feature amount?

I'm learning Deep Learning concepts with Python and this is how far I've come with my project. The purpose of this open project is to detect liver cancer so that patients can avoid a biopsy and be healed sooner than usual....
0
votes
0answers
10 views

reference a column in a xgb.DMatrix for model prediction

I would like to reference a column in a xgb.DMatrix and use it in an if condition. Is it possible to reference elements in a column in an xgb.DMatrix? (I am using the xgboost package) Such as: ...
0
votes
1answer
14 views

How to save feature importance plot of xgboost to a file from Jupyter notebook

I am struggling with saving the xgboost feature-importance plot to a file. I have created a model and plotted importance of features in my jupyter notebook- xgb_model = xgboost.train(best_params, ...
0
votes
0answers
14 views

Use XGBoost's implementation of AUC on a test set

I want to be able to use a classifier fitted with XGBoost to compute AUC (Area under the ROC curve) on a test set, but using XGBoost's implementation of AUC (which is used to compute the metric on the ...
0
votes
0answers
25 views

LIME Object of type 'ndarray' is not JSON serializable

I am using LIME to interpret results of a multi class classification Xgboost model. The target contains 6 labels, which have been encoded using LabelEncoder. from lime import lime_text from lime....
0
votes
0answers
7 views

Number of variables greater than number of observations in XGBoost

I am using XGBoost in Python. My columns (variables) are survey questions and the rows are responses for each individual user. Because I use One Hot Encoding, the number of columns goes up to 274 (...
0
votes
0answers
16 views

RandomizedSearchCV and sklearn.externals.joblib.externals.loky.process_executor.TerminatedWorkerError problem

I'm using RandomizedSearchCV to find the optimal config for my XGBClassifier learner and run into the TerminatedWorkerError error Python code rs = RandomizedSearchCV(XGBClassifier(), param_grid, cv=...
2
votes
1answer
33 views

Why does xgboost produce the same predictions and nan values for features when using entire dataset?

Summary I am using Python v3.7 and xgboost v0.81. I have continuous data (y) at a US state level by each week from 2015 to 2019. I'm trying to regress on the following features to y: year, month, ...
0
votes
1answer
31 views

Trouble training xgboost

I am trying to run a Python notebook (link). At the line below In [446]:, where the author trains XGBoost, I am getting an error ValueError: DataFrame.dtypes for data must be int, float or bool. ...
1
vote
1answer
29 views

Plot number formatting in XGBoost plot_importance()

I've trained an XGBoost model and used plot_importance() to plot which features are the most important in the trained model. However, the numbers in the plot have several decimal places, which floods the ...
0
votes
0answers
40 views

Why does my XGBoost model, loaded by the Scala API and used via broadcast in Spark, run so slowly?

I have an XGBoost model trained with the Python API, and now I just want to use it in Spark to predict on massive data (about 300 million). I just use the loadModel API of Scala's XGBoost to load the ...
0
votes
1answer
12 views

What is “[0]#011train-merror:0.17074#011validation-merror:0.1664” error when running xgb_model.fit() in AWS Sagemaker?

I'm running through the official sagemaker tutorial here. And although training completes, I'm getting errors like below periodically during training, xgb_model.fit(inputs=data_channels, logs=True). ...
1
vote
0answers
13 views

When using the scale_pos_weight parameter in xgboost, I don't know why this is happening?

I have to solve a binary classification problem. (The ratio of train data size between labels 0 and 1 is 4.7:1.) So, I created the model with the xgboost algorithm. The result is quite good. - AUC: 0....
0
votes
0answers
19 views

Why am I getting different results from Scikit-learn API vs Learning API of XGBoost (Part 2)?

I used the Scikit-learn API for XGBoost (in python). My accuracy was ~ 75%. I used the same parameter set and used the Learning API for XGBoost; my accuracy was ~ 87%. My understanding is that Scikit-...
0
votes
1answer
17 views

LabelEncoder is not converting the strings into numericals (0,1,2)

I have the data like below, I need to encode the variables, but LabelEncoder is not encoding the strings My data looks like below Delivery_class First Class Same Day Second Class Standard Class X=...
0
votes
0answers
19 views

Bayesian Optimization with XGBoost error: “Some trailing characters could not be parsed: 'Inf' ”

I am trying to replicate this method using a different dataset in R. https://www.kaggle.com/btyuhas/bayesian-optimization-with-xgboost My data has missing values, but no Inf values. However, it ...
-1
votes
0answers
13 views

How to restore an XGBoost model into Python, which was saved in R?

I need to use an XGBoost model built in R by my colleagues in my Python environment for some plotting and exploration. I found PMML as an option. Is it possible to directly use model saved using ...
0
votes
0answers
17 views

Subset of features on external memory

I have a large file that I'm not able to load so I'm using a local file with xgb.DMatrix. But I'd like to use only a subset of the features. The documentation on xgboost says that the colset argument ...
3
votes
1answer
39 views

Does oversampling happen before or after cross-validation using imblearn pipelines?

I have split my data into train/test before doing cross-validation on the training data to validate my hyperparameters. I have an unbalanced dataset and want to perform SMOTE oversampling on each ...
1
vote
0answers
19 views

Is it possible to take a look inside an XGBoost model? Is there an estimators_ equivalent in the xgboost package?

scikit-learn provides the attribute estimators_ to let us take a look inside a RandomForestRegressor. forest_model = RandomForestRegressor(n_estimators=10,random_state=1) forest_model.fit(train_X, ...
0
votes
1answer
55 views

I have an XGBoost model trained in Python, but it gives different predictions when loaded in Scala with the same features. Why?

I have an xgboost model trained with the Python API, named my_fpd20.model. Now I want to use it in Scala to run predictions, but when I test it, I get a different predicted result when ...
0
votes
0answers
20 views

Xgboost - Data Cleanup for string values

I have a pandas data frame with string values in a column. I am trying to run XGBoost modelling on this. Before the modelling I am trying to convert the values to numeric. The error below is coming ...
1
vote
1answer
54 views

Why am I getting different results from Scikit-learn API vs Learning API of XGBoost?

I used the Scikit-learn API for XGBoost (in python). My accuracy was ~ 68%. I used the same parameter set and used the Learning API for XGBoost; my accuracy was ~ 60%. My understanding is that Scikit-...
1
vote
0answers
27 views

Supervised Time Series efficiency improvement

The data that I have is hourly recorded over the past 4 months. I am building a time series model and I've tried several methods so far: Arima, LSTMs, Prophet but they can be quite slow for my task ...
0
votes
1answer
32 views

Problem regarding predict_proba function in XGBoost in python

Currently I am working on a binary classification problem. I want my predicted output to be the probability, not 1 or 0 using XGBoost. I have divided the data set into train, validate and test set. ...
0
votes
0answers
25 views

R - Using xgboost as feature selection but also interaction selection

Let's say I have a dataset with a lot of variables (more than in the reproducible example below) and I want to build a simple and interpretable model, a GLM. I can use an xgboost model first, and ...
4
votes
1answer
47 views

Using xgboost in BaggingRegressor

I need to run xgboost in BaggingRegressor, I use xgboost import xgboost D_train = xgboost.DMatrix(X_train, lab_train) D_val = xgboost.DMatrix(X_train[test_index], lab_train[test_index]) D_pred =...
1
vote
0answers
29 views

Feature importance in xgboost for text predictions

I have two text files of review ratings, positive and negative. After preprocessing the data with NLP, I make predictions with XGBoost, and I am trying to get feature importance as a categorical variable ...
0
votes
0answers
28 views

How can I convert an LSTM model to a linear regression model?

Here is an LSTM prediction model, and I want to convert it to Linear Regression. ... model.fit(x_train, y_train, epochs=10, batch_size=16) trainPredict = model.predict(x_train) testPredict = model.predict(x_test)...
0
votes
1answer
25 views

Different scores when training XGBoost models in different machines

On trying XGBoost's ranking demo: https://github.com/dmlc/xgboost/tree/master/demo/rank There are different prediction scores generated on different machines. Is that expected?...
0
votes
0answers
63 views

Feature selection with xgboost and early stopping (and using mlxtend for feature selection)

I'd like to do feature selection for an xgboost model with early stopping enabled (using SequentialFeatureSelector from the mlxtend library, but open to other choices). As early_stopping is a parameter for ...
0
votes
0answers
19 views

Is there a way to change the leaf node label in an xgboost tree using graphviz?

I tried to post a picture of the tree, but my reputation was not enough... Anyway, in XGBoost, I checked the tree to interpret the model. By the way, the leaf node label means the raw score. I want to see prediction ...
0
votes
0answers
10 views

Does the XGBoost regressor have all the parameters that regression has?

Can we measure the p-value, variable-wise coefficient values and other parameters in the XGBoost regressor like we do in a linear regression (OLS) model?
0
votes
1answer
57 views

XGBoost training fails if null values exist (setHandleInvalid “keep” exists for whole pipeline)

I'm training an XGBoostRegressor model using Spark (Scala), and I've noticed that the number of predicted values is less than what was given to the model using model.transform(df). The problem is due ...
0
votes
1answer
20 views

What is the correct name for XGBoost hyperparameters in a Pipeline?

I'm working on a classification problem, and I'm using the grid search method to find the optimal hyperparameters. However, I'm using a pipeline architecture to build the same classification ...
1
vote
0answers
33 views

Kernel Died using XGBoost in Windows 10 on Spyder

I'm trying to use xgboost on my laptop for a simple classification problem in Spyder IDE. Below is the code: # Importing the libraries import numpy as np import matplotlib.pyplot as plt import ...
1
vote
2answers
34 views

I'm getting a “command not found” error when trying to install “xgboost” in an Ubuntu 16.04 Virtual Machine

I'm trying to install xgboost in an Ubuntu 16.04 virtual machine. I'm following this guide and ran this command: cmake .. I got this error: -bash: cmake: command not found What am I doing wrong ...
0
votes
1answer
34 views

Installation of custom XGBoost fails due to error in shared library

I'm trying to set up a custom version of XGBoost from https://github.com/robjhyndman/M4metalearning in R. When I run devtools::install_github("pmontman/customxgboost") I get this error: > ...
-1
votes
0answers
20 views

How to tune hyperparameters in XGBoost more efficiently

I am using an XGBoost model for my problem. I see that there are many hyperparameters for XGBoost. I tried tuning 4 parameters using grid search; it has been almost 5 hours and it is still running. Is ...
