Custom multiclass objective receives transformed predictions #4288

Closed
jeffzi opened this issue Mar 22, 2019 · 6 comments · Fixed by #5564

jeffzi (Contributor) commented Mar 22, 2019

Hi,

I'm having issues with a custom multiclass objective function receiving transformed predictions. The problem was already raised in issue #2776.

I believe the objective function should receive preds as an (N, K) array, N = #data, K = #classes, but that is not the default behaviour in either Python or R. The previous issue points to:

if (!prob) {
io_preds->Resize(max_preds_.Size());
io_preds->Copy(max_preds_);
}

In Python, I managed to get the correct preds shape by adding 'objective': 'multi:softprob' to the parameters on top of my custom objective function. Here is an MWE:

import numpy as np
from sklearn import datasets
from sklearn.preprocessing import OneHotEncoder
import xgboost as xgb

iris = datasets.load_iris()
X, y = iris.data, iris.target
dtrain = xgb.DMatrix(X, label=y)

# builtin
params = {'objective': 'multi:softprob', 'num_class': 3}
model_builtin = xgb.train(params, dtrain, num_boost_round = 1)
preds_builtin = model_builtin.predict(dtrain)

# custom
def obj(preds, dtrain):
    labels = dtrain.get_label().reshape(-1, 1)
    labels = OneHotEncoder(sparse=False, categories='auto').fit_transform(labels)
    grad = preds - labels                # gradient of softmax cross-entropy
    hess = 2.0 * preds * (1.0 - preds)   # diagonal hessian approximation
    return grad.flatten(), hess.flatten()

params = {'objective': 'multi:softprob', 'num_class': 3}
model_custom = xgb.train(params, dtrain, num_boost_round = 1, obj = obj)
preds_custom = model_custom.predict(dtrain)

# assert approaches give same results
assert np.sum(np.abs(preds_custom - preds_builtin)) == 0

## fails without objective 'multi:softprob' because preds.shape == (150,)
#params = {'num_class': len(np.unique(y))}
#model = xgb.train(params, dtrain, num_boost_round = 1, obj = obj)

import sinfo
sinfo.sinfo()
#> -----
#> numpy     	1.16.1
#> sklearn   	0.20.3
#> xgboost   	0.82
#> -----
#> Python 3.7.2 (default, Feb 12 2019, 08:15:36) [Clang 10.0.0 (clang-1000.11.45.5)]
#> Darwin-18.2.0-x86_64-i386-64bit
#> 4 logical CPU cores, i386

Created on 2019-03-22 by the reprexpy package

The R package does not allow the same trick:

library(xgboost)
data(agaricus.train, package = 'xgboost')
dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)
params <- list(objective = "multi:softprob",  num_class = 3)
model_custom <- xgboost::xgb.train(params, data = dtrain, obj = identity)
#> Error in check.custom.obj(): Setting objectives in 'params' and 'obj' at the same time is not allowed

Created on 2019-03-22 by the reprex package (v0.2.1)

A solution is to modify this line of xgb.iter.update:

pred <- predict(booster_handle, dtrain)

as:

pred <- predict(booster_handle, dtrain, outputmargin = TRUE, reshape = TRUE)

Moreover, we need to apply the softmax to the predictions ourselves since we cannot "force" softprob.

MWE reproducing the Python example:

library(xgboost)

data(iris)
X <- as.matrix(iris[, names(iris) != "Species"])
y <- as.numeric(iris$Species) - 1
dtrain <- xgboost::xgb.DMatrix(X, label = y)

# builtin
params <- list(objective = "multi:softprob",  num_class = 3)
model_builtin <- xgboost::xgb.train(params, data = dtrain, nrounds = 1)
preds_builtin <- predict(model_builtin, dtrain, reshape = TRUE)

# custom
softmax <- function(x) {
  exp(x) / rowSums(exp(x))
}

obj <- function(preds, dtrain) {
  labels <- xgboost::getinfo(dtrain, "label")
  labels <- as.data.frame(as.factor(labels))
  names(labels) <- "class"
  labels <- model.matrix(~ class - 1, labels) # one-hot encode

  preds <- preds - apply(preds, 1, max) # stabilise before softmax
  prob <- softmax(preds)

  grad <- prob - labels
  hess <- 2 * prob * (1 - prob)
  return(list(grad = as.vector(t(grad)), hess = as.vector(t(hess))))
}

params <- list(objective = obj,  num_class = 3)
model_custom <- xgboost::xgb.train(params, data = dtrain, nrounds = 1)
preds_custom <- predict(model_custom, dtrain, reshape = TRUE, outputmargin = TRUE)
preds_custom <- softmax(preds_custom)

# assert approaches give same results (kinda)
stopifnot(all.equal(preds_builtin, preds_custom))
#> Error in eval(expr, envir, enclos): preds_builtin and preds_custom are not equal:
#>   Mean relative difference: 5.577018e-08

devtools::session_info(pkgs = c("xgboost"), include_base = FALSE)
#> ─ Session info ──────────────────────────────────────────────────────────
#>  version  R version 3.5.2 (2018-12-20)
#>  os       macOS Mojave 10.14.3        
#>  system   x86_64, darwin15.6.0                              
#> 
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package    * version date       lib source        
#>  xgboost    * 0.82.1  2019-03-11 [1] CRAN (R 3.5.2)

Created on 2019-03-22 by the reprex package (v0.2.1)

The long-term solution would be to modify the native C++ code, but that's beyond my expertise.

trivialfis added a commit to trivialfis/xgboost that referenced this issue Apr 16, 2020
* Add a demo for writing multi-class custom objective function.

Closes dmlc#4996, dmlc#4288 .

kaijennissen commented Jul 8, 2020

I'm working on a multiclass classification problem using a custom objective function, where I use the probabilities of each class.
I've also tried to write a custom evaluation function based on the class probabilities, but it seems that the custom evaluation function only receives the class label predictions, not the individual probabilities.
Is there a way to change this behaviour?
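
A minimal sketch of one possible workaround, assuming a custom objective is in use (so the evaluation function receives raw margins rather than argmax class labels) and that the margins may arrive flattened; softmax and mlogloss_eval are illustrative names, not part of the XGBoost API:

import numpy as np
import xgboost as xgb

def softmax(x):
    # numerically stable softmax over the class axis
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def mlogloss_eval(predt, dtrain):
    # recover per-class probabilities from raw margins
    labels = dtrain.get_label().astype(int)
    margins = predt.reshape(len(labels), -1)  # (N, K), in case predt arrives flat
    prob = softmax(margins)
    ll = -np.log(prob[np.arange(len(labels)), labels] + 1e-15)
    return 'custom-mlogloss', float(np.mean(ll))

# usage (sketch; custom_obj is your own objective function):
# xgb.train({'num_class': 3}, dtrain, num_boost_round=10,
#           obj=custom_obj, feval=mlogloss_eval)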

hcho3 (Collaborator) commented Jul 8, 2020

@kaijennissen Take a look at the example at def softprob_obj(predt: np.ndarray, data: xgb.DMatrix): in the multi-class custom objective demo.
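
Roughly, that demo computes the softmax gradient and a diagonal Hessian approximation. A condensed, vectorized paraphrase (not the demo's exact code), under the assumption that predt arrives as an (N, K) margin matrix:

import numpy as np
import xgboost as xgb

def softprob_obj(predt: np.ndarray, data: xgb.DMatrix):
    # assumes predt holds raw margins of shape (n_samples, n_classes)
    labels = data.get_label().astype(int)
    e = np.exp(predt - predt.max(axis=1, keepdims=True))
    prob = e / e.sum(axis=1, keepdims=True)              # softmax probabilities
    onehot = np.eye(prob.shape[1])[labels]               # one-hot encoded labels
    grad = prob - onehot                                 # gradient of cross-entropy
    hess = np.maximum(2.0 * prob * (1.0 - prob), 1e-6)   # diagonal approx., floored
    return grad.flatten(), hess.flatten()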


guyko81 commented Jul 28, 2020

I don't see how to put the custom_predict function into the evaluation. Can you help?

trivialfis (Member) commented

@guyko81 Could you please open a new issue so I can track adding the demo?


guyko81 commented Jul 28, 2020

@trivialfis I just opened one, thanks in advance!

Cai-SunShine commented

> @kaijennissen Take a look at the example at def softprob_obj(predt: np.ndarray, data: xgb.DMatrix): in the multi-class custom objective demo.
Hi, I have read the link you pasted. I have a question about it: shouldn't the hessian be a matrix of shape (class_num, class_num) rather than a vector?
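
For what it's worth, the full Hessian of the softmax cross-entropy for one sample is indeed a (class_num, class_num) matrix, but XGBoost's custom-objective interface accepts only one hessian value per prediction element, so the demo keeps just the diagonal. A small numpy check of that relationship:

import numpy as np

p = np.array([0.7, 0.2, 0.1])          # softmax probabilities for one sample
H_full = np.diag(p) - np.outer(p, p)   # exact K x K Hessian w.r.t. the margins
h_diag = p * (1.0 - p)                 # per-class values a custom objective returns
assert np.allclose(np.diag(H_full), h_diag)
# (the demo uses 2 * p * (1 - p); the constant factor roughly rescales the step)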
