Custom multiclass objective receives transformed predictions #4288

Closed
jeffzi opened this issue Mar 22, 2019 · 6 comments · Fixed by #5564

jeffzi (Contributor) commented Mar 22, 2019

Hi,

I'm having issues with a custom multiclass objective function receiving transformed predictions. The problem was already raised in issue #2776.

I believe the objective function should receive preds as an (N, K) array, N = #data, K = #classes, but that is not the default behaviour in either Python or R. The previous issue points to:

if (!prob) {
io_preds->Resize(max_preds_.Size());
io_preds->Copy(max_preds_);
}

In Python, I managed to get the correct preds shape by adding 'objective': 'multi:softprob' to the parameters on top of my custom objective function. Here is an MWE:

import numpy as np
from sklearn import datasets
from sklearn.preprocessing import OneHotEncoder
import xgboost as xgb

iris = datasets.load_iris()
X, y = iris.data, iris.target
dtrain = xgb.DMatrix(X, label=y)

# builtin
params = {'objective': 'multi:softprob', 'num_class': 3}
model_builtin = xgb.train(params, dtrain, num_boost_round = 1)
preds_builtin = model_builtin.predict(dtrain)

# custom
def obj(preds, dtrain):
    labels = dtrain.get_label().reshape(-1, 1)
    labels = OneHotEncoder(sparse=False, categories='auto').fit_transform(labels)
    grad = preds - labels                # gradient of softmax cross-entropy
    hess = 2.0 * preds * (1.0 - preds)   # diagonal hessian approximation
    return grad.flatten(), hess.flatten()

params = {'objective': 'multi:softprob', 'num_class': 3}
model_custom = xgb.train(params, dtrain, num_boost_round = 1, obj = obj)
preds_custom = model_custom.predict(dtrain)

# assert approaches give same results
assert np.sum(np.abs(preds_custom - preds_builtin)) == 0

## fails without objective 'multi:softprob' because preds.shape == (150,)
#params = {'num_class': len(np.unique(y))}
#model = xgb.train(params, dtrain, num_boost_round = 1, obj = obj)

import sinfo
sinfo.sinfo()
#> -----
#> numpy     	1.16.1
#> sklearn   	0.20.3
#> xgboost   	0.82
#> -----
#> Python 3.7.2 (default, Feb 12 2019, 08:15:36) [Clang 10.0.0 (clang-1000.11.45.5)]
#> Darwin-18.2.0-x86_64-i386-64bit
#> 4 logical CPU cores, i386

Created on 2019-03-22 by the reprexpy package

The R package does not allow the same trick:

library(xgboost)
data(agaricus.train, package = 'xgboost')
dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)
params <- list(objective = "multi:softprob",  num_class = 3)
model_custom <- xgboost::xgb.train(params, data = dtrain, obj = identity)
#> Error in check.custom.obj(): Setting objectives in 'params' and 'obj' at the same time is not allowed

Created on 2019-03-22 by the reprex package (v0.2.1)

A solution is to modify this line of xgb.iter.update:

pred <- predict(booster_handle, dtrain)

as:

pred <- predict(booster_handle, dtrain, outputmargin = TRUE, reshape = TRUE)

Moreover, we need to apply the softmax to the predictions ourselves since we cannot "force" softprob.

MWE reproducing the Python example:

library(xgboost)

data(iris)
X <- as.matrix(iris[, names(iris) != "Species"])
y <- as.numeric(iris$Species) - 1
dtrain <- xgboost::xgb.DMatrix(X, label = y)

# builtin
params <- list(objective = "multi:softprob",  num_class = 3)
model_builtin <- xgboost::xgb.train(params, data = dtrain, nrounds = 1)
preds_builtin <- predict(model_builtin, dtrain, reshape = TRUE)

# custom
softmax <- function(x) {
  exp(x) / rowSums(exp(x))
}

obj <- function(preds, dtrain) {
  labels <- xgboost::getinfo(dtrain, "label")
  labels <- as.data.frame(as.factor(labels))
  names(labels) <- "class"
  labels <- model.matrix(~ class - 1, labels) # one-hot encode

  preds <- preds - apply(preds, 1, max) # stabilise before softmax
  prob <- softmax(preds)

  grad <- prob - labels
  hess <- 2 * prob * (1 - prob)
  return(list(grad = as.vector(t(grad)), hess = as.vector(t(hess))))
}

params <- list(objective = obj,  num_class = 3)
model_custom <- xgboost::xgb.train(params, data = dtrain, nrounds = 1)
preds_custom <- predict(model_custom, dtrain, reshape = TRUE, outputmargin = TRUE)
preds_custom <- softmax(preds_custom)

# assert approaches give same results (kinda)
stopifnot(all.equal(preds_builtin, preds_custom))
#> Error in eval(expr, envir, enclos): preds_builtin and preds_custom are not equal:
#>   Mean relative difference: 5.577018e-08

devtools::session_info(pkgs = c("xgboost"), include_base = FALSE)
#> ─ Session info ──────────────────────────────────────────────────────────
#>  version  R version 3.5.2 (2018-12-20)
#>  os       macOS Mojave 10.14.3        
#>  system   x86_64, darwin15.6.0                              
#> 
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package    * version date       lib source        
#>  xgboost    * 0.82.1  2019-03-11 [1] CRAN (R 3.5.2)

Created on 2019-03-22 by the reprex package (v0.2.1)

The long-term solution would be to modify the native C++ code, but that's beyond my expertise.

trivialfis added a commit to trivialfis/xgboost that referenced this issue Apr 16, 2020
* Add a demo for writing multi-class custom objective function.

Closes dmlc#4996, dmlc#4288 .

kaijennissen commented Jul 8, 2020

I'm working on a multiclass classification problem using a custom objective function, where I use the probabilities of each class.
I've also tried to write a custom evaluation function based on the class probabilities, but it seems that the custom evaluation function only receives the class label predictions, not the individual probabilities.
Is there a way to change this behaviour?
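
A minimal sketch of one possible workaround, assuming a custom objective is in use (so the evaluation function receives raw margins rather than argmax class labels) and that the margins may arrive flattened; softmax and mlogloss_eval are illustrative names, not part of the XGBoost API:

import numpy as np
import xgboost as xgb

def softmax(x):
    # numerically stable softmax over the class axis
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def mlogloss_eval(predt, dtrain):
    # recover per-class probabilities from raw margins
    labels = dtrain.get_label().astype(int)
    margins = predt.reshape(len(labels), -1)  # (N, K), in case predt arrives flat
    prob = softmax(margins)
    ll = -np.log(prob[np.arange(len(labels)), labels] + 1e-15)
    return 'custom-mlogloss', float(np.mean(ll))

# usage (sketch; custom_obj is your own objective function):
# xgb.train({'num_class': 3}, dtrain, num_boost_round=10,
#           obj=custom_obj, feval=mlogloss_eval)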

hcho3 (Collaborator) commented Jul 8, 2020

@kaijennissen Take a look at the example at def softprob_obj(predt: np.ndarray, data: xgb.DMatrix): in the multi-class custom objective demo.
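
Roughly, that demo computes the softmax gradient and a diagonal Hessian approximation. A condensed, vectorized paraphrase (not the demo's exact code), under the assumption that predt arrives as an (N, K) margin matrix:

import numpy as np
import xgboost as xgb

def softprob_obj(predt: np.ndarray, data: xgb.DMatrix):
    # assumes predt holds raw margins of shape (n_samples, n_classes)
    labels = data.get_label().astype(int)
    e = np.exp(predt - predt.max(axis=1, keepdims=True))
    prob = e / e.sum(axis=1, keepdims=True)              # softmax probabilities
    onehot = np.eye(prob.shape[1])[labels]               # one-hot encoded labels
    grad = prob - onehot                                 # gradient of cross-entropy
    hess = np.maximum(2.0 * prob * (1.0 - prob), 1e-6)   # diagonal approx., floored
    return grad.flatten(), hess.flatten()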


guyko81 commented Jul 28, 2020

I don't see how to put the custom_predict function into the evaluation. Can you help?

trivialfis (Member) commented

@guyko81 Could you please open a new issue so I can track adding the demo?


guyko81 commented Jul 28, 2020

@trivialfis I just opened one, thanks in advance!

Cai-SunShine commented

> @kaijennissen Take a look at the example at def softprob_obj(predt: np.ndarray, data: xgb.DMatrix): in the multi-class custom objective demo.
Hi, I have read the link you pasted. I have a question about it: shouldn't the hessian be a matrix of shape (class_num, class_num) rather than a vector?
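
For what it's worth, the full Hessian of the softmax cross-entropy for one sample is indeed a (class_num, class_num) matrix, but XGBoost's custom-objective interface accepts only one hessian value per prediction element, so the demo keeps just the diagonal. A small numpy check of that relationship:

import numpy as np

p = np.array([0.7, 0.2, 0.1])          # softmax probabilities for one sample
H_full = np.diag(p) - np.outer(p, p)   # exact K x K Hessian w.r.t. the margins
h_diag = p * (1.0 - p)                 # per-class values a custom objective returns
assert np.allclose(np.diag(H_full), h_diag)
# (the demo uses 2 * p * (1 - p); the constant factor roughly rescales the step)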
