Dear All,
I am trying to implement my own metric (a log loss metric) for a
binary classification problem in caret.
I must be making a mistake somewhere, because I cannot get anything
sensible out of it.
Below I paste a self-contained example, which should run in about one
minute on any laptop.
When I run it, I eventually get output like this:

Aggregating results
Something is wrong; all the LogLoss metric values are missing:
    LogLoss
 Min.   : NA
 1st Qu.: NA
 Median : NA
 Mean   :NaN
 3rd Qu.: NA
 Max.   : NA
 NA's   :40
Error in train.default(x, y, weights = w, ...) : Stopping
In addition: Warning message:
In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo,  :
  There were missing values in resampled performance measures.

Any suggestions are appreciated.
Many thanks,

Lorenzo





####################################################

library(caret)
library(C50)


LogLoss <- function(data, lev = NULL, model = NULL)
{
    probs <- pmax(pmin(as.numeric(data$T), 1 - 1e-15), 1e-15)
    logPreds <- log(probs)
    log1Preds <- log(1 - probs)
    real <- (as.numeric(data$obs) - 1)
    out <- c(mean(real * logPreds + (1 - real) * log1Preds)) * -1
    names(out) <- c("LogLoss")
    out
}

train <- matrix(ncol=5,nrow=200,NA)

train <- as.data.frame(train)
names(train) <- c("donation", "x1","x2","x3","x4")

set.seed(134)

sel <- sample(nrow(train), 0.5*nrow(train))


train$donation[sel] <- "yes"
train$donation[-sel] <- "no"

train$x1 <- seq(nrow(train))
train$x2 <- rnorm(nrow(train))
train$x3 <- 1/train$x1
train$x4 <- sample(nrow(train))

train$donation <- as.factor(train$donation)

c50Grid <- expand.grid(trials = 1:10,
                       model = c("tree", "rules"),
                       winnow = c(TRUE, FALSE))

tc <- trainControl(method = "repeatedCV", summaryFunction=LogLoss,
                  number = 10, repeats = 10, verboseIter=TRUE,
                  classProbs=TRUE)


model <- train(donation~., data=train, method="C5.0", trControl=tc,
              metric="LogLoss", maximize=FALSE, tuneGrid=c50Grid)
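
A possible explanation, offered as a sketch rather than a confirmed diagnosis: with classProbs=TRUE, caret passes the summary function a data frame whose probability columns are named after the factor levels (here "no" and "yes"). In that case data$T would be NULL, the mean would be taken over an empty vector, and the metric would come out as NaN in every resample. The variant below looks up the probability column by level name instead. The name LogLossSketch, the choice of the second level as the positive class, and the toy data frame are assumptions made for illustration only.

```r
# Sketch of a log-loss summary function for caret::trainControl().
# Assumes classProbs = TRUE, so `data` has one probability column per level.
LogLossSketch <- function(data, lev = NULL, model = NULL) {
    # Assumption: the second factor level is treated as the positive class.
    positive <- lev[2]
    # Clip probabilities away from 0 and 1 so that log() stays finite.
    probs <- pmax(pmin(as.numeric(data[[positive]]), 1 - 1e-15), 1e-15)
    # 1 where the observed class is the positive level, 0 otherwise.
    real <- as.numeric(data$obs == positive)
    out <- -mean(real * log(probs) + (1 - real) * log(1 - probs))
    names(out) <- "LogLoss"
    out
}

# Toy input mimicking the layout assumed above (hypothetical values).
toy <- data.frame(obs = factor(c("yes", "no"), levels = c("no", "yes")),
                  no  = c(0.2, 0.9),
                  yes = c(0.8, 0.1))
toyLoss <- LogLossSketch(toy, lev = levels(toy$obs))

# A missing column yields NULL, and the mean of an empty vector is NaN.
emptyMean <- mean(as.numeric(toy$T))
```

If this reading is right, passing summaryFunction=LogLossSketch to trainControl() in place of LogLoss should give finite values, but that has not been verified against the example above.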

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
