Hi R-list,

Some time ago we built a hurdle model. We have now gathered new data in the same city (38 new sites) and want to do an external validation to see whether the model still performs well. All the books and lectures I have read say it is the best validation option, but I have done a (simple) search and, since having new data for a model seems to be rare, I have not found anything detailed enough to reproduce or adapt to hurdle models.
I have predicted the probability of a non-zero count:

    nonzero <- 1 - predict(final, newdata = datosnuevos, type = "prob")[, 1]

and the predicted mean from the count component:

    countmean <- predict(final, newdata = datosnuevos, type = "count")

I understand that "newdata" makes predict() use the new values of the independent (environmental) variables. Is that right?

So I have to compare the predicted values of y (calculated with the new values of the environmental variables) with the newly observed values. That is, I use the model (built with the old data), give it the new variables as input, and get a "new" prediction as output, to be contrasted with the "new" observed y.

Would this comparison be made by means of AUC, correct classification rate, and/or what other options? Would the result of the external validation just be a percentage of correctly predicted values? Plots?

I need some guidance. Sorry if the explanation is "basic", but I needed to write it in my own words so as not to miss any detail.

Thank you very much in advance,

María Eugenia Utgés
CeNDIE-ANLIS
Buenos Aires
Argentina
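A minimal sketch of how the comparison described above might be computed, using base R only. Everything here is an assumption for illustration rather than a recommended protocol: `y_new`, `mu_hat`, `present` and the helper `auc_rank` are hypothetical names, and all values are simulated stand-ins. With the real model, `nonzero` would come from the predict() call shown above, `y_new` would be the observed counts at the 38 new sites, and `mu_hat` could be the overall expected count (assuming the model was fitted with pscl::hurdle, whose predict method also accepts type = "response").

```r
# All data here are simulated stand-ins (assumptions), not real results.
set.seed(42)
n <- 38                                   # number of new sites, as in the post

nonzero <- runif(n, 0.05, 0.95)           # stand-in for the predicted P(y > 0)
mu_hat  <- runif(n, 0.5, 8)               # stand-in for the predicted overall mean
present <- rbinom(n, 1, nonzero) == 1     # simulated observed presence
y_new   <- ifelse(present, rpois(n, mu_hat) + 1, 0)  # simulated observed counts

# Zero part: AUC via the rank-sum (Mann-Whitney) identity.
# rank() assigns average ranks to ties.
auc_rank <- function(p, present) {
  n1 <- sum(present)
  n0 <- sum(!present)
  (sum(rank(p)[present]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

auc   <- auc_rank(nonzero, y_new > 0)
brier <- mean((nonzero - (y_new > 0))^2)       # Brier score, lower is better
acc   <- mean((nonzero > 0.5) == (y_new > 0))  # correct classification at a 0.5 cutoff

# Counts: prediction error on the original scale
rmse <- sqrt(mean((y_new - mu_hat)^2))
mae  <- mean(abs(y_new - mu_hat))
rho  <- cor(y_new, mu_hat, method = "spearman")

round(c(AUC = auc, Brier = brier, Accuracy = acc,
        RMSE = rmse, MAE = mae, Spearman = rho), 3)
```

The 0.5 cutoff is arbitrary and only there to show the calculation. A plot of observed against predicted values, for example plot(mu_hat, y_new) followed by abline(0, 1), could complement these summaries.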