Hi All, I have the following script, which raises an error at the last command. I am new to R and would appreciate some clarification on what is going wrong.

#Creating the training and testing data sets
splitFlag <- sample.split(pfi_v3, SplitRatio = 0.7)
trainPFI <- subset(pfi_v3, splitFlag==TRUE)
testPFI <- subset(pfi_v3, splitFlag==FALSE)

#Structure of the trainPFI data frame
> str(trainPFI)
*******
'data.frame': 491 obs. of 16 variables:
 $ project_id       : int 1 2 3 6 7 9 10 12 13 14 ...
 $ project_lat      : num 51.4 51.5 52.2 51.9 52.5 ...
 $ project_lon      : num -0.642 -1.85 0.08 -0.401 -1.888 ...
 $ sector           : Factor w/ 9 levels "Defense","Hospitals",..: 4 4 4 6 6 6 6 6 6 6 ...
 $ contract_type    : chr "Turnkey" "Turnkey" "Turnkey" "Turnkey" ...
 $ project_duration : int 1826 3652 121 730 730 790 522 819 998 372 ...
 $ project_delay    : int -323 0 -60 0 0 0 -91 0 0 7 ...
 $ capital_value    : num 6.7 5.8 21.8 24.2 40.7 10.7 70 24.5 60.5 78 ...
 $ project_delay_pct: num -17.7 0 -49.6 0 0 0 -17.4 0 0 1.9 ...
 $ delay_type       : Ord.factor w/ 9 levels "7 months early & beyond"<..: 1 5 3 5 5 5 2 5 5 6 ...

library(caret)
library(e1071)

set.seed(100)
tr.control <- trainControl(method="cv", number=10)
cp.grid <- expand.grid(.cp = (0:10)*0.001)

#Fitting the model using regression tree
tr_m <- train(project_delay ~ project_lon + project_lat + project_duration + sector + contract_type + capital_value,
              data = trainPFI, method="rpart", trControl=tr.control, tuneGrid = cp.grid)

tr_m

CART

491 samples
 15 predictor

No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 443, 442, 441, 442, 441, 442, ...
Resampling results across tuning parameters:

  cp     RMSE      Rsquared
  0.000  441.1524  0.5417064
  0.001  439.6319  0.5451104
  0.002  437.4039  0.5487203
  0.003  432.3675  0.5566661
  0.004  434.2138  0.5519964
  0.005  431.6635  0.5577771
  0.006  436.6163  0.5474135
  0.007  440.5473  0.5407240
  0.008  441.0876  0.5399614
  0.009  441.5715  0.5401718
  0.010  441.1401  0.5407121

RMSE was used to select the optimal model using the smallest value.
The final value used for the model was cp = 0.005.

#Fetching the best tree
best_tree <- tr_m$finalModel

All of the commands above ran without problems. However, the next command raises an error when the fitted model is used to make predictions on the test set:

best_tree_pred <- predict(best_tree, newdata = testPFI)
Error in eval(expr, envir, enclos) : object 'sectorHospitals' not found

Can someone guide me on how to resolve this issue? Any help will be highly appreciated.

Many thanks and kind regards,

--
Muhammad Bilal
Research Fellow and Doctoral Researcher,
Bristol Enterprise, Research, and Innovation Centre (BERIC),
University of the West of England (UWE),
Frenchay Campus, Bristol, BS16 1QY
muhammad2.bi...@live.uwe.ac.uk
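
A possible explanation, offered as a sketch rather than a confirmed diagnosis: when train() is called with a formula, caret expands factor predictors into dummy columns before fitting, so the rpart object stored in $finalModel expects columns such as sectorHospitals instead of the original sector factor. Calling predict() on the train object itself (tr_m in the script above) applies the same encoding to newdata first. The example below reproduces the pattern on the built-in iris data, which is only a stand-in because pfi_v3 is not available; the names train_df, test_df, fit, raw_attempt and pred are hypothetical.

```r
library(caret)   # the rpart package must also be installed

set.seed(100)

# Stand-in data: iris replaces pfi_v3, which is not available here.
idx      <- sample(nrow(iris), size = round(0.7 * nrow(iris)))
train_df <- iris[idx, ]
test_df  <- iris[-idx, ]

# Same setup as in the script: 10-fold CV over a grid of cp values,
# with a factor (Species) among the predictors.
fit <- train(Sepal.Length ~ Species + Sepal.Width,
             data      = train_df,
             method    = "rpart",
             trControl = trainControl(method = "cv", number = 10),
             tuneGrid  = expand.grid(cp = (0:10) * 0.001))

# The inner rpart object was fit on dummy-coded columns, so applying it
# to the raw data frame is expected to fail with an "object not found" error.
raw_attempt <- try(predict(fit$finalModel, newdata = test_df), silent = TRUE)
print(inherits(raw_attempt, "try-error"))

# Predicting through the train object encodes newdata the same way
# as the training data before calling the underlying model.
pred <- predict(fit, newdata = test_df)
print(head(pred))
```

If this is indeed the cause, the corresponding change in the script would be predict(tr_m, newdata = testPFI) in place of predict(best_tree, newdata = testPFI).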