[R] Data Security when using R
Hello. At the company I work for, I recently requested that R be installed on my desktop and on those of some of my colleagues. My company's IT/Security groups are having trouble assessing whether R meets their standards. Can anyone point me to a source where I can read about how R handles data? Does it store the data somewhere? Does data ever actually leave the company's environment? Thanks. Sean __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R, ctree and categorical variables
I am running the ctree function in R. My data has about 10 variables, many of which are categorical. Two of the categorical variables have many levels (one has 900 levels, another has 1,000 levels). As an example, one of these variables is a disease code and is structured as A, B, C, ..., AA, AB, AC. Each time I've tried to run the ctree function with these two variables included in the data, the function never stops running. When I remove these two variables from the data and run without them, the function returns in about 3 seconds. Q: Is there a limit to the number of levels that a categorical variable can contain? Is there something else that I may be overlooking? Thanks.
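For a sense of scale, here is a rough sketch of how quickly the search space can grow. It assumes (nothing in this thread confirms it for ctree) that a tree method considers every binary partition of an unordered factor's levels, in which case a factor with k levels has 2^(k-1) - 1 candidate splits. The helper name n_binary_splits is made up for illustration.

```r
# Hypothetical helper: number of ways to divide k unordered levels
# into two non-empty groups, i.e. 2^(k - 1) - 1.
# Assumption: the method searches all such partitions exhaustively.
n_binary_splits <- function(k) 2^(k - 1) - 1

# Small factors are cheap; 900 or 1,000 levels are not.
n_binary_splits(c(3, 10, 900, 1000))
```

Under that assumption, 3 levels give 3 candidate splits and 10 levels give 511, while 900 or 1,000 levels give counts far beyond anything that could be enumerated, which would be consistent with a run that never finishes.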
[R] R: print and ctree
I have run the ctree function, and my dependent variable is broken into 3 categories: low cost, moderate cost and high cost. When I plot the results (e.g. using plot(test.ct)), the plot shows, at the very bottom of each node, the probability of falling into each cost category. When I print the actual results (e.g. using print(test.ct)), I get all of the backup information, but I do not get the probability of falling into each cost category. Is there a way I can get these probabilities to show up in the summary of results produced by the print function? Thanks.
[R] helpful functions in R for testing results of tree ("party")
Hello all. I have been teaching myself how to use recursive partitioning in R, particularly using the "party" package. Now that I've generated some trees, I would like to understand how to go about validating the goodness of fit, accuracy, etc. of the trees. What functions can I use? Do I need to validate each terminal node separately? Are there examples that I could review which would make this more easily understood? Thanks in advance for your help. Sean
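One common starting point for a classification tree, sketched here with base R only, is to compare predicted and observed classes on data the tree did not see. The two vectors below are invented for illustration; in practice the predictions would come from the fitted tree applied to a held-out set. This checks the tree as a whole rather than each terminal node separately.

```r
# Made-up observed and predicted classes for a held-out test set.
lv <- c("low", "moderate", "high")
actual    <- factor(c("low", "low", "moderate", "high", "high", "moderate"), levels = lv)
predicted <- factor(c("low", "moderate", "moderate", "high", "low", "moderate"), levels = lv)

# Confusion matrix: rows are predictions, columns are observations.
conf_mat <- table(predicted = predicted, actual = actual)
print(conf_mat)

# Overall accuracy: share of cases on the diagonal.
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(accuracy)
```

The off-diagonal cells show which classes get confused with each other, which is often more informative than the single accuracy figure.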
[R] PCA and Regression with complex categorical variables
[R] ctree question
Hello. I have used the "party" package to generate a regression tree as follows:

> origdata <- read.csv("origdata.csv")
> ctrl <- ctree_control(mincriterion = 0.99, maxdepth = 10, minbucket = 10)
> test.ct <- ctree(Y ~ X1 + X2 + X3, data = origdata, control = ctrl)

The above works fine. origdata was my training data. I now have a test data file (testdata), and would like to run the testdata through the above tree to see predictions. I tried using the following function

> predict(test.ct, newdata = testdata)

but I get the following error:

Error in checkData(oldData, RET) : Levels in factors of new data do not match original data

I've looked at the testdata file closely and it does not appear to contain any levels of factors that were not in the original. What might I be doing incorrectly, and how can I use the tree that was generated above to generate predictions for this new file testdata? Thanks. Sean
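One possible cause, offered as a guess since the data are not shown: when two files are read separately, each factor's levels are built from the values present in that file, so a test file that merely lacks a category ends up with a different level set even though it contains no new values. The sketch below uses small made-up data frames in place of origdata and testdata and re-declares the test factors with the training levels.

```r
# Made-up stand-ins for the training and test data.
origdata <- data.frame(X1 = factor(c("a", "b", "c", "a")))
testdata <- data.frame(X1 = factor(c("a", "b", "a")))  # "c" never occurs here

# No new values in the test data, yet the level sets differ.
print(identical(levels(testdata$X1), levels(origdata$X1)))

# Re-declare every factor in the test data with the training levels.
for (v in names(origdata)) {
  if (is.factor(origdata[[v]]) && v %in% names(testdata)) {
    testdata[[v]] <- factor(testdata[[v]], levels = levels(origdata[[v]]))
  }
}

# The level sets now agree, and no value was turned into NA.
print(identical(levels(testdata$X1), levels(origdata$X1)))
```

If any value in the test data really were absent from the training levels, this step would turn it into NA, so checking for new NAs afterwards is a quick way to confirm the assumption.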