[R] Data Security when using R

2013-11-11 Thread seanstclair

   Hello.  At the company I work for, I recently requested that R be installed
   on my desktop and on those of some of my colleagues.

   My company's IT/Security groups are having trouble assessing whether R
   software meets their standards.

   Can anyone point me to a source where I can read about how R uses data?
   Does it store the data somewhere?  Does data ever actually leave the
   company's environment?

   Thanks.
   Sean
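
   As a starting point for such an assessment, the sketch below lists the
   local locations that an R session typically reads from or writes to.  It
   assumes a stock R installation with default settings, only reports paths,
   and creates no files.  R keeps loaded data in memory; in base R, writing to
   disk or contacting a network generally happens only when code asks for it
   (for example save.image(), write.csv() or install.packages()).  That should
   be verified against the local setup and any add-on packages in use rather
   than taken from this sketch.

```r
# Report (without modifying anything) where this R session may leave data.
session_paths <- list(
  working_dir = getwd(),                          # default target for relative file paths
  temp_dir    = tempdir(),                        # per-session scratch directory
  workspace   = file.path(getwd(), ".RData"),     # written by save.image() or q(save = "yes")
  history     = file.path(getwd(), ".Rhistory"),  # command history, if the front end saves it
  library     = .libPaths()                       # where installed packages live
)
str(session_paths)

# Does a saved workspace already exist in the working directory?
file.exists(session_paths$workspace)
```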
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R, ctree and categorical variables

2011-07-28 Thread seanstclair

   I am running the ctree function in R.

   My data has about 10 variables, many of which are categorical.  Two of the
   categorical variables have many levels (one has 900 levels, another has
   1,000 levels).  As an example, one of these variables is a disease code and
   is structured as A, B, C, ..., AA, AB, AC.

   Each time I've tried to run the ctree function with these two variables
   included in the data, the function never finishes.  When I remove these two
   variables from the data and run without them, the function returns in about
   3 seconds.

   Q:  Is there a limit to the number of levels that a categorical variable can
   contain?  Is there something else that I may be overlooking?

   Thanks.
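
   One general reason factors with many levels are expensive: an exhaustive
   search over binary splits of an unordered factor with k levels has
   2^(k-1) - 1 candidate partitions.  Whether ctree performs exactly this
   search is an assumption here, not something stated in the thread.  A common
   workaround is to collapse rare levels before fitting.  The base-R sketch
   below uses two hypothetical helpers, n_binary_splits and lump_rare_levels,
   and toy data in place of the real disease codes.

```r
# Number of distinct binary partitions of k unordered levels.
n_binary_splits <- function(k) 2^(k - 1) - 1

# Hypothetical helper: keep the n most frequent levels and lump the
# remaining ones into a single "Other" level. Missing values stay NA.
lump_rare_levels <- function(x, n = 20, other = "Other") {
  x <- as.factor(x)
  counts <- sort(table(x), decreasing = TRUE)
  keep <- names(counts)[seq_len(min(n, length(counts)))]
  out <- as.character(x)
  out[!is.na(out) & !(out %in% keep)] <- other
  factor(out)
}

# Toy data standing in for a high-cardinality disease code.
codes <- factor(c(rep("A", 50), rep("B", 30), rep("C", 2), rep("D", 1)))
lumped <- lump_rare_levels(codes, n = 2)
table(lumped)

n_binary_splits(4)     # small factor: few candidate partitions
n_binary_splits(900)   # astronomically many for 900 levels
```

   The cut-off n is a modelling choice; lumping discards detail about rare
   codes, so it is worth checking that the merged level is still meaningful.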


[R] R: print and ctree

2011-07-31 Thread seanstclair

   I have run the ctree function, and my dependent variable is broken into 3
   categories:  low cost, moderate cost and high cost.

   When I plot the results (e.g. using plot(test.ct)), the plot shows, at the
   very bottom of each node, the probability of falling into each cost
   category.

   When I print the actual results (e.g. using print(test.ct)), I get all of
   the backup information, but I do not get the probability of falling into
   each cost category.

   Is there a way I can get these probabilities to show up in the printed
   summary of results using the print function?

   Thanks.
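
   A minimal sketch of one way to tabulate those probabilities outside of
   print(), assuming an unweighted tree, so that each terminal node's class
   probabilities equal the class proportions of the training observations that
   fall into it.  node_class_probs is a hypothetical helper; with party, the
   node ids would presumably come from where(test.ct).  The toy vectors below
   stand in for real node ids and responses.

```r
# Hypothetical helper: class proportions within each terminal node.
node_class_probs <- function(node_id, response) {
  counts <- table(node = node_id, class = response)
  prop.table(counts, margin = 1)   # normalise each row (node) to sum to 1
}

# Toy stand-ins: terminal node of each observation and its cost category.
node_id <- c(2, 2, 2, 2, 3, 3, 3, 3, 3)
cost <- factor(
  c("low", "low", "low", "moderate", "high", "high", "high", "moderate", "low"),
  levels = c("low", "moderate", "high")
)

probs <- node_class_probs(node_id, cost)
print(round(probs, 2))
```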


[R] helpful functions in R for testing results of tree ("party")

2011-12-27 Thread seanstclair

   Hello all.

   I have been teaching myself how to use recursive partitioning in R,
   particularly using the "party" package.

   Now that I've generated some trees, I would like to understand how to go
   about validating their goodness of fit, accuracy, and so on.  What
   functions can I use?  Do I need to validate each terminal node separately?
   Are there examples I could review that would make this easier to
   understand?

   Thanks in advance for your help.

   Sean
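
   One common approach is to judge the tree as a whole on held-out data rather
   than node by node.  A minimal base-R sketch follows.  The helpers
   confusion_and_accuracy and rmse are hypothetical names, and the toy vectors
   stand in for predictions, which with party would presumably come from
   predict() on a held-out data frame.

```r
# Hold out 30% of n rows for testing (indices only; no model is fitted here).
set.seed(42)
n <- 100
test_idx <- sample(n, size = round(0.3 * n))

# Classification trees: confusion matrix and overall accuracy.
confusion_and_accuracy <- function(predicted, observed) {
  lv <- union(levels(factor(observed)), levels(factor(predicted)))
  tab <- table(predicted = factor(predicted, levels = lv),
               observed  = factor(observed,  levels = lv))
  list(confusion = tab, accuracy = sum(diag(tab)) / sum(tab))
}

# Regression trees: root mean squared error.
rmse <- function(predicted, observed) sqrt(mean((predicted - observed)^2))

# Toy stand-ins for held-out outcomes and the tree's predictions.
observed  <- c("low", "low", "moderate", "high", "high", "moderate")
predicted <- c("low", "moderate", "moderate", "high", "low", "moderate")
res <- confusion_and_accuracy(predicted, observed)
res$confusion
res$accuracy
```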


[R] PCA and Regression with complex categorical variables

2011-10-21 Thread seanstclair



[R] PCA and Regression with complex categorical variables

2011-10-24 Thread seanstclair



[R] ctree question

2012-01-18 Thread seanstclair

   Hello.  I have used the "party" package to generate a regression tree as
   follows:

   > library(party)
   > origdata <- read.csv("origdata.csv")
   > ctrl <- ctree_control(mincriterion = 0.99, maxdepth = 10, minbucket = 10)
   > test.ct <- ctree(Y ~ X1 + X2 + X3, data = origdata, control = ctrl)

   The above works fine.  origdata was my training data.  I now have a test
   data file (testdata), and would like to run testdata through the above
   tree to see predictions.  I tried using the following function

   > predict(test.ct, newdata = testdata)

   but I get the following error:

   Error in checkData(oldData, RET) :
     Levels in factors of new data do not match original data

   I've looked at the testdata file closely and it does not appear to contain
   any levels of factors that were not in the original.  What might I be doing
   incorrectly, and how can I use the tree that was generated above to generate
   predictions for this new file testdata?

   Thanks.
   Sean
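
   One possible cause (an assumption, not confirmed in the thread): when two
   files are read separately, a factor in testdata can end up with a different
   set or order of levels than in origdata, even if every value it contains
   also occurs in origdata.  The sketch below re-codes the factors of the new
   data with the training levels.  align_factor_levels is a hypothetical
   helper, and the toy data frames stand in for origdata and testdata.

```r
# Hypothetical helper: give every factor column of `newdata` the same
# levels (and ordering) as the column of the same name in `reference`.
align_factor_levels <- function(newdata, reference) {
  for (nm in intersect(names(newdata), names(reference))) {
    if (is.factor(reference[[nm]])) {
      newdata[[nm]] <- factor(as.character(newdata[[nm]]),
                              levels  = levels(reference[[nm]]),
                              ordered = is.ordered(reference[[nm]]))
    }
  }
  newdata
}

# Toy stand-ins: the new data uses only a subset of the training levels.
train <- data.frame(X1 = factor(c("a", "b", "c")), X2 = 1:3)
new   <- data.frame(X1 = factor(c("a", "c")),      X2 = 4:5)

levels(new$X1)                       # differs from levels(train$X1)
fixed <- align_factor_levels(new, train)
identical(levels(fixed$X1), levels(train$X1))
```

   With the objects from the post this would be called as
   testdata <- align_factor_levels(testdata, origdata) before predict().
   Values that do not occur among the training levels become NA.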