Re: [R] splitting dataset based on variable and re-combining

2012-12-10 Thread Brian Feeny
prediction from model 1 (Setosa) for entire dataset > > P2 <- ... # prediction from model 2 for entire dataset > > I <- Species=="setosa" # > > Predictions <- P1 * I + P2 * ( 1 - I ) > > On Monday, December 10, 2012, Brian Feeny wrote: > > I ha
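
The indicator-based combination quoted in this reply can be sketched as follows. This is only an illustration: `lm()` stands in for the SVM models discussed in the thread, and the choice of columns from `iris` is an assumption made for the example.

```r
# Sketch only: lm() replaces the SVMs from the thread; columns are illustrative.
data(iris)
is_setosa <- iris$Species == "setosa"

# Fit one model per group, each on its own subset of rows.
m1 <- lm(Sepal.Length ~ Sepal.Width, data = iris[is_setosa, ])
m2 <- lm(Sepal.Length ~ Sepal.Width, data = iris[!is_setosa, ])

# Predict for every row with both models, in the original row order.
p1 <- predict(m1, newdata = iris)
p2 <- predict(m2, newdata = iris)

# The logical indicator selects, row by row, which prediction to keep.
predictions <- p1 * is_setosa + p2 * (1 - is_setosa)
```

Because both models predict for the full dataset, the combined vector keeps the original row order. For class (factor) predictions the arithmetic does not apply; `ifelse(is_setosa, as.character(p1), as.character(p2))` is one possible alternative.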

[R] splitting dataset based on variable and re-combining

2012-12-10 Thread Brian Feeny
I have a dataset and I wish to use two different models to predict. Both models are SVM. The reason for two different models is based on the sex of the observation. I wish to be able to make predictions and have the results be in the same order as my original dataset. To illustrate I will us

Re: [R] Assignment of values with different indexes

2012-12-05 Thread Brian Feeny
5, 2012, at 11:49 PM, arun wrote: > > > Hi, > > Would it be okay to use: > y<-na.omit(y[myindex]<-x) > y > # [1] -1.36025132 -0.57529211 1.18132359 0.41038489 1.83108252 -0.03563686 > #[7] 1.25267314 1.08311857 1.56973422 -0.30752939 > > A.K. >

[R] Assignment of values with different indexes

2012-12-05 Thread Brian Feeny
I would like to take the values of observations and map them to a new index. I am not sure how to accomplish this. The result would look like so: x[1,2,3,4,5,6,7,8,9,10] becomes y[2,4,6,8,10,12,14,16,18,20] The "newindex" would not necessarily be this sequence, but a sequence I have stored i
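
A minimal sketch of one way to do this mapping in base R, assuming the new positions are stored in a vector (called `myindex` here, with made-up values) and that unfilled positions should be `NA`:

```r
set.seed(1)
x <- rnorm(10)                      # values to relocate
myindex <- seq(2, 20, by = 2)       # hypothetical stored index: 2, 4, ..., 20

# Pre-allocate the target with NA, then assign by position.
y <- rep(NA_real_, max(myindex))
y[myindex] <- x
```

After this, `y[myindex]` returns the original values and every other position of `y` is `NA`.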

Re: [R] How to re-combine values based on an index?

2012-12-01 Thread Brian Feeny
tibco.com > > >> -----Original Message- >> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On >> Behalf >> Of Brian Feeny >> Sent: Saturday, December 01, 2012 8:04 PM >> To: r-help@r-project.org >> Subject: [R] How t

[R] How to re-combine values based on an index?

2012-12-01 Thread Brian Feeny
I am able to split my df into two like so: dataset <- trainset index <- 1:nrow(dataset) testindex <- sample(index, trunc(length(index)*30/100)) trainset <- dataset[-testindex,] testset <- dataset[testindex,-1] So I have the index information, how could I re-combine the data using that back into
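
One possible way to put per-subset results back into the original row order is to assign them by the same index used for the split. The sketch below uses toy data and `lm()` as a placeholder model; both are assumptions for illustration, not the poster's actual setup.

```r
set.seed(42)
dataset <- data.frame(label = rnorm(20), x = rnorm(20))   # toy data

index <- 1:nrow(dataset)
testindex <- sample(index, trunc(length(index) * 30 / 100))
trainset <- dataset[-testindex, ]
testset <- dataset[testindex, -1, drop = FALSE]   # drop = FALSE keeps a data frame

fit <- lm(label ~ x, data = trainset)             # placeholder model
pred_train <- fitted(fit)
pred_test <- predict(fit, newdata = testset)

# Re-combine: write each set of predictions into its original positions.
combined <- numeric(nrow(dataset))
combined[testindex] <- pred_test
combined[-testindex] <- pred_train
```

`combined[i]` then corresponds to row `i` of `dataset`, regardless of which subset that row was sampled into.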

Re: [R] Help with this error "kernlab class probability calculations failed; returning NAs"

2012-11-29 Thread Brian Feeny
r levels to avoid leading numbers and try again. > > Max > > > > > On Thu, Nov 29, 2012 at 10:18 PM, Brian Feeny wrote: > > > Yes I am still getting this error, here is my sessionInfo: > > > sessionInfo() > R version 2.15.2 (2012-10-26) >

Re: [R] Help with this error "kernlab class probability calculations failed; returning NAs"

2012-11-29 Thread Brian Feeny
> Upgrade to the version just released on cran and see if you still have the > issue. > > Max > > > On Thu, Nov 29, 2012 at 6:55 PM, Brian Feeny wrote: > I have never been able to get class probabilities to work and I am relatively > new to using these tools, and I am l

[R] Help with this error "kernlab class probability calculations failed; returning NAs"

2012-11-29 Thread Brian Feeny
I have never been able to get class probabilities to work and I am relatively new to using these tools, and I am looking for some insight as to what may be wrong. I am using caret with kernlab/ksvm. I will simplify my problem to a basic data set which produces the same problem. I have read th

Re: [R] Building factors across two columns, is this possible?

2012-11-23 Thread Brian Feeny
0 5 4 7 9 6 5 3 10 does this make sense? I am hoping there is a way to accomplish this. Brian On Nov 23, 2012, at 11:42 PM, Brian Feeny wrote: > > I am trying to make it so two columns with similar data use the same internal > numbers for same fa

[R] Building factors across two columns, is this possible?

2012-11-23 Thread Brian Feeny
I am trying to make it so two columns with similar data use the same internal numbers for same factors, here is the example: > read.csv("test.csv",header =FALSE,sep=",") V1 V2 V3 1 sun moon stars 2 stars moon sun 3 cat dog catdog 4 dog moon sun 5 bird pla
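
A short sketch of one approach: build a single set of levels from both columns and apply it to each. The data frame below is a small stand-in loosely modeled on the example in the post, not the actual file.

```r
df <- data.frame(V1 = c("sun", "stars", "cat", "dog"),
                 V2 = c("moon", "moon", "dog", "moon"),
                 stringsAsFactors = FALSE)

# Shared level set taken from the union of both columns.
lev <- sort(unique(c(df$V1, df$V2)))

df$V1 <- factor(df$V1, levels = lev)
df$V2 <- factor(df$V2, levels = lev)

# The integer codes now agree across columns for the same label.
codes_v1 <- as.integer(df$V1)
codes_v2 <- as.integer(df$V2)
```

With shared levels, "dog" has the same internal code whether it appears in `V1` or `V2`.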

Re: [R] caret train and trainControl

2012-11-23 Thread Brian Feeny
Fold01: k= 7 > > + Fold01: k= 9 > > - Fold01: k= 9 > > + Fold01: k=11 > > - Fold01: k=11 > > > > + Fold10: k=17 > > - Fold10: k=17 > > + Fold10: k=19 > > - Fold10: k=19 > > + Fold10: k=21 > > - Fold10: k=21 >

[R] caret train and trainControl

2012-11-23 Thread Brian Feeny
I am used to packages like e1071 where you have a tune step and then pass your tunings to train. It seems with caret, tuning and training are both handled by train. I am using train and trainControl to find my hyper parameters like so: MyTrainControl=trainControl( method = "cv", number=5,

Re: [R] What is the . in formula ~. syntax?

2012-11-23 Thread Brian Feeny
Thank you! I searched in the manual, but I did not see where this is mentioned, I looked under operators and in some of the formula documentation. Brian On Nov 23, 2012, at 3:15 AM, Michael Weylandt wrote: > > > On Nov 23, 2012, at 4:26 AM, Brian Feeny wrote: > >>

[R] What is the . in formula ~. syntax?

2012-11-22 Thread Brian Feeny
I know if I have a dataframe with columns y, x1, x2 and I wish to have y as my y value and x1 and x2 as x values I can do: y ~ x1 + x2 or y ~. but can someone explain what . actually is or what it is expanded into? I searched for this with no success, reading the "formula" manual pages. Bria
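
In a model formula, `.` stands for all columns of `data` that do not otherwise appear in the formula. A small sketch showing the expansion with `terms()`, using toy data invented for the example:

```r
set.seed(1)
df <- data.frame(y = rnorm(5), x1 = rnorm(5), x2 = rnorm(5))

# terms() resolves the dot against the columns of `data`.
expanded <- attr(terms(y ~ ., data = df), "term.labels")

# Both formulas therefore fit the same model.
fit_dot <- lm(y ~ ., data = df)
fit_explicit <- lm(y ~ x1 + x2, data = df)
```

Here `expanded` holds the predictor names that the dot was replaced with, and the two fits have identical coefficients.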

[R] Using doMC to run parallel SVM grid search?

2012-11-21 Thread Brian Feeny
Has anyone used doMC to speed up an SVM grid search? I am considering doing like so: library(doMC) registerDoMC() foreach (i=0:3) %dopar% { tuned_part1 <- tune.svm(label~., data = trainset, gamma = 10^(-10:-6), cost = 10^(-1:1)) tuned_part2 <- tune.svm(label~., data = trainset,

Re: [R] cluster analysis in R

2012-11-21 Thread Brian Feeny
http://cran.r-project.org/web/views/Cluster.html might be a good start Brian On Nov 21, 2012, at 1:36 PM, KitKat wrote: > Thank you for replying! > I made a new post asking if there are any websites or files on how to > download package mclust (or other Bayesian cluster analysis packages) an

[R] Scaling values 0-255 -> -1 , 1 - how can this be done?

2012-11-21 Thread Brian Feeny
I have a dataframe in which I have values 0-255, I wish to transform them such that: if value > 127.5 value = 1 if value < 127.5 value = -1 I did something similar using the "binarize" function of the biclust package, this transforms my dataframe to 0 and 1 values, but I wish to use -1 and 1
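
A minimal base R sketch of the thresholding described here, without biclust. The data frame is made up for the example, and mapping a value of exactly 127.5 to -1 is an assumption, since the post does not say how that case should be handled.

```r
df <- data.frame(a = c(0, 128, 255), b = c(127, 200, 3))

# Apply the threshold column by column; df[] keeps the data frame structure.
scaled <- df
scaled[] <- lapply(df, function(v) ifelse(v > 127.5, 1, -1))
```

`scaled` has the same column names and dimensions as `df`, with every value replaced by -1 or 1.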

Re: [R] e1071 SVM: Cross-validation error confusion matrix

2012-11-20 Thread Brian Feeny
responding to my own question, I see in ?svm man it states fitted() and predict() can do the same thing: # test with train data pred <- predict(model, x) # (same as:) pred <- fitted(model) On Nov 21, 2012, at 1:08 AM, signal wrote: > Did you ever receive a response to this? I did not see one

[R] Removing columns that are na or constant

2012-11-20 Thread Brian Feeny
I have a dataset that has many columns which are NA or constant, and so I remove them like so: same <- sapply(dataset, function(.col){ all(is.na(.col)) || all(.col[1L] == .col) }) dataset <- dataset[!same] This works GREAT (thanks to the r-users list archive I found this) however, then
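
The comparison `.col[1L] == .col` yields `NA` when a column mixes `NA` with real values, which can make the filter itself return `NA`. A hedged variant that ignores `NA` when testing for constancy is sketched below; treating a column such as `c(2, NA, 2)` as constant is an assumption that may or may not suit the original data.

```r
dataset <- data.frame(a = c(1, 1, 1),       # constant
                      b = c(NA, NA, NA),    # all NA
                      c = c(1, 2, 3),       # informative
                      d = c(2, NA, 2))      # constant apart from NA

# TRUE when a column has at most one distinct non-NA value.
is_uninformative <- function(col) {
  vals <- col[!is.na(col)]
  length(unique(vals)) <= 1
}

drop <- vapply(dataset, is_uninformative, logical(1))
cleaned <- dataset[!drop]
```

`vapply()` guarantees a plain logical vector with no `NA`, so the column subsetting is always well defined.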

Re: [R] data after write() is off by 1 ?

2012-11-20 Thread Brian Feeny
0, 2012, at 2:30 PM, Brian Feeny wrote: > I am new to R, so I am sure I am making a simple mistake. I am including > complete information in hopes > someone can help me. > > Basically my data in R looks good, I write it to a file, and every value is > off by 1. > > Here

[R] data after write() is off by 1 ?

2012-11-20 Thread Brian Feeny
I am new to R, so I am sure I am making a simple mistake. I am including complete information in hopes someone can help me. Basically my data in R looks good, I write it to a file, and every value is off by 1. Here is my flow: > str(prediction) Factor w/ 10 levels "0","1","2","3",..: 3 1 10
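
A likely cause, given that `prediction` is a factor with levels starting at "0", is that the internal integer codes (which start at 1) are being written instead of the labels. The sketch below assumes the ten levels are "0" through "9", which the truncated `str()` output suggests but does not confirm.

```r
# Assumed levels "0".."9"; the three values reproduce the codes 3 1 10 shown by str().
prediction <- factor(c("2", "0", "9"), levels = as.character(0:9))

codes <- as.integer(prediction)                   # internal codes, one higher than the labels
values <- as.integer(as.character(prediction))    # the labels themselves, as numbers
```

Converting through `as.character()` before writing recovers the intended values rather than the level codes.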

Re: [R] How to subset my data and at the same time keep the balance?

2012-11-19 Thread Brian Feeny
Just curious, once you have a model that works well, does it make sense to then tune it against 100% of the dataset (with known outcomes) so you can apply it to data you wish to predict for or is that a bad approach? I have done like is explained in this thread many times, taken a sample, learn

[R] Best prediction to use for basic problem?

2012-11-18 Thread Brian Feeny
I have a rather basic set of data. It is simply a variable that can be 0, 1 or 2 and its value over a series of time t0 - t9 like so: y: 1 1 2 0 1 2 2 1 2 1 x: t0 t1 t2 t3 t4 t5 t6 t7 t8

Re: [R] help interpreting dudi.pco

2012-11-18 Thread Brian Feeny
I am new to R as well, it sounds like you would want to look at clustering, perhaps k-means clustering. Brian On Nov 18, 2012, at 12:19 AM, avadhoot velankar wrote: > I am working on morphometry of hairs and want to see if selected variables > are giving significantly distinct groups. > > I

Re: [R] library/function to compare two phrases?

2012-11-17 Thread Brian Feeny
Thank you Michael and David. I am onto agrep and adist and they look very useful for what I am wanting to do. My initial results are promising! Brian On Nov 17, 2012, at 6:20 PM, R. Michael Weylandt wrote: > On Sat, Nov 17, 2012 at 11:00 PM, Brian Feeny wrote: >> I am looking for

[R] library/function to compare two phrases?

2012-11-17 Thread Brian Feeny
I am looking for a library/function in R that can compare two phrases and give me a score, or somehow classify them as correct as possible. The "phrases" are obfuscated/messy. I am not concerned about which is "correct" (for example spell checking), I am only concerned in grouping them so that
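
As a rough sketch of the edit-distance approach mentioned in the follow-up (`adist`), the phrases can be compared pairwise and then grouped by hierarchical clustering. The phrases and the choice of two groups are invented for illustration.

```r
phrases <- c("acme widget co", "acme wdget co.",
             "globex corporation", "globex corpration")

# Pairwise generalized Levenshtein distances between all phrases.
d <- utils::adist(phrases)

# Group similar phrases; k = 2 is an assumption for this toy example.
groups <- cutree(hclust(as.dist(d), method = "average"), k = 2)
```

Phrases that differ by only a few edits end up in the same group. In practice the number of groups (or a distance cutoff via the `h` argument of `cutree()`) would need tuning to the data.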

Re: [R] Strange problem with reading a pipe delimited file

2012-11-17 Thread Brian Feeny
= paste("V", seq_len(ncol), sep = "")) Thank you for your help Brian On Nov 17, 2012, at 4:34 PM, Brian Feeny wrote: > > On Nov 17, 2012, at 4:27 PM, Duncan Murdoch wrote: >>> >> >> I would suggest reading the help file: read.delim onl

Re: [R] Strange problem with reading a pipe delimited file

2012-11-17 Thread Brian Feeny
On Nov 17, 2012, at 4:27 PM, Duncan Murdoch wrote: >> > > I would suggest reading the help file: read.delim only looks at the first 5 > lines to determine the number of columns if you don't specify the colClasses. > > Duncan Murdoch > Duncan, I have tried to pass colClasses but R complains

[R] Strange problem with reading a pipe delimited file

2012-11-17 Thread Brian Feeny
I am trying to read in a pipe delimited file that has rows with varying number of columns, here is my sample data: A|B|C|D A|B|C|D|E|F A|B|C|D|E A|B|C|D|E|F|G|H|I A|B|C|D A|B|C|D|E|F|G|H|I|J You can see line 6 has 10 columns. Yet, I can't explain why R does like so: > test <- read.delim("mypat
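
One way to read such ragged input is to count the fields first and pass explicit column names, in line with the `col.names = paste("V", seq_len(ncol), ...)` fragment in the follow-up. The sketch below reads the sample from an inline string rather than a file, and forcing all columns to character is an added assumption to stop values such as "F" being converted to logical.

```r
txt <- "A|B|C|D
A|B|C|D|E|F
A|B|C|D|E
A|B|C|D|E|F|G|H|I
A|B|C|D
A|B|C|D|E|F|G|H|I|J"

# Find the widest row across the whole input, not just the first five lines.
tc <- textConnection(txt)
n_col <- max(count.fields(tc, sep = "|"))
close(tc)

# fill = TRUE pads short rows; col.names fixes the column count up front.
test <- read.table(text = txt, sep = "|", header = FALSE, fill = TRUE,
                   col.names = paste("V", seq_len(n_col), sep = ""),
                   colClasses = "character")
```

With a real file, the same two steps apply by passing the file path to both `count.fields()` and `read.table()`.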

Re: [R] Using cbind to combine data frames and preserve header/names

2012-11-17 Thread Brian Feeny
17, 2012, at 11:25 AM, David Winsemius wrote: > > On Nov 16, 2012, at 9:39 PM, Brian Feeny wrote: > >> I have a dataframe that has a header like so: >> >> class value1 value2 value3 >> >> class is a factor >> >> the actual values in t

[R] Using cbind to combine data frames and preserve header/names

2012-11-16 Thread Brian Feeny
I have a dataframe that has a header like so: class value1 value2 value3 class is a factor the actual values in the columns value1, value2 and value3 are 0-255, I wish to binarize these using biclust. I can do this like so: binarize(dataframe[,-1]) this will return a dataframe, but then I
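
A hedged sketch of re-attaching the `class` column while keeping all column names. A plain threshold at 127.5 stands in for biclust's `binarize()` here, so the binarization step is an assumption; the `cbind()` pattern is the point of the example.

```r
dataframe <- data.frame(class = factor(c("a", "b", "a")),
                        value1 = c(0, 200, 255),
                        value2 = c(130, 5, 90),
                        value3 = c(255, 255, 0))

# Stand-in for binarize(): 1 above the threshold, 0 otherwise.
binarized <- as.data.frame((dataframe[, -1] > 127.5) * 1)

# Single-bracket indexing keeps `class` as a named one-column data frame.
result <- cbind(dataframe["class"], binarized)
```

Using `dataframe["class"]` rather than `dataframe$class` is what preserves the header: `cbind()` takes column names from data frame arguments, whereas a bare vector would be named after the expression that produced it.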