Re: [R] a question about data manipulation in R

2015-09-15 Thread John Posner
Given your "input" data frame, with variables "V1" and "V2", here's a solution. This might not be the most "R-like" solution, since I'm still more of a Python refugee than a native R coder.

-John

    # analyze input, using run-length encoding
    runs_table = rle(input$V1)
    number_of_runs = length(runs
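The run-length-encoding idea mentioned in this snippet can be sketched as follows. The message is cut off, so this is only an illustration under assumptions: the contents of `input` are invented, and both the way `number_of_runs` is finished and the `run_id` helper are hypothetical, not the poster's actual code.

```r
# Hypothetical stand-in for the poster's data frame; the real values are not shown.
input <- data.frame(V1 = c("a", "a", "b", "b", "b", "a"),
                    V2 = 1:6,
                    stringsAsFactors = FALSE)

# Run-length encoding: consecutive equal values of V1 collapse into one run.
runs_table <- rle(input$V1)

# Assumed completion of the truncated line: count the runs.
number_of_runs <- length(runs_table$lengths)

# Hypothetical helper: label every row with the index of the run it belongs to.
run_id <- rep(seq_len(number_of_runs), times = runs_table$lengths)
```

With a per-row run index in hand, base functions such as `split()` or `tapply()` can then process each run separately.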

[R] dplyr: producing a good old data frame

2015-02-23 Thread John Posner
I'm using the dplyr package to perform one-row-at-a-time processing of a data frame:

    > rnd6 = function() sample(1:300, 6)
    > frm = data.frame(AA=rnd6(), BB=rnd6(), CC=rnd6())
    > frm
       AA  BB  CC
    1 123  50  45
    2  12  30 231
    3 127 147 100
    4 133  32 129
    5  66 235  71
    6  38 264 261

The interface is

[R] Coding style question

2015-02-17 Thread John Posner
In the course of slicing-and-dicing some data, I had occasion to create a list like this: list( subset(my_dataframe, GR1=="XX1"), subset(my_dataframe, GR1=="XX2"), subset(my_dataframe, GR1=="YY"), subset(my_dataframe, GR1 %in% c("XX1", "XX2")), subset(my_dataframe, GR2=="Remi

Re: [R] Paste every two columns together

2015-01-28 Thread John Posner
Kate, here's a solution that uses regular expressions, rather than vector manipulation:

    > mystr = "ID1 A A T G C T G C G T C G T A"
    > gsub(" ([ACGT]) ([ACGT])", " \\1\\2", mystr)
    [1] "ID1 AA TG CT GC GT CG TA"

-John

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.o
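As a small follow-on to the regular-expression approach in this message, the paired string can be split back into fields with base R. The `paired` and `fields` names are illustrative additions, and the sketch assumes the ID token itself is never a single base letter.

```r
mystr <- "ID1 A A T G C T G C G T C G T A"

# Each match is a space, a base, a space, and a second base;
# the replacement keeps the leading space and joins the two bases.
paired <- gsub(" ([ACGT]) ([ACGT])", " \\1\\2", mystr)

# Hypothetical next step: split into the ID followed by the two-letter pairs.
fields <- strsplit(paired, " ", fixed = TRUE)[[1]]
```

Because `gsub()` matches do not overlap, the bases are consumed two at a time from left to right, so an odd trailing base would be left unpaired.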

Re: [R] Separating a Complicated String Vector

2015-01-04 Thread John Posner
I'm coming to R from Python, so I coded a Python3 solution: # data = """alabama bates tuscaloosa smith arkansas fayette little rock alaska juneau nome """.split() state_list = ["alabama", "arkansas", "alaska"] # etc. return_list = [] for word in data: if word in state_l

Re: [R] Condensing data.frame

2014-12-07 Thread John Posner
e- > From: Jeff Newmiller [mailto:jdnew...@dcn.davis.ca.us] > Sent: Sunday, December 07, 2014 3:14 PM > To: John Posner > Cc: 'Chel Hee Lee'; Morway, Eric; R mailing list > Subject: Re: [R] Condensing data.frame > > dplyr version (good for large datasets): > > l

Re: [R] Condensing data.frame

2014-12-07 Thread John Posner
Here's a solution using the plyr library: library(plyr) dat <- read.table(header=TRUE, sep=",", as.is=TRUE, ## < as.is=TRUE text="site,tax_name,count,countTotal,countPercentage CID_1,Cyanobacteria,46295,123509,37.483098398 CID_1,Proteobacteria,36120,123509,29.244832

Re: [R] dplyr/summarize does not create a true data frame

2014-11-23 Thread John Posner
structure(list(Id = structure(1:10, .Label = c("P01", "P02", "P03", "P04", "P05", "P06", "P07", "P08", "P09", "P10"), class = "factor"), Sex = structure(c(2L, 1L, 1L, 2L, 2L, 2L, 1L, 2L

[R] dplyr/summarize does not create a true data frame

2014-11-21 Thread John Posner
I got an error when trying to extract a 1-column subset of a data frame (called "my.output") created by dplyr/summarize. The ncol() function says that my.output has 4 columns, but "my.output[4]" fails. Note that converting my.output using as.data.frame() makes for a happy ending. Is this the in

Re: [R] Help with ddply/summarize

2014-11-14 Thread John Posner
> -Original Message- > From: David L Carlson [mailto:dcarl...@tamu.edu] > Sent: Friday, November 14, 2014 10:25 AM > To: John Posner; 'r-help@r-project.org' > Subject: RE: Help with ddply/summarize > > I think this is what you want: > > > MyVar &l

[R] Help with ddply/summarize

2014-11-13 Thread John Posner
I have a straightforward application of ddply() and summarize():

    ddply(MyFrame, .(Treatment, Week), summarize, MeanValue=mean(MyVar))

This works just fine:

      Treatment     Week MeanValue
    1    MyDrug BASELINE      5.91
    2    MyDrug   WEEK 1      4.68
    3    MyDrug   WEEK 2      4.08
    4    MyDr
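For comparison, the same kind of grouped mean can be sketched in base R with `aggregate()`. This is a swapped-in technique, not the ddply() call from the message, and the values in `MyFrame` are made up for illustration; they are not the poster's data.

```r
# Hypothetical data; only the column names follow the message.
MyFrame <- data.frame(
  Treatment = c("MyDrug", "MyDrug", "MyDrug", "MyDrug"),
  Week      = c("BASELINE", "BASELINE", "WEEK 1", "WEEK 1"),
  MyVar     = c(6, 5, 4, 5)
)

# Mean of MyVar for each Treatment/Week combination.
result <- aggregate(MyVar ~ Treatment + Week, data = MyFrame, FUN = mean)

# Rename the summary column to match the name used in the ddply() call.
names(result)[names(result) == "MyVar"] <- "MeanValue"
```

The formula interface needs no extra packages, which can be handy for checking a plyr result against an independent calculation.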