[R] how to reach a txt file like this?

2015-06-09 Thread Ye Lin
Hey All, I have a txt data file that looks like this: [{"ID":"A","Name":"Tom", "Age":"18"},{"ID":"B","Name":"Jim", "Age":"19"}] How can I read this into R as a data frame? I have used readLines to read all the lines but don't know how to deal with column names and inputs. Thanks for your help

[R] how to read a local JSON file

2015-06-06 Thread Ye Lin
Hi All, I downloaded a data file from Dropbox and it's in JSON format. Here is my code: library(RJSONIO) data <- fromJSON(file='C:/Users/Downloads/sample.json') Lines <- readLines("C:/Users/Downloads/sample.json") df <- as.data.frame(t(sapply(Lines, fromJSON))) I got this error message: incomplet
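A minimal sketch of one way to read such a file, assuming it holds a single JSON array of records and that the `jsonlite` package is installed (a swap from the `RJSONIO` package used in the post). The temporary file and its contents are hypothetical stand-ins, since the actual file is not shown.

```r
library(jsonlite)

# Hypothetical stand-in for the downloaded file (real contents are not shown)
path <- tempfile(fileext = ".json")
writeLines(
  '[{"ID":"A","Name":"Tom","Age":"18"},{"ID":"B","Name":"Jim","Age":"19"}]',
  path
)

# Parse the whole file at once instead of line by line;
# an array of flat records is simplified to a data frame
df <- fromJSON(path)

# Values quoted in the JSON arrive as character, so convert where needed
df$Age <- as.integer(df$Age)
str(df)
```

Parsing the file as a whole avoids handing incomplete fragments of JSON to the parser, which is one possible cause of an error when a file is parsed line by line.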

Re: [R] cannot find package colbycol in R 3.2.0

2015-04-21 Thread Ye Lin
Thanks! The package still cannot be installed and I've found an alternative way which is using package "limma" On Tue, Apr 21, 2015 at 10:20 AM, Marc Schwartz wrote: > > > On Apr 21, 2015, at 12:01 PM, Ye Lin wrote: > > > > Hi All, after installing the new

[R] cannot find package colbycol in R 3.2.0

2015-04-21 Thread Ye Lin
Hi All, after installing the new version of R (3.2.0), I cannot find package "colbycol". Is there any way to use it with the new version? I want to use function cbc.read.table, which is in package "colbycol". If this package is no longer available in the new version, is there any way around it? Tha

Re: [R] Questions about R

2013-11-06 Thread Ye Lin
You can get details at http://www.r-project.org/ But to answer your question: Yes it is free On Wed, Nov 6, 2013 at 9:09 AM, Silvia Espinoza wrote: > Good morning. I am interested in downloading R. I would appreciate if you > can help me with the following questions, please. > > 1. Is R

Re: [R] speeding up a loop

2013-10-18 Thread Ye Lin
> GUI even if you have the writes buffered. You should be able to debug > with some of these pointers. > > Jim Holtman > Data Munger Guru > > What is the problem that you are trying to solve? > Tell me what you want to do, not how you want to do it. > > > On Fri, Oct 1

Re: [R] speeding up a loop

2013-10-18 Thread Ye Lin
ill probably show a lot of time in the > functions that handle dataframes. > > Jim Holtman > Data Munger Guru > > What is the problem that you are trying to solve? > Tell me what you want to do, not how you want to do it. > > > On Fri, Oct 18, 2013 at 9:23 AM, Ye Lin wrote: >

Re: [R] speeding up a loop

2013-10-18 Thread Ye Lin
howed CPU usage is really high. Is there anyway to figure out why R is taxing my system? Thanks! Ye On Thursday, October 17, 2013, David Winsemius wrote: > > On Oct 17, 2013, at 2:56 PM, Ye Lin wrote: > > > Hey R professionals, > > > > I have a large dataset and I wa

[R] speeding up a loop

2013-10-17 Thread Ye Lin
Hey R professionals, I have a large dataset and I want to run a loop on it, basically creating a new column which gathers information from another reference table. When I run the code, R just freezes and does not even respond after 30 min, which is really unusual. I tried sapply as well but does no

Re: [R] how to code y~x/(x+a) in lm() function

2013-08-20 Thread Ye Lin
really what you want to fit, then you should > be using non-linear methods, e.g. by applying the function nls(). > > cheers, > > Rolf Turner > > > > On 21/08/13 09:39, Ye Lin wrote: > >> Hey All, >> >> I wanna to fit a model y~x/(a+x) to my data, here

[R] how to code y~x/(x+a) in lm() function

2013-08-20 Thread Ye Lin
Hey All, I want to fit a model y~x/(a+x) to my data. Here is the code I use now: lm((1/y-1)~I(1/x)+0, data=b) and it returns the coefficient, which is the value of a. However, if I use the code above, I am not able to draw a curve that represents this equation. How can I do this? Thanks for your
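A minimal sketch of the non-linear route suggested in the reply, using nls() to fit y = x/(a + x) directly and then evaluating the fitted curve on a grid. The data frame `b` here is simulated and `a_true` is a made-up value, since the original data are not shown.

```r
set.seed(1)

# Hypothetical data standing in for the poster's data frame b
a_true <- 2
b <- data.frame(x = seq(0.5, 10, by = 0.5))
b$y <- b$x / (a_true + b$x) + rnorm(nrow(b), sd = 0.01)

# Fit y = x / (a + x) directly; nls() needs a starting value for a
fit <- nls(y ~ x / (a + x), data = b, start = list(a = 1))
a_hat <- coef(fit)[["a"]]

# Evaluate the fitted curve on a fine grid of x values
grid <- data.frame(x = seq(min(b$x), max(b$x), length.out = 200))
grid$y_hat <- predict(fit, newdata = grid)
```

The curve can then be drawn over the points with plot(b$x, b$y) followed by lines(grid$x, grid$y_hat). The estimate from the linearised lm() call could serve as the starting value for `a`.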

Re: [R] add different regression lines for groups on ggplot

2013-07-29 Thread Ye Lin
Thanks John, yes you are right, I have to add different smooth statements. Here is the code from Dennis for my case: library(ggplot2) ggplot(data = df, aes(x=Var1, y=log(Var2), color=SiteID, group=SiteID)) + geom_point() + geom_smooth(data = subset(df, SiteID != "AL3"), method='lm', formula= y

[R] add different regression lines for groups on ggplot

2013-07-26 Thread Ye Lin
Hey All, I need to apply different regression lines to different group on my ggplot, and here is the code I use: qplot(x=Var1,y=Var2,data=df,color=SiteID,group=SiteID)+geom_point()+geom_smooth(method='lm',formula=log(y)~I(1/x),se=FALSE,size=2) However the regression for different groups is as be

[R] modify timestemp

2013-07-03 Thread Ye Lin
Hey All, I want to standardize my timestamp which is formatted as hh:mm:ss My data looks like this: Date Time 01/01/2013 00:09:01 01/02/2013 00:10:14 01/03/2013 00:11:27 01/04/2013 00:12:40 01/05/2013 00:13:53 01/06/2013 00:15:06 01/07/2013 00:16:19 01/08/2013 00:17:32 01/09/2013 00:18

Re: [R] Survey imputation

2013-06-13 Thread Ye Lin
look up imputation on survey data might be helpful On Thu, Jun 13, 2013 at 10:45 AM, Bert Gunter wrote: > Is this an R question? > > Seems like it belongs on a statistical or survey list, not r-help. > > Cheers, > Bert > > On Thu, Jun 13, 2013 at 10:37 AM, Scott Raynaud > wrote: > > I'm working

Re: [R] identify data points by certain criteria

2013-06-13 Thread Ye Lin
> On Jun 12, 2013, at 5:55 PM, Ye Lin wrote: > > > Hey I want to identify data points by criteria, here is an example of my > > 1min data > > > > Time Var1 Var2 > > 00:001 0 > > 00:010 0 > > 00:021

[R] identify data points by certain criteria

2013-06-12 Thread Ye Lin
Hey I want to identify data points by criteria, here is an example of my 1min data Time Var1 Var2 00:001 0 00:010 0 00:021 0 00:031 0 00:040 0 00:051 0 00:061 0 00:07

[R] delete active dataset

2013-06-03 Thread Ye Lin
Hi All, whenever I open R using the shortcut on the desktop, there are 2 active datasets in the workspace. I tried starting the program from the Start menu, same thing!! How can I delete these two active datasets and make sure they won't appear whenever I restart the program? Thanks!

Re: [R] combine two columns into one

2013-05-29 Thread Ye Lin
> dat$UniqueID <- paste(dat$Date,dat$Time, sep = '_') > aggregate(dat$Var,list(dat$UniqueID),sum) #isn't this the correct order > # Group.1 x > #1 1_1 11 > #2 1_2 25 > #3 2_1 11 > library(plyr) > ddply(dat,.(UniqueID),summarize,Var=sum(Var)) >

[R] combine two columns into one

2013-05-29 Thread Ye Lin
Hey all! I have a time series dataset like this: DateTime Var 112 1 14 1 1 5 1 2 8 1 2 8 1 2 9 213 21 4 214 I created a

[R] group data based on row value

2013-05-22 Thread Ye Lin
hey, I want to divide my data into three groups based on the value in one column with group name. dat: Var 0 0.2 0.5 1 4 6 I tried: dat <- cbind(dat, group=cut(dat$Var, breaks=c(0.1,0.6))) But it doesnt work, I want to group those <0.1 as group A, 0.1-0.6 as group B, >0.6 as group C Thanks fo
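A minimal sketch of one way to get three labelled groups with cut(): the breaks need outer limits as well, so that values below 0.1 and above 0.6 fall into an interval instead of becoming NA. Treating the boundaries as right-closed (0.1 goes to A, 0.6 goes to B) is an assumption; the post does not say which side the boundary values belong to.

```r
# Sample values from the post
dat <- data.frame(Var = c(0, 0.2, 0.5, 1, 4, 6))

# Four break points define three intervals:
# (-Inf, 0.1] -> A, (0.1, 0.6] -> B, (0.6, Inf) -> C
dat$group <- cut(dat$Var,
                 breaks = c(-Inf, 0.1, 0.6, Inf),
                 labels = c("A", "B", "C"))
dat
```

Passing right = FALSE would make the intervals left-closed instead, if 0.1 should belong to group B.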

Re: [R] add identifier column by row

2013-05-21 Thread Ye Lin
it works! Thanks! On Tue, May 21, 2013 at 1:24 PM, Sarah Goslee wrote: > You can use rep() to create the Date column, and data.frame() to combine > it. > > For your simple example, > > newdata <- data.frame(dat, Date=rep(1:3, each=2)) > > On Tue, May 21, 2013 at 4:

Re: [R] add identifier column by row

2013-05-21 Thread Ye Lin
, > > rep(c(1:30),each=48) > > fills the first 1440 rows. > > > > On 21-May-13, at 1:16 PM, Ye Lin wrote: > > I want to add identifier column (Date) to a time series data frame. I want >> to name the "Date" column be from 1 to 30 every 1440 rows. >

[R] x axis problem when plotting

2013-05-21 Thread Ye Lin
Hey I have a dataset like this: Date Var day 1/1/2013 1 Tue 1/2/2013 2 Wed 1/3/2013 3 Thu 1/4/2013 4 Fri 1/5/2013 5 Sat 1/6/2013 6 Sun 1/7/2013 7 Mon 1/8/2013 8 Tue 1/9/2013 9 Wed 1/10/2013 10 Thu And I want to plot Var~day Here is the code I use: plot(Dataset$Var~Dataset$day,xlab='Day

[R] add identifier column by row

2013-05-21 Thread Ye Lin
I want to add identifier column (Date) to a time series data frame. I want to name the "Date" column be from 1 to 30 every 1440 rows. Say I have a data like this (I simply my actual data here): $dat ID Var 1 1 2 4 3 6 4 7 5 7 6 8 How can
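A minimal sketch of the rep() approach from the replies, written so that it also works when the number of rows is not an exact multiple of the block size. The variable `rows_per_day` is a name introduced here for illustration; it is 2 for the simplified data and would be 1440 for the real data.

```r
# Simplified data from the post
dat <- data.frame(ID = 1:6, Var = c(1, 4, 6, 7, 7, 8))

rows_per_day <- 2  # assumption for the toy data; 1440 in the real case
n <- nrow(dat)

# Repeat each day number rows_per_day times, then trim to the number of rows
dat$Date <- rep(seq_len(ceiling(n / rows_per_day)), each = rows_per_day)[seq_len(n)]
dat
```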

Re: [R] filter rows by value

2013-05-17 Thread Ye Lin
351 > ", header = TRUE) > > dat[dat$Time %% 100 == 51, ] > > > > On 17-05-2013 22:01, Ye Lin wrote: > >> Hey All, >> >> I want to delete rows based on the last 2 digits on the value in one >> column >> but I dont know how

[R] filter rows by value

2013-05-17 Thread Ye Lin
Hey All, I want to delete rows based on the last 2 digits of the value in one column but I don't know how to do that. Suppose my data looks like this: Var Time 1 51 2 151 3 251 *4 234* *5 331* 6 351 I want to delete the rows that the
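A minimal sketch of the modulo approach shown in the reply, assuming the goal is to keep rows whose Time ends in 51 and drop the starred rows. The question is cut off, so that reading of the criterion is an assumption.

```r
# Sample data from the post
dat <- data.frame(Var = 1:6, Time = c(51, 151, 251, 234, 331, 351))

# Time %% 100 extracts the last two digits of a whole number
kept <- dat[dat$Time %% 100 == 51, ]
kept
```

Negating the condition, dat[dat$Time %% 100 != 51, ], returns the rows that were dropped.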

Re: [R] duplicate rows with new time series

2013-05-10 Thread Ye Lin
Thanks, this one works! On Thu, May 9, 2013 at 5:09 PM, Gabor Grothendieck wrote: > On Thu, May 9, 2013 at 8:09 PM, Gabor Grothendieck > wrote: > > On Thu, May 9, 2013 at 7:24 PM, Ye Lin wrote: > >> Hey All, > >> > >> I want to duplicate the records b

[R] duplicate rows with new time series

2013-05-09 Thread Ye Lin
Hey All, I want to duplicate the records but add a new "timestamp" columns as new time series, but I dont know how to do that. my dataset(dat1) looks like this: No. TimeStamp Var1 1 2012-06-18 06:00:003 2 2012-06-18 06:06:00 4 I use this code to do dup

Re: [R] create unique ID for each group

2013-05-07 Thread Ye Lin
2 > #3 3 0001 14 0001_3 > #4 4 0002 16 0002_1 > #5 5 0002 17 0002_2 > > dat2$UniqueID<-unlist(lapply(split(dat2,dat2$ID),function(x) > with(x,as.character(interaction(ID,seq_len(nrow(x)),sep="_")))),use.names=FALSE) > A.K

Re: [R] create unique ID for each group

2013-05-07 Thread Ye Lin
In each category, the order is the same. For example, the first match in dat2 should return to the first record in dat2 On Tue, May 7, 2013 at 11:31 AM, Chris Stubben wrote: > Yes, I tried, but the order of the IDs in dat1 and dat2 is not exactly the >> same, I simplify the data here. So in dat

Re: [R] create unique ID for each group

2013-05-07 Thread Ye Lin
Yes, I tried, but the order of the IDs in dat1 and dat2 is not exactly the same, I simplify the data here. So in dat2, it may have records for ID=0002 first then ID=0001, also I have more than two categories under ID col. On Tue, May 7, 2013 at 10:57 AM, Chris Stubben wrote: > > I want to merge

[R] create unique ID for each group

2013-05-07 Thread Ye Lin
Hey All, I have a dataset(dat1) like this: ObsNumber ID Weight 1 0001 12 2 0001 13 3 0001 14 4 0002 16 5 0002 17 And another dataset(
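A minimal base-R sketch of building a per-group running number and pasting it onto the ID, which yields keys of the form seen in the replies (for example 0001_3). Only dat1 is reconstructed here; the second dataset is cut off in the post, so the merge step itself is not shown.

```r
# dat1 as given in the post; ID kept as character to preserve leading zeros
dat1 <- data.frame(ObsNumber = 1:5,
                   ID = c("0001", "0001", "0001", "0002", "0002"),
                   Weight = c(12, 13, 14, 16, 17),
                   stringsAsFactors = FALSE)

# ave() applies seq_along() within each ID, giving 1, 2, 3, ... per group
# in the order the rows appear
counter <- ave(seq_along(dat1$ID), dat1$ID, FUN = seq_along)
dat1$UniqueID <- paste(dat1$ID, counter, sep = "_")
dat1
```

Because the counter follows row order within each ID, it does not depend on the groups being contiguous. Applying the same two lines to the other dataset would give a key column to merge on.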

Re: [R] color by group in ggplot

2013-05-03 Thread Ye Lin
"ID","Group")) > library(ggplot2) > ggplot(dat2,aes(x=ID,y=value,group=Group,colour=Group))+geom_point() > A.K. > > > > - Original Message - > From: Ye Lin > To: R help > Cc: > Sent: Friday, May 3, 2013 4:37 PM > Subject: [R]

Re: [R] color by group in ggplot

2013-05-03 Thread Ye Lin
May 3, 2013 at 1:49 PM, David Winsemius wrote: > > On May 3, 2013, at 1:37 PM, Ye Lin wrote: > > > Hey, > > > > I have a dataset like this: > > > > ID Var1 Var2 Group > > A1 11BB > > A2 1

[R] color by group in ggplot

2013-05-03 Thread Ye Lin
Hey, I have a dataset like this: ID Var1 Var2 Group A1 11BB A2 1 2AA B1 2 1 CC B2 13DD C1 12EE I would like to plot the point

Re: [R] Read big data (>3G ) methods ?

2013-04-26 Thread Ye Lin
I cannot think of something better. Maybe try reading only the part of the data that you want to analyze, basically breaking the large data set into pieces. On Fri, Apr 26, 2013 at 10:58 AM, Ye Lin wrote: > Have you thought of building a database and then letting R read it through that db > instead of your desktop? >

Re: [R] Read big data (>3G ) methods ?

2013-04-26 Thread Ye Lin
Have you thought of building a database and then letting R read it through that db instead of your desktop? On Fri, Apr 26, 2013 at 8:09 AM, Kevin Hao wrote: > Hi all scientists, > > Recently, I am dealing with big data ( >3G txt or csv format ) in my > desktop (windows 7 - 64 bit version), but I can not

[R] ggplot-display text in bar chart

2013-04-22 Thread Ye Lin
I want to show counts value on stacked bar chart in ggplot2. I found similar question here http://stackoverflow.com/questions/6644997/showing-data-values-on-stacked-bar-chart-in-ggplot2 but that one shows value instead of counts. My data frame(dat1) is sth like this: Group Length Width 1

Re: [R] count each answer category in each column

2013-04-19 Thread Ye Lin
> # Var1 Age > #1 0-10 2 > #2 11-20 2 > #3 >20 1 > #4 0 > > #[[3]] > # Var1 Rate > #1 Bad2 > #2 Good2 > #3 1 > A.K. > > > > - Original Message - > From: Ye Lin > To: R help > Cc: > Sent: Thursday, April

Re: [R] count each answer category in each column

2013-04-19 Thread Ye Lin
Thanks David! I do get confused sometimes when sth can be easily and directly done in Excel which is what I am familiar with, but I find it takes more time for me to operate that in R. On Thu, Apr 18, 2013 at 4:27 PM, David Winsemius wrote: > > On Apr 18, 2013, at 3:46 PM, Ye Lin

[R] count each answer category in each column

2013-04-18 Thread Ye Lin
Hey, Is it possible that R can calculate each options under each column and return a summary table? Suppose I have a table like this: Gender Age Rate Female0-10 Good Male0-10 Good Female 11-20 Bad Male 11-20 Bad Male >20 N/A I want to have a summary
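A minimal base-R sketch of counting the answer categories in every column at once by applying table() to each column. The data frame reproduces the sample table from the post, with "N/A" kept as an ordinary text value; whether it should instead be treated as missing is left open.

```r
# Sample table from the post
dat <- data.frame(
  Gender = c("Female", "Male", "Female", "Male", "Male"),
  Age    = c("0-10", "0-10", "11-20", "11-20", ">20"),
  Rate   = c("Good", "Good", "Bad", "Bad", "N/A"),
  stringsAsFactors = FALSE
)

# One frequency table per column, returned as a named list
counts <- lapply(dat, table)
counts
```

Each element can be turned into a two-column data frame with as.data.frame(counts$Gender) if a table layout is preferred.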

[R] plot 2 y axis

2013-04-16 Thread Ye Lin
Hi, I want to plot two variables on the same graph but with two y axis just like what you can do in Excel. I searched online that seems like you can not achieve that in ggplot. So is there anyway I can do it in a nice way in basic plot? Suppose my data looks like this: WeightHeight Date 0.

Re: [R] group data

2013-04-15 Thread Ye Lin
> dat <- read.table(text = " > > ID Value > AL1 1 > AL2 2 > CA1 3 > CA4 4 > ", header = TRUE, stringsAsFactors = FALSE) > > dat$State <- substr(dat$ID, 1, 2) > > > Note that this depends on having State being defined by the first two > ch

Re: [R] Fw: split date and time

2013-04-12 Thread Ye Lin
Thanks! On Fri, Apr 12, 2013 at 3:30 PM, arun wrote: > > > > > - Forwarded Message - > From: arun > To: Ye Lin > Cc: > Sent: Friday, April 12, 2013 6:25 PM > Subject: Re: [R] split date and time > > > > Hi Ye, > > Is this okay? > >

[R] split date and time

2013-04-12 Thread Ye Lin
Hi R experts, For example I have a dataset looks like this: Number TimeStamp Value 1 1/1/2013 0:00 1 2 1/1/2013 0:01 2 3 1/1/2013 0:03 3 How can I split the "TimeStamp" Column into two and return a new table like this: Number Date Time Value 1 1/1/2
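A minimal base-R sketch of splitting the column on the space between date and time, assuming every TimeStamp has exactly one space and is stored as text. The result keeps both parts as character strings rather than converting them to date-time classes.

```r
# Sample data from the post
dat <- data.frame(Number = 1:3,
                  TimeStamp = c("1/1/2013 0:00", "1/1/2013 0:01", "1/1/2013 0:03"),
                  Value = 1:3,
                  stringsAsFactors = FALSE)

# Split each timestamp at the space, then pick the first and second piece
parts <- strsplit(dat$TimeStamp, " ", fixed = TRUE)
dat$Date <- vapply(parts, `[`, character(1), 1)
dat$Time <- vapply(parts, `[`, character(1), 2)

# Reorder the columns to match the layout asked for
out <- dat[, c("Number", "Date", "Time", "Value")]
out
```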

Re: [R] group data

2013-04-11 Thread Ye Lin
tringsAsFactors = FALSE) > > dat$State <- substr(dat$ID, 1, 2) > > > Note that this depends on having State being defined by the first two > characters of ID. > > Hope this helps, > > Rui Barradas > > > On 11-04-2013 19:37, Ye Lin wrote: > >>

Re: [R] group data

2013-04-11 Thread Ye Lin
Try the following. > > > dat <- read.table(text = " > > ID Value > AL1 1 > AL2 2 > CA1 3 > CA4 4 > ", header = TRUE, stringsAsFactors = FALSE) > > dat$State <- substr(dat$ID, 1, 2) > > > Note that this depends on having State be

[R] group data

2013-04-11 Thread Ye Lin
Hey, I have a dataset and I want to identify the records by groups for further use in ggplot. Here is a sample data: ID Value AL1 1 AL2 2 CA1 3 CA4 4 I want to identify all the records that are in the same state (AL1 and AL2), group them as "AL", and do the same for CA1 and CA4. How can I have

Re: [R] how to calculate average of each column

2013-04-10 Thread Ye Lin
Make up some data > dat <- data.frame(X = rnorm(200), Y = rnorm(200)) > > # Divide into subsets of 60 rows each and compute the col means > grp <- rep(1:(1 + nrow(dat) / 60), each = 60)[seq_len(nrow(dat))] > do.call(rbind, lapply(split(dat, grp), colMeans)) > > > Hope this h

Re: [R] how to calculate average of each column

2013-04-10 Thread Ye Lin
7 18.8 19.9 18.85000 20.46667 > # V9 V10 > #17.81667 18.51667 > > A.K. > > > > - Original Message - > From: Ye Lin > To: r-help@r-project.org > Cc: > Sent: Wednesday, April 10, 2013 1:46 PM > Subject: [R] how to calculate average

[R] how to calculate average of each column

2013-04-10 Thread Ye Lin
Hey All, I have a large dataset and I want to calculate the average of each column then return a new dataset. Here is my question: I dont know if there is a function that can allow me to calculate the average every 60 records of data in the whole dataset, and return a new data frame. Not sure if
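A minimal sketch of averaging every 60 rows with aggregate(), assuming all columns are numeric. The data frame is made up for illustration, and `block` is a helper name introduced here.

```r
# Hypothetical data: 120 rows, two numeric columns
dat <- data.frame(X = 1:120, Y = 120:1)

# Rows 1-60 get block 0, rows 61-120 get block 1, and so on;
# a final partial block is averaged over the rows it has
block <- (seq_len(nrow(dat)) - 1) %/% 60

# Column means within each block, returned as a new data frame
avg <- aggregate(dat, by = list(block = block), FUN = mean)
avg
```

Integer division on the row index avoids having to build the grouping vector with rep() and trim it afterwards.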