[R] Subscripting
Dataframe1 contains a list of specific dates. Dataframe2 contains a large dataset, one column of which is Date. How do I create a subset of Dataframe2 that excludes the dates in Dataframe1? I know how to do it in SQL with a left outer join and a test for NULL, but I can't figure out how to do it more directly with the subscripts that already exist.

Dataframe1
Date
1/1/2010
1/18/2010

Dataframe2
Date       Attribute  Count
1/1/2010   Red        5
1/15/2010  Green      2
1/18/2010  Purple     8
1/19/2010  Yellow     3

ResultingDataframe (Dataframe2 minus the rows that have Dates in Dataframe1)
Date       Attribute  Count
1/15/2010  Green      2
1/19/2010  Yellow     3

--
View this message in context: http://n4.nabble.com/Subscripting-tp1476330p1476330.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
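One way to express this anti-join with subscripts is a logical index built with %in%. A minimal sketch, assuming both Date columns hold the same representation (plain character strings here); the data frame names follow the post and the values are copied from it:

```r
Dataframe1 <- data.frame(Date = c("1/1/2010", "1/18/2010"),
                         stringsAsFactors = FALSE)
Dataframe2 <- data.frame(
  Date      = c("1/1/2010", "1/15/2010", "1/18/2010", "1/19/2010"),
  Attribute = c("Red", "Green", "Purple", "Yellow"),
  Count     = c(5, 2, 8, 3),
  stringsAsFactors = FALSE
)

# TRUE for rows of Dataframe2 whose Date also appears in Dataframe1
is.excluded <- Dataframe2$Date %in% Dataframe1$Date

# Keep only the rows that are NOT excluded
ResultingDataframe <- Dataframe2[!is.excluded, ]
```

If one column is a factor and the other a Date, converting both with as.character() or as.Date() first is probably safer than relying on implicit coercion.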
[R] AHRQ Patient Quality Indicators
Is anyone aware of R code that mimics AHRQ's SAS code for their Prevention Quality Indicators (PQI)? I don't see it anywhere, but wanted to check whether anyone else knows of anything. Many thanks
[R] Correcting for missing data combinations
I can think of many brute-force ways to do this outside of R, but was wondering if there was a simple/elegant solution within R instead. I have a table that looks something like the following:

Factor1  Factor2     Value
A        11/11/2009  5
A        11/12/2009  4
B        11/11/2009  7
B        11/13/2009  8

From that I need to generate all combinations of Factor1 and Factor2 and force a 0 for any combination that doesn't exist in the actual data table. By way of example, I'd like the output for the above to end up as:

Factor1  Factor2     Value
A        11/11/2009  5
A        11/12/2009  4
A        11/13/2009  0
B        11/11/2009  7
B        11/12/2009  0
B        11/13/2009  8

Truly appreciate any thoughts.
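A possible base R approach is to build the full grid with expand.grid(), left-join the observed rows onto it with merge(), and replace the resulting NAs with 0. A minimal sketch using the values from the post (the object names dat, all.combos and filled are made up for illustration):

```r
dat <- data.frame(
  Factor1 = c("A", "A", "B", "B"),
  Factor2 = c("11/11/2009", "11/12/2009", "11/11/2009", "11/13/2009"),
  Value   = c(5, 4, 7, 8),
  stringsAsFactors = FALSE
)

# Every Factor1 x Factor2 combination, whether observed or not
all.combos <- expand.grid(
  Factor1 = unique(dat$Factor1),
  Factor2 = unique(dat$Factor2),
  stringsAsFactors = FALSE
)

# Left join: keep every combination, NA where no observation exists
filled <- merge(all.combos, dat, by = c("Factor1", "Factor2"), all.x = TRUE)

# Force a 0 for the combinations that were missing
filled$Value[is.na(filled$Value)] <- 0
filled <- filled[order(filled$Factor1, filled$Factor2), ]
```

The dates are sorted as text here, which happens to give the right order for this sample; converting Factor2 with as.Date() would be needed for a dependable chronological sort.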
[R] Limiting number of tickmarks in lattice bwplot
I have a simple bwplot with 24 ordered factor levels across the x axis. I would like to label only every 4th tick mark so that the labels fit. I tried scales=list(x=list(tick.number=6)), but I still seem to get 24 tick marks and 24 labels. Full code is below:

bwplot(SumOfIn.Use ~ Hour | Period,
       scales=list(x=list(tick.number=6)),
       horizontal=FALSE, las=2,
       main="Rooms Running", sub="Timeframe: 8/09 - 12/09",
       xlab="Hour of Day", ylab="Rooms Running",
       ex.main=0.7, cex.axis=0.5,
       data=dbs.weekday, as.table=TRUE, layout=c(2,1))
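As far as I know, lattice ignores tick.number when the axis represents a factor, so one option is to give the tick positions and labels explicitly through at and labels in scales. A sketch on simulated data (the demo data frame is a made-up stand-in for dbs.weekday):

```r
library(lattice)

# Hypothetical stand-in for dbs.weekday: 24 hourly levels, two periods
set.seed(1)
demo <- data.frame(
  Hour        = factor(rep(0:23, times = 20), levels = 0:23),
  Period      = rep(c("Period 1", "Period 2"), each = 240),
  SumOfIn.Use = rpois(480, lambda = 10)
)

# Factor levels sit at positions 1, 2, ..., 24; keep every 4th one
show <- seq(1, nlevels(demo$Hour), by = 4)

p <- bwplot(SumOfIn.Use ~ Hour | Period, data = demo,
            horizontal = FALSE, as.table = TRUE, layout = c(2, 1),
            scales = list(x = list(at = show,
                                   labels = levels(demo$Hour)[show])))
# print(p) draws the plot
```

Only the ticks listed in at are drawn, so this thins both the tick marks and the labels.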
[R] Question on Merge/Lookup
I need to merge three datasets and don't know how. If I were using SQL, I would start from df3 and look up the characteristics of each date in df1 and the value for each observation in df2.

df1 - unique list of Dates and characteristics of those dates
  Date, MM, WW, DOW
df2 - the raw data
  Date, Place, Value
df3 - all possible combinations of Date + Place (via expand.grid(unique(df2$Date),unique(df2$Place)))
  Date, Place

I need to end up with:
  Date, MM, WW, DOW, Place, Value (plug in 0 if the combo doesn't exist in the raw data).

Appreciate any help!
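The two lookups can probably be written as two merge() calls: a left join of df3 with df2, then a join against df1 on Date. A minimal sketch with invented sample rows (only the column names come from the post):

```r
df1 <- data.frame(Date = c("2010-01-01", "2010-01-02"),
                  MM = c(1, 1), WW = c(1, 1), DOW = c("Fri", "Sat"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Date  = c("2010-01-01", "2010-01-01", "2010-01-02"),
                  Place = c("N-01", "S-02", "S-02"),
                  Value = c(6, 10, 5),
                  stringsAsFactors = FALSE)
df3 <- expand.grid(Date = unique(df2$Date), Place = unique(df2$Place),
                   stringsAsFactors = FALSE)

# Step 1: look up Value for every Date + Place combination
step1 <- merge(df3, df2, by = c("Date", "Place"), all.x = TRUE)
step1$Value[is.na(step1$Value)] <- 0   # plug 0 where the combo is absent

# Step 2: attach the date characteristics
final <- merge(df1, step1, by = "Date")
final <- final[, c("Date", "MM", "WW", "DOW", "Place", "Value")]
```

Naming the columns in expand.grid() matters here: without the names, the grid columns come back as Var1 and Var2 and merge() has nothing to match on.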
[R] question on sqldf syntax
I'm trying to structure SQL to merge two datasets. The structure follows:

dbs.possible.combos (all possible combinations of dates and places)
Date    Place
1/1/10  N-01
1/1/10  S-02
1/2/10  N-01
1/2/10  S-02
etc...

dbs.aggregate (the raw data aggregated by date and location)
Date    Place  Days
1/1/10  N-01   6
1/1/10  S-02   10
1/2/10  S-02   5

I'm trying to merge so that I look up the values for each possible combo:

dbs.final <- sqldf("select dbs.possible.combos$Date, dbs.possible.combos$Place, dbs.possible.combos$Days FROM dbs.possible.combos LEFT JOIN dbs.aggregate ON (dbs.possible.combos$Place = dbs.aggregate$Place) AND (dbs.possible.combos$Date = dbs.aggregate$Date)")

Resulting in:

Error in sqliteExecStatement(con, statement, bind.data) :
  RS-DBI driver: (error in statement: near ".": syntax error)

What am I getting wrong in the syntax?
Re: [R] question on sqldf syntax
Actually, better SQL would likely be:

dbs.final <- sqldf("select * from dbs.possible.combos left join dbs.aggregate using (Date,Place)")

but this still doesn't work.
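Two things in these statements look likely to trip the SQL parser: `$` is R syntax rather than SQL, and a table name containing dots generally has to be quoted inside the statement. Days also lives in dbs.aggregate, not in dbs.possible.combos. A sketch under those assumptions, using short table aliases (c and a are arbitrary) and guarded so that it only runs when the sqldf package is installed:

```r
dbs.possible.combos <- data.frame(
  Date  = c("1/1/10", "1/1/10", "1/2/10", "1/2/10"),
  Place = c("N-01", "S-02", "N-01", "S-02"),
  stringsAsFactors = FALSE
)
dbs.aggregate <- data.frame(
  Date  = c("1/1/10", "1/1/10", "1/2/10"),
  Place = c("N-01", "S-02", "S-02"),
  Days  = c(6, 10, 5),
  stringsAsFactors = FALSE
)

if (requireNamespace("sqldf", quietly = TRUE)) {
  library(sqldf)
  # Dotted data frame names are quoted; columns are qualified with aliases
  dbs.final <- sqldf('
    select c.Date, c.Place, a.Days
    from "dbs.possible.combos" c
    left join "dbs.aggregate" a
      on c.Place = a.Place and c.Date = a.Date')
}
```

Combinations with no match come back with NA in Days, which can then be set to 0 in R.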
[R] Print lattice output to table?
I have beautiful box and whisker charts formatted with lattice, which is obviously calculating summary statistics internally in order to draw the charts. Is there a way to dump the associated summary tables that are being used to generate the charts? I realize I could use tapply or such to get something similar, but I have all the groupings and such already configured to generate the charts. I simply want to dump those values to a table so that I don't have to interpolate where the 75th percentile is on a visual chart. Appreciate any thoughts.
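By default the box-and-whisker panel in lattice computes its numbers with boxplot.stats(), so calling that function per group should reproduce what the chart draws. A minimal sketch on simulated data (dbs.demo and the column labels are invented for illustration):

```r
set.seed(42)
dbs.demo <- data.frame(
  Place = rep(c("N-01", "S-02"), each = 20),
  Days  = c(rpois(20, 5), rpois(20, 9))
)

# One column of five statistics per group, transposed to one row per group
box.stats <- t(sapply(split(dbs.demo$Days, dbs.demo$Place),
                      function(x) boxplot.stats(x)$stats))
colnames(box.stats) <- c("lower.whisker", "lower.hinge", "median",
                         "upper.hinge", "upper.whisker")
```

Note that the hinges are Tukey's hinges, which can differ slightly from the 25th and 75th percentiles reported by quantile() or summary().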
[R] tapply on multiple groups
Can you make tapply break down groups the way bwplot does? Example: the data frame has one measure (Days) and two dimensions (MM and Place). All have the same length.

> length(dbs.final$Days)
[1] 3306
> length(dbs.final$Place)
[1] 3306
> length(dbs.final$MM)
[1] 3306

Doing the following makes a nice table for one dimension and one measure:

do.call(rbind,tapply(dbs.final$Days,dbs.final$Place, summary))

But what I really need to do is break it down on two dimensions and one measure, effectively equivalent to the following bwplot call:

bwplot(Days ~ MM | Place, data=dbs.final)

Is there an equivalent to the "|" operation in tapply?
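tapply() accepts a list of grouping factors, which gives one cell per MM and Place combination; for a full summary() per cell, splitting on both factors and binding the results is one option. A minimal sketch on invented data (dbs.demo stands in for dbs.final):

```r
dbs.demo <- data.frame(
  MM    = rep(c(1, 2), times = 6),
  Place = rep(c("N-01", "S-02"), each = 6),
  Days  = 1:12
)

# A single statistic per MM x Place cell, returned as a matrix
by.both <- tapply(dbs.demo$Days, list(dbs.demo$MM, dbs.demo$Place), median)

# The full six-number summary per cell, one row per combination
groups <- split(dbs.demo$Days, list(dbs.demo$Place, dbs.demo$MM), drop = TRUE)
summary.table <- do.call(rbind, lapply(groups, summary))
```

The row names of summary.table take the form Place.MM (for example N-01.1), and drop = TRUE leaves out combinations that never occur.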
Re: [R] Print lattice output to table?
That works great. Thanks!
[R] Are any values in one list contained within a second list
Silly question, but can I test whether any value of list a is contained in list b without doing a loop? A loop is easy enough, but I wanted to see if there was a cleaner way. By way of example:

List 1: a, b, c, d, e, f, g
List 2: z, y, x, w, v, u, b
Return true, since both lists contain b

List 1: a, b, c, d, e, f, g
List 2: z, y, x, w, v, u, t
Return false, since the lists have no mutual values
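For plain vectors, any() combined with %in% does this without an explicit loop. A minimal sketch with the values from the post (the variable names are made up):

```r
List1  <- c("a", "b", "c", "d", "e", "f", "g")
List2a <- c("z", "y", "x", "w", "v", "u", "b")
List2b <- c("z", "y", "x", "w", "v", "u", "t")

# TRUE as soon as at least one element of List1 occurs in the other vector
shares.value.a <- any(List1 %in% List2a)
shares.value.b <- any(List1 %in% List2b)
```

length(intersect(List1, List2a)) > 0 gives the same answer and also tells you which values are shared.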
[R] Question about user define function
I have the following function, which is called by the statement below. I'm trying to return the two dataframes, but instead I get one large list including both tables.

ReadInputDataFrames <- function() {
    dbs.this= read.delim("this.txt", header = TRUE, sep = "\t", quote="\"", dec=".")
    dbs.that= read.delim("that.txt", header = TRUE, sep = "\t", quote="\"", dec=".")
    c(this= dbs.this,patdb = dbs.that)
}

Called by:

c <- ReadInputDataFrames()

Results:

str(c) yields a list of 106 items: $this.variable1..53, $that$variable1..53
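A data frame is itself a list of columns, so c() splices the columns of both tables into one flat list. Returning list() instead keeps each data frame intact as a single element. A minimal sketch; the file path arguments and the temporary sample files are additions for illustration, not part of the original function:

```r
ReadInputDataFrames <- function(this.path, that.path) {
  dbs.this <- read.delim(this.path, header = TRUE, sep = "\t",
                         quote = "\"", dec = ".")
  dbs.that <- read.delim(that.path, header = TRUE, sep = "\t",
                         quote = "\"", dec = ".")
  # list() keeps the two data frames as two separate elements
  list(this = dbs.this, that = dbs.that)
}

# Two small tab-delimited files so the sketch runs on its own
this.path <- tempfile(fileext = ".txt")
that.path <- tempfile(fileext = ".txt")
writeLines(c("a\tb", "1\t2", "3\t4"), this.path)
writeLines(c("x\ty", "5\t6"), that.path)

inputs <- ReadInputDataFrames(this.path, that.path)
dbs.this <- inputs$this
dbs.that <- inputs$that
```

Storing the result under a name other than c also avoids masking the base function c().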
[R] Return value associated with a factor
I am using the code below to extract census tract information. save.tract$state, save.tract$county and save.tract$tract are returned as factors. In the last three statements, I need to save the actual value of the factor, but instead the code is yielding the position of the factor. How do I return the value of the factor instead? By way of example, for Lon=-82.49574 and Lat=29.71495, the code returns state = 1, county = 1 and tract = 161. The desired results are state = 12, county = 001 and tract = 002201.

#set libraries
library(UScensus2000tract)
library(gdata)
data(florida.tract)

#read input
dbs.in = read.delim("addresses_coded_new.txt", header = TRUE, sep = "\t", quote="\"", dec=".")

#instantiate output
more.columns <- data.frame( state=double(0), county=double(0), tract=double(0))
dbs.in <- cbind(dbs.in,more.columns)

#figure out how many times to loop
j <- nrow(dbs.in)

#loop through each lat/long and assign census tract
for (i in 1:j) {
    index<-overlay(SpatialPoints(cbind(dbs.in$Lon[i],dbs.in$Lat[i])),florida.tract)
    save.tract<-florida.tract[index,]
    dbs.in$state[i] <- save.tract$state    #this is returning the position in the list instead of the value
    dbs.in$county[i] <- save.tract$county  #this is returning the position in the list instead of the value
    dbs.in$tract[i] <- save.tract$tract    #this is returning the position in the list instead of the value
}
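A factor stores integer codes that index into its levels, and assigning it into a numeric column keeps only the codes. Wrapping the factor in as.character() returns the labels instead. A small sketch with the values mentioned in the post (the variable names are made up):

```r
f <- factor(c("12", "001", "002201"))

codes  <- as.integer(f)     # positions within levels(f), not the values
labels <- as.character(f)   # the values as text, leading zeros preserved

# When the labels are genuinely numeric, convert via character first
state.number <- as.numeric(as.character(f[1]))
```

In the loop above this would mean as.character(save.tract$state) and so on, with the receiving columns created as character rather than double so that codes such as 001 keep their leading zeros.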
Re: [R] Return value associated with a factor
Works great. Thanks much!
[R] Basic question - more efficient method than loop?
I'm guessing there's a more efficient way to do the following using the index features of R. Appreciate any thoughts.

for (i in 1:nrow(dbs1)){
    if(dbs1$Payor[i] %in% Payor.Group.Medicaid) dbs1$Payor.Group[i] = "Medicaid"
    if(dbs1$Payor[i] %in% Payor.Group.Medicare) dbs1$Payor.Group[i] = "Medicare"
    if(dbs1$Payor[i] %in% Payor.Group.Commercial) dbs1$Payor.Group[i] = "Commercial"
    if(dbs1$Payor[i] %in% Payor.Group.Workers.Comp) dbs1$Payor.Group[i] = "Workers Comp"
    if(dbs1$Payor[i] %in% Payor.Group.Self.Pay) dbs1$Payor.Group[i] = "Self Pay"
    if(dbs1$Adm.Source[i] %in% Adm.Source.Group.Newborn) dbs1$Adm.Source.Group[i] = "Newborn"
    if(dbs1$Adm.Source[i] %in% Adm.Source.Group.ED) dbs1$Adm.Source.Group[i] = "ED"
    if(dbs1$Adm.Source[i] %in% Adm.Source.Group.Routine) dbs1$Adm.Source.Group[i] = "Routine"
    if(dbs1$Adm.Source[i] %in% Adm.Source.Group.Transfer) dbs1$Adm.Source.Group[i] = "Transfer"
}
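Because %in% is vectorized, each if statement can become a single indexed assignment over the whole column, which removes the loop. A minimal sketch for the Payor half; the payor codes are invented, and the Adm.Source groups would follow the same pattern:

```r
dbs1 <- data.frame(
  Payor = c("MCD-A", "MCR-A", "BCBS", "MCD-B", "CASH"),
  stringsAsFactors = FALSE
)

# Hypothetical code lists standing in for the real group definitions
Payor.Group.Medicaid   <- c("MCD-A", "MCD-B")
Payor.Group.Medicare   <- c("MCR-A")
Payor.Group.Commercial <- c("BCBS")
Payor.Group.Self.Pay   <- c("CASH")

# Start with NA so unmatched payors are easy to spot afterwards
dbs1$Payor.Group <- NA_character_
dbs1$Payor.Group[dbs1$Payor %in% Payor.Group.Medicaid]   <- "Medicaid"
dbs1$Payor.Group[dbs1$Payor %in% Payor.Group.Medicare]   <- "Medicare"
dbs1$Payor.Group[dbs1$Payor %in% Payor.Group.Commercial] <- "Commercial"
dbs1$Payor.Group[dbs1$Payor %in% Payor.Group.Self.Pay]   <- "Self Pay"
```

As in the loop, a payor that appears in more than one list ends up with the last matching group.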
Re: [R] Basic question - more efficient method than loop?
Perfect. Thanks!
[R] Remove observations deemed influential by influential.measure
dbs is an existing dataframe. I fit an lm and looked at the influential observations. I now want to delete the influential observations from dbs, fit another lm, and see how different the results are. What is the syntax to remove the influential observations from dbs?

fit <- lm(NI ~ PG + log(TG), data=dbs)
fit.influential.observations <- influence.measures(fit)
dbs.without.influential.observations <- ?
Re: [R] Remove observations deemed influential by influential.measure
dbs_influential_obs <- which(apply(fit.influential.observations$is.inf, 1, any))
dbs_sans_influential_obs <- dbs[-dbs_influential_obs,]
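One caveat with negative indexing: if no observation is flagged, which() returns an empty vector and subsetting with its negation drops every row. A logical index avoids that. A runnable sketch of the same idea on the built-in mtcars data (the model formula is arbitrary and only mirrors the shape of the one in the question):

```r
fit  <- lm(mpg ~ wt + log(hp), data = mtcars)
infl <- influence.measures(fit)

# TRUE for rows flagged by at least one influence measure
is.influential <- apply(infl$is.inf, 1, any)

# Logical indexing keeps all rows when nothing is flagged
mtcars.trimmed <- mtcars[!is.influential, ]
refit <- lm(mpg ~ wt + log(hp), data = mtcars.trimmed)
```

This assumes lm() used every row of the data; if rows were dropped for missing values, the flags would have to be matched back by row name.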
[R] sqldf hanging on macintosh - works on windows
I have a long script that runs fine on Windows (32 bit). When I try to run it on two different Macs (64 bit), however, it hangs with identical behavior.

I start with:

library(sqldf)

This results in the messages:

Loading required package: DBI
Loading required package: RSQLite
Loading required package: RSQLite.extfuns
Loading required package: gsubfn
Loading required package: proto
Loading required package: chron

I then read some data, etc. I execute the following:

#merge raw data and all possible combinations
df.final <- sqldf('select Date, Hour, x as RoomsInUse from "df.possible.combos"
left join "df.aggregate" using (Hour, Date)')

I receive the messages:

Loading required package: tcltk
Loading Tcl/Tk interface ...
+

Then I get into some kind of loop. The message in the bottom ribbon says:

executing: try(gsub('\\s+','',paste(capture.output(print(arg(summary))),collapse=")),silent=TRUE)

On the PC implementation it runs flawlessly, and quickly. Truly appreciate any ideas.
Re: [R] sqldf hanging on macintosh - works on windows
I added library(RH2). I still get the message:

Loading required package: tcltk
Loading Tcl/Tk interface
+

directly after the sqldf statement:

> df.final <- sqldf('select Date, Hour, x as RoomsInUse from "df.possible.combos"
+ left join "df.aggregate" using (Hour, Date)')

There is no progress spinner. If I hit I get a ">". At that point, when I start to enter any command (just summary, for instance), I get the progress spinner and the

"try(gsub('\\s+','',paste(capture.output(print(arg(summary))),collapse=")),silent=TRUE)"

message in the bottom ribbon, and the system apparently hangs.
Re: [R] sqldf hanging on macintosh - works on windows
> library(sqldf)
Loading required package: DBI
Loading required package: RSQLite
Loading required package: RSQLite.extfuns
Loading required package: gsubfn
Loading required package: proto
Loading required package: chron
> debug(sqldf)
> df.final <- sqldf('select Date, Hour, x as RoomsInUse from "df.possible.combos"
+ left join "df.aggregate" using (Hour, Date)')
debugging in: sqldf("select Date, Hour, x as RoomsInUse from \"df.possible.combos\"\nleft join \"df.aggregate\" using (Hour, Date)")
debug: {
    as.POSIXct.character <- function(x) structure(as.numeric(x),
        class = c("POSIXt", "POSIXct"))
    as.Date.character <- function(x) structure(as.numeric(x),
        class = "Date")
    as.Date.numeric <- function(x, origin = "1970-01-01", ...) base::as.Date.numeric(x,
        origin = origin, ...)
    as.dates.character <- function(x) structure(as.numeric(x),
        class = c("dates", "times"))
    as.times.character <- function(x) structure(as.numeric(x),
        class = "times")
    overwrite <- FALSE
    request.open <- missing(x) && is.null(connection)
    request.close <- missing(x) && !is.null(connection)
    request.con <- !missing(x) && !is.null(connection)
    request.nocon <- !missing(x) && is.null(connection)
    dfnames <- fileobjs <- character(0)
    if (request.close || request.nocon) {
        on.exit({
            dbPreExists <- attr(connection, "dbPreExists")
            dbname <- attr(connection, "dbname")
            if (!missing(dbname) && !is.null(dbname) && dbname == ":memory:") {
                dbDisconnect(connection)
            }
            else if (!dbPreExists && drv == "sqlite") {
                dbDisconnect(connection)
                file.remove(dbname)
            }
            else {
                for (nam in dfnames) dbRemoveTable(connection, nam)
                for (fo in fileobjs) dbRemoveTable(connection, fo)
                dbDisconnect(connection)
            }
        })
        if (request.close) {
            if (identical(connection, getOption("sqldf.connection")))
                options(sqldf.connection = NULL)
            return()
        }
    }
    if (request.open || request.nocon) {
        if (is.null(drv)) {
            drv <- if ("package:RpgSQL" %in% search()) {
                "pgSQL"
            }
            else if ("package:RMySQL" %in% search()) {
                "MySQL"
            }
            else if ("package:RH2" %in% search()) {
                "H2"
            }
            else "SQLite"
        }
        drv <- tolower(drv)
        if (drv == "mysql") {
            m <- dbDriver("MySQL")
            connection <- if (missing(dbname) || dbname == ":memory:") {
                dbConnect(m)
            }
            else dbConnect(m, dbname = dbname)
            dbPreExists <- TRUE
        }
        else if (drv == "pgsql") {
            m <- dbDriver("pgSQL")
            if (missing(dbname) || is.null(dbname)) {
                dbname <- getOption("RpgSQL.dbname")
                if (is.null(dbname))
                    dbname <- "test"
            }
            connection <- dbConnect(m, dbname = dbname)
            dbPreExists <- TRUE
        }
        else if (drv == "h2") {
            m <- H2()
            if (missing(dbname) || is.null(dbname))
                dbname <- ":memory:"
            dbPreExists <- dbname != ":memory:" && file.exists(dbname)
            connection <- if (missing(dbname) || dbname == ":memory:") {
                dbConnect(m, "jdbc:h2:mem:", "sa", "")
            }
            else {
                jdbc.string <- paste("jdbc:h2", dbname, sep = ":")
                dbConnect(m, jdbc.string)
            }
        }
        else {
            m <- dbDriver("SQLite")
            if (missing(dbname))
                dbname <- ":memory:"
            dbPreExists <- dbname != ":memory:" && file.exists(dbname)
            if (is.null(getOption("sqldf.dll"))) {
                dll <- Sys.which("libspatialite-1.dll")
                if (dll != "")
                    options(sqldf.dll = dll)
                else options(sqldf.dll = FALSE)
            }
            dll <- getOption("sqldf.dll")
            if (length(dll) != 1 || identical(dll, FALSE) || nchar(dll) == 0) {
                dll <- FALSE
            }
            else {
                if (dll == basename(dll))
                    dll <- Sys.which(dll)
            }
            options(sqldf.dll = dll)
            if (!identical(dll, FALSE)) {
                connection <- dbConnect(m, dbname = dbname, loadable.extensions = TRUE)
                s <- sprintf("select load_extension('%s')", dll)
                dbGetQuery(connection, s)
            }
            else connection <- dbConnect(m, dbname = dbname)
            init_extensions(connection)
        }
        attr(connection, "dbPreExists") <- dbPreExists
        if (missing(dbname) && drv == "sqlite") d
Re: [R] sqldf hanging on macintosh - works on windows
Marc: Installing Simon's package worked perfectly. Thanks so much!
[R] class changed after execution with sqldf
When I run sqldf to merge two datasets, it's changing the Date (class Date) to a numeric value (class factor). Not sure why. Appreciate any insight. Console output for the two datasets and the merged dataset (via sqldf) is listed below.

> summary(df.aggregate)
      Date                 Hour            x
 Min.   :2010-07-01   0      :  64   Min.   : 0.00
 1st Qu.:2010-07-25   1      :  64   1st Qu.: 1.00
 Median :2010-08-16   2      :  64   Median : 9.00
 Mean   :2010-08-16   3      :  64   Mean   :11.77
 3rd Qu.:2010-09-08   4      :  64   3rd Qu.:23.00
 Max.   :2010-09-30   5      :  64   Max.   :32.00
                      (Other):1152
> class(df.aggregate$Date)
[1] "Date"

> summary(df.possible.combos)
      Date                 Hour
 Min.   :2010-07-01   Min.   : 0.00
 1st Qu.:2010-07-25   1st Qu.: 5.75
 Median :2010-08-16   Median :11.50
 Mean   :2010-08-16   Mean   :11.50
 3rd Qu.:2010-09-08   3rd Qu.:17.25
 Max.   :2010-09-30   Max.   :23.00
> class(df.possible.combos$Date)
[1] "Date"

> #merge raw data and all possible combinations
> df.final <- sqldf('select Date, Hour, x as RoomsInUse from "df.possible.combos"
+ left join "df.aggregate" using (Hour, Date)')
> summary(df.final)
      Date           Hour          RoomsInUse
 14791.0:  24   Min.   : 0.00   Min.   : 0.00
 14792.0:  24   1st Qu.: 5.75   1st Qu.: 1.00
 14796.0:  24   Median :11.50   Median : 9.00
 14797.0:  24   Mean   :11.50   Mean   :11.77
 14798.0:  24   3rd Qu.:17.25   3rd Qu.:23.00
 14799.0:  24   Max.   :23.00   Max.   :32.00
 (Other):1392
> class(df.final$Date)
[1] "factor"
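The factor levels shown above look like day counts since 1970-01-01, which is how R stores a Date internally (14791 corresponds to 2010-07-01). Whatever the cause on the Mac, a workaround is to convert the column back by hand. A minimal sketch, with a small made-up factor standing in for df.final$Date:

```r
# Stand-in for df.final$Date as returned in the post
date.as.factor <- factor(c("14791.0", "14792.0", "14796.0"))

# factor -> character -> number of days -> Date
date.restored <- as.Date(as.numeric(as.character(date.as.factor)),
                         origin = "1970-01-01")
```

Going through as.character() first matters: as.numeric() applied directly to a factor returns the level positions rather than the day counts.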
Re: [R] class changed after execution with sqldf
Forgot to mention. This works in the PC implementation of R. The results I'm seeing here are in Mac OS X with X11 and tcl/tk installed.
[R] REmove level with zero observations
If I have a column with 2 levels, but one level has no remaining observations, can I remove the level? I had intended to do it as listed below, but soon realized that even though there are no observations, the level is still there.

For instance,

summary(dbs3.train.sans.influential.obs$HAC)

yields

   0    1
4685    0

and

nlevels(dbs3.train.sans.influential.obs$HAC)

yields

[1] 2

So the following:

drop.list <- NULL
for (i in 1:ncol(dbs3.train.sans.influential.obs)) {
    if (nlevels(dbs3.train.sans.influential.obs[,i]) < 2) {drop.list <- cbind(drop.list,i)}}

yields nothing because HAC still has two levels, even though there aren't any observations in one of the levels. What I want to do is loop through all columns that are factors and create a list of items to drop, because there will subsequently be < 2 levels when I try to run a linear model.
Re: [R] REmove level with zero observations
Ended up working as follows:

dbs3.train.sans.influential.obs <- drop.levels(dbs3.train.sans.influential.obs)
drop.list <- NULL
for (i in 4:ncol(dbs3.train.sans.influential.obs)) {
    if (nlevels(dbs3.train.sans.influential.obs[,i]) < 2) {drop.list <- cbind(drop.list,i)}}
dbs3.train.sans.influential.obs <- dbs3.train.sans.influential.obs[-c(drop.list)]
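A loop-free variant of the same idea, using base R's droplevels() (available in R 2.12.0 and later) in place of gdata's drop.levels(). It checks is.factor() explicitly, since nlevels() returns 0 for non-factor columns, and it uses a logical index, so nothing is lost when no column needs dropping. The small data frame d is invented for illustration:

```r
d <- data.frame(
  y   = c(1.2, 2.3, 3.1, 4.8),
  HAC = factor(c("0", "0", "0", "0"), levels = c("0", "1")),
  grp = factor(c("a", "b", "a", "b"))
)

# Remove levels that no longer have any observations
d <- droplevels(d)

# Factor columns left with fewer than two levels cannot enter the model
too.few <- sapply(d, is.factor) & sapply(d, nlevels) < 2

d.model <- d[, !too.few, drop = FALSE]
```

With negative indexing, an empty drop list would select zero columns, which is the case the logical index guards against.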
[R] sapply/lapply instead of loop
Using the input below, can I do something more elegant (and more efficient) than the loop also listed below to pad strings to a width of 5? The true matrix is about 300K rows and 31 columns.

###
#INPUT
###
> temp
    DX1   DX2   DX3
1 13761  8125 49178
2 63371   v75 22237
3 51745 77703 93500
4 64081 32826   v72
5 78477 43828 87645
>

###
#CODE
###
ssize <- c(nrow(temp), ncol(temp))
aa <- c(1:ssize[2])
aa <- paste("DX", aa, sep = "")
record <- matrix("?", nrow = ssize[1], ncol = ssize[2])
colnames(record) <- aa
mm <- 0
#for (j in 1:1) {
for (j in 1:ssize[1]) {
    mm <- j
    a <- as.character(as.matrix(as.data.frame(temp[j,])))
    len2 <- sum(a != "?")
    mi <- 0
    for (k in 1:len2) {
        aa <- a[k]
        a0 <- 5 - nchar(aa)
        if (a0 > 0) {
            for (st in 1:a0) {
                aa <- paste(aa, " ", sep = "")
            }
        }
        record[j, k] <- aa
    }
}

###
#OUTPUT
###
  DX1   DX2   DX3
1 13761 8125  49178
2 63371 v75   22237
3 51745 77703 93500
4 64081 32826 v72
5 78477 43828 87645
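Padding on the right (left-justifying) can be done in one vectorized call with sprintf() and the "%-5s" format. A minimal sketch, assuming the codes are stored as character; only the first two rows of the sample are reproduced:

```r
temp <- data.frame(
  DX1 = c("13761", "63371"),
  DX2 = c("8125", "v75"),
  DX3 = c("49178", "22237"),
  stringsAsFactors = FALSE
)

record <- as.matrix(temp)

# "%-5s" left-justifies each string in a field of width 5;
# assigning into record[] keeps the dimensions and column names
record[] <- sprintf("%-5s", record)
```

Dropping the minus sign ("%5s") pads on the left instead, which gives " v75" rather than "v75 ".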
Re: [R] sapply/lapply instead of loop
Both of those approaches seem to return (" v75") instead of ("v75 ").
Re: [R] sapply/lapply instead of loop
That works great, and is ever so much simpler. Thanks much!
[R] Intersecting list vs rows in matrix
I know that if I have List_1 and List_2, I can check whether they intersect via the code below:

List_1:
a, b, c, d, e, f, g
List_2:
z, y, x, w, v, u, b

length(intersect(List_1, List_2)) > 0
return = true

If instead I wanted to check a dataframe that is a "list of lists," how would I do that by record without looping?

List_1:
a, b, c, d, e, f, g

List_2:
z, y, x, w, v, u, b
y, z, w, v, v, u, m
z, y, x, a, b, c
.
.
.

return
true
false
true
.
.
.
Re: [R] Intersecting list vs rows in matrix
Very cool. Thanks! -Original Message- From: "Henrique Dallazuanna [via R]" To: Lipori, Gigi Sent: 08/10/2010 05:18:25 PM Subject: Re: Intersecting list vs rows in matrix Try this: colSums(apply(List_2, 1, is.element, List_1)) > 0 On Tue, Aug 10, 2010 at 5:42 PM, GL wrote: > > Know that if I have List_1 and List_2 that I can check to see if the > intersect via the code below: > > List _1: > a, b, c, d, e, f, g > List_2: > z, y, x, w, v, u, b > length(intersect(List_1, List_2)) > 0 > return = true > > If instead I wanted to check a dataframe that is a "list of lists," how > would I do that by record without looping? > > List _1: > a, b, c, d, e, f, g > > List_2: > z, y, x, w, v, u, b > y, z, w, v, v, u, m > z, y, x, a, b, c > . > . > . > > return > true > false > true > , > , > , > > -- > View this message in context: > http://r.789695.n4.nabble.com/Intersecting-list-vs-rows-in-matrix-tp2320427p2320427.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
--
View this message in context: http://r.789695.n4.nabble.com/Intersecting-list-vs-rows-in-matrix-tp2320427p2320642.html
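For anyone who wants to try the suggestion from this thread, here is a minimal runnable sketch. It assumes List_2 is a character matrix with one record per row and that the short third record is padded with NA; both are assumptions, since the thread only shows the data informally.

```r
List_1 <- c("a", "b", "c", "d", "e", "f", "g")

# One record per row; the six-element third record is padded with NA
# (the padding is an assumption, the thread does not say how it is stored)
List_2 <- rbind(
  c("z", "y", "x", "w", "v", "u", "b"),
  c("y", "z", "w", "v", "v", "u", "m"),
  c("z", "y", "x", "a", "b", "c", NA)
)

# apply() over rows returns a logical matrix with one column per record,
# so a column sum above zero means that record shares an element with List_1
hits <- colSums(apply(List_2, 1, is.element, List_1)) > 0

# Equivalent form that reads closer to the original intersect() test
hits_alt <- apply(List_2, 1, function(rec) length(intersect(rec, List_1)) > 0)

hits
```

If List_2 is a data frame rather than a matrix, apply() converts it to a matrix first, so the same call should work as long as all columns hold the same kind of value.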
[R] AHRQ - Creation of Comorbidity Variables
If there are any other users who use AHRQ's SAS code comoanaly2010 and comformat2010 to create comorbidity variables, I thought you might be interested in the following PRELIM code we wrote to mimic its functionality in R. It seems to yield similar results, but may contain errors. Please feel free to comment (kindly) or enhance. I'm sure there are better ways to skin this cat, but we at least took a stab at it. Thought this would be a good use of the community if there are any other interested users.

# Function flag
#
# Intended to provide functionality from AHRQ comformat2010 comoanaly2010
#
# Input, dataframe with
#   id in column 1
#   msdrg in column 2
#   diagnosis in columns 4-53
# Output, numeric list with id and one element per cc
#   dimnames = c('ID',
#     'CHF','VALVE','PULMCIRC','PERIVASC',
#     'HTN_C','PARA','NEURO','CHRNLUNG','DM',
#     'DMCX','HYPOTHY','RENLFAIL','LIVER','ULCER',
#     'AIDS','LYMPH','METS','TUMOR','ARTH',
#     'ANEMDEF','ALCOHOL','DRUG','PSYCH','DEPRESS') )

flag = function(data, k) {
  data = data[k, ]
  print(data)
  print(k)
  id = as.matrix(data[1])
  DX = data[4:53]
  DX = as.matrix(DX)
  DRG = as.matrix(data[2])

  ##format
  chf = c(39891, 4280:4289, 42800:42889)
  v1 = paste(0, 9320:9324, sep = "")
  v5 = paste("V422", "", sep = "")
  v6 = paste("V433", "", sep = "")
  valve = c(v1, 3940:3971, 39400:39709, 3979, 4240:4249, 42400:42499,
            7463:7466, 74630:74659, v5, v6)
  pulmcirc = c(41511:41519, 4160:4169, 41600:41689, 4179)
  p3 = paste(c(4471, 5571, 5579, "V434"), "", sep = "")
  perivasc = c(4400:4409, 44000:44089, 4411:4419, 44100:44189, 4420:4429,
               44200:44289, 4431:4439, 44310:44389, 44421:44422, p3, 449)
  htn = c(4011, 4019, 64200:64204)
  htncx = c(4010, 4372)

  # the following are special, temporary formats used in the creation of the
  # hypertension complicated comorbidity when overlapping with congestive
  # heart failure or renal failure occurs. These temporary formats are
  # referenced in the program called comoanaly2009.txt
  htnpreg = c(64220:64224)
  htnwochf = c(40200, 40210, 40290, 40509, 40519, 40599)
  htnwchf = c(40201, 40211, 40291)
  hrenworf = c(40300, 40310, 40390, 40501, 40511, 40591, 64210:64214)
  hrenwrf = c(40301, 40311, 40391)
  hhrwohrf = c(40400, 40410, 40490)
  hhrwchf = c(40401, 40411, 40491)
  hhrwrf = c(40402, 40412, 40492)
  hhrwhrf = c(40403, 40413, 40493)
  ohtnpreg = c(64270:64274, 64290:64294)

  para = c(3420:3449, 34200:34489, 43820:43853, 78072)
  neuro = c(3300:3319, 33000:33189, 3340:3359, 33400:33589, 3411:3419,
            34110:34189, 3452:3453, 34520:34529, 3320, 3334, 3335, 3337,
            3380, 7687, 7803, 7843, 340, 33371, 33372, 33379, 33385, 33394,
            34500:34511, 34540:34591, 34700:34701, 34710:34711, 64940:64944,
            76870:76873, 78031, 78032, 78039, 78097)
  chrnlung = c(490:492, 4900:4928, 49000:49279, 49300:49392, 494, 4940:4941,
               49400:49409, 496:505, 4950:5049, 49500:50499, 5064)
  dm = c(25000:25033, 64800:64804, 24900:24931)
  dmcx = c(25040:25093, 7751, 24940:24991)
  hypothy = c(243:244, 2430:2442, 24300:24419, 2448, 2449)
  renlfail3 = paste(c("V420", "V451", "V568"), "", sep = "")
  renlfail4 = paste("V", c(4511:4512), sep = "")
  renlfail5 = paste("V", c(560:563, 5600:5632), sep = "")
  renlfail = c(5853:5856, 5859, 586, renlfail3, renlfail4, renlfail5)
  liver1 = paste(0, c(7022, 7023, 7032, 7033, 7044, 7054), sep = "")
  liver = c(liver1, 4560, 4561, 45620, 45621, 5710, 5712, 5713, 57140:57149,
            5715:5716, 5718:5719, 5723, 5728, "V427")
  ulcer1 = paste(531, c(41, 51, 61, 70, 71, 91), sep = "")
  ulcer2 = paste(532, c(41, 51, 61, 70, 71, 91), sep = "")
  ulcer3 = paste(533, c(41, 51, 61, 70, 71, 91), sep = "")
  ulcer4 = paste(534, c(41, 51, 61, 70, 71, 91), sep = "")
  ulcer = c(ulcer1, ulcer2, ulcer3, ulcer4)
  aids = paste(0, c(42:44, 420:449, 4200:4289), sep = "")
  lymph = c(2:20238, 20250:20301, 20302:20382, 2386, 2733)
  mets = c(1960:1991, 19600:19909, 20970:20975, 20979, 78951)
  tumor = c(1400:1729, 1740:1759, 14000:17289, 17400:17589, 20900:20924,
            20931:20936, 25801:25803, 2093, 20925:20929, 179:195, 1790:1958,
            17900:19579)
  arth = c(7010, 7100:7109, 7140:7149, 7200:7209, 71000:71089, 71400:71489,
           72000:72089, 725)
  c1 = paste(c(2860:2869, 2871, 2873:2875), "", sep = "")
  coag = c(2860:2869, 2871, 2873:2875, 28600:28689, 28730:28749, 64930:64934, 28
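The listing above is cut off before the code lists are applied to the diagnosis columns. As a rough sketch of how one such list could be turned into a flag, membership can be tested row by row with %in%. This is an illustration only: the helper has_code and the toy DX matrix are hypothetical, and the sketch ignores the msdrg column that the posted function also reads.

```r
# Code list copied from the listing above
chf <- c(39891, 4280:4289, 42800:42889)

# Toy diagnosis matrix: one row per record, codes stored as character,
# NA where a diagnosis slot is unused (hypothetical data)
DX <- rbind(
  c("4280",  "25000", NA),
  c("4019",  "2449",  NA),
  c("25000", "39891", "4011")
)

# Hypothetical helper: 1 if any diagnosis in a row appears in the code list.
# %in% compares the numeric codes with the character codes as strings,
# and treats NA as "no match".
has_code <- function(dx, codes) {
  as.integer(apply(dx, 1, function(row) any(row %in% codes)))
}

CHF <- has_code(DX, chf)
CHF
```

Lists built with paste(), such as liver1 or aids, are already character vectors with leading zeros, so the same helper should accept them unchanged.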
[R] bwplot in loop
I have the two loops listed below. The first executes perfectly and creates a series of density plots. The second does not produce any output; however, if I enter the exact bwplot command after the loop executes, I get output for the last value in the list of services. Why am I not getting output for bwplot inside the loop?

#INSIDE LOOP PERFECT OUTPUT - ONE FOR EACH SERVICE
for (Service in Service.Levels) {
  dbs1 <- subset(dbs, Service.Code == Service)
  if (nrow(dbs1) > 2) {
    plot(density(dbs1$Enter.to.Incision))
  }
}

#INSIDE LOOP NO OUTPUT
for (Service in Service.Levels) {
  dbs1 <- subset(dbs, Service.Code == Service)
  bwplot(M.Surg..Last.Name ~ Enter.to.Incision, data = dbs1)
}

#AFTER LOOP AS SINGLE STATEMENT - PERFECT OUTPUT FOR LAST SERVICE IN LIST
bwplot(M.Surg..Last.Name ~ Enter.to.Incision, data = dbs1)

--
View this message in context: http://r.789695.n4.nabble.com/bwplot-in-loop-tp2220020p2220020.html
Re: [R] bwplot in loop
Subsequently saw that this is covered in the R FAQ. See: http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-do-lattice_002ftrellis-graphics-not-work_003f

--
View this message in context: http://r.789695.n4.nabble.com/bwplot-in-loop-tp2220020p2220034.html
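The FAQ entry comes down to this: lattice functions such as bwplot() return a "trellis" object, and the plot is only drawn when that object is printed. At the top level R prints it automatically, but inside a for loop nothing is printed unless print() is called. A minimal sketch of the fixed loop follows; the toy dbs data frame is made up for illustration, and only the column names come from the original post.

```r
library(lattice)

# Toy stand-in for the poster's data (hypothetical values)
set.seed(1)
dbs <- data.frame(
  Service.Code      = rep(c("A", "B"), each = 20),
  M.Surg..Last.Name = factor(rep(c("Smith", "Jones", "Lee", "Khan"), each = 10)),
  Enter.to.Incision = rnorm(40, mean = 30, sd = 5)
)
Service.Levels <- unique(dbs$Service.Code)

pdf(NULL)  # throwaway device so the sketch runs anywhere; drop to draw on screen
for (Service in Service.Levels) {
  dbs1 <- subset(dbs, Service.Code == Service)
  # bwplot() only builds the plot object; print() is what draws it
  p <- bwplot(M.Surg..Last.Name ~ Enter.to.Incision, data = dbs1)
  print(p)
}
dev.off()
```

The density loop in the question did not need this because plot() from base graphics draws as a side effect rather than returning an object to be printed.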
[R] SQL Changing Data Type
Passing two dates into a SQL statement (sqldf) returns a factor. I tried converting it back to a Date via as.Date, but I get the error: character string is not in a standard unambiguous format. Any thoughts appreciated. Code and results are listed below:

> summary(df.possible.combos)
      Date                 Hour
 Min.   :2011-03-01   Min.   : 0.00
 1st Qu.:2011-03-23   1st Qu.: 5.75
 Median :2011-04-14   Median :11.50
 Mean   :2011-04-14   Mean   :11.50
 3rd Qu.:2011-05-06   3rd Qu.:17.25
 Max.   :2011-05-31   Max.   :23.00

> summary(df.aggregate)
      Date                 Hour            x
 Min.   :2011-03-01   16     : 82   Min.   : 1.000
 1st Qu.:2011-03-22   17     : 82   1st Qu.: 1.000
 Median :2011-04-13   18     : 82   Median : 2.000
 Mean   :2011-04-14   19     : 79   Mean   : 4.195
 3rd Qu.:2011-05-07   20     : 76   3rd Qu.: 7.000
 Max.   :2011-05-31   7      : 75   Max.   :20.000
                      (Other):377

> #merge raw data and all possible combinations
> df.final <- sqldf('select Date, Hour, x as RoomsInUse from "df.aggregate"
+   left join "df.possible.combos" using (Hour, Date)')
> summary(df.final)
      Date          Hour        RoomsInUse
 15069.0: 16   16     : 82   Min.   : 1.000
 15114.0: 16   17     : 82   1st Qu.: 1.000
 15063.0: 15   18     : 82   Median : 2.000
 15082.0: 15   19     : 79   Mean   : 4.195
 15125.0: 15   20     : 76   3rd Qu.: 7.000
 15044.0: 14   7      : 75   Max.   :20.000
 (Other):762   (Other):377

> thedate <- as.Date(df.final$Date)
Error in charToDate(x) :
  character string is not in a standard unambiguous format

--
View this message in context: http://r.789695.n4.nabble.com/SQL-Changing-Data-Type-tp3623508p3623508.html
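One possible way out, offered as a sketch rather than a diagnosis of what sqldf did: the factor levels in summary(df.final) (15069.0, 15114.0, ...) look like day counts since 1970-01-01, which is how R stores Date values internally, and they fall inside the March to May 2011 range of the inputs. Assuming that reading is right, the column can be converted by going from factor to character to numeric and then supplying the origin. The Date_factor vector below is a stand-in built from the values in the summary, not the poster's actual data.

```r
# Stand-in for df.final$Date: a factor whose levels are day counts
# written as text (values taken from the summary output above)
Date_factor <- factor(c("15069.0", "15114.0", "15063.0", "15044.0"))

# as.Date(Date_factor) fails because "15069.0" is not a date string.
# Convert the labels (not the internal factor codes) to numbers first,
# then tell as.Date() which day the count starts from.
day_counts <- as.numeric(as.character(Date_factor))
thedate <- as.Date(day_counts, origin = "1970-01-01")
thedate
```

Calling as.numeric() directly on the factor would return the level indices (1, 2, 3, ...) instead of the day counts, which is why the as.character() step is there.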