Re: [R] Parsing XML?

2022-07-28 Thread Spencer Graves
Hi, Richard et al.: On 7/28/22 1:50 AM, Richard O'Keefe wrote: What do you mean by "a list that I can understand"? A quick tally of the number of XML elements by identifier: 1 echoedSearchRetrieveRequest 1 frbrGrouping 1 maximumRecords 1 nextRecordPosition 1 numberOfRecords 1 query 1 records 1

Re: [R] Parsing a Date

2020-08-03 Thread Rui Barradas
Hello, I'm reposting, I sent the previous in HTML format. My apologies, I'm not at my computers. And another solution, taking advantage of Rasmus' one: simplify2array(parallel::mclapply(c(  "%Y",  "%m",  "%d",  "%H"), function(fmt, x) {  as.integer(format(as.POSIXct(x), format = fmt)) }, x = d

Re: [R] Parsing a Date

2020-08-02 Thread Rui Barradas
Hello, And another solution, taking advantage of Rasmus' one: simplify2array(parallel::mclapply(c("%Y", "%m", "%d", "%H"), function(fmt, x) { as.integer(format(as.POSIXct(x), format = fmt)) }, x = dta$forecast.date)) # [,1] [,2] [,3] [,4] #[1,] 2020 8 1 12 #[2,] 20

Re: [R] Parsing a Date

2020-08-02 Thread Rasmus Liland
On 2020-08-02 09:24 -0700, Philip wrote: | Below is some Weather Service data. I | would like to parse the forecast date | field into four different columns: | Year, Month, Day, Hour Dear Philip, I'm largely re-iterating Eric and Jeff's excellent solutions: > dat <- structure(list(f

Re: [R] Parsing a Date

2020-08-02 Thread Jeff Newmiller
Learn to post plain text and use dput to include data: dta <- structure(list(forecast.date = c("2020-08-01 12:00:00", "2020-08-01 12:00:00", "2020-08-01 12:00:00", "2020-08-01 12:00:00", "2020-08-01 12:00:00" ), levels = c("1000 mb", "1000 mb", "1000 mb", "1000 mb", "1000 mb" ), lon = c(-113.13

Re: [R] Parsing a Date

2020-08-02 Thread Eric Berger
If the forecast.date column is of type character you can use lubridate to do this: > library(lubridate) > a <- "2020-08-01 12:00:00" > year(a) # [1] 2020 > month(a) # [1] 8 etc On Sun, Aug 2, 2020 at 7:24 PM Philip wrote: > Below is some Weather Service data. I would like to parse the foreca

[R] Parsing a Date

2020-08-02 Thread Philip
Below is some Weather Service data. I would like to parse the forecast date field into four different columns: Year Month Day Hour I would like to drop the final four zeros. Any suggestions? forecast.date levels lon lat HGT RH
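For readers skimming this thread, here is a minimal base R sketch of the split into Year, Month, Day and Hour. It assumes the field is character in the "YYYY-MM-DD HH:MM:SS" layout quoted in the replies; the two-row data frame and the `tz = "UTC"` choice are illustrative assumptions, not the poster's data.

```r
# Hypothetical sample in the layout shown in the thread
dta <- data.frame(
  forecast.date = c("2020-08-01 12:00:00", "2020-08-02 06:00:00"),
  stringsAsFactors = FALSE
)

# Parse once into POSIXlt, whose components can be read directly.
# tz = "UTC" is an assumption; use the zone the data was recorded in.
lt <- as.POSIXlt(dta$forecast.date, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

dta$Year  <- lt$year + 1900L  # POSIXlt stores years since 1900
dta$Month <- lt$mon + 1L      # POSIXlt months are 0-based
dta$Day   <- lt$mday
dta$Hour  <- lt$hour          # minutes and seconds are simply not copied

print(dta)
```

Parsing once and reading the components avoids calling format() four times, which may matter on long vectors.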

Re: [R] parsing DOB data

2020-04-17 Thread Jim Lemon
Hi Peter, I worked out a neat function to add the century to short dates. It works fine on its own, but sadly it bombs when used with sapply. Maybe someone else can point out my mistake: add_century<-function(x,changeover=68,previous=19,current=20,pos=1,sep="-") { xsplit<-unlist(strsplit(x,sep))

Re: [R] parsing DOB data

2020-04-16 Thread Jim Lemon
Hi Peter, One way is to process the strings before converting them to dates: x2<-c("45-12-03","01-06-24","04-9-15","1901-03-04") add_century<-function(x,changeover=68,previous=19,current=20) { centuries<-sapply(sapply(x,strsplit,"-"),"[",1) shortyears<-which(!(nchar(centuries)>2)) century<-rep(

Re: [R] parsing DOB data

2020-04-16 Thread Pär Leijonhufvud
. par.leijonhuf...@regionjh.se Sjukhuskemist +46(0)63-153 376, +46-(0)70-242 7006 Laboratoriemedicin Östersunds sjukhus -Original Message- From: R-help On Behalf Of Peter Nelson via R-help Sent: den 1

[R] parsing DOB data

2020-04-16 Thread Peter Nelson via R-help
I have a data set (.csv) with date (eg date of birth) information stored as character vectors that I’m attempting to transform to POSIXct objects using the package lubridate (1.7.4). The problem that I’m trying to address is that my two digit years are invariably (?) parsed to 20xx. For example,
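One possible way to handle the century before conversion is sketched below in base R. The function name `expand_year` and the cutoff of 20 are hypothetical choices for illustration (two-digit years above the cutoff are treated as 19xx, the rest as 20xx); this is not the function posted in the replies, and it assumes the year is the first "-"-separated field.

```r
# Hypothetical helper: prefix a century onto two-digit years.
# Years already written with more than two digits are left untouched.
expand_year <- function(x, cutoff = 20, sep = "-") {
  parts <- strsplit(x, sep, fixed = TRUE)
  vapply(parts, function(p) {
    if (nchar(p[1]) <= 2) {
      yy <- as.integer(p[1])
      century <- if (yy > cutoff) 19 else 20
      p[1] <- sprintf("%d%02d", century, yy)
    }
    paste(p, collapse = sep)
  }, character(1))
}

x2 <- c("45-12-03", "01-06-24", "04-9-15", "1901-03-04")
fixed_dates <- expand_year(x2)
print(fixed_dates)

# With four-digit years in place, %Y is unambiguous
dob <- as.Date(fixed_dates, format = "%Y-%m-%d")
```

The cutoff has to come from knowledge of the data (for dates of birth, nothing later than the collection year can be valid), so treat the default as a placeholder.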

Re: [R] parsing files with "\" character

2019-08-27 Thread Jeff Newmiller
The principles of regex are basically the same between R and those other languages, so I don't see why you would switch... but if you did, asking here would be inappropriate. I think the short answer is yes, but can't be specific without a reproducible example. ([1] is recommended but not requi

Re: [R] parsing files with "\" character

2019-08-27 Thread Michael Dewey
Dear April Can you show us an example of what you are trying to do and how it fails? There are rules about backslashes but I find that if one backslash does not work try two, three, four until it works. It would be better to understand the rules but life is short. Michael On 27/08/2019 06:56

[R] parsing files with "\" character

2019-08-27 Thread April Ettington
Is there any way to parse files that include the \ character in a string? When I try to use grep to extract strings with a pattern that includes "\" it fails. If there is no way to do it with R, is it possible with python or a bash script? Thank you, April
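A short sketch of the two usual ways to match a literal backslash in base R. The sample strings are made up. The point is that the backslash is an escape character both in R string literals and in regular expressions, so a regex needs four in the source, while `fixed = TRUE` needs only two.

```r
# Each "\\" in R source code is ONE backslash in the actual string
x <- c("C:\\temp\\data.csv", "plain text", "a\\b")
stopifnot(nchar("a\\b") == 3)

# Option 1: treat the pattern as a literal string, not a regex
has_bs_fixed <- grepl("\\", x, fixed = TRUE)

# Option 2: regex; the pattern string is \\ which the regex engine
# reads as one escaped backslash
has_bs_regex <- grepl("\\\\", x)

print(has_bs_fixed)
print(has_bs_regex)
```

When reading from a file with readLines() the backslashes arrive as-is; the doubling applies only to literals typed in R code.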

Re: [R] parsing the file

2016-08-28 Thread Jeff Newmiller
Based on the discussion of ORing values with characters in [1] which may generate "unusual" characters I suspect a botched conversion from EBCDIC may have messed with some of the data. If there are signed data fields then OP may need to read the original file and treat it as if it were binary da

Re: [R] parsing the file

2016-08-28 Thread jim holtman
Here is an attempt at parsing the data. It is fixed field so the regular expression will extract the data. Some does not seem to make sense since it has curly brackets in the data. Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not h

Re: [R] parsing a complex file

2016-08-27 Thread jim holtman
It is not clear as to how you want to parse the file. You need to at least provide an example of what you expect from the output. You mention " the detail which begins with 2 at byte location 1 to another file"; I don't see the '2' at byte location 1. Jim Holtman Data Munger Guru What is the p

[R] parsing a complex file

2016-08-27 Thread Glenn Schultz
All, I have a complex file I would like to parse in R a sample is described below The header is 1:200 and the detail is 1 to 200.  I have written code to parse the file so far.  As follows: numchar <- nchar(x = data, type = "chars") start <- c(seq(1, numchar, 398)) end <- c(seq(398, numchar, 3

Re: [R] Parsing and counting expressions in .txt-files

2016-04-20 Thread Bert Gunter
also check out this CRAN task view: https://cran.r-project.org/web/views/NaturalLanguageProcessing.html Cheers, Bert Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip

Re: [R] Parsing and counting expressions in .txt-files

2016-04-20 Thread Bert Gunter
I suggest you go through some R tutorials to learn about R's capabilities. Some recommendations can be found here: https://www.rstudio.com/online-learning/#R To answer your specific query: ?scan ## Because you do not specify file format. ?grep ?regexp ## to use regular expressions to find tex

[R] Parsing and counting expressions in .txt-files

2016-04-20 Thread Alexander Nikles
Dear Community, I hope that I have the right category selected because I am relatively new to the "R" world. I come with a relatively challenging problem in the luggage. I would like to realize, that "R" reads text files (there are several hundred pieces in my folder) sequentially, and screens

Re: [R] Parsing XML File

2015-10-13 Thread Lorenzo Isella
Dear Jim, Thanks for your reply. What you did is 100% what I need -- I now have a data frame with the relevant data and I can take up from there. Regards Lorenzo On Sun, Oct 11, 2015 at 03:54:10PM -0400, jim holtman wrote: Not sure exactly what you want since you did not show an expected output

Re: [R] Parsing XML File

2015-10-11 Thread jim holtman
Not sure exactly what you want since you did not show an expected output, but this will extract the attributes from AccVal in the structure: > # > library(XML) > > xmlfile=xmlParse("/temp/account.xml") > > class(xmlfile) #"XMLI

[R] Parsing XML File

2015-10-11 Thread Lorenzo Isella
Dear All, I am struggling with the parsing of the xml file you can find at https://www.dropbox.com/s/i4ld5qa26hwrhj7/account.xml?dl=0 Essentially, I would like to be able to convert it to a data.frame to manipulate it in R and detect all the attributes of an account for which unrealizedPNL goes

Re: [R] Parsing all rows & columns of a Dataframe into one column

2015-08-11 Thread Anshuk Pal Chaudhuri
ta) mat<-na.omit(mat) dim(mat) <- NULL newdata <- data.frame(mat, stringsAsFactors=FALSE) Cheers Petr > -Original Message- > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Anshuk > Pal Chaudhuri > Sent: Monday, August 10, 2015 7:24 AM > To: r-h

Re: [R] Parsing all rows & columns of a Dataframe into one column

2015-08-09 Thread PIKAL Petr
Cheers Petr > -Original Message- > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Anshuk > Pal Chaudhuri > Sent: Monday, August 10, 2015 7:24 AM > To: r-help@r-project.org > Subject: [R] Parsing all rows & columns of a Dataframe into one column > >

[R] Parsing all rows & columns of a Dataframe into one column

2015-08-09 Thread Anshuk Pal Chaudhuri
Hi All, I am using R for reading certain values in a dataset. I have values in a data frame all scattered in different columns & rows, some values might be NA as well. e.g. below three columns V1, V2,V3, and their respective values. V1 V2 V3 NA NA 90 abc 89.09 $50 76799 N
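A minimal sketch of collapsing all cells into one column and dropping the NA values, in base R. The small data frame below is a hypothetical stand-in built from the values visible in the question; the column name `value` is also an assumption.

```r
# Hypothetical stand-in for the scattered data
df <- data.frame(
  V1 = c(NA, "abc", "76799"),
  V2 = c(NA, "89.09", NA),
  V3 = c("90", "$50", NA),
  stringsAsFactors = FALSE
)

# Convert every column to character, then stack them column by column
vals <- unlist(lapply(df, as.character), use.names = FALSE)

# Drop the missing cells and wrap the result as a one-column data frame
vals <- vals[!is.na(vals)]
newdata <- data.frame(value = vals, stringsAsFactors = FALSE)

print(newdata)
```

The values come out in column order (all of V1, then V2, then V3); transpose first if row order is needed.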

Re: [R] Parsing large amounts of csv data with limited RAM

2015-07-14 Thread jim holtman
take a look at the sqldf package because it has the ability to load a csv file to a database from which you can then process the data in pieces Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. On Tue, Jul 14,

Re: [R] Parsing large amounts of csv data with limited RAM

2015-07-14 Thread Jeff Newmiller
You seem to want your cake and eat it too. Not unexpected, but you may have your work cut out to learn about the price of having it all. Plotting: pretty silly to stick with gigabytes of data in your plots. Some kind of aggregation seems required here, with the raw data being a stepping stone to

[R] Parsing large amounts of csv data with limited RAM

2015-07-14 Thread Dupuis, Robert
I'm relatively new to using R, and I am trying to find a decent solution for my current dilemma. Right now, I am currently trying to parse second data from a 7 months of CSV data. This is over 10GB of data, and I've run into some "memory issues" loading them all into a single dataset to be plot
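One base R approach that keeps memory bounded is to read the file in chunks through an open connection and keep only running aggregates. The sketch below writes a tiny temporary CSV so that it is self-contained; the column names, the chunk size of 4 and the choice of a running mean are all illustrative assumptions.

```r
# Build a small hypothetical CSV to stand in for the large files
path <- tempfile(fileext = ".csv")
write.csv(data.frame(t = 1:10, value = (1:10) * 2), path, row.names = FALSE)

con <- file(path, open = "r")

# Read the header line once and strip the quotes write.csv adds
header <- strsplit(readLines(con, n = 1), ",", fixed = TRUE)[[1]]
header <- gsub('"', "", header, fixed = TRUE)

chunk_size <- 4  # tiny for illustration; use something like 1e5 in practice
total <- 0
n <- 0

repeat {
  lines <- readLines(con, n = chunk_size)
  if (length(lines) == 0) break            # end of file
  chunk <- read.csv(text = lines, header = FALSE, col.names = header)
  total <- total + sum(chunk$value)        # keep only the aggregate
  n <- n + nrow(chunk)
}
close(con)

mean_value <- total / n
print(mean_value)
```

Because the connection stays open, each readLines() call continues where the previous one stopped, so only one chunk is ever held in memory.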

Re: [R] Parsing Google Finance page data?

2014-11-20 Thread Matt Considine
FWIW, this is the kludge I came up with. The idea is that I only know the name of the company and not the ticker/exchange. So the following admittedly doesn't work in all cases (e.g. "Time Warner"). So if anyone alternatively knows how to return a list of tickers/exchanges of companies match

Re: [R] Parsing Google Finance page data?

2014-11-20 Thread Collin Lynch
If you do not need a pure R solution, you might also find it helpful to blend languages. For scraping and munging tasks such as this I generally turn to python to do extraction then feed data to R for analysis via rpy. On Thu, Nov 20, 2014 at 8:57 PM, Spencer Graves < spencer.gra...@structuremoni

Re: [R] Parsing Google Finance page data?

2014-11-20 Thread Spencer Graves
The Ecfun package includes functions written to scrape data from web pages. See, e.g., readUShouse, readUSsenate, readUSstateAbbreviations. They use getURL{RCurl} and readHTMLTable{XML}. Hope this helps. Spencer Graves On 11/20/2014 5:42 PM, Matt Considine wrote: Hi,

[R] Parsing Google Finance page data?

2014-11-20 Thread Matt Considine
Hi, I'm wondering if anyone can point me to code to parse data on Google Finance pages, i.e. parse the results of a URL request such as this http://www.google.com/finance?q=apple I know how to return the contents of the page; it's figuring out the best tools to parse it that I'm interested i

Re: [R] Parsing XML file to data frame

2014-05-06 Thread David Winsemius
On May 5, 2014, at 11:42 AM, Timothy W. Cook wrote: > I didn't find an attached XML file. Maybe the list removes attachments? The list does not remove all attachments. It removes ones that are not among the listed acceptable formats. XML is not among the list of acceptable formats. If it had b

Re: [R] Parsing XML file to data frame

2014-05-06 Thread starcraz
Tim - the file is a hyperlink at the beginning of the message called 'sample.xml' or here's the hyperlink http://r.789695.n4.nabble.com/file/n4689883/sample.xml

Re: [R] Parsing XML file to data frame

2014-05-05 Thread Timothy W. Cook
I didn't find an attached XML file. Maybe the list removes attachments? You might try posting to StackOverflow.com if this is the case. On Fri, May 2, 2014 at 2:17 PM, starcraz wrote: > Hi all - I am trying to parse out the attached XML file into a data frame. > The file is extracted from Had

[R] Parsing XML file to data frame

2014-05-02 Thread starcraz
Hi all - I am trying to parse out the attached XML file into a data frame. The file is extracted from Hadoop File Services (HFS). I am new in using the XML package so need some help in parsing out the data. Below is some code that I explore to get the attribute into a data frame. Any help is apprec

Re: [R] Parsing aspects of a url path in R

2014-03-06 Thread arun
Hi, In addition, you could also do: gsub(".*www\\.([[:alnum:]]+\\.[[:alnum:]]+).*","\\1",url) #[1] "mdd.com"    "mdd.com"    "mdd.com"    "genius.com" "google.com"  gsub(".*www\\.([[:alnum:]]+\\.[[:alnum:]]+).*","\\1",url2) #[1] "mdd.com"    "mdd.com"    "mdd.edu"    "genius.gov" "google.com" gs

Re: [R] Parsing aspects of a url path in R

2014-03-06 Thread arun
Try: gsub(".*\\.com","",url) [1] "/food/pizza/index.html" "/build-your-own/index.html" [3] "/special-deals.html"    "/find-a-location.html" [5] "/hello.html"     gsub(".*www\\.([[:alpha:]]+\\.com).*","\\1",url) #[1] "mdd.com"    "mdd.com"    "mdd.com"    "genius.com" "go

Re: [R] Parsing aspects of a url path in R

2014-03-06 Thread Ben Tupper
Hi, The XML package has a nice function, parseURI(), that nicely slices and dices the url. library(XML) parseURI('http://www.mdd.com/food/pizza/index.html') Might that help? Cheers, Ben On Mar 6, 2014, at 12:23 PM, Abraham Mathew wrote: > Let's say that I have the following character vecto

Re: [R] Parsing aspects of a url path in R

2014-03-06 Thread Abraham Mathew
Oh, that's perfect. I can just use one of the apply functions to run that each url and then extract the methods that I need. Thanks! On Thu, Mar 6, 2014 at 11:52 AM, Ben Tupper wrote: > Hi, > > The XML package has a nice function, parseURI(), that nicely slice and > dices the url. > > libr

Re: [R] Parsing aspects of a url path in R

2014-03-06 Thread Ista Zahn
See the parse_url function in the httr package. It does all this and more. On Mar 6, 2014 2:45 PM, "Sarah Goslee" wrote: > There are many ways to do this. Here's a simple version and a slightly > fancier version: > > > url = c("http://www.mdd.com/food/pizza/index.html", > "http://www.mdd.com/bui

Re: [R] Parsing aspects of a url path in R

2014-03-06 Thread Sarah Goslee
There are many ways to do this. Here's a simple version and a slightly fancier version: url = c("http://www.mdd.com/food/pizza/index.html", "http://www.mdd.com/build-your-own/index.html", "http://www.mdd.com/special-deals.html", "http://www.genius.com/find-a-location.html", "http://www.google

[R] Parsing aspects of a url path in R

2014-03-06 Thread Abraham Mathew
Let's say that I have the following character vector with a series of url strings. I'm interested in extracting some information from each string. url = c("http://www.mdd.com/food/pizza/index.html", "http://www.mdd.com/build-your-own/index.html", "http://www.mdd.com/special-deals.html"
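A base R sketch of pulling the domain and the path out of such strings with sub(). It uses only the four URLs that appear in full in this thread, and it assumes every string starts with a scheme followed by "://"; query strings and ports are not handled.

```r
url <- c("http://www.mdd.com/food/pizza/index.html",
         "http://www.mdd.com/build-your-own/index.html",
         "http://www.mdd.com/special-deals.html",
         "http://www.genius.com/find-a-location.html")

# Host: everything between "://" and the next "/"
host <- sub("^[a-z]+://([^/]+).*$", "\\1", url)

# Domain: the host without a leading "www."
domain <- sub("^www\\.", "", host)

# Path: what is left after removing the scheme and the host
path <- sub("^[a-z]+://[^/]+", "", url)

print(data.frame(domain, path, stringsAsFactors = FALSE))
```

For anything beyond simple cases, a dedicated parser such as the ones suggested in the replies is likely the safer choice.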

Re: [R] Parsing Complex Text in Single Cell

2014-01-30 Thread arun
Another way would be: library(qdap) library(stringr) x <- scan(what="character",)  x1 <- c(x,x) x1 <- paste(x1,collapse=" ")  x2 <- gsub('"',"",bracketXtract(x1,"curly"))  res2 <- as.data.frame(str_trim(do.call(rbind,genXtract(paste0(x2,","),":",","))),stringsAsFactors=FALSE) res2[,1:3] <-

Re: [R] Parsing Complex Text in Single Cell

2014-01-30 Thread Rui Barradas
Hello, Maybe something like the following. x <- scan(what = "character", text = ' {"trial":1,"corr":1,"resp_dur":799,"stim":"←←←←←","cond":"congruent"},{"trial":2,"corr":1,"resp_dur":0,"stim":"xx→xx","cond":"nogo"},{"trial":3,"corr":0,"resp_dur

[R] Parsing Complex Text in Single Cell

2014-01-29 Thread Patzelt, Edward
R Experts - We have a complex problem whereby Qualtrics exported our data into a single cell as seen below. We attempted to parse it using scan() without much success. Hoping to get a little nudge here. I've posted the full data set here: https://www.dropbox.com/s/e246uiui6jrux6c/CoopandSelfContr

[R] Parsing useragent in R

2013-10-08 Thread Aviad Klein
Hello, Has any1 ever tried to parse a useragent string to get the OS and Browser information out of it? There are many Java, PHP and Python ways to do it, but I was wondering if there is a way to do it with R. Googling retrieved no help at all. A useragent string might look like this : "mozilla/

Re: [R] Parsing XML to tree.

2013-05-08 Thread Gabor Grothendieck
On Wed, May 8, 2013 at 3:43 AM, avinash sahu wrote: > Hi All, > > I am struggling to parse a XML file that describes a tree. The XML file is > present here: > http://www.emouseatlas.org/emap/ema/theiler_stages/StageDefinition/Stage_xml/TS23.xml > I want is simple list of parents id of each compone

Re: [R] Parsing XML to tree.

2013-05-08 Thread avinash sahu
Hi All, I am using XML package. Even this type of simple parse is not giving intended result. tissue.tree <- xmlTreeParse("http://www.emouseatlas.org/emap/ema/theiler_stages/StageDefinition/Stage_xml/TS23.xml", handlers=list( anatomy=function(x,attr) {x},

Re: [R] Parsing XML to tree.

2013-05-08 Thread Ben Tupper
Hi, On May 8, 2013, at 3:43 AM, avinash sahu wrote: > Hi All, > > I am struggling to parse a XML file that describes a tree. The XML file is > present here: > http://www.emouseatlas.org/emap/ema/theiler_stages/StageDefinition/Stage_xml/TS23.xml > I want is simple list of parents id of each compo

[R] Parsing XML to tree.

2013-05-08 Thread avinash sahu
Hi All, I am struggling to parse an XML file that describes a tree. The XML file is present here: http://www.emouseatlas.org/emap/ema/theiler_stages/StageDefinition/Stage_xml/TS23.xml What I want is a simple list of the parent ids of each component. The output will look like Component = [7148 7149 7150 7

Re: [R] Parsing of the comment character when sourcing a file

2012-11-28 Thread Duncan Murdoch
On 12-11-28 6:12 AM, Jim Lemon wrote: In the course of producing some plots for a publication, I wanted to mark the places where double counting of cases had occurred. I used three symbols, “*”, “^” and “#”. While these worked fine if the code was pasted into the R console, the “#” (comment chara

[R] Parsing of the comment character when sourcing a file

2012-11-28 Thread Jim Lemon
In the course of producing some plots for a publication, I wanted to mark the places where double counting of cases had occurred. I used three symbols, “*”, “^” and “#”. While these worked fine if the code was pasted into the R console, the “#” (comment character) was recognized even when quote

Re: [R] Parsing very large xml datafiles with SAX: How to profile functions?

2012-10-26 Thread Duncan Temple Lang
Hi Frederic Perhaps the simplest way to profile the individual functions in your handlers is to write the individual handlers as regular named functions, i.e. assigned to a variable in your work space (or function body) and then to write the handler functions as wrapper functions that call thes

Re: [R] Parsing very large xml datafiles with SAX (XML package): What data structure should I favor?

2012-10-26 Thread R. Michael Weylandt
I'd look into the data.table package. Cheers, RMW On Oct 26, 2012, at 6:00 PM, Frederic Fournier wrote: > Hello again, > > I have another question related to parsing a very large xml file with SAX: > what kind of data structure should I favor? Unlike using DOM function that > can return list

[R] Parsing very large xml datafiles with SAX (XML package): What data structure should I favor?

2012-10-26 Thread Frederic Fournier
Hello again, I have another question related to parsing a very large xml file with SAX: what kind of data structure should I favor? Unlike using DOM function that can return lists of relevant nodes and let me use various versions of 'apply', the SAX parsing returns me one thing at a time. I first

[R] Parsing very large xml datafiles with SAX: How to profile functions?

2012-10-26 Thread Frederic Fournier
Hello everyone, I'm trying to parse a very large XML file using SAX with the XML package (i.e., mainly the xmlEventParsing function). This function takes as an argument a list of other functions (handlers) that will be called to handle particular xml nodes. If when I use Rprof(), all the handler

Re: [R] parsing a structured object

2012-10-24 Thread BenM
That's simple and appears to work. Thanks for the prompt response. Ben

Re: [R] parsing a structured object

2012-10-23 Thread David Winsemius
On Oct 23, 2012, at 4:43 PM, BenM wrote: > Hi All, > Thanks in advance for your help. I take this to be a very basic > question, but I'm very new to R. > I'm trying to figure out how to parse an object. I have the following: > >> fileLocation > location > 1 foo.c

[R] parsing a structured object

2012-10-23 Thread BenM
Hi All, Thanks in advance for your help. I take this to be a very basic question, but I'm very new to R. I'm trying to figure out how to parse an object. I have the following: > fileLocation location 1 foo.csv > fileLocation$location

Re: [R] Parsing "back" to API strcuture

2012-09-17 Thread Eric Fail
Problem solved by Josh O'Brien on stackoverflow, http://stackoverflow.com/questions/12393004/parsing-back-to-messy-api-strcuture/12435389#12435389 some_magic <- function(df) { ## Replace NA with "", converting column types as needed df[] <- lapply(df, function(X) { if(any(i

Re: [R] Parsing "back" to API strcuture

2012-09-13 Thread Eric Fail
Dear Jim, Thank you for your response, I appreciate your effort! It is close, I must admit that. What I am looking for is an object that is identical to 'RAW.API,' or at least in the structure (I guess I do not need the ","`Content-Type`" = structure(c("text/html", "utf-8"), .Names = c("", "charse

Re: [R] Parsing "back" to API strcuture

2012-09-12 Thread jim holtman
This is close, but it does quote the header names, but does produce the same dataframe when read back in: > RAW.API <- > structure("id,event_arm,name,dob,pushed_text,pushed_calc,complete\n\"01\",\"event_1_arm_1\",\"John\",\"1979-05-01\",\"\",\"\",2\n\"01\",\"event_2_arm_1\",\"John\",\"2012-09-02\

[R] Parsing "back" to API strcuture

2012-09-12 Thread Eric Fail
Dear R experts, I'm reading data from an online database via API and it gets delivered in this messy comma separated structure, > RAW.API <- > structure("id,event_arm,name,dob,pushed_text,pushed_calc,complete\n\"01\",\"event_1_arm_1\",\"John\",\"1979-05-01\",\"\",\"\",2\n\"01\",\"event_2_arm_1\

Re: [R] Parsing large XML documents in R - how to optimize the speed?

2012-08-11 Thread Erdal Karaca
If this is an option for you: An xml database can handle (very) huge xml files and let you query nodes very efficiently. Then, you could query the xml databse from R (using REST) to do your statistics. There are some open source xquery/xml databases available. 2012/8/11 Frederic Fournier > Hell

Re: [R] Parsing large XML documents in R - how to optimize the speed?

2012-08-11 Thread Duncan Temple Lang
Hi Frederic You definitely want to be using xmlParse() (or equivalently xmlTreeParse( , useInternalNodes = TRUE)). This then allows use of getNodeSet() I would suggest you use Rprof() to find out where the bottlenecks arise, e.g. in the XML functions or in S4 code, or in your code th

Re: [R] Parsing large XML documents in R - how to optimize the speed?

2012-08-10 Thread Martin Morgan
On 08/10/2012 03:46 PM, Frederic Fournier wrote: Hello everyone, I would like to parse very large xml files from MS/MS experiments and create R objects from their content. (By very large, I mean going up to 5-10Gb, although I am using a 'small' 40M file to test my code.) I'm not 100% sure of i

[R] Parsing large XML documents in R - how to optimize the speed?

2012-08-10 Thread Frederic Fournier
Hello everyone, I would like to parse very large xml files from MS/MS experiments and create R objects from their content. (By very large, I mean going up to 5-10Gb, although I am using a 'small' 40M file to test my code.) My first attempt at parsing the 40M file, using the XML package, took more

Re: [R] parsing text files

2012-03-09 Thread jim holtman
Here is one way of doing it; it reads the file and create a 'long' version. ## input <- file("/temp/ClinicalReports.txt", 'r') outFile <- '/temp/output.txt' # tempfile() output <- file(outFile, 'w') writeLines("ID, Date, variable, value", output) ID <- NULL dataSw <- NULL repeat{ lin

Re: [R] parsing text files

2012-03-08 Thread ginger
Oops, I forgot to specify that each row, containing the records of one clinical report, must list the values of the 22 parameter measurements. For example, first row, first 5 columns: ID DATE GLICEMIA AZOTEMIA CREATININEMIASODIEMIA ...

[R] parsing text files

2012-03-08 Thread ginger
Hello, I have a .txt file with many clinical exams reports (two examples of which are attached to the message). I have to create a data frame with as many rows as the number of clinical exams reports in the text file and 24 columns: the first (to be labelled as "ID") with a number (representing an

Re: [R] Parsing variable-length delimited strings into a matrix

2011-10-04 Thread jim holtman
Will this do it for you: > x <- readLines(textConnection("A,B,C + B,B + A,AA,C + A,B,BB,BBB,B,B")) > closeAllConnections() > x.s <- strsplit(x, ',') > # determine max length > x.max <- max(sapply(x.s, length)) > # create character matrix > x.mat <- matrix( + sapply(x.s, function(a) c(a, rep(NA

Re: [R] Parsing variable-length delimited strings into a matrix

2011-10-03 Thread R. Michael Weylandt
Well how do you want it be made into a matrix if the rows are all different lengths? Methinks you are finding this tricky for a reason... Michael On Mon, Oct 3, 2011 at 11:40 AM, Benjamin Wright wrote: > > I'm struggling to find a way of parsing a vector of data in this sort of form: > > A,B,C >

[R] Parsing variable-length delimited strings into a matrix

2011-10-03 Thread Benjamin Wright
I'm struggling to find a way of parsing a vector of data in this sort of form: A,B,C B,B A,AA,C A,B,BB,BBB,B,B into a matrix (or data frame). The catch is that I don't know a priori how many entries there will be in each element, nor how many characters there will be. strsplit(vec,",") gets me
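A minimal sketch of the padding approach in base R, using the four example strings from the question: split each element, find the longest result, and pad the shorter ones with NA so that every row has the same width.

```r
vec <- c("A,B,C", "B,B", "A,AA,C", "A,B,BB,BBB,B,B")

# One character vector per input string
parts <- strsplit(vec, ",", fixed = TRUE)

# Width of the widest row determines the number of columns
width <- max(lengths(parts))

# Pad each piece with NA up to the common width; vapply() returns one
# column per input string, so transpose to get one row per string
mat <- t(vapply(parts,
                function(p) c(p, rep(NA_character_, width - length(p))),
                character(width)))

print(mat)
```

Wrapping the result in as.data.frame(mat, stringsAsFactors = FALSE) gives a data frame if that is preferred over a matrix.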

Re: [R] parsing error when using R CMD check

2011-09-16 Thread Duncan Murdoch
On 11-09-16 4:48 PM, Tarca, Adi wrote: Hi all, I am trying to run R CMD check on a package which passes R CMD INSTALL. The check stops because of a parsing problem in the example of a given function at this line: return(res[res$ID %in% list$targetGeneSets,]) The code is ok, since it runs if I

[R] parsing error when using R CMD check

2011-09-16 Thread Tarca, Adi
Hi all, I am trying to run R CMD check on a package which passes R CMD INSTALL. The check stops because of a parsing problem in the example of a given function at this line: return(res[res$ID %in% list$targetGeneSets,]) The code is ok, since it runs if I paste it in R. Is this a known parsing i

[R] Parsing Apache Combined Log Format in R with regex

2011-07-07 Thread Paul
Hi, I am new to R. Does anyone have a perl-style regular expression for use with the R str_match_all() function for parsing the Combined Log Format [1] that is commonly used by Apache and other web servers to log data. Using Google I could find regexs written for many different languages, but I w
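A sketch of one possible regular expression for a Combined Log Format line, using base R's regexec() and regmatches() with `perl = TRUE` in place of the stringr function mentioned in the question. The sample line, the field names and the pattern itself are illustrative assumptions; real logs may contain escaped quotes inside the request or user-agent fields, which this pattern does not handle.

```r
# Hypothetical sample line in Combined Log Format
line <- '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'

# host ident user [time] "request" status bytes "referer" "agent"
pattern <- '^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] "([^"]*)" (\\d{3}) (\\S+) "([^"]*)" "([^"]*)"$'

# regexec() returns the full match followed by the capture groups
m <- regmatches(line, regexec(pattern, line, perl = TRUE))[[1]]

fields <- setNames(m[-1], c("host", "ident", "user", "time", "request",
                            "status", "bytes", "referer", "agent"))
print(fields)
```

For a whole file, the same pattern can be applied to the vector returned by readLines(), and the per-line results combined with do.call(rbind, ...).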

Re: [R] Parsing question, partly comma separated partly underscore separated string

2011-03-08 Thread Gabor Grothendieck
On Mon, Mar 7, 2011 at 10:45 PM, Eric Fail wrote: > Thanks to Gabor Grothendieck and Dennis Murphy I can now solve first > part of my problem and already impress my colleagues with the > R-program below (I know it could be written in a smarter way, but I am > learning). It reads my partly comma se

Re: [R] Parsing question, partly comma separated partly underscore separated string

2011-03-07 Thread Eric Fail
Thanks to Gabor Grothendieck and Dennis Murphy I can now solve first part of my problem and already impress my colleagues with the R-program below (I know it could be written in a smarter way, but I am learning). It reads my partly comma separated partly underscore separated string and cleans it up

Re: [R] Parsing question, partly comma separated partly underscore separated string

2011-03-07 Thread Gabor Grothendieck
On Sun, Mar 6, 2011 at 10:13 PM, Eric Fail wrote: > Dear R-list, > > I have a partly comma separated partly underscore separated string that I am > trying to parse into R. > > Furthermore I have a bunch of them, and they are quite long. I have now spent > most of my Sunday trying to figure this

Re: [R] Parsing question, partly comma separated partly underscore separated string

2011-03-06 Thread Dennis Murphy
Hi: This should get you most of the way there; I'll let you figure out how to assign the BLOCK and RUN numbers. tx <- "Subject ID,ExperimentName,2010-04-23,32:34:23,Version 0.4, 640 by 960 pixels, On Device M, M, 3.2.4, zz_373_462_488_...@9z.svg,592,820,3.35,zz_032_288_436_...@9z.svg ,332,878,3.

Re: [R] Parsing question, partly comma separated partly underscore separated string

2011-03-06 Thread Don McKenzie
On 6-Mar-11, at 7:13 PM, Eric Fail wrote: Dear R-list, I have a partly comma separated partly underscore separated string that I am trying to parse into R. Furthermore I have a bunch of them, and they are quite long. I have now spent most of my Sunday trying to figure this out and thought

[R] Parsing question, partly comma separated partly underscore separated string

2011-03-06 Thread Eric Fail
Dear R-list, I have a partly comma separated partly underscore separated string that I am trying to parse into R. Furthermore I have a bunch of them, and they are quite long. I have now spent most of my Sunday trying to figure this out and thought I would try the list to see if someone here wo

Re: [R] Parsing JSON records to a dataframe

2011-01-07 Thread Martin Morgan
On 01/07/2011 12:05 AM, Dieter Menne wrote: > > > Jeroen Ooms wrote: >> >> What is the most efficient method of parsing a dataframe-like structure >> that has been json encoded in record-based format rather than vector >> based. For example a structure like this: >> >> [ {"name":"joe", "gender":"

Re: [R] Parsing JSON records to a dataframe

2011-01-07 Thread Dieter Menne
Jeroen Ooms wrote: > > What is the most efficient method of parsing a dataframe-like structure > that has been json encoded in record-based format rather than vector > based. For example a structure like this: > > [ {"name":"joe", "gender":"male", "age":41}, {"name":"anna", > "gender":"female",

[R] Parsing JSON records to a dataframe

2011-01-06 Thread Jeroen Ooms
What is the most efficient method of parsing a dataframe-like structure that has been json encoded in record-based format rather than vector based. For example a structure like this: [ {"name":"joe", "gender":"male", "age":41}, {"name":"anna", "gender":"female", "age":23} ] RJSONIO parses this a
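As a baseline for comparison, here is a simple (not necessarily the most efficient) base R sketch. It assumes the JSON has already been parsed into a list of records, which is the shape a record-based parse typically yields; the `records` object below is hand-built to mirror the two records in the question, and every record is assumed to have the same fields.

```r
# Hand-built stand-in for the parsed record-based JSON
records <- list(
  list(name = "joe",  gender = "male",   age = 41),
  list(name = "anna", gender = "female", age = 23)
)

# Turn each record into a one-row data frame, then stack the rows
df <- do.call(rbind, lapply(records, function(r) {
  as.data.frame(r, stringsAsFactors = FALSE)
}))

print(df)
```

Row-wise rbind is easy to read but can be slow for many records; building each column with vapply() over the records and then calling data.frame() once is usually faster.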

Re: [R] Parsing a Simple Chemical Formula

2010-12-27 Thread Mike Marchywka
> From: han...@depauw.edu > To: dwinsem...@comcast.net > Date: Sun, 26 Dec 2010 22:36:49 -0500 > CC: r-h...@stat.math.ethz.ch > Subject: Re: [R] Parsing a Simple Chemical Formula > > Hi David & others... > > I did find the function you recommended, plus, it'

Re: [R] Parsing a Simple Chemical Formula

2010-12-27 Thread Mike Marchywka
> Date: Sun, 26 Dec 2010 20:24:23 -0800 > From: spencer.gra...@structuremonitoring.com > To: han...@depauw.edu > CC: r-h...@stat.math.ethz.ch > Subject: Re: [R] Parsing a Simple Chemical Formula > > Mike Marchywka's pos

Re: [R] Parsing a Simple Chemical Formula

2010-12-26 Thread Spencer Graves
Mike Marchywka's post mentioned a CRAN package, "rpubchem", missed by my search for "chemical formula". A further search for "chemical" and "chemistry" still missed it. "compound" found it. Adding "compounds" and combining them with "union" produced a list of 564 links in 219 packages;

Re: [R] Parsing a Simple Chemical Formula

2010-12-26 Thread Gabor Grothendieck
On Sun, Dec 26, 2010 at 7:26 PM, Gabor Grothendieck wrote: > On Sun, Dec 26, 2010 at 6:29 PM, Bryan Hanson wrote: >> Hello R Folks... >> >> I've been looking around the 'net and I see many complex solutions in >> various languages to this question, but I have a pretty simple need (and I'm >> not

Re: [R] Parsing a Simple Chemical Formula

2010-12-26 Thread Bryan Hanson
Hi David & others... I did find the function you recommended, plus, it's even easier (but a little hidden in the doc): >element(form, "mass"). But, this uses the atomic masses from the periodic table, which are weighted averages of the isotopes of each element. What I'm doing actually inv

Re: [R] Parsing a Simple Chemical Formula

2010-12-26 Thread David Winsemius
On Dec 26, 2010, at 8:28 PM, Bryan Hanson wrote: Thanks Spencer, I'll definitely have a look at this package and its vignettes. I believe I have looked at it before, but didn't catch it on this particular search. Bryan Using the thermo list that the makeup function accesses to get its

Re: [R] Parsing a Simple Chemical Formula

2010-12-26 Thread Mike Marchywka
di...@gmail.com > Date: Sun, 26 Dec 2010 20:01:45 -0500 > CC: r-h...@stat.math.ethz.ch > Subject: Re: [R] Parsing a Simple Chemical Formula > > Well let me just say thanks and WOW! Four great ideas, each worthy of > study and I'll learn several things from each. Interestingly,

Re: [R] Parsing a Simple Chemical Formula

2010-12-26 Thread Bryan Hanson
Thanks Spencer, I'll definitely have a look at this package and its vignettes. I believe I have looked at it before, but didn't catch it on this particular search. Bryan On Dec 26, 2010, at 8:16 PM, Spencer Graves wrote: p.s. help(pac=CHNOSZ) reveals that this package has 3 vignettes. I

Re: [R] Parsing a Simple Chemical Formula

2010-12-26 Thread Spencer Graves
p.s. help(pac=CHNOSZ) reveals that this package has 3 vignettes. I have not looked at these vignettes, but most vignettes provide excellent introductions (though rarely with complete coverage) of important capabilities of the package. (The 'sos' package includes a vignette, which exposes mor

Re: [R] Parsing a Simple Chemical Formula

2010-12-26 Thread Spencer Graves
Have you considered the 'CHNOSZ' package? > makeup("C5H11BrO" ) count C 5 H 11 Br 1 O 1 I found this using the 'sos' package as follows: library(sos) cf <- ???'chemical formula' found 21 matches; retrieving 2 pages cf The print method for "cf" opened

Re: [R] Parsing a Simple Chemical Formula

2010-12-26 Thread Bryan Hanson
Well let me just say thanks and WOW! Four great ideas, each worthy of study and I'll learn several things from each. Interestingly, these solutions seem more general and more compact than the solutions I found on the 'net using python and perl. More evidence for the power of R! A big th

Re: [R] Parsing a Simple Chemical Formula

2010-12-26 Thread David Winsemius
On Dec 26, 2010, at 6:29 PM, Bryan Hanson wrote: Hello R Folks... I've been looking around the 'net and I see many complex solutions in various languages to this question, but I have a pretty simple need (and I'm not much good at regex). I want to use a chemical formula as a function ar
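
For readers who want a feel for the regex route asked about in this thread, here is a minimal base R sketch. `parse_formula` is a hypothetical helper written for illustration, not code from any of the replies, and it assumes a simple formula with no parentheses, charges, or isotope labels.

```r
# Hypothetical helper, not taken from any post in this thread.
parse_formula <- function(formula) {
  # A token is an element symbol (capital letter, optional lowercase letter)
  # followed by an optional integer count.
  tokens <- regmatches(formula, gregexpr("[A-Z][a-z]?[0-9]*", formula))[[1]]
  symbols <- sub("[0-9]*$", "", tokens)    # strip the trailing count
  counts <- sub("^[A-Za-z]+", "", tokens)  # strip the leading symbol
  counts[counts == ""] <- "1"              # a missing count means one atom
  # Sum repeated symbols, keeping elements in order of first appearance.
  tapply(as.integer(counts), factor(symbols, levels = unique(symbols)), sum)
}

parse_formula("C5H11BrO")
```

For "C5H11BrO" this gives the same counts as the `makeup()` output quoted elsewhere in the thread (C 5, H 11, Br 1, O 1). Because repeated symbols are summed, a condensed formula such as "CH3CH3" comes out as C 2, H 6.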
