[R] Custom XML Readers

2011-12-23 Thread pl.r...@gmail.com
I need to construct a custom XML reader; the files I'm working with are in a funky XML format: Paul H USA 2010-02-16 I want to read the file so it looks like: author = Paul H country = USA created_date = 2010-02-16 Does anyone know how to go about this problem, or know of good references I co
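A minimal sketch of the flattening asked for above, written in Python rather than R. The original markup did not survive in the archive, so the sample input assumes the usual Solr layout, in which each field is an element with a `name` attribute inside a `<doc>` element. That layout and the helper name `flatten_solr_doc` are assumptions for illustration, not taken from the post.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the tag names below are an assumption based on the
# common Solr response layout, not on the (stripped) original message.
SAMPLE = """
<doc>
  <str name="author">Paul H</str>
  <str name="country">USA</str>
  <date name="created_date">2010-02-16</date>
</doc>
"""

def flatten_solr_doc(xml_text):
    """Return a list of (name, value) pairs, one per named field in the doc."""
    root = ET.fromstring(xml_text.strip())
    pairs = []
    for field in root:
        name = field.get("name")
        if name is None:
            continue  # skip child elements that carry no name attribute
        # itertext() gathers all text in the element; split/join collapses
        # line breaks, so values that wrap over several lines stay intact.
        value = " ".join("".join(field.itertext()).split())
        pairs.append((name, value))
    return pairs

for name, value in flatten_solr_doc(SAMPLE):
    print(f"{name} = {value}")
```

Keying on the `name` attribute rather than on the tag name means fields of different Solr types (`str`, `date`, and so on) are handled the same way.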

Re: [R] Custom XML Readers

2011-12-28 Thread pl.r...@gmail.com
Thanks all for the helpful advice; however, I'm still running into an error while trying to run "readSolrDoc" provided by Duncan Temple Lang. The documents I'm trying to parse come from Solr and look very much like the example provided on http://www.omegahat.org/RSXML/ I'm not that familiar with th

Re: [R] Custom XML Readers

2011-12-29 Thread pl.r...@gmail.com
I found the source of the error: in my XML document there are some custom tags such as if I change those tags to the code works. One other source of error is when the text does not fit onto one line, such as: MORGANZA, La. (AP) -- Federal officials say they are going to open a Mississippi Ri

[R] Pointwise Mutual Information

2012-04-12 Thread pl.r...@gmail.com
Hi, I want to calculate pointwise mutual information between a "label" (2-gram) and the words (1-grams) in my corpus. Any suggestions as to how to go about it? l = label w = word C = reference collection I want to calculate the following: p(w,l | C) p(w | C) p(l | C)
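One possible way to estimate these quantities, sketched in Python rather than R. The post does not say how the probabilities should be estimated, so this sketch assumes document-level co-occurrence: p(w | C) is the fraction of documents in C that contain w, p(l | C) the fraction that contain the 2-gram l, and p(w,l | C) the fraction that contain both. PMI is then log(p(w,l | C) / (p(w | C) p(l | C))). The function names and the toy corpus are hypothetical.

```python
import math

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def pmi(word, label, docs):
    """PMI between a 1-gram `word` and a 2-gram `label` (a tuple of two
    tokens), estimated from document-level co-occurrence in `docs`
    (a list of token lists). Assumption: one document is one event."""
    n_docs = len(docs)
    n_w = n_l = n_wl = 0
    for tokens in docs:
        has_w = word in tokens
        has_l = label in ngrams(tokens, 2)
        n_w += has_w
        n_l += has_l
        n_wl += has_w and has_l
    if n_wl == 0:
        # The pair never co-occurs (or the corpus is empty): PMI is -infinity.
        return float("-inf")
    p_w, p_l, p_wl = n_w / n_docs, n_l / n_docs, n_wl / n_docs
    return math.log(p_wl / (p_w * p_l))

# Toy reference collection C (hypothetical data).
docs = [
    ["river", "flood", "gate", "opens"],
    ["flood", "gate", "closed"],
    ["river", "level", "rises"],
    ["market", "opens"],
]
print(pmi("river", ("flood", "gate"), docs))
```

In the toy corpus, "river" and the label ("flood", "gate") each occur in half of the documents and together in a quarter, so they look independent under this estimate. Raw PMI is unstable for rare terms, so some form of count smoothing may be worth adding on a real corpus.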

[R] tm package, custom reader

2012-01-13 Thread pl.r...@gmail.com
I need help with creating a custom XML reader for use with the tm package. The objective is to create a corpus for analysis. The files that I'm working with come from Solr and are in a funky XML format; nevertheless, I'm able to parse the XML files using the solrDocs.R function provided by Duncan Temple La