Re: [R] tm: custom reader for readPlain

2013-01-08 Thread Simon Kiss
Hmm... Thanks a lot! That seems like really useful stuff. It might be a bit over my head, but I'll look into it. The articles are all contained in one text file, but they are clearly delimited (either by a series of ) or the regular expression ^Document.[0-9]. Simon On 2013-01-08, at 4
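
Since the articles sit in one file and each one starts at a line matching ^Document.[0-9], a rough base-R sketch of the splitting step might look like the following. The sample lines are invented for illustration (the real file layout is not shown in the thread), and the names `lines`, `is_start`, `article_id` and `docs` are hypothetical; in practice `lines` would come from `readLines()` on the actual file.

```r
# Hypothetical sample standing in for the single text file of articles.
# With a real file this would be something like: lines <- readLines(path)
lines <- c(
  "Document 1 of 2",
  "Title: First article",
  "Body of the first article.",
  "Document 2 of 2",
  "Title: Second article",
  "Body of the second article."
)

# TRUE on every line that starts a new article (the regex from the message).
is_start <- grepl("^Document.[0-9]", lines)

# A running count of start lines assigns each line to an article.
article_id <- cumsum(is_start)

# Drop anything before the first delimiter, then split by article.
keep <- article_id > 0
articles <- split(lines[keep], article_id[keep])

# Collapse each article back into a single string.
docs <- vapply(articles, paste, character(1), collapse = "\n")
```

The resulting character vector has one element per article, so it could then be handed to tm, for example via `VectorSource(docs)`, with the meta-data extraction handled separately.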

Re: [R] tm: custom reader for readPlain

2013-01-08 Thread Milan Bouchet-Valat
On Tuesday, 8 January 2013 at 15:56 -0500, Simon Kiss wrote: > Hello: > I have a series of newspaper articles from a Canadian newspaper > database (Canadian Newsstand) that look just like below. > > I've read through this vignette > (http://cran.r-project.org/web/packages/tm/vignettes/extensions.p

[R] tm: custom reader for readPlain

2013-01-08 Thread Simon Kiss
Hello: I have a series of newspaper articles from a Canadian newspaper database (Canadian Newsstand) that look just like below. I've read through this vignette (http://cran.r-project.org/web/packages/tm/vignettes/extensions.pdf) about creating a custom reader to extract meta-data, but I can't u