[R] Converting corpus into dataframe in r tm package

2014-04-03 Thread vikrant
] mine [[2]] [1] miner I need only results 'mine' , 'miner' in a vector. Please help -- View this message in context: http://r.789695.n4.nabble.com/Converting-corpus-into-dataframe-in-r-tm-package-tp4688081.html Sent from the R help mailing list archive at Nabble.com. _

Re: [R] tm package: problem of TermDocumentMatrix and minWordLength

2012-05-16 Thread Baoqiang Cao
try this: dtm <- DocumentTermMatrix(examplecorpus, control = list(wordLengths=c(1,100))) On Wed, May 16, 2012 at 6:22 AM, C.H. wrote: > Dear All, > > The following code illustrate the problem. > > [R code] > require(tm) > exampledoc <- c("R is good", "R is really good") > examplecorpus <- Corp
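A minimal sketch of the suggestion above, assuming a recent tm release in which the `wordLengths` control option (minimum and maximum term length, documented default `c(3, Inf)`) takes the place of the older `minWordLength`. `VCorpus` is used here instead of the original `Corpus(..., encoding = "UTF-8")` call; that substitution is an assumption about the installed version, not part of the thread.

```r
library(tm)

# Two tiny documents containing one- and two-letter words
exampledoc <- c("R is good", "R is really good")
examplecorpus <- VCorpus(VectorSource(exampledoc))

# With default settings, terms shorter than three characters are dropped
dtm_default <- DocumentTermMatrix(examplecorpus)

# Lowering the minimum length to 1 keeps short terms such as "is"
dtm <- DocumentTermMatrix(examplecorpus,
                          control = list(wordLengths = c(1, Inf)))

as.matrix(dtm)
```

Comparing `Terms(dtm_default)` with `Terms(dtm)` should show which short terms the default filter removes.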

[R] tm package: problem of TermDocumentMatrix and minWordLength

2012-05-16 Thread C.H.
Dear All, The following code illustrate the problem. [R code] require(tm) exampledoc <- c("R is good", "R is really good") examplecorpus <- Corpus(VectorSource(exampledoc), encoding = "UTF-8") dtm <- DocumentTermMatrix(examplecorpus, control = list(minWordLength = 1)) as.matrix(dtm) [/R code] Th

Re: [R] tm package: handling contractions

2012-01-27 Thread Milan Bouchet-Valat
Le vendredi 27 janvier 2012 à 09:50 -0500, Michael Friendly a écrit : > I tried making a wordcloud of Obama's State of the Union address using > the tm package to process the text > > sotu <- scan(file="c:/R/data/sotu2012.txt", what="character") > sotu <- tolower(sotu) > corp <-Corpus(VectorSourc

Re: [R] tm package: handling contractions

2012-01-27 Thread Tyler Rinker
<- function(x) gsub("’", "'", x) corp <- tm_map(corp, exchanger) #=== Cheers, Tyler > Date: Fri, 27 Jan 2012 09:50:51 -0500 > From: frien...@yorku.ca > To: r-help@r-project.org > Subject: [R] tm package: h

[R] tm package: handling contractions

2012-01-27 Thread Michael Friendly
I tried making a wordcloud of Obama's State of the Union address using the tm package to process the text sotu <- scan(file="c:/R/data/sotu2012.txt", what="character") sotu <- tolower(sotu) corp <-Corpus(VectorSource(paste(sotu, collapse=" "))) corp <- tm_map(corp, removePunctuation) corp <- tm

Re: [R] tm package, custom reader

2012-01-14 Thread Andy Adamiec
On Sat, Jan 14, 2012 at 12:41 PM, Milan Bouchet-Valat wrote: > Le samedi 14 janvier 2012 à 12:24 -0600, Andy Adamiec a écrit : > > Hi Milan, > > > > > > The xml solr files are not in a typical format, here is an example > > http://www.omegahat.org/RSXML/solr.xml > > I'm not sure how to parse the d

Re: [R] tm package, custom reader

2012-01-14 Thread Milan Bouchet-Valat
On Friday, 13 January 2012 at 09:00 -0800, pl.r...@gmail.com wrote: > I need help with creating a custom XML reader for use with the tm package. The > objective is to create a corpus for analysis. Files that I'm working with > come from solr and are in a funky XML format; nevertheless I'm able t

[R] tm package, custom reader

2012-01-13 Thread pl.r...@gmail.com
I need help with creating a custom XML reader for use with the tm package. The objective is to create a corpus for analysis. Files that I'm working with come from solr and are in a funky XML format; nevertheless I'm able to parse the XML files using solrDocs.R function provided by Duncan Temple La

Re: [R] tm package

2011-05-29 Thread mpavlic
it should be findFreqTerms instead of findfreqterms. m -- View this message in context: http://r.789695.n4.nabble.com/tm-package-tp3558064p3558784.html

Re: [R] tm package

2011-05-29 Thread mpavlic
it should be findFreqTerms instead of findfreqTerms. m -- View this message in context: http://r.789695.n4.nabble.com/tm-package-tp3558064p3558783.html

Re: [R] tm package

2011-05-28 Thread David Winsemius
On May 28, 2011, at 2:21 PM, lloyd barcza wrote: Hi, I am using the "tm" package. When I try to use findfreqterms I get an error message: findfreqterms(dtm,2,5) Error: could not find function "findfreqterms" You are spelling it incorrectly. Obvious thing such as calling the "tm" libra

[R] tm package

2011-05-28 Thread lloyd barcza
Hi, I am using the "tm" package. When I try to use findfreqterms I get an error message: findfreqterms(dtm,2,5) Error: could not find function "findfreqterms" Obvious thing such as calling the "tm" library and a creating document term matrix have been covered. I cannot find any dependencies th

[R] tm package, makeChunks

2011-05-22 Thread Matevž Pavlič
Hi all, I have a question about the makeChunks function from the tm package. I have a text constructed with a SQL query, consisting of several rows in a table. The text should be around 1000 lines but is now one long line of text. My question is: Does this matter in makeChunks()

Re: [R] TM Package - installation

2011-02-03 Thread bpn
Hi Sujatha, if you still haven't resolved this issue, here is a tip: update your Java to the latest version. I faced a similar problem and resolved it with a new Java version. Hope this helps. -- View this message in context: http://r.789695.n4.nabble.com/TM-Package-installation-tp2

Re: [R] TM Package - Corpus function - Memory Allocation Problems

2010-08-17 Thread David Winsemius
On Aug 17, 2010, at 3:45 PM, Guelman, Leo wrote: I'm using R 2.11.1 on Win XP (32-bit) with 3 GB of RAM. My data has (only) 16.0 MB. Probably more than that. Each numeric is 8 bytes even before overhead, so a csv file that was all single digit integers and commas would more than double i

Re: [R] TM Package - Corpus function - Memory Allocation Problems

2010-08-17 Thread Guelman, Leo
I'm using R 2.11.1 on Win XP (32-bit) with 3 GB of RAM. My data has (only) 16.0 MB. I want to create a VCorpus object using the Corpus function in the tm package but I'm running into Memory allocation issues: "Error: cannot allocate vector of size 372 Kb". My data is stored in a csv file which

[R] TM Package - installation

2010-08-09 Thread Sujatha Upadhyaya
Hi All, I have been trying to do some text analytics in R using the tm package. I have installed and loaded the package, along with its dependencies (slam, RWeka, rJava). When I try to run a tm_map command, it gives me "Error in .jnew(name) : java.lang.NoClassDefFoundError: weka/core/stemmers/SnowballSte

Re: [R] tm package- remove stowords failling

2010-04-04 Thread Welma Pereira
> Hi, > > I just noticed by inspecting the term matrix that not all stopwords are > removed. Does someone know how to fix that? > > library(tm) > data("crude") > d<-tm_map(crude, removeWords, stopwords(language='english')) > dt<-DocumentTermMatrix(d,control=list(minWordLength=3, minDocFreq=2))

[R] tm package- remove stowords failling

2010-03-31 Thread Welma Pereira
Hi, I just noticed by inspecting the term matrix that not all stopwords are removed. Does someone know how to fix that? library(tm) data("crude") d<-tm_map(crude, removeWords, stopwords(language='english')) dt<-DocumentTermMatrix(d,control=list(minWordLength=3, minDocFreq=2)) inspect( dt) I

[R] tm package

2010-02-15 Thread David Neu
Hi, I'm using version 0.5.1 of tm package with R 2.10.1. It looks to me as if after the following reuters21578 <- Corpus(DirSource(corpusDir), readerControl = list(reader = readReut21578XMLasPlain)) reuters21578 <- tm_map(reuters21578, stripWhitespace) reuters21578 <- tm_map(reuters

Re: [R] tm package - how to transform a TermDocMatrix to a data.frame

2007-12-12 Thread Uwe Ligges
Filipe Almeida wrote: > Hi to all, > > I'm using the tm package in a Windows machine. > > This is my sample: >> tts1 <- TermDocMatrix(tts, weighting="tf-idf") >> typeof(tts1) > [1] "S4" > > How can i transform or put the tts1 TermDocMatrix in a simple data.frame or > simple matrix. I need to c
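For readers on a current tm release, the conversion asked about here can be sketched as follows. This is only an illustration under stated assumptions: it uses `TermDocumentMatrix` (the present-day class, not the 2007 `TermDocMatrix` from the thread), default term-frequency weighting, and a made-up two-document corpus named `corp`.

```r
library(tm)

# Hypothetical toy corpus standing in for the poster's 'tts' object
docs <- c("R is good", "R is really good")
corp <- VCorpus(VectorSource(docs))

# Sparse term-document matrix (terms in rows, documents in columns)
tdm <- TermDocumentMatrix(corp)

# Dense base-R matrix, then a plain data.frame
m  <- as.matrix(tdm)
df <- as.data.frame(m)

str(df)
```

Note that `as.matrix()` produces a dense matrix, so for a large corpus the result may need far more memory than the sparse object.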

[R] tm package - how to transform a TermDocMatrix to a data.frame

2007-12-11 Thread Filipe Almeida
Hi to all, I'm using the tm package in a Windows machine. This is my sample: >tts1 <- TermDocMatrix(tts, weighting="tf-idf") >typeof(tts1) [1] "S4" How can i transform or put the tts1 TermDocMatrix in a simple data.frame or simple matrix. I need to compute some functions. For example, I need to