subject:"Re\: \[R\] TM reader with text"

Re: [R] TM reader with text

2012-03-04 Thread Mickael R problem

"Try this before running removePuncutation(): corpus <- tm_map(corpus, function(x) gsub("[\'\U2019]«»", " ", x))" It will replace quotation marks with a space, and that's enough to separate them from the rest of the word. I try to use your solution. It's work only for characters, not for a Corpus,

Re: [R] TM reader with text

2012-03-04 Thread Milan Bouchet-Valat

On Saturday, 3 March 2012 at 16:56 -0800, Mickael R problem wrote:
> Hello everybody,
> I am not giving up the fight, but it's hard. I have found a solution for the
> ligatures with a better converter, which translates PDF to plain
> text more precisely. But a new problem has occurred. In French particularly,

Re: [R] TM reader with text

2012-03-03 Thread Mickael R problem

Hello everybody, I am not giving up the fight, but it's hard. I have found a solution for the ligatures with a better converter, which translates PDF to plain text more precisely. But a new problem has occurred. In French particularly, but it should be the case in English too, I have a big problem ' " bracke

Re: [R] TM reader with text

2012-03-01 Thread Milan Bouchet-Valat

On Thursday, 1 March 2012 at 07:07 -0800, Mickael R problem wrote:
> Hi Richard,
> clearly there is a problem with Latin ligatures, because the result
> of my call to findFreqTerms gives me words such as
> "n" "nancement"
> "nancier" "nancière" "nancières"

Re: [R] TM reader with text

2012-03-01 Thread Mickael R problem

Hi Richard, clearly there is a problem with Latin ligatures, because the result of my call to findFreqTerms gives me words such as "n" "nancement" "nancier" "nancière" "nancières" "nanciers" "xe", where U+FB01 is a code for a Latin ligature. The problem
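
Terms affected by this problem can be flagged before deciding how to repair them. The following is only a sketch in base R: the function name has_ligature and the sample terms are made up for illustration. It relies on the fact that Unicode places the Latin ligatures at code points U+FB00 to U+FB06 (U+FB01 is "fi").

```r
# Hypothetical helper: TRUE for each term that contains a Latin ligature
# code point (U+FB00..U+FB06: ff, fi, fl, ffi, ffl, long s-t, st).
has_ligature <- function(terms) {
  vapply(terms, function(term) {
    any(utf8ToInt(enc2utf8(term)) %in% 0xFB00:0xFB06)
  }, logical(1), USE.NAMES = FALSE)
}

# Illustrative terms only; the first two start with the U+FB01 ligature.
terms <- c("\ufb01nancement", "\ufb01nancier", "banque")
terms[has_ligature(terms)]
```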

Re: [R] TM reader with text

2012-02-29 Thread Mickael R problem

My computer runs Windows Vista 64-bit SP2. As for the question about encoding, I don't understand it, sorry.
--
View this message in context: http://r.789695.n4.nabble.com/TM-reader-with-text-tp4433394p4433526.html
Sent from the R help mailing list archive at Nabble.com.

Re: [R] TM reader with text

2012-02-29 Thread Richard M. Heiberger

Most, maybe all, of the example words you posted include ligatures. With "financier", for example, the leading "fi" is rendered in PDF and in most typesetting situations as a ligature, with a single complex character representing the "fi" combination. ﬁ ﬂ I pasted the "fi" and "fl" ligature
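
If the ligature characters survive the PDF-to-text conversion, one possible repair is to expand each of them back into plain letters before building the term-document matrix. This is a sketch in base R, not the method used in the thread; the function name expand_ligatures is hypothetical, and the mapping covers only the five f-ligatures in the Unicode range U+FB00 to U+FB04.

```r
# Hypothetical helper: expand the Latin f-ligatures into ordinary letters.
expand_ligatures <- function(x) {
  map <- c(
    "\ufb00" = "ff",
    "\ufb01" = "fi",
    "\ufb02" = "fl",
    "\ufb03" = "ffi",
    "\ufb04" = "ffl"
  )
  for (lig in names(map)) {
    # fixed = TRUE treats the ligature as a literal string, not a regex.
    x <- gsub(lig, map[[lig]], x, fixed = TRUE)
  }
  x
}

expand_ligatures(c("\ufb01nancier", "re\ufb02et"))
```

This only helps when the text extractor preserved the ligature code points; if it dropped them entirely, the missing letters cannot be recovered this way and a different converter is needed.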

Re: [R] TM reader with text

2012-02-29 Thread David Winsemius

On Feb 29, 2012, at 6:00 PM, Mickael R problem wrote: Hello everybody, I am working, or trying to work, with tm, but I have a problem with some special words in French. I think this is due to the way the PDF was converted to text, but I'm not entirely sure. Here is an example: findFreqTerms(tdm1, 30)

8 matches
