Thanks, the pointer to the tokenizer helped.
Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine
15032 Hunter Court, Westfield, IN 46074
(317) 490-5129 Work, & Mobile & VoiceMail
"The re
On Thu, Aug 13, 2009 at 03:36:22PM -0400, Mark Kimpel wrote:
> I am using the package "tm" for text-mining of abstracts and would like to use
> it to find instances of gene names that may contain white space. For instance
> "gene regulatory protein 1". The default behavior of tm is to parse this in
I am using the package "tm" for text-mining of abstracts and would like to
use it to find instances of gene names that may contain white space. For
instance "gene regulatory protein 1". The default behavior of tm is to parse
this into 4 separate words, but I would like to use the class constructor
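
For anyone following the thread, here is a rough sketch of the general idea, not tm code and not necessarily what the tokenizer pointer referred to: rather than splitting on white space first, search the raw text for each full gene name. It is written in plain Python to stay package-neutral; the function name `find_phrases`, the sample abstract, and the one-entry name list are all made up for illustration.

```python
import re

def find_phrases(text, phrases):
    """Count case-insensitive occurrences of each multi-word phrase in text."""
    counts = {}
    for phrase in phrases:
        # Escape each word, then allow any run of white space (including a
        # line break) between consecutive words of the phrase.
        words = [re.escape(w) for w in phrase.split()]
        pattern = re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)
        counts[phrase] = len(pattern.findall(text))
    return counts

# Hypothetical abstract; the second mention is wrapped across a line break.
abstract = ("Gene regulatory protein 1 binds the promoter. Loss of gene\n"
            "regulatory protein 1 alters expression.")

# White-space tokenization breaks the gene name into four separate tokens.
naive_tokens = abstract.split()

# Phrase-aware search keeps the name intact.
counts = find_phrases(abstract, ["gene regulatory protein 1"])
```

The same approach should carry over to a custom tokenizer that emits known multi-word names as single tokens before falling back to white-space splitting, though how to plug that into tm depends on the version in use.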