On Feb 11, 2012, at 7:20 AM, Jim foo.bar wrote:

> Hi everyone,
> 
> I was just wondering whether anyone has used the clojure-opennlp
> wrapper for multi-word named entity recognition (NER). I am using it
> to train a drug finder from my private corpus, and even though I get
> correct behavior from the Apache OpenNLP command-line tool, when I
> use the API only single-word entities are recognised! I've opened a
> thread on the official mailing list because initially I thought there
> was a genuine problem with OpenNLP, but since the command-line tool
> does exactly what I want, I'm starting to think the fault is not in
> OpenNLP but in either my code or the Clojure wrapper...
> 
> I've followed both the official tutorials and the wrapper
> documentation, so I am doing everything as instructed...
> I know the name finder expects tokenized sentences, and I am indeed
> passing tokenized sentences like this:
> 
> (defn find-names-model [text]
>   (map #(drug-find (tokenize %))
>        (get-sentences text)))
> 
> It is very strange because I am getting back "Folic" but not "Folic
> acid", even though I am using the exact same model I used with the
> command-line tool...
> 
> Any help will be greatly appreciated...
> Regards,
> Jim

I have inquired on the OpenNLP mailing list about a way to train a tokenizer
not to automatically split on spaces. If I hear back about a way to do it, I
will add it to clojure-opennlp.

- Lee
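
For illustration only: OpenNLP's name finder reports each entity as a span
of token indices (start inclusive, end exclusive), so a multi-word name is
recovered by joining every token the span covers. The sketch below uses
plain Clojure vectors in place of real OpenNLP Span objects, and
spans->names is a hypothetical helper, not a clojure-opennlp function. It
only shows how keeping the first token of a span would yield "Folic" instead
of "Folic acid"; it does not establish where the problem in this thread
actually lies.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper (not part of clojure-opennlp): joins the tokens
;; covered by each [start end) span into a single space-separated string.
(defn spans->names
  [tokens spans]
  (let [tokens (vec tokens)]
    (map (fn [[start end]]
           (str/join " " (subvec tokens start end)))
         spans)))

;; Assumed example data: one tokenized sentence and one two-token span.
(def tokens ["Folic" "acid" "was" "given" "daily" "."])
(def spans [[0 2]])

;; Joining the whole span keeps both words.
(def full-names (spans->names tokens spans))

;; Taking only the token at the span's start index drops the second word.
(def first-token-only (map (fn [[start _]] (nth tokens start)) spans))
```

Comparing full-names with first-token-only on the same input is a quick way
to check whether a missing second word comes from the span handling or from
the model's output.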
