Re: Text classification with Solr

Shalin Shekhar Mangar Mon, 26 Jan 2009 09:45:16 -0800

On Mon, Jan 26, 2009 at 10:59 PM, Neal Richter <[email protected]> wrote:


> Hey all,
>
>  I'm in the process of implementing a system to do 'text
> classification' with Solr.  The basic idea is to take an
> ontology/taxonomy like dmoz of {label: "X", tags: "a,b,c,d,e"}, index
> it and then classify documents into the taxonomy by pushing each parsed
> document into the Solr search API.  Why?  Lucene/Solr's ability to do
> weighted term boosting at both search and index time has lots of
> obvious uses here.
>
>  Has anyone worked on this or a similar project yet?  I've seen some
> talk on the list about this area but it's pretty thin... December
> thread "Taxonomy Support on Solr".  I'm assuming Grant Ingersoll is
> looking at similar things with his 'taming text' project.
>
> I store the 'documents' in another repository and they are far too
> dynamic (write intensive) for direct indexing in Solr... so the
> previously suggested procedure of 1) store document 2) execute
> more-like-this and 3) delete document would be too slow.
>
> If people are interested I could start a JIRA issue on this (I do not
> see anything there at the moment).
>
> Thanks - Neal Richter
> http://aicoder.blogspot.com
>

Grant did some work at https://issues.apache.org/jira/browse/SOLR-769

Take a look and see if that helps.
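
For reference, the classify-by-search idea in the quoted message could be sketched roughly as below. This is only an illustration under assumptions, not what SOLR-769 implements: the field names `tags` and `label` come from the taxonomy example above, while the function names, the base URL, and the choice of raw term frequency as the query-time boost are all hypothetical. It builds the query and URL only and does not contact a server.

```python
import re
from collections import Counter
from urllib.parse import urlencode


def build_classification_query(text, field="tags", top_n=10):
    """Turn a parsed document into a boosted Solr query string.

    Assumption: each term's raw frequency in the document is used
    as its query-time boost (field:term^boost).
    """
    terms = re.findall(r"[a-z0-9]+", text.lower())
    # Skip very short tokens; a real setup would rely on the field's analyzer.
    counts = Counter(t for t in terms if len(t) > 2)
    clauses = [f"{field}:{term}^{count}"
               for term, count in counts.most_common(top_n)]
    return " OR ".join(clauses)


def build_select_url(base_url, text, rows=5):
    """Build a /select URL that returns the best-matching taxonomy labels."""
    params = {
        "q": build_classification_query(text),
        "fl": "label,score",  # return only the label and its relevance score
        "rows": rows,
    }
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical Solr location; adjust to the actual install.
    print(build_select_url("http://localhost:8983/solr",
                           "Solr search with Lucene, Solr boosting"))
```

The top-scoring `label` values in the response would then be the candidate categories for the document, so nothing has to be stored in or deleted from the index per classification.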

-- 
Regards,
Shalin Shekhar Mangar.
