From my past projects, our Lucene classification corpus looked like this:
0|document text...|categoryA
1|document text...|categoryB
2|document text...|categoryA
3|document text...|categoryA
...
800|document text...|categoryC
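A minimal sketch of reading a corpus in the layout shown above, assuming one record per line with the id before the first `|` and the category after the last `|`. The `Record` type and `parse_corpus` name are illustrative only and do not come from the original project.

```python
from typing import Iterable, List, NamedTuple


class Record(NamedTuple):
    doc_id: int
    text: str
    category: str


def parse_corpus(lines: Iterable[str]) -> List[Record]:
    """Parse 'id|text|category' lines, skipping blanks and malformed lines."""
    records = []
    for line in lines:
        line = line.strip()
        if line.count("|") < 2:
            continue  # blank line, or a placeholder such as '...'
        # Split on the first and last '|' so a stray '|' in the text survives.
        doc_id, rest = line.split("|", 1)
        text, category = rest.rsplit("|", 1)
        records.append(Record(int(doc_id), text, category))
    return records


sample = ["0|some text|categoryA", "1|other text|categoryB"]
records = parse_corpus(sample)
```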
With the faceting capabilities of Solr it is now possible to design mor
On Tue, Jan 27, 2009 at 2:21 PM, Grant Ingersoll wrote:
> One of the things I am interested in is the marriage of Solr and Mahout
> (which has some Genetic Algorithms support) and other ML (Weka, etc.) tools.
[snip]
I love it, good to know you are thinking big here. Here's another big thought:
nce, a
reasonable thing to do with the output from the classification is, of
course, to facet on them.
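As a rough illustration of faceting on classifier output, assume each document already carries one predicted label. In Solr itself this would be a facet request on the label field; the code below only mimics the resulting tally in memory, and the sample labels are made up for the example.

```python
from collections import Counter


def facet_counts(predicted_labels):
    """Return (label, count) pairs, most frequent label first."""
    return Counter(predicted_labels).most_common()


labels = ["categoryA", "categoryB", "categoryA", "categoryA", "categoryC"]
facets = facet_counts(labels)
```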
Neal, what did you have in mind for a JIRA issue? I'd love to see a
patch.
On Jan 26, 2009, at 12:29 PM, Neal Richter wrote:
Hey all,
I'm in the process of implement
On 27 Jan 2009, at 17:23, Neal Richter wrote:
Is it really necessary to use Solr for it? Things go much faster with
the Lucene low-level API, and much faster still if you load the
classification corpus into RAM.
Good points. At the moment I'd rather have a daemon with a service API.. as
On Tue, Jan 27, 2009 at 1:36 AM, Hannes Carl Meyer wrote:
> Yeah, I know it; the challenge with this method is calculating the score
> and parameterizing the thresholds.
I'm not as worried about the score itself as about the score thresholds for predicting in/out.
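A sketch of the in/out decision under discussion, assuming the classifier yields (label, score) pairs for a document and that a single cutoff has been tuned elsewhere, for example on held-out data. The `accept` function and its `min_score` parameter are hypothetical; the thread does not say how the thresholds are chosen.

```python
def accept(scored_labels, min_score):
    """Keep labels whose score clears min_score, best first.

    An empty result means the document is predicted 'out' of every label.
    """
    ranked = sorted(scored_labels, key=lambda pair: pair[1], reverse=True)
    return [label for label, score in ranked if score >= min_score]


predicted = accept([("a", 0.9), ("b", 0.2), ("c", 0.6)], min_score=0.5)
```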
> Is it really necessary to use Solr for
>>Instead of indexing documents about 'sports' and searching for hits
>>based upon 'basketball', 'football' etc.. I simply want to index the
>>taxonomy and classify documents into it. This is an ancient
>>AI/Data-Mining discipline.. but the standard methods of 'indexing' the
>>taxonomy are/were
Thanks for the link, Shalin... I played with that a while back. It may
have some indirect uses.
On Mon, Jan 26, 2009 at 10:46 AM, Hannes Carl Meyer wrote:
> I didn't understand: is the corpus of documents you want to use for
> classification fixed?
Assume the 'documents' are not stored in th
> I'm in the process of implementing a system to do 'text
> classification' with Solr. The basic idea is to take an
> ontology/taxonomy like dmoz of {label: "X", tags: "a,b,c,d,e"}, index
> it and then classify documents into the taxonomy by pushing parsed
On Mon, Jan 26, 2009 at 10:59 PM, Neal Richter wrote:
> Hey all,
>
> I'm in the process of implementing a system to do 'text
> classification' with Solr. The basic idea is to take an
> ontology/taxonomy like dmoz of {label: "X", tags: "a,b,c,
Hey all,
I'm in the process of implementing a system to do 'text
classification' with Solr. The basic idea is to take an
ontology/taxonomy like dmoz of {label: "X", tags: "a,b,c,d,e"}, index
it and then classify documents into the taxonomy by pushing parse
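The idea in this message can be sketched without Solr at all. The toy below assumes taxonomy entries shaped like the {label, tags} example above, builds an inverted index from tag to labels, and ranks labels by how many document tokens hit their tags. The raw overlap scoring, the whitespace tokenizer, and the sample entries are all assumptions for illustration; a real Solr setup would rely on its own analysis chain and relevance scoring.

```python
from collections import Counter, defaultdict


def build_index(taxonomy):
    """Map each lower-cased tag to the set of labels that list it."""
    index = defaultdict(set)
    for entry in taxonomy:
        for tag in entry["tags"].split(","):
            index[tag.strip().lower()].add(entry["label"])
    return index


def classify(index, text):
    """Rank labels by the number of document tokens matching their tags."""
    scores = Counter()
    for token in text.lower().split():
        for label in index.get(token, ()):
            scores[label] += 1
    return scores.most_common()


# Made-up taxonomy entries in the {label, tags} shape.
taxonomy = [
    {"label": "Sports", "tags": "basketball,football,tennis"},
    {"label": "Cooking", "tags": "recipe,oven,flour"},
]
index = build_index(taxonomy)
ranking = classify(index, "Football and basketball scores from last night")
```

Here `ranking` holds (label, hit count) pairs, which is the kind of score a threshold would then be applied to.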