Any news regarding this?
I'm investigating offline clustering in Solr as well (full-index
clustering).
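
For the "clusters on the fly" requirement quoted below, one option is single-pass (leader) clustering: each new document joins the most similar existing cluster, or starts a new one if nothing is similar enough. Here is a minimal sketch in Python. It is a generic technique, not an API of Solr, Carrot2 or Mahout; the class name, the whitespace tokenizer and the 0.5 threshold are all assumptions for illustration.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (assumption: plain whitespace tokenization)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class OnlineClusterer:
    """Hypothetical helper: assigns a cluster id to each document as it arrives."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold  # assumed similarity cut-off
        self.centroids = {}         # cluster id -> summed term counts
        self.next_id = 0

    def assign(self, text):
        vec = vectorize(text)
        best_id, best_sim = None, 0.0
        # Linear scan over existing clusters to find the closest centroid.
        for cid, centroid in self.centroids.items():
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best_id, best_sim = cid, sim
        # Nothing similar enough: open a new cluster.
        if best_id is None or best_sim < self.threshold:
            best_id = self.next_id
            self.next_id += 1
            self.centroids[best_id] = Counter()
        self.centroids[best_id].update(vec)
        return best_id


clusterer = OnlineClusterer(threshold=0.5)
first = clusterer.assign("solr clustering component released")
second = clusterer.assign("solr clustering component released today")
other = clusterer.assign("football match ends in draw")
```

The returned id is what would go into the cluster id field in Solr and into the DB. To honour the time window, clusters that received no documents within the period would have to be dropped; that part is not shown, and the linear scan would probably need an index-backed similarity lookup at several million documents.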

Cheers


2012-09-17 20:16 GMT+01:00 Denis Kuzmenok <forward...@ukr.net>:

>
>
>
> Sorry for the late response. To be precise, here is what I want:
>
> * I receive documents all the time. Let's assume they are news items (it's
> a rather similar thing).
>
> * Every time I get a new batch of "news" I should add them to the Solr index
> and get cluster information for each document, then store this information
> in the DB (so I know each document's cluster).
>
> * I can't wait for a cluster definition service/program to launch from
> time to time; it should define clusters on the fly.
>
> * I want to be able to get clusters only for some period of time (for
> example, I want to search for clusters only among documents that were
> loaded one month ago).
>
> * I will have tens of thousands of new documents every day and an overall
> base of several million.
>
> I'm reading "Mahout in Action" now, but maybe you can point me to what I
> need.
> --- Original message ---
> From: "Chandan Tamrakar" <chandan.tamra...@nepasoft.com>
> To: solr-user@lucene.apache.org
> Date: 4 September 2012, 12:30:56
> Subject: Re: Solr Clustering
>
>
>
> >
>
> Yes, there is a Solr component if you want to cluster Solr documents; check
> the following link: http://wiki.apache.org/solr/ClusteringComponent
> Carrot2 might be good if you want to cluster a few thousand documents,
> for example, when a user searches Solr, just cluster the search results.
>
> Mahout is much more scalable, and you will probably need Hadoop for that.
>
>
> thanks
> chandan
>
> On Tue, Sep 4, 2012 at 2:10 PM, Denis Kuzmenok <forward...@ukr.net> wrote:
>
> >
> >
> > -------- Original Message --------
> > Subject: Solr Clustering
> > From: Denis Kuzmenok <forward...@ukr.net>
> > To: solr-user@lucene.apache.org
> > CC:
> >
> > Hi, all.
> > I know there are Carrot2 and Mahout for clustering. I want to implement
> > the following:
> > I fetch documents and want to group them into clusters when they are added
> > to the index (I want to filter "similar" documents, for example, for 1 week).
> > I need these documents quickly, so I can't rely on postponed
> > calculations. Each document should have an assigned cluster id (i.e. group
> > similar documents into clusters and assign each document its cluster id).
> > It's something similar to news aggregators like Google News. I don't need
> > to search for clusters with documents older than 1 week (for example). Each
> > document will have its unique id and be saved into the DB, but Solr will
> > have a cluster id field as well.
> > Is it possible to implement this with Solr/Carrot2/Mahout?
>
>
>
>
> --
> Chandan Tamrakar
>
>
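
Regarding the ClusteringComponent Chandan points to above: it clusters only the documents returned by a query, so the time-window requirement can be expressed as a filter query on a date field. A rough sketch of building such a request in Python follows. The `/clustering` handler path, the `added` date field and the `title`/`body` field names are assumptions that depend on your solrconfig.xml and schema; the `clustering`, `clustering.results`, `carrot.title` and `carrot.snippet` parameters are the ones described on the wiki page.

```python
from urllib.parse import urlencode


def clustering_url(base, query, days, rows=100):
    """Build a Solr request that clusters the results of `query`,
    restricted to documents added in the last `days` days."""
    params = {
        "q": query,
        # "added" is a hypothetical date field in the schema.
        "fq": f"added:[NOW-{days}DAYS TO NOW]",
        # Only the returned rows are clustered, so this bounds the cluster input.
        "rows": rows,
        "wt": "json",
        "clustering": "true",
        "clustering.results": "true",
        "carrot.title": "title",    # hypothetical field name
        "carrot.snippet": "body",   # hypothetical field name
    }
    return f"{base}?{urlencode(params)}"


# Assumed handler path, as in the Solr example configuration.
url = clustering_url("http://localhost:8983/solr/clustering", "*:*", days=7)
print(url)
```

Because only the returned rows are clustered, this fits the "few thousand documents" case and not whole-index clustering.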


-- 
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
