Re: Number of clustering labels to show

2015-05-29 Thread Stanislaw Osinski
s is here: http://doc.carrot2.org/#section.component.lingo. Stanislaw -- Stanislaw Osinski, stanislaw.osin...@carrotsearch.com http://carrotsearch.com On Fri, May 29, 2015 at 4:29 AM, Zheng Lin Edwin Yeo wrote: > Hi, > > I'm trying to increase the number of cluster result to be shown

Re: Parsing cluster result's docs

2015-03-09 Thread Stanislaw Osinski
Hi, > I have a Solr instance using the clustering component (with the Lingo > algorithm) working perfectly. However when I get back the cluster results > only the ID's of these come back with it. What is the easiest way to > retrieve full documents instead? Should I parse these IDs into a new que

Re: Is it possible to cluster on search results but return only clusters?

2014-05-06 Thread Stanislaw Osinski
Hi Sebastián, Looking quickly through the code of the clustering component, there's currently no way to output only clusters. Let me see if this can be easily implemented. Stanislaw -- Stanislaw Osinski, stanislaw.osin...@carrotsearch.com http://carrotsearch.com On Tue, May 6, 2014 at 6:

Re: [Clustering] Full-Index Offline cluster

2014-03-11 Thread Stanislaw Osinski
> Thank you Ahmet, Staszek and Tomnaso ;) > so the only way to obtain offline Clustering is to move to a customisation > ! > I will take a look at the interface of the API (if you can give me a link > to the class, it will be appreciated; if not, I will find it by myself). > The API stub is the or

Re: [Clustering] Full-Index Offline cluster

2014-03-10 Thread Stanislaw Osinski
> > That's weird. As far as I know there is no such thing. There is > classification stuff but I haven't heard of clustering. > > http://soleami.com/blog/comparing-document-classification-functions-of-lucene-and-mahout.html I think the wording on the wiki page needs some clarification -- Solr cont

Re: solrconfig.xml carrot2 params

2013-10-21 Thread Stanislaw Osinski
> Thanks, I'm new to the clustering libraries. I finally made this > connection when I started browsing through the carrot2 source. I had > pulled down a smaller MM document collection from our test environment. It > was not ideal as it was mostly structured, but small. I foolishly thought > I

Re: solrconfig.xml carrot2 params

2013-10-18 Thread Stanislaw Osinski
-- Stanislaw Osinski, stanislaw.osin...@carrotsearch.com http://carrotsearch.com On Thu, Oct 17, 2013 at 11:49 PM, youknow...@heroicefforts.net < youknow...@heroicefforts.net> wrote: > Would someone help me out with the syntax for setting > Tokenizer.documentFields in the ClusteringComp

Re: News clustering

2012-12-03 Thread Stanislaw Osinski
> I mean measuring the similarity between the documents in each cluster. > Also, the difference between documents in one cluster and those in another cluster. > > I saw the sample code ClusteringQualityBenchmark.java > However, I do not know how to make use of it for assessing my Solr > Clustering performance. >

Re: News clustering

2012-12-03 Thread Stanislaw Osinski
> Was the picture generated using Lingo 3G algorithms? > I saw some sub-clusters inside it. > Nice pic :) > That is correct. I am interested to learn it. > How long is the Lingo 3G trial period? > I'll send you the details in a private e-mail in a second. > Is there any way to programmatical

Re: News clustering

2012-12-03 Thread Stanislaw Osinski
1 Staszek -- Stanislaw Osinski http://carrotsearch.com On Fri, Nov 30, 2012 at 4:44 PM, Jorge Luis Betancourt Gonzalez < jlbetanco...@uci.cu> wrote: > Hi all: > > I'm thinking on using nutch combined with solr to index some news sites in > an intranet. And I was wondering

Re: document clustering or tagging

2012-11-18 Thread Stanislaw Osinski
Stanislaw Osinski, stanislaw.osin...@carrotsearch.com http://carrotsearch.com I have very huge solr index. I want to tag all documents with terms that > better represent that document like this > < > http://search.carrotsearch.com/carrot2-webapp/search?source=web&view=folders&am

Re: Carrot2 using rawtext of field for clustering

2012-06-08 Thread Stanislaw Osinski
> > Is there any workaround in Solr/Carrot2 So that we could pass tokens that'd > been filtered with customer tokenizer/filters instead of rawtext that it > currently > uses for clustering ? > > I read an issue in following link too . > > https://issues.apache.org/jira/browse/SOLR-2917 > > > Is wri

Re: System requirements in my case?

2012-05-22 Thread Stanislaw Osinski
> > 3) Measure the size of the index folder, multiply with 8 to get a clue of >> total index size >> > With 12 000 docs my index folder size is: 33Mo > ps: I use "solr.clustering.enabled=true" Clustering is performed at search time, it doesn't affect the size of the index (but obviously it does a

Re: Newbie with Carrot2?

2012-05-22 Thread Stanislaw Osinski
wrote: > On 20/05/2012 11:43, Stanislaw Osinski wrote: > > Hi Bruno, >> >> Here's the wiki documentation for Solr's clustering component: >> >> http://wiki.apache.org/solr/ClusteringComponent >

Re: using Carrot2 custom ITokenizerFactory

2012-05-21 Thread Stanislaw Osinski
feFieldAccessorImpl.throwSetIllegalArgumentException(UnsafeFieldAccessorImpl.java:150) > at sun.reflect.UnsafeObjectFieldAccessorImpl.set(UnsafeObjectFieldAccessorImpl.java:63) > at java.lang.reflect.Field.set(Field.java:657) > at org.carrot2.uti

Re: using Carrot2 custom ITokenizerFactory

2012-05-20 Thread Stanislaw Osinski
o make the clustering component and Carrot2 JARs available to the context classloader by copying them to WEB-INF/lib of the WAR. Staszek On Sun, May 20, 2012 at 6:16 PM, Stanislaw Osinski < stanislaw.osin...@carrotsearch.com> wrote: > Interesting... let me investigate. > > S. > > >

Re: using Carrot2 custom ITokenizerFactory

2012-05-20 Thread Stanislaw Osinski
eFieldAccessorImpl.java:150) > at sun.reflect.UnsafeObjectFieldAccessorImpl.set(UnsafeObjectFieldAccessorImpl.java:63) > at java.lang.reflect.Field.set(Field.java:657) > at org.carrot2.util.attribute.AttributeBinder$AttributeBinderActionBind.

Re: using Carrot2 custom ITokenizerFactory

2012-05-20 Thread Stanislaw Osinski
Hi Koji, It's fixed in trunk and 3.6.1 branch now. If you hit any other issues with this, let me know. Staszek On Sun, May 20, 2012 at 1:02 PM, Koji Sekiguchi wrote: > Hi Staszek, > > I'll wait your fix. Thank you! > > Koji Sekiguchi from iPad2 > > On 2012/05/

Re: Newbie with Carrot2?

2012-05-20 Thread Stanislaw Osinski
Hi Bruno, Here's the wiki documentation for Solr's clustering component: http://wiki.apache.org/solr/ClusteringComponent For configuration examples, take a look at the Configuration section: http://wiki.apache.org/solr/ClusteringComponent#Configuration. If you hit any problems, let me know. St

Re: using Carrot2 custom ITokenizerFactory

2012-05-20 Thread Stanislaw Osinski
Hi Koji, You're right, the current code overwrites the custom tokenizer though it shouldn't. LuceneCarrot2TokenizerFactory is there to avoid circular dependencies (Carrot2 default tokenizer depends on Lucene), but it shouldn't be an issue with custom tokenizers. I'll try to commit a fix later tod

Re: Old Google Guava library needs updating (r05)

2012-03-26 Thread Stanislaw Osinski
) > Just thought I'd document it somewhere for a proper fix to be done in the > 4.0 release. > > No issues arose for me but then again Erick mentions it's only used in > Carrot2 contrib which I'm not using in my deployment. > > Thanks for the help! > Nick > >

Re: Old Google Guava library needs updating (r05)

2012-03-26 Thread Stanislaw Osinski
Hi Nick, Which version of Solr do you have in mind? The official 3.x line or 4.0? The quick and dirty fix to try would be to just replace Guava r05 with the latest version, chances are it will work (we did that in the past though the version number difference was smaller). The proper fix would b

Re: Solr 3.5.0 can't find Carrot classes

2012-01-26 Thread Stanislaw Osinski
Hi, Can you paste the logs from the second run? Thanks, Staszek On Wed, Jan 25, 2012 at 00:12, Christopher J. Bottaro wrote: > On Tuesday, January 24, 2012 at 3:07 PM, Christopher J. Bottaro wrote: > > SEVERE: java.lang.NoClassDefFoundError: > org/carrot2/core/ControllerFactory > > at

Re: Weird docs-id clustering output in Solr 1.4.1

2011-12-01 Thread Stanislaw Osinski
); } Let me know if this did the trick. Cheers, S. On Thu, Dec 1, 2011 at 10:43, Vadim Kisselmann wrote: > Hi Stanislaw, > did you already have time to create a patch? > If not, can you tell me please which lines in which class in source code > are relevant? > Thanks and regards

Re: Weird docs-id clustering output in Solr 1.4.1

2011-11-29 Thread Stanislaw Osinski
> > But my actual live system works on solr 1.4.1. i can only change my > solrconfig.xml and integrate new packages... > i check the possibility to upgrade from 1.4.1 to 3.5 with the same index > (without reindexing) with luceneMatchVersion 2.9. > i hope it works... > Another option would be to chec

Re: Weird docs-id clustering output in Solr 1.4.1

2011-11-29 Thread Stanislaw Osinski
Hi, It looks like some serialization issue related to writing integer ids to the output. I've just tried a similar configuration on Solr 3.5 and the integer identifiers looked fine. Can you try the same configuration on Solr 3.5? Thanks, Staszek On Tue, Nov 29, 2011 at 12:03, Vadim Kisselmann

Re: Clustering and FieldType

2011-11-25 Thread Stanislaw Osinski
Hi, You're right -- currently Carrot2 clustering ignores the Solr analysis chain and uses its own pipeline. It is possible to integrate with Solr's analysis components to some extent, see the discussion here: https://issues.apache.org/jira/browse/SOLR-2917. Staszek > > Hi > > Trying to use carr

Re: Clustering not working when using 'text' field as snippet.

2011-08-12 Thread Stanislaw Osinski
Hi Pablo, The reason clustering doesn't work with the "text" field is that the field is not stored: For clustering to work, you'll need to keep your documents' titles and content in stored fields. Staszek On Fri, Aug 12, 2011 at 10:28, Pablo Queixalos wrote: > Hi, > > > > > > I am using s

Re: How to use solr clustering to show in search results

2011-07-01 Thread Stanislaw Osinski
The "docs" array contained in each cluster contains ids of documents belonging to the cluster, so for each id you need to look up the document's content, which comes earlier in the response (in the response/docs array). Cheers, Staszek On Thu, Jun 30, 2011 at 11:50, Romi wrote: > wanted to use
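The lookup described in this reply can be sketched in Python. This is only a minimal sketch, assuming the JSON response has already been parsed into a dict, that the documents sit under response/docs and the clusters under a top-level "clusters" key, and that the unique key field is named "id". The function name and the sample data are hypothetical; adjust the field names to your schema and Solr version.

```python
def attach_documents(solr_response, id_field="id"):
    """Replace the ids in each cluster's "docs" array with the full documents.

    Assumes (not confirmed by the thread) that documents are under
    response/docs and clusters under a top-level "clusters" key.
    """
    # Index the documents that come earlier in the response by their unique key.
    docs_by_id = {
        str(doc[id_field]): doc for doc in solr_response["response"]["docs"]
    }
    resolved = []
    for cluster in solr_response.get("clusters", []):
        resolved.append({
            "labels": cluster.get("labels", []),
            # Skip ids that are not present in the returned documents.
            "docs": [
                docs_by_id[str(doc_id)]
                for doc_id in cluster.get("docs", [])
                if str(doc_id) in docs_by_id
            ],
        })
    return resolved


# Hypothetical sample data, shaped like the response described above.
sample = {
    "response": {"docs": [
        {"id": "200066", "name": "First document"},
        {"id": "195650", "name": "Second document"},
    ]},
    "clusters": [
        {"labels": ["Example label"], "docs": ["200066", "195650"]},
    ],
}

clusters_with_docs = attach_documents(sample)
```

Because the lookup is a plain dictionary access, no second query to Solr is needed as long as the clustered documents are among the rows returned by the same request.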

Re: Multicore clustering setup problem

2011-07-01 Thread Stanislaw Osinski
Hi Walter, That makes sense, but this has always been a multi-core setup, so the paths > have not changed, and the clustering component worked fine for core0. The > only thing new is I have fine tuned core1 (to begin implementing it). > Previously the solrconfig.xml file was very basic. I replaced

Re: Solr Clustering For Multiple Pages

2011-07-01 Thread Stanislaw Osinski
> > I am asking about the filter after clustering . Faceting is based on the > single field so,if we need to filter we can search in related field . But > in clustering it is created by multiple field then how can we create a > filter for that. > > Example > > after clustering you get the foll

Re: Multicore clustering setup problem

2011-06-30 Thread Stanislaw Osinski
It looks like the whole clustering component JAR is not in the classpath. I remember that I once dealt with a similar issue in Solr 1.4 and the cause was the relative path of the tag being resolved against the core's instanceDir, which made the path incorrect when directly copying and pasting from

Re: Multicore clustering setup problem

2011-06-29 Thread Stanislaw Osinski
Hi, Can you post the full stack trace? I'd need to know if it's really org.apache.solr.handler.clustering.ClusteringComponent that's missing or some other class ClusteringComponent depends on. Cheers, Staszek On Thu, Jun 30, 2011 at 04:19, Walter Closenfleight < walter.p.closenflei...@gmail.co

Re: what is solr clustering component

2011-06-29 Thread Stanislaw Osinski
> > and my second question is does clustering effect indexes. > No, it doesn't. Clustering is performed only on the search results produced by Solr, it doesn't change anything in the index. Cheers, Staszek

Re: Solr Clustering For Multiple Pages

2011-06-22 Thread Stanislaw Osinski
I don't quite follow, I must admit. Maybe it's faceting you're after? http://wiki.apache.org/solr/SolrFacetingOverview Staszek On Wed, Jun 22, 2011 at 08:40, nilay@gmail.com wrote: > Can you please tell me how can i apply filter in cluster data in Solr ? > > Currently i storing docid and

Re: Solr Clustering For Multiple Pages

2011-06-21 Thread Stanislaw Osinski
Hi, Currently, only the clustering of search results is implemented in Solr, clustering of the whole index is not possible out of the box. In other words, clustering applies only to the records you fetch during searching. For example, if you set rows=10, only the 10 returned documents will be clus
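Since only the fetched rows are clustered, the number of documents that reach the algorithm is controlled by the rows parameter of the request. Below is a minimal Python sketch that builds such a request URL. The base URL, the "/clustering" handler path and the function name are assumptions made for illustration; the clustering=true and rows parameters are the ones discussed in these threads.

```python
from urllib.parse import urlencode


def clustering_query_url(base_url, query, rows=100):
    """Build a search URL whose `rows` value sets how many hits get clustered."""
    params = {
        "q": query,
        "rows": rows,          # only these documents are passed to clustering
        "clustering": "true",  # enable the clustering component for this request
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


# Hypothetical host and handler path.
url = clustering_query_url("http://localhost:8983/solr/clustering", "data mining")
```

With rows=10 only ten documents would be clustered, which is why raising rows is the first thing to try when clusters look too small.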

Re: Mahout & Solr

2011-06-15 Thread Stanislaw Osinski
> > Is it possible to use the clustering component to use predefined clusters > generated by Mahout? Actually, the existing Solr ClusteringComponent's API has been designed to deal with both search results clustering (implemented by Carrot2) and off-line clustering of the whole index. The latter

Re: solr 3.1 java.lang.NoClassDEfFoundError org/carrot2/core/ControllerFactory

2011-06-08 Thread Stanislaw Osinski
Hi Bryan, You'll also need to make sure that your ${solr.dir}/contrib/clustering/lib directory is in the classpath; that directory contains the Carrot2 JARs that provide the classes you're missing. I think the example solrconfig.xml has the relevant declarations. Cheers, S. On Tue, Jun 7, 2011

Re: solr 3.1 java.lang.NoClassDEfFoundError org/carrot2/core/ControllerFactory

2011-06-08 Thread Stanislaw Osinski
Hi Bryan, You'll also need to make sure that your ${solr.home}/contrib/clustering/lib directory is in the classpath; that directory contains the Carrot2 JARs that provide the classes you're missing. I think the example solrconfig.xml has the relevant declarations. Cheers, S. On Tue, Jun 7, 2011

Re: assit with the Clustering component in Solr/Lucene

2011-05-16 Thread Stanislaw Osinski
> > Both of the clustering algorithms that ship with Solr (Lingo and STC) are >> designed to allow one document to appear in more than one cluster, which >> actually does make sense in many scenarios. There's no easy way to force >> them to produce hard clusterings because this would require a comp

Re: assit with the Clustering component in Solr/Lucene

2011-03-31 Thread Stanislaw Osinski
ct. However, I am happy that by adding the threshold to my request URL > produces the desired results > > let me know if I can do any more tests and I will do so. Thanks much > > Ramdev > > > > On Mar 31, 2011, at 10:18 AM, Stanislaw Osinski wrote: >

Re: assit with the Clustering component in Solr/Lucene

2011-03-31 Thread Stanislaw Osinski
> I added the parameter as you suggested. > (LingoClusteringAlgorithm.clusterMergingThreshold) into the searchComponent > section that describes the Clustering module > Changing the value of the parameter did not have any effect on my search > results. > > However, when I used the Carrot2 wor

Re: assit with the Clustering component in Solr/Lucene

2011-03-30 Thread Stanislaw Osinski
> Both of the clustering algorithms that ship with Solr (Lingo and STC) are > designed to allow one document to appear in more than one cluster, which > actually does make sense in many scenarios. There's no easy way to force > them to produce hard clusterings because this would require a complete

Re: assit with the Clustering component in Solr/Lucene

2011-03-30 Thread Stanislaw Osinski
Hi Ramdev, Both of the clustering algorithms that ship with Solr (Lingo and STC) are designed to allow one document to appear in more than one cluster, which actually does make sense in many scenarios. There's no easy way to force them to produce hard clusterings because this would require a compl
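To see how much overlap the algorithms actually produce for a given result set, the cluster list can be inspected on the client side. The following is a small Python sketch, assuming each cluster is a dict with a "docs" list of document ids as in the clustering output quoted elsewhere in this archive. The function name and sample data are hypothetical, and the sketch only reports overlap; it does not turn the result into a hard clustering.

```python
from collections import Counter


def documents_in_multiple_clusters(clusters):
    """Return {doc_id: cluster_count} for ids assigned to more than one cluster."""
    counts = Counter()
    for cluster in clusters:
        # A set ensures an id repeated inside one cluster is counted once.
        counts.update({str(doc_id) for doc_id in cluster.get("docs", [])})
    return {doc_id: n for doc_id, n in counts.items() if n > 1}


# Hypothetical clusters: document "2" belongs to both.
sample_clusters = [
    {"labels": ["Label A"], "docs": ["1", "2"]},
    {"labels": ["Label B"], "docs": ["2", "3"]},
]

overlap = documents_in_multiple_clusters(sample_clusters)
```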

Re: Carrot2 clustering component

2011-01-18 Thread Stanislaw Osinski
Hi, I think the exception is caused by the fact that you're trying to use the latest version of Carrot2 with Solr 1.4.x. There are two alternative solutions here: * as described in http://wiki.apache.org/solr/ClusteringComponent, invoke "ant get-libraries" to get the compatible JAR files. or *

Re: Multiple sorting on text fields

2010-09-13 Thread Stanislaw
ame", SolrQuery.ORDER.asc); the results should be sorted first by 'type' (only one letter, 'A' or 'B') and then by name. How can I define 'OR' or 'AND' relations here? Best regards, Stanislaw 2010/9/13 Dennis Gearo

Multiple sorting on text fields

2010-09-13 Thread Stanislaw
nd there is only one time in index) If I'm sorting only by one text field, I'm receiving "normal" results w/o problems. Where could I have made a mistake, or is it a bug? Best regards, Stanislaw

Re: specifying the doc id in clustering component

2010-08-19 Thread Stanislaw Osinski
> The solr schema has the fields, id, name and desc. > > I would like to get docs:["name Field here" ] instead of the doc Id > field as in > "docs":["200066", "195650", > The idea behind using the document ids was that based on them you could access the individual documents' content, inc

Re: specifying the doc id in clustering component

2010-08-18 Thread Stanislaw Osinski
the group (cluster) of documents. The description is usually a phrase or a number of phrases. The "docs" field lists the ids of documents that the algorithm assigned to the cluster. Can you give an example of the input and output you'd expect? Thanks! Stanislaw

Support loading queries from external files in QuerySenderListener

2010-08-04 Thread Stanislaw
Hi all! I can't load my custom queries from an external file, as described here: https://issues.apache.org/jira/browse/SOLR-784 This option seems not to be implemented in the current version of Solr, 1.4.1. Was it removed, or will it only arrive with a new version? regards, Stanislaw

Re: clustering component

2010-07-28 Thread Stanislaw Osinski
> The patch should also work with trunk, but I haven't verified it yet. > I've just added a patch against solr trunk to https://issues.apache.org/jira/browse/SOLR-1804. S.

Re: clustering component

2010-07-27 Thread Stanislaw Osinski
Hi Matt, I'm attempting to get the carrot based clustering component (in trunk) to > work. I see that the clustering contrib has been disabled for the time > being. Does anyone know if this will be re-enabled soon, or even better, > know how I could get it working as it is? > I've recently create

Re: Clustering results limit?

2010-07-22 Thread Stanislaw Osinski
Hi, In my SolrJ, I used ModifiableSolrParams and I set ("rows",50) but it > still returns less than 10 for each cluster. > Oh, the number of documents per cluster very much depends on the characteristics of your documents, it often happens that the algorithms create larger numbers of smaller clus

Re: Clustering results limit?

2010-07-22 Thread Stanislaw Osinski
Hi, I am attempting to cluster a query. It kinda works, but where my > (regular) query returns 500 results the cluster only shows 1-10 hits for > each cluster (5 clusters). Never more than 10 docs and I know it's not > right. What could be happening here? It should be showing dozens of > documents

[ANN] Carrot2 3.3.0 released

2010-04-19 Thread Stanislaw Osinski
ngine from Carrot Search. Thanks! Dawid Weiss, Stanislaw Osinski Carrot Search, i...@carrot-search.com

Re: Clustering Search taking 4sec for 100 results

2010-03-05 Thread Stanislaw Osinski
Hi, It might be also interesting to add some logging of clustering time (just filed: https://issues.apache.org/jira/browse/SOLR-1809) to see what the index search vs clustering proportions are. Cheers, S. On Fri, Mar 5, 2010 at 03:26, Erick Erickson wrote: > Search time is only partially depen

Re: Clustering from anlayzed text instead of raw input

2010-03-05 Thread Stanislaw Osinski
> I'll give a try to stopwords treatment, but the problem is that we > perform > POS tagging and then use payloads to keep only Nouns and Adjectives, and we > thought that could be interesting to perform clustering only with these > elements, to avoid senseless words. > POS tagging could help a

Re: Clustering from anlayzed text instead of raw input

2010-03-03 Thread Stanislaw Osinski
Hi Joan, I'm trying to use carrot2 (now I started with the workbench) and I can > cluster any field, but, the text used for clustering is the original raw > text, the one that was indexed, without any of the processing performed by > the tokenizer or filters. > So I get stop words. > The easiest

[ANN] Carrot2 3.2.0 released

2010-03-03 Thread Stanislaw Osinski
at i...@carrotsearch.com for details. Carrot Search Labs shares some small pieces of software we created when working on Carrot2 and Lingo3G. Please see http://labs.carrotsearch.com for details and downloads. Thanks! Dawid Weiss, Stanislaw Osinski Carrot Search, i...@carrot-search.com

Re: Has anyone got Carrot2 working with Solr without using ant?

2010-01-02 Thread Stanislaw Osinski
> You need, in addition to the ones shipped: > http://repo1.maven.org/maven2/colt/colt/1.2.0/colt-1.2.0.jar > http://download.carrot2.org/maven2/org/carrot2/nni/1.0.0/nni-1.0.0.jar > > http://mirrors.ibiblio.org/pub/mirrors/maven2/org/simpleframework/simple-xml/1.7.3/simple-xml-1.7.3.jar > http://r

[ANN] Carrot2 version 3.1.0 released

2009-09-29 Thread Stanislaw Osinski
-new-clustering-capabilities/ ) Release notes: http://project.carrot2.org/release-3.1.0-notes.html On-line demo: http://search.carrot2.org Download: http://download.carrot2.org Project website: http://project.carrot2.org Thanks, Staszek -- Stanislaw Osinski, http://carrot2.org

Re: SOLR-769 clustering

2009-09-08 Thread Stanislaw Osinski
Hi, It seems like the problem can be on two layers: 1) getting the right contents of stop* files for Carrot2, 2) making sure Solr picks up the changes. I tried your quick and dirty hack too. It didn't work either. Phrases like > "Carbon Atoms in the Group" with "in" still appear in my clustering labe

Re: SOLR-769 clustering

2009-09-08 Thread Stanislaw Osinski
Hi there, I tried to apply the stoplabels following the instructions you gave in the > Solr clustering wiki, but it didn't work. > > I am running the patched Solr on Tomcat. So, to enable the stop labels, I added > "-cp " to my system's CATALINA_OPTS. I > tried to change the file name from stoplabels

Re: Solr 1.4 Clustering / mlt AS search?

2009-08-15 Thread Stanislaw Osinski
Hi, On Thu, Aug 13, 2009 at 19:29, Mark Bennett wrote: There are comments in the Solr materials about having an option to cluster > based on the entire document set, and some warning about this being > atypical > and possibly slow. And from what you're saying, for a big enough docset, > it > mi

Re: Solr 1.4 Clustering / mlt AS search?

2009-08-13 Thread Stanislaw Osinski
Hi, On Tue, Aug 11, 2009 at 22:19, Mark Bennett wrote: Carrot2 has several pluggable algorithms to choose from, though I have no > evidence that they're "better" than Lucene's. Where TF/IDF is sort of a > one > step algebraic calculation, some clustering algorithms use iterative > approaches, e

Re: Faceting on text fields

2009-06-12 Thread Stanislaw Osinski
Hi, Sorry for being late to the party, let me try to clear some doubts about Carrot2. Do you know under what circumstances or application should we cluster the > whole corpus of documents vs just the search results? I think it depends on what you're trying to achieve. If you'd like to give the

Re: questions about Clustering

2009-05-23 Thread Stanislaw Osinski
> > Hmm, I saw the comment in ClusteringDocumentList.java of Carrot2: > > /* > * If you know what query generated the documents you're about to cluster, > pass > * the query to the algorithm, which will usually increase clustering > quality. > */ > attributes.put(AttributeNames.QUERY, "data mining"

Re: questions about Clustering

2009-05-23 Thread Stanislaw Osinski
> > 1. if q=*:* is requested, Carrot2 will receive "MatchAllDocsQuery" >> via attributes. Is it OK? >> > > Yes, it only clusters on the Doc List, not the Doc Set (in other words, > it's your rows that matter) Just to add to that: Carrot2 should be able to cluster up to ~1000 search results, but b

Re: clustering SOLR-769

2009-05-22 Thread Stanislaw Osinski
Hi there, > Is it possible to specify more than one snippet field or should I use copy > field to copy two or three fields into a single field and specify it in > snippet field. Currently, you can specify only one snippet field, so you'd need to use copy. Cheers, S.

Re: clustering SOLR-769

2009-05-21 Thread Stanislaw Osinski
Hi. > I built Solr from SVN this morning. I am using the Clustering example. I > have added my own schema.xml. > > The problem is that even though I change the carrot.snippet field from > features to filecontent, the clustering results do not change at all. > Please note features field is also there in m

Re: SOLR-769 clustering

2009-04-24 Thread Stanislaw Osinski
> > How would we enable people via SOLR-769 to do this? Good point, Grant! To apply the modified stopwords.* and stoplabels.* files to Solr, simply make them available in the classpath. For the example Solr runner scripts that would be something like: java -cp -Dsolr.solr.home=./clustering/solr

Re: SOLR-769 clustering

2009-04-22 Thread Stanislaw Osinski
Hi Antonio, > To answer your question about the minimum term: I am working with > "joke text" very short in length so the clusters are not so meaningful. I > mean lots of adverbs and nouns; I thought increasing it might give me fewer > clusters but a bit more meaningful ones (maybe not). Clusteri

Re: SOLR-769 clustering

2009-04-21 Thread Stanislaw Osinski
Hi Antonio, - is there anyway to have minimum number of labels per cluster? The current search results clustering algorithms (from Carrot2) by design generate one label per cluster, so there is no way to force them to create more. What is the reason you'd like to have more labels per cluster? I

Re: solr + carrot2

2007-08-27 Thread Stanislaw Osinski
generic Solr access UI would be great. > > Lance > > -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Stanislaw > Osinski > Sent: Saturday, August 18, 2007 2:23 AM > To: solr-user@lucene.apache.org > Subject: Re: solr + carrot2

Re: solr + carrot2

2007-08-18 Thread Stanislaw Osinski
upose implementation by cloning the Lucene implementation. I'm not sure if I'm getting you right here... By "implementation" do you mean adding to the Swing application an option for pulling data from Solr (with a configuration dialog for Solr URL etc.)? Thanks, Stanislaw

Re: solr + carrot2

2007-08-17 Thread Stanislaw Osinski
A. Thanks, Stanislaw -- Stanislaw Osinski, [EMAIL PROTECTED] http://www.carrot-search.com On 17/08/07, Pieter Berkel <[EMAIL PROTECTED]> wrote: > > Any updates on this? It certainly would be quite interesting to see how > well carrot2 clustering can be integrated with solr,

[release announcement] Carrot2 version 2.1 released

2007-08-13 Thread Stanislaw Osinski
Hi All, A bit of self-promotion again :) I hope you don't find it out of topic, after all, some folks are using Carrot2 with Lucene and Solr, and Nutch has a Carrot2-based clustering plugin. Staszek [EMAIL PROTECTED] ___

Re: solr + carrot2

2007-08-01 Thread Stanislaw Osinski
> > Has anyone looked into using carrot2 clustering with solr? > > I know this is integrated with nutch: > > http://lucene.apache.org/nutch/apidocs/org/apache/nutch/clustering/carrot2/Clusterer.html > > It looks like carrot has support to read results from a solr index: > > http://demo.carrot2.org/