New implementation of MLT

2013-03-30 Thread Gagandeep singh
Hi folks We started using the default implementation of MLT (org.apache.solr.handler.MoreLikeThisHandler) recently and found that there are a couple of things it lacks: 1. Searching for terms in the same field as the original document: - the current implementation picks the top field to

Merging solr indexes with duplicate keys - merging duplicate documents

2013-03-30 Thread Gagandeep singh
Hi folks We have a use case where I have 2 Solr indexes with the same schema but different fields populated, for example: Common schema: // Unique key Now I have one index which stores the information about products (first 5 fields). This index is built every 2 days. I have a 2nd in
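The merge described in this message can be sketched in plain Java, without any Solr API calls. This is only an illustration under stated assumptions: each index is assumed to be exportable as a map from unique key to a field map, the names `DocMerger` and `mergeByKey` are hypothetical, and the choice that the second index wins when both populate the same field is an assumption, not something the message states.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DocMerger {
    /**
     * Merges two sets of documents keyed by unique key (hypothetical helper).
     * Documents with the same key are combined into one field map; when both
     * sides populate the same field, the value from "second" wins (assumption).
     */
    public static Map<String, Map<String, Object>> mergeByKey(
            Map<String, Map<String, Object>> first,
            Map<String, Map<String, Object>> second) {
        Map<String, Map<String, Object>> merged = new LinkedHashMap<>();
        // Copy every document from the first index so the inputs stay untouched.
        for (Map.Entry<String, Map<String, Object>> e : first.entrySet()) {
            merged.put(e.getKey(), new LinkedHashMap<>(e.getValue()));
        }
        // Fold in the second index: add its fields to an existing document,
        // or create a new document when the key was not seen before.
        for (Map.Entry<String, Map<String, Object>> e : second.entrySet()) {
            merged.computeIfAbsent(e.getKey(), k -> new LinkedHashMap<>())
                  .putAll(e.getValue());
        }
        return merged;
    }
}
```

The merged documents would then still have to be written to a target index; that step is outside this sketch.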

Re: [ANNOUNCE] Solr wiki editing change

2013-03-30 Thread Trey Grainger
Please add TreyGrainger to the contributors group. Thanks! -Trey On Sun, Mar 24, 2013 at 11:18 PM, Steve Rowe wrote: > The wiki at http://wiki.apache.org/solr/ has come under attack by > spammers more frequently of late, so the PMC has decided to lock it down in > an attempt to reduce the

Re: How to get Term Positions?

2013-03-30 Thread gagan_goku
I tried the same thing today, am happy to share a snippet with you: SchemaField field = req.getSchema().getFields().get("field_name"); AtomicReader ar = req.getSearcher().getAtomicReader(); AtomicReaderContext context = ar.getContext(); final Fields fields = context.reader().fiel

Re: clusterstate.json size

2013-03-30 Thread Yago Riveiro
From ZooKeeper documentation: jute.maxbuffer: (Java system property: jute.maxbuffer) This option can only be set as a Java system property. There is no zookeeper prefix on it. It specifies the maximum size of the data that can be stored in a znode. The default is 0xfffff, or just under 1M. I
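To make the limit concrete, here is a small Java sketch that compares a znode size against the documented default of 0xfffff (1,048,575 bytes, just under 1M). The class and method names are hypothetical, and the 1.1 MB figure is the approximate clusterstate.json size reported in this thread, taken here as 1.1 * 1024 * 1024 bytes (an assumption about how it was measured).

```java
public class JuteBufferCheck {
    /** Default limit quoted in the ZooKeeper documentation: 0xfffff bytes. */
    public static final int DEFAULT_MAX_BUFFER = 0xfffff;

    /** Reads the jute.maxbuffer system property, falling back to the default. */
    public static int configuredLimit() {
        return Integer.getInteger("jute.maxbuffer", DEFAULT_MAX_BUFFER);
    }

    /** Returns true when a znode of the given size is larger than the limit. */
    public static boolean exceeds(long znodeBytes, int limit) {
        return znodeBytes > limit;
    }

    public static void main(String[] args) {
        // Assumed size: "1.1MB" interpreted as 1.1 * 1024 * 1024 bytes.
        long clusterStateBytes = (long) (1.1 * 1024 * 1024);
        int limit = configuredLimit();
        System.out.println("limit=" + limit + " bytes, size=" + clusterStateBytes
                + " bytes, exceeds=" + exceeds(clusterStateBytes, limit));
    }
}
```

Because this is a Java system property, raising it would mean starting the JVM with `-Djute.maxbuffer=<bytes>`; the thread does not say which value to use.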

Re: clusterstate.json size

2013-03-30 Thread Yago Riveiro
Well, the explanation is very simple. When I started using Solr, the routing for data was implicit and automatic. Therefore, I did the sharding manually to have control of the distribution of data. I wanted data sharded by client and month. I have 160 clients (and this is expected to grow) with 7 years of

RE: clusterstate.json size

2013-03-30 Thread Vaillancourt, Tim
Wow. I'm guessing we may have a new "Largest SolrCloud" winner ;). Tim -Original Message- From: svamb...@gmail.com [mailto:svamb...@gmail.com] Sent: Saturday, March 30, 2013 11:56 AM To: solr-user@lucene.apache.org Cc: solr-user@lucene.apache.org Subject: Re: clusterstate.json size You

Re: clusterstate.json size

2013-03-30 Thread svambati
You can zip the file and send it to ZK Sent from my iPhone On 30-Mar-2013, at 1:47 PM, Mark Miller wrote: > 4.2.1 *should* return a decent response. > > How many nodes!? I didn't even see that size at 1000 shards! > > It's a zk sys prop to raise it - on the road now, so I'd try google. >

RE: [WEBINAR] - "Lucene/Solr 4 - A Revolution in Enterprise Search Technology"

2013-03-30 Thread Vaillancourt, Tim
Too bad I missed it, thanks for putting this on! Are there any links to the recorded version? Cheers, Tim -Original Message- From: Erik Hatcher [mailto:erik.hatc...@gmail.com] Sent: Tuesday, March 26, 2013 6:24 PM To: solr-user@lucene.apache.org; java-u...@lucene.apache.org Subject: [W

Re: clusterstate.json size

2013-03-30 Thread Mark Miller
4.2.1 *should* return a decent response. How many nodes!? I didn't even see that size at 1000 shards! It's a zk sys prop to raise it - on the road now, so I'd try google. - mark Sent from my iPhone On Mar 30, 2013, at 1:49 PM, yriveiro wrote: > Hi, > > Is there a size limitation for the

clusterstate.json size

2013-03-30 Thread yriveiro
Hi, Is there a size limitation for the clusterstate file? I can't create more collections for my cluster. I get no error, but the CREATE command does not return any response. I read in the past that the max size for a file in ZooKeeper was 1MB; my clusterstate file is 1.1MB. It's possible be this th

Re: How to optimize live production server SOLR Index

2013-03-30 Thread Otis Gospodnetic
Hello, Unless you have a strong reason to optimize it, don't do it. :) Check http://search-lucene.com/?q=solr+wunder+optimize (hi Wunder!) Also, if you are using Solr in SolrCloud mode, you may want to move to Solr 4.2.x - we found Solr 4.0 in SolrCloud mode to be problematic. Otis -- Solr & El

How to optimize live production server SOLR Index

2013-03-30 Thread A Geek
Hi All, I'm pretty new to SOLR. Currently I'm using SOLR version 4.0 and we have two indexes, one around 30 GB in size and another around 180 GB. Each contains more than a million records. I was wondering what is the best way to optimize the index and keep serving user requests and also w

Re: Solr 4.2 - Slave Index version is higher than Master

2013-03-30 Thread Mark Miller
The gen is an incrementing int for each commit, the version is a timestamp of the commit. - Mark On Mar 30, 2013, at 11:06 AM, adityab wrote: > Mark, isn't this response from master some what confusing. The gen and > version number is out of sync. > > http://.../solr/replication?command=det
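Since the version is described as a commit timestamp, a short Java sketch can decode one. It assumes the 13-digit values quoted in this thread (1364655616211 and 1364619609805) are milliseconds since the Unix epoch; the thread does not state the unit, and the class and method names are hypothetical.

```java
import java.time.Instant;

public class CommitVersion {
    /** Interprets an index version as epoch milliseconds (assumption). */
    public static Instant versionToInstant(long indexVersion) {
        return Instant.ofEpochMilli(indexVersion);
    }

    public static void main(String[] args) {
        // Values taken from the replication details response quoted in the thread.
        System.out.println(versionToInstant(1364655616211L)); // 2013-03-30T15:00:16.211Z
        System.out.println(versionToInstant(1364619609805L)); // 2013-03-30T05:00:09.805Z
    }
}
```

Under that assumption both values fall on the day of the thread, which fits a version that tracks commit time while the generation simply counts commits.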

Re: Solr 4.2 - Slave Index version is higher than Master

2013-03-30 Thread adityab
Mark, isn't this response from master somewhat confusing? The gen and version numbers are out of sync. http://.../solr/replication?command=details 0 2 1.52 GB /storage/solrdata/index/ 1364655616211 22 ... true false 1364619609805 20 schema.xml commit startup false 22 This response fo