Rolling partitions with solr shards

2012-05-27 Thread avenka
Is there a simple way to get solr to maintain shards as rolling partitions by date, e.g., the last day's documents in one shard, the week before yesterday in the next shard, the month before that in the next shard, and so on? I really don't need querying to be fast on the entire index, but it is cr
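A rough sketch of the bucketing described in the question above, assuming three rolling windows (last day, the following 7 days, the following 30 days) plus a catch-all. The shard names and the `rolling_shard` helper are hypothetical and not part of Solr. Solr would not move documents between shards on its own here, so a periodic job would still have to re-index or migrate documents as they age.

```python
from datetime import date

# Hypothetical window sizes, in days, matching the question:
# the last day, then the week before that, then the month before that.
DAY_WINDOW = 1
WEEK_WINDOW = 7
MONTH_WINDOW = 30


def rolling_shard(doc_date: date, today: date) -> str:
    """Return the (hypothetical) shard name a document belongs to today."""
    age_days = (today - doc_date).days
    if age_days < 0:
        raise ValueError("document date is in the future")
    if age_days < DAY_WINDOW:
        return "shard_day"
    if age_days < DAY_WINDOW + WEEK_WINDOW:
        return "shard_week"
    if age_days < DAY_WINDOW + WEEK_WINDOW + MONTH_WINDOW:
        return "shard_month"
    return "shard_archive"


if __name__ == "__main__":
    today = date(2012, 5, 27)
    for d in (date(2012, 5, 27), date(2012, 5, 20), date(2012, 5, 19)):
        print(d, rolling_shard(d, today))
```

Queries that only need recent documents could then be sent to the small shards alone, which is the property the question cares about.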

solr java.lang.NullPointerException on select queries

2012-06-16 Thread avenka
I have recently started getting the error pasted below with solr-3.6 on /select queries. I don't know of anything that changed in the config to start causing this error. I am also running a second independent solr server on the same machine, which continues to run fine and has the same configuratio

Re: solr java.lang.NullPointerException on select queries

2012-06-19 Thread avenka
For the first install, I copied over all files in the directory "example" into, let's call it, "install1". I did the same for "install2". The two installs run on different ports, use different jar files, are not really related to each other in any way as far as I can see. In particular, they are no

Re: solr java.lang.NullPointerException on select queries

2012-06-20 Thread avenka
Erick, thanks for pointing that out. I was going to say in my original post that it is almost like some limit on max documents got violated all of a sudden, but the rest of the symptoms didn't seem to quite match. But now that I think about it, the problem probably happened at 2B (corresponding exa

Re: solr java.lang.NullPointerException on select queries

2012-06-20 Thread avenka
Yes, wonky indeed.
numDocs : -2006905329
maxDoc : -1993357870
And yes, I meant that the holes are in the database auto-increment ID space, nothing to do with lucene IDs. I will set up sharding. But is there any way to retrieve most of the current index? Currently, all select queries even in
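The negative counts are consistent with a signed 32-bit counter that has wrapped past 2^31 - 1 (Lucene document IDs are Java ints). The arithmetic below is only a sketch under that assumption; it shows what the reported values would correspond to if they were plain wrap-arounds, and the helper names are made up for illustration.

```python
INT32_MODULUS = 2 ** 32
INT32_MAX = 2 ** 31 - 1  # 2147483647, the largest signed 32-bit value


def as_unsigned_32(value: int) -> int:
    """Reinterpret a signed 32-bit value as an unsigned count."""
    return value % INT32_MODULUS


def as_signed_32(value: int) -> int:
    """Wrap an arbitrary count into the signed 32-bit range."""
    value %= INT32_MODULUS
    return value - INT32_MODULUS if value > INT32_MAX else value


# Values reported in the thread.
reported = {"numDocs": -2006905329, "maxDoc": -1993357870}

for name, signed in reported.items():
    unsigned = as_unsigned_32(signed)
    # Assumption: the true count simply wrapped around once.
    print(f"{name}: {signed} -> {unsigned} "
          f"({unsigned - INT32_MAX} past the signed 32-bit limit)")
```

Under this reading both counts sit a little above 2.1 billion, which matches the guess that the trouble started at about 2B documents.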

Re: solr java.lang.NullPointerException on select queries

2012-06-20 Thread avenka
Thanks. Do you know if the tons of index files with names like '_zxt.tis' in the index/data/ directory have the lucene IDs embedded in the binaries? The files look good to me and are partly readable even if in binary. I am wondering if I could just set up a new solr instance and move these index fi

Re: solr java.lang.NullPointerException on select queries

2012-06-20 Thread avenka
Erick, thanks for the advice, but let me make sure you haven't misunderstood what I was asking. I am not trying to split the huge existing index in install1 into shards. I am also not trying to make the huge install1 index as one shard of a sharded solr setup. I plan to use a sharded setup only fo

Re: solr java.lang.NullPointerException on select queries

2012-06-21 Thread avenka
Erick, much thanks for detailing these options. I am currently trying the second one as that seems a little easier and quicker to me. I successfully deleted documents with IDs after the problem time that I do know to an accuracy of a couple hours. Now, the stats are: numDocs : 2132454075 maxDo

Re: solr java.lang.NullPointerException on select queries

2012-06-26 Thread avenka
So, I tried 'optimize', but it failed because of lack of space on the first machine. I then moved the whole thing to a different machine where the index was pretty much the only thing and was using about 37% of disk, but it still failed because of a "No space left on device" IOException. Also, the

SolrCloud error while propagating update to primary ZK node

2012-07-08 Thread avenka
I get a JSON parse error (pasted below) when I send an update to a replica node. I downloaded solr 4 alpha and followed the instructions at http://wiki.apache.org/solr/SolrCloud/ and set up numShards=1 with 3 total servers managed by a zookeeper ensemble, the primary at 8983 and the other two at 757

SolrCloud replication question

2012-07-08 Thread avenka
I am trying to wrap my head around replication in SolrCloud. I tried the setup at http://wiki.apache.org/solr/SolrCloud/. I mainly need replication for high query throughput. The setup at the URL above appears to maintain just one copy of the index at the primary node (instead of a replicated index

DataImport using last_indexed_id or getting max(id) quickly

2012-07-08 Thread avenka
My understanding is that the DIH in solr only enters last_indexed_time in dataimport.properties, but not, say, last_indexed_id for a primary key 'id'. How can I efficiently get the max(id) (note that 'id' is an auto-increment field in the database)? Maintaining max(id) outside of solr is brittle and
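One possible way to ask the index itself for max(id) is a single-row query sorted by the id field in descending order. The sketch below only builds the request URL; the base URL is a placeholder, and it assumes 'id' is indexed as a sortable numeric field (with a string field type the sort would be lexicographic, so "9" would rank above "10"). Sorting a very large index on a field also has a memory cost, so this may not count as "quick" in every setup.

```python
from urllib.parse import urlencode


def max_id_query_url(base_url: str, id_field: str = "id") -> str:
    """Build a /select URL that returns only the document with the largest id.

    base_url is a placeholder such as "http://localhost:8983/solr".
    """
    params = {
        "q": "*:*",                  # match every document
        "sort": f"{id_field} desc",  # largest id first
        "rows": 1,                   # only the top document is needed
        "fl": id_field,              # return just the id field
        "wt": "json",
    }
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(max_id_query_url("http://localhost:8983/solr"))
```

The id in the single returned document could then be fed back into the import query as the lower bound for the next delta import.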

Re: SolrCloud error while propagating update to primary ZK node

2012-07-08 Thread avenka
exactly how you are adding the document? E.g., what update handler are you using, and what is the document you are adding?

On Jul 8, 2012, at 12:52 PM, avenka wrote:
> I get a JSON parse error (pasted below) when I send an update to a replica
> node. I downloaded solr 4 alpha and followed t

Re: SolrCloud replication question

2012-07-09 Thread avenka
e is no master/slave setup any more. And you do
> _not_ have to configure replication.
>
> Best
> Erick
>
> On Sun, Jul 8, 2012 at 1:03 PM, avenka <[hidden email]> wrote:
>
> > I am trying to wrap my head around replication in SolrCloud. I tried the
> > setup at

Re: SolrCloud replication question

2012-07-09 Thread avenka
Hmm, never mind my question about replicating using symlinks. Given that replication on a single machine improves throughput, I should be able to get a similar improvement by simply sharding on a single machine. As also observed at http://carsabi.com/car-news/2012/03/23/optimizing-solr-7x-your-se

Re: DataImport using last_indexed_id or getting max(id) quickly

2012-07-11 Thread avenka
Thanks. Can you explain the first TermsComponent option for obtaining max(id) in more detail? Do I have to modify schema.xml to add a new field? How exactly do I query for the lowest value of "1 - id"?