Storing ranges on documents and searching all documents with specific value included

2014-01-17 Thread Avner Levy
I have millions of documents with the following fields: name (string), start version (int), end version (int). I need to efficiently query all records that satisfy the following: select all documents where version >= "start version" and version <= "end version". Running the above query took 50-100 ms
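
For readers following this thread, the containment condition can be written as two open-ended Solr range clauses. The sketch below only builds the query string; the field names `start_version` and `end_version` are assumptions standing in for the poster's "start version" and "end version" fields, and nothing here says how the poster actually issued the query.

```python
def version_filter(version, start_field="start_version", end_field="end_version"):
    """Build a Solr range query matching documents whose
    [start, end] interval contains the given version.

    The field names are hypothetical; adjust them to the real schema.
    """
    v = int(version)  # reject non-numeric input early
    # start <= v  is  start:[* TO v];  end >= v  is  end:[v TO *]
    return f"{start_field}:[* TO {v}] AND {end_field}:[{v} TO *]"


if __name__ == "__main__":
    print(version_filter(5))
```

A string like this could be passed as a `q` or `fq` parameter; whether that helps with the reported latency depends on the field types and caching, which the message does not specify.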

Re: Changing Cache Properties after Indexing

2014-01-17 Thread Kranti Parisa
Doesn't it make sense to have an Indexer / Query Engine setup? Indexer = Solr instance with replication configured as Master. Query Engine = one or more Solr instances with replication configured as Slave. That way, you can do batch indexing on the Indexer, perform threshold checks if needed by disabl

Re: Optimize

2014-01-17 Thread Otis Gospodnetic
If true, I think it is a bug. I think some people rely on optimize not being dumb about this. Otis Solr & ElasticSearch Support http://sematext.com/ On Jan 17, 2014 2:17 PM, "William Bell" wrote: > If I optimize and the core is already optimized, shouldn't it return > immediately? At least that

Storing termVectors for PreAnalyzed type field

2014-01-17 Thread Mou
Can anyone please confirm if this is not supported in the current version? I am trying to use pre-analyzed field for mlt and when creating the mltquery it does not get anything from the index. I think even if I set termVectors=true in the PreAnalyzed field definition, it is being ignored. -- V

Re: QParser parsing date into unix timestamp format

2014-01-17 Thread Chris Hostetter
: rather than seconds. This is how Java deals with time internally. I'm fairly : sure that this is also how Solr's date types work internally. More specifically: the QParser is giving you that query, because the FieldType you have for the specified field (probably TrieDateField) is parsing the
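
To make the milliseconds-versus-seconds point concrete, here is a small sketch that converts the ISO 8601 date strings used in this thread into milliseconds since the Unix epoch, the unit Java uses for time. The helper name `iso_to_epoch_millis` is hypothetical and this is not Solr's own parsing code, only an illustration of the arithmetic.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def iso_to_epoch_millis(text):
    """Convert a UTC timestamp such as '2012-10-01T00:00:00.000Z'
    into integer milliseconds since 1970-01-01T00:00:00Z."""
    dt = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)


if __name__ == "__main__":
    print(iso_to_epoch_millis("2012-10-01T00:00:00.000Z"))
    print(iso_to_epoch_millis("2012-10-01T23:59:59.999Z"))
```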

Re: QParser parsing date into unix timestamp format

2014-01-17 Thread Shawn Heisey
On 1/17/2014 12:59 PM, solr2020 wrote: We are writing our own search handler. We are facing this below issue. We are passing a date(Date:(["2012-10-01T00:00:00.000Z"+TO+"2012-10-01T23:59:59.999Z"])) for date range search to QParser.getParser method but it is converting the date to unix timestamp

QParser parsing date into unix timestamp format

2014-01-17 Thread solr2020
Hi, We are writing our own search handler and are facing the issue below. We are passing a date (Date:(["2012-10-01T00:00:00.000Z"+TO+"2012-10-01T23:59:59.999Z"])) for a date range search to the QParser.getParser method, but it is converting the date to unix timestamp format. (Date:([132217920 TO 132

Optimize

2014-01-17 Thread William Bell
If I optimize and the core is already optimized, shouldn't it return immediately? At least that is the way it used to work in 3.x. Now it appears to run a full optimize even if the index is already optimized. Is this by design?

RE: Indexing URLs from websites

2014-01-17 Thread Teague James
Progress! I changed the value of that property in nutch-default.xml and I am getting the anchor field now. However, the stuff going in there is a bit random and doesn't seem to correlate to the pages I'm crawling. The primary objective is that when there is something on the page that is a link

Re: Solr reload trigger when a configuration file is changed

2014-01-17 Thread Shawn Heisey
On 1/17/2014 7:25 AM, Mohit Jain wrote: Bingo! Tomcat was the one keeping track of changes in its own config/bin dirs. Once the timestamps of those dirs changed, it issued a reload on all WARs, resulting in a reload of the Solr cores. By the way, it would be good to have a similar configurable

Re: Changing Cache Properties after Indexing

2014-01-17 Thread P Williams
You're both completely right. There isn't any issue with indexing with large cache settings. I ran the same indexing job five times, twice with large cache and twice with the default values. I threw out the first job because no matter if it's cached or uncached it runs ~2x slower. This must have

Re: How to override rollback behavior in DIH

2014-01-17 Thread Peter Keegan
Hmm, this does get a bit complicated, and I'm not even doing any writes with the DIH SolrWriter. In retrospect, using a DIH to create only EFFs doesn't buy much except for the integration into the Solr Admin UI. Thanks for the pointer to 3671, James. Peter On Fri, Jan 17, 2014 at 10:59 AM, Dyer

RE: How to override rollback behavior in DIH

2014-01-17 Thread Dyer, James
Peter, I think you can override org.apache.solr.handler.dataimport.SolrWriter to have a custom (no-op) rollback method. Your new writer should implement org.apache.solr.handler.dataimport.DIHWriter. You can specify the "writerImpl" request parameter to specify the new class. Unfortunately, i

Re: solr cloud + hdfs issue

2014-01-17 Thread Mark Miller
You can configure the Solr client to use a replication factor of 1 for hdfs and then let Solr replicate for you if you want to avoid this. Other than that, we will be adding further options over time. - Mark On Jan 15, 2014, at 9:46 PM, longsan wrote: > Hi, i'm newer for solr cloud. i met a q

Re: How to override rollback behavior in DIH

2014-01-17 Thread Peter Keegan
I'm actually doing the 'skip' on every successful call to 'nextRow' with this trick: row.put("$externalfield",null); // DocBuilder.addFields will skip fields starting with '$' because I'm only creating ExternalFileFields. However, an error could also occur in the 'init' call, so exceptions have

Re: How to override rollback behavior in DIH

2014-01-17 Thread Shalin Shekhar Mangar
Can you try using onError=skip on your entities which use this data source? It's been some time since I looked at the code so I don't know if this works with data source. Worth a try I guess. On Fri, Jan 17, 2014 at 7:20 PM, Peter Keegan wrote: > Following up on this a bit - my main index is upd

Re: Solr reload trigger when a configuration file is changed

2014-01-17 Thread Mohit Jain
Hi Erick, Thanks for the response. Even I was surprised to see that behavior. Anyways I debugged it and found that Solr is not doing anything. The flow was - Generate config locally using templates - Sync configs to remote solr servers using "rsync" <--- I started digging deeper and went thr

Re: How to override rollback behavior in DIH

2014-01-17 Thread Peter Keegan
Following up on this a bit - my main index is updated by a SolrJ client in another process. If the DIH fails, the SolrJ client is never informed of the index rollback, and any pending updates are lost. For now, I've made sure that the DIH processor never throws an exception, but this makes it a bit

Re: Solr reload trigger when a configuration file is changed

2014-01-17 Thread Erick Erickson
I don't think this is the case at all, but of course I could have missed something. That is, if by "automatically reloaded" you mean they're picked up on server restart or explicit core reload (see the admin API). But just changing the files on disk doesn't cause Solr to load the changed configs.
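
As a rough illustration of the explicit core reload mentioned above, the sketch below builds the CoreAdmin request URL for the `RELOAD` action. The base URL and core name are placeholders, not values from this thread, and the function only constructs the URL; it does not contact a server.

```python
from urllib.parse import urlencode


def reload_url(solr_base, core):
    """Return the CoreAdmin URL that asks Solr to reload one core.

    solr_base and core are placeholders supplied by the caller,
    e.g. 'http://localhost:8983/solr' and 'collection1'.
    """
    params = urlencode({"action": "RELOAD", "core": core})
    return f"{solr_base.rstrip('/')}/admin/cores?{params}"


if __name__ == "__main__":
    print(reload_url("http://localhost:8983/solr", "collection1"))
```

Issuing a GET against that URL (for example with `urllib.request.urlopen`) is what triggers the reload; editing the files on disk alone does not.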

RE: Indexing URLs from websites

2014-01-17 Thread Markus Jelsma
-Original message- > From:Teague James > Sent: Thursday 16th January 2014 20:23 > To: solr-user@lucene.apache.org > Subject: RE: Indexing URLs from websites > > Okay. I had used that previously and I just tried it again. The following > generated no errors: > > bin/nutch solrindex

Solr reload trigger when a configuration file is changed

2014-01-17 Thread Mohit Jain
Hi, After upgrading Solr from 3.x to 4.x, I have observed that a Solr core gets automatically reloaded if a configuration file is changed. I would like to know more about it: - What is the flow of this feature? - Is there a way to configure the set of files, so that any changes to them would r

Re: High cpu ratio when solr sleep

2014-01-17 Thread YouPeng Yang
Hi Mikhail Khludnev, I confirm that there are no requests at all, and I have checked with jstat that nothing is abnormal. As for autowarming, I have set it up, but there are no commits or optimizes on these cores. On the instance, there are about 22 cores whose data dirs are on HDFS.