Re: Suggest dictionaries not rebuilding after restart

2014-11-13 Thread Walter Underwood
We get no suggestions until we force a build with suggest.build=true. Maybe we need to define a spellchecker component to get that behavior? wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ On Nov 13, 2014, at 10:56 PM, Michael Sokolov wrote: > I believe the spel
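A forced build like the one described above can be sketched as a request URL. This is only an illustration: the host, the core name `collection1`, the `/suggest` handler path, and the sample query are assumptions, since the handler path is whatever the local solrconfig.xml defines.

```python
from urllib.parse import urlencode


def suggest_build_url(base_url, handler="/suggest", query="a"):
    """Build a suggester request URL that also asks Solr to rebuild the dictionary.

    base_url and handler are hypothetical; adjust them to the local setup.
    """
    params = {
        "suggest": "true",
        "suggest.build": "true",  # forces the dictionary build on this request
        "suggest.q": query,
        "wt": "json",
    }
    return f"{base_url}{handler}?{urlencode(params)}"


url = suggest_build_url("http://localhost:8983/solr/collection1")
print(url)
```

Issuing such a request once after each restart is a workaround, not an explanation of why the dictionaries are not reloaded.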

Re: Suggest dictionaries not rebuilding after restart

2014-11-13 Thread Michael Sokolov
I believe the spellchecker component persists these indexes now and reloads them on restart rather than rebuilding. -Mike On 11/13/14 7:40 PM, Walter Underwood wrote: We have to manually rebuild the suggest dictionaries after a restart. This seems odd, since someone else had a problem because

Handling growth

2014-11-13 Thread Patrick Henry
Hello everyone, I am working with a Solr collection that is several terabytes in size over several hundred million documents. Each document is very rich, and over the past few years we have consistently quadrupled the size of our collection annually. Unfortunately, this sits on a single node w

Re: Can we query on _version_field ?

2014-11-13 Thread S.L
Garth and Erick, I am now successfully able to auto generate ids using UUID updateRequestProcessorChain , by giving the id type of string . Thanks for your help folks. On Thu, Nov 13, 2014 at 1:31 PM, Garth Grimm < garthgr...@averyranchconsulting.com> wrote: > So it sounds like you’re OK with u

Resource closing of CloudSolrServer

2014-11-13 Thread Phanindra R
Hi, Our indexing job and expiration job run every ~60 minutes (for about 10 minutes) in the test environment, within the same JVM. Every job creates a new CloudSolrServer (the decision was taken keeping other parts of the system design in mind) and invokes shutdown() after it's complete. We have been seeing

Suggest dictionaries not rebuilding after restart

2014-11-13 Thread Walter Underwood
We have to manually rebuild the suggest dictionaries after a restart. This seems odd, since someone else had a problem because they did rebuild after restart. We’re running 4.7 and our dictionaries are configured like this. We do this for several fields. fieldName FuzzyLookupF

Re: Two Spellcheck Components in a Single Solr Search

2014-11-13 Thread Bruno René Santos
Hi, I had a similar problem for categories and locations of companies. The way I did it was to join everything (categories and locations) on a single field, join the two values with a predefined separator and query the spellchecker for the whole sentence. The spellchecker keeps the separator on th

Two Spellcheck Components in a Single Solr Search

2014-11-13 Thread Carlos Maroto
Hi, Has anyone configured two spellchecker components in Solr so that a single search returns two different sets of suggestions? *Use Case:* Combined index of business names and categories of those businesses *Sample Query:* thisle (misspelling by the user) *Expected Results:* Thistle (act

Re: Can we query on _version_field ?

2014-11-13 Thread Erick Erickson
bq: ..._version_ will change on updates" , shouldnt that be OK Absolutely not OK. Lucene/Solr relies on the uniqueKey being identical to define different documents. So if you update a doc it _must_ have the same uniqueKey or it gets added as a completely new document in addition to the old one

Re: Can we query on _version_field ?

2014-11-13 Thread Garth Grimm
So it sounds like you’re OK with using the docURL as the unique key for routing in SolrCloud, but you don’t want to use it as a lookup mechanism. If you don’t want to do a hash of it and use that unique value in a second unique field at feed time, and you can’t seem to find any other field that

Re: Can we query on _version_field ?

2014-11-13 Thread Michael Della Bitta
You could also find a natural key that doesn't look like an ID and create a name-based (Type 3) UUID out of it, with something like Java's nameUUIDFromBytes: https://docs.oracle.com/javase/7/docs/api/java/util/UUID.html#nameUUIDFromBytes%28byte%5B%5D%29 Implementations of this exist in other l
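The name-based UUID idea can be sketched in Python as follows. As far as I understand it, Java's `nameUUIDFromBytes` takes the MD5 digest of the raw bytes (with no namespace prefix) and stamps the version 3 and variant bits onto it; the helper name and the sample URL below are made up for illustration.

```python
import hashlib
import uuid


def name_uuid_from_bytes(name: bytes) -> uuid.UUID:
    """Derive a repeatable version 3 UUID from arbitrary bytes.

    The MD5 digest supplies the 16 bytes; uuid.UUID then overwrites the
    version and variant bits so the result is a well-formed type 3 UUID.
    """
    digest = hashlib.md5(name).digest()
    return uuid.UUID(bytes=digest, version=3)


# Hypothetical natural key: the document URL.
doc_id = name_uuid_from_bytes("http://example.com/doctor/123".encode("utf-8"))
print(doc_id)
```

Because the value depends only on the input bytes, every replica that sees the same natural key computes the same id.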

Re: Can we query on _version_field ?

2014-11-13 Thread S.L
I am not sure if this is a case of the XY problem. I have no control over the URLs to deduce an id from them; those are from the www. I made the URL the uniqueKey, so that the document gets replaced when a new document with that URL comes in. To do the detail lookup I can either use the same as it is

Re: Edismax Phrase Search

2014-11-13 Thread Ahmet Arslan
Hi David, The pf, pf2, and pf3 parameters were invented for exactly your use case. They automatically create artificial clauses to boost the documents you describe. https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser Ahmet On Thursday, November 13, 2014 6:43 PM, David Philip w
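The pf-based approach in this reply can be sketched as a set of request parameters. This is a minimal illustration, not a tuned configuration: the field names `name` and `description` and the slop value are assumptions, while `defType`, `q`, `qf`, `pf`, and `ps` are standard edismax parameters.

```python
from urllib.parse import urlencode


def edismax_phrase_params(user_query, fields, slop):
    """Build edismax parameters that boost documents matching the query as a phrase.

    fields are hypothetical field names; slop is the phrase slop (ps) applied to pf.
    """
    field_list = " ".join(fields)
    return {
        "defType": "edismax",
        "q": user_query,     # terms are matched individually against qf
        "qf": field_list,
        "pf": field_list,    # extra boost when the terms occur together as a phrase
        "ps": str(slop),     # how far apart the terms may be and still earn the pf boost
    }


params = edismax_phrase_params("red apples", ["name", "description"], 3)
print(urlencode(params))
```

With slop above zero, tighter matches tend to score higher than looser ones, which approximates "exact phrase first, then proximity"; strict tiering would need separate boosts.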

Edismax Phrase Search

2014-11-13 Thread David Philip
Hi All, How to do a phrase search and then term proximity search using edismax query parser? For ex: If the search term is "red apples", the products having "red apples" in their fields should be returned first and then products having red apples with term proximity of n. Thanks. David

Re: Pivot performance

2014-11-13 Thread Neil Ireson
I thought for completeness I’d try and find which version change caused the issue and in fact the performance was fine up to and including 4.9.0 and so the problem seems to have appeared only since the latest version. N > On 13 Nov 2014, at 14:46, Neil Ireson wrote: > > I found a post > (ht

Re: Can we query on _version_field ?

2014-11-13 Thread Shawn Heisey
On 11/12/2014 10:45 PM, S.L wrote: > We know that _version_field is a mandatory field in solrcloud schema.xml, > it is expected to be of type long , it also seems to have unique value in a > collection. > > However the query of the form > http://server1.mydomain.com:7344/solr/collection1/select/?q=

Re: Can we query on _version_field ?

2014-11-13 Thread S.L
Erick, 1. "_version_ will change on updates": shouldn't that be OK? My understanding of update here is that a new document will be inserted with the same unique key in my case, which will effectively replace the document. This will not be an issue in my case because the initial search resu

Updating solrconfig.xml with zookeeper & HDFS

2014-11-13 Thread Joseph Obernberger
I wanted to make a change to the solrconfig.xml file in my 4.10.2 solr cloud cluster. I modified the files and put it in /tmp/conf - the only file in that directory. I then executed: ./zkcli.sh -cmd upconfig -zkhost list_of_hosts -d /tmp/conf -n ConfigName These ran successfully, and I was able t

Re: Can we query on _version_field ?

2014-11-13 Thread Erick Erickson
_version_ will change on updates I'm pretty sure, so I doubt it's suitable. I _think_ you can use a UUIDUpdateProcessorFactory here. I haven't checked this personally, but the idea here is that the UUID cannot be assigned on the shard. But if you're checking this out, if the UUID is assigned _befo

Re: Can we query on _version_field ?

2014-11-13 Thread S.L
Here is why I want to do this . 1. My unique key is a http URL, doctorURL. 2. If I do a look up based on URL , I am bound to face issues with character escaping and all. 3. To avoid that I was using a UUID for look up , but in SolrCloud it generates unique per replica , which is not acceptable. 4.

Re: Pivot performance

2014-11-13 Thread Neil Ireson
I found a post (http://lucene.472066.n3.nabble.com/Solr-4-3-Pivot-Performance-Issue-td4074617.html ) commenting that the pivot performance issue happened after version 4.0.0. So I ran my test on version 4.0.0 a

Re: Different ids for the same document in different replicas.

2014-11-13 Thread Erick Erickson
bq: can this be used as an unique value instead of generating the hashcode for the urlField Don't do this. The _version_ field is used internally for optimistic locking etc. I'd be _very_ cautious about co-opting this for anything else. Best, Erick On Thu, Nov 13, 2014 at 8:14 AM, Meraj A. Khan

Re: Can we query on _version_field ?

2014-11-13 Thread Erick Erickson
Really, I have to ask why you would want to. This is really purely an internal thing. I don't know what practical value there would be to search on this? Interestingly, I can search _version_:[100 TO *], but specific searches seem to fail. I wonder if there's something wonky going on with sea

Pivot performance

2014-11-13 Thread Neil Ireson
Hi all, I was running an experiment which involved counting terms by day, so I was using pivot facets to get the counts. However as the number of time and term values increased the performance got very rubbish. So I knocked up a quick test, using a collection of 1 million documents with a diffe

boosting not working in solr

2014-11-13 Thread rahulmodi
Hi All, I have to achieve a static boost in Solr. To apply static boosting I have already gone through the link http://wiki.apache.org/solr/QueryElevationComponent . I did it the same way as described in that link, but boosting is still not working, and I don't know why. My elevate.xml looks like:

Re: Different ids for the same document in different replicas.

2014-11-13 Thread Meraj A. Khan
Thanks. I also noticed that the mandatory _version_ field is also uniquely generated for every document in the collection. Can this be used as a unique value instead of generating the hashcode for the urlField? I want to avoid creation of a custom unique field if _version_ field which is mandat

Re: Different ids for the same document in different replicas.

2014-11-13 Thread Garth Grimm
OK. So it sounds like doctorURL is a good key, but you don’t like the special characters. I’ve used MD5 hashes of URLs before as a way to convert unique URLs into unique alphanumeric strings in a repeatable way. I think most programming languages contain libraries for doing that as you feed t
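The MD5 approach mentioned here can be sketched in a few lines. The function name and sample URL are illustrative only; the point is that the same URL always yields the same 32-character hexadecimal string, with none of the characters that need escaping in a query.

```python
import hashlib


def url_to_key(url: str) -> str:
    """Map a URL to a repeatable 32-character hexadecimal key via MD5."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()


# Hypothetical document URL used as the natural key.
key = url_to_key("http://example.com/doctor/123?ref=a&b=c")
print(key)
```

Computing the key on the client at feed time, before the document reaches Solr, keeps it identical across replicas.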

Re: Problems after upgrade 4.10.1 -> 4.10.2

2014-11-13 Thread Thomas Lamy
Hi, a big thank you to Jeon Woosung - we just upgraded our cloud to 4.10.2. One correction: we had to use /collections/{collection}/leader_initiated_recovery/shard1/node5, where "node5" had to be replaced with the place the down node showed up in the solr cloud dashboard. Also no tomcat restar

Re: Solr: IndexNotFoundException: no segments* file HdfsDirectoryFactory

2014-11-13 Thread Norgorn
Yes, it's late, but I've faced the same problem and this question is the only one relevant to the problem in Google results, so I hope it'll help someone. For me, adding these two strings to solrconfig solved the problem ${solr.data.dir:hdfs://192.168.22.11:9001/solr} true In docs it's said that there is