Re: Limit fields returned in solr based on content

2015-12-24 Thread Jamie Johnson
Would the doc transformer have access to payloads? On Dec 24, 2015 4:21 PM, "Upayavira" wrote: > You could create a custom DocTransformer. They can enhance the fields > included in the search results. So, instead of fl=somefield you could > have fl=[my-filter:somefield], and your MyFieldDocTransf

Re: Limit fields returned in solr based on content

2015-12-24 Thread Jamie Johnson
I'm currently doing it in a middle tier, but it means I can't return results from the index to users; instead it always needs to hit the store. Not the end of the world, but I was hoping I could use the fields in the index as a quick first view and then get the full result when the user selected an en

Re: Running Lucene/SOR on Hadoop

2015-12-24 Thread Dino Chopins
Hi Erick, Thank you for your response and pointer. What I mean by running Lucene/SOLR on Hadoop is to have Lucene/SOLR index available to be queried using mapreduce or any best practice recommended. I need to have this mechanism to do large scale row deduplication. Let me elaborate why I need thi

Re: Limit fields returned in solr based on content

2015-12-24 Thread Walter Underwood
I would do that in a middle tier. You can’t do every single thing in Solr. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Dec 24, 2015, at 1:21 PM, Upayavira wrote: > > You could create a custom DocTransformer. They can enhance the fields > includ
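A minimal sketch of the middle-tier approach suggested here, in Python. The per-field `marking` layout, the marking names, and `filter_fields` are all hypothetical; the thread does not say how the markings are stored.

```python
from typing import Any, Dict, Set


def filter_fields(doc: Dict[str, Dict[str, Any]], allowed: Set[str]) -> Dict[str, Any]:
    """Return only the field values whose marking the user may see.

    Assumed (hypothetical) layout: each field maps to
    {"value": ..., "marking": ...}. Fields without a marking are dropped.
    """
    return {
        name: field["value"]
        for name, field in doc.items()
        if field.get("marking") in allowed
    }


# Illustrative document, not taken from the thread.
doc = {
    "title": {"value": "Quarterly report", "marking": "public"},
    "notes": {"value": "Internal draft", "marking": "restricted"},
}
visible = filter_fields(doc, {"public"})
```

Filtering outside Solr keeps the access decision in one place, at the cost Jamie describes elsewhere in the thread: results have to pass through the middle tier before they reach the user.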

Re: Limit fields returned in solr based on content

2015-12-24 Thread Upayavira
You could create a custom DocTransformer. They can enhance the fields included in the search results. So, instead of fl=somefield you could have fl=[my-filter:somefield], and your MyFieldDocTransformer makes the decision as to whether or not to include somefield in the output. This would of course

Re: mlt and document boost

2015-12-24 Thread Upayavira
If you are going to go that far, you can get the parsed query from the debug output, but seriously, if you are using a recent Solr and don't need the stream.body functionality in MLT, then use the MLT query parser. It is by far the best way to do it, as you get all the features of other query pars

Re: Limit fields returned in solr based on content

2015-12-24 Thread Jamie Johnson
Sorry, hit send too early. Is there a mechanism in Solr/Lucene that allows customization of the fields returned that would have access to the field content and payload? On Dec 24, 2015 4:15 PM, "Jamie Johnson" wrote: > I have what I believe is a unique requirement discussed here in the past > to l

Limit fields returned in solr based on content

2015-12-24 Thread Jamie Johnson
I have what I believe is a unique requirement discussed here in the past to limit data sent to users based on some marking in the field.

Re: [Highlight] Storing one field, highlight with different analysers

2015-12-24 Thread Erick Erickson
Well, actually points 2 and 3 _do_ depend on the stored data. It's certainly true that the stored data won't have to be re-analyzed if you're using FVH, but the original text still needs to be present to highlight anything that would make sense (consider stopwords, stemming, all that. The user reall

Re: Unable to extract images content (OCR) from PDF files using Solr

2015-12-24 Thread Erick Erickson
Here's an example of what Upayavira is talking about. https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/ It has some RDBMS bits, but you can take those out. Best, Erick On Wed, Dec 23, 2015 at 1:27 AM, Upayavira wrote: > If your needs of Tika fall outside of those provided by the embed

Re: mlt and document boost

2015-12-24 Thread Tim Hearn
One workaround is to use the 'important terms' feature to grab the query generated by the MLT handler, then parse that list into your own solr query to use through a standard search handler. That way, you can get the same results as if you used the MLT handler, and you can also use filter querying

Geospatial search question - document with multiple locations

2015-12-24 Thread Tim Hearn
Hi everyone, Suppose I have the following fields in my schema: And I index multiple latlon coordinates to a document. Then I do a geofilt search against my index. When I do that geofilt search, will ALL locations associated with that document have to be within the 'circle' produced by geofi
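For reference, a geofilt filter of the kind described can be sketched in Python as below. The field name `locations` and the coordinates are hypothetical, since the schema fields were lost from the post. The sketch only builds the filter string and does not settle how multiple locations per document are matched.

```python
def geofilt_fq(sfield: str, lat: float, lon: float, distance_km: float) -> str:
    """Build a Solr geofilt filter query.

    sfield is the spatial field, pt the centre point as "lat,lon",
    and d the radius in kilometres.
    """
    return f"{{!geofilt sfield={sfield} pt={lat},{lon} d={distance_km}}}"


# Hypothetical field name and centre point.
fq = geofilt_fq("locations", 45.15, -93.85, 5)
```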

Re: Solr 5.4 leader selection

2015-12-24 Thread Erick Erickson
I wouldn't worry about it. I doubt you could even measure the change in overall cluster performance even if all three leaders were on the same node. The REBALANCELEADERS stuff was put in for cases of having 100s of leaders on the same machine. It won't hurt, but is almost certainly completely unn

post.jar with security.json

2015-12-24 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
In the old jetty-based implementation of Basic Authentication, one could use post.jar by running something like java -Durl="http://user:pswd@host:8983/solr/corename/update" -Dtype=application/xml -jar post.jar example.xml By what mechanism does one pass in the user name and password to post.ja

Re: Null pointer exception in spell checker at addchecker method

2015-12-24 Thread JoeWang
Thank you! It really helped me out.

Re: mlt and document boost

2015-12-24 Thread Upayavira
Which morelikethis are you using? Handler, SearchComponent or QueryParser? You should be able to wrap the mlt query parser with the boost query parser with no problem. Upayavira On Thu, Dec 24, 2015, at 05:18 AM, Binoy Dalal wrote: > Have you tried applying the boosts to individual fields with
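A sketch of what wrapping the mlt query parser in the boost query parser might look like, built as request parameters in Python. The document id, the `title` field, the `popularity` boost function, and the parameter names `mltq` and `boostfn` are assumptions for illustration; only the `{!mlt}` and `{!boost}` parsers themselves come from the thread.

```python
from urllib.parse import urlencode


def boosted_mlt_params(doc_id: str, qf: str, boost_fn: str) -> dict:
    """Build parameters for an MLT query wrapped in a boost query.

    The outer {!boost} parser takes its function from $boostfn and its
    wrapped query from $mltq via local-params dereferencing.
    """
    return {
        "q": "{!boost b=$boostfn v=$mltq}",
        "mltq": f"{{!mlt qf={qf}}}{doc_id}",
        "boostfn": boost_fn,
    }


# Hypothetical document id, field, and boost function.
params = boosted_mlt_params("doc42", "title", "log(popularity)")
query_string = urlencode(params)
```

Keeping the inner query in its own parameter avoids escaping one set of local params inside another.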

Re: Solr 6 Distributed Join

2015-12-24 Thread Joel Bernstein
I haven't had a chance to review. If you have a reproducible failure on a one-to-many join go ahead and create a jira ticket. Joel Bernstein http://joelsolr.blogspot.com/ On Thu, Dec 24, 2015 at 3:25 AM, Akiel Ahmed wrote: > Hi > > Did you get a chance to check whether one-to-many joins were co

Re: Solr 5.4 leader selection

2015-12-24 Thread Ishan Chattopadhyaya
Maybe, you could have a look at REBALANCELEADERS to see if this solves your usecase? https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-RebalanceLeaders On Thu, Dec 24, 2015 at 3:54 PM, wrote: > Hi guys , > i have a solr cluster of 3 nodes using solr cloud , and a co
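As a rough illustration, the REBALANCELEADERS call from the linked Collections API page can be built like this in Python. The host and the collection name `mycollection` are placeholders. As far as I know, the command only moves leadership to replicas that already carry the preferredLeader property, so that property would need to be set first.

```python
from urllib.parse import urlencode


def rebalance_leaders_url(base_url: str, collection: str) -> str:
    """Build the Collections API URL for a REBALANCELEADERS request."""
    query = urlencode({
        "action": "REBALANCELEADERS",
        "collection": collection,
        "wt": "json",
    })
    return f"{base_url}/admin/collections?{query}"


# Placeholder host and collection name.
url = rebalance_leaders_url("http://localhost:8983/solr", "mycollection")
```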

Solr 5.4 leader selection

2015-12-24 Thread tarek.abouzeid91
Hi guys, I have a Solr cluster of 3 nodes using SolrCloud, and a collection of 3 shards. Below is what's seen in cloud: Node 1: shard1_replica1, shard2_replica2; Node 2: shard2_replica1, shard3_replica2; Node 3: shard1_replica2, shard3_replica1. Leaders: Shard 1: Node

Re: How to check when a search exceeds the threshold of timeAllowed parameter

2015-12-24 Thread William Bell
Great, I must have missed that. On Wed, Dec 23, 2015 at 9:41 AM, Jeff Wartes wrote: > Looks like it’ll set partialResults=true on your results if you hit the > timeout. > > https://issues.apache.org/jira/browse/SOLR-502 > > https://issues.apache.org/jira/browse/SOLR-5986 > > On 12/22/15
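A small sketch of checking for that flag on the client side, in Python, assuming the response was requested as JSON and already parsed into a dict. The sample responses are made up for illustration and are not real Solr output.

```python
def hit_time_allowed(response: dict) -> bool:
    """Return True if Solr flagged the results as partial.

    Looks for partialResults in the responseHeader; a missing flag is
    treated as a complete result set.
    """
    header = response.get("responseHeader", {})
    return bool(header.get("partialResults", False))


# Illustrative responses only.
partial = {"responseHeader": {"status": 0, "partialResults": True}, "response": {"docs": []}}
complete = {"responseHeader": {"status": 0}, "response": {"docs": []}}
```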

Re: Solr 6 Distributed Join

2015-12-24 Thread Akiel Ahmed
Hi Did you get a chance to check whether one-to-many joins were covered in your tests? If yes, can you make any suggestions for what I could be doing wrong? Cheers Akiel From: Joel Bernstein To: solr-user@lucene.apache.org Date: 22/12/2015 13:03 Subject:Re: Solr 6 Distribut