Re: Lucene/Solr 5.0 and custom FieldCache implementation

2015-08-26 Thread Jamie Johnson
Sorry to poke this again but I'm not following the last comment of how I could go about extending the solr index searcher and have the extension used. Is there an example of this? Again thanks Jamie On Aug 25, 2015 7:18 AM, "Jamie Johnson" wrote: > I had seen this as well, if I over wrote this

Re: StrDocValues

2015-08-26 Thread Yonik Seeley
On Wed, Aug 26, 2015 at 6:20 PM, Jamie Johnson wrote: > I don't see it explicitly mentioned, but does the boost only get applied to > the final documents/score that matched the provided query or is it called > for each field that matched? I'm assuming only once per document that > matched the mai

Re: find documents based on specific term frequency

2015-08-26 Thread Chris Hostetter
: "Is there a way to search for documents that have a word appearing more : than a certain number of times? For example, I want to find documents : that only have more than 10 instances of the word "genetics" …" Try... q=text:genetics&fq={!frange+incl=false+l=10}termfreq('text','genetics') No
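
The filter suggested above can be assembled programmatically. The following is a minimal sketch, assuming the field is called `text` as in the reply; the helper name `termfreq_params` is hypothetical and not part of Solr or any client library. It only builds and URL-encodes the `q` and `fq` parameters and sends nothing to a server.

```python
from urllib.parse import parse_qs, urlencode


def termfreq_params(field, term, min_exclusive):
    """Build q/fq parameters for documents where `term` occurs more than
    `min_exclusive` times in `field` (hypothetical helper)."""
    return {
        # Main query: the term has to match at all.
        "q": f"{field}:{term}",
        # frange over termfreq(); incl=false makes the lower bound exclusive.
        "fq": f"{{!frange incl=false l={min_exclusive}}}termfreq('{field}','{term}')",
    }


params = termfreq_params("text", "genetics", 10)
query_string = urlencode(params)  # spaces become '+', as in the reply above
decoded = parse_qs(query_string)
```

Decoding `query_string` again gives back the same `fq` value, so the encoded form can be appended to a select URL as-is.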

Re: StrDocValues

2015-08-26 Thread Jamie Johnson
I don't see it explicitly mentioned, but does the boost only get applied to the final documents/score that matched the provided query or is it called for each field that matched? I'm assuming only once per document that matched the main query, is that right? On Wed, Aug 26, 2015 at 5:35 PM, Jamie

find documents based on specific term frequency

2015-08-26 Thread Tang, Rebecca
Hi there, We have an index built on Solr 5.0. We received a user question: "Is there a way to search for documents that have a word appearing more than a certain number of times? For example, I want to find documents that only have more than 10 instances of the word "genetics" …" I'm not sure

Re: StrDocValues

2015-08-26 Thread Jamie Johnson
I think I found it. {!boost..} gave me what i was looking for and then a custom collector filtered out anything that I didn't want to show. On Wed, Aug 26, 2015 at 1:48 PM, Jamie Johnson wrote: > Are there any example implementation showing how StrDocValues works? I am > not sure if this is th

Securing Solr 5.3 with Basic Authentication

2015-08-26 Thread Gofio Code
With version 5.3 Solr has full-featured authentication and authorization plugins that use Basic authentication and “permission rules” which are completely driven from ZooKeeper. So I have tried that, without success, following the info in https://cwiki.apache.org/confluence/display/solr/Securing+So

Re: Search opening hours

2015-08-26 Thread Darren Spehr
So thanks to the tireless efforts of David Smiley and the devs at Vivid Solutions (not to mention the various contributors that help power Solr and Lucene) spatial search is awesome, efficient and easy. The biggest roadblock I've run into is not having the JTS (Java Topology Suite) JAR where Solr

Data Import Handler use of JNDI decayed

2015-08-26 Thread Davis, Daniel (NIH/NLM) [C]
NLM tends to be rather security conscious. Nothing appears terribly wrong, but the layout of Solr doesn't include Jetty's start.ini or jetty.xml. It will have to be the detailed way - https://wiki.eclipse.org/Jetty/Feature/JNDI#Detailed_Setup Once I've figured it out, I'll request wiki edit per

Re: StrDocValues

2015-08-26 Thread Mikhail Khludnev
Hello Jamie, Check here https://github.com/apache/lucene-solr/blob/7f721a1f9323a85ce2b5b35e12b4788c31271b69/lucene/sandbox/src/java/org/apache/lucene/search/DocValuesRangeQuery.java#L185 Note, SortedSet works there even if an actual field is multivalue=false On Wed, Aug 26, 2015 at 8:48 PM, Jami

Is Solr ready for Nested Documents importing and querying ?

2015-08-26 Thread Rafael
Hi, I'm using solr and I'm starting to index my database. I work for a book seller, but we have a lot of different publications (i.e: different editions from different publishers ) for the same book, and I was wondering if it would be wise to model this "schema" using a hierarchical approach (with

Re: Search opening hours

2015-08-26 Thread O. Klein
Darren, This sounds like the solution I'm looking for. Especially nice fix for the Sunday-Monday problem. Never worked with spatial search before, so any pointers are welcome. Will start working on this solution. -- View this message in context: http://lucene.472066.n3.nabble.com/Search-opening

StrDocValues

2015-08-26 Thread Jamie Johnson
Are there any example implementation showing how StrDocValues works? I am not sure if this is the right place or not, but I was thinking about having some document level doc value that I'd like to read in a function query to impact if the document is returned or not. Am I barking up the right tre

Re: Exact substring search with ngrams

2015-08-26 Thread Upayavira
analysis tab does not support multi-valued fields. It only analyses a single field value. On Wed, Aug 26, 2015, at 05:05 PM, Erick Erickson wrote: > bq: my dog > has fleas > I wouldn't want some variant of "og ha" to match, > > Here's where the mysterious "positionIncrementGap" comes in. If you

Re: New Solr installation fails to create collection/core

2015-08-26 Thread Erick Erickson
Deviantcode, did you look at the referenced JIRA: https://issues.apache.org/jira/browse/SOLR-7826 Or is that irrelevant? Best, Erick On Wed, Aug 26, 2015 at 1:58 AM, deviantcode wrote: > I run into this exact problem trying out the latest solr, [5.3.0], @Scott, > how did you fix it? > KR > Hen

Re: Exact substring search with ngrams

2015-08-26 Thread Erick Erickson
bq: my dog has fleas I wouldn't want some variant of "og ha" to match, Here's where the mysterious "positionIncrementGap" comes in. If you make this field "multiValued", and index this like this: my dog has fleas or equivalently in SolrJ just doc.addField("blah", "my dog"); doc.addField("blah
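
To illustrate the idea in the reply above, here is a simplified model of how a position gap separates the values of a multiValued field. It is a sketch under stated assumptions, not Lucene's indexing code: tokens are split on whitespace, the gap of 100 is an assumed setting, and both function names are hypothetical.

```python
def assign_positions(values, gap):
    """Return (token, position) pairs for the values of one multiValued field.

    Simplified model: each token sits one position after the previous one,
    and `gap` extra positions are added between consecutive values.
    """
    positions = []
    position = -1
    for index, value in enumerate(values):
        if index > 0:
            position += gap  # jump ahead before the next value starts
        for token in value.split():
            position += 1
            positions.append((token, position))
    return positions


def token_distance(positions, first, second):
    """Distance in positions between two tokens that each occur once."""
    lookup = dict(positions)
    return lookup[second] - lookup[first]


single = assign_positions(["my dog has fleas"], gap=100)
multi = assign_positions(["my dog", "has fleas"], gap=100)
```

In this model "dog" and "has" are adjacent when the text is indexed as one value, but far apart when it is indexed as two values, so a phrase query with a slop smaller than the gap would not match across the value boundary.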

Re: Connect and sync two solr server

2015-08-26 Thread Erick Erickson
From the description, this is straightforward SolrCloud where you have replicas on the separate machines, see: https://cwiki.apache.org/confluence/display/solr/Getting+Started+with+SolrCloud A different way of accomplishing this would be the master/slave style, see: https://cwiki.apache.org/conf

Re: re:New Solr installation fails to create core

2015-08-26 Thread deviantcode
Hi Scott, How about, having logged in as a privileged user, running create_core as solr? Something like this on a Red Hat env: sudo -u solr ./bin/solr create_core -c demo KR Henry -- View this message in context: http://lucene.472066.n3.nabble.com/re-New-Solr-installation-fails-to-create-co

Re: Search opening hours

2015-08-26 Thread Darren Spehr
Sorry - didn't finish my thought. I need to address querying :) So using the above to define what's in the index your queries for a day/time become a CONTAINS operation against the field. Let's say that the field is defined as a location_rpt using JTS and its Spatial Factory (which supports polygon

Re: how to index document with multiple words (phrases) and words permutation?

2015-08-26 Thread afrooz
Simon, Thanks a lot. that is a great tool . I am trying to use it. Great solution. -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-index-document-with-multiple-words-phrases-and-words-permutation-tp4224919p4225425.html Sent from the Solr - User mailing list archive at

Re: Solr 5.2.1 versus Solr 4.7.0 performance

2015-08-26 Thread Shawn Heisey
On 8/26/2015 1:11 AM, Esther Goldbraich wrote: > We have benchmarked a set of queries on Solr 4.7.0 and 5.2.1 (with same > data, same solrconfig.xml) and saw better query performance on Solr 4.7.0 > (5-15% better than 5.2.1, with an exception of 100% improvement for one of > the queries ). > Usi

Re: Search opening hours

2015-08-26 Thread Darren Spehr
Sure - and sorry for its density. I reread it and thought the same ;) So imagine a polygon of say 1/2 mile width (I made that up) that stretches around the equator. Let's call this a week's timeline and subdivide it into 7 blocks, one for each day. For the sake of simplicity assume it's a line (wh

Re: Solr performance is slow with just 1GB of data indexed

2015-08-26 Thread Zheng Lin Edwin Yeo
Thanks for your recommendation Toke. Will try to ask in the carrot forum. Regards, Edwin On 26 August 2015 at 18:45, Toke Eskildsen wrote: > On Wed, 2015-08-26 at 15:47 +0800, Zheng Lin Edwin Yeo wrote: > > > Now I've tried to increase the carrot.fragSize to 75 and > > carrot.summarySnippets t

Re: Search opening hours

2015-08-26 Thread Upayavira
"delightfully dense" = really intriguing, but I couldn't quite understand it - really hoping for more info On Wed, Aug 26, 2015, at 03:49 PM, Upayavira wrote: > Darren, > > That was delightfully dense. Do you think you could unpack it a bit > more? Possibly some sample (pseudo) queries? > > Upay

Re: Search opening hours

2015-08-26 Thread Upayavira
Darren, That was delightfully dense. Do you think you could unpack it a bit more? Possibly some sample (pseudo) queries? Upayavira On Wed, Aug 26, 2015, at 03:02 PM, Darren Spehr wrote: > If you wanted to try a spatial approach that blended times like above, > you > could try a polygon of minim

Re: Search opening hours

2015-08-26 Thread Darren Spehr
If you wanted to try a spatial approach that blended times like above, you could try a polygon of minimum width that spans the globe - this is literally using spatial search (geocodes) against time. So in this scenario you logically subdivide the polygon into 7 distinct regions (for days) and then

Connect and sync two solr server

2015-08-26 Thread shahper
Hi, I want to connect two SolrCloud servers and sync their indexes with each other, so that if either server is down we can work with the other, and whenever I update or add to the index on either server the other also gets updated. shahper

Re: splitting shards on 4.7.2 with custom plugins

2015-08-26 Thread Jeff Courtade
I'm looking at the clusterstate.json to see why it is doing this. I really don't understand it though... {"collection1":{ "shards":{ "shard1":{ "range":"8000-", "state":"active", "replicas":{ "core_node1":{ "state":"active",

Re: splitting shards on 4.7.2 with custom plugins

2015-08-26 Thread Jeff Courtade
Hi, So I got the shards to split. But they are very unbalanced. 7204922 total docs on the original collection: shard1_0 numdocs 3661699, shard1_1 numdocs 3543132, shard2_0 numdocs 0, shard2_1 numdocs 0. Any ideas? This is what I had to do to get this to split with the custom libs I got shard
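
The counts quoted in this message can be checked with a few lines of arithmetic. This is only a sketch using the figures from the post; the helper name `split_report` is hypothetical, and the check says nothing about why the split is unbalanced.

```python
def split_report(total_docs, shard_counts):
    """Compare the per-shard numDocs after a split with the original total."""
    indexed = sum(shard_counts.values())
    return {
        "indexed": indexed,
        "missing": total_docs - indexed,
        "empty_shards": sorted(
            name for name, count in shard_counts.items() if count == 0
        ),
    }


# Figures taken from the message above.
report = split_report(
    7204922,
    {"shard1_0": 3661699, "shard1_1": 3543132, "shard2_0": 0, "shard2_1": 0},
)
```

With these figures the sub-shards of shard1 hold 7204831 documents between them, 91 fewer than the quoted total, while both sub-shards of shard2 are empty.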

Re: Solr performance is slow with just 1GB of data indexed

2015-08-26 Thread Toke Eskildsen
On Wed, 2015-08-26 at 15:47 +0800, Zheng Lin Edwin Yeo wrote: > Now I've tried to increase the carrot.fragSize to 75 and > carrot.summarySnippets to 2, and set the carrot.produceSummary to > true. With this setting, I'm mostly able to get the cluster results > back within 2 to 3 seconds when I set

Re: Tokenizers and DelimitedPayloadTokenFilterFactory

2015-08-26 Thread Jamie Johnson
Thanks again Erick, I created https://issues.apache.org/jira/browse/SOLR-7975, though I didn't attach a patch because my current implementation is not useful generally right now; it meets my use case but likely would not meet others. I will try to look at generalizing this to allow something cu

Re: Behavior of grouping on a field with same value spread across shards.

2015-08-26 Thread Modassar Ather
Thanks Erick. On Wed, Aug 26, 2015 at 12:11 PM, Erick Erickson wrote: > That should be the case. > > Best, > Erick > > On Tue, Aug 25, 2015 at 8:55 PM, Modassar Ather > wrote: > > Thanks Erick, > > > > I saw the link. So is it that the grouping functionality works fine in > > distributed se

Re: Search opening hours

2015-08-26 Thread Upayavira
On Wed, Aug 26, 2015, at 10:17 AM, O. Klein wrote: > Those options don't fix my problem with closing times the next morning, > or is > there a way to do this? Use the spatial model, and a time window of a week. There are 10,080 minutes in a week, so you could use that as your scale. Assuming th
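
The minute-of-week scale mentioned above can be sketched as follows. This covers only the time arithmetic, not the Solr field or query setup; treating Monday as day 0 is an assumption, and all function names are hypothetical. An opening period that closes the next morning simply ends at a later offset, and one that runs past the end of the week is split into two ranges.

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # 10,080


def minute_of_week(day, hour, minute):
    """Map (day, hour, minute) to a minute offset; day 0 is Monday (assumed)."""
    return (day * 24 + hour) * 60 + minute


def opening_ranges(open_at, close_at):
    """Return non-wrapping (start, end) ranges for one opening period."""
    if close_at >= open_at:
        return [(open_at, close_at)]
    # The period runs past the end of the week, so split it in two.
    return [(open_at, MINUTES_PER_WEEK - 1), (0, close_at)]


def is_open(ranges, now):
    """True if the offset `now` falls inside any of the ranges."""
    return any(start <= now <= end for start, end in ranges)


# Saturday 20:00 to Sunday 03:00 stays inside the week.
saturday_night = opening_ranges(minute_of_week(5, 20, 0), minute_of_week(6, 3, 0))
# Sunday 20:00 to Monday 03:00 wraps around the end of the week.
sunday_night = opening_ranges(minute_of_week(6, 20, 0), minute_of_week(0, 3, 0))
```

Each range could then be indexed as one interval on the week scale, and "open now" becomes a check of whether the current minute offset falls inside any stored range.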

Re: Exact substring search with ngrams

2015-08-26 Thread Christian Ramseyer
On 26/08/15 00:24, Erick Erickson wrote: > Hmmm, this sounds like a nonsensical question, but "what do you mean > by arbitrary substring"? > > Because if your substrings consist of whole _tokens_, then ngramming > is totally unnecessary (and gets in the way). Phrase queries with no slop > fulfill

Re: Search opening hours

2015-08-26 Thread O. Klein
Those options don't fix my problem with closing times the next morning, or is there a way to do this? -- View this message in context: http://lucene.472066.n3.nabble.com/Search-opening-hours-tp4225250p4225354.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Hash of solr documents

2015-08-26 Thread david . davila
Yes, it's an XY problem :) We are making the first tests to split our shard (Solr 5.1). The problem we have is this: the number of documents indexed in the new shards is lower than in the original one (19814 and 19653, vs 61100), and always the same. We have no idea why Solr is doing this. A p

Re: New Solr installation fails to create collection/core

2015-08-26 Thread deviantcode
I run into this exact problem trying out the latest solr, [5.3.0], @Scott, how did you fix it? KR Henry -- View this message in context: http://lucene.472066.n3.nabble.com/re-New-Solr-installation-fails-to-create-core-tp4221768p4225350.html Sent from the Solr - User mailing list archive at Nabb

Re: Search opening hours

2015-08-26 Thread Stefan Matheis
Have a look at the links that Alexandre mentioned. a somewhat non-obvious style solution because you'd probably not think about spatial features while dealing with opening time - but it's worth having a look. -Stefan On Wednesday, August 26, 2015 at 10:16 AM, O. Klein wrote: > Thank you for

Re: Please answer my question on StackOverflow ... "Best approach to guarantee commits in SOLR"

2015-08-26 Thread Charlie Hull
On 25/08/2015 13:21, Simer P wrote: http://stackoverflow.com/questions/32138845/what-is-the-best-approach-to-guarantee-commits-in-apache-solr . *Question:* How can I get "guarantee commits" with Apache SOLR where persisting data to disk and visibility are both equally important ? *Background:*

Re: best way for adding a new field to all indexed documents...

2015-08-26 Thread Mikhail Khludnev
Sadly, it's always a problem http://searchivarius.org/blog/how-rename-fields-solr On Wed, Aug 26, 2015 at 11:20 AM, Roxana Danger < roxana.dan...@reedonline.co.uk> wrote: > Hello, >I have a index created with solr, and I would like to add a new > field to all the documents of the index.

Re: Hash of solr documents

2015-08-26 Thread Anshum Gupta
Hi David, The route key itself is indexed, but not the hash value. Why do you need to know and display the hash value? This seems like an XY problem to me: http://people.apache.org/~hossman/#xyproblem On Wed, Aug 26, 2015 at 1:17 AM, wrote: > Hi, > > I have read in one post in the Internet that

best way for adding a new field to all indexed documents...

2015-08-26 Thread Roxana Danger
Hello, I have an index created with Solr, and I would like to add a new field to all the documents of the index. I suppose I could a) use an updateRequestHandler or b) create another index importing the data from the initial index and the data of my new field. Which would be the best approach

Hash of solr documents

2015-08-26 Thread david . davila
Hi, I have read in one post in the Internet that the hash Solr Cloud calculates over the key field to send each document to a different shard is indexed. Is this true? If true, is there any way to show this hash for each document? Thanks, David

Re: Search opening hours

2015-08-26 Thread O. Klein
Thank you for responding. Yonik's solution is what I had in mind. Was hoping for something more elegant, as he said, but it will work. The thing I haven't figured out is how to deal with closing times early morning next day. So it's 22:00 now and opening hours are 20:00 to 03:00 Can this be don

Re: Search opening hours

2015-08-26 Thread Upayavira
On Tue, Aug 25, 2015, at 10:54 PM, Yonik Seeley wrote: > On Tue, Aug 25, 2015 at 5:02 PM, O. Klein wrote: > > I'm trying to find the best way to search for stores that are open NOW. > > It's probably not the *best* way, but assuming it's currently 4:10pm, > you could do > > +open:[* TO 1610] +

Re: Solr performance is slow with just 1GB of data indexed

2015-08-26 Thread Zheng Lin Edwin Yeo
Hi Toke, Thank you for the link. I'm using Solr 5.2.1 but I think the bundled carrot2 will be a slightly older version, as I'm using the latest carrot2-workbench-3.10.3, which was only released recently. I've changed all the settings like fragSize and desiredClusterCountBase to be the same on both si

Solr 5.2.1 versus Solr 4.7.0 performance

2015-08-26 Thread Esther Goldbraich
Hello, We have benchmarked a set of queries on Solr 4.7.0 and 5.2.1 (with same data, same solrconfig.xml) and saw better query performance on Solr 4.7.0 (5-15% better than 5.2.1, with an exception of 100% improvement for one of the queries ). Using same JVM (IBM 1.7) and JVM params. Index's size