FunctionQuery of FloatFieldSource (Lucene 5.0)

2015-07-14 Thread Peyman Faratin
Hi, I am having problems accessing float values in a Lucene 5.0 index via a FunctionQuery. My setup is as follows. Indexing time -- Document doc = new Document(); FieldType f = new FieldType(); f.setStored(false); f.setNumericType(NumericType.FLOAT); f.setDocValuesType(D

Re: FieldCache error for multivalued fields in json facets.

2015-07-14 Thread Iana Bondarska
Yonik, Upayavira, thanks for the response. Here is the stack trace from the Solr logs. I can make my field single-valued, but are there any plans to fix this, or in general should multivalued fields not be used for metric calculation? What about other metrics, e.g. avg, min, max -- should I be able to calcul

Re: Implementing MoreLikeThis feature

2015-07-14 Thread Upayavira
Look at your "interesting terms". If your index is too small, it will consider words like "and", "the", etc to be "interesting" and form a part of the query, thus returning your entire index, which doesn't help. Effectively what MLT does is attempt to pick the 25 (configurable) best terms in the s

Re: Persistence problem with swapped cores after Solr restart -- 4.9.1

2015-07-14 Thread Upayavira
Problems between keyboard and chair are the best kind. They are the easiest to resolve. If I were you, I'd be feeling *glad* it wasn't a bug. Upayavira On Tue, Jul 14, 2015, at 07:31 AM, Shawn Heisey wrote: > On 7/13/2015 10:02 PM, Erick Erickson wrote: > > Uggghh. Not persistence again > >

Re: Implementing MoreLikeThis feature

2015-07-14 Thread Zheng Lin Edwin Yeo
Thanks for your advice. I've indexed more content and it's working better now. The entire index is no longer returned every time. However, I found that longer documents tend to have a higher score than shorter documents, even though the shorter documents are supposed to have a bette

Re: Implementing MoreLikeThis feature

2015-07-14 Thread Upayavira
There are two ways to "tweak" MLT: use the parameters (such as minimum term frequency), or use stop words when indexing. I'd suggest you try those as a means to improve quality! Upayavira On Tue, Jul 14, 2015, at 09:28 AM, Zheng Lin Edwin Yeo wrote: > Thanks for your advice. I've indexe

Re: Suggester configuration queries.

2015-07-14 Thread ssharma7...@gmail.com
Alessandro Benedetti, Thanks for the links. Regards, Sachin Vyas.

dataDir config

2015-07-14 Thread sat
Hello, I am running Solr Version 4.10.4. When I change the dataDir-Parameter in solrconfig.xml and restart the Server, the change has no effect. The Index/Data path remains the standard ./data folder ? What do I have to do, to change the location where my index/data is stored ? Thank you! --

Solr 5 options

2015-07-14 Thread spleenboy
Many Thanks to those who helped me on my last post: I'm almost there. So here is the doc I need to index: { "doc": { "id":"2", "cus_name_s":"Paul Brown", "cus_email_t":["paul.br...@here.net"], "com_id_i":201, "com_name_s":"Berenices", "url_s":"domain.net/integration/"}}

Re: dataDir config

2015-07-14 Thread Konstantin Gribov
If you have migrated to "new" solr.xml and use core.properties in your config you can set dataDir there. вт, 14 июля 2015 г. в 13:00, sat : > Hello, > > I am running Solr Version 4.10.4. > > When I change the dataDir-Parameter in solrconfig.xml and restart the > Server, the change has no effect.

Re: Solr 5 options

2015-07-14 Thread Shawn Heisey
On 7/14/2015 4:44 AM, spleenboy wrote: > Many Thanks to those who helped me on my last post: I'm almost there. > So here is the doc I need to index: > { > "doc": > { > "id":"2", > "cus_name_s":"Paul Brown", > "cus_email_t":["paul.br...@here.net"], > "com_id_i":201, > "com_na

Re: dataDir config

2015-07-14 Thread Shawn Heisey
On 7/14/2015 3:46 AM, sat wrote: > I am running Solr Version 4.10.4. > > When I change the dataDir-Parameter in solrconfig.xml and restart the > Server, the change has no effect. The Index/Data path remains the standard > ./data folder ? > > What do I have to do, to change the location where my i

Re: XML File Size for Post.jar

2015-07-14 Thread Erik Hatcher
Ravi - don’t get hung up on post.jar - it’s just a means to an end. Use good ol’ curl if you’re having issues with post.jar. But really, 2GB? Is that a single document? Or a bunch of documents? If it’s a bunch of documents, definitely split it up into multiple files. — Erik Hatcher, Senior

Phrase query not matching exact tokens in some cases

2015-07-14 Thread Mike Thomsen
For the query "police office" our users are getting back highlighted results for "police office*r*" (and "police office*rs*") I get why a search for police officers would include just "office" since the stemmer would cause that behavior. However I don't understand why "office" is matching "officer"

Re: Querying Nested documents

2015-07-14 Thread rameshn
The problem is filtering based on an element in the 1234-images doc. 1234-images http://somedomain.com/some.jpg 1:1 I have multiple combinations of "image_flatten_s" and "image_uri_s" and should be able to query / filter on those values for "i

Re: Querying Nested documents

2015-07-14 Thread Alessandro Benedetti
Do you mean this? I want all the children with a specific value for image_uri_s and all the children which do not have that value. In this case I would go with: childFilter=(image_uri_s:) OR (-image_uri_s:*) Is this what you want? Cheers 2015-07-14 14:58 GMT+01:00 rameshn : > The problem is

Re: Phrase query not matching exact tokens in some cases

2015-07-14 Thread Alessandro Benedetti
Which kind of highlighter are you using? Anyway, it is the responsibility of your analysis chain. It is a heavy analysis chain, and I can see "solr.HunspellStemFilterFactory" in it. If you are using the term vector for your field, to be used by your highlighter, in the term vector, for each document, you

Re: Phrase query not matching exact tokens in some cases

2015-07-14 Thread Erick Erickson
Also, the first place to look for answers for questions like "what is the stemmer doing" is to look at the admin/analysis page. Each step in the analysis chain will be shown. Hover over the light gray initials and you'll see the class used (e.g. WST == WhitespaceTokenizer). Best, Erick On Tue, Ju

Re: Solr 5 options

2015-07-14 Thread Erick Erickson
Well, Shawn, I for one am in your corner. Schemaless is great for getting things running, but it's not an AI. And it can get into trouble guessing. Say it guesses a field should be an int because the first one it sees is 123 but it's really a part number. Then when a part number 123-456 comes throug

Re: Persistence problem with swapped cores after Solr restart -- 4.9.1

2015-07-14 Thread Erick Erickson
Shawn: Were any errors reported in the logs? If not, this is certainly worth a JIRA. If the persistence bits are swallowing file access perms that's A Bad Thing IMO. Erick On Tue, Jul 14, 2015 at 12:46 AM, Upayavira wrote: > Problems between keyboard and chair are the best kind. They are the >

Re: Difference in WordDelimiterFilter behavior between 4.7.2 and 4.9.1

2015-07-14 Thread Alessandro Benedetti
Just found this interesting article by Mike that explains the sausagization problem, which is related to the strange positions in some cases. http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html Cheers 2015-07-09 1:13 GMT+01:00 Yonik Seeley : > On Wed,

Re: Persistence problem with swapped cores after Solr restart -- 4.9.1

2015-07-14 Thread Shawn Heisey
On 7/14/2015 10:06 AM, Erick Erickson wrote: Were any errors reported in the logs? If not, this is certainly worth a JIRA. If the persistence bits are swallowing file access perms that's A Bad Thing IMO. I will find out, and file an issue if necessary. It may take me a couple of days. Thank

Re: Difference in WordDelimiterFilter behavior between 4.7.2 and 4.9.1

2015-07-14 Thread Alessandro Benedetti
Furthermore, I checked with Solr 5.1 and found that the WDFilter factory actually works properly. Is it possible to know what the conclusion for this issue was? Is there an issue in the WordDelimiter token filter in the current Solr version? Has it been fixed? Any update? Cheers 2015-07-

Re: Why I get a hit on %, &, but not on !, @, #, $, ^, *

2015-07-14 Thread Steven White
Thanks Jack. Can you provide me with a concrete example of how to: 1) Be able to search and find "$10" (without quotes). This will get me started on how to add all other variations for !, @, etc. and be able to search on them. In this case, a search for "$10" will give me a hit on text of "$10"

Re: Difference in WordDelimiterFilter behavior between 4.7.2 and 4.9.1

2015-07-14 Thread Shawn Heisey
On 7/14/2015 10:46 AM, Alessandro Benedetti wrote: > Furthermore I was checking with Solr 5.1 to find the WDFilter factory > actually to work in a proper way. > Is it possible to know what was the conclusion for this issue ? > Is there an issue in the WordDelimiter token filter in the current Solr

Re: Why I get a hit on %, &, but not on !, @, #, $, ^, *

2015-07-14 Thread Erick Erickson
Steve: Simplest solution: remove WordDelimiterFilterFactory. Use something like PatternReplaceCharFilterFactory or PatternReplaceFilterFactory to selectively remove the characters you don't care about and leave in the ones you do care about. You might also want to do this kind of thing in a copyF
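As a rough illustration of the "selectively remove characters" idea in the reply above, here is a small Python stand-in. It is not Solr configuration: it only shows the kind of regular expression one might hand to a pattern-replace step. The set of characters to drop and the whitespace tokenization that follows are assumptions made for the example.

```python
import re

# Assumption for illustration: "$" and "%" should stay searchable,
# while "!", "@", "#", "^" and "*" are noise we want to drop.
DROP_CHARS = re.compile(r"[!@#^*]")


def strip_unwanted(text: str) -> str:
    """Replace each unwanted symbol with a space, leaving everything else intact."""
    return DROP_CHARS.sub(" ", text)


def tokenize(text: str) -> list:
    """Whitespace tokenization after the character replacement step."""
    return strip_unwanted(text).split()


tokens = tokenize("Price: $10! *sale* #1")
```

With this pattern, "$10" survives as a single token, which is the behaviour the original question asked about. The same regular expression could be tried on the admin/analysis page before committing it to a schema.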

SOLR nrt read writes

2015-07-14 Thread Bhawna Asnani
Hi, I have a use case where we have to write data into Solr and immediately read it back. The read is not a get by ID but a search call. I am doing a softCommit after every such write which needs to be visible immediately. However, sometimes the changes are not visible immediately. We have a solr cl

Solr PNG Coordinate Reference

2015-07-14 Thread Joseph Obernberger
Hi All - I'm working with the heatmap PNGs generated from solr as described here: https://issues.apache.org/jira/browse/SOLR-7005 What would be the coordinate reference system that the generated PNG uses? Is it possible to load these PNG files into a geospatial tool as a raster layer like QGI

Re: dataDir config

2015-07-14 Thread Steven White
Hi Shawn, Help me understand this. Are you saying this is https://wiki.apache.org/solr/SolrConfigXml#System_property_substitution how we should specify the "dataDir"? In that link, there is the statement "substituted can be put into a properties file" but it is not clear what this "properties fi

Re: Best way to facets with value preprocessing (w/ docValues)

2015-07-14 Thread Harry Yoo
I had the same issue and here is my solution. Basically, option #1 that Konstantin suggested: public class TextDocValueField extends TextField { @Override public List createFields(SchemaField field, Object value, float boost) { if (field.hasDocValues()) { List fields = new ArrayList

Re: SOLR nrt read writes

2015-07-14 Thread Erick Erickson
bq: I have a use case where we have to write data into solr and immediately read it back. This is simply not going to work with frequent updates. Solr promises Near in NRT, not "real time". If nothing else, if you fire the query before autowarming is completed. In this case you'll sometimes get t

Re: SOLR nrt read writes

2015-07-14 Thread Bhawna Asnani
Thanks. Load is really not a concern. We will be using it only for a handful of admin users and we are OK dedicating a Solr server to just this use case. If I have to write a loop to check whether the updates are written and the searcher has picked them up, what would that call look like? Can I set
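A minimal sketch of the kind of check-back loop asked about above, written in Python with the standard library only. The is_visible callable is hypothetical: in practice it would wrap whatever search request tells you the freshly written document is now returned. The timeout and interval defaults are arbitrary assumptions, not recommended values.

```python
import time
from typing import Callable


def wait_until_visible(
    is_visible: Callable[[], bool],
    timeout_s: float = 5.0,
    interval_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_visible() until it returns True or the timeout elapses.

    Returns True if the change became visible in time, False otherwise.
    """
    deadline = clock() + timeout_s
    while True:
        if is_visible():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The check runs at least once even with a zero timeout, and the clock and sleep functions are injectable so the loop can be exercised without real waiting. Polling like this only hides the visibility delay; it does not remove the cost of opening a new searcher after each soft commit.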

How to dereference boost values?

2015-07-14 Thread Olivier Lebra
Is it possible to do something like this: bf=myfield^$myfactor Thanks, Olivier

Re: dataDir config

2015-07-14 Thread Shawn Heisey
On 7/14/2015 12:55 PM, Steven White wrote: > Help me understand this. Are you saying this is > https://wiki.apache.org/solr/SolrConfigXml#System_property_substitution how > we should specify the "dataDir"? That wiki page shows what I think is a horrible example. It would be completely unworkable

Re: SOLR nrt read writes

2015-07-14 Thread Erick Erickson
Ahh, good point about setting waitSearcher=true, I should have thought of that. Although the default is set to "true", so unless you're doing something different that should be set already. Look at your Solr logs and see if you find messages about "too many warming searchers" or some such. In that

Re: dataDir config

2015-07-14 Thread Erick Erickson
I'd ask a different question, "why do you want to change the data dir in the first place?"... Just askin' since it's possible you started down this path without really needing to. Why isn't the default place, a "data" directory under each core adequate? Best, Erick On Tue, Jul 14, 2015 at 1:13 P

Dereferencing boost values?

2015-07-14 Thread Olivier Lebra
Is there a way to do something like this: " bf=myfield^$myfactor " ? (Doesn't work, the boost value has to be a direct number) Thanks, Olivier

Possible memory leak? Help!

2015-07-14 Thread Yael Gurevich
Hi, We're running Solr 4.10.1 on Linux using Tomcat. Distributed environment, 40 virtual servers with high resources. Concurrent queries that are quite complex (may be hundreds of terms), NRT indexing and a few hundreds of facet fields which might have many (hundreds of thousands) distinct values.

Migrating from solr cores to collections

2015-07-14 Thread tedsolr
I am in the process of migrating from a single Solr instance, with multiple cores, to the SolrCloud. My product uses cores to physically separate our customers' data: CocaCola has its own core, Pepsi has its own, etc. I want to keep that physical separation but of course I need horizontal scaling n

PatternReplaceCharFilterFactory and Position

2015-07-14 Thread Jae Joo
I am having an issue with the "start" and "end" positions of a token. Here is the CharFilterFactory. Then the input data is 1 In the Analysis page: text | raw_bytes | start | end | positionLength | type | position -- 1 | [31] | 21 | 31 | 1 | word | 1. Shouldn't the "end" position be 22? It breaks the highlighting... HTMLStripCharFilte

Re: Querying Nested documents

2015-07-14 Thread Ramesh Nuthalapati
Yes, you are right. So the query you are suggesting should look like the one below, or did I misunderstand it? http://localhost:8983/solr/demo/select?q= {!parent which='type:parent'}&fl=*,[child parentFilter=type:parent childFilter=(image_uri_s:somevalue) OR (-image_uri_s:*)]&indent=true If so, I am getting a

Re: Dereferencing boost values?

2015-07-14 Thread Upayavira
You could do q={!boost b=$b v=$qq} qq=your query b=YOUR-FACTOR If what you want is to provide a value outside. Also, with later Solrs, you can use ${whatever} syntax in your main query, which might work for you too. Upayavira On Tue, Jul 14, 2015, at 09:28 PM, Olivier Lebra wrote: > Is there
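To make the parameter dereferencing in the reply above concrete, here is a small Python sketch that builds the request parameters with the standard library. The q, qq and b parameters come straight from the message; the helper name, the example query text and the "popularity" boost value are hypothetical.

```python
from urllib.parse import parse_qs, urlencode


def boosted_query_params(user_query: str, boost: str) -> dict:
    """Build params where the boost parser reads its inputs from $b and $qq."""
    return {
        "q": "{!boost b=$b v=$qq}",  # local params dereference the two params below
        "qq": user_query,            # the real user query
        "b": boost,                  # boost value or function, supplied by the caller
    }


# Hypothetical values, only for illustration.
query_string = urlencode(boosted_query_params("red shirt", "popularity"))
decoded = parse_qs(query_string)
```

Because q stays constant, it could live in the request handler defaults while callers pass only qq and b, which is close to what the original question was after.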

Re: Range Facet queries for date ranges with with non-constant gaps

2015-07-14 Thread Chris Hostetter
: Are there any examples/documentation for IntervalFaceting using dates that : I could refer to? You just specify the interval set start & end as properly formated date values. This example shows some range faceting and interval faceting on the same field of the "bin/solr -e techproducts" exam
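As a sketch of "specify the interval set start and end as properly formatted date values", the Python below builds interval facet parameters with uneven gaps. It assumes the facet.interval and f.<field>.facet.interval.set request parameters and the manufacturedate_dt field from the techproducts example; the boundary dates themselves are made up for the illustration.

```python
from datetime import datetime
from urllib.parse import urlencode

SOLR_DATE = "%Y-%m-%dT%H:%M:%SZ"  # UTC timestamp format used for Solr date values


def interval_set(start: datetime, end: datetime) -> str:
    """Format one interval as [start,end): start inclusive, end exclusive."""
    return f"[{start.strftime(SOLR_DATE)},{end.strftime(SOLR_DATE)})"


def interval_facet_params(field: str, boundaries: list) -> list:
    """One facet.interval.set entry per pair of consecutive boundaries."""
    params = [("q", "*:*"), ("rows", "0"), ("facet", "true"), ("facet.interval", field)]
    for start, end in zip(boundaries, boundaries[1:]):
        params.append((f"f.{field}.facet.interval.set", interval_set(start, end)))
    return params


# Hypothetical, deliberately uneven gaps: five years, six months, six months.
boundaries = [datetime(2000, 1, 1), datetime(2005, 1, 1), datetime(2005, 7, 1), datetime(2006, 1, 1)]
params = interval_facet_params("manufacturedate_dt", boundaries)
query_string = urlencode(params)
```

Using half-open intervals keeps a document that sits exactly on a boundary from being counted in two buckets.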

Re: Querying Nested documents

2015-07-14 Thread Alessandro Benedetti
Not sure the '-' is supported by that queryParser, can you check with the NOT operator ? If not I will check in the code. Cheers 2015-07-14 21:56 GMT+01:00 Ramesh Nuthalapati : > Yes you are right. > > So the query you are saying should be like below .. or did I misunderstood > it > > http://lo

Re: Difference in WordDelimiterFilter behavior between 4.7.2 and 4.9.1

2015-07-14 Thread Shawn Heisey
On 7/14/2015 11:42 AM, Shawn Heisey wrote: > So the problem might be with the rulefile, or with some strange > combination of these analysis components. I did not build this > rulefile myself. It was built by another, either Robert Muir or Steve > Rowe if I remember right, when SOLR-4123 was underwa

Re: Dereferencing boost values?

2015-07-14 Thread Chris Hostetter
To clarify the difference: - "bf" is a special param of the dismax parser, which does an *additive* boost function - that function can be something as simple as a numeric field - alternatively, you can use the "boost" parser in your main query string, to wrap any parser (dismax, edismax, stan

Re: Migrating from solr cores to collections

2015-07-14 Thread Shawn Heisey
On 7/14/2015 2:39 PM, tedsolr wrote: > I am in the process of migrating from a single Solr instance, with multiple > cores, to the SolrCloud. My product uses cores to physically separate our > customers' data: CocaCola has its own core, Pepsi has its own, etc. I want > to keep that physical separat

Re: SOLR nrt read writes

2015-07-14 Thread Shawn Heisey
On 7/14/2015 12:19 PM, Bhawna Asnani wrote: > I have a use case where we have to write data into solr and immediately > read it back. > The read is not get by Id but a search call. > > I am doing a softCommit after every such write which needs to be visible > immediately. > However sometimes the ch

Re: dataDir config

2015-07-14 Thread Steven White
Thank you Erick and Shawn. I needed to separate the data from the Solr application so that Solr can be uninstalled / reinstalled / upgraded without impact on the data or the configuration of the core. I did some more research and found it here: https://cwiki.apache.org/confluence/display/solr/Sol

Re: Dereferencing boost values?

2015-07-14 Thread Olivier Lebra
Thanks guys... I'm using edismax, and I have a long bf field, that I want in a solr's requesthandler config as default, but customizable via query string, something like that: product(a,$a)^$fa sum(b,$b1,$b2)^$fb c^$fc ... where the caller would pass $a, $fa, $b1, $b2, $fb, $fc (and a,

Re: dataDir config

2015-07-14 Thread Shawn Heisey
On 7/14/2015 4:05 PM, Steven White wrote: > Thank you Erick and Shawn. > > I needed to separate the data from the Solr application so that Solr can be > uninstalled / reinstalled / upgraded without impact on the data or the > configuration of the core. I did some more research and found it here: >

Re: Dereferencing boost values?

2015-07-14 Thread Chris Hostetter
: For numeric operands, is the dismax boost operator ^ just a pow()? If so, my : problem is solved by doing that: : pow(product(a,$a1),$fa) pow(sum(b,$b1,$b2),$fb) : pow(c,$fc) : Is a^b equiv to pow(a,b)? not exactly ... the "bf" syntax is really, really, really old ... it was a really early a

Re: Implementing MoreLikeThis feature

2015-07-14 Thread Zheng Lin Edwin Yeo
Thanks for the suggestions. The results are more accurate now after I adjust those settings. Regards, Edwin On 14 July 2015 at 16:46, Upayavira wrote: > There's two ways to "tweak" MLT. Use the parameters (such as minimum > term frequency) and so on, or use stop words when indexing. > > I'd sug

schema design

2015-07-14 Thread Midas A
Hi, we are implementing Solr search in our e-commerce web application. I have a few questions, such as: a) How should I store the data so that it gives more relevant results for queries like "red shirts" and "peter England shirts" in global search, where red is "color" and "peter En

Jetty servlet container in production environment

2015-07-14 Thread Adrian Liew
Hi all, I would like to ask your opinion on whether it is recommended to use the default Jetty servlet container as a service to run Solr in a multi-server production environment. I hear that some places recommend using Tomcat as a servlet container. Is anyone able to share some thoughts about this? Limit

Re: Sorting documents by child documents

2015-07-14 Thread DorZion
I can sort the parent documents with the ScoreMode function, you can take a look here: http://lucene.472066.n3.nabble.com/Sorting-documents-by-nested-child-docs-with-FunctionQueries-tp4209940.html

Re: Jetty servlet container in production environment

2015-07-14 Thread Upayavira
Use Jetty. Or rather, just use bin/solr or bin\solr.cmd to interact with Solr. In the past, Solr shipped as a "war" which could be deployed in any servlet container. Since 5.0, it is to be considered a self-contained application, that just happens to use Jetty underneath. If you used something ot