Performance improvement in latest version comparing to v1.4

2014-10-01 Thread Danesh Kuruppu
Hi all, Currently we are using Solr for service metadata indexing and searching. We have an embedded Solr server running in our application, and we are using Solr version 1.4. We have some doubts to clear up. 1. What are the performance improvements we can gain from updating to the latest Solr version

Re: Flexible search field analyser/tokenizer configuration

2014-10-01 Thread Erick Erickson
bq: But with my new query, could I just remove the defType=lucene parameter and the wildcard right Well, It Depends (tm). You can specify the query parser as part of the requestHandler and, indeed, leave it off the query. As far as the wildcard goes, it also depends. You'll change the semantics o

Re: Flexible search field analyser/tokenizer configuration

2014-10-01 Thread PeterKerk
Sorry, one final thing. In my current application I search like this: &q=title:*&defType=lucene. I was checking here: http://wiki.apache.org/solr/SolrQuerySyntax But with my new query, could I just remove the defType=lucene parameter and the wildcard, right? Or am I overlooking something then?

Re: Flexible search field analyser/tokenizer configuration

2014-10-01 Thread PeterKerk
You were right, I had an old configuration :) But using your new suggestions made it work! Thanks!

Re: pySolr and other Python client options for SolrCloud.

2014-10-01 Thread S.L
That makes perfect sense, thanks again! On Wed, Oct 1, 2014 at 10:09 PM, Shawn Heisey wrote: > On 10/1/2014 7:08 PM, S.L wrote: > > Thanks, load balancer seems to be the preferred solution here. I have a > > topology where I have 6 Solr nodes that support 3 shards with a > replication > > fact

Re: pySolr and other Python client options for SolrCloud.

2014-10-01 Thread Shawn Heisey
On 10/1/2014 7:08 PM, S.L wrote: > Thanks, load balancer seems to be the preferred solution here. I have a > topology where I have 6 Solr nodes that support 3 shards with a replication > factor of 2. > > Looks like it would be better to use the load balancers for querying > only. The question, tha

Re: Flexible search field analyser/tokenizer configuration

2014-10-01 Thread Erick Erickson
1> Hmmm, you _should_ have some line like: in solrconfig.xml, otherwise the url you posted has no destination. http://localhost:8983/solr/bm/select implies that there's a request handler to, well, handle it so I'm puzzled. When you _do_ find the request handler, there should be a line like this:

Re: pySolr and other Python client options for SolrCloud.

2014-10-01 Thread S.L
Shawn, Thanks, a load balancer seems to be the preferred solution here. I have a topology where I have 6 Solr nodes that support 3 shards with a replication factor of 2. Looks like it would be better to use the load balancers for querying only. The question that I have is if I go the load balancer

Re: pySolr and other Python client options for SolrCloud.

2014-10-01 Thread Shawn Heisey
On 10/1/2014 2:29 PM, S.L wrote: > Right, but my query was to know if there are any Python clients which > achieve the same thing as SolrJ, or the approach one should take when > using Python based clients. If the Python client can support multiple hosts and failing over between them, then you
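The failover idea in the reply above (a client that knows several hosts and moves on when one is down) can be sketched roughly as follows. This is only an illustration under stated assumptions: the host names, the collection1 path, and the fetch callable are hypothetical, and no real Python Solr client is claimed to work this way.

```python
from typing import Callable, Sequence


def query_with_failover(hosts: Sequence[str], path: str,
                        fetch: Callable[[str], str]) -> str:
    """Try each host in order and return the first successful response."""
    last_error = None
    for host in hosts:
        url = host.rstrip("/") + path
        try:
            return fetch(url)
        except OSError as exc:  # connection refused, timeout, DNS failure, ...
            last_error = exc
    raise ConnectionError(f"all {len(hosts)} hosts failed") from last_error


def fake_fetch(url: str) -> str:
    # Stand-in for a real HTTP call (assumption): the first node is "down".
    if url.startswith("http://solr1:8983"):
        raise OSError("connection refused")
    return "response from " + url


result = query_with_failover(
    ["http://solr1:8983/solr", "http://solr2:8983/solr"],  # hypothetical hosts
    "/collection1/select?q=*:*",                           # hypothetical path
    fake_fetch,
)
```

In a real setup the fetch callable would wrap whatever HTTP library the application already uses; a load balancer in front of the nodes makes this client-side loop unnecessary.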

Re: Solr Boosting Unique Values

2014-10-01 Thread Chris Hostetter
: My current setup does not use the ImageUrl field for the search (more : specifically as the default search field). The ImageUrl field contains a URL : to the image which is for most part a GUID, which is meaningless to users. : However, I would like to note that the ImageUrl field is Indexed and

Re: Update with non UTF-8 characters

2014-10-01 Thread Chris Hostetter
: I am indexing Solr 4.9.0 using the /update request handler and am getting : errors from Tika - Illegal IOException from : org.apache.tika.parser.xml.DcXMLParser@74ce3bea which is caused by : MalFormedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence. I FWIW: that error appears to h

Re: Exact match on string field with special characters

2014-10-01 Thread Ahmet Arslan
Hi, raw query parser or term query parser would be handy. https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-TermQueryParser Ahmet On Thursday, October 2, 2014 12:32 AM, tedsolr wrote: I am trying to do SQL like aggregation (GROUP BY) with solr faceting. So I use st
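As a rough sketch of the term query parser suggestion above: the `{!term f=...}` local-params prefix asks Solr to treat everything after it as one exact term, so the embedded double quotes need no query-syntax escaping, only URL encoding. The field name product_s is a hypothetical stand-in; the original poster's field is not shown.

```python
from urllib.parse import urlencode, parse_qs


def term_filter(field: str, value: str) -> str:
    """Build a filter that matches value as one exact indexed term."""
    return "{!term f=" + field + "}" + value


params = {
    "q": "*:*",
    # product_s is a hypothetical string field used only for illustration.
    "fq": term_filter("product_s", 'green "bath" towels'),
}
query_string = urlencode(params)  # URL-encodes quotes, braces and spaces
decoded = parse_qs(query_string)  # round trip to check nothing was lost
```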

RE: Exact match on string field with special characters

2014-10-01 Thread Michael Ryan
When you call addFacetField, the parameter you pass it should just be the fieldName. The fieldValue shouldn't come into play at all (unless I'm misunderstanding what you're trying to do). If you ever do need to escape a value for a query, you can use org.apache.solr.client.solrj.util.ClientUtil
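The snippet above is cut off before it names the method, so what follows is an assumption: SolrJ does ship a `ClientUtils.escapeQueryChars` helper, and a Python client would need something similar. A minimal sketch of that idea, with a hypothetical field name (product_s) and a character set chosen to approximate the SolrJ helper:

```python
# Characters with special meaning in the Lucene/Solr query syntax.
# Assumption: this set approximates what SolrJ's escapeQueryChars handles.
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&;/')


def escape_query_chars(value: str) -> str:
    """Backslash-escape query syntax characters and whitespace in value."""
    escaped = []
    for ch in value:
        if ch in SPECIAL_CHARS or ch.isspace():
            escaped.append("\\")
        escaped.append(ch)
    return "".join(escaped)


facet_value = 'green "bath" towels'
# product_s is a hypothetical string field used only for illustration.
filter_query = "product_s:" + escape_query_chars(facet_value)
```

With every special character escaped, the value no longer needs surrounding double quotes, which sidesteps the quoting problem described in the original question.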

Exact match on string field with special characters

2014-10-01 Thread tedsolr
I am trying to do SQL-like aggregation (GROUP BY) with Solr faceting. So I use string fields for faceting - to try to get an exact match. However, it seems like to run a facet query I have to surround the value with double quotes. That poses issues when the field value is green "bath" towels -or-

Re: pySolr and other Python client options for SolrCloud.

2014-10-01 Thread Alexandre Rafalovitch
I know of 10 Python Solr clients. You can find the list at: https://leanpub.com/solr-clients/read#leanpub-auto-clients I don't think any of them are as complete as SolrJ, but I would look at SolrCloudPy (it has a nice console too). Regards, Alex. Personal: http://www.outerthoughts.com/ and @ar

Re: Flexible search field analyser/tokenizer configuration

2014-10-01 Thread PeterKerk
Hi Erick, Thanks for clarifying some of this :) That triggers a few more questions: 1. I have no "df" setting in my solrconfig.xml file at all, nor do I see a

Re: pySolr and other Python client options for SolrCloud.

2014-10-01 Thread S.L
Right, but my query was to know if there are any Python clients which achieve the same thing as SolrJ, or the approach one should take when using Python based clients. On Wed, Oct 1, 2014 at 3:57 PM, Upayavira wrote: > > > On Wed, Oct 1, 2014, at 08:47 PM, S.L wrote: > > Hi All, > > > > We re

Boost function for custom sorting.

2014-10-01 Thread sai suman
Hi, I have some records which include a source_id field, which is an integer, and a datetime field. I want the records to be ordered such that adjacent records do not have the same source_id. It should perform some sort of round robin on the records with the source_id as key and they should

Re: pySolr and other Python client options for SolrCloud.

2014-10-01 Thread Upayavira
On Wed, Oct 1, 2014, at 08:47 PM, S.L wrote: > Hi All, > > We recently moved from a single Solr instance to SolrCloud and we are > using > pysolr, I am wondering what options (clients) we have from Python to > take advantage of Zookeeper and load balancing capabilities that > SolrCloud > prov

pySolr and other Python client options for SolrCloud.

2014-10-01 Thread S.L
Hi All, We recently moved from a single Solr instance to SolrCloud and we are using pysolr. I am wondering what options (clients) we have from Python to take advantage of the ZooKeeper and load balancing capabilities that SolrCloud provides, similar to what a smart client like SolrJ offers? Thanks.

Re: Solr 4.7 .0 Issues with Update and Delete

2014-10-01 Thread Sujatha Arun
Thanks Erick, No problem. Will check. On Thu, Oct 2, 2014 at 12:54 AM, Erick Erickson wrote: > I'm clueless about the MongoDB connector. I suspect > that's where the issue is unless you can reproduce > this with a solr-only case. > > As you can tell, I don't recall seeing this problem come by >

Re: Filter cache pollution during sharded edismax queries

2014-10-01 Thread Mikhail Khludnev
Hoss, Nice to hear from you! I wonder if there is a sequence chart, or maybe a deck, which explains the whole picture of distributed search, especially these ones? If it hasn't been presented to the community so far, I'm aware of one conference which can accept such a talk. WDYT? On Wed, Oct 1, 2014 at 9:17

Re: Solr 4.7 .0 Issues with Update and Delete

2014-10-01 Thread Erick Erickson
I'm clueless about the MongoDB connector. I suspect that's where the issue is unless you can reproduce this with a solr-only case. As you can tell, I don't recall seeing this problem come by the boards. Best, Erick On Wed, Oct 1, 2014 at 11:53 AM, Sujatha Arun wrote: > Erick, > > Actually I am

Re: Flexible search field analyser/tokenizer configuration

2014-10-01 Thread Erick Erickson
There's some confusion here. First of all, you shouldn't be getting docs like "The Wall" at all, _assuming_ your fq clause is meant to only include docs with "the Royal Garden" in the results list. What's happening here is that the text is being searched for in the default search field, which will
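To illustrate the point above about the default search field: an fq value with no field prefix is searched in the default field, so restricting the filter to a specific field means naming that field in the fq itself. A minimal sketch, assuming title is the field the filter is meant to target (the thread's URL requests fl=id,title, but the schema is not shown):

```python
from urllib.parse import urlencode, parse_qs

base_url = "http://localhost:8983/solr/bm/select"

params = {
    "q": "*:*",
    # Field-qualified phrase: only the title field is searched (assumed field).
    "fq": 'title:"The Royal Garden"',
    "rows": 50,
    "fl": "id,title",
    "wt": "xml",
    "indent": "true",
}
url = base_url + "?" + urlencode(params)
```

Whether this returns only exact title matches still depends on how the title field is analyzed, which is the subject of the rest of the thread.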

Update with non UTF-8 characters

2014-10-01 Thread Teague James
Hello! I am indexing Solr 4.9.0 using the /update request handler and am getting errors from Tika - Illegal IOException from org.apache.tika.parser.xml.DcXMLParser@74ce3bea which is caused by MalFormedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence. I believe that this is the result

Re: Wildcard search makes no sense!!

2014-10-01 Thread Erick Erickson
Two things: 1> what version of Solr are you using? If it's prior to 3.6, then the bits that handle applying lowercaseFilter to wildcards aren't in the code. 2> what do you see if you add &debug=query? I just tried it with your analysis chain and it seemed to work. Did you completely blow your ind

Re: Solr + Federated Search Question

2014-10-01 Thread Alexandre Rafalovitch
http://project.carrot2.org/ is worth having a look at. It supports Solr well. In fact, a subset of it is shipped with Solr. Regards, Alex. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community: h

I get zero results when combining query.set("fq","{!collapse field=title_s}"); and query.set("group", "true"); ???

2014-10-01 Thread Michael Joyner
I have a SolrCloud setup with two shards. When I use "query.set("fq","{!collapse field=title_s}");" the results show duplicates because of the sharding. EX: {status=0,QTime=1141,params={fl=id,code_s,issuedate_tdt,pageno_i,subhead_s,title_s,type_s,citation_articleTitle_s,citation_articlePageNo

Re: Wildcard search makes no sense!!

2014-10-01 Thread Alexandre Rafalovitch
If you use "*" you use the Multiterm analysis path, which is semi-hidden and is a lot more limited compared to the things done with normal tokens: https://wiki.apache.org/solr/MultitermQueryAnalysis The Analyzer components that are NOT multiterm aware cannot be used that way. Looking at: http://www.solr-start.

Re: FileNotFoundException, Error closing IndexWriter, Error opening new searcher

2014-10-01 Thread Grainne
Did you solve this problem? I am experiencing (Solr 4.4.0 not clustered) the same behavior when the files are mounted via sshfs (a requirement) but not when they are mounted via nfs. I'm hoping you solved your problem and might have advice on how I can solve mine. Thanks, Grainne -- View th

Re: Solr 4.7 .0 Issues with Update and Delete

2014-10-01 Thread Sujatha Arun
Erick, Actually I am syncing data between Solr and MongoDB using mongo-connector. The details are below. I have submitted an issue in the mongo-connector forum, just trying at the Solr forum too just in case anybody has encountered the same :) or why does the log state No uncommitted changes. Skipping

Re: Flexible search field analyser/tokenizer configuration

2014-10-01 Thread PeterKerk
Ok, I missed the Query tab where I can do the actual site search :) I've also used your links, but even with those I fail to grasp why the following is happening: This is my query: http://localhost:8983/solr/bm/select?q=*%3A*&fq=The+Royal+Garden&rows=50&fl=id%2Ctitle&wt=xml&indent=true And belo

Re: Boost Query (bq) syntax/usage

2014-10-01 Thread shamik
Thanks a lot Jack, it makes total sense. I checked the config and the default q.op was set to OR, which was influencing the query.

RE: Filter cache pollution during sharded edismax queries

2014-10-01 Thread Toke Eskildsen
From: Charlie Hull [char...@flax.co.uk]: > We've just found a very similar issue at a client installation. They have > around 27 million documents and are faceting on fields with high > cardinality, and are unhappy with query performance and the server hardware > necessary to make this performance

Re: Adding filter in custom query parser

2014-10-01 Thread Chris Hostetter
: For eg : "red shirt under 20$" should be translated to q=shirt&fq=price:[* : TO 20] and possibly apply color to one the attribute of doc index. : : in parser overrided method, how can i add the filter and pass the query : back? I don't think you can accomplish this just within the QParser API .

Re: Filter cache pollution during sharded edismax queries

2014-10-01 Thread Chris Hostetter
: +1 for using a different cache, but that's being quite unfamiliar with the : code. in (a) common case, people tend to "drill down" and filter on facet constraints -- so using a special purpose cache for the refinements would result in redundant caching of the same info in multiple places. :

Re: Wildcard search makes no sense!!

2014-10-01 Thread waynemailinglist
I'm still stuck on this actually. I would really appreciate any pointers. If I search for : query 1: Κώστας result: Κώστας query 2: Κώστα* result: I've looked at the analyser but I don't really understand what I'm looking at if I'm honest. It gives the output: Field (name): title Field value: Κ

Re: Solr 4.7 .0 Issues with Update and Delete

2014-10-01 Thread Erick Erickson
At this point details matter a lot. What exactly are you doing when you update? What happens if you issue an explicit update command? i.e. http://blahlbah/solr/collection/update?commit=true? Are you sure you aren't seeing, say, browser caching? Best, Erick On Wed, Oct 1, 2014 at 9:04 AM, Sujat

Re: Solr 4.7 .0 Issues with Update and Delete

2014-10-01 Thread Sujatha Arun
Thanks, BookId is the unique key, the issue is resolved with respect to delete. It's the update that is causing the issue. On Wed, Oct 1, 2014 at 8:51 PM, Erick Erickson wrote: > I'd add only one thing to Angel's comments: > you're deleting by "id", but querying by "BookId". This > _should_ work (on

Re: AW: AW: auto completion search with solr using NGrams in SOLR

2014-10-01 Thread Erick Erickson
Perhaps your ngram filter is set to terminate at 14 (maxGram)? Best, Erick On Wed, Oct 1, 2014 at 3:18 AM, xoku wrote: > help me! > i can't find all result. > str name="spellcheck.count">200 > Ex: > i find: file > result expected: file name documentabcxyz > but solr return result (suggest: resul
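The maxGram hypothesis above can be illustrated with a toy edge n-gram generator. This is only a sketch under assumptions: the poster's actual analysis chain is not shown, and the code treats the whole value as a single token, which may not match the real tokenizer.

```python
def edge_ngrams(text: str, min_gram: int, max_gram: int) -> list:
    """Return the leading substrings of text with lengths min_gram..max_gram."""
    upper = min(max_gram, len(text))
    return [text[:n] for n in range(min_gram, upper + 1)]


# With maxGram=14 nothing longer than 14 characters is ever indexed.
grams = edge_ngrams("file name documentabcxyz", 1, 14)
longest = grams[-1]
```

If the indexed grams stop at 14 characters, a suggestion for the full 24-character value cannot come from this field, which would be consistent with the truncated results described in the thread.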

Re: Solr 4.7 .0 Issues with Update and Delete

2014-10-01 Thread Erick Erickson
I'd add only one thing to Angel's comments: you're deleting by "id", but querying by "BookId". This _should_ work (on a quick glance at the code) iff your <uniqueKey> is "BookId"... I took a quick glance at the code and "id" should delete by <uniqueKey>, so is your "BookId" the <uniqueKey> in your schema? On Wed, Oct 1, 2014 a

Re: Solr + Federated Search Question

2014-10-01 Thread Jack Krupansky
Alejandro, you'll have to clarify how you are using the term "federated search". I mean, technically Ahmet is correct in that Solr queries can be fanned out to shards and the results from each shard aggregated ("federated") into a single result list, but... more traditionally, "federated" refer

Re: Solr + Federated Search Question

2014-10-01 Thread Ahmet Arslan
Hi, Federation is possible. Solr has distributed search support with shards parameter. Ahmet On Wednesday, October 1, 2014 4:29 PM, Alejandro Calbazana wrote: Hello, I have a general question about Solr in a federated search context. I understand that Solr does not do federated search and

Solr + Federated Search Question

2014-10-01 Thread Alejandro Calbazana
Hello, I have a general question about Solr in a federated search context. I understand that Solr does not do federated search and that different tools are often used to incorporate Solr indexes into a federated/enterprise search solution. Does anyone have recommendations on any products (open

Re: Wildcard search makes no sense!!

2014-10-01 Thread waynemailinglist
Ahmet - many thanks - I removed the EnglishPorterFilterFactory and reindexed and this seems to behave as expected now. Jack - thanks as well - I'm very much a noob with this, and that's a great tip.

Re: Adding filter in custom query parser

2014-10-01 Thread Jack Krupansky
Unless you consider yourself to be a "Solr expert", it would be best to implement such query translation in an application layer. -- Jack Krupansky -Original Message- From: sagarprasad Sent: Wednesday, October 1, 2014 3:27 AM To: solr-user@lucene.apache.org Subject: Adding filter in c
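The application-layer translation suggested above could look roughly like this. It is a regex-based sketch, not the NLP approach the original poster had in mind; the color vocabulary and the price and color field names are hypothetical.

```python
import re

# Hypothetical color vocabulary and field names (color, price).
KNOWN_COLORS = {"red", "green", "blue", "black", "white"}
PRICE_PATTERN = re.compile(r"\bunder\s+\$?(\d+(?:\.\d+)?)\$?", re.IGNORECASE)


def translate(user_query: str) -> dict:
    """Split free text into a main query plus filter queries."""
    filters = []

    # Turn "under 20$" into an open-ended price range filter.
    match = PRICE_PATTERN.search(user_query)
    if match:
        filters.append(f"price:[* TO {match.group(1)}]")
        user_query = user_query[:match.start()] + user_query[match.end():]

    # Move recognized color words from the main query into filters.
    remaining = []
    for word in user_query.split():
        if word.lower() in KNOWN_COLORS:
            filters.append(f"color:{word.lower()}")
        else:
            remaining.append(word)

    return {"q": " ".join(remaining) or "*:*", "fq": filters}


solr_params = translate("red shirt under 20$")
```

The resulting dictionary can be passed to any HTTP client as request parameters, so Solr itself only ever sees an ordinary q plus fq request.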

Re: Wildcard search makes no sense!!

2014-10-01 Thread Jack Krupansky
The presence of a wildcard in a query term short circuits some portions of the analysis process. Some token filters like lower case can still be performed on the query terms, but others, like stemming, cannot. So, either simplify the analysis (be more selective of what token filters you use), or

Re: Wildcard search makes no sense!!

2014-10-01 Thread Toke Eskildsen
On Wed, 2014-10-01 at 13:16 +0200, Wayne W wrote: > query 2: capit* > result: Capital Health > > query 3: capita* > result: You are likely using a stemmer for the field: "Capital Health" gets indexed as "capit" and "health", so there are no tokens starting with "capita". Turn off the stemmer or
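The stemming explanation above can be reproduced with a toy model. The suffix-stripping function below is a crude stand-in and not the Porter stemmer Solr uses; it is only meant to show why the indexed token "capit" matches capi* and capit* but not capita* or capital*.

```python
def toy_stem(token: str) -> str:
    """Very rough stand-in for a stemmer: strip one common suffix."""
    for suffix in ("al", "ing", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def index_terms(text: str) -> list:
    """Index-time analysis: lowercase, split on whitespace, stem."""
    return [toy_stem(word) for word in text.lower().split()]


def wildcard_matches(prefix_query: str, terms: list) -> bool:
    """Prefix query: lowercased but not stemmed, compared to indexed terms."""
    prefix = prefix_query.rstrip("*").lower()
    return any(term.startswith(prefix) for term in terms)


terms = index_terms("Capital Health")
results = {q: wildcard_matches(q, terms)
           for q in ("capi*", "capit*", "capita*", "capital*")}
```

Here the index holds "capit" and "health", so the two longer prefixes find nothing, mirroring queries 3 and 4 in the original report.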

Re: Wildcard search makes no sense!!

2014-10-01 Thread Ahmet Arslan
Hi, You probably have a stemmer and it is eating up Capital to capit. That's the reason. Either remove the stemmer from the analyser chain or add the keyword repeat filter. Ahmet On Wednesday, October 1, 2014 2:16 PM, Wayne W wrote: Hi, I don't understand this at all. We are indexing some contact names.

Wildcard search makes no sense!!

2014-10-01 Thread Wayne W
Hi, I don't understand this at all. We are indexing some contact names. When we do a standard query: query 1: capi* result: Capital Health query 2: capit* result: Capital Health query 3: capita* result: query 4: capital* result: I understand (as we are using Solr 3.5) that the wildcard sea

Re: AW: AW: auto completion search with solr using NGrams in SOLR

2014-10-01 Thread xoku
help me! i can't find all result. str name="spellcheck.count">200 Ex: i find: file result expected: file name documentabcxyz but solr return result (suggest: result term object) : - [suggestions:protected] => Array ( [0] => file [1] => file (whitespace)

Re: Filter cache pollution during sharded edismax queries

2014-10-01 Thread jim ferenczi
I think you should test with facet.shard.limit=-1. This will disable the limit for the facet on the shards and remove the need for facet refinements. I bet that returning every facet with a count greater than 0 on internal queries is cheaper than using the filter cache to handle a lot of refinemen

Re: Filter cache pollution during sharded edismax queries

2014-10-01 Thread Charlie Hull
On 30/09/2014 22:25, Erick Erickson wrote: Just from a 20,000 ft. view, using the filterCache this way seems...odd. +1 for using a different cache, but that's being quite unfamiliar with the code. Here's a quick update: 1. LFUCache performs worse so we returned to LRUCache 2. Making the cache

Re: Solr 4.7 .0 Issues with Update and Delete

2014-10-01 Thread Angel Tchorbadjiiski
Hello Sujatha, have you tried to leave the quotes out? :-) Alternatively try using 'id:1.0' to see if the same error arises. A bit more information on the Update issue (exact query sent and all the corresponding log entries) would be needed to help you with your problem. Cheers Angel On 0

Adding filter in custom query parser

2014-10-01 Thread sagarprasad
I am a newbie to Solr and OpenNLP. I am trying to do a POC and want to write a custom parser which can parse the query string using NLP and create an appropriate Solr query with filters. For example: "red shirt under 20$" should be translated to q=shirt&fq=price:[* TO 20] and possibly apply color to on

Solr 4.7 .0 Issues with Update and Delete

2014-10-01 Thread Sujatha Arun
I am having the following issue on delete and update in solr 4.7.0 *Delete Issue* I am using the following Curl command to delete a document from index curl http://localhost:8080/solr/bf/update?commit=true -H "Content-Type: text/xml" --data-binary '"1.0"' 07 This is what I see in logs I