Re: Query parser cuts last letter from search term.

2013-04-04 Thread vsl
The problem was connected with filter order. WordDelimiterFilter should be put before others. Thanks for your help. -- View this message in context: http://lucene.472066.n3.nabble.com/Query-parser-cuts-last-letter-from-search-term-tp4053432p4053736.html Sent from the Solr - User mailing list ar

Re: do SearchComponents have access to response contents

2013-04-04 Thread xavier jmlucjav
A custom QueryResponseWriter...this makes sense, thanks Jack On Wed, Apr 3, 2013 at 11:21 PM, Jack Krupansky wrote: > The search components can see the "response" as a NamedList, but it is > only when SolrDispatchFilter calls the QueryResponseWriter that XML or JSON > or whatever other format (J

Re: Zookeeper dataimport.properties node

2013-04-04 Thread Tim Vaillancourt
If it's in your SolrCloud-based collection's config, it won't be on disk, only in ZooKeeper. What I did was use the XInclude feature to include a file with my dataimport handler properties, so I'm assuming you're doing the same. Use a relative path to the config dir in ZooKeeper, i.e. no path

Re: solre scores remains same for exact match and nearly exact match

2013-04-04 Thread Andre Bois-Crettez
On 04/03/2013 07:22 AM, amit wrote: Below is my query http://localhost:8983/solr/select/?q=subject:session management in php&fq=category:[*%20TO%20*]&fl=category,score,subject You specify that you want "session" to appear in field "subject", but the other tokens only match to the default searc

Re: Question on Exact Matches - edismax

2013-04-04 Thread Sandeep Mestry
Hi Jan, Thanks for your reply. I have defined string_ci like below: When I analyse the query in solr, I saw that document containing pg_series_title_ci:"funny" matches when I do a search for pg_series_title_ci:"funny games" a

Spell check component does not return any suggestions

2013-04-04 Thread vsl
Hi, I configured an index-based spell check component and an unexpected problem occurs. CASE 1: I added two documents with the following content: 1. handbuch 2. hanbuch The suggestions are returned for both terms: e.g. handbuch -> hanbuch and hanbuch -> handbuch. Comment: Works as expected. CASE 2:

Re: Spell check component does not return any suggestions

2013-04-04 Thread Eoghan Ó Carragáin
Hi, I think you need to use the alternativeTermCount parameter ( http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.alternativeTermCount) to return suggestions for terms which occur less often than the user-entered term. More discussion here: https://issues.apache.org/jira/browse/SOLR-2585

Re: Spell check component does not return any suggestions

2013-04-04 Thread vsl
I tried to add spellcheck.alternativeTermCount=5 but still no suggestion has been found.

Filtered search term suggestions via Facet Prefixing or NGrams

2013-04-04 Thread Andreas Hubold
Hi, we've successfully implemented suggestion of search terms using facet prefixing with Solr 4.0. However, with lots of unique index terms we've encountered performance problems (long running queries) and even exceptions: "Too many values for UnInvertedField faceting on field textbody". We

Re: solre scores remains same for exact match and nearly exact match

2013-04-04 Thread Jack Krupansky
The simple way to write the query: q=subject:session subject:management subject:in subject:php Would be: q=subject:(session management in php) Of course, edismax is usually a better way to go in general. -- Jack Krupansky -Original Message- From: Andre Bois-Crettez Sent: Thursday, Ap
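A minimal sketch of the grouped form described above, built as a URL-encoded request in Python. The host, port, and `/select` path come from the URL quoted earlier in this thread; the helper name `build_select_url` is hypothetical and only for illustration.

```python
from urllib.parse import urlencode

def build_select_url(base, field, terms, fl):
    # Group every term under one field, e.g. subject:(session management in php),
    # so that no term falls through to the default search field.
    q = f"{field}:({' '.join(terms)})"
    return f"{base}?{urlencode({'q': q, 'fl': fl})}"

url = build_select_url(
    "http://localhost:8983/solr/select",  # taken from the original post
    "subject",
    ["session", "management", "in", "php"],
    "category,score,subject",
)
print(url)
```

Encoding the query this way also avoids the unescaped spaces in the URL from the original post.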

Re: Difference Between Indexing and Reindexing

2013-04-04 Thread Jack Krupansky
Technically, update and add are identical from a user perspective - you don't need to worry about whether the document already exists. But, there is another, newer form of update, "selective" or "atomic" which is updating a subset of the fields in an existing document without needing to re-sen
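As a rough illustration of the "atomic" form mentioned above, the sketch below builds a JSON update body that sets selected fields on an existing document. It assumes the schema's unique key is named `id`; the document id, field name, and the helper `atomic_update` are hypothetical.

```python
import json

def atomic_update(doc_id, changes):
    # Each changed field is wrapped in a {"set": value} modifier so that only
    # those fields are replaced; the rest of the document is left as stored.
    doc = {"id": doc_id}  # assumes the unique key field is called "id"
    for field, value in changes.items():
        doc[field] = {"set": value}
    return json.dumps([doc])

payload = atomic_update("doc1", {"title": "New title"})
print(payload)
```

The resulting body would be posted to the update handler as JSON, in place of a full document.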

Re: Difference Between Indexing and Reindexing

2013-04-04 Thread Furkan KAMACI
I crawl web pages with Nutch and send them to Solr for indexing. There are two parameters to send data into Solr. One of them is -index and the other one is -reindex. I just want to learn what they do. 2013/4/4 Jack Krupansky > Technically, update and add are identical from a user perspective - yo

Re: Difference Between Indexing and Reindexing

2013-04-04 Thread Jack Krupansky
That's a question for the Nutch email list. In Solr, "reindexing" simply means that you manually delete your full Solr index (or at least delete all documents using a query) and fully ingest all documents, from scratch. There is no "option", it's just something that you, the user/developer, do
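A minimal sketch of the "delete everything, then ingest again" step described above, using Solr's XML update messages. The update URL is an assumption (default port, single core); the code only builds the requests and does not send them.

```python
import urllib.request

SOLR_UPDATE_URL = "http://localhost:8983/solr/update"  # assumed location

def make_update_request(xml_body, url=SOLR_UPDATE_URL):
    # Wrap an XML update message in a POST request for the update handler.
    return urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )

# Delete all documents with a match-all query, then commit.
delete_all = make_update_request("<delete><query>*:*</query></delete>")
commit = make_update_request("<commit/>")

# Against a live server one would then call, in order:
#   urllib.request.urlopen(delete_all); urllib.request.urlopen(commit)
# and afterwards feed every document again from the original source.
```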

RE: Difference Between Indexing and Reindexing

2013-04-04 Thread Markus Jelsma
I assume you're using Nutch 2.x? Nutch 1.x does not have such an option and I find it strange to hear 2.x does. It really makes no sense to have a -reindex option and it should be removed. I'd recommend sticking to plain indexing. -Original message- > From:Jack Krupansky > Sent: Thu 0

Re: Difference Between Indexing and Reindexing

2013-04-04 Thread Alexandre Rafalovitch
On Thu, Apr 4, 2013 at 9:03 AM, Furkan KAMACI wrote: > I crawl web pages with Nutch and send them to Solr for indexing. There are two > parameters to send data into Solr. One of them is -index and the other one > is -reindex. I just want to learn what they do. > Are you sure this is not Nutch-side i

Re: Difference Between Indexing and Reindexing

2013-04-04 Thread Gora Mohanty
On 4 April 2013 18:33, Furkan KAMACI wrote: > I crawl web pages with Nutch and send them to Solr for indexing. There are two > parameters to send data into Solr. One of them is -index and the other one > is -reindex. I just want to learn what they do. [...] Which version of Nutch are you using? Unle

Re: Difference Between Indexing and Reindexing

2013-04-04 Thread Furkan KAMACI
I use Nutch 2.1 and run these commands: bin/nutch solrindex http://localhost:8983/solr -index bin/nutch solrindex http://localhost:8983/solr -reindex 2013/4/4 Gora Mohanty > On 4 April 2013 18:33, Furkan KAMACI wrote: > > I crawl web pages with Nutch and send them to Solr for indexing. There are > two >

Re: SolrCloud not distributing documents across shards

2013-04-04 Thread Michael Della Bitta
Thank you for all your hard work! Michael Della Bitta Appinions 18 East 41st Street, 2nd Floor New York, NY 10017-6271 www.appinions.com Where Influence Isn’t a Game On Wed, Apr 3, 2013 at 6:08 PM, Mark Miller wrote: > > On Apr 3, 2013, at 5:5

Re: solre scores remains same for exact match and nearly exact match

2013-04-04 Thread amit
Thanks Jack and Andre. I am trying to use edismax, but I am stuck with the NoClassDefFoundError: org/apache/solr/response/QueryResponseWriter. I am using Solr 3.6. I have followed the steps here http://wiki.apache.org/solr/VelocityResponseWriter#Using_the_VelocityResponseWriter_in_Solr_Core Just the jars

detailed Error reporting in Solr

2013-04-04 Thread eShard
Good morning, I'm currently running Solr 4.0 final with Tika v1.2 and ManifoldCF v1.2 dev, and I'm battling Tika XML parse errors again. Solr reports this error: org.apache.solr.common.SolrException: org.apache.tika.exception.TikaException: XML parse error, which is too vague. I had to manu

Re: detailed Error reporting in Solr

2013-04-04 Thread eShard
OK, one possible fix is to add the XML equivalent of nbsp, which is: ]> but how do I add this into the Tika configuration?

RE: Solr Multiword Search

2013-04-04 Thread Dyer, James
If you are using dismax/edismax with mm=0 (or some other low number), you should override this in the spellchecker. Specify "spellcheck.collateParam.mm=100%", or something high like that. Likewise if you're using the default lucene/solr query parser with q.op=OR, then you can specify "spellch
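A small sketch of passing the override above as request parameters. It assumes collation is switched on with `spellcheck.collate=true`; the point of the example is that the `%` in `100%` has to be percent-encoded when it travels in a URL.

```python
from urllib.parse import urlencode

collate_params = {
    "spellcheck": "true",
    "spellcheck.collate": "true",          # assumed: collations are wanted
    "spellcheck.collateParam.mm": "100%",  # override mm only for collation checks
}

# urlencode turns "100%" into "100%25" so the value survives the HTTP request.
encoded = urlencode(collate_params)
print(encoded)
```

These parameters would be appended to the normal query string; the user's own `mm` setting for the main query is left unchanged.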

Re: Difference Between Indexing and Reindexing

2013-04-04 Thread Gora Mohanty
On 4 April 2013 19:29, Furkan KAMACI wrote: > I use Nutch 2.1 and using that: > > bin/nutch solrindex http://localhost:8983/solr -index > bin/nutch solrindex http://localhost:8983/solr -reindex [...] Sorry, but are you sure that you are using 2.1? Here is what I get with: ./bin/nutch solrindex U

Re: Difference Between Indexing and Reindexing

2013-04-04 Thread Gora Mohanty
On 4 April 2013 20:16, Gora Mohanty wrote: > On 4 April 2013 19:29, Furkan KAMACI wrote: >> I use Nutch 2.1 and using that: >> >> bin/nutch solrindex http://localhost:8983/solr -index >> bin/nutch solrindex http://localhost:8983/solr -reindex > [...] > > Sorry, but are you sure that you are using

RE: Spell check component does not return any suggestions

2013-04-04 Thread Dyer, James
Make sure you also set spellcheck.onlyMorePopular=false (or leave it out as "false" is the default) when using "spellcheck.alternativeTermCount". You may also need to set "spellcheck.maxResultsForSuggest=0". See http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.maxResultsForSuggest t
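The parameter combination suggested above can be sketched as a query string. The search term `hanbuch` and the value 5 come from earlier messages in this thread; which request handler the string is sent to is left open, since the thread does not say.

```python
from urllib.parse import urlencode

params = {
    "q": "hanbuch",                          # term from the original report
    "spellcheck": "true",
    "spellcheck.alternativeTermCount": 5,    # value tried earlier in the thread
    "spellcheck.onlyMorePopular": "false",   # the default, stated explicitly
    "spellcheck.maxResultsForSuggest": 0,
}

query_string = urlencode(params)
print(query_string)
```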

Re: Difference Between Indexing and Reindexing

2013-04-04 Thread Furkan KAMACI
It may be a deprecated usage (maybe not), but I can certainly run -index and -reindex on Nutch 2.1. 2013/4/4 Gora Mohanty > On 4 April 2013 20:16, Gora Mohanty wrote: > > On 4 April 2013 19:29, Furkan KAMACI wrote: > >> I use Nutch 2.1 and using that: > >> > >> bin/nutch solrindex http://localhos

Re: Difference Between Indexing and Reindexing

2013-04-04 Thread Jack Krupansky
Could you guys please take this discussion offline or over to a Nutch mailing list - where it belongs? This has nothing to do with Solr. -- Jack Krupansky -Original Message- From: Gora Mohanty Sent: Thursday, April 04, 2013 10:46 AM To: solr-user@lucene.apache.org Subject: Re: Differen

Solr 4.2 single server limitations

2013-04-04 Thread imehesz
hello, I'm using a single server setup with Nutch (1.6) and Solr (4.2) I plan to trigger the Nutch crawling process every 30 minutes or so and add about 300+ websites a month with (~5-10 pages each). At this point I'm not sure about the query requests/sec. Can I run this on a single server (how

Re: Question on Exact Matches - edismax

2013-04-04 Thread Sandeep Mestry
Another problem that I see in Solr analysis is the query term that matches the tokenized field does not match on the case insensitive field. So, if I'm searching for 'coast to coast', I see that the tokenized series title (pg_series_title) is matched but not the ci field which is pg_series_title_ci

Solr Query UI

2013-04-04 Thread scallawa
I am trying to understand how to plug data into the solr query option from the UI. The query below works on our old solr version (1.3) but does not return results on 4.2. I pulled it from the catalina log file. I am trying to plug in the values one by one into the query UI to see which one it is

Re: Solr Query UI

2013-04-04 Thread Gora Mohanty
On 4 April 2013 22:11, scallawa wrote: > I am trying to understand how to plug data into the solr query option from > the UI. > > The query below works on our old solr version (1.3) but does not return > results on 4.2. I pulled it from the catalina log file. I am trying to > plug in the values

Re: detailed Error reporting in Solr

2013-04-04 Thread Jack Krupansky
I'm trying to understand what the context is here... are you trying to crawl web pages that have bad HTML? Or, ... what? -- Jack Krupansky -Original Message- From: eShard Sent: Thursday, April 04, 2013 10:23 AM To: solr-user@lucene.apache.org Subject: detailed Error reporting in Solr Good

RE: Solr Multiword Search

2013-04-04 Thread skmirch
Hi James, Thanks for the response. Nope, I'm not using dismax or edismax. Just the standard Solr query parser. Also by using the variable "spellcheck.collateParam.q.op=AND" I see this working. This also means that all the words need to be correct and the maxEdits can only be 2, else it won't sugges

Re: detailed Error reporting in Solr

2013-04-04 Thread eShard
Yes, that's it exactly. I crawled a link with these ( ›) in each list item, and Solr couldn't handle it: it threw the XML parse error and the crawler terminated the job. Is this fixable? Or do I have to submit a bug to the Tika folks? Thanks,

RE: Solr Multiword Search

2013-04-04 Thread Dyer, James
Use IndexBasedSpellChecker instead of DirectSolrSpellChecker if you need more than 2 edits. You may need to set the "accuracy" parameter lower than the default of 0.5. Keep in mind that while this might get the correct responses for your test cases, in the wild your users might find their queri

how to avoid single character to get indexed for directspellchecker dictionary

2013-04-04 Thread Rohan Thakur
Hi all, I am using Solr's DirectSolrSpellChecker for spelling suggestions, with raw analysis for indexing, but some of my fields contain single characters like l and L, so these are being indexed in the dictionary. When I use it for suggestions on a query like delll, it suggests de and l l l as the spel

RE: how to avoid single character to get indexed for directspellchecker dictionary

2013-04-04 Thread Dyer, James
I assume if your user queries "delll" and it breaks it into pieces like "de l l l", then you're probably using WordBreakSolrSpellChecker in addition to DirectSolrSpellChecker, right? If so, then you can specify "minBreakLength" in solrconfig.xml like this: ... spellcheckers here ... wo

Re: Solr Query UI

2013-04-04 Thread scallawa
We are still in the testing phase for 4.2. A new server was built and the latest tomcat, java and solr were installed. The schema file was copied over from the old and then customized as follows. Schema Changes We changed all float field types to tfloat. The solrqueryparser default operator is s

Re: maxWarmingSearchers in Solr 4.

2013-04-04 Thread Shawn Heisey
On 4/4/2013 12:34 AM, Dotan Cohen wrote: In the case of maxWarmingSearchers, I would hope that you have your system set up so that you would never need more than 1 warming searcher at a time. If you do a commit while a previous commit is still warming, Solr will try to create a second warming se

Re: detailed Error reporting in Solr

2013-04-04 Thread Jack Krupansky
I've been away from Tika for awhile, so I'm not sure. This might also be an issue of Tika using a strict XML parser for HTML rather than a looser and more error-tolerant HTML-specific parser, like most browsers use, that allows these kinds of technical "errors" that in reality, in most cases, ca

Compressed Fields in 4.2.1

2013-04-04 Thread Jamie Johnson
I had read somewhere that text fields were compressed by default in 4.2.1. Is this the case? If not, how do I enable compression of stored text fields?

Re: Compressed Fields in 4.2.1

2013-04-04 Thread Yonik Seeley
On Thu, Apr 4, 2013 at 7:41 PM, Jamie Johnson wrote: > I had read somewhere that text fields by default were compressed in 4.2.1, > is this the case? If not how do I enable compression of stored text fields? Compressed stored fields are the default since 4.1 -Yonik http://lucidworks.com

Re: Solr Query UI

2013-04-04 Thread scallawa
I found the problem. The values that we have for cat-path include the special character "/". This was not a special character in pre 4.0 releases. That explains why it worked in my previous version but not in 4.2. Pre 4.0 Lucene supports escaping special characters that are part of the query
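A minimal sketch of escaping such a value before it goes into the query. The character set below is modelled on the query parser's special characters and is an assumption to be checked against the documentation for your Solr version; the helper name and the sample path value are hypothetical, while the field name `cat-path` comes from the message above.

```python
# Characters escaped with a backslash; "/" is the one that matters from 4.0 on.
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&/')

def escape_query_value(value):
    # Prefix each special character with a backslash, leave the rest untouched.
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in value)

raw_value = "/electronics/tv"  # hypothetical cat-path value
query = "cat-path:" + escape_query_value(raw_value)
print(query)
```

Wrapping the value in double quotes as a phrase is another common way to keep "/" from being parsed as query syntax.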

SolR InvalidTokenOffsetsException with Highlighter and Synonyms

2013-04-04 Thread juancesarvillalba
Hi, I saw some similar problems in other threads, but I think that this is a little different and I couldn't find any solution. I get the exception org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token eightysix exceeds length of provided text sized 80. This happens for example when I

Fwd: Zookeeper dataimport.properties node

2013-04-04 Thread Nathan Findley
- Is dataimport.properties ever written to the filesystem? (Trying to determine if I have a permissions error because I don't see it anywhere on disk). - How do you manually edit dataimport.properties? My system is periodically pulling in new data. If that process has issues, I want to be able

Re: do SearchComponents have access to response contents

2013-04-04 Thread Amit Nithian
"We need to also track the size of the response (as the size in bytes of the whole xml response that is streamed, with stored fields and all). I was a bit worried cause I am wondering if a searchcomponent will actually have access to the response bytes..." ==> Can't you get this from your container

Re: Solr 4.2 single server limitations

2013-04-04 Thread Amit Nithian
There's a whole heap of information that is missing like what you plan on storing vs indexing and yes QPS too. My short answer is try with one server until it falls over then start adding more. When you say multiple-server setup do you mean multiple servers where each server acts as a slave storin

Re: how to avoid single character to get indexed for directspellchecker dictionary

2013-04-04 Thread Rohan Thakur
Hi James, after using this it's working fine for delll but not for de. What does minBreakLength signify? Also, can you tell me why I am not getting suggestions for smaller words? For del I should get dell as a suggestion, but it's not giving any suggestions. And also, can I get suggestion