Re: Japanese mm parameter in Solr3.6.2 generated lots of results with big performance hit

2013-02-19 Thread Jack Krupansky
We also need to see the /select request handler from solrconfig.xml. For the text_ja field, the query is generating an "OR" of the individual unigrams of the query string. -- Jack Krupansky -Original Message- From: kirpakaro Sent: Tuesday, February 19, 2013 7:40 AM To: solr-user@luc

Re: Edismax odd results

2013-02-19 Thread Jack Krupansky
The &fq query is probably using the wrong field. It will be using the "df" parameter since fq uses the Solr query parser, not edismax. -- Jack Krupansky -Original Message- From: David Quarterman Sent: Tuesday, February 19, 2013 10:16 AM To: solr-user@lucene.apache.org Subject: RE: Edi
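A minimal sketch of what Jack's point implies for the request, assuming the client builds its parameters in Python. The helper name `select_params` is hypothetical, and the field name `prodnameplurals` is borrowed from the test query elsewhere in this thread. Qualifying the fq term with a field keeps it from falling back to "df"; whether that resolves the odd results here is not confirmed by the thread.

```python
from urllib.parse import urlencode

def select_params(q, qf, fq_field, fq_value):
    """Build a /select query string with a field-qualified filter query."""
    # defType=edismax and qf only affect q. The fq parameter is parsed by the
    # default Solr query parser, so a bare term there is matched against "df".
    params = [
        ("q", q),
        ("defType", "edismax"),
        ("qf", qf),
        ("fq", f"{fq_field}:{fq_value}"),  # explicit field instead of relying on df
    ]
    return urlencode(params)

print(select_params("engineer boots", "prodnameplurals", "prodnameplurals", "engineer"))
```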

Dynamic field searching with edismax

2013-02-19 Thread adityab
We have a search query which we want to configure in Solrconfig.xml and lock down in such a way that user will just pass the keyword e.g q=facebook&qf=title.0&defType=edismax We have a dynamic field title.* for each language. During search the "*" is replaced with login user language and hence nee

Re: Is it possible to manually select a shard leader in a running SolrCloud?

2013-02-19 Thread Mark Miller
You can't easily do it the way it's implemented in ZooKeeper. We would probably internally have to do the same thing - elect a new leader and drop him until the one we wanted came up. The main thing doing it internally would gain is that you could skip the elected guy from becoming the actual leade

Re: Storing all attributes in the document so that I can avoid a distributed cache?

2013-02-19 Thread Shawn Heisey
On 2/19/2013 6:47 PM, Erick Erickson wrote: It Depends (tm). Storing data in a Solr index pretty much just consumes disk space, the *.fdt and *.fdx files aren't really germane to the amount of memory needed for search. There will be some additional memory requirements for the document cache thoug

Re: Faceting on a subset of search results

2013-02-19 Thread Otis Gospodnetic
Hi Peyman, I don't think that's possible. I know I opened an issue with exactly this request over in the ElasticSearch project a few months ago. Several years ago I implemented something like what you are asking about, but on top of Lucene, and it worked very well, giving very good results. Otis -

Re: SOLR Cloud DIH Import

2013-02-19 Thread Alexandre Rafalovitch
Anything specific? I bet the Wikipedia indexing example will exercise the Cloud setup quite well: https://wiki.apache.org/solr/DataImportHandler#Example:_Indexing_wikipedia Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch -

Re: HELP: CommonsHttpSolrServer.commit() time out after 1min

2013-02-19 Thread Siping Liu
Solrj. On Tue, Feb 19, 2013 at 9:08 PM, Erick Erickson wrote: > Well, your commits may have to wait until any merges are done, which _may_ > be merging your entire index into a single segment. Possibly this could > take more than 60 seconds. > > _How_ are you doing this? DIH? SolrJ? post.jar? >

Re: [solr cloud 4.1] Issue with order in a batch of commands

2013-02-19 Thread Vinay Pothnis
Also, I was referring to this wiki page: http://wiki.apache.org/solr/UpdateJSON#Update_Commands Thanks Vinay On Tue, Feb 19, 2013 at 6:12 PM, Vinay Pothnis wrote: > Thanks for the reply Eric. > > * I am not using SolrJ > * I am using plain http (apache http client) to send a batch of commands.

Re: [solr cloud 4.1] Issue with order in a batch of commands

2013-02-19 Thread Vinay Pothnis
Thanks for the reply Eric. * I am not using SolrJ * I am using plain http (apache http client) to send a batch of commands. * As I mentioned below, the json payload I am sending is like this (some of the fields have been removed for brevity) * POST http://localhost:8983/solr/sample/update * POST B

Re: HELP: CommonsHttpSolrServer.commit() time out after 1min

2013-02-19 Thread Erick Erickson
Well, your commits may have to wait until any merges are done, which _may_ be merging your entire index into a single segment. Possibly this could take more than 60 seconds. _How_ are you doing this? DIH? SolrJ? post.jar? Best Erick On Tue, Feb 19, 2013 at 8:00 PM, Siping Liu wrote: > Thanks

Re: difference between q=field:"value1 value2" and q=field:value1 value2

2013-02-19 Thread Erick Erickson
Of course you are (field1:article test OR field2:article test OR ...) parses as field1:article defaultfield:test OR field2:article defaultfield:test probably with an implied SHOULD whereas (field1:"article test" OR field2:"article test" OR ... ) is parsing as phrase queries. That is, test m
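The difference Erick describes can be sketched as follows, assuming the collated text is turned into a query string on the client side in Python. The helper names are hypothetical and not part of SolrJ or any Solr client. Quoting the whole text per field keeps every word bound to that field as a phrase, instead of letting the second word fall through to the default field.

```python
def phrase_clause(field, text):
    """Return field:"text" with embedded backslashes and quotes escaped."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'

def any_field_phrase(fields, text):
    """OR together one phrase clause per field."""
    return "(" + " OR ".join(phrase_clause(f, text) for f in fields) + ")"

print(any_field_phrase(["field1", "field2"], "article test"))
```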

Re: Edismax odd results

2013-02-19 Thread Erick Erickson
When you get back to this tomorrow, also try and paste the parsed query bits you get back when you append &debug=all. Sometimes it's surprising what the parsed query _really_ looks like Best Erick On Tue, Feb 19, 2013 at 3:13 PM, David Quarterman wrote: > Hi Shawn, > > Now finished for the

Re: [solr cloud 4.1] Issue with order in a batch of commands

2013-02-19 Thread Erick Erickson
Hmmm, this would surprise me unless the add and delete were going to separate machines. how are you sending them? SolrJ? and in a single server.add(doclist) format or with individual adds? Individual commands being sent can come 'round out of sequence, that's what the whole optimistic locking bit

Re: Storing all attributes in the document so that I can avoid a distributed cache?

2013-02-19 Thread Erick Erickson
It Depends (tm). Storing data in a Solr index pretty much just consumes disk space, the *.fdt and *.fdx files aren't really germane to the amount of memory needed for search. There will be some additional memory requirements for the document cache though. And you'll also have resources consumed if

Re: HELP: CommonsHttpSolrServer.commit() time out after 1min

2013-02-19 Thread Siping Liu
Thanks for the quick response. It's Solr 3.4. I'm pretty sure we get plenty memory. On Tue, Feb 19, 2013 at 7:50 PM, Alexandre Rafalovitch wrote: > Which version of Solr? > Are you sure you did not run out of memory half way through import? > > Regards, >Alex. > > Personal blog: http://blog

Re: HELP: CommonsHttpSolrServer.commit() time out after 1min

2013-02-19 Thread Alexandre Rafalovitch
Which version of Solr? Are you sure you did not run out of memory half way through import? Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately

HELP: CommonsHttpSolrServer.commit() time out after 1min

2013-02-19 Thread Siping Liu
Hi, we have an index with 2 million documents in it. From time to time we rewrite about 1/10 of the documents (just under 200k). No autocommit. At the end we issue a single commit, and it timed out after 60 sec. My questions are: 1. Is it normal for a commit of this size to take more than 1 min? I know it's

Re: edismax and fields with spaces - SOLR 3.6

2013-02-19 Thread Otis Gospodnetic
Hi, I think you'll save yourself a lot of trouble if you just rename your fields. Otis Solr & ElasticSearch Support http://sematext.com/ On Feb 19, 2013 12:09 PM, "no spam" wrote: > I am trying to use an edismax query in SOLR 3.6 to search fields that have > spaces in their name, ie field "Actio

difference between q=field:"value1 value2" and q=field:value1 value2

2013-02-19 Thread Sebastian Saip
Hi there, I'm implementing a didYouMean in Java, which will return collated terms. Unfortunately, the only way (?) to retrieve those collated terms is by "getCollationQueryString()", which will look something like this: input for my query: atricle test my final query: (field1:atricle test OR fiel

Re: Edismax odd results

2013-02-19 Thread David Quarterman
Hi Shawn, Now finished for the day but will post the schema tomorrow. Thanks for the help (and Jack too). Regards, DQ P.S. did reindex after changing schema and the analyzer/query stuff matches precisely!! Shawn Heisey wrote: On 2/19/2013 11:16 AM, David Quarterman wrote: > This is definit

[solr cloud 4.1] Issue with order in a batch of commands

2013-02-19 Thread Vinay Pothnis
Hello, I have the following set up: * solr cloud 4.1.0 * 2 shards with embedded zookeeper * plain http to communicate with solr I am testing a scenario where i am batching multiple commands and sending to solr. Since this is the solr cloud setup, I am always sending the updates to one of the nod

Re: Start solr from different folder / .Net code

2013-02-19 Thread Shawn Heisey
On 2/19/2013 11:55 AM, Kiran J wrote: Thanks guys. I ended up setting the working folder in ProcessStartInfo.WorkingDirectory & it works fine. Batch file though it works, I am not able to kill the Solr process from code if I need to. If you start jetty like this: java -DSTOP.PORT=8079 -DSTO

Re: Start solr from different folder / .Net code

2013-02-19 Thread Kiran J
Thanks guys. I ended up setting the working folder in ProcessStartInfo.WorkingDirectory & it works fine. Batch file though it works, I am not able to kill the Solr process from code if I need to. On Sun, Feb 17, 2013 at 1:12 PM, d_k wrote: > Might as well do: > cd /d H:\downloads\apache-solr-

Re: ApacheCon meetup

2013-02-19 Thread Erik Hatcher
I've added a Lucene meetup to the Wednesday night meetup proposed schedule. I'm speaking on Wednesday morning. Let's get the word spread to the Portland tech community as well, making it a good way to bring in folks in the area that may not be also attending ApacheCon. Erik On Feb 1

Re: RequestHandler init failure

2013-02-19 Thread Chris Hostetter
: Found it by myself. It's here : http://mirrors.ibiblio.org/maven2/org/apache/solr/solr-dataimporthandler/4.1.0/ : : Download and move the jar file to solr-webapp/webapp/WEB-INF/lib directory, : and the errors are all gone. You don't need to move/copy/add any jars into the solr webapp (where

Re: Edismax odd results

2013-02-19 Thread Shawn Heisey
On 2/19/2013 11:16 AM, David Quarterman wrote: This is definitely driving us mad now! Changed to PorterStemming and there's very little difference. If we add fq=engineer, we get 0 results. Add fq=engineer* and we get the 90 in the system. Try with fq=ankle* and we get 2. Correct. Try with fq=h

RE: Edismax odd results

2013-02-19 Thread David Quarterman
Hi, This is definitely driving us mad now! Changed to PorterStemming and there's very little difference. If we add fq=engineer, we get 0 results. Add fq=engineer* and we get the 90 in the system. Try with fq=ankle* and we get 2. Correct. Try with fq=harness* and we get 0! The stemming reduce

Re: SOLR4 SAN vs Local Disk?

2013-02-19 Thread Shawn Heisey
On 2/19/2013 10:39 AM, chamara wrote: Hi Thanks Shawn for the Input, Yes i am using SolrCloud to replicate the index to another server running with the same spec with 32cores and 72GB RAM on each machine. I have to test the performance of RAID 10? Have you ever done a deployment with RAID 10? The

Re: SOLR4 SAN vs Local Disk?

2013-02-19 Thread chamara
Hi Thanks Shawn for the Input, Yes i am using SolrCloud to replicate the index to another server running with the same spec with 32cores and 72GB RAM on each machine. I have to test the performance of RAID 10? Have you ever done a deployment with RAID 10? The indexing will be NRT as far as i can se

get content is put in the index queue but is not committed

2013-02-19 Thread Miguel
Hi everybody, Does anybody know how to get content that has been put in the index queue but is not yet committed? I am developing a custom UpdateRequestProcessorFactory and in the ADD event I see the documents, and I would need to access these same documents in the Commit event. When the add event has implicit com

edismax and fields with spaces - SOLR 3.6

2013-02-19 Thread no spam
I am trying to use an edismax query in SOLR 3.6 to search fields that have spaces in their name, ie field "Action 1_t". I'm using this query string: start=0&rows=5&user_id=1&wt=json&qf=SLUG_t%5E10.0+Action%5C+1_t&q=%7B!edismax+q.op%3DAND%7Dlebron&debugQuery=true However my parsed query only has

RE: Edismax odd results

2013-02-19 Thread David Quarterman
Hi Shawn/Jack, The log shows the query going in okay, nothing gets stripped out so we're still at a loss to understand this. Could it be that Snowball stemming is too invasive? Regards, DQ -Original Message- From: David Quarterman [mailto:da...@corexe.com] Sent: 19 February 2013 16:

Re: Problem when I search something that contains a forward slash?

2013-02-19 Thread Bruno Mannina
Reindexing is not currently possible for me. The data are on the prod server, which can't be stopped. I will use autoGeneratePhraseQueries=true, it's better than nothing... Thanks a lot Jack! On 19/02/2013 17:30, Jack Krupansky wrote: Yeah, terms that expand into phrases cannot use wildcar

RE: Edismax odd results

2013-02-19 Thread David Quarterman
Hi Shawn, I checked the admin analysis earlier. Stemming is taking 'engineer' down to 'engin', but then I'd have thought that a search on 'engin boots' would work but it doesn't. I'll try turning the wick back up on the logging - we set it to 'warning'. Regards, DQ -Original Message-

Re: Problem when I search something that contains a forward slash?

2013-02-19 Thread Jack Krupansky
Yeah, terms that expand into phrases cannot use wildcard or fuzzy queries. If you want special characters such as slash to be part of the term, try the white space tokenizer (which will require reindexing.) -- Jack Krupansky -Original Message- From: Bruno Mannina Sent: Tuesday, Febru

mlt.fl can prevent stemming

2013-02-19 Thread Martin Saturka
Hi, when using MoreLikeThis handler, with externally provided text to match, the external text is not stemmed under a situation. It is when mlt.fl contains at the first position (of its comma separated fields) a field that is not present at searched documents. http://localhost:8983/solr/en/mlt/?st

Re: Edismax odd results

2013-02-19 Thread Shawn Heisey
I do not see the word engineer (or any other similar word) in the score calculation, only boots. A test on my own index shows both words in the calculations. I would use the analysis admin page on the prodnameplurals field to see what happens to the input of "engineer boots" on both index and

RE: Edismax odd results

2013-02-19 Thread David Quarterman
Hi Jack, Here's a test query we've been using: select?q=+engineer+boots&defType=edismax&fl=prodname&qf=prodnameplurals&pf2=prodnameplurals^2.0 This still produces a result set where the first 'engineer boot' is way down the list and subsequent ones are interspersed with other boots. They're all

Re: SOLR4 SAN vs Local Disk?

2013-02-19 Thread Shawn Heisey
On 2/19/2013 8:30 AM, chamara wrote: Hi Guys, I know you all have come across this when doing hardware configuration. If I use the all-bells-and-whistles version of a SAN, is that enough to outperform physical disk? Also, my topology is to have 5 shards on one machine. The reasons for or against

Re: What should focus be on hardware for solr servers?

2013-02-19 Thread Michael Della Bitta
I actually tried to get the Phoronix test suite working, but for some reason, it was timing out when it was listing the suites of tests that were available. Maybe I don't know enough about it... Michael Della Bitta Appinions 18 East 41st Street, 2n

Re: Problem when I search something that contains a forward slash?

2013-02-19 Thread Bruno Mannina
It works, but Solr 3.6 rewrites the request ic:H04L12/56 to ic:"H04L12 56" (debugQuery=true). The result count seems to be OK, but if I do H04L12/5* then I get 0 results (H04L12* returns more than 80 000 results). I have the same result without autoGeneratePhraseQueries=true, it's because i

Re: Problem when I search something that contains a forward slash?

2013-02-19 Thread Bruno Mannina
Oh great, I'm testing on my dev environment. I will keep you informed. On 19/02/2013 16:45, Jack Krupansky wrote: No, it's just part of the query generation process, so reindexing is not needed. -- Jack Krupansky -Original Message- From: Bruno Mannina Sent: Tuesday, February 19, 2

Re: Problem when I search something that contains a forward slash?

2013-02-19 Thread Jack Krupansky
No, it's just part of the query generation process, so reindexing is not needed. -- Jack Krupansky -Original Message- From: Bruno Mannina Sent: Tuesday, February 19, 2013 7:41 AM To: solr-user@lucene.apache.org Subject: Re: Problem when I search something that contains a forward slash

Re: Problem when I search something that contains a forward slash?

2013-02-19 Thread Bruno Mannina
Hi Jack, If I make this modification (adding the autoGeneratePhraseQueries=true attribute to the text_general field type), must I re-index? Thanks for this information, Bruno. On 19/02/2013 16:35, Jack Krupansky wrote: Add the autoGeneratePhraseQueries=true attribute to the text_general field ty

Re: Japanese mm parameter in Solr3.6.2 generated lots of results with big performance hit

2013-02-19 Thread kirpakaro
Hi Jack, Here is the query http://localhost:7983/cores/jp/select/?q=鹿児島%20若石%20龍玉女&rows=22&debugQuery=on&echoParams=all&wt=xml (using dismax parser). Where f1 is being boosted over f2 (qf=f1^10 f2^1) [f1: string f2: text_ja (copyField being copied from f1) 259 <--- time is quite large <

Re: Problem when I search something that contains a forward slash?

2013-02-19 Thread Jack Krupansky
Add the autoGeneratePhraseQueries=true attribute to the text_general field type, or create your own field type with that attribute set. Otherwise, XX/YY is treated as "XX OR YY". -- Jack Krupansky -Original Message- From: Bruno Mannina Sent: Tuesday, February 19, 2013 3:40 AM To: sol

Re: Edismax odd results

2013-02-19 Thread Jack Krupansky
Show us your qf and pf params. Do you have PF2 set? That's the key for getting the phrase "engineer boots" boosted higher than just boots. You may also simply have to give a higher PF2 boost since "boots" probably has a much higher term frequency than "engineer" or even the natural Lucene score

SOLR4 SAN vs Local Disk?

2013-02-19 Thread chamara
Hi Guys, I know you all have come across this when doing hardware configuration. If I use the all-bells-and-whistles version of a SAN, is that enough to outperform physical disk? Also, my topology is to have 5 shards on one machine. Thanks in advance. -- View this message in context: http://lu

Edismax odd results

2013-02-19 Thread David Quarterman
Hi all, We have an index of boots which contains harness boots, engineer boots , ankle boots, etc. An edismax search on the index for 'harness boots' brings back 2,175 boots with 'harness' results at the top. 'Searching 'engineer boots' brings back everything but 'engineer boots', same for 'ank

RE: Custom shard key, shard partitioning

2013-02-19 Thread Markus Jelsma
-Original message- > From:Mark Miller > Sent: Mon 18-Feb-2013 16:27 > To: solr-user@lucene.apache.org > Subject: Re: Custom shard key, shard partitioning > > Yeah, I think we are missing some docs on this… > > I think the info is in here: https://issues.apache.org/jira/browse/SOLR-259

Re: Problem when I search something that contains a forward slash?

2013-02-19 Thread Bruno Mannina
Thanks Sandeep, but it doesn't help me. Regards, Bruno. On 19/02/2013 13:41, Sandeep Mestry wrote: Hi Bruno, I have never used 3.6 so I am sorry I might not be of much help. But, I have a similar requirement for 2 fields and I use string & case insensitive string fields and by escaping the

Re: Problem when I search something that contains a forward slash?

2013-02-19 Thread Sandeep Mestry
Hi Bruno, I have never used 3.6 so I am sorry I might not be of much help. But, I have a similar requirement for 2 fields and I use string & case insensitive string fields and by escaping the forward slash, I get the result correctly. The field definitions are as below:

Re: Problem when I search something that contains a forward slash?

2013-02-19 Thread Bruno Mannina
Hi, Even if I use a backslash, the problem is the same: ic:A01H2\/023 returns the same problem. Maybe I must disable an option, or something? On 19/02/2013 13:11, Bruno Mannina wrote: Hi Sandeep, First, thanks for your answer, but I use Solr 3.6 and not 4.0. I can't currently update my Solr

Re: Problem when I search something that contains a forward slash?

2013-02-19 Thread Bruno Mannina
Hi Sandeep, First, thanks for your answer, but I use Solr 3.6 and not 4.0. I can't currently update my Solr to version 4.0. And using the " " is not the solution because Solr 3.6 has an issue when I use truncation like * inside the request: "A01H2/0*" doesn't work. Do you have another solution

Re: Combining Solr score with customized user ratings for a document

2013-02-19 Thread Á_____o
Well, as Hoss suggested, I have implemented my own function (ValueSourceParser+ValueSource) :) It's a very simple function which receives a parameter, the userId, and returns a float value depending (with a switch-case structure just for this demo) on it. With this approach now I can boost (i.e.

Re: Problem when I search something that contains a forward slash?

2013-02-19 Thread Sandeep Mestry
Hi Bruno, Solr 4.0 added regular expression support, which means that '/' is now a special character and must be escaped if searching for literal forward slash. http://wiki.apache.org/solr/SolrQuerySyntax So, you can either escape it or use quotes like "A01H2/001" Cheers, Sandeep O
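A minimal sketch of the two options Sandeep mentions for Solr 4.0, assuming the query string is assembled on the client side in Python. The helper names `escape_slash` and `quote_term` are hypothetical and not part of Solr or any client library, and, as Bruno reports later in the thread, the quoted form does not combine with wildcards on Solr 3.6.

```python
def escape_slash(term):
    """Backslash-escape forward slashes so they are read as literal characters."""
    return term.replace("/", "\\/")

def quote_term(term):
    """Wrap the term in double quotes, escaping embedded backslashes and quotes."""
    return '"' + term.replace("\\", "\\\\").replace('"', '\\"') + '"'

# Both forms target the same literal IPC code in the ic field.
print("ic:" + escape_slash("A01H2/001"))
print("ic:" + quote_term("A01H2/001"))
```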

Problem when I search something that contains a forward slash?

2013-02-19 Thread Bruno Mannina
Dear Solr Users, I use Solr 3.6 I have a field name IC which contains IPC codes with a forward slash inside like: A01H2/001 G06F1/023 C01C3/147 G06F3/023 etc... My definition for this field is: multiValued="true"/> If i try to search: ic:G06F3/023 http://:/solr/select/?q=ic%3AG0

Re: WIKI: Does JSON Update format actually support single-object submit?

2013-02-19 Thread Alexandre Rafalovitch
Right you are. I meant to say that it fails with {} but passes with [{}]. Basically, I am not sure Wiki is correct. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happ

Re: How to give more more importance to a document if term match is more

2013-02-19 Thread Sandeep Mestry
Hi Pragyanshis, I faced a similar problem a few days ago and I was advised on this forum to override Solr's DefaultSimilarity calculation to always return a constant value for idf. I think in your case you'd also want to suppress the length norm, which will require re-indexing as length norm is calcula

Re: Reloading config to zookeeper

2013-02-19 Thread Marcin Rzewucki
Right. Collection API can be used here. On 18 February 2013 21:36, Timothy Potter wrote: > @Marcin - Maybe I mis-understood your process but I don't think you > need to reload the collection on each node if you use the expanded > collections admin API, i.e. the following will propagate the reloa

RE: SOLR Performance question

2013-02-19 Thread Harshvardhan Ojha
Hi Anurag, We are running solr with almost same number of documents having more than 50 indexed fields and 78 stored fields, we have no issues as of now, so I can say that you won't face any problem. Regards Harshvardhan Ojha -Original Message- From: anurag.jain [mailto:anurag.k...@gma

Re: What should focus be on hardware for solr servers?

2013-02-19 Thread Dotan Cohen
On Thu, Feb 14, 2013 at 5:54 PM, Michael Della Bitta wrote: > My dual-core, HT-enabled Dell Latitude from last year has this CPU: > model name : Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz > bogomips: 4988.65 > > An m3.xlarge reports: > model name : Intel(R) Xeon(R) CPU E5645 @ 2.40GHz > b

SOLR Performance question

2013-02-19 Thread anurag.jain
Hi everybody. I store 42 fields in Solr and index 34 of them. I am going to store 4-6 more columns and index 3-5 more. The total number of docs I have stored is 250, and it may reach up to 500. So the question is: will I get any problems? My machine is an m1.small in Amazon EC2, so should i

Re: How can i instruct the Solr/ Solr Cell to output the original HTML document which was fed to it.?

2013-02-19 Thread Divyanand Tiwari
Thank you for your help Jack. I just wanted to know if there is any ready made solution for this because i really don't know about extracting meta information. awaiting reply.. Thank you On Tue, Feb 19, 2013 at 12:48 PM, Jack Krupansky wrote: > Use the standard update handler and pass the entir

Re: WIKI: Does JSON Update format actually support single-object submit?

2013-02-19 Thread Dmitry Kan
To clarify a bit: > I did a quick test with my example and it seemed to fail with [] > but passing with []. did you mean to use {} in one of these? Dmitry On Sun, Feb 17, 2013 at 4:22 AM, Alexandre Rafalovitch wrote: > I am looking at the Solr WIKI and some of the examples seem to contradict >