Re: Migrating solr 3.6 to solr 4.0

2012-12-03 Thread Tirthankar Chatterjee
Can you paste the content of solr.xml? On Dec 4, 2012, at 1:26 AM, Shaveta_Chawla wrote: > Hi, > > I had solr3.6 installed on my system, now i am migrating my solr3.6 to > solr4.0. but i am getting the error > > SEVERE: Unable to create core: collection1 > java.io.IOException: Can't find resourc

Migrating solr 3.6 to solr 4.0

2012-12-03 Thread Shaveta_Chawla
Hi, I had solr3.6 installed on my system, and now I am migrating my solr3.6 to solr4.0, but I am getting the error SEVERE: Unable to create core: collection1 java.io.IOException: Can't find resource 'solrconfig.xml' in classpath or 'solr/collection1/conf/', cwd=/opt/tomcat/bin I don't know how to r

Re: Difference between 'bf' and 'boost' when using eDismax

2012-12-03 Thread Floyd Wu
Thanks Jack! It helps a lot. Floyd 2012/12/4 Jack Krupansky > "bf" is processed first, then "boost". > > All the bf's will be added, then the resulting scores will be boosted by > the product of all the "boost" function queries. > > -- Jack Krupansky > > -Original Message- From: Floy
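
A minimal sketch of the ordering Jack describes, assuming the base relevance score and the function-query values are already known numbers. The function name edismax_final_score is hypothetical, and real Lucene scoring applies further normalization that is ignored here; this only illustrates "add all the bf values, then multiply by the product of the boost values".

```python
from math import prod


def edismax_final_score(base_score, bf_values, boost_values):
    """Illustrative only: add every bf value, then multiply by every boost value."""
    # Step 1: "bf" is additive, so all bf results are summed onto the base score.
    additive_score = base_score + sum(bf_values)
    # Step 2: "boost" is multiplicative, applied as the product of all boosts.
    return additive_score * prod(boost_values)


# Base score 2.0, two bf values, two boost values: (2.0 + 0.5 + 1.5) * (2.0 * 3.0)
print(edismax_final_score(2.0, [0.5, 1.5], [2.0, 3.0]))
```

With no bf or boost values the sketch returns the base score unchanged, since an empty sum is 0 and an empty product is 1.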

Re: How to change Solr UI

2012-12-03 Thread Erick Erickson
That's only one example, there are others, stream.body=blah. or id:* Jack's comment is well taken, consider a real middleware application. Best Erick On Mon, Dec 3, 2012 at 5:28 PM, Iwan Hanjoyo wrote: > > > > > > Note that Velocity _can_ be used for user-facing code, but be very sure > you

Re: Difference between 'bf' and 'boost' when using eDismax

2012-12-03 Thread Jack Krupansky
"bf" is processed first, then "boost". All the bf's will be added, then the resulting scores will be boosted by the product of all the "boost" function queries. -- Jack Krupansky -Original Message- From: Floyd Wu Sent: Monday, December 03, 2012 11:00 PM To: solr-user@lucene.apache.o

Re: search behavior on a case-sensitive field

2012-12-03 Thread Joe Zhang
haha, makes perfect sense! Thanks a lot! On Mon, Dec 3, 2012 at 9:25 PM, Jack Krupansky wrote: > "CoSt" was split into two terms and the query parser generated an OR of > them. Adding the autoGeneratePhraseQueries="true" attribute to your > field type should fix the problem. > > You can also ch
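
To make the term splitting concrete, here is a simplified sketch of what happens to "CoSt" versus "COST". It is not the actual Solr word-delimiter implementation (which handles many more cases); split_on_case_change and as_or_query are hypothetical helper names, and the regex only models a lower-to-upper case transition.

```python
import re


def split_on_case_change(token):
    """Split a token at each lower-to-upper case transition (simplified model)."""
    # Insert a space wherever a lowercase letter is followed by an uppercase one.
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", token).split()


def as_or_query(token):
    """Join the resulting terms with OR, as the query parser did in this thread."""
    return " OR ".join(split_on_case_change(token))


print(as_or_query("COST"))  # no case change, stays a single term
print(as_or_query("CoSt"))  # split into two terms joined by OR
```

Because the second query matches documents containing either term, it can return more results than the single-term query, which is consistent with the n2 > n1 observation in the original question.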

Re: search behavior on a case-sensitive field

2012-12-03 Thread Jack Krupansky
"CoSt" was split into two terms and the query parser generated an OR of them. Adding the autoGeneratePhraseQueries="true" attribute to your field type should fix the problem. You can also change splitOnCaseChange="1" to splitOnCaseChange="0" to avoid the term splitting issue. Be sure to comp

Re: Solr Query Parameter : ids - What is this used for?

2012-12-03 Thread deniz
Yonik Seeley-4 wrote > It's an internal implementation detail of distributed search - the > second phase selects specific ids on each shard via the "ids" > parameter. > > -Yonik > http://lucidworks.com so i suppose it is the unique field? or it depends on which field we are using for querying on shar

search behavior on a case-sensitive field

2012-12-03 Thread Joe Zhang
I have a search like this: When I query "COST", it gives reasonable results (n1); When I query "CoSt", however, it gives me n2 (>n1) results, and I can't locate actual

Difference between 'bf' and 'boost' when using eDismax

2012-12-03 Thread Floyd Wu
Hi there, I'm not sure if I understand this clearly. 'bf' is that final score will be add some value return by bf? for example-> score + bf = final score 'boost' is that score will be multiply with value that return by boost? for example-> score * boost = final score When using both( 'bf' and

Re: Solr Query Parameter : ids - What is this used for?

2012-12-03 Thread Yonik Seeley
On Mon, Dec 3, 2012 at 10:55 PM, deniz wrote: > Hello, as it is clear in the title too, i wanna know for what solr uses this > parameter... i see it on a sharding env on cloud, so i guess it is related > with cloud but still there is no explanation about it in any of wiki pages > that i have check

Solr Query Parameter : ids - What is this used for?

2012-12-03 Thread deniz
Hello, as it is clear in the title too, i wanna know for what solr uses this parameter... i see it on a sharding env on cloud, so i guess it is related with cloud but still there is no explanation about it in any of wiki pages that i have checked... can someone explain the usage and aim of this par

Re: How to change Solr UI

2012-12-03 Thread Jack Krupansky
It is annoying to have to repeat these explanations so much. Any serious objection to removing the VW UI from Solr proper and replacing it with a standalone app? I mean, Solr should have PHP, python, Java, and ruby example apps, right? -- Jack Krupansky -Original Message- From: Iwan

Re: solr war -> osgi

2012-12-03 Thread Iwan Hanjoyo
> Has anyone had any experience repackaging the solr war for osgi? And while > I'm at it, has anyone done this in geronimo 3.0? > > Hi Marcos, Start glassfish web server. Put solr war file inside the autodeploy folder. Finally, you need to find the solr home folder location. Different operating sy

Re: How to change Solr UI

2012-12-03 Thread Iwan Hanjoyo
> > > Note that Velocity _can_ be used for user-facing code, but be very sure you > secure your Solr. If you allow direct access, a user can easily enter > something like http:// > /update?commit=true&stream.body=*:*. > And all your documents will be gone. > > Hi Erickson, Thank you for the input.

Re: News clustering

2012-12-03 Thread Iwan Hanjoyo
Hi Stanislaw, I see. Thank you for the reference. Kind regards, Hanjoyo On Tue, Dec 4, 2012 at 12:37 AM, Stanislaw Osinski wrote: > > I mean measuring the similarity between the document in each cluster. > > Also, difference between document on one cluster with another cluster. > > > > I saw t

Re: Problem with ping handler, SolrJ 4.1-SNAPSHOT, Solr 3.5.0

2012-12-03 Thread Shawn Heisey
On 11/8/2012 3:25 PM, Dyer, James wrote: Could this be a side-effect from SOLR-4019, in branch_4.0 this was commit r1405894 ? Prior to this commit, PingRequestHandler would throw a SolrException for 503/Bad Request. The change is that the exception isn't actually thrown but rather sent in pla

Re: Whole Phrase search in Solr

2012-12-03 Thread Jack Krupansky
Ah! You have conflicting tokenizers in your index and query analyzers. They should be the same. Your index has: Your query has: That has the effect of treating the entire query term as one index term. That actually works for simple terms, but a quoted phrase is passed to the query analy

Re: Luke and SOLR search giving different results

2012-12-03 Thread Shawn Heisey
On 12/3/2012 1:44 PM, Erol Akarsu wrote: I tried as search query not "baş" but "features:baş" in field "q" in SOLR GUI. And, I got result! In the one document, I had some fields type of text_eng, text_general and one field features type of text_tr. If I don't specify field name, SOLR use Engli

solr war -> osgi

2012-12-03 Thread Marcos Mendez
Hi, Has anyone had any experience repackaging the solr war for osgi? And while I'm at it, has anyone done this in geronimo 3.0? Regards, Marcos

Re: Luke and SOLR search giving different results

2012-12-03 Thread Jack Krupansky
As I pointed out in my message, your query is indicating that "text" is your default search field. So, either choose a different default search field, or ensure that the "text" field has the desired field type. If you want to change the default search field, either use a "df" request paramete

Re: Luke and SOLR search giving different results

2012-12-03 Thread Erol Akarsu
Jack, I see interesting stuff here now. I tried as search query not "baş" but "features:baş" in field "q" in SOLR GUI. And, I got result! In the one document, I had some fields type of text_eng, text_general and one field features type of text_tr. If I don't specify field name, SOLR use Englis

Re: News clustering

2012-12-03 Thread Jorge Luis Betancourt Gonzalez
I'm trying to using to search though news websites, but I was interested in classification on index time, is there any available solution for this? Greetings! On Dec 3, 2012, at 12:37 PM, Stanislaw Osinski wrote: >> I mean measuring the similarity between the document in each cluster. >> Also,

Re: Whole Phrase search in Solr

2012-12-03 Thread NickA
Jack thank you again, however we have the major problem that using QUOTES to bring "phrase" results, actually does not bring any results AT ALL! I mentioned this at the initial post, that we also used these: fq=search_field:"check this" fq=search_field:'check this' But no results appear when q

Re: Luke and SOLR search giving different results

2012-12-03 Thread Erol Akarsu
Jack, I have these in schema.xml that defines "features" as type of text_tr But unfortunately, this fails. On Mon, Dec 3, 2012 at 1:15 PM, Jack Krupansky wrote: > Ah! See where it sa

Re: Luke and SOLR search giving different results

2012-12-03 Thread Jack Krupansky
Ah! See where it says "text:baş"? Your query is against the "text" field, which probably doesn't have the Turkish analysis. There is probably a copyField from "features" to "text". You use the "text_tr" field type for "features", but probably not for the "text" field. -- Jack Krupansky

Re: Luke and SOLR search giving different results

2012-12-03 Thread Erol Akarsu
Jack, I have already set up the Tomcat server for UTF-8 encoding. I have added URIEncoding="UTF-8" to all elements in server.xml in Tomcat 7. As you see below, when I search for the word "baş" in debug mode I get an empty response. But when I search for the word "baştan", I get the correct response. It see

Re: Backing up SolR 4.0

2012-12-03 Thread Shawn Heisey
On 12/3/2012 9:47 AM, Andy D'Arcy Jewell wrote: However, wouldn't re-creating the index on a large dataset take an inordinate amount of time? The system I will be backing up is likely to undergo rapid development and thus schema changes, so I need some kind of insurance against corruption if we

Re: Whole Phrase search in Solr

2012-12-03 Thread Jack Krupansky
The edismax "phrase boost" feature boosts the phrase IF it occurs - it's optional. If you want Solr to search ONLY by whole phrase, Solr does have a precise way to request that - simply enclose the phrase in quotes. But I presume that you knew that. You can certainly preprocess your query to

Re: News clustering

2012-12-03 Thread Stanislaw Osinski
> I mean measuring the similarity between the document in each cluster. > Also, difference between document on one cluster with another cluster. > > I saw the sample code ClusteringQualityBencmark.java > However, I do not know how to make use of it for assessing my Solr > Clustering performance. >

Re: Luke and SOLR search giving different results

2012-12-03 Thread Jack Krupansky
Two points: 1. Possibly an encoding problem with your container? Is UTF-8 encoding enabled? 2. Add &debugQuery=true to your query (from the browser) and see if the parser_query has the expected term that matches what Luke reports for the index and what Solr Admin Analysis also reports for inde

Re: Downloading files from the solr replication Handler

2012-12-03 Thread Eva Lacy
They are the '\0' character. What is a marker? Getting the following with wget: HTTP request sent, awaiting response... 200 OK Length: unspecified [application/xml] On Fri, Nov 30, 2012 at 4:58 PM, Alexandre Rafalovitch wrote: > What mime type you get for binary files? Maybe server is misconf

Re: Whole Phrase search in Solr

2012-12-03 Thread Jack Krupansky
If you use the edismax query parser and set the "pf", "pf2", and "pf3" fields your phrases should show up as top results. This will not eliminate non-phrase matches, but will assure that phrase matches get boosted. See: http://wiki.apache.org/solr/ExtendedDisMax#pf_.28Phrase_Fields.29 -- Jack
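
A hedged sketch of the request parameters this suggestion implies. The field name search_field is borrowed from NickA's messages elsewhere in this thread and the helper name is hypothetical; the parameter names (defType, q, qf, pf, pf2, pf3) are the edismax ones mentioned above. The query is deliberately left unquoted, so non-phrase matches are still returned and phrase matches are only boosted.

```python
from urllib.parse import urlencode


def edismax_phrase_boost_params(user_query, field):
    """Build edismax parameters that boost whole-phrase, bigram and trigram matches."""
    return {
        "defType": "edismax",
        "q": user_query,   # plain user input, not wrapped in quotes
        "qf": field,       # field searched for the individual terms
        "pf": field,       # boost when the whole query appears as a phrase
        "pf2": field,      # boost for adjacent word pairs
        "pf3": field,      # boost for adjacent word triples
    }


print(urlencode(edismax_phrase_boost_params("check this", "search_field")))
```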

Re: Whole Phrase search in Solr

2012-12-03 Thread Erick Erickson
As Jack suggested, show the results of adding &debugQuery=on, it'll help us help you. Particularly with this form: q=search_field:"check this". It should be doing what you want. Best Erick On Mon, Dec 3, 2012 at 8:37 AM, NickA wrote: > Thank you Jack, > > the problem with the "AND" is that it

Re: AW: Edismax query parser and phrase queries

2012-12-03 Thread Erick Erickson
It _seems_ like just adding "phrase fields" (pf) to your edismax defaults gets you close. It would have the problem of matching if the field were longer... but it might be "close enough". Otherwise, why not just add in fq clauses on your exact fields? Because one problem you'll have is that you ne

Re: Backing up SolR 4.0

2012-12-03 Thread Andy D'Arcy Jewell
On 03/12/12 16:39, Erick Erickson wrote: There's no real need to do what you ask. First thing is that you should always be prepared, in the worst-case scenario, to regenerate your entire index. That said, perhaps the easiest way to back up Solr is just to use master/slave replication. Consider

Re: Backing up SolR 4.0

2012-12-03 Thread Erick Erickson
There's no real need to do what you ask. First thing is that you should always be prepared, in the worst-case scenario, to regenerate your entire index. That said, perhaps the easiest way to back up Solr is just to use master/slave replication. Consider having a machine that's a slave to the mast

Re: Whole Phrase search in Solr

2012-12-03 Thread NickA
Thank you Jack, the problem with the "AND" is that it does not search for a PHRASE but for the 2 words being SOMEWHERE in the article. For example the "Check this" will NOT search for "Check this" as a PHRASE but for the "Check" word and the "this" word somewhere in the article, even far away the

Re: Luke and SOLR search giving different results

2012-12-03 Thread Erol Akarsu
Jack, Yes. I expect SOLR to give the same search results as Luke does. The term analyzer gives the correct answer in SOLR as expected, but SOLR does not return the correct search results. I don't know why. Erol Akarsu On Mon, Dec 3, 2012 at 11:21 AM, Jack Krupansky wrote: > So, does that highlight the

Re: How to change Solr UI

2012-12-03 Thread Erick Erickson
Adding to what Iwan said, I want to be sure you're not confusing prototyping with a full-fledged application. The Velocity code included is mostly intended as a rapid-prototyping vehicle. There are significant security issues if you try to use it as your user-facing application, be sure you trust y

Re: Solr 4: Join Query

2012-12-03 Thread Erick Erickson
Not that I know of. Also, your performance will be much better if you can denormalize the data. On Mon, Dec 3, 2012 at 12:44 AM, Vikash Sharma wrote: > Hi Erick, > One more thing: So is there any other way to get the result? > I mean, I need to get both parent and child document in/not nested f

Re: Luke and SOLR search giving different results

2012-12-03 Thread Jack Krupansky
So, does that highlight the problem for you or not? Is the term analyzed as you expected? -- Jack Krupansky From: Erol Akarsu Sent: Monday, December 03, 2012 8:44 AM To: solr-user@lucene.apache.org Subject: Re: Luke and SOLR search giving different results Jack, Thanks for help. I removed d

Re: Whole Phrase search in Solr

2012-12-03 Thread Jack Krupansky
The OR behavior is because the default operator is OR. You can change that by setting q.op=AND. Try the quoted phrases again, but with &debugQuery=true to see what query is actually generated. Finally, if you remove stop words at index time, then you must remove them at query time as well.
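
As a rough illustration of this advice, the sketch below builds a select URL for a quoted phrase with q.op=AND and debugQuery=true. The base URL and the field name search_field are assumptions (the field name is taken from NickA's messages in this thread), and build_debug_phrase_url is a hypothetical helper, not part of any Solr client.

```python
from urllib.parse import urlencode


def build_debug_phrase_url(base_url, field, phrase):
    """Build a select URL for a quoted phrase, with AND as default operator and debug output."""
    params = {
        "q": f'{field}:"{phrase}"',  # quotes request a phrase match on the field
        "q.op": "AND",               # default operator for any unquoted terms
        "debugQuery": "true",        # ask Solr to report the parsed query
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical local Solr instance.
print(build_debug_phrase_url("http://localhost:8983/solr", "search_field", "check this"))
```

The parsed query in the debug section of the response shows whether the phrase survived analysis as a phrase or was turned into something else.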

Re: AW: Edismax query parser and phrase queries

2012-12-03 Thread Jack Krupansky
Okay, so the bottom line here is that you wish to change the semantics of quoted phrases. Fine, that's your prerogative, but a change in semantics would require a change to the query parser, or as you originally indicated, a pre-processor. It does sound as if a pre-processor is the way to go her

Re: PHP client

2012-12-03 Thread Bill Au
https://bugs.php.net/bug.php?id=62332 There is a fork with patches applied. On Mon, Dec 3, 2012 at 9:38 AM, Arkadi Colson wrote: > Hi > > Anyone tested the pecl Solr Client in combination with SolrCloud? It seems > to be broken since 4.0 > > Best regards > Arkadi > >

PHP client

2012-12-03 Thread Arkadi Colson
Hi, has anyone tested the pecl Solr Client in combination with SolrCloud? It seems to be broken since 4.0. Best regards, Arkadi

Re: News clustering

2012-12-03 Thread Iwan Hanjoyo
Hi Stanislaw, I mean measuring the similarity between the document in each cluster. Also, difference between document on one cluster with another cluster. I saw the sample code ClusteringQualityBencmark.java However, I do not know how to make use of it for assessing my Solr Clustering performance

Backing up SolR 4.0

2012-12-03 Thread Andy D'Arcy Jewell
Hi all. I'm new to SolR, and I have recently had to set up a SolR server running 4.0. I've been searching for info on backing it up, but all I've managed to come up with is "it'll be different" or "you'll be able to do push replication" or using http and the command=backup parameter, which

Re: Luke and SOLR search giving different results

2012-12-03 Thread Erol Akarsu
Jack, Thanks for help. I removed data folder of SOLR and indexed this sample doc from scratch, there was no document in SOLR but only one. When I analysed , I can see stemming is correct and I can see these for words "bul", "baş" ,"gör" and "umut" in SF row I attached analyse screens Erol Akar

Whole Phrase search in Solr

2012-12-03 Thread NickA
Hello, I am trying to achieve searching with a phrase in SOLR. Specifically I have the following field in my schema: Also (as a second similar problem) in the “synonyms.txt” I have values like these:

Re: News clustering

2012-12-03 Thread Stanislaw Osinski
> Was the picture generated using Lingo 3G algorithms? > I saw some sub-clusters inside it. > Nice pic :) > That is correct. I am interested to learn it. > How long is the Lingo 3G trial period? > I'll send you the details in a private e-mail in a second. > Is there any way to programmatical

Re: How to change Solr UI

2012-12-03 Thread Iwan Hanjoyo
Hi Romita, In my opinion, if you are new to Solr, you can start learning from Solritas. Solritas uses Apache Velocity, a templating language, CSS and JQuery to manage it looks and behavior. Besides that you can write a custom SearchComponent inside the /browse SearchHandler to add more functionali

Re: News clustering

2012-12-03 Thread Iwan Hanjoyo
Hi Stanislaw Osinski, Was the picture generated using Lingo 3G algorithms? I saw some sub-clusters inside it. Nice pic :) I am interested to learn it. How long is the Lingo 3G trial period? Is there any way to programmatically measure the performance of Carrot2 clustering algorithm? thanx cheer

Re: News clustering

2012-12-03 Thread Iwan Hanjoyo
Hi Stanislaw Osinski, On Mon, Dec 3, 2012 at 6:13 PM, Stanislaw Osinski wrote: > One of our clients uses Solr's search results clustering for grouping news. > Instead of the default Carrot2 algorithm that ships with Solr they use a > commercial one, but Carrot2 should give you decent clusters to

Re: behavior of solr.KeepWordFilterFactory

2012-12-03 Thread Joe Zhang
Across-the-board case-sensitive indexing is not what I want... Let me make sure I understand your suggestion: And define content1 as text1, content2 as text2? On Mon, Dec 3, 2012 at 1:09 AM, Xi Shen wrote: > Solr index is cas

Re: News clustering

2012-12-03 Thread Stanislaw Osinski
One of our clients uses Solr's search results clustering for grouping news. Instead of the default Carrot2 algorithm that ships with Solr they use a commercial one, but Carrot2 should give you decent clusters too. Here's an example clustering result: http://imagebin.org/238001 Staszek -- Stanisl

Re: Replication in SolrCloud

2012-12-03 Thread Arkadi Colson
Never mind, I think I found it. There must be some documents in each shard so they have a version number. Then everything seems to work... On 11/30/2012 04:57 PM, Mark Miller wrote: Thanks for all the detailed info! Yes, that is confusing. One of the sore points we have while supporting both

Re: Replication in SolrCloud

2012-12-03 Thread Arkadi Colson
Thanks for the explanation. It's clear now... I expanded the setup to: 4 hosts with 2 shards and 1 replicator for each shard. When I shut down Tomcat on solr01-dcg, which is the master of shard 1 for both collections, the replicator (solr01-gs) seems NOT to

AW: Edismax query parser and phrase queries

2012-12-03 Thread Tantius, Richard
Hi, the use case we have in mind is that we would like to achieve exact matches for explicit phrases. Our users expect that an explicit phrase not only considers the order of terms, but also the exact wording. Therefore if we search on fields using a data type that is not meant performing exact

How to change Solr UI

2012-12-03 Thread Romita Saha
Hi, I want to change the Solr UI. As far as I understand, Solritas is just for prototyping, where I can change the UI according to a predefined template (Velocity) and cannot add on any additional functionality to that page. How can I change the Solr UI otherwise? Any guidance would be apprecia

Re: Solr 4: Join Query

2012-12-03 Thread Vikash Sharma
Hi Erick, One more thing: So is there any other way to get the result? I mean, I need to get both parent and child document in/not nested format. Regards, Vikash Regards, Vikash Sharma vikash0...@gmail.com On Sat, Dec 1, 2012 at 10:29 PM, Erick Erickson wrote: > That's the way joins work, and

Re: duplicated URL sent from Nutch to solr index

2012-12-03 Thread Xi Shen
Then the "URL" must be the same. On Mon, Dec 3, 2012 at 2:34 PM, Joe Zhang wrote: > Sorry I didn't make it perfectly clear. The "id" field is URL. > > On Sun, Dec 2, 2012 at 11:33 PM, Joe Zhang wrote: > > > Thanks! > > > > > > On Sun, Dec 2, 2012 at 11:20 PM, Xi Shen wrote: > > > >> If the va

Re: behavior of solr.KeepWordFilterFactory

2012-12-03 Thread Xi Shen
Solr index is case-sensitive by default, unless you used the lower case filter. I remember I saw this topic on Solr, and the solution is simple: copy the field; use a new analyzer/tokenizer to process this field, and do not use the lower case filter; when querying, make sure both fields are included. O