date:20110726

Re: Logically equivalent queries but vastly different no of results?

2011-07-26 Thread Ahmet Arslan



> Yes - I am using edismax but the
> reason is not obvious to me can you give
> me a pointer?

This is probably the cause. See Hoss' explanation.

https://issues.apache.org/jira/browse/SOLR-2649

By the way if you want to boost docs (using lucene queries) with (e)dismax, bq 
is the way to go.  &bq=domain_ids:0^1.3
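
A minimal sketch of how the bq parameter above could be attached to an edismax request, using only Python's standard library. The endpoint and the q value are placeholders assumed for illustration; only the edismax parser and bq=domain_ids:0^1.3 come from this thread.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust to your own host and core.
SOLR_SELECT = "http://localhost:8983/solr/select"


def build_boosted_query(user_query, boost_query):
    """Return a select URL that applies an additive boost query (bq)."""
    params = {
        "q": user_query,       # the user's query, parsed by edismax
        "defType": "edismax",  # bq is a (e)dismax parameter
        "bq": boost_query,     # Lucene-syntax clause that only influences scoring
    }
    return SOLR_SELECT + "?" + urlencode(params)


url = build_boosted_query("solr", "domain_ids:0^1.3")
print(url)
```

The boost clause stays out of q, so it changes ranking without changing which documents match.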

Re: SolrJ and class versions

2011-07-26 Thread Martijn v Groningen

Were you upgrading from Solr 1.4?
By default, SolrJ uses the javabin format for querying (wt parameter).
The javabin format is not compatible between 1.4 and 3.1 and above.
So if your clients were running with SolrJ 1.4 versions, I would expect
errors to occur.

Martijn

On 25 July 2011 12:15, Tarjei Huse  wrote:

> Hi, I recently went through a little hell when I upgraded my Solr
> servers to 3.2.0. What I didn't anticipate was that my Java SolrJ
> clients depend on the server version.
>
> I would like to add a note about this in the SolrJ docs:
> http://wiki.apache.org/solr/Solrj#Streaming_documents_for_an_update
>
> Any comments with regard to this?
>
> Are all SolrJ methods dependent on Java Serialization and thus class
> versions?
>
> --
> Regards / Med vennlig hilsen
> Tarjei Huse
> Mobil: 920 63 413
>
>


-- 
Met vriendelijke groet,

Martijn van Groningen

Removing the unwanted Debug messages - Wire.java

2011-07-26 Thread Sowmya V.B.

Hello All

I built a web application in Java/JSP, which calls a Solr Servlet during
search process. While I am able to retrieve and display my search results in
the format I want, my console is filled with Debug messages, printing all
the content of the pages I retrieve.

An example Debug line on my console looks like this:
DEBUG ["http-bio-8080"-exec-1] (Wire.java:70) - << "Quick recipe finder[\n]"

Because of this huge amount of printing to console, the process of
displaying the results on screen is slowing down.

Are there any suggestions on how to work around this, since the source
of Wire.java, I guess, is not accessible?

S

-- 
Sowmya V.B.

Losing optimism is blasphemy!
http://vbsowmya.wordpress.com

changing the root directory where solrCloud stores info inside zookeeper File system

2011-07-26 Thread Yatir Ben Shlomo

Hi!

I am using SolrCloud with a ZooKeeper ensemble of 3.

I noticed that SolrCloud stores information directly under the root dir in the
ZooKeeper file system:

/config /live_nodes /collections

In my setup ZooKeeper is also used by other modules, so I would like SolrCloud
to store everything under /solrCloud/ or something similar.



Is there a property for that, or do I need to write custom code for it?

Thanks

Re: How to query solr status

2011-07-26 Thread Péter Király

You can use the Luke request handler; to speed it up, set the
numTerms parameter to zero, like
http://localhost:8983/solr/admin/luke?numTerms=0
It will report the optimized state of the index (for example, true).

More about this on Solr wiki: http://wiki.apache.org/solr/LukeRequestHandler
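
A rough sketch of reading that flag from a client, using only Python's standard library. It assumes the JSON response writer (wt=json) is enabled and that the flag appears as "optimized" under the "index" section of the Luke response; the base URL and the sample response are illustrative, not taken from a real server.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def luke_url(base_url):
    """Build the Luke handler URL with term statistics disabled."""
    query = urlencode({"numTerms": 0, "wt": "json"})
    return base_url.rstrip("/") + "/admin/luke?" + query


def is_optimized(luke_response):
    """Read the 'optimized' flag from a parsed Luke response.

    Assumes the flag lives under the 'index' section; returns None if absent.
    """
    return luke_response.get("index", {}).get("optimized")


def fetch_is_optimized(base_url):
    """Query a live Solr instance (not exercised in this sketch)."""
    with urlopen(luke_url(base_url)) as resp:
        return is_optimized(json.load(resp))


# Offline example with a hand-written response of the assumed shape.
sample = json.loads('{"index": {"numDocs": 3, "optimized": true}}')
print(is_optimized(sample))
```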

2011/7/26 ZiLi :
> Does anybody know how to query a Solr server for whether it is optimized or not?
> As replication can be configured so that the slave pulls the indexes after "optimized", I 
> think there must be some way to query that. But I didn't find any documentation that 
> covers it. Does anyone know?
> Thanks so much O(n_n)O
>

Péter
-- 
eXtensible Catalog
http://drupal.org/project/xc

proximity within phrases

2011-07-26 Thread Jame Vaalet

How do you write a Solr query that specifies proximity between two phrases?

"dance jockey" should appear within 10 words before "video jockey".

"("dance jockey") ("video jockey")"~10 

This isn't working. Can someone suggest a way?


-JAME

Re: proximity within phrases

2011-07-26 Thread Ahmet Arslan

> How do you write a Solr query that specifies
> proximity between two phrases?
> 
> "dance jockey" should appear within 10 words before "video
> jockey".
> 
> "("dance jockey") ("video jockey")"~10 
> 
> This isn't working. Can someone suggest a way?

This is not possible with out-of-the-box Solr, though this kind of search is 
possible with Lucene's SpanQuery family.

It should be possible with the XML and Surround query parsers.
http://www.lucidimagination.com/blog/2009/02/22/exploring-query-parsers/

Here is an effort to integrate the XML query parser into Solr:
https://issues.apache.org/jira/browse/SOLR-839

Re: Solr 3.3: Exception in thread "Lucene Merge Thread #1"

2011-07-26 Thread mdz-munich

It seems to work now. 


We simply added 

ulimit -v unlimited

to our tomcat-startup-script. 


@Yonik: Thanks again! 


Best regards,

Sebastian 


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-3-3-Exception-in-thread-Lucene-Merge-Thread-1-tp3185248p3200105.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: using distributed search with the suggest component

2011-07-26 Thread mdz-munich

Hi Tobias,

try this, it works for us (Solr 3.3):

solrconfig.xml:

/
word

suggestion
org.apache.solr.spelling.suggest.Suggester
org.apache.solr.spelling.suggest.fst.FSTLookup
wordCorpus
score
./suggester
false
true
0.005




true
true
true
true
suggestion
50
50


suggest

/

Query like that:

http://localhost:8080/solr/core.01/suggest?q=wordPrefix&shards=localhost:8080/solr/core.01,localhost:8080/solr/core.02&shards.qt=/suggest
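
As a small sketch, the distributed request above can be assembled programmatically. The helper name and its arguments are hypothetical; the host, core names, and the q, shards, and shards.qt parameters are the ones from the URL above.

```python
from urllib.parse import urlencode


def suggest_url(host, core, prefix, shard_cores, handler="/suggest"):
    """Build a distributed suggest request across several cores.

    shards lists every core to consult; shards.qt tells each shard which
    request handler to use for the sub-request.
    """
    shards = ",".join(f"{host}/solr/{name}" for name in shard_cores)
    params = {"q": prefix, "shards": shards, "shards.qt": handler}
    # Keep ':', '/' and ',' readable instead of percent-encoding them.
    return f"http://{host}/solr/{core}{handler}?" + urlencode(params, safe=":/,")


url = suggest_url("localhost:8080", "core.01", "wordPrefix", ["core.01", "core.02"])
print(url)
```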


Greetz,

Sebastian



Tobias Rübner wrote:
> 
> Hi,
> 
> I try to use the suggest component (solr 3.3) with multiple cores.
> I added a search component and a request handler as described in the docs
> (
> http://wiki.apache.org/solr/Suggester) to my solrconfig.
> That works fine for 1 core but querying my solr instance with the shards
> parameter does not query multiple cores.
> It just ignores the shards parameter.
> http://localhost:/solr/core1/suggest?q=sa&shards=localhost:/solr/core1,localhost:/solr/core2
> 
> The documentation of the SpellCheckComponent (
> http://wiki.apache.org/solr/SpellCheckComponent#Distributed_Search_Support)
> is a bit vague on that point, because I don't know if this feature really
> works with solr 3.3. It is targeted for solr 1.5, which will never come,
> but
> says it is now available.
> I also tried the shards.qt parameter, but it does not change my results.
> 
> Thanks for any help,
> Tobias
> 


--
View this message in context: 
http://lucene.472066.n3.nabble.com/using-distributed-search-with-the-suggest-component-tp3197651p3200143.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: How to make a valid date facet query?

2011-07-26 Thread Tomás Fernández Löbbe

Hi Floyd, I don't think the feature that allows multiple gaps for a
range facet has been committed. See
https://issues.apache.org/jira/browse/SOLR-2366
You can achieve similar functionality by using facet.query. See:
http://wiki.apache.org/solr/SimpleFacetParameters#Facet_Fields_and_Facet_Queries
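
A minimal sketch of the facet.query approach, one query per bucket. The bucket boundaries are an assumed reading of the four buckets in the question (last 1, 3, and 6 months, and older than 1 year); the onlinedate field name comes from the question, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def date_bucket_params(field):
    """Return request parameters with one facet.query per date bucket."""
    buckets = [
        f"{field}:[NOW-1MONTH TO NOW]",   # last month
        f"{field}:[NOW-3MONTHS TO NOW]",  # last 3 months
        f"{field}:[NOW-6MONTHS TO NOW]",  # last 6 months
        f"{field}:[* TO NOW-1YEAR]",      # older than 1 year
    ]
    params = [("q", "*:*"), ("rows", "0"), ("facet", "true")]
    # facet.query may be repeated; each one yields its own count.
    params += [("facet.query", bucket) for bucket in buckets]
    return params


query_string = urlencode(date_bucket_params("onlinedate"))
print(query_string)
```

Each facet.query comes back with its own count, so the buckets may overlap or leave gaps as needed, which a single facet.range.gap cannot express.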

Regards,

Tomás
On Tue, Jul 26, 2011 at 1:23 AM, Floyd Wu  wrote:

> Hi all,
>
> I need to make a date-faceted query and I tried to use facet.range but can't
> get the result I need.
>
> I want to make 4 facets like the following:
>
> 1 month, 3 months, 6 months, more than 1 year
>
> The onlinedate field in schema.xml like this
>
> 
>
> I hit the solr by this url
>
> http://localhost:8983/solr/select/?q=*%3A*
> &start=0
> &rows=10
> &indent=on
> &facet=true
> &facet.range=onlinedate
> &f.onlinedate.facet.range.start=NOW-1YEARS
> &f.onlinedate.facet.range.end=NOW%2B1YEARS
> &f.onlinedate.facet.range.gap=NOW-1MONTHS, NOW-3MONTHS,
> NOW-6MONTHS,NOW-1YEAR
>
> But the solr complained Exception during facet.range of onlinedate
> org.apache.solr.common.SolrException: Can't add gap NOW-1MONTHS,
> NOW-3MONTHS, NOW-6MONTHS,NOW-1YEAR to value Mon Jul 26 11:56:40 CST 2010
> for
> 
>
> What is the correct way to implement this requirement? Please help with
> this.
> Floyd
>

Re: no match or wrong match results

2011-07-26 Thread Tomás Fernández Löbbe

Hi, you are not giving us much information. What's your default operator?
What do you mean by "results are not correct"?

On Tue, Jul 26, 2011 at 3:04 AM, deniz  wrote:

> Here is the situation..
>
> when I search with 3 or more words, the results are correct; however, if
> I search using only one or two words, there is no result, although
> there should be...
>
> e.g.
> query = stephan ruhl germany munich
> results are correct, documents with the words above retrieved
>
> however
>
> query = stephan ruhl
> results are not correct, or there is no result at all, while some of the
> matching documents should be shown.
>
>
> any ideas about the issue?
>
> -
> Zeki ama calismiyor... Calissa yapar...
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/no-match-or-wrong-match-results-tp3199554p3199554.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

Re: SolrJ and class versions

2011-07-26 Thread Tarjei Huse

On 07/26/2011 09:26 AM, Martijn v Groningen wrote:
> Where you upgrading from Solr 1.4?
Yep.
> SolrJ uses by default for querying the javabin format (wt parameter).
> The javabin format is not compatible between 1.4 and 3.1 and above.
> So If your clients where running with SolrJ 1.4 versions I would expect
> errors to occur.
I understood the error when I saw it, but I think the version dependency
should be noted in the SolrJ manual.
Regards,
T
> Martijn
>
> On 25 July 2011 12:15, Tarjei Huse  wrote:
>
>> Hi, I recently went through a little hell when I upgraded my Solr
>> servers to 3.2.0. What I didn't anticipate was that my Java SolrJ
>> clients depend on the server version.
>>
>> I would like to add a note about this in the SolrJ docs:
>> http://wiki.apache.org/solr/Solrj#Streaming_documents_for_an_update
>>
>> Any comments with regard to this?
>>
>> Are all SolrJ methods dependent on Java Serialization and thus class
>> versions?
>>
>> --
>> Regards / Med vennlig hilsen
>> Tarjei Huse
>> Mobil: 920 63 413
>>
>>
>


-- 
Regards / Med vennlig hilsen
Tarjei Huse
Mobil: 920 63 413

RE: Spellcheck compounded words

2011-07-26 Thread O. Klein

Using ShingleFilterFactory and PositionFilterFactory I get some results, but
never as a useful collation.

So I tried to see what the results with spellcheck.maxCollations=2 would be, but
I never got this to work, neither on 3.3 nor on 4.0. Even lowering
maxCollationEvaluations had no effect. I either never get a response from Solr
or get an OOM exception.

Anyone else experiencing this?




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Spellcheck-compounded-words-tp3192748p3200418.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrJ and class versions

2011-07-26 Thread Martijn v Groningen

I agree! It should be noted in the documentation.
I just wanted to say that SolrJ doesn't depend on Java serialization, but
uses its own serialization:
http://lucene.apache.org/solr/api/solrj/org/apache/solr/common/util/JavaBinCodec.html

Martijn

On 26 July 2011 15:31, Tarjei Huse  wrote:

> On 07/26/2011 09:26 AM, Martijn v Groningen wrote:
> > Where you upgrading from Solr 1.4?
> Yep.
> > SolrJ uses by default for querying the javabin format (wt parameter).
> > The javabin format is not compatible between 1.4 and 3.1 and above.
> > So If your clients where running with SolrJ 1.4 versions I would expect
> > errors to occur.
> I understood the error when I saw it, but I think the version dependency
> should be noted in the SolrJ manual.
> Regards,
> T
> > Martijn
> >
> > On 25 July 2011 12:15, Tarjei Huse  wrote:
> >
> >> Hi, I recently went through a little hell when I upgraded my Solr
> >> servers to 3.2.0. What I didn't anticipate was that my Java SolrJ
> >> clients depend on the server version.
> >>
> >> I would like to add a note about this in the SolrJ docs:
> >> http://wiki.apache.org/solr/Solrj#Streaming_documents_for_an_update
> >>
> >> Any comments with regard to this?
> >>
> >> Are all SolrJ methods dependent on Java Serialization and thus class
> >> versions?
> >>
> >> --
> >> Regards / Med vennlig hilsen
> >> Tarjei Huse
> >> Mobil: 920 63 413
> >>
> >>
> >
>
>
> --
> Regards / Med vennlig hilsen
> Tarjei Huse
> Mobil: 920 63 413
>
>


-- 
Met vriendelijke groet,

Martijn van Groningen

Re: Solr vs ElasticSearch

2011-07-26 Thread Peter

Have a look:

http://stackoverflow.com/questions/2271600/elasticsearch-sphinx-lucene-solr-xapian-which-fits-for-which-usage

http://karussell.wordpress.com/2011/05/12/elasticsearch-vs-solr-lucene/

Regards,
Peter.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-vs-ElasticSearch-tp3009181p3200492.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Rounding errors in solr

2011-07-26 Thread Brian Lamb

Is this possible to do? If so, how?

On 7/25/11, Brian Lamb  wrote:
> Yes and that's causing some problems in my application. Is there a way to
> truncate the 7th decimal place in regards to sorting by the score?
>
> On Fri, Jul 22, 2011 at 4:27 PM, Yonik Seeley
> wrote:
>
>> On Fri, Jul 22, 2011 at 4:11 PM, Brian Lamb
>>  wrote:
>> > I've noticed some peculiar scoring issues going on in my application.
>> > For
>> > example, I have a field that is multivalued and has several records
>> > that
>> > have the same value. For example,
>> >
>> > 
>> >  National Society of Animal Lovers
>> >  Nat. Soc. of Ani. Lov.
>> > 
>> >
>> > I have about 300 records with that exact value.
>> >
>> > Now, when I do a search for references:(national society animal
>> > lovers),
>> I
>> > get the following results:
>> >
>> > 252
>> > 159
>> > 82
>> > 452
>> > 105
>> >
>> > When I do a search for references:(nat soc ani lov), I get the results
>> > ordered differently:
>> >
>> > 510
>> > 122
>> > 501
>> > 82
>> > 252
>> >
>> > When I load all the records that match, I notice that at some point,
>> > the
>> > scores aren't the same but differ by only a little:
>> >
>> > 1.471928 in one and the one before it was 1.471929
>>
>> 32 bit floats only have 7 decimal digits of precision, and in floating
>> point land (a+b+c) can be slightly different than (c+b+a)
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>

RE: Spellcheck compounded words

2011-07-26 Thread Dyer, James

If you're getting OOM's, double-check that you're on 3.3.  There was a nasty 
bug in 3.0 - 3.2 that would cause OOM in conjunction with spellcheck collations 
in some cases.  Ditto if Solr hangs as you might be in a Garbage Collection 
"loop".  If you have your jvm running with verbose gc's you'll see for sure in 
the server logs if this is happening.

With that said, collations shouldn't cause memory problems with 3.3.  Also, 
"maxCollationEvaluations" really is just to be sure the query doesn't run too 
long looking for spell correction possibilities.  It shouldn't affect memory 
usage, which will be low in any case (on 3.3).  

(although if you are getting OOMs on 3.3 and if you're pretty sure your heap is 
big enough, please post a stack trace!)

You might want to test some queries with all of these parameters enabled:

spellcheck=true
spellcheck.count=10
spellcheck.extendedResults=true
spellcheck.collate=true
spellcheck.collateExtendedResults=true
spellcheck.maxCollationTries=10
spellcheck.maxCollations=1

...then run some test queries and check the spelling response.  This will 
show you all of the individual word possibilities and then below that you'll get 
a collation if it could find a combination that can return hits.  Then note:

- If you get nothing from spellcheck, be sure you did a "spellcheck.build" 
since the last restart (or since you committed your data).

- If the "correct" version of one of your misspelled words isn't in the lists 
in the first section, try a higher "spellcheck.count".  However, if the misspelled 
word itself is in the index, there is no hope, because Solr won't suggest a 
correction for a word that is in the index (but see 
https://issues.apache.org/jira/browse/SOLR-2585).

- If you see all the corrections in the individual lists, but not in a 
collation, try increasing "maxCollationTries" and/or "maxCollations" and see if 
it suggests it.  If all else fails, set "maxCollationTries" to zero and 
"maxCollations" to something higher.  Just keep in mind that with 
"maxCollationTries" at zero, the collations aren't guaranteed to return any 
hits.

- I'm not so sure shingles will work with the collation feature at all.

- I've heard that when using shingles, you have to put the query in 
"spellcheck.q" to get it to work.  But I've never used shingles with spellcheck 
before so I'm not sure.
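
The test parameters listed above can be assembled as in the following sketch. The parameter names and values are the ones from this message; the helper name, the sample query text, and the override values are hypothetical.

```python
from urllib.parse import urlencode

# Parameters suggested above for inspecting suggestions and collations.
SPELLCHECK_PARAMS = {
    "spellcheck": "true",
    "spellcheck.count": 10,
    "spellcheck.extendedResults": "true",
    "spellcheck.collate": "true",
    "spellcheck.collateExtendedResults": "true",
    "spellcheck.maxCollationTries": 10,
    "spellcheck.maxCollations": 1,
}


def spellcheck_query(q, overrides=None):
    """Return a query string with the spellcheck test parameters applied."""
    params = {"q": q, **SPELLCHECK_PARAMS}
    params.update(overrides or {})
    return urlencode(params)


# Default test query.
print(spellcheck_query("misspeled wrds"))

# Last-resort variant: no hit verification, more collations returned.
fallback = spellcheck_query(
    "misspeled wrds",
    {"spellcheck.maxCollationTries": 0, "spellcheck.maxCollations": 5},
)
print(fallback)
```

With maxCollationTries at zero the collations are not checked against the index, so they are not guaranteed to return hits.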

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: O. Klein [mailto:kl...@octoweb.nl] 
Sent: Tuesday, July 26, 2011 9:07 AM
To: solr-user@lucene.apache.org
Subject: RE: Spellcheck compounded words

Using ShingleFilterFactory and PositionFilterFactory I get some results, but
never as a useful collation.

So I tried to see what results with spellcheck.maxCollations=2 would be, but
I never got this to work. not on 3.3 nor 4.0. Even lowering
maxCollationEvaluations had no effect. I never get a response from Solr. Or
an OOM exception.

Anyone else experiencing this?




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Spellcheck-compounded-words-tp3192748p3200418.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Preserve XML hierarchy

2011-07-26 Thread Lucas Miguez

Hi, I can now finally get all the field names of each document using the
Luke Request Handler (http://wiki.apache.org/solr/LukeRequestHandler),
and by making HTTP requests to Solr I can get all the fields that contain
the word that I am searching for.
I'll keep looking for a better solution.

Thanks!

Regards

2011/7/15 Gora Mohanty
> On Thu, Jul 14, 2011 at 8:43 PM, Lucas Miguez  wrote:
>> Thanks for your help!
>>
>> DIH XPathEntityProcessor helps me to index the XML Files, but, does it
>> help to me to know from where the node comes? Following the example in
>> my previous post:
>>
 example: Imagine that the user search the word "zona", then I have to
 show the TitleP, the TextP, the TitlePart, the TextPart and all the
 TextSubPart that are childs of gSubPart.
>>
>> Well, I tried to create TextPart, TitlePart, etc with the XPath
>> expression of the location in the original XML, using dynamic fields,
>> for example:
>> 
>
> There should not be a space between "TextPart" and "*"
>
>> to have the XPath associated with the field, but I don't know how to
>> search in all "TextPart *" fields...
> [...]
>
> You can search in individual fields, e.g., with ?q=TitlePart:myterm.
> For searching in all "TextPart*" fields, the easiest way probably is
> to copy the fields into a full-text search field. With the default Solr
> schema, this can be done by adding a directive like
>   
> This copies all fields into the field "text", which is searched by
> default. Thus, ?q=myterm will find "myterm" in all "TextPart*"
> fields.
>
> Regards,
> Gora
>

Cant get Synonym working

2011-07-26 Thread Andy Newby

Hi,

I'm coming back to trying to get the synonyms working (alongside the
spellchecker). Here is what I have:


  




  
  





  


Then in synonyms.txt, I have:

pixima => pixma
cellpne => cellphone
computer => laptop

, car, food, 

aaa, food

However, after restarting Solr and then searching for
"aaa", my test result, which contains "food", never seems
to come up.

Anyone got any ideas?

The annoying thing is that it works fine on a 2nd "dev" install I have (much
simpler, and not using multicore), but using the exact same fieldType setup.

Can anyone shed any light?

TIA!
-- 
Andy Newby
a...@ultranerds.com

Re: Rounding errors in solr

2011-07-26 Thread Yonik Seeley

On Mon, Jul 25, 2011 at 10:12 AM, Brian Lamb
 wrote:
> Yes and that's causing some problems in my application. Is there a way to
> truncate the 7th decimal place in regards to sorting by the score?

Not built in.
With some Java coding, you could create a post filter that manipulates scores.

http://wiki.apache.org/solr/CommonQueryParameters#Caching_of_filters
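
A small sketch of the floating-point effect discussed in this thread, emulating 32-bit floats with Python's struct module. The values of a, b, and c are chosen purely for illustration. The score_key helper shows one possible client-side workaround (rounding scores before comparing them); it is an assumption of this sketch, not a Solr feature, and two scores that straddle a rounding boundary will still compare as different.

```python
import struct


def f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def add32(x, y):
    """Add two values the way 32-bit float arithmetic would."""
    return f32(f32(x) + f32(y))


# The same three numbers summed in two different orders.
a, b, c = 1e8, -1e8, 1.0
left = add32(add32(a, b), c)   # (a + b) + c
right = add32(a, add32(b, c))  # a + (b + c)
print(left, right)


def score_key(score, places=5):
    """Hypothetical client-side sort key: ignore digits past `places`."""
    return round(score, places)


# The two scores from the original report become equal keys.
print(score_key(1.471928) == score_key(1.471929))
```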

-Yonik
http://www.lucidimagination.com



> On Fri, Jul 22, 2011 at 4:27 PM, Yonik Seeley 
> wrote:
>>
>> On Fri, Jul 22, 2011 at 4:11 PM, Brian Lamb
>>  wrote:
>> > I've noticed some peculiar scoring issues going on in my application.
>> > For
>> > example, I have a field that is multivalued and has several records that
>> > have the same value. For example,
>> >
>> > 
>> >  National Society of Animal Lovers
>> >  Nat. Soc. of Ani. Lov.
>> > 
>> >
>> > I have about 300 records with that exact value.
>> >
>> > Now, when I do a search for references:(national society animal lovers),
>> > I
>> > get the following results:
>> >
>> > 252
>> > 159
>> > 82
>> > 452
>> > 105
>> >
>> > When I do a search for references:(nat soc ani lov), I get the results
>> > ordered differently:
>> >
>> > 510
>> > 122
>> > 501
>> > 82
>> > 252
>> >
>> > When I load all the records that match, I notice that at some point, the
>> > scores aren't the same but differ by only a little:
>> >
>> > 1.471928 in one and the one before it was 1.471929
>>
>> 32 bit floats only have 7 decimal digits of precision, and in floating
>> point land (a+b+c) can be slightly different than (c+b+a)
>>
>> -Yonik
>> http://www.lucidimagination.com
>
>

Re: Cant get Synonym working

2011-07-26 Thread Emmanuel Espina

Well, it appears to be some issue with the analysis. You can check
http://localhost:8983/solr/admin/analysis.jsp (the admin page of your
instance, the analysis section) to see how the analysis is applied and see
the end result of "aaa".
You should check both the index and the query analysis to see if what you
wanted is actually what is being performed in Solr.

If you search for "food", does Solr find the correct documents?

Does your other instance (the one that works fine) contain the exact same
documents? Are the indexes identical? Remember that after applying changes
to the index analysis chain you must reindex all the documents.
Otherwise, the already indexed documents will keep the terms produced by the
previous analysis; restarting Solr after changes in the indexing
analysis chain is not enough.
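
As a reference for what the analysis page should show, here is a rough sketch of how the two rule forms in synonyms.txt are usually interpreted: explicit "=>" mappings replace the left side with the right side, and comma-separated groups map every member to the whole group when expand is true (or to the first member when it is false). This is a simplified model for checking expectations, not the actual SynonymFilterFactory code; the function name is hypothetical and the rules are the readable ones from the original question.

```python
def parse_synonyms(lines, expand=True):
    """Build a token -> replacement-tokens table from synonym rules."""
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=>" in line:
            # Explicit mapping: left-hand tokens are replaced by the right side.
            lhs, rhs = line.split("=>", 1)
            sources = [t.strip() for t in lhs.split(",") if t.strip()]
            targets = [t.strip() for t in rhs.split(",") if t.strip()]
        else:
            # Equivalence group: behaviour depends on the expand setting.
            group = [t.strip() for t in line.split(",") if t.strip()]
            sources = group
            targets = group if expand else group[:1]
        for source in sources:
            bucket = mapping.setdefault(source, [])
            for target in targets:
                if target not in bucket:
                    bucket.append(target)
    return mapping


rules = [
    "pixima => pixma",
    "cellpne => cellphone",
    "computer => laptop",
    "aaa, food",
]
table = parse_synonyms(rules)
print(table["aaa"])
```

If the analysis page shows "aaa" producing only "aaa" at query time, the synonym file in use by that core is probably not the one being edited, or the filter is missing from that analyzer.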

Thanks
Emmanuel

2011/7/26 Andy Newby 

> Hi,
>
> I'm coing back to trying to get the Synonynms working (alongside the
> spellchecker). Here is what I have:
>
> 
>  
>
> words="stopwords.txt"/>
>
>
>  
>  
>
> ignoreCase="true" expand="true"/>
> words="stopwords.txt"/>
>
>
>  
> 
>
> Then in synonyms.txt, I have:
>
> pixima => pixma
> cellpne => cellphone
> computer => laptop
>
> , car, food, 
>
> aaa, food
>
> However, after restarting Solr and then trying a search for
> "aaa" , my test result which contains "food" in, never
> seems
> to come up
>
> Anyone got any ideas?
>
> The annoying thing is that it works fine on a 2nd "dev" install I have
> (much
> simpler, and not using multicore), but using the exact same fieldType
> setup.
>
> Can anyone shed any light?
>
> TIA!
> --
> Andy Newby
> a...@ultranerds.com
>

Schema.xml Change...

2011-07-26 Thread Vignesh.v

Dear Team,

   We tried changing the schema.xml to the user xml format, but it 
shows an error. Kindly give me a solution to carry out this process.

Thank You.

Regards,
Vignesh.V

Re: SolrJ and class versions

2011-07-26 Thread Chris Hostetter


: Hi, I recently went through a little hell when I upgraded my Solr
: servers to 3.2.0. What I didn't anticipate was that my Java SolrJ
: clients depend on the server version.
: 
: I would like to add a note about this in the SolrJ docs:
: http://wiki.apache.org/solr/Solrj#Streaming_documents_for_an_update

I don't know that it really makes sense to add to that section, but if 
you'd like, add some mention to the "Setting the RequestWriter" section 
(where the binary / xml format choice is mentioned).

In general I think people haven't called it out specifically on the wiki 
because it's not a problem that should affect most people except as an 
upgrade issue, and all known upgrade issues are enumerated in the 
CHANGES.txt for the version where the issue arises.

The format change was specifically called out in the CHANGES.txt for 3.1...

* The Solr JavaBin format has changed as of Solr 3.1. If you are using the
  JavaBin format, you will need to upgrade your SolrJ client. (SOLR-2034)



-Hoss

Re: Removing the unwanted Debug messages - Wire.java

2011-07-26 Thread Chris Hostetter


: I built a web application in Java/JSP, which calls a Solr Servlet during
...
: An example Debug line on my console looks like this:
: DEBUG ["http-bio-8080"-exec-1] (Wire.java:70) - << "Quick recipe finder[\n]"
...
: Are there any suggestions on how to work around with this, since the source
: of Wire.java, I guess, is not accessible?

Wire.java is not anything that comes with Solr.

A quick Google search suggests that this message is coming from 
commons-httpclient (in your client app), which uses commons-logging.

I suggest you consult the docs for whatever logging framework you are 
using on disabling DEBUG messages.

For example, if you are using JDK Logging...
http://download.oracle.com/javase/1.4.2/docs/guide/util/logging/overview.html



-Hoss

Re: Schema.xml Change...

2011-07-26 Thread Gora Mohanty

On Tue, Jul 26, 2011 at 3:55 PM, Vignesh.v  wrote:
> Dear Team,
>
>               We tried changing the schema.xml to the user xml format but it 
> shows error.Kindly give me a solution to carry out this process.
[...]

Sorry, what does that mean exactly? Please provide details
of what you tried, and what errors were shown in the log files.

This page might help: http://wiki.apache.org/solr/UsingMailingLists

Regards,
Gora

Re: no match or wrong match results

2011-07-26 Thread Chris Hostetter


: Hi, you are not giving us much information. What's your default operator?
: What do you mean with "results are not correct"?

To elaborate, please note...
http://wiki.apache.org/solr/UsingMailingLists


: 
: On Tue, Jul 26, 2011 at 3:04 AM, deniz  wrote:
: 
: > Here is the situation..
: >
: > when i make search with 3 or more words, the results are corret, however if
: > i make a search by using only one word or two, there is no result, altough
: > there must be...
: >
: > e.g
: > query = stephan ruhl germany munich
: > results are correct, documents with the words above retrieved
: >
: > however
: >
: > query = stephan ruhl
: > results are not correct or even there is no result while some of the
: > matchinf documents must be shown.
: >
: >
: > any ideas about the issue?
: >
: > -
: > Zeki ama calismiyor... Calissa yapar...
: > --
: > View this message in context:
: > 
http://lucene.472066.n3.nabble.com/no-match-or-wrong-match-results-tp3199554p3199554.html
: > Sent from the Solr - User mailing list archive at Nabble.com.
: >
: 

-Hoss

RE: Spellcheck compounded words

2011-07-26 Thread O. Klein

I'm using 4.0 for testing this.

I'm not sure what to expect, but as soon as I increase maxCollationTries to 1
or more, even with maxCollationEvaluations set to a low value like 10, it just
hangs.

With maxCollationTries set to 0 it works just fine.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Spellcheck-compounded-words-tp3192748p3200846.html
Sent from the Solr - User mailing list archive at Nabble.com.

RE: Spellcheck compounded words

2011-07-26 Thread Dyer, James

It sounds like that could be a bug.  Could you provide some details on how 
you're building your dictionary (config snippets), and what parameters you're 
using to query, etc. ?  Your jvm settings and a rough estimate of how big your 
index is would be helpful too.  It would be nice to try and figure out if this 
is a bug and if so, then try and fix it.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: O. Klein [mailto:kl...@octoweb.nl] 
Sent: Tuesday, July 26, 2011 11:37 AM
To: solr-user@lucene.apache.org
Subject: RE: Spellcheck compounded words

Im using 4.0 for testing this.

Im not sure what to expect, but as soon as I increase maxCollationTries to 1
or more, even with maxCollationEvaluations set to low value like 10 it just
hangs.

With maxCollationTries set to 0 it works just fine.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Spellcheck-compounded-words-tp3192748p3200846.html
Sent from the Solr - User mailing list archive at Nabble.com.

RE: please help explaining debug output

2011-07-26 Thread Robert Petersen

That didn't help.  This seems like another case where I should get matches but don't, 
and this time it is only for some documents.  Others with similar content do 
match just fine.  The debug output 'explain other' section for a non-matching 
document seems to say the term frequency is 0 for my problematic term, although 
I know it is in the content.  

I ended up making a synonym to do what the analysis stack *should* be doing: 
splitting LaserJet on case changes.  I.e., putting "LaserJet, laser jet" in synonyms 
at index time makes this work.  I don't know why, though.

Question:  Does this debug output mean it is matching the terms, but the term 
frequency vector is returning 0 for the frequency of this term?  I.e., does this 
mean the term is in the doc but not in the tf array?

0.0 = no match on required clause (moreWords:"laser jet")
  0.0 = weight(moreWords:"laser jet" in 32497), product of:
    0.60590804 = queryWeight(moreWords:"laser jet"), product of:
      14.597603 = idf(moreWords: laser=26731 jet=12685)
      0.041507367 = queryNorm
    0.0 = fieldWeight(moreWords:"laser jet" in 32497), product of:
      0.0 = tf(phraseFreq=0.0)
      14.597603 = idf(moreWords: laser=26731 jet=12685)
      0.078125 = fieldNorm(field=moreWords, doc=32497)
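
A worked sketch of the arithmetic in this explain output, to show why phraseFreq=0.0 forces the whole clause to zero. The formula is inferred from the numbers in the debug output (for example, 1.7320508 is the square root of phraseFreq=3.0 in the matching document quoted further down) and is assumed to follow Lucene's classic TF-IDF similarity; the function name is hypothetical.

```python
import math


def phrase_weight(phrase_freq, idf, query_norm, field_norm):
    """Recompute weight = queryWeight * fieldWeight from explain values."""
    tf = math.sqrt(phrase_freq)            # tf(phraseFreq)
    query_weight = idf * query_norm        # queryWeight
    field_weight = tf * idf * field_norm   # fieldWeight
    return query_weight * field_weight


# Non-matching document 32497: the phrase "laser jet" was never found as
# adjacent terms, so tf is 0 and the product collapses to 0.
non_matching = phrase_weight(0.0, 14.597603, 0.041507367, 0.078125)

# Matching document 6667236, clause "p 1102 w" with phraseFreq=3.0.
matching = phrase_weight(3.0, 19.166107, 0.041507367, 0.09375)

print(non_matching, matching)
```

Read this way, the output says the individual terms exist in the index (the idf line lists counts for laser and jet), but in this document they do not occur next to each other as the phrase query requires.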


-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Monday, July 25, 2011 3:28 PM
To: solr-user@lucene.apache.org
Subject: Re: please help explaining debug output

Hmmm, I can't find a convenient 1.4.0 to download, but re-indexing is a good
idea since this seems like it *should* work.

Erick

On Mon, Jul 25, 2011 at 5:32 PM, Robert Petersen  wrote:
> I'm still on solr 1.4.0 and the analysis page looks like they should match, 
> and other products with the same content do in fact match.  I'm reindexing 
> the non-matching ones to rule that out.
>
> -Original Message-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Monday, July 25, 2011 1:58 PM
> To: solr-user@lucene.apache.org
> Subject: Re: please help explaining debug output
>
> Hmmm, I'm assuming that moreWords is your default text field, yes?
>
> But it works for me (tm), using 1.4.1. What version of Solr are you on?
>
> Also, take a glance at the admin/analysis page, that might help...
>
> Gotta run
>
> Erick
>
> On Mon, Jul 25, 2011 at 4:52 PM, Robert Petersen  wrote:
>> Sorry, to clarify a search for P1102W matches all three docs but a
>> search for p1102w LaserJet only matches the second two.  Someone asked
>> me a question while I was typing and I got distracted, apologies for any
>> confusion.
>>
>> -Original Message-
>> From: Robert Petersen [mailto:rober...@buy.com]
>> Sent: Monday, July 25, 2011 1:42 PM
>> To: solr-user@lucene.apache.org
>> Subject: please help explaining debug output
>>
>> I have three documents with the following product titles in a text field
>> called moreWords with analysis stack matching the solr example text
>> field definition.
>>
>>
>>
>> 1.       HP LaserJet P1102W Monochrome Laser Printer
>>
>> 2.       HP CE285A (85A) Remanufactured Black Toner Cartridge for
>> LaserJet M1212nf, P1102, P1102W Series
>>
>> 3.       Black HP CE285A Toner Cartridge For LaserJet P1102W, LaserJet
>> M1130, LaserJet M1132, LaserJet M1210
>>
>>
>>
>> A search for P1102W matches (2) and (3), but not (1) above.  Can someone
>> explain the debug output?  It looks like I am getting a non-match on (1)
>> because term frequency is zero?  Am I reading that right?  If so, how
>> could that be? the searched terms are equivalently in all three docs.  I
>> don't get it.
>>
>>
>>
>>
>>
>> 
>>
>> p1102w LaserJet 
>>
>> p1102w LaserJet 
>>
>> +PhraseQuery(moreWords:"p 1102 w")
>> +PhraseQuery(moreWords:"laser jet")
>>
>> +moreWords:"p 1102 w" +moreWords:"laser
>> jet"
>>
>> 
>>
>> 
>>
>> 3.64852 = (MATCH) sum of:
>>
>>  2.4758534 = weight(moreWords:"p 1102 w" in 6667236), product of:
>>
>>    0.7955347 = queryWeight(moreWords:"p 1102 w"), product of:
>>
>>      19.166107 = idf(moreWords: p=189166 1102=1135 w=445720)
>>
>>      0.041507367 = queryNorm
>>
>>    3.1121879 = fieldWeight(moreWords:"p 1102 w" in 6667236), product
>> of:
>>
>>      1.7320508 = tf(phraseFreq=3.0)
>>
>>      19.166107 = idf(moreWords: p=189166 1102=1135 w=445720)
>>
>>      0.09375 = fieldNorm(field=moreWords, doc=6667236)
>>
>>  1.1726664 = weight(moreWords:"laser jet" in 6667236), product of:
>>
>>    0.60590804 = queryWeight(moreWords:"laser jet"), product of:
>>
>>      14.597603 = idf(moreWords: laser=26731 jet=12685)
>>
>>      0.041507367 = queryNorm
>>
>>    1.935386

RE: Spellcheck compounded words

2011-07-26 Thread O. Klein

I will try to duplicate the behavior in 3.3, as I can't get logging to a file
working in 4.0 the way it works in other releases (Solr logging:
http://globalgateway.wordpress.com/2010/01/06/configuring-solr-1-4-logging-with-log4j-in-tomcat/).
Maybe you know how to fix this?

Config is pretty normal I think:

  

   
  solr.IndexBasedSpellChecker
  default
  org.apache.lucene.search.spell.JaroWinklerDistance
  text_spell
  ./spellchecker
  0.7
  .001 
  true

  



  








--
View this message in context: 
http://lucene.472066.n3.nabble.com/Spellcheck-compounded-words-tp3192748p3200945.html
Sent from the Solr - User mailing list archive at Nabble.com.

Severe errors in solr configuration

2011-07-26 Thread Xue-Feng Yang

Hi all,

I'm new to Solr.

I installed Solr 3.3 with GlassFish 3.1 on Ubuntu 10.04.

It worked fine until I enabled the security manager in GlassFish, which I did
because I don't want everyone to be able to reach Solr's admin page. The
error message was as follows.

Severe errors in solr configuration.

Check your log files for more detailed information on what may be wrong.

If you want solr to continue after configuration errors, change: 

 <abortOnConfigurationError>false</abortOnConfigurationError>

in solr.xml

-
java.security.AccessControlException: access denied 
(javax.management.MBeanServerPermission findMBeanServer)

.

Any help is welcome.

Thanks

Re: Spellcheck compounded words

2011-07-26 Thread Markus Jelsma


> I will try to duplicate the behavior in 3.3 as I cant get logging to file
> working in 4.0 like in other releases
> http://globalgateway.wordpress.com/2010/01/06/configuring-solr-1-4-logging-
> with-log4j-in-tomcat/ Solr logging  (maybe you know how to fix this?)

You're most likely caught by the upgrade of slf4j. Check catalina.out; it'll
tell you that your versions are out of date or complain about a static logger
binding.

> 
> Config is pretty normal I think:
> 
>name="spellcheckComponent">
> 
>
>   solr.IndexBasedSpellChecker
>   default
>name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance str> text_spell
>   ./spellchecker
>   0.7
>   .001
>   true
> 
>   
> 
>  positionIncrementGap="100" stored="false" multiValued="true">
> 
>   
> 
>  words="stopwordsSpell.txt"/>
> 
> 
> 
> 
> 
> 
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Spellcheck-compounded-words-tp3192748p3
> 200945.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: problem with "?" wild card searches in solr

2011-07-26 Thread Dmitry Kan

You can use the Solr analysis tool on the admin page to see how analysis is
done at index time and at query time for a specific term.

On Sat, Jul 23, 2011 at 1:33 PM, Romi  wrote:

> I am using solr for search . i am facing problem with wildcard searches.
> when i search for dia?mond i get result for diamond
> but when i search for ban?le i get no result.
>
> what can be the problem
>
> -
> Thanks & Regards
> Romi
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/problem-with-wild-card-searches-in-solr-tp3193222p3193222.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Regards,

Dmitry Kan

Re: Severe errors in solr configuration

2011-07-26 Thread Kyle Lee

Could you provide the relevant sections of the logs pertaining to this
error?

On Tue, Jul 26, 2011 at 12:13 PM, Xue-Feng Yang  wrote:

> Hi all,
>
> I'm new to solr.
>
> I installed solr 3.3 with glassfish 3.1 in ubuntu 10.4.
>
> It works fine until I set security manager in glassfish since I don't want
> to everyone can reach the solr's admin page. The error message was as
> follows.
>
> Severe errors in solr configuration.
>
> Check your log files for more detailed information on what may be wrong.
>
> If you want solr to continue after configuration errors, change:
>
>  <abortOnConfigurationError>false</abortOnConfigurationError>
>
> in solr.xml
>
> -
> java.security.AccessControlException: access denied
> (javax.management.MBeanServerPermission findMBeanServer)
>
> .
>
> Any help is welcome.
>
> Thanks

Re: Severe errors in solr configuration

2011-07-26 Thread Xue-Feng Yang

Here is the message from server.log

[#|2011-07-26T12:17:37.591-0400|SEVERE|glassfish3.1|org.apache.solr.core.CoreContainer|_ThreadID=10;_ThreadName=Thread-1;|java.security.AccessControlException: access denied (javax.management.MBeanServerPermission findMBeanServer)
    at java.security.AccessControlContext.checkPermission(AccessControlContext.java:374)
    at java.security.AccessController.checkPermission(AccessController.java:546)
    at java.lang.SecurityManager.checkPermission(SecurityManager.java:532)
    at javax.management.MBeanServerFactory.checkPermission(MBeanServerFactory.java:393)
    at javax.management.MBeanServerFactory.findMBeanServer(MBeanServerFactory.java:343)
    at org.apache.solr.core.JmxMonitoredMap.<init>(JmxMonitoredMap.java:70)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:532)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:463)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:130)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94)
    at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:266)
    at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:120)
    at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4625)
    at org.apache.catalina.core.StandardContext.start(StandardContext.java:5316)
    at com.sun.enterprise.web.WebModule.start(WebModule.java:500)
    at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:917)
    at org.apache.catalina.core.ContainerBase.access$000(ContainerBase.java:148)
    at org.apache.catalina.core.ContainerBase$PrivilegedAddChild.run(ContainerBase.java:170)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:899)
    at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:755)
    at com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1980)
    at com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1630)
    at com.sun.enterprise.web.WebApplication.start(WebApplication.java:100)
    at org.glassfish.internal.data.EngineRef.start(EngineRef.java:130)
    at org.glassfish.internal.data.ModuleInfo.start(ModuleInfo.java:269)
    at org.glassfish.internal.data.ApplicationInfo.start(ApplicationInfo.java:286)
    at com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:461)
    at com.sun.enterprise.v3.server.ApplicationLoaderService.processApplication(ApplicationLoaderService.java:364)
    at com.sun.enterprise.v3.server.ApplicationLoaderService.postConstruct(ApplicationLoaderService.java:208)
    at com.sun.hk2.component.AbstractCreatorImpl.inject(AbstractCreatorImpl.java:131)
    at com.sun.hk2.component.ConstructorCreator$1.run(ConstructorCreator.java:86)
    at java.security.AccessController.doPrivileged(Native Method)
    at com.sun.hk2.component.ConstructorCreator.initialize(ConstructorCreator.java:83)
    at com.sun.hk2.component.AbstractCreatorImpl.get(AbstractCreatorImpl.java:82)
    at com.sun.hk2.component.SingletonInhabitant.get(SingletonInhabitant.java:67)
    at com.sun.hk2.component.EventPublishingInhabitant.get(EventPublishingInhabitant.java:139)
    at com.sun.hk2.component.AbstractInhabitantImpl.get(AbstractInhabitantImpl.java:76)
    at com.sun.enterprise.v3.server.AppServerStartup.run(AppServerStartup.java:243)
    at com.sun.enterprise.v3.server.AppServerStartup.start(AppServerStartup.java:135)
    at com.sun.enterprise.glassfish.bootstrap.GlassFishImpl.start(GlassFishImpl.java:79)
    at com.sun.enterprise.glassfish.bootstrap.GlassFishMain$Launcher.launch(GlassFishMain.java:117)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.sun.enterprise.glassfish.bootstrap.GlassFishMain.main(GlassFishMain.java:97)
    at com.sun.enterprise.glassfish.bootstrap.ASMain.main(ASMain.java:55)
|#]





From: Kyle Lee 
To: solr-user@lucene.apache.org
Sent: Tuesday, July 26, 2011 1:50:54 PM
Subject: Re: Severe errors in solr configuration

Could you provide the relevant sections of the logs pertaining to this
error?

On Tue, Jul 26, 2011 at 12:13 PM, Xue-Feng Yang  wrote:

> Hi all,
>
> I'm new to solr.
>
> I installed solr 3.3 with glassfish 3.1 in ubuntu 10.4.
>
> It works fine until I set security manager in glassfish since I don't want
> to everyone can reach the solr's admin page. The error message was as
> follows.
>
> Severe error

Re: Severe errors in solr configuration

2011-07-26 Thread Andrea Gazzarini

I don't know GlassFish; the error you're reporting is a low-level security 
exception (method access) and doesn't seem to be related to web application 
(JAAS) security.

Did you change the web.xml of the Solr WAR to include security constraints, 
security collections, login-config, roles and so on?

But again...I don't know 
-Original Message-
From: Xue-Feng Yang 
Date: Tue, 26 Jul 2011 10:13:41 
To: solr-user@lucene.apache.org
Reply-To: solr-user@lucene.apache.org
Subject: Severe errors in solr configuration

Hi all,

I'm new to solr. 

I installed solr 3.3 with glassfish 3.1 in ubuntu 10.4.

It works fine until I set security manager in glassfish since I don't want to 
everyone can reach the solr's admin page. The error message was as follows.

Severe errors in solr configuration.

Check your log files for more detailed information on what may be wrong.

If you want solr to continue after configuration errors, change: 

 <abortOnConfigurationError>false</abortOnConfigurationError>

in solr.xml

-
java.security.AccessControlException: access denied 
(javax.management.MBeanServerPermission findMBeanServer)

.

Any help is welcome.

Thanks

Re: Severe errors in solr configuration

2011-07-26 Thread Andrea Gazzarini

Sorry, my previous email was truncated.

Setting up security for a web application has nothing to do with the security 
manager, which is related to the JVM and low-level permissions.

(Continued from the previous email)

But anyway, I don't know GlassFish or how its security configuration works.

Doing that in JBoss or Tomcat is very simple.

Regards,
Andrea  


-Original Message-
From: "Andrea Gazzarini" 
Date: Tue, 26 Jul 2011 18:24:48 
To: ; Xue-Feng Yang
Reply-To: andrea.gazzar...@atcult.it
Subject: Re: Severe errors in solr configuration

I don't know glassfish; the error you're reporting is a low-level security 
exception (method access) and doesn't seem to be related with web application 
(JAAS) security.

Did you change the web.xml of solr war for including security constraints, 
security collections, login-config, roles and so on?

But again...I don't know 

Re: Severe errors in solr configuration

2011-07-26 Thread Xue-Feng Yang



No, I don't have any information on how to set this up for Solr with GlassFish. 
If anyone has such a document for any other application server, such as Tomcat, 
that would be a great help.




From: Andrea Gazzarini 
To: solr-user@lucene.apache.org; Xue-Feng Yang 
Sent: Tuesday, July 26, 2011 2:24:48 PM
Subject: Re: Severe errors in solr configuration

I don't know glassfish; the error you're reporting is a low-level security 
exception (method access) and doesn't seem to be related with web application 
(JAAS) security.

Did you change the web.xml of solr war for including security constraints, 
security collections, login-config, roles and so on?

But again...I don't know 

Re: Severe errors in solr configuration

2011-07-26 Thread Chris Hostetter


: Subject: Severe errors in solr configuration
: References: <1311383488148-3192748.p...@n3.nabble.com>
:  <8f0d0142ca7ecc4287a9ec1bd8cb880c17c6a26...@uslvdcmbvp01.ingramcontent.com>
:  <201107251713.22614.markus.jel...@openindex.io>
:  <8f0d0142ca7ecc4287a9ec1bd8cb880c17c6a27...@uslvdcmbvp01.ingramcontent.com>
:  <1311689197416-3200418.p...@n3.nabble.com>
:  <8f0d0142ca7ecc4287a9ec1bd8cb880c17c6b09...@uslvdcmbvp01.ingramcontent.com>
:  <1311698218009-3200846.p...@n3.nabble.com>
:  <8f0d0142ca7ecc4287a9ec1bd8cb880c17c6b0a...@uslvdcmbvp01.ingramcontent.com>
: In-Reply-To:
: <8f0d0142ca7ecc4287a9ec1bd8cb880c17c6b0a...@uslvdcmbvp01.ingramcontent.com
: >


http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists

When starting a new discussion on a mailing list, please do not reply to 
an existing message, instead start a fresh email.  Even if you change the 
subject line of your email, other mail headers still track which thread 
you replied to and your question is "hidden" in that thread and gets less 
attention.   It makes following discussions in the mailing list archives 
particularly difficult.


-Hoss

Re: Spellcheck compounded words

2011-07-26 Thread O. Klein

Adding log4j-1.2.16.jar and deleting slf4j-jdk14-1.6.1.jar does not fix
logging for 4.0 for me.

Anyways, tried it on 3.3 and Solr just hangs here also. No logging, no
exceptions.

I'll let you know if I manage to find the source of the problem.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Spellcheck-compounded-words-tp3192748p3201202.html
Sent from the Solr - User mailing list archive at Nabble.com.

Multiple Solr servers and a shared index (again)

2011-07-26 Thread Gaetano Giunta


Hello

I've been looking for a definitive answer to the question: "is it possible to run Solr on multiple servers with a shared index folder instead of the native 
master/slave configuration?"


So far I have found two threads in this ml about the topic:
http://lucene.472066.n3.nabble.com/Multiple-Solr-servers-and-a-shared-index-vs-master-slaves-td934359.html
http://lucene.472066.n3.nabble.com/Shared-index-base-td484439.html

But I am still not 100% sure. From what I understand:
1. having many servers serving reads and one doing writes might work - many 
servers doing writes definitely not
2. using lockType=native is a good idea whenever there is more than one process 
at a time accessing the index file...
3. ...which I suppose means that Solr will use flock() calls. Hence the need 
for a filesystem supporting them (eg. ocfs2 and gfs preferred to nfs)
4. "sending commit commands makes readers re-read the indexes"

About point 4: does this mean only the readers on the current server, or the 
readers on other servers too?
If the former is the case, then this kind of setup is basically only good for an active/passive cluster with fast failover, not for doing master/slave 
load balancing.


Bye
Gaetano

ps: please understand that I'd much rather go with native master/slave or with the upcoming zookeeper-based configurations, but the setup described above is what I 
found installed at a customer's site...

Re: Spellcheck compounded words

2011-07-26 Thread François Schiettecatte

FWIW, here is the process I follow to create a log4j-aware version of the 
Apache Solr war file and the corresponding log4j.properties file.

Have fun :)

François


##
#
# Log4J configuration for SOLR
#
#   http://wiki.apache.org/solr/SolrLogging
#
#
# 1) Download SLF4J:
#   http://www.slf4j.org/
#   http://www.slf4j.org/download.html
#   http://www.slf4j.org/dist/slf4j-1.6.1.tar.gz
#
# 2) Unpack Solr:
#   jar xvf apache-solr-3.3.0.war
#
# 3) Delete:
#   WEB-INF/lib/log4j-over-slf4j-1.6.1.jar
#   WEB-INF/lib/slf4j-jdk14-1.6.1.jar
#
# 4) Copy:
#   slf4j-1.6.1/slf4j-log4j12-1.6.1.jar ->  WEB-INF/lib
#   log4j.properties (this file)        ->  WEB-INF/classes/ (needs to be created)
#
# 5) Pack Solr:
#   jar cvf apache-solr-3.3.0.war admin favicon.ico index.jsp META-INF WEB-INF
#
#
#   Author: Francois Schiettecatte
#   Version:1.0
#
##



##
#
# Logging levels (helpful reminder)
#
# DEBUG < INFO < WARN < ERROR < FATAL
#



##
#
# Logging setup
#

log4j.rootLogger=ERROR, SOLR


# Daily Rolling File Appender (SOLR)
log4j.appender.SOLR=org.apache.log4j.DailyRollingFileAppender
log4j.appender.SOLR.File=${catalina.base}/logs/solr.log
log4j.appender.SOLR.Append=true
log4j.appender.SOLR.Encoding=UTF-8
log4j.appender.SOLR.DatePattern='-'-MM-dd
log4j.appender.SOLR.layout=org.apache.log4j.PatternLayout
log4j.appender.SOLR.layout.ConversionPattern=%d [%t] %-5p %c - %m%n



##
#
# Logging levels for SOLR
#

# Default logging level
log4j.logger.org.apache.solr=ERROR



##




On Jul 26, 2011, at 2:49 PM, O. Klein wrote:

> Adding log4j-1.2.16.jar and deleting slf4j-jdk14-1.6.1.jar does not fix
> logging for 4.0 for me.
> 
> Anyways, tried it on 3.3 and Solr just hangs here also. No logging, no
> exceptions.
> 
> I'll let you know if I manage to find source of problem.
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Spellcheck-compounded-words-tp3192748p3201202.html
> Sent from the Solr - User mailing list archive at Nabble.com.

Re: Spellcheck compounded words

2011-07-26 Thread O. Klein

François Schiettecatte wrote:
> 
> #
> # 4) Copy:
> # slf4j-1.6.1/slf4j-log4j12-1.6.1.jar ->  
> WEB-INF/lib
> # log4j.properties (this file)->  
> WEB-INF/classes/ (needs to be
> created)
> #
> 

Don't you mean log4j-1.2.16/slf4j-log4j12-1.6.1.jar ?

Anyways. I was testing on 3.3 and found that when I added
&spellcheck.maxCollations=2&spellcheck.maxCollationTries=2 as parameters to
the URL there was no problem at all.
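
For anyone wanting to reproduce the URL variant, here is a minimal sketch of how such a request could be assembled. Only the two collation parameters come from the message above; the base URL, the query text and the other spellcheck parameters are assumptions for illustration.

```python
from urllib.parse import urlencode

# Assumed endpoint for illustration; adjust host, port and handler to your setup.
BASE_URL = "http://localhost:8983/solr/select"


def spellcheck_url(query, max_collations=2, max_collation_tries=2):
    """Build a select URL that passes the collation settings as request parameters."""
    params = [
        ("q", query),
        ("spellcheck", "true"),
        ("spellcheck.collate", "true"),
        ("spellcheck.maxCollations", str(max_collations)),
        ("spellcheck.maxCollationTries", str(max_collation_tries)),
    ]
    return BASE_URL + "?" + urlencode(params)


url = spellcheck_url("compounded words")
print(url)
```

Passing the values per request like this keeps them out of solrconfig.xml, which makes it easy to compare against the handler-defaults variant.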

Adding 

  2
  2

to the default requestHandler in solrconfig.xml caused request to hang.

Can someone verify if this is a bug?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Spellcheck-compounded-words-tp3192748p3201332.html
Sent from the Solr - User mailing list archive at Nabble.com.

slave data files way bigger than master

2011-07-26 Thread Jonathan Rochkind


So I've got Solr 1.4.  I've got replication going on.

Once a day, before replication, I optimize on master.  Then I replicate.

I'd expect optimization before replicate would basically replace all 
files on slave, this is expected.


But that means I'd also expect that the index files on slave would be 
identical, and the same size, as on master, after replication, this is 
the point of replication, yes?


But they are not. The master is only 12G, the slave is 39G.  The index 
files in slave and master have completely different filenames too, I 
don't know if that's expected, but it's not what I expected.  I'll post 
complete file lists below.


Anyone have any idea what's going on?  Also... I wonder if these extra 
index files on the slave are just leftovers not even looked at by the slave 
Solr, or if instead they actually ARE included in the index!  If the 
latter, and we have 'ghost' documents in the index, that could explain 
some weird problems I'm having with the slave getting Java out-of-heap-space 
errors due to huge uninverted indexes, even though the index is 
basically the same, with the same solrconfig.xml settings, as it has been 
for a while without such problems.


Greatly appreciate if anyone has any ideas.


MASTER: ls -lh master_index

total 12G
-rw-rw-r-- 1 tomcat tomcat  3.0G Jul 26 06:37 _24p.fdt
-rw-rw-r-- 1 tomcat tomcat   15M Jul 26 06:37 _24p.fdx
-rw-rw-r-- 1 tomcat tomcat   836 Jul 26 06:33 _24p.fnm
-rw-rw-r-- 1 tomcat tomcat  1.2G Jul 26 06:44 _24p.frq
-rw-rw-r-- 1 tomcat tomcat   49M Jul 26 06:44 _24p.nrm
-rw-rw-r-- 1 tomcat tomcat  1.1G Jul 26 06:44 _24p.prx
-rw-rw-r-- 1 tomcat tomcat  7.8M Jul 26 06:44 _24p.tii
-rw-rw-r-- 1 tomcat tomcat  660M Jul 26 06:44 _24p.tis
-rw-rw-r-- 1 tomcat tomcat  2.1G Jul 26 08:54 _2k4.fdt
-rw-rw-r-- 1 tomcat tomcat  7.6M Jul 26 08:54 _2k4.fdx
-rw-rw-r-- 1 tomcat tomcat   836 Jul 26 08:51 _2k4.fnm
-rw-rw-r-- 1 tomcat tomcat  719M Jul 26 08:59 _2k4.frq
-rw-rw-r-- 1 tomcat tomcat   25M Jul 26 08:59 _2k4.nrm
-rw-rw-r-- 1 tomcat tomcat  797M Jul 26 08:59 _2k4.prx
-rw-rw-r-- 1 tomcat tomcat  5.0M Jul 26 08:59 _2k4.tii
-rw-rw-r-- 1 tomcat tomcat  436M Jul 26 08:59 _2k4.tis
-rw-rw-r-- 1 tomcat tomcat  211M Jul 26 09:25 _2n3.fdt
-rw-rw-r-- 1 tomcat tomcat  774K Jul 26 09:25 _2n3.fdx
-rw-rw-r-- 1 tomcat tomcat   836 Jul 26 09:25 _2n3.fnm
-rw-rw-r-- 1 tomcat tomcat   72M Jul 26 09:26 _2n3.frq
-rw-rw-r-- 1 tomcat tomcat  2.5M Jul 26 09:26 _2n3.nrm
-rw-rw-r-- 1 tomcat tomcat   78M Jul 26 09:26 _2n3.prx
-rw-rw-r-- 1 tomcat tomcat  668K Jul 26 09:26 _2n3.tii
-rw-rw-r-- 1 tomcat tomcat   53M Jul 26 09:26 _2n3.tis
-rw-rw-r-- 1 tomcat tomcat  186M Jul 26 09:49 _2q6.fdt
-rw-rw-r-- 1 tomcat tomcat  774K Jul 26 09:49 _2q6.fdx
-rw-rw-r-- 1 tomcat tomcat   836 Jul 26 09:49 _2q6.fnm
-rw-rw-r-- 1 tomcat tomcat   60M Jul 26 09:50 _2q6.frq
-rw-rw-r-- 1 tomcat tomcat  2.5M Jul 26 09:50 _2q6.nrm
-rw-rw-r-- 1 tomcat tomcat   64M Jul 26 09:50 _2q6.prx
-rw-rw-r-- 1 tomcat tomcat  562K Jul 26 09:50 _2q6.tii
-rw-rw-r-- 1 tomcat tomcat   45M Jul 26 09:50 _2q6.tis
-rw-rw-r-- 1 tomcat tomcat  246M Jul 26 10:16 _2t9.fdt
-rw-rw-r-- 1 tomcat tomcat  774K Jul 26 10:16 _2t9.fdx
-rw-rw-r-- 1 tomcat tomcat   836 Jul 26 10:16 _2t9.fnm
-rw-rw-r-- 1 tomcat tomcat   68M Jul 26 10:17 _2t9.frq
-rw-rw-r-- 1 tomcat tomcat  2.5M Jul 26 10:17 _2t9.nrm
-rw-rw-r-- 1 tomcat tomcat   89M Jul 26 10:17 _2t9.prx
-rw-rw-r-- 1 tomcat tomcat  602K Jul 26 10:17 _2t9.tii
-rw-rw-r-- 1 tomcat tomcat   53M Jul 26 10:17 _2t9.tis
-rw-rw-r-- 1 tomcat tomcat  221M Jul 26 10:45 _2wc.fdt
-rw-rw-r-- 1 tomcat tomcat  774K Jul 26 10:45 _2wc.fdx
-rw-rw-r-- 1 tomcat tomcat   836 Jul 26 10:45 _2wc.fnm
-rw-rw-r-- 1 tomcat tomcat   69M Jul 26 10:46 _2wc.frq
-rw-rw-r-- 1 tomcat tomcat  2.5M Jul 26 10:46 _2wc.nrm
-rw-rw-r-- 1 tomcat tomcat   82M Jul 26 10:46 _2wc.prx
-rw-rw-r-- 1 tomcat tomcat  613K Jul 26 10:46 _2wc.tii
-rw-rw-r-- 1 tomcat tomcat   53M Jul 26 10:46 _2wc.tis
-rw-rw-r-- 1 tomcat tomcat   75M Jul 26 11:14 _2y6.fdt
-rw-rw-r-- 1 tomcat tomcat  315K Jul 26 11:14 _2y6.fdx
-rw-rw-r-- 1 tomcat tomcat   11M Jul 26 11:15 _2ze.fdt
-rw-rw-r-- 1 tomcat tomcat   42K Jul 26 11:15 _2ze.fdx
-rw-rw-r-- 1 tomcat tomcat   836 Jul 26 11:14 _2ze.fnm
-rw-rw-r-- 1 tomcat tomcat  157K Jul 26 11:14 _2ze.frq
-rw-rw-r-- 1 tomcat tomcat  6.9K Jul 26 11:14 _2ze.nrm
-rw-rw-r-- 1 tomcat tomcat  201K Jul 26 11:14 _2ze.prx
-rw-rw-r-- 1 tomcat tomcat  3.8K Jul 26 11:14 _2ze.tii
-rw-rw-r-- 1 tomcat tomcat  293K Jul 26 11:14 _2ze.tis
-rw-rw-r-- 1 tomcat tomcat  224M Jul 26 11:14 _2zf.fdt
-rw-rw-r-- 1 tomcat tomcat  774K Jul 26 11:14 _2zf.fdx
-rw-rw-r-- 1 tomcat tomcat   836 Jul 26 11:14 _2zf.fnm
-rw-rw-r-- 1 tomcat tomcat   79M Jul 26 11:15 _2zf.frq
-rw-rw-r-- 1 tomcat tomcat  2.5M Jul 26 11:15 _2zf.nrm
-rw-rw-r-- 1 tomcat tomcat   88M Jul 26 11:15 _2zf.prx
-rw-rw-r-- 1 tomcat tomcat  869K Jul 26 11:15 _2zf.tii
-rw-rw-r-- 1 tomcat tomcat   76M Jul 26 11:15 _2zf.tis
-rw-rw-r-- 1 tomcat tomcat   836 Jul 26 11:14 _2zg.fnm
-rw-rw-r-- 1 tomcat tomcat  71

Re: Spellcheck compounded words

2011-07-26 Thread François Schiettecatte

I get slf4j-log4j12-1.6.1.jar from 
http://www.slf4j.org/dist/slf4j-1.6.1.tar.gz. It is what interfaces slf4j to 
log4j. You will also need to add log4j-1.2.16.jar to WEB-INF/lib.


François 


On Jul 26, 2011, at 3:40 PM, O. Klein wrote:

> 
> François Schiettecatte wrote:
>> 
>> #
>> # 4) Copy:
>> #slf4j-1.6.1/slf4j-log4j12-1.6.1.jar ->  
>> WEB-INF/lib
>> #log4j.properties (this file)->  
>> WEB-INF/classes/ (needs to be
>> created)
>> #
>> 
> 
> Don't you mean log4j-1.2.16/slf4j-log4j12-1.6.1.jar ?
> 
> Anyways. I was testing on 3.3 and found that when I added
> &spellcheck.maxCollations=2&spellcheck.maxCollationTries=2 as parameters to
> the URL there was no problem at all.
> 
> Adding 
> 
>  2
>  2
> 
> to the default requestHandler in solrconfig.xml caused request to hang.
> 
> Can someone verify if this is a bug?
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Spellcheck-compounded-words-tp3192748p3201332.html
> Sent from the Solr - User mailing list archive at Nabble.com.

Re: Multiple Solr servers and a shared index (again)

2011-07-26 Thread Emmanuel Espina

Regarding point 4, you will have to reload the indexes to keep the
instances consistent.
When you perform a commit in Solr you have (for an instant) two versions of
the index. The commit produces new segments (with new documents, new
deletions, etc.). After these new segments are created, a new index searcher
is created and its caches begin to autowarm. At this point the old index
searcher that you were using is still active and receiving requests. After
the new index searcher finishes loading and autowarming, the old searcher is
discarded.
I think that the coordination needed to accomplish the same functionality in
a distributed environment (i.e. having multiple Solr instances accessing the
same files) would be unnecessarily complicated.
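
The commit cycle described above can be sketched as a toy model. This is not Solr code: the class and method names are hypothetical, and a Python set stands in for the index. It only illustrates that the old searcher keeps serving until the warmed new one is swapped in.

```python
import threading


class ToyCore:
    """Toy stand-in for one Solr instance. All names here are hypothetical."""

    def __init__(self, docs=()):
        self._lock = threading.Lock()
        self._pending = set(docs)           # what the writer has accepted so far
        self._searcher = frozenset(docs)    # immutable snapshot serving requests

    def add(self, doc):
        # Not visible to searches until the next commit.
        self._pending.add(doc)

    def search(self, doc):
        with self._lock:
            searcher = self._searcher       # grab whichever searcher is current
        return doc in searcher

    def commit(self, warm_queries=()):
        new_searcher = frozenset(self._pending)   # stands in for the new segments
        for query in warm_queries:                # "autowarming"; old searcher still serves
            _ = query in new_searcher
        with self._lock:
            old_searcher, self._searcher = self._searcher, new_searcher
        return old_searcher                       # the caller discards it


core = ToyCore(["doc1"])
core.add("doc2")
before = core.search("doc2")        # answered by the old searcher
core.commit(warm_queries=["doc1"])
after = core.search("doc2")         # answered by the new searcher
print(before, after)                # prints: False True
```

The swap happens inside one process. A second instance reading the same files would keep its own old searcher until something told it to reopen, which is the coordination problem mentioned above.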

It is also generally recommended to store the index locally for
performance. Even if a distributed file system is available, the most
performant (and much simpler) solution is to have the index stored on a
local disk.

So in my opinion "replicating the state" is far easier than "sharing the
state".

Thank you
Emmanuel


2011/7/26 Gaetano Giunta 

> Hello
>
> I've been looking for a definitive answer to the question: "is it possible
> to run Solr on multiple servers with a shared index folder instead of the
> native master/slave configuration?"
>
> So far I have found two threads in this ml about the topic:
> http://lucene.472066.n3.nabble.com/Multiple-Solr-servers-and-a-shared-index-vs-master-slaves-td934359.html
> http://lucene.472066.n3.nabble.com/Shared-index-base-td484439.html
>
> But I am still not 100% sure. From what I understand:
> 1. having many servers serving reads and one doing writes might work - many
> servers doing writes definitely not
> 2. using lockType=native is a good idea whenever there is more than one
> process at a time accessing the index file...
> 3. ...which I suppose means that Solr will use flock() calls. Hence the
> need for a filesystem supporting them (eg. ocfs2 and gfs preferred to nfs)
> 4. "sending commit commands makes readers re-read the indexes"
>
> About point 4: does this mean only the readers on the current server, or
> the readers on other servers too?
> If the latter is the case, then this kind of setup is basically only good
> to have an active/passive cluster with a fast failover, not for doing
> master/slave load balancing.
>
> Bye
> Gaetano
>
> ps: please understand that I'd much rather go with native master/slave or
> with the upcoming zookeeper-based configurations, but the setup described
> above I have found installed at some customer...
>

Re: Spellcheck compounded words

2011-07-26 Thread O. Klein

I see you use this for Solr 3.3.

In 3.3 there is no problem with logging. Have you tried to do the same thing
for 4.0?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Spellcheck-compounded-words-tp3192748p3201597.html
Sent from the Solr - User mailing list archive at Nabble.com.

Exact match not the first result returned

2011-07-26 Thread Brian Lamb

Hi all,

I am a little confused as to why the scoring is working the way it is:

I have a field defined as:



And I have several documents where that value is:

RECORD 1

  Fred
  Fred (the coolest guy in town)


OR

RECORD 2

  Fred Anderson


When I do a search for
http://localhost:8983/solr/search/?q=myname:Fred, I get RECORD 2
returned before RECORD 1.

RECORD 2
5.282213 = (MATCH) fieldWeight(myname:Fred in 256575), product of:
  1.0 = tf(termFreq(myname:Fred)=1)
  8.451541 = idf(docFreq=7306, maxDocs=12586425)
  0.625 = fieldNorm(field=myname, doc=256575)

RECORD 1
4.482106 = (MATCH) fieldWeight(myname:Fred in 215), product of:
  1.4142135 = tf(termFreq(myname:Fred)=2)
  8.451541 = idf(docFreq=7306, maxDocs=12586425)
  0.375 = fieldNorm(field=myname, doc=215)

So the difference is fieldNorm obviously but I think that's only part
of the story. Why is RECORD 2 returned with a higher score than RECORD
1 even though RECORD 1 matches "Fred" exactly? And how should I do
this differently so that I am getting the results I am expecting?

Thanks,

Brian Lamb

Re: Exact match not the first result returned

2011-07-26 Thread Emmanuel Espina

That is caused by the size of the documents. The principle is pretty
intuitive if one of your documents is the entire three volumes of The Lord
of the Rings, and you search for "tree" I know that The Lord of the Rings
will be in the results, and I haven't memorized the entire text of that book
:p
It is a matter of probability that if you have a big (big!) text any word
will have a greater chance to be found than in a smaller letter. So one can
infer that the letter is more relevant than the big text. That is the
principle applied here and Lucene does that when building the ranking.
The first document is bigger (remember that all the values of a multivalued
field are merged into one field in the index, so you can not tell one value
from another apart) than the second one. In the first one you have
[Fred, coolest,
guy, town] and in the second [Fred, Anderson], so the second document is
more relevant than the first one.

To avoid this length normalization you can set omitNorms to true, and that
should make the first document more relevant because Fred appears twice (not
because Fred appears alone in a value).
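
The arithmetic behind the two explain outputs can be checked with a short sketch. The tf, idf and fieldNorm values are copied from Brian's explain output; the helper name field_weight is made up for illustration, and treating the norm as 1.0 when omitNorms is enabled is an assumption about how the factor drops out of the product.

```python
def field_weight(tf, idf, field_norm):
    """Single-term score as shown in the explain output: tf * idf * fieldNorm."""
    return tf * idf * field_norm

IDF = 8.451541  # idf(docFreq=7306, maxDocs=12586425), from the explain output

# Values copied from the explain output quoted in this thread.
record_1 = field_weight(tf=1.4142135, idf=IDF, field_norm=0.375)
record_2 = field_weight(tf=1.0, idf=IDF, field_norm=0.625)

# Assumption: with omitNorms="true" the norm factor is effectively 1.0 for every document.
record_1_no_norms = field_weight(tf=1.4142135, idf=IDF, field_norm=1.0)
record_2_no_norms = field_weight(tf=1.0, idf=IDF, field_norm=1.0)

print(f"with norms:    record 1 = {record_1:.6f}, record 2 = {record_2:.6f}")
print(f"without norms: record 1 = {record_1_no_norms:.6f}, record 2 = {record_2_no_norms:.6f}")
```

With norms, RECORD 2 wins only because its fieldNorm is larger; once the norm is taken out of the product, the higher term frequency puts RECORD 1 first.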

Regards
Emmanuel

2011/7/26 Brian Lamb 

> Hi all,
>
> I am a little confused as to why the scoring is working the way it is:
>
> I have a field defined as:
>
>  required="false" multivalued="true" />
>
> And I have several documents where that value is:
>
> RECORD 1
> 
>  Fred
>  Fred (the coolest guy in town)
> 
>
> OR
>
> RECORD 2
> 
>  Fred Anderson
> 
>
> What happens when I do a search for
> http://localhost:8983/solr/search/?q=myname:Fred I get RECORD 2
> returned before RECORD 1.
>
> RECORD 2
> 5.282213 = (MATCH) fieldWeight(myname:Fred in 256575), product of:
>  1.0 = tf(termFreq(myname:Fred)=1)
>  8.451541 = idf(docFreq=7306, maxDocs=12586425)
>  0.625 = fieldNorm(field=myname, doc=256575)
>
> RECORD 1
> 4.482106 = (MATCH) fieldWeight(myname:Fred in 215), product of:
>  1.4142135 = tf(termFreq(myname:Fred)=2)
>  8.451541 = idf(docFreq=7306, maxDocs=12586425)
>  0.375 = fieldNorm(field=myname, doc=215)
>
> So the difference is fieldNorm obviously but I think that's only part
> of the story. Why is RECORD 2 returned with a higher score than RECORD
> 1 even though RECORD 1 matches "Fred" exactly? And how should I do
> this differently so that I am getting the results I am expecting?
>
> Thanks,
>
> Brian Lamb
>

Re: performance variation with respect to the index size

2011-07-26 Thread François Schiettecatte

Finally got to running these tests.

Here are the basics...

Core i7 - 960
24GB RAM
Solr index on its own drive

Solr 3.3.0  running under tomcat 7.0.19, jdk1.6.0_26, java opts are:

JAVA_OPTS="-Xmx4096M -XX:-UseGCOverheadLimit" 
 
Raw data is 80GB in SOLR markup for adding, sample below:

5
en
202
2008-07-31T23:29:40Z
http://tomfoolery4.wordpress.com/2008/07/31/finally-a-buffalo-webmedia-site-that-doesnt-sit-on-the-fence/
Finally! A Buffalo Web/Media Site That Doesn’t Sit On The Fence!
The Buffalo News has got my back on this one. A lot of area musicians,
artists, writers and photographers have got my back on this one. And now,
I'm pleased to say, so does WNYMedia.net, another new voice in a small sea
of journalistic endeavors afoot in Buffalo. What I like about this site
[...]


icwsm does not include content - 52GB

icwsm2 includes content - 117GB

I used 1,000 searches from a 162,000 search set I saved from Feedster days;
here are some sample searches:

belize
st louis cardinals
offshoring
2010 olympic games
nanotubes
"beamed power"
"space elevator"
"power beaming"
world news
dogster
vancouver-centre
news


I ran six tests, two on icwsm getting the key and the score (10 rows and 100 
rows), two on icwsm2 getting the key and the score (10 rows and 100 rows), and 
two on icwsm2 getting all the fields and the scores (10 rows and 100 rows). 
Each test was run 10 times consecutively, nothing was running on the machine.

This table shows the time elapsed, the index name, the rows requested and the 
fields requested:

 182  icwsm  10  key,score
 184  icwsm  10  key,score
 182  icwsm  10  key,score
 182  icwsm  10  key,score
 184  icwsm  10  key,score
 183  icwsm  10  key,score
 183  icwsm  10  key,score
 183  icwsm  10  key,score
 184  icwsm  10  key,score
 183  icwsm  10  key,score

 190  icwsm  100  key,score
 183  icwsm  100  key,score
 184  icwsm  100  key,score
 184  icwsm  100  key,score
 183  icwsm  100  key,score
 183  icwsm  100  key,score
 182  icwsm  100  key,score
 183  icwsm  100  key,score
 185  icwsm  100  key,score
 184  icwsm  100  key,score

 204  icwsm2  10  key,score
 183  icwsm2  10  key,score
 184  icwsm2  10  key,score
 184  icwsm2  10  key,score
 185  icwsm2  10  key,score
 184  icwsm2  10  key,score
 183  icwsm2  10  key,score
 185  icwsm2  10  key,score
 184  icwsm2  10  key,score
 184  icwsm2  10  key,score

 288  icwsm2  100  key,score
 184  icwsm2  100  key,score
 186  icwsm2  100  key,score
 184  icwsm2  100  key,score
 186  icwsm2  100  key,score
 186  icwsm2  100  key,score
 186  icwsm2  100  key,score
 186  icwsm2  100  key,score
 189  icwsm2  100  key,score
 188  icwsm2  100  key,score

 185  icwsm2  10  *,score
 184  icwsm2  10  *,score
 183  icwsm2  10  *,score
 184  icwsm2  10  *,score
 184  icwsm2  10  *,score
 184  icwsm2  10  *,score
 185  icwsm2  10  *,score
 184  icwsm2  10  *,score
 184  icwsm2  10  *,score
 184  icwsm2  10  *,score

 206  icwsm2  100  *,score
 185  icwsm2  100  *,score
 186  icwsm2  100  *,score
 190  icwsm2  100  *,score
 195  icwsm2  100  *,score
 191  icwsm2  100  *,score
 193  icwsm2  100  *,score
 190  icwsm2  100  *,score
 186  icwsm2  100  *,score
 186  icwsm2  100  *,score

Basically, storing the data in the index has virtually no impact on search
speed from what I can see, which is what I would expect.
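
One way to condense the table into a figure per test is sketched below. The elapsed times are copied from the table; the summarize helper is made up for illustration, and using the median next to the mean is a choice made here to discount the slower first run of some tests, not something the benchmark itself prescribes.

```python
from statistics import mean, median

# Elapsed times copied from the table above, keyed by (index, rows, fields).
runs = {
    ("icwsm", 10, "key,score"): [182, 184, 182, 182, 184, 183, 183, 183, 184, 183],
    ("icwsm", 100, "key,score"): [190, 183, 184, 184, 183, 183, 182, 183, 185, 184],
    ("icwsm2", 10, "key,score"): [204, 183, 184, 184, 185, 184, 183, 185, 184, 184],
    ("icwsm2", 100, "key,score"): [288, 184, 186, 184, 186, 186, 186, 186, 189, 188],
    ("icwsm2", 10, "*,score"): [185, 184, 183, 184, 184, 184, 185, 184, 184, 184],
    ("icwsm2", 100, "*,score"): [206, 185, 186, 190, 195, 191, 193, 190, 186, 186],
}

def summarize(times):
    """Return (mean, median); the median is less sensitive to a slow first run."""
    return mean(times), median(times)

summary = {key: summarize(times) for key, times in runs.items()}

for (index, rows, fields), (avg, med) in summary.items():
    print(f"{index:7} rows={rows:<4} fl={fields:10} mean={avg:6.1f} median={med:6.1f}")
```

The medians of the six tests all fall between 183 and 190, which matches the conclusion that storing the content has very little effect on search time.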


Cheers

François






On Jul 8, 2011, at 12:18 PM, Erick Erickson wrote:

> Well, it depends (tm). Raw search time should be unaffected (or very
> close to that). The stored data is in a completely separate file in
> the index directory and is not referenced during searches.
> 
> That said, assembling the response may take longer since you're
> potentially reading more data from the disk to create each document.
> 
> Ensure that lazy field loading is turned on, and when you're comparing
> times it would probably be best to return the same fields (perhaps just ID).
> 
> Note that the QTime in the response packet is the search, exclusive of
> assembling the response, so that's probably a good number to measure.
> 
> Best
> Erick
> 
> On Fri, Jul 8, 2011 at 8:01 AM, jame vaalet  wrote:
>> I would prefer every setting to be in its default state, and to compare
>> the results with stored = true and false.
>> 
>> 2011/7/8 François Schiettecatte 
>> 
>>> Hi
>>> 
>>> I don't think that anyone has run such benchmarks, in fact this topic came
>>> up two weeks ago and I volunteered some time to do that because I have some
>>> spare time this week, so I am going to run some benchmarks this weekend and
>>> report back.
>>> 
>>> The machine I have to do this is a Core i7 960, 24GB, 4TB of disk. I am going
>>> to run SOLR 3.3 under Tomcat 7.0.16. I have three databases I can use for
>>> this, icwsm-2009 (38.5GB compressed), cdip (24GB compressed), trec vlc2
>>> (3

Solr DataImport with multiple DBs

2011-07-26 Thread spravin

Hi All

I am stuck with an issue with delta-import while configuring solr in an
environment where multiple databases exist.

My schema looks like this:

names exist in one DB and keywords in a table in the other DB (with id as
foreign key).

For delta import, I would need to check against the updated column in both
the tables. But they are in two different databases, so I can't do this in a
single deltaquery.
So I'm not able to detect if the field in the second database has changed.

The relevant part of my dataconfig xml looks like this:


  
  
  
${dataimporter.delta.id}'"
deltaQuery="SELECT ID FROM records WHERE Updated >
'${dataimporter.last_index_time}'">





  


I'm hoping someone in this list could point me to a solution: a way to
specify deltaQuery across multiple databases.

(In the above example, I would like to add "OR ID IN (SELECT ID FROM
keywords WHERE Updated > '${dataimporter.last_index_time}')" to the
deltaQuery, but this table can be accessed only from a different dataSource.)

Thanks
- PS


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-DataImport-with-multiple-DBs-tp3201843p3201843.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Preserve XML hierarchy

2011-07-26 Thread Michael Sokolov

Here's an idea: if you index the full text of your XML document using 
XmlCharFilter - available as a patch (or HtmlCharFilter), and then 
highlight the entire document (you will need to fiddle with highlighter 
parameters a bit to make sure you get 1 fragment that covers the entire 
file) with some tag like , then you can take the highlighted 
result, parse it as an XML document into a tree model like JDOM or DOM, 
and execute XPath like: name(/descendant::match[1]/..) to find out the 
context in which your (first) hit appears.
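
The last step of this idea, finding the element that encloses the first hit, can be sketched without an XPath engine. This is only an illustration: it assumes the highlighter wraps hits in a match element (as the XPath expression suggests), the sample document is hypothetical with element names borrowed from the earlier messages in this thread, and context_of_first_hit is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical highlighted document; only the "match" wrapper comes from the
# XPath expression above, the rest is invented for illustration.
highlighted = """
<doc>
  <TitleP>Plan general</TitleP>
  <gSubPart>
    <TitlePart>Usos</TitlePart>
    <TextPart>La <match>zona</match> norte queda reservada.</TextPart>
  </gSubPart>
</doc>
"""

def context_of_first_hit(xml_text, hit_tag="match"):
    """Return the tag name of the parent of the first hit element, or None."""
    root = ET.fromstring(xml_text)
    # ElementTree elements have no parent pointer, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    first_hit = next(root.iter(hit_tag), None)
    if first_hit is None or first_hit not in parent_of:
        return None
    return parent_of[first_hit].tag

print(context_of_first_hit(highlighted))
```

This plays the same role as name(/descendant::match[1]/..): take the first hit in document order and report the name of its parent.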


-Mike

On 7/26/2011 10:48 AM, Lucas Miguez wrote:

Hi, I can finally get all the field names of each document using the
Luke Request Handler (http://wiki.apache.org/solr/LukeRequestHandler),
and by making HTTP requests to Solr I can get all the fields that contain
the word that I am searching for.
I'll keep looking for a better solution.

Thanks!

Regards

2011/7/15 Gora Mohanty

On Thu, Jul 14, 2011 at 8:43 PM, Lucas Miguez  wrote:

Thanks for your help!

DIH XPathEntityProcessor helps me to index the XML files, but does it
help me to know where a node comes from? Following the example in
my previous post:


Example: imagine that the user searches for the word "zona"; then I have to
show the TitleP, the TextP, the TitlePart, the TextPart and all the
TextSubPart that are children of gSubPart.

Well, I tried to create TextPart, TitlePart, etc with the XPath
expression of the location in the original XML, using dynamic fields,
for example:


There should not be a space between "TextPart" and "*"


to have the XPath associated with the field, but I don't know how to
search in all "TextPart *" fields...

[...]

You can search in individual fields, e.g., with ?q=TitlePart:myterm.
For searching in all "TextPart*" fields, the easiest way probably is
to copy the fields into a full-text search field. With the default Solr
schema, this can be done by adding a directive like
   
This copies all fields into the field "text", which is searched by
default. Thus, ?q=myterm will find "myterm" in all "TextPart*"
fields.

Regards,
Gora

Re: SolrJ and class versions

2011-07-26 Thread Michael Sokolov

It's not clear to me (from the wiki, or the jira issue) whether the 
compatibility break goes both ways - maybe I should just try and see, 
but just to get this out there on the list: is the 3.X javabin client 
able to talk to 1.4 servers?  If so, then there is a nicely decoupled 
upgrade path: get all your clients upgraded, then the servers.


-Mike

On 7/26/2011 12:01 PM, Chris Hostetter wrote:

: Hi, I recently went through a little hell when I upgraded my Solr
: servers to 3.2.0. What I didn't anticipate was that my Java SolrJ
: clients depend on the server version.
:
: I would like to add a note about this in the SolrJ docs:
: http://wiki.apache.org/solr/Solrj#Streaming_documents_for_an_update

I don't know that it really makes sense to add to that section, but feel
free to add some mention to the "Setting the RequestWriter" section
(where the binary / xml format choice is mentioned).

In general I think people haven't called it out specifically on the wiki
because it's not a problem that should affect most people except as an
upgrade issue, and all known upgrade issues are enumerated in the
CHANGES.txt for the version where the issue arises.

The format change was specifically called out in the CHANGES.txt for 3.1...

* The Solr JavaBin format has changed as of Solr 3.1. If you are using the
   JavaBin format, you will need to upgrade your SolrJ client. (SOLR-2034)



-Hoss

Re: SolrJ and class versions

2011-07-26 Thread Shawn Heisey


On 7/26/2011 6:26 PM, Michael Sokolov wrote:
It's not clear to me (from the wiki, or the jira issue) whether the 
compatibility break goes both ways - maybe I should just try and see, 
but just to get this out there on the list: is the 3.X javabin client 
able to talk to 1.4 servers?  If so, then there is a nicely decoupled 
upgrade path: get all your clients upgraded, then the servers.


If you change things so it's using an XML response parser instead of 
javabin, they will communicate just fine, as long as you don't try to use 
features not supported by one end or the other.  If you leave it at the 
default (javabin), they will not communicate.


I was pointed at the following code snippet by someone, either here or 
on the IRC channel.  Our development team was able to take the 
information and make it work.  We are currently using SolrJ version 
1.4.0 against Solr 3.2.0 with no problem.  It should also work the other 
way, with a new SolrJ and an old Solr.  Once both of them are upgraded 
to at least 3.1, you can go back to javabin for efficiency.


new CommonsHttpSolrServer(new URL("http://solr1.4.0Instance:8080/solr"), 
null, new XMLResponseParser(), false);


I haven't actually looked at our Java code to see what's been done, and 
my Java experience is limited.  I hope this helps you!


Shawn

how often do you boys restart your tomcat?

2011-07-26 Thread Bing Yu

I find that if I do not restart the master's Tomcat for some days,
the load average keeps rising to a high level and Solr becomes slow
and unstable, so I added a crontab entry to restart Tomcat every day.

Do you boys restart your Tomcat? And is there any way to avoid restarting Tomcat?

Re: how often do you boys restart your tomcat?

2011-07-26 Thread Chamnap Chhorn

I often restart the Tomcat service before its memory use reaches the OS limit.
Usually it eats up only 4 GB, but eventually it eats up 11 GB.

On Wed, Jul 27, 2011 at 8:42 AM, Bing Yu  wrote:

> I find that if I do not restart the master's Tomcat for some days,
> the load average keeps rising to a high level and Solr becomes slow
> and unstable, so I added a crontab entry to restart Tomcat every day.
>
> Do you boys restart your Tomcat? And is there any way to avoid restarting
> Tomcat?
>

-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/

Re: how often do you boys restart your tomcat?

2011-07-26 Thread Dave Hall


On 27/07/11 11:42, Bing Yu wrote:

Do you boys restart your Tomcat? And is there any way to avoid restarting Tomcat?


Our female sysadmin takes care of managing our server.

Re: how often do you boys restart your tomcat?

2011-07-26 Thread Bing Yu

I want to let the system do the job instead of a system admin, because I'm lazy ~ ^__^

But I just want a better way to fix the problem. Restarting the server
causes other problems, such as needing to rebuild the changes that happened
during the restart.

2011/7/27 Dave Hall :
> On 27/07/11 11:42, Bing Yu wrote:
>>
>> Do you boys restart your Tomcat? And is there any way to avoid restarting
>> Tomcat?
>
> Our female sysadmin takes care of managing our server.
>

Re: how often do you boys restart your tomcat?

2011-07-26 Thread Shawn Heisey


On 7/26/2011 7:42 PM, Bing Yu wrote:

I find that if I do not restart the master's Tomcat for some days,
the load average keeps rising to a high level and Solr becomes slow
and unstable, so I added a crontab entry to restart Tomcat every day.

Do you boys restart your Tomcat? And is there any way to avoid restarting Tomcat?


I run Solr under the jetty included with the Solr examples.  With Solr 
version 1.4.1, I've had over 60 days of uptime with no problem.  I am 
now running 3.2.0, but things have been pretty volatile so I haven't 
been able to accumulate any real uptime yet.  I don't expect any 
problems, though.


Tomcat is something I've got little experience with, but that does sound 
unusual.  Other groups in the company do use it.  When they start having 
problems like this, it tends to be configuration issues, a bug in their 
homegrown applications, or a problem with resources (usually RAM).


Shawn

Re: How to make a valid date facet query?

2011-07-26 Thread Floyd Wu

Hi Tomás

Do facet queries support queries like the following?

facet.query=onlinedate:[NOW/YEAR-3YEARS TO NOW/YEAR+5YEARS]

I tried this, but the returned result was not correct.

Am I missing something?

Floyd

2011/7/26 Tomás Fernández Löbbe 

> Hi Floyd, I don't think the feature that allows using multiple gaps for a
> range facet is committed. See
> https://issues.apache.org/jira/browse/SOLR-2366
> You can achieve similar functionality by using facet.query. See:
>
> http://wiki.apache.org/solr/SimpleFacetParameters#Facet_Fields_and_Facet_Queries
>
> Regards,
>
> Tomás
> On Tue, Jul 26, 2011 at 1:23 AM, Floyd Wu  wrote:
>
> > Hi all,
> >
> > I need to make a date faceted query and I tried to use facet.range, but
> > can't get the result I need.
> >
> > I want to make 4 facets like the following:
> >
> > 1 month, 3 months, 6 months, more than 1 year
> >
> > The onlinedate field in schema.xml looks like this:
> >
> >
> >
> > I hit Solr with this URL:
> >
> > http://localhost:8983/solr/select/?q=*%3A*
> > &start=0
> > &rows=10
> > &indent=on
> > &facet=true
> > &facet.range=onlinedate
> > &f.onlinedate.facet.range.start=NOW-1YEARS
> > &f.onlinedate.facet.range.end=NOW%2B1YEARS
> > &f.onlinedate.facet.range.gap=NOW-1MONTHS, NOW-3MONTHS,
> > NOW-6MONTHS,NOW-1YEAR
> >
> > But Solr complained: Exception during facet.range of onlinedate
> > org.apache.solr.common.SolrException: Can't add gap NOW-1MONTHS,
> > NOW-3MONTHS, NOW-6MONTHS,NOW-1YEAR to value Mon Jul 26 11:56:40 CST 2010
> > for
> > 
> >
> > What is the correct way to meet this requirement? Please help with
> > this.
> > Floyd
> >
>
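
The facet.query approach suggested in this thread could look roughly like the sketch below, which builds one facet.query per bucket and lets urlencode handle the escaping. The bucket boundaries are one possible reading of "1 month, 3 months, 6 months, more than 1 year" and are an assumption, as are the rows=0 setting and the reuse of the localhost URL from the original question. The escaping matters: a literal "+" in date math (as in NOW/YEAR+5YEARS) has to be sent as %2B, otherwise it is decoded as a space, which may be one reason a hand-written query returns unexpected results.

```python
from urllib.parse import urlencode, parse_qs

# Assumed Solr endpoint, matching the URL used earlier in the thread.
SOLR_SELECT = "http://localhost:8983/solr/select/"

# One facet.query per bucket; the labels are only for readability here.
buckets = {
    "last 1 month": "onlinedate:[NOW-1MONTH TO NOW]",
    "last 3 months": "onlinedate:[NOW-3MONTHS TO NOW]",
    "last 6 months": "onlinedate:[NOW-6MONTHS TO NOW]",
    "more than 1 year": "onlinedate:[* TO NOW-1YEAR]",
}

params = [("q", "*:*"), ("rows", "0"), ("facet", "true")]
params += [("facet.query", query) for query in buckets.values()]

# urlencode escapes spaces, brackets, and any literal "+" (as %2B).
query_string = urlencode(params)
url = SOLR_SELECT + "?" + query_string
print(url)
```

Each facet.query comes back as its own count in the facet_queries part of the response, so the four buckets do not need to share a single gap.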

Conditional field values in DataImport

2011-07-26 Thread solruser@9913

This may be a trivial question - I am a noob :).
In the dataimport of a CSV file, I am trying to assign a field based on a
conditional check on another field.

E.g. 
   
  
   This works well.  However, I need to create another field A that is
assigned a value based on X.

   Something like this:
 If X contains "abc" then A="complex-action" else A="SimpleAction"

  

I can get as far as writing the second regex for checking the value
inside X - however, I am not sure how to assign the conditional value to A
based on a match or failure.

Any help is much appreciated.

-g

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Conditional-field-values-in-DataImport-tp3202136p3202136.html

Autocomplete with Solr 3.1

2011-07-26 Thread scorpking

Hi all,
I want autocomplete suggestions like Google's:
http://www.google.com/webhp?complete=1&hl=en and I followed
http://solr.pl/en/2010/11/15/solr-and-autocomplete-part-2/ to configure my
project, but when I tested with more than two terms in my query, the results
were not right, and I don't know why.
Can anyone tell me?
Thanks for your help.

 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Autocomplete-with-Solr-3-1-tp3202214p3202214.html

Re: how often do you boys restart your tomcat?

2011-07-26 Thread Bernd Fehling


Until now I used Jetty and got 2 weeks as the longest uptime before OOM.
I just switched to tomcat6 and will see how that one behaves, but
I think it's not a problem of the servlet container.
Solr is pretty unstable when it has a huge database.
Actually this can't be blamed directly on Solr; it is a problem of
Lucene and its fieldCache. Somehow, during 2 weeks of runtime with searching
and replication, the fieldCache gets doubled until OOM.

Currently there is no other solution to this than restarting your
Tomcat or Jetty regularly :-(


On 27.07.2011 03:42, Bing Yu wrote:

I find that if I do not restart the master's Tomcat for some days,
the load average keeps rising to a high level and Solr becomes slow
and unstable, so I added a crontab entry to restart Tomcat every day.

Do you boys restart your Tomcat? And is there any way to avoid restarting Tomcat?
