RE: Ebay Kleinanzeigen and Auto Suggest

2011-05-03 Thread Charton, Andre
Hi,

Yes, we do.

If you use a limited number of categories (say 100), you can use dynamic fields
with the TermsComponent by choosing a category-specific prefix, like:

{schema.xml}
...

...
{schema.xml}
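The schema fragment above was stripped by the archive; a minimal sketch of the kind of dynamic field being described (the field and type names are illustrative, not the originals) might be:

```xml
<!-- one suggestion field per category, e.g. c42_suggestion for category 42 -->
<dynamicField name="c*_suggestion" type="text" indexed="true"
              stored="false" multiValued="true"/>
```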

And within the data import handler we derive the prefix from the given category via a script transformer:

{data-config.xml}
function setCatPrefixFields(row) {
  var catId = row.get('category');
  var title = row.get('freetext');
  var cat_prefix = "c" + catId + "_suggestion";
  // copy the title into the category-specific suggestion field
  row.put(cat_prefix, title);
  return row;
}
{data-config.xml}

We then adapt to these prefixes in our application layer with a
category-aware request handler.
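A category-aware suggest request against such a prefixed field might then look like this (the category id 42 and the prefix "ber" are made-up values):

```text
http://localhost:8983/solr/terms?terms.fl=c42_suggestion&terms.prefix=ber&terms.sort=count&terms.limit=10
```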

Pro:
- works fine for a limited number of categories

Con:
- the index gets bigger; we measured an increase of ~40 percent

Regards

André Charton


-----Original Message-----
From: Eric Grobler [mailto:impalah...@googlemail.com] 
Sent: Wednesday, April 27, 2011 9:56 AM
To: solr-user@lucene.apache.org
Subject: Re: Ebay Kleinanzeigen and Auto Suggest

Hi Otis,

The new Solr 3.1 Suggester also does not support filter queries.

Is anyone using shingles with faceting on large data?

Regards
Ericz

On Tue, Apr 26, 2011 at 10:06 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

> Hi Eric,
>
> Before using the terms component, allow me to point out:
>
> * http://sematext.com/products/autocomplete/index.html (used on
> http://search-lucene.com/ for example)
>
> * http://wiki.apache.org/solr/Suggester
>
>
> Otis
> 
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> ----- Original Message -----
> > From: Eric Grobler 
> > To: solr-user@lucene.apache.org
> > Sent: Tue, April 26, 2011 1:11:11 PM
> > Subject: Ebay Kleinanzeigen and Auto Suggest
> >
> > Hi
> >
> > Someone told me that ebay is using solr.
> > I was looking at their Auto Suggest implementation and I guess they are
> > using Shingles and the TermsComponent.
> >
> > I managed to get a satisfactory implementation but I have a problem with
> > category-specific filtering.
> > Ebay suggestions are sensitive to categories like Cars and Pets.
> >
> > As far as I understand it is not possible to use filters with a terms
> > query.
> > Unless one uses multiple fields or special prefixes for the words to
> > index, I cannot think how to implement this.
> >
> > Is there perhaps a workaround for this limitation?
> >
> > Best Regards
> > EricZ
> >
> > ---
> >
> > I have a shingle type like:
> > <fieldType name="text_shingle" class="solr.TextField"
> >     positionIncrementGap="100">
> >   <analyzer>
> >     <tokenizer class="solr.StandardTokenizerFactory"/>
> >     <filter class="solr.ShingleFilterFactory" maxShingleSize="4" />
> >   </analyzer>
> > </fieldType>
> >
> > and a query like
> >
> > http://localhost:8983/solr/terms?q=*%3A*&terms.fl=suggest_text&terms.sort=count&terms.prefix=audi
> >
>


Re: fq parameter with partial value

2011-05-03 Thread elisabeth benoit
Ok, thanks a lot.

After making a few tests, I finally understood what you meant.

Best regards,
Elisabeth

2011/5/2 Jonathan Rochkind 

> So if you have a field that IS tokenized, regardless of what it's called,
> then when you send "My Great Restaurant" to it for _indexing_, it gets
> _tokenized upon indexing_ into separate tokens:  "My", "Great", "Restaurant".
>  Depending on what other analysis you have, it may get further analyzed,
> perhaps to: "my", "great", "restaurant".
>
> You don't need to separate it into tokens yourself before sending it to Solr
> for indexing; if you define the field using a tokenizer, Solr will do that
> when you index.  This is a VERY common thing to do with Solr; pretty
> much any field that you want to be effectively searchable you have Solr
> tokenize like this.
>
> Because Solr pretty much always matches on individual tokens, that's the
> fundamental way Solr works.
> Those separate tokens are what allow you to SEARCH on the field and get a
> match on "my" or on "restaurant".   If the field were non-tokenized, you'd
> ONLY get a hit if the user entered "My Great Restaurant" (and really not
> even then unless you take other actions; because of the way Solr query
> parsers work you'll have trouble getting ANY hits to a user-entered search
> with the 'lucene' or 'dismax' query parsers if you don't tokenize).
>
> That tokenized field won't facet very well though -- if you faceted on a
> tokenized field with that example indexed in it, you'll get a facet "my"
> with that item in it, another facet "great" with that item in it, and
> another facet "restaurant" with that item in it.
>
> Which is why you likely want to use a separate _untokenized_ field for
> faceting. Which is why you end up wanting/needing two separate fields --
> one that is tokenized for searching, and one that is not tokenized (and
> usually not analyzed at all) for faceting.
>
> Hope this helps.
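A minimal schema sketch of the two-field setup described above (field and type names are illustrative):

```xml
<!-- tokenized copy for searching -->
<field name="CATEGORY_TOKENIZED" type="text" indexed="true" stored="false"
       multiValued="true"/>
<!-- untokenized copy for faceting -->
<field name="CATEGORY" type="string" indexed="true" stored="true"
       multiValued="true"/>
<copyField source="CATEGORY" dest="CATEGORY_TOKENIZED"/>
```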
>
>
> On 5/2/2011 2:43 AM, elisabeth benoit wrote:
>
>> I'm a bit confused here.
>>
>> What is the difference between CATEGORY and CATEGORY_TOKENIZED if I just
>> do a copyField from one field to another? And how can I search only for
>> Restaurant (fq=CATEGORY_TOKENIZED:Restaurant)? Shouldn't I have something
>> like
>> <field name="CATEGORY_TOKENIZED">Restaurant</field>
>> <field name="CATEGORY_TOKENIZED">Hotel</field>
>> if I want this to work? And from what I understand, this means I should do
>> more than just copy "Restaurant Hotel" to CATEGORY_TOKENIZED.
>>
>> Thanks,
>> Elisabeth
>>
>>
>> 2011/4/28 Erick Erickson
>>
>>  See below:
>>>
>>>
>>> On Thu, Apr 28, 2011 at 9:03 AM, elisabeth benoit
>>>   wrote:
>>>
 yes, the multivalued field is not broken up into tokens.

 so, if I understand well what you mean, I could have

 a field CATEGORY with multiValued="true"
 a field CATEGORY_TOKENIZED with multiValued="true"

 and then some POI

 <field name="POI_NAME">...</field>
 <field name="CATEGORY">Restaurant Hotel</field>
 <field name="CATEGORY_TOKENIZED">Restaurant</field>
 <field name="CATEGORY_TOKENIZED">Hotel</field>

>>> [EOE] If the above is the document you're sending, then no. The
>>> document would be indexed with
>>> <field name="CATEGORY">Restaurant Hotel</field>
>>> <field name="CATEGORY_TOKENIZED">Restaurant Hotel</field>
>>>
>>> Or even just:
>>> <field name="CATEGORY">Restaurant Hotel</field>
>>>
>>> and set up a <copyField/> to copy the value from CATEGORY to
>>> CATEGORY_TOKENIZED.
>>>
>>> The multiValued part comes from:
>>> "And a single POI might have different categories so your document could
>>> have"
>>> which would look like:
>>> <field name="CATEGORY">Restaurant Hotel</field>
>>> <field name="CATEGORY">Health Spa</field>
>>> <field name="CATEGORY">Dance Hall</field>
>>>
>>> and your document would be counted for each of those entries while
>>> searches
>>> against CATEGORY_TOKENIZED would match things like "dance" "spa" etc.
>>>
>>> But do notice that if you did NOT want a search for "restaurant hall"
>>> (no quotes) to match, you could do proximity searches with a slop less
>>> than your increment gap, e.g. (this time with the quotes)
>>> "restaurant hall"~50, which would then NOT match if your increment gap
>>> were 100.
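Applied to the filter-query question that started this thread, that restriction would be expressed as (slop value arbitrary, below the increment gap):

```text
fq=CATEGORY_TOKENIZED:"restaurant hotel"~50
```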
>>>
>>> Best
>>> Erick
>>>
>>>
>>>  do faceting on CATEGORY and fq on CATEGORY_TOKENIZED.

 But then, wouldn't it be possible to do faceting on CATEGORY_TOKENIZED?

 Best regards
 Elisabeth


 2011/4/28 Erick Erickson

> So, I assume your CATEGORY field is multiValued but each value is not
> broken up into tokens, right? If that's the case, would it work to have a
> second field CATEGORY_TOKENIZED and run your fq against that
> field instead?
>
> You could have this be a multiValued field with an increment gap if you
> wanted to prevent matches across separate entries and have your fq do a
> proximity search where the proximity was less than the increment gap.
>
> Best
> Erick
>
> On Thu, Apr 28, 2011 at 6:03 AM, elisabeth benoit
>   wrote:
>
>> Hi Stefan,
>>
>> Thanks for answering.
>>
>> In more details, my problem is the following. I'm working on searching
>> points of interest (POIs), which can be hotels, restaurants, plumbers,
>>

Re: Indexing multiple languages

2011-05-03 Thread Stefan Matheis
Peter,

is there a specific need to split these entities? why not just fetch
both columns in one entity? like this:

<entity name="categories"
        query="select title_en, title_nl from music_categories">
   <field column="title_en" name="categories_en" />
   <field column="title_nl" name="categories_nl" />
</entity>


two additional hints:
1) if you 'alias' your fields in your select query (select title_en as
categories_en) then it's no longer needed to use a <field/> assignment
2) if you join both tables already in your first entity, you can skip
the second query to the database
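Combining both hints, the sub-entity disappears entirely; a sketch (the items table and its join column are assumptions for illustration, not from the thread):

```xml
<!-- aliasing removes the <field/> mappings; joining removes the second query -->
<entity name="item"
        query="select i.id,
                      c.title_en as categories_en,
                      c.title_nl as categories_nl
               from items i join music_categories c on c.id = i.category_id"/>
```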

Regards
Stefan

On Mon, May 2, 2011 at 11:53 PM, PeterKerk  wrote:
> I have categories facets. Currently only in 1 language, but since my site is
> multilanguage, I need to index them in multiple languages.
>
> My table looks like this:
>
> [music_categories]
> id      int     Unchecked
> title   nvarchar(50)    Unchecked
> title_en        nvarchar(50)    Unchecked
> title_nl        nvarchar(50)    Unchecked
>
>
>
>
> In my data-config.xml I have this, only for 1 language:
>
> <entity name="cat" query="select title from music_categories">
>        <field column="title" name="categories" />
> </entity>
>
>
> Now, the only way I can imagine indexing multiple languages is by
> duplicating these lines:
>
> <entity name="cat_en" query="select title_en from music_categories">
>        <field column="title_en" name="categories_en" />
> </entity>
>
> <entity name="cat_nl" query="select title_nl from music_categories">
>        <field column="title_nl" name="categories_nl" />
> </entity>
>
>
> Is there a better way, e.g. where I can do some sort of parametering like
> {lang} or something?
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Indexing-multiple-languages-tp2891546p2891546.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


stemming for English

2011-05-03 Thread Dmitry Kan
Dear list,

In our SOLR schema we use no stemming on the index side, to favor
wildcard search. On the query side we use Porter stemming.

I have noticed the following issue: the query term "pretty" gets stemmed to
"pretti" and thus finds nothing.

What would be the approach to handle such situations? Is going all the way
to modifying the Porter stemmer's source code the best choice?
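A sketch of the asymmetric analysis chain being described (tokenizer and filter choices are illustrative): the index side keeps "pretty" intact for wildcards, while the query side stems it to "pretti", which never matches.

```xml
<fieldType name="text_asym" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- no stemmer: indexed terms stay intact, so wildcards behave predictably -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- stems "pretty" -> "pretti", a term that is absent from the index -->
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
```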

-- 
Regards,

Dmitry Kan


Re: stemming for English

2011-05-03 Thread lboutros
Hi,

I think you have to use stemming on both sides (index and query) if you
really want to use stemming.

Ludovic

2011/5/3 Dmitry Kan [via Lucene] <
ml-node+2893599-894006307-383...@n3.nabble.com>

> Dear list,
>
> In SOLR schema on the index side we use no stemming to support favor
> wildcard search. On the query side of the index we use Porter stemming.
>
> I have noticed the following issue: the term "pretty" gets stemmed to
> "pretti" and thus not found.
>
> What would be the approach to handle such situations, is going all the way
> to modifying the Porter stemming source code the best choice?
>
> --
> Regards,
>
> Dmitry Kan
>
>
> --
>  If you reply to this email, your message will be added to the discussion
> below:
>
> http://lucene.472066.n3.nabble.com/stemming-for-English-tp2893599p2893599.html
>  To start a new topic under Solr - User, email
> ml-node+472068-1765922688-383...@n3.nabble.com
> To unsubscribe from Solr - User, click 
> here.
>
>


-
Jouve
France.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/stemming-for-English-tp2893599p2893611.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: stemming for English

2011-05-03 Thread Dmitry Kan
Hi Ludovic,

That's an option we had before we decided to go for a full-blown support of
wildcards.

Do you know of a way to keep both stemming and consistent wildcard support
in the same field?

Dmitry

On Tue, May 3, 2011 at 12:56 PM, lboutros  wrote:

> Hi,
>
> I think you have to use stemming on both side (index and query) if you
> really want to use stemming.
>
> Ludovic
>
> 2011/5/3 Dmitry Kan [via Lucene] <
> ml-node+2893599-894006307-383...@n3.nabble.com>
>
> > Dear list,
> >
> > In SOLR schema on the index side we use no stemming to support favor
> > wildcard search. On the query side of the index we use Porter stemming.
> >
> > I have noticed the following issue: the term "pretty" gets stemmed to
> > "pretti" and thus not found.
> >
> > What would be the approach to handle such situations, is going all the
> way
> > to modifying the Porter stemming source code the best choice?
> >
> > --
> > Regards,
> >
> > Dmitry Kan
> >
> >
> > --
> >  If you reply to this email, your message will be added to the discussion
> > below:
> >
> >
> http://lucene.472066.n3.nabble.com/stemming-for-English-tp2893599p2893599.html
> >
> >
>
>
> -
> Jouve
> France.
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/stemming-for-English-tp2893599p2893611.html
> Sent from the Solr - User mailing list archive at Nabble.com.




-- 
Regards,

Dmitry Kan


Re: stemming for English

2011-05-03 Thread lboutros
Dmitry,

I don't know of any way to keep both stemming and consistent wildcard support
in the same field.
In my view, you have to create two different fields.
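A sketch of the two-field approach (field and type names are illustrative; the source field "text" is an assumption):

```xml
<!-- stemmed copy for normal search, unstemmed copy for wildcard queries -->
<field name="text_stemmed" type="text_porter" indexed="true" stored="false"/>
<field name="text_exact"   type="text_plain"  indexed="true" stored="false"/>
<copyField source="text" dest="text_stemmed"/>
<copyField source="text" dest="text_exact"/>
```

The cost is the one raised elsewhere in the thread: the term data for that content is indexed twice.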

Ludovic.

2011/5/3 Dmitry Kan [via Lucene] <
ml-node+2893628-993677979-383...@n3.nabble.com>

> Hi Ludovic,
>
> That's an option we had before we decided to go for a full-blown support of
> wildcards.
>
> Do you know of a way to keep both stemming and consistent wildcard support
> in the same field?
>
> Dmitry
>
>


-
Jouve
France.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/stemming-for-English-tp2893599p2893652.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Nutch Web Interface - not anymore in 1.3

2011-05-03 Thread Gabriele Kahlout
Hello,

I'm also in favor of maintaining a web interface that ships with nutch. As has
been mentioned, it may well be a bridge to Solr. If I find the time to
contribute my solution (and make it general enough), I'll happily do it.

Earlier I was wondering about actually using the previous nutch web interface
(not solritas/velocity) and integrating it with the solr index. I still find
this tempting; what's the motivation against it?


I've evaluated Ajax Solr but I didn't get it to work. Following Markus's
advice I've tried Solritas; I got it to work, but w/o highlighting. Why?
These are the relevant solrconfig.xml sections:

<requestHandler name="/browse" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="v.template">browse</str>
    <str name="v.properties">velocity.properties</str>
    <str name="v.contentType">text/html;charset=UTF-8</str>
    <str name="title">Solritas</str>

    <str name="wt">velocity</str>
    <str name="rows">1</str>
    <str name="facet">on</str>

    <str name="defType">dismax</str>
    <str name="q.alt">*:*</str>
    <str name="fl">*,score</str>
    <str name="hl">on</str>
    <str name="hl.fl">title</str>
    <str name="hl.snippets">1</str>
  </lst>
</requestHandler>

This was already there:

<highlighting>
  <!-- Configure the standard fragmenter -->
  <fragmenter name="gap" class="org.apache.solr.highlight.GapFragmenter"
              default="true">
    <lst name="defaults">
      <int name="hl.fragsize">100</int>
    </lst>
  </fragmenter>

  <!-- A regular-expression-based fragmenter -->
  <fragmenter name="regex" class="org.apache.solr.highlight.RegexFragmenter">
    <lst name="defaults">
      <!-- slightly smaller fragsizes work better because of the slop -->
      <int name="hl.fragsize">70</int>
      <!-- allow 50% slop on fragment sizes -->
      <float name="hl.regex.slop">0.5</float>
      <!-- a basic set of punctuation -->
      <str name="hl.regex.pattern">[-\w ,/\n\"']{20,200}</str>
    </lst>
  </fragmenter>

  <!-- Configure the standard formatter -->
  <formatter name="html" class="org.apache.solr.highlight.HtmlFormatter"
             default="true">
    <lst name="defaults">
      <str name="hl.simple.pre"><![CDATA[<em>]]></str>
      <str name="hl.simple.post"><![CDATA[</em>]]></str>
    </lst>
  </formatter>
</highlighting>

Pointers:
http://stackoverflow.com/questions/5071675/ajax-solr-how-to-make-an-ajax-page-readable-by-google

On Mon, May 2, 2011 at 7:43 PM, Mattmann, Chris A (388J) <
chris.a.mattm...@jpl.nasa.gov> wrote:

> Hi Gabriele,
>
> I would have loved to have done this myself but haven't had the time. I
> also favored having a web interface still included as well.
>
> If you find time to port it to the 1.3 branch/framework I can tell you I'd
> happily devote my time towards a 1.4 release that includes it.
>
> Cheers,
> Chris
>
> On May 2, 2011, at 10:54 AM, Gabriele Kahlout wrote:
>
> > The reason I'm asking is because I had found the nutch webapp pretty neat
> > for a prototype interface (it even did highlighting).
> > I'm thinking of changing it so that it pulls the data from solr index,
> > updating this part in search.jsp:
> >
> > // perform query
> >// NOTE by Dawid Weiss:
> >// The 'clustering' window actually moves with the start
> >// position this is good, bad?... ugly?
> >   Hits hits;
> >   try{
> >  query.getParams().initFrom(start + hitsToRetrieve, hitsPerSite,
> > "site", sort, reverse);
> > hits = bean.search(query);
> >   } catch (IOException e){
> > hits = new Hits(0,new Hit[0]);
> >   }
> >
> >
> > Has someone gone through that already? Are there other alternatives you
> > have taken? I stumbled upon (w/o stumbledupon.com)
> > http://evolvingweb.github.com/ajax-solr/examples/reuters/index.html
> > which is quite sophisticated and doesn't do the highlighting!
> >
> >
> > On Mon, May 2, 2011 at 4:45 PM, Markus Jelsma <
> markus.jel...@openindex.io>wrote:
> >
> >> Yes. It was removed. Indexing and searching is delegated to Solr for
> now.
> >>
> >> On Monday 02 May 2011 16:41:32 Gabriele Kahlout wrote:
> >>> Hello,
> >>>
> >>> Some time ago I was trying to use nutch/search.jsp to search my Solr
> >>> indexes. Trying to do that again I've noticed that in nutch-1.3 there
> is
> >> no
> >>> support for a Nutch web querying interface (presumably in favor of
> solr's
> >>> own). Is it?
> >>
> >> --
> >> Markus Jelsma - CTO - Openindex
> >> http://www.linkedin.com/in/markus17
> >> 050-8536620 / 06-50258350
> >>
> >
> >
> >
> > --
> > Regards,
> > K. Gabriele
> >
> > --- unchanged since 20/9/10 ---
> > P.S. If the subject contains "[LON]" or the addressee acknowledges the
> > receipt within 48 hours then I don't resend the email.
> > subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧
> time(x)
> > < Now + 48h) ⇒ ¬resend(I, this).
> >
> > If an email is sent by a sender that is not a trusted contact or the
> email
> > does not contain a valid code then the email is not received. A valid
> code
> > starts with a hyphen and ends with "X".
> > ∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
> > L(-[a-z]+[0-9]X)).
>
>
> ++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: chris.a.mattm...@nasa.gov
> WWW:   http://sunset.usc.edu/~mattmann/
> ++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++
>
>


-- 
Regards,
K. Gabriele

--- unchanged since 20/9/10 ---
P.S. If the subject contains "[LON]" or the addressee acknowledges the
receipt within 48 hours then I don't resend the email.
subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧ time(x)
< Now + 48h) ⇒ ¬resend(I, this).

If an email is sent by a sender that is not a trusted contact or the email
does not contain a valid code then the email is not received. A valid code
starts with a hyphen and ends with "X".
∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
L(-[a-z]+[0-9]X)).

Re: stemming for English

2011-05-03 Thread Dmitry Kan
Yes, Ludovic. Thus the index effectively doubles. Given the volume of
data we store, we very carefully consider any case where doubling the
index is a must.

Dmitry

On Tue, May 3, 2011 at 1:08 PM, lboutros  wrote:

> Dmitry,
>
> I don't know any way to keep both stemming and consistent wildcard support
> in the same field.
> To me, you have to create 2 different fields.
>
> Ludovic.
>
> 2011/5/3 Dmitry Kan [via Lucene] <
> ml-node+2893628-993677979-383...@n3.nabble.com>
>
> > Hi Ludovic,
> >
> > That's an option we had before we decided to go for a full-blown support
> > of wildcards.
> >
> > Do you know of a way to keep both stemming and consistent wildcard
> > support in the same field?
> >
> > Dmitry
> >
> >
>
>
> -
> Jouve
> France.
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/stemming-for-English-tp2893599p2893652.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Regards,

Dmitry Kan


Dismax scoring multiple fields TIE

2011-05-03 Thread roySolr
Hello,

I have a question about scoring when i use the dismax handler. I will give
some examples:

   name                       category   related category
1. Chelsea best club ever     Chelsea    Sport
2. Chelsea                    Chelsea    Sport

When I search for "Chelsea" I want a higher score for number 2; I think it
is a better match on field length.
I use dismax and both records have the same score. I see some difference
in fieldNorm, but still the score is the same. How can I fix this?


my config:

<requestHandler name="search" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">
      name category related_category
    </str>
    <float name="tie">1.0</float>
  </lst>
</requestHandler>

SCORE 1:
0.75269306 = (MATCH) sum of:
  0.75269306 = (MATCH) max of:
0.75269306 = (MATCH) weight(category:chelsea in 680), product of:
  0.3085193 = queryWeight(category:chelsea), product of:
2.4396951 = idf(docFreq=236, maxDocs=1000)
0.12645814 = queryNorm
  2.4396951 = (MATCH) fieldWeight(category:chelsea in 680), product of:
1.0 = tf(termFreq(category:chelsea)=1)
2.4396951 = idf(docFreq=236, maxDocs=1000)
1.0 = fieldNorm(field=category, doc=680)
0.37634653 = (MATCH) weight(name:chelsea in 680), product of:
  0.3085193 = queryWeight(name:chelsea), product of:
2.4396951 = idf(docFreq=236, maxDocs=1000)
0.12645814 = queryNorm
  1.2198476 = (MATCH) fieldWeight(name:chelsea in 680), product of:
1.0 = tf(termFreq(name:chelsea)=1)
2.4396951 = idf(docFreq=236, maxDocs=1000)
0.5 = fieldNorm(field=name, doc=680)


SCORE 2:
0.75269306 = (MATCH) sum of:
  0.75269306 = (MATCH) max of:
0.75269306 = (MATCH) weight(category:chelsea in 678), product of:
  0.3085193 = queryWeight(category:chelsea), product of:
2.4396951 = idf(docFreq=236, maxDocs=1000)
0.12645814 = queryNorm
  2.4396951 = (MATCH) fieldWeight(category:chelsea in 678), product of:
1.0 = tf(termFreq(category:chelsea)=1)
2.4396951 = idf(docFreq=236, maxDocs=1000)
1.0 = fieldNorm(field=category, doc=678)
0.75269306 = (MATCH) weight(name:chelsea in 678), product of:
  0.3085193 = queryWeight(name:chelsea), product of:
2.4396951 = idf(docFreq=236, maxDocs=1000)
0.12645814 = queryNorm
  2.4396951 = (MATCH) fieldWeight(name:chelsea in 678), product of:
1.0 = tf(termFreq(name:chelsea)=1)
2.4396951 = idf(docFreq=236, maxDocs=1000)
1.0 = fieldNorm(field=name, doc=678)





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Dismax-scoring-multiple-fields-TIE-tp2893923p2893923.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Dismax scoring multiple fields TIE

2011-05-03 Thread Erick Erickson
I'm not sure you can. Very short fields aren't differentiated on the basis
of field length, due to rounding errors. Here's a cut-n-paste from Jay Hill:


So the values are not pre-set for the lengthNorm, but for some counts the
fieldLength value winds up being the same because of the precision loss. Here
is a list of lengthNorm values for 1 to 10 term fields:

# of terms    lengthNorm
     1        1.0
     2        0.625
     3        0.5
     4        0.5
     5        0.4375
     6        0.375
     7        0.375
     8        0.3125
     9        0.3125
    10        0.3125


I'd ask, though: if this behavior is "good enough", are your users well
served by spending time on this case?

Best
Erick

On Tue, May 3, 2011 at 7:48 AM, roySolr  wrote:
> Hello,
>
> I have a question about scoring when i use the dismax handler. I will give
> some examples:
>
>    name                       category   related category
> 1. Chelsea best club ever     Chelsea    Sport
> 2. Chelsea                    Chelsea    Sport
>
> When i search for "Chelsea" i want a higher score for number 2. I think it
> is a better match on fieldlength.
> I use the dismax and both records has the same score. I see some difference
> in fieldNorm both still the score is the same. How can i fix this?
>
>
> my config:
>
> <requestHandler name="search" class="solr.SearchHandler" default="true">
>   <lst name="defaults">
>     <str name="defType">dismax</str>
>     <str name="qf">
>       name category related_category
>     </str>
>     <float name="tie">1.0</float>
>   </lst>
> </requestHandler>
>
> SCORE 1:
> 0.75269306 = (MATCH) sum of:
>  0.75269306 = (MATCH) max of:
>    0.75269306 = (MATCH) weight(category:chelsea in 680), product of:
>      0.3085193 = queryWeight(category:chelsea), product of:
>        2.4396951 = idf(docFreq=236, maxDocs=1000)
>        0.12645814 = queryNorm
>      2.4396951 = (MATCH) fieldWeight(category:chelsea in 680), product of:
>        1.0 = tf(termFreq(category:chelsea)=1)
>        2.4396951 = idf(docFreq=236, maxDocs=1000)
>        1.0 = fieldNorm(field=category, doc=680)
>    0.37634653 = (MATCH) weight(name:chelsea in 680), product of:
>      0.3085193 = queryWeight(name:chelsea), product of:
>        2.4396951 = idf(docFreq=236, maxDocs=1000)
>        0.12645814 = queryNorm
>      1.2198476 = (MATCH) fieldWeight(name:chelsea in 680), product of:
>        1.0 = tf(termFreq(name:chelsea)=1)
>        2.4396951 = idf(docFreq=236, maxDocs=1000)
>        0.5 = fieldNorm(field=name, doc=680)
>
>
> SCORE 2:
> 0.75269306 = (MATCH) sum of:
>  0.75269306 = (MATCH) max of:
>    0.75269306 = (MATCH) weight(category:chelsea in 678), product of:
>      0.3085193 = queryWeight(category:chelsea), product of:
>        2.4396951 = idf(docFreq=236, maxDocs=1000)
>        0.12645814 = queryNorm
>      2.4396951 = (MATCH) fieldWeight(category:chelsea in 678), product of:
>        1.0 = tf(termFreq(category:chelsea)=1)
>        2.4396951 = idf(docFreq=236, maxDocs=1000)
>        1.0 = fieldNorm(field=category, doc=678)
>    0.75269306 = (MATCH) weight(name:chelsea in 678), product of:
>      0.3085193 = queryWeight(name:chelsea), product of:
>        2.4396951 = idf(docFreq=236, maxDocs=1000)
>        0.12645814 = queryNorm
>      2.4396951 = (MATCH) fieldWeight(name:chelsea in 678), product of:
>        1.0 = tf(termFreq(name:chelsea)=1)
>        2.4396951 = idf(docFreq=236, maxDocs=1000)
>        1.0 = fieldNorm(field=name, doc=678)
>
>
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Dismax-scoring-multiple-fields-TIE-tp2893923p2893923.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Dismax scoring multiple fields TIE

2011-05-03 Thread roySolr
No, but I think the difference in field length is large and the score is
still the same.

Same score for these results (q=chelsea):

1. Chelsea is a very very big club in london, england   Chelsea   Sport
2. Chelsea                                              Chelsea   Sport

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Dismax-scoring-multiple-fields-TIE-tp2893923p2894026.html
Sent from the Solr - User mailing list archive at Nabble.com.


facet search and UnInverted multi-valued field?

2011-05-03 Thread Bernd Fehling

Dear list,

we use solr 3.1.0.

my logs have the following entry:
May 3, 2011 2:01:39 PM org.apache.solr.request.UnInvertedField uninvert
INFO: UnInverted multi-valued field
{field=f_dcperson,memSize=1966237,tindexSize=35730,time=849,phase1=782,nTerms=12,bigTerms=0,termInstances=368008,uses=0}

The schema.xml has the field:
<field name="f_dcperson" type="string" indexed="true" stored="false"
       multiValued="true"/>

The query was:
May 3, 2011 2:01:40 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=null path=null 
params={facet=true&fl=score&facet.mincount=1&facet.sort=&start=0&event=firstSearcher&q=text:antigone^200&facet.prefix=&facet.limit=100&facet.field=f_dcperson&facet.field=f_dcsubject&facet.field=f_dcyear&facet.field=f_dccollection&facet.field=f_dctypenorm&facet.field=f_dccontenttype&rows=10} 
hits=1 status=0 QTime=1816


Granted, the log entry is only an INFO, but what does it tell me?

Am I doing something wrong or can something be done better?

Regards,
Bernd


Re: Dismax scoring multiple fields TIE

2011-05-03 Thread elisabeth benoit
For category:chelsea you have fieldNorm=1.0, so your category field must
have a type with omitNorms=true. If you didn't have omitNorms=true, then the
shorter field would score higher.
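A sketch of the distinction (field definitions illustrative):

```xml
<!-- omitNorms="true": fieldNorm is fixed at 1.0 and field length is ignored -->
<field name="category" type="text" indexed="true" stored="true"
       omitNorms="true"/>
<!-- omitNorms="false" (index-time norms kept): shorter values get a larger
     fieldNorm and therefore score higher -->
<field name="name" type="text" indexed="true" stored="true"
       omitNorms="false"/>
```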

I'm new to Solr, but from what I've experienced, this is the cause.

Regards,
Elisabeth

2011/5/3 roySolr 

> Hello,
>
> I have a question about scoring when i use the dismax handler. I will give
> some examples:
>
>    name                       category   related category
> 1. Chelsea best club ever     Chelsea    Sport
> 2. Chelsea                    Chelsea    Sport
>
> When i search for "Chelsea" i want a higher score for number 2. I think it
> is a better match on fieldlength.
> I use the dismax and both records has the same score. I see some difference
> in fieldNorm both still the score is the same. How can i fix this?
>
>
> my config:
>
> <requestHandler name="search" class="solr.SearchHandler" default="true">
>   <lst name="defaults">
>     <str name="defType">dismax</str>
>     <str name="qf">
>       name category related_category
>     </str>
>     <float name="tie">1.0</float>
>   </lst>
> </requestHandler>
>
> SCORE 1:
> 0.75269306 = (MATCH) sum of:
>  0.75269306 = (MATCH) max of:
>0.75269306 = (MATCH) weight(category:chelsea in 680), product of:
>  0.3085193 = queryWeight(category:chelsea), product of:
>2.4396951 = idf(docFreq=236, maxDocs=1000)
>0.12645814 = queryNorm
>  2.4396951 = (MATCH) fieldWeight(category:chelsea in 680), product of:
>1.0 = tf(termFreq(category:chelsea)=1)
>2.4396951 = idf(docFreq=236, maxDocs=1000)
>1.0 = fieldNorm(field=category, doc=680)
>0.37634653 = (MATCH) weight(name:chelsea in 680), product of:
>  0.3085193 = queryWeight(name:chelsea), product of:
>2.4396951 = idf(docFreq=236, maxDocs=1000)
>0.12645814 = queryNorm
>  1.2198476 = (MATCH) fieldWeight(name:chelsea in 680), product of:
>1.0 = tf(termFreq(name:chelsea)=1)
>2.4396951 = idf(docFreq=236, maxDocs=1000)
>0.5 = fieldNorm(field=name, doc=680)
>
>
> SCORE 2:
> 0.75269306 = (MATCH) sum of:
>  0.75269306 = (MATCH) max of:
>0.75269306 = (MATCH) weight(category:chelsea in 678), product of:
>  0.3085193 = queryWeight(category:chelsea), product of:
>2.4396951 = idf(docFreq=236, maxDocs=1000)
>0.12645814 = queryNorm
>  2.4396951 = (MATCH) fieldWeight(category:chelsea in 678), product of:
>1.0 = tf(termFreq(category:chelsea)=1)
>2.4396951 = idf(docFreq=236, maxDocs=1000)
>1.0 = fieldNorm(field=category, doc=678)
>0.75269306 = (MATCH) weight(name:chelsea in 678), product of:
>  0.3085193 = queryWeight(name:chelsea), product of:
>2.4396951 = idf(docFreq=236, maxDocs=1000)
>0.12645814 = queryNorm
>  2.4396951 = (MATCH) fieldWeight(name:chelsea in 678), product of:
>1.0 = tf(termFreq(name:chelsea)=1)
>2.4396951 = idf(docFreq=236, maxDocs=1000)
>1.0 = fieldNorm(field=name, doc=678)
>
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Dismax-scoring-multiple-fields-TIE-tp2893923p2893923.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


full-import called simultaneously for multiple core.

2011-05-03 Thread Kannan
Hi,
   I am running one instance with multiple cores. If I call the full-import
URI simultaneously for multiple cores, a few of the fields do not get
indexed. If I do the full-imports one by one, it works fine. Thanks in advance

--
View this message in context: 
http://lucene.472066.n3.nabble.com/full-import-called-simultaneously-for-multiple-core-tp2894606p2894606.html
Sent from the Solr - User mailing list archive at Nabble.com.


Unable to use DataImportHandler

2011-05-03 Thread serenity keningston
Hello Friends,


I am new to Solr and am experiencing an issue while trying to use the
DataImportHandler. I added the required fields to the schema.xml file and here
is my data-config.xml file:

<dataConfig>
  <dataSource url="jdbc:mysql://localhost:3306/mp3"
              user="root"
              password="root" />
  <document>
    <entity name="mp3" pk="fileNo" query="...">
      <field column="fname" name="fname" />
      <field column="lname" name="lname" />
      <field column="file" name="file" />
    </entity>
  </document>
</dataConfig>

I am getting the following errors :

org.apache.solr.common.SolrException: Document [null] missing required
field: id

solr home defaulted to 'solr/' (could not find system property or JNDI)
May 3, 2011 9:59:08 AM org.apache.solr.core.SolrResourceLoader 
INFO: Solr home set to 'solr/'

DataImportHandler can extract the information from the database and it is
displayed in the log file; however, it is not indexing the data. Can anyone
please let me know where I am making a mistake?

Regards,
Serenity


Re: Unable to use DataImportHandler

2011-05-03 Thread Stefan Matheis
Serenity,

there is no field fileNo in your SELECT query? You've defined it in
the pk attribute of the <entity> tag, but it's also required in the
query itself. Just to note it: you can skip your <field> definitions
if the query returns the fields with exactly the same names as the
solr schema requires.
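Stefan's note sketched out (the table name mp3files and the alias are assumptions for illustration):

```xml
<!-- fileNo appears in the query itself, and aliasing columns to the schema's
     field names (e.g. fileNo as id) makes <field/> mappings unnecessary -->
<entity name="mp3" pk="fileNo"
        query="select fileNo, fileNo as id, fname, lname, file from mp3files"/>
```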

Regards
Stefan

On Tue, May 3, 2011 at 5:38 PM, serenity keningston
 wrote:
> Hello Friends,
>
>
> I am new to Solr and experiencing issue while trying to use
> DataImportHandler. I added the required fields to schema.xml file and here
> is my data-config.xml file :
>
> 
>      url="jdbc:mysql://localhost:3306/mp3"
>   user="root"
>   password="root" />
>  
>    
>     
>     
>     
>     
>    
>  
> 
>
> I am getting the following errors :
>
> org.apache.solr.common.SolrException: Document [null] missing required
> field: id
>
> solr home defaulted to 'solr/' (could not find system property or JNDI)
> May 3, 2011 9:59:08 AM org.apache.solr.core.SolrResourceLoader 
> INFO: Solr home set to 'solr/'
>
> DataImportHandler can extract the information from the database and it is
> displayed in log file, however, it is not indexing the data. Can anyone
> please let me know, where I am commiting mistake ?
>
> Regards,
> Serenity
>


Re: Unable to use DataImportHandler

2011-05-03 Thread serenity keningston
Dear Stefan,

I am still getting the following error message, even after including the
pk column in the query:

WARNING: Error creating document :
SolrInputDocument[{lname=lname(1.0)={cindy}, file=file(1.0)={
http://localhost:8084/Access/UploadFiles/laura.mp3},
fname=fname(1.0)={troutman}}]
org.apache.solr.common.SolrException: Document [null] missing required
field: id
at
org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:305)
at
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
at
org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:75)
at
org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:292)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:392)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:242)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:180)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:331)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:389)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370)


and here is the data-config.xml file :


  
  


 
 
 

  


Regards,
Serenity

On Tue, May 3, 2011 at 10:43 AM, Stefan Matheis <
matheis.ste...@googlemail.com> wrote:

> Serenity,
>
> there is no field fileNo in your SELECT-Query? you've defined it in
> the pk-Attribut of the  Tag, but it's also required in the
> Query itself. just to note it: you can skip your  definition,
> if the query returns the field with exactly the same name as the
> solr-schema requires it.
>
> Regards
> Stefan
>
> On Tue, May 3, 2011 at 5:38 PM, serenity keningston
>  wrote:
> > Hello Friends,
> >
> >
> > I am new to Solr and experiencing issue while trying to use
> > DataImportHandler. I added the required fields to schema.xml file and
> here
> > is my data-config.xml file :
> >
> > 
> >   >url="jdbc:mysql://localhost:3306/mp3"
> >   user="root"
> >   password="root" />
> >  
> >
> > 
> > 
> > 
> > 
> >
> >  
> > 
> >
> > I am getting the following errors :
> >
> > org.apache.solr.common.SolrException: Document [null] missing required
> > field: id
> >
> > solr home defaulted to 'solr/' (could not find system property or JNDI)
> > May 3, 2011 9:59:08 AM org.apache.solr.core.SolrResourceLoader 
> > INFO: Solr home set to 'solr/'
> >
> > DataImportHandler can extract the information from the database and it is
> > displayed in log file, however, it is not indexing the data. Can anyone
> > please let me know, where I am commiting mistake ?
> >
> > Regards,
> > Serenity
> >
>


Re: full-import called simultaneously for multiple core.

2011-05-03 Thread Erick Erickson
Do your log files show any errors?

Erick

On Tue, May 3, 2011 at 11:06 AM, Kannan  wrote:
> Hi
>   I am running one instance with multiple cores. If I call the full-import URI
> simultaneously for multiple cores, a few of the fields are not getting
> indexed. If I do full-import one core at a time, it works fine. Thanks in advance
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/full-import-called-simultaneously-for-multiple-core-tp2894606p2894606.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Getting field information inside a Tokenizer

2011-05-03 Thread FatMan Corp
Hi, I would like to get another field's information for the same document
within a Tokenizer class.
How can this be achieved?

Thanks


Re: Unable to use DataImportHandler

2011-05-03 Thread Erick Erickson
The next thing to check is whether your select statement returns the
fileNo for every row.

Wait.. You took out the <field> bit of your entity
definition; is that a cut/paste error?

You might get some joy from the DIH debug page at:
solr/admin/dataimport.jsp
it's not very well known, but it's a debug console for
your SQL data import process.


BTW, what version of Solr are you using?

Best
Erick

On Tue, May 3, 2011 at 12:04 PM, serenity keningston
 wrote:
> Dear Stefan,
>
> Am still getting the following error message even after including the
> pk-Attribute to the query :
>
> WARNING: Error creating document :
> SolrInputDocument[{lname=lname(1.0)={cindy}, file=file(1.0)={
> http://localhost:8084/Access/UploadFiles/laura.mp3},
> fname=fname(1.0)={troutman}}]
> org.apache.solr.common.SolrException: Document [null] missing required
> field: id
>    at
> org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:305)
>    at
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
>    at
> org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:75)
>    at
> org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:292)
>    at
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:392)
>    at
> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:242)
>    at
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:180)
>    at
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:331)
>    at
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:389)
>    at
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370)
>
>
> and here is the data-config.xml file :
>
> 
>      url="jdbc:mysql://localhost:3306/mp3"
>   user="root"
>   password="root" />
>  
>    
>
>     
>     
>     
>    
>  
> 
>
> Regards,
> Serenity
>
> On Tue, May 3, 2011 at 10:43 AM, Stefan Matheis <
> matheis.ste...@googlemail.com> wrote:
>
>> Serenity,
>>
>> there is no field fileNo in your SELECT-Query? you've defined it in
>> the pk-Attribut of the  Tag, but it's also required in the
>> Query itself. just to note it: you can skip your  definition,
>> if the query returns the field with exactly the same name as the
>> solr-schema requires it.
>>
>> Regards
>> Stefan
>>
>> On Tue, May 3, 2011 at 5:38 PM, serenity keningston
>>  wrote:
>> > Hello Friends,
>> >
>> >
>> > I am new to Solr and experiencing issue while trying to use
>> > DataImportHandler. I added the required fields to schema.xml file and
>> here
>> > is my data-config.xml file :
>> >
>> > 
>> >  > >    url="jdbc:mysql://localhost:3306/mp3"
>> >   user="root"
>> >   password="root" />
>> >  
>> >    
>> >     
>> >     
>> >     
>> >     
>> >    
>> >  
>> > 
>> >
>> > I am getting the following errors :
>> >
>> > org.apache.solr.common.SolrException: Document [null] missing required
>> > field: id
>> >
>> > solr home defaulted to 'solr/' (could not find system property or JNDI)
>> > May 3, 2011 9:59:08 AM org.apache.solr.core.SolrResourceLoader 
>> > INFO: Solr home set to 'solr/'
>> >
>> > DataImportHandler can extract the information from the database and it is
>> > displayed in log file, however, it is not indexing the data. Can anyone
>> > please let me know, where I am commiting mistake ?
>> >
>> > Regards,
>> > Serenity
>> >
>>
>


Re: Unable to use DataImportHandler

2011-05-03 Thread serenity keningston
Dear Erick,

I am using Solr 1.4. Yes, for each row we will get one fileNo,
which is the primary key for the table "file".
No, I intentionally removed the <field> definitions from the
data-config.xml.

I tried opening dataimport.jsp to debug, but it doesn't show any
result; it does update the log file every time I try to debug.

Regards,
Serenity

On Tue, May 3, 2011 at 11:42 AM, Erick Erickson wrote:

> The next thing to check is if your select statement returns the
> fileNo for every field.
>
> Wait.. You took out the  bit of your
> entity
> definition, is that a cut/paste error?
>
> You might get some joy from the DIH debug page at:
> solr/admin/dataimport.jsp
> it's not very well known, but it's a debug console for
> your SQL data import process.
>
>
> BTW, what version of Solr are you using?
>
> Best
> Erick
>
> On Tue, May 3, 2011 at 12:04 PM, serenity keningston
>  wrote:
> > Dear Stefan,
> >
> > Am still getting the following error message even after including the
> > pk-Attribute to the query :
> >
> > WARNING: Error creating document :
> > SolrInputDocument[{lname=lname(1.0)={cindy}, file=file(1.0)={
> > http://localhost:8084/Access/UploadFiles/laura.mp3},
> > fname=fname(1.0)={troutman}}]
> > org.apache.solr.common.SolrException: Document [null] missing required
> > field: id
> >at
> >
> org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:305)
> >at
> >
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
> >at
> > org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:75)
> >at
> >
> org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:292)
> >at
> >
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:392)
> >at
> >
> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:242)
> >at
> >
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:180)
> >at
> >
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:331)
> >at
> >
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:389)
> >at
> >
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370)
> >
> >
> > and here is the data-config.xml file :
> >
> > 
> >   >url="jdbc:mysql://localhost:3306/mp3"
> >   user="root"
> >   password="root" />
> >  
> >
> >
> > 
> > 
> > 
> >
> >  
> > 
> >
> > Regards,
> > Serenity
> >
> > On Tue, May 3, 2011 at 10:43 AM, Stefan Matheis <
> > matheis.ste...@googlemail.com> wrote:
> >
> >> Serenity,
> >>
> >> there is no field fileNo in your SELECT-Query? you've defined it in
> >> the pk-Attribut of the  Tag, but it's also required in the
> >> Query itself. just to note it: you can skip your  definition,
> >> if the query returns the field with exactly the same name as the
> >> solr-schema requires it.
> >>
> >> Regards
> >> Stefan
> >>
> >> On Tue, May 3, 2011 at 5:38 PM, serenity keningston
> >>  wrote:
> >> > Hello Friends,
> >> >
> >> >
> >> > I am new to Solr and experiencing issue while trying to use
> >> > DataImportHandler. I added the required fields to schema.xml file and
> >> here
> >> > is my data-config.xml file :
> >> >
> >> > 
> >> >   >> >url="jdbc:mysql://localhost:3306/mp3"
> >> >   user="root"
> >> >   password="root" />
> >> >  
> >> >
> >> > 
> >> > 
> >> > 
> >> > 
> >> >
> >> >  
> >> > 
> >> >
> >> > I am getting the following errors :
> >> >
> >> > org.apache.solr.common.SolrException: Document [null] missing required
> >> > field: id
> >> >
> >> > solr home defaulted to 'solr/' (could not find system property or
> JNDI)
> >> > May 3, 2011 9:59:08 AM org.apache.solr.core.SolrResourceLoader 
> >> > INFO: Solr home set to 'solr/'
> >> >
> >> > DataImportHandler can extract the information from the database and it
> is
> >> > displayed in log file, however, it is not indexing the data. Can
> anyone
> >> > please let me know, where I am commiting mistake ?
> >> >
> >> > Regards,
> >> > Serenity
> >> >
> >>
> >
>


getLuceneVersion parsing xml node on every request

2011-05-03 Thread Stephane Bailliez
I'm using Solr 3.1 right now.

I was looking at a threadump trying to figure out why queries were not
exactly fast and noticed that it keeps parsing xml over and over from
the schema to get the lucene version.

SolrQueryParser are created for each request and in the constructor
there is a call similar to

getSchema().getSolrConfig().getLuceneVersion("luceneMatchVersion",
Version.LUCENE_24)

which calls getVal() which is calling getNode() which creates a new
XPath object which ends up creating a new object factory which ends up
loading a class...

I cannot find a reference to this issue anywhere in JIRA or on Google.
It is hard to tell right now how much effect this has, but it seems
less than optimal to do on every request.

Am I missing something obvious here ?

The stack looks like:

  java.lang.Thread.State: BLOCKED (on object monitor)
at 
org.mortbay.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:369)
- waiting to lock <0x2aaab3bb43b0> (a
org.mortbay.jetty.webapp.WebAppClassLoader)
at 
org.mortbay.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:363)
at 
com.sun.org.apache.xml.internal.dtm.ObjectFactory.findProviderClass(ObjectFactory.java:506)
at 
com.sun.org.apache.xml.internal.dtm.ObjectFactory.lookUpFactoryClass(ObjectFactory.java:217)
at 
com.sun.org.apache.xml.internal.dtm.ObjectFactory.createObject(ObjectFactory.java:131)
at 
com.sun.org.apache.xml.internal.dtm.ObjectFactory.createObject(ObjectFactory.java:101)
at 
com.sun.org.apache.xml.internal.dtm.DTMManager.newInstance(DTMManager.java:135)
at 
com.sun.org.apache.xpath.internal.XPathContext.<init>(XPathContext.java:100)
at 
com.sun.org.apache.xpath.internal.jaxp.XPathImpl.eval(XPathImpl.java:201)
at 
com.sun.org.apache.xpath.internal.jaxp.XPathImpl.evaluate(XPathImpl.java:275)
at org.apache.solr.core.Config.getNode(Config.java:230)
at org.apache.solr.core.Config.getVal(Config.java:256)
at org.apache.solr.core.Config.getLuceneVersion(Config.java:325)
at 
org.apache.solr.search.SolrQueryParser.<init>(SolrQueryParser.java:76)
at 
org.apache.solr.schema.IndexSchema.getSolrQueryParser(IndexSchema.java:277)
at 
org.apache.solr.search.LuceneQParser.parse(LuceneQParserPlugin.java:76)
at org.apache.solr.search.QParser.getQuery(QParser.java:142)

Cheers,

-- stephane


UIMA analysisEngine path

2011-05-03 Thread Barry Hathaway
I'm new to Solr and trying to get it to call a UIMA aggregate analysis
engine, and not having much luck.
The null pointer exception indicates that it can't find the XML file
associated with the engine.
I have tried a number of combinations of paths in the <analysisEngine>
element, but nothing seems to work. In addition, I've put the directory
containing the descriptor both in the classpath when starting the server
and in a <lib> element in solrconfig.xml. So:

What "classpath" does the <analysisEngine> tag effectively search to
locate the descriptor?


Do the <lib> entries in solrconfig.xml affect this classpath?

Do the engine descriptors have to be in a jar or can they be in an 
expanded directory?


Thanks in advance.

Barry





Re: facet search and UnInverted multi-valued field?

2011-05-03 Thread Jay Hill
UnInvertedField is similar to Lucene's FieldCache, except, while the
FieldCache cannot work with multivalued fields, UnInvertedField is designed
for that very purpose. So since your f_dcperson field is multivalued, by
default you use UnInvertedField. You're not doing anything wrong, that's
default and normal behavior.

-Jay
http://lucidimagination.com



On Tue, May 3, 2011 at 7:03 AM, Bernd Fehling <
bernd.fehl...@uni-bielefeld.de> wrote:

> Dear list,
>
> we use solr 3.1.0.
>
> my logs have the following entry:
> May 3, 2011 2:01:39 PM org.apache.solr.request.UnInvertedField uninvert
> INFO: UnInverted multi-valued field
>
> {field=f_dcperson,memSize=1966237,tindexSize=35730,time=849,phase1=782,nTerms=12,bigTerms=0,termInstances=368008,uses=0}
>
> The schema.xml has the field:
>  multiValued="true" />
>
> The query was:
> May 3, 2011 2:01:40 PM org.apache.solr.core.SolrCore execute
> INFO: [] webapp=null path=null
> params={facet=true&fl=score&facet.mincount=1&facet.sort=&start=0&event=firstSearcher&q=text:antigone^200&facet.prefix=&facet.limit=100&facet.field=f_dcperson&facet.field=f_dcsubject&facet.field=f_dcyear&facet.field=f_dccollection&facet.field=f_dctypenorm&facet.field=f_dccontenttype&rows=10}
> hits=1 status=0 QTime=1816
>
> At first the log entry is an info, but what does it tell me?
>
> Am I doing something wrong or can something be done better?
>
> Regards,
> Bernd
>


RE: stemming for English

2011-05-03 Thread Robert Petersen
From what I have seen, adding a second field with the same terms as the first 
does *not* double your index size at all.

-Original Message-
From: Dmitry Kan [mailto:dmitry@gmail.com] 
Sent: Tuesday, May 03, 2011 4:06 AM
To: solr-user@lucene.apache.org
Subject: Re: stemming for English

Yes, Ludovic. Thus we effectively get the index doubled. Given the volume of
data we store, we consider such cases very carefully, where doubling the
index is a must.

Dmitry
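
A hedged schema.xml sketch of the two-field trade-off being discussed (field and type names here are illustrative, not from the original messages): one stemmed field for normal queries, one unstemmed field for consistent wildcard support, both populated from the same source via copyField.

```xml
<!-- stemmed field for normal full-text queries -->
<field name="body"       type="text_en" indexed="true" stored="true"/>
<!-- unstemmed copy: wildcard queries behave consistently against this
     field, at the cost of roughly doubling the indexed terms -->
<field name="body_exact" type="text_ws" indexed="true" stored="false"/>

<copyField source="body" dest="body_exact"/>
```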

On Tue, May 3, 2011 at 1:08 PM, lboutros  wrote:

> Dmitry,
>
> I don't know any way to keep both stemming and consistent wildcard support
> in the same field.
> To me, you have to create 2 different fields.
>
> Ludovic.
>
> 2011/5/3 Dmitry Kan [via Lucene] <
> ml-node+2893628-993677979-383...@n3.nabble.com>
>
> > Hi Ludovic,
> >
> > That's an option we had before we decided to go for a full-blown support
> of
> >
> > wildcards.
> >
> > Do you know of a way to keep both stemming and consistent wildcard
> support
> > in the same field?`
> >
> > Dmitry
> >
> >
>
>
> -
> Jouve
> France.
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/stemming-for-English-tp2893599p2893652.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Regards,

Dmitry Kan


RE: Getting field information inside a Tokenizer

2011-05-03 Thread Steven A Rowe
Hi FMC,

On 5/3/2011 at 12:37 PM, FatMan Corp wrote:
> Hi, I would like to get another's field information for the same document
> within a Tekonizer class.
> How can this be achieved?

Use <copyField>s in your schema, and associate different analysis
pipelines with each field.  Each field's analysis pipeline will be fed
the original raw text.
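
As a sketch of the copyField approach (field and type names here are illustrative assumptions):

```xml
<!-- one source field; two destinations with different analysis pipelines -->
<field name="body"          type="string"       indexed="false" stored="true"/>
<field name="body_stemmed"  type="text_en"      indexed="true"  stored="false"/>
<field name="body_shingled" type="text_shingle" indexed="true"  stored="false"/>

<!-- each destination field re-analyzes the same raw source text -->
<copyField source="body" dest="body_stemmed"/>
<copyField source="body" dest="body_shingled"/>
```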

Presently Lucene's analysis pipeline is single-field only: you have to create 
separate analysis pipelines for each field, with an extra pass over the 
original text for each field. I personally think Lucene should provide 
multi-field analysis capabilities, but this would not be a simple change.  Even 
if Lucene does eventually gain this capability, modifying Solr to expose it 
would be an added layer of complexity, and given that <copyField> already
exists as a workaround, there may be little motivation to do so.

Some of the use cases full multi-field analysis could serve are already handled 
in Lucene (but not yet in Solr) by TeeSinkTokenFilter.
An enterprising Lucene user could write a single-pass tokenizer that emits
tokens with one type per target field, then employ one TeeSinkTokenFilter per 
field to approximate full multi-field analysis.  Adding TeeSinkTokenFilter 
support to Solr, though, would require substantial changes to Solr's code and 
schema format (schema schema?).

Steve

> -Original Message-
> From: FatMan Corp [mailto:fatmanc...@gmail.com]
> Sent: Tuesday, May 03, 2011 12:37 PM
> To: solr-user@lucene.apache.org
> Subject: Getting field information inside a Tokenizer
> 
> Hi, I would like to get another's field information for the same document
> within a Tekonizer class.
> How can this be achieved?
> 
> Thanks


Re: Unable to use DataImportHandler

2011-05-03 Thread Erick Erickson
OK, put it back.

According to this page:
http://wiki.apache.org/solr/DataImportHandler#Configuration_in_data-config.xml-1
the pk is used for delta imports and "has no relation to uniqueKey
defined in the schema.xml...".

The error you're getting is because your schema.xml defines the "id"
field as required (required="true"), assuming you're just using the
example schema and/or your <uniqueKey> field is "id". You have two
choices:
1> form your select such that every document gets an id. That is what
the <field> mapping will do if (and only if) you select fileNo in
your select statement. Just ignore the pk for this purpose.
2> remove the required="true" from your id field definition in schema.xml,
AND remove the <uniqueKey> id entry.

The second will make it hard to do delta imports, so I'd go for the
first. The point is that whatever field you use to populate the <uniqueKey>
field should uniquely identify the record so it may be updated.
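
In schema.xml terms, the two choices look roughly like this (a sketch based on the example schema; the field type is an assumption):

```xml
<!-- Choice 1: keep the requirement, and make sure data-config.xml
     maps a column (e.g. fileNo) onto id for every document -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<uniqueKey>id</uniqueKey>

<!-- Choice 2: drop the requirement and remove the uniqueKey entry
     (this makes delta imports hard, so choice 1 is preferable) -->
<field name="id" type="string" indexed="true" stored="true"/>
```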

Hope that helps
Erick

On Tue, May 3, 2011 at 12:54 PM, serenity keningston
 wrote:
> Dear Erick,
>
> I am using Solr 1.4 version. Yes, for each row , we will get one fileNo
> which is the primary key for the table "file".
> No, I intentionally removed the   from the
> data-config.xml
>
> I tried opening the dataimport.jsp to debug but I don't know, it doesn't
> show any result but it updates the log file every time I tried to debug.
>
> Regards,
> Serenity
>
> On Tue, May 3, 2011 at 11:42 AM, Erick Erickson 
> wrote:
>
>> The next thing to check is if your select statement returns the
>> fileNo for every field.
>>
>> Wait.. You took out the  bit of your
>> entity
>> definition, is that a cut/paste error?
>>
>> You might get some joy from the DIH debug page at:
>> solr/admin/dataimport.jsp
>> it's not very well known, but it's a debug console for
>> your SQL data import process.
>>
>>
>> BTW, what version of Solr are you using?
>>
>> Best
>> Erick
>>
>> On Tue, May 3, 2011 at 12:04 PM, serenity keningston
>>  wrote:
>> > Dear Stefan,
>> >
>> > Am still getting the following error message even after including the
>> > pk-Attribute to the query :
>> >
>> > WARNING: Error creating document :
>> > SolrInputDocument[{lname=lname(1.0)={cindy}, file=file(1.0)={
>> > http://localhost:8084/Access/UploadFiles/laura.mp3},
>> > fname=fname(1.0)={troutman}}]
>> > org.apache.solr.common.SolrException: Document [null] missing required
>> > field: id
>> >    at
>> >
>> org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:305)
>> >    at
>> >
>> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
>> >    at
>> > org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:75)
>> >    at
>> >
>> org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:292)
>> >    at
>> >
>> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:392)
>> >    at
>> >
>> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:242)
>> >    at
>> >
>> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:180)
>> >    at
>> >
>> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:331)
>> >    at
>> >
>> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:389)
>> >    at
>> >
>> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370)
>> >
>> >
>> > and here is the data-config.xml file :
>> >
>> > 
>> >  > >    url="jdbc:mysql://localhost:3306/mp3"
>> >   user="root"
>> >   password="root" />
>> >  
>> >    
>> >
>> >     
>> >     
>> >     
>> >    
>> >  
>> > 
>> >
>> > Regards,
>> > Serenity
>> >
>> > On Tue, May 3, 2011 at 10:43 AM, Stefan Matheis <
>> > matheis.ste...@googlemail.com> wrote:
>> >
>> >> Serenity,
>> >>
>> >> there is no field fileNo in your SELECT-Query? you've defined it in
>> >> the pk-Attribut of the  Tag, but it's also required in the
>> >> Query itself. just to note it: you can skip your  definition,
>> >> if the query returns the field with exactly the same name as the
>> >> solr-schema requires it.
>> >>
>> >> Regards
>> >> Stefan
>> >>
>> >> On Tue, May 3, 2011 at 5:38 PM, serenity keningston
>> >>  wrote:
>> >> > Hello Friends,
>> >> >
>> >> >
>> >> > I am new to Solr and experiencing issue while trying to use
>> >> > DataImportHandler. I added the required fields to schema.xml file and
>> >> here
>> >> > is my data-config.xml file :
>> >> >
>> >> > 
>> >> >  > >> >    url="jdbc:mysql://localhost:3306/mp3"
>> >> >   user="root"
>> >> >   password="root" />
>> >> >  
>> >> >    
>> >> >     
>> >> >     
>> >> >     
>> >> >     
>> >> >    
>> >> >  
>> >> > 
>> >> >
>> >> > I am getting the following errors :
>> >> >
>> >> > org.apache.solr.common.SolrException: Document [null] missing required
>> >> > field: id
>> >> >
>> >> > solr home defaulted to 'solr/' (could not find system property or
>> JNDI)
>> >> > May 3, 2011 9:59:08 AM org.apache.solr.core.SolrResourceLoader 
>> >> > INFO: Solr home set to 'solr/'
>>

Re: How to debug if termsComponent is used

2011-05-03 Thread cyang2010
I tried it. It just does not work. The debug component only works when
the query component is there, and it just shows debugging information for
the query result, not the term match result.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-debug-if-termsComponent-is-used-tp2891735p2895647.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to debug if termsComponent is used

2011-05-03 Thread Erick Erickson
Saying "it does not work" doesn't give us much to go on. Can you describe
what you've tried? *How* it fails? Have you looked in the log for any clues?

You might review this page:
http://wiki.apache.org/solr/UsingMailingLists

Best
Erick

On Tue, May 3, 2011 at 3:35 PM, cyang2010  wrote:
> I tried it.  It just does not work.   the debug component only works when
> query component is there, and it is just showing debugging information for
> query result, not term match result.
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/How-to-debug-if-termsComponent-is-used-tp2891735p2895647.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


How to know which value matched for a multi-value field

2011-05-03 Thread cyang2010
Hi,

I have a use case where I need to know, for a particular multi-valued field,
which particular value matched when a query is run on that field. For
example, for a movie document, the movie title is a single-valued field and
the movie actors field is multi-valued. When a user searches "colin", I want
to know that it is "colin firth" that matches the query, rather than
"jeffery rush", so that I can return the "colin firth" value back.

title: king's speech
actors: colin firth, jeffery rush


Thanks in advance,


cy

--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-know-which-value-matched-for-a-multi-value-field-tp2895814p2895814.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to debug if termsComponent is used

2011-05-03 Thread cyang2010
Sorry, I didn't mean to give a random reply. It's just that my Solr
configuration/schema is different now, and I couldn't get the error message.

Anyway, I reran the test.

Basically, by specifying such a search component and request handler, you won't
get any error.

When you query this, it is fine without any debug message (of course, since
no debug parameter is defined in the query).

http://localhost:8080/solr/titles/terms?terms=true&terms.fl=autosuggest&terms.prefix=andy&terms.mincount=1&;

As soon as I specify the only debug parameter I know, "debugQuery", the Solr
server gives this error:
http://localhost:8080/solr/titles/terms?terms=true&terms.fl=autosuggest&terms.prefix=andy&terms.mincount=1&debugQuery=true


May 3, 2011 1:27:37 PM org.apache.solr.core.SolrCore execute
INFO: [titles] webapp=/solr path=/terms
params={debugQuery=true&terms.mincount=1
&terms.fl=autosuggest&terms=true&terms.prefix=andy} status=500 QTime=641
May 3, 2011 1:27:37 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
at
org.apache.solr.handler.component.DebugComponent.process(DebugCompone
nt.java:54)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(Sea
rchHandler.java:203)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandl
erBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter
.java:338)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilte
r.java:241)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Appl
icationFilterChain.java:235)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationF
ilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperV
alve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextV
alve.java:191)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.j
ava:128)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.j
ava:102)
at
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:
568)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineVal
ve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.jav
a:286)
at
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java
:845)
at
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.proce
ss(Http11Protocol.java:583)
at
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:44
7)
at java.lang.Thread.run(Thread.java:619)



That is all I get.  Let me know if I am using the wrong parameter.

Thanks.


cy

--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-debug-if-termsComponent-is-used-tp2891735p2895897.html
Sent from the Solr - User mailing list archive at Nabble.com.


An error I can't manage to fix: java.lang.NoClassDefFoundError: org/apache/solr/search/QParserPlugin

2011-05-03 Thread Gavin Engel
Hello all,

I've been trying to add the Spatial Search Plugin to my Solr 1.4.1 setup,
and I get this error:


> java.lang.NoClassDefFoundError: org/apache/solr/search/QParserPlugin
> at java.lang.ClassLoader.defineClass1(Native Method)
>  at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
> at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>  at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
> ...
> Caused by: java.lang.ClassNotFoundException:
> org.apache.solr.search.QParserPlugin
> at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>  at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
>  ... 50 more



I've been trying my best with the developers' documentation, but I am still
stuck on the install phase of SSP 2.0.  Are there any users of SSP 2
who can help me troubleshoot this, please?

-Gavin


Re: An error I can't manage to fix: java.lang.NoClassDefFoundError: org/apache/solr/search/QParserPlugin

2011-05-03 Thread Markus Jelsma
Where did you store the jar? Is it in a directory where Solr looks for libs?
Depending on your distro or setup there can be different places to store the
jar. The easiest solution is to put it in a dir where other Solr libs are
found, or in a dir that you configured in a <lib> directive in solrconfig.
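
For example, a solrconfig.xml <lib> directive might look like this (the directory path is an assumption; point it wherever the SSP jar actually lives):

```xml
<config>
  <!-- load every jar found in <instanceDir>/lib -->
  <lib dir="./lib"/>
  ...
</config>
```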

> Hello all,
> 
> I've been trying to add the Spatial Search Plugin to my Solr 1.4.1 setup,
> 
> and I get this error:
> > java.lang.NoClassDefFoundError: org/apache/solr/search/QParserPlugin
> > at java.lang.ClassLoader.defineClass1(Native Method)
> > 
> >  at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
> > 
> > at
> > java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> > 
> >  at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
> > 
> > ...
> > Caused by: java.lang.ClassNotFoundException:
> > org.apache.solr.search.QParserPlugin
> > at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
> > 
> >  at java.security.AccessController.doPrivileged(Native Method)
> > 
> > at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
> > 
> >  at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
> > 
> > at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
> > 
> >  ... 50 more
> 
> I've been trying my best with the devlopers' documentation, but I am still
> stuck on the install phase of SSP 2.0.  I wonder if there are users of SSP
> 2 that can help me troubleshoot this, please?
> 
> -Gavin


Re: An error I can't manage to fix: java.lang.NoClassDefFoundError: org/apache/solr/search/QParserPlugin

2011-05-03 Thread Gavin Engel
Oh, I apparently figured out how to get the jar file to load, so problem is
solved I suppose.

The fix seems very odd to me, but I got it from a comment on the SSP 2 blog
page (
http://blog.jteam.nl/2009/08/03/geo-location-search-with-solr-and-lucene/comment-page-1/#comment-4774
):

The solution, for those of you getting the NoClassDefFoundError exception
thrown, is to put the jar file in your example directory, under:
solr/work/Jetty_0_0_0_0_8983_solr.war__solr__k1kf17/webapp/WEB-INF/lib/





I created that odd directory structure first, copied in the jar, and started
Jetty.  It looked like the jar was deleted, so I re-copied the jar into it.
 The second time around, everything seems to have worked.

I am lost as to why it's looking in that strange folder structure for the jar
file, instead of ./lib or ./solr/lib.



On Tue, May 3, 2011 at 4:35 PM, Markus Jelsma wrote:

> Where did you store the jar? Is it in a directory Solr looks for libs?
> Depending on your distro or set up there can be different places to store
> the
> jar. The easiest solution is to put it in a dir where other Solr libs are
> found or in a dir that you configured in a <lib> directive in solrconfig.
>
> > Hello all,
> >
> > I've been trying to add the Spatial Search Plugin to my Solr 1.4.1 setup,
> >
> > and I get this error:
> > > java.lang.NoClassDefFoundError: org/apache/solr/search/QParserPlugin
> > > at java.lang.ClassLoader.defineClass1(Native Method)
> > >
> > >  at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
> > >
> > > at
> > > java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> > >
> > >  at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
> > >
> > > ...
> > > Caused by: java.lang.ClassNotFoundException:
> > > org.apache.solr.search.QParserPlugin
> > > at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
> > >
> > >  at java.security.AccessController.doPrivileged(Native Method)
> > >
> > > at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
> > >
> > >  at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
> > >
> > > at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
> > >
> > >  ... 50 more
> >
> > I've been trying my best with the developers' documentation, but I am
> > still stuck on the install phase of SSP 2.0.  I wonder if there are
> > users of SSP 2 that can help me troubleshoot this, please?
> >
> > -Gavin
>


RE: Ebay Kleinanzeigen and Auto Suggest

2011-05-03 Thread Andy

--- On Tue, 5/3/11, Charton, Andre wrote:
> 
> yes we do. 
> 
> If you use a limited number of categories (like 100) you can use dynamic
> fields with the TermsComponent and by choosing a category-specific prefix,
> like:
> 
> {schema.xml}
> ...
> <dynamicField name="c*_suggestion" type="..." indexed="true"
>  stored="false" multiValued="true" omitNorms="true"/>
> ...
> {schema.xml}
> 
> And within the data import handler we script the prefix from the given
> category:
> 
> {data-config.xml}
> function setCatPrefixFields(row) {
>     var catId = row.get('category');
>     var title = row.get('freetext');
>     var cat_prefix = "c" + catId + "_suggestion";
>     return row;
> }
> {data-config.xml}
> 
> Then we adapt these in our application layer by a specific request
> handler, regarding these prefixes.
> 
> Pro:
>     - works fine for a limited number of categories
> 
> Con:
>     - the index gets bigger; we measured an increase of ~40 percent


Very interesting.

Why did the index get bigger? You're still indexing the same title, just to 
different dynamic fields, right? So the total amount of data indexed should 
still be the same. Adding dynamic fields shouldn't increase the index size. 
What am I missing?

Andy
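
A self-contained sketch of the transformer quoted above -- note the row.put
line is an assumption: the snippet as quoted computes cat_prefix but never
assigns the title to it, which presumably happens in the real code:

```javascript
// DataImportHandler ScriptTransformer sketch (Rhino/JavaScript).
// The row.put call is an assumption: the quoted snippet computes
// cat_prefix but never writes the title into that field.
function setCatPrefixFields(row) {
  var catId = row.get('category');
  var title = row.get('freetext');
  var cat_prefix = 'c' + catId + '_suggestion';
  row.put(cat_prefix, title); // index the title under the per-category field
  return row;
}
```

With that one extra line, a document in category 42 gets its title copied
into a field named c42_suggestion, which the TermsComponent can then target.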


Re: Has NRT been abandoned?

2011-05-03 Thread Nagendra Nagarajayya

Thanks Andy!

Solr-RA is the same as Solr, except that the underlying search library 
is now the RankingAlgorithm library instead of Lucene. BoostQParserPlugin 
works at the Solr level, so this should still work as before.


A query of the form q={!boost b=log(x)}abcde comes back with results, but I am
not sure if it is working as expected.

Regards,
- NN


On 5/2/2011 10:08 PM, Andy wrote:
> > Everything should work as before. So faceting, function queries, query
> > boosting should still work.
> > 
> > For eg:
> > q=name:efghij^2.2 name:abcd^3.2
> > 
> > returns all docs with name efghij and abcd but ranking documents named
> > abcd above efghij
> 
> Thanks Nagendra.
> 
> But I wasn't talking about field boost. The kind of boosting I need:
> 
> {!boost b=log(popularity)}foo
> 
> requires BoostQParserPlugin
> (http://search-lucene.com/jd/solr/org/apache/solr/search/BoostQParserPlugin.html)
> 
> Does Solr-RA come with BoostQParserPlugin?
> 
> Thanks.






Re: getLuceneVersion parsing xml node on every request

2011-05-03 Thread Stephane Bailliez
I went ahead and patched locally the SolrQueryParser in current 3_x branch.
Doing a quick test, barring any obvious mistake due to sleep deprivation,
I get close to a 10x performance boost, from 200 qps to 2000 qps.

I opened https://issues.apache.org/jira/browse/SOLR-2493

cheers,

-- stephane
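
The shape of the fix is presumably to compute the expensive lookup once and
reuse it on every subsequent request, rather than re-running XPath per query;
a generic sketch (SOLR-2493 has the actual Java patch):

```javascript
// Generic memoization of an expensive one-time lookup: the wrapped
// function runs once, and every later call returns the cached value.
// Sketch only -- the real fix lives in the SOLR-2493 patch.
function memoize(fn) {
  var cached, done = false;
  return function () {
    if (!done) {
      cached = fn();
      done = true;
    }
    return cached;
  };
}
```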


On Tue, May 3, 2011 at 1:45 PM, Stephane Bailliez  wrote:
> I' m using Solr 3.1 right now.
>
> I was looking at a threadump trying to figure out why queries were not
> exactly fast and noticed that it keeps parsing xml over and over from
> the schema to get the lucene version.
>
> SolrQueryParser are created for each request and in the constructor
> there is a call similar to
>
> getSchema().getSolrConfig().getLuceneVersion("luceneMatchVersion",
> Version.LUCENE_24)
>
> which calls getVal() which is calling getNode() which creates a new
> XPath object which ends up creating a new object factory which ends up
> loading a class...
>
> I cannot find a reference to this issue anywhere in jira nor google.
> Hard to see right now how much effect that does have, but this seems
> not quite optimal to do for every request.
>
> Am I missing something obvious here ?
>
> The stack looks like:
>
>  java.lang.Thread.State: BLOCKED (on object monitor)
>        at 
> org.mortbay.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:369)
>        - waiting to lock <0x2aaab3bb43b0> (a
> org.mortbay.jetty.webapp.WebAppClassLoader)
>        at 
> org.mortbay.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:363)
>        at 
> com.sun.org.apache.xml.internal.dtm.ObjectFactory.findProviderClass(ObjectFactory.java:506)
>        at 
> com.sun.org.apache.xml.internal.dtm.ObjectFactory.lookUpFactoryClass(ObjectFactory.java:217)
>        at 
> com.sun.org.apache.xml.internal.dtm.ObjectFactory.createObject(ObjectFactory.java:131)
>        at 
> com.sun.org.apache.xml.internal.dtm.ObjectFactory.createObject(ObjectFactory.java:101)
>        at 
> com.sun.org.apache.xml.internal.dtm.DTMManager.newInstance(DTMManager.java:135)
>        at 
> com.sun.org.apache.xpath.internal.XPathContext.<init>(XPathContext.java:100)
>        at 
> com.sun.org.apache.xpath.internal.jaxp.XPathImpl.eval(XPathImpl.java:201)
>        at 
> com.sun.org.apache.xpath.internal.jaxp.XPathImpl.evaluate(XPathImpl.java:275)
>        at org.apache.solr.core.Config.getNode(Config.java:230)
>        at org.apache.solr.core.Config.getVal(Config.java:256)
>        at org.apache.solr.core.Config.getLuceneVersion(Config.java:325)
>        at 
> org.apache.solr.search.SolrQueryParser.<init>(SolrQueryParser.java:76)
>        at 
> org.apache.solr.schema.IndexSchema.getSolrQueryParser(IndexSchema.java:277)
>        at 
> org.apache.solr.search.LuceneQParser.parse(LuceneQParserPlugin.java:76)
>        at org.apache.solr.search.QParser.getQuery(QParser.java:142)
>
> Cheers,
>
> -- stephane
>


RE: How to take differential backup of Solr Index

2011-05-03 Thread Gaurav Shingala

How can we configure a query server in Solr using the replication feature?

Thanks,
Gaurav

> Date: Mon, 2 May 2011 22:05:33 -0700
> Subject: Re: How to take differential backup of Solr Index
> From: goks...@gmail.com
> To: solr-user@lucene.apache.org
> 
> The Replication feature does this. If you configure a query server as
> a 'backup' server, it downloads changes but does not read them.
> 
> On Mon, May 2, 2011 at 9:56 PM, Gaurav Shingala
>  wrote:
> >
> > Hi,
> >
> > Is there any way to take differential backup of Solr Index?
> >
> > Thanks,
> > Gaurav
> >
> >
> 
> 
> 
> -- 
> Lance Norskog
> goks...@gmail.com
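
For the follow-up question, a minimal sketch of the slave ("backup"/query
server) side of the replication config in solrconfig.xml -- the master URL
and poll interval here are placeholders:

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- placeholder: point this at your own master core -->
    <str name="masterUrl">http://master-host:8983/solr/replication</str>
    <!-- hh:mm:ss; how often the slave polls the master for changes -->
    <str name="pollInterval">00:05:00</str>
  </lst>
</requestHandler>
```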
  

Re: Replicaiton Fails with Unreachable error when master host is responding.

2011-05-03 Thread Jed Glazner
So it turns out that it's the host names. According to the DNS RFC, 
underscores are not valid in host names. Most DNS servers now support 
them, but strictly speaking it's not in the RFC. So there must be 
something in the underlying Java classes that borks when using 
underscores in host names, though I didn't see anything in the stack 
trace that indicated an invalid host name exception. That was the 
issue, though. Once I changed the host name to the master's IP address, 
replication worked great. So I'm working with our IT to remove 
underscores from our host names.
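
The rule in question can be sketched as a quick check (letters, digits, and
hyphens only per RFC 952/1123; underscores fail, which is exactly what bit
us here):

```javascript
// Rough check of RFC 952/1123 hostname labels: letters, digits and
// hyphens only, starting and ending with an alphanumeric character.
// A sketch, not a complete validator.
function isValidLabel(label) {
  return /^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?$/.test(label);
}
function isValidHostname(host) {
  return host.length > 0 && host.split('.').every(isValidLabel);
}
```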


Just thought I would post my answer here in case anyone else had that 
issue.


Thanks.

Jed.

On 04/28/2011 02:03 PM, Mike Sokolov wrote:

No clue. Try wireshark to gather more data?

On 04/28/2011 02:53 PM, Jed Glazner wrote:

Anybody?

On 04/27/2011 01:51 PM, Jed Glazner wrote:

Hello All,

I'm having a very strange problem that I just can't figure out. The
slave is not able to replicate from the master, even though the master
is reachable from the slave machine.  I can telnet to the port it's
running on, I can use text based browsers to navigate the master from
the slave. I just don't understand why it won't replicate.  The admin
screen gives me an Unreachable in the status, and in the log there is an
exception thrown.  Details below:

BACKGROUND:

OS: Arch Linux
Solr Version: svn revision 1096983 from
https://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/
No custom plugins, just whatever came with the version above.
Java Setup:

java version "1.6.0_22"
OpenJDK Runtime Environment (IcedTea6 1.10) (ArchLinux-6.b22_1.10-1-x86_64)
OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)

We have 3 cores running, all 3 cores are not able to replicate.

The admin on the slave shows the Master as
http://solr-master-01_dev.la.bo:8983/solr/music/replication - *Unreachable*

Replication def on the slave:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://solr-master-01_dev.la.bo:8983/solr/music/replication</str>
    <str name="pollInterval">00:15:00</str>
  </lst>
</requestHandler>

Replication def on the master:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="replicateAfter">startup</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>

Below is the full log of the replication attempts. Note that it says
connection refused; however, I can telnet to 8983 from the slave to the
master, so I know it's up and reachable from the slave:

telnet solr-master-01_dev.la.bo 8983
Trying 172.12.65.58...
Connected to solr-master-01_dev.la.bo.
Escape character is '^]'.

I double-checked the master to make sure that it didn't have replication
turned off, and it doesn't. So I should be able to replicate, but I can't.
I just don't know what else to check. The log from the slave is below.

Apr 27, 2011 7:39:45 PM org.apache.solr.request.SolrQueryResponse
WARNING: org.apache.solr.request.SolrQueryResponse is deprecated. Please
use the corresponding class in org.apache.solr.response
Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector
executeWithRetry
INFO: I/O exception (java.net.ConnectException) caught when processing
request: Connection refused
Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector
executeWithRetry
INFO: Retrying request
Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector
executeWithRetry
INFO: I/O exception (java.net.ConnectException) caught when processing
request: Connection refused
Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector
executeWithRetry
INFO: Retrying request
Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector
executeWithRetry
INFO: I/O exception (java.net.ConnectException) caught when processing
request: Connection refused
Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector
executeWithRetry
INFO: Retrying request
Apr 27, 2011 7:39:45 PM org.apache.solr.handler.ReplicationHandler
getReplicationDetails
WARNING: Exception while invoking 'details' method for replication on
master
java.net.ConnectException: Connection refused
   at java.net.PlainSocketImpl.socketConnect(Native Method)
   at
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
   at
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
   at
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
   at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
   at java.net.Socket.connect(Socket.java:546)
   at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
   at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:616)
   at
org.apache.commons.httpclient.protocol.ReflectionSocketFactory.createSocket(ReflectionSocketFactory.java:140)
   at
org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:125)
   at
org.apache.commons.httpclient.HttpConnection

Using lowercase as field type

2011-05-03 Thread Isan Fulia
Hi ,

My schema consists of a field of type lowercase (for applying the lowercase
filter factory), and it is the unique key. But it is no longer behaving as a
unique key: multiple documents with the same value for the unique key are
getting indexed.
Does anyone know why this is happening, or is it that a field of type
lowercase cannot be a unique key?

-- 
Thanks & Regards,
Isan Fulia.


How to preserve filelist / commit-points after master restart

2011-05-03 Thread Maduranga Kannangara
Hi All,

We use Solr 1.4.1. Single core setup with a repeater (for QA) and a few
slaves (for Production).

Master will index many sources and make data ready. Once all data is
"ready-for-production", optimization will take place. On master
"replicateAfter" is set to "optimize". (Subsequently on repeater
replicateAfter=commit,startup). We do not want to use
replicateAfter=startup,optimize on master as that would release bad data. As
you can see, a bunch of sources should fit together to be able to release a
sensible product. So we use "replicateAfter=optimize" to denote data is now
okay to move to the next level.

The problem is that when the master is restarted, the filelist command on the
ReplicationHandler returns nothing, and replication will not take place until
another optimize command is issued on the master.

How can I preserve the "optimized" state (or filelist, or commit points --
not sure what keyword to use) even after a master restart, so that slaves
can carry on from there? (I saw the mail thread Yonik answered, "Replication
filelist command failure on container restart", but I am trying to figure
out whether it's possible to persist this file list or indexDeletionPolicy
or whatever that state is -- please correct me on that, and sorry for my
layman language.)

We have many master indexes set up in this way, so it's not a good idea for
us to run optimize or have replicateAfter=startup on each index, as that
would reduce data quality or the possible level of automation.

Any solution to work around or fix this issue is highly appreciated.

Thanks in advance
Madu
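
One avenue that might be worth exploring (untested for this exact scenario)
is the <deletionPolicy> in solrconfig.xml, which controls how many commit
points Solr keeps; whether a kept optimized commit point is also visible to
filelist after a restart is precisely the open question here:

```xml
<!-- solrconfig.xml: keep the most recent optimized commit point around.
     Treat this as a direction to investigate, not a confirmed fix. -->
<deletionPolicy class="solr.SolrDeletionPolicy">
  <str name="maxCommitsToKeep">1</str>
  <str name="maxOptimizedCommitsToKeep">1</str>
</deletionPolicy>
```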


Re: An error I can't manage to fix: java.lang.NoClassDefFoundError: org/apache/solr/search/QParserPlugin

2011-05-03 Thread Markus Jelsma
So, you're using Jetty. That's indeed a place to store the file when using 
Jetty. 

> Oh, I apparently figured out how to get the jar file to load, so problem is
> solved I suppose.
> 
> The fix seems very odd to me, but I got it from a comment on the SSP 2 blog
> page (
> http://blog.jteam.nl/2009/08/03/geo-location-search-with-solr-and-lucene/comment-page-1/#comment-4774 ):
> 
> The solution, for those of you getting the NoClassDefFoundError exception
> thrown, is to put the jar file in your example directory, under:
> solr/work/Jetty_0_0_0_0_8983_solr.war__solr__k1kf17/webapp/WEB-INF/lib/
> 
> 
> 
> 
> 
> I created that odd directory structure first, copied in the jar, and
> started Jetty.  It looked like the jar was deleted, so I re-copied the jar
> into it. The second time around, everything seems to have worked.
> 
> I am lost as to why it's looking in that strange folder structure for the
> jar file, instead of ./lib or ./solr/lib.
> 
> On Tue, May 3, 2011 at 4:35 PM, Markus Jelsma 
wrote:
> > Where did you store the jar? Is it in a directory Solr looks for libs?
> > Depending on your distro or set up there can be different places to store
> > the
> > jar. The easiest solution is to put it in a dir where other Solr libs are
> > found or in a dir that you configured in a <lib> directive in solrconfig.
> > 
> > > Hello all,
> > > 
> > > I've been trying to add the Spatial Search Plugin to my Solr 1.4.1
> > > setup,
> > > 
> > > and I get this error:
> > > > java.lang.NoClassDefFoundError: org/apache/solr/search/QParserPlugin
> > > > at java.lang.ClassLoader.defineClass1(Native Method)
> > > > 
> > > >  at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
> > > > 
> > > > at
> > > > java.security.SecureClassLoader.defineClass(SecureClassLoader.java:14
> > > > 2)
> > > > 
> > > >  at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
> > > > 
> > > > ...
> > > > Caused by: java.lang.ClassNotFoundException:
> > > > org.apache.solr.search.QParserPlugin
> > > > at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
> > > > 
> > > >  at java.security.AccessController.doPrivileged(Native Method)
> > > > 
> > > > at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
> > > > 
> > > >  at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
> > > > 
> > > > at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
> > > > 
> > > >  ... 50 more
> > > 
> > > I've been trying my best with the developers' documentation, but I am
> > > still stuck on the install phase of SSP 2.0.  I wonder if there are
> > > users of SSP 2 that can help me troubleshoot this, please?
> > > 
> > > -Gavin


Re: Using lowercase as field type

2011-05-03 Thread Markus Jelsma
So those multiple documents overwrite each other? In that case, your data is 
not suited for a lowercased docID. I'd recommend not doing any analysis on the 
docID, to prevent such headaches.
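
If you still need case-insensitive lookups, a common pattern is to keep the
key itself unanalyzed and lowercase a copy -- a schema.xml sketch with
made-up field names:

```xml
<!-- the uniqueKey stays a raw string; queries that need case-insensitive
     matching search the copied, lowercased field instead -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="id_lower" type="lowercase" indexed="true" stored="false"/>
<copyField source="id" dest="id_lower"/>
<uniqueKey>id</uniqueKey>
```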

> Hi ,
> 
> My schema consists of a field of type lowercase(for applying the lowercase
> filter factory)  and is the unique key .  But its no longer behaving as
> unique key. Multiple documents with same value for the unique key are
> getting indexed.
> Does anyone know why this is happening or is it that the field of type
> lowercase cannot be unique key.


Solr Terms and Date field issues

2011-05-03 Thread Viswa S

Hello,

The terms query for a date field seems to get populated with some weird
dates; many of these dates (1970, 2009, 2011-04-23) are not present in the
indexed data. Please see the sample data below.

I also notice that a delete and optimize does not remove the relevant terms
for date fields; the string fields seem to work fine.

Thanks
Viswa

Results from Terms component (the term values were stripped from the XML in
archiving; only the per-term counts survive):

3479, 3479, 3479, 3479, 3479, 3479, 3479, 3479, 3479, 265


Result from facet component, rounded by seconds (tags likewise stripped;
counts and range parameters only):

counts: 1, 1148, 2333
gap:    +1SECOND
start:  2011-05-03T06:14:14Z
end:    2011-05-04T06:14:14Z

Re: Using lowercase as field type

2011-05-03 Thread Isan Fulia
I want multiple documents with the same unique key to overwrite each other,
but they are not overwriting, because of the lowercase field type as the
unique key.

On 4 May 2011 11:45, Markus Jelsma  wrote:

> So those multiple documents overwrite eachother? In that case, your data is
> not suited for a lowercased docID. I'd recommend not doing any analysis on
> the
> docID to prevent such headaches.
>
> > Hi ,
> >
> > My schema consists of a field of type lowercase(for applying the
> lowercase
> > filter factory)  and is the unique key .  But its no longer behaving as
> > unique key. Multiple documents with same value for the unique key are
> > getting indexed.
> > Does anyone know why this is happening or is it that the field of type
> > lowercase cannot be unique key.
>



-- 
Thanks & Regards,
Isan Fulia.