Re: Solr on JBOSS 4.0.3

2007-06-01 Thread Thierry Collogne

Thank you for the suggestion. I downloaded a newer version of JBOSS and then
copied the files from the JBOSS_HOME\lib\endorsed folder to the WEB-INF\lib folder
of Solr.

That solved it.
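For reference, the copy step described above can be sketched roughly like this (the paths are placeholders; adjust JBOSS_HOME and the location of Solr's exploded webapp for your install):

```python
import glob
import os
import shutil

def copy_endorsed_jars(jboss_endorsed, solr_webinf_lib):
    """Copy every jar from JBoss's endorsed dir into Solr's WEB-INF/lib."""
    os.makedirs(solr_webinf_lib, exist_ok=True)
    copied = []
    for jar in sorted(glob.glob(os.path.join(jboss_endorsed, "*.jar"))):
        shutil.copy(jar, solr_webinf_lib)
        copied.append(os.path.basename(jar))
    return copied
```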

On 31/05/07, Daniel Naber <[EMAIL PROTECTED]> wrote:


On Thursday 31 May 2007 09:58, Thierry Collogne wrote:

> Is there someone who can explain to me what the dependencies are with
> the above jar files? Or perhaps offer another solution?

You need to find the right version of those files (probably newer than the
ones in JBoss?) and place them in WEB-INF/lib of Solr. Then Solr should
use them and the rest of the system should not be affected. I'm not sure
how to find those versions other than trying.

Regards
Daniel

--
http://www.danielnaber.de



question about highlight field

2007-06-01 Thread Xuesong Luo
Hi, there,

I have a question about how to use the highlight field (hl.fl); below is
my test result. As you can see, if I don't use hl.fl in the query, the
highlighting element in the result only shows the id information. I have
to add the field name (hl.fl=TITLE) to the query to see the field
information. Is that the correct behavior? If there are multiple fields
that could contain the search string, do I have to add all of them to
hl.fl?

 

Thanks

Xuesong

 

http://localhost:8080/search/select/?q=Consultant&version=2.2&start=0&rows=10&indent=on&hl=true

 

[response XML stripped by the mail archive; the highlighting section
contained only entries with document ids, no field text]

http://localhost:8080/search/select/?q=Consultant&version=2.2&start=0&rows=10&indent=on&hl=true&hl.fl=TITLE
 

 

 

[response XML stripped by the mail archive; with hl.fl=TITLE the
highlighting section included the TITLE field text "Senior Event Manager"]
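When several fields can match, hl.fl takes a comma- (or space-) separated list of field names. A sketch of building query URLs like the two above (the base URL is taken from the examples; the DESCRIPTION field is a hypothetical second field):

```python
from urllib.parse import urlencode

def solr_query_url(base, q, hl_fields=None, **params):
    """Build a Solr select URL; hl_fields joins multiple highlight fields."""
    query = {"q": q, "version": "2.2", "start": 0, "rows": 10,
             "indent": "on", **params}
    if hl_fields:
        query["hl"] = "true"
        query["hl.fl"] = ",".join(hl_fields)
    return base + "/select/?" + urlencode(query)
```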



Re: question about highlight field

2007-06-01 Thread Mike Klaas


On 1-Jun-07, at 9:37 AM, Xuesong Luo wrote:


Hi, there,

I have a question about how to use the highlight field (hl.fl), below is
my test result. As you can see, if I don't use hl.fl in the query, the
highlighting element in the result only shows the id information. I have
to add the field name (hl.fl=TITLE) to the query to see the field
information. Is that the correct behavior? If there are multiple fields
that could contain the search string, I have to add all of them to
hl.fl?


Highlighting uses the following fields:

1. hl.fl, if present, defines all fields to be highlighted.  You
can highlight fields that were not part of the query (as you
demonstrate below).

2. If hl.fl is absent and qt=standard, the default search field is
highlighted (set in schema.xml or via the df= parameter).

3. If hl.fl is absent and qt=dismax, the query fields are used (qf=).

Note that every field to be highlighted must be stored.  If not, it
will not be present in the output (perhaps that is what you are
seeing in your example).

Finally, all terms are highlighted in all highlight fields.  If your
query searches for different terms in different fields and you want
this exactitude to carry forth in your highlighting, specify
hl.requireFieldMatch=true.


-Mike
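The selection rules Mike lists can be sketched as a small helper (a rough sketch of the described behavior, not Solr's actual implementation; the arguments mirror the request parameters hl.fl, qt, df, and qf):

```python
def highlight_fields(hl_fl=None, qt="standard", df=None, qf=None):
    """Pick which fields get highlighted, per the three rules above."""
    if hl_fl:                       # rule 1: explicit hl.fl wins
        return [f for f in hl_fl.replace(",", " ").split() if f]
    if qt == "dismax" and qf:       # rule 3: dismax falls back to qf
        return qf.split()
    return [df] if df else []       # rule 2: standard uses the default field
```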


Re: question about highlight field

2007-06-01 Thread Chris Hostetter

: I have a question about how to use the highlight field(hl.fl), below is
: my test result. As you can see, if I don't use hl.fl in the query, the
: highlighting element in the result only shows the id information. I have

according to the wiki, a blank (or missing) hl.fl should result in the
fields you searched being used for highlighting ... so either the wiki is
out of date, or there is a bug in the highlighting ... i'm not sure which.

(hopefully someone who knows more about highlighting can chime in)

Demo using example schema and explicitly querying a field that is
stored...
http://localhost:8983/solr/select?q=features%3Asolr&hl=on


-Hoss



Indexing a lot of documents?

2007-06-01 Thread Jordan Hayes
New user here, and I ran into a problem trying to load a lot of 
documents (~900k).  I tried to load them all at once, which seemed to 
run for a long time and then finally crap out with "Too many open files" 
... so I read in an FAQ that "about 100" might be a good number.  I 
split my documents up and added <commit/> to the end of each batch, and 
got about 10k into it before getting that error again.


Am I just doing something wrong?  And: is there a way to just hand the 
XML file to Solr without having to POST it?


Thanks,

/jordan 



Re: Indexing a lot of documents?

2007-06-01 Thread Mike Klaas

On 1-Jun-07, at 6:35 PM, Jordan Hayes wrote:

New user here, and I ran into a problem trying to load a lot of
documents (~900k).  I tried to load them all at once, which seemed
to run for a long time and then finally crap out with "Too many
open files" ... so I read in an FAQ that "about 100" might be a
good number.  I split my documents up and added <commit/> to the
end of each batch, and got about 10k into it before getting that
error again.


I'm not clear exactly what you mean by batches.  There are two types:

1. Batches of documents sent in a single <add> command to Solr.  Good
values are between 10 and 100, depending on document size.  It is
mostly about reducing http overhead (which is small regardless), so
it is very quickly pointless to increase this number.  Try persistent
HTTP connections instead.

2. Batches of docs sent between <commit/>s.  In theory unlimited, but
I once ran into a problem that I could not reliably reproduce when
<commit/>ing 4m docs.  It occurred under tightish memory conditions
(for 4m docs) and I've since made a change to Solr's deleted-docs
algorithm which should optimize the io in such cases.  In any case,
<commit/>ing every 300-400k docs would not hurt.  I would not ever
commit as frequently as every 100 docs, unless there were query
timeliness requirements.
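A minimal sketch of the first kind of batching (building an update message of N docs per POST; the field name and escaping here are illustrative, not a real Solr client):

```python
from xml.sax.saxutils import escape

def add_message(docs, field="text"):
    """Build one Solr <add> update message for a batch of documents."""
    body = "".join(
        '<doc><field name="%s">%s</field></doc>' % (field, escape(d))
        for d in docs
    )
    return "<add>%s</add>" % body

def batches(docs, size):
    """Split docs into POST-sized batches (10-100 docs each is typical)."""
    for i in range(0, len(docs), size):
        yield docs[i:i + size]
```

Each batch becomes one HTTP POST to the update handler; a separate <commit/> message would then be sent every few hundred thousand docs rather than per batch.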



Am I just doing something wrong?


No.  Lucene sometimes just requires many file descriptors (this will  
be somewhat alleviated with Solr 1.2).  I suggest upping the open  
file limit (I upped mine from 1024 to 45000 to handle huge indices).   
You can alleviate this by reducing the mergeFactor, but this can  
impact indexing performance.


And: is there a way to just hand the XML file to Solr without  
having to POST it?


No, but POST'ing shouldn't be a bottleneck.
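For completeness, posting an update file is just an HTTP POST of the XML body; roughly (the host and /solr/update path are assumptions for a default install, and the request would be sent against a live server):

```python
import urllib.request

def update_request(xml_bytes, url="http://localhost:8983/solr/update"):
    """Build the POST request Solr's XML update handler expects."""
    return urllib.request.Request(
        url,
        data=xml_bytes,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )
```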

-Mike


Re: Indexing a lot of documents?

2007-06-01 Thread Erik Hatcher


On Jun 1, 2007, at 10:47 PM, Mike Klaas wrote:

Am I just doing something wrong?


No.  Lucene sometimes just requires many file descriptors (this  
will be somewhat alleviated with Solr 1.2).  I suggest upping the  
open file limit (I upped mine from 1024 to 45000 to handle huge  
indices).  You can alleviate this by reducing the mergeFactor, but  
this can impact indexing performance.


Another thing to do which will definitely keep file handles down is  
to switch to the compound index format.  That setting is in solrconfig.xml


Erik





Re: Indexing a lot of documents?

2007-06-01 Thread Yonik Seeley

On 6/1/07, Erik Hatcher <[EMAIL PROTECTED]> wrote:

On Jun 1, 2007, at 10:47 PM, Mike Klaas wrote:
>> Am I just doing something wrong?
>
> No.  Lucene sometimes just requires many file descriptors (this
> will be somewhat alleviated with Solr 1.2).  I suggest upping the
> open file limit (I upped mine from 1024 to 45000 to handle huge
> indices).  You can alleviate this by reducing the mergeFactor, but
> this can impact indexing performance.

Another thing to do which will definitely keep file handles down is
to switch to the compound index format.  That setting is in solrconfig.xml


That should be less necessary when Solr 1.2 comes out (next week, I promise ;-)
There are now 8 files per segment instead of 7 + num_indexed_fields.

-Yonik
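As a back-of-the-envelope check on descriptor usage (the per-segment file counts come from Yonik's figures above; the segment count and field count are made-up inputs for illustration):

```python
def open_files_estimate(segments, num_indexed_fields, solr12=True):
    """Rough upper bound on index file descriptors, per Yonik's figures."""
    per_segment = 8 if solr12 else 7 + num_indexed_fields
    return segments * per_segment
```

With, say, 20 segments and 50 indexed fields, pre-1.2 could need over a thousand descriptors for the index alone, which is why raising the OS open-file limit helps.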


Re: Indexing a lot of documents?

2007-06-01 Thread Jordan Hayes

I'm not clear exactly what you mean by batches.


What I'm doing is:

<add>
<doc> ... </doc>  <-- #1
  [...]
<doc> ... </doc>  <-- #100
</add>

So that's 100 docs, in 1 HTTP POST.

Lucene sometimes just requires many file descriptors (this will  
be somewhat alleviated with Solr 1.2).


Is there a way to find out how many is "many" ...?

/jordan