Re: OutOfMemory error while sorting

2006-06-19 Thread Marcus Stratmann

Hi,

Chris Hostetter wrote:

> This is a fairly typical Lucene issue (ie: not specific to Solr)...
Ah, I see. I should really pay more attention to Lucene. But when
working with Solr I sometimes forget about the underlying technology.



> Sorting on a field requires building a FieldCache for every document --
> regardless of how many documents match your query.  This cache is reused
> for all searches that sort on that field.
This makes things clear to me now. I always observed that Solr is slow
after a commit or optimize. Whenever I put a newly created or updated
index into service, the server seemed to hang. The CPU usage went to
nearly 100 percent and no queries were answered. I found out that
"warming" the server with serial queries, not parallel ones, avoided
this problem (not to be confused with warming the caches!). So after a
commit I sent a few hundred queries from our log to the server and this
worked fine. But now I know I only need a few specific queries to do the
job.
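
A rough back-of-envelope sketch of why the first sorted query after a commit is expensive: the cache holds one entry per document in the index, not per match. The 4 bytes per entry and the 10 million documents below are assumptions for illustration (for example an integer sort value or a term ordinal; the term strings of a text field would come on top), and the function name is hypothetical.

```python
def fieldcache_bytes(num_docs, bytes_per_entry=4):
    """Lower-bound size of a per-field sort cache: one slot per indexed document."""
    return num_docs * bytes_per_entry

# Hypothetical index of 10 million documents, sorted on one field.
estimate = fieldcache_bytes(10_000_000)
print(f"{estimate} bytes (~{estimate / 1024 ** 2:.0f} MiB) per sorted field")
```

The estimate does not depend on the query at all, which is why a single warming query per sort field is enough to pay this cost up front.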


Thanks Chris for the great support! The Solr team is doing a very good 
job. With your help I finally got Solr running. Our system is live now 
and I will now switch over to the "Who uses Solr" thread to give you 
some feedback.


Again, thank you very much!

Marcus


Re: who uses Solr?

2006-06-19 Thread Fabio Confalonieri

We (Zero Computing S.r.l. of Italy, www.zero.it) are now using Solr as the
index for a classified ads portal for our customer Gruppo Espresso:
"one of the leading media groups in Italy with interests in publishing,
radio, advertising, internet businesses and television" (from their site
http://www.gruppoespresso.it/gruppoesp/eng/index.jsp).

In particular I've developed a custom request handler to achieve faceted
browsing capability (like you can see at www.oodle.com).

We plan to deploy the portal at the end of July. As I already said, as soon
as we are online I'll update the wiki.

Happy presentation!

Fabio

-- 
Ing. Fabio Confalonieri
Zero Computing S.r.l. (www.zero.it)



Re: who uses Solr?

2006-06-19 Thread Marcus Stratmann

Our Solr system has been up for a few days now. You can find it at
http://www.booklooker.de/
I'm sorry we have a German user interface only, but if you want to
try out our system you can just fill out some fields in our search form
and press "suchen" on the right side. We are "book brokers" and maybe
it's not too hard to find out that "Autor" means "author" and "Titel" is
"title". "Stichwort" may be interesting because this means "keyword" and
will perform a search in a "multiValued" field in Solr. One important
note: there are two checkboxes labeled "gebraucht" (used) and "neu"
(new). Do not check "neu" because this will search in an external
database which is much slower than ours. ;-)


For the more technically interested, here are some parameters. We now
have about 10.5 million documents in our index, each consisting of 24
fields (you can see why when you click "SUCHEN" on the left side, which
will present you with a detailed search form). The index takes up 2.6G
on disk. We have two Solr servers running (actually Tomcat servers), but
normally just one is active. Our users submit about 200,000 queries per
day, which is 2.3 queries per second on average. Typically this varies
from 1.5 to 4.5 queries per second over the day. Additionally we have
about 100,000 "search tasks" in our database which are processed in the
morning hours (increasing the number of queries per second to 11). The
index is updated once per day on our main server and then copied to our
second server.
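
As a quick check of the average rate quoted above, the arithmetic can be sketched as follows. The function name is hypothetical; the only input taken from the message is the figure of about 200,000 queries per day.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def average_qps(queries_per_day):
    """Average queries per second, assuming the load is spread over the whole day."""
    return queries_per_day / SECONDS_PER_DAY

print(round(average_qps(200_000), 1))  # 2.3
```

This is only the daily mean; the peaks mentioned in the message (4.5 and 11 queries per second) are what the server has to be sized for.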

If you have any questions, I'm glad to give you further information.

Thanks to the Solr community for helping us set up this system!

Marcus


Re: OutOfMemory error while sorting

2006-06-19 Thread Chris Hostetter

: nearly 100 percent and no queries were answered. I found out that
: "warming" the server with serial queries, not parallel ones, bypassed
: this problem (not to be confused with warming the caches!). So after a

Note that you can have Solr do this automatically for you in both
firstSearcher and newSearcher listeners (so you never risk having one of
your users hit the searcher before your warming queries).  Take a look at
the commented out usage of QuerySenderListener in the example
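solrconfig.xml...


    <listener event="newSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst> <str name="q">solr</str> <str name="start">0</str> <str name="rows">10</str> </lst>
        <lst> <str name="q">rocks</str> <str name="start">0</str> <str name="rows">10</str> </lst>
      </arr>
    </listener>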



-Hoss
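
For comparison, the manual approach described earlier in the thread (sending warming queries one at a time after a commit) could be sketched as a small client-side script. This is a minimal sketch, not how Solr does it internally: the base URL, host and port are assumptions, and `warming_url` and `warm_serially` are hypothetical helper names. The `q`, `start` and `rows` parameters mirror the values in the listener example.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed location of the Solr select handler; adjust host, port and path.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"

def warming_url(query, start=0, rows=10, base=SOLR_SELECT_URL):
    """Build the URL for a single warming request."""
    return base + "?" + urlencode({"q": query, "start": start, "rows": rows})

def warm_serially(queries, fetch=None):
    """Send the queries one at a time, waiting for each response before the next."""
    if fetch is None:
        fetch = lambda url: urlopen(url).read()  # blocking HTTP GET
    for query in queries:
        fetch(warming_url(query))

# Example (needs a running server): warm_serially(["solr", "rocks"])
```

The listener configuration is still preferable where it is available, since a script like this cannot guarantee that no user query reaches the new searcher first.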



newbie Q regarding schema configuration

2006-06-19 Thread Ian Holsman

hi.

so I finally managed to find a bit of time to get a Solr instance
going, and now have some questions about it ;-)


first, the application is tagging, ie. associating some keywords
with a given item and showing them on a particular object (you can
see this in action here: http://economy-chat.com/aggy/detail/andrew-leigh/ )


It is user-based (ie. individuals can tag a particular object themselves,
and that gets merged into a global summary for that object)
and it is also hierarchical, ie. tagging a child implies you have also
tagged the parent.


so.. my first question: in schema.xml, can you have a composite key as
the 'uniqueKey' field, or do I need to do this on the client side?


2nd question.

can you have complex types which are multivalued?
I'd like to store something like
a tag-name with a corresponding tag-weighting.

can you do sum(*) type queries in lucene/solr? is it efficient? or
are you better off having a 2nd index which has these sum(*) values in it
and keeping it up to date instead?




Thanks


Wildcard Query

2006-06-19 Thread Pace Davis

I have been using Lucene for about a month now and am trying to port the
same functionality to Solr.  How do I do a wildcard query with a leading "*"?
This is possible with Lucene if you do not use the standard query
parser.  How do you do this with Solr?  This is probably very easy but I
cannot find any information in the docs or the mailing list.

Please help

Thanks