RE: Solr and Terracotta

2006-12-08 Thread Fuad Efendi
Cool... I don't believe in any "real-time" for enterprise. Everything is measured by response time, which is very good at 0.5-3 seconds and acceptable up to 10 seconds... BEA offers 'real-time' WebLogic, JRockit uses 'deterministic garbage collection'. Indeed, we have 'asynchronous'

Re: Result: numFound inaccuracies

2006-12-08 Thread Yonik Seeley
On 12/8/06, Andrew Nagy <[EMAIL PROTECTED]> wrote: Hello, me again. I have been running some extensive tests of my search engine and have been seeing inaccuracies with the "numFound" attribute. It tends to return 1 more than what is actually shown in the XML. Is this a bug, or could I be doing

Result: numFound inaccuracies

2006-12-08 Thread Andrew Nagy
Hello, me again. I have been running some extensive tests of my search engine and have been seeing inaccuracies with the "numFound" attribute. It tends to return 1 more than what is actually shown in the XML. Is this a bug, or could I be doing something wrong? I have a specific example in fr

Re: Facet Performance

2006-12-08 Thread Andrew Nagy
Erik Hatcher wrote: On Dec 8, 2006, at 2:15 PM, Andrew Nagy wrote: My data is 492,000 records of book data. I am faceting on 4 fields: author, subject, language, format. Format and language are fairly simple as there are only a few unique terms. Author and subject however are much differe

Re: Facet Performance

2006-12-08 Thread Chris Hostetter
: Unfortunately which strategy will be chosen is currently undocumented : and control is a bit oblique: If the field is tokenized or multivalued : or Boolean, the FilterQuery method will be used; otherwise the : FieldCache method. I expect I or others will improve that shortly. Bear in mind, wh

Re: Facet Performance

2006-12-08 Thread Erik Hatcher
On Dec 8, 2006, at 2:15 PM, Andrew Nagy wrote: My data is 492,000 records of book data. I am faceting on 4 fields: author, subject, language, format. Format and language are fairly simple as there are only a few unique terms. Author and subject however are much different in that there are

Re: Facet Performance

2006-12-08 Thread Andrew Nagy
J.J. Larrea wrote: Unfortunately which strategy will be chosen is currently undocumented and control is a bit oblique: If the field is tokenized or multivalued or Boolean, the FilterQuery method will be used; otherwise the FieldCache method. I expect I or others will improve that shortly.
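
The selection rule quoted above can be written out as a small sketch. This is only an illustration of the rule as described in the thread, not Solr's actual code; the function name, parameter names, and return labels are hypothetical.

```python
def choose_facet_method(tokenized: bool, multivalued: bool, is_boolean: bool) -> str:
    """Illustrative only: mirrors the rule described in the thread.

    A field that is tokenized, multivalued, or Boolean is faceted with one
    filter per term (the FilterQuery method); any other field is faceted
    through the FieldCache method.
    """
    if tokenized or multivalued or is_boolean:
        return "filter_query"
    return "field_cache"


# A multivalued author field, as discussed in the thread.
print(choose_facet_method(tokenized=False, multivalued=True, is_boolean=False))

# A single-valued, untokenized string field.
print(choose_facet_method(tokenized=False, multivalued=False, is_boolean=False))
```

Under this rule, a multivalued field falls into the per-term filter path, which is why the size of the filterCache matters for such fields.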

Re: Facet Performance

2006-12-08 Thread Yonik Seeley
On 12/8/06, J.J. Larrea <[EMAIL PROTECTED]> wrote: Unfortunately which strategy will be chosen is currently undocumented and control is a bit oblique: If the field is tokenized or multivalued or Boolean, the FilterQuery method will be used; otherwise the FieldCache method. If anyone had time

Re: Facet Performance

2006-12-08 Thread J.J. Larrea
Andrew Nagy, ditto on what Yonik said. Here is some further elaboration: I am doing much the same thing (faceting on Author etc.). When my Author field was defined as a solr.TextField, even using solr.KeywordTokenizerFactory so it wasn't actually tokenized, the faceting code chose the QueryFilt

Re: Facet Performance

2006-12-08 Thread Andrew Nagy
Yonik Seeley wrote: Are they multivalued, and do they need to be? Anything that is of type "string" and not multivalued will use the lucene FieldCache rather than the filterCache. The author field is multivalued. Will this be a strong performance issue? I could make multiple author fields as

Re: Facet Performance

2006-12-08 Thread Yonik Seeley
On 12/8/06, Andrew Nagy <[EMAIL PROTECTED]> wrote: Chris Hostetter wrote: : Could you suggest a better configuration based on this? If that's what your stats look like after a single request, then i would guess you would need to make your cache size at least 1.6 million in order for it to

Re: Facet Performance

2006-12-08 Thread Andrew Nagy
Chris Hostetter wrote: : Could you suggest a better configuration based on this? If that's what your stats look like after a single request, then i would guess you would need to make your cache size at least 1.6 million in order for it to be of any use in improving your facet speed. Would th

Re: Facet Performance

2006-12-08 Thread Yonik Seeley
On 12/8/06, Chris Hostetter <[EMAIL PROTECTED]> wrote: : My data is 492,000 records of book data. I am faceting on 4 fields: : author, subject, language, format. : Format and language are fairly simple as there are only a few unique : terms. Author and subject however are much different in that

Re: Facet Performance

2006-12-08 Thread Chris Hostetter
: Here are the stats, I'm still a newbie to SOLR, so I'm not totally sure : what this all means: : lookups : 1530036 : hits : 2 : hitratio : 0.00 : inserts : 1530035 : evictions : 1504435 : size : 25600 those numbers are telling you that your cache is capable of holding 25,600 items. you have attem
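
As a rough check on the quoted statistics, the arithmetic can be worked through directly. The sketch below uses only the numbers from the message; the variable names are illustrative and are not part of any Solr API.

```python
# Cache statistics as quoted in the message above.
stats = {
    "lookups": 1530036,
    "hits": 2,
    "inserts": 1530035,
    "evictions": 1504435,
    "size": 25600,
}

# Fraction of lookups that were answered from the cache.
hit_ratio = stats["hits"] / stats["lookups"]

# Entries inserted but not yet evicted, i.e. what is still resident.
resident = stats["inserts"] - stats["evictions"]

print(f"hit ratio: {hit_ratio:.2f}")
print(f"entries still cached: {resident}")
print(f"cache is full: {resident == stats['size']}")
```

The inserts minus the evictions equal the reported size, so the cache is full and nearly every insert has pushed out an older entry before it could be reused.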

Re: Facet Performance

2006-12-08 Thread Andrew Nagy
Yonik Seeley wrote: On 12/8/06, Andrew Nagy <[EMAIL PROTECTED]> wrote: I changed the filterCache to the following: However a search that normally takes .04s is taking 74 seconds once I use the facets since I am faceting on 4 fields. The first time or subsequent times? Is your filterCa

Re: Facet Performance

2006-12-08 Thread Yonik Seeley
On 12/8/06, Andrew Nagy <[EMAIL PROTECTED]> wrote: I changed the filterCache to the following: However a search that normally takes .04s is taking 74 seconds once I use the facets since I am faceting on 4 fields. The first time or subsequent times? Is your filterCache big enough yet? Wha

Re: Facet Performance

2006-12-08 Thread Andrew Nagy
Yonik Seeley wrote: 1) facet on single-valued strings if you can 2) if you can't do (1) then enlarge the fieldcache so that the number of filters (one per possible term in the field you are filtering on) can fit. I changed the filterCache to the following: However a search that normally t