Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-31 Thread Funtick
With many-to-many relationship between Category and Product we can go with multivalued Category field, or we can even have repeated values in Category&Point-of-Interest fields (_single_ valued); it's not necessary to store all fields in an index - you can store pointer to database Primary Key for

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-31 Thread Funtick
Simple design with _single_ valued fields:
Id   Category  Product
001  TV        SONY 12345
002  Radio     Panasonic 54321
003  TV
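A minimal sketch of this single-valued design, assuming a many-to-many mapping is flattened into one row per (Category, Product) pair. The sample data, the `flatten` helper and the lower-case field names are hypothetical and only illustrate the idea; they are not taken from the poster's schema.

```python
# Hypothetical many-to-many data: one category maps to several products.
categories_to_products = {
    "TV": ["SONY 12345", "Panasonic 54321"],
    "Radio": ["Panasonic 54321"],
}

def flatten(mapping):
    """Emit one single-valued row per (category, product) pair."""
    docs = []
    for category, products in mapping.items():
        for product in products:
            docs.append({
                "id": f"{len(docs) + 1:03d}",  # sequential, zero-padded id
                "category": category,          # repeated across rows
                "product": product,
            })
    return docs

docs = flatten(categories_to_products)
for doc in docs:
    print(doc["id"], doc["category"], doc["product"])
```

Each row carries exactly one Category and one Product, so a match always identifies the exact pair, at the cost of repeating the Category value.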

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-31 Thread Funtick
But answer to initial question... I think your documents are huge... Funtick wrote: > > > > Britske wrote: >> >> I do understand that, at first glance, it seems possible to use >> multivalued fields, but with multivalued fields it's not possible to >> pinpoint the exact value within the mu

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-31 Thread Funtick
I do understand that, at first glance, it seems possible to use multivalued fields, but with multivalued fields it's not possible to pinpoint the exact value within the multivalued field that I need. I used a technique with a single document consisting of a single Category and multiple Products (m

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-31 Thread Britske
No, I'm using dynamic fields; they've been around for a pretty long time. I use int-values in the 10k fields for filtering and sorting. On top of that I use a lot of full-text filtering on the other fields, as well as faceting, etc. I do understand that, at first glance, it seems possible to us

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-31 Thread Funtick
Yes, it should be extremely simple! I simply can't understand how you describe it: Britske wrote: > > Rows in solr represent productcategories. I will have up to 100k of them. > > - Each product category can have 10k products each. These are encoded as > the 10k columns / fields (all 10k fiel

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-30 Thread Britske
Funtick wrote: > > > Britske wrote: >> >> - Rows in solr represent productcategories. I will have up to 100k of >> them. >> - Each product category can have 10k products each. These are encoded as >> the 10k columns / fields (all 10k fields are int values) >> > > You are using multivalued

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-30 Thread Funtick
Funtick wrote: > > > Britske wrote: >> >> - Rows in solr represent productcategories. I will have up to 100k of >> them. >> - Each product category can have 10k products each. These are encoded as >> the 10k columns / fields (all 10k fields are int values) >> > > You are using multivalued

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-30 Thread Funtick
Britske wrote: > > - Rows in solr represent productcategories. I will have up to 100k of > them. > - Each product category can have 10k products each. These are encoded as > the 10k columns / fields (all 10k fields are int values) > You are using multivalued fields, you are not using 10k fie

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-30 Thread Britske
Hi Fuad, Funtick wrote: > > > Britske wrote: >> >> When performing these queries I notice a big difference between qTime >> (which is mostly in the 15-30 ms range due to caching) and total time >> taken to return the response (measured through SolrJ's elapsedTime), >> which takes between 500

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-30 Thread Funtick
Britske wrote: > > When performing these queries I notice a big difference between qTime > (which is mostly in the 15-30 ms range due to caching) and total time > taken to return the response (measured through SolrJ's elapsedTime), which > takes between 500-1600 ms. > Documents have a lot of st

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-30 Thread Britske
Currently, I can't say what the data actually represents but the analogy of t Mike Klaas wrote: > > On 28-Jul-08, at 11:16 PM, Britske wrote: > >> >> That sounds interesting. Let me explain my situation, which may be a >> variant >> of what you are proposing. My documents contain more than

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-29 Thread Mike Klaas
On 28-Jul-08, at 11:16 PM, Britske wrote: That sounds interesting. Let me explain my situation, which may be a variant of what you are proposing. My documents contain more than 10.000 fields, but these fields are divided like: 1. about 20 general purpose fields, of which more than 1 can b

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Britske
That sounds interesting. Let me explain my situation, which may be a variant of what you are proposing. My documents contain more than 10.000 fields, but these fields are divided like: 1. about 20 general purpose fields, of which more than 1 can be selected in a query. 2. about 10.000 fields of

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Mike Klaas
On 28-Jul-08, at 1:53 PM, Britske wrote: > Each query requests at most 20 stored fields. Why doesn't help > lazyfieldloading in this situation? It does help, but not enough. With lots of data per document and not a lot of memory, it becomes probabilistically likely that each doc resides in a

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Britske
I'm using the solr-nightly of 2008-04-05 Grant Ingersoll-6 wrote: > > What version of Solr/Lucene are you using? > > On Jul 28, 2008, at 4:53 PM, Britske wrote: > >> >> I'm on a development box currently and production servers will be >> bigger, but >> at the same time the index will be to

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Britske
Thanks for clearing that up for me. I'm going to investigate some more... Yonik Seeley wrote: > > On Mon, Jul 28, 2008 at 4:53 PM, Britske <[EMAIL PROTECTED]> wrote: >> Each query requests at most 20 stored fields. Why doesn't help >> lazyfieldloading in this situation? > > It's the disk seek

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Grant Ingersoll
What version of Solr/Lucene are you using? On Jul 28, 2008, at 4:53 PM, Britske wrote: I'm on a development box currently and production servers will be bigger, but at the same time the index will be too. Each query requests at most 20 stored fields. Why doesn't help lazyfieldloading in th

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Yonik Seeley
On Mon, Jul 28, 2008 at 4:53 PM, Britske <[EMAIL PROTECTED]> wrote: > Each query requests at most 20 stored fields. Why doesn't help > lazyfieldloading in this situation? It's the disk seek that kills you... loading 1 byte or 1000 bytes per document would be about the same speed. > Also, if I und
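A rough back-of-envelope sketch of the point about seeks. The 8 ms seek time and 50 MB/s transfer rate are illustrative assumptions for a spinning disk, not measurements from this thread, and the model assumes one seek per document with nothing served from the OS cache.

```python
# Assumed hardware figures (illustrative only).
SEEK_MS = 8.0            # average seek time per uncached document
THROUGHPUT_MB_S = 50.0   # sequential transfer rate

def read_time_ms(num_docs, bytes_per_doc):
    """Estimated time to fetch stored fields for num_docs documents,
    modelled as one disk seek per document plus transfer time."""
    transfer_ms = bytes_per_doc / (THROUGHPUT_MB_S * 1_000_000) * 1000.0
    return num_docs * (SEEK_MS + transfer_ms)

small = read_time_ms(100, 1)      # 100 rows, 1 byte loaded per document
large = read_time_ms(100, 1000)   # 100 rows, 1000 bytes loaded per document
print(f"100 docs x 1 byte:     {small:.1f} ms")
print(f"100 docs x 1000 bytes: {large:.1f} ms")
```

Under these assumptions both cases land at roughly 800 ms for 100 rows, because the seek term dominates. That is why loading fewer bytes per document through lazy field loading changes little when the documents are not cached.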

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Britske
I'm on a development box currently and production servers will be bigger, but at the same time the index will be too. Each query requests at most 20 stored fields. Why doesn't help lazyfieldloading in this situation? I don't need to retrieve all stored fields and I thought I wasn't doing this (

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Mike Klaas
Another possibility is to partition the stored fields into a frequently-accessed set and a full set. If the frequently-accessed set is significantly smaller (in terms of # bytes), then the documents will be tightly-packed on disk and the os caching will be much more effective given the sam
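A minimal sketch of the partitioning idea, assuming each document is a flat mapping of field name to value. The field names, the `partition_fields` helper and the choice of which fields count as frequently accessed are hypothetical.

```python
def partition_fields(doc, hot_fields):
    """Split one document into a small frequently-accessed record
    and the full record (which keeps every field)."""
    hot = {name: value for name, value in doc.items() if name in hot_fields}
    full = dict(doc)
    return hot, full

# Hypothetical document: a few general fields plus many rarely-read ones.
doc = {"id": "001", "title": "example"}
doc.update({f"field_{i:05d}": i for i in range(10_000)})

hot, full = partition_fields(doc, hot_fields={"id", "title"})
print(len(hot), "fields in the frequently-accessed set")
print(len(full), "fields in the full set")
```

The smaller records would be stored separately from the full ones, so the data read on most queries occupies far fewer bytes and is more likely to stay in the OS cache.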

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Yonik Seeley
That's a bit too tight to have *all* of the index cached...your best bet is to go to 4GB+, or figure out a way not to have to retrieve so many stored fields. -Yonik On Mon, Jul 28, 2008 at 4:27 PM, Britske <[EMAIL PROTECTED]> wrote: > > Size on disk is 1.84 GB (of which 1.3 GB sits in FDT files i

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Britske
Size on disk is 1.84 GB (of which 1.3 GB sits in FDT files if that matters). Physical RAM is 2 GB with -Xmx800M set to Solr. Yonik Seeley wrote: > > That high of a difference is due to the part of the index containing > these particular stored fields not being in OS cache. What's the size > on
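The arithmetic behind these figures can be sketched as follows. It assumes the JVM heap grows to its -Xmx limit and ignores memory used by the OS itself, other processes and JVM non-heap overhead, so the cache figure is an optimistic upper bound.

```python
index_gb = 1.84          # total index size on disk
fdt_gb = 1.3             # stored-field (FDT) files
ram_gb = 2.0             # physical RAM
heap_gb = 800 / 1024     # -Xmx800M expressed in GB

# Upper bound on memory left over for the OS page cache.
os_cache_upper_bound_gb = ram_gb - heap_gb
shortfall_gb = index_gb - os_cache_upper_bound_gb

print(f"OS cache upper bound: {os_cache_upper_bound_gb:.2f} GB")
print(f"Index shortfall:      {shortfall_gb:.2f} GB")
print("Stored fields alone fit in cache:", fdt_gb <= os_cache_upper_bound_gb)
```

Even in this best case the stored-field files alone are larger than the memory available for caching, so some reads of stored fields will have to go to disk.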

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Yonik Seeley
That high of a difference is due to the part of the index containing these particular stored fields not being in OS cache. What's the size on disk of your index compared to your physical RAM? -Yonik On Mon, Jul 28, 2008 at 4:10 PM, Britske <[EMAIL PROTECTED]> wrote: > > Hi all, > > For some quer

big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Britske
Hi all, For some queries I need to return a lot of rows at once (say 100). When performing these queries I notice a big difference between qTime (which is mostly in the 15-30 ms range due to caching) and total time taken to return the response (measured through SolrJ's elapsedTime), which takes
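A small sketch of how the gap can be quantified, using the ranges reported elsewhere in this thread (qTime of 15-30 ms, elapsed time of 500-1600 ms). It assumes qTime covers only the search itself and that the remainder is spent after the search, for example on retrieving stored fields and writing and transferring the response. The helper name is hypothetical.

```python
def time_outside_search(elapsed_ms, qtime_ms):
    """Return (milliseconds, fraction) of the elapsed time not covered by qTime."""
    outside_ms = elapsed_ms - qtime_ms
    return outside_ms, outside_ms / elapsed_ms

# Best and worst cases from the reported ranges.
best_ms, best_fraction = time_outside_search(elapsed_ms=500, qtime_ms=30)
worst_ms, worst_fraction = time_outside_search(elapsed_ms=1600, qtime_ms=15)

print(f"best case:  {best_ms} ms outside the search ({best_fraction:.0%})")
print(f"worst case: {worst_ms} ms outside the search ({worst_fraction:.0%})")
```

In both cases well over 90% of the elapsed time falls outside qTime, which points at what happens after the search rather than at the query itself.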