Hi Salman,

Ah, so in the end you *did* have term vectors (TV) enabled on one of your
fields! :) (I think this was the problem we were trying to solve here a few
weeks ago.)

The number of docs in the index doesn't matter here. Only the N docs/fields
you display on a page of N results need to be re-analyzed for highlighting.
So follow Grant's advice: build a small index without TV and compare its
highlighting speed against the same small set indexed with TV.
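
For what it's worth, here is a rough sketch of how that comparison could be
scripted in Python. It assumes two small cores holding the same documents,
hypothetically named with_tv and without_tv on a local Solr at
localhost:8983, and it relies only on the standard q, rows, hl, hl.fl and wt
request parameters. The helper names (highlight_url, average_seconds) are
made up for illustration, so adjust everything to your own setup.

```python
import time
from urllib.parse import urlencode


def highlight_url(base_url, query, field="Contents", rows=20):
    """Build a Solr select URL that requests highlighting on one field."""
    params = {
        "q": query,
        "rows": rows,    # only these N returned docs are highlighted
        "hl": "true",    # turn highlighting on
        "hl.fl": field,  # field to highlight (Contents, as in this thread)
        "wt": "json",
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


def average_seconds(fetch, url, runs=10):
    """Call fetch(url) `runs` times and return the mean wall-clock seconds."""
    start = time.perf_counter()
    for _ in range(runs):
        fetch(url)
    return (time.perf_counter() - start) / runs


# Possible usage (core names and host are assumptions, not from this thread):
#
#   from urllib.request import urlopen
#   fetch = lambda u: urlopen(u).read()
#   for core in ("with_tv", "without_tv"):
#       url = highlight_url("http://localhost:8983/solr/" + core, "lucene")
#       print(core, average_seconds(fetch, url))
```

Run the same handful of queries against both cores a few times, so that
caches are warm, before reading anything into the averages.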

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message ----
> From: Salman Akram <salman.ak...@northbaysolutions.net>
> To: solr-user@lucene.apache.org
> Sent: Fri, February 4, 2011 8:03:06 AM
> Subject: Re: Highlighting with/without Term Vectors
> 
> Basically, Term Vectors are enabled on only one main field, i.e. Contents.
> The average size of each document is a few KB, but there are around
> 130 million documents, so what do you suggest now?
> 
> On Fri, Feb 4, 2011 at 5:24 PM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:
> 
> > Salman,
> >
> > It also depends on the size of your documents.  Re-analyzing 20 fields of
> > 500 bytes each will be a lot faster than re-analyzing 20 fields with 50 KB
> > each.
> >
> > Otis
> >
> >
> >
> > ----- Original Message ----
> > > From: Grant Ingersoll <gsing...@apache.org>
> > > To: solr-user@lucene.apache.org
> > > Sent: Wed, January 26, 2011 10:44:09 AM
> > > Subject: Re: Highlighting with/without Term Vectors
> > >
> > >
> > > On Jan 24, 2011, at 2:42 PM, Salman Akram wrote:
> > >
> > > > Hi,
> > > >
> > > > Does anyone have any benchmarks on how much highlighting speeds up with
> > > > Term Vectors (compared to without them)? E.g. if highlighting 20
> > > > documents takes 1 sec with Term Vectors, any idea how long it will take
> > > > without them?
> > > >
> > > > I need to know since the index used for highlighting has a TVF file of
> > > > around 450GB (approx. 65% of total index size), so I am trying to see
> > > > whether decreasing the index size by dropping the TVF would be more
> > > > helpful for performance (less RAM, should be good for I/O too, I guess)
> > > > or whether keeping it is still better.
> > > >
> > > > I know the best way is to try it out, but indexing takes a very long
> > > > time, so I am trying to see whether it's even worth it or not.
> > >
> > >
> > > Try testing on a smaller set.  In general, you are saving the process of
> > > re-analyzing the content, so, to some extent, it is going to be dependent
> > > on how fast your analyzer chain is.  At the size you are at, I don't know
> > > if storing TVs is worth it.
> >
> 
> 
> 
> -- 
> Regards,
> 
> Salman Akram
> 
