Hm, this is a mystery to me - I don't see anything that would turn on Term Vectors...
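
For comparison, this is roughly what a field would need in schema.xml before I would expect the term vector files (.tvx, .tvd, .tvf) to show up. This is only a sketch: the field name "contents" is made up for illustration, and termVectors / termPositions / termOffsets are the standard Solr field attributes, none of which appear in the schema you sent.

```xml
<!-- Hypothetical field, not taken from the schema quoted below. -->
<!-- termVectors="true" is what asks Lucene to write per-document term vectors; -->
<!-- termPositions and termOffsets additionally store positions and offsets in them. -->
<field name="contents" type="text" indexed="true" stored="false"
       termVectors="true" termPositions="true" termOffsets="true"/>
```

If none of your fields (or the field type they use) carries anything like this, the schema alone should not be producing a .tvf file.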
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message ----
> From: Salman Akram <salman.ak...@northbaysolutions.net>
> To: solr-user@lucene.apache.org
> Sent: Sun, January 16, 2011 2:26:53 PM
> Subject: Re: TVF file
>
> Please see below the dir listing and relevant part of schema file (I have
> removed the name part from fields for obvious reasons).
>
> Also regarding .frq file why exactly is it needed? Is it required in phrase
> searching (I am not using highlighting or MoreLikeThis on this index file)
> too? and this is not made if all fields are using omitTF?
>
> Thanks alot!
>
> --------------Dir Listing------------------
> 01/16/2011  06:05 AM    <DIR>          .
> 01/16/2011  06:05 AM    <DIR>          ..
> 01/15/2011  03:58 PM    <DIR>          log
> 04/22/2010  12:42 AM               549 luke.jnlp
> 01/16/2011  04:58 AM                20 segments.gen
> 01/16/2011  04:58 AM               287 segments_5hl
> 01/16/2011  02:17 AM     4,760,716,827 _36w.fdt
> 01/16/2011  02:17 AM       107,732,836 _36w.fdx
> 01/16/2011  02:15 AM             4,032 _36w.fnm
> 01/16/2011  04:36 AM    25,221,109,245 _36w.frq
> 01/16/2011  04:38 AM     4,457,445,928 _36w.nrm
> 01/16/2011  04:36 AM   126,866,227,056 _36w.prx
> 01/16/2011  04:36 AM        22,510,915 _36w.tii
> 01/16/2011  04:36 AM     1,635,096,862 _36w.tis
> 01/16/2011  04:58 AM        18,341,750 _36w.tvd
> 01/16/2011  04:58 AM    78,450,397,739 _36w.tvf
> 01/16/2011  04:58 AM       215,465,668 _36w.tvx
>               14 File(s) 241,755,049,714 bytes
>                3 Dir(s)  1,072,112,025,600 bytes free
>
>
> -----------------Schema File----------------------
>
> F:\IndexingAppsRealTime\index>
> <?xml version="1.0" encoding="UTF-8" ?>
> <schema name="example" version="1.2">
>   <types>
>     <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
>     <fieldtype name="text" class="solr.TextField">
>       <analyzer>
>         <tokenizer class="solr.StandardTokenizerFactory" luceneMatchVersion="LUCENE_29"/>
>         <filter class="solr.StandardFilterFactory"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.EnglishPorterFilterFactory"/>-->
>       </analyzer>
>     </fieldtype>
>   </types>
>   <fields>
>     <field name="text" type="text" indexed="true" stored="false" />
>   </fields>
>
>
>   <fields>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="string" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true" />
>     <field name="" type="string" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="false" />
>     <field name="" type="string" indexed="true" stored="true"/>
>     <field name="" type="string" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="string" indexed="true" stored="true"/>
>     <field name="" type="string" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="string" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="string" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true" />
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="string" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <field name="" type="text" indexed="true" stored="true"/>
>     <dynamicField name="" type="text" indexed="true" stored="false"/>
>
> </schema>
>
>
>
> On Sun, Jan 16, 2011 at 6:52 PM, Otis Gospodnetic <
> otis_gospodne...@yahoo.com> wrote:
>
> > Hm, want to email the index dir listing (ls -lah) + the field type and
> > field definitions from your schema.xml?
> >
> > Otis
> > ----
> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> > Lucene ecosystem search :: http://search-lucene.com/
> >
> >
> >
> > ----- Original Message ----
> > > From: Salman Akram <salman.ak...@northbaysolutions.net>
> > > To: solr-user@lucene.apache.org
> > > Sent: Sun, January 16, 2011 7:51:15 AM
> > > Subject: Re: TVF file
> > >
> > > Nops. I optimized it with Standard File Format and cleaned up Index dir
> > > through Luke. It adds upto to the total size when I optimized it with
> > > Compound File Format.
> > >
> > > On Sun, Jan 16, 2011 at 5:46 PM, Otis Gospodnetic <
> > > otis_gospodne...@yahoo.com> wrote:
> > >
> > > > Is it possible that the tvf file you are looking at is old (i.e. not
> > > > part of your active index)?
> > > >
> > > > Otis
> > > > ----
> > > > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> > > > Lucene ecosystem search :: http://search-lucene.com/
> > > >
> > > >
> > > >
> > > > ----- Original Message ----
> > > > > From: Salman Akram <salman.ak...@northbaysolutions.net>
> > > > > To: solr-user@lucene.apache.org
> > > > > Sent: Sun, January 16, 2011 6:17:23 AM
> > > > > Subject: Re: TVF file
> > > > >
> > > > > Some more info I copied it from Luke and below is what it says for...
> > > > >
> > > > > Text Fields --> stored/uncompressed,indexed,tokenized
> > > > > String Fields --> stored/uncompressed,indexed,omitTermFreqAndPositions
> > > > >
> > > > > The main contents field is not stored so it doesn't show up on Luke
> > > > > but that is Analyzed and Tokenized for searching.
> > > > >
> > > > > On Sun, Jan 16, 2011 at 3:50 PM, Salman Akram <
> > > > > salman.ak...@northbaysolutions.net> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > From my understanding TVF file stores the Term Vectors
> > > > > > (Positions/Offset) so if no field has Field.TermVector set
> > > > > > (default is NO) so it shouldn't be created, right?
> > > > > >
> > > > > > I have an index created through SOLR on which no field had any
> > > > > > value for TermVectors so by default it shouldn't be saved. All
> > > > > > the fields are either String or Text. All fields have just
> > > > > > indexed and stored attributes set to True. String fields have
> > > > > > omitNorms = true as well.
> > > > > >
> > > > > > Even in Luke it doesn't show V (Term Vector) flag but I have a
> > > > > > big TVF file in my index. Its almost 30% of the total index
> > > > > > (around 60% is the PRX positions file).
> > > > > >
> > > > > > Also in Luke it shows 'f' (omitTF) flag for strings but not for
> > > > > > text fields.
> > > > > >
> > > > > > Any ideas what's going on? Thanks!
> > > > > >
> > > > > > --
> > > > > > Regards,
> > > > > >
> > > > > > Salman Akram
> > > > > > Senior Software Engineer - Tech Lead
> > > > > > 80-A, Abu Bakar Block, Garden Town, Pakistan
> > > > > > Cell: +92-321-4391210
> > > > > >
> > > > >
> > > > > --
> > > > > Regards,
> > > > >
> > > > > Salman Akram
> > > > >
> > > >
> > >
> > > --
> > > Regards,
> > >
> > > Salman Akram
> > >
> >
>
> --
> Regards,
>
> Salman Akram
>