Well, anyway, thanks for the help. Also, could you please reply to my earlier question about the .frq file (since that's quite big too):
"Also regarding .frq file why exactly is it needed? Is it required in phrase searching (I am not using highlighting or MoreLikeThis on this index file) too? and this is not made if all fields are using omitTF?"

On Mon, Jan 17, 2011 at 10:18 AM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:

> Hm, this is a mystery to me - I don't see anything that would turn on Term
> Vectors...
>
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
> ----- Original Message ----
> > From: Salman Akram <salman.ak...@northbaysolutions.net>
> > To: solr-user@lucene.apache.org
> > Sent: Sun, January 16, 2011 2:26:53 PM
> > Subject: Re: TVF file
> >
> > Please see below the dir listing and relevant part of schema file (I have
> > removed the name part from fields for obvious reasons).
> >
> > Also regarding .frq file why exactly is it needed? Is it required in phrase
> > searching (I am not using highlighting or MoreLikeThis on this index file)
> > too? and this is not made if all fields are using omitTF?
> >
> > Thanks alot!
> >
> > --------------Dir Listing------------------
> > 01/16/2011  06:05 AM    <DIR>          .
> > 01/16/2011  06:05 AM    <DIR>          ..
> > 01/15/2011  03:58 PM    <DIR>          log
> > 04/22/2010  12:42 AM               549 luke.jnlp
> > 01/16/2011  04:58 AM                20 segments.gen
> > 01/16/2011  04:58 AM               287 segments_5hl
> > 01/16/2011  02:17 AM     4,760,716,827 _36w.fdt
> > 01/16/2011  02:17 AM       107,732,836 _36w.fdx
> > 01/16/2011  02:15 AM             4,032 _36w.fnm
> > 01/16/2011  04:36 AM    25,221,109,245 _36w.frq
> > 01/16/2011  04:38 AM     4,457,445,928 _36w.nrm
> > 01/16/2011  04:36 AM   126,866,227,056 _36w.prx
> > 01/16/2011  04:36 AM        22,510,915 _36w.tii
> > 01/16/2011  04:36 AM     1,635,096,862 _36w.tis
> > 01/16/2011  04:58 AM        18,341,750 _36w.tvd
> > 01/16/2011  04:58 AM    78,450,397,739 _36w.tvf
> > 01/16/2011  04:58 AM       215,465,668 _36w.tvx
> >               14 File(s) 241,755,049,714 bytes
> >                3 Dir(s)  1,072,112,025,600 bytes free
> >
> > -----------------Schema File----------------------
> >
> > F:\IndexingAppsRealTime\index>
> > <?xml version="1.0" encoding="UTF-8" ?>
> > <schema name="example" version="1.2">
> >   <types>
> >     <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
> >     <fieldtype name="text" class="solr.TextField">
> >       <analyzer>
> >         <tokenizer class="solr.StandardTokenizerFactory" luceneMatchVersion="LUCENE_29"/>
> >         <filter class="solr.StandardFilterFactory"/>
> >         <filter class="solr.LowerCaseFilterFactory"/>
> >         <filter class="solr.EnglishPorterFilterFactory"/>-->
> >       </analyzer>
> >     </fieldtype>
> >   </types>
> >   <fields>
> >     <field name="text" type="text" indexed="true" stored="false" />
> >   </fields>
> >
> >   <fields>
> >     <field name="" type="text" indexed="true" stored="true"/>
> >     <field name="" type="string" indexed="true" stored="true"/>
> >     <field name="" type="text" indexed="true" stored="true"/>
> >     <field name="" type="text" indexed="true" stored="true"/>
> >     <field name="" type="text" indexed="true" stored="true" />
> >     <field name="" type="string" indexed="true" stored="true"/>
> >     <field name="" type="text" indexed="true" stored="true"/>
> >     <field name="" type="text" indexed="true"
stored="false" /> > > <field name="" type="string" indexed="true" stored="true"/> > > <field name="" type="string" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="string" indexed="true" stored="true"/> > > <field name="" type="string" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="string" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="string" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true" /> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" 
stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="string" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <field name="" type="text" indexed="true" stored="true"/> > > <dynamicField name="" type="text" indexed="true" stored="false"/> > > > > </schema> > > > > > > > > On Sun, Jan 16, 2011 at 6:52 PM, Otis Gospodnetic < > > otis_gospodne...@yahoo.com> wrote: > > > > > Hm, want to email the index dir listing (ls -lah) + the field type and > > > field > > > definitions from your schema.xml? > > > > > > Otis > > > ---- > > > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch > > > Lucene ecosystem search :: http://search-lucene.com/ > > > > > > > > > > > > ----- Original Message ---- > > > > From: Salman Akram <salman.ak...@northbaysolutions.net> > > > > To: solr-user@lucene.apache.org > > > > Sent: Sun, January 16, 2011 7:51:15 AM > > > > Subject: Re: TVF file > > > > > > > > Nops. I optimized it with Standard File Format and cleaned up Index > dir > > > > through Luke. It adds upto to the total size when I optimized it > with > > > > Compound File Format. 
> > > >
> > > > On Sun, Jan 16, 2011 at 5:46 PM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:
> > > >
> > > > > Is it possible that the tvf file you are looking at is old (i.e. not part of
> > > > > your active index)?
> > > > >
> > > > > Otis
> > > > > ----
> > > > > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> > > > > Lucene ecosystem search :: http://search-lucene.com/
> > > > >
> > > > > ----- Original Message ----
> > > > > > From: Salman Akram <salman.ak...@northbaysolutions.net>
> > > > > > To: solr-user@lucene.apache.org
> > > > > > Sent: Sun, January 16, 2011 6:17:23 AM
> > > > > > Subject: Re: TVF file
> > > > > >
> > > > > > Some more info I copied it from Luke and below is what it says for...
> > > > > >
> > > > > > Text Fields --> stored/uncompressed,indexed,tokenized
> > > > > > String Fields --> stored/uncompressed,indexed,omitTermFreqAndPositions
> > > > > >
> > > > > > The main contents field is not stored so it doesn't show up on Luke but that
> > > > > > is Analyzed and Tokenized for searching.
> > > > > >
> > > > > > On Sun, Jan 16, 2011 at 3:50 PM, Salman Akram <salman.ak...@northbaysolutions.net> wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > From my understanding TVF file stores the Term Vectors (Positions/Offset)
> > > > > > > so if no field has Field.TermVector set (default is NO) so it shouldn't be
> > > > > > > created, right?
> > > > > > >
> > > > > > > I have an index created through SOLR on which no field had any value for
> > > > > > > TermVectors so by default it shouldn't be saved. All the fields are either
> > > > > > > String or Text. All fields have just indexed and stored attributes set to
> > > > > > > True. String fields have omitNorms = true as well.
> > > > > > >
> > > > > > > Even in Luke it doesn't show V (Term Vector) flag but I have a big TVF file
> > > > > > > in my index. Its almost 30% of the total index (around 60% is the PRX
> > > > > > > positions file).
> > > > > > >
> > > > > > > Also in Luke it shows 'f' (omitTF) flag for strings but not for text
> > > > > > > fields.
> > > > > > >
> > > > > > > Any ideas what's going on? Thanks!
> > > > > > >
> > > > > > > --
> > > > > > > Regards,
> > > > > > >
> > > > > > > Salman Akram
> > > > > > > Senior Software Engineer - Tech Lead
> > > > > > > 80-A, Abu Bakar Block, Garden Town, Pakistan
> > > > > > > Cell: +92-321-4391210
> > > > > >
> > > > > > --
> > > > > > Regards,
> > > > > >
> > > > > > Salman Akram
> > > >
> > > > --
> > > > Regards,
> > > >
> > > > Salman Akram
> >
> > --
> > Regards,
> >
> > Salman Akram

--
Regards,

Salman Akram
Senior Software Engineer - Tech Lead
80-A, Abu Bakar Block, Garden Town, Pakistan
Cell: +92-321-4391210
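
P.S. For reference, the per-file share of the index can be worked out from the directory listing quoted in this thread. Below is a minimal sketch in Python; the sizes are copied from that listing, while the names `INDEX_FILE_SIZES` and `share_by_file` are only illustrative and not part of Solr or Lucene. Only the `_36w.*` segment files are counted.

```python
# File sizes in bytes, copied from the directory listing quoted in this thread.
# INDEX_FILE_SIZES and share_by_file are illustrative names, not Solr/Lucene APIs.
INDEX_FILE_SIZES = {
    "_36w.fdt": 4_760_716_827,
    "_36w.fdx": 107_732_836,
    "_36w.fnm": 4_032,
    "_36w.frq": 25_221_109_245,
    "_36w.nrm": 4_457_445_928,
    "_36w.prx": 126_866_227_056,
    "_36w.tii": 22_510_915,
    "_36w.tis": 1_635_096_862,
    "_36w.tvd": 18_341_750,
    "_36w.tvf": 78_450_397_739,
    "_36w.tvx": 215_465_668,
}


def share_by_file(sizes):
    """Return {file name: percentage of the summed size}, largest file first."""
    total = sum(sizes.values())
    ranked = sorted(sizes.items(), key=lambda item: item[1], reverse=True)
    return {name: 100.0 * size / total for name, size in ranked}


if __name__ == "__main__":
    for name, pct in share_by_file(INDEX_FILE_SIZES).items():
        print(f"{name:<10} {pct:6.2f}%")
```

With these numbers the .prx file comes to roughly 52% of the segment, the .tvf file to roughly 32%, and the .frq file to about 10%.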