Thanks.
If this is really the case, I declared a new field called mySpellTextDup and retired the original field. Now I have a new field, with no words in it, which powers my dictionary, and I am free to index whichever terms I want.
This is not the best solution, but I can't think of a more reasonable workaround.

Thanks
darniz


Lance Norskog-2 wrote:
>
> This is a quirk of Lucene: when you delete a document, the indexed
> terms for the document are not deleted. That is, if 2 documents have
> the word 'frampton' in an indexed field, the term dictionary contains
> the entry 'frampton' and pointers to those two documents. When you
> delete those two documents, the index contains the entry 'frampton'
> with an empty list of pointers. So, the terms are still there even
> when you delete all of the documents.
>
> Facets and the spellchecking dictionary are built from this term
> dictionary, not from the text strings that are 'stored' and returned
> when you search for the documents.
>
> The <optimize> command throws away these remnant terms.
>
> http://www.lucidimagination.com/blog/2009/03/18/exploring-lucenes-indexing-code-part-2/
>
> On Wed, Feb 17, 2010 at 12:24 PM, darniz <rnizamud...@edmunds.com> wrote:
>>
>> Please bear with my limited understanding.
>> I deleted all documents and rebuilt my spell checker using the
>> command
>> spellcheck=true&spellcheck.build=true&spellcheck.dictionary=default
>>
>> After this I went to the schema browser and saw that mySpellText
>> still has around 2000 values.
>> How can I make sure that I clean up that field?
>> We had the same issue with facets too: even though we delete all the
>> documents, if we facet on make we still see facets, but we can
>> filter them out by saying facet.mincount>0.
>>
>> Again, coming back to my question: how can I make the mySpellText
>> field get rid of all previous terms?
>>
>> Thanks a lot
>> darniz
>>
>>
>> hossman wrote:
>>>
>>> : But still I can't stop thinking about this.
>>> : I deleted my entire index and now I have 0 documents.
>>> :
>>> : Now if I make a query with accrd I still get a suggestion of accord,
>>> : even though no documents are returned since I deleted my entire index.
>>> : I hope it also clears the spell check index field.
>>>
>>> there are two Lucene indexes when you use spell checking.
>>>
>>> there is the "main" index which is governed by your schema.xml and is
>>> what you add your own documents to, and what searches are run against
>>> for the result section of solr responses.
>>>
>>> There is also the "spell" index which has only two fields and in
>>> which each "document" corresponds to a "word" that might be returned
>>> as a spelling suggestion, and the other fields contain various
>>> start/end/middle ngrams that represent possible misspellings.
>>>
>>> When you use the spellchecker component it builds the "spell" index,
>>> making a document out of every word it finds in whatever field name
>>> you configure it to use.
>>>
>>> deleting your entire "main" index won't automatically delete the
>>> "spell" index (although you should be able to rebuild the "spell"
>>> index using the *empty* "main" index, that should work).
>>>
>>> : I am copying both fields to a field called
>>> : <copyField source="make" dest="mySpellText"/>
>>> : <copyField source="model" dest="mySpellText"/>
>>>
>>> ..at this point your "main" index has a field named mySpellText, and
>>> for every document it contains a copy of make and model.
>>>
>>> : <lst name="spellchecker">
>>> : <str name="name">default</str>
>>> : <str name="field">mySpellText</str>
>>> : <str name="buildOnOptimize">true</str>
>>> : <str name="buildOnCommit">true</str>
>>>
>>> ...so whenever you commit or optimize your "main" index it will take
>>> every word from the mySpellText field and use them all as individual
>>> documents in the "spell" index.
>>>
>>> In your previous email you said you changed the copyField declaration,
>>> and then triggered a commit -- that rebuilt your "spell" index, but the
>>> data was still all there in the mySpellText field of the "main" index,
>>> so the rebuilt "spell" index was exactly the same.
>>>
>>> : I have buildOnOptimize and buildOnCommit set to true, so when I index
>>> : new documents I want my dictionary to be created, but how can I make
>>> : sure I remove the previously indexed terms?
>>>
>>> every time the spellchecker component "builds" it will create a
>>> completely new "spell" index .. but if the old data is still in the
>>> "main" index then it will also be in the "spell" index.
>>>
>>> The only reason I can think of why you'd be seeing words in your
>>> "spell" index after deleting documents from your "main" index is that
>>> even if you delete documents, the Terms are still there in the
>>> underlying index until the segments are merged ... so if you do an
>>> optimize that will force them to be expunged --- but I honestly have
>>> no idea if that is what's causing your problem, because quite frankly
>>> I really don't understand what your problem is ... you have to
>>> provide specifics: reproducible steps anyone can take using a clean
>>> install of solr to see the behavior you are seeing that seems
>>> incorrect. (ie: modifications to the example schema, and commands to
>>> execute against the demo port to see the bug)
>>>
>>> if you can provide details like that then it's possible to understand
>>> what is going wrong for you -- which is a prereq to providing useful
>>> help.
>>>
>>>
>>> -Hoss
>>>
>>>
>>
>> --
>> View this message in context:
>> http://old.nabble.com/Deleting-spelll-checker-index-tp27376823p27629740.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
>
>
> --
> Lance Norskog
> goks...@gmail.com
>
>

--
View this message in context: http://old.nabble.com/Deleting-spelll-checker-index-tp27376823p27644054.html
Sent from the Solr - User mailing list archive at Nabble.com.
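
For anyone reading this thread later, here is a rough Python sketch of the behaviour Lance and Hoss describe: terms staying in the term dictionary after their documents are deleted, and disappearing only after an optimize. It is a toy model that follows the explanation given in this thread, not Lucene's actual data structures or API; the ToyIndex class and all of its method names are made up for illustration.

```python
from collections import defaultdict


class ToyIndex:
    """Toy inverted index (hypothetical, for illustration only; not Lucene)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it
        self.docs = {}                    # doc id -> list of indexed terms

    def add(self, doc_id, text):
        terms = text.lower().split()
        self.docs[doc_id] = terms
        for term in terms:
            self.postings[term].add(doc_id)

    def delete(self, doc_id):
        # Only the pointers to the document go away; the term entries remain.
        for term in self.docs.pop(doc_id, []):
            self.postings[term].discard(doc_id)

    def terms(self):
        # What a facet or spellcheck build that walks the term dictionary sees.
        return sorted(self.postings)

    def optimize(self):
        # Merging throws away terms that no remaining document points to.
        live = {term: ids for term, ids in self.postings.items() if ids}
        self.postings = defaultdict(set, live)


index = ToyIndex()
index.add(1, "honda accord")
index.add(2, "honda civic")

index.delete(1)
index.delete(2)
print(index.terms())  # terms are still listed although no documents remain

index.optimize()
print(index.terms())  # the remnant terms are gone after the merge
```

In this model, a spellcheck dictionary built between the deletes and the optimize would still contain 'accord', which matches what darniz saw; building it after the optimize would start from an empty term list.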