Well stated. You are correct. Here is the field:

<field name="text_t" type="text" indexed="true" stored="true" multiValued="true" termVectors="true" termPositions="true" termOffsets="true"/>

It uses the "text" field type as it's defined in the Solr schema. I didn't change it.

The input text is a 6-page UTF-8 text document. The line the term seems to come from is below. It is just a sentence with no specific boundaries.

"...perform more queries and read more results. Even though this example is simple, consider cases where there are intersections between thousands ..."

Maybe I need to indicate tokenized?

Darren

On Fri, 2010-06-18 at 12:52 -0700, Chris Hostetter wrote:
> : Thanks for the explanation Chris. I'll try it but the term
> : "<lst name="queriesandreadmoreresultseventhoughthisexampleissimpleconsidercaseswheretherear">"
> : strikes me as not very legitimate and the source text is just space
> : bounded words so even if it's doing what it is supposed to, I'm not sure
> : this term is helpful in the index.
>
> i didn't say it was helpful -- i just said there's no indication of a bug
> in TFVC. it may be a bug in your source data, or a bad decision in your
> field type, or a bug in the indexing code ... it's not necessarily
> "right" but nothing you've posted gives any indication of a bug in solr.
>
> show us your fieldtype and your source data and we might be able to offer
> more help, but as is all you've shown us is that you have a really long
> term in your index.
>
>
> -Hoss