1.  Exactly which version of Solr / SolrJ are you using?

2. ...

: >>>> I am using the SolrJ client to add documents to in my index. My field
: >>>> is a normal "text" field type and the text itself is the first 1000
: >>>> characters of an article.

Can you put the original (pre solr, pre solrj, raw untouched, etc...) 
file that this solr doc came from online somewhere?

What does your *indexing* code look like? ... Can you add some debugging to 
the SolrJ client code at the point where you *add* this doc, to print out 
exactly what those 1000 characters are?
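
For instance, something along these lines could be called on the field 
value right before the doc is handed to SolrJ. It's only a sketch: the 
class and method names (FieldDebug, describeSuspicious) are made up for 
illustration, and it assumes the value is already a java.lang.String. It 
uses nothing but the JDK and reports NUL characters and unpaired 
surrogates, the two kinds of "garbage" I'd look for first.

```java
// Hypothetical debugging helper; not part of Solr or SolrJ.
final class FieldDebug {
    private FieldDebug() {}

    /**
     * Returns one line per suspicious UTF-16 code unit in the text,
     * or an empty string if nothing suspicious was found.
     */
    static String describeSuspicious(String text) {
        StringBuilder report = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            String problem = null;
            if (c == 0) {
                // Typical leftover from a char[] that was never fully filled.
                problem = "NUL (unfilled buffer slot?)";
            } else if (Character.isHighSurrogate(c)) {
                boolean paired = i + 1 < text.length()
                        && Character.isLowSurrogate(text.charAt(i + 1));
                if (paired) {
                    i++; // well-formed pair: skip its second half
                } else {
                    problem = "unpaired high surrogate";
                }
            } else if (Character.isLowSurrogate(c)) {
                problem = "unpaired low surrogate";
            }
            if (problem != null) {
                report.append(String.format("index %d: U+%04X %s%n", i, (int) c, problem));
            }
        }
        return report.toString();
    }
}
```

If the report comes back empty for a doc that still gets cut off in the 
index, the problem is more likely downstream of your extraction code.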

My hunch: when you are extracting the first 1000 characters, you're 
getting only the first half of a character ...or... you are getting docs 
with fewer than 1000 characters and winding up with a buffer (char[]?) that 
has garbage at the end; SolrJ isn't complaining on the way in, but 
something further down (maybe before indexing, maybe after) is seeing that 
garbage and cutting the field off at that point.
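
If that hunch is right, a truncation along these lines would avoid both 
problems. Again only a sketch under assumptions: SafeTruncate and 
firstChars are hypothetical names, and I'm assuming the article text is 
already a String rather than raw bytes. It never allocates a fixed-size 
char[], so short docs can't pick up trailing junk, and it backs off by one 
char if the cut would land between the two halves of a surrogate pair.

```java
// Hypothetical helper; not part of Solr or SolrJ.
final class SafeTruncate {
    private SafeTruncate() {}

    /** Returns at most maxChars UTF-16 chars of text, never splitting a surrogate pair. */
    static String firstChars(String text, int maxChars) {
        if (maxChars < 0) {
            throw new IllegalArgumentException("maxChars must be >= 0");
        }
        if (text.length() <= maxChars) {
            return text; // short doc: return as-is, no padding
        }
        int end = maxChars;
        // If the cut falls between a high and a low surrogate, drop the high half too.
        if (end > 0
                && Character.isHighSurrogate(text.charAt(end - 1))
                && Character.isLowSurrogate(text.charAt(end))) {
            end--;
        }
        return text.substring(0, end);
    }
}
```

With that, the indexing code would pass firstChars(articleText, 1000) as 
the field value instead of a manually filled buffer.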



-Hoss
