If you're using a String fieldtype, you're not indexing it so much as
dumping the whole content blob in there as a single term for exact
matching.

You probably want to look at one of the text field types for textual
content.

That doesn't explain the difference in behavior between Solr versions, but
my hunch is that you'll be happier in general with the behavior of a field
type that does tokenizing and stemming for plain text search anyway.
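
As a rough sketch of what I mean, something like the following in
schema.xml would tokenize, lowercase, and stem the content instead of
treating it as one giant term. The fieldType name "text_wiki" is made up
for illustration; the stock example schema shipped with Solr 4.x already
defines similar types (text_general, text_en) that you could point the
field at instead, assuming they are still present in your schema.

```xml
<!-- Hypothetical fieldType name; the analyzer chain uses standard Solr factories. -->
<fieldType name="text_wiki" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Split the text into word tokens rather than keeping one huge term -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Normalize case so searches are case-insensitive -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- English stemming, so "indexing" and "indexed" match "index" -->
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Same field as in your schema, switched to the text type and indexed -->
<field name="content" type="text_wiki" indexed="true" stored="true"
       required="true"/>
```

You would need to reindex after changing the field type.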

Michael Della Bitta

Applications Developer

appinions inc.

On Mon, Sep 15, 2014 at 10:06 AM, Christopher Gross <cogr...@gmail.com>
wrote:

> Solr 4.9.0
> Java 1.7.0_49
>
> I'm indexing an internal Wiki site.  I was running on an older version of
> Solr (4.1) and wasn't having any trouble indexing the content, but now I'm
> getting errors:
>
> SCHEMA:
> <field name="content" type="string" indexed="false" stored="true"
> required="true"/>
>
> LOGS:
> Caused by: java.lang.IllegalArgumentException: Document contains at least
> one immense term in field="content" (whose UTF8 encoding is longer than the
> max length 32766), all of which were skipped.  Please correct the analyzer
> to not produce such terms.  The prefix of the first immense term is: '[60,
> 33, 45, 45, 32, 98, 111, 100, 121, 67, 111, 110, 116, 101, 110, 116, 32,
> 45, 45, 62, 10, 9, 9, 9, 60, 100, 105, 118, 32, 115]...', original message:
> bytes can be at most 32766 in length; got 183250
> ....
> Caused by:
> org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException: bytes
> can be at most 32766 in length; got 183250
>
> I was indexing it, but I switched that off (as you can see above) but it
> still is having problems.  Is there a different type I should use, or a
> different analyzer?  I imagine that there is a way to index very large
> documents in Solr.  Any recommendations would be helpful.  Thanks!
>
> -- Chris
>
