Morning,

Last week I had a problem where terms that were visible in large documents
in my search results did not produce query hits:

http://www.nabble.com/Result-missing-from-query%2C-but-match-shows-in-Field-Analysis-tool-td26029040.html#a26029351

Erick suggested it might be related to maxFieldLength, so I set this to
2147483647 in my solrconfig.xml and reindexed over the weekend.

Unfortunately I'm still having the same problem, even though Erick appears
to be right! I've narrowed it down to a single document for testing
purposes. I can get it returned by querying for a term near the beginning,
but terms near the end cause no hit. I can even find the point part way
through the document after which none of the remaining terms seem to cause
a hit.
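
For anyone wanting to reproduce this, here is a minimal sketch of how such a
cut-off point could be located by bisection rather than by hand. It assumes
the terms are distinct and that hits are monotonic (every term before the
cut-off hits, every term after it misses). The function name find_cutoff and
the is_hit callback are hypothetical; in practice is_hit would run a query
for the term against the test document, which is not shown here.

```python
from typing import Callable, Sequence


def find_cutoff(terms: Sequence[str], is_hit: Callable[[str], bool]) -> int:
    """Return the index of the first term that no longer produces a hit.

    Assumes all terms before the cut-off hit and all terms from the
    cut-off onwards miss. Returns len(terms) if every term hits.
    """
    lo, hi = 0, len(terms)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_hit(terms[mid]):
            lo = mid + 1   # cut-off lies after mid
        else:
            hi = mid       # cut-off is at mid or earlier
    return lo


if __name__ == "__main__":
    # Simulated document: 32000 distinct terms, of which only the first
    # 10000 made it into the index (an arbitrary illustrative figure).
    terms = [f"t{i}" for i in range(32000)]
    indexed = set(terms[:10000])
    print(find_cutoff(terms, lambda t: t in indexed))
```

Comparing the index this returns with the configured limit would show
whether the truncation point matches some other value of maxFieldLength.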

The document is about 32000 terms long, most of which is in a single field
called related_ids of about 31000 terms. My first thought was that the text
was being chopped up into so many tokens that it was going over the
maxFieldLength anyway, but 2147483647/32000=67109, and it seems very
unlikely that 67109 tokens would be generated per term!
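
The arithmetic above can be checked with a short sketch. The helper name
tokens_per_term_budget is hypothetical; the figures are the ones quoted in
this message (a limit of 2147483647, which is 2**31 - 1, and a document of
roughly 32000 terms).

```python
def tokens_per_term_budget(max_field_length: int, term_count: int) -> float:
    """Average number of tokens each term could emit before the limit is hit."""
    return max_field_length / term_count


MAX_FIELD_LENGTH = 2147483647  # value set in solrconfig.xml, i.e. 2**31 - 1
DOC_TERMS = 32000              # approximate length of the test document

if __name__ == "__main__":
    print(round(tokens_per_term_budget(MAX_FIELD_LENGTH, DOC_TERMS)))
```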

I've tried undeploying and redeploying the whole web app from Tomcat in case
the new maxFieldLength hadn't been read, but it made no difference. If I go to

http://localhost:8080/solr/admin/file/?file=solrconfig.xml

I can see

<maxFieldLength>2147483647</maxFieldLength>

as expected.

Does anyone have any more ideas? This could potentially be a showstopper for
us as we have quite a few long-ish documents to index. (32K words doesn't
seem that long to me, but still...)

I've tried it with today's nightly build (2009-10-26) and it makes no
difference. If this sounds like a bug, I'll open a JIRA and attach tars of
my config and data directories. Any thoughts?

Thanks,

Andrew.

-- 
View this message in context: 
http://www.nabble.com/Solr-ignoring-maxFieldLength--tp26057808p26057808.html
Sent from the Solr - User mailing list archive at Nabble.com.
