Lucene has a maxFieldLength setting (the maximum number of tokens it will
index for a given field in a document); tokens past that limit are dropped.
It can be configured in solrconfig.xml:
<maxFieldLength>10000</maxFieldLength>
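
For context, here is a rough sketch of where that element usually sits. It
assumes a Solr 1.x style solrconfig.xml with <indexDefaults> and <mainIndex>
sections, as in the example config; your file may differ, so check both
places. The value 2147483647 (Integer.MAX_VALUE) is only an illustration of
"effectively no limit", not a recommendation.

```xml
<config>
  <indexDefaults>
    <!-- assumed location: default applied to all indexes -->
    <maxFieldLength>2147483647</maxFieldLength>
  </indexDefaults>
  <mainIndex>
    <!-- if present here as well, this value should also be raised -->
    <maxFieldLength>2147483647</maxFieldLength>
  </mainIndex>
</config>
```

The limit is applied at index time, so documents that were already truncated
need to be re-indexed after the change.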

-Yonik

On Tue, Jul 22, 2008 at 11:38 AM, Tom Lord <[EMAIL PROTECTED]> wrote:
> Hi, we've looked for info about this issue online and in the code and are
> none the wiser - help would be much appreciated.
>
> We are indexing the full text of journals using Solr. We currently pass
> in the journal text, up to maybe 130 pages, and index it in one go.
>
> We are seeing Solr stop indexing after ~30 pages or so. That is, when we
> look at the indexed text field using Luke, we can see where it gives up
> collecting information from the text.
>
> What is the maximum size that we can index on? Is this a known issue or
> standard behaviour, or is something else amiss?
>
> If this is standard behaviour, what is the approved way of avoiding this
> issue? Should we index on a per-page basis rather than trying to do 130
> pages as a single document?
>
> thanks in advance,
> Tom.
>
> --
> Tom Lord | ([EMAIL PROTECTED])
>
> Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
> The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES
>
> Aptivate is a not-for-profit company registered in England and Wales
> with company number 04980791.
>
>
