Hi Otis,

Thanks for the info. I tried two different ways, and both seem to work okay.

I added
<filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="100000"/>
to the <indexConfig> in solrconfig.xml.

I also tried adding the same
<filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="100000"/>
to the <fieldType><analyzer type="index"> section in schema.xml.

Both ways work OK.

Cheers
Mark

On 28/02/2013 08:05, "Otis Gospodnetic" <otis.gospodne...@gmail.com> wrote:

> Mark,
>
> Look at
> http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/example/solr/collection1/conf/solrconfig.xml:
>
> <indexConfig>
> <!-- maxFieldLength was removed in 4.0. To get similar behavior, include a
> LimitTokenCountFilterFactory in your fieldType definition. E.g.
> <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="10000"/>
> -->
>
> Otis
> --
> Solr & ElasticSearch Support
> http://sematext.com/
>
> On Wed, Feb 27, 2013 at 11:08 AM, Mark Wilson <m...@sanger.ac.uk> wrote:
>
>> Hi
>>
>> I am using Nutch to crawl a site, and post it in Solr 3.6.1. The page is
>> very large.
>>
>> When I query the index, using the Solr Admin query page, it only finds the
>> result if it is in the top X% of the page, probably about 30%.
>>
>> The page is about 79Kb, and consists of 19,067 words.
>>
>> Is there a setting somewhere that sets the maxFieldSize? Or maxTokenSize?
>>
>> I set the field content to be displayed on the result page, and it displays
>> all the data correctly, where I can see all the tokens I get no results
>> from.
>>
>> I can't split the page up, as it is auto-generated from a database.
>>
>> Any help gratefully received.
>>
>> Thanks Mark
>>
>> --
>> The Wellcome Trust Sanger Institute is operated by Genome Research
>> Limited, a charity registered in England with number 1021457 and a
>> company registered in England with number 2742969, whose registered
>> office is 215 Euston Road, London, NW1 2BE.
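
For anyone finding this thread later, here is a rough sketch of how the fieldType placement discussed in this thread might look in schema.xml. The field type name "text_long", the tokenizer, and the lowercase filter are assumptions made for illustration only; the one element taken from the thread is the LimitTokenCountFilterFactory line with maxTokenCount="100000". Adjust the analyzer chain to match whatever your existing field type already uses.

```xml
<!-- schema.xml fragment (sketch). "text_long" is a hypothetical name;
     the tokenizer and LowerCaseFilterFactory are assumed, not from the thread. -->
<fieldType name="text_long" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Index at most the first 100000 tokens of each field value -->
    <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="100000"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The limit filter is only needed on the index-time analyzer, since it caps how many tokens of a document are indexed. Documents indexed before the change need to be re-indexed for the new limit to take effect.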