Hi, I'm having a problem with certain search terms not being found when I run a query. I'm using SolrJ to index a PDF document, adding its contents to the 'contents' field. If I tokenize the 'contents' field on the SolrInputDocument doc object as below, I get 50k tokens.

    StringTokenizer to = new StringTokenizer((String) doc.getFieldValue("contents"));
    System.out.println("Tokens:" + to.countTokens());

However, once the doc is indexed and I use Luke to analyse the index, it has only 3300 tokens in that field. Where did the other 47k go?

I read some other threads that suggested increasing maxFieldLength in solrconfig.xml; my setting is below.

    <maxFieldLength>2147483647</maxFieldLength>

Any advice is appreciated,
Paul