Hi all,

I'm trying to index some text files using Solr Cell. I'm using the schema from Avi Rappoport's tutorial on indexing HTML and text files, although I had the same problem with the example/solr setup.
My problem is that words past (or "below") a certain point in a file are not being indexed. I must be hitting some limit, but I haven't been able to figure out which one.

I'm hosting with Tomcat and using cURL to post files to /update/extract, as per Avi's tutorial and other docs. I don't think it's an HTTP limit during the POST, because the whole file is being stored in Solr successfully. I know that because if I retrieve the file body with a query that does work, the word that doesn't work appears lower down in the returned contents. I'm storing the contents for now for testing; once I have this working, the file contents will probably be indexed only.

On a test file that I've been editing, moving my unique word around, searching seems to stop working once that word is beyond the 100 KB point in the file. I think another file earlier gave a different result.

Hopefully I'm missing something obvious. Thanks for any help.

Ross
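
P.S. In case anyone wants to reproduce this, below is a rough Python sketch of how a test file like mine could be generated, with a unique word placed at a chosen byte offset. The marker word, filler text, sizes, and file name are all made up for illustration; they are not my actual data, and the function names are just helpers for this sketch.

```python
# Hypothetical helper for building a test file with a unique marker word
# at a known byte offset. Filler is plain ASCII, so characters == bytes.
FILLER = "lorem ipsum dolor sit amet\n"


def pad(n):
    """Return exactly n characters of repeated filler text."""
    return (FILLER * (n // len(FILLER) + 1))[:n]


def build_test_text(marker, offset, total):
    """Return text of `total` bytes with `marker` starting at byte `offset`."""
    head = pad(offset)
    if head:
        # Force whitespace right before the marker so it is a separate word.
        head = head[:-1] + " "
    tail_len = max(0, total - offset - len(marker) - 1)
    return head + marker + " " + pad(tail_len)


if __name__ == "__main__":
    # Assumed values: a 200 KB file with the marker at the 150 KB point.
    text = build_test_text("zzyzxmarker", 150 * 1024, 200 * 1024)
    with open("solr_offset_test.txt", "w", encoding="ascii", newline="") as f:
        f.write(text)
```

Posting the resulting file to /update/extract the same way as before and then searching for the marker word, with the offset moved above and below 100 KB, should show where the cutoff falls.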