After Nutch fetches documents from the server, it passes them to the parser, which detects each document's type and parses it accordingly.
One possibility is that the parser failed to parse some of the documents, which is why you are seeing the count mismatch.
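To make the idea concrete, here is a rough sketch of how a fetch-then-parse pipeline can end up with fewer parsed documents than fetched ones. This is not Nutch's actual code: `FetchedDoc`, `PARSERS`, `parse_all`, and the example URLs are all made-up names for illustration, and the "parsers" just decode bytes as UTF-8.

```python
from dataclasses import dataclass


@dataclass
class FetchedDoc:
    """Hypothetical stand-in for a fetched document (not a Nutch class)."""
    url: str
    content_type: str
    body: bytes


def parse_as_utf8(body: bytes) -> str:
    # Toy parser: raises UnicodeDecodeError when the bytes are not valid UTF-8.
    return body.decode("utf-8")


# Hypothetical registry mapping a detected content type to a parser.
PARSERS = {
    "text/html": parse_as_utf8,
    "text/plain": parse_as_utf8,
}


def parse_all(docs):
    """Dispatch each document by type; collect successes and failures separately."""
    parsed, failed = [], []
    for doc in docs:
        parser = PARSERS.get(doc.content_type)
        if parser is None:
            failed.append((doc.url, "no parser for " + doc.content_type))
            continue
        try:
            parsed.append((doc.url, parser(doc.body)))
        except UnicodeDecodeError as exc:
            failed.append((doc.url, str(exc)))
    return parsed, failed


docs = [
    FetchedDoc("http://example.com/a.html", "text/html", b"<html>ok</html>"),
    FetchedDoc("http://example.com/b.pdf", "application/pdf", b"%PDF-1.4"),
    FetchedDoc("http://example.com/c.txt", "text/plain", b"\xff\xfe"),
]

parsed, failed = parse_all(docs)
print("fetched:", len(docs), "parsed:", len(parsed), "failed:", len(failed))
for url, reason in failed:
    print("  not parsed:", url, "-", reason)
```

Only the documents in `parsed` would go on to be indexed, so the count on the Solr side comes out lower than the fetched count. On a real crawl, comparing the fetched and parsed counts per segment (as far as I know, `bin/nutch readseg -list` reports both) and looking for parse errors in the Nutch logs should show whether this is what is happening.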