: We recently updated our Solr and Solr indexing from DIH using Solr 1.4 to our
: own Hadoop import using SolrJ and Solr 3.4.
	...
: Any document that has a string field value with a carriage return "\r" is
: having that carriage return stripped before being added to the index. All
: line breaks "\n" are not being stripped.
	...
: This did not occur with the DIH.
:
: Thoughts? Is there a way to not have solrJ strip all carriage returns?

What makes you think this is SolrJ? If it is, you should be able to create a ~10 line SolrJ test demonstrating this with hard-coded data.

I suspect your data is getting cleaned somewhere else in your data flow, in a step that didn't exist when DIH was fetching it directly.

-Hoss
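
One way to narrow down where the "\r" disappears is to log the field value with its control characters made visible at each step of the data flow: right after it is read, right before it is handed to SolrJ, and again when it comes back from a query. The sketch below is only an illustration of that idea. The class CarriageReturnCheck and its methods are hypothetical helpers, not part of SolrJ or Solr, and the hard-coded string stands in for a real field value.

```java
// Hypothetical diagnostic helper; not part of SolrJ or Solr.
final class CarriageReturnCheck {

    // Render CR and LF as the visible two-character escapes \r and \n,
    // so they can be told apart in log output.
    static String visible(String value) {
        return value.replace("\r", "\\r").replace("\n", "\\n");
    }

    // Count the carriage returns in a field value.
    static int countCr(String value) {
        int count = 0;
        for (int i = 0; i < value.length(); i++) {
            if (value.charAt(i) == '\r') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Hard-coded data, standing in for a field value from the import.
        String hardCoded = "line one\r\nline two";
        System.out.println(visible(hardCoded) + " CRs=" + countCr(hardCoded));
    }
}
```

If the count is already zero before the document reaches SolrJ, the stripping is happening upstream. If it only drops to zero after a round trip through Solr with hard-coded data like the above, that is the small reproducible test case to post.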