Re: Indexing files from HDFS

2017-10-12 Thread Shawn Heisey
On 10/12/2017 2:04 AM, István wrote: The question is not about Hue but about why file_path is in the schema for HDFS files when using search-mr. I am wondering what is the standard way of indexing files on HDFS. The error in your original post indicates that at least one document in the update

Re: Indexing files from HDFS

2017-10-12 Thread István
Hi Erick, The question is not about Hue but about why file_path is in the schema for HDFS files when using search-mr. I am wondering what is the standard way of indexing files on HDFS. Thanks, Istvan On Wed, Oct 11, 2017 at 4:53 PM, Erick Erickson wrote: > You probably get much more informed re

Re: Indexing files from HDFS

2017-10-11 Thread Erick Erickson
You probably get much more informed responses from the Cloudera folks, especially about Hue. Best, Erick On Wed, Oct 11, 2017 at 6:05 AM, István wrote: > Hi, > > I have Solr 4.10.3 as part of a CDH5 installation and I would like to index > a huge number of CSV files on HDFS. I was wondering what is t

Indexing files from HDFS

2017-10-11 Thread István
Hi, I have Solr 4.10.3 as part of a CDH5 installation and I would like to index a huge number of CSV files on HDFS. I was wondering what is the best way of doing that. Here is the current approach: data.csv: id, fruit 10, apple 20, orange Indexing with the following command using search-mr-1.0.0-cd