Re: Solr indexer and Hadoop

2013-07-02 Thread Michael Della Bitta
Yes, I've read directly from NFS. Consider the case where your mapper takes as input a list of the file paths to operate on. Your mapper would load each file one by one by using standard java.io.* calls, build a SolrInputDocument out of each one, and submit it to a SolrServer implementation stored
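The approach described above can be sketched roughly as follows. This is a minimal illustration, not code from the thread: the `DocSink` interface is a hypothetical stand-in for a SolrJ `SolrServer`, a plain `Map` stands in for `SolrInputDocument`, and the field names `id` and `text` are assumptions. It also uses `java.nio.file` for brevity where the message mentions `java.io.*`. In a real job, `indexPaths` would be the body of the map step and every path would point at the NFS mount visible to each task node.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

class NfsPathIndexer {

    /** Hypothetical stand-in for a SolrJ SolrServer; a real job would submit the document to Solr here. */
    interface DocSink {
        void add(Map<String, Object> doc) throws IOException;
    }

    /** Builds one document from a file. The field names "id" and "text" are assumptions. */
    static Map<String, Object> toDocument(Path file) throws IOException {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", file.toString());
        doc.put("text", new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        return doc;
    }

    /** Reads each listed file with ordinary file I/O and hands the resulting document to the sink. */
    static int indexPaths(Iterable<String> pathLines, DocSink sink) throws IOException {
        int count = 0;
        for (String line : pathLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank input lines
            }
            sink.add(toDocument(Paths.get(trimmed)));
            count++;
        }
        return count;
    }
}
```

Only the list of paths has to travel through Hadoop in this arrangement; the file contents are read straight from the mounted storage.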

Re: Solr indexer and Hadoop

2013-07-02 Thread Anatoli Matuskova
If you can upload your data to HDFS, you can use this patch to build the Solr indexes: https://issues.apache.org/jira/browse/SOLR-1301
-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-indexer-and-Hadoop-tp4072951p4074635.html Sent from the Solr - User mailing list archive

Re: Solr indexer and Hadoop

2013-07-02 Thread engy.morsy
Michael, I understand from your post that I can use the current storage in Hadoop without converting it to HDFS. I already have the storage mounted via NFS. Does your map function read from the mounted storage directly? If possible, could you please elaborate on that? Thanks, Engy

Re: Solr indexer and Hadoop

2013-06-26 Thread Jack Krupansky
o 4.4. If not in 4.4, 4.5 is probably a slam-dunk. -- Jack Krupansky
-Original Message- From: David Larochelle Sent: Wednesday, June 26, 2013 11:24 AM To: solr-user@lucene.apache.org Subject: Re: Solr indexer and Hadoop
Pardon my unfamiliarity with the Solr development process. Now

Re: Solr indexer and Hadoop

2013-06-26 Thread David Larochelle

Re: Solr indexer and Hadoop

2013-06-26 Thread Erick Erickson
On Tue, Jun 25, 2013 at 8:58 AM, Jack Krupansky wrote:
> ???
> Hadoop=HDFS
> If the data is not in Hadoop/HDFS, just use the nor

Re: Solr indexer and Hadoop

2013-06-25 Thread Michael Della Bitta
On Tue, Jun 25, 2013 at 8:58 AM, Jack Krupansky wrote:
> ???
> Hadoop=HDFS
> If the data is not in Hadoop/HDFS, just use the normal Solr indexing tools, includi

Re: Solr indexer and Hadoop

2013-06-25 Thread Erick Erickson
> ort Handler, and possibly ManifoldCF.
> -- Jack Krupansky
> -Original Message-
> From: engy.morsy
> Sent: Tuesday, June 25, 2013 8:10 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr indexer and Hadoop

Re: Solr indexer and Hadoop

2013-06-25 Thread Michael Della Bitta
> y ManifoldCF.
> -- Jack Krupansky
> -Original Message-
> From: engy.morsy
> Sent: Tuesday, June 25, 2013 8:10 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr indexer and Hadoop
> Thank you Jack. So, I need to convert those nodes holding data to

Re: Solr indexer and Hadoop

2013-06-25 Thread Jack Krupansky
@lucene.apache.org
Subject: Re: Solr indexer and Hadoop
Thank you Jack. So, I need to convert those nodes holding data to HDFS.

Re: Solr indexer and Hadoop

2013-06-25 Thread Otis Gospodnetic
But note that MapReduce and HDFS are not the only way to go. For example, can you split your source data? If you can, you could split it, put the pieces on N machines, and run an indexer on each of them, each with some number of threads. Of course, your Solr(Cloud?) cluster had better have enough servers/CPU cor
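The split-and-run-in-parallel idea above could be sketched as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, the round-robin split is one arbitrary way to divide the inputs, and the `indexOne` callback stands in for whatever code actually sends one item to Solr. Each machine would take one shard and run `indexShard` on it with a thread count suited to that machine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

class ShardedIndexer {

    /** Splits the inputs into n shards, round-robin, so shard sizes differ by at most one. */
    static <T> List<List<T>> split(List<T> items, int n) {
        List<List<T>> shards = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            shards.add(new ArrayList<>());
        }
        for (int i = 0; i < items.size(); i++) {
            shards.get(i % n).add(items.get(i));
        }
        return shards;
    }

    /** Indexes one shard with a fixed-size thread pool and returns the number of items processed. */
    static <T> int indexShard(List<T> shard, int threads, Consumer<T> indexOne) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (T item : shard) {
                pending.add(pool.submit(() -> indexOne.accept(item)));
            }
            for (Future<?> f : pending) {
                f.get(); // waits for completion and rethrows any indexing failure
            }
            return pending.size();
        } finally {
            pool.shutdown();
        }
    }
}
```

The total number of concurrent indexing threads is N machines times the per-machine thread count, which is the load the Solr cluster has to absorb.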

RE: Solr indexer and Hadoop

2013-06-25 Thread James Thomas
> The problem I am facing is how to read those data from hard disks which are not HDFS
If you are planning to use a Map-Reduce job to do the indexing, then the source data will definitely have to be on HDFS. The Map function can transform the source data to Solr documents and send them to So

Re: Solr indexer and Hadoop

2013-06-25 Thread engy.morsy
Thank you Jack. So, I need to convert those nodes holding data to HDFS.

Re: Solr indexer and Hadoop

2013-06-25 Thread Jack Krupansky
Solr does not have any integrated Hadoop/HDFS crawling or indexing support today. Sorry.
LucidWorks Search does have HDFS crawling support: http://docs.lucidworks.com/display/lweug/Using+the+High+Volume+HDFS+Crawler
Cloudera Search has HDFS support as well.
-- Jack Krupansky