Yes, I've read directly from NFS.
Consider the case where your mapper takes as input a list of the file paths
to operate on. Your mapper would load each file one by one using
standard java.io.* calls, build a SolrInputDocument out of each one, and
submit it to a SolrServer implementation stored as a member of the mapper.
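A rough, untested sketch of that mapper (assuming SolrJ 4.x and the new Hadoop
mapreduce API; the Solr URL and field names are just placeholders):

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

// Each input line is the path of a file on the shared (e.g. NFS) mount; the
// mapper reads it with plain java.io, builds a SolrInputDocument, and sends it
// to a SolrServer held as a member variable.
public class FilePathIndexingMapper extends Mapper<LongWritable, Text, NullWritable, NullWritable> {

    private SolrServer solr;

    @Override
    protected void setup(Context context) {
        solr = new HttpSolrServer("http://solrhost:8983/solr/collection1");  // placeholder URL
    }

    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException {
        File file = new File(value.toString());

        // Read the whole file from the mounted storage.
        StringBuilder body = new StringBuilder();
        BufferedReader reader = new BufferedReader(new FileReader(file));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line).append('\n');
            }
        } finally {
            reader.close();
        }

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", file.getAbsolutePath());   // example field names only
        doc.addField("text", body.toString());
        try {
            solr.add(doc);
        } catch (SolrServerException e) {
            throw new IOException(e);
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException {
        try {
            solr.commit();
        } catch (SolrServerException e) {
            throw new IOException(e);
        } finally {
            solr.shutdown();
        }
    }
}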
If you can upload your data to HDFS, you can use this patch to build the Solr
indexes:
https://issues.apache.org/jira/browse/SOLR-1301
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-indexer-and-Hadoop-tp4072951p4074635.html
Sent from the Solr - User mailing list archive at Nabble.com.
Michael,
I understand from your post that I can use the current storage without moving
it into Hadoop. I already have the storage mounted via NFS.
Does your map function read from the mounted storage directly? If possible,
could you please elaborate on that?
Thanks
Engy
4.4. If not in 4.4, 4.5 is probably a slam-dunk.
-- Jack Krupansky
-Original Message-
From: David Larochelle
Sent: Wednesday, June 26, 2013 11:24 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr indexer and Hadoop
Pardon my unfamiliarity with the Solr development process.
> On Tue, Jun 25, 2013 at 8:58 AM, Jack Krupansky wrote:
>
>> ???
>>
>> Hadoop=HDFS
>>
>> If the data is not in Hadoop/HDFS, just use the normal Solr indexing
>> tools, including the Data Import Handler, and possibly ManifoldCF.
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: engy.morsy
>> Sent: Tuesday, June 25, 2013 8:10 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr indexer and Hadoop
>>
>> Thank you Jack. So, I need to convert those nodes holding data to HDFS.
But note that MapReduce and HDFS are not the only way to go.
For example, can you split your source data? If you can, you could do
that, put the pieces on N machines, and run an indexer on each of them with
some number of threads. Of course, your Solr(Cloud?) cluster had better
have enough servers/CPU cores.
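A per-machine indexer along those lines could be as simple as this untested
sketch (SolrJ 4.x assumed; the URL, directory, sizes, and field names are
placeholders):

import java.io.File;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer;
import org.apache.solr.common.SolrInputDocument;

// Indexes one machine's slice of the data with a fixed-size thread pool;
// ConcurrentUpdateSolrServer queues and streams the adds in the background.
public class SliceIndexer {

    public static void main(String[] args) throws Exception {
        // Placeholders: Solr URL, queue size, background threads, slice directory.
        final ConcurrentUpdateSolrServer solr =
                new ConcurrentUpdateSolrServer("http://solrhost:8983/solr/collection1", 1000, 4);
        ExecutorService pool = Executors.newFixedThreadPool(8);

        // Assumes the slice directory exists and contains only regular files.
        for (final File file : new File("/data/slice-01").listFiles()) {
            pool.submit(new Runnable() {
                public void run() {
                    try {
                        SolrInputDocument doc = new SolrInputDocument();
                        doc.addField("id", file.getAbsolutePath());   // example fields only
                        doc.addField("filename", file.getName());
                        solr.add(doc);
                    } catch (Exception e) {
                        e.printStackTrace();                          // real code would handle failures
                    }
                }
            });
        }

        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        solr.commit();
        solr.shutdown();
    }
}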
>> The problem I am facing is how to read those data from hard disks which are
>> not HDFS
If you are planning to use a MapReduce job to do the indexing, then the source
data will definitely have to be on HDFS.
The Map function can transform the source data into Solr documents and send them
to Solr.
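A rough, untested sketch of such a map function (same SolrJ 4.x / new-API
assumptions as the earlier mapper sketch; it treats each input line as one
source record and batches the adds, with made-up field names):

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

// Reads records from HDFS (one per line via TextInputFormat), turns each one
// into a SolrInputDocument, and ships them to Solr in batches.
public class HdfsToSolrMapper extends Mapper<LongWritable, Text, NullWritable, NullWritable> {

    private static final int BATCH_SIZE = 500;               // arbitrary example size
    private final List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
    private SolrServer solr;

    @Override
    protected void setup(Context context) {
        solr = new HttpSolrServer("http://solrhost:8983/solr/collection1");  // placeholder URL
    }

    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", String.valueOf(key.get()));        // example fields only
        doc.addField("text", value.toString());
        batch.add(doc);
        if (batch.size() >= BATCH_SIZE) {
            flush();
        }
    }

    private void flush() throws IOException {
        try {
            solr.add(batch);                                  // one round trip per batch
            batch.clear();
        } catch (SolrServerException e) {
            throw new IOException(e);
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException {
        if (!batch.isEmpty()) {
            flush();
        }
        try {
            solr.commit();
        } catch (SolrServerException e) {
            throw new IOException(e);
        } finally {
            solr.shutdown();
        }
    }
}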
Thank you Jack. So, I need to convert those nodes holding data to HDFS.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-indexer-and-Hadoop-tp4072951p4073013.html
Sent from the Solr - User mailing list archive at Nabble.com.
Solr does not have any integrated Hadoop/HDFS crawling or indexing support
today. Sorry.
LucidWorks Search does have HDFS crawling support:
http://docs.lucidworks.com/display/lweug/Using+the+High+Volume+HDFS+Crawler
Cloudera Search has HDFS support as well.
-- Jack Krupansky