Think of the DataImportHandler (DIH) as Solr pulling the data to index
from some source, based on configuration. So, once you set up
your DIH config to point to your file system, you issue a command
to Solr like "OK, do your data import thing". See the
FileListEntityProcessor:
http://wiki.apache.org/solr/DataImportHandler
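
To make the "do your data import thing" command concrete: it is just an HTTP
request to the DIH request handler. Here is a rough sketch of issuing it from
Java. The base URL http://localhost:8983/solr and the handler path /dataimport
are assumptions (the path is whatever you registered in solrconfig.xml), and
the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

class DihTrigger {
    // Builds the URL for a DIH command such as "full-import" or "status".
    // ASSUMPTION: the handler is registered as "/dataimport" in solrconfig.xml.
    static String commandUrl(String solrBase, String command) {
        String base = solrBase.endsWith("/")
                ? solrBase.substring(0, solrBase.length() - 1)
                : solrBase;
        try {
            return base + "/dataimport?command=" + URLEncoder.encode(command, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Sends the command to Solr and returns the HTTP status code.
    static int send(String solrBase, String command) throws IOException {
        URL url = new URL(commandUrl(solrBase, command));
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // ASSUMPTION: Solr is reachable at this address unless one is passed in.
        String base = args.length > 0 ? args[0] : "http://localhost:8983/solr";
        System.out.println("HTTP " + send(base, "full-import"));
    }
}
```

The import runs inside Solr, so the request only starts it. Sending the
"status" command the same way reports on progress.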

SolrJ is a client library you'd use to push data to Solr. Basically, you
write a Java program that uses SolrJ to walk the file system, find
documents, create a Solr document for each one and send it to Solr. It's
not nearly as complex as it sounds <G>. See:
http://wiki.apache.org/solr/Solrj
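
To give a feel for the "walk the file system, find documents, create a Solr
document" part, here is a minimal sketch using only the standard Java library.
It is not SolrJ itself: each document is a plain map of field names to values.
The field names "id" and "text", the ".txt" filter and the class name are all
assumptions for illustration, so match them to your own schema and files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FileWalkIndexer {
    // Recursively collects regular files ending in ".txt" under root.
    // ASSUMPTION: the target text files can be recognized by this extension.
    static List<Path> findTextFiles(Path root) {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .filter(p -> p.getFileName().toString().endsWith(".txt"))
                        .sorted()
                        .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Turns one file into a field map: the data a Solr document would carry.
    // ASSUMPTION: the schema has "id" and "text" fields and files are UTF-8.
    static Map<String, Object> toDocument(Path file) {
        try {
            Map<String, Object> doc = new LinkedHashMap<>();
            doc.put("id", file.toAbsolutePath().toString());
            doc.put("text", new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
            return doc;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Pass an NFS mount point here; defaults to the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path file : findTextFiles(root)) {
            Map<String, Object> doc = toDocument(file);
            // With SolrJ, this is where the fields would be copied into a
            // SolrInputDocument and handed to the server object's add(...).
            System.out.println("would index: " + doc.get("id"));
        }
        // With SolrJ, a commit() after the loop makes the documents searchable.
    }
}
```

Because the directories are NFS mounts, a program like this can run on the
Linux box where Solr lives, with nothing installed on the storage boxes.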

It's probably worth your while to get a copy of
"Solr 1.4 Enterprise Search Server"
by David Smiley and Eric Pugh.

Best
Erick

On Fri, Nov 12, 2010 at 8:37 AM, K. Seshadri Iyer <seshadri...@gmail.com> wrote:

> Hi Lance,
>
> Thank you very much for responding (not sure how I reply to the group, so,
> writing to you).
>
> Can you please expand on your suggestion? I am not a web guy and so, don't
> know where to start.
>
> What is the difference between SolrJ and DataImportHandler? Do I need to
> set up web servers on all my storage boxes?
>
> Apologies for the basic level of questions, but hope I can get started and
> implement this before the year end (you know why :o)
>
> Thanks,
>
> Sesh
>
> On 12 November 2010 13:31, Lance Norskog <goks...@gmail.com> wrote:
>
> > Using 'curl' is fine. There is a library called SolrJ for Java and
> > other libraries for other scripting languages that let you upload with
> > more control. There is a thing in Solr called the DataImportHandler
> > that lets you script walking a file system.
> >
> > On Thu, Nov 11, 2010 at 8:38 PM, K. Seshadri Iyer <seshadri...@gmail.com>
> > wrote:
> > > Hi,
> > >
> > > Pardon me if this sounds very elementary, but I have a very basic
> > > question regarding Solr search. I have about 10 storage devices
> > > running Solaris with hundreds of thousands of text files (there are
> > > other files, as well, but my target is these text files). The
> > > directories on the Solaris boxes are exported and are available as
> > > NFS mounts.
> > >
> > > I have installed Solr 1.4 on a Linux box and have tested the
> > > installation, using curl to post documents. However, the manual says
> > > that curl is not the recommended way of posting documents to Solr.
> > > Could someone please tell me what is the preferred approach in such
> > > an environment? I am not a programmer and would appreciate some
> > > hand-holding here :o)
> > >
> > > Thanks in advance,
> > >
> > > Sesh
> > >
> >
> >
> >
> > --
> > Lance Norskog
> > goks...@gmail.com
> >
>
