>If Perl is your choice:
>http://search.cpan.org/~bricas/WebService-Solr-0.07/lib/WebService/Solr.pm
>
OOOOh. Very interesting; I had not seen this!


>Otis
>--
>Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
>----- Original Message ----
>> From: Francis Yakin <fya...@liquid.com>
>> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
>> Sent: Wednesday, July 8, 2009 1:16:04 AM
>> Subject: Updating Solr index from XML files
>> 
>> 
>> I have the following "curl" commands to update and commit to Solr (I have
>> 10 XML files just for testing):
>> 
>> curl http://solr00:7001/solr/update --data-binary @xml_Artist-100170.txt -H 'Content-type:text/plain; charset=utf-8'
>> curl http://solr00:7001/solr/update --data-binary @xml_Artist-101062.txt -H 'Content-type:text/plain; charset=utf-8'
>> curl http://solr00:7001/solr/update --data-binary @xml_Artist-101238.txt -H 'Content-type:text/plain; charset=utf-8'
>> curl http://solr00:7001/solr/update --data-binary @xml_Artist-101400.txt -H 'Content-type:text/plain; charset=utf-8'
>> curl http://solr00:7001/solr/update --data-binary @xml_Artist-101513.txt -H 'Content-type:text/plain; charset=utf-8'
>> curl http://solr00:7001/solr/update --data-binary @xml_Artist-101517.txt -H 'Content-type:text/plain; charset=utf-8'
>> curl http://solr00:7001/solr/update --data-binary @xml_Artist-101572.txt -H 'Content-type:text/plain; charset=utf-8'
>> curl http://solr00:7001/solr/update --data-binary @xml_Artist-101691.txt -H 'Content-type:text/plain; charset=utf-8'
>> curl http://solr00:7001/solr/update --data-binary @xml_Artist-101694.txt -H 'Content-type:text/plain; charset=utf-8'
>> curl http://solr00:7001/solr/update --data-binary @xml_Artist-101698.txt -H 'Content-type:text/plain; charset=utf-8'
>> curl http://solr00:7001/solr/update --data-binary @commit.txt -H 'Content-type:text/plain; charset=utf-8'
>> 
>> It works so far, but I will have 30,000 XML files.
>> 
>> What is an efficient way to do this? I can script it with a for loop in a
>> regular shell script or Perl.

Assuming Solr 1.4 or a nightly build, I would use DIH (the DataImportHandler) for this:

  If all the files to be added/updated are in one directory, then the
  FileListEntityProcessor can be used to find and index the files.
  It walks the disk from a given starting point.

  If you have another file listing the files to be indexed, then
  I would use LineEntityProcessor to process that list.

Either of the above would locate the files to be indexed and
pass each filename to XPathEntityProcessor with useSolrAddSchema
set to true, which tells it the files are already in Solr's own
<add><doc> update format.
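
As a rough sketch of the first option, the shell function below writes a
minimal DIH data-config.xml. The function name, the entity names "files"
and "docs", and the xml_Artist-.*\.txt pattern (taken from the filenames in
the question) are my assumptions, so adjust them to your setup. It also
assumes the files are plain Solr update XML.

```shell
#!/bin/sh
# Write a minimal DIH data-config.xml that indexes every matching file
# found under BASE_DIR. All names here are illustrative assumptions.
# Usage: write_dih_config BASE_DIR OUTPUT_FILE
write_dih_config() {
    base_dir=$1
    out=$2
    # Unquoted heredoc: $base_dir is expanded, \$ keeps a literal dollar sign
    # so that DIH (not the shell) resolves ${files.fileAbsolutePath}.
    cat > "$out" <<EOF
<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8"/>
  <document>
    <!-- Outer entity: walks baseDir and emits one row per matching file. -->
    <entity name="files" processor="FileListEntityProcessor"
            baseDir="$base_dir" fileName="xml_Artist-.*\.txt"
            recursive="true" rootEntity="false" dataSource="null">
      <!-- Inner entity: reads each file as Solr add/doc XML. -->
      <entity name="docs" processor="XPathEntityProcessor"
              url="\${files.fileAbsolutePath}"
              useSolrAddSchema="true" stream="true"/>
    </entity>
  </document>
</dataConfig>
EOF
}
```

Something like `write_dih_config /data/xml conf/data-config.xml` would
generate the file. The import is then started with a request such as
http://solr00:7001/solr/dataimport?command=full-import, assuming the
DataImportHandler is registered at /dataimport in solrconfig.xml.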
  
  See http://wiki.apache.org/solr/DataImportHandler
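
If DIH is not an option, the for loop mentioned in the question would also
work for a one-off load. A minimal sketch, assuming the files all match
xml_Artist-*.txt and that commit.txt contains nothing more than a <commit/>
message (the function name post_all is made up for illustration):

```shell
#!/bin/sh
# Post every matching file to Solr, then commit once at the end.
# Usage: post_all UPDATE_URL DIRECTORY
post_all() {
    url=$1
    dir=$2
    for f in "$dir"/xml_Artist-*.txt; do
        [ -e "$f" ] || continue   # the glob matched nothing
        # Same request as in the question; stop at the first failure.
        curl -sS -f "$url" --data-binary @"$f" \
             -H 'Content-type:text/plain; charset=utf-8' || return 1
    done
    # Single commit after all files (assumed equivalent to posting commit.txt).
    curl -sS -f "$url" --data-binary '<commit/>' \
         -H 'Content-type:text/plain; charset=utf-8'
}
```

It would be called as `post_all http://solr00:7001/solr/update .` from the
directory holding the files. Committing once at the end, rather than after
every file, is the main thing that keeps this reasonably fast.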

>> 
>> I am also looking into solr.pm from this:
>> 
>> http://wiki.apache.org/solr/IntegratingSolr
>> 
>> BTW: We are using WebLogic to deploy solr.war, and by default Solr in
>> WebLogic uses port 7001 rather than 8983.
>> 
>> Thanks
>> 
>> Francis

-- 

===============================================================
Fergus McMenemie               Email:fer...@twig.me.uk
Techmore Ltd                   Phone:(UK) 07721 376021

Unix/Mac/Intranets             Analyst Programmer
===============================================================
