>If Perl is your choice:
>http://search.cpan.org/~bricas/WebService-Solr-0.07/lib/WebService/Solr.pm
>
OOOOh. Very interesting; I had not seen this!
>Otis
>--
>Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>----- Original Message ----
>> From: Francis Yakin <fya...@liquid.com>
>> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
>> Sent: Wednesday, July 8, 2009 1:16:04 AM
>> Subject: Updating Solr index from XML files
>>
>> I have the following "curl" cmd to update and doing commit to Solr ( I have 10 xml files just for testing)
>>
>> curl http://solr00:7001/solr/update --data-binary @xml_Artist-100170.txt -H 'Content-type:text/plain; charset=utf-8'
>> curl http://solr00:7001/solr/update --data-binary @xml_Artist-101062.txt -H 'Content-type:text/plain; charset=utf-8'
>> curl http://solr00:7001/solr/update --data-binary @xml_Artist-101238.txt -H 'Content-type:text/plain; charset=utf-8'
>> curl http://solr00:7001/solr/update --data-binary @xml_Artist-101400.txt -H 'Content-type:text/plain; charset=utf-8'
>> curl http://solr00:7001/solr/update --data-binary @xml_Artist-101513.txt -H 'Content-type:text/plain; charset=utf-8'
>> curl http://solr00:7001/solr/update --data-binary @xml_Artist-101517.txt -H 'Content-type:text/plain; charset=utf-8'
>> curl http://solr00:7001/solr/update --data-binary @xml_Artist-101572.txt -H 'Content-type:text/plain; charset=utf-8'
>> curl http://solr00:7001/solr/update --data-binary @xml_Artist-101691.txt -H 'Content-type:text/plain; charset=utf-8'
>> curl http://solr00:7001/solr/update --data-binary @xml_Artist-101694.txt -H 'Content-type:text/plain; charset=utf-8'
>> curl http://solr00:7001/solr/update --data-binary @xml_Artist-101698.txt -H 'Content-type:text/plain; charset=utf-8'
>> curl http://solr00:7001/solr/update --data-binary @commit.txt -H 'Content-type:text/plain; charset=utf-8'
>>
>> It works so far. But I will have 30000 xml files.
>>
>> What's the efficient way to do these things? I can script it with for loop using regular shell script or perl.

Assuming Solr1.4 or a nightly build.
I would use DIH for this:

If all the files to be added or updated are in a directory, then the FileListEntityProcessor could be used to find and index the files. It walks the disk from a given starting point.

If you have another file listing the files to be indexed, then I would use LineEntityProcessor to process that list.

Either of the above would locate the files to be indexed and pass each filename to XPathEntityProcessor with useSolrAddSchema set to true.

See http://wiki.apache.org/solr/DataImportHandler

>> I am also looking into solr.pm from this:
>>
>> http://wiki.apache.org/solr/IntegratingSolr
>>
>> BTW: We are using weblogic to deploy the solr.war and by default solr in
>> weblogic using port 7001, but not 8983.
>>
>> Thanks
>>
>> Francis

--
===============================================================
Fergus McMenemie               Email:fer...@twig.me.uk
Techmore Ltd                   Phone:(UK) 07721 376021
Unix/Mac/Intranets             Analyst Programmer
===============================================================
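
For comparison, the plain shell loop mentioned in the question might look roughly like the sketch below. It is only an illustration under a few assumptions: the files all sit in one directory and match xml_Artist-*.txt as in the post, commit.txt contains just <commit/> (so the commit is sent inline here), and the post_all function and the CURL override variable are hypothetical names added so the loop can be dry-run without a server.

```shell
#!/bin/sh
# Sketch: post every xml_Artist-*.txt file in a directory to Solr,
# then send a single commit at the end instead of committing per file.
# URL and headers are taken from the original post; post_all and CURL
# are hypothetical names introduced for this example.
SOLR_URL="${SOLR_URL:-http://solr00:7001/solr/update}"
CURL="${CURL:-curl}"

post_all() {
    dir="${1:-.}"
    for f in "$dir"/xml_Artist-*.txt; do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$f" ] || continue
        "$CURL" "$SOLR_URL" --data-binary "@$f" \
            -H 'Content-type:text/plain; charset=utf-8' || return 1
    done
    # One commit after all files have been posted (assumed to be
    # equivalent to posting commit.txt in the original commands).
    "$CURL" "$SOLR_URL" --data-binary '<commit/>' \
        -H 'Content-type:text/plain; charset=utf-8'
}
```

Calling post_all /path/to/xml/files would then post the whole directory. The glob is expanded inside the for loop, so 30000 files should not run into the command-line length limit, but this still costs one HTTP request per file, which is part of why DIH may be the better fit.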