Would Solr's post.jar work for you? It has a directory recurse option. The usage/help output is pasted below.
Here's what should work for you:

    java -Dauto -Drecursive -jar post.jar /some/folder

    Erik


exampledocs java -jar post.jar --help
SimplePostTool version 1.5
Usage: java [SystemProperties] -jar post.jar [-h|-] [<file|folder|url|arg> [<file|folder|url|arg>...]]

Supported System Properties and their defaults:
  -Ddata=files|web|args|stdin (default=files)
  -Dtype=<content-type> (default=application/xml)
  -Durl=<solr-update-url> (default=http://localhost:8983/solr/update)
  -Dauto=yes|no (default=no)
  -Drecursive=yes|no|<depth> (default=0)
  -Ddelay=<seconds> (default=0 for files, 10 for web)
  -Dfiletypes=<type>[,<type>,...] (default=xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log)
  -Dparams="<key>=<value>[&<key>=<value>...]" (values must be URL-encoded)
  -Dcommit=yes|no (default=yes)
  -Doptimize=yes|no (default=no)
  -Dout=yes|no (default=no)

This is a simple command line tool for POSTing raw data to a Solr
port. Data can be read from files specified as commandline args,
URLs specified as args, as raw commandline arg strings or via STDIN.

Examples:
  java -jar post.jar *.xml
  java -Ddata=args -jar post.jar '<delete><id>42</id></delete>'
  java -Ddata=stdin -jar post.jar < hd.xml
  java -Ddata=web -jar post.jar http://example.com/
  java -Dtype=text/csv -jar post.jar *.csv
  java -Dtype=application/json -jar post.jar *.json
  java -Durl=http://localhost:8983/solr/update/extract -Dparams=literal.id=a -Dtype=application/pdf -jar post.jar a.pdf
  java -Dauto -jar post.jar *
  java -Dauto -Drecursive -jar post.jar afolder
  java -Dauto -Dfiletypes=ppt,html -jar post.jar afolder

The options controlled by System Properties include the Solr
URL to POST to, the Content-Type of the data, whether a commit
or optimize should be executed, and whether the response should
be written to STDOUT. If auto=yes the tool will try to set type
and url automatically from file name. When posting rich documents
the file name will be propagated as "resource.name" and also used
as "literal.id". You may override these or any other request parameter
through the -Dparams property. To do a commit only, use "-" as argument.
The web mode is a simple crawler following links within domain, default delay=10s.


On Mar 5, 2013, at 04:38 , Syao Work wrote:

> Hello,
>
> I am trying to index some FS folder tree.
> Spent 2 days finding what could be the problem - got nothing :) There are not
> so much examples on indexing File System.
> In the logs I cant find any exceptions why it does not process the info
> Data import configuration and debug response are attached
>
>
> Using:
> 1. solr web admin tool,
> 2. Java version "1.7.0_09-icedtea"
> OpenJDK Runtime Environment (fedora-2.3.7.0.fc17-x86_64)
> OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)
>
> Thank you for your time,
> Ro
>
> P.S. Excuse my bad English, I am not a native English speaker.
> <data-config.xml><import-debug-response.json>