Would Solr's post.jar work for you? It has an option to recurse into
directories. The usage/help output is pasted below.

Here's what should work for you: "java -Dauto -Drecursive -jar post.jar /some/folder"
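
If you want to see the exact command before pointing it at a live index, here is a minimal dry-run sketch. The flags come from the help output pasted below; the post_cmd function name and the SOLR_URL variable are hypothetical names used only for illustration, and the URL is simply the tool's documented default.

```shell
#!/bin/sh
# Dry-run sketch: prints the post.jar command line instead of running it.
# SOLR_URL and post_cmd are illustrative names, not part of post.jar.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr/update}"

post_cmd() {
  # $1 = folder to index
  # $2 = recursion setting (yes|no|<depth>); defaults to yes
  # Note: the printed path is not shell-quoted, so avoid spaces in $1.
  printf 'java -Dauto=yes -Drecursive=%s -Durl=%s -jar post.jar %s\n' \
    "${2:-yes}" "$SOLR_URL" "$1"
}

post_cmd /some/folder      # recurse without a depth limit
post_cmd /some/folder 2    # recurse at most 2 levels deep
```

Once the printed line looks right, run it from the directory that contains post.jar.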

        Erik



exampledocs  java -jar post.jar --help
SimplePostTool version 1.5
Usage: java [SystemProperties] -jar post.jar [-h|-] [<file|folder|url|arg> [<file|folder|url|arg>...]]

Supported System Properties and their defaults:
  -Ddata=files|web|args|stdin (default=files)
  -Dtype=<content-type> (default=application/xml)
  -Durl=<solr-update-url> (default=http://localhost:8983/solr/update)
  -Dauto=yes|no (default=no)
  -Drecursive=yes|no|<depth> (default=0)
  -Ddelay=<seconds> (default=0 for files, 10 for web)
  -Dfiletypes=<type>[,<type>,...] (default=xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log)
  -Dparams="<key>=<value>[&<key>=<value>...]" (values must be URL-encoded)
  -Dcommit=yes|no (default=yes)
  -Doptimize=yes|no (default=no)
  -Dout=yes|no (default=no)

This is a simple command line tool for POSTing raw data to a Solr
port.  Data can be read from files specified as commandline args,
URLs specified as args, as raw commandline arg strings or via STDIN.
Examples:
  java -jar post.jar *.xml
  java -Ddata=args  -jar post.jar '<delete><id>42</id></delete>'
  java -Ddata=stdin -jar post.jar < hd.xml
  java -Ddata=web -jar post.jar http://example.com/
  java -Dtype=text/csv -jar post.jar *.csv
  java -Dtype=application/json -jar post.jar *.json
  java -Durl=http://localhost:8983/solr/update/extract -Dparams=literal.id=a -Dtype=application/pdf -jar post.jar a.pdf
  java -Dauto -jar post.jar *
  java -Dauto -Drecursive -jar post.jar afolder
  java -Dauto -Dfiletypes=ppt,html -jar post.jar afolder
The options controlled by System Properties include the Solr
URL to POST to, the Content-Type of the data, whether a commit
or optimize should be executed, and whether the response should
be written to STDOUT. If auto=yes the tool will try to set type
and url automatically from file name. When posting rich documents
the file name will be propagated as "resource.name" and also used
as "literal.id". You may override these or any other request parameter
through the -Dparams property. To do a commit only, use "-" as argument.
The web mode is a simple crawler following links within domain, default delay=10s.


On Mar 5, 2013, at 04:38 , Syao Work wrote:

> Hello,
> 
> I am trying to index a file system folder tree.
> I spent 2 days trying to find the problem and got nothing :) There are not
> many examples of indexing a file system.
> In the logs I can't find any exceptions explaining why it does not process the data.
> The data import configuration and debug response are attached.
> 
> 
> Using: 
> 1. solr web admin tool, 
> 2. Java version "1.7.0_09-icedtea"
>    OpenJDK Runtime Environment (fedora-2.3.7.0.fc17-x86_64) 
>    OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)
> 
> Thank you for your time,
> Ro
> 
> P.S. Excuse my bad English, I am not a native English speaker.
> <data-config.xml><import-debug-response.json>
