Thanks, Jan, for making the post tool do this type of thing.  Great stuff.

The filename would be a good one to add for out-of-the-box goodness.  We can 
easily add just the filename to the index with something like the patch below.  
And on that note, what else would folks want in an easy-to-use document search 
system like this?

        Erik

Index: core/src/java/org/apache/solr/util/SimplePostTool.java
===================================================================
--- core/src/java/org/apache/solr/util/SimplePostTool.java      (revision 1450270)
+++ core/src/java/org/apache/solr/util/SimplePostTool.java      (working copy)
@@ -749,6 +749,7 @@
               urlStr = appendParam(urlStr, "resource.name=" + URLEncoder.encode(file.getAbsolutePath(), "UTF-8"));
             if(urlStr.indexOf("literal.id")==-1)
               urlStr = appendParam(urlStr, "literal.id=" + URLEncoder.encode(file.getAbsolutePath(), "UTF-8"));
+            urlStr = appendParam(urlStr, "literal.filename_s=" + URLEncoder.encode(file.getName(), "UTF-8"));
             url = new URL(urlStr);
           }
         } else {
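
To make the effect of that one added line concrete, here is a minimal standalone sketch. It is not SimplePostTool itself: the appendParam below is a hypothetical stand-in for the tool's helper of the same name, and the base URL is only an assumed example endpoint.

```java
import java.io.File;
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class FilenameParamSketch {

    // Hypothetical stand-in for SimplePostTool's appendParam helper:
    // joins with '?' for the first parameter and '&' for later ones.
    static String appendParam(String url, String param) {
        return url + (url.indexOf('?') == -1 ? "?" : "&") + param;
    }

    // Mirrors the added patch line: send only the bare file name (no path)
    // as a literal field value, URL-encoded as UTF-8.
    static String withFilename(String urlStr, File file) {
        try {
            return appendParam(urlStr,
                "literal.filename_s=" + URLEncoder.encode(file.getName(), "UTF-8"));
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed example endpoint, not taken from the patch.
        String base = "http://localhost:8983/solr/update/extract";
        System.out.println(withFilename(base, new File("docs/annual report.pdf")));
        // prints http://localhost:8983/solr/update/extract?literal.filename_s=annual+report.pdf
    }
}
```

Unlike literal.id in the patch, this is appended unconditionally, so a filename_s literal already present in the URL would end up alongside the new one.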



On Mar 8, 2013, at 19:16, Jan Høydahl wrote:

> Since this is a POC you could simply run this command with the default 
> example schema:
> 
> cd solr/example/exampledocs
> java -Dauto -Drecursive=0 -jar post.jar path/to/folder
> 
> You will get the full file name with path in field "resourcename".
> If you need to search just the filename, you can achieve that by adding 
> a new field "filename" with a copyField resourcename->filename and a custom 
> fieldType for filename with a PatternReplaceFilterFactory to remove the path.
> 
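For anyone weighing that route, the path-stripping step can be tried out in plain Java before touching the schema. This is only a sketch: the regular expression is an assumption (everything up to and including the last "/" or "\"), not a pattern taken from this thread, and StripPathSketch is a made-up class name. A PatternReplaceFilterFactory would be configured with a similar pattern and an empty replacement.

```java
import java.util.regex.Pattern;

public class StripPathSketch {

    // Assumed pattern: greedily match everything up to and including
    // the last forward slash or backslash.
    private static final Pattern LEADING_PATH = Pattern.compile("^.*[/\\\\]");

    // Returns only the file name portion of a full path.
    static String stripPath(String resourceName) {
        return LEADING_PATH.matcher(resourceName).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripPath("/home/user/docs/report.pdf")); // prints report.pdf
        System.out.println(stripPath("C:\\docs\\budget.xls"));       // prints budget.xls
    }
}
```
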
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
> 
> On 7 March 2013 at 22:11, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
> 
>> You could use DataImportHandler with FileListEntityProcessor to get the
>> file names in:
>> http://wiki.apache.org/solr/DataImportHandler#FileListEntityProcessor
>> 
>> Then, if it is recursive enumeration and not just one level, you probably
>> want a tokenizer that splits on path separator characters (e.g. /). Or
>> maybe you want to index filename as a separate field from full path (can do
>> it in FileListEntityProcessor itself).
>> 
>> And if you combined the list of files with inner entity using Tika, you can
>> load the file content for searching as well:
>> http://wiki.apache.org/solr/DataImportHandler#Tika_Integration
>> 
>> Regards,
>>  Alex.
>> 
>> Personal blog: http://blog.outerthoughts.com/
>> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
>> - Time is the quality of nature that keeps events from happening all at
>> once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)
>> 
>> 
>> On Thu, Mar 7, 2013 at 3:39 PM, pavangolla <pavango...@gmail.com> wrote:
>> 
>>> Hi,
>>> I am new to Apache Solr.
>>> 
>>> I am doing a POC where there is a folder (on the file system or in some
>>> repository) that has files with different extensions: pdf, doc, xls, ...
>>> 
>>> I want to search by file name and retrieve all the files whose names
>>> match.
>>> 
>>> How do I proceed with this?
>>> 
>>> Please help me with this.
>>> 
>>> 
>>> 
>>> --
>>> View this message in context:
>>> http://lucene.472066.n3.nabble.com/Search-a-folder-with-File-name-and-retrieve-all-the-files-matched-tp4045629.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>> 
> 
