SolrJ does not have any file crawler built in.
But you are free to steal the directory-traversal code from SimplePostTool.java,
and then index each document found using SolrJ.

Note that SimplePostTool.java tries to be smart about which endpoint it posts
files to: XML, CSV and JSON content is posted to /update, while office docs go
to /update/extract.
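
As a starting point, a minimal sketch of that crawl-and-route step using only the JDK (class and method names are my own invention, not from SimplePostTool): walk a folder recursively and pick the endpoint by extension, leaving the actual SolrJ request as a comment since it needs a running Solr instance.

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.*;

public class SolrFileCrawler {

    // Content types SimplePostTool sends as-is to /update; everything else
    // (PDF, Word, etc.) goes to /update/extract for Tika extraction.
    private static final Set<String> RAW_TYPES = Set.of("xml", "csv", "json");

    /** Pick the update endpoint based on the file extension. */
    static String endpointFor(Path file) {
        String name = file.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String ext = dot < 0 ? "" : name.substring(dot + 1).toLowerCase();
        return RAW_TYPES.contains(ext) ? "/update" : "/update/extract";
    }

    /** Recursively collect regular files under root, grouped by endpoint. */
    static Map<String, List<Path>> crawl(Path root) throws IOException {
        Map<String, List<Path>> byEndpoint = new HashMap<>();
        try (var stream = Files.walk(root)) {
            stream.filter(Files::isRegularFile)
                  .forEach(f -> byEndpoint
                      .computeIfAbsent(endpointFor(f), k -> new ArrayList<>())
                      .add(f));
        }
        // For each file you would then build a SolrJ request -- e.g. a
        // ContentStreamUpdateRequest on the chosen endpoint with the file
        // attached -- and send it through a SolrClient, adding your own
        // retry/robustness logic around the call.
        return byEndpoint;
    }
}
```

This is just the traversal half; how you batch, commit, and retry the SolrJ calls is up to your production requirements.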

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 16 Oct 2015, at 05:22, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
> 
> Hi,
> 
> I understand that in SimplePostTool (post.jar), there is this command to
> automatically detect content types in a folder, and recursively scan it for
> documents for indexing into a collection:
> bin/post -c gettingstarted afolder/
> 
> This has been useful for me to do mass indexing of all the files that are
> in the folder. Now that I'm moving to production, I plan to use SolrJ to
> do the indexing, as it can do more things like robustness checks and
> retries for indexing operations that fail.
> 
> However, I can't seem to find a way to do the same in SolrJ. Is it
> possible for this to be done in SolrJ? I'm using Solr 5.3.0.
> 
> Thank you.
> 
> Regards,
> Edwin
