I'm not sure yet; it's a remote team, but I will get more info.  For now,
assume that a certain directory is specified, like "/user/andrew/", and a
wildcard pattern such as "*/*/*.pdf" is applied to match any PDF exactly two
directories below it.

Would there be a way to capture the wild-carded values and index them as
fields?
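
To make the question concrete, here is a minimal sketch of the kind of
capture I have in mind, written in plain Python outside of Solr. The base
directory, the field names (dir1, dir2, filename), and the helper name
path_fields are all made up for illustration; none of this is an existing
Solr feature, just the parsing step I would otherwise do in custom code
before sending the document.

```python
import re

# Hypothetical base directory, for illustration only.
BASE_DIR = "/user/andrew/"

# Matches exactly two directory levels below BASE_DIR, then a PDF file name.
# The group names (dir1, dir2, filename) are assumed field names.
PATH_PATTERN = re.compile(
    re.escape(BASE_DIR)
    + r"(?P<dir1>[^/]+)/(?P<dir2>[^/]+)/(?P<filename>[^/]+\.pdf)$"
)


def path_fields(path):
    """Return the captured path components as a dict, or None if no match."""
    match = PATH_PATTERN.match(path)
    return match.groupdict() if match else None


print(path_fields("/user/andrew/1234/1234/file.pdf"))
# → {'dir1': '1234', 'dir2': '1234', 'filename': 'file.pdf'}
```

The resulting dict could then be added to each document alongside the
extracted text, assuming the indexing path allows extra fields to be passed
in.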

On Tue, Jul 21, 2015 at 11:20 AM, Upayavira <u...@odoko.co.uk> wrote:

> Keeping to the user list (the right place for this question).
>
> More information is needed here - how are you getting these documents
> into Solr? Are you posting them to /update/extract? Or using DIH, or?
>
> Upayavira
>
> On Tue, Jul 21, 2015, at 06:31 PM, Andrew Musselman wrote:
> > Dear user and dev lists,
> >
> > We are loading files from a directory and would like to index a portion
> > of each file path as a field as well as the text inside the file.
> >
> > E.g., on HDFS we have this file path:
> >
> > /user/andrew/1234/1234/file.pdf
> >
> > And we would like the "1234" token parsed from the file path and indexed
> > as an additional field that can be searched on.
> >
> > From my initial searches I can't see how to do this easily, so would I
> > need to write some custom code, or a plugin?
> >
> > Thanks!
>
