Would you consider including the filename as another metadata field to be
indexed? I think your downstream Python code can do that easily.
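
A minimal sketch of that idea, assuming each JSON file holds a single JSON object and that the extra field is called `filename` (both the field name and the function names below are made up for illustration, not anything Solr requires). It reads every JSON file in a directory, adds the file's base name to the document, and writes all documents out as one JSON array:

```python
import json
from pathlib import Path


def load_docs_with_filename(json_dir, field="filename"):
    """Read every *.json file in json_dir and add its base name as a field.

    Assumes each file contains exactly one JSON object (url, author, title).
    The field name "filename" is an illustrative choice.
    """
    docs = []
    for path in sorted(Path(json_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            doc = json.load(fh)
        # path.stem is the name without ".json", so it matches the base
        # name of the corresponding txt file.
        doc[field] = path.stem
        docs.append(doc)
    return docs


def write_batch(docs, out_path):
    """Write the documents to out_path as a single JSON array."""
    with open(out_path, "w", encoding="utf-8") as fh:
        json.dump(docs, fh, ensure_ascii=False)
```

Something like `write_batch(load_docs_with_filename("metadata"), "metadata_with_names.json")` would then give one file that can be posted the same way as in the tutorial, with the `filename` field added to the schema like the other fields. The directory and output names here are placeholders.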


*------------------------------------------------*
*Sincerely yours,*


*Raymond*

On Fri, May 18, 2018 at 3:47 PM, S.Ashwath <ashwat...@gmail.com> wrote:

> Hello,
>
> I have 2 directories: 1 with txt files and the other with corresponding
> JSON (metadata) files (around 90000 of each). There is one JSON file for
> each txt file, and they share the same name (they don't share any other
> fields).
>
> The txt files just have plain text. I mapped each line to a field called
> 'sentence' and included the file name as a field using the data import
> handler. No problems here.
>
> The JSON file has metadata: 3 tags: a URL, author and title (for the
> content in the corresponding txt file).
> When I index the JSON file (I just used the _default schema, and posted the
> fields to the schema, as explained in the official Solr tutorial), *I don't
> know how to get the file name into the index as a field.* As far as I know,
> there's no way to use the Data Import Handler for JSON files. I've read that
> I can pass a literal through the bin/post tool, but again, as far as I
> understand, I can't pass in the file name dynamically as a literal.
>
> I NEED to get the file name, it is the only way in which I can associate
> the metadata with each sentence in the txt files in my downstream Python
> code.
>
> So if anybody has a suggestion about how I should index the JSON file name
> along with the JSON content (or even some workaround), I'd be eternally
> grateful.
>
> Regards,
>
> Ash
>
