Would you consider including the file name as another metadata field to be indexed? I think your downstream Python code can do that easily.
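Here is a minimal sketch of what I mean, not a tested solution for your setup. It assumes each JSON file holds a single object (URL, author, title), that matching on the file name without its extension is what you want, and that you send the documents to Solr's JSON update handler yourself instead of using bin/post. The field name `file_name`, the function names, and the URL in the usage note are placeholders I made up.

```python
import json
import urllib.request
from pathlib import Path


def load_docs_with_filename(json_dir, field="file_name"):
    """Read every *.json file in json_dir and add its base name as a field.

    `field` is a hypothetical field name; use whatever your schema defines.
    """
    docs = []
    for path in sorted(Path(json_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            doc = json.load(fh)  # assumption: one JSON object per file
        # path.stem is the file name without ".json", so it can match the
        # name of the corresponding txt file (minus its extension).
        doc[field] = path.stem
        docs.append(doc)
    return docs


def post_to_solr(docs, update_url):
    """POST a list of documents as a JSON array to a Solr update URL."""
    body = json.dumps(docs).encode("utf-8")
    req = urllib.request.Request(
        update_url,
        data=body,  # supplying data makes this a POST request
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

You would call it roughly as `post_to_solr(load_docs_with_filename("json_dir"), "http://localhost:8983/solr/mycollection/update?commit=true")`, where the directory and collection name are placeholders. With around 90000 files you would probably want to send the documents in batches, not in one request, and make sure the txt side stores the file name in the same form (with or without extension).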
*------------------------------------------------*
*Sincerely yours,*
*Raymond*

On Fri, May 18, 2018 at 3:47 PM, S.Ashwath <ashwat...@gmail.com> wrote:

> Hello,
>
> I have 2 directories: 1 with txt files and the other with corresponding
> JSON (metadata) files (around 90000 of each). There is one JSON file for
> each txt file, and they share the same name (they don't share any other
> fields).
>
> The txt files just have plain text; I mapped each line to a field called
> 'sentence' and included the file name as a field using the data import
> handler. No problems here.
>
> The JSON file has metadata: 3 tags: a URL, author and title (for the
> content in the corresponding txt file).
> When I index the JSON file (I just used the _default schema, and posted the
> fields to the schema, as explained in the official Solr tutorial), *I don't
> know how to get the file name into the index as a field.* As far as I know,
> there's no way to use the Data import handler for JSON files. I've read that
> I can pass a literal through the bin/post tool, but again, as far as I
> understand, I can't pass in the file name dynamically as a literal.
>
> I NEED to get the file name; it is the only way in which I can associate
> the metadata with each sentence in the txt files in my downstream Python
> code.
>
> So if anybody has a suggestion about how I should index the JSON file name
> along with the JSON content (or even some workaround), I'd be eternally
> grateful.
>
> Regards,
>
> Ash
>