Hi,
I have a custom library that takes a file path as input and returns the
file content as a string.
My DB stores a file path in one of its tables, and I am using a DIH
configuration in Solr to do the indexing. I couldn't use TikaEntityProcessor
to index the files located on the file system. I thought of using a custom
Transformer to turn file_path into a file_content field in the row.
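
For reference, here is a minimal sketch of the transformer I have in mind. It
relies on my understanding that DIH can call a transformRow(Map) method on a
plain class that does not extend Transformer. The class name, the column names
file_path and file_content, the 10 MB cap, and readContent() (a stand-in for
my custom library) are all placeholders for illustration, not a tested setup.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

// Hypothetical DIH transformer: reads the file named in the "file_path"
// column and stores its text in a "file_content" column of the same row.
public class FilePathTransformer {

    // Assumed cap so a single huge file cannot exhaust the heap.
    static final long MAX_BYTES = 10L * 1024 * 1024;

    // Takes the current row as a Map and returns the (modified) row.
    public Object transformRow(Map<String, Object> row) {
        Object pathValue = row.get("file_path");
        if (pathValue == null) {
            return row; // nothing to read for this row
        }
        Path path = Paths.get(pathValue.toString());
        try {
            if (Files.size(path) > MAX_BYTES) {
                return row; // skip oversized files, leave the row unchanged
            }
            row.put("file_content", readContent(path));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return row;
    }

    // Stand-in for the custom library call; here it just reads UTF-8 text.
    static String readContent(Path path) throws IOException {
        return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    }
}
```

This still loads the whole file into memory as a String, which is exactly
the concern behind my first question below.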

I would like to know the following details:
1. Setting the file content as a string on a custom file_content field might
cause memory issues, since a very big file (hundreds of megabytes) would
consume that much RAM. Is it possible to send a stream as input to Solr?
What field type should be configured in schema.xml?
2. Is there any better approach than a custom transformer?
3. Is there any other recommended approach to indexing based on a file path?
Thanks a lot.
