Re: Apache SOLR Design Query

2018-05-13 Thread Rahul Singh
This is a good start. A few things to consider:
1. Extract the contents with Tika, either run externally or via Tika Server.
2. Create a canonical “Item” document schema with fields such as title, metadata, contents, imagePreview (something to consider), etc.
3. Use the extracted Tika data to populate your index.
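The three steps above can be sketched roughly as follows. This is only an illustration under assumptions, not a tested setup: it assumes a Tika Server on its default port 9998, a Solr collection hypothetically named "items", and field names (id, path, title, contentType, contents) chosen here to mirror the suggested “Item” schema. The metadata keys "dc:title", "Content-Type" and "X-TIKA:content" are the ones Tika's /rmeta endpoint commonly returns, but they vary by file type, so the code falls back to the file name.

```python
import hashlib
import json
import ntpath  # splits on both "/" and "\", handy for Windows network paths
import urllib.request

# Assumptions: local Tika Server and a hypothetical Solr collection "items".
TIKA_URL = "http://localhost:9998/rmeta/text"
SOLR_UPDATE_URL = "http://localhost:8983/solr/items/update?commit=true"


def extract_with_tika(path):
    """Step 1: send the raw file to Tika Server, get a list of metadata dicts."""
    with open(path, "rb") as f:
        data = f.read()
    req = urllib.request.Request(
        TIKA_URL, data=data, method="PUT",
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def first_value(value):
    """Tika metadata values may be a string or a list; take one string."""
    if isinstance(value, list):
        return value[0] if value else None
    return value


def build_item(path, rmeta):
    """Step 2: map Tika output onto the canonical (hypothetical) Item schema."""
    main = rmeta[0] if rmeta else {}  # first entry describes the container file
    title = first_value(main.get("dc:title")) or ntpath.basename(path)
    return {
        "id": hashlib.sha1(path.encode("utf-8")).hexdigest(),
        "path": path,
        "title": title,
        "contentType": first_value(main.get("Content-Type")) or "",
        "contents": (first_value(main.get("X-TIKA:content")) or "").strip(),
    }


def index_items(items):
    """Step 3: post a JSON array of Item documents to Solr's update handler."""
    req = urllib.request.Request(
        SOLR_UPDATE_URL, data=json.dumps(items).encode("utf-8"), method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

# Usage, with both servers running:
#   index_items([build_item(p, extract_with_tika(p)) for p in paths])
```

Keeping extraction outside Solr this way means a malformed file only fails the client-side call for that one document, and the same Item records could be re-indexed later without re-parsing the originals if they are stored somewhere in between.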

Apache SOLR Design Query

2018-05-12 Thread NetUser MSUser
Hi team, We have the following business case. There are nearly 150 GB of documents (pdf/ppt/word/xl/msg files) currently stored on a network path. To implement text search over these, we are planning to use Solr. The plan is listed below. 1) Using a high configuration W