> 1- How to combine data from DIH and content extracted from file system
> document into one document in the index?

http://wiki.apache.org/solr/TikaEntityProcessor

You can have one SQL entity that retrieves the metadata from the database and
another entity nested inside it that parses the binary file into additional
fields of the same document.
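Something along these lines in data-config.xml should work. This is only a
rough sketch: the table name "documents", the columns id, title and file_path,
the target field "content", and the JDBC driver/URL/credentials are all made up
for illustration, so substitute your own. The inner entity reads the file path
from the current row of the outer entity.

```xml
<dataConfig>
  <!-- Placeholder JDBC settings: driver, url, user and password are assumptions -->
  <dataSource name="db" type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/mydb"
              user="solr" password="secret"/>
  <!-- Binary source that TikaEntityProcessor reads the files through -->
  <dataSource name="bin" type="BinFileDataSource"/>

  <document>
    <!-- Outer entity: one row per document, metadata from the database.
         Table and column names here are hypothetical. -->
    <entity name="doc" dataSource="db"
            query="SELECT id, title, file_path FROM documents">
      <field column="id" name="id"/>
      <field column="title" name="title"/>

      <!-- Nested entity: parses the file referenced by the current row
           and adds the extracted text to the same Solr document -->
      <entity name="file" dataSource="bin"
              processor="TikaEntityProcessor"
              url="${doc.file_path}" format="text">
        <field column="text" name="content"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

The fields named here (id, title, content) would also have to exist in your
schema.xml, and depending on the database the column names returned by JDBC may
come back in upper case, in which case the column attributes and the
${doc.file_path} reference need to match.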
> 2- Should I move the per-user permissions into a separate index? What
> technique to implement?

I would start with keeping permissions in the same index as the actual content.

On Tue, Nov 23, 2010 at 11:35 AM, Darx Oman <darxo...@gmail.com> wrote:
> Hi guys
>
> I'm kind of new to solr and I'm wondering how to configure solr to best
> fulfills my requirements.
>
> Requirements are as follow:
>
> I have 2 data sources: database and file system documents. Every document in
> the file system has related information stored in the database. Both the
> file content and the related database fields must be indexed. Along with
> the DB data is per-user permissions for every document. I'm using DIH for
> the DB and Tika for the file System. The documents contents nearly never
> change, while the DB data especially the permissions changes very
> frequently. Total number of documents roughly around 2M and each document is
> about 500KB.
>
> 1- How to combine data from DIH and content extracted from file system
> document into one document in the index?
>
> 2- Should I move the per-user permissions into a separate index? What
> technique to implement?
>