Thanks, Alexey.
I downloaded Solr 4 and set up the TikaEntityProcessor. It worked fine
with Tika 0.6, but it didn't work with Tika 0.7 or the Tika 0.8 SNAPSHOT.
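
For anyone following along, here is a minimal sketch of the kind of DIH
data-config.xml this setup implies: one SQL entity for the database
metadata and a nested TikaEntityProcessor entity for the file content, as
Alexey suggested. The JDBC driver and URL, the table and column names
(documents, file_path) and the Solr field names (id, title, content) are
assumptions for illustration only, not the actual config.

```xml
<dataConfig>
  <!-- Assumed JDBC settings: replace the driver, URL and credentials with your own -->
  <dataSource name="db" type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/docs"
              user="solr" password="secret"/>
  <!-- Binary data source that hands the file stream to Tika -->
  <dataSource name="bin" type="BinFileDataSource"/>
  <document>
    <!-- Outer entity: one row per document, read from a hypothetical "documents" table -->
    <entity name="doc" dataSource="db"
            query="SELECT id, title, file_path FROM documents">
      <field column="id" name="id"/>
      <field column="title" name="title"/>
      <!-- Nested entity: parses the file referenced by the current row
           and adds the extracted text to the same Solr document -->
      <entity name="file" dataSource="bin"
              processor="TikaEntityProcessor"
              url="${doc.file_path}" format="text">
        <field column="text" name="content"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

The nested entity runs once per row of the outer query, so the database
fields and the extracted text end up in a single indexed document.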


On Sat, Nov 27, 2010 at 4:05 AM, Alexey Serba <ase...@gmail.com> wrote:

> > 1-      How to combine data from DIH and content extracted from file
> > system document into one document in the index?
> http://wiki.apache.org/solr/TikaEntityProcessor
> You can have one SQL entity that retrieves metadata from the database and
> another nested entity that parses the binary file into additional fields
> in the document.
>
> > 2-      Should I move the per-user permissions into a separate index?
> > What technique to implement?
> I would start with keeping permissions in the same index as the actual
> content.
>
>
> On Tue, Nov 23, 2010 at 11:35 AM, Darx Oman <darxo...@gmail.com> wrote:
> > Hi guys
> >
> > I'm kind of new to Solr and I'm wondering how to configure Solr to best
> > fulfill my requirements.
> >
> > Requirements are as follows:
> >
> > I have 2 data sources: a database and file system documents. Every
> > document in the file system has related information stored in the
> > database.  Both the file content and the related database fields must be
> > indexed.  Along with the DB data, there are per-user permissions for
> > every document.  I'm using DIH for the DB and Tika for the file system.
> > The document contents almost never change, while the DB data, especially
> > the permissions, changes very frequently. The total number of documents
> > is roughly 2M, and each document is about 500KB.
> >
> > 1-      How to combine data from DIH and content extracted from file
> > system document into one document in the index?
> >
> > 2-      Should I move the per-user permissions into a separate index?
> > What technique to implement?
> >
>
