Re: How to index data from multiple data source

2015-01-24 Thread Yusniel Hidalgo
Thanks Alex, indeed, the relative path to the PDF document is stored in the database. I will try your approach. Regards, Yusniel Hidalgo - Original message - From: "Alexandre Rafalovitch" To: "solr-user" Sent: Saturday, January 24, 2015 18:19:48 Subject: R

Re: How to index data from multiple data source

2015-01-24 Thread Alexandre Rafalovitch
You could use nested entities in DIH. So, if you store - for example - path to the PDF in the database, you could do a nested entity with TikaEntityProcessor to load the content. Just make sure the field names do not conflict. Regards, Alex. Sign up for my Solr resources newsletter at ht
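
DIH itself is configured in XML, so the following is not DIH configuration. It is only a rough Python sketch of the row-then-file join that a nested entity performs: an outer query yields one row per paper, and an inner step loads content from the path stored in that row. The table name `papers`, its columns, the field name `fulltext`, and the `extract_text` placeholder (standing in for a real extractor such as Tika) are all assumptions made for illustration.

```python
import sqlite3


def extract_text(path):
    # Hypothetical placeholder: a real setup would run an extractor
    # such as Tika on the file at `path`.
    return "text of " + path


def build_documents(conn, extract):
    """Outer loop over database rows, inner lookup by stored path."""
    docs = []
    query = "SELECT id, title, pdf_path FROM papers ORDER BY id"
    for doc_id, title, pdf_path in conn.execute(query):
        outer = {"id": doc_id, "title": title}
        inner = {"fulltext": extract(pdf_path)}
        # Mirror the advice in the reply: field names must not conflict.
        clash = set(outer) & set(inner)
        if clash:
            raise ValueError("conflicting field names: %s" % sorted(clash))
        outer.update(inner)
        docs.append(outer)
    return docs


# In-memory database with made-up rows, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE papers (id TEXT PRIMARY KEY, title TEXT, pdf_path TEXT)")
conn.executemany(
    "INSERT INTO papers VALUES (?, ?, ?)",
    [("p1", "First Paper", "pdfs/p1.pdf"), ("p2", "Second Paper", "pdfs/p2.pdf")],
)
docs = build_documents(conn, extract_text)
```

Each resulting dictionary carries both the database fields and the extracted content under a single `id`, which is the shape a nested entity is meant to produce.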

How to index data from multiple data source

2015-01-24 Thread Yusniel Hidalgo
Dear Solr community, I have recently started working with Solr and I need help with the following usage scenario. I am working on a project to extract and search bibliographic metadata from PDF files. First, my PDF files are processed to extract bibliographic metadata such as title, authors, affiliations

Re: How to index data from multiple data source

2015-01-21 Thread Diego Pino
Hi Yusniel, Solr manages documents as a whole. This means that updating an existing document replaces it. So you could index metadata and full text in one step, as one Solr document under one unique ID. That would be the simplest case. You could also use nested child documents to use bloc
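
The replace-on-update behaviour described in the reply above can be illustrated with a toy model. This is a minimal sketch, not Solr client code: `TinyIndex` is a hypothetical in-memory stand-in for a Solr core, and the field names (`title`, `authors`, `fulltext`) are assumptions. It shows why sending metadata and full text as two separate documents with the same ID loses the metadata, and why merging them first does not.

```python
import json


class TinyIndex:
    """Toy stand-in for a Solr core: adding a document replaces
    any existing document that has the same id."""

    def __init__(self):
        self.docs = {}

    def add(self, doc):
        self.docs[doc["id"]] = dict(doc)


def build_document(doc_id, metadata, full_text):
    # Merge metadata and extracted text into one document under one ID.
    doc = {"id": doc_id}
    doc.update(metadata)
    doc["fulltext"] = full_text
    return doc


index = TinyIndex()

# Two-step indexing: the second add replaces the first one entirely.
index.add({"id": "paper-1", "title": "An Example Paper", "authors": ["A. Author"]})
index.add({"id": "paper-1", "fulltext": "Body text extracted from the PDF."})
two_step = index.docs["paper-1"]

# One-step indexing: metadata and full text travel together.
merged = build_document(
    "paper-1",
    {"title": "An Example Paper", "authors": ["A. Author"]},
    "Body text extracted from the PDF.",
)
index.add(merged)

# A JSON list of documents, as could be sent to an update handler.
payload = json.dumps([merged])
```

After the two-step sequence, `two_step` contains only `id` and `fulltext`; after the merged add, the stored document has all fields.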

Re: How to index data from multiple data source

2015-01-21 Thread Shawn Heisey
On 1/20/2015 10:43 PM, Yusniel Hidalgo Delgado wrote: > I have recently started working with Solr and I need help with the following usage > scenario. I am working on a project to extract and search bibliographic > metadata from PDF files. First, my PDF files are processed to extract > bibliographic metada

Re: How to index data from multiple data source

2015-01-20 Thread Alvaro Cabrerizo
Hi, You can find several examples online of configuring Tika + DIH to index PDFs (e.g. https://tuxdna.wordpress.com/2013/02/04/indexing-the-documents-stored-in-a-database-using-apache-solr-and-apache-tika/ ). Regards. On Jan 21, 2015 6:54 AM, "Yusniel Hidalgo Delgado" wrote: > > > Dear Solr co

How to index data from multiple data source

2015-01-20 Thread Yusniel Hidalgo Delgado
Dear Solr community, I have recently started working with Solr and I need help with the following usage scenario. I am working on a project to extract and search bibliographic metadata from PDF files. First, my PDF files are processed to extract bibliographic metadata such as title, authors, affilia