bq: We routinely store images and pdfs in Solr. There *is* a benefit, since
you don't need to manage another storage system, you don't have to worry
about Solr getting out of sync with the other system, you can use Solr
replication for all your assets, etc.

Does the same hold for large blobs such as images, audio, and video?
Tika supports multiple file formats (http://tika.apache.org/1.5/formats.html),
but I am not sure how well the Solr/Tika combination works. Storing PDFs and
other documents in Solr could be useful, since Tika can extract metadata from
them and make them discoverable.
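
For reference, the Solr/Tika path I have in mind is the extracting request
handler (Solr Cell) at /update/extract. Below is a minimal sketch of building
such a request URL. The host, the core name "collection1", the "attr_" dynamic
field prefix, and the "text" target field are assumptions taken from a typical
example setup, not something I have verified against any particular schema.

```python
from urllib.parse import urlencode


def build_extract_url(solr_base, doc_id, commit=True):
    """Build a Solr Cell (/update/extract) URL for indexing one file via Tika.

    solr_base is a hypothetical core URL; the field names below are
    assumptions about the schema and may need to change.
    """
    params = {
        "literal.id": doc_id,      # set the unique key of the new document
        "uprefix": "attr_",        # prefix for Tika metadata fields not in the schema
        "fmap.content": "text",    # map Tika's extracted body text to the "text" field
        "commit": "true" if commit else "false",
    }
    return f"{solr_base}/update/extract?{urlencode(params)}"


if __name__ == "__main__":
    print(build_extract_url("http://localhost:8983/solr/collection1", "doc1"))
```

The file itself would then be sent as the body of a POST to that URL. This
only indexes what Tika extracts; whether the raw bytes are also kept in Solr
is a separate schema decision.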

Given all of the above, Solr should also support a File field type alongside
the other types (Date, Float, Int, Long, String, etc.), but it looks like there
are only two file-related classes in
http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/schema/
and both are for external file storage:

   - ExternalFileField.java
     <http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/schema/ExternalFileField.java>
   - ExternalFileFieldReloader.java
     <http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/schema/ExternalFileFieldReloader.java>

What type can be used in the schema when storing files internally?


On Thu, Nov 13, 2014 at 3:48 AM, Jeon Woosung <jeonwoos...@gmail.com> wrote:

> How about this?
>
> First, define a field for the filter query. It should be multivalued.
>
> Second, implement a transformer to extract the JSON dynamic fields and put
> them into the Solr field.
>
> For example,
>
> <field name="terms" type="string" multiValued="true"/>
>
> Data : {a:1,b:2,c:3}
>
> You can split the data to "a:1", "b:2", "c:3", and put them into terms.
>
> And then you can use filter query like "fq=terms:a:1"
> On Nov 13, 2014, at 3:59 AM, "Michael Sokolov"
> <msoko...@safaribooksonline.com> wrote:
>
> > We routinely store images and pdfs in Solr. There *is* a benefit, since
> > you don't need to manage another storage system, you don't have to worry
> > about Solr getting out of sync with the other system, you can use Solr
> > replication for all your assets, etc.
> >
> > I don't use DIH, so personally I don't care whether it handles blobs, but
> > it does seem like a natural extension for a system that indexes data from
> > SQL in Solr.
> >
> > -Mike
> >
> >
> > On 11/12/2014 01:31 PM, Anurag Sharma wrote:
> >
> >> A BLOB is a non-searchable field, so there is no benefit to storing it in
> >> Solr. Any external key-value store can be used to store the blob, and a
> >> reference to it can be stored as a string field in Solr.
> >>
> >> On Wed, Nov 12, 2014 at 5:56 PM, stockii <stock.jo...@googlemail.com>
> >> wrote:
> >>
> >>> I had a similar problem and didn't find any solution to use the fields in
> >>> the JSON blob for a filter ... Not with DIH.
> >>>
> >>>
> >>>
> >>> --
> >>> View this message in context:
> >>> http://lucene.472066.n3.nabble.com/DIH-Blob-data-tp4168896p4168925.html
> >>> Sent from the Solr - User mailing list archive at Nabble.com.
> >>>
> >>>
> >
>
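
Jeon's splitting suggestion quoted above could be sketched roughly as follows.
This is only an illustration of the idea, not a DIH transformer: the function
names are made up, the sample data is written as valid JSON, and only flat
objects are handled. The filter is emitted as a quoted phrase because an
unquoted colon inside the value would otherwise be read as a field separator
by the query parser.

```python
import json


def json_to_terms(blob):
    """Split a flat JSON object into "key:value" strings for a multivalued field."""
    data = json.loads(blob)
    return [f"{key}:{value}" for key, value in data.items()]


def terms_filter(key, value, field="terms"):
    """Build an fq value that matches one "key:value" term exactly.

    "terms" is the field name from the quoted example; the quoting keeps
    the inner colon from being parsed as field syntax.
    """
    return f'{field}:"{key}:{value}"'


if __name__ == "__main__":
    print(json_to_terms('{"a": 1, "b": 2, "c": 3}'))
    print("fq=" + terms_filter("a", 1))
```

Nested JSON values would need flattening first, which this sketch does not
attempt.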
