bq: We routinely store images and pdfs in Solr. There *is* a benefit, since you don't need to manage another storage system, you don't have to worry about Solr getting out of sync with the other system, you can use Solr replication for all your assets, etc.
Does the same hold for large blobs such as images, audio, and video as well? Tika supports multiple file formats (http://tika.apache.org/1.5/formats.html), but I am not sure how well the Solr/Tika combination works for them. Storing PDFs and other documents in Solr could be useful, since Tika can extract metadata from the documents and make them discoverable.

Considering all the above cases, there should also be support for a File field type in Solr, like the other types (Date, Float, Int, Long, String, etc.), but it looks like there are only two file-related classes (http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/schema/), and both are for external file storage:

- ExternalFileField.java <http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/schema/ExternalFileField.java>
- ExternalFileFieldReloader.java <http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/schema/ExternalFileFieldReloader.java>

What type can be used in the schema when storing the files internally?

On Thu, Nov 13, 2014 at 3:48 AM, Jeon Woosung <jeonwoos...@gmail.com> wrote:

> How about this?
>
> First, define a field for filter query. It should be multivalued.
>
> Second, implements transformer to extract json dynamic fields, and put the
> dynamic fields into the solr field.
>
> For example,
>
> <fieldType name="terms" class="string" multivalued="true"/>
>
> Data : {a:1,b:2,c:3}
>
> You can split the data to "a:1", "b:2", "c:3", and put them into terms.
>
> And then you can use filter query like "fq=terms:a:1"
> On Nov 13, 2014, at 3:59 AM, "Michael Sokolov" <msoko...@safaribooksonline.com> wrote:
>
> > We routinely store images and pdfs in Solr. There *is* a benefit, since
> > you don't need to manage another storage system, you don't have to worry
> > about Solr getting out of sync with the other system, you can use Solr
> > replication for all your assets, etc.
> >
> > I don't use DIH, so personally I don't care whether it handles blobs, but
> > it does seem like a natural extension for a system that indexes data from
> > SQL in Solr.
> >
> > -Mike
> >
> >
> > On 11/12/2014 01:31 PM, Anurag Sharma wrote:
> >
> >> BLOB is non-searchable field so there is no benefit of storing it into
> >> Solr. Any external key-value store can be used to store the blob and
> >> reference of this blob can be stored as a string field in Solr.
> >>
> >> On Wed, Nov 12, 2014 at 5:56 PM, stockii <stock.jo...@googlemail.com>
> >> wrote:
> >>
> >>> I had a similar problem and didnt find any solution to use the fields in
> >>> JSON Blob for a filter ... Not with DIH.
> >>>
> >>>
> >>>
> >>> --
> >>> View this message in context:
> >>> http://lucene.472066.n3.nabble.com/DIH-Blob-data-tp4168896p4168925.html
> >>> Sent from the Solr - User mailing list archive at Nabble.com.
> >>>
> >>>
> >
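
P.S. As a footnote to Jeon's suggestion quoted above, here is a rough Python sketch of the client-side splitting step. It is only an illustration under some assumptions: the JSON blob is a flat object with string or number values, the function names json_to_terms and filter_query are made up for this example, and the target field is the multivalued "terms" field from the quoted message. Since ":" separates field and value in Solr query syntax, the sketch quotes the term in the fq value rather than sending terms:a:1 as written.

```python
import json


def json_to_terms(blob):
    """Flatten a flat JSON object into "key:value" strings.

    Assumes no nested objects or arrays; each pair becomes one value
    for a multivalued string field.
    """
    data = json.loads(blob)
    return ["{}:{}".format(key, value) for key, value in data.items()]


def filter_query(field, term):
    """Build an fq value with the term in double quotes.

    Quoting keeps the colon inside the term from being read as a
    field separator; backslashes and quotes in the term are escaped.
    """
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return '{}:"{}"'.format(field, escaped)


if __name__ == "__main__":
    terms = json_to_terms('{"a": 1, "b": 2, "c": 3}')
    print(terms)
    print(filter_query("terms", terms[0]))
```

The list returned by json_to_terms would be sent as the values of the multivalued field at index time, and filter_query builds the matching fq parameter at query time. Whether this runs in a DIH transformer or in the indexing client is left open here.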