I just pushed a fix for TIKA-2861. If you can either build locally or wait a few hours for Jenkins to build #182, let me know if that works with straight tika-app.jar.
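
For anyone who wants to try this outside of Solr, here is a rough sketch of one way to check (not an official procedure). tika-app's -m option prints only the metadata, so the output can be filtered for location fields. The jar path, the input file name, and the gps_fields helper are placeholders, and the gps/geo pattern is only an assumption about how the relevant keys are named.

```shell
# Hypothetical helper: keep only metadata lines that mention GPS or geo.
# The pattern is an assumption about key naming, not something Tika guarantees.
gps_fields() {
  grep -i -E 'gps|geo'
}

# Example usage (jar path and input file are placeholders):
#   java -jar tika-app.jar -m 1.mp4 | gps_fields
# An empty result means no matching metadata keys were extracted.
```

Running the same filter against a JPEG that is known to carry GPS data gives a baseline to compare the MP4 output with.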

On Thu, May 2, 2019 at 5:00 AM Where is Where <whis...@gmail.com> wrote:
>
> Thank you Alex and Tim.
> I have looked at the solrconfig.xml file (I am trying the techproducts demo
> config), the only related place I can find is the extract handle
>
> <requestHandler name="/update/extract"
>                 startup="lazy"
>                 class="solr.extraction.ExtractingRequestHandler" >
>   <lst name="defaults">
>     <str name="lowernames">true</str>
>     <!--<str name="uprefix">ignored_</str>-->
>
>     <!-- capture link hrefs but ignore div attributes -->
>     <str name="captureAttr">true</str>
>     <str name="fmap.a">links</str>
>     <str name="fmap.div">ignored_</str>
>   </lst>
> </requestHandler>
>
> I am using this command bin/post -c techproducts example/exampledocs/1.mp4
> -params "literal.id=mp4_1&uprefix=attr_"
>
> I have tried commenting out <str name="uprefix">ignored_</str> and changing
> to <str name="fmap.div">div</str>
> but still not working. I don't quite get why image is getting gps etc
> metadata but video is acting differently while it is using the same
> solrconfig and the gps metadata are in the same fields. There is no
> differentiation in solrconfig setting between image and video.
>
> Tim yes this is related to the TIKA link. Thank you!
>
> Here is the output in solr for mp4.
>
> {
>   "attr_meta":["stream_size",
>     "5721559",
>     "date",
>     "2019-03-29T04:36:39Z",
>     "X-Parsed-By",
>     "org.apache.tika.parser.DefaultParser",
>     "X-Parsed-By",
>     "org.apache.tika.parser.mp4.MP4Parser",
>     "stream_content_type",
>     "application/octet-stream",
>     "meta:creation-date",
>     "2019-03-29T04:36:39Z",
>     "Creation-Date",
>     "2019-03-29T04:36:39Z",
>     "tiff:ImageLength",
>     "1080",
>     "resourceName",
>     "/Volumes/Data/inData/App/solr/example/exampledocs/1.mp4",
>     "dcterms:created",
>     "2019-03-29T04:36:39Z",
>     "dcterms:modified",
>     "2019-03-29T04:36:39Z",
>     "Last-Modified",
>     "2019-03-29T04:36:39Z",
>     "Last-Save-Date",
>     "2019-03-29T04:36:39Z",
>     "xmpDM:audioSampleRate",
>     "1000",
>     "meta:save-date",
>     "2019-03-29T04:36:39Z",
>     "modified",
>     "2019-03-29T04:36:39Z",
>     "tiff:ImageWidth",
>     "1920",
>     "xmpDM:duration",
>     "2.64",
>     "Content-Type",
>     "video/mp4"],
>   "id":"mp4_4",
>   "attr_stream_size":["5721559"],
>   "attr_date":["2019-03-29T04:36:39Z"],
>   "attr_x_parsed_by":["org.apache.tika.parser.DefaultParser",
>     "org.apache.tika.parser.mp4.MP4Parser"],
>   "attr_stream_content_type":["application/octet-stream"],
>   "attr_meta_creation_date":["2019-03-29T04:36:39Z"],
>   "attr_creation_date":["2019-03-29T04:36:39Z"],
>   "attr_tiff_imagelength":["1080"],
>
>   "resourcename":"/Volumes/Data/inData/App/solr/example/exampledocs/1.mp4",
>   "attr_dcterms_created":["2019-03-29T04:36:39Z"],
>   "attr_dcterms_modified":["2019-03-29T04:36:39Z"],
>   "last_modified":"2019-03-29T04:36:39Z",
>   "attr_last_save_date":["2019-03-29T04:36:39Z"],
>   "attr_xmpdm_audiosamplerate":["1000"],
>   "attr_meta_save_date":["2019-03-29T04:36:39Z"],
>   "attr_modified":["2019-03-29T04:36:39Z"],
>   "attr_tiff_imagewidth":["1920"],
>   "attr_xmpdm_duration":["2.64"],
>   "content_type":["video/mp4"],
>   "content":[" \n \n \n \n \n \n \n \n \n \n \n \n \n
> \n \n \n \n \n \n \n \n \n \n "],
>   "_version_":1632383499325407232}]
> }}
>
> JPEG is getting these:
> "attr_meta":[....
>   "GPS Latitude",
>   "37° 47' 41.99\"",
>   ....
> "attr_gps_latitude":["37° 47' 41.99\""],
>
>
> On Wed, May 1, 2019 at 2:57 PM Where is Where <whis...@gmail.com> wrote:
>
> > uploading video to solr via tika
> > https://lucene.apache.org/solr/guide/7_7/uploading-data-with-solr-cell-using-apache-tika.html
> > The index has no video GPS metadata which is extracted and indexed for
> > images such as jpeg. I have checked both MP4 and MOV files, the files I
> > checked all have GPS Exif data embedded in the same fields as image. Any
> > idea? Thanks!
> >