Re: Solr 6.4. Can't index MS Visio vsdx files

2017-07-04 Thread Charlie Hull
from POI. Best, Tim -----Original Message----- From: Gytis Mikuciunas [mailto:gyt...@gmail.com] Sent: Tuesday, April 11, 2017 1:56 PM To: solr-user@lucene.apache.org Subject: RE: Solr 6.4. Can't index MS Visio vsdx files Thanks for your responses. Are there any possibilities to i

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-07-03 Thread Allison, Timothy B.
Sorry. Yes, you'll have to update commons-compress to 1.14. -----Original Message----- From: Gytis Mikuciunas [mailto:gyt...@gmail.com] Sent: Monday, July 3, 2017 9:15 AM To: solr-user@lucene.apache.org Subject: Re: Solr 6.4. Can't index MS Visio vsdx files hi, So I'm ba

Re: Solr 6.4. Can't index MS Visio vsdx files

2017-07-03 Thread Gytis Mikuciunas
collections4 (which > is new in POI w Tika 1.14). (I assume you have already added curvesapi?) > > -----Original Message----- > From: Gytis Mikuciunas [mailto:gyt...@gmail.com] > Sent: Saturday, June 3, 2017 5:39 AM > To: solr-user@lucene.apache.org > Subject: RE: Solr 6.4. Ca

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-06-05 Thread Allison, Timothy B.
AM To: solr-user@lucene.apache.org Subject: RE: Solr 6.4. Can't index MS Visio vsdx files Great Tim. What do I need to do to integrate it into my current installation? On May 31, 2017 16:24, "Allison, Timothy B." wrote: Apache Tika 1.15 is now available. -----Original Message

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-06-03 Thread Gytis Mikuciunas
solr-user@lucene.apache.org Subject: RE: Solr 6.4. Can't index MS Visio vsdx files Probably better to ask on the Tika list. We'll push the release asap after PDFBox 2.0.6 is out. Andreas plans to cut the release candidate for PDFBox this Friday. Tika will probably have an RC by M

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-05-31 Thread Allison, Timothy B.
Apache Tika 1.15 is now available. -----Original Message----- From: Allison, Timothy B. [mailto:talli...@mitre.org] Sent: Tuesday, May 9, 2017 7:45 AM To: solr-user@lucene.apache.org Subject: RE: Solr 6.4. Can't index MS Visio vsdx files Probably better to ask on the Tika list. We'l

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-05-09 Thread Allison, Timothy B.
ts_pdfbox_2_0_6.tar.gz -----Original Message----- From: Gytis Mikuciunas [mailto:gyt...@gmail.com] Sent: Tuesday, May 9, 2017 7:17 AM To: solr-user@lucene.apache.org Subject: Re: Solr 6.4. Can't index MS Visio vsdx files Is there any news regarding Tika 1.15? Maybe it's already ready f

Re: Solr 6.4. Can't index MS Visio vsdx files

2017-05-09 Thread Gytis Mikuciunas
Mikuciunas [mailto:gyt...@gmail.com] > Sent: Wednesday, April 12, 2017 1:00 AM > To: solr-user@lucene.apache.org > Subject: Re: Solr 6.4. Can't index MS Visio vsdx files > > When will 1.15 be released? Maybe you have some beta version and I could > test it :) > > SAX sounds

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-04-12 Thread Allison, Timothy B.
est, Tim -----Original Message----- From: Gytis Mikuciunas [mailto:gyt...@gmail.com] Sent: Wednesday, April 12, 2017 1:00 AM To: solr-user@lucene.apache.org Subject: Re: Solr 6.4. Can't index MS Visio vsdx files When will 1.15 be released? Maybe you have some beta version and I could

Re: Solr 6.4. Can't index MS Visio vsdx files

2017-04-11 Thread Gytis Mikuciunas
a stream if there's an > exception, but that's often a sign that something needs to be fixed with > the parser. In short, the solution will come from POI. > > Best, > > Tim > > -----Original Message----- > From: Gytis Mikuciunas [mailto:gyt...@gm

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-04-11 Thread Allison, Timothy B.
er. In short, the solution will come from POI. Best, Tim -----Original Message----- From: Gytis Mikuciunas [mailto:gyt...@gmail.com] Sent: Tuesday, April 11, 2017 1:56 PM To: solr-user@lucene.apache.org Subject: RE: Solr 6.4. Can't index MS Visio vsdx files Thanks for your re

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-04-11 Thread Gytis Mikuciunas
Thanks for your responses. Are there any possibilities to ignore parsing errors and continue indexing? Right now Solr/Tika stops parsing the whole document if it hits any exception. On Apr 11, 2017 19:51, "Allison, Timothy B." wrote: > You might want to drop a note to the dev or user's list on Apac
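
One option worth checking for the question above: Solr's ExtractingRequestHandler accepts an ignoreTikaException request parameter, which asks Solr to skip Tika exceptions instead of failing the whole request (the extracted content may then be empty, with only metadata indexed). A minimal sketch of building such a request URL follows. The host, the core name "mycore", and the document id are assumptions for illustration, not values from this thread.

```python
from urllib.parse import urlencode


def extract_url(base_url: str, doc_id: str, ignore_tika_exception: bool = True) -> str:
    """Build an /update/extract URL for Solr's ExtractingRequestHandler.

    base_url is the core/collection URL, e.g. http://localhost:8983/solr/mycore
    (hypothetical). The file itself would be sent as the POST body.
    """
    params = {
        "literal.id": doc_id,
        "commit": "true",
        # Ask Solr to carry on when Tika throws instead of failing the request.
        "ignoreTikaException": "true" if ignore_tika_exception else "false",
    }
    return f"{base_url.rstrip('/')}/update/extract?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host, core and id; adjust to the local installation.
    print(extract_url("http://localhost:8983/solr/mycore", "doc1"))
```

This only hides the failure. As the replies in this thread note, an exception usually means something needs fixing in the parser itself.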

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-04-11 Thread Allison, Timothy B.
You might want to drop a note to the dev or user's list on Apache POI. I'm not extremely familiar with the vsd(x) portion of our code base. The first item ("PolylineTo") may be caused by a mismatch between your doc and the ooxml spec. The second item appears to be an unsupported feature. The thir

Re: Solr 6.4. Can't index MS Visio vsdx files

2017-04-11 Thread Gytis Mikuciunas
"org.apache.solr.common.SolrException", "root-error-class", "java.lang.ArrayIndexOutOfBoundsException" ] } } Regards, Gytis On Mon, Feb 6, 2017 at 6:54 PM, Allison, Timothy B. wrote: > Shouldn't have take

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-02-06 Thread Allison, Timothy B.
Sent: Monday, February 6, 2017 11:15 AM To: solr-user@lucene.apache.org Subject: Re: Solr 6.4. Can't index MS Visio vsdx files Tim, you saved my day ;) now vsdx files were indexed successfully. Thank you very much!!! summary: as a workaround I have in solr-6.4.0\contrib\extraction\lib: 1. oox

Re: Solr 6.4. Can't index MS Visio vsdx files

2017-02-06 Thread Gytis Mikuciunas
solr-user@lucene.apache.org > Subject: Re: Solr 6.4. Can't index MS Visio vsdx files > > sad, but didn't help. > > what I did: > > 1. stopped solr: bin\solr stop -p 80 > 2. removed poi-ooxml-schemas-3.15.jar from contrib\extraction\lib 3. add > ooxml-schemas-1.3.ja
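
The jar swap described in the steps above (stop Solr, remove the poi-ooxml-schemas jar from contrib\extraction\lib, add the full ooxml-schemas jar) can be sketched as a small script. This is only an illustration under stated assumptions: the function name is made up, the paths are passed in by the caller, and Solr is assumed to be stopped before it runs and restarted afterwards.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def swap_schema_jar(lib_dir: Path, full_schemas_jar: Path) -> List[str]:
    """Remove any poi-ooxml-schemas jar from lib_dir and copy in the full jar.

    Returns the names of the jars that were removed. Other POI jars
    (for example poi-ooxml-*.jar) are left alone.
    """
    removed = []
    for old_jar in sorted(lib_dir.glob("poi-ooxml-schemas-*.jar")):
        old_jar.unlink()
        removed.append(old_jar.name)
    shutil.copy2(full_schemas_jar, lib_dir / full_schemas_jar.name)
    return removed


if __name__ == "__main__":
    # Dry run in a temporary directory; the jar names here are placeholders.
    with tempfile.TemporaryDirectory() as tmp:
        lib = Path(tmp) / "lib"
        lib.mkdir()
        (lib / "poi-ooxml-schemas-3.15.jar").write_bytes(b"")
        full_jar = Path(tmp) / "ooxml-schemas-X.Y.jar"
        full_jar.write_bytes(b"")
        print(swap_schema_jar(lib, full_jar))
```

For a real installation, lib_dir would point at the contrib\extraction\lib folder of the Solr install and full_schemas_jar at the downloaded ooxml-schemas jar.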

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-02-06 Thread Allison, Timothy B.
-----Original Message----- From: Gytis Mikuciunas [mailto:gyt...@gmail.com] Sent: Monday, February 6, 2017 8:19 AM To: solr-user@lucene.apache.org Subject: Re: Solr 6.4. Can't index MS Visio vsdx files sad, but didn't help. what I did: 1. stopped solr: bin\solr stop -p 80 2. removed poi-oox

Re: Solr 6.4. Can't index MS Visio vsdx files

2017-02-06 Thread Gytis Mikuciunas
nd, you can > > rm "poi-ooxml-schemas" and add the full "ooxml-schemas", and you > > should be good to go. [3] > > > > Cheers, > > > > Tim > > > > [1] http://www.apache.org/dyn/closer.cgi/tika/tika-app-1.14.jar > > >

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-02-06 Thread Allison, Timothy B.
che. > tika$tika-app/artifact/org.apache.tika/tika-app/1.15- > 20170202.203920-124/tika-app-1.15-20170202.203920-124.jar > > [3] http://poi.apache.org/faq.html#faq-N10025 > > -----Original Message----- > From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] > Sent: Friday, F

Re: Solr 6.4. Can't index MS Visio vsdx files

2017-02-05 Thread Gytis Mikuciunas
14.jar > > [2] https://builds.apache.org/job/Tika-trunk/1193/org.apache. > tika$tika-app/artifact/org.apache.tika/tika-app/1.15- > 20170202.203920-124/tika-app-1.15-20170202.203920-124.jar > > [3] http://poi.apache.org/faq.html#faq-N10025 > > -----Original Message-----

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-02-03 Thread Gytis Mikuciunas
> tika$tika-app/artifact/org.apache.tika/tika-app/1.15- > 20170202.203920-124/tika-app-1.15-20170202.203920-124.jar > > [3] http://poi.apache.org/faq.html#faq-N10025 > > -----Original Message----- > From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] > Sent: Friday, Febru

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-02-03 Thread Allison, Timothy B.
mailto:arafa...@gmail.com] Sent: Friday, February 3, 2017 9:49 AM To: solr-user Subject: Re: Solr 6.4. Can't index MS Visio vsdx files This kind of information extraction comes from Apache Tika that is shipped with Solr. However Solr does not ship every possible parser with its installation. So,

Re: Solr 6.4. Can't index MS Visio vsdx files

2017-02-03 Thread Alexandre Rafalovitch
This kind of information extraction comes from Apache Tika that is shipped with Solr. However Solr does not ship every possible parser with its installation. So, I think you are hitting Tika where it manages to figure out what type of content you have, but does not have (Apache POI - another O/S pr