RE: Odp.: solr issue with pdf forms

2015-04-30 Thread Davis, Daniel (NIH/NLM) [C]
ck Erickson [mailto:erickerick...@gmail.com] Sent: Thursday, April 30, 2015 11:28 AM To: solr-user@lucene.apache.org Subject: Re: Odp.: solr issue with pdf forms Jack: I keep forgetting those things exist, thanks for the reminder! On Thu, Apr 30, 2015 at 8:23 AM, Jack Krupansky wrote: > Or use a S

Re: Odp.: solr issue with pdf forms

2015-04-30 Thread Erick Erickson
Daz >> > >> > Best >> > Steve >> > >> > -Original Message- >> > From: Allison, Timothy B. [mailto:talli...@mitre.org] >> > Sent: Wednesday, 29 April 2015 14:16 >> > To: solr-user@lucene.apache.org >> >

Re: Odp.: solr issue with pdf forms

2015-04-30 Thread Jack Krupansky
nachweise bei.^HDaz > > > > Best > > Steve > > > > -Original Message- > > From: Allison, Timothy B. [mailto:talli...@mitre.org] > > Sent: Wednesday, 29 April 2015 14:16 > > To: solr-user@lucene.apache.org > > Cc: u...@tika.apache.org

Re: Odp.: solr issue with pdf forms

2015-04-30 Thread Erick Erickson
Sent: Wednesday, 29 April 2015 14:16 > To: solr-user@lucene.apache.org > Cc: u...@tika.apache.org > Subject: RE: Odp.: solr issue with pdf forms > > I completely agree with Erick about the utility of the TermsComponent to see > what is actually being indexed. If you find probl
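
One way to see what was actually indexed, as suggested in the message above, is to ask the TermsComponent for the terms of a single field. The following is a minimal Python sketch that only builds the request URL. The base URL, the core name `collection1`, the field name `content`, and the existence of a `/terms` request handler in solrconfig.xml are assumptions for illustration, not details taken from this thread.

```python
from urllib.parse import urlencode


def terms_url(base_url, core, field, limit=50, prefix=None):
    """Build a TermsComponent request URL for one field.

    Assumes the core exposes a /terms request handler (hypothetical setup).
    """
    params = {
        "terms": "true",       # enable the TermsComponent
        "terms.fl": field,     # field whose indexed terms are listed
        "terms.limit": limit,  # maximum number of terms returned
        "wt": "json",          # response format
    }
    if prefix:
        # Restrict the listing to terms starting with this prefix.
        params["terms.prefix"] = prefix
    return f"{base_url.rstrip('/')}/{core}/terms?{urlencode(params)}"


# Hypothetical core and field names.
url = terms_url("http://localhost:8983/solr", "collection1", "content")
print(url)
```

Opening the resulting URL in a browser (or fetching it with any HTTP client) should list indexed terms with their document counts, which makes it easier to tell whether the odd characters ended up in the index or only in the stored value.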

AW: Odp.: solr issue with pdf forms

2015-04-30 Thread Steve.Scholl
@gmail.com] Sent: Tuesday, April 28, 2015 9:07 PM To: solr-user@lucene.apache.org Subject: Re: Odp.: solr issue with pdf forms There better be. 1> go to the admin UI 2> select a core 3> select "schema browser" 4> select a field from the drop-down Until you do step 4 the window w

AW: Odp.: solr issue with pdf forms

2015-04-29 Thread Steve.Scholl
kson [mailto:erickerick...@gmail.com] Sent: Wednesday, 29 April 2015 16:07 To: solr-user@lucene.apache.org Subject: Re: Odp.: solr issue with pdf forms Steve: I'd just look at one field at a time. Presumably you have a field that's displaying poorly, "content"? Just look at

Re: Odp.: solr issue with pdf forms

2015-04-29 Thread Erick Erickson
e pdfbox-app.jar (ExtractText option) > on your files outside of Solr to see what text/noise you're getting for the > files that are causing problems. > > > > -Original Message- > From: Erick Erickson [mailto:erickerick...@gmail.com] > Sent: Tuesday, April 28,
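
The suggestion above, running the PDFBox command-line app outside of Solr, can be sketched as follows. This Python snippet only assembles the command; the jar filename and the PDF and text file paths are hypothetical, and the `ExtractText` subcommand is the one named in the message (newer PDFBox releases may name it differently).

```python
import shlex


def extract_text_command(jar_path, pdf_path, txt_path):
    """Return the argument list for PDFBox's ExtractText tool."""
    return ["java", "-jar", jar_path, "ExtractText", pdf_path, txt_path]


# Hypothetical file names; substitute the jar version and PDF you actually have.
cmd = extract_text_command("pdfbox-app.jar", "form.pdf", "form.txt")
print(shlex.join(cmd))
```

To actually run it, the list can be passed to `subprocess.run(cmd, check=True)`. Comparing the resulting text file with what Solr returns helps narrow down whether the noise comes from text extraction or from a later step.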

RE: Odp.: solr issue with pdf forms

2015-04-29 Thread Allison, Timothy B.
day, April 28, 2015 9:07 PM To: solr-user@lucene.apache.org Subject: Re: Odp.: solr issue with pdf forms There better be. 1> go to the admin UI 2> select a core 3> select "schema browser" 4> select a field from the drop-down Until you do step 4 the window will be pr

AW: Odp.: solr issue with pdf forms

2015-04-29 Thread Steve.Scholl
@lucene.apache.org Subject: Re: Odp.: solr issue with pdf forms There better be. 1> go to the admin UI 2> select a core 3> select "schema browser" 4> select a field from the drop-down Until you do step 4 the window will be pretty blank. Here's the info for TermsCompon

Re: Odp.: solr issue with pdf forms

2015-04-28 Thread Erick Erickson
fault configured. > > Thanks a lot > Best > Steve > > -Original Message- > From: Erick Erickson [mailto:erickerick...@gmail.com] > Sent: Monday, 27 April 2015 17:23 > To: solr-user@lucene.apache.org > Subject: Re: Odp.: solr issue with pdf forms > > W

AW: Odp.: solr issue with pdf forms

2015-04-28 Thread Steve.Scholl
erickerick...@gmail.com] Sent: Monday, 27 April 2015 17:23 To: solr-user@lucene.apache.org Subject: Re: Odp.: solr issue with pdf forms We're still not quite there. There should be a "load term info" button on that page. Clicking that button will show you the terms in your index (as o

Re: Odp.: solr issue with pdf forms

2015-04-27 Thread Erick Erickson
inct: 160403 > > Does this somehow help to figure out the issue? > Thanks > Best > Steve > > > -Original Message- > From: Erick Erickson [mailto:erickerick...@gmail.com] > Sent: Friday, 24 April 2015 20:15 > To: solr-user@lucene.apache.org > Be

AW: Odp.: solr issue with pdf forms

2015-04-26 Thread Steve.Scholl
rg Subject: Re: Odp.: solr issue with pdf forms Steve: Right, it's not exactly obvious. Bring up the admin UI, something like http://localhost:8983/solr. From there you have to select a core in the 'core selector' drop-down on the left side. If you're using SolrCloud, this wi

Re: Odp.: solr issue with pdf forms

2015-04-24 Thread Erick Erickson
admin schema browser, but what > should I see there? Sorry I'm not familiar with the admin schema browser. :-( > > Best > Steve > > > -Original Message- > From: Erick Erickson [mailto:erickerick...@gmail.com] > Sent: Thursday, 23 April 2015 18:00

AW: Odp.: solr issue with pdf forms

2015-04-24 Thread Steve.Scholl
April 2015 18:00 To: solr-user@lucene.apache.org Subject: Re: Odp.: solr issue with pdf forms When you say "they're not indexed correctly", what's your evidence? You cannot rely on the display in the browser; that's the raw input just as it was sent to Solr, _not_ the actua

Re: Odp.: solr issue with pdf forms

2015-04-23 Thread Dan Davis
fields) created with "Adobe InDesign > CS5 (7.0.1)" are indexed with the blank space issue > > > > Best > > Steve > > > > -Original Message- > > From: Erick Erickson [mailto:erickerick...@gmail.com] > > Sent: Wednesday, 22 April

Re: Odp.: solr issue with pdf forms

2015-04-23 Thread Erick Erickson
üngliche Nachricht- > From: Erick Erickson [mailto:erickerick...@gmail.com] > Sent: Wednesday, 22 April 2015 17:11 > To: solr-user@lucene.apache.org > Subject: Re: Odp.: solr issue with pdf forms > > Are they not _indexed_ correctly or not being displayed correctly? > Ta

AW: Odp.: solr issue with pdf forms

2015-04-23 Thread Steve.Scholl
gmail.com] Sent: Wednesday, 22 April 2015 17:11 To: solr-user@lucene.apache.org Subject: Re: Odp.: solr issue with pdf forms Are they not _indexed_ correctly or not being displayed correctly? Take a look at admin UI>>schema browser>> your field and press the "load terms" butto

Re: solr issue with pdf forms

2015-04-22 Thread Dan Davis
Steve, Are you using ExtractingRequestHandler / DataImportHandler or extracting the text content from the PDF outside of Solr? On Wed, Apr 22, 2015 at 6:40 AM, wrote: > Hi guys, > > hopefully you can help me with my issue. We are using a solr setup and > have the following issue: > - usual pdf

Re: Odp.: solr issue with pdf forms

2015-04-22 Thread Dan Davis
orry I didn't get the point. > > :-( > > > > > > -Original Message- > > From: LAFK [mailto:tomasz.bo...@gmail.com] > > Sent: Wednesday, 22 April 2015 14:01 > > To: solr-user@lucene.apache.org; solr-user@lucene.apache.org

Re: Odp.: solr issue with pdf forms

2015-04-22 Thread Erick Erickson
t are you > trying to say? Sorry I didn't get the point. > :-( > > > -Original Message- > From: LAFK [mailto:tomasz.bo...@gmail.com] > Sent: Wednesday, 22 April 2015 14:01 > To: solr-user@lucene.apache.org; solr-user@lucene.apache.org > Subject: Od

AW: Odp.: solr issue with pdf forms

2015-04-22 Thread Steve.Scholl
-user@lucene.apache.org Subject: Odp.: solr issue with pdf forms Off the top of my head, I'd look into how writable PDFs are created and encoded. @LAFK_PL   Original message   From: steve.sch...@t-systems.com Sent: Wednesday, 22 April 2015 12:41 To: solr-user@lucene.apache.org Reply-To: solr-user@lucene.apache

Odp.: solr issue with pdf forms

2015-04-22 Thread LAFK
Off the top of my head, I'd look into how writable PDFs are created and encoded. @LAFK_PL   Original message   From: steve.sch...@t-systems.com Sent: Wednesday, 22 April 2015 12:41 To: solr-user@lucene.apache.org Reply-To: solr-user@lucene.apache.org Subject: solr issue with pdf forms Hi guys, hope

solr issue with pdf forms

2015-04-22 Thread Steve.Scholl
Hi guys, hopefully you can help me with my issue. We are using a solr setup and have the following issue: - usual pdf files are indexed just fine - pdf files with writable form-fields look like this: Ich�bestätige�mit�meiner�Unterschrift,�dass�alle�Angaben�korrekt�und�vollständig�sind Somehow th
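
To find out which characters are actually sitting between the words, it can help to list the code points of anything that is not an ordinary space. The following is a minimal Python sketch; the sample string, including the choice of a no-break space, is made up for illustration, since the thread does not state which character the form fields really contain.

```python
import unicodedata
from collections import Counter


def suspicious_chars(text):
    """Count replacement characters, non-ASCII spaces and control/format characters."""
    counts = Counter()
    for ch in text:
        if ch in "\n\r\t":
            continue  # ordinary line breaks and tabs are not interesting here
        cat = unicodedata.category(ch)
        is_odd_space = cat == "Zs" and ch != " "
        is_control = cat in {"Cc", "Cf", "Co", "Cn"}
        if ch == "\ufffd" or is_odd_space or is_control:
            name = unicodedata.name(ch, "<unnamed>")
            counts[f"U+{ord(ch):04X} {name}"] += 1
    return counts


# Hypothetical sample, not the real extracted text from the thread.
sample = "Ich\u00a0bestätige\u00a0mit\ufffdmeiner Unterschrift"
for label, n in suspicious_chars(sample).items():
    print(n, label)
```

Running this over the text extracted from a problem PDF shows whether the separators are, for example, unusual space characters or genuine U+FFFD replacement characters, which points to where the encoding went wrong.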