ck Erickson [mailto:erickerick...@gmail.com]
Sent: Thursday, April 30, 2015 11:28 AM
To: solr-user@lucene.apache.org
Subject: Re: Odp.: solr issue with pdf forms
Jack:
I keep forgetting those things exist, thanks for the reminder!
On Thu, Apr 30, 2015 at 8:23 AM, Jack Krupansky
wrote:
> Or use a S
Daz
>> >
>> > Best
>> > Steve
>> >
>> > -Original Message-
>> > From: Allison, Timothy B. [mailto:talli...@mitre.org]
>> > Sent: Wednesday, April 29, 2015 14:16
>> > To: solr-user@lucene.apache.org
>> >
nachweise bei.^HDaz
> >
> > Best
> > Steve
> >
> > -Original Message-
> > From: Allison, Timothy B. [mailto:talli...@mitre.org]
> > Sent: Wednesday, April 29, 2015 14:16
> > To: solr-user@lucene.apache.org
> > Cc: u...@tika.apache.org
Sent: Wednesday, April 29, 2015 14:16
> To: solr-user@lucene.apache.org
> Cc: u...@tika.apache.org
> Subject: RE: Odp.: solr issue with pdf forms
>
> I completely agree with Erick about the utility of the TermsComponent to see
> what is actually being indexed. If you find probl
@gmail.com]
Sent: Tuesday, April 28, 2015 9:07 PM
To: solr-user@lucene.apache.org
Subject: Re: Odp.: solr issue with pdf forms
There better be.
1> go to the admin UI
2> select a core
3> select "schema browser"
4> select a field from the drop-down
Until you do step 4 the window w
kson [mailto:erickerick...@gmail.com]
Sent: Wednesday, April 29, 2015 16:07
To: solr-user@lucene.apache.org
Subject: Re: Odp.: solr issue with pdf forms
Steve:
I'd just look at one field at a time
Presumably you have a field that's displaying poorly, "content"? Just look at
e pdfbox-app.jar (ExtractText option)
> on your files outside of Solr to see what text/noise you're getting for the
> files that are causing problems.
>
>
>
> -Original Message-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Tuesday, April 28,
day, April 28, 2015 9:07 PM
To: solr-user@lucene.apache.org
Subject: Re: Odp.: solr issue with pdf forms
There better be.
1> go to the admin UI
2> select a core
3> select "schema browser"
4> select a field from the drop-down
Until you do step 4 the window will be pr
@lucene.apache.org
Subject: Re: Odp.: solr issue with pdf forms
There better be.
1> go to the admin UI
2> select a core
3> select "schema browser"
4> select a field from the drop-down
Until you do step 4 the window will be pretty blank.
Here's the info for TermsCompon
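For anyone who prefers to check the indexed terms from a script rather than the admin UI, here is a minimal sketch of a TermsComponent request. It assumes a `/terms` request handler is configured for the core, and the core name `collection1` and field name `content` are only placeholders; none of this is confirmed for the setup discussed in this thread.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_terms_url(base_url, core, field, limit=50):
    """Build a TermsComponent URL listing indexed terms for one field.

    Assumes the core exposes a /terms request handler.
    """
    params = {
        "terms": "true",       # switch the TermsComponent on
        "terms.fl": field,     # field whose indexed terms should be listed
        "terms.limit": limit,  # maximum number of terms to return
        "wt": "json",          # ask for a JSON response
    }
    return f"{base_url.rstrip('/')}/{core}/terms?{urlencode(params)}"


def fetch_terms(base_url, core, field, limit=50):
    """Send the request and return the decoded JSON response as a dict."""
    with urlopen(build_terms_url(base_url, core, field, limit)) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder host, core, and field; adjust to the local installation.
    print(build_terms_url("http://localhost:8983/solr", "collection1", "content", 10))
```

The terms that come back are what the analysis chain actually put into the index, which is a different thing from the stored text shown in search results.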
fault configured.
>
> Thanks a lot
> Best
> Steve
>
> -Original Message-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Monday, April 27, 2015 17:23
> To: solr-user@lucene.apache.org
> Subject: Re: Odp.: solr issue with pdf forms
>
> W
erickerick...@gmail.com]
Sent: Monday, April 27, 2015 17:23
To: solr-user@lucene.apache.org
Subject: Re: Odp.: solr issue with pdf forms
We're still not quite there. There should be a "load term info" button on that
page. Clicking that button will show you the terms in your index (as o
inct: 160403
>
> Does this somehow help to figure out the issue?
> Thanks
> Best
> Steve
>
>
> -Original Message-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Friday, April 24, 2015 20:15
> To: solr-user@lucene.apache.org
> Be
rg
Subject: Re: Odp.: solr issue with pdf forms
Steve:
Right, it's not exactly obvious. Bring up the admin UI, something like
http://localhost:8983/solr. From there you have to select a core in the 'core
selector' drop-down on the left side. If you're using SolrCloud, this wi
admin schema browser, but what
> should I see there? Sorry, I'm not familiar with the admin schema browser. :-(
>
> Best
> Steve
>
>
> -Original Message-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Thursday, April 23, 2015 18:00
April 2015 18:00
To: solr-user@lucene.apache.org
Subject: Re: Odp.: solr issue with pdf forms
When you say "they're not indexed correctly", what's your evidence?
You cannot rely on the display in the browser; that is the raw input just as it was sent to
Solr, _not_ the actua
fields) created with "Adobe InDesign
> CS5 (7.0.1)" are indexed with the blank space issue
> >
> > Best
> > Steve
> >
> > -Original Message-
> > From: Erick Erickson [mailto:erickerick...@gmail.com]
> > Sent: Wednesday, April 22
üngliche Nachricht-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Wednesday, April 22, 2015 17:11
> To: solr-user@lucene.apache.org
> Subject: Re: Odp.: solr issue with pdf forms
>
> Are they not _indexed_ correctly or not being displayed correctly?
> Ta
gmail.com]
Sent: Wednesday, April 22, 2015 17:11
To: solr-user@lucene.apache.org
Subject: Re: Odp.: solr issue with pdf forms
Are they not _indexed_ correctly or not being displayed correctly?
Take a look at admin UI>>schema browser>> your field and press the "load terms"
butto
Steve,
Are you using ExtractingRequestHandler / DataImportHandler or extracting
the text content from the PDF outside of Solr?
On Wed, Apr 22, 2015 at 6:40 AM, wrote:
> Hi guys,
>
> hopefully you can help me with my issue. We are using a solr setup and
> have the following issue:
> - usual pdf
orry I didn't get the point.
> > :-(
> >
> >
> > -Original Message-
> > From: LAFK [mailto:tomasz.bo...@gmail.com]
> > Sent: Wednesday, April 22, 2015 14:01
> > To: solr-user@lucene.apache.org; solr-user@lucene.apache.org
t are you
> trying to say? Sorry I didn't get the point.
> :-(
>
>
> -Original Message-
> From: LAFK [mailto:tomasz.bo...@gmail.com]
> Sent: Wednesday, April 22, 2015 14:01
> To: solr-user@lucene.apache.org; solr-user@lucene.apache.org
> Subject: Od
-user@lucene.apache.org
Subject: Odp.: solr issue with pdf forms
Off the top of my head, I'd look into how writable PDFs are created and encoded.
@LAFK_PL
Original message
From: steve.sch...@t-systems.com
Sent: Wednesday, April 22, 2015 12:41
To: solr-user@lucene.apache.org
Reply-To: solr-user@lucene.apache
Off the top of my head, I'd look into how writable PDFs are created and encoded.
@LAFK_PL
Original message
From: steve.sch...@t-systems.com
Sent: Wednesday, April 22, 2015 12:41
To: solr-user@lucene.apache.org
Reply-To: solr-user@lucene.apache.org
Subject: solr issue with pdf forms
Hi guys,
hope
Hi guys,
hopefully you can help me with my issue. We are using a solr setup and have the
following issue:
- usual pdf files are indexed just fine
- pdf files with writable form-fields look like this:
Ich�bestätige�mit�meiner�Unterschrift,�dass�alle�Angaben�korrekt�und�vollständig�sind
Somehow th
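
One way to narrow down what sits between the words is to list the unusual code points in the extracted text. The sketch below is only a diagnostic idea, not something established in this thread: the helper name `suspicious_characters` is made up for illustration, and the sample string is an assumed stand-in for text pulled from an affected PDF.

```python
import unicodedata
from collections import Counter


def suspicious_characters(text):
    """Count separator, control/format, and 'other symbol' characters.

    Plain spaces, newlines, carriage returns, and tabs are ignored;
    letters such as German umlauts are not flagged.
    """
    counts = Counter()
    for ch in text:
        if ch in " \n\r\t":
            continue
        category = unicodedata.category(ch)
        # Z* = separators (e.g. no-break space), C* = control/format/private use,
        # So = other symbols, which includes U+FFFD REPLACEMENT CHARACTER.
        if category[0] in ("Z", "C") or category == "So":
            label = f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}"
            counts[label] += 1
    return counts


if __name__ == "__main__":
    # Assumed sample; replace with text extracted from a problem file.
    sample = "Ich\ufffdbest\u00e4tige\u00a0mit meiner Unterschrift"
    for label, count in suspicious_characters(sample).items():
        print(count, label)
```

If the output shows something other than U+FFFD (for example a no-break space or a private-use code point), the replacement glyph is probably introduced at display time rather than stored in the text.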