On Sunday, October 14, 2012 04:26:02 PM Fabio D'Urso wrote:
> [snip]
> So we have two working options:
> - Always write /I, not matter if multiselect is true or false (current
> approach in my 0002 patch)
> - Only write /I if multiselect is true, but write export values instead of
> option names in /V, which is also against the specs
Hi,
I'm attaching a new set of patches implementing the second option.
They no longer abuse the /I property for single-selection only fields, instead
they only break the spec in two things:
1. They write/expect the export value instead of the option name, if an
export value is present (see patches 0004 and 0005);
2. If /I is present, it is given precedence over /V to determine initially
selected options. In other words, if /I and /V don't agree, /I is used.
This is the reverse of what the spec says (table 231 - last paragraph),
but acroread seems to do the same (see patches 0003 and 0006).
Note that #2 only affects how poppler decides which option are initially
selected. #1 is a little more sensible because it causes poppler to produce
invalid documents with reference to the PDF spec but, as I've explained in the
previous posts, tests show that acroread expects *this* behavior and there's
at least foxit already doing this. Therefore, I feel that we're on a safe
track.
Details for each patch:
== 0001-FormFieldChoice-ctor-Don-t-convert-human-readable-op.patch ==
With this patch, poppler no longer converts option names to Unicode while
reading them. Note that frontend clients won't notice the difference because
both glib and qt4 re-handle encoding conversion internally.
I've removed the conversion so that when we write /V, we set it to a string
that is binary-equal to the corresponding /Opt entry.
Without this patch, it often occurred that poppler couldn't parse the selected
option in forms saved by poppler itself: it wrote a Unicode string in /V, but
when it read the resulting file it compared /V against the strings in /Opt,
which are usually in pdfdoc encoding. Since it is a binary comparison, all
tests failed and no selected option was shown.
I could have fixed it by making the comparison encoding-aware, but this
approach seems to work just fine and therefore I've decided to keep it simple.
Note: Documents saved with old poppler versions will not work, but I haven't
worried about them as they have never really worked anyway.
== 0002-FormFieldChoice-updateSelection-Fixed-wrong-loop-con.patch ==
Fixes a wrong loop condition. I've put it in a dedicated patch so that you can
push it to 0.20 too.
== 0003-FormFieldChoice-updateSelection-Write-I-too.patch ==
This patch writes the /I property if the field supports multiple selection or
removes /I if it doesn't. In other words, this patch makes sure that /I, if
present, is updated to the current selection. This fixes the situation in
which the orginal document contained /I, we changed selection (by updating /V
only) and acroread still showed the old value because /I had not been updated.
Note: This *acroread* behavior is against the spec, that say that "if the
items identified by I differ from those in the V entry of the field
dictionary, the V entry shall be used" (table 231 - last paragraph). Note that
in patch 0006 I knowingly introduce the same bug in poppler too. See below for
details.
== 0004-FormFieldChoice-ctor-Be-able-to-understand-V-values-.patch,
0005-FormFieldChoice-updateSelection-Write-the-export-val.patch ===
These two patches break the spec to overcome the issue described here:
http://lists.freedesktop.org/archives/poppler/2012-October/009641.html .
They make poppler expect/write the export name instead of the option name if
both are present (i.e. if /Opt is an array of string pairs).
NOTE: With the 5 patches so far, choice fields should be already fully
working, with the exception of those with duplicate options.
== 0006-FormFieldChoice-ctor-Look-for-selected-options-in-I-.patch ==
This patch adds support to parse selected options from /I.
According to the spec, data in /I should be compared with data in /V and /V
should be used if they differ. However, acroread seems to blindly trust /I and
so I've decided to keep things simple and do the same.
Using /I, if available, instead of /V gives us better handling of duplicate
options (which couldn't be distinguished otherwise) and independence on the
option name vs export value issue (because we don't look at /V at all).
Note: This patch only adds about 10 lines, the rest is just indentation change
P.S.: I'm working on three more patches about editable combobox fields that
fix the issue described here:
http://lists.freedesktop.org/archives/poppler/2012-October/009688.html .
I'll post them in that thread.
Thank you,
Fabio
From 95786140541e18bc5a84d8fa5c39a4cba762cc2e Mon Sep 17 00:00:00 2001
From: Fabio D'Urso <[email protected]>
Date: Wed, 31 Oct 2012 15:26:37 +0100
Subject: [PATCH 1/6] FormFieldChoice ctor: Don't convert "human-readable"
option names to unicode
Despite that comment, they're not meant to be read by humans only, but they
are also used as option identifiers.
This patch stops poppler from forcing them to be unicode. Instead,
they now stay the same encoding as their corresponding /Opt entry.
This fixes poppler not being able to recognize selected entries
in documents produced by poppler itself: previously, the /V value was
always written in Unicode encoding, and therefore it was very often not
binary-equal to the corresponding /Opt entry.
Now the /V value is always binary-equal to the corresponding /Opt entry.
---
poppler/Form.cc | 16 +---------------
1 files changed, 1 insertions(+), 15 deletions(-)
diff --git a/poppler/Form.cc b/poppler/Form.cc
index 68a4219..d396437 100644
--- a/poppler/Form.cc
+++ b/poppler/Form.cc
@@ -1130,7 +1130,7 @@ FormFieldChoice::FormFieldChoice(PDFDoc *docA, Object *aobj, const Ref& ref, For
}
obj1.free();
- // find selected items and convert choice's human readable strings to UTF16
+ // find selected items
if (Form::fieldLookup(dict, "V", &obj1)->isString()) {
for (int i = 0; i < numChoices; i++) {
if (!choices[i].optionName)
@@ -1138,13 +1138,6 @@ FormFieldChoice::FormFieldChoice(PDFDoc *docA, Object *aobj, const Ref& ref, For
if (choices[i].optionName->cmp(obj1.getString()) == 0)
choices[i].selected = true;
-
- if (!choices[i].optionName->hasUnicodeMarker()) {
- int len;
- char* buffer = pdfDocEncodingToUTF16(choices[i].optionName, &len);
- choices[i].optionName->Set(buffer, len);
- delete [] buffer;
- }
}
} else if (obj1.isArray()) {
for (int i = 0; i < numChoices; i++) {
@@ -1162,13 +1155,6 @@ FormFieldChoice::FormFieldChoice(PDFDoc *docA, Object *aobj, const Ref& ref, For
}
obj2.free();
}
-
- if (!choices[i].optionName->hasUnicodeMarker()) {
- int len;
- char* buffer = pdfDocEncodingToUTF16(choices[i].optionName, &len);
- choices[i].optionName->Set(buffer, len);
- delete [] buffer;
- }
}
}
obj1.free();
--
1.7.6.5
From 086c5cccfb8028455a39776e9eeebb43f7a76fd0 Mon Sep 17 00:00:00 2001
From: Fabio D'Urso <[email protected]>
Date: Wed, 31 Oct 2012 16:57:56 +0100
Subject: [PATCH 2/6] FormFieldChoice::updateSelection: Fixed wrong loop
condition
---
poppler/Form.cc | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/poppler/Form.cc b/poppler/Form.cc
index d396437..4a3f30d 100644
--- a/poppler/Form.cc
+++ b/poppler/Form.cc
@@ -1189,7 +1189,7 @@ void FormFieldChoice::updateSelection() {
if (numSelected == 0) {
obj1.initString(new GooString(""));
} else if (numSelected == 1) {
- for (int i = 0; numChoices; i++) {
+ for (int i = 0; i < numChoices; i++) {
if (choices[i].optionName && choices[i].selected) {
obj1.initString(choices[i].optionName->copy());
break;
--
1.7.6.5
From 4aee3c5124b98b4682a5c3a3b2ffc03fc15fbfd4 Mon Sep 17 00:00:00 2001
From: Fabio D'Urso <[email protected]>
Date: Sat, 13 Oct 2012 00:13:33 +0200
Subject: [PATCH 3/6] FormFieldChoice::updateSelection: Write /I too
This improves handling of choice fields containing two or more entries
with the same name, and also makes sure that the previous value of /I
gets updated (failing to update it results in acroread still showing
the old selection).
---
poppler/Form.cc | 50 ++++++++++++++++++++++++++++++++++++--------------
1 files changed, 36 insertions(+), 14 deletions(-)
diff --git a/poppler/Form.cc b/poppler/Form.cc
index 4a3f30d..53e6167 100644
--- a/poppler/Form.cc
+++ b/poppler/Form.cc
@@ -1179,35 +1179,57 @@ void FormFieldChoice::print(int indent)
#endif
void FormFieldChoice::updateSelection() {
- Object obj1;
+ Object objV, objI, obj1;
+ objI.initNull();
- //this is an editable combo-box with user-entered text
if (edit && editedChoice) {
- obj1.initString(editedChoice->copy());
+ // This is an editable combo-box with user-entered text
+ objV.initString(editedChoice->copy());
} else {
- int numSelected = getNumSelected();
+ const int numSelected = getNumSelected();
+
+ // Create /I array only if multiple selection is allowed (as per PDF spec)
+ if (multiselect) {
+ objI.initArray(xref);
+ }
+
if (numSelected == 0) {
- obj1.initString(new GooString(""));
+ // No options are selected
+ objV.initString(new GooString(""));
} else if (numSelected == 1) {
+ // Only one option is selected
for (int i = 0; i < numChoices; i++) {
- if (choices[i].optionName && choices[i].selected) {
- obj1.initString(choices[i].optionName->copy());
- break;
+ if (choices[i].selected) {
+ if (multiselect) {
+ objI.arrayAdd(obj1.initInt(i));
+ }
+
+ if (choices[i].optionName) {
+ objV.initString(choices[i].optionName->copy());
+ }
+
+ break; // We've just written the selected option. No need to keep on scanning
}
}
} else {
- obj1.initArray(xref);
+ // More than one option is selected
+ objV.initArray(xref);
for (int i = 0; i < numChoices; i++) {
- if (choices[i].optionName && choices[i].selected) {
- Object obj2;
- obj2.initString(choices[i].optionName->copy());
- obj1.arrayAdd(&obj2);
+ if (choices[i].selected) {
+ if (multiselect) {
+ objI.arrayAdd(obj1.initInt(i));
+ }
+
+ if (choices[i].optionName) {
+ objV.arrayAdd(obj1.initString(choices[i].optionName->copy()));
+ }
}
}
}
}
- obj.getDict()->set("V", &obj1);
+ obj.getDict()->set("V", &objV);
+ obj.getDict()->set("I", &objI);
xref->setModifiedObject(&obj, ref);
updateChildrenAppearance();
}
--
1.7.6.5
From e63903cc9d623d6f75fcd67c319882f3e299b4ea Mon Sep 17 00:00:00 2001
From: Fabio D'Urso <[email protected]>
Date: Wed, 31 Oct 2012 15:44:32 +0100
Subject: [PATCH 4/6] FormFieldChoice ctor: Be able to understand /V values
containing the export value instead of the option name
According to the PDF spec, /V should always contain an "option name" and
never an "export value" if /Opt is an array of couples. However, it
seems that acroread works the other way round: it is able to identify
selected options only if they are referred by their export value
instead of the option name.
With this patch, we mimic this behavior.
---
poppler/Form.cc | 41 ++++++++++++++++++++++++++++-------------
1 files changed, 28 insertions(+), 13 deletions(-)
diff --git a/poppler/Form.cc b/poppler/Form.cc
index 53e6167..6f4d659 100644
--- a/poppler/Form.cc
+++ b/poppler/Form.cc
@@ -1131,29 +1131,44 @@ FormFieldChoice::FormFieldChoice(PDFDoc *docA, Object *aobj, const Ref& ref, For
obj1.free();
// find selected items
+ // Note: According to PDF specs, /V should *never* contain the exportVal.
+ // However, if /Opt is an array of (exportVal,optionName) pairs, acroread
+ // seems to expect the exportVal instead of the optionName and so we do too.
if (Form::fieldLookup(dict, "V", &obj1)->isString()) {
for (int i = 0; i < numChoices; i++) {
- if (!choices[i].optionName)
- continue;
-
- if (choices[i].optionName->cmp(obj1.getString()) == 0)
- choices[i].selected = true;
+ if (choices[i].exportVal) {
+ if (choices[i].exportVal->cmp(obj1.getString()) == 0) {
+ choices[i].selected = true;
+ }
+ } else if (choices[i].optionName) {
+ if (choices[i].optionName->cmp(obj1.getString()) == 0) {
+ choices[i].selected = true;
+ }
+ }
}
} else if (obj1.isArray()) {
for (int i = 0; i < numChoices; i++) {
- if (!choices[i].optionName)
- continue;
-
for (int j = 0; j < obj1.arrayGetLength(); j++) {
Object obj2;
-
obj1.arrayGet(j, &obj2);
- if (choices[i].optionName->cmp(obj2.getString()) == 0) {
- choices[i].selected = true;
- obj2.free();
- break;
+ GBool matches = gFalse;
+
+ if (choices[i].exportVal) {
+ if (choices[i].exportVal->cmp(obj2.getString()) == 0) {
+ matches = gTrue;
+ }
+ } else if (choices[i].optionName) {
+ if (choices[i].optionName->cmp(obj2.getString()) == 0) {
+ matches = gTrue;
+ }
}
+
obj2.free();
+
+ if (matches) {
+ choices[i].selected = true;
+ break; // We've determined that this option is selected. No need to keep on scanning
+ }
}
}
}
--
1.7.6.5
From 8dd3fa9888324f36ca58abe644447a5e62519873 Mon Sep 17 00:00:00 2001
From: Fabio D'Urso <[email protected]>
Date: Wed, 31 Oct 2012 16:49:41 +0100
Subject: [PATCH 5/6] FormFieldChoice::updateSelection: Write the export value
instead of option name in /V if /Opt in an array of
couples
Because this is what acroread expects (see previous patch's commit message).
---
poppler/Form.cc | 8 ++++++--
1 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/poppler/Form.cc b/poppler/Form.cc
index 6f4d659..9998240 100644
--- a/poppler/Form.cc
+++ b/poppler/Form.cc
@@ -1219,7 +1219,9 @@ void FormFieldChoice::updateSelection() {
objI.arrayAdd(obj1.initInt(i));
}
- if (choices[i].optionName) {
+ if (choices[i].exportVal) {
+ objV.initString(choices[i].exportVal->copy());
+ } else if (choices[i].optionName) {
objV.initString(choices[i].optionName->copy());
}
@@ -1235,7 +1237,9 @@ void FormFieldChoice::updateSelection() {
objI.arrayAdd(obj1.initInt(i));
}
- if (choices[i].optionName) {
+ if (choices[i].exportVal) {
+ objV.arrayAdd(obj1.initString(choices[i].exportVal->copy()));
+ } else if (choices[i].optionName) {
objV.arrayAdd(obj1.initString(choices[i].optionName->copy()));
}
}
--
1.7.6.5
From d34d1813c06382191ce4239750c9ed542c34cf4b Mon Sep 17 00:00:00 2001
From: Fabio D'Urso <[email protected]>
Date: Wed, 31 Oct 2012 19:43:51 +0100
Subject: [PATCH 6/6] FormFieldChoice ctor: Look for selected options in /I
instead of /V if /I is available
Since /I stores the indices of the selected options, it can distinguish
duplicate option (i.e. options with the same name/export value).
---
poppler/Form.cc | 71 ++++++++++++++++++++++++++++++++----------------------
1 files changed, 42 insertions(+), 29 deletions(-)
diff --git a/poppler/Form.cc b/poppler/Form.cc
index 9998240..586482d 100644
--- a/poppler/Form.cc
+++ b/poppler/Form.cc
@@ -1130,44 +1130,57 @@ FormFieldChoice::FormFieldChoice(PDFDoc *docA, Object *aobj, const Ref& ref, For
}
obj1.free();
- // find selected items
- // Note: According to PDF specs, /V should *never* contain the exportVal.
- // However, if /Opt is an array of (exportVal,optionName) pairs, acroread
- // seems to expect the exportVal instead of the optionName and so we do too.
- if (Form::fieldLookup(dict, "V", &obj1)->isString()) {
- for (int i = 0; i < numChoices; i++) {
- if (choices[i].exportVal) {
- if (choices[i].exportVal->cmp(obj1.getString()) == 0) {
- choices[i].selected = true;
- }
- } else if (choices[i].optionName) {
- if (choices[i].optionName->cmp(obj1.getString()) == 0) {
- choices[i].selected = true;
- }
+ // Find selected items
+ // Note: PDF specs say that /V has precedence over /I, but acroread seems to
+ // do the opposite. We do the same.
+ if (Form::fieldLookup(dict, "I", &obj1)->isArray()) {
+ for (int i = 0; i < obj1.arrayGetLength(); i++) {
+ Object obj2;
+ if (obj1.arrayGet(i, &obj2)->isInt() && obj2.getInt() >= 0 && obj2.getInt() < numChoices) {
+ choices[obj2.getInt()].selected = true;
}
+ obj2.free();
}
- } else if (obj1.isArray()) {
- for (int i = 0; i < numChoices; i++) {
- for (int j = 0; j < obj1.arrayGetLength(); j++) {
- Object obj2;
- obj1.arrayGet(j, &obj2);
- GBool matches = gFalse;
-
+ } else {
+ obj1.free();
+ // Note: According to PDF specs, /V should *never* contain the exportVal.
+ // However, if /Opt is an array of (exportVal,optionName) pairs, acroread
+ // seems to expect the exportVal instead of the optionName and so we do too.
+ if (Form::fieldLookup(dict, "V", &obj1)->isString()) {
+ for (int i = 0; i < numChoices; i++) {
if (choices[i].exportVal) {
- if (choices[i].exportVal->cmp(obj2.getString()) == 0) {
- matches = gTrue;
+ if (choices[i].exportVal->cmp(obj1.getString()) == 0) {
+ choices[i].selected = true;
}
} else if (choices[i].optionName) {
- if (choices[i].optionName->cmp(obj2.getString()) == 0) {
- matches = gTrue;
+ if (choices[i].optionName->cmp(obj1.getString()) == 0) {
+ choices[i].selected = true;
}
}
+ }
+ } else if (obj1.isArray()) {
+ for (int i = 0; i < numChoices; i++) {
+ for (int j = 0; j < obj1.arrayGetLength(); j++) {
+ Object obj2;
+ obj1.arrayGet(j, &obj2);
+ GBool matches = gFalse;
- obj2.free();
+ if (choices[i].exportVal) {
+ if (choices[i].exportVal->cmp(obj2.getString()) == 0) {
+ matches = gTrue;
+ }
+ } else if (choices[i].optionName) {
+ if (choices[i].optionName->cmp(obj2.getString()) == 0) {
+ matches = gTrue;
+ }
+ }
- if (matches) {
- choices[i].selected = true;
- break; // We've determined that this option is selected. No need to keep on scanning
+ obj2.free();
+
+ if (matches) {
+ choices[i].selected = true;
+ break; // We've determined that this option is selected. No need to keep on scanning
+ }
}
}
}
--
1.7.6.5
_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler