https://bugs.documentfoundation.org/show_bug.cgi?id=125110
Eike Rathke <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 OS|Windows (All)               |All
           Hardware|x86-64 (AMD64)              |All

--- Comment #13 from Eike Rathke <[email protected]> ---
(In reply to Mike Kaganski from comment #7)
> So I suppose that what should have been done here is:
> 1. Seeing the opening double quote in the beginning of the field, start
> "quote-enclosed field" mode.
> 2. If it encounters something *invalid* for such a mode, it should re-read
> the field again, this time without the "quote-enclosed field" mode (to
> properly re-consume possible field separators that could had been read in
> the first pass as the quoted field content).
>
> This way, this sample would be read properly, without introducing any
> ambiguity.

It would fail in other constellations that now are handled well, like

"abc "def" ghi, jkl"

where |abc "def" ghi, jkl| is supposed to be *one* field content because the
generator didn't escape quotes by doubling them. Your approach would result in
|"abc "def" ghi| jkl"|

Whatever we'll do, it will make things fail differently for other data of
broken generators. You could throw more logic at it like thinking in "words" to
be ignored re-triggering quotes have to have a space left (opening quote) or
right (closing quote), which would fail for data that simply doesn't follow
that assumption. Things get even worse if space was a field separator.

Take a look at what is done with the field start mode and quote state to fix
known broken data cases and bug 48621 for test case sample files and related.

I tend to close this for a too broken generator, but if you can come up with
some loose magic that doesn't break any of the already handled cases, then
fine..

-- 
You are receiving this mail because:
You are the assignee for the bug.
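For illustration only, the two strategies discussed in comment #13 can be compared with a small Python sketch. This is not the LibreOffice import code: `split_lenient`, `split_with_fallback` and `_read_strict_quoted` are hypothetical names, the separator is assumed to be a single character, and the "lenient" rule (a quote only closes the field when followed by a separator or end of line) is a simplified stand-in for the existing handling of unescaped quotes.

```python
def split_lenient(line, sep=","):
    """Assumed stand-in for the current behaviour: inside a quoted field,
    a quote closes the field only if a separator or end of line follows."""
    fields, i, n = [], 0, len(line)
    while i <= n:
        if i < n and line[i] == '"':
            i += 1
            buf = []
            while i < n:
                ch = line[i]
                if ch == '"':
                    if i + 1 < n and line[i + 1] == '"':
                        buf.append('"')          # doubled quote: escaped
                        i += 2
                        continue
                    if i + 1 == n or line[i + 1] == sep:
                        i += 1                   # genuine closing quote
                        break
                buf.append(ch)                   # anything else is content
                i += 1
            fields.append("".join(buf))
        else:
            j = line.find(sep, i)
            j = n if j == -1 else j
            fields.append(line[i:j])
            i = j
        i += 1                                   # step over the separator
    return fields


def _read_strict_quoted(line, start, sep):
    """Read a quoted field strictly; return (text, next_index) or None
    if something invalid for quote-enclosed mode is encountered."""
    buf, i, n = [], start + 1, len(line)
    while i < n:
        ch = line[i]
        if ch != '"':
            buf.append(ch)
            i += 1
        elif i + 1 < n and line[i + 1] == '"':
            buf.append('"')
            i += 2
        else:
            i += 1                               # lone quote closes the field
            if i == n or line[i] == sep:
                return "".join(buf), i
            return None                          # junk after closing quote
    return None                                  # unterminated quote


def split_with_fallback(line, sep=","):
    """Sketch of the proposal from comment #7: try quote-enclosed mode,
    and on invalid input re-read the field without it."""
    fields, i, n = [], 0, len(line)
    while i <= n:
        parsed = None
        if i < n and line[i] == '"':
            parsed = _read_strict_quoted(line, i, sep)
        if parsed is not None:
            text, i = parsed
            fields.append(text)
        else:
            j = line.find(sep, i)                # plain re-read up to separator
            j = n if j == -1 else j
            fields.append(line[i:j])
            i = j
        i += 1
    return fields


sample = '"abc "def" ghi, jkl"'
print(split_lenient(sample))        # one field
print(split_with_fallback(sample))  # two fields, split at the comma
```

Under these assumptions the sample line stays one field with the lenient rule and is split at the comma by the fallback approach, while well-formed input such as `"a,b",c` is read identically by both.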
