Anyone?

Konstantin

2012/9/10 Konstantin Ritt <[email protected]>:
> Hi folks,
>
> In fact, the current QTBF behaves just as if it were a broken break
> iterator...
> I mean that, [Issue1] despite its name, it stops at every break
> opportunity and reports NotAtBoundary via the boundaryReasons() method
> for the break opportunities that are not boundaries (this affects Line
> and Word modes).
> [Issue2] As for Grapheme and Sentence modes, there are no optional
> break opportunities and thus such behavior is OK, except that
> boundaryReasons() makes a wild guess based on the surrounding white
> space characters and reports (StartWord | EndWord) or NotAtBoundary
> most of the time.
> [Issue3] All this requires the developer to use two different
> iteration models depending on which QTBF mode is currently set:
> iterating with toNextBoundary() for Grapheme and Sentence modes,
> and iterating with toNextBoundary() plus an extra check of the
> boundaryReasons() result for Line and Word modes.
>
> But even then, there is no guarantee QTBF will produce the expected
> results. A good example of what I mean is finding the word start/end
> positions around some [arbitrary] position:
>
> [code]
> // -- from src/plugins/platforms/windows/qwindowsinputcontext.cpp:~560
> // Find the word in the surrounding text.
> QTextBoundaryFinder bounds(QTextBoundaryFinder::Word, surroundingText);
> bounds.setPosition(pos);
> if (bounds.isAtBoundary()) {
>     if (QTextBoundaryFinder::EndWord == bounds.boundaryReasons())
>         bounds.toPreviousBoundary();
> } else {
>     bounds.toPreviousBoundary();
> }
> const int startPos = bounds.position();
> bounds.toNextBoundary();
> const int endPos = bounds.position();
> [/code]
>
> In the code above, if surroundingText doesn't contain a word, or if
> it ends with several white space characters at \a pos, then the
> result is garbage.
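
To make the failure case concrete, here is a minimal sketch in plain C++ (no Qt) of a word-at-position lookup that reports "no word" explicitly instead of returning garbage. It assumes a deliberately simplified notion of a word (a maximal run of non-whitespace characters), not the Unicode rules QTBF implements; WordRange and wordAt are hypothetical names used only for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type: half-open range [start, end) of the word,
// valid only when `found` is true.
struct WordRange {
    bool found;
    std::size_t start;
    std::size_t end;
};

// Simplifying assumption: anything that is not whitespace is a word character.
static bool isSpace(char c)
{
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Returns the word that touches `pos` (a position between characters),
// or found == false when only whitespace or nothing surrounds `pos`.
WordRange wordAt(const std::string &text, std::size_t pos)
{
    if (pos > text.size())
        return WordRange{false, 0, 0};

    std::size_t start = pos;
    while (start > 0 && !isSpace(text[start - 1]))
        --start;                       // walk back to the word start

    std::size_t end = pos;
    while (end < text.size() && !isSpace(text[end]))
        ++end;                         // walk forward to the word end

    if (start == end)
        return WordRange{false, 0, 0}; // no word at this position
    return WordRange{true, start, end};
}
```

The point of the sketch is only that the "no word here" outcome is a distinct, checkable result; the caller does not have to infer it from boundary reasons.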
>
>
> I see two major ways to fix the behavior and make the iteration
> process consistent regardless of which mode is in use:
>
> A) a1. introduce a BreakOpportunity BoundaryReason enum value and make
>    boundaryReasons() report BreakOpportunity (instead of NotAtBoundary)
>    for the break opportunities that are not boundaries in Line and Word
>    modes, and for the boundaries in Grapheme and Sentence modes;
>    a2. introduce a MandatoryBreak BoundaryReason enum value and make
>    boundaryReasons() report MandatoryBreak (instead of a combination of
>    StartWord and EndWord values) for the mandatory line breaks (CR, LF,
>    NEL, EOT);
>    a3. make boundaryReasons() carefully report StartWord and/or
>    EndWord exactly for word start and word end positions.
>
> B) b1. fix QTBF to *not* stop at break opportunities that are not
>    boundaries in Line and Word modes in order to fix Issue1;
>    b2. apply a2 and a3 to QTBF in order to fix Issues 2 and 3;
>    b3. introduce a new QTextBreakIterator class that would implement
>    everything described in A (this could be delayed for 5.1). Then, QTBF
>    could be a cheap convenience layer on top of QTBI. Alternatively, QTBI
>    could provide both "toNextBreak()" and "toNextBoundary()" methods in
>    order to replace QTBF completely.
>
> Either way, a major impact of such a change is that
> boundaryReasons() will never report StartWord/EndWord in modes other
> than Word, and boundaryReasons() will never report NotAtBoundary when
> toPreviousBoundary()/toNextBoundary() stops at a valid position.
> Because QTBF is broken by design and the code that uses it should
> be revised anyway, I believe Qt5 is the right time to fix it, and
> such an API and behavior change is still acceptable for 5.0 even now,
> after beta1 is released.
>
> Personally, I prefer the second option.
> What do you think? Any objections to making the described changes in 5.0?
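
As an illustration of a1 and a3, here is a minimal sketch in plain C++ (no Qt) of how the reported reasons could look in a toy Word mode. The enum only mirrors the names from the proposal; its numeric values are arbitrary and are not Qt's. reasonsAt and words are hypothetical helpers, and the word rule (a maximal run of non-whitespace characters) is a stand-in for the real Unicode word-break rules.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Names follow the proposal; the values are arbitrary (assumption).
enum BoundaryReason : unsigned {
    NotAtBoundary    = 0x0,
    BreakOpportunity = 0x1, // proposed in a1
    StartWord        = 0x2,
    EndWord          = 0x4,
    MandatoryBreak   = 0x8  // proposed in a2, Line mode only; unused here
};

static bool isSpace(char c)
{
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Toy Word mode: reasons for the position between text[pos - 1] and text[pos].
unsigned reasonsAt(const std::string &text, std::size_t pos)
{
    if (pos > text.size())
        return NotAtBoundary;
    const bool prevIsWord = pos > 0 && !isSpace(text[pos - 1]);
    const bool nextIsWord = pos < text.size() && !isSpace(text[pos]);
    if (prevIsWord && nextIsWord)
        return NotAtBoundary;            // inside a word: not a stop at all
    unsigned reasons = BreakOpportunity; // every valid stop is at least this
    if (nextIsWord)
        reasons |= StartWord;            // a3: exactly at word starts
    if (prevIsWord)
        reasons |= EndWord;              // a3: exactly at word ends
    return reasons;
}

// One iteration model for the caller: visit positions, act on the flags.
std::vector<std::string> words(const std::string &text)
{
    std::vector<std::string> result;
    std::size_t start = 0;
    for (std::size_t pos = 0; pos <= text.size(); ++pos) {
        const unsigned reasons = reasonsAt(text, pos);
        if (reasons & StartWord)
            start = pos;
        if (reasons & EndWord)
            result.push_back(text.substr(start, pos - start));
    }
    return result;
}
```

With flags like these, a stop between two spaces carries BreakOpportunity alone, so the caller can tell "valid stop, but not a word edge" apart from "not a stop", which is the distinction Issue1 says is currently lost.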
>
> Kind regards,
> Konstantin
