Re: [Development] Views in APIs

Marc Mutz via Development Wed, 13 May 2020 02:44:23 -0700

On 2020-05-13 09:04, Lars Knoll wrote:

On 12 May 2020, at 18:59, Marc Mutz <marc.m...@kdab.com> wrote:

[...]

Most other classes:
* Only take and return QString
This is wrong. By taking a QString in the API of non-string-related APIs, you expose QString as an implementation detail of the class. Think QRegion, which, before I added begin()/end(), had an SSO-like internal container (1 QRect or QVector<QRect>) which forced most users to code around the owning-container API like this:
I don’t think you can compare those cases. QString is the container
our users use in 99% of the cases to hold a string. And this won’t
change (and I don’t think we should advocate any changes here).

So then, an API taking and returning a QString is the most logical,
easiest and most convenient to use.


There are two independent issues here: taking and returning.

For taking I don't see how "taking QString is most logical, easiest, most convenient" follows. It's simply not true that QString is what holds the string in 99% of the cases. In the common case of applications which are not localized, all the nice QString APIs are fed with const char* literals, causing QString creation and destruction in user code. Even if you change all these to const char16_t* and a QString that doesn't allocate in this case, you still have all these complex (in the QTypeInfo sense) QString objects created and destroyed at the call site. This bloats user code. If all those shiny QString APIs took QStringView instead, the construction and destruction of which is trivial (in the C++ sense), the compiler can remove all those calls and the QString construction (if any) happens centrally inside the library function (O(1) instead of O(N), N = #callers).
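A minimal sketch of the "trivial vs. complex" distinction, using only the standard library. The assumptions are loud ones: std::u16string_view stands in for QStringView, std::u16string stands in for QString, and storeTitle() is a hypothetical library function. The static_asserts hold on the mainstream standard library implementations; they are not a statement about Qt's own types.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <type_traits>

// Assumption: std::u16string_view plays the role of QStringView and
// std::u16string the role of QString. Neither is the Qt type.

// A view is pointer + length: copying and destroying it needs no code.
static_assert(std::is_trivially_copyable_v<std::u16string_view>);
static_assert(std::is_trivially_destructible_v<std::u16string_view>);

// An owning string has a destructor that every call site creating one must run.
static_assert(!std::is_trivially_destructible_v<std::u16string>);

// Hypothetical library function: the single owning copy is made in here,
// once, instead of at each of the N call sites.
inline std::u16string storeTitle(std::u16string_view title)
{
    return std::u16string(title);
}

inline void titleExample()
{
    // The literal becomes a view at the call site; no owning object is
    // constructed or destroyed in the caller.
    const std::u16string stored = storeTitle(u"Hello");
    assert(stored.size() == 5);
}
```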

In the not so hypothetical case that Qt is used to visualize results of some business calculations, chances are that third-party libraries will use std::string or std::u16string, and not QString, requiring the use of QString::fromStdString() to pass these to a QString API. Had the API taken QStringView, no extra code would have been necessary.
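To illustrate the call-site side of this, here is a small sketch in which one view-taking function accepts several kinds of string source without conversion code in the caller. Again std::u16string_view is only a stand-in for QStringView, and countSpaces() and businessResult() are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical view-taking API (std::u16string_view stands in for QStringView).
inline std::size_t countSpaces(std::u16string_view text)
{
    std::size_t n = 0;
    for (char16_t c : text) {
        if (c == u' ')
            ++n;
    }
    return n;
}

// Hypothetical third-party function that knows nothing about Qt.
inline std::u16string businessResult()
{
    return u"net profit up";
}

inline void callerExample()
{
    const std::u16string fromLibrary = businessResult();
    const char16_t literal[] = u"a b";

    // Same function, different sources, nothing to convert at the call site.
    assert(countSpaces(fromLibrary) == 2);
    assert(countSpaces(literal) == 1);
    assert(countSpaces(std::u16string_view(fromLibrary).substr(0, 3)) == 0);
}
```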

So, I have shown that taking QString is neither the most convenient, nor the easiest, choice, as it requires users with other string types as data sources to jump through hoops to pass their data. Only if you employ a very narrow focus where both efficiency and the existence of 3rd-party code are ignored, can you still maintain that a function taking an owning container is more convenient and easier than one taking a view. The only logic in such an API that _I_ can find, however, is "MUST ... NOT ... BREAK ... COW logic", which was proven flawed over twenty years ago:


http://www.gotw.ca/gotw/043.htm
http://www.gotw.ca/gotw/044.htm
http://www.gotw.ca/gotw/045.htm

(and CoW isn't even correctly implemented in Qt, ever since unsharable data was removed).

In case the internal storage _is_ QString, then providing a QString&& overload to avoid the copy is a good idea, if you're willing to impart the details of the implementation that way.
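A sketch of that overload pair, under the assumption that std::u16string stands in for QString and std::u16string_view for QStringView; Label is a hypothetical class. The view overload serves every caller, the && overload lets a caller who already holds the storage type hand its buffer over.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical class whose internal storage really is the owning string type.
class Label
{
public:
    // General case: any contiguous UTF-16 source, copied once in here.
    void setText(std::u16string_view text) { m_text.assign(text.begin(), text.end()); }

    // Extra overload for callers that are done with their string:
    // move it into place instead of copying.
    void setText(std::u16string &&text) { m_text = std::move(text); }

    std::u16string_view text() const { return m_text; }

private:
    std::u16string m_text;
};

inline void labelExample()
{
    Label label;

    std::u16string kept = u"kept by caller";
    label.setText(kept);                     // lvalue: view overload, copies
    assert(label.text() == std::u16string_view(kept));
    assert(kept == u"kept by caller");       // the caller's string is untouched

    std::u16string donated(100, u'x');
    label.setText(std::move(donated));       // rvalue: && overload is selected
    assert(label.text().size() == 100);
}
```

With exactly this pair of overloads a bare literal argument such as u"abc" is ambiguous, because both candidates need a user-defined conversion; a real API would have to account for that, for instance with a further overload.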


Second, returning.

You talk about QString getting an SSO buffer, maybe. Then returning a QString will become even more expensive. It already got more expensive, since instead of sizeof(void*) it's now two or three (didn't check) words, but adding SSO will not make it better. And if we don't get SSO, classes can decide to store u16string instead, which _has_ SSO already. So, efficiency wins here, too.

  if (r.rectCount() == 1) {
     use(r.boundingRect());
  } else {
     const auto rects = r.rects();
     for (const QRect &rect : rects)
        use(rect);
  }
If rects() had been a view (today, we'd use gsl/std::span<const QRect>), all users could just do


I don’t get that argument. The region is a list of rectangles today,
so you could simply add a rects() method that returns them and the
code below would work.

I think you should take a look at QRegionPrivate::begin() (in Qt 6 or Qt 5), and at Qt 5's version of QRegion::rects().

  for (const QRect &rect : r.rects())
     use(rect);
This is objectively a better API, for two reasons: a) the user doesn't need to care about some weird idiosyncrasies of the class to avoid performance penalties and b) the class author is now free to extend the SSO buffer from one to, say, four, without changing the API, not even those affected by Hyrum.
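A minimal sketch of the idea, not QRegion's actual implementation: Rect, RectView and Region are hypothetical types, and RectView is a hand-rolled pointer-plus-count view doing the job std::span<const Rect> would do in C++20. The point is that rects() looks the same to the caller whichever internal representation is active.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Rect { int x, y, w, h; };

// Minimal non-owning view over contiguous Rects.
class RectView
{
public:
    RectView(const Rect *first, std::size_t count) : m_first(first), m_count(count) {}
    const Rect *begin() const { return m_first; }
    const Rect *end() const { return m_first + m_count; }
    std::size_t size() const { return m_count; }
private:
    const Rect *m_first;
    std::size_t m_count;
};

// Hypothetical region with an SSO-like layout: one inline Rect, or a vector.
class Region
{
public:
    explicit Region(const Rect &single) : m_inline(single), m_useInline(true) {}
    explicit Region(std::vector<Rect> many)
        : m_inline{}, m_many(std::move(many)), m_useInline(false) {}

    // Callers iterate the same way regardless of the active representation.
    RectView rects() const
    {
        return m_useInline ? RectView(&m_inline, 1)
                           : RectView(m_many.data(), m_many.size());
    }
private:
    Rect m_inline;
    std::vector<Rect> m_many;
    bool m_useInline;
};

inline int totalArea(const Region &r)
{
    int area = 0;
    for (const Rect &rect : r.rects())   // r outlives the loop, so the view stays valid
        area += rect.w * rect.h;
    return area;
}
```

Growing the inline buffer from one Rect to several would only touch the private members and the body of rects().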
It _seems_ your solution is to fold views into owning containers, and while that may seem to work, it's dangerous:
Assume QRegion::rects() returned a QVector-acting-as-view. Then this would silently fail:
   QRegion region();
   for (const QRect &rect : region().rects())
       use(rect);
because, clearly, QVector is an owning container, so we don't care that the QRegion went out of scope. Whereas with a view, it will be immediately obvious (to a tool like Clazy, at least) that this can't work.
The above example is rather weirdly constructed.

But anyhow, those data lifetime issues when using views as return
values extensively are a lot less obvious to spot when a human reads
the code. APIs should be safe to use wherever possible, so that our
users don’t have to worry about those things. This will lead to fewer
bugs in the resulting code and faster development times. Trading that
for some saved CPU cycles is in almost all cases (and yes there are
exceptions in low level classes) a very bad deal.

You didn't get my point: If I return a view, it's clear what's going on (to user and tools) and that the data will only be valid until a non-const member function on the source object is called. So far so simple. This is what we have with QString::data() and a ton of other APIs, and it's easy to understand.

Now change that to an owning container. Say QString, for the sake of argument. Now, users (and tools) expect that QString to own the data, but that's far from guaranteed. We had the problem with QStringLiteral in plugins, where the data just went away on plugin unloading. Solution? Don't actually unload plugins on unloading. Wow! What we do to save QString-as-a-view! Tell me how that's convenient and easy API? Had all those functions not returned QStringLiterals through QString, but through QStringView, it would have been more suggestive to copy the data than it was with QString.

And this will just become more pronounced if every construction from a char16_t* will create a QString that doesn't actually own the data. By choosing QString (owning container or view) over QStringView (view only) in APIs, you (deliberately, if I may say so) blurred the line between owning container and view, and in doing so you kill the raison d'être for returning owning containers: avoiding dangling references. This is neither convenient nor easy. Actually, you're forcing all users to take a deep copy, as per


   QString deepCopy(QStringView v) { return v.toString(); }

(and hope that suggestions for QStringView carrying the d-pointer in a vain attempt at "optimizing" QStringView::toString() will not be implemented).

So, I'd argue for the complete opposite here: we would increase encapsulation of our APIs if they stopped trafficking in owning container types. I call this the NOI pattern (Non-Owning Interface). By not having to serve QString everywhere, we'll be much more free to use alternative storage types in the implementation (e.g. QVarLengthArray<char16_t>, or - the horror! - std::pmr::u16string). Handling-wise, QStringView makes all these choices equal, so the implementation of a class can use whatever is objectively optimal instead of being bound to QString.
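A sketch of what such a non-owning interface can look like. CountryCode is a hypothetical class, std::u16string_view stands in for QStringView, and a fixed std::array stands in for whatever storage the implementation finds optimal. The storage type appears nowhere in the public API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical class following the NOI idea: views in, views out,
// storage type private.
class CountryCode
{
public:
    // Assumption for the sketch: input longer than the buffer is truncated.
    explicit CountryCode(std::u16string_view code)
        : m_size(code.size() < m_buf.size() ? code.size() : m_buf.size())
    {
        for (std::size_t i = 0; i < m_size; ++i)
            m_buf[i] = code[i];
    }

    // Valid for as long as this object is alive and unmodified.
    std::u16string_view code() const
    {
        return std::u16string_view(m_buf.data(), m_size);
    }

private:
    std::array<char16_t, 4> m_buf{};  // could become any other contiguous storage
    std::size_t m_size;
};
```

Swapping the std::array for a heap-allocated or pmr-backed string would change the private section only; callers of code() would not notice.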
You can just as well argue the other way round. Returning a view
requires the class to store the data in continuous memory. It can’t
create it on the fly if asked.

That's not true. You can always do the construction lazily and then return a view. Thread-safety is an issue, yes, but it's not terribly difficult to fix (and not fixed in many other classes, either, QFileInfo comes to mind, so it can't be that important). Compared to the 'usual' implementation in these cases, repeated calls to the function won't have to re-calculate the result anew each time (see https://codereview.qt-project.org/c/qt/qtbase/+/299986 for a recent example).
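A sketch of lazy construction behind a view-returning getter. ImageSize and its members are hypothetical, narrow std::string/std::string_view are used only to keep the example short, and the mutex is one simple way to guard the cache, not necessarily the one a real class would pick.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <string_view>

// Hypothetical class that builds its textual form on first request and
// afterwards hands out views into the cached result.
class ImageSize
{
public:
    ImageSize(int width, int height) : m_width(width), m_height(height) {}

    std::string_view description() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_cache.empty()) {
            // Computed once; later calls reuse the cached string.
            m_cache = std::to_string(m_width) + 'x' + std::to_string(m_height);
            ++m_computations;
        }
        return m_cache;  // valid until this object is modified or destroyed
    }

    // Only here so the sketch can show that the work happens once.
    int computations() const { return m_computations; }

private:
    int m_width;
    int m_height;
    mutable std::mutex m_mutex;
    mutable std::string m_cache;
    mutable int m_computations = 0;
};
```

A getter returning an owning string by value would typically rebuild the result on every call unless it added the same cache anyway.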

Aside: If you think that CoW is still a thing today: no. SSO is a thing these days, and it seems that QString will not have it in Qt 6, either. NOI favours SSO, QString-everywhere cements the naïve CoW world of the 1990s for yet another decade.
Let’s see if we can get SSO working for QString and QBA in time. It
should not be very difficult to implement with the new structure we’re
having.

Even if you enable it for QString and QBA, you can't implement it for QVector without breaking tons of code that relies on iterator stability (which is why std::vector can't do it, either).
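The kind of code that relies on this can be sketched with std::vector (the function name is hypothetical). The check below holds on the mainstream implementations because a move hands over the heap block; with an inline small buffer the elements would live inside the vector object itself and would have to change address when the container is moved.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Returns true if a pointer taken before the move still refers to the
// first element of the moved-to vector.
inline bool elementsStayPutOnMove()
{
    std::vector<int> source{1, 2, 3};
    const int *first = source.data();

    // Move construction transfers the heap allocation, so pointers and
    // iterators into `source` now refer to the elements of `target`.
    std::vector<int> target = std::move(source);

    return first == target.data() && *first == 1;
}
```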

You might call CoW naive, but I do believe that the fact that Qt does
use containers that own their data in most of our APIs is what makes
Qt significantly simpler and safer to use than many other APIs.

Agreed. But you're in the process of blurring the line between owning containers and views. This already started in Qt 5 with QStringLiteral and even earlier with QString::fromRawData(). At least the former was statically-allocated and presented a problem in only a very limited set of circumstances, and the latter is explicit. But you're making QString(const char16_t*) the equivalent of QString::fromRawData() now and so the number of cases where QStrings are actually views that don't own the data is going to explode.


   QLineEdit *le = ...;
   {
       std::u16string u16s = businessResult();
       le->setText(u16s.data()); // = setText(QString(const char16_t*))
                                 // = setText(QString::fromRawData())
   } // BOOM

No such API exists in std, so a std::vector or a std::string _always_ owns the data. For everything else, there's string_view and span, which _never_ own.
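For contrast, a sketch of the same scenario where the roles are unambiguous: the parameter is a view that never owns, the member is a string that always owns. LineEdit here is a hypothetical stand-in, not QLineEdit, and businessResult() is the hypothetical function from the snippet above.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical stand-in for a widget with a text property.
class LineEdit
{
public:
    // The view is only used during the call; the owning copy is made here.
    void setText(std::u16string_view text) { m_text.assign(text.begin(), text.end()); }
    const std::u16string &text() const { return m_text; }
private:
    std::u16string m_text;  // always owns its data
};

inline std::u16string businessResult()
{
    return u"42 units";
}

inline void fillLineEdit(LineEdit &le)
{
    {
        std::u16string u16s = businessResult();
        le.setText(u16s);
    } // u16s is destroyed here; le holds its own copy, so nothing dangles
}
```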

Using views in our API would make it in many cases harder to track
lifetime, esp. if they are combined with the use of auto. Yes, tools
like clazy can help, but I’d rather have inherent safety than rely on
additional tools.

As I said in a review the other day, APIs are easy to use _not_ when the number of classes is minimized, but when the responsibilities of any given class are minimal and to-the-point. For a given domain area, that means more small classes are better than fewer large ones.

You don't want to port APIs to QStringView, because it's a ton of work, but you want the benefits, so you're folding QStringView into QString and making the use of QString that much harder (btw: the same happened to QList/QVector).

That's legitimate, esp. as a step in the direction of fully view-enabling the library.

It's _not_ legitimate to claim that this makes the API "easier to use, more logical and convenient". It doesn't. It's also _not_ legitimate to use this as an argument against more QStringView-overloads around the library.

To spell it out: Just like QList-is-QVector, what you're doing to QString is a (hopefully stop-gap) measure to avoid rewriting all the APIs and classes to take and accept views, at the expense of making QString even harder to reason about than it already has been in Qt 5.

Learn from QRegion!
I have spoken at many conferences (at least QtWS, Meeting C++, emBO++) on this, if anyone wants to learn more.
Not everybody agrees with your opinions, and we need to remember that
most of our users are not necessarily people knowing the C++ standard
inside out. And they *shouldn’t* have to be.

I fail to see what this has to do with "knowing the C++ standard inside out". Maybe you can enlighten me? AFAICT, all I'm doing is applying sound engineering principles, incl., but not limited to, "make an API easy to use and hard to misuse" and "minimal, _efficient_ basis of operations" to Qt.


Thanks,
Marc
_______________________________________________
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development
