Re: Add support for FT_Face cloning

Tayyab Akram Wed, 09 Feb 2022 09:54:41 -0800

> First, the proposed merge request adds a ton of conditional code paths that 
> are impossible to properly review now or even maintain over time. Since there 
> is no proper distinction in the source code between what is supposed to be 
> shareable+readonly and exclusive+mutable, there is no way by looking at the 
> code to determine the correctness of the change, this will make improving 
> library internals even more difficult in the future. It is very likely that 
> this feature is gonna break in subtle ways that are very hard to detect 
> beforehand. Given the scope of the library's API, I don't think it is 
> possible to come up with an extensive regression test that covers every 
> possibility. I am convinced this will not be maintainable over time. Sorry.
I get your point and have a few options in mind to overcome this, but they 
would require some structural changes. That's why I was deliberately quiet 
about it. Basically, we need to organize the fields in such a way that they 
reflect their traits. This could mean giving each field a prefix that 
describes its trait, or grouping the fields like the following:

  typedef struct ABC {
    struct {
      /* Mutable Properties. */
    } mutable;

    struct {
      /* Lazy Properties. */
    } lazy;

    /* Immutable Properties. */
    FT_UInt first;
    FT_UInt second;
  } ABC;

Let me know if this sounds interesting, then we can further refine it.
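
To make the grouping above a bit more concrete, here is a minimal sketch of how a clone routine could rely on it. Everything in it (`Demo_Face`, its fields, `demo_face_clone`) is hypothetical and only stands in for the real structures; it is not taken from the merge request. The point is that a reviewer can see at a glance which fields are copied and which start fresh.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object, not a FreeType type: fields grouped by trait. */
typedef struct Demo_Face_
{
  struct
  {
    int  current_size;      /* changed by setters; private to each face */
  } mutable;

  struct
  {
    int*  advances;         /* computed on first use; NULL until then */
  } lazy;

  /* Immutable after loading; safe to copy into a clone as-is. */
  unsigned int  num_glyphs;
  unsigned int  units_per_em;

} Demo_Face;


/* Copy only the immutable part; mutable and lazy state start out zeroed. */
static Demo_Face*
demo_face_clone( const Demo_Face*  src )
{
  Demo_Face*  dst;

  assert( src != NULL );

  dst = calloc( 1, sizeof ( *dst ) );
  if ( !dst )
    return NULL;

  dst->num_glyphs   = src->num_glyphs;
  dst->units_per_em = src->units_per_em;

  return dst;
}
```

With this layout, any copy that touches `mutable.` or `lazy.` inside a clone routine stands out during review.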



> The locking requirements do not only apply to the input stream but to any 
> data that is loaded or created on demand. A lot of these have been removed to 
> make the library "more thread-safe" (it really isn't). And since the FreeType 
> API was designed to use mutable state very intentionally (this saves a lot of 
> memory compared to the use of immutable data structures, which was important 
> for embedded systems), there is no clear way for a client to determine 
> whether a given API will mutate the state, so the only safe option for a 
> client is to lock on every API call.
Again, I agree with this, but I have shared the problem statement and how 
immutability can overcome it. The safe option would definitely be to acquire 
the lock on every API call, but we can document the APIs that need it. I was 
only referring to associating the lock with the input stream, not necessarily 
acquiring it in every read operation.


> Finally, calling these instances "clones" or "copies" is misleading. At best 
> they should be called "dependent faces" or "child faces", but the fact that 
> they complicate the lifecycle management of many objects is a problem in 
> itself.
We could also call it a "derived" face and name the function 
"FT_Derive_Face" instead. We could also keep the list of child faces in the 
root face to manage their life cycle. A child face would retain the parent 
face and remove itself from the list on deallocation.
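
A minimal sketch of that life cycle, under the assumption that faces are reference counted. All names here (`Demo_Face`, `demo_derive_face`, `demo_done_face`) are hypothetical placeholders, not existing FreeType functions, and the counter is deliberately non-atomic.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Demo_Face_  Demo_Face;

struct Demo_Face_
{
  int         refcount;   /* non-atomic; for illustration only */
  Demo_Face*  parent;     /* NULL for a root face */
  Demo_Face*  children;   /* root only: head of the derived-face list */
  Demo_Face*  next;       /* derived only: next sibling in that list */
};

static Demo_Face*
demo_new_face( void )
{
  Demo_Face*  face = calloc( 1, sizeof ( *face ) );

  if ( face )
    face->refcount = 1;
  return face;
}

static Demo_Face*
demo_derive_face( Demo_Face*  root )
{
  Demo_Face*  child;

  assert( root != NULL && root->parent == NULL );

  child = demo_new_face();
  if ( !child )
    return NULL;

  root->refcount++;                 /* the child retains its parent */
  child->parent  = root;
  child->next    = root->children;  /* push onto the root's list */
  root->children = child;

  return child;
}

static void
demo_done_face( Demo_Face*  face )
{
  Demo_Face*  parent;

  if ( --face->refcount > 0 )
    return;

  parent = face->parent;
  if ( parent )
  {
    Demo_Face**  link = &parent->children;

    while ( *link != face )
      link = &( *link )->next;
    *link = face->next;             /* remove itself from the root's list */
  }

  free( face );

  if ( parent )
    demo_done_face( parent );       /* drop the reference held on the parent */
}
```

In this sketch the root outlives every derived face even if the client releases the root first, which is the extra life-cycle coupling being discussed.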


> An FT_Clone_Face() API that takes an input FT_Face instance and returns a new 
> instance that has its own lifecycle (except that they will be children of the 
> same FT_Library, and use the same FT_Memory allocator).
I have shared an alternative above.

> For the input stream, a way to either clone it, or share it safely between 
> instances is needed, and should be provided as an argument to the function in 
> some way. We could change the stream implementation used internally by the 
> library to make this easier, or we could require the client to use 
> FT_Open_Face() with a user-provided shareable stream for FT_Clone_Face() to 
> work.
This seems a little complicated to me. Besides, some operating systems don't 
allow many open instances of the same file.

> The initial implementation would simply re-open the face with the new stream, 
> inefficient but completely safe. But this opens the door to identifying 
> read-only data in the source face that can be copied directly to the clone, 
> saving all the work required to load/validate/compute it.
I have already identified the traits of the variables. They are documented 
alongside each copy operation in the clone implementations.

> Note that I wrote "copy" above, because for efficient and safe read-only 
> sharing, atomic reference counting is required, which is not part of C99, 
> hence not portable. However, it can be introduced as a separate step by 
> defining the right abstractions (types and macros to operate on them). 
> Essentially, we need the equivalent of std::shared_ptr<>, or the intrusive 
> versions of it where the ref-count is at the start of the shared object, but 
> in C99 instead. For platforms that do not support threads, just do non-atomic 
> refcounts.
I also had this in mind, but I was trying my best to keep things simple 
and additive only. I was thinking more of requiring an atomic integer 
implementation from the client side, because it's cheaper than a lock. If we 
decide to go this way, then this would need to be addressed first, I guess.

On Wednesday, February 9, 2022, 05:37:56 AM GMT+5, David Turner 
<[email protected]> wrote:
 
 I have specific worries about what is being proposed here:
   
   - First, the proposed merge request adds a ton of conditional code paths 
that are impossible to properly review now or even maintain over time. Since 
there is no proper distinction in the source code between what is supposed to 
be shareable+readonly and exclusive+mutable, there is no way by looking at the 
code to determine the correctness of the change, this will make improving 
library internals even more difficult in the future. It is very likely that 
this feature is gonna break in subtle ways that are very hard to detect 
beforehand. Given the scope of the library's API, I don't think it is possible 
to come up with an extensive regression test that covers every possibility. I 
am convinced this will not be maintainable over time. Sorry.
   
   - The locking requirements do not only apply to the input stream but to any 
data that is loaded or created on demand. A lot of these have been removed to 
make the library "more thread-safe" (it really isn't). And since the FreeType 
API was designed to use mutable state very intentionally (this saves a lot of 
memory compared to the use of immutable data structures, which was important 
for embedded systems), there is no clear way for a client to determine whether 
a given API will mutate the state, so the only safe option for a client is to 
lock on every API call.
   
   - Finally, calling these instances "clones" or "copies" is misleading. At 
best they should be called "dependent faces" or "child faces", but the fact 
that they complicate the lifecycle management of many objects is a problem in 
itself.

Just like for multi-thread usage, the only correct way to deal with a mutable 
API is to either use locking of a single instance before doing any mutable 
operation (be it loading something, or changing some settings), or to create 
several FT_Face instances instead. This keeps an already complicated API 
manageable without introducing new failure modes.

However, it should be possible to implement a real "cloning" facility that 
could be used for safe multi-threaded and variable-fonts usage. As long as all 
read-only sharing is hidden from the client, it can be introduced progressively 
into the source tree.
What I mean more precisely:
   
   - An FT_Clone_Face() API that takes an input FT_Face instance and returns a 
new instance that has its own lifecycle (except that they will be children of 
the same FT_Library, and use the same FT_Memory allocator).
   - For the input stream, a way to either clone it, or share it safely between 
instances is needed, and should be provided as an argument to the function in 
some way. We could change the stream implementation used internally by the 
library to make this easier, or we could require the client to use 
FT_Open_Face() with a user-provided shareable stream for FT_Clone_Face() to 
work.
   - The initial implementation would simply re-open the face with the new 
stream, inefficient but completely safe. But this opens the door to identifying 
read-only data in the source face that can be copied directly to the clone, 
saving all the work required to load/validate/compute it.   

   - Note that I wrote "copy" above, because for efficient and safe read-only 
sharing, atomic reference counting is required, which is not part of C99, hence 
not portable. However, it can be introduced as a separate step by defining the 
right abstractions (types and macros to operate on them). Essentially, we need 
the equivalent of std::shared_ptr<>, or the intrusive versions of it where the 
ref-count is at the start of the shared object, but in C99 instead. For 
platforms that do not support threads, just do non-atomic refcounts.   

   - The most important part is being able to progressively increase the 
efficiency of the cloning process in a way that adds read-only sharing in an 
explicit way that is easy to control at review time, or when changing the 
library's internals.
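
The intrusive ref-count abstraction mentioned in the list above could look roughly like the following. This is only a sketch under stated assumptions: it uses C11 `<stdatomic.h>` for the threaded case (the thread only says atomics are not in C99, so the choice of C11 is mine), the `DEMO_NO_THREADS` switch and all `Demo_`/`demo_` names are hypothetical, and the default sequentially consistent ordering is used for simplicity rather than tuned.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#ifdef DEMO_NO_THREADS                 /* hypothetical build switch */
typedef int  Demo_RefCount;
#define DEMO_REF_INIT( rc )  ( *( rc ) = 1 )
#define DEMO_REF_INC( rc )   ( ++*( rc ) )
#define DEMO_REF_DEC( rc )   ( --*( rc ) )
#else
#include <stdatomic.h>
typedef atomic_int  Demo_RefCount;
#define DEMO_REF_INIT( rc )  atomic_init( ( rc ), 1 )
#define DEMO_REF_INC( rc )   ( atomic_fetch_add( ( rc ), 1 ) + 1 )
#define DEMO_REF_DEC( rc )   ( atomic_fetch_sub( ( rc ), 1 ) - 1 )
#endif

/* Read-only blob with the count at the start of the object. */
typedef struct Demo_Shared_
{
  Demo_RefCount  refcount;   /* kept as the first member (intrusive) */
  size_t         size;
  unsigned char  data[];     /* C99 flexible array member */
} Demo_Shared;

static Demo_Shared*
demo_shared_new( const void*  bytes,
                 size_t       size )
{
  Demo_Shared*  s;

  assert( bytes != NULL );

  s = malloc( sizeof ( *s ) + size );
  if ( !s )
    return NULL;

  DEMO_REF_INIT( &s->refcount );
  s->size = size;
  memcpy( s->data, bytes, size );

  return s;
}

/* Take another reference; the data itself is never modified. */
static Demo_Shared*
demo_shared_retain( Demo_Shared*  s )
{
  (void)DEMO_REF_INC( &s->refcount );
  return s;
}

/* Drop one reference; returns 1 if this call freed the object. */
static int
demo_shared_release( Demo_Shared*  s )
{
  if ( DEMO_REF_DEC( &s->refcount ) > 0 )
    return 0;

  free( s );
  return 1;
}
```

A clone would then call `demo_shared_retain` instead of copying the blob, and each face would call `demo_shared_release` when it is destroyed; only the macros differ between threaded and non-threaded builds.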






On Fri, Feb 4, 2022 at 17:02, Werner LEMBERG <[email protected]> wrote:


> This proposal aims to take the best of both approaches by
> introducing a new function, `FT_Clone_Face`.  [...]

Excellent summary, much appreciated, thanks!

What we are mainly interested in is whether other users are going to
use the proposed API as advertised.  If yes, please say so.  Otherwise
it would be great if you could discuss problematic issues with
Tayyab's approach.


    Werner
