context: https://summerofcode.withgoogle.com/proposals/details/6CaOVRhz

I want to share my idea for the database format and get feedback on it.

I am assuming that the list of character adjustments is small enough to
store in memory without issues.  This would change completely if a file
were needed instead.

The database would consist of an array of records, where each record
contains a Unicode code point and a value for each kind of adjustment.  To
reduce the memory footprint and take advantage of the fact that most
characters will not have adjustments, unused entries can be left out, at
the cost of lookup time.  The increased lookup time could be mitigated by
keeping the records sorted by code point and using a binary search.
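
To make the idea concrete, here is a rough sketch in C of what I have in
mind.  The record fields (vertical_shift, stretch), the sample entries, and
the function names are all made up for illustration; the real adjustment
kinds are not decided yet.  It only shows the sorted-array-plus-binary-search
layout, using the standard bsearch().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record layout: one code point plus one value per kind of
 * adjustment.  The two adjustment fields are placeholders. */
typedef struct {
    uint32_t codepoint;       /* Unicode code point, used as the sort key */
    int8_t   vertical_shift;  /* placeholder adjustment kind */
    int8_t   stretch;         /* placeholder adjustment kind */
} AdjustmentRecord;

/* Only characters that have adjustments are listed.  The array MUST stay
 * sorted by code point, otherwise the binary search gives wrong results.
 * The entries below are invented sample data. */
static const AdjustmentRecord adjustment_db[] = {
    { 0x0041,  1, 0 },
    { 0x00C9,  0, 2 },
    { 0x1E9E, -1, 1 },
};

/* Comparison callback for bsearch(): the key is a pointer to a code point,
 * the element is a pointer to a record. */
static int compare_codepoint(const void *key, const void *elem)
{
    uint32_t wanted = *(const uint32_t *)key;
    uint32_t have   = ((const AdjustmentRecord *)elem)->codepoint;

    return (wanted > have) - (wanted < have);
}

/* Returns the record for `codepoint`, or NULL if the character has no
 * adjustments.  Lookup is O(log n) in the number of stored records. */
const AdjustmentRecord *lookup_adjustment(uint32_t codepoint)
{
    return bsearch(&codepoint,
                   adjustment_db,
                   sizeof adjustment_db / sizeof adjustment_db[0],
                   sizeof adjustment_db[0],
                   compare_codepoint);
}
```

A caller would treat a NULL result as "no adjustment for this character",
which is the common case.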

Would this work?

Also, I have a question for Werner specifically: The Google Summer of Code
proposal page called this database the "capabilities database".  Why that
name?  I consider "actions" or "instructions" to be the appropriate word
for the thing stored in the database, not "capabilities".
