On Sat, Oct 05, 2024 at 11:00:05AM +0100, Gavin Smith wrote:
> I'd also like to know how many more of these commits are coming.

I still have some conflict/rebasing/cherry picking anomalies that may
require one or two small commits. After that, there are 4 commits
that are still pending, corresponding to an implementation of
INFO_MATH_IMAGES in Info. They could be left for the next release if
you prefer.

> From this point on, I have had much less understanding of the code
> in texi2any. Correspondingly, my ability and interest to fix issues
> in texi2any has gone down. I don't know if there was any alternative,
> though.

The commits of this time are mainly (I have probably missed some)
* improvement of C parser performance, mainly memory, but also speed,
  including some new tests for special cases determined when doing that
* convert def line category not in code
* @*ref formatting different in Plaintext than in Info
* translation of Perl to C for speed increase of conversion to HTML.
  This includes adding CSV files to describe output used to generate
  both C and Perl code.
* setup separate po directories for gnulib strings
* use C strings in po files.
* More translation of Perl to C for speed increase of conversion to HTML.
* Add a demonstration program in C that converts to HTML
* Fix compiler warning, refactor C code to avoid having all the
  conversion in only one file
* Add C++ code to have an hashmap when Perl is not used
* update to libintl-perl-1.33
* simplify passing arguments to converters from texi2any.pl and XS
  version set to main version

> I thought that having some more of Patrice's private changes on master
> might prevent a recurral of this situation with a huge number of changes
> being made after a release.

Also, for me it had become very time consuming to rebase and cherry
pick after master got important changes lately. I think that it is
also better to have all this code in the release, as the user visible
changes are few (and could easily be reverted as they are more or less
independent of the big changes going on).

> I hope we are not going to have development of major new features before
> the next release or much more widespread restructuring of the code.
> I guess it doesn't matter as much if code that is new since 7.1 is
> changed.

The improvement of C parser performance included lots of changes in
existing code. But otherwise, the main changes are all related to
conversion with XS.

> At the moment, I can only keep an eye on commits and ChangeLog entries,
> and try to flag anything up that I think that needs further discussion.
>
> Commit a45bddf685edd (dated 2024-10-05, committed 2024-10-05)
> "Add C++ code to have an hashmap when Perl is not used"
>
> This adds C++ code to the project!! This is very concerning. We
> already have a lot of languages in the Texinfo project: TeX, Perl, C,
> even JavaScript. I have no desire to understand or maintain C++ code
> in Texinfo as well. It is also a lot for any new contributors in
> the future to get up to speed with (as things are, there may never be
> any such new contributors). Can we please find some alternative
> for whatever this C++ code is used for?

There are three alternatives:
* Perl hashmap directly used from C
* linear search in C
* C++ hashmap used from C

My understanding is that there is no other way than using C++ to have
a hash map in C, as was discussed in this thread:
https://lists.gnu.org/archive/html/bug-texinfo/2023-10/msg00074.html

Note that this code is only used in a pure C implementation; in the
default case a Perl hashmap is directly used from C.

> You wrote to me in a private mail on 2024-05-25:
> > In the end I used the Perl HV to have a hash map, it works well (it
> > probably uses more memory), I propose not to use the C++ hashmap until
> > we completely remove Perl.

This is still correct.
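
For readers unfamiliar with the third alternative, here is a minimal
sketch of what "C++ hashmap used from C" can look like. It is not the
code from the commit: the names (name_index, name_index_new, and so on)
and the choice of a string-to-number mapping are hypothetical, picked
only to illustrate the pattern of hiding std::unordered_map behind
functions with C linkage.

```cpp
// Hypothetical sketch, not taken from the Texinfo sources:
// a C-callable wrapper around std::unordered_map.
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Opaque to C callers: they only ever hold a pointer to this struct.
struct name_index {
    std::unordered_map<std::string, std::size_t> map;
};

extern "C" {

// Allocate an empty index; release it with name_index_free().
// (A real wrapper would also catch C++ exceptions at this boundary.)
name_index *name_index_new(void) {
    return new name_index();
}

// Associate `name` with `number`, replacing any earlier association.
void name_index_set(name_index *idx, const char *name, std::size_t number) {
    assert(idx != nullptr && name != nullptr);
    idx->map[name] = number;
}

// Return 1 and store the value in *number if `name` is known, else return 0.
int name_index_get(const name_index *idx, const char *name,
                   std::size_t *number) {
    assert(idx != nullptr && name != nullptr && number != nullptr);
    auto it = idx->map.find(name);
    if (it == idx->map.end())
        return 0;
    *number = it->second;
    return 1;
}

// Destroy the index and everything stored in it.
void name_index_free(name_index *idx) {
    delete idx;
}

}  // extern "C"
```

In such a layout the C side would see only an opaque
`struct name_index *` and the four function declarations, so the C++
part stays confined to a single file.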
But now, there is the demonstrator program fully in C such that, even
though we do not distribute that program (and I think that we should
not distribute it, it is for development only), we are at the point
where Perl is completely removed from that internal program. To me,
the main use of the demonstrator program is to compare the timings
with texi2any. If the linear search is used and not the C++ hashmap,
this is an 'unfair' treatment of the C program for this purpose.

To me, this C++ is not really in Texinfo code, it is more like an
external C library that happens to be in C++, but it is well
separated. If needed, I could try to make it part of contrib/ instead
of being in the main code.

> This change has also been made to TODO:
>
>   If others are interested in processing Texinfo files directly,
> - however, it could be possible to work on providing a practical API
> - for the C codes in the Texinfo project, and maybe bindings for
> - other languages than C or Perl.
> + however, it could be possible to work on providing a practical and
> + public API for the C codes in the Texinfo project, especially if the
> + existing APIs can be reused, and maybe bindings for other languages
> + than C or Perl.
>
> I have no interest in developing or supporting a public, stable API for
> hooking into texi2any internals.

I do not think that it should be stable either. But we do have an
API, if only because we need it for the interface of the converters
written in C. To me, it is similar to the Perl modules API: we do not
try to make it stable, but we have an API and it is better to treat it
as a regular interface to have clean code and modularity.

> I have probably written emails about
> this before. It will tend to "freeze" texi2any implementation details
> and make further development more difficult. Users will probably not
> be able to achieve exactly the kind of output they want using the API
> and will come asking for help.
> (This already happens with the Perl
> customization API.)

That's quite different: the Perl HTML customization API is for the
users, so it is normal that they come asking for help. The C API is
equivalent to the Perl modules API, not to the Perl HTML customization
API. We have never had users coming for help with the Perl modules
API, and I believe that it is only used inside of GNU Texinfo.

> A C API might also forestall any future rewrite of
> texi2any in a different programming language.

I will change the wording to remove 'public' if it can be interpreted
as stable.

-- 
Pat