On 2023-05-25 01:57, Chris Johns wrote:
On 24/5/2023 5:07 pm, Christian MAUDERER wrote:
Hello Chris,
On 2023-05-24 03:44, Chris Johns wrote:
Hi Christian,
Thanks for raising this topic. It is a tough one.
On 24/5/2023 12:11 am, Kinsey Moore wrote:
On Tue, May 23, 2023 at 2:26 AM Christian MAUDERER
<christian.maude...@embedded-brains.de
<mailto:christian.maude...@embedded-brains.de>> wrote:
Hello,
I recently updated the HAL in the i.MXRT BSP. I used the same approach
that we use for a lot of similar cases: Import the sources into RTEMS
and adapt them slightly so that they work for us. So basically a
Clone-and-Own approach.
During the discussion of the patches, some concerns were raised about whether
we should find a better solution to handle HALs, SDKs and similar cases.
We should start discussing a solution that can be used after the RTEMS 6
release so that maybe someone can start working on a prototype.
Some example cases are:
- the mcux_sdk in the imxrt BSP
- the hal in the stm32h7 BSP
- general ARM CMSIS files
- zlib
- libfdt
One solution could be to build these libraries externally and only link
RTEMS against them. There are disadvantages to this approach:
- Although in my experience the APIs of the HALs / SDKs / libraries seem to
be quite stable, it's possible that there are combinations where some
unexpected change breaks a driver or makes it impossible to link the
applications.
Xilinx has been moving things about with the more complex devices like the
Versal. The Versal SMC call set is fluid, and the PM (platform manager) seems
to functionally align with Xilinx tools releases plus PetaLinux versions. For
example, there are stable, defined API calls in Versal Linux (XRT/zocl) that
depend on PM code that is commented in the code as "to be removed".
When I first used the Zynq I used Xilinx's drivers, like OAR is currently
doing with the MicroBlaze. I could not release the results because of the
license at the time. I quickly found the drivers lacked functionality for
general use and broke under high loads and boundary conditions. The fixes are
part of a project and cannot be released because the license at the time made
it impossible. What I learnt from the exercise is to not depend on their drivers.
That sounds like quite a bad case, so it's a good example for this discussion.
Thanks for bringing it up.
I view the repo as open but not open source ... if that sentence makes sense?
I think I understand what you mean. But it's still a good example for
the discussion. If a solution theoretically works with that case, it
should work with a lot of other cases too.
I feel what we considered stable will depend on the origin of the code and that
will be case by case.
Agreed.
- BSPs rely on basic drivers from these libraries (like console or clock
driver). If we link against the libraries, the testsuite wouldn't build
any more without preinstalled libraries.
Yes, the mutual dependence is not easy to solve if the libraries are built
externally and before RTEMS. The idea of the HAL code being supplied as .h
files and a .a does let a user update the drivers without needing an RTEMS
version update.
Another solution could be to include libraries like that as submodules
and build them using the RTEMS build system. We could clone the repos
onto the RTEMS git server, and add necessary patches. Advantage would be
that it is more similar to the process that we currently have. Another
advantage is that we have a known-working version of the files. Upstream
updates could be either merged or we could rebase our patches to a new
version.
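For illustration, the result in rtems.git could be a .gitmodules entry like the
following, plus a pinned commit for the submodule path; the repo name foo-hal,
the path, and the URL are all made up for this sketch:

```
[submodule "bsps/shared/foo-hal"]
	path = bsps/shared/foo-hal
	url = git://git.rtems.org/hal/foo-hal.git
```

The superproject would then record the exact commit of foo-hal it was tested
against, so a checkout of any rtems.git commit reproduces the same HAL sources.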
See below for the problems this creates.
From my point of view, the second option would be the better one, especially
because we would have a tested, fixed version of the library instead of
telling the user to just use some random version that might or might not work.
This is important. We need to define what a release is and it is a requirement
we provide all code as tarball files. This implies the release process knows how
to create the tarfiles.
Regardless of which approach we use, we have to think about how to handle
that on releases. In the linking approach (first case), we have to somehow
archive source tarballs and some kind of build recipe. In the submodule
approach, we could check out all submodules and pack the files into the
RTEMS release tarball. So I would expect that the second approach has
less impact here too.
Comments? Improvements? Better suggestions?
I would definitely prefer the submodule approach over the linking approach to
avoid the test issues since some of these HALs bring core functionality. The
Xilinx driver framework (embeddedsw repo on GitHub) would be well-suited to the
submodule approach since it is already broken out into the shared driver space
because it can apply to at least 3 architectures (ARM, AArch64, MicroBlaze).
I suggest you avoid making that repo a submodule of anything. The code in that
repo is "over the wall" and there is no continuity. I have it as a submodule in
my XRT repo, and a Xilinx push of the next release of tools broke the code. What
I had depended on was removed and moved somewhere else. The Xilinx updates are
based on the release cycle of their tools, and they do not respond to issues or
PRs. They are free to make whatever changes they like, they do that internally,
and what appears externally is based on changes across their internal repos. To
make things harder, there is no consistent point at which they update these
public repos, so the code they removed did not reappear for a long time.
One issue with either approach is the need to modify the HAL source to suit
RTEMS. As far as I'm aware, there is no tooling in place in git for applying
patches to submodules and in the external build scenario we'd end up maintaining
a branch of the origin repo with patches applied. Upstreaming the changes would
be ideal, but I wouldn't expect them to accept RTEMS-specific patches. The
Xilinx NAND driver already requires a minor modification because that driver
doesn't expose an option and instead has a defined macro that determines how
many chip selects are usable to address different parts of the NAND chip.
Technically, this particular change could be worked around with some include
path trickery to leave the original sources unmodified, but many other changes
would not be amenable to that type of workaround, and it makes the source less
maintainable. We would need to come up with our own tooling for submodule patch
application and for silencing warnings about dirty submodule trees due to
applied patches.
Direct dependence on external repos we do not control is a long-term maintenance
problem. Repos move and change [1], and this makes maintaining past releases a
challenge. Who is responsible for the long-term release branch maintenance?
Without a working submodule a release cannot be made, and that is not great.
Expecting the release manager to clean up is not going to work, given the task is
unfunded.
Let's make the dependencies indirect: We clone repos to git.rtems.org and to our
mirrors. Then we can either use a submodule URL starting with
git://git.rtems.org or even with a relative URL if we want to make better use of
the mirrors.
If necessary, that approach allows adding an RTEMS-branch that adds patches.
It's more similar to the clone and own we do now. But having a clone of the
original repo makes it a lot simpler to merge upstream changes. Having an
RTEMS-branch makes it easier to see what has been changed for RTEMS.
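To make the relative-URL idea concrete, here is a self-contained sketch using
throwaway local repos (all names such as foo-hal and the paths are hypothetical);
the point is that the relative URL recorded in .gitmodules resolves against the
superproject's own origin, so a clone from a mirror fetches the submodule from
the same mirror:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Stand-in for the HAL clone hosted next to rtems.git on the same server.
git init --bare -q hal/foo-hal.git
git clone -q hal/foo-hal.git hal-work
( cd hal-work && echo 'int foo;' > driver.c && git add driver.c \
  && git -c user.email=a@b -c user.name=x commit -qm 'import HAL' \
  && git push -q origin HEAD )

# Stand-in for rtems.git: add the submodule with a *relative* URL.
git init -q rtems
cd rtems
git -c protocol.file.allow=always submodule add ../hal/foo-hal.git bsps/shared/foo-hal
git -c user.email=a@b -c user.name=x commit -qm 'bsps: add foo-hal submodule'

# The superproject now records a relative URL plus an exact, pinned commit.
cat .gitmodules
git submodule status bsps/shared/foo-hal
```

The protocol.file.allow override is only needed because this sketch clones the
submodule from a local path; a real git:// or https:// URL would not need it.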
Separate repos have the advantage of allowing per repo rules for maintenance and
ownership and I like that. It would map nicely into gitlab.
We don't have to integrate automatic updates or similar. We only maintain and
keep a tested version. If a BSP maintainer or user wants to upgrade, he pulls
the changes from the upstream repo and merges them into the branch that includes
our patches.
That should even work for your extreme case of the Xilinx repo. We have a tested
version on our server. If someone wants to update, he has to do the update, find
out what Xilinx broke during their changes, and adapt to that. Then we can push
that new version to our clone of the Xilinx repo.
If we take a repo into git.rtems.org the project is undertaking long term
maintenance of that code base. Is this what we want if we are using small pieces
of it?
I don't think that we have to maintain the entire code base. There are a
lot of unmaintained clones of a lot of software. Usually the last
commits in a repo clearly show whether it hasn't been touched in 5 years
or whether it is actively maintained.
I am still not sure what role this code base would perform. I know we need code
and/or drivers to boot RTEMS, e.g. x86 and EFI, which has to be in rtems.git so
the tests link and run. A short list: SMP core startup, timer, MMU and console.
What other drivers are being added?
I mentioned libraries in my original mail too. I think libfdt and zlib
would be two candidates. Libfdt is necessary for some BSPs in basic
drivers too.
Submodules in rtems.git is a change in policy. We allow submodules in add-on
packages like libbsd but it has never been something we have allowed with
rtems.git.
I agree that it would be a change in policy. But that's the whole point of the
discussion:
Great. I made the statement to make sure we all understand this. :)
Thanks for highlighting that point.
The current method makes it hard to maintain library code. Do we find a better
solution that fits in the current policies, or do we find sensible adaptations
to the policies that are OK for everyone?
Yes, it is not great. Please understand, my question is to make sure we
understand what we take on with whatever path we take.
OK.
I don't see submodules as the only valid solution. But it's one that looks
promising to me, and therefore I brought it up. It is similar to the approach
that has worked well in libbsd. What I currently suggest only tries to avoid the
step of copying code between the upstream repo and the local one like we do in
libbsd.
Submodules could be made to work. We need to understand some issues they bring
first. For example, we would need to manage the eco-system side: should a
submodule only be fetched if the related BSP or BSP family is being built? And
how do we manage the submodule initialisation so users do not end up running
custom commands which break if a BSP or arch is removed from the tree in a
future release?
Good point. You are right that we should think about how we want to
initialize the submodules. From a user point of view, I only want to
download the necessary code. So I want to initialize as few submodules
as possible. Can be tricky.
Maybe we can teach waf to handle that. But that will add a difference
between development branches and releases.
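Selective initialisation is at least possible with stock git, so waf could
drive it per BSP. A self-contained sketch with throwaway local repos (all
names and paths are hypothetical): a fresh clone starts with an empty
submodule directory, and only the submodule the configured BSP needs gets
initialised:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Throwaway "server side": one HAL repo and an rtems superproject using it.
git init --bare -q hal.git
git clone -q hal.git hal-work
( cd hal-work && echo 'int hal;' > hal.c && git add hal.c \
  && git -c user.email=a@b -c user.name=x commit -qm 'import' \
  && git push -q origin HEAD )
git init -q rtems
( cd rtems \
  && git -c protocol.file.allow=always submodule add ../hal.git bsps/arm/imxrt/hal \
  && git -c user.email=a@b -c user.name=x commit -qm 'add hal submodule' )

# A user's fresh clone: the submodule directory exists but is empty...
git clone -q rtems rtems-user
cd rtems-user
test ! -f bsps/arm/imxrt/hal/hal.c

# ...until exactly this one submodule is initialised (e.g. by a waf step).
git -c protocol.file.allow=always submodule update --init bsps/arm/imxrt/hal
```

A build system hook would only have to map the configured BSP family to the
submodule paths it needs and run the last command for each of them.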
We should consider merge requests. How does a merge request for a submodule get
checked to make sure it does not break rtems.git? Is it possible to check this
at the submodule repo level?
I'm not sure whether I understand you correctly here.
Usually if you update a submodule, you have to create a merge request in
two repos. The submodule itself and in the main repo that uses the
submodule.
I think your question is about how we check a merge request in the
submodule before we merge it, correct?
To be honest: I don't have a good answer for that. Most likely it will
be similar to what we do now: Apply patches locally, build the BSP and
check whether it works. Difference is that we have to apply the patches
to multiple repos.
When merging, we have to first merge the submodule patches, then most
likely ask the patch creator to update the patches for RTEMS or update
them ourselves (the SHA1 of the submodule will change if we add a merge
commit), and then merge the patches in RTEMS.
FYI a release checks a repo for submodules and if present gets that code and
merges it into the master repo source to make a complete source package. The git
archive command does not include submodules. The rtems.git release tarfile will
be the sum of all submodule repos. Is this something we need to consider if
these repos are large?
Yes it is. It will mean that the release tarballs will grow quite a lot.
If we use waf to build the sources from the submodules (which is
currently what I expect will be the case), we could use waf to copy only
the files that are used. That would shrink the size again. The
disadvantage is that we might miss adding included files to the build
system files. That would work well while the complete submodule is
checked out, but it would break during the release.
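The git archive limitation mentioned above is easy to demonstrate with
throwaway local repos (names hypothetical): the superproject's archive lists
its own files but none of the submodule's, so a release script has to add
those separately:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Throwaway HAL repo plus an rtems superproject that embeds it.
git init --bare -q hal.git
git clone -q hal.git hal-work
( cd hal-work && echo 'int hal;' > hal.c && git add hal.c \
  && git -c user.email=a@b -c user.name=x commit -qm 'import' \
  && git push -q origin HEAD )
git init -q rtems
cd rtems
echo 'int core;' > core.c
git add core.c
git -c protocol.file.allow=always submodule add ../hal.git bsps/hal
git -c user.email=a@b -c user.name=x commit -qm 'core + hal submodule'

# The superproject's archive contains core.c but not the submodule's hal.c.
git archive --format=tar HEAD | tar -tf - > ../archive-listing.txt
grep -q 'core\.c' ../archive-listing.txt
! grep -q 'hal\.c' ../archive-listing.txt

# A complete release tarball therefore needs an extra step per submodule,
# e.g. running `git archive --prefix=<path>/` inside each submodule and
# appending the result.
```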
Do you have a good alternative idea that would need less changes in policy?
No I do not have good alternatives. I feel what we end up with will be a
compromise of some form. There is no perfect solution with an open project like
we have.
Could we step away from submodules being in rtems.git and maybe have a BSP
option that points to a source tree? I have no idea if the build system could
make this work. We would host the repo for that source tree on git.rtems.org and
control it so users have a simple means to find the repo and the version they
need. An advantage is the "driver" repo can be updated independently of RTEMS,
and for some users that may be a good thing, for example a long life cycle
project that is stable on a version of RTEMS but whose drivers need bug fixes.
The downside is the extra steps needed to get the code and to set it up. It is
an alternative but not a good one. :)
Just to make sure that I understand that correctly:
A BSP uses for example
https://git.rtems.org/hal/foo-hal/plain/driver.c
The build system would note that this file hasn't been downloaded yet
and download it before using it. Is that correct?
You say that an advantage would be that I can update driver.c
independently of RTEMS. I would see that as a big and problematic
disadvantage:
If I update my foo-hal to a version 2 that changes something in the
interface and that needs adaptations in RTEMS, that will automatically
break old RTEMS versions.
Even if the interface doesn't change but only some bug is fixed, I can't
build one RTEMS BSP that has the bug and one that doesn't. The bug will
just magically vanish. I think as a user I would have a very hard time
figuring out what happened and why I can't reproduce the bug that was
there yesterday.
So in my opinion, a commit in RTEMS should always use fixed versions of
the files in the HAL / library. We could also use URLs with a fixed ID like:
https://git.rtems.org/hal/foo-hal/plain/driver.c?id=698732a6c45424263d9de0ac23850c21383c4154
But I think that would make it harder to maintain compared to submodules
that already use that ID.
Just as a sanity check on this discussion: would gitlab merge requests aid
the management of the code in rtems.git, or are there other factors
complicating the maintenance task?
Chris
I don't think gitlab (or any other similar system I know) will help a
lot with these tasks. But most likely it will also not become more
difficult to handle that with these systems. If you want, we can set up
some simple test repos on gitlab.com to check how submodules are handled.
In general: Submodules will make some things harder and some things simpler.
It will be harder to merge patches. A patch will have two parts: One in
the submodule to (for example) update the HAL. And one in RTEMS to use
the new version and maybe adapt files using that HAL.
On the other hand, it will be a lot simpler to update HALs and libs:
- In the best case, for an unchanged library, it's just syncing with
upstream - that would also result in a straightforward pull request in
the repo - and then a simple patch that updates the submodule in RTEMS.
Syncing with upstream could maybe even be automated. Only the patches in
RTEMS would have to be reviewed manually.
- If some RTEMS specific patches have been added, it will be a 'git
merge upstream/main' and push to the RTEMS specific branch or 'git
rebase upstream/main' and push to a new branch (we have to keep the old
ones so that old RTEMS versions still find the commits). Then a simple
patch that updates the submodule in RTEMS.
- In the worst case (like your Xilinx repo) it will be merging upstream
changes, figuring out what has been broken by Xilinx, and adapting the
drivers in RTEMS to that. Then first push the submodule and after that
push the necessary patches in RTEMS.
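The second case above, merging upstream into an RTEMS-specific branch, is
plain git and can be sketched with throwaway local repos (all names, branch
names and file names are hypothetical):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Stand-in for the upstream HAL project.
git init -q upstream
( cd upstream && echo 'v1' > hal.c && git add hal.c \
  && git -c user.email=a@b -c user.name=x commit -qm 'upstream v1' )

# Our clone on git.rtems.org, with an RTEMS-specific patch on its own branch.
git clone -q upstream rtems-hal
cd rtems-hal
git checkout -q -b rtems-6
echo 'RTEMS-specific fix' > rtems-fix.c
git add rtems-fix.c
git -c user.email=a@b -c user.name=x commit -qm 'rtems: add fix'

# Upstream moves on...
( cd ../upstream && echo 'v2' > hal.c && git add hal.c \
  && git -c user.email=a@b -c user.name=x commit -qm 'upstream v2' )

# ...and an update is a fetch plus a merge into the RTEMS branch. A new
# submodule reference in rtems.git would then point at the merge commit.
git fetch -q origin
git -c user.email=a@b -c user.name=x merge -q --no-edit origin/HEAD
```

After the merge, the branch contains both the upstream change and the local
RTEMS patch, which is exactly the state the submodule reference would pin.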
Compared to the current workflow, that's a lot simpler. My current
workflow is usually something like this:
1. Figure out what has been changed compared to the original HAL version
(that is hopefully noted somewhere in the commit message or the files).
2. Find all matching new HAL files and copy them over the old ones.
Maybe throw away no longer existing ones.
3. Re-apply changes from step 1 manually. Hope that I didn't forget some
important fix.
4. See whether everything still builds and works.
Steps 2 and 3 have to be one commit because otherwise there would be a
non-working commit in RTEMS. So the next update is even harder, because
there have never been unchanged files in the repo, and figuring out the
RTEMS-specific changes means that I first have to find the original files.
Best regards
Christian
--
--------------------------------------------
embedded brains GmbH & Co. KG
Herr Christian MAUDERER
Dornierstr. 4
82178 Puchheim
Germany
email: christian.maude...@embedded-brains.de
phone: +49-89-18 94 741 - 18
mobile: +49-176-152 206 08
Registergericht: Amtsgericht München
Registernummer: HRA 117265
Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
Unsere Datenschutzerklärung finden Sie hier:
https://embedded-brains.de/datenschutzerklaerung/
_______________________________________________
devel mailing list
devel@rtems.org
http://lists.rtems.org/mailman/listinfo/devel