Re: Protocol/procedures for new hardware?

Russell King - ARM Linux Thu, 18 Apr 2002 15:35:11 -0700

The following are my general comments only.  I'm sure there are
other people who will want to put forward their own comments (so
I've made this mail slightly narrower than normal to cope with
the amount of indentations 8).)

The first thing to realise is that no one has all the answers.
Each time a new class of machines comes along, or even a new
machine comes along, we learn something new.

The learning point is not when someone comes along and says "hey,
I have this machine, how do I port Linux to it", but it's the
"well, we've got all this code and we need to merge it somehow."
Certainly in the early days of ARM Linux, each machine type that
came along caused radical changes throughout the ARM code, moving
the interface points between various sections of code so we ended
up with something reasonably sane.  Some of this still happens -
for instance, the boundary between what appears in include/asm-arm/*
and include/asm-arm/{arch,proc}-* still changes slightly over time.

What I think we've learned from the SA1100 platforms is that unless
things are thought out from the start, it's very easy for a machine
class to get quite messy, and the mess has the potential to spread
outside that machine class.  By messy, I mean containing a huge number of
ifdefs, and code changes throughout the rest of the kernel, including
the generic kernel.

These changes to the generic kernel are difficult to merge, especially
when you end up modifying drivers for PC hardware by introducing lots
of preprocessor directives, which obfuscate the code.  The chances of
merging this reduce as the amount of obfuscation increases.

There are various techniques within Linux that can reduce the
obfuscation.  The most obvious one is the use of automatic IRQ
probing - probe_irq_on() and probe_irq_off().  This is useful if
there's a chance that the IRQ to a particular device will vary
between machines, since it means you no longer have to hard-code
interrupt numbers into the driver.  Another way out of this problem
is to have a per-machine #define, but that can only work reliably
if one machine class uses the same IRQ for the same purpose.
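
To make the probing idea a bit more concrete, here is a rough
user-space model of how the probe_irq_on()/probe_irq_off() pair is
meant to be used.  Everything prefixed sim_ is made up for this
sketch and only imitates the kernel calls; in a real driver you would
call the kernel functions themselves and make the device raise its
interrupt between the two calls.

```c
#include <stdint.h>

#define NR_SIM_IRQS     16      /* size of the simulated IRQ space */

static uint32_t sim_in_use;     /* IRQs already owned by a driver */
static uint32_t sim_fired;      /* IRQs seen while probing */

/* Start probing: returns the mask of free IRQs being watched. */
static uint32_t sim_probe_irq_on(void)
{
        sim_fired = 0;
        return ~sim_in_use & ((1u << NR_SIM_IRQS) - 1);
}

/* Called by the fake hardware to raise an interrupt line. */
static void sim_raise_irq(int irq)
{
        sim_fired |= 1u << irq;
}

/*
 * Stop probing: returns the IRQ that fired, 0 if none did, or a
 * negative value if more than one did.  IRQ 0 cannot be reported,
 * since 0 means "nothing found", so the scan starts at 1.
 */
static int sim_probe_irq_off(uint32_t mask)
{
        uint32_t hits = sim_fired & mask;
        int irq, found = 0, count = 0;

        for (irq = 1; irq < NR_SIM_IRQS; irq++) {
                if (hits & (1u << irq)) {
                        found = irq;
                        count++;
                }
        }

        if (count > 1)
                return -found;  /* ambiguous result */
        return found;           /* 0 when nothing fired */
}
```

The driver never names an interrupt number; it takes whatever the
probe reports, and treats zero or a negative result as "could not
determine the IRQ".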

It gets a little more hairy with base addresses, but not too much if
done properly.  For instance, on CLPS711x stuff, we have a set of
standard definitions for the memory space assigned to each chip
select line, CS0_PHYS_BASE .. CS7_PHYS_BASE.  These are always
defined if you include <asm/hardware.h>.

As a result, you can do things like:

        unsigned long base = -1UL;

        if (machine_is_xyz() || machine_is_def())
                base = CS0_PHYS_BASE;
        if (machine_is_abc())
                base = CS1_PHYS_BASE;

        if (base == -1UL)
                return -ENODEV;

        ... request memory resource ...
        ... ioremap ...
        ... detect chip ...
        ... detect interrupt ...

See how much cleaner and more flexible the above is than:

        unsigned long base;

#ifdef CONFIG_ARCH_XYZ
        base = CS0_PHYS_BASE;
#elif defined(CONFIG_ARCH_ABC)
        base = CS1_PHYS_BASE;
#elif defined(CONFIG_ARCH_DEF)
        base = CS0_PHYS_BASE;
#else
        return -ENODEV;
#endif

        ... request memory resource ...
        ... ioremap ...
        ... detect chip ...
        ... detect interrupt ...

For existing drivers, the noises I've been getting from the Linux
community are that we should be looking at inb() and friends to
perform the necessary fixups for the architecture concerned, or
using a separate driver.

For instance, if you're using a LAN91C96 chip in byte mode, but
the driver uses inw(), Linux expects your implementation of inw()
to handle this by effectively performing two inb() and combining
the result.  The same applies for outw().  Where this is not possible
(because the chip has some special requirements about the ordering
of the reads), consultation with the driver maintainer wouldn't go
amiss - maybe the inw() in the code can change to two inb() calls
to make it more correct.
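
As a rough illustration of that kind of fixup, the sketch below
builds a 16-bit access out of two byte accesses.  It runs in user
space against a fake register array; fake_regs, bus_inb(), bus_outb(),
bus_inw() and bus_outw() are names invented for this example, not the
kernel's, and the low-byte-first ordering is an assumption that a
particular chip may not share.

```c
#include <stdint.h>

/* Hypothetical stand-in for a chip wired to a byte-wide bus. */
static uint8_t fake_regs[16];

/* Byte accessors: in a real port these would touch the hardware. */
static uint8_t bus_inb(unsigned int port)
{
        return fake_regs[port & 15];
}

static void bus_outb(uint8_t val, unsigned int port)
{
        fake_regs[port & 15] = val;
}

/*
 * 16-bit read built from two byte reads, low byte first.
 * The ordering is an assumption; check the chip's data sheet.
 */
static uint16_t bus_inw(unsigned int port)
{
        uint16_t lo = bus_inb(port);
        uint16_t hi = bus_inb(port + 1);

        return (uint16_t)(lo | (hi << 8));
}

/* 16-bit write split into two byte writes, low byte first. */
static void bus_outw(uint16_t val, unsigned int port)
{
        bus_outb((uint8_t)(val & 0xff), port);
        bus_outb((uint8_t)(val >> 8), port + 1);
}
```

With accessors like these behind inw() and outw(), the driver source
itself stays untouched.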

And now down to what I really want to say.  If something _can_ be
autoprobed (eg, contains an ident) and can safely be autoprobed,
then it's probably a good idea to autoprobe for it at all possible
addresses.  For instance, if you know that a device will probably
be connected to one of 3 chip select lines, and it has an ID
register at address offset 0x10, then:

1. try claiming the memory resource.  If this fails, try the next
   region.
2. try reading offset 0x10 of the region.  If this doesn't match,
   you know there definitely isn't a device of that type there.
3. see if it behaves like you expect (eg, try writing to some register
   and see if you get the expected feedback from the device).  If
   not, restore the old value, and move onto the next chip select
   region.
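
The three steps above could be sketched roughly as follows.  This is
a user-space model only: struct cs_region, claim_region(),
unclaim_region(), TEST_OFFSET and EXPECTED_ID are all invented for
the example.  A real driver would claim the memory resource and
ioremap the region instead of poking at an array.

```c
#include <stddef.h>
#include <stdint.h>

#define ID_OFFSET       0x10    /* ID register offset, as in the text */
#define TEST_OFFSET     0x04    /* hypothetical scratch register */
#define EXPECTED_ID     0x96    /* hypothetical ident value */

/* Hypothetical model of one chip select region. */
struct cs_region {
        int busy;               /* already claimed by someone else */
        uint8_t regs[0x20];     /* simulated device registers */
};

/* Returns 0 on success, -1 if the region is already in use. */
static int claim_region(struct cs_region *r)
{
        if (r->busy)
                return -1;
        r->busy = 1;
        return 0;
}

static void unclaim_region(struct cs_region *r)
{
        r->busy = 0;
}

/* Returns the index of the first region holding the device, or -1. */
static int probe_regions(struct cs_region *regions, size_t n)
{
        size_t i;

        for (i = 0; i < n; i++) {
                struct cs_region *r = &regions[i];

                /* 1. claim the resource; on failure try the next */
                if (claim_region(r) < 0)
                        continue;

                /* 2. the ident must match */
                if (r->regs[ID_OFFSET] == EXPECTED_ID) {
                        /* 3. write a pattern and check the feedback */
                        uint8_t old = r->regs[TEST_OFFSET];
                        int ok;

                        r->regs[TEST_OFFSET] = 0x5a;
                        ok = (r->regs[TEST_OFFSET] == 0x5a);
                        r->regs[TEST_OFFSET] = old;     /* restore */

                        if (ok)
                                return (int)i;  /* stays claimed */
                }

                unclaim_region(r);
        }

        return -1;
}
```

Nothing in the loop names a board, so moving the chip to another chip
select line needs no change here.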

What you'll notice with this is, if you happen to have to redesign
your board, you might need to do zero modifications to the kernel.
Someone else coming along with their board might not need to add
another set of ifdefs to the driver.  And so on.

We now get to the problem of GPIOs, like we have in the SA1100 L3 bus
drivers.  I think we've solved this adequately in the SA1100 design,
except for one small problem - if only the
GPIO_MY_MACHINE_TYPE_THIS_IS_THE_L3_MODE_SIGNAL definitions would
go away in favour of the generic ones, suddenly we lose all of those
ifdefs around the if (machine_is_xyz()) { } blocks -> cleaner code.

Lastly, we get to the unsolved problem of the other GPIOs used for
PCMCIA.  Firstly, I think that the SA1100 PCMCIA layer doesn't help
too much here - it could do with a lot more cleaning, especially
where we're passing structures that only contain one element and
a socket number around... (that's why the new socket_init() and
socket_suspend() calls take an 'int sock' argument.)

Many of the SA1100 PCMCIA drivers purely use GPIO.  It's
quite likely that a number of these could be combined, and the
differences selected via sets of data, in a similar way that the
SA1100 LCD stuff works.  I'm not saying that the SA1100 LCD stuff
is perfect mind you.
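
A minimal sketch of what "differences selected via sets of data"
might look like is below.  The structure, the board names and the
GPIO bit assignments are all made up for illustration, and the
active-low card detect is an assumption; none of it is taken from
the existing SA1100 code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-board description of the PCMCIA GPIO wiring. */
struct pcmcia_gpio_desc {
        const char *machine;    /* made-up board name */
        unsigned int detect;    /* GPIO bit: card detect */
        unsigned int ready;     /* GPIO bit: card ready */
        unsigned int reset;     /* GPIO bit: card reset */
};

/* The only per-board part: one table row each. */
static const struct pcmcia_gpio_desc pcmcia_boards[] = {
        { "board_xyz", 1u << 3,  1u << 4,  1u << 7  },
        { "board_abc", 1u << 10, 1u << 11, 1u << 12 },
};

/* Look up the wiring for a board; NULL if it is not listed. */
static const struct pcmcia_gpio_desc *
pcmcia_find_board(const char *machine)
{
        size_t i;
        size_t n = sizeof(pcmcia_boards) / sizeof(pcmcia_boards[0]);

        for (i = 0; i < n; i++)
                if (strcmp(pcmcia_boards[i].machine, machine) == 0)
                        return &pcmcia_boards[i];

        return NULL;
}

/*
 * One shared routine for every board.  Assumes card detect is
 * active low, which is an assumption made for this sketch.
 */
static int pcmcia_card_present(const struct pcmcia_gpio_desc *d,
                               unsigned int gpio_level)
{
        return (gpio_level & d->detect) == 0;
}
```

Adding a board then means adding a row to the table rather than
another near-identical driver file.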

With the exception of the GPIO stuff, I think we could probably
do a real good job of not needing so many machine type numbers.

On Thu, Apr 18, 2002 at 04:02:38PM -0500, Cam Mayor wrote:
> on the arm linux pages, under "machines", i see the cs89712 listed as a 
> machine (#92), but then it says that the information is for the demo board, 
> not for the cs89712 processor, itself.  (and the demoboard information listed 
> is not quite right)   Later, the cdb89712 (#107) is listed.  There is also a 
> (new?) cs89712-based unit (#118).  When our hardware is ready for primetime, 
> i expect to see it take a slot, too.

There is a problem with the machine type list, and it makes it
difficult to determine if an existing number is for your particular
hardware, or if it is compatible in some way.  The problems are:

1. The machine type list doesn't contain enough information.

2. The code from people doesn't get merged (or even sent in to be
   reviewed and merged) fast enough so other people can look at
   the code.

For all I know, CDB89712 and CS89712 are actually the same
hardware, and two people registered the entries around the
same time.

I'm sure this is going to spark a big long discussion, and I
hope that something positive comes out of that discussion.
Above all, I hope this provides something useful to people
looking to port Linux to new machines.

_______________________________________________
http://lists.arm.linux.org.uk/mailman/listinfo/linux-arm
http://www.arm.linux.org.uk/armlinux/mailinglists.php
Please visit the above addresses for information on this list.
