wide chars with 16 BITS_PER_UNIT

2007-03-30 Thread Thomas Gill

Hi there,

I maintain a GCC port for a small 16 bit processor called XAP2+. I'm
having problems with strings of wide characters.

I have the following defines, among others:

#define BITS_PER_UNIT   16
...
#define WCHAR_TYPE  "int"
#define WCHAR_TYPE_SIZE 16

So, I'm expecting char and wchar_t to both be 16 bits wide. This mostly
works fine. However, when outputting assembler for wide strings, it all
goes wrong. Internally, gcc stores the string as a char*, so a string
literal like this:

L"foo"

Is translated by libcpp into something like this:

"\0f\0o\0o"

The string gets passed around in a STRING_CST and varasm uses
ASM_OUTPUT_ASCII to write some assembler. However, at this point, I need
to do different things depending on whether I'm dealing with wide or
narrow characters (ie, wide characters have two octets packed into 16
bits, and narrow characters are {sign,zero} extended to 16 bits) and
that information seems to have been thrown away.
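
To make the distinction concrete, here is a minimal C sketch of the two
packings, assuming octet-sized host bytes, big-endian pairing for wide
strings, and zero extension for narrow ones. The helper names pack_narrow
and pack_wide_be are hypothetical and are not GCC functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, not part of GCC. The host buffer holds octets;
   each target unit is 16 bits wide. */

/* Narrow string: every host octet becomes one zero-extended 16-bit unit. */
static size_t pack_narrow(const unsigned char *src, size_t len, uint16_t *out)
{
    for (size_t i = 0; i < len; i++)
        out[i] = src[i];
    return len;
}

/* Wide string: every big-endian octet pair becomes one 16-bit unit. */
static size_t pack_wide_be(const unsigned char *src, size_t len, uint16_t *out)
{
    assert(len % 2 == 0);   /* wide data must come in whole octet pairs */
    size_t n = len / 2;
    for (size_t i = 0; i < n; i++)
        out[i] = (uint16_t)((src[2 * i] << 8) | src[2 * i + 1]);
    return n;
}
```

The same six octets "\0f\0o\0o" yield three units through pack_wide_be but
six units through pack_narrow, which is why the output routine needs to
know which kind of string it has been handed.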

I have a bodge, which involves looking at the type of the STRING_CST in
varasm, guessing if it's wide and maybe calling a new
assemble_string_wide function.

I'd be interested to hear any comments or suggestions.

I'm working with gcc 4.0.2.

Ned.


Re: wide chars with 16 BITS_PER_UNIT

2007-04-02 Thread Thomas Gill
On Fri, Mar 30, 2007 at 09:52:22AM -0700, Richard Henderson wrote:

> I think the problem is that we've not told libcpp what the correct
> narrow character set is.  I suggest adding something like
> 
>   if (BITS_PER_UNIT >= 32)
>     cpp_opts->narrow_charset = BYTES_BIG_ENDIAN ? "UTF-32BE" : "UTF-32LE";
>   else if (BITS_PER_UNIT >= 16)
>     cpp_opts->narrow_charset = BYTES_BIG_ENDIAN ? "UTF-16BE" : "UTF-16LE";
> 

Ah! That seems to do the trick. I'm still getting problems with numeric
escapes, but I notice this comment in emit_numeric_escape:

  /* Note: this code does not handle the case where the target
     and host have a different number of bits in a byte.  */

So my guess is that needs a fix too. I'm also seeing warnings from
character literals like:

 warning: character constant too long for its type

I should be able to chase this down too, though.
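
As a rough illustration of why numeric escapes are sensitive to the
host/target byte width, the following C sketch writes one escape value into
a host octet buffer using as many octets as a target char occupies. The
helper emit_escape_units is hypothetical and is not libcpp's
emit_numeric_escape; it only shows that a 16-bit target char needs two host
octets per escape, in the target's byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not libcpp code: store one numeric escape value in
   a host octet buffer, using target_char_bits / 8 octets. */
static size_t emit_escape_units(uint32_t value, unsigned target_char_bits,
                                int big_endian, unsigned char *buf)
{
    /* Assumption: the target char width is 8, 16, 24 or 32 bits. */
    assert(target_char_bits % 8 == 0);
    assert(target_char_bits >= 8 && target_char_bits <= 32);

    size_t nbytes = target_char_bits / 8;
    for (size_t i = 0; i < nbytes; i++) {
        /* Most significant octet first when big-endian, last otherwise. */
        size_t shift = 8 * (big_endian ? nbytes - 1 - i : i);
        buf[i] = (unsigned char)((value >> shift) & 0xff);
    }
    return nbytes;
}
```

With target_char_bits set to 8, the routine degenerates to the familiar
one-octet-per-escape behaviour.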


Thanks for the help,
Ned.


Different sized data and code pointers

2005-03-01 Thread Thomas Gill
Hi all.

I'm working on a GCC backend for a small embedded processor. We've got a
Harvard architecture with 16 bit data addresses and 24 bit code
addresses. How well does GCC support having different sized pointers for
this sort of thing? The macros POINTER_SIZE and Pmode seem to suggest
that there's one pointer size for everything.

The backend that I've inherited gets most of the way with some really
horrible hacks, but it would be nice if those hacks weren't necessary. 
In any case, the hacks don't cope with casting function pointers to 
integers.

Thanks,
Ned.
**
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
the system manager.
**


Re: Different sized data and code pointers

2005-03-02 Thread Thomas Gill
Paul Schlie wrote:

> With the arguable exception of function pointers (which need not be literal
> address) all pointers are presumed to point to data, not code; therefore
> may be simplest to define pointers as being 16-bits, and call functions
> indirectly through a lookup table constructed at link time from program
> memory, assuming it's readable via some mechanism; as the call penalty
> incurred would likely be insignificant relative to the potential complexity
> of attempting to support 24-bit code pointers in the rare circumstances
> they're typically used, on an otherwise native 16-bit machine.

Thanks for the response.

Suppose we don't have enough space to burn on a layer of indirection for
every function pointer. Do I take it that there's really not a clean way
to make GCC treat function pointers as 24 bit while still treating data
pointers as 16 bits?
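
The link-time table scheme suggested in the quoted message can be sketched
in plain C as follows. This is an illustration under stated assumptions,
not port code: the table is an ordinary array standing in for one the
linker would build, and fn_handle, fn_table and call_indirect are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*handler_fn)(int);

static int add_one(int x) { return x + 1; }
static int twice(int x)   { return 2 * x; }

/* Stand-in for the table of full-width code addresses that the linker
   would construct; here it is just a constant array. */
static const handler_fn fn_table[] = { add_one, twice };

/* A "function pointer" as the program sees it: a 16-bit table index. */
typedef uint16_t fn_handle;

/* Every indirect call pays for one extra table load. */
static int call_indirect(fn_handle h, int arg)
{
    assert(h < sizeof fn_table / sizeof fn_table[0]);
    return fn_table[h](arg);
}
```

The cost being asked about is visible here: one table slot per
address-taken function, plus the extra load on each indirect call.
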
Thanks,
Ned.



Re: Different sized data and code pointers

2005-03-03 Thread Thomas Gill
Paul Schlie wrote:

> the target ports are in gcc/config/...

Sure, I mean which target should I be looking at?

Ned.


Re: Different sized data and code pointers

2005-03-04 Thread Thomas Gill
Julian Brown wrote:

> FWIW, a port I did used indirection for all function pointers, albeit
> for a different reason, and I can report that it seems to work OK in
> practice with a little linker magic. It wasn't really production-quality
> code though, I admit.
>
> Perhaps the indirection table can safely hold only those functions whose
> address is taken? (Or maybe that was assumed anyway?)

A previous version of our back end did indeed use the indirection (data) 
method, and IIRC only stored pointers for address taken functions. 
However, we really don't have enough data space for that sort of thing. 
Especially since all the developers are used to using a compiler which 
does this `properly'.

Another possibility that we've discussed is putting an indirection for
function pointers at <64k in code space (ie, a load and a branch for 
each function).

However, either way requires us to modify all the obscure function 
pointer code in our firmware (eg, functions that get loaded from flash 
memory to ram before they're run) and we've got a hack that *nearly* 
works[1]. It worries me that doing anything other than leaving it alone 
is likely to introduce more bugs than it fixes. In fact I've now patched
up the slight breakage we were seeing in the hack, so it's even more 
tempting to just leave it alone again.

Ned.
[1] Ok, here's the hack. Get a bucket, you might be sick.

The header in the machine description has something like:

  extern tree type, decl;
  #define Pmode xap_pmode(type,decl)
  #define POINTER_SIZE xap_pointer_size(type,decl)

Those functions check to see if they're looking at the global type and
decl (which are defined to NULL in the C file), or if there's some other
type or decl in scope where the macro is used. If one of them is in
scope, and it's a function pointer type, then they return HImode or 32;
otherwise they return QImode or 16 (yes, QImode - our platform has 16
bit bytes).

It's possibly the worst hack I've ever had the misfortune to come 
across. I can't imagine how anyone ever thought it was a good idea. The 
worst thing of all is that it actually works quite well and since people
are already using it, I'm not going to be allowed to get rid of it until 
I come up with something that works equally well :(
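
The name-shadowing trick in footnote [1] can be mocked up outside GCC
roughly as follows. Everything here is a stand-in: struct mock_tree, enum
mock_mode and the two size_with_* functions are hypothetical, and the
bodies of xap_pmode and xap_pointer_size are a guess at the behaviour
described, not the real port code.

```c
#include <assert.h>
#include <stddef.h>

/* Mock stand-ins for GCC's tree nodes and machine modes (hypothetical). */
struct mock_tree { int is_function_pointer; };
typedef struct mock_tree *tree;
enum mock_mode { MOCK_QImode, MOCK_HImode };

/* File-scope sentinels: the macros see these when nothing shadows them. */
tree type = NULL, decl = NULL;

static int wants_code_pointer(tree t, tree d)
{
    return (t != NULL && t->is_function_pointer)
        || (d != NULL && d->is_function_pointer);
}

static enum mock_mode xap_pmode(tree t, tree d)
{
    return wants_code_pointer(t, d) ? MOCK_HImode : MOCK_QImode;
}

static int xap_pointer_size(tree t, tree d)
{
    return wants_code_pointer(t, d) ? 32 : 16;
}

/* The macros capture whatever `type` and `decl` are visible at the use. */
#define Pmode        xap_pmode(type, decl)
#define POINTER_SIZE xap_pointer_size(type, decl)

/* No local `type` here, so the macro picks up the NULL globals. */
static int size_with_globals(void) { return POINTER_SIZE; }

/* The parameter named `type` shadows the global, so the macro sees it. */
static int size_with_local(tree type) { return POINTER_SIZE; }
```

The result depends entirely on the spelling of a local variable at each
use of the macro, which is what makes the approach so fragile.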
