wide chars with 16 BITS_PER_UNIT
Hi there,

I maintain a GCC port for a small 16-bit processor called XAP2+. I'm having problems with strings of wide characters. I have the following defines, among others:

    #define BITS_PER_UNIT 16
    ...
    #define WCHAR_TYPE "int"
    #define WCHAR_TYPE_SIZE 16

So I'm expecting char and wchar_t to both be 16 bits wide. This mostly works fine. However, when outputting assembler for wide strings, it all goes wrong. Internally, GCC stores the string as a char*, so a string literal like this:

    L"foo"

is translated by libcpp into something like this:

    "\0f\0o\0o"

The string gets passed around in a STRING_CST, and varasm uses ASM_OUTPUT_ASCII to write some assembler. However, at this point I need to do different things depending on whether I'm dealing with wide or narrow characters (i.e., wide characters have two octets packed into 16 bits, and narrow characters are {sign,zero} extended to 16 bits), and that information seems to have been thrown away.

I have a bodge, which involves looking at the type of the STRING_CST in varasm, guessing whether it's wide, and maybe calling a new assemble_string_wide function.

I'd be interested to hear any comments or suggestions. I'm working with GCC 4.0.2.

Ned.
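To make the wide/narrow distinction above concrete, here is a minimal sketch in plain C. It is not GCC code: `pack_units` and its parameters are hypothetical names, the high-octet-first order is an assumption read off the "\0f\0o\0o" example, and zero extension is picked arbitrarily for the narrow case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of GCC): turn the octet buffer of a string
   constant into 16-bit target units.  Returns the number of units written.
   The caller must supply room for n_octets units in the worst case. */
size_t pack_units(const unsigned char *octets, size_t n_octets,
                  int is_wide, uint16_t *units)
{
    size_t i, n_units = 0;

    if (is_wide) {
        /* Wide string: two octets are packed into each 16-bit unit.
           Assumption: the high octet comes first, as in "\0f\0o\0o". */
        assert(n_octets % 2 == 0);
        for (i = 0; i < n_octets; i += 2)
            units[n_units++] = (uint16_t)((octets[i] << 8) | octets[i + 1]);
    } else {
        /* Narrow string: each octet is zero-extended to a full unit. */
        for (i = 0; i < n_octets; i++)
            units[n_units++] = octets[i];
    }
    return n_units;
}
```

With the six octets of "\0f\0o\0o" this yields three units when `is_wide` is set and six units otherwise, which is exactly the information that is no longer available by the time ASM_OUTPUT_ASCII sees the buffer.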
Re: wide chars with 16 BITS_PER_UNIT
On Fri, Mar 30, 2007 at 09:52:22AM -0700, Richard Henderson wrote:
> I think the problem is that we've not told libcpp what the correct
> narrow character set is. I suggest adding something like
>
>   if (BITS_PER_UNIT >= 32)
>     cpp_opts->narrow_charset = BYTES_BIG_ENDIAN ? "UTF-32BE" : "UTF-32LE";
>   else if (BITS_PER_UNIT >= 16)
>     cpp_opts->narrow_charset = BYTES_BIG_ENDIAN ? "UTF-16BE" : "UTF-16LE";
>

Ah! That seems to do the trick.

I'm still getting problems with numeric escapes, but I notice this comment in emit_numeric_escape:

    /* Note: this code does not handle the case where the target
       and host have a different number of bits in a byte. */

So my guess is that it needs a fix too.

I'm also seeing warnings from character literals like:

    warning: character constant too long for its type

I should be able to chase this down too, though.

Thanks for the help,

Ned.
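The selection logic in the quoted suggestion can be tried outside GCC as a small standalone function. This is only a sketch: `pick_narrow_charset` is a hypothetical name, the two macros are passed in as ordinary parameters, and returning NULL to mean "leave the existing default alone" is an assumption of this example.

```c
#include <stddef.h>

/* Hypothetical standalone version of the quoted selection logic.
   Returns the charset name to use for narrow strings, or NULL when the
   unit is narrower than 16 bits and the default should be kept. */
const char *pick_narrow_charset(int bits_per_unit, int bytes_big_endian)
{
    if (bits_per_unit >= 32)
        return bytes_big_endian ? "UTF-32BE" : "UTF-32LE";
    if (bits_per_unit >= 16)
        return bytes_big_endian ? "UTF-16BE" : "UTF-16LE";
    return NULL; /* assumption: no override for ordinary 8-bit units */
}
```

For a target with 16-bit units the function picks one of the two UTF-16 variants according to endianness, so every narrow character fills one whole unit.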
Different sized data and code pointers
Hi all.

I'm working on a GCC backend for a small embedded processor. We've got a Harvard architecture with 16-bit data addresses and 24-bit code addresses. How well does GCC support having different sized pointers for this sort of thing? The macros POINTER_SIZE and Pmode seem to suggest that there's one pointer size for everything.

The backend that I've inherited gets most of the way with some really horrible hacks, but it would be nice if those hacks weren't necessary. In any case, the hacks don't cope with casting function pointers to integers.

Thanks,

Ned.

** This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the system manager. **
Re: Different sized data and code pointers
Paul Schlie wrote:
> With the arguable exception of function pointers (which need not be
> literal address) all pointers are presumed to point to data, not code;
> therefore may be simplest to define pointers as being 16-bits, and call
> functions indirectly through a lookup table constructed at link time
> from program memory, assuming it's readable via some mechanism; as the
> call penalty incurred would likely be insignificant relative to the
> potential complexity of attempting to support 24-bit code pointers in
> the rare circumstances they're typically used, on an otherwise native
> 16-bit machine.

Thanks for the response.

Suppose we don't have enough space to burn on a layer of indirection for every function pointer. Do I take it that there's really not a clean way to make GCC treat function pointers as 24 bit while still treating data pointers as 16 bits?

Thanks,

Ned.
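For readers unfamiliar with the indirection scheme being discussed, a minimal sketch in portable C follows. Everything in it is hypothetical: in a real port the table would be built by the linker and live wherever the target can read it, whereas here it is an ordinary array, and `fn_handle`, `fn_table`, and `call_via_handle` are made-up names.

```c
#include <stdint.h>

typedef int (*handler_fn)(int);

static int twice(int x)  { return 2 * x; }
static int negate(int x) { return -x; }

/* Stand-in for the link-time table: one entry per address-taken function.
   Each entry costs storage, which is the space concern raised above. */
static const handler_fn fn_table[] = { twice, negate };

/* A "function pointer" is then just a 16-bit index into the table. */
typedef uint16_t fn_handle;

int call_via_handle(fn_handle h, int arg)
{
    return fn_table[h](arg); /* one extra load per indirect call */
}
```

The handle fits in a data-sized pointer, at the price of one table slot per function whose address is taken and one extra load on every indirect call.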
Re: Different sized data and code pointers
Paul Schlie wrote:
> the target ports are in gcc/config/...

Sure, I mean which target should I be looking at?

Ned.
Re: Different sized data and code pointers
Julian Brown wrote:
> FWIW, a port I did used indirection for all function pointers, albeit
> for a different reason, and I can report that it seems to work OK in
> practice with a little linker magic. It wasn't really
> production-quality code though, I admit. Perhaps the indirection table
> can safely hold only those functions whose address is taken? (Or maybe
> that was assumed anyway?)

A previous version of our back end did indeed use the indirection (data) method, and IIRC only stored pointers for address-taken functions. However, we really don't have enough data space for that sort of thing, especially since all the developers are used to using a compiler which does this `properly'.

Another possibility that we've discussed is putting an indirection for function pointers at <64k in code space (i.e., a load and a branch for each function).

However, either way requires us to modify all the obscure function pointer code in our firmware (e.g., functions that get loaded from flash memory to RAM before they're run), and we've got a hack that *nearly* works[1]. It worries me that doing anything other than leaving it alone is likely to introduce more bugs than it fixes. In fact I've now patched up the slight breakage we were seeing in the hack, so it's even more tempting to just leave it alone again.

Ned.

[1] OK, here's the hack. Get a bucket, you might be sick.

The header in the machine description has something like:

    extern tree type, decl;
    #define Pmode xap_pmode(type,decl)
    #define POINTER_SIZE xap_pointer_size(type,decl)

Those functions check to see if they're looking at the global type and decl (which are defined to NULL in the C file), or if there's some other type or decl in scope where the macro's used. If one of them is in scope, and it's a function pointer type, then they return HImode or 32; otherwise they return QImode or 16 (yes, QImode: our platform has 16-bit bytes).

It's possibly the worst hack I've ever had the misfortune to come across. I can't imagine how anyone ever thought it was a good idea. The worst thing of all is that it actually works quite well, and since people are already using it, I'm not going to be allowed to get rid of it until I come up with something that works equally well :(
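The scoping trick behind the hack in footnote [1] can be reproduced in a few lines of ordinary C. This is only a sketch under stated assumptions: `struct fake_tree` and its `is_function_pointer` flag are hypothetical stand-ins for GCC's real tree nodes, and only the POINTER_SIZE half is shown.

```c
#include <stddef.h>

/* Hypothetical stand-in for a GCC tree node. */
struct fake_tree { int is_function_pointer; };
typedef struct fake_tree *tree;

/* The globals the macro falls back to when nothing shadows them. */
tree type = NULL;
tree decl = NULL;

int xap_pointer_size(tree t, tree d)
{
    /* 32 if either visible node is a function pointer type, else 16. */
    if ((t != NULL && t->is_function_pointer) ||
        (d != NULL && d->is_function_pointer))
        return 32;
    return 16;
}

/* The macro names whatever `type` and `decl` are visible at the use site. */
#define POINTER_SIZE xap_pointer_size(type, decl)

/* No local `type` here, so the NULL global is seen. */
int size_seen_without_local(void) { return POINTER_SIZE; }

/* The parameter shadows the global, so the macro inspects it instead. */
int size_seen_with_local(tree type) { return POINTER_SIZE; }
```

The result depends entirely on which identifiers happen to be in scope where the macro is expanded, which is why the approach is so fragile: renaming a local variable silently changes the pointer size the compiler assumes.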