On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck <zad...@naturalbridge.com> wrote: > > On 10/23/2012 10:12 AM, Richard Biener wrote: >> >> On Tue, Oct 9, 2012 at 5:09 PM, Kenneth Zadeck <zad...@naturalbridge.com> >> wrote: >>> >>> This patch implements the wide-int class. this is a more general >>> version >>> of the double-int class and is meant to be the eventual replacement for >>> that >>> class. The use of this class removes all dependencies of the host from >>> the target compiler's integer math. >>> >>> I have made all of the changes i agreed to in the earlier emails. In >>> particular, this class internally maintains a bitsize and precision but >>> not >>> a mode. The class now is neutral about modes and tree-types. the >>> functions that take modes or tree-types are just convenience functions >>> that >>> translate the parameters into bitsize and precision and where ever there >>> is >>> a call that takes a mode, there is a corresponding call that takes a >>> tree-type. >>> >>> All of the little changes that richi suggested have also been made. >>> >>> The buffer sizes is now twice the size needed by the largest integer >>> mode. >>> This gives enough room for tree-vrp to do full multiplies on any type >>> that >>> the target supports. >>> >>> Tested on x86-64. >>> >>> This patch depends on the first three patches. I am still waiting on >>> final >>> approval on the hwint.h patch. >>> >>> Ok to commit? >> >> diff --git a/gcc/wide-int.h b/gcc/wide-int.h >> new file mode 100644 >> index 0000000..efd2c01 >> --- /dev/null >> +++ b/gcc/wide-int.h >> ... >> +#ifndef GENERATOR_FILE > > >> The whole file is guarded with that ... why? That is bound to be fragile >> once >> use of wide-int spreads? How do generator programs end up including >> this file if they don't need it at all? > > This is so that wide-int can be included at the level of the generators. > There some stuff that needs to see this type that is done during the build > build phase that cannot see the types that are included in wide-int.h. > >> +#include "tree.h" >> +#include "hwint.h" >> +#include "options.h" >> +#include "tm.h" >> +#include "insn-modes.h" >> +#include "machmode.h" >> +#include "double-int.h" >> +#include <gmp.h> >> +#include "insn-modes.h" >> + >> >> That's a lot of tree and rtl dependencies. double-int.h avoids these by >> placing conversion routines in different headers or by only resorting to >> types in coretypes.h. Please try to reduce the above to a minimum. >> >> + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / >> HOST_BITS_PER_WIDE_INT]; >> >> are we sure this rounds properly? Consider a port with max byte mode >> size 4 on a 64bit host. > > I do not believe that this can happen. The core compiler includes all > modes up to TI mode, so by default we already up to 128 bits.
And mode bitsizes are always power-of-two? I suppose so. >> I still would like to have the ability to provide specializations of >> wide_int >> for "small" sizes, thus ideally wide_int would be a template templated >> on the number of HWIs in val. Interface-wise wide_int<2> should be >> identical to double_int, thus we should be able to do >> >> typedef wide_int<2> double_int; > > If you want to go down this path after the patches get in, go for it. I > see no use at all for this. > This was not meant to be a plug in replacement for double int. This goal of > this patch is to get the compiler to do the constant math the way that the > target does it. Any such instantiation is by definition placing some > predefined limit that some target may not want. Well, what I don't really like is that we now have two implementations of functions that perform integer math on two-HWI sized integers. What I also don't like too much is that we have two different interfaces to operate on them! Can't you see how I come to not liking this? Especially the latter ... >> in double-int.h and replace its implementation with a specialization of >> wide_int. Due to a number of divergences (double_int is not a subset >> of wide_int) that doesn't seem easily possible (one reason is the >> ShiftOp and related enums you use). Of course wide_int is not a >> template either. For the hypotetical embedded target above we'd >> end up using wide_int<1>, a even more trivial specialization. >> >> I realize again this wide-int is not what your wide-int is (because you >> add a precision member). Still factoring out the "common"s of >> wide-int and double-int into a wide_int_raw <> template should be >> possible. >> >> +class wide_int { >> + /* Internal representation. */ >> + >> + /* VAL is set to a size that is capable of computing a full >> + multiplication on the largest mode that is represented on the >> + target. The full multiplication is use by tree-vrp. If >> + operations are added that require larger buffers, then VAL needs >> + to be changed. */ >> + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / >> HOST_BITS_PER_WIDE_INT]; >> + unsigned short len; >> + unsigned int bitsize; >> + unsigned int precision; >> >> The len, bitsize and precision members need documentation. At least >> one sounds redundant. >> >> + public: >> + enum ShiftOp { >> + NONE, >> NONE is never a descriptive name ... I suppose this is for arithmetic vs. >> logical shifts? > > suggest something Make it an overload instead of passing an extra argument. Or as I say make callers apply shift-count truncation separately. >> + /* There are two uses for the wide-int shifting functions. The >> + first use is as an emulation of the target hardware. The >> + second use is as service routines for other optimizations. The >> + first case needs to be identified by passing TRUNC as the value >> + of ShiftOp so that shift amount is properly handled according to >> the >> + SHIFT_COUNT_TRUNCATED flag. For the second case, the shift >> + amount is always truncated by the bytesize of the mode of >> + THIS. */ >> + TRUNC >> >> ah, no, it's for SHIFT_COUNT_TRUNCATED. "mode of THIS"? Now >> it's precision I suppose. That said, handling SHIFT_COUNT_TRUNCATED >> in wide-int sounds over-engineered, the caller should be responsible >> of applying SHIFT_COUNT_TRUNCATED when needed. > > I am fighting all of the modes out. i will update this patch with more > cleanups Thanks. >> + enum SignOp { >> + /* Many of the math functions produce different results depending >> + on if they are SIGNED or UNSIGNED. In general, there are two >> + different functions, whose names are prefixed with an 'S" and >> + or an 'U'. However, for some math functions there is also a >> + routine that does not have the prefix and takes an SignOp >> + parameter of SIGNED or UNSIGNED. */ >> + SIGNED, >> + UNSIGNED >> + }; >> >> double-int and _all_ of the rest of the middle-end uses a 'int uns' >> parameter >> for this. _Please_ follow that. Otherwise if you communicate between >> those interfaces you have to to uns ? UNSIGNED : SIGNED and >> signop == UNSIGNED ? 1 : 0 all over the place. > > I really do not want to. What i discovered is that some places in the > compiler do and some places do not and some places take the reverse > convention MNEMONIC is better than NUMERIC. I don't think that you scheme is an improvement given that GCC has a mix of both. I also fail to remember a place where it is not unsigned == 1, signed == 0 (which you btw reverse ... SIGNED == 0, UNSIGNED ==1, so even ugly casting doesn't allow moderating between the two schemes). >> >> + static wide_int from_shwi (HOST_WIDE_INT op0, unsigned int bitsize, >> + unsigned int precision); >> + static wide_int from_shwi (HOST_WIDE_INT op0, unsigned int bitsize, >> + unsigned int precision, bool *overflow); >> >> I suppose , bool *overflow = NULL would do as well? What's the >> distinction between bitsize and precision (I suppose, see the above >> question)? I suppose precision <= bitsize and bits above precision >> are sign/zero extended (and the rest of the val[] array contains undefined >> content?)? But we also have 'len', which then matches bitsize (well >> it may be larger). So IMHO either bitsize or len is redundant. At least >> the tree level nowhere considers partial integer modes special this way >> (only the precision is ever taken into account, but we always >> sign-/zero-extend >> to the whole double-int - thus 'len' in wide-int terms). > > Some operations, mostly shifting, needs both the bitsize and precision. > In the early days of the compiler, people pretty much ignored the precision > and most of the compiler math was done using the bitsize. This made it > very painful for the people who supported ports that had odd sized modes. > Bernd has been cleaning this up at the rtl level and the first 5 patches > move that forward. But you really do need both. I fail to understand. How do you require anything else than precision for shifting? Do you say that shifting a PSImode 24-bit precision integer right with a logical shift will pull in sign-bits that happen to be 'extended' into the upper 8 bits? Richard, can you clarify this please? I still see len == bitsize and thus a redundancy. >> + inline static wide_int from_hwi (HOST_WIDE_INT op0, const_tree type); >> + inline static wide_int from_hwi (HOST_WIDE_INT op0, const_tree type, >> + bool *overflow); >> >> Are you needing this overload or are you adding it for "completeness"? >> Because this interface is wrong(TM), and whoever calls it has at least >> cleanup opportunities ... from_tree or from_rtx makes sense. > > the functions are actually quite different. in general overflow checking > at least doubles the cost of implementation and sometimes it is much > greater. Having them be separate cleans up the implementation. Still in the above you take the actual value from op0, a HOST_WIDE_INT (with what precison / bitsize?), the wide_int will have precision / bitsize from the tree type argument (will op0 be truncated / extended here? How is *overflow computed?). Btw, if you have an overload like this please make 'overflow' a reference, it doesn't make sense to pass NULL here (just use the other overload). > > >> >> Also in which cases do you need "overflow"? a HWI always fits in >> a wide-int! All trees and rtxen do, too. You seem to merge two >> operations here, conversion to wide-int and truncation / extension. >> That doesn't look like a clean interface to me. > > This is the big difference between double_int and wide_int: I do not care > if it fits in the underlying representation, i care if it fits in the > precision of the type. If the type is for a char and the value is 100000, > overflow is set. In general setting overflow in the wide-int interface > means something very different from double-int interface. A large number > of places that check for overflow with double in do not need to check it for > wide int. Then in the above interface you really want to ask hwi_fits_type_p instead of globbing this into the construction of a wide-int. (This whole interface is really a mess - sorry) If overflow setting for wide-int means somehting very different from double-int then it shouldn't be called overflow or it should at least be documented! So - what does it actually mean? Double-ints implicitely have precision of 2*HWI so overflow is set whenever the operation overflows that precision. It doesn't make sense to ask if a HWI fits in a double-int for example. >> + static wide_int from_double_int (enum machine_mode, double_int); >> >> the choice of passing a mode seems arbitrary (the natural interface >> would be nothing - precision is 2 * HWI). Passing it as first parameter >> is even more strange to me ;) > > the first part of the question is answered above. the second part of the > question was fixed on my private tree a few days ago and will get pushed > out. Thanks. >> + static wide_int from_tree (const_tree); >> + static wide_int from_rtx (const_rtx, enum machine_mode); >> >> + HOST_WIDE_INT to_shwi (unsigned int prec) const; >> >> See above - merges two basic operations. You should write >> >> w.sext (prec).to_shwi () >> >> instead (I _suppose_ it should sign-extend, should it? ;)). Btw, why >> don't we need to always specify bitsize together with precision in all >> the places? (not that I am arguing for it, I'm arguing for the >> removal of bitsize) > > because the bitsize and precision are part of the representation of the > value. You only have to specify them on the way into wide-int or if you > need to change them (this is rare but it does happen). My objection to to_shwi still stands. to_shwi should be HOST_WIDE_INT to_shwi () const; and it could gcc_assert () that the wide-int fits in a shwi (you should be able to see a pattern in my complaints ...) >> + static wide_int max_value (unsigned int bitsize, unsigned int prec, >> SignOp sgn); >> >> now that I am seeing this - is there any restriction on how the precision >> of a partial integer mode may differ from its bitsize? Can we have >> POImode with 1 bit precision? I suppose the solution for all this is >> that when converting a wide-int to a RTX with a mode then we need to >> zero-/sign-extend to the modes bitsize (and wide-int only cares about >> precision). Eventually a set_len can adjust the amount of BITS_PER_UNITs >> we fill with meaningful values if needed. Otherwise len == precision >> / BITS_PER_UNIT (rounded to HWI for obvious practical reasons). > > the precision must be less than or equal to the bitsize. That is the only > restriction. > I do not know if you can have poi1? Every case that i have seen, the > partial int is a partial of the next largest power of 2 mode. But there is > nothing in wide-int that cares about this. > > >> >> + inline static wide_int minus_one (unsigned int bitsize, unsigned int >> prec); >> + inline static wide_int minus_one (const_tree type); >> + inline static wide_int minus_one (enum machine_mode mode); >> + inline static wide_int zero (unsigned int bitsize, unsigned int prec); >> + inline static wide_int zero (const_tree type); >> + inline static wide_int zero (enum machine_mode mode); >> + inline static wide_int one (unsigned int bitsize, unsigned int prec); >> + inline static wide_int one (const_tree type); >> + inline static wide_int one (enum machine_mode mode); >> + inline static wide_int two (unsigned int bitsize, unsigned int prec); >> + inline static wide_int two (const_tree type); >> + inline static wide_int two (enum machine_mode mode); >> + inline static wide_int ten (unsigned int bitsize, unsigned int prec); >> + inline static wide_int ten (const_tree type); >> + inline static wide_int ten (enum machine_mode mode); >> >> wheeee .... ;) > > yes, and they are all used. > >> What's wrong with from_uhwi (10, ...)? double-int has the above for >> compatibility reasons only. And why should I care about type/mode/size >> for something as simple as '1'? > > the 10 is an interesting case. at least in my patches it is not used, but > i had put it in because i started from double-int and it has it. However, > i believe in fat api's if something gets used a lot, then it should be part > of the api. all of them except for 10 are used a lot. > > I will point out than in my original patch, these were just macros that > expanded into from_uhwi, but richi wanted them to be real functions. Yeah, but now with all the fancy overloads (but see my other suggestion below). > > > >> + inline unsigned short get_len () const; >> + inline unsigned int get_bitsize () const; >> + inline unsigned int get_precision () const; >> + inline unsigned int get_full_len () const; >> >> not sure which air you are pulling full_len from ;) > > when you need it, you need it. the dwarf writer really needs it, because > it wants to see all of the words on a multiword value, not just the ones > that "need" to be represented so that it is easy to read. > > I have a big comment on when not to use this in my tree. + /* VRP appears to be badly broken and this is a very ugly fix. */ + if (i >= 0) + return val[i] >> (HOST_BITS_PER_WIDE_INT - 1); Err?? You mean you are getting array-bound warnings or what? >> + wide_int force_to_size (unsigned int bitsize, >> + unsigned int precision) const; >> >> or rather 'trunc'? Seems to be truncation and set_len combined? > > why do you think it is only shortening it? Because it does not say whether you sign- or zero-extend: +/* Copy THIS replacing the mode with MODE. */ Fix comment. +wide_int +wide_int::force_to_size (unsigned int bs, unsigned int prec) const +{ + wide_int result; + int small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1); + int blocks_needed = BLOCKS_NEEDED (prec); + int i; + + result.bitsize = bs; + result.precision = prec; + result.len = blocks_needed < len ? blocks_needed : len; + for (i = 0; i < result.len; i++) + result.val[i] = val[i]; + + if (small_prec & (blocks_needed == len)) what? small_prec seems to tell you whether precision is a multiple of a HWI, now you only look at bit 1 (so it's equal to precision & (blocks_needed == len). + result.val[blocks_needed-1] + = sext_hwi (result.val[blocks_needed-1], small_prec); But I really think you want sth else... maybe &&? You also unconditionally sign-extend, so the function name should somehow reflect this. + return result; +} > >> >> I wonder if for the various ways to specify precision/len there is a nice >> C++ >> way of moving this detail out of wide-int. I can think only of one: >> >> struct WIntSpec { >> WIntSpec (unsigned int len, unsigned int precision); >> WIntSpec (const_tree); >> WIntSpec (enum machine_mode); >> unsigned int len; >> unsigned int precision; >> }; >> >> and then (sorry to pick one of the less useful functions): >> >> inline static wide_int zero (WIntSpec) > > It depends on what you have available in your hands when you need to call a > function. At the rtl level we almost never have tree types, but we have > modes. At the tree level, you almost never have modes. In general the > convenience functions take anything and just extract the prec and bitsize > for you. But there are several places that need to specify the prec and > bitsize and so these are now the base primitives. You always have prec and bitsize. GET_MODE_PRECISION and GET_MODE_BITSIZE, TYPE_PRECISION and TYPE_SIZE tell you this. The other overloads are for convenience only. You didn't respond to the idea of making the interface less cluttered by using some abstraction. >> which you should be able to call like >> >> wide_int::zero (SImode) >> wide_int::zero (integer_type_node) >> >> and (ugly) >> >> wide_int::zero (WIntSpec (32, 32)) >> >> with C++0x wide_int::zero ({32, 32}) should be possible? Or we keep >> the precision overload. At least providing the WIntSpec abstraction >> allows custom ways of specifying required bits to not pollute wide-int >> itself too much. Lawrence? >> >> + /* Printing functions. */ >> + >> + void print_dec (char *buf, SignOp sgn) const; >> + void print_dec (FILE *file, SignOp sgn) const; >> + void print_decs (char *buf) const; >> + void print_decs (FILE *file) const; >> + void print_decu (char *buf) const; >> + void print_decu (FILE *file) const; >> + void print_hex (char *buf) const; >> + void print_hex (FILE *file) const; >> >> consider moving them to standalone functions, out of wide-int.h > > I do not see much reason to do this. They use the internals of wide-int > and moving them somewhere else is just exposing the internals for no real > reason. All internals are exposed already. >> >> + inline bool minus_one_p () const; >> + inline bool zero_p () const; >> + inline bool one_p () const; >> + inline bool neg_p () const; >> >> what's wrong with w == -1, w == 0, w == 1, etc.? > > I would love to do this and you seem to be somewhat knowledgeable of c++. > But i cannot for the life of me figure out how to do it. > > say i have a TImode number, which must be represented in 4 ints on a 32 bit > host (the same issue happens on 64 bit hosts, but the examples are simpler > on 32 bit hosts) and i compare it to -1. The value that i am going to see > as the argument of the function is going have the value 0xffffffff. > but the value that i have internally is 128 bits. do i take this and 0 or > sign extend it? in particular if someone wants to compare a number to > 0xdeadbeef i have no idea what to do. I tried defining two different > functions, one that took a signed and one that took and unsigned number but > then i wanted a cast in front of all the positive numbers. > > If there is a way to do this, then i will do it, but it is going to have to > work properly for things larger than a HOST_WIDE_INT. > > I know that double-int does some of this and it does not carry around a > notion of signedness either. is this just code that has not been fully > tested or is there a trick in c++ that i am missing? Not sure, but at least minus_one_p, zero_p and one_p can be implemented with a signed HWI overload of operator== But I suppose leaving things as-is is fine as well. >> + bool only_sign_bit_p (unsigned int prec) const; >> + bool only_sign_bit_p () const; >> >> what's that? Some of the less obvious functions should be documented >> in the header I think. Smells of combining two things again here. >> Either wide-int has an intrinsic precision or it has not ... (like >> double-int). > > > Again, i have put in things that are useful, it is all driven from what the > clients need. This is done all over the back end. What's the current function that is called? >> >> + bool fits_u_p (unsigned int prec) const; >> + bool fits_s_p (unsigned int prec) const; >> >> See above. >> >> + /* Extension */ >> + >> + inline wide_int ext (unsigned int offset, SignOp sgn) const; >> + wide_int sext (unsigned int offset) const; >> + wide_int sext (enum machine_mode mode) const; >> + wide_int zext (unsigned int offset) const; >> + wide_int zext (enum machine_mode mode) const; >> >> 'offset'? I suppose that's 'precision'. Does that alter the >> precision of *this? >> I think it should (and thus there should be no set_precision function). >> If it doesn't alter precision the functions don't make much sense to me. > > The ones that alter the precision take a precision and bitsize, the ones > that just want do the extension from some place and end up with a bit > pattern that looks a certain way take the offset. Different semantics for a function overload is ... ugly. > >> + wide_int set_bit (unsigned int bitpos) const; >> >> this kind of interface is strange. You call it like w.set_bit (1) but it >> doesn't actually set bit 1 in w but it constructs a new wide_int and >> returns that. So I suppose it should be >> >> wide_int with_bit_set (unsigned int bitpos) const; > > the interface is pure. if you want me to change the name, that is fine. > >> >> or similar. Or simply have a mutating set_bit. Or leave it out entierly, >> we cannot have many uses of this kind of weird interface. >> >> similar comments for the rest. >> >> .... rest skipped ... >> >> + / HOST_BITS_PER_WIDE_INT + 32)); >> + char *dump (char* buf) const; >> + private: >> + >> + /* Private utility routines. */ >> + wide_int decompress (unsigned int target, unsigned int bitsize, >> + unsigned int precision) const; >> + static wide_int add_overflow (const wide_int *op0, const wide_int *op1, >> + wide_int::SignOp sgn, bool *overflow); >> + static wide_int sub_overflow (const wide_int *op0, const wide_int *op1, >> + wide_int::SignOp sgn, bool *overflow); >> +}; >> >> >> IMHO way too many functions for a well tested initial implementation. >> There are a lot of things that seem operation compositions. Is your >> concern efficiency here? That would be bad as that means wide_ints >> are too heavy weight. > > There are two sides of ease of use, you can force people to write out > everything using a few primitives or you give them a rich interface. I > come from the rich interface school. > > If i was just going out and selling a new interface, something clean and > small would be easier to sell. However, that is not what i am doing. I > have converted substantially the entire back end to this and in the next few > days i will submit patches that do the tree level. So i am a big user and > a rich api makes that conversion much easier. Maybe for you, but not for people that come looking for a way to do something with that interface. They see a galore of functions with similar names and weird arguments. They try to figure out which to pick and get confused. They end up resorting to the _implementation_ :( The interface is, frankly, a mess. And I don't agree that it is a pre-existing interface. The wide-int class should have a clean and small interface. You can write the rich interface as standalone functions, the clean and small interface should provide all building blocks you need. > Remember that these patches are not syntactic changes like the conversion of > double-int to use c++ interfaces. I am actually converting most of the > code that only does transformations if the value fits in some fixed number > of HWIs to work at the targets precision. My motivation is that GCC does > not actually work correctly with larger types. I will do what it takes to > get these patches in an acceptable form to the api police but my motivations > is that the compiler is now neither correct nor robust with 128 bit types > and above. Well, you may gain that things work with 128 bit and above types, but at the same time GCC will be a more confusing place to work with. One general implementation detail comment: +class wide_int { + /* Internal representation. */ + + /* VAL is set to a size that is capable of computing a full + multiplication on the largest mode that is represented on the + target. The full multiplication is use by tree-vrp. If + operations are added that require larger buffers, then VAL needs + to be changed. */ + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; + unsigned short len; + unsigned int bitsize; + unsigned int precision; put VAL last so one can allocate less storage, thus it should be a trailing array. Make all of wide-int have non-mutating 'len' (that's required to make variable storage size for VAL viable). Thus, const unsigned int len; unsigned int bitsize; unsigned int precision; HOST_WIDE_INT val[1 /* len */]; that is, how much of the wide-int implementation is mutating? (I'd even say bitsize and precision should be 'const'). Then of couse I'd factor out the bitsize / precision notion into a wrapper to be able to do the double-int sharing I am after... Richard. > kenny > > >> Can you use gcov to see which functions have (how much) coverage? >> >> Thanks, >> Richard. >> >> >> >>> kenny >>> >>> >