[PATCH] D53738: [Fixed Point Arithmetic] Fixed Point Addition

John McCall via Phabricator via cfe-commits Wed, 05 Dec 2018 07:59:14 -0800

rjmccall added inline comments.


================
Comment at: clang/lib/Sema/SemaExpr.cpp:1304
+    RHSTy = ResultTy;
+  }
+
----------------
ebevhan wrote:
> rjmccall wrote:
> > leonardchan wrote:
> > > rjmccall wrote:
> > > > leonardchan wrote:
> > > > > rjmccall wrote:
> > > > > > leonardchan wrote:
> > > > > > > rjmccall wrote:
> > > > > > > > ebevhan wrote:
> > > > > > > > > rjmccall wrote:
> > > > > > > > > > leonardchan wrote:
> > > > > > > > > > > rjmccall wrote:
> > > > > > > > > > > > ebevhan wrote:
> > > > > > > > > > > > > rjmccall wrote:
> > > > > > > > > > > > > > Hmm.  So adding a signed integer to an unsigned 
> > > > > > > > > > > > > > fixed-point type always converts the integer to 
> > > > > > > > > > > > > > unsigned?  That's a little weird, but only because 
> > > > > > > > > > > > > > the fixed-point rule seems to be the other way.  
> > > > > > > > > > > > > > Anyway, I assume it's not a bug in the spec.
> > > > > > > > > > > > > `handleFixedPointConversion` only calculates the 
> > > > > > > > > > > > > result type of the expression, not the calculation 
> > > > > > > > > > > > > type. The final result type of the operation will be 
> > > > > > > > > > > > > the unsigned fixed-point type, but the calculation 
> > > > > > > > > > > > > itself will be done in a signed type with enough 
> > > > > > > > > > > > > precision to represent both the signed integer and 
> > > > > > > > > > > > > the unsigned fixed-point type. 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Though, if the result ends up being negative, the 
> > > > > > > > > > > > > final result is undefined unless the type is 
> > > > > > > > > > > > > saturating. I don't think there is a test for the 
> > > > > > > > > > > > > saturating case (signed integer + unsigned saturating 
> > > > > > > > > > > > > fixed-point) in the SaturatedAddition tests.
> > > > > > > > > > > > > `handleFixedPointConversion` only calculates the 
> > > > > > > > > > > > > result type of the expression, not the calculation 
> > > > > > > > > > > > > type.
> > > > > > > > > > > > 
> > > > > > > > > > > > Right, I understand that, but the result type of the 
> > > > > > > > > > > > expression obviously impacts the expressive range of 
> > > > > > > > > > > > the result, since you can end up with a negative value.
> > > > > > > > > > > > 
> > > > > > > > > > > > Hmm.  With that said, if the general principle is to 
> > > > > > > > > > > > perform the operation with full precision on the 
> > > > > > > > > > > > original operand values, why are unsigned fixed-point 
> > > > > > > > > > > > operands coerced to their corresponding signed types 
> > > > > > > > > > > > *before* the operation?  This is precision-limiting if 
> > > > > > > > > > > > the unsigned representation doesn't use a padding bit.  
> > > > > > > > > > > > That seems like a bug in the spec.
> > > > > > > > > > > > Hmm. With that said, if the general principle is to 
> > > > > > > > > > > > perform the operation with full precision on the 
> > > > > > > > > > > > original operand values, why are unsigned fixed-point 
> > > > > > > > > > > > operands coerced to their corresponding signed types 
> > > > > > > > > > > > *before* the operation? This is precision-limiting if 
> > > > > > > > > > > > the unsigned representation doesn't use a padding bit. 
> > > > > > > > > > > > That seems like a bug in the spec.
> > > > > > > > > > > 
> > > > > > > > > > > Possibly. When the standard mentions converting from 
> > > > > > > > > > > signed to unsigned fixed point, the only requirement 
> > > > > > > > > > > involved is conservation of magnitude (the number of 
> > > > > > > > > > > integral bits excluding the sign)
> > > > > > > > > > > 
> > > > > > > > > > > ```
> > > > > > > > > > > when signed and unsigned fixed-point types are mixed, the 
> > > > > > > > > > > unsigned type is converted to the corresponding signed 
> > > > > > > > > > > type, and this should go without loss of magnitude
> > > > > > > > > > > ```
> > > > > > > > > > > 
> > > > > > > > > > > This is just speculation, but I'm under the impression 
> > > > > > > > > > > that not as much "attention" was given for unsigned types 
> > > > > > > > > > > as for signed types since "`In the embedded processor 
> > > > > > > > > > > world, support for unsigned fixed-point data types is 
> > > > > > > > > > > rare; normally only signed fixed-point data types are 
> > > > > > > > > > > supported`", but was kept anyway since unsigned types are 
> > > > > > > > > > > used a lot.
> > > > > > > > > > > 
> > > > > > > > > > > ```
> > > > > > > > > > > However, to disallow unsigned fixed-point arithmetic from 
> > > > > > > > > > > programming languages (in general, and from C in 
> > > > > > > > > > > particular) based on this observation, seems overly 
> > > > > > > > > > > restrictive.
> > > > > > > > > > > ```
> > > > > > > > > > > 
> > > > > > > > > > > I also imagine that if the programmer needs more 
> > > > > > > > > > > precision, the correct approach would be to cast up to a 
> > > > > > > > > > > type with a higher scale. The standard seems to make an 
> > > > > > > > > > > effort to expose as much in regards to the underlying 
> > > > > > > > > > > fixed point types as possible:
> > > > > > > > > > > 
> > > > > > > > > > > ```
> > > > > > > > > > > it should be possible to write fixed-point algorithms 
> > > > > > > > > > > that are independent of the actual fixed-point hardware 
> > > > > > > > > > > support. This implies that a programmer (or a running 
> > > > > > > > > > > program) should have access to all parameters that define 
> > > > > > > > > > > the behavior of the underlying hardware (in other words: 
> > > > > > > > > > > even if these parameters are implementation-defined).
> > > > > > > > > > > ```
> > > > > > > > > > > 
> > > > > > > > > > > So the user would at least know that unsigned types may 
> > > > > > > > > > > not have the same scale as their corresponding signed 
> > > > > > > > > > > types if the hardware handles them with different scales.
> > > > > > > > > > > 
> > > > > > > > > > > Also added test for signed integer + unsigned saturating 
> > > > > > > > > > > fixed point.
> > > > > > > > > > As long as we maintain the same typing behavior, does the 
> > > > > > > > > > standard permit us to just Do The Right Thing here and do 
> > > > > > > > > > the extended arithmetic with the original unsigned operand? 
> > > > > > > > > >  I'm sure there are some cases where we would produce a 
> > > > > > > > > > slightly different value than an implementation that does 
> > > > > > > > > > the coercion before the operation, but that might be 
> > > > > > > > > > permitted under the standard, and as you say, it would only 
> > > > > > > > > > affect some situations that it doesn't seem the standard 
> > > > > > > > > > has given much thought to.
> > > > > > > > > The coercion of unsigned to signed is likely done for 
> > > > > > > > > pragmatic reasons; if you have a signed and unsigned type of 
> > > > > > > > > the same rank, operating on them together won't require any 
> > > > > > > > > 'extra bits'. This means that if your hardware has native 
> > > > > > > > > support for the operations, you won't end up in a situation 
> > > > > > > > > where you really need 33 or 34 bits (for a 32 bit value) to 
> > > > > > > > > actually perform the operation. This is one of the benefits 
> > > > > > > > > of these kinds of implicit conversions.
> > > > > > > > > 
> > > > > > > > > It probably makes the implementation a bit easier as well, 
> > > > > > > > > since you don't have to consider cases where both sides have 
> > > > > > > > > different signedness. The loss of the extra bit of precision 
> > > > > > > > > from the unsigned operand is probably not considered to be 
> > > > > > > > > very important. As Leonard said, if you wanted higher 
> > > > > > > > > precision you can simply use the next rank of types. ...Of 
> > > > > > > > > course, they seem to have forgotten that that exact issue 
> > > > > > > > > with differing signedness also applies to operations with 
> > > > > > > > > fixed-point and integer operands.
> > > > > > > > > 
> > > > > > > > > Personally, I think the spec could be a bit more restrictive 
> > > > > > > > > with how it handles these conversions, as the way they are 
> > > > > > > > > now makes it really hard to generate code sensibly for 
> > > > > > > > > machines that have native support for these kinds of 
> > > > > > > > > operations; in many cases you'll end up requiring a couple 
> > > > > > > > > bits too many or a couple bits too few to fit in a sensible 
> > > > > > > > > register size, and ensuring that the scales match what your 
> > > > > > > > > hardware can handle is also a challenge.
> > > > > > > > > The coercion of unsigned to signed is likely done for 
> > > > > > > > > pragmatic reasons; if you have a signed and unsigned type of 
> > > > > > > > > the same rank, operating on them together won't require any 
> > > > > > > > > 'extra bits'. This means that if your hardware has native 
> > > > > > > > > support for the operations, you won't end up in a situation 
> > > > > > > > > where you really need 33 or 34 bits (for a 32 bit value) to 
> > > > > > > > > actually perform the operation. This is one of the benefits 
> > > > > > > > > of these kinds of implicit conversions.
> > > > > > > > 
> > > > > > > > Yeah, I can see that.  I think it really depends on the target. 
> > > > > > > >  If you have hardware support, and you're using an unpadded 
> > > > > > > > representation for unsigned fixed-point values, then preserving 
> > > > > > > > the extra bit on mixed-signedness operations will sometimes 
> > > > > > > > prevent you from using the hardware optimally.  On the other 
> > > > > > > > hand, if you don't have hardware support, preserving the extra 
> > > > > > > > bit is usually easier than not: e.g., IIUC, multiplying a 
> > > > > > > > 32-bit signed _Fract with a 32-bit unpadded unsigned _Fract 
> > > > > > > > means just doing a 64-bit multiplication and dropping the low 
> > > > > > > > 32 bits, whereas doing the conversion first actually requires 
> > > > > > > > extra operations to ensure that the low bit doesn't contribute 
> > > > > > > > to the result.  And a target with hardware support is probably 
> > > > > > > > going to use a padded format for unsigned fixed-point precisely 
> > > > > > > > so that it can take advantage of the hardware.
> > > > > > > > 
> > > > > > > > > ...Of course, they seem to have forgotten that that exact 
> > > > > > > > > issue with differing signedness also applies to operations 
> > > > > > > > > with fixed-point and integer operands.
> > > > > > > > 
> > > > > > > > Right.
> > > > > > > For signed-unsigned multiplication, we could do that without 
> > > > > > > having to do the conversion first, but in the case of addition 
> > > > > > > and subtraction, we would still need to drop a bit with unpadded 
> > > > > > > unsigned types so both have the same scales.
> > > > > > > 
> > > > > > > I don't want to make too many assumptions about hardware support, 
> > > > > > > but I imagine that if one supports 32 bit signed addition, and 
> > > > > > > unsigned types did not have padding, it would be better instead 
> > > > > > > to LSHR on unsigned fixed point type then do 32 bit signed 
> > > > > > > addition rather than SEXT+SHL on the signed fixed point type then 
> > > > > > > do 33 bit signed addition.
> > > > > > > [I}n the case of addition and subtraction, we would still need to 
> > > > > > > drop a bit with unpadded unsigned types so both have the same 
> > > > > > > scales.
> > > > > > 
> > > > > > Right, but I don't think it matters because you're dropping that 
> > > > > > bit either way: with the specified semantics, you do the conversion 
> > > > > > beforehand which drops the bit, and with full-precision semantics, 
> > > > > > the bit isn't going to be representable and dropping it during the 
> > > > > > computation is equivalent to either rounding down (addition) or up 
> > > > > > (subtraction), either of which is allowed.
> > > > > > 
> > > > > > > I don't want to make too many assumptions about hardware support, 
> > > > > > > but I imagine that if one supports 32 bit signed addition, and 
> > > > > > > unsigned types did not have padding, it would be better instead 
> > > > > > > to LSHR on unsigned fixed point type then do 32 bit signed 
> > > > > > > addition rather than SEXT+SHL on the signed fixed point type then 
> > > > > > > do 33 bit signed addition.
> > > > > > 
> > > > > > My point is just that (1) the 33-bit signed addition can just be a 
> > > > > > 32-bit signed addition given that you ultimately have to produce a 
> > > > > > 32-bit result and (2) targets with hardware support are unlikely to 
> > > > > > use an unpadded unsigned type anyway.  Although I suppose (2) is a 
> > > > > > bit questionable because hardware support could be added well after 
> > > > > > an ABI has been settled on.
> > > > > Ok. So should anything be changed here then if both semantics work? 
> > > > > Currently we still do the 32 bit addition for signed and unsigned 
> > > > > types.
> > > > Well, most of what we're doing here is trying to lay good groundwork 
> > > > for all operations, and "both semantics work" is only true for 
> > > > same-rank addition, which is why I'm asking whether we should consider 
> > > > what semantics we actually want in general.
> > > I think the semantics for all binary arithmetic operators (+, -, *, /) 
> > > can be simplified to:
> > > 
> > > 1. If one operand is a signed fixed point type and the other an unsigned 
> > > type, convert the unsigned type to a signed type (which may involve 
> > > reducing the scale).
> > > 2. Find the common type of both operands that can hold both the larger 
> > > scale and magnitude of the 2 operands, and cast them to this common type.
> > > 3. Perform the binary operation.
> > > 4. Cast to result type.
> > > 
> > > I think the semantics for addition are already settled with these 
> > > semantics, and I believe subtraction shares the same semantics as those 
> > > for addition. As you said before, the extra bit would not even matter 
> > > with full precision since it would be trimmed anyway when converting from 
> > > the common type to the result, and rounding can be either up or down.
> > > 
> > > Signed integer addition/subtraction with an unsigned fixed point type 
> > > with these semantics will only lead to undefined behavior if the 
> > > difference in the values between both types is negative since a negative 
> > > value cannot be represented by an unsigned fixed point type (unless the 
> > > result is saturating).
> > > 
> > > For multiplication and division, the result could technically be 
> > > calculated also without having to cast between both sides, although for 
> > > using the fixed point mul/div intrinsics, both operands would need to be 
> > > casted to a common type with the same scale anyway. If a bit does end up 
> > > getting dropped when casting an operand from unsigned to signed in (1), 
> > > that dropped bit would not be representable when casting to the result 
> > > anyway and the standard still allows us to return a rounded result.
> > > 
> > > When multiplying or dividing between a fixed point and integer though, we 
> > > could just perform a regular integer mul/div if the result is 
> > > non-saturating without having to perform the steps above. For saturating 
> > > mul/div with an int, it might be easier to cast both sides to a common 
> > > type then call the saturating fixed point mul/div intrinsics. In either 
> > > case, I do not think rounding will be an issue here since we can still 
> > > calculate the result precisely and the result will be either rounded up 
> > > or down.
> > > 
> > > These semantics do not apply to comparison/relational operands, where if 
> > > we do compare operands of different types or ranks, we would just need to 
> > > find the common type between the 2 and perform the comparison.
> > > As you said before, the extra bit would not even matter with full 
> > > precision since it would be trimmed anyway when converting from the 
> > > common type to the result, and rounding can be either up or down.
> > 
> > Note that this is only true if the unsigned value has a scale that's 
> > greater than or equal to the scale of the result type, which is true for 
> > same-rank addition but not necessarily with mixed-rank.
> > 
> > > For multiplication and division, the result could technically be 
> > > calculated also without having to cast between both sides, although for 
> > > using the fixed point mul/div intrinsics, both operands would need to be 
> > > casted to a common type with the same scale anyway. If a bit does end up 
> > > getting dropped when casting an operand from unsigned to signed in (1), 
> > > that dropped bit would not be representable when casting to the result 
> > > anyway and the standard still allows us to return a rounded result.
> > 
> > In a `_Fract` multiplication (or a mixed-rank multiplication), the extra 
> > bit of precision can affect higher bits in the result.  Consider an 8-bit 
> > format where you're multiplying unsigned `000.00001` with signed 
> > `+010.0000`: if you do this multiplication in full precision, the result is 
> > `+000.0001`, but if you do the conversion first, the result is `0`.
> > 
> > On some level, it just seems really strange to me that the spec calls for 
> > doing arbitrary-precision arithmetic and then converting to the result type 
> > for every situation except a signed/unsigned mismatch.
> Wait, wait... You might be on to something.
> 
> The spec says in 4.1.4:
> > In order to get useful and attainable results, the usual arithmetic 
> > conversions do not apply to the combination of an integer type and a 
> > fixed-point type, or the combination of two fixed-point types.
> > In these cases:
> > 1. the result of the operation is calculated using the values of the two 
> > operands, with their full precision;
> > 2. if one operand has signed fixed-point type and the other operand has 
> > unsigned fixed-point type, then the unsigned fixed-point operand is 
> > converted to its corresponding signed fixed-point type and the resulting 
> > type is the type of the converted operand;
> > 3. the result type is the type with the highest rank, [...]
> 
> (1) comes before (2); not after. In other words, shouldn't the true result of 
> the operation be computed **before** the unsigned-signed correction? Point 2 
> is just worded really oddly, with 'operand is converted' and 'resulting type'.
> 
> It says that "the unsigned fixed-point operand is converted to its 
> corresponding signed fixed-point type and the resulting type is the type of 
> the converted operand", but this doesn't mean that the resulting converted 
> operand is actually used, only its type is. This means that for the purpose 
> of result type calculation in (3), the converted operand type from (2) will 
> be used as that operand's type, rather than the unsigned type itself. Or?
> 
> The wording here is very hard to interpret.
Sure, I think you could make an argument for (2) only being meant to 
participate in determining the result type.  At the very least, that seems more 
consistent, especially with the behavior for fixed-point + integer operations.  
Do you have any existing contact with the committee?


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D53738/new/

https://reviews.llvm.org/D53738



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D53738: [Fixed Point Arithmetic] Fixed Point Addition

Reply via email to