While we were implementing the RISC-V P extension in TVM, we found that the
arithmetic instructions in the RISC-V P extension only work with integer
types. The RISC-V P extension supports fixed-point instructions, so we tried
to add a fixed-point type to TVM to quantize inference programs from
floating-point to fixed-point.

## Transform Floating-point to Fixed-point

The value of a fixed-point variable is determined by the value of its
floating-point counterpart and the point position, which can be set by the
user. A higher point position means more accuracy in the fractional part after
quantization. However, fewer bits remain to express the integer part,
resulting in a smaller number range. In other words, we can set a higher point
position if the data set consists of values within a smaller range. Otherwise,
the quantization will cause too much saturation and result in low prediction
accuracy from the inference program.
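
e.g.
Assuming a signed 16-bit representation with point position 13, the
representable range is $[-2^{15}/2^{13}, (2^{15}-1)/2^{13}] \approx [-4, 4)$,
so any input outside that range would saturate.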

The relationship between a fixed-point variable and its floating-point
counterpart can be depicted as follows, where Fxp is the fixed-point value,
Fp is the floating-point value, and PP is the point position:

$Fxp = Fp \times 2^{PP}$

The point position is the exponent of 2: we multiply the floating-point value
by this power of 2 and retain the integer part as the fixed-point value.

e.g.
If we set the point position to 10, the floating-point value 0.25 corresponds
to 256 in fixed-point ($0.25 \times 2^{10} = 256$).
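
As a minimal sketch of this conversion (the helper name FloatToFxp and the
truncation choice are ours for illustration, not TVM code):

```
#include <cstdint>
#include <cmath>

// Scale the floating-point value by 2^point_pos and retain the integer part.
int16_t FloatToFxp(float fp, int point_pos) {
    return static_cast<int16_t>(fp * std::pow(2.0f, point_pos));
}

// FloatToFxp(0.25f, 10) == 256
```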

## Arithmetic of Fixed-point

The arithmetic operations of fixed-point are similar to those of integers,
but some operations need tweaks to produce the correct result.

e.g.
One of the most common operations we will encounter is multiplication. Assume
the point position is 10, so the fixed-point value of 0.5 is 512. If we
multiply 0.5 by itself, we should get 0.25 (256 in fixed-point). However,
multiplying the fixed-point values gives $512 \times 512 = 262144$, so we
have to divide the product by $2^{10}$ (or shift it right by 10 bits) to
obtain the correct answer.

In order to get the correct arithmetic result in fixed-point, the binary
point position is essential information.
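
A minimal C++ sketch of this adjustment (FxpMul is an illustrative helper,
not part of TVM):

```
#include <cstdint>

// Multiply in a wider type to avoid overflow, then shift right by the
// point position to restore the scale.
int16_t FxpMul(int16_t a, int16_t b, int point_pos) {
    int32_t product = static_cast<int32_t>(a) * static_cast<int32_t>(b);
    return static_cast<int16_t>(product >> point_pos);
}

// With point position 10: FxpMul(512, 512, 10) == 256, i.e. 0.5 * 0.5 == 0.25
```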


## Implementing Fixed-point type in TVM

If we want to build an NN model with a fixed-point type using NNVM or Relay
as below, we first need to set the fixed-point type in the Python code.

```
func, params = relay.frontend.from_mxnet(block, shape_dict, "fxp16_13")
with relay.build_config(opt_level=level):
    graph, lib, params = relay.build(func, target, params=params)
```

In this case, the fixed-point type is written as 'fxp16_13': 16 is the number
of bits in the fixed-point variable, and 13 is the point position.

We use llvm as our target in this experiment. The main goal is to pass the
point position information down to the backend compiler, and our method is to
add a new type to TVM.

### TVM IR Type

We base our fixed-point type on HalideIR::Type, and it is similar to
HalideIR::Int. The only difference is that we use
'halideir_handle_cplusplus_type' to carry the point position.

```
inline Type Fxp(int bits, int fxp_pos, int lanes = 1) {
    // Encode the point position in the handle's type name, e.g. "fxp16_13".
    halideir_handle_cplusplus_type *fxp_info = new halideir_handle_cplusplus_type{
        halideir_cplusplus_type_name(halideir_cplusplus_type_name::Simple,
                                     "fxp16_" + std::to_string(fxp_pos)),
        {}, {}, {}};
    return Type(Type::Int, bits, lanes, fxp_info);
}
```
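
For example, `Fxp(16, 13)` constructs a 16-bit integer type whose handle
carries the name "fxp16_13", matching the type string used in the Python
frontend example above.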

### TVMType

Since TVMType is constructed from three integer fields (code, bits, and
lanes), there is no field to carry the point position. We have to give each
possible point position a unique type code to distinguish them from each
other. To add an fxp16 type to TVMType, we need 16 codes (numbers), one for
each possible point position.
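
A hypothetical sketch of this encoding (the base code value and names below
are our assumptions, not actual TVM definitions):

```
// Each point position of fxp16 gets its own TVMType code; 16 consecutive
// codes cover point positions 0..15. The base value is assumed.
enum { kFxp16CodeBase = 129 };

inline int Fxp16TypeCode(int fxp_pos) {
    return kFxp16CodeBase + fxp_pos;
}
```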

### Codegen

As mentioned above, we need to add extra instructions so that fixed-point
multiplication returns the correct result. Our method is to modify the llvm
codegen and override the 'VisitExpr_(const Mul* op)' function to handle
fixed-point multiplication. The type class in the codegen phase is the TVM IR
type, from which we can get the point position. We then add a shift-right
instruction according to the point position and generate the fixed-point
version of the multiplication in llvm IR form.

In our experiment, we use an llvm intrinsic to replace normal integer
multiplications with fixed-point multiplications.
```
llvm::Value* CodeGenRISCV::VisitExpr_(const Mul* op) {
    // Recover the point position from the handle name, e.g. "fxp16_13" -> 13.
    int fxp_pos = 0;
    if (op->type.handle_type != nullptr) {
      std::string fxp_name = op->type.handle_type->inner_name.name;
      if (fxp_name.find("fxp") != std::string::npos)
        fxp_pos = std::stoi(fxp_name.substr(fxp_name.find("_") + 1));
    }

    // For 8- and 16-bit element types whose total width is 32 or 64 bits,
    // emit the fixed-point multiply intrinsic, which applies the shift-right
    // by fxp_pos.
    if (op->type.bits() == 16 || op->type.bits() == 8) {
      int total_bits = op->type.bits() * op->type.lanes();
      if (total_bits == 32 || total_bits == 64) {
        Expr e = CreateMulIntr(op, op->type.bits(), op->type.lanes(), fxp_pos);
        return CodeGenCPU::CreateIntrinsic(e.as<Call>());
      }
    }
    // Otherwise fall back to a normal integer multiplication.
    return CreateMul(op->type, MakeValue(op->a), MakeValue(op->b));
}
```


Finally, we use the llvm backend compiler to compile the llvm IR and obtain
the multiplication library with the correct fixed-point behavior.

Thanks for reading, and comments are welcome.

-Bing-Sung Lu (Allen Lu), Chao-Lin Lee from Peakhills Group AI team,
  Jenq-Kuen Lee, YI-RU CHEN from NTHU
