yanyanyanggg opened a new issue, #18572:
URL: https://github.com/apache/tvm/issues/18572
### Issue: [RISC-V RVV] cos operator shows slight performance degradation
#### Description
The cosine operator shows minor performance degradation with the RISC‑V
Vector (RVV) extension, achieving 0.981× the performance of the scalar
implementation. While the regression is small, it still indicates room for
optimization in vectorized trigonometric functions.
#### Steps to Reproduce
1. Generate the cos operator with the following configuration:
```python
params = {
    "op_name": "cos",   # assumed value; export_op below reads params["op_name"]
    "dtype": "float32",
    "batch": 14,
    "channels": 23,
    "input_height": 67,
    "input_width": 99
}
```
2. Export the operator to two targets:
- **RV target** (scalar, without vector extension):
```
llvm -mtriple=riscv64-linux-gnu -mcpu=generic-rv64 -mabi=lp64d
-mattr=+64bit,+m,+a,+f,+d,+c
```
- **RVV target** (with vector extension):
```
llvm -mtriple=riscv64-linux-gnu -mcpu=generic-rv64 -mabi=lp64d
-mattr=+64bit,+m,+a,+f,+d,+c,+v
```
3. Run performance measurement on both targets (a hedged measurement sketch follows the operator definition code below).
Operator definition code:
```python
from tvm import relay

def export_cos(params, set_dir=None, platform="rv"):
    data = relay.var("data",
                     shape=(params["batch"], params["channels"],
                            params["input_height"], params["input_width"]),
                     dtype=params["dtype"])
    cos_op = relay.cos(data)
    # export_op is the reporter's own export helper (not included in this issue)
    export_op(cos_op, params["op_name"], [data], params, set_dir=set_dir)
```
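A minimal measurement sketch for step 3, assuming `relay.build` plus the graph executor's `time_evaluator`; the report's own `export_op`/runtime harness is not shown in the issue, so `measure_cos`, `RV_TARGET`, and `RVV_TARGET` below are illustrative names:
```python
# Hedged sketch of the measurement step: rebuild the operator with relay.build
# and time it with the graph executor's time_evaluator.
import numpy as np
import tvm
from tvm import relay
from tvm.contrib import graph_executor

RV_TARGET = ("llvm -mtriple=riscv64-linux-gnu -mcpu=generic-rv64 -mabi=lp64d "
             "-mattr=+64bit,+m,+a,+f,+d,+c")
RVV_TARGET = RV_TARGET + ",+v"

def measure_cos(target_str, params):
    shape = (params["batch"], params["channels"],
             params["input_height"], params["input_width"])
    data = relay.var("data", shape=shape, dtype=params["dtype"])
    mod = tvm.IRModule.from_expr(relay.Function([data], relay.cos(data)))
    with tvm.transform.PassContext(opt_level=3):
        lib = relay.build(mod, target=tvm.target.Target(target_str))
    dev = tvm.cpu(0)  # assumes the module is built and run natively on the board
    rt = graph_executor.GraphModule(lib["default"](dev))
    rt.set_input("data", np.random.rand(*shape).astype(params["dtype"]))
    # mean wall-clock time per run, converted from seconds to milliseconds
    timer = rt.module.time_evaluator("run", dev, number=10, repeat=3)
    return timer().mean * 1e3
```
Calling `measure_cos` once with `RV_TARGET` and once with `RVV_TARGET` on the board gives the two timings compared below.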
#### Performance Data
- **RV execution time**: 15.894500 ms
- **RVV execution time**: 16.210500 ms
- **Acceleration ratio (RV/RVV)**: 0.981 (RVV is ~1.02× slower)
#### Environment Information
- **TVM version**: 0.19.0
- **LLVM version**: [Please provide: `llvm-config --version`]
- **Hardware**: Spacemit K1‑X bit‑brick board
- **CPU**: Spacemit X60 (8 cores, 1.6 GHz)
- **ISA**: rv64imafdcv (with vector extensions)
- **Memory**: 7.6 GB
- **OS**: Bianbu 2.2, Linux kernel 6.6.63
- **Operation**: Elementwise cosine on ~2.1M elements (14 × 23 × 67 × 99 = 2,135,826)
#### Expected Behavior
RVV vectorization should provide a performance improvement over the scalar
RV baseline for trigonometric functions like cosine.
#### Additional Context
- The cos operation is applied elementwise to a tensor of ~2.1M elements (14 × 23 × 67 × 99).
- While the performance regression is minimal compared to other operators, it still shows that vectorization is not providing the expected speedup. This suggests that even for computationally intensive operations such as cosine, the current RVV vectorization may not be optimal (a sketch for checking whether vector instructions are actually emitted follows this list).
- This issue is part of a broader pattern where all tested operators
(including sum, log, relu, bias_add, sqrt, floor, round, avg_pool2d, sigmoid,
softmax, negative, max_pool2d, and cos) show performance degradation with RVV,
indicating a potential systemic issue in TVM's RVV code generation or
optimization.
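One way to narrow this down is to check whether the RVV build emits vector instructions at all. A minimal sketch, assuming `lib` is the factory module returned by `relay.build` in the measurement sketch above (the exact accessor may differ slightly across TVM versions):
```python
# Hedged sketch: dump the generated assembly from the RVV build and count
# obviously-vector instructions (vsetvli and vector floating-point ops).
asm = lib.get_lib().get_source("asm")
vector_lines = [line for line in asm.splitlines()
                if "vsetvli" in line or line.strip().startswith("vf")]
print(f"found {len(vector_lines)} vector instruction lines")
```
A count of zero would suggest the loops are not being vectorized at all, while a nonzero count would point at the vectorized cos lowering itself as the place to compare against the scalar build.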