CalvinKirs opened a new pull request, #55565:
URL: https://github.com/apache/doris/pull/55565

   …
   
   
   
   ### What problem does this PR solve?
   
   When inserting into a Hive partitioned table stored on oss-hdfs, the following issue occurs:
   
   - First insert succeeds: since the partition does not exist yet, `HiveTableSink#setPartitionValues` does not set storage-related information for the partition.
   - Subsequent inserts fail: once the partition exists, the system resolves the partition's storage information, and at that stage oss-hdfs is incorrectly treated as s3 instead of being recognized as hdfs, so the insert fails.
   
   This PR fixes the storage type handling logic so that oss-hdfs partitions are correctly recognized as hdfs.
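   The shape of the detection problem can be sketched as follows. This is a hedged illustration only, not the actual patch: the class and method names are invented, and the `.oss-dls.aliyuncs.com` endpoint marker is an assumption taken from the error message in the reproduction below; Doris's real detection logic may key off different properties.
   
   ```java
   // Illustrative sketch (NOT the actual Doris code): an oss:// location
   // whose endpoint is an Aliyun OSS-HDFS (DLS) endpoint should be handled
   // as hdfs, not s3. The ".oss-dls.aliyuncs.com" marker is an assumption
   // based on the endpoint seen in the error message below.
   public class OssHdfsLocationSketch {
       private static final String OSS_HDFS_ENDPOINT_MARKER = ".oss-dls.aliyuncs.com";
   
       /** Returns true when an oss:// location points at an OSS-HDFS endpoint. */
       public static boolean isOssHdfsLocation(String location) {
           if (location == null || !location.startsWith("oss://")) {
               return false;
           }
           // Host is the part between "oss://" and the first "/".
           String rest = location.substring("oss://".length());
           int slash = rest.indexOf('/');
           String host = slash >= 0 ? rest.substring(0, slash) : rest;
           return host.endsWith(OSS_HDFS_ENDPOINT_MARKER);
       }
   
       public static void main(String[] args) {
           // OSS-HDFS endpoint: should be routed to the hdfs path.
           System.out.println(isOssHdfsLocation(
               "oss://emr-bucket.cn-beijing.oss-dls.aliyuncs.com/tmp/pt1=a/file.orc")); // true
           // Plain oss:// bucket: s3-style handling is fine here.
           System.out.println(isOssHdfsLocation("oss://plain-bucket/tmp/file.orc"));   // false
       }
   }
   ```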
   
   #### How to Reproduce
   
   Step 1: Create a Hive catalog whose storage is configured to use oss-hdfs.
   
   Step 2: Create a partitioned table and run the insert below twice:
   
   ```sql
   CREATE TABLE hive_partition_table
   (
     `ts` DATETIME COMMENT 'ts',
     `col1` BOOLEAN COMMENT 'col1',
     `col2` INT COMMENT 'col2',
     `col3` BIGINT COMMENT 'col3',
     `col4` FLOAT COMMENT 'col4',
     `col5` DOUBLE COMMENT 'col5',
     `col6` DECIMAL(9,4) COMMENT 'col6',
     `col7` STRING COMMENT 'col7',
     `col8` DATE COMMENT 'col8',
     `col9` DATETIME COMMENT 'col9',
     `pt1` STRING COMMENT 'pt1',
     `pt2` STRING COMMENT 'pt2'
   )
   PARTITION BY LIST (day(ts), pt1, pt2) ()
   PROPERTIES (
     'write-format'='orc',
     'compression-codec'='zlib'
   );
   
   -- First insert (works fine)
   INSERT INTO hive_partition_table VALUES
     ('2023-01-01 00:00:00', true, 1, 1, 1.0, 1.0, 1.0000, '1', '2023-01-01', '2023-01-01 00:00:00', 'a', '1'),
     ('2023-01-02 00:00:00', false, 2, 2, 2.0, 2.0, 2.0000, '2', '2023-01-02', '2023-01-02 00:00:00', 'b', '2'),
     ('2023-01-03 00:00:00', true, 3, 3, 3.0, 3.0, 3.0000, '3', '2023-01-03', '2023-01-03 00:00:00', 'c', '3');
   
   -- Second insert (fails)
   INSERT INTO hive_partition_table VALUES
     ('2023-01-01 00:00:00', true, 1, 1, 1.0, 1.0, 1.0000, '1', '2023-01-01', '2023-01-01 00:00:00', 'a', '1'),
     ('2023-01-02 00:00:00', false, 2, 2, 2.0, 2.0, 2.0000, '2', '2023-01-02', '2023-01-02 00:00:00', 'b', '2'),
     ('2023-01-03 00:00:00', true, 3, 3, 3.0, 3.0, 3.0000, '3', '2023-01-03', '2023-01-03 00:00:00', 'c', '3');
   
   
   ```
   
   Error message on the second insert:
   
   ```
   [INVALID_ARGUMENT] Invalid S3 URI: oss://emr-ssss-oss.cn-beijing.oss-dls.aliyuncs.com/tmp/.sss/root/4118a835d5d948f8adc34107230c9b9b/pt1=a/pt2=1/727bd17a7b9541db-8f4bb2fbfda35b86_6ec0a4b4-cacc-4dd3-b3fc-b130cadcd508-0.zlib.orc
   ```
   
   ### Release note
   
   None
   
   ### Check List (For Author)
   
   - Test <!-- At least one of them must be included. -->
       - [ ] Regression test
       - [ ] Unit Test
       - [ ] Manual test (add detailed scripts or steps below)
       - [ ] No need to test or manual test. Explain why:
           - [ ] This is a refactor/code format and no logic has been changed.
           - [ ] Previous test can cover this change.
           - [ ] No code files have been changed.
           - [ ] Other reason <!-- Add your reason?  -->
   
   - Behavior changed:
       - [ ] No.
       - [ ] Yes. <!-- Explain the behavior change -->
   
   - Does this need documentation?
       - [ ] No.
       - [ ] Yes. <!-- Add document PR link here. eg: 
https://github.com/apache/doris-website/pull/1214 -->
   
   ### Check List (For Reviewer who merge this PR)
   
   - [ ] Confirm the release note
   - [ ] Confirm test cases
   - [ ] Confirm document
   - [ ] Add branch pick label <!-- Add branch pick label that this PR should 
merge into -->
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
