dutyu opened a new pull request, #24491:
URL: https://github.com/apache/doris/pull/24491

   ## Proposed changes
   
   Related PRs: #23391, #21873
   
   **This PR mainly makes these changes:**
   
   - Rename `ExternalTable.getLatestUpdateTime()` to 
`ExternalTable.getUpdateTime()`. `TableIf` already declares a method called 
`getUpdateTime()` with the same meaning, so merging the two methods avoids 
ambiguity.
   
   - Rename the field `ExternalTable.lastestUpdateTime` to `schemaUpdateTime`. 
The default implementation of `ExternalTable.getUpdateTime()` simply returns 
`schemaUpdateTime`, because `schemaUpdateTime` is the timestamp recorded after 
the schema of an external table is loaded.
   
   - Add a field named `partitionUpdateTime` to `HMSExternalTable` and update 
it when processing HMS partition events. `HMSExternalTable` overrides 
`getUpdateTime()` to return the larger of `schemaUpdateTime` and 
`partitionUpdateTime`. With the HMS event listener enabled, 
`partitionUpdateTime` is refreshed when partitions are (1) added, (2) deleted, 
or (3) altered.
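   
   The relationship between the two timestamps can be sketched roughly as 
follows. This is a simplified illustration, not the code in this PR: the class 
names `ExternalTableSketch` and `HmsExternalTableSketch` and the setter methods 
are hypothetical stand-ins for `ExternalTable` and `HMSExternalTable`.
   
   ```java
   // Hypothetical, simplified stand-in for ExternalTable.
   class ExternalTableSketch {
       // Timestamp (ms) recorded after the table schema is loaded.
       protected volatile long schemaUpdateTime;

       public void setSchemaUpdateTime(long ts) {
           this.schemaUpdateTime = ts;
       }

       // Default behaviour: the update time is the schema load time.
       public long getUpdateTime() {
           return schemaUpdateTime;
       }
   }

   // Hypothetical, simplified stand-in for HMSExternalTable.
   class HmsExternalTableSketch extends ExternalTableSketch {
       // Timestamp (ms) of the last add/delete/alter partition event processed.
       private volatile long partitionUpdateTime;

       public void setPartitionUpdateTime(long ts) {
           this.partitionUpdateTime = ts;
       }

       // The table counts as updated at whichever change happened last.
       @Override
       public long getUpdateTime() {
           return Math.max(schemaUpdateTime, partitionUpdateTime);
       }
   }
   ```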
   
   Currently `FE` does not record the update time of an HMS table's 
partitions, so the SQL cache may be hit even after the Hive table's partitions 
have changed. This PR adds a field to record the partition update time and 
uses it when the SQL cache is enabled.
   The cache will be missed if any partition has changed on the Hive side.
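   
   To illustrate why a newer update time leads to a cache miss, here is a toy 
model. It is only a sketch under assumptions: `SqlCacheSketch` and its methods 
are hypothetical and do not reflect the actual SQL cache implementation in 
Doris. The model assumes a cached result is only valid while the table's 
update time is not newer than the one seen when the result was cached.
   
   ```java
   import java.util.HashMap;
   import java.util.Map;

   // Hypothetical toy cache keyed by SQL text.
   class SqlCacheSketch {
       private static final class Entry {
           final String result;
           final long tableUpdateTime; // table update time when the result was cached

           Entry(String result, long tableUpdateTime) {
               this.result = result;
               this.tableUpdateTime = tableUpdateTime;
           }
       }

       private final Map<String, Entry> entries = new HashMap<>();

       void put(String sql, String result, long tableUpdateTime) {
           entries.put(sql, new Entry(result, tableUpdateTime));
       }

       // Returns the cached result, or null if there is no entry or the
       // table has been updated since the entry was stored.
       String get(String sql, long currentTableUpdateTime) {
           Entry e = entries.get(sql);
           if (e == null || currentTableUpdateTime > e.tableUpdateTime) {
               return null;
           }
           return e.result;
       }
   }
   ```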
   
   `partitionUpdateTime` is set from `System.currentTimeMillis()` rather than 
the event time of the HMS event, so that it is measured the same way as the 
`schemaUpdateTime` of the external table. The value is added to 
`ExternalObjectLog` and replayed by the non-master `FE`s so that all `FE`s hold 
the same value, which lets the SQL cache be hit by queries sent through 
different `FE`s.
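   
   The record-once, replay-everywhere idea can be sketched as below. All names 
here (`PartitionUpdateLogSketch`, `FrontendSketch`, and their methods) are 
hypothetical and only stand in for `ExternalObjectLog` and the `FE` replay 
path. The point is that the timestamp is taken once on the master and copied 
verbatim by the others, instead of each `FE` reading its own clock.
   
   ```java
   import java.util.HashMap;
   import java.util.Map;

   // Hypothetical log record carrying the timestamp chosen by the master.
   class PartitionUpdateLogSketch {
       final String tableName;
       final long updateTime;

       PartitionUpdateLogSketch(String tableName, long updateTime) {
           this.tableName = tableName;
           this.updateTime = updateTime;
       }
   }

   // Hypothetical FE holding a partition update time per table.
   class FrontendSketch {
       private final Map<String, Long> partitionUpdateTime = new HashMap<>();

       // Master side: take the local time once and emit it in the log.
       PartitionUpdateLogSketch onPartitionEvent(String tableName) {
           long now = System.currentTimeMillis();
           partitionUpdateTime.put(tableName, now);
           return new PartitionUpdateLogSketch(tableName, now);
       }

       // Non-master side: reuse the logged value instead of the local clock.
       void replay(PartitionUpdateLogSketch log) {
           partitionUpdateTime.put(log.tableName, log.updateTime);
       }

       long getPartitionUpdateTime(String tableName) {
           return partitionUpdateTime.getOrDefault(tableName, 0L);
       }
   }
   ```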
   
   
   **I have tested with the following steps:**
   
   1. Enable the HMS event listener, the SQL cache, and the query profile, 
then submit the same query against the HMS catalog twice (`lh_test_p` is a 
partitioned Hive table):
   ```sql
   select count(0) from hive_safe_lycc.test.lh_test_p;
   ```
   
![image](https://github.com/apache/doris/assets/5926365/bf791748-7ffe-4172-babb-5168b94e6dae)
   
   The second run hits the SQL cache:
   
![image](https://github.com/apache/doris/assets/5926365/b5bfbbd4-3cb5-4389-8057-67ec24fba5bd)
   
   The last update time of this hive table: 
   
![image](https://github.com/apache/doris/assets/5926365/fc6601a2-4211-472e-843c-76e3dace9a65)
   
   
   2. Add a partition on the Hive side:
   ```sql
   alter table lh_test_p add partition (pday='20230920');
   ```
   
   
![image](https://github.com/apache/doris/assets/5926365/890596e1-3c1c-4888-8783-aeab10787e2f)
   
   Wait a while for the HMS events to be processed, then execute 
`select count(0) from hive_safe_lycc.test.lh_test_p;` again. This time the 
cache is missed:
   
![image](https://github.com/apache/doris/assets/5926365/475189e9-7375-4c9a-b734-279bf4ff460b)
   
   The last update time has changed too:
   
![image](https://github.com/apache/doris/assets/5926365/59795f7d-4fe2-421f-baf9-5bca98a77b2e)
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

