vagetablechicken opened a new issue #3199: check_rowset_id_in_unused_rowsets() 
is inefficient
URL: https://github.com/apache/incubator-doris/issues/3199
 
 
   
https://github.com/apache/incubator-doris/blob/f6374fa9a5a52135a85e4ecca23bad76d6c7a54b/be/src/olap/storage_engine.h#L299
   
   unused_rowsets is declared at the line linked above. The key of this map is 
RowSet::unique_id(), defined as follows:
   
https://github.com/apache/incubator-doris/blob/08e4035a41bbff8301ba23d612376aeceb4b9913/be/src/olap/rowset/rowset.h#L208-L210
   
   But the frequently called function check_rowset_id_in_unused_rowsets() 
just iterates through the whole map:
   
https://github.com/apache/incubator-doris/blob/f6374fa9a5a52135a85e4ecca23bad76d6c7a54b/be/src/olap/storage_engine.cpp#L918-L929
   
   As a result, this function uses a lot of CPU. The perf result is:
   
![image](https://user-images.githubusercontent.com/24697960/77542879-d47ed400-6ee1-11ea-9bb1-5652a7e94a13.png)
   
   ## Solution
   
   We can use an unordered_multimap whose key is rowset_id and whose value is 
pair<rowset_path, RowsetSharedPtr>, so the check becomes a hash lookup 
instead of a full scan.
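
   A minimal sketch of the proposed layout, not the actual Doris code. The 
`Rowset` struct, the `RowsetId` alias, the `UnusedRowsets` alias and the 
`add_unused_rowset()` helper are hypothetical stand-ins for illustration, 
and locking is left out entirely:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-ins: the real RowsetId and Rowset types in Doris are richer.
using RowsetId = std::string;
struct Rowset {
    RowsetId id;
    std::string path;
};
using RowsetSharedPtr = std::shared_ptr<Rowset>;

// Proposed container: key is rowset_id, value is pair<rowset_path, RowsetSharedPtr>.
// A multimap permits several entries that share one rowset_id.
using UnusedRowsets =
        std::unordered_multimap<RowsetId, std::pair<std::string, RowsetSharedPtr>>;

// Hypothetical helper: register a rowset as unused under its rowset_id.
void add_unused_rowset(UnusedRowsets& unused, const RowsetSharedPtr& rowset) {
    unused.emplace(rowset->id, std::make_pair(rowset->path, rowset));
}

// Hash lookup by rowset_id (average O(1)) instead of scanning every entry.
bool check_rowset_id_in_unused_rowsets(const UnusedRowsets& unused,
                                       const RowsetId& rowset_id) {
    return unused.find(rowset_id) != unused.end();
}
```

   With this layout the check no longer depends on how many unused rowsets 
are pending; code that needs all entries for one rowset_id could use 
`equal_range()` on the same container.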
   

