kangkaisen opened a new issue #2846: [Proposal] Support in memory olap table in Segment V2
URL: https://github.com/apache/incubator-doris/issues/2846

## 1 Why we need an in-memory OLAP table

Currently, disk seek is still a bottleneck for most Doris queries, so we could use an in-memory table to speed up Doris queries, as other databases do (HBase, Arrow, ClickHouse ...).

## 2 How to implement an in-memory OLAP table

As far as I know, there are two ways to implement an in-memory table.

### 2.1 Cache disk data in memory

Like HBase. HBase implements its in-memory table with `BlockCache`; we could refer to http://hbase.apache.org/book.html#block.cache.design and https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java

### 2.2 Design a special memory layout and data structure for memory

Like:

- ClickHouse https://clickhouse.tech/docs/en/operations/table_engines/memory/
- Apache Arrow http://arrow.apache.org/docs/cpp/api/table.html#tables
- SnappyData https://blog.bcmeng.com/post/snappydata.html
- Memsql https://www.memsql.com/blog/what-is-skiplist-why-skiplist-index-for-memsql/

**Because Doris has already implemented a page cache, I have decided to implement the in-memory table based on the page cache in Doris.**

Of course, we could implement a special in-memory table engine in the future; the two solutions do not conflict.

## 3 Detailed Design

1. Introduce `CachePriority` to `LRUCache`. The entry with the smaller `CachePriority` in `LRUCache` is evicted first. Currently `CachePriority` has two values: `DURABLE` for in-memory tables and `NORMAL` for normal tables. In `_evict_from_lru`, we first evict all cache entries with `NORMAL` priority, and only then evict cache entries with `DURABLE` priority.
2. Add an `in_memory` property to `OlapTable`.
3. Add an `is_in_memory` field to `TabletSchema`.
4. Add a `cache_in_memory` field to `ColumnReaderOptions`.
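
The priority-aware eviction in step 1 of the detailed design could be sketched roughly as below. This is only an illustration, not the Doris `LRUCache` code: the class `PriorityLruCache`, its methods, the string keys and values, and the entry-count capacity are all assumptions made for the example. Only the names `CachePriority`, `NORMAL`, `DURABLE`, and the idea of evicting `NORMAL` entries before `DURABLE` ones come from the proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Names from the proposal; a smaller value is evicted first.
enum class CachePriority { NORMAL = 0, DURABLE = 1 };

// Hypothetical cache for illustration only (not the Doris LRUCache API).
// Capacity is counted in entries; a real page cache would count bytes.
class PriorityLruCache {
public:
    explicit PriorityLruCache(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Inserts or overwrites an entry, then evicts until the cache fits.
    void insert(const std::string& key, std::string value, CachePriority priority) {
        erase(key);
        LruList& lru = list_for(priority);
        lru.push_front(Entry{key, std::move(value), priority});
        index_[key] = lru.begin();
        evict_from_lru();
    }

    // Returns a pointer to the value (nullptr on a miss) and marks the
    // entry as most recently used within its priority class.
    const std::string* lookup(const std::string& key) {
        auto it = index_.find(key);
        if (it == index_.end()) return nullptr;
        LruList& lru = list_for(it->second->priority);
        lru.splice(lru.begin(), lru, it->second);  // list iterators stay valid
        return &it->second->value;
    }

    bool contains(const std::string& key) const { return index_.count(key) > 0; }
    std::size_t size() const { return index_.size(); }

private:
    struct Entry {
        std::string key;
        std::string value;
        CachePriority priority;
    };
    using LruList = std::list<Entry>;  // front = most recent, back = oldest

    LruList& list_for(CachePriority p) {
        return p == CachePriority::DURABLE ? durable_ : normal_;
    }

    void erase(const std::string& key) {
        auto it = index_.find(key);
        if (it == index_.end()) return;
        list_for(it->second->priority).erase(it->second);
        index_.erase(it);
    }

    // Evict NORMAL entries first; touch DURABLE entries only when no
    // NORMAL entry is left.
    void evict_from_lru() {
        while (index_.size() > capacity_) {
            LruList& victim = !normal_.empty() ? normal_ : durable_;
            index_.erase(victim.back().key);
            victim.pop_back();
        }
    }

    std::size_t capacity_;
    LruList normal_;
    LruList durable_;
    std::unordered_map<std::string, LruList::iterator> index_;
};
```

Keeping one LRU list per priority makes the "all `NORMAL` first" rule a constant-time choice of victim list. One consequence of this sketch is that a `NORMAL` entry inserted into a cache already full of `DURABLE` entries is evicted immediately, so in-memory tables can crowd out normal ones unless capacity is sized with that in mind.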