marising opened a new issue #2581: Doris caches query results based on partition
URL: https://github.com/apache/incubator-doris/issues/2581

1. If the table is partitioned by date and only the current day's data is updated, the data for earlier days does not change. When querying 30 days of data, the results for 29 days can be served from cache. For example, suppose today is 20191226 and data is being imported in real time through StreamLoad.

**SELECT event_date, COUNT(event_id) AS event_count FROM music_event WHERE event_date >= 20191127 AND event_date <= 20191226 GROUP BY event_date ORDER BY event_date;**

The row batches of event_date and event_count for 20191127-20191225 can be obtained from the memory cache, and only the data for 20191226 needs to be queried from the physical table. If the first query fails to hit the cache, its results will be cached.

2. In another case, the data is not queried by partition key, but it is updated daily rather than in real time. For example, UserProfile uses UserID as the partition key, and the query counts the number of users in each country.

**SELECT country, COUNT(UserID) FROM UserProfile GROUP BY country;**

This query result can be cached.

Therefore, results of queries over data that is not updated in real time can be cached as a whole, and for data partitioned by day where only the most recent partition is updated, results can be cached partition by partition. This feature will reduce query time and improve cluster QPS.

The specific design and usage details will be supplemented as development progresses.
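
To make the first case concrete, here is a minimal Python sketch of the partition-by-partition idea. It is only an illustration, not Doris's design: the class `PartitionResultCache`, the function `count_events_by_day`, the per-partition `versions` map used to detect updates, and the in-memory `table` dictionary standing in for the physical table are all hypothetical names assumed for this example.

```python
from datetime import date, timedelta


class PartitionResultCache:
    """Hypothetical cache of per-partition results, keyed by (query signature, partition)."""

    def __init__(self):
        self._entries = {}  # (signature, partition) -> (version, result)
        self.hits = 0
        self.misses = 0

    def get(self, signature, partition, current_version):
        entry = self._entries.get((signature, partition))
        # An entry is valid only if the partition has not been updated since it was cached.
        if entry is not None and entry[0] == current_version:
            self.hits += 1
            return entry[1]
        self.misses += 1
        return None

    def put(self, signature, partition, version, result):
        self._entries[(signature, partition)] = (version, result)


def count_events_by_day(table, versions, cache, start, end):
    """Emulates: SELECT event_date, COUNT(event_id) ... GROUP BY event_date ORDER BY event_date."""
    signature = "count(event_id) group by event_date"
    results = []
    day = start
    while day <= end:
        version = versions.get(day, 0)
        count = cache.get(signature, day, version)
        if count is None:
            # Cache miss: scan the "physical" partition and remember the result.
            count = len(table.get(day, []))
            cache.put(signature, day, version, count)
        results.append((day, count))
        day += timedelta(days=1)
    return results


# Assumed sample data: 30 daily partitions, 20191127 through 20191226.
start, end = date(2019, 11, 27), date(2019, 12, 26)
table = {start + timedelta(days=i): [f"e{i}_{j}" for j in range(i + 1)] for i in range(30)}
versions = {day: 1 for day in table}
cache = PartitionResultCache()

first = count_events_by_day(table, versions, cache, start, end)   # cold cache: 30 misses

# Simulate a real-time load into today's partition only.
table[end].append("new_event")
versions[end] += 1

second = count_events_by_day(table, versions, cache, start, end)  # 29 hits, 1 miss
```

On the second run only the 20191226 partition is rescanned, because its version changed; the other 29 days are answered from the cache. The second case in the issue (a table that is not updated in real time) would correspond to caching the whole result under a single key instead of one key per partition.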
