FreeOnePlus opened a new issue, #10473: URL: https://github.com/apache/doris/issues/10473
### Search before asking - [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues. ### Description In ClickHouse, support for sequenceCount function and retention function is provided, but Doris currently does not have these two functions, and hopes to gain support ### Use case > The following are the explanations and use cases of the two functions in ClickHouse ## sequenceCount(pattern)(time, cond1, cond2, …) Counts the number of event chains that matched the pattern. The function searches event chains that do not overlap. It starts to search for the next chain after the current chain is matched. WARNING Events that occur at the same second may lay in the sequence in an undefined order affecting the result. sequenceCount(pattern)(timestamp, cond1, cond2, ...) Arguments timestamp — Column considered to contain time data. Typical data types are Date and DateTime. You can also use any of the supported [UInt](https://clickhouse.com/docs/en/sql-reference/data-types/int-uint) data types. cond1, cond2 — Conditions that describe the chain of events. Data type: UInt8. You can pass up to 32 condition arguments. The function takes only the events described in these conditions into account. If the sequence contains data that isn’t described in a condition, the function skips them. Parameters pattern — Pattern string. See [Pattern syntax](https://clickhouse.com/docs/en/sql-reference/aggregate-functions/parametric-functions/#sequence-function-pattern-syntax). Returned values Number of non-overlapping event chains that are matched. Type: UInt64. Example Consider data in the t table: ┌─time─┬─number─┐ │ 1 │ 1 │ │ 2 │ 3 │ │ 3 │ 2 │ │ 4 │ 1 │ │ 5 │ 3 │ │ 6 │ 2 │ └──────┴────────┘ Count how many times the number 2 occurs after the number 1 with any amount of other numbers between them: SELECT sequenceCount('(?1).*(?2)')(time, number = 1, number = 2) FROM t ┌─sequenceCount('(?1).*(?2)')(time, equals(number, 1), equals(number, 2))─┐ │ 2 │ └─────────────────────────────────────────────────────────────────────────┘ ## retention function The function takes as arguments a set of conditions from 1 to 32 arguments of type UInt8 that indicate whether a certain condition was met for the event. Any condition can be specified as an argument (as in [WHERE](https://clickhouse.com/docs/en/sql-reference/statements/select/where#select-where)). The conditions, except the first, apply in pairs: the result of the second will be true if the first and second are true, of the third if the first and third are true, etc. Syntax retention(cond1, cond2, ..., cond32); Arguments cond — An expression that returns a UInt8 result (1 or 0). Returned value The array of 1 or 0. 1 — Condition was met for the event. 0 — Condition wasn’t met for the event. Type: UInt8. Example Let’s consider an example of calculating the retention function to determine site traffic. 1. Сreate a table to illustrate an example. CREATE TABLE retention_test(date Date, uid Int32) ENGINE = Memory; INSERT INTO retention_test SELECT '2020-01-01', number FROM numbers(5); INSERT INTO retention_test SELECT '2020-01-02', number FROM numbers(10); INSERT INTO retention_test SELECT '2020-01-03', number FROM numbers(15); Input table: Query: SELECT * FROM retention_test Result: ┌───────date─┬─uid─┐ │ 2020-01-01 │ 0 │ │ 2020-01-01 │ 1 │ │ 2020-01-01 │ 2 │ │ 2020-01-01 │ 3 │ │ 2020-01-01 │ 4 │ └────────────┴─────┘ ┌───────date─┬─uid─┐ │ 2020-01-02 │ 0 │ │ 2020-01-02 │ 1 │ │ 2020-01-02 │ 2 │ │ 2020-01-02 │ 3 │ │ 2020-01-02 │ 4 │ │ 2020-01-02 │ 5 │ │ 2020-01-02 │ 6 │ │ 2020-01-02 │ 7 │ │ 2020-01-02 │ 8 │ │ 2020-01-02 │ 9 │ └────────────┴─────┘ ┌───────date─┬─uid─┐ │ 2020-01-03 │ 0 │ │ 2020-01-03 │ 1 │ │ 2020-01-03 │ 2 │ │ 2020-01-03 │ 3 │ │ 2020-01-03 │ 4 │ │ 2020-01-03 │ 5 │ │ 2020-01-03 │ 6 │ │ 2020-01-03 │ 7 │ │ 2020-01-03 │ 8 │ │ 2020-01-03 │ 9 │ │ 2020-01-03 │ 10 │ │ 2020-01-03 │ 11 │ │ 2020-01-03 │ 12 │ │ 2020-01-03 │ 13 │ │ 2020-01-03 │ 14 │ └────────────┴─────┘ 2. Group users by unique ID uid using the retention function. Query: SELECT uid, retention(date = '2020-01-01', date = '2020-01-02', date = '2020-01-03') AS r FROM retention_test WHERE date IN ('2020-01-01', '2020-01-02', '2020-01-03') GROUP BY uid ORDER BY uid ASC Result: ┌─uid─┬─r───────┐ │ 0 │ [1,1,1] │ │ 1 │ [1,1,1] │ │ 2 │ [1,1,1] │ │ 3 │ [1,1,1] │ │ 4 │ [1,1,1] │ │ 5 │ [0,0,0] │ │ 6 │ [0,0,0] │ │ 7 │ [0,0,0] │ │ 8 │ [0,0,0] │ │ 9 │ [0,0,0] │ │ 10 │ [0,0,0] │ │ 11 │ [0,0,0] │ │ 12 │ [0,0,0] │ │ 13 │ [0,0,0] │ │ 14 │ [0,0,0] │ └─────┴─────────┘ 3. Calculate the total number of site visits per day. Query: SELECT sum(r[1]) AS r1, sum(r[2]) AS r2, sum(r[3]) AS r3 FROM ( SELECT uid, retention(date = '2020-01-01', date = '2020-01-02', date = '2020-01-03') AS r FROM retention_test WHERE date IN ('2020-01-01', '2020-01-02', '2020-01-03') GROUP BY uid ) Result: ┌─r1─┬─r2─┬─r3─┐ │ 5 │ 5 │ 5 │ └────┴────┴────┘ Where: r1- the number of unique visitors who visited the site during 2020-01-01 (the cond1 condition). r2- the number of unique visitors who visited the site during a specific time period between 2020-01-01 and 2020-01-02 (cond1 and cond2 conditions). r3- the number of unique visitors who visited the site during a specific time period between 2020-01-01 and 2020-01-03 (cond1 and cond3 conditions). ### Related issues _No response_ ### Are you willing to submit PR? - [ ] Yes I am willing to submit a PR! ### Code of Conduct - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org