Copilot commented on code in PR #3507:
URL: https://github.com/apache/doris-website/pull/3507#discussion_r3014196339

##########
versioned_docs/version-4.x/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md:
##########
@@ -245,6 +244,28 @@ Watch for:
 - Do not turn the whole JSON schema into a static template. That defeats the point of `VARIANT`.
 - Schema Template should cover key paths only; the rest stays dynamic.
 
+## Performance
+
+The chart below compares single-path extraction time on a 10K-path wide-column dataset (200K rows, extracting one key, 16 CPUs, median of 3 runs).
+
+
+
+| Mode | Query Time | Peak Memory |
+|---|---:|---:|
+| DOC Materialized | 76 ms | 1 MiB |
+| VARIANT Default | 76 ms | 1 MiB |
+| DOC Map (Sharded) | 148 ms | 1 MiB |
+| JSONB | 887 ms | 32 GiB |
+| DOC Map | 2,533 ms | 1 MiB |
+| MAP\<STRING,STRING\> | 2,800 ms | 1 MiB |
+| STRING (raw JSON) | 6,104 ms | 48 GiB |
+
+Key takeaways:
+
+- **Materialized subcolumns win.** Both Default and DOC Materialized deliver ~76 ms — 80× faster than raw STRING, 12× faster than JSONB.
+- **DOC Map with sharding helps.** Sharding the doc map cuts query time from 2.5 s to 148 ms for un-materialized paths.

Review Comment:
   Typo/wording: “un-materialized” is not standard usage here; use “unmaterialized” (or “not yet materialized”) for clarity and consistency with the rest of the page.
   ```suggestion
   - **DOC Map with sharding helps.** Sharding the doc map cuts query time from 2.5 s to 148 ms for unmaterialized paths.
   ```

##########
docs/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md:
##########
@@ -245,6 +244,28 @@ Watch for:
 - Do not turn the whole JSON schema into a static template. That defeats the point of `VARIANT`.
 - Schema Template should cover key paths only; the rest stays dynamic.
 
+## Performance
+
+The chart below compares single-path extraction time on a 10K-path wide-column dataset (200K rows, extracting one key, 16 CPUs, median of 3 runs).
+
+
+
+| Mode | Query Time | Peak Memory |
+|---|---:|---:|
+| DOC Materialized | 76 ms | 1 MiB |
+| VARIANT Default | 76 ms | 1 MiB |
+| DOC Map (Sharded) | 148 ms | 1 MiB |
+| JSONB | 887 ms | 32 GiB |
+| DOC Map | 2,533 ms | 1 MiB |
+| MAP\<STRING,STRING\> | 2,800 ms | 1 MiB |
+| STRING (raw JSON) | 6,104 ms | 48 GiB |
+
+Key takeaways:
+
+- **Materialized subcolumns win.** Both Default and DOC Materialized deliver ~76 ms — 80× faster than raw STRING, 12× faster than JSONB.
+- **DOC Map with sharding helps.** Sharding the doc map cuts query time from 2.5 s to 148 ms for un-materialized paths.

Review Comment:
   Typo/wording: “un-materialized” is not standard usage here; use “unmaterialized” (or “not yet materialized”) for clarity and consistency with the rest of the page.
   ```suggestion
   - **DOC Map with sharding helps.** Sharding the doc map cuts query time from 2.5 s to 148 ms for unmaterialized paths.
   ```
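
The speedup figures in the quoted takeaways (80×, 12×, and the 2.5 s to 148 ms sharding gain) can be recomputed from the Query Time column of the table. This is a minimal sketch: the timings are copied from the quoted diff, while the `query_ms` dictionary and `speedup` helper are illustrative names, not part of the PR or of Doris.

```python
# Query times in milliseconds, copied from the table quoted in the diff.
query_ms = {
    "DOC Materialized": 76,
    "VARIANT Default": 76,
    "DOC Map (Sharded)": 148,
    "JSONB": 887,
    "DOC Map": 2533,
    "MAP<STRING,STRING>": 2800,
    "STRING (raw JSON)": 6104,
}


def speedup(slow: str, fast: str) -> float:
    """Return how many times faster the `fast` mode is than the `slow` mode."""
    return query_ms[slow] / query_ms[fast]


# Ratios behind the "Materialized subcolumns win" takeaway.
string_vs_default = speedup("STRING (raw JSON)", "VARIANT Default")
jsonb_vs_default = speedup("JSONB", "VARIANT Default")

# Ratio behind the "DOC Map with sharding helps" takeaway.
sharding_gain = speedup("DOC Map", "DOC Map (Sharded)")

print(f"STRING vs Default: {string_vs_default:.1f}x")
print(f"JSONB vs Default:  {jsonb_vs_default:.1f}x")
print(f"Sharded DOC Map:   {sharding_gain:.1f}x")
```

Rounded to whole numbers, the first two ratios agree with the 80× and 12× stated in the takeaways. The sharding ratio of roughly 17× is not stated in the text but follows from the same table.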
