This is an automated email from the ASF dual-hosted git repository.
zclllyybb pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new bde1977e938 [Fix](variance) Fix sample variance/stddev null res for
single value (#3765)
bde1977e938 is described below
commit bde1977e9386037e308d77e28c552436900268b2
Author: linrrarity <[email protected]>
AuthorDate: Mon Jun 1 12:06:16 2026 +0800
[Fix](variance) Fix sample variance/stddev null res for single value (#3765)
## Versions
- [x] dev
- [x] 4.x
- [ ] 3.x
- [ ] 2.1 or older (not covered by version/language sync gate)
## Languages
- [x] Chinese
- [x] English
- [ ] Japanese candidate translation needed
## Docs Checklist
- [ ] Checked by AI
- [ ] Test Cases Built
- [ ] Updated required version and language counterparts, or explained
why not
- [ ] If only one language changed, confirmed whether source/translation
counterparts need sync
---------
Co-authored-by: zclllyybb <[email protected]>
---
.../aggregate-functions/stddev-samp.md | 33 ++++++++++++++++++++--
.../sql-functions/aggregate-functions/var-samp.md | 31 +++++++++++++++++++-
.../aggregate-functions/stddev-samp.md | 31 +++++++++++++++++++-
.../sql-functions/aggregate-functions/var-samp.md | 31 +++++++++++++++++++-
.../aggregate-functions/stddev-samp.md | 31 +++++++++++++++++++-
.../sql-functions/aggregate-functions/var-samp.md | 31 +++++++++++++++++++-
.../aggregate-functions/stddev-samp.md | 33 ++++++++++++++++++++--
.../sql-functions/aggregate-functions/var-samp.md | 31 +++++++++++++++++++-
8 files changed, 242 insertions(+), 10 deletions(-)
diff --git a/docs/sql-manual/sql-functions/aggregate-functions/stddev-samp.md
b/docs/sql-manual/sql-functions/aggregate-functions/stddev-samp.md
index 46c26916539..8c03f904b2e 100644
--- a/docs/sql-manual/sql-functions/aggregate-functions/stddev-samp.md
+++ b/docs/sql-manual/sql-functions/aggregate-functions/stddev-samp.md
@@ -10,6 +10,14 @@
Returns the sample standard deviation of the expr expression
+The calculation formula is:
+
+$
+\mathrm{STDDEV\_SAMP}(x)=\sqrt{\mathrm{VAR\_SAMP}(x)}=\sqrt{\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}}
+$
+
+Where `n` is the number of valid values in the group.
+
## Syntax
```sql
@@ -25,11 +33,11 @@ STDDEV_SAMP(<expr>)
## Return Value
Return the sample standard deviation of the expr expression as Double type.
-If there is no valid data in the group, returns NULL.
+If there is no valid data in the group, returns NULL. If the number of valid
values in the group is 1, returns NaN.
### Examples
```sql
--- Create sample tables
+-- Create sample table
CREATE TABLE score_table (
student_id INT,
score DOUBLE
@@ -58,3 +66,24 @@ FROM score_table;
| 4.949747468305831 |
+-------------------+
```
+
+When the number of valid values is 1, `STDDEV_SAMP` returns `NaN`.
+
+```sql
+-- Create a single-column sample table
+CREATE TABLE sample_values (
+ value INT
+) DISTRIBUTED BY HASH(value)
+PROPERTIES (
+ "replication_num" = "1"
+);
+
+INSERT INTO sample_values VALUES (10);
+
+SELECT STDDEV_SAMP(value) AS sample_stddev FROM sample_values;
++---------------+
+| sample_stddev |
++---------------+
+| NaN |
++---------------+
+```
diff --git a/docs/sql-manual/sql-functions/aggregate-functions/var-samp.md
b/docs/sql-manual/sql-functions/aggregate-functions/var-samp.md
index 069452376d5..c3e61d58796 100644
--- a/docs/sql-manual/sql-functions/aggregate-functions/var-samp.md
+++ b/docs/sql-manual/sql-functions/aggregate-functions/var-samp.md
@@ -10,6 +10,14 @@
The VAR_SAMP function calculates the sample variance of a specified
expression. Unlike VARIANCE (population variance), VAR_SAMP uses n-1 as the
divisor, which is considered an unbiased estimate of the population variance in
statistics.
+The calculation formula is:
+
+$
+\mathrm{VAR\_SAMP}(x)=\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}
+$
+
+Where `n` is the number of valid values in the group.
+
## Alias
- VARIANCE_SAMP
@@ -28,7 +36,7 @@ VAR_SAMP(<expr>)
## Return Value
Returns a Double value representing the calculated sample variance.
-If there is no valid data in the group, returns NULL.
+If there is no valid data in the group, returns NULL. If the number of valid
values in the group is 1, returns NaN.
## Examples
```sql
@@ -66,3 +74,24 @@ FROM student_scores;
| 29.4107142857143 | 25.73437500000001 |
+------------------+---------------------+
```
+
+When the number of valid values is 1, `VAR_SAMP` returns `NaN`.
+
+```sql
+-- Create a single-column sample table
+CREATE TABLE sample_values (
+ value INT
+) DISTRIBUTED BY HASH(value)
+PROPERTIES (
+ "replication_num" = "1"
+);
+
+INSERT INTO sample_values VALUES (10);
+
+SELECT VAR_SAMP(value) AS sample_variance FROM sample_values;
++-----------------+
+| sample_variance |
++-----------------+
+| NaN |
++-----------------+
+```
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/stddev-samp.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/stddev-samp.md
index c153609b66c..19452bac10d 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/stddev-samp.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/stddev-samp.md
@@ -10,6 +10,14 @@
返回 expr 表达式的样本标准差
+计算公式如下:
+
+$
+\mathrm{STDDEV\_SAMP}(x)=\sqrt{\mathrm{VAR\_SAMP}(x)}=\sqrt{\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}}
+$
+
+其中 `n` 为组内合法数据的个数。
+
## 语法
```sql
@@ -25,7 +33,7 @@ STDDEV_SAMP(<expr>)
## 返回值
返回 Double 类型的参数 expr 的样本标准差。
-当组内没有合法数据时,返回 NULL。
+当组内没有合法数据时,返回 NULL。组内合法数据个数为 1 时,返回 NaN。
## 举例
```sql
@@ -58,3 +66,24 @@ FROM score_table;
| 4.949747468305831 |
+-------------------+
```
+
+当合法数据个数为 1 时,`STDDEV_SAMP` 返回 `NaN`。
+
+```sql
+-- 创建单列示例表
+CREATE TABLE sample_values (
+ value INT
+) DISTRIBUTED BY HASH(value)
+PROPERTIES (
+ "replication_num" = "1"
+);
+
+INSERT INTO sample_values VALUES (10);
+
+SELECT STDDEV_SAMP(value) AS sample_stddev FROM sample_values;
++---------------+
+| sample_stddev |
++---------------+
+| NaN |
++---------------+
+```
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/var-samp.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/var-samp.md
index 790d42085b0..611e0257767 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/var-samp.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/var-samp.md
@@ -10,6 +10,14 @@
VAR_SAMP 函数计算指定表达式的样本方差。与 VARIANCE(总体方差)不同,VAR_SAMP 使用 n-1
作为除数,这在统计学上被认为是对总体方差的无偏估计。
+计算公式如下:
+
+$
+\mathrm{VAR\_SAMP}(x)=\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}
+$
+
+其中 `n` 为组内合法数据的个数。
+
## 别名
- VARIANCE_SAMP
@@ -28,7 +36,7 @@ VAR_SAMP(<expr>)
## 返回值
返回一个 Double 类型的值,表示计算得到的样本方差。
-组内没有合法数据时,返回 NULL。
+组内没有合法数据时,返回 NULL。组内合法数据个数为 1 时,返回 NaN。
## 举例
```sql
@@ -66,3 +74,24 @@ FROM student_scores;
| 29.4107142857143 | 25.73437500000001 |
+------------------+---------------------+
```
+
+当合法数据个数为 1 时,`VAR_SAMP` 返回 `NaN`。
+
+```sql
+-- 创建单列示例表
+CREATE TABLE sample_values (
+ value INT
+) DISTRIBUTED BY HASH(value)
+PROPERTIES (
+ "replication_num" = "1"
+);
+
+INSERT INTO sample_values VALUES (10);
+
+SELECT VAR_SAMP(value) AS sample_variance FROM sample_values;
++-----------------+
+| sample_variance |
++-----------------+
+| NaN |
++-----------------+
+```
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/aggregate-functions/stddev-samp.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/aggregate-functions/stddev-samp.md
index c153609b66c..19452bac10d 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/aggregate-functions/stddev-samp.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/aggregate-functions/stddev-samp.md
@@ -10,6 +10,14 @@
返回 expr 表达式的样本标准差
+计算公式如下:
+
+$
+\mathrm{STDDEV\_SAMP}(x)=\sqrt{\mathrm{VAR\_SAMP}(x)}=\sqrt{\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}}
+$
+
+其中 `n` 为组内合法数据的个数。
+
## 语法
```sql
@@ -25,7 +33,7 @@ STDDEV_SAMP(<expr>)
## 返回值
返回 Double 类型的参数 expr 的样本标准差。
-当组内没有合法数据时,返回 NULL。
+当组内没有合法数据时,返回 NULL。组内合法数据个数为 1 时,返回 NaN。
## 举例
```sql
@@ -58,3 +66,24 @@ FROM score_table;
| 4.949747468305831 |
+-------------------+
```
+
+当合法数据个数为 1 时,`STDDEV_SAMP` 返回 `NaN`。
+
+```sql
+-- 创建单列示例表
+CREATE TABLE sample_values (
+ value INT
+) DISTRIBUTED BY HASH(value)
+PROPERTIES (
+ "replication_num" = "1"
+);
+
+INSERT INTO sample_values VALUES (10);
+
+SELECT STDDEV_SAMP(value) AS sample_stddev FROM sample_values;
++---------------+
+| sample_stddev |
++---------------+
+| NaN |
++---------------+
+```
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/aggregate-functions/var-samp.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/aggregate-functions/var-samp.md
index 790d42085b0..611e0257767 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/aggregate-functions/var-samp.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/aggregate-functions/var-samp.md
@@ -10,6 +10,14 @@
VAR_SAMP 函数计算指定表达式的样本方差。与 VARIANCE(总体方差)不同,VAR_SAMP 使用 n-1
作为除数,这在统计学上被认为是对总体方差的无偏估计。
+计算公式如下:
+
+$
+\mathrm{VAR\_SAMP}(x)=\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}
+$
+
+其中 `n` 为组内合法数据的个数。
+
## 别名
- VARIANCE_SAMP
@@ -28,7 +36,7 @@ VAR_SAMP(<expr>)
## 返回值
返回一个 Double 类型的值,表示计算得到的样本方差。
-组内没有合法数据时,返回 NULL。
+组内没有合法数据时,返回 NULL。组内合法数据个数为 1 时,返回 NaN。
## 举例
```sql
@@ -66,3 +74,24 @@ FROM student_scores;
| 29.4107142857143 | 25.73437500000001 |
+------------------+---------------------+
```
+
+当合法数据个数为 1 时,`VAR_SAMP` 返回 `NaN`。
+
+```sql
+-- 创建单列示例表
+CREATE TABLE sample_values (
+ value INT
+) DISTRIBUTED BY HASH(value)
+PROPERTIES (
+ "replication_num" = "1"
+);
+
+INSERT INTO sample_values VALUES (10);
+
+SELECT VAR_SAMP(value) AS sample_variance FROM sample_values;
++-----------------+
+| sample_variance |
++-----------------+
+| NaN |
++-----------------+
+```
diff --git
a/versioned_docs/version-4.x/sql-manual/sql-functions/aggregate-functions/stddev-samp.md
b/versioned_docs/version-4.x/sql-manual/sql-functions/aggregate-functions/stddev-samp.md
index 46c26916539..8c03f904b2e 100644
---
a/versioned_docs/version-4.x/sql-manual/sql-functions/aggregate-functions/stddev-samp.md
+++
b/versioned_docs/version-4.x/sql-manual/sql-functions/aggregate-functions/stddev-samp.md
@@ -10,6 +10,14 @@
Returns the sample standard deviation of the expr expression
+The calculation formula is:
+
+$
+\mathrm{STDDEV\_SAMP}(x)=\sqrt{\mathrm{VAR\_SAMP}(x)}=\sqrt{\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}}
+$
+
+Where `n` is the number of valid values in the group.
+
## Syntax
```sql
@@ -25,11 +33,11 @@ STDDEV_SAMP(<expr>)
## Return Value
Return the sample standard deviation of the expr expression as Double type.
-If there is no valid data in the group, returns NULL.
+If there is no valid data in the group, returns NULL. If the number of valid
values in the group is 1, returns NaN.
### Examples
```sql
--- Create sample tables
+-- Create sample table
CREATE TABLE score_table (
student_id INT,
score DOUBLE
@@ -58,3 +66,24 @@ FROM score_table;
| 4.949747468305831 |
+-------------------+
```
+
+When the number of valid values is 1, `STDDEV_SAMP` returns `NaN`.
+
+```sql
+-- Create a single-column sample table
+CREATE TABLE sample_values (
+ value INT
+) DISTRIBUTED BY HASH(value)
+PROPERTIES (
+ "replication_num" = "1"
+);
+
+INSERT INTO sample_values VALUES (10);
+
+SELECT STDDEV_SAMP(value) AS sample_stddev FROM sample_values;
++---------------+
+| sample_stddev |
++---------------+
+| NaN |
++---------------+
+```
diff --git
a/versioned_docs/version-4.x/sql-manual/sql-functions/aggregate-functions/var-samp.md
b/versioned_docs/version-4.x/sql-manual/sql-functions/aggregate-functions/var-samp.md
index 069452376d5..c3e61d58796 100644
---
a/versioned_docs/version-4.x/sql-manual/sql-functions/aggregate-functions/var-samp.md
+++
b/versioned_docs/version-4.x/sql-manual/sql-functions/aggregate-functions/var-samp.md
@@ -10,6 +10,14 @@
The VAR_SAMP function calculates the sample variance of a specified
expression. Unlike VARIANCE (population variance), VAR_SAMP uses n-1 as the
divisor, which is considered an unbiased estimate of the population variance in
statistics.
+The calculation formula is:
+
+$
+\mathrm{VAR\_SAMP}(x)=\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}
+$
+
+Where `n` is the number of valid values in the group.
+
## Alias
- VARIANCE_SAMP
@@ -28,7 +36,7 @@ VAR_SAMP(<expr>)
## Return Value
Returns a Double value representing the calculated sample variance.
-If there is no valid data in the group, returns NULL.
+If there is no valid data in the group, returns NULL. If the number of valid
values in the group is 1, returns NaN.
## Examples
```sql
@@ -66,3 +74,24 @@ FROM student_scores;
| 29.4107142857143 | 25.73437500000001 |
+------------------+---------------------+
```
+
+When the number of valid values is 1, `VAR_SAMP` returns `NaN`.
+
+```sql
+-- Create a single-column sample table
+CREATE TABLE sample_values (
+ value INT
+) DISTRIBUTED BY HASH(value)
+PROPERTIES (
+ "replication_num" = "1"
+);
+
+INSERT INTO sample_values VALUES (10);
+
+SELECT VAR_SAMP(value) AS sample_variance FROM sample_values;
++-----------------+
+| sample_variance |
++-----------------+
+| NaN |
++-----------------+
+```
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]