[CANCEL][VOTE] Release Apache Doris 0.12.0-incubating-rc01

2020-04-05 Thread lichaoyong
Dear all:

Sorry, but I have to cancel the previous release vote for Apache Doris
0.12.0-incubating-rc01, because:

1. There is a serious problem in the code that we have to fix before
release.
2. I have already started the Doris 0.12.0-incubating-rc02 vote in a
previous email. We can vote on Doris 0.12.0-incubating-rc02 instead.

Thanks,
Li Chaoyong




[Result][VOTE] Apache Doris 0.12.0-incubating-rc03

2020-04-15 Thread lichaoyong
Thanks to everyone, and this vote is now closed.

It has passed with 3 +1 (binding) votes and no 0 or -1 votes.

Binding:
+1 Zhao Chun
+1 Kang Kaisen
+1 Chen Mingyu

Best Regards,
Li Chaoyong



[ANNOUNCE] Apache Doris (incubating) 0.12.0 Release

2020-04-24 Thread lichaoyong
Hi All,

We are pleased to announce the release of Apache Doris 0.12.0-incubating.

Apache Doris (incubating) is an MPP-based interactive SQL data
warehouse for reporting and analysis.

The release is available at:
https://downloads.apache.org/incubator/doris/0.12.0-incubating

Thanks to everyone who has contributed to this release. The release
notes can be found here:
https://github.com/apache/incubator-doris/releases

Best Regards,

On behalf of the Doris team,
Li Chaoyong


DISCLAIMER-WIP:
Apache Doris is an effort undergoing incubation at The Apache Software
Foundation (ASF), sponsored by the Apache Incubator. Incubation is
required of all newly accepted projects until a further review
indicates that the infrastructure, communications, and decision making
process have stabilized in a manner consistent with other successful
ASF projects. While incubation status is not necessarily a reflection
of the completeness or stability of the code, it does indicate that the
project has yet to be fully endorsed by the ASF.

Some of the incubating project's releases may not be fully compliant
with ASF policy. For example, releases may have incomplete or
un-reviewed licensing conditions. What follows is a list of known
issues the project is currently aware of (note that this list, by
definition, is likely to be incomplete):

 * Releases may have incomplete licensing conditions

If you are planning to incorporate this work into your product/project,
please be aware that you will need to conduct a thorough licensing
review to determine the overall implications of including this work.
For the current status of this project through the Apache Incubator
visit: https://incubator.apache.org/projects/doris.html


New committer: Conghui Cai

2020-08-17 Thread lichaoyong
The Podling Project Management Committee (PPMC) for Apache Doris
has invited Conghui Cai to become a committer, and we are pleased
to announce that he has accepted.


==
Best Regards
Chaoyong Li
Email : lichaoy...@apache.org


Re: [VOTE] Release Apache Doris 0.13.0-incubating-rc03

2020-09-25 Thread lichaoyong
+1 Approve the release
I have checked:
   Download links
   Checksums and PGP signatures
   DISCLAIMER
   Source code artifacts have correct names matching the current release
   LICENSE and NOTICE files are correct for the repository
   All files have license headers if necessary
   No compiled archives bundled in source archive
   Building is OK

==
lichaoyong

On Thu, Sep 24, 2020 at 7:36 PM, ling miao wrote:

> Hi all,
>
> Please review and vote on Apache Doris 0.13.0-incubating-rc03 release.
>
> The release candidate has been tagged in GitHub as 0.13.0-rc03, available
> here:
> https://github.com/apache/incubator-doris/tree/0.13.0-rc03
>
> Release Notes are here:
> https://github.com/apache/incubator-doris/issues/4370
>
> Thanks to everyone who has contributed to this release.
>
> The artifacts (source, signature and checksum) corresponding to this
> release
> candidate can be found here:
> https://dist.apache.org/repos/dist/dev/incubator/doris/0.13/0.13.0-rc3/
>
> This has been signed with PGP key 517E5B28, corresponding to
> lingm...@apache.org.
> KEYS file is available here:
> https://dist.apache.org/repos/dist/dev/incubator/doris/KEYS
> It is also listed here:
> https://people.apache.org/keys/committer/lingmiao.asc
>
> To verify and build, you can refer to the following wiki pages:
> https://github.com/apache/incubator-doris/wiki/How-to-verify-Apache-Release
> https://wiki.apache.org/incubator/IncubatorReleaseChecklist
>
> The vote will be open for at least 72 hours.
> [ ] +1 Approve the release
> [ ] +0 No opinion
> [ ] -1 Do not release this package because ...
>
> Best Regards,
> Ling Miao


Optimize nested type implementation

2020-12-29 Thread lichaoyong
Array is used to store data such as user labels.
It is made up of (offset, element) pairs, and an element can itself be
an Array, for example:
[1, 2, 3], [4, 5, 6]
Doris has already implemented the array nested type, but the
implementation is not extensible:
 1. It stores NULL together with the offset, which is feasible only for
 Array, not for Struct. A Struct is made of {element1, element2} and has
 no implicit offset column. Array should store nulls as a separate
 column, not attached to the offset column.
 2. It stores offsets as absolute ordinals, so every offset ColumnWriter
 has to call back into ArrayColumnWriter to record the position. This
 logic is hard to follow.
 3. When reading the ArrayColumn, it has to seek to the next position to
 get the correct end offset.

Array will support functions such as **array_slice**, **flatten**, and
element access (**a[5]**).
These functions should avoid copying memory within the Array,
so the memory layout stores the element data only at the bottom level.
Each function returns a new Array and only changes the offsets.
```
The memory layout:

Two-level data: [[1, 2], [3, 4]], [[5, 6, 7], null, [8]], [[9, 10]]

* First Level Offsets (int32); the first level has no nulls

  | Bytes 0-3 | Bytes 4-7 | Bytes 8-11 | Bytes 12-15 |
  |-----------|-----------|------------|-------------|
  | 0         | 2         | 5          | 6           |

* Second Level Nulls (uint8)

  | Bytes 0-2 | Byte 3 | Bytes 4-5 |
  |-----------|--------|-----------|
  | 0, 0, 0   | 1      | 0, 0      |

* Second Level Offsets (int32)

  | Bytes 0-27           |
  |----------------------|
  | 0, 2, 4, 7, 7, 8, 10 |

* Elements array (int32)

  | Bytes 0-39                    |
  |-------------------------------|
  | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 |
```
As shown above, every level of the array stores only nulls and offsets.
The offsets also start with a leading zero, so that the size of the
first element can be computed the same way as all the others.
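
To illustrate why slicing and flattening need no element copy, here is a
minimal sketch. The names OffsetsView, slice_rows and flatten_level are
hypothetical and not taken from the Doris code base; null handling is
left out, and a null inner array is simply treated as empty.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical view over one level of a nested array.
// Row i covers child positions [offsets[i], offsets[i + 1]).
struct OffsetsView {
    std::vector<int32_t> offsets;  // number of rows + 1 entries

    std::size_t num_rows() const { return offsets.size() - 1; }
    int32_t row_size(std::size_t i) const { return offsets[i + 1] - offsets[i]; }
};

// Keep rows [begin, end): only offsets are copied, child data is untouched.
// The result still indexes into the same child buffer, so its first
// offset is no longer necessarily zero.
inline OffsetsView slice_rows(const OffsetsView& v, std::size_t begin, std::size_t end) {
    assert(begin <= end && end <= v.num_rows());
    OffsetsView out;
    out.offsets.assign(v.offsets.begin() + static_cast<std::ptrdiff_t>(begin),
                       v.offsets.begin() + static_cast<std::ptrdiff_t>(end) + 1);
    return out;
}

// Remove one nesting level: each outer boundary is mapped through the
// inner offsets, so the result points directly at the element array.
inline OffsetsView flatten_level(const OffsetsView& outer, const OffsetsView& inner) {
    OffsetsView out;
    out.offsets.reserve(outer.offsets.size());
    for (int32_t boundary : outer.offsets) {
        out.offsets.push_back(inner.offsets[static_cast<std::size_t>(boundary)]);
    }
    return out;
}
```

With the example data, flattening the first-level offsets (0, 2, 5, 6)
through the second-level offsets gives (0, 4, 8, 10), i.e. the rows
[1, 2, 3, 4], [5, 6, 7, 8], [9, 10] over the unchanged element array.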

The disk layout should take encoding algorithms and reading efficiency
into consideration.
So the disk layout stores the array_size of every array element rather
than its absolute offset.
```
The disk layout:

* First Level array_sizes (int32); the first level has no nulls

  | Bytes 0-3 | Bytes 4-7 | Bytes 8-11 |
  |-----------|-----------|------------|
  | 2         | 3         | 1          |

* Second Level Nulls (uint8)

  | Bytes 0-2 | Byte 3 | Bytes 4-5 |
  |-----------|--------|-----------|
  | 0, 0, 0   | 1      | 0, 0      |

* Second Level array_sizes (int32)

  | Bytes 0-23       |
  |------------------|
  | 2, 2, 3, 0, 1, 2 |

* Elements array (int32)

  | Bytes 0-39                    |
  |-------------------------------|
  | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 |
```
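
The two layouts carry the same information: sizes are the adjacent
differences of the offsets, and offsets are the prefix sums of the sizes
with a leading zero. A small sketch of the conversion, with hypothetical
helper names:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// In-memory offsets -> on-disk array_sizes (adjacent differences).
inline std::vector<int32_t> offsets_to_sizes(const std::vector<int32_t>& offsets) {
    assert(!offsets.empty());  // offsets always contain the leading zero
    std::vector<int32_t> sizes;
    sizes.reserve(offsets.size() - 1);
    for (std::size_t i = 1; i < offsets.size(); ++i) {
        sizes.push_back(offsets[i] - offsets[i - 1]);
    }
    return sizes;
}

// On-disk array_sizes -> in-memory offsets (prefix sums with leading zero).
inline std::vector<int32_t> sizes_to_offsets(const std::vector<int32_t>& sizes) {
    std::vector<int32_t> offsets;
    offsets.reserve(sizes.size() + 1);
    offsets.push_back(0);
    for (int32_t size : sizes) {
        offsets.push_back(offsets.back() + size);
    }
    return offsets;
}
```

Small sizes tend to be friendlier to integer encodings than steadily
growing absolute offsets, which is presumably part of the motivation.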
For nulls, array_sizes, and the element array, a dedicated ColumnWriter
is constructed to write each one.
The ArrayColumnWriter only needs to call these three writers:
```
1. Call the null writer to write nulls.
2. Call the array_size writer to write array_sizes, and add new metadata
   that records the correspondence between the array_size ordinal and
   the element ordinal.
3. Call the element writer recursively.
```
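
A rough sketch of these three steps for a single-level array of int32.
BufferWriter and ArrayWriterSketch are hypothetical stand-ins, not the
real ColumnWriter classes, and the "one metadata entry per block of
rows" granularity is an assumption made only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a scalar ColumnWriter: it just buffers values.
template <typename T>
struct BufferWriter {
    std::vector<T> values;
    void append(T v) { values.push_back(v); }
    int64_t next_ordinal() const { return static_cast<int64_t>(values.size()); }
};

struct ArrayWriterSketch {
    BufferWriter<uint8_t> null_writer;
    BufferWriter<int32_t> size_writer;
    BufferWriter<int32_t> element_writer;  // another array writer if nested
    // Assumed metadata: element ordinal at which each block of rows starts.
    std::vector<int64_t> block_first_element_ordinal;
    int64_t rows_per_block = 2;

    void append(const std::optional<std::vector<int32_t>>& row) {
        // 1. Write the null flag.
        null_writer.append(row.has_value() ? 0 : 1);
        // 2. Write the array_size; at a block boundary, record which
        //    element ordinal this array_size ordinal corresponds to.
        if (size_writer.next_ordinal() % rows_per_block == 0) {
            block_first_element_ordinal.push_back(element_writer.next_ordinal());
        }
        size_writer.append(row ? static_cast<int32_t>(row->size()) : 0);
        // 3. Hand the elements to the element writer.
        if (row) {
            for (int32_t v : *row) {
                element_writer.append(v);
            }
        }
    }
};
```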

On read, when seeking to a specified ordinal, the reader seeks the null,
array_size, and element columns separately.
When seeking the element column, it gets the start ordinal from the
array_size reader.
Because the array_size reader has already sought to the specified
ordinal, computing the element start ordinal needs only one sum.
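
One way to read "only one sum", sketched under the assumption that the
metadata stores the first element ordinal of each block of rows (the
struct and function names are hypothetical): the start ordinal is the
recorded block start plus the sum of the array_sizes that precede the
target row inside that block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical decoded array_size column plus its per-block metadata.
struct SizeColumnSketch {
    std::vector<int32_t> sizes;                        // one entry per row
    std::vector<int64_t> block_first_element_ordinal;  // one entry per block
    std::size_t rows_per_block;
};

// Element ordinal at which the array stored in `row` begins.
inline int64_t element_start_ordinal(const SizeColumnSketch& col, std::size_t row) {
    assert(row < col.sizes.size());
    const std::size_t block = row / col.rows_per_block;
    const std::size_t first_row = block * col.rows_per_block;
    // Single sum over the sizes between the block start and the target row.
    return std::accumulate(col.sizes.begin() + static_cast<std::ptrdiff_t>(first_row),
                           col.sizes.begin() + static_cast<std::ptrdiff_t>(row),
                           col.block_first_element_ordinal[block]);
}
```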

Aside from the read and write logic above, I will refactor the TypeInfo
class to support nested types (Array/Struct/Map).
I will split get_type_info into two functions:
```
using TypeInfoPtr = std::shared_ptr<TypeInfo>;
const TypeInfoPtr& get_type_info(FieldType type);
TypeInfoPtr get_type_info(const TabletColumn& column);
```
The first interface is used for scalar types, and the second interface
is used for all types, including nested types (Array/Struct/Map).
To avoid copying the shared_ptr, the scalar get_type_info interface
returns a const reference to the TypeInfoPtr.
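
A simplified sketch of how the two interfaces could fit together. All
types here (FieldTypeSketch, TypeInfoSketch, ColumnSketch) are
hypothetical stand-ins for FieldType, TypeInfo and TabletColumn, and the
function names differ from the proposal only to keep the sketch
self-contained. Scalar infos live in function-local statics, so they can
be returned by const reference; nested infos are built per column.

```cpp
#include <cassert>
#include <memory>
#include <vector>

enum class FieldTypeSketch { INT, DOUBLE, ARRAY };

struct TypeInfoSketch {
    FieldTypeSketch type;
    std::vector<std::shared_ptr<TypeInfoSketch>> children;  // empty for scalars
};
using TypeInfoSketchPtr = std::shared_ptr<TypeInfoSketch>;

struct ColumnSketch {
    FieldTypeSketch type;
    std::vector<ColumnSketch> sub_columns;  // item column(s) of a nested type
};

// Scalar lookup: returns a reference to a shared static instance,
// so no shared_ptr copy (and no refcount update) happens per call.
inline const TypeInfoSketchPtr& get_scalar_type_info(FieldTypeSketch type) {
    assert(type != FieldTypeSketch::ARRAY);
    static const TypeInfoSketchPtr int_info =
            std::make_shared<TypeInfoSketch>(TypeInfoSketch{FieldTypeSketch::INT, {}});
    static const TypeInfoSketchPtr double_info =
            std::make_shared<TypeInfoSketch>(TypeInfoSketch{FieldTypeSketch::DOUBLE, {}});
    return type == FieldTypeSketch::INT ? int_info : double_info;
}

// General lookup: scalars reuse the shared instance, nested types are
// built recursively from the column definition and returned by value.
inline TypeInfoSketchPtr get_type_info_for_column(const ColumnSketch& column) {
    if (column.type != FieldTypeSketch::ARRAY) {
        return get_scalar_type_info(column.type);
    }
    auto info = std::make_shared<TypeInfoSketch>();
    info->type = FieldTypeSketch::ARRAY;
    for (const ColumnSketch& sub : column.sub_columns) {
        info->children.push_back(get_type_info_for_column(sub));
    }
    return info;
}
```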