[CANCEL][VOTE] Release Apache Doris 0.12.0-incubating-rc01
Dear all,

I am sorry that I have to cancel the previous release vote for Apache Doris 0.12.0-incubating-rc01, because:

1. There is a serious problem in the code that we have to fix before the release.
2. I have already started the Doris 0.12.0-incubating-rc02 vote in a previous email.

We can vote on Doris 0.12.0-incubating-rc02 instead.

Thanks,
Li Chaoyong
[Result][VOTE] Apache Doris 0.12.0-incubating-rc03
Thanks to everyone, and this vote is now closed.

It has passed with 3 +1 (binding) votes and no 0 or -1 votes.

Binding:
+1 Zhao Chun
+1 Kang Kaisen
+1 Chen Mingyu

Best Regards,
Li Chaoyong
[ANNOUNCE] Apache Doris (incubating) 0.12.0 Release
Hi All,

We are pleased to announce the release of Apache Doris 0.12.0-incubating.

Apache Doris (incubating) is an MPP-based interactive SQL data warehouse for reporting and analysis.

The release is available at:
https://downloads.apache.org/incubator/doris/0.12.0-incubating

Thanks to everyone who has contributed to this release. The release notes can be found here:
https://github.com/apache/incubator-doris/releases

Best Regards,
On behalf of the Doris team,
Li Chaoyong

DISCLAIMER-WIP:
Apache Doris is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.

Some of the incubating project’s releases may not be fully compliant with ASF policy. For example, releases may have incomplete or un-reviewed licensing conditions. What follows is a list of known issues the project is currently aware of (note that this list, by definition, is likely to be incomplete):

* Releases may have incomplete licensing conditions

If you are planning to incorporate this work into your product/project, please be aware that you will need to conduct a thorough licensing review to determine the overall implications of including this work. For the current status of this project through the Apache Incubator visit: https://incubator.apache.org/projects/doris.html
New committer: Conghui Cai
The Podling Project Management Committee (PPMC) for Apache Doris has invited Conghui Cai to become a committer and we are pleased to announce that he has accepted. == Best Regards Chaoyong Li Email : lichaoy...@apache.org
Re: [VOTE] Release Apache Doris 0.13.0-incubating-rc03
+1 Approve the release

I have checked:
Download links
Checksums and PGP signatures
DISCLAIMER
Source code artifacts have correct names matching the current release
LICENSE and NOTICE files are correct for the repository
All files have license headers if necessary
No compiled archives bundled in source archive
Building is OK

==
lichaoyong

ling miao wrote on Thursday, September 24, 2020 at 7:36 PM:

> Hi all,
>
> Please review and vote on Apache Doris 0.13.0-incubating-rc03 release.
>
> The release candidate has been tagged in GitHub as 0.13.0-rc03, available here:
> https://github.com/apache/incubator-doris/tree/0.13.0-rc03
>
> Release Notes are here:
> https://github.com/apache/incubator-doris/issues/4370
>
> Thanks to everyone who has contributed to this release.
>
> The artifacts (source, signature and checksum) corresponding to this release
> candidate can be found here:
> https://dist.apache.org/repos/dist/dev/incubator/doris/0.13/0.13.0-rc3/
>
> This has been signed with PGP key 517E5B28, corresponding to lingm...@apache.org.
> KEYS file is available here:
> https://dist.apache.org/repos/dist/dev/incubator/doris/KEYS
> It is also listed here:
> https://people.apache.org/keys/committer/lingmiao.asc
>
> To verify and build, you can refer to following wiki:
> https://github.com/apache/incubator-doris/wiki/How-to-verify-Apache-Release
> https://wiki.apache.org/incubator/IncubatorReleaseChecklist
>
> The vote will be open for at least 72 hours.
> [ ] +1 Approve the release
> [ ] +0 No opinion
> [ ] -1 Do not release this package because ...
>
> Best Regards,
> Ling Miao
Optimize nested type implementation
Array is used to store data such as user labels. It is made up of (offsets, elements), and an element can itself be an Array, for example:

[1, 2, 3], [4, 5, 6]

Doris already implements the nested Array type, but the implementation is not extensible:

1. It stores NULL together with the offset, which is feasible for Array but not for Struct. A Struct is made of {element1, element2} and has no implicit offset column. Array should store nulls in a separate column instead of attaching them to the offset column.
2. It stores offsets as absolute ordinals, so every offset ColumnWriter has to call back into ArrayColumnWriter to record the position. This logic is hard to understand.
3. When reading the ArrayColumn, it has to seek to the next position to get the correct end offset.

Array will support functions such as **array_slice, flatten, a[5]**. These functions should avoid copying memory in the Array. So the memory layout stores the data only at the bottom level, and every function returns a new Array in which only the offsets change.

```
The memory layout:

Two-level data: [[1, 2], [3, 4]], [[5, 6, 7], null, [8]], [[9, 10]]

* First Level Offsets (int32), First Level has no nulls

| Bytes 0-3 | Bytes 4-7 | Bytes 8-11 | Bytes 12-15 |
|-----------|-----------|------------|-------------|
| 0         | 2         | 5          | 6           |

* Second Level Nulls (uint8)

| Bytes 0-2 | Byte 3 | Bytes 4-5 |
|-----------|--------|-----------|
| 0         | 1      | 0         |

* Second Level Offsets (int32)

| Bytes 0-27           |
|----------------------|
| 0, 2, 4, 7, 7, 8, 10 |

* Elements array (int32):

| Bytes 0-39                    |
|-------------------------------|
| 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 |
```

As shown above, every level of the array stores only nulls and offsets. The offsets also begin with a leading zero so that the size of the first array can be computed the same way as the others.

The disk layout should take encoding algorithms and reading efficiency into consideration. So the disk layout stores an array_size for every array instead of absolute offsets.
```
The disk layout

* First Level array_sizes (int32), First Level has no nulls

| Bytes 0-3 | Bytes 4-7 | Bytes 8-11 |
|-----------|-----------|------------|
| 2         | 3         | 1          |

* Second Level Nulls (uint8)

| Bytes 0-2 | Byte 3 | Bytes 4-5 |
|-----------|--------|-----------|
| 0         | 1      | 0         |

* Second Level array_sizes (int32)

| Bytes 0-23       |
|------------------|
| 2, 2, 3, 0, 1, 2 |

* Elements array (int32):

| Bytes 0-39                    |
|-------------------------------|
| 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 |
```

For the nulls, the array_sizes, and the element array, a dedicated ColumnWriter is constructed to write the data. The ArrayColumnWriter only needs to call these three writers.

```
1. Call the writer to write nulls.
2. Call the writer to write array_sizes. Also add a new meta to record the mapping between array_size ordinal and element ordinal.
3. Call the element writer recursively.
```

On read, when seeking to a specified ordinal, the reader seeks the null, array_size, and element columns separately. When seeking the element column, it gets the start ordinal from the array_size reader. Because the array_size reader has to seek to the specified ordinal anyway, obtaining the start ordinal needs only one sum.

Aside from the read and write logic above, I will refactor the TypeInfo class to support nested types (Array/Struct/Map). I split the lookup into two functions:

```
// The template argument was lost in the archived message; TypeInfo is assumed here.
using TypeInfoPtr = std::shared_ptr<TypeInfo>;
const TypeInfoPtr& get_type_info(FieldType type);
TypeInfoPtr get_type_info(const TabletColumn& column);
```

The first interface is used for scalar types, and the second is used for all types, including nested types (Array/Struct/Map). To avoid copying the shared_ptr, the scalar get_type_info interface returns a const reference.
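The relation between the two layouts above can be illustrated with a minimal C++ sketch. It assumes sizes and offsets are plain `int32_t` vectors; the helper names `sizes_to_offsets`, `offsets_to_sizes`, and `child_range` are hypothetical and are not part of the Doris code base. It only shows the arithmetic (a prefix sum with a leading zero on read, adjacent differences on write), not the actual ColumnWriter or ColumnReader logic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: turn on-disk per-array sizes into in-memory offsets.
// The result has sizes.size() + 1 entries and always starts with 0.
std::vector<int32_t> sizes_to_offsets(const std::vector<int32_t>& sizes) {
    std::vector<int32_t> offsets;
    offsets.reserve(sizes.size() + 1);
    offsets.push_back(0);
    for (int32_t size : sizes) {
        assert(size >= 0);
        offsets.push_back(offsets.back() + size);  // running prefix sum
    }
    return offsets;
}

// Hypothetical helper: the inverse direction, as a writer would need it.
std::vector<int32_t> offsets_to_sizes(const std::vector<int32_t>& offsets) {
    assert(!offsets.empty() && offsets.front() == 0);
    std::vector<int32_t> sizes;
    sizes.reserve(offsets.size() - 1);
    for (std::size_t i = 1; i < offsets.size(); ++i) {
        sizes.push_back(offsets[i] - offsets[i - 1]);  // adjacent difference
    }
    return sizes;
}

// Half-open range [begin, end) of child ordinals owned by the array at `ordinal`.
std::pair<int32_t, int32_t> child_range(const std::vector<int32_t>& offsets,
                                        std::size_t ordinal) {
    assert(ordinal + 1 < offsets.size());
    return {offsets[ordinal], offsets[ordinal + 1]};
}
```

With the example data, the second-level sizes `2, 2, 3, 0, 1, 2` become the offsets `0, 2, 4, 7, 7, 8, 10`. The null entry contributes a size of 0, so its begin and end offsets are equal, and the separate nulls column is what distinguishes it from an empty array.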