Hi Julien Le Dem,

I am one of the developers on the CarbonData project. Thanks for pointing out this issue. We are in the middle of rapid development of this new file format and have not yet written proper documentation.

CarbonData's goal is to be a columnar file format that can satisfy a variety of query scenarios, so by design it has some unique features such as a built-in multi-level index, operable encoded data, column groups, etc. (Liang pointed out some of them in his last post). But since it is a columnar file format, it shares some common terminology with Apache Parquet and Apache ORC, which I think is inevitable.

To minimize confusion in the future, we will improve our documentation. Do you have any other suggestions?

For the file format specification, I have updated the wiki and the Thrift definition to reflect the design of CarbonData. Please check whether there are still any issues.

Regards,
Jacky Li

--
View this message in context: http://apache-incubator-general.996316.n3.nabble.com/DISCUSS-CarbonData-incubation-proposal-tp49643p49676.html
Sent from the Apache Incubator - General mailing list archive at Nabble.com.