Re: [PROPOSAL] Grill as new Incubator project

Ted Dunning Fri, 19 Sep 2014 11:42:48 -0700

There is a strong phonetic similarity to Apache Drill, a project in the
same general domain.


Is the Grill name already baked in (pun intended)?



On Fri, Sep 19, 2014 at 7:24 AM, Mattmann, Chris A (3980) <
chris.a.mattm...@jpl.nasa.gov> wrote:

> Thank you Sharad. So I could use this system for remote sensing
> data, like 3-dimensional (time, space, and measurement) cubes?
> Does it support numerical data well?
>
> Sorry for so many questions, just excited :)
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: chris.a.mattm...@nasa.gov
> WWW:  http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
>
>
> -----Original Message-----
> From: Sharad Agarwal <sha...@apache.org>
> Reply-To: "sha...@apache.org" <sha...@apache.org>
> Date: Friday, September 19, 2014 4:06 AM
> To: Chris Mattmann <chris.a.mattm...@jpl.nasa.gov>
> Cc: "general@incubator.apache.org" <general@incubator.apache.org>
> Subject: Re: [PROPOSAL] Grill as new Incubator project
>
> >Chris, Thanks for your comments.
> >
> >
> >The differences that I see are:
> >- SciDB exposes an Array Data model and Array Query Language (AQL). Grill's
> >data model is based on OLAP Facts and Dimensions. Grill exposes a SQL-like
> >language (a subset of Hive QL) that works on *logical* entities (facts,
> >dimensions).
> >
> >
> >- The goal of Grill is not to build a new query execution database, but
> >to unify them by having a central metadata catalog, and provide a Cube
> >abstraction layer on top of it.
> >
> >
> >
> >Thanks,
> >Sharad
> >
> >
> >On Fri, Sep 19, 2014 at 9:34 AM, Mattmann, Chris A (3980)
> ><chris.a.mattm...@jpl.nasa.gov> wrote:
> >
> >This sounds super cool!
> >
> >How does this relate to SciDB? Is it trying to do a similar thing?
> >
> >Cheers,
> >Chris
> >
> >
> >++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >Chris Mattmann, Ph.D.
> >Chief Architect
> >Instrument Software and Science Data Systems Section (398)
> >NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> >Office: 168-519, Mailstop: 168-527
> >Email: chris.a.mattm...@nasa.gov
> >WWW:  http://sunset.usc.edu/~mattmann/
> >++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >Adjunct Associate Professor, Computer Science Department
> >University of Southern California, Los Angeles, CA 90089 USA
> >++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >
> >
> >
> >
> >
> >
> >-----Original Message-----
> >From: Sharad Agarwal <sha...@apache.org>
> >Reply-To: "general@incubator.apache.org" <general@incubator.apache.org>,
> >"sha...@apache.org" <sha...@apache.org>
> >Date: Thursday, September 18, 2014 8:54 PM
> >To: "general@incubator.apache.org" <general@incubator.apache.org>
> >Subject: [PROPOSAL] Grill as new Incubator project
> >
> >>Grill Proposal
> >>==========
> >>
> >># Abstract
> >>
> >>Grill is a platform that enables multi-dimensional queries in a unified
> >>way over datasets stored in multiple warehouses. Grill integrates Apache
> >>Hive with other data warehouses by tiering them together to form logical
> >>data cubes.
> >>
> >>
> >># Proposal
> >>
> >>Grill provides a unified Cube abstraction for data stored in different
> >>stores. Grill tiers multiple data warehouses for unified representation
> >>and efficient access. It provides a SQL-like Cube query language to query
> >>and describe data sets organized in data cubes. It enables users to run
> >>queries against Facts and Dimensions that can span multiple physical
> >>tables stored in different stores.
> >>
> >>The primary use cases that Grill aims to solve:
> >>- Facilitate analytical queries by providing an OLAP-like Cube
> >>abstraction
> >>- Data discovery by providing a single metadata layer for data stored in
> >>different stores
> >>- Unified access to data by integrating Hive with other traditional data
> >>warehouses
> >>
> >>
> >># Background
> >>
> >>Apache Hive is a data warehouse that facilitates querying and managing
> >>large datasets stored in distributed storage systems like HDFS. It
> >>provides a SQL-like language called HiveQL (aka HQL). Apache Hive is
> >>widely used in various organizations for ad hoc analytical queries.
> >>In a typical data warehouse scenario, the data is multi-dimensional and
> >>organized into Facts and Dimensions to form Data Cubes. Grill provides
> >>this logical layer to enable querying and managing data as Cubes.
> >>The Grill project is actively being developed at InMobi to provide a
> >>higher level of analytical abstraction for seamlessly querying data
> >>stored in different stores, including Hive and beyond.
> >>
> >>
> >># Rationale
> >>
> >>The Grill project aims to ease analytical querying and cut data silos
> >>by providing a single view of data across multiple data stores.
> >>Conceiving data as a cube with hierarchical dimensions leads to
> >>conceptually straightforward operations to facilitate analysis.
> >>Integrating Apache Hive with other traditional warehouses provides the
> >>opportunity to optimize query execution cost by tiering the data across
> >>multiple warehouses. Grill provides:
> >>- Access to data Cubes via Cube Query language similar to HiveQL.
> >>- Driver based architecture to allow for plugging systems like Hive and
> >>other warehouses such as columnar data RDBMS.
> >>- Cost based engine selection that provides optimal use of resources by
> >>selecting the best execution engine for a given query.
> >>
> >>In a typical Data warehouse, data is organized in Cubes with multiple
> >>dimensions and measures. This facilitates the analysis by conceiving the
> >>data in terms of Facts and Dimensions instead of physical tables. Grill
> >>aims to provide this logical Cube abstraction on Data warehouses like
> >>Hive
> >>and other traditional warehouses.
> >>
> >>
> >># Initial Goals
> >>
> >>- Donate the Grill source code and documentation to the Apache Software
> >>Foundation
> >>- Build a user and developer community
> >>- Support Hive and other Columnar data warehouses
> >>- Support full query life cycle management
> >>- Add authentication for querying cubes
> >>- Provide detailed query statistics
> >>
> >>
> >># Long Term Goals
> >>
> >>Here are some longer-term capabilities that would be added to Grill:
> >>- Add authorization for managing and querying Cubes
> >>- Provide REST and CLI for full Admin controls
> >>- Capability to schedule queries
> >>- Query caching
> >>- Integrate with Apache Spark: create Spark RDDs from Grill queries
> >>- Integrate with Apache Optiq
> >>
> >>
> >># Current Status
> >>
> >>The project is actively developed at InMobi. The first version was
> >>deployed at InMobi 4 months ago. This version allows querying dimension
> >>and fact data stored in Hive over the CLI. The source code and
> >>documentation are hosted on GitHub.
> >>
> >>## Meritocracy
> >>
> >>We intend to build a diverse developer and user community for the project
> >>following the Apache meritocracy model. We want to encourage contributors
> >>from multiple organizations, provide plenty of support to new developers
> >>and welcome them to be committers.
> >>
> >>## Community
> >>
> >>Currently the project is being developed at InMobi. We hope to extend our
> >>contributor and user base significantly in the future and build a solid
> >>open source community around Grill.
> >>
> >>## Core Developers
> >>
> >>Grill is currently being developed by Amareshwari Sriramadasu, Sharad
> >>Agarwal and Jaideep Dhok from InMobi, and Sreekanth Ramakrishnan who is
> >>currently employed by SoftwareAG. Raghavendra Singh from InMobi has built
> >>the QA automation for Grill.
> >>
> >>## Alignment
> >>
> >>The ASF is a natural home for Grill, as it is for Apache Hadoop, Apache
> >>Hive, Apache Spark and other emerging projects in the Big Data space.
> >>We believe that in any enterprise, multiple data warehouses will
> >>co-exist, as not all workloads are cost-effective to run on a single one.
> >>Apache Hive is one of the crucial data warehouses, along with upcoming
> >>projects like Apache Spark, in the Hadoop ecosystem. Grill will benefit
> >>from working in close proximity with these projects.
> >>Traditional columnar data warehouses complement Apache Hive, as certain
> >>workloads continue to be cost-effective to run in them. Having multiple
> >>data warehouses leads to data silos, which Grill aims to cut within the
> >>enterprise by providing holistic, unified access to data.
> >>
> >>
> >># Known Risks
> >>
> >>## Orphaned products & Reliance on Salaried Developers
> >>
> >>There is little risk of Grill getting orphaned, as Grill is a key part
> >>of the Data Platform stack at InMobi. The core Grill developers plan to
> >>work on it full-time. We think Grill will bring value to the Big Data
> >>space and we plan to grow the community of users and contributors.
> >>
> >>## Inexperience with Open Source
> >>
> >>All the core developers have long and significant experience in Apache
> >>projects and the Hadoop ecosystem. Amareshwari Sriramadasu has
> >>long-standing contributions to Apache Hadoop MapReduce and Apache Hive,
> >>and is a PMC member of Hadoop and a committer of Hive. Sharad Agarwal is
> >>a PMC member of Hadoop and has contributed to Hadoop YARN and Hadoop
> >>MapReduce. Srikanth Sundarrajan is a PMC member of Apache Falcon.
> >>Sreekanth Ramakrishnan is a committer of Apache Hadoop. Jaideep Dhok has
> >>contributed patches to Apache Hive. Gunther is a PMC member of Apache
> >>Hive. Vikram is a committer of Apache Hive.
> >>
> >>## Homogeneous Developers
> >>
> >>The initial developers are employed by Hortonworks, InMobi and
> >>SoftwareAG.
> >>We are committed to recruiting additional committers from other companies
> >>based on their contribution to the project.
> >>
> >>## Reliance on Salaried Developers
> >>
> >>The majority of initial committers are paid by their employer to
> >>contribute to the project, and a few are contributing in their spare
> >>time. Once the project has built a community, we are committed to
> >>recruiting committers and developers from outside the current core
> >>developers.
> >>
> >>## Relationships with Other Apache Products
> >>
> >>Grill is deeply integrated with other Apache projects. Grill uses and
> >>extends Apache Hive HCatalog to store and manage the Data cubes. It uses
> >>HDFS and Hive session management libraries. Grill has a driver-based
> >>architecture that allows for adding multiple execution drivers. Apart
> >>from integrating Apache Hive, it can be integrated with Apache Spark
> >>over Spark SQL or Shark, Apache Drill, Apache Tajo and Apache Phoenix.
> >>In the future we want to use Apache Optiq in Grill for query
> >>optimization and cost-based driver selection.
> >>
> >>## An Excessive Fascination with the Apache Brand
> >>
> >>The project was conceived from the beginning to be in line with the
> >>Apache philosophy. As the core developers have good experience with
> >>Apache, the source code organization, build, review and commit process
> >>are highly influenced by Apache. We believe that Apache will be a solid
> >>home for Grill to grow and build the open source community. We have also
> >>described the reasons in the Rationale and Alignment sections.
> >>
> >>
> >># Documentation
> >>
> >>http://inmobi.github.io/grill/
> >>
> >>
> >># Initial Source
> >>
> >>The source is currently in a GitHub repository at:
> >>https://github.com/inmobi/grill
> >>
> >>
> >># Source and Intellectual Property Submission Plan
> >>
> >>The complete Grill code is already under the Apache License, Version 2.0.
> >>
> >>
> >># External Dependencies
> >>
> >>The dependencies all have Apache compatible licenses. These include
> >>Apache
> >>2.0, BSD, MIT, EPL and CDDL licensed dependencies.
> >>
> >>
> >># Cryptography
> >>
> >>None
> >>
> >>
> >># Required Resources
> >>
> >>## Mailing lists
> >>
> >>grill-dev AT incubator DOT apache DOT org
> >>grill-commits AT incubator DOT apache DOT org
> >>grill-private AT incubator DOT apache DOT org
> >>
> >>## Subversion Directory
> >>
> >>Git is the preferred source control system:
> >>git://git.apache.org/incubator-grill
> >>
> >>## Issue Tracking
> >>
> >>JIRA Grill (GRILL)
> >>
> >>
> >># Initial Committers
> >>
> >>Amareshwari Sriramadasu (amareshwari AT apache DOT org)
> >>Gunther Hagleitner (gunther AT apache DOT org)
> >>Jaideep Dhok (jaideep.dhok AT Inmobi DOT com)
> >>Raghavendra Singh (raghavendra.singh AT Inmobi DOT com)
> >>Sharad Agarwal (sharad AT apache DOT org)
> >>Sreekanth Ramakrishnan (sreekanth AT apache DOT org)
> >>Srikanth Sundarrajan (sriksun AT apache DOT org)
> >>Suma Shivaprasad (suma.shivaprasad AT Inmobi DOT com)
> >>Vikram Dixit (vikram AT apache DOT org)
> >>
> >>
> >># Affiliations
> >>
> >>Amareshwari SR (InMobi)
> >>Gunther Hagleitner (Hortonworks)
> >>Jaideep Dhok (InMobi)
> >>Raghavendra Singh (InMobi)
> >>Sharad Agarwal (InMobi)
> >>Sreekanth Ramakrishnan (SoftwareAG)
> >>Srikanth Sundarrajan (InMobi)
> >>Suma Shivaprasad (InMobi)
> >>Vikram Dixit (Hortonworks)
> >>
> >>
> >># Sponsors
> >>
> >>## Champion
> >>
> >>Vinod K <vinodkv AT apache DOT org> (Apache Member)
> >>
> >>## Nominated Mentors
> >>
> >>Chris Douglas (Microsoft)
> >>Jacob Homan (Microsoft)
> >>Jean Baptiste Onofre (Talend)
> >>Vinod K (Hortonworks)
> >>
> >>## Sponsoring Entity
> >>
> >>Incubator PMC
> >
> >
> >
> >
> >
> >
> >
> >
