fritzb commented on issue #6768:
URL: https://github.com/apache/pinot/issues/6768#issuecomment-1652264822

   I have a simplified DBT use case that I believe is a good first step toward 
integrating Pinot with DBT, and I'd like to start a discussion about whether it 
is feasible as a short project. The idea is to run the DBT DAG mostly in 
Trino/Iceberg and, as the last step in DBT, convert the final materialized 
table from Iceberg to Pinot. This narrows the project down to converting the 
final materialized table, in Parquet format, into a Pinot table.
   
   The workflow is as follows:
   
   1. Create the model(s) in DBT using Trino/Iceberg, so that their output 
lands in the data lake.
   2. The very last stage of the DAG is a materialized table that powers the 
Business Intelligence dashboards. Today this table is written as Iceberg (via 
Trino/Iceberg).
   3. This is where Pinot comes in, to make the dashboards blazing fast. We aim 
to convert that last materialized table from Iceberg to a Pinot table. 
Currently this is done manually: since Pinot does not support Iceberg 
ingestion, we copy the Iceberg table into plain Parquet files, delete and 
re-create the schema+table in Pinot, and then ingest the Parquet files into the 
Pinot table by pointing the ingestion config at an S3 location.
   
   @xiangfu0 I was wondering if Step 3 could be automated as a DBT connector by:
   
   1. Detecting the new schema of the materialized table and automatically 
creating the Pinot schema+table (offline table?).
   2. Inserting the rows of the materialized table into the Pinot table. If 
inserts via Trino are not supported, can we instruct Pinot to ingest Parquet 
from an S3 location using a SQL statement? That way, I could write the 
ingestion instruction as a DBT SQL model.
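
   To make item 1 a bit more concrete, here is a rough sketch of the schema 
detection step. It assumes the connector can already read the column names and 
Trino type names of the materialized table. The function name 
`build_pinot_schema`, the `TRINO_TO_PINOT` mapping, and the example table are 
all hypothetical and only for illustration; the mapping is not exhaustive, and 
the choice of which columns are metrics is left to the caller.

```python
import json

# Assumed, non-exhaustive mapping from Trino type names to Pinot data types.
TRINO_TO_PINOT = {
    "boolean": "BOOLEAN",
    "integer": "INT",
    "bigint": "LONG",
    "real": "FLOAT",
    "double": "DOUBLE",
    "varchar": "STRING",
    "timestamp": "TIMESTAMP",
}


def build_pinot_schema(table_name, columns, metric_columns=()):
    """Build a Pinot schema dict from (column_name, trino_type) pairs.

    Columns listed in metric_columns become metric fields; every other
    column becomes a dimension field. Unknown Trino types raise ValueError
    rather than being guessed.
    """
    metrics = set(metric_columns)
    schema = {
        "schemaName": table_name,
        "dimensionFieldSpecs": [],
        "metricFieldSpecs": [],
    }
    for name, trino_type in columns:
        # Strip type parameters such as varchar(32) or timestamp(6).
        base_type = trino_type.lower().split("(")[0].strip()
        if base_type not in TRINO_TO_PINOT:
            raise ValueError(f"no Pinot mapping for {name}: {trino_type}")
        spec = {"name": name, "dataType": TRINO_TO_PINOT[base_type]}
        key = "metricFieldSpecs" if name in metrics else "dimensionFieldSpecs"
        schema[key].append(spec)
    return schema


if __name__ == "__main__":
    # Hypothetical materialized table, used only to show the output shape.
    example = build_pinot_schema(
        "orders_summary",
        [("region", "varchar(32)"), ("order_count", "bigint")],
        metric_columns=["order_count"],
    )
    print(json.dumps(example, indent=2))
```

   The resulting JSON could then be sent to the Pinot controller's schema 
endpoint before the table itself is created. How the table config and the 
Parquet ingestion would be driven from DBT is the open question above.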


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

