EmmyMiao87 opened a new issue #2830: Support subquery in disjunction
URL: https://github.com/apache/incubator-doris/issues/2830
 
 
   **Is your feature request related to a problem? Please describe.**
   Doris does not support the subquery in disjunction.
   For example:
   ```
   SELECT *
   FROM customer c
   WHERE   EXISTS
           (
                 SELECT *
                 FROM web_sales
                 WHERE c.c_customer_sk = ws_bill_customer_sk
           )
           OR EXISTS
           (
                 SELECT *
                 FROM catalog_sales
                 WHERE c.c_customer_sk = cs_ship_customer_sk
   
           );
   ```
   ```
   SELECT *
   FROM  customer_address, item
   WHERE
       substr(ca_zip, 1, 5)
   
       IN
   
       (
            '85669'
       )
       OR i_item_id
   
       IN
   
       (
             SELECT i_item_id
             FROM item
             WHERE i_item_sk IN (2)
       );
   ```
   Subqueries in OR predicates are not supported
   
   TPC-DS 10,35,45 query:
   ```
   query10:
   SELECT cd_gender,
         cd_marital_status,
         cd_education_status,
         count(*) cnt1,
         cd_purchase_estimate,
         count(*) cnt2,
         cd_credit_rating,
         count(*) cnt3,
         cd_dep_count,
         count(*) cnt4,
         cd_dep_employed_count,
         count(*) cnt5,
         cd_dep_college_count,
         count(*) cnt6
   FROM customer c,customer_address ca,customer_demographics
   WHERE c.c_current_addr_sk = ca.ca_address_sk
         AND ca_county IN ('Walker County','Richland County','Gaines 
County','Douglas County','Dona Ana County')
         AND cd_demo_sk = c.c_current_cdemo_sk
         AND EXISTS
         (
   
               SELECT *
               FROM store_sales,date_dim
               WHERE c.c_customer_sk = ss_customer_sk
                     AND ss_sold_date_sk = d_date_sk
                     AND d_year = 2002
                    AND d_moy
                     BETWEEN 4
                     AND 4+3
   
         )
         AND
   
         (
   
               EXISTS
               (
   
                     SELECT *
                     FROM web_sales,date_dim
                     WHERE c.c_customer_sk = ws_bill_customer_sk
                           AND ws_sold_date_sk = d_date_sk
                           AND d_year = 2002
                           AND d_moy
                     BETWEEN 4
                     AND 4+3
   
               )
               OR EXISTS
               (
   
                     SELECT *
                     FROM catalog_sales,date_dim
                     WHERE c.c_customer_sk = cs_ship_customer_sk
                           AND cs_sold_date_sk = d_date_sk
                           AND d_year = 2002
                           AND d_moy
                           BETWEEN 4
                           AND 4+3
   
               )
   
         )
   GROUP BY cd_gender, cd_marital_status, cd_education_status, 
cd_purchase_estimate, cd_credit_rating, cd_dep_count, cd_dep_employed_count, 
cd_dep_college_count
   ORDER BY cd_gender, cd_marital_status, cd_education_status, 
cd_purchase_estimate, cd_credit_rating, cd_dep_count, cd_dep_employed_count, 
cd_dep_college_count
   
   LIMIT 100;
   ```
   
   ```
   query 35:
   SELECT ca_state,
         cd_gender,
         cd_marital_status,
         cd_dep_count,
         count(*) cnt1,
         avg(cd_dep_count),
         max(cd_dep_count),
         sum(cd_dep_count),
         cd_dep_employed_count,
         count(*) cnt2,
         avg(cd_dep_employed_count),
         max(cd_dep_employed_count),
         sum(cd_dep_employed_count),
         cd_dep_college_count,
         count(*) cnt3,
         avg(cd_dep_college_count),
         max(cd_dep_college_count),
         sum(cd_dep_college_count)
   FROM customer c,customer_address ca,customer_demographics
   WHERE c.c_current_addr_sk = ca.ca_address_sk
         AND cd_demo_sk = c.c_current_cdemo_sk
         AND EXISTS
         (
   
               SELECT *
               FROM store_sales,date_dim
               WHERE c.c_customer_sk = ss_customer_sk
                     AND ss_sold_date_sk = d_date_sk
                     AND d_year = 1999
                     AND d_qoy < 4
   
         )
         AND
   
         (
   
               EXISTS
               (
   
                     SELECT *
                     FROM web_sales,date_dim
                     WHERE c.c_customer_sk = ws_bill_customer_sk
                           AND ws_sold_date_sk = d_date_sk
                           AND d_year = 1999
                           AND d_qoy < 4
   
               )
               OR EXISTS
               (
   
                     SELECT *
                     FROM catalog_sales,date_dim
                     WHERE c.c_customer_sk = cs_ship_customer_sk
                           AND cs_sold_date_sk = d_date_sk
                           AND d_year = 1999
                           AND d_qoy < 4
   
               )
   
         )
   GROUP BY ca_state, cd_gender, cd_marital_status, cd_dep_count, 
cd_dep_employed_count, cd_dep_college_count
   ORDER BY ca_state, cd_gender, cd_marital_status, cd_dep_count, 
cd_dep_employed_count, cd_dep_college_count
   
   LIMIT 100;
   ```
   
   ```
   Query 45:
   SELECT ca_zip, ca_county, SUM(ws_sales_price)
   FROM web_sales, customer, customer_address, date_dim, item
   WHERE
   
         (
   
               ws_bill_customer_sk = c_customer_sk
              AND c_current_addr_sk = ca_address_sk
              AND ws_item_sk = i_item_sk
              AND
   
              (
   
                     substr(ca_zip, 1, 5)
   
                     IN
   
                     (
                          '85669',
                          '86197',
                          '88274',
                          '83405',
                           '86475',
                           '85392',
                           '85460',
                          '80348',
                           '81792'
                    )
                    OR i_item_id
   
                     IN
   
                     (
                           SELECT i_item_id
                           FROM item
                           WHERE i_item_sk IN (2, 3, 5, 7, 11, 13, 17, 19, 23, 
29)
                    )
   
               )
               AND ws_sold_date_sk = d_date_sk
              AND d_qoy = 2
              AND d_year = 2000
   
         )
   GROUP BY ca_zip, ca_county
   ORDER BY ca_zip, ca_county
   LIMIT 100;
   ```
   
   **Describe the solution you'd like**
   The subquery could be supported inn disjunction.
   Select * from t1 where a in subquery1 or a b in subquery2.
   
   **Competitive analysis**
   Greenplum:
   Impala:
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org

Reply via email to