morningman opened a new issue #5337: URL: https://github.com/apache/incubator-doris/issues/5337
The current load feature lets users filter part of the imported data with a `where` clause. The columns referenced in that condition hold values that have already gone through column mapping and transformation, and they must exist in the destination table.

Users may, however, have requirements like the following. Suppose the data for two tables is produced into the same Kafka queue. The user wants to tell the two tables' rows apart by an identifier column in the data and load them selectively, but this identifier column does not exist in the destination table. `where` cannot support this scenario.

This issue therefore proposes a new kind of filter condition that pre-filters the raw data. The resulting data processing flow is:

1. Filter the raw data with the pre-filter condition.
2. Apply column mapping and transformation to the filtered data.
3. Filter the transformed data with the `where` condition.
4. Load the final data.
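To make the ordering of the four steps concrete, here is a minimal Python sketch. It is only an illustration of the proposed flow, not Doris code: the function `load_rows`, the `table_tag` identifier column, and the sample rows are all hypothetical names invented for this example.

```python
from typing import Callable, Dict, Iterable, List

# A row is modelled as a plain dict; this is an assumption for illustration only.
Row = Dict[str, object]


def load_rows(
    raw_rows: Iterable[Row],
    pre_filter: Callable[[Row], bool],
    transform: Callable[[Row], Row],
    post_filter: Callable[[Row], bool],
) -> List[Row]:
    """Hypothetical helper that applies the four steps in order."""
    loaded: List[Row] = []
    for raw in raw_rows:
        # Step 1: the pre-filter sees the raw row, including columns
        # that do not exist in the destination table.
        if not pre_filter(raw):
            continue
        # Step 2: column mapping and transformation.
        row = transform(raw)
        # Step 3: the existing `where`-style filter sees only transformed columns.
        if not post_filter(row):
            continue
        # Step 4: the row is kept for loading.
        loaded.append(row)
    return loaded


# Two tables share one queue; `table_tag` (hypothetical) tells them apart.
raw_rows = [
    {"table_tag": "orders", "id": "1", "amount": "10"},
    {"table_tag": "users", "id": "2", "amount": "0"},
    {"table_tag": "orders", "id": "3", "amount": "-5"},
]

result = load_rows(
    raw_rows,
    pre_filter=lambda raw: raw["table_tag"] == "orders",
    transform=lambda raw: {"id": int(raw["id"]), "amount": int(raw["amount"])},
    post_filter=lambda row: row["amount"] > 0,
)
print(result)
```

In this sketch the pre-filter can reference `table_tag` even though the transformed rows no longer carry it, which is the case the existing `where` filter cannot express.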