[ 
https://issues.apache.org/jira/browse/KYLIN-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18071277#comment-18071277
 ] 

ASF GitHub Bot commented on KYLIN-4260:
---------------------------------------

codecov-commenter commented on PR #950:
URL: https://github.com/apache/kylin/pull/950#issuecomment-4189191591

   ## 
[Codecov](https://app.codecov.io/gh/apache/kylin/pull/950?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 Report
   :white_check_mark: All modified and coverable lines are covered by tests.
   :white_check_mark: Project coverage is 25.57%. Comparing base 
([`6b853ce`](https://app.codecov.io/gh/apache/kylin/commit/6b853ceda852c92c4b3e884426be6b34e3c62e0d?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache))
 to head 
([`e2ca66e`](https://app.codecov.io/gh/apache/kylin/commit/e2ca66ea1a93e2e28775f7f9d80a6be52c4e63e4?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)).
   
   <details><summary>Additional details and impacted files</summary>
   
   
   
   ```diff
   @@             Coverage Diff              @@
   ##             master     #950      +/-   ##
   ============================================
   - Coverage     25.58%   25.57%   -0.01%     
   + Complexity     6140     6139       -1     
   ============================================
     Files          1412     1412              
     Lines         85072    85071       -1     
     Branches      11928    11928              
   ============================================
   - Hits          21762    21757       -5     
   - Misses        61205    61207       +2     
   - Partials       2105     2107       +2     
   ```
   </details>
   
   [:umbrella: View full report in Codecov by 
Sentry](https://app.codecov.io/gh/apache/kylin/pull/950?dropdown=coverage&src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   
   :loudspeaker: Have feedback on the report? [Share it 
here](https://about.codecov.io/codecov-pr-comment-feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   <details><summary> :rocket: New features to boost your workflow: </summary>
   
   - :snowflake: [Test 
Analytics](https://docs.codecov.com/docs/test-analytics): Detect flaky tests, 
report on failures, and find test suite problems.
   - :package: [JS Bundle 
Analysis](https://docs.codecov.com/docs/javascript-bundle-analysis): Save 
yourself from yourself by tracking and limiting bundle sizes in JS merges.
   </details>




> When using server side PreparedStatement cache, the query result are not 
> match on TopN scenario
> -----------------------------------------------------------------------------------------------
>
>                 Key: KYLIN-4260
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4260
>             Project: Kylin
>          Issue Type: Bug
>          Components: Query Engine
>    Affects Versions: v3.0.0-alpha2, v2.6.4
>            Reporter: Marc Wu
>            Assignee: Marc Wu
>            Priority: Major
>             Fix For: v3.1.0, v3.0.1
>
>         Attachments: image-2019-11-18-15-55-00-312.png, 
> image-2019-11-18-15-55-11-906.png, image-2019-11-18-19-29-34-489.png, 
> image-2019-11-18-19-29-42-721.png
>
>
> Hi Kylin team,
> I found an issue while server side PreparedStatement enabled. The second time 
> query and after's result will be different from the first when query TopN, 
> and the result is not right.
> Part of Cube info:
>  
>  Dimensions
>  TRANS_ID
>  PART_DT
>  SELLER_ID
>  BUYER_ID
> Measures:
>  SUM(PRICE)
>  MAX(PRICE)
>  TOPN(PRICE) Group By:KYLIN_SALES.SELLER_ID,KYLIN_SALES.BUYER_ID
>  
> SQL:
> {code:java}
> {"sql":"select seller_id, buyer_id, sum(PRICE) from glaucus.kylin_sales where 
> PART_DT >= ? and PART_DT <= ? group by seller_id, buyer_id order by 
> sum(PRICE) desc limit 20","project":"DDTFORTEST_Analytics", 
> "params":[{"className": "java.lang.String","value": 
> "2012-01-01"},{"className": "java.lang.String","value": "2012-01-10"}]}
> {code}
> The First query result:
> !image-2019-11-18-15-55-00-312.png!
> The Second and after:
> !image-2019-11-18-15-55-11-906.png|width=2046,height=1096!
>  -----------------------------------------------
> h2. -Root Cause-
> Cached preparedContext is changed when doing preparedStatement.executeQuery, 
> and losing groupByColumns. So the first execution result is correct, the 
> second and the after will be incorrect.
> !image-2019-11-18-19-29-34-489.png!
> !image-2019-11-18-19-29-42-721.png!
> h2. Real Root Cause
> The first time we entered PreparedStatement logic, we'll try to borrow 
> preparedContext from cache pool, of course there isn't any, but the cache 
> pool will execute create method to create a new preparedContext, and then 
> loaned it to us.
> I didn't figure out how adjustSqlDigest works before, and try to remove code
> {code:java}
> sqlDigest.groupbyColumns.removeAll(topnLiteralCol){code}
> but it's not right. Top-N isn't like some other measures, the dimensions 
> aren't as part of row key, they stored in measures in design, so it's why the 
> adjustSqlDigest matters, especially those codes.
> {code:java}
> sqlDigest.groupbyColumns.removeAll(topnLiteralCol);
> sqlDigest.metricColumns.addAll(topnLiteralCol);
> {code}
> The root cause for this issue is because the create sql digest is execute 
> again after we store it in cache, so the digest changed.
> {code:java}
> # This is from the first time
> fact table GLAUCUS.KYLIN_SALES,group by [],filter on 
> [GLAUCUS.KYLIN_SALES.PART_DT],with aggregates[FunctionDesc [expression=TOP_N, 
> parameter=GLAUCUS.KYLIN_SALES.PRICE,GLAUCUS.KYLIN_SALES.SELLER_ID,GLAUCUS.KYLIN_SALES.BUYER_ID,
>  returnType=topn(5000,8)]].
> # This is the second one
> fact table GLAUCUS.KYLIN_SALES,group by [],filter on 
> [GLAUCUS.KYLIN_SALES.PART_DT],with aggregates[FunctionDesc [expression=SUM, 
> parameter=GLAUCUS.KYLIN_SALES.PRICE, returnType=decimal(19,4)]].
> {code}
> So the second time and after we execute the same query or same pattern, the 
> expression will be changed to SUM instead of TOPN, that's why the strange 
> result show up.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to