iffyio commented on code in PR #2352:
URL: 
https://github.com/apache/datafusion-sqlparser-rs/pull/2352#discussion_r3362091838


##########
src/parser/mod.rs:
##########
@@ -1717,6 +1731,23 @@ impl<'a> Parser<'a> {
             return prefix;
         }
 
+        // Memoize parse_prefix failures to break 2^N speculation when both
+        // prefix arms fail at every level (e.g. `IF(current_time(...x`).
+        // The per-arm cache in `parse_prefix_inner` complements this for
+        // chains where the reserved arm fails but the unreserved fallback
+        // succeeds (e.g. `case-case-...c`).
+        let start_index = self.index;
+        if let Some(cached) = self.failed_prefix_positions.get(&start_index) {
+            return Err(cached.clone());
+        }
+        let result = self.parse_prefix_inner();
+        if let Err(ref e) = result {
+            self.failed_prefix_positions.insert(start_index, e.clone());

Review Comment:
   essentially the cache is on the order of tokens - so that even a 2x memory 
increase is linear but doesn't make it ideal.
   and it would be nice to improve the behavior if we can. One way I can think 
of boils down to the fact that the error message is already just a 
deterministic function of the index we're caching, so from what I can tell, 
including the actual error string in the cache isn't beneficial. Hence my 
thinking to switch to storing something light weight.
   
   Here's a rough illustration of that thought in code (naming might be off).
   
   ```rust
   // Copy type to track the failure type (recursion limit is the only non-expr 
related error)
   #[Copy]
   enum ExprPrefixError {
        RecursionLimitExceeded
        Err
   }
   
   impl From<&ParserError> for ExprPrefixError {
        fn from(e) -> Self {
                if matches!(e, RecursionLimitExceeded) {
                        Self::RecursionLimitExceeded
                } else {
                        Self::Err
                }
        }
   }
   
   // ...
   // We change the cache to store metadata instead of the actual error.
   failed_prefix_positions: BTreeMap<usize, ExprPrefixError>
   failed_reserved_word_prefix_positions: BTreeMap<usize, ExprPrefixError>
   
   // On a cache miss, we store the metadata
   // ...
   self.failed_reserved_word_prefix_positions.insert(idx, 
ExprPrefixError::from(&e));
   
   // On a cache hit, we generate the error message on demand.
   // ...
   if let Some(cached) = self.failed_reserved_word_prefix_positions {
        return Err(self.generate_expr_prefix_error(cached, idx))
   }
   
   // where we can reconstruct the error message given the error context.
   fn generate_expr_prefix_error(err, index) {
        match err {
        ExprPrefixError::RecursionLimitExceeded => 
ParserError::RecursionLimitExceeded,
           ExprPrefixError::Err => self.expected_at::<()>("an expression", 
index)
       }
        
   }
   ```
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to