dataroaring commented on PR #53540:
URL: https://github.com/apache/doris/pull/53540#issuecomment-3284898297

   **🚨 CRITICAL: Lambda Memory Safety in Async Context**
   
   **File**: `be/src/cloud/cloud_tablet.cpp` lines 240-280
   
   ```cpp
   auto download_done = [=, version = rs_meta.version()](Status st) {
       // ❌ Implicit 'this' capture in async lambda
       if (st.ok()) {
           std::shared_lock rlock(_meta_lock);  // Uses 'this->_meta_lock'
           // ... more 'this' member access
       }
   };
   
   // Lambda is used in async context
   _engine.storage_resource().download_file(remote_path, local_path, 
download_done);
   ```
   
   **Issue**: The lambda captures variables by value (`[=]`) but implicitly 
accesses `this` members (`_meta_lock`, etc.) in an **async execution context**. 
   
   **Critical Risk**: 
   - **Use-after-free**: If the `CloudTablet` object is destroyed before the 
async callback executes, accessing `this` members causes undefined behavior
   - **Memory corruption**: Accessing deallocated memory can corrupt the heap
   - **Production crashes**: This could cause intermittent crashes that are 
hard to reproduce
   
   **Root Cause**: The download operation is asynchronous, but the lambda 
lifetime is tied to the tablet object lifetime without proper synchronization.
   
   **Fix Options**:
   
   **Option 1** - Use weak_ptr for safe access:
   ```cpp
   std::weak_ptr<CloudTablet> weak_self = shared_from_this();
   auto download_done = [weak_self, version = rs_meta.version()](Status st) {
       auto self = weak_self.lock();
       if (!self) return;  // Tablet was destroyed, safe to return
       
       if (st.ok()) {
           std::shared_lock rlock(self->_meta_lock);
           // ... safe member access through self
       }
   };
   ```
   
   **Option 2** - Explicit lifetime management:
   ```cpp
   auto tablet_id = tablet_id();  // Copy needed values
   auto download_done = [tablet_id, version = rs_meta.version()](Status st) {
       // Don't access 'this' members at all
       if (st.ok()) {
           LOG(INFO) << "Download completed for tablet: " << tablet_id;
           // Use global registry to safely notify completion
       }
   };
   ```
   
   **This is a blocking safety issue** that could cause production crashes and 
must be fixed before merge.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to