paleolimbot opened a new issue, #745:
URL: https://github.com/apache/sedona-db/issues/745
There is at least one place where we may be double counting a large chunk of
memory in the EvaluatedBatch:
```rust
// NOTE: sometimes `geom_array` will reuse the memory of `batch`, especially when
// the expression for evaluating the geometry is a simple column reference. In this case,
// the in_mem_size will be overestimated. It is a conservative estimation so there's no risk
// of running out of memory because of underestimation.
let record_batch_size = get_record_batch_memory_size(&self.batch)?;
let geom_array_size = self.geom_array.in_mem_size()?;
Ok(record_batch_size + geom_array_size)
```
This might explain an issue we ran into when trying to enable this by
default: we determined that we'd need to set the memory pool size to more than
twice the memory that was actually required for a join used in the released
post. My reading of that comment is that we would be reserving ~2x as much
memory as required for most joins, but I have not investigated.
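
To illustrate the suspected effect, here is a minimal standalone sketch. It
does not use the sedona-db or Arrow APIs: `Buffer`, `naive_size`, and
`deduplicated_size` are hypothetical names, and an `Arc<Vec<u8>>` stands in for
a reference-counted Arrow buffer. It shows how summing the batch and the
geometry array separately counts a shared buffer twice, and how tracking
allocations by pointer identity would avoid that.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Hypothetical stand-in for a reference-counted Arrow buffer.
type Buffer = Arc<Vec<u8>>;

/// Naive estimate: sum every buffer, even if both arguments share it.
fn naive_size(batch: &[Buffer], geom: &[Buffer]) -> usize {
    batch.iter().chain(geom.iter()).map(|b| b.len()).sum()
}

/// Deduplicated estimate: count each distinct allocation once,
/// using the address behind the `Arc` as its identity.
fn deduplicated_size(batch: &[Buffer], geom: &[Buffer]) -> usize {
    let mut seen = HashSet::new();
    batch
        .iter()
        .chain(geom.iter())
        .filter(|b| seen.insert(Arc::as_ptr(*b) as usize))
        .map(|b| b.len())
        .sum()
}

fn main() {
    // A batch with a small id column and a large geometry column.
    let geometry_column: Buffer = Arc::new(vec![0u8; 1024]);
    let id_column: Buffer = Arc::new(vec![0u8; 64]);
    let batch = vec![id_column, Arc::clone(&geometry_column)];

    // A simple column reference: the geometry array reuses the batch's buffer.
    let geom = vec![Arc::clone(&geometry_column)];

    println!("naive: {}", naive_size(&batch, &geom)); // 64 + 1024 + 1024
    println!("deduplicated: {}", deduplicated_size(&batch, &geom)); // 64 + 1024
}
```

When the geometry column dominates the batch, the naive estimate approaches
2x the real footprint, which would be consistent with the pool size we needed.
Real Arrow buffers can also be slices into a shared allocation, so an actual
fix would likely need more care than comparing pointers.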