zeroshade opened a new issue, #827: URL: https://github.com/apache/arrow-go/issues/827
### Describe the bug, including details regarding any error messages, version, and platform. **Surfaced while implementing #825 (closes #815).** See the PR's "Out of scope" section and [this comment](https://github.com/apache/arrow-go/pull/825#issuecomment-4530190386). `*scalar.Extension` does not implement `Release()` or `Retain()`. As a result, when `compute.ScalarDatum.Release()` is called on a `*ScalarDatum` whose `Value` is an extension scalar, the storage scalar's `Release()` is never invoked and any underlying buffers leak. #### Mechanism `compute.ScalarDatum.Release()` only forwards `Release()` if the wrapped scalar satisfies the unexported `releasable` interface ([`arrow/compute/datum.go#L122-L130`](https://github.com/apache/arrow-go/blob/v18.5.2/arrow/compute/datum.go#L122-L130)): ```go type releasable interface { Release() } func (d *ScalarDatum) Release() { if v, ok := d.Value.(releasable); ok { v.Release() } } ``` `*scalar.Extension` ([`arrow/scalar/scalar.go#L397-L467`](https://github.com/apache/arrow-go/blob/v18.5.2/arrow/scalar/scalar.go#L397-L467)) defines `value()`, `equals()`, `Validate()`, `ValidateFull()`, `CastTo()`, and `String()` — but neither `Release()` nor `Retain()`. The type assertion in `ScalarDatum.Release()` therefore fails and the inner storage scalar is never released, even though the storage scalar itself (e.g. `*scalar.FixedSizeList`) carries a buffer-bearing array. Peer scalar types implement the pair correctly by delegating to their inner storage. For example, `*scalar.Binary` ([`arrow/scalar/binary.go#L43-L53`](https://github.com/apache/arrow-go/blob/v18.5.2/arrow/scalar/binary.go#L43-L53)): ```go func (b *Binary) Retain() { if b.Value != nil { b.Value.Retain() } } func (b *Binary) Release() { if b.Value != nil { b.Value.Release() } } ``` `*scalar.List`, `*scalar.Struct`, `*scalar.Dictionary`, `*scalar.SparseUnion`, `*scalar.DenseUnion`, and `*scalar.RunEndEncoded` all follow the same pattern in `arrow/scalar/nested.go`. `*scalar.Extension` is the lone outlier among scalar types whose storage holds buffers. #### Affected paths Anything that produces a `*scalar.Extension` and reaches `ScalarDatum.Release()`. Concretely in this repo: - The `*types.IntervalYearToMonthType` and `*types.IntervalDayType` branches in `*expr.ProtoLiteral` handling, factored into `intervalYearToMonthDatum` / the inline `*types.IntervalDayType` branch in [`arrow/compute/exprs/exec.go`](https://github.com/apache/arrow-go/blob/main/arrow/compute/exprs/exec.go) (the `intervalYearToMonthDatum` helper is added in #825; the underlying leak predates that PR). - The new `expr.IntervalYearToMonthLiteral` value/pointer cases added in #825. - The `intervalYear()` / `intervalDay()` extension type constructors in [`arrow/compute/exprs/extension_types.go#L142-L147`](https://github.com/apache/arrow-go/blob/v18.5.2/arrow/compute/exprs/extension_types.go#L142-L147). - Any user-defined extension type whose storage scalar holds buffers. #### Reproducer Drive any of the affected paths with `memory.NewCheckedAllocator` and call `Release()` on the resulting `*ScalarDatum`. `AssertSize(t, 0)` will fail because the storage scalar's buffer is never released. (#825's `TestLiteralToDatumIntervalYearToMonth` deliberately uses `memory.DefaultAllocator` for exactly this reason and explicitly defers the leak fix to this issue.) ### Expected behavior `*scalar.Extension` should implement `Release()` and `Retain()` by delegating to `s.Value` when the inner storage scalar implements `Releasable` ([`arrow/scalar/scalar.go#L68-L71`](https://github.com/apache/arrow-go/blob/v18.5.2/arrow/scalar/scalar.go#L68-L71)), mirroring `*scalar.Binary` and the nested scalars. Suggested implementation: ```go func (s *Extension) Retain() { if r, ok := s.Value.(Releasable); ok { r.Retain() } } func (s *Extension) Release() { if r, ok := s.Value.(Releasable); ok { r.Release() } } ``` This is purely additive: - Storage scalars that don't implement `Releasable` (e.g. primitive numeric storage) become no-ops, matching today's behavior. - Storage scalars that do (e.g. `*scalar.FixedSizeList` for the interval extension types) get properly released — closing the leak on the existing `*types.IntervalYearToMonthType` / `*types.IntervalDayType` paths and on any future extension types whose storage carries buffers. - The `Retain()` half restores symmetry: today, a caller that takes ownership of an extension scalar has no way to bump the storage refcount. A follow-up test using `memory.NewCheckedAllocator` with `AssertSize(t, 0)` around the interval extension paths in `arrow/compute/exprs/exec_internal_test.go` would lock in the fix. ### Component(s) Other (`arrow/scalar`, `arrow/compute`) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
