vbekiaris opened a new issue, #437: URL: https://github.com/apache/iceberg-go/issues/437
### Apache Iceberg version

main (development)

### Please describe the bug 🐞

`Scan.PlanFiles` takes a `context` argument. This creates the expectation that the context is actually used internally when downloading objects from S3. That is not the case: the context is not propagated through the `IO` abstraction, and the implementation instead uses a stored context (captured earlier, in `Catalog.LoadTable`).

This breaks the case where `Table`s are cached across requests (to avoid hitting the catalog and downloading and parsing table metadata on each request): passing a context with a timeout to `LoadTable` results in "context canceled" errors on every later request, once that context has been canceled or has expired.

The reproducer below calls `LoadTable` from a separate function, which cancels its timeout context on return for resource cleanup (as per recommended practice). `PlanFiles` then fails with "context canceled".

```
func TestLoadTableWithTimeout(t *testing.T) {
	ctx := context.Background()
	cat, err := GetCatalog(ctx)
	require.NoError(t, err)

	tbl, err := loadTableWithTimeout(ctx, cat, "db.test_table")
	require.NoError(t, err)

	// the following fails, because PlanFiles does not really use the passed context
	// error: Received unexpected error:
	// could not open manifest file: operation error S3: GetObject, context canceled
	_, err = tbl.Scan().PlanFiles(ctx)
	require.NoError(t, err)
}

func loadTableWithTimeout(ctx context.Context, cat catalog.Catalog, tblName string) (*table.Table, error) {
	ctxWithTimeout, cancelFn := context.WithTimeout(ctx, 1*time.Minute)
	// cancelFn runs when this function returns, so the context stored
	// during LoadTable is already canceled by the time PlanFiles is called
	defer cancelFn()

	return cat.LoadTable(ctxWithTimeout, catalog.ToIdentifier(tblName), nil)
}
```

We use the Glue catalog, but any implementation that uses `io.LoadFS` ultimately stores the context in the `IO` implementation.