GabrielM98 opened a new issue, #309:
URL: https://github.com/apache/iceberg-go/issues/309

   ### Apache Iceberg version
   
   main (development)
   
   ### Please describe the bug 🐞
   
   **Version:** v0.1.0
   
   Does the library support scanning tables with fields of type `list`? 
   
   I'm seeing some strange behaviour whilst attempting to scan a table (with 
all fields selected and no row filters applied) with the following schema:
   ```
   {
       "type" : "struct",
       "schema-id" : 0,
       "fields" : [ {
         "id" : 1,
         "name" : "uuid",
         "required" : false,
         "type" : "string"
       }, {
         "id" : 2,
         "name" : "source",
         "required" : false,
         "type" : {
           "type" : "struct",
           "fields" : [ {
             "id" : 5,
             "name" : "type",
             "required" : false,
             "type" : "string"
           }, {
             "id" : 6,
             "name" : "serviceId",
             "required" : false,
             "type" : "string"
           } ]
         }
       }, {
         "id" : 3,
         "name" : "subjects",
         "required" : false,
         "type" : {
           "type" : "list",
           "element-id" : 7,
           "element" : {
             "type" : "struct",
             "fields" : [ {
               "id" : 8,
               "name" : "type",
               "required" : false,
               "type" : "string"
             }, {
               "id" : 9,
               "name" : "id",
               "required" : false,
               "type" : "string"
             } ]
           },
           "element-required" : false
         }
       }, {
         "id" : 4,
         "name" : "timing",
         "required" : false,
         "type" : {
           "type" : "struct",
           "fields" : [ {
             "id" : 10,
             "name" : "createdAt",
             "required" : false,
             "type" : "timestamptz"
           }, {
             "id" : 11,
             "name" : "emittedAt",
             "required" : false,
             "type" : "timestamptz"
           } ]
         }
       } ]
   }
   ```
   
   When I call `(*table.Scan).ToArrowRecords` and attempt to loop over the 
resulting iterator, the loop yields nothing. 
   
   Hooking up a debugger to my code, I can see there's an error being returned 
by `(*table.Scan).recordsFromTask` 
([here](https://github.com/apache/iceberg-go/blob/0921b84b53e3184a1867481bf1e1a22f5a059b5c/table/arrow_scanner.go#L555))
 which results in the context being cancelled. Hence, the iterator returns 
without yielding anything. However, on some occasions it does yield the error, 
which seems to indicate a race between the cancellation of the 
`context.Context` (the closing of its done channel) and the send on the `out` channel in 
`(*table.Scan).recordsFromTask` 
([here](https://github.com/apache/iceberg-go/blob/0921b84b53e3184a1867481bf1e1a22f5a059b5c/table/arrow_scanner.go#L377)).
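
The suspected race can be reproduced in isolation. The following is a minimal sketch, not the iceberg-go code: `result`, `consume`, `runOnce` and `countOutcomes` are hypothetical names, and it assumes the consumer waits on both the context and the output channel in a single `select`. When the error has already been sent and the context has already been cancelled, both cases are ready, and Go picks one pseudo-randomly, so the error surfaces only some of the time.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// result is a hypothetical stand-in for whatever the scanner sends on its
// output channel; here it only carries an error.
type result struct {
	err error
}

// consume mimics a consumer that stops as soon as the context is done but
// also reads from out. It reports whether the error was observed.
func consume(ctx context.Context, out <-chan result) bool {
	select {
	case <-ctx.Done():
		return false // cancellation won: the caller sees nothing at all
	case r := <-out:
		return r.err != nil // the send won: the caller sees the error
	}
}

// runOnce queues an error on out and cancels the context before the
// consumer runs, so both select cases are ready at the same time.
func runOnce() bool {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	out := make(chan result, 1)
	out <- result{err: errors.New("missing field_id")}
	cancel()

	return consume(ctx, out)
}

// countOutcomes repeats the experiment n times and tallies both outcomes.
func countOutcomes(n int) (sawErr, silent int) {
	for i := 0; i < n; i++ {
		if runOnce() {
			sawErr++
		} else {
			silent++
		}
	}
	return sawErr, silent
}

func main() {
	sawErr, silent := countOutcomes(1000)
	fmt.Printf("error surfaced: %d, silently cancelled: %d\n", sawErr, silent)
}
```

If this is what is happening, one possible fix is for the consumer to check the cancellation cause (for example via `context.WithCancelCause` and `context.Cause`) so the error is reported regardless of which case wins. That is a suggestion, not a description of the current implementation.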
   
   Race condition aside, the error being returned is the following...
   ```
   error encountered during arrow schema visitor: invalid schema: cannot convert list: type=struct<type: utf8, id: utf8>, nullable to Iceberg field, missing field_id
   ```
   
   I've been doing a bit of digging and noticed an intriguing bit of behaviour 
with regard to the projected field IDs. It appears that if a field is of type 
`map` or `list`, it doesn't get added to the set of projected field IDs 
(see the `switch` statement 
[here](https://github.com/apache/iceberg-go/blob/0921b84b53e3184a1867481bf1e1a22f5a059b5c/table/arrow_scanner.go#L222)).
 Is this a piece of functionality that is yet to be implemented, or is this 
intended behaviour? Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

