Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/23485 )
Change subject: KUDU-1261 Add array type support to Python client ...................................................................... Patch Set 1: Code-Review+1 (9 comments) Thanks a lot for implementing this! I took a quick look: overall it looks good, just a few questions and nits. http://gerrit.cloudera.org:8080/#/c/23485/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/23485/1//COMMIT_MSG@27 PS1, Line 27: - Decimal arrays: blocked on C++ API int128 overload support : (existing decimal uses int128, but SetArrayDecimal only has : int32/int64 overloads - needs API design review) Is it a practical restriction (i.e. something Cython-related) or that's something else? http://gerrit.cloudera.org:8080/#/c/23485/1//COMMIT_MSG@30 PS1, Line 30: - Fix: Empty BOOL array read triggers C++ assertion failure : (requires C++ client fix) The current C++'s client's test coverage is spotty, indeed. I'll work on adding more tests in coming days. Interesting, if that were something that's addressed by https://gerrit.cloudera.org/#/c/23486/ , that would affect all the types. Anyway, I'll take a closer look, adding more scenarios in the set of test cases in https://gerrit.cloudera.org/#/c/23220 I'm planning to keep you posted on my findings and on the fix once it's available. http://gerrit.cloudera.org:8080/#/c/23485/1/python/kudu/client.pyx File python/kudu/client.pyx: http://gerrit.cloudera.org:8080/#/c/23485/1/python/kudu/client.pyx@3234 PS1, Line 3234: value Is this supposed to be Python list in case of arrays? Maybe a topic for a separate patch, but does it make sense to provide an interface where Python array (not list) is used as a parameter to write into array columns? That might be quite beneficial from performance reasons. http://gerrit.cloudera.org:8080/#/c/23485/1/python/kudu/client.pyx@3344 PS1, Line 3344: for elem in value: : cpp_validity.push_back(elem is not None) For performance reasons, is it possible to reserve the necessary space in the cpp_validity and the typed arrays before calling 'push_back' for all the elements? http://gerrit.cloudera.org:8080/#/c/23485/1/python/kudu/client.pyx@3472 PS1, Line 3472: nit: remove stray spaces http://gerrit.cloudera.org:8080/#/c/23485/1/python/kudu/libkudu_client.pxd File python/kudu/libkudu_client.pxd: http://gerrit.cloudera.org:8080/#/c/23485/1/python/kudu/libkudu_client.pxd@473 PS1, Line 473: nit: remove the trailing space http://gerrit.cloudera.org:8080/#/c/23485/1/python/kudu/tests/test_array_datatype.py File python/kudu/tests/test_array_datatype.py: http://gerrit.cloudera.org:8080/#/c/23485/1/python/kudu/tests/test_array_datatype.py@47 PS1, Line 47: API inconsistency Does it mean it's not possible to supply int32_t and int64_t from Python to reach the C++ client API with overloaded methods for decimal arrays (e.g., due to Cython-related issues or alike) or that's something else? http://gerrit.cloudera.org:8080/#/c/23485/1/python/kudu/tests/test_schema.py File python/kudu/tests/test_schema.py: http://gerrit.cloudera.org:8080/#/c/23485/1/python/kudu/tests/test_schema.py@628 PS1, Line 628: self.assertEqual(col.type.name, 'nested') nit: in C++ tests there also verification for the type of array elements (at least, in string representation, e.g. TestSchema.SimpleSchemaWithScalarArrays from schema-test.cc); does it make sense to do here as well? http://gerrit.cloudera.org:8080/#/c/23485/1/python/kudu/tests/test_schema.py@683 PS1, Line 683: empty_test_data = [ : ('arr_int8', []), : ('arr_int16', []), : ('arr_int32', []), : ('arr_int64', []), : ('arr_float', []), : ('arr_double', []), : ('arr_bool', []), : ('arr_string', []), : ('arr_binary', []), : ('arr_unixtime_micros', []), : ('arr_date', []), : ('arr_varchar', []), : ] nit: does it make sense to add a similar bunch, but with nulls instead of empty arrays? -- To view, visit http://gerrit.cloudera.org:8080/23485 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2329c7466534bd4961860c05b600e7d4b4a11507 Gerrit-Change-Number: 23485 Gerrit-PatchSet: 1 Gerrit-Owner: Marton Greber <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Gabriella Lotz <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Zoltan Martonka <[email protected]> Gerrit-Comment-Date: Fri, 03 Oct 2025 00:17:16 +0000 Gerrit-HasComments: Yes
