dwilson1988 commented on code in PR #177:
URL: https://github.com/apache/iceberg-go/pull/177#discussion_r1813445471


##########
manifest.go:
##########
@@ -876,7 +1030,140 @@ func (m *manifestEntryV2) FileSequenceNum() *int64 {
        return m.FileSeqNum
 }
 
-func (m *manifestEntryV2) DataFile() DataFile { return &m.Data }
+func (m *manifestEntryV2) DataFile() DataFile { return m.Data }
+
+// DataFileBuilder is a helper for building a data file struct which will
+// conform to the DataFile interface.
+type DataFileBuilder struct {
+       d *dataFile
+}
+
+// NewDataFileBuilder is passed all of the required fields and then allows
+// all of the optional fields to be set by calling the corresponding methods
+// before calling [DataFileBuilder.Build] to construct the object.
+func NewDataFileBuilder(
+       content ManifestEntryContent,
+       path string,
+       format FileFormat,
+       partitionData map[string]any,
+       recordCount int64,
+       fileSize int64,
+) *DataFileBuilder {
+       return &DataFileBuilder{
+               d: &dataFile{
+                       Content:       content,
+                       Path:          path,
+                       Format:        format,
+                       PartitionData: partitionData,
+                       RecordCount:   recordCount,
+                       FileSize:      fileSize,
+               },
+       }
+}
+
+// BlockSizeInBytes sets the block size in bytes for the data file. Deprecated in v2.
+func (b *DataFileBuilder) BlockSizeInBytes(size int64) *DataFileBuilder {
+       b.d.BlockSizeInBytes = size
+       return b
+}
+
+// ColumnSizes sets the column sizes for the data file.
+func (b *DataFileBuilder) ColumnSizes(sizes map[int]int64) *DataFileBuilder {
+       colSizes := make([]colMap[int, int64], 0, len(sizes))
+       for k, v := range sizes {
+               colSizes = append(colSizes, colMap[int, int64]{Key: k, Value: v})
+       }
+       b.d.ColSizes = &colSizes
+       return b

Review Comment:
   At this point, I don't think we have access to the schema. Are you
suggesting we should either embed the schema in the `dataFile` or accept the
schema as an argument? Without one of those, I'm not sure how we could validate.
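
   To make the second option concrete, here is a rough sketch. It assumes the validation in question is checking that every key in `sizes` is a field ID the schema knows about. `fieldIDSet` and `validateColumnIDs` are hypothetical names used only for illustration; neither exists in iceberg-go, and a real version would derive the ID set from the schema rather than take it directly.

```go
package main

import (
	"fmt"
	"sort"
)

// fieldIDSet is a hypothetical stand-in for the set of field IDs that a
// schema defines. It is not an iceberg-go type.
type fieldIDSet map[int]struct{}

// validateColumnIDs is a hypothetical helper. It returns an error naming
// every key of sizes that is not present in known, or nil if all are known.
func validateColumnIDs(sizes map[int]int64, known fieldIDSet) error {
	var unknown []int
	for id := range sizes {
		if _, ok := known[id]; !ok {
			unknown = append(unknown, id)
		}
	}
	if len(unknown) == 0 {
		return nil
	}
	// Map iteration order is not deterministic, so sort for a stable message.
	sort.Ints(unknown)
	return fmt.Errorf("column sizes reference unknown field IDs: %v", unknown)
}

func main() {
	known := fieldIDSet{1: {}, 2: {}}

	// Every key is a known field ID, so no error is returned.
	fmt.Println(validateColumnIDs(map[int]int64{1: 128, 2: 64}, known))

	// Field ID 7 is not in the set, so an error is returned.
	fmt.Println(validateColumnIDs(map[int]int64{1: 128, 7: 64}, known))
}
```

   Whichever option we pick, the builder setters would need some way to surface the error, for example by collecting it and returning it from `Build`, since they currently return only `*DataFileBuilder`.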



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

