connortsui20 commented on issue #1038:
URL: https://github.com/apache/iceberg-rust/issues/1038#issuecomment-2721832186

   > - The implementer does not need to do extra work.
   > - The caller can know what error has occurred.
   > - The caller can decide how to handle this error.
   > - The caller can understand why this error occurred.
   
   If these are the main goals, then I don't think the current system is ideal. 
When you say the caller, do you mean the developer or the code? The current 
design implies that the caller is really the _developer_. Right now the only 
way to check what actually went wrong is by debug printing everything out 
(reporting). The _code_ cannot exhaustively match (recover) against the error 
that happened because 1) there aren't enough variants to choose from and 2) 
even if there were enough variants in `ErrorKind`, the caller would have to 
introduce logic to manually parse out that context and see what went wrong, 
which is not ideal.
   
   Right now, there are only _reportable_ errors. What I am proposing is that 
we introduce _recoverable_ errors.
   
   In the small example above, yes it may seem like it is not super useful to 
have another subenum inside the `ErrorKind`s because there isn't very many 
error kinds here _right now_. If you look at the [API 
spec](https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml)
 and do a search for "examples:", it will show that there are actually 
significantly more (I've counted at least 10 more) error cases that users need 
to handle.
   
   So I think the argument here is this: Do we have implementors manually craft 
contexts for each of these unique error cases, and require users to extract 
that context out (either by logging the error or having to manually match 
against static strings in the context (which doesn't actually seem possible 
right now because the context is just used for debugging)), or do we make an 
effort to discriminate _all_ of the different kinds of errors now so that users 
can match against any of them so that the caller can decide if they want to 
recover?
   
   In my opinion, the idealized way to handle errors here would be something 
like this (obviously room for improvement):
   
   ```rust
   #[derive(Debug)]
   pub struct IcebergError {
       // Name can definitely be changed.
       err: IcebergErrorInner,
       // Could have everything else in here that the current version does.
       context: ErrorContext,
   }
   
   // I'm guessing that there is other things we would want to put in here.
   #[derive(Debug)]
   pub enum IcebergErrorInner {
       Unexpected,
       DataInvalid,
       FeatureUnsupported,
       CatalogError(CatalogError),
   }
   
   // Only a small subset of possible error cases.
   #[derive(Debug)]
   pub enum CatalogError {
       TableNotExists,
       TableAlreadyExists { existing_table: String },
       NamespaceNotExists,
       TableCommitFailed(TableCommitFailed),
   }
   
   let table = match ctl.rename_table("before", "after").await {
       Ok(table) => table,
       Err(IcebergError {
           err: IcebergErrorInner::Catalog(catalog_error),
           context,
       }) => {
           match catalog_error {
               CatalogError::TableNotExists => (),
               CatalogError::NamespaceNotExists => (),
               CatalogError::TableAlreadyExists { existing_table } => (),
           };
           eprintln!("{context}");
       }
       // This has the benefit of being able to always reject non-catalog 
errors.
       Err(iceberg_error) => panic!("Something went wrong: {:?}", 
iceberg_error),
   };
   
   // Note that update doesn't work yet for all implementations, but when it 
   // does get implemented then this will certainly be something that callers
   // want to handle properly / potentially recover from.
   let table = match ctl.update_table(commit).await {
       Ok(table) => table,
       Err(IcebergError {
           err: IcebergErrorInner::Catalog(catalog_error),
           context,
       }) => {
           match catalog_error {
               CatalogError::TableCommitFailed(commit_failed) => {
                   todo!("There is a whole class of things that a caller might 
want to do here")
               }
               CatalogError::TableCommitStateUnknown(commit_state_unknown) => 
todo!("Same here"),
               CatalogError::TableNotExists => todo!(),
           };
           eprintln!("{context}");
       }
       Err(iceberg_error) => panic!("Something went wrong: {:?}", 
iceberg_error),
   };
   ```
   
   I think there are 2 key things here. The first is that code can exhaustively 
match error cases without having to parse a context with dynamic fields. And 
second is that as implementors, we can now simply raise the correct error case 
whenever it is needed. If we need to write a macro that helps construct the 
backtrace and where it came from, then so be it. But like I said above, we 
could just use `snafu` to help with that.
   
   _Note that the first example I gave above with that subenum was an attempt 
to get something in between the `ErrorKind` design and a more robust design, 
and you are right in that it doesn't make a lot of sense to do both things 
badly._


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to