connortsui20 commented on issue #1038: URL: https://github.com/apache/iceberg-rust/issues/1038#issuecomment-2721832186
> - The implementer does not need to do extra work. > - The caller can know what error has occurred. > - The caller can decide how to handle this error. > - The caller can understand why this error occurred. If these are the main goals, then I don't think the current system is ideal. When you say the caller, do you mean the developer or the code? The current design implies that the caller is really the _developer_. Right now the only way to check what actually went wrong is by debug printing everything out (reporting). The _code_ cannot exhaustively match (recover) against the error that happened because 1) there aren't enough variants to choose from and 2) even if there were enough variants in `ErrorKind`, the caller would have to introduce logic to manually parse out that context and see what went wrong, which is not ideal. Right now, there are only _reportable_ errors. What I am proposing is that we introduce _recoverable_ errors. In the small example above, yes it may seem like it is not super useful to have another subenum inside the `ErrorKind`s because there isn't very many error kinds here _right now_. If you look at the [API spec](https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml) and do a search for "examples:", it will show that there are actually significantly more (I've counted at least 10 more) error cases that users need to handle. So I think the argument here is this: Do we have implementors manually craft contexts for each of these unique error cases, and require users to extract that context out (either by logging the error or having to manually match against static strings in the context (which doesn't actually seem possible right now because the context is just used for debugging)), or do we make an effort to discriminate _all_ of the different kinds of errors now so that users can match against any of them so that the caller can decide if they want to recover? In my opinion, the idealized way to handle errors here would be something like this (obviously room for improvement): ```rust #[derive(Debug)] pub struct IcebergError { // Name can definitely be changed. err: IcebergErrorInner, // Could have everything else in here that the current version does. context: ErrorContext, } // I'm guessing that there is other things we would want to put in here. #[derive(Debug)] pub enum IcebergErrorInner { Unexpected, DataInvalid, FeatureUnsupported, CatalogError(CatalogError), } // Only a small subset of possible error cases. #[derive(Debug)] pub enum CatalogError { TableNotExists, TableAlreadyExists { existing_table: String }, NamespaceNotExists, TableCommitFailed(TableCommitFailed), } let table = match ctl.rename_table("before", "after").await { Ok(table) => table, Err(IcebergError { err: IcebergErrorInner::Catalog(catalog_error), context, }) => { match catalog_error { CatalogError::TableNotExists => (), CatalogError::NamespaceNotExists => (), CatalogError::TableAlreadyExists { existing_table } => (), }; eprintln!("{context}"); } // This has the benefit of being able to always reject non-catalog errors. Err(iceberg_error) => panic!("Something went wrong: {:?}", iceberg_error), }; // Note that update doesn't work yet for all implementations, but when it // does get implemented then this will certainly be something that callers // want to handle properly / potentially recover from. let table = match ctl.update_table(commit).await { Ok(table) => table, Err(IcebergError { err: IcebergErrorInner::Catalog(catalog_error), context, }) => { match catalog_error { CatalogError::TableCommitFailed(commit_failed) => { todo!("There is a whole class of things that a caller might want to do here") } CatalogError::TableCommitStateUnknown(commit_state_unknown) => todo!("Same here"), CatalogError::TableNotExists => todo!(), }; eprintln!("{context}"); } Err(iceberg_error) => panic!("Something went wrong: {:?}", iceberg_error), }; ``` I think there are 2 key things here. The first is that code can exhaustively match error cases without having to parse a context with dynamic fields. And second is that as implementors, we can now simply raise the correct error case whenever it is needed. If we need to write a macro that helps construct the backtrace and where it came from, then so be it. But like I said above, we could just use `snafu` to help with that. _Note that the first example I gave above with that subenum was an attempt to get something in between the `ErrorKind` design and a more robust design, and you are right in that it doesn't make a lot of sense to do both things badly._ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org