bobbai00 commented on code in PR #3715:
URL: https://github.com/apache/texera/pull/3715#discussion_r2395286324
##########
core/file-service/src/main/scala/edu/uci/ics/texera/service/resource/DatasetResource.scala:
##########
@@ -239,43 +239,70 @@ class DatasetResource {
withTransaction(context) { ctx =>
val uid = user.getUid
- val datasetDao: DatasetDao = new DatasetDao(ctx.configuration())
val datasetUserAccessDao: DatasetUserAccessDao = new
DatasetUserAccessDao(ctx.configuration())
val datasetName = request.datasetName
val datasetDescription = request.datasetDescription
val isDatasetPublic = request.isDatasetPublic
val isDatasetDownloadable = request.isDatasetDownloadable
- // Check if a dataset with the same name already exists
- if (!datasetDao.fetchByName(datasetName).isEmpty) {
- throw new BadRequestException("Dataset with the same name already
exists")
- }
-
- // Initialize the repository in LakeFS
+ // validate dataset name
try {
- LakeFSStorageClient.initRepo(datasetName)
+ LakeFSStorageClient.validateRepositoryName(datasetName)
Review Comment:
Semantically,
- datasetName is user given
- repositoryName is allocated by internal services.
Therefore, although LakeFS has a good validation rule, it is misleading to
call `LakeFSStorageClient.validateRepositoryName()` on the `datasetName`.
You can do the following here:
- create a static function in `DatasetResource` named `validateDatasetName`.
This function does the exact same thing as `validateRepositoryName`.
- call validateDatasetName here.
This duplication is meaningful as it makes the semantic consistent. Also it
provides the flexibility for us to modify the validation rule later, without
affecting the lakefs repository name validation
##########
core/file-service/src/main/scala/edu/uci/ics/texera/service/resource/DatasetResource.scala:
##########
@@ -239,43 +239,70 @@ class DatasetResource {
withTransaction(context) { ctx =>
val uid = user.getUid
- val datasetDao: DatasetDao = new DatasetDao(ctx.configuration())
val datasetUserAccessDao: DatasetUserAccessDao = new
DatasetUserAccessDao(ctx.configuration())
val datasetName = request.datasetName
val datasetDescription = request.datasetDescription
val isDatasetPublic = request.isDatasetPublic
val isDatasetDownloadable = request.isDatasetDownloadable
- // Check if a dataset with the same name already exists
- if (!datasetDao.fetchByName(datasetName).isEmpty) {
- throw new BadRequestException("Dataset with the same name already
exists")
- }
-
- // Initialize the repository in LakeFS
+ // validate dataset name
try {
- LakeFSStorageClient.initRepo(datasetName)
+ LakeFSStorageClient.validateRepositoryName(datasetName)
Review Comment:
Semantically,
- datasetName is user given
- repositoryName is allocated by internal services.
Therefore, although LakeFS has a good validation rule, it is misleading to
call `LakeFSStorageClient.validateRepositoryName()` on the `datasetName`.
You can do the following here:
- create a static function in `DatasetResource` named `validateDatasetName`.
This function does the exact same thing as `validateRepositoryName`.
- call validateDatasetName here.
This duplication is meaningful as it makes the semantic consistent. Also it
provides the flexibility for us to modify the validation rule later, without
affecting the lakefs repository name validation
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]