[ 
https://issues.apache.org/jira/browse/TIKA-4703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18069064#comment-18069064
 ] 

ASF GitHub Bot commented on TIKA-4703:
--------------------------------------

nddipiazza opened a new pull request, #2715:
URL: https://github.com/apache/tika/pull/2715

   ## Summary
   
   Moves Docker build infrastructure into the main tika repo so that Docker 
image releases are tied directly to Tika releases, eliminating the need for 
cross-repo coordination with `tika-docker` and `tika-grpc-docker`.
   
   - **Snapshot workflow** (main branch push): builds and pushes `apache/tika`, 
`apache/tika-full`, and `apache/tika-grpc` snapshot images to Docker Hub
   - **Release workflow** (version tag push): builds and pushes versioned + 
`latest` tags for all three images
   - **tika-server Dockerfiles**: copied from `tika-docker` repo (source of 
truth), plus new `Dockerfile.snapshot` variants that use the Maven assembly 
output instead of downloading from Apache mirrors
   - **tika-grpc docker-build**: Dockerfile, entrypoint script, and build 
context assembly script
   - **TikaGrpcServer**: now falls back to a bundled empty 
`default-tika-config.json` from classpath when no `-c` flag is provided, 
matching standard Java application conventions
   - **Tested locally**: all three images (minimal, full, grpc) build and start 
successfully
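
The **TikaGrpcServer** fallback described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the class `ConfigFallbackSketch`, the method `resolveConfig`, the classpath location `/default-tika-config.json`, and the choice to copy the resource to a temp file are all assumptions. Only the `-c` flag and the resource file name come from the summary.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; NOT the actual TikaGrpcServer implementation.
final class ConfigFallbackSketch {
    // File name is from the PR summary; the classpath root location is an assumption.
    static final String DEFAULT_RESOURCE = "/default-tika-config.json";

    /** Returns the path passed with -c, or a temp copy of the bundled default config. */
    static Path resolveConfig(String[] args) {
        // An explicit -c <path> always wins over the bundled default.
        for (int i = 0; i + 1 < args.length; i++) {
            if ("-c".equals(args[i])) {
                return Paths.get(args[i + 1]);
            }
        }
        // No -c flag: fall back to the default config on the classpath.
        try (InputStream in = ConfigFallbackSketch.class.getResourceAsStream(DEFAULT_RESOURCE)) {
            Path tmp = Files.createTempFile("default-tika-config", ".json");
            tmp.toFile().deleteOnExit();
            if (in == null) {
                // Assumption for this sketch only: with no bundled resource, write an empty JSON object.
                Files.write(tmp, "{}".getBytes(StandardCharsets.UTF_8));
            } else {
                Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            }
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this shape, `resolveConfig(new String[] {"-c", "my-config.json"})` returns the given path unchanged, while an empty argument array yields a readable temporary file.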
   
   ## Required Setup
   
   `DOCKERHUB_USERNAME` and `DOCKERHUB_TOKEN` secrets must be configured in the 
repo settings for the workflows to push images.
   
   ## Test plan
   
   - [x] tika-server minimal: HTTP 200 on port 9998, user 35002:35002
   - [x] tika-server full: HTTP 200 on port 9998, user 35002:35002, ImageMagick 
verified
   - [x] tika-grpc: gRPC server starts on port 9090, all plugins loaded, no 
config file required
   - [ ] Test Docker push to personal Docker Hub
   - [ ] Verify snapshot workflow triggers on main merge
   - [ ] Verify release workflow triggers on version tag
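
For checking the release workflow, the expected set of image references can be written down ahead of time. The sketch below only restates the summary (versioned + `latest` tags for `apache/tika`, `apache/tika-full`, and `apache/tika-grpc`). The class `ReleaseTagsSketch` and the method `releaseTags` are hypothetical names, and how the workflow derives the version from the pushed git tag is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for reasoning about expected tags; not part of the workflows.
final class ReleaseTagsSketch {
    // The three image names listed in the summary.
    static final List<String> IMAGES =
            List.of("apache/tika", "apache/tika-full", "apache/tika-grpc");

    /** Returns "<image>:<version>" and "<image>:latest" for each of the three images. */
    static List<String> releaseTags(String version) {
        List<String> refs = new ArrayList<>();
        for (String image : IMAGES) {
            refs.add(image + ":" + version);
            refs.add(image + ":latest");
        }
        return refs;
    }
}
```

For a version such as `3.1.0`, this yields six references, two per image, which can be compared against what appears on Docker Hub after the tag push.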
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)




> Integrate Docker image builds into apache/tika and deprecate standalone 
> Docker repos
> ------------------------------------------------------------------------------------
>
>                 Key: TIKA-4703
>                 URL: https://issues.apache.org/jira/browse/TIKA-4703
>             Project: Tika
>          Issue Type: Task
>            Reporter: Nicholas DiPiazza
>            Priority: Major
>
> h2. Summary
> Move Docker image building and publishing into the main 
> [apache/tika|https://github.com/apache/tika] repository, deprecating the 
> standalone Docker repos. This ensures Docker image releases are naturally 
> tied to Tika releases through the existing Maven workflow, rather than 
> requiring cross-repo coordination.
> h2. Current State
> * [tika-docker|https://github.com/apache/tika-docker] - standalone repo that 
> builds the tika-server Docker image, published to [apache/tika on Docker 
> Hub|https://hub.docker.com/r/apache/tika]
> * [tika-grpc-docker|https://github.com/apache/tika-grpc-docker] - standalone 
> repo that builds the tika-grpc Docker image, published to [apache/tika-grpc 
> on Docker Hub|https://hub.docker.com/r/apache/tika-grpc]
> h2. Problem
> Having Docker builds in separate repos means:
> * Docker image releases are decoupled from Tika releases - requires manual 
> coordination
> * No guarantee Docker images match the released Tika version
> * Extra maintenance burden across multiple repos
> * Harder for contributors to understand the full release pipeline
> h2. Proposed Approach
> # Move Dockerfiles and related build config from {{tika-docker}} and 
> {{tika-grpc-docker}} into the main {{apache/tika}} repo
> # Add GitHub Actions workflows to {{apache/tika}} that build and publish 
> Docker images as part of the release process
> # Integrate with the existing Maven workflow so Docker builds happen 
> naturally alongside Java artifact publishing
> # Docker images to publish:
> #* {{apache/tika}} (tika-server) to [Docker 
> Hub|https://hub.docker.com/r/apache/tika]
> #* {{apache/tika-grpc}} (tika-grpc) to [Docker 
> Hub|https://hub.docker.com/r/apache/tika-grpc]
> # Support multi-architecture builds (amd64, arm64) if applicable
> # Proper image tagging tied to Maven release versions (e.g. {{3.1.0}}, 
> {{latest}})
> # Deprecate {{tika-docker}} and {{tika-grpc-docker}} repos with README 
> notices pointing to {{apache/tika}}
> h2. Acceptance Criteria
> * Dockerfiles and build config live in the {{apache/tika}} repo
> * GitHub Actions in {{apache/tika}} build and publish both Docker images on 
> release
> * Docker image versions are automatically tied to Tika release versions
> * {{tika-docker}} and {{tika-grpc-docker}} repos are marked as deprecated



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
