joepurdy commented on code in PR #2715:
URL: https://github.com/apache/tika/pull/2715#discussion_r3011359092

##########
tika-grpc/docker-build/Dockerfile:
##########
@@ -0,0 +1,53 @@
+# Licensed under the Apache License, Version 2.0 (the "License"); you may not
+# use this file except in compliance with the License. You may obtain a copy of
+# the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations under
+# the License.
+
+FROM ubuntu:plucky

Review Comment:
   The `tika-server` Dockerfiles run as `USER 35002:35002` (matching the upstream `tika-docker` convention), but this Dockerfile has no `USER` directive, so the gRPC server runs as root. `docker-tool.sh` even asserts 35002:35002 in its test function.

   It should just need `ARG UID_GID="35002:35002"` added, as in the `tika-server` Dockerfile, and that ARG referenced in a `USER` directive.

   ```suggestion
   # "random" uid/gid hopefully not used anywhere else
   # This needs to be set globally and then referenced in
   # the subsequent stages -- see TIKA-3912
   ARG UID_GID="35002:35002"

   FROM ubuntu:plucky
   ```

##########
tika-grpc/docker-build/Dockerfile:
##########
@@ -0,0 +1,53 @@
+# Licensed under the Apache License, Version 2.0 (the "License"); you may not
+# use this file except in compliance with the License. You may obtain a copy of
+# the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations under
+# the License.
+
+FROM ubuntu:plucky
+COPY libs/ /tika/libs/
+COPY plugins/ /tika/plugins/
+COPY config/ /tika/config/
+COPY bin/ /tika/bin
+ARG JRE='openjdk-17-jre-headless'
+ARG VERSION
+ARG TIKA_GRPC_MAX_INBOUND_MESSAGE_SIZE=104857600
+ARG TIKA_GRPC_MAX_OUTBOUND_MESSAGE_SIZE=104857600
+ARG TIKA_GRPC_NUM_THREADS=4
+RUN set -eux \
+    && apt-get update \
+    && apt-get install --yes --no-install-recommends gnupg2 software-properties-common \
+    && DEBIAN_FRONTEND=noninteractive apt-get install --yes --no-install-recommends $JRE \
+        gdal-bin \
+        tesseract-ocr \
+        tesseract-ocr-eng \
+        tesseract-ocr-ita \
+        tesseract-ocr-fra \
+        tesseract-ocr-spa \
+        tesseract-ocr-deu \
+    && echo ttf-mscorefonts-installer msttcorefonts/accepted-mscorefonts-eula select true | debconf-set-selections \
+    && DEBIAN_FRONTEND=noninteractive apt-get install --yes --no-install-recommends \
+        xfonts-utils \
+        fonts-freefont-ttf \
+        fonts-liberation \
+        ttf-mscorefonts-installer \
+        wget \
+        cabextract \
+    && apt-get clean -y \
+    && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
+
+EXPOSE 9090

Review Comment:
   Use the `ARG` suggested previously to run as a non-root user.

   ```suggestion
   # Re-declare the global ARG so its default is visible inside this stage
   ARG UID_GID
   USER $UID_GID
   EXPOSE 9090
   ```

##########
tika-grpc/docker-build/Dockerfile:
##########
@@ -0,0 +1,53 @@
+# Licensed under the Apache License, Version 2.0 (the "License"); you may not
+# use this file except in compliance with the License. You may obtain a copy of
+# the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations under
+# the License.
+
+FROM ubuntu:plucky
+COPY libs/ /tika/libs/
+COPY plugins/ /tika/plugins/
+COPY config/ /tika/config/
+COPY bin/ /tika/bin
+ARG JRE='openjdk-17-jre-headless'

Review Comment:
   The `tika-server` images default to `openjdk-21-jre-headless`. Is there a reason to pin gRPC to 17? If it is intentional, a comment explaining why would be worthwhile; otherwise someone will "fix" it later and potentially break something.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
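
For reference, a combined sketch of the two `UID_GID` suggestions above. One detail worth calling out: an `ARG` declared before the first `FROM` sits outside any build stage, so the stage has to re-declare it (a bare `ARG UID_GID`, no value) before `USER` can see the default. The elided middle stands for the existing `COPY`/`ARG`/`RUN` instructions, unchanged. Placing `USER` after the `RUN` block is an assumption here; it keeps the `apt-get` steps running as root.

```dockerfile
# "random" uid/gid hopefully not used anywhere else.
# Declared globally, before the first FROM -- see TIKA-3912.
ARG UID_GID="35002:35002"

FROM ubuntu:plucky

# Re-declare without a value so the global default is in scope for this stage.
ARG UID_GID

# ... existing COPY / ARG / RUN instructions, unchanged ...

# Drop root only after the package installation has finished.
USER $UID_GID
EXPOSE 9090
```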
