This is an automated email from the ASF dual-hosted git repository.
ndipiazza pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tika.git
The following commit(s) were added to refs/heads/main by this push:
new 89b2bf9cf TIKA-4587: Add pf4j development mode support to
TikaPluginManager (#2473)
89b2bf9cf is described below
commit 89b2bf9cfe70969ed5d67d03da5006633233ce5f
Author: Nicholas DiPiazza <[email protected]>
AuthorDate: Sun Dec 21 19:19:45 2025 -0600
TIKA-4587: Add pf4j development mode support to TikaPluginManager (#2473)
* TIKA-4587: Add pf4j development mode support to TikaPluginManager
Enables plugin development without packaging as ZIP files by supporting
pf4j's DEVELOPMENT runtime mode. This allows developers to point
plugin-roots directly at unpackaged plugin directories (e.g.,
target/classes) for faster iteration during development.
Features:
- Configure via system property: -Dtika.plugin.dev.mode=true
- Configure via environment variable: TIKA_PLUGIN_DEV_MODE=true
- Skips ZIP extraction when in development mode
- Logs mode on startup for visibility
- Defaults to DEPLOYMENT mode for backward compatibility
This aligns with pf4j best practices documented at:
https://pf4j.org/doc/development-mode.html
JIRA: https://issues.apache.org/jira/browse/TIKA-4587
* TIKA-4587: Add development mode documentation to tika-grpc README
Adds comprehensive guide on using pf4j development mode for plugin
development:
- How to enable development mode (system property or env var)
- Configuration examples with plugin-roots pointing to target/classes
- Complete development workflow
- Multiple plugins example
- Troubleshooting tips
- Switching between dev and production modes
* TIKA-4587: Add IntelliJ IDEA setup guide for loading all plugins in dev
mode
Adds comprehensive IntelliJ configuration example:
- Complete dev-config.json with ALL plugin class directories
- Step-by-step IntelliJ Run Configuration setup
- VM options and environment variable configuration
- Hot reload workflow for fast iteration
- Shell script to auto-generate config with all plugins
* TIKA-4587: Clarify that plugin-roots is in Tika config JSON
Added explicit note that plugin-roots is configured in the Tika
configuration JSON file (e.g., tika-config.json, dev-config.json)
---
tika-grpc/README.md | 329 +++++++++++++++++++++
.../org/apache/tika/plugins/TikaPluginManager.java | 33 ++-
.../apache/tika/plugins/TikaPluginManagerTest.java | 57 ++++
3 files changed, 417 insertions(+), 2 deletions(-)
diff --git a/tika-grpc/README.md b/tika-grpc/README.md
index f2269b22b..aa4898cc0 100644
--- a/tika-grpc/README.md
+++ b/tika-grpc/README.md
@@ -11,3 +11,332 @@ This server will manage a pool of Tika Pipes clients.
* Delete
* Fetch + Parse a given Fetch Item
+## Plugin Development Mode
+
+When developing plugins, you can use pf4j's development mode to load plugins directly from their `target/classes` directories without needing to package them as ZIP files. This significantly speeds up the development cycle.
+
+**Configuration Location:** The `plugin-roots` setting is specified in your **Tika configuration JSON file** (commonly named `tika-config.json`, `dev-config.json`, or similar).
+
+### Enabling Development Mode
+
+Set one of the following (if both are set, the system property takes precedence over the environment variable):
+
+**System Property:**
+```bash
+-Dtika.plugin.dev.mode=true
+```
+
+**Environment Variable:**
+```bash
+export TIKA_PLUGIN_DEV_MODE=true
+```
+
+### Configuration Example
+
+In your Tika configuration JSON, point `plugin-roots` to the unpackaged plugin directories:
+
+```json
+{
+ "plugin-roots": [
+ "/path/to/tika/tika-pipes/tika-pipes-plugins/tika-pipes-s3/target/classes",
+ "/path/to/tika/tika-pipes/tika-pipes-plugins/tika-pipes-file-system/target/classes",
+ "/path/to/tika/tika-pipes/tika-pipes-plugins/tika-pipes-kafka/target/classes"
+ ],
+ "fetchers": [
+ {
+ "s3": {
+ "myS3Fetcher": {
+ "region": "us-east-1",
+ "bucket": "my-bucket"
+ }
+ }
+ }
+ ]
+}
+```
+
+### Development Workflow
+
+1. **Build the plugin modules** (only needed once or when dependencies change):
+ ```bash
+ cd tika-pipes/tika-pipes-plugins
+ mvn clean compile
+ ```
+
+2. **Enable development mode** via system property or environment variable:
+ ```bash
+ export TIKA_PLUGIN_DEV_MODE=true
+ ```
+
+3. **Configure plugin-roots** to point to `target/classes` directories in your config JSON
+
+4. **Run your application** - plugins will be loaded directly from the class directories:
+ ```bash
+ java -jar tika-grpc-server.jar --config my-config.json
+ ```
+
+5. **Make code changes** to your plugin
+
+6. **Recompile just the changed plugin** (much faster than full rebuild):
+ ```bash
+ cd tika-pipes/tika-pipes-plugins/tika-pipes-s3
+ mvn compile
+ ```
+
+7. **Restart your application** - the recompiled classes are loaded on startup
+
+### What Happens in Development Mode
+
+- **ZIP extraction is skipped** - TikaPluginManager doesn't try to unzip plugins
+- **Plugins loaded from directories** - pf4j loads classes directly from `target/classes`
+- **Each plugin directory must contain** `plugin.properties` in the root (already present after `mvn compile`)
+
+### Example Directory Structure
+
+When pointing to `target/classes`, pf4j expects this structure:
+
+```
+tika-pipes-s3/target/classes/
+├── plugin.properties # Required: plugin metadata
+├── META-INF/
+│ └── extensions.idx # Generated by pf4j annotation processor
+└── org/
+ └── apache/
+ └── tika/
+ └── pipes/
+ └── fetcher/
+ └── s3/
+ ├── S3Fetcher.class
+ └── S3FetcherFactory.class
+```
+
+### Multiple Plugins During Development
+
+You can load multiple plugins simultaneously by listing all their `target/classes` directories:
+
+```json
+{
+ "plugin-roots": [
+ "/home/user/tika/tika-pipes/tika-pipes-plugins/tika-pipes-s3/target/classes",
+ "/home/user/tika/tika-pipes/tika-pipes-plugins/tika-pipes-kafka/target/classes",
+ "/home/user/tika/tika-pipes/tika-pipes-plugins/tika-pipes-opensearch/target/classes"
+ ]
+}
+```
+
+### IntelliJ IDEA Setup - Loading All Plugins
+
+For IntelliJ IDEA development, you can load ALL available plugins at once. Here's a complete configuration example:
+
+#### 1. Create a development configuration JSON file
+
+Create `dev-config.json` in your project root with all plugin class directories:
+
+```json
+{
+ "plugin-roots": [
+ "${project.basedir}/tika-pipes/tika-pipes-plugins/tika-pipes-az-blob/target/classes",
+ "${project.basedir}/tika-pipes/tika-pipes-plugins/tika-pipes-csv/target/classes",
+ "${project.basedir}/tika-pipes/tika-pipes-plugins/tika-pipes-file-system/target/classes",
+ "${project.basedir}/tika-pipes/tika-pipes-plugins/tika-pipes-gcs/target/classes",
+ "${project.basedir}/tika-pipes/tika-pipes-plugins/tika-pipes-http/target/classes",
+ "${project.basedir}/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/target/classes",
+ "${project.basedir}/tika-pipes/tika-pipes-plugins/tika-pipes-jdbc/target/classes",
+ "${project.basedir}/tika-pipes/tika-pipes-plugins/tika-pipes-json/target/classes",
+ "${project.basedir}/tika-pipes/tika-pipes-plugins/tika-pipes-kafka/target/classes",
+ "${project.basedir}/tika-pipes/tika-pipes-plugins/tika-pipes-microsoft-graph/target/classes",
+ "${project.basedir}/tika-pipes/tika-pipes-plugins/tika-pipes-opensearch/target/classes",
+ "${project.basedir}/tika-pipes/tika-pipes-plugins/tika-pipes-s3/target/classes",
+ "${project.basedir}/tika-pipes/tika-pipes-plugins/tika-pipes-solr/target/classes"
+ ],
+ "fetchers": [
+ {
+ "fs": {
+ "myFetcher": {
+ "basePath": "/tmp/input"
+ }
+ }
+ }
+ ]
+}
+```
+
+**Note:** To use absolute paths instead of `${project.basedir}`, substitute your actual Tika project path:
+
+```json
+{
+ "plugin-roots": [
+ "/home/user/tika/tika-pipes/tika-pipes-plugins/tika-pipes-az-blob/target/classes",
+ "/home/user/tika/tika-pipes/tika-pipes-plugins/tika-pipes-csv/target/classes",
+ "/home/user/tika/tika-pipes/tika-pipes-plugins/tika-pipes-file-system/target/classes",
+ "/home/user/tika/tika-pipes/tika-pipes-plugins/tika-pipes-gcs/target/classes",
+ "/home/user/tika/tika-pipes/tika-pipes-plugins/tika-pipes-http/target/classes",
+ "/home/user/tika/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/target/classes",
+ "/home/user/tika/tika-pipes/tika-pipes-plugins/tika-pipes-jdbc/target/classes",
+ "/home/user/tika/tika-pipes/tika-pipes-plugins/tika-pipes-json/target/classes",
+ "/home/user/tika/tika-pipes/tika-pipes-plugins/tika-pipes-kafka/target/classes",
+ "/home/user/tika/tika-pipes/tika-pipes-plugins/tika-pipes-microsoft-graph/target/classes",
+ "/home/user/tika/tika-pipes/tika-pipes-plugins/tika-pipes-opensearch/target/classes",
+ "/home/user/tika/tika-pipes/tika-pipes-plugins/tika-pipes-s3/target/classes",
+ "/home/user/tika/tika-pipes/tika-pipes-plugins/tika-pipes-solr/target/classes"
+ ]
+}
+```
+
+#### 2. Configure IntelliJ Run Configuration
+
+1. **Go to:** Run → Edit Configurations
+2. **Add New Configuration:** Click `+` → Application (or your existing run configuration)
+3. **Set Main Class:** Your application's main class (e.g., `org.apache.tika.grpc.TikaGrpcServer`)
+4. **VM Options:** Add development mode flag:
+ ```
+ -Dtika.plugin.dev.mode=true
+ ```
+5. **Program Arguments:** Point to your config file:
+ ```
+ --config dev-config.json
+ ```
+6. **Environment Variables:** (Alternative to VM option)
+ ```
+ TIKA_PLUGIN_DEV_MODE=true
+ ```
+7. **Working Directory:** Set to your Tika project root
+ ```
+ $PROJECT_DIR$
+ ```
+
+#### 3. Build All Plugins Once
+
+Before running in IntelliJ, compile all plugins:
+
+```bash
+# From Tika project root
+cd tika-pipes/tika-pipes-plugins
+mvn clean compile
+```
+
+Or use IntelliJ's Maven tool window:
+- Open **Maven** tool window (View → Tool Windows → Maven)
+- Expand **tika-pipes-plugins**
+- Right-click → Lifecycle → **compile**
+
+#### 4. Run in IntelliJ
+
+Click the **Run** button with your configured run configuration. You should see in the console:
+
+```
+INFO TikaPluginManager running in DEVELOPMENT mode
+INFO PF4J version 3.14.0 in 'development' mode
+INFO Plugin '[email protected]' resolved
+INFO Plugin '[email protected]' resolved
+INFO Plugin '[email protected]' resolved
+...
+```
+
+#### 5. Hot Reload During Development
+
+When you modify plugin code:
+
+1. **Make your changes** in the plugin source code
+2. **Build just that module:** In IntelliJ, right-click the module → Build Module
+3. **Restart your run configuration** - the rebuilt classes are loaded on startup
+
+No need to rebuild ZIPs or the entire project!
+
+### Shell Script to Generate Config with All Plugins
+
+You can also generate the configuration dynamically:
+
+```bash
+#!/bin/bash
+# generate-dev-config.sh
+
+TIKA_ROOT="/path/to/your/tika" # Update this path
+
+cat > dev-config.json << 'EOF'
+{
+ "plugin-roots": [
+EOF
+
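+# Add all plugin class directories (glob directly instead of parsing `ls` output)
+for plugin in "$TIKA_ROOT"/tika-pipes/tika-pipes-plugins/*/target/classes; do
+ [ -d "$plugin" ] || continue # skip the unexpanded pattern when nothing matches
+ echo " \"$plugin\"," >> dev-config.json
+done
+
+# Remove trailing comma from last entry
+# (GNU sed syntax; BSD/macOS sed needs: sed -i '' '$ s/,$//' dev-config.json)
+sed -i '$ s/,$//' dev-config.json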
+
+cat >> dev-config.json << 'EOF'
+ ],
+ "fetchers": [
+ {
+ "fs": {
+ "defaultFetcher": {
+ "basePath": "/tmp"
+ }
+ }
+ }
+ ]
+}
+EOF
+
+echo "Generated dev-config.json with all available plugins"
+```
+
+Run it:
+```bash
+chmod +x generate-dev-config.sh
+./generate-dev-config.sh
+```
+
+
+
+### Switching Back to Production Mode
+
+For production deployments, use packaged ZIP files:
+
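+1. **Unset the development mode flag or set it to false** (also remove any `-Dtika.plugin.dev.mode=true` VM option, because the system property takes precedence over the environment variable):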
+ ```bash
+ unset TIKA_PLUGIN_DEV_MODE
+ # OR
+ export TIKA_PLUGIN_DEV_MODE=false
+ ```
+
+2. **Build plugin ZIPs**:
+ ```bash
+ cd tika-pipes/tika-pipes-plugins
+ mvn clean package
+ ```
+
+3. **Update plugin-roots** to point to the directory containing ZIP files:
+ ```json
+ {
+ "plugin-roots": [
+ "/opt/tika/plugins"
+ ]
+ }
+ ```
+
+4. **Place plugin ZIPs** in the configured directory:
+ ```bash
+ # Run from the Tika project root
+ cp tika-pipes/tika-pipes-plugins/*/target/*.zip /opt/tika/plugins/
+ ```
+
+### Troubleshooting
+
+**Plugin not loading?**
+- Ensure `mvn compile` was run on the plugin module
+- Check that `plugin.properties` exists in `target/classes/`
+- Verify development mode is enabled
+- Look for "DEVELOPMENT mode" in the logs on startup
+
+**Changes not picked up?**
+- Recompile the plugin module: `mvn compile`
+- Restart the application
+- Check that you're editing the correct plugin module
+
+### References
+
+- [pf4j Development Mode Documentation](https://pf4j.org/doc/development-mode.html)
+- [JIRA TIKA-4587](https://issues.apache.org/jira/browse/TIKA-4587) - Development mode implementation
+
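The README section above states that each plugin root must contain `plugin.properties` at its top level and shows `META-INF/extensions.idx` as generated by the pf4j annotation processor. The following is a minimal sketch of a pre-flight check for that layout. `PluginRootCheck` and `problems` are hypothetical names, not part of Tika or pf4j, and the check only looks for the two files named in the README; it does not tell you whether pf4j can actually resolve the plugin.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Tika or pf4j.
final class PluginRootCheck {

    /** Lists layout problems for one plugin root; an empty list means both expected files are present. */
    static List<String> problems(Path pluginRoot) {
        List<String> problems = new ArrayList<>();
        if (!Files.isDirectory(pluginRoot)) {
            problems.add(pluginRoot + ": not a directory (has `mvn compile` been run?)");
            return problems;
        }
        // The README marks plugin.properties as required plugin metadata.
        if (!Files.isRegularFile(pluginRoot.resolve("plugin.properties"))) {
            problems.add(pluginRoot + ": missing plugin.properties");
        }
        // The README shows extensions.idx as output of the pf4j annotation processor.
        if (!Files.isRegularFile(pluginRoot.resolve("META-INF").resolve("extensions.idx"))) {
            problems.add(pluginRoot + ": missing META-INF/extensions.idx");
        }
        return problems;
    }

    public static void main(String[] args) {
        // Pass each plugin-roots entry as a command-line argument.
        for (String arg : args) {
            List<String> found = problems(Path.of(arg));
            if (found.isEmpty()) {
                System.out.println("OK  " + arg);
            } else {
                found.forEach(p -> System.out.println("BAD " + p));
            }
        }
    }
}
```

Running it against every entry of `plugin-roots` before starting the server is one way to work through the first two "Plugin not loading?" items in the Troubleshooting list.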
diff --git a/tika-plugins-core/src/main/java/org/apache/tika/plugins/TikaPluginManager.java b/tika-plugins-core/src/main/java/org/apache/tika/plugins/TikaPluginManager.java
index cd6296755..0cd635660 100644
--- a/tika-plugins-core/src/main/java/org/apache/tika/plugins/TikaPluginManager.java
+++ b/tika-plugins-core/src/main/java/org/apache/tika/plugins/TikaPluginManager.java
@@ -29,6 +29,7 @@ import com.fasterxml.jackson.databind.ObjectMapper;
import org.pf4j.DefaultExtensionFinder;
import org.pf4j.DefaultPluginManager;
import org.pf4j.ExtensionFinder;
+import org.pf4j.RuntimeMode;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
@@ -44,6 +45,9 @@ import org.apache.tika.exception.TikaConfigException;
public class TikaPluginManager extends DefaultPluginManager {
private static final Logger LOG = LoggerFactory.getLogger(TikaPluginManager.class);
+
+ private static final String DEV_MODE_PROPERTY = "tika.plugin.dev.mode";
+ private static final String DEV_MODE_ENV = "TIKA_PLUGIN_DEV_MODE";
//we're only using this to convert a single path or a list of paths to a list
//we don't need all the functionality of the polymorphic objectmapper in tika-serialization
@@ -142,8 +146,29 @@ public class TikaPluginManager extends DefaultPluginManager {
public TikaPluginManager(List<Path> pluginRoots) throws IOException {
super(pluginRoots);
+ configureRuntimeMode();
init();
}
+
+ private void configureRuntimeMode() {
+ RuntimeMode mode = isDevelopmentMode() ? RuntimeMode.DEVELOPMENT : RuntimeMode.DEPLOYMENT;
+ this.runtimeMode = mode;
+ if (mode == RuntimeMode.DEVELOPMENT) {
+ LOG.info("TikaPluginManager running in DEVELOPMENT mode");
+ }
+ }
+
+ private static boolean isDevelopmentMode() {
+ String sysProp = System.getProperty(DEV_MODE_PROPERTY);
+ if (sysProp != null) {
+ return Boolean.parseBoolean(sysProp);
+ }
+ String envVar = System.getenv(DEV_MODE_ENV);
+ if (envVar != null) {
+ return Boolean.parseBoolean(envVar);
+ }
+ return false;
+ }
/**
* Override to disable classpath scanning for extensions.
@@ -163,8 +188,12 @@ public class TikaPluginManager extends DefaultPluginManager {
}
private void init() throws IOException {
- for (Path root : pluginsRoots) {
- unzip(root);
+ if (getRuntimeMode() == RuntimeMode.DEPLOYMENT) {
+ for (Path root : pluginsRoots) {
+ unzip(root);
+ }
+ } else {
+ LOG.debug("Skipping ZIP extraction in DEVELOPMENT mode");
}
}
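The `isDevelopmentMode()` method in the diff above reads the system property first and consults the environment variable only when the property is absent. A small standalone sketch of that lookup order follows, with the two raw values passed in as parameters so the precedence can be exercised without touching the real environment. `DevModeFlag` and `resolve` are hypothetical names used only for illustration.

```java
// Hypothetical standalone illustration of the lookup order used in the diff above.
final class DevModeFlag {

    /**
     * Returns true only when the effective value parses as "true" (case-insensitive).
     * A non-null system property always wins, even when it is "false".
     */
    static boolean resolve(String sysProp, String envVar) {
        if (sysProp != null) {
            return Boolean.parseBoolean(sysProp);
        }
        if (envVar != null) {
            return Boolean.parseBoolean(envVar);
        }
        return false; // neither is set: DEPLOYMENT mode
    }

    public static void main(String[] args) {
        boolean dev = resolve(
                System.getProperty("tika.plugin.dev.mode"),
                System.getenv("TIKA_PLUGIN_DEV_MODE"));
        System.out.println(dev ? "DEVELOPMENT" : "DEPLOYMENT");
    }
}
```

Two consequences follow from this order and from `Boolean.parseBoolean`: `-Dtika.plugin.dev.mode=false` disables development mode even if `TIKA_PLUGIN_DEV_MODE=true` is exported, and values such as `1` or `yes` are treated as false.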
diff --git a/tika-plugins-core/src/test/java/org/apache/tika/plugins/TikaPluginManagerTest.java b/tika-plugins-core/src/test/java/org/apache/tika/plugins/TikaPluginManagerTest.java
new file mode 100644
index 000000000..ca492d055
--- /dev/null
+++ b/tika-plugins-core/src/test/java/org/apache/tika/plugins/TikaPluginManagerTest.java
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.tika.plugins;
+
+import static org.junit.jupiter.api.Assertions.assertEquals;
+
+import java.nio.file.Path;
+import java.util.Collections;
+
+import org.junit.jupiter.api.Test;
+import org.junit.jupiter.api.io.TempDir;
+import org.pf4j.RuntimeMode;
+
+public class TikaPluginManagerTest {
+
+ @Test
+ public void testDefaultRuntimeModeIsDeployment(@TempDir Path tmpDir) throws Exception {
+ TikaPluginManager manager = new TikaPluginManager(Collections.singletonList(tmpDir));
+ assertEquals(RuntimeMode.DEPLOYMENT, manager.getRuntimeMode());
+ }
+
+ @Test
+ public void testDevelopmentModeViaSystemProperty(@TempDir Path tmpDir) throws Exception {
+ System.setProperty("tika.plugin.dev.mode", "true");
+ try {
+ TikaPluginManager manager = new TikaPluginManager(Collections.singletonList(tmpDir));
+ assertEquals(RuntimeMode.DEVELOPMENT, manager.getRuntimeMode());
+ } finally {
+ System.clearProperty("tika.plugin.dev.mode");
+ }
+ }
+
+ @Test
+ public void testDeploymentModeWhenPropertyIsFalse(@TempDir Path tmpDir) throws Exception {
+ System.setProperty("tika.plugin.dev.mode", "false");
+ try {
+ TikaPluginManager manager = new TikaPluginManager(Collections.singletonList(tmpDir));
+ assertEquals(RuntimeMode.DEPLOYMENT, manager.getRuntimeMode());
+ } finally {
+ System.clearProperty("tika.plugin.dev.mode");
+ }
+ }
+}
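The tests above set `tika.plugin.dev.mode` and call `System.clearProperty` in a `finally` block, which would also discard a value that was already set before the test ran (for example via `-D` on the test JVM). One possible alternative, sketched here under that assumption, is a small `AutoCloseable` that restores whatever was there before. `SystemPropertyScope` is a hypothetical name and is not part of the commit.

```java
// Hypothetical test helper; not part of the commit.
final class SystemPropertyScope implements AutoCloseable {
    private final String key;
    private final String previous;

    SystemPropertyScope(String key, String value) {
        this.key = key;
        // System.setProperty returns the prior value, or null if the key was unset.
        this.previous = System.setProperty(key, value);
    }

    @Override
    public void close() {
        if (previous == null) {
            System.clearProperty(key);
        } else {
            System.setProperty(key, previous);
        }
    }

    public static void main(String[] args) {
        try (SystemPropertyScope ignored = new SystemPropertyScope("tika.plugin.dev.mode", "true")) {
            System.out.println("inside scope: " + System.getProperty("tika.plugin.dev.mode"));
        }
        System.out.println("after scope:  " + System.getProperty("tika.plugin.dev.mode"));
    }
}
```

With such a helper, the body of each test could sit inside a try-with-resources statement in place of the explicit `try`/`finally`.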