This is an automated email from the ASF dual-hosted git repository.

robertlazarski pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/axis-axis2-java-core.git
commit bf3a403c548b96890d377767005f62dade1921b7
Author: Robert Lazarski <[email protected]>
AuthorDate: Tue Apr 7 03:03:41 2026 -1000

    MCP catalog B1: mcpInputSchema parameter support + build-time code-gen script

    - OpenApiSpecGenerator: generateMcpCatalogJson() now reads mcpInputSchema
      parameter (operation-level overrides service-level via existing
      getMcpStringParam). Parses value with Jackson to validate JSON; falls
      back to empty schema with WARN log on parse failure. Backward compatible
      — no param = existing empty schema.
    - McpCatalogGeneratorTest: 6 new B1 tests covering parameter override,
      required array preservation, service-level fallback, precedence,
      invalid JSON fallback, and backward-compat empty schema baseline.
    - tools/gen_mcp_schema.py: Option 3 build-time code-gen. Parses typedef
      struct{} blocks from Axis2/C .h files, maps C types to JSON Schema
      (integer/number/string/
      boolean/array/object), and writes mcpInputSchema parameters into
      services.xml.
      Run: python3 tools/gen_mcp_schema.py --header service.h --services services.xml
    - AXIS2_MODERNIZATION_PLAN.md: new Immediate Track section covering
      B1/B2/B3/C3 (Java), D1/D2/D3 (Axis2/C), and E (Penguin deployment)
      with sprint sequence.

    Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
---
 AXIS2_MODERNIZATION_PLAN.md                        | 183 +++++++++++++
 .../apache/axis2/openapi/OpenApiSpecGenerator.java |  51 +++-
 .../axis2/openapi/McpCatalogGeneratorTest.java     | 142 ++++++++++
 tools/gen_mcp_schema.py                            | 300 +++++++++++++++++++++
 4 files changed, 669 insertions(+), 7 deletions(-)

diff --git a/AXIS2_MODERNIZATION_PLAN.md b/AXIS2_MODERNIZATION_PLAN.md
index 7e0242f4c0..6cdda944d6 100644
--- a/AXIS2_MODERNIZATION_PLAN.md
+++ b/AXIS2_MODERNIZATION_PLAN.md
@@ -23,6 +23,189 @@ entirely. No other Java framework can do all three from the same service deploym
 
 ---
 
+## Immediate Track — MCP inputSchema + Axis2/C + Penguin Demo
+
+**Goal**: Complete the MCP catalog to production quality, port the catalog handler to
+Axis2/C, and run a live demo on penguin via Apache httpd. This track runs ahead of
+Phases 1–6 because it validates the MCP story end-to-end on real hardware.
+
+### Step B1 — `mcpInputSchema` static parameter support (Java + C)
+
+**Problem**: Every tool in `/openapi-mcp.json` emits `"inputSchema": {}`. Claude has to
+guess parameters. This kills usability for financial benchmark tools with 6+ fields.
+
+**Approach (dual strategy)**:
+
+1. **Option 1 — Static declaration in services.xml** (ships first, zero risk):
+   Each `<operation>` carries a `mcpInputSchema` parameter whose value is a literal
+   JSON Schema string. `OpenApiSpecGenerator.generateMcpCatalogJson()` reads it with
+   `getMcpStringParam()` and embeds it verbatim, parsing with Jackson to validate.
+   Falls back to `{}` on parse failure with a WARN log.
+
+   ```xml
+   <operation name="portfolioVariance">
+     <parameter name="mcpInputSchema">{
+       "type": "object",
+       "required": ["n_assets", "weights", "covariance_matrix"],
+       "properties": {
+         "n_assets": {"type": "integer", "minimum": 2, "maximum": 2000},
+         "weights": {"type": "array", "items": {"type": "number"}},
+         "covariance_matrix": {"type": "array", "items": {"type": "number"}},
+         "request_id": {"type": "string"}
+       }
+     }</parameter>
+   </operation>
+   ```
+
+2. **Option 3 — Build-time code generation from C headers** (ships second):
+   A Python script (`tools/gen_mcp_schema.py`) reads Axis2/C service header files,
+   maps C struct fields to JSON Schema types, and writes `mcpInputSchema` parameters
+   directly back into `services.xml`. The C type mapping table:
+
+   | C type | JSON Schema type |
+   |--------|-----------------|
+   | `int`, `long`, `axis2_int32_t` | `"integer"` |
+   | `double`, `float` | `"number"` |
+   | `axis2_char_t *`, `char *` | `"string"` |
+   | `axis2_bool_t` | `"boolean"` |
+   | pointer-to-struct | `"object"` |
+   | array pointer + count field | `"array"` |
+
+   The script detects `_request_t` structs, infers which fields are required vs
+   optional (required = no default value set in initialiser), and outputs a
+   standards-compliant JSON Schema. Services.xml is updated in-place.
+
+   Run: `python3 tools/gen_mcp_schema.py --header financial_benchmark_service.h \
+         --services services.xml`
+
+**Java implementation**: `OpenApiSpecGenerator.generateMcpCatalogJson()` — check
+`mcpInputSchema` param before falling back to empty schema. Single method change.
+
+**Tests**: `McpCatalogGeneratorTest` — add tests for schema embedding, invalid JSON
+graceful fallback, and missing param fallback.
+
+### Step B2 — `mcpAuthScope` per-operation parameter
+
+Operation-level auth scope string embedded in catalog for MCP clients that support
+scope-based auth (e.g. `"mcpAuthScope": "read:portfolio"`). Reads via
+`getMcpStringParam()`. Omitted from tool node when absent.
+
+### Step B3 — `mcpStreaming` hint
+
+Boolean `mcpStreaming` parameter marks operations that can stream chunked responses
+(e.g. large Monte Carlo results). Adds `"x-streaming": true` to the tool node.
+Reads via `getMcpBoolParam()`.
+
+### Step C3 — MCP Resources endpoint
+
+New servlet path `GET /mcp-resources` returns a JSON array of `resource://` URIs:
+
+```json
+{
+  "resources": [
+    {"uri": "resource://axis2/openapi", "name": "OpenAPI Spec", "mimeType": "application/json"},
+    {"uri": "resource://axis2/field-catalog", "name": "Field Catalog", "mimeType": "application/json"}
+  ]
+}
+```
+
+Individual resource content served at `GET /mcp-resource?uri=resource://axis2/openapi`.
+Wired in `OpenApiServlet` as a new path case.
+
+---
+
+### Step D1 — Axis2/C MCP catalog handler
+
+New file: `modules/mcp/mcp_catalog_handler.c`
+
+Walks `axis2_conf_t` service map at request time — same traversal as Java's
+`axisConfig.getServices()`. Emits the identical JSON catalog format. Key functions:
+
+```c
+// Entry point registered on GET /_mcp/openapi-mcp.json
+axis2_status_t mcp_catalog_handler_invoke(
+    axis2_handler_t *handler,
+    const axutil_env_t *env,
+    struct axis2_msg_ctx *msg_ctx);
+
+// Reads axis2_op_t parameter, falls back to axis2_svc_t parameter
+static const axis2_char_t *get_mcp_param(
+    axis2_op_t *op, axis2_svc_t *svc,
+    const axutil_env_t *env,
+    const axis2_char_t *param_name,
+    const axis2_char_t *default_val);
+```
+
+Parameter reading uses `axis2_op_get_param()` / `axis2_svc_get_param()` — the same
+two-level lookup as Java. `mcpDescription`, `mcpReadOnly`, `mcpDestructive`,
+`mcpIdempotent`, `mcpInputSchema` all supported.
+
+JSON output built with `json_object_new_object()` (json-c) — no string concatenation.
+
+### Step D2 — Axis2/C correlation ID error hardening
+
+New helper: `axis2_json_secure_fault.c`
+
+```c
+axis2_char_t *axis2_json_make_secure_fault_message(
+    const axutil_env_t *env,
+    int is_parse_error);
+// Returns "Bad Request [errorRef=<uuid>]" or "Internal Server Error [errorRef=<uuid>]"
+// UUID generated from /dev/urandom (16 bytes → hex with hyphens)
+// Full context logged to axutil_log before sanitized message returned
+```
+
+Applied to `financial_benchmark_service_handler.c` JSON parse error paths and any
+`axis2_json_rpc_msg_recv` equivalent in Axis2/C.
+
+### Step D3 — Populate `mcpInputSchema` in all 5 financial benchmark operations
+
+Using Option 1 (hand-authored) immediately; Option 3 code-gen script validates against
+it. The 5 operations:
+
+| Operation | Required fields |
+|-----------|----------------|
+| `portfolioVariance` | `n_assets`, `weights`, `covariance_matrix` |
+| `monteCarlo` | `n_simulations`, `n_periods`, `initial_value`, `expected_return`, `volatility` |
+| `scenarioAnalysis` | `n_assets`, `assets` |
+| `generateTestData` | `n_assets` |
+| `metadata` | *(none — GET operation)* |
+
+### Step E — Penguin deployment
+
+1. Build `mod_axis2.so` from `axis-axis2-c-core` targeting penguin's Apache httpd
+2. `httpd.conf` fragment:
+   ```apache
+   LoadModule axis2_module modules/mod_axis2.so
+   Axis2RepoPath /opt/axis2c/repository
+   <Location /axis2>
+       SetHandler axis2_module
+   </Location>
+   ```
+3. Deploy `FinancialBenchmarkService` to repository
+4. Verify:
+   ```bash
+   curl https://penguin/axis2/_mcp/openapi-mcp.json
+   curl -X POST https://penguin/axis2/services/FinancialBenchmarkService/monteCarlo \
+     -H 'Content-Type: application/json' \
+     -d '{"monteCarlo":[{"arg0":{"n_simulations":10000,"n_periods":252,...}}]}'
+   ```
+5. Demo: MCP-aware client resolves tools from catalog, calls financial operations
+
+### Immediate Sprint Sequence
+
+```
+B1 (Java) → B1 tests → B2/B3 (Java, config-only) → C3 (Java, new servlet path)
+    ↓
+D1 (Axis2/C catalog handler) → D2 (error hardening) → D3 (services.xml schemas)
+    ↓
+Option 3 code-gen script (tools/gen_mcp_schema.py)
+    ↓
+E (Penguin deployment + demo)
+```
+
+---
+
 ## Phase 1 — Spring Boot Starter
 
 **Goal**: Reduce Axis2 + Spring Boot integration from a multi-day configuration project
diff --git a/modules/openapi/src/main/java/org/apache/axis2/openapi/OpenApiSpecGenerator.java b/modules/openapi/src/main/java/org/apache/axis2/openapi/OpenApiSpecGenerator.java
index 4f17b655dc..5824367b2c 100644
--- a/modules/openapi/src/main/java/org/apache/axis2/openapi/OpenApiSpecGenerator.java
+++ b/modules/openapi/src/main/java/org/apache/axis2/openapi/OpenApiSpecGenerator.java
@@ -754,13 +754,50 @@ public class OpenApiSpecGenerator {
                         service.getName() + ": " + opName);
                 toolNode.put("description", description);
 
-                // inputSchema: minimal MCP-compliant structure. Richer schemas are
-                // produced when services carry @McpTool annotations (future work).
-                com.fasterxml.jackson.databind.node.ObjectNode schema =
-                    toolNode.putObject("inputSchema");
-                schema.put("type", "object");
-                schema.putObject("properties");
-                schema.putArray("required");
+                // inputSchema: prefer mcpInputSchema parameter (literal JSON Schema
+                // string set in services.xml at operation or service level).
+                // Falls back to an empty schema when absent or malformed.
+                //
+                // Option 1 usage (services.xml):
+                //   <operation name="portfolioVariance">
+                //     <parameter name="mcpInputSchema">{
+                //       "type": "object",
+                //       "required": ["n_assets", "weights"],
+                //       "properties": {
+                //         "n_assets": {"type": "integer"},
+                //         "weights": {"type": "array", "items": {"type": "number"}}
+                //       }
+                //     }</parameter>
+                //   </operation>
+                //
+                // Option 3: schemas can also be written by the build-time code-gen
+                // script (tools/gen_mcp_schema.py) which reads C header structs and
+                // emits mcpInputSchema parameters into services.xml automatically.
+                String mcpInputSchemaStr = getMcpStringParam(operation, service,
+                        "mcpInputSchema", null);
+                if (mcpInputSchemaStr != null) {
+                    try {
+                        com.fasterxml.jackson.databind.JsonNode parsedSchema =
+                            jackson.readTree(mcpInputSchemaStr);
+                        toolNode.set("inputSchema", parsedSchema);
+                    } catch (Exception parseEx) {
+                        log.warn("[MCP] Invalid mcpInputSchema JSON for operation '"
+                                + opName + "' in service '" + service.getName()
+                                + "' — falling back to empty schema: "
+                                + parseEx.getMessage());
+                        com.fasterxml.jackson.databind.node.ObjectNode schema =
+                            toolNode.putObject("inputSchema");
+                        schema.put("type", "object");
+                        schema.putObject("properties");
+                        schema.putArray("required");
+                    }
+                } else {
+                    com.fasterxml.jackson.databind.node.ObjectNode schema =
+                        toolNode.putObject("inputSchema");
+                    schema.put("type", "object");
+                    schema.putObject("properties");
+                    schema.putArray("required");
+                }
 
                 toolNode.put("endpoint", "POST " + path);
 
diff --git a/modules/openapi/src/test/java/org/apache/axis2/openapi/McpCatalogGeneratorTest.java b/modules/openapi/src/test/java/org/apache/axis2/openapi/McpCatalogGeneratorTest.java
index 33682c0651..a799ef7a84 100644
--- a/modules/openapi/src/test/java/org/apache/axis2/openapi/McpCatalogGeneratorTest.java
+++ b/modules/openapi/src/test/java/org/apache/axis2/openapi/McpCatalogGeneratorTest.java
@@ -715,6 +715,148 @@ public class McpCatalogGeneratorTest extends TestCase {
         assertFalse("openWorldHint default must be false", annotations.path("openWorldHint").asBoolean());
     }
 
+    // ── B1: mcpInputSchema static parameter ──────────────────────────────────
+
+    /**
+     * When an operation has a {@code mcpInputSchema} parameter containing a valid
+     * JSON Schema string, that schema is embedded verbatim in the catalog tool entry.
+     * This is Option 1: explicit declaration in services.xml.
+     */
+    public void testMcpInputSchemaParamOverridesEmptySchema() throws Exception {
+        AxisService svc = new AxisService("FinancialBenchmarkService");
+        AxisOperation op = new InOutAxisOperation();
+        op.setName(QName.valueOf("portfolioVariance"));
+        op.addParameter(new org.apache.axis2.description.Parameter(
+                "mcpInputSchema",
+                "{\"type\":\"object\",\"required\":[\"n_assets\",\"weights\"]," +
+                "\"properties\":{\"n_assets\":{\"type\":\"integer\"}," +
+                "\"weights\":{\"type\":\"array\",\"items\":{\"type\":\"number\"}}}}"));
+        svc.addOperation(op);
+        axisConfig.addService(svc);
+
+        JsonNode schema = getCatalogTools().get(0).path("inputSchema");
+        assertEquals("type must be 'object'", "object", schema.path("type").asText());
+        assertFalse("properties must be present from mcpInputSchema",
+                schema.path("properties").isMissingNode());
+        assertFalse("n_assets property must be present",
+                schema.path("properties").path("n_assets").isMissingNode());
+        assertEquals("n_assets must be integer type",
+                "integer", schema.path("properties").path("n_assets").path("type").asText());
+    }
+
+    /**
+     * The required array from the mcpInputSchema parameter must be preserved
+     * exactly — not replaced with an empty array.
+     */
+    public void testMcpInputSchemaRequiredArrayPreserved() throws Exception {
+        AxisService svc = new AxisService("FinancialBenchmarkService");
+        AxisOperation op = new InOutAxisOperation();
+        op.setName(QName.valueOf("monteCarlo"));
+        op.addParameter(new org.apache.axis2.description.Parameter(
+                "mcpInputSchema",
+                "{\"type\":\"object\",\"required\":[\"n_simulations\",\"n_periods\"]," +
+                "\"properties\":{\"n_simulations\":{\"type\":\"integer\"}," +
+                "\"n_periods\":{\"type\":\"integer\"}}}"));
+        svc.addOperation(op);
+        axisConfig.addService(svc);
+
+        JsonNode required = getCatalogTools().get(0).path("inputSchema").path("required");
+        assertTrue("required must be an array", required.isArray());
+        assertEquals("required must have 2 entries", 2, required.size());
+        // Collect required field names
+        java.util.Set<String> reqFields = new java.util.HashSet<>();
+        for (JsonNode r : required) reqFields.add(r.asText());
+        assertTrue("n_simulations must be required", reqFields.contains("n_simulations"));
+        assertTrue("n_periods must be required", reqFields.contains("n_periods"));
+    }
+
+    /**
+     * mcpInputSchema set at service level applies to all operations in the service
+     * that do not have their own operation-level override.
+     */
+    public void testServiceLevelMcpInputSchemaAppliesWhenNoOperationLevel() throws Exception {
+        AxisService svc = new AxisService("MetadataService");
+        svc.addParameter(new org.apache.axis2.description.Parameter(
+                "mcpInputSchema",
+                "{\"type\":\"object\",\"properties\":{\"request_id\":{\"type\":\"string\"}}}"));
+        AxisOperation op = new InOutAxisOperation();
+        op.setName(QName.valueOf("metadata"));
+        svc.addOperation(op);
+        axisConfig.addService(svc);
+
+        JsonNode schema = getCatalogTools().get(0).path("inputSchema");
+        assertFalse("request_id property must come from service-level mcpInputSchema",
+                schema.path("properties").path("request_id").isMissingNode());
+    }
+
+    /**
+     * Operation-level mcpInputSchema takes precedence over a service-level one.
+     */
+    public void testOperationLevelMcpInputSchemaTakesPrecedenceOverServiceLevel() throws Exception {
+        AxisService svc = new AxisService("SomeService");
+        svc.addParameter(new org.apache.axis2.description.Parameter(
+                "mcpInputSchema",
+                "{\"type\":\"object\",\"properties\":{\"service_field\":{\"type\":\"string\"}}}"));
+        AxisOperation op = new InOutAxisOperation();
+        op.setName(QName.valueOf("specificOp"));
+        op.addParameter(new org.apache.axis2.description.Parameter(
+                "mcpInputSchema",
+                "{\"type\":\"object\",\"properties\":{\"op_field\":{\"type\":\"integer\"}}}"));
+        svc.addOperation(op);
+        axisConfig.addService(svc);
+
+        JsonNode props = getCatalogTools().get(0).path("inputSchema").path("properties");
+        assertFalse("op_field from operation-level schema must be present",
+                props.path("op_field").isMissingNode());
+        assertTrue("service_field must not be present when operation-level overrides",
+                props.path("service_field").isMissingNode());
+    }
+
+    /**
+     * When mcpInputSchema contains invalid JSON, the generator must log a warning
+     * and fall back to the empty schema — never throw or produce invalid JSON.
+     */
+    public void testInvalidMcpInputSchemaFallsBackToEmptySchema() throws Exception {
+        AxisService svc = new AxisService("BrokenService");
+        AxisOperation op = new InOutAxisOperation();
+        op.setName(QName.valueOf("brokenOp"));
+        op.addParameter(new org.apache.axis2.description.Parameter(
+                "mcpInputSchema", "NOT_VALID_JSON{{"));
+        svc.addOperation(op);
+        axisConfig.addService(svc);
+
+        // Must not throw — output must still be valid JSON
+        String json = generator.generateMcpCatalogJson(mockRequest);
+        JsonNode root = MAPPER.readTree(json);
+        assertNotNull("Output must still be valid JSON after mcpInputSchema parse failure", root);
+
+        JsonNode schema = root.path("tools").get(0).path("inputSchema");
+        assertEquals("Fallback schema must have type=object", "object",
+                schema.path("type").asText());
+        assertFalse("Fallback schema must still have properties",
+                schema.path("properties").isMissingNode());
+    }
+
+    /**
+     * When no mcpInputSchema parameter is set, the catalog emits the baseline
+     * empty schema — preserving backward compatibility for all existing services.
+     */
+    public void testAbsentMcpInputSchemaProducesEmptyBaselineSchema() throws Exception {
+        addService("LegacyService", "legacyOp");
+
+        JsonNode schema = getCatalogTools().get(0).path("inputSchema");
+        assertEquals("Absent mcpInputSchema must produce type=object", "object",
+                schema.path("type").asText());
+        assertTrue("Baseline properties must be an empty object",
+                schema.path("properties").isObject());
+        assertEquals("Baseline properties must be empty", 0,
+                schema.path("properties").size());
+        assertTrue("Baseline required must be an empty array",
+                schema.path("required").isArray());
+        assertEquals("Baseline required must be empty", 0,
+                schema.path("required").size());
+    }
+
     // ── tool list mirrors existing OpenAPI paths ──────────────────────────────
 
     /**
diff --git a/tools/gen_mcp_schema.py b/tools/gen_mcp_schema.py
new file mode 100644
index 0000000000..1110925d2d
--- /dev/null
+++ b/tools/gen_mcp_schema.py
@@ -0,0 +1,300 @@
+#!/usr/bin/env python3
+"""
+gen_mcp_schema.py — Build-time MCP inputSchema generator (Option 3)
+
+Reads an Axis2/C service header file, finds *_request_t structs, maps C field
+types to JSON Schema types, and writes mcpInputSchema parameters into the
+corresponding services.xml.
+
+Usage
+-----
+    python3 tools/gen_mcp_schema.py \\
+        --header path/to/service.h \\
+        --services path/to/services.xml \\
+        [--dry-run]
+
+The script writes in-place unless --dry-run is given, in which case it prints
+the updated XML to stdout.
+
+C → JSON Schema type mapping
+-----------------------------
+int / long / int32_t / int64_t / axis2_int32_t  → "integer"
+double / float                                  → "number"
+char * / axis2_char_t *                         → "string"
+axis2_bool_t / bool / int (named is_*/has_*)    → "boolean"
+pointer-to-struct (foo_t *)                     → "object"
+array + companion _count / n_ field             → "array"
+
+Required fields: any field without a "= 0" / "= NULL" / "= false" default in
+the struct definition is treated as required. Fields named *_id, n_*, count_*
+are also always required.
+
+The script uses regex-only parsing (no libclang) so it works without a C
+toolchain installed. It is conservative: when a type cannot be mapped
+unambiguously, it emits "type": "object" and logs a warning.
+"""
+
+import argparse
+import json
+import re
+import sys
+import textwrap
+from pathlib import Path
+
+# ---------------------------------------------------------------------------
+# C type → JSON Schema type table
+# ---------------------------------------------------------------------------
+_SCALAR_MAP = [
+    # (regex_pattern, json_schema_type)
+    (r'\bint\b|\blong\b|\bint32_t\b|\bint64_t\b|\buint32_t\b|\buint64_t\b'
+     r'|\baxis2_int32_t\b|\bsize_t\b', "integer"),
+    (r'\bdouble\b|\bfloat\b', "number"),
+    (r'\baxis2_char_t\s*\*|\bchar\s*\*', "string"),
+    (r'\baxis2_bool_t\b|\bbool\b', "boolean"),
+]
+
+_STRUCT_PTR_RE = re.compile(r'\b(\w+_t)\s*\*')
+
+
+def c_type_to_json_schema(c_type: str, field_name: str) -> dict:
+    """Map a C type string to a minimal JSON Schema dict."""
+    c_type = c_type.strip()
+
+    # Boolean heuristic: field named is_*/has_* with int type
+    if re.match(r'(is|has|enable|use)_', field_name) and re.search(r'\bint\b', c_type):
+        return {"type": "boolean"}
+
+    # Pointer to array (double * / float * used for matrix/weight arrays)
+    if re.search(r'\bdouble\s*\*|\bfloat\s*\*', c_type):
+        return {"type": "array", "items": {"type": "number"}}
+
+    for pattern, schema_type in _SCALAR_MAP:
+        if re.search(pattern, c_type):
+            return {"type": schema_type}
+
+    m = _STRUCT_PTR_RE.search(c_type)
+    if m:
+        return {"type": "object"}
+
+    # Fallback
+    print(f" WARNING: unmapped C type '{c_type}' for field '{field_name}' → object",
+          file=sys.stderr)
+    return {"type": "object"}
+
+
+# ---------------------------------------------------------------------------
+# Struct parser
+# ---------------------------------------------------------------------------
+_STRUCT_RE = re.compile(
+    r'typedef\s+struct\s+\w*\s*\{([^}]+)\}\s*(\w+_t)\s*;',
+    re.DOTALL
+)
+_FIELD_RE = re.compile(
+    r'^\s*(?P<type>(?:const\s+)?[\w\s\*]+?[\s\*])\s*(?P<name>\w+)\s*(?:=\s*(?P<default>[^;]+))?\s*;',
+    re.MULTILINE
+)
+
+
+def parse_structs(header_text: str) -> dict[str, dict]:
+    """
+    Return {struct_name: {field_name: {"c_type": ..., "has_default": bool}}}.
+    Only parses typedef struct { ... } name_t; blocks.
+    """
+    structs = {}
+    for m in _STRUCT_RE.finditer(header_text):
+        body = m.group(1)
+        name = m.group(2)
+        fields = {}
+        for fm in _FIELD_RE.finditer(body):
+            field_name = fm.group("name")
+            c_type = fm.group("type")
+            default = fm.group("default")
+            # Skip comment-only or empty lines picked up by the regex
+            if c_type.strip().startswith("//") or c_type.strip().startswith("*"):
+                continue
+            fields[field_name] = {
+                "c_type": c_type.strip(),
+                "has_default": default is not None,
+            }
+        if fields:
+            structs[name] = fields
+    return structs
+
+
+def build_json_schema(struct_fields: dict) -> dict:
+    """Build a JSON Schema object from parsed struct fields."""
+    properties = {}
+    required = []
+
+    # Fields that are always array companions (paired with n_* / *_count) — skip them
+    # as array size information; they are implicit.
+    companion_size_re = re.compile(r'^n_|_count$|_len$|_size$')
+
+    # First pass: collect array-indicator field names
+    array_fields = set()
+    for fname, info in struct_fields.items():
+        c_type = info["c_type"]
+        if re.search(r'\bdouble\s*\*|\bfloat\s*\*', c_type):
+            array_fields.add(fname)
+
+    for fname, info in struct_fields.items():
+        c_type = info["c_type"]
+        has_default = info["has_default"]
+
+        # Skip size companion fields (n_assets accompanies weights[], etc.)
+        if companion_size_re.search(fname) and fname not in array_fields:
+            # Keep n_assets as it is the primary dimension parameter
+            if not fname.startswith("n_"):
+                continue
+
+        schema_prop = c_type_to_json_schema(c_type, fname)
+
+        # Annotate array items for common financial arrays
+        if schema_prop.get("type") == "array" and not schema_prop.get("items"):
+            schema_prop["items"] = {"type": "number"}
+
+        properties[fname] = schema_prop
+
+        # Required: no default AND not a companion size field
+        always_required = re.match(r'.+_id$|^n_', fname)
+        if always_required or not has_default:
+            required.append(fname)
+
+    schema = {
+        "type": "object",
+        "properties": properties,
+    }
+    if required:
+        schema["required"] = required
+    return schema
+
+
+# ---------------------------------------------------------------------------
+# services.xml patcher
+# ---------------------------------------------------------------------------
+def find_request_struct(structs: dict, op_name: str) -> str | None:
+    """
+    Heuristically find the request struct for an operation name.
+    Tries: finbench_{op_name}_request_t, {op_name}_request_t, {op_name}_req_t
+    """
+    service_prefix = "finbench_"
+    candidates = [
+        f"{service_prefix}{op_name}_request_t",
+        f"{op_name}_request_t",
+        f"{op_name}_req_t",
+    ]
+    for c in candidates:
+        if c in structs:
+            return c
+    # Case-insensitive fallback
+    op_lower = op_name.lower()
+    for sname in structs:
+        if op_lower in sname.lower() and "request" in sname.lower():
+            return sname
+    return None
+
+
+_OP_RE = re.compile(
+    r'(<operation\s+name="(?P<opname>[^"]+)"[^>]*>)',
+    re.DOTALL
+)
+_EXISTING_SCHEMA_RE = re.compile(
+    r'\s*<parameter\s+name="mcpInputSchema">.*?</parameter>',
+    re.DOTALL
+)
+
+
+def patch_services_xml(xml_text: str, structs: dict) -> tuple[str, list[str]]:
+    """
+    For each <operation name="..."> block, find the matching request struct
+    and inject (or replace) a mcpInputSchema parameter.
+
+    Returns (patched_xml, list_of_change_messages).
+    """
+    messages = []
+    result = xml_text
+
+    for m in reversed(list(_OP_RE.finditer(xml_text))):  # back to front: offsets stay valid
+        op_name = m.group("opname")
+        struct_name = find_request_struct(structs, op_name)
+        if struct_name is None:
+            messages.append(f" SKIP {op_name}: no matching *_request_t struct found")
+            continue
+
+        schema = build_json_schema(structs[struct_name])
+        schema_json = json.dumps(schema, indent=16)
+        param_block = f'<parameter name="mcpInputSchema">{schema_json}</parameter>'
+
+        # Check if an mcpInputSchema already exists after this <operation ...> tag
+        op_start = m.start()
+        # Find the closing </operation>
+        close_re = re.compile(r'</operation>', re.DOTALL)
+        close_m = close_re.search(result, op_start)
+        if close_m is None:
+            continue
+        op_block = result[op_start:close_m.end()]
+
+        if '<parameter name="mcpInputSchema">' in op_block:
+            # Replace existing
+            new_op_block = _EXISTING_SCHEMA_RE.sub(
+                "\n " + param_block, op_block)
+            result = result[:op_start] + new_op_block + result[close_m.end():]
+            messages.append(f" UPDATE {op_name}: replaced mcpInputSchema from {struct_name}")
+        else:
+            # Insert after the opening <operation ...> tag
+            tag_end = op_start + len(m.group(1))
+            indent = "\n "
+            result = (result[:tag_end]
+                      + indent + param_block
+                      + result[tag_end:])
+            messages.append(f" INSERT {op_name}: wrote mcpInputSchema from {struct_name}")
+
+    return result, messages[::-1]
+
+
+# ---------------------------------------------------------------------------
+# CLI
+# ---------------------------------------------------------------------------
+def main():
+    p = argparse.ArgumentParser(description=__doc__,
+                                formatter_class=argparse.RawDescriptionHelpFormatter)
+    p.add_argument("--header", required=True, help="Path to .h file")
+    p.add_argument("--services", required=True, help="Path to services.xml")
+    p.add_argument("--dry-run", action="store_true",
+                   help="Print patched XML to stdout, do not write")
+    args = p.parse_args()
+
+    header_path = Path(args.header)
+    services_path = Path(args.services)
+
+    if not header_path.exists():
+        sys.exit(f"ERROR: header not found: {header_path}")
+    if not services_path.exists():
+        sys.exit(f"ERROR: services.xml not found: {services_path}")
+
+    header_text = header_path.read_text(encoding="utf-8")
+    services_text = services_path.read_text(encoding="utf-8")
+
+    structs = parse_structs(header_text)
+    if not structs:
+        sys.exit("ERROR: no typedef struct { } name_t; blocks found in header")
+
+    print(f"Parsed {len(structs)} structs from {header_path.name}:", file=sys.stderr)
+    for sname in structs:
+        print(f" {sname} ({len(structs[sname])} fields)", file=sys.stderr)
+
+    patched, messages = patch_services_xml(services_text, structs)
+
+    print("Schema generation results:", file=sys.stderr)
+    for msg in messages:
+        print(msg, file=sys.stderr)
+
+    if args.dry_run:
+        print(patched)
+    else:
+        services_path.write_text(patched, encoding="utf-8")
+        print(f"Written: {services_path}", file=sys.stderr)
+
+
+if __name__ == "__main__":
+    main()
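
Illustration (not part of the commit): a condensed sketch of the Option 3 mapping, showing what kind of schema the regex approach would produce for one request struct. The struct below is hypothetical (the real `financial_benchmark_service.h` is not shown in this email), and `FIELD_RE`, `json_type` and `struct_to_schema` are illustrative names, not functions from `tools/gen_mcp_schema.py`. It assumes a plain C header with no default initialisers, so every field ends up in `required`.

```python
import json
import re

# Hypothetical header text; the real service header may look different.
HEADER = """
typedef struct finbench_portfolio_variance_request {
    int n_assets;
    double *weights;
    double *covariance_matrix;
    axis2_char_t *request_id;
} finbench_portfolio_variance_request_t;
"""

# Type (which may end in '*'), then the field name, then ';'.
FIELD_RE = re.compile(
    r'^\s*(?P<type>[\w\s\*]+?[\s\*])\s*(?P<name>\w+)\s*;', re.MULTILINE)


def json_type(c_type: str) -> dict:
    """Map a C type string to a JSON Schema fragment (simplified table)."""
    c_type = c_type.strip()
    if re.search(r'\b(double|float)\s*\*', c_type):
        return {"type": "array", "items": {"type": "number"}}
    if re.search(r'\b(axis2_char_t|char)\s*\*', c_type):
        return {"type": "string"}
    if re.search(r'\b(double|float)\b', c_type):
        return {"type": "number"}
    if re.search(r'\b(int|long|axis2_int32_t)\b', c_type):
        return {"type": "integer"}
    if re.search(r'\b(axis2_bool_t|bool)\b', c_type):
        return {"type": "boolean"}
    return {"type": "object"}  # conservative fallback, as in the script


def struct_to_schema(header: str) -> dict:
    """Build a schema from the first struct body found in the header text."""
    body = re.search(r'\{([^}]+)\}', header).group(1)
    props = {m.group("name"): json_type(m.group("type"))
             for m in FIELD_RE.finditer(body)}
    # No initialisers in plain C, so every parsed field is treated as required.
    return {"type": "object", "properties": props, "required": list(props)}


if __name__ == "__main__":
    print(json.dumps(struct_to_schema(HEADER), indent=2))
```

Note that under this rule `request_id` is also listed as required, which differs from the hand-authored Option 1 example in the plan; fields meant to be optional would need to be marked some other way.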

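
Illustration (not part of the commit): the lookup and fallback behaviour that the B1 tests exercise, restated as a small Python analogue of the Java logic. This is a sketch under assumptions: plain dicts stand in for the operation-level and service-level Axis2 parameters, `json.loads` stands in for Jackson's `readTree`, and `resolve_input_schema` and `empty_schema` are hypothetical names.

```python
import json
import logging

log = logging.getLogger("mcp")


def empty_schema() -> dict:
    """Baseline schema emitted when no usable mcpInputSchema is present."""
    return {"type": "object", "properties": {}, "required": []}


def resolve_input_schema(op_params: dict, svc_params: dict) -> dict:
    """Operation-level value wins over service-level; bad JSON falls back."""
    raw = op_params.get("mcpInputSchema", svc_params.get("mcpInputSchema"))
    if raw is None:
        return empty_schema()
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        log.warning("Invalid mcpInputSchema JSON, using empty schema: %s", exc)
        return empty_schema()


if __name__ == "__main__":
    svc = {"mcpInputSchema": '{"type":"object","properties":{"service_field":{"type":"string"}}}'}
    op = {"mcpInputSchema": '{"type":"object","properties":{"op_field":{"type":"integer"}}}'}
    print(resolve_input_schema(op, svc))                                   # operation-level schema
    print(resolve_input_schema({}, svc))                                   # service-level schema
    print(resolve_input_schema({"mcpInputSchema": "NOT_VALID_JSON{{"}, {}))  # fallback
```

As in the Java change, any syntactically valid JSON value is accepted; neither version checks that the parsed value is actually a JSON Schema object.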