This is an automated email from the ASF dual-hosted git repository. aldettinger pushed a commit to branch camel-quarkus-main in repository https://gitbox.apache.org/repos/asf/camel-quarkus-examples.git
The following commit(s) were added to refs/heads/camel-quarkus-main by this push:
     new 02f3b7e  Add an example of data extraction with Quarkus LangChain4j
02f3b7e is described below

commit 02f3b7ebb155588be73ba79d802640bf0f2b4ff4
Author: aldettinger <aldettin...@gmail.com>
AuthorDate: Thu Jul 25 18:30:03 2024 +0200

    Add an example of data extraction with Quarkus LangChain4j
---
 data-extract-langchain4j/README.adoc               | 126 ++++++++
 .../eclipse-formatter-config.xml                   | 276 +++++++++++++++++
 data-extract-langchain4j/pom.xml                   | 326 +++++++++++++++++++++
 .../extraction/CustomPojoExtractionService.java    |  64 ++++
 .../java/org/acme/extraction/CustomPojoStore.java  |  50 ++++
 .../src/main/java/org/acme/extraction/Routes.java  |  49 ++++
 .../src/main/resources/application.properties      |  36 +++
 .../org/acme/extraction/OllamaTestResource.java    | 126 ++++++++
 .../src/test/java/org/acme/extraction/RouteIT.java |  24 ++
 .../test/java/org/acme/extraction/RouteTest.java   |  84 ++++++
 ..._chat-495066d1-9278-4e4b-b8f6-a1c2fa296779.json |  24 ++
 ..._chat-52961581-e5b1-4309-b62f-1c6e1e0008eb.json |  24 ++
 ..._chat-5c1f926c-0480-41e9-9ca7-a93e17919e99.json |  24 ++
 .../01_sarah-london-10-07-1986-satisfied.json      |   4 +
 .../02_john-doe-01-11-2001-unsatisfied.json        |   4 +
 .../03_kate-boss-13-08-1999-satisfied.json         |   4 +
 docs/modules/ROOT/attachments/examples.json        |   5 +
 17 files changed, 1250 insertions(+)

diff --git a/data-extract-langchain4j/README.adoc b/data-extract-langchain4j/README.adoc
new file mode 100644
index 0000000..5260f5a
--- /dev/null
+++ b/data-extract-langchain4j/README.adoc
@@ -0,0 +1,126 @@
+= Unstructured Data Extraction with LangChain4j: A Camel Quarkus example
+:cq-example-description: An example that shows how to convert unstructured text data to structured Java objects with the help of a Large Language Model and LangChain4j
+
+{cq-description}
+
+TIP: Check the https://camel.apache.org/camel-quarkus/latest/first-steps.html[Camel Quarkus User guide] for prerequisites
+and other general information.
+
+Suppose the volume of https://en.wikipedia.org/wiki/Unstructured_data[unstructured data] grows at a high pace in a given organization.
+How could one transform those scattered gold particles into standard bullion that banks could use?
+For instance, let's imagine an insurance company that records a transcript of each conversation between customers and the hotline.
+There is probably a lot of valuable information that could be extracted from those conversation transcripts.
+In this example, we'll convert those text conversations into Java Objects that could then be used in the rest of the Camel route.
+
+In order to achieve this extraction, we'll need a https://en.wikipedia.org/wiki/Large_language_model[Large Language Model (LLM)] that natively supports JSON output.
+Here, we arbitrarily choose https://ollama.com/library/codellama[codellama] served through https://ollama.com/[ollama].
+In order to invoke the served model, we'll use the high-level LangChain4j APIs like https://docs.langchain4j.dev/tutorials/ai-services[AiServices].
+As we are using the Quarkus runtime, we can leverage all the advantages of the https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html[Quarkus LangChain4j extension].
+
+=== Start the Large Language Model
+
+Let's start a container to serve the LLM with Ollama:
+
+[source,shell]
+----
+docker run -p11434:11434 langchain4j/ollama-codellama:latest
+----
+
+After a moment, a log line like the one below should be output:
+
+[source,shell]
+----
+time=2024-09-03T08:03:15.532Z level=INFO source=types.go:98 msg="inference compute" id=0 library=cpu compute="" driver=0.0 name="" total="62.5 GiB" available="54.4 GiB"
+----
+
+That's it, the LLM is now ready to serve our data extraction requests.
+
+=== Package and run the application
+
+You are now ready to package and run the application.
+
+TIP: Find more details about the JVM mode and Native mode in the Package and run section of
+https://camel.apache.org/camel-quarkus/latest/first-steps.html#_package_and_run_the_application[Camel Quarkus User guide]
+
+==== JVM mode
+
+[source,shell]
+----
+mvn clean package -DskipTests
+java -jar target/quarkus-app/quarkus-run.jar
+----
+
+==== Extracting data from unstructured conversation
+
+Let's atomically copy/move the transcript files to the input folder named `target/transcripts/`, for instance like below:
+
+[source,shell]
+----
+cp -rf src/test/resources/transcripts/ target/transcripts-tmp
+mv target/transcripts-tmp/*.json target/transcripts/
+----
+
+The Camel route should output a log like the one below:
+
+[source,shell]
+----
+2024-09-03 10:14:34,757 INFO [route1] (Camel (camel-1) thread #1 - file://target/transcripts) A document has been received by the camel-quarkus-file extension: {
+  "id": 1,
+  "content": "Operator: Hello, how may I help you ?\nCustomer: Hello, I'm calling because I need to declare an accident on my main vehicle.\nOperator: Ok, can you please give me your name ?\nCustomer: My name is Sarah London.\nOperator: Could you please give me your birth date ?\nCustomer: 1986, July the 10th.\nOperator: Ok, I've got your contract and I'm happy to share with you that we'll be able to reimburse all expenses linked to this accident.\nCustomer: Oh great, many thanks."
+}
+----
+
+In the first log above, we can see that a JSON file holding transcript-related information has been consumed.
+The conversation is present in the JSON field named `content`.
+This content will be injected into the LLM prompt.
+
+After a few seconds or minutes depending on your hardware setup, the LLM provides an answer strictly conforming to the expected JSON schema.
+It's now easy for LangChain4j to convert the returned JSON into a Java Object.
+At the end, we are provided with a Plain Old Java Object (POJO) holding the extracted data, like below.
+
+[source,shell]
+----
+2024-09-03 10:14:51,284 INFO [org.acm.ext.CustomPojoStore] (Camel (camel-1) thread #1 - file://target/transcripts) An extracted POJO has been added to the store:
+{
+  "customerSatisfied": "true",
+  "customerName": "Sarah London",
+  "customerBirthday": "10 July 1986",
+  "summary": "Declare an accident on main vehicle and receive reimbursement for expenses."
+}
+----
+
+See how the LLM shows its capacity to:
+ * Extract a human-friendly sentiment, like `customerSatisfied`
+ * Exhibit https://nlp.stanford.edu/projects/coref.shtml#:~:text=Overview,question%20answering%2C%20and%20information%20extraction.[coreference resolution], like `customerName`, which is deduced from information spread across the whole conversation
+ * Manage issues related to date format, like the field `customerBirthday`
+ * Mix structured and unstructured data (semi-structured data), as with the field `summary`.
+
+The cherry on the cake: all this information is computed simultaneously during a single LLM inference.
+
+At the end, the application should have extracted 3 POJOs.
+For each of them, it could be interesting to compare the unstructured input text and the corresponding structured POJO.
+
+More details can be found in the `src/main/java/org/acme/extraction/CustomPojoExtractionService.java` class.
+
+==== Native mode
+
+IMPORTANT: Native mode requires having GraalVM and other tools installed. Please check the Prerequisites section
+of https://camel.apache.org/camel-quarkus/latest/first-steps.html#_prerequisites[Camel Quarkus User guide].
+
+If the application is still running in JVM mode, please kill it, for instance with `CTRL+C`.
+
+Now, to prepare a native executable using GraalVM, run the following commands:
+
+[source,shell]
+----
+mvn clean package -DskipTests -Dnative
+./target/*-runner
+----
+
+The compilation is a bit slower. Beyond that, notice how the application behaves the same way.
+Indeed, you should be able to send the JSON files and see the extracted data exactly as it was done in JVM mode.
+The only variation compared to the JVM mode is actually that the application was packaged as a native executable.
+
+== Feedback
+
+Please report bugs and propose improvements via https://github.com/apache/camel-quarkus/issues[GitHub issues of Camel Quarkus] project.
diff --git a/data-extract-langchain4j/eclipse-formatter-config.xml b/data-extract-langchain4j/eclipse-formatter-config.xml
new file mode 100644
index 0000000..2248b2b
--- /dev/null
+++ b/data-extract-langchain4j/eclipse-formatter-config.xml
@@ -0,0 +1,276 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements. See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to You under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License. You may obtain a copy of the License at
+
+         http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+ +--> +<profiles version="8"> + <profile name="Camel Java Conventions" version="8" kind="CodeFormatterProfile"> + <setting id="org.eclipse.jdt.core.formatter.align_type_members_on_columns" value="false"/> + <setting id="org.eclipse.jdt.core.formatter.alignment_for_arguments_in_allocation_expression" value="16"/> + <setting id="org.eclipse.jdt.core.formatter.alignment_for_arguments_in_enum_constant" value="16"/> + <setting id="org.eclipse.jdt.core.formatter.alignment_for_arguments_in_explicit_constructor_call" value="16"/> + <setting id="org.eclipse.jdt.core.formatter.alignment_for_arguments_in_method_invocation" value="16"/> + <setting id="org.eclipse.jdt.core.formatter.alignment_for_arguments_in_qualified_allocation_expression" value="16"/> + <setting id="org.eclipse.jdt.core.formatter.alignment_for_binary_expression" value="16"/> + <setting id="org.eclipse.jdt.core.formatter.alignment_for_compact_if" value="16"/> + <setting id="org.eclipse.jdt.core.formatter.alignment_for_conditional_expression" value="16"/> + <setting id="org.eclipse.jdt.core.formatter.alignment_for_enum_constants" value="16"/> + <setting id="org.eclipse.jdt.core.formatter.alignment_for_expressions_in_array_initializer" value="16"/> + <setting id="org.eclipse.jdt.core.formatter.alignment_for_multiple_fields" value="16"/> + <setting id="org.eclipse.jdt.core.formatter.alignment_for_parameters_in_constructor_declaration" value="16"/> + <setting id="org.eclipse.jdt.core.formatter.alignment_for_parameters_in_method_declaration" value="16"/> + <setting id="org.eclipse.jdt.core.formatter.alignment_for_selector_in_method_invocation" value="16"/> + <setting id="org.eclipse.jdt.core.formatter.alignment_for_superclass_in_type_declaration" value="16"/> + <setting id="org.eclipse.jdt.core.formatter.alignment_for_superinterfaces_in_enum_declaration" value="16"/> + <setting id="org.eclipse.jdt.core.formatter.alignment_for_superinterfaces_in_type_declaration" value="16"/> + <setting 
id="org.eclipse.jdt.core.formatter.alignment_for_throws_clause_in_constructor_declaration" value="16"/> + <setting id="org.eclipse.jdt.core.formatter.alignment_for_throws_clause_in_method_declaration" value="16"/> + <setting id="org.eclipse.jdt.core.formatter.blank_lines_after_imports" value="1"/> + <setting id="org.eclipse.jdt.core.formatter.blank_lines_after_package" value="1"/> + <setting id="org.eclipse.jdt.core.formatter.blank_lines_before_field" value="0"/> + <setting id="org.eclipse.jdt.core.formatter.blank_lines_before_first_class_body_declaration" value="0"/> + <setting id="org.eclipse.jdt.core.formatter.blank_lines_before_imports" value="1"/> + <setting id="org.eclipse.jdt.core.formatter.blank_lines_before_member_type" value="1"/> + <setting id="org.eclipse.jdt.core.formatter.blank_lines_before_method" value="1"/> + <setting id="org.eclipse.jdt.core.formatter.blank_lines_before_new_chunk" value="1"/> + <setting id="org.eclipse.jdt.core.formatter.blank_lines_before_package" value="0"/> + <setting id="org.eclipse.jdt.core.formatter.blank_lines_between_type_declarations" value="1"/> + <setting id="org.eclipse.jdt.core.formatter.brace_position_for_annotation_type_declaration" value="end_of_line"/> + <setting id="org.eclipse.jdt.core.formatter.brace_position_for_anonymous_type_declaration" value="end_of_line"/> + <setting id="org.eclipse.jdt.core.formatter.brace_position_for_array_initializer" value="end_of_line"/> + <setting id="org.eclipse.jdt.core.formatter.brace_position_for_block" value="end_of_line"/> + <setting id="org.eclipse.jdt.core.formatter.brace_position_for_block_in_case" value="end_of_line"/> + <setting id="org.eclipse.jdt.core.formatter.brace_position_for_constructor_declaration" value="end_of_line"/> + <setting id="org.eclipse.jdt.core.formatter.brace_position_for_enum_constant" value="end_of_line"/> + <setting id="org.eclipse.jdt.core.formatter.brace_position_for_enum_declaration" value="end_of_line"/> + <setting 
id="org.eclipse.jdt.core.formatter.brace_position_for_method_declaration" value="end_of_line"/> + <setting id="org.eclipse.jdt.core.formatter.brace_position_for_switch" value="end_of_line"/> + <setting id="org.eclipse.jdt.core.formatter.brace_position_for_type_declaration" value="end_of_line"/> + <setting id="org.eclipse.jdt.core.formatter.comment.align_tags_names_descriptions" value="true"/> + <setting id="org.eclipse.jdt.core.formatter.comment.align_tags_descriptions_grouped" value="false"/> + <setting id="org.eclipse.jdt.core.formatter.comment.clear_blank_lines" value="false"/> + <setting id="org.eclipse.jdt.core.formatter.comment.format_block_comments" value="false"/> + <setting id="org.eclipse.jdt.core.formatter.comment.format_comments" value="true"/> + <setting id="org.eclipse.jdt.core.formatter.comment.format_header" value="false"/> + <setting id="org.eclipse.jdt.core.formatter.comment.format_html" value="true"/> + <setting id="org.eclipse.jdt.core.formatter.comment.format_javadoc_comments" value="true"/> + <setting id="org.eclipse.jdt.core.formatter.comment.format_line_comments" value="false"/> + <setting id="org.eclipse.jdt.core.formatter.comment.format_source_code" value="true"/> + <setting id="org.eclipse.jdt.core.formatter.comment.indent_parameter_description" value="false"/> + <setting id="org.eclipse.jdt.core.formatter.comment.indent_return_description" value="false"/> + <setting id="org.eclipse.jdt.core.formatter.comment.indent_root_tags" value="false"/> + <setting id="org.eclipse.jdt.core.formatter.comment.insert_new_line_before_root_tags" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.comment.insert_new_line_for_parameter" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.comment.line_length" value="120"/> + <setting id="org.eclipse.jdt.core.formatter.compact_else_if" value="true"/> + <setting id="org.eclipse.jdt.core.formatter.continuation_indentation" value="2"/> + <setting 
id="org.eclipse.jdt.core.formatter.continuation_indentation_for_array_initializer" value="2"/> + <setting id="org.eclipse.jdt.core.formatter.format_guardian_clause_on_one_line" value="false"/> + <setting id="org.eclipse.jdt.core.formatter.indent_body_declarations_compare_to_enum_constant_header" value="true"/> + <setting id="org.eclipse.jdt.core.formatter.indent_body_declarations_compare_to_enum_declaration_header" value="true"/> + <setting id="org.eclipse.jdt.core.formatter.indent_body_declarations_compare_to_type_header" value="true"/> + <setting id="org.eclipse.jdt.core.formatter.indent_breaks_compare_to_cases" value="true"/> + <setting id="org.eclipse.jdt.core.formatter.indent_statements_compare_to_block" value="true"/> + <setting id="org.eclipse.jdt.core.formatter.indent_statements_compare_to_body" value="true"/> + <setting id="org.eclipse.jdt.core.formatter.indent_switchstatements_compare_to_cases" value="true"/> + <setting id="org.eclipse.jdt.core.formatter.indent_switchstatements_compare_to_switch" value="false"/> + <setting id="org.eclipse.jdt.core.formatter.indentation.size" value="8"/> + <setting id="org.eclipse.jdt.core.formatter.insert_new_line_after_annotation" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_new_line_after_annotation_on_parameter" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_new_line_after_opening_brace_in_array_initializer" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_new_line_at_end_of_file_if_missing" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_new_line_before_catch_in_try_statement" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_new_line_before_closing_brace_in_array_initializer" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_new_line_before_else_in_if_statement" value="do not insert"/> + <setting 
id="org.eclipse.jdt.core.formatter.insert_new_line_before_finally_in_try_statement" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_new_line_before_while_in_do_statement" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_new_line_in_empty_anonymous_type_declaration" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_new_line_in_empty_block" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_new_line_in_empty_enum_constant" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_new_line_in_empty_enum_declaration" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_new_line_in_empty_method_body" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_new_line_in_empty_type_declaration" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_and_in_type_parameter" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_assignment_operator" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_at_in_annotation" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_at_in_annotation_type_declaration" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_binary_operator" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_closing_angle_bracket_in_type_arguments" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_closing_angle_bracket_in_type_parameters" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_closing_brace_in_block" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_closing_paren_in_cast" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_colon_in_assert" value="insert"/> + <setting 
id="org.eclipse.jdt.core.formatter.insert_space_after_colon_in_case" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_colon_in_conditional" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_colon_in_for" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_colon_in_labeled_statement" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_comma_in_allocation_expression" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_comma_in_annotation" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_comma_in_array_initializer" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_comma_in_constructor_declaration_parameters" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_comma_in_constructor_declaration_throws" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_comma_in_enum_constant_arguments" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_comma_in_enum_declarations" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_comma_in_explicitconstructorcall_arguments" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_comma_in_for_increments" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_comma_in_for_inits" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_comma_in_method_declaration_parameters" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_comma_in_method_declaration_throws" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_comma_in_method_invocation_arguments" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_comma_in_multiple_field_declarations" 
value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_comma_in_multiple_local_declarations" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_comma_in_parameterized_type_reference" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_comma_in_superinterfaces" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_comma_in_type_arguments" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_comma_in_type_parameters" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_ellipsis" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_opening_angle_bracket_in_parameterized_type_reference" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_opening_angle_bracket_in_type_arguments" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_opening_angle_bracket_in_type_parameters" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_opening_brace_in_array_initializer" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_opening_bracket_in_array_allocation_expression" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_opening_bracket_in_array_reference" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_opening_paren_in_annotation" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_opening_paren_in_cast" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_opening_paren_in_catch" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_opening_paren_in_constructor_declaration" value="do not insert"/> + <setting 
id="org.eclipse.jdt.core.formatter.insert_space_after_opening_paren_in_enum_constant" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_opening_paren_in_for" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_opening_paren_in_if" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_opening_paren_in_method_declaration" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_opening_paren_in_method_invocation" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_opening_paren_in_parenthesized_expression" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_opening_paren_in_switch" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_opening_paren_in_synchronized" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_opening_paren_in_while" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_postfix_operator" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_prefix_operator" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_question_in_conditional" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_question_in_wildcard" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_semicolon_in_for" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_after_unary_operator" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_and_in_type_parameter" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_assignment_operator" value="insert"/> + <setting 
id="org.eclipse.jdt.core.formatter.insert_space_before_at_in_annotation_type_declaration" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_binary_operator" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_closing_angle_bracket_in_parameterized_type_reference" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_closing_angle_bracket_in_type_arguments" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_closing_angle_bracket_in_type_parameters" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_closing_brace_in_array_initializer" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_closing_bracket_in_array_allocation_expression" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_closing_bracket_in_array_reference" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_closing_paren_in_annotation" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_closing_paren_in_cast" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_closing_paren_in_catch" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_closing_paren_in_constructor_declaration" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_closing_paren_in_enum_constant" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_closing_paren_in_for" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_closing_paren_in_if" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_closing_paren_in_method_declaration" value="do not insert"/> + <setting 
id="org.eclipse.jdt.core.formatter.insert_space_before_closing_paren_in_method_invocation" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_closing_paren_in_parenthesized_expression" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_closing_paren_in_switch" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_closing_paren_in_synchronized" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_closing_paren_in_while" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_colon_in_assert" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_colon_in_case" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_colon_in_conditional" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_colon_in_default" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_colon_in_for" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_colon_in_labeled_statement" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_comma_in_allocation_expression" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_comma_in_annotation" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_comma_in_array_initializer" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_comma_in_constructor_declaration_parameters" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_comma_in_constructor_declaration_throws" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_comma_in_enum_constant_arguments" value="do not insert"/> + <setting 
id="org.eclipse.jdt.core.formatter.insert_space_before_comma_in_enum_declarations" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_comma_in_explicitconstructorcall_arguments" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_comma_in_for_increments" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_comma_in_for_inits" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_comma_in_method_declaration_parameters" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_comma_in_method_declaration_throws" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_comma_in_method_invocation_arguments" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_comma_in_multiple_field_declarations" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_comma_in_multiple_local_declarations" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_comma_in_parameterized_type_reference" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_comma_in_superinterfaces" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_comma_in_type_arguments" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_comma_in_type_parameters" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_ellipsis" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_angle_bracket_in_parameterized_type_reference" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_angle_bracket_in_type_arguments" value="do not insert"/> + <setting 
id="org.eclipse.jdt.core.formatter.insert_space_before_opening_angle_bracket_in_type_parameters" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_brace_in_annotation_type_declaration" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_brace_in_anonymous_type_declaration" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_brace_in_array_initializer" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_brace_in_block" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_brace_in_constructor_declaration" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_brace_in_enum_constant" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_brace_in_enum_declaration" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_brace_in_method_declaration" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_brace_in_switch" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_brace_in_type_declaration" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_bracket_in_array_allocation_expression" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_bracket_in_array_reference" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_bracket_in_array_type_reference" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_paren_in_annotation" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_paren_in_annotation_type_member_declaration" value="do not insert"/> + <setting 
id="org.eclipse.jdt.core.formatter.insert_space_before_opening_paren_in_catch" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_paren_in_constructor_declaration" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_paren_in_enum_constant" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_paren_in_for" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_paren_in_if" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_paren_in_method_declaration" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_paren_in_method_invocation" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_paren_in_parenthesized_expression" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_paren_in_switch" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_paren_in_synchronized" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_opening_paren_in_while" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_postfix_operator" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_prefix_operator" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_question_in_conditional" value="insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_question_in_wildcard" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_semicolon" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_semicolon_in_for" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_before_unary_operator" 
value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_between_brackets_in_array_type_reference" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_between_empty_braces_in_array_initializer" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_between_empty_brackets_in_array_allocation_expression" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_between_empty_parens_in_annotation_type_member_declaration" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_between_empty_parens_in_constructor_declaration" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_between_empty_parens_in_enum_constant" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_between_empty_parens_in_method_declaration" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.insert_space_between_empty_parens_in_method_invocation" value="do not insert"/> + <setting id="org.eclipse.jdt.core.formatter.join_lines_in_comments" value="false"/> + <setting id="org.eclipse.jdt.core.formatter.join_wrapped_lines" value="false"/> + <setting id="org.eclipse.jdt.core.formatter.keep_else_statement_on_same_line" value="false"/> + <setting id="org.eclipse.jdt.core.formatter.keep_empty_array_initializer_on_one_line" value="false"/> + <setting id="org.eclipse.jdt.core.formatter.keep_imple_if_on_one_line" value="false"/> + <setting id="org.eclipse.jdt.core.formatter.keep_then_statement_on_same_line" value="false"/> + <setting id="org.eclipse.jdt.core.formatter.lineSplit" value="128"/> + <setting id="org.eclipse.jdt.core.formatter.number_of_blank_lines_at_beginning_of_method_body" value="0"/> + <setting id="org.eclipse.jdt.core.formatter.number_of_empty_lines_to_preserve" value="1"/> + <setting id="org.eclipse.jdt.core.formatter.put_empty_statement_on_new_line" value="true"/> + <setting 
id="org.eclipse.jdt.core.formatter.tabulation.char" value="space"/> + <setting id="org.eclipse.jdt.core.formatter.tabulation.size" value="4"/> + <setting id="org.eclipse.jdt.core.formatter.use_tabs_only_for_leading_indentations" value="false"/> + <setting id="org.eclipse.jdt.core.formatter.use_on_off_tags" value="true"/> + <setting id="org.eclipse.jdt.core.formatter.disabling_tag" value="CHECKSTYLE:OFF"/> + <setting id="org.eclipse.jdt.core.formatter.enabling_tag" value="CHECKSTYLE:ON"/> + </profile> +</profiles> diff --git a/data-extract-langchain4j/pom.xml b/data-extract-langchain4j/pom.xml new file mode 100644 index 0000000..576664d --- /dev/null +++ b/data-extract-langchain4j/pom.xml @@ -0,0 +1,326 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!-- + + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
+ +--> +<project xmlns="http://maven.apache.org/POM/4.0.0" + xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" + xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> + <modelVersion>4.0.0</modelVersion> + + <artifactId>camel-quarkus-examples-data-extract-langchain4j</artifactId> + <groupId>org.apache.camel.quarkus.examples</groupId> + <version>3.15.0-SNAPSHOT</version> + + <name>Camel Quarkus :: Examples :: Data Extract LangChain4j Repository</name> + <description>Camel Quarkus Example :: Data Extract LangChain4j Repository</description> + + <properties> + + <quarkus.platform.version>3.14.1</quarkus.platform.version> + <camel-quarkus.platform.version>3.15.0-SNAPSHOT</camel-quarkus.platform.version> + + <quarkus.platform.group-id>io.quarkus</quarkus.platform.group-id> + <quarkus.platform.artifact-id>quarkus-bom</quarkus.platform.artifact-id> + <camel-quarkus.platform.group-id>org.apache.camel.quarkus</camel-quarkus.platform.group-id> + <camel-quarkus.platform.artifact-id>camel-quarkus-bom</camel-quarkus.platform.artifact-id> + + <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> + <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding> + <maven.compiler.target>17</maven.compiler.target> + <maven.compiler.source>17</maven.compiler.source> + <maven.compiler.testTarget>${maven.compiler.target}</maven.compiler.testTarget> + <maven.compiler.testSource>${maven.compiler.source}</maven.compiler.testSource> + + <formatter-maven-plugin.version>2.24.1</formatter-maven-plugin.version> + <groovy-maven-plugin.version>2.1.1</groovy-maven-plugin.version> + <impsort-maven-plugin.version>1.11.0</impsort-maven-plugin.version> + <license-maven-plugin.version>4.5</license-maven-plugin.version> + <maven-compiler-plugin.version>3.13.0</maven-compiler-plugin.version> + <maven-jar-plugin.version>3.4.2</maven-jar-plugin.version> + <maven-resources-plugin.version>3.3.1</maven-resources-plugin.version> + 
<maven-surefire-plugin.version>3.4.0</maven-surefire-plugin.version> + <quarkus-langchain4j-version>0.17.2</quarkus-langchain4j-version> + <wiremock-version>3.9.1</wiremock-version> + </properties> + + <dependencyManagement> + <dependencies> + <!-- Import BOM --> + <dependency> + <groupId>io.quarkiverse.langchain4j</groupId> + <artifactId>quarkus-langchain4j-bom</artifactId> + <version>${quarkus-langchain4j-version}</version> + <type>pom</type> + <scope>import</scope> + </dependency> + <dependency> + <groupId>${quarkus.platform.group-id}</groupId> + <artifactId>${quarkus.platform.artifact-id}</artifactId> + <version>${quarkus.platform.version}</version> + <type>pom</type> + <scope>import</scope> + </dependency> + <dependency> + <groupId>${camel-quarkus.platform.group-id}</groupId> + <artifactId>${camel-quarkus.platform.artifact-id}</artifactId> + <version>${camel-quarkus.platform.version}</version> + <type>pom</type> + <scope>import</scope> + </dependency> + </dependencies> + </dependencyManagement> + + <dependencies> + <dependency> + <groupId>org.apache.camel.quarkus</groupId> + <artifactId>camel-quarkus-bean</artifactId> + </dependency> + <dependency> + <groupId>org.apache.camel.quarkus</groupId> + <artifactId>camel-quarkus-file</artifactId> + </dependency> + <dependency> + <groupId>org.apache.camel.quarkus</groupId> + <artifactId>camel-quarkus-jsonpath</artifactId> + </dependency> + <dependency> + <groupId>org.apache.camel.quarkus</groupId> + <artifactId>camel-quarkus-platform-http</artifactId> + </dependency> + <dependency> + <groupId>io.quarkiverse.langchain4j</groupId> + <artifactId>quarkus-langchain4j-ollama</artifactId> + </dependency> + + <!-- Test --> + <dependency> + <groupId>io.quarkus</groupId> + <artifactId>quarkus-junit5</artifactId> + <scope>test</scope> + </dependency> + <dependency> + <groupId>org.awaitility</groupId> + <artifactId>awaitility</artifactId> + <scope>test</scope> + </dependency> + <dependency> + <groupId>io.rest-assured</groupId> + 
<artifactId>rest-assured</artifactId> + <scope>test</scope> + </dependency> + <dependency> + <groupId>org.testcontainers</groupId> + <artifactId>testcontainers</artifactId> + <scope>test</scope> + </dependency> + <dependency> + <groupId>org.wiremock</groupId> + <artifactId>wiremock-standalone</artifactId> + <version>${wiremock-version}</version> + <scope>test</scope> + </dependency> + </dependencies> + + <build> + <pluginManagement> + <plugins> + + <plugin> + <groupId>net.revelc.code.formatter</groupId> + <artifactId>formatter-maven-plugin</artifactId> + <version>${formatter-maven-plugin.version}</version> + <configuration> + <configFile>${maven.multiModuleProjectDirectory}/eclipse-formatter-config.xml</configFile> + <lineEnding>LF</lineEnding> + </configuration> + </plugin> + + <plugin> + <groupId>net.revelc.code</groupId> + <artifactId>impsort-maven-plugin</artifactId> + <version>${impsort-maven-plugin.version}</version> + <configuration> + <groups>java.,javax.,org.w3c.,org.xml.,junit.</groups> + <removeUnused>true</removeUnused> + <staticAfter>true</staticAfter> + <staticGroups>java.,javax.,org.w3c.,org.xml.,junit.</staticGroups> + </configuration> + </plugin> + + <plugin> + <groupId>org.apache.maven.plugins</groupId> + <artifactId>maven-compiler-plugin</artifactId> + <version>${maven-compiler-plugin.version}</version> + <configuration> + <showDeprecation>true</showDeprecation> + <showWarnings>true</showWarnings> + <compilerArgs> + <!-- Specifying -parameters is optional, see CustomPojoExtractionService.extractFromText javadoc --> + <arg>-parameters</arg> + <arg>-Xlint:unchecked</arg> + </compilerArgs> + </configuration> + </plugin> + + <plugin> + <groupId>org.apache.maven.plugins</groupId> + <artifactId>maven-surefire-plugin</artifactId> + <version>${maven-surefire-plugin.version}</version> + <configuration> + <failIfNoTests>false</failIfNoTests> + <systemPropertyVariables> + <java.util.logging.manager>org.jboss.logmanager.LogManager</java.util.logging.manager> 
+ </systemPropertyVariables> + </configuration> + </plugin> + + <plugin> + <groupId>${quarkus.platform.group-id}</groupId> + <artifactId>quarkus-maven-plugin</artifactId> + <version>${quarkus.platform.version}</version> + </plugin> + + <plugin> + <groupId>org.apache.maven.plugins</groupId> + <artifactId>maven-failsafe-plugin</artifactId> + <version>${maven-surefire-plugin.version}</version> + </plugin> + + <plugin> + <groupId>org.apache.maven.plugins</groupId> + <artifactId>maven-jar-plugin</artifactId> + <version>${maven-jar-plugin.version}</version> + </plugin> + + <plugin> + <groupId>com.mycila</groupId> + <artifactId>license-maven-plugin</artifactId> + <version>${license-maven-plugin.version}</version> + <configuration> + <failIfUnknown>true</failIfUnknown> + <header>${maven.multiModuleProjectDirectory}/header.txt</header> + <excludes> + <exclude>**/*.adoc</exclude> + <exclude>**/*.odp</exclude> + <exclude>**/*.txt</exclude> + <exclude>**/LICENSE.txt</exclude> + <exclude>**/LICENSE</exclude> + <exclude>**/NOTICE.txt</exclude> + <exclude>**/NOTICE</exclude> + <exclude>**/README</exclude> + <exclude>**/pom.xml.versionsBackup</exclude> + </excludes> + <mapping> + <java>SLASHSTAR_STYLE</java> + <properties>CAMEL_PROPERTIES_STYLE</properties> + <kt>SLASHSTAR_STYLE</kt> + </mapping> + <headerDefinitions> + <headerDefinition>${maven.multiModuleProjectDirectory}/license-properties-headerdefinition.xml</headerDefinition> + </headerDefinitions> + </configuration> + </plugin> + </plugins> + </pluginManagement> + + <plugins> + <plugin> + <groupId>${quarkus.platform.group-id}</groupId> + <artifactId>quarkus-maven-plugin</artifactId> + <executions> + <execution> + <id>build</id> + <goals> + <goal>build</goal> + </goals> + </execution> + </executions> + </plugin> + + <plugin> + <groupId>net.revelc.code.formatter</groupId> + <artifactId>formatter-maven-plugin</artifactId> + <executions> + <execution> + <id>format</id> + <goals> + <goal>format</goal> + </goals> + 
<phase>process-sources</phase> + </execution> + </executions> + </plugin> + + <plugin> + <groupId>net.revelc.code</groupId> + <artifactId>impsort-maven-plugin</artifactId> + <executions> + <execution> + <id>sort-imports</id> + <goals> + <goal>sort</goal> + </goals> + <phase>process-sources</phase> + </execution> + </executions> + </plugin> + </plugins> + </build> + + <profiles> + <profile> + <id>native</id> + <activation> + <property> + <name>native</name> + </property> + </activation> + <properties> + <quarkus.native.enabled>true</quarkus.native.enabled> + </properties> + <build> + <plugins> + <plugin> + <groupId>org.apache.maven.plugins</groupId> + <artifactId>maven-failsafe-plugin</artifactId> + <executions> + <execution> + <goals> + <goal>integration-test</goal> + <goal>verify</goal> + </goals> + </execution> + </executions> + </plugin> + </plugins> + </build> + </profile> + <profile> + <id>skip-testcontainers-tests</id> + <activation> + <property> + <name>skip-testcontainers-tests</name> + </property> + </activation> + <properties> + <skipTests>true</skipTests> + </properties> + </profile> + </profiles> + +</project> diff --git a/data-extract-langchain4j/src/main/java/org/acme/extraction/CustomPojoExtractionService.java b/data-extract-langchain4j/src/main/java/org/acme/extraction/CustomPojoExtractionService.java new file mode 100644 index 0000000..098e146 --- /dev/null +++ b/data-extract-langchain4j/src/main/java/org/acme/extraction/CustomPojoExtractionService.java @@ -0,0 +1,64 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.acme.extraction; + +import java.time.LocalDate; + +import dev.langchain4j.service.UserMessage; +import io.quarkiverse.langchain4j.RegisterAiService; +import io.quarkus.runtime.annotations.RegisterForReflection; +import jakarta.enterprise.context.ApplicationScoped; +import org.apache.camel.Handler; + +@RegisterAiService +@ApplicationScoped +public interface CustomPojoExtractionService { + + @RegisterForReflection + static class CustomPojo { + public boolean customerSatisfied; + public String customerName; + public LocalDate customerBirthday; + public String summary; + + private final static String FORMAT = "\n{\n" + + "\t\"customerSatisfied\": \"%s\",\n" + + "\t\"customerName\": \"%s\",\n" + + "\t\"customerBirthday\": \"%td %tB %tY\",\n" + + "\t\"summary\": \"%s\"\n" + + "}\n"; + + public String toString() { + return String.format(FORMAT, this.customerSatisfied, this.customerName, this.customerBirthday, + this.customerBirthday, this.customerBirthday, this.summary); + } + } + + static final String CUSTOM_POJO_EXTRACT_PROMPT = "Extract information about a customer from the text delimited by triple backticks: ```{text}```." + + "The customerBirthday field should be formatted as YYYY-MM-DD." + + "The summary field should concisely relate the customer main ask."; + + /** + * The text parameter of this method is automatically injected as {text} in the CUSTOM_POJO_EXTRACT_PROMPT. This + * is made possible as the code is compiled with -parameters argument in the maven-compiler-plugin related section + * of the pom.xml file. 
Without -parameters, one would need to use the @V annotation like in the method signature + * proposed below: extractFromText(@dev.langchain4j.service.V("text") String text); + */ + @UserMessage(CUSTOM_POJO_EXTRACT_PROMPT) + @Handler + CustomPojo extractFromText(String text); +} diff --git a/data-extract-langchain4j/src/main/java/org/acme/extraction/CustomPojoStore.java b/data-extract-langchain4j/src/main/java/org/acme/extraction/CustomPojoStore.java new file mode 100644 index 0000000..ee031f9 --- /dev/null +++ b/data-extract-langchain4j/src/main/java/org/acme/extraction/CustomPojoStore.java @@ -0,0 +1,50 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.acme.extraction; + +import java.util.List; +import java.util.concurrent.CopyOnWriteArrayList; +import java.util.stream.Collectors; + +import jakarta.enterprise.context.ApplicationScoped; +import org.acme.extraction.CustomPojoExtractionService.CustomPojo; +import org.apache.camel.Handler; +import org.jboss.logging.Logger; + +@ApplicationScoped +public class CustomPojoStore { + + private static final Logger LOG = Logger.getLogger(CustomPojoStore.class); + + private List<CustomPojo> pojos = new CopyOnWriteArrayList<>(); + + @Handler + CustomPojo addPojo(CustomPojo pojo) { + LOG.info("An extracted POJO has been added to the store: " + pojo); + pojos.add(pojo); + return pojo; + } + + String asString() { + StringBuilder sb = new StringBuilder("{ \"pojos\": ["); + String pojoString = pojos.stream().map(CustomPojo::toString).collect(Collectors.joining(",")); + sb.append(pojoString); + sb.append("] }"); + return sb.toString(); + } + +} diff --git a/data-extract-langchain4j/src/main/java/org/acme/extraction/Routes.java b/data-extract-langchain4j/src/main/java/org/acme/extraction/Routes.java new file mode 100644 index 0000000..68980be --- /dev/null +++ b/data-extract-langchain4j/src/main/java/org/acme/extraction/Routes.java @@ -0,0 +1,49 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.acme.extraction; + +import jakarta.enterprise.context.ApplicationScoped; +import jakarta.inject.Inject; +import org.apache.camel.builder.RouteBuilder; + +@ApplicationScoped +public class Routes extends RouteBuilder { + + @Inject + CustomPojoExtractionService customPojoExtractionService; + + @Inject + CustomPojoStore customPojoStore; + + @Override + public void configure() { + + // Consumes file documents that contain conversation transcripts (JSON format) + from("file:target/transcripts") + .log("A document has been received by the camel-quarkus-file extension: ${body}") + // Retrieves the conversation content from the JSON field named "content" + .setBody(jsonpath("$.content")) + // The CustomPojoExtractionService transforms the conversation transcript into a CustomPojoExtractionService.CustomPojo + .bean(customPojoExtractionService) + // Stores the extracted CustomPojoExtractionService.CustomPojo objects into the CustomPojoStore for later inspection + .bean(customPojoStore); + + // This route makes it possible to inspect the extracted POJOs, mainly used for demo and test + from("platform-http:/custom-pojo-store?produces=application/json") + .bean(customPojoStore, "asString"); + } +} diff --git a/data-extract-langchain4j/src/main/resources/application.properties b/data-extract-langchain4j/src/main/resources/application.properties new file mode 100644 index 0000000..633c2db --- /dev/null +++ b/data-extract-langchain4j/src/main/resources/application.properties @@ -0,0 +1,36 @@ +## --------------------------------------------------------------------------- +## Licensed to the Apache Software Foundation (ASF) under one or more +## contributor license agreements. See the NOTICE file distributed with +## this work for additional information regarding copyright ownership. 
+## The ASF licenses this file to You under the Apache License, Version 2.0 +## (the "License"); you may not use this file except in compliance with +## the License. You may obtain a copy of the License at +## +## http://www.apache.org/licenses/LICENSE-2.0 +## +## Unless required by applicable law or agreed to in writing, software +## distributed under the License is distributed on an "AS IS" BASIS, +## WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +## See the License for the specific language governing permissions and +## limitations under the License. +## --------------------------------------------------------------------------- + +quarkus.banner.enabled = false + +# Configure Quarkus LangChain4j, which handles interactions with the Large Language Model +quarkus.langchain4j.ollama.base-url = http://localhost:11434 +quarkus.langchain4j.ollama.timeout = 3m +quarkus.langchain4j.ollama.chat-model.model-id = codellama +quarkus.langchain4j.ollama.chat-model.format = json +quarkus.langchain4j.ollama.chat-model.temperature = 0 +# Uncomment lines below to log Ollama client requests and responses +#quarkus.langchain4j.ollama.log-requests=true +#quarkus.langchain4j.ollama.log-responses=true + +# Or uncomment lines below to log HTTP traffic between LangChain4j & the LLM API +#quarkus.rest-client.logging.scope=request-response +#quarkus.rest-client.logging.body-limit=10000 +#quarkus.log.category."org.jboss.resteasy.reactive.client.logging".level=DEBUG + +# Configure Quarkus LangChain4j to keep a single message in memory, forgetting about previous data extractions +quarkus.langchain4j.chat-memory.memory-window.max-messages = 1 diff --git a/data-extract-langchain4j/src/test/java/org/acme/extraction/OllamaTestResource.java b/data-extract-langchain4j/src/test/java/org/acme/extraction/OllamaTestResource.java new file mode 100644 index 0000000..2a9489e --- /dev/null +++ b/data-extract-langchain4j/src/test/java/org/acme/extraction/OllamaTestResource.java @@ 
-0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.acme.extraction; + +import java.util.Arrays; +import java.util.Map; + +import com.github.tomakehurst.wiremock.WireMockServer; +import io.quarkus.test.common.QuarkusTestResourceLifecycleManager; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; +import org.testcontainers.containers.GenericContainer; +import org.testcontainers.containers.output.Slf4jLogConsumer; +import org.testcontainers.containers.wait.strategy.Wait; + +import static java.lang.String.format; + +public class OllamaTestResource implements QuarkusTestResourceLifecycleManager { + + private static final Logger LOG = LoggerFactory.getLogger(OllamaTestResource.class); + + // LangChain4j offers only latest tag for ollama-codellama, hence we do use latest until more tags are introduced + private static final String OLLAMA_IMAGE = "langchain4j/ollama-codellama:latest"; + private static final int OLLAMA_SERVER_PORT = 11434; + + private GenericContainer<?> ollamaContainer; + + private WireMockServer wireMockServer; + private String baseUrl; + + private static final String MODE_MOCK = "mock"; + private static final String MODE_RECORDING = "record"; + private static final String 
MODE_CONTAINER = "container"; + + /** + * The testMode value could be defined, for instance by invoking: + * mvn clean test -DtestMode=mock. + * + * With the default value "mock", the LLM is faked based on the last recorded run. + * With the value "record", tests are run against a containerized LLM while the HTTP interactions are recorded. + * With the value "container" tests are run against a containerized LLM without recording. + * With any other value, an IllegalArgumentException is thrown. + */ + private boolean isMockMode; + private boolean isRecordingMode; + private boolean isContainerMode; + + private static final String BASE_URL_FORMAT = "http://%s:%s"; + + @Override + public Map<String, String> start() { + + // Check the test running mode + String testMode = System.getProperty("testMode", MODE_MOCK); + isMockMode = MODE_MOCK.equals(testMode); + isRecordingMode = MODE_RECORDING.equals(testMode); + isContainerMode = MODE_CONTAINER.equals(testMode); + if (!isMockMode && !isRecordingMode && !isContainerMode) { + throw new IllegalArgumentException( + "testMode value should be one of " + Arrays.asList(MODE_MOCK, MODE_RECORDING, MODE_CONTAINER)); + } + + if (isMockMode) { + LOG.info("Starting a fake Ollama server backed by wiremock"); + initWireMockServer(); + } else { + LOG.info("Starting an Ollama server backed by testcontainers"); + ollamaContainer = new GenericContainer<>(OLLAMA_IMAGE) + .withExposedPorts(OLLAMA_SERVER_PORT) + .withLogConsumer(new Slf4jLogConsumer(LOG).withPrefix("basicAuthContainer")) + .waitingFor(Wait.forLogMessage(".* msg=\"inference compute\" .*", 1)); + ollamaContainer.start(); + + baseUrl = format(BASE_URL_FORMAT, ollamaContainer.getHost(), ollamaContainer.getMappedPort(OLLAMA_SERVER_PORT)); + + if (isRecordingMode) { + LOG.info("Recording interactions with the Ollama server backed by testcontainers"); + initWireMockServer(); + } + } + + return Map.of("quarkus.langchain4j.ollama.base-url", baseUrl); + } + + private void 
initWireMockServer() { + wireMockServer = new WireMockServer(); + wireMockServer.start(); + if (isRecordingMode) { + wireMockServer.resetMappings(); + wireMockServer.startRecording(baseUrl); + } + baseUrl = format(BASE_URL_FORMAT, "localhost", wireMockServer.port()); + } + + @Override + public void stop() { + try { + if (ollamaContainer != null) { + ollamaContainer.stop(); + } + } catch (Exception ex) { + LOG.error("An issue occurred while stopping " + ollamaContainer.getNetworkAliases(), ex); + } + + if (isMockMode) { + wireMockServer.stop(); + } else if (isRecordingMode) { + wireMockServer.stopRecording(); + wireMockServer.saveMappings(); + } + } +} diff --git a/data-extract-langchain4j/src/test/java/org/acme/extraction/RouteIT.java b/data-extract-langchain4j/src/test/java/org/acme/extraction/RouteIT.java new file mode 100644 index 0000000..56cf33e --- /dev/null +++ b/data-extract-langchain4j/src/test/java/org/acme/extraction/RouteIT.java @@ -0,0 +1,24 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.acme.extraction; + +import io.quarkus.test.junit.QuarkusIntegrationTest; + +@QuarkusIntegrationTest +public class RouteIT extends RouteTest { + +} diff --git a/data-extract-langchain4j/src/test/java/org/acme/extraction/RouteTest.java b/data-extract-langchain4j/src/test/java/org/acme/extraction/RouteTest.java new file mode 100644 index 0000000..fe466c4 --- /dev/null +++ b/data-extract-langchain4j/src/test/java/org/acme/extraction/RouteTest.java @@ -0,0 +1,84 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */
+package org.acme.extraction;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Paths;
+import java.nio.file.StandardCopyOption;
+import java.util.concurrent.TimeUnit;
+
+import io.quarkus.test.common.QuarkusTestResource;
+import io.quarkus.test.junit.QuarkusTest;
+import io.restassured.http.ContentType;
+import org.apache.commons.io.FileUtils;
+import org.junit.jupiter.api.Test;
+
+import static io.restassured.RestAssured.given;
+import static org.awaitility.Awaitility.await;
+import static org.hamcrest.Matchers.empty;
+import static org.hamcrest.Matchers.is;
+import static org.hamcrest.Matchers.not;
+
+@QuarkusTestResource(OllamaTestResource.class)
+@QuarkusTest
+public class RouteTest {
+
+    @Test
+    void unstructuredFileTranscriptsAreTransformedToPojos() throws IOException {
+
+        FileUtils.deleteQuietly(new File("target/transcripts-tmp/"));
+        FileUtils.deleteQuietly(new File("target/transcripts/"));
+
+        FileUtils.copyDirectory(new File("src/test/resources/transcripts/"), new File("target/transcripts-tmp/"));
+        Files.move(Paths.get("target/transcripts-tmp"), Paths.get("target/transcripts"), StandardCopyOption.ATOMIC_MOVE);
+
+        await().pollInterval(200, TimeUnit.MILLISECONDS).atMost(3, TimeUnit.MINUTES)
+                .until(
+                        () -> {
+                            return given()
+                                    .contentType(ContentType.JSON)
+                                    .when()
+                                    .get("/custom-pojo-store")
+                                    .path("pojos.size()").equals(3);
+                        });
+
+        given()
+                .contentType(ContentType.JSON)
+                .when()
+                .get("/custom-pojo-store")
+                .then()
+                .statusCode(200)
+                // Assert values of the first extracted POJO
+                .body("pojos[0].customerSatisfied", is("true"))
+                .body("pojos[0].customerName", is("Sarah London"))
+                .body("pojos[0].customerBirthday", is("10 July 1986"))
+                .body("pojos[0].summary", not(empty()))
+                // Assert values of the second extracted POJO
+                .body("pojos[1].customerSatisfied", is("false"))
+                .body("pojos[1].customerName", is("John Doe"))
+                .body("pojos[1].customerBirthday", is("01 November 2001"))
+                .body("pojos[1].summary", not(empty()))
+                // Assert values of the third extracted POJO
+                .body("pojos[2].customerSatisfied", is("true"))
+                .body("pojos[2].customerName", is("Kate Boss"))
+                .body("pojos[2].customerBirthday", is("13 August 1999"))
+                .body("pojos[2].summary", not(empty()));
+    }
+
+}
diff --git a/data-extract-langchain4j/src/test/resources/mappings/api_chat-495066d1-9278-4e4b-b8f6-a1c2fa296779.json b/data-extract-langchain4j/src/test/resources/mappings/api_chat-495066d1-9278-4e4b-b8f6-a1c2fa296779.json
new file mode 100644
index 0000000..ba1c2d3
--- /dev/null
+++ b/data-extract-langchain4j/src/test/resources/mappings/api_chat-495066d1-9278-4e4b-b8f6-a1c2fa296779.json
@@ -0,0 +1,24 @@
+{
+  "id" : "495066d1-9278-4e4b-b8f6-a1c2fa296779",
+  "name" : "api_chat",
+  "request" : {
+    "url" : "/api/chat",
+    "method" : "POST",
+    "bodyPatterns" : [ {
+      "equalToJson" : "{\n \"model\" : \"codellama\",\n \"messages\" : [ {\n \"role\" : \"assistant\",\n \"content\" : \"{\\n\\\"customerSatisfied\\\": false,\\n\\\"customerName\\\": \\\"John Doe\\\",\\n\\\"customerBirthday\\\": \\\"2001-11-01\\\",\\n\\\"summary\\\": \\\"Insurance company failed to notify customer of automatic cancellation of full reimbursement option and only provided half reimbursement for accident.\\\"\\n}\"\n }, {\n \"role\" : \"user\",\n \"content\" : [...]
+      "ignoreArrayOrder" : true,
+      "ignoreExtraElements" : true
+    } ]
+  },
+  "response" : {
+    "status" : 200,
+    "body" : "{\"model\":\"codellama\",\"created_at\":\"2024-08-28T16:54:19.439835677Z\",\"message\":{\"role\":\"assistant\",\"content\":\"{\\n\\\"customerSatisfied\\\": true,\\n\\\"customerName\\\": \\\"Kate Boss\\\",\\n\\\"customerBirthday\\\": \\\"1999-08-13\\\",\\n\\\"summary\\\": \\\"Customer was unable to find their insurance contract and had to go through a process of updating their name on the contract. The operator provided assistance throughout the process, ensuring that the cu [...]
+    "headers" : {
+      "Date" : "Wed, 28 Aug 2024 16:54:19 GMT",
+      "Content-Type" : "application/json; charset=utf-8"
+    }
+  },
+  "uuid" : "495066d1-9278-4e4b-b8f6-a1c2fa296779",
+  "persistent" : true,
+  "insertionIndex" : 4
+}
\ No newline at end of file
diff --git a/data-extract-langchain4j/src/test/resources/mappings/api_chat-52961581-e5b1-4309-b62f-1c6e1e0008eb.json b/data-extract-langchain4j/src/test/resources/mappings/api_chat-52961581-e5b1-4309-b62f-1c6e1e0008eb.json
new file mode 100644
index 0000000..9a626a0
--- /dev/null
+++ b/data-extract-langchain4j/src/test/resources/mappings/api_chat-52961581-e5b1-4309-b62f-1c6e1e0008eb.json
@@ -0,0 +1,24 @@
+{
+  "id" : "52961581-e5b1-4309-b62f-1c6e1e0008eb",
+  "name" : "api_chat",
+  "request" : {
+    "url" : "/api/chat",
+    "method" : "POST",
+    "bodyPatterns" : [ {
+      "equalToJson" : "{\n \"model\" : \"codellama\",\n \"messages\" : [ {\n \"role\" : \"user\",\n \"content\" : \"Extract information about a customer from the text delimited by triple backticks: ```Operator: Hello, how may I help you ?\\nCustomer: Hello, I'm calling because I need to declare an accident on my main vehicle.\\nOperator: Ok, can you please give me your name ?\\nCustomer: My name is Sarah London.\\nOperator: Could you please give me your birth date ?\\nCustomer: 1 [...]
+      "ignoreArrayOrder" : true,
+      "ignoreExtraElements" : true
+    } ]
+  },
+  "response" : {
+    "status" : 200,
+    "body" : "{\"model\":\"codellama\",\"created_at\":\"2024-08-28T16:52:55.508750951Z\",\"message\":{\"role\":\"assistant\",\"content\":\"{\\n\\\"customerSatisfied\\\": true,\\n\\\"customerName\\\": \\\"Sarah London\\\",\\n\\\"customerBirthday\\\": \\\"1986-07-10\\\",\\n\\\"summary\\\": \\\"Declare an accident on main vehicle and receive reimbursement for expenses.\\\"\\n}\"},\"done_reason\":\"stop\",\"done\":true,\"total_duration\":22737688834,\"load_duration\":848753776,\"prompt_eval_ [...]
+    "headers" : {
+      "Date" : "Wed, 28 Aug 2024 16:52:55 GMT",
+      "Content-Type" : "application/json; charset=utf-8"
+    }
+  },
+  "uuid" : "52961581-e5b1-4309-b62f-1c6e1e0008eb",
+  "persistent" : true,
+  "insertionIndex" : 6
+}
\ No newline at end of file
diff --git a/data-extract-langchain4j/src/test/resources/mappings/api_chat-5c1f926c-0480-41e9-9ca7-a93e17919e99.json b/data-extract-langchain4j/src/test/resources/mappings/api_chat-5c1f926c-0480-41e9-9ca7-a93e17919e99.json
new file mode 100644
index 0000000..9ed4cfe
--- /dev/null
+++ b/data-extract-langchain4j/src/test/resources/mappings/api_chat-5c1f926c-0480-41e9-9ca7-a93e17919e99.json
@@ -0,0 +1,24 @@
+{
+  "id" : "5c1f926c-0480-41e9-9ca7-a93e17919e99",
+  "name" : "api_chat",
+  "request" : {
+    "url" : "/api/chat",
+    "method" : "POST",
+    "bodyPatterns" : [ {
+      "equalToJson" : "{\n \"model\" : \"codellama\",\n \"messages\" : [ {\n \"role\" : \"assistant\",\n \"content\" : \"{\\n\\\"customerSatisfied\\\": true,\\n\\\"customerName\\\": \\\"Sarah London\\\",\\n\\\"customerBirthday\\\": \\\"1986-07-10\\\",\\n\\\"summary\\\": \\\"Declare an accident on main vehicle and receive reimbursement for expenses.\\\"\\n}\"\n }, {\n \"role\" : \"user\",\n \"content\" : \"Extract information about a customer from the text delimited by trip [...]
+      "ignoreArrayOrder" : true,
+      "ignoreExtraElements" : true
+    } ]
+  },
+  "response" : {
+    "status" : 200,
+    "body" : "{\"model\":\"codellama\",\"created_at\":\"2024-08-28T16:53:37.033889726Z\",\"message\":{\"role\":\"assistant\",\"content\":\"{\\n\\\"customerSatisfied\\\": false,\\n\\\"customerName\\\": \\\"John Doe\\\",\\n\\\"customerBirthday\\\": \\\"2001-11-01\\\",\\n\\\"summary\\\": \\\"Insurance company failed to notify customer of automatic cancellation of full reimbursement option and only provided half reimbursement for accident.\\\"\\n}\"},\"done_reason\":\"stop\",\"done\":true,\" [...]
+    "headers" : {
+      "Date" : "Wed, 28 Aug 2024 16:53:37 GMT",
+      "Content-Type" : "application/json; charset=utf-8"
+    }
+  },
+  "uuid" : "5c1f926c-0480-41e9-9ca7-a93e17919e99",
+  "persistent" : true,
+  "insertionIndex" : 5
+}
\ No newline at end of file
diff --git a/data-extract-langchain4j/src/test/resources/transcripts/01_sarah-london-10-07-1986-satisfied.json b/data-extract-langchain4j/src/test/resources/transcripts/01_sarah-london-10-07-1986-satisfied.json
new file mode 100644
index 0000000..d4dab11
--- /dev/null
+++ b/data-extract-langchain4j/src/test/resources/transcripts/01_sarah-london-10-07-1986-satisfied.json
@@ -0,0 +1,4 @@
+{
+  "id": 1,
+  "content": "Operator: Hello, how may I help you ?\nCustomer: Hello, I'm calling because I need to declare an accident on my main vehicle.\nOperator: Ok, can you please give me your name ?\nCustomer: My name is Sarah London.\nOperator: Could you please give me your birth date ?\nCustomer: 1986, July the 10th.\nOperator: Ok, I've got your contract and I'm happy to share with you that we'll be able to reimburse all expenses linked to this accident.\nCustomer: Oh great, many thanks."
+}
\ No newline at end of file
diff --git a/data-extract-langchain4j/src/test/resources/transcripts/02_john-doe-01-11-2001-unsatisfied.json b/data-extract-langchain4j/src/test/resources/transcripts/02_john-doe-01-11-2001-unsatisfied.json
new file mode 100644
index 0000000..b7f2d7d
--- /dev/null
+++ b/data-extract-langchain4j/src/test/resources/transcripts/02_john-doe-01-11-2001-unsatisfied.json
@@ -0,0 +1,4 @@
+{
+  "id": 2,
+  "content": "Operator: Hello, how may I help you ?\nCustomer: Hello, I'm John. I need to share a problem with you. Actually, the insurance has reimbursed only half the money I have spent due to the accident.\nOperator: Hello John, could you please give me your last name so that I can find your contract.\nCustomer: Sure, my surname is Doe.\nOperator: And last thing, I need to know the date you were born.\nCustomer: Yes, so I was born in 2001, actually during the first day of November.\nO [...]
+}
\ No newline at end of file
diff --git a/data-extract-langchain4j/src/test/resources/transcripts/03_kate-boss-13-08-1999-satisfied.json b/data-extract-langchain4j/src/test/resources/transcripts/03_kate-boss-13-08-1999-satisfied.json
new file mode 100644
index 0000000..af65759
--- /dev/null
+++ b/data-extract-langchain4j/src/test/resources/transcripts/03_kate-boss-13-08-1999-satisfied.json
@@ -0,0 +1,4 @@
+{
+  "id": 3,
+  "content": "Operator: Hello, how may I help you?\nCustomer: Hello, I am currently at the police station because I've got an accident. The police would need a proof that I have an insurance. Could you please help me?\nOperator: Sure, could you please remind me your name and birth date?\nCustomer: Of course, my name is Kate Hart and I was born on August the thirteen in the year nineteen ninety nine.\nOperator: I'm sorry Kate, but we don't have any contract in our records.\nCustomer: Oh, [...]
+}
\ No newline at end of file
diff --git a/docs/modules/ROOT/attachments/examples.json b/docs/modules/ROOT/attachments/examples.json
index 41b6c4f..758b9b8 100644
--- a/docs/modules/ROOT/attachments/examples.json
+++ b/docs/modules/ROOT/attachments/examples.json
@@ -114,6 +114,11 @@
     "description": "Shows how to define a Camel route in XML for tokenizing a CSV a file.",
     "link": "https://github.com/apache/camel-quarkus-examples/tree/main/file-split-log-xml"
   },
+  {
+    "title": "Unstructured Data Extraction with LangChain4j",
+    "description": "Shows how to convert unstructured text data to structured Java objects helped with a Large Language Model and LangChain4j",
+    "link": "https://github.com/apache/camel-quarkus-examples/tree/main/data-extract-langchain4j"
+  },
   {
     "title": "Vertx-Websocket Chat",
     "description": "Shows how to configure a WebSocket server and interact with connected peers.",