Extracting Information from Documents Using a Custom Agent in Oracle AI Agent Studio

Oracle AI Agent Studio allows you to create custom AI agents capable of reading unstructured documents and extracting relevant data directly into a structured format.

This article explains how to enable file attachments, configure the necessary tools, and format the extracted data into JSON.

Step 1: Enable the File Upload Option in the Agent Team

Before an agent can process a document, you must provide a way for users to upload files.

  1. Navigate to the Chat Experience tab while creating or editing your AI Agent team.
  2. Locate the Enable file upload setting.
  3. Toggle the option to On. This allows users to attach documents directly within the chat interface.


Step 2: Add the Multi-File Processor Tool

Create a custom agent, then follow these steps:
  1. Go to the agent configuration settings.
  2. Add the Oracle standard tool "Multi File Processor" to the agent.
  3. This tool processes the uploaded document and makes its textual content accessible to the agent's language model.



Step 3: Define the Agent Persona and Instructions

A well-defined role ensures the agent extracts data accurately and consistently.

  1. Configure the agent's Persona and Role to match the business requirements (for example, a "Purchase Order Data Extraction Specialist").
  2. Write a clear, concise system prompt detailing what data points need to be collected.
  3. Test the agent and refine the instructions if the initial results do not capture all the required information.
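
As an illustration of step 2, the sketch below assembles a system prompt from a list of data points. It is only one possible wording, not text prescribed by Oracle AI Agent Studio. The helper name build_extraction_prompt is hypothetical, and the field names are borrowed from the sample schema in Step 4. The resulting string would be pasted into the agent's instructions by hand.

```python
# Hypothetical helper: builds a system prompt from a list of field names.
# The field names mirror the sample schema in Step 4; the wording is illustrative only.

FIELDS = [
    "purchase_order_number",
    "date",
    "customer_number",
    "supplier",
    "ship_to",
    "line_items",
    "total_quantity",
    "total_price",
]


def build_extraction_prompt(role: str, fields: list[str]) -> str:
    """Return a system prompt that names the role and lists every field to extract."""
    field_lines = "\n".join(f"- {name}" for name in fields)
    return (
        f"You are a {role}.\n"
        "Read the attached document and extract the following fields "
        "for each purchase order:\n"
        f"{field_lines}\n"
        "If a field is not present in the document, return null for it. "
        "Do not guess or invent values."
    )


prompt = build_extraction_prompt("Purchase Order Data Extraction Specialist", FIELDS)
print(prompt)
```

Listing every field explicitly and stating what to do when a value is missing tends to make test results easier to compare between prompt revisions.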


Step 4: Configure the JSON Output Schema

To seamlessly integrate the extracted data with downstream applications or databases, the output must be structured.

  1. Define a strict JSON schema within the agent configuration.
  2. Specify the exact keys, data types (such as strings, numbers, or arrays), and mandatory fields you expect in the final output.
  3. This ensures that the agent translates unstructured document text into a clean, predictable JSON object every time.


Sample JSON schema:


{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "purchase_orders": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "purchase_order_number": { "type": ["string", "null"] },
          "date": { "type": ["string", "null"], "format": "date" },
          "customer_number": { "type": ["string", "null"] },
          "supplier": {
            "type": ["object", "null"],
            "properties": {
              "name": { "type": ["string", "null"] },
              "address_box": { "type": ["string", "null"] },
              "city": { "type": ["string", "null"] },
              "province": { "type": ["string", "null"] },
              "postal_code": { "type": ["string", "null"] }
            },
            "required": ["name", "address_box", "city", "province", "postal_code"]
          },
          "ship_to": {
            "type": ["object", "null"],
            "properties": {
              "organization_name": { "type": ["string", "null"] },
              "branch_name": { "type": ["string", "null"] },
              "address_line": { "type": ["string", "null"] },
              "city": { "type": ["string", "null"] },
              "postal_code": { "type": ["string", "null"] }
            },
            "required": ["organization_name", "branch_name", "address_line", "city", "postal_code"]
          },
          "line_items": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "item_code": { "type": ["string", "null"] },
                "quantity": { "type": ["integer", "null"] },
                "price": { "type": ["number", "null"] }
              },
              "required": ["item_code", "quantity", "price"]
            }
          },
          "total_quantity": { "type": ["integer", "null"] },
          "total_price": { "type": ["number", "null"] }
        },
        "required": [
          "purchase_order_number", "date", "customer_number", 
          "supplier", "ship_to", "line_items", "total_quantity", "total_price"
        ]
      }
    }
  }
}
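
Before passing the agent's output to a downstream system, it can help to confirm that the required keys from the schema above are actually present. The following is a minimal sketch using only the Python standard library. It checks the required keys of each purchase order and each line item, and it is not a full JSON Schema validator. The sample response and its values are made up for illustration, and the function name missing_keys is hypothetical.

```python
import json

# Hypothetical agent response; the values are invented for illustration.
raw_output = """
{
  "purchase_orders": [
    {
      "purchase_order_number": "PO-1001",
      "date": "2024-01-15",
      "customer_number": null,
      "supplier": null,
      "ship_to": null,
      "line_items": [
        {"item_code": "A-1", "quantity": 2, "price": 10.5},
        {"item_code": "B-7", "quantity": 1, "price": 4.0}
      ],
      "total_quantity": 3,
      "total_price": 25.0
    }
  ]
}
"""

# Required keys copied from the "required" arrays in the sample schema.
PO_REQUIRED = [
    "purchase_order_number", "date", "customer_number",
    "supplier", "ship_to", "line_items", "total_quantity", "total_price",
]
LINE_REQUIRED = ["item_code", "quantity", "price"]


def missing_keys(data: dict) -> list[str]:
    """Return the paths of required keys that are absent from the parsed output."""
    problems = []
    for i, po in enumerate(data.get("purchase_orders", [])):
        for key in PO_REQUIRED:
            if key not in po:
                problems.append(f"purchase_orders[{i}].{key}")
        # line_items may be missing entirely, so fall back to an empty list.
        for j, item in enumerate(po.get("line_items") or []):
            for key in LINE_REQUIRED:
                if key not in item:
                    problems.append(f"purchase_orders[{i}].line_items[{j}].{key}")
    return problems


data = json.loads(raw_output)
print(missing_keys(data))  # prints []
```

A null value counts as present here, which matches the schema: every field is required but may be null when the document does not contain it.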