Explainable AI Agents: Capturing LLM Tool Call Reasoning with Spring AI

1. Overview

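When we build AI agents with tool-calling capabilities, we can usually see which tool the LLM selected, but not why it made that choice. This missing information makes debugging harder, reduces observability, and limits trust in AI-driven systems. For production-grade agents, understanding the model's reasoning is essential. Explainable AI agents address this by capturing additional context from the LLM during tool selection.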

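In this article, we'll walk through a practical example that introduces the Tool Argument Augmenter. We'll see how to capture LLM reasoning during tool calls and how to use that data in a Spring AI application.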

2. The Tool Calling Problem

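We call a tool when the model cannot reliably answer a question from its training data alone. For example, we use a tool when the LLM needs real-time data (such as current prices or user-specific information), needs access to external systems (such as databases or internal services), or must trigger an action (such as creating a record or sending a notification).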

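In Spring AI, the model delegates work to application code through tool calls, while the LLM focuses on understanding the user's request and generating the final response. Suppose our application exposes two tools: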

@Tool(description = "Get patient health status")
public String retrievePatientHealthStatus(String patientId) {
    return HEALTH_DATA.get(patientId).status();
}

@Tool(description = "Get when patient health status was updated")
public LocalDate retrievePatientHealthStatusChangeDate(String patientId) {
    return HEALTH_DATA.get(patientId).changeDate();
}

We use @Tool to mark a method as available to the LLM for tool calling. Suppose we ask the application, "Is the patient stable?" Here's what happens behind the scenes:

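Spring AI sends the tool definitions, including their input schemas, to the LLM.
The LLM analyzes the request and evaluates the available tools.
It decides to call retrievePatientHealthStatus().
The LLM returns a tool call request containing the required arguments.
The tool calling manager dispatches the request and invokes the selected tool.
The tool result is returned to the LLM, which generates the final response.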

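From the application's perspective, we only see that a tool was selected. The problem is that we can't see the reason behind the selection. This gap limits observability and makes debugging harder: we can confirm which tool was called, but we can't explain why the LLM chose it. The Spring AI Tool Argument Augmenter is designed to address exactly this limitation.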

3. Tool Argument Augmenter

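The Tool Argument Augmenter adds an explainability layer on top of standard tool calling. We dynamically extend the tool's JSON Schema with additional parameters. These parameters capture the metadata the application needs, such as reasoning, insights, or confidence. The tool itself stays unchanged and is unaware of the augmentation. We use @ToolParam to describe individual parameters so the model understands which inputs it must provide: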

@ToolParam(description = """
    Your step-by-step reasoning for why you're calling this tool and what you expect.
    Add evidence for why you chose this specific tool.
    """, required = true)
String innerThought

With the Tool Argument Augmenter enabled, the tool calling flow changes as follows:

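We ask: Is the patient stable?
Spring AI passes the tool definitions for retrievePatientHealthStatus() and retrievePatientHealthStatusChangeDate() to the tool calling advisor.
The Tool Argument Augmenter intercepts both tool definitions.
The augmenter extends each tool's JSON Schema with the innerThought parameter.
Spring AI sends the augmented tool schemas to the LLM.
The LLM decides to call retrievePatientHealthStatus() and returns a tool call request containing both the original arguments and the augmented innerThought argument, which explains why it chose this tool.
The augmenter extracts innerThought and forwards it to a consumer for logging, memory storage, or analysis.
Spring AI invokes retrievePatientHealthStatus() with only the expected arguments.
The LLM uses the tool result to generate the final response.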

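This approach captures why the LLM selected a particular tool. We can log its reasoning, store it as long-term memory, or use it for debugging and analysis. At the same time, we keep the tools simple and reusable without changing their existing contracts, while making the agent's behavior more explainable and trustworthy.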
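To make the interception steps more concrete, here is a minimal plain-Java sketch of the idea. It is not the Spring AI implementation: the class ArgumentAugmenterSketch, its methods, and the simplified Map-based "schema" are all hypothetical stand-ins used only to illustrate how an extra parameter can be added before the call and stripped out again before the tool runs.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Hypothetical, simplified stand-in for the augmentation idea; not the Spring AI implementation.
class ArgumentAugmenterSketch {

    static final String EXTRA_PARAM = "innerThought";

    // Returns a copy of the tool's schema properties with the extra reasoning parameter added.
    static Map<String, String> augmentSchema(Map<String, String> properties) {
        Map<String, String> augmented = new LinkedHashMap<>(properties);
        augmented.put(EXTRA_PARAM, "string");
        return augmented;
    }

    // Removes the extra parameter, hands it to the consumer, and calls the tool with the remaining arguments.
    static String invoke(String toolName, Map<String, String> llmArguments,
            Function<Map<String, String>, String> tool, BiConsumer<String, String> reasoningConsumer) {
        Map<String, String> original = new LinkedHashMap<>(llmArguments);
        String reasoning = original.remove(EXTRA_PARAM);
        reasoningConsumer.accept(toolName, reasoning);
        return tool.apply(original);
    }
}
```

The point of the sketch is the separation of concerns: the tool function only ever receives the arguments it declared, while the reasoning travels on a separate path to whatever consumer we register.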

4. Patient Health Status Checker Example

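Let's implement a simple patient health status lookup application. In this example, we'll provide several tools, each retrieving a different kind of patient health information. Users can ask about the health status of different patients, and based on the question, the LLM decides which tool to call to provide the requested information.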

4.1. Dependencies

First, we add the spring-ai-starter-model-openai dependency:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-openai</artifactId>
    <version>${spring-ai.version}</version>
</dependency>

This dependency already brings in the core Spring AI classes. It also provides the OpenAI model integration we'll use in this application.

4.2. Tool Specification

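Let's create a PatientHealthInformationTools class. It exposes tool methods that our AI agent can call to retrieve patient health information, acting as a bridge between the LLM and our internal health data source: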

public class PatientHealthInformationTools {

    public static final Map<String, HealthStatus> HEALTH_DATA = Map.of(
        "P001", new HealthStatus("Healthy", LocalDate.ofYearDay(2025, 100)),
        "P002", new HealthStatus("Has cough", LocalDate.ofYearDay(2025, 200)),
        "P003", new HealthStatus("Healthy", LocalDate.ofYearDay(2025, 300)),
        "P004", new HealthStatus("Has increased blood pressure", LocalDate.ofYearDay(2025, 350)),
        "P005", new HealthStatus("Healthy", LocalDate.ofYearDay(2026, 10)));

    @Tool(description = "Get patient health status")
    public String retrievePatientHealthStatus(String patientId) {
        return HEALTH_DATA.get(patientId).status();
    }

    @Tool(description = "Get when patient health status was updated")
    public LocalDate retrievePatientHealthStatusChangeDate(String patientId) {
        return HEALTH_DATA.get(patientId).changeDate();
    }
}

In this class, we've added two tools. In retrievePatientHealthStatus(), we return the patient's status for a given patientId. In retrievePatientHealthStatusChangeDate(), we return the date the patient's status was last updated.
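The HealthStatus type is used above but not shown. The following is a sketch of what it might look like, assuming a simple record whose components are inferred from the status() and changeDate() calls in the tools. The formattedChangeDate() helper is hypothetical and is there only to show which calendar date a day-of-year value corresponds to.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Assumed shape, inferred from how the tools call status() and changeDate().
record HealthStatus(String status, LocalDate changeDate) {

    // Hypothetical helper: formats the change date in a long English style.
    String formattedChangeDate() {
        return changeDate.format(DateTimeFormatter.ofPattern("MMMM d, yyyy", Locale.ENGLISH));
    }
}
```

With this shape, LocalDate.ofYearDay(2025, 200) for patient P002 resolves to 19 July 2025, since 2025 is not a leap year and the first six months add up to 181 days.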

4.3. AgentThinking DTO

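Now let's introduce the AgentThinking DTO. We use this object to capture the model's reasoning during tool selection. It makes tool calling decisions more transparent and easier to analyze: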

public record AgentThinking(
    @ToolParam(description = """
        Your step-by-step reasoning for why you're calling this tool and what you expect.
        Add evidence for why you chose this specific tool.
        """, required = true)
    String innerThought,

    @ToolParam(description = "Confidence level (low, medium, high) in this tool choice", required = true)
    String confidence) {
}

In this DTO, we've added two reasoning parameters. In innerThought, the LLM explains why it's calling a particular tool. In confidence, we record how confident the LLM is in its tool choice.
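Because confidence arrives as free text, it can be convenient to normalize it before analysis. The enum below is a hypothetical helper, not part of the article's code or of Spring AI. It sketches one way to map the raw string to a typed level, falling back to UNKNOWN when the model returns something outside the three expected values.

```java
import java.util.Locale;

// Hypothetical helper: turns the free-text confidence value into a typed level.
enum ConfidenceLevel {
    LOW, MEDIUM, HIGH, UNKNOWN;

    // Case-insensitive, whitespace-tolerant parsing; anything unexpected becomes UNKNOWN.
    static ConfidenceLevel parse(String raw) {
        if (raw == null) {
            return UNKNOWN;
        }
        return switch (raw.trim().toLowerCase(Locale.ROOT)) {
            case "low" -> LOW;
            case "medium" -> MEDIUM;
            case "high" -> HIGH;
            default -> UNKNOWN;
        };
    }
}
```

Keeping the DTO field as a String and parsing afterwards is a deliberate choice in this sketch: the schema sent to the model stays simple, and an unexpected answer does not fail the tool call.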

4.4. Patient Health Status Service

Let's create the PatientHealthStatusService. This service orchestrates the LLM calls and integrates our augmented tool logic:

@Service
public class PatientHealthStatusService {

    private static final Logger log = LoggerFactory.getLogger(PatientHealthStatusService.class);

    private final ChatClient chatClient;

    @Autowired
    public PatientHealthStatusService(OpenAiChatModel model) {
        AugmentedToolCallbackProvider<AgentThinking> provider = AugmentedToolCallbackProvider
            .<AgentThinking>builder()
            .toolObject(new PatientHealthInformationTools())
            .argumentType(AgentThinking.class)
            .argumentConsumer(event -> {
                AgentThinking thinking = event.arguments();
                log.info("Chosen tool: {}\n LLM Reasoning: {}\n Confidence: {}",
                    event.toolDefinition().name(), thinking.innerThought(), thinking.confidence());
            })
            .build();

        chatClient = ChatClient.builder(model)
            .defaultToolCallbacks(provider)
            .build();
    }

    public String getPatientStatusInformation(String prompt) {
        log.info("Input request: {}", prompt);
        return chatClient.prompt(prompt)
            .call()
            .content();
    }
}

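We've created an AugmentedToolCallbackProvider instance with our tools attached. We've also registered the AgentThinking DTO as the argument type and added logging that prints the reasoning details for each call. In getPatientStatusInformation(), we call chatClient with the input prompt. The attached AugmentedToolCallbackProvider applies all the necessary logic automatically, so we don't need to handle this behavior manually.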
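Logging is only one thing the argument consumer can do. As an illustration, the sketch below keeps the captured reasoning in memory so it can be inspected later. ReasoningEntry and ReasoningJournal are hypothetical names introduced for this example; they are plain Java and do not depend on Spring AI.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical value object; its fields mirror what the consumer in the service logs.
record ReasoningEntry(String toolName, String innerThought, String confidence) {
}

// Hypothetical in-memory store that an argument consumer could write to.
class ReasoningJournal {

    // Thread-safe list, since tool calls may be handled on different threads.
    private final List<ReasoningEntry> entries = new CopyOnWriteArrayList<>();

    void add(String toolName, String innerThought, String confidence) {
        entries.add(new ReasoningEntry(toolName, innerThought, confidence));
    }

    // Entries whose confidence is anything other than "high", for manual review.
    List<ReasoningEntry> needingReview() {
        return entries.stream()
            .filter(e -> !"high".equalsIgnoreCase(e.confidence()))
            .toList();
    }

    // Immutable snapshot of everything captured so far.
    List<ReasoningEntry> all() {
        return List.copyOf(entries);
    }
}
```

Inside the consumer lambda shown in the service, a call such as journal.add(event.toolDefinition().name(), thinking.innerThought(), thinking.confidence()) would feed this store, assuming a ReasoningJournal instance named journal is available there.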

4.5. Calling PatientHealthStatusService for Different Information

Finally, let's test PatientHealthStatusService. We want to verify that tool selection works correctly and that the reasoning metadata is captured:

@Test
void givenPatientHealthStatusService_whenAskingPatientHealthStatusAndChangeDate_thenResponseShouldContainExpectedInformation() {

    String healthStatusResponse = statusService
        .getPatientStatusInformation("What is the health status of the patient P002?");

    assertThat(healthStatusResponse)
        .contains("cough");

    String healthStatusChangeDateResponse = statusService
        .getPatientStatusInformation("When the patient P002 health status was changed?");

    assertThat(healthStatusChangeDateResponse)
        .contains("July 19, 2025");
}

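We've called getPatientStatusInformation() twice. The first time, we asked for the patient's health status. The second time, we asked when the health status changed. We've verified that both responses contain the expected information. Here's the log output we get: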

[2026-02-02 09:34:46] [INFO] [cbsePatientHealthStatusService] - Input request: What is the health status of the patient P002?
[2026-02-02 09:34:48] [INFO] [cbsePatientHealthStatusService] - Chosen tool: retrievePatientHealthStatus
 LLM Reasoning: I am calling this tool to get the current health status of the patient with ID P002, as it is essential to know their health condition.
 Confidence: high
[2026-02-02 09:34:50] [INFO] [cbsePatientHealthStatusService] - Input request: When the patient P002 health status was changed?
[2026-02-02 09:34:53] [INFO] [cbsePatientHealthStatusService] - Chosen tool: retrievePatientHealthStatusChangeDate
 LLM Reasoning: I need to find out when the health status for patient P002 was last updated to understand their current health situation and any recent changes that may affect their treatment or care. This tool is specifically designed to retrieve the date of the last health status change for a patient.
 Confidence: high

Here, we can see which tool was called, the model's reasoning for choosing it, and its confidence level in that choice.

5. Tool Call Chain Example

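Let's look at another use case that requires a chain of tool calls. In this example, we need to get a patient's health status by the patient's name. First, we add a new tool definition: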

public class PatientHealthInformationTools {

    private static final Map<String, String> PATIENTS_IDS = Map.of(
        "John Snow", "P001",
        "Emily Carter", "P002",
        "Michael Brown", "P003",
        "Sophia Williams", "P004",
        "Daniel Johnson", "P005"
    );

    @Tool(description = "Get patient id for patient name")
    public String retrievePatientId(String patientName) {
        return PATIENTS_IDS.get(patientName);
    }
}

Here, the retrievePatientId() tool returns the patient ID for a given patient name. Next, we call our service again, this time to retrieve a patient's health status by name:

@Test
void givenPatientHealthStatusService_whenAskingPatientHealthStatusByPatientName_thenResponseShouldContainExpectedInformation() {

    String healthStatusResponse = statusService
        .getPatientStatusInformation("What is the health status of the patient. Patient name: John Snow?");

    assertThat(healthStatusResponse)
        .containsIgnoringCase("healthy");
}

As expected, we get the health status. Now, let's look at the logs:

[2026-02-02 09:44:50] [INFO] [cbsePatientHealthStatusService] - Input request: What is the health status of the patient. Patient name: John Snow?
[2026-02-02 09:44:52] [INFO] [cbsePatientHealthStatusService] - Chosen tool: retrievePatientHealthStatus
 LLM Reasoning: I need to find out the health status of the patient named John Snow. This tool is specifically designed to retrieve the health status of a patient based on their name, which is why I chose it.
 Confidence: high
[2026-02-02 09:44:55] [INFO] [cbsePatientHealthStatusService] - Chosen tool: retrievePatientId
 LLM Reasoning: Since I encountered an issue retrieving the health status directly, I'm going to first get the patient ID for John Snow. Once I have the patient ID, I can then retrieve the health status using that ID. This is a necessary step because the health status tool requires a valid patient ID to work properly.
 Confidence: high
[2026-02-02 09:44:57] [INFO] [cbsePatientHealthStatusService] - Chosen tool: retrievePatientHealthStatus
 LLM Reasoning: Now that I have the patient ID for John Snow, I can use it to retrieve the health status. This tool will provide the health information associated with the patient ID I obtained earlier.
 Confidence: high

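We can see the full reasoning the model used to decide which tools to call. We can even use this information as feedback to make our prompts more specific and avoid unnecessary tool calls.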
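One simple way to act on this feedback is to look for tools that were chosen more than once while handling a single request, as happened with retrievePatientHealthStatus in the log above. The helper below is a hypothetical sketch, assuming we have already collected the chosen tool names for one request in call order.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical analysis helper over the tool names logged for a single request.
class ToolCallChainAnalyzer {

    // Returns the tools that were chosen more than once, with their call counts, in first-seen order.
    static Map<String, Integer> repeatedTools(List<String> chosenTools) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String tool : chosenTools) {
            counts.merge(tool, 1, Integer::sum);
        }
        // Keep only the tools that appear at least twice.
        counts.values().removeIf(count -> count < 2);
        return counts;
    }
}
```

A repeated tool is not necessarily a problem, but it is a useful signal: here it suggests that the prompt or the tool description could state more clearly that the status lookup expects a patient ID rather than a name.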

6. Conclusion

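In this article, we've reviewed how to use the Tool Argument Augmenter to make our AI integrations more explainable. We captured the model's reasoning during tool selection without modifying the tools' implementations.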

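With this approach, we gain clearer visibility into the LLM's tool calling decisions and collect valuable feedback for improving our prompts. We can also enrich tool call information with additional metadata, such as risk level, fallback strategy, or decision category. Finally, the reasoning data can be routed to monitoring systems, persisted for auditing, or analyzed to optimize the agent's long-term behavior.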

As always, the code is available on GitHub.

This work is original or translated and is licensed under the Attribution-NonCommercial-NoDerivatives 4.0 International license.