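Configuring Multiple LLMs in Spring AI

1. Overview

Modern applications increasingly integrate with large language models (LLMs) to build intelligent solutions. While a single LLM can handle many tasks, relying on just one model isn't always the best approach.

Different models specialize in different capabilities: some excel at technical analysis, while others are better at creative writing. Additionally, we may prefer lighter, more cost-effective models for simple tasks and reserve powerful models for complex ones.

In this tutorial, we'll explore how to integrate multiple LLMs into a Spring Boot application using Spring AI.

We'll configure models from different providers, as well as multiple models from the same provider. Then, building on this configuration, we'll implement a resilient chatbot that automatically switches between models when a failure occurs.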

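2. Configuring LLMs From Different Providers

Let's start by configuring two LLMs from different providers in our application.

For our demonstration, we'll use OpenAI and Anthropic as our AI model providers.

2.1. Configuring the Primary LLM

We'll begin by configuring an OpenAI model as our primary LLM.

First, let's add the necessary dependency to our project's pom.xml file: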

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-openai</artifactId>
    <version>1.0.2</version>
</dependency>

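The OpenAI starter dependency is a wrapper around OpenAI's Chat Completions API and lets us interact with OpenAI models from our application.

Next, let's configure our OpenAI API key and chat model in the application.yaml file: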

spring:
  ai:
    open-ai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: ${PRIMARY_LLM}
          temperature: 1

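We use the ${} property placeholder syntax to load the property values from environment variables. Additionally, we set the temperature to 1, since newer OpenAI models only accept this default value.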
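To make the placeholder mechanism more concrete, here is a simplified plain-Java sketch of how a ${NAME} expression can be resolved against a map of environment variables. It is an illustration only, not Spring's actual implementation; the PlaceholderSketch class and its resolve() method are hypothetical names. It also covers the ${NAME:default} form, which Spring accepts for supplying a fallback value.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Simplified illustration only; Spring's real placeholder resolution is more capable.
class PlaceholderSketch {

    // Matches ${NAME} and ${NAME:default}
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\$\\{([^}:]+)(?::([^}]*))?\\}");

    static String resolve(String value, Map<String, String> environment) {
        Matcher matcher = PLACEHOLDER.matcher(value);
        StringBuilder resolved = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(1);
            String fallback = matcher.group(2); // null when no default is given
            String replacement = environment.getOrDefault(name, fallback);
            if (replacement == null) {
                // No value and no default: fail fast instead of using a blank setting
                throw new IllegalArgumentException("Could not resolve placeholder '" + name + "'");
            }
            matcher.appendReplacement(resolved, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(resolved);
        return resolved.toString();
    }
}
```

In this sketch, a missing variable without a default raises an exception, which mirrors the way an unresolved placeholder prevents a Spring application from starting.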

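With these properties configured, Spring AI automatically creates a bean of type OpenAiChatModel. Let's use it to define a ChatClient bean, which serves as the main entry point for interacting with our LLM: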

@Configuration
class ChatbotConfiguration {

    @Bean
    @Primary
    ChatClient primaryChatClient(OpenAiChatModel chatModel) {
        return ChatClient.create(chatModel);
    }
}

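In our ChatbotConfiguration class, we use the OpenAiChatModel bean to create a ChatClient bean for our primary LLM.

We annotate this bean with @Primary. When we autowire a ChatClient bean without a qualifier, Spring Boot automatically injects this one into our components.

2.2. Configuring the Secondary LLM

Now, let's configure a model from Anthropic as our secondary LLM.

First, let's add the Anthropic starter dependency to our pom.xml file: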

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-anthropic</artifactId>
    <version>1.0.2</version>
</dependency>

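This dependency is a wrapper around the Anthropic Messages API and provides the classes we need to connect to and interact with Anthropic models.

Next, let's define the configuration properties for our secondary model: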

spring:
  ai:
    anthropic:
      api-key: ${ANTHROPIC_API_KEY}
      chat:
        options:
          model: ${SECONDARY_LLM}

As with our primary LLM configuration, we load the Anthropic API key and model ID from environment variables.

Finally, let's create a dedicated ChatClient bean for our secondary model:

@Bean
ChatClient secondaryChatClient(AnthropicChatModel chatModel) {
    return ChatClient.create(chatModel);
}

Here, we use the AnthropicChatModel bean that Spring AI autoconfigures for us to create the secondaryChatClient bean.
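With two ChatClient beans in the context, the injection point decides which one a component receives. The following sketch shows both styles side by side. The ModelSelector component and its method names are hypothetical and not part of the original configuration; the bean name secondaryChatClient comes from the @Bean method name.

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.stereotype.Component;

// Hypothetical component, shown only to illustrate how the two beans are selected
@Component
class ModelSelector {

    private final ChatClient defaultClient;
    private final ChatClient anthropicClient;

    ModelSelector(
        // No qualifier: resolves to the bean marked @Primary (primaryChatClient)
        ChatClient defaultClient,
        // Qualifier: selects the bean by its name
        @Qualifier("secondaryChatClient") ChatClient anthropicClient
    ) {
        this.defaultClient = defaultClient;
        this.anthropicClient = anthropicClient;
    }

    String askDefault(String prompt) {
        return defaultClient.prompt(prompt).call().content();
    }

    String askAnthropic(String prompt) {
        return anthropicClient.prompt(prompt).call().content();
    }
}
```

Constructor injection is used here so that both dependencies are explicit and the fields can stay final.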

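3. Configuring LLMs From the Same Provider

Often, the LLMs we need to configure belong to the same AI provider.

Spring AI doesn't support this scenario out of the box, because its autoconfiguration creates only one ChatModel bean per provider. We need to define a ChatModel bean manually for each additional model.

Let's walk through this process and configure a second Anthropic model in our application: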

spring:
  ai:
    anthropic:
      chat:
        options:
          tertiary-model: ${TERTIARY_LLM}

In our application.yaml, under the Anthropic configuration, we add a custom property that holds the model name of our tertiary LLM.

Next, let's define the necessary beans for our tertiary LLM:

@Bean
ChatModel tertiaryChatModel(
    AnthropicApi anthropicApi,
    AnthropicChatModel anthropicChatModel,
    @Value("${spring.ai.anthropic.chat.options.tertiary-model}") String tertiaryModelName
) {
    AnthropicChatOptions chatOptions = anthropicChatModel.getDefaultOptions().copy();
    chatOptions.setModel(tertiaryModelName);
    return AnthropicChatModel.builder()
        .anthropicApi(anthropicApi)
        .defaultOptions(chatOptions)
        .build();
}

@Bean
ChatClient tertiaryChatClient(@Qualifier("tertiaryChatModel") ChatModel tertiaryChatModel) {
    return ChatClient.create(tertiaryChatModel);
}

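First, to create our custom ChatModel bean, we inject the autoconfigured AnthropicApi bean, the default AnthropicChatModel bean that backs our secondary LLM's ChatClient, and the tertiary model name property using @Value.

We copy the default options from the existing AnthropicChatModel bean and simply override the model name.

This setup assumes that both Anthropic models share the same API key and other configuration. If we need different settings, we can customize the AnthropicChatOptions further.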
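As a rough sketch of such a customization, the variant below overrides two more options on the copied instance. It assumes that AnthropicChatOptions exposes setTemperature() and setMaxTokens() setters alongside the setModel() call used above, so the method names should be checked against the Spring AI version in use. The TertiaryModelConfiguration class name and the option values are purely illustrative, and this bean would replace the tertiaryChatModel bean rather than sit next to it.

```java
import org.springframework.ai.anthropic.AnthropicChatModel;
import org.springframework.ai.anthropic.AnthropicChatOptions;
import org.springframework.ai.anthropic.api.AnthropicApi;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Hypothetical variant of the tertiaryChatModel bean with extra option overrides
@Configuration
class TertiaryModelConfiguration {

    @Bean
    ChatModel tertiaryChatModel(
        AnthropicApi anthropicApi,
        AnthropicChatModel anthropicChatModel,
        @Value("${spring.ai.anthropic.chat.options.tertiary-model}") String tertiaryModelName
    ) {
        // Start from the autoconfigured defaults, then override per-model settings
        AnthropicChatOptions chatOptions = anthropicChatModel.getDefaultOptions().copy();
        chatOptions.setModel(tertiaryModelName);
        chatOptions.setTemperature(0.2); // assumed setter; illustrative value
        chatOptions.setMaxTokens(512);   // assumed setter; illustrative value

        return AnthropicChatModel.builder()
            .anthropicApi(anthropicApi)
            .defaultOptions(chatOptions)
            .build();
    }
}
```

Because the options are copied before being modified, the secondary model's defaults stay untouched.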

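Finally, we use our custom tertiaryChatModel to create the third ChatClient bean in our configuration class.

4. Exploring a Practical Use Case

With our multi-model configuration in place, let's implement a practical use case. We'll build a resilient chatbot that automatically falls back to the other models, in order, when the primary model fails.

4.1. Building a Resilient Chatbot

To implement the fallback logic, we'll use Spring Retry.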
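Spring Retry's annotations are applied through a proxy, so they only take effect once retry support is enabled in the application context. Here is a minimal sketch, assuming the spring-retry and spring-aspects dependencies are already on the classpath; the RetryConfiguration class name is hypothetical.

```java
import org.springframework.context.annotation.Configuration;
import org.springframework.retry.annotation.EnableRetry;

// Hypothetical configuration class; @EnableRetry could equally be placed
// on an existing @Configuration class
@Configuration
@EnableRetry
class RetryConfiguration {
}
```

Without @EnableRetry, annotated methods run as ordinary methods: a failure propagates to the caller immediately, with no retries and no recovery.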

Let's create a new ChatbotService class and autowire the three ChatClient beans we've defined. Then, let's define the chatbot's entry point, which uses the primary LLM:

@Retryable(retryFor = Exception.class, maxAttempts = 3)
String chat(String prompt) {
    logger.debug("Attempting to process prompt '{}' with primary LLM. Attempt #{}",
        prompt, RetrySynchronizationManager.getContext().getRetryCount() + 1);
    return primaryChatClient
        .prompt(prompt)
        .call()
        .content();
}

Here, we create a chat() method that uses the primaryChatClient bean. We annotate this method with @Retryable so that it makes up to three attempts whenever any Exception is thrown.
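To clarify what "up to three attempts" means, the following plain-Java sketch mimics the retry-then-recover flow without any framework. It is an illustration of the control flow only, not Spring Retry's implementation; the RetrySketch class and callWithRetry() method are hypothetical, and the sketch skips any delay between attempts.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

// Plain-Java illustration of the retry-then-recover flow; not Spring Retry's implementation
class RetrySketch {

    static <T, R> R callWithRetry(T input, int maxAttempts,
                                  Function<T, R> action,
                                  BiFunction<Exception, T, R> recover) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception lastFailure = null;
        // maxAttempts counts every call, including the very first one
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.apply(input);
            } catch (Exception e) {
                lastFailure = e;
            }
        }
        // All attempts failed: pass the last exception and the original argument to the fallback
        return recover.apply(lastFailure, input);
    }
}
```

With maxAttempts set to 3, the action runs three times in total, rather than once plus three retries, before the fallback is invoked.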

Next, let's define a recovery method:

@Recover
String chat(Exception exception, String prompt) {
    logger.warn("Primary LLM failure. Error received: {}", exception.getMessage());
    logger.debug("Attempting to process prompt '{}' with secondary LLM", prompt);
    try {
        return secondaryChatClient
            .prompt(prompt)
            .call()
            .content();
    } catch (Exception e) {
        logger.warn("Secondary LLM failure: {}", e.getMessage());
        logger.debug("Attempting to process prompt '{}' with tertiary LLM", prompt);
        return tertiaryChatClient
            .prompt(prompt)
            .call()
            .content();
    }
}

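The @Recover annotation marks our overloaded chat() method as the fallback that runs once the original chat() method has failed and exhausted its configured attempts.

We first try to get a response from the secondaryChatClient. If that fails as well, we make one final attempt with the tertiaryChatClient bean.

We use this basic try-catch implementation because Spring Retry allows only one recovery method per method signature. In a production application, however, we should consider a more sophisticated solution such as Resilience4j.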
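If the list of fallback models grows, nested try-catch blocks become hard to read. One possible alternative, sketched here in plain Java, is to loop over an ordered list of clients and return the first successful answer. The FallbackChain class and firstSuccessful() method are hypothetical helpers, and each client is modeled as a simple Function from prompt to reply.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical helper that tries each client in order until one succeeds
class FallbackChain {

    static String firstSuccessful(String prompt, List<Function<String, String>> clients) {
        RuntimeException failure = new IllegalStateException("No clients configured");
        for (Function<String, String> client : clients) {
            try {
                return client.apply(prompt);
            } catch (RuntimeException e) {
                failure = e; // remember the failure and move on to the next client
            }
        }
        // Every client failed: surface the most recent error to the caller
        throw failure;
    }
}
```

Inside the recovery method, the two fallback calls could then be expressed as `FallbackChain.firstSuccessful(prompt, List.of(p -> secondaryChatClient.prompt(p).call().content(), p -> tertiaryChatClient.prompt(p).call().content()))`, at the cost of losing the per-model log statements unless they are added to the loop.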

Now that we've implemented our service layer, let's expose a REST API on top of it:

@PostMapping("/api/chatbot/chat")
ChatResponse chat(@RequestBody ChatRequest request) {
    // Records expose their components through accessor methods
    String response = chatbotService.chat(request.prompt());
    return new ChatResponse(response);
}

record ChatRequest(String prompt) {}
record ChatResponse(String response) {}

Here, we define a POST /api/chatbot/chat endpoint that accepts a prompt, passes it to the service layer, and returns the response wrapped in a ChatResponse record.
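For callers written in Java, the request for this endpoint can be built with the JDK's java.net.http API. The sketch below only constructs the request; the ChatbotRequestSketch class and its methods are hypothetical, and the JSON escaping is deliberately minimal, so a real client would normally use a JSON library instead.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side helper for the /api/chatbot/chat endpoint
class ChatbotRequestSketch {

    // Escapes only backslashes and double quotes: enough for simple prompts,
    // but not a complete JSON encoder
    static String toJsonBody(String prompt) {
        String escaped = prompt.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"prompt\": \"" + escaped + "\"}";
    }

    static HttpRequest buildChatRequest(String baseUrl, String prompt) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/chatbot/chat"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(toJsonBody(prompt)))
            .build();
    }
}
```

The resulting request can be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, which returns the JSON body containing the response field.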

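4.2. Testing Our Chatbot

Finally, let's test our chatbot to verify that the fallback mechanism works correctly.

Let's start our application with environment variables that specify invalid model names for our primary and secondary LLMs, but a valid one for the tertiary LLM: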

OPENAI_API_KEY=.... \
ANTHROPIC_API_KEY=.... \
PRIMARY_LLM=gpt-100 \
SECONDARY_LLM=claude-opus-200 \
TERTIARY_LLM=claude-3-haiku-20240307 \
mvn spring-boot:run

In our command, gpt-100 and claude-opus-200 are invalid model names that cause API errors, while claude-3-haiku-20240307 is a valid model from Anthropic.

Next, let's use the HTTPie CLI to call the API endpoint and interact with our chatbot:

http POST :8080/api/chatbot/chat prompt="What is the capital of France?"

Here, we send a simple prompt to our chatbot. Let's look at the response we receive:

{
    "response": "The capital of France is Paris."
}

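As we can see, even with invalid primary and secondary LLMs configured, our chatbot still returns a correct response, which confirms that the system fell back to the tertiary LLM.

To see the fallback logic in action, let's examine our application logs: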

[2025-09-30 12:56:03] [DEBUG] [com.baeldung.multillm.ChatbotService] - Attempting to process prompt 'What is the capital of France?' with primary LLM. Attempt #1
[2025-09-30 12:56:05] [DEBUG] [com.baeldung.multillm.ChatbotService] - Attempting to process prompt 'What is the capital of France?' with primary LLM. Attempt #2
[2025-09-30 12:56:06] [DEBUG] [com.baeldung.multillm.ChatbotService] - Attempting to process prompt 'What is the capital of France?' with primary LLM. Attempt #3
[2025-09-30 12:56:07] [WARN] [com.baeldung.multillm.ChatbotService] - Primary LLM failure. Error received: HTTP 404 - {
    "error": {
        "message": "The model gpt-100 does not exist or you do not have access to it.",
        "type": "invalid_request_error",
        "param": null,
        "code": "model_not_found"
    }
}
[2025-09-30 12:56:07] [DEBUG] [com.baeldung.multillm.ChatbotService] - Attempting to process prompt 'What is the capital of France?' with secondary LLM
[2025-09-30 12:56:07] [WARN] [com.baeldung.multillm.ChatbotService] - Secondary LLM failure: HTTP 404 - {"type":"error","error":{"type":"not_found_error","message":"model: claude-opus-200"},"request_id":"req_011CTeBrAY8rstsSPiJyv3sj"}
[2025-09-30 12:56:07] [DEBUG] [com.baeldung.multillm.ChatbotService] - Attempting to process prompt 'What is the capital of France?' with tertiary LLM

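The logs clearly show the execution flow of our request.

We can see three failed attempts with the primary LLM. Our service then tries the secondary LLM, which fails as well. Finally, it calls the tertiary LLM, which processes the prompt and returns the response we saw.

This demonstrates that our fallback mechanism works as designed and keeps our chatbot available even when multiple LLMs fail.

5. Conclusion

In this article, we explored how to integrate multiple LLMs into a single Spring AI application.

First, we demonstrated how Spring AI's abstraction layer simplifies configuring models from different providers such as OpenAI and Anthropic.

Then, we tackled a more complex scenario, configuring multiple models from the same provider, and created custom beans where Spring AI's autoconfiguration falls short.

Finally, we used this multi-model configuration to build a resilient, highly available chatbot. Using Spring Retry, we set up a cascading fallback pattern that automatically switches between LLMs when a failure occurs.

As always, all the code examples used in this article are available on GitHub.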

This work is original or translated and is licensed under the Attribution-NonCommercial-NoDerivatives 4.0 International license.