Configuring Multiple LLM Clients in Spring AI: A Detailed Walkthrough

Updated: 2025-10-11 08:49:21 Author: 程序猿DD

This article explores how to integrate multiple LLMs into a single Spring AI application.

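1. Overview

A growing number of modern applications integrate large language models (LLMs) to build smarter features. Although one LLM can handle many kinds of tasks, relying on a single model is not always optimal.

Different models have different strengths: some excel at technical analysis, while others are better suited to creative writing. Simple tasks are a good fit for lightweight, cost-effective models, whereas complex tasks are better handed to more capable ones.

In this article, we demonstrate how to integrate multiple LLMs into a Spring Boot application using Spring AI.

We configure models from different providers as well as multiple models from the same provider. Building on that configuration, we then create a resilient chatbot that automatically switches between models when one fails.

2. Configuring LLMs from Different Providers

We start by configuring two LLMs from different providers in our application.

In the examples in this article, we use OpenAI and Anthropic as the AI model providers.

2.1. Configuring the Primary LLM

We first configure an OpenAI model as the primary LLM.

To begin, we add the required dependency to the project's pom.xml file: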

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-openai</artifactId>
    <version>1.0.2</version>
</dependency>

This OpenAI starter dependency wraps the OpenAI Chat Completions API and lets us interact with OpenAI models from our application.

Next, we configure our OpenAI API key and chat model in application.yaml:

spring:
  ai:
    open-ai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: ${PRIMARY_LLM}
          temperature: 1

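We use ${} property placeholders to load the property values from environment variables. We also set the temperature to 1, because newer OpenAI models accept only this default value.

Once these properties are in place, Spring AI automatically creates a bean of type OpenAiChatModel. We use it to define a ChatClient bean, which serves as the main entry point for interacting with the LLM: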

@Configuration
class ChatbotConfiguration {
    @Bean
    @Primary
    ChatClient primaryChatClient(OpenAiChatModel chatModel) {
        return ChatClient.create(chatModel);
    }
}

In the ChatbotConfiguration class, we use the OpenAiChatModel bean to create the ChatClient for the primary LLM.

We mark this bean with the @Primary annotation. When a component injects a ChatClient without a qualifier, Spring injects this bean by default.

2.2. Configuring the Secondary LLM

Now we configure a model from Anthropic as the secondary LLM.

First, we add the Anthropic starter dependency to pom.xml:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-anthropic</artifactId>
    <version>1.0.2</version>
</dependency>

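This dependency wraps the Anthropic Message API and provides the classes needed to connect to and interact with Anthropic models.

Next, we define the configuration properties for the secondary model: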

spring:
  ai:
    anthropic:
      api-key: ${ANTHROPIC_API_KEY}
      chat:
        options:
          model: ${SECONDARY_LLM}

As with the primary LLM, we load the Anthropic API key and the model ID from environment variables.

Finally, we create a dedicated ChatClient bean for the secondary model:

@Bean
ChatClient secondaryChatClient(AnthropicChatModel chatModel) {
    return ChatClient.create(chatModel);
}

Here, we use the AnthropicChatModel bean that Spring AI auto-configures to create secondaryChatClient.

3. Configuring Multiple LLMs from the Same Provider

Often, the LLMs we need to configure come from the same AI provider.

Spring AI does not support this scenario out of the box: its auto-configuration creates only one ChatModel bean per provider. For any additional model, we therefore need to define the ChatModel bean manually.

Let's walk through the process and configure a second Anthropic model in our application:

spring:
  ai:
    anthropic:
      chat:
        options:
          tertiary-model: ${TERTIARY_LLM}

Under the Anthropic configuration in application.yaml, we add a custom property that holds the model name of the third (tertiary) LLM.

Next, we define the beans required for the tertiary LLM:

@Bean
ChatModel tertiaryChatModel(
    AnthropicApi anthropicApi,
    AnthropicChatModel anthropicChatModel,
    @Value("${spring.ai.anthropic.chat.options.tertiary-model}") String tertiaryModelName
) {
    AnthropicChatOptions chatOptions = anthropicChatModel.getDefaultOptions().copy();
    chatOptions.setModel(tertiaryModelName);
    return AnthropicChatModel.builder()
      .anthropicApi(anthropicApi)
      .defaultOptions(chatOptions)
      .build();
}
@Bean
ChatClient tertiaryChatClient(@Qualifier("tertiaryChatModel") ChatModel tertiaryChatModel) {
    return ChatClient.create(tertiaryChatModel);
}

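To create the custom ChatModel bean, we inject the auto-configured AnthropicApi bean, the default AnthropicChatModel bean that backs the secondary LLM, and, through @Value, the property holding the tertiary model's name.

We copy the default options of the existing AnthropicChatModel and override only the model name.

This setup assumes that both Anthropic models share the same API key and other settings. If they need different properties, we can customize the AnthropicChatOptions further.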
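Why the options are copied before the model name is overridden can be illustrated with a minimal plain-Java sketch. OptionsSketch is a hypothetical stand-in, not a Spring AI type, and the sketch assumes the options object is mutable and shared by reference; under that assumption, calling a setter without copying first would also change the model used by the secondary client.

```java
/** Hypothetical stand-in for a mutable chat-options object; not a Spring AI class. */
final class OptionsSketch {

    private String model;
    private final double temperature;

    OptionsSketch(String model, double temperature) {
        this.model = model;
        this.temperature = temperature;
    }

    /** Returns an independent copy carrying the same settings. */
    OptionsSketch copy() {
        return new OptionsSketch(model, temperature);
    }

    void setModel(String model) {
        this.model = model;
    }

    String getModel() {
        return model;
    }

    double getTemperature() {
        return temperature;
    }

    public static void main(String[] args) {
        OptionsSketch secondary = new OptionsSketch("secondary-model", 0.7);

        // Copy first, then override only the model name on the copy.
        OptionsSketch tertiary = secondary.copy();
        tertiary.setModel("tertiary-model");

        System.out.println(secondary.getModel());
        System.out.println(tertiary.getModel());
    }
}
```

The copy keeps every other setting (here, only the temperature) while leaving the original object untouched.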

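Finally, we use the custom tertiaryChatModel to create the third ChatClient bean in our configuration class.

4. Exploring a Practical Use Case

With the multi-model configuration in place, let's implement a practical use case. We build a resilient chatbot that automatically falls back to the alternative models, in order, when the primary model fails.

4.1. Building a Resilient Chatbot

To implement the fallback logic, we use Spring Retry. The snippets below do not show the setup, so they assume that the spring-retry dependency is on the classpath and that @EnableRetry is declared on a configuration class.

We create a new ChatbotService class and inject the three ChatClient beans we defined. Then we define an entry-point method that uses the primary LLM: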

@Retryable(retryFor = Exception.class, maxAttempts = 3)
String chat(String prompt) {
    logger.debug("Attempting to process prompt '{}' with primary LLM. Attempt #{}",
        prompt, RetrySynchronizationManager.getContext().getRetryCount() + 1);
    return primaryChatClient
      .prompt(prompt)
      .call()
      .content();
}

Here, we create a chat() method that uses primaryChatClient. The method is annotated with @Retryable, so it is attempted up to three times in total whenever it throws any Exception.
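The meaning of maxAttempts = 3 can be sketched in plain Java. This is only an illustration of the semantics, not Spring Retry's implementation; RetrySketch and callWithRetry are hypothetical names, and the sketch ignores back-off delays between attempts.

```java
import java.util.function.Supplier;

/** Plain-Java illustration of "up to N attempts in total"; not Spring Retry itself. */
final class RetrySketch {

    /**
     * Invokes the call up to maxAttempts times in total (the first call counts
     * as attempt 1) and rethrows the last failure if every attempt fails.
     */
    static <T> T callWithRetry(Supplier<T> call, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                lastFailure = e; // remember the failure and try again
            }
        }
        throw lastFailure;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fails twice, then succeeds on the third attempt.
        String reply = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("temporary failure");
            }
            return "ok";
        }, 3);
        System.out.println(reply + " after " + calls[0] + " attempts");
    }
}
```

Only when all attempts fail does the last exception propagate, which is the point at which a recovery method would take over.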

Next, we define a recovery method:

@Recover
String chat(Exception exception, String prompt) {
    logger.warn("Primary LLM failure. Error received: {}", exception.getMessage());
    logger.debug("Attempting to process prompt '{}' with secondary LLM", prompt);
    try {
        return secondaryChatClient
          .prompt(prompt)
          .call()
          .content();
    } catch (Exception e) {
        logger.warn("Secondary LLM failure: {}", e.getMessage());
        logger.debug("Attempting to process prompt '{}' with tertiary LLM", prompt);
        return tertiaryChatClient
          .prompt(prompt)
          .call()
          .content();
    }
}

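The overloaded chat() method annotated with @Recover acts as the fallback handler once the original chat() method has failed and exhausted its retries.

We first try to get a response through secondaryChatClient; if that also fails, we make a final attempt with tertiaryChatClient.

We use a simple try-catch here because Spring Retry allows only one recovery method per method signature. In a production application, however, we should consider a more complete solution such as Resilience4j.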
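As the number of fallback models grows, nested try-catch blocks become hard to read. One possible alternative, shown here only as a plain-Java sketch, is to keep the clients in an ordered list and return the first successful reply. FallbackChain is a hypothetical helper that is not part of Spring AI or Spring Retry, and each client is modeled as a simple Function from prompt to reply.

```java
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper: tries each client in order and returns the first successful reply. */
final class FallbackChain {

    private final List<Function<String, String>> clients;

    FallbackChain(List<Function<String, String>> clients) {
        if (clients.isEmpty()) {
            throw new IllegalArgumentException("at least one client is required");
        }
        this.clients = List.copyOf(clients);
    }

    /** Returns the first successful reply; rethrows the last failure if every client fails. */
    String chat(String prompt) {
        RuntimeException lastFailure = null;
        for (Function<String, String> client : clients) {
            try {
                return client.apply(prompt);
            } catch (RuntimeException e) {
                lastFailure = e; // move on to the next client in the chain
            }
        }
        throw lastFailure;
    }

    public static void main(String[] args) {
        // The first two entries simulate failing models; the third one answers.
        List<Function<String, String>> clients = List.of(
            p -> { throw new IllegalStateException("secondary LLM failure"); },
            p -> { throw new IllegalStateException("another failure"); },
            p -> "reply to: " + p);
        System.out.println(new FallbackChain(clients).chat("What is the capital of France?"));
    }
}
```

In the service above, each entry could wrap one client, for example p -> secondaryChatClient.prompt(p).call().content(), so that adding a further fallback model only means appending to the list.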

With the service layer in place, we expose a REST API:

@PostMapping("/api/chatbot/chat")
ChatResponse chat(@RequestBody ChatRequest request) {
    String response = chatbotService.chat(request.prompt());
    return new ChatResponse(response);
}
record ChatRequest(String prompt) {}
record ChatResponse(String response) {}

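Here we define a POST endpoint, /api/chatbot/chat, that accepts a prompt, passes it to the service layer, and returns the response wrapped in a ChatResponse record.

4.2. Testing Our Chatbot

Finally, let's test the chatbot and verify that the fallback mechanism works as expected.

We start the application with environment variables that set invalid model names for the primary and secondary LLMs and a valid model name for the tertiary LLM: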

OPENAI_API_KEY=.... \
ANTHROPIC_API_KEY=.... \
PRIMARY_LLM=gpt-100 \
SECONDARY_LLM=claude-opus-200 \
TERTIARY_LLM=claude-3-haiku-20240307 \
mvn spring-boot:run

In this command, gpt-100 and claude-opus-200 are invalid model names that cause API errors, whereas claude-3-haiku-20240307 is a valid model offered by Anthropic.

Next, we call the endpoint with the HTTPie CLI to interact with the chatbot:

http POST :8080/api/chatbot/chat prompt="What is the capital of France?"

Here we send the chatbot a simple prompt. Let's look at the result:

{
    "response": "The capital of France is Paris."
}

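As we can see, the chatbot returns a correct response even though the primary and secondary LLMs are configured with invalid models, which confirms that the system fell back to the tertiary LLM.

To see the fallback logic in action more directly, let's look at the application logs: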

[2025-09-30 12:56:03] [DEBUG] [com.baeldung.multillm.ChatbotService] - Attempting to process prompt 'What is the capital of France?' with primary LLM. Attempt #1
[2025-09-30 12:56:05] [DEBUG] [com.baeldung.multillm.ChatbotService] - Attempting to process prompt 'What is the capital of France?' with primary LLM. Attempt #2
[2025-09-30 12:56:06] [DEBUG] [com.baeldung.multillm.ChatbotService] - Attempting to process prompt 'What is the capital of France?' with primary LLM. Attempt #3
[2025-09-30 12:56:07] [WARN] [com.baeldung.multillm.ChatbotService] - Primary LLM failure. Error received: HTTP 404 - {
    "error": {
        "message": "The model `gpt-100` does not exist or you do not have access to it.",
        "type": "invalid_request_error",
        "param": null,
        "code": "model_not_found"
    }
}
[2025-09-30 12:56:07] [DEBUG] [com.baeldung.multillm.ChatbotService] - Attempting to process prompt 'What is the capital of France?' with secondary LLM
[2025-09-30 12:56:07] [WARN] [com.baeldung.multillm.ChatbotService] - Secondary LLM failure: HTTP 404 - {"type":"error","error":{"type":"not_found_error","message":"model: claude-opus-200"},"request_id":"req_011CTeBrAY8rstsSPiJyv3sj"}
[2025-09-30 12:56:07] [DEBUG] [com.baeldung.multillm.ChatbotService] - Attempting to process prompt 'What is the capital of France?' with tertiary LLM

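The logs clearly show how the request was processed.

The primary LLM fails on three consecutive attempts. The service then tries the secondary LLM, which also fails. Finally, it calls the tertiary LLM, which processes the prompt and returns the response we saw.

This shows that the fallback mechanism works as designed: the chatbot stays available even when several LLMs fail at the same time.

5. Conclusion

In this article, we explored how to integrate multiple LLMs into a single Spring AI application. First, we showed how Spring AI's abstraction layer simplifies configuring models from different providers, such as OpenAI and Anthropic. Next, we tackled a more complex scenario: configuring multiple models from the same provider, creating custom beans where Spring AI's auto-configuration falls short. Finally, we used the multi-model configuration to build a resilient, highly available chatbot. With Spring Retry, we implemented a cascading fallback pattern that automatically switches between LLMs when failures occur.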
