
# LangChain Integration: Using Your Custom LLM Class

*The Framework Bridge. Learn how to plug your private fine-tuned model into the LangChain ecosystem by writing a custom LLM provider class.*
You have your fine-tuned model running on a FastAPI server (Module 13). Now you want to use it inside a complex application with chains, routers, and pre-processors.
Most people use LangChain for this. While LangChain has built-in support for OpenAI and Anthropic, it doesn't know about your specific model. To use it, you need to create a Custom LLM Wrapper. This allows LangChain to "speak" to your fine-tuned model just as if it were GPT-4.
In this lesson, we will write the Python class that connects your private intelligence to the world's most popular AI framework.
## 1. Why Wrap Your Model?
- **Uniform Interface**: You can swap your fine-tuned model for GPT-4 with a single line of code for testing (see the sketch after this list).
- **Tool Access**: Once your model is a "LangChain LLM," it can automatically use hundreds of LangChain tools (Google Search, Python REPL, SQL).
- **Observability**: LangChain (and LangSmith) can track the performance and latency of your custom model automatically.
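To make the first point concrete, here is a minimal sketch of the swap, assuming the `langchain-openai` package is installed. `FineTunedCustomLLM` is the wrapper class we build in Section 3 below.

```python
from langchain_openai import ChatOpenAI

# Hosted baseline for comparison testing:
# llm = ChatOpenAI(model="gpt-4")

# Your private model -- the only line that changes:
llm = FineTunedCustomLLM(model_url="http://localhost:8080", model_name="my-specialist")

# Every downstream component (chains, prompts, agents) stays identical.
```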
## 2. The Custom LLM Blueprint
To integrate with LangChain, you create a class that inherits from `BaseChatModel`. You only need to implement one main method, `_generate`, plus a small `_llm_type` property that identifies your model.
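In skeleton form, the minimum contract looks like this; it is a sketch assuming a recent `langchain-core`, and the full implementation follows in Section 3.

```python
from typing import Any, List, Optional

from langchain_core.language_models import BaseChatModel
from langchain_core.messages import BaseMessage
from langchain_core.outputs import ChatResult


class MySkeletonLLM(BaseChatModel):
    def _generate(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        run_manager: Optional[Any] = None,
        **kwargs: Any,
    ) -> ChatResult:
        ...  # call your backend and return a ChatResult

    @property
    def _llm_type(self) -> str:
        return "my-skeleton-llm"
```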
### Visualizing the Framework Interop
```mermaid
graph LR
    A["LangChain Components (Chains, Agents)"] --> B["Your Custom LLM Class"]
    B --> C["HTTP Request (FastAPI)"]
    C --> D["your-fine-tuned-model (vLLM)"]
    subgraph "The Framework Bridge"
        B
    end
    D --> E["Response Output"]
    E --> B
    B --> A
```
## 3. Implementation: The LangChain Wrapper
Here is the wrapper code that connects LangChain to your microservice. It forwards the latest message to your server's `/generate` endpoint and wraps the response in LangChain's message types.
```python
from typing import Any, List, Optional

import requests
from langchain_core.language_models import BaseChatModel
from langchain_core.messages import AIMessage, BaseMessage
from langchain_core.outputs import ChatGeneration, ChatResult


class FineTunedCustomLLM(BaseChatModel):
    model_url: str
    model_name: str

    def _generate(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        run_manager: Optional[Any] = None,
        **kwargs: Any,
    ) -> ChatResult:
        # 1. Prepare the payload for your FastAPI wrapper (Module 13).
        #    Note: this simple version only forwards the latest message,
        #    not the full conversation history.
        last_message = messages[-1].content
        payload = {
            "prompt": last_message,
            "max_tokens": 500,
            "temperature": 0.0,
        }

        # 2. Make the HTTP call to your server (with a timeout so a hung
        #    server cannot stall the whole chain).
        response = requests.post(
            f"{self.model_url}/generate", json=payload, timeout=60
        )
        response.raise_for_status()
        data = response.json()

        # 3. Wrap the generated text in an AIMessage.
        message = AIMessage(content=data["data"])
        generation = ChatGeneration(message=message)
        return ChatResult(generations=[generation])

    @property
    def _llm_type(self) -> str:
        return "custom_fine_tuned_model"


# Usage
llm = FineTunedCustomLLM(model_url="http://localhost:8080", model_name="my-specialist")
print(llm.invoke("What is the status of project X?").content)
```
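The wrapper assumes your server exposes `POST /generate`, accepting `{"prompt", "max_tokens", "temperature"}` and returning `{"data": "<generated text>"}`, as built in Module 13. If you want to exercise the wrapper without the real model server, a hypothetical echo stub with the same contract is enough:

```python
# Hypothetical local stub matching the contract the wrapper expects.
# Run with: uvicorn stub:app --port 8080
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class GenerateRequest(BaseModel):
    prompt: str
    max_tokens: int = 500
    temperature: float = 0.0


@app.post("/generate")
def generate(req: GenerateRequest):
    # Stand-in for the real vLLM call.
    return {"data": f"Echo: {req.prompt}"}
```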
## 4. Why `BaseChatModel` vs. `BaseLLM`?
- **`BaseLLM`**: For simple text-in, text-out completion models (legacy).
- **`BaseChatModel`**: For models that use System/User/Assistant roles. Since we fine-tuned our model on conversation data in Module 8, you should almost always use `BaseChatModel`. This also lets you use LangChain's powerful `ChatPromptTemplate` features (see the sketch after this list).
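For example, here is a minimal sketch of the wrapper driving an LCEL chain through `ChatPromptTemplate`, reusing the `llm` instance from Section 3:

```python
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a concise project-status assistant."),
    ("human", "{question}"),
])

# Pipe the prompt into our custom model, exactly as you would with ChatOpenAI.
chain = prompt | llm
print(chain.invoke({"question": "What is the status of project X?"}).content)
```

One caveat: the simple `_generate` above forwards only the latest message, so the system prompt never reaches the server; handling the full message list is the natural next refinement.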
## Summary and Key Takeaways
- **Custom Wrappers** are the glue that connects your private model to the LangChain ecosystem.
- **Interchangeability**: You can now use your model anywhere LangChain expects an LLM.
- **HTTP Proxy**: Your LangChain class acts as a client for your FastAPI inference server.
- **Legacy Warning**: Always prefer `BaseChatModel` over the older `BaseLLM` for modern, instruction-tuned models.
In the next lesson, we will look at a more advanced agentic structure: *LangGraph and Agents: Specializing the Reasoning Loop*.
## Reflection Exercise
- If you update your model weights but keep the FastAPI server running at the same URL, do you need to change your LangChain code?
- Why is it useful to inherit from `BaseChatModel` even if you only plan to use one single prompt? (Hint: think about future features like streaming and memory.)