LLM Wrapper
The LLM Wrapper is a unified API proxy that provides a consistent interface for interacting with multiple LLM providers (OpenAI, Anthropic, Ollama, etc.).
Overview
This service acts as a middleware layer that:
- Provides a unified API endpoint for all LLM interactions
- Handles provider-specific authentication and API differences
- Supports key rotation and fallback strategies
- Enables consistent logging and monitoring across all LLM calls
Quick Start
Prerequisites
- Node.js >= 20.0
- API keys for desired providers (OpenAI, Anthropic, etc.)
Installation
git clone https://github.com/FinAI-Temp/LLM-Wrapper.git
cd LLM-Wrapper
npm install
Configuration
Create a .env file in the root directory:
# Provider API Keys
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
# Server Configuration
PORT=3000
LOG_LEVEL=info
# Provider Selection (comma-separated, in priority order)
ACTIVE_PROVIDERS=openai,anthropic
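As an illustration of how the comma-separated, priority-ordered ACTIVE_PROVIDERS value could be interpreted, here is a minimal sketch. The helper name parseActiveProviders, and the choice to lowercase entries and drop blanks and duplicates, are assumptions made for this example and are not taken from the wrapper's source.

```typescript
// Hypothetical helper: turns "openai, anthropic" into ["openai", "anthropic"].
// Order is preserved because the list is documented as priority-ordered.
function parseActiveProviders(raw: string | undefined): string[] {
  if (!raw) return [];
  const providers: string[] = [];
  for (const part of raw.split(",")) {
    const id = part.trim().toLowerCase();
    // Skip empty entries (e.g. trailing commas) and repeated providers.
    if (id.length === 0 || providers.indexOf(id) !== -1) continue;
    providers.push(id);
  }
  return providers;
}

// In the real service the raw string would come from the environment.
const activeProviders = parseActiveProviders("openai,anthropic");
console.log(activeProviders);
```

The first element of the returned array is the provider that would be tried first.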
Running the Service
# Development
npm run dev
# Production
npm start
API Reference
Unified Endpoint
POST /api/chat/completions
This unified chat completions endpoint routes each request to one of the configured LLM providers.
Request Body
{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "What is the current market sentiment?"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 1000
}
Response
{
  "id": "chatcmpl-xxx",
  "model": "gpt-4o",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Based on recent market data..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 50,
    "completion_tokens": 150,
    "total_tokens": 200
  }
}
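For TypeScript clients, the request and response bodies shown above can be captured as types. This is a sketch only: the type names are hypothetical, the fields mirror the JSON samples, and treating temperature and max_tokens as optional is an assumption.

```typescript
type ChatRole = "system" | "user" | "assistant";

interface ChatMessage {
  role: ChatRole;
  content: string;
}

interface ChatCompletionRequest {
  model: string;
  messages: ChatMessage[];
  temperature?: number; // assumed optional
  max_tokens?: number; // assumed optional
}

interface ChatCompletionResponse {
  id: string;
  model: string;
  choices: { message: ChatMessage; finish_reason: string }[];
  usage: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
}

// Returns the first choice's text, or null when the response has no choices.
function firstMessageContent(res: ChatCompletionResponse): string | null {
  const first = res.choices[0];
  return first ? first.message.content : null;
}

// The sample response from this section, typed.
const sampleResponse: ChatCompletionResponse = {
  id: "chatcmpl-xxx",
  model: "gpt-4o",
  choices: [
    {
      message: { role: "assistant", content: "Based on recent market data..." },
      finish_reason: "stop",
    },
  ],
  usage: { prompt_tokens: 50, completion_tokens: 150, total_tokens: 200 },
};

console.log(firstMessageContent(sampleResponse));
```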
Supported Models
| Provider | Models |
|---|---|
| OpenAI | gpt-4o, gpt-4-turbo, gpt-3.5-turbo |
| Anthropic | claude-3-5-sonnet, claude-3-opus, claude-3-haiku |
| Ollama | llama3, mistral, codellama |
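One way a router could map a requested model to its provider is a lookup over the table above. The sketch below copies the model names from the table; the constant SUPPORTED_MODELS and the function providerForModel are hypothetical and do not describe the wrapper's actual routing code.

```typescript
// Model lists copied from the Supported Models table; the lookup is illustrative.
const SUPPORTED_MODELS: { [provider: string]: string[] } = {
  openai: ["gpt-4o", "gpt-4-turbo", "gpt-3.5-turbo"],
  anthropic: ["claude-3-5-sonnet", "claude-3-opus", "claude-3-haiku"],
  ollama: ["llama3", "mistral", "codellama"],
};

// Returns the provider id that lists the model, or null if no provider does.
function providerForModel(model: string): string | null {
  for (const provider of Object.keys(SUPPORTED_MODELS)) {
    const models = SUPPORTED_MODELS[provider] ?? [];
    if (models.indexOf(model) !== -1) return provider;
  }
  return null;
}

console.log(providerForModel("claude-3-opus"));
```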
Adding Providers
To add a new provider:
- Create a new adapter in src/adapters/ following the existing adapter pattern
- Implement the LLMAdapter interface with complete() and stream() methods
- Register the provider in src/providers/registry.ts
- Add configuration options in src/config/
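The steps above can be sketched as follows. Only the names LLMAdapter, complete() and stream() come from this document; the parameter and return types, the EchoAdapter class, and the registerAdapter and getAdapter helpers are assumptions made for illustration, and the real interface in the repository may differ.

```typescript
// Hypothetical request shape; the real adapter input type may differ.
interface AdapterRequest {
  model: string;
  messages: { role: string; content: string }[];
}

// Assumed signatures for the two methods the docs name.
interface LLMAdapter {
  complete(request: AdapterRequest): Promise<string>;
  stream(request: AdapterRequest, onChunk: (text: string) => void): Promise<void>;
}

// Toy adapter that echoes the last message; stands in for a real provider client.
class EchoAdapter implements LLMAdapter {
  async complete(request: AdapterRequest): Promise<string> {
    const last = request.messages[request.messages.length - 1];
    return last ? last.content : "";
  }

  async stream(request: AdapterRequest, onChunk: (text: string) => void): Promise<void> {
    const text = await this.complete(request);
    // Emit the reply word by word to mimic incremental delivery.
    for (const word of text.split(" ")) onChunk(word);
  }
}

// Minimal registry keyed by provider id.
const adapterRegistry: { [providerId: string]: LLMAdapter } = {};

function registerAdapter(providerId: string, adapter: LLMAdapter): void {
  if (adapterRegistry[providerId]) {
    throw new Error(`Provider already registered: ${providerId}`);
  }
  adapterRegistry[providerId] = adapter;
}

function getAdapter(providerId: string): LLMAdapter {
  const adapter = adapterRegistry[providerId];
  if (!adapter) throw new Error(`Unknown provider: ${providerId}`);
  return adapter;
}

registerAdapter("echo", new EchoAdapter());
```

Keeping the registry as a plain map from provider id to adapter means the router only needs the id from the configuration to find the right implementation.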
Key Management
Setting API Keys
API keys are loaded from environment variables at startup. Never commit API keys to version control.
Key Rotation
The LLM Wrapper supports automatic key rotation:
- Configure multiple API keys for a provider (comma-separated in env)
- The wrapper automatically rotates keys on rate limit errors (429)
- Configure rotation strategy via the KEY_ROTATION_STRATEGY env var:
  - round-robin (default): Sequential rotation
  - random: Random key selection
  - failover: Use next key only on failure
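To make the three strategies concrete, here is a minimal sketch of a key rotator. Only the strategy names and their one-line descriptions come from this document; the KeyRotator class, its method names, and the injectable random source are hypothetical and are not the wrapper's actual implementation.

```typescript
type RotationStrategy = "round-robin" | "random" | "failover";

// Hypothetical rotator over the API keys configured for one provider.
class KeyRotator {
  private readonly keys: string[];
  private readonly strategy: RotationStrategy;
  private readonly rng: () => number;
  private index = 0;

  constructor(keys: string[], strategy: RotationStrategy = "round-robin", rng: () => number = Math.random) {
    if (keys.length === 0) throw new Error("At least one API key is required");
    this.keys = keys.slice();
    this.strategy = strategy;
    this.rng = rng; // injectable so the random strategy can be tested
  }

  // Key to use for the next request.
  next(): string {
    if (this.strategy === "random") {
      this.index = Math.floor(this.rng() * this.keys.length);
    }
    const key = this.keys[this.index] as string;
    if (this.strategy === "round-robin") {
      // Sequential rotation: advance after every request.
      this.index = (this.index + 1) % this.keys.length;
    }
    return key;
  }

  // Call after a failure such as HTTP 429. Only "failover" needs an explicit
  // advance; the other strategies pick a new key on the next call anyway.
  reportFailure(): void {
    if (this.strategy === "failover") {
      this.index = (this.index + 1) % this.keys.length;
    }
  }
}

const rotator = new KeyRotator(["key-a", "key-b"], "failover");
console.log(rotator.next()); // stays on the first key until a failure is reported
```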
Monitoring
Monitor key usage and rotation via the /api/health endpoint, which reports:
- Current active key index per provider
- Request counts per key
- Error rates per key
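A monitoring script could use these figures to flag keys that fail too often. The payload shape below is purely illustrative: the field names activeKeyIndex, requests and errors are assumptions, since this document does not specify the actual /api/health schema.

```typescript
// Hypothetical per-key counters; not the real /api/health schema.
interface KeyStats {
  requests: number;
  errors: number;
}

// Hypothetical per-provider health entry.
interface ProviderHealth {
  activeKeyIndex: number;
  keys: KeyStats[];
}

// Fraction of requests that failed; 0 when the key has not been used yet.
function keyErrorRate(stats: KeyStats): number {
  return stats.requests === 0 ? 0 : stats.errors / stats.requests;
}

// Indices of keys whose error rate exceeds the given threshold.
function unhealthyKeyIndices(health: ProviderHealth, threshold: number): number[] {
  const result: number[] = [];
  health.keys.forEach((stats, i) => {
    if (keyErrorRate(stats) > threshold) result.push(i);
  });
  return result;
}

const openaiHealth: ProviderHealth = {
  activeKeyIndex: 0,
  keys: [
    { requests: 100, errors: 2 },
    { requests: 50, errors: 25 },
    { requests: 0, errors: 0 },
  ],
};

console.log(unhealthyKeyIndices(openaiHealth, 0.1));
```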
Integration with News Analyzer
The News Analyzer service integrates with the LLM Wrapper using the unified endpoint.
Integration Pattern
const response = await fetch('http://llm-wrapper:3000/api/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gpt-4o',
    messages: [
      {
        role: 'system',
        content: 'You are a financial news analyst. Analyze the following article...'
      },
      {
        role: 'user',
        content: articleContent
      }
    ],
    temperature: 0.3,
    max_tokens: 500
  })
});
Conventions
- Model Selection: News Analyzer uses gpt-4o for standard analysis and claude-3-5-sonnet for complex reasoning
- Temperature: Typically 0.1-0.3 for analytical tasks (more deterministic)
- Max Tokens: 500-1000 for typical news summaries, adjust based on expected output length
- Timeout: 30-second timeout for all LLM calls; implement retry logic with exponential backoff
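These conventions could be centralized in a small request builder so that callers do not repeat them. The sketch below is an assumption about how a client might do this: the names AnalysisKind, buildAnalysisRequest and ANALYSIS_TIMEOUT_MS are hypothetical, while the models, temperature, token budget and system prompt are taken from this section and the Integration Pattern example.

```typescript
type AnalysisKind = "standard" | "complex";

interface AnalysisRequest {
  model: string;
  messages: { role: string; content: string }[];
  temperature: number;
  max_tokens: number;
}

// 30-second timeout from the conventions; enforcing it is left to the caller.
const ANALYSIS_TIMEOUT_MS = 30000;

// Builds a request body that follows the documented conventions.
function buildAnalysisRequest(
  articleContent: string,
  kind: AnalysisKind = "standard",
  maxTokens: number = 500,
): AnalysisRequest {
  return {
    // gpt-4o for standard analysis, claude-3-5-sonnet for complex reasoning.
    model: kind === "complex" ? "claude-3-5-sonnet" : "gpt-4o",
    messages: [
      {
        role: "system",
        content: "You are a financial news analyst. Analyze the following article...",
      },
      { role: "user", content: articleContent },
    ],
    temperature: 0.3, // within the 0.1-0.3 range used for analytical tasks
    max_tokens: maxTokens, // 500-1000 is typical for news summaries
  };
}

console.log(buildAnalysisRequest("Example article text", "complex").model);
```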
Error Handling
When the LLM Wrapper returns an error:
- Rate Limit (429): Wait and retry with exponential backoff (max 3 retries)
- Auth Error (401/403): Log and alert; do not retry with the same key
- Server Error (5xx): Retry on a different provider if one is available
- Timeout: Retry once, then fail gracefully, returning a cached response if one is available
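The rules above can be sketched as a status classifier plus a backoff helper. Everything here is illustrative: the action labels, the function names, and the 500 ms base delay with an 8 s cap are assumptions, since this document only specifies exponential backoff with at most 3 retries.

```typescript
type ErrorAction =
  | "retry-with-backoff"
  | "alert-no-retry"
  | "retry-other-provider"
  | "retry-once-then-cache"
  | "fail";

// Maps a failure to the handling rule listed above.
function actionForFailure(status: number | "timeout"): ErrorAction {
  if (status === "timeout") return "retry-once-then-cache";
  if (status === 429) return "retry-with-backoff";
  if (status === 401 || status === 403) return "alert-no-retry";
  if (status >= 500) return "retry-other-provider";
  return "fail";
}

const MAX_RATE_LIMIT_RETRIES = 3;

// Delay before retry number `attempt` (1-based): base, 2x base, 4x base, up to the cap.
function backoffDelayMs(attempt: number, baseMs: number = 500, capMs: number = 8000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt - 1));
}

// Repeats `call` while it reports HTTP 429, sleeping between attempts.
// `sleep` is injected so callers (and tests) control how waiting happens.
async function callWithRateLimitRetry<T>(
  call: () => Promise<{ status: number; body?: T }>,
  sleep: (ms: number) => Promise<void>,
): Promise<{ status: number; body?: T }> {
  let result = await call();
  for (let attempt = 1; attempt <= MAX_RATE_LIMIT_RETRIES && result.status === 429; attempt++) {
    await sleep(backoffDelayMs(attempt));
    result = await call();
  }
  return result; // still 429 here means all retries were exhausted
}
```

In production code the sleep argument would typically wrap setTimeout, and adding random jitter to the delay helps avoid many clients retrying at the same instant.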
Troubleshooting
Common Issues
Connection Refused
- Ensure the LLM Wrapper service is running
- Check that the port matches the configured PORT value
Invalid API Key
- Verify the API key is correct and has not expired
- Check that the key has necessary permissions for the requested model
Rate Limiting
- Reduce request frequency
- Consider adding more API keys for rotation
- Implement request queuing for high-volume scenarios
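As one possible form of request queuing, a client can cap how many LLM calls are in flight at once. The RequestQueue class below is a hypothetical sketch of a concurrency limiter and is not part of the LLM Wrapper itself.

```typescript
// Hypothetical concurrency limiter: at most `maxConcurrent` tasks run at once,
// the rest wait in FIFO order.
class RequestQueue {
  private readonly maxConcurrent: number;
  private running = 0;
  private readonly waiting: (() => void)[] = [];

  constructor(maxConcurrent: number) {
    if (maxConcurrent < 1) throw new Error("maxConcurrent must be at least 1");
    this.maxConcurrent = maxConcurrent;
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.running >= this.maxConcurrent) {
      // No free slot: wait until a finishing task hands its slot over.
      await new Promise<void>((resolve) => {
        this.waiting.push(resolve);
      });
    } else {
      this.running++;
    }
    try {
      return await task();
    } finally {
      const next = this.waiting.shift();
      if (next) {
        next(); // pass the slot directly to the next waiter
      } else {
        this.running--; // nobody waiting: free the slot
      }
    }
  }
}

const llmQueue = new RequestQueue(2);
llmQueue.run(async () => "first call").then((text) => console.log(text));
```

Handing the slot directly to the next waiter, rather than decrementing and re-incrementing the counter, keeps a newly arriving request from jumping ahead of requests that are already queued.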
Architecture
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Client    │────▶│ LLM Wrapper │────▶│   OpenAI    │
└─────────────┘     │             │     └─────────────┘
                    │  Router &   │     ┌─────────────┐
                    │  Fallback   │────▶│  Anthropic  │
                    │             │     └─────────────┘
                    │  Key Mgmt   │     ┌─────────────┐
                    │             │────▶│   Ollama    │
                    └─────────────┘     └─────────────┘
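The Router & Fallback box in the diagram can be illustrated with a short sketch that tries providers in priority order and moves on when one fails. The function completeWithFallback and the ProviderAttempt shape are hypothetical; the stub providers below stand in for real adapter calls.

```typescript
// Hypothetical description of one provider the router may try.
interface ProviderAttempt {
  id: string;
  call: () => Promise<string>;
}

// Tries each provider in order and returns the first successful result.
// Throws an error that lists every failure if all providers fail.
async function completeWithFallback(
  providers: ProviderAttempt[],
): Promise<{ providerId: string; text: string }> {
  const errors: string[] = [];
  for (const provider of providers) {
    try {
      return { providerId: provider.id, text: await provider.call() };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      errors.push(`${provider.id}: ${reason}`);
    }
  }
  throw new Error(`All providers failed (${errors.join("; ")})`);
}

// Usage with stubs: the first provider fails, so the second one answers.
completeWithFallback([
  { id: "openai", call: async () => { throw new Error("503 Service Unavailable"); } },
  { id: "anthropic", call: async () => "Based on recent market data..." },
]).then((result) => console.log(result.providerId));
```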