Building an AI Gateway in ASP.NET Core: OpenAI, Claude, Gemini, and Ollama Behind One Interface

Most AI applications are tightly coupled to a single provider. This guide shows how to build a production-ready AI Gateway in ASP.NET Core that supports OpenAI, Claude, Gemini, and Ollama through a unified abstraction layer, enabling provider switching, fallback strategies, cost optimization, and better maintainability.

Most AI tutorials assume a single provider.

Production systems don't.

Different models have different strengths.

GPT-4o is excellent for reasoning and code generation.

Claude performs exceptionally well with large documents and long-context analysis.

Gemini offers strong multimodal capabilities and cost-effective models for high-volume workloads.

Ollama enables local inference and offline development.

The challenge isn't choosing the perfect provider.

The challenge is building an application that can switch providers without rewriting business logic.

If your controllers directly depend on OpenAI, you've already introduced vendor lock-in.

In this article, we'll build an AI Gateway in ASP.NET Core that abstracts multiple providers behind a common interface.

The result is a flexible architecture that allows provider changes through configuration rather than code changes.

What We'll Build

Our architecture looks like this:

                     Client
                        │
                ASP.NET Core API
                        │
                   AI Gateway
                        │
      ┌───────────┬─────┴─────┬───────────┐
      │           │           │           │
   OpenAI      Claude      Gemini      Ollama
      │           │           │           │
   GPT-4o      Sonnet     2.0 Flash   Llama 3.2

Instead of coupling application logic to a specific provider, the gateway becomes the only component responsible for provider selection and communication.

Everything else remains provider-agnostic.

Why Build an AI Gateway?

Many projects start like this:

Controller

↓

OpenAI SDK

It works.

Until requirements change.

Consider these scenarios:

OpenAI experiences an outage.
Costs increase.
Another model performs better.
Customers require local inference.
Regulatory requirements prohibit cloud providers.

Suddenly the application must support multiple providers.

If provider-specific code exists throughout the codebase, the migration becomes expensive.

A better design looks like this:

Controller

↓

IAIProvider

↓

Provider Factory

↓

OpenAI

Claude

Gemini

Ollama

Now provider selection becomes an implementation detail.

Controllers remain unchanged.

Business logic remains unchanged.

Only the gateway evolves.

Designing the Provider Contract

Every provider must expose the same interface.

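using System.Threading.Tasks;

// The single contract every provider adapter implements.
// ChatRequest and ChatResponse are the gateway's own provider-neutral types.
public interface IAIProvider
{
    Task<ChatResponse> GenerateAsync(
        ChatRequest request);
}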

This is the most important architectural decision in the entire solution.

The interface defines what the application needs.

Not how the provider works.

OpenAI, Claude, Gemini, and Ollama all have different APIs.

They all return different payloads.

They all expose different features.

The gateway hides those differences.

The rest of the application interacts with a single contract.
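As an illustration, here is a minimal sketch of what one adapter behind that contract might look like, using Ollama's local HTTP API. The ChatRequest shape (Prompt and Model) is an assumption, since the article does not define it, and ChatResponse is trimmed to two properties. Error handling and streaming are left out.

```csharp
using System.Net.Http;
using System.Net.Http.Json;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical request shape; the article does not define ChatRequest.
public record ChatRequest(string Prompt, string Model = "llama3.2");

// Trimmed copy of the article's ChatResponse, enough for this sketch.
public class ChatResponse
{
    public string Answer { get; set; } = "";
    public string Provider { get; set; } = "";
}

public interface IAIProvider
{
    Task<ChatResponse> GenerateAsync(ChatRequest request);
}

// Adapter for a local Ollama server. The HttpClient is expected to have
// BaseAddress set, for example http://localhost:11434/ (Ollama's default port).
public class OllamaProvider : IAIProvider
{
    private readonly HttpClient _http;

    public OllamaProvider(HttpClient http) => _http = http;

    public async Task<ChatResponse> GenerateAsync(ChatRequest request)
    {
        // stream = false asks Ollama for one JSON object instead of a stream.
        var payload = new { model = request.Model, prompt = request.Prompt, stream = false };

        using var reply = await _http.PostAsJsonAsync("api/generate", payload);
        reply.EnsureSuccessStatusCode();

        var json = await reply.Content.ReadFromJsonAsync<JsonElement>();

        // Translate the provider-specific payload into the shared contract.
        return new ChatResponse
        {
            Answer = json.GetProperty("response").GetString() ?? "",
            Provider = "Ollama"
        };
    }
}
```

The caller only ever sees ChatResponse. The JSON property names and the endpoint stay inside the adapter.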

Why Every Provider Implements the Same Contract

Without a common interface:

Controller

↓

OpenAI Logic

Claude Logic

Gemini Logic

Ollama Logic

The controller becomes responsible for provider management.

This violates separation of concerns.

With a shared interface:

Controller

↓

IAIProvider

The controller doesn't know which provider generated the response.

And it shouldn't.

This follows the Dependency Inversion Principle and keeps the system maintainable.
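A controller written against the contract could look roughly like the sketch below. The route matches the POST /api/chat endpoint mentioned later in the article. ChatController and the ChatRequest shape are assumptions for illustration.

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

// Minimal stand-ins so the sketch compiles on its own.
// ChatRequest's shape is an assumption; the article does not define it.
public record ChatRequest(string Prompt);
public class ChatResponse { public string Answer { get; set; } = ""; }

public interface IAIProvider
{
    Task<ChatResponse> GenerateAsync(ChatRequest request);
}

[ApiController]
[Route("api/chat")]
public class ChatController : ControllerBase
{
    private readonly IAIProvider _provider;

    // The DI container decides which implementation arrives here.
    public ChatController(IAIProvider provider) => _provider = provider;

    [HttpPost]
    public async Task<ActionResult<ChatResponse>> Post(ChatRequest request)
    {
        // No provider-specific types or branching in the controller.
        var response = await _provider.GenerateAsync(request);
        return Ok(response);
    }
}
```

Nothing in this class names OpenAI, Claude, Gemini, or Ollama, so swapping providers does not touch it.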

Implementing the Factory Pattern

Once multiple providers exist, something must decide which provider to use.

This is where the Factory Pattern fits naturally.

Request

↓

Provider Factory

↓

Selected Provider

The factory resolves the correct implementation based on configuration or business rules.
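One possible shape for such a factory is sketched below. AIProviderFactory and its Resolve method are hypothetical names, and the provider map is passed in directly to keep the example small. A real gateway would likely populate it from the DI container.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public record ChatRequest(string Prompt); // hypothetical shape
public class ChatResponse { public string Answer { get; set; } = ""; }

public interface IAIProvider
{
    Task<ChatResponse> GenerateAsync(ChatRequest request);
}

// Hypothetical factory: maps a provider name to a registered implementation.
public class AIProviderFactory
{
    private readonly IReadOnlyDictionary<string, IAIProvider> _providers;
    private readonly string _defaultProvider;

    public AIProviderFactory(
        IDictionary<string, IAIProvider> providers,
        string defaultProvider)
    {
        // Case-insensitive keys so "openai" and "OpenAI" resolve identically.
        _providers = new Dictionary<string, IAIProvider>(
            providers, StringComparer.OrdinalIgnoreCase);
        _defaultProvider = defaultProvider;
    }

    // Returns the named provider, or the configured default when no name is given.
    public IAIProvider Resolve(string? name = null)
    {
        var key = string.IsNullOrWhiteSpace(name) ? _defaultProvider : name;

        return _providers.TryGetValue(key, out var provider)
            ? provider
            : throw new InvalidOperationException(
                $"No AI provider registered as '{key}'.");
    }
}
```

Throwing on an unknown name, rather than silently falling back to the default, keeps a configuration typo from routing traffic to the wrong provider.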

Real-World Provider Selection Scenarios

Provider selection is rarely random.

Examples:

Configuration-Based

Default Provider

↓

OpenAI

Useful for most applications.

User Tier Routing

Premium User

↓

GPT-4o

Free User

↓

Gemini Flash

This balances quality and cost.

Offline Development

Development Environment

↓

Ollama

No API costs.

No internet dependency.

Excellent for local development.

Long Context Workloads

Document Analysis

↓

Claude Sonnet

Provider selection becomes a business decision rather than a technical limitation.
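The four scenarios above can be expressed as a small set of ordered rules. The sketch below is one way to do it. RoutingContext, UserTier, Workload, and ProviderRouter are hypothetical names, and the priority order (environment first, then workload, then tier) is an assumption rather than something the article prescribes.

```csharp
public enum UserTier { Free, Premium }
public enum Workload { General, DocumentAnalysis }

// Hypothetical routing context; none of these names come from a library.
public record RoutingContext(UserTier Tier, Workload Workload, bool IsDevelopment);

public static class ProviderRouter
{
    // Rules are checked in priority order and return a provider key
    // that a factory can resolve.
    public static string SelectProvider(RoutingContext context)
    {
        // Offline development: stay local, no API cost.
        if (context.IsDevelopment) return "Ollama";

        // Long-context workloads go to Claude regardless of tier.
        if (context.Workload == Workload.DocumentAnalysis) return "Claude";

        // Tier routing: premium users get GPT-4o, free users Gemini Flash.
        return context.Tier == UserTier.Premium ? "OpenAI" : "Gemini";
    }
}
```

Because the rules live in one method, changing the business policy means editing this selector and nothing else.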

Configuration-Driven Provider Selection

Hardcoded provider selection is another form of lock-in.

Avoid:

new OpenAIProvider()

Instead:

{
  "AI": {
    "DefaultProvider": "OpenAI"
  }
}

The active provider becomes configuration.

Switching providers becomes a deployment decision rather than a development project.
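To read that setting in ASP.NET Core, the "AI" section can be bound to a typed options class. The sketch below assumes a hypothetical AiOptions class and AddAiOptions extension method. The Configure and GetSection calls are the standard options and configuration APIs.

```csharp
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical options type mirroring the "AI" section shown above.
public class AiOptions
{
    public string DefaultProvider { get; set; } = "OpenAI";
}

public static class AiConfigurationExtensions
{
    // Binds the "AI" configuration section so that services can take
    // IOptions<AiOptions> as a constructor dependency.
    public static IServiceCollection AddAiOptions(
        this IServiceCollection services,
        IConfiguration configuration)
    {
        return services.Configure<AiOptions>(configuration.GetSection("AI"));
    }
}
```

In Program.cs this would be called as builder.Services.AddAiOptions(builder.Configuration). Because environment variables are part of the default configuration sources, setting AI__DefaultProvider=Claude at deploy time overrides the JSON value without a rebuild.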

Dependency Injection

IAIProvider

↓

OpenAIProvider

ClaudeProvider

GeminiProvider

OllamaProvider

The factory receives provider implementations through Dependency Injection.

Benefits include:

Loose coupling
Better testing
Easier maintenance
Simpler extension

Adding a new provider should require registration, not refactoring.
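One way to get there is keyed service registration, available in Microsoft.Extensions.DependencyInjection from .NET 8. The sketch below uses placeholder adapters and a hypothetical AddAIProviders extension method. On earlier versions, registering each adapter as IAIProvider and injecting IEnumerable<IAIProvider> is a common alternative.

```csharp
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;

public record ChatRequest(string Prompt); // hypothetical shape
public class ChatResponse { public string Provider { get; set; } = ""; }

public interface IAIProvider
{
    Task<ChatResponse> GenerateAsync(ChatRequest request);
}

// Placeholder adapters; real ones would call the provider's API.
public class OpenAIProvider : IAIProvider
{
    public Task<ChatResponse> GenerateAsync(ChatRequest request) =>
        Task.FromResult(new ChatResponse { Provider = "OpenAI" });
}

public class OllamaProvider : IAIProvider
{
    public Task<ChatResponse> GenerateAsync(ChatRequest request) =>
        Task.FromResult(new ChatResponse { Provider = "Ollama" });
}

public static class AIGatewayRegistration
{
    // Each provider is registered under the same name used in configuration.
    public static IServiceCollection AddAIProviders(this IServiceCollection services)
    {
        services.AddKeyedSingleton<IAIProvider, OpenAIProvider>("OpenAI");
        services.AddKeyedSingleton<IAIProvider, OllamaProvider>("Ollama");
        return services;
    }
}
```

A factory can then call serviceProvider.GetRequiredKeyedService<IAIProvider>(name) with the configured provider name. Supporting a new provider is one more AddKeyedSingleton line.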

Standardizing Responses

One of the biggest challenges is response normalization.

Each provider returns different structures.

OpenAI:

{
  "choices": [...]
}

Claude:

{
  "content": [...]
}

Gemini:

{
  "candidates": [...]
}

Ollama:

{
  "response": "..."
}

The rest of the application shouldn't care.

Create a common response model.

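// Provider-neutral result returned by every IAIProvider implementation.
public class ChatResponse
{
    // The generated text.
    public string Answer { get; set; }

    // Token counts reported by the provider (the Usage type is defined elsewhere).
    public Usage Usage { get; set; }

    // Why generation stopped, mapped from the provider's own field.
    public string FinishReason { get; set; }

    // Name of the provider that produced this response.
    public string Provider { get; set; }

    // Elapsed time of the provider call; milliseconds is assumed here.
    public long Latency { get; set; }
}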

Now every provider returns the same shape.

This simplifies the entire application.
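The mapping itself can be sketched with System.Text.Json. The property paths below follow the publicly documented response formats for OpenAI Chat Completions, the Anthropic Messages API, Gemini generateContent, and Ollama's /api/generate endpoint. ResponseNormalizer is a hypothetical helper, ChatResponse is trimmed, and only the first choice, content block, or candidate is read.

```csharp
using System.Text.Json;

// Trimmed copy of the article's ChatResponse.
public class ChatResponse
{
    public string Answer { get; set; } = "";
    public string FinishReason { get; set; } = "";
    public string Provider { get; set; } = "";
}

public static class ResponseNormalizer
{
    // OpenAI Chat Completions: choices[0].message.content
    public static ChatResponse FromOpenAI(JsonElement json)
    {
        var choice = json.GetProperty("choices")[0];
        return Build("OpenAI",
            choice.GetProperty("message").GetProperty("content").GetString(),
            Optional(choice, "finish_reason"));
    }

    // Anthropic Messages API: content[0].text (assumes the first block is text)
    public static ChatResponse FromClaude(JsonElement json) =>
        Build("Claude",
            json.GetProperty("content")[0].GetProperty("text").GetString(),
            Optional(json, "stop_reason"));

    // Gemini generateContent: candidates[0].content.parts[0].text
    public static ChatResponse FromGemini(JsonElement json)
    {
        var candidate = json.GetProperty("candidates")[0];
        return Build("Gemini",
            candidate.GetProperty("content").GetProperty("parts")[0]
                .GetProperty("text").GetString(),
            Optional(candidate, "finishReason"));
    }

    // Ollama /api/generate: response
    public static ChatResponse FromOllama(JsonElement json) =>
        Build("Ollama",
            json.GetProperty("response").GetString(),
            Optional(json, "done_reason"));

    // Reads a string property if present; returns null otherwise.
    private static string? Optional(JsonElement element, string name) =>
        element.TryGetProperty(name, out var value)
            && value.ValueKind == JsonValueKind.String
            ? value.GetString()
            : null;

    private static ChatResponse Build(string provider, string? answer, string? finishReason) =>
        new() { Provider = provider, Answer = answer ?? "", FinishReason = finishReason ?? "" };
}
```

Finish reasons are passed through as the provider reports them. Mapping them onto one shared vocabulary would be a natural next step.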

Error Handling Across Providers

Provider APIs fail in different ways.

Common failures include:

Rate Limits

429 Too Many Requests

Authentication Errors

401 Unauthorized

Timeouts

Request exceeded timeout

Model Unavailable

Selected model unavailable

The gateway should normalize these failures into consistent application exceptions.

Clients should never need provider-specific error handling.
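A minimal sketch of that normalization follows. AIProviderException, AIErrorKind, and AIErrorMapper are hypothetical names. Treating 404 and 503 as "model unavailable" is an assumption, since providers differ in how they report a missing or overloaded model.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public enum AIErrorKind { RateLimited, Unauthorized, Timeout, ModelUnavailable, Unknown }

// The one exception type the rest of the application needs to handle.
public class AIProviderException : Exception
{
    public AIProviderException(string provider, AIErrorKind kind, Exception? inner = null)
        : base($"{provider} request failed: {kind}", inner)
    {
        Provider = provider;
        Kind = kind;
    }

    public string Provider { get; }
    public AIErrorKind Kind { get; }
}

public static class AIErrorMapper
{
    public static AIErrorKind FromStatusCode(HttpStatusCode status) => status switch
    {
        HttpStatusCode.TooManyRequests => AIErrorKind.RateLimited,   // 429
        HttpStatusCode.Unauthorized => AIErrorKind.Unauthorized,     // 401
        HttpStatusCode.RequestTimeout
            or HttpStatusCode.GatewayTimeout => AIErrorKind.Timeout, // 408, 504
        // Assumption: a missing or overloaded model surfaces as 404 or 503.
        HttpStatusCode.NotFound
            or HttpStatusCode.ServiceUnavailable => AIErrorKind.ModelUnavailable,
        _ => AIErrorKind.Unknown
    };

    public static AIProviderException FromException(string provider, Exception ex) => ex switch
    {
        // HttpClient reports its own timeout as TaskCanceledException.
        // A caller-initiated cancellation looks the same and may need
        // to be separated out before reaching this point.
        TaskCanceledException => new AIProviderException(provider, AIErrorKind.Timeout, ex),
        HttpRequestException { StatusCode: { } code } =>
            new AIProviderException(provider, FromStatusCode(code), ex),
        _ => new AIProviderException(provider, AIErrorKind.Unknown, ex)
    };
}
```

Each adapter would wrap its HTTP call in a try/catch and rethrow through FromException, so callers branch on Kind and never on provider-specific status codes.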

Retry and Fallback Strategies

Provider abstraction enables intelligent failover.

Example:

OpenAI

↓

Failure

↓

Retry

↓

Claude

↓

Failure

↓

Gemini

↓

Success

A request now fails only if every provider in the chain fails, which is far less likely than a single-provider outage.

However, fallback is not always appropriate.

When Fallback Makes Sense

Examples:

Summarization
Content generation
General Q&A

A different provider may produce acceptable results.

When Fallback Is Risky

Examples:

Legal workflows
Compliance systems
Audited processes

Different providers may generate different answers.

Consistency may be more important than availability.

Fallback should always be a conscious business decision.
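Where fallback is appropriate, the chain can itself be an IAIProvider, so callers do not change. The sketch below is a deliberately simple version. FallbackProvider is a hypothetical name, every exception is treated as retryable, and there is no delay between attempts. A production version would likely retry only transient errors and add backoff.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public record ChatRequest(string Prompt); // hypothetical shape
public class ChatResponse { public string Provider { get; set; } = ""; }

public interface IAIProvider
{
    Task<ChatResponse> GenerateAsync(ChatRequest request);
}

// Tries each provider in order, retrying each one before moving on.
public class FallbackProvider : IAIProvider
{
    private readonly IReadOnlyList<IAIProvider> _chain;
    private readonly int _attemptsPerProvider;

    public FallbackProvider(IReadOnlyList<IAIProvider> chain, int attemptsPerProvider = 2)
    {
        if (chain.Count == 0)
            throw new ArgumentException("At least one provider is required.", nameof(chain));
        if (attemptsPerProvider < 1)
            throw new ArgumentOutOfRangeException(nameof(attemptsPerProvider));

        _chain = chain;
        _attemptsPerProvider = attemptsPerProvider;
    }

    public async Task<ChatResponse> GenerateAsync(ChatRequest request)
    {
        var failures = new List<Exception>();

        foreach (var provider in _chain)
        {
            for (var attempt = 1; attempt <= _attemptsPerProvider; attempt++)
            {
                try
                {
                    // First success wins; later providers are never called.
                    return await provider.GenerateAsync(request);
                }
                catch (Exception ex)
                {
                    failures.Add(ex);
                }
            }
        }

        // Every attempt on every provider failed; surface all causes.
        throw new AggregateException("All providers in the fallback chain failed.", failures);
    }
}
```

For workflows where consistency matters more than availability, the same class can be constructed with a single provider, which reduces it to plain retry.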

Logging and Observability

Provider abstraction creates a unique opportunity.

We can compare providers objectively.

Track:

Provider
Model
Prompt Tokens
Completion Tokens
Cost
Latency
Success
Failure

Over time this data becomes extremely valuable.

Questions become easy to answer:

Which provider is cheapest?
Which provider is fastest?
Which provider has the highest failure rate?
Which provider delivers the best user outcomes?

Without metrics, provider selection becomes guesswork.
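Because every call goes through the same interface, measurement can be added as a decorator without touching the adapters. The sketch below records provider, latency, and success only. MeteredProvider and CallMetric are hypothetical names, and the Action<CallMetric> sink stands in for whatever logging or metrics pipeline the application uses. Token counts and cost would be added from the response's usage data.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public record ChatRequest(string Prompt); // hypothetical shape
public class ChatResponse { public string Answer { get; set; } = ""; }

public interface IAIProvider
{
    Task<ChatResponse> GenerateAsync(ChatRequest request);
}

// One record per provider call; a structured logger can emit it as fields.
public record CallMetric(string Provider, long LatencyMs, bool Success);

// Wraps any provider and reports timing and outcome for each call.
public class MeteredProvider : IAIProvider
{
    private readonly IAIProvider _inner;
    private readonly string _name;
    private readonly Action<CallMetric> _sink;

    public MeteredProvider(IAIProvider inner, string name, Action<CallMetric> sink)
    {
        _inner = inner;
        _name = name;
        _sink = sink;
    }

    public async Task<ChatResponse> GenerateAsync(ChatRequest request)
    {
        var timer = Stopwatch.StartNew();
        var success = false;

        try
        {
            var response = await _inner.GenerateAsync(request);
            success = true;
            return response;
        }
        finally
        {
            // Runs on both success and failure, so failures are counted too.
            timer.Stop();
            _sink(new CallMetric(_name, timer.ElapsedMilliseconds, success));
        }
    }
}
```

The exception from a failed call still propagates to the caller. The decorator only observes it.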

Cost Optimization Through Intelligent Routing

Not every request requires the most expensive model.

Routing routine requests to a cheaper model, and reserving the expensive one for tasks that need it, lowers the average cost per request.

Task                  Recommended Model
Summarization         Gemini Flash
Code Generation       GPT-4o
Long Documents        Claude Sonnet
Offline Development   Ollama

This is where an AI Gateway starts creating business value.

Instead of selecting one provider globally, the application routes requests to the most appropriate model.

Better results.

Lower costs.

Greater flexibility.
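To check whether routing actually saves money, per-request cost can be estimated from token counts. The sketch below assumes pricing is quoted per million input and output tokens, which is how the major hosted providers currently publish it. ModelPrice and CostEstimator are hypothetical names, and no real prices are included because they change often.

```csharp
// Price per one million tokens, in whatever currency the provider bills.
public record ModelPrice(decimal InputPerMillion, decimal OutputPerMillion);

public static class CostEstimator
{
    // cost = (promptTokens * inputPrice + completionTokens * outputPrice) / 1,000,000
    public static decimal Estimate(ModelPrice price, int promptTokens, int completionTokens) =>
        (promptTokens * price.InputPerMillion
            + completionTokens * price.OutputPerMillion) / 1_000_000m;
}
```

With placeholder prices of 1.00 per million input tokens and 2.00 per million output tokens, a request with 1,000,000 prompt tokens and 500,000 completion tokens comes to (1,000,000 × 1.00 + 500,000 × 2.00) / 1,000,000 = 2.00. Running the same token counts through each candidate model's price makes the comparison concrete.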

Security Considerations

An AI Gateway centralizes provider management.

That also makes it the ideal place to enforce security.

Secure API Key Storage

Never hardcode API keys.

Use:

Environment Variables
User Secrets
Azure Key Vault
AWS Secrets Manager

Configuration files should not contain production credentials.

Validate Model Names

Prevent arbitrary model execution.

Only allow approved models.

Example:

GPT-4o
Claude Sonnet
Gemini Flash
Llama 3.2

Reject everything else.
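A minimal allowlist check might look like the sketch below. ModelAllowList is a hypothetical helper, and the identifiers in the set are illustrative. A real list would hold the exact model IDs each provider's API expects and would probably be loaded from configuration.

```csharp
using System;
using System.Collections.Generic;

public static class ModelAllowList
{
    // Illustrative identifiers; replace with the exact IDs your providers expose.
    private static readonly HashSet<string> Approved =
        new(StringComparer.OrdinalIgnoreCase)
        {
            "gpt-4o",
            "claude-sonnet",
            "gemini-flash",
            "llama3.2"
        };

    public static bool IsApproved(string? model) =>
        model is not null && Approved.Contains(model);

    // Returns the model unchanged when approved; throws otherwise.
    public static string EnsureApproved(string? model) =>
        IsApproved(model)
            ? model!
            : throw new ArgumentException(
                $"Model '{model}' is not on the approved list.", nameof(model));
}
```

Calling EnsureApproved at the gateway boundary means a client-supplied model name is checked before it reaches any provider.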

Restrict Provider Access

Some providers may be reserved for specific workloads.

The gateway should enforce authorization policies.

Provider-Level Rate Limiting

Different providers have different quotas.

Giving each provider its own limiter means that exhausting one provider's quota does not throttle requests to the others.
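The idea can be sketched with a hand-rolled fixed-window counter keyed by provider name. ProviderRateLimiter is a hypothetical class, and the injectable clock exists only to make the behavior easy to test. For production use, the built-in System.Threading.RateLimiting library offers more robust limiters.

```csharp
using System;
using System.Collections.Generic;

// Fixed-window limiter with a separate counter per provider.
public class ProviderRateLimiter
{
    private readonly Dictionary<string, int> _limits;
    private readonly TimeSpan _window;
    private readonly Func<DateTimeOffset> _clock;
    private readonly Dictionary<string, (DateTimeOffset Start, int Count)> _state = new();
    private readonly object _gate = new();

    public ProviderRateLimiter(
        IDictionary<string, int> limitsPerWindow,
        TimeSpan window,
        Func<DateTimeOffset>? clock = null)
    {
        _limits = new Dictionary<string, int>(limitsPerWindow);
        _window = window;
        _clock = clock ?? (() => DateTimeOffset.UtcNow);
    }

    // Returns true and consumes one permit if the provider is under its limit.
    public bool TryAcquire(string provider)
    {
        // Providers without a configured limit get no quota at all.
        if (!_limits.TryGetValue(provider, out var limit)) return false;

        lock (_gate)
        {
            var now = _clock();

            // Start a fresh window on first use or once the old one has elapsed.
            if (!_state.TryGetValue(provider, out var current)
                || now - current.Start >= _window)
            {
                current = (now, 0);
            }

            if (current.Count >= limit)
            {
                _state[provider] = current;
                return false;
            }

            _state[provider] = (current.Start, current.Count + 1);
            return true;
        }
    }
}
```

Each provider has its own counter and window, so a burst that uses up the OpenAI allowance leaves the Claude allowance untouched.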

Testing Becomes Easier

This is one of the most overlooked benefits.

Instead of calling external APIs:

OpenAIProvider

can be replaced with:

FakeProvider

during tests.

Unit tests become:

Faster
Cheaper
More reliable

Business logic can be validated without consuming tokens.
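A fake along those lines can be very small. In the sketch below, FakeProvider returns a canned answer and records what it was asked. SummaryService is a hypothetical piece of business logic included only to show the substitution, and the ChatRequest shape is an assumption.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public record ChatRequest(string Prompt); // hypothetical shape
public class ChatResponse
{
    public string Answer { get; set; } = "";
    public string Provider { get; set; } = "";
}

public interface IAIProvider
{
    Task<ChatResponse> GenerateAsync(ChatRequest request);
}

// Test double: no network, no tokens, deterministic output.
public class FakeProvider : IAIProvider
{
    private readonly string _cannedAnswer;

    public FakeProvider(string cannedAnswer) => _cannedAnswer = cannedAnswer;

    // Every request is recorded so tests can assert on the prompt sent.
    public List<ChatRequest> Requests { get; } = new();

    public Task<ChatResponse> GenerateAsync(ChatRequest request)
    {
        Requests.Add(request);
        return Task.FromResult(
            new ChatResponse { Answer = _cannedAnswer, Provider = "Fake" });
    }
}

// Hypothetical business logic under test.
public class SummaryService
{
    private readonly IAIProvider _provider;

    public SummaryService(IAIProvider provider) => _provider = provider;

    public async Task<string> SummarizeAsync(string text)
    {
        var response = await _provider.GenerateAsync(
            new ChatRequest($"Summarize: {text}"));
        return response.Answer.Trim();
    }
}
```

A test constructs SummaryService with a FakeProvider, calls SummarizeAsync, and asserts on both the returned text and the recorded prompt, all without an API key.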

Common Mistakes

These mistakes appear frequently in production systems.

Calling OpenAI Directly From Controllers

Creates vendor lock-in.

Hardcoding Model Names

Makes upgrades difficult.

Mixing Provider Logic Throughout the Application

Creates maintenance nightmares.

No Abstraction Layer

Prevents future flexibility.

No Fallback Strategy

Reduces reliability.

Ignoring Usage Metrics

Makes optimization impossible.

Repository Features

The reference implementation includes:

ASP.NET Core Web API
OpenAI Integration
Claude Integration
Gemini Integration
Ollama Integration
Provider Abstraction Layer
Factory Pattern
Configuration-Driven Routing
Structured Logging
Docker Support
Unit Tests

The objective is not merely to support multiple providers.

The objective is to make provider selection a business decision rather than an architectural constraint.

Suggested Screenshots

Include the following visuals.

Architecture Diagram

Show the gateway routing requests to multiple providers.

Swagger Endpoint

Demonstrate:

POST /api/chat

Provider Selection Flow

Visualize:

Request

↓

Factory

↓

Selected Provider

Sample Responses

Compare normalized responses from different providers.

Structured Logs

Show:

Provider
Latency
Tokens
Cost
Status

These screenshots help communicate the operational benefits of the gateway.

Conclusion

Building AI applications around a single provider is easy.

Building systems that can evolve as providers, models, pricing, and business requirements change is significantly harder.

An AI Gateway introduces a layer of abstraction that protects the rest of the application from provider-specific concerns.

By standardizing contracts, centralizing provider selection, normalizing responses, implementing fallback strategies, and collecting operational metrics, you create a system that remains flexible as the AI ecosystem evolves.

Switching providers solves flexibility, but modern AI applications need more than text generation—they need the ability to interact with external systems.

In the next article, we'll explore Function Calling in .NET and show how AI can safely invoke tools, APIs, and business logic.

7 min read

Jan 11, 2025

By Dheer Gupta
