Building an AI Service Layer in ASP.NET Core: A Clean Architecture Approach

Learn how to build a production-ready OpenAI API in ASP.NET Core using Dependency Injection, the Options Pattern, HttpClientFactory, validation, logging, retry policies, and clean architecture. Create AI services that are maintainable, scalable, provider-agnostic, and ready for real-world applications.

Most AI tutorials stop the moment they receive a response from OpenAI.

Real applications don't.

In production systems, AI is not a code snippet inside a controller. It's a backend dependency that requires validation, dependency injection, logging, retry policies, configuration management, observability, cancellation support, and the flexibility to swap providers without rewriting your application.

If we treat AI differently than every other external service, we create technical debt from day one.

In this article, we'll build an AI API in ASP.NET Core the same way we'd build any production service—not as a demo, but as a reusable backend component.

What We'll Build

Our architecture will look like this:

ASP.NET Core Web API
        ↓
AI Controller
        ↓
IAIService
        ↓
OpenAI Provider Implementation
        ↓
GPT-4o / GPT-4 Turbo
        ↓
JSON Response

The goal is simple:

Controllers remain thin
Business logic lives in services
Configuration is externalized
AI providers are interchangeable
Infrastructure concerns stay isolated

This gives us a maintainable and scalable foundation for future AI features.

Project Structure

A clean project structure makes the application easier to understand and evolve.

AspNetCoreOpenAI
│
├── Controllers
│     AIController.cs
│
├── Services
│     IAIService.cs
│     OpenAIService.cs
│
├── Models
│     ChatRequest.cs
│     ChatResponse.cs
│
├── Options
│     OpenAIOptions.cs
│
├── Program.cs
│
└── appsettings.json

Each folder has a clear responsibility.

Controllers handle HTTP requests.
Services contain AI integration logic.
Models define request and response contracts.
Options store configuration settings.
Program.cs wires everything together.

Why Use an Abstraction?

One of the most common mistakes developers make is calling OpenAI directly from controllers.

Bad:

Controller
   ↓
OpenAI SDK

This tightly couples your API endpoints to a specific AI provider.

Now imagine your company decides to switch from OpenAI to Anthropic, Gemini, Azure OpenAI, or even a self-hosted Ollama instance.

Without an abstraction, every controller must change.

A better approach is:

Controller
     ↓
IAIService
     ↓
OpenAI
Anthropic
Gemini
Azure OpenAI
Ollama

The controller doesn't know which provider is generating responses.

It simply asks for an answer.

This follows the Dependency Inversion Principle and keeps provider-specific code isolated.

Future changes become implementation details instead of application-wide refactoring efforts.

Creating the Service Interface

Let's start by defining a contract.

public interface IAIService
{
    Task<string> GenerateAsync(string prompt);
}

The interface defines what the application needs.

Not how it works.

This distinction is important.

Controllers depend on abstractions rather than concrete implementations.

As a result:

Testing becomes easier.
Providers become replaceable.
Business logic remains independent from infrastructure.
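
To make the testing benefit concrete, here is a minimal sketch of a hand-written test double. `FakeAIService` is a hypothetical name, and the interface is repeated only so the snippet compiles on its own.

```csharp
using System.Threading.Tasks;

public interface IAIService
{
    Task<string> GenerateAsync(string prompt);
}

// Hypothetical test double: returns a canned answer without any network call.
public sealed class FakeAIService : IAIService
{
    public Task<string> GenerateAsync(string prompt) =>
        Task.FromResult($"fake answer for: {prompt}");
}
```

A controller test can receive `new FakeAIService()` through its constructor and never touch a real provider.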

This is one of the core ideas behind Dependency Injection.

Understanding Dependency Injection

Dependency Injection (DI) is a design pattern where dependencies are supplied to a class rather than created inside it.

Without DI:

public class AIController : ControllerBase
{
    private readonly OpenAIService _service =
        new OpenAIService();
}

This creates several problems:

Difficult to test.
Difficult to replace.
Difficult to configure.
Tight coupling.

With DI:

public class AIController : ControllerBase
{
    private readonly IAIService _service;

    public AIController(IAIService service)
    {
        _service = service;
    }
}

Now the ASP.NET Core container supplies whichever implementation is registered for IAIService when it creates the controller.

The controller only knows about the interface.

This is cleaner, more maintainable, and follows SOLID principles.

Registering the Service

In Program.cs:

builder.Services.AddScoped<IAIService, OpenAIService>();

Why Scoped?

ASP.NET Core supports three primary lifetimes.

Singleton

One instance for the entire application.

Scoped

One instance per request.

Transient

A new instance every time it is requested.

For AI services, Scoped is usually a sensible default.

Each HTTP request gets its own service instance, while shared infrastructure such as IHttpClientFactory is still reused efficiently across requests.

Scoped services also work naturally with request-level logging, correlation IDs, cancellation tokens, and other middleware features.
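
The scoped behavior can be observed outside a web request. The sketch below assumes the Microsoft.Extensions.DependencyInjection package (already referenced by ASP.NET Core projects); `Marker` and `LifetimeDemo` are hypothetical names, and each manually created scope stands in for one HTTP request.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical service used only to compare instances.
public sealed class Marker
{
    public Guid Id { get; } = Guid.NewGuid();
}

public static class LifetimeDemo
{
    public static (bool SameWithinScope, bool SameAcrossScopes) Run()
    {
        var services = new ServiceCollection();
        services.AddScoped<Marker>();
        using var provider = services.BuildServiceProvider();

        // Each scope plays the role of one HTTP request.
        using var scope1 = provider.CreateScope();
        using var scope2 = provider.CreateScope();

        var a = scope1.ServiceProvider.GetRequiredService<Marker>();
        var b = scope1.ServiceProvider.GetRequiredService<Marker>();
        var c = scope2.ServiceProvider.GetRequiredService<Marker>();

        return (ReferenceEquals(a, b), ReferenceEquals(a, c));
    }
}
```

`LifetimeDemo.Run()` returns `(true, false)`: the same instance is reused inside a scope, and a different one is created for the next scope.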

Reading Configuration

Never hardcode API keys.

Avoid this:

var apiKey = "sk-xxxxxxxx";

Store configuration externally.

appsettings.json:

{
  "OpenAI": {
    "ApiKey": "",
    "Model": "gpt-4o",
    "Temperature": 0.7,
    "MaxTokens": 1000
  }
}

Create a strongly typed options class.

public class OpenAIOptions
{
    public string ApiKey { get; set; } = string.Empty;
    public string Model { get; set; } = string.Empty;
    public double Temperature { get; set; }
    public int MaxTokens { get; set; }
}

Then bind the class to the "OpenAI" configuration section in Program.cs:

builder.Services.Configure<OpenAIOptions>(
    builder.Configuration.GetSection("OpenAI"));
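
Because the ApiKey value is intentionally empty in appsettings.json, it can help to fail fast when the real key was never supplied. The following is one possible sketch, not the only way to do it: `AddValidatedOpenAIOptions` is a hypothetical helper name, the two rules are illustrative, and `OpenAIOptions` is the class defined above.

```csharp
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

public static class OpenAIOptionsRegistration
{
    // Binds the "OpenAI" section and checks it once when the host starts.
    public static IServiceCollection AddValidatedOpenAIOptions(
        this IServiceCollection services, IConfiguration configuration)
    {
        services.AddOptions<OpenAIOptions>()
            .Bind(configuration.GetSection("OpenAI"))
            .Validate(o => !string.IsNullOrWhiteSpace(o.ApiKey),
                "OpenAI:ApiKey must be set.")
            .Validate(o => o.MaxTokens > 0,
                "OpenAI:MaxTokens must be greater than zero.")
            .ValidateOnStart();

        return services;
    }
}
```

It would be called as `builder.Services.AddValidatedOpenAIOptions(builder.Configuration);` in place of the plain `Configure` call. The key itself can come from user secrets during development or from an environment variable named `OpenAI__ApiKey` (double underscore) in deployment.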

Understanding the Options Pattern

The Options Pattern provides strongly typed access to configuration.

Instead of this:

var apiKey =
    configuration["OpenAI:ApiKey"];

We can inject:

IOptions<OpenAIOptions>

Benefits include:

Compile-time safety
Cleaner code
Centralized configuration
Easier testing
Better maintainability

Configuration becomes a first-class citizen rather than scattered string lookups.

Building OpenAIService

This is where the actual AI integration lives.

Many tutorials simply paste a code block and move on.

Let's discuss the design decisions instead.

Use HttpClientFactory

Avoid this:

new HttpClient();

Creating and disposing a new HttpClient for every request can exhaust available sockets under load, while holding on to a single long-lived instance can miss DNS changes.

Instead:

builder.Services.AddHttpClient();

Inject:

IHttpClientFactory

into your service.

Benefits:

Connection pooling
Better performance
Centralized configuration
Resilience integrations
Cleaner testing
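
As a sketch of what centralized configuration can look like, the helper below registers a named client. The client name "OpenAI" and the helper name `AddOpenAIHttpClient` are assumptions made for this example, and the API key is passed as a plain string only to keep the snippet short; in the article's design it would come from `OpenAIOptions`.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using Microsoft.Extensions.DependencyInjection;

public static class OpenAIHttpClientRegistration
{
    // Registers a named client with the base address and auth header set once.
    public static IServiceCollection AddOpenAIHttpClient(
        this IServiceCollection services, string apiKey)
    {
        services.AddHttpClient("OpenAI", client =>
        {
            client.BaseAddress = new Uri("https://api.openai.com/v1/");
            client.DefaultRequestHeaders.Authorization =
                new AuthenticationHeaderValue("Bearer", apiKey);
        });

        return services;
    }
}
```

A service that receives `IHttpClientFactory` can then call `factory.CreateClient("OpenAI")` whenever it needs a client. The factory pools the underlying handlers, so creating clients this way is cheap.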

Support Cancellation Tokens

AI requests can take time.

Users may navigate away.

Requests may time out.

Servers may shut down.

Always support cancellation by extending the interface method with a CancellationToken:

Task<string> GenerateAsync(
    string prompt,
    CancellationToken cancellationToken);

Pass the token through the entire call chain.

This prevents wasted compute and improves resource utilization.
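
Passing the token through the call chain mostly means handing it to every awaited I/O call. A minimal sketch, assuming a hypothetical `CancellableCaller` wrapper around an `HttpClient`:

```csharp
using System.Net.Http;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper that shows where the token needs to go.
public sealed class CancellableCaller
{
    private readonly HttpClient _http;

    public CancellableCaller(HttpClient http) => _http = http;

    public async Task<string> PostJsonAsync(
        string url, string json, CancellationToken cancellationToken)
    {
        using var content = new StringContent(json, Encoding.UTF8, "application/json");

        // The token is passed to both the send and the body read,
        // so a cancelled request stops at whichever step is in flight.
        using var response = await _http.PostAsync(url, content, cancellationToken);
        response.EnsureSuccessStatusCode();

        return await response.Content.ReadAsStringAsync(cancellationToken);
    }
}
```

When the token is cancelled, the awaited call throws an `OperationCanceledException` (usually a `TaskCanceledException`), which callers can let propagate.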

Serialize Explicitly

When sending requests:

JsonSerializer.Serialize(request);

When receiving responses:

JsonSerializer.Deserialize<Response>(json);

Explicit serialization gives you full control over payloads and reduces surprises during version upgrades.
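
One way to keep that control is to define small payload types with explicit JSON names. The sketch below covers only a subset of fields, following the Chat Completions request and response shape as commonly documented; all type names here are hypothetical, and the field list should be checked against the provider's current API reference.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;
using System.Text.Json.Serialization;

public sealed class ChatMessage
{
    [JsonPropertyName("role")] public string Role { get; set; } = string.Empty;
    [JsonPropertyName("content")] public string Content { get; set; } = string.Empty;
}

public sealed class ChatCompletionRequest
{
    [JsonPropertyName("model")] public string Model { get; set; } = string.Empty;
    [JsonPropertyName("messages")] public List<ChatMessage> Messages { get; set; } = new();
    [JsonPropertyName("temperature")] public double Temperature { get; set; }
    [JsonPropertyName("max_tokens")] public int MaxTokens { get; set; }
}

public sealed class ChatChoice
{
    [JsonPropertyName("message")] public ChatMessage Message { get; set; } = new();
}

public sealed class ChatCompletionResponse
{
    [JsonPropertyName("choices")] public List<ChatChoice> Choices { get; set; } = new();
}

public static class OpenAIJson
{
    // Builds the request body for a single user prompt.
    public static string BuildRequestJson(
        string model, string prompt, double temperature, int maxTokens) =>
        JsonSerializer.Serialize(new ChatCompletionRequest
        {
            Model = model,
            Temperature = temperature,
            MaxTokens = maxTokens,
            Messages = { new ChatMessage { Role = "user", Content = prompt } }
        });

    // Returns the first choice's text, or null when the payload has no choices.
    public static string? ExtractAnswer(string responseJson)
    {
        var response = JsonSerializer.Deserialize<ChatCompletionResponse>(responseJson);
        return response?.Choices?.FirstOrDefault()?.Message?.Content;
    }
}
```

Only the provider-specific service needs to know these types exist; the rest of the application keeps working with plain strings or its own models.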

Handle Errors Properly

External APIs fail.

Networks fail.

Rate limits happen.

Timeouts happen.

Assume failures will occur.

Handle:

429 Too Many Requests
5xx server errors (500, 502, 503)
Timeouts
Invalid responses
Network failures

Translate infrastructure errors into meaningful application exceptions.

Never expose raw provider errors directly to clients.
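
A sketch of what that translation might look like. `AIServiceException` and `AIErrorTranslator` are hypothetical names, and the messages and the transient/non-transient split are illustrative choices rather than a fixed rule.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical application-level exception that hides provider details.
public sealed class AIServiceException : Exception
{
    public bool IsTransient { get; }

    public AIServiceException(string message, bool isTransient, Exception? inner = null)
        : base(message, inner) => IsTransient = isTransient;
}

public static class AIErrorTranslator
{
    // Maps an unsuccessful status code to an application exception.
    public static AIServiceException FromStatusCode(HttpStatusCode status) =>
        (int)status switch
        {
            429 => new AIServiceException("The AI provider is rate limiting requests.", true),
            401 or 403 => new AIServiceException("The AI provider rejected the configured credentials.", false),
            >= 500 => new AIServiceException("The AI provider is temporarily unavailable.", true),
            _ => new AIServiceException($"The AI provider returned status {(int)status}.", false)
        };

    // Wraps a call so network failures and timeouts surface as AIServiceException.
    public static async Task<T> GuardAsync<T>(
        Func<Task<T>> call, CancellationToken callerToken)
    {
        try
        {
            return await call();
        }
        catch (HttpRequestException ex)
        {
            throw new AIServiceException("Could not reach the AI provider.", true, ex);
        }
        catch (TaskCanceledException ex) when (!callerToken.IsCancellationRequested)
        {
            // The caller did not cancel, so this is treated as a timeout.
            throw new AIServiceException("The AI provider timed out.", true, ex);
        }
    }
}
```

Cancellation requested by the caller is deliberately not translated, so it still propagates as an ordinary `OperationCanceledException`.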

Creating the API Controller

The controller should remain extremely thin.

Request model:

public class ChatRequest
{
    public string Prompt { get; set; } = string.Empty;
}

Response model:

public class ChatResponse
{
    public string Answer { get; set; } = string.Empty;
}

Endpoint:

POST /api/ai/chat

Request:

{
  "prompt": "Explain Dependency Injection"
}

Response:

{
  "answer": "Dependency Injection is..."
}

The controller should coordinate the request and delegate all AI logic to the service layer.

If your controller starts containing prompt engineering, model selection, retries, parsing logic, or provider-specific code, the architecture is already drifting in the wrong direction.
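
A thin controller for this endpoint could look like the sketch below. It relies on the `ChatRequest`, `ChatResponse`, and `IAIService` types from this article, and it assumes the cancellation-aware `GenerateAsync(prompt, cancellationToken)` signature shown in the cancellation section.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api/ai")]
public class AIController : ControllerBase
{
    private readonly IAIService _service;

    public AIController(IAIService service) => _service = service;

    // POST /api/ai/chat
    [HttpPost("chat")]
    public async Task<ActionResult<ChatResponse>> Chat(
        [FromBody] ChatRequest request,
        CancellationToken cancellationToken)
    {
        // ASP.NET Core binds this token to the request, so it is cancelled
        // when the client disconnects.
        var answer = await _service.GenerateAsync(request.Prompt, cancellationToken);

        return Ok(new ChatResponse { Answer = answer });
    }
}
```

The action contains no prompt construction, model selection, or parsing. It maps the HTTP contract onto the service call and back.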

Production Improvements

This is where most tutorials stop.

Production systems start here.

Retry Policies

Transient failures are common.

A request might fail because:

Temporary network issues
Provider throttling
Short service disruptions

Use a resilience library such as Polly rather than hand-written retry loops.

Example strategy:

Retry 3 times
Exponential backoff
Handle transient HTTP failures

A retry policy dramatically improves reliability without complicating business logic.
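
The strategy above can be sketched with Polly's HttpClientFactory integration. This assumes the Microsoft.Extensions.Http.Polly NuGet package is installed; `OpenAIResilience` and `AddOpenAIRetries` are hypothetical names, and the backoff values are illustrative.

```csharp
using System;
using System.Net;
using System.Net.Http;
using Microsoft.Extensions.DependencyInjection;
using Polly;
using Polly.Extensions.Http;

public static class OpenAIResilience
{
    // Retries network failures, 5xx, 408, and 429 responses,
    // waiting 2, 4, then 8 seconds between attempts.
    public static IAsyncPolicy<HttpResponseMessage> RetryPolicy() =>
        HttpPolicyExtensions
            .HandleTransientHttpError()
            .OrResult(response => response.StatusCode == HttpStatusCode.TooManyRequests)
            .WaitAndRetryAsync(3, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)));

    // Attaches the policy to a client registered through AddHttpClient.
    public static IHttpClientBuilder AddOpenAIRetries(this IHttpClientBuilder builder) =>
        builder.AddPolicyHandler(RetryPolicy());
}
```

It would be chained onto the client registration, for example `builder.Services.AddHttpClient("OpenAI").AddOpenAIRetries();`. Because the policy lives in the HTTP pipeline, the service code itself contains no retry logic.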

Logging

Observability is critical.

Log:

Request duration
Response status
Token usage
Failures
Retry attempts

Never log:

API Keys

In many applications, you should also avoid logging:

Entire prompts

Prompts often contain customer data, business information, or sensitive content.

Log metadata rather than content whenever possible.
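
A sketch of metadata-only logging with `ILogger`. `AICallLogging` is a hypothetical helper, and the chosen fields (model name, character counts, elapsed time) are examples of metadata that avoids recording the prompt itself.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;
using Microsoft.Extensions.Logging;

public static class AICallLogging
{
    // Runs the call and logs duration and sizes, never the prompt or answer text.
    public static async Task<string> LogCallAsync(
        ILogger logger, string model, string prompt, Func<Task<string>> call)
    {
        var stopwatch = Stopwatch.StartNew();
        try
        {
            var answer = await call();
            logger.LogInformation(
                "AI call succeeded. Model={Model} PromptChars={PromptChars} AnswerChars={AnswerChars} ElapsedMs={ElapsedMs}",
                model, prompt.Length, answer.Length, stopwatch.ElapsedMilliseconds);
            return answer;
        }
        catch (Exception ex)
        {
            logger.LogError(ex,
                "AI call failed. Model={Model} PromptChars={PromptChars} ElapsedMs={ElapsedMs}",
                model, prompt.Length, stopwatch.ElapsedMilliseconds);
            throw;
        }
    }
}
```

The named placeholders become structured properties in most logging providers, which makes it possible to query by model or duration later.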

Token Monitoring

AI usage directly impacts cost.

Track:

Prompt Tokens
Completion Tokens
Total Tokens
Estimated Cost

Without monitoring, teams often discover expensive usage patterns after receiving the invoice.

Measure usage from the beginning.
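
The arithmetic behind an estimated cost is simple enough to capture in a small type. In this sketch `TokenUsage` is a hypothetical record, and the prices are parameters supplied by the caller because real prices vary by model and change over time.

```csharp
// Hypothetical value type for per-request usage figures.
public sealed record TokenUsage(int PromptTokens, int CompletionTokens)
{
    public int TotalTokens => PromptTokens + CompletionTokens;

    // Prices are expressed per one million tokens and come from the caller.
    public decimal EstimateCost(
        decimal promptPricePerMillion, decimal completionPricePerMillion) =>
        (PromptTokens * promptPricePerMillion
         + CompletionTokens * completionPricePerMillion) / 1_000_000m;
}
```

For example, `new TokenUsage(1000, 500).EstimateCost(2m, 4m)` evaluates to `0.004`: 1,000 prompt tokens at 2 per million plus 500 completion tokens at 4 per million. Those prices are made up for the example.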

Timeouts

Never allow AI requests to run indefinitely.

Configure reasonable timeouts.

Examples:

15 seconds
30 seconds
60 seconds

depending on your use case.

A stuck request should eventually fail rather than consume resources forever.
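
One way to enforce a per-call limit is a linked cancellation token that fires after a deadline. `TimeoutHelper` below is a hypothetical helper, not part of any library. Setting `HttpClient.Timeout` (100 seconds by default) on the client is a simpler alternative when one limit fits every call.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimeoutHelper
{
    // Runs the operation with a token that cancels when either the caller
    // cancels or the timeout elapses.
    public static async Task<T> WithTimeoutAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        TimeSpan timeout,
        CancellationToken callerToken)
    {
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(callerToken);
        cts.CancelAfter(timeout);

        try
        {
            return await operation(cts.Token);
        }
        catch (OperationCanceledException) when (!callerToken.IsCancellationRequested)
        {
            // The caller did not cancel, so the deadline must have fired.
            throw new TimeoutException(
                $"The AI request exceeded {timeout.TotalSeconds} seconds.");
        }
    }
}
```

The operation has to pass the supplied token on to its own I/O calls for the deadline to have any effect.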

Validation

Validate input before calling the model.

Reject:

Empty Prompts

""

Excessively Large Prompts

Prevent abuse and unnecessary cost.

Dangerous Inputs

Depending on your domain, you may want additional screening for malicious or prohibited content.

Validation is significantly cheaper than generating tokens.
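
The first two checks can be expressed declaratively with data annotations. This sketch is a variant of the `ChatRequest` model from earlier; the 4000-character limit is an arbitrary illustrative value, not a recommendation.

```csharp
using System.ComponentModel.DataAnnotations;

public class ChatRequest
{
    // Required rejects null, empty, and whitespace-only prompts.
    // StringLength caps the size; 4000 is an assumed limit for illustration.
    [Required]
    [StringLength(4000)]
    public string Prompt { get; set; } = string.Empty;
}
```

On a controller marked with `[ApiController]`, a request that fails these rules is answered with a 400 response before the action runs, so no tokens are spent on it.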

Dependency Injection Everywhere

Avoid this pattern:

var client = new OpenAIClient();

inside controllers.

Always inject dependencies.

Benefits:

Testability
Maintainability
Consistency
Configurability

Controllers should orchestrate.

Services should execute.

Infrastructure should remain isolated.

Common Mistakes

These issues appear repeatedly in production code reviews.

Putting AI Logic in Controllers

Controllers become massive and difficult to maintain.

Creating HttpClient Manually

Leads to resource management problems.

Hardcoding API Keys

Creates security risks.

Ignoring Cancellation

Wastes compute and server resources.

Returning Raw OpenAI Responses

Leaks implementation details.

Ignoring Rate Limits

Creates reliability issues under load.

No Retry Policy

Turns temporary failures into user-facing outages.

No Validation

Increases costs and opens the door to abuse.

Where to Go Next

This API returns a completed response after the model finishes generating.

That works, but it isn't how modern AI experiences feel.

In the next article, we'll improve the API by streaming tokens to the client using Server-Sent Events (SSE), allowing responses to appear in real time and creating an experience much closer to ChatGPT.

We'll also discuss:

Streaming architectures
Incremental token delivery
Client-side consumption
Cancellation during streams
Backpressure considerations

Conclusion

Building AI features in ASP.NET Core isn't about calling an LLM API.

It's about treating AI as another production dependency.

The same engineering principles that apply to databases, message queues, payment gateways, and external APIs also apply to AI providers.

By introducing abstractions, configuration management, dependency injection, validation, observability, resilience, and cancellation support from day one, your application remains flexible as models, providers, and requirements evolve.

The result isn't just an AI demo.

It's a production-ready foundation that your team can build on with confidence.

Jul 18, 2024

By Dheer Gupta