Learn how to build a production-ready OpenAI API in ASP.NET Core using Dependency Injection, the Options Pattern, HttpClientFactory, validation, logging, retry policies, and clean architecture. Create AI services that are maintainable, scalable, provider-agnostic, and ready for real-world applications.
Most AI tutorials stop the moment they receive a response from OpenAI.
Real applications don't.
In production systems, AI is not a code snippet inside a controller. It's a backend dependency that requires validation, dependency injection, logging, retry policies, configuration management, observability, cancellation support, and the flexibility to swap providers without rewriting your application.
If we treat AI differently than every other external service, we create technical debt from day one.
In this article, we'll build an AI API in ASP.NET Core the same way we'd build any production service—not as a demo, but as a reusable backend component.
Our architecture will look like this:
ASP.NET Core Web API
↓
AI Controller
↓
IAIService
↓
OpenAI Provider Implementation
↓
GPT-4o / GPT-4 Turbo
↓
JSON Response
The goal is simple: each layer talks only to the layer directly below it.
This gives us a maintainable and scalable foundation for future AI features.
A clean project structure makes the application easier to understand and evolve.
AspNetCoreOpenAI
│
├── Controllers
│   └── AIController.cs
│
├── Services
│   ├── IAIService.cs
│   └── OpenAIService.cs
│
├── Models
│   ├── ChatRequest.cs
│   └── ChatResponse.cs
│
├── Options
│   └── OpenAIOptions.cs
│
├── Program.cs
│
└── appsettings.json
Each folder has a clear responsibility.
One of the most common mistakes developers make is calling OpenAI directly from controllers.
Bad:
Controller
↓
OpenAI SDK
This tightly couples your API endpoints to a specific AI provider.
Now imagine your company decides to switch from OpenAI to Anthropic, Gemini, Azure OpenAI, or even a self-hosted Ollama instance.
Without an abstraction, every controller must change.
A better approach is:
Controller
↓
IAIService
↓
OpenAI
Anthropic
Gemini
Azure OpenAI
Ollama
The controller doesn't know which provider is generating responses.
It simply asks for an answer.
This follows the Dependency Inversion Principle and keeps provider-specific code isolated.
Future changes become implementation details instead of application-wide refactoring efforts.
Let's start by defining a contract.
public interface IAIService
{
    Task<string> GenerateAsync(string prompt);
}
The interface defines what the application needs.
Not how it works.
This distinction is important.
Controllers depend on abstractions rather than concrete implementations.
As a result, the provider behind the interface can change without any controller changing.
This is one of the core ideas behind Dependency Injection.
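To make the contract concrete, here is a minimal sketch of a second implementation sitting behind the same interface. `FakeAIService` and `PromptRunner` are hypothetical names introduced only for illustration; they are not part of the article's project.

```csharp
using System.Threading.Tasks;

// The contract from the article.
public interface IAIService
{
    Task<string> GenerateAsync(string prompt);
}

// Hypothetical stand-in provider: returns a canned answer, no network call.
public sealed class FakeAIService : IAIService
{
    public Task<string> GenerateAsync(string prompt) =>
        Task.FromResult($"[fake] {prompt}");
}

// Hypothetical consumer: depends only on the interface, never on a provider.
public sealed class PromptRunner
{
    private readonly IAIService _ai;

    public PromptRunner(IAIService ai) => _ai = ai;

    public Task<string> AskAsync(string question) => _ai.GenerateAsync(question);
}
```

Because `PromptRunner` only sees `IAIService`, replacing `FakeAIService` with a real provider changes nothing in the consumer.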
Dependency Injection (DI) is a design pattern where dependencies are supplied to a class rather than created inside it.
Without DI:
public class AIController : ControllerBase
{
    private readonly OpenAIService _service =
        new OpenAIService();
}
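This creates several problems: the controller is tightly coupled to one concrete provider, and replacing that provider means editing the controller itself.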
With DI:
public class AIController : ControllerBase
{
    private readonly IAIService _service;

    public AIController(IAIService service)
    {
        _service = service;
    }
}
Now ASP.NET Core provides the implementation automatically.
The controller only knows about the interface.
This is cleaner, more maintainable, and follows SOLID principles.
In Program.cs:
builder.Services.AddScoped<IAIService, OpenAIService>();
Why Scoped?
ASP.NET Core supports three primary lifetimes.
Singleton: one instance for the entire application.
Scoped: one instance per request.
Transient: a new instance every time it is requested.
For AI services, Scoped is typically the best choice.
Each HTTP request gets its own service instance while still allowing efficient reuse of shared infrastructure such as HttpClientFactory.
Scoped services also work naturally with request-level logging, correlation IDs, cancellation tokens, and other middleware features.
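The three lifetimes can be observed directly with the container from `Microsoft.Extensions.DependencyInjection`. The sketch below is illustrative only: the `SingletonDep`, `ScopedDep`, `TransientDep`, and `LifetimeDemo` types are made up, and each `CreateScope()` call stands in for one HTTP request.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

public sealed class SingletonDep { }
public sealed class ScopedDep { }
public sealed class TransientDep { }

public static class LifetimeDemo
{
    // Returns whether the same instance was handed out in each situation:
    // (singleton across scopes, scoped within one scope,
    //  scoped across scopes, transient within one scope).
    public static (bool, bool, bool, bool) Run()
    {
        var services = new ServiceCollection();
        services.AddSingleton<SingletonDep>();
        services.AddScoped<ScopedDep>();
        services.AddTransient<TransientDep>();

        using var provider = services.BuildServiceProvider();
        using var scopeA = provider.CreateScope(); // stands in for request A
        using var scopeB = provider.CreateScope(); // stands in for request B

        var a = scopeA.ServiceProvider;
        var b = scopeB.ServiceProvider;

        return (
            ReferenceEquals(a.GetRequiredService<SingletonDep>(), b.GetRequiredService<SingletonDep>()),
            ReferenceEquals(a.GetRequiredService<ScopedDep>(), a.GetRequiredService<ScopedDep>()),
            ReferenceEquals(a.GetRequiredService<ScopedDep>(), b.GetRequiredService<ScopedDep>()),
            ReferenceEquals(a.GetRequiredService<TransientDep>(), a.GetRequiredService<TransientDep>()));
    }
}
```

`LifetimeDemo.Run()` returns `(true, true, false, false)`: the singleton is shared everywhere, the scoped instance is shared only inside one scope, and the transient is never shared.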
Never hardcode API keys.
Avoid this:
var apiKey = "sk-xxxxxxxx";
Store configuration externally.
appsettings.json:
{
  "OpenAI": {
    "ApiKey": "",
    "Model": "gpt-4o",
    "Temperature": 0.7,
    "MaxTokens": 1000
  }
}
Create a strongly typed options class.
public class OpenAIOptions
{
    public string ApiKey { get; set; } = string.Empty;
    public string Model { get; set; } = string.Empty;
    public double Temperature { get; set; }
    public int MaxTokens { get; set; }
}
Register it:
builder.Services.Configure<OpenAIOptions>(
    builder.Configuration.GetSection("OpenAI"));
The Options Pattern provides strongly typed access to configuration.
Instead of this:
var apiKey =
    configuration["OpenAI:ApiKey"];
We can inject:
IOptions<OpenAIOptions>
The benefit is that configuration becomes a first-class citizen rather than scattered string lookups.
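One possible way to consume the options, sketched under a few assumptions: the validation attributes and their ranges are illustrative choices rather than requirements from the article, and `OpenAISettingsReader` is a hypothetical consumer.

```csharp
using System.ComponentModel.DataAnnotations;
using Microsoft.Extensions.Options;

public class OpenAIOptions
{
    [Required] public string ApiKey { get; set; } = string.Empty;
    [Required] public string Model { get; set; } = string.Empty;

    // Assumed bounds, chosen for illustration.
    [Range(0.0, 2.0)] public double Temperature { get; set; }
    [Range(1, int.MaxValue)] public int MaxTokens { get; set; }
}

// Hypothetical consumer: receives the bound options through the constructor.
public sealed class OpenAISettingsReader
{
    private readonly OpenAIOptions _options;

    public OpenAISettingsReader(IOptions<OpenAIOptions> options) =>
        _options = options.Value;

    public string Describe() => $"{_options.Model} (max {_options.MaxTokens} tokens)";
}

// Registration in Program.cs that also validates the section at startup:
//
// builder.Services.AddOptions<OpenAIOptions>()
//     .Bind(builder.Configuration.GetSection("OpenAI"))
//     .ValidateDataAnnotations()
//     .ValidateOnStart();
```

With `ValidateOnStart()`, a missing `ApiKey` fails at application startup instead of on the first AI request. In a unit test, `Options.Create(new OpenAIOptions { ... })` supplies the options without any configuration file.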
This is where the actual AI integration lives.
Many tutorials simply paste a code block and move on.
Let's discuss the design decisions instead.
Avoid this:
new HttpClient();
Creating a new HttpClient for every request can cause socket exhaustion and resource management issues, because each instance manages its own connections.
Instead:
builder.Services.AddHttpClient();
Inject:
IHttpClientFactory
into your service.
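Benefits: the factory pools and recycles the underlying message handlers, which avoids the socket exhaustion described above, and it gives you one central place to configure clients.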
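A minimal sketch of a named client registration. The client name `"openai"`, the extension method `AddOpenAIHttpClient`, and the 60-second timeout are assumptions made for illustration; the base address is OpenAI's public API host.

```csharp
using System;
using System.Net.Http;
using Microsoft.Extensions.DependencyInjection;

public static class OpenAIHttpClientSetup
{
    // Hypothetical client name, shared by registration and consumers.
    public const string ClientName = "openai";

    public static IServiceCollection AddOpenAIHttpClient(this IServiceCollection services)
    {
        services.AddHttpClient(ClientName, client =>
        {
            client.BaseAddress = new Uri("https://api.openai.com/");
            client.Timeout = TimeSpan.FromSeconds(60); // assumed value, tune per use case
        });
        return services;
    }
}
```

A service that receives `IHttpClientFactory` can then call `factory.CreateClient(OpenAIHttpClientSetup.ClientName)` each time it needs a client; the factory reuses the underlying handlers.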
AI requests can take time.
Users may navigate away.
Requests may time out.
Servers may shut down.
Always support cancellation.
Task<string> GenerateAsync(
    string prompt,
    CancellationToken cancellationToken);
Pass the token through the entire call chain.
This prevents wasted compute and improves resource utilization.
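The effect of passing the token down can be shown without any network call. In this sketch, `SlowFakeAIService` is a hypothetical stand-in whose 30-second delay plays the role of a slow model, and `CancellationDemo` simulates a client that disconnects immediately.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical slow provider: the delay stands in for a long model call.
public sealed class SlowFakeAIService
{
    public async Task<string> GenerateAsync(string prompt, CancellationToken cancellationToken)
    {
        // Task.Delay observes the token and throws when it is cancelled.
        await Task.Delay(TimeSpan.FromSeconds(30), cancellationToken);
        return $"answer to: {prompt}";
    }
}

public static class CancellationDemo
{
    // Returns true when the pending call was abandoned instead of running 30 seconds.
    public static async Task<bool> WasCancelledAsync()
    {
        using var cts = new CancellationTokenSource();
        var pending = new SlowFakeAIService().GenerateAsync("hello", cts.Token);

        cts.Cancel(); // simulates the client disconnecting

        try
        {
            await pending;
            return false;
        }
        catch (OperationCanceledException)
        {
            return true;
        }
    }
}
```

In a controller, adding a `CancellationToken` parameter to the action is enough: ASP.NET Core binds it to the request's abort token, so a disconnected client cancels the whole chain.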
When sending requests:
JsonSerializer.Serialize(request);
When receiving responses:
JsonSerializer.Deserialize<Response>(json);
Explicit serialization gives you full control over payloads and reduces surprises during version upgrades.
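As a sketch of what explicit payload types might look like, the classes below mirror the chat completions request and response shape (`model`, `messages`, `temperature`, `max_tokens`, and `choices[0].message.content`). The type names and the `ChatJson` helper are hypothetical, and only the fields this article needs are modeled.

```csharp
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

public sealed class ChatMessage
{
    [JsonPropertyName("role")] public string Role { get; set; } = string.Empty;
    [JsonPropertyName("content")] public string Content { get; set; } = string.Empty;
}

public sealed class ChatCompletionRequest
{
    [JsonPropertyName("model")] public string Model { get; set; } = string.Empty;
    [JsonPropertyName("messages")] public List<ChatMessage> Messages { get; set; } = new();
    [JsonPropertyName("temperature")] public double Temperature { get; set; }
    [JsonPropertyName("max_tokens")] public int MaxTokens { get; set; }
}

public sealed class ChatChoice
{
    [JsonPropertyName("message")] public ChatMessage? Message { get; set; }
}

public sealed class ChatCompletionResponse
{
    [JsonPropertyName("choices")] public List<ChatChoice>? Choices { get; set; }
}

public static class ChatJson
{
    // Builds the JSON body for a single-turn user prompt.
    public static string BuildRequest(string model, string prompt, double temperature, int maxTokens) =>
        JsonSerializer.Serialize(new ChatCompletionRequest
        {
            Model = model,
            Messages = { new ChatMessage { Role = "user", Content = prompt } },
            Temperature = temperature,
            MaxTokens = maxTokens
        });

    // Returns the first choice's text, or null when the body has no choices.
    public static string? ReadAnswer(string json)
    {
        var response = JsonSerializer.Deserialize<ChatCompletionResponse>(json);
        return response?.Choices?.Count > 0
            ? response.Choices[0].Message?.Content
            : null;
    }
}
```

Pinning the wire names with `JsonPropertyName` keeps the payload stable even if the C# property names are later renamed.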
External APIs fail.
Networks fail.
Rate limits happen.
Timeouts happen.
Assume failures will occur.
Handle each of these cases explicitly.
Translate infrastructure errors into meaningful application exceptions.
Never expose raw provider errors directly to clients.
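One way to do that translation is a small mapping from HTTP status codes to an application-level exception. `AIServiceException`, `ProviderErrorTranslator`, and the message wording are hypothetical; the choice of which statuses count as retryable is an assumption you may want to adjust.

```csharp
using System;
using System.Net;

// Hypothetical application exception: safe to surface, carries no provider details.
public sealed class AIServiceException : Exception
{
    public AIServiceException(string message, bool isRetryable, Exception? inner = null)
        : base(message, inner)
    {
        IsRetryable = isRetryable;
    }

    public bool IsRetryable { get; }
}

public static class ProviderErrorTranslator
{
    // Maps a provider HTTP status to a message the application can safely expose.
    public static AIServiceException FromStatus(HttpStatusCode status)
    {
        switch (status)
        {
            case HttpStatusCode.Unauthorized:
            case HttpStatusCode.Forbidden:
                return new AIServiceException("The AI provider rejected the configured credentials.", false);
            case HttpStatusCode.TooManyRequests:
                return new AIServiceException("The AI provider is rate limiting requests.", true);
            case HttpStatusCode.RequestTimeout:
            case HttpStatusCode.GatewayTimeout:
                return new AIServiceException("The AI provider timed out.", true);
            default:
                return (int)status >= 500
                    ? new AIServiceException("The AI provider is temporarily unavailable.", true)
                    : new AIServiceException("The AI request could not be completed.", false);
        }
    }
}
```

The raw provider response can still be written to the logs for diagnosis; only the translated exception travels up to the controller.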
The controller should remain extremely thin.
Request model:
public class ChatRequest
{
    public string Prompt { get; set; } = string.Empty;
}
Response model:
public class ChatResponse
{
    public string Answer { get; set; } = string.Empty;
}
Endpoint:
POST /api/ai/chat
Request:
{
  "prompt": "Explain Dependency Injection"
}
Response:
{
  "answer": "Dependency Injection is..."
}
The controller should coordinate the request and delegate all AI logic to the service layer.
If your controller starts containing prompt engineering, model selection, retries, parsing logic, or provider-specific code, the architecture is already drifting in the wrong direction.
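Putting the models and the endpoint together, a thin controller might look like the sketch below. It assumes the `IAIService` overload that accepts a `CancellationToken`, and the action name `Chat` is an illustrative choice.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

public interface IAIService
{
    Task<string> GenerateAsync(string prompt, CancellationToken cancellationToken);
}

public class ChatRequest
{
    public string Prompt { get; set; } = string.Empty;
}

public class ChatResponse
{
    public string Answer { get; set; } = string.Empty;
}

[ApiController]
[Route("api/ai")]
public class AIController : ControllerBase
{
    private readonly IAIService _service;

    public AIController(IAIService service) => _service = service;

    // POST /api/ai/chat
    [HttpPost("chat")]
    public async Task<ActionResult<ChatResponse>> Chat(
        ChatRequest request,
        CancellationToken cancellationToken)
    {
        // No prompt engineering, retries, or provider code here: just delegate.
        var answer = await _service.GenerateAsync(request.Prompt, cancellationToken);
        return Ok(new ChatResponse { Answer = answer });
    }
}
```

The action body is two lines, and that is the point: everything else lives behind `IAIService`.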
This is where most tutorials stop.
Production systems start here.
Transient failures are common.
A request might fail because of a network error, a rate limit, or a timeout.
Use Polly.
Example strategy:
Retry 3 times
Exponential backoff
Handle transient HTTP failures
A retry policy dramatically improves reliability without complicating business logic.
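The strategy above can be sketched with Polly's `Policy` builder API. This assumes the Polly NuGet package; `RetryPolicies` and `CreateTransientRetry` are hypothetical names, and treating only `HttpRequestException` as transient is a simplification.

```csharp
using System;
using System.Net.Http;
using Polly;
using Polly.Retry;

public static class RetryPolicies
{
    // Retries up to 3 times, waiting baseDelay, 2x baseDelay, then 4x baseDelay.
    public static AsyncRetryPolicy CreateTransientRetry(TimeSpan baseDelay) =>
        Policy
            .Handle<HttpRequestException>()
            .WaitAndRetryAsync(
                retryCount: 3,
                sleepDurationProvider: attempt =>
                    TimeSpan.FromMilliseconds(
                        baseDelay.TotalMilliseconds * Math.Pow(2, attempt - 1)));
}
```

A call is then wrapped as `await policy.ExecuteAsync(() => SendAsync(...))`, where `SendAsync` is whatever method performs the HTTP request. With a one-second base delay the waits are 1, 2, and 4 seconds. Rate-limit responses arrive as status codes rather than exceptions, so handling them would need an additional result-based condition.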
Observability is critical.
Log what you need to trace and diagnose each request.
Never log API keys.
In many applications, you should also avoid logging entire prompts.
Prompts often contain customer data, business information, or sensitive content.
Log metadata rather than content whenever possible.
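A small sketch of metadata-only logging with `ILogger`. The `AIRequestLog` helper and the specific fields (model, prompt length, elapsed time) are illustrative assumptions about what is worth recording.

```csharp
using System;
using Microsoft.Extensions.Logging;

public static class AIRequestLog
{
    // Records how big and how slow the request was, never what it said.
    public static void Completed(ILogger logger, string model, string prompt, TimeSpan elapsed)
    {
        logger.LogInformation(
            "AI request completed. Model={Model} PromptLength={PromptLength} ElapsedMs={ElapsedMs}",
            model,
            prompt.Length,
            (long)elapsed.TotalMilliseconds);
    }
}
```

The prompt is passed in only so its length can be measured; the text itself never reaches the log message.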
AI usage directly impacts cost.
Track usage per request, for example how many tokens each call consumes.
Without monitoring, teams often discover expensive usage patterns after receiving the invoice.
Measure usage from the beginning.
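As one possible starting point, token counts can be read from the `usage` object that chat completion responses include. `TokenUsageTracker` is a hypothetical in-memory counter; a real system would more likely push these numbers to a metrics backend.

```csharp
using System.Text.Json;
using System.Threading;

public sealed class TokenUsageTracker
{
    private long _totalTokens;

    public long TotalTokens => Interlocked.Read(ref _totalTokens);

    // Reads usage.total_tokens from a response body and adds it to the running total.
    // Returns the tokens recorded for this response, or 0 when the field is absent.
    public int Record(string responseJson)
    {
        using var doc = JsonDocument.Parse(responseJson);
        var root = doc.RootElement;

        if (root.ValueKind == JsonValueKind.Object &&
            root.TryGetProperty("usage", out var usage) &&
            usage.ValueKind == JsonValueKind.Object &&
            usage.TryGetProperty("total_tokens", out var total) &&
            total.ValueKind == JsonValueKind.Number &&
            total.TryGetInt32(out var tokens))
        {
            Interlocked.Add(ref _totalTokens, tokens);
            return tokens;
        }

        return 0;
    }
}
```

Registered as a singleton, one tracker can accumulate usage across all requests; `Interlocked` keeps the total correct under concurrent calls.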
Never allow AI requests to run indefinitely.
Configure reasonable timeouts; the right values depend on your use case.
A stuck request should eventually fail rather than consume resources forever.
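A per-call timeout can be layered on top of the caller's cancellation token with a linked token source. `TimeoutHelper` and `WithTimeoutAsync` are hypothetical names, and converting the cancellation into a `TimeoutException` is one design choice among several.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimeoutHelper
{
    // Runs the operation with a token that fires on caller cancellation OR after the timeout.
    public static async Task<T> WithTimeoutAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        TimeSpan timeout,
        CancellationToken callerToken)
    {
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(callerToken);
        cts.CancelAfter(timeout);

        try
        {
            return await operation(cts.Token);
        }
        catch (OperationCanceledException) when (!callerToken.IsCancellationRequested)
        {
            // The caller did not cancel, so the timeout must have fired.
            throw new TimeoutException(
                $"The AI request exceeded {timeout.TotalSeconds} seconds.");
        }
    }
}
```

If the caller cancels first, the original `OperationCanceledException` propagates unchanged, so a client disconnect is not misreported as a timeout.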
Validate input before calling the model.
Reject empty prompts such as "" before they ever reach the provider.
Prevent abuse and unnecessary cost.
Depending on your domain, you may want additional screening for malicious or prohibited content.
Validation is significantly cheaper than generating tokens.
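One lightweight option is to put data annotations on the request model. The 4000-character limit is an assumed value for illustration, and `ChatRequestValidator` is a hypothetical helper that runs the same checks outside the MVC pipeline.

```csharp
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

public class ChatRequest
{
    [Required]          // rejects null, empty, and whitespace-only prompts
    [StringLength(4000)] // assumed upper bound, tune for your model and budget
    public string Prompt { get; set; } = string.Empty;
}

public static class ChatRequestValidator
{
    // Returns true when the request passes all annotations; errors lists the failures.
    public static bool IsValid(ChatRequest request, out List<ValidationResult> errors)
    {
        errors = new List<ValidationResult>();
        return Validator.TryValidateObject(
            request,
            new ValidationContext(request),
            errors,
            validateAllProperties: true);
    }
}
```

On a controller marked with `[ApiController]`, these annotations are checked automatically and an invalid request receives a 400 response before the action runs.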
Avoid this pattern:
var client = new OpenAIClient();
inside controllers.
Always inject dependencies, so the controller never needs to know which client or provider sits behind the interface.
Controllers should orchestrate.
Services should execute.
Infrastructure should remain isolated.
These issues appear repeatedly in production code reviews.
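Putting AI logic in controllers: controllers become massive and difficult to maintain.
Creating HttpClient instances manually: leads to resource management problems.
Hardcoding API keys: creates security risks.
Ignoring cancellation: wastes compute and server resources.
Exposing raw provider errors: leaks implementation details.
Running requests without timeouts: creates reliability issues under load.
Skipping retry policies: turns temporary failures into user-facing outages.
Skipping input validation: increases costs and opens the door to abuse.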
This API returns a completed response after the model finishes generating.
That works, but it isn't how modern AI experiences feel.
In the next article, we'll improve the API by streaming tokens to the client using Server-Sent Events (SSE), so responses appear in real time and the experience feels much closer to ChatGPT.
Building AI features in ASP.NET Core isn't about calling an LLM API.
It's about treating AI as another production dependency.
The same engineering principles that apply to databases, message queues, payment gateways, and external APIs also apply to AI providers.
By introducing abstractions, configuration management, dependency injection, validation, observability, resilience, and cancellation support from day one, your application remains flexible as models, providers, and requirements evolve.
The result isn't just an AI demo.
It's a production-ready foundation that your team can build on with confidence.