.NET has a large installed base in enterprise environments — manufacturing, logistics, financial services, healthcare. These organisations have significant investments in .NET Core backends, C# services, and SQL Server databases. As AI capabilities become essential rather than optional, .NET teams need practical guidance on integrating AI into their existing applications without abandoning the platform they know.
This guide covers the practical patterns for adding AI capabilities to .NET applications: semantic search, LLM integration, RAG pipelines, and AI agents — using the tools that .NET teams will find most familiar.
The .NET AI Ecosystem in 2026
Microsoft has invested heavily in making AI integration first-class in the .NET ecosystem. The key libraries are:
- Semantic Kernel — Microsoft's open-source SDK for building AI agents and integrating LLMs into .NET applications. Provides abstractions for chat completion, embeddings, memory, and function calling.
- Microsoft.Extensions.AI — a set of abstractions (IChatClient, IEmbeddingGenerator) that let you swap AI providers without rewriting your application code.
- Azure OpenAI SDK — the official C# client for Azure OpenAI Service, which gives enterprise .NET teams a Microsoft-supported path to GPT-4 and embeddings.
- Azure AI Search — vector search and hybrid retrieval, tightly integrated with the .NET and Azure ecosystem.
Pattern 1: Adding Chat Completion to a .NET API
The simplest integration is adding a chat endpoint to an existing .NET API. Using Semantic Kernel, this requires minimal setup:
Register the kernel in Program.cs with builder.Services.AddKernel().AddOpenAIChatCompletion("gpt-4o", apiKey), then inject IChatCompletionService into your controller or service. Use ChatHistory to manage conversation context across turns.
Semantic Kernel handles token management, streaming responses, and provider abstraction. You can switch between OpenAI, Azure OpenAI, and other providers by changing the registration — the rest of your code is unchanged.
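As a minimal sketch of the registration described above, assuming the Microsoft.SemanticKernel NuGet package in an ASP.NET Core minimal API project. The /chat route, the ChatRequest record, and the OpenAI:ApiKey configuration key are illustrative names chosen for this example, not part of any library.

```csharp
// Program.cs: a single-turn chat endpoint (sketch, not production code).
using System;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

var builder = WebApplication.CreateBuilder(args);

// Assumption: the API key lives in configuration under "OpenAI:ApiKey".
var apiKey = builder.Configuration["OpenAI:ApiKey"]
    ?? throw new InvalidOperationException("Missing configuration value OpenAI:ApiKey.");

// Registers the Kernel and an OpenAI-backed IChatCompletionService in DI.
builder.Services.AddKernel().AddOpenAIChatCompletion("gpt-4o", apiKey);

var app = builder.Build();

// Hypothetical route: IChatCompletionService is resolved from DI.
app.MapPost("/chat", async (ChatRequest request, IChatCompletionService chat) =>
{
    // ChatHistory carries the conversation; here it holds a single turn.
    var history = new ChatHistory();
    history.AddSystemMessage("You are a concise assistant for internal business users.");
    history.AddUserMessage(request.Message);

    var reply = await chat.GetChatMessageContentAsync(history);
    return Results.Ok(new { reply = reply.Content });
});

app.Run();

// Hypothetical request body for the endpoint above.
public record ChatRequest(string Message);
```

A multi-turn version would keep one ChatHistory per conversation (for example, keyed by a session id) and append each assistant reply to it before the next request.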
Pattern 2: Semantic Search Over Business Data
Keyword search often fails on business queries that are phrased differently from the stored data. Users search for "late deliveries last quarter" and get nothing, because the database stores "delayed shipments Q3". Semantic search uses vector embeddings to match meaning rather than exact keywords.
The Implementation Pattern
- Generate embeddings for your documents or records using IEmbeddingGenerator<string, Embedding<float>>.
- Store embeddings in a vector database — Azure AI Search, Qdrant, or pgvector if you are already on PostgreSQL.
- At query time, embed the user's query and retrieve the most semantically similar records.
- Pass the retrieved records as context to an LLM if you need a generated answer, or return them directly as search results.
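For teams already running PostgreSQL, pgvector is often the simplest starting point: it adds vector capabilities to a relational database you already operate, without introducing a new infrastructure dependency. Teams whose data lives in SQL Server would be adding PostgreSQL as a new dependency, so a managed option such as Azure AI Search may be the simpler route for them.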
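The query-time step of the pattern above can be sketched without any vector database at all. The example below is an illustration under stated assumptions: embeddings are plain float arrays that you have already generated elsewhere, and IndexedRecord and InMemoryVectorSearch are hypothetical names. A real vector store performs the same ranking with an index rather than a full scan.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record type: a document plus its precomputed embedding.
public sealed record IndexedRecord(string Id, string Text, float[] Embedding);

public static class InMemoryVectorSearch
{
    // Cosine similarity: dot product divided by the product of the vector norms.
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same dimension.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        // A zero vector has no direction, so treat it as not similar to anything.
        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Scores every record against the query and returns the k best matches.
    public static IReadOnlyList<(IndexedRecord Record, double Score)> TopK(
        float[] queryEmbedding, IEnumerable<IndexedRecord> records, int k)
        => records
            .Select(r => (Record: r, Score: CosineSimilarity(queryEmbedding, r.Embedding)))
            .OrderByDescending(x => x.Score)
            .Take(k)
            .ToList();
}
```

A brute-force scan like this is reasonable for a few thousand records while prototyping. Beyond that, the approximate nearest-neighbour indexes in a dedicated vector store are what keep query latency flat.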
Pattern 3: RAG Pipeline for Document Q&A
RAG (Retrieval-Augmented Generation) is the pattern that allows an LLM to answer questions based on your private documents without fine-tuning. The flow in .NET:
- Ingest documents — parse PDFs, Word documents, or web pages into chunks. Semantic Kernel provides text chunking utilities.
- Embed and index — generate embeddings for each chunk and store them in your vector store.
- Retrieve — at query time, embed the question and retrieve the top-k most relevant chunks.
- Augment — construct a prompt that includes the retrieved chunks as context.
- Generate — pass the augmented prompt to the LLM and return the response.
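The ingest and augment steps are simple enough to sketch in plain C#. The code below is a simplified illustration, not the Semantic Kernel utilities mentioned above: it splits on fixed character windows with overlap, and RagHelpers, ChunkText, and BuildAugmentedPrompt are hypothetical names. The default chunk size and overlap are arbitrary starting values.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class RagHelpers
{
    // Splits text into fixed-size character windows; consecutive chunks share
    // `overlap` characters so a sentence cut at a boundary appears in both.
    public static List<string> ChunkText(string text, int chunkSize = 800, int overlap = 100)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        if (overlap < 0 || overlap >= chunkSize) throw new ArgumentOutOfRangeException(nameof(overlap));

        var chunks = new List<string>();
        int step = chunkSize - overlap;
        for (int start = 0; start < text.Length; start += step)
        {
            int length = Math.Min(chunkSize, text.Length - start);
            chunks.Add(text.Substring(start, length));
            if (start + length >= text.Length) break; // last window reached the end
        }
        return chunks;
    }

    // Builds a prompt that places the retrieved chunks ahead of the question.
    public static string BuildAugmentedPrompt(string question, IReadOnlyList<string> retrievedChunks)
    {
        var sb = new StringBuilder();
        sb.AppendLine("Answer the question using only the context below.");
        sb.AppendLine("If the context does not contain the answer, say you do not know.");
        sb.AppendLine();
        for (int i = 0; i < retrievedChunks.Count; i++)
        {
            sb.AppendLine($"[Context {i + 1}]");
            sb.AppendLine(retrievedChunks[i]);
            sb.AppendLine();
        }
        sb.AppendLine($"Question: {question}");
        return sb.ToString();
    }
}
```

Numbering the context blocks makes it easy to ask the model to cite which chunk supported its answer. Splitting on sentence or paragraph boundaries usually retrieves better than raw character windows, at the cost of a more involved splitter.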
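Semantic Kernel's VectorStoreTextSearch<T> provides a generic abstraction over the retrieval step (embedding the query and searching the store), supporting multiple vector store backends with the same query interface. Ingestion and prompt construction are still handled by your own code.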
Pattern 4: AI Agents with Function Calling
Agents in Semantic Kernel are built around Plugins — collections of C# methods exposed to the LLM as callable tools. The LLM decides which functions to call, in which order, based on the goal it has been given.
A practical example: an invoice processing agent with plugins for ReadEmailAttachment(), ParseInvoiceData(), LookupPurchaseOrder(), ValidateLineItems(), and CreateERPRecord(). The agent receives the goal "process incoming invoices from the shared inbox" and orchestrates these functions autonomously, calling the LLM between steps to reason about what to do next.
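A plugin along these lines might look like the following sketch, assuming the Microsoft.SemanticKernel package. Only two of the five functions are shown, the validation is reduced to a single total comparison, and the in-memory purchase order table is a hypothetical stand-in for a real ERP lookup.

```csharp
using System.Collections.Generic;
using System.ComponentModel;
using Microsoft.SemanticKernel;

public sealed class InvoicePlugin
{
    // Hypothetical data: approved totals keyed by purchase order number.
    private static readonly Dictionary<string, decimal> PurchaseOrders = new()
    {
        ["PO-1001"] = 1250.00m,
    };

    // The Description text is what the LLM sees when deciding whether to call this.
    [KernelFunction, Description("Looks up the approved total for a purchase order number.")]
    public decimal? LookupPurchaseOrder(
        [Description("Purchase order number, for example PO-1001")] string poNumber)
        => PurchaseOrders.TryGetValue(poNumber, out var total) ? total : null;

    // Simplified check: a real implementation would compare individual line items.
    [KernelFunction, Description("Checks whether an invoice total matches the purchase order total.")]
    public bool ValidateLineItems(decimal invoiceTotal, decimal purchaseOrderTotal)
        => invoiceTotal == purchaseOrderTotal;
}
```

One way to expose the class to the model is kernel.Plugins.AddFromType<InvoicePlugin>(). Because the methods are ordinary C#, they can be unit tested without involving an LLM at all.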
Choosing an Execution Strategy
- FunctionChoiceBehavior.Auto — the LLM decides which functions to call and when. Good for open-ended agents.
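- FunctionChoiceBehavior.Required — forces the LLM to call at least one of the functions you supply rather than answering directly. Good for structured pipelines.
- Sequential or Handlebars Planner — generates a multi-step plan upfront, which suits deterministic workflows. Recent Semantic Kernel releases deprecate these planners in favour of function calling, so check the current documentation before adopting them.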
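To make the first option concrete, here is a sketch of enabling automatic function choice. It assumes the Microsoft.SemanticKernel package with the OpenAI connector, and a Kernel that already has a chat completion service and plugins registered. InvoiceAgentRunner and RunAsync are hypothetical names.

```csharp
using System.Threading.Tasks;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;

public static class InvoiceAgentRunner
{
    // Sends the goal to the model and lets it call registered plugin functions.
    public static async Task<string?> RunAsync(Kernel kernel, string goal)
    {
        var chat = kernel.GetRequiredService<IChatCompletionService>();

        var history = new ChatHistory();
        history.AddUserMessage(goal);

        var settings = new OpenAIPromptExecutionSettings
        {
            // Auto: the model decides whether and which functions to call.
            FunctionChoiceBehavior = FunctionChoiceBehavior.Auto()
        };

        // Passing the kernel gives the connector access to the registered plugins.
        var reply = await chat.GetChatMessageContentAsync(history, settings, kernel);
        return reply.Content;
    }
}
```

Swapping FunctionChoiceBehavior.Auto() for FunctionChoiceBehavior.Required() is the only change needed to move to the second option, which makes it cheap to experiment with both.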
Production Considerations for .NET AI Applications
- Rate limiting and retry — wrap your AI API calls with Polly for resilience. Token rate limits are a real production concern at scale.
- Cost management — log token usage per request. Unexpected cost spikes are a common production surprise for teams new to LLM APIs.
- Prompt injection — treat any user-supplied content that reaches your prompt as untrusted input. Sanitise it and use system-level instructions to define the agent's behaviour.
- Observability — Semantic Kernel integrates with OpenTelemetry. Add spans around AI calls so you can see latency, token usage, and errors in your existing observability stack.
- Structured output — use response format constraints (JSON schema) for any AI call that feeds downstream code. Do not parse free-text LLM responses.
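The structured output point can be illustrated on the consuming side with System.Text.Json. This is a sketch under assumptions: InvoiceExtraction is a hypothetical schema, and the model is presumed to have been asked for JSON matching it. The idea is to deserialise strictly and reject anything that does not fit, instead of picking values out of free text.

```csharp
using System;
using System.Text.Json;

// Hypothetical shape the model is asked to return.
public sealed record InvoiceExtraction(string InvoiceNumber, decimal Total);

public static class StructuredOutput
{
    private static readonly JsonSerializerOptions Options = new()
    {
        // Accept "invoiceNumber" as well as "InvoiceNumber".
        PropertyNameCaseInsensitive = true,
    };

    // Returns true only when the output is valid JSON with plausible field values.
    public static bool TryParse(string modelOutput, out InvoiceExtraction? result)
    {
        result = null;
        try
        {
            var parsed = JsonSerializer.Deserialize<InvoiceExtraction>(modelOutput, Options);

            // Missing fields deserialise to defaults, so validate them explicitly.
            if (parsed is null || string.IsNullOrWhiteSpace(parsed.InvoiceNumber) || parsed.Total <= 0)
                return false;

            result = parsed;
            return true;
        }
        catch (JsonException)
        {
            // Free text, truncated JSON, or wrong types all end up here.
            return false;
        }
    }
}
```

When TryParse returns false, a reasonable policy is to retry the call once and then route the item to a human queue, rather than attempting to repair the text.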
Where to Start
The fastest path for a .NET team new to AI integration is to add a single semantic search capability to an existing application. It is self-contained, immediately valuable, and introduces the core concepts — embeddings, vector stores, prompt construction — without the complexity of a full agent. Once that is working in production, the patterns for RAG pipelines and agents build naturally on the same foundation.