Stop Choking Your ASP.NET Core APIs: 7 Performance Killers to Fix Now

Your ASP.NET Core API feels sluggish. Requests pile up, memory grows, and the team blames the database—but the real culprit is often your own code. We've seen it happen: a well-intentioned developer adds a synchronous call here, a chatty loop there, and suddenly the API chokes under moderate load. This guide walks through seven performance killers we've encountered in real projects, with specific fixes and trade-offs. No theory—just patterns you can apply today.

1. Why This Matters Now: The Cost of Ignoring Performance

Modern APIs face higher expectations than ever. Users expect sub-second response times, and microservices amplify latency: one slow endpoint can cascade across a distributed system. ASP.NET Core is fast out of the box, but common practices can sabotage that speed. Many teams we've worked with discover bottlenecks only after deploying to production, when scaling issues become urgent.

The primary stakes are user experience and infrastructure cost. Industry research has linked delays as small as 200 milliseconds to lower conversion rates, with drops of up to 5% reported. On the cost side, inefficient code forces you to run more instances or upgrade hardware. Fixing these issues early saves money and prevents emergency rewrites.

This guide is for developers maintaining or building ASP.NET Core APIs—whether you use controllers, minimal APIs, or a mix. We assume you know the basics of async programming and dependency injection. Our focus is on the pitfalls that slip through code reviews and load tests.

What We Cover

Each section below describes a specific performance killer, how to detect it, and how to fix it. We include code snippets and, where relevant, benchmarks. We also note when a fix might not apply—because performance tuning has trade-offs.

2. Killer #1: Synchronous Blocking in Async Contexts

The most common mistake we see is using .Result or .Wait() on async calls inside controller actions or middleware. This blocks the thread pool thread, reducing throughput and causing thread pool starvation under load. ASP.NET Core's asynchronous pipeline is designed to free threads during I/O—blocking defeats that purpose.

How It Happens

A developer calls an async method like dbContext.SaveChangesAsync() but uses .Result because they're in a synchronous method. Or they wrap the call in Task.Run() and then block on the returned task with .Result or .Wait(). Both patterns tie up a thread while waiting for I/O, effectively reducing concurrency.

Detection

Look for .Result, .Wait(), or Task.WaitAll() in controller actions, middleware, or service methods. Also check for GetAwaiter().GetResult(). Profiling tools like dotTrace or Application Insights will show high thread contention.

Fix

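Make the entire call chain async. Use await consistently, and avoid mixing sync and async code. If you must call an async method from a synchronous context (e.g., in a constructor), refactor to an async initialization pattern instead of blocking: a static async factory method, or an IHostedService that does the work in StartAsync. (IAsyncDisposable covers the mirror-image case of async cleanup, and ValueTask reduces allocations; neither removes the need to await.)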

// Bad
public IActionResult Get()
{
    var data = _service.GetDataAsync().Result;
    return Ok(data);
}

// Good
public async Task<IActionResult> Get()
{
    var data = await _service.GetDataAsync();
    return Ok(data);
}
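
For the constructor case mentioned in the fix, one option is a static async factory. The sketch below is illustrative only: SettingsService and the loadSettingsAsync delegate are hypothetical names, not part of any library.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical service that needs async work before it is usable.
public sealed class SettingsService
{
    private readonly string _settings;

    // Private constructor: only cheap, synchronous assignment happens here.
    private SettingsService(string settings) => _settings = settings;

    public string Settings => _settings;

    // Async factory: callers await creation instead of blocking in a constructor.
    public static async Task<SettingsService> CreateAsync(Func<Task<string>> loadSettingsAsync)
    {
        var settings = await loadSettingsAsync();
        return new SettingsService(settings);
    }
}
```

A caller would write something like var service = await SettingsService.CreateAsync(() => File.ReadAllTextAsync("settings.json")); so the I/O is awaited rather than blocked on.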

Trade-offs

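Making everything async introduces some overhead from compiler-generated state machines. For CPU-bound work, synchronous calls are fine, but APIs are mostly I/O-bound. Use async for database, file, and network calls; keep short CPU-bound work synchronous, and move long-running CPU-bound work to a background worker (for example, a hosted service fed by a queue) rather than wrapping it in Task.Run inside the request.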

3. Killer #2: Excessive Memory Allocations and GC Pressure

The .NET garbage collector is efficient, but high allocation rates trigger more frequent collections, and objects that survive long enough to be promoted lead to Gen 2 collections, which can pause threads and increase latency. Common sources include temporary strings, large arrays, and LINQ-heavy code in hot paths.

The Problem

Each request allocates objects for serialization, logging, and data processing. If allocations per request are high (e.g., >10 KB), the GC runs more often. Under load, this leads to pause times of tens of milliseconds—enough to degrade response times.

Detection

Use .NET memory counters (gen-2-gc-count, alloc-rate) or a profiler like PerfView. Look for high allocation rates and frequent Gen 2 collections. Also check the large object heap (LOH) for fragmentation.

Fix

Reduce allocations by reusing buffers, using StringBuilder for concatenation, and preferring ArrayPool<T>. Avoid allocating LINQ closures in hot paths—use foreach loops instead. For JSON serialization, use Utf8JsonWriter or System.Text.Json's source generators to avoid reflection overhead.

// Bad: new array each call
var buffer = new byte[1024];

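// Good: rent from pool (requires using System.Buffers)
// Rent may return an array larger than requested, so track the length you actually use.
var buffer = ArrayPool<byte>.Shared.Rent(1024);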
try
{
    // use buffer
}
finally
{
    ArrayPool<byte>.Shared.Return(buffer);
}
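
The source-generator suggestion can be sketched as follows. This is a minimal example assuming .NET 6 or later; ProductDto, AppJsonContext, and ProductJson are hypothetical names chosen for illustration.

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical DTO used only for this illustration.
public sealed record ProductDto(int Id, string Name);

// The source generator fills in this partial class at compile time,
// so serializing ProductDto does not rely on runtime reflection.
[JsonSerializable(typeof(ProductDto))]
public partial class AppJsonContext : JsonSerializerContext
{
}

public static class ProductJson
{
    // Passing the generated type info selects the source-generated path.
    public static string Serialize(ProductDto product) =>
        JsonSerializer.Serialize(product, AppJsonContext.Default.ProductDto);
}
```

The generated metadata is produced once at build time, which tends to help most on startup and in allocation-sensitive paths; measure before assuming a gain in steady-state throughput.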

Trade-offs

Pooling adds complexity and requires careful return of resources. For low-traffic APIs, the overhead may not be worth it. Measure before optimizing—if your allocation rate is below 1 MB/s, focus on other killers first.

4. Killer #3: Chatty Database Queries and N+1 Patterns

Entity Framework Core makes it easy to write queries that fetch related data inefficiently. The N+1 problem—where you load a parent entity and then loop to load children one by one—is a classic performance killer. Each round trip adds latency and database load.

How It Happens

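A controller action loads a list of orders, then iterates to load order items. With EF Core lazy loading enabled, each access to order.Items triggers a separate query. With lazy loading disabled, a missing .Include call leaves the navigation property unloaded, and the usual workaround of querying inside the loop produces the same N+1 pattern.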

Detection

Enable EF Core logging to see generated SQL. Look for the same SELECT statement repeated many times within one request, differing only in a parameter value. Use tools like MiniProfiler or Stackify Prefix to visualize query patterns.

Fix

Use .Include() and .ThenInclude() to eagerly load related data in a single query. For complex scenarios, use raw SQL or a view. To select only the columns you need, project to a DTO with Select or with AutoMapper's ProjectTo.

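// Bad: N+1 (one query for the orders, then one more query per order)
var orders = await context.Orders.ToListAsync();
foreach (var order in orders)
{
    var items = await context.OrderItems.Where(i => i.OrderId == order.Id).ToListAsync();
}

// Good: eager loading (orders and their items fetched together)
var orders = await context.Orders.Include(o => o.Items).ToListAsync();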

Trade-offs

Eager loading can produce large result sets with duplicated data. For deeply nested graphs, consider splitting queries or using a read model. Also, lazy loading may be acceptable for admin panels with low traffic—but not for public APIs.
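
Splitting queries, as suggested above, might look like the sketch below. It assumes EF Core 5 or later with a relational provider; the Order, OrderItem, and ShopContext types are hypothetical stand-ins for your own model.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

// Hypothetical entities and context, shaped like the snippet earlier in this section.
public class Order
{
    public int Id { get; set; }
    public List<OrderItem> Items { get; set; } = new();
}

public class OrderItem
{
    public int Id { get; set; }
    public int OrderId { get; set; }
    public string ProductName { get; set; } = "";
}

public class ShopContext : DbContext
{
    public ShopContext(DbContextOptions<ShopContext> options) : base(options) { }

    public DbSet<Order> Orders => Set<Order>();
    public DbSet<OrderItem> OrderItems => Set<OrderItem>();
}

public static class OrderQueries
{
    public static Task<List<Order>> LoadWithSplitQueryAsync(ShopContext context) =>
        context.Orders
            .Include(o => o.Items)
            // One SQL query per included collection instead of one wide join,
            // which avoids repeating each order's columns for every item row.
            .AsSplitQuery()
            // Assumption: the result is read-only, so change tracking is skipped.
            .AsNoTracking()
            .ToListAsync();
}
```

Split queries trade duplicated data for extra round trips, and the separate queries are not guaranteed to see a consistent snapshot unless they run in a transaction, so this is a judgment call rather than a default.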

5. Killer #4: Inefficient Serialization and Large Payloads

Serialization is often one of the more expensive CPU costs of an API call. Returning more data than needed (over-fetching) increases response size, network time, and serialization CPU. Additionally, Newtonsoft.Json is generally slower and allocates more than System.Text.Json, the built-in serializer that ASP.NET Core has used by default since version 3.0.

The Problem

APIs often return full domain objects with navigation properties, even when the client only needs a few fields. This wastes bandwidth and serialization time. Serializer options also matter: reference handling (for example, ReferenceHandler.Preserve) and custom converters add processing to every response.

Detection

Measure response sizes. Use browser dev tools or Fiddler to inspect payloads. Profile serialization time with a profiler—look for JsonSerializer.Serialize or Newtonsoft.Json calls taking significant CPU.

Fix

Create dedicated DTOs (Data Transfer Objects) that include only the fields the client needs. Use System.Text.Json with source generators for AOT-friendly serialization. Enable compression (Brotli or Gzip) on the server and configure clients to accept it. Also, consider pagination for list endpoints.

// Bad: returning full entity
public async Task<IActionResult> GetUser(int id)
{
    var user = await context.Users.FindAsync(id);
    return Ok(user); // includes password hash, etc.
}

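// Good: return DTO
public async Task<IActionResult> GetUser(int id)
{
    // Project in the query so only the needed columns are read from the database
    var user = await context.Users
        .Where(u => u.Id == id)
        .Select(u => new UserDto { Name = u.Name, Email = u.Email })
        .FirstOrDefaultAsync();
    if (user is null)
    {
        return NotFound();
    }
    return Ok(user);
}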
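
The compression step from the fix can be wired up in Program.cs. The following is a minimal sketch for a .NET 6+ minimal hosting setup; the /ping endpoint and the chosen compression level are assumptions for illustration, not recommendations.

```csharp
using System.IO.Compression;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.ResponseCompression;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddResponseCompression(options =>
{
    // Off by default for HTTPS because of compression side-channel attacks
    // (CRIME/BREACH); enable only after reviewing that risk for your API.
    options.EnableForHttps = true;
    options.Providers.Add<BrotliCompressionProvider>();
    options.Providers.Add<GzipCompressionProvider>();
});

// Favor speed over ratio; tune against realistic payloads.
builder.Services.Configure<BrotliCompressionProviderOptions>(options =>
    options.Level = CompressionLevel.Fastest);

var app = builder.Build();

// Register before any middleware that writes response bodies.
app.UseResponseCompression();

// Hypothetical endpoint so the sketch runs end to end.
app.MapGet("/ping", () => new { Message = "pong" });

app.Run();
```

If a reverse proxy such as nginx or IIS already compresses responses, doing it there is usually cheaper than compressing in the application.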

Trade-offs

DTOs require mapping code, which adds maintenance. Use AutoMapper or manual mapping—both have overhead. For small payloads, the benefit may be negligible. Also, compression adds CPU cost; test with realistic payloads.

6. Killer #5: Improper Caching Strategies (or No Caching at All)

Caching is one of the most effective performance boosts, but it's often misapplied or skipped entirely. Without caching, every request hits the database or external service, even for data that changes infrequently. On the other hand, caching stale data can cause bugs.

The Problem

Teams either add no caching, or they add caching without invalidation logic. Common mistakes: caching too much (entire responses), caching too little (only database queries), or using a local in-memory cache when multiple instances need consistency.

Detection

Look for repeated identical queries in logs. Measure response times for endpoints that return static data (e.g., product categories). If response times are high, caching is likely missing or ineffective.

Fix

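Use a layered approach: in-memory cache (IMemoryCache) for single-instance data, distributed cache (Redis) for shared data, and response caching middleware for static or semi-static endpoints. Set appropriate expiration policies: absolute expiration bounds how stale an entry can get, while sliding expiration keeps frequently accessed entries alive (combine it with an absolute limit so entries cannot live forever). Invalidate entries when the underlying data changes, either by removing the key explicitly or by evicting by tag where the cache supports it (for example, output caching in .NET 7+).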

// Example: using IMemoryCache with absolute expiration
public async Task<List<Category>> GetCategories()
{
    var cacheKey = "categories";
    if (!_cache.TryGetValue(cacheKey, out List<Category> categories))
    {
        categories = await _context.Categories.ToListAsync();
        _cache.Set(cacheKey, categories, TimeSpan.FromMinutes(30));
    }
    return categories;
}
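
Manual invalidation can sit next to the cached read. The sketch below assumes the Microsoft.Extensions.Caching.Memory package; CategoryCache and the injected loader delegate are hypothetical and stand in for the database query.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Memory;

public sealed class CategoryCache
{
    private const string CacheKey = "categories";
    private readonly IMemoryCache _cache;

    // Hypothetical loader standing in for the database query.
    private readonly Func<Task<List<string>>> _loadCategoriesAsync;

    public CategoryCache(IMemoryCache cache, Func<Task<List<string>>> loadCategoriesAsync)
    {
        _cache = cache;
        _loadCategoriesAsync = loadCategoriesAsync;
    }

    // Cache-aside read: return the cached list, or load it and store it.
    public async Task<List<string>> GetCategoriesAsync()
    {
        var categories = await _cache.GetOrCreateAsync(CacheKey, entry =>
        {
            entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(30);
            return _loadCategoriesAsync();
        });
        return categories ?? new List<string>();
    }

    // Call after any write that changes categories so the next read reloads.
    public void Invalidate() => _cache.Remove(CacheKey);
}
```

GetOrCreateAsync does not serialize concurrent misses, so several requests can run the loader at once when the entry expires; add locking if the load is expensive.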

Trade-offs

In-memory cache is fast but doesn't scale across multiple servers. Distributed cache adds network latency. Cache invalidation logic can become complex—consider using a cache-aside pattern or write-through cache. For rapidly changing data, caching may not help.

7. Killer #6: Overusing Middleware and Filters

Middleware and filters are powerful, but each one adds overhead to every request. Common culprits: custom logging middleware that serializes the entire request body, authentication filters that query a database on every call, or multiple exception-handling layers.

The Problem

Developers add middleware for convenience without measuring its cost. For example, a middleware that logs request/response bodies can allocate large strings and slow down throughput. Similarly, a global authorization filter that checks permissions from a database for every endpoint adds latency.

Detection

Profile request processing time and break it down by middleware. Use Application Insights or a custom middleware timer. Look for middleware that does I/O or heavy computation.
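
A custom middleware timer can be as small as the sketch below. It assumes .NET 7 or later (for Stopwatch.GetElapsedTime) and the minimal hosting model; the log message format and the /ping endpoint are illustrative.

```csharp
using System.Diagnostics;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Logging;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// Register first so the measurement covers everything registered after it.
app.Use(async (context, next) =>
{
    var start = Stopwatch.GetTimestamp();
    try
    {
        await next(context);
    }
    finally
    {
        // Runs even when a later middleware throws.
        var elapsed = Stopwatch.GetElapsedTime(start);
        app.Logger.LogInformation(
            "{Method} {Path} took {ElapsedMs:F1} ms",
            context.Request.Method, context.Request.Path, elapsed.TotalMilliseconds);
    }
});

// Hypothetical endpoint so the sketch runs end to end.
app.MapGet("/ping", () => "pong");

app.Run();
```

To attribute time to one specific middleware, place a timer like this immediately before it and another immediately after, then compare the two measurements.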

Fix

Review each middleware for necessity. Move expensive operations (like database lookups) to the endpoint where they are actually needed. Use short-circuiting: if a condition fails early, return a response without executing further middleware. For logging, consider sampling or logging only on error.

// Example: short-circuit middleware for maintenance mode
app.Use(async (context, next) =>
{
    if (IsUnderMaintenance)
    {
        context.Response.StatusCode = 503;
        await context.Response.WriteAsync("Service temporarily unavailable");
        return; // short-circuit
    }
    await next();
});

Trade-offs

Removing middleware may reduce functionality. For example, removing request logging makes debugging harder. Balance performance with observability. Use conditional middleware registration for different environments (dev vs. production).

8. Killer #7: Ignoring Connection Pooling and HTTP Client Management

APIs that call external services often create new HttpClient instances per request, leading to socket exhaustion. Similarly, database connection pooling misconfiguration can cause timeouts. HttpClient implements IDisposable, but creating and disposing one per request closes its connections each time and leaves the sockets in TIME_WAIT state.

The Problem

A typical pattern: inside a controller action, a developer does using var client = new HttpClient();. Under load, this can exhaust ephemeral ports and cause SocketException. On the database side, connection pool settings (like Max Pool Size) that are too low can cause contention.

Detection

Monitor socket usage with netstat or tools like Process Explorer. Look for many sockets in TIME_WAIT state. For database connections, check for timeout errors in logs and high connection pool usage.

Fix

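Use IHttpClientFactory to manage HttpClient instances; it pools the underlying handlers and reuses connections. For database connections, adjust Max Pool Size in the connection string (the default is 100 for SqlClient; check your provider's documentation for others). Also, dispose DbContext instances and connections promptly so they return to the pool. Async database methods free the thread while a query runs, but the connection stays checked out until it is released.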

// Bad: creating HttpClient per request
public async Task<IActionResult> GetData()
{
    using var client = new HttpClient();
    var response = await client.GetAsync("https://api.example.com/data");
    // ...
}

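// Good: using IHttpClientFactory
// Register MyService as transient or scoped; a singleton would hold this
// HttpClient (and its handler) indefinitely and miss DNS changes.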
public class MyService
{
    private readonly HttpClient _httpClient;
    public MyService(IHttpClientFactory httpClientFactory)
    {
        _httpClient = httpClientFactory.CreateClient();
    }
    public async Task<string> GetDataAsync()
    {
        return await _httpClient.GetStringAsync("https://api.example.com/data");
    }
}

Trade-offs

IHttpClientFactory adds a small overhead for handler management. In very short-lived processes (such as serverless functions that handle a single invocation), pooling may bring little benefit. Also, be careful with DNS changes: the default handler lifetime is 2 minutes; adjust it if needed.
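
Adjusting the handler lifetime is done where the client is registered. The sketch below uses a typed client; ExampleApiClient, the base address, the timeout, and the 5-minute lifetime are illustrative assumptions.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

// Typed client: the factory builds the HttpClient injected into ExampleApiClient.
builder.Services.AddHttpClient<ExampleApiClient>(client =>
    {
        client.BaseAddress = new Uri("https://api.example.com/");
        client.Timeout = TimeSpan.FromSeconds(10);
    })
    // Handlers are recycled after this interval so DNS changes are picked up.
    .SetHandlerLifetime(TimeSpan.FromMinutes(5));

var app = builder.Build();

app.MapGet("/data", (ExampleApiClient client) => client.GetDataAsync());

app.Run();

// Hypothetical typed client wrapping the external API used in the example above.
public sealed class ExampleApiClient
{
    private readonly HttpClient _httpClient;

    public ExampleApiClient(HttpClient httpClient) => _httpClient = httpClient;

    // Relative URI, resolved against the BaseAddress configured at registration.
    public Task<string> GetDataAsync() => _httpClient.GetStringAsync("data");
}
```

Typed clients are registered as transient, so each resolution gets a fresh HttpClient over a pooled handler.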

FAQ: Common Questions About ASP.NET Core API Performance

Should I use IAsyncEnumerable for streaming large result sets?

Yes, if you need to return a large collection without buffering it entirely in memory. IAsyncEnumerable allows streaming results as they are produced, reducing memory pressure. However, it requires careful handling of the response—ASP.NET Core's JSON serializer supports it, but you must ensure the client can handle chunked transfer encoding. Use it for endpoints that return large lists, like log exports or report generation.
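
A producer for such an endpoint is an ordinary async iterator. In the sketch below, LogExport and its generated lines are hypothetical; Task.Yield stands in for awaiting the next row from a database or file.

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class LogExport
{
    // Async iterator: each item is handed to the consumer as soon as it is ready.
    public static async IAsyncEnumerable<string> ReadLinesAsync(
        int count,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        for (var i = 1; i <= count; i++)
        {
            // Stop producing if the client disconnects or the request is aborted.
            cancellationToken.ThrowIfCancellationRequested();

            // Stand-in for awaiting the next row from a database or file.
            await Task.Yield();

            yield return $"log line {i}";
        }
    }
}
```

In a minimal API on .NET 6 or later, returning this sequence directly, for example app.MapGet("/logs", (CancellationToken ct) => LogExport.ReadLinesAsync(1000, ct));, lets the framework write items as they arrive instead of buffering the whole list.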

When should I use ValueTask instead of Task?

Use ValueTask when your async method often completes synchronously (e.g., from a cache) and you want to avoid allocating a Task object. This can reduce memory allocations. However, ValueTask has constraints: you can only await it once, and it's not suitable for all patterns. Stick to Task for most cases unless profiling shows allocation pressure.
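
The cache-hit case might look like the sketch below. PriceLookup and its pricing rule are hypothetical; Task.Delay stands in for a database or HTTP call.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public sealed class PriceLookup
{
    private readonly ConcurrentDictionary<int, decimal> _cache = new();

    public ValueTask<decimal> GetPriceAsync(int productId)
    {
        // Cache hit: completes synchronously without allocating a Task.
        if (_cache.TryGetValue(productId, out var price))
        {
            return new ValueTask<decimal>(price);
        }

        // Cache miss: wrap the real async path.
        return new ValueTask<decimal>(LoadAndCacheAsync(productId));
    }

    private async Task<decimal> LoadAndCacheAsync(int productId)
    {
        // Stand-in for a database or HTTP call.
        await Task.Delay(10);
        var price = productId * 1.5m;
        _cache[productId] = price;
        return price;
    }
}
```

Callers should await the returned ValueTask exactly once and not store it; if a Task is needed, convert it with AsTask().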

How do I diagnose memory leaks in ASP.NET Core?

Use dotMemory or PerfView to capture memory snapshots. Look for objects that stay in memory after requests complete, such as event handlers that are not unregistered, or static collections that grow unbounded. Also check for large object heap fragmentation: if you see many byte arrays of 85,000 bytes or more (the threshold at which objects are allocated on the LOH), consider pooling.

Is response caching middleware enough for high-traffic APIs?

Response caching middleware works well for GET endpoints that return static or semi-static data. It respects Cache-Control headers and can reduce server load. However, it caches entire responses, so it's not suitable for personalized data. For more granular caching, use output caching (available in .NET 7+) or a distributed cache.

Should I use minimal APIs or controllers for performance?

Minimal APIs have slightly lower overhead because they bypass the MVC controller pipeline (controller activation, MVC model binding, action filters). However, the difference is marginal for most applications, so choose based on maintainability. If you rely on MVC features such as model validation or action filters, controllers may be simpler.

Next Steps: Where to Start Fixing Your API

Now that you know the seven killers, here's a practical plan to address them:

  1. Profile first. Run a load test with tools like Bombardier or k6. Measure response times, CPU, and memory. Identify the worst-performing endpoints.
  2. Eliminate synchronous blocking. Search your codebase for .Result and .Wait(). Refactor to async/await. This alone often yields the biggest gain.
  3. Review serialization. Check response sizes. Create DTOs for endpoints that return large objects. Enable compression.
  4. Audit database queries. Enable EF Core logging. Look for N+1 patterns. Add .Include or consider raw SQL for complex queries.
  5. Add caching. Start with in-memory cache for read-heavy endpoints. Use distributed cache if you have multiple instances.
  6. Check HttpClient usage. Replace new HttpClient() with IHttpClientFactory. Review connection pool settings for your database.
  7. Monitor and iterate. After each change, re-run load tests. Performance tuning is iterative—document your baseline and improvements.

Remember that not every fix is necessary for every API. Start with the killers that match your observed bottlenecks. Over time, these practices become second nature, and your API will handle more traffic with fewer resources.
