Strange memory behaviour with large number of columns #34527

Open
Steve887 opened this issue Aug 26, 2024 · 9 comments

@Steve887

I have run into an issue with our app that results in strange memory usage: spiky memory loads and increasing memory usage over time when there is a large number of properties in the model being selected. What's particularly strange is that if I comment out just one property, the memory issues no longer occur and memory usage is extremely consistent.

A requirement of our system is to change connection strings at runtime, so instead of using Services.AddDbContext, I am registering the data context with Autofac and passing in a connection string at runtime, then using an overridden OnConfiguring to set up the SqlServer provider. If I change this to use Services.AddDbContext then memory does return to normal, but I'm confused why this would only seem to have an effect with a large number of properties.
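
A minimal sketch of the kind of registration described above, for readers who have not opened the attached solution. The names `AppDbContext`, `ContainerSetup` and `getConnectionString` are hypothetical and not taken from the repro; only `OnConfiguring`, `UseSqlServer` and the Autofac registration calls are real APIs.

```csharp
using System;
using Autofac;
using Microsoft.EntityFrameworkCore;

// Hypothetical context; the real one is in the attached MemoryTest solution.
public class AppDbContext : DbContext
{
    private readonly string _connectionString;

    // The connection string is supplied at resolve time instead of through AddDbContext.
    public AppDbContext(string connectionString)
    {
        _connectionString = connectionString;
    }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        // Configure the SQL Server provider for this instance.
        if (!optionsBuilder.IsConfigured)
        {
            optionsBuilder.UseSqlServer(_connectionString);
        }
    }
}

public static class ContainerSetup
{
    // getConnectionString is an assumed callback that returns the current connection string.
    public static void Register(ContainerBuilder builder, Func<string> getConnectionString)
    {
        builder.Register(_ => new AppDbContext(getConnectionString()))
               .AsSelf()
               .InstancePerLifetimeScope();
    }
}
```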

The rest of my setup is very normal, with entities being selected with Includes.

I have attached an app that reproduces the issue, the steps are as follows:

  1. Open the memory test solution
  2. Start the MemoryTest.Api project. This will create a database on the (localdb)\MSSQLLocalDB instance; if you use SQL Express or another server, change this in Program.cs.
  3. Start memory profiling with your favourite program (I used dotMemory)
  4. Run the StressTestApi.ps1 file. This will execute an API call against the endpoint constantly to simulate a load.

Running the application as is results in the following memory graph:
[image: memory graph with the ItemImage property present]

Open Item.cs and comment out the ItemImage property. Then open ImageMap.cs and comment out the ItemImage property mapping. Start the application again, attach the memory profiler and rerun the PowerShell script. This results in the following memory graph:
[image: memory graph with the ItemImage property commented out]

The big differences seem to be in the gen 1 and 2 heaps, although taking memory snapshots doesn't really reveal anything obvious. The total memory also grows over time when the column is there.

While it's easy enough to say "just decrease the model size", I am working with a large legacy model and cannot make big changes like that. I would also like to know the underlying reason why the memory behaviour changes so drastically just by changing one column. If the setup were wrong, I would expect the problem to happen all the time.

This also occurs with .NET 7 and EF Core 7, on Windows and Linux.

Please let me know if there's any more information to supply.

MemoryTest.zip

Include provider and version information

EF Core version: 8.0.8
Database provider: Microsoft.EntityFrameworkCore.SqlServer
Target framework: .NET 8
Operating system: Windows

@roji
Member

roji commented Aug 26, 2024

Before taking a look at your repro, it's well-known that SqlClient has some severe memory/performance issues with reading large binary columns asynchronously (dotnet/SqlClient#593), so that could explain the memory behavior when adding your "image" property. This should be easily verifiable by switching to sync I/O (SaveChanges() instead of SaveChangesAsync()) as a test - can you please do that?
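
Because the repro reads data rather than saving it, the synchronous test would presumably apply to the query call itself. A minimal sketch, assuming the query shape posted later in this thread; the entity classes and the `int` key type are placeholders for illustration, not the repro's actual model, and key and relationship mapping are omitted.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

// Placeholder entities (assumed shapes; the real ones are in MemoryTest.zip).
public class Item { public int Id { get; set; } }
public class ConsultationItem { public int Id { get; set; } public Item? Item { get; set; } }
public class ConsultationView
{
    public int Id { get; set; }
    public List<ConsultationItem> ConsultationItems { get; set; } = new();
}
public class VisitView
{
    public int VisitNumber { get; set; }
    public List<ConsultationView> ConsultationViews { get; set; } = new();
}

public static class VisitQueries
{
    // Synchronous variant of the query: FirstOrDefault instead of FirstOrDefaultAsync.
    public static VisitView? GetVisit(DbContext context, int key)
    {
        return context.Set<VisitView>()
            .Include(x => x.ConsultationViews).ThenInclude(x => x.ConsultationItems).ThenInclude(x => x.Item)
            .FirstOrDefault(p => p.VisitNumber == key);
    }
}
```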

@Steve887
Author

Steve887 commented Aug 26, 2024

> Before taking a look at your repro, it's well-known that SqlClient has some severe memory/performance issues with reading large binary columns asynchronously (dotnet/SqlClient#593), so that could explain the memory behavior when adding your "image" property. This should be easily verifiable by switching to sync I/O (SaveChanges() instead of SaveChangesAsync()) as a test - can you please do that?

It's not a binary column, it's just a 50 character string. It also doesn't have any data in it.
[image]

@ajcvickers ajcvickers self-assigned this Aug 26, 2024
@ajcvickers
Member

@Steve887 Do you see the same behavior if you change the query to be no-tracking? For example:

return await _context.Set<VisitView>()
	.Include(x => x.ConsultationViews).ThenInclude(x => x.ConsultationItems).ThenInclude(x => x.Item)
	.AsNoTracking()
	.FirstOrDefaultAsync(p => p.VisitNumber == key);

@Steve887
Author

@ajcvickers Hi, I tried adding this and the memory behaviour is similar. In fact, total memory is actually higher.
[image: memory graph with AsNoTracking added]

@ajcvickers ajcvickers removed their assignment Aug 31, 2024
@Steve887
Author

Steve887 commented Sep 2, 2024

@ajcvickers Is there any more information I can provide for this one?

@satviktechie1986

Try using a split query; it might work:

return await _context.Set<VisitView>()
	.Include(x => x.ConsultationViews).ThenInclude(x => x.ConsultationItems).ThenInclude(x => x.Item)
	.AsSplitQuery()
	.FirstOrDefaultAsync(p => p.VisitNumber == key);

@Steve887
Author

@satviktechie1986 Setting AsSplitQuery does result in normal memory usage. However, this isn't really a good solution for our actual app: we have many queries across the app, and we don't want the dramatic increase in network round trips that enabling this setting globally would cause.

I would still like to find out why the original query behaves so differently when just one extra property is included, so I can take those findings and apply them in our main application.
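
For reference, a sketch of what enabling split queries globally looks like in EF Core, assuming a context configured through OnConfiguring. `AppDbContext` is a hypothetical name; `UseQuerySplittingBehavior` is the provider option EF Core exposes for this.

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical context used only to show where the option goes.
public class AppDbContext : DbContext
{
    private readonly string _connectionString;

    public AppDbContext(string connectionString) => _connectionString = connectionString;

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        // Every query with collection Includes now runs as multiple SQL statements by default.
        optionsBuilder.UseSqlServer(
            _connectionString,
            sql => sql.UseQuerySplittingBehavior(QuerySplittingBehavior.SplitQuery));
    }
}
```

With this default in place, an individual query can opt back into a single statement by calling `.AsSingleQuery()`.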

@satviktechie1986

Alternatively, you can use a projection:

return await _context.Set<VisitView>()
	.AsNoTracking()
	.Where(p => p.VisitNumber == key)
	.Select(v => new
	{
		v.VisitNumber,
		ConsultationViews = v.ConsultationViews.Select(cv => new
		{
			cv.Id,
			ConsultationItems = cv.ConsultationItems.Select(ci => new
			{
				ci.Id,
				ci.Item.Name
			})
		})
	})
	.FirstOrDefaultAsync();

@Steve887
Author

@satviktechie1986 Again, this wouldn't work for our actual app, as we have too many queries to realistically select just the required columns in each one.


7 participants
@Steve887 @ajcvickers @roji @AndriySvyryd @satviktechie1986 and others