
JOIN instead of CROSS APPLY in generated query in SQL Server #17936

Open
dmitry-slabko opened this issue Sep 19, 2019 · 71 comments
@dmitry-slabko

dmitry-slabko commented Sep 19, 2019

EF Core 3.0 Preview 5 would generate a CROSS APPLY from a LINQ query like this:

from navObject in Context.NavObjects
join vessel in Context.Vessels on navObject.VesselId equals vessel.VesselId
from passage in Context.Passages
    .Where(x => x.VesselId == navObject.VesselId && x.ActualDepartureTime.Value <= fromTime)
    .OrderByDescending(x => x.ActualDepartureTime)
    .Take(1)
    .DefaultIfEmpty()

The generated query would be:

SELECT ... FROM [NavObject] AS [no]
INNER JOIN [Vessel] AS [vessel] ON [no].[ObjectId] = [vessel].[ObjectId]
CROSS APPLY (
    SELECT TOP(1) [x].*
    FROM [Passage] AS [x]
    WHERE ([x].[ObjectId] = [no].[ObjectId]) AND ([x].[ActualDepartureTime] <= @__fromTime_1)
    ORDER BY [x].[ActualDepartureTime] DESC
) AS [t]

In RC1 the query instead contains JOINs over nested SELECTs, which causes bad performance and timeouts:

SELECT ... FROM [NavObject] AS [n]
INNER JOIN [Vessel] AS [v] ON [n].[ObjectId] = [v].[ObjectId]
INNER JOIN (
    SELECT [t].....
    FROM (
        SELECT [p]...., ROW_NUMBER() OVER(PARTITION BY [p].[ObjectId] ORDER BY [p].[ActualDepartureTime] DESC) AS [row]
        FROM [Passage] AS [p]
        WHERE ([p].[ActualDepartureTime] <= @__fromTime_1)
    ) AS [t]
    WHERE [t].[row] <= 1
) AS [t0] ON [n].[ObjectId] = [t0].[ObjectId]

As you can see, the Preview 5 query is clear and efficient, while the RC1 query is not. Please fix this query generation pattern.

Further technical details

EF Core version: 3.0 RC1 (versus 3.0 Preview 5)
Database provider: Microsoft.EntityFrameworkCore.SqlServer
Target framework: .NET Core 3.0
Operating system: Windows 10
IDE: Visual Studio 2019 16.2.5

@smitpatel
Contributor

@roji - I believe you investigated some perf work around ROW_NUMBER. Can you explain why generating ROW_NUMBER is more efficient than CROSS APPLY?

@roji roji added the area-perf label Sep 19, 2019
@roji
Member

roji commented Sep 19, 2019

I'll take a look at this soon.

@smitpatel smitpatel added this to the 3.1.0 milestone Sep 20, 2019
@roji
Member

roji commented Oct 22, 2019

Some very basic research I did back in June on window functions vs. PostgreSQL lateral join (which is equivalent to SQL Server cross apply in this context), showing window functions to be superior. I'll investigate the scenario above more deeply to understand the exact differences etc.

See also this article referenced by @divega at the time: https://blog.jooq.org/2017/09/22/how-to-write-efficient-top-n-queries-in-sql/


Here is a comparison between lateral join and window function, for getting the first row. Window functions win hands down (look at the second cost number on the outermost plan node). In theory the plan could vary due to table size (I only had very few rows), but I have no time to go deeper.

For info on reading PostgreSQL EXPLAIN results: https://www.postgresql.org/docs/current/using-explain.html

Lateral join

test=# EXPLAIN SELECT o.*
test-#         FROM customers AS c,
test-#              LATERAL (
test(#                  SELECT *
test(#                  FROM orders
test(#                  WHERE customer_id = c.id
test(#                  ORDER BY price
test(#                  LIMIT 1
test(#               ) as o;
                                QUERY PLAN                                 
---------------------------------------------------------------------------
 Nested Loop  (cost=35.55..45199.77 rows=1270 width=12)
   ->  Seq Scan on customers c  (cost=0.00..22.70 rows=1270 width=4)
   ->  Limit  (cost=35.55..35.55 rows=1 width=12)
         ->  Sort  (cost=35.55..35.57 rows=10 width=12)
               Sort Key: orders.price
               ->  Seq Scan on orders  (cost=0.00..35.50 rows=10 width=12)
                     Filter: (customer_id = c.id)

Window function

test=# EXPLAIN SELECT o.*
test-#         FROM customers AS c
test-#                  JOIN (
test(#             SELECT *, rank() OVER (PARTITION BY o.customer_id ORDER BY o.id)
test(#             FROM orders AS o
test(#         ) AS o ON o.customer_id = c.id AND o.rank = 1;
                                         QUERY PLAN                                          
---------------------------------------------------------------------------------------------
 Hash Join  (cost=208.97..236.53 rows=10 width=20)
   Hash Cond: (c.id = o.customer_id)
   ->  Seq Scan on customers c  (cost=0.00..22.70 rows=1270 width=4)
   ->  Hash  (cost=208.84..208.84 rows=10 width=20)
         ->  Subquery Scan on o  (cost=142.54..208.84 rows=10 width=20)
               Filter: (o.rank = 1)
               ->  WindowAgg  (cost=142.54..183.34 rows=2040 width=20)
                     ->  Sort  (cost=142.54..147.64 rows=2040 width=12)
                           Sort Key: o_1.customer_id, o_1.id
                           ->  Seq Scan on orders o_1  (cost=0.00..30.40 rows=2040 width=12)

@dmitry-slabko
Author

In our case with MS SQL the latter query (select with joins from select) could not complete at all. Either PostgreSQL and MS SQL use very different ways to optimize such queries, or the optimizations heavily depend on other database specifics, such as indexes, etc.

@roji
Member

roji commented Oct 22, 2019

Thanks @dmitry-slabko, I'll do a more in-depth investigation and comparison of the two databases in the next few days.

@dmitry-slabko
Author

@roji - I can send you the two complete queries along with their execution plans, as they are a bit larger than what I initially posted.

@roji
Member

roji commented Oct 22, 2019

That could definitely help - along with the actual schema please. The actual data could also be relevant, or at least the number of rows - plans can sometimes be different based on table size etc.

@dmitry-slabko
Author

dmitry-slabko commented Oct 22, 2019

This is the full linq query:

from navObject in Context.NavObjects
join vessel in Context.Vessels on navObject.VesselId equals vessel.VesselId
from passage in Context.Passages
    .Where(x => x.VesselId == navObject.VesselId && x.ActualDepartureTime.HasValue && x.ActualDepartureTime.Value <= fromTime)
    .OrderByDescending(x => x.ActualDepartureTime)
    .Take(1)
    .DefaultIfEmpty()
from simulationPoint in Context.SimulationDetails
    .Where(s => !isPresent && s.PassageId == passage.PassageId && s.Time <= fromTime)
    .OrderByDescending(s => s.Time)
    .Take(1)
    .DefaultIfEmpty()
from position in Context.Positions
    .Where(p => p.VesselId == navObject.VesselId && p.Time <= fromTime)
    .OrderByDescending(p => p.Time)
    .Take(1)
    .DefaultIfEmpty()
from trackPoint in Context.TrackPoints
    .Where(t => t.VesselId == navObject.VesselId && t.Time <= fromTime)
    .OrderByDescending(t => t.Time)
    .Take(1)
    .DefaultIfEmpty()
where simulationPoint != null || position != null || trackPoint != null

I attached both generated queries, their execution plans, and schema files for involved tables.
plans-and-queries.zip

@roji
Member

roji commented Oct 22, 2019

Thanks. As that query uses the same construct multiple times, if there's a perf issue it would indeed be magnified here. I'll dive into this soon.

@roji
Member

roji commented Oct 24, 2019

@dmitry-slabko I am exploring these two options and seeing some interesting phenomena with window functions and CROSS APPLY.

However, to complete the picture, could you please also submit your EF Core model? A simple, self-contained console program would be ideal, including the model and the problematic query.

@roji
Member

roji commented Oct 24, 2019

Note also that the tables.sql you provided contains some issues (e.g. references to [dbo].[ObjectType] which does not exist)

@smitpatel
Contributor

@ajcvickers - I propose punting this issue.

@roji
Member

roji commented Oct 24, 2019

I'd like to at least complete the investigation, even though I agree there's little chance we'll actually change anything for 3.1.

@dmitry-slabko
Author

dmitry-slabko commented Dec 6, 2019

Ok, some more input on this problem. Here is the linq:

from vessel in Context.Vessels.Where(...)
from position in Context.Positions
    .Where(t => t.VesselId == vessel.VesselId && t.Time <= fromTime)
    .OrderByDescending(s => s.Time)
    .Take(1)
    .DefaultIfEmpty()
select new LocationPoint ...

The intent is to get the latest point for each vessel id.
In 3.0 Preview 5 this would generate such SQL:

SELECT ... FROM [Vessel] AS [v]
CROSS APPLY (
    SELECT [t3].*
    FROM (
        SELECT NULL AS [empty]
    ) AS [empty1]
    LEFT JOIN (
        SELECT TOP(1) [p0].*
        FROM [Position] AS [p0]
        WHERE ([p0].[ObjectId] = [v].[ObjectId]) AND ([p0].[Time] <= @__fromTime_4)
        ORDER BY [p0].[Time] DESC
    ) AS [t3] ON 1 = 1
) AS [t4]

The subquery to retrieve data from Position is effectively filtered.

Now, after Preview 5 and up to the 3.1 release, the query is:

SELECT ... FROM [Vessel] AS [v]
LEFT JOIN (
    SELECT ...
    FROM (
        SELECT ..., ROW_NUMBER() OVER(PARTITION BY [p].[ObjectId] ORDER BY [p].[Time] DESC) AS [row]
        FROM [Position] AS [p]
        WHERE [p].[Time] <= @__fromTime_1
    ) AS [t]
    WHERE [t].[row] <= 1
) AS [t0] ON [v].[ObjectId] = [t0].[ObjectId]

And this is the problem: the inner subquery retrieves all rows from the Position table, which in our case is 16+ million rows, and may be far more for some other customers. Yet the subquery is executed for each row in the outer query. So it appears that the use of partitioned queries for MS SQL was based on wrong assumptions, as this pattern generates queries that do not perform well even on small data sets, while on large data sets they simply kill the reader.

I cannot say how this pattern behaves on other servers, such as PostgreSQL and Oracle, but for MS SQL it is not applicable. I would highly recommend changing the query generation pattern for such LINQ expressions back to what it was up until 3.0 Preview 5.
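For reference, the two shapes discussed in this thread are meant to be semantically equivalent ways of getting the latest row per key. A minimal sqlite3 sketch (made-up table and data; SQLite has no CROSS APPLY, so a correlated top-1 subquery stands in for it) confirming both patterns return the same rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Position (ObjectId INTEGER, Time TEXT);
    INSERT INTO Position VALUES
        (1, '2019-01-01'), (1, '2019-02-01'),
        (2, '2019-03-01'), (2, '2019-01-15');
""")

# Shape 1: ROW_NUMBER over a partition, filtered in an outer query
# (the pattern EF Core 3.x generates).
row_number_sql = """
    SELECT ObjectId, Time
    FROM (
        SELECT ObjectId, Time,
               ROW_NUMBER() OVER (PARTITION BY ObjectId ORDER BY Time DESC) AS rn
        FROM Position
    )
    WHERE rn <= 1
"""

# Shape 2: a per-row top-1 lookup (the CROSS APPLY idea; a correlated
# MAX subquery stands in for it, assuming Time is unique per ObjectId).
apply_like_sql = """
    SELECT p.ObjectId, p.Time
    FROM Position AS p
    WHERE p.Time = (SELECT MAX(p2.Time) FROM Position AS p2
                    WHERE p2.ObjectId = p.ObjectId)
"""

a = set(conn.execute(row_number_sql))
b = set(conn.execute(apply_like_sql))
print(a == b)  # both return the latest Position per ObjectId
```

The results are identical; the whole debate in this issue is about which shape the database engine executes faster at scale, not about correctness.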

@roji roji removed this from the Backlog milestone Dec 6, 2019
@roji
Member

roji commented Dec 6, 2019

@smitpatel just a quick guess, but shouldn't the WHERE [t].[row] <= 1 in the second query be inside the inner-most subquery? Not sure this would have an impact but seems a bit more correct regardless?

In any case I'll try to find time to look into this, although it won't be right away.

@dmitry-slabko
Author

I tried to execute the innermost query, and its performance issue comes from this part:
ROW_NUMBER() OVER(PARTITION BY [t].[VesselId] ORDER BY [t].[EventDateTimeUTC] DESC) AS [row]
Most likely, bad performance here is due to large row count. So, moving the check on t.row<=1 will not help, as the partitioning itself is the culprit.

@dmitry-slabko
Author

One more note: the table even has an index on VesselId + EventDateTimeUTC columns, yet it does not help.

@smitpatel
Contributor

just a quick guess, but shouldn't the WHERE [t].[row] <= 1 in the second query be inside the inner-most subquery? Not sure this would have an impact but seems a bit more correct regardless?

No. The innermost subquery computes the value of row in its projection. Any filter on such a computed column has to be at an outer level; a WHERE clause cannot reference a computed column from its own SELECT list.
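This restriction is standard SQL: the WHERE clause is evaluated before window functions, so the computed value does not exist yet at that stage. A small sqlite3 sketch (made-up table and column names) of both the rejected and the accepted form:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (VesselId INTEGER, EventDateTimeUTC TEXT)")
conn.execute("INSERT INTO t VALUES (1, '2021-01-01'), (1, '2021-06-01')")

# Filtering on a window function directly in WHERE is rejected:
# the window value is computed after WHERE has already run.
try:
    conn.execute("""
        SELECT VesselId,
               ROW_NUMBER() OVER (PARTITION BY VesselId
                                  ORDER BY EventDateTimeUTC DESC) AS rn
        FROM t
        WHERE rn <= 1
    """)
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# The filter has to sit one level out, which is why EF Core wraps the
# window function in a subquery and filters on the outer level:
rows = conn.execute("""
    SELECT VesselId, EventDateTimeUTC
    FROM (
        SELECT VesselId, EventDateTimeUTC,
               ROW_NUMBER() OVER (PARTITION BY VesselId
                                  ORDER BY EventDateTimeUTC DESC) AS rn
        FROM t
    )
    WHERE rn <= 1
""").fetchall()
print(rejected, rows)
```

The same two-level structure appears in every ROW_NUMBER query shown in this thread.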

@dmitry-slabko
Author

Ok, yet another note :)
I created a composite index on VesselId ASC and EventDateTimeUTC DESC, and with this index the query finally behaves. However, I do not think this is a proper solution: we cannot have indexes for all possible querying needs.

@palhal

palhal commented Nov 19, 2021

@roji sure.

3.1 SQL:

SELECT [t].[Id], [t0].[Timestamp], [t0].[VariableId]
FROM (
    SELECT TOP(2) [v].[Id]
    FROM [Variables] AS [v]
    WHERE [v].[Id] = @__variableId_0
) AS [t]
OUTER APPLY (
    SELECT TOP(2) [v0].[Timestamp], [v0].[VariableId]
    FROM [Values] AS [v0]
    WHERE [t].[Id] = [v0].[VariableId]
    ORDER BY [v0].[Timestamp] DESC
) AS [t0]
ORDER BY [t].[Id], [t0].[Timestamp] DESC, [t0].[VariableId]

6.0 SQL:

SELECT [t].[Id], [t0].[Timestamp], [t0].[VariableId]
FROM (
    SELECT TOP(2) [v].[Id]
    FROM [Variables] AS [v]
    WHERE [v].[Id] = @__variableId_0
) AS [t]
LEFT JOIN (
    SELECT [t1].[Timestamp], [t1].[VariableId]
    FROM (
        SELECT [v0].[Timestamp], [v0].[VariableId], ROW_NUMBER() OVER(PARTITION BY [v0].[VariableId] ORDER BY [v0].[Timestamp] DESC) AS [row]
        FROM [Values] AS [v0]
    ) AS [t1]
    WHERE [t1].[row] <= 2
) AS [t0] ON [t].[Id] = [t0].[VariableId]
ORDER BY [t].[Id], [t0].[VariableId], [t0].[Timestamp] DESC

Models:

public class Variable
{
    public int Id { get; private set; }
    public string Name { get; set; }
    public ICollection<VariableValue> Values { get; private set; }
}

public void Configure(EntityTypeBuilder<Variable> builder)
{
    builder.Property(v => v.Id)
        .ValueGeneratedOnAdd();

    builder.HasKey(v => v.Id)
        .IsClustered();
}

public class VariableValue
{
    public int VariableId { get; set; }
    public DateTimeOffset Timestamp { get; set; }
    public double? Value { get; set; }
    public Variable Variable { get; private set; }
}

public void Configure(EntityTypeBuilder<VariableValue> builder)
{
    builder.HasKey(w => new { w.VariableId, w.Timestamp })
        .IsClustered();

    builder.HasOne(w => w.Variable)
        .WithMany(v => v.Values)
        .HasForeignKey(w => w.VariableId);
}

@roji
Member

roji commented Nov 19, 2021

Thanks @palhal and @mdawood1991; we'll still need to do a perf investigation to see if there are scenarios where OUTER APPLY performs worse than the ROW_NUMBER approach. Clearing milestone to consider for 7.0.

@roji roji removed this from the Backlog milestone Nov 19, 2021
@palhal

palhal commented Nov 29, 2021

I'd like to share another workaround that I've found in case others have the same problem. The good news is that it's simpler and it also works with the more advanced queries I had problems with.

Adding a second OrderBy (after Take):

var _ = _db.Variables
    .Where(v => v.Id == variableId)
    .Select(v => v.Values
        .OrderByDescending(w => w.Timestamp)
        .Take(2)
        .OrderByDescending(w => w.Timestamp)
        .Select(w => w.Timestamp)
        .ToList())
    .SingleOrDefault();

produces the following SQL:

SELECT [t].[Id], [t0].[Timestamp], [t0].[VariableId]
FROM (
    SELECT TOP(2) [v].[Id]
    FROM [Variables] AS [v]
    WHERE [v].[Id] = @__variableId_0
) AS [t]
OUTER APPLY (
    SELECT [t1].[Timestamp], [t1].[VariableId]
    FROM (
        SELECT TOP(2) [v0].[VariableId], [v0].[Timestamp]
        FROM [Values] AS [v0]
        WHERE [t].[Id] = [v0].[VariableId]
        ORDER BY [v0].[Timestamp] DESC
    ) AS [t1]
) AS [t0]
ORDER BY [t].[Id], [t0].[Timestamp] DESC

Maybe not as clean as 3.1, but the result is returned instantly which means I can now actually use EF Core 6 :)

@ajcvickers ajcvickers added this to the Backlog milestone Nov 30, 2021
@roji
Member

roji commented Dec 10, 2021

@iRandell removing the first OrderBy makes the result non-deterministic - you're effectively asking to get any 2 values, and only then order those two values by timestamp. Assuming you want the first two values by timestamp, this is unsafe and could lead to incorrect results.

@serge-bohorad

serge-bohorad commented Dec 10, 2021

@iRandell removing the first OrderBy makes the result non-deterministic - you're effectively asking to get any 2 values, and only then order those two values by timestamp. Assuming you want the first two values by timestamp, this is unsafe and could lead to incorrect results.

You're right. I didn't notice that. Thank you!

Also, you can use Skip(0) instead of a second OrderBy() after the Take(). Even though it looks better, it adds one more ORDER BY clause.

Provider: Npgsql.EntityFrameworkCore.PostgreSQL (6.0.1)

The use case with Skip(0):

var result = context.Chats
    .Include(chat => chat.Messages
        .OrderByDescending(message => message.CreatedAt)
        .Take(1)
        .Skip(0))
    .Where(chat => chat.Id == 1)
    .ToList();
SELECT c."Id",
       t0."Id",
       t0."ChatId",
       t0."CreatedAt"
FROM "Chats" AS c
LEFT JOIN LATERAL
  (SELECT t."Id",
          t."ChatId",
          t."CreatedAt"
   FROM
     (SELECT m."Id",
             m."ChatId",
             m."CreatedAt"
      FROM "Messages" AS m
      WHERE c."Id" = m."ChatId"
      ORDER BY m."CreatedAt" DESC
      LIMIT 1) AS t
   ORDER BY t."CreatedAt" DESC
   OFFSET 0) AS t0 ON TRUE
WHERE c."Id" = 1
ORDER BY c."Id",
         t0."CreatedAt" DESC

The use case with two OrderBy():

var result = context.Chats
    .Include(chat => chat.Messages
        .OrderByDescending(message => message.CreatedAt)
        .Take(1)
        .OrderByDescending(message => message.CreatedAt))
    .Where(chat => chat.Id == 1)
    .ToList();
SELECT c."Id",
       t0."Id",
       t0."ChatId",
       t0."CreatedAt"
FROM "Chats" AS c
LEFT JOIN LATERAL
  (SELECT t."Id",
          t."ChatId",
          t."CreatedAt"
   FROM
     (SELECT m."Id",
             m."ChatId",
             m."CreatedAt"
      FROM "Messages" AS m
      WHERE c."Id" = m."ChatId"
      ORDER BY m."CreatedAt" DESC
      LIMIT 1) AS t) AS t0 ON TRUE
WHERE c."Id" = 1
ORDER BY c."Id",
         t0."CreatedAt" DESC

@dhedey

dhedey commented Jan 10, 2022

Just to add to the list of relevant reports/data on this, I wanted to describe my use case, which shows an extreme difference between the two generated query shapes.

With this query in EF Core 6.x against PostgreSQL, I'm getting an EF Core generated query with the row number partition strategy that takes over 2 minutes to run - versus the LATERAL JOIN which is instant:

from resource in dbContext.Resource(rri, stateVersion)
from supplyHistory in dbContext.Set<ResourceSupplyHistory>()
    .Where(h => h.ResourceId == resource.Id && h.FromStateVersion <= stateVersion)
    .OrderByDescending(h => h.FromStateVersion)
    .Take(1)
select supplyHistory;

Note that each query just returns 1 result, but the ResourceSupplyHistory table is large.

I do have a workaround to 'fix' this in one of a few ways:

  • A DB round trip to first load the resource
  • Inlining the Resource queryable into the h.ResourceId == resource.Id comparison to use a subquery Contains instead (the code is less nice, mind)
  • Creating a TVF in the schema (which EF Core couldn't optimise around, I think)
  • Writing the SQL directly in a FromSqlRaw

@roji
Member

roji commented Jan 10, 2022

Thanks @dhedey and others - we're indeed aware that there are some scenarios where lateral join/cross apply is superior. The problem here is to work out exactly when (and possibly in which databases) it's preferred over row number. I hope to investigate this for 7.0.

@MoMack20

Any news if this update will make it into 7.0? Seeing a lot of cases where this is causing increased computing time.

@ajcvickers
Member

This issue is in the Backlog milestone. This means that it is not planned for the next release (EF Core 7.0). We will re-assess the backlog following this release and consider this item at that time. However, keep in mind that there are many other high priority features with which it will be competing for resources. Make sure to vote (👍) for this issue if it is important to you.

@NetMage

NetMage commented Nov 2, 2022

Personally, I think a bug that has been open for three years, and was a regression, should be prioritized higher, especially since there is no obvious reason for the queries failing from the POV of an EF Core user.

A simple test on a small database shows queries at least 16 times slower with windowing than with CROSS APPLY, switched just by adding Skip(0).

@roji
Member

roji commented Nov 3, 2022

@NetMage technically this isn't a bug, in the sense that results are correct; it's a perf issue.

Could you please post the small repro and queries showing the x16 perf difference? That would help get this prioritized.

@MoMack20

Any news on this performance update?

@roji
Member

roji commented Feb 20, 2023

This issue is in the Backlog milestone. This means that it is not planned for the next release (EF Core 8.0). We will re-assess the backlog following this release and consider this item at that time. However, keep in mind that there are many other high priority features with which it will be competing for resources. Make sure to vote (👍) for this issue if it is important to you.

One thing that would help prioritize this is a perf comparison that shows this to be significant.

@MoMack20

MoMack20 commented Apr 1, 2023

Ran into an instance of this in production. After a table hit a certain number of records the query plan changed and it went from a 300 ms query to a 3 minute query.

Over the last couple of years of observing this query, my team has had to make multiple changes that differ from the implementation that should work without issue.

Standard query I expect to work performantly.

await ctx.MainEntitity
	.Where(me => me.StatusId < 6)
	.Select(me => new
	{
		me.Id,
		me.Name,
		RelatedEntityInfo = me.RelatedEntity
			.Select(re => new
			{
				re.MainEntityId,
				re.DateCreated,
				re.Status
			}).FirstOrDefault()
	}).ToListAsync()

Modified query that was performing well until the issue in production yesterday.

await ctx.MainEntitity
	.Where(me => me.StatusId < 6)
	.Select(me => new
	{
		me.Id,
		me.Name,
		RelatedEntityInfo = me.RelatedEntity
			.Where(re => re.MainEntitityId == me.Id)
			.Select(re => new
			{
				re.MainEntityId,
				re.DateCreated,
				re.Status
			}).FirstOrDefault()
	}).ToListAsync()

Current query that solved the performance issues. This way forces the filter into the same block instead of using the windowed function.

await ctx.MainEntitity
	.Where(me => me.StatusId < 6)
	.Select(me => new
	{
		me.Id,
		me.Name,
		RelatedEntityInfo = me.RelatedEntity
			.Where(re => re.MainEntitityId == me.Id)
			.Select(re => new
			{
				re.MainEntityId,
				re.DateCreated,
				re.Status
			}).Take(1)
			.FirstOrDefault()
	}).ToListAsync()

Here is a graph of the performance impact, showing when the query went sour and when the last change was put into place.
[screenshot]

Here is the SQL query given for the EF query without the "Take" statement.

LEFT JOIN (
    SELECT [t1].[Status], [t1].[DateCreated], [t1].[c], [t1].[MainEntitityId]
    FROM (
        SELECT [r].[Status], [r].[DateCreated], 1 AS [c], [r].[MainEntitityId], ROW_NUMBER() OVER(PARTITION BY [r].[MainEntitityId], [r].[MainEntitityId] ORDER BY [r].[Id]) AS [row]
        FROM [dbo].[RelatedEntity] AS [r]
    ) AS [t1]
    WHERE [t1].[row] <= 1
) AS [t0] ON [p1].[Id] = [t0].[MainEntitityId] AND [t].[MainEntitityId] = [t0].[MainEntitityId]

Here is the SQL query given for the EF query with the "Take" statement.

OUTER APPLY (
    SELECT TOP(1) [t1].[MainEntitityId], [t1].[Status], [t1].[DateCreated], 1 AS [c]
    FROM (
        SELECT TOP(1) [r].[MainEntitityId], [r].[Status], [r].[DateCreated]
        FROM [dbo].[RelatedEntity] AS [r]
        WHERE ([p1].[Id] IS NOT NULL) AND [p1].[Id] = [r].[MainEntitityId] AND [t].[MainEntitityId] = [r].[MainEntitityId]
    ) AS [t1]
) AS [t0]

Using Take(1) before FirstOrDefault() seems to force the OUTER APPLY, with the filtering done inside the same block as the select, as opposed to the LEFT JOIN with the filter applied outside that block.

@roji
Member

roji commented Apr 5, 2023

@MoMack20 thanks for sharing - this is definitely an issue I intend to explore (though it probably won't happen for 8.0).

With this issue (and other query performance issues), what would be really helpful is a simple, minimal repro comparing the two SQLs, showing both query plans and clear benchmark results. We get reports from time to time, but without this information it is harder to investigate properly. If you could help with that, it could go a long way toward bumping up the priority of this issue.

@MoMack20

MoMack20 commented Apr 5, 2023

@roji Is there any specific template I can follow for setting up the database for that repro? Would I need to include the data in a SqlPackage file if the behavior is determined by the number of rows in the tables?

@roji
Member

roji commented Apr 5, 2023

@MoMack20 no template really - as long as you deliver a minimal, clear repro that's fine. In most cases, a simple, minimal SQL script for creating the database schema and some data is ideal; in this case, since lots of data may be needed, you can either generate it programmatically in the SQL script, or send e.g. a bacpac for SQL Server for a dump of the database (if needed, my email is listed on my github profile).

Basically any way in which we can easily get your database up and running and clearly see the plans and perf differences between the two queries.

@Gagarin23

@roji Hi, I am not a SQL expert, but I did a little research.
We have 2 tables:

CREATE TABLE BusinessPartners
(
    Id bigint identity
        CONSTRAINT BusinessPartners_pk PRIMARY KEY WITH (fillfactor = 90)
);

CREATE TABLE Orders
(
    Id bigint identity
        CONSTRAINT Orders_pk PRIMARY KEY WITH (fillfactor = 90),
    BusinessPartnerId bigint,
    CreatedOn         datetime2
);

Indexes:

CREATE INDEX IX_Orders_BusinessPartnerId_CreatedOn
    ON dbo.Orders (BusinessPartnerId ASC, CreatedOn DESC) INCLUDE (Id)
    WITH (fillfactor = 90);

Test queries:

select
    bp.id, o.Id
from BusinessPartners bp
cross apply (
    select top 1 Id
    from Orders o
    where o.BusinessPartnerId = bp.Id
    order by o.CreatedOn desc
) o
where bp.Id between 100000 and 200000

select
    bp.id, o.Id
from BusinessPartners bp
inner join (
        select Id, BusinessPartnerId
        from (
            select Id, row_number() over (partition by BusinessPartnerId order by CreatedOn desc) n, BusinessPartnerId
            from Orders
        ) src
        where src.n <= 1
) O on bp.Id = o.BusinessPartnerId
where bp.Id between 100000 and 200000

Window functions were faster in scenarios with both the scalar bp.Id filter and the range filter from the example above. In the case of the scalar filter the query is optimized and the window function does not aggregate over the whole table, as you can see in the screenshot below:
[screenshot]
The range filter case is more interesting. The query processes many rows, but it does so in one batch, only ONE time. The total number of logical reads of the Orders table is 923:
[screenshot]
[screenshot]
The situation with cross apply is deplorable. The actual number of result rows was 13313, but the cross apply subquery was called 55698 times: 55698 index seeks. As far as I know (but I could be wrong), these index lookups are not cached, so each call costs its own read. As a result, we have 55698 logical reads.
[screenshot]
[screenshot]
As far as I know, window functions are calculated in memory. Consequently, when dealing with huge tables they can slow the system down catastrophically, and then cross apply is the better choice. Since EF Core generates window functions, one solution is to write functions on the database side and import them with EF (example in the console project). Once again, I am not a database expert, but I hope I have shed some light on this problem.
Repo with db backup https://github.com/Gagarin23/EfConsoleTest

@roji
Member

roji commented Apr 13, 2023

@Gagarin23 thanks for the testing and the in-depth analysis, this is exactly the kind of thing that can help. Unfortunately, it may take some time before we're able to properly look into this and improve the situation - but it definitely is high up on my priority list. We also have to verify what happens on other databases, but if the behavior is consistently similar, we can definitely consider changing what EF does here.

Note to self: I've forked the repro above to https://github.com/roji/EF17936 just in case it disappears

@MoMack20

@roji I finally got around to making a repo for this
https://github.com/MoMack20/EFCore-Issue-17936

The DACPAC file should have everything necessary to stand up the database. Query plans and EF-generated SQL are all included in the repo as well.

Results as of my last run:
Average Windowed Time: 910.6ms
Average Windowed with extra where Time: 853.6ms
Average Apply Time: 93ms

@roji
Member

roji commented Apr 24, 2023

Thank you @MoMack20.

I do plan to take a deep look at this and at other query performance issues, but unfortunately there's a good chance this won't yet happen in 8.0 due to many other competing priorities. There's a good chance we'll do a significant push in this area for 9.0.

FYI I've forked your repo to https://github.com/roji/EFCore-Issue-17936 to make sure it's still there when I get to it.
