Issue with select sub entity

Posts   
 
    
Torsten
User
Posts: 26
Joined: 13-Mar-2015
# Posted on: 17-Feb-2020 09:15:00   

We used this query in v4.2 for a while and it works fine:


var data = PVBerechnungPreislisteEntity.Linq().Select(a => a.Artikel).Take(10000)
                .WithPath<ArtikelEntity>(a => a.Prefetch<ArtikelQuelleEntity>(b => b.ArtikelQuellen).SubPath(q => q.Prefetch(b => b.Lieferant))
                    .Prefetch(b => b.ArtikelGruppe)
                    .Prefetch(b => b.Hersteller)
                    .Prefetch(b => b.ArtikelChannels))
                .ToList();

There are some times a lot PVBerechnungPreislisteEntity in DB.

Linq() return DataSource2<PVBerechnungPreislisteEntity>

After upgrading to v5.6, we get this error:

The incoming request has too many parameters. The server supports a maximum of 2100 parameters. Reduce the number of parameters and resend the request.

and the generated sql looks strange to me.

Whats wrong here?

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39749
Joined: 17-Aug-2003
# Posted on: 17-Feb-2020 09:31:05   

Do you set the ParameterizedPrefetchPathThreshold property of the adapter to a high number? (as in > 2100) ?

Frans Bouma | Lead developer LLBLGen Pro
Torsten
User
Posts: 26
Joined: 13-Mar-2015
# Posted on: 17-Feb-2020 09:55:32   

No, we didn't changed this property, after initializing an new DataAccessAdapter it is set to 50.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39749
Joined: 17-Aug-2003
# Posted on: 17-Feb-2020 10:31:37   

Ah I think I see what's wrong, you have Take(10000), which always enforces a parameterized prefetch path as it's otherwise impossible to fetch it: it's not possible to filter the child rows on the parent rows otherwise: which parents are in set of parents for the children to consider is dictated by the 10000 parents you're fetching.

I'm not sure why you use 10000 as a limit, but I think it's better to check whether you have e.g. a PK that's incremented so you can use a predicate for that (e.g. PKfield < someValue) so you don't need 10000 parameters. Would that work?

Frans Bouma | Lead developer LLBLGen Pro
Torsten
User
Posts: 26
Joined: 13-Mar-2015
# Posted on: 17-Feb-2020 11:31:56   

In the table we Store Pk Values/Article which have to be calculated by a logic.

We fetch max 10000 Elements, because the logic should end in time.

The elements in that table are stored via GUI and than deleted via this backend service after calculation. So we can't use a predicate PKfield < someValue

Why this worked perfectly in v4.2 and now it isn't?

I could change the query like this:


var data = PVBerechnungPreislisteEntity.Linq().Take(10000)
                .WithPath(a => a.Prefetch<ArtikelEntity>(b => b.Artikel).SubPath(c => c.Prefetch<ArtikelQuelleEntity>(d => d.ArtikelQuellen).SubPath(q => q.Prefetch(b => b.Lieferant))
                    .Prefetch(b => b.ArtikelGruppe)
                    .Prefetch(b => b.Hersteller)
                    .Prefetch(b => b.ArtikelChannels)))
                .ToList().Select(a => a.Artikel).ToList();

But I think, this would be a performance-issue because it loads and initializes the PVBerechnungPreislisteEntity und then select the ArtikelEntity. But I don't need the PVBerechnungPreislisteEntity. Ok, this is not a huge Entity (5 Columns) but its not needed.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39749
Joined: 17-Aug-2003
# Posted on: 17-Feb-2020 16:53:23   

Torsten wrote:

In the table we Store Pk Values/Article which have to be calculated by a logic.

We fetch max 10000 Elements, because the logic should end in time.

The elements in that table are stored via GUI and than deleted via this backend service after calculation. So we can't use a predicate PKfield < someValue

Why this worked perfectly in v4.2 and now it isn't?

Likely because you had < 2100 rows in the parent set

I could change the query like this:


var data = PVBerechnungPreislisteEntity.Linq().Take(10000)
                .WithPath(a => a.Prefetch<ArtikelEntity>(b => b.Artikel).SubPath(c => c.Prefetch<ArtikelQuelleEntity>(d => d.ArtikelQuellen).SubPath(q => q.Prefetch(b => b.Lieferant))
                    .Prefetch(b => b.ArtikelGruppe)
                    .Prefetch(b => b.Hersteller)
                    .Prefetch(b => b.ArtikelChannels)))
                .ToList().Select(a => a.Artikel).ToList();

But I think, this would be a performance-issue because it loads and initializes the PVBerechnungPreislisteEntity und then select the ArtikelEntity. But I don't need the PVBerechnungPreislisteEntity. Ok, this is not a huge Entity (5 Columns) but its not needed.

The main problem persists: the parent query, PVBerechnungPreisListe instances can run over 2100 rows. When it fetches Artikel instances related to these 10000 rows it has to filter these on the 10000 PVBerechnungPreisListe instances it has fetched. There it goes wrong, as it can't do that in another way than by passing all PK values as an IN clause (a correlated subquery won't work as there's no predicate on PVBerechnungPreisListe so you specify that you want all of them).

The root of the problem is that you fetch all PVBerechnungPreisListe instances and then a graph of related elements.

So instead, page through the resultset. Specify as ParameterPrefetchPathThreshold e.g. 100 and as page size 50 (if you have a single field PK in PVBerechnungPreisListe you can specify 100 as page size) and use Take(100) (or e.g. 50) and Skip() to fetch the page you want (specify an order by too!). This should go ok. You can go overboard and specify a page size of 2000 if you want, but it's slower.

Frans Bouma | Lead developer LLBLGen Pro
Torsten
User
Posts: 26
Joined: 13-Mar-2015
# Posted on: 17-Feb-2020 17:39:34   

Likely because you had < 2100 rows in the parent set

In 4.2v there where often more than 2.100 elements and it didn't crash.

The main problem persists: the parent query, PVBerechnungPreisListe instances can run over 2100 rows. When it fetches Artikel instances related to these 10000 rows it has to filter these on the 10000 PVBerechnungPreisListe instances it has fetched.

I changed the code to my new query and it works like a want.

There it goes wrong, as it can't do that in another way than by passing all PK values as an IN clause (a correlated subquery won't work as there's no predicate on PVBerechnungPreisListe so you specify that you want all of them).

It is not a Subquery, it is a prefetch path and this is done via join, isn't it?

If you were right, it should never be possible, to load more than 2.100 elements with PrefetchPath at ones. But we do this very often...

Walaa avatar
Walaa
Support Team
Posts: 14983
Joined: 21-Aug-2005
# Posted on: 17-Feb-2020 21:13:57   

It will use a parameterized IN query. Needless joins involved.

As a best practice, I second Otis advice to do the logic in batches. Just do a loop (10 rounds), and fetch 1000 in each.

Anyway, if it's working with you, that is for me for now, but keep in mind the above advice, you might need it soon.

As for your question, why it used to work. If you want, you can provide a repro and we can check it out.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39749
Joined: 17-Aug-2003
# Posted on: 18-Feb-2020 09:31:04   

Torsten wrote:

Likely because you had < 2100 rows in the parent set

In 4.2v there where often more than 2.100 elements and it didn't crash.

So it passed > 2100 parent IDs in a single IN clause? I doubt it.

There it goes wrong, as it can't do that in another way than by passing all PK values as an IN clause (a correlated subquery won't work as there's no predicate on PVBerechnungPreisListe so you specify that you want all of them).

It is not a Subquery, it is a prefetch path and this is done via join, isn't it?

That's not what I wrote: the correlated subquery is used to filter the child rows based on the parent rows if the parameterized IN clause can't be used. So the 'artikel' entities are fetched in a query but which artikel instances to fetch is based on a filter based on the parent entities fetched in the root query. This can be done with an IN clause or with a correlated subquery. The former has a limit of 2100 parameters on sql server, the latter doesn't have a limit but can be slower.

If you were right, it should never be possible, to load more than 2.100 elements with PrefetchPath at ones. But we do this very often...

Show me.

10 to 1 it simply did: (using a correlated subquery) select ... from Artikel where ParentID in (select ID from PVBerechnungPreisliste)

as there's no predicate on PVBerechnungPreisliste, it simply fetches all rows.

v4.2 didn't have as much optimization in this area than v5.x and I have to check the changes since 4.2 (which is a period of 8 years or so!) but I'm fairly certain it didn't flip to a parameterized prefetch path (so an IN clause with values instead of the correlated subquery mentioned above) when a Take(n) was issued.

In any case, paging is what you should use in these situations, not fetching a massive amount of data at once as that's never going to perform well.

Frans Bouma | Lead developer LLBLGen Pro
Torsten
User
Posts: 26
Joined: 13-Mar-2015
# Posted on: 18-Feb-2020 11:11:38   

Ok, I checked it with SQLProfiler and this query:


 var data = PVBerechnungPreislisteEntity.Linq().Select(a => a.Artikel).Take(10000)
                .WithPath<ArtikelEntity>(a => a.Prefetch<ArtikelQuelleEntity>(b => b.ArtikelQuellen).SubPath(q => q.Prefetch(b => b.Lieferant))
                    .Prefetch(b => b.ArtikelGruppe)
                    .Prefetch(b => b.Hersteller)
                    .Prefetch(b => b.ArtikelChannels))
                .ToList();

I filled tblPVBerechnungPreislisten (PVBerechnungPreislisteEntity) with 18038 Entries

In v4.2 the query was executed successfully and fetched 10.000 ArtikelEntities (but it tooks a long time to finish)

In v5.6 it crashes with mentioned Exception.

I think, the problem ist the "WithPath"s, because intial, the first select in both versions is looking like that (simplified!)


exec sp_executesql N'SELECT DISTINCT TOP(@p2)  [LPA_L1].* FROM ( [DEV_CUPNEW_HEUTE].[dbo].[tblArtikel] [LPA_L1]  INNER JOIN [DEV_CUPNEW_HEUTE].[dbo].[tblPVBerechnungPreislisten] [LPA_L2]  ON  [LPA_L1].[ArtikelNr]=[LPA_L2].[ArtikelNr])',N'@p2 bigint',@p2=10000

But than (and maybe it was a bug) in v4.2, it is followed by some FullJoins, for example:


SELECT [DEV_CUPNEW_HEUTE].[dbo].[tblArtikelQuellen].*   WHERE ( [DEV_CUPNEW_HEUTE].[dbo].[tblArtikelQuellen].[ArtikelNr] IN (SELECT [LPA_L1].[ArtikelNr] FROM ( [DEV_CUPNEW_HEUTE].[dbo].[tblArtikel] [LPA_L1]  INNER JOIN [DEV_CUPNEW_HEUTE].[dbo].[tblPVBerechnungPreislisten] [LPA_L2]  ON  [LPA_L1].[ArtikelNr]=[LPA_L2].[ArtikelNr])))

So it seems, that it loads the full table with every 18038 entry.

Because the other query worked for me, I would say, the issue is solved wink

Torsten
User
Posts: 26
Joined: 13-Mar-2015
# Posted on: 25-Feb-2020 17:39:26   

I have marked that issue to early as solved. We are facing a similar problem with FetchEntityCollection and a OneToMany relation.

We have the following Relation:


return ModelInfoProviderSingleton.GetInstance().CreateRelation(RelationType.OneToMany, "RechnungEingangsrechnungenPositionen", true, new[] { RechnungEingangsrechnungFields.Id, RechnungEingangsrechnungenPositionFields.HeaderID }); 
        }

Because there are many "Rechnungen" in DB, somtimes it could be, that we have to load 5000 or more of this "Rechnungen" with there positions.

We are using this code to fetch the Entities:


var daa = new DataAccessAdapter(conStr, false, CatalogNameUsage.ForceName, CalatogName);

daa.ParameterisedPrefetchPathThreshold = 200;

da.FetchEntityCollection(entities, filterBucket, maxNumberOfItemsToReturn, sortClaues, prefetchPath);


Where entities is the Collection of RechnungEingangsrechnungEntity filterBucket contains "CustomerNr == 12345" maxNumberOfItemsToReturn = 10000 and PrefetchPath contains the PrefetchPath for the relation above.

As I mentioned before, we didn't change the code we are using in v4.2 und this fetch works fine there. But in v5.6 we get the error message:

The incoming request has too many parameters. The server supports a maximum of 2100 parameters

It seems for me, that the ParameterisedPrefetchPathThreshold isn't working correct an one to many.

I changed this value (tried 50, 200, 1000) without any success.

The error occurs only in RelationType.OneToMany. In RelationType.ManyToOne everything works fine.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39749
Joined: 17-Aug-2003
# Posted on: 25-Feb-2020 18:08:36   

I can reproduce the 'too much parameters' problem, but as I said before, that's due to the Take(high value). In 4.2 things were less optimized so it didn't take into account a lot of things.

Here, the .Take(large number) results in a parameterized prefetch path in all cases. At the level of where the take is handled and the prefetch path is handled it doesn't know if 10K parameters is possible or not (SQL Server has this limit, which they're going to remove soon btw, the rest of the ado.net providers don't). So it simply produces the query elements at this level and sends them off to be generated into SQL.

Like I said before: you have to use paging and use that.

On adventureworks, this fails:

var metaData = new LinqMetaData(adapter);
var q = metaData.SalesOrderDetail.Select(sod => sod.SalesOrderHeader).Take(10000)
                .WithPath<SalesOrderHeaderEntity>(a => a.Prefetch<SalesOrderHeaderSalesReasonEntity>(b => b.SalesOrderHeaderSalesReasonCollection));
var l = q.ToList();


var metaData = new LinqMetaData(adapter);
for(int pageNo = 1; pageNo <= 100; pageNo++)
{
    var q = metaData.SalesOrderDetail.Select(sod => sod.SalesOrderHeader)
                    .OrderBy(x=>x.SalesOrderId)
                    .TakePage(pageNo, 100)
                    .WithPath<SalesOrderHeaderEntity>(a => a.Prefetch<SalesOrderHeaderSalesReasonEntity>(b => b.SalesOrderHeaderSalesReasonCollection));
    var l = q.ToList();
}

This succeeds. It reads a page of 100 elements, one page at a time.

So, to fix this, rewrite the .Take(10000) part into a paging query.

Frans Bouma | Lead developer LLBLGen Pro
Torsten
User
Posts: 26
Joined: 13-Mar-2015
# Posted on: 25-Feb-2020 18:15:06   

I think you missunderstood!

The problem occurs also in PredicateExpression/FetchEntityCollection and not only in LinqMetaData!

Sure, we could use paging, but therefore, we have to change all occurencies of FetchEntityCollection.

Because it works in v4.2 it is a bug in v5.6, isn't it?

Walaa avatar
Walaa
Support Team
Posts: 14983
Joined: 21-Aug-2005
# Posted on: 25-Feb-2020 20:59:49   

Reproduced in v.5.6.1 in a 1:m relation the prefetchPath query always use the IN(parameters), no matter the number of parameters and ignoring the ParameterisedPrefetchPathThreshold value.

Edit(repro code on Northwind)

using (var adapter = new DataAccessAdapter())
{
    var orders = new EntityCollection<OrderEntity>();
    var prefetch = new PrefetchPath2(EntityType.OrderEntity);
    prefetch.Add(OrderEntity.PrefetchPathOrderDetails);
    adapter.ParameterisedPrefetchPathThreshold = 50;

    adapter.FetchEntityCollection(orders, null, 1000, null, prefetch);
}
Otis avatar
Otis
LLBLGen Pro Team
Posts: 39749
Joined: 17-Aug-2003
# Posted on: 26-Feb-2020 10:03:30   

Torsten wrote:

I think you missunderstood!

The problem occurs also in PredicateExpression/FetchEntityCollection and not only in LinqMetaData!

Sure, as linq is a wrapper around the low level API. The key issue is: it can't fulfill your query, as you ask it to fetch 10000 rows but also use prefetch paths. It's the same with paging: if you use paging and prefetch paths, the query can only be fetched with a parameterized prefetch path query.

A prefetch path node is fetched using a filter based on the parent row (so if you have 10 customers in the root query, it will fetch all orders based on the 10 customers, and not all orders, like in your query). In your case, there's no filter possible, so in 4.2 it fetched all rows for order no matter how many customers were fetched. This has serious complications for large databases so we changed that in that it always switches to parameterized queries, simply because you have to use that if you want to limit the root resultset and use prefetch paths anyway. In 4.2 and also in v5.x.

Sure, we could use paging, but therefore, we have to change all occurencies of FetchEntityCollection.

No, you only have to change the situation where you specify .Take(<large number>) or where you specify a large number for the amount to fetch and prefetch paths.

Because it works in v4.2 it is a bug in v5.6, isn't it?

No it's not a bug, the query you started with in this thread 'works' in 4.2 because it fetches all related entities on 4.2, ignoring the 10000 you specify. So if you have 50,000 ArtikelEntity rows and 1 million ArtikelQuelleEntity entities, it will not fetch the ArtikelQuelleEntity entities related to the first 10,000 rows, but will fetch all 1 million of them, as there's no filtering on the child row (ArtikelQuelleEntity) based on the parent row (ArtikelEntity). It will merge the right ones in memory but this will take serious time if you're unlucky as the set of related entities isn't filtered.

So 'works' here is pretty subjective: it returns data, sure, but it's not ok.

To illustrate this, see this example by Walaa:

Walaa wrote:

Reproduced in v.5.6.1 in a 1:m relation the prefetchPath query always use the IN(parameters), no matter the number of parameters and ignoring the ParameterisedPrefetchPathThreshold value.

using (var adapter = new DataAccessAdapter())
{
    var orders = new EntityCollection<OrderEntity>();
    var prefetch = new PrefetchPath2(EntityType.OrderEntity);
    prefetch.Add(OrderEntity.PrefetchPathOrderDetails);
    adapter.ParameterisedPrefetchPathThreshold = 50;
    adapter.FetchEntityCollection(orders, null, 1000, null, prefetch);
}

On v4.2, it would fetch 1000 customers and all orders and merge them together. There's no other way, as there's no filter on the customers and it wouldn't use a parameterized prefetch path for orders. If there are 10 million order rows in the database this would take a while.

So we changed this, as there's no other way to do it anyway, to always use a parameterized prefetch path.

Frans Bouma | Lead developer LLBLGen Pro
Torsten
User
Posts: 26
Joined: 13-Mar-2015
# Posted on: 26-Feb-2020 10:10:35   

Could you please have a look into my last query and on the code Walaa posted?

I think, you are answering to the wrong problem.

Sure, your responnse for the query with Take(10000) I understood (also it works in v4.2).

But I am now on a complete different query and it is the same problem.

Like Walaa wrote, it ist reproducable for him and me. And there is no take or whatever.

I can't change every code line, where we fetch a large number of entities with a OneToMany prefetchpath.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39749
Joined: 17-Aug-2003
# Posted on: 26-Feb-2020 10:29:28   

Torsten wrote:

Could you please have a look into my last query and on the code Walaa posted? I think, you are answering to the wrong problem.

no simple_smile

Sure, your responnse for the query with Take(10000) I understood (also it works in v4.2). But I am now on a complete different query and it is the same problem. Like Walaa wrote, it ist reproducable for him and me. And there is no take or whatever.

You specify a number of items to fetch, which is the same as .Take(number). The .Take(number) ends up being the number of items to fetch (as that's what it represents). So it doesn't matter how you specify it, the query engine gets a directive to fetch a fixed number of items and fetch a prefetch path: the low level api with FetchEntityCollection or with linq (or queryspec for that matter) all end up at the same spot.

I can't change every code line, where we fetch a large number of entities with a OneToMany prefetchpath.

if the number is < 2100 you're fine.

If the number is >= 2100, you really have to think about what's going on as you fetch so much data the application isn't going to perform well anyway.

I also don't know how to fix this as there's no fix: the way 4.2 works is wrong as it fetches potentially all rows from related tables and that's harmful for other people's applications so rolling back to that isn't going to happen. There's also no other way to do this than how it's done now, using a parameterized prefetch path as I tried to explain a couple of times now.

I really don't know why you specify a large number of items to fetch. If you specify '0' it will fetch all root entities, and it won't run into this problem:

[Test]
public void FetchEverythingWithPPath()
{
    using (var adapter = new DataAccessAdapter())
    {
        var orders = new EntityCollection<SalesOrderHeaderEntity>();
        var prefetch = new PrefetchPath2(EntityType.SalesOrderHeaderEntity);
        prefetch.Add(SalesOrderHeaderEntity.PrefetchPathSalesOrderDetailCollection);
        adapter.ParameterisedPrefetchPathThreshold = 50;
        adapter.FetchEntityCollection(orders, null, 0, null, prefetch);
    }
}

As there's no 'paging' going on (specifying a limit is effectively paging, hence it uses the same logic) it will fetch all root rows and with the filter of that (being: no filter) it will fetch the related entities.

But if I specify 5000 as a limit, it will run into the problem and that's expected as I'm fetching a limited set. On 4.2, this would then fetch the 5000 salesorderheader entities, and all 121317 salesorderdetail rows as there's no filter on them. It would then merge the right salesorderdetail rows from that 121317 rows in the 5000, ignoring the rest.

An insane waste of performance, and which would go worse over time when salesorderdetail grows and grows. Hence our change.

I know you're on a tight schedule as you said, but please, do take this opportunity to at least change the way you fetch these large sets as it's not ok. It might 'work' now in v4.2 but as the data grows things will go slower and slower and that's not ok.

If the original programmers have specified a large set because they simply want to fetch 'all' rows, just specify 0, it will fetch all rows and things will work fine.

(btw, this change was documented as a new feature in v5.4: https://www.llblgen.com/Documentation/5.6/LLBLGen%20Pro%20RTF/whatsnew.htm#new-functionality--changes-2 )

Frans Bouma | Lead developer LLBLGen Pro