Prefetch Path and projection

Posts   
 
    
Fab
User
Posts: 108
Joined: 20-Oct-2008
# Posted on: 12-Mar-2023 15:13:38   

Hello,

Currently we mainly use "old" llblgen style (with IRelationPredicate etc.) and I'm trying to change to QuerySpec. We use a kind of repository pattern where constructed query are centralized. So I'm trying to expose EntityQuery and allow calling code to make projection on custom model.

I ended up with that example in the repository :

var qf = new QueryFactory();
var q = qf.DfContentVersion
    .From(QueryTarget.InnerJoin(DfContentVersionEntity.Relations.DfContentEntityUsingCtId))
    .From(QueryTarget.LeftJoin(qf.DfAuthorisationGroup).On(DfContentFields.ArgId == VwAuthorisationFields.ArgId))
    .Where(DfContentVersionFields.CtId == id)
    .OrderBy(DfContentVersionFields.CvVersion | SortOperator.Descending)
    .WithPath(DfContentVersionEntity.PrefetchPathDfContentStructuresCollection);
return q;

As you can see, there is a prefetch path to the ContentStructure table.

Now the class that consume the repository:

var q = this.repository.GetContent(id).Select((p => new ContentWithStructData()
{
    Id = DfContentFields.CtId.ToValue<Guid>(),
    // ...
    Structure = ???
});
var data = this.adapter.FetchFirst(q);

I didn't find a way to select the data ("Structure") from the prefetch path.

I found in the documentation a way to do it using CorrelatedOver, but that require the QueryFactory which isn't availble in the calling code. If I correctly understood the documentation it'll be something like this:

Structure = qf.DfContentStructure.CorrelatedOver(DfContentVersionEntity.PrefetchPathDfContentStructuresCollection)
                              .Select( /* ... */ )

But that means I would need to create a new QueryFactory (qf) in the calling code, which is amho not correct as all query construction should be in the repository. And also in that case the "WithPath" in the repository is useless ?

Could you tell we what's the best way to achieve that ?

Thank you simple_smile

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39760
Joined: 17-Aug-2003
# Posted on: 13-Mar-2023 09:07:09   

Prefetch paths are fetched after the parent query is fetched, so in your prefetch path query, the DfContentVersion is fetched first, and then the DfContentStructures are fetched and merged into the DfContentVersion set.

What you want is formulate a projection on a set spanning multiple entities and return a single object with the data projected into that single object, correct? This means that the query you're defining is run once and in the database, so isn't going to project the related set of the prefetch path, as that's fetched using a different system altogether.

If you want to fetch the entities first and then project them into an in-memory object, you first have to execute the entity query (which will then fetch the prefetch path query as a separate query, merge the results etc.) and then project that in memory (but queryspec isn't meant for that). This is highly inefficient.

If you want to have a single query where there's 1 set coming back from the database and projected into multiple objects, then you have to define the set in the queryspec projection with a queryfactory. The correlatedover is to define a where clause to tie the nested query to the outer query.

If all query construction should be in the repository, then your projection query (which defines a query, as the entity query you define the projection on hasn't run!) also should be in the repository, right? simple_smile Btw, creating a QueryFactory costs nothing, it has no overhead.

Not sure which version you're on, but our designer offers 'DerivedModels' which generates these projection queries for you.

Frans Bouma | Lead developer LLBLGen Pro
Fab
User
Posts: 108
Joined: 20-Oct-2008
# Posted on: 13-Mar-2023 09:37:36   

Seems support is still as quick as in the past years wink

I'm still in a very old version, so old i dare not to say smile

Currently the fetch is in the repository, but that means we always fetch the whole entities. Then the calling code convert these entities to DTO and return that from the APIs. We also have old legacy code that convert dto to other dto etc. so not very efficient. I'm trying to make order in all that code and to define a new design and best practise on how to query database.

My objectives are :

  • Avoid creating useless entities: when querying database we don't need intermediate objects and could directly fetch to the DTO/Model returned by the APIs (only about reading database here, not updating)

  • Avoid fetching all fields. This currently could be done using an IncludeExcludFields but would require to passe it to all repository methods, but the main problem is that after that you have an entity and it's not clear which fields have been fetched or not.

  • Centralize query construction because we have many security to apply on it (multi tenant, authorizations etc) and it shouldn't be let it to each developper that need to fech

Maybe I'm wrong but all of this bring me to the QuerySpec queries and exposing EntityQuery from the repos, and let the calling code do the projections (so that it fetch only what's needed) and the actual fetch itself (that not really a repository pattern anymore I know wink )

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39760
Joined: 17-Aug-2003
# Posted on: 13-Mar-2023 14:39:08   

simple_smile

I think what you want isn't a prefetch path query but a joined set which is projected to an object. However your DfContentStructure entity is related over a 1:n relationship I think which will either always result in duplicates of DfContentVersion entities in the joined set, or you need to specify it as a nested set. There's no other way.

So basically:

var qf = new QueryFactory();
var q = qf.DfContentVersion
    .From(QueryTarget.InnerJoin(DfContentVersionEntity.Relations.DfContentEntityUsingCtId))
    .From(QueryTarget.LeftJoin(qf.DfAuthorisationGroup).On(DfContentFields.ArgId == VwAuthorisationFields.ArgId))
    .From(QueryTarget.InnerJoin(qf.DfContentStructure).On(... )
    .Where(DfContentVersionFields.CtId == id)
    .OrderBy(DfContentVersionFields.CvVersion | SortOperator.Descending)
return q;

(I don't know the relationship so left it empty!)

This will return a query which forms a joined set. You can now define a projection on this query by appending a .Select(...) to load fields from all entities in the joined set. but as I said, you will run into duplicates here for the contentversion so it's not really ideal. The queryspec query system will take care of this for you for a nested set.

If you don't want to define the query outside the repository, then I think you can only define it somewhere inside the repository. You have to define it somewhere...

How did you define this query using the Low level API?

Frans Bouma | Lead developer LLBLGen Pro
Fab
User
Posts: 108
Joined: 20-Oct-2008
# Posted on: 14-Mar-2023 08:00:03   

This specific query doesn't exist in low level api as it's a new one. But other of the same kind exist: everything is done in the repository: add predicate, sort, prefetch and then adater fetch. So the entity (with prefetched data) is returned by the repository, and the calling code just need to convert to DTO. That's what I explained in my previsous post as "useless entities" because we don't need the entities, we need the DTO.

So using the QuerySpec the query must still be define in the repository, but the projection to DTO and the fetch should be done by the caller. But I don't find a way on how to use a prefetch path previously defined in the query (using WithPath) in the projection... that the central point of my post.

What do you means by a nested set ? I suppose you mean when querying in the select clause ? like this:

Structure = qf.DfContentStructure.CorrelatedOver(DfContentVersionEntity.PrefetchPathDfContentStructuresCollection)
                              .Select( /* ... */ )

I think it's the only way to do it, but in this way there is not a clear separation between query construction and query projection, because in the projection you still need the QueryFactory and you continue to construct the query by defining the nested set.

Thats also strange that you can define "WithPath" in your query, but don't have a way to use the data that will be fetched in the projection.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39760
Joined: 17-Aug-2003
# Posted on: 14-Mar-2023 09:07:36   

Fab wrote:

So using the QuerySpec the query must still be define in the repository, but the projection to DTO and the fetch should be done by the caller. But I don't find a way on how to use a prefetch path previously defined in the query (using WithPath) in the projection... that the central point of my post.

As I tried to explain, prefetch paths are for entity fetches and not for projections. If you define an entity query to fetch customers and you then define a prefetch path for orders with that query, the engine will first fetch the customers and then fetch the orders (so 2 queries) and merge the results in-memory, but the prefetch path is fetched after the root entities (in my example customers) are fetched.

This is for entity queries only. You want a projection of a set, so prefetch paths isn't something you can use here, unless you want to first fetch the entity graph (in my example customers + their orders) and then project that graph to a set of objects. You want to avoid that as you said, so I suggested you use a typed projection with a nested set. Nested sets/queries are queries inside the projection of another query. See https://www.llblgen.com/Documentation/5.10/LLBLGen%20Pro%20RTF/Using%20the%20generated%20code/QuerySpec/gencode_queryspec_projections.htm#nested-queries

The advantage of this is that you fetch the projection and the nested queries/sets without first fetching the entities they're projected from.

So please, forget prefetch paths, they're for fetching entity graphs (so sets of entities with their related entities) and that's not what you're doing.

What do you means by a nested set ? I suppose you mean when querying in the select clause ? like this:

Structure = qf.DfContentStructure.CorrelatedOver(DfContentVersionEntity.PrefetchPathDfContentStructuresCollection)
                              .Select( /* ... */ )

I think it's the only way to do it, but in this way there is not a clear separation between query construction and query projection, because in the projection you still need the QueryFactory and you continue to construct the query by defining the nested set.

Yes because there's just one query. You're stuck on the fact you 'defined the entity query' and now want to define a projection of that, but that's missing the point: the engine isn't going to transform your prefetch path query (which defines a graph of entities) into a query which projects that set of entities onto something else. Prefetch paths are a different concept and not used here.

If the projections have to be defined outside the repository, then first determine what these projections are actually projecting. If that's a graph of entities, then you have to fetch them first. If it's an entity query definition, then your projection will be the 'SELECT ... ' part and the entity query the 'FROM ... WHERE ..' part. There aren't any prefetch paths in play here as these are fetched after entities are fetched, but as you aren't fetching entities (but a projection) it's not going to work.

I honestly find it a bit strange you require query definitions to be defined inside a repository but you also want to define parts of these queries outside this repository. I fear working around that limitation falls outside the scope of our system, right? There are a couple of ways to do this though: e.g. why not wrap the entity query construction as methods in the repository. This way you don't need to expose the queryfactory instance, and have full control over the additional elements you add to the entity query that's used in the final projection. So your code above becomes:

Structure = myRespository.CreateEntityQuery<DfContentStructure>().CorrelatedOver(... predicate...)
                              .Select( /* ... */ )

where CreateEntityQuery<T>() looks something like:

public EntityQuery<T> CreateEntityQuery<T>()
{
    var toReturn = new QueryFactory().Create<T>();
    // append stuff you need to use to toReturn
    return toReturn;
}

... or share a queryfactory during the query construction process, that's up to you.

Frans Bouma | Lead developer LLBLGen Pro
Fab
User
Posts: 108
Joined: 20-Oct-2008
# Posted on: 17-Mar-2023 08:40:28   

(I forgot to post after preview smile )

I honestly find it a bit strange you require query definitions to be defined inside a repository but you also want to define parts of these queries outside this repository

Because I want to separate the "core" query part (= join & where) which must contains crucial security part, and the projection which depend on the exact usage and needs of the caller. Currently both are fully in the repository, w I may be too extremist in separation of concern smile

But you last proposition is indeed interesting. Is it a problem is the query is build using several different QueryFactory ? I suppose NO as what you proposed will be like this.

Again thank you for your explanations

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39760
Joined: 17-Aug-2003
# Posted on: 17-Mar-2023 09:44:39   

Fab wrote:

(I forgot to post after preview smile )

I honestly find it a bit strange you require query definitions to be defined inside a repository but you also want to define parts of these queries outside this repository

Because I want to separate the "core" query part (= join & where) which must contains crucial security part, and the projection which depend on the exact usage and needs of the caller. Currently both are fully in the repository, w I may be too extremist in separation of concern smile

But you last proposition is indeed interesting. Is it a problem is the query is build using several different QueryFactory ? I suppose NO as what you proposed will be like this.

Again thank you for your explanations

It's not a problem in normal queries without inheritance. The QueryFactory contains a system to produce unique aliases for query fragments that aren't aliased but should be aliased. It can only do so if it's reused over the entire query (otherwise it'll produce the same aliases again). So it's likely not a problem in most queries, but might be in some, tho you'll run into it right away with alias errors.

Frans Bouma | Lead developer LLBLGen Pro