Prefetch path and relations vs. number of queries made to the database

Wally
User
Posts: 285
Joined: 29-Aug-2007
# Posted on: 26-Mar-2008 14:25:54   

This is our code:

        HQ.Mile.Data.CollectionClasses.CombinaisonProbableCollection combinaisonProbableCollection = new CombinaisonProbableCollection();
        PrefetchPath prefetchPath = new PrefetchPath(EntityType.CombinaisonProbableEntity);
        prefetchPath.Add(HQ.Mile.Data.EntityClasses.CombinaisonProbableEntity.PrefetchPathMesureCollectionViaCombinaisonProbableMesure);
        prefetchPath.Add(HQ.Mile.Data.EntityClasses.CombinaisonProbableEntity.PrefetchPathTopologie);
        combinaisonProbableCollection.GetMulti(null, prefetchPath);

This is the dump:

Method Enter: DaoBase.PerformGetMultiAction
Method Enter: CreatePagingSelectDQ
Method Enter: CreateSelectDQ
Method Enter: CreateSelectDQ
Generated Sql query:
Query: SELECT "MILE"."COMBINAISON_PROBABLE"."COMBINAISON_PROBABLE_ID" AS "CombinaisonProbableId", "MILE"."COMBINAISON_PROBABLE"."TEMPS_DEBUT" AS "TempsDebut", "MILE"."COMBINAISON_PROBABLE"."TEMPS_FIN" AS "TempsFin", "MILE"."COMBINAISON_PROBABLE"."TOPOLOGIE_ID" AS "TopologieId", "MILE"."COMBINAISON_PROBABLE"."TEMPS_INSCRIPTION" AS "TempsInscription" FROM "MILE"."COMBINAISON_PROBABLE"

Method Exit: CreateSelectDQ
Method Exit: CreatePagingSelectDQ: no paging.
Method Enter: DaoBase.ExecuteMultiRowRetrievalQuery
Method Exit: DaoBase.ExecuteMultiRowRetrievalQuery
Method Enter: DaoBase.FetchPrefetchPath
Method Enter: DaoBase.PerformGetMultiActionInternal
Method Enter: CreatePagingSelectDQ
Method Enter: CreateSelectDQ
Method Enter: CreateSelectDQ
Method Enter: CreateSubQuery
Method Enter: CreateSelectDQ
Method Enter: CreateSelectDQ
Generated Sql query:
Query: SELECT "MILE"."COMBINAISON_PROBABLE"."COMBINAISON_PROBABLE_ID" AS "CombinaisonProbableId" FROM "MILE"."COMBINAISON_PROBABLE"

Method Exit: CreateSelectDQ
Method Exit: CreateSubQuery
Generated Sql query:
Query: SELECT DISTINCT "MILE"."MESURE"."MESURE_ID" AS "MesureId", "MILE"."MESURE"."POINT_MESURE_ID" AS "PointMesureId", "MILE"."MESURE"."TEMPS_DEBUT" AS "TempsDebut", "MILE"."MESURE"."TEMPS_DEBUT_FS" AS "TempsDebutFs", "MILE"."MESURE"."TEMPS_DEBUT_ENVELOPPE" AS "TempsDebutEnveloppe", "MILE"."MESURE"."TEMPS_FIN_ENVELOPPE" AS "TempsFinEnveloppe", "MILE"."MESURE"."STATUS_MESURE_ID" AS "StatusMesureId", "MILE"."MESURE"."APPAREIL_MESURE_ID" AS "AppareilMesureId", "MILE"."MESURE"."DATE_CREATION" AS "DateCreation" FROM (( "MILE"."COMBINAISON_PROBABLE" "LPA_C1" INNER JOIN "MILE"."COMBINAISON_PROBABLE_MESURE" ON "LPA_C1"."COMBINAISON_PROBABLE_ID"="MILE"."COMBINAISON_PROBABLE_MESURE"."COMBINAISON_PROBABLE_ID") INNER JOIN "MILE"."MESURE" ON "MILE"."MESURE"."MESURE_ID"="MILE"."COMBINAISON_PROBABLE_MESURE"."MESURE_ID") WHERE ( ( "MILE"."COMBINAISON_PROBABLE_MESURE"."COMBINAISON_PROBABLE_ID" IN (SELECT "MILE"."COMBINAISON_PROBABLE"."COMBINAISON_PROBABLE_ID" AS "CombinaisonProbableId" FROM "MILE"."COMBINAISON_PROBABLE")))

Method Exit: CreateSelectDQ
Method Exit: CreatePagingSelectDQ: no paging.
Method Enter: DaoBase.ExecuteMultiRowRetrievalQuery
Method Exit: DaoBase.ExecuteMultiRowRetrievalQuery
Method Exit: DaoBase.PerformGetMultiActionInternal
Method Enter: DaoBase.MergeManyToMany
Method Enter: DaoBase.GetMultiAsDataTableMergeManyToMany
Method Enter: CreatePagingSelectDQ
Method Enter: CreateSelectDQ
Method Enter: CreateSelectDQ
Method Enter: CreateSubQuery
Method Enter: CreateSelectDQ
Method Enter: CreateSelectDQ
Generated Sql query:
Query: SELECT "MILE"."COMBINAISON_PROBABLE"."COMBINAISON_PROBABLE_ID" AS "CombinaisonProbableId" FROM "MILE"."COMBINAISON_PROBABLE"

Method Exit: CreateSelectDQ
Method Exit: CreateSubQuery
Generated Sql query:
Query: SELECT DISTINCT "LPA_C1"."COMBINAISON_PROBABLE_ID" AS "CombinaisonProbableId0", "MILE"."MESURE"."MESURE_ID" AS "MesureId1" FROM (( "MILE"."COMBINAISON_PROBABLE" "LPA_C1" INNER JOIN "MILE"."COMBINAISON_PROBABLE_MESURE" ON "LPA_C1"."COMBINAISON_PROBABLE_ID"="MILE"."COMBINAISON_PROBABLE_MESURE"."COMBINAISON_PROBABLE_ID") INNER JOIN "MILE"."MESURE" ON "MILE"."MESURE"."MESURE_ID"="MILE"."COMBINAISON_PROBABLE_MESURE"."MESURE_ID") WHERE ( ( ( "MILE"."COMBINAISON_PROBABLE_MESURE"."COMBINAISON_PROBABLE_ID" IN (SELECT "MILE"."COMBINAISON_PROBABLE"."COMBINAISON_PROBABLE_ID" AS "CombinaisonProbableId" FROM "MILE"."COMBINAISON_PROBABLE"))))

Method Exit: CreateSelectDQ
Method Exit: CreatePagingSelectDQ: no paging.
Method Enter: DaoBase.ExecuteMultiRowDataTableRetrievalQuery(4)
Method Exit: DaoBase.ExecuteMultiRowDataTableRetrievalQuery(4)
Method Exit: DaoBase.GetMultiAsDataTableMergeManyToMany
Method Exit: DaoBase.MergeManyToMany
Method Enter: DaoBase.PerformGetMultiActionInternal
Method Enter: CreatePagingSelectDQ
Method Enter: CreateSelectDQ
Method Enter: CreateSelectDQ
Generated Sql query:
Query: SELECT "MILE"."TOPOLOGIE"."TOPOLOGIE_ID" AS "TopologieId", "MILE"."TOPOLOGIE"."ADRESSE_ELECTRIQUE_DEPART" AS "AdresseElectriqueDepart", "MILE"."TOPOLOGIE"."DATE_CREATION" AS "DateCreation", "MILE"."TOPOLOGIE"."DESCRIPTION" AS "Description", "MILE"."TOPOLOGIE"."CHECKSUM" AS "Checksum", "MILE"."TOPOLOGIE"."LIGNE_ID" AS "LigneId" FROM "MILE"."TOPOLOGIE" WHERE ( ( "MILE"."TOPOLOGIE"."TOPOLOGIE_ID" IN (:TopologieId1, :TopologieId2, :TopologieId3, :TopologieId4, :TopologieId5, :TopologieId6, :TopologieId7)))
Parameter: :TopologieId1 : Decimal. Length: 0. Precision: 0. Scale: 0. Direction: Input. Value: 242.
Parameter: :TopologieId2 : Decimal. Length: 0. Precision: 0. Scale: 0. Direction: Input. Value: 240.
Parameter: :TopologieId3 : Decimal. Length: 0. Precision: 0. Scale: 0. Direction: Input. Value: 245.
Parameter: :TopologieId4 : Decimal. Length: 0. Precision: 0. Scale: 0. Direction: Input. Value: 239.
Parameter: :TopologieId5 : Decimal. Length: 0. Precision: 0. Scale: 0. Direction: Input. Value: 243.
Parameter: :TopologieId6 : Decimal. Length: 0. Precision: 0. Scale: 0. Direction: Input. Value: 244.
Parameter: :TopologieId7 : Decimal. Length: 0. Precision: 0. Scale: 0. Direction: Input. Value: 241.

Method Exit: CreateSelectDQ
Method Exit: CreatePagingSelectDQ: no paging.
Method Enter: DaoBase.ExecuteMultiRowRetrievalQuery
Method Exit: DaoBase.ExecuteMultiRowRetrievalQuery
Method Exit: DaoBase.PerformGetMultiActionInternal
Method Enter: DaoBase.MergeNormal
Method Exit: DaoBase.MergeNormal
Method Exit: DaoBase.FetchPrefetchPath
Method Exit: DaoBase.PerformGetMultiAction

I expected 1 query, but it produced 6. I'm a bit worried about performance and about the way this is processed. I'm also worried by the use of "IN" in the final SELECT: what would that look like if the result has thousands of records? Why isn't it a simple query with an inner join?

Walaa
Support Team
Posts: 14993
Joined: 21-Aug-2005
# Posted on: 26-Mar-2008 14:36:19   
        HQ.Mile.Data.CollectionClasses.CombinaisonProbableCollection combinaisonProbableCollection = new CombinaisonProbableCollection();
        PrefetchPath prefetchPath = new PrefetchPath(EntityType.CombinaisonProbableEntity);
        prefetchPath.Add(HQ.Mile.Data.EntityClasses.CombinaisonProbableEntity.PrefetchPathMesureCollectionViaCombinaisonProbableMesure);
        prefetchPath.Add(HQ.Mile.Data.EntityClasses.CombinaisonProbableEntity.PrefetchPathTopologie);
        combinaisonProbableCollection.GetMulti(null, prefetchPath);

For the above code, only 3 queries will be executed (the dump shows each query twice: once when it is prepared and once when it is executed). Please check my answer in your other thread: http://www.llblgen.com/TinyForum/Messages.aspx?ThreadID=12911

Also, I'm worried by the use of "IN" in the final SELECT: what would that look like if the result has thousands of records? Why isn't it a simple query with an inner join?

Please read "Optimizing Prefetch Paths" in the manual's section "Using the generated code -> SelfServicing -> Prefetch Paths".
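
Among other things, that section discusses DaoBase.ParameterisedPrefetchPathThreshold, which controls when a prefetch path node filters the child fetch with a parameterised IN (value list) versus an IN (subquery). A minimal sketch of tuning it before the fetch (assuming SelfServicing generated code; the value 200 is illustrative only):

        // Below the threshold, a prefetch path node filters the child fetch with a
        // parameterised IN (:p1, :p2, ...) list of parent key values; above it, the
        // parent query is re-used as an IN (subquery) filter. 200 is an example value.
        DaoBase.ParameterisedPrefetchPathThreshold = 200;

        CombinaisonProbableCollection combinaisonProbableCollection = new CombinaisonProbableCollection();
        PrefetchPath prefetchPath = new PrefetchPath(EntityType.CombinaisonProbableEntity);
        prefetchPath.Add(CombinaisonProbableEntity.PrefetchPathMesureCollectionViaCombinaisonProbableMesure);
        prefetchPath.Add(CombinaisonProbableEntity.PrefetchPathTopologie);
        combinaisonProbableCollection.GetMulti(null, prefetchPath);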

Wally
User
Posts: 285
Joined: 29-Aug-2007
# Posted on: 26-Mar-2008 14:56:04   

As far as I know, 2 queries with an "IN" clause are always worse than 1 query with an inner join. Why use 2 queries? ... I could be wrong, but I would really like to hear your view on that.

Just to verify that I understand correctly... The current design will do a first query, and the second will be built like this:

if (results.count < DaoBase.ParameterisedPrefetchPathThreshold)
{
    do "IN" with the enumerated key values of the prefetch path from the previous query
}
else
{
    do "IN" with the exact same query as the one used in the first query (only returning the key value) **
}

Do I understand correctly?

** Also, for multiple key fields in a relation (prefetch path), do you have to use a concatenation of the key fields as strings?

Otis
LLBLGen Pro Team
Posts: 39862
Joined: 17-Aug-2003
# Posted on: 26-Mar-2008 16:33:37   

Switch on tracing for ORMPersistence, not for the DQE, to see the REAL queries executed. With prefetch paths, subqueries are generated, and the DQE simply traces these as separate queries in the output trace log, so you will see more queries than are actually executed.
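
In the application's .config file that's something along these lines (switch names and levels as I recall them from the documentation, please verify; 4 = verbose, 0 = off):

        <system.diagnostics>
          <switches>
            <!-- Persistence execution tracing: logs the queries actually executed -->
            <add name="ORMPersistenceExecution" value="4" />
            <!-- DQE tracing: logs every query the DQE creates, including subqueries -->
            <add name="OracleDQE" value="0" />
          </switches>
        </system.diagnostics>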

With m:n prefetch path nodes: 2 queries per node are executed (one for the related entities, one for the pk-pk relations).

With normal nodes: 1 query per prefetch path node is executed.

Frans Bouma | Lead developer LLBLGen Pro
Wally
User
Posts: 285
Joined: 29-Aug-2007
# Posted on: 26-Mar-2008 16:51:59   

It is clearer now.

Thanks for the info on debug tracing.

Is my little pseudocode for deciding which kind of "IN" clause is used correct (previous post)?

Why not just do one query for everything, with joins, for the prefetch path? Shouldn't that be a lot quicker than using an "IN" clause in most (if not all) cases?

Otis
LLBLGen Pro Team
Posts: 39862
Joined: 17-Aug-2003
# Posted on: 26-Mar-2008 17:04:53   

No, joins are often slower, because they increase the number of rows returned in, for example, 1:n relations (think customers + all their orders: each customer's columns are repeated once for every one of its orders), and they also increase the burden on the object fetcher, as the position of an entity's columns in the resultset differs per query. The Entity Framework uses this approach, and their docs state that some prefetch paths can't be done due to the high number of joins.

Joins are also a problem if there are multiple branches in the prefetch path.

There is still an optimization to do though simple_smile -> multi-field PKs currently result in AND/OR filters, but this could also be done with an EXISTS query, which is slightly more efficient (the LINQ provider with nested queries already uses this).

Frans Bouma | Lead developer LLBLGen Pro
Wally
User
Posts: 285
Joined: 29-Aug-2007
# Posted on: 26-Mar-2008 17:21:44   

Is that little pseudocode correct?:

results = do first query

if (results.count < DaoBase.ParameterisedPrefetchPathThreshold)
{
    do "IN" with the enumerated key values from the previous query
}
else
{
    do "IN" with the exact same query as the one used in the first query (only returning the key value)
}

Does the Entity Framework (Microsoft) use joins?

I don't understand the burden on the fetcher... Does it come from the fact that you have lots of duplicates (the "1" side of a 1:n relation) and you have to look for them before adding them, to make sure no duplicates are kept?

About consistency: is there a risk of inconsistency if there is more than one query? Are both (or 3 for m:n) queries encapsulated in a transaction?

Otis
LLBLGen Pro Team
Posts: 39862
Joined: 17-Aug-2003
# Posted on: 27-Mar-2008 11:22:35   

Wally wrote:

Is that little pseudocode correct?:

results = do first query

if (results.count < DaoBase.ParameterisedPrefetchPathThreshold)
{
    do "IN" with the enumerated key values from the previous query
}
else
{
    do "IN" with the exact same query as the one used in the first query (only returning the key value)
}

Somewhat like that, with some enhancements in both branches.
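
To make it concrete, a small self-contained sketch (hypothetical code, NOT the actual engine source; the method, parameter and column names are made up). The two branches produce filters shaped like the TOPOLOGIE query and the MESURE query in your dump, respectively:

        // Hypothetical illustration only; requires using System.Collections.Generic;
        static string BuildChildFilterSketch(List<string> parentKeyValues, string parentKeyQuery, int threshold)
        {
            if (parentKeyValues.Count < threshold)
            {
                // Few parents: parameterised value list, like the TOPOLOGIE fetch in the dump:
                //   WHERE CHILD.FK IN (:p1, :p2, ..., :pN)
                return "CHILD.FK IN (" + string.Join(", ", parentKeyValues.ToArray()) + ")";
            }
            // Many parents: re-use the parent query as a subquery that returns only the key,
            // like the MESURE fetch in the dump:
            //   WHERE CHILD.FK IN (SELECT PARENT.PK FROM PARENT ...)
            return "CHILD.FK IN (" + parentKeyQuery + ")";
        }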

Does the Entity Framework (Microsoft) use joins?

Yes.

I don't understand the burden on the fetcher... Does it come from the fact that you have lots of duplicates (the "1" side of a 1:n relation) and you have to look for them before adding them, to make sure no duplicates are kept?

First of all, if you're fetching customers, orders, order details and products, you have a very, very wide resultset. In THAT query, the product fields are at the end, so the object fetcher has to take the complete resultset into account.

With our prefetch paths, it's built on top of the existing code, so the object fetcher doesn't know whether the fetched set of entities is part of a prefetch path, and that's a good thing, as it doesn't need that information at that level.

The joins become a big problem in multi-branch trees, as they result in lots of NULL columns in some situations and lots of duplicate data in others.

About consistency: is there a risk of inconsistency if there is more than one query? Are both (or 3 for m:n) queries encapsulated in a transaction?

A complete fetch is performed in the transaction in progress, if any. If there's no transaction in progress, no transaction is started (transactions aren't needed for fetches!).

Frans Bouma | Lead developer LLBLGen Pro
Wally
User
Posts: 285
Joined: 29-Aug-2007
# Posted on: 27-Mar-2008 15:33:15   

Thanks for all the answers.

I don't agree that a transaction isn't needed when retrieving prefetch data.

If you have "order" and "orderDetails" plus a calculated field in "order" that depends on the data in orderDetails (bad design but something needed for performance). Example (order.hasOrderDetailsCorrections).

Example:

(Step 1) A user fetches an order with a prefetch path on orderDetails. LLBLGen gets the order.

At the same time, another user adds corrections (new orderDetails + order.hasOrderDetailsCorrections going from false to true) in a transaction.

LLBLGen gets the orderDetails (continuing from step 1).

You will then have in memory: order.hasOrderDetailsCorrections set to false, but with corrections in its collection of orderDetails... Inconsistency!

Otis
LLBLGen Pro Team
Posts: 39862
Joined: 17-Aug-2003
# Posted on: 28-Mar-2008 11:00:15   

Though that's unavoidable. What if the second user writes the orderDetails corrections right AFTER the prefetch path completed? You then have in memory 'there are no corrections', but at that moment there are!

This is what's called 'stale data'. As soon as data is pulled out of the db, the data outside the db is 'stale'. The longer you wait before you refetch, the more chance you have that the data outside the db is inconsistent with 'reality', the data inside the db.

That's also why concurrency schemes exist. The only way to avoid this is to lock all rows read and keep them locked till you're done. That's of course undoable, as it would kill performance and could actually render the application completely unusable.

With a transaction you can't avoid this either. A transaction with a read adds read locks on SQL Server, but on Oracle, for example, you're not blocking anyone. So it's not really something you can fix yourself, other than with a better design (namely properties relying on the data fetched instead of the data in the db; as all fetched data is 'stale', a property on the stale data is always correct in the context of the data it works on).

Frans Bouma | Lead developer LLBLGen Pro
Wally
User
Posts: 285
Joined: 29-Aug-2007
# Posted on: 28-Mar-2008 14:39:18   

hummmmmmmmm....

As far as I know, any respectable current database server (Oracle, SQL Server, SQL Anywhere, ...) offers a consistent view. Any single query, or set of queries made in a transaction, is executed as if it were an atomic instruction, or it is totally rolled back.

"Stale" come from a difference between the DB and in memory data due to a change in DB since you get it in memory.

The inconsistency comes from an invalid logical condition between 2 rows in the database (or in memory), due to concurrency between 2 accesses (as soon as any write is in play, it should be in a transaction).

Stale and inconsistency are totally different, and I'm not talking about 'stale' here.

Inconsistency should never happen under any condition at all, either in the database or in memory. The software developer should use transactions to prevent it, both while writing and while reading. Reading should also be protected by a transaction whenever any write could happen at the same time. The example shows that clearly: in the example you get an invalid condition between 2 objects that should never happen.

That is why it would be necessary, in the current design of LLBLGen, to use a transaction. Better still, although it could be limiting or sometimes slow to use joins, it should almost always be more efficient (to use joins) than doing many queries. I understand that it would be a huge change. I can live with many queries (like it is now). I can live with the bug (no transaction for the prefetch path). But you will always have to live with the fact that LLBLGen is not perfect (although arguably the best and the most flexible at the moment). simple_smile ... and I know you like perfection!!! wink

Otis
LLBLGen Pro Team
Posts: 39862
Joined: 17-Aug-2003
# Posted on: 28-Mar-2008 16:39:58   

Trust me, JOINs aren't going to work with prefetch paths. I don't want to debate this, time is too short. The main point is: a prefetch path isn't said to be atomic. It's a series of fetches. If your application can't deal with that, you have to make sure the fetches see all data at time T during the fetch, as if all fetches take place at time T. You then indeed need a transaction, and you should start one when fetching, though the results can differ from database system to database system. For example, on SQL Server readers block writers, but that's not the case on Oracle.
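
For SelfServicing, a minimal sketch of starting such a transaction around the fetch (the isolation level and name are just examples; whether the reads actually see one snapshot still depends on the database and the isolation level used):

        // Sketch: let the collection fetch (including its prefetch path queries) run
        // inside one ADO.NET transaction. Requires using System.Data; and the
        // ORMSupportClasses Transaction class. prefetchPath is built as earlier in this thread.
        Transaction transaction = new Transaction(IsolationLevel.ReadCommitted, "PrefetchFetch");
        try
        {
            CombinaisonProbableCollection combinaisons = new CombinaisonProbableCollection();
            transaction.Add(combinaisons);   // make the collection use the transaction's connection
            combinaisons.GetMulti(null, prefetchPath);
            transaction.Commit();
        }
        catch
        {
            transaction.Rollback();
            throw;
        }
        finally
        {
            transaction.Dispose();
        }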

Frans Bouma | Lead developer LLBLGen Pro
Wally
User
Posts: 285
Joined: 29-Aug-2007
# Posted on: 31-Mar-2008 16:15:38   

I have no experience in developing an ORM, so I will trust you (with a little reserve) about joins being impossible for prefetch paths (one day I will clarify that wink, at least I hope).

Oracle is the only one (to my knowledge) that provides a consistent view while permitting writing and reading of the same records at the same time. But although it is the only one with that strength, they all (Oracle, SQL Anywhere, SQL Server, DB2) meet the ACID (atomicity, consistency, isolation and durability) specification for transactions. For that reason, I think the results will always be the same in any case (the concurrent read-write mentioned earlier).

Note that concurrent updates on the same rows would lead to different results in different database systems, but it was not my intention to discuss that here.

Anyway... Thanks for your nice answers and your precious time. Knowing that prefetch paths are series of fetches that are not wrapped in a transaction will help us do the appropriate thing whenever such cases come up.

Thanks