Querying the nested set model

Posts   
 
    
mattsmith321 avatar
Posts: 146
Joined: 04-Oct-2004
# Posted on: 12-Apr-2006 18:11:05   

We've opted to implement the nested set model (as opposed to the adjacent model) on a project at work. I've been through a number of Celko's articles (e.g. http://www.dbmsmag.com/9604d06.html) and have read several threads here (e.g. http://llblgen.com/tinyforum/Messages.aspx?ThreadID=5493&HighLight=1) related to self-referencing/hierarchical structures. So far, so good.

We only update the hierarchy once a month and have a SQL statement to convert it from the adjacent model we receive it in, to the nested model we store it in. No problem there since it is a backend DTS process and doesn't rely on the LLBL entities.

My obstacle is in regards to querying. We have SQL that pulls the data out exactly how we need it. However, I can't figure out how to get the SQL over to the equivalent LLBL predicate format. An example of the SQL is:

SELECT P2.emp, COUNT(*) AS level
FROM Personnel AS P1, Personnel AS P2
WHERE P2.lft BETWEEN P1.lft AS P2
GROUP BY P2.emp;

How do I get P1 and P2 using the LLBL predicate model? As usual, it's probably somewhere in the docs, but I'm having a hard time finding it.

Thanks! Matt

Walaa avatar
Walaa
Support Team
Posts: 14950
Joined: 21-Aug-2005
# Posted on: 12-Apr-2006 18:34:43   

you may want to use a dynamic list, with table aliases as in the following thread: http://www.llblgen.com/TinyForum/Messages.aspx?ThreadID=5812

Is there a relation between Personnel and itself?

mattsmith321 avatar
Posts: 146
Joined: 04-Oct-2004
# Posted on: 12-Apr-2006 19:03:50   

Walaa wrote:

Is there a relation between Personnel and itself?

No, there is no relation between Personnel and itself (i.e. there is not a parent column). The nested set model uses the 'lft' and 'rgt' columns to track the nodes location in the hierarchy.

I'll take a look at the link.

mattsmith321 avatar
Posts: 146
Joined: 04-Oct-2004
# Posted on: 12-Apr-2006 19:15:19   

I think I'm finding what I need in the Help manual under Generated code - Advanced filter usage, Adapter > Advanced Filtering.

I'll report back...

mattsmith321 avatar
Posts: 146
Joined: 04-Oct-2004
# Posted on: 13-Apr-2006 22:39:09   

mattsmith321 wrote:

I think I'm finding what I need in the Help manual under Generated code - Advanced filter usage, Adapter > Advanced Filtering.

I'll report back...

Still not quite getting it. cry All the doco and threads I'm finding either refer to field aliases or aliases in dynamic lists. I need to be able to alias a table/entity in the same query.

FYI, my SQL example isn't quite correct above. Here is some working SQL:


-- Returns the current organization and all sub-organizations
SELECT o1.*
FROM organization AS o1, organization AS o2
WHERE o1.lft BETWEEN o2.lft AND o2.rgt
AND o2.team_id = 25
ORDER BY o1.lft ASC

or

-- Finds all sub-organizations
SELECT o1.*
FROM organization AS o1, organization AS o2
WHERE o1.lft > o2.lft AND o1.lft < o2.rgt
AND o2.team_id = 25
ORDER BY o1.lft ASC

So, how can I alias a table/entity (not field level alias or dynamic lists)?

Thanks!

mattsmith321 avatar
Posts: 146
Joined: 04-Oct-2004
# Posted on: 13-Apr-2006 23:53:06   

I tried:

RelationPredicateBucket filter = new RelationPredicateBucket();

EntityField2 o1Left = EntityFieldFactory.Create(OrganizationFieldIndex.Lft) as EntityField2;
o1Left.ObjectAlias = "o1";
EntityField2 o2Left = EntityFieldFactory.Create(OrganizationFieldIndex.Lft) as EntityField2;
o2Left.ObjectAlias = "o2";
IEntityField2 o2Rgt = EntityFieldFactory.Create(OrganizationFieldIndex.Rgt);
o2Rgt.ObjectAlias = "o2";

filter.PredicateExpression.Add(o1Left > o2Left).AddWithAnd(o1Left < o2Rgt).AddWithAnd(OrganizationFields.TeamId == 25);

which generated the following SQL:

SELECT 
    [fabulos].[dbo].[fab_organization].[org_id] AS [OrgId], 
    [fabulos].[dbo].[fab_organization].[org_unit_name] AS [DeptName], 
    [fabulos].[dbo].[fab_organization].[fiscal_period] AS [FiscalPeriod], 
    [fabulos].[dbo].[fab_organization].[dept_code] AS [DeptCode], 
    [fabulos].[dbo].[fab_organization].[manager_id] AS [TeamLeadId], 
    [fabulos].[dbo].[fab_organization].[team_id] AS [TeamId], 
    [fabulos].[dbo].[fab_organization].[enabled_yn] AS [EnabledYn], 
    [fabulos].[dbo].[fab_organization].[knta_org_unit_id] AS [KntaOrgUnitId], 
    [fabulos].[dbo].[fab_organization].[lft] AS [Lft], 
    [fabulos].[dbo].[fab_organization].[rgt] AS [Rgt] 
FROM [fabulos].[dbo].[fab_organization]  
WHERE ( ( [o1].[lft] > [o2].[lft] 
AND [o1].[lft] < [o2].[rgt] 
AND [fabulos].[dbo].[fab_organization].[team_id] = @TeamId1))

which is pretty much want I want except that it is missing the appropriate alias in the FROM statement. The FROM statement needs to look something like:

FROM [fabulos].[dbo].[fab_organization] AS o1, [fabulos].[dbo].[fab_organization] AS o2

The generated query fails with the following exception (obviously because of the missing aliases):

An exception was caught during the execution of a retrieval query: The column prefix 'o1' does not match with a table name or alias name used in the query.
The column prefix 'o2' does not match with a table name or alias name used in the query.
The column prefix 'o1' does not match with a table name or alias name used in the query.
The column prefix 'o2' does not match with a table name or alias name used in the query.. Check InnerException, QueryExecuted and Parameters of this exception to examine the cause of this exception. 

I know I'm missing something really simple but I just don't know what it is or where it goes. confused

Thanks.

mattsmith321 avatar
Posts: 146
Joined: 04-Oct-2004
# Posted on: 14-Apr-2006 07:02:25   

I made a lot of progress after reading this thread: http://llblgen.com/tinyforum/Messages.aspx?ThreadID=4039

Unfortunately I'm still not quite there. I now have the following code:

// first define the join relation. We need to specify a filter as well. 
EntityCollection orgs = new EntityCollection(new OrganizationEntityFactory());

RelationPredicateBucket filter = new RelationPredicateBucket();
            
EntityRelation relation = new EntityRelation(RelationType.OneToMany);
relation.AddEntityFieldPair(
    EntityFieldFactory.Create(OrganizationFieldIndex.OrgId), 
    EntityFieldFactory.Create(OrganizationFieldIndex.OrgId));

PredicateExpression onClauseFilter = new PredicateExpression();
onClauseFilter.Add(PredicateFactory.CompareExpression(
    OrganizationFieldIndex.OrgId, ComparisonOperator.NotEqual, 
    new Expression(EntityFieldFactory.Create(OrganizationFieldIndex.OrgId)), "o2"));

RelationCollection relations = new RelationCollection();
relations.Add(relation, "o2", JoinHint.Inner).CustomFilter = onClauseFilter;
relation.CustomFilterReplacesOnClause = true;

filter.Relations.AddRange(relations);

// Now we'll define the WHERE clause
IPredicateExpression whereFilter = new PredicateExpression();

// WHERE o1.lft > o2.lft
whereFilter.Add(PredicateFactory.CompareExpression(
    OrganizationFieldIndex.Lft, ComparisonOperator.GreaterThan, 
    new Expression(EntityFieldFactory.Create(OrganizationFieldIndex.Lft)), "o2"));
// AND o1.lft < o2.rgt
whereFilter.AddWithAnd(PredicateFactory.CompareExpression(
    OrganizationFieldIndex.Lft, ComparisonOperator.LesserThan, 
    new Expression(EntityFieldFactory.Create(OrganizationFieldIndex.Rgt)), "o2"));
// AND o2.team_id = 25
whereFilter.AddWithAnd(PredicateFactory.CompareValue(
    OrganizationFieldIndex.TeamId, ComparisonOperator.Equal, 25, "o2"));

Which generates the following SQL:

SELECT DISTINCT 
    [fabulos].[dbo].[fab_organization].[org_id] AS [OrgId], 
    [fabulos].[dbo].[fab_organization].[org_unit_name] AS [DeptName], 
    [fabulos].[dbo].[fab_organization].[fiscal_period] AS [FiscalPeriod], 
    [fabulos].[dbo].[fab_organization].[dept_code] AS [DeptCode], 
    [fabulos].[dbo].[fab_organization].[manager_id] AS [TeamLeadId], 
    [fabulos].[dbo].[fab_organization].[team_id] AS [TeamId], 
    [fabulos].[dbo].[fab_organization].[enabled_yn] AS [EnabledYn], 
    [fabulos].[dbo].[fab_organization].[knta_org_unit_id] AS [KntaOrgUnitId], 
    [fabulos].[dbo].[fab_organization].[lft] AS [Lft], 
    [fabulos].[dbo].[fab_organization].[rgt] AS [Rgt] 
FROM ( [fabulos].[dbo].[fab_organization] [LPA_o1]  
INNER JOIN [fabulos].[dbo].[fab_organization]  
    ON  ( [LPA_o1].[org_id] <> [fabulos].[dbo].[fab_organization].[org_id])) 
WHERE ( ( ( [LPA_o1].[lft] > [fabulos].[dbo].[fab_organization].[lft] 
AND [LPA_o1].[lft] < [fabulos].[dbo].[fab_organization].[rgt] 
AND [LPA_o1].[team_id] = @TeamId1)))

While this is really close, the alias is occuring on the first table. I could almost deal with that but then I would want the returned fields to be coming from o2/LPA_o1.

What am I missing?

mattsmith321 avatar
Posts: 146
Joined: 04-Oct-2004
# Posted on: 14-Apr-2006 09:27:29   

I think I finally figured it out! smile It took quite a while and a lot of experimenting. Here is what I did:

First, this line was my turning point:

relations.Add(relation, "o2", JoinHint.Inner).CustomFilter = onClauseFilter;

Turns out, there is another overload that allows you to define the start and end entity alias.

filter.Relations.Add(relation, "o2", string.Empty, JoinHint.Inner).CustomFilter = onClauseFilter;

Once I found out that I could control whether the first or second table got aliased, I struggled for a while trying to figure out how to control whether the left or right side of the predicate expressions would get aliased. It took me a little while, but I finally came to the conclusion that only the left side gets aliased and that the query should be adjusted accordingly. So, I ended up with this as the SQL statement that I was trying to render:


SELECT DISTINCT fab_organization.*
FROM fab_organization
INNER JOIN fab_organization o2 
ON o2.org_id != fab_organization.org_id
WHERE o2.lft < fab_organization.lft 
AND o2.rgt > fab_organization.lft
AND o2.team_id = 25

Which I converted to the following LLBL syntax:

EntityCollection orgs = new EntityCollection(new OrganizationEntityFactory());

RelationPredicateBucket filter = new RelationPredicateBucket();

EntityRelation relation = new EntityRelation(RelationType.OneToMany);

relation.AddEntityFieldPair(
        EntityFieldFactory.Create(OrganizationFieldIndex.OrgId), 
        EntityFieldFactory.Create(OrganizationFieldIndex.OrgId));

PredicateExpression onClauseFilter = new PredicateExpression();

onClauseFilter.Add(PredicateFactory.CompareExpression(
        OrganizationFieldIndex.OrgId, ComparisonOperator.NotEqual, 
        new Expression(EntityFieldFactory.Create(OrganizationFieldIndex.OrgId)), "o2"));

relation.CustomFilterReplacesOnClause = true;

filter.Relations.Add(relation, "o2", string.Empty, JoinHint.Inner).CustomFilter = onClauseFilter;

IPredicateExpression whereFilter = new PredicateExpression();

whereFilter.Add(PredicateFactory.CompareExpression(
        OrganizationFieldIndex.Lft, ComparisonOperator.LesserThan, 
        new Expression(EntityFieldFactory.Create(OrganizationFieldIndex.Lft)), "o2"));

whereFilter.AddWithAnd(PredicateFactory.CompareExpression(
        OrganizationFieldIndex.Rgt, ComparisonOperator.GreaterThan, 
        new Expression(EntityFieldFactory.Create(OrganizationFieldIndex.Lft)), "o2"));

whereFilter.AddWithAnd(PredicateFactory.CompareValue(
    OrganizationFieldIndex.TeamId, ComparisonOperator.Equal, 25, "o2"));

filter.PredicateExpression.Add(whereFilter);

using (DataAccessAdapter daa = new DataAccessAdapter())
{
        daa.FetchEntityCollection(orgs, filter);
}

return orgs;

Which generated the following SQL:

SELECT DISTINCT 
    [fabulos].[dbo].[fab_organization].[org_id] AS [OrgId], 
    [fabulos].[dbo].[fab_organization].[org_unit_name] AS [DeptName], 
    [fabulos].[dbo].[fab_organization].[fiscal_period] AS [FiscalPeriod], 
    [fabulos].[dbo].[fab_organization].[dept_code] AS [DeptCode], 
    [fabulos].[dbo].[fab_organization].[manager_id] AS [TeamLeadId], 
    [fabulos].[dbo].[fab_organization].[team_id] AS [TeamId], 
    [fabulos].[dbo].[fab_organization].[enabled_yn] AS [EnabledYn], 
    [fabulos].[dbo].[fab_organization].[knta_org_unit_id] AS [KntaOrgUnitId], 
    [fabulos].[dbo].[fab_organization].[lft] AS [Lft], 
    [fabulos].[dbo].[fab_organization].[rgt] AS [Rgt] 
FROM ( [fabulos].[dbo].[fab_organization]  
INNER JOIN [fabulos].[dbo].[fab_organization] [LPA_o1]  
    ON  ( [LPA_o1].[org_id] <> [fabulos].[dbo].[fab_organization].[org_id])) 
WHERE ( ( ( [LPA_o1].[lft] < [fabulos].[dbo].[fab_organization].[lft] 
AND [LPA_o1].[rgt] > [fabulos].[dbo].[fab_organization].[lft] 
AND [LPA_o1].[team_id] = @TeamId1)))

Which returned the expected records! smile

I'm glad that it works, but is there any way to simplify the LLBL code any? i.e. using textual query syntax instead of the verbose expressions I used?

mattsmith321 avatar
Posts: 146
Joined: 04-Oct-2004
# Posted on: 14-Apr-2006 17:10:50   

Aaargghhh!! cry

The example above start with the easiest form of the query I needed:

-- Returns ONLY the child organizations where team_id = 25
SELECT o1.*
FROM fab_organization AS o1, fab_organization AS o2
WHERE o1.lft > o2.lft AND o1.lft < o2.rgt 
AND o2.team_id = 25

The next step in the query evolution is:

-- Returns the child organizations AND the selected organization where team_id = 25
SELECT o1.*
FROM fab_organization AS o1, fab_organization AS o2
--WHERE o1.lft BETWEEN o2.lft AND o2.rgt
--WHERE o1.lft > o2.lft AND o1.lft < o2.rgt 
AND o2.team_id = 25

The current SQL modified to work with LLBL aliasing is:

-- Returns ONLY the child organization where team_id = 25
SELECT DISTINCT fab_organization.*
FROM fab_organization
INNER JOIN fab_organization o2 
ON o2.org_id != fab_organization.org_id
WHERE o2.lft < fab_organization.lft 
AND o2.rgt > fab_organization.lft
AND o2.team_id = 25

My first attempt to add the BETWEEN clause still only returned the child organizations. I suspect it has to do with the != on the INNER JOIN that filters out that particular record.

-- Still only returns the child organizations
SELECT DISTINCT fab_organization.*
FROM fab_organization
INNER JOIN fab_organization o2 
ON o2.org_id != fab_organization.org_id
WHERE fab_organization.lft BETWEEN o2.lft AND o2.rgt
AND o2.team_id = 25

Of course, that particular syntax above is not possible in LLBL because the alias is on the wrong side. So, rather than use the BETWEEN statement, I just re-used the first SQL and added in = to the less than and greater than expressions:

SELECT DISTINCT fab_organization.*
FROM fab_organization
INNER JOIN fab_organization o2 
ON o2.org_id != fab_organization.org_id
WHERE o2.lft <= fab_organization.lft 
AND o2.rgt >= fab_organization.lft
AND o2.team_id = 25

But, I'm still left with the problem that it only returns the child organizations and does not include the selected organization where team_id = 25.

In case you can't tell, I'm trying really hard to avoid stored procs, typed views, typed lists, and dynamic lists. Mainly because we have made it this far without them.

What can I do to get my query working the way I want in a way that LLBL wants it?

Any help is appreciated!

mattsmith321 avatar
Posts: 146
Joined: 04-Oct-2004
# Posted on: 14-Apr-2006 18:30:40   

So it looks like I can't do what I need with the INNER JOIN approach. As such, I need a way to alias one or both tables so that I can build my predicates appropriately.

Is it even possible?

mattsmith321 avatar
Posts: 146
Joined: 04-Oct-2004
# Posted on: 15-Apr-2006 05:00:48   

I just stumbled on this thread regarding nested sets (ala Celko). Unfortunately, it isn't helping me all that much: http://www.llblgen.com/tinyforum/Messages.aspx?ThreadID=3670

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39617
Joined: 17-Aug-2003
# Posted on: 15-Apr-2006 11:41:03   

the query


SELECT o1.*
FROM organization AS o1, organization AS o2
WHERE o1.lft BETWEEN o2.lft AND o2.rgt
AND o2.team_id = 25
ORDER BY o1.lft ASC

uses a crossjoin, between o1 and o2. The query should be the same as:


SELECT o1.*
FROM organization AS o1 CROSS JOIN organization AS o2
WHERE o1.lft BETWEEN o2.lft AND o2.rgt
AND o2.team_id = 25
ORDER BY o1.lft ASC

You can specify a cross-join in LLBLGen Pro as well: Create a new EntityRelation, doesn't matter which fields, e.g. just specify the PK field of organization, and specify as joinhint: Cross.

Then specify the where clause as a normal predicate. So browsing back to your code above, it should becomes something like:


// relation
EntityRelation relation = new EntityRelation(RelationType.OneToMany);
relation.AddEntityFieldPair(OrganizationFields.OrgId, OrganizationFields.OrgId);

// filter. Just alias o2 only. We don't need to alias o1.
RelationPredicateBucket filter = new RelationPredicateBucket();
filter.Relations.Add(relation, "o2", JoinHint.Cross);
filter.PredicateExpression.Add(
    new FieldBetweenPredicate(OrganizationFields.Lft, null, 
        OrganizationFields.Lft.SetObjectAlias("o2"), 
        OrganizationFields.Rgt.SetObjectAlias("o2")));
filter.PredicateExpression.AddWithAnd(
    OrganizationFields.TeamId.SetObjectAlias("o2") == 25);
SortExpression sorter = new SortExpression(
    OrganizationFields.Lft | SortOperator.Ascending);

// let's rock!
EntityCollection orgs = new EntityCollection(new OrganizationEntityFactory());
using(DataAccessAdapter adapter = new DataAccessAdapter())
{
    adapter.FetchEntityCollection(orgs, filter, 0, sorter);
}

Now, I of course didn't test this code snippet as I don't have the db layout here but it looks to me this should produce a query similar to yours.

Of course, your query in SQL gives an inner join execution plan (which I doubt), it's not that efficient and we've to look at different options simple_smile

Frans Bouma | Lead developer LLBLGen Pro
mattsmith321 avatar
Posts: 146
Joined: 04-Oct-2004
# Posted on: 15-Apr-2006 22:30:35   

Thanks for the response and insight. It seems that I learn something new everytime I read one of your replies. simple_smile

In this case, I've never messed with cross-joins before so I never would have thought to give them a try, especially since they weren't mentioned in anything that I've seen by Celko. But, you are right. The cross-join and inner-join do appear to result in almost identical execution plans. In fact, all three versions below produce almost identical execution plans:

SELECT DISTINCT o1.*
FROM fab_organization AS o1 
CROSS JOIN fab_organization AS o2
WHERE o1.lft BETWEEN o2.lft AND o2.rgt
AND o2.team_id = 25
and o1.fiscal_period = '2006-03-31'
--ORDER BY o1.lft ASC

SELECT DISTINCT o1.*
FROM fab_organization o1
INNER JOIN fab_organization o2 
ON o2.org_id != o1.org_id
WHERE o1.lft BETWEEN o2.lft AND o2.rgt
AND o2.team_id = 25
AND o1.fiscal_period = '2006-03-31'
--ORDER BY o1.lft ASC

SELECT DISTINCT o1.*
FROM fab_organization o1, fab_organization o2
WHERE o1.lft BETWEEN o2.lft AND o2.rgt
AND o2.team_id = 25
AND o1.fiscal_period = '2006-03-31'
--ORDER BY o1.lft ASC

Without the ORDER BY statements, the Nested Loop shows up as a Left Semi Join. With the ORDER BY statements, the Nested Loop shows up as an Inner Join.

However, something about the INNER JOIN query caused it to be fractionally more expensive than the other two: Query cost (relative to the batch): 33.32% (CROSS) vs 33.36% (INNER) vs 33.32% (NONE). So, I may consider switching to the cross-join.

The other thing that I learned was to use the SetObjectAlias() method to assign aliases to the right side of the expression. I had run across that method in my research but obviously wasn't seeing how to apply it correctly. This means that I can keep the BETWEEN statement instead of swapping it around. simple_smile

Now for the bad news. confused I mentioned several times above about the INNER JOIN causing the query to not return the selected organization along with all of the sub-organizations. All three of the queries above (which are very similar to the ones that I've been running for several days now) return the selected organization with all the sub-orgs. That's good smile except for the fact that I've spent the last two days cry trying to solve the issue by trying to figure out how to alias both entities to go with a query like option #3 so that I could avoid the inner-join which is what I though was causing the problem. The only thing that I can think of that is different is that I got a new cut of data last night.

Originally when I started this whole exercise, my specific need was to have Organization entities returned so that I can show the organization hierarchy. However, in the course of working through these problems and working on the app, I found a need to have a query that only returned the team_ids of the sub organizations. These team_ids are then dumped to an arry and used as the input for a FieldCompareRangePredicate for the following larger query/prefetch logic frowning :

SELECT DISTINCT bfa.*
FROM fab_group g
INNER JOIN fab_category c ON c.group_id = g.group_id
INNER JOIN fab_cont_cat cc ON cc.category_id = c.category_id
INNER JOIN fab_bfa_summary bfa ON bfa.fcc_id = cc.fcc_id
WHERE cc.contract_id = 1
AND (
    (
        bfa.team_id IN (
            -- Calls separate method that returns int[] of team_ids
            SELECT DISTINCT o1.*
            FROM fab_organization AS o1 
            CROSS JOIN fab_organization AS o2
            WHERE o1.lft BETWEEN o2.lft AND o2.rgt
            AND o2.team_id = 25
            and o1.fiscal_period = '2005-09-30'
            ORDER BY o1.lft ASC
        )
        and bfa.fiscal_period = '2005-09-30'
    )
    OR
    (
        bfa.team_id in (
            -- Calls separate method that returns int[] of team_ids
            SELECT DISTINCT o1.*
            FROM fab_organization AS o1 
            CROSS JOIN fab_organization AS o2
            WHERE o1.lft BETWEEN o2.lft AND o2.rgt
            AND o2.team_id = 25
            and o1.fiscal_period = '2005-10-31'
            ORDER BY o1.lft ASC
        )
        and bfa.fiscal_period = '2005-10-31'
    )
    OR
    (
        -- repeat for each month in fiscal year
    )
)

I specifically pulled the sub-query into a separate method because there was no way I was going to try to tack that on to an already large prefetch/filter query. It seems to work for now, so I probably won't mess with it. Now if I could only do the aggregation in the query I would be set wink but for now I'll continue to do it in the code.

In case you are wondering, we are working on an internal temporal database financial app. Temporal database is a term I was unfamilar with until I ran across this thread last night: http://discuss.joelonsoftware.com/default.asp?design.4.331143.10 Good to see that there is a 500+ book written on the temporal db topic and that I'm really not an idiot for having such a hard time grasping some of the complexities that we keep uncovering in this project. Now if I could only read all 500+ pages before Monday, I would be set. sunglasses

Thanks again for the assistance. I'm sure that most of my ramblings don't provide any value but they sure help me vent my frustrations. I'm sure I'll be back.

Matt

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39617
Joined: 17-Aug-2003
# Posted on: 17-Apr-2006 10:54:45   

mattsmith321 wrote:

Thanks for the response and insight. It seems that I learn something new everytime I read one of your replies. simple_smile

simple_smile

In this case, I've never messed with cross-joins before so I never would have thought to give them a try, especially since they weren't mentioned in anything that I've seen by Celko. But, you are right. The cross-join and inner-join do appear to result in almost identical execution plans. In fact, all three versions below produce almost identical execution plans:

SELECT DISTINCT o1.*
FROM fab_organization AS o1 
CROSS JOIN fab_organization AS o2
WHERE o1.lft BETWEEN o2.lft AND o2.rgt
AND o2.team_id = 25
and o1.fiscal_period = '2006-03-31'
--ORDER BY o1.lft ASC

SELECT DISTINCT o1.*
FROM fab_organization o1
INNER JOIN fab_organization o2 
ON o2.org_id != o1.org_id
WHERE o1.lft BETWEEN o2.lft AND o2.rgt
AND o2.team_id = 25
AND o1.fiscal_period = '2006-03-31'
--ORDER BY o1.lft ASC

SELECT DISTINCT o1.*
FROM fab_organization o1, fab_organization o2
WHERE o1.lft BETWEEN o2.lft AND o2.rgt
AND o2.team_id = 25
AND o1.fiscal_period = '2006-03-31'
--ORDER BY o1.lft ASC

Without the ORDER BY statements, the Nested Loop shows up as a Left Semi Join. With the ORDER BY statements, the Nested Loop shows up as an Inner Join.

However, something about the INNER JOIN query caused it to be fractionally more expensive than the other two: Query cost (relative to the batch): 33.32% (CROSS) vs 33.36% (INNER) vs 33.32% (NONE). So, I may consider switching to the cross-join.

I didn't expect that actually... a cross join is in general way more expensive than an inner join.

The other thing that I learned was to use the SetObjectAlias() method to assign aliases to the right side of the expression. I had run across that method in my research but obviously wasn't seeing how to apply it correctly. This means that I can keep the BETWEEN statement instead of swapping it around. simple_smile

simple_smile Yeah I added that method (also SetExpression()) to be able to use the shorter EntityFields.Fieldname objects and use the operator overloading queries while still setting object aliasses.

Now for the bad news. confused I mentioned several times above about the INNER JOIN causing the query to not return the selected organization along with all of the sub-organizations. All three of the queries above (which are very similar to the ones that I've been running for several days now) return the selected organization with all the sub-orgs. That's good smile except for the fact that I've spent the last two days cry trying to solve the issue by trying to figure out how to alias both entities to go with a query like option #3 so that I could avoid the inner-join which is what I though was causing the problem. The only thing that I can think of that is different is that I got a new cut of data last night.

What exactly is option 3? (As there are a lot of queries in this thread wink )

Originally when I started this whole exercise, my specific need was to have Organization entities returned so that I can show the organization hierarchy. However, in the course of working through these problems and working on the app, I found a need to have a query that only returned the team_ids of the sub organizations. These team_ids are then dumped to an arry and used as the input for a FieldCompareRangePredicate for the following larger query/prefetch logic frowning :

SELECT DISTINCT bfa.*
FROM fab_group g
INNER JOIN fab_category c ON c.group_id = g.group_id
INNER JOIN fab_cont_cat cc ON cc.category_id = c.category_id
INNER JOIN fab_bfa_summary bfa ON bfa.fcc_id = cc.fcc_id
WHERE cc.contract_id = 1
AND (
    (
        bfa.team_id IN (
            -- Calls separate method that returns int[] of team_ids
            SELECT DISTINCT o1.*
            FROM fab_organization AS o1 
            CROSS JOIN fab_organization AS o2
            WHERE o1.lft BETWEEN o2.lft AND o2.rgt
            AND o2.team_id = 25
            and o1.fiscal_period = '2005-09-30'
            ORDER BY o1.lft ASC
        )
        and bfa.fiscal_period = '2005-09-30'
    )
    OR
    (
        bfa.team_id in (
            -- Calls separate method that returns int[] of team_ids
            SELECT DISTINCT o1.*
            FROM fab_organization AS o1 
            CROSS JOIN fab_organization AS o2
            WHERE o1.lft BETWEEN o2.lft AND o2.rgt
            AND o2.team_id = 25
            and o1.fiscal_period = '2005-10-31'
            ORDER BY o1.lft ASC
        )
        and bfa.fiscal_period = '2005-10-31'
    )
    OR
    (
        -- repeat for each month in fiscal year
    )
)

I specifically pulled the sub-query into a separate method because there was no way I was going to try to tack that on to an already large prefetch/filter query. It seems to work for now, so I probably won't mess with it. Now if I could only do the aggregation in the query I would be set wink but for now I'll continue to do it in the code.

You can also execute the query in a dyn. list and add the aggregate there.

In case you are wondering, we are working on an internal temporal database financial app. Temporal database is a term I was unfamilar with until I ran across this thread last night: http://discuss.joelonsoftware.com/default.asp?design.4.331143.10 Good to see that there is a 500+ book written on the temporal db topic and that I'm really not an idiot for having such a hard time grasping some of the complexities that we keep uncovering in this project. Now if I could only read all 500+ pages before Monday, I would be set. sunglasses

heh simple_smile . I never heard of the term temporal database as well.

Thanks again for the assistance. I'm sure that most of my ramblings don't provide any value but they sure help me vent my frustrations. I'm sure I'll be back.

I hope you get it sorted out simple_smile .

Perhaps a hint you can use: often people find themselves painted into a corner because they tried to do a hierarchy fetch in such a way that 'displaying' it is easy. That's a mistake. If you have a hierarchy stored in a single list, you can still display it as a hierarchy with an O(n) algorithm. This then makes the fetch often way simpler, but I'm not sure if it helps you in any way, just a FYI wink .

(edit): ah, I now read that temporal db thread and I now understand what they mean. In short it's versioned sets, which is a very complex topic. For starters: to version a set, you first have to define the set. This might sound simple, but any set which contains subsets versions with the change of a subset... wink

Frans Bouma | Lead developer LLBLGen Pro
sami
User
Posts: 93
Joined: 28-Oct-2005
# Posted on: 07-Nov-2006 13:37:27   

I have been learning about this topic myself lately, and found this nice article:

http://dev.mysql.com/tech-resources/articles/hierarchical-data.html

which describes how you query the immediate subordinates of a node, using following query:


SELECT node.name, (COUNT(parent.name) - (sub_tree.depth + 1)) AS depth
FROM nested_category AS node,
    nested_category AS parent,
    nested_category AS sub_parent,
    (
        SELECT node.name, (COUNT(parent.name) - 1) AS depth
        FROM nested_category AS node,
        nested_category AS parent
        WHERE node.lft BETWEEN parent.lft AND parent.rgt
        AND node.name = 'PORTABLE ELECTRONICS'
        GROUP BY node.name
        ORDER BY node.lft
    )AS sub_tree
WHERE node.lft BETWEEN parent.lft AND parent.rgt
    AND node.lft BETWEEN sub_parent.lft AND sub_parent.rgt
    AND sub_parent.name = sub_tree.name
GROUP BY node.name
HAVING depth <= 1
ORDER BY node.lft;

I was just wondering if such a complex subquery is doable using LLBLGen predicates?

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39617
Joined: 17-Aug-2003
# Posted on: 08-Nov-2006 09:32:59   

LLBLGen Pro doesn't support derived tables in FROM clauses. Derived tables are select statements aliased with a name in a from clause, like you use (sub_tree). So that's a bit of a problem.

Frans Bouma | Lead developer LLBLGen Pro