The LLBLGen Pro Blog: LLBLGen Pro v5.5 feature highlight: window functions

In LLBLGen Pro v5.5 we've introduced a new feature called Window functions support. What exactly are 'window functions' and why are they so important? To answer these questions let's first look at a good description of what a window function is. From the excellent PostgreSQL documentation:

A window function performs a calculation across a set of table rows that are somehow related to the current row. This is comparable to the type of calculation that can be done with an aggregate function. However, window functions do not cause rows to become grouped into a single output row like non-window aggregate calls would. Instead, the rows retain their separate identities. Behind the scenes, the window function is able to access more than just the current row of the query result.

So i.o.w. you can define per row a set of rows inside the resultset, perform an operation on these rows and return the result of that operation as a value in the current row. Which rows are in the window is defined by the window definition which is defined using an OVER() clause. The operation applied on the rows is defined by the function the OVER() clause is defined with.

This sounds all rather "I don't need this"-complex and artificial, however it's highly likely you will run into use cases where window functions would help a lot and you're currently forced to use either complex group by clauses or multiple queries and merge the results yourself. In fact, SQL experts say that once you've been exposed to window functions you really wonder how you have lived without them before that.

Let's look at a simple example. We're using the good ol' Northwind database for this. We'll fetch a list of product data and want to additionally calculate the average unit price of the category the product is in. With window functions this is very easy:

QuerySpecGenerated SQL

// QuerySpec query
var qf = new QueryFactory();
var q = qf.Product
          .Where(ProductFields.Discontinued.Equal(false))
          .Select(ProductFields.ProductId, ProductFields.ProductName, ProductFields.UnitPrice,
                  ProductFields.UnitPrice.Avg()
                            .Over(WindowPartition.By(ProductFields.CategoryId)).As("AverageUnitPriceInCategory"));
var results = adapter.FetchQuery(q);

-- value of the @p1 parameter is: 0
SELECT [Northwind].[dbo].[Products].[ProductID] AS [ProductId],
       [Northwind].[dbo].[Products].[ProductName],
       [Northwind].[dbo].[Products].[UnitPrice],
       AVG([Northwind].[dbo].[Products].[UnitPrice]) 
                OVER(PARTITION BY [Northwind].[dbo].[Products].[CategoryID]) AS [AverageUnitPriceInCategory]
FROM   [Northwind].[dbo].[Products]
WHERE  (([Northwind].[dbo].[Products].[Discontinued] = @p1))

This will give the following result-set:

ProductId	ProductName	UnitPrice	AverageUnitPriceInCategory
75	Rhönbräu Klosterbier	7.75	7.75
76	Lakkalikööri	18.00	44.35
67	Laughing Lumberjack Lager	14.00	44.35
70	Outback Lager	15.00	44.35
1	Chai	18.00	44.35
2	Chang	19.00	44.35
34	Sasquatch Ale	14.00	44.35
35	Steeleye Stout	18.00	44.35
38	Côte de Blaye	263.50	44.35
39	Chartreuse verte	18.00	44.35
43	Ipoh Coffee	46.00	44.35
44	Gula Malacca	19.45	23.2181
61	Sirop d'érable	28.50	23.2181
63	Vegie-spread	43.90	23.2181
65	Louisiana Fiery Hot Pepper Sauce	21.05	23.2181
66	Louisiana Hot Spiced Okra	17.00	23.2181
3	Aniseed Syrup	10.00	23.2181
4	Chef Anton's Cajun Seasoning	22.00	23.2181
6	Grandma's Boysenberry Spread	25.00	23.2181
8	Northwoods Cranberry Sauce	40.00	23.2181
15	Genen Shouyu	15.50	23.2181
77	Original Frankfurter grüne Soße	13.00	23.2181

No complicated group by needed. It's also easy to understand how to define the window over the resultset: we want to have the average price per category, so we want all rows belonging to a category in a single group. The PARTITION SQL keyword does just that: here it partitions the resultset over the CategoryID so all rows with the same CategoryID are in the same group. The aggregate function we've defined, Avg(), which calculates an average, will then calculate the average value over the rows in a single group/partition.

For the Avg() function the order in which the rows are processed inside the partition isn't really important. For other functions, it is however. Take the Rank() function, which calculates the rank of the current row within the group/partition. Building on the previous query, let's say we want to know the rank of the product within the category based on its unit price, so i.o.w.: the more expensive the product, the higher its rank number (so the lower its rank). This looks something like this:

QuerySpecGenerated SQL

// QuerySpec query
var qf = new QueryFactory();
var q = qf.Product
          .Where(ProductFields.Discontinued.Equal(false))
          .Select(ProductFields.ProductId, ProductFields.CategoryId, 
                  ProductFields.ProductName, ProductFields.UnitPrice,
                  WindowFunctions.Rank()
                        .Over(WindowPartition.By(ProductFields.CategoryId),
                              ProductFields.UnitPrice.Ascending()).As("ProductRankInCategory"));
var results = adapter.FetchQuery(q);

-- value of the @p1 parameter is: 0
SELECT [Northwind].[dbo].[Products].[ProductID]  AS [ProductId],
       [Northwind].[dbo].[Products].[CategoryID] AS [CategoryId],
       [Northwind].[dbo].[Products].[ProductName],
       [Northwind].[dbo].[Products].[UnitPrice],
       RANK() OVER(PARTITION BY [Northwind].[dbo].[Products].[CategoryID] 
                   ORDER BY [Northwind].[dbo].[Products].[UnitPrice] ASC) AS [ProductRankInCategory]
FROM   [Northwind].[dbo].[Products]
WHERE  (([Northwind].[dbo].[Products].[Discontinued] = @p1))

This results in the following result-set:

ProductId	CategoryId	ProductName	UnitPrice	ProductRankInCategory
75	NULL	Rhönbräu Klosterbier	7.75	1
67	1	Laughing Lumberjack Lager	14.00	1
34	1	Sasquatch Ale	14.00	1
70	1	Outback Lager	15.00	3
76	1	Lakkalikööri	18.00	4
35	1	Steeleye Stout	18.00	4
1	1	Chai	18.00	4
39	1	Chartreuse verte	18.00	4
2	1	Chang	19.00	8
43	1	Ipoh Coffee	46.00	9
38	1	Côte de Blaye	263.50	10
3	2	Aniseed Syrup	10.00	1
77	2	Original Frankfurter grüne Soße	13.00	2
15	2	Genen Shouyu	15.50	3
66	2	Louisiana Hot Spiced Okra	17.00	4
44	2	Gula Malacca	19.45	5
65	2	Louisiana Fiery Hot Pepper Sauce	21.05	6
4	2	Chef Anton's Cajun Seasoning	22.00	7
6	2	Grandma's Boysenberry Spread	25.00	8
61	2	Sirop d'érable	28.50	9
8	2	Northwoods Cranberry Sauce	40.00	10
63	2	Vegie-spread	43.90	11

We see that per category Rank() creates rankings and keeps track of duplicates. All with an easy construct.

Window functions go further. The above examples all use the default window frame bounds, which are by default equal to 'all rows in the partition', but what if you want to define a part of that partition, e.g. you want to have a set of rows relative to the current row, for e.g. a sliding average? You can, using the window bounds definition of the OVER() clause. LLBLGen Pro supports that too.

Say in our average example above, we want to calculate the average unit price within a category over the subset 'current row and 3 rows preceding the current row'. This already sounds rather complicated if you don't have a window function. With window functions however, not a problem!

We already have a partition defined, namely over CategoryID, we then have to add to that partition definition an Order By clause and a window bounds for the window frame. The Order By is needed as a window frame is required to work on an ordered set:

QuerySpecGenerated SQL

// QuerySpec query
var qf = new QueryFactory();
var q = qf.Product
          .Where(ProductFields.Discontinued.Equal(false))
          .Select(ProductFields.ProductId, ProductFields.CategoryId, 
                  ProductFields.ProductName, ProductFields.UnitPrice,
                  ProductFields.UnitPrice.Avg()
                        .Over(WindowPartition.By(ProductFields.CategoryId),
                              ProductFields.UnitPrice.Ascending(),
                              WindowBounds.Rows(3)).As("AvgUnitPriceSubWindow"));
var results = adapter.FetchQuery(q);

-- value of the @p1 parameter is: 0
SELECT  [Northwind].[dbo].[Products].[ProductID] AS [ProductId], 
        [Northwind].[dbo].[Products].[CategoryID] AS [CategoryId], 
        [Northwind].[dbo].[Products].[ProductName], 
        [Northwind].[dbo].[Products].[UnitPrice], 
        AVG([Northwind].[dbo].[Products].[UnitPrice]) 
            OVER(PARTITION BY [Northwind].[dbo].[Products].[CategoryID] 
                 ORDER BY [Northwind].[dbo].[Products].[UnitPrice] ASC 
                 ROWS 3 PRECEDING) AS [AvgUnitPriceSubWindow] 
FROM [Northwind].[dbo].[Products] 
WHERE ( ( [Northwind].[dbo].[Products].[Discontinued] = @p1))

Which results in the following result-set:

ProductId	CategoryId	ProductName	UnitPrice	AvgUnitPriceSubWindow
75	NULL	Rhönbräu Klosterbier	7.75	7.75
67	1	Laughing Lumberjack Lager	14.00	14.00
34	1	Sasquatch Ale	14.00	14.00
70	1	Outback Lager	15.00	14.3333
76	1	Lakkalikööri	18.00	15.25
35	1	Steeleye Stout	18.00	16.25
1	1	Chai	18.00	17.25
39	1	Chartreuse verte	18.00	18.00
2	1	Chang	19.00	18.25
43	1	Ipoh Coffee	46.00	25.25
38	1	Côte de Blaye	263.50	86.625
3	2	Aniseed Syrup	10.00	10.00
77	2	Original Frankfurter grüne Soße	13.00	11.50
15	2	Genen Shouyu	15.50	12.8333
66	2	Louisiana Hot Spiced Okra	17.00	13.875
44	2	Gula Malacca	19.45	16.2375
65	2	Louisiana Fiery Hot Pepper Sauce	21.05	18.25
4	2	Chef Anton's Cajun Seasoning	22.00	19.875
6	2	Grandma's Boysenberry Spread	25.00	21.875
61	2	Sirop d'érable	28.50	24.1375
8	2	Northwoods Cranberry Sauce	40.00	28.875
63	2	Vegie-spread	43.90	34.35

As you can see, the average value is calculated of the current row and at most 3 preceeding rows of the current row, effectively creating a sliding average, without complicated group by constructs or other constructs.

LLBLGen Pro's powerful fluent query API QuerySpec as well as the low-level query API support window functions in full and allow you to leverage the full potential of what SQL Window Functions have to offer. All by using your entity model and no need for dropping down to hard-coded SQL strings.

See for more details on this new functionality the Window Functions section in the LLBLGen Pro Runtime Framework Documentation.

Happy querying!

Why wait?

Become more productive today.