In LLBLGen Pro v5.5 we've introduced a new feature called
Window functions support.
What exactly are 'window functions' and why are they so important? To answer these questions let's first look at a good description of what a window
function is. From the excellent PostgreSQL documentation:
A window function performs a calculation across a set of table rows that are somehow related to the current row. This is comparable to the type of calculation that can be done with an aggregate function. However, window functions do not cause rows to become grouped into a single output row like non-window aggregate calls would. Instead, the rows retain their separate identities. Behind the scenes, the window function is able to access more than just the current row of the query result.
So i.o.w. you can define per row a set of rows inside the resultset, perform an operation on these rows and return the result of that operation as a value in the current row. Which rows are in the window is defined by the window definition which is defined using an OVER()
clause. The operation applied on the rows is defined by the function the OVER()
clause is defined with.
This sounds all rather "I don't need this"-complex and artificial, however it's highly likely you will run into use cases where window functions would help a lot and you're currently forced to use either complex group by clauses or multiple queries and merge the results yourself. In fact, SQL experts say that once you've been exposed to window functions you really wonder how you have lived without them before that.
Let's look at a simple example. We're using the good ol' Northwind database for this. We'll fetch a list of product data and want to additionally calculate the average unit price of the category the product is in. With window functions this is very easy:
This will give the following result-set:
ProductId |
ProductName |
UnitPrice |
AverageUnitPriceInCategory |
75 |
Rhönbräu Klosterbier |
7.75 |
7.75 |
76 |
Lakkalikööri |
18.00 |
44.35 |
67 |
Laughing Lumberjack Lager |
14.00 |
44.35 |
70 |
Outback Lager |
15.00 |
44.35 |
1 |
Chai |
18.00 |
44.35 |
2 |
Chang |
19.00 |
44.35 |
34 |
Sasquatch Ale |
14.00 |
44.35 |
35 |
Steeleye Stout |
18.00 |
44.35 |
38 |
Côte de Blaye |
263.50 |
44.35 |
39 |
Chartreuse verte |
18.00 |
44.35 |
43 |
Ipoh Coffee |
46.00 |
44.35 |
44 |
Gula Malacca |
19.45 |
23.2181 |
61 |
Sirop d'érable |
28.50 |
23.2181 |
63 |
Vegie-spread |
43.90 |
23.2181 |
65 |
Louisiana Fiery Hot Pepper Sauce |
21.05 |
23.2181 |
66 |
Louisiana Hot Spiced Okra |
17.00 |
23.2181 |
3 |
Aniseed Syrup |
10.00 |
23.2181 |
4 |
Chef Anton's Cajun Seasoning |
22.00 |
23.2181 |
6 |
Grandma's Boysenberry Spread |
25.00 |
23.2181 |
8 |
Northwoods Cranberry Sauce |
40.00 |
23.2181 |
15 |
Genen Shouyu |
15.50 |
23.2181 |
77 |
Original Frankfurter grüne Soße |
13.00 |
23.2181 |
No complicated group by needed. It's also easy to understand how to define the window over the resultset: we want to have the average price per category, so we want all rows belonging to a category in a single group. The PARTITION
SQL keyword does just that: here it partitions the resultset over the CategoryID so all rows with the same CategoryID are in the same group. The aggregate function we've defined, Avg()
, which calculates an average, will then calculate the average value over the rows in a single group/partition.
For the Avg()
function the order in which the rows are processed inside the partition isn't really important. For other functions, it is however. Take the Rank()
function, which calculates the rank of the current row within the group/partition. Building on the previous query, let's say we want to know the rank of the product within the category based on its unit price, so i.o.w.: the more expensive the product, the higher its rank number (so the lower its rank). This looks something like this:
This results in the following result-set:
ProductId |
CategoryId |
ProductName |
UnitPrice |
ProductRankInCategory |
75 |
NULL |
Rhönbräu Klosterbier |
7.75 |
1 |
67 |
1 |
Laughing Lumberjack Lager |
14.00 |
1 |
34 |
1 |
Sasquatch Ale |
14.00 |
1 |
70 |
1 |
Outback Lager |
15.00 |
3 |
76 |
1 |
Lakkalikööri |
18.00 |
4 |
35 |
1 |
Steeleye Stout |
18.00 |
4 |
1 |
1 |
Chai |
18.00 |
4 |
39 |
1 |
Chartreuse verte |
18.00 |
4 |
2 |
1 |
Chang |
19.00 |
8 |
43 |
1 |
Ipoh Coffee |
46.00 |
9 |
38 |
1 |
Côte de Blaye |
263.50 |
10 |
3 |
2 |
Aniseed Syrup |
10.00 |
1 |
77 |
2 |
Original Frankfurter grüne Soße |
13.00 |
2 |
15 |
2 |
Genen Shouyu |
15.50 |
3 |
66 |
2 |
Louisiana Hot Spiced Okra |
17.00 |
4 |
44 |
2 |
Gula Malacca |
19.45 |
5 |
65 |
2 |
Louisiana Fiery Hot Pepper Sauce |
21.05 |
6 |
4 |
2 |
Chef Anton's Cajun Seasoning |
22.00 |
7 |
6 |
2 |
Grandma's Boysenberry Spread |
25.00 |
8 |
61 |
2 |
Sirop d'érable |
28.50 |
9 |
8 |
2 |
Northwoods Cranberry Sauce |
40.00 |
10 |
63 |
2 |
Vegie-spread |
43.90 |
11 |
We see that per category Rank()
creates rankings and keeps track of duplicates. All with an easy construct.
Window functions go further. The above examples all use the default window frame bounds, which are by default equal to 'all rows in the partition', but what if you want to define a part of that partition, e.g. you want to have a set of rows relative to the current row, for e.g. a sliding average? You can, using the window bounds definition of the OVER()
clause. LLBLGen Pro supports that too.
Say in our average example above, we want to calculate the average unit price within a category over the subset 'current row and 3 rows preceding the current row'. This already sounds rather complicated if you don't have a window function. With window functions however, not a problem!
We already have a partition defined, namely over CategoryID, we then have to add to that partition definition an Order By clause and a window bounds for the window frame. The Order By is needed as a window frame is required to work on an ordered set:
// QuerySpec query
var qf = new QueryFactory();
var q = qf.Product
.Where(ProductFields.Discontinued.Equal(false))
.Select(ProductFields.ProductId, ProductFields.CategoryId,
ProductFields.ProductName, ProductFields.UnitPrice,
ProductFields.UnitPrice.Avg()
.Over(WindowPartition.By(ProductFields.CategoryId),
ProductFields.UnitPrice.Ascending(),
WindowBounds.Rows(3)).As("AvgUnitPriceSubWindow"));
var results = adapter.FetchQuery(q);
-- value of the @p1 parameter is: 0
SELECT [Northwind].[dbo].[Products].[ProductID] AS [ProductId],
[Northwind].[dbo].[Products].[CategoryID] AS [CategoryId],
[Northwind].[dbo].[Products].[ProductName],
[Northwind].[dbo].[Products].[UnitPrice],
AVG([Northwind].[dbo].[Products].[UnitPrice])
OVER(PARTITION BY [Northwind].[dbo].[Products].[CategoryID]
ORDER BY [Northwind].[dbo].[Products].[UnitPrice] ASC
ROWS 3 PRECEDING) AS [AvgUnitPriceSubWindow]
FROM [Northwind].[dbo].[Products]
WHERE ( ( [Northwind].[dbo].[Products].[Discontinued] = @p1))
Which results in the following result-set:
ProductId |
CategoryId |
ProductName |
UnitPrice |
AvgUnitPriceSubWindow |
75 |
NULL |
Rhönbräu Klosterbier |
7.75 |
7.75 |
67 |
1 |
Laughing Lumberjack Lager |
14.00 |
14.00 |
34 |
1 |
Sasquatch Ale |
14.00 |
14.00 |
70 |
1 |
Outback Lager |
15.00 |
14.3333 |
76 |
1 |
Lakkalikööri |
18.00 |
15.25 |
35 |
1 |
Steeleye Stout |
18.00 |
16.25 |
1 |
1 |
Chai |
18.00 |
17.25 |
39 |
1 |
Chartreuse verte |
18.00 |
18.00 |
2 |
1 |
Chang |
19.00 |
18.25 |
43 |
1 |
Ipoh Coffee |
46.00 |
25.25 |
38 |
1 |
Côte de Blaye |
263.50 |
86.625 |
3 |
2 |
Aniseed Syrup |
10.00 |
10.00 |
77 |
2 |
Original Frankfurter grüne Soße |
13.00 |
11.50 |
15 |
2 |
Genen Shouyu |
15.50 |
12.8333 |
66 |
2 |
Louisiana Hot Spiced Okra |
17.00 |
13.875 |
44 |
2 |
Gula Malacca |
19.45 |
16.2375 |
65 |
2 |
Louisiana Fiery Hot Pepper Sauce |
21.05 |
18.25 |
4 |
2 |
Chef Anton's Cajun Seasoning |
22.00 |
19.875 |
6 |
2 |
Grandma's Boysenberry Spread |
25.00 |
21.875 |
61 |
2 |
Sirop d'érable |
28.50 |
24.1375 |
8 |
2 |
Northwoods Cranberry Sauce |
40.00 |
28.875 |
63 |
2 |
Vegie-spread |
43.90 |
34.35 |
As you can see, the average value is calculated of the current row and at most 3 preceeding rows of the current row, effectively creating a sliding average, without complicated group by constructs or other constructs.
LLBLGen Pro's powerful fluent query API QuerySpec as well as the low-level query API support window functions in full and allow you to leverage the full potential of what SQL Window Functions have to offer. All by using your entity model and no need for dropping down to hard-coded SQL strings.
See for more details on this new functionality the Window Functions section in the LLBLGen Pro Runtime Framework Documentation.
Happy querying!