Adding controlling features to Financial Statements in Power BI
Hi there 👋 and welcome back! This is article 3 of 4 in the series about modeling data for Financial Statements in Power BI. If you missed the first two articles, please see the links below.
And if you would like to follow along, please remember to subscribe.
The 4 articles in the series:
Link to the GitHub code repository where the SQL code, as well as the Power BI template file, can be found.
In this article, we will follow up on the modeling done last week and add error handling and controlling features - still using SQL. The article will be structured into 3 parts:
The Assumptions
Before we get started, I need to address the one major assumption.
Last week, I mentioned that there was an assumption that one shared version of each Account Schedule is used for reporting across all companies.
Building on that, it is assumed that any given AccountNo is used in the same way across Companies. This means that a single account number can't be used, for example, as turnover in one company and COGS in another.
Possible Solutions
As I see it, there are two possible ways to circumvent this:
This could be a whole other article series, and maybe it will be in the future, but for now, let's stick with the assumption.
The Errors
Initially, it might be a good idea to define what the goal is. What is an error, after all?
What is the goal?
Simply put, we want to ensure that every single relevant account (Income and/or Balance) is mapped once and only once - that’s it!
Elaborating a bit further, we can say that an error is anything that causes the above statement to be untrue. Consequently, what we want is to be able to alert the end user and, as precisely as possible, tell them what the error is so they can fix it themselves.
Please refer to the examples below for this next part.
The 6 Types of Errors
Moving on to the 6 error types. They can be broadly categorized according to the 3 purposes they serve: 1) Indicating when something that should be included isn't. 2) Indicating when something is included more than once. 3) Providing general assistance to the user by catching typos that result in something not being included (as intended).
1. Unmapped
Unmapped serves the first category: Indicating when something that should be included isn't.
If a row is not referenced in the totaling column of any other rows, it gets caught as “Unmapped.”
Any account not included that should have been would also fit this error definition, even though it is handled differently. In this case, I chose to add them to an “Unmapped” grouping - more on what that looks like in Power BI next week.
2. Multi Mapping
Multi Mapping serves the second category: Showing when something is included more than once.
If accounts or rows that are referenced in totaling are also referenced by other rows, it gets caught as “Multi Mapping.”
3. Duplicate Row No
Another example of the second category.
If the same row number is used twice within a given account schedule, any reference would also reference two rows. This is most likely not the user's intention and could ultimately result in the same account being included more than once.
4. Unknown Row Type
All logic strictly uses the three types: Account, Group, and SubTotal. Anything else will simply be ignored.
5. Unknown Row No Reference
Group and SubTotal row types reference other rows. If no rows match the reference, it is either a typo or the user meant to reference accounts and should instead use that row type.
The result is again that something most likely isn’t being included as the user intended.
Recommended by LinkedIn
6. Unknown Account Reference
This is basically the same as “Unknown Row No Reference” but for the Account row type. If the totaling doesn't reference any known accounts, we warn the user as this is again either a typo or they meant to use another row type.
Examples
If you took the challenge last week and tried to spot the errors I had included, here they are! They are also examples of error types 1, 2, and 3, so let's quickly go over them.
Honorable Mention
In Power BI, you can sort fields by other fields in the same table.
The first time I made a setup similar to this, Power BI would give an error if the text and sorting fields weren't 1:1 (for instance, if "Turnover" had both 1 and 2 as its sorting values).
It seems that this is no longer the case. I have chosen not to include it as it doesn’t cause an error in Power BI and it isn't an error as defined above.
The Code
And now to the exciting part: The Code!
Again, I've chosen to highlight 3 parts of the code that I think do some cool things.
See the GitHub code repository also referenced at the top of the article for the complete code.
This part will be kind of nerdy - you have been warned!
OK? Let's begin!
Window Functions
I like window functions!
You can do all kinds of cool things with them. If you haven't used them yet, you are missing out!
In this project, I've used them multiple times, specifically for error handling. One example is making sure account numbers are unique.
In the case below, I can't be sure that AccountNo is unique because I also need the IncomeBalance field, and it is possible that it has two different values for the same AccountNo. This situation could result in the statement returning two rows for the same AccountNo, meaning it won't be unique.
Just to go through what the window function does for us here: ROW_NUMBER() returns a unique number for each row. It starts over again from 1 for each group defined by PARTITION BY - in my case, the AccountNo. The ORDER BY is required. It is also how you control the ordering of the rows within the partition, i.e., which row gets the 1.
In my case, I get RowNumber = 1 for all account numbers, but if I had the same account number as both Income and Balance, one of them would get RowNumber = 2 and be excluded by the WHERE statement, thereby ensuring unique account numbers.
WHERE (NOT) EXISTS
You can use join-based logic to check whether values are (or aren't) in other tables. However, if you've ever tried to join tables on columns that aren't unique, you'll know that you can end up with duplicate values.
In such cases, WHERE EXISTS can be used to verify the existence of values without risking duplication in your main query.
In the example, I add the NOT keyword to check if any row numbers don't exist, i.e., they aren't being referenced by other rows by matching the ChildRowNo (which is the reference) against the row numbers of the account schedule.
Note that when joining, we use the ON clause to tell the SQL engine which columns to match tables on. When using this syntax, we write the clause as we would with a join, but instead of ON, we write WHERE, as seen below.
Another Window Function…
Did I mention that I like window functions? 😂
I just realized that all the error handling either involves window functions or the “where not exists” syntax. You always need three examples, so here's another window function.
By including both ScheduleName and AccountNo in the partition by clause, we count how many times each account is referenced within a given account schedule. If it's more than 1, it's a Multi Mapping error - simple enough.
We then do the same for rows and use UNION to append the two result sets. It shouldn't be possible to get the same AccountScheduleId in both selects, but just in case, UNION rather than UNION ALL ensures uniqueness.
In Conclusion
If you're still here, it means you might want to try it out yourself.
In that case, I've included some demo data in the GitHub repo as well.
But before you go, I'm curious to know what you think. 🤔
In next week's article, I'll wrap up the series by bringing it all together in Power BI, so stay tuned!