Introduction to Power BI DAX Fundamentals
Data Analysis Expressions (DAX) is the formula language used to analyze data in Microsoft Power BI, Power Pivot in Excel, and SQL Server Analysis Services. Knowing the main concepts of Power BI DAX is important to get the most out of it.
In this blog post, we’ll cover the Power BI DAX fundamentals, including data modeling, calculated columns, measures, contexts, variables, formula types, and more. Let’s get started.
Data Modeling and Relationships
Data modeling means arranging data into tables and creating relationships between them. Power BI has a Power Query Editor and a Data view to help with data modeling. The data model forms the foundation of a Power BI report and decides how data is shown.
Making a good data model is key for creating precise and fast reports. Here are some tips for data modeling:
Keep it simple: Only use tables and columns needed for analysis. Too many calculated columns and measures can slow down a report.
Use clear names: Give tables, columns, and measures names that are easy to understand.
Set relationships properly: Understand how tables are related and make sure to set the direction and cardinality correctly.
Use correct data types: Choose the appropriate data types for each column to make sure calculations are correct and quick.
In Power BI DAX, a relationship refers to linking two tables based on common columns, such as customer ID or product ID, to enable users to analyze data from both tables together. This connection allows users to explore and analyze data from different tables without the need to merge them, resulting in faster and more efficient data analysis.
Power BI relationships can have the following cardinalities:
- One-to-many: One record in the first table can be linked to many records in the second table, but each record in the second table can be linked to only one record in the first table. This is the most widely used type of relationship.
- Many-to-one: The same relationship as one-to-many, seen from the other table. Multiple records in one table are linked to a single record in another table.
- One-to-one: Each record in one table is linked to at most one record in the other table.
- Many-to-many: Multiple records in one table can be linked to multiple records in another table.
It’s important to think about the cardinality and cross-filter direction of a relationship when creating one between tables. Cardinality describes whether the values in the related column are unique on each side of the relationship, which determines whether the relationship is one-to-many, many-to-one, one-to-one, or many-to-many. Cross-filter direction determines how filters propagate between the tables: with a single direction, filters flow from the table on the “one” side to the table on the “many” side; with both directions, filters flow either way.
Understanding relationships is crucial in Power BI DAX because it affects how the data is aggregated and calculated.
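As a minimal sketch of how a relationship is used inside a formula, assume a hypothetical Products table (one row per product, with a Category column) related one-to-many to the Sales table on a ProductID column. Neither the Products table nor these column names come from this post; they are placeholders for illustration.

```dax
// Calculated column on the Sales table (the "many" side).
// RELATED follows the relationship to the "one" side and returns
// the category of the product sold in the current row.
Product Category = RELATED ( Products[Category] )

// Calculated column on the hypothetical Products table (the "one" side).
// RELATEDTABLE returns the Sales rows linked to the current product.
Sales Row Count = COUNTROWS ( RELATEDTABLE ( Sales ) )
```

Both columns only work when the relationship exists in the model, which is why cardinality and direction need to be set correctly first.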
Calculated Columns
Calculated columns are additional columns in a table that are calculated using a formula provided by the user. The formula can consist of other columns, functions, or constants. Once created, a calculated column can be used in other calculations, such as measures or other calculated columns.
For instance, using the sales and cost columns in a sales table, a user can create a calculated column that calculates the profit by subtracting the cost from the sales.
Profit = Sales[Sales] - Sales[Cost]
Measures
Measures are calculations that you create in Power BI based on data from tables or data models. Unlike calculated columns, which are computed row by row when the data is refreshed and then stored in the model, measures are evaluated at query time in the context of the visualization or report.
Measures are used for data aggregation, ratio and percentage calculations, and other complex calculations. Common examples of measures include sum, average, minimum, maximum, count, and distinct count.
For instance, if you have a sales table with columns for sales and cost, you can create a measure to calculate the total sales using the SUM function as shown below:
Total Sales = SUM(Sales[Sales])
Measures can also be used to calculate ratios, percentages, and other advanced calculations.
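A ratio measure might look like the following sketch. It assumes the Sales table has numeric Sales and Cost columns, as described earlier in this section; the measure name is only an example.

```dax
// Measure: profit as a share of sales in the current filter context.
// DIVIDE returns BLANK instead of an error when total sales is zero.
Profit Margin % =
DIVIDE (
    SUM ( Sales[Sales] ) - SUM ( Sales[Cost] ),
    SUM ( Sales[Sales] )
)
```

Because it is a measure, the same formula returns the margin per product, per year, or overall, depending on where it is placed in a report.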
Types of Context
Context is an essential concept in Power BI DAX: it determines which rows and values a formula sees when it is evaluated. There are two types of context in Power BI DAX:
- Row context: Row context is the context in which a formula is evaluated for each row in a table. For example, if you have a table with sales data for multiple products, the row context is the individual product row.
- Filter context: Filter context is the context in which a formula is evaluated based on the filters applied to a report or visualization. For example, if you have a report that is filtered by date, the filter context is the selected date range.
Understanding the difference between row context and filter context is crucial to writing effective DAX formulas.
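The two contexts can be seen side by side in the sketch below, which again assumes a Sales table with Sales and Cost columns. The first definition is a calculated column and the second is a measure.

```dax
// Calculated column on Sales: evaluated once per row (row context).
// Sales[Sales] and Sales[Cost] refer to the values in the current row.
Profit = Sales[Sales] - Sales[Cost]

// Measure: evaluated once per cell of a visual (filter context).
// SUM aggregates only the rows that remain after the report's filters.
Total Profit = SUM ( Sales[Sales] ) - SUM ( Sales[Cost] )
```

A calculated column has a row context but no filter context, so writing SUM ( Sales[Sales] ) inside one returns the grand total on every row rather than the value of the current row.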
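Context transition is the process by which DAX converts a row context into an equivalent filter context. It happens when CALCULATE or CALCULATETABLE is evaluated inside a row context, for example in a calculated column or inside an iterator such as SUMX. Referencing a measure inside a row context triggers it as well, because a measure reference is implicitly wrapped in CALCULATE.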
For example, the CALCULATE function evaluates an expression in a modified filter context. The following measure sums sales for 2021 only, replacing any existing filter on Dates[Year]:
Total Sales 2021 = CALCULATE(SUM(Sales[Sales]), Dates[Year] = 2021)
Understanding context transition is essential to write complex DAX formulas in Power BI that correctly evaluate data in different contexts.
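The effect of context transition is easiest to see in a calculated column. The sketch below assumes a hypothetical Products table related one-to-many to Sales; that table is not part of this post's examples.

```dax
// Calculated columns on the hypothetical Products table.

// No context transition: the row context alone does not filter Sales,
// so every product row shows the total for all products.
All Sales = SUM ( Sales[Sales] )

// Context transition: CALCULATE turns the current Products row into
// a filter, which flows through the relationship to Sales.
Product Sales = CALCULATE ( SUM ( Sales[Sales] ) )
```

The only difference between the two columns is the CALCULATE wrapper, yet the second returns a different value for each product.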
Variables
Variables are used in Power BI DAX to store values that can be referenced and reused throughout a formula. They are similar to variables in programming languages and can make DAX formulas more readable and easier to maintain.
To create a variable, you use the VAR keyword followed by a variable name, an equals sign, and a Power BI DAX expression. The variable can then be referenced using the variable name throughout the formula.
For example, suppose you have a table that contains the sales data for a company, including columns for the year and the total sales (called 'Sales'[Year] and 'Sales'[Total Sales] here). You want to calculate the average sales per year for the last five years. You can use a variable to store the current year and then use it to restrict the average to the last five years. The formula would look like this:
Avg Sales Last 5 Years =
VAR CurrentYear = MAX('Sales'[Year])
RETURN
    CALCULATE(AVERAGE('Sales'[Total Sales]),
        'Sales'[Year] >= CurrentYear - 4 && 'Sales'[Year] <= CurrentYear)
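Variables also help when the same intermediate result is needed more than once. The following sketch computes year-over-year growth; it assumes a Sales[Sales] column and a Dates table with a contiguous Date column that is related to Sales. The Dates[Date] column name is an assumption.

```dax
// Each variable is evaluated once and then reused in the RETURN expression.
Sales Growth % =
VAR CurrentSales = SUM ( Sales[Sales] )
VAR PriorSales =
    CALCULATE ( SUM ( Sales[Sales] ), SAMEPERIODLASTYEAR ( Dates[Date] ) )
RETURN
    DIVIDE ( CurrentSales - PriorSales, PriorSales )
```

Without variables, the prior-year expression would have to be written twice, once in the numerator and once in the denominator.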
Formula Types in Power BI DAX
Categories of Power BI DAX Functions
Power BI DAX functions can be broadly classified into the following categories:
- Aggregation Functions: To perform aggregation operations such as SUM, AVERAGE, COUNT, MIN, MAX, etc. on a column or set of columns in a table.
- Date and Time Functions: To perform operations on date and time data types, such as YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, etc.
- Text Functions: To manipulate text data, such as CONCATENATE, LEFT, RIGHT, UPPER, LOWER, SUBSTITUTE, etc.
- Information Functions: To retrieve information about the data, such as ISBLANK, ISTEXT, ISNUMBER, etc.
- Logical Functions: To evaluate logical expressions, such as IF, AND, OR, NOT, etc.
- Filter Functions: To filter data based on certain conditions, such as FILTER, CALCULATETABLE, ALL, etc.
- Table Functions: To perform operations on tables, such as SUMMARIZE, GROUPBY, UNION, etc.
- Time Intelligence Functions: To perform time-based calculations, such as SAMEPERIODLASTYEAR, TOTALYTD, etc.
- Statistical Functions: To perform statistical analysis on data, such as STDEV.S, STDEV.P, MEDIAN, etc.
- Math and Trig Functions: To perform mathematical operations, such as SQRT, POWER, ABS, etc.
- Parent-Child Functions: To handle parent-child hierarchies, such as PATH, PATHITEM, etc.
- INFO Functions: To retrieve metadata about the data model, such as INFO.TABLES, INFO.COLUMNS, INFO.MEASURES, etc.
By understanding the different types of functions available in Power BI DAX, you can choose the right function for the job and create more powerful and flexible formulas.
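Functions from different categories are often combined. As a rough sketch, the measures below mix aggregation, time intelligence, logical, information, and text functions. They assume a Sales[Sales] column and a Dates table with a Date column; both names are placeholders.

```dax
// Time intelligence + aggregation: running total from the start of the year.
Sales YTD = TOTALYTD ( SUM ( Sales[Sales] ), Dates[Date] )

// Logical + information + text: show a label instead of a blank result.
Sales YTD Label =
IF (
    ISBLANK ( [Sales YTD] ),
    "No sales",
    FORMAT ( [Sales YTD], "#,0" )
)
```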
Iterator and Non-Iterator Functions in Power BI DAX
Types of DAX Formulas based on Iterator and Non-Iterator:
DAX formulas can also be categorized based on whether they use iterator functions or non-iterator functions.
Iterator functions operate on a row-by-row basis, calculating a result for each row of a table. They are used to perform calculations that depend on the context of the current row. Some common iterator functions are:
- SUMX: Calculates the sum of an expression evaluated over a table.
- AVERAGEX: Calculates the average of an expression evaluated over a table.
- MINX: Calculates the minimum value of an expression evaluated over a table.
- MAXX: Calculates the maximum value of an expression evaluated over a table.
- COUNTX: Counts the rows of a table for which an expression evaluates to a non-blank value.
For example, the SUMX function is used to sum the sales amount for all dates on or after January 1, 2022, in the following expression:
Sales Since 2022 = SUMX(FILTER(Sales, Sales[Date] >= DATE(2022,1,1)), Sales[Sales])
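Iterators are most useful when the value to aggregate has to be computed for each row first. A minimal sketch, assuming the Sales table has Sales and Cost columns:

```dax
// SUMX evaluates the expression for each row of Sales (row context),
// then sums the per-row results.
Profit via SUMX = SUMX ( Sales, Sales[Sales] - Sales[Cost] )

// AVERAGEX works the same way but averages the per-row results.
Avg Profit per Row = AVERAGEX ( Sales, Sales[Sales] - Sales[Cost] )
```

A plain SUM or AVERAGE cannot express these directly, because those functions accept only a column reference, not an expression.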
Non-iterator functions operate on an entire column at once. They are used for calculations that do not depend on the context of the current row, typically simple aggregations. Some common non-iterator functions are:
- SUM: Calculates the sum of a column.
- AVERAGE: Calculates the average of a column.
- MIN: Calculates the minimum value in a column.
- MAX: Calculates the maximum value in a column.
- COUNT: Counts the number of non-blank values in a column.
For example, the SUM function is used to sum the sales amount for all dates in the following expression:
Total Sales = SUM(Sales[Sales])
Non-iterator functions are the simpler and usually the more efficient choice for aggregating a single column, whereas iterator functions provide more flexibility and are necessary when the value to aggregate must be computed row by row. By knowing the differences between the two types of functions, you can select the appropriate function for the task at hand and create more effective and efficient DAX formulas.
Commonly used DAX functions grouped by category
- SUM: Calculates the sum of a column.
- SUMX: Calculates the sum of an expression over a table.
- AVERAGE: Calculates the average of a column.
- AVERAGEX: Calculates the average of a DAX expression over a table.
- COUNT: Counts the number of non-blank values in a column.
- COUNTX: Counts the rows of a table for which an expression returns a non-blank value.
- MAX: Returns the maximum value in a column.
- MAXX: Returns the maximum value of an expression over a table.
- MIN: Returns the minimum value in a column.
- MINX: Returns the minimum value of an expression over a table.
- DISTINCTCOUNT: Returns the count of distinct values in a column.
- FILTER: Returns a table after evaluating the filter expression.
- ALL: Returns all the rows in a table or column, ignoring any filters that might have been applied.
- VALUES: Returns a table with the distinct values from a column.
- REMOVEFILTERS: Removes all filters from a table or column.
- RELATED: Returns a related value from another table by following an existing relationship.
- RELATEDTABLE: Returns the rows of a related table that are linked to the current row.
- CROSSFILTER: Specifies the cross-filter direction to use for an existing relationship while a CALCULATE expression is evaluated.
- IF: Evaluates a logical expression and returns one value if the expression is true and another value if the expression is false.
- SWITCH: Evaluates an expression against a list of values and returns a result corresponding to the first matching value.
- HASONEVALUE: Returns a Boolean value indicating whether a column has only one distinct value.
- ISFILTERED: Returns a Boolean value indicating whether a table or column is filtered.
- ISCROSSFILTERED: Returns a Boolean value indicating whether a table or column is filtered indirectly, through a filter on another column in the same table or in a related table.
- CALCULATE: Evaluates a DAX expression in a modified filter context and returns a single value.
- CALCULATETABLE: Evaluates a table expression in a modified filter context and returns a table.
- SUMMARIZE: Aggregates data over a table and groups the results by one or more columns.
- SUMMARIZECOLUMNS: Aggregates data over a table and returns the result as a DAX table.
- GROUPBY: Groups data over a table and applies aggregations (one or more) to the groups.
- CONCATENATEX: Concatenates the values of a column into a string using a provided delimiter.
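- DIVIDE: Divides one number by another and returns an alternate result (BLANK by default) when the divisor is zero.
- RANKX: Returns the rank of a value among the results of a DAX expression evaluated for each row of a table.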
- TOPN: Returns the top N rows of a table based on a specified DAX expression.
- COUNTROWS: Counts the number of rows in a Power BI table.
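A few of these functions are sketched below in combination. The measures assume a Sales[Sales] column and a hypothetical Products table with a ProductName column related to Sales. The table name, column name, and thresholds are illustrative only.

```dax
// RANKX + ALL: rank each product by sales. ALL removes the product
// filter so that every product takes part in the comparison.
Product Rank =
RANKX ( ALL ( Products[ProductName] ), CALCULATE ( SUM ( Sales[Sales] ) ) )

// HASONEVALUE + VALUES: return the product name only when exactly
// one product is present in the filter context.
Selected Product =
IF (
    HASONEVALUE ( Products[ProductName] ),
    VALUES ( Products[ProductName] ),
    "Multiple products"
)

// SWITCH ( TRUE (), ... ): the first condition that is true wins.
Sales Band =
SWITCH (
    TRUE (),
    SUM ( Sales[Sales] ) >= 100000, "High",
    SUM ( Sales[Sales] ) >= 10000, "Medium",
    "Low"
)
```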
In conclusion, Power BI DAX is a powerful language that enables data analysts to create complex calculations and formulas to analyze and visualize data in Power BI and other Microsoft tools. Understanding the key concepts of calculated columns, measures, contexts, data modeling, and relationships is essential for creating effective Power BI DAX formulas. By mastering DAX, data analysts can unlock the full potential of their data and gain valuable insights that can drive business success.