Query performance of CALCULATE function in DAX

Recently, I’ve been working with a large inventory dataset (1.8M transactions), and trying to find ways to identify the sequence of transactions.  The dataset is very simple, with an assetID, a transaction date, and a transaction ID.  The latter is an identity in the database, and is guaranteed to be unique and incremental.

Calculating the previous transaction kept running the local database engine out of memory, even when I’d tried to narrow the scope of the calculation down.  After some research, it turned out that DAX CALCULATE evaluates filter parameters somewhat differently then I’d intuited.

Here’s what I was trying to acheive.  I want to find the last transaction ID for each asset (all 1.8M of them).  The filter of the calculate function is taking the current assetID from the current row context, and finding the largest transaction ID from the transactions where the date is earlier than the current row’s date.

prvtrxn =
CALCULATE (
    MAX ( ‘Historical trxns'[trxnID] ),
‘Historical trxns'[HistoricalAssetID]  = EARLIER ( ‘Historical trxns'[HistoricalAssetID] ),
‘Historical trxns'[DateStamp] < EARLIER ( ‘Historical trxns'[DateStamp] )
)

Logically, this is entirely correct, however the query plan that it generates is non optimal.  It turns out that calculate executes it’s filter parameters concurrently, and intersects the results before calculate the MAX() function.  Marco Russo describes a similar behaviour on nested CALCULATE functions in this article.  He points out that the filter parameters of CALCULATE are executed independently from each other.  In an older article describing filter contexts, he points out the interchangeable behaviour of the filter parameter in calculate.  For example, the following two statements would have the identical execution plan.

CALCULATE (
    MAX ( ‘Historical trxns'[trxnID] ),
‘Historical trxns'[HistoricalAssetID] = 12345
)

CALCULATE (
    MAX ( ‘Historical trxns'[trxnID] ),
    FILTER ( ALL ( ‘Historical trxns’ ), [HistoricalAssetID] = 12345 )
)

However, it turns out that the previous statement with two filter conditions that ran out of memory was effectively like having two filter statements as arguments, each being executed independently.  The solution is to have one filter statement, which finds the subset of Historical trxns before running the aggregate on it.

prvtrxn =
CALCULATE (
    MAX ( ‘Historical trxns'[trxnID] ),
    FILTER (
‘Historical trxns’,
‘Historical trxns'[HistoricalAssetID]= EARLIER ( ‘Historical trxns'[HistoricalAssetID] )
&& ‘Historical trxns'[DateStamp] < EARLIER ( ‘Historical trxns'[DateStamp] )
    )
)

Far better an approximate answer to the right question, which is often vague, than the exact answer to the wrong question, which can always be made precise. -John Tukey
The plural of anecdote is not data. - John Myles White

Recent Posts

RSS PowerBI blog

  • New Power BI Premium summary and workload metrics available in the admin portal April 22, 2019
    We have rolled out a new high-level summary metrics experience in the Power BI Admin Portal . This new experience replaces the four summary tiles you previously saw : CPU, Memory thrashing, Memory usage and DirectQuery with wide range of metrics measuring the summary of usage in the last 7 days to better portray the health […]
  • Live now, full session catalog for Microsoft Business Applications Summit 2019 April 22, 2019
    Time to get excited – the full session catalog for Microsoft Business Applications Summit is here. Explore every breakout session and workshop coming to the conference, taking place in Atlanta, Georgia June 10 – 11, 2019. Get ready to flex your skills – and build new ones – with all things Power BI and beyond. […]
  • E-Mail Subscriptions for Paginated Reports is Now Available April 19, 2019
    E-mail subscriptions is one of the key features used by millions of SQL Server Reporting Services customers today. Our latest release for “New Feature Friday” is the support of Paginated Reports in e-mail subscriptions in Power BI. Now, for the first time, you can schedule an e-mail subscription with a PDF attachment of your full […]