Use ROW_NUMBER() and not Min/Max functions

For those of you who are SQL experts, this might sound like a trivial suggestion, but if you are not already familiar with the SQL function ROW_NUMBE


A person typing on a laptop keyboardFor those of you who are SQL experts, this might sound like a trivial suggestion, but if you are not already familiar with the SQL function ROW_NUMBER(), keep reading. In fact look it up.

The syntax is ROW_NUMBER() OVER(PARTITION BY [name of field you want to group by] ORDER BY [name of field you want to sort by]).

This function assigns a number to each record returned in a query based on a grouping (or partition) and sort order that you specify. So, if you want the first or last record of something (first order date, last encounter, latest documented flowsheet record, etc...), this function will make your code much faster and accurate than trying to get to the same answer with the combination of Min and Max.

Here is why: min and max are aggregate functions (just like a sum and count). They operate implicitly at the level of grouping where they are applied. Using them can easily result in some convoluted queries (worse: inaccurate in some instances). Also these aggregate functions tend to be more taxing on performance. Row_Number() on the other hand is not an aggregate function. It simply assigns a rank to records and works much faster. The only caveat is that you have to first assign the ranking before you can use it. So, it is leveraged primarily with a subquery.

 

Similar posts

Stay on the leading edge of Healthcare Analytics

Discover new ways to enhance and optimize your data analytics function using the most advanced tools and industry knowledge available today.