I am a supporter of
St. Joseph's hospice.
 If you find this site useful or if it helped you, consider a small donation to
St. Joseph's, please.

Information on
St. Joseph's

JustGiving - Sponsor me now!

 

Count Unique with Criteria

A simple count of unique values without taking any criteria into account you can find here. But what if you have some criteria and you do not want to re-arrange your spreadsheet so that it shows only relevant data?

One option might be a Pivot table with a helper column. Personally I would prefer two subsequent Pivot tables.

Another option is using the advanced filter.

The two options I would like to present here are a worksheet formula approach and a VBA approach, though. This is because I think it is worth to learn from the pitfalls of the suboptimal worksheet formula.

Let us assume you have a company in 10 regions with 200 employees who are selling some goods each day. And now you want to know how many (unique) employees per region sold anything in the first quarter of 2010.

We have the sales data in sheet Data:

20101030_PB_01_Count_Unique_with_Criteria_Screen

If you are interested in downloading a 190 KB Excel 2010 sample file which I used to create this data and these results, go to my Download page, please.

Now we define some convenient named ranges:

20101030_PB_01_Count_Unique_with_Criteria_Names

In sheet ResultWS we can provide now T. Valko's (Biff) clever worksheet array formula solution:

20101030_PB_02_Count_Unique_with_Criteria_Screen

The VBA approach is pretty straight forward:

The named ranges help us to phrase this formula in a neat way. We need to copy this array formula for each appearing region, and the named ranges make these formulae volatile, but that does not matter too much, right?

Hmm, let us see ...

20101030_PB_03_Count_Unique_with_Criteria_Screen
20101030_PB_01_Count_Unique_with_Criteria_Code
20101030_PB_02_Count_Unique_with_Criteria_Code

For both the worksheet function approach and the VBA approach I ran simulations on my dual core laptop for 10, 50 and 100 regions and for 1000, 5000, 10000 and 20000 rows of data and I measured the runtime with FastExcel ©.

20101030_PB_04_Count_Unique_with_Criteria_Screen

Now you can see that the worksheet formula approach shows a quadratic runtime depending on the number of data rows. If you double the number of data rows the runtime will increase by a factor of 4! The VBA approach runtime is linear - doubling the data rows only doubles the runtime.

Please note that a Pivot table approach would presumably be even faster than the VBA approach. But with runtimes around 1s for up to 20,000 rows of data I would be able to live pretty well.

Sulprobil   Get it done   Contact   Disclaimer   Download