Monday, August 11, 2008

Sorting Cells in Excel Using Formulas

This post describes how to create a new table in Excel from an existing table, where the cells are sorted, and to do this using only formulas. As a result, modifying a value in the original table results in rearranging the sorted table. And, this is accomplished without macros and without using the "sort" menu option.

The material in this post is generalized in another post. Also, if you are interested in this post, you may be interested in my book Data Analysis Using SQL and Excel.

Consider a table with two columns:



12B

14D

13C

11A


The sorted table would be automatically calculated as:



11A

12B

13C

14D


If the first value were changed to 5, then the sorted table would automatically recalculate as:



11A

13C

14D

15B


There are two typical approaches to sorting cells in Excel. The first is to select a region and to sort it using menu options. This does not work when the cells are protected, part of a pivot table, and sometimes when they are calculated. This might also be a bad idea when the data is copied from another location or loaded by accessing a database.

A common alternative is to resort to writing a macro. However, Visual Basic macros are beyond the capabilities of even many experienced Excel users.

The approach described here is much simpler, since it only uses formulas. I should mention that the method described in this post only works for numeric data that has no duplicates. I will remedy that in the next post, where the ideas are extended both to data with duplicates and to character data.

Three Excel functions are the key to the idea: RANK(), MATCH(), and OFFSET(). The first function ranks numbers in a list. The second allows us to use this info to sort the list.

The following shows the effect of the RANK() function:



ValueLabelRank

15B4

14D3

13C2

11A1


The function itself looks like:



ValueLabelRank

15B=RANK(C5, $C$5:$C$8, 1)

14D=RANK(C6, $C$5:$C$8, 1)

13C=RANK(C7, $C$5:$C$8, 1)

11A=RANK(C8, $C$5:$C$8, 1)




The first argument is the value to be ranked. The cell C5 contains the value "15". The second argument is the range to use for h the ranking -- these are all the values in the cell. And the third is the direction, which means the lowest values get the lowest rankings. In this case, "15" is the largest value of four, so its rank is 4.

The following shows the formulas for the table:



RankValueLable

1=OFFSET(C$4, MATCH(C11, $E$5:$E$8, 0), 0)=OFFSET(D$4, MATCH(C11, $E$5:$E$8, 0), 0)

2=OFFSET(C$4, MATCH(C12, $E$5:$E$8, 0), 0)=OFFSET(D$4, MATCH(C12, $E$5:$E$8, 0), 0)

3=OFFSET(C$4, MATCH(C13, $E$5:$E$8, 0), 0)=OFFSET(D$4, MATCH(C13, $E$5:$E$8, 0), 0)

4=OFFSET(C$4, MATCH(C14, $E$5:$E$8, 0), 0)=OFFSET(D$4, MATCH(C14, $E$5:$E$8, 0), 0)


The first column is the desired ranking column. This is simply the numbers starting at 1 and incrementing by 1. The function MATCH(C11, $E$5:$E$8, 0) simply looks up the ranks in the column of calculated ranks. So, the value in C11 is "1". In the previous column, this is the fourth value. The OFFSET() function then finds the fourth value in the C column for the value and the fourth value in the D column for the label.

The result is that the sorted table is tied to the original table by formulas, so changing values in the original table will result in recalculating the sorted table, automatically.

The overall approach is simple to describe. First, we need to calculate the ranking of each row in the original table based on the column that we want to sort. This ranking takes on the values 1 to N fo rthe values in the table. Then, we create a new sorted table that has the rankings in order. This table looks up the appropriate row for each ranking using the MATCH() function. Finally, the OFFSET() function is used to lookup the appropriate values from the appropriate row. The result is a table that is sorted with a "live" connection to another table.

5 comments:

  1. pauldurrant@prd-design.comAugust 27, 2008 at 2:16 AM

    Hi,
    This is extremely useful - so thanks for posting it.
    I do have a question, if I may ?

    If a number is entered which should force the offset to a new order - such as 5 - why do I get an N/A error in the relative cell ?

    Value Label Rank
    5 A 1
    2 B 2
    3 C 3
    4 D 4

    Value Label Rank
    5 A 4
    2 B 1
    3 C 2
    4 D 3

    Rank Value Label
    1 #N/A #N/A
    2 2 B
    3 3 C
    4 4 D

    I have attached a formula sheet too.

    Value Label Rank
    5 A 1
    2 B 2
    3 C 3
    4 D 4


    Value Label Rank
    =C5 A =RANK(C5,$C$5:$C$8,1)
    =C6 B =RANK(C6,$C$5:$C$8,1)
    =C7 C =RANK(C7,$C$5:$C$8,1)
    =C8 D =RANK(C8,$C$5:$C$8,1)

    Rank|ValueLabel
    =E5=OFFSET(C$4,MATCH(C12,$E$5:$E$8,0),0)=OFFSET(D$4,MATCH(C12,$E$5:$E$8,0),0)
    =E6=OFFSET(C$4,MATCH(C13,$E$5:$E$8,0),0)=OFFSET(D$4,MATCH(C13,$E$5:$E$8,0),0)
    =E7=OFFSET(C$4,MATCH(C14,$E$5:$E$8,0),0)=OFFSET(D$4,MATCH(C14,$E$5:$E$8,0),0)
    =E8=OFFSET(C$4,MATCH(C15,$E$5:$E$8,0),0)=OFFSET(D$4,MATCH(C15,$E$5:$E$8,0),0)

    I would really appreciate knowing my mistake.

    Many thanks
    Paul

    ReplyDelete
  2. Hi,
    This is extremely useful - so thanks for posting it.
    I do have a question, if I may ?

    If a number is entered which should force the offset to a new order - such as 5 - why do I get an N/A error in the relative cell ?

    Value Label Rank
    5 A 1
    2 B 2
    3 C 3
    4 D 4

    Value Label Rank
    5 A 4
    2 B 1
    3 C 2
    4 D 3

    Rank Value Label
    1 #N/A #N/A
    2 2 B
    3 3 C
    4 4 D

    I have attached a formula sheet too.

    Value Label Rank
    5 A 1
    2 B 2
    3 C 3
    4 D 4


    Value Label Rank
    =C5 A =RANK(C5,$C$5:$C$8,1)
    =C6 B =RANK(C6,$C$5:$C$8,1)
    =C7 C =RANK(C7,$C$5:$C$8,1)
    =C8 D =RANK(C8,$C$5:$C$8,1)

    Rank|ValueLabel
    =E5=OFFSET(C$4,MATCH(C12,$E$5:$E$8,0),0)=OFFSET(D$4,MATCH(C12,$E$5:$E$8,0),0)
    =E6=OFFSET(C$4,MATCH(C13,$E$5:$E$8,0),0)=OFFSET(D$4,MATCH(C13,$E$5:$E$8,0),0)
    =E7=OFFSET(C$4,MATCH(C14,$E$5:$E$8,0),0)=OFFSET(D$4,MATCH(C14,$E$5:$E$8,0),0)
    =E8=OFFSET(C$4,MATCH(C15,$E$5:$E$8,0),0)=OFFSET(D$4,MATCH(C15,$E$5:$E$8,0),0)

    I would really appreciate knowing my mistake.

    Many thanks
    Paul

    ReplyDelete
  3. Hi,
    This is extremely useful - so thanks for posting it.
    I do have a question, if I may ?

    If a number is entered which should force the offset to a new order - such as 5 - why do I get an N/A error in the relative cell ?

    Value Label Rank
    5 A 1
    2 B 2
    3 C 3
    4 D 4

    Value Label Rank
    5 A 4
    2 B 1
    3 C 2
    4 D 3

    Rank Value Label
    1 #N/A #N/A
    2 2 B
    3 3 C
    4 4 D

    I have attached a formula sheet too.

    Value Label Rank
    5 A 1
    2 B 2
    3 C 3
    4 D 4


    Value Label Rank
    =C5 A =RANK(C5,$C$5:$C$8,1)
    =C6 B =RANK(C6,$C$5:$C$8,1)
    =C7 C =RANK(C7,$C$5:$C$8,1)
    =C8 D =RANK(C8,$C$5:$C$8,1)

    Rank|ValueLabel
    =E5=OFFSET(C$4,MATCH(C12,$E$5:$E$8,0),0)=OFFSET(D$4,MATCH(C12,$E$5:$E$8,0),0)
    =E6=OFFSET(C$4,MATCH(C13,$E$5:$E$8,0),0)=OFFSET(D$4,MATCH(C13,$E$5:$E$8,0),0)
    =E7=OFFSET(C$4,MATCH(C14,$E$5:$E$8,0),0)=OFFSET(D$4,MATCH(C14,$E$5:$E$8,0),0)
    =E8=OFFSET(C$4,MATCH(C15,$E$5:$E$8,0),0)=OFFSET(D$4,MATCH(C15,$E$5:$E$8,0),0)

    I would really appreciate knowing my mistake.

    Many thanks
    Paul

    ReplyDelete
  4. It is hard to follow the formulas without knowing which cells they are in.

    I believe the problem is due to the lookup of the ranked value. In the sorted list, the "rank" column should contain 1, 2, 3, 4, and not the values (which might have other values). My suspicion is that C12 contains the value "5" in this case, instead of "1".

    --gordon

    ReplyDelete
  5. This is awesome. Thanks for sharing. Any idea when you will be able to post the version that handles duplicates and character data? Or is it out already?

    Thanks again!

    ReplyDelete

Your comment will appear when it has been reviewed by the moderators.