Ads 468x60px

Saturday, June 30, 2012

Formula Forensics No. 002 – Joyces Question

Formula Forensics No. 002 – Joyces Question:
Last week Joyce asked a question on the Chandoo.org, Comment 24.
I’m wondering if there’s a way to count the number of occurrences of words when they’re all in a cell? Like this:

A1: “Windows NT, Networking, Firewalls, Security, TL, Training”

A2: “Networking, Networking, Training, Security, TL, Training”

A3: “Security, TL, Firewalls, Security, Networking, Windows NT”

Joyce


Hui responded with an Array Formula:
=SUM(LEN(A1:A3)-LEN(SUBSTITUTE(A1:A3,C10,”")))/LEN(C10)
As the formula is an Array Formula it is entered with Ctrl Shift Enter.


Setup the Problem

Copy the Data Above into Cells A1:A3 or download the example file here: Example File (all Excel versions)
Enter the text string Security into cell C10
And array enter the formula
D10: =SUM(LEN(A1:A3)-LEN(SUBSTITUTE(A1:A3,C10,”")))/LEN(C10)

Cell D10 should now display the value 4, which is the number of times the Word Security, appears in the Range A1:A3.

Pull The Formula Apart

Lets take a look inside this and see how it works
We will break this formula apart and look at each section independently and then put the answers back together.
=SUM(LEN(A1:A3)-LEN(SUBSTITUTE(A1:A3,C10,””)))/LEN(C10)
In a cell below the data
D13: =LEN(A1:A3) but don’t press Enter, Press F9
Excel displays ={57,56,57}
This is the number of characters in each cell A1:A3
ie: A1 has 57 characters, A2 has 56 characters, A3 has 57 characters,
You can check this manually by typing =Len(A1) into any spare cell


=SUM(LEN(A1:A3)-LEN(SUBSTITUTE(A1:A3,C10,”")))/LEN(C10)
In another cell below the data
D15: =LEN(SUBSTITUTE(A1:A3,C10,”")) but don’t press Enter, Press F9
Excel displays ={49,48,41}
What this section does is measure the length of each cell in A1:A3 but only after substituting the word being searched for from C10 with ””, which is a zero length string.
So the second array is shorter than the first Array, by X times the length of the word in C10

=SUM(LEN(A1:A3) – LEN(SUBSTITUTE(A1:A3,C10,”")))/LEN(C10)
Next we add up the difference between the two arrays
So you can see we have two arrays of numbers
Array 1 = {57,56,57}
Array 2 = {49,48,41}
If we subtract Array 2 from Array 1
= {57-49, 56-48, 57-41}
= {8, 8, 16}
We can do this in Excel to Check
In Cell D17 enter
=LEN(A1:A3)-LEN(SUBSTITUTE(A1:A3,C10,”")) and press F9
Excel displays: = {8, 8, 16}

=SUM(LEN(A1:A3) – LEN(SUBSTITUTE(A1:A3,C10,””)))/LEN(C10)
The next part is to sum these up
Obviously the sum of 8, 8 & 16 is 32
We can check that
D21: =SUM(LEN(A1:A3)-LEN(SUBSTITUTE(A1:A3,C10,”"))) and press F9
Excel displays: 32

=SUM(LEN(A1:A3) – LEN(SUBSTITUTE(A1:A3,C10,”")))/LEN(C10)
The final part of this is to divide the sum (32 in this case) by the length of the text in C10 “Security” = 8 Characters
=32 / 8
= 4
Correct - The number of times Security appears in the cells A1:A3 is 4.

OTHER POSTS IN THIS SERIES:

You can learn more about how to pull Excel Formulas apart in the following posts
Formula Forensic 001 – Tarun’s Problem

WHAT FORMULAS WOULD YOU LIKE EXAMINED ?

If you have any formulas you would like explained please feel free to leave a post here or send me an email:
If the formula is already on Chandoo.org or Chandoo.org/Forums or even forbid another web site, simply send the link to the post and a Comment or ID No. number if appropriate.
If sending emails please attach an Excel file with the formula and data

0 comments:

Post a Comment