Previous post: Phase 2 Excel data capture

Once you have captured your questionnaire data in Microsoft Excel (or downloaded your data from a digital Google Form questionnaire), you must analyze the data. Your analysis must demonstrate your ability to use a range of functions varying in complexity. One of the biggest problems learners face at this stage is finding out that they didn’t collect appropriate data with their questionnaires!

In this post:

## First steps

Once you have captured all your raw data, duplicate the raw **Data** worksheet. Rename this copy “**Analysis**“. On this sheet, you will perform all the required calculations and formatting.

Do you have some “test data” responses from your Google Form? Keep them on your

Rawworksheet as evidence of your testing!

**Top tip**s:

- Did you collect a name and a surname? If you did and you entered both name and surname in one Excel column, we need to fix that on our Data worksheet before we copy it for analysis. Check the Level 4 section below to see how this can be done, scoring you extra cred at the same time by knocking off some of the more difficult analysis!
- Did you capture two pieces of data in one column? Head on over to the post on Excel’s
**Text to Columns**command.

## Levels

Now for our analysis: from the **Learner Guide**:

LEVEL | EXAMPLES OF FORMULAS/FUNCTIONS |
---|---|

1 | Simple functions that only use/include a single cell range, e.g. SUM, MAX, MIN, COUNT, AVERAGE, MODE, MEDIAN, LEN, VALUE OR a formula using any of the arithmetic operators ( +, –, *, / ) |

2 | Functions that include a cell range and one other parameter/condition, e.g. ROUND, LARGE, SMALL, LEFT, RIGHT, CONCATENATE, COUNTIF, SUMIF OR Calculations using a combination of arithmetic operators and brackets OR Calculations using a combination of any two simple functions from level 1 |

3 | Functions that include a cell range plus two parameters/conditions, e.g. POWER, MID, FIND, COUNTIFS, ROUNDUP, RANDBETWEEN or a simple IF-function or any of the DATE and TIME functions OR Any combination of more than two functions of arithmetic operators, brackets and other functions |

4 | A LOOKUP OR nested IF-function OR functions not in the CAT CAPS curriculum |

Use “labels” to describe your information.

### Level 1

This analysis is pretty straightforward. You can use a variety of these functions on the age data, for instance, calculating the minimum, maximum, average and mode of the range of ages of the respondents. Use the opportunity to use **ROUND** (a level 2 function) in conjunction with **AVERAGE** to demonstrate your ability to include a function as an argument for another function.

Note the use of the labels *Average*, *Youngest*, *Oldest *& *Mode *in D9:D12. These make your work more readable and easier to understand.

Don’t forget to do obvious calculations, for example: using **COUNT** to determine the number of respondent records. You will use the results of these calculations in further analysis later on.

Be sure to set up your data analysis neatly!

### Level 2

The easy win here is the **COUNTIF** function, for example in the *Gender *column you can calculate the total number of respondents per gender.

Be sure to select your criteria correctly!

Note the correct use of the absolute reference for the cell range argument ($D$2:$D$7) and the use of the labels in C9 & C10 for the criteria (many learners incorrectly select the *criteria *argument from within the *range *argument).

Again, you will use the results of these calculations in further analysis later on.

### Level 3

Level 3 is probably the most important level of analysis as it contains the **COUNTIFS** function. Remember, you are looking to process your data so that you have information that helps you “solve” your Focus Question; Phase 3 requires you to make at least three statements demonstrating what you have learned.

Start by combining a question from the *Biographical *questions with a question from the *Topic *section of questions using the **COUNTIFS** function: calculate and compare the number of females who gave a particular answer to a question to the number of males that gave the same answer to the same question.

For example: if as one of your topic questions you asked the closed question “Do you like ice-cream?”, you will be able to calculate how many females answered “Yes” versus how many males answered “Yes” and then compare the two groups.

Be careful: unless you have exactly equal numbers of respondents by gender, you will still have to *normalise *your data. In other words, if you look at the above numbers you might make the mistake of concluding “1 female eats ice-cream and 2 males eat ice-cream, so twice as many males as females eat ice-cream”.

However, if your sample of respondents does not comprise equal numbers of each gender, you must *normalise *your data. The easiest way to accomplish this is to calculate the percentage of females who answered “Yes” as a percentage of the total number of female respondents etc. You will now be able to establish which gender has the highest likelihood of saying they like ice-cream! You already have the total number of each gender as a result of your Level 2 calculations.

As you can see above, the initial assumption was, in fact, incorrect: the data shows no difference between the number females who report they like ice-cream versus the number of males who report they like ice-cream.

Note again the use of labels which are used both to describe your information as well as serving as the criteria in the functions.

### Level 4

Level 4 is not easy. The rubric’s suggestion of “A Lookup or nested IF-function OR functions not in the CAT CAPS curriculum” is deceiving: quite honestly I have not been creative enough to imagine a way of using a VLOOKUP in the PAT and I am seldom feeling brave enough for a nested IF statement!

Here, however, is one way to use a nested IF: group data into “categories” based on ranges of numeric data.

In the example below, respondents were asked how many times they ate ice-cream per year. The answers are categorised as follows:

- If more than 12 times =
**More than once a month** - If 12 times a year =
**Once a month** - If 0 times per year =
**NEVER** - else they eat ice-cream
**Less than once a month**

The resulting formula is:`=IF(F2>12,"More than once a month",IF(F2=12,"Once a month",IF(F2=0,"Never","Less than once a month")))`

## Bonus extras

Consider using one of the following functions not in the Curriculum:

An example of the CONVERT function can be found in the post Phase 2 “additional data”.

**Next step:** Phase 2 Excel Charts