Previous post: Phase 2 Excel data capture
Once you have captured your questionnaire data in Microsoft Excel (or downloaded your data from a digital Google Form questionnaire), you must analyze the data. Your analysis must demonstrate your ability to use a range of functions varying in complexity. One of the biggest problems learners face at this stage is finding out that they didn’t collect appropriate data with their questionnaires!
In this post:
Once you have captured all your raw data, duplicate the raw Data worksheet. Rename this copy “Analysis“. On this sheet, you will perform all the required calculations and formatting.
- Did you collect a name and a surname? If you did and you entered both name and surname in one Excel column, we need to fix that on our Data worksheet before we copy it for analysis. Check the Level 4 section below to see how this can be done, scoring you extra cred at the same time by knocking off some of the more difficult analysis!
- Did you capture two pieces of data in one column? Head on over to the post on Excel’s Text to Columns command.
Now for our analysis: from the Learner Guide:
|LEVEL||EXAMPLES OF FORMULAS/FUNCTIONS|
|1||Simple functions that only use/include a single cell range,|
e.g. SUM, MAX, MIN, COUNT, AVERAGE, MODE, MEDIAN, LEN, VALUE
a formula using any of the arithmetic operators ( +, –, *, / )
|2||Functions that include a cell range and one other parameter/condition,|
e.g. ROUND, LARGE, SMALL, LEFT, RIGHT, CONCATENATE, COUNTIF, SUMIF
Calculations using a combination of arithmetic operators and brackets
Calculations using a combination of any two simple functions from level 1
|3||Functions that include a cell range plus two parameters/conditions, e.g. POWER, MID, FIND, COUNTIFS, ROUNDUP, RANDBETWEEN or a simple IF-function or any of the DATE and TIME functions|
Any combination of more than two functions of arithmetic operators, brackets and other functions
functions not in the CAT CAPS curriculum
This analysis is pretty straightforward. You can use a variety of these functions on the age data, for instance, calculating the minimum, maximum, average and mode of the range of ages of the respondents. Use the opportunity to use ROUND (a level 2 function) in conjunction with AVERAGE to demonstrate your ability to include a function as an argument for another function.
Note the use of the labels Average, Youngest, Oldest & Mode in D9:D12. These make your work more readable and easier to understand.
Don’t forget to do obvious calculations, for example: using COUNT to determine the number of respondent records. You will use the results of these calculations in further analysis later on.
The easy win here is the COUNTIF function, for example in the Gender column you can calculate the total number of respondents per gender.
Note the correct use of the absolute reference for the cell range argument ($D$2:$D$7) and the use of the labels in C9 & C10 for the criteria (many learners incorrectly select the criteria argument from within the range argument).
Again, you will use the results of these calculations in further analysis later on.
Level 3 is probably the most important level of analysis as it contains the COUNTIFS function. Remember, you are looking to process your data so that you have information that helps you “solve” your Focus Question; Phase 3 requires you to make at least three statements demonstrating what you have learned.
Start by combining a question from the Biographical questions with a question from the Topic section of questions using the COUNTIFS function: calculate and compare the number of females who gave a particular answer to a question to the number of males that gave the same answer to the same question.
For example: if as one of your topic questions you asked the closed question “Do you like ice-cream?”, you will be able to calculate how many females answered “Yes” versus how many males answered “Yes” and then compare the two groups.
Be careful: unless you have exactly equal numbers of respondents by gender, you will still have to normalise your data. In other words, if you look at the above numbers you might make the mistake of concluding “1 female eats ice-cream and 2 males eat ice-cream, so twice as many males as females eat ice-cream”.
However, if your sample of respondents does not comprise equal numbers of each gender, you must normalise your data. The easiest way to accomplish this is to calculate the percentage of females who answered “Yes” as a percentage of the total number of female respondents etc. You will now be able to establish which gender has the highest likelihood of saying they like ice-cream! You already have the total number of each gender as a result of your Level 2 calculations.
As you can see above, the initial assumption was, in fact, incorrect: the data shows no difference between the number females who report they like ice-cream versus the number of males who report they like ice-cream.
Note again the use of labels which are used both to describe your information as well as serving as the criteria in the functions.
Level 4 is not easy. The rubric’s suggestion of “A Lookup or nested IF-function OR functions not in the CAT CAPS curriculum” is deceiving: quite honestly I have not been creative enough to imagine a way of using a VLOOKUP in the PAT and I am seldom feeling brave enough for a nested IF statement!
Here, however, is one way to use a nested IF: group data into “categories” based on ranges of numeric data.
In the example below, respondents were asked how many times they ate ice-cream per year. The answers are categorised as follows:
- If more than 12 times = More than once a month
- If 12 times a year = Once a month
- If 0 times per year = NEVER
- else they eat ice-cream Less than once a month
The resulting formula is:
=IF(F2>12,"More than once a month",IF(F2=12,"Once a month",IF(F2=0,"Never","Less than once a month")))
Consider using one of the following functions not in the Curriculum:
Next step: Phase 2 Excel Charts