
How to analyse survey data without a PhD in math

Survey data analysis: that's fun, isn't it? When you've finally got all your survey data, the next step is actually doing something with it.

The results are in. You’ve written the questions, found the right people to ask and had your answers back—now what?

Perfect surveys sent to insightful respondents can become entirely useless if the results are not analyzed in a coherent and comprehensive manner.

Don’t run and hide.

We know that the word analysis sounds technical and exclusive, but it’s not rocket science. By following a few helpful guidelines on how to analyze survey data, you’ll be able to draw out the insights from your survey data yourself.

Before you start your survey analysis

Before you get started crunching the numbers and performing a survey analysis, there are a few pieces of information which you need to gather.

First up, you need to know your number of total respondents. This number will give you an indicator as to how large your sample is, and how much you can rely on your results. It’s always a good idea to gather people’s opinions, but if 5000 people attended a concert—and only 5 people answered your survey—you can’t really treat those 5 answers as being representative of the whole group.

Secondly, you need to calculate your survey response rate. This is a straightforward percentage: divide the total number of responses you received by the number of people you asked to fill out the survey, then multiply by 100. The higher your response rate, and the higher your total number of respondents, the more you can trust your survey data to be representative of the group as a whole.
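If you’d rather let a script do the arithmetic, here is a minimal Python sketch of the response rate calculation. The function name is our own, and it assumes that every one of the 5000 concert attendees from the example above was asked to fill out the survey.

```python
def response_rate(responses_received: int, people_asked: int) -> float:
    """Return the response rate as a percentage of the people asked."""
    if people_asked <= 0:
        raise ValueError("people_asked must be greater than zero")
    return responses_received / people_asked * 100


# The concert example: 5 answers, assuming all 5000 attendees were asked.
concert_rate = response_rate(5, 5000)
print(f"Response rate: {concert_rate:.1f}%")
```

A rate this low is a sign that the answers are unlikely to speak for the whole audience.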

Survey data analysis—aggregating the numbers

The first step when analyzing survey data is to turn your individualized responses into aggregated numbers. This sounds complicated but really it just means you need to do some counting.

For every question in your survey, you need to know the total number of people who answered with each response. Take a look at this example question:

By aggregating your responses, you are simply counting how many people answered a, b, c and d respectively. If 100 people took your survey, the aggregated results would look something like this:

In the last six months: 30

Six months to a year ago: 40

One to two years ago: 20

Over two years ago: 10

Total: 100

Now, if your survey was conducted through a survey host, your online survey results should be aggregated automatically, so there’ll be no need to add the numbers up.
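If you do end up counting by hand, a few lines of Python can do it for you. This is only a sketch: the list of raw answers is made up to match the totals in the example above, and in practice you would load it from your own export.

```python
from collections import Counter

# Hypothetical raw data: one answer per respondent, built here so that
# the totals match the example above.
raw_answers = (
    ["In the last six months"] * 30
    + ["Six months to a year ago"] * 40
    + ["One to two years ago"] * 20
    + ["Over two years ago"] * 10
)

# Counter tallies how many times each distinct answer appears.
counts = Counter(raw_answers)

for answer, count in counts.items():
    print(f"{answer}: {count}")
print(f"Total: {sum(counts.values())}")
```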

Qualitative or quantitative?

Once you have all of your aggregated answers, it’s time to start making some sense of them.

Our brains can make sense of percentages much more quickly and easily than whole numbers. It is also far easier to compare percentages than whole numbers.

Say you wrote a survey asking five-year-olds for their favorite colors. Just saying that 67 children chose red as their favorite color means very little. However, saying that 23% of the children chose red as their favorite color, compared to 50% who chose blue, gives you a much clearer indication of the relative popularity of each color.
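Turning counts into percentages is one division per answer. The sketch below assumes some numbers the example doesn’t give: only the 67 children who chose red comes from the text, while the counts for blue and for other colors are invented so that the shares land close to 23% and 50%.

```python
def to_percentages(counts: dict) -> dict:
    """Convert raw counts into percentages of the total."""
    total = sum(counts.values())
    return {answer: count / total * 100 for answer, count in counts.items()}


# Only the 67 for red comes from the example; the other counts are assumed.
color_counts = {"red": 67, "blue": 146, "other": 78}

percentages = to_percentages(color_counts)
for color, share in percentages.items():
    print(f"{color}: {share:.0f}%")
```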

If you’ve asked people to write feedback or long-form answers, leave these until the end.

You don’t want to allow the qualitative data to bias your quantitative analysis. Do the numbers first, and hopefully, once you have a clear idea of what the sentiment is, the qualitative answers will be able to help you understand why that might be the case.

Making comparisons with cross-tabulation

Cross-tabulating your data is where you can really begin to draw insights from your survey results, instead of just statistics. It can help you to add context to your numbers and to really interrogate how different groups of people behave or how different factors might affect a single outcome.

When you planned your survey, you will have thought about the different comparisons you would like to make. Maybe you’d like to know if older people are more likely to enjoy eating olives.

Let’s take olives as an example. Your question might be something like this:


Now, in the first round of your data analysis, you might have already split the respondents into two, to work out the split between people who do and do not like eating olives.

So let’s say the results of this olive question were:

Like olives: 542 people (46%)

Dislike olives: 630 people (54%)

To cross-tabulate your data, you’ll need to map another variable onto this one.

We’re interested in whether tastes change with age, so let’s use age as our second variable and ask the question:


Now with these results, you can plug them into a Google Sheet and start to see if there are any correlations:


Imagine you have a client who is looking at marketing their olive brand directly at people under 35. You could ask these two questions and look at the split between olive lovers and haters just within this subgroup, and see how it compares to the overall average splits.
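If you prefer code to a spreadsheet, the same cross-tabulation can be sketched in Python. The overall totals (542 who like olives, 630 who don’t) come from the example above, but the way they are divided between the two age groups is invented purely for illustration, as are the group names.

```python
from collections import Counter

# Hypothetical (age_group, answer) pairs, one per respondent. The split
# by age is made up; only the overall totals match the example above.
responses = (
    [("under 35", "like")] * 180
    + [("under 35", "dislike")] * 320
    + [("35 and over", "like")] * 362
    + [("35 and over", "dislike")] * 310
)

# Count each combination of age group and answer, plus each group's size.
crosstab = Counter(responses)
group_totals = Counter(age_group for age_group, _ in responses)


def share_who_like(age_group: str) -> float:
    """Percentage of one age group who like olives."""
    return crosstab[(age_group, "like")] / group_totals[age_group] * 100


likes = sum(1 for _, answer in responses if answer == "like")
overall_like = likes / len(responses) * 100

for age_group in group_totals:
    print(f"{age_group}: {share_who_like(age_group):.0f}% like olives")
print(f"Overall: {overall_like:.0f}% like olives")
```

With these made-up numbers, the under-35 group sits well below the overall average, which is exactly the kind of gap the olive brand’s marketing team would want to know about.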

Comparing survey data

Data means very little to us without context and meaning. Turning your numbers into percentages makes comparisons easier, but even when we know exactly what proportion 75% represents, how can we know if that is good?

The answer is benchmarks.

Setting benchmarks is key to making sense of the data and working out what those percentages really mean.

One of the most common benchmarking techniques is to compare this survey’s results with the data from the last time the survey was run. In order to do this effectively, you need to make sure that you are comparing the results of the same question from each survey.

Setting a benchmark using last year’s data is easy. You simply take the percentage splits of responses to a certain question and treat these as your starting point. Then you can easily see if this year’s data is above or below that benchmark.

Year-on-year or month-on-month comparisons are an excellent way of tracking progress and allowing you to see whether there are trends emerging or how much responses have changed in a given time period. This is known as longitudinal analysis.
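A year-on-year comparison boils down to subtracting the benchmark from the new result for each answer. In this sketch, the olive split from earlier stands in for last year’s benchmark, and this year’s figures are hypothetical.

```python
# Last year's split is the benchmark; this year's figures are made up.
last_year = {"Like olives": 46.0, "Dislike olives": 54.0}
this_year = {"Like olives": 51.0, "Dislike olives": 49.0}

# Change in percentage points for each answer to the same question.
changes = {answer: this_year[answer] - last_year[answer] for answer in last_year}

for answer, change in changes.items():
    if change > 0:
        direction = "above"
    elif change < 0:
        direction = "below"
    else:
        direction = "level with"
    print(f"{answer}: {abs(change):.1f} points {direction} the benchmark")
```

Note that the difference is expressed in percentage points, not as a percentage of last year’s value.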

If this is your first time collecting data, no worries, you can still set yourself some benchmarks. Instead of comparing your results to last month’s or last year’s data, you can calculate the overall total split between responses for each question, and treat this as your benchmark or baseline.

Once you begin to cross-tabulate and break your respondents down into further categories, you can compare their results to your benchmark to place their statistics in context. If a value is higher than the average, we can say that this category is over-indexing, and if the value is lower, we can say that the category under-indexes. This gives some context to the statistics and starts letting you draw out some real insights from your survey data.
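Here is a small sketch of that comparison in Python. The group percentages are hypothetical, and the index score (the group’s value divided by the benchmark, times 100) is one common convention rather than the only way to express the gap.

```python
def index_against(value_pct: float, benchmark_pct: float) -> float:
    """Index score: 100 means the group matches the benchmark exactly."""
    return value_pct / benchmark_pct * 100


def describe(value_pct: float, benchmark_pct: float) -> str:
    """Say whether a group over-indexes or under-indexes."""
    if value_pct > benchmark_pct:
        return "over-indexes"
    if value_pct < benchmark_pct:
        return "under-indexes"
    return "matches the benchmark"


# Benchmark: 46% of all respondents like olives (the overall split).
benchmark = 46.0
# Hypothetical share of olive lovers within each age group.
groups = {"under 35": 36.0, "35 and over": 54.0}

for name, value in groups.items():
    score = index_against(value, benchmark)
    print(f"{name} {describe(value, benchmark)} (index {score:.0f})")
```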

Yeah, that sounds great—but why?

When interpreting survey results, quantitative data is extremely valuable in showing us what has happened. The numbers themselves are unlikely to provide a concrete answer as to why something happened or why people have had a certain opinion.

Understanding why respondents answered in the way that they did is what lets you really start to address problems and make changes. This is where the real insight is born.

Sometimes, the ‘why’ will be answered in direct questions in the survey, sometimes with multiple choice boxes. Other times, it will be up to you as the survey analyst to determine causation, if possible. And this is where we need to be careful.

It is easy to become sucked into a trap when analyzing survey data, and start to see patterns everywhere. This is not necessarily a bad thing, as identifying a correlation between two variables is a key part of interpreting survey results. However, the danger is that we assume a pattern means something without checking whether the data actually supports it.

Assumptions about the data can be hopes or expectations, conscious or subconscious. However, realizing when we are making assumptions can help us avoid any problems further down the line and prevent us from wasting time. Ultimately, no one would want to find out their assumptions were false after the survey analysis is complete. Similarly, you wouldn’t want a critical assumption to be false and never even realize.

Survey analysis examples—understanding correlation and causation

Correlation occurs when two different variables move together: as one changes, the other tends to change too.

A classic example is with the sales of seasonal products. During the summer, the sales of both swimming pools and barbecues will rise. The two variables when plotted onto a graph will move in the same direction at the same time. However, there is no direct connection between these two variables. People buying barbecues is not the reason that the sales of swimming pools increase.

Causation, on the other hand, occurs when one factor directly causes a change in another factor.

For example, in the case of seasonal products, the weather is a key factor. As the temperature rises in the summer, so do the sales of barbecues. Barbecue sales here are a variable which is dependent on the weather, and there is a direct link between them.

When interpreting survey results, it is easy to mistake correlation for causation. Just because two variables move at the same time, it does not mean that one is directly influencing the other.
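To see how convincing a correlation can look, here is a sketch that measures it with the Pearson correlation coefficient, a standard statistic that runs from -1 to 1. All of the monthly figures are invented for illustration, and the helper function is our own.

```python
from math import sqrt


def pearson(xs: list, ys: list) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Made-up monthly figures, January to December.
temperature_c = [3, 5, 9, 13, 17, 21, 24, 23, 19, 13, 8, 4]
barbecue_sales = [10, 12, 20, 35, 60, 90, 110, 105, 70, 30, 15, 10]
pool_sales = [1, 1, 3, 6, 12, 20, 26, 24, 14, 5, 2, 1]

# Barbecues and pools move together even though neither causes the other.
print(f"Barbecues vs pools: {pearson(barbecue_sales, pool_sales):.2f}")
# Both also track the temperature, the shared factor behind the pattern.
print(f"Barbecues vs temperature: {pearson(barbecue_sales, temperature_c):.2f}")
```

A coefficient close to 1 only tells you that the two series rise and fall together. It cannot tell you which link, if any, is causal, so that judgment still has to come from what you know about the subject.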

This is where qualitative data comes in. If you’ve asked your respondents to fill in longer-form answers to explain why they chose a certain response, analyzing these answers can give you the insight you need to work out why.

The final hurdle—reporting back on your survey data

When it comes to sharing your survey data analysis, remember that it is the story which makes it interesting, not the numbers.

The percentages you have calculated are there as vital evidence for your argument, but in order to have a real impact on the way in which people think about something, your analysis needs to have a narrative.

If you can, always provide context with your statistics, either by bringing in a comparison to the same survey from last year or by comparing groups of people in the same year’s data. Benchmark your numbers so that your audience is immediately aware of whether what they are seeing is positive or negative.

If you are unable to provide recommended actions off the back of your survey data analysis, at least signpost the key areas which need attention, so that the relevant parties can then begin to tackle the problem if necessary.

When you get to visualizing your data, remember that whilst long reports can be fascinating, most people won’t read them. Whoever you are presenting to is unlikely to want to listen or read as you take them through your survey analysis methods step-by-step, so don’t feel like you have to include every single calculation you made in your reporting.

Put yourself in your audience’s shoes and determine their interests and priorities. Only give them information if it is relevant to them, they will understand it, and there is something they can do with it.

