Notes on Surveys for Market Researchers
This article is about the scientific process in general, and about surveys in particular.
1. Research begins with an initial observation or question
■ You see (read, think, etc.) something that intrigues you and prompts you to ask – “what is this about?”
■ You look at what has been written on your question so that you can understand the way different investigators have approached the question. This is called a Literature Review.
■ You formulate your question, now informed by what is already known. It is vital to situate your question within the existing research, as you want your investigation to contribute to that research.
Doing so also gives you a Theoretical Base for your study. A Theory is a set of formal statements that explains how and why certain events are related to one another. Your theory may change as a result of your research. There is a constant interplay between research and theory, each informing the other.
2. Forming a Hypothesis
■ A hypothesis is a guess at the answer to the question you have asked. This guess should be informed by your literature review. It doesn’t have to toe the line, or accept what other people have said, but it’s important to place your hypothesis in the context of what is known.
■ Your hypothesis is the Alternative Hypothesis. It will be tested against the Null Hypothesis, which basically says there's nothing going on here at all. So you will not be proving that your hypothesis is correct; you will be seeing if you can reject the null hypothesis (Hey! There is something going on here after all!). This may seem strange, but "proving" things in Science is very complicated and requires multiple investigations. All you hope to show is that your data give you good reason to reject the null hypothesis.
■ Why can't I show that my Alternative Hypothesis is right? Mostly because there are likely a number of alternative hypotheses besides the one you've settled on. So rather than prove one of them is right, we look for enough evidence to reject the null hypothesis.
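To make "rejecting the null hypothesis" concrete, here is a minimal sketch of one simple approach, a permutation test. The technique, the function name, the scores, and the 0.05 threshold are all my own illustrative assumptions, not something prescribed by this article. The idea: if nothing is going on, the group labels are arbitrary, so shuffling them should produce differences as large as the observed one fairly often.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_shuffles=5000, seed=0):
    """Estimate how often a difference in means at least as large as the
    observed one shows up when group labels are shuffled at random."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add 1 to both counts so the estimate is never exactly zero
    return (extreme + 1) / (n_shuffles + 1)


# Made-up scores, for illustration only
group_one = [7, 8, 6, 9, 8, 7, 9, 8]
group_two = [5, 6, 5, 7, 6, 4, 6, 5]

p_value = permutation_p_value(group_one, group_two)
alpha = 0.05  # conventional threshold, assumed here
print(f"p = {p_value:.4f}; reject the null hypothesis: {p_value < alpha}")
```

A small p-value says the null hypothesis is hard to square with the data. It does not say which alternative hypothesis is the right one.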
3. Determining your Methods
■ This may be the most important part of doing a research study. If your Methods are well thought out, then you can trust the results you get. If your Methods contain errors or erroneous assumptions, then you cannot trust the results you get.
■ There is a saying: "Garbage in = Garbage out." Meaning that if your Methods are poor, they will produce garbage results. And you cannot correct in analysis what you have bungled in design. Which is why carefully designing your Methods is so important. I read a lot of studies with poor methodology, and if I see that, I don't bother reading the results, because I know they're worthless. You would be surprised by how many pieces of published research have poor methodology.
■ What is contained in your Methods? Your methods should lay out exactly, step by step, how you will conduct your study. They should be specific enough that another investigator can replicate or repeat your study. Replication is an essential way science builds the case that an alternative hypothesis could be right. If other investigators use your Methodology, and they find that they can also reject the null hypothesis, this strengthens the argument for your alternative hypothesis, although it still doesn't "prove" it's correct.
Sampling
■ First, you must describe the sample, or group of people, who will participate, and how you will recruit them, as well as how you will gain their Informed Consent to participate. Informed Consent means that participants have a clear understanding of what will be required and what the potential risks for participation are, so they can make an informed choice to participate or not to participate. Most of the time, Informed Consent takes the form of a document, which is often signed by those who choose to participate.
■ Not all samples are equal. Without going into too much detail, most samples in Psychological Science are called Convenience Samples, which is exactly what they are. They are composed of people who are immediately available to the investigator. A prime example is college freshmen who will receive course credit for participating in research. The problem with these samples is that you cannot generalize (in general!) your results to the population as a whole. Your results are only generalizable to the population represented by your sample - in this case, college freshmen in a particular geographic location. Believe it or not, convenience sampling is adequate, or the only choice you've got, for many research questions.
At least two other sampling techniques are available: Random and Representative sampling. In random sampling, each member of a population has an equal chance of being selected to participate in a study. In representative sampling, the sample is chosen to mirror the characteristics of a population. Popular characteristics include: race, age, gender, income, education, etc. These two sampling methods allow the researcher greater generalizability to a population.
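The difference between these two techniques can be sketched in a few lines. This is only an illustration under my own assumptions: the population is invented, the function names are hypothetical, and I have used proportional stratified sampling as one common way to build a sample that mirrors a population characteristic (here, age group).

```python
import random
from collections import Counter, defaultdict


def simple_random_sample(population, n, seed=0):
    """Every member has an equal chance of being selected."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def proportional_stratified_sample(population, key, n, seed=0):
    """Draw from each stratum in proportion to its share of the population.
    Rounding can make the total differ slightly from n in general."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in population:
        strata[person[key]].append(person)
    sample = []
    for members in strata.values():
        k = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, k))
    return sample


# Invented population of 1,000 people: 60% / 30% / 10% across three age groups
population = (
    [{"id": i, "age_group": "18-34"} for i in range(600)]
    + [{"id": i, "age_group": "35-54"} for i in range(600, 900)]
    + [{"id": i, "age_group": "55+"} for i in range(900, 1000)]
)

random_sample = simple_random_sample(population, 50)
stratified = proportional_stratified_sample(population, "age_group", 50)

# The random sample's age mix varies from draw to draw;
# the stratified sample's age mix matches the population by construction.
print(Counter(person["age_group"] for person in random_sample))
print(Counter(person["age_group"] for person in stratified))
```

Both approaches need a list of the whole population to draw from, which is exactly what a convenience sample lacks.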
■ Next, you must describe what Data or information you will collect and how. Let's say you're interested in the cognitive changes that occur when someone is suffering depression. There are many ways to approach this question, so what questions, in what form, will you have to ask the participants in order to get the best data possible? This process is called Operationalization. You must convert your research questions into interpretable, numeric observations (if we are using statistics, and we are).
■ Because you know that only a percentage of your sample, perhaps small, will be suffering depression, you are going to want to have a lot of participants in your study, as this will increase the Power you have to detect effects that will help you reject the Null Hypothesis. So you will explain how you're going to obtain the large group you need. Certain calculations, called Power calculations, will help you determine exactly how many participants you need at the bare minimum in order to detect effects. You will perform these calculations and discuss them.
■ You will probably want to know some basic information about your participants, like how old they are, what their race/ethnicity is, what their family background is, what their family's economic status was, what their sexual orientation is - all of these things have been found to relate to depression. Some of these are referred to as Demographics. They will help you characterize, or provide a snapshot of, your sample, and will be involved in your question on depression.
■ You have to find a way to determine which participants are depressed. The easiest way to do this is to use an established Survey (Finally!) on depression, and you have several good choices that will provide you with a cut-off score, above which you can classify a person as depressed. The survey operationalizes depression - it takes it from a concept to a concrete, measurable score.
Some surveys are Unidimensional, meaning that they only measure one thing, while others are Multidimensional, measuring several aspects of a construct, like depression, at the same time. Perhaps one survey measures the cognitive aspects, the emotional aspects, and the behavioral aspects of depression. This survey would be multidimensional, and each aspect is measured on a subscale, or a smaller scale embedded into a larger one. The total score on scales like this is also relevant, as it includes all aspects. There is an entire area of statistics called Psychometrics, which is focused on writing good tests and surveys and exploring surveys for underlying subscales.
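Here is a rough sketch of how scoring a multidimensional survey with a cut-off might look. Everything specific in it is hypothetical: the nine item names, the three-items-per-subscale layout, the 0-3 response range, and the cut-off of 14 are invented for illustration and do not come from any real depression survey.

```python
# Hypothetical layout: nine items, each answered 0-3, grouped into three subscales
SUBSCALES = {
    "cognitive": ["q1", "q2", "q3"],
    "emotional": ["q4", "q5", "q6"],
    "behavioral": ["q7", "q8", "q9"],
}
CUTOFF = 14  # hypothetical cut-off; a real survey publishes its own


def score_survey(responses):
    """Return subscale scores, the total score, and the cut-off classification."""
    subscale_scores = {
        name: sum(responses[item] for item in items)
        for name, items in SUBSCALES.items()
    }
    total = sum(subscale_scores.values())
    return {
        "subscales": subscale_scores,
        "total": total,
        "above_cutoff": total > CUTOFF,  # "above" the cut-off, as described
    }


# One made-up participant
participant = {"q1": 2, "q2": 3, "q3": 2, "q4": 1, "q5": 2,
               "q6": 2, "q7": 1, "q8": 0, "q9": 2}
result = score_survey(participant)
print(result)
```

With an established survey you would use its published scoring key and cut-off rather than numbers like these.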
■ Finally, you must describe the context under which these forms will be filled out. Will participants be in a classroom, or at home? Will there be a time limit? It's best if you can control as much as you can about the context, so it is the same for all participants. If some students fill this out at home while watching TV and others fill it out in a classroom with a time limit, that could be problematic. It's best to have standardization of methods. When there isn't standardization, this introduces variables you don't want to deal with, which are called confounding variables, like noise level, time given, etc. They can impact how participants interact with the study materials, and they have nothing to do with your research question, but can influence your results. Sometimes there are ways to deal with confounding variables, but many times you just have to proceed, knowing they are there.
■ Your methods will also include a brief description of how you are going to analyze the data once you have it.
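The Power calculations mentioned above can be sketched with a standard textbook approximation. This is a simplified illustration, not a substitute for dedicated power software: it assumes a two-sided comparison of two group means, equal group sizes, a normal approximation, and an effect size expressed as a standardized difference (Cohen's d). The function names and the 10% depression rate are my own assumptions.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect a standardized
    mean difference (Cohen's d), using the normal approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def total_to_recruit(n_in_rare_group, prevalence_percent):
    """How many people to recruit so that, on average, the rarer group
    reaches the size you need."""
    return math.ceil(n_in_rare_group * 100 / prevalence_percent)


needed = n_per_group(effect_size=0.5)  # a "medium" effect, assumed
recruit = total_to_recruit(needed, prevalence_percent=10)  # assumed 10% rate
print(f"Per group: {needed}; recruit about {recruit} in total")
```

Under these assumptions a medium effect needs about 63 people per group, and if only one participant in ten is expected to be depressed, that means recruiting roughly 630 people. Smaller expected effects push these numbers up quickly.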
4. The Deal with Data
■ Data Collection Plan: one should always have a plan worked out prior to the start of the study as to how and when and where data will be collected. This may involve working with other people, like Faculty, to obtain the time necessary in their class to announce the study opportunity. Once a plan is established, it's best to go through it as a dry run and see if any problems come up.
Remember, thanks to the Power Analysis done previously, you already know how much data you must collect to answer the questions of interest. In this study, I'll end up with two groups: Depressed vs. Non-Depressed. I need enough Depressed individuals for this to work, so it's likely I'll need to collect a lot of data to get at least 30 people in the depressed group (30 is a common rule-of-thumb minimum number of observations for parametric statistical analysis - more on that later).
■ Data Analysis Plan: Before you collect any data, you can specify what statistical tests you will need to run to test your hypothesis. You can do this because you have already operationalized your research question - you know what questions you'll ask your sample and what surveys you're going to ask them to fill out. So, this is enough information to then outline the statistical tests you will need to run.
You may also want to specify tests, charts, and graphs that you will run using your data that will help you describe your dataset and sample as fully as possible. This is often done first, before hypothesis testing, and can be referred to as Exploratory Data Analysis (or EDA). Common types of EDA are things like examining correlations between different items or subscales in your data, graphing your data to see if it resembles a Normal (bell-shaped) distribution, and running Descriptives - which are statistics that characterize key aspects of your data, like the average (or mean) and the standard deviation (the square root of the variance, a measure of how spread out your data are). Most of these descriptives need to be reported in some form for hypothesis testing.
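A few of these Descriptives can be computed with nothing more than Python's standard library. The sketch below uses made-up subscale scores and hypothetical function names; the correlation is the ordinary Pearson correlation, written out by hand so each step is visible.

```python
from math import sqrt
from statistics import mean, stdev, variance


def describe(values):
    """Basic descriptives for one variable (sample SD and sample variance)."""
    return {
        "n": len(values),
        "mean": mean(values),
        "sd": stdev(values),
        "variance": variance(values),  # the SD is the square root of this
    }


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mean_x, mean_y = mean(x), mean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(spread_x * spread_y)


# Made-up subscale scores for eight participants
cognitive = [7, 5, 9, 4, 6, 8, 3, 7]
emotional = [6, 5, 8, 3, 6, 7, 4, 6]

summary = describe(cognitive)
r = pearson_r(cognitive, emotional)
print(summary)
print(f"correlation between subscales: {r:.2f}")
```

Numbers like these are worth looking at alongside a graph, since a mean and a standard deviation can hide a lopsided distribution.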
Performing EDA will also alert you to any serious problems in your data BEFORE you test your hypothesis. Some problems can be remedied, and some cannot. So it is recommended that you engage in EDA, because if there are serious flaws in your data, then you would not want to proceed to hypothesis testing, since those results will likely be flawed.
You may have multiple different statistical tests that could be used to test your hypothesis. A good rule of thumb when this is the case is Parsimony, which means - use the tests that are the simplest and make the fewest assumptions about different aspects of your data. A lot of people do just the reverse. They choose the most complex test because they think that using that test will make them look smarter. And their complicated test may impress some people - but we don't do science to impress people. We do it to advance our knowledge. People who do science and data analysis to impress people are not very impressive. Parsimony is always best. If two tests are equivalent in simplicity and assumptions, then you should choose the test that is most commonly used in the literature on the types of questions you are asking. That way, your research will be more directly comparable to the research of others.
Finally, sometimes there are tests that you'd like to run AFTER you run the tests necessary for your hypothesis. These are referred to as post-hoc (or after-the-fact) tests. Because you do not specify hypotheses for such tests, they are viewed strictly as exploratory in nature and the conclusions of such tests are therefore tentative at best. They are mostly used to further refine results from hypothesis testing, or to explore aspects of your data that are not directly relevant to your hypothesis but may help you determine your next scientific move, or refine your theory. This is also sometimes referred to as Secondary Data Analysis, where your hypothesis testing is the Primary Data Analysis.
5. Writing your own Survey
Writing a survey is not something one should just sit down and do. If you think anyone can write a good survey, I guarantee you, you will write a bad survey. This is because there are certain basic things one needs to consider in writing a decent survey.
But first things first: if you don't like writing, or do not consider yourself a good writer - DON'T WRITE A SURVEY!!! If your boss asks you to write one, be upfront and blunt and say, "Ma'am, I really dislike writing, probably because I've never been very good at it. You probably want to ask someone else, if you want a good survey." If she insists you write it, or thinks you're trying to get out of an assignment, she'll find out once she reads your first draft that you were being truthful and she acted like an idiot (but don't tell her that). Seriously, if you don't write well, writing a survey will not be an exception. Bow out. You probably have plenty of other talents (let's hope). And what's up with your disbelieving boss who assumed you were lying to her and being lazy? Now I'm getting tangential, but you get the point. Writers should write. Non-writers should not. And bosses should not assume that their employees are lazy liars. Do not attempt to write a survey if you think you're an "okay" writer, either. Not good enough.
6. The First Thing You Need to Know in Writing a Survey
■ First, who is going to fill out your survey? The answer is very important. For example, if you are going to administer your survey to 13-year-old girls, then the vocabulary you use is important. You wouldn't want to use words like "desiccate," "pyramidal," "banister," "exemplary," and other SAT-type words, would you? The same holds true if recent immigrants who are learning English will fill out the survey, or if people with limited educational backgrounds (less than a high school diploma) will potentially fill out your survey.
A good general rule is to write your survey with vocabulary that does not exceed what one would get from an 8th grade education. Why? Because in the US, nearly everyone completes 8th grade: compulsory attendance rules vary by state, but dropping out generally becomes possible only during the high school years. This will work for your 13-year-old girls and those who dropped out in 9th grade or later, but still may not do for new immigrants. So the first thing you ask yourself is "who will be filling this out?" And you write for their level of understanding, not for yours.
You're probably saying, "How do I know what an 8th grader understands?" Well, you've got to do some research to find out. There are survey writing guides out there that could help. You could go to a public library - they will have resources, like curriculums for different grades on different topics or books arranged by recommended age/grade. You could also go to your local school, and ask an 8th grade teacher to look at your survey or supply you with a copy of that grade's vocabulary tests for the year. Finally, there are some great survey writing consultants available that you can work with either directly or online.
Note this: if you are going to administer a consumer survey on a website like Amazon or at a local mall, you'll want to observe this 8th grade rule because there may be a lot of variability in the vocabulary levels of these consumers.
Why should you bother with this? Well, if someone is reading your survey, and does not understand one or more questions, but does not want to appear as though they don't understand, they may answer your questions randomly, which could really mess up the work you're trying to do and the things you're trying to learn. The last thing you want is people pretending to understand your survey when they do not.
Also, they may just leave questions blank, which means you've just lost potential data because of the vocabulary level you employed. When people skip questions, you don't know why. Was it because they didn't understand? Did they not think it was relevant to them? Did they think it was a dumb question? When data is missing, nobody knows why. Even if you could run after someone who didn't answer two questions because of words they didn't understand, do you think they're going to admit that to you? Unlikely. So you don't want people faking an understanding and answering what they shouldn't, and you don't want people not answering what they should! All because you decided it was not that important to pay attention to the level of vocabulary in your survey.
And "consumers" are a very heterogeneous group - VERY. If a mall is in a wealthy suburb, does that mean everyone shopping there is well educated? NO!!! People can drive to that mall from the worst neighborhood in the city! So you simply can't assume that everyone answering your questions actually understands them - you have to ensure that is the case. Otherwise, you're going to end up with misleading data, messed up data, and insights that may not be insights at all, or that may apply to only some of the people who filled out your survey, when you need a marketing strategy that will work for ALL groups. This may result in you misleading your boss and your company, and may end up with you writing your resume, not a survey.
■ Keep your questions straightforward and simple. Here's a "don't" and a "do:"
Don't
1. When you are shopping for items for your home, like furnishings for example, do you consider color, form, and match with your existing home furnishings important and could you rank these qualities so we know which is most and least important?
Okay - that's about 3-4 questions clumsily folded into one, using a bad vocabulary word: furnishings. Let's try again.
Do
1. When you shop for furniture, is color important?
2. When you shop for furniture, is the style important?
3. When you shop for furniture, do you want new furniture to match your old furniture?
4. Think about color, style and match. Which is the most important to you, which comes next, and which comes last?
Yes, it takes more time to ask, more room on the page, and more to analyze, but I feel more confident that the Do series will be understood and answered easily by a much larger range of different types of people. I can answer the Don't, and college educated folks all probably can, but is everyone at the mall college educated?
■ Also, note the unnecessary words in the Don't: "for example," "existing," "rank these qualities so we know" - none of that is necessary. Survey respondents don't like it when a question looks really long and involved - why would they? It's supposed to be a question, not an essay! And it should not have multiple parts or demands like the Don't does. It asks the reader to consider several different things at the same time and then, going further, asks them to perform a complex ranking operation, all within one question.
■ And no question they are asked should be long, involved, and include ANY unnecessary words. Down to the minimum. If your question cuts across two lines on an 8.5" x 11" sheet of paper - IT'S TOO LONG! Restrict yourself to one line most of the time, or a little bit of a second if you absolutely need it. Otherwise - EDIT, EDIT, EDIT. Do you want people skimming your long questions, or actually reading your concise ones?
If you think skimming is fine, consider the small but important words like: not, but, and, or, etc. You don't want someone skimming and assuming they saw "and" when it was really "or." And those are the words that are missed when rapidly skimming. If people are skimming your survey, it's likely they are responding to different questions than the ones you intended. Another compelling reason to keep questions short. So no one feels they have to skim your unnecessarily long questions.
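As a rough aid for the EDIT, EDIT, EDIT step, a draft question can be run through a crude automated check before a human reads it. The sketch below is only a heuristic under my own assumptions: the 18-word limit is a hypothetical stand-in for "about one printed line," and counting connectives is just a hint that a question may be doing more than one job. It cannot judge vocabulary level or meaning.

```python
import re

MAX_WORDS = 18  # hypothetical stand-in for "about one printed line"
CONNECTIVES = {"and", "or", "but", "not"}  # small words that skimmers miss


def check_question(text):
    """Return a list of warnings for one draft survey question."""
    words = re.findall(r"[a-z']+", text.lower())
    warnings = []
    if len(words) > MAX_WORDS:
        warnings.append(f"too long: {len(words)} words")
    connective_count = sum(1 for word in words if word in CONNECTIVES)
    if connective_count >= 2:
        warnings.append(
            f"{connective_count} connectives: may be several questions in one"
        )
    return warnings


dont = ("When you are shopping for items for your home, like furnishings "
        "for example, do you consider color, form, and match with your "
        "existing home furnishings important and could you rank these "
        "qualities so we know which is most and least important?")
do = "When you shop for furniture, is color important?"

print(check_question(dont))
print(check_question(do))
```

A check like this only flags candidates for editing. A ranking question will still trip the connective count even when it is well written, so the final call belongs to the writer.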
Statistician | Bestselling Children's Nonfiction Author on Amazon | Co-founder Jim-Ree African American Museum
Great resource for marketers and market researchers, especially those without a strong background in research methodology! We tend to arrive at job roles from different career paths - some more or less straightforward than others. It's important to skill-up for the job, but you can't always find the learning content when you need it. I highly recommend this resource for entry level researchers, undergrads, and graduate students in all fields. Regardless of your major or philosophical ideology, the basic methods remain the same. Managers and decision-makers will also find this useful as a practical reference. Thank you, Dr. Mark A. Biernbaum!