hi there. i'm here to lecture you about power again. it's what i do for fun.
collecting data is hard. large samples take time and resources. i am sympathetic to the view that it's sometimes ok to have small samples.
but if you're doing a typical social/personality lab experiment, or a correlational study using questionnaire measures, then it's probably not ok. for those types of studies, adequate power should be a basic requirement for publishing your work in a good journal.*
when i hear people push back against the call for larger samples because they are sticking up for people who use hard-to-collect data, i scratch my head. those people are exactly why i think we absolutely need to increase the sample size of typical social/personality studies. if some of our colleagues are busting their asses measuring cortisol four times a day for weeks, or coding couples' behavior as they discuss marital problems, and even they can get samples of 100 or 200, then the least the rest of us can do is get a couple hundred undergrads to come to our labs for an hour.
when i see a simple lab/questionnaire study that has a smaller sample size than many super-hard-to-collect studies, it makes me sad.
today, i want to introduce you to two of my favorite researchers who do some of the hardest research i know of. i picked them because they study super important questions with incredibly rigorous methods. they are the people i think of when someone with a simple study is claiming that it would just be too hard to collect 80 more undergrad participants. they agreed to let me write about them,** but all of the opinions expressed in this post are mine alone.
case study one
kristina olson studies gender nonconformity and transgender children's development (among other things). this research is almost guaranteed to lead to important insights about the development of identity, gender stereotypes, and of course the understudied population of transgender and gender nonconforming individuals. but recruiting a sample of transgender children, and following them over time, is hard.***
here's what they do. when a family signs up for the study and meets the inclusion criteria, they make plans for a few members of the research team to drive or fly out to where the family lives, go to their home, and conduct several hours of tasks with various family members. for every. single. family.
their first paper (Olson, Key, & Eaton, 2015) included a sample of 32 socially transitioned transgender kids, 18 cisgender siblings, and 32 matched controls. that's a total sample size of 82 participants. this dataset took one year to collect, including four flights and two roadtrips. and their sample size is LARGER than the median sample size for social/personality research published in psych science (Fraley & Vazire, 2014).
but they didn't stop there. their team is currently expanding this dataset, aiming for 200+ socially transitioned transgender kids, some gender nonconforming kids, a matched control for each kid, and a sibling control when possible. their total sample size will likely be around 750, and they're well on their way. they have families from 22 different states, and their waitlist includes families from 17 more states.
wait, it gets better. in addition to recruiting this incredible sample, they plan to follow them longitudinally, assess them in person every 2-3 years, and conduct questionnaire and interview measures more regularly. also they will cure world hunger. (that last part was a joke. it kind of all sounds unbelievable to me, so i just wanted to make sure you knew.)
one interesting moral of this story is that the psych science paper with a meager sample size of 82 is what made the larger, 700+ person study possible. it sparked interest in the topic, and gave olson a stronger foundation for recruiting new participants, securing funding, and investing more of her own time and effort into this project. this is a perfect example of when publishing a small sample study is obviously a good call.
case study two
jennifer tackett studies environmental and dispositional factors that predict aggression and self-control/disinhibition (among other things). her research focuses on childhood and adolescence, and uncovers important biological and environmental processes that influence the development of personality traits and personality disorders. her research also involves recruiting families and following them longitudinally, and measuring everything you can imagine about them.
one of her recent papers (Tackett, Herzhoff, Kushner, & Rule, in press) came from a dataset including over 300 families that were assessed four times over four years. two of these assessments involved a two and a half hour in-lab assessment of the child, mom, and dad, and included questionnaire measures, videotaping the kids in 15 different situations and coding their behavior, clinical interviews (also coded and transcribed), and genotyping (plus some other measures).
pretty damn impressive.
but then tackett decided that wasn't good enough, so she did the study again, but with a more ethnically and economically diverse sample, more behavioral tasks (including a parent-child interaction), and more psychophysiological measures. each session lasts three and a half to four hours in the laboratory, which means that the families get a meal, transportation to and from the lab, and babysitting for siblings. and she's collected data from 350 families in this second study (that's over 700 families in the two studies combined).
just thinking about running that study exhausts me. thinking about trying to recruit a community sample like that makes me want to curl up in a ball. can you imagine recruiting 350 families from a broad range of backgrounds and convincing them to come to your lab for 3.5 hours to do a bunch of tasks and spit in some vials? now imagine arranging meals, parking/transportation, and babysitting for all of them. now imagine trying to recruit and manage a team of insanely dedicated grad students and RAs who are willing to spend their weekends running the study. without paid staff. good news: you don't have to, because while we are watching netflix and eating peanut butter out of a jar, jennifer tackett is doing it for us.
conclusion
when i talked to olson and tackett about this blog post, both insisted that they did not want to be held up as exemplars, and that they felt their research had important limitations. so let me be clear, this conclusion i am about to draw is mine and definitely, absolutely, positively not theirs:
the rest of us suck.
i'm not arguing all of us should do this. this cannot be the new standard. we're not all superhuman. these researchers (and their formidable grad students and RAs) are and always will be outliers, and they should get medals and many many grants and maybe you should consider their labs when doing your estate planning. but my point is that if they can do this, most of us can run 80-100 people per condition in our simple studies with undergrads.
researchers like tackett and olson could use the fact that their data are super hard to collect as an excuse not to collect large samples. but they don't. so the rest of us should suck it up and do the right thing. next time you think about stopping data collection at 35 people per condition because it's too hard to get twice that many, just be glad you don't have to fly to bakersfield**** to get your next participant. then collect more data.*****
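if you want to see roughly what those numbers buy you, here's a quick power sketch in python (using statsmodels). the two-condition between-subjects design and the d = 0.4 "typical" effect size are my assumptions, not anyone's actual data, so swap in your own numbers.

```python
# rough power check for a simple two-condition, between-subjects study
# (effect size d = 0.4 is an assumed "typical" effect, not a real estimate)
from statsmodels.stats.power import TTestIndPower

power_calc = TTestIndPower()

# power you actually get with 35 vs. 100 participants per condition
for n_per_condition in (35, 100):
    achieved = power_calc.power(effect_size=0.4, nobs1=n_per_condition, alpha=0.05)
    print(f"n = {n_per_condition} per condition -> power = {achieved:.2f}")

# participants per condition needed to hit 80% and 95% power
for target in (0.80, 0.95):
    n_needed = power_calc.solve_power(effect_size=0.4, power=target, alpha=0.05)
    print(f"{target:.0%} power -> about {n_needed:.0f} per condition")
```

with those assumptions, 35 per condition leaves you around 40% power, and you don't hit 80% until you're near 100 per condition (and 95% takes about 160).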
* and if you're doing an mTurk study, "adequate power" = at least 95% (why take a 1 in 5, or even 1 in 10, chance of a false negative if it costs you $18 to slash your type II error rate? there's a back-of-the-envelope version of that math right after these footnotes. is your research hypothesis worth less to you than a bottle of napa valley chardonnay?******)
** 'agreed' = tried to convince me that they weren't the best role models but didn't threaten to sue me if i went ahead.
*** kind of like the ironman is hard. but harder.
**** sorry bakersfield.
***** also, sequential analysis, within-person studies, bayes, openness, truth, utopia. did i miss anything?
****** while i'm hating on california, i hate oaky chardonnays. also, i just heard on the radio that the utility companies want us to know that the dirt-like odor and flavor of our tap water is a purely aesthetic issue, and will continue as long as the drought does. that oaky chardonnay is starting to sound pretty tasty now, actually. (who am i kidding, i love california. i would drink dirt for california any day of the week. but not oaky chardonnay.)
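and here's the back-of-the-envelope math behind footnote *. the d = 0.4 effect size, the jump from 90% to 95% power, and the $0.30-per-participant pay rate are all my own guesses, so the exact dollar figure depends entirely on what you pay and how big your effect is.

```python
# rough cost of buying extra power on mturk
# (d = 0.4, a two-condition design, and $0.30 per participant are all assumptions)
from statsmodels.stats.power import TTestIndPower

power_calc = TTestIndPower()
pay_per_participant = 0.30  # assumed payment per person, not an actual mturk rate

n_90 = power_calc.solve_power(effect_size=0.4, power=0.90, alpha=0.05)
n_95 = power_calc.solve_power(effect_size=0.4, power=0.95, alpha=0.05)

extra_people = 2 * (n_95 - n_90)  # two conditions
extra_cost = extra_people * pay_per_participant
print(f"90% -> 95% power: ~{extra_people:.0f} more participants, ~${extra_cost:.2f}")
```

with those guesses it works out to about 60 extra participants and somewhere around $19. chardonnay money.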

[image: seriously lazy.]