Scientists need not necessarily increase overall sample size by default when including both sexes in in vivo studies

Before about 2010, laboratory animal research was strongly biased toward using a single sex (one obvious reason for preferring males, in adult laboratory animals, is the variability of females due to their estrous cycles). Although practice varies between subdisciplines, this strategy has resulted in a heavy skew toward males. For example, a 2009 analysis found that only 26% of studies used both sexes and that, among the remainder, 80% showed a male bias.

The negative consequences of this bias for scientific originality are beginning to be better understood, as evidence emerges that our current fundamental biological knowledge base may itself be biased. For example, a recent report concluded that the fundamental molecular basis of pain is highly sexually dimorphic, yet much of our knowledge in this area has been derived from studies that used only male animals. This situation risks creating a knowledge imbalance that persists through the research pipeline and ultimately manifests in the clinic.

To improve the translation of results from animals to humans, there has been a push to include both male and female animals in studies. In fact, numerous funding bodies — including the NIH in the United States and the MRC in the United Kingdom — now have inclusion mandates. These policies do not require scientists to study differences between males and females per se, but rather aim to improve the generalizability of studies by calculating an average effect estimated from both sexes.

If, however, there is a large, meaningful sex difference in the treatment (or response) effect, studies should be designed so that the visualization and analysis can detect it. The NIH policy introduced the term “Sex as a Biological Variable” (SABV). The authors [see attached] use the term to represent a sex-inclusive research philosophy that emphasizes the importance of automatic inclusion, with a focus on treatment- or response-effect estimates.

Any of a wide range of factors (including animal strain, age, and health status) could also be the focus of a movement to improve research generalizability. However, sex is a particularly pressing and timely target for improved representation because, clinically, females account for more than 50% of almost any population of interest yet are currently largely overlooked.

The authors [see attached] conducted an in-depth examination of “the consequences of including both sexes” on statistical power. They ran simulations on artificial datasets covering a range of outcomes that may occur when a treatment effect is studied in both sexes. These included baseline sex differences, as well as situations in which the size of the treatment effect depended on sex, with the effect in the two sexes going in either the same direction or opposite directions. The data were then analyzed using either a factorial analysis (which is appropriate for the design) or a t test after pooling or disaggregating the data (both of which are common but erroneous strategies).

The authors’ results demonstrated that, in most scenarios, splitting the sample size across sexes causes no loss of statistical power to detect treatment effects, provided that the data are analyzed using an appropriate factorial method (e.g., two-way ANOVA). In the rare situations where power is lost, the benefit of understanding the role of sex outweighs the power considerations. In addition, use of inappropriate analysis pipelines results in a loss of statistical power. Therefore, as a standard strategy, the authors recommend splitting the sample size across male and female mice and analyzing the resulting data with a factorial analysis. 😊
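To make the comparison above concrete, here is a minimal Python sketch of a toy power simulation. It is not the authors' simulation code, and every number in it is an assumption chosen for illustration: 10 animals per cell (40 in total), a treatment effect of 1 SD in both sexes, a baseline sex difference of 2 SD, and unit noise. The function names are hypothetical. To avoid external dependencies, the 5% critical values of F(1, 36) and F(1, 38) are hard-coded from standard tables; in practice a statistics library would supply them.

```python
import random
from statistics import mean

N_PER_CELL = 10  # assumed animals per sex-by-arm cell (40 animals in total)

# 5% critical values of F(1, df) from standard tables (squared two-sided
# t critical values 2.028 and 2.024); valid only for N_PER_CELL = 10.
F_CRIT = {36: 4.113, 38: 4.098}


def ss_within(group):
    """Sum of squared deviations of a group from its own mean."""
    m = mean(group)
    return sum((y - m) ** 2 for y in group)


def treatment_f(control, treated, error_groups):
    """F statistic for treatment: arm mean square over error mean square.

    For a balanced design this equals the two-way ANOVA main-effect F when
    error_groups are the four cells, and the squared pooled t statistic
    when error_groups are just the two arms.
    """
    grand = mean(control + treated)
    ss_treat = sum(len(g) * (mean(g) - grand) ** 2 for g in (control, treated))
    df_error = sum(len(g) - 1 for g in error_groups)
    ms_error = sum(ss_within(g) for g in error_groups) / df_error
    return ss_treat / ms_error, df_error


def simulate_cells(rng, effect, sex_diff):
    """Draw one balanced 2x2 dataset with noise SD 1 (assumed model)."""
    cells = {}
    for sex, base in (("M", 0.0), ("F", sex_diff)):
        cells[(sex, "control")] = [rng.gauss(base, 1.0) for _ in range(N_PER_CELL)]
        cells[(sex, "treated")] = [rng.gauss(base + effect, 1.0) for _ in range(N_PER_CELL)]
    return cells


def power(analysis, sims=2000, seed=1, effect=1.0, sex_diff=2.0):
    """Estimate power for 'factorial', 'pooled', or 'single_sex' analysis."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(sims):
        if analysis == "single_sex":
            # Same total sample size, but all animals are of one sex.
            control = [rng.gauss(0.0, 1.0) for _ in range(2 * N_PER_CELL)]
            treated = [rng.gauss(effect, 1.0) for _ in range(2 * N_PER_CELL)]
            error = [control, treated]
        else:
            cells = simulate_cells(rng, effect, sex_diff)
            control = cells[("M", "control")] + cells[("F", "control")]
            treated = cells[("M", "treated")] + cells[("F", "treated")]
            # Factorial: error from within cells. Pooled: sex is ignored.
            error = list(cells.values()) if analysis == "factorial" else [control, treated]
        f_stat, df_error = treatment_f(control, treated, error)
        hits += f_stat > F_CRIT[df_error]
    return hits / sims


if __name__ == "__main__":
    for name in ("single_sex", "factorial", "pooled"):
        print(f"{name:>10}: estimated power = {power(name):.2f}")
```

Under these assumptions, the factorial analysis of the mixed-sex design should give roughly the same estimated power as a single-sex study of the same total size. The pooled t test should do noticeably worse, because the baseline sex difference inflates its error term. The sketch covers only one scenario (equal treatment effects in both sexes) and does not model the sex-dependent effects that the authors also simulated.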


PLoS Biol June 2023; 21: e3002129
