This year’s Nobel Memorial Prize in Economics goes to David Card “for his empirical contributions to labour economics” and Joshua D. Angrist and Guido W. Imbens “for their methodological contributions to the analysis of causal relationships”. The Royal Swedish Academy of Sciences writes in a comprehensive article about the scientific background of this prize (PDF): “Taken together, […] the Laureates’ contributions have played a central role in establishing the so-called design-based approach in economics. This approach – aimed at emulating a randomized experiment to answer a causal question using observational data – has transformed applied work and improved researchers’ ability to answer causal questions of great importance for economic and social policy using observational data.” Much like what is still widespread in SCM research today, the traditional approach to causal inference in economics relied on structural equation models at least until the 1980s. Building on the laureates’ work on the local average treatment effect, however, natural experiments have become increasingly popular in economics. Unfortunately, almost no corresponding research exists in our discipline, although a number of natural experiments have been carried out in related disciplines (e.g., Lee & Puranam, 2017; Li & Zhu, 2021; Huang et al., 2021). Perhaps this Nobel Prize can serve as an inspiration for more natural experiments in the SCM discipline, too?
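For readers who have not worked with the local average treatment effect, a small simulation may help. The following is a minimal sketch, not the laureates' own procedure: it assumes a binary as-if-random instrument `z` (the "natural experiment"), a binary treatment `d`, and an outcome `y`. All names, effect sizes, and the data-generating process are invented for illustration. It uses the Wald estimator, which divides the instrument's effect on the outcome by its effect on treatment uptake.

```python
import random
from statistics import mean

def simulate(n=40000, seed=1):
    """Simulated data (purely hypothetical) with an unobserved confounder u."""
    rng = random.Random(seed)
    z, d, y = [], [], []
    for _ in range(n):
        zi = rng.randint(0, 1)          # as-if random assignment
        u = rng.gauss(0, 1)             # unobserved confounder
        if u > 0.7:                     # always-takers: treated regardless of z
            di, effect = 1, 5.0
        elif u < -0.7:                  # never-takers: untreated regardless of z
            di, effect = 0, 0.0
        else:                           # compliers: treatment follows z
            di, effect = zi, 2.0        # assumed complier effect of 2.0
        yi = effect * di + 1.5 * u + rng.gauss(0, 1)
        z.append(zi)
        d.append(di)
        y.append(yi)
    return z, d, y

def group_mean(values, groups, g):
    """Mean of values for observations whose group indicator equals g."""
    return mean(v for v, gi in zip(values, groups) if gi == g)

def naive_difference(d, y):
    """Treated-minus-untreated comparison; biased here because of u."""
    return group_mean(y, d, 1) - group_mean(y, d, 0)

def wald_estimate(z, d, y):
    """Wald estimator: reduced form divided by first stage."""
    reduced_form = group_mean(y, z, 1) - group_mean(y, z, 0)
    first_stage = group_mean(d, z, 1) - group_mean(d, z, 0)
    return reduced_form / first_stage

z, d, y = simulate()
print("naive difference:", round(naive_difference(d, y), 2))
print("Wald estimate:   ", round(wald_estimate(z, d, y), 2))
```

In this simulated setting, the Wald estimate should land near the assumed complier effect, whereas the naive comparison mixes in the confounder and the always-takers. Note that the estimate speaks only for compliers, which is exactly why the effect is called "local".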
Our discipline is still almost exclusively shaped by positivism. This is very surprising in view of the very complex social phenomena with which the discipline deals. Recently, however, I have noticed a (slowly) growing trend toward interpretivism. For example, Darby and her coauthors (2019) have discussed the set of questions interpretive research can address in SCM. Many SCM researchers may still be unsure of how best to conduct an interpretive study. Used to the structured approaches of positivist studies (e.g., Yin), we would often like to have a template in hand that shows us how to conduct an interpretive study. A new article by Mees-Buss and her coauthors (2021) argues that the inductive route to theory that templates (e.g., Gioia) offer does not address the challenges of interpretation. They argue that “a return to a hermeneutic orientation opens the way to more plausible and insightful theories based on interpretive rather than procedural rigor” and they offer “a set of heuristics to guide both researchers and reviewers along this path”.
Mees-Buss, J., Welch, C., & Piekkari, R. (2021), From Templates to Heuristics: How and Why to Move Beyond the Gioia Methodology. Organizational Research Methods, in print. https://doi.org/10.1177/1094428120967716
In her insightful Nature comment Rein in the Four Horsemen of Irreproducibility, Dorothy Bishop describes how threats to reproducibility, recognized but unaddressed for decades, might finally be brought under control by avoiding what she refers to as “the four horsemen of the reproducibility apocalypse”: publication bias, low statistical power, P-value hacking and HARKing (hypothesizing after results are known). In the video below she makes several important points. My perception is that the SCM research community does not take the reproducibility debate seriously enough.
Some time ago, an editorial in Nature Human Behaviour highlighted that “[the] quest for positive results encourages numerous questionable research practices […] such as HARKing (hypothesizing after the results are known) and P-hacking (collecting or selecting data or statistical analyses until non-significant results become significant)”. To counteract these very serious problems, which make theory-testing research almost useless, the journal has adopted the registered report format, which “shift[s] the emphasis from the results of research to the questions that guide the research and the methods used to answer them”. Similarly, the European Journal of Personality has recently announced that it will support the registered report format, too: “In a registered report, authors create a study proposal that includes theoretical and empirical background, research questions/hypotheses, and pilot data (if available). Upon submission, this proposal will then be reviewed prior to data collection, and if accepted, the paper resulting from this peer-reviewed procedure will be published, regardless of the study outcomes.” I can only hope that SCM journals will quickly catch up with this development in other fields.
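To see why "collecting data until non-significant results become significant" is so harmful, consider a rough simulation. This is a minimal sketch under simplifying assumptions: the null hypothesis is true, the data are standard normal with a known standard deviation of 1 (so a z-test applies), and the researcher checks the P value after every batch of ten observations. The function names, batch size, and number of runs are my own choices, not taken from the editorial.

```python
import random
from statistics import NormalDist, mean

def p_value_two_sided(sample):
    """Two-sided z-test of H0: mean = 0, assuming a known sd of 1."""
    z = mean(sample) * len(sample) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rate(peek, n_max=100, batch=10, alpha=0.05,
                        runs=2000, seed=7):
    """Share of null studies declared 'significant'.

    peek=True:  test after every batch and stop at the first p < alpha.
    peek=False: test once, after all n_max observations are collected.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        sample = []
        significant = False
        while len(sample) < n_max:
            sample.extend(rng.gauss(0, 1) for _ in range(batch))
            if peek and p_value_two_sided(sample) < alpha:
                significant = True
                break
        if not peek:
            significant = p_value_two_sided(sample) < alpha
        hits += significant
    return hits / runs

print("fixed sample size:", false_positive_rate(peek=False))
print("optional stopping:", false_positive_rate(peek=True))
```

With a fixed sample size, the false-positive rate stays close to the nominal 5%. With repeated peeking it is clearly higher, even though there is no effect at all. A registered report removes this degree of freedom because the sampling plan is fixed before the data are collected.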
Several journals have already reacted to the p value debate. For example, an ASQ essay provides suggestions that editors, but not only editors, should read. Another example is the set of policies published by SMJ: SMJ “will no longer accept papers for publication that report or refer to cut-off levels of statistical significance (p-values)”. Instead, “authors should report either standard errors or exact p-values (without asterisks) or both, and should interpret these values appropriately in the text”. “[T]he discussion could report confidence intervals, explain the standard errors and/or the probability of observing the results in the particular sample, and assess the implications for the research questions or hypotheses tested.” SMJ will also require authors to “explicitly discuss and interpret effect sizes of relevant estimated coefficients”. It might well be that we are currently observing the beginning of the end of null-hypothesis statistical tests. And it might only be a matter of time before other journals, including SCM journals, require authors to remove references to statistical significance and statistical hypothesis testing and, ultimately, to remove p values from their manuscripts.
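What might such reporting look like in practice? Below is a minimal sketch for a simple comparison of two group means. It is only an illustration, not SMJ's prescribed procedure: the data are invented, the function and key names are hypothetical, and the interval and p value use a normal approximation (for small samples, a t-based calculation from a statistics package would be more appropriate).

```python
from statistics import NormalDist, mean, stdev

def report_difference(group_a, group_b, level=0.95):
    """Mean difference with standard error, confidence interval,
    exact p value, and Cohen's d (normal approximation throughout)."""
    na, nb = len(group_a), len(group_b)
    va, vb = stdev(group_a) ** 2, stdev(group_b) ** 2
    diff = mean(group_a) - mean(group_b)
    se = (va / na + vb / nb) ** 0.5
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(diff / se)))
    pooled_sd = (((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)) ** 0.5
    return {
        "difference": diff,
        "standard_error": se,
        "confidence_interval": (diff - z_crit * se, diff + z_crit * se),
        "p_value": p_value,            # reported exactly, no asterisks
        "cohens_d": diff / pooled_sd,  # standardized effect size
    }

# Invented example data, e.g. a performance score in two groups of firms.
group_a = [5, 6, 7, 8, 9]
group_b = [3, 4, 5, 6, 7]
for name, value in report_difference(group_a, group_b).items():
    print(name, value)
```

The point is less the arithmetic than the habit: the estimate, its uncertainty, and its magnitude are reported and interpreted together, and no threshold decides whether the result "counts".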
The P value debate has revealed that hypothesis testing is in crisis – also in our discipline! But what should we do now? Nature recently asked influential statisticians to recommend one change to improve science. Here are five answers: (1) Adjust for human cognition: Data analysis is not purely computational – it is a human behavior. So, we need to prevent cognitive mistakes. (2) Abandon statistical significance: Academia seems to like “statistical significance”, but P value thresholds are too often abused to decide between “effect” (favored hypothesis) and “no effect” (null hypothesis). (3) State false-positive risk, too: What matters is the probability that a significant result turns out to be a false positive. (4) Share analysis plans and results: Techniques to avoid false positives are to pre-register analysis plans, and to share all data and results of all analyses as well as any relevant syntax or code. (5) Change norms from within: Funders, journal editors and leading researchers need to act. Otherwise, researchers will continue to re-use outdated methods, and reviewers will demand what has been demanded of them.
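The third recommendation can be made tangible with a short calculation. This is a minimal sketch of one simple way to compute a false-positive risk: among all results that reach "p < alpha", it asks what share comes from true null hypotheses. The prior probability that a tested effect is real, the power, and the threshold are assumptions you have to supply; the values below are illustrative only, and other definitions of the false-positive risk exist.

```python
def false_positive_risk(prior, power=0.8, alpha=0.05):
    """Share of 'significant' results that are false positives.

    prior: assumed probability that a tested hypothesis is true
    power: assumed probability of detecting a true effect
    alpha: significance threshold
    """
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return false_positives / (false_positives + true_positives)

for prior in (0.5, 0.1):
    print(f"prior={prior}: risk={false_positive_risk(prior):.2f}")
```

If only one in ten tested hypotheses is true, then under these assumptions more than a third of the "significant" findings are false positives, despite the 5% threshold. The risk grows further when power is low.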
Leek, J., McShane, B.B., Gelman, A., Colquhoun, D., Nuijten, M.B. & Goodman, S.N. (2017). Five Ways to Fix Statistics. Nature, 551 (2), 557-559. DOI: 10.1038/d41586-017-07522-z
Academics and students often have very different ideas in mind when they talk about case study research. Indeed, case studies in SCM research are not all alike, and several different case study research designs can be distinguished. A recent article by Ridder (2017), titled The Theory Contribution of Case Study Research Designs, provides an overview of four common approaches. First, there is the “no theory first” type of case study design, which is closely connected to Eisenhardt’s methodological work. The second type of research design is about “gaps and holes”, following Yin’s guidelines. This is probably the type of case study design seen most often in SCM journals. A third design deals with a “social construction of reality”, which is represented by Stake. Finally, the reason for case study research can also be to identify “anomalies”. A representative scholar of this approach is Burawoy. Each of these four approaches has its areas of application, but it is important to understand their unique ontological and epistemological assumptions. A very similar overview is provided by Welch et al. (2011).
Ridder, H.G. (2017). The Theory Contribution of Case Study Research Designs. Business Research, 10 (2), 281-305. https://doi.org/10.1007/s40685-017-0045-z
“Scale purification” – the process of eliminating items from multi-item scales – is widespread in empirical research, but studies that critically examine the implications of this process are scarce. In our new article, titled Statistical and Judgmental Criteria for Scale Purification, we (1) discuss the methodological underpinning of scale purification, (2) critically analyze the current state of scale purification in supply chain management (SCM) research, and (3) provide suggestions for advancing the scale purification process. Our research highlights the need for rigorous scale purification decisions based on both statistical and judgmental criteria. We suggest several methodological improvements. Particularly, we present a framework to demonstrate that the justification for scale purification needs to be driven by reliability, validity and parsimony considerations, and that this justification needs to be based on both statistical and judgmental criteria. We believe that our framework and additional suggestions will help to advance the knowledge about scale purification in SCM and adjacent disciplines.
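To illustrate what a purely statistical purification criterion looks like, here is a minimal sketch that computes Cronbach's alpha and the familiar "alpha if item deleted" statistic. It is not the framework from our article, only one commonly used reliability statistic; the response data and function names are invented for illustration.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha; items is a list of item columns,
    each holding one score per respondent."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

def alpha_if_item_deleted(items):
    """Alpha of the remaining scale when each item is dropped in turn."""
    return [cronbach_alpha(items[:i] + items[i + 1:])
            for i in range(len(items))]

# Invented responses of six respondents to a four-item scale.
items = [
    [1, 2, 3, 4, 5, 4],
    [2, 2, 3, 4, 5, 5],
    [1, 3, 3, 5, 5, 4],
    [4, 1, 5, 2, 3, 1],  # an item that behaves differently from the rest
]
print("alpha, full scale:", round(cronbach_alpha(items), 2))
for i, alpha in enumerate(alpha_if_item_deleted(items), start=1):
    print(f"alpha without item {i}:", round(alpha, 2))
```

In this invented example, dropping the fourth item raises alpha considerably. Whether the item should actually be dropped is a different question: if it captures a facet of the construct that the other items miss, removing it may improve reliability at the expense of validity. This is why the statistical criterion needs to be complemented by judgmental criteria.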
Wieland, A., Durach, C.F., Kembro, J. & Treiblmaier, H. (2017). Statistical and Judgmental Criteria for Scale Purification. Supply Chain Management: An International Journal, 22 (4). DOI: 10.1108/SCM-07-2016-0230
We should not ignore that researchers – in general but also in supply chain management – are not always as well trained in data analysis as they should be. A highly visible discussion is currently going on regarding the prevalent misuses of p-values. For example, too often research has been considered “good” research just because the p-value passed a specific threshold – also in the SCM discipline. But the p-value is not an interpretation; rather, it needs interpretation! Some statisticians now even prefer to replace p-values with other approaches and some journals have decided to ban p-values. Based on this ongoing discussion, the influential American Statistical Association has now issued a Statement on Statistical Significance and p-values. It contains six principles underlying the proper use and interpretation of the p-value. As a discipline, we should take these principles seriously: in our own research, but also when we review the manuscripts of our colleagues.
Wasserstein, R., & Lazar, N. (2016). The ASA’s Statement on p-values: Context, Process, and Purpose. The American Statistician https://doi.org/10.1080/00031305.2016.1154108
I believe we all have already experienced this: The same concept can sometimes be defined in very different ways by different authors. Conceptual clarity would certainly be great, but how can we achieve it? Think, for example, about concepts such as trust, integration or dependence. So, what do we really mean when we are talking about them? In their new article, Recommendations for Creating Better Concept Definitions in the Organizational, Behavioral, and Social Sciences, Podsakoff, MacKenzie & Podsakoff (2016) present four stages for developing good conceptual definitions: Researchers need to (1) “identify potential attributes of the concept and/or collect a representative set of definitions”; (2) “organize the potential attributes by theme and identify any necessary and sufficient ones”; (3) “develop a preliminary definition of the concept”; and (4) “[refine] the conceptual definition of the concept”. For each of these stages, the authors provide comprehensive guidelines and examples which can help supply chain researchers to improve the definitions of the concepts we use.
Podsakoff, P., MacKenzie, S., & Podsakoff, N. (2016). Recommendations for Creating Better Concept Definitions in the Organizational, Behavioral, and Social Sciences. Organizational Research Methods, 19 (2), 159-203 https://doi.org/10.1177/1094428115624965