Several journals have already reacted to the p value debate. For example, an ASQ essay provides suggestions that not only every editor should read. Another example are the policies published by SMJ: SMJ “will no longer accept papers for publication that report or refer to cut-off levels of statistical significance (p-values)”. Instead, “authors should report either standard errors or exact p-values (without asterisks) or both, and should interpret these values appropriately in the text”. “[T]he discussion could report confidence intervals, explain the standard errors and/or the probability of observing the results in the particular sample, and assess the implications for the research questions or hypotheses tested.” SMJ will also require authors to “explicitly discuss and interpret effect sizes of relevant estimated coefficients”. It might well be that we are currently observing the beginning of the end of null-hypothesis statistical tests. And it might only be a matter of time before other journals, also SCM journals, require authors to remove references to statistical significance and statistical hypothesis testing and, ultimately, to remove p values from their manuscripts.
The P value debate has revealed that hypothesis testing is in crisis – also in our discipline! But what should we do now? Nature recently asked influential statisticians to recommend one change to improve science. Here are five answers: (1) Adjust for human cognition: Data analysis is not purely computational – it is a human behavior. So, we need to prevent cognitive mistakes. (2) Abandon statistical significance: Academia seems to like “statistical significance”, but P value thresholds are too often abused to decide between “effect” (favored hypothesis) and “no effect” (null hypothesis). (3) State false-positive risk, too: What matters is the probability that a significant result turns out to be a false positive. (4) Share analysis plans and results: Techniques to avoid false positives are to pre-register analysis plans, and to share all data and results of all analyses as well as any relevant syntax or code. (5) Change norms from within: Funders, journal editors and leading researchers need to act. Otherwise, researchers will continue to re-use outdated methods, and reviewers will demand what has been demanded of them.
Leek, J., McShane, B.B., Gelman, A., Colquhoun, D., Nuijten, M.B. & Goodman, S.N. (2017). Five Ways to Fix Statistics. Nature, 551 (2), 557-559. DOI: 10.1038/d41586-017-07522-z
“Scale purification” – the process of eliminating items from multi-item scales – is widespread in empirical research, but studies that critically examine the implications of this process are scarce. In our new article, titled Statistical and Judgmental Criteria for Scale Purification, we (1) discuss the methodological underpinning of scale purification, (2) critically analyze the current state of scale purification in supply chain management (SCM) research, and (3) provide suggestions for advancing the scale purification process. Our research highlights the need for rigorous scale purification decisions based on both statistical and judgmental criteria. We suggest several methodological improvements. Particularly, we present a framework to demonstrate that the justification for scale purification needs to be driven by reliability, validity and parsimony considerations, and that this justification needs to be based on both statistical and judgmental criteria. We believe that our framework and additional suggestions will help to advance the knowledge about scale purification in SCM and adjacent disciplines.
Wieland, A., Durach, C.F., Kembro, J. & Treiblmaier, H. (2017). Statistical and Judgmental Criteria for Scale Purification. Supply Chain Management: An International Journal, 22 (4). DOI: 10.1108/SCM-07-2016-0230
You should all read this interesting article: Approaching the Conceptual Leap in Qualitative Research by Klag & Langley (2013), which is useful for researchers who build theory from qualitative data. Its central message is “that the abductive process is constructed through the synthesis of opposites that [the authors] suggest will be manifested over time in a form of ‘bricolage’.” The authors use four dialectic tensions: deliberation—serendipity, engagement—detachment, knowing—not knowing, social connection—self-expression. One of the poles of each dialectic has a disciplining character, the other pole has a liberating influence: On the one hand, overemphasizing the disciplining poles “may result in becoming ‘bogged down’ in contrived frameworks (deliberation), obsessive coding (engagement), cognitive inertia (knowing) or collective orthodoxy (social connection)”. On the other hand, overemphasizing the liberating poles “can also be unproductive as researchers wait for lightning to strike (serendipity), forget the richness and nuances of their data (detachment), reinvent the wheel (not knowing) or drift off into groundless personal reflection (self-expression)”.
Klag, M., & Langley, A. (2013). Approaching the Conceptual Leap in Qualitative Research. International Journal of Management Reviews, 15 (2), 149-166 DOI: 10.1111/j.1468-2370.2012.00349.x
Just like OM research, SCM research is dominated by three research methodologies: (1) analytical modelling research (optimization, computational, and simulation models etc.), (2) quantitative empirical research (surveys etc.), and (3) case study research. There has been a recent trend towards multi-methodological research that combines different methodologies. A new article by Choi, Cheng and Zhao, titled Multi-Methodological Research in Operations Management, investigates this trend. The authors “present some multi-methodological approaches germane to the pursuit of rigorous and scientific operations management research” and “discuss the strengths and weaknesses of such multi-methodological approaches”. The authors make clear that multi-methodological approaches can make our research “more scientifically sound, rigorous, and practically relevant” and “permit us to explore the problem in ‘multiple dimensions’”. However, such research can also be “risky as it requires high investments of effort and time but the final results might turn out to be not fruitful”. Anyhow, as the authors conclude: “no pain, no gain”!
Choi, T., Cheng, T., & Zhao, X. (2015). Multi-Methodological Research in Operations Management. Production and Operations Management DOI: 10.1111/poms.12534
The AVE–SV comparison (Fornell & Larcker, 1981) is certainly the most common technique for detecting discriminant validity violations on the construct level. An alternative technique, proposed by Henseler et al. (2015), is the heterotrait–monotrait (HTMT) ratio of correlations (see the video below). Based on simulation data, these authors show for variance-based structural equation modeling (SEM), e.g. PLS, that AVE–SV does not reliably detect discriminant validity violations, whereas HTMT identifies a lack of discriminant validity effectively. Results of a related study conducted by Voorhees et al. (2016) suggest that both AVE–SV and HTMT are recommended for detecting discriminant validity violations if covariance-based SEM, e.g. AMOS, is used. They show that the HTMT technique with a cutoff value of 0.85 – abbreviated as HTMT.85 – performs best overall. In other words, HTMT should be used in both variance-based and covariance-based SEM, AVE–SV should be used only in covariance-based SEM. One might be tempted to prefer inferential tests over such heuristics. However, the constrained ϕ approach did not perform well in Voorhees et al.’s study.
Fornell, C., & Larcker, D. (1981). Evaluating Structural Equation Models with Unobservable Variables and Measurement Error. Journal of Marketing Research, 18 (1) https://doi.org/10.2307/3151312
Henseler, J., Ringle, C., & Sarstedt, M. (2015). A New Criterion for Assessing Discriminant Validity in Variance-based Structural Equation Modeling. Journal of the Academy of Marketing Science, 43 (1), 115-135 https://doi.org/10.1007/s11747-014-0403-8
Voorhees, C., Brady, M., Calantone, R., & Ramirez, E. (2016). Discriminant Validity Testing in Marketing: An Analysis, Causes for Concern, and Proposed Remedies. Journal of the Academy of Marketing Science, 44 (1), 119-134 https://doi.org/10.1007/s11747-015-0455-4
Trends in management towards a concentration on core competencies and outsourcing of non-core activities have created complex networks, i.e., global supply chains. At the same time, it has been discussed that this increased complexity has also made companies more vulnerable. An interesting paper, Structural Drivers of Upstream Supply Chain Complexity and the Frequency of Supply Chain Disruptions, co-authored by Bode and Wagner, is currently forthcoming in the Journal of Operations Management. Herein, the authors distinguish between three drivers of upstream supply chain complexity: (1) horizontal complexity (= the number of direct suppliers in a firm’s supply base), (2) vertical complexity (= the number of tiers in the supply chain), and (3) spatial complexity (= the geographical spread of the supply base). Based on survey data, the authors find that all of these three drivers increase the frequency of supply chain disruptions. It is further found that these three variables even amplify each other’s effects in a synergistic fashion.
Bode, C., & Wagner, S. (2015). Structural Drivers of Upstream Supply Chain Complexity and the Frequency of Supply Chain Disruptions. Journal of Operations Management, 36, 215–228 https://doi.org/10.1016/j.jom.2014.12.004
Two ingredients are needed to create supply chain resilience (Wieland & Wallenburg, 2013): robustness, which is proactive, and agility, which is reactive. Robustness builds on anticipation “to gain knowledge about potential changes that might occur in the future” and preparedness “to maintain a stable situation”. Agility builds on visibility “to gain knowledge about actual changes that are currently occurring” and speed “to get back to a stable situation”.
Wieland, A., & Wallenburg, C.M. (2013). The Influence of Relational Competencies on Supply Chain Resilience: A Relational View. International Journal of Physical Distribution & Logistics Management, 43 (4), 300-320 https://doi.org/10.1108/IJPDLM-08-2012-0243
Are you currently conducting conceptual, qualitative, or survey research? Are you also aiming to publish the results in a top journal? Then I have some tips for you that could bring you one step closer to your goal. These tips can be found in a recent JBL editorial: A Trail Guide to Publishing Success: Tips on Writing Influential Conceptual, Qualitative, and Survey Research. Herein, the authors identify and describe agreed-upon basics that can help to “(1) increase consistency in the review process, (2) reduce publication cycles, and (3) begin to roll back the length of articles”. For three types of research (conceptual, qualitative, and survey research), best practices are presented for crafting articles. I especially like a table with warning signs “that authors are wandering down a perilous path”, which can be used as a check list for your own research. These warning signs might also help reviewers to evaluate the quality of a manuscript.
Fawcett, S., Waller, M., Miller, J., Schwieterman, M., Hazen, B., & Overstreet, R. (2014). A Trail Guide to Publishing Success: Tips on Writing Influential Conceptual, Qualitative, and Survey Research. Journal of Business Logistics, 35 (1), 1-16 https://doi.org/10.1111/jbl.12039
Theory-building empirical research needs formal conceptual definitions. Particularly, such definitions are necessary conditions for construct validity. But what is a “good” formal conceptual definition? In his seminal JOM paper, A Theory of Formal Conceptual Definitions: Developing Theory-building Measurement Instruments, Wacker (2004) presents eight rules for formal conceptual definitions: (1) “Definitions should be formally defined using primitive and derived terms.” (2) “Each concept should be uniquely defined.” (3) “Definitions should include only unambiguous and clear terms.” (4) “Definitions should have as few as possible terms in the conceptual definition to avoid violating the parsimony virtue of ‘good’ theory.” (5) “Definitions should be consistent within the [general academic] field.” (6) “Definitions should not make any term broader.” (7) “New hypotheses cannot be introduced in the definitions.” (8) “Statistical tests for content validity must be performed after the terms are formally defined.” These rules are explained in detail in Wacker’s article. I am convinced that Wacker’s rules lead to better measurement instruments.
Wacker, J.G. (2004). A Theory of Formal Conceptual Definitions: Developing Theory-building Measurement Instruments. Journal of Operations Management, 22 (6), 629-650 https://doi.org/10.1016/j.jom.2004.08.002