Some thoughts on writing a replication study

This article was written by Daniel Keum and illustrates his personal reflections on publishing a replication study.

In my recent replication article in SMJ, I reverse the following finding from two influential articles in the Journal of Finance: antitakeover laws that insulate managers from the market for corporate control decrease innovation. Instead, I find that these laws actually increase innovation. I thought it would be useful to provide some practical tips on publishing a replication study and share my experience (beware: it is based on a sample size of one) as well as insights from conversations I had with other researchers during the process.

Just to provide some context, the article did not start out as a replication. It came out of a manuscript on a related topic that was rejected multiple times because its findings contradicted those of the study that I ended up replicating.

I was very fortunate to have worked with a patient associate editor who found both the topic and replication to be important. However, because this was my first time working on a replication paper, I made ~~a few~~ many mistakes that made the process more challenging than it could have been. (Hence, this post.)

Process takeaways:

1. Keep the replication as focused as possible

The manuscript in its first submission contained six “additional” tables that went beyond the original study. This was partly because I was insecure about whether the reviewers would find the replication a sufficient contribution for publication. I also kept adding datasets and analyses on the theoretical mechanisms from the initial paper I wanted to write. This additional content ended up distracting every reader, diluting the manuscript’s focus, and weakening the forceful refutation of the original study. So, I would suggest keeping the scope of the manuscript as focused as possible. All of the “additional” analyses you include should be directly related to the underlying mechanisms documented in the replicated study. Even then, limit them to two tables. I was reminded several times that the finding that I was trying to refute should be sufficiently important and interesting by itself without adding any bells and whistles.

2. Manage your expectations when it comes to interacting with the original author(s)

Needless to say, the original author(s) will not be too happy with what you’re trying to do. More often than not, you will not have access to the original code and data. I went through the process of (1) notifying the original author(s) that I had trouble replicating their results, (2) requesting their code, (3) sending them my code for comments (while noting the key empirical choices that drive the differing conclusion), and (4) asking for their comments on the final manuscript. Of the two studies that I replicate in my study, one author was very hostile while the other was extremely helpful. Most authors are open to sharing their code, but if the data in use is readily accessible, then I believe being told to replicate it independently (i.e., hearing no to step 2) and receiving input from step 3 is fair and has its benefits.

3. Be airtight in your replication (obviously) and show your work step-by-step during reviewer interactions.

3.A. The reviewers will ask you to identify exactly which empirical choice(s) underpin the differing results. Include a table (with 6 to 8 columns, not 15) that sequentially introduces your changes to the original study one at a time. Also include a table that provides a side-by-side comparison of sample statistics.

3.B. Reviewers will (rightfully) question every step of your empirical work. This could lead to a very long response letter that can be painful for both the author and the reviewers. The upside is that the entire process also gives you peace of mind that you are 99% right (see below for the risk of getting it wrong). I found it helpful to (a) include the replication code in the submission file in case the reviewers want to take a look, and (b) if available, find two other (preferably well-known) studies to validate the sample statistics and empirical specification.

3.C. The default assumption from the reviewers is that you are mistaken and the original study is right. For my replication, this was despite the fact that the changes I made to the original study were not controversial at all, for example, putting in industry-by-year fixed effects standard to the literature. Having said that, I do believe that this skepticism is warranted, but it can feel frustrating and hostile at certain points during the review process. Just a friendly warning that the process will probably be slower than you might expect.
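
As a minimal sketch of the table described in 3.A, the Python snippet below builds one specification per column, starting from the original study's choices and layering on one change at a time. Nearly everything in it is hypothetical: the dictionary keys, the baseline settings, the sample window and winsorization values, and the `describe` stand-in for a real estimation routine. Only the idea of switching to industry-by-year fixed effects is taken from the example in 3.C. It also assumes that changes accumulate from column to column; you could instead apply each change in isolation.

```python
from typing import Any, Callable, Dict, List, Tuple

Spec = Dict[str, Any]


def sequential_specs(
    baseline: Spec, changes: List[Tuple[str, Spec]]
) -> List[Tuple[str, Spec]]:
    """Return one (column label, specification) pair per table column.

    Column 0 is the original specification; each later column keeps all
    earlier changes and adds exactly one more.
    """
    columns = [("Original specification", dict(baseline))]
    current = dict(baseline)
    for label, change in changes:
        current = {**current, **change}  # layer the next change on top
        columns.append((f"+ {label}", dict(current)))
    return columns


def build_table(
    columns: List[Tuple[str, Spec]], estimate: Callable[[Spec], Any]
) -> List[Tuple[str, Any]]:
    """Run a user-supplied estimation routine once per column."""
    return [(label, estimate(spec)) for label, spec in columns]


# Hypothetical placeholders, not the settings of any actual study.
baseline: Spec = {
    "fixed_effects": "firm and year",
    "sample_years": (1976, 1995),
    "winsorize": None,
}
changes = [
    ("industry-by-year fixed effects", {"fixed_effects": "firm and industry-by-year"}),
    ("extended sample window", {"sample_years": (1976, 2006)}),
    ("winsorized outcome", {"winsorize": 0.01}),
]


def describe(spec: Spec) -> str:
    """Stand-in for a real regression: it only reports the specification."""
    return ", ".join(f"{key}={value}" for key, value in spec.items())


columns = sequential_specs(baseline, changes)
for label, summary in build_table(columns, describe):
    print(f"{label}: {summary}")
```

In practice you would replace `describe` with whatever routine estimates your model, so that each column reports the coefficient of interest under that specification. Three changes give four columns, which leaves room to stay within the 6 to 8 suggested in 3.A.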

4. “Depersonalize” your comments.

One of the most helpful comments I received during the review process was to depersonalize the comments on the original manuscript. For example, use an acronym when referring to the original study (Lee, Li, and Lee, 2014 → LLL, 2014) rather than the authors’ names. Throughout the review process, the language of the manuscript inadvertently became strong and potentially confrontational. I felt that I had to convince the reviewers why certain choices in the original study were “wrong” and “must be corrected.” Because our field is small, there is a high chance that the reviewers will know the original author. There is also a non-negligible chance that one of your reviewers was thanked by name in the original study [~~insert many messy stories here~~]. Overly strong language risks turning away even the most sympathetic reviewers. For example, consider changing “unusual” empirical choices to “updated” or “more conventional” empirical choices.

For an entirely different and more aggressive approach to replication, see: https://mobile.twitter.com/TradeandMoney/status/1280236313191604225

Personal takeaways:

1. The intellectual experience of conducting a replication study is inherently negative in nature, which made the process stressful in ways that I did not expect.

1.A. The usual excitement and feeling of reward from discovering new knowledge is entirely absent. Instead, you are faced with the taxing task of negation. Trying to rehash and reconstruct someone else’s empirical work based on the descriptions furnished in the original study can be tedious, although important. The policy implications of the paper were meaningful enough for me to trudge through the process, but only barely. This was despite having an extremely encouraging AE. The finding you are trying to refute should be sufficiently important not only as a general academic topic but to you personally as a researcher.

1.B. I received valuable feedback from several colleagues in strategy and finance. However, three (senior) finance colleagues asked not to be included in the acknowledgments, so I decided not to thank anyone by name (with the exception of Jonathan Karpoff and Michael Wittry, the authors of one of the two replicated studies, who provided valuable comments). Throughout the process, colleagues warned me that if it turned out that I was the one mistaken, then “it would be difficult to recover professionally.” This was a pretty grim warning.

2. There are most likely others working on the topic at hand. Find them and work with them.

Shortly after the manuscript was posted online, I was contacted by close to a dozen colleagues who had also struggled with the finding of the original study. They shared that they either abandoned the project altogether or went through a torturous review process because the reviewers were also skeptical of their conflicting results (see a contemporaneous study by Cabral, Francis, and Kumar, 2020 in SEJ that makes a closely related point to my study). Collaborating with these colleagues could have drastically improved the process.

3. There are hidden costs of writing a replication study.

3.A. The replication study absorbed about 40% of the initial study I wanted to publish, and I am not sure now if enough is left to resume that study (I don’t think I’ll go back to working on it). This is the largest hidden cost of doing a replication study, and one that I had not been aware of. I have a nagging suspicion that going ahead with the initial study could have been more rewarding, both professionally and in terms of my personal intellectual satisfaction.

3.B. There are few “top” journals interested in replication studies. This raises the cost of spending time on a replication study (versus just going ahead with the study you wanted to write and taking a chance that the reviewers will look past the discrepancy with the study you replicate). The journal that published the original study is an obvious place to start, but you may face obstacles. For example, there is a high chance that you will get the original AE and/or original author as a reviewer (see process takeaway 4), and there are cases for and against these decisions. I fall on the side of (reluctantly) supporting these assignments because they help to clarify any misunderstanding. The cons are quite obvious.

Needless to say, my experience is surely biased (again, N=1). The emails I received from researchers who informed me that they had to modify or abandon their projects entirely provided a sobering reminder of how lucky I was (I would like to thank the AE again, who shall remain anonymous). Since my manuscript was posted, I have come across many colleagues who looked back in their file cabinets and considered writing a replication study. I believe that there is a growing recognition that science progresses not only through novel discoveries but also through replications. More people should do it, but the ~~gory~~ details of writing one are not often shared. This was meant to be a short and friendly post emphasizing that both the pros and cons will likely surpass what you have in mind.
