Thursday 25 July 2013

Further thoughts on pre-registration

As far as I can tell, pre-registration of scientific studies has been proposed as a solution to three important problems with the current model of scientific publication: replication, negative results and p-hacking. I personally think it is great that this format will allow more replication studies to be undertaken and, more importantly, that it will enable the publication of replication studies even if the results are negative. This will be of clear value to the field.

All my reservations about pre-registration concern p-hacking, and more specifically how pre-registration has been 'sold' to the community on this point. Pre-registration is a solution to the problem of p-hacking. However, those promoting pre-registration have often argued that pre-registered articles should somehow be considered "more truthful" than those that have been published by the traditional route. I strongly disagree with this kind of statement. It is true that pre-registered studies will not be p-hacked; however, not all studies that are not pre-registered are p-hacked. The danger of promoting pre-registration as more 'truthful' is that the community will stop believing results from someone who has decided not to pre-register for whatever reason - maybe they wanted the freedom to publish their results wherever they wanted, maybe they did not want to deal with reviewers who might disagree with the experimental design, or maybe they just wanted to start the study rather than wait months for approval.

P-hacking is clearly a problem, particularly as it can occur subconsciously. However, at the current time I cannot see how pre-registration will work in practice to prevent p-hacking, and I worry that it has been promoted in a way that could denigrate equally good science that has been published via a different route.

Friday 5 July 2013

Some Thoughts on Pre-registration

In recent months there has been a lot written about a new route for publishing original research articles in the fields of psychology and neuroscience: pre-registration (http://cdn.elsevier.com/promis_misc/PROMIS%20pub_idt_CORTEX%20Guidelines_RR_29_04_2013.pdf and http://www.guardian.co.uk/science/blog/2013/jun/05/trust-in-science-study-pre-registration). As I understand it, pre-registration requires the author to submit full details of the methods and analysis of a proposed study for peer review prior to the collection of any data. Once the submission is accepted, the data can be collected precisely as detailed in it. Any deviation from these methods would then have to be highlighted in the final manuscript. One potential benefit of this format is that it would enable the publication of negative results, something that is very difficult to do currently. This new format is, in my opinion, also excellently suited to direct replication studies, as the methods and analyses should be identical to those of the original study. Indeed, this is something that I personally plan to do in the near future.
However, this is not the main reason that pre-registration has received so much interest. The main reason is that it promises to restore trust in the scientific process. The reason that some trust may have been lost is that there is evidence to suggest that researchers are, either deliberately or unwittingly, analysing their data in many different ways until their effect of interest reaches the magical p<0.05 threshold. One consequence of this is that the literature is populated with results that are at best unreliable and at worst false positives. It has been argued that pre-registration would prevent this type of behaviour, as the analysis pipeline would have to be declared before any data is analysed. Any different, post-hoc analyses would then be clearly labelled and would have to be treated as such.
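To put a rough number on why trying many analyses matters, here is a minimal sketch in Python. It is my own illustration and is not taken from any of the pre-registration proposals. It assumes there is no true effect, so that each p-value is uniform between 0 and 1, and it assumes the different analyses are statistically independent. Real analyses of the same data set are correlated, so the inflation would be smaller in practice than shown here. The function names are hypothetical.

```python
import random


def expected_rate(n_analyses, alpha=0.05):
    """Chance that at least one of n independent null tests gives p < alpha."""
    return 1 - (1 - alpha) ** n_analyses


def false_positive_rate(n_analyses, n_studies=20000, alpha=0.05, seed=0):
    """Simulate null studies and report the fraction in which the best of
    n_analyses p-values falls below alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        # Assumption: under the null hypothesis a p-value is uniform on [0, 1],
        # and the analyses are independent of one another.
        p_values = [rng.random() for _ in range(n_analyses)]
        if min(p_values) < alpha:
            hits += 1
    return hits / n_studies


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        print(k, round(expected_rate(k), 3), round(false_positive_rate(k), 3))
```

Under these assumptions a single planned analysis keeps the false positive rate at 0.05, whereas reporting the best of five analyses raises it to about 0.23 and the best of ten to about 0.40. The single-analysis case is the same whether the analysis was chosen before or after seeing the data, provided only one analysis was ever run.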
There are a number of different aspects to this use of pre-registration that bother me. The first is that I worry that this will lead to an incorrect perception that a priori is more correct and post-hoc is less correct. This is not the case. If both analyses are performed correctly then at a p=0.05 threshold both have the same probability of producing false positives. I worry that results that have not been pre-registered will be viewed as dodgy. It is true that they could be dodgy as a result of endless different analysis attempts; however, I think that we should trust what researchers write in their manuscripts. My biggest concern is that one of the motivations for pre-registration is that we should no longer trust what our peers say they have done. There is something very negative about this way of thinking and I would prefer to think, perhaps naively, that all scientists are being as honest as they can be. One solution to this problem would be for all researchers to make their data publicly available. In this way others can look for themselves at the effects that are reported. This is something that I came across when publishing a paper in the Proceedings of the Royal Society B, which requires data to be made publicly available on datadryad.org.
A second reason that I have misgivings about pre-registration is that it has the potential to stifle exciting, unusual research and the reporting of unexpected results. One of the things I really enjoy about my work is the possibility that with each experiment I might discover something unexpected that changes how I think about my field of research. The fact that this has not happened to me to date does not diminish this excitement. Indeed, the field of action observation would not really exist without the unexpected discovery of mirror neurons (although others might argue that this is a good argument for pre-registration!). I am not sure how excited I would be about experimentation if it were the end result of months of writing and revising pre-registration papers that would then treat the post-hoc analysis of my exciting unexpected result as less truthful.
My third concern is how pre-registration will work in practice. For example, if I were to pre-register an fMRI study, I would presumably have to give details of all the scanning parameters, all subject details (numbers, how many female, ages etc.), all pre-processing parameters, the planned design matrix, the planned contrasts and any a priori regions of interest and how I would specify them, either from the data or from another source. A reviewer would then have to review this to determine whether it was all correct, and I would not be able to deviate from it. So if one of my subjects moved a lot in the scanner and I had not covered this in the pre-registration document, then I assume the study would be classified as post-hoc, as I had not included all exclusion criteria in the pre-registration. My worry is that there are so many degrees of freedom in any study that covering all possible outcomes in the pre-registration document would be unnecessarily burdensome. However, not including them would be against the principle of pre-registration. I am sure many of these things will be ironed out when people start to pre-register their studies, but I am concerned that the level of detail required to pre-register a study, and the level of detailed expertise required to review these documents, will mean that pre-registration will not be able to work in practice as intended.