To test this I simulated data for the four cells of a 2×2 factorial design (A1, A2, B1 and B2), such that the data for every trial, subject and time point were drawn from the same normally distributed population. I then selected a time point either (i) where the main effect of A vs B was maximal or (ii) at random, and tested whether the interaction between A and B was significant at that time point. I repeated this 500 times and calculated the percentage of false positives. At a significance level of 0.05 we should expect false positives 5% of the time, and this is precisely what was produced: irrespective of how the time point was selected, the percentage of false positives was 5% (see left hand panel of the figure below). So is choosing a region/electrodes/time window of interest from an orthogonal contrast unbiased? The answer is yes, but only if all the cells have the same variance. If I rerun the same simulation but reduce the variance of A1 by a factor of 10, keeping the means of all the cells equal to zero, I get a very different result. Now the statistical test for the interaction is biased, with over twice as many false positives as predicted (see right hand panel of the figure below).
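For anyone who wants to try this themselves, here is a minimal Python sketch of the kind of simulation described above. It is not the original code. The numbers of subjects, trials and time points, the use of a one-sample t-test on per-subject contrasts, the choice of the largest absolute t-value as the "maximal" main effect, and the name `false_positive_rate` are all assumptions made for illustration, so the exact rates need not match the figure.

```python
import numpy as np
from scipy import stats

# Cell order is A1, A2, B1, B2 (an assumed convention for this sketch).
MAIN_EFFECT = np.array([1.0, 1.0, -1.0, -1.0])   # (A1 + A2) - (B1 + B2)
INTERACTION = np.array([1.0, -1.0, -1.0, 1.0])   # (A1 - A2) - (B1 - B2)


def false_positive_rate(cell_variances, select="max_main", n_sims=500,
                        n_subjects=20, n_trials=30, n_times=100,
                        alpha=0.05, seed=0):
    """Fraction of simulations in which the null interaction is 'significant'.

    select="max_main": test at the time point with the largest |t| for the
    main effect; any other value: test at a randomly chosen time point.
    """
    rng = np.random.default_rng(seed)
    sd = np.sqrt(np.asarray(cell_variances, dtype=float))
    hits = 0
    for _ in range(n_sims):
        # subjects x cells x trials x time points, every cell has mean zero
        trials = rng.standard_normal((n_subjects, 4, n_trials, n_times))
        trials *= sd[None, :, None, None]
        cell_means = trials.mean(axis=2)          # subjects x cells x time

        # Per-subject contrast values at every time point
        main = np.einsum("c,sct->st", MAIN_EFFECT, cell_means)
        inter = np.einsum("c,sct->st", INTERACTION, cell_means)

        if select == "max_main":
            t_main = stats.ttest_1samp(main, 0.0, axis=0).statistic
            t_idx = int(np.argmax(np.abs(t_main)))
        else:
            t_idx = int(rng.integers(n_times))

        # Test the interaction only at the selected time point
        p = stats.ttest_1samp(inter[:, t_idx], 0.0).pvalue
        hits += int(p < alpha)
    return hits / n_sims
```

Calling `false_positive_rate([1, 1, 1, 1])` corresponds to the equal-variance case, and `false_positive_rate([0.1, 1, 1, 1])` to the case where the variance of A1 is reduced by a factor of 10. Passing `select="random"` gives the random time point comparison.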

This simulation assumed that only cell A1 had a different variance from cells A2, B1 and B2. If I rerun the simulation assuming that all cells have unequal variances, dividing the variance of A1 by 1, A2 by 2, B1 by 3 and B2 by 4, then the proportion of false positives rises above 50%.

That's actually quite reassuring, no? For once a simulation that confirms my intuitions!!

In the context of ERPs it makes sense to pick your time window in this fashion. If you're looking for how one factor mediates another (i.e. an interaction) then you want to pick based on where the effect of the factor being mediated actually exists.

But as your simulations indicate, this only makes sense (i.e. it's only unbiased) if the different cells are contributing equally to the choice of time window. If you have one cell with much greater variance then that's going to have a disproportionate influence on where the peak is. So then you're picking based on that cell rather than the orthogonal contrast. In fact, I guess you could argue that the two factors are now no longer orthogonal (statistically at least).

I'm trying to think of a real context where this would be a concern. I thought of the MMN but you'd need to have reason to suspect that the deviant response was much bigger in one contrast than in another.

Anyway, any chance of posting a version of this on figshare so it's citable?

Jon, I will have a go at writing this up and putting the code on figshare. I will try to do this later today.
