On optimal allocation of treatment/condition variance in principal component analysis

The allocation of a (treatment) condition-effect on the wrong principal component (misallocation of variance) in principal component analysis (PCA) has been addressed in research on event-related potentials of the electroencephalogram. However, the correct allocation of condition-effects on PCA components might be relevant in several domains of research. The present paper investigates whether different loading patterns at each condition-level are a basis for an optimal allocation of between-condition variance on principal components. It turns out that a similar loading shape at each condition-level is a necessary condition for an optimal allocation of between-condition variance, whereas a similar loading magnitude is not necessary.


Condition effects in Principal Components Analysis
Principal components analysis (PCA) has regularly been performed for the analysis of event-related potentials of the electroencephalogram (Dien, Khoe & Mangun, 2007; Dien, 2010; Kayser & Tenke, 2003, 2005). In the context of event-related potentials, PCA is often performed for observed variables representing k levels of at least one (experimental) condition factor, so that the components represent a mixture of the between- and within-condition variance. However, (experimental) condition factors occur in many areas of research in which PCA is also performed. It is therefore of interest to know how experimental condition effects are optimally allocated on principal components.

Misallocation of between-condition variance
Since Wood and McCarthy (1984), it has been regarded as optimal when a single PCA component combines the complete between-condition variance of a single condition factor with some within-condition variance. However, the allocation of the variance of a single condition factor on a single principal component combining within- and between-condition variance does not necessarily occur, and the allocation of between-condition variance on more than one component has been termed 'misallocation of variance' (Wood & McCarthy, 1984). Misallocation of variance has been investigated in simulation studies on methods of PCA component rotation (e.g., Dien, 2010; Beauducel & Debener, 2003; Wood & McCarthy, 1984), and new methods of component rotation have been proposed that may reduce misallocation of variance (Beauducel, 2018; Beauducel & Leue, 2015).
It has also been proposed to perform a separate PCA for each group representing a level of the condition factor because the loading shapes in each condition can be different (Barry, De Blasio, Fogarty, & Karamacoska, 2016). Although it might be reasonable to identify condition-specific loading patterns by means of separate PCAs at each level of a condition factor, the effect of this form of analysis on misallocation of variance remains unknown.

Aims of the present paper
The present paper therefore investigates the effects of separate PCAs at each level of a condition factor on the allocation of between-condition variance on PCA components. First, some definitions for separate PCAs at each level of a single condition factor and for a PCA of the between-condition variance of the condition factor are presented. Second, it is shown that misallocation of condition variance, as it has been demonstrated and discussed since Wood and McCarthy (1984), follows necessarily from rotation of components that perfectly represent a single condition effect. Third, it is shown that different condition-specific loading shapes do not allow for an unambiguous allocation of between-condition variance on a single component representing within- and between-condition variance. Finally, it is shown that different condition-specific loading patterns are compatible with an unambiguous allocation of between-condition variance on a single component when the between-condition differences of the loadings on each component can be accounted for by a scalar.

Definitions: PCA for within- and between-condition variance
Consider that p random variables have been observed in k levels of a condition factor, so that However, when a within-condition PCA is performed separately for the correlations or covariances at each level of the condition factor, It is possible to write the complete data comprising condition variance and within-group variance as (5) where i 1 has the dimensions of u . This yields (7) and (8) for the wanted components.
Typically, the wanted components are rotated in order to improve the interpretation (Dien, 2010; Kayser & Tenke, 2003). If there is an additional condition factor, there can be additional groupings of PCAs for each level of the condition factor and an additional PCA across the levels of the condition factor. If the sample size is sufficiently large, it is also possible to perform a PCA for each of the combinations of condition levels and across all combinations of conditions of the two condition factors.

Misallocation of variance and component rotation
When there are only a few condition factors, the number of wanted within-condition components is probably larger than the number of wanted between-condition components. For example, when there is only one condition factor with two levels, PCA of the between-condition variance without subsequent component rotation will result in only one between-condition component. When $q_v > q_b = 1$, it is possible to write Equation 8 as (9), where j denotes the number of the respective within-condition component. This describes what is typically regarded as an optimal allocation of variance, namely, that a condition effect occurs on a single component that combines within- and between-condition variance. The simulation studies on this issue were based on a single condition effect that was introduced exclusively on a single component when the data were generated (Wood & McCarthy, 1984; Dien, 2010; Beauducel & Debener, 2003; Beauducel & Leue, 2015) and that occurred on more than one component after PCA followed by component rotation.
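The statement that one condition factor with two levels yields only one between-condition component can be checked numerically. The following is a minimal sketch under stated assumptions, not the paper's procedure: the data, the level means, the sample sizes, and all variable names are hypothetical, and the between-condition PCA is taken here to be an eigendecomposition of the covariance of the level means.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, k = 6, 500, 2  # variables, cases per level, condition levels (hypothetical)

# Hypothetical level means: the only source of between-condition variance.
level_means = np.array([[0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
                        [1.0, 0.8, 0.6, 0.4, 0.2, 0.0]])
data = [rng.normal(size=(n, p)) + level_means[i] for i in range(k)]

# Within-condition PCA: eigenvalues of each level's covariance matrix.
within_eigvals = [np.linalg.eigvalsh(np.cov(x, rowvar=False))[::-1] for x in data]

# Between-condition PCA: eigenvalues of the covariance of the level means.
means = np.vstack([x.mean(axis=0) for x in data])
centered = means - means.mean(axis=0)
between_cov = centered.T @ centered / k
between_eigvals = np.linalg.eigvalsh(between_cov)[::-1]

# Count components with non-negligible variance.
n_within = int(np.sum(within_eigvals[0] > 1e-10))
n_between = int(np.sum(between_eigvals > 1e-10))
print(n_within, n_between)
```

With k levels, the centered matrix of level means has at most k - 1 independent rows, so the between-condition covariance in this sketch has at most k - 1 non-zero eigenvalues, whereas each within-condition covariance can have up to p.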
Component rotation means that the matrix M is rotated by means of postmultiplication by a $q_v \times q_v$ transformation matrix $\mathbf{T}$ (Harman, 1976) and that the component scores are counter-rotated by means of premultiplication with $\mathbf{T}^{-1}$, so that (11). For a single condition i, the rotation of the infinite matrices containing the population of individual component scores l can be written as (12). Theorem 1 describes that a non-zero expectation that is initially only on the first component leads to a non-zero expectation on components other than the first after component rotation.

Theorem 1.
Proof. A single element for condition i of the matrix resulting from Equation 12 is given by (13). Equation 13 can be written as (14). Equation 14 implies that the expectation for the population of scores is, in general, non-zero even for j > 1. This completes the proof.
Theorem 1 implies that a condition effect that occurs only on the first component before rotation also occurs on other components after rotation. Thus, Theorem 1 shows that misallocation of variance, as it has typically been investigated in simulation studies since Wood and McCarthy (1984), is a necessary consequence of any rotation of an initial set of components that unambiguously combine within- and between-condition effects. Therefore, the attempts to reduce misallocation of variance are attempts to recover the initial combination of within- and between-condition components (Dien, 2010; Beauducel & Leue, 2015; Beauducel, 2018), so that the matrix $\mathbf{T}^{*}$, transforming the original components to the given components, becomes $\mathbf{I}$. This implies $\mathbf{T}^{*} = \mathbf{I}$ and $t^{*}_{jh} = 0$ for $j \neq h$, so that Theorem 1 does not hold. Eliminating variance misallocation by means of component rotation presupposes that there exists a PCA solution for the data at hand where each between-condition effect can be allocated on a separate single component. This is, however, not necessarily the case for every data set.
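The consequence of Theorem 1 can be illustrated with simulated component scores. This is a minimal sketch, not the paper's proof: the sample size, the size of the condition effect, and the 45-degree orthogonal rotation matrix `T` are hypothetical choices, and the scores are stored with cases in rows, so that premultiplication of a score vector by the inverse of `T` becomes postmultiplication by its transpose.

```python
import numpy as np

rng = np.random.default_rng(1)
n, q = 100_000, 3  # cases and components (hypothetical)

# Unrotated scores for one condition: non-zero expectation on component 1 only.
scores = rng.normal(size=(n, q))
scores[:, 0] += 2.0

# Hypothetical orthogonal rotation in the plane of components 1 and 2.
theta = np.pi / 4
T = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

# Counter-rotation of the scores: c* = T^{-1} c for each case (cases in rows).
rotated = scores @ np.linalg.inv(T).T

mean_before = scores.mean(axis=0)
mean_after = rotated.mean(axis=0)
print(np.round(mean_before, 2), np.round(mean_after, 2))
```

Before rotation only the first mean is clearly different from zero. After rotation the second component also carries a non-zero mean, while the third, which the assumed rotation leaves untouched, does not.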

Misallocation of variance in combined within-and between-condition components
Theorem 1 describes misallocation of variance as it can occur when PCA is performed for the total sample, i.e., across the levels of a between-condition factor. When separate within-condition components ,..., i.e., that each component in c can be decomposed into a separate within- and between-condition component. This implies that no misallocation of variance occurs because each between-condition component is uniquely combined with another within-condition component. This completes the proof. Thus, when the within-condition loading matrices at each condition level are identical to the between-condition loading matrix, this implies a component model where all components combine their respective within- and between-condition variance. Theorem 2 implies that no misallocation of variance occurs when each condition-specific loading pattern is identical to the between-condition loading pattern. When Theorem 2 holds, it would be possible to find a solution without variance misallocation by means of component rotation. Thus, it is possible that only a subset of the within-condition loading matrices and between-condition loading matrices is identical and that this subset of components combines within- and between-condition variance. When there is only one between-condition component, i.e., $q_v = p > q_b = 1$, Equation 17 can be written as

Writing loading vectors
Theorem 3 describes constraints for the loadings that are compatible with a model combining a single between-condition component with the first within-condition component.

This completes the proof. The identity of the loading patterns of the first unrotated within- and between-condition components is a necessary constraint for the allocation of the between- and within-condition variance on a common component. Theorem 4 describes a somewhat relaxed constraint that is based on an identical shape of the loadings of the first within- and between-condition components but allows for a different scale.
This completes the proof. Theorem 4 shows that condition-specific loading patterns that have the same shape but a different scale are compatible with a model where a single between-condition component is unambiguously allocated on a single within-condition component.
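The role of loading shape versus loading scale can be made concrete with a simple rank argument. The sketch below is an illustration under strong simplifying assumptions and not the paper's derivation: it uses population matrices for two equally sized condition levels, one within-condition component per level, no unique or error variance, and hypothetical loading vectors. The function name `n_components` is introduced here for illustration only.

```python
import numpy as np

def n_components(lam1, lam2, between, tol=1e-10):
    """Count non-zero eigenvalues of the total covariance for two equally
    sized condition levels with one within-condition component each."""
    within = 0.5 * (np.outer(lam1, lam1) + np.outer(lam2, lam2))
    total = within + np.outer(between, between)
    return int(np.sum(np.linalg.eigvalsh(total) > tol))

lam = np.array([0.2, 0.5, 0.8, 0.5, 0.2])    # hypothetical loading shape
other = np.array([0.8, 0.5, 0.2, 0.1, 0.0])  # hypothetical different shape

# Same shape, different scale at the second level; between loadings proportional.
same_shape = n_components(lam, 1.5 * lam, 0.7 * lam)

# Different shape at the second level.
diff_shape = n_components(lam, other, 0.7 * lam)
print(same_shape, diff_shape)
```

When all loading vectors are proportional, the total covariance in this simplified setting has a single non-zero eigenvalue, so one component carries both the within- and the between-condition variance. When the shape differs at one level, a second component is needed.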

Discussion
According to Wood and McCarthy (1984), misallocation of variance occurs when a single between-condition effect that can in principle be allocated on a single PCA component is allocated on more than one component in a given PCA solution. The present study describes constraints that are to be imposed on the component loading matrices in order to avoid misallocation of variance. The following conclusions can be drawn. When a single between-condition effect is allocated on a single component of an initial PCA solution, any rotation of these initial components will result in a misallocation of variance (Theorem 1). This is an algebraic demonstration of what has been discussed elsewhere (Dien, 2010; Beauducel & Leue, 2015; Beauducel, 2018), namely that, at the level of combined within- and between-condition components, the misallocation of variance is directly related to component rotation. However, component rotation can only result in an optimal allocation of between-condition variance when such a rotational solution exists for a given data set.
Since it has been proposed to perform separate PCAs at each level of a condition factor (Barry et al., 2016), the consequences of this procedure for misallocation of variance were explored. When a PCA is calculated at each level of a condition factor and when a PCA is calculated for a single between-condition factor, an unambiguous allocation of the between-condition variance on a single component combining within- and between-condition variance is possible when the within-condition component loadings have the same shape, even when their scale is different (Theorems 3 and 4). Thus, only when the constraints given in Theorems 3 and 4 hold for a given data set is it possible to find the solution with optimal allocation of between-condition variance by means of component rotation.
Theorems 3 and 4 also imply that separate PCAs at each level of a condition factor are not necessarily a way to avoid or eliminate misallocation of variance. When different loading shapes occur at each level of a condition factor in separate PCAs, this indicates that misallocation of variance would occur when the separate components are combined into within- and between-condition components. In contrast, when the loading shape is similar in the different PCAs, with larger or smaller loadings at each level of the condition factor, the components can be combined into within- and between-condition components without misallocation of variance.
Finally, it follows from Theorem 4 that perfect congruence coefficients (Tucker, 1951; Wrigley & Neuhaus, 1955) of the loadings of respective components at different levels of the condition factor are not a necessary condition for optimal variance allocation because congruence coefficients also refer to the similarity of the loading magnitude. For optimal variance allocation, a perfect Pearson correlation of the loadings of the respective components at different levels of the condition factor would be sufficient.
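For readers who wish to compare the two similarity indices on their own loadings, the following minimal sketch computes the congruence coefficient (the normalized sum of cross-products of two loading vectors) and the Pearson correlation. The loading vectors, the rescaling factor, and the additive offset are hypothetical, and the helper names `congruence` and `pearson` are introduced here for illustration.

```python
import numpy as np

def congruence(x, y):
    """Congruence coefficient: cross-product sum scaled by the vector norms."""
    return float(x @ y / np.sqrt((x @ x) * (y @ y)))

def pearson(x, y):
    """Pearson correlation of two loading vectors."""
    return float(np.corrcoef(x, y)[0, 1])

lam = np.array([0.2, 0.5, 0.8, 0.5, 0.2])  # hypothetical loadings at level 1
scaled = 1.5 * lam                          # proportionally rescaled loadings
shifted = lam + 0.3                         # loadings with an additive offset

for name, other in [("scaled", scaled), ("shifted", shifted)]:
    print(name, round(congruence(lam, other), 4), round(pearson(lam, other), 4))
```

In this sketch both indices are at their maximum for the proportionally rescaled loadings, whereas the additive offset lowers the congruence coefficient and leaves the Pearson correlation unchanged, because the correlation is computed on mean-centered loadings.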