0

I have been trying to understand the concept behind factorial design and its importance in connection to linear regression.

I would be glad if somebody could give a clear, simplified explanation of this.

Jack2018

1 Answer

0

Factorial design is a long-established method, and explanations of it are easy to find on the internet. Basically, it requires one continuous response variable and several categorical factors. If there are continuous variables related to the response variable, a plain factorial design is no longer suitable.

Suppose there are $k$ factors and factor $i$ has $l_i$ levels. Then there are $\prod_{i=1}^k l_i$ combinations (cells) of factor levels. The experiment is performed by assigning a certain number of subjects to each cell and recording the value of the response variable for each subject.
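
To make the cell count concrete, here is a minimal sketch in Python. The factor names and levels (`fertilizer`, `variety`, `irrigation`) are made up for illustration and are not from any real experiment.

```python
from itertools import product
from math import prod

# Hypothetical factors with l_1 = 2, l_2 = 3 and l_3 = 2 levels.
levels = {
    "fertilizer": ["none", "low"],
    "variety": ["A", "B", "C"],
    "irrigation": ["no", "yes"],
}

# Each cell is one combination of levels, one level per factor.
cells = list(product(*levels.values()))

# The number of cells is the product of the level counts l_i.
n_cells = prod(len(v) for v in levels.values())

print(n_cells)      # number of cells
print(len(cells))   # same number, obtained by enumerating the cells
print(cells[0])     # first cell
```

In a full factorial experiment every one of these cells receives subjects.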

Once dummy variables are generated, factorial design data can be analyzed with linear regression. There are many coding schemes for dummy variables; one of them is reference coding. Suppose the $i$-th factor has $l_i$ levels. Then we need to generate $l_i-1$ dummy variables and specify one level as the reference. With level 1 as the reference, the coding is as follows:

               X1    X2    ...   X(l_i-1)
  level 1       0     0    ...      0
  level 2       1     0    ...      0
  level 3       0     1    ...      0
    ...
  level l_i     0     0    ...      1
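
The reference coding table can be reproduced with a few lines of Python. This is only a sketch: the helper `reference_code` is a hypothetical name, and a factor with four levels is assumed for the example.

```python
def reference_code(levels, reference):
    """Map each level to its row of 0/1 dummy values (reference coding)."""
    # One dummy variable per non-reference level, so l_i - 1 in total.
    others = [lv for lv in levels if lv != reference]
    # The reference level matches none of them and gets an all-zero row.
    return {lv: [1 if lv == o else 0 for o in others] for lv in levels}

# Assumed example: a factor with l_i = 4 levels, level 1 as reference.
coding = reference_code(
    ["level 1", "level 2", "level 3", "level 4"], reference="level 1"
)

for level, row in coding.items():
    print(level, row)
```

Each row has $l_i - 1$ entries, and the coefficient of each dummy variable is then read as the difference between that level and the reference level.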

With linear regression you can also test for interactions between factors. Another advantage of factorial design is the orthogonality between factors when the number of subjects in the cells is balanced (the simplest case: every cell has the same number of subjects). Orthogonality means that the different types of sums of squares (SS) coincide, i.e., the order in which the factors enter the model has no effect on the SS. This makes the effects of the factors very easy to interpret.
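
The orthogonality claim can be checked numerically. The sketch below assumes a hypothetical 2 x 3 design with factors A and B. It only compares the dummy columns of the two main effects after subtracting their means (which is what the intercept does), and it does not compute the sums of squares themselves.

```python
from itertools import product

def centered_dummy(values, level):
    """0/1 indicator of `level` with its mean subtracted."""
    raw = [1.0 if v == level else 0.0 for v in values]
    mean = sum(raw) / len(raw)
    return [r - mean for r in raw]

def cross_products(design):
    """Dot products between centered dummies of factor A and factor B."""
    a_vals = [a for a, _ in design]
    b_vals = [b for _, b in design]
    out = {}
    for a_level in sorted(set(a_vals))[1:]:       # skip the reference level
        for b_level in sorted(set(b_vals))[1:]:   # skip the reference level
            xa = centered_dummy(a_vals, a_level)
            xb = centered_dummy(b_vals, b_level)
            out[(a_level, b_level)] = sum(p * q for p, q in zip(xa, xb))
    return out

# Balanced: 2 subjects in each of the 2 x 3 cells.
balanced = [
    cell
    for cell in product(["a1", "a2"], ["b1", "b2", "b3"])
    for _ in range(2)
]
# Unbalanced: three extra subjects in cell (a2, b2).
unbalanced = balanced + [("a2", "b2")] * 3

balanced_cp = cross_products(balanced)
unbalanced_cp = cross_products(unbalanced)
print(balanced_cp)    # all (numerically) zero
print(unbalanced_cp)  # nonzero entries
```

In the balanced design the dot products are zero up to rounding, so the columns of A carry no information about B and the SS attributed to each main effect does not depend on which factor is entered first. In the unbalanced design they are nonzero, and the order starts to matter.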

user158565