Fisher's exact test
QUMA (QUantification tool for Methylation Analysis)
The statistical significance of the difference between two bisulfite sequence groups at each CpG site is evaluated with Fisher's exact test, a non-parametric significance test that determines whether there is a nonrandom association between two categorical variables. Fisher's exact test can be used in the same way as the chi-square test for independence, and it is more accurate when the number of methylated or unmethylated CpGs is small, which is common in CpG methylation analysis. A two-tailed p-value is calculated from the 2 x 2 table (example below) at each CpG site. This p-value is used to assess whether CpG methylation status at that site is independent of the group.
Example 2 x 2 table for CpG methylation status

          methylated CpG   unmethylated CpG
group1    a                b
group2    c                d

a: number of methylated CpGs of group1 at the CpG site
b: number of unmethylated CpGs of group1 at the CpG site
c: number of methylated CpGs of group2 at the CpG site
d: number of unmethylated CpGs of group2 at the CpG site
For the sample data shown in Table 1, the counts can be rearranged as in Table 2.

Table 1

CpG position      375
Me-CpG  group1    12/13 (92.3%)
        group2    4/10 (40.0%)
        total     16/23 (69.6%)

Table 2

          methylated CpG   unmethylated CpG   total
group1    12               1                  13
group2    4                6                  10
total     16               7                  23

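The conversion from methylated/total counts to a 2 x 2 table can be sketched in a few lines of Python. This is only an illustration: the function name `to_contingency` and its parameters are hypothetical and are not part of QUMA.

```python
def to_contingency(me1, n1, me2, n2):
    """Build ((a, b), (c, d)) from methylated/total counts of two groups.

    me1, me2: methylated CpG counts; n1, n2: total sequences per group.
    (Hypothetical helper for illustration only.)
    """
    if not (0 <= me1 <= n1 and 0 <= me2 <= n2):
        raise ValueError("methylated count must lie between 0 and the total")
    # Unmethylated count is simply total minus methylated.
    return (me1, n1 - me1), (me2, n2 - me2)


# Sample data: group1 12/13, group2 4/10
(a, b), (c, d) = to_contingency(12, 13, 4, 10)
print(a, b, c, d)  # 12 1 4 6
```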
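The probability p of this table is given by the following formula:

$$
p = \frac{{}_{a+b}C_{a} \cdot {}_{c+d}C_{c}}{{}_{a+b+c+d}C_{a+c}}
  = \frac{{}_{13}C_{12} \cdot {}_{10}C_{4}}{{}_{23}C_{16}}
  = \frac{13!\,10!\,16!\,7!}{12!\,1!\,4!\,6!\,23!}
  = 0.0111357212
$$

where ${}_{n}C_{k} = n! / (k!\,(n-k)!)$ is the number of combinations and the symbol ! indicates the factorial operator.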
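As a quick check, the probability of a single table can be computed with Python's standard library. This is a minimal sketch; the function name `table_probability` is an illustrative choice, not something defined by QUMA.

```python
from math import comb


def table_probability(a, b, c, d):
    """Probability of one 2 x 2 table with fixed marginal totals.

    Computes C(a+b, a) * C(c+d, c) / C(a+b+c+d, a+c).
    """
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)


p = table_probability(12, 1, 4, 6)
print(f"{p:.10f}")  # 0.0111357212
```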
When the marginal totals are fixed, there are 8 possible cases, listed below.

a     b     c     d     |ad - bc|   probability
6     7     10    0     70          0.0069995962
7     6     9     1     47          0.0699959618
8     5     8     2     24          0.2362363710
9     4     7     3     1           0.3499798089
10    3     6     4     22          0.2449858662
11    2     5     5     45          0.0801771926
12    1     4     6     68          0.0111357212
13    0     3     7     91          0.0004894823

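The enumeration above can be reproduced with a short loop over every value of a that is compatible with the fixed marginal totals. The sketch below assumes Python's standard library only; `enumerate_tables` and its parameter names are hypothetical.

```python
from math import comb


def enumerate_tables(row1, row2, col1):
    """List (a, b, c, d, |ad - bc|, probability) for fixed marginal totals.

    row1, row2: group totals; col1: total number of methylated CpGs.
    """
    denom = comb(row1 + row2, col1)
    cases = []
    # a cannot be negative, cannot exceed row1 or col1,
    # and c = col1 - a must fit into row2.
    for a in range(max(0, col1 - row2), min(row1, col1) + 1):
        b = row1 - a
        c = col1 - a
        d = row2 - c
        prob = comb(row1, a) * comb(row2, c) / denom
        cases.append((a, b, c, d, abs(a * d - b * c), prob))
    return cases


# Marginal totals of the sample data: 13 and 10 sequences, 16 methylated
for a, b, c, d, diff, prob in enumerate_tables(13, 10, 16):
    print(a, b, c, d, diff, f"{prob:.10f}")
```

The probabilities of all enumerated tables sum to 1, which is a useful sanity check on the table.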
To determine the two-tailed p-value, sum the probabilities of all cases in which the absolute value of "ad - bc" is not less than the absolute value of "ad - bc" for the sample (68 for these data).
For these data, the cases a = 6, 12 and 13 qualify. The two-tailed p-value is therefore
= 0.0069995962 + 0.0111357212 + 0.0004894823
= 0.0186247997
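The whole procedure can be put together as a minimal sketch that uses the |ad - bc| criterion described above. The function name `fisher_two_tailed` is hypothetical and this is not QUMA's own code. Library implementations of Fisher's exact test may select the extreme tables by a different rule (for example, by comparing table probabilities), so results can differ for other data.

```python
from math import comb


def fisher_two_tailed(a, b, c, d):
    """Two-tailed p-value: sum the probabilities of all tables whose
    |ad - bc| is at least as large as that of the observed table."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)
    observed = abs(a * d - b * c)
    p = 0.0
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        y = col1 - x  # methylated count in group2 for this table
        diff = abs(x * (row2 - y) - (row1 - x) * y)
        if diff >= observed:
            p += comb(row1, x) * comb(row2, y) / denom
    return p


p_value = fisher_two_tailed(12, 1, 4, 6)
print(f"{p_value:.4f}")  # 0.0186
```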