.-
help for ^gwar^, ^min2gwar^
.-
^gwar^ v1.0
Robust analysis and meta-analysis of genomewide-association studies
-----------------------------------------------------------------------------------
^gwar^ varlist, Method(string) [ SNP(varlist min=1 max=1) EFfect(string) ]
varlist contains the variables corresponding to the genotypes for cases and
controls. The first three variables are the genotypes of the controls (e.g.
aa0, ab0, bb0), and the remaining ones the genotypes of cases (e.g. aa1, ab1,
bb1). Allele b is assumed to be the risk variant.
^min2gwar^ varlist, [ SNP(varlist min=1 max=1) EFfect(string) ]
In this version varlist contains the p-values from the Cochran-Armitage trend
test and the Pearson chi-square test. When the user requests a meta-analysis,
varlist must additionally contain the total number of cases, the total number
of controls and a variable (-1, +1) for the direction of the association.
Description
-----------
^gwar^ performs standard analysis of genome-wide association studies (GWAS) using the
well-known Cochran-Armitage trend test (CATT) under any of the three available genetic
models of inheritance (Dominant, Recessive, Additive). Moreover, it can perform robust
analysis, which is more powerful when the genetic model cannot be determined beforehand
(Bagos, 2013). The available robust methods are MAX, MIN2 and MERT. MERT is a linear
combination of the two optimal tests for the extreme members of the family (in this case
the optimal CATTs for the recessive and dominant models) and it follows a standard
normal distribution (Freidlin et al., 2002). The MAX test is based on the simple idea of
testing all three possible models and choosing the one with the highest score
(Freidlin et al., 2002). MIN2 was applied by the investigators of the WTCCC (2007),
who considered the Pearson chi-square along with the CATT for the additive model and,
subsequently, chose the minimum of the two p-values.
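As a rough illustration of the CATT scores described above, the following do-file sketch
computes the three z-scores by hand with the usual trend-test formula and takes the largest
absolute value. It is not the code used inside gwar. The genotype scores (0, x, 1) for
(aa, ab, bb), with x = 0, 0.5 and 1 for the recessive, additive and dominant models, are an
assumed convention, and all helper variable names are hypothetical. The sketch yields only
the statistics; a p-value for the largest score still requires the numerical integration
that gwar performs.

```stata
* Sketch only: CATT z-scores computed by hand, not gwar's own code.
clear
input str20 snp aa1 ab1 bb1 aa0 ab0 bb0
"rs2820037" 40 587 1325 72 684 2180
"rs6997709" 118 716 1116 237 1201 1500
end
* Totals of cases (R), controls (S) and both (N); pooled genotype counts.
generate double R = aa1 + ab1 + bb1
generate double S = aa0 + ab0 + bb0
generate double N = R + S
generate double nab = ab1 + ab0
generate double nbb = bb1 + bb0
* x is the score given to genotype ab (aa scores 0, bb scores 1):
* 0 = recessive, 0.5 = additive, 1 = dominant (assumed convention).
local names rec add dom
local i = 0
foreach x in 0 0.5 1 {
    local ++i
    local m : word `i' of `names'
    * Numerator of the trend statistic and its null variance.
    generate double T = `x'*(S*ab1 - R*ab0) + (S*bb1 - R*bb0)
    generate double sx = `x'*nab + nbb
    generate double V = (R*S/N)*(N*(`x'*`x'*nab + nbb) - sx*sx)
    generate double z_`m' = T/sqrt(V)
    drop T sx V
}
* Largest absolute score across the three models.
generate double zmax = max(abs(z_rec), abs(z_add), abs(z_dom))
list snp z_rec z_add z_dom zmax
```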
Unlike MERT, whose asymptotic distribution is easy to compute, MAX and MIN2 are known for
the difficulty of obtaining accurate p-values. Traditionally, simulations are used, but
these are time-consuming, particularly in large datasets with thousands of SNPs. In ^gwar^
we have implemented two recently proposed methods that rely on numerical integration. We
use Gauss-Legendre quadrature, implemented in ^integrate^ using Mata, which greatly reduces
computing time. For MAX, we implement the method proposed by Zang et al. (2010), whereas
for MIN2 we implement the method proposed by Joo et al. (2009). Simulation results suggest
that MIN2 is slightly less powerful (1-4%) than MAX under the recessive model, outperforms
MAX under the additive model (by ~2%), and performs similarly to MAX under the dominant
model. MERT is less powerful than both MAX and MIN2. MIN2 is also faster, since it requires
fewer integral evaluations.
^gwar^ also performs meta-analysis, using fixed- and random-effects methods that rely on
summary data. The method uses weights equal to the sum of the reciprocal of the combined
cases and controls, as advised by Zhou et al. (2011) and illustrated by Bagos (2013). In
this way, standard inverse-variance methods for meta-analysis can be used, including
random-effects methods.
The methods are described in detail in Bagos (2013).
It is the user's responsibility to ensure that data are in the appropriate format when
^gwar^ is called.
^integrate^ must be installed first.
^metan^ is also needed if the user requests a meta-analysis.
Options for ^gwar^
-----------------
^Method^(string) This is a required argument. The user has to specify one of the
available methods: add (CATT additive), dom (CATT dominant), rec
(CATT recessive), mert (Maximum Efficiency Robust Test), max
(MAX test) and min2 (MIN2 test).
^SNP^(varlist) This is an optional argument. It contains the name of the variable
that holds the names of the SNPs under analysis. In a meta-analysis,
the name of the SNP is used to group the data for each SNP (i.e.
there is no need for a "study" variable). In single-study mode, the
option may be omitted. However, if a meta-analysis is requested
(^effect^, see below) then ^snp^ is mandatory.
^EFfect^(string) This option accepts one of two values, either "f" (for fixed) or "r"
(for random). It both requests a meta-analysis and specifies the
method to be used. If this option is used, the ^snp^ option must
also be specified.
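To compare the choices accepted by the method option, one possibility is to loop over them
in a do-file. This is only a usage sketch: it assumes that gwar and integrate are already
installed, and it reuses the variable names of the example dataset below.

```stata
* Usage sketch: run every documented method on the same genotype counts.
* Assumes gwar and integrate are installed.
clear
input str20 snp aa1 ab1 bb1 aa0 ab0 bb0
"rs2820037" 40 587 1325 72 684 2180
end
foreach m in add dom rec mert max min2 {
    display as text "method(`m')"
    * Controls first (aa0 ab0 bb0), then cases (aa1 ab1 bb1).
    gwar aa0 ab0 bb0 aa1 ab1 bb1, method(`m')
}
```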
Example
-------
Assuming that the data are in the following format (WTCCC):
clear
set more off
input id str20 snp aa1 ab1 bb1 aa0 ab0 bb0
1 "rs2820037" 40 587 1325 72 684 2180
1 "rs6997709" 118 716 1116 237 1201 1500
1 "rs7961152" 416 963 570 492 1448 992
2 "rs11110912" 67 647 1237 83 804 2049
2 "rs1937506" 113 742 1097 244 1205 1484
2 "rs2398162" 111 624 1205 194 1121 1608
end
. ^list^
+--------------------------------------------------------+
| id snp aa1 ab1 bb1 aa0 ab0 bb0 |
|--------------------------------------------------------|
1. | 1 rs2820037 40 587 1325 72 684 2180 |
2. | 1 rs6997709 118 716 1116 237 1201 1500 |
3. | 1 rs7961152 416 963 570 492 1448 992 |
4. | 2 rs11110912 67 647 1237 83 804 2049 |
5. | 2 rs1937506 113 742 1097 244 1205 1484 |
6. | 2 rs2398162 111 624 1205 194 1121 1608 |
+--------------------------------------------------------+
. ^gwar aa0 ab0 bb0 aa1 ab1 bb1, method(min2)^ performs single-study analysis using MIN2
. ^gwar aa0 ab0 bb0 aa1 ab1 bb1, method(max)^ performs single-study analysis using MAX
For demonstration purposes, we can assume that the id variable is the variable indicating
the SNPs, so the program will perform a meta-analysis.
. ^gwar aa0 ab0 bb0 aa1 ab1 bb1, method(min2) snp(id) eff(r)^ performs meta-analysis using MIN2
clear
input id str20 snp R S prob1 prob2 direction
1 rs2820037 1952 2936 .0000576 7.66e-07 -1
1 rs6997709 1950 2938 7.88e-06 .0000436 1
1 rs7961152 1949 2932 7.39e-06 .0000303 -1
2 rs11110912 1951 2936 9.18e-06 .0000194 -1
2 rs1937506 1952 2933 9.23e-06 .0000453 1
2 rs2398162 1940 2923 7.85e-06 5.67e-06 1
end
. ^list^
+----------------------------------------------------------------+
| id snp R S prob1 prob2 direct~n |
|----------------------------------------------------------------|
1. | 1 rs2820037 1952 2936 .0000576 7.66e-07 -1 |
2. | 1 rs6997709 1950 2938 7.88e-06 .0000436 1 |
3. | 1 rs7961152 1949 2932 7.39e-06 .0000303 -1 |
4. | 2 rs11110912 1951 2936 9.18e-06 .0000194 -1 |
5. | 2 rs1937506 1952 2933 9.23e-06 .0000453 1 |
6. | 2 rs2398162 1940 2923 7.85e-06 5.67e-06 1 |
+----------------------------------------------------------------+
. ^min2gwar prob1 prob2^ performs single-study analysis using MIN2
. ^min2gwar prob1 prob2 R S direction, snp(id) eff(r)^ performs meta-analysis using MIN2
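When only genotype counts are available, the inputs expected by min2gwar could be derived
along the following lines. This is a hedged sketch rather than part of the package: the
p-values come from the textbook formulas for the additive CATT and the 2x3 Pearson
chi-square, taking the sign of the trend statistic as the direction is an assumption, and
the helper variable names are hypothetical. For the two SNPs used here, that sign agrees
with the direction values in the listing above.

```stata
* Sketch only: build prob1, prob2, R, S and direction from genotype counts.
clear
input str20 snp aa1 ab1 bb1 aa0 ab0 bb0
"rs2820037" 40 587 1325 72 684 2180
"rs6997709" 118 716 1116 237 1201 1500
end
generate double R = aa1 + ab1 + bb1
generate double S = aa0 + ab0 + bb0
generate double N = R + S
* Additive CATT (scores 0, 0.5, 1) and its two-sided p-value.
generate double sx = 0.5*(ab1 + ab0) + (bb1 + bb0)
generate double sxx = 0.25*(ab1 + ab0) + (bb1 + bb0)
generate double T = 0.5*(S*ab1 - R*ab0) + (S*bb1 - R*bb0)
generate double V = (R*S/N)*(N*sxx - sx*sx)
generate double z = T/sqrt(V)
generate double prob1 = 2*normal(-abs(z))
* Assumed convention: direction is the sign of the trend statistic.
generate direction = sign(z)
* Pearson chi-square on the 2x3 table (2 degrees of freedom).
generate double chi2 = 0
foreach g in aa ab bb {
    * Expected counts of genotype g among cases (e1) and controls (e0).
    generate double e1 = R*(`g'1 + `g'0)/N
    generate double e0 = S*(`g'1 + `g'0)/N
    replace chi2 = chi2 + (`g'1 - e1)*(`g'1 - e1)/e1 + (`g'0 - e0)*(`g'0 - e0)/e0
    drop e1 e0
}
generate double prob2 = chi2tail(2, chi2)
list snp R S prob1 prob2 direction
* Single-study MIN2 on the derived p-values (requires min2gwar and integrate).
min2gwar prob1 prob2
```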
Author
-------
Pantelis G. Bagos, University of Thessaly, GR
pbagos@compgen.org
References
----------
1) Bagos, P.G. (2013)
Genetic model selection in genome-wide association studies: robust methods and the use of meta-analysis.
Statistical Applications in Genetics and Molecular Biology, 12(3), 285-308.
2) Freidlin, B., Zheng, G., Li, Z. and Gastwirth, J.L. (2002)
Trend tests for case-control studies of genetic markers: power, sample size and robustness.
Hum Hered, 53, 146-152.
3) WTCCC. (2007)
Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls.
Nature, 447, 661-678.
4) Joo, J., Kwak, M., Ahn, K. and Zheng, G. (2009)
A robust genome-wide scan statistic of the Wellcome Trust Case-Control Consortium.
Biometrics, 65, 1115-1122.
5) Zang, Y., Fung, W.K. and Zheng, G. (2010)
Simple algorithms to calculate asymptotic null distribution for robust tests in case-control genetic
association studies in R.
Journal of Statistical Software, 33.
6) Zhou, B., Shi, J. and Whittemore, A.S. (2011)
Optimal methods for meta-analysis of genome-wide association studies.
Genet Epidemiol, 35, 581-591.
Also see
--------
On-line: help for @metan@, @integrate@