IBMS BoneKEy | Commentary

Is the genomics glass half empty or almost completely empty?

John A Eisman



DOI:10.1138/20100425

Commentary on: Richards JB, Kavvoura FK, Rivadeneira F, Styrkársdóttir U, Estrada K, Halldórsson BV, Hsu YH, Zillikens MC, Wilson SG, Mullin BH, Amin N, Aulchenko YS, Cupples LA, Deloukas P, Demissie S, Hofman A, Kong A, Karasik D, van Meurs JB, Oostra BA, Pols HA, Sigurdsson G, Thorsteinsdottir U, Soranzo N, Williams FM, Zhou Y, Ralston SH, Thorleifsson G, van Duijn CM, Kiel DP, Stefansson K, Uitterlinden AG, Ioannidis JP, Spector TD; Genetic Factors for Osteoporosis Consortium. Collaborative meta-analysis: associations of 150 candidate genes with osteoporosis and osteoporotic fracture. Ann Intern Med. 2009 Oct 20;151(8):528-37.

A recent meta-analysis by Richards et al. of candidate genes in association with osteoporosis and osteoporotic fracture () could be seen as rather disappointing. In this major collaboration, only 9 of the 150 candidate genes identified in a number of independent studies were associated with bone density at the lumbar spine and only 3 at both the lumbar spine and femoral neck sites. Only 4 loci had even a weak association with risk of spine or non-spine fractures and only one locus was associated with both spine density and spine fractures (odds ratio 1.33). These findings parallel a recent and similarly disappointing meta-analysis of genome-wide association studies (GWAS) (). A total of 21 gene loci in that study were associated with bone density at the lumbar spine or femoral neck. Only 9 were associated with bone density at both sites, including 4 of 11 newly identified loci. Moreover, the effect size at any locus was unremarkable, with the largest being 0.12 standard deviations. Even in simple additive combinations, these genes contributed only weakly to explaining the variance in bone density: ∼3% at the lumbar spine and ∼2% at the femoral neck. Combinations of these markers were associated with fracture risk but again explained only about 5% of the risk. One can look at these data in one of two ways: the glass is half-empty or it is almost completely empty.
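To put the largest reported effect size in perspective, the following is a minimal sketch of how little trait variance a single locus of that size would explain. It assumes a purely additive biallelic locus in Hardy-Weinberg proportions and reads the 0.12 standard deviations as a per-allele effect; the allele frequencies and the function name are illustrative assumptions, not values taken from the meta-analyses.

```python
def variance_explained(beta_sd: float, allele_freq: float) -> float:
    """Fraction of trait variance explained by one additive biallelic locus.

    Assumes Hardy-Weinberg proportions, so the genotype variance is
    2 * p * (1 - p); beta_sd is the per-allele effect in trait SD units.
    """
    return 2.0 * allele_freq * (1.0 - allele_freq) * beta_sd ** 2


# 0.12 SD is the largest effect quoted in the text; treating it as a
# per-allele effect is an assumption made for this illustration.
largest_effect_sd = 0.12

# Hypothetical allele frequencies, chosen only to show the range.
for freq in (0.1, 0.3, 0.5):
    share = variance_explained(largest_effect_sd, freq)
    print(f"allele frequency {freq:.1f}: {share:.2%} of variance")
```

Even at an allele frequency of 0.5, where a locus contributes most, this works out to 0.72% of the variance, which is in keeping with the few percent explained by all loci combined.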

Given these findings, it is worthwhile to consider what the expectations were when the candidate gene and GWAS studies were started. The initial twin and family studies suggested that 70-80% of the variance in bone density was explicable on the basis of genetic factors (). Twin studies also identified clinically significant genetic contributions to fracture risk (). One analysis suggested that a single locus would explain a substantial proportion of that genetic effect in any single family (). By contrast, in these recent large-scale GWAS and candidate gene analyses, less than 5% of the variance in bone density at any site was explicable by identified genetic loci. This lack of confirmation in meta-analyses could reflect false positive findings in smaller studies or could relate to misleading findings from combinations of studies in subjects drawn from different ethnic groupings. Misleading outcomes from meta-analyses have been commented upon recently in relation to clinically important differences between meta-analyses and subsequent large, highly powered, randomized, controlled trials for different health conditions (). One can draw either of two conclusions. On the one hand, one can argue that any genetic variants that show through in these large, ethnically heterogeneous meta-analyses must be robust contributors. On the other hand, one could argue that a maximum effect size for bone density of 0.1 of a standard deviation and 5% of the risk of fracture are such trivial contributions as to be virtually clinically useless. The 65-70% genetic contribution to bone density variance, and the genetic contribution to fracture risk prediction, observed in twin and family studies lead to two questions: where have these genetic contributions gone? Are there systematic issues with GWAS beyond ethnic heterogeneity that mask genetic contributions?

Those who favor the very large-scale studies suggest the solution to these missing genetic contributions lies with still larger and larger studies. However, statistical considerations do not clearly support such outcomes. Could there be inherent limitations in these approaches? We suggest there are several potential limitations in very large-scale studies, including ethnic heterogeneity, sampling strategies, multiple testing, markers versus functional variants, gene-gene interactions and gene-environment interactions.

Ethnic Heterogeneity: In genetic studies, ethnic homogeneity is considered essential to avoid confounding by genetic heterogeneity. Usually this is to avoid false associations related to known (or expected) allelic differences between ethnic/racial groups and other group-specific but in reality unlinked genetic effects. However, it is equally true that such ethnic admixtures could mask real associations. Yet by their very nature the large-scale meta-analyses flout this principle to some extent. One could argue that reproducing the effect of any locus across such complex genetic backgrounds as in these meta-analyses suggests a very robust effect. Interestingly, in the GWAS meta-analysis, one “reproducible” locus, LRP5, was apparent in the initial cohort but the strength of association (at least as assessed by LOD score) did not increase with the addition of further cohorts. This problem is clear when different ethnic groups are intentionally sampled together in a meta-analysis. However, even in primary studies, the likelihood of significant heterogeneity increases as the sample size is increased, with the exception perhaps of studies in very isolated communities. By contrast, in twin and family-based studies, ethnicity is not a confounding factor as each pair is considered separately. In the large meta-analyses, involving very large samples drawn from many countries around the world, albeit largely those of Caucasian background, unsuspected ethnic differences are likely to result in different allele frequencies and sufficient noise to obscure any genetic contributions. A recent genotype geography analysis in Europe suggested that principal component analysis of genotype differences could map an individual's place of origin to within a few hundred kilometers (). These associations were apparent even after the authors had minimized genetic diversity by excluding individuals whose grandparental origin was outside Europe. The authors concluded that these associations could lead to false positive associations in gene-phenotype studies. One could suggest that any signal from such loci would generate “noise” and obscure real relationships in neighboring loci.

Sampling Strategies: It is perhaps instructive to look at the samples from which genome-wide studies are carried out. The “logical” approach has been to use an entire population sample. Of course this means that 2/3 of individuals have bone densities that are within 1 standard deviation of the average. It is perhaps not unexpected that individual gene variants would have a modest effect, as such a large part of the sampled individuals are so similar to each other in absolute values. If this model were taken to the illogical extreme of selecting individuals whose bone densities were all scattered around the mean, or any other value, within a measurement error of 2-3%, one could not imagine ever finding any genetic locus that could contribute to these non-measurable differences. It is possible that the genome-wide approach using the entire population, for all its “efficiency”, is actually adding so much noise that it is impossible to see an effect. Equally important in the large meta-analyses involving very large samples drawn from a range of countries around the world, including largely those of Caucasian background, is that the potential ethnic differences, including in the frequency of different alleles, could also be expected to translate to sufficient noise as to obscure any genetic effect.
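The “2/3 within 1 standard deviation” figure can be checked with a short calculation. The sketch below assumes bone density is approximately normally distributed in the sampled population, which is an assumption made here for illustration; the function name is hypothetical.

```python
import math


def fraction_within(k_sd: float) -> float:
    """Fraction of a normal population lying within k_sd SDs of the mean."""
    return math.erf(k_sd / math.sqrt(2.0))


# Share of an unselected population sample close to the average.
print(f"within 1 SD of the mean: {fraction_within(1.0):.1%}")

# Share in the two tails beyond 2 SD, where individuals differ most.
print(f"beyond 2 SD (both tails): {1.0 - fraction_within(2.0):.1%}")
```

Under that assumption about 68% of a whole-population sample sits within 1 standard deviation of the mean, while fewer than 5% of individuals lie more than 2 standard deviations away in either direction.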

Multiple Testing: Another potential issue is that of multiple testing. The same argument of “efficiency” of using the entire population for different condition-specific phenotypes translates to justification for repeated use of the same cohorts for many different phenotypes. However, there is virtually never any adjustment to the already stringent genome-wide significance criteria for so many phenotypes being examined. If 100 phenotypes were examined, one could argue that the genome-wide significance cut-off should be tightened from 10⁻⁷ to 10⁻⁹, which would lead to even fewer of the “reproduced” loci still being significant. This issue has yet to be satisfactorily addressed.
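One simple reading of this adjustment is a Bonferroni-style division of the genome-wide threshold by the number of phenotypes examined. The text does not name the correction method, so the sketch below is an assumption about how the hundredfold tightening for 100 phenotypes could arise; the function name is hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Genome-wide cut-off and phenotype count as quoted in the text.
genome_wide_cutoff = 1e-7
n_phenotypes = 100

adjusted_cutoff = bonferroni_threshold(genome_wide_cutoff, n_phenotypes)
print(f"adjusted cut-off: {adjusted_cutoff:.0e}")
```

A Bonferroni division treats the phenotypes as independent tests, so it is conservative when the phenotypes examined in a cohort are correlated with one another.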

Markers Versus Functional Variants: The genetic markers that are used, whether in relation to candidates or SNPs, are not necessarily related to any difference in function but rather it is supposed (hoped) that they are in linkage disequilibrium with functional allelic differences. Moreover, similar to other limitations in large samples, differences in the strength of linkage disequilibrium between different ethnic groups are to be expected, which would further blur any findings in large-scale meta-analyses.

A way of enriching these data sets is to focus on genes that are expressed in the cell types of interest. One recent study used expression QTLs (eQTLs) from primary cultures of human osteoblastic cells in relation to the large GWAS sets (). Reproducible cis-regulated eQTLs from two independent mRNA sets helped identify two novel genetic loci in a Swedish male cohort. One of these was reproduced in the Rotterdam sample and was eventually fine-mapped as a promoter variant. This approach has considerable potential as the gene had been ranked at number 372 in the initial GWAS analysis. However, this analysis would need to be expanded to include eQTLs from other bone cells, such as osteoclasts and osteocytes, and logically from any cells involved at any step in bone and calcium physiology. This requirement is not so different from the candidate gene approach and carries some of its limitations.

Limited Analyses of Gene-Gene Interactions: In statistical simulations, as noted above, a significant proportion of genetic variability in a family would appear to be driven by a single locus or a small number of genetic loci (). This would be likely to be obscured when multiple different families and groups within a population are combined. An important reason for such differences may relate to gene-gene interactions. Candidate gene studies, and for that matter genome-wide scans, are always analyzed on a locus-by-locus basis without any consideration of gene-gene interaction. Stronger effects within families or twin studies may be due to gene-gene interactions, although this has yet to be adequately explored.

Limited Analyses of Gene-Environment Interactions: Despite much evidence that environment plays a role in osteoporosis susceptibility, i.e., that environmental and lifestyle factors may impact on bone density and fracture risk, there have been few studies aimed at examining gene-environment interactions. One such early study suggested that there were major differences in the effect of the environment, in that case calcium intake, on bone gain in relation to the vitamin D receptor (). This possibility needs to be evaluated explicitly but this is a challenge in larger studies, where different cohorts have different baseline information collected.

Perhaps the most important implication of these genome-wide scans and large-scale meta-analyses is not that still bigger studies need to be done but rather that innovative approaches must be developed to examine the issues of gene-gene interaction and gene-environment interaction to study and identify clinically meaningful outcomes. Some research studies should perhaps be focused on response to therapy, aiming to identify genetic correlates of a better versus poorer response to therapy. In this context of randomized controlled trials, the GWAS approach in “non-responders” between the treatment and placebo arms could be particularly informative. Heritable factors clearly play a major role in osteoporosis as well as in many other chronic conditions. However one looks at the data, current study designs have been somewhat disappointing in their yields. Hopefully this “almost empty glass” will encourage more careful examination of study methodology as well as thoughtful refinement of experimental design and of the questions being asked.


This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.