Genome-Wide Linkage and Association Scans for Quantitative Trait Loci of Serum Lactate Dehydrogenase

Abstract

Serum lactate dehydrogenase (LDH) is used in diagnosing many diseases, and its level is significantly determined by genetic factors. Three genes coding for LDH isoenzymes have been mapped to chromosomes 11p15 and 12p12. We used the 330 largest Framingham Heart Study families for a microsatellite linkage scan and a 100K SNP association scan to determine quantitative trait loci of LDH level. We estimated the heritability at 41%. Our genome-wide linkage analysis yielded several chromosomal regions, other than 11p and 12p, with LOD scores between 1 and 2.5. None of the 100K SNPs with a P-value < 10⁻⁴ in our genome-wide association study was close to the chromosomal regions where the LDH genes reside. Our study demonstrated a strong genetic effect on the variation of LDH levels. There may not be a single gene with a large effect; instead, several genes with small effects may control the variation of serum LDH. Those genes may be located in chromosomal regions that differ from those where the genes encoding LDH isoenzymes reside.

1. Introduction

LDH is a cytoplasmic enzyme found in almost all body tissues. A small amount of the enzyme exists in the blood. When cells are damaged or destroyed, they release LDH into the bloodstream, causing blood levels to rise. LDH is used in diagnosing heart, liver, muscle, and lung diseases, as well as testicular cancer. Its sensitivity and specificity in helping to diagnose acute myocardial infarction are very high [1].

The enzyme is a tetramer composed of two types of peptide chain, A and B. Chain A is encoded by the LDH A gene and predominates in muscle; chain B is encoded by the LDH B gene and predominates in the heart. Five different combinations of A and B compose the five LDH isoenzymes: LDH1 (B4), LDH2 (AB3), LDH3 (A2B2), LDH4 (A3B), and LDH5 (A4). While all five isoenzymes are represented in total serum LDH, LDH2 makes up the greatest percentage. In addition, there is a homotetrameric LDH C isoenzyme, encoded by the LDH C gene, found only in mature testes and spermatozoa in humans. The LDH A and C genes are syntenic on human chromosome 11p15 [2, 3], while the LDH B gene maps to chromosome 12p12 [4].

Family studies have shown that serum LDH variation is significantly determined by genetic factors, with a heritability estimate of 50% [5]. So far, no genome-wide linkage or association studies on serum levels of LDH have been reported. We carried out a 10 cM genome-wide linkage analysis and an Affymetrix Human GeneChip 100 K genome-wide association study for quantitative trait loci of serum LDH level in a community-based Caucasian cohort, the Framingham Heart Study.

2. Subjects and Methods

The Framingham Heart Study, a population-based study, began in 1948 with the recruitment of 5209 residents aged 28–62 years (mean age 44.1) from Framingham, Massachusetts [6]. The participants have undergone biennial examinations since the study began. In 1971, the Framingham Offspring Study [7] was started, in part, to evaluate the genetic components of cardiovascular disease etiology. In total, 5124 subjects aged 5–70 years (mean age 36.3), comprising the offspring of the original cohort and the spouses of the offspring, were recruited. The offspring cohort has been examined every four years (except for an eight-year interval between the first two examinations). Within the study, the 330 largest extended families were selected and genotyped in a 10 cM density microsatellite marker scan and an Affymetrix Human GeneChip 100 K genome-wide scan. Microsatellite genotypes were available for 1702 subjects and 100 K SNP genotypes for 1343 subjects. A recent study reported no evidence of major population substructure in the Framingham Heart Study [8]. LDH was measured during the first (1971–1975) and second (1980–1983) examinations of the offspring. Our analysis was limited to the first examination, since far fewer individuals had serum LDH measured at the second examination.

Total LDH was measured by a kinetic method using the Dow UV LDH-Reagent Kit. Weight was measured with the subject in light clothing and without shoes. Laboratory measurements were made on 12-hour fasting venous blood samples collected in tubes containing 0.1% EDTA. Lipid determinations were performed at the Framingham Heart Study laboratory, which participates in the Standardization Program of the Centers for Disease Control. All subjects provided informed consent prior to each clinic visit, and the examination protocol was approved by the Institutional Review Board at Boston Medical Center (Boston, MA). The clinical and laboratory methods have been detailed elsewhere [5].

2.1. Genotyping Methods

For the microsatellite genotyping, genomic DNA was isolated from nucleated blood cells. DNA samples were sent to the Marshfield Mammalian Genotyping Service (http://research.marshfieldclinic.org/genetics/). At an average density of 10 cM, 399 microsatellite markers (Screening Set 9) [9] covered the genome with an average marker heterozygosity of 0.77. The genotyping data were cleaned in two steps. First, the sib-kin program in Aspex (ftp://lahmed.standord.edu/pub/aspex/index.html) was used to verify family relationships based on all available markers. Second, the GENTEST program, a precursor of INFER created by the Southwest Foundation for Biomedical Research (http://www.sfbr.org/sfbr/public/software/software.html), was used to identify and eliminate additional genotype inconsistencies. When inconsistencies were found, the genotypes of all members of the nuclear family were set to missing.
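The second cleaning step can be illustrated with a minimal sketch of a Mendelian consistency check for one marker in one nuclear family. This is not the GENTEST implementation; the function names and the tuple representation of genotypes are assumptions made only for illustration.

```python
from itertools import product


def is_mendelian_consistent(father, mother, child):
    """Return True if the child's genotype can be formed by taking one
    allele from each parent. Genotypes are 2-tuples of allele labels
    (e.g. microsatellite allele sizes); None marks a missing genotype,
    which is treated here as uninformative (an illustrative assumption)."""
    if father is None or mother is None or child is None:
        return True
    possible = {tuple(sorted(pair)) for pair in product(father, mother)}
    return tuple(sorted(child)) in possible


def clean_nuclear_family(father, mother, children):
    """If any child is inconsistent with the parents, set the genotypes of
    all family members to missing (None); otherwise return them unchanged."""
    if all(is_mendelian_consistent(father, mother, c) for c in children):
        return father, mother, list(children)
    return None, None, [None] * len(children)


# Hypothetical allele sizes for a single marker.
print(clean_nuclear_family((120, 124), (122, 124), [(120, 122), (124, 124)]))
print(clean_nuclear_family((120, 124), (122, 124), [(126, 122)]))
```

Blanking the whole nuclear family, rather than only the offending child, is the conservative choice described in the text: a single inconsistency does not reveal which family member's genotype is wrong.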

For the Affymetrix 100 K SNP genotyping, the details are provided in [10]. In summary, SNPs on the Affymetrix Human GeneChip 100 K (n = 112,990 autosomal SNPs) were genotyped in a sample of 1343 individuals from 330 families. SNPs with minor allele frequency <10%, call rate <80%, or Hardy-Weinberg equilibrium P-value < .001 were excluded, following the criteria set by the Framingham Heart Study for the Affymetrix 100 K SNP genome-wide association scans [11]. This left 70591 SNPs for analysis. The 100 K SNP data are publicly available through dbGaP (http://www.ncbi.nlm.nih.gov/gap) under the Framingham SHARe project.
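The three exclusion criteria can be sketched as a per-SNP filter. This is a minimal illustration working from genotype counts, not the pipeline used by the Framingham Heart Study; the function names and default arguments are assumptions, and the Hardy-Weinberg test is the standard one-degree-of-freedom chi-squared test.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of the 1-df chi-squared test for Hardy-Weinberg equilibrium
    computed from the three genotype counts."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: the test is not informative
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-squared variable with 1 df: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(n_aa, n_ab, n_bb, n_missing,
              maf_min=0.10, call_rate_min=0.80, hwe_p_min=0.001):
    """Apply the MAF, call-rate and HWE thresholds quoted in the text."""
    n_called = n_aa + n_ab + n_bb
    n_total = n_called + n_missing
    if n_called == 0 or n_total == 0:
        return False
    call_rate = n_called / n_total
    p = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(p, 1.0 - p)
    return (maf >= maf_min
            and call_rate >= call_rate_min
            and hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= hwe_p_min)


# Hypothetical genotype counts for one SNP.
print(passes_qc(25, 50, 25, n_missing=0))
```

In the study the Hardy-Weinberg test was computed on one randomly selected individual per family, so the counts passed to `hwe_chi2_pvalue` would come from that subsample, while call rate would be computed over all genotyped individuals.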

2.2. Statistical Methods

2.2.1. Genome-Wide Linkage Analysis

To enhance the ability of the linkage analysis to detect genetically determined variation, variation in LDH attributable to known factors was identified and removed by the regression modeling incorporated in SOLAR [12–14], which uses a maximum-likelihood-based variance decomposition method. The covariates selected (P < .05) and incorporated into both the heritability estimation and the linkage analyses were age, sex, height, weight, high density lipoprotein cholesterol (HDL-C), and alkaline phosphatase (ALP).

An estimate of heritability was obtained using the variance-component method. Heritability is the proportion of total phenotypic variation due to additive genetic effects, after removing the variation attributable to covariates. The variance-component method was also used for the linkage analysis of LDH, adjusted for known covariates, using the random microsatellite DNA markers covering the entire genome. Marker allele frequencies were estimated from the study participants and then used to estimate the single-point proportion of alleles shared identical by descent (IBD) among all relative pairs. Multipoint IBDs were estimated from the single-point estimates by a regression approach [13]. A likelihood ratio test was used to evaluate linkage by comparing a purely polygenic model (without consideration of genetic marker information) with a model that incorporates IBD information at the marker. The LOD score was the log (base 10) of the ratio of the likelihood of the model that included IBD information at the marker to the likelihood of the purely polygenic model.
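The two quantities defined above reduce to simple arithmetic once the variance components and the maximized log-likelihoods are available. The sketch below assumes natural-log likelihoods as input; the function names and the numeric inputs are hypothetical and are not SOLAR output.

```python
import math


def heritability(var_additive, var_residual):
    """Proportion of the covariate-adjusted phenotypic variance that is
    due to additive genetic effects."""
    return var_additive / (var_additive + var_residual)


def lod_score(loglik_polygenic, loglik_linkage):
    """LOD = log10(L_linkage / L_polygenic), computed from natural-log
    likelihoods of the two nested models."""
    return (loglik_linkage - loglik_polygenic) / math.log(10)


def lrt_from_lod(lod):
    """Likelihood-ratio statistic equivalent to a LOD score:
    2 * ln(L_linkage / L_polygenic) = 2 * ln(10) * LOD."""
    return 2.0 * math.log(10) * lod


# Hypothetical values chosen only to exercise the functions.
print(heritability(0.4, 0.6))
print(lod_score(-1000.0, -995.0))
```

A LOD score of 1 therefore corresponds to a likelihood-ratio statistic of about 4.6, and a LOD score of 3 to about 13.8.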

Since the variance-component method is based on the assumption of a multivariate normal distribution, violations of this assumption may produce inaccurate results [14–16]. We found that LDH had high kurtosis and therefore used the LOD score adjustment method implemented in SOLAR to obtain more reliable results. A fully informative marker, unlinked to the trait studied, was simulated. The IBD information for this marker was calculated, and linkage analysis to the trait was then performed. The LOD score adjustment process [15, 16] regresses the observed LOD scores from simulation (10,000 replicates) on the expected LOD scores for a multivariate normal trait to obtain a correction factor for the LOD scores from the analyses of the observed trait. This method provides robust LOD scores for data with nonnormal distributions.
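The idea behind the adjustment can be sketched as follows. This is not SOLAR's implementation: it assumes that, for a multivariate normal trait, the null likelihood-ratio statistic follows the usual 50:50 mixture of a point mass at zero and a chi-squared distribution with one degree of freedom, and it assumes a regression through the origin of ordered simulated LOD scores on their expected values. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def expected_null_lods(n):
    """Expected ordered null LOD scores for n replicates under the assumed
    50:50 mixture of 0 and chi-squared(1 df) for the LRT. For a quantile
    u > 0.5 the mixture quantile of the LRT is inv_cdf(u) ** 2."""
    nd = NormalDist()
    lods = []
    for i in range(1, n + 1):
        u = (i - 0.5) / n
        z = max(0.0, nd.inv_cdf(u))
        lods.append(z * z / (2.0 * math.log(10)))
    return lods


def lod_correction_factor(simulated_lods):
    """Regress the ordered simulated LODs on their expected values (no
    intercept) and return the multiplier that rescales observed LODs."""
    observed = sorted(simulated_lods)
    expected = expected_null_lods(len(observed))
    sxx = sum(e * e for e in expected)
    sxy = sum(e * o for e, o in zip(expected, observed))
    if sxx == 0 or sxy == 0:
        raise ValueError("not enough nonzero LOD scores to fit a slope")
    slope = sxy / sxx  # observed is approximately slope * expected
    return 1.0 / slope


# Hypothetical simulation in which null LODs are inflated by 50%.
inflated = [1.5 * e for e in expected_null_lods(10000)]
factor = lod_correction_factor(inflated)
print(factor, factor * 3.0)  # a raw LOD of 3.0 is scaled down
```

A slope above 1 means the nonnormal trait produces inflated LOD scores under the null, so the correction factor is below 1 and observed LOD scores are scaled down.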

2.2.2. Genome-Wide Association Study

One individual from each family was randomly selected for the Hardy-Weinberg equilibrium (HWE) test, which used a chi-squared statistic with one degree of freedom. For each SNP, we modeled the log-transformed trait value adjusted for the same covariates as used in the linkage analysis. Trait values were assumed to be correlated within each pedigree and independent between pedigrees. A special case of the linear mixed effects model (LME) in SAS was used to fit the model with the covariates and calculate the residual for each individual. The following model was used for family i: $Y_i = X_i\beta + \varepsilon_i$, where $Y_i$ is the vector of log-transformed LDH values of the subjects in the family, and $X_i$ contains the covariates to adjust for (age, sex, height, weight, HDL-C, and ALP), which were treated as fixed effects. The covariance matrix for $Y_i$ in family i was unstructured. Once the estimate of β was obtained, denoted as $\hat{\beta}$, we calculated the residual for person j in family i with covariates $x_{ij}$ as $r_{ij} = y_{ij} - x_{ij}^{\top}\hat{\beta}$. Then, in the second stage, we modeled the association between the residual and the genetic effect by an LME given by $r_{ij} = \mu + \gamma g_{ij} + e_{ij}$, where the covariance matrix of $e_{i}$ was unstructured within each family and diagonal between families, and $g_{ij}$ was a coding for the genetic model underlying the disease locus, defined as recessive, additive, or dominant. For each genetic model, a P-value was obtained. The smallest of the three P-values was used to rank all SNPs. In genome-wide association studies, some SNPs associated with common diseases show little additive effect but a strong deviation from Hardy-Weinberg equilibrium, which indicates strong recessive or dominant effects [17, 18]. Such SNPs would not be detected using a P-value based only on the additive model. Ranking SNPs by the minimum P-value allows SNPs acting under any of the three genetic models to be detected.
Because the minimum of three P-values is no longer a true P-value, we used the following threshold values: if the minimum P-value is less than $\text{[math]}$ , the association is regarded as strong, while if the minimum P-value is between $\text{[math]}$ and $\text{[math]}$ , the association is moderate. No correction for multiple testing was applied to the minimum P-value.
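The ranking by minimum P-value can be illustrated with a simplified sketch. It is not the SAS mixed-model analysis used in the study: it ignores the within-family covariance, fits an ordinary simple linear regression of the residual on each genotype coding, and uses a normal approximation for the slope test. The function names and the toy data are assumptions made for illustration.

```python
import math

MODELS = ("additive", "dominant", "recessive")


def code_genotype(minor_allele_count, model):
    """Code a genotype (0, 1 or 2 copies of the minor allele) under one of
    the three genetic models."""
    if model == "additive":
        return float(minor_allele_count)
    if model == "dominant":
        return 1.0 if minor_allele_count >= 1 else 0.0
    if model == "recessive":
        return 1.0 if minor_allele_count == 2 else 0.0
    raise ValueError(f"unknown model: {model}")


def slope_p_value(x, y):
    """Two-sided P-value for the slope of a simple linear regression of y
    on x, using a normal approximation (a simplification)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0 or n < 3:
        return 1.0  # no genotype variation or too few observations
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    rss = sum((yi - mean_y - slope * (xi - mean_x)) ** 2
              for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0:
        return 0.0
    return math.erfc(abs(slope / se) / math.sqrt(2))


def minimum_p_value(minor_allele_counts, residuals):
    """Return (best_model, min_p) across the three genotype codings."""
    p_by_model = {
        m: slope_p_value([code_genotype(g, m) for g in minor_allele_counts],
                         residuals)
        for m in MODELS
    }
    best = min(p_by_model, key=p_by_model.get)
    return best, p_by_model[best]


# Toy data with a purely recessive effect on the residual trait.
noise = [0.1 if i % 2 == 0 else -0.1 for i in range(10)]
counts = [0] * 10 + [1] * 10 + [2] * 10
resid = noise + noise + [1.0 + e for e in noise]
print(minimum_p_value(counts, resid))
```

In this toy example the recessive coding gives the smallest P-value, while the additive coding alone would rank the SNP lower, which is the motivation for taking the minimum over the three models.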

3. Results

3.1. Genome-Wide Linkage Analysis

The total number of individuals with measured LDH and with all covariates used for the heritability estimates and linkage analysis in offspring was 1603. The basic characteristics of the clinical covariates of these individuals (about 50% male) are displayed in Table 1. Of the 1603 individuals, 1260 had microsatellite marker genotypes.

Table 1

Characteristics of the 1603 individuals with LDH and covariates used in the linkage analysis.

The skewness and kurtosis of LDH were 1.87 ± 0.04 and 13.92 ± 0.07, respectively. The heritability estimate for LDH, after adjusting for the covariates, was 40.9 ± 6.4%. The proportion of variation due to all covariates included in the model was approximately 5.8%.

From the multipoint linkage analysis of LDH, several maximum LOD scores between 1 and 2.5 were observed on different chromosomes (Table 2). The highest LOD score was 2.41 on chromosome 9.

Table 2

Chromosomal regions in the genome scan with multipoint LODs ≥ 1.0.

3.2. Genome-Wide Association Study

After log transformation, only one individual's LDH value (1.74) fell outside the range of 5 SD around the mean (1.77–2.64). Because 1.74 was so close to the lower bound of 1.77, we retained this individual in the association study rather than excluding the value and reanalyzing the data. In the genome-wide association study, no SNP showed strong association, while two SNPs, rs9318892 and rs10518949, showed moderate association. The top-ranked SNP, on chromosome 13, had a minimum P-value of $\text{[math]}$ , while the second, on chromosome 15, had a minimum P-value of 1.6 × 10⁻⁷. There were 32 SNPs with a minimum P-value < 10⁻⁴ (Table 3). All SNPs within 60 kb of the LDHA gene (rs4237721, rs2279900, rs8018, rs3781640) and the LDHB gene (rs10505873, rs1677106, rs1677104, rs1030452, rs1012354) had a P-value > .20.
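The outlier check can be made concrete by working backwards from the quoted 5 SD range. The sketch below assumes the range 1.77–2.64 is symmetric about the mean of the log-transformed values; the function name is hypothetical.

```python
def implied_mean_sd(lower, upper, k):
    """Recover the mean and SD from a symmetric range mean ± k * SD."""
    mean = (lower + upper) / 2.0
    sd = (upper - lower) / (2.0 * k)
    return mean, sd


mean, sd = implied_mean_sd(1.77, 2.64, k=5)
z = (1.74 - mean) / sd  # distance of the flagged value from the mean, in SD

print(f"mean = {mean:.3f}, SD = {sd:.3f}, z = {z:.2f}")
```

Under that assumption the implied mean is about 2.205 and the SD about 0.087, so the value 1.74 lies roughly 5.3 SD below the mean, only marginally beyond the 5 SD cutoff.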

Table 3

Top 32 SNPs associated with LDH levels with a minimum P-value < 10⁻⁴ by LME test.

4. Discussion

In our study, the heritability of LDH was estimated at 41%, indicating that a substantial portion of the variation in serum LDH was attributable to additive genetic factors. This is consistent with a previous finding [5]. In the linkage study, we identified several chromosomal regions with LOD scores between 1 and 2.5. None of them is located on the chromosomes where the genes encoding the LDH isoenzymes reside. No obvious candidate genes were found in those chromosomal regions. In the genome-wide association study, no SNP with genome-wide significance was observed. Furthermore, testing three models per SNP in the genome-wide association study adds to the multiple-testing burden, and this is not reflected in the P-values in Table 3. Among the top 32 SNPs with P-values < 10⁻⁴, none was near the LDH structural genes, and none was within the regions of the linkage peaks. As in the linkage analysis, no obvious candidate genes were found in those chromosomal regions.

A power study for linkage on the 330 Framingham families with traits measured only in the Offspring Cohort, similar to our study, was performed using SOLAR. The power to detect a QTL heritability of 30% was estimated to be 97%, 84%, and 62% using a LOD score cutoff of 1, 2, or 3, respectively, as significant (http://www.framinghamheartstudy.org/). The results of the simulation studies imply that this study sample only has sufficient power to detect large QTL effects. A power study for association in the same study population under an additive genetic model demonstrated 97%, 100%, 94%, and 77% power for SNP minor allele frequencies of 5%, 10%, 20%, and 30%, assuming a QTL heritability of 10%, a QTL allele frequency of 0.10, and linkage disequilibrium between the QTL and the marker ( $\text{[math]}$ = 1). The power was calculated using PBAT, a software package for the family-based association test (see the same Framingham website mentioned above). LME is usually expected to have higher power than PBAT, since only heterozygous parents are informative in a family-based test. The results of the simulation studies imply that this study population may have good power for a linkage study. However, since the Affymetrix 100 K GeneChip has a mean marker distance of 26 kb, in most situations the genomic coverage may not be high enough to reach the power mentioned above for an association study unless there is a major gene with a larger effect size in linkage disequilibrium with the adjacent SNPs.

Although there is a strong genetic effect on the variation of serum LDH levels, the results of both linkage and association genome scans did not show evidence of a major gene effect. Instead, there may be many genes with small effects in controlling the variation of serum LDH levels.

A limitation of this study is that our cohort is mostly composed of Caucasians. Therefore, caution is advised in extrapolating our results to other ethnicities.

Acknowledgments

The authors are very grateful to Nancy Geller for her helpful comments on the data analysis and critical review of the manuscript. This work was supported by the National Heart, Lung, and Blood Institute's Framingham Heart Study (Contract no. N01-HC-25195).

References

1. Vasudevan G., Mercer D. W., Varat M. A. “Lactic dehydrogenase isoenzyme determination in the diagnosis of acute myocardial infarction”. Circulation, vol. 57, no. 6, pp. 1055–1057, 1978.
2. Li S. S.-L., Luedemann M., Sharief F. S., Takano T., Deaven L. L. “Mapping of human lactate dehydrogenase-A, -B, and -C genes and their related sequences: the gene for LDHC is located with that for LDHA on chromosome 11”. Cytogenetics and Cell Genetics, vol. 48, no. 1, pp. 16–18, 1988.
3. Markert C. L., Shaklee J. B., Whitt G. S. “Evolution of a gene. Multiple genes for LDH isozymes provide a model of the evolution of gene structure, function, and regulation”. Science, vol. 189, no. 4197, pp. 102–114, 1975.
4. Rethore M. O., Junien C., Malpuech G. “Localization of the gene of the glyceraldehyde 3 phosphate dehydrogenase on the distal segment of the short arm of the chromosome 12”. Annales de Genetique, vol. 19, no. 2, pp. 140–142, 1976.
5. Bathum L., Petersen H. C., Rosholm J.-U., Petersen P. H., Vaupel J., Christensen K. “Evidence for a substantial genetic influence on biochemical liver function tests: results from a population-based Danish twin study”. Clinical Chemistry, vol. 47, no. 1, pp. 81–87, 2001.
6. Dawber T. R. The Framingham Heart Study: The Epidemiology of Atherosclerotic Diseases. Cambridge, Mass, USA: Harvard University Press, 1980.
7. Kannel W. B., Feinleib M., McNamara P. M. “An investigation of coronary heart disease in families. The Framingham offspring study”. American Journal of Epidemiology, vol. 110, no. 3, pp. 281–290, 1979.
8. Wilk J. B., Manning A. K., Dupuis J., et al. “No evidence of major population substructure in the Framingham Heart Study”. Genetic Epidemiology, vol. 29, no. 3, p. 286, 2005.
9. Yuan B., Vaske D., Weber J. L., Beck J., Sheffield V. C. “Improved set of short-tandem-repeat polymorphisms for screening the human genome”. American Journal of Human Genetics, vol. 60, no. 2, pp. 459–460, 1997.
10. Herbert A., Gerry N. P., McQueen M. B., Heid I. M., Pfeufer A., Illig T., Wichmann H.-E., Meitinger T., Hunter D., Hu F. B., Colditz G., Hinney A., Hebebrand J., Koberwitz K., Zhu X., Cooper R., Ardlie K., Lyon H., Hirschhorn J. N., Laird N. M., Lenburg M. E., Lange C., Christman M. F. “A common genetic variant is associated with adult and childhood obesity”. Science, vol. 312, no. 5771, pp. 279–283, 2006. doi:10.1126/science.1124779.
11. Cupples L. A., Arruda H. T., Benjamin E. J., et al. “The Framingham Heart Study 100K SNP genome-wide association study resource: overview of 17 phenotype working group reports”. BMC Medical Genetics, vol. 8, supplement 1, article S1, 2007. doi:10.1186/1471-2350-8-S1-S1.
12. Almasy L., Blangero J. “Multipoint quantitative-trait linkage analysis in general pedigrees”. American Journal of Human Genetics, vol. 62, no. 5, pp. 1198–1211, 1998. doi:10.1086/301844.
13. Blangero J., Almasy L. “Multipoint oligogenic linkage analysis of quantitative traits”. Genetic Epidemiology, vol. 14, no. 6, pp. 959–964, 1997. doi:10.1002/(SICI)1098-2272(1997)14:6<959::AID-GEPI66>3.0.CO;2-K.
14. Allison D. B., Neale M. C., Zannolli R., Schork N. J., Amos C. I., Blangero J. “Testing the robustness of the likelihood-ratio test in a variance-component quantitative-trait loci-mapping procedure”. American Journal of Human Genetics, vol. 65, no. 2, pp. 531–544, 1999. doi:10.1086/302487.
15. Blangero J., Williams J. T., Almasy L. “Robust LOD scores for variance component-based linkage analysis”. Genetic Epidemiology, vol. 19, supplement 1, pp. S8–S14, 2000.
16. Blangero J., Williams J. T., Almasy L. “Variance component methods for detecting complex trait loci”. Advances in Genetics, vol. 42, pp. 151–181, 2001.
17. Burton P. R., Clayton D. G., Cardon L. R., et al. “Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls”. Nature, vol. 447, no. 7145, pp. 661–678, 2007. doi:10.1038/nature05911.
18. Zheng G., Joo J., Zaykin D., Wu C. O., Geller N. L. “Robust tests in genome-wide scans under incomplete linkage disequilibrium”. Statistical Science, vol. 24, pp. 503–516, 2009.