© 2001 by The Society for Integrative and Comparative Biology
Adaptive Evolution of Gene Expression in Antarctic Fishes: Divergent Transcription of the 5'-to-5' Linked Adult α1- and β-Globin Genes of the Antarctic Teleost Notothenia coriiceps is Controlled by Dual Promoters and Intergenic Enhancers1
1 Department of Biology, Northeastern University, Boston, Massachusetts 02115
SYNOPSIS
Unlike temperate fishes, Antarctic fishes of the notothenioid suborder, whose body temperatures (−2 to +1°C) conform to the temperature of the Southern Ocean, must express their genomes in an extremely cold thermal regime. To determine whether these fishes have evolved compensatory adjustments that maintain efficient gene transcription at low temperatures, we have initiated studies of the cis-acting regulatory elements that control globin gene expression in the Antarctic rockcod Notothenia coriiceps and in its close relative, the temperate New Zealand black cod N. angustata (habitat temperature = +6 to +15°C). The genes encoding the major α1 and β globins of these fishes are tightly linked in head-to-head (5' to 5') orientation. The intergenic regions separating the globin genes in the two fishes, ~4.3 kb in N. coriiceps and ~3.2 kb in N. angustata, are highly similar in sequence, the major difference being the absence of a 1.1-kb, repeat-containing segment in the latter. To assess the promoter and enhancer activities of the intergenic regions, each was cloned into the luciferase-reporter vector pGL3-Basic, and the constructs were transfected into MEL cells. Upon DMSO induction of MEL cell differentiation, each of the α/β-intergenic regions functioned in both orientations as erythroid-responsive transcriptional regulators. However, expression of luciferase mediated by the N. coriiceps intergene was 6-fold greater in the α orientation than that for the N. angustata intergene and 2-fold greater in the β orientation. The greater transcription-stimulating activity of the N. coriiceps intergene can be attributed to two enhancers composed of combinations of CAC/Sp1 and GATA motifs and located in direct repeat elements. N. angustata, which lacked repetitive structure in its intergene, contained a single copy of the enhancer. We propose that cold adaptation of globin gene expression in N. coriiceps evolved in part through duplication and refinement of critical cis-acting regulatory elements as the Southern Ocean cooled during the past 25 million years.

INTRODUCTION
The coastal fishes of the Antarctic diverged from temperate fishes approximately 25 million years ago (mya5) as the Southern Ocean cooled (DeWitt, 1971
+1°C). Compensatory alteration of the biosynthetic machinery necessary to express their genomes almost certainly may be counted among the adaptations that have enabled Antarctic fishes to thrive in their unique environment. The translational apparatus of these fishes, for example, supports polypeptide chain elongation at rates more than 10-fold greater than those measured in temperate fishes cooled to comparable temperatures (Smith and Haschemeyer, 1980
Prior studies of cold adaptation and acclimation of non-Antarctic organisms provide clues to strategies that have been exploited to adapt transcription to different thermal environments. Cold-adapted killifish (Fundulus heteroclitus) synthesize higher levels of lactate dehydrogenase B than do their southern, warm-adapted counterparts due largely to greater rates of transcription of the Ldh-B gene (Crawford and Powers, 1992) supported by functionally distinct proximal promoters (Segal et al., 1996; Schulte and Powers, 1997; Powers and Schulte, 1998). Carp acclimate their myofibrillar ATPases to seasonal changes in temperatures by switching, apparently through use of distinct promoters, the expression of genes for functionally different myosin heavy chains (Gerlach et al., 1990; Goldspink et al., 1992; Goldspink, 1995; Gauvry et al., 1996). Tolerance of low temperature by the plant Arabidopsis thaliana involves both unique cis-acting regulatory elements and stress-responsive transcription factors that together enhance transcription of stress-related genes (Yamaguchi-Shinozaki and Shinozaki, 1993, 1994; Horvath et al., 1993; Shinwari et al., 1998; Liu et al., 1998; Kasuga et al., 1999). Thus, both the cis-acting and trans-acting components of the transcriptional apparatus can evolve to facilitate gene expression at low temperatures.
The suborder Notothenioidei of teleost fishes is arguably the optimal taxon for analysis of cold adaptation of gene expression. The notothenioids dominate the fish fauna of the Southern Ocean, both in number of species and in biomass, and they exhibit numerous adaptations to their cold and stable environment, such as the presence of antifreeze glycoproteins in their blood (Chen et al., 1997; Cheng, 1998a, b) and tubulins that polymerize efficiently at 0°C (Detrich et al., 1989; Detrich et al., 1992; Detrich, 1997). Furthermore, some notothenioid species are found exclusively in temperate waters north of the Antarctic polar front (Eastman, 1993), which allows one to compare the transcriptional machinery of closely related fishes that have adapted to distinct thermal regimes.
The globin genes, long a model system for analysis of gene regulation in mammals, birds, amphibians (Orkin, 1990, 1995b), and more recently, fishes (McMorrow et al., 1996; Chan et al., 1997; Miyata and Aoki, 1997), provide an excellent framework for understanding cold adaptation of transcription. To this end, we have initiated studies of the promoter and enhancer elements and the transcription factors that govern globin gene expression in the Antarctic yellowbelly rockcod, Notothenia coriiceps, and in its temperate congener, the New Zealand black cod or Maori Chief, N. angustata. In this report we show that the genes encoding the major adult α and β globins of these nototheniids, like those of other fishes (McMorrow et al., 1996; Chan et al., 1997; Miyata and Aoki, 1997), are tightly linked in head-to-head (5' to 5') orientation. The intergenic regions that separate the globin genes, ~4.3 kb in N. coriiceps and ~3.2 kb in N. angustata, are highly similar in sequence, the major difference being the absence of a 1.1-kb, repetitive segment in the latter. When assayed by transfection of murine erythroleukemia (MEL) cells with luciferase-reporter constructs, the intergenic regions of N. coriiceps and of N. angustata function as erythroid-responsive, dual promoters, but the strength of the former is 2- to 6-fold greater. The greater transcription-promoting activity of the N. coriiceps intergene appears to be due to the presence of two enhancers, composed of combinations of CAC/Sp1 and GATA motifs, that are located in direct repeat elements. The N. angustata intergene, which lacked repetitive structure, contained a single copy of the enhancer. Thus, we propose that efficient gene expression in cold-adapted organisms may be achieved, in part, by evolutionary restructuring of cis-acting transcriptional regulatory elements. A preliminary report of some of this work has appeared (Saeed et al., 1997).
MATERIALS AND METHODS
Collection of animals and storage of tissues
Specimens of the Antarctic yellowbelly rockcod, Notothenia coriiceps, were collected by bottom trawling from the R/V Hero or from the R/V Polar Duke south of Low Island (Antarctic Treaty Protected Area System Marine Site of Special Scientific Interest (MSSSI) 35, Western Bransfield Strait) or west of Brabant Island (MSSSI 36, East Dallmann Bay). Fishes were transported alive to Palmer Station, Antarctica, where they were maintained in seawater aquaria at −1 to +1°C. Testes were dissected immediately following sacrifice of mature males, frozen in liquid nitrogen, and maintained at −70°C until use.
Frozen testis tissue from the New Zealand black cod or Maori chief, Notothenia angustata (habitat temperature = +6 to +15°C), was generously provided by Dr. Arthur DeVries (University of Illinois, Urbana).
Genomic library construction and screening
Genomic libraries of testicular DNA from N. coriiceps and from N. angustata were constructed in phage lambda vectors (Charon 35 (Loenen and Blattner, 1983) and LambdaGEM-11 (Promega), respectively) as described previously (Parker and Detrich, 1998). Each library was screened for individual clones containing both major adult globin genes by hybridization of identical nylon replicas of bacteriophage plaque DNA (Zhao et al., 1998; Parker and Detrich, 1998) either to the N. coriiceps cDNA NcHbα1-1, which encodes the major adult α1-globin chain of this fish, or to NcHbβ1-1, the cDNA that encodes its sole adult β-globin chain (Cocca et al., 1995). One N. coriiceps genomic isolate positive for both globin probes was obtained from a screen of 500,000 recombinant phage and subsequently carried through tertiary plaque purification. Nine doubly positive N. angustata isolates were obtained from a screen of 500,000 phage, and three of these were purified through tertiary screening.
Gene linkage analysis by PCR
The relative orientations of the α1- and β-globin genes in the genomic DNA clones obtained from the N. coriiceps and N. angustata lambda libraries were analyzed by PCR-based linkage analysis. Nondegenerate primers, designed from the coding regions of the N. coriiceps α- and β-globin genes and tagged at their 5' ends with XhoI sites, were used pairwise in independent reactions to test for head-to-head (5' to 5'), tail-to-tail (3' to 3'), and head-to-tail (5' α1 to 3' β, or 5' β to 3' α1) orientations of the globin gene pair of each fish. The primers (each containing the XhoI site CTCGAG), and their locations within the globin genes, were 1) AS4 (α1-globin exon 1 antisense primer), 5' ATATTCCGCTCGAGTCCAATCGCATCAGCTGACTTGCCG 3'; 2) SS4 (α1-globin exon 2 sense primer), 5' ATATTCCGCTCGAGAAGACCTACTTCTCCCACTGGCCTG 3'; 3) BBR1 (β-globin exon 1 antisense primer), 5' ATATTCCGCTCGAGATGTGGGAGAAGATGTCGG 3'; and 4) BBF4 (β-globin exon 1 sense primer), 5' ATATTCCGCTCGAGCTGTCTGACTGCATCACC 3'. Globin gene orientations were tested using the following primer combinations: 1) 5' α to 5' β, AS4 and BBR1; 2) 3' α to 3' β, SS4 and BBF4; 3) 5' α to 3' β, AS4 and BBF4; and 4) 5' β to 3' α, SS4 and BBR1. Touchdown PCR (Don et al., 1991) incorporating hot start by TaqStart antibody inactivation (Sharkey et al., 1994) was performed using Advantage™ KlenTaq polymerase mix (CLONTECH) for 30 cycles (cycling profile available on request). PCR products were analyzed on 1% agarose gels. The genomic clones from N. coriiceps and N. angustata gave PCR products only when amplified with the primer pair designed to test for head-to-head orientation of the globin genes. Thus, the adult α1- and β-globin genes of both fishes are linked as divergent transcriptional units (Fig. 1), a finding confirmed by subsequent sequencing of the genomic clones.
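The logic of the four-way primer test can be summarized in a short sketch. This is an illustration only, not the authors' procedure: the function names are hypothetical, and the hyphen printed inside the SS4 sequence is assumed to be a line-break artifact and is omitted.

```python
# Primer sequences as listed in the text, written 5' to 3'.
# Assumption: the hyphen printed inside SS4 is a line-break artifact.
PRIMERS = {
    "AS4": "ATATTCCGCTCGAGTCCAATCGCATCAGCTGACTTGCCG",  # alpha1 exon 1, antisense
    "SS4": "ATATTCCGCTCGAGAAGACCTACTTCTCCCACTGGCCTG",  # alpha1 exon 2, sense
    "BBR1": "ATATTCCGCTCGAGATGTGGGAGAAGATGTCGG",       # beta exon 1, antisense
    "BBF4": "ATATTCCGCTCGAGCTGTCTGACTGCATCACC",        # beta exon 1, sense
}
XHOI_SITE = "CTCGAG"  # XhoI recognition sequence

# Each primer pair tests one relative orientation of the two genes.
PAIR_TO_ORIENTATION = {
    frozenset(("AS4", "BBR1")): "head-to-head (5' alpha to 5' beta)",
    frozenset(("SS4", "BBF4")): "tail-to-tail (3' alpha to 3' beta)",
    frozenset(("AS4", "BBF4")): "head-to-tail (5' alpha to 3' beta)",
    frozenset(("SS4", "BBR1")): "head-to-tail (5' beta to 3' alpha)",
}


def all_primers_tagged(primers=PRIMERS, site=XHOI_SITE):
    """Check that every primer carries the XhoI site."""
    return all(site in seq for seq in primers.values())


def infer_orientation(pairs_with_product):
    """Map the primer pairs that yielded a PCR product to orientations."""
    return [PAIR_TO_ORIENTATION[frozenset(pair)] for pair in pairs_with_product]


if __name__ == "__main__":
    print(all_primers_tagged())
    # Only the AS4/BBR1 reaction gave a product for both species.
    print(infer_orientation([("AS4", "BBR1")]))
```

A product from a single pair identifies the arrangement directly; products from more than one pair would point to multiple gene copies or nonspecific priming rather than a single linkage.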
Construction of globin promoter/luciferase reporter vectors
Direct subcloning of the entire N. coriiceps or N. angustata α/β-globin intergenic regions6 from lambda phage to a plasmid reporter vector proved impossible because the intergenes contain multiple recognition sites for most common restriction endonucleases. Therefore, we employed a two-step, PCR-based protocol to introduce the intergenic regions, flanked by segments of the first exons of the α- and β-globin genes, into the firefly luciferase reporter vector pGL3-Basic (Promega). (To assess the activities of both the α-globin and the β-globin promoter elements, each intergene was subcloned in both orientations into the reporter vector, and the globin exon segments were fused in frame, through elements of the multiple-cloning sites of the vector, to the initiator codon of the luciferase gene.) The head-to-head PCR products (see previous section) were subcloned independently into the pGEM-T (Promega) plasmid vector. The N. coriiceps insert was excised from the intermediate pGEM-T subclone by digestion with XhoI and then ligated in both orientations into the XhoI site of pGL3-Basic to yield two constructs, pNcα5 and pNcβ7, with the α- or β-globin promoter elements, respectively, immediately upstream of the luciferase reporter gene. Similarly, the N. angustata intergene was excised from pGEM-T subclones of opposite orientation by digestion with NcoI and NheI, and the two products were ligated directionally into NcoI- and NheI-digested pGL3-Basic to produce two constructs, designated pNaα1 and pNaβ9, with the α- or β-globin promoters, respectively, proximal to the luciferase gene. Preliminary transfection assay of the four N. coriiceps and N. angustata constructs in MEL cells (see below) demonstrated substantial promoter activity.
The first generation reporter constructs expressed firefly luciferase fused at its N terminus to oligopeptides encoded by segments of the first exons of the α- and β-globin genes and by a short stretch of vector DNA between the multiple cloning site and the luc+ initiator codon. To eliminate the fusion oligopeptides, which may have altered the enzymatic activity of the reporter, a second generation of constructs was engineered by PCR. The sense primer for amplification of each of the four original clones was 5' ATGGAAGACGCCAAAAACATAAAG 3', which encodes the first 8 residues of firefly luciferase. The antisense primers were designed to bind to the intergenic termini contiguous with the stretches of polylinker/globin sequences to be deleted. For pNcα5 and pNcβ7, the antisense primers were 5' CTTGCTTGCTTGCTTATTTCTTAGTGACC 3' and 5' TTATTTTTGACACCAGACCAACTG 3', respectively. The second primer for pNaα1 was 5' CTTGCTTGCTTATTTCTTCGTGAC 3', whereas that for pNaβ9 was identical to the antisense primer used with pNcβ7. PCR was performed using Advantage™ KlenTaq for 25 cycles (cycling profile available on request). Following PCR, each reaction was treated with DpnI (10 U, 30 min, 37°C) to eliminate the methylated parental plasmid template (Geier and Modrich, 1979). Amplified DNAs were ligated intramolecularly (Sambrook et al., 1989). The resulting plasmids, which drive expression of bona fide luciferase from the α1- and β-globin promoters of the N. coriiceps and N. angustata intergenic regions, were designated pNcorPα1, pNcorPβ, pNangPα1, and pNangPβ, respectively.
Deletion mutagenesis of the globin promoters
To evaluate the promoter and enhancer elements of the N. coriiceps intergenic regions, pNcorPα1 and pNcorPβ were subjected to deletion mutagenesis. Three pairs of mutants were generated by restriction endonuclease digestion. pNcorPα1ΔHindIII and pNcorPβΔHindIII were generated by excision of a 2.9-kb fragment from the intergenic region of the parental plasmids by digestion with HindIII, followed by treatment of the deleted DNAs with T4 DNA polymerase to create blunt ends and with T4 ligase to reconstitute circular plasmids. Similarly, pNcorPα1ΔEcoRI/SpeI and pNcorPα1ΔEcoRI/AflII, and the corresponding β variants, were produced by digestion of the full-length plasmids with the pairs of restriction enzymes indicated.
Three other pairs of mutants were obtained by PCR. pNcorPα1Δ1.1kb and pNcorPβΔ1.1kb, which lack the 1.1-kb sequence element unique to the N. coriiceps intergenic region, were produced by amplification of the parental plasmids pNcorPα1 and pNcorPβ, respectively, using the primers 5' TAGAAATGAAGTGTATTATTTTTTAAATGC 3' and 5' ATTTGTGTTTATTACACTTAATTTATAATG 3'. Similarly, pNcorPα1Δ1,333, which lacks the 1,333 nucleotides contiguous with the start codon of the α1-globin gene, was produced by amplification of its parental plasmid using the primers 5' GTTAGTTCCAGGTTTAATATGTGC 3' and 5' ATGGAAGACGCCAAAAACATAAAG 3'. Its counterpart, pNcorPβΔ1,333, was obtained using the primer pair 5' GTTAGTTCCAGGTTTAATATGTGC 3' and 5' ATGAGTCTCTCCGACAAAGAC 3'. Finally, the constructs pNcorPα1ΔENH and pNcorPβΔENH, whose deletions cover nucleotides 1,882 to 3,204, were amplified from their respective Δ1.1kb templates, pNcorPα1Δ1.1kb and pNcorPβΔ1.1kb, using the primer pair 5' CTGTGCTGTATCCAAGACACTG 3' and 5' GTTGATGATACAATTCCACGAGC 3'. The PCR-generated constructs were ligated intramolecularly.
Fidelity and validation of PCR-generated constructs
The second generation reporter constructs, and some of the deletion mutants thereof, were produced by PCR, which could introduce point mutations into plasmid and/or intergenic insert sequences. To minimize this possibility, we used the Advantage™ KlenTaq polymerase mix (CLONTECH), which contains a proofreading polymerase that reduces the error rate to ~10⁻⁵ bp⁻¹ (Barnes, 1994). Thus, the probability that mutations were introduced into our ~6–9 kb constructs is low. Furthermore, the structures of all deletion constructs were validated by diagnostic restriction mapping and by sequencing across their ligation junctions. Two or three independent clones of each full-length or deletion construct were used for the promoter assays.
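As a rough plausibility check on the error-rate argument, the expected number of PCR-introduced mutations per construct can be estimated with a simple Poisson model. This is a back-of-the-envelope sketch, not the authors' calculation: it assumes the quoted rate of about 10⁻⁵ per bp applies to the whole amplified molecule as a single overall figure, and the function names are hypothetical.

```python
import math


def expected_mutations(error_rate_per_bp, length_bp):
    """Expected number of mutations in one amplified molecule."""
    return error_rate_per_bp * length_bp


def prob_at_least_one_mutation(error_rate_per_bp, length_bp):
    """Poisson probability that a molecule carries one or more mutations."""
    lam = expected_mutations(error_rate_per_bp, length_bp)
    return 1.0 - math.exp(-lam)


if __name__ == "__main__":
    rate = 1e-5  # assumed overall error rate per bp
    for kb in (6, 9):
        p = prob_at_least_one_mutation(rate, kb * 1000)
        print(f"{kb} kb construct: P(at least one mutation) ~ {p:.3f}")
```

Under this assumption fewer than one clone in ten would be expected to carry a mutation, which is consistent with the practice of assaying two or three independent clones of each construct.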
Sequence analysis of the globin gene complexes
Recombinant plasmids containing the first-generation PCR-amplified globin gene complexes of N. coriiceps and N. angustata, and restriction-fragment subclones thereof, were sequenced manually on both strands by use of the dideoxynucleotide chain-termination method (Sanger et al., 1977) or by automated DNA sequencing (University of Maine DNA Sequencing Facility). The sequences of the exons and introns of the N. coriiceps β-globin gene, and of the N. angustata α- and β-globin genes, were obtained by automated sequencing of PCR products obtained from the original lambda clones.
Alignment of the nucleotide sequences of the N. coriiceps and N. angustata globin gene complexes, and subregions thereof, was performed by use of the Clustal method provided by DNASTAR MegAlign. DNA sequence relatedness was calculated as the Martinez-Needleman-Wunsch similarity index, also implemented by DNASTAR MegAlign, using the default parameters (minimum match = 9, gap penalty = 1.10, gap length penalty = 0.33). Potential promoters upstream of the globin coding sequences were predicted using the program NNPP (Promoter Prediction by Neural Network, available at http://www-hgc.lbl.gov/projects/promoter.html) (Reese et al., 1996). The relationship of the repetitive elements of the N. coriiceps intergenic region to mobile genetic elements was investigated using the Repbase database (http://www.girinst.org) and the program RepeatMasker (A. F. A. Smit and P. Green, RepeatMasker available at http://ftp.genome.washington.edu/RM/RepeatMasker.html).
GenBank data deposition
The sequences of the N. coriiceps and N. angustata α/β-globin gene complexes reported in this paper have been deposited in the GenBank database under the accession numbers AF049916 and AF187046, respectively. The sequences of the complexes have been scanned against the GenBank database using the BLASTN program (National Center for Biotechnology Information) to identify sequences with significant relatedness (see Results).
Transient transfection assay of promoter/reporter plasmids in murine erythroleukemia (MEL) cells
The promoter and enhancer elements of the fish α/β-globin intergenic regions were assayed in the hematopoietic microenvironment provided by differentiated MEL cells. Plasmids were grown in E. coli XL-1 Blue MRA cells (Stratagene) and were prepared for transfection by the Wizard Plasmid Midiprep protocol (Promega). MEL cells were cultured in DMEM + 10% FBS at 37°C in a humid, 5% CO2 atmosphere. Experimental plasmid constructs (3 µg in 100 µl sterile water), doped with 25 ng of the Renilla (sea pansy) luciferase reporter plasmid pRL-SV40 (Promega), were mixed 1:1 (v/v) with 10 mg/ml DEAE-dextran. The plasmid/DEAE-dextran solution was added to MEL cells (2 ml at 2.5 × 10⁶/ml in DMEM + 10% FBS), and the mixture was incubated at 37°C for 1 hr (5% CO2). The cells were collected by centrifugation (IEC clinical centrifuge, speed 3, 3 min, room temperature), "shocked" by resuspension in 1 ml of 10% DMSO in PBS, diluted by addition of 10 ml of culture medium, and centrifuged again (IEC clinical centrifuge, speed 4, 4 min, room temperature). To induce hematopoietic differentiation (Watanabe and Oishi, 1987), the cells were resuspended in 5 ml DMEM + 10% FBS + 1.7% DMSO and then cultured for 72 hr at 37°C in a humid, 5% CO2 atmosphere. Cell extracts, prepared by the manufacturer's protocol, were assayed in duplicate or triplicate for both firefly and Renilla luciferase activities by use of the Promega Dual Luciferase Reporter Assay System and an Optocomp I luminometer (MGM Instruments). Two or four independent transfections were performed for each reporter construct. To control for variable transfection efficiencies, firefly luciferase activities were normalized with respect to the activity of the Renilla enzyme.
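The normalization step described above can be sketched as follows. This is a minimal illustration with made-up luminometer readings; the function names are hypothetical, and averaging the per-replicate ratios is one reasonable choice, not necessarily the authors' exact procedure.

```python
from statistics import mean


def normalized_activity(firefly, renilla):
    """Mean firefly/Renilla ratio over paired replicate readings."""
    if len(firefly) != len(renilla):
        raise ValueError("firefly and Renilla readings must be paired")
    return mean(f / r for f, r in zip(firefly, renilla))


def percent_of_reference(construct_ratio, reference_ratio):
    """Express a normalized activity as a percentage of a reference construct."""
    return 100.0 * construct_ratio / reference_ratio


if __name__ == "__main__":
    # Hypothetical duplicate readings (relative light units).
    test_ratio = normalized_activity([1200.0, 1000.0], [100.0, 100.0])
    wild_type_ratio = normalized_activity([2300.0, 2100.0], [100.0, 100.0])
    print(test_ratio, wild_type_ratio)
    print(percent_of_reference(test_ratio, wild_type_ratio))
```

Dividing by the co-transfected Renilla signal removes differences in how much plasmid each dish of cells took up, so constructs can be compared across independent transfections.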
RESULTS
Organization and structure of the adult globin genes of an Antarctic teleost and a temperate congener
Figure 1 presents the organization and salient features of the putative adult globin gene complexes of an Antarctic teleost, N. coriiceps, and of a temperate congener, N. angustata. Each consisted of the α1-globin gene linked in head-to-head (5' to 5') orientation with the gene for β globin. Thus, the α1- and β-globin genes of these complexes would be transcribed in opposite directions. The intergenic region that separates the start codons of the globin coding sequences measured 4.3 kb for the N. coriiceps complex and 3.2 kb for N. angustata. Both the α1-globin genes of the two fishes, and their β-globin genes, were composed of three exons separated by two introns (Fig. 1), and the positions of the introns conformed to the vertebrate norms for globin genes (Liebhaber et al., 1980
Several results indicate that these complexes constitute the functional adult globin gene loci of these fishes. First, the coding sequences and 5'- and 3'-untranslated regions of the α1- and β-globin genes of N. coriiceps matched precisely the corresponding regions of the adult globin cDNAs (Cocca et al., 1995; Zhao et al., 1998) of this species. Second, the deduced primary sequences of the N. coriiceps and N. angustata globins, which differ at 2 positions in the α1 chains and 10 in the β chains, are identical to those established previously by conventional protein sequencing (D'Avino and di Prisco, 1989; Fago et al., 1992). Finally, the globin genes of the two nototheniids possessed none of the structural features present in typical globin pseudogenes (Proudfoot and Maniatis, 1980; Lacy and Maniatis, 1980) or in processed pseudogenes (Vanin, 1985).
Detailed comparison of the two gene complexes reveals a remarkable degree of sequence identity. Both the α1- and β-globin genes of the N. coriiceps complex, including exons and introns, and the intergenic region separating the genes are strikingly similar in nucleotide sequence to their counterparts from N. angustata, the major difference being the absence of a 1.1-kb segment in the intergenic region of the latter (Figs. 2 and 7). After alignment, the two intergenic regions were found to be 65% similar in their entirety, and the score rose to 88% when the "extra" 1.1-kb segment of the N. coriiceps intergene was eliminated from the comparison. Although large, the intergenic regions of N. coriiceps and N. angustata did not contain significant open reading frames, which is consistent with the absence of embedded transcription units. The intronic splice junctions of the globin structural genes followed the GT/AG rule (Breathnach and Chambon, 1981; Keller and Noon, 1984; Padgett et al., 1986). However, the α introns were much longer (I1 = 450 and 434 bp, I2 = 278 and 275 bp, for N. coriiceps and N. angustata, respectively) than were those of the β genes (I1 = 91 bp, I2 = 87 bp for both species).
To determine whether the N. coriiceps and N. angustata globin intergenes shared regulatory or structural elements with the globin complexes of other vertebrates, we scanned the nototheniid sequences against those of globin gene complexes from human (acc. nos. AF064190, AF149718, and AF137396), mouse (acc. nos. AF071080, AF037169, X66475, X66476, Z13985, and U08220), and other fishes (Atlantic salmon, acc. no. X97287; carp, acc. no. AB004740; zebrafish, acc. no. U50382). No large-scale sequence similarities (>20 bp) were found, even with respect to the comparably organized globin complexes of the three fishes. However, several clusters of erythroid regulatory motifs characteristic of the globin gene enhancers of higher vertebrates were detected in the nototheniid intergenes (see Enhancer identification by phylogenetic footprinting and Discussion).
Promoter/enhancer motifs proximal to the globin genes
The intergenic regions of the two fishes were first analyzed for the presence of basal and erythroid-specific promoter or enhancer elements (TATA, CAAT, GATA, Sp1,7 CACCC,7 EKLF,7 NF-E2, etc.; refs. Myers et al., 1986; Orkin, 1990; Bucher, 1990; Strauss and Orkin, 1992; Eleouet and Romeo, 1993; Andrews et al., 1993a, b) that were proximal to (within ~500 bp) the globin genes. We mapped previously three transcription initiation (capping) sites (see bent arrows, Fig. 7) upstream of the N. coriiceps α1-globin coding sequence (Zhao et al., 1998), the first two at 38-bp (CAP 1) and 380-bp (CAP 2) upstream of the initiator ATG probably constituting the major sites of pol II initiation based on the size heterogeneity of the α1-globin mRNAs (Cocca et al., 1995). Potential TATA and CCAAT boxes were located near the two cap sites, as well as CACCC and GATA (consensus WGATAR) motifs (for details, see Zhao et al., 1998). One NF-E2 element (YTGCTGASTCAY) preceded CAP 1. These general and hematopoietic promoter elements were conserved in their entirety upstream of the N. angustata α1-globin coding sequence, which demonstrates that the proximal α-globin promoter architecture has not changed despite the evolutionary divergence of the two nototheniid fishes.
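Consensus motifs such as WGATAR and YTGCTGASTCAY are written in IUPAC ambiguity codes (W = A or T, R = A or G, Y = C or T, S = C or G). The following is a minimal sketch of how such motifs might be located on both strands of a sequence; it is not the tool used in the paper, the function names are hypothetical, and the test sequence is invented.

```python
import re

# IUPAC codes needed for the motifs quoted in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "R": "[AG]", "Y": "[CT]", "S": "[CG]",
}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[base] for base in consensus.upper())


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(seq, consensus):
    """Return sorted (0-based start on the given strand, strand) hits."""
    seq = seq.upper()
    # Lookahead so that overlapping matches are also reported.
    pattern = re.compile("(?=(%s))" % consensus_to_regex(consensus))
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    rc = reverse_complement(seq)
    for m in pattern.finditer(rc):
        # Convert the position on the reverse strand to forward coordinates.
        hits.append((len(seq) - m.start() - len(m.group(1)), "-"))
    return sorted(hits)


if __name__ == "__main__":
    demo = "CCTGATAAGG"  # invented sequence containing one WGATAR site
    print(find_motif(demo, "WGATAR"))
    print(find_motif(demo, "YTGCTGASTCAY"))
```

Scanning both strands matters here because enhancer motifs such as GATA and CACCC function in either orientation, and the CAC box is commonly listed as both GGTGG and its reverse complement CCACC.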
The DNA sequences immediately upstream of the β-globin coding regions of the N. coriiceps and N. angustata gene complexes also contained conserved erythroid promoter and enhancer elements. In each case, a noncanonical TATA box (beginning 32-bp upstream of the predicted transcription initiation site CAP 18) was preceded by a potential EKLF motif (−86 to −94 relative to CAP 1), a G-rich region that may constitute a second EKLF element (−45 to −55) (Klevit, 1991
-globin gene (Zhao et al., 1998
Repetitive elements of the N. coriiceps intergenic region
The intergenic region of the N. coriiceps complex contained two sets of repetitive sequences that were interspersed with stretches of unique sequence (Fig. 2). Near the center of the region, duplication of an ~550 bp element (composed of the three smaller segments A, B, and C) gave rise to the direct repeats NcDR1 (1,467–2,016 bp upstream of the α1-globin start codon) and NcDR2 (nucleotides 2,603–3,176 relative to the same reference point). A second, but unrelated, tripartite repeat, NcIR2 (~420 bp), was found to be parsed in inverted orientation (NcIR1-1, -2, and -3) upstream of the α1-globin promoter region (Fig. 2). The shorter N. angustata intergenic region contained sequence elements (NaDR, NaIR) orthologous to NcDR1 and NcIR1 but lacked large-scale repetitive structure due to the absence of their companions (contained within the "extra" 1.1-kb DNA fragment of the N. coriiceps intergene). The novel structure of the N. coriiceps intergenic region raised the possibility that transcription of its associated α- and β-globin genes might be facilitated, relative to the N. angustata complex, by the presence of additional cis-acting regulatory elements contributed by the repetitive domains (Fig. 2).
Alignment of the N. coriiceps direct repeats with the corresponding region from N. angustata revealed a complex pattern of sequence interrelationships (Fig. 3). Subregions A, B, and C of NcDR1 were more similar to their counterparts from N. angustata (92–97%) than they were to NcDR2-A, -B, and -C (84–95%). However, the A and B subregions of NcDR2 and NaDR, but not of NcDR1, were separated by virtually identical 23-nucleotide spacers. Such intermixing of structural affinities suggests that this ~550-bp element experienced a complex evolutionary history (see Discussion).
The sizes of the direct and inverted repeats of N. coriiceps (~550 and ~420 bp, respectively), and of their subregions, are consistent with the possibility that they might be short interspersed repetitive elements (SINEs) that have duplicated in the globin intergenic region by retroposition. Using the program RepeatMasker, we compared these repeats to SINEs of the 7SL RNA-derived and tRNA-derived classes and to other transposable genetic elements found in Repbase. No similarities or signature motifs were detected.
Divergent promoter and bidirectional enhancer activity of the globin intergenic regions
The ability of the N. coriiceps and N. angustata intergenes to drive gene transcription in a hematopoietic microenvironment was assessed by cloning each region into the luciferase-reporter vector pGL3-Basic and transfecting the constructs into MEL cells.9 Four independent constructs (Fig. 4) were engineered such that transcription of the luc+ gene would be controlled by the α1- and β-globin promoters of each fish. Figure 5 shows that the N. coriiceps intergenic region in both orientations supported high-level expression of luciferase, but only following induction of the erythroid phenotype. The luciferase levels produced by the plasmids pNcorPα1 and pNcorPβ approached those driven by the strong SV-40 promoter/enhancer of the positive control, pGL3-SV40, and were ~40-fold greater than those obtained with the promoterless and enhancerless negative control, pGL3-Basic. The N. angustata promoters, by contrast, were significantly less active in the MEL cell hematopoietic environment, with pNangPα1 and pNangPβ driving the synthesis of luciferase to 15 and 45%, respectively, of levels obtained with the α1- and β-globin promoters of the Antarctic notothenioid (p < 0.001 for the standard error of the difference between the α promoters, p < 0.05 for the β promoters, two-tailed t-test). These results suggest that the intergenic region of N. coriiceps contains promoter and/or enhancer elements that are absent in the N. angustata intergene.
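The percentages reported here correspond to the fold differences quoted elsewhere in the paper (about 6-fold for the α orientation and 2-fold for the β orientation). A short sketch of the conversion, with a hypothetical function name:

```python
def fold_difference(percent_of_reference):
    """Fold by which the reference exceeds a construct at the given percent."""
    if percent_of_reference <= 0:
        raise ValueError("percent must be positive")
    return 100.0 / percent_of_reference


if __name__ == "__main__":
    # Percent of N. coriiceps activity reported for the N. angustata promoters.
    for name, percent in (("alpha1", 15.0), ("beta", 45.0)):
        print(name, round(fold_difference(percent), 1))
```

Activities of 15% and 45% of the N. coriiceps levels thus imply that the Antarctic intergene is roughly 6.7-fold and 2.2-fold stronger, respectively, which the text rounds to 6-fold and 2-fold.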
We began to dissect the promoter and enhancer elements of the N. coriiceps intergenic region by examination of the orientation dependence of luciferase expression from plasmid deletion constructs. The obligatory requirement for unidirectional promoter elements vicinal to the α1- and β-globin coding sequences was demonstrated by constructs pNcorPα1Δ1,333, which removed the three α1-globin transcription initiation sites (Figs. 6, 7), and pNcorPβΔEcoRI/AflII, which deleted the proximal β-globin candidate promoter motifs (Figs. 6, 7). Figure 6 shows that these constructs gave baseline levels of luciferase activity in contrast to the strong to moderate expression mediated by their counterparts of opposite orientation (pNcorPβΔ1,333 and pNcorPα1ΔEcoRI/AflII, respectively). Confirmation that minimal promoter elements reside near the structural genes was provided by the ΔHindIII and ΔENH constructs, which expressed luciferase at levels 12–15% those of the wild-type constructs. Furthermore, most of the elements required for minimal β-globin promoter activity probably reside in the 117-bp HindIII/AflII fragment that contains the first TATA box, the two putative EKLF motifs, and two GATA sites (compare PβΔHindIII and PβΔEcoRI/AflII promoter activities).
Surprisingly, β-promoter-driven expression from the Δ1,333 construct was substantially greater than that for the wild-type intergene, which suggests that a negative modulatory element of β-globin transcription may reside in the deleted region. Alternatively, the β-promoter may have been more active in the Δ1,333 construct due to relief from opposing α-driven transcription through the plasmid sequences. We consider the second possibility unlikely because the reciprocal ΔEcoRI/AflII constructs showed no evidence of supra-wild-type expression driven from the α promoter when the β promoter was clearly inactive.
Two enhancers were detected within the direct repeats of the N. coriiceps intergenic region (Fig. 7). The orientation-independent participation of a strong enhancer (NcEnh1) located in NcDR1 subregion C was illustrated by the Δ1.1kb and ΔENH constructs (Fig. 6). Removal of the 1.1-kb fragment unique to the N. coriiceps intergenic region (nucleotides 2,042 to 3,208) caused modest decreases (21–40%) in luciferase expression mediated by either the α- or β-globin promoters, but further deletion of nucleotides 1,882 to 2,041 and 3,199 to 3,204 (ΔENH) reduced expression in both orientations to levels (12–13% of wild-type) observed for the basal promoters. (Retention of high-level expression by the Δ1.1kb construct also demonstrated that the differential activities of the wild-type intergenic regions from N. coriiceps and from N. angustata were not caused solely by their different lengths.) The existence of a second, weaker enhancer (NcEnh2) located within NcDR2 subregion C (Figs. 2, 7) that cooperates with the NcDR1-C enhancer to establish wild-type levels of α- and β-globin gene expression was inferred by comparison of luciferase expression driven by plasmids pNcorPβΔEcoRI/SpeI, pNcorPβΔEcoRI/AflII, pNcorPβΔ1.1kb, and pNcorPβΔENH (Fig. 6).
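The sizes of the deleted segments follow directly from their inclusive nucleotide coordinates. The following is a minimal sketch of that bookkeeping, not a reconstruction of how the constructs were designed; the helper `segment_length` and the dictionary layout are hypothetical, and only the two larger spans quoted above are included.

```python
def segment_length(start: int, end: int) -> int:
    """Length in bp of a segment given 1-based, inclusive coordinates."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


# Coordinates quoted in the text (N. coriiceps intergenic region).
segments = {
    "1.1-kb fragment": (2042, 3208),
    "adjacent deleted span": (1882, 2041),
}

lengths = {name: segment_length(*coords) for name, coords in segments.items()}
for name, bp in lengths.items():
    print(f"{name}: {bp} bp")

# The two spans abut: the adjacent span ends one nucleotide before
# the 1.1-kb fragment begins.
contiguous = segments["adjacent deleted span"][1] + 1 == segments["1.1-kb fragment"][0]
print("contiguous:", contiguous)
```

Under these coordinates the 1.1-kb fragment spans 1,167 bp and the adjacent span removed in the enhancer deletion covers a further 160 bp.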
Enhancer identification by phylogenetic footprinting
Because functional analysis indicated that a strong enhancer element, NcEnh1, of the N. coriiceps intergene resided in subregion C of the NcDR1 repeat (Figs. 2, 6, 7), this sequence was examined for motifs that distinguish it from its cognate sequences, NcDR2-C and NaDR-C. Figure 8 shows that the major feature that distinguished NcDR1-C was the presence of four binding sites (GGTGG, reverse complement CCACC) for CAC-binding/Sp1 transcription factors (Talbot et al., 1990; Jarman et al., 1991; Eleouet and Romeo, 1993; Hardison, 1998), three of which are nested (shadowed and boxed text). The overlap of these motifs may create a binding site that possesses enhanced affinity for CAC-binding proteins and/or Sp1-like hematopoietic transcription factors. The proximity of two GATA sites to this CAC/Sp1 complex is compelling because the transcription factor GATA-1 is known to act at erythroid enhancers, in concert with other erythroid (EKLF, NF-E2) transcription factors, to regulate globin gene expression in higher vertebrates (Orkin, 1990, 1995a, b, 1996; Shivdasani and Orkin, 1996; Hardison, 1998). Thus, we propose that the strong enhancer (NcEnh1) of globin gene expression in the cold-adapted nototheniid fish N. coriiceps is the 4 CAC/Sp1, 2 GATA complex located in NcDR1-C (Fig. 2 and Fig. 7, green rectangle). By analogy, we consider it likely that the weaker NcEnh2 enhancer of the N. coriiceps intergene and the presumptive enhancer of the N. angustata intergene (NaEnh) are each composed of the 2 CAC/Sp1 sites and 2 GATA motifs found in their respective subregions NcDR2-C and NaDR-C (Fig. 2 and Fig. 7, orange and yellow rectangles, respectively). In the future, we will test these hypotheses directly by deletion of NcEnh1 from the N. coriiceps intergene, reciprocal insertion of this element into the N. angustata intergene, and functional testing of the recombinants using the MEL cell assay.
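The kind of motif inspection described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function names are hypothetical, the toy sequence is invented for the example and is NOT the NcDR1-C sequence, and the only motifs taken from the text are GGTGG (reverse complement CCACC) and GATA (whose reverse complement is TATC).

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, motif: str) -> list[int]:
    """0-based start positions of a motif on either strand; overlaps are kept."""
    motif = motif.upper()
    rc = reverse_complement(motif)
    # A zero-width lookahead lets overlapping (nested) sites all be reported.
    pattern = re.compile(f"(?=(?:{motif}|{rc}))")
    return [m.start() for m in pattern.finditer(seq.upper())]


# Hypothetical toy sequence for illustration only.
TOY = "AGGTGGTGGCTTATCTCCACCGATAA"

cac_sites = find_sites(TOY, "GGTGG")   # CAC/Sp1 core motif
gata_sites = find_sites(TOY, "GATA")   # GATA motif

print("CAC/Sp1 sites:", cac_sites)
print("GATA sites:", gata_sites)
```

In the toy sequence the two GGTGG sites at positions 1 and 4 overlap, which is the sense in which sites are described as nested; an ordinary non-overlapping search would report only the first of them.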
| DISCUSSION |
|---|
Adaptation of globin gene transcription in Antarctic teleosts by enhancer expansion and optimization in the globin intergenic region?
The cooling of the Southern Ocean over the past 25 million years has played a dominant selective role in the evolution of the Antarctic fish fauna (Eastman, 1993). As a first step toward understanding potential mechanisms of transcriptional adaptation to chronically cold environments, we have compared directly the cis-acting transcriptional regulatory elements that govern the expression of the major adult globins of two closely related notothenioid fishes, the Antarctic yellowbelly rockcod Notothenia coriiceps and its temperate congener, the New Zealand black cod N. angustata. The genes encoding the major adult globin chains, α1 and β, are tightly linked as divergent transcriptional units. Excluding a repeat-containing, 1.1-kb segment present only in N. coriiceps, the intergenic regions that drive transcription of the globin genes of these fishes are remarkably similar in sequence. Conservation of the promoter regions proximal to the α1-globin genes of N. coriiceps and N. angustata, and to their β-globin genes, is pronounced. Although the N. coriiceps proximal promoters may contain additional sites for transcription initiation, this factor alone appears unlikely to explain the ability of the N. coriiceps intergenic region to drive the expression of a reporter gene to high absolute values that significantly exceed those produced by the N. angustata intergene.
In the context of substantial conservation of the globin intergenic regions, the direct repeats of N. coriiceps stand out as potential locations for duplication and optimization of position- and orientation-independent, erythroid-responsive enhancers. Our functional analysis by deletion mutagenesis indicated that the N. coriiceps intergenic region contains two central enhancers, contributed by subregions C of its direct repeats, that interact synergistically to drive high-level reporter expression in an erythroid microenvironment. The enhancers appear to be complexes of closely linked CAC/Sp1 and GATA motifs: 4 plus 2, respectively, for the stronger enhancer NcEnh1, and 2 plus 2 for the weaker NcEnh2. Three of the CAC/Sp1 sites of the stronger enhancer are nested, which may create an "optimized," higher affinity binding region for its associated transcription factor(s), thereby ensuring successful recruitment at low environmental temperature. Lacking the direct repeats, the N. angustata intergene contains a single, slightly divergent copy of the weaker enhancer, consistent with its 2- to 6-fold lower transcriptional activity. We caution that our functional assays of the intergenic regions were performed by necessity in a mammalian hematopoietic cell line at 37°C rather than in analogous fish cell lines or in living fishes at physiological temperature. Nevertheless, the striking erythroid responsiveness of the fish intergenic regions in differentiated MEL cells suggests strongly that the transcriptional activities observed are representative of the functional activity of these regions in vivo.
The proximal promoters that drive Ldh-B gene expression in cold- and warm-adapted populations of the killifish, F. heteroclitus, provide an interesting evolutionary contrast to the adaptational mechanism described for globin gene expression. This TATA-less, Inr-containing (Smale and Baltimore, 1989; Roeder, 1991) promoter possesses multiple Sp1-like motifs and a TFIID-binding TCC repeat that contribute to activation of transcription (Segal et al., 1996). The sequences, positions, and usage of these promoter sites vary between the populations, yet the major cis-acting element that accounts for the differential transcription rates appears to be a repressor motif that is present in Ldh-B alleles common in the southern population and mutated in alleles from the northern group (Schulte and Powers, 1997). Thus, relief from negative regulation is also a viable evolutionary strategy for adjusting transcriptional rates to different thermal environments.
Evolution of notothenioid fishes and their globin intergenic regions
Phyletic diversification within the notothenioid suborder appears to have occurred rapidly during the mid-Miocene approximately 7–15 mya (Bargelloni et al., 1994), concomitant with the approach of the Southern Ocean to its current freezing temperature (Kennett, 1977). Substantial reconfiguration of the notothenioid genome occurred at this time, including evolution of genes encoding antifreeze glycoproteins (Chen et al., 1997; Cheng, 1998b), expansion of tubulin gene families (Parker and Detrich, 1998), and, in the icefish family, loss of globin expression through gene deletion (Cocca et al., 1995; Zhao et al., 1998). The migration of the Antarctic polar convergence northward to encompass New Zealand during the late Miocene and the Pliocene (~3–7 mya) (Kennett, 1968) is thought to have facilitated the dispersal of the cold-adapted notothenioid fauna to lower latitudes (Fago et al., 1992). With the subsequent retreat of the convergence to more southerly latitudes, some of the migrant notothenioid species secondarily reacquired a more temperate phenotype. N. angustata, for example, retains 2–3 antifreeze glycopeptide genes in its genome but expresses the antifreeze glycopeptides only at low, non-protective levels (C.-H. Cheng, personal communication).
How, then, did the adult globin locus of the red-blooded nototheniids, including N. coriiceps and N. angustata, evolve? One plausible scenario is that the ancestral, non-cold-adapted notothenioid of the Oligocene and early Miocene (25–38 mya) possessed a head-to-head linked complex of adult α1- and β-globin genes much like those of modern temperate fishes (McMorrow et al., 1996; Chan et al., 1997; Miyata and Aoki, 1997). We propose that, as the ocean cooled, a major segment of the intergenic region, including NcIR1 and the CAC/Sp1, GATA enhancer-containing NcDR1, duplicated to provide additional regulatory elements necessary to maintain appropriate production of globin mRNAs. Subsequently, the NcDR1 enhancer (NcEnh1) developed the nested CAC/Sp1 sites that we postulate endow it with higher affinity for its transcription factor, and recombinational condensation of the duplicated NcIR1 region gave rise to NcIR2. On its arrival in cold New Zealand waters 3–7 mya, the common ancestor of N. coriiceps and N. angustata would have possessed a globin locus similar to that now extant in N. coriiceps. As the South Pacific re-warmed, recombination-mediated deletion in the diverging predecessor of N. angustata of the DNA segment encompassing NcDR1-C (containing the stronger enhancer NcEnh1) through NcDR2-B restored the globin locus to near that of the original, temperate notothenioid condition (i.e., with a single, weaker enhancer).
Recruitment of repetitive regulatory elements to the N. coriiceps intergene through retroposition of SINES or other mobile genetic elements is an alternative hypothesis to explain the evolution of this region. No similarities of the direct or indirect repeats of N. coriiceps to known transposable elements have been found, but further investigation is warranted before this possibility is discarded.
Given the structural complexity of the N. coriiceps and N. angustata intergenes, their evolution is likely to have been more complicated than the scenario presented. For example, our hypothesis in its simplest form fails to explain the 23-bp A-B spacer shared by NcDR2 and NaDR. Furthermore, intermediate evolutionary states may have been obscured by multiple recombinational events occurring at the globin locus during the span of ~20 million years. That the adult globin locus has been subject to recombinational change is also illustrated by the loss of most of this region in the Antarctic icefishes (Zhao et al., 1998).
Evolution of the globin gene locus control region
Once thought to govern the sequential, stage-specific expression of embryonic, fetal, and adult globin genes during development (Orkin, 1990, 1995b), the locus control regions (LCRs) of the distinct α- and β-globin gene complexes of higher vertebrates are now recognized to function primarily as transcriptional enhancers in their native chromosomal environments (Epner et al., 1998). In this context, the CAC/Sp1- and GATA-containing enhancers of the nototheniid globin intergenic regions resemble incomplete versions of the hypersensitive sites of the LCRs of higher vertebrates. The fish enhancers, like the human β-globin LCR hypersensitive site 2 (HS2) and the human α-globin HS-40 distal control region (Hardison, 1998), contain 2–4 CAC/Sp1 sites in close proximity to 2 GATA motifs. However, the third major motif of mammalian LCR hypersensitive sites, NF-E2/AP-1, is absent in the fish enhancers. Thus, the nototheniid enhancers may represent "primitive" globin LCR hypersensitive sites, to which NF-E2/AP-1 sites were added as the discrete α- and β-globin gene complexes of higher vertebrates evolved.
| ACKNOWLEDGMENTS |
|---|
We gratefully acknowledge the logistic support provided to our Antarctic field research program, performed at Palmer Station and in the seas of the Palmer Archipelago, by the staff of the Office of Polar Programs of the National Science Foundation, by the personnel of Antarctic Support Associates, and by the captains and crews of the R/V Hero and the R/V Polar Duke. We thank Dr. Arthur L. DeVries for his contribution of frozen testis tissue from N. angustata, and we acknowledge Patricia Singer (University of Maine DNA Sequencing Facility) for her excellent technical assistance in automated DNA sequencing. We thank Dr. Chi-Hing Cheng (University of Illinois, Urbana) for permission to cite her unpublished observations of antifreeze glycoprotein genes in N. angustata. This work was supported by National Science Foundation Grants OPP-9120311, OPP-9420712, and OPP-9815381 to H. W. D.
| FOOTNOTES |
|---|
1 From the Symposium Antarctic Marine Biology presented at the Annual Meeting of the Society for Integrative and Comparative Biology, 4–8 January 2000, at Atlanta, Georgia.
2 These individuals contributed equally to the work described here and should be considered joint first authors.
3 Present address of Dr. David T. Lau is Primedica, 2 Taft Court, Rockville, MD 20850.
4 To whom correspondence should be addressed: Dept. of Biology, Northeastern University, 414 Mugar Hall, 360 Huntington Ave., Boston, MA 02115. Tel.: 617-373-4495; Fax: 617-373-3724; E-mail: iceman{at}neu.edu.
5 The abbreviations used are: mya, million years ago; bp, base pair(s); DMEM, Dulbecco's modified Eagle's medium; DMSO, dimethyl sulfoxide; EKLF, erythroid Krüppel-like factor; FBS, fetal bovine serum; Inr, initiator of transcription motif; kb, kilobase pairs; LCR, locus control region; MEL, murine erythroleukemia; NF-E2, nuclear factor erythroid 2; ORF, open reading frame; PCR, polymerase chain reaction; SINES, short interspersed repetitive elements; UTR, untranslated region.
6 The term "intergenic region" is here defined arbitrarily as the DNA sequence located between the initiator codons of each gene pair. By this definition, the intergenic region of each 5' to 5'-linked gene pair encompasses the 5'-untranslated regions present in its α- and β-globin transcripts.
7 Sp1-like transcription factors bind to the GC box, whose consensus is KRGGCKRRK (Faisst and Meyer, 1992), or to th

3') is indicated for each gene. Lengths of sequence components can be estimated from the scale below the bar diagrams





