Integrative and Comparative Biology Advance Access originally published online on July 20, 2006
Integrative and Comparative Biology 2006 46(6):978-990; doi:10.1093/icb/icl022
Identifying exoskeleton proteins in the blue crab from an expressed sequence tag (EST) library
Department of Biology and Marine Biology, University of North Carolina Wilmington, 601 South College Road, Wilmington, NC 28403-5915, USA
Correspondence: E-mail: shafert{at}uncw.edu
| Synopsis |
|---|
A blue crab (Callinectes sapidus) expressed sequence tag project was designed for multiple purposes including discovery of genes for cuticular (exoskeletal) proteins, some of which may regulate mineralization. One of the expression libraries sequenced was from the hypodermis (the epithelium depositing the cuticle). RNAs used for cDNA synthesis were pooled from arthrodial and mid-dorsal hypodermis at both pre-ecdysis and post-ecdysis. This ensured representation from both calcifying and non-calcifying regions and from layers of cuticle deposited both before and after ecdysis. The EST database was mined for cuticular protein sequences in three ways. First, we searched for sequences coding for known cuticle-specific motifs like the Rebers-Riddiford chitin-binding sequence and a motif known only from proteins extracted from mineralized exoskeletons of other decapods. Second, we checked the associated annotations in the EST project for similarity to known cuticular proteins, often from insects. Third, BLAST was used to search the EST data for significant homology to published cuticular protein sequences from other crustaceans. In all, the database contains at least 73 contigs or singlets representing transcripts of cuticular proteins. Forty-five of these distribute among ten clusters of very similar transcripts, possibly representing alternative splicing or recent gene duplications. The rest share less similarity. We have obtained complete sequences for 25 of the transcripts, have produced phylogenetic trees comparing them with similar proteins from insects and other crustaceans, and have determined expression patterns across the molt in calcifying versus non-calcifying cuticle. The combination of homology analysis and gene expression analysis allows us to infer putative functions in cuticle synthesis and calcification.
| Introduction |
|---|
Expressed sequence tags (ESTs) are partial sequences of cDNAs typically produced by rapid single-pass sequencing of randomly chosen clones from a cDNA library. An EST project is a large dataset of such partial sequences from one or more libraries and is thus a representation of the transcriptome of a particular tissue or tissues at a particular stage of development and physiological condition. For organisms without genome projects and thus having as yet poorly described genetic backgrounds, the ease and relative economy of partial sequencing of cDNA libraries can be a highly productive way to generate a great deal of information about functioning genes (Boguski and others 1993
Typically, EST datasets either lead to or derive from the production of microarrays, and so their annotations are useful for characterizing and understanding multi-gene expression patterns (for example Whitfield and others 2002; papers in this Symposium including McClintock and others 2006, and Stillman and Teranishi 2006). In a more targeted approach, however, datasets produced in EST projects can be mined for the presence of transcripts with similarity to known protein sequences that might indicate a relationship to a particular physiological or developmental question (Blackshear and others 2001). This is the approach we report here, where the database produced in a blue crab (Callinectes sapidus) EST project (Coblentz and others 2006), though designed for multiple purposes, has been used to obtain sequence information about proteins deposited in the cuticle. Some of these proteins may regulate exoskeleton mineralization.
To date, 73 different cuticle transcripts have been discovered. We have obtained complete sequences and gene expression patterns for 25 of these. After reviewing the blue crab as a model system for analysis of the control of mineralization and the known cuticle protein sequences from other decapod crustaceans, we summarize the relevant facts about our C. sapidus EST project (Coblentz and others 2006). Next, we describe the approaches used to mine the database for cuticular transcripts. Finally, the general results obtained so far are reviewed in an attempt to illustrate the approach and construct a synthesis of our findings. The actual sequences and the details of the gene expression data are not given either because they are already published (Wynn and Shafer 2005), have been submitted for publication (Faircloth and Shafer in review), or are being prepared for publication (Kennedy and others manuscript in preparation), or because they are preliminary and need additional replication.
| Crustacean exoskeleton as a model system to study matrix deposition and mineralization |
|---|
Biomineralization, the deposition of insoluble inorganic material, is a widespread phenomenon, occurring in organisms as diverse as coccolithophores, sponges, corals, mollusks, crustaceans, echinoderms, and vertebrates (Lowenstam and Weiner 1989
Crustaceans, especially large decapods, are ideal model systems in which to study matrix formation and calcification. Growth in these animals requires that they undergo cyclic shedding of the exoskeleton. Unlike other mineralizing organisms, they must periodically deposit and calcify an entirely new chitin-protein matrix at each molt. The tissue responsible for cuticle synthesis is the hypodermis, an epithelium of only 1 cell layer. The process of exoskeleton calcification is tightly regulated both temporally and anatomically. The temporal regulation is exemplified in the molt cycle, where calcification must closely follow the emergence of the animal from its old exoskeleton (Roer and Dillaman 1984
, 1993
). Preparation for ecdysis begins with apolysis, the separation of the underlying hypodermis from the old exoskeleton. Other events of pre-ecdysis, comprising stages D0, D1, D2, D3, and D4, include partial resorption of the organic and mineral components of the old cuticle and generation of a new epicuticle and exocuticle (the pre-exuvial layers) beneath the old exoskeleton (Drach and Tchernigovtzeff 1967
). The fully formed pre-exuvial layers remain completely uncalcified at this time. Ecdysis is referred to as stage E, after which the animal enters post-ecdysis, referred to as stages A1, A2, B1, B2, C1, C2, and C3. The pre-exuvial layers mineralize after expansion by water uptake and the endocuticle is deposited during post-ecdysis. The endocuticular lamellae mineralize as they appear or very soon thereafter. The deposition of the thin, uncalcified membranous layer signals entry into anecdysis or intermolt, stage C4 (Drach and Tchernigovtzeff 1967
; Roer and Dillaman 1984
, 1993
). Obviously, the timing of cuticular synthesis and the control of the initial CaCO3 accumulation in each layer is critical.
The need for anatomical regulation of exoskeleton calcification by proteins secreted by the hypodermis is evident in the differential calcification of specific body parts. For example, the dorsal carapace and the chelipeds are heavily calcified and become extremely rigid while a thin suture line calcifies less and in a pattern that promotes its systematic rupture at the next molt (Priester and others 2005
). At the other extreme, the cuticle of the joints, gills, and walls of the branchial chamber are uncalcified and remain flexible. Of particular interest to us are the arthrodial membranes, cuticle at each joint that remains completely uncalcified and allows for movement. Arthrodial membrane is similar to hard cuticle in the timing of deposition, in overall thickness of pre-exuvial and post-exuvial layers and in the morphology of the lamellae (Williams and others 2003
). The composition of its organic matrix, however, is considerably different from calcified cuticle (Hepburn and Chandler 1976
), and a comparison of the proteins in the 2 locations can be useful for understanding cuticular mineralization.
The blue crab, C. sapidus, is our organism of choice because of its economic importance to our region and because we are able to obtain integument material at precise time points both pre-ecdysis and post-ecdysis from a soft-shell crab "shedding" operation. The histology, histochemistry, and ultrastructure of cuticular deposition have been studied and key patterns of initial and subsequent mineralization have been described (Elliott and Dillaman 1999
; Williams and others 2003
; Dillaman and others 2005
; Priester and others 2005
). For example, post-ecdysial deposition of calcium carbonate into the exocuticle is now known to proceed from both distal and proximal aspects along septa between polygonal columns (interprismatic septa) (Dillaman and others 2005
). The resulting honeycomb pattern produces a rigid cuticle with a minimum of mineral. Only later do the prisms in-fill with CaCO3. Furthermore, evidence has been obtained indicating that mineral deposited first (in 3–5 h) at the epi-exocuticular boundary and within the interprismatic septa is amorphous calcium carbonate, which only later (8–12 h) is transformed to calcite (Dillaman and others 2005
). Presumably all these fine-scale differences in the location and type of mineral deposited must be under the control of cuticular proteins.
In a previous study (Shafer and others 1995
) we used isolated crab cuticle explants free of cellular material to show that the control of CaCO3 deposition after the molt resides in the organic matrix itself rather than in, for example, the ion-pumping activity of the underlying hypodermis. Dramatic post-ecdysial changes in glycoproteins synchronous with initial mineralization in the exocuticle were documented histochemically (Marlowe and others 1994
) and by lectin blotting following SDS-PAGE (Shafer and others 1994
, 1995
). Subsequently, cuticle proteins were extracted with various solvents at critical times at and after the molt and were tested for their ability to affect calcite nucleation (Coblentz and others 1998
). The molecular weights of the proteins that associated with calcite crystals as they formed in vitro were determined by electrophoresis. The results were used to formulate a model wherein one or more acidic proteins form nucleating sites in the cuticle and glycoproteins present only in pre-ecdysis and for the first few hours post-ecdysis shield these nucleators from the calcium and carbonate ions or otherwise inhibit their function (Coblentz and others 1998
). Recently, the purification of a specific cuticular glycoprotein from ecdysial cuticle provided strong support for this hypothesis (Tweedie and others 2004
). Immunoblot analysis using an antibody produced against the N-terminal amino acid sequence suggested that several changes occur in the glycosylation pattern of this protein, including the loss of the highest molecular mass glycoform, during the first few hours after the molt. Immunohistochemistry showed that the antigen decreases in the exocuticle during early post-ecdysis and, most importantly, that the decrease occurs first in the interprismatic septa beginning as early as 2 h after the molt. Thus, we produced temporal and fine-scale spatial correlations between the loss of a possible inhibitor and the initiation of CaCO3 deposition. This protein, at least in its most heavily glycosylated form, does not exist in the arthrodial membrane either pre-ecdysis or post-ecdysis (T.H. Shafer unpublished observations). We also have identified a glycosidase that appears in the cuticle during the early post-ecdysial hours (Roer and others 2001
) and have produced evidence that N-acetylhexosaminidase activity can alter cuticular glycans and increase cuticle-explant calcifying potential in a manner that mimics in vivo post-ecdysis processes (Pierce and others 2001
).
| Crustacean cuticular proteins directly sequenced or translated from cDNA |
|---|
Since a major objective of what follows is to characterize C. sapidus cuticular proteins based on their similarity with sequenced proteins from other species, it is important to review work on cuticular proteins in other decapods. Andersen and his associates have extracted and directly sequenced proteins from the insoluble matrix of intermolt crustacean cuticles after decalcification in acetic acid. These include 1 protein from the shrimp Pandalus borealis (Jacobsen and others 1994
A study designed to discover transcripts expressed predominantly or exclusively in post-ecdysis tail-fan epidermis using the technique of differential display has produced sequence information on several cuticular proteins in the shrimp Marsupenaeus japonicus. DD9A and DD9B, 2 proteins with similar deduced amino acid sequences, contain partial RR-1 consensus sequences (Watanabe and others 2000
). These transcripts are found in the lateral uncalcified exoskeletal region of the tail blade of shrimp, and not in the calcified medial region. DD5 encodes a cuticular protein that is unusual in that it consists of tandem repeats of units of ~100 amino acid residues, each with an RR consensus sequence (Ikeya and others 2001
). Another differentially expressed M. japonicus transcript identified exclusively in calcified cuticle was originally called DD4 (Endo and others 2000
) but was renamed crustocalcin when additional open reading frame (ORF) sequence was obtained (Endo and others 2004
). The encoded protein contains a Rebers-Riddiford-like motif in close association with a region highly enriched in acidic amino acid residues. The recombinant protein is capable of nucleating CaCO3 in vitro (Endo and others 2004
). The only other protein described from the M. japonicus differential display study is DD1, which contains no recognized cuticle motif (Watanabe and others 2006
).
Two interesting proteins from the calcified cuticle of the crayfish Procambarus clarkii have been sequenced both directly (Inoue and others 2001
, 2004
) and from cDNA (Inoue and others 2003
, 2004
). They are termed calcification-associated peptide-1 and -2 (CAP-1 and CAP-2). Each contains a RR consensus motif flanked by stretches of acidic amino acid residues. They have been shown experimentally to bind chitin and to inhibit the formation of CaCO3 in solution, a known property of proteins that could have either positive or negative effects on crystal formation when associated with an insoluble organic matrix. The CAP-1 and CAP-2 transcripts are only expressed in the P. clarkii tail-fan blade during post-ecdysis, the time of exoskeleton calcification, and are described as candidate calcite-nucleating agents.
| A summary of the blue crab EST project |
|---|
We have produced an EST database from blue crab (C. sapidus) transcripts designed, in part, to facilitate the discovery of cuticular protein sequences in that species (Coblentz and others 2006
All ESTs, whether from gill or hypodermis, were put together in the assembly phase of the project (Coblentz and others 2006
). Paracel Transcript Assembler software organized the sequences into putative transcripts in 2 steps. In the first, ESTs with regions that are highly similar were placed in clusters. Ninety percent of our sequences cluster and the rest are singlets, assumed to represent transcripts sequenced only once in the project. In the second step, sequences within clusters were arranged as contiguous putative transcripts. Ideally, one would expect all the ESTs in a cluster to form a single contig. However, sometimes clusters contain sequences with regions of dissimilarity and then more than one contig forms or else singlets form because they do not contig with the other ESTs in the cluster. Such complex clusters are thought to represent cases of either alternative splicing or transcription from members of recently duplicated gene families. In the end, our data assembled into 883 contigs and 1293 singlets, representing 2176 putative blue crab transcripts. Almost 17% of these are part of complex clusters. There are 363 contigs that contain at least 1 EST from each library, indicating that they are reasonably abundant transcripts in both tissue types. The putative transcripts sequenced uniquely from hypodermal tissue are of the most interest for the discovery of cuticular proteins. There were 578 of these, 216 contigs and 362 singlets.
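The bookkeeping behind these assembly figures can be illustrated with a minimal sketch. It is not the Paracel Transcript Assembler procedure; it only assumes that each putative transcript (contig or singlet) carries a cluster identifier and that a cluster counts as "complex" when it yields more than one putative transcript. The identifiers in the toy example are hypothetical, and unclustered singlets are modeled as clusters of size one.

```python
from collections import Counter

def complex_fraction(cluster_of):
    """Fraction of putative transcripts that belong to complex clusters.

    cluster_of maps each putative transcript ID (contig or singlet) to a
    cluster ID. A cluster is treated as 'complex' when it yields more than
    one putative transcript (an assumption made for this sketch).
    """
    sizes = Counter(cluster_of.values())
    in_complex = sum(1 for cluster in cluster_of.values() if sizes[cluster] > 1)
    return in_complex / len(cluster_of)

# Toy example with hypothetical IDs: cluster "k2" yields two contigs and a singlet.
toy = {
    "contig1": "k1",
    "contig2": "k2",
    "contig3": "k2",
    "singlet1": "k2",
    "singlet2": "k3",
}
toy_fraction = complex_fraction(toy)  # 3 of 5 transcripts sit in a complex cluster

# Totals quoted in the text.
putative_transcripts = 883 + 1293   # contigs + singlets
hypodermis_only = 216 + 362         # hypodermis-only contigs + singlets
print(toy_fraction, putative_transcripts, hypodermis_only)
```

Applied to the real assembly, the same tally would be expected to return a value close to the "almost 17%" reported above.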
Several approaches were used to annotate the sequences and make them publicly available (Coblentz and others 2006
). The 2176 putative transcripts were compared with the NCBI non-redundant protein database using BLAST, and the descriptions of the 3 most similar proteins, when significant, were retained in the database. GOblet software (Hennig and others 2003
) was used to produce and summarize the Gene Ontology (GO) results associated with the significant BLAST hits. This was possible for 32% of the putative blue crab transcripts. For transcripts with no significant BLAST results, known protein domains were revealed using InterPro (European Bioinformatics Institute). The annotation results, as well as the actual sequences of the ESTs and contigs, have been made available to the public on a website (http://firedev.bear.uncw.edu:8080/shaferlab/) that may be searched by sequence or by keyword. Users may also input a sequence to compare with our data using BLAST. The ESTs themselves have all been submitted to NCBI (dbEST) using trace2dbest software (http://www.nematodes.org/PartiGene/; A. Anthony, J. Parkinson, and M. Blaxter, unpublished data) that annotates each sequence with its most similar BLAST result at the time of submission. All of our sequences will also be submitted to the Marine Genomics Project (http://www.marinegenomics.org) where the BLAST annotations will be periodically updated and where other C. sapidus ESTs submitted later can be automatically assembled with these.
| Approaches to "mining" the EST data for cuticle-related transcripts |
|---|
Three strategies have been employed for discovering transcripts for cuticular proteins in the C. sapidus EST database. First, we searched the sequences for the invariant residues of known cuticular protein motifs such as the Rebers-Riddiford chitin-binding sequence and crust-18, the mineralized tissue motif described by Nousiainen and others (1998
These 3 search strategies produced highly overlapping results as might be expected. In all, 73 transcripts likely to code for cuticle proteins were identified. Forty-eight of these fall into 10 complex clusters. This is a very high proportion given that only 16.8% of all transcripts were in complex clusters. It suggests that at least some cuticular proteins are coded by alternatively spliced genes or members of closely related gene families. The other transcripts for possible cuticular proteins are represented by 18 simple contigs and 7 singlets. Since 941 of all the putative transcripts in the EST database are expressed in the hypodermis (363 hypodermis-gill mixed contigs, 216 hypodermis-only contigs, and 362 hypodermis singlets), we estimate that the 73 mRNAs for cuticular proteins represent ~7.8% of the diversity in the hypodermal transcriptome as we have sequenced it.
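The proportions quoted in this paragraph follow directly from the counts given. The short check below restates that arithmetic; the variable names are ours and purely illustrative.

```python
# Counts taken from the paragraph above.
in_complex_clusters = 48
simple_contigs = 18
singlets = 7
cuticular_transcripts = in_complex_clusters + simple_contigs + singlets

# Putative transcripts expressed in hypodermis:
# mixed hypodermis-gill contigs + hypodermis-only contigs + hypodermis singlets.
hypodermal_transcripts = 363 + 216 + 362

share_of_hypodermal = 100 * cuticular_transcripts / hypodermal_transcripts
share_in_complex = 100 * in_complex_clusters / cuticular_transcripts

print(cuticular_transcripts)          # 73
print(hypodermal_transcripts)         # 941
print(round(share_of_hypodermal, 1))  # 7.8
print(round(share_in_complex, 1))     # 65.8, versus 16.8% for all transcripts
```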
So far, we have done additional work on 25 of these potential cuticular protein transcripts. At this time, that effort includes obtaining complete transcript sequences and ascertaining gene expression patterns across the molt cycle in arthrodial membrane versus calcifying tissue hypodermis. Some of these data are published (Wynn and Shafer 2005), some submitted for publication (Faircloth and Shafer in review), and some are about to be submitted (Kennedy and others manuscript in preparation). Gene expression data are from combinations of real-time PCR, northern blotting, and in situ hybridization (ISH). The 4 published expression patterns (Wynn and Shafer 2005) do not include ISH data, so Figure 1 is included as an example of the use of this technique. The transcript is detected in the hypodermis depositing arthrodial membrane, but not in the hypodermis depositing cuticle destined to calcify. It is also not present in muscle or other internal tissues.
| ORFs analyzed to date: sequence similarities and expression patterns |
|---|
Translations of the 25 cuticular transcripts have been analyzed for their relatedness to other known proteins, and those data, when combined with summaries of the patterns of gene expression, are being used to infer possible functions for the encoded proteins. Following the convention used for C. pagurus and H. americanus (Andersen 1998b
The analysis began by using BLAST to obtain all similar protein sequences from both the NCBI non-redundant protein database and the SwissProt/TrEMBL database for the translation of each ORF. The resulting "hits" with E-values < 10⁻⁵ always included some subset of the previously sequenced crustacean cuticular proteins and in some cases various insect proteins. Next, C. sapidus sequences that had overlapping hits in BLAST were placed together with the other arthropod sequences with which they shared similarity, and each of these sets of presumptive homologs was aligned using Clustal X (Thompson and others 1997
), using the default Gonnet series to weight amino acid similarities. Redundant sequences obtained from BLAST were removed, and alignments were trimmed.
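The E-value screening step described above can be sketched as a small filter over BLAST results. The sketch assumes hits were exported in the standard 12-column tab-separated tabular format, with the E-value in the eleventh column; the text does not state which output format was actually used, and the query and subject names below are hypothetical.

```python
import csv
import io

E_CUTOFF = 1e-5  # threshold quoted in the text

def significant_hits(tabular_text, cutoff=E_CUTOFF):
    """Return (query, subject, evalue) tuples for rows with E-value below cutoff.

    Assumes 12-column tab-separated BLAST output with the E-value in
    column 11 (index 10). Comment lines starting with '#' are skipped.
    """
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        evalue = float(row[10])
        if evalue < cutoff:
            hits.append((row[0], row[1], evalue))
    return hits

# Hypothetical rows: query, subject, %identity, length, mismatches, gap opens,
# query start/end, subject start/end, E-value, bit score.
example = "\n".join([
    "\t".join(["CsQuery1", "crust_protein_A", "62.0", "100", "38", "0",
               "1", "100", "5", "104", "3e-20", "95.1"]),
    "\t".join(["CsQuery1", "insect_protein_B", "41.0", "90", "53", "1",
               "4", "93", "10", "99", "2e-06", "48.9"]),
    "\t".join(["CsQuery1", "unrelated_C", "30.0", "60", "42", "2",
               "20", "79", "1", "60", "0.004", "31.2"]),
])

kept = significant_hits(example)
print([subject for _, subject, _ in kept])
```

Only the first two hypothetical subjects pass the cutoff; the third, with an E-value of 0.004, would be excluded from the set of presumptive homologs.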
Bayesian phylogenetic analysis was performed using MrBayes 3.1.2 (Ronquist and Huelsenbeck 2003
). We chose MrBayes in part due to its performance in estimating relationships between proteins over the broad range of evolutionary distances represented in our alignments, due to its modest need for computing time, and also due to the availability of the "protein mixed model" in the software. This is a procedure that efficiently jumps between several alternative empirically derived models of amino acid evolution (each based on large "training sets" of amino acid sequences) to estimate the most appropriate model. We felt this to be superior to selecting a model a priori for these previously unstudied proteins. In each of the 3 large alignments we analyzed, MrBayes selected the WAG model (Whelan and Goldman 2001
) as the best fitting model with a posterior probability of >95%.
To run the Bayesian analysis, we allowed rates to be gamma distributed across residues. We ran a sufficient number of generations for an ~24 h run (500,000 for the crust-18 alignment and 250,000 generations for the RR alignments) and set the burn-in at 10% of the trees saved. Each run was examined to ensure that log likelihoods had converged, which they did well in advance of termination of the run.
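As a rough illustration of what a 10% burn-in means for these runs, the sketch below computes the number of saved and discarded trees. The sampling interval is not stated in the text; the value of 100 generations used here is an assumption, and the function name is ours.

```python
def burn_in_trees(generations, sample_every, burn_in_percent=10):
    """Return (trees_saved, trees_discarded) for an MCMC run.

    trees_saved counts one tree per sampling interval; whether the starting
    tree is also saved depends on the software and is ignored here.
    """
    trees_saved = generations // sample_every
    trees_discarded = trees_saved * burn_in_percent // 100
    return trees_saved, trees_discarded

SAMPLE_EVERY = 100  # assumed sampling interval, not given in the text

crust18_run = burn_in_trees(500_000, SAMPLE_EVERY)
rr_run = burn_in_trees(250_000, SAMPLE_EVERY)
print(crust18_run)  # (5000, 500)
print(rr_run)       # (2500, 250)
```

Under that assumed interval, the consensus for the crust-18 alignment would be built from 4,500 post-burn-in trees and those for the RR alignments from 2,250 each.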
The first group of C. sapidus transcripts consists of the 7 sequences obtained to date that code for RR-1 consensus sequences within their ORFs. When BLAST searches were done on the public protein databases, each of the inferred C. sapidus proteins was found to have a high degree of similarity to the other 6, to all RR-1-containing proteins known from other crustacean species, and to a large number of insect protein sequences containing the RR-1 consensus. Since the number of insect "hits" was quite large and since the proteins included were not all the same when each of the C. sapidus proteins was BLASTed separately, we decided to include in a single alignment only those crustacean and insect sequences that had E-values < 10⁻⁸ for at least 6 of our 7 C. sapidus sequences. Phylogenetic analysis showed that the crustacean RR-1 sequences and the insect RR-1-containing proteins form separate clades (Fig. 2A), except for 1 insect sequence, M. sexta cuticle protein 20 (Suderman and others 2003
), which is basal to the crustacean clade. Four blue crab proteins (CsAMP6.0, CsAMP16.5, CsAMP8.1, and CsAMP13.4) showed strongly supported sister relationship to different C. pagurus proteins. Of particular interest is the fact that the most basal sequence falling within the crustacean clade is CsCP14.0. This protein is also unique among C. sapidus RR-1-containing sequences in being transcribed only pre-ecdysis in calcifying tissue (Fig. 2C) rather than in arthrodial membrane. No other known crustacean RR-1 protein is exclusively from calcifying cuticle, though CpAMP/CP11.4 was isolated from both cuticle types (Andersen 1999
). The 6 more typical C. sapidus RR-1 proteins found in arthrodial membranes, including CsAMP6.0, which shares greatest similarity with CpAMP/CP11.14, are expressed both pre-ecdysis and post-ecdysis (Fig. 2B). This suggests that these flexible cuticle proteins are common to both exocuticle and endocuticle. Presumably they are structural proteins important for any matrix that remains uncalcified. They range from very common transcripts in arthrodial hypodermis to one sequence, CsAMP9.3, which appears to be transcribed only in scattered clusters of arthrodial hypodermis cells at low levels (Faircloth and Shafer in review).
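The inclusion rule used to build the RR-1 alignment (a database sequence is kept only if its E-value is below 10⁻⁸ for at least 6 of the 7 C. sapidus queries) can be written as a simple tally. This is a minimal sketch of that rule, not the procedure actually scripted for the study; the query and subject names and their E-values are hypothetical.

```python
def shared_subjects(evalues_by_query, cutoff=1e-8, min_queries=6):
    """Return subjects whose E-value is below cutoff for at least min_queries queries.

    evalues_by_query maps each query name to a dict of {subject: best E-value}.
    """
    counts = {}
    for hits in evalues_by_query.values():
        for subject, evalue in hits.items():
            if evalue < cutoff:
                counts[subject] = counts.get(subject, 0) + 1
    return sorted(s for s, n in counts.items() if n >= min_queries)

# Hypothetical results for 7 queries.
queries = [f"Cs{i}" for i in range(1, 8)]
toy = {q: {"insectA": 1e-12} for q in queries}   # strong hit for all 7 queries
for q in queries[:5]:
    toy[q]["insectB"] = 1e-10                    # strong hit for only 5 queries
for q in queries[:6]:
    toy[q]["crustC"] = 1e-9                      # strong hit for 6 queries
toy[queries[6]]["crustC"] = 1e-3                 # weak hit for the seventh

selected = shared_subjects(toy)
print(selected)  # ['crustC', 'insectA']
```

A rule of this kind keeps the alignment restricted to sequences that resemble the group as a whole rather than any single member.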
In contrast to the RR-1 sequences, a second group of Rebers-Riddiford motif-containing C. sapidus cuticular proteins produced BLAST search results revealing very few similar proteins in the public databases. These 3 sequences have moderately significant "hits" (E-values < 10⁻⁴) only with each other and with 4 previously sequenced proteins from other crustaceans. CsCP8.2 and CsCP8.5 (Wynn and Shafer 2005
We suggest that these 3 proteins discovered from the blue crab EST dataset are potential calcium carbonate nucleators for the following reasons. First, they are expressed in the dorsal hypodermis that is making cuticle destined to calcify but not in arthrodial hypodermis (Fig. 3). CsCP8.2 and 8.5 are exclusively pre-ecdysis transcripts and therefore likely to code for exocuticular proteins (Fig. 3A), and CsCP6.1 is exclusively post-ecdysis (Fig. 3B) and therefore likely to be endocuticular. Second, CAP-1 and -2 (Inoue and others 2004
) and crustocalcin (Endo and others 2004
) have been shown to either bind calcium or affect in vitro CaCO3 formation and have thus been implicated as important components in controlling the initiation of mineralization in their respective species. CpCP5.75 has not been tested for calcification potential, but it is the clear crab homolog of CsCP6.1. Third, like the other proteins, the 3 C. sapidus sequences contain not only an RR motif not assignable to any of the variants described in insects but also additional regions rich in aspartic acid and glutamic acid residues. These portions of the protein could attract or bind calcium ions. Finally, these sequences show no homology with any proteins found in insect cuticles, where mineralization never occurs. If the 3 blue crab proteins do act as nucleators for CaCO3, the action of CsCP8.2 and 8.5, the exocuticular transcripts, must somehow be inhibited until after ecdysis.
The third group of C. sapidus transcripts discovered by mining the EST data actually has more similarity, as revealed by BLAST, to insect cuticular proteins than to any of the known crustacean cuticular protein sequences. Several other sequences among the 73 putative cuticular transcripts represented in the EST database also appear to have closer BLAST matches with insect sequences than with known crustacean sequences, but so far we have obtained complete ORFs and tentative expression results for only these two. Both of them show unusual expression patterns (unpublished data). CsAMP/CP13.7 is the only gene we have examined to date that is transcribed in arthrodial hypodermis and calcification-associated hypodermis during both pre-ecdysis and post-ecdysis (Fig. 4B). Real-time PCR shows that its expression is 6- to 7-fold higher before the molt. CsAMP23.7 is the only arthrodial transcript analyzed so far that is strictly post-ecdysial (Fig. 4C). Both of these proteins contain reasonably well-conserved Rebers-Riddiford motifs somewhat similar to the RR-2 variant found in insects, yet without the typical RR-2 upstream consensus. CsAMP/CP13.7 shows a degree of similarity to the less-clearly characterized RR-3, which has been recognized as existing in H. americanus proteins as well as in some insect sequences (Andersen 2000). CsAMP23.7 codes for the longest ORF we have sequenced so far in blue crab cuticular transcripts. It contains an unusual arrangement of 3 short motifs each repeated 6 times in the C-terminal portion of the sequence. Some insect cuticular proteins show a similar feature, though the motifs themselves are quite different.
BLAST searches with these 2 crustacean proteins returned a number of insect and other arthropod sequences. Figure 4A shows phylogenetic relationships between the members of this diverse alignment, containing CsAMP/CP13.7, CsAMP23.7, the insect sequences that were found similar to one or the other of them by BLAST (E-value < 10⁻⁴), and HaCP18.8 and HaCP14.2, 2 RR-containing crustacean proteins that also were found similar by BLAST. Interestingly, the crustacean sequences did not form a clade as they did for the RR-1 sequences, suggesting that divergence of these proteins from ancestral forms may have pre-dated the split between insects and crustaceans. In contrast, there is a well-supported chelicerate clade containing 2 sequences from the horseshoe crab Tachypleus tridentatus (Tt) and 2 sequences from the spider Araneus diadematus (Ad). It is possible that a clearer picture of the evolutionary history of the insect and crustacean proteins, and of the motifs they contain, may result when more species can be sampled. The highly supported homology of CsAMP23.7 with a predicted protein from the mosquito, Anopheles gambiae, genome that has not yet been annotated is worthy of note.
Another expected natural group of C. sapidus sequences is one with ORFs coding for crust-18, the motif found in either 2 or 4 copies in several hard-cuticle proteins of H. americanus and C. pagurus (Kragh and others 1997
; Nousiainen and others 1998
; Andersen 1998b
). It is never found in arthrodial membrane proteins or in insect cuticular proteins. When the EST dataset was probed for nucleotides coding for this motif, 10 contigs were found (Kennedy and others manuscript in preparation). Complete transcript sequences were obtained and the translations of the ORFs showed typical signal sequences followed by varying lengths of secreted polypeptides containing between 4 and 14 variant copies of the motif. In each case the crust-18 motifs constituted ~70% of the total length.
In an attempt to understand the differences in motif number among these translations as well as to explain why they differ from the pattern of 2 or 4 motifs found in cuticular proteins extracted from C. pagurus and H. americanus, we examined the amino acid sequences more closely. Between groups of either 2 or 4 motifs was found the short sequence RxKR. Figure 5 is a diagrammatic example of the ORF from one of the transcripts described by Kennedy and others (manuscript in preparation) illustrating this arrangement. These groupings of the basic amino acid residues arginine and lysine are not found in the proteins previously extracted from decapod calcified cuticle and sequenced as polypeptides. They are rather typical of recognition sites for trypsin-like serine proteases. We suggest (Kennedy and others manuscript in preparation) that the transcripts code for pro-proteins that are cleaved after translation into 2-motif or 4-motif peptides that then become resident in the exoskeleton. As evidence for this suggestion, we show that when attempts are made to align the direct-sequenced Cancer proteins with the ORFs of the Callinectes transcripts, high degrees of full-length similarity are always found to sequences of amino acid residues between the groups of argininyl and lysyl residues and never to sequences including 2 or 4 motifs but spanning the putative cleavage sites (Fig. 5). Assuming cleavage at all the basic residue sites in all the pro-proteins, the C. sapidus transcripts discovered become 33 mature cuticular proteins, 7 of which contain 4 crust-18 motifs and the rest 2 motifs. Only 29 of these short peptides are actually different sequences since there are 2 identical pairs and 1 identical group of 3.
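The "virtual cleavage" at RxKR sites described above can be sketched with a regular expression. The sketch makes two assumptions that are ours rather than the paper's: the four site residues are treated as a separator and dropped from the resulting peptides, and the toy sequence is invented rather than taken from any C. sapidus ORF. The final lines restate the peptide counts given in the text.

```python
import re

# R, any residue, K, R: the putative pro-protein cleavage site described above.
CLEAVAGE_SITE = re.compile(r"R.KR")

def virtual_cleave(protein):
    """Split a pro-protein at every RxKR site and return the non-empty pieces.

    The site residues themselves are discarded, which is an assumption of
    this sketch rather than a statement about processing in vivo.
    """
    return [piece for piece in CLEAVAGE_SITE.split(protein) if piece]

# Invented sequence with two RxKR sites (RSKR and RAKR).
toy_proprotein = "AAAAGGGG" + "RSKR" + "CCCCDDDD" + "RAKR" + "EEEE"
peptides = virtual_cleave(toy_proprotein)
print(peptides)  # ['AAAAGGGG', 'CCCCDDDD', 'EEEE']

# Counts from the text: 33 mature peptides, 7 with 4 motifs and the rest with 2.
mature_peptides = 33
two_motif_peptides = mature_peptides - 7
# Two identical pairs and one identical group of three collapse 7 peptides into 3.
distinct_peptides = mature_peptides - (2 - 1) - (2 - 1) - (3 - 1)
print(two_motif_peptides, distinct_peptides)  # 26 29
```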
When each of these virtually cleaved peptides was used in a BLAST search of the public databases, only the known crust-18-containing sequences were returned. In other words, there is never significant similarity to any insect protein. Analysis of the crust-18 alignment (Fig. 6A) revealed a great deal of phylogenetic structure. Apart from CsCP5.2b and HaCP6.3, which are sister sequences, and CsCP5.2a and CpCP11.58, which are basal to the entire tree, there are 2 major clades. Both of these clades contain mixtures of Callinectes, Cancer and Homarus sequences. Crust-18, like the other major classes of cuticle proteins, is a diverse family of proteins with a complex evolutionary history whose diversity of function awaits further study.
Three northern blot probes were designed against either unique or shared parts of several of the 10 crust-18 transcripts. Differences in cDNA lengths allowed us to predict that the probes would detect 1, 2, or 3 bands if all the transcripts were actually present in an RNA sample (Kennedy and others manuscript in preparation). Each probe detected the predicted number and size of RNA bands, and the expression pattern for each was the same. The subset of crust-18-containing transcripts that could be detected on these northern blots was expressed strictly in dorsal (that is, hard cuticle) hypodermis and only post-ecdysis (Fig. 6B).
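The band-count prediction can be illustrated with a short Python sketch. Everything in it is hypothetical: the transcript labels, their lengths, and the probe-to-transcript assignments are invented for illustration, and the sketch assumes that transcripts of identical length co-migrate as a single band.

```python
from typing import Dict, List

# Hypothetical transcript lengths in nucleotides (not data from the study).
LENGTHS: Dict[str, int] = {"t1": 900, "t2": 1400, "t3": 1400, "t4": 2100}

# Hypothetical probes, each listing the transcripts it would hybridize to.
PROBES: Dict[str, List[str]] = {
    "probe_A": ["t1"],              # unique region of a single transcript
    "probe_B": ["t1", "t2", "t3"],  # shared region; t2 and t3 co-migrate
    "probe_C": ["t1", "t2", "t4"],  # shared region, three distinct lengths
}


def predicted_bands(targets: List[str], lengths: Dict[str, int]) -> List[int]:
    """Return the sorted distinct sizes a probe should reveal on a blot."""
    return sorted({lengths[name] for name in targets})


if __name__ == "__main__":
    for probe, targets in PROBES.items():
        bands = predicted_bands(targets, LENGTHS)
        print(probe, len(bands), bands)
```

With these invented inputs the three probes are predicted to show 1, 2, and 3 bands respectively, which is the kind of expectation that can then be compared with the observed blot.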
The final 3 of the 25 transcripts discovered through mining of the EST data and for which sequencing and gene expression pattern analysis are complete are CsCP15.0, CsCP19.0A, and CsCP19.0B (Faircloth and Shafer in review). Their ORFs code for proteins containing 3 copies of the postmolt-18 motif. Though this hydrophobic motif was first described in insects, the BLAST search result for each of these proteins found no significantly similar insect proteins in the public databases. The only related sequences known are the crustacean proteins CpCP14.99, CpCP18.79, HaCP20.2A, and HaCP20.2B. A neighbor joining tree (not shown) suggests that CsCP15.0 and CpCP14.99 are clear orthologs while all other relationships among these crustacean sequences are less obvious. The C. sapidus transcripts are expressed principally, if not exclusively, in hypodermis that is making calcifying cuticle rather than in the arthrodial hypodermis (Fig. 7). Interestingly, CsCP15.0 (Fig. 7A), but not CsCP19.0A or CsCP19.0B (Fig. 7B), is transcribed both pre- and post-ecdysis. It is the only transcript with this particular expression pattern, so the protein for which it codes is unusual among the presumptive hard-tissue peptides we have analyzed in that it may be present in both exocuticle and endocuticle. It could be a key structural protein of all calcifying parts of the exoskeleton.
Only 6 cuticular proteins known from species other than C. sapidus show no apparent sequence similarity to any of the 25 putative transcripts analyzed to date from the EST database. They are the single protein sequenced from P. borealis (Jacobsen and others 1994
| Conclusion |
|---|
We have documented how an expressed sequence tag project from the blue crab (C. sapidus) was designed, in part, to contain a maximal number of partial sequences coding for cuticular proteins and how this dataset is being "mined" to discover and analyze these sequences. Though this is still a work in progress, the data presented here clearly demonstrate that this transcriptome approach is working. When the analysis is complete, this species will be the best-understood decapod in terms of the structural and, perhaps, the regulatory proteins of its cuticle. This will greatly enhance future efforts to elucidate the control of biomineralization in this model system.
| Acknowledgements |
|---|
The authors thank D.L. Mykles and D.W. Towle for the invitation to participate in the 2006 Society for Integrative and Comparative Biology symposium on Genomic and Proteomic Approaches to Crustacean Biology. Special thanks go to long-term collaborators F.E. Coblentz, R.M. Dillaman, R.D. Roer, and D.W. Towle for their multiple contributions. This work was supported by National Science Foundation grant IBN-0114597 and by a Genomics Research Initiative Award from the Office of the President, University of North Carolina.
Conflict of interest: None declared.
| Footnotes |
|---|
From the symposium "Genomic and Proteomic Approaches in Crustacean Biology" presented at the annual meeting of the Society for Integrative and Comparative Biology, January 4–8, 2006, at Orlando, Florida.
| References |
|---|
Addadi, L and S Weiner. 1985. Interactions between acidic proteins and crystals: stereochemical requirements in biomineralization. Proc Natl Acad Sci USA 82:4110–4.
Andersen, SO. 1998a. Amino acid sequence studies of endocuticular proteins from the desert locust Schistocerca gregaria. Insect Biochem Mol Biol 28:421–34.
Andersen, SO. 1998b. Characterization of proteins from arthrodial membranes of the lobster Homarus americanus. Comp Biochem Physiol A 121:375–83.
Andersen, SO. 1999. Exoskeletal proteins from the crab, Cancer pagurus. Comp Biochem Physiol A 123:203–11.
Andersen, SO. 2000. Studies on proteins in post-ecdysial nymphal cuticle of locust, Locusta migratoria, and cockroach, Blaberus craniifer. Insect Biochem Mol Biol 30:569–77.
Blackshear, PJ, WS Lai, JM Thorn, EA Kennington, NG Staffa, DT Moore, GG Bouffard, SM Beckstrom-Sternberg, JW Touchman, MF Bonaldo, MB Soares. 2001. The NIEHS Xenopus maternal EST project: interim analysis of the first 13,879 ESTs from unfertilized eggs. Gene 267:71–87.
Boguski, MS, TM Lowe, CM Tolstoshev. 1993. dbEST--database for "expressed sequence tags". Nat Genet 4:332–3.
Bonaldo, MF, G Lennon, MB Soares. 1996. Normalization and subtraction: two approaches to facilitate gene discovery. Genome Res 6:791–806.
Coblentz, FE, TH Shafer, RD Roer. 1998. Cuticular proteins from the blue crab alter in vitro calcium carbonate mineralization. Comp Biochem Physiol B 121:349–60.
Coblentz, FE, DW Towle, TH Shafer. 2006. Expressed sequence tags from normalized cDNA libraries prepared from gill and hypodermis tissues of the blue crab, Callinectes sapidus. Comp Biochem Physiol D 1:200–8.
Dillaman, RM, S Hequembourg, M Gay. 2005. Early pattern of calcification in the dorsal carapace of the blue crab, Callinectes sapidus. J Morphol 263:356–74.
Drach, P and C Tchernigovtzeff. 1967. Sur la méthode de détermination des stades d'intermue et son application générale aux crustacés. Vie Milieu Ser A Biol Mar 18:595–610.
Elliott, EA and RM Dillaman. 1999. Formation of the inner branchiostegal cuticle of the blue crab, Callinectes sapidus. J Morphol 240:267–81.
Endo, H, P Persson, T Watanabe. 2000. Molecular cloning of the crustacean DD4 cDNA encoding a Ca2+-binding protein. Biochem Biophys Res Commun 276:286–91.
Endo, H, Y Takagi, N Ozaki, T Kogure, T Watanabe. 2004. A crustacean Ca2+-binding protein with a glutamate-rich sequence promotes CaCO3 crystallization. Biochem J 384:159–67.
Faircloth, LM and TH Shafer. In review. Differential expression of eight transcripts and their roles in the cuticle of the blue crab, Callinectes sapidus. Comp Biochem Physiol B submitted.
Hennig, S, D Groth, H Lehrach. 2003. Automated gene ontology annotation for anonymous sequence data. Nucleic Acids Res 31:3712–5.
Hepburn, HR and HD Chandler. 1976. Material properties of arthropod cuticles: the arthrodial membrane. J Comp Physiol 109:177–8.
Ikeya, T, P Persson, M Kono, T Watanabe. 2001. The DD5 gene of the decapod crustacean Penaeus japonicus encodes a putative exoskeletal protein with a novel tandem repeat structure. Comp Biochem Physiol B 128:379–88.
Inoue, H, N Ozaki, H Nagasawa. 2001. Purification and structural determination of a phosphorylated peptide with anti-calcification and chitin-binding activities in the exoskeleton of the crayfish, Procambarus clarkii. Biosci Biotechnol Biochem 65:1840–8.
Inoue, H, T Ohira, N Ozaki, H Nagasawa. 2003. Cloning and expression of a cDNA encoding a matrix peptide associated with calcification in the exoskeleton of the crayfish. Comp Biochem Physiol B 136:755–65.
Inoue, H, T Ohira, N Ozaki, H Nagasawa. 2004. A novel calcium-binding peptide from the cuticle of the crayfish, Procambarus clarkii. Biochem Biophys Res Commun 318:649–54.
Jacobsen, SL, SO Andersen, P Højrup. 1994. Amino acid sequence determination of a protein purified from the shell of the shrimp, Pandalus borealis. Comp Biochem Physiol B 109:209–17.
Kragh, M, L Mølbak, SO Andersen. 1997. Cuticular proteins from the lobster, Homarus americanus. Comp Biochem Physiol B 118:147–54.
Lowenstam, HA and S Weiner. 1989. On Biomineralization. New York: Oxford University Press.
Mann, S. 2002. Biomineralization: Principles and Concepts in Bioinorganic Materials Chemistry. New York: Oxford University Press.
Marlowe, RL, RM Dillaman, RD Roer. 1994. Lectin binding by the crustacean cuticle: the cuticle of Callinectes sapidus throughout the molt cycle, and the intermolt cuticle of Procambarus clarkii and Ocypode quadrata. J Crust Biol 14:231–46.
McClintock, TS, CD Derby, BW Ache. 2006. Physiological genomics of lobster olfaction. Society for Integrative and Comparative Biology. Orlando, Florida.
Nousiainen, M, K Rafn, L Skou, P Roepstorff, SO Andersen. 1998. Characterization of exoskeletal proteins from the American lobster, Homarus americanus. Comp Biochem Physiol B 119:189–99.
Pierce, DC, KD Butler, RD Roer. 2001. Effects of exogenous N-acetylhexosaminidase on the structure and mineralization of the post-ecdysial exoskeleton of the blue crab, Callinectes sapidus. Comp Biochem Physiol B 128:691–700.
Priester, C, RM Dillaman, DM Gay. 2005. Ultrastructure, histochemistry, and mineralization patterns in the ecdysial suture of the blue crab, Callinectes sapidus. Microsc Microanal 11:479–99.
Rebers, JE and LM Riddiford. 1988. Structure and expression of a Manduca sexta larval cuticle gene homologous to Drosophila cuticle proteins. J Mol Biol 203:411–23.
Rebers, JE and JH Willis. 2001. A conserved domain in arthropod cuticular proteins binds chitin. Insect Biochem Mol Biol 31:1083–93.
Roer, RD and RM Dillaman. 1984. The structure and calcification of the crustacean cuticle. Am Zool 24:893–909.
Roer, RD and RM Dillaman. 1993. Molt-related change in integumental structure and function. In Horst, MN and JA Freeman (Eds.). Crustacean Integument. Boca Raton, Florida: CRC Press. pp. 1–36.
Roer, RD, KE Halbrook, TH Shafer. 2001. Glycosidase activity in the post-ecdysial cuticle of the blue crab, Callinectes sapidus. Comp Biochem Physiol B 128:683–90.
Ronquist, F and JP Huelsenbeck. 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19:1572–4.
Rudd, S. 2003. Expressed sequence tags: alternative or complement to whole genome sequences? Trends Plant Sci 8:321–9.
Shafer, TH, RD Roer, C Midgette-Luther, TA Brookins. 1995. Postecdysial cuticle alteration in the blue crab, Callinectes sapidus: synchronous changes in glycoproteins and mineral nucleation. J Exp Zool 271:171–82.
Shafer, TH, RD Roer, CG Miller, RM Dillaman. 1994. Postecdysial changes in the protein and glycoprotein composition of the cuticle of the blue crab Callinectes sapidus. J Crust Biol 14:210–9.
Stillman, J and K Teranishi. 2006. Transcriptome changes during thermal acclimation, acclimatization, and stress in porcelain crabs. Society for Integrative and Comparative Biology. Orlando, Florida.
Thompson, JD, TJ Gibson, F Plewniak, F Jeanmougin, DG Higgins. 1997. The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 24:4876–82.
Suderman, RJ, SO Andersen, TL Hopkins, MR Kanost, KJ Kramer. 2003. Characterization and cloning of three major proteins from pharate pupal cuticle of Manduca sexta. Insect Biochem Mol Biol 33:331–43.
Togawa, T, H Nakato, S Izumi. 2004. Analysis of the chitin recognition mechanism of cuticle proteins from the soft cuticle of the silkworm Bombyx mori. Insect Biochem Mol Biol 34:1059–67.
Tweedie, EP, FE Coblentz, TH Shafer. 2004. Purification of a soluble glycoprotein from the uncalcified ecdysial cuticle of the blue crab Callinectes sapidus and its possible role in initial mineralization. J Exp Biol 207:2589–98.
Watanabe, T, P Persson, H Endo, M Kono. 2000. Molecular analysis of two genes, DD9A and B, which are expressed during the post molt stage in the decapod crustacean Penaeus japonicus. Comp Biochem Physiol B 125:127–36.
Watanabe, T, P Persson, H Endo, I Fukuda, K Furukawa, M Kono. 2006. Identification of a novel cuticular protein in the kuruma prawn Penaeus japonicus. Fish Sci 72:452–4.
Weiner, S and PM Dove. 2003. An overview of biomineralization processes and the problem of the vital effect. In Dove, PM, JJ De Yoreo, S Weiner (Eds.). Reviews in Mineralogy and Geochemistry. Washington, DC: The Mineralogical Society of America. vol. 54: pp. 1–29.
Whelan, S and N Goldman. 2001. A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol 18:691–9.
Whitfield, CW, MR Band, MF Bonaldo, CG Kumar, L Liu, JR Pardinas, HM Robertson, MW Soares, GE Robinson. 2002. Annotated expressed sequence tags and cDNA microarrays for studies of brain and behavior in the honey bee. Genome Res 12:555–66.
Williams, CL, RM Dillaman, EA Elliott, DM Gay. 2003. Formation of the arthrodial membrane in the blue crab, Callinectes sapidus. J Morphol 256:260–9.


70. Crustacean proteins are in boldface, and the C. sapidus sequences are in boldface and italics. Species key: Ag, Anopheles gambiae; Cp, Cancer pagurus; Cs, Callinectes sapidus; Dm, Drosophila melanogaster; Ha, Homarus americanus; Mj, Marsupenaeus japonicus; Ms, Manduca sexta; Tc, Tribolium castaneum. Insect sequences are named using their NCBI protein accession numbers. (B) Summary of gene expression results for CsCAMP6.0, CsCAMP16.3, CsCAMP16.5, CsCAMP8.1, CsCAMP13.4, and CsCAMP9.3. (C) Summary of gene expression results for CsCP14.1.



