The gene encoding tyrosylprotein sulfotransferase (TPST), first discovered in mammals, has been found in the marine mollusk Littorina sitkana. The high conservation of this gene indicates the functional importance of TPST in metabolism across living organisms. The cDNA encoding TPST in the mollusk was cloned and sequenced, and the enzyme was assigned, on the basis of amino acid sequence similarity, as tyrosylprotein sulfotransferase-2 (TPST-2). A putative homology model of the catalytic domain of TPST from L. sitkana was constructed from the crystal structure of the catalytic domain of human TPST-2. The putative model of the dimer showed that each active site involves both monomers and that the dimer contains two active centers.
TPST, tyrosylprotein sulfotransferase; PAPS, 3′-phosphoadenosine-5′-phosphosulfate; LsTPST, tyrosylprotein sulfotransferase from Littorina sitkana
Tyrosylprotein sulfotransferase; TPST; Post-translational modification; Marine mollusk; Littorina sitkana; Homology modeling
Tyrosylprotein sulfotransferases (TPSTs, EC 2.8.2.20) are Golgi-localized type II transmembrane proteins that transfer a sulfuryl group (SO₃²⁻) from the universal sulfate donor 3′-phosphoadenosine-5′-phosphosulfate (PAPS) to the hydroxyl group of a tyrosine side chain (Lee and Huttner, 1983). Tyrosine sulfation is a common post-translational modification of peptides and proteins. TPST activity has been described in many species of animals and plants (Niehrs and Huttner, 1990; Sane and Baker, 1993; Kasinathan et al., 2005; Nishimura and Naito, 2007; Hanai et al., 2000). In 2012, this enzyme was discovered for the first time in a Gram-negative bacterium (Han et al., 2012); tyrosine sulfation had not previously been reported in prokaryotes. Despite the functional importance of these enzymes, almost all known TPSTs are of mammalian origin. Only a few enzymes from invertebrates have been described (from the nematodes Caenorhabditis elegans and Brugia malayi, the insects Culex quinquefasciatus, Drosophila spp. and Anopheles gambiae, the sea squirts Ciona intestinalis and Halocynthia roretzi, the trematode Schistosoma japonicum, and the marine mollusk Crassostrea gigas), mainly via cDNA or genome sequencing.
The sulfation of biomolecules plays an important role in the metabolism of pro- and eukaryotes. The growing scientific interest in the sulfation of different natural compounds stems from the high biological activities of sulfated derivatives and their role in organisms. The known roles of tyrosine sulfation in mammals are diverse and include maintenance of hemostasis, triggering of inflammatory responses, strengthening of leucocyte adhesion, defining the specificity of chemokine receptors, and enhancing the potency of bioactive peptides (Coughtrie et al., 1998; Kehoe and Bertozzi, 2000; Moore, 2003; Seibert and Sakmar, 2008).
Marine hydrobionts are extremely rich in sulfated metabolites with diverse biological activities (Kornprobst et al., 1998). Unfortunately, sulfation and desulfation processes have been studied considerably less than, for example, phosphorylation, and knowledge of the proteins that undergo sulfation in marine organisms is particularly poor. Enzymes from marine habitats frequently have unique properties, and some of them are very useful for biotechnological applications. Enzymes with interesting specificities and high activities were found in the digestive glands of Littorina sitkana. We previously isolated from this mollusk and characterized enzymes that catalyze the hydrolysis and transformation of carbohydrate-containing natural products. These include alginate lyase (Favorov et al., 1979), two forms of fucoidanase (Kusaykin et al., 2003; Bilan et al., 2005), endo-1,3-β-D-glucanase (Pesentseva et al., 2012), β-D-glucosidase (Pesentseva et al., 2008) and sulfatase (Kusaykin et al., 2006). The herbivorous marine gastropod L. sitkana is widespread on the coasts of the Pacific and Atlantic Oceans, and it was of interest to search for the TPST gene in this animal. The crystal structure of human tyrosylprotein sulfotransferase-2 (Teramoto et al., 2013) and the homology model of human TPST-1 (Nedumpully-Govindan et al., 2014) provide additional information about this enzyme.
Thus, the present study was devoted to the search for the tyrosylprotein sulfotransferase gene in the marine gastropod, cloning and sequencing of the cDNA encoding this protein, and 3D-structure homology modeling.
Marine mollusks L. sitkana were collected in the Posieta Bay (northwestern part of the Sea of Japan) in August 2014 near the Marine Experimental Station of the Pacific Institute of Bioorganic Chemistry.
Total RNA was isolated from the liver of L. sitkana with TRIzol Reagent (Invitrogen, USA), and cDNA was synthesized with the Mint cDNA Synthesis Kit (Eurogen, Russia) according to the provided protocols.
Fragments of the TPST cDNA were obtained by PCR using cDNA from L. sitkana as the template. The SU-F1, SU-F2, SU-R2, SU-R3, and SU-R4 primers (Table 1), designed on the basis of conserved peptides, were used for the amplification, which was carried out for 35 cycles (10 s at 95 °C, 15 s at 55 °C, 40 s at 72 °C). The terminal cDNA regions were obtained by rapid amplification of cDNA ends (RACE). This amplification was performed with the SU-5race and SU-3race primers for 38 cycles (10 s at 95 °C, 20 s at 63 °C, 60 s at 72 °C).
The PCR products were cloned with the InsTAclone PCR Cloning Kit from Fermentas (Lithuania) according to the manufacturer's recommendations. Bacterial colonies containing plasmids with the desired insert were screened by PCR using M13 universal primers. Nucleotide sequences were determined with the ABI Prism Big Dye Terminator 3.1 Cycle Sequencing Kit from Applied Biosystems (USA) on an ABI Prism 310 Genetic Analyzer.
The nucleotide and amino acid sequences were analyzed using the programs CHROMAS 2.01 (http://www.technelysium.com.au/chromas_lite.html) and GENERUNNER 3.05. The amino acid sequence was deduced from the nucleotide sequence of the cDNA encoding TPST with EXPASY (http://expasy.org/tools/dna.html). The search for TPST homologs was performed using BLAST2 (http://www.ebi.ac.uk/blastall). Domain architectures were identified with the SMART tool (http://smart.embl-heidelberg.de) and InterProScan (http://www.ebi.ac.uk/Tools/pfa/iprscan/). The multiple sequence alignment was performed with ClustalW2 (http://www.ebi.ac.uk/Tools/clustalw2/index.html).
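The step of deducing an amino acid sequence from a cDNA open reading frame can be illustrated with a minimal Python sketch. It is not the EXPASY tool used in this work; it only assumes the standard genetic code, and the function name `translate_orf` and the toy sequence are hypothetical and not taken from the LsTPST cDNA.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_orf(cdna: str) -> str:
    """Translate an ORF in frame 1 until the first stop codon.

    Unknown or ambiguous codons are reported as 'X'.
    """
    seq = cdna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":  # stop codon terminates translation
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy ORF, used only to exercise the function.
toy_orf = "ATGCCTCGTTCTGGTACTACTCTTATGTAA"
toy_protein = translate_orf(toy_orf)
```

A real analysis would also scan all six reading frames and select the longest open reading frame, which this sketch does not attempt.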
The sequence of LsTPST (GenBank AHJ26006.1 or Uniprot W6FFX7) was used for fold recognition and homology modeling. The LsTPST sequence analysis and fold recognition were carried out with the servers PHYRE2 (http://www.sbg.bio.ic.ac.uk/phyre2/), I-Tasser (http://zhanglab.ccmb.med.umich.edu/I-TASSER/) and FUGUE (http://tardis.nibio.go.jp/fugue/). The construction and analysis of the LsTPST monomer and homodimer models were carried out with the program MOE 2013.08 (Molecular Operating Environment, 2011.10, Chemical Computing Group Inc., 1010 Sherbrooke St. West, Suite #910, Montreal, QC, Canada H3A 2R7, 2013).
The model of LsTPST monomer (residues 47–346) was built with the homology module in MOE 2013.08 using as template the crystal structure of the catalytic domain (residues 52–353) of human tyrosylprotein sulfotransferase-2 monomer (TPST-2; PDB code: 3AP1; resolution 1.9 Å) (Teramoto et al., 2013 ). The model of LsTPST homodimer was generated using optimized structure of LsTPST monomer and human TPST-2 homodimer as template.
The amino acid sequence of the L. sitkana TPST was determined by molecular biology methods. Total RNA isolated from L. sitkana liver was used for first-strand cDNA synthesis. Degenerate oligonucleotide primers were used to amplify cDNA fragments encoding the TPST (Table 1). These primers were designed on the basis of conserved sites in the amino acid sequences of human TPSTs and of partial TPST sequences from marine mollusks found in GenBank. The cDNA fragments obtained by PCR were cloned into a plasmid vector and sequenced. BLAST2 analysis of the nucleotide sequences of these cDNA fragments showed that a fragment of ~750 bp, obtained with primers SU-F2 and SU-R4, was homologous to cDNAs encoding the TPSTs of other animals. Gene-specific primers for amplification of the terminal cDNA regions of the L. sitkana TPST were designed on the basis of the nucleotide sequence of this fragment. Amplification was performed by the modified RACE method (Matz et al., 1999). The PCR gave cDNA fragments of 810 and 1500 bp, which were cloned and sequenced. As a result, we determined the nucleotide sequences of three overlapping cDNA fragments encoding LsTPST. Their overlaps allowed reconstruction of the complete nucleotide sequence of the TPST cDNA. The amino acid sequence of LsTPST was deduced from the nucleotide sequence. Analysis of the cDNA sequence revealed a single long open reading frame of 1224 bp, which encodes a polypeptide of 407 amino acid residues. Its calculated molecular mass is 46.63 kDa. The isoelectric point of this protein is 8.94.
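The relation between the reported ORF length and the polypeptide length can be checked with a short Python sketch. It assumes that the 1224 bp figure includes the terminal stop codon, which the text does not state explicitly; the function name `residues_from_orf` is hypothetical.

```python
def residues_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of encoded residues for an ORF of the given length in bp."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons


n_residues = residues_from_orf(1224)        # ORF length reported in the text
mass_da = 46.63 * 1000                      # calculated mass reported in the text
mean_residue_mass = mass_da / n_residues    # average mass per residue, in Da
```

Under this assumption the 1224 bp frame corresponds to 407 residues, and the reported mass gives an average of about 114.6 Da per residue.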
Analysis of the LsTPST amino acid sequence with the I-Tasser, FUGUE and PHYRE2 servers showed with high confidence that the central part of the protein has a fold similar to that of the hTPST catalytic domain. A model of the full amino acid sequence of LsTPST, constructed by the I-Tasser server, included the cytoplasmic and transmembrane parts and four additional helices at the C-terminus (data not shown). The I-Tasser model with a high C-score had a TM-score of 0.47, which showed the structural similarity between the model of the LsTPST fragment and the structure of the hTPST catalytic domain (3AP1). The LsTPST sequence analysis by the FUGUE and PHYRE2 servers showed with high confidence that the LsTPST fragment had a fold similar to that of the catalytic domain of hTPST (PDB ID: 3AP1).
The alignment of the amino acid sequences of LsTPST and human TPST-2 showed that LsTPST had 64% identity and 77% similarity with the catalytic domain of TPST-2 (Fig. 1). The theoretical model of the spatial structure of the LsTPST catalytic domain (residues 47–346) was obtained with MOE 2013.08 (Fig. 2A). The crystal structure of the catalytic domain of human TPST-2 (PDB code 3AP1_A; Teramoto et al., 2013) was used as the template for modeling the LsTPST structure. Analysis of the obtained model with the MOE program showed that the monomer structure of the LsTPST catalytic domain had the following proportions of secondary structure: β-strands (7.3%), α-helices (48%), and loops and disordered structure (44.7%). The structure of LsTPST was stabilized by 86 hydrogen bonds, 19 ionic bonds, 131 hydrophobic contacts and two disulfide bonds. The monomers of LsTPST and human TPST-2 contained free SH groups, Cys204 and Cys210, in the catalytic domain. Superposition of the LsTPST and human TPST-2 structures showed that the RMSD for the Cα atoms of the theoretical model and the template crystal structure was 0.55 Å. Superposition of the model and template structures also showed that the structures of the active sites and substrate-binding sites were identical (Fig. 2B). The LsTPST dimer was modeled (Fig. 2C) by analogy with the human TPST-2 dimer, which was functionally active.
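The Cα RMSD quoted above is the root of the mean squared distance between equivalent Cα atoms after superposition. A minimal Python sketch of that final calculation follows. It assumes the two coordinate sets are already superposed and paired residue by residue; it does not reproduce the superposition performed in MOE, and the function name `ca_rmsd` and the toy coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def ca_rmsd(model: Sequence[Coord], template: Sequence[Coord]) -> float:
    """RMSD (same units as the input, e.g. angstroms) over paired atoms.

    Both coordinate lists must already be superposed and in the same order.
    """
    if len(model) != len(template) or not model:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xm - xt) ** 2 + (ym - yt) ** 2 + (zm - zt) ** 2
        for (xm, ym, zm), (xt, yt, zt) in zip(model, template)
    )
    return math.sqrt(squared / len(model))


# Hypothetical toy coordinates: every atom is displaced by 0.5 along z.
toy_model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_template = [(0.0, 0.0, 0.5), (1.0, 0.0, 0.5)]
toy_rmsd = ca_rmsd(toy_model, toy_template)
```

In practice the optimal rigid-body superposition (for example by the Kabsch algorithm) is computed first, and the RMSD is then evaluated on the transformed coordinates.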
Sequence alignment of the LsTPST (residues 47–346) and hTPST (residues 52–353) catalytic domains. Secondary structural elements are shown as cylinders (α helix) and arrows (β strand). Disulfide-bonds (C96–C156 and C225–C233 for hTPST; C90–C150 and C219–C227 for LsTPST) are indicated at the bottom. The alignment was conducted using Maestro 9.3 (http://www.schrodinger.com/Maestro/ ).
(A) Structural model of the LsTPST catalytic domain (residues 47–346). The structure of the protein is shown as a ribbon diagram: α-helices (red), β-strands (yellow), coils (white) and turns (blue). (B) Superimposition of the structural model of LsTPST (residues 47–346; yellow) on the template crystal structure of hTPST (residues 52–353; blue) in complex with PAP (magenta) and C4 peptide (green). The LsTPST catalytic residues Arg72 and Glu93 are shown as spacefill. (C) Model of the complex of C4 peptide (P and P′) with the LsTPST homodimer formed by monomers M1 (magenta) and M2 (orange). C4 peptides in the active sites of the M1 and M2 monomers are shown as sticks, and the catalytic residues Arg72 and Glu93 are shown as spacefill.
Both monomers of the dimer were involved in forming each active center of LsTPST, including substrate binding, and two active centers were present in the LsTPST dimer structure. Thus, models of the LsTPST catalytic domain in monomeric and dimeric form were generated, and residues Arg72 and Glu93 were identified as the catalytic residues (Fig. 3).
Superposition of the catalytic sites of TPST2 (white) and LsTPST (dark gray). TPST2 is complexed with PAP (black) and C4 peptide (white), which is shown as ball and stick. Atom colors: red (O), blue (N) and magenta (P).
The primary structure of TPST from the marine mollusk L. sitkana was determined by cDNA sequencing. Comparison of the LsTPST amino acid sequence with the two human TPSTs and the TPSTs of other invertebrate species revealed 70–85% structural homology. It should be noted that the maximal degree of homology, 85%, was with the protein from C. gigas (GenBank EKC35269). The degree of amino acid homology with the TPSTs of vertebrates is also high: 74–81%. Analysis of the LsTPST sequence using the InterProScan server indicates that it contains a catalytic domain of TPST-2 from Arg5 to Gly355.
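Percentages of this kind depend on how matching positions are counted. A minimal Python sketch of percent identity over a pairwise alignment is given below. It assumes that columns containing a gap in either sequence are excluded from the denominator, which is only one of several conventions and is not necessarily the one used by BLAST2 or ClustalW2 here; the function name `percent_identity` and the toy aligned strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 7 of 8 gap-free columns are identical.
toy_identity = percent_identity("PRSGTTLM", "PRSGSTLM")
```

Similarity values additionally count conservative substitutions as matches, which requires a substitution matrix or residue grouping that this sketch does not include.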
The multiple alignment of the amino acid sequences of human TPST-1 (GenBank O60507) and TPST-2 (GenBank O60704) and of TPSTs from several invertebrates (the insects Drosophila melanogaster (GenBank AAM94031) and C. quinquefasciatus (GenBank EDS40555), the oyster C. gigas (GenBank EKC35269), the nematode C. elegans (GenBank O77081) and the sea squirt H. roretzi (GenBank AAM09087)) is presented in Fig. 4.
Sequence alignment of human TPSTs with several species of invertebrates. Conserved residues are highlighted in black (strictly conserved) or in gray (homologous replacement). Residues in the putative transmembrane region are underlined. Symbols indicate PAPS-binding residues (*), catalytic residues (↓), substrate-binding residues (◊), and residues which are involved in stabilization of enzyme molecule during catalysis (▼). All residues are marked for the human TPST-2.
Their comparison revealed the conserved fragment PRSGTTLM. The amino acids of this fragment are involved in the binding of PAPS (Teramoto et al., 2013). The transmembrane region of LsTPST, from Leu12 to Ser30 (Fig. 4), was defined with the InterProScan server (http://www.ebi.ac.uk/Tools/pfa/iprscan/). It coincides with the transmembrane regions of other animal TPSTs (Stone et al., 2009).
The multiple alignment of TPST sequences revealed conserved residues, which allowed us to speculate on their potential functions in LsTPST. Five of the eight cysteine residues (Cys90, Cys150, Cys204, Cys219, Cys227) are conserved, and four of them form two disulfide bonds, suggesting that these bonds are required for stabilization of the monomeric LsTPST molecule. According to the available literature data (Teramoto et al., 2013), Cys31 is a dimer-forming residue of the enzyme. Mutational analysis of human TPST-2 showed that the conserved residues Arg72 and Glu93 are directly involved in catalysis, and that Lys152 and Ser277, which are also conserved, are needed to stabilize the transition state of the enzyme during catalysis (Teramoto et al., 2013). According to site-directed mutagenesis data for human TPST-2, the positively charged conserved residues Arg95, Arg99, Arg116, Lys158, Lys109 and Arg276, as well as Thr192, are part of the substrate-binding site, where they contribute to the recognition of acidic amino acid residues of the acceptor molecule. The region between the transmembrane site and the beginning of the catalytic domain has remarkably low sequence identity and is often referred to as the unstructured “stem”. It has been suggested that this region, in addition to some conserved residues, is responsible for the specificity of the enzyme (Stone et al., 2009).
Since data on tyrosylprotein sulfotransferases in marine mollusks are absent, we compared the amino acid sequence of LsTPST with partial sequences from marine mollusks found in GenBank. This comparison showed 66% identity with an uncharacterized protein of the gastropod Lottia gigantea (GenBank B4BVQ4). As a result of this study, a new source of TPST was found. We have determined the amino acid sequence of the TPST from the marine gastropod L. sitkana. The enzyme was identified as TPST-2 on the basis of the analysis of its primary structure. The high degree of homology between TPSTs from terrestrial vertebrates, insects and marine invertebrates indicates the importance of the sulfation of tyrosine residues for the metabolism of invertebrates and vertebrates. The investigation of TPSTs from marine mollusks is of interest for understanding the functions of these enzymes and of their substrates, namely proteins and peptides. A recent survey of SwissProt data found only 561 experimentally detected sulfated proteins (Khoury et al., 2011). Hence, studies of the metabolic importance of tyrosine sulfation are at their very beginning. We hope that our results will help to identify novel tyrosine-sulfated proteins and to understand the extent and importance of sulfation processes in marine organisms, among others. Also, owing to the high sequence identity, some structural features of LsTPST may be crucial for the TPSTs of other types of shellfish.
This work was supported by the Russian Foundation for Basic Research, grant # 15-04-01004_а. The authors express their gratitude to the Service of Science and Technology of the French Embassy in Moscow for the PhD in co-tutelle fellowship.