Synthetic Gene DataBase
 

Synthetic Gene 231


 

Field Name | Natural Gene | Synthetic Gene
SGDB Gene ID | 205 | 231
GenBank Accession | AY278741 |
GenBank GI | 30027617 |
Gene Name | S protein | S(1190)
Gene Length (bp) | 3768 | 3570
Species | SARS coronavirus | Homo sapiens
Strains | Urbani | HEK293T/17 cells
CDS, Natural Gene (3768 bp):
atgtttattttcttattatttcttactctcactagtggtagtgaccttgaccggtgcacc
acttttgatgatgttcaagctcctaattacactcaacatacttcatctatgaggggggtt
tactatcctgatgaaatttttagatcagacactctttatttaactcaggatttatttctt
ccattttattctaatgttacagggtttcatactattaatcatacgtttggcaaccctgtc
ataccttttaaggatggtatttattttgctgccacagagaaatcaaatgttgtccgtggt
tgggtttttggttctaccatgaacaacaagtcacagtcggtgattattattaacaattct
actaatgttgttatacgagcatgtaactttgaattgtgtgacaaccctttctttgctgtt
tctaaacccatgggtacacagacacatactatgatattcgataatgcatttaattgcact
ttcgagtacatatctgatgccttttcgcttgatgtttcagaaaagtcaggtaattttaaa
cacttacgagagtttgtgtttaaaaataaagatgggtttctctatgtttataagggctat
caacctatagatgtagttcgtgatctaccttctggttttaacactttgaaacctattttt
aagttgcctcttggtattaacattacaaattttagagccattcttacagccttttcacct
gctcaagacatttggggcacgtcagctgcagcctattttgttggctatttaaagccaact
acatttatgctcaagtatgatgaaaatggtacaatcacagatgctgttgattgttctcaa
aatccacttgctgaactcaaatgctctgttaagagctttgagattgacaaaggaatttac
cagacctctaatttcagggttgttccctcaggagatgttgtgagattccctaatattaca
aacttgtgtccttttggagaggtttttaatgctactaaattcccttctgtctatgcatgg
gagagaaaaaaaatttctaattgtgttgctgattactctgtgctctacaactcaacattt
ttttcaacctttaagtgctatggcgtttctgccactaagttgaatgatctttgcttctcc
aatgtctatgcagattcttttgtagtcaagggagatgatgtaagacaaatagcgccagga
caaactggtgttattgctgattataattataaattgccagatgatttcatgggttgtgtc
cttgcttggaatactaggaacattgatgctacttcaactggtaattataattataaatat
aggtatcttagacatggcaagcttaggccctttgagagagacatatctaatgtgcctttc
tcccctgatggcaaaccttgcaccccacctgctcttaattgttattggccattaaatgat
tatggtttttacaccactactggcattggctaccaaccttacagagttgtagtactttct
tttgaacttttaaatgcaccggccacggtttgtggaccaaaattatccactgaccttatt
aagaaccagtgtgtcaattttaattttaatggactcactggtactggtgtgttaactcct
tcttcaaagagatttcaaccatttcaacaatttggccgtgatgtttctgatttcactgat
tccgttcgagatcctaaaacatctgaaatattagacatttcaccttgctcttttgggggt
gtaagtgtaattacacctggaacaaatgcttcatctgaagttgctgttctatatcaagat
gttaactgcactgatgtttctacagcaattcatgcagatcaactcacaccagcttggcgc
atatattctactggaaacaatgtattccagactcaagcaggctgtcttataggagctgag
catgtcgacacttcttatgagtgcgacattcctattggagctggcatttgtgctagttac
catacagtttctttattacgtagtactagccaaaaatctattgtggcttatactatgtct
ttaggtgctgatagttcaattgcttactctaataacaccattgctatacctactaacttt
tcaattagcattactacagaagtaatgcctgtttctatggctaaaacctccgtagattgt
aatatgtacatctgcggagattctactgaatgtgctaatttgcttctccaatatggtagc
ttttgcacacaactaaatcgtgcactctcaggtattgctgctgaacaggatcgcaacaca
cgtgaagtgttcgctcaagtcaaacaaatgtacaaaaccccaactttgaaatattttggt
ggttttaatttttcacaaatattacctgaccctctaaagccaactaagaggtcttttatt
gaggacttgctctttaataaggtgacactcgctgatgctggcttcatgaagcaatatggc
gaatgcctaggtgatattaatgctagagatctcatttgtgcgcagaagttcaatggactt
acagtgttgccacctctgctcactgatgatatgattgctgcctacactgctgctctagtt
agtggtactgccactgctggatggacatttggtgctggcgctgctcttcaaatacctttt
gctatgcaaatggcatataggttcaatggcattggagttacccaaaatgttctctatgag
aaccaaaaacaaatcgccaaccaatttaacaaggcgattagtcaaattcaagaatcactt
acaacaacatcaactgcattgggcaagctgcaagacgttgttaaccagaatgctcaagca
ttaaacacacttgttaaacaacttagctctaattttggtgcaatttcaagtgtgctaaat
gatatcctttcgcgacttgataaagtcgaggcggaggtacaaattgacaggttaattaca
ggcagacttcaaagccttcaaacctatgtaacacaacaactaatcagggctgctgaaatc
agggcttctgctaatcttgctgctactaaaatgtctgagtgtgttcttggacaatcaaaa
agagttgacttttgtggaaagggctaccaccttatgtccttcccacaagcagccccgcat
ggtgttgtcttcctacatgtcacgtatgtgccatcccaggagaggaacttcaccacagcg
ccagcaatttgtcatgaaggcaaagcatacttccctcgtgaaggtgtttttgtgtttaat
ggcacttcttggtttattacacagaggaacttcttttctccacaaataattactacagac
aatacatttgtctcaggaaattgtgatgtcgttattggcatcattaacaacacagtttat
gatcctctgcaacctgagctcgactcattcaaagaagagctggacaagtacttcaaaaat
catacatcaccagatgttgatcttggcgacatttcaggcattaacgcttctgtcgtcaac
attcaaaaagaaattgaccgcctcaatgaggtcgctaaaaatttaaatgaatcactcatt
gaccttcaagaattgggaaaatatgagcaatatattaaatggccttggtatgtttggctc
ggcttcattgctggactaattgccatcgtcatggttacaatcttgctttgttgcatgact
agttgttgcagttgcctcaagggtgcatgctcttgtggttcttgctgcaagtttgatgag
gatgactctgagccagttctcaagggtgtcaaattacattacacataa

CDS, Synthetic Gene (3570 bp):
atgttcatcttcctgctgttcctgaccctgacctccggctccgacctggaccgctgcacc
accttcgacgacgtgcaggcgcccaactacacccagcacacctcctccatgcgcggcgtg
tactaccccgacgagatcttccgctccgacaccctgtacctgacccaggacctgttcctg
cccttctactccaacgtgaccggcttccacaccatcaaccacaccttcggcaaccccgtg
atccccttcaaggacggcatctacttcgcggcgaccgagaagtccaacgtggtgcgcggc
tgggtgttcggctccaccatgaacaacaagtcccagtccgtgatcatcatcaacaactcc
accaacgtggtgatccgcgcgtgcaacttcgagctgtgcgacaaccccttcttcgcggtg
tccaagcccatgggcacccagacccacaccatgatcttcgacaacgcgttcaactgcacc
ttcgagtacatctccgacgcgttctccctggacgtgtccgagaagtccggcaacttcaag
cacctgcgcgagttcgtgttcaagaacaaggacggcttcctgtacgtgtacaagggctac
cagcccatcgacgtggtgcgcgacctgccctccggcttcaacaccctgaagcccatcttc
aagctgcccctgggcatcaacatcaccaacttccgcgcgatcctgaccgcgttctccccc
gcgcaggacatctggggcacctccgcggcggcgtacttcgtgggctacctgaagcccacc
accttcatgctgaagtacgacgagaacggcaccatcaccgacgcggtggactgctcccag
aaccccctggcggagctgaagtgctccgtgaagtccttcgagatcgacaagggcatctac
cagacctccaacttccgcgtggtgccctccggcgacgtggtgcgcttccccaacatcacc
aacctgtgccccttcggcgaggtgttcaacgcgaccaagttcccctccgtgtacgcgtgg
gagcgcaagaagatctccaactgcgtggcggactactccgtgctgtacaactccaccttc
ttctccaccttcaagtgctacggcgtgtccgcgaccaagctgaacgacctgtgcttctcc
aacgtgtacgcggactccttcgtggtgaagggcgacgacgtgcgccagatcgcgcccggc
cagaccggcgtgatcgcggactacaactacaagctgcccgacgacttcatgggctgcgtg
ctggcgtggaacacccgcaacatcgacgcgacctccaccggcaactacaactacaagtac
cgctacctgcgccacggcaagctgcgccccttcgagcgcgacatctccaacgtgcccttc
tcccccgacggcaagccctgcaccccccccgcgctgaactgctactggcccctgaacgac
tacggcttctacaccaccaccggcatcggctaccagccctaccgcgtggtggtgctgtcc
ttcgagctgctgaacgcgcccgcgaccgtgtgcggccccaagctgtccaccgacctgatc
aagaaccagtgcgtgaacttcaacttcaacggcctgaccggcaccggcgtgctgaccccc
tcctccaagcgcttccagcccttccagcagttcggccgcgacgtgtccgacttcaccgac
tccgtgcgcgaccccaagacctccgagatcctggacatctccccctgctccttcggcggc
gtgtccgtgatcacccccggcaccaacgcgtcctccgaggtggcggtgctgtaccaggac
gtgaactgcaccgacgtgtccaccgcgatccacgcggaccagctgacccccgcgtggcgc
atctactccaccggcaacaacgtgttccagacccaggcgggctgcctgatcggcgcggag
cacgtggacacctcctacgagtgcgacatccccatcggcgcgggcatctgcgcgtcctac
cacaccgtgtccctgctgcgctccacctcccagaagtccatcgtggcgtacaccatgtcc
ctgggcgcggactcctccatcgcgtactccaacaacaccatcgcgatccccaccaacttc
tccatctccatcaccaccgaggtgatgcccgtgtccatggcgaagacctccgtggactgc
aacatgtacatctgcggcgactccaccgagtgcgcgaacctgctgctgcagtacggctcc
ttctgcacccagctgaaccgcgcgctgtccggcatcgcggcggagcaggaccgcaacacc
cgcgaggtgttcgcgcaggtgaagcagatgtacaagacccccaccctgaagtacttcggc
ggcttcaacttctcccagatcctgcccgaccccctgaagcccaccaagcgctccttcatc
gaggacctgctgttcaacaaggtgaccctggcggacgcgggcttcatgaagcagtacggc
gagtgcctgggcgacatcaacgcgcgcgacctgatctgcgcgcagaagttcaacggcctg
accgtgctgccccccctgctgaccgacgacatgatcgcggcgtacaccgcggcgctggtg
tccggcaccgcgaccgcgggctggaccttcggcgcgggcgcggcgctgcagatccccttc
gcgatgcagatggcgtaccgcttcaacggcatcggcgtgacccagaacgtgctgtacgag
aaccagaagcagatcgcgaaccagttcaacaaggcgatctcccagatccaggagtccctg
accaccacctccaccgcgctgggcaagctgcaggacgtggtgaaccagaacgcgcaggcg
ctgaacaccctggtgaagcagctgtcctccaacttcggcgcgatctcctccgtgctgaac
gacatcctgtcccgcctggacaaggtggaggcggaggtgcagatcgaccgcctgatcacc
ggccgcctgcagtccctgcagacctacgtgacccagcagctgatccgcgcggcggagatc
cgcgcgtccgcgaacctggcggcgaccaagatgtccgagtgcgtgctgggccagtccaag
cgcgtggacttctgcggcaagggctaccacctgatgtccttcccccaggcggcgccccac
ggcgtggtgttcctgcacgtgacctacgtgccctcccaggagcgcaacttcaccaccgcg
cccgcgatctgccacgagggcaaggcgtacttcccccgcgagggcgtgttcgtgttcaac
ggcacctcctggttcatcacccagcgcaacttcttctccccccagatcatcaccaccgac
aacaccttcgtgtccggcaactgcgacgtggtgatcggcatcatcaacaacaccgtgtac
gaccccctgcagcccgagctggactccttcaaggaggagctggacaagtacttcaagaac
cacacctcccccgacgtggacctgggcgacatctccggcatcaacgcgtccgtggtgaac
atccagaaggagatcgaccgcctgaacgaggtggcgaagaacctgaacgagtccctgatc
gacctgcaggagctgggcaagtacgagcag
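
A quick way to sanity-check a recoded entry like this one is to translate both CDS strings and confirm that they encode the same residues. The sketch below is illustrative only and is not part of SGDB: it assumes the standard genetic code, and the helper names `translate` and `same_protein` are made up for this example. Because the synthetic gene covers only the first 1190 amino acids, the comparison is limited to the length of the synthetic translation.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order (assumption:
# the standard table applies to this CDS).
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a CDS (whitespace and line breaks ignored) codon by codon."""
    cds = "".join(cds.split()).lower()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def same_protein(natural_cds: str, synthetic_cds: str) -> bool:
    """True if the synthetic CDS encodes a prefix of the natural protein."""
    synthetic_protein = translate(synthetic_cds)
    return translate(natural_cds)[:len(synthetic_protein)] == synthetic_protein


# First 30 nucleotides of each CDS in this entry.
natural_start = "atgtttattttcttattatttcttactctc"
synthetic_start = "atgttcatcttcctgctgttcctgaccctg"
print(translate(natural_start))    # MFIFLLFLTL
print(translate(synthetic_start))  # MFIFLLFLTL
print(same_protein(natural_start, synthetic_start))  # True
```

To check the whole entry, pass the full natural and synthetic sequences (pasted as multi-line strings) to `same_protein`.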
5' End
3' End
Notes | The protein record in NCBI is AAP13441. | Only the first 1190 amino acids were back translated into nucleotide sequence because this part codes for a soluble protein. Note: the sequence given is a fully optimized sequence and may contain runs of Gs and Cs.
Expression Vector | NA | pcDNA3.1
Assay Methods | NA | SDS-PAGE and Western blot
Results | Expression not determined | Strong expression in human cells.
Protein Function: surface spike glycoprotein
Recoding Purpose: To improve expression
Synthesized By: Authors
Recoding Method: The first 1190 amino acids were back translated to nucleotide sequence. The DNA
sequence was codon optimized for mammalian cell expression, replacing the natural codons with the
following optimum codons: ala (gcc), arg (cgc), asn (aac), asp (gac), cys (tgc), glu (gag), gln (cag),
gly (ggc), his (cac), ile (atc), leu (ctg), lys (aag), met (atg), phe (ttc), pro (ccc), ser (tcc),
thr (acc), trp (tgg), tyr (tac) and val (gtg). When runs of Gs and Cs occurred, suboptimal codons were used.
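
The back-translation step described above can be sketched in a few lines. This is a minimal illustration, not the authors' actual procedure: `back_translate` and `longest_gc_run` are hypothetical helper names, and the codon table is copied from the Recoding Method field. The entry does not say how long a G/C run had to be or which suboptimal codons were substituted, so the sketch only locates such runs and does not replace them. It also treats a "run" as any uninterrupted stretch of g or c bases, which is an assumption. For these reasons the output may not match the stored synthetic sequence everywhere.

```python
import re
from typing import Tuple

# Optimum codons as listed in the Recoding Method field (one-letter amino acid codes).
OPTIMUM_CODONS = {
    "A": "gcc", "R": "cgc", "N": "aac", "D": "gac", "C": "tgc",
    "E": "gag", "Q": "cag", "G": "ggc", "H": "cac", "I": "atc",
    "L": "ctg", "K": "aag", "M": "atg", "F": "ttc", "P": "ccc",
    "S": "tcc", "T": "acc", "W": "tgg", "Y": "tac", "V": "gtg",
}


def back_translate(protein: str) -> str:
    """Replace every residue with its listed optimum codon."""
    return "".join(OPTIMUM_CODONS[aa] for aa in protein.upper())


def longest_gc_run(seq: str) -> Tuple[int, int]:
    """Return (start, length) of the longest uninterrupted stretch of g/c bases."""
    best = (0, 0)
    for match in re.finditer(r"[gc]+", seq.lower()):
        length = match.end() - match.start()
        if length > best[1]:
            best = (match.start(), length)
    return best


# First ten residues encoded by the CDS of this entry.
dna = back_translate("MFIFLLFLTL")
print(dna)  # atgttcatcttcctgctgttcctgaccctg
print(longest_gc_run(dna))

# 1190 residues give 1190 * 3 = 3570 nucleotides, the synthetic gene length.
print(1190 * 3)  # 3570
```

A fuller implementation would need a rule for choosing the replacement codon inside each flagged run; the entry gives no detail on that choice.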
Publication Author(s): Babcock, G. J.; Esshaki, D. J.; Thomas, W. D., Jr.; Ambrosino, D. M.
Corresponding Author: Gregory J. Babcock
Corresponding Address: Massachusetts Biologic Laboratories, University of Massachusetts Medical School, Jamaica Plain, Massachusetts 02130, USA. greg.babcock@umassmed.edu
Publication Year: 2004
Publication Title: Amino acids 270 to 510 of the severe acute respiratory syndrome coronavirus spike protein are required for interaction with receptor
Abstract: A novel coronavirus, severe acute respiratory syndrome coronavirus (SARS-CoV), has recently been identified as the causative agent of severe acute respiratory syndrome (SARS). SARS-CoV appears similar to other coronaviruses in both virion structure and genome organization. It is known for other coronaviruses that the spike (S) glycoprotein is required for both viral attachment to permissive cells and for fusion of the viral envelope with the host cell membrane. Here we describe the construction and expression of a soluble codon-optimized SARS-CoV S glycoprotein comprising the first 1,190 amino acids of the native S glycoprotein (S(1190)). The codon-optimized and native S glycoproteins exhibit similar molecular weight as determined by Western blot analysis, indicating that synthetic S glycoprotein is modified correctly in a mammalian expression system. S(1190) binds to the surface of Vero E6 cells, a cell permissive to infection, as demonstrated by fluorescence-activated cell sorter analysis, suggesting that S(1190) maintains the biologic activity present in native S glycoprotein. This interaction is blocked with serum obtained from recovering SARS patients, indicating that the binding is specific. In an effort to map the ligand-binding domain of the SARS-CoV S glycoprotein, carboxy- and amino-terminal truncations of the S(1190) glycoprotein were constructed. Amino acids 270 to 510 were the minimal receptor-binding region of the SARS-CoV S glycoprotein as determined by flow cytometry. We speculate that amino acids 1 to 510 of the SARS-CoV S glycoprotein represent a unique domain containing the receptor-binding site (amino acids 270 to 510), analogous to the S1 subunit of other coronavirus S glycoproteins.
Journal: J Virol. 78(9): 4552-60.
Summary: The soluble part of the S glycoprotein was codon optimized for expression in mammalian cells and gave significant yields in human HEK-293T cells.
Comments
Discussion: http://www.evolvingcode.net/forum/viewtopic.php?t=596
PubMed ID: 15078936
Submitter Name: Wu, Gang
Submitter Address: Department of Biological Sciences, University of Maryland Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250 USA
Entry Confirmation: No
 
 

Copyright 2004 the Freeland Bioinformatics Lab, All Rights Reserved.