|
The amino acid alphabet and the rules for translating the 64
nucleotide triplets (codons) into amino acids are two of the
relatively few biological phenomena that are nearly universal
across life. However, the discoveries of a 21st amino acid,
selenocysteine, and a 22nd, pyrrolysine, together with
non-standard genetic codes in organelle, prokaryotic and
eukaryotic genomes, make the dominance of both the standard
amino acid alphabet and the standard genetic code more intriguing.
I am interested in the following two questions: what makes
the proteinaceous amino acids special, and why are there
exactly 20 of them? To tackle these questions, I am looking
into the link between amino acid properties and the structure
of the standard genetic code, since previous evidence shows
that the arrangement of codon/amino acid assignments in the
standard genetic code efficiently minimizes the phenotypic
impact of genetic error.
Using different models of mutation and translation bias,
I have conducted a series of simulations to study optimized
amino acid indices, that is, indices that maximize the error
minimization effect of the standard genetic code. Each amino
acid index is a vector of 20 real numbers, one for each amino
acid, which represents some quantitative metric of amino acid
similarity.
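
As a minimal sketch of how an amino acid index can be scored
against the standard genetic code, the Python below computes
the weighted mean squared change in index value over all
single-nucleotide substitutions. It is an illustration under
stated assumptions, not the simulation code used in this work:
the Kyte-Doolittle hydropathy scale stands in for an index, the
single transition_bias weight stands in for a mutation bias
model, translation bias is ignored, and the names code_cost and
shuffled_code are hypothetical.

```python
import random
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Purine<->purine and pyrimidine<->pyrimidine substitutions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

# Illustrative index only (assumption): Kyte-Doolittle hydropathy values.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def code_cost(code, index, transition_bias=1.0):
    """Weighted mean squared index change over single-base substitutions.

    Transitions are weighted by transition_bias, transversions by 1.
    Substitutions to or from stop codons are skipped. Lower is better.
    """
    total = 0.0
    weight_sum = 0.0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant_aa = code[codon[:pos] + base + codon[pos + 1:]]
                if mutant_aa == "*":
                    continue
                is_transition = (codon[pos], base) in TRANSITIONS
                weight = transition_bias if is_transition else 1.0
                total += weight * (index[aa] - index[mutant_aa]) ** 2
                weight_sum += weight
    return total / weight_sum


def shuffled_code(code, rng):
    """Permute amino acids among the synonymous codon blocks; stops stay put."""
    amino_acids = sorted(set(code.values()) - {"*"})
    permuted = amino_acids[:]
    rng.shuffle(permuted)
    mapping = dict(zip(amino_acids, permuted))
    mapping["*"] = "*"
    return {codon: mapping[aa] for codon, aa in code.items()}


if __name__ == "__main__":
    rng = random.Random(0)
    for bias in (1.0, 5.0):
        standard = code_cost(STANDARD_CODE, HYDROPATHY, bias)
        random_costs = [
            code_cost(shuffled_code(STANDARD_CODE, rng), HYDROPATHY, bias)
            for _ in range(200)
        ]
        better = sum(cost < standard for cost in random_costs)
        print(f"bias={bias}: standard cost={standard:.3f}, "
              f"{better}/200 shuffled codes score lower")
```

Comparing the standard code's cost with the costs of shuffled
codes is one common way to express error minimization; searching
for an optimized index would amount to varying the 20 values in
the index while holding the code fixed.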
Current results demonstrate that the high diversity of the
standard amino acid alphabet and the strong error minimization
effect of the standard genetic code can coexist when the
mutation bias is high, which is plausible during early
genetic code evolution. Future simulations will include
a hypothesized order in which amino acids were incorporated
into genetic codes, to study how diversity arose in the
standard amino acid alphabet.
|