In other words, is the code overlapping or non-overlapping, and if it is overlapping, from where will the reading begin?
The question matters because the point at which reading begins alters the nucleotide combination and hence which amino acid is specified by the code.
When Gamow proposed the code he assumed it to be overlapping. Many people attempted to rationalize the excess number of codons, i.e., 64 for 20 amino acids, by assuming that the codons overlap and that the number of usable codons is thereby reduced to about 20. But this involves a lot of problems.
Take, for instance, the codons for lysine (AAA and AAG) and the codons for phenylalanine (UUU and UUC). In an overlapping code the letters of these codons are mutually exclusive, i.e., lysine cannot follow phenylalanine and vice versa in any protein.
This constitutes a forbidden combination of amino acids in proteins. On the other hand, these amino acids are found next to each other in proteins, which argues that the genetic code is non-overlapping.
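The forbidden-neighbour argument can be checked with a small sketch. It assumes a fully overlapping triplet code in which each codon starts one base after the previous one, so that adjacent codons share two bases. The function names can_follow and any_allowed are hypothetical helpers introduced only for illustration, and the codons are taken from the standard codon table.

```python
from itertools import product

# Codons from the standard genetic code (assumed here for illustration).
LYSINE = ["AAA", "AAG"]
PHENYLALANINE = ["UUU", "UUC"]

def can_follow(first, second):
    """In a fully overlapping code the next codon starts one base later,
    so its first two bases must equal the last two bases of `first`."""
    return first[1:] == second[:2]

def any_allowed(codons_a, codons_b):
    """True if some codon of A can be followed directly by some codon of B."""
    return any(can_follow(a, b) for a, b in product(codons_a, codons_b))

lys_then_phe = any_allowed(LYSINE, PHENYLALANINE)
phe_then_lys = any_allowed(PHENYLALANINE, LYSINE)
print(lys_then_phe, phe_then_lys)
```

Under this assumption neither order is possible, which is the contradiction with observed proteins that the text describes.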
Further, an overlapping code assumes that a nucleotide is read more than once, in different combinations, to code for amino acids.
If this were true, the entire precision of the code would be disturbed. Moreover, any mutational change in an overlapping code would affect more than one codon, thus disturbing a number of amino acids.
Mutational studies have shown that the code is non-overlapping. In TMV (Tobacco Mosaic Virus) the change of one base in the RNA due to mutation alters the composition of the protein by a single amino acid, indicating that the code is non-overlapping.
If the code were overlapping, a change of one base would have disturbed more than one amino acid. Studies on normal and sickle cell hemoglobins have likewise shown that single base mutations result in the replacement of only one amino acid.
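The mutational argument can be made concrete by counting how many triplets contain a given base under each reading scheme. This is a minimal sketch: the function name codons_containing and the 12-base message length are assumptions chosen for illustration, with a step of 3 modelling a non-overlapping code and a step of 1 modelling a fully overlapping one.

```python
def codons_containing(position, seq_len, step):
    """Return the start indices of all triplets that include the base at
    `position`, when triplets begin every `step` bases."""
    starts = range(0, seq_len - 2, step)
    return [s for s in starts if s <= position < s + 3]

# Hypothetical 12-base message with a mutation at index 4.
non_overlapping = codons_containing(4, 12, step=3)
overlapping = codons_containing(4, 12, step=1)

print(len(non_overlapping), "codon affected in a non-overlapping code")
print(len(overlapping), "codons affected in a fully overlapping code")
```

A single base change touches one codon in the non-overlapping scheme and three in the fully overlapping one, which is why the single amino acid replacements observed in TMV and hemoglobin favour the non-overlapping code.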
2. The code is degenerate:
(Synonym codons) As mentioned above, there are 64 possible codons if the code is a triplet one. Of these 64 codons, 61 code for amino acids; the remaining three are termination codons.
As there are only 20 amino acids which form the constituents of proteins, it is obvious that the number of codons far exceeds the number of amino acids.
This leads to the conclusion that most amino acids must have more than one codon. Of the 20 amino acids only tryptophan and methionine have a single codon each, while all the others have more than one codon.
Phenylalanine, tyrosine, histidine, glutamine, asparagine, lysine, aspartic acid and glutamic acid have two codons each (see the codon table). Leucine, arginine and serine have 6 codons each.
Thus there is no fixed rule as to the number of codons that specify an amino acid. The number of codons (whether one or many) specifying each amino acid has possibly evolved over a long period of time.
The variability in the number of codons may explain the differential distribution of amino acids in proteins (those with multiple codons occur more frequently).
When there are multiple codons for an amino acid, it must be noted that these codons are not totally different.
Synonym codons specifying an amino acid usually share the first two bases of the triplet, and only the third varies. For instance, proline is coded by the codons CCU, CCC, CCA and CCG, in which the first two bases are common. The six-codon amino acids (leucine, arginine and serine) are exceptions, since their codons fall into two groups with different first two bases.
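The counts quoted in this section can be reproduced from the standard codon table. The sketch below builds the table compactly from a 64-letter string of one-letter amino acid codes (with * for termination) listed in U, C, A, G base order. The variable names are illustrative, and the standard table is assumed.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code, one-letter amino acid symbols, '*' = termination.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group the sense codons by the amino acid they specify.
synonyms = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        synonyms[aa].append(codon)

sense_codon_count = sum(len(codons) for codons in synonyms.values())
single = sorted(aa for aa, codons in synonyms.items() if len(codons) == 1)
six = sorted(aa for aa, codons in synonyms.items() if len(codons) == 6)
# Distinct first-two-base prefixes used by each amino acid.
prefixes = {aa: {codon[:2] for codon in codons} for aa, codons in synonyms.items()}

print(sense_codon_count, single, six)
print(sorted(synonyms["P"]), prefixes["P"], prefixes["L"])
```

Here M and W are methionine and tryptophan, L, R and S are leucine, arginine and serine, and P is proline, whose four codons all begin with CC.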
The multiplicity or degeneracy of the code may pose a question as to why nature has provided alternate codons for an amino acid where a single codon would have been more precise and specific.
One possible reason for the degeneracy of the codes is that multiple codons protect the organism from mutations which might eliminate an amino acid altogether if there were to be only one code.
3. Polarity of the code:
It is essential that a particular codon should always specify the same amino acid in all situations, unless it undergoes mutational change. It is therefore necessary that the code be read between a fixed beginning point and a fixed end point.
These points are referred to as the initiation and termination codons respectively. Another necessity is a fixed direction of reading, because any change in the direction would alter the amino acid specified.
All these features clearly point out that the code must have a polarity, i.e., a fixed start, a fixed end point and a predetermined direction. For instance, if the direction in which a sequence of codons is read is reversed, the amino acids specified would be completely different.
Thus the direction of reading the code alters the sequence of the amino acids and consequently changes the composition of the protein.
Available molecular biological evidence indicates that the message in mRNA is read in the 5′ → 3′ direction. Based on this message, the polypeptide chain is synthesized from the amino end towards the carboxyl end.
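The effect of reading direction can be illustrated by translating one short message both ways. This is only a sketch. The nine-base message is hypothetical, the standard codon table is assumed, and the helper translate simply reads successive triplets from the first base without modelling initiation or termination.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one-letter amino acid symbols, '*' = termination.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(mrna):
    """Read non-overlapping triplets from the first base onwards."""
    return "".join(CODON_TABLE[mrna[i:i + 3]] for i in range(0, len(mrna) - 2, 3))

message = "AUGCCUAAA"                 # hypothetical message, written 5' to 3'
forward = translate(message)          # read in the 5' -> 3' direction
backward = translate(message[::-1])   # the same bases read in the opposite direction

print(forward, backward)
```

Read forwards the message gives methionine, proline, lysine (MPK); read backwards the same bases give lysine, serine, valine (KSV), so a fixed direction is needed for a message to have one meaning.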
4. The code is commaless:
One of the most interesting questions with reference to the non-overlapping code is whether there are any gaps between the codons, in the form of punctuation marks, that separate one codon from the next.
In genetic terms this means that there could be bases between codes indicating where one code ends and the other begins. For instance codes with intervening punctuation marks can be represented as follows –
In such a punctuated code, a mutation in any one codon would not affect the other codons, and consequently the disturbance in the protein synthesised would be limited to one amino acid.
On the other hand, if the code is commaless (no punctuation base between codons), the addition or deletion of one base would shift the reading of all the following codons and result in a drastic change of the genetic message. The research of Dr. Khorana and others clearly indicated that the code is commaless, i.e., there is no delimiting point between one codon and the next.
Khorana and his associates used artificially synthesized long polynucleotide chains which had repeating sequences to produce polypeptide chains. For example the repeating sequence CUCUCU contains the codons for leucine (CUC) and serine (UCU).
When this sequence was used to direct polypeptide synthesis, neither amino acid was incorporated into the polypeptide unless the other one was also present. This shows that the triplet code must be commaless, since only then would the message be translated alternately into leucine and serine.
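The alternating product can be reproduced by reading a repeating CU sequence as consecutive triplets with nothing between them. This is a sketch that assumes the standard codon table. The 18-base length and the helper translate_frame are arbitrary illustrative choices, and initiation is not modelled.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one-letter amino acid symbols, '*' = termination.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate_frame(mrna, frame):
    """Read adjacent triplets, with no separating bases, starting at `frame`."""
    return "".join(CODON_TABLE[mrna[i:i + 3]] for i in range(frame, len(mrna) - 2, 3))

poly_cu = "CU" * 9  # CUCUCU... repeating copolymer, 18 bases
products = [translate_frame(poly_cu, frame) for frame in range(3)]

print(products)
```

Whichever base the reading starts from, the codons alternate between CUC (leucine, L) and UCU (serine, S), so each amino acid appears only together with the other.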
5. Universality of the code:
The code was first deciphered and worked out in microorganisms, but subsequent research has clearly shown that the codons are the same in their amino acid specification from bacteria to man.
Thus, the code is said to be universal. Marshall, Caskey and Nirenberg (1967) showed by their experiments that diverse organisms such as E. coli (a bacterium), Xenopus laevis (an amphibian) and the guinea pig (a mammal) use the same code for their aminoacyl tRNAs.
Gene mutations provide additional evidence for the universality of codes. These mutations result in the substitution of one amino acid for other.
Studies have indicated that gene mutations bringing about amino acid substitutions behave in the same way in TMV, E. coli and man. For instance, a single point mutation changes one amino acid in the coat protein of TMV, in the alpha chain of tryptophan synthetase in E. coli and also in the hemoglobin of man.
Another remarkable feature of the code is that it not only occurs universally in all organisms but has probably also remained without any change ever since it evolved in some bacteria a few billion years ago.
Any mutation that occurs would change the composition of mRNA and consequently the amino acid composition in the protein.
Changes such as these have apparently proved harmful to organisms, and so there is strong selection pressure against such mutations. Hence natural selection has preserved the codes, perhaps ever since they originated.