What is the genetic code in biology? Genetic code: definition, description, characteristics, research history

The genetic code is a special way of encoding hereditary information in nucleic acid molecules. On this basis, genes control the synthesis of proteins and enzymes in the body, thereby determining its metabolism. In turn, the structure of individual proteins and their functions are determined by the order and composition of amino acids, the structural units of the protein molecule.

In the middle of the last century, genes were identified as separate sections of deoxyribonucleic acid (abbreviated as DNA). Its chains of nucleotides form a characteristic double strand twisted into a helix.

Scientists have found a connection between genes and the chemical structure of individual proteins, the essence of which is that the structural order of amino acids in protein molecules fully corresponds to the order of nucleotides in the gene. Having established this connection, scientists decided to decipher the genetic code, i.e. establish the laws of correspondence between the structural orders of nucleotides in DNA and amino acids in proteins.

DNA contains only four types of nucleotides, which differ in their nitrogenous bases:

1) A - adenine;

2) G - guanine;

3) T - thymine;

4) C - cytosine.

Proteins contain twenty types of standard amino acids. Deciphering the genetic code was difficult because there are far fewer types of nucleotides than of amino acids. To solve this problem, it was suggested that each amino acid is encoded by a combination of three nucleotides (the so-called codon, or triplet).

In addition, it was necessary to explain exactly how the triplets are located along the gene. Thus, three main groups of theories arose:

1) triplets follow each other continuously, i.e. form a continuous code;

2) triplets are arranged with alternation of "meaningless" sections, i.e. the so-called "commas" and "paragraphs" are formed in the code;

3) triplets can overlap, i.e. the end of the first triplet may form the beginning of the next.

Currently, the continuous-code theory is the one generally accepted.

The genetic code and its properties

1) The code is triplet - it consists of arbitrary combinations of three nucleotides that form codons.

2) The genetic code is redundant (degenerate): one amino acid can be encoded by several codons, since there are about three times more codons (64) than amino acids (20). Some codons perform special functions: some are "stop signals" that program the end of the production of an amino acid chain, while others indicate the initiation of code reading.

3) The genetic code is unambiguous - only one amino acid can correspond to each of the codons.

4) The genetic code is collinear, i.e. the sequence of nucleotides and the sequence of amino acids clearly correspond to each other.

5) The code is written continuously and compactly, there are no "meaningless" nucleotides in it. It begins with a certain triplet, which is replaced by the next one without a break and ends with a termination codon.

6) The genetic code is universal - the genes of any organism encode information about proteins in exactly the same way. This does not depend on the level of complexity of the organization of the organism or its systemic position.

Modern science suggests that the genetic code arose together with the emergence of the first organisms from inert (non-living) matter. Random changes and evolutionary processes could in principle produce any variant of the code, i.e. amino acids could be assigned to codons in any order. Why did this particular code survive in the course of evolution, and why is it universal, with the same structure everywhere? The more science learns about the phenomenon of the genetic code, the more new mysteries arise.

The genetic code is a unified system for recording hereditary information in nucleic acid molecules as a sequence of nucleotides. It is based on an alphabet of only four nucleotide letters that differ in their nitrogenous bases: A, T, G, C.

The main properties of the genetic code are as follows:

1. The genetic code is triplet. A triplet (codon) is a sequence of three nucleotides that codes for one amino acid. Since proteins contain 20 amino acids, each of them obviously cannot be encoded by a single nucleotide (there are only four types of nucleotides in DNA, so 16 amino acids would remain uncoded). Two nucleotides per amino acid are not enough either, since only 4^2 = 16 amino acids could be encoded. Hence the smallest number of nucleotides that can encode one amino acid is three (in this case, the number of possible nucleotide triplets is 4^3 = 64).

2. The redundancy (degeneracy) of the code is a consequence of its triplet nature and means that one amino acid can be encoded by several triplets (since there are 20 amino acids and 64 triplets). The exceptions are methionine and tryptophan, each of which is encoded by only one triplet. In addition, some triplets perform specific functions. In an mRNA molecule, three of them (UAA, UAG, UGA) are terminating codons, i.e. stop signals that end the synthesis of the polypeptide chain. The triplet corresponding to methionine (AUG), when it stands at the beginning of the coding sequence of an mRNA, also performs the function of initiating reading.

3. Simultaneously with redundancy, the code has the property of unambiguity, which means that each codon corresponds to only one specific amino acid.

4. The code is collinear, i.e. the sequence of nucleotides in a gene corresponds exactly to the sequence of amino acids in the protein.

5. The genetic code is non-overlapping and compact, that is, it does not contain "punctuation marks". This means that reading does not allow codons (triplets) to overlap: starting at a certain codon, it proceeds continuously, triplet by triplet, up to a stop signal (terminating codon). For example, in mRNA, the sequence of nitrogenous bases AUGGUGCUUAAUGUG will only be read in triplets like this: AUG, GUG, CUU, AAU, GUG, and not as AUG, UGG, GGU, GUG, etc., or as AUG, GGU, UGC, CUU, etc., or in some other way (for example, codon AUG, punctuation mark G, codon UGC, punctuation mark U, etc.).

6. The genetic code is universal, that is, the nuclear genes of all organisms encode information about proteins in the same way, regardless of the level of organization and systematic position of these organisms.
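The reading rule in property 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `read_codons` and its `step` parameter are introduced here and do not come from the text, and the sequence is the 15-nucleotide mRNA whose triplets are listed above (AUG, GUG, CUU, AAU, GUG).

```python
def read_codons(mrna, step=3):
    """Split mrna into triplets, advancing `step` nucleotides each time.

    step=3 is the real, non-overlapping reading; step=1 or step=2
    model the hypothetical overlapping readings that the code excludes.
    """
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, step)]


MRNA = "AUGGUGCUUAAUGUG"  # sequence assumed from the triplets listed in the text

print(read_codons(MRNA))              # ['AUG', 'GUG', 'CUU', 'AAU', 'GUG']
print(read_codons(MRNA, step=1)[:4])  # ['AUG', 'UGG', 'GGU', 'GUG']
print(read_codons(MRNA, step=2)[:4])  # ['AUG', 'GGU', 'UGC', 'CUU']
```

Only the first call corresponds to how mRNA is actually read; the other two reproduce the excluded overlapping readings for comparison.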


Genetic code is a way of encoding the amino acid sequence of proteins using the sequence of nucleotides in the DNA molecule, characteristic of all living organisms.

The implementation of genetic information in living cells (that is, the synthesis of a protein encoded in DNA) is carried out by two template-directed processes: transcription (the synthesis of mRNA on a DNA template) and translation (the synthesis of a polypeptide chain on an mRNA template).

DNA uses four nucleotides - adenine (A), guanine (G), cytosine (C), thymine (T). These "letters" make up the alphabet of the genetic code. RNA uses the same nucleotides, except for thymine, which is replaced by uracil (U). In DNA and RNA molecules, nucleotides line up in chains and, thus, sequences of “letters” are obtained.

In the nucleotide sequence of DNA there are code "words" for each amino acid of the future protein molecule - the genetic code. It consists in a certain sequence of nucleotides in the DNA molecule.

Three consecutive nucleotides encode the "name" of one amino acid, that is, each of the 20 amino acids is encrypted by a significant code unit - a combination of three nucleotides called a triplet or codon.

At present, the DNA code has been completely deciphered, and we can talk about certain properties that are characteristic of this unique biological system, which provides the translation of information from the "language" of DNA to the "language" of protein.

The carrier of genetic information is DNA, but since mRNA, a copy of one of the DNA strands, is directly involved in protein synthesis, the genetic code is most often written in the "RNA language".

Amino acid | Coding RNA triplets
Alanine | GCU GCC GCA GCG
Arginine | CGU CGC CGA CGG AGA AGG
Asparagine | AAU AAC
Aspartic acid | GAU GAC
Valine | GUU GUC GUA GUG
Histidine | CAU CAC
Glycine | GGU GGC GGA GGG
Glutamine | CAA CAG
Glutamic acid | GAA GAG
Isoleucine | AUU AUC AUA
Leucine | CUU CUC CUA CUG UUA UUG
Lysine | AAA AAG
Methionine | AUG
Proline | CCU CCC CCA CCG
Serine | UCU UCC UCA UCG AGU AGC
Tyrosine | UAU UAC
Threonine | ACU ACC ACA ACG
Tryptophan | UGG
Phenylalanine | UUU UUC
Cysteine | UGU UGC
STOP | UGA UAG UAA
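As a minimal sketch, the table can be turned into a lookup structure and used to translate a short mRNA. The codons below are the standard RNA triplets for each row of the table. The names `CODONS_BY_AMINO_ACID`, `CODON_TO_AMINO_ACID` and `translate`, and the example mRNA, are assumptions made for this illustration.

```python
# Rows of the table: amino acid name -> its RNA codons.
CODONS_BY_AMINO_ACID = {
    "Alanine": "GCU GCC GCA GCG",
    "Arginine": "CGU CGC CGA CGG AGA AGG",
    "Asparagine": "AAU AAC",
    "Aspartic acid": "GAU GAC",
    "Valine": "GUU GUC GUA GUG",
    "Histidine": "CAU CAC",
    "Glycine": "GGU GGC GGA GGG",
    "Glutamine": "CAA CAG",
    "Glutamic acid": "GAA GAG",
    "Isoleucine": "AUU AUC AUA",
    "Leucine": "CUU CUC CUA CUG UUA UUG",
    "Lysine": "AAA AAG",
    "Methionine": "AUG",
    "Proline": "CCU CCC CCA CCG",
    "Serine": "UCU UCC UCA UCG AGU AGC",
    "Tyrosine": "UAU UAC",
    "Threonine": "ACU ACC ACA ACG",
    "Tryptophan": "UGG",
    "Phenylalanine": "UUU UUC",
    "Cysteine": "UGU UGC",
    "STOP": "UGA UAG UAA",
}

# Invert the table: codon -> amino acid name (each codon has one meaning).
CODON_TO_AMINO_ACID = {
    codon: name
    for name, codons in CODONS_BY_AMINO_ACID.items()
    for codon in codons.split()
}


def translate(mrna):
    """Read mrna triplet by triplet from its first base until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TO_AMINO_ACID[mrna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


# Hypothetical toy mRNA: AUG GCU UGG UAA
print(translate("AUGGCUUGGUAA"))  # ['Methionine', 'Alanine', 'Tryptophan']
```

The inverted dictionary has exactly 64 entries, one per triplet, which is a quick consistency check on the table itself.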

Properties of the genetic code

Three consecutive nucleotides (nitrogenous bases) encode the "name" of one amino acid, that is, each of the 20 amino acids is encrypted by a significant code unit, a combination of three nucleotides called a triplet or codon.

Triplet (codon): a sequence of three nucleotides (nitrogenous bases) in a DNA or RNA molecule that determines the inclusion of a certain amino acid in the protein molecule during its synthesis.

  • Unambiguity (discreteness)

One triplet cannot encode two different amino acids; it encodes only one amino acid. A certain codon corresponds to only one amino acid.

  • Degeneracy (redundancy)

Each amino acid can be specified by more than one triplet; the exceptions are methionine and tryptophan. In other words, several codons can correspond to the same amino acid.

  • Non-overlapping

The same base cannot belong to two adjacent codons at the same time.

  • Polarity

Some triplets do not code for amino acids but serve as "road signs" that mark the end of individual genes (UAA, UAG, UGA): each of them means the cessation of synthesis and stands at the end of a gene, so we can speak of the polarity of the genetic code.

  • Universality

In animals and plants, in fungi, bacteria and viruses, the same triplet encodes the same amino acid, that is, the genetic code is the same for all living beings. In other words, universality is the ability of the genetic code to work in the same way in organisms of different levels of complexity, from viruses to humans. The universality of the DNA code supports the unity of origin of all life on our planet. Genetic engineering methods rely on this property.

From the history of the discovery of the genetic code

The idea that a genetic code exists was first formulated by A. Dounce and G. Gamow in 1952-1954. They showed that a nucleotide sequence that uniquely determines the incorporation of a particular amino acid must contain at least three units. Later it was proved that such a sequence consists of exactly three nucleotides, called a codon or triplet.

The questions of which nucleotides are responsible for incorporating a certain amino acid into a protein molecule, and how many nucleotides determine this incorporation, remained unresolved until 1961. Theoretical analysis showed that the code cannot consist of one nucleotide, since in this case only 4 amino acids could be encoded. Nor can the code be a doublet: combinations of two nucleotides from a four-letter "alphabet" cannot cover all amino acids, since only 16 such combinations are possible (4^2 = 16).

Three consecutive nucleotides are enough to encode the 20 amino acids as well as a "stop" signal marking the end of the protein sequence, since the number of possible combinations is then 64 (4^3 = 64).
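The counting argument above is easy to check by enumeration. The following Python sketch is an illustration only: the names `code_words` and `minimal_word_length` are assumptions introduced here, and the RNA alphabet is used, although the count is the same for DNA.

```python
from itertools import product

NUCLEOTIDES = "ACGU"  # four-letter alphabet (use "ACGT" for DNA)
AMINO_ACIDS = 20      # number of standard amino acids


def code_words(length):
    """Return every possible code word of the given length."""
    return ["".join(p) for p in product(NUCLEOTIDES, repeat=length)]


def minimal_word_length(n_meanings):
    """Smallest word length whose number of words covers n_meanings."""
    length = 1
    while len(NUCLEOTIDES) ** length < n_meanings:
        length += 1
    return length


for length in (1, 2, 3):
    n = len(code_words(length))
    verdict = "enough" if n >= AMINO_ACIDS else "not enough"
    print(f"word length {length}: {n} combinations, {verdict}")

print(minimal_word_length(AMINO_ACIDS))  # 3
```

The loop reports 4, 16 and 64 combinations, so three is the shortest word length that leaves room for 20 amino acids and a stop signal.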

Nucleotides line up in chains, and in this way sequences of genetic letters are obtained.

Genetic code

The proteins of almost all living organisms are built from only 20 types of amino acids. These amino acids are called canonical. Each protein is a chain or several chains of amino acids connected in a strictly defined sequence. This sequence determines the structure of the protein, and therefore all its biological properties.

Excerpt from the standard codon table (first base C, second base U):

CUU (Leu/L) Leucine
CUC (Leu/L) Leucine
CUA (Leu/L) Leucine
CUG (Leu/L) Leucine

In some proteins, non-standard amino acids such as selenocysteine and pyrrolysine are inserted by a ribosome reading through a stop codon, depending on sequences in the mRNA. Selenocysteine is now regarded as the 21st, and pyrrolysine as the 22nd, amino acid that makes up proteins.

Despite these exceptions, the genetic codes of all living organisms share common features: a codon consists of three nucleotides, of which the first two are the most decisive, and codons are translated into an amino acid sequence by tRNA and ribosomes.

Deviations from the standard genetic code.
Example | Codon | Usual meaning | Read as
Some yeasts of the genus Candida | CUG | Leucine | Serine
Mitochondria, in particular in Saccharomyces cerevisiae | CU(U, C, A, G) | Leucine | Threonine
Mitochondria of higher plants | CGG | Arginine | Tryptophan
Mitochondria (in all studied organisms without exception) | UGA | Stop | Tryptophan
Mitochondria of mammals, Drosophila, S. cerevisiae and many protozoa | AUA | Isoleucine | Methionine = Start
Prokaryotes | GUG | Valine | Start
Eukaryotes (rare) | CUG | Leucine | Start
Eukaryotes (rare) | GUG | Valine | Start
Prokaryotes (rare) | UUG | Leucine | Start
Eukaryotes (rare) | ACG | Threonine | Start
Mammalian mitochondria | AGC, AGU | Serine | Stop
Drosophila mitochondria | AGA | Arginine | Stop
Mammalian mitochondria | AG(A, G) | Arginine | Stop

The history of ideas about the genetic code

Nevertheless, in the early 1960s, new data revealed the failure of the "comma-free code" hypothesis. Then experiments showed that codons, considered by Crick to be meaningless, can provoke protein synthesis in a test tube, and by 1965 the meaning of all 64 triplets was established. It turned out that some codons are simply redundant, that is, a number of amino acids are encoded by two, four or even six triplets.


Wikimedia Foundation. 2010 .

The genetic code is a system for recording hereditary information in nucleic acid molecules, based on a certain alternation of nucleotide sequences in DNA or RNA that form codons corresponding to amino acids in a protein.

Properties of the genetic code.

The genetic code has several properties.

    Triplet nature.

    Degeneracy or redundancy.

    Unambiguity.

    Polarity.

    Non-overlapping.

    Compactness.

    Universality.

It should be noted that some authors also offer other properties of the code related to the chemical features of the nucleotides included in the code or to the frequency of occurrence of individual amino acids in the proteins of the body, etc. However, these properties follow from the above, so we will consider them there.

a. Triplet nature. The genetic code, like many complexly organized systems, has a smallest structural unit and a smallest functional unit. A triplet is the smallest structural unit of the genetic code; it consists of three nucleotides. A codon is the smallest functional unit of the genetic code. As a rule, it is the triplets of mRNA that are called codons. In the genetic code, a codon performs several functions. First, its main function is to code for one amino acid. Second, a codon may not code for an amino acid, in which case it has a different function (see below). As can be seen from the definition, a triplet is a concept that characterizes the elementary structural unit of the genetic code (three nucleotides), whereas a codon characterizes the elementary semantic unit of the genome: three nucleotides determine the attachment of one amino acid to the polypeptide chain.

The elementary structural unit was first deduced theoretically, and then its existence was confirmed experimentally. Indeed, 20 amino acids cannot be encoded by one or two nucleotides, since there are only 4 types of nucleotide. Triplets of four nucleotides give 4^3 = 64 variants, which more than covers the number of amino acids present in living organisms (see Table 1).

The 64 combinations of nucleotides presented in the table have two features. First, of the 64 triplets, only 61 are codons that encode an amino acid; they are called sense codons. Three triplets do not encode amino acids and are stop signals marking the end of translation. These three triplets (UAA, UAG, UGA) are also called "meaningless" (nonsense) codons. A mutation that replaces one nucleotide in a triplet with another can turn a sense codon into a nonsense codon. This type of mutation is called a nonsense mutation. If such a stop signal arises inside a gene (in its informational part), protein synthesis will be interrupted at this place every time, and only the first part of the protein (up to the stop signal) will be synthesized. A person with such a pathology will lack the protein and will experience the associated symptoms. For example, a mutation of this kind was found in the gene encoding the hemoglobin beta chain. A shortened, inactive hemoglobin chain is synthesized and rapidly destroyed. As a result, a hemoglobin molecule lacking the beta chain is formed. Such a molecule is unlikely to perform its function fully. A severe disease of the hemolytic anemia type develops (beta-zero thalassemia, from the Greek word "thalassa", sea: the disease was first discovered in the Mediterranean region).
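The effect of a nonsense mutation can be illustrated with a short Python sketch. It is not taken from the source: the two mRNA strings are hypothetical toy sequences rather than real globin sequences, and the names `CODE` and `translate` are introduced here. Amino acids are written in the standard one-letter notation, with `*` marking a stop codon.

```python
BASES = "UCAG"
# Standard genetic code in one-letter notation, ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO))


def translate(mrna):
    """Translate triplet by triplet and stop at the first stop codon."""
    protein = ""
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein += amino_acid
    return protein


normal = "AUGGUGCAGCUGACUUAA"  # hypothetical: AUG GUG CAG CUG ACU UAA
mutant = "AUGGUGUAGCUGACUUAA"  # same sequence with CAG -> UAG (one base changed)

print(translate(normal))  # MVQLT
print(translate(mutant))  # MV
```

A single C-to-U replacement turns the glutamine codon CAG into the stop codon UAG, so only the part of the chain before the new stop signal is produced.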

The mechanism of action of stop codons is different from the mechanism of action of sense codons. This follows from the fact that for all the codons encoding amino acids, the corresponding tRNAs were found. No tRNAs were found for nonsense codons. Therefore, tRNA does not take part in the process of stopping protein synthesis.

The codon AUG (and sometimes GUG in bacteria) not only encodes an amino acid (methionine and valine, respectively) but also serves as the initiator of translation.

b. Degeneracy or redundancy.

61 of the 64 triplets code for 20 amino acids. Such a threefold excess of the number of triplets over the number of amino acids suggests that two coding options can be used in the transfer of information. Firstly, not all 64 codons can be involved in encoding 20 amino acids, but only 20, and secondly, amino acids can be encoded by several codons. Studies have shown that nature used the latter option.

The reason for this preference is clear. If only 20 of the 64 triplet variants were involved in coding amino acids, then 44 triplets would remain non-coding, i.e. meaningless (nonsense codons). Earlier, we pointed out how dangerous it is for the cell when a mutation turns a coding triplet into a nonsense codon: this cuts translation short on the ribosome and ultimately leads to the development of disease. There are currently three nonsense codons in our genome; now imagine what would happen if their number increased about 15-fold. Clearly, in such a situation the probability of a normal codon mutating into a nonsense codon would be far higher.

A code in which one amino acid is encoded by several triplets is called degenerate or redundant. Almost every amino acid has several codons. Thus, the amino acid leucine can be encoded by six triplets: UUA, UUG, CUU, CUC, CUA, CUG. Valine is encoded by four triplets, phenylalanine by two, and only tryptophan and methionine are encoded by a single codon. The property of recording the same information with different symbols is called degeneracy.
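The codon counts quoted above can be checked by grouping the 64 triplets by the amino acid they encode. This is a small Python sketch under the assumption that the standard code is written in one-letter notation (`*` for stop); the names `CODE` and `synonyms` are introduced here.

```python
from collections import defaultdict

BASES = "UCAG"
# Standard genetic code in one-letter notation, ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO))

# Group synonymous codons: amino acid -> list of its codons.
synonyms = defaultdict(list)
for codon, amino_acid in CODE.items():
    synonyms[amino_acid].append(codon)

for amino_acid in "LVFWM":  # leucine, valine, phenylalanine, tryptophan, methionine
    print(amino_acid, len(synonyms[amino_acid]), sorted(synonyms[amino_acid]))

sense_codons = sum(len(c) for aa, c in synonyms.items() if aa != "*")
print(sense_codons, "sense codons for", len(synonyms) - 1, "amino acids")
```

The grouping gives six codons for leucine, four for valine, two for phenylalanine and one each for tryptophan and methionine, with 61 sense codons shared among 20 amino acids.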

The number of codons assigned to one amino acid correlates well with the frequency of occurrence of the amino acid in proteins.

And this is most likely not accidental. The higher the frequency of occurrence of an amino acid in a protein, the more often the codon of this amino acid is represented in the genome, the higher the probability of its damage by mutagenic factors. Therefore, it is clear that a mutated codon is more likely to code for the same amino acid if it is highly degenerate. From these positions, the degeneracy of the genetic code is a mechanism that protects the human genome from damage.

It should be noted that the term degeneracy is also used in molecular genetics in another sense. Since most of the information in a codon is carried by the first two nucleotides, the base in the third position turns out to be of little importance. This phenomenon is called "degeneracy of the third base". It minimizes the effect of mutations. For example, the main function of red blood cells is known to be carrying oxygen from the lungs to the tissues and carbon dioxide from the tissues to the lungs. This function is performed by the respiratory pigment hemoglobin, which fills the entire cytoplasm of the erythrocyte. Hemoglobin consists of a protein part, globin, which is encoded by the corresponding gene, and heme, which contains iron. Mutations in globin genes give rise to various hemoglobin variants. Most often, a mutation replaces one nucleotide with another and creates a new codon in the gene, which may code for a new amino acid in the hemoglobin polypeptide chain. A mutation can replace any nucleotide of a triplet: the first, second or third. Several hundred mutations are known to affect the integrity of globin genes. About 400 of them involve the replacement of a single nucleotide in the gene and a corresponding amino acid substitution in the polypeptide. Of these, only 100 substitutions lead to instability of hemoglobin and to diseases ranging from mild to very severe. 300 (approximately 64%) substitution mutations do not affect hemoglobin function and do not lead to pathology. One of the reasons for this is the "degeneracy of the third base" mentioned above: replacing the third nucleotide in a triplet coding for serine, leucine, proline, arginine and some other amino acids produces a synonymous codon encoding the same amino acid. Such a mutation will not manifest itself phenotypically.

In contrast, a replacement of the first or second nucleotide in a triplet almost always leads to the appearance of a new hemoglobin variant. But even in this case there may be no severe phenotypic disorders, because the amino acid in hemoglobin may be replaced by another one with similar physicochemical properties, for example, one hydrophilic amino acid by another hydrophilic one.

Hemoglobin consists of the iron-porphyrin group heme (to which oxygen and carbon dioxide molecules attach) and a protein, globin. Adult hemoglobin (HbA) contains two identical α-chains and two β-chains. The α-chain contains 141 amino acid residues and the β-chain 146; the α- and β-chains differ in many amino acid residues. The amino acid sequence of each globin chain is encoded by its own gene. The gene encoding the α-chain is located on the short arm of chromosome 16, and the β-gene on the short arm of chromosome 11. A change in the first or second nucleotide of a codon in the gene encoding the β-chain of hemoglobin almost always leads to the appearance of a new amino acid in the protein, disruption of hemoglobin function and severe consequences for the patient. For example, replacing "C" in one of the CAU (histidine) triplets with "U" produces a new triplet, UAU, encoding another amino acid, tyrosine. Phenotypically, this manifests itself as a serious illness: such a replacement of histidine by tyrosine at position 63 of the β-chain destabilizes hemoglobin, and methemoglobinemia develops. The replacement of glutamic acid by valine at position 6 of the β-chain as a result of mutation is the cause of a severe disease, sickle cell anemia. Let us not continue the sad list. We only note that when one of the first two nucleotides is replaced, the new amino acid may be similar in physicochemical properties to the previous one. Thus, replacing the second nucleotide in one of the triplets encoding glutamic acid (GAA) in the β-chain with "U" produces a new triplet (GUA) encoding valine, while replacing the first nucleotide with "A" forms the triplet AAA, encoding the amino acid lysine. Glutamic acid and lysine are similar in physicochemical properties: both are hydrophilic. Valine is a hydrophobic amino acid. Therefore, the replacement of hydrophilic glutamic acid with hydrophobic valine significantly changes the properties of hemoglobin, which ultimately leads to the development of sickle cell anemia, whereas the replacement of hydrophilic glutamic acid with hydrophilic lysine changes the function of hemoglobin to a lesser extent, and patients develop a mild form of anemia. When the third base is replaced, the new triplet may encode the same amino acid as the previous one. For example, if uracil is replaced by cytosine in the CAU triplet and a CAC triplet arises, practically no phenotypic changes will be detected in the person. This is understandable, because both triplets code for the same amino acid, histidine.
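The codon replacements discussed above can be classified mechanically by comparing what the old and new triplets encode. The Python sketch below is an illustration only: the function name `classify` and the labels "silent", "missense" and "nonsense" are the usual textbook terms, applied here as an assumption, and amino acids are in one-letter notation (H histidine, Y tyrosine, E glutamic acid, V valine, K lysine, `*` stop).

```python
BASES = "UCAG"
# Standard genetic code in one-letter notation, ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO))


def classify(before, after):
    """Classify a codon change by its effect on the encoded amino acid."""
    old, new = CODE[before], CODE[after]
    if old == new:
        return "silent"    # synonymous codon, same amino acid
    if new == "*":
        return "nonsense"  # sense codon turned into a stop codon
    return "missense"      # a different amino acid (or a lost stop codon)


examples = [
    ("CAU", "CAC"),  # third base changed
    ("CAU", "UAU"),  # first base changed
    ("GAA", "GUA"),  # second base changed
    ("GAA", "AAA"),  # first base changed
]
for before, after in examples:
    print(before, "->", after, ":", CODE[before], "->", CODE[after], classify(before, after))
```

The third-base change CAU to CAC is silent, while the other three examples each change the amino acid. Whether a missense change is mild or severe depends on the chemistry of the two amino acids, which this sketch does not model.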

In conclusion, it is appropriate to emphasize that the degeneracy of the genetic code and the degeneracy of the third base from a general biological position are protective mechanisms that are incorporated in evolution in the unique structure of DNA and RNA.

c. Unambiguity.

Each triplet (except for meaningless ones) encodes only one amino acid. Thus, in the direction of codon - amino acid, the genetic code is unambiguous, in the direction of amino acid - codon - it is ambiguous (degenerate).

codon -> amino acid: unambiguous; amino acid -> codon: degenerate.

And in this case, the need for unambiguity in the genetic code is obvious. In another variant, during the translation of the same codon, different amino acids would be inserted into the protein chain and, as a result, proteins with different primary structures and different functions would be formed. The cell's metabolism would switch to the "one gene - several polypeptides" mode of operation. It is clear that in such a situation the regulatory function of genes would be completely lost.

d. Polarity.

Information is read from DNA and from mRNA in one direction only. Polarity is important for determining higher-order structures (secondary, tertiary, etc.). Earlier we said that lower-order structures determine higher-order ones. Tertiary and higher-order structures in proteins and RNA form as soon as the synthesized RNA chain leaves the DNA molecule or the polypeptide chain leaves the ribosome. While the free end of the RNA or polypeptide acquires its tertiary structure, the other end of the chain is still being synthesized on the DNA (if RNA is being transcribed) or on the ribosome (if a polypeptide is being translated).

Therefore, the unidirectional reading of information (in the synthesis of RNA and protein) is essential not only for determining the sequence of nucleotides or amino acids in the synthesized substance, but also for the strict determination of secondary, tertiary and higher structures.

e. Non-overlapping.

The code may or may not overlap. In most organisms, the code is non-overlapping. An overlapping code has been found in some phages.

The essence of a non-overlapping code is that a nucleotide of one codon cannot at the same time be a nucleotide of another codon. If the code were overlapping, a sequence of seven nucleotides (GCUGCUG) could encode not two amino acids (alanine-alanine) (Fig. 34, A), as with a non-overlapping code, but three (if one nucleotide is shared) (Fig. 34, B) or five (if two nucleotides are shared) (see Fig. 34, C). In the last two cases, a mutation of a single nucleotide could disrupt two or three amino acids in the sequence.

However, it has been found that a mutation of one nucleotide always affects the inclusion of only one amino acid in a polypeptide. This is a significant argument in favor of a non-overlapping code.

Let us explain this with Figure 34. Bold lines show the triplets encoding amino acids for non-overlapping and overlapping codes. Experiments have unambiguously shown that the genetic code is non-overlapping. Without going into the details of the experiment, we note that if the third nucleotide, U (marked with an asterisk), in the nucleotide sequence (see Fig. 34) is replaced with some other nucleotide, then:

1. With a non-overlapping code, the protein controlled by this sequence would have a replacement in one (the first) amino acid (marked with asterisks).

2. With an overlapping code, in option B a replacement would occur in two amino acids (the first and second, marked with asterisks), and in option C the substitution would affect three amino acids (marked with asterisks).

However, numerous experiments have shown that when one nucleotide in DNA is altered, only one amino acid in the protein is ever affected, which is typical of a non-overlapping code.

Sequence: GCUGCUG

A (non-overlapping code): GCU GCU = Ala - Ala
B (overlapping code, one shared nucleotide): GCU UGC CUG = Ala - Cys - Leu
C (overlapping code, two shared nucleotides): GCU CUG UGC GCU CUG = Ala - Leu - Cys - Ala - Leu

Fig. 34. Scheme explaining the presence of a non-overlapping code in the genome (explanation in the text).
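The argument about the seven-nucleotide sequence GCUGCUG can be reproduced with a short Python sketch. It is an illustration under stated assumptions: the function names `codons` and `codons_hit` and the parameter `shared` (how many nucleotides neighbouring codons have in common) are introduced here and are not from the text.

```python
def codons(seq, shared=0):
    """Return (start, triplet) pairs when neighbouring codons share `shared` bases."""
    step = 3 - shared  # 3 = non-overlapping, 2 or 1 = overlapping
    return [(i, seq[i:i + 3]) for i in range(0, len(seq) - 2, step)]


def codons_hit(seq, position, shared=0):
    """Triplets that contain the nucleotide at 0-based `position`."""
    return [c for i, c in codons(seq, shared) if i <= position < i + 3]


SEQ = "GCUGCUG"
THIRD = 2  # 0-based index of the third nucleotide (U)

for shared in (0, 1, 2):
    triplets = [c for _, c in codons(SEQ, shared)]
    hit = codons_hit(SEQ, THIRD, shared)
    print(f"shared={shared}: {triplets}, codons containing the third base: {len(hit)}")
```

With no shared nucleotides the sequence yields two codons and the third base belongs to one of them. With one or two shared nucleotides it yields three or five codons, and the same base belongs to two or three of them, which is why a single substitution would then alter several amino acids.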

The non-overlapping of the genetic code is associated with another property: reading begins from a definite point, the initiation signal. In mRNA, this initiation signal is the methionine codon AUG.

It should be noted that humans nevertheless have a small number of genes that deviate from the general rule and overlap.

f. Compactness.

There are no punctuation marks between codons. In other words, the triplets are not separated from each other by, for example, a single meaningless nucleotide. The absence of "punctuation marks" in the genetic code has been proven experimentally.

g. Universality.

The code is the same for all organisms living on Earth. Direct proof of the universality of the genetic code was obtained by comparing DNA sequences with the corresponding protein sequences. It turned out that the same sets of code values are used in all bacterial and eukaryotic genomes. There are exceptions, but not many.

The first exceptions to the universality of the genetic code were found in the mitochondria of some animal species. They concerned the terminator codon UGA, which is read there in the same way as the UGG codon encoding the amino acid tryptophan. Other, rarer deviations from universality have also been found.
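A variant code of this kind can be modelled as the standard table with a single entry overridden. The Python sketch below makes several assumptions: it models only the UGA reassignment described above and no other mitochondrial differences, the mRNA is a hypothetical toy sequence, and the names `STANDARD`, `MITOCHONDRIAL_UGA` and `translate` are introduced here.

```python
BASES = "UCAG"
# Standard genetic code in one-letter notation, ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO))

# Assumption: only UGA differs; it is read as tryptophan (W), like UGG.
MITOCHONDRIAL_UGA = {**STANDARD, "UGA": "W"}


def translate(mrna, code):
    """Translate triplet by triplet with the given code table, stopping at '*'."""
    protein = ""
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = code[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein += amino_acid
    return protein


mrna = "AUGUGAGCUUAA"  # hypothetical: AUG UGA GCU UAA

print(translate(mrna, STANDARD))           # M
print(translate(mrna, MITOCHONDRIAL_UGA))  # MWA
```

The same message is cut short after the first amino acid under the standard code but read through to the final stop codon under the variant table.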

DNA code system.

The genetic code of DNA consists of 64 nucleotide triplets, called codons. Of these, 61 each code for one of the 20 amino acids used in protein synthesis, and three are stop signals. This gives the code some redundancy: most amino acids are encoded by more than one codon.
One codon performs two interrelated functions: it signals the beginning of translation and encodes the incorporation of the amino acid methionine (Met) into the growing polypeptide chain. The DNA code system is such that the genetic code can be expressed either as RNA codons or as DNA codons. RNA codons occur in messenger RNA (mRNA), and it is these codons that are read during the synthesis of polypeptides (a process called translation). Each mRNA molecule, in turn, acquires its nucleotide sequence by transcription from the corresponding gene.

All but two amino acids (Met and Trp) can be coded for by 2 to 6 different codons. However, the genomes of most organisms show that certain codons are favored over others. In humans, for example, alanine is encoded by GCC four times more often than by GCG. This probably indicates that the translation apparatus (e.g. the ribosome) handles some codons more efficiently than others.

The genetic code is almost universal. The same codons are assigned to the same amino acids, and the start and stop signals are overwhelmingly the same in animals, plants, and microorganisms. However, some exceptions have been found. Most of these involve assigning one or two of the three stop codons to an amino acid.