Wednesday, 19 March 2014

Human Genome Project

HUMAN GENOME PROJECT

The Human Genome Project (HGP) was an international, collaborative research project whose goal was the complete mapping and understanding of all the genes of human beings.
The hereditary material of all multi-cellular organisms is the famous double helix of deoxyribonucleic acid (DNA), which contains all of our genes. DNA, in turn, is made up of four chemical bases, pairs of which form the "rungs" of the twisted, ladder-shaped DNA molecules. All genes are made up of stretches of these four bases, arranged in different ways and in different lengths. HGP researchers have deciphered the human genome in three major ways: determining the order, or "sequence," of all the bases in our genome's DNA; making maps that show the locations of genes for major sections of all our chromosomes; and producing what are called linkage maps, complex versions of the type originated in early Drosophila research, through which inherited traits (such as those for genetic disease) can be tracked over generations.





DEOXYRIBONUCLEIC ACID (DNA) FROM THE BEGINNING

DNA was discovered as a major chemical of the nucleus at about the same time Mendel and Darwin published their work. However, during the early 1900s, proteins were considered better candidates as molecules able to transmit large amounts of hereditary information from generation to generation.
Although DNA was known to be a very large molecule, it seemed likely that its four chemical components were assembled in a monotonous pattern, like a synthetic polymer. Also, no specific cellular function had yet been found for DNA. Proteins, on the other hand, were important as enzymes and structural components of living cells. Proteins were also known to be polymers of numerous amino acids. These polymers are called polypeptides. Most importantly, the 20 amino acid "alphabet" of proteins potentially could be configured into more unique information-carrying structures than the four-letter alphabet of DNA.

DNA and proteins are key molecules of the cell nucleus.

This section demonstrates finding genes, finding functions and examining variation through the use of bioinformatics. Bioinformatics is the branch of biology that is concerned with the acquisition, storage, display and analysis of the information found in nucleic acid and protein sequence data.

Figure 1: DNA Sequences- three bases and stop codons

One of the most important aspects of bioinformatics is identifying genes within a long DNA sequence. Until the development of bioinformatics, the only way to locate genes along the chromosome was to study their behavior in the organism (in vivo) or isolate the DNA and study it in a test tube (in vitro). Bioinformatics allows scientists to make educated guesses about where genes are located simply by analyzing sequence data using a computer (in silico).
In principle, locating genes should be easy. DNA sequences that code for proteins begin with the three bases ATG, which code for the amino acid methionine, and they end with one or more stop codons: TAA, TAG or TGA. Unfortunately, finding genes isn't always so easy.
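The "easy in principle" idea can be sketched in a few lines of Python. This is only an illustration, not a real gene finder: the function name `find_simple_orfs` is made up for this example, and it looks at a single strand, reading left to right.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_simple_orfs(dna):
    """Return (start, end) index pairs for every ATG followed, in the
    same reading frame, by a stop codon. Illustrative sketch only."""
    dna = dna.upper()
    orfs = []
    for start in range(len(dna) - 2):
        if dna[start:start + 3] != "ATG":
            continue
        # Walk codon by codon from the start codon until a stop appears.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                orfs.append((start, pos + 3))
                break
    return orfs

print(find_simple_orfs("CCATGAAATAGGG"))  # [(2, 11)], i.e. ATGAAATAG
```

Every ATG is treated as a possible start here, which is one reason this naive approach reports many candidates that are not real genes.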

Figure 2: Sense Strand / Antisense Strand
Let's consider a DNA sequence that contains a gene of interest. The DNA strand that codes for the protein is called the sense strand because its sequence reads the same as that of the messenger RNA. The other strand is called the antisense strand and serves as the template for RNA polymerase during transcription.
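A small Python sketch can make the sense/antisense relationship concrete. The sequence and the function names are invented for illustration. The antisense strand is written base-for-base underneath the sense strand, and the mRNA is then built by pairing against that template.

```python
# DNA base -> its partner on the other DNA strand
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Template DNA base -> the RNA base that pairs with it
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}

def antisense_aligned(sense):
    """Antisense strand written base-for-base under the sense strand."""
    return "".join(DNA_PAIR[base] for base in sense.upper())

def transcribe_from_template(template):
    """Build the mRNA by pairing an RNA base with each template base."""
    return "".join(RNA_PAIR[base] for base in template)

sense = "ATGGCCTAA"                        # hypothetical sense strand
template = antisense_aligned(sense)        # TACCGGATT
mrna = transcribe_from_template(template)  # AUGGCCUAA

# The mRNA reads the same as the sense strand, with U in place of T.
print(mrna == sense.replace("T", "U"))     # True
```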




Figure 3: Open Reading Frame
A gene begins with a codon for the amino acid methionine and ends with one of three stop codons. The codons between the start and stop signals code for the various amino acids of the gene product but do not include any of the three stop codons. When examining an unknown DNA sequence, one indication that it may be part of a gene is the presence of an open reading frame, or ORF. An ORF is any stretch of DNA that, when transcribed into RNA, has no stop codon.
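As a rough sketch of that definition, the Python below splits a stretch of DNA into codons and checks whether any of them is a stop codon. The helper names `codons` and `is_open` are hypothetical, and the check is done on the DNA spelling of the stop codons (TAA, TAG, TGA) rather than on RNA.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def codons(dna, frame=0):
    """Split a DNA string into complete codons, starting at `frame`."""
    dna = dna.upper()
    return [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]

def is_open(dna, frame=0):
    """True if no codon in the chosen reading frame is a stop codon."""
    return all(codon not in STOP_CODONS for codon in codons(dna, frame))

print(codons("ATGGCCTAA"))            # ['ATG', 'GCC', 'TAA']
print(is_open("ATGGCCTAA"))           # False: TAA closes the frame
print(is_open("ATGGCCGAA"))           # True: no stop codon
print(is_open("ATGGCCTAA", frame=1))  # True: TGG, CCT
```

The last line shows that the same bases can be open in one reading frame and closed in another.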

Figure 4: Three Different Reading Frames
A computer program can be used to check an unknown DNA sequence for ORFs. The program transcribes each DNA strand into its complementary RNA sequence and then translates the RNA sequence into an amino acid sequence. Each DNA strand can be read in three different reading frames. This means that the computer must perform six different translations for any given double-stranded DNA sequence.
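Here is a minimal sketch of such a six-frame translation in Python. It takes two shortcuts that are assumptions of this example rather than anything described above: it looks up DNA codons directly instead of first writing out the RNA (the resulting amino acids are the same), and it expects only the letters A, C, G and T. The codon table is the standard genetic code, with `*` marking a stop codon.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """The opposite strand, written in its own reading direction."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna, frame):
    """Translate complete codons starting at offset `frame` (0, 1 or 2)."""
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3)
    )

def six_frame_translations(dna):
    """Map (strand, frame) to the translated amino acid string."""
    dna = dna.upper()
    strands = {"+": dna, "-": reverse_complement(dna)}
    return {
        (strand, frame): translate(seq, frame)
        for strand, seq in strands.items()
        for frame in range(3)
    }

for key, protein in six_frame_translations("ATGGCCTAA").items():
    print(key, protein)
```

A frame whose translation contains no `*` is open along its whole length.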

Figure 5: Regions of DNA sequence that might be part of genes
The presence of an ORF doesn't guarantee that the DNA sequence is part of a gene. We expect that, just by chance, there will be some long stretches of DNA that do not contain stop codons yet are not parts of genes. Likewise, codons for methionine do not always mark the start of a gene sequence. Methionine codons are also found within genes. Nevertheless, searching for ORFs identifies regions of the DNA sequence that might be parts of genes.
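To get a feel for how often a stop-free stretch turns up by chance, here is a back-of-the-envelope calculation in Python. It rests on a strong simplifying assumption that real genomes do not satisfy exactly: every base is equally likely and independent of its neighbors, so 3 of the 64 possible codons are stops.

```python
def prob_no_stop(n_codons, p_stop=3 / 64):
    """Chance that n random codons in a row contain no stop codon,
    assuming independent, equally likely bases (a simplification)."""
    return (1 - p_stop) ** n_codons

for n in (10, 50, 100, 300):
    print(n, prob_no_stop(n))
```

Under this model, short open stretches are common, while a run of 100 stop-free codons has a probability of under one percent. That is why long ORFs are treated as better gene candidates than short ones.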


Figure 6: Strands with 5' and 3'
A single RNA or DNA strand has a phosphate group at one end and a sugar (ribose for RNA and deoxyribose for DNA) at the other end. The end of the strand with the phosphate group is called the 5' end and the opposite end with the sugar is called the 3' end. In the double helix, the two strands run in opposite directions. That is, one strand runs in the 5' to 3' direction while the complementary strand runs in the 3' to 5' direction.



Figure 7: Transcription and Translation
The enzymes and ribosomes that carry out protein synthesis only work in one direction. During transcription, the mRNA is made in the 5' to 3' direction. During translation, the mRNA is read in the 5' to 3' direction. This means that a computer program looking for ORFs also must read each DNA strand in the 5' to 3' direction.
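In practice, reading the second strand in the 5' to 3' direction means taking the reverse complement of the first. The Python sketch below uses a made-up sequence to show the difference between the second strand as it sits under the first (3' to 5') and as a program should read it (5' to 3').

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(dna):
    """Partner bases in the same left-to-right order (reads 3' to 5')."""
    return dna.upper().translate(COMPLEMENT)

def reverse_complement(dna):
    """Partner strand reversed so that it reads 5' to 3'."""
    return complement(dna)[::-1]

top = "ATGGCCTAA"                        # hypothetical strand, 5' to 3'
bottom_3_to_5 = complement(top)          # TACCGGATT
bottom_5_to_3 = reverse_complement(top)  # TTAGGCCAT

print(bottom_3_to_5)
print(bottom_5_to_3)
```

Only the second form gives codons in the order the cell's machinery would meet them.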






Figure 8: Exons and Introns
It is easier to locate genes in bacterial DNA than in eukaryotic DNA. In bacteria, the genes are arranged like beads on a string. Each gene consists of a single ORF. The situation in eukaryotic organisms is complicated by the split nature of the genes. Most eukaryotic genes take the form of alternating exons and introns. Each exon is an ORF that codes for amino acids. The intron sequences do not code for amino acids and contain internal stop codons.
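The split structure can be illustrated with a toy example in Python. The exon and intron sequences and their coordinates are invented for this sketch. Real exon boundaries have to be predicted or measured, which is the hard part.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def splice(gene, exons):
    """Join the exon slices of a gene, dropping everything in between."""
    return "".join(gene[start:end] for start, end in exons)

def first_stop(dna):
    """Index of the first in-frame stop codon, or None if there is none."""
    for i in range(0, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return i
    return None

# Hypothetical gene: exon, intron, exon
exon1, intron, exon2 = "ATGGCC", "GTAAGTTAACAG", "GAATAA"
gene = exon1 + intron + exon2
exon_coords = [(0, 6), (18, 24)]

print(first_stop(gene))           # 12: a stop codon inside the intron
spliced = splice(gene, exon_coords)
print(spliced)                    # ATGGCCGAATAA
print(first_stop(spliced))        # 9: the real stop at the end
```

Reading the unspliced gene straight through hits a stop codon inside the intron, which is why a simple ORF search works less well on eukaryotic DNA.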


Figure 9: Alternative Splicing
One of the surprises of the Human Genome Project was the relatively small number of genes found - about 25,000. One might ask, "How can something as complicated as a human have only 25 percent more genes than the tiny roundworm C. elegans?" Part of the answer seems to involve alternative splicing. Alternative splicing refers to the process by which the transcript of a given gene is spliced into more than one type of mRNA molecule.
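One simple way to picture this is exon skipping, where internal exons may be left out of the final mRNA. The Python sketch below is a deliberately simplified model with made-up exon sequences: it always keeps the first and last exon and includes any in-order subset of the exons in between. Real splicing is regulated and does not produce every combination.

```python
from itertools import combinations

def splice_variants(exons):
    """All mRNAs obtained by keeping the first and last exon and any
    in-order subset of the internal exons (simplified exon skipping)."""
    first, *internal, last = exons
    variants = []
    for k in range(len(internal) + 1):
        for chosen in combinations(internal, k):
            variants.append("".join([first, *chosen, last]))
    return variants

# Hypothetical gene with four exons
exons = ["ATGGCC", "AAA", "GGG", "TAA"]
for mrna in splice_variants(exons):
    print(mrna)
```

With two optional exons this toy gene already yields four different mRNAs, which hints at how a modest number of genes can give rise to a much larger number of products.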




BENEFITS
It is anticipated that the work on interpreting genome data, and the detailed knowledge of the human genome it yields, will provide new avenues for advances in medicine and biotechnology. For example, a number of companies, such as Myriad Genetics, started offering easy ways to administer genetic tests that can show predisposition to a variety of illnesses, including breast cancer, hemostasis disorders, cystic fibrosis, liver diseases and many others. Also, the etiologies for cancers, Alzheimer's disease and other areas of clinical interest are considered likely to benefit from genome information, which may lead in the long term to significant advances in their management.
There are also many benefits for biological scientists. For example, a researcher investigating a certain form of cancer may have narrowed down their search to a particular gene. By visiting the human genome database on the World Wide Web, this researcher can examine what other scientists have written about this gene, including (potentially) the three-dimensional structure of its product, its function(s), its evolutionary relationships to other human genes or to genes in mice, yeast or fruit flies, possible detrimental mutations, interactions with other genes, body tissues in which this gene is activated, diseases associated with this gene, and other types of data. Further, a deeper understanding of disease processes at the level of molecular biology may suggest new therapeutic procedures.
The analysis of similarities between DNA sequences from different organisms is also opening new avenues in the study of evolution. In many cases, evolutionary questions can now be framed in terms of molecular biology; indeed, many major evolutionary milestones (the emergence of the ribosome and organelles, the development of embryos with body plans, the vertebrate immune system) can be related to the molecular level.









