Bio.203-21.Genomes
Molecular Biology Ch. 21 - Genomes and their Evolution
| Question | Answer |
|---|---|
| Genomics | The study of whole sets of genes and their interactions |
| Bioinformatics | The application of computational methods to the storage and analysis of biological data |
| Human Genome Project | Began in 1990. Organized by an international consortium of scientists at universities and research institutions, it involved 20 large sequencing centers in six countries. Sequencing was largely completed in 2003. |
| Fluorescence in situ hybridization (FISH) | A method in which fluorescently labeled nucleic acid probes are allowed to hybridize to an immobilized array of whole chromosomes. Cytogenetic maps based on this method provided the starting point for more detailed mapping of the human genome. |
| With the cytogenetic maps provided by FISH analysis, the initial stage in sequencing the human genome was to construct... | ...a linkage map of several thousand genetic markers spaced throughout the chromosomes. The order of the markers and the relative distances between them on such a map are based on recombination frequencies. The markers can be genes, RFLPs, STRs, etc. |
| The next stage of the Human Genome Project was... | ...the physical mapping of the human genome. |
| Physical map | In a physical map the distances between markers are expressed by some physical measure, usually the number of base pairs along the DNA. |
| The ultimate goal in mapping any genome | to determine the complete nucleotide sequence of each chromosome. |
| What method was used to sequence the human genome during the Human Genome Project? | Sequencing machines using the dideoxy chain termination method. |
| How fast could each Human Genome Project research center sequence DNA by the year 2000? | 1,000 bp/s |
| J. Craig Venter's approach to sequencing genomes | The whole-genome shotgun approach. It essentially skips the linkage mapping and physical mapping stages, and starts directly with the sequencing of DNA fragments from randomly cut DNA. Powerful computer programs then assemble the resulting short sequences |
| Overview of the whole-genome shotgun approach step: (1) | Cut the DNA from many copies of an entire chromosome into overlapping fragments short enough for sequencing. |
| Overview of the whole-genome shotgun approach step: (2) | Clone the fragments in plasmid or phage vectors. |
| Overview of the whole-genome shotgun approach step: (3) | Sequence each fragment |
| Overview of the whole-genome shotgun approach step: (4) | Order the sequences into one overall sequence with powerful computer software |
| Craig Venter's involvement in the Human Genome Project | In 1998 Venter set up Celera Genomics and declared his intention to sequence the entire human genome. Five years later, Celera announced completion of the sequencing. |
| Whole-genome shotgun approach vs. sequencing by synthesis | Sequencing by synthesis is more modern and faster. In this technique many very small fragments (fewer than 100 bp) are sequenced at the same time, and computer software rapidly assembles the complete sequence. |
| The advantage of sequencing by synthesis | Because of the sensitivity of these techniques, the fragments can be sequenced directly; the cloning step is unnecessary. |
| Advances in cost efficiency | Sequencing the first human genome took 13 years and cost $100 million. In 2007 James Watson's genome was sequenced in 4 months for about $1 million. In 2010 researchers sequenced three human genomes for $4,400 each. |
| Metagenomics | The collection and sequencing of DNA from a group of species, usually an environmental sample of microorganisms. Computer software sorts partial sequences and assembles them into genome sequences of individual species making up the sample. |
| National Center for Biotechnology Information (NCBI) | A joint effort established by the National Library of Medicine and the NIH, maintains a website (www.ncbi.nlm.nih.gov) with extensive bioinformatics resources. |
| GenBank | NCBI's database of sequences. As of May 2010, it included the sequences of 119 million fragments, with a total of 114 billion base pairs. |
| BLAST | Software (available on the NCBI website) that allows the visitor to compare a DNA sequence with every sequence in GenBank, base by base, to look for similar regions. |
| Protein Data Bank | A database of all three-dimensional protein structures that have been determined (www.wwpdb.org). |
| Reverse genetics | Studying genes directly, without having to infer genotype from phenotype. |
| Gene annotation | Analysis of genomic sequences to identify protein-coding genes and determine the function of their products. This process is now largely automated. |
| Gene annotation approach | Use software to scan the stored sequences for transcriptional and translational start/stop signals, for RNA-splicing sites, etc. Also look for short sequences that specify known mRNAs, called expressed sequence tags (ESTs). |
| How is protein function deduced when a newly identified gene is unlike any gene of known function? | Through a combination of biochemical and functional studies. The biochemical approach determines the 3D structure of the protein, its binding sites for other molecules, etc. The functional approach blocks or disables the gene and then observes the effect on the phenotype. |
| ENCODE | The Encyclopedia of DNA Elements; a program started in 2003 that focused intensively on a specific 1% of the human genome. Researchers found that about 90% of the DNA in the region studied was transcribed into RNA, although only about 2% codes for proteins. |
| Proteomics | The systematic study of the full protein sets (proteomes) encoded by genomes. The approach was encouraged by the success in sequencing genomes and studying entire sets of genes. |
| Systems biology approach | An approach to studying biology that aims to model the dynamic behavior of whole biological systems. |
| By early 2010, how many genomes had been sequenced? How many were in progress? | Over 1,200, including about 1,000 bacteria, 80 archaea, and 124 eukaryotes. Another 5,500 genomes and over 200 metagenomes were in progress. |
| Genome size: genomes of most bacteria and archaea range from | 1 to 6 million base pairs (genomes of eukaryotes are usually larger) |
| Genome size: plants and animals | Most plants and animals have genomes greater than 100 Mb; humans have 3,200 Mb |
| Relationship between genome size and phenotype | Within each domain there is no systematic relationship between genome size and phenotype |
| The amount (total picograms) of DNA in a genome is often referred to as... | ...the C-value |
| Correlation between number of genes and genome size? | The number of genes is not correlated with genome size |
| As the amount of DNA in a eukaryotic cell increases... | ...the increase is primarily in the amount of noncoding DNA within the genome. |
| Is the number of genes in the genome an accurate representation of the number of proteins produced? | It may not be, due to alternative splicing. |
| Breakdown of the types of DNA sequences in the human genome | Exons: 1.5%; introns and regulatory sequences: 24%; unique noncoding DNA: 15%; repetitive DNA including transposable elements: 44% (of which Alu elements: 10%); repetitive DNA not related to transposable elements: 15% (of which simple sequence DNA: 3%) |
| “Introns and regulatory sequences (24%)” What does that include? | Introns, promoters, proximal/distal control elements, possibly some mRNA |
| “Repetitive DNA including transposable elements: 44%” What does that include? | Transposons and retrotransposons, also known as “jumping genes”; roughly 100 to 1,000 bp each. |
| Transposable elements | Stretches of DNA that can move from one location to another within the genome. During transposition, a transposable element moves from one site in a cell’s DNA to another by a type of recombination process. |
| Why is the name “jumping gene” not entirely appropriate for transposable elements? | Because they never completely detach from the DNA. Instead, the original and new DNA sites are brought together by enzymes and other proteins that bend the DNA |
| Transposons | A transposable element that moves within a genome by means of a DNA intermediate. Transposons can move by a “cut-and-paste” mechanism, which removes the element from the original site, or by a “copy and paste” mechanism. |
| Retrotransposon | A transposable element that moves within a genome by means of an RNA intermediate, a transcript of the retrotransposon DNA. They always leave a copy at the original site. |
| Transposase | An enzyme that binds to the ends of a transposon and catalyzes the movement of the transposon to another part of the genome by a cut-and-paste mechanism or a replicative transposition mechanism. |
| Where did the first evidence for wandering DNA segments come from? | From geneticist Barbara McClintock’s breeding experiments with Indian corn in the 1940s. |
| Pseudogenes | Former genes that have accumulated mutations over a long time and no longer produce functional proteins. |
| Repetitive DNA | Consists of sequences that are present in multiple copies in the genome. |
| About 75% of repetitive DNA is made up of | Transposable elements and sequences related to them |
| “Alu elements 10%” What does that include? | A type of retrotransposon, about 300 bp long, containing a site recognized by the AluI restriction enzyme. |
| “Simple Sequence DNA 3%” What does that include? | Other repeating sequences that probably arose from mistakes during DNA replication, i.e. duplications. They vary greatly in size. The smallest are STRs (short tandem repeats), which are used in criminal investigations. Simple sequence DNA is often found in telomeres and centromeres. |
| VNTRs | Certain regions of the human genome are highly variable, made up of repeating segments of DNA (VNTRs – variable number of tandem repeats). |
| “Unique noncoding DNA (15%)” What does this include? | DNA coding for various types of miRNA and for RNA of unknown function, as well as DNA that appears not to be transcribed at all. |
| “Exons (regions of genes coding for protein, rRNA, tRNA) (1.5%)” What does this include? | The coding regions of DNA. Less than half consists of unique solitary genes; the rest is primarily composed of multigene families. |
| Multigene families | Collections of two or more identical or similar genes |
| Transcription rates | RNA polymerase I transcribes about 15 nt/s, corresponding to about 10 rRNAs/min/gene. At 1,440 min/day, that is about 14,000 rRNAs/gene/day. Multigene families of rRNA genes are needed to produce enough ribosomes to sustain the organism. |
| The classic examples of Multigene families of nonidentical genes are… | …two related families of genes that encode globins. The alpha-Globin gene family and beta-Globin gene family on chromosomes 16 and 11 respectively. These code for two subunits of hemoglobin |
| The genes encoding the various globin proteins evolved from… | …one common ancestral globin gene, which duplicated and diverged |
| Evolution of DNA | The eukaryotic genome has evolved (changed over time) from a type of prokaryotic genome. In the process, both the total amount of DNA within a single cell and its composition have changed, with the total amount increasing. |
| Mechanisms that have helped to increase the total amount of DNA: | Non-disjunction in meiosis, alteration of chromosomes (e.g. duplication and translocation), Unequal crossing over events. |
| Evo-devo | Evolutionary developmental biology; a field of biology that compares developmental processes of different multicellular organisms to understand how these processes have evolved and how changes can modify existing organismal features or lead to new ones. |
| homeobox | A 180-nucleotide sequence within homeotic genes and some other developmental genes that is widely conserved in animals. Related sequences occur in plants and yeasts. |
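
Step (4) of the whole-genome shotgun overview in the table (ordering overlapping fragment sequences into one overall sequence) can be illustrated with a toy example. The Python sketch below is only an illustration under simplifying assumptions: reads are error-free, all come from the same strand, and the sequence contains no long repeats. The function names `overlap`, `drop_contained`, and `greedy_assemble`, the `min_len` parameter, and the example reads are all hypothetical. The greedy merging strategy is a textbook simplification, not the software used by Celera or the Human Genome Project.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when the best overlap is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def drop_contained(reads: list[str]) -> list[str]:
    """Remove exact duplicates and reads that lie entirely inside another read."""
    unique = list(dict.fromkeys(reads))
    return [r for r in unique if not any(r != o and r in o for o in unique)]


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of reads with the longest overlap.

    Stops when no pair overlaps by at least `min_len` bases, so the result
    may be several contigs rather than one sequence.
    """
    reads = drop_contained(reads)
    while len(reads) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: leave the contigs separate
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
        reads = drop_contained(reads)
    return reads


# Three made-up overlapping fragments of the sequence ATGCGTACGTTAGC
fragments = ["CGTTAGC", "ATGCGTAC", "GTACGTTA"]
print(greedy_assemble(fragments))  # ['ATGCGTACGTTAGC']
```

Real genomes contain large amounts of repetitive DNA, which can make overlaps ambiguous; this is one reason a simple greedy merge like the one sketched here is not sufficient for actual genome assembly.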