CHAPTER 9 GENETIC PRINCIPLES AND MOLECULAR BIOLOGY
The Pattern of Inheritance
X Linkage and X Inactivation
The Family History
DNA and the Genetic Code
Regulation of Gene Expression
Normal Regulatory Processes
Experimental Interference with Gene Expression
The Methods of Molecular Biology
The Polymerase Chain Reaction
Cutting DNA with Restriction Endonucleases
Detecting Mutations in Individual Patients
Types of Mutations
The understanding of hematology is more than ever dependent upon an appreciation of genetic principles and the tools that can be used to study genetic variation. All of the genetic information that makes up an organism is encoded in the DNA. This information is transcribed into RNA, and then the triplet code of the RNA is translated into protein. Mutations that change the DNA code, either present in the germline or acquired after birth, can cause a variety of hematologic disorders. A variety of changes in DNA occur, including single base changes, deletions, and insertions. The detection of defined mutations that cause a variety of diseases is now possible and has become a routine method for the diagnosis of some disorders, particularly prenatally. Inheritance patterns depend upon the characteristics of the disorder and the chromosomal location of the mutation. Common autosomal recessive hematologic diseases include sickle cell disease, the thalassemias, and Gaucher disease. Hereditary spherocytosis, thrombophilia due to factor V Leiden, most forms of von Willebrand disease, and acute intermittent porphyria are characterized by autosomal dominant inheritance. Mutations that cause glucose-6-phosphate dehydrogenase deficiency, hemophilia A and B, and the most common form of chronic granulomatous disease are all carried on the X chromosome and therefore manifest sex-linked inheritance, with transmission of the disease state from mother to son. Understanding of the genetics of a disorder is necessary for accurate genetic counseling.
Acronyms and abbreviations that appear in this chapter include: BAC, bacterial artificial chromosome; G-6-PD, glucose-6-phosphate dehydrogenase; PAC, P1-derived artificial chromosome; PCR, polymerase chain reaction; poly(A), polyriboadenylic acid; RFLP, restriction fragment length polymorphism; RT, reverse transcription; YAC, yeast artificial chromosome.
Many of the hematologic diseases described in this text have a genetic basis. Often the disease is caused by a mutation in a single gene. Some of these disorders, such as sickle cell disease (Chap. 47), thalassemia (Chap. 46), glucose-6-phosphate dehydrogenase deficiency (Chap. 45), and factor V Leiden (Chap. 127) are extremely common. Others, such as congenital dyserythropoietic anemia type I (Chap. 35), chronic granulomatous disease (Chap. 72), or afibrinogenemia (Chap. 124) are rare, but all are due to mutations in a gene that results in the formation of a defective protein or an insufficient amount of a normal protein. The principal focus of this chapter is such genetic disorders. However, a number of acquired hematologic diseases, including lymphomas, leukemias, and paroxysmal nocturnal hemoglobinuria, are now understood to result from damage to the genetic apparatus that is not inherited but rather occurs in a cell at some time during the lifetime of the patient. Understanding these diseases requires an appreciation of how the genetic apparatus functions.
All of the information required for the development of a complete adult organism is encoded in the DNA of a single cell, the zygote. This information, designated the genome, includes the data needed for the synthesis of all enzymes; of all the plasma proteins, including clotting factors, complement components, and transport proteins; of all the membrane proteins, including receptors; and of all of the cytoskeletal proteins. The units of information into which the genome is organized are the genes. Some of these genes direct the formation of ribosomal RNA and of proteins that regulate the function of genes. The remainder encode the proteins involved in the structure and function of the body. Genetic diseases are the result of changes, or mutations, in these genes.
THE PATTERN OF INHERITANCE
The inheritance of each genetic disease follows a distinctive pattern. The concept of dominant and recessive inheritance is one of the most deeply ingrained in our genetic thinking. It has long played a primary role in the introduction of every high school student of biology to genetics and is used extensively in the classification of genetic disease. A dominant disease is one that is expressed when the patient has only a single copy of the mutant gene, i.e., in the heterozygous state. A recessive disease, on the other hand, is expressed only when both copies of the gene are abnormal. If the mutations on both alleles are the same, as is the case with some very common diseases, and with some less common diseases when the parents are related, then the patient is said to be homozygous. If two different abnormal alleles have been inherited, then the patient is designated as being a compound heterozygote (or less accurately, a mixed heterozygote or double heterozygote). It is often implied that genes are dominant or recessive. This is incorrect. It is disease states, or phenotypes, that are dominant or recessive. The gene for sickle cell hemoglobin is expressed in the heterozygous state, so that the carrier of this gene has sickle cell trait. Sickle cell trait is therefore dominant, but sickle cell disease, which occurs in the homozygote, is recessive. By definition, the heterozygous phenotype of a recessive disease does not differ from the homozygous normal state, but it can usually be identified by biochemical means.
X LINKAGE AND X INACTIVATION
The principles of dominant and recessive disease can be readily applied to mutations occurring on the autosomes (chromosomes other than the sex chromosomes), but the situation is somewhat different in the case of genes on the X chromosome. Although the X chromosome is involved, at least indirectly, in the sex determination process, most of the genes on the X chromosome have nothing whatsoever to do with sex determination. Hematologically, some of the more important of these “sex-linked” genes include those that code for G-6-PD, phosphoglycerate kinase, factor VIII, and factor IX, as well as the genes that are mutated in Bruton-type agammaglobulinemia and in one form of chronic granulomatous disease.
The chromosomal complement of males differs from that of females in that males have one X chromosome and one Y chromosome, while females have two X chromosomes. However, early in embryonic development one of the two X chromosomes of somatic cells of female mammals becomes genetically inactive: in some cells the paternally derived chromosome is inactivated; in others, the maternally derived chromosome is inactivated.1,2 Inactivation remains fixed, so that all the progeny of the cell in which the maternally derived X chromosome is inactive show only the gene products from the paternal X. Female heterozygotes for sex-linked genes such as G-6-PD deficiency, phosphoglycerate kinase deficiency, factor VIII deficiency, or factor IX deficiency are therefore a mosaic of cells, some of which manifest the full-blown deficiency, as it is found in affected males, and some of which are normal. The final proportion of cells with one or the other X chromosome active depends upon random factors, i.e., the binomial probability distribution, and on selection between cell populations, which may occur following the inactivation process.3,4 The process of X inactivation is not only useful in understanding the expression of X-linked diseases in women but has been valuable in studying the possible clonal origin of a variety of disorders. As shown in Fig. 9-1, the progeny of a single cell of a female heterozygous for an X-linked gene will manifest only the phenotype of the original cell. Examination of electrophoretically distinguishable variants of G-6-PD has made it possible to demonstrate in this way that the red cells are a clone in chronic myelogenous leukemia,5 in paroxysmal nocturnal hemoglobinuria,6 and probably in acute myelogenous leukemia.7,8 This indicates that each of these disorders arises through transformation of a single cell and that in the case of the leukemias erythroid cells as well as leukocytes are part of the malignant clone.
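The contribution of chance to the final proportion of cells can be illustrated with a short binomial calculation. The sketch below is only an illustration under stated assumptions: the number of precursor cells present when inactivation occurs (`n_cells`) is a hypothetical parameter, each cell is assumed to inactivate either X with probability 0.5, and selection between cell populations after inactivation is ignored.

```python
from math import comb

def xm_active_distribution(n_cells, p_xm=0.5):
    """Binomial probability that k of n_cells precursor cells keep the maternal X active.

    n_cells is a hypothetical pool size; p_xm = 0.5 assumes random inactivation.
    Returns a list indexed by k = 0 .. n_cells.
    """
    return [comb(n_cells, k) * p_xm**k * (1 - p_xm)**(n_cells - k)
            for k in range(n_cells + 1)]

def prob_skewed(n_cells, threshold=0.75):
    """Probability that at least `threshold` of the cells have the same X active.

    Skewing toward either the maternal or the paternal X is counted.
    """
    dist = xm_active_distribution(n_cells)
    return sum(p for k, p in enumerate(dist)
               if max(k, n_cells - k) / n_cells >= threshold)

if __name__ == "__main__":
    for n in (4, 16, 64):
        print(n, round(prob_skewed(n), 4))
```

Under these assumptions, the smaller the precursor pool at the time of inactivation, the more likely it is that a heterozygous female will by chance have a markedly unequal proportion of the two cell populations.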
FIGURE 9-1 At fertilization, the female zygote inherits one maternal X chromosome (Xm) and one paternal X chromosome (Xp). At some time early in embryogenesis, one X in each cell is inactivated at random and condenses to form the Barr body. The active X remains active not only for the lifetime of that cell but for the lifetime of all of its progeny. A tumor with a clonal origin will consist entirely of cells in all of which either Xm or Xp is active. A tumor with a multicentric origin may contain both Xm and Xp cells.
With the development of DNA-based technology it has been possible to use X-linked genes as a clonal marker even when there is not a different protein product from the two alleles. A different pattern of methylation of cytidines distinguishes the active from the inactive X-chromosome.9 This fact, together with the existence of restriction endonucleases that distinguish methylated from unmethylated cytidine, has made it possible to utilize restriction fragment length polymorphisms to determine the clonal origin of neoplasms,10,11 even when no polymorphism involving an X-linked enzyme is available. The existence of polymorphisms involving the coding region of genes also makes possible the detection of clones by reverse transcription and amplification of mRNA.12,13
The pattern of genetic transmission of sex-linked genes is characteristic: a father cannot transmit a sex-linked gene to his son, since the offspring is a boy by virtue of the fact that he inherited the father’s Y chromosome, not his X chromosome. Conversely, it is a truism that males always inherit sex-linked genes from their mother and that the mother must therefore be either heterozygous or homozygous for the gene. Because X inactivation is random, however, the degree of expression of X-linked genes in females is highly variable. This is why, even with the most sophisticated methodology, it is not always possible to detect the heterozygous state in the mother of an affected individual. It also explains why even identical twin carriers of diseases such as factor VIII deficiency can have very different levels of the clotting factor.
The vast majority of the genetic material in cells is encoded in the chromosomal DNA. However, mitochondria have their own replicating DNA. Apparently having arisen from symbiotic bacteria over a billion years ago, mitochondrial DNA (mtDNA) exists as a closed circular molecule of 16,569 nucleotides. This DNA encodes 13 polypeptides, all of which are subunits of the mitochondrial energy-producing pathway; a small and a large ribosomal RNA; and 22 transfer RNAs.14 Some proteins found in mitochondria are, however, encoded in nuclear DNA. Mitochondria are transmitted through the egg; thus, inheritance is entirely maternal.15,16 Cells contain several hundred mitochondria, each with its own circle of mtDNA. To become clinically significant, mitochondrial mutations must confer some selective advantage upon the mitochondrion with the mutation; mutations that affect only a few of the hundreds of mitochondria in each cell are unlikely to produce a phenotype. Mitochondrial mutations, often consisting of deletions, are responsible for a number of neurologic diseases.15 Some of the childhood myelodysplastic syndromes,17,18 particularly Pearson marrow-pancreas syndrome,19 are hematologic manifestations of mitochondrial mutations.
THE FAMILY HISTORY
A carefully taken family history can give a physician considerable insight into the nature of a hematologic disorder. One should ascertain whether another member of the family has had a similar disease. In the case of patients with anemia, this is often difficult, since so many women have a history of anemia, usually due to iron deficiency. To estimate the severity of anemia it is particularly germane to inquire whether transfusion was required. A history of gallstones, particularly at an early age, may indicate that a hemolytic disorder was present. Similarly, episodes of jaundice in family members may be the only clue to the existence of familial hemolytic anemia.
Presence of the disease in one of the parents strongly suggests a dominant mode of transmission. If neither parent is affected, but several siblings have the disease, an autosomal recessive transmission is more likely. Consanguinity of the patient’s parents makes it highly probable that a disease is an autosomal recessive disorder. Occurrence primarily in male siblings and maternal uncles, with mild or absent manifestations of the disease in the mother, suggests a sex-linked mode of inheritance. Father-to-son transmission rules out sex linkage.
Lack of any family history does not rule out the genetic basis of a disease. In some instances the disease may be so mild in other family members that it is not recognized. Whenever possible, the physician should examine the family members, rather than relying solely on history. In some instances, of course, the gene mutation causing the disorder may have arisen in the generation in which the disease presents.
Once the mode of genetic transmission is clear, the diagnostic alternatives have been narrowed considerably. For example, methemoglobinemia transmitted as an autosomal dominant disorder is due to hemoglobin M, while methemoglobinemia transmitted as an autosomal recessive disorder is due to NADH diaphorase deficiency. Hemolytic anemia with autosomal dominant transmission is likely to be due to hereditary spherocytosis, but sex-linked transmission of the hemolytic state suggests a deficiency of G-6-PD or, more rarely, phosphoglycerate kinase. A bleeding disorder that is transmitted in a sex-linked fashion may be due to a deficiency of factor VIII or factor IX, but autosomal recessive inheritance should suggest to the physician a deficiency of other clotting factors, such as X, XI, or V. Careful analysis of the family history not only will make possible more appropriate genetic counseling to the patient and family but also will shorten the road to a correct diagnosis.
In human somatic cells chromosomes are present in pairs—one pair of sex chromosomes (two X chromosomes in females and an X and a Y in males) and 22 pairs of autosomes. One chromosome of each pair is distributed into the gametes, so that eggs and sperm of humans each contain 23 chromosomes.
If two genes are located on different chromosomes or are far apart on the same chromosome they are said to be unlinked: the offspring of a carrier of these two genes has one chance in two of inheriting either of the genes, and the probability of inheriting one or the other, both, or neither is governed by the laws of chance. For example, if a woman is a carrier of pyruvate kinase deficiency and of sickle cell trait, two genes that are on different autosomes, the probabilities of inheritance of pyruvate kinase deficiency, on the one hand, and sickle cell trait, on the other, are entirely independent. One-fourth the offspring will inherit both pyruvate kinase deficiency and sickle cell trait, one-fourth the offspring will inherit neither, one-fourth will inherit only sickle cell trait, and one-fourth will inherit only pyruvate kinase deficiency.
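The one-fourth figures in the example above follow from multiplying two independent probabilities of one-half. A minimal sketch of that arithmetic is given here; the function name and the dictionary layout are illustrative choices, not part of any standard package.

```python
from fractions import Fraction
from itertools import product

def unlinked_offspring(p_first=Fraction(1, 2), p_second=Fraction(1, 2)):
    """Joint probabilities for a child of a carrier of two unlinked genes.

    Keys are (inherits_first_gene, inherits_second_gene). Because the genes
    assort independently, each joint probability is a simple product.
    """
    outcomes = {}
    for has_first, has_second in product([True, False], repeat=2):
        p1 = p_first if has_first else 1 - p_first
        p2 = p_second if has_second else 1 - p_second
        outcomes[(has_first, has_second)] = p1 * p2
    return outcomes

if __name__ == "__main__":
    labels = {True: "inherited", False: "not inherited"}
    # First gene: pyruvate kinase deficiency; second gene: sickle cell trait.
    for (pk, hbs), p in unlinked_offspring().items():
        print(f"PK deficiency {labels[pk]}, sickle trait {labels[hbs]}: {p}")
```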
If the two genes in question are close together on the same chromosome, however, the situation may be quite different. For example, the genes for hemophilia A and for G-6-PD deficiency are both sex-linked. If a woman carries both these genes on one of her X chromosomes, the probability of her child’s inheriting either both of the abnormal genes or neither of the abnormal genes is much greater than the probability of its inheriting one or the other. Yet the inheritance of only one of these two genes is not an impossibility, because of the phenomenon of crossing-over during meiosis. In the course of the formation of germ cells, homologous pairs of chromosomes come into side-by-side apposition and regularly exchange chromosomal material. Thus, two genes that were originally on the same X chromosome may find themselves on separate chromosomes after germ cell formation (Fig. 9-2). The probability of their being separated during meiosis is a function of their distance from one another on the chromosome, and this distance is expressed in terms of map units, or morgans. One-hundredth of a morgan, a centimorgan, represents the genetic distance that gives a one percent probability per generation of a crossover between the two genes. A rule of thumb is that this corresponds to a physical distance of 1,000,000 base pairs, but the actual physical distance represented by a centimorgan varies a great deal from one location in the genome to another; the tendency to cross over varies greatly from place to place. It is not unusual for genes on the same chromosome to be so far apart that the probability of finding them in separate germ cells is just as great as though they had been on separate chromosomes. For this reason, genes on the same chromosome may be linked but may also be unlinked; in the latter case they are referred to as syntenic. 
G-6-PD and hemophilia A are both on the X chromosome, with a map distance estimated at approximately 0.04 morgans, or 4 centimorgans.20 Therefore, if mutant genes at these two loci are on the same X chromosome in a female, there is a four percent chance of the genes being in separate gametes. The genes for both G-6-PD and the Xg blood group are also on the X chromosome, but are apparently unlinked.21
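The effect of map distance on the gametes of a woman who carries both mutant genes on the same X chromosome can be sketched as follows. This is an illustration only: it assumes that the map distance can be used directly as the recombination fraction, an approximation that is reasonable only for short distances, and the function name and labels are hypothetical.

```python
from fractions import Fraction

def gamete_probabilities(recombination_fraction):
    """Gamete classes for a woman with both mutant genes on one X chromosome.

    recombination_fraction: chance per meiosis of a crossover between the
    two loci (0.04 for a 4-centimorgan distance, 0.5 for unlinked genes).
    """
    r = Fraction(recombination_fraction)
    return {
        "both mutant genes": (1 - r) / 2,       # parental chromosome
        "neither mutant gene": (1 - r) / 2,     # the other parental chromosome
        "only G-6-PD deficiency": r / 2,        # recombinant
        "only hemophilia A": r / 2,             # recombinant
    }

if __name__ == "__main__":
    for name, p in gamete_probabilities(Fraction(4, 100)).items():
        print(f"{name}: {float(p):.2f}")
```

Setting the recombination fraction to one-half gives four equally likely classes, which is the situation described above for genes that lie on the same chromosome but are so far apart that they behave as though unlinked.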
FIGURE 9-2 Schematic representation of equal crossing-over during meiosis. There has been an exchange of chromosomal material between the maternally derived and paternally derived chromosome, but all genes are represented on the products of the crossover.
DNA AND THE GENETIC CODE
Understanding how the massive amount of information required to allow a complex organism to grow and survive is coded has been one of the major advances of modern biology. The information is all contained in a polynucleotide, deoxyribonucleic acid (DNA). DNA contains only four different bases—adenine (A), guanine (G), thymine (T), and cytosine (C). DNA exists as a double helix in which A is always paired with T, and G is always paired with C.
The two ends of a strand of DNA are not the same. The nucleosides that make up each strand are linked to each other through a molecule of phosphoric acid attached to the 3′ carbon of the deoxyribose of one nucleoside and to the 5′ carbon of the next one. A linear strand of DNA thus has one end in which the hydroxyl group attached to the 5′ carbon is free; at the other end it is the hydroxyl group attached to the 3′ carbon that is not involved in a link. These ends are designated the 5′ and 3′ ends respectively, and by convention the 5′ end is drawn at the left and is called the “upstream” end. The 3′ end, then, is designated as “downstream”. In the pairing of two complementary strands of DNA the polarity of the two strands is opposite, i.e., the 5′ end of each strand is paired with the 3′ end of the other. By convention, the strand shown at the top is the coding, or “sense” strand, but the strand at the bottom is the one that actually serves as a template for RNA synthesis. Thus, the sequence of the mRNA is that of the top strand, and the triplet code may be read from this strand.
It is the faithful pairing of A with T and C with G in double-stranded DNA that makes possible the accurate replication of the genetic code. When cells divide, the two DNA strands separate. As this occurs the bases of the separate strands pair with the complementary purine or pyrimidine nucleotide, which become linked to each other, forming a complementary strand of nucleotides. In this way the cell forms two double strands that are identical with the original double strand.
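The pairing rules and the opposite polarity of the two strands can be expressed compactly in code. The following sketch represents each strand as a string written 5′ to 3′; the example sequence and the function names are arbitrary illustrations, and the many enzymes that carry out replication in the cell are not modeled.

```python
# Watson-Crick pairing: A with T, G with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complementary_strand(strand_5to3):
    """Return the partner strand, also written 5' to 3'.

    The strands are antiparallel, so the partner is built by pairing each
    base and then reading the result in the opposite direction.
    """
    return "".join(PAIR[base] for base in reversed(strand_5to3))

def replicate(duplex):
    """Separate the two strands and let each one template a new partner."""
    top, bottom = duplex
    daughter_1 = (top, complementary_strand(top))
    daughter_2 = (complementary_strand(bottom), bottom)
    return daughter_1, daughter_2

if __name__ == "__main__":
    top = "ATGGTGCAC"  # arbitrary example sequence
    parent = (top, complementary_strand(top))
    print(parent)
    print(replicate(parent))
```

Because each separated strand can pair only with its complement, both daughter duplexes in this sketch are identical with the parental duplex.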
The sequence of base pairs in the DNA strand specifies the sequence of amino acids in proteins. Each base cannot represent a single amino acid, since only four bases are found in DNA and there are 20 commonly occurring amino acids in proteins. Similarly, pairs of bases are not sufficient; they could code for only 16 amino acids. A triplet code is therefore the minimum number of bases that is required to code for 20 amino acids. The genetic code has been found in fact to consist of triplets: each amino acid is specified by one or more sequences of three bases. Long stretches of the triplet code are colinear with the amino acid sequence of the protein the synthesis of which the gene specifies, but these stretches are separated by intervening sequences, or introns, that do not code for the amino acid sequence of the protein (see Fig. 47-2). Moreover, DNA does not directly assemble amino acids into protein. This is achieved through a mechanism that involves another polynucleotide, ribonucleic acid (RNA). There are two differences between DNA and RNA. First of all, the nucleotide units contain ribose instead of deoxyribose. Secondly, in RNA uridine (U) is used instead of the thymidine (T) component of DNA. Messenger ribonucleic acid (mRNA) is synthesized with a base sequence determined by the nuclear DNA, which serves as a template in a copying process that is designated as transcription.
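The counting argument for a triplet code, and the substitution of U for T in transcription, can be checked with a few lines of code. This is a minimal sketch; the function names are illustrative, and `transcribe` simply rewrites the coding (“sense”) strand rather than modeling the copying of the template strand by RNA polymerase.

```python
def coding_capacity(word_length, n_bases=4):
    """Number of distinct words of a given length that n_bases can spell."""
    return n_bases ** word_length

def minimum_word_length(n_amino_acids=20, n_bases=4):
    """Smallest word length whose capacity covers all the amino acids."""
    length = 1
    while coding_capacity(length, n_bases) < n_amino_acids:
        length += 1
    return length

def transcribe(coding_strand_5to3):
    """mRNA sequence corresponding to a DNA coding strand (U replaces T)."""
    return coding_strand_5to3.replace("T", "U")

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, coding_capacity(n))    # 4, 16, 64
    print(minimum_word_length())        # 3
    print(transcribe("ATGGTGCAC"))      # AUGGUGCAC
```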
The transcription of DNA into mRNA is the first step in gene expression. In order for a gene to be transcribed a promoter must be located “upstream” (i.e., in the 5′ direction) from the gene. Typical promoters have certain sequences in common. These include a “CAT box,” the cytosine- and guanine-rich CCAAT sequence, and a “TATA box,” an adenine- and thymine-rich sequence. Mutations in these regions impair transcription of a gene; such lesions have been identified as causes of the thalassemias and are discussed in greater detail in Chap. 46. The effectiveness of a promoter may be increased by more distant DNA sequences, known as enhancers, which may be either upstream or downstream from the gene. The identification of sequences that enhance expression of the globin genes has been of particular importance in designing vectors for gene transfer to remedy the hemoglobinopathies22,23 (see Chap. 19).
The mRNA that is formed on the DNA template by RNA polymerase is not ready to be translated to a polypeptide. First it must be processed, by adding a cap to the 5′ end and a poly-A tail to the 3′ end and by removing introns. Capping consists of formation of an atypical 5′ to 5′ triphosphate bond between the 5′ terminus of the mRNA and a molecule of 7-methylguanosine. The addition of a poly-A tail serves to stabilize the mRNA. Recognition of a sequence (AAUAAA) serves as a signal that a poly-A tail should be added at a point that is approximately 15 bases downstream from the signal when another consensus sequence, YGUGUUYY (where Y stands for a pyrimidine, i.e. uridine or cytidine), is present further downstream. Sometimes more than one adenylation signal is present, and then additional species of mRNA with 3′ portions differing in length are formed.
Excision of introns is particularly important, since they interrupt the coding sequence. The first 5′ bases of the intron are always GpU and the last 3′ bases always ApG (the p represents the phosphate bond between the nucleotides). But there are many such couplets in the RNA, and additional information is required for an actual splice site to exist. The nature of this information has not been clearly defined, but a “consensus” sequence has been defined that most splice sites resemble closely. Removal of the intron is a complex enzymatic process involving the prior formation of a “lariat” structure.24 Splicing of a given normal mRNA does not always occur in the same manner. Sometimes “alternative splicing” occurs, so that after mRNA is processed some of the molecules contain an exon that is missing from other messenger molecules. This is a powerful mechanism that allows a single gene to direct the synthesis of more than one polypeptide. Potentially the type of polypeptide made can be modulated according to need, and different tissues and different developmental stages may utilize different splice sites to make tissue-specific polypeptides. Alternative splicing has been important, for example, in producing different forms of band 4.1 of the erythrocyte membrane25 and different forms of pyruvate kinase for the liver and for the erythrocyte.26
Processed mRNA contains the code for the synthesis of proteins, and an elaborate mechanism has evolved for the translation of the triplet code in the mRNA into protein. A ribosomal complex, consisting of ribosomal RNA (rRNA) subunits and protein components, attaches to the 5′ end of the mRNA. The transport of the needed amino acids to the ribosomal complex is achieved by clover-shaped RNA molecules designated transfer RNA (tRNA). tRNA molecules contain a recognition site which binds to a triplet on mRNA and a site that carries the amino acid appropriate for that triplet to the mRNA, where the ribosomal complex creates the peptide bond between it and the amino acid that is immediately 5′ to it. The initiation of protein synthesis is always at an AUG codon, usually one quite near the 5′ end of the messenger RNA. A consensus sequence27 around this codon marks it for the starting point of protein synthesis. The ribosome moves down the mRNA, adding amino acids to the nascent protein chain as it goes, until it reaches a termination codon, which serves as the signal to stop protein synthesis. The ribosome is then released and can begin the synthesis of another protein molecule. This complex process requires the presence of initiation factors (IF-1 through IF-6) and elongation factors (EF-1 through EF-3), as well as a releasing factor (RF). Both ATP and GTP are required.28 The cycle through which the peptide is formed on the ribosome is illustrated schematically in Fig. 9-3.
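The reading of the triplet code from an initiation codon to a termination codon can be sketched in a few lines. The example below is a simplification under stated assumptions: it uses the standard genetic code with one-letter amino acid symbols, it begins at the first AUG it finds rather than evaluating the surrounding consensus sequence, and the test sequence is an arbitrary illustration.

```python
BASES = "UCAG"
# Standard genetic code, one-letter amino acid symbols, "*" = termination.
# Codons are ordered with U, C, A, G at the first, second, and third positions.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}

def translate(mrna_5to3):
    """Translate from the first AUG to the first in-frame termination codon.

    Simplifying assumption: the first AUG is taken as the initiation codon.
    Returns the peptide as a string of one-letter amino acid symbols.
    """
    start = mrna_5to3.find("AUG")
    if start == -1:
        return ""  # no initiation codon, so nothing is translated
    peptide = []
    for pos in range(start, len(mrna_5to3) - 2, 3):
        amino_acid = CODON_TABLE[mrna_5to3[pos:pos + 3]]
        if amino_acid == "*":
            break  # termination codon: the ribosome is released
        peptide.append(amino_acid)
    return "".join(peptide)

if __name__ == "__main__":
    # Arbitrary example: 5' GG | AUG GUG CAC CUG | UAA | GG 3'
    print(translate("GGAUGGUGCACCUGUAAGG"))  # MVHL
```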
FIGURE 9-3 The elongation of a polypeptide as the ribosome moves down the mRNA. Each amino acid (aa) is added to the preceding one by the coordinated activity of elongation factors (EF). From Merrick,28 by permission.
Since the initiation codon AUG codes for methionine, the amino terminus of the primary translated protein is always a methionine, but this is usually cleaved from the protein during processing. Modification of the protein may include changes such as the removal of a leader sequence that directs the protein to a membrane, the addition of sugars to glycoproteins, the addition of fatty acids, and the formation of internal disulfide bonds.
REGULATION OF GENE EXPRESSION
NORMAL REGULATORY PROCESSES
Many genes are highly specialized in their function. Hemoglobin is made only by erythrocyte precursors, crystallin only by the lens, and immunoglobulins only by lymphoid cells. Such genes must be silenced in other types of cells. On the other hand, so-called housekeeping genes produce their products in all cells. The latter include the enzymes of the basic metabolic processes that provide energy to all cells, such as hexokinase, phosphoglycerate kinase, and G-6-PD, or that provide basic structural proteins.
Clearly, an elaborate system for the regulation of protein production exists in all organisms, and this system is only beginning to be understood. Regulation of transcription determines to a large extent whether a protein will be synthesized.29 Promoters and enhancers are activated by transcription factors that are produced by the cell. Such factors, in turn, may be activated or inactivated by phosphorylation and by other processes. How enhancers act at a distance to increase the activity of promoters is not well understood, and the locus control region of the globin genes is serving as a paradigm in gaining understanding of possible interactions between transcription factors, enhancers, and promoters. Regulation also occurs at the translational level. The mRNA of ferritin contains an iron-responsive element that binds to an 87-kDa regulatory protein in the absence of iron, effectively shutting off translation.30 The same type of binding site in the 3′ untranslated region of the transferrin receptor mRNA serves to stabilize the message by allowing the protein to bind in the absence of iron.30 Similarly, a UA-rich portion in the 3′ untranslated portion of the tumor necrosis factor gene serves to inhibit translation of that mRNA.31 It is also likely that the stability of the mRNA itself is regulated by nucleases.32,33
EXPERIMENTAL INTERFERENCE WITH GENE EXPRESSION
It is possible to interdict the expression of a gene at several different levels. Genes can be interrupted in murine embryonic stem cells by the process of targeted disruption, destroying their function.34,35 The resulting “knockout mice” (a subset of transgenic mice, see “Transgenic Animals,” below) can provide valuable insights into the function of genes and serve as animal models of human disease (see Chap. 10).
The translation of mRNA can be inhibited and the RNA degraded by placing antisense RNA or DNA into cells. These molecules have a sequence complementary to the mRNA that is to be inactivated. When such oligonucleotides are present they inhibit gene expression through a variety of mechanisms. For example, they form a double strand with the RNA, just as two complementary strands of DNA will hybridize to form the normal double-stranded form of DNA. Because the double-stranded form cannot be translated and is probably degraded rapidly, the production of its protein product is inhibited specifically. Since antisense RNA can be produced in vivo by transcribing the complementary strand of a gene, it may represent a natural regulatory mechanism.36,37,38 In experimental systems, antisense DNA or stable DNA analogs such as the methyl phosphonates39 can be transfected directly into cells, or the RNA can be made by a plasmid with the appropriate DNA template and a promoter. Some of the uses of this approach include the suppression of lymphoma growth with DNA oligonucleotides antisense to introns of the oncogene c-myc,40 the suppression of the growth of marrow cells from patients with chronic myelogenous leukemia by antisense DNA directed at the BCR-ABL junction,41 the down-regulation of growth of BCL-2-positive lymphoma cells in culture by BCL-2 antisense,42 and the inhibition of Friend murine erythroleukemia cell growth by transfection with a plasmid that produces antisense to c-jun.43
The discovery of the enzymatic activity of certain forms of RNA represents a major advance in our understanding of how life may have originated on earth. Cleaving RNA at defined sequences, much as restriction endonucleases cleave DNA, is one of the known enzymatic functions of RNA, and this function provides a means by which the expression of a gene can be interdicted in experimental systems. This ribozyme approach has been used, for example, in preventing replication of the HIV-1 virus44,45 and for cleaving BCR-ABL with a view to developing a treatment for chronic myelogenous leukemia.46
THE METHODS OF MOLECULAR BIOLOGY
The sequencing of DNA and the preparation of probes require that a fragment of DNA be amplified manyfold to provide a relatively pure sample for study. The classical method by which this is achieved, cloning, is one of the central techniques of molecular biology. It is generally accomplished by inserting the DNA into a vector, a bacteriophage or plasmid that normally replicates within a bacterial cell. When such a phage or plasmid contains a foreign DNA fragment, the fragment too undergoes replication and can then be purified in greatly amplified form.
If the DNA is not available in pure form to begin with it must be purified from a collection of DNA fragments that is designated a “library”. An adequate genomic library consists of millions of fragments of the genetic material of a cell that have been ligated into a suitable vector. Another valuable type of library is made by transcribing mRNA from a tissue into cDNA (“complementary” DNA) using the enzyme reverse transcriptase. Such a cDNA library is particularly useful for the isolation of genes because in it are represented only the intron-free portions of genes that are being actively transcribed in a tissue. In contrast, a genomic library represents all of the genetic material, coding and noncoding, transcribed and nontranscribed.
A large number of vectors that have the capacity to replicate fragments of DNA of widely differing sizes have been designed. The largest of these are yeast artificial chromosomes, which may incorporate a million or more base pairs of DNA into a vector that is grown in a yeast host.47,48 Such vectors are very useful in mapping genes because of their very large size, but there is a tendency for the DNA in YACs to be rearranged, which can lead to errors. Other vectors that also incorporate large fragments of DNA, ranging to about 100,000 bp in length, are bacterial artificial chromosomes, P1-derived artificial chromosomes, and cosmids (20,000 to 30,000 bp). Much smaller inserts, ranging in size from about 3,000 to 12,000 bp, can be cloned into bacteriophages. A library consisting of a large collection of the vector containing many different inserts is plated on a confluent layer (“lawn”) of micro-organisms; bacteria transfected with a plasmid library are plated on a semisolid culture medium. It is then necessary to detect the amplified wanted DNA fragment. If the exact sequence of at least 17 nucleotides is known, a probe consisting of a radioactively labeled synthetic complementary sequence can be used to detect the clone that is wanted. The precise base sequence cannot be deduced from the amino acid sequence, because there is more than one codon for most amino acids. However, if an appropriate portion of the amino acid sequence is selected, several different complementary sequences, encompassing all of the possibilities, may be used as probes.
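Why the choice of amino acid segment matters for probe design can be seen by counting the coding sequences compatible with a short peptide. The sketch below uses the standard genetic code; the two peptides are hypothetical examples, and the sequences listed are coding-strand sequences, to which the actual probes would be complementary.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in T, C, A, G order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")

CODONS = {}  # amino acid (one-letter code) -> list of its codons
for (b1, b2, b3), aa in zip(product(BASES, repeat=3), AMINO):
    CODONS.setdefault(aa, []).append(b1 + b2 + b3)


def degeneracy(peptide):
    """Number of distinct coding sequences that could encode the peptide."""
    return prod(len(CODONS[aa]) for aa in peptide)


def coding_sequences(peptide):
    """List every coding-strand sequence compatible with the peptide."""
    return ["".join(codons) for codons in product(*(CODONS[aa] for aa in peptide))]


well_chosen = "MWHDK"    # hypothetical segment rich in amino acids with one or two codons
poorly_chosen = "LSRAG"  # hypothetical segment rich in amino acids with four or six codons
```

Under these assumptions the first peptide can be covered by 8 sequences of 15 nucleotides, whereas the second would require 3456, which is why segments containing methionine or tryptophan are attractive.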
Antibodies against the gene product may also serve as probes by using an “expression vector” in which a promoter is present upstream from the cloned DNA. When the fragment is in the correct orientation and when it is “in frame” so that the triplets are read correctly, sufficient gene product may be formed to allow immunologic detection. Colonies (or, in the case of phage vectors, plaques) that react with the probe are picked and subcultured at lower density until a single reactive colony or plaque is isolated.
THE POLYMERASE CHAIN REACTION
Amplification of the desired part of the genome may be achieved when some of the sequence is already known by using the polymerase chain reaction, a technique that is much simpler than cloning. For example, one may wish to determine the sequence of a portion of a gene for diagnostic purposes, but cloning the gene(s) of interest is too time-consuming and labor intensive to be practical. Two primers, matching opposite strands of DNA on either side of the region of interest, are used to amplify the intervening segment of DNA more than a millionfold. Successive cycles of DNA synthesis from the primers, and chain separation by heating between the cycles, are the basis of this powerful technique.49,50 The polymerase chain reaction is so sensitive that under optimal conditions the DNA from a single cell may be amplified. Moreover, the stability of DNA is such that very old preserved material may be used. Thus, it is possible to amplify the DNA from blood smears,51 from mummies, and even from insects preserved in amber for over 25 million years.52 Amplifying by PCR complementary DNA (cDNA) produced by reverse-transcribing mRNA in tissue extracts (RT-PCR) provides a very sensitive means for measuring the expression of genes in tissues.
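The logic of primer placement and of exponential amplification can be sketched in a few lines. The template, primers, and function names below are invented for illustration, and the copy number assumes perfect doubling in every cycle, which real reactions do not achieve.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon(template, fwd_primer, rev_primer):
    """Return the segment of the top strand bounded by the two primers, or None."""
    start = template.find(fwd_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the top strand at its reverse complement.
    rev_site = revcomp(rev_primer)
    end = template.find(rev_site, start + len(fwd_primer))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]


def ideal_copies(cycles, starting_copies=1):
    """Copies after the given number of cycles, assuming perfect doubling."""
    return starting_copies * 2 ** cycles


# Hypothetical template: flank, forward primer site, region of interest,
# reverse primer site (top strand), flank.
template = "GGATCC" "ATGCTAGCTT" "ACGGAATTCGCTAGG" "CTTAACGGTA" "CC"
fwd = "ATGCTAGCTT"
rev = "TACCGTTAAG"  # matches the bottom strand, i.e. revcomp of "CTTAACGGTA"
product = amplicon(template, fwd, rev)
```

With ideal doubling, 20 cycles already give `ideal_copies(20)` = 1,048,576 copies from a single starting molecule, consistent with amplification of more than a millionfold.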
CUTTING DNA WITH RESTRICTION ENDONUCLEASES
The discovery that many bacteria elaborate enzymes that cleave double-stranded DNA at the sites of very specific sequences greatly facilitated the study of DNA. Such enzymes generally recognize palindromes, i.e., DNA sequences that read the same in one direction on the upper strand and in the opposite direction in the lower strand. Fig. 9-4 illustrates how one such palindrome is cleaved by the commonly used restriction endonuclease Eco RI. Several hundred restriction endonucleases are now commercially available. Some recognize sequences of only four nucleotides and some as many as eight. The average size of fragments produced by the former is, of course, smaller than the average size of those produced by the latter.
FIGURE 9-4 A schematic representation of Eco RI cleaving its recognition sequence, which is outlined by the rectangle. Whenever this restriction endonuclease encounters the palindromic sequence GAATTC, it cleaves DNA at the position shown by the arrows.
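Both the palindromic nature of a recognition sequence and the relation between its length and the average fragment size can be checked with a short sketch. The spacing estimate assumes a random sequence in which the four bases are equally frequent, an idealization that genomic DNA only approximates; the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_palindromic_site(site):
    """True if the site reads the same 5'->3' on both strands."""
    return site == site.translate(COMPLEMENT)[::-1]


def expected_spacing(site_length):
    """Average distance in bp between sites in random DNA with equal base frequencies."""
    return 4 ** site_length


eco_ri_site = "GAATTC"
```

Under this assumption a four-nucleotide site recurs on average every 256 bp and an eight-nucleotide site every 65,536 bp.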
Restriction endonucleases are useful both for cloning DNA and for analyzing its structure. By digesting DNA with various endonucleases and combinations of endonucleases one may construct a restriction map, i.e., a linear representation of the fragment of DNA with the location of the various restriction sites that have been identified. Maps can be constructed from uncloned genetic DNA, provided that probes for the detection of the relevant fragments are available. Many of the restriction endonucleases produce fragments with overlapping ends (see, for example, Eco RI in Fig. 9-4). Such “sticky ends” may be used for the ligation (i.e., splicing) of DNA fragments into a vector by using a vector with complementary sticky ends. The seal is made permanent with the enzyme DNA ligase.
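A digest of a linear molecule, the raw material of a restriction map, can be simulated as follows. The sequence is invented, only the top strand is modeled, and the cut is placed after the first G of GAATTC, which is where Eco RI cleaves that strand.

```python
def cut_positions(seq, site, offset):
    """Positions in seq at which the top strand is cut (site start + offset)."""
    positions = []
    i = seq.find(site)
    while i != -1:
        positions.append(i + offset)
        i = seq.find(site, i + 1)
    return positions


def digest(seq, site, offset):
    """Fragments of a linear molecule after complete digestion at every site."""
    cuts = [0] + cut_positions(seq, site, offset) + [len(seq)]
    return [seq[a:b] for a, b in zip(cuts, cuts[1:])]


dna = "AAAGAATTCCCCCGAATTCTT"             # hypothetical fragment with two Eco RI sites
fragments = digest(dna, "GAATTC", 1)      # Eco RI cuts between G and AATTC
sizes = [len(f) for f in fragments]
```

Each fragment downstream of a cut begins with AATTC; its first four bases form the single-stranded overhang that makes up the sticky end.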
The size of restriction fragments produced after digesting whole genomic DNA with restriction endonucleases may be appreciated using the technique of Southern blotting, a useful procedure named after the investigator who developed it.53 The DNA is digested with one or more restriction endonucleases and then subjected to electrophoresis in a gel that separates fragments by size. It is then transferred to a membrane that binds DNA, and the appropriate DNA fragments are detected using labeled probes. Alternatively, the segment of DNA that is of interest may be amplified using the PCR technique and digested by a restriction endonuclease to determine whether or not target sites are present.
One of the most powerful uses of restriction endonucleases is in the detection of genetic variability. Changes in nucleotides may create or abolish restriction sites. Thus, they change the size of fragments that are formed when the DNA is digested. Such areas of variability represent restriction fragment length polymorphisms (RFLP). In some cases the changes in nucleotide sequence may be the ones that cause the disease itself. For example, the sickle cell mutation causes disappearance of a restriction site recognized by the enzyme Mst II,54 and the G-6-PD A– mutation causes formation of a restriction site recognized by Nla III; such changes have proved valuable in diagnosis (see Chap. 45 and Chap. 47).
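How a single base change abolishes a restriction site and alters the fragment pattern can be shown with the sickle mutation. The sketch assumes that Mst II recognizes CCTNAGG and cuts after the second C, and that the region around codon 6 of the β-globin gene reads CCT GAG GAG; the flanking bases and the resulting fragment sizes are invented for the example.

```python
import re

MSTII_SITE = re.compile(r"CCT[ACGT]AGG")  # assumed recognition sequence CCTNAGG
CUT_OFFSET = 2                             # assumed cut position: CC^TNAGG


def site_starts(seq):
    """Start positions of every Mst II site in the sequence."""
    return [m.start() for m in MSTII_SITE.finditer(seq)]


def fragment_lengths(seq):
    """Lengths of the fragments produced by complete digestion of a linear molecule."""
    cuts = [0] + [s + CUT_OFFSET for s in site_starts(seq)] + [len(seq)]
    return [b - a for a, b in zip(cuts, cuts[1:])]


FLANK = "TTTT"                                  # invented flanking sequence
normal = FLANK + "ACTCCTGAGGAGAAG" + FLANK      # ... CCT GAG GAG ... (Glu at codon 6)
sickle = FLANK + "ACTCCTGTGGAGAAG" + FLANK      # ... CCT GTG GAG ... (Val at codon 6)
```

The normal sequence is cut into two fragments, whereas the sickle sequence, having lost the site, remains as one larger fragment.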
Deletions of chromosomal material, as occur in α-thalassemia, also produce changes in fragment sizes. Larger fragments may appear if the deleted fragment contains a restriction site, or smaller fragments if it does not. If the area covered by the probe is deleted in its entirety, as occurs in hydrops fetalis, no band will be seen at all. Even when the lesion that causes the disease does not directly affect a restriction site, RFLPs may be valuable in disease detection by virtue of close linkage of the restriction site to a disease-causing gene. Multiple restriction sites near the gene of interest produce haplotypes that may unequivocally identify a chromosome. Such haplotypes have been particularly useful in the prenatal diagnosis of the thalassemias (see Chap. 46).
The chain termination technique55 is most commonly used to determine the sequence of DNA. It depends upon synthesizing a labeled strand of DNA, with the DNA to be sequenced serving as the template. The mixture of nucleotides used contains, in addition to the native deoxynucleotides, a nucleotide analog that results in chain termination when incorporated. The normal nucleotides are present in excess, and therefore chain termination occurs only sporadically, but always when the analog is incorporated. Four different incubation mixtures are used, each with an analog of one of the four nucleotides. Gel electrophoresis of the labeled products produces “ladders” of polynucleotides. The size of each fragment depends on the point at which there exists a nucleotide corresponding to the chain-terminating analog in the mixture (Fig. 9-5). Sequencing can now be carried out rapidly and accurately by automated methods.56
FIGURE 9-5 Radioautograph of a gel being used to determine the sequence of the glucocerebrosidase gene by the chain termination method. Four reaction mixtures are used. Each mixture contains a polynucleotide primer (Pr) that has a sequence complementary to the beginning of the strand to be sequenced, and all four normal deoxynucleoside triphosphates labeled with 32P. The “G” mixture also contains dideoxyguanosine triphosphate to act as a chain terminator when a guanine is reached. The “A” mixture contains the adenine chain terminator, and so on. Each mixture is placed in a slot: the “G” mixture in G, and so on. Upon electrophoresis the gel separates polynucleotides by size. Thus, the positions to which polynucleotides move in the gel correspond to the positions at which the indicated nucleotides are added to the end of the DNA strand as it is being synthesized. The sequence of the DNA can then be deduced. The apparent sequence of some of the bands is shown at the left.
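The way in which four ladders of terminated chains encode a sequence can be made concrete with a toy model. It assumes that every possible termination product is present and resolved on the gel; the synthesized sequence and the function names are hypothetical.

```python
def ladders(new_strand, primer_length=0):
    """Band sizes expected in each of the four lanes.

    new_strand is the sequence synthesized beyond the primer. A chain that
    terminates at position n in the lane for base X has length primer_length + n.
    """
    lanes = {base: [] for base in "ACGT"}
    for position, base in enumerate(new_strand, start=1):
        lanes[base].append(primer_length + position)
    return lanes


def read_gel(lanes):
    """Read the sequence by taking bands from shortest to longest across all lanes."""
    bands = sorted((size, base) for base, sizes in lanes.items() for size in sizes)
    return "".join(base for _, base in bands)


synthesized = "GATTACA"      # hypothetical newly synthesized strand
lanes = ladders(synthesized)  # e.g. the "A" lane holds chains of length 2, 5, and 7
```

Reading the bands in order of increasing size, as is done from the bottom of the gel upward, recovers the synthesized sequence.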
While DNA sequencing formerly required cloning of the fragment to be studied, amplification by PCR serves as a simpler alternative when the surrounding sequences are known.57,58
DETECTING MUTATIONS IN INDIVIDUAL PATIENTS
The cloning and sequencing of DNA is too time-consuming to permit application for diagnostic purposes to individual patients. Fortunately, there are shortcuts that can be used when the nature of the lesion is known and a yes-or-no answer is sought with regard to the existence of a certain substitution. The value of restriction sites in this regard has been discussed above, but since many substitutions neither abolish nor create restriction sites the use of restriction endonucleases is not feasible in every case. However, a mismatch in one of the amplifying primers used in amplifying DNA by PCR, selected so as to create a restriction site where none existed before, is a technique that has been used successfully to detect mutations.59 Amplifying primers that fit one genotype but not the other have been used in “color PCR”60 and in the amplification refractory mutation system (ARMS).61 The failure of fragments of DNA to ligate when aligned on a template in which there is a misfit of the terminal nucleotide also has been used to detect mutations.62,63 and 64 The hybridization of labeled oligonucleotide probes with a defined sequence to an amplified DNA target, but not to a DNA target harboring even a single nucleotide change, a method designated allele-specific oligonucleotide hybridization (ASOH), is also very useful.65,66 Probes containing approximately 17 nucleotides fitting either the normal or the mutant sequence are hybridized to PCR-amplified DNA. A single mismatch in an oligonucleotide of this size produces a sufficient change in melting temperature (i.e., the temperature at which the strands of DNA separate) that the two sequences can be distinguished from one another.
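The interpretation of allele-specific oligonucleotide hybridization can be sketched as a simple decision rule. The model below is deliberately crude: it treats hybridization under stringent conditions as all-or-nothing, requiring a perfect 17-nucleotide match, rather than modeling melting temperature. All sequences and names are invented for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def hybridizes(probe, target_strand):
    """Toy stringency rule: the probe binds only to a perfectly complementary stretch."""
    return revcomp(probe) in target_strand


def call_genotype(alleles, normal_probe, mutant_probe):
    """Interpret the two hybridization signals obtained from a person's amplified DNA."""
    normal_signal = any(hybridizes(normal_probe, a) for a in alleles)
    mutant_signal = any(hybridizes(mutant_probe, a) for a in alleles)
    if normal_signal and mutant_signal:
        return "heterozygous"
    if normal_signal:
        return "homozygous normal"
    if mutant_signal:
        return "homozygous mutant"
    return "no signal"


NORMAL_SITE = "GACTCCTGAGGAGAAGT"   # hypothetical 17-nucleotide target, normal
MUTANT_SITE = "GACTCCTGTGGAGAAGT"   # same target with a single base change
normal_probe = revcomp(NORMAL_SITE)
mutant_probe = revcomp(MUTANT_SITE)
normal_allele = "TTCA" + NORMAL_SITE + "CTGC"   # invented amplified fragments
mutant_allele = "TTCA" + MUTANT_SITE + "CTGC"
```

A sample containing one normal and one mutant allele gives a signal with both probes and is called heterozygous.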
When the mutation is not known, other techniques may prove useful. Single-stranded conformation polymorphism (SSCP) analysis takes advantage of the fact that a single base substitution will usually change the conformation of single-stranded DNA and change its migration in a gel when it is subjected to electrophoresis. This technique has been found to be particularly powerful, revealing most mutations in segments of DNA between 200 and 400 bases in length.67,68 Alternatively single base mismatches may be detected by hybridizing mRNA with a known sequence to the DNA and cleaving the duplexes with ribonuclease,69 or by measuring the denaturation of mismatched double-stranded DNA (heteroduplexes) in a gradient.70
The mechanical insertion of DNA fragments into the nucleus of a fertilized ovum provides a means for altering the genetic constitution of animals. Animals that have been engineered in this manner are referred to as transgenic. The use of promoters that are inducible or tissue specific permits studies of the effect of a gene product that might be lethal if expressed in all tissues or at all times during embryogenesis. Transgenic mice that carry the human sickle β-globin gene and in which the murine globin genes have been “knocked out” have been produced71,72 and produce high enough levels of human hemoglobin S to have potential as an animal model of sickle cell disease.
Crossing-over during meiosis usually occurs with great precision. Homologous genes pair with each other, and although genes which were together on one chromosome before meiosis may now be on opposite chromosomes of the pair, each chromosome still contains a complete set of genes (see Fig. 9-2). Occasionally, however, an error occurs and pairing during meiosis is imperfect. Under these circumstances—unequal crossing-over (see Fig. 46-10)—one of the daughter chromosomes contains a duplicated gene, while the other one exists with a gene deleted.
Once a duplication has occurred, further duplications occur more readily because pairing of the first of the duplicate genes on one chromosome with the second gene of the duplicate on the other produces one chromosome with a triplicated gene and one with a single gene. Duplication has probably played a very important role in the course of evolution73 because the presence of two genes with the same function allows experiments of nature, mutations, to occur on one of the genes without totally losing the original function, which is still carried out by the duplicate. Examples of the results of gene duplication abound in hematology, particularly with respect to the hemoglobin loci. The α-chain loci are duplicated, and there are also two nearly identical copies of the γ-chain locus (see Chap. 46). Furthermore, the close similarity of their amino acid sequence and the fact that they are tightly linked indicate that the β, γ, and δ loci represent the result of duplication of a single ancestral gene. The process of unequal crossing-over takes place not only between genes, but also within genes. When this occurs, one would anticipate that a portion of the amino acid sequence of a protein is represented twice on one chromosome and is missing on the other. The Lepore hemoglobins, leading to a thalassemic clinical state, are an example of this type of unequal crossing-over (see Fig. 46-6). These abnormal hemoglobins have the amino acid sequence of the δ chain at the amino end, and the sequence of the β chain at the carboxyl end. The complement to this kind of abnormality, the “anti-Lepore” hemoglobin, also has been found (see Chap. 46). Similarly, a mutation of the glucocerebrosidase gene causing Gaucher disease has been found to be the result of a crossover between the active gene and the pseudogene.74
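The products of unequal crossing-over follow directly from where each chromosome breaks, as a schematic model shows. Chromosomes are represented either as lists of gene names or as strings in which each letter stands for a stretch of one gene; the breakpoints and labels are hypothetical.

```python
def crossover(chrom_a, chrom_b, break_a, break_b):
    """Exchange the segments distal to the breakpoints on two homologous chromosomes.

    Equal breakpoints model normal crossing-over; unequal breakpoints model
    mispairing, which yields one lengthened and one shortened product.
    """
    return (chrom_a[:break_a] + chrom_b[break_b:],
            chrom_b[:break_b] + chrom_a[break_a:])


single = ["x", "G", "y"]                        # one copy of gene G between markers x and y

# Mispairing between chromosomes that each carry a single copy of G.
duplicated, deleted = crossover(single, single, 2, 1)

# First copy of G on one chromosome pairs with the second copy on the other.
triplicated, back_to_single = crossover(duplicated, duplicated, 3, 2)

# Crossing-over within misaligned genes: D = delta gene, B = beta gene, '.' = spacer.
delta_beta = "DDDDDD..BBBBBB"
lepore, anti_lepore = crossover(delta_beta, delta_beta, 3, 11)
```

In the last case one product carries a single fused gene with δ sequence at its start and β sequence at its end, as in the Lepore hemoglobins, while the reciprocal product carries δ, a β-δ fusion, and β.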
Pseudogenes are DNA sequences that resemble the corresponding functional genes, but do not result in the production of a gene product. Pseudogenes exist, for example, for the β-globin chain, von Willebrand factor, ferritin, and glucocerebrosidase. These pseudogenes apparently arose by gene duplication and simulate the true gene even in having introns. They have apparently lost their ability to function, through mutations either in the coding region or in their promoter. Some pseudogenes are devoid of introns. They may well have arisen in evolution as a result of the reverse transcription of a processed mRNA by retroviral reverse transcriptase. Unlike genes that arose by tandem duplication as a result of unequal crossover, such pseudogenes can be found anywhere in the genome. For example, a functional glutathione-S-transferase gene is on chromosome 11 and a pseudogene is located on chromosome 12.75
TYPES OF MUTATIONS
Mutations can occur in structural genes (the part of the DNA that specifies the amino acid sequence of protein), in the poorly understood regulatory apparatus that determines whether or not a gene will be available for transcription, in introns, or in portions of the DNA between genes that have no known function. As shown in Table 9-1, hematologic diseases provide examples of every known mechanism for causing mutations.
TABLE 9-1 EXAMPLES OF GENETIC MECHANISMS IN HEMATOLOGIC DISEASE
A change of one nucleotide to another without a change in the number of nucleotides in the sequence is called a point mutation. Other types of mutations are deletions and insertions (e.g., duplication of stretches of DNA in a gene). Mutations do not occur at random. Changes in the dinucleotide CpG to TpG are particularly common because in vertebrate DNA cytidines followed by guanine are readily methylated and the methylcytosine formed is susceptible to deamination to thymine. Thus, in both hemophilia A76 and G-6-PD deficiency77 an unusually high proportion of point mutations are found in CpG dinucleotides. Deletions or duplications of portions of genes tend to occur in areas in which the same sequence is repeated more than once. Thus, there are “hot spots” in the genome in which, for one reason or another, mutations are particularly likely to occur.
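Candidate CpG hot spots in a sequence, and the TpG variants they would give rise to, can be listed with a short sketch. The sequence is invented, and only changes on the strand shown are modeled; the same event on the opposite strand would appear here as CpG to CpA.

```python
def cpg_sites(seq):
    """Positions of the C in every CpG dinucleotide of the sequence."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def cpg_to_tpg_variants(seq):
    """Sequences produced by changing the C of each CpG, one at a time, to T."""
    return [seq[:i] + "T" + seq[i + 1:] for i in cpg_sites(seq)]


example = "ATCGGACGT"                      # hypothetical sequence with two CpG sites
variants = cpg_to_tpg_variants(example)
```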
Another mechanism by which mutation appears to occur is that of gene conversion. This poorly understood phenomenon results in the sequence of one gene being transferred en bloc to another. This phenomenon is thought to account for the maintenance of identical sequence between duplicated genes.78,79
Many mutations affect the amount of processed mRNA that is formed. For example, mutations that cause abnormal splicing may produce a messenger that cannot be translated. Regulatory mutations that impair the rate at which a gene is transcribed into mRNA can be the consequence of mutations in promoter or enhancer elements. Mutations that cause thalassemia by impairing transcription of the hemoglobin β locus are the best characterized of these (see Chap. 46). However, most mutations causing hematologic disease seem to be structural mutations, those in which the sequence of the coding region of the gene is altered.
Errors in the base sequence of the structural gene result in failure to form any protein, in the formation of a very unstable protein that may never appear in the fully assembled form, or in the formation of an abnormal protein. The latter circumstance appears to be the most common. The abnormal protein may maintain all, some, or none of the functional properties of the normal protein. Even when it has lost the functional properties of the original protein it may retain the antigenic properties, and it is then designated cross-reacting material (CRM). Its stability may be normal or reduced. Mutations that result in the formation of stable proteins with normal functional properties are not clinically significant, but they may be very valuable from the point of view of population and family studies, or as genetic markers for various types of biologic investigations. Some “deficiencies” of enzymes are also clinically harmless. For example, genetic absence of the glycosyl transferases that convert the H antigen to the A or B antigen (see Chap. 137) results in the appearance of blood group O, surely a clinical state that cannot be considered a disease. Genetic variants that reach a frequency of more than one percent in a population are known as polymorphisms. Sometimes genetic variants such as the sickle cell gene or the G-6-PD deficiency gene reach polymorphic levels because the deleterious effects that they may have are counterbalanced by beneficial effects on survival, such as increased resistance to malaria. They are known as balanced polymorphisms.
All cells receive the same complement of genes. Nonetheless some mutations are tissue-specific. Several circumstances can account for this. Some enzymes that appear to perform the same function are encoded by different genes in different tissues. For example, the pyruvate kinase of leukocytes and that of erythrocytes are under separate genetic control (see Chap. 45). In other cases, alternative splicing of the primary mRNA can produce different polypeptides.80,81 Differences in posttranslational processing, including proteolysis and glycosylation of the same polypeptide by different enzymes in different tissues, can lead to different final products. However, in most instances a mutation that affects an enzyme in one type of blood cell will also affect the same enzyme in other blood cells, in liver, in brain, and in other tissues.
The types of enzyme deficiencies encountered clinically are limited by the ability of the affected individual to survive. Thus, complete absence of a key glycolytic enzyme from all tissues is incompatible with the basic process of energy metabolism and would almost surely be lethal long before birth. In contrast, the inheritance of enzyme deficiencies that are manifested only in erythrocytes is apparently quite compatible with survival, and thus many of the enzyme defects that are observed in humans are ones that only affect the red blood cell.
Historically, mutations were first detected by sequencing the protein, usually hemoglobin. Indeed, the mutation in sickle cell disease was described before the genetic code had been deciphered. Thus, mutations were designated by indicating the amino acid change. Amino acid-based nomenclature does not unambiguously define the mutation, since the same amino acid substitution can be caused by different nucleotide substitutions. Further ambiguity is introduced by the fact that three different starting points for the numbering of amino acids in protein are commonly employed: (1) the methionine start codon; (2) the amino acid after the methionine start codon; and (3) the amino terminal amino acid of the processed protein. Finally, there are many mutations, such as those that change splice sites or promoters, that cannot be designated by an amino acid substitution. Nonetheless, designations based on amino acid mutation have been so widely used that they serve as useful “nicknames” for mutations; the nucleotide-based designation would simply not be recognized by workers in the field. Moreover, knowing the amino acid change sometimes provides valuable information regarding the effect of the mutation at the protein level. Therefore, while the more robust nucleotide-based notation is preferred in this text, the amino acid–based notation is used when it is the one that is generally recognized by workers in the field. Standards have been established for the different notations that are in use.82,83,84 and 85
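The ambiguity of amino acid-based nomenclature can be demonstrated by listing the single-nucleotide changes that produce a given substitution. The sketch uses the standard genetic code; the function name is illustrative, and only single-base changes within one codon are considered.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def single_base_routes(from_aa, to_aa):
    """All (codon, mutant codon) pairs that differ by one base and change from_aa to to_aa."""
    routes = []
    for codon, aa in CODON_TO_AA.items():
        if aa != from_aa:
            continue
        for mutant, mutant_aa in CODON_TO_AA.items():
            differences = sum(x != y for x, y in zip(codon, mutant))
            if mutant_aa == to_aa and differences == 1:
                routes.append((codon, mutant))
    return routes


glu_to_val = single_base_routes("E", "V")  # glutamic acid -> valine
leu_to_pro = single_base_routes("L", "P")  # leucine -> proline
```

A glutamic acid to valine substitution can arise from either GAA to GTA or GAG to GTG, and a leucine to proline substitution from four different codon changes, so the amino acid designation alone does not specify the nucleotide change.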
Even before detection of mutations at the DNA level was feasible, clinicians could deduce that the same genotype did not always produce the same clinical disease picture (phenotype). Sibs inheriting autosomal recessive disorders from their parents often have been observed to have discordant clinical presentations—one severely affected, one mildly so—even though the same pair of disease-producing genes was inherited. With the development of the ability to define genotypes directly, the great degree of genotype-phenotype dissociation has become even more evident. Thus, persons inheriting the same sickle cell, G-6-PD, factor VIII, or glucocerebrosidase mutations may have mild or severe sickle cell disease, hemolytic anemia, hemophilia A, or Gaucher disease respectively. The factors that modify disease expression are usually not understood. In the case of G-6-PD deficiency, a second mutation, one in the UDP-glucuronosyltransferase-1 gene, has been shown to determine whether severe jaundice will be present.86,87
Beutler E, Yeh M, Fairbanks VF: The normal human female as a mosaic of X-chromosome activity: Studies using the gene for G-6-PD deficiency as a marker. Proc Natl Acad Sci USA 48:9, 1962.
Lyon MF: Sex chromatin and gene action in the mammalian X-chromosome. Am J Hum Genet 14:135, 1962.
Gartler SM, Linder D: Developmental and evolutionary implications of the mosaic nature of the G-6-PD system. Cold Spring Harb Symp Quant Biol 29:253, 1964.
Beutler E: The distribution of gene products among populations of cells in heterozygous humans. Cold Spring Harb Symp Quant Biol 29:261, 1964.
Fialkow PJ, Gartler SM, Yoshida A: Clonal origin of chronic myelocytic leukemia in man. Proc Natl Acad Sci USA 58:1468, 1967.
Oni SB, Osunkoya BO, Luzzatto L: Paroxysmal nocturnal hemoglobinuria: Evidence for monoclonal origin of abnormal red cells. Blood 36:145, 1970.
Beutler E, West C, Johnson C: Involvement of the erythroid series in acute myeloid leukemia. Blood 53:1203, 1979.
Fialkow PJ, Singer JW, Raskind WH, et al: Clonal development, stem-cell differentiation, and clinical remissions in acute nonlymphocytic leukemia. N Engl J Med 317:468, 1987.
Lindsay S, Monk M, Holliday R, et al: Differences in methylation on the active and inactive human X chromosomes. Ann Hum Genet 49:115, 1985.
Gilliland DG, Blanchard KL, Bunn HF: Clonality in acquired hematologic disorders. Annu Rev Med 42:491, 1991.
Gilliland DG, Blanchard KL, Levy J, Perrin S, Bunn HF: Clonality in myeloproliferative disorders: analysis by means of the polymerase chain reaction. Proc Natl Acad Sci USA 88:6848, 1991.
Curnutte JT, Hopkins PJ, Kuhl W, Beutler E: Studying X-inactivation. Lancet 339:749, 1992.
Prchal JT, Guan YL, Prchal JF, Barany F: Transcriptional analysis of the active X-chromosome in normal and clonal hematopoiesis. Blood 81:269, 1993.
Wallace DC: Mitochondrial DNA sequence variation in human evolution and disease. Proc Natl Acad Sci USA 91:8739, 1994.
Graeber MB, Muller U: Recent developments in the molecular genetics of mitochondrial disorders. J Neurol Sci 153:251, 1998.
Ohno S: The one ancestor per generation rule and three other rules of mitochondrial inheritance. Proc Natl Acad Sci USA 94:8033, 1997.
Bader-Meunier B, Rotig A, Mielot F, et al: Refractory anaemia and mitochondrial cytopathy in childhood. Br J Haematol 87:381, 1994.
Superti-Furga A, Schoenle E, Tuchschmid P, et al: Pearson bone marrow-pancreas syndrome with insulin-dependent diabetes, progressive renal tubulopathy, organic aciduria and elevated fetal haemoglobin caused by deletion and duplication of mitochondrial DNA. Eur J Pediatr 152:44, 1993.
Cormier V, Rötig A, Quartino AR, et al: Widespread multi-tissue deletions of the mitochondrial genome in the Pearson marrow-pancreas syndrome. J Pediatr 117:599, 1990.
Boyer SH, Graham JB: Linkage between the X chromosome loci for glucose-6-phosphate dehydrogenase electrophoretic variation and hemophilia A. Am J Hum Genet 17:320, 1965.
Siniscalco M, Filippi G, Latte B, et al: Failure to detect linkage between Xg and other X-borne loci in Sardinians. Ann Hum Genet 29:231, 1966.
Jarman AP, Wood WG, Sharpe JA, et al: Characterization of the major regulatory element upstream of the human alpha-globin gene cluster. Mol Cell Biol 11:4679, 1991.
Orkin SH: Globin gene regulation and switching: circa 1990. Cell 63:665, 1990.
Keller W: The RNA lariat: A new ring to the splicing of mRNA precursors. Cell 39:423, 1984.
Conboy JG, Chan J, Mohandas N, Kan YW: Multiple protein 4.1 isoforms produced by alternative splicing in human erythroid cells. Proc Natl Acad Sci USA 85:9062, 1988.
Noguchi T, Yamada K, Inoue H, Matsuda T, Tanaka T: The L- and R-type isozymes of rat pyruvate kinase are produced from a single gene by use of different promoters. J Biol Chem 262:14366, 1987.
Kozak M: Compilation and analysis of sequences upstream from the translational start site in eukaryotic mRNAs. Nucl Acids Res 12:857, 1984.
Merrick WC: Mechanism and regulation of eukaryotic protein synthesis. Microbiol Rev 56:291, 1992.
Maniatis T, Goodbourn S, Fischer JA: Regulation of inducible and tissue-specific gene expression. Science 236:1237, 1987.
Rouault T, Klausner R: Regulation of iron metabolism in eukaryotes. Curr Top Cell Regul 35:1–19, 1997.
Han J, Brown T, Beutler B: Endotoxin-responsive sequences control cachectin/tumor necrosis factor biosynthesis at the translational level. J Exp Med 171:465, 1990.
Han J, Beutler B, Huez G: Complex regulation of tumor necrosis factor mRNA turnover in lipopolysaccharide-activated macrophages. Biochim Biophys Acta 1090:22, 1991.
Beutler E, Gelbart T, Han J, Koziol JA, Beutler B: Evolution of the genome and the genetic code: Selection at the dinucleotide level by methylation and polyribonucleotide cleavage. Proc Natl Acad Sci USA 86:192, 1989.
Gridley T: Insertional versus targeted mutagenesis in mice. New Biol 3:1025, 1991.
Waldman AS: Targeted homologous recombination in mammalian cells. Crit Rev Oncol Hematol 12:49, 1992.
Weintraub HM: Antisense RNA and DNA. Sci Am 262:40, 1990.
Simons RW: Naturally occurring antisense RNA control—a brief review. Gene 72:35, 1988.
Weintraub LR, Goral A, Grasso J, et al: Pathogenesis of hepatic fibrosis in experimental iron overload. Br J Haematol 59:321, 1985.
Smith CC, Aurelian L, Reddy MP, Miller PS, Ts’o POP: Antiviral effect of an oligo(nucleoside methylphosphonate) complementary to the splice junction of herpes simplex virus type 1 immediate early pre-mRNAs 4 and 5. Proc Natl Acad Sci USA 83:2787, 1986.
McManaway ME, Neckers LM, Loke SL, et al: Tumour-specific inhibition of lymphoma growth by an antisense oligodeoxynucleotide. Lancet 335:808, 1990.
Szczylik C, Skorski T, Nicolaides NC, et al: Selective inhibition of leukemia cell proliferation by BCR-ABL antisense oligodeoxynucleotides. Science 253:562, 1991.
Cotter FE, Johnson P, Hall P, et al: Antisense oligonucleotides suppress B-cell lymphoma growth in a SCID-hu mouse model. Oncogene 9:3049, 1994.
Smith MJ, Prochownik EV: Inhibition of c-jun causes reversible proliferative arrest and withdrawal from the cell cycle. Blood 79:2107, 1992.
Chen CJ, Banerjea AC, Harmison GG, Haglund K, Schubert M: Multitarget-ribozyme directed to cleave at up to nine highly conserved HIV-1 env RNA regions inhibits HIV-1 replication—potential effectiveness against most presently sequenced HIV-1 isolates. Nucl Acids Res 20:4581, 1992.
Heidenreich O, Eckstein F: Hammerhead ribozyme-mediated cleavage of the long terminal repeat RNA of human immunodeficiency virus type 1. J Biol Chem 267:1904, 1992.
Kuwabara T, Warashina M, Tanabe T, et al: Comparison of the specificities and catalytic activities of hammerhead ribozymes and DNA enzymes with respect to the cleavage of BCR-ABL chimeric L6 (b2a2) mRNA. Nucl Acids Res 25:3074, 1997.
Burt MJ, Smit DJ, Pyper WR, Powell LW, Jazwinska EC: A 4.5-megabase YAC contig and physical map over the hemochromatosis gene region. Genomics 33:153, 1996.
Schuler GD, Boguski MS, Stewart EA, et al: A gene map of the human genome. Science 274:540, 1996.
Amplification of nucleic acid sequences: The choices multiply. J NIH Res 3:81, 1991.
Innis MA, Gelfand DH, Sninsky JJ, White TJ (eds): PCR Protocols: A Guide to Methods and Applications. Academic Press, San Diego, 1990.
De Melo MB, Sales TSI, Lorand-Metze I, Costa FF: Rapid method for isolation of DNA from glass slide smears for PCR. Acta Haematol (Basel) 87:214, 1992.
DeSalle R, Gatesy J, Wheeler W, Grimaldi D: DNA sequences from a fossil termite in Oligo-Miocene amber and their phylogenetic implications. Science 257:1933, 1992.
Southern E: Gel electrophoresis of restriction fragments. Methods Enzymol 68:152, 1979.
Chang JC, Kan YW: Antenatal diagnosis of sickle cell anaemia by direct analysis of the sickle mutation. Lancet 2:1127, 1981.
Sanger F, Nicklen S, Coulson AR: DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci USA 74:5463, 1977.
Rosenthal N: Molecular medicine. Recognizing DNA. N Engl J Med 333:925, 1995.
Winship PR: An improved method for directly sequencing PCR amplified material using dimethyl sulphoxide. Nucl Acids Res 17:1266, 1989.
Beutler E, Kuhl W, Gelbart T, Forman L: DNA sequence abnormalities of human glucose-6-phosphate dehydrogenase variants. J Biol Chem 266:4145, 1991.
Kumar R, Dunn LL: Designed diagnostic restriction fragment length polymorphisms for the detection of point mutations in ras oncogenes. Oncogene Res 4:235, 1989.
Chehab FF, Kan YW: Detection of specific DNA sequences by fluorescence amplification: A color complementation assay. Proc Natl Acad Sci USA 86:9178, 1989.
Mistry PK, Smith SJ, Ali M, et al: Genetic diagnosis of Gaucher’s disease. Lancet 339:889, 1992.
Landegren U, Kaiser R, Sanders J, Hood L: A ligase-mediated gene detection technique. Science 241:1077, 1988.
Barany F: Genetic disease detection and DNA amplification using cloned thermostable ligase. Proc Natl Acad Sci USA 88:189, 1991.
Kalin I, Shephard S, Candrian U: Evaluation of the ligase chain reaction (LCR) for the detection of point mutations. Mutat Res 283:119, 1992.
Beutler E, Gelbart T, West C, et al: Mutation analysis in hereditary hemochromatosis. Blood Cells Mol Dis 22:187, 1996.
Cai SP, Zhang JZ, Huang DH, Wang ZX, Kan YW: A simple approach to prenatal diagnosis of beta-thalassemia in a geographic area where multiple mutations occur. Blood 71:1357, 1988.
Mashiyama S, Murakami Y, Yoshimoto T, Sekiya T, Hayashi K: Detection of p53 gene mutations in human brain tumors by single-strand conformation polymorphism analysis of polymerase chain reaction products. Oncogene 6:1313, 1991.
Orita M, Iwahana H, Kanazawa H, Hayashi K, Sekiya T: Detection of polymorphisms of human DNA by gel electrophoresis as single-strand conformation polymorphisms. Proc Natl Acad Sci USA 86:2766, 1989.
Myers RM, Larin Z, Maniatis T: Detection of single base substitutions by ribonuclease cleavage at mismatches in RNA:DNA duplexes. Science 230:1242, 1985.
Abrams ES, Murdaugh SE, Lerman LS: Comprehensive detection of single base changes in human genomic DNA using denaturing gradient gel electrophoresis and a GC clamp. Genomics 7:463, 1990.
Ryan TM, Ciavatta DJ, Townes TM: Knockout-transgenic mouse model of sickle cell disease. Science 278:873, 1997.
Paszty C, Brion CM, Manci E, et al: Transgenic knockout mice with exclusively human sickle hemoglobin and sickle cell disease. Science 278:876, 1997.
Ohno S: Evolution by Gene Duplication. Springer-Verlag, Berlin, 1970.
Zimran A, Sorge J, Gross E, et al: A glucocerebrosidase fusion gene in Gaucher disease. Implications for the molecular anatomy, pathogenesis and diagnosis of this disorder. J Clin Invest 85:219, 1990.
Board PG, Coggan M, Woodcock DM: The human Pi class glutathione transferase sequence at 12q13-q14 is a reverse-transcribed pseudogene. Genomics 14:470, 1992.
Youssoufian H, Kazazian HH Jr, Phillips DG, et al: Recurrent mutations in haemophilia A give evidence for CpG mutation hotspots. Nature 324:380, 1986.
Vulliamy TJ, D’Urso M, Battistuzzi G, et al: Diverse point mutations in the human glucose 6-phosphate dehydrogenase gene cause enzyme deficiency and mild or severe hemolytic anemia. Proc Natl Acad Sci USA 85:5171, 1988.
Baltimore D: Gene conversion: Some implications for immunoglobulin genes. Cell 24:592, 1981.
Hess JF, Schmid CW, Shen CK: A gradient of sequence divergence in the human adult alpha-globin duplication units. Science 226:67, 1984.
Amara SG, Jonas V, Rosenfeld MG, Ong ES, Evans RM: Alternative RNA processing in calcitonin gene expression generates mRNAs encoding different polypeptide products. Nature 298:240, 1982.
Pihlajaniemi T, Myllyla R, Seyer J, Kurkinen M, Prockop DJ: Partial characterization of a low molecular weight human collagen that undergoes alternative splicing. Proc Natl Acad Sci USA 84:940, 1987.
Ad Hoc Committee on Mutation Nomenclature: Update on nomenclature for human gene mutations. Hum Mutat 8:197, 1996.
Antonarakis SE: Recommendations for a nomenclature system for human gene mutations. Nomenclature Working Group. Hum Mutat 11:1, 1998.
Beaudet AL, Tsui LC: A suggested nomenclature for designating mutations. Hum Mutat 4:245, 1993.
Beutler E, McKusick VA, Motulsky AG, Scriver CR, Hutchinson F: Mutation nomenclature: Nicknames, systematic names, and unique identifiers. Hum Mutat 8:203, 1996.
Kaplan M, Renbaum P, Levy-Lahad E, et al: Gilbert syndrome and glucose-6-phosphate dehydrogenase deficiency: A dose-dependent genetic interaction crucial to neonatal hyperbilirubinemia. Proc Natl Acad Sci USA 94:12128, 1997.
Sampietro M, Lupica L, Perrero L, et al: The expression of uridine diphosphate glucuronosyltransferase gene is a major determinant of bilirubin level in heterozygous beta-thalassaemia and in glucose-6-phosphate dehydrogenase deficiency. Br J Haematol 99:437, 1997.
Copyright © 2001 McGraw-Hill
Ernest Beutler, Marshall A. Lichtman, Barry S. Coller, Thomas J. Kipps, and Uri Seligsohn