1 00:00:00,420 --> 00:00:03,450 [Instructor] Hello and welcome to Module 5. 2 00:00:03,450 --> 00:00:04,770 So we will go through 3 00:00:04,770 --> 00:00:06,900 three different lectures in this module. 4 00:00:06,900 --> 00:00:09,210 And in the first one we're going to start by 5 00:00:09,210 --> 00:00:10,950 really thinking about how to piece together 6 00:00:10,950 --> 00:00:15,570 some of what we've learned in the first couple of modules 7 00:00:15,570 --> 00:00:20,220 and expand that to patterns of inheritance 8 00:00:20,220 --> 00:00:22,590 and the actual traits that you see. 9 00:00:22,590 --> 00:00:24,510 So we're gonna start this with a lecture called 10 00:00:24,510 --> 00:00:26,100 genotype to phenotype. 11 00:00:26,100 --> 00:00:29,100 And as you remember, a genotype is the sequence 12 00:00:29,100 --> 00:00:31,170 of a particular gene. 13 00:00:31,170 --> 00:00:36,040 So if someone has, say, a disease mutation, 14 00:00:38,333 --> 00:00:40,470 one that causes a disease, it's the actual mutation, 15 00:00:40,470 --> 00:00:42,600 the sequence of the gene itself, that's the genotype. 16 00:00:42,600 --> 00:00:44,040 The phenotype is the trait. 17 00:00:44,040 --> 00:00:46,050 That's actually what you see in a person. 18 00:00:46,050 --> 00:00:48,090 And that might be the disease. 19 00:00:48,090 --> 00:00:53,090 It might be something else like eye color, hair color, 20 00:00:53,268 --> 00:00:56,460 height, these are all traits, 21 00:00:56,460 --> 00:00:59,549 these are all phenotypes that are affected, 22 00:00:59,549 --> 00:01:01,969 in some cases directly caused by, 23 00:01:01,969 --> 00:01:04,128 and in some cases just affected by 24 00:01:04,128 --> 00:01:06,348 a particular gene sequence or the genotype. 25 00:01:06,348 --> 00:01:10,290 So genotype determines, or at least influences, phenotype. 
26 00:01:10,290 --> 00:01:11,970 So we're going to actually take this 27 00:01:11,970 --> 00:01:14,250 and look into it in a little bit more depth 28 00:01:14,250 --> 00:01:17,970 and better understand how we can transition from 29 00:01:17,970 --> 00:01:22,947 what we've learned about gene sequences and gene expression 30 00:01:23,880 --> 00:01:28,880 and transition that to the actual traits that we see. 31 00:01:29,431 --> 00:01:32,220 And take that into the next couple of lectures 32 00:01:32,220 --> 00:01:35,040 where we start to think about patterns of inheritance 33 00:01:35,040 --> 00:01:38,407 that you might record, say in a pedigree 34 00:01:38,407 --> 00:01:41,103 from a family history that a patient might give you. 35 00:01:42,750 --> 00:01:44,310 Let's do a quick review. 36 00:01:44,310 --> 00:01:47,910 DNA sequence is copied into mRNA during transcription. 37 00:01:47,910 --> 00:01:50,310 mRNA sequence is read as codons 38 00:01:50,310 --> 00:01:53,010 or triplets of bases in a specific order. 39 00:01:53,010 --> 00:01:55,230 And this will become really important when we start talking 40 00:01:55,230 --> 00:01:56,790 about different kinds of mutations 41 00:01:56,790 --> 00:02:00,510 and the impact that can have on the protein 42 00:02:00,510 --> 00:02:05,323 that's made from the instructions from this gene. 43 00:02:05,323 --> 00:02:09,510 During translation, each codon encodes for one amino acid 44 00:02:09,510 --> 00:02:12,150 or instructs the ribosome to stop translation. 45 00:02:12,150 --> 00:02:15,240 So you remember that there are 20 different amino acids 46 00:02:15,240 --> 00:02:19,800 and most of those have a couple of different codons 47 00:02:19,800 --> 00:02:21,240 that can encode for them. 48 00:02:21,240 --> 00:02:24,060 And a codon, as you recall, is just a set 49 00:02:24,060 --> 00:02:28,080 of three bases together, but in a specific order. 
50 00:02:28,080 --> 00:02:31,980 So if you remember, there was that codon table 51 00:02:31,980 --> 00:02:34,770 and you can double check that if you'd like 52 00:02:34,770 --> 00:02:37,440 to refresh your memory of what that actually was. 53 00:02:37,440 --> 00:02:42,440 But it's that table and you have basically the first base on 54 00:02:42,474 --> 00:02:45,120 the left hand column, 55 00:02:45,120 --> 00:02:47,370 the second base going horizontally across, 56 00:02:47,370 --> 00:02:51,030 and then the third base going down on the right hand column. 57 00:02:51,030 --> 00:02:53,430 And you can look on the table for any combination 58 00:02:53,430 --> 00:02:58,320 of three bases and see which amino acid it encodes 59 00:02:58,320 --> 00:03:02,013 or if it encodes for a stop signal, for example. 60 00:03:03,240 --> 00:03:05,550 Proteins are made up of a chain of amino acids, 61 00:03:05,550 --> 00:03:07,380 strung together in a specific order 62 00:03:07,380 --> 00:03:10,890 to give the protein its specific shape and function. 63 00:03:10,890 --> 00:03:13,560 Proteins do virtually everything in a cell 64 00:03:13,560 --> 00:03:16,410 so getting the amino acid sequence exactly correct 65 00:03:16,410 --> 00:03:18,090 is really important. 66 00:03:18,090 --> 00:03:20,910 And in some proteins, in some specific parts 67 00:03:20,910 --> 00:03:23,820 of certain proteins, if there are changes in amino acids, 68 00:03:23,820 --> 00:03:27,330 it may not actually affect the protein's function, 69 00:03:27,330 --> 00:03:32,330 but certainly for others there are regions of the protein 70 00:03:32,430 --> 00:03:36,630 where getting the exact amino acid in the exact right order 71 00:03:36,630 --> 00:03:39,450 is absolutely critical, and a change there can completely alter 72 00:03:39,450 --> 00:03:42,423 or even eliminate the normal function of the protein. 
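To make the review above a bit more concrete, here is a minimal Python sketch of reading a sequence codon by codon. It is only an illustration under a few assumptions: the input DNA is taken to be the coding strand (so transcription is modeled as simply swapping T for U), only a handful of entries from the standard codon table are included, and the names CODON_TABLE, transcribe, and translate are made up for this example rather than taken from the lecture.

```python
# A small subset of the standard codon table (mRNA codon -> amino acid).
# Only the codons used in this example are listed; a full table has 64 entries.
CODON_TABLE = {
    "AUG": "Met",                       # also the usual start codon
    "UUU": "Phe", "UUC": "Phe",         # two codons, same amino acid
    "GAA": "Glu", "GAG": "Glu",         # two codons, same amino acid
    "GUG": "Val",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def transcribe(coding_strand):
    """Model transcription of a coding-strand DNA sequence by replacing T with U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein


mrna = transcribe("ATGTTTGAGTAA")
print(mrna)                       # AUGUUUGAGUAA
print(translate(mrna))            # ['Met', 'Phe', 'Glu']
print(translate("AUGUUCGAAUAA"))  # ['Met', 'Phe', 'Glu'] (different codons, same protein)
```

The last two calls give the same amino acid chain from two different sequences, which is one reason a single base change does not always change the protein.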
73 00:03:44,340 --> 00:03:46,920 Here are a few important terms and definitions 74 00:03:46,920 --> 00:03:50,105 that we'll talk about in this particular lecture. 75 00:03:50,105 --> 00:03:52,140 So first of all, genotype. 76 00:03:52,140 --> 00:03:55,020 This is the nucleotide sequence of a gene. 77 00:03:55,020 --> 00:03:58,991 So genotype is the sequence of a particular gene. 78 00:03:58,991 --> 00:04:02,070 The phenotype, this is the trait or characteristic 79 00:04:02,070 --> 00:04:03,180 seen in a person. 80 00:04:03,180 --> 00:04:05,490 So an example would be the presence of a disease. 81 00:04:05,490 --> 00:04:10,090 Another example of a phenotype would be eye color or height 82 00:04:11,261 --> 00:04:14,820 or whether or not a person develops type 1 diabetes. 83 00:04:14,820 --> 00:04:16,950 These are all called phenotypes. 84 00:04:16,950 --> 00:04:18,570 So it's basically what you would see 85 00:04:18,570 --> 00:04:20,640 or experience in a person. 86 00:04:20,640 --> 00:04:25,230 The genotype is the DNA sequence for the particular genes 87 00:04:25,230 --> 00:04:27,123 that would affect the phenotype. 88 00:04:28,200 --> 00:04:30,510 And allele, this is an important definition and one 89 00:04:30,510 --> 00:04:34,470 I'd like you to try to pick up on, 90 00:04:34,470 --> 00:04:37,290 because we're going to start using this term a lot. 91 00:04:37,290 --> 00:04:39,450 And it's important that you understand what it means 92 00:04:39,450 --> 00:04:42,060 because otherwise it can get kind of confusing. 93 00:04:42,060 --> 00:04:45,090 But an allele is one of the alternate versions of a gene. 94 00:04:45,090 --> 00:04:47,580 And what I mean by that is, 95 00:04:47,580 --> 00:04:50,290 let's take for example, the eye color gene. 
96 00:04:50,290 --> 00:04:53,520 Let's say we actually have a couple of different genes 97 00:04:53,520 --> 00:04:55,590 that affect eye color, but let's make it simple 98 00:04:55,590 --> 00:04:58,653 and just say there's one gene that affects eye color. 99 00:05:00,060 --> 00:05:04,650 So you would have in, say, 10 different people, 100 00:05:04,650 --> 00:05:07,350 you might have three or four different eye colors. 101 00:05:07,350 --> 00:05:10,300 You could have green, brown, blue, 102 00:05:10,300 --> 00:05:12,807 even like a grayish color. 103 00:05:12,807 --> 00:05:14,790 And you know, there are many other colors. 104 00:05:14,790 --> 00:05:16,860 But basically what we're talking about when we're talking 105 00:05:16,860 --> 00:05:20,700 about an allele, if we think about that eye color gene, 106 00:05:20,700 --> 00:05:23,820 let's say, so it's a gene which encodes a protein 107 00:05:23,820 --> 00:05:27,250 which impacts directly the eye color of an individual. 108 00:05:27,250 --> 00:05:31,080 And a person who has brown eyes, they would have an allele 109 00:05:31,080 --> 00:05:36,000 for a brown eye color for that particular eye color gene. 110 00:05:36,000 --> 00:05:39,930 And another person, they might have a blue eye color allele. 111 00:05:39,930 --> 00:05:43,950 So the allele is just basically a different version, 112 00:05:43,950 --> 00:05:48,423 a slight modification in the sequence of a gene, 113 00:05:49,740 --> 00:05:51,420 which impacts the phenotype. 114 00:05:51,420 --> 00:05:55,260 So it would give you a different phenotype or, you know, 115 00:05:55,260 --> 00:05:58,800 slightly different outcome for that particular trait. 
116 00:05:58,800 --> 00:06:01,720 So alleles, you can have many different alleles 117 00:06:02,700 --> 00:06:06,397 sort of in the general population for a particular gene, 118 00:06:06,397 --> 00:06:08,670 you know, you might have hundreds 119 00:06:08,670 --> 00:06:10,470 or more different versions, 120 00:06:10,470 --> 00:06:12,873 slight variations on a particular gene. 121 00:06:15,390 --> 00:06:17,560 Each person will only have two alleles 122 00:06:18,630 --> 00:06:21,480 because you have two copies of each gene. 123 00:06:21,480 --> 00:06:24,270 So you might have two alleles that are exactly the same. 124 00:06:24,270 --> 00:06:25,770 Let's go back to the eye color example. 125 00:06:25,770 --> 00:06:30,190 Maybe both of your parents had blue eyes 126 00:06:31,260 --> 00:06:36,260 and let's say they both donated to you the blue eye allele. 127 00:06:36,570 --> 00:06:39,840 So say you'd have blue eyes in that case, 128 00:06:39,840 --> 00:06:42,480 and both of your alleles, both of your copies 129 00:06:42,480 --> 00:06:46,185 of the eye color gene would be for the blue eye color. 130 00:06:46,185 --> 00:06:50,433 You can also have two different alleles. 131 00:06:53,291 --> 00:06:56,247 So two slightly different versions of the same gene. 132 00:06:57,420 --> 00:07:01,130 And then the question becomes, what would your phenotype be 133 00:07:01,130 --> 00:07:03,480 if you have two different alleles? 134 00:07:03,480 --> 00:07:06,450 Well, we'll talk about that in a lot of detail 135 00:07:06,450 --> 00:07:08,070 in the next lecture. 136 00:07:08,070 --> 00:07:10,623 So hold on to those questions. 137 00:07:12,000 --> 00:07:15,960 A SNP or single nucleotide polymorphism is a single base 138 00:07:15,960 --> 00:07:17,370 change in a DNA sequence, 139 00:07:17,370 --> 00:07:20,610 which may or may not be associated with a phenotype. 140 00:07:20,610 --> 00:07:23,643 And what I mean by that is there could be a slight change. 
141 00:07:23,643 --> 00:07:26,130 It's basically changing one base to another base. 142 00:07:26,130 --> 00:07:29,880 So an A to a T or a C to an A 143 00:07:29,880 --> 00:07:31,207 or something like that. 144 00:07:31,207 --> 00:07:35,100 And that may result in a different amino acid being coded 145 00:07:35,100 --> 00:07:37,440 for if this is occurring within a gene. 146 00:07:37,440 --> 00:07:41,130 And that may or may not impact the protein's function, 147 00:07:41,130 --> 00:07:44,433 which may or may not impact the phenotype for a gene. 148 00:07:45,270 --> 00:07:47,460 Or it may or may not impact the phenotype 149 00:07:47,460 --> 00:07:49,683 for that particular allele. 150 00:07:52,500 --> 00:07:55,200 All right, wild type is what we would call 151 00:07:55,200 --> 00:07:57,600 the most common allele of a gene. 152 00:07:57,600 --> 00:08:00,630 So we would refer to sort of what you might consider 153 00:08:00,630 --> 00:08:01,740 the normal copy, 154 00:08:01,740 --> 00:08:03,600 especially if we're talking about a disease. 155 00:08:03,600 --> 00:08:05,010 Now this isn't really going to be the case 156 00:08:05,010 --> 00:08:07,217 for something like eye color or hair color 157 00:08:07,217 --> 00:08:11,460 or things like that, but more when we're talking about 158 00:08:11,460 --> 00:08:15,390 a disease versus non-disease versus, you know, 159 00:08:15,390 --> 00:08:17,160 sort of a normal functioning copy. 160 00:08:17,160 --> 00:08:19,704 The wild type is going to really be the normal functioning 161 00:08:19,704 --> 00:08:24,480 version of a gene, but it's considered wild type. 162 00:08:24,480 --> 00:08:28,170 And it was given that name by earlier geneticists 163 00:08:28,170 --> 00:08:33,170 who were studying genetics in different species of animals. 
164 00:08:35,340 --> 00:08:36,870 And they would go out into the wild 165 00:08:36,870 --> 00:08:41,820 and they would, you know, catch their particular animals 166 00:08:41,820 --> 00:08:42,653 that they're studying. 167 00:08:42,653 --> 00:08:43,530 And for many of them it was, you know, 168 00:08:43,530 --> 00:08:45,060 fruit flies that they would catch. 169 00:08:45,060 --> 00:08:49,650 And so they would say the most common 170 00:08:49,650 --> 00:08:51,480 phenotypes or traits that they would see, 171 00:08:51,480 --> 00:08:53,850 they would consider those to be the wild ones. 172 00:08:53,850 --> 00:08:56,070 So what you would see out in the wild. 173 00:08:56,070 --> 00:08:58,680 So that's why it actually has the name wild type. 174 00:08:58,680 --> 00:09:01,560 Wild type really just means the normal version, 175 00:09:01,560 --> 00:09:02,940 the very common version 176 00:09:02,940 --> 00:09:05,520 that's mostly seen in the population. 177 00:09:05,520 --> 00:09:08,550 Mutant would be the allele with the change in the sequence 178 00:09:08,550 --> 00:09:10,680 compared to the wild type sequence. 179 00:09:10,680 --> 00:09:15,510 And this mutant most likely is related 180 00:09:15,510 --> 00:09:19,530 to a phenotype change. 181 00:09:19,530 --> 00:09:21,720 And in the case of diseases, 182 00:09:21,720 --> 00:09:24,720 most mutant alleles are actually associated 183 00:09:24,720 --> 00:09:26,523 with a person having a disease 184 00:09:26,523 --> 00:09:30,046 or having a greater susceptibility for a disease. 185 00:09:30,046 --> 00:09:34,890 De novo would be what we would refer to as a new mutation 186 00:09:34,890 --> 00:09:37,170 which occurs spontaneously in an individual. 187 00:09:37,170 --> 00:09:39,753 So it was not inherited from his or her parents. 188 00:09:40,710 --> 00:09:44,733 Monogenic is a disease which is caused by a single gene. 
189 00:09:47,799 --> 00:09:49,854 For example, in case of a disease, 190 00:09:49,854 --> 00:09:54,450 a mutation in one gene would be sufficient to cause 191 00:09:54,450 --> 00:09:56,643 an individual to have a disease. 192 00:09:58,590 --> 00:10:01,980 Multifactorial diseases, these are diseases which are caused 193 00:10:01,980 --> 00:10:04,680 by multiple genes and/or environmental influences. 194 00:10:04,680 --> 00:10:08,730 And actually, most diseases would fall into the category 195 00:10:08,730 --> 00:10:10,050 of being multifactorial. 196 00:10:10,050 --> 00:10:13,860 So they're influenced not just by one gene, say, 197 00:10:13,860 --> 00:10:18,057 maybe you have a slight influence from 10, 15, 198 00:10:18,057 --> 00:10:19,530 100 different genes 199 00:10:19,530 --> 00:10:22,050 and also could be influenced by the environment. 200 00:10:22,050 --> 00:10:26,700 And that could mean any combination of, you know, 201 00:10:26,700 --> 00:10:28,905 of things in a person, say for example, their diet, 202 00:10:28,905 --> 00:10:30,495 their level of exercise, 203 00:10:30,495 --> 00:10:33,944 any other comorbidities they may have. 204 00:10:33,944 --> 00:10:35,884 All of these may start to impact 205 00:10:35,884 --> 00:10:39,090 and influence a multifactorial disease. 206 00:10:39,090 --> 00:10:41,070 As opposed to a monogenic disease, 207 00:10:41,070 --> 00:10:43,350 which is really directly caused 208 00:10:43,350 --> 00:10:45,183 by a mutation in a single gene. 209 00:10:46,320 --> 00:10:48,180 We'll talk more about monogenic diseases 210 00:10:48,180 --> 00:10:50,460 in the next lecture as well. 211 00:10:50,460 --> 00:10:53,520 Multifactorial diseases, we'll go into those in more detail 212 00:10:53,520 --> 00:10:55,923 in the next module, so next week. 
213 00:10:57,210 --> 00:11:00,540 All right, so I sort of gave you an overview with that, 214 00:11:00,540 --> 00:11:01,590 but let's go back one more time 215 00:11:01,590 --> 00:11:02,580 'cause it is pretty important 216 00:11:02,580 --> 00:11:04,443 to get some of these terms correct. 217 00:11:05,400 --> 00:11:08,010 So as you recall, you have two copies of each gene, 218 00:11:08,010 --> 00:11:09,960 one from each parent, and the genotype 219 00:11:09,960 --> 00:11:12,210 will be the sequence of the gene. 220 00:11:12,210 --> 00:11:16,800 And so for each of us, if we're talking about a gene, 221 00:11:16,800 --> 00:11:19,320 which say is on an autosome, right? 222 00:11:19,320 --> 00:11:22,010 We each have two copies of all of those, too. 223 00:11:22,010 --> 00:11:26,424 So our genotype, we would have the genotype for 224 00:11:26,424 --> 00:11:31,424 both the maternal and our paternal copies of each gene. 225 00:11:31,530 --> 00:11:33,720 The phenotype is the trait or characteristic 226 00:11:33,720 --> 00:11:34,920 seen in the patient. 227 00:11:34,920 --> 00:11:39,030 And that would be in the case of what we'll be discussing 228 00:11:39,030 --> 00:11:41,040 more extensively, that would be, say, 229 00:11:41,040 --> 00:11:43,263 having a disease or not having a disease. 230 00:11:44,610 --> 00:11:47,430 And these copies or alleles may have slight differences. 231 00:11:47,430 --> 00:11:49,380 And we talked about this a moment ago 232 00:11:49,380 --> 00:11:51,870 with the eye color gene: in the population, one allele 233 00:11:51,870 --> 00:11:54,180 or sequence variant codes for brown eyes, one for blue, 234 00:11:54,180 --> 00:11:56,890 one for green, but they are all the same gene. 235 00:11:56,890 --> 00:11:58,410 When we're talking about different alleles, 236 00:11:58,410 --> 00:12:01,050 these are different alleles of the same gene. 
237 00:12:01,050 --> 00:12:03,360 A protein dysfunction related to a disease 238 00:12:03,360 --> 00:12:05,340 results from the patient having at least one 239 00:12:05,340 --> 00:12:08,700 copy of the gene with the disease or disrupted sequence 240 00:12:08,700 --> 00:12:09,870 or the disease allele, 241 00:12:09,870 --> 00:12:11,970 or what could be called the mutant allele. 242 00:12:13,500 --> 00:12:16,080 So I just wanna take one more quick moment 243 00:12:16,080 --> 00:12:19,320 to get the concept of alleles down. 244 00:12:19,320 --> 00:12:21,510 And I apologize if this is really being redundant, 245 00:12:21,510 --> 00:12:23,610 but I'd rather be a little bit redundant here 246 00:12:23,610 --> 00:12:26,370 than to just lose you a little bit 247 00:12:26,370 --> 00:12:29,400 when we start throwing around terms like allele and wild type, 248 00:12:29,400 --> 00:12:32,913 you know, when we move on in this material. 249 00:12:33,870 --> 00:12:36,690 So here's an analogy that might be useful. 250 00:12:36,690 --> 00:12:38,370 I hope it's useful to you. 251 00:12:38,370 --> 00:12:40,353 If you think of alleles, 252 00:12:41,264 --> 00:12:45,794 let's take an example of a deck of cards, okay? 253 00:12:45,794 --> 00:12:49,320 So like a regular deck of playing cards. 254 00:12:49,320 --> 00:12:53,103 So there are 52 different cards in a typical playing deck. 255 00:12:55,560 --> 00:12:58,590 So if we're using an analogy here, 256 00:12:58,590 --> 00:13:02,010 you can think of this as basically like saying 257 00:13:02,010 --> 00:13:05,283 there's the card gene, right? 258 00:13:07,413 --> 00:13:09,060 So all of the cards in the deck, 259 00:13:09,060 --> 00:13:11,580 even though there are 52 different cards in the deck, 260 00:13:11,580 --> 00:13:13,826 they're all cards, right? 261 00:13:13,826 --> 00:13:16,127 So let's say they're all different versions 262 00:13:16,127 --> 00:13:16,960 of the card gene. 
263 00:13:16,960 --> 00:13:18,243 We can call that the card gene. 264 00:13:19,410 --> 00:13:20,970 And there are 52 different versions of it 265 00:13:20,970 --> 00:13:22,320 that would be like saying there are 52 266 00:13:22,320 --> 00:13:23,940 different possible alleles. 267 00:13:23,940 --> 00:13:25,110 You could have the two of clubs, 268 00:13:25,110 --> 00:13:26,310 you could have the five of spades, 269 00:13:26,310 --> 00:13:27,420 you could have the queen of diamonds, 270 00:13:27,420 --> 00:13:28,650 you could have whatever, 271 00:13:28,650 --> 00:13:31,580 let's say there are 52 different alleles 272 00:13:31,580 --> 00:13:33,642 or slightly different versions of cards, 273 00:13:33,642 --> 00:13:36,990 but they're all cards so that they're all alleles 274 00:13:36,990 --> 00:13:38,580 of the same gene. 275 00:13:38,580 --> 00:13:41,370 It's just that they have slight changes. 276 00:13:41,370 --> 00:13:46,370 So you know, a small percentage of the bases are different 277 00:13:47,070 --> 00:13:49,710 between the different copies and the same, 278 00:13:49,710 --> 00:13:51,030 you know, could be true. 279 00:13:51,030 --> 00:13:53,550 If we go back to that analogy of the cards. 280 00:13:53,550 --> 00:13:55,935 So there are 52 different alleles, let's say 281 00:13:55,935 --> 00:13:59,943 in our population in the card gene. 282 00:14:01,260 --> 00:14:05,130 So for each person you get to have two alleles. 283 00:14:05,130 --> 00:14:07,770 Assuming this gene is on an autosome, right? 284 00:14:07,770 --> 00:14:08,970 Not on an X chromosome, 285 00:14:08,970 --> 00:14:11,070 but on an autosome you get to have two copies. 286 00:14:11,070 --> 00:14:14,430 So it's like taking two cards from the deck 287 00:14:14,430 --> 00:14:18,540 and you get whatever you get from that and those two cards. 
288 00:14:18,540 --> 00:14:22,740 And you know, let's say we have more than one deck of cards 289 00:14:22,740 --> 00:14:23,700 that are there, 290 00:14:23,700 --> 00:14:26,040 let's say we have five decks of cards, okay? 291 00:14:26,040 --> 00:14:28,290 So there are 52 different cards in the deck, 292 00:14:28,290 --> 00:14:29,970 but each of them is present, you know, 293 00:14:29,970 --> 00:14:32,940 in five copies. 294 00:14:32,940 --> 00:14:34,543 So when you pull two cards, 295 00:14:34,543 --> 00:14:36,870 you might pull two different cards, 296 00:14:36,870 --> 00:14:41,403 you might pull a two of clubs and a seven of spades, 297 00:14:42,390 --> 00:14:45,483 or you might pull two twos of clubs. 298 00:14:47,490 --> 00:14:50,340 And what we'll talk about in the next lecture would be 299 00:14:50,340 --> 00:14:53,290 how our phenotype can be affected 300 00:14:54,390 --> 00:14:57,270 by the two different alleles that we have. 301 00:14:57,270 --> 00:14:59,460 So let's go back for a second to that analogy again, 302 00:14:59,460 --> 00:15:02,070 I apologize if this isn't coming out as clearly 303 00:15:02,070 --> 00:15:07,070 as it is in my head, but it's basically like saying 304 00:15:07,230 --> 00:15:11,910 alleles are slightly different versions of the same gene, 305 00:15:11,910 --> 00:15:13,507 but they're all the same gene. 306 00:15:13,507 --> 00:15:16,062 So they're all cards in our analogy. 307 00:15:16,062 --> 00:15:18,650 And you can have many different alleles 308 00:15:18,650 --> 00:15:21,240 or slightly different versions of a particular gene 309 00:15:21,240 --> 00:15:25,170 in the population as a whole, right? 310 00:15:25,170 --> 00:15:30,170 But each individual person only possesses two alleles. 311 00:15:30,510 --> 00:15:32,730 And those two alleles might be different from one another 312 00:15:32,730 --> 00:15:35,310 and they might be the same as one another. 
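The card analogy can also be played out in a few lines of Python. This is just a sketch of the analogy, not a model of real inheritance: it assumes a "population" made of five shuffled decks, treats each of the 52 card faces as one allele of the hypothetical card gene, and gives each person two cards drawn at random. The names ALLELES, POPULATION_POOL, draw_genotype, and describe are invented for this example.

```python
import random
from itertools import product

RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]

# 52 different alleles of the hypothetical "card gene".
ALLELES = [f"{rank} of {suit}" for rank, suit in product(RANKS, SUITS)]

# Five decks: every allele is present five times in the population.
N_DECKS = 5
POPULATION_POOL = ALLELES * N_DECKS


def draw_genotype(rng):
    """Give one person two alleles by pulling two cards from the pooled decks."""
    return tuple(rng.sample(POPULATION_POOL, 2))


def describe(genotype):
    """Say whether the two alleles are the same version or different versions."""
    first, second = genotype
    if first == second:
        return "two copies of the same allele"
    return "two different alleles"


rng = random.Random(0)  # fixed seed so repeated runs draw the same cards
for _ in range(3):
    genotype = draw_genotype(rng)
    print(genotype, "->", describe(genotype))
```

Because the pool holds five copies of every card, a draw can return the same card twice, which corresponds to a person carrying two identical alleles; most draws will return two different ones.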
313 00:15:35,310 --> 00:15:38,490 Okay, I hope that's kind of starting 314 00:15:38,490 --> 00:15:39,873 to make sense a little bit. 315 00:15:41,250 --> 00:15:43,800 So let's take a quick peek at what this might mean 316 00:15:43,800 --> 00:15:46,380 and what are these changes that we're talking about? 317 00:15:46,380 --> 00:15:49,980 So small changes in the sequence are called mutations. 318 00:15:49,980 --> 00:15:54,980 So we would consider that the most common sequence 319 00:15:55,110 --> 00:16:00,110 for a particular gene is called the wild type allele. 320 00:16:00,960 --> 00:16:03,390 And any deviation from that in terms 321 00:16:03,390 --> 00:16:06,960 of the sequence would be considered a mutation. 322 00:16:06,960 --> 00:16:11,580 The mutation may be harmful in the end to the person, 323 00:16:11,580 --> 00:16:13,470 it may be completely neutral, 324 00:16:13,470 --> 00:16:16,153 actually most mutations are neutral, 325 00:16:16,153 --> 00:16:19,470 or it actually may be slightly beneficial. 326 00:16:19,470 --> 00:16:23,280 And that would be what drives evolution 327 00:16:23,280 --> 00:16:26,550 and progression of a species and adaptation. 328 00:16:26,550 --> 00:16:28,230 We're not gonna really get into that. 329 00:16:28,230 --> 00:16:29,160 We're gonna really more 330 00:16:29,160 --> 00:16:33,540 so focus on the detrimental side of mutations. 
331 00:16:33,540 --> 00:16:36,240 But just keep in mind that actually mutations are not 332 00:16:36,240 --> 00:16:40,110 entirely a bad thing, certainly not for a species 333 00:16:40,110 --> 00:16:43,650 because it can provide genetic diversity, 334 00:16:43,650 --> 00:16:45,999 which can be quite beneficial in that 335 00:16:45,999 --> 00:16:49,800 if we're responding to changes in our environment, 336 00:16:49,800 --> 00:16:53,760 those individuals who have an ability to, say, survive 337 00:16:53,760 --> 00:16:57,090 a specific set of conditions could survive 338 00:16:57,090 --> 00:16:58,754 and our species could propagate. 339 00:16:58,754 --> 00:17:00,300 Anyway, we're not gonna really talk 340 00:17:00,300 --> 00:17:01,680 about beneficial mutations, 341 00:17:01,680 --> 00:17:04,800 we're really going to focus on those mutations 342 00:17:04,800 --> 00:17:06,900 which are detrimental. 343 00:17:06,900 --> 00:17:09,300 So mutation is just a slight change. 344 00:17:09,300 --> 00:17:11,010 And let's see, in this particular example, 345 00:17:11,010 --> 00:17:14,970 it would be like taking this sequence here, 346 00:17:14,970 --> 00:17:16,530 let's say we're just showing the single strand 347 00:17:16,530 --> 00:17:21,510 of a gene's sequence, okay, that's just one strand 348 00:17:21,510 --> 00:17:23,730 and let's say it was actually mutated. 349 00:17:23,730 --> 00:17:26,340 So that's the wild type say on the top 350 00:17:26,340 --> 00:17:29,217 and on the bottom, this strand here. 351 00:17:29,217 --> 00:17:33,780 So changing that adenine base to a guanine, 352 00:17:33,780 --> 00:17:36,663 this would be a mutant version. 353 00:17:37,890 --> 00:17:39,659 So these could be two different alleles 354 00:17:39,659 --> 00:17:42,543 if they have different phenotypes. 355 00:17:44,070 --> 00:17:47,310 And this one mutation can result in big changes 356 00:17:47,310 --> 00:17:50,790 to the protein function, which can cause disease. 
357 00:17:50,790 --> 00:17:53,310 Or again, it may actually have no impact 358 00:17:53,310 --> 00:17:55,293 or it might just have a slight impact. 359 00:17:58,950 --> 00:18:02,220 So let's just take a couple of examples here. 360 00:18:02,220 --> 00:18:03,853 So sickle cell disease, 361 00:18:03,853 --> 00:18:07,650 a mutation in the hemoglobin beta gene results in a change 362 00:18:07,650 --> 00:18:09,240 in the hemoglobin protein, 363 00:18:09,240 --> 00:18:11,340 which affects its shape and function, 364 00:18:11,340 --> 00:18:14,040 the shape and function of hemoglobin. 365 00:18:14,040 --> 00:18:17,220 Mutated hemoglobin forms long, inflexible chains 366 00:18:17,220 --> 00:18:19,530 in red blood cells, making them stiff and angular 367 00:18:19,530 --> 00:18:22,380 and likely to become stuck in small capillaries 368 00:18:22,380 --> 00:18:24,450 reducing oxygen delivery to organs. 369 00:18:24,450 --> 00:18:29,280 So as you can see up here, the normal shape of hemoglobin 370 00:18:29,280 --> 00:18:32,790 is basically, you know, a nice sort of balled up shape 371 00:18:32,790 --> 00:18:34,410 and it needs to be that shape 372 00:18:34,410 --> 00:18:37,080 so it can properly bind to oxygen 373 00:18:37,080 --> 00:18:41,670 and can keep the red blood cells in the normal shape 374 00:18:41,670 --> 00:18:43,680 that our body has adapted to, 375 00:18:43,680 --> 00:18:46,020 and they can flow freely through capillaries 376 00:18:46,020 --> 00:18:48,990 and deliver oxygen throughout the body. 
377 00:18:48,990 --> 00:18:51,720 If however, there is a mutation, it's just actually 378 00:18:51,720 --> 00:18:56,720 a single base change in the hemoglobin beta gene, 379 00:18:57,690 --> 00:19:01,830 that can result in a change in the protein structure 380 00:19:01,830 --> 00:19:05,340 for hemoglobin beta, which can cause it to form 381 00:19:05,340 --> 00:19:06,870 these long, inflexible chains, 382 00:19:06,870 --> 00:19:09,240 which can affect the shape of the cell. 383 00:19:09,240 --> 00:19:12,510 And these hemoglobin proteins 384 00:19:12,510 --> 00:19:15,720 no longer bind oxygen as well either. 385 00:19:15,720 --> 00:19:19,770 So you have the combined effect of the blood cells 386 00:19:19,770 --> 00:19:21,300 getting stuck in capillaries 387 00:19:21,300 --> 00:19:22,740 and not being able to deliver oxygen. 388 00:19:22,740 --> 00:19:23,910 But also when they do get there, 389 00:19:23,910 --> 00:19:27,090 the oxygen is actually not bound as well. 390 00:19:27,090 --> 00:19:30,840 So that's like an example of a change in a protein 391 00:19:30,840 --> 00:19:35,840 affecting an outcome for an individual. 392 00:19:36,540 --> 00:19:41,470 So in this particular case we could be looking at at least 393 00:19:41,470 --> 00:19:46,260 two different alleles here, the normal or wild type allele 394 00:19:46,260 --> 00:19:49,623 which would encode the normal functioning hemoglobin here, 395 00:19:51,120 --> 00:19:56,050 and also an allele that has a single base mutation 396 00:19:58,043 --> 00:20:01,253 which would result in the hemoglobin being disrupted 397 00:20:02,280 --> 00:20:07,280 and give the phenotype of sickle cell disease. 398 00:20:08,520 --> 00:20:12,060 All right, let's look at one more example, cystic fibrosis. 
399 00:20:12,060 --> 00:20:16,650 So mutations in the CFTR gene affect the CFTR protein's 400 00:20:16,650 --> 00:20:19,530 ability to move chloride ions in and out of the cell, 401 00:20:19,530 --> 00:20:20,760 resulting in sticky mucus 402 00:20:20,760 --> 00:20:23,160 building up on the outside of the cell. 403 00:20:23,160 --> 00:20:26,040 And in the lungs this mucus buildup can lead to infections, 404 00:20:26,040 --> 00:20:29,250 and in the pancreas, the mucus can block digestive enzymes 405 00:20:29,250 --> 00:20:32,040 from being secreted leading to significant health problems. 406 00:20:32,040 --> 00:20:34,890 So here what we're looking at is on the left hand side, 407 00:20:34,890 --> 00:20:38,250 a drawing of a normal functioning CFTR channel, 408 00:20:38,250 --> 00:20:39,300 which is the protein. 409 00:20:39,300 --> 00:20:42,900 The protein's function is to act as basically like a channel 410 00:20:42,900 --> 00:20:46,560 through which ions can flow in and out of the cell, 411 00:20:46,560 --> 00:20:49,230 but you know, through the cell membrane. 412 00:20:49,230 --> 00:20:50,430 So that's the normal function, 413 00:20:50,430 --> 00:20:54,331 it allows chloride ions in and out as the cell needs it, 414 00:20:54,331 --> 00:20:56,550 and you don't get this buildup of mucus. 415 00:20:56,550 --> 00:21:00,420 If however, there is a mutation in the sequence 416 00:21:00,420 --> 00:21:02,130 for the CFTR gene. 417 00:21:02,130 --> 00:21:04,410 So that would be a mutant allele. 418 00:21:04,410 --> 00:21:07,830 If a person has a mutant allele for this, 419 00:21:07,830 --> 00:21:12,330 then the CFTR protein that's encoded would not 420 00:21:12,330 --> 00:21:15,420 function properly because it would not be built properly 421 00:21:15,420 --> 00:21:20,250 because the cell would be told to, basically, 422 00:21:20,250 --> 00:21:22,950 form the CFTR protein 423 00:21:22,950 --> 00:21:26,692 with some different amino acids than it should have. 
424 00:21:26,692 --> 00:21:29,790 Then the protein itself doesn't function properly 425 00:21:29,790 --> 00:21:31,620 and the chloride ions get stuck 426 00:21:31,620 --> 00:21:34,353 and it results in the buildup of mucus. 427 00:21:36,600 --> 00:21:38,400 All right, let's talk about mutations. 428 00:21:38,400 --> 00:21:41,310 So this is a change in a DNA sequence. 429 00:21:41,310 --> 00:21:43,770 New mutations occur randomly from uncorrected 430 00:21:43,770 --> 00:21:45,120 errors in DNA replication. 431 00:21:45,120 --> 00:21:47,520 You remember that, we talked about that. 432 00:21:47,520 --> 00:21:50,433 I believe that was in Module 3, 433 00:21:51,630 --> 00:21:55,260 and from DNA damage by mutagens. 434 00:21:55,260 --> 00:21:57,324 So we'll talk more about mutagens 435 00:21:57,324 --> 00:21:59,640 when we go into the cancer module. 436 00:21:59,640 --> 00:22:03,270 But mutagens are basically, you can kind of think 437 00:22:03,270 --> 00:22:07,587 of them as some toxins or chemicals or, you know, 438 00:22:09,120 --> 00:22:11,850 UV radiation from sunlight. 439 00:22:11,850 --> 00:22:14,190 Those types of things can be considered mutagens 440 00:22:14,190 --> 00:22:17,010 and they basically are essentially chemical structures 441 00:22:17,010 --> 00:22:21,240 that enter the cell and will damage DNA and cause mutations. 442 00:22:21,240 --> 00:22:23,790 Mutations could have been inherited from a parent 443 00:22:23,790 --> 00:22:27,030 or could also have occurred spontaneously in an individual. 444 00:22:27,030 --> 00:22:29,520 And these spontaneous mutations, 445 00:22:29,520 --> 00:22:32,493 which result in disease are called de novo mutations. 446 00:22:34,170 --> 00:22:36,425 Three basic types of changes can happen. 447 00:22:36,425 --> 00:22:39,060 And these are the three basic types of mutations. 448 00:22:39,060 --> 00:22:42,120 So point mutation or a change in one base to another base.
449 00:22:42,120 --> 00:22:43,560 And this is sometimes called a SNP, 450 00:22:43,560 --> 00:22:47,040 you remember that from the definitions page, 451 00:22:47,040 --> 00:22:49,113 a single nucleotide polymorphism. 452 00:22:50,100 --> 00:22:52,320 You can probably see why we call it SNPs. 453 00:22:52,320 --> 00:22:53,550 It's a little easier to say that than 454 00:22:53,550 --> 00:22:55,380 single nucleotide polymorphism. 455 00:22:55,380 --> 00:22:56,580 It's kind of a mouthful. 456 00:22:57,761 --> 00:23:00,570 So a point mutation is just a single change, 457 00:23:00,570 --> 00:23:02,880 nothing added, nothing removed. 458 00:23:02,880 --> 00:23:05,610 It's just basically, in this case what we're looking at here 459 00:23:05,610 --> 00:23:08,970 instead of the cytosine in the third position 460 00:23:08,970 --> 00:23:12,330 of the short sequence we're looking at, it's now an adenine. 461 00:23:12,330 --> 00:23:15,540 So it's a C to an A, point mutation, 462 00:23:15,540 --> 00:23:17,520 or change in the sequence. 463 00:23:17,520 --> 00:23:20,670 An insertion would be inserting a new base or bases. 464 00:23:20,670 --> 00:23:23,790 And here we're looking at between the two cytosines, 465 00:23:23,790 --> 00:23:25,950 the third and the fourth, we add a thymine. 466 00:23:25,950 --> 00:23:28,563 So it's now one base longer. 467 00:23:29,580 --> 00:23:33,180 And you could have an insertion of one base or, you know, 468 00:23:33,180 --> 00:23:36,780 any number of bases, 2, 3, 5, 10, 20, however many. 469 00:23:36,780 --> 00:23:40,140 A deletion would be deleting an existing base or bases. 470 00:23:40,140 --> 00:23:44,250 So here we're looking at the TGCC sequence. 471 00:23:44,250 --> 00:23:49,250 Now we're losing one of those cytosines, so it now is TGC. 472 00:23:49,350 --> 00:23:52,473 So one of the cytosines was lost, that's a deletion.
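The three basic types of change described above can be sketched as simple string operations. This is only an illustration: the starting sequence "TGCC" is taken from the deletion example in the narration and is assumed to be the same short sequence used for the point mutation and the insertion, and the function names are hypothetical.

```python
# Sketch of the three basic mutation types on a short DNA string.
# The sequence "TGCC" and the positions used are assumptions based on
# the narration, not a copy of the slide.

def point_mutation(seq: str, index: int, new_base: str) -> str:
    """Swap one base for another; the length stays the same."""
    return seq[:index] + new_base + seq[index + 1:]

def insertion(seq: str, index: int, bases: str) -> str:
    """Insert one or more bases in front of position `index`."""
    return seq[:index] + bases + seq[index:]

def deletion(seq: str, index: int, count: int = 1) -> str:
    """Remove `count` bases starting at position `index`."""
    return seq[:index] + seq[index + count:]

wild_type = "TGCC"
point = point_mutation(wild_type, 2, "A")  # C in the third position becomes A
inserted = insertion(wild_type, 3, "T")    # T added between the two cytosines
deleted = deletion(wild_type, 3)           # one cytosine lost

print(point, inserted, deleted)
```

Only the point mutation keeps the sequence the same length; the insertion makes it longer and the deletion makes it shorter.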
473 00:23:54,330 --> 00:23:57,630 So some possible effects of single nucleotide polymorphisms 474 00:23:57,630 --> 00:24:00,720 on the encoded protein or those point mutations. 475 00:24:00,720 --> 00:24:05,720 The end result, so what can it possibly do to the protein 476 00:24:05,850 --> 00:24:08,493 that could be encoded from this gene? 477 00:24:09,660 --> 00:24:11,430 The mutations could be silent, 478 00:24:11,430 --> 00:24:13,110 which means that while the sequence has changed, 479 00:24:13,110 --> 00:24:15,690 there is no change in the protein encoded. 480 00:24:15,690 --> 00:24:17,070 So this would be, for example, 481 00:24:17,070 --> 00:24:20,130 like a CGG being changed to a CGT. 482 00:24:20,130 --> 00:24:23,100 But both of these actually encode for alanine, 483 00:24:23,100 --> 00:24:24,330 the amino acid alanine. 484 00:24:24,330 --> 00:24:29,132 Go look at your amino acid chart if that's helpful to you. 485 00:24:29,132 --> 00:24:31,650 So the protein is the same either way. 486 00:24:31,650 --> 00:24:33,420 So there's no effect on phenotype, 487 00:24:33,420 --> 00:24:35,340 so this is a silent mutation. 488 00:24:35,340 --> 00:24:37,020 And we actually have quite a few of these 489 00:24:37,020 --> 00:24:40,560 because there's no selection against those. 490 00:24:40,560 --> 00:24:42,420 So individuals will survive. 491 00:24:42,420 --> 00:24:45,240 There'll be no effect whatsoever if this particular kind 492 00:24:45,240 --> 00:24:49,560 of silent mutation occurs because the codon 493 00:24:50,610 --> 00:24:52,764 still encodes for the same amino acid 494 00:24:52,764 --> 00:24:54,420 'cause if you remember, there are only 20 amino acids, 495 00:24:54,420 --> 00:24:59,373 but there are 64 different possible DNA triplets or codons.
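The numbers in that last point, 64 codons for only 20 amino acids, can be checked with a short sketch. It assumes the standard genetic code written as mRNA codons in the usual U, C, A, G ordering with one-letter amino acid codes, where "*" marks a stop; the variable names are illustrative only.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, mRNA codons ordered U, C, A, G at each position.
# One-letter amino acid codes; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group the codons by the amino acid (or stop signal) they encode.
codons_per_amino_acid = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    codons_per_amino_acid[aa].append(codon)

n_codons = len(CODON_TABLE)
n_amino_acids = len([aa for aa in codons_per_amino_acid if aa != "*"])

print(n_codons, "codons for", n_amino_acids, "amino acids")
print("Alanine (A):", codons_per_amino_acid["A"])
print("Stop (*):", codons_per_amino_acid["*"])
```

Most amino acids end up with two to six codons. Methionine and tryptophan are the exceptions, with one codon each, so a single base change in those codons can never be silent.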
496 00:25:00,341 --> 00:25:03,180 You remember that, that occurs in translation 497 00:25:03,180 --> 00:25:06,780 where the ribosome is reading each of the triplets, 498 00:25:06,780 --> 00:25:09,630 each of the codons and each of those triplets, 499 00:25:09,630 --> 00:25:12,288 it's telling a specific amino acid to add. 500 00:25:12,288 --> 00:25:14,640 Well, there's what's called redundancy, 501 00:25:14,640 --> 00:25:18,270 which basically means that, as we talked about before, 502 00:25:18,270 --> 00:25:19,680 since there are only 20 amino acids 503 00:25:19,680 --> 00:25:22,387 and there are 64 different possible codons, 504 00:25:22,387 --> 00:25:24,150 each amino acid actually 505 00:25:24,150 --> 00:25:26,550 has multiple different codons that encode it. 506 00:25:26,550 --> 00:25:29,160 So you could have a single base change 507 00:25:29,160 --> 00:25:31,620 and actually have no change whatsoever 508 00:25:31,620 --> 00:25:33,470 in the amino acid that it's encoding. 509 00:25:34,950 --> 00:25:37,740 All right. See, it's all starting to come together. 510 00:25:37,740 --> 00:25:39,570 Hopefully, you're starting to see 511 00:25:39,570 --> 00:25:43,350 that all the hard work you put into learning the material 512 00:25:43,350 --> 00:25:44,700 in the first couple of modules 513 00:25:44,700 --> 00:25:48,754 is starting to come back around now and be reinforced. 514 00:25:48,754 --> 00:25:50,283 So that's the idea anyway. 515 00:25:51,180 --> 00:25:55,390 Okay, so the next type of possible effect of a SNP 516 00:25:56,280 --> 00:25:59,880 will be a missense mutation, a missense. 517 00:25:59,880 --> 00:26:03,000 So again, there's some funny words in genetics. 518 00:26:03,000 --> 00:26:05,310 We talked about the antisense strand. 519 00:26:05,310 --> 00:26:08,790 This is a missense mutation, and missense, 520 00:26:08,790 --> 00:26:12,660 it means it results in one amino acid change in the protein.
521 00:26:12,660 --> 00:26:16,200 So let's take an example of that CGG sequence, 522 00:26:16,200 --> 00:26:18,870 instead of it being changed to CGT, 523 00:26:18,870 --> 00:26:21,000 which was coding for the same amino acid. 524 00:26:21,000 --> 00:26:23,880 Let's say that the mutation was changing 525 00:26:23,880 --> 00:26:27,090 that first C to a T, so now it's a TGG. 526 00:26:27,090 --> 00:26:29,610 This would change the protein from having an alanine in 527 00:26:29,610 --> 00:26:31,504 that location to having a threonine. 528 00:26:31,504 --> 00:26:33,630 Because if you look up on your chart, you would see 529 00:26:33,630 --> 00:26:38,550 that a CGG codes for alanine, but a TGG codes for threonine, 530 00:26:38,550 --> 00:26:40,560 which is a different amino acid, 531 00:26:40,560 --> 00:26:42,889 but would only affect that one 532 00:26:42,889 --> 00:26:44,940 where the mutation is located. 533 00:26:44,940 --> 00:26:46,980 And this may have an effect on the phenotype, 534 00:26:46,980 --> 00:26:50,340 by possibly changing the protein shape 535 00:26:50,340 --> 00:26:52,080 and therefore its function. 536 00:26:52,080 --> 00:26:55,110 And certainly there are diseases as we already talked about, 537 00:26:55,110 --> 00:26:57,000 sickle cell disease, is the result 538 00:26:57,000 --> 00:27:00,780 of a single amino acid being changed. 539 00:27:00,780 --> 00:27:03,840 Now, that's not the case for most amino acids. 540 00:27:03,840 --> 00:27:07,920 I'd say in most proteins there's a little bit of flexibility 541 00:27:07,920 --> 00:27:09,180 in terms of their function 542 00:27:09,180 --> 00:27:11,730 being maintained even if there's a single change. 543 00:27:11,730 --> 00:27:15,660 But there are those very specific amino acids 544 00:27:15,660 --> 00:27:18,750 which are really important to have the right ones there.
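The CGG examples can be checked with a small sketch. It assumes, as the lecture's examples appear to, that the DNA triplets are written as they sit on the template strand, so each one is first complemented into its mRNA codon and then looked up in the standard genetic code. Strand direction is ignored for simplicity, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code for mRNA codons; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Base pairing used when the template strand is copied into mRNA.
TEMPLATE_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_codon: str) -> str:
    """Complement a template-strand triplet into its mRNA codon."""
    return "".join(TEMPLATE_TO_MRNA[base] for base in template_codon)

def classify(wild_type: str, mutant: str) -> str:
    """Classify a point mutation by its effect on the encoded amino acid."""
    before = CODON_TABLE[transcribe(wild_type)]
    after = CODON_TABLE[transcribe(mutant)]
    if before == after:
        return "silent"
    if after == "*":
        return "nonsense"
    return "missense"

print(classify("CGG", "CGT"))  # both transcribe to alanine codons
print(classify("CGG", "TGG"))  # alanine codon becomes a threonine codon
```

The classification only says whether the amino acid changed. It says nothing about how much that change matters for the protein, which depends on where in the protein the amino acid sits.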
545 00:27:18,750 --> 00:27:22,176 And so missense mutations can actually cause 546 00:27:22,176 --> 00:27:24,573 or certainly lead to disease. 547 00:27:25,740 --> 00:27:28,620 A nonsense mutation results in a truncated protein 548 00:27:28,620 --> 00:27:32,100 because of the creation of a stop codon. 549 00:27:32,100 --> 00:27:35,820 So let's take the example of our original, our wild type. 550 00:27:35,820 --> 00:27:39,060 Remember our wild type sequence is TTT, 551 00:27:39,060 --> 00:27:41,970 and that this is now mutated to ATT, 552 00:27:41,970 --> 00:27:44,850 which would result in the ribosomes reading a stop signal 553 00:27:44,850 --> 00:27:46,170 instead of a lysine. 554 00:27:46,170 --> 00:27:49,350 So none of the codons following this mutation will be read 555 00:27:49,350 --> 00:27:52,350 or translated resulting in a shortened protein. 556 00:27:52,350 --> 00:27:56,460 So TTT codes for lysine, and what you would normally have, 557 00:27:56,460 --> 00:27:59,135 let's say this was part of your mRNA sequence. 558 00:27:59,135 --> 00:28:02,190 Well rather, let's say this is part of your DNA sequence, 559 00:28:02,190 --> 00:28:05,970 it gets transcribed right into mRNA. 560 00:28:05,970 --> 00:28:10,080 So the mRNA would read AAA, 561 00:28:10,080 --> 00:28:14,100 and the ribosome would read that and say, 562 00:28:14,100 --> 00:28:15,300 okay, that's a lysine, 563 00:28:15,300 --> 00:28:17,040 and then it would move on to the next codon 564 00:28:17,040 --> 00:28:17,880 and move on to the next, and move on to the next, 565 00:28:17,880 --> 00:28:18,880 move on to the next. 566 00:28:19,830 --> 00:28:22,974 Well, in the case of a nonsense mutation, 567 00:28:22,974 --> 00:28:26,880 that changes the codon from a codon 568 00:28:26,880 --> 00:28:31,050 which would normally encode for an amino acid to a stop.
569 00:28:31,050 --> 00:28:34,590 So now the ribosome gets to that sequence 570 00:28:34,590 --> 00:28:37,920 and says, oh, that means stop and it falls off. 571 00:28:37,920 --> 00:28:42,220 And the problem is not only do you not translate 572 00:28:44,762 --> 00:28:47,250 or incorporate that particular, in this case, 573 00:28:47,250 --> 00:28:51,390 lysine into the growing protein chain, you also 574 00:28:51,390 --> 00:28:54,930 don't continue on with reading the rest of that mRNA. 575 00:28:54,930 --> 00:28:59,930 So anything downstream or that comes after that new mutation 576 00:29:00,863 --> 00:29:04,380 or that nonsense mutation doesn't get read. 577 00:29:04,380 --> 00:29:09,380 And so the protein basically is short or truncated 578 00:29:09,390 --> 00:29:12,450 because none of the rest of what comes after 579 00:29:12,450 --> 00:29:16,710 what is now a stop codon gets read by the ribosome. 580 00:29:16,710 --> 00:29:20,010 So this can be a big, big problem. And it often is. 581 00:29:20,010 --> 00:29:22,770 So if there's a nonsense mutation that happens, 582 00:29:22,770 --> 00:29:27,510 basically a protein will either, you know, 583 00:29:27,510 --> 00:29:30,750 have a severely reduced function to it, 584 00:29:30,750 --> 00:29:34,254 or it will actually be completely non-functional, 585 00:29:34,254 --> 00:29:36,480 which can be a real problem 586 00:29:36,480 --> 00:29:38,223 and can certainly lead to disease. 587 00:29:39,840 --> 00:29:42,210 Let's take a quick peek at what I'm talking about here 588 00:29:42,210 --> 00:29:47,210 so you can see it more visually here, silent mutation. 589 00:29:47,247 --> 00:29:50,160 And let's just orient ourselves to what we're looking at. 590 00:29:50,160 --> 00:29:53,760 On the top here is the DNA template strand. 591 00:29:53,760 --> 00:29:56,940 So that would be the two strands, 592 00:29:56,940 --> 00:30:00,630 and the mRNA that gets transcribed off of that.
593 00:30:00,630 --> 00:30:03,840 And then the protein, the amino acid sequence, 594 00:30:03,840 --> 00:30:05,370 which gets read. 595 00:30:05,370 --> 00:30:08,520 And again, you can go back to your amino acid table 596 00:30:08,520 --> 00:30:12,060 to confirm that this is what it basically would read. 597 00:30:12,060 --> 00:30:14,520 But let's say that this is the wild type. 598 00:30:14,520 --> 00:30:18,090 Again, remember wild type or normal functioning sequence. 599 00:30:18,090 --> 00:30:21,480 And here you have the normal, let's say, 600 00:30:21,480 --> 00:30:23,670 I mean this would be an extremely short protein. 601 00:30:23,670 --> 00:30:26,250 This is just for demonstration purposes only. 602 00:30:26,250 --> 00:30:30,870 Normal proteins are, I think, in the range of at least, 603 00:30:30,870 --> 00:30:34,011 300, 400 amino acids usually. 604 00:30:34,011 --> 00:30:36,388 So they're much, much longer than this. 605 00:30:36,388 --> 00:30:39,220 And some of them are, you know, 606 00:30:39,220 --> 00:30:41,160 even thousands of amino acids long. 607 00:30:41,160 --> 00:30:43,020 But let's just look at this, 608 00:30:43,020 --> 00:30:44,790 this is just for demonstration purposes. 609 00:30:44,790 --> 00:30:49,650 You have your methionine, lysine, phenylalanine, glycine, 610 00:30:49,650 --> 00:30:53,246 and then it reads this stop codon here. 611 00:30:53,246 --> 00:30:58,246 Well, in this case, let's say that the guanine here 612 00:30:58,500 --> 00:31:02,610 was mutated to an adenine, so an A instead of a G. 613 00:31:02,610 --> 00:31:07,610 So this gets read as GGU, but that still codes for glycine. 614 00:31:08,957 --> 00:31:13,936 GGC in the mRNA and GGU both code for glycine. 615 00:31:13,936 --> 00:31:14,769 So this is a silent mutation. 616 00:31:14,769 --> 00:31:18,680 There's no change in the amino acid sequence, 617 00:31:18,680 --> 00:31:20,969 so there's no change in the protein whatsoever.
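The silent mutation slide can be walked through as a minimal sketch. The template sequence below is not taken from the slide. It is an assumed reconstruction chosen so that it gives the protein read out in the narration (methionine, lysine, phenylalanine, glycine, then stop) with GGC as the glycine codon in the mRNA. Strand direction is ignored, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code for mRNA codons; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
TEMPLATE_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template: str) -> str:
    """Copy a DNA template strand into mRNA by base pairing."""
    return "".join(TEMPLATE_TO_MRNA[base] for base in template)

def translate(mrna: str) -> str:
    """Read codons left to right and stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# Assumed wild type template: TAC TTC AAA CCG ATT
wild_template = "TACTTCAAACCGATT"
# Silent mutation: the G of the CCG triplet becomes an A (CCG -> CCA).
silent_template = wild_template[:11] + "A" + wild_template[12:]

wild_mrna = transcribe(wild_template)
silent_mrna = transcribe(silent_template)
wild_protein = translate(wild_mrna)
silent_protein = translate(silent_mrna)

print(wild_mrna, wild_protein)
print(silent_mrna, silent_protein)
```

The proteins are printed with one-letter codes (M, K, F, G). The mRNA differs at one base, GGC versus GGU, but the two proteins come out identical.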
618 00:31:20,969 --> 00:31:23,253 So there's no detriment to this. 619 00:31:24,480 --> 00:31:29,220 Okay, a missense mutation, let's look again, 620 00:31:29,220 --> 00:31:32,194 same type of sequence that we were looking at 621 00:31:32,194 --> 00:31:33,027 in the previous slide. 622 00:31:33,027 --> 00:31:36,840 Here, let's say instead of the C in this location 623 00:31:36,840 --> 00:31:39,729 in the wild type strand, rather, 624 00:31:39,729 --> 00:31:42,193 there's now a T, so a T instead of a C. 625 00:31:42,193 --> 00:31:43,950 And what does that change? 626 00:31:43,950 --> 00:31:48,150 Well now, instead of it being GGC in the mRNA, 627 00:31:48,150 --> 00:31:53,150 it's now AGC, which is a codon which encodes for serine 628 00:31:53,175 --> 00:31:55,481 as opposed to glycine. 629 00:31:55,481 --> 00:31:59,670 So there is one amino acid difference in this protein 630 00:31:59,670 --> 00:32:02,173 as a result of a missense mutation. 631 00:32:02,173 --> 00:32:06,180 So silent mutations are really, you know, 632 00:32:06,180 --> 00:32:09,103 not going to have any effect whatsoever on a person. 633 00:32:09,103 --> 00:32:14,103 A missense mutation, very specific ones, 634 00:32:14,340 --> 00:32:17,220 can have detriments to a person, 635 00:32:17,220 --> 00:32:19,860 but generally they're more tolerable 636 00:32:19,860 --> 00:32:22,080 because only a single amino acid has been changed. 637 00:32:22,080 --> 00:32:24,570 And unless that amino acid is in a really important location 638 00:32:24,570 --> 00:32:29,570 for the protein, then the protein and overall the cell 639 00:32:30,015 --> 00:32:32,527 and therefore the individual might be able to tolerate 640 00:32:32,527 --> 00:32:33,410 a missense mutation. 641 00:32:33,410 --> 00:32:35,580 A nonsense mutation, this is the one that results 642 00:32:35,580 --> 00:32:39,760 in the truncated protein because you're now creating 643 00:32:41,687 --> 00:32:45,810 a stop codon earlier on in the chain, right?
644 00:32:45,810 --> 00:32:48,570 Remember because it gets read just like you're reading 645 00:32:48,570 --> 00:32:52,350 a sentence from left to right. 646 00:32:52,350 --> 00:32:55,950 The ribosome is also reading the mRNA from left to right. 647 00:32:55,950 --> 00:32:59,460 So as soon as it hits what it reads as a stop codon, 648 00:32:59,460 --> 00:33:02,910 it stops, the ribosome falls off, and the protein is done. 649 00:33:02,910 --> 00:33:05,220 No matter what stage it's at, it doesn't keep going 650 00:33:05,220 --> 00:33:07,380 because it reads that as a stop codon. 651 00:33:07,380 --> 00:33:11,730 So in this case, instead of the T located here, 652 00:33:11,730 --> 00:33:13,920 this is now mutated to an A 653 00:33:13,920 --> 00:33:18,540 and this gets read as a UAG in the mRNA. 654 00:33:19,950 --> 00:33:23,430 And so, basically, the protein stops being formed 655 00:33:23,430 --> 00:33:25,470 right here, and there's really nothing to it. 656 00:33:25,470 --> 00:33:28,290 It doesn't have the meat of the protein. 657 00:33:28,290 --> 00:33:31,950 So any amino acids that would be encoded, 658 00:33:31,950 --> 00:33:34,170 any past this stop codon, 659 00:33:34,170 --> 00:33:36,480 just will never see the light of day. 660 00:33:36,480 --> 00:33:38,370 They'll never be read by the ribosomes, 661 00:33:38,370 --> 00:33:39,720 'cause the ribosomes reach that stop. 662 00:33:39,720 --> 00:33:41,400 'Cause again, remember, they're moving left to right, 663 00:33:41,400 --> 00:33:44,160 they're sliding along, read three, move up, slide along, 664 00:33:44,160 --> 00:33:46,260 read three, get up, move read three. 665 00:33:46,260 --> 00:33:49,680 But here it reads three, it puts methionine, 666 00:33:49,680 --> 00:33:51,420 it moves over to the next three and it reads, 667 00:33:51,420 --> 00:33:55,227 oh stop, our job's done, you know, it's quitting time, 668 00:33:55,227 --> 00:33:58,310 and the ribosome drops off and that's it for the protein. 
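The missense and nonsense slides can be compared side by side in a short sketch. As before, the template sequence is an assumed reconstruction rather than a copy of the slide. It is chosen to match the protein and the mRNA codons named in the narration (GGC becoming AGC for the missense case, and a lysine codon becoming UAG for the nonsense case).

```python
from itertools import product

# Standard genetic code for mRNA codons; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
TEMPLATE_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template: str) -> str:
    """Copy a DNA template strand into mRNA by base pairing."""
    return "".join(TEMPLATE_TO_MRNA[base] for base in template)

def translate(mrna: str) -> str:
    """Read codons left to right; the ribosome stops at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

def substitute(seq: str, index: int, new_base: str) -> str:
    """Return a copy of `seq` with one base replaced (a point mutation)."""
    return seq[:index] + new_base + seq[index + 1:]

# Assumed wild type template: TAC TTC AAA CCG ATT
wild_template = "TACTTCAAACCGATT"
missense_template = substitute(wild_template, 9, "T")  # CCG -> TCG
nonsense_template = substitute(wild_template, 3, "A")  # TTC -> ATC

wild_protein = translate(transcribe(wild_template))
missense_protein = translate(transcribe(missense_template))
nonsense_protein = translate(transcribe(nonsense_template))

print("wild type:", wild_protein)
print("missense: ", missense_protein)
print("nonsense: ", nonsense_protein)
```

In this sketch the missense protein keeps its full length with one amino acid swapped, while the nonsense protein is cut down to the single methionine that came before the new stop codon.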
669 00:33:58,310 --> 00:34:01,893 It basically doesn't get any more amino acids added onto it. 670 00:34:03,810 --> 00:34:06,782 All right, let's think about insertions and deletions. 671 00:34:06,782 --> 00:34:09,540 So there's a couple of different effects that can happen 672 00:34:09,540 --> 00:34:11,190 and these tend to be more severe. 673 00:34:11,190 --> 00:34:13,200 So when there's an insertion or a deletion, 674 00:34:13,200 --> 00:34:14,820 these tend to be more severe 675 00:34:14,820 --> 00:34:17,010 and we'll talk about why in a moment. 676 00:34:17,010 --> 00:34:20,153 More severe than a point mutation or a SNP. 677 00:34:21,088 --> 00:34:23,550 So an in-frame insertion, this would be one or more 678 00:34:23,550 --> 00:34:26,250 complete codons inserted without disrupting any 679 00:34:26,250 --> 00:34:28,140 of the existing codons resulting in 680 00:34:28,140 --> 00:34:30,740 one or more added amino acids to the protein. 681 00:34:30,740 --> 00:34:31,980 Okay, that's a lot of words 682 00:34:31,980 --> 00:34:34,920 and I'll show you visually what this means in a moment. 683 00:34:34,920 --> 00:34:36,150 So just hang with me for a second, 684 00:34:36,150 --> 00:34:38,910 just wanna get a few definitions out of the way. 685 00:34:38,910 --> 00:34:41,604 An in-frame deletion would be deleting one 686 00:34:41,604 --> 00:34:44,550 or more complete codons without disrupting any 687 00:34:44,550 --> 00:34:46,500 of the existing codons, which results in 688 00:34:46,500 --> 00:34:49,023 one or more amino acids removed from the protein. 689 00:34:49,950 --> 00:34:53,413 Okay. So remember before when we were talking about, say, 690 00:34:53,413 --> 00:34:57,090 a missense mutation, which would be changing 691 00:34:57,090 --> 00:34:58,950 one amino acid to a different amino acid. 
692 00:34:58,950 --> 00:35:01,627 In this case, with in-frame insertions or deletions, 693 00:35:01,627 --> 00:35:05,280 you're adding one more amino acid 694 00:35:05,280 --> 00:35:06,930 or taking away one amino acid. 695 00:35:06,930 --> 00:35:08,940 It's not that you're changing one, 696 00:35:08,940 --> 00:35:12,120 you're just adding one or you're taking one away. 697 00:35:12,120 --> 00:35:14,190 A frameshift insertion or deletion, 698 00:35:14,190 --> 00:35:17,550 which is actually the most likely kind of insertion 699 00:35:17,550 --> 00:35:21,270 or deletion, is the insertion or deletion of a number 700 00:35:21,270 --> 00:35:23,100 of bases that is not a multiple of three. 701 00:35:23,100 --> 00:35:25,830 And that causes the codons following the insertion 702 00:35:25,830 --> 00:35:27,690 or deletion to be incorrect. 703 00:35:27,690 --> 00:35:30,210 So all of the amino acids in the protein coded for 704 00:35:30,210 --> 00:35:32,152 after the mutation will be incorrect. 705 00:35:32,152 --> 00:35:35,760 So that means all the triplets are off by one or two. 706 00:35:35,760 --> 00:35:38,100 And as you can imagine, that's a real problem. 707 00:35:38,100 --> 00:35:39,030 Let's take a look at that. 708 00:35:39,030 --> 00:35:41,220 But first, let's take a peek at the in-frame 709 00:35:41,220 --> 00:35:42,873 insertions and deletions. 710 00:35:44,550 --> 00:35:46,890 So in-frame versus frameshift, 711 00:35:46,890 --> 00:35:50,070 frameshift is bad news for the protein. 712 00:35:50,070 --> 00:35:51,554 It really, really is. 713 00:35:51,554 --> 00:35:54,058 So let's take a look at why that might be. 714 00:35:54,058 --> 00:35:56,010 And if you think about this, 715 00:35:56,010 --> 00:35:59,160 I've used the analogy of thinking about reading an mRNA 716 00:35:59,160 --> 00:36:02,412 as if you were reading words in a sentence.
717 00:36:02,412 --> 00:36:06,390 The order of the letters that compose the sentence 718 00:36:06,390 --> 00:36:08,250 are very important. 719 00:36:08,250 --> 00:36:10,740 And it's also important where you put the spaces 720 00:36:10,740 --> 00:36:11,880 between those words. 721 00:36:11,880 --> 00:36:13,260 And that's kind of like thinking 722 00:36:13,260 --> 00:36:16,680 of the letters in the words being like each base 723 00:36:16,680 --> 00:36:20,490 and the space between them being like reading one codon 724 00:36:20,490 --> 00:36:22,020 versus the next, versus the next. 725 00:36:22,020 --> 00:36:25,380 So if in the wild type, let's say that the wild type 726 00:36:25,380 --> 00:36:26,820 would be like the sentence. 727 00:36:26,820 --> 00:36:29,970 So we're using sentences made up of three letter words 728 00:36:29,970 --> 00:36:33,570 because it's sort of like the three base codons, right? 729 00:36:33,570 --> 00:36:35,800 So each letter is like a base 730 00:36:35,800 --> 00:36:38,637 and each word here is like a codon, okay? 731 00:36:38,637 --> 00:36:41,361 And let's say that the sentence is like the protein. 732 00:36:41,361 --> 00:36:44,230 All right. So if we were reading the wild type, 733 00:36:44,230 --> 00:36:46,890 let's say the wild type sentence is, 734 00:36:46,890 --> 00:36:49,980 the red bug bit the dog. 735 00:36:49,980 --> 00:36:52,007 For an in-frame insertion, 736 00:36:52,007 --> 00:36:55,980 we're adding in another three bases, 737 00:36:55,980 --> 00:36:59,230 say we're adding in, basically, one additional codon. 738 00:36:59,230 --> 00:37:02,912 So it would be inserting three bases and it doesn't change, 739 00:37:02,912 --> 00:37:07,290 basically, adjust how the rest of the codons are read. 740 00:37:07,290 --> 00:37:10,470 So in this case it would be like adding the word top. 741 00:37:10,470 --> 00:37:13,800 So now it would be, the red top bug bit the dog. 
742 00:37:13,800 --> 00:37:15,240 So that's a little bit different 743 00:37:15,240 --> 00:37:19,890 than the wild type sentence, so the red bug bit the dog. 744 00:37:19,890 --> 00:37:22,320 Now it reads, the red top bug bit the dog. 745 00:37:22,320 --> 00:37:23,670 So that might make a difference in terms 746 00:37:23,670 --> 00:37:24,920 of the proteins function. 747 00:37:24,920 --> 00:37:28,410 An in-frame deletion would be like removing one of the words 748 00:37:28,410 --> 00:37:29,833 or one of the codons. 749 00:37:29,833 --> 00:37:33,060 So now it would read, the bug bit the dog. 750 00:37:33,060 --> 00:37:34,980 So you're missing some information here, right? 751 00:37:34,980 --> 00:37:37,230 So before it was, the red bug bit the dog. 752 00:37:37,230 --> 00:37:38,940 Now it's just, the bug bit the dog. 753 00:37:38,940 --> 00:37:40,306 So we lost red, 754 00:37:40,306 --> 00:37:43,710 but you can still kind of understand the sentence. 755 00:37:43,710 --> 00:37:45,132 It still makes sense. 756 00:37:45,132 --> 00:37:47,040 It's still providing you some information, 757 00:37:47,040 --> 00:37:48,330 just a little less. 758 00:37:48,330 --> 00:37:51,150 So most likely an in-frame deletion like this, 759 00:37:51,150 --> 00:37:54,422 that's especially one that's just deleting a single codon 760 00:37:54,422 --> 00:37:57,990 would really probably not have a dramatic impact 761 00:37:57,990 --> 00:37:59,357 on the protein's function. 762 00:37:59,357 --> 00:38:04,357 However, if we start messing around with the frame 763 00:38:05,460 --> 00:38:07,530 of how the codons are read, 764 00:38:07,530 --> 00:38:09,450 here, you get into some trouble. 765 00:38:09,450 --> 00:38:11,040 So a frameshift insertion. 766 00:38:11,040 --> 00:38:14,190 So let's say we're adding in a single nucleotide. 
767 00:38:14,190 --> 00:38:16,863 Let's say in this case you're adding a single base, 768 00:38:16,863 --> 00:38:21,863 and this is our analogy of using letters instead of bases. 769 00:38:22,650 --> 00:38:26,194 So it's just like adding the letter F into 770 00:38:26,194 --> 00:38:28,560 the sentence that we had before. 771 00:38:28,560 --> 00:38:32,670 But you still have to read each of these in groups of three. 772 00:38:32,670 --> 00:38:37,670 So now it would be THE REF DBU GBI TTH EDO G. 773 00:38:40,260 --> 00:38:42,900 Okay, that doesn't make any sense, right? 774 00:38:42,900 --> 00:38:44,730 I mean you still get the word the, 775 00:38:44,730 --> 00:38:46,350 that kind of makes sense, that's fine. 776 00:38:46,350 --> 00:38:47,310 That's just the same. 777 00:38:47,310 --> 00:38:51,450 But anything that comes after that frameshift insertion 778 00:38:51,450 --> 00:38:52,740 gets all totally screwed up. 779 00:38:52,740 --> 00:38:55,913 And now the ribosomes are reading these codons 780 00:38:55,913 --> 00:38:58,410 which are now all off by one, 781 00:38:58,410 --> 00:38:59,730 which totally screws it up 782 00:38:59,730 --> 00:39:02,280 because now when it goes to read that, 783 00:39:02,280 --> 00:39:04,290 it's coding for completely different amino acids, 784 00:39:04,290 --> 00:39:06,750 which means the protein's totally different. 785 00:39:06,750 --> 00:39:08,130 It makes no sense. 786 00:39:08,130 --> 00:39:10,290 The same thing is true for a deletion, 787 00:39:10,290 --> 00:39:13,050 let's say deleting a single nucleotide. 788 00:39:13,050 --> 00:39:15,570 And again, we're demonstrating that 789 00:39:15,570 --> 00:39:17,910 through taking out one of the letters. 790 00:39:17,910 --> 00:39:22,910 Now what it would be is THE REB UGB ITT HED OG. 791 00:39:23,880 --> 00:39:26,100 Yeah, that doesn't make any sense either.
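The sentence analogy can be reproduced in a few lines. This sketch treats each letter as a base and reads the letters in groups of three, the way the ribosome reads codons; the function name is hypothetical.

```python
def read_in_threes(letters: str) -> str:
    """Split a run of letters into three-letter 'codons', left to right."""
    return " ".join(letters[i:i + 3] for i in range(0, len(letters), 3))

wild_type = "THEREDBUGBITTHEDOG"

# In-frame changes add or remove a whole three-letter word.
in_frame_insertion = wild_type[:6] + "TOP" + wild_type[6:]
in_frame_deletion = wild_type[:3] + wild_type[6:]

# Frameshift changes add or remove a single letter.
frameshift_insertion = wild_type[:5] + "F" + wild_type[5:]
frameshift_deletion = wild_type[:5] + wild_type[6:]

for label, sentence in [
    ("wild type", wild_type),
    ("in-frame insertion", in_frame_insertion),
    ("in-frame deletion", in_frame_deletion),
    ("frameshift insertion", frameshift_insertion),
    ("frameshift deletion", frameshift_deletion),
]:
    print(f"{label:22s}{read_in_threes(sentence)}")
```

With the in-frame changes every remaining word is still readable. With the frameshift changes only the words before the change survive, and everything after it is read in the wrong groups of three.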
792 00:39:26,100 --> 00:39:30,150 Same thing with insertion for a frameshift deletion, 793 00:39:30,150 --> 00:39:32,220 you know, anything that comes after 794 00:39:32,220 --> 00:39:35,010 where that deletion occurred doesn't make any sense anymore. 795 00:39:35,010 --> 00:39:36,630 And now the protein's totally different, 796 00:39:36,630 --> 00:39:38,850 probably loses its function altogether 797 00:39:38,850 --> 00:39:42,480 or starts doing something crazy that it shouldn't be doing. 798 00:39:42,480 --> 00:39:44,583 And this can definitely lead to disease. 799 00:39:45,690 --> 00:39:47,850 Let's take a quick peek at what we're talking about here. 800 00:39:47,850 --> 00:39:50,940 So in-frame insertion, here you have the wild type sequence, 801 00:39:50,940 --> 00:39:53,040 the DNA sequence on the top in blue, 802 00:39:53,040 --> 00:39:54,690 followed by the mRNA sequence in red, 803 00:39:54,690 --> 00:39:56,790 and then the protein sequence, and I'm writing that out 804 00:39:56,790 --> 00:39:59,973 as the three letter designation for each amino acid. 805 00:40:01,410 --> 00:40:03,690 So this is our wild type sequence. 806 00:40:03,690 --> 00:40:06,270 Then an in-frame insertion means we're adding exactly 807 00:40:06,270 --> 00:40:11,270 three bases or a number divisible by three number of bases. 808 00:40:12,218 --> 00:40:15,420 And you're also maintaining the frame. 809 00:40:15,420 --> 00:40:18,600 So here we're adding in a TCT, let's say, 810 00:40:18,600 --> 00:40:23,600 or an AGA right in between codons two and three. 811 00:40:25,920 --> 00:40:29,280 So basically now our protein, 812 00:40:29,280 --> 00:40:30,720 what does our protein look like? 
813 00:40:30,720 --> 00:40:34,533 So methionine, glutamine, now there's an arginine, 814 00:40:35,550 --> 00:40:37,680 and then it's followed by the rest of the, you know, 815 00:40:37,680 --> 00:40:39,090 the regular sequence of the protein, 816 00:40:39,090 --> 00:40:42,843 cysteine, proline, proline, leucine, glutamic acid. 817 00:40:44,880 --> 00:40:46,950 All right. So the protein has one additional 818 00:40:46,950 --> 00:40:48,480 amino acid as the end result. 819 00:40:48,480 --> 00:40:50,550 Similarly for deletion, it's basically like 820 00:40:50,550 --> 00:40:54,180 taking out exactly the three nucleotides, 821 00:40:54,180 --> 00:40:56,160 which would encode for a single codon. 822 00:40:56,160 --> 00:40:57,930 So nothing gets shifted around, 823 00:40:57,930 --> 00:41:00,630 but you have one fewer amino acid there. 824 00:41:00,630 --> 00:41:03,185 So now we've lost our cysteine. 825 00:41:03,185 --> 00:41:06,210 And so it's methionine, glutamine, proline, proline, 826 00:41:06,210 --> 00:41:10,650 leucine, glutamic acid, and the cysteine is now gone. 827 00:41:10,650 --> 00:41:13,680 So this can have an impact on the protein certainly, 828 00:41:13,680 --> 00:41:15,933 but it's not totally screwing everything up. 829 00:41:17,250 --> 00:41:19,110 Now if we wanna start screwing stuff up, 830 00:41:19,110 --> 00:41:22,920 let's take a look at what a frameshift insertion would do. 831 00:41:22,920 --> 00:41:24,443 What would insertion do? 832 00:41:24,443 --> 00:41:27,300 Well, let's say we're just adding one nucleotide. 833 00:41:27,300 --> 00:41:28,440 Like what can that hurt, right? 834 00:41:28,440 --> 00:41:29,520 Just putting one base in. 835 00:41:29,520 --> 00:41:32,040 Those other two examples, you're adding three bases. 836 00:41:32,040 --> 00:41:33,960 Shouldn't that have a bigger impact?
837 00:41:33,960 --> 00:41:36,420 Well, no, because it totally changes 838 00:41:36,420 --> 00:41:39,690 and shifts the way the ribosomes are reading 839 00:41:39,690 --> 00:41:41,040 those triplets of bases, 840 00:41:41,040 --> 00:41:43,920 which is so important, so important. 841 00:41:43,920 --> 00:41:47,490 So one little base off and everything gets screwed up. 842 00:41:47,490 --> 00:41:49,435 Just like those words in the sentence 843 00:41:49,435 --> 00:41:51,120 that we were just talking about. 844 00:41:51,120 --> 00:41:53,665 You shift things over one to the left or one to the right 845 00:41:53,665 --> 00:41:56,340 and it doesn't make any sense anymore, right? 846 00:41:56,340 --> 00:41:58,821 It's gibberish, and that's kind of what happens 847 00:41:58,821 --> 00:42:01,050 to the protein. 848 00:42:01,050 --> 00:42:02,790 So let's take this example. 849 00:42:02,790 --> 00:42:07,530 We're adding an A here and so everything 850 00:42:07,530 --> 00:42:11,340 basically gets shifted over to the right by one 851 00:42:11,340 --> 00:42:14,420 as far as how the ribosome is going to read the sequence. 852 00:42:14,420 --> 00:42:17,490 So now the ribosome reads AUG, yep, same as before, 853 00:42:17,490 --> 00:42:19,080 CAA, yep, same as before, 854 00:42:19,080 --> 00:42:23,432 but now it's reading AUG, which is methionine, not cysteine. 855 00:42:23,432 --> 00:42:26,953 So it basically has totally shifted over by one. 856 00:42:28,230 --> 00:42:30,928 So instead of reading it the normal way it's supposed to, 857 00:42:30,928 --> 00:42:35,928 now everything gets basically shifted over one. 858 00:42:36,060 --> 00:42:39,330 So now instead of reading TGT, CCG, it's now going 859 00:42:39,330 --> 00:42:44,330 to read TCC, GCC, see what I'm saying? 860 00:42:44,939 --> 00:42:48,603 ATT, and that's what you have down here. 861 00:42:50,280 --> 00:42:52,830 All right.
And that results in completely different 862 00:42:55,623 --> 00:42:58,070 amino acids being read after the frameshift. 863 00:42:59,730 --> 00:43:01,560 A deletion, same thing. 864 00:43:01,560 --> 00:43:03,240 You're just taking out one little base. 865 00:43:03,240 --> 00:43:04,350 What can that hurt? 866 00:43:04,350 --> 00:43:05,580 Well, a lot, 867 00:43:05,580 --> 00:43:08,343 because now everything gets screwed up after that. 868 00:43:10,032 --> 00:43:11,070 And again, same thing. 869 00:43:11,070 --> 00:43:14,190 So let's say we're removing this T here, 870 00:43:14,190 --> 00:43:17,760 so everything moves back over this way one. 871 00:43:17,760 --> 00:43:21,054 So everything gets moved over one again, 872 00:43:21,054 --> 00:43:24,483 and here you have the same problems. 873 00:43:25,680 --> 00:43:27,990 All the amino acids following that frameshift 874 00:43:27,990 --> 00:43:28,950 are completely different. 875 00:43:28,950 --> 00:43:30,980 And then the protein gets totally screwed up. 876 00:43:30,980 --> 00:43:33,120 So as you might be able to start imagining, 877 00:43:33,120 --> 00:43:35,280 since everything that follows that mutation 878 00:43:35,280 --> 00:43:40,280 gets screwed up, the closer to the start codon 879 00:43:40,410 --> 00:43:42,270 that a frameshift mutation happens, 880 00:43:42,270 --> 00:43:44,340 the more detrimental it is to the protein. 881 00:43:44,340 --> 00:43:48,360 Similarly, the closer to the start codon, 882 00:43:48,360 --> 00:43:53,070 the start of translation that a nonsense mutation happens. 883 00:43:53,070 --> 00:43:58,070 So basically changing a codon for an amino acid 884 00:43:58,560 --> 00:44:02,748 to a stop codon, the closer that happens to the start codon, 885 00:44:02,748 --> 00:44:04,637 the more screwed up the protein's going to be 886 00:44:04,637 --> 00:44:06,990 'cause the shorter the protein's going to be.
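The in-frame and frameshift examples walked through above can be sketched in a few lines of Python. This is only an illustration, not the slide itself: the wild-type sequence below is reconstructed from the codons read out in the lecture (ATG, CAA, TGT, CCG, and so on), and the last base of the leucine codon, the exact glutamic acid codon, and which T gets removed in the deletion example are all assumptions. The names `translate`, `WILD_TYPE`, and `SITE` are made up for this sketch.

```python
# Standard genetic code for DNA codons (coding strand); "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Read complete codons from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Assumed wild type: Met-Gln-Cys-Pro-Pro-Leu-Glu.
# The final bases (TTA for leucine, GAA for glutamic acid) are guesses.
WILD_TYPE = "ATGCAATGTCCGCCATTAGAA"
SITE = 6  # boundary between codons two and three

in_frame_insertion = WILD_TYPE[:SITE] + "AGA" + WILD_TYPE[SITE:]   # add one codon
in_frame_deletion = WILD_TYPE[:SITE] + WILD_TYPE[SITE + 3:]        # remove TGT
frameshift_insertion = WILD_TYPE[:SITE] + "A" + WILD_TYPE[SITE:]   # add one base
frameshift_deletion = WILD_TYPE[:SITE] + WILD_TYPE[SITE + 1:]      # remove one T (assumed position)

for label, sequence in [
    ("wild type", WILD_TYPE),
    ("in-frame insertion", in_frame_insertion),
    ("in-frame deletion", in_frame_deletion),
    ("frameshift insertion", frameshift_insertion),
    ("frameshift deletion", frameshift_deletion),
]:
    print(f"{label:22}{translate(sequence)}")
```

With these assumed bases, the wild type reads MQCPPLE, the in-frame insertion reads MQRCPPLE, and the in-frame deletion reads MQPPLE, so only one amino acid changes. The frameshift insertion reads MQMSAIR, where everything after the glutamine is different, and the frameshift deletion reads MQVRH because the shifted frame happens to run into a stop codon early.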
887 00:44:06,990 --> 00:44:09,270 Same thing here in a frameshift, 888 00:44:09,270 --> 00:44:14,270 the closer that mutation happens to the start 889 00:44:14,397 --> 00:44:18,899 of translation, that means all of the amino acids 890 00:44:18,899 --> 00:44:21,090 following it are going to be screwed up, 891 00:44:21,090 --> 00:44:23,943 so the worse it actually is for the protein. 892 00:44:25,440 --> 00:44:27,450 Okay, let's talk about a little nomenclature 893 00:44:27,450 --> 00:44:30,690 for mutations which result in changes to the protein. 894 00:44:30,690 --> 00:44:32,905 So this is definitely not all of it. 895 00:44:32,905 --> 00:44:35,400 Unfortunately, it's not all totally standardized, 896 00:44:35,400 --> 00:44:37,020 so you might see this in different ways. 897 00:44:37,020 --> 00:44:38,850 I just want to give you at least one way 898 00:44:38,850 --> 00:44:40,470 that you might start to see some of these. 899 00:44:40,470 --> 00:44:43,629 I'm not going to go in a lot of gory detail here 900 00:44:43,629 --> 00:44:45,630 'cause it gets to just be like a lot of memorization 901 00:44:45,630 --> 00:44:47,340 and it's not totally necessary. 902 00:44:47,340 --> 00:44:50,820 I just want you to, when you see something, say, on a chart 903 00:44:50,820 --> 00:44:52,740 or, you know, in a test result or something, 904 00:44:52,740 --> 00:44:54,960 I want you to be able to recognize it at least, 905 00:44:54,960 --> 00:44:57,390 and then you can kind of look up what it means, 906 00:44:57,390 --> 00:44:59,010 specifically in that context. 907 00:44:59,010 --> 00:45:02,460 But to give you a sense of what you might see. 908 00:45:02,460 --> 00:45:05,130 The nomenclature likely denotes the changes 909 00:45:05,130 --> 00:45:07,605 in the amino acid sequence, not the DNA sequence. 910 00:45:07,605 --> 00:45:09,630 And they may use a single letter 911 00:45:09,630 --> 00:45:12,390 or the three letter amino acid designation.
912 00:45:12,390 --> 00:45:14,940 As you saw I was using the three letter 913 00:45:14,940 --> 00:45:18,090 amino acid designation, but there's also single letters, 914 00:45:18,090 --> 00:45:20,278 which is what I have down here. 915 00:45:20,278 --> 00:45:22,380 And you can look those up just as you can look up 916 00:45:22,380 --> 00:45:25,410 a table of that online, you know, for the 20 amino acids, 917 00:45:25,410 --> 00:45:28,170 what the single letter and three letter designations are 918 00:45:28,170 --> 00:45:29,070 for each of those. 919 00:45:30,360 --> 00:45:33,520 For a nonsense mutation, it's designated as 920 00:45:34,871 --> 00:45:38,100 the amino acid which is changed, its location 921 00:45:38,100 --> 00:45:39,930 along the chain of amino acids, 922 00:45:39,930 --> 00:45:41,880 and it's followed by the letter X. 923 00:45:41,880 --> 00:45:44,910 And that X basically means it's stopped. 924 00:45:44,910 --> 00:45:47,580 So nothing else follows that. 925 00:45:47,580 --> 00:45:51,990 So in this case we have G542X. 926 00:45:51,990 --> 00:45:56,340 If you see that, then you know that it is the glycine, 927 00:45:56,340 --> 00:46:00,210 which is the amino acid which is designated by the letter G. 928 00:46:00,210 --> 00:46:05,210 The 542nd amino acid in the chain of amino acids 929 00:46:06,420 --> 00:46:09,000 for this protein is normally a glycine. 930 00:46:09,000 --> 00:46:11,531 So the wild type is glycine. 931 00:46:11,531 --> 00:46:15,240 But now in this individual who has a nonsense mutation, 932 00:46:15,240 --> 00:46:17,340 it is now a stop codon. 933 00:46:17,340 --> 00:46:20,758 So it is designated as an X because no amino acids 934 00:46:20,758 --> 00:46:25,020 have X as their single letter designation. Okay. 935 00:46:25,020 --> 00:46:29,700 A missense mutation is designated by the wild type 936 00:46:29,700 --> 00:46:32,395 or the old amino acid, you can think of it that way.
937 00:46:32,395 --> 00:46:34,890 Its location, and then the new or mutant amino acid, 938 00:46:34,890 --> 00:46:36,270 that's now in its place. 939 00:46:36,270 --> 00:46:40,923 In this example, it would be glycine at location 176. 940 00:46:42,092 --> 00:46:47,092 So the 176th amino acid in the chain is no longer glycine. 941 00:46:47,640 --> 00:46:50,733 It's now F which stands for phenylalanine. Okay. 942 00:46:52,713 --> 00:46:55,920 An in-frame deletion is denoted by a delta 943 00:46:55,920 --> 00:47:00,920 or could be denoted as the letters DEL for deletion. 944 00:47:02,460 --> 00:47:05,563 But you may see it as the delta symbol, 945 00:47:05,563 --> 00:47:10,560 then the amino acid, which is deleted and its location. 946 00:47:10,560 --> 00:47:13,200 So this would be a deletion of the phenylalanine 947 00:47:13,200 --> 00:47:18,150 at location 508 in that particular protein. 948 00:47:18,150 --> 00:47:21,677 An in-frame insertion, here's where it starts 949 00:47:21,677 --> 00:47:24,900 to get a little ugly, with in-frame insertions and frameshifts. 950 00:47:24,900 --> 00:47:29,900 It just gets really hairy with how it's designated 951 00:47:31,186 --> 00:47:34,350 and it's designated different ways depending upon, 952 00:47:34,350 --> 00:47:35,853 I don't know, what the person ate 953 00:47:35,853 --> 00:47:37,868 for breakfast that morning. 954 00:47:37,868 --> 00:47:40,170 There doesn't seem to be a lot of rhyme or reason 955 00:47:40,170 --> 00:47:42,360 to it sometimes, but here's at least one way 956 00:47:42,360 --> 00:47:45,183 in which an in-frame insertion can be denoted. 957 00:47:46,260 --> 00:47:49,650 It would be denoted as the amino acid before 958 00:47:49,650 --> 00:47:52,980 and the amino acid after. 959 00:47:52,980 --> 00:47:55,740 And so it would be denoted as the wild type, 960 00:47:55,740 --> 00:47:59,280 which would normally be leucine at location 21, 961 00:47:59,280 --> 00:48:02,220 followed by a lysine at location 22.
962 00:48:02,220 --> 00:48:04,410 And there's an insertion now of a threonine. 963 00:48:04,410 --> 00:48:08,340 So in this individual if you were reading 964 00:48:08,340 --> 00:48:13,340 their amino acids in this protein starting at amino acid 21, 965 00:48:15,182 --> 00:48:18,300 it would read as leucine, threonine, lysine, 966 00:48:18,300 --> 00:48:20,670 and then all the rest of the amino acids 967 00:48:20,670 --> 00:48:22,740 are in the normal order 968 00:48:22,740 --> 00:48:24,060 as you would expect for this protein. 969 00:48:24,060 --> 00:48:26,520 Frameshifts, it gets more complicated. 970 00:48:26,520 --> 00:48:29,496 But if you see the letters fs in the designation, 971 00:48:29,496 --> 00:48:32,490 that means there is a frameshift at that location 972 00:48:32,490 --> 00:48:33,780 that they are specifying. 973 00:48:33,780 --> 00:48:37,677 But again, yeah, it gets a little weird. 974 00:48:37,677 --> 00:48:40,290 I still haven't quite figured out if there's... 975 00:48:40,290 --> 00:48:42,690 it doesn't seem like there's a standard way to do it. 976 00:48:42,690 --> 00:48:44,610 So if you see the letters fs, 977 00:48:44,610 --> 00:48:47,730 that likely means it's referring to a frameshift. 978 00:48:47,730 --> 00:48:50,250 And I apologize, I can't give you more direction than that. 979 00:48:50,250 --> 00:48:53,160 It's just I've seen it so many different ways 980 00:48:53,160 --> 00:48:56,490 that it would really, I think, be a waste of time 981 00:48:56,490 --> 00:48:58,554 to try to go through all of them. 982 00:48:58,554 --> 00:49:00,354 But anyway, if you see those, 983 00:49:00,354 --> 00:49:05,354 at least you'll be familiar with what those may indicate. 984 00:49:05,640 --> 00:49:08,160 All right, let's move on to some categories 985 00:49:08,160 --> 00:49:09,870 of genetic disease. 986 00:49:09,870 --> 00:49:12,030 Chromosomal disorders, which we talked about a lot 987 00:49:12,030 --> 00:49:13,320 in the last module.
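Looking back at the nomenclature for a moment, the designations described earlier (G542X, G176F, the delta or DEL form for F508, and fs) can be recognized with a few patterns. This is a rough sketch only: the written form `L21_K22insT` for the insertion example is an assumption about how the slide spells it, `K76fs` is a made-up example, and `describe` is a hypothetical helper, not a standard tool. Real reports vary in format, so treat the patterns as a starting point.

```python
import re


def describe(notation):
    """Return a plain-English reading of a protein-level mutation designation."""
    # Any designation containing "fs" is read as a frameshift.
    if "fs" in notation:
        return "frameshift"

    # In-frame deletion: delta symbol or DEL/del, then amino acid and location.
    match = re.fullmatch(r"(?:Δ|del|DEL)([A-Z])(\d+)", notation)
    if match:
        amino_acid, location = match.groups()
        return f"in-frame deletion of {amino_acid} at location {location}"

    # In-frame insertion (assumed spelling): amino acid before, amino acid after, "ins", new amino acid(s).
    match = re.fullmatch(r"([A-Z])(\d+)_([A-Z])(\d+)ins([A-Z]+)", notation)
    if match:
        before, loc_before, after, loc_after, inserted = match.groups()
        return f"in-frame insertion of {inserted} between {before}{loc_before} and {after}{loc_after}"

    # Nonsense or missense: wild-type amino acid, location, then X or the new amino acid.
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", notation)
    if match:
        old, location, new = match.groups()
        if new == "X":
            return f"nonsense: {old} at location {location} becomes a stop"
        return f"missense: {old} at location {location} becomes {new}"

    return "unrecognized"


for example in ["G542X", "G176F", "ΔF508", "L21_K22insT", "K76fs"]:
    print(example, "->", describe(example))
```

The order of the checks matters here: the fs test and the deletion and insertion patterns run first, so the simple letter-number-letter pattern only sees nonsense and missense designations.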
988 00:49:13,320 --> 00:49:17,670 These are polygenic, and by that I mean they're involved 989 00:49:17,670 --> 00:49:20,490 with multiple genes and that would be all of the genes 990 00:49:20,490 --> 00:49:23,370 in the chromosome, if it's aneuploidy. 991 00:49:23,370 --> 00:49:26,920 Basically, the disorder itself may be 992 00:49:28,290 --> 00:49:29,640 the phenotype of the disorder. 993 00:49:29,640 --> 00:49:34,260 So let's say the phenotype would be Down syndrome. 994 00:49:34,260 --> 00:49:39,260 The genotype would be trisomy of chromosome 21. 995 00:49:40,290 --> 00:49:42,120 So the phenotype would be the characteristic, 996 00:49:42,120 --> 00:49:43,410 which would be Down syndrome, 997 00:49:43,410 --> 00:49:45,300 all that's associated with that. 998 00:49:45,300 --> 00:49:47,640 The genotype would be what caused it 999 00:49:47,640 --> 00:49:49,306 from a sequence perspective. 1000 00:49:49,306 --> 00:49:51,660 And that would be a trisomy of chromosome 21. 1001 00:49:51,660 --> 00:49:54,270 So chromosomal disorders are polygenic, right? 1002 00:49:54,270 --> 00:49:56,220 So they involve many different genes, 1003 00:49:56,220 --> 00:49:58,560 however many different genes are either on the chromosome 1004 00:49:58,560 --> 00:50:00,390 itself, if it's an aneuploidy 1005 00:50:00,390 --> 00:50:03,243 or in the region where there's a chromosomal disorder. 1006 00:50:04,980 --> 00:50:07,770 And that contributes to all of the different 1007 00:50:07,770 --> 00:50:10,770 and various symptoms that you see associated 1008 00:50:10,770 --> 00:50:13,710 with chromosomal disorders. 1009 00:50:13,710 --> 00:50:16,650 A single gene disorder would be considered monogenic. 1010 00:50:16,650 --> 00:50:19,022 There are over 6,000 different identified disorders 1011 00:50:19,022 --> 00:50:20,820 that fall into this category. 1012 00:50:20,820 --> 00:50:23,921 They're often called Mendelian disorders as well.
1013 00:50:23,921 --> 00:50:25,170 You might hear that, 1014 00:50:25,170 --> 00:50:27,590 we'll talk about that a lot in the next lecture. 1015 00:50:27,590 --> 00:50:29,820 And then there are multifactorial, 1016 00:50:29,820 --> 00:50:32,640 which is nearly all diseases to varying degrees, 1017 00:50:32,640 --> 00:50:34,140 which could be considered multifactorial, 1018 00:50:34,140 --> 00:50:37,291 which means they are involved with multiple genes 1019 00:50:37,291 --> 00:50:40,999 to different extents and influence from the environment. 1020 00:50:40,999 --> 00:50:43,833 We'll talk about this in the next module. 1021 00:50:45,241 --> 00:50:50,241 And those would be diseases like cardiovascular disease. 1022 00:50:50,400 --> 00:50:52,770 Cancer, for example, is also 1023 00:50:52,770 --> 00:50:55,083 a multifactorial genetic disease. 1024 00:50:57,240 --> 00:51:00,843 You know, stroke risk, all of these are multifactorial. 1025 00:51:02,859 --> 00:51:05,490 Chromosomal disorders, as you recall, 1026 00:51:05,490 --> 00:51:08,550 these would be either a full on aneuploidy 1027 00:51:08,550 --> 00:51:13,550 or it could be a portion of a chromosome is deleted, 1028 00:51:13,620 --> 00:51:18,620 duplicated, inverted, transferred from one chromosome 1029 00:51:19,680 --> 00:51:21,913 to another, you remember all of those 1030 00:51:21,913 --> 00:51:24,243 from the last module, hopefully. 1031 00:51:25,230 --> 00:51:27,930 So this is excess or lack of a whole chromosome 1032 00:51:27,930 --> 00:51:30,030 or segment of a chromosome. 1033 00:51:30,030 --> 00:51:31,590 So phenotype is the net effect 1034 00:51:31,590 --> 00:51:34,673 of changes in many different genes. Okay. 1035 00:51:34,673 --> 00:51:37,410 Single gene disorders, these would be disorders 1036 00:51:37,410 --> 00:51:40,530 like cystic fibrosis, which occurs when both copies 1037 00:51:40,530 --> 00:51:42,180 of the CFTR gene are mutated. 
1038 00:51:42,180 --> 00:51:44,561 And we will go into that in the next lecture. 1039 00:51:44,561 --> 00:51:47,640 This would be a mutation or disruption in a single gene 1040 00:51:47,640 --> 00:51:49,620 which results in the disease. 1041 00:51:49,620 --> 00:51:54,513 So for a person who has the genotype 1042 00:51:54,513 --> 00:51:57,840 which determines cystic fibrosis, 1043 00:51:57,840 --> 00:52:00,690 it doesn't matter what that person does, right? 1044 00:52:00,690 --> 00:52:05,690 So they could live the healthiest life they possibly could, 1045 00:52:05,940 --> 00:52:07,890 they're still going to develop, at least, 1046 00:52:07,890 --> 00:52:10,770 some of the symptoms of cystic fibrosis. 1047 00:52:10,770 --> 00:52:12,780 They're still going to develop cystic fibrosis 1048 00:52:12,780 --> 00:52:15,450 because getting that particular disease, 1049 00:52:15,450 --> 00:52:18,390 the manifestation of that disease is 100 percent 1050 00:52:18,390 --> 00:52:23,390 determined by the genotype for the cystic fibrosis gene. 1051 00:52:25,560 --> 00:52:28,440 So it's monogenic, single gene influencing it, 1052 00:52:28,440 --> 00:52:30,630 and in particular mutations in that gene 1053 00:52:30,630 --> 00:52:32,130 will give you the disease. 1054 00:52:32,130 --> 00:52:34,863 The disease will absolutely manifest. 1055 00:52:35,760 --> 00:52:39,810 As opposed to multifactorial disorders, 1056 00:52:39,810 --> 00:52:42,510 where you can have really a combination of effects. 1057 00:52:42,510 --> 00:52:44,940 And again, this is most of what you see, 1058 00:52:44,940 --> 00:52:46,290 certainly on a day-to-day basis. 1059 00:52:46,290 --> 00:52:48,690 And this will be most diseases are 1060 00:52:48,690 --> 00:52:50,520 influenced by many different factors. 
1061 00:52:50,520 --> 00:52:53,627 So if you take hypertension for example, well as you know, 1062 00:52:53,627 --> 00:52:57,030 right, that's going to be influenced dramatically by 1063 00:52:57,030 --> 00:53:00,750 a person's diet, by whether they smoke or not, 1064 00:53:00,750 --> 00:53:04,141 by their lifestyle, so whether they're sedentary 1065 00:53:04,141 --> 00:53:08,747 or do regular amounts of exercise. 1066 00:53:08,747 --> 00:53:12,382 But it's also influenced by a variety of different genes 1067 00:53:12,382 --> 00:53:15,480 and different mutations in those genes. 1068 00:53:15,480 --> 00:53:18,990 They all contribute a little bit to the risk associated 1069 00:53:18,990 --> 00:53:22,629 with potentially developing something like hypertension. 1070 00:53:22,629 --> 00:53:27,450 Okay. Yes, and hypertension involves over 15 genes 1071 00:53:27,450 --> 00:53:30,426 as well as diet, exercise, and smoking exposure. 1072 00:53:30,426 --> 00:53:34,181 And we'll get into a little more of that in the next module. 1073 00:53:34,181 --> 00:53:37,830 These become a lot more difficult to nail down 1074 00:53:37,830 --> 00:53:40,618 and give, you know, precise estimates to patients 1075 00:53:40,618 --> 00:53:43,620 as far as what's their risk because there are a lot 1076 00:53:43,620 --> 00:53:46,560 of factors contributing a little bit of risk. 1077 00:53:46,560 --> 00:53:48,887 And it's really the combination of those 1078 00:53:48,887 --> 00:53:51,450 that will determine whether or not 1079 00:53:51,450 --> 00:53:53,580 a person develops a particular disorder 1080 00:53:53,580 --> 00:53:55,180 and the extent to which they do. 1081 00:53:56,340 --> 00:53:57,810 All right, let's summarize here. 1082 00:53:57,810 --> 00:53:59,580 Genotype leads to phenotype. 
1083 00:53:59,580 --> 00:54:04,140 So your genotype would be the sequence of a particular gene, 1084 00:54:04,140 --> 00:54:06,960 and that's going to lead to the phenotype 1085 00:54:06,960 --> 00:54:11,147 or how the genotype is seen in an individual. 1086 00:54:14,460 --> 00:54:18,810 So their traits are affected by a particular genotype. 1087 00:54:18,810 --> 00:54:21,090 Changes to a gene sequence can result in changes 1088 00:54:21,090 --> 00:54:23,970 to the protein it encodes, which can affect phenotype. 1089 00:54:23,970 --> 00:54:27,780 We gave you a few examples there of sickle cell disease 1090 00:54:27,780 --> 00:54:32,451 and cystic fibrosis, how changes in those proteins 1091 00:54:32,451 --> 00:54:36,183 can affect the individual. 1092 00:54:37,260 --> 00:54:40,350 Mutations include changing a base or point mutation 1093 00:54:40,350 --> 00:54:42,930 or inserting or deleting one or more bases. 1094 00:54:42,930 --> 00:54:45,120 And these mutations can cause one amino acid 1095 00:54:45,120 --> 00:54:47,250 to be different, which would be a missense, 1096 00:54:47,250 --> 00:54:50,010 the protein to be truncated, which would be nonsense, 1097 00:54:50,010 --> 00:54:52,020 an amino acid to be added or removed, 1098 00:54:52,020 --> 00:54:54,330 which would be an in-frame insertion or deletion, 1099 00:54:54,330 --> 00:54:56,400 or the protein to be dramatically different, 1100 00:54:56,400 --> 00:54:58,486 which would be from a frameshift. 1101 00:54:58,486 --> 00:55:00,843 Genetic diseases can be monogenic, 1102 00:55:01,760 --> 00:55:03,300 polygenic, or multifactorial. 1103 00:55:03,300 --> 00:55:05,970 All right, so what's up next in this module?
1104 00:55:05,970 --> 00:55:07,860 Well, the remaining two lectures, 1105 00:55:07,860 --> 00:55:11,595 we're going to dive into patterns of inheritance 1106 00:55:11,595 --> 00:55:14,220 that are commonly called Mendelian Inheritance, 1107 00:55:14,220 --> 00:55:17,190 which means inheritance of conditions 1108 00:55:17,190 --> 00:55:21,120 that are caused by a single gene. 1109 00:55:21,120 --> 00:55:23,653 So looking at inheritance of a single gene. 1110 00:55:23,653 --> 00:55:26,430 And so we're gonna have two lectures on that. 1111 00:55:26,430 --> 00:55:29,310 The first one is on the basics of Mendelian Inheritance, 1112 00:55:29,310 --> 00:55:31,230 and that's where we get those terms of dominant 1113 00:55:31,230 --> 00:55:33,330 and recessive, I'm sure you're familiar with. 1114 00:55:33,330 --> 00:55:34,950 And then we're going to move on 1115 00:55:34,950 --> 00:55:36,690 to some extensions to Mendel. 1116 00:55:36,690 --> 00:55:40,230 So as we know all too well in genetics, 1117 00:55:40,230 --> 00:55:42,330 there's always exceptions 1118 00:55:42,330 --> 00:55:45,330 and some of them are actually pretty important to make sure 1119 00:55:45,330 --> 00:55:48,004 that you're familiar with and aware of 1120 00:55:48,004 --> 00:55:50,700 in things you might see in your practice 1121 00:55:50,700 --> 00:55:53,850 or patterns of inheritance that might not quite fit exactly 1122 00:55:53,850 --> 00:55:56,880 what we learned about from the Mendelian Inheritance. 1123 00:55:56,880 --> 00:55:59,130 So with that, I will say goodbye 1124 00:55:59,130 --> 00:56:01,330 and we'll talk with you in the next lecture.