Sunday, November 18, 2012

Math In The Time of Cholera

I got enough nice comments on the last post that when I was struck by a piece of curiosity I decided to go for another. To the extent that this blog has a purpose, it is essentially going to be my tinkering, which may or may not involve programming, but will (I imagine) pretty much always involve some form of mathematical modeling.

This week, I've decided that what's cool is evolution. In particular, antibiotic resistance. We all know that antibiotics, as they are (over)used, eventually fail to work. As bacteria are exposed to them, they evolve around them, learning to survive the deluge of drugs. Although I understand that this is a serious health problem for many, somehow I can't help but feel a smidgen of respect for these microorganisms, or at least for the machinery of life that enables their adaptation. As a great chaos theorist once said, "Life finds a way."

Disclaimer: Last time, I put up a disclaimer saying I didn't know anything, really, about cryptography. I would like to similarly note that I'm quite ignorant with respect to microbiology. At least with cryptography, I was staying in the realm of my academic training, which is highly mathematical. What I'm doing here is mathematical, but it's really a biological application, and the math is just how I approach it, because that's what I'm comfortable with. I worry that here I enter a long tradition of physical scientists who think they know what they're talking about. Hopefully I can get away with it by being self aware: I am not a doctor, I am not pre-med, I am not in the life sciences. The analysis and programs that follow are ridiculously simplistic and took me only a few hours, and are not to be taken seriously by anyone, including me. Feel free to point out all of the failings in this analysis.

So, anyway, we all know what antibiotic resistance is. When I examined the model I had in my head of antibiotic resistance, though, I discovered that it didn't really hold up, probably because I constructed most of it when I was ten. In my head, I imagined two kinds of bacteria--"resistant" and "vulnerable." When exposed to antibiotics, the resistant ones lived and the vulnerable ones died. Afterwards, only the resistant ones were around to produce offspring,  so the population as a whole would be resistant.

The flaws in this model come about when I considered the frequent advice that we hear: always finish your antibiotic regimen, because that helps prevent resistant bacteria from arising. This doesn't make a whole lot of sense in the context of my previous mental model. If resistant bacteria are all already resistant, then taking more antibiotics (or antibiotics for longer) won't affect them. In fact, arguably, you'd just be applying more selective pressure to the population, leading future generations to be even more resistant. 

The problem arises because I conceived of bacteria as falling into two categories, a binary distinction between "dies when exposed to antibiotic" and "lives when exposed to antibiotic." Perhaps, instead, it would make more sense to talk about probabilities. After all, often in toxicity we hear about "LD50" -- the lethal dose that kills 50% of a population. No antibiotic is 100%, right? Even bleach, alcohol, and other chemicals so strong that bacteria just straight-up do not evolve around them advertise "kills 99.9% of germs!"

In that last paragraph, I switched from talking about words ("resistant" and "not resistant") to talking about numbers ("how resistant?") which is my personal sign that I could analyze this for real. My goal was this: I wanted to watch a virtual population of bacteria evolve antibiotic resistance. 

The first part of any mathematical modeling is abstraction. For example, if i calculate the trajectory of a baseball, I will treat the baseball as if it were a point mass. I will ignore air resistance and how fast it spins in flight and the gravitational effects of the moon. Of course, these things can factor into an analysis, if I need a very detailed or comprehensive answer--but in general, why worry about them? Similarly, I don't need to do a real simulation of a human body full of bacteria, or to worry about how the antibiotics get there, or anything like that. I want to abstract out so all I have to worry about is the actual resistance, since that's what I'm interested in.

To do this, I organized my program like so: I have a list, called Petri. Petri is a petri dish, except obviously not a real one because doing biotech in my bedroom is a good way to give myself a terrible disease on accident. Petri is a list of objects which are just called "cells." (An aside, to any programmers reading, this was my first ever delve into the object-oriented world, with "cell" the first class I've ever defined. I had read about the concept and so I had some idea of what I was doing, but still. Milestone.) A cell has two associated variables. One variable "alive", which is either "true" or "false" to tell me whether the cell is alive, naturally. Another variable is "resistance." Resistance goes from .05 to .95, and represents the chance that the cell survives a dose of antibiotic. 

Resistance is an exceptionally abstract statistic, and I'm somewhat proud of how much information it carries in it. Everything about the interaction of the cell with the outside chemical is contained in that number. If you took a larger volume antibiotics, all that would do is change resistance. If you took them more frequently, I would just have to apply a dose more frequently. It doesn't matter how the antibiotic works, or how it's distributed, or anything. One thing I have thought of that it doesn't account for is this: suppose all the drug does is stop replication, rather than killing? That bacterium still takes up space on my virtual petri dish, but doesn't replicate (presumably eventually dying of, I dunno, entropy). To work that in, I'd have to allow for bacteria that don't die but also don't replicate, and so continue to take up space for a little while, etc. An interesting future extension, maybe.

Ok, so now that I have my list of cells, what do I do with it? Well, I can "dose" the dish, which makes all the bacteria roll a die generating a number between 0 and 1, and if that number is higher than their resistance, they die. I can also call a function called "generation." Generation means that every bacterium that is still alive makes a copy of itself into one of the empty spaces. (The number of total spaces is fixed and initially every space is full of bacteria, so the population is limited by its starting total.) I also want to be able to keep track of how the population is doing, so I have a function that can count the number of remaining live cells and another that calculates the average resistance of all the cells, which is interesting because this should increase as doses continue.

There are several arbitrary aspects of the simulation I'd like to talk about, because there are parameters that I essentially pull out of thin air. One of these is the generations/dose ratio. How many generations pass between doses? A thousand? Two? Ten? Manipulating this turns out to be very interesting, more on that later. Another is the initial resistance distribution. I have to pick a resistance for these guys to start with, after all, before any selection can occur. Also, they'll have to have different resistances, because otherwise there is no natural selection.

For the trials I'm about to show you, the generation/dose ratio was 2/1. That is, my program counts time in steps, t = 0, 1, 2..., and there is a new generation every step, and a dose of antibiotic applied at every even step. Again, this is arbitrary. I also started with an initial resistance distributed exponentially, with the average resistance at .15 or so. Also, there are 1000 spots in my petri dish. (This is arbitrary too, but it needs to be large enough that statistics works, but small enough that the program doesn't take too long to run. It is, therefore, limited mainly by patience, and more would be better.)

Anyway, here are the results. First, let's look at a chart of how many cells were alive:

Notice what happens here. First, the population drops extremely quickly. It makes a few small rebounds and is driven further down--and then there's a turnaround, and the bacteria win from there on out, their growth outpacing the antibiotic. Eventually they reach saturation, and every dose of the antibiotic does indeed drive them down--but their reproduction is fast enough to recover before another dose comes. Notably, they are consistently driven down to about six hundred or so. Let's look at what happened to the average resistance during the simulation:

First of all, it obviously rose quite a bit--remember our starting average was on the order of .1, and we ended around the .6 mark. That aligns perfectly with the survival rates we saw at the end of the simulation But why didn't it go higher? Well, the fact is that the simulation only has certain values in it, and so it appears that eventually all surviving cells had a common ancestor with that resistance, so it stops there. 

That reveals a flaw in the program that goes with another flaw I had already noticed and decided not to tell you about. First: no new traits emerge in this. Natural selection operates on the resistances that are already present, but if there was no resistance .9 in the sample, nothing will ever generate that high of resistance. Second: I don't think I have a very good distribution. Every biological trait I can think of tends to be distributed normally--on a bell curve, that is. But if I had a bell curve centered on .1, I might not have any of high resistance at all, and then my colony could never evolve that, so I cheated and used an exponential distribution.

Fixing one of these problems requires fixing the others, so here's what I'm going to do: First, I am replacing the exponential distribution with a bell curve centered on .15, with a standard deviation of .03. (On a normal distribution, 99.7% of samples lie within three standard deviations, so virtually all of our bacteria will lie between .06 and .24.) Second, I am introducing a new stat that each cell carries called "mutability." A cell's mutability ranges from 0 to .1, and when that cell reproduces, its daughter cell's resistance is modified by a random percentage of its mutability in either direction. That is, a cell with mutability .1 and resistance .2 will have daughters equally likely to be anywhere from .1 to .3. (I could make this a bell curve as well, I suppose, but it's not as crucial.) This will allow the population to produce resistances not present in the initial sample as the offspring may mutate to have better resistance than their parents.

Now, with our new simulations, here are the outputs:


Oh man. Ok, I kind of make a life out of geeking out about this stuff, but still, you gotta admit this is cool. First, look at the cell counts. Much like the others, it drops precipitously, recovers, reaches saturation, and oscillates as dosing continues. Unlike the others, it doesn't get stuck. Every time the dose applies, another round of natural selection occurs, until eventually the population is maximally resistant--which I set at .95, because it didn't seem all that realistic to allow perfect resistance. (In development, before I put that limit in, the resistance would go to 1.) Especially cool is the average resistance chart, because we started with a very tightly clustered distribution, but it doesn't take very long before every member of the population is more resistant than the entire original generation, which I think is pretty sweet.

Alright, until now, I've only shown you depressing graphs in which disease is nearly beaten back and then roars to a maximally-resistant dominance over feeble medicine. Luckily for you and I as humans, this is not inevitable. The variable I'm going to tweak is the generations/dose ratio, and I explored this by trial and error. If we flip it--two doses per generation--the bacteria die out in less than a dozen generations. I wanted to find something close to the tipping point, so I experimented a bit more, and it turns out we were never far from the tipping point--if there are three generations for every two doses, instead of four, the antibiotic wins. This only takes twenty-something generations, because there is no dramatic turnaround--only the cold hard victory of medical science:


Note that the resistance still goes up--the bacteria are, in fact, evolving--but because we finish our antibiotics they die and their resistance never escapes. If you stopped taking antibiotics, you would indeed potentially birth a new population of more-resistant (if probably not totally-resistant) bacteria. Repeat this a few times and you could well end up with maximally-resistant germs, at which point you are well and truly screwed.

Since I'm so big on the power of mathematical abstraction, let me point something out that represents the other side, that is, the flaws in this. My "doses" and my "generations" are quite abstract--in real life, antibiotics have a continuous effect if you're taking them constantly, and bacteria don't all divide in lockstep. This is fine for my extremely-simplistic model, but it means that ideas like "doses per generation" don't necessarily translate into real-life terms very well. In other words, don't read "doses per generation" as "pills per hour." Still, the basic concept--you need enough drugs to overcome their reproduction, and you need to stay on them long enough to actually wipe out the population--is there.

I want to point out that there are several other interesting things I could manipulate here--one would be starting resistance, one would be mutability range. Biological modeling is a real field, obviously, and I don't really have time this afternoon to reinvent it. There is one other experiment I do really want to run, though, and it didn't occur to me until I was typing this. I want to make mutability, uh, mutable. Before, a cell was exactly as mutable as its parent--now, I'm going to modify that number by saying that each cell's mutability can change...by its mutability. So mutability could as much as double, or it could drop all the way to zero. I don't know what will happen here exactly, but I have a hunch I'm highly interested in.

Neat. Super neat. Check it out: mutability at first increased, which makes sense because when you're under attack by terrible deadly chemicals it makes sense to be flexible. These cells found it favorable to be changeable, in other words. But at some point, the cells were as resistant as they were going to get, and at that point, mutations became a detriment. If there's nowhere to go but down, you're better off staying still. So mutability fell and continued falling for the rest of the simulation. (This is what I expected and so it makes me feel really cool, though I could easily lie about that. You'll have to trust me.) Again, we see the power of abstraction: real mutability presumably depends on, I dunno, DNA transcription error correction and stuff that I have no idea about. But because I looked at behavior (rather than mechanisms of behavior) I can still see these trends even though my "cells" are nothing like the real thing.

Natural selection: ain't life grand? That's kind of how I feel at the end of this project. I wanted to watch simple digital evolution, and I did. I hope you enjoyed coming along for the ride, if you've read this far.

Once again I'm attaching a tarball of my Python code. It is not very fast. (I can see ways to optimize it but don't really need to at this point, because, ah, I'm done with it for right now.) It is probably not very good. But, here it is, if you are interested.

http://dl.dropbox.com/u/32594622/antibio.tgz

wait a minute don't i have a test tomorrow

No comments:

Post a Comment