Friday, October 2, 2020

Tea Time: What is a crumpet, anyway?

Tea and crumpets. It is a phrase many of us have heard, conjuring images of steaming Earl Grey served in porcelain cups and overstuffed armchairs. But what is a crumpet? Recently, I dug up a recipe and made a batch.


Crumpet with butter and jam on a plate.
Crumpet rings, tea is whistlin'; On the cake, butter's glistening. We're humming a song as we munch along...

A "crumpet" is a griddle cake, something like a cross between a pancake and an English muffin. Unlike pancakes, crumpets are yeast-risen and fluffy; big foamy bubbles form the "nooks and crannies" that absorb plenty of warm butter and jam. Crumpets are grilled on one side only at first and toasted before serving. They are an old English tradition, going back in some form for centuries, first formally mentioned by the Bible translator, John Wycliffe, in the 14th century and later described in closer to their current form by Elizabeth Raffald in "The Experienced English Housekeeper" in the 18th century [Oxford Online Reference Library].

The ready-to-cook batter is much too soft to stand on its own while cooking, so the crumpets are formed in metal rings, crumpet rings. Good crumpet rings are stainless steel and double-rolled to present a smooth, easy-to-clean surface for consistent crumpet extraction. Most people don't have them, however, and it's not like you can just pick them up at a typical American store. Canning jar lids are not at all ideal, but they do (mostly) work. (They probably offend some deep-seated English taboo, so exercise caution. I have no idea how a Brit would respond to them being served with hot coffee, either...)

My crumpet adventure started with a recipe from the Daring Gourmet. I was with the in-laws and made a double batch. I did not double the salt. So, the ingredients list ended up being:

  • 4 cups of flour, sifted. Half of this was recently ground, half store-bought because I ran out of fresh.
  • 1 teaspoon salt
  • 2 cups warm milk
  • 2 tablespoons active dry yeast
  • 2 teaspoons sugar
  • 2 cups warm water
  • 1 teaspoon baking soda

As instructed in that recipe, the batter is made in two stages, first making and rising a spongy dough, then beating in more liquid and resting for 30 minutes. The recipe specifically states not to worry about beating all of the lumps out when mixing the batter: you want some variation in texture. The rising process is very important to getting the big bubbles in the batter (you can see this in my picture of crumpets on the griddle). On this particular day, my wife was also baking, so the area was nicely warm for the rising and it doubled in size fairly quickly. Your mileage may vary. Proof your yeast and go by actual rising, not the clock-time.

The rings are necessary to get the batter to stand up enough to get decently-thick crumpets. The canning jar lids are not ideal and take some fiddling. I used a mix of narrow and wide mouth rings because that is what my in-laws had in the cupboard. I found that it was best to fill them 2/3rds full with batter on the hot, greased griddle. Preheat the griddle until a droplet of water dances. The crumpets expand when cooking and will overflow if the ring is overfilled. The crumpets are flatter and kind of crispy when underfilled. They have a pleasant, yeasty aroma while cooking.

Grease the rings well! I used cooking spray, again, because it was what I had on hand. When the rings were not greased well, underfilled or overfilled, I suffered Crumpet Extraction Failure (CEF). I had to force the cooked crumpet out and then had to try to quickly scrub and regrease the ring for the next batch. This was a substantial pain-in-the-you-know-what. When the rings were filled correctly and greased well, the crumpets dropped right out, even with the threads in the rings. When you are still learning, you probably want to have a large stock of clean rings handy, preferably the wide-mouth ones. That way, when CEF occurs, you can just set it aside temporarily and grab a clean ring.

The Crumpet Extraction Failure (CEF) condition. Avoid this. Have spare rings in case you do not avoid it.

Since some of the crumpets were to be served right away, I toasted them right on the hot griddle after everything was done. Turn up the heat a bit for proper toasting. The crumpets made with wide-mouth rings fit in a slot toaster, the small ones not safely. In civilized places, crumpets are supposed to be served cut in half (half-moons, not sliced like English muffins!). Again, this worked well for the larger rings, not so well for the smaller. The crumpets were slathered with butter and optionally homemade strawberry jam or honey, served with tea. They were a big hit with our teenage daughter who is now angling for me to make more.

The doubled recipe made 30 crumpets of mixed sizes. I found that the leftovers were best stored in brown paper because they were otherwise retaining too much moisture.

Some people do all kinds of unnatural things with crumpets, including topping with ham and cheese, curry, etc. Diehards stick to butter and jam or honey. Or just butter.

Thursday, September 3, 2020

Exocytosis, Stage Left: Coronavirus strategies for spreading infection

On a cellular biology level, learning about SARS-CoV-2 opens up a micro-world of epic struggles: stealth and trickery, stratagem and counter, adaptation and (usually) survival. This is an early stab at the fifth panel for the redo of Bat Soup: the Graphic Novel in which we attempt to teach really tough biology concepts in a fun way. In this panel, we show the strange ways the virus spreads itself once it successfully enters the cell, takes over its machinery, and forces the ribosome to start translating its proteins to make copies. When I have it where I want it, I will do the final copy on an 11"x14" Bristol Board where I will be able to get finer detail with a less cluttered layout.

Endocytosis: In previous panels, we show the dastardly virus using a disguise to evade the immune system to successfully approach the cell. Its spike protein is initially rotated downward and harder for the immune system to recognize. The virus makes use of a host pro-protein such as furin to properly align the spike protein at the S1/S2 boundary so it can activate the ACE2 receptor to make entry into the cell (endocytosis).
The SARS-CoV-2 virus needs a human proprotein, furin, to correctly position its spike protein at the S1/S2 boundary before it can invade a cell.
Snippet from Panel 1

Infiltration and Replication: Once inside, the coronavirus then makes further use of existing human proteins such as cathepsin to unpack its payload and start translation of its own proteins encoded in its RNA genome. At about 30,000 bases, the genomes of these SARS coronaviruses are among the largest and most complex RNA genomes known. So, now that the virus has taken over the cell's fabrication facility, a fleet of new virus particles is being constructed!

So what happens next? Most (lay) descriptions of viral replication just say that the virus causes the cell membrane to rupture so it can surge forth and infect more cells. With most viruses, there is more to it than that and, anyway, with the SARS viruses, particularly SARS-CoV-2, it takes a very different path-- mediated by the "non-structural proteins" in its complex genome, which is only partly understood. Programmed cell death, "Apoptosis", does likely happen, but the virus has already spread by then and it is done for a much more cunning reason (which we'll get to in Panel 6).
In the meantime, though, in Panel 5: the viral horde is unleashed.

Viral proteins are synthesized and folded in the Endoplasmic Reticulum and then packaged in smooth-walled vesicles; these vesicles can then merge with the cell membrane to release mature virus into the intercellular space; leaked S-proteins can also cause nearby cell-membranes to merge, allowing virus to directly invade neighboring cells
Bat Soup, the Graphic Novel, Panel 5 concept sketch
The Endoplasmic Reticulum: A long time ago in a galaxy far away when I tutored Cell Biology to make a little side money for college, the Endoplasmic Reticulum and the viral transcription process were something most people seemed to have trouble with. It's extraordinarily complicated, and we are still learning. But I always found it helped to picture it like a WWII-era industrial complex, tiers of concrete and banks of windows (some of them are broken out) rising, the ribosomes stuck to the sides. This is the Rough ER, where the magic of protein synthesis and folding happens. I then picture the Smooth ER, where lipid synthesis happens (the stuff that makes cell membranes, among other things), as rising stacks because they are round and smooth like chimneys or pipes. The nucleus is a walled-complex just beyond the factory.

This is the structure the virus has taken over. The ribosomes start decoding RNA sequences: cytosine, guanine, adenine, and uracil, and stringing together chains of amino acids, the building blocks of proteins. But finished proteins aren't just strings of amino acids (AAs), they are complex 3D structures. The Endoplasmic Reticulum is where those chains of AAs are folded into completed structures like enzymes or hormones... or the shell of a new virus particle.

Exocytosis: Once the viral structure is fabricated, the replicated RNA strand is placed inside and the whole is assembled inside smooth-walled vesicles, small bubbles inside the cell which are made out of pieces stolen from the Endoplasmic Reticulum. The result is a set of virus-laden food service condiment packets (extra spicy!) arrayed inside the cell. But the virus does not have to burst the cell open to escape. At the command of virus-provided S-proteins, pores form in the surface of the cell and merge with the smooth-walled vesicles, pouring virus into the intercellular space (the spike protein once more disguised) where they can find more cells to infect. This is "exocytosis".

Syncytium: But this virus has another trick for avoiding the immune system, particularly deep in the lungs: it doesn't have to leave the cell to spread. The hostage cell starts leaking S-proteins into the intercellular fluid. The S-proteins cause the cell membranes of nearby cells to merge together. Long multinucleated cells called "syncytia" start to form which share the same cytoplasm-- and the same viral infection! As we will see in future panels, the cell, including these multinucleated mega-cells, can end up dying several different ways.

This cycle of formation of mega-cells, their destruction, and the body's attempt to heal and regrow tissue is a big reason the virus can cause such massive tissue damage and scarring inside lung tissue. The immune response often makes the situation even worse. Understanding how it works and why some people get away with minimal damage may be a key to effective treatment of severe cases.

Thursday, March 12, 2020

Bat Soup, the Graphic Novel - How SARS-CoV-2 enters a cell

Today we get a quick dose of microbiology and an explanation of how Bat Soup, 2019-nCoV, SARS-CoV-2 (whatever you wish to call it, see a note about disease names at the bottom!), gains access to a cell and starts to infect a human host.
The more basic information people know about this disease, the more tools they have to interpret the news reports (which are often very poorly done by people who know no more than you do). This gives you a chance to make rational decisions, maybe understand what to fear and what not to. My background is described at the top of the Bat Soup for the Soul: Teaching with Coronavirus article I wrote previously.

This particular virus uses a vulnerability in the Angiotensin-Converting Enzyme 2 (ACE2) found in many human cells and, in particular, epithelial cells (lining) in the lower lung. This is the same receptor used by SARS-CoV, but quite different from MERS-CoV, the common flu, etc.

Note here that there are plant compounds or "phytochemicals" which also bind weakly to this receptor and may inhibit (temporarily block) viral activity. Host receptor blockage by phytochemicals or synthetic compounds is a hot area of antiviral research. The object, of course, is to find something which inhibits the virus without itself causing damage to humans. There are actually a substantial number of naturally-produced compounds which might do the trick with COVID-19. None have been clinically proven yet, though a few had some potential effect in studies with the original SARS outbreak in 2004 or show antiviral activity in vitro (in a test tube) or an animal model.
The coronavirus really does look a bit like a hairy ball (that is where it gets its name), but I have used a tiny bit of artistic license here.

The ACE2 receptor is part of the Renin-Angiotensin System, or RAS. The RAS regulates a number of important body functions, including respiration, heart rate, blood pressure, and kidney function. Some of you may take medications which target angiotensin or the ACE (Angiotensin-Converting Enzyme, part of a set of related functions with ACE2). These medications, called ACE inhibitors, may cause or indicate potential complications for Bat Soup, but this is still being researched. In any case, the virus, in addition to hijacking the cell for its own purpose, causes collateral damage to the RAS and complications throughout the body of the human host.

After the virus gains entry into the cell and uses the cell's machinery to replicate, the cell dies and releases more virus particles to spread further. The human body has mechanisms to try to detect and destroy these hijacked cells before they release their viral cargo, including a cytokine called TNF (Tumor Necrosis Factor), which is also used to fight tumors. When the immune system overreacts, cytokines go crazy attacking everything in sight, causing cell damage, inflammation, viral pneumonia, etc. in what is referred to as a "cytokine storm". It is thought by many researchers that the cytokine storm may be triggered as a tactic by the virus, like causing a large-scale riot to cover up a break-in in a particular building. The chaos caused by the cytokine storm permits further and faster infection and may become deadly in its own right but is very hard for modern medicine to treat.

This is, of course, a very simple attempt at explaining a complex topic. More references are included below for the adventurous reader to explore further.


  • Buhner, S. H. (2013). Herbal Antivirals: Natural Remedies For Emerging and Resistant Viral Infections (e-book). Massachusetts: Storey Publishing. Retrieved from
    • Includes an in-depth section on SARS, the ACE2 receptor, and potential phytochemicals, including sources, studies, and preparations. Extremely in-depth material, but the best one-stop source for plant compound antiviral activity, research, and practice.
  • Chen, H., & Du, Q. (2020). Potential natural compounds for preventing 2019-nCoV infection Hansen. Preprints.Org, (January). Retrieved from
  • Wan, Y., Shang, J., Graham, R., Baric, R. S., & Li, F. (2020). Receptor recognition by novel coronavirus from Wuhan: An analysis based on decade-long structural studies of SARS. Journal of Virology, (January).
  • Bat Soup for the Soul: Teaching with Coronavirus describes disease models and spread statistics to the non-epidemiologist with graphical illustration.
  • The Confusing World of Disease Mortality Statistics in Simple Numbers
If you want to learn more about the general mechanisms of viruses (e.g. influenza), there is an excellent online Virology 101 course/podcast with plenty of diagrams and examples (free).

An Explanation of Names

When it was originally discovered, this virus, which was found to belong to the general family of the coronavirus, was simply labelled 2019-nCoV or "2019 novel coronavirus", novel, here, meaning simply previously unknown and poorly understood. As more about the virus was learned, it was renamed to SARS-CoV-2, formally signifying that it is closely related to (but not identical to) the virus behind the SARS outbreak of 2003-2004. The disease the virus causes is called COVID-19 (Coronavirus disease of 2019).

The two names can be a little confusing, but it is similar to HIV/AIDS: the human immunodeficiency virus (HIV) causes AIDS. Most of the time, the names can be used interchangeably unless you wish to make it clear that you specifically mean either the virus itself or its disease in humans. "Coronavirus" is often an acceptable shorthand as long as it is clear that it potentially refers to more than one virus which affect both humans and animals.

I started using the nickname "Bat Soup" before a formal name had been decided on, based on the urban myth (almost certainly not true) that the original victims got the virus from eating undercooked bat soup. In any case, this tiny virus has put many people in "deep soup".

Wednesday, March 11, 2020

The Confusing World of Disease Mortality Statistics in Simple Numbers

2b/~2b: 'How Many?' is the question!

There is a lot of confusion and debate over mortality figures for novel coronavirus (COVID-19, formerly 2019-nCoV). Most people see the numbers but do not understand how they are derived and therefore may be confused on how to compare numbers from different outbreaks or even the same outbreak on different days or different sources.

As discussed in my previous article, "Bat Soup for the Soul: Teaching with Coronavirus", the simple answer to how deadly this new virus is is that it is a good deal less deadly than SARS-CoV was and a good bit more deadly than the seasonal flu (but affects somewhat different age-groups--- out of scope for this article). At the same time, it is markedly more transmissible than SARS was and somewhat less transmissible than the flu. So, bottom line is that it does less damage on an individual basis than SARS but already has affected many more individuals (and continues to do so). Similarly, it is likely to spread less effectively than the flu but hurt more of the people it does infect (especially the elderly).

[Version 1.1 20200311: corrected typo in equation. Thank you CEMV!]

Less Deadly Is Not Always 'Good'

In general, we often see that less deadly diseases spread faster for the simple reason that people who get quickly and desperately sick do not tend to want to run around and spread disease! When someone has only mild symptoms or takes longer to get sick, they have opportunities to pass the infection to more people. But let us take a quick look at how the mortality figure is derived and why estimates may differ very sharply. We will walk through the math but with deliberately very simple numbers to start:

Let's say you have an outbreak with 20 people infected. At the time we measure, there are 5 fatalities, 5 serious cases, 5 recovered cases, and 5 mild cases. What is the fatality rate?

The quick answer is to divide 5 fatalities by 20 total cases for 25%:

5/20 = 0.25 = 25%

This is more or less the type of number often published for COVID-19. At this moment, using Johns Hopkins' tracker, you get:

4,373 deaths / 121,564 total cases = 0.03597282 or 3.6%

Don't put ANY stock in that specific number because it will be different by the time you read this. If you take this number at different times over the outbreak, the number varies somewhat, and the numbers published by various clinicians or regional authorities vary a great deal because they are taking numbers from their specific populations. Depending on what numbers you use, you can get anywhere from 0.7% to almost 8%, for instance, from different phases of the outbreak in China (according to WHO's report on the Joint Mission to China at the end of February).
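If you would rather let the computer do the division, here is a minimal Python sketch of the crude mortality calculation described above. The function name `crude_mortality` is my own invention for illustration; the inputs are the toy outbreak and the tracker snapshot quoted in the text, which are out of date by design.

```python
def crude_mortality(deaths, total_cases):
    """Crude mortality: deaths divided by ALL known cases, finished or not."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    return deaths / total_cases


# Toy outbreak from the text: 5 deaths among 20 known cases.
toy = crude_mortality(5, 20)

# Tracker snapshot quoted in the text (a moment in time, not a prediction).
snapshot = crude_mortality(4_373, 121_564)

print(f"toy outbreak:     {toy:.1%}")       # 25.0%
print(f"tracker snapshot: {snapshot:.1%}")  # 3.6%
```

Feed it a different day's counts and you get a different answer, which is the point of the paragraph above.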

OK, so why are people arguing about this? Why are some people saying the number is "wrong" or "likely wrong"?

Well, there are a couple of issues with using this number reflexively.


Crude Mortality versus Completed Cases

First, the number is subtly wrong from the way most people think of the probability of dying from a disease. The number above is really what is often referred to as "crude mortality" because it includes uncompleted cases. What does that mean?

In our first set of numbers, we have 10 people, 5 serious cases and 5 mild cases, who have neither recovered nor died (yet). Presumably, they will do one or the other eventually. When looking at past epidemics, like the final numbers for the SARS outbreak in 2004, every case is completed because no one is still walking around actively infected with SARS-CoV-1! So let's fix the number by only including completed cases:

5 deaths / (5 deaths + 5 recovered) = 0.5 = 50% (!)

Ten people total in our example have either died or recovered, so that goes on the bottom. With the other ten people we simply do not know (yet) what will happen. Hopefully that makes sense so far. Mortality calculated from completed cases will tend to be higher for an active outbreak versus a past outbreak, so one must take some care comparing typical actively reported numbers versus historical. But it takes time during an outbreak to get statistically meaningful numbers of recovered cases, so crude mortality is usually what you get.
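The same toy outbreak can be run through a small Python sketch to see the two measures side by side. The function name `completed_case_mortality` is made up for this illustration, and the numbers are the deliberately simple ones from the example, not real data.

```python
def completed_case_mortality(deaths, recovered):
    """Mortality over completed cases only: people who have died or recovered."""
    completed = deaths + recovered
    if completed <= 0:
        raise ValueError("need at least one completed case")
    return deaths / completed


# Toy outbreak: 5 deaths, 5 recovered, 5 serious, 5 mild.
deaths, recovered, serious, mild = 5, 5, 5, 5
total_cases = deaths + recovered + serious + mild

crude = deaths / total_cases                            # counts unfinished cases
completed = completed_case_mortality(deaths, recovered)  # ignores them

print(f"crude mortality:          {crude:.0%}")      # 25%
print(f"completed-case mortality: {completed:.0%}")  # 50%
```

The ten unfinished cases sit in the crude denominator but not in the completed-case denominator, and that alone doubles the figure in this example.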

To take real coronavirus numbers further, we get:

4,373 deaths / (4,373 + 66,239) = 0.061929984 or 6.2%

This is usually what people are really thinking of when they ask "If I am in fact infected, what is my chance of dying once the disease runs its course?" As you can see, it is worse than the crude mortality frequently published. If only two of the serious cases later die and the rest recover, you will see yet a different (lower) number. But wait...

How Many People Actually Get the Disease?

The number you get is clearly heavily influenced by the number of cases of infection you use in the first place. Is this number "correct"? Well, probably not, and how much it is off is a matter of great debate. What happens if you are "infected" but have a mild case (or maybe do not even notice) and never get tested? You won't be included in the numbers at all. Going back to our simple example, if we say that the mild cases are simply never noticed, we get:

5 deaths / 15 cases = 0.333... or 33%

This number is higher than our initial 25%, but we know it does not actually reflect reality. So, let us say that instead of 121,564 cases of COVID-19 world-wide (the confirmed case count from above), we actually have one mild or asymptomatic case for each confirmed case, someone running around who may think they merely have a cold or whatever. Then we get:

4,373 deaths / (121,564 * 2 total cases) = 0.01798641 or 1.8%

Well, that looks better, doesn't it? This is the kind of thing you will see in many estimates of COVID-19 mortality, depending on what they use as their guess of how many mild or asymptomatic cases there are. In theory, the unknowns could affect the death count as well (two of the confirmed cases in Washington state were diagnosed postmortem), but we tend to be a bit better at noticing when someone actually keels over as opposed to when they just have a sniffle for a day or two.
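To get a feel for how much the guess about unnoticed cases matters, here is a rough Python sketch. The function name `adjusted_mortality` and the list of ratios are assumptions of mine for illustration; nobody knows the real ratio, and this is a teaching toy, not an estimate.

```python
def adjusted_mortality(deaths, confirmed_cases, undetected_per_confirmed):
    """Mortality if each confirmed case hides some number of undetected ones.

    undetected_per_confirmed is a guess, not a measurement.
    """
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    if undetected_per_confirmed < 0:
        raise ValueError("undetected_per_confirmed cannot be negative")
    estimated_cases = confirmed_cases * (1 + undetected_per_confirmed)
    return deaths / estimated_cases


# Snapshot counts quoted in the text; the ratios below are purely hypothetical.
for ratio in (0, 1, 2, 5):
    rate = adjusted_mortality(4_373, 121_564, ratio)
    print(f"{ratio} undetected per confirmed case: {rate:.1%}")
```

A ratio of 0 gives back the crude 3.6%, and a ratio of 1 gives the 1.8% worked out above. The death count stays fixed here on the assumption from the text that deaths are harder to miss than sniffles.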

Getting Actual Numbers

So, how does one figure out which number is the "correct" number to use for actual cases? How do you account for what you do not know?

Well, people guess from various disease models based on past outbreaks or on detailed numbers from one part of an outbreak. But the tried-and-true method is to swab and test everything that moves throughout a community (at least on a random sample basis) to find out how many people running around have the disease but have not actually shown up at a hospital. China, after a very rough beginning, has started to do this and, as a result, their case-counts, while initially sketchy, are a great deal more reliable. They did actually find unreported cases lurking around the community, mild cases, cases mistaken for something else, people afraid to report, etc., but not that many. South Korea has also done extensive testing around their outbreak (and, interestingly enough, their mortality figures are closer to 0.7%, at the low end of what China found).

The US has done very little of this at all and has suffered from a chronic shortage of test kits. Numbers for our domestic outbreaks (and consequently, estimates of mortality in the US) are therefore extremely poor. Presumably, if we actually had the foggiest clue how many people were infected, our mortality figures would be much lower than they appear. But we just do not know--- and cannot until the test kits catch up, which they are starting to do as of this writing on 11 March.

Be aware, then, if you use global case-counts and deaths, you are getting a mixed bag of both good data and bad data. That results in a number which--- well, it isn't wrong, it is a calculation, and it is what it is, but--- may not be very reliable for predicting the future. Using numbers from countries or regions we know have better data may give better results, but then you have to ask yourself whether the results China gets in their health system or South Korea in theirs will apply equally to the US population and our health system. Roughly, perhaps, but never exactly. HIV spread very differently in European populations than in African populations due to what turns out to have been a genetic leftover from bubonic plague: that stuff happens and is inherently unpredictable.


So, what then? What conclusion can we solidly make?

Well, we come back to the beginning: "a good deal less deadly than SARS-CoV was and a good bit more deadly than the seasonal flu". (And, by the way, this virus seems to leave (most) children (<20 years) alone, and that is rather interesting, isn't it?)

Tuesday, January 28, 2020

Bat Soup For the Soul - Teaching With Coronavirus

It is time to speak of many things, of cabbages and kings, of why Bat Soup is boiling hot and whether it has wings...

There is a great deal of media discussion about the 2019-nCoV, 2019 Novel Coronavirus, outbreak in Wuhan, China. Some are predicting dire catastrophe, others are saying it is just a distraction from impeachment. The problem is that most people do not understand viruses or epidemiology enough to judge what is being written, to understand whether this or that recent news is important. I am, myself, "concerned" about the outbreak, very concerned about the catastrophe for the victims in China and "somewhat concerned" about what may happen here. I also see this as a "teaching moment" to try to explain some of the concepts behind the progress of and efforts against the disease.

  • Draft 1.1.1 11 March 2020 - Added link to Flatten the Curve chart (#FlattenTheCurve) and some discussion at end of article now that we have community spread in the US.
  • Draft 1.1 2 February 2020: Added, briefly discuss, a Lancet paper presenting a more involved (SEIR) model. Editorial corrections. Organized References. 1.1.01 same day: typo correction.
  • Draft 1.01 29 January 2020: Corrected significant typo in discussion of Basic Reproduction Number. Thanks CEMV;
  • Draft 1.0: 28 January 2020. Initial complete text. Needs a proof-reading pass or two, apologies.

If you are in a real hurry and do not have time to learn the underlying how and why, this same basic thing is presented in one chart as Flatten the Curve. I discuss this idea a bit more at the bottom and why we have suddenly gone from trying to "stop" the virus to spreading it out in time. (Thank you, Christie!)

Personally, I went from college (Environmental Science) to Air Force Studies and Analyses. My thesis was the production of a computer simulation toolkit for environmental and biological systems in C++. When I was learning these things, the computer resources for exploration were either not available for students or extremely expensive, and I added to the pool of such tools available. At the Pentagon, I mainly supported intelligence analyses using computers: improving, maintaining, and writing tools to analyze intelligence data, including Nuclear, Biological, and Chemical (NBC) warfare models. Since I did not have formal training in epidemiology, I had to learn much of it the hard way, talking to people who did and entombing myself in the Pentagon library for days-at-a-time until I understood what I had to make the simulation simulate, making mistakes, and doing it again until the mistakes went away. That experience does not make me a virologist or an epidemiologist now, but it means I have enough background to understand the papers being published and the data about the course of the disease.

[If you are dumb (or determined?) enough to try to learn the same way I did, some useful starting points are given at the bottom...]

I am going to try to explain some basic principles here about how some of the data coming out of China might affect the United States if the virus spread across the Pond and achieved effective human-to-human transmission here. What I am going to show you is not a predictive model but a teaching tool to understand how such a disease might progress in a large population with no effective medical prevention. Clearly, medical intervention will be attempted and some of it undoubtedly will be successful. The use of this model is to show what those medical efforts need to prevent and some of the issues involved.

If you are math challenged, don't worry about the equations as much. The graphs should give you a feel for what is happening. If you like math, the equations included will give you a means to play with the numbers yourself.

(Brief) Background On the Virus

The 2019-nCoV is a coronavirus which has been discovered in Wuhan, China, and is related to the viruses behind two previous disease outbreaks, SARS-CoV (Severe Acute Respiratory Syndrome) and MERS-CoV (Middle East Respiratory Syndrome). The coronavirus family normally produces disease in bats, not humans. 2019 Novel Coronavirus is just a placeholder title for a specific coronavirus which in some way has learned how to infect humans. The scientific community has not come up with a handier title yet, so for ease of discussion and in honor of the popular (but likely incorrect) idea that it came from eating bat soup, I am going to refer to it as the Bat Soup Surprise Virus, "Bat Soup" or BSSV for short.

As of this writing, Bat Soup has infected roughly 4,000 people, almost all in China, of whom almost 100 have died. There have been 5 confirmed cases in the US, but all of these are imported cases, people who were infected overseas before coming to (or returning to) the US. I am not even going to try to print and cite up-to-date numbers here because they are changing too rapidly.

Animal viruses do cross over to humans from time to time. In many cases, they fail to effectively replicate in humans and therefore simply fizzle out. This virus is concerning because it has demonstrated sustained human-to-human transmission over more than five generations of confirmed cases and does not show signs of weakening. Attempts are ongoing to contain it to China, to locate, isolate, and treat the leakers who have brought the disease to other countries. In China, a large scale quarantine has affected more than 55 million people, including 11 million in the greater Wuhan area and 33 million in a neighboring city. The CDC is working to track contacts of infected people who came to the US and to process test samples to determine who among them may have the virus. This kind of effort is precisely what stopped the spread of SARS in 2003-2004.

Compared to SARS or MERS, this virus is more contagious but considerably less lethal, making it more likely to escape containment and spread but likely to cause fewer fatalities if it does. SARS had a case-fatality rate of about 10%, MERS about 37%; the Spanish Influenza of 1918 somewhat less than 5%; this disease is variously calculated at 4% or 3% and (for a variety of reasons) the actual number is likely to be lower as (if) it spreads.

The Basic Reproductive Number

A critical number for understanding disease epidemics of any type is the Basic Reproductive Number or R0 (often pronounced "R-nought"). This is often talked about but seldom actually explained. The Reproductive Number is the average number of successful transmissions of the disease from one individual. If one person manages to infect two other people (before recovering or dying) and each of those newly infected people manages to infect two other people (and so on), then the Reproductive Number (R) is 2.0, as shown in the following illustration:
Note that R is really an average. Bob might infect 4 people and Susan only 1 (avg = 2.5). It depends both on how contagious the disease is and on how many people Bob and Susan regularly come into contact with! For the same reason, R will almost certainly change over the course of an outbreak, as it encounters different conditions and as the medical community tries to stop its progress. The Effective Reproductive Number at time t or R(t) describes this change over the course of an epidemic. The Basic Reproductive Number, R(0), is then the "ideal" R at the start of the disease in a virgin population and overall (roughly) describes the capacity of the disease to move from human to human in a population. Strictly speaking, this number is different for Bat Soup in China versus Bat Soup in the US. The population density and social habits in Wuhan are just a little bit different from, say, rural Southwest Missouri or even Brooklyn. In common usage, R(0) is used to compare different diseases across populations. Just keep in mind that this common usage is not entirely accurate.

Notice what happens when R changes in the illustration. There are three "interesting" ranges for R in describing diseases:
  1. R is less than 1.0: On average, each infected person infects fewer than 1 other person in each generation of the disease. Over time, this disease will fail to spread and die out. The Middle East Respiratory Syndrome (MERS-CoV) had an R0 of slightly less than 0.7 and did not effectively spread.
  2. R is exactly 1.0 (shown): Each infected person, on average, infects 1 new person. The disease remains in the population, going neither up nor down.
  3. R is more than 1.0 (shown): The number of infected people will tend to increase in the population from generation to generation of the disease. Growth is exponential: slow if R is near 1.0 and increasingly rapid as R increases. Many infectious diseases range from 1.0 to 3.0. Some extremely infectious airborne diseases (e.g. measles) can be 15, 20, or even more.
Handily, this tells us the goal of epidemiology in an outbreak: convince the Effective Reproductive Number to be less than 1.0. Public health efforts do not have to actually stop the disease or prevent every case. If R(t) is less than one, the disease will die out on its own, even if infection continues for a time. There is a "good enough" point which gets the job done and protects the public. This is how SARS was stopped.
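The three ranges are easy to see with a little arithmetic. The following is a minimal Python sketch, assuming a single starting case and an R that stays fixed from one generation to the next (which, as noted, it will not in real life). The function name is made up for illustration.

```python
def cases_by_generation(r, generations, initial_cases=1.0):
    """Average new cases in each generation if every case infects r others."""
    cases = [initial_cases]
    for _ in range(generations):
        # Each case in the current generation produces r cases in the next.
        cases.append(cases[-1] * r)
    return cases


# One value from each "interesting" range: below, at, and above 1.0.
for r in (0.7, 1.0, 2.0):
    series = cases_by_generation(r, 5)
    print(f"R = {r}: " + ", ".join(f"{c:.2f}" for c in series))
```

With R below 1.0 the series shrinks toward zero, at exactly 1.0 it stays flat, and above 1.0 it multiplies with every generation.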

Time in Disease Models: Incubation, Latency, and Generation Time

To understand disease spread, you have to not only understand how many people it can infect, but how long it takes to do it. This section explains some basic terms for time with respect to infections.

When one or more pathogens (the infective agent, whether virus, bacteria, fungus, etc.) enters a human host, they cannot spread or cause disease immediately. The pathogen has to multiply in the body first, bypassing or overpowering the immune system, and reach some critical mass. Someone sneezes on you and eventually you start sneezing on others. The average time it takes between initial exposure and the development of symptoms is called incubation time. The time between initial exposure and when the host becomes contagious is known as latency.

Often, we assume that these numbers are the same, that is, that the disease can be spread starting when symptoms appear. This makes sense, because symptoms like coughing, sneezing, diarrhea, etc, are in fact the very tools the pathogen uses to infect people. They may not be precisely the same, however, (and may or may not be the case with Bat Soup) but that discussion is outside the scope of this article. Just keep in mind that they may be different things and plough forward for now, intrepid reader.

This concept of latency is what provides the time clock in a disease model. The time it takes for a host to be exposed, for the infectious agent to multiply in their body, and for them to infect others is the Generation Time. The generation time will tend to be a bit longer than the latency period because the disease cannot successfully spread until the host becomes infectious, comes in contact with a susceptible host, and the transmission to the new host succeeds. Combined with R, the generation time lets us figure out how quickly a disease will spread from generation to generation of the infectious agent (a virus in this case). We will make use of this number in a little bit.

The World Health Organization (WHO) has listed 4 days as the average incubation time for BSSV, with a range of 1 to 13 days. That means that if someone is exposed and has not developed the disease in 14 days, it is not considered likely that they will. This then becomes a handy number for isolating suspected cases. The generation time used by one model (see References) is either 8.3 or 6.8 days, meaning that, on average, it is thought to spread most easily a bit after symptoms first appear. The first number is the generation time measured for SARS-CoV and so it simply assumes that Bat Soup works the same way (it may not). The second number assumes that the generation time for this virus is a bit shorter. Whether or not these numbers are correct is, again, outside our current scope, but they give us good numbers to work with for our model below.
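To get a feel for how the generation time turns generations into calendar time, here is a small Python sketch. It assumes cases multiply by a fixed R every generation. The doubling-time formula is a standard consequence of that assumption rather than something taken from the sources cited here, and the R values are illustrative only.

```python
import math


def days_elapsed(generations, generation_time_days):
    """Calendar days covered by a number of disease generations."""
    return generations * generation_time_days


def doubling_time_days(r, generation_time_days):
    """Days for cases to double if each generation multiplies cases by r."""
    if r <= 1.0:
        raise ValueError("cases only double when r is greater than 1")
    # r ** (t / generation_time) == 2  =>  t = generation_time * ln(2) / ln(r)
    return generation_time_days * math.log(2) / math.log(r)


for gen_time in (8.3, 6.8):          # the two generation times quoted above
    for r in (1.5, 2.0):             # illustrative values of R, not estimates
        print(f"generation time {gen_time} days, R = {r}: "
              f"10 generations take {days_elapsed(10, gen_time):.0f} days, "
              f"cases double every {doubling_time_days(r, gen_time):.1f} days")
```

Note that when R is exactly 2.0 the doubling time is simply one generation.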

Susceptibles and Immunity

Now that we know how many people a pathogen might infect and how quickly it can do it, we need to look at who it can infect. That subject can be complex, particularly when it has to take into account prior immunity and vaccination rates, but (fortunately or unfortunately) it is much simpler with respect to Bat Soup and a population which has never been exposed to it before. The number of susceptibles, S, is initially the number of people in the population.

But what about after the disease starts to spread? In each generation of the virus, people get infected and those people either recover or do not (die). If they recover, they develop immunity (presumably) to future infections, so, either way, anyone who is infected is removed from the pool of future susceptibles. We have to track this number in our model. S(t) is the number of susceptibles at generation t.

The Reed-Frost Epidemic Model

And now we have enough pieces to get to our simple epidemic model, the Reed-Frost model of an epidemic. Wade Hampton Frost was a late 19th- and early 20th-century epidemiologist. Lowell Reed and Frost developed this model in 1928. The Reed-Frost model is a simple iterative or step-based model, easy to calculate on paper or with a spreadsheet. In the form used here, it is deterministic (not random or "stochastic"). It has a great many limitations, but is often used as a teaching model because it is easy to do and easy to play with the numbers and get instant results.

(Reed-Frost is sometimes referred to as an SIR model (Susceptible-Infectious-Removed) and is one of the simplest in a family of models known as Compartmental Models. We'll touch on this a little more in a bit.)

For many reasons, Reed-Frost is not likely to be accurate, and we'll get into some of those reasons after we explore the model itself. It will, however, visually demonstrate the pieces we have explained above given real numbers from the current outbreak and then, hopefully, give the reader some insight into the practical effect of developments in the news. This, in turn, may make people either less or more afraid, depending on whether they currently fear too little or too much... In either case, the fear will hopefully be more rational and appropriate.

[Trigger warning: equations follow - if you are arithmophobic, just close your eyes, think of England, and go on with the text (after opening your eyes again).]

The Reed-Frost model uses the following formula:
C(t+1) = S(t) * (1 - (1 - p)^C(t))
  • C(t+1) will be the number of cases for the next generation of the model.
  • S(t) is the number of susceptibles at time (generation) t. (You will need to multiply by the number of days in a generation to get a time in days.)
  • p is the probability that any given infected person will successfully infect someone else within one generation. This probability is fixed and does not change over the course of the epidemic in the Reed-Frost model!
  • C(t) is the number of (active, not total!) cases in the current generation.
The idea is that you start with the initial number of infections (say, a single individual who gets off an aircraft from another country), and an initial number of susceptibles (the whole population in our case) and use that to calculate the next generation, C(t+1). You then subtract that count from the susceptibles and do it again. And again. And again. At each generation, the number of cases increases as the number of susceptibles decreases. Eventually, the chance of an infected person successfully contacting a susceptible starts dropping sharply and the number of new infections falls off. This creates a characteristic curve we shall see below.
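For readers who prefer code to spreadsheets, the same loop can be sketched in a few lines of Python. This is a minimal illustration of the formula above, not the spreadsheet used for this article. The function name, the toy population of 100, and the p of 0.02 are made up for the example, and the initial case is taken out of the susceptible pool so the head count stays constant.

```python
import math


def reed_frost(population, p, initial_cases=1, generations=50):
    """Deterministic Reed-Frost run.

    Returns a list of (generation, susceptibles, cases) tuples.
    Generation 1 holds the initial cases, like the first row of a spreadsheet.
    """
    susceptibles = float(population - initial_cases)
    cases = float(initial_cases)
    history = [(1, susceptibles, cases)]
    for generation in range(2, generations + 1):
        # 1 - (1 - p)^C(t): the chance that a susceptible is infected by at
        # least one current case. log1p/expm1 keep precision when p is tiny.
        risk = -math.expm1(cases * math.log1p(-p))
        new_cases = susceptibles * risk      # C(t+1) = S(t) * risk
        susceptibles -= new_cases            # the newly infected leave the pool
        cases = new_cases
        history.append((generation, susceptibles, cases))
    return history


for generation, s, c in reed_frost(population=100, p=0.02, generations=12):
    print(f"generation {generation:2d}: susceptibles {s:6.2f}, cases {c:6.2f}")
```

Running it prints the rise and fall of cases alongside the shrinking pool of susceptibles, one row per generation.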

The Reed-Frost model makes a number of assumptions, including the fact that p is assumed to not change over the course of the epidemic (it does not allow for successful intervention or even changes in population density and habits within the population, say rural Alabama vs. urban California). It assumes that contact is random and the population is thoroughly mixed. Sometimes these assumptions make it pessimistic, other times optimistic, still other times just a bit off. If we keep these things in mind, it is a useful tool.

As with our discussion of R, if S(t) * p is above 1, the epidemic continues to grow. In contrast, if it is below 1, the epidemic will tend to shrink. S(t) * p models the Effective Reproduction Number or R(t) for a given generation. Given a population of 100 people, an R(t) of 2 gives a p of 2%, an R(t) of 1 gives a p of 1% and so forth, but this input number must get smaller with larger populations.
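The relationship between R and p can be checked with a couple of one-line functions. This Python sketch assumes, as the paragraph above does, that R(t) is approximated by S(t) * p. The function names are invented for illustration.

```python
def p_from_r0(r0, population):
    """Per-contact probability p that gives the requested R(0) in this population."""
    return r0 / population


def effective_r(susceptibles, p):
    """Effective Reproduction Number R(t), approximated as S(t) * p."""
    return susceptibles * p


# The 100-person example: an R(0) of 2 corresponds to a p of 2%.
p_small = p_from_r0(2.0, 100)
for susceptibles in (100, 75, 50, 25):
    print(f"S(t) = {susceptibles:3d}: R(t) = {effective_r(susceptibles, p_small):.2f}")

# The same R(0) in a much larger population needs a far smaller p.
print(f"p for R(0) = 2.1 among 331 million people: {p_from_r0(2.1, 331_000_000):.3e}")
```

In the 100-person example, R(t) falls to 1.0 once half of the susceptibles are gone, which is the point where the epidemic stops growing.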

Given those notes, we show a graph of the Reed-Frost model for Bat Soup given an initial population of 331 million, an R(0) of 2.1, a case-fatality of 3%, and a generation time of 6.8 days:

The number of cases builds slowly, the number of susceptibles falls, and they cross here on day 176.8 (generation 27). The peak number of cases is a little over 56 million with a final death toll of 8.2 million. We can see from this that, even for a disease with relatively low lethality but good ability to spread, the losses can be considerable. The number of people who are simultaneously ill can itself be "problematic" even if most of them recover. We also see, however, that the build to peak happens over almost half a year even in this dire case.

Now we look at a different case, one where the R(0) is 1.5 (the minimum WHO estimate), the case-fatality is only 1% (but still 10 times common influenza), and the generation time is 8.3 days.
In both cases, our spreadsheet takes the epidemic out 50 generations. In this second one, we see that the peak happens at well over a year (390 days, generation 48). At peak, there are just shy of 16 million simultaneous cases and a death toll (by generation 50) of 1.75 million. This kind of scenario would take into account that our health system and prevention measures would both slow the spread and produce fewer fatalities than in China.

Lastly, we produce a graph with an R(0) of 1.5, case-fatality of 2%, generation time of 6.8 days.
Here we peak at 20.7 million active cases in generation 46 (306 days) with a cost of 3.5 million lives by generation 50. We can vary the graph in a number of ways, but you can see that the curves have the same general shape.
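The three scenarios can be rerun in Python as a rough cross-check. This is a sketch under stated assumptions, not the spreadsheet behind the graphs: it starts from a single case, sets p to R(0) divided by the population, counts deaths as the case-fatality rate times all infections so far, and places generation 1 on day 0 (which is consistent with the day counts quoted above). The function names are made up, and the output may not match every figure in the text.

```python
import math

POPULATION = 331_000_000


def reed_frost_cases(population, r0, generations=50, initial_cases=1.0):
    """Cases per generation (index 0 is generation 1), deterministic Reed-Frost."""
    p = r0 / population                      # assumes R(0) = population * p
    susceptibles = population - initial_cases
    cases = [initial_cases]
    for _ in range(generations - 1):
        # C(t+1) = S(t) * (1 - (1 - p)^C(t)), written to stay precise for tiny p
        new_cases = susceptibles * -math.expm1(cases[-1] * math.log1p(-p))
        susceptibles -= new_cases
        cases.append(new_cases)
    return cases


def summarize(r0, case_fatality, generation_days, generations=50):
    """Peak timing, peak size, and deaths for one scenario."""
    cases = reed_frost_cases(POPULATION, r0, generations)
    peak_index = max(range(len(cases)), key=cases.__getitem__)
    return {
        "peak_generation": peak_index + 1,
        "peak_day": peak_index * generation_days,   # generation 1 is day 0
        "peak_cases": cases[peak_index],
        "deaths": case_fatality * sum(cases),       # assumed: CFR * all infections
    }


# (R(0), case-fatality rate, generation time in days) for the three graphs
scenarios = [(2.1, 0.03, 6.8), (1.5, 0.01, 8.3), (1.5, 0.02, 6.8)]
for r0, cfr, gen_days in scenarios:
    result = summarize(r0, cfr, gen_days)
    print(f"R(0) {r0}, CFR {cfr:.0%}, {gen_days}-day generations: "
          f"peak {result['peak_cases'] / 1e6:.1f} million cases in generation "
          f"{result['peak_generation']} (day {result['peak_day']:.0f}), "
          f"{result['deaths'] / 1e6:.2f} million deaths by generation 50")
```

Changing a scenario tuple and rerunning is the code equivalent of playing with the input cells in a spreadsheet.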

What the Model Shows Us

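From these different graphs, we can get a feel for some principles of epidemiology in a case like this. Specifically, we see that, as long as R(0) stays above 1.0, the virus will eventually touch a large share of the population if it is not actively stopped; a lower R(0) mostly changes how long that takes. The share is not identical from one scenario to the next, though. Working backward from the death tolls above, a bit over 80% of the population is infected in the first scenario, against a bit over half by generation 50 in the other two. So, for a given case-fatality rate, slowing the spread trims the final death toll somewhat, but the larger effect is to spread it over a longer period of time.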

We also see that being able to adjust the rate of spread dramatically changes the peak number of infections and the amount of time we have to come up with interventions. Having, say, 50 million people all sick in bed at one time would clearly bring many functions of society to a halt, even with a moderate cost in lives. This means, in turn, that contact tracing, appropriate travel restrictions, self-quarantine, closing schools or public events where necessary, etc., can make a phenomenal difference in the overall cost of the epidemic in both economic and human terms. At best, it can bring the R(t) to below 1.0 and actually halt the spread. According to the report I got these input numbers from, the spread of this disease must be slowed by at least 60% to halt the epidemic [Imai, et al, "Report 3", see References below]. [Update 11 March 2020: as of the time the WHO Joint Mission to China returned and published (24 Feb), this has actually been achieved in China. New cases are still occurring, but the outbreak there has stopped growing. Now China is sending a mission to help Italy.]
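The link between R(0) and the share of transmission that has to be blocked is simple arithmetic. This Python sketch assumes interventions scale transmission evenly, so that R(t) = R(0) * (1 - x) for a blocked fraction x. That is a simplification for illustration, not necessarily the method used in the report, and the R(0) of 2.5 appears only because it is the value at which the threshold works out to exactly 60%.

```python
def required_reduction(r0):
    """Fraction of transmission to block so that R(0) * (1 - x) falls to 1.0."""
    if r0 <= 1.0:
        return 0.0          # already at or below the threshold
    return 1.0 - 1.0 / r0


for r0 in (1.5, 2.1, 2.5):
    print(f"R(0) = {r0}: block at least {required_reduction(r0):.0%} of transmission")
```

The higher the starting R(0), the larger the share of transmission that has to be prevented before the outbreak stops growing.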

If spread is never halted but simply works its way through the susceptibles in the population one generation at a time, a new disease may become "endemic": it reaches an equilibrium state where immigration and births provide new hosts to balance those lost to immunity or death. Humankind deals with a number of such endemic diseases.

[Update 11 March 2020: The Flatten the Curve chart shows this same concept in very simple form. At the point where we now have community spread in the US and over 100 countries with cases globally, our chances of "stopping" the virus are close to zero. But if we can slow spread, it makes the difference between an outbreak that the US health system can keep up with and one, like Italy's, where the system is overwhelmed and people die who might otherwise be saved.]

What the Model Does Not Show But Might Be Important

As mentioned above, the Reed-Frost Model has a number of shortcomings. Better models have been produced in general and specific models are being produced in the literature for this particular virus. All of them are going to be "a bit more complicated" than what we have here. Let's briefly discuss some of the important aspects of real-world virus behavior against our crude model.

Fixed p and Nosocomial Infections

As already mentioned, p is fixed in this model. We would expect that public health efforts from the national to community to individual level would lessen the spread over time. One critical place where this matters is with so-called nosocomial infections. This is a strange word you may encounter in the news but it is really very simple: a nosocomial infection is one which occurs in a healthcare setting, anywhere from the first responder (maybe a paramedic or LPO who first discovers a victim) to the hospital ICU and everywhere in between. Paradoxically, the healthcare apparatus can be the greatest risk in combating infection. In past epidemics, healthcare workers, including first responders, paramedics, LPOs (who may be the first responder before paramedics are called), nurses, doctors, etc., have been exposed to infectious disease at rates 10 or 100 times that of the rest of the population. When these healthcare workers start to get infected and sick in numbers, it strips the population of the very people who are depended upon to protect everyone else.

This is one of the reasons that infectious disease precautions and procedures are drilled so hard into everyone in the healthcare system, even volunteers like myself who are on the very edges of the system. It is why we drill things like "gloves and masks" and proper hand-washing very hard in training (and will certainly be doing so in the coming year!) Controlling nosocomial infections has the potential to dominate the course of a disease and did so with SARS. It is also important that trained volunteers exist in the community in advance to step in as attrition reduces the number of professional responders available for routine tasks. Everyday emergencies do not simply stop during an epidemic.

Self-Protection For Communities and Families

Some of the same basic techniques, including disciplined disinfection and handwashing, also reduce R(t). Every table (or, these days, touchscreen ordering device) at a restaurant which gets disinfected, every doorknob cleaned, can stop several infections. But the best approach for the general populace may simply be to temporarily reduce contacts with others (self-quarantine) to deny the virus opportunities for transmission. A bit of preparedness, such as a well-stocked pantry, materials for temporary home-schooling, or the ability to telecommute to work, goes a long way toward making self-quarantine possible and effective.

New Interventions?

We may end up with new interventions during the course of an epidemic, such as experimental vaccines (usually prioritized for healthcare workers for the reasons given above), better antivirals, etc. All of these can change the curve we see.

To Everything There Is a Season...

The other thing this model does not show is normal seasonal variation. With 170 days or more to peak in these graphs, the yearly changes in weather and activity will affect the course of infection. Infectious droplets from coughing or sneezing are not as effective at spreading disease in the summer when people spend more time outside, the windows are open, and schools are closed. If the start of sustained spread doesn't happen until warmer weather, the progress should be considerably slower. It would, however, pick up again as schools reopen and the weather turns cold (typical flu season). Past flu pandemics, such as the 1918 Spanish Flu, progressed in waves, and this is one of several likely reasons.

Population Differences and Super-Spreaders

Some locations tend to spread illness no matter what interventions are taken or how low R(t) can be gotten outside of them. Major measles outbreaks frequently start at places like Disney World or university mega-campuses. When an infected person can come into contact with hundreds of people on a typical day, even a very low p can result in infections. Similarly, certain people (say, teachers, salespeople, paramedics, bat soup connoisseurs...) tend to be exposed to and potentially spread disease much more than the rest of the population. These sub-populations can continue to be sources of infection before an epidemic really builds and long after it wanes. The Reed-Frost model is simply not sophisticated enough to show such super-spreaders.
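The point about contact numbers follows from the same "at least one success" arithmetic that sits inside the Reed-Frost formula. In this Python sketch, the per-contact probability of 1% and the contact counts are hypothetical values chosen only to show the shape of the effect, and contacts are assumed to be independent of each other.

```python
def chance_of_at_least_one_infection(p, contacts):
    """Probability that an infected person passes the disease to at least one
    of their contacts, assuming each contact is an independent chance p."""
    return 1.0 - (1.0 - p) ** contacts


P_PER_CONTACT = 0.01   # hypothetical "very low" per-contact probability

for contacts in (5, 50, 500):
    chance = chance_of_at_least_one_infection(P_PER_CONTACT, contacts)
    expected = P_PER_CONTACT * contacts
    print(f"{contacts:3d} contacts: {chance:.1%} chance of at least one "
          f"transmission, {expected:.2f} expected new cases")
```

With only a handful of contacts a transmission is unlikely, while someone meeting hundreds of people a day is close to certain to pass the disease on, even though p is the same in both cases.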

SEIR: A Slightly More Complex Model

To see how this kind of thinking applies to a real exploration of Bat Soup, you might try looking at Wu, Leung, and Leung, "Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study" (full citation in References) to see how much you can follow along, what important concepts you can pick out.
  • What numbers do they use for R(0), for generation time, etc.?
  • What do they assume about incubation and latency?
  • How do they try to control for the success of interventions in limiting the spread of disease?
  • What kind of timeline do the authors suggest for international spread?
There are a couple of pieces of their argument that I am not sure I fully understand or fully agree with, and I would not expect someone working from just my explanations here to do more than skim. The challenge, if you accept it, would be to see whether you could understand enough to judge how the model they present could be important and what it says about potential spread.

That paper presents an SEIR model (Susceptible-Exposed-Infected-Removed; Trigger Warning: scary equations in link) to try to take into account air travel data from China to better understand the real scope of the infection inside Wuhan (including the likely very high number of cases even the Chinese authorities do not know about) and then predict spread forward in the regions outside of Wuhan in China and internationally.

SEIR is another in the family of Compartmental Models and it is usually presented as a system of Ordinary Differential Equations (ODE), requiring knowledge of calculus, which I rather wanted to avoid in the body of this article. The relationships and results are shown in decent graphs (except for the European number formatting that always takes me a bit to adjust to). At the very least, this should give you a taste for what a typical real-world publication may look like (and why most people don't read them?).
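For the curious, a bare-bones SEIR model can be stepped forward numerically without doing any calculus by hand. The Python sketch below is a generic textbook-style SEIR using small fixed time steps (Euler's method). It is not the model from the Wu, Leung, and Leung paper, which adds travel data and fitted parameters. The population size, latent period, and infectious period are placeholder values chosen for illustration only.

```python
def seir(population, r0, latent_days, infectious_days, days,
         dt=0.1, initial_infectious=1.0):
    """Step a basic SEIR model forward in time.

    Returns a list of (day, susceptible, exposed, infectious, removed) tuples.
    """
    beta = r0 / infectious_days       # new infections per infectious person per day
    sigma = 1.0 / latent_days         # rate of moving from Exposed to Infectious
    gamma = 1.0 / infectious_days     # rate of moving from Infectious to Removed
    s = population - initial_infectious
    e, i, r = 0.0, initial_infectious, 0.0
    history = [(0.0, s, e, i, r)]
    for step in range(1, int(round(days / dt)) + 1):
        new_exposed = beta * s * i / population * dt
        new_infectious = sigma * e * dt
        new_removed = gamma * i * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_removed
        r += new_removed
        history.append((step * dt, s, e, i, r))
    return history


# Placeholder inputs: 1 million people, R(0) of 2.1, 4-day latent period,
# 3-day infectious period, followed for one year.
run = seir(population=1_000_000, r0=2.1, latent_days=4.0,
           infectious_days=3.0, days=365)
peak = max(run, key=lambda row: row[3])
print(f"peak of {peak[3]:,.0f} infectious people on day {peak[0]:.0f}")
print(f"{run[-1][4] / 1_000_000:.0%} of the population removed after one year")
```

The extra Exposed compartment is what separates SEIR from SIR: people spend time infected but not yet infectious, which delays the peak.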

Conclusion, References, Further Reading

So, now that we have almost gotten to the end, hopefully you have a bit better grasp of how infectious disease spreads, perhaps enough to better understand why one outbreak may be more worrying than another, and why some developments in the news may be something that needs to be paid attention to while others can be passed over. An understanding of terms and principles can help you decide whether you need to worry and how much worry is appropriate.

Personally, however, I figure that some basic preparations and precautions are almost always justified, simply because if Bat Soup does not take wing, something else someday most definitely will. Concentrate on those preparations which will not hurt you either way and which you will eventually use regardless (say, some long-term food storage or a bottle or two of disinfectant, the means to work from home when you need to, good nutrition including vitamin C and D, some first aid training, etc.).

References and Links

  • If you want my spreadsheet for educational use, ask. I am working on adding some notes and making it a little more user friendly.

Reed-Frost and Compartmental Models

  • The Reed-Frost Model has a basic entry in Wikipedia; a better but still approachable description is in "Epidemiology - An Introduction" by Kenneth J. Rothman. 2nd Edition. Oxford University Press. Oxford. 2012 pp 118-119. Kindle edition available.
  • The Basic Reproduction Number (R(0)) and the other concepts above can also be explored on Wikipedia and are defined in Rothman 2012. Both sources also have tables of estimated R(0) for a number of diseases. I have just seen (28 January) that the Wikipedia article now includes some referenced R(0) estimates for Bat Soup, which, obviously, Rothman 2012 does not.
  • Compartmental models in general have a Wikipedia entry, including exploration of SIR and SEIR models (Calculus again!). There is also a long article/report/short book by Fred Brauer freely available (PDF): [Brauer, F. (2008). Compartmental models for epidemics. Vancouver, B.C.: Research Gate. Retrieved from]
  • The Bat Soup SEIR model I discuss above: [Wu, J. T., Leung, K., & Leung, G. M. (2020). Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study. Lancet, 6736(20).]

Numbers Used For My Bat Soup Model

  • For the R(0) and generation time numbers used above: [Imai, N., Cori, A., Dorigatti, I., Baguelin, M., Donnelly, C. A., Riley, S., & Neil, M. (2019). Report 3: Transmissibility of 2019-nCoV. London. Retrieved from, Accessed 27 January 2020]

Exploring Epidemiology

  • For more in-depth exploration of epidemiology, I strongly recommend ["Principles of Epidemiology, a Self-Teaching Guide" by Roht, Selwin, et al. Academic Press, NY., 1982.] It is one of the books I started with "in the day" and is still useful. I have recently discovered that it is available as an e-book. It provides a clear path to work through concepts, terms, and exercises, looking up the topics in other books as necessary. That is why it does not go out of date: if you use more current books and articles to look up the information to do the exercises, you will keep current with new developments. This could be a very useful approach for, say, a homeschool unit for an adventurous older student. Most of it is approachable with a strong grasp of algebra and basic statistics. The tools needed to rough out models, like spreadsheets with good built-in functions or even programming environments like Lua, are all freely available these days.
  • Epidemiology texts, resources, and papers can sometimes be awfully expensive. Being retired, I tend to look for books at library sales where extremely expensive reference books can go for a few dollars. I also periodically visit the St. John's Cancer Center Community Health Library in Springfield, MO, which has a fantastic array of resources, including references and journals. Getting a library card from them is not expensive (though I forget the exact amount currently). If you are not local, you may have similar resources in your community, such as a college or university library open to the public. Some institutions, including my college, also offer alumni JSTOR accounts with a selection of journals; it may be worth checking to see if you have access to such a program.