Ask Professor Puzzler

Coronavirus, Exponential Growth and Statistics

Posted by Professor Puzzler on March 17, 2020
Tags: exponents, coronavirus

[A note from Professor Puzzler: in addition to reading a lot of useful information about coronavirus here, don't forget that for children who are stuck at home, we have plenty of online educational resources here on this site: games, reference units, lesson plans, printable worksheets, and more.]

In the last few days I've been asked several questions that are related to the coronavirus and how it is spreading around the world. I'd like to address several of the common questions I've seen. Please note that I am neither a biologist nor an epidemiologist (although I do know how to research and learn), but many of the questions I've fielded have their roots in mathematics, which is my primary field.

But let's start with a couple of biology questions.

If I gargle with salt water or vinegar, that'll kill the virus before it makes it down into my stomach, right? After all, the virus hangs out in the throat for four days before traveling down to the stomach, right?

No, and no.

If I sip water every 15 minutes, that'll keep the virus from congregating in my throat. Instead, it'll wash the virus down into my stomach, where the stomach acid will kill it, right?

Also no.

Coronavirus actually arrived last fall. Our "bad flu season" was actually coronavirus. Right?

A big no. There has been absolutely no evidence presented that in hospitals all across the world doctors misdiagnosed coronavirus as one of the strains of influenza. This idea requires a massive, global conspiracy, or a massive level of incompetence in every doctor who saw a patient this winter.

It's interesting that in one of these scenarios, the goal is to keep the virus OUT of your stomach, while in another, the goal is to get it INTO your stomach. What makes it really interesting is that sometimes these two ideas are both shared by the SAME person on social media. Which really means people are not stopping to think about the things they claim are true.

The lack of concern with truth is -- to my mind -- far more scary than the hoarding of toilet paper. If you convince a friend that they can stop the coronavirus simply by drinking more water, that friend is more likely to put themselves at greater risk because they are paying more attention to a Facebook meme than they are to the disease experts at the CDC.

This is a situation where "Well, it might be true, so it doesn't hurt to share" is a 100% false way of thinking.

DO NOT ENDANGER LIVES BY SHARING UNSUBSTANTIATED NONSENSE ON SOCIAL MEDIA!

What does it mean that the virus is spreading exponentially?

Here's a good way of picturing an exponential function:

A man is offered a job. His boss says, "I'll either pay you $10,000 per day, or I'll pay you $0.01 the first day, $0.02 the second, then $0.04, $0.08, $0.16, doubling your pay each day. Which would you prefer?"

Most people's gut instinct is to go with the first option. But depending on how long you're planning to work the job, the second one is definitely preferable. On the 21st day you'll be making roughly the same amount as you would using the first option. After 31 days, you'll be making about $10,000,000 per day. When things get doubled every day, the values skyrocket quickly.
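If you'd like to check the arithmetic in this example yourself, here is a minimal Python sketch. The function names (daily_pay, first_day_at_least) are made up for illustration; the only inputs are the $0.01 starting pay and the daily doubling from the puzzle.

```python
def daily_pay(day: int) -> float:
    """Pay in dollars on a given day: $0.01 on day 1, doubling every day after."""
    return 0.01 * 2 ** (day - 1)


def first_day_at_least(target: float) -> int:
    """First day on which the doubling pay meets or exceeds `target` dollars."""
    day = 1
    while daily_pay(day) < target:
        day += 1
    return day


# Day on which the doubling offer catches up with the flat $10,000 per day.
print(first_day_at_least(10_000))
# Pay on day 31, in dollars.
print(f"{daily_pay(31):,.2f}")
```

Note that this compares the pay on a single day, not the running total of everything earned so far.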

Now, the number of coronavirus cases is not doubling every day in the United States. Based on the current numbers, it's doubling every two to three days*. Thus, since the United States had 3,000 cases at the beginning of the week, we would expect to see 6,000 cases by midweek, 12,000 by the beginning of the next week, and so on. If (and this is a big "if") the exponential trend continues in this way, by the middle of next month we would have about 3,000,000 cases, which is close to 1 out of every 100 people in the United States.
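Here is a rough sketch of that projection in Python. It assumes a doubling time of exactly three days (the slow end of the two-to-three-day range) and treats "the middle of next month" as roughly 30 days out; both of those are simplifying assumptions, and the function name is just for illustration.

```python
def projected_cases(start_cases: float, days: float, doubling_days: float) -> float:
    """Cases after `days`, if the count doubles every `doubling_days` days."""
    return start_cases * 2 ** (days / doubling_days)


# Starting from 3,000 cases and doubling every 3 days (an assumed value).
for days in (0, 3, 6, 30):
    print(days, f"{projected_cases(3_000, days, 3):,.0f}")
```

Changing the doubling time to 2 or 2.5 days makes the 30-day figure far larger, which is why the "if" in the paragraph above is such a big one.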

Understanding exponential growth is key to understanding the spread of viruses. And it's clear that exponential functions are not well understood. Yesterday I saw a news headline that said something like "Italy had record number of new infections yesterday." If the journalist understood exponential functions, they would not use that headline, because that is not news. "Italy did NOT have a record number of new infections" would be newsworthy. In fact, since the point where Italy had a statistically significant number of cases, there have only been two days that they didn't "break their record."

Is the growth truly exponential?

Let's talk first about why we use exponential functions as a model for infection growth, and then we'll talk about the details of why this is not sustainable in any population. Suppose you have 5 people who are infected on Day 0. Let's further suppose that each of them infects one person every five days. Then on Day 5, there will be 5*2 = 10 people. Each of those ten people infect someone else, so on Day 10, there will be (5 * 2)*2 = 5 * 2^2 = 20 people. On Day 15, there will be (5 * 2 * 2)*2 = 5*2^3 = 40.
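The day-by-day doubling above can be sketched in a few lines of Python. This is only a toy version of the scenario just described (5 people on Day 0, each infected person infecting one new person every five days); the function name is hypothetical.

```python
def infected_by_period(initial: int, periods: int) -> list[int]:
    """Infected count at Day 0 and at the end of each five-day period."""
    counts = [initial]
    for _ in range(periods):
        # Every infected person infects one new person, so the total doubles.
        counts.append(counts[-1] * 2)
    return counts


for n, count in enumerate(infected_by_period(5, 3)):
    print(f"Day {5 * n}: {count} people")
```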

In general, on day 5n, the number of people will be calculated by p(5n) = 5 * 2^n. If we do a substitution of m = 5n, we end up with the following:

p(m) = 5 * 2^(0.2m)

Thus, at any day we can get a rough predictor of how big the infection will be. Want to know how many infected people there will be on Day 100? It looks like this:

p(100) = 5 * 2^20 = 5,242,880

Now, this function does not represent growth of the coronavirus. Currently the coronavirus is doubling every 3 days, so the real situation is worse than this example.

Side note: in general, we write exponential functions in a slightly different way -- they look more like this: p(x) = ak^x, where k is (for a function that is growing) some number larger than one. This k value is very important, because a small difference in k can make a VAST difference in total infection. Consider these two functions:

p(x) = 1.2^x
h(x) = 1.3^x

If you calculate the infection after 100 days using p(x), you get p(100) = about 82.8 million. However, if you calculate using h(x), you get h(100) = about 248 BILLION. Shrinking this k value is what people are talking about when they say "flatten the curve."
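To see how sensitive the result is to k, here is a small Python sketch that evaluates both functions at 100 days. The function name and the default a = 1 are my own choices for illustration.

```python
def infections(k: float, days: int, a: float = 1.0) -> float:
    """Value of the exponential model a * k^days."""
    return a * k ** days


p_100 = infections(1.2, 100)
h_100 = infections(1.3, 100)

print(f"k = 1.2: {p_100:,.0f}")
print(f"k = 1.3: {h_100:,.0f}")
print(f"ratio:   {h_100 / p_100:,.0f}")
```

The ratio line shows how many times larger the second total is, even though k only moved from 1.2 to 1.3.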

So back to the question: is the growth truly exponential? Yes and no. The growth starts out exponential in nature. However, there are factors that can (and will) affect that. The biggest one is population size. If the population were infinite, then the growth would remain exponential. But in a finite population, the exponential growth cannot be sustained. After a while, there are many people who are infected, and so each infected person comes in contact with more infected people and fewer uninfected people. That means that as the growth of the virus builds in a population, the speed at which it grows slows down.
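One common way to model this slowdown is logistic growth, where each day's new infections are scaled by the fraction of the population that is still uninfected. That choice of model is my assumption for this sketch, and the population, starting count, and daily rate below are made-up numbers, not estimates for any real country.

```python
def logistic_steps(population: float, initial: float, rate: float, days: int) -> list[float]:
    """Daily infected totals when growth is throttled by the uninfected fraction."""
    infected = [initial]
    for _ in range(days):
        current = infected[-1]
        uninfected_fraction = 1 - current / population
        # Early on the fraction is near 1 (plain exponential growth);
        # later it shrinks toward 0 and the growth stalls.
        infected.append(current + rate * current * uninfected_fraction)
    return infected


curve = logistic_steps(population=1_000_000, initial=10, rate=0.26, days=120)
print("day 10 -> 11 growth factor: ", round(curve[11] / curve[10], 3))
print("day 100 -> 101 growth factor:", round(curve[101] / curve[100], 3))
```

The early growth factor sits close to 1.26 per day, while the late one is close to 1, which is the "yes and no" in the answer above.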

We're doing better than Italy, because we have fewer cases, right?

No. The fact that we have fewer cases is simply because the virus reached us later than it reached them. Our infection graph looks just like theirs, except it is shifted by a couple weeks. This is why so many charts are circulating that compare our statistics to Italy's with a date differential of a couple weeks.

We call this a "translation along the x-axis." It's a perfectly legitimate mathematical and scientific analysis technique, so when your friends tell you it's like "comparing apples and oranges," you can assure them it's not. Honestly, I think most people who use the "apples and oranges" argument know that's not true, but are a little too scared to admit it out loud. I could be wrong about that though -- I'm not a psychologist.

Anyway, back to the question at hand. The best measure of how we're doing is not the number of cases we have; it's the k factor in our exponential equation. Or, to describe it another way (using my original example) the best measure is our "doubling rate."

Based on the last numbers I checked, US infection has been doubling every two to three days, while Italy has been doubling every three to four days. So no, unfortunately, we are not doing better than Italy.
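The k factor and the doubling time are two ways of writing the same thing. Assuming the model p(x) = ak^x with x measured in days, a doubling time of T days means k^T = 2, so k = 2^(1/T). Here is a short Python sketch of that conversion; the function names are hypothetical.

```python
import math


def k_from_doubling_days(doubling_days: float) -> float:
    """Daily growth factor k for a given doubling time in days."""
    return 2 ** (1 / doubling_days)


def doubling_days_from_k(k: float) -> float:
    """Doubling time in days for a given daily growth factor k (k > 1)."""
    return math.log(2) / math.log(k)


for days in (2, 3, 4):
    print(f"doubling every {days} days -> k = {k_from_doubling_days(days):.3f}")
```

A shorter doubling time always means a larger k, so "doubling every two to three days" is a worse position than "doubling every three to four days."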

But we have a bigger population than Italy. Percentagewise, we're doing a lot better, right?

Sure. At the moment we have a smaller percentage than Italy. But that is only because we are running a couple weeks behind them. Remember what I mentioned above about the exponential growth being slowed because the population isn't infinite? Well guess what -- it is going to take much longer for the virus to reach that "critical mass" moment in the US because it has a larger population to work with. Let's suppose the infection graph for Italy levels off at 60% of the population. It's a reasonable expectation that the same will be true for us; the infection graph will level off at 60% of our population. It'll just take a lot longer to get there.

I understand the desire to factor a percentage into these calculations; it makes us look "better" and feel "safer" in the short term. But if you follow the equations through to their completion, that percentage will cancel itself out.

What does it mean to "flatten the curve"?

This was briefly addressed above. Flattening the curve means doing what we can to decrease the k value in our exponential function. Because k doesn't just depend on the virus -- it also depends on our actions. If we all spend all our time in large, crowded, public gatherings, then the k value will go through the roof, and it will take far less time for the virus to reach the entire population.

If we could do exactly the opposite, completely avoid one another, then each person who is infected would not be able to infect anyone else. In reality, in a society in which each person depends on many others, we cannot avoid contact. But if we can limit it, then we can significantly decrease the k value. This means that cases aren't coming in at such an alarming rate, and there's less risk of hospitals being overwhelmed (running out of supplies, personnel, time, etc.). That's what we mean when we talk about flattening the curve.

Why do some graphs show an exponential curve, and others show a bell curve?

The exponential curve (the one that doesn't go back down to zero) is a graph of people who have been infected with the virus. The bell curve (the one that goes up and then back down, making a lovely bell shape) represents people who are newly infected on a given day. Eventually, as mentioned above, the infection rate slows down, which leads to the dampening in new cases day by day.
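The relationship between the two graphs can be sketched in Python: the second curve is just the day-to-day change in the first. This sketch assumes a logistic curve for the running total (my assumption, since a purely exponential total never levels off), and every number in it is made up for illustration.

```python
import math


def cumulative_cases(day: float, population: float = 1_000_000,
                     initial: float = 10, rate: float = 0.25) -> float:
    """Total people infected so far, using an assumed logistic curve."""
    return population / (1 + (population - initial) / initial * math.exp(-rate * day))


def daily_new_cases(days: int) -> list[float]:
    """New infections on each day: the difference between consecutive totals."""
    totals = [cumulative_cases(d) for d in range(days + 1)]
    return [later - earlier for earlier, later in zip(totals, totals[1:])]


new_cases = daily_new_cases(100)
peak_day = new_cases.index(max(new_cases)) + 1
print("new cases peak on day", peak_day)
print("new cases on day 1:  ", round(new_cases[0], 1))
print("new cases on day 100:", round(new_cases[-1], 1))
```

The running total only ever goes up, while the daily count climbs to a peak and then falls back toward zero.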

In the chart that's been floating around [see above for an example chart] the numbers don't start at Day 1. Why?

That's a good question, which I've seen several people asking online. There are two reasons why the chart doesn't start on Day 1 (or patient zero, if you prefer). But before I get into this, I'd like to reiterate that it's not necessary to start at Day 1. The point of the chart is not to establish a baseline for one country, but rather, to provide a comparison between two countries when they are at a comparable stage of infection.

As I said, there are two reasons why these charts don't start at Day 1. The first is that no one knows when Day 1 is. Sure, Day 1 could be defined as the day the first patient was diagnosed, but mathematically that's not Day 1; Day 1 is the day the first infected person set foot on US soil. From a mathematical standpoint, that's really the day of most interest, and we don't actually know when that happened. For some countries, the day of first diagnosis might be very close to the actual "Day 1" but for other countries it is not.

The other reason is that what happens at the outset may not be statistically significant. Look at it this way: one of the first places coronavirus struck in the US was at a nursing home, and the mortality rate for our country was ridiculously high because of that. If that infection had happened three weeks later, when we have a much larger sample of cases to work from, that nursing home infection -- even if it resulted in the same number of infections and the same number of deaths -- would not have significantly skewed the mortality rate because it would have been a much smaller fraction of the whole.

The graph below illustrates why we tend to ignore those initial values. This is a logarithmic graph, and if a graph is exponential, we expect its logarithmic graph to be a straight line. You can see that right around the time the US hit 100 cases, the graph settled down into a consistent straight-line graph. What happened before was unpredictable and not useful.
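Here is a quick Python check of why an exponential shows up as a straight line on a logarithmic graph: if the counts double at a steady pace, their logarithms go up by the same amount at every step. The starting value of 100 and the three-day doubling time are assumed values for illustration, not real case data.

```python
import math


def log_steps(cases: list[float]) -> list[float]:
    """Differences between consecutive base-10 logarithms of the counts."""
    logs = [math.log10(c) for c in cases]
    return [later - earlier for earlier, later in zip(logs, logs[1:])]


# Made-up counts sampled every 3 days, doubling each time.
cases = [100 * 2 ** (day / 3) for day in range(0, 16, 3)]
print([round(c) for c in cases])
print([round(step, 4) for step in log_steps(cases)])
```

Equal steps on the log scale mean a constant slope, which is exactly what a straight line is.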

Another way of looking at it: if you wanted to calculate the probability of flipping heads on a fair coin, you wouldn't flip the coin 3 times, because you know that you wouldn't get accurate results that way; you'd flip the coin a hundred, or even a thousand times. The larger your sample, the more reliable the results.
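You can try the coin experiment without a coin. This Python sketch simulates a fair coin with the standard random module; the seed and the sample sizes are arbitrary choices, and the function name is made up.

```python
import random


def estimate_heads_probability(flips: int, rng: random.Random) -> float:
    """Fraction of simulated fair-coin flips that come up heads."""
    heads = sum(rng.random() < 0.5 for _ in range(flips))
    return heads / flips


rng = random.Random(2020)  # fixed seed so repeated runs give the same flips
for flips in (3, 100, 1_000, 100_000):
    print(flips, estimate_heads_probability(flips, rng))
```

With 3 flips the estimate can only be 0, 1/3, 2/3, or 1, so it can never land on 0.5 at all. The larger samples settle much closer to it.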

At the very beginning, the numbers you have are really nothing more than "noise," and they don't provide any useful statistical information.

WHO (World Health Organization) put out an early estimate of 3.4% as the mortality rate. Is this number too high?

Probably. But I'm not an expert, and I'm not going to pretend I know better than WHO. Measuring a mortality rate is complicated by things like:

You don't know about the cases that were asymptomatic (no symptoms).
You don't know about the cases that produced symptoms, but the patient chose not to seek medical help.
Until a known case is complete (either through death or recovery) it lands on the "non-mortality" side of the scale, which means it drags the rate down from what it should be.
If you choose to base your mortality rate on completed cases, at first your numbers will be way off because the cases which close quickly are the ones that close with death. Survival cases take much longer to be cleared from the active list. This skews the measurement upward significantly, at first.
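The last two points pull in opposite directions, and a small Python sketch shows how far apart the two calculations can be early in an outbreak. The snapshot numbers below are entirely hypothetical, chosen only to illustrate the arithmetic.

```python
def rate_over_all_cases(deaths: int, total_cases: int) -> float:
    """Deaths divided by every known case, including cases still active."""
    return deaths / total_cases


def rate_over_closed_cases(deaths: int, recovered: int) -> float:
    """Deaths divided by completed cases only (deaths plus recoveries)."""
    return deaths / (deaths + recovered)


# Hypothetical early snapshot: most known cases are still active.
total_cases, deaths, recovered = 1_000, 20, 30

print(f"over all cases:    {rate_over_all_cases(deaths, total_cases):.1%}")
print(f"over closed cases: {rate_over_closed_cases(deaths, recovered):.1%}")
```

Neither number is "the" mortality rate; the first is dragged down by unfinished cases and the second is pushed up because deaths tend to close out faster than recoveries.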

But there's something else to consider: these same caveats are also true of influenza. In other words, if the coronavirus mortality rate given is too high, it's reasonable to assume the influenza value is also too high. Consider this: CDC instructs people to contact their doctor if they think they might have coronavirus. But the CDC tells people that if they think they have the seasonal flu they should only contact their doctor under certain circumstances. In other words, the number of seasonal flu cases may be FAR more underreported than coronavirus cases.

Why does this matter? Because most of the time raw numbers don't mean a lot to us - they only mean something when they're given in comparison to another, better understood quantity. So the number 3.4% takes much of its meaning from the fact that we can compare it to the mortality rate for seasonal flu, which is around 0.1%. So if that's how you're using the coronavirus number, you should remember that the number you're comparing it to may also be too high.

In recent days I've seen reports that WHO has lowered their previous mortality estimate.

Why do different sources give different values for the number of infections?

Some sources are counting only confirmed cases, and some are also counting presumptive positives. A presumptive positive is a case which has been tested at a state or local lab, but has not yet been confirmed in a CDC lab. There is also another designation -- "POI" or "Person of Interest." I mention that merely for completeness; I don't think any sources are including POIs in their tallies.

Also, depending on the source you use, the numbers might not be updated on the same schedule. So someone might say, "I saw that we have 5,800 cases in the US," and you think to yourself, "I'm pretty sure I just read that there were 5,400 cases." Don't worry -- it doesn't mean you (or your friend) are consuming fake news. Of course, if your friend tells you there are only 500 cases, you might want to look into that with them!

The most important thing is not which data set you use, but that you use it consistently. If you look at one data set one day, and then the next day switch to a different one, that may very well be like comparing apples and oranges.

Does Population Density play a part in this?

Not being an epidemiologist, I won't try to give a definitive answer on this, but it does seem very likely that it does. Population density is the number of people per square mile. So, for example, New York City has a much higher population density than township T4-R9 in northern Maine (I wish there were a township R2-D2, but alas, that's not how the numbering system works). In a higher population density area, you have millions of people living in very close quarters, which makes it harder for them to distance themselves from each other. In rural areas, that distance is built into everything. One Facebook meme I saw said, "Keep six feet away from each other? That seems pretty close to us Mainers!" I have friends who will say "Let's go to the city," and to them that means a full day trip, because the city they're talking about (Bangor) is more than 3 hours south of them. These folks are likely to have already fully stocked their freezers, pantries, etc., which decreases their need to get out of the house. They have social distancing built into the fabric of their lifestyles.

Conclusion

This blog post is long enough that when I was asked to discuss in more detail the significance of total population vs percent population data, I decided to start a whole new blog post. You may find that post interesting/valuable - especially if you have friends that are telling you percent population data is the only meaningful data.

[And please don't forget our many online educational resources: games, reference units, lesson plans, printable worksheets, and more.]

* A friendly reader pointed out that I didn't source this claim about our doubling rate. Since that number is in a state of flux, and by the time you read this, it may not be accurate any more, I'll do better than source it; I'll explain how you can estimate it for yourself without too much difficulty. Go to a reputable site which contains data about coronavirus infections. I've used multiple sites for redundancy, which is good because one of the sites I used (worldometers) got hit by a cyber attack a few days ago and had incorrect data up for a short period of time. Make a list of daily numbers. Now pick a number in the list and double it. Count how many days you have to go down your list to find a value that is at least that doubled value. Do this several times. You'll find that the doubling counts mostly fall within a small range. That range is your approximate doubling range.
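The counting procedure in this footnote can be sketched in Python as follows. The function name is hypothetical, and the sample totals are made up for illustration only; they are not real case data.

```python
def doubling_counts(daily_totals: list[int]) -> list[int]:
    """For each day, count the days until the total is at least double that day's value.

    Days whose doubled value is never reached in the list are skipped.
    """
    counts = []
    for start, value in enumerate(daily_totals):
        target = 2 * value
        for later in range(start + 1, len(daily_totals)):
            if daily_totals[later] >= target:
                counts.append(later - start)
                break
    return counts


# Made-up daily totals (not real data).
sample = [100, 130, 165, 210, 270, 340, 430, 550, 700, 890]
counts = doubling_counts(sample)
print(counts)
print("approximate doubling range:", min(counts), "to", max(counts), "days")
```

With real data the counts will wobble a bit from day to day, which is why the answer is a range rather than a single number.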

Christmas Ornaments

Posted by Professor Puzzler on December 20, 2019
Tags: geometry, christmas, art

Merry Christmas! A couple years ago I mentioned here that someday I would post instructions and templates for creating various ornaments out of cardstock and/or photo paper. Considering we're just days away from Christmas, some parents may be looking for fun activities to do with their children to pass the time on their way to the most anticipated day of the year, so I have put those plans online at last!

We have a whole "Paper Craft" section now, and it includes instructions for several ornaments like the one shown here. Note that the templates are blank (without images). This allows you to add your own clipart (in case you wanted a Santa-themed ornament, or a snow-themed ornament instead of a religious-themed ornament).

And you don't need clipart at all; I printed out some blank ones and my kids loved drawing their own pictures on them.

We have the following ornament templates/instructions: a cube, a dodecahedron, and an icosahedron.

The cube is a good size to put on a tree; the other two are too large for a tree. The dodecahedron was designed as a table-top ornament, and the icosahedron as an ornament to hang from the ceiling or doorway frame.

In addition to the ornaments, there are several tutorials, in case you are interested in designing your own projects, such as the partially complete cathedral shown here.

Happy crafting, and Merry Christmas!

Stressed Word in 'Are you okay?'

Posted by Professor Puzzler on January 9, 2019
Tags: language

Jac asks: Which word is stressed in the sentence "are you okay?"

Hi Jac,

There is not a single answer to your question; which word is stressed depends on the context. It could be any of the three words, depending on the situation.

Situation #1

I'm hanging out with two friends - Joe and Moe. Joe looks really sad, so Moe says, "Are you okay?" Joe says, "Oh yes, I'm fine." But then, after Moe leaves, Joe tells me about all the horrible things that have been happening in his life. After listening to him for a few minutes, I say, "ARE you okay?" I emphasize the word 'ARE', because Joe has previously said that he is okay, and my emphasis on that word indicates that I doubt his statement about being okay.

Situation #2

Now Moe and I are in a car, and we get into a fender-bender. Moe bangs his head against the windshield, and then he looks at me and says, "Are you okay?" I reply that I'm fine, and then say, "Are YOU okay?" This time I emphasize 'YOU' because I'm thinking that I'm not the one we should be worrying about, it's Moe. It's a way of saying, "Never mind about me - you're the one we should be worrying about!"

Situation #3

This is probably the most common situation; the word 'okay' will have a natural upward inflection/stress because the sentence is a question. For a situation where the word 'okay' gets extra stress, imagine that Joe is telling me about all the terrible things that he's gone through, and I ask, "Are you OKAY?" (I'd be more likely to add the word 'but': "But are you OKAY?"). Emphasizing the word 'OKAY' in this situation might be a way of implying that I have doubts whether he's handling the situations in a healthy way.

There are a lot of sentences that can be stressed in different ways depending on the context. In most cases, we do it automatically without stopping to think about how we're stressing the words!

Thanks for the question, Jac.

Tricks for Factoring Three Digit Numbers

Posted by Professor Puzzler on November 16, 2018
Tags: math

"Are there any tricks that can help you easily factor three digit numbers (without using a calculator)?" ~Jay

Hi Jay, I assume you're talking about tricks besides the normal divisibility rules (for example, if the digits add to a multiple of three, the number is a multiple of three; if it ends in 0 or 5, the number is divisible by 5; etc.). If you're not familiar with those rules, you might want to take a look at this unit here: Divisibility Rules.

Beyond that, there are some tricks that sometimes help. Here's my favorite. Let's say you wanted to factor the number 483. Here's what I would do:

Multiply the first and last digit: 4 x 3 = 12
Find two numbers that multiply to 12 and add to the middle digit (8). The numbers are 6 and 2 (6 + 2 = 8 and 6 x 2 = 12).
Now rewrite the number using those two numbers we just found: 483 = 460 + 23 (the tens place got split into two pieces using our numbers, and the entire number was rewritten as a sum of two numbers).
Now factor the result: 460 + 23 = 23(20 + 1) = 23 x 21.
Finish factoring: 23 x 7 x 3
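The steps above can be sketched in Python. This is my own restatement of the trick, not a standard algorithm: if the digits are a, b, c and the two numbers are p and q (with p + q = b and p x q = a x c), then (10a + p)(10a + q) works out to a times the original number, so dividing the factors of a back out leaves a factor pair. The function name is hypothetical, and the sketch only covers the case that needs no regrouping.

```python
from math import gcd
from typing import Optional


def split_middle_digit(n: int) -> Optional[tuple[int, int]]:
    """Return a factor pair of a three-digit n using the trick, or None if it fails."""
    if not 100 <= n <= 999:
        raise ValueError("expected a three-digit number")
    a, b, c = n // 100, (n // 10) % 10, n % 10
    for p in range(b + 1):
        q = b - p  # p and q add to the middle digit
        if p * q == a * c:  # and multiply to (first digit) x (last digit)
            # a * n == (10a + p) * (10a + q); share the factors of a between them.
            g = gcd(a, 10 * a + p)
            return (10 * a + p) // g, (10 * a + q) // (a // g)
    return None


print(split_middle_digit(483))
print(split_middle_digit(648))
```

For 483 this finds the pair 21 and 23, and from there you finish factoring as in the last step. A result of None means no two numbers fit, so the number would need regrouping (or a different approach).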

Unfortunately, this doesn't always work. For example, it won't work for 648, because you can't find two numbers that add to 4 and multiply to 48. But maybe if we can find a way of regrouping this number, we might get around that. My first thought is to pull out one of the "hundreds" and put it into the tens place. So we're thinking of 648 as being rewritten 5(14)8. Now we do 5 x 8 = 40, and realize that our two numbers must be 4 and 10 (4 + 10 = 14 and 4 x 10 = 40). So we rewrite the number, putting the 10 tens with the 5 hundreds and the 4 tens with the 8: 600 + 48 = 24(25 + 2) = 24 x 27. Then we just finish the prime factorization from there.

If the number is one of those special numbers (like 483) that can be factored without regrouping, it's a straightforward, foolproof process. But if the number has to be regrouped, it requires a bit of intuition to work it out. However, if you don't have a calculator, it might be worth doing!

Rationalize a Denominator with a Cube Root

Posted by Professor Puzzler on September 27, 2018
Tags: math

Thabang from Lesotho writes, "how do we rationalize a denominator consisting of a cube root with another constant added to it or subtracted from it?"

Good morning, Thabang, and thank you for your question. This is actually something I don't remember ever seeing before, so I had to give it some thought before answering.

What you're looking for is, how do we rationalize the denominator, if the denominator is something like "The cube root of three, plus two" or "the cube root of three, minus two"?

In order to solve this, it's important to remember two factoring rules you may have learned in an Algebra class:

x³ + y³ = (x + y)(x² - xy + y²)
x³ - y³ = (x - y)(x² + xy + y²)

Let's say your denominator is the cube root of three, plus two. Then I'm going to do the following substitutions:

Let x = the cube root of three, let y = 2.

Now your denominator is x + y, and if you multiply the numerator and denominator of the fraction by (x² - xy + y²), you will have turned the denominator into x³ + y³ = 3 + 8 = 11, which is rational.

That was using the first factoring rule shown above. If the denominator had a subtraction (the cube root of three, minus two), we'd just use the second factoring rule, and multiply by (x² + xy + y²).
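If you want to convince yourself numerically, here is a small Python check using 1 as the numerator (an arbitrary choice for illustration). It compares the original fraction with the rationalized version for both the "plus two" and "minus two" cases; the function names are made up.

```python
x = 3 ** (1 / 3)  # the cube root of three
y = 2


def rationalized_sum() -> float:
    """1 / (x + y), rewritten with denominator x^3 + y^3 = 3 + 8 = 11."""
    return (x**2 - x * y + y**2) / 11


def rationalized_difference() -> float:
    """1 / (x - y), rewritten with denominator x^3 - y^3 = 3 - 8 = -5."""
    return (x**2 + x * y + y**2) / -5


print(1 / (x + y), rationalized_sum())
print(1 / (x - y), rationalized_difference())
```

Each line prints two values that agree to within floating-point rounding, which is what we expect if the multiplication was done correctly.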

Thanks again for asking, Thabang.
