As someone with too many GIS degrees, I feel a level of cathartic release in reading this and thinking that laypersons might be able to improve their map-making skills, avoiding some of the more serious cartographic gotchas. It was well written. The beauty of the ubiquity and greatly improved UX of modern GIS tools is that everyone can dive into doing geospatial analysis and building static and dynamic maps. It also means people can accidentally author very misleading visualizations.
Despite this ESRI-backed article on the subject, I think the popular ESRI-driven map dashboard for Coronavirus[1] has a major flaw that violates the crux of this article. Dot density maps _MUST_ be set to scale relative to your map scale, or else you get nightmare scenarios like this one[2]. This is doubly true if the dots are varying in size (which I also think is a fundamentally terrible representation, because people suck at mentally comparing areas). If I were to modify it, I would probably use a choropleth-like representation. Keep the dots equally sized and colour them different shades of red. That way nobody's brain will mislead them into thinking "this larger circle means a larger area is all infected."
As another commenter pointed out, geographic projections are well-known in the infoviz literature to be problematic, for a number of reasons.
Further, two-dimensional area (circles) is a particularly terrible dimension to marry to geography, because it’s an additional (and therefore competing) spatial dimension. Color is better, but still has problems (compare Rhode Island to Texas, or populous New Jersey to unpopulated Wyoming/Alaska). And color forces you to bin, which can be misleading. Choropleths are still harder to read (compare California to Maine: they don’t share an axis and are irregular shapes, making it hard to compare their areas) than a bar graph or histogram.
IMO a logarithmic bar graph is the most reasonable choice. If you want to include population density, I’d encode it with opacity and one-dimensional space (a shaded-in bar representing infection, a dark bar representing mortality, and a transparent bar representing total population). If geographic projections are that important to you, you can superimpose those bars on countries. It sucks but it gives you geographic scale. If anyone wants to build this graph, please include an adjacent, different-hued bar encoding the number of tests performed thus far.
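For anyone who wants to try the logarithmic bar idea, here is a rough text-only sketch in Python. The region names and counts are made up purely to span several orders of magnitude, and `log_bar` and `chars_per_decade` are names invented for this example, not part of any library.

```python
import math

# Hypothetical counts, chosen only to span several orders of magnitude.
counts = {"Region A": 65000, "Region B": 1200, "Region C": 35, "Region D": 1}

def log_bar(value, chars_per_decade=8):
    """Return a text bar whose length grows with log10(value)."""
    if value < 1:
        return ""  # a log scale cannot show zero, so leave the bar empty
    # +1 so that a count of 1 (log10 == 0) still gets a visible mark
    return "#" * (int(math.log10(value) * chars_per_decade) + 1)

for name, value in counts.items():
    print(f"{name:<10}{value:>7} {log_bar(value)}")
```

On a linear scale the three smaller regions would be invisible next to Region A. On the log scale every factor of ten adds the same bar length, so the small values stay readable.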
Personally I'd use an equal-population cartogram like https://go-cart.io/cartogram instead of a geographic projection, and a dot map or solid colors based on density.
And per the terminology in the article, that ESRI map uses proportionally scaled symbols, not dots.
> Dot density maps _MUST_ be set to scale relative to your map scale, or else you get nightmare scenarios like this one[2]
There are a lot of flaws with the visualizations of the infections. But using choropleth representations would need a population reference, no? I'm genuinely curious, should the range be reflective of the population with the series the infections?
It could then be enhanced with deaths per infections in certain regions, which could be further enhanced with distance to hospitals.
I sound like an ass right now but the data is here, we should use it properly to help, and with people like yourself, maybe it would be better than what's being given right now.
I was surprised when you said that. Recently I was doing some georeferencing of historical maps on top of current maps, and I was very disappointed with the choices.
That was my first experience with GIS toolkits. I tried ArcGIS, QGIS and a few lesser-known ones. I was looking for a good UI/UX, partly because the goal was to teach a non-technical acquaintance to do it. There was a shareware toolkit that was exclusively for georeferencing that had a very satisfactory UI, but it had a blocking bug that would cause it to crash.
I found this article to be very informative. Most of us don't have an understanding of how nuanced (or not) using maps to display data can be.
I wonder if someone with the proper credentials could contact the creator of this website [0] with advice. It seems like a good idea and resource but the map bothered me the very first time I saw it. The color coding is simply wrong and it communicates something that does not align well with reality.
I can't count the number of times I've looked at a data visualization and wished I could sit down with the person who made it and read an Edward Tufte book to them. There are just so few good examples out there of data visualizations that respect basic principles of visual communication, like the ones outlined in this article. They generally seem to aim more for visual impact (like the useless 3D display in the article, which you've gotta admit is striking) than for clarity, which I guess is understandable but is still too bad.
(And as long as I'm griping, don't get me started on all the people who think a wall of text slapped into a PNG constitutes an "infographic.")
I think Tufte is very overrated. People who are reasonably comfortable with data would rather have it in a basic format. Tufte-style often takes a lot of effort to produce and the payoff isn’t there. Consultants love it though because they can bill their clients for playing around for hours with charts.
Tufte, as another user correctly pointed out, is massively overrated. He is also incredibly thin-skinned and often blocks people, many of whom work in data visualization, if they criticize or critique his work or ideas.
Alberto Cairo is just as overrated, but he is more receptive to feedback.
It seems like you are complaining about graphics in the article, but I'm not sure. If you read the article, it specifically talks about why those are not good visualizations and gives pointers on developing good ones.
For the 3D one specifically, right under the graphic, the article says:
"3D has a time and a place. It can be a really useful way to encode thematic data on the z-axis and make something useful. But extruding Hubei compared to the rest of the areas just doesn’t work. It’s gratuitous and adds nothing. It’s really hard to make any sense of relative amounts and that’s before we even deal with foreshortening and occlusion."
Very nice and well-written writeup. Here's one graf that randomly provoked some thoughts:
> But looks can be deceptive. The fact that it looks okay is hiding a dark secret that, if you’re not aware of the fact, won’t even get noticed. The map is using totals (absolute values). There are very very few golden rules in cartography but this is one of them: you cannot map totals using a choropleth thematic mapping technique. The reason is simple. Each of our areas on the map is a different size, and has a different number of people in it. These two innate characteristics of all thematic maps means you simply cannot compare like for like across the map.
> The label tells us that Hubei region has over 65,000 cases of coronavirus. It sounds a lot. But does Hubei have 100,000 people, or possibly 100,000,000 people living there?
I definitely agree with the author: that there are very few "golden rules" in visualization, and that not depicting absolute numbers in a choropleth map is one of them. However, the author does an excellent job (with a bar chart and revised map) showing how this anti-pattern severely obfuscates how much the Hubei region is an extreme outlier.
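To make the totals-versus-rates point concrete, here is a small Python sketch. The case total for "Region A" echoes the roughly 65,000 figure quoted above. Every population and every other number is invented for illustration, and `cases_per_100k` is just a helper name for this example.

```python
# Hypothetical regions: only the 65,000 case total echoes the quote above;
# all populations and the other case counts are invented for illustration.
regions = [
    {"name": "Region A", "cases": 65000, "population": 50_000_000},
    {"name": "Region B", "cases": 1300, "population": 100_000_000},
    {"name": "Region C", "cases": 600, "population": 400_000},
]

def cases_per_100k(cases, population):
    """Normalize a raw case count to a rate per 100,000 residents."""
    return cases * 100_000 / population

# Ranking by raw totals (what a totals choropleth effectively shades by).
by_total = [r["name"] for r in sorted(regions, key=lambda r: r["cases"], reverse=True)]

# Ranking by rate (what a normalized choropleth shades by).
by_rate = [
    r["name"]
    for r in sorted(
        regions,
        key=lambda r: cases_per_100k(r["cases"], r["population"]),
        reverse=True,
    )
]

print("by total:", by_total)
print("by rate: ", by_rate)
```

With these made-up numbers the small Region C has the fewest cases but the highest rate, so the two maps would shade it at opposite ends of the color ramp.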
IMO, during the first half of an epidemic, when a small portion of the population is infected and infections are growing exponentially, it makes sense to use the absolute number of infections. Then later when infection is widespread and the curve looks logistic, it makes sense to give the proportion of infected. I think we are clearly in the first half when it comes to coronavirus.
> this anti-pattern severely obfuscates how much the Hubei region is an extreme outlier.
For this reason I think the 3D projection graph is actually not as bad as made out. Sure, it's hard to tell anything about any of the other provinces compared to Hubei, but it really highlights the difference between there and other provinces.
The difficulty I always thrash around with is: proportional by area or proportional by population? I used to do some crime maps and some areas would look quite crime-ridden ... because they were areas with very little population, as the census counts it, like parks and such, so the crime would look rather high. So dividing by population isn't the cure-all, but it beats nothing. For giggles, I would do crimes in a given region, crimes in a region divided by that area, and crimes in a region divided by the population in that region. Very different-looking results.
I have often considered dividing by some kind of combination of area and population, but even that seems not quite right. Disregarding "victimless crimes," much crime is interactive: two or more parties must be involved, therefore the population ought to have some kind of exponent attached to it, like particles bouncing against one another in a container.
I never did puzzle this out, I am sure brighter minds than I would have come to some conclusions.
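The three views described above (raw counts, counts per area, counts per resident) are easy to compare side by side. This is only a sketch in Python with invented tracts. The names, counts, areas and populations are all hypothetical, and `rank` is a helper made up for the example.

```python
# Hypothetical tracts, invented so that each normalization picks a different "worst" area.
tracts = [
    {"name": "Downtown", "crimes": 400, "area_km2": 2.0, "population": 20000},
    {"name": "Suburb", "crimes": 500, "area_km2": 50.0, "population": 60000},
    {"name": "Park", "crimes": 12, "area_km2": 4.0, "population": 40},
]

def rank(rows, metric):
    """Return tract names ordered from highest to lowest value of `metric`."""
    return [t["name"] for t in sorted(rows, key=metric, reverse=True)]

by_count = rank(tracts, lambda t: t["crimes"])
by_area = rank(tracts, lambda t: t["crimes"] / t["area_km2"])
by_population = rank(tracts, lambda t: t["crimes"] / t["population"])

print("raw count:   ", by_count)
print("per km^2:    ", by_area)
print("per resident:", by_population)
```

The park, with only 40 census residents, tops the per-resident ranking despite having the fewest incidents, which is exactly the artifact described above.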
The answer ends up being: Why not both? Both are useful but tell us different things.
For COVID-19, the number of infections has real meaning regardless of its proportion of the population: each is an instance of a viral infection that can spread that infection further. But proportion infected can give information about virulence, etc.
The cases-per-capita statistic is silly. It might make sense when the virus is common around the world, but while it's still spreading, the number of cases itself is more important. For example, if you detected 100 people infected with a virus in some region, does it matter if it has 200 million people (Uttar Pradesh) or 10 million (Lombardy)? These political divisions are arbitrary anyway.
This is interesting but sadly their service is not accessible in Iran, one of the hardest hit countries by Covid-19, not due to censorship by the Iranian government, but due to server-side blocking of IP addresses originating from Iran. The reason: US sanctions!
Email me if you need a rehost in EU - We can do a stream of the content from JH and feed it through, it might be ~2 seconds behind but if you guys are stuck for info it might be better than nothing.
Also worth considering whether you really need to aggregate all cases in the same province. If you can get higher-resolution data, use it. (E.g. for each prefecture in Hubei province: https://news.sina.cn/project/fy2020/yq_province.shtml?provin... Their visualization isn't great, but someone else could use their data to do a better job.)
OP wants to give a lesson about mapping yet includes Taiwan in a map of China. Seriously? He should learn about geography and country borders first before writing any blog post.
How about something constructive? Please write an uncontroversial blog post explaining "country borders". It should be easy since there's never once been disagreement on the subject in all of human history.
That is a BIG problem with this article - it looks like Chinese efforts to infiltrate all maps with their idea of Taiwan are pervasive and this author has unwittingly used their version of map data.
A map seems a terrible base layer for any information that isn’t trying to show proximity or proportional landmass. Seems ridiculous to mess around with talking about a projection when it’s still showing provinces related to their shape and size, which is worthless information here. Why not just use a population cartogram as the base?
If anyone from Esri ever reads these comments, please for the love of maps stop using scroll wheel and pinch to move maps north-south. Nobody, literally nobody, has ever wanted that, literally never.
I was thinking today that mapping the route of the virus would be a good use of the location data that's recorded from our phones' GPS.
Hypothetically: if all infected people submitted their map data for the last few days (anonymously - no need to identify people) and all of that data was plotted over maps, you could identify the routes and direction of infections.
I don't know if it would be anything more than an interesting visualisation of the data already collected, but the comment mentioning Edward Tufte really got me thinking how to visualise the data we have properly.
We haven't seen something spreading like this in my lifetime anyway and at the same time, we've never had so much data on ourselves in my lifetime either, might be a good time to put it to good use for once.
Incidentally, I made a demo app with proportionally sized circles like they suggest, and it lets you move day by day to see the progression. https://coronaprogress.com/
Could you allow the viewer to select the color of the case indicator? Or maybe just add a contrasting outline on the circles? I'm a not-at-all-uncommon type of colorblind, and I find it very, very difficult to make out red dots on a green satellite image.
The specific type of coronavirus is clear from context. Language would be very tiresome if we had to give a full taxonomy for every term we use.
For instance, I say "I saw a fox outside my house yesterday". I don't need to specify the exact species of fox, and anybody who knows that I live in Europe will know that I mean Vulpes vulpes.
I'm unclear as to whether we should be seriously concerned about Coronavirus in the US at this point. Are there preparations I should be making or precautions I should be taking? People have been WhatsApping me articles about face mask shortages, but I don't know if this is just scaremongering.
[+] [-] Waterluvian|6 years ago|reply
[1] https://gisanddata.maps.arcgis.com/apps/opsdashboard/index.h...
[2] https://imgur.com/NPhEzk7
[+] [-] gen220|6 years ago|reply
[+] [-] Mathnerd314|6 years ago|reply
[+] [-] bilekas|6 years ago|reply
[+] [-] kevin_thibedeau|6 years ago|reply
https://jagjapan.maps.arcgis.com/apps/opsdashboard/index.htm...
[+] [-] dmos62|6 years ago|reply
[+] [-] robomartin|6 years ago|reply
[0] https://covid19info.live/
[+] [-] smacktoward|6 years ago|reply
[+] [-] bitxbit|6 years ago|reply
[+] [-] piffey|6 years ago|reply
https://medium.economist.com/mistakes-weve-drawn-a-few-8cdd8...
[+] [-] catacombs|6 years ago|reply
[+] [-] taeric|6 years ago|reply
[+] [-] ubertakter|6 years ago|reply
[+] [-] danso|6 years ago|reply
[+] [-] djannzjkzxn|6 years ago|reply
[+] [-] platz|6 years ago|reply
If you start moving to things like per-capita, I actually think that has the potential to be more confusing for more people.
So yes, absolute values will be highly correlated with population, but again it just depends on what you really want to highlight and communicate.
Maybe you really do want the absolute value.
[+] [-] taneq|6 years ago|reply
[+] [-] at_a_remove|6 years ago|reply
[+] [-] SubiculumCode|6 years ago|reply
[+] [-] heartbeats|6 years ago|reply
Population squared? That gives you the number of potential connections.
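A quick way to sanity-check the "population squared" intuition: the number of distinct pairs among n people is n(n-1)/2, which grows roughly with n squared. A minimal Python sketch, where `potential_pairs` is a name made up for this example:

```python
def potential_pairs(population):
    """Number of distinct unordered pairs among `population` people: n * (n - 1) / 2."""
    return population * (population - 1) // 2

small = potential_pairs(1000)
large = potential_pairs(2000)

print(small, large, large / small)
```

Doubling the population roughly quadruples the number of potential pairwise interactions, so a normalization based on pairs penalizes dense areas far less than plain per-capita division would.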
[+] [-] Grue3|6 years ago|reply
[+] [-] namirez|6 years ago|reply
https://twitter.com/ARTICLE19Iran/status/1231895623789576192...
[+] [-] bilekas|6 years ago|reply
[+] [-] yorwba|6 years ago|reply
[+] [-] heartbeats|6 years ago|reply
[+] [-] tasogare|6 years ago|reply
[+] [-] jobigoud|6 years ago|reply
[+] [-] bagacrap|6 years ago|reply
[+] [-] sleepytimetea|6 years ago|reply
[+] [-] panic|6 years ago|reply
[+] [-] peteretep|6 years ago|reply
https://m.facebook.com/457568574373257/posts/this-is-a-popul...
[+] [-] ggm|6 years ago|reply
Because Mercator.
Likewise, in Soviet Russia, map owns you. How much of the Cold War might have been put back to bed by a better map projection?
[+] [-] thedance|6 years ago|reply
[+] [-] bilekas|6 years ago|reply
[+] [-] user5994461|6 years ago|reply
I am gonna update the numbers for today.
[+] [-] arkades|6 years ago|reply
[+] [-] JoshTko|6 years ago|reply
[+] [-] alanh|6 years ago|reply
[+] [-] mantap|6 years ago|reply
[+] [-] hackinthebochs|6 years ago|reply
[+] [-] pnako|6 years ago|reply
https://www.cdc.gov/eis/field-epi-manual/chapters/Describing...
[+] [-] aeontech|6 years ago|reply
There's an H1N1 model here [1] that one could use as a starting point, I imagine?
[0] http://www.gleamviz.org
[1] http://www.gleamviz.org/simulator/models/
[+] [-] ohmyblock|6 years ago|reply
[+] [-] xwowsersx|6 years ago|reply
[+] [-] aaomidi|6 years ago|reply
There's really nothing to say that the same situation happening in China, Iran, SK, and Italy won't happen here.
Have a supply of food ready, minimize being in crowds, don't touch your face when you're not inside the house.
Other stuff I've been doing that isn't necessarily the right thing:
- Eating meat well done for a while
- Not eating raw veggies
- Working from home more often
- Telling sick co-workers to stay home (I'm in a tech company, there's really no excuse of sick days)
[+] [-] pwg|6 years ago|reply
https://news.ycombinator.com/item?id=22425593
[+] [-] dredmorbius|6 years ago|reply
- Wash your hands.
- Cover your cough.
- Stay home.
(Last: if you're sick, if the outbreak is local, if you don't absolutely need to be somewhere.)
Ready: Pandemic preparations: Community mitigation guidelines to prevent pandemic influenza https://www.ready.gov/pandemic
Before a Pandemic
- Store a two week supply of water and food.
- Periodically check your regular prescription drugs to ensure a continuous supply in your home.
- Have any nonprescription drugs and other health supplies on hand, including pain relievers, stomach remedies, cough and cold medicines, anti-diarrhoeal medication, fluids with electrolytes, and vitamins.
- Get copies and maintain electronic versions of health records from doctors, hospitals, pharmacies and other sources and store them, for personal reference.
- Talk with family members, loved ones, neighbours, co-workers, and other frequent contacts, about how they would be cared for if they got sick, or what will be needed to care for them in your home.
During a Pandemic
Limit the Spread of Germs and Prevent Infection:
- Avoid close contact with people who are sick.
- When you are sick, keep your distance from others to protect them from getting sick too.
- Cover your mouth and nose with a tissue when coughing or sneezing. It may prevent those around you from getting sick.
- Wash your hands frequently to help protect you from germs.
- Avoid touching your eyes, nose or mouth.
- Practice other good health habits. Get plenty of sleep, be physically active, manage your stress, drink plenty of fluids, and eat nutritious food.
Adapted from: <https://www.ready.gov/pandemic>
(Most of the preparatory advice will be familiar to Bay Area residents as typical earthquake preparedness. Elsewhere it's standard preparation for major winter storms or hurricanes. Be prepared to sit tight for a few weeks.)
US CDC medical travel advisories: https://wwwnc.cdc.gov/travel/notices
United States, 2017. Draws on ~200 journal articles written 1990 - 2016. Provides a framework on response strategy to COVID-19. https://www.cdc.gov/mmwr/volumes/66/rr/rr6601a1.htm
[+] [-] Eliezer|6 years ago|reply