12 Days of Data Analytics: Day 6 – Stay Colour Safe
In a previous job at a large multi-national company I was developing a piece of software to visualise the internal workings of a machine learning solution that distinguished between good and bad parts on a production line. I had used a neat multi-dimensional visualisation based on a force-directed layout that showed datasets composed of good and bad parts, coloured green (good) and red (bad). It looked something like the image below.

One week senior execs from the US were coming to Ireland to visit us and I was rolled out to demo the great new software I had been developing. I got about 2 minutes into the demo before I showed the first example visualisation. At this point one of the guests interrupted to tell me that he was colour blind, so my visualisation was useless to him as he could not distinguish between red and green. Unfortunately that was the end of my presentation.
This was a long time ago and at the time I had never really thought too much about colour blindness, but the visitor from the US made a very good point. VisCheck is a great online tool that allows the effects of colour blindness to be simulated. The image below shows a simulation of what someone with red/green colour blindness (the most common form) would see when looking at the image above. Obviously the image isn’t terribly useful!

People with colour blindness have difficulty differentiating between certain colour hues (there is a lot of very good information about colour blindness at www.colourblindawareness.org). This doesn’t mean that they cannot see colour at all, just that there are certain colours between which it is difficult to distinguish. The most common forms of colour blindness (deuteranopia and protanopia) lead to difficulties distinguishing between red and green, while a less common form (tritanopia) leads to difficulties distinguishing between blue and yellow.
Colour blindness affects about 8% of men and about 0.5% of women, so it is something that we should take seriously when designing data visualisations. There are three easy things we can do to mitigate the effect of colour blindness on people’s ability to read a data visualisation. The first, if we really must use a colour scheme that is not colour blindness safe, is to use a redundant visual encoding such as shape in our visualisations. The images below show a version of the scatter plot above where bad instances are shown as red circles and good ones are shown as green stars. Someone who cannot distinguish between the colours should still be able to distinguish between the shapes (although shape differences are not as effective when points start to get crowded and overlap).
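As a rough illustration of redundant encoding, the sketch below pairs every class with both a colour and a marker shape, so the class can still be read when the colours collapse into each other. It is written in Python and assumes matplotlib for the actual plotting. The names `CLASS_STYLES`, `style_for` and `plot_parts` are made up for this example and are not taken from the original tool.

```python
# Each class gets a colour AND a marker, so shape carries the same
# information as colour (a redundant encoding).
CLASS_STYLES = {
    "good": {"color": "green", "marker": "*"},  # green stars
    "bad": {"color": "red", "marker": "o"},     # red circles
}


def style_for(label):
    """Return the colour/marker pair for a class label."""
    return CLASS_STYLES[label]


def plot_parts(points):
    """Scatter-plot (x, y, label) tuples, one series per class.

    matplotlib is imported here so that the style mapping above can
    be used without it installed.
    """
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for label, style in CLASS_STYLES.items():
        xs = [x for x, _, lab in points if lab == label]
        ys = [y for _, y, lab in points if lab == label]
        ax.scatter(xs, ys, c=style["color"], marker=style["marker"], label=label)
    ax.legend()
    return fig
```

Calling `plot_parts([(0.1, 0.2, "good"), (0.4, 0.9, "bad")])` (hypothetical points) returns a figure in which the two classes differ in shape as well as colour.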


A more useful thing to do is simply to use a colour blindness safe palette. Most design tools provide these. ColorBrewer, a tool we mentioned in a previous post, will also generate colour blindness safe palettes. One of these is shown in the image below, again with a colour blindness simulation from VisCheck.
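A minimal sketch of how such a palette might be wired into code follows. The hex values are the three-class version of ColorBrewer's qualitative "Dark2" scheme, which ColorBrewer lists as colour blindness safe at that size (worth confirming on the site before using larger sizes). The function name `palette_for` and the hard three-colour cap are assumptions made for this example.

```python
# Three-class version of ColorBrewer's qualitative "Dark2" scheme.
DARK2_3 = ["#1b9e77", "#d95f02", "#7570b3"]


def palette_for(categories):
    """Map each distinct category to one colour from the safe palette.

    Raises ValueError when there are more categories than colours,
    rather than silently reusing or inventing colours.
    """
    distinct = sorted(set(categories))
    if len(distinct) > len(DARK2_3):
        raise ValueError(
            f"{len(distinct)} categories but only {len(DARK2_3)} safe colours; "
            "add a redundant encoding such as shape"
        )
    return dict(zip(distinct, DARK2_3))
```

For the good/bad example, `palette_for(["good", "bad", "good"])` gives `{"bad": "#1b9e77", "good": "#d95f02"}`. Failing loudly when the palette runs out is a deliberate choice here: it prompts a switch to a redundant encoding instead of quietly adding colours that may not be safe.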


And here is a version of the multi-dimensional visualisation tool reworked to use colour blindness safe colours instead of red and green.


The examples so far show categorical data using easily distinguishable colours in a multi-chromatic palette. When showing quantitative data using colour we should, for clarity, almost always use a mono-chromatic palette. An extra advantage of mono-chromatic palettes is that people with colour blindness have no problems at all with them. The images below show more examples from ColorBrewer, again using VisCheck to simulate red/green colour blindness. While some of the red and green shades in the earlier VisCheck simulations began to look very similar, the mono-chromatic palettes in these images cause no problems at all.
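To make the idea concrete, here is a small sketch that bins a numeric value into one of five shades of a single hue. The hex values are ColorBrewer's five-class sequential "Blues" scheme. The function name `shade_for` and the equal-width binning are assumptions for this example, not something prescribed by ColorBrewer.

```python
# ColorBrewer's five-class sequential "Blues" scheme, light to dark.
BLUES_5 = ["#eff3ff", "#bdd7e7", "#6baed6", "#3182bd", "#08519c"]


def shade_for(value, lo, hi):
    """Bin a numeric value in [lo, hi] into one of the five shades."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    # Clamp so out-of-range values fall into the end bins.
    fraction = min(max((value - lo) / (hi - lo), 0.0), 1.0)
    # A fraction of exactly 1.0 would index past the end, hence the min().
    index = min(int(fraction * len(BLUES_5)), len(BLUES_5) - 1)
    return BLUES_5[index]
```

With a range of 0 to 100, `shade_for(0, 0, 100)` returns the lightest shade `"#eff3ff"`, `shade_for(50, 0, 100)` returns the middle shade `"#6baed6"`, and `shade_for(100, 0, 100)` returns the darkest shade `"#08519c"`. Because only lightness changes, the ordering survives a red/green colour blindness simulation.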




So, to avoid making the same mistake that I did, and blowing a big demo, make sure to do the following:
- Use colour blindness safe palettes wherever possible
- If you cannot use a colour blindness safe palette (it is hard to create colour blindness safe palettes with more than about 5 colours) use redundant encodings
- Use mono-chromatic palettes for numeric data
- Use tools like VisCheck to test whether your visualisations are colour blindness safe