I first saw a watercolour regression in a paper on wind prediction for wind energy by my colleague Andrea Staid. She had a neat figure that shows a probabilistic prediction band. I want something similar for a research project, and while looking I came across the work of Schönbrodt and Hsiang.
Seriously, how cool is that?
Basically, the figure on the left shows the scatter with bands at one, two, and three standard deviations away from the median. The figure on the right shows the same thing with some smoothing.
But! Their code is in R and MATLAB, respectively, and I want it in Python. A quick Google didn’t turn anything up (except this Stack question).
I’m going to start off easy and not run any regression. I just want a line which goes through the median value and has bands at the 25th and 75th percentiles, the 10th and 90th, and the 5th and 95th. A few options from my attempts are below. Option 1 (top left) uses the colour palette from Schönbrodt’s figures: three distinct, albeit similar, colours. Option 2 uses different transparencies of the same colour. The final option reverts the scatter points to the default style provided through the style context option; using this approach will ensure consistency between figures.
If this is of use to anyone, the code is on my GitHub.
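For anyone who just wants the gist of the no-regression version, here is a minimal sketch of the percentile-band idea. It is not the code from my repository: the function name `percentile_bands`, the equal-width binning along x, the number of bins, and the synthetic data are all assumptions made for illustration. It bins the points, takes the median and the 25/75, 10/90, and 5/95 percentiles in each bin, and stacks semi-transparent fills of a single colour so the overlapping bands get darker towards the median.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def percentile_bands(x, y, n_bins=20, bands=((25, 75), (10, 90), (5, 95))):
    """Hypothetical helper: per-bin median and (lower, upper) percentile curves."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Assumption: equal-width bins spanning the range of x.
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    # Map each point to a bin index; clip so x.max() lands in the last bin.
    idx = np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)
    centres = 0.5 * (edges[:-1] + edges[1:])
    median = np.full(n_bins, np.nan)
    curves = {b: (np.full(n_bins, np.nan), np.full(n_bins, np.nan)) for b in bands}
    for i in range(n_bins):
        y_bin = y[idx == i]
        if y_bin.size == 0:
            continue  # leave NaN for empty bins
        median[i] = np.percentile(y_bin, 50)
        for lo, hi in bands:
            curves[(lo, hi)][0][i] = np.percentile(y_bin, lo)
            curves[(lo, hi)][1][i] = np.percentile(y_bin, hi)
    return centres, median, curves


# Synthetic data (an assumption): a sine trend with noise that grows with x.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 2000)
y = np.sin(x) + rng.normal(0, 0.3 + 0.1 * x)

centres, median, curves = percentile_bands(x, y)

fig, ax = plt.subplots()
ax.scatter(x, y, s=4, color="0.6")
# Draw the widest band first; overlapping fills of one colour darken towards the median.
for (lo, hi), (lower, upper) in sorted(curves.items()):
    ax.fill_between(centres, lower, upper, color="tab:blue", alpha=0.25, linewidth=0)
ax.plot(centres, median, color="tab:blue")
```

Call `fig.savefig(...)` with a filename of your choice to write the figure out. Swapping the single colour for three distinct colours gives the palette-style variant, and wrapping the plotting calls in `plt.style.context(...)` keeps the scatter styling consistent with your other figures.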