Over and over, when I talk with people who are starting to learn data science, one frustration comes up: “I don’t know which programming language to start with.” And it’s not just programming languages; it’s also software systems like Tableau, SPSS, etc. There is an ever-widening range of tools and programming languages and […]
There’s never been a better time to start learning data science. As early as 2011, McKinsey was predicting shortages of skilled data scientists (and noting big data as a key factor in business competition). They were right. As 2015 begins, the data industry has never looked brighter for people with exceptional data skills. Indeed, in […]
Once you know how to create simple plots, you’ll want to learn how to design more sophisticated ones. A large part of designing sophisticated plots is having control over the “non-data elements” of the plot, such as the plot title and axis titles. You want to be able to format those and polish […]
Last week I published a data visualization of San Francisco crime. This week, I’m mapping Seattle crime data. The map above is moderately complicated to create, so I’ll start this tutorial with a simpler case: the dot distribution map. [Figure: Seattle crime map, simplified version] First, we’ll start by loading the data. Note that I already […]
When I was working as a data scientist at Apple in Silicon Valley, I’d drive up to San Francisco on nights and weekends to meet a girl for dinner or go to a meetup.
I sort of fell in love with the city, and …
For our purposes here, data exploration is the application of data visualization and data manipulation techniques to understand the properties of our dataset.
We’re going to be looking for interesting features: things that stand out, trends, and relationships between variables.
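To make "things that stand out" and "relationships between variables" a bit more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the toy numbers are made up, and the helper names `standouts` and `pearson` are hypothetical, not taken from the original post. It flags values far from the mean and measures how strongly two variables move together.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical toy dataset: pairs of two numeric variables (x, y).
# These numbers are invented purely for illustration.
rows = [
    (150, 120), (200, 135), (250, 148), (300, 160),
    (350, 171), (400, 183), (1200, 253),
]
xs = [x for x, _ in rows]
ys = [y for _, y in rows]


def standouts(values, threshold=2.0):
    """Return values more than `threshold` sample standard deviations from the mean."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) / s > threshold]


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    spread = sqrt(sum((p - ma) ** 2 for p in a) * sum((q - mb) ** 2 for q in b))
    return cov / spread


print("values that stand out in x:", standouts(xs))
print("correlation between x and y:", round(pearson(xs, ys), 3))
```

In a real analysis you would usually plot the data as well; the point here is only that "exploration" boils down to a handful of simple, repeatable questions asked of the dataset.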
You’ve probably read numerous articles telling you how to start learning data science. Collectively, they list dozens of things you need to learn. Learn Python. Learn R. Learn Hadoop. They tell you all the skills you need: learn machine learning, visualization, data wrangling. Little technical skills like manipulating vectors, matrices, loops. More tools […]
I love cars. The way they sound. The engineering. The craftsmanship. And let’s be honest: fast cars are just fun. Given my love of cars, I frequently watch Top Gear clips on YouTube. A couple of weeks ago, I stumbled across this: Watching the video, I’m thinking, “253 miles per hour? You’ve got to […]
OK. Here’s an ugly secret of the data world: lots of your work will be prep work. Of course, any maker, artist, or craftsman has the same issue: chefs have their mise en place. Carpenters spend a heck of a lot of time measuring vs. cutting. Et cetera. So, you just need to be prepared that […]
An important principle in analyzing data is “overview first, zoom and filter, then details on demand” (Ben Shneiderman). In practice, this typically means starting at a high level with a single chart, and then “zooming into” the data by replicating that chart for specific subsets of the dataset. And, even more valuable is being […]
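The overview-then-zoom pattern can be sketched without any charting library. The snippet below is a rough illustration under stated assumptions: the records are made up, and the function names `summarize`, `overview`, and `zoom` are hypothetical. A small numeric summary stands in for the "single chart"; it is computed once for the whole dataset and then repeated for each subset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (group label, measured value). Invented for illustration.
records = [
    ("north", 12), ("north", 15), ("north", 11),
    ("south", 30), ("south", 28),
    ("west", 9), ("west", 14), ("west", 10),
]


def summarize(values):
    """The one summary we replicate everywhere: count, mean, min, max."""
    return {"n": len(values), "mean": mean(values),
            "min": min(values), "max": max(values)}


def overview(records):
    """Overview first: one summary over the entire dataset."""
    return summarize([value for _, value in records])


def zoom(records):
    """Zoom and filter: the same summary, repeated for each group."""
    groups = defaultdict(list)
    for label, value in records:
        groups[label].append(value)
    return {label: summarize(vals) for label, vals in sorted(groups.items())}


print("overview:", overview(records))
for label, summary in zoom(records).items():
    print(label, summary)
```

Swapping `summarize` for a plotting function would give the visual version of the same idea: one chart for everything, then the same chart once per subset.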