Machine learning is hard. Some people spend weeks, months, even years trying to learn machine learning without any success. They play around with datasets, buy books, compete on Kaggle, but ultimately make little progress. One of the big problems is that many people just want to “dive in and build something.” I admire the ambition […]
If you’ve been using R for a while, and you’ve been working with basic data visualization and data exploration techniques, the next logical step is to start learning some machine learning. To help you begin learning about machine learning in R, I’m going to introduce you to an R package: the caret package. We’ll build […]
On Google’s recent Q3 earnings call, Google’s CEO, Sundar Pichai, said that one “transformative” technology is causing Google to rethink “how we’re doing everything.”
Read that again. There’s a single technology that’s causing Google to rethink the way it does everything.
And it’s not just Google …
A Sharp Sight Labs reader (and now student), Jason P., recently started learning data science. He has a background in data analysis (primarily with Excel and related tools in the Microsoft ecosystem), but he wanted to start learning some of the harder skills of data science. He contacted me after he had diligently reviewed past […]
One of the biggest issues that comes up when I talk to people who want to get started learning data science is the following: “I don’t know where to get started!” Recently, I argued that R is the best programming language to learn when you’re getting started with data science. While this helps you select […]
In a recent post, I wrote that when you’re starting out with data, you need to focus much more on process and technique, not syntax. Beginning students hear this, but it’s easy to ignore and get lost down the rabbit hole of syntax.
The problem is …
Over and over, when talking with people who are starting to learn data science, there’s a frustration that comes up: “I don’t know which programming language to start with.” And it’s not just programming languages; it’s also software systems like Tableau, SPSS, etc. There is an ever-widening range of tools and programming languages and […]
There’s never been a better time to start learning data science. As early as 2011, McKinsey was predicting shortages of skilled data scientists (and noting big data as a key factor in business competition). They were right. As 2015 begins, the data industry has never looked brighter for people with exceptional data skill. Indeed, in […]
Once you know how to create simple plots, you’ll want to learn how to design more sophisticated plots. A large part of being able to design sophisticated plots is having control over the “non-data elements” of the plot, such as the plot title and axis titles. You want to be able to format those and polish […]
Last week I published a data visualization of San Francisco crime. This week, I’m mapping Seattle crime data. The map above is moderately complicated to create, so I’ll start this tutorial with a simpler case: the dot distribution map.
Seattle crime map, simplified version
First, we’ll start by loading the data. Note that I already […]