In last week’s blog post, I asked: “How much data science do you actually remember?” It’s a critical question. If you study data science, but forget everything that you learn, you’ll be in big trouble when you go in for an interview. Or, you’ll be in big trouble if you actually get a data science […]
How many data science books have you read? 5? 10? A few dozen? How many free online courses have you taken? A few? How many blog posts have you read? (I’d be willing to bet: you’ve read dozens.) If you’re like most budding data scientists, you’ve probably consumed a lot of material. You probably even […]
The histogram is a very useful visualization tool, and you need to master it.
In the world of data visualization, the heatmap is underrated and underutilized. It has limitations, but overall, it’s an excellent tool in your data science and data visualization toolkit. After you’ve mastered the foundational visualization techniques (you can write the code for the basic plots in your sleep, right?), you should learn the heatmap. […]
Most people woke up on Wednesday morning to some combination of shock, joy, bemusement, and/or mild terror as Donald Trump unexpectedly won the presidency.
I say “unexpectedly” because …
The world has just entered one of the biggest transitions in history. That’s the contention of two MIT economists, Erik Brynjolfsson and Andrew McAfee. In their recent book, The Second Machine Age, they argue that big data, computation, and innovation are changing our economy and institutions with a magnitude greater than almost anything ever seen […]
A few weeks ago, an acquaintance told me that he was interested in getting started with machine learning. He’s a web developer who primarily works in Ruby and Python, but also has a small amount of experience with R. Day-to-day, his work is run-of-the-mill web development, and he’s confessed to me that he’s a bit […]
Claims of “the end of geography” and the flatness of the world notwithstanding, place still matters today. Discussing why place matters is somewhat beyond the scope of this post, so I will direct you to the excellent work of Parag Khanna and his book Connectography. To put it simply, the future of business and […]
In part 1, we went over how to use data visualization and data analysis prior to machine learning. For example, we discussed how to visualize the data to identify potential issues in the dataset, examine the variable distributions, etc. In this blog post, we’ll continue by building a very simple model and using data visualization […]
In my last article, I stated that for practitioners (as opposed to theorists), the real prerequisite for machine learning is data analysis, not math. One of the main reasons for making this statement is that data scientists spend an inordinate amount of time on data analysis. The traditional statement is that data scientists “spend 80% […]