7 Tips to Help You Learn Numpy Quickly

Over and over again, I emphasize the importance of data manipulation.

Data manipulation is the foundation for almost everything else in data science and machine learning.

So to be a great data scientist, one of your first goals should be to master data manipulation tools.

And if you’re doing data manipulation in Python, that means that you need to learn Numpy.

Numpy is easy in many ways, but many people still get stuck on it.

So in this blog post, I’ll give you 7 tips to help you learn Numpy.

1: Use the 80/20 rule

Numpy is a very powerful toolkit.

One of the reasons that it’s so powerful is the vast number of tools and techniques within the Numpy package.

I don’t know the exact number, but there are probably well over 100 functions, methods, and tools within the Numpy system.

But this poses a problem.

There are a lot of things to potentially learn.

You could easily jump down the Numpy rabbit hole, and try to learn everything.

This is a bad strategy.

When you learn Numpy, a small number of tools account for the vast majority of your code.

This is an example of the 80/20 rule. You’ve probably heard of this before.

According to the 80/20 rule, 80 percent of the outputs result from 20 percent of the inputs.

The 80/20 rule applies to all sorts of things, and for the most part, it also applies to Numpy.

Even though there are 100+ functions and methods in Numpy, you’ll probably only use a couple dozen of them regularly.

You should focus relentlessly on learning and mastering those commonly-used techniques.

(I’ll point out most of them in the upcoming sections.)

2: Learn How Arrays Are Structured

So, now let’s talk about some things specific to Numpy.

One of the first and most important things you need to know is how Numpy arrays are structured.

Let’s start with the basics:

Numpy arrays are data structures that hold numeric data. They can have 1 dimension, 2 dimensions, or more.

An example of a Numpy array with 3 rows and 4 columns, containing random integers.

So they are very similar to vectors and matrices in mathematics and linear algebra.

Importantly, you always need to be mindful of the shape of the array that you’re working with (i.e., the number of rows, columns, etc).

That’s because some Numpy functions behave differently depending on the shapes of the arrays that you’re working with. For example, when you use Numpy multiply and Numpy divide, the functions will work differently if you have two same-sized arrays versus two differently sized arrays.
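To make that concrete, here is a small sketch with np.multiply. The array names and values are made up for illustration. With two arrays of the same shape, the values are multiplied element by element. With a 2D array and a 1D array whose length matches the number of columns, Numpy "broadcasts" the smaller array across each row.

```python
import numpy as np

# Hypothetical example arrays (names and values are illustrative only)
a = np.array([[1, 2, 3],
              [4, 5, 6]])          # shape (2, 3)
b = np.array([[10, 20, 30],
              [40, 50, 60]])       # shape (2, 3), same as a
row = np.array([10, 100, 1000])    # shape (3,)

# Same shapes: values are multiplied element by element
same_shape = np.multiply(a, b)
print(same_shape)

# Different shapes: the 1D array is broadcast across each row of a
broadcast = np.multiply(a, row)
print(broadcast)
```

If the shapes are not compatible (for example, a 1D array of length 2 against 3 columns), Numpy raises an error instead of guessing.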

Having said this, you can check the size and shape with the size and shape attributes.

For example, if you have an array named my_array, you can use my_array.shape to check the shape.
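Here is a quick sketch of what that looks like. The contents of my_array are arbitrary. I have also included the ndim attribute, which is not mentioned above but is handy for checking the number of dimensions.

```python
import numpy as np

# my_array is the example name used above; the values are arbitrary
my_array = np.arange(12).reshape(3, 4)

print(my_array.shape)  # (3, 4): 3 rows, 4 columns
print(my_array.size)   # 12: total number of elements
print(my_array.ndim)   # 2: number of dimensions
```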

3: Learn Numpy Axes

You also need to learn about Numpy axes.

Axes in a Numpy array are very similar to axes in a Cartesian, x/y/z coordinate system. Remember that in a Cartesian coordinate system, we can identify a point in space by its location along the x-axis and location along the y-axis.

An example of defining a point in Cartesian coordinates by its value along the x axis and y axis.

Axes in Numpy are very similar.

Much like in a Cartesian coordinate system, Numpy axes are directions. You can think of them like directions along the edges of the array.

For example, in a 2D array, axis-0 points downward and axis-1 points across.

An image that helps you learn Numpy axes by showing that axes are like directions.

This is important for a few reasons.

First, these axes enable you to locate individual cells in the array. To do this, you need to know which axis references the rows, which references the columns, etc.

Second, many Numpy functions can operate in a specific direction (i.e., along a specific axis). For example, Numpy sum can operate along specific axes to compute row sums and column sums.

An image that shows how Numpy sum works.
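As a rough sketch, here is how the axis parameter changes what np.sum returns on a small 2D array. The array values are arbitrary. The thing to remember is that the axis you pass is the one that gets collapsed.

```python
import numpy as np

arr = np.array([[0, 1, 2],
                [3, 4, 5]])

# No axis: sum every element in the array
total = np.sum(arr)

# axis=0 collapses the rows (moving downward), giving one sum per column
col_sums = np.sum(arr, axis=0)

# axis=1 collapses the columns (moving across), giving one sum per row
row_sums = np.sum(arr, axis=1)

print(total)     # 15
print(col_sums)  # [3 5 7]
print(row_sums)  # [ 3 12]
```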

I can’t stress this enough: Numpy axes come up over and over again. They are one of the most important things in the Numpy system.

To learn more about Numpy axes, you should read our full tutorial that explains how axes work.

4: Learn the Important Array Creation Methods

Once you understand the essentials of array structure and array axes, you need to know how to create arrays.

There are many, many ways to create Numpy arrays, but the ones that I use most often are:

To be clear, and as I mentioned previously, Numpy has a lot of array creation methods. There are methods for creating arrays with random numbers of various properties. There are techniques for creating arrays with numbers drawn from various probability distributions. I could go on and on.

But in the beginning, you need to focus on the most commonly used techniques. Remember: 80/20.
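As a rough sketch, here are a few array creation functions that come up very often in everyday Numpy code. Treat this particular selection as an assumption for illustration, not a definitive list. The variable names are made up.

```python
import numpy as np

from_list = np.array([1, 2, 3])     # build an array from a Python list
zeros = np.zeros((2, 3))            # 2x3 array filled with 0.0
ones = np.ones((2, 3))              # 2x3 array filled with 1.0
counting = np.arange(0, 10, 2)      # 0, 2, 4, 6, 8 (the stop value is excluded)
spaced = np.linspace(0, 1, 5)       # 5 evenly spaced values from 0 to 1, inclusive

# Random integers from 0 up to (but not including) 10, in a 3x4 array.
# The seed is arbitrary; it just makes the result repeatable.
rng = np.random.default_rng(seed=0)
random_ints = rng.integers(0, 10, size=(3, 4))

print(counting)
print(spaced)
print(random_ints.shape)
```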

5: Learn How to Reshape Arrays

Once you have an array with some numbers, it’s very common to need to reshape the array in some way.

For example, you may have a 1-dimensional array that you want to reshape to 2 dimensions. Or vice versa.

Visual representation of how we re-shape data with the NumPy reshape method.

This is fairly easy to do once you know how, but it comes up over and over again, so you should learn it early.
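Here is a minimal sketch of going from 1 dimension to 2 dimensions and back again. The array names are made up for illustration.

```python
import numpy as np

flat = np.arange(6)         # 1D array: [0 1 2 3 4 5]

# Reshape to 2 rows and 3 columns
grid = flat.reshape(2, 3)

# Reshape back to 1D; -1 asks Numpy to infer the length
back = grid.reshape(-1)

print(grid)
print(back)
```

Keep in mind that the new shape has to hold exactly the same number of elements as the original. Reshaping 6 values into 4 rows and 2 columns, for example, raises an error.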

To learn more about reshaping Numpy arrays, read this tutorial about Numpy reshape.

6: Learn Numpy Aggregation Techniques

Next, you should learn some array aggregation techniques.

The most common and most important are:

You’ll commonly use these techniques when you need to analyze the numbers in your array by computing summary statistics.

I’ll point out that Numpy axes are particularly important when you use these aggregation functions. That’s because you will often need to compute column statistics and row statistics (e.g., the row means, etc).
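To illustrate, here is a small sketch that computes a few summary statistics along different axes. The functions shown (np.mean, np.max, and np.min) are my own picks for the example, and the data is made up.

```python
import numpy as np

scores = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

overall_mean = np.mean(scores)        # mean of every value in the array
col_means = np.mean(scores, axis=0)   # one mean per column
row_means = np.mean(scores, axis=1)   # one mean per row
row_max = np.max(scores, axis=1)      # largest value in each row
col_min = np.min(scores, axis=0)      # smallest value in each column

print(overall_mean)  # 3.5
print(col_means)     # [2.5 3.5 4.5]
print(row_means)     # [2. 5.]
```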

7: Learn Numpy Array Math

Finally, learn some essential array math.

At minimum, you should know how to add and subtract arrays (using np.add and np.subtract).
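For example, here is a quick sketch with two small same-shaped arrays. The names and values are arbitrary.

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4]])
y = np.array([[10, 20],
              [30, 40]])

added = np.add(x, y)             # same result as x + y
subtracted = np.subtract(y, x)   # same result as y - x

print(added)
print(subtracted)
```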

You should also probably know some simple math functions to operate on the individual elements of the array, like np.power, np.log, and np.exp.
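Each of these functions is applied to every element separately. A minimal sketch, with made-up input values:

```python
import numpy as np

values = np.array([1.0, 2.0, 4.0])

squared = np.power(values, 2)   # each element raised to the power 2
natural_log = np.log(values)    # natural logarithm of each element
exponent = np.exp(values)       # e raised to each element

print(squared)  # [ 1.  4. 16.]
print(natural_log)
print(exponent)
```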

You may also want to learn about dot products with Numpy and other linear algebra operations. These can be useful for machine learning and deep learning. Having said that, they are slightly more advanced, and you may be able to avoid them in the beginning.
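If you do want a first look, here is a small sketch of np.dot on two vectors and on two matrices. The arrays are made up for illustration.

```python
import numpy as np

# Dot product of two 1D arrays: 1*4 + 2*5 + 3*6
v = np.array([1, 2, 3])
w = np.array([4, 5, 6])
dot_vw = np.dot(v, w)

# With two 2D arrays, np.dot performs matrix multiplication
m = np.array([[1, 2],
              [3, 4]])
n = np.array([[5, 6],
              [7, 8]])
mat_product = np.dot(m, n)   # m @ n gives the same result

print(dot_vw)       # 32
print(mat_product)
```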

Leave Your Questions in The Comments

Do you have any questions about what I’ve written here? Do you have other suggestions or things you’re struggling with?

If so, leave your questions and suggestions in the comments section near the bottom of the page.

There’s more to learn, but start with the essentials

In this blog post, I’ve tried to distill Numpy down to the most important things, so you can learn Numpy more quickly. I’ve tried to explain the most important, most commonly used techniques and concepts that you should focus on first.

But, if you really want to master Numpy, there’s a lot more to learn.

So if you’re serious about mastering Numpy, and serious about data science in Python, you should consider joining our premium course called Numpy Mastery.

Numpy Mastery will teach you everything you need to know about Numpy, including:

  • How to create Numpy arrays
  • How to reshape, split, and combine your Numpy arrays
  • What the “Numpy random seed” function does
  • How to use the Numpy random functions
  • How to perform mathematical operations on Numpy arrays
  • and more …

Moreover, this course will show you a practice system that will help you memorize all of the Numpy syntax you learn and master it within a few weeks. If you have trouble remembering Numpy syntax, this is the course you’ve been looking for.

Find out more here:

Learn More About Numpy Mastery

Joshua Ebner

Joshua Ebner is the founder, CEO, and Chief Data Scientist of Sharp Sight. Prior to founding the company, Josh worked as a Data Scientist at Apple. He has a degree in Physics from Cornell University. For more daily data science advice, follow Josh on LinkedIn.
