Module 6 Lesson 2: NumPy: Arrays and Vectorization

Meet the foundation of numerical Python. Learn how NumPy arrays differ from lists and how 'vectorization' can make your math dramatically faster, often by a factor of 10 to 100 on large arrays.

Python lists are great for small tasks, but they are slow when dealing with millions of numbers. To solve this, we use NumPy (Numerical Python). It is the foundation for almost every other data science library.

Lesson Overview

In this lesson, we will cover:

  • The ndarray: NumPy’s super-powered list.
  • Vectorization: Math on whole arrays without writing loops.
  • Array Creation: np.array, np.zeros, and np.arange.
  • Performance Comparison: Why lists fail where NumPy shines.

1. What is an Array?

A NumPy array (ndarray) looks like a list, but it has one major rule: every item in the array must have the same data type (e.g., all integers or all floats). Because of this rule, NumPy can store the values side by side in one compact block of memory and process them much faster.

import numpy as np # Standard way to import numpy

arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(type(arr)) # Output: <class 'numpy.ndarray'>
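
To see the same-type rule in action, here is a small sketch that inspects the dtype attribute of two arrays. The variable names are only illustrative. The exact integer type (for example int64 or int32) depends on your platform and NumPy version, so no output is shown for it.

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])   # one float converts the whole array to floats

print(ints.dtype)    # an integer type; the exact width depends on the platform
print(mixed.dtype)   # float64
print(mixed)         # [1.  2.5 3. ]
```

Notice that NumPy did not keep the 1 and 3 as integers. It converted them so that every element shares one type.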

2. The Magic of Vectorization

If you want to add 10 to every number in a Python list, you need a for-loop or a list comprehension. In NumPy, you just add 10 to the whole array. This is called Vectorization.

# The Slow List Way
my_list = [1, 2, 3]
new_list = [x + 10 for x in my_list]

# The Fast NumPy Way
my_arr = np.array([1, 2, 3])
new_arr = my_arr + 10 # Adding 10 to every element at once!
print(new_arr) # Output: [11 12 13]
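
Vectorization also works between two arrays of the same length, not just between an array and a single number. The sketch below uses made-up prices and quantities to show element-by-element math, and contrasts it with what the * operator does to a plain list.

```python
import numpy as np

prices = np.array([2.0, 4.0, 6.0])
quantities = np.array([3, 1, 2])

totals = prices * quantities     # multiplies matching elements
squared = quantities ** 2        # squares every element
doubled_list = [1, 2, 3] * 2     # a list is repeated instead of multiplied

print(totals)         # [ 6.  4. 12.]
print(squared)        # [9 1 4]
print(doubled_list)   # [1, 2, 3, 1, 2, 3]
```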

3. Creating Arrays

NumPy provides built-in tools to generate data quickly.

# Create an array of 5 zeros
zeros = np.zeros(5)

# Create an array of the integers 0 through 9 (the stop value 10 is excluded)
range_arr = np.arange(10)

# Create a 2D array (a table)
matrix = np.array([[1, 2], [3, 4]])
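
It can help to inspect what these tools actually produce. The following sketch checks the dtype, shape, and number of dimensions of similar arrays. The name evens and the step value are illustrative choices, not part of the lesson's examples.

```python
import numpy as np

zeros = np.zeros(5)
evens = np.arange(0, 10, 2)           # start, stop (excluded), step
matrix = np.array([[1, 2], [3, 4]])

print(zeros.dtype)    # float64 (np.zeros creates floats by default)
print(evens)          # [0 2 4 6 8]
print(matrix.shape)   # (2, 2), meaning 2 rows and 2 columns
print(matrix.ndim)    # 2
```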

4. Why Use NumPy?

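Imagine you have 1 million numbers. The timings below are rough illustrations; real results depend on your machine.

  • Python List: Adding 1 to each number with a loop might take around 100 milliseconds.
  • NumPy Array: Adding 1 to each number might take around 1 millisecond.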

By using NumPy, the loop over your numbers runs in fast, precompiled C code rather than step by step in the slower Python interpreter.
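
You can measure the difference yourself. This is a minimal sketch using Python's built-in timeit module; the variable names are illustrative and the timings will vary from one computer to another, so no expected output is shown.

```python
import timeit
import numpy as np

n = 1_000_000
numbers_list = list(range(n))
numbers_arr = np.arange(n)

# Run each version 10 times and record the total time in seconds
list_time = timeit.timeit(lambda: [x + 1 for x in numbers_list], number=10)
arr_time = timeit.timeit(lambda: numbers_arr + 1, number=10)

print(f"list:  {list_time / 10 * 1000:.2f} ms per run")
print(f"array: {arr_time / 10 * 1000:.2f} ms per run")
```

On most machines the array version should be clearly faster, though the exact ratio will differ.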


Practice Exercise: The Temperature Converter (NumPy Style)

  1. Create a NumPy array containing 10 temperatures in Celsius (e.g., 0, 10, 20, 30...).
  2. Use a Vectorized Operation to convert all of them to Fahrenheit in one line of code.
    • Formula: F = (C * 9/5) + 32
  3. Print the new array.
  4. Calculate the Mean (Average) temperature using np.mean(your_array).

Quick Knowledge Check

  1. Which is faster for millions of numbers: a Python list or a NumPy array?
  2. What is the main restriction of a NumPy array?
  3. What is "Vectorization" in your own words?
  4. What does np.zeros(10) create?

Key Takeaways

  • NumPy arrays are faster and more memory-efficient than lists.
  • Arrays require all elements to be of the same type.
  • Vectorization lets you apply math to an entire array without loops.
  • Pandas and Scikit-Learn are built on NumPy, so learning it first makes both easier to use.

What’s Next?

We know how to create arrays and do basic math. In Lesson 3, we’ll explore the advanced Math and Stats power of NumPy, including slicing and conditional filters!
