Introduction
What is NumPy?
NumPy is a Python library for working with arrays. It also has functions for working in the domains of linear algebra, Fourier transforms, and matrices. NumPy stands for Numerical Python.
Why Use NumPy?
Python lists can be slow to process. NumPy aims to provide an array object that is up to 50x faster than traditional Python lists. The array object in NumPy is called ndarray, and it comes with many supporting functions.
Why is NumPy Faster Than Lists?
Unlike lists, NumPy arrays are stored at one contiguous place in memory, so processes can access and manipulate them very efficiently. This behavior is called locality of reference. NumPy is also optimized to work with the latest CPU architectures.
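The contiguous layout described above can be inspected directly. The following is a minimal sketch, assuming an explicit 8-byte integer dtype (chosen here only so the numbers are the same on every platform); the variable names are illustrative.

```python
import numpy as np

# Assumption: int64 is requested explicitly so the byte counts are platform-independent.
arr = np.array([1, 2, 3, 4, 5], dtype=np.int64)

is_contiguous = arr.flags['C_CONTIGUOUS']  # True: the data sits in one unbroken block
bytes_per_item = arr.itemsize              # 8 bytes per int64 element
step_in_bytes = arr.strides                # (8,): each element starts 8 bytes after the previous one

print(is_contiguous, bytes_per_item, step_in_bytes)
```

Because every element has the same size and sits next to its neighbour, NumPy can reach any element with simple arithmetic on the memory address.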

Getting Started
Installation of NumPy
pip install numpy
Import and Alias NumPy
import numpy as np
Checking NumPy Version
import numpy as np
print(np.__version__)

Creating Arrays
Create a NumPy ndarray Object
Create a NumPy ndarray object by using the array() function:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(type(arr))
A tuple can also be used to create a NumPy array:
import numpy as np
arr = np.array((1, 2, 3, 4, 5))
print(arr)
Dimensions in Arrays
A dimension in arrays is one level of array depth (nested arrays). Nested arrays are arrays that have arrays as their elements.
0-D Arrays
0-D arrays, or scalars, are the elements in an array. Each value in an array is a 0-D array.
Create a 0-D array with the value 42:
import numpy as np
arr = np.array(42)
print(arr)
1-D Arrays
An array that has 0-D arrays as its elements is called a uni-dimensional or 1-D array. These are the most common and basic arrays.
Create a 1-D array:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)
2-D Arrays
An array that has 1-D arrays as its elements is called a 2-D array. These are often used to represent matrices or 2nd-order tensors. NumPy has a whole submodule dedicated to linear algebra and matrix operations, named numpy.linalg.
Create a 2-D array containing two arrays:
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr)
3-D Arrays
An array that has 2-D arrays (matrices) as its elements is called a 3-D array. These are often used to represent a 3rd-order tensor.
Create a 3-D array with two 2-D arrays, each containing two arrays:
import numpy as np
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(arr)
Check the Number of Dimensions
NumPy arrays provide the ndim attribute, which returns an integer indicating how many dimensions the array has.
import numpy as np
a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim)
Higher Dimensional Arrays
An array can have any number of dimensions. The number of dimensions can be defined when the array is created by using the ndmin argument.
Create an array with 5 dimensions and verify that it has 5 dimensions:
import numpy as np
arr = np.array([1, 2, 3, 4], ndmin=5)
print(arr)
print('number of dimensions :', arr.ndim)

Array Indexing
Access Array Elements
Array indexing is the same as accessing an array element. You can access an array element by referring to its index number. Indexes in NumPy arrays start at 0, so the first element has index 0 and the second has index 1.
Access 2-D Arrays
To access elements from 2-D arrays, use comma-separated integers representing the dimension and the index of the element. 2-D arrays are like a table with rows and columns, where the dimension represents the row and the index represents the column.
Access the element on the 2nd row, 5th column:
import numpy as np
arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print('5th element on 2nd row: ', arr[1, 4])
Access 3-D Arrays
To access elements from 3-D arrays, use comma-separated integers representing the dimensions and the index of the element.
Access the third element of the second array of the first array:
import numpy as np
arr = np.array([ [[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]] ])
print(arr[0, 1, 2])
Negative Indexing
Use negative indexing to access an array from the end.
Print the last element from the 2nd dimension (the second row):
import numpy as np
arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print('Last element from 2nd dimension: ', arr[1, -1])
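As a small illustration of the indexing rules above, the sketch below shows that the comma form and chained brackets reach the same element, and that negative indexes can be used in every dimension. The variable names are illustrative only.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

comma_form = arr[1, 4]      # row index 1, column index 4
chained_form = arr[1][4]    # same element: pick row 1 first, then index 4 of that row
last_of_last = arr[-1, -1]  # negative indexes count from the end in each dimension

print(comma_form, chained_form, last_of_last)  # 10 10 10
```

The comma form is usually preferred because it selects the element in one step instead of creating an intermediate row first.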

Array Slicing
Slicing Arrays
Slicing means taking elements from one given index to another given index. Pass a slice instead of an index, like this: [start:end]. The end index is exclusive. A step can also be defined: [start:end:step].
The default for start is 0, the default for end is the length of the array in that dimension, and the default for step is 1.
Slice elements from the beginning to index 4 (not included):
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[:4])
Negative Slicing
Use the minus operator to refer to an index from the end.
Slice from index 3 from the end to index 1 from the end (not included):
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[-3:-1])
Step
Use the step value to determine the step of the slicing.
Return every other element from the entire array:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[::2])
Slicing 2-D Arrays
From the second row, slice elements from index 1 to index 4 (not included):
import numpy as np
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr[1, 1:4])
From both rows, slice index 1 to index 4 (not included). This returns a 2-D array:
import numpy as np
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr[0:2, 1:4])
Output:
[[2 3 4]
 [7 8 9]]
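The start, end, and step values can be combined in one slice, and a bare colon selects everything along a dimension. A minimal sketch, with illustrative variable names:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])
every_second = arr[1:6:2]   # indexes 1, 3, 5 -> [2 4 6]

grid = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
third_column = grid[:, 2]   # index 2 from every row -> [3 8]

print(every_second)
print(third_column)
```

Mixing a slice with a plain integer index, as in grid[:, 2], drops that dimension, so the result here is a 1-D array.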

Data Types
Data Types in Python
Data Types in NumPy
NumPy has some extra data types and refers to data types with one character, such as i for integers, u for unsigned integers, f for floats, S for byte strings, and U for Unicode strings.
Checking the Data Type of an Array
The NumPy array object has a property called dtype that returns the data type of the array.
Get the data type of an array object:
import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr.dtype)  # int64 on most 64-bit platforms
Creating Arrays With a Defined Data Type
The array() function is used to create arrays. It can take an optional argument, dtype, that defines the expected data type of the array elements.
Create an array with data type string:
import numpy as np
arr = np.array([1, 2, 3, 4], dtype='S')
print(arr)
print(arr.dtype)
Output:
[b'1' b'2' b'3' b'4']
|S1
For i, u, f, S and U, the size can be defined as well.
Create an array with data type 4-byte integer:
import numpy as np
arr = np.array([1, 2, 3, 4], dtype='i4')
print(arr)
print(arr.dtype)
Output:
[1 2 3 4]
int32
What if a Value Cannot Be Converted?
If a type is given to which the elements cannot be cast, NumPy raises a ValueError.
import numpy as np
arr = np.array(['a', '2', '3'], dtype='i')
Output:
Traceback (most recent call last):
  File "./prog.py", line 3, in <module>
ValueError: invalid literal for int() with base 10: 'a'
Converting Data Type on Existing Arrays
The best way to change the data type of an existing array is to make a copy of the array with the astype() method. The astype() method creates a copy of the array and takes the data type as a parameter. The data type can be specified using a string, like 'f' for float or 'i' for integer, or by using the data type directly, like float or int.
Change the data type from float to integer by using 'i' as the parameter value:
import numpy as np
arr = np.array([1.1, 2.1, 3.1])
newarr = arr.astype('i')
print(newarr)
print(newarr.dtype)
Or use int as the parameter value:
import numpy as np
arr = np.array([1.1, 2.1, 3.1])
newarr = arr.astype(int)
print(newarr)
print(newarr.dtype)
Either way the values are [1 2 3]. With 'i' the dtype is int32. With int the dtype is the platform's default integer, which is int64 on most 64-bit platforms.
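To make the behavior of astype() more concrete, here is a minimal sketch of two conversions. It assumes np.int64 as an explicit target so the result does not depend on the platform; the variable names are illustrative.

```python
import numpy as np

arr = np.array([1.9, -1.9, 0.0])

as_int = arr.astype(np.int64)   # the fractional part is dropped, not rounded: [ 1 -1  0]
as_bool = arr.astype(bool)      # zero becomes False, every other value True

print(as_int, as_int.dtype)
print(as_bool)
print(arr)  # the original float array is unchanged, because astype() returned copies
```

Note that float-to-integer conversion truncates toward zero, so 1.9 becomes 1 and -1.9 becomes -1.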

Copy vs. View
The Difference Between Copy and View
The main difference between a copy and a view of an array is that the copy is a new array, while the view is just a view of the original array.
The copy owns its data. Changes made to the copy do not affect the original array, and changes made to the original array do not affect the copy.
The view does not own the data. Changes made to the view affect the original array, and changes made to the original array affect the view.
Make a copy, change the original array, and display both arrays:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
x = arr.copy()
arr[0] = 42
print(arr)
print(x)
Output:
[42  2  3  4  5]
[1 2 3 4 5]
Make a view, change the original array, and display both arrays:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
x = arr.view()
arr[0] = 42
print(arr)
print(x)
Output:
[42  2  3  4  5]
[42  2  3  4  5]
Make a view, change the view, and display both arrays:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
x = arr.view()
x[0] = 31
print(arr)
print(x)
Output:
[31  2  3  4  5]
[31  2  3  4  5]
Check if Array Owns its Data
Copies own the data, while views do not. Every NumPy array has the attribute base, which returns None if the array owns the data. Otherwise, the base attribute refers to the original object.
Print the value of the base attribute to check whether an array owns its data:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
x = arr.copy()
y = arr.view()
print(x.base)
print(y.base)
Output:
None
[1 2 3 4 5]
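Views are not only created by view(). Basic slicing also returns a view, which is an easy source of surprises. The sketch below illustrates this with np.shares_memory(), a NumPy function not mentioned in the text above; the variable names are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

sliced = arr[1:4]          # basic slicing returns a view of arr
copied = arr[1:4].copy()   # an explicit copy owns its own data

sliced[0] = 99             # writes through to arr, but not to the copy

print(arr)                             # [ 1 99  3  4  5]
print(copied)                          # [2 3 4]
print(np.shares_memory(arr, sliced))   # True
print(np.shares_memory(arr, copied))   # False
```

When a slice needs to be modified independently of the original array, calling copy() on it is the safe choice.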

Array Shaping
Shape of an Array
The shape of an array is the number of elements in each dimension.
Get the Shape of an Array
NumPy arrays have an attribute named shape that returns a tuple, with each index holding the number of elements in the corresponding dimension.
Print the shape of a 2-D array:
import numpy as np
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(arr.shape)
Output:
(2, 4)
Create an array with 5 dimensions using ndmin from a vector with the values 1, 2, 3, 4, and verify that the last dimension has the value 4:
import numpy as np
arr = np.array([1, 2, 3, 4], ndmin=5)
print(arr)
print('shape of array :', arr.shape)
Output:
[[[[[1 2 3 4]]]]]
shape of array : (1, 1, 1, 1, 4)
What does the shape tuple represent?
The integer at each index gives the number of elements in the corresponding dimension. In the example above, index 4 holds the value 4, so the 5th (4 + 1) dimension has 4 elements.
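The relationship between shape, the number of dimensions, and the total number of elements can be checked with a short sketch. It uses the size attribute and the standard-library math.prod(), neither of which appears in the text above; the variable names are illustrative.

```python
import math

import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

rows, cols = arr.shape            # the tuple can be unpacked: 2 rows, 4 columns
total = math.prod(arr.shape)      # multiplying the shape entries gives the element count

print(rows, cols)                 # 2 4
print(total, arr.size)            # 8 8
print(len(arr.shape), arr.ndim)   # 2 2
```

The length of the shape tuple always equals ndim, and the product of its entries always equals size.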

Array Reshaping
Reshaping Arrays
Reshaping means changing the shape of an array. The shape of an array is the number of elements in each dimension. By reshaping we can add or remove dimensions, or change the number of elements in each dimension.
Reshape From 1-D to 2-D
Convert the following 1-D array with 12 elements into a 2-D array. The outermost dimension will have 4 arrays, each with 3 elements:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
newarr = arr.reshape(4, 3)
print(newarr)
Output:
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
Reshape From 1-D to 3-D
Convert the following 1-D array with 12 elements into a 3-D array. The outermost dimension will have 2 arrays that each contain 3 arrays, each with 2 elements:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
newarr = arr.reshape(2, 3, 2)
print(newarr)
Output:
[[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]]
Can We Reshape Into any Shape?
Yes, as long as the number of elements is the same in both shapes. We can reshape an 8-element 1-D array into a 2-D array with 2 rows of 4 elements, but we cannot reshape it into a 2-D array with 3 rows of 3 elements, as that would require 3x3 = 9 elements.
Attempting the second reshape results in this error:
Traceback (most recent call last):
  File "demo_numpy_array_reshape_error.py", line 5, in <module>
ValueError: cannot reshape array of size 8 into shape (3,3)
Returns Copy or View?
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
print(arr.reshape(2, 4).base)
The output is the original array, [1 2 3 4 5 6 7 8], so the reshaped array is a view.
Unknown Dimension
You are allowed to have one "unknown" dimension, meaning that you do not have to specify an exact number for one of the dimensions in the reshape method. Pass -1 as the value, and NumPy will calculate this number for you.
Convert a 1-D array with 8 elements to a 3-D array with 2x2 elements:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
newarr = arr.reshape(2, 2, -1)
print(newarr)
Output:
[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
We cannot pass -1 to more than one dimension.
Flattening the Arrays
Flattening an array means converting a multidimensional array into a 1-D array. We can use reshape(-1) to do this.
Convert the array into a 1-D array:
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
newarr = arr.reshape(-1)
print(newarr)
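The reshaping rules above can be exercised in one short sketch: an inferred dimension, a rejected shape, and flattening. It also uses the flatten() method, which is not covered in the text above, to contrast a copy with the view returned by reshape(-1); the variable names are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

inferred = arr.reshape(2, -1)      # NumPy works out the missing dimension: shape (2, 4)

try:
    arr.reshape(3, 3)              # needs 9 elements, but only 8 are available
    error_message = None
except ValueError as exc:
    error_message = str(exc)

flat_view = inferred.reshape(-1)   # a view here, because the data is contiguous
flat_copy = inferred.flatten()     # flatten() always returns a copy

print(inferred.shape)
print(error_message)
print(np.shares_memory(arr, flat_view), np.shares_memory(arr, flat_copy))
```

Catching the ValueError is only for demonstration; in practice it is usually simpler to check that the element counts match before reshaping.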

Array Iterating
Iterating Arrays
If we iterate over a 1-D array, it goes through each element one by one.
import numpy as np
arr = np.array([1, 2, 3])
for x in arr:
    print(x)
Iterating 2-D Arrays
If we iterate over a 2-D array, it goes through all the rows.
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
for x in arr:
    print(x)
Output:
[1 2 3]
[4 5 6]
If we iterate over an n-D array, it goes through the (n-1)-D sub-arrays one by one. To return the actual values, the scalars, we have to iterate over the arrays in each dimension.
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
for x in arr:
    for y in x:
        print(y)
Iterating 3-D Arrays
In a 3-D array, iteration goes through all the 2-D arrays.
import numpy as np
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
for x in arr:
    print("x represents the 2-D array:")
    print(x)
Output:
x represents the 2-D array:
[[1 2 3]
 [4 5 6]]
x represents the 2-D array:
[[ 7  8  9]
 [10 11 12]]
Iterate down to the scalars:
import numpy as np
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
for x in arr:
    for y in x:
        for z in y:
            print(z)
Iterating Arrays Using nditer()
The function nditer() is a helper function that can be used for anything from very basic to very advanced iterations. It solves some basic issues that we face in iteration.
With basic for loops, iterating through each scalar of an n-D array requires n nested for loops, which can be difficult to write for arrays with very high dimensionality.
import numpy as np
arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
for x in np.nditer(arr):
    print(x)
Output:
1
2
3
4
5
6
7
8
Iterating Array With Different Data Types
We can use the op_dtypes argument and pass it the expected data type to change the data type of the elements while iterating.
NumPy does not change the data type of the elements in place (where the elements are stored in the array), so it needs some other space to perform this action. That extra space is called a buffer, and to enable it in nditer() we pass flags=['buffered'].
Iterate through the array as strings:
import numpy as np
arr = np.array([1, 2, 3])
for x in np.nditer(arr, flags=['buffered'], op_dtypes=['S']):
    print(x)
Output:
b'1'
b'2'
b'3'
Iterating With Different Step Size
We can use slicing followed by iteration.
Iterate through every scalar element of the 2-D array, skipping 1 element:
import numpy as np
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
for x in np.nditer(arr[:, ::2]):
    print(x)
Output:
1
3
5
7
Enumerated Iteration Using ndenumerate()
Enumeration means mentioning the sequence number of things one by one. Sometimes we need the corresponding index of the element while iterating, and the ndenumerate() method can be used for those use cases.
Enumerate the elements of the following 1-D array:
import numpy as np
arr = np.array([1, 2, 3])
for idx, x in np.ndenumerate(arr):
    print(idx, x)
Output:
(0,) 1
(1,) 2
(2,) 3
Enumerate the elements of the following 2-D array:
import numpy as np
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
for idx, x in np.ndenumerate(arr):
    print(idx, x)
Output:
(0, 0) 1
(0, 1) 2
(0, 2) 3
(0, 3) 4
(1, 0) 5
(1, 1) 6
(1, 2) 7
(1, 3) 8
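One practical use of ndenumerate() is collecting the positions of elements that meet a condition. The following is a minimal sketch of that idea; the condition (even values) and the variable name are illustrative choices, not part of the text above.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

# Collect the (row, column) index of every even value.
even_positions = [idx for idx, value in np.ndenumerate(arr) if value % 2 == 0]

print(even_positions)  # [(0, 1), (0, 3), (1, 1), (1, 3)]
```

Each idx is a tuple with one entry per dimension, so it can be passed straight back into the array, as in arr[idx].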

Array Join
Joining NumPy Arrays
Joining means putting the contents of two or more arrays into a single array. In SQL we join tables based on a key, whereas in NumPy we join arrays by axes.
We pass a sequence of arrays that we want to join to the concatenate() function, along with the axis. If the axis is not explicitly passed, it defaults to 0.
Join two arrays:
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.concatenate((arr1, arr2))
print(arr)
Output:
[1 2 3 4 5 6]
Join two 2-D arrays along rows (axis=1):
import numpy as np
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])
arr = np.concatenate((arr1, arr2), axis=1)
print(arr)
Output:
[[1 2 5 6]
 [3 4 7 8]]
Joining Arrays Using Stack Functions
Stacking is the same as concatenation, the only difference being that stacking is done along a new axis. We can join two 1-D arrays along a second axis, which results in putting them one over the other, i.e. stacking.
We pass a sequence of arrays that we want to join to the stack() method, along with the axis. If the axis is not explicitly passed, it defaults to 0.
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.stack((arr1, arr2), axis=1)
print(arr)
Output:
[[1 4]
 [2 5]
 [3 6]]
Stacking Along Rows
NumPy provides a helper function, hstack(), to stack along rows (horizontally).
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.hstack((arr1, arr2))
print(arr)
Output:
[1 2 3 4 5 6]
Stacking Along Columns
NumPy provides a helper function, vstack(), to stack along columns (vertically).
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.vstack((arr1, arr2))
print(arr)
Output:
[[1 2 3]
 [4 5 6]]
Stacking Along Height (depth)
NumPy provides a helper function, dstack(), to stack along height, which is the same as depth.
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.dstack((arr1, arr2))
print(arr)
Output:
[[[1 4]
  [2 5]
  [3 6]]]
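A quick way to tell the joining functions apart is to compare the shapes they produce for the same two 1-D inputs. The sketch below does this; the dictionary and its keys are illustrative only.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Result shape of each joining function for two 1-D arrays of length 3.
shapes = {
    'concatenate': np.concatenate((a, b)).shape,      # (6,)      joins along the existing axis
    'stack axis=0': np.stack((a, b)).shape,           # (2, 3)    new first axis
    'stack axis=1': np.stack((a, b), axis=1).shape,   # (3, 2)    new second axis
    'hstack': np.hstack((a, b)).shape,                # (6,)
    'vstack': np.vstack((a, b)).shape,                # (2, 3)
    'dstack': np.dstack((a, b)).shape,                # (1, 3, 2)
}

for name, shape in shapes.items():
    print(name, shape)
```

concatenate() and hstack() keep the number of dimensions for 1-D inputs, while stack(), vstack(), and dstack() add dimensions.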

Array Splitting
Splitting NumPy Arrays
Splitting is the reverse operation of joining. Joining merges multiple arrays into one, while splitting breaks one array into multiple arrays.
We use array_split() for splitting arrays. We pass it the array we want to split and the number of splits.
Split the array into 3 parts:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 3)
print(newarr)
If the array has fewer elements than required for an even split, the parts are adjusted from the end accordingly:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 4)
print(newarr)
Output:
[array([1, 2]), array([3, 4]), array([5]), array([6])]
Split Into Arrays
The return value of the array_split() method is a list containing each of the splits as an array.
If you split an array into 3 arrays, you can access each of them from the result just like any list element:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 3)
print(newarr[0])
print(newarr[1])
print(newarr[2])
When the division does not yield equal-length arrays:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 4)
print(newarr[0])
print(newarr[1])
print(newarr[2])
Output:
[1 2]
[3 4]
[5]
Splitting 2-D Arrays
Use the same syntax when splitting 2-D arrays.
Split the 2-D array into three 2-D arrays:
import numpy as np
arr = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])
newarr = np.array_split(arr, 3)
print(newarr)
Output:
[array([[1, 2], [3, 4]]), array([[5, 6], [7, 8]]), array([[ 9, 10], [11, 12]])]
Split the 2-D array into three 2-D arrays:
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])
newarr = np.array_split(arr, 3)
print(newarr)
Output:
[array([[1, 2, 3], [4, 5, 6]]), array([[ 7, 8, 9], [10, 11, 12]]), array([[13, 14, 15], [16, 17, 18]])]
You can also specify which axis you want to do the split around.
Split the 2-D array into three 2-D arrays along rows (axis=1):
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])
newarr = np.array_split(arr, 3, axis=1)
print(newarr)
Output:
[array([[ 1], [ 4], [ 7], [10], [13], [16]]), array([[ 2], [ 5], [ 8], [11], [14], [17]]), array([[ 3], [ 6], [ 9], [12], [15], [18]])]
An alternate solution is using hsplit():
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])
newarr = np.hsplit(arr, 3)
print(newarr)
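The reason array_split() is used throughout this section is that it tolerates uneven divisions. The sketch below contrasts it with np.split(), a related NumPy function not covered above, which raises a ValueError when the array cannot be divided equally. The variable names are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

uneven = np.array_split(arr, 4)                 # allowed: parts of length 2, 2, 1, 1
part_lengths = [len(part) for part in uneven]

try:
    np.split(arr, 4)                            # 6 elements cannot be divided equally by 4
    split_error = None
except ValueError as exc:
    split_error = str(exc)

rejoined = np.concatenate(uneven)               # joining the parts restores the original

print(part_lengths)
print(split_error)
print(rejoined)
```

Since splitting is the reverse of joining, passing the list of parts to concatenate() gives back the original array.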

Array Search
Searching Arrays
You can search an array for a certain value and return the indexes that get a match. To search an array, use the where() method.
Find the indexes where the value is 4:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 4, 4])
x = np.where(arr == 4)
print(x)
Output:
(array([3, 5, 6]),)
Find the indexes where the values are even:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
x = np.where(arr%2 == 0)
print(x)
Search Sorted
The searchsorted() method performs a binary search in the array and returns the index where the specified value would be inserted to maintain the sort order. It is assumed to be used on sorted arrays.
Find the index where the value 7 should be inserted:
import numpy as np
arr = np.array([6, 7, 8, 9])
x = np.searchsorted(arr, 7)
print(x)
The number 7 should be inserted at index 1 to keep the sort order. The method starts the search from the left and returns the first index where the number 7 is no longer larger than the next value.
Search From the Right Side
By default the leftmost index is returned, but we can give side='right' to return the rightmost index instead.
Find the index where the value 7 should be inserted, starting from the right:
import numpy as np
arr = np.array([6, 7, 8, 9])
x = np.searchsorted(arr, 7, side='right')
print(x)
The number 7 should be inserted at index 2 to keep the sort order. The method starts the search from the right and returns the first index where the number 7 is no longer less than the next value.
Multiple Values
To search for more than one value, use an array with the specified values.
Find the indexes where the values 2, 4, and 6 should be inserted:
import numpy as np
arr = np.array([1, 3, 5, 7])
x = np.searchsorted(arr, [2, 4, 6])
print(x)
The return value is the array [1 2 3], containing the three indexes where 2, 4, and 6 would be inserted in the original array to maintain the order.
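To see that the indexes returned by searchsorted() really keep the array sorted, they can be passed to np.insert(), a NumPy function not covered above. A minimal sketch with illustrative variable names:

```python
import numpy as np

arr = np.array([1, 3, 5, 7])
values = [2, 4, 6]

positions = np.searchsorted(arr, values)       # [1 2 3]
merged = np.insert(arr, positions, values)     # insert each value before its position

left = np.searchsorted(np.array([6, 7, 8, 9]), 7)                 # 1: before the existing 7
right = np.searchsorted(np.array([6, 7, 8, 9]), 7, side='right')  # 2: after the existing 7

print(positions)
print(merged)      # [1 2 3 4 5 6 7]
print(left, right)
```

The positions refer to the original array, which is why np.insert() can apply all three at once without shifting them.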

Array Sort
Sorting Arrays
NumPy has a function called sort() that sorts a specified array. np.sort() returns a sorted copy of the array and leaves the original array unchanged.
import numpy as np
arr = np.array([3, 2, 0, 1])
print(np.sort(arr))
You can also sort arrays of strings, or any other data type.
Sorting a 2-D Array
If you use the sort() method on a 2-D array, each inner array (row) is sorted separately.
import numpy as np
arr = np.array([[3, 2, 4], [5, 0, 1]])
print(np.sort(arr))
Output:
[[2 3 4]
 [0 1 5]]
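The sketch below expands on the 2-D case. It uses the axis argument of np.sort() and the in-place ndarray.sort() method, neither of which is covered above, so treat it as an illustration rather than part of the original notes. The variable names are illustrative.

```python
import numpy as np

arr = np.array([[3, 2, 4], [5, 0, 1]])
original = arr.copy()

by_row = np.sort(arr)              # default axis=-1: each row sorted -> [[2 3 4], [0 1 5]]
by_column = np.sort(arr, axis=0)   # each column sorted -> [[3 0 1], [5 2 4]]
untouched = np.array_equal(arr, original)   # True: np.sort() returned copies

arr.sort()                         # the method form sorts arr itself, in place

print(by_row)
print(by_column)
print(untouched)
print(arr)
```

Use np.sort() when the original order is still needed, and the in-place method when it is not.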

Array Filter
Filtering Arrays
Getting some elements out of an existing array and creating a new array out of them is called filtering. In NumPy, you filter an array using a boolean index list.
A boolean index list is a list of booleans corresponding to indexes in the array. If the value at an index is True, that element is contained in the filtered array. If the value at that index is False, that element is excluded from the filtered array.
Create an array from the elements at index 0 and 2:
import numpy as np
arr = np.array([41, 42, 43, 44])
x = [True, False, True, False]
newarr = arr[x]
print(newarr)
Creating the Filter Array
The example above uses hard-coded True and False values, but the common use is to create a filter array based on conditions.
Create a filter array that will return only values higher than 42:
import numpy as np
arr = np.array([41, 42, 43, 44])
# Create an empty list
filter_arr = []
# Go through each element in arr
for element in arr:
    # If the element is higher than 42, set the value to True, otherwise False
    if element > 42:
        filter_arr.append(True)
    else:
        filter_arr.append(False)
newarr = arr[filter_arr]
print(filter_arr)
print(newarr)
Create a filter array that will return only even elements from the original array:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
# Create an empty list
filter_arr = []
# Go through each element in arr
for element in arr:
    # If the element is completely divisible by 2, set the value to True, otherwise False
    if element % 2 == 0:
        filter_arr.append(True)
    else:
        filter_arr.append(False)
newarr = arr[filter_arr]
print(filter_arr)
print(newarr)
Creating Filter Directly From Array
The above example is quite a common task in NumPy, and NumPy provides a better way to tackle it. We can directly substitute the array for the iterable variable in our condition, and it will work just as we expect it to.
Create a filter array that will return only values higher than 42:
import numpy as np
arr = np.array([41, 42, 43, 44])
filter_arr = arr > 42
newarr = arr[filter_arr]
print(filter_arr)
print(newarr)
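The direct form also works for the even-number filter shown earlier, and several conditions can be combined. The following is a minimal sketch; combining conditions with the & operator is an addition to the notes above, and the variable names are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

even_mask = arr % 2 == 0                           # boolean array, one entry per element
even_values = arr[even_mask]                       # [2 4 6]

# Each condition needs its own parentheses because & binds tighter than the comparisons.
even_and_large = arr[(arr % 2 == 0) & (arr > 3)]   # [4 6]

print(even_mask)
print(even_values)
print(even_and_large)
```

The filtered results are new arrays, so changing them does not affect arr.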