Data is collected in many different formats, from numbers to images to categories to sound waves. However, computers need that data in numerical form in order to analyze it. Machine learning and deep learning models are data-hungry, and their performance depends heavily on the amount of data available. Thus, we tend to collect as much data as possible in order to build a robust and accurate model. As the amount of data increases, processing it one scalar at a time becomes inefficient. We need vectorized or matrix operations to carry out computations efficiently. That’s where linear algebra comes in.
What Is the Dot Product of a Matrix?
Linear algebra is one of the most important topics in the data science domain. In this post, we will cover two basic, yet very important, operations of linear algebra: the dot product and matrix multiplication. These operations are the building blocks of complex machine learning and deep learning models, so it’s important to understand them.
How to Find the Dot Product
The dot product of two vectors is the sum of the products of the elements at matching positions. The first element of the first vector is multiplied by the first element of the second vector, and so on. The sum of these products is the dot product, which can be computed with NumPy’s np.dot() function.
Let’s first create two simple vectors in the form of NumPy arrays and calculate the dot product.
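A minimal sketch of this step, assuming the two vectors are [1, 2, 3] and [2, 4, 6] (values inferred from the arithmetic that follows; the names `a`, `b` and `dot_ab` are illustrative):

```python
import numpy as np

# Two vectors of the same length (values assumed for illustration)
a = np.array([1, 2, 3])
b = np.array([2, 4, 6])

# Sum of the element-wise products: (1*2) + (2*4) + (3*6)
dot_ab = np.dot(a, b)
print(dot_ab)  # 28
```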
The dot product of these two vectors is the sum of the products of elements at each position. In this case, the dot product is (1*2)+(2*4)+(3*6) = 28.
Since we multiply elements at the same positions, the two vectors must have the same length in order to have a dot product.
How to Calculate a Dot Product Matrix
In data science, we mostly deal with matrices. A matrix is a set of row and column vectors combined in a structured way. Thus, the multiplication of two matrices involves many dot product operations between vectors. This will become clearer when we go over some examples. Let’s first create two 2x2 matrices with NumPy.
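A minimal sketch of the two matrices, with values inferred from the step-by-step arithmetic later in this section (the exact entries are an assumption, chosen so that the rows and columns match the numbers quoted in the text):

```python
import numpy as np

# Two 2x2 matrices (entries inferred from the worked example)
A = np.array([[4, 2],
              [0, 3]])
B = np.array([[4, 0],
              [1, 4]])

print(A.shape, B.shape)  # (2, 2) (2, 2)
```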
A 2x2 matrix has two rows and two columns. Row and column indices start at zero. For instance, the first row of A (the row with index zero) is the array [4,2]. The first column of A is the array [4,0]. The element in the first row and first column is four. We can access individual rows, columns or elements with NumPy’s indexing syntax.
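As a small sketch of that indexing, assuming A holds the values described above (the variable names `first_row`, `first_col` and `top_left` are illustrative):

```python
import numpy as np

A = np.array([[4, 2],
              [0, 3]])

first_row = A[0]      # row with index 0
first_col = A[:, 0]   # column with index 0
top_left = A[0, 0]    # element at row 0, column 0

print(first_row)  # [4 2]
print(first_col)  # [4 0]
print(top_left)   # 4
```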
These are important concepts to comprehend in order to understand matrix multiplication.
The multiplication of two matrices involves dot products between the rows of the first matrix and the columns of the second matrix. The first step is the dot product between the first row of A and the first column of B. The result of this dot product is the element of the resulting matrix at position [0,0] (i.e., first row, first column).
So the resulting matrix, C, will have (4*4) + (2*1) at the first row and first column: C[0,0] = 18.
The next step is the dot product of the first row of A and the second column of B.
C will have (4*0) + (2*4) at the first row and second column: C[0,1] = 8.
The first row of A is complete, so we move on to the second row of A and follow the same steps.
C will have (0*4) + (3*1) at the second row and first column: C[1,0] = 3.
The final step is the dot product between the second row of A and the second column of B.
C will have (0*0) + (3*4) at the second row and second column: C[1,1] = 12.
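The four steps above can be sketched as a pair of loops, where each entry of C is one row-by-column dot product. This is only an illustration of the procedure, assuming the same A and B as in the worked example; it is not how you would normally multiply matrices in NumPy.

```python
import numpy as np

A = np.array([[4, 2],
              [0, 3]])
B = np.array([[4, 0],
              [1, 4]])

# Empty 2x2 result, filled one entry at a time
C = np.zeros((2, 2), dtype=int)
for i in range(A.shape[0]):        # rows of A
    for j in range(B.shape[1]):    # columns of B
        # Dot product of row i of A with column j of B
        C[i, j] = np.dot(A[i, :], B[:, j])

print(C.tolist())  # [[18, 8], [3, 12]]
```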
We’ve now seen how it’s done step-by-step. All of these operations can also be done with a single np.dot operation:
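A minimal sketch, again assuming the A and B from the worked example. For two-dimensional arrays, np.dot performs matrix multiplication, and the `@` operator gives the same result.

```python
import numpy as np

A = np.array([[4, 2],
              [0, 3]])
B = np.array([[4, 0],
              [1, 4]])

# Matrix multiplication in one call
C = np.dot(A, B)
print(C.tolist())  # [[18, 8], [3, 12]]

# Equivalent for 2-D arrays
same_result = np.array_equal(C, A @ B)
print(same_result)  # True
```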
As you may recall from vector dot products, two vectors must have the same length in order to have a dot product. Each dot product operation in a matrix multiplication must follow this rule. Dot products are done between the rows of the first matrix and the columns of the second matrix. Thus, the rows of the first matrix and columns of the second matrix must have the same length.
I want to emphasize an important point here: The length of a row is equal to the number of columns. Similarly, the length of a column is equal to the number of rows.
Consider the following matrix D:
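The entries below are placeholders chosen for illustration, since only the 3x2 shape matters for the argument:

```python
import numpy as np

# A 3x2 matrix: three rows, two columns (values are hypothetical)
D = np.array([[1, 2],
              [3, 4],
              [5, 6]])

row_length = len(D[0, :])     # number of columns
column_length = len(D[:, 0])  # number of rows

print(D.shape)        # (3, 2)
print(row_length)     # 2
print(column_length)  # 3
```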
D has three rows and two columns, so it’s a 3x2 matrix. The length of a row is two, which is the number of columns, and the length of a column is three, which is the number of rows.
That’s the long explanation, but the point is that to be able to perform a matrix multiplication, the number of columns in the first matrix must be equal to the number of rows in the second matrix.
For instance, we can multiply a 3x2 matrix with a 2x3 matrix.
The shape of the resulting matrix will be 3x3 because each row of the first matrix takes part in three dot products (one per column of the second matrix), and the first matrix has three rows. An easy way to determine the shape of the resulting matrix is to take the number of rows from the first one and the number of columns from the second one:
- 3x2 and 2x3 multiplication returns 3x3.
- 3x2 and 2x2 multiplication returns 3x2.
- 2x4 and 4x3 multiplication returns 2x3.
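The shape rule in the list above can be checked with a short sketch. The matrices here are filled with ones purely as placeholders, since only their shapes matter; the names `shape_pairs` and `result_shapes` are illustrative.

```python
import numpy as np

# (shape of first matrix, shape of second matrix)
shape_pairs = [((3, 2), (2, 3)),
               ((3, 2), (2, 2)),
               ((2, 4), (4, 3))]

result_shapes = []
for left, right in shape_pairs:
    product = np.dot(np.ones(left), np.ones(right))
    result_shapes.append(product.shape)
    print(left, "times", right, "gives", product.shape)

# Rows come from the first matrix, columns from the second
print(result_shapes)  # [(3, 3), (3, 2), (2, 3)]
```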
If the conditions we’ve been discussing are not met, matrix multiplication is impossible. Consider two 3x2 matrices, C and D:
If we try to multiply them, we will get the following value error:
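A minimal sketch of the failure, assuming placeholder values for C and D (only their 3x2 shapes matter). The exception is caught here so that the script keeps running and the message can be inspected.

```python
import numpy as np

# Two 3x2 matrices (values are hypothetical)
C = np.array([[1, 2], [3, 4], [5, 6]])
D = np.array([[6, 5], [4, 3], [2, 1]])

error_message = None
try:
    # Rows of C have length 2, columns of D have length 3
    np.dot(C, D)
except ValueError as err:
    error_message = str(err)
    print("ValueError:", error_message)

# Transposing D gives a 2x3 matrix, so the shapes line up
product = np.dot(C, D.T)
print(product.shape)  # (3, 3)
```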
We’ve now covered two basic but fundamental operations of linear algebra. These operations are the building blocks of complex machine learning and deep learning models, and many matrix multiplications are performed while a model is being optimized. That makes it important to understand these basics well.