Dev C++ Matrix Multiplication


  1. Matrix Multiplication In C
  1. Simple C Math. Math in C is very simple. Keep in mind that C mathematical operations follow a particular order much the same as high school math. For example, multiplication and division take precedence over addition and subtraction. The order in which these operations are evaluated can be changed using parentheses.
  2. Before multiplying, check whether the number of columns of the first matrix equals the number of rows of the second matrix. If they are equal, proceed; otherwise, output “Not Possible”. In recursive matrix multiplication, the three loops of the iterative version are implemented through recursive calls.
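The compatibility check and the recursive formulation described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `Matrix`, `multiplyRecursive`, and `multiply`, the use of `std::vector`, and the `bool` return value are all choices made for this example, not part of the original text.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <vector>

using Matrix = std::vector<std::vector<int>>;  // assumed representation

// Hypothetical helper: each call advances one of the three indices
// (i, j, k) that the iterative version would drive with nested loops.
void multiplyRecursive(const Matrix& a, const Matrix& b, Matrix& c,
                       std::size_t i, std::size_t j, std::size_t k) {
    assert(c.size() == a.size());
    if (i == a.size()) return;                    // every row is done
    if (j == b[0].size()) {                       // row i is done: next row
        multiplyRecursive(a, b, c, i + 1, 0, 0);
        return;
    }
    if (k == b.size()) {                          // cell (i, j) is done: next column
        multiplyRecursive(a, b, c, i, j + 1, 0);
        return;
    }
    c[i][j] += a[i][k] * b[k][j];                 // one term of the dot product
    multiplyRecursive(a, b, c, i, j, k + 1);
}

// Returns false and prints "Not Possible" when the column count of a
// differs from the row count of b; otherwise stores the product in c.
bool multiply(const Matrix& a, const Matrix& b, Matrix& c) {
    if (a.empty() || b.empty() || a[0].size() != b.size()) {
        std::cout << "Not Possible\n";
        return false;
    }
    c.assign(a.size(), std::vector<int>(b[0].size(), 0));
    multiplyRecursive(a, b, c, 0, 0, 0);
    return true;
}
```

The recursion depth grows with the product of the three dimensions, so this form suits small matrices; for large ones the iterative loops are the safer choice.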
This is my solution; could someone please help me correct it? The question is: using OOP, implement a matrix class that provides the basic matrix operations (addition, subtraction, multiplication, inverse, transpose).
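#include <iostream>
#include <stdexcept>
#include <utility>
using namespace std;

class matrix
{
    int **p, m, n;

public:
    // Allocate a row-by-col matrix with every element set to zero.
    matrix(int row = 2, int col = 2) : m(row), n(col)
    {
        p = new int *[m];
        for (int i = 0; i < m; i++)
            p[i] = new int[n]();
    }
    // Deep copy. The operators pass and return matrices by value, so the
    // default member-wise copy would free the same rows twice.
    matrix(const matrix &other) : m(other.m), n(other.n)
    {
        p = new int *[m];
        for (int i = 0; i < m; i++)
        {
            p[i] = new int[n];
            for (int j = 0; j < n; j++)
                p[i][j] = other.p[i][j];
        }
    }
    // Copy-and-swap assignment: the by-value parameter is already a copy.
    matrix &operator=(matrix other)
    {
        swap(p, other.p);
        swap(m, other.m);
        swap(n, other.n);
        return *this;
    }
    ~matrix()
    {
        for (int i = 0; i < m; i++)
            delete[] p[i]; // memory from new[] must be released with delete[]
        delete[] p;
    }
    void accept()
    {
        cout << "Enter matrix elements:";
        for (int i = 0; i < m; i++)
        {
            for (int j = 0; j < n; j++)
            {
                cin >> p[i][j];
            }
        }
    }
    void display()
    {
        cout << "The matrix is:";
        for (int i = 0; i < m; i++)
        {
            cout << endl;
            for (int j = 0; j < n; j++)
            {
                cout << p[i][j] << ' ';
            }
        }
    }
    matrix operator+(matrix m2)
    {
        // Addition needs two matrices of the same shape.
        if (m != m2.m || n != m2.n)
            throw invalid_argument("Not Possible");
        matrix T(m, n);
        for (int i = 0; i < m; i++)
        {
            for (int j = 0; j < n; j++)
            {
                T.p[i][j] = p[i][j] + m2.p[i][j];
            }
        }
        return T;
    }
    friend matrix operator*(matrix, matrix);
};

matrix operator*(matrix a, matrix b)
{
    // The product exists only when the columns of a match the rows of b.
    if (a.n != b.m)
        throw invalid_argument("Not Possible");
    matrix T(a.m, b.n);
    for (int i = 0; i < a.m; i++)
    {
        for (int k = 0; k < b.n; k++)
        {
            T.p[i][k] = 0;
            for (int j = 0; j < a.n; j++)
            {
                T.p[i][k] += a.p[i][j] * b.p[j][k];
            }
        }
    }
    return T;
}

int main()
{
    matrix a(2, 2), b(2, 2);
    a.accept();
    b.accept();
    matrix c = a * b;
    c.display();
    cout << endl;
    return 0;
}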

This step-by-step walkthrough demonstrates how to use C++ AMP to accelerate the execution of matrix multiplication. Two algorithms are presented, one without tiling and one with tiling.

Prerequisites

Before you start:

  • Read C++ AMP Overview.

  • Read Using Tiles.

  • Make sure that you are running at least Windows 7, or Windows Server 2008 R2.

To create the project

Instructions for creating a new project vary depending on which version of Visual Studio you have installed.

To create the project in Visual Studio 2019

  1. On the menu bar, choose File > New > Project to open the Create a New Project dialog box.

  2. At the top of the dialog, set Language to C++, set Platform to Windows, and set Project type to Console.

  3. From the filtered list of project types, choose Empty Project then choose Next. In the next page, enter MatrixMultiply in the Name box to specify a name for the project, and specify the project location if desired.

  4. Choose the Create button to create the client project.

  5. In Solution Explorer, open the shortcut menu for Source Files, and then choose Add > New Item.

  6. In the Add New Item dialog box, select C++ File (.cpp), enter MatrixMultiply.cpp in the Name box, and then choose the Add button.

To create a project in Visual Studio 2017 or 2015

  1. On the menu bar in Visual Studio, choose File > New > Project.

  2. Under Installed in the templates pane, select Visual C++.

  3. Select Empty Project, enter MatrixMultiply in the Name box, and then choose the OK button.

  4. Choose the Next button.

  5. In Solution Explorer, open the shortcut menu for Source Files, and then choose Add > New Item.

  6. In the Add New Item dialog box, select C++ File (.cpp), enter MatrixMultiply.cpp in the Name box, and then choose the Add button.

Multiplication without tiling

In this section, consider the multiplication of two matrices, A and B, which are defined as follows:

A is a 3-by-2 matrix and B is a 2-by-3 matrix. The product of multiplying A by B is the following 3-by-3 matrix. The product is calculated by multiplying the rows of A by the columns of B element by element.
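The figures that defined A and B are not reproduced in this text. As an illustration only, assume the following values (an assumption for this example, not confirmed by the surrounding text). Each entry of the product is the dot product of one row of A with one column of B:

```latex
\[
A = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix},
\qquad
B = \begin{pmatrix} 7 & 8 & 9 \\ 10 & 11 & 12 \end{pmatrix}
\]
\[
(AB)_{ij} = \sum_{k=1}^{2} A_{ik} B_{kj},
\qquad
AB = \begin{pmatrix} 47 & 52 & 57 \\ 64 & 71 & 78 \\ 81 & 90 & 99 \end{pmatrix}
\]
```

For instance, the top-left entry is (1 * 7) + (4 * 10) = 47.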

To multiply without using C++ AMP

  1. Open MatrixMultiply.cpp and use the following code to replace the existing code.

    The algorithm is a straightforward implementation of the definition of matrix multiplication. It does not use any parallel or threaded algorithms to reduce the computation time.

  2. On the menu bar, choose File > Save All.

  3. Press F5 to start debugging and verify that the output is correct.

  4. Press Enter to exit the application.
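The code listing for step 1 is not reproduced in this text. A minimal sketch of the straightforward algorithm it describes might look like the following. The function name `MultiplySequential`, the `Matrix` alias, and the use of `std::vector` are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;  // assumed representation

// Direct implementation of the definition: product[row][col] is the dot
// product of a row of a with a column of b. No parallelism is involved.
Matrix MultiplySequential(const Matrix& a, const Matrix& b) {
    assert(!a.empty() && !b.empty() && a[0].size() == b.size());
    const std::size_t rows = a.size();
    const std::size_t inner = b.size();
    const std::size_t cols = b[0].size();

    Matrix product(rows, std::vector<int>(cols, 0));
    for (std::size_t row = 0; row < rows; ++row) {
        for (std::size_t col = 0; col < cols; ++col) {
            for (std::size_t k = 0; k < inner; ++k) {
                product[row][col] += a[row][k] * b[k][col];
            }
        }
    }
    return product;
}
```

Calling `MultiplySequential` with a 3-by-2 matrix whose rows are {1, 4}, {2, 5}, {3, 6} and a 2-by-3 matrix whose rows are {7, 8, 9}, {10, 11, 12} (illustrative values) gives a 3-by-3 result whose first row is 47, 52, 57.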

To multiply by using C++ AMP

  1. In MatrixMultiply.cpp, add the following code before the main method.

    The AMP code resembles the non-AMP code. The call to parallel_for_each starts one thread for each element in product.extent, and replaces the for loops for row and column. The value of the cell at the row and column is available in idx. You can access the elements of an array_view object by using either the [] operator and an index variable, or the () operator and the row and column variables. The example demonstrates both methods. The array_view::synchronize method copies the values of the product variable back to the productMatrix variable.

  2. Add the following include and using statements at the top of MatrixMultiply.cpp.

  3. Modify the main method to call the MultiplyWithAMP method.

  4. Press the Ctrl+F5 keyboard shortcut to run the application without debugging and verify that the output is correct.

  5. Press the Spacebar to exit the application.
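The AMP listing for this procedure is not reproduced in this text. Rather than guess at it, the sketch below is a plain C++ stand-in (it is not C++ AMP code) that mimics the structure step 1 describes: the matrices live in flat row-major buffers, as the data behind an array_view would, and a kernel runs once per index of the product's extent in place of the row and column loops. `Index2`, `ForEachIndex`, and `MultiplyPerElement` are hypothetical names. In real C++ AMP, the corresponding pieces are `index<2>`, `array_view`, and `parallel_for_each` from `<amp.h>`, and the kernel invocations run in parallel on the accelerator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Index2 { int row; int col; };  // hypothetical stand-in for index<2>

// Hypothetical stand-in for parallel_for_each: calls the kernel once for
// each element of a rows-by-cols extent. The calls run sequentially here,
// but the kernel must not rely on their order.
template <typename Kernel>
void ForEachIndex(int rows, int cols, Kernel kernel) {
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            kernel(Index2{r, c});
}

std::vector<int> MultiplyPerElement(const std::vector<int>& a,
                                    const std::vector<int>& b,
                                    int rows, int inner, int cols) {
    assert(a.size() == static_cast<std::size_t>(rows) * inner);
    assert(b.size() == static_cast<std::size_t>(inner) * cols);
    std::vector<int> product(static_cast<std::size_t>(rows) * cols, 0);

    ForEachIndex(rows, cols, [&](Index2 idx) {
        // Each invocation owns exactly one output cell, so no two
        // invocations ever write to the same location.
        for (int k = 0; k < inner; ++k) {
            product[idx.row * cols + idx.col] +=
                a[idx.row * inner + k] * b[k * cols + idx.col];
        }
    });
    return product;
}
```

Because every invocation writes only its own cell, the work can be distributed across threads without locking, which is the property that lets parallel_for_each replace the two outer loops.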

Multiplication with tiling

Tiling is a technique in which you partition data into equal-sized subsets, which are known as tiles. Three things change when you use tiling.

  • You can create tile_static variables. Access to data in tile_static space can be many times faster than access to data in the global space. An instance of a tile_static variable is created for each tile, and all threads in the tile have access to the variable. The primary benefit of tiling is the performance gain due to tile_static access.

  • You can call the tile_barrier::wait method to stop all of the threads in one tile at a specified line of code. You cannot guarantee the order that the threads will run in, only that all of the threads in one tile will stop at the call to tile_barrier::wait before they continue execution.

  • You have access to the index of the thread relative to the entire array_view object and the index relative to the tile. By using the local index, you can make your code easier to read and debug.
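The relationship between the global index, the tile, and the local index in the last point can be sketched in plain C++ as follows. `TiledIndex` and `MakeTiledIndex` are hypothetical names used only for this illustration; in C++ AMP the tiled_index object supplies this information for you.

```cpp
#include <cassert>

// Hypothetical record of the three views of one thread's position.
struct TiledIndex {
    int globalRow, globalCol;  // position within the whole extent
    int tileRow, tileCol;      // which tile the thread belongs to
    int localRow, localCol;    // position within that tile
};

// For square tiles of side tileSize, the tile number is the quotient and
// the local position is the remainder of the global position.
TiledIndex MakeTiledIndex(int globalRow, int globalCol, int tileSize) {
    assert(tileSize > 0 && globalRow >= 0 && globalCol >= 0);
    return TiledIndex{globalRow, globalCol,
                      globalRow / tileSize, globalCol / tileSize,
                      globalRow % tileSize, globalCol % tileSize};
}
```

For example, with 2x2 tiles the element at global position (3, 2) belongs to tile (1, 1) and has local position (1, 0).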

To take advantage of tiling in matrix multiplication, the algorithm must partition the matrix into tiles and then copy the tile data into tile_static variables for faster access. In this example, the matrix is partitioned into submatrices of equal size. The product is found by multiplying the submatrices. The two matrices and their product in this example are:

The matrices are partitioned into four 2x2 matrices, which are defined as follows:

The product of A and B can now be written and calculated as follows:

Because matrices a through h are 2x2 matrices, all of the products and sums of them are also 2x2 matrices. It also follows that the product of A and B is a 4x4 matrix, as expected. To quickly check the algorithm, calculate the value of the element in the first row, first column in the product. In the example, that would be the value of the element in the first row and first column of ae + bg. You only have to calculate the first column, first row of ae and bg for each term. That value for ae is (1 * 1) + (2 * 5) = 11. The value for bg is (3 * 1) + (4 * 5) = 23. The final value is 11 + 23 = 34, which is correct.
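The figures for this example are not reproduced in this text. The block layout below is inferred from the arithmetic in the paragraph above (a and b form the top half of A, e and g the left half of B), so treat the placement of the remaining blocks as an assumption:

```latex
\[
A = \begin{pmatrix} a & b \\ c & d \end{pmatrix},
\qquad
B = \begin{pmatrix} e & f \\ g & h \end{pmatrix},
\qquad
AB = \begin{pmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{pmatrix}
\]
\[
(AB)_{11} = (ae)_{11} + (bg)_{11}
          = (1 \cdot 1 + 2 \cdot 5) + (3 \cdot 1 + 4 \cdot 5)
          = 11 + 23 = 34
\]
```

Each block product, such as ae, is itself an ordinary 2x2 matrix product, which is what one tile computes from its local copies.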

To implement this algorithm, the code:

  • Uses a tiled_extent object instead of an extent object in the parallel_for_each call.

  • Uses a tiled_index object instead of an index object in the parallel_for_each call.

  • Creates tile_static variables to hold the submatrices.

  • Uses the tile_barrier::wait method to stop the threads for the calculation of the products of the submatrices.

To multiply by using AMP and tiling

  1. In MatrixMultiply.cpp, add the following code before the main method.

    This example is significantly different than the example without tiling. The code uses these conceptual steps:

    1. Copy the elements of tile[0,0] of a into locA. Copy the elements of tile[0,0] of b into locB. Notice that product is tiled, not a and b. Therefore, you use global indices to access a, b, and product. The call to tile_barrier::wait is essential. It stops all of the threads in the tile until both locA and locB are filled.

    2. Multiply locA and locB and put the results in product.

    3. Copy the elements of tile[0,1] of a into locA. Copy the elements of tile [1,0] of b into locB.

    4. Multiply locA and locB and add them to the results that are already in product.

    5. The multiplication of tile[0,0] is complete.

    6. Repeat for the other three tiles. There is no indexing specifically for the tiles and the threads can execute in any order. As each thread executes, the tile_static variables are created for each tile appropriately and the call to tile_barrier::wait controls the program flow.

    7. As you examine the algorithm closely, notice that each submatrix is loaded into tile_static memory twice. That data transfer does take time. However, once the data is in tile_static memory, access to the data is much faster. Because calculating the products requires repeated access to the values in the submatrices, there is an overall performance gain. For each algorithm, experimentation is required to find the optimal algorithm and tile size.

    In the non-AMP and non-tile examples, each element of A and B is accessed four times from the global memory to calculate the product. In the tile example, each element is accessed twice from the global memory and four times from the tile_static memory. That is not a significant performance gain. However, if A and B were 1024x1024 matrices and the tile size were 16, there would be a significant performance gain. In that case, each element would be copied into tile_static memory only 64 times (1024 / 16) and accessed from tile_static memory 1024 times.

  2. Modify the main method to call the MultiplyWithTiling method, as shown.

  3. Press the Ctrl+F5 keyboard shortcut to run the application without debugging and verify that the output is correct.

  4. Press the Spacebar to exit the application.
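The tiled listing is not reproduced in this text. The conceptual steps above can be emulated on the CPU with the sketch below, which is plain C++ and not C++ AMP code. `MultiplyTiledEmulated`, the constant `TS`, and the `globalReads` counter are assumptions made for this example; `locA` and `locB` follow the names used in the steps. The comments mark where the real kernel would call tile_barrier::wait.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int TS = 2;  // assumed tile size; the text uses 2x2 tiles

// Sequential emulation of the tiled algorithm for n-by-n row-major
// matrices. globalReads, if given, receives the number of element reads
// made from the "global" buffers a and b.
std::vector<int> MultiplyTiledEmulated(const std::vector<int>& a,
                                       const std::vector<int>& b,
                                       int n, long* globalReads = nullptr) {
    assert(n > 0 && n % TS == 0);
    const std::size_t count = static_cast<std::size_t>(n) * n;
    assert(a.size() == count && b.size() == count);

    std::vector<int> product(count, 0);
    long reads = 0;
    for (int tileRow = 0; tileRow < n; tileRow += TS) {
        for (int tileCol = 0; tileCol < n; tileCol += TS) {
            // One locA/locB pair per tile, like tile_static variables.
            int locA[TS][TS];
            int locB[TS][TS];
            for (int i = 0; i < n; i += TS) {
                // Each "thread" (r, c) of the tile copies one element of
                // a and one element of b out of global memory.
                for (int r = 0; r < TS; ++r) {
                    for (int c = 0; c < TS; ++c) {
                        locA[r][c] = a[(tileRow + r) * n + (i + c)];
                        locB[r][c] = b[(i + r) * n + (tileCol + c)];
                        reads += 2;
                    }
                }
                // tile_barrier::wait would go here: locA and locB must be
                // full before any thread reads them.
                for (int r = 0; r < TS; ++r) {
                    for (int c = 0; c < TS; ++c) {
                        int sum = 0;
                        for (int k = 0; k < TS; ++k)
                            sum += locA[r][k] * locB[k][c];
                        product[(tileRow + r) * n + (tileCol + c)] += sum;
                    }
                }
                // A second wait would go here, so that no thread overwrites
                // locA or locB while another is still reading them.
            }
        }
    }
    if (globalReads != nullptr) *globalReads = reads;
    return product;
}
```

With 4x4 inputs, the counter reports 64 global reads for the 32 elements of a and b combined, that is, two reads per element, which matches the access count given in the text for the tiled example. If both inputs have the rows {1, 2, 3, 4}, {5, 6, 7, 8}, {1, 2, 3, 4}, {5, 6, 7, 8} (illustrative values consistent with the arithmetic earlier in this section), the first element of the product is 34.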

See also

C++ AMP (C++ Accelerated Massive Parallelism)
Walkthrough: Debugging a C++ AMP Application