CUDA Networks
matrix_multiply.cu File Reference

Implementation of the Matrix::multiply method for GPU-accelerated matrix multiplication.

#include "matrix.h"
#include <cuda_runtime.h>
#include <stdexcept>
#include <string>


Functions

__global__ void matrixMultiplyKernel (const double *a, const double *b, double *c, int m, int n, int k)
 CUDA kernel for matrix multiplication.

Detailed Description

Implementation of the Matrix::multiply method for GPU-accelerated matrix multiplication.

Definition in file matrix_multiply.cu.

Function Documentation

◆ matrixMultiplyKernel()

__global__ void matrixMultiplyKernel (const double *a, const double *b, double *c, int m, int n, int k)

CUDA kernel for matrix multiplication.

Parameters
a	Pointer to the first input matrix data.
b	Pointer to the second input matrix data.
c	Pointer to the output matrix data.
m	Number of rows in matrix A.
n	Number of columns in matrix A / rows in matrix B.
k	Number of columns in matrix B.

Definition at line 20 of file matrix_multiply.cu.

__global__ void matrixMultiplyKernel(const double *a, const double *b, double *c, int m, int n, int k) {
    // Calculate global thread indices
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;

    // Check if thread is within matrix bounds
    if (row < m && col < k) {
        // Initialize sum for dot product
        double sum = 0.0;

        // Perform dot product of row from A and column from B
        for (int i = 0; i < n; ++i) {
            sum += a[row * n + i] * b[i * k + col];
        }

        // Store the result in matrix C
        c[row * k + col] = sum;
    }
}
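Since each thread computes one element of C, with col mapped to the x dimension and row to the y dimension, a host-side launch must cover k columns in x and m rows in y. The sketch below shows one plausible launch configuration; it is an assumption, not the actual Matrix::multiply implementation, and the device pointers d_a, d_b, d_c are hypothetical names for buffers already allocated with cudaMalloc and populated with cudaMemcpy. The error handling follows the file's own includes (<stdexcept>, <string>):

```cuda
// Hypothetical launch: 16x16 thread blocks, grid rounded up to cover C.
dim3 block(16, 16);
dim3 grid((k + block.x - 1) / block.x,   // covers the k columns of C (x axis)
          (m + block.y - 1) / block.y);  // covers the m rows of C (y axis)
matrixMultiplyKernel<<<grid, block>>>(d_a, d_b, d_c, m, n, k);

// Surface launch failures, consistent with the file's <stdexcept>/<string> includes.
cudaError_t err = cudaGetLastError();
if (err != cudaSuccess) {
    throw std::runtime_error(std::string("Kernel launch failed: ") +
                             cudaGetErrorString(err));
}
cudaDeviceSynchronize();
```

The bounds check inside the kernel (row < m && col < k) is what makes this rounded-up grid safe: threads in the last partial blocks simply do nothing.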