CUDA Networks
matrix_multiply.cu File Reference

Implementation of the Matrix::multiply method for GPU-accelerated matrix multiplication.

#include "matrix.h"
#include <cuda_runtime.h>
#include <stdexcept>
#include <string>


Functions

__global__ void matrixMultiplyKernel (const double *a, const double *b, double *c, int m, int n, int k)
 CUDA kernel for matrix multiplication.

Detailed Description

Implementation of the Matrix::multiply method for GPU-accelerated matrix multiplication.

Definition in file matrix_multiply.cu.

Function Documentation

◆ matrixMultiplyKernel()

__global__ void matrixMultiplyKernel (const double *a, const double *b, double *c, int m, int n, int k)

CUDA kernel for matrix multiplication.

Parameters
a	Pointer to the first input matrix data.
b	Pointer to the second input matrix data.
c	Pointer to the output matrix data.
m	Number of rows in matrix A.
n	Number of columns in matrix A / rows in matrix B.
k	Number of columns in matrix B.

Definition at line 20 of file matrix_multiply.cu.

__global__ void matrixMultiplyKernel(const double *a, const double *b, double *c, int m, int n, int k) {
    // Calculate global thread indices
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;

    // Check if thread is within matrix bounds
    if (row < m && col < k) {
        // Initialize sum for dot product
        double sum = 0.0;

        // Perform dot product of row from A and column from B
        for (int i = 0; i < n; ++i) {
            sum += a[row * n + i] * b[i * k + col];
        }

        // Store the result in matrix C
        c[row * k + col] = sum;
    }
}
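Since each thread computes one element of C, with col mapped to the x dimension and row to the y dimension, a host-side launch must cover k columns in x and m rows in y. The sketch below shows one plausible launch configuration; it is an assumption, not the actual Matrix::multiply implementation, and the device pointers d_a, d_b, d_c are hypothetical names for buffers already allocated with cudaMalloc and populated with cudaMemcpy. The error handling follows the file's own includes (<stdexcept>, <string>):

```cuda
// Hypothetical launch: 16x16 thread blocks, grid rounded up to cover C.
dim3 block(16, 16);
dim3 grid((k + block.x - 1) / block.x,   // covers the k columns of C (x axis)
          (m + block.y - 1) / block.y);  // covers the m rows of C (y axis)
matrixMultiplyKernel<<<grid, block>>>(d_a, d_b, d_c, m, n, k);

// Surface launch failures, consistent with the file's <stdexcept>/<string> includes.
cudaError_t err = cudaGetLastError();
if (err != cudaSuccess) {
    throw std::runtime_error(std::string("Kernel launch failed: ") +
                             cudaGetErrorString(err));
}
cudaDeviceSynchronize();
```

The bounds check inside the kernel (row < m && col < k) is what makes this rounded-up grid safe: threads in the last partial blocks simply do nothing.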