CudaLight

C++ CUDA memory manager to use in conjunction with CudaLightKernels

Project maintained by pmontalb

C++ manager classes for the CudaLightKernels API. Low-level calls are handled by the static class DeviceManager, whereas the high-level infrastructure is delegated to the particular buffer type. Only contiguous-memory data structures have been implemented, as this project aims to offer a simplified alternative to the CUDA standard library. The implemented structures are:

Vector
Matrix (only column-wise)
3D Tensor
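"Column-wise" storage means that the elements of each column sit next to each other in memory. The index mapping can be sketched on the host as follows. This is a minimal illustration that assumes the conventional column-major layout, and a 3D tensor stored as consecutive column-major matrices; the helper names `columnMajorIndex` and `tensorIndex` are hypothetical and not part of CudaLight.

```cpp
#include <cassert>
#include <cstddef>

// Flat offset of element (row, col) in a column-major matrix with nRows rows.
// Hypothetical helper, for illustration only.
inline std::size_t columnMajorIndex(std::size_t row, std::size_t col, std::size_t nRows)
{
    assert(row < nRows);
    return row + col * nRows;
}

// Flat offset of element (row, col, slice) in a 3D tensor, assuming the tensor
// is laid out as consecutive column-major matrices of size nRows x nCols.
inline std::size_t tensorIndex(std::size_t row, std::size_t col, std::size_t slice,
                               std::size_t nRows, std::size_t nCols)
{
    assert(col < nCols);
    return columnMajorIndex(row, col, nRows) + slice * nRows * nCols;
}
```

With 10 rows, element (3, 2) would live at offset 3 + 2 * 10 = 23.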

Types

All data structures are templates with two arguments: the memory space and the math domain. The memory space indicates where the memory is allocated, i.e. host side (CPU) or device side (GPU). The math domain defines the element type of the buffer: integer, float or double.
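One way such a pair of template arguments could be modelled is sketched below. The enumerator names mirror the ones used by the typedefs in this README, but the enums are redefined here as stand-ins, and `ElementType`, `isOnDevice` and `bufferBytes` are hypothetical helpers rather than CudaLight's actual definitions.

```cpp
#include <cstddef>
#include <type_traits>

// Stand-in enums; the real definitions live in the library.
enum class MemorySpace { Host, Device };
enum class MathDomain { Int, Float, Double };

// Maps a math domain to the C++ type of a single element (hypothetical trait).
template <MathDomain md> struct ElementType;
template <> struct ElementType<MathDomain::Int>    { using type = int; };
template <> struct ElementType<MathDomain::Float>  { using type = float; };
template <> struct ElementType<MathDomain::Double> { using type = double; };

static_assert(std::is_same<ElementType<MathDomain::Float>::type, float>::value,
              "MathDomain::Float maps to float");

// The memory space is a compile-time property of the buffer type.
template <MemorySpace ms>
constexpr bool isOnDevice() { return ms == MemorySpace::Device; }

// Size in bytes of a buffer holding nElements elements of the given domain.
template <MathDomain md>
std::size_t bufferBytes(std::size_t nElements)
{
    return nElements * sizeof(typename ElementType<md>::type);
}
```

Because both arguments are resolved at compile time, a host buffer and a device buffer are distinct types, so mixing them by accident is a compile error rather than a runtime fault.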

Restrictions

Dynamic buffers are not allowed: every constructor requires a size, and a buffer cannot be resized after construction.
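The restriction can be pictured with a small host-only class whose size is fixed at construction. `FixedBuffer` is a hypothetical illustration and is not how CudaLight implements its buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical fixed-size host buffer: the size is chosen once, at construction.
class FixedBuffer
{
public:
    explicit FixedBuffer(std::size_t size, float value = 0.0f)
        : _size(size), _data(new float[size])
    {
        assert(size > 0);
        for (std::size_t i = 0; i < _size; ++i)
            _data[i] = value;
    }

    std::size_t size() const { return _size; }

    float& operator[](std::size_t i) { assert(i < _size); return _data[i]; }
    const float& operator[](std::size_t i) const { assert(i < _size); return _data[i]; }

    // Deliberately no default constructor, resize() or push_back().

private:
    const std::size_t _size;
    std::unique_ptr<float[]> _data;
};
```

Fixing the size up front means a single allocation per buffer, which matters more on the device, where allocations are comparatively expensive.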

For convenience’s sake the following typedefs have been defined:

Vector:

  typedef Vector<MemorySpace::Device, MathDomain::Int> GpuIntegerVector;
  typedef Vector<MemorySpace::Device, MathDomain::Float> GpuSingleVector;
  typedef GpuSingleVector GpuFloatVector;
  typedef Vector<MemorySpace::Device, MathDomain::Double> GpuDoubleVector;

  typedef Vector<MemorySpace::Host, MathDomain::Int> CpuIntegerVector;
  typedef Vector<MemorySpace::Host, MathDomain::Float> CpuSingleVector;
  typedef CpuSingleVector CpuFloatVector;
  typedef Vector<MemorySpace::Host, MathDomain::Double> CpuDoubleVector;
	
  typedef GpuSingleVector vec;
  typedef GpuDoubleVector dvec;
  typedef GpuIntegerVector ivec;

Matrix:

  typedef ColumnWiseMatrix<MemorySpace::Device, MathDomain::Int> GpuIntegerMatrix;
  typedef ColumnWiseMatrix<MemorySpace::Device, MathDomain::Float> GpuSingleMatrix;
  typedef GpuSingleMatrix GpuFloatMatrix;
  typedef ColumnWiseMatrix<MemorySpace::Device, MathDomain::Double> GpuDoubleMatrix;

  typedef ColumnWiseMatrix<MemorySpace::Host, MathDomain::Int> CpuIntegerMatrix;
  typedef ColumnWiseMatrix<MemorySpace::Host, MathDomain::Float> CpuSingleMatrix;
  typedef CpuSingleMatrix CpuFloatMatrix;
  typedef ColumnWiseMatrix<MemorySpace::Host, MathDomain::Double> CpuDoubleMatrix;

  typedef GpuSingleMatrix mat;
  typedef GpuDoubleMatrix dmat;
  typedef GpuIntegerMatrix imat;

Sample usage

Allocate a vector of 10 floats on GPU:
```
cl::GpuSingleVector gpuVector(10);
```

Allocate a vector of n integers on CPU, initialised to -1:

const unsigned nElements = 50;
cl::CpuIntegerVector cpuVector(nElements, -1);

Allocate a vector of n floats on GPU, initialised with linearly spaced values between -1 and 1:

const unsigned nElements = 50;
const float lowerBound = -1.0f;
const float upperBound =  1.0f;
cl::vec v = cl::LinSpace(lowerBound, upperBound, nElements);
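For comparing results on the host, a reference version of a linear space can be written in a few lines. The sketch assumes that both bounds are included, as with `numpy.linspace` and its default arguments; whether `cl::LinSpace` follows exactly this convention is an assumption, and `linSpaceReference` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side reference: n evenly spaced values from lower to upper, both included
// (assumed semantics, matching numpy.linspace with default arguments).
std::vector<float> linSpaceReference(float lower, float upper, std::size_t n)
{
    assert(n >= 2);
    std::vector<float> out(n);
    const float step = (upper - lower) / static_cast<float>(n - 1);
    for (std::size_t i = 0; i < n; ++i)
        out[i] = lower + step * static_cast<float>(i);
    out[n - 1] = upper; // pin the last point to avoid rounding drift
    return out;
}
```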

Add two vectors with cuBLAS:

const cl::vec a = cl::LinSpace(-1.0f, 1.0f, 100);
const cl::vec b = cl::RandomUniform(a.size());
const cl::vec c = a + b;

Element-wise product between vectors:

const cl::vec a = cl::LinSpace(-1.0f, 1.0f, 100);
const cl::vec b = cl::RandomUniform(a.size());
const cl::vec c = a % b;
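The `+` and `%` operators above both act element by element. A host-side reference, useful for checking device results, might look like the sketch below; the function names are hypothetical, and the code says nothing about how the library dispatches to cuBLAS or to its own kernels.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// c[i] = a[i] + b[i]
std::vector<float> addReference(const std::vector<float>& a, const std::vector<float>& b)
{
    assert(a.size() == b.size());
    std::vector<float> c(a.size());
    std::transform(a.begin(), a.end(), b.begin(), c.begin(), std::plus<float>());
    return c;
}

// c[i] = a[i] * b[i] (element-wise, or Hadamard, product)
std::vector<float> elementWiseProductReference(const std::vector<float>& a,
                                               const std::vector<float>& b)
{
    assert(a.size() == b.size());
    std::vector<float> c(a.size());
    std::transform(a.begin(), a.end(), b.begin(), c.begin(), std::multiplies<float>());
    return c;
}
```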

Matrix product:

const unsigned nRowsA = 10;
const unsigned nColsA = 15;
const unsigned nColsB = 20;
const cl::mat A(nRowsA, nColsA, 2.7182f);
const cl::mat B(nColsA, nColsB, 3.1415f);
const cl::mat C = A * B;

Matrix-vector product:

const unsigned nRowsA = 10;
const unsigned nColsA = 15;
const cl::mat A(nRowsA, nColsA, 2.7182f);
const cl::vec x(nColsA, 3.1415f);
const cl::vec y = A * x;
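Both products above follow the usual rule: the number of columns of the left operand must match the number of rows of the right one. A naive host reference for column-wise storage is sketched below. `multiplyReference` is a hypothetical helper meant for validating small cases, not a description of what the library does on the device.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// C = A * B for column-major matrices: A is m x k, B is k x n, C is m x n.
// A matrix-vector product is the special case n == 1.
std::vector<float> multiplyReference(const std::vector<float>& A,
                                     const std::vector<float>& B,
                                     std::size_t m, std::size_t k, std::size_t n)
{
    assert(A.size() == m * k);
    assert(B.size() == k * n);
    std::vector<float> C(m * n, 0.0f);
    for (std::size_t j = 0; j < n; ++j)          // column of B and C
        for (std::size_t p = 0; p < k; ++p)      // shared dimension
            for (std::size_t i = 0; i < m; ++i)  // row of A and C, contiguous in memory
                C[i + j * m] += A[i + p * m] * B[p + j * k];
    return C;
}
```

With the constant matrices used in the matrix product sample, every entry of C should equal 15 * 2.7182 * 3.1415 up to single-precision rounding.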

Serialization to text file (compatible with numpy.loadtxt):

const unsigned nRows = 10;
const unsigned nCols = 15;
std::ofstream f("matrix.cl");
cl::mat m(nRows, nCols);
f << m;
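For a text file to be readable by `numpy.loadtxt` with default arguments, each matrix row goes on its own line with whitespace-separated values. The sketch below produces that layout from column-wise data on the host. It assumes nothing about the exact format CudaLight writes, and `matrixToText` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>
#include <vector>

// Renders a column-major matrix as text: one row per line, values separated by spaces.
std::string matrixToText(const std::vector<float>& data, std::size_t nRows, std::size_t nCols)
{
    assert(data.size() == nRows * nCols);
    std::ostringstream out;
    // Enough significant digits to round-trip a float through text.
    out << std::setprecision(std::numeric_limits<float>::max_digits10);
    for (std::size_t i = 0; i < nRows; ++i)
    {
        for (std::size_t j = 0; j < nCols; ++j)
        {
            if (j > 0)
                out << ' ';
            out << data[i + j * nRows]; // column-major lookup of element (i, j)
        }
        out << '\n';
    }
    return out.str();
}
```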

Serialization to binary file (compatible with numpy.load and memory-mapped files; makes use of Npy++):

const unsigned nRows = 10;
const unsigned nCols = 15;
cl::mat m(nRows, nCols);
m.ToBinaryFile("matrix.npy");
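An `.npy` file is a short header followed by the raw array bytes. The sketch below builds a version 1.0 header for a 2D float32 array stored column-wise, which NumPy calls Fortran order. It only illustrates the file format: it assumes a little-endian host, `npyHeader` is a hypothetical helper, and the way Npy++ actually writes files may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Builds the header of an NPY version 1.0 file for an nRows x nCols float32
// array in Fortran (column-major) order. Assumes a little-endian host ('<f4').
std::string npyHeader(std::size_t nRows, std::size_t nCols)
{
    std::string dict = "{'descr': '<f4', 'fortran_order': True, 'shape': ("
                     + std::to_string(nRows) + ", " + std::to_string(nCols) + "), }";

    // Magic string (6 bytes) + version (2 bytes) + header length (2 bytes).
    const std::size_t preambleSize = 10;
    // Pad with spaces so that preamble + dictionary + '\n' is a multiple of 64 bytes.
    const std::size_t unpadded = preambleSize + dict.size() + 1;
    const std::size_t padding = (64 - unpadded % 64) % 64;
    dict.append(padding, ' ');
    dict.push_back('\n');
    assert(dict.size() <= 0xFFFF); // version 1.0 stores the length in 16 bits

    std::string header("\x93NUMPY", 6);
    header.push_back('\x01'); // major version
    header.push_back('\x00'); // minor version
    header.push_back(static_cast<char>(dict.size() & 0xFF));        // length, low byte
    header.push_back(static_cast<char>((dict.size() >> 8) & 0xFF)); // length, high byte
    return header + dict;
}
```

The array data would follow immediately after the header, written as raw floats in the same column-wise order in which the buffer is stored.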

Deserialization from text file (compatible with numpy.savetxt):

std::ifstream f1("matrix.cl");
cl::mat m = cl::MatrixFromInputStream(f1);
  
std::ifstream f2("vector.cl");
cl::vec v = cl::VectorFromInputStream(f2);

Deserialization from binary file (compatible with numpy.save and memory-mapped files; makes use of Npy++):

cl::mat m = cl::MatrixFromBinaryFile("matrix.npy");
  
cl::vec v = cl::VectorFromBinaryFile("v1.npy");