CUDA Documents


CUDA Toolkit v5.0 Release Notes

Getting Started Guides


CUDA Getting Started Guide for Linux
This guide discusses how to install and check for correct operation of the CUDA Development Tools on GNU/Linux systems.
CUDA Getting Started Guide for Mac OS X
This guide discusses how to install and check for correct operation of the CUDA Development Tools on Mac OS X systems.
CUDA Getting Started Guide for Microsoft Windows
This guide discusses how to install and check for correct operation of the CUDA Development Tools on Microsoft Windows systems.

Programming Guides


CUDA C Programming Guide
This guide provides a detailed discussion of the CUDA programming model and programming interface. It then describes the hardware implementation and provides guidance on how to achieve maximum performance. The appendixes include a list of all CUDA-enabled devices, a detailed description of all extensions to the C language, listings of supported mathematical functions, C++ features supported in host and device code, details on texture fetching, and technical specifications of various devices, and conclude by introducing the low-level driver API.
CUDA C Best Practices Guide
This guide presents established parallelization and optimization techniques and explains coding metaphors and idioms that can greatly simplify programming for CUDA-capable GPU architectures. The intent is to provide guidelines for obtaining the best performance from NVIDIA GPUs using the CUDA Toolkit.
Kepler Compatibility Guide for CUDA Applications
This application note is intended to help developers ensure that their NVIDIA CUDA applications are compatible with, and run effectively on, GPUs based on the NVIDIA Kepler architecture.
Tuning CUDA Applications for Kepler
Kepler is NVIDIA's next-generation architecture for CUDA compute applications. Applications that follow the best practices for the Fermi architecture should typically see speedups on the Kepler architecture without any code changes. This guide summarizes the ways that an application can be fine-tuned to gain additional speedups by leveraging Kepler architectural features.
CUDA Dynamic Parallelism
This document provides guidance on how to design and develop software that takes advantage of the new Dynamic Parallelism capabilities introduced with CUDA 5.0.
PTX ISA Version 3.1
This guide provides detailed instructions on the use of PTX, a low-level parallel thread execution virtual machine and instruction set architecture (ISA). PTX exposes the GPU as a data-parallel computing device.

Reference Manuals


CUDA Runtime API
The CUDA runtime API.
CUDA Driver API
The CUDA driver API.
CUDA Math API
The CUDA math API.
CUBLAS
The CUBLAS library is an implementation of BLAS (Basic Linear Algebra Subprograms) on top of the NVIDIA CUDA runtime. It allows the user to access the computational resources of an NVIDIA Graphics Processing Unit (GPU), but does not auto-parallelize across multiple GPUs.
CUFFT
The CUFFT library user guide.
CURAND
The CURAND library user guide.
CUSPARSE
The CUSPARSE library user guide.
Thrust
The Thrust getting started guide.

CUDA Samples


CUDA Samples
This document contains a complete listing of the code samples that are included with the NVIDIA CUDA Toolkit. It describes each code sample, lists the minimum GPU specification, and provides links to the source code and white papers if available.
Release Notes
This document provides instructions for installing the NVIDIA CUDA Toolkit and NVIDIA CUDA Samples, as well as guidelines for creating your own CUDA projects. It includes chapters on known issues as well as an FAQ. It covers Windows, Linux, and Mac OS X.
Guide to New Features
This document serves as a guide to the new code samples as they relate to the new CUDA Toolkit feature list.
Getting Started
This document introduces a set of samples that can be run as an introduction to CUDA. Most of these samples use the CUDA runtime API; those that use the CUDA driver API are explicitly noted.

Tools Manuals


CUDA Compiler Driver NVCC
This document is a reference guide on the use of the CUDA compiler driver nvcc. Instead of being a specific CUDA compilation driver, nvcc mimics the behavior of the GNU compiler gcc: it accepts a range of conventional compiler options, such as those for defining macros and include/library paths and for steering the compilation process.
CUDA-GDB
CUDA-GDB is the NVIDIA tool for debugging CUDA applications running on Linux and Mac. It provides developers with a mechanism for debugging CUDA applications running on actual hardware. CUDA-GDB is an extension to the x86-64 port of GDB, the GNU Project debugger.
CUDA-MEMCHECK
CUDA-MEMCHECK is a suite of runtime tools capable of precisely detecting out-of-bounds and misaligned memory access errors, checking device allocation leaks, reporting hardware errors, and identifying shared memory data access hazards.
Profiler User's Guide
This is the guide to the Profiler.
Nsight Eclipse Edition Getting Started Guide
The Nsight Eclipse Edition getting started guide.

Miscellaneous


CUPTI
The CUPTI API.
Debugger API
The CUDA debugger API.
RDMA for GPUDirect
RDMA for GPUDirect is a technology for Kepler-class GPUs and CUDA 5.0 that enables a direct path for communication between the GPU and a peer device on the PCI Express bus, using standard features of PCI Express, when the devices share the same upstream root complex. This document introduces the technology and describes the steps necessary to enable an RDMA for GPUDirect connection to NVIDIA GPUs within the Linux device driver model.