I have a simple program (given below) that calls cudaMalloc and then immediately prints the GPU memory info.

#include <cstdio>
#include <cuda_runtime.h>

int main(){
    int *ptr;
    // Allocate room for 10000 ints on the device
    cudaError_t rc = cudaMalloc(&ptr, sizeof(int) * 10000);
    if (rc != cudaSuccess) {
        printf("Could not allocate memory: %s\n", cudaGetErrorString(rc));
        return 1;
    }
    cudaDeviceSynchronize();

    size_t free_byte ;
    size_t total_byte ;

    // Query free and total device memory, in bytes
    cudaMemGetInfo( &free_byte, &total_byte ) ;

    cudaDeviceSynchronize();

    double free_db = (double)free_byte ;
    double total_db = (double)total_byte ;
    double used_db = total_db - free_db ;

    // Convert bytes to MB before printing
    printf("GPU memory usage: used = %f MB, free = %f MB, total = %f MB\n",
    used_db/1024.0/1024.0, free_db/1024.0/1024.0, total_db/1024.0/1024.0);

    cudaFree(ptr);
    return 0;
}

This code prints the same value of used memory regardless of how much memory I allocate in the cudaMalloc statement. What am I doing wrong here?

Ricky Dev
  • Your measurement is being swamped by factors you haven't considered. 1. CUDA has an overhead (hundreds of megabytes), while you are allocating 40 KB. 2. Recent versions of CUDA may allocate more than you request, effectively quantizing your request. As a result, small changes in your request won't be visible here. 3. The function call may be rounding results to the nearest 0.5 MB. Try allocating 500 MB; then you will see the used number change. – Robert Crovella May 31 '22 at 18:02
  • @RobertCrovella thank you. This cleared up my confusion. I could have actually avoided this if I had properly calculated how much I am allocating and compared it with how much it is printing. If you want to make this an answer, I'll accept it – Ricky Dev May 31 '22 at 18:18
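
A minimal sketch of the check suggested in the comment above, assuming a single process is using the GPU: read the free memory once before the allocation and once after, so the fixed CUDA overhead cancels out of the difference. The 500 MB request size comes from the comment; the helper name to_mib is made up for illustration, and the reported drop may be somewhat larger than the request if the allocator rounds it up.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: convert a byte count to MiB for printing.
static double to_mib(size_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}

int main() {
    size_t free_before = 0, free_after = 0, total = 0;

    // Baseline reading. The runtime typically sets up its context on this
    // first call, so the context overhead is already part of the baseline.
    if (cudaMemGetInfo(&free_before, &total) != cudaSuccess) {
        fprintf(stderr, "cudaMemGetInfo failed\n");
        return 1;
    }

    // Assumption: 500 MB, as suggested in the comment. Any size far above
    // the allocator's rounding granularity should show up clearly.
    const size_t request = 500ull * 1024 * 1024;
    void *ptr = nullptr;
    cudaError_t rc = cudaMalloc(&ptr, request);
    if (rc != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(rc));
        return 1;
    }

    if (cudaMemGetInfo(&free_after, &total) != cudaSuccess) {
        fprintf(stderr, "cudaMemGetInfo failed\n");
        cudaFree(ptr);
        return 1;
    }

    // Subtract as doubles so the result stays meaningful even if another
    // process released memory between the two readings.
    printf("requested = %.2f MiB, free dropped by = %.2f MiB\n",
           to_mib(request), to_mib(free_before) - to_mib(free_after));

    cudaFree(ptr);
    return 0;
}
```

Running the same program with the original 40,000-byte request should show a drop that is either zero or a fixed rounded-up amount, which is why the printed "used" value did not appear to change.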

0 Answers