Deck 2: Nvidia CUDA and GPU Programming

Question 1
NVIDIA CUDA Warp is made up of how many threads?

A) 512
B) 1024
C) 312
D) 32

Answer: 32

Question 2
Out-of-order instruction execution is not possible on GPUs. (True/False)

Answer: False

Question 3
CUDA supports programming in ....

A) C or C++ only
B) Java, Python, and more
C) C, C++, third-party wrappers for Java, Python, and more
D) Pascal

Answer: C, C++, third-party wrappers for Java, Python, and more

Question 4
FADD, FMAD, FMIN, FMAX are ----- supported by the Scalar Processors of an NVIDIA GPU.

A) 32-bit IEEE floating-point instructions
B) 32-bit integer instructions
C) Both
D) None of the above

Question 5
Each streaming multiprocessor (SM) of CUDA hardware has ------ scalar processors (SPs).

A) 1024
B) 128
C) 512
D) 8

Question 6
Each NVIDIA GPU has ------ Streaming Multiprocessors.

A) 8
B) 1024
C) 512
D) 16

Question 7
CUDA provides ------- warp and thread scheduling. Also, the overhead of thread creation is on the order of ----.

A) "programming-overhead", 2 clocks
B) "zero-overhead", 1 clock
C) 64, 2 clocks
D) 32, 1 clock

Question 8
Each warp of a GPU receives a single instruction and "broadcasts" it to all of its threads. This is a ---- operation.

A) SIMD (single instruction, multiple data)
B) SIMT (single instruction, multiple thread)
C) SISD (single instruction, single data)
D) SIST (single instruction, single thread)

Question 9
Limitations of a CUDA kernel:

A) Recursion, call stack, static variable declarations
B) No recursion, no call stack, no static variable declarations
C) Recursion, no call stack, static variable declarations
D) No recursion, call stack, no static variable declarations

Question 10
What is a Unified Virtual Machine?

A) It is a technique that allows both the CPU and the GPU to read from a single virtual machine simultaneously.
B) It is a technique for managing separate host and device memory spaces.
C) It is a technique for executing device code on the host and host code on the device.
D) It is a technique for executing general-purpose programs on the device instead of the host.

Question 11
_______ became the first language specifically designed by a GPU company to facilitate general-purpose computing on ____.

A) Python, GPUs
B) C, CPUs
C) CUDA C, GPUs
D) Java, CPUs

Question 12
The CUDA architecture consists of --------- for parallel computing kernels and functions.

A) RISC instruction set architecture
B) CISC instruction set architecture
C) ZISC instruction set architecture
D) PTX instruction set architecture

Question 13
CUDA stands for --------, designed by NVIDIA.

A) Common Union Discrete Architecture
B) Complex Unidentified Device Architecture
C) Compute Unified Device Architecture
D) Complex Unstructured Distributed Architecture

Question 14
The host processor spawns multithreaded tasks (or kernels, as they are known in CUDA) onto the GPU device.

Question 15
The NVIDIA G80 is a ---- CUDA core device, the NVIDIA G200 is a ---- CUDA core device, and the NVIDIA Fermi is a ---- CUDA core device.

A) 128, 256, 512
B) 32, 64, 128
C) 64, 128, 256
D) 256, 512, 1024

Question 16
NVIDIA 8-series GPUs offer --------.

A) 50-200 GFLOPS
B) 200-400 GFLOPS
C) 400-800 GFLOPS
D) 800-1000 GFLOPS

Question 17
IADD, IMUL24, IMAD24, IMIN, IMAX are ----------- supported by the Scalar Processors of an NVIDIA GPU.

A) 32-bit IEEE floating-point instructions
B) 32-bit integer instructions
C) Both
D) None of the above

Question 18
The CUDA hardware programming model supports:
A. Fully general data-parallel architecture;
B. General thread launch;
C. Global load-store;
D. Parallel data cache;
E. Scalar architecture;
F. Integers, bit operations

A) a, c, d, f
B) b, c, d, e
C) a, d, e, f
D) a, b, c, d, e, f

Question 19
In the CUDA memory model, the following memory types are available:
A. Registers;
B. Local memory;
C. Shared memory;
D. Global memory;
E. Constant memory;
F. Texture memory.

A) a, b, d, f
B) a, c, d, e, f
C) a, b, c, d, e, f
D) b, c, e, f

Question 20
What is the equivalent of this general C program in CUDA C:

int main( void )
{
    printf( "Hello, World!\n" );
    return 0;
}

A) int main( void )
{
    kernel<<<1,1>>>();
    printf( "Hello, World!\n" );
    return 0;
}

B) __global__ void kernel( void ) { }
int main( void )
{
    kernel<<<1,1>>>();
    printf( "Hello, World!\n" );
    return 0;
}

C) __global__ void kernel( void )
{
    kernel<<<1,1>>>();
    printf( "Hello, World!\n" );
    return 0;
}

D) __global__ int main( void )
{
    kernel<<<1,1>>>();
    printf( "Hello, World!\n" );
    return 0;
}

Question 21
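As a reference for the launch syntax these options differ on, here is a minimal, self-contained CUDA C sketch of the define-a-kernel-then-launch-it pattern (it mirrors the shape of option B; assumes an nvcc toolchain and a CUDA-capable GPU, and is an illustration rather than the card's official answer):

```cuda
#include <stdio.h>

// __global__ marks a function as device code that is callable from host code.
__global__ void kernel( void ) { }

int main( void )
{
    // <<<1,1>>> launches one block containing one thread.
    kernel<<<1,1>>>();
    printf( "Hello, World!\n" );
    return 0;
}
```

Build with, e.g., `nvcc hello.cu -o hello`; the empty kernel runs on the device while printf() runs on the host.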
A simple kernel for adding two integers:

__global__ void add( int *a, int *b, int *c ) { *c = *a + *b; }

where __global__ is a CUDA C keyword indicating that:

A) add() will execute on the device; add() will be called from the host
B) add() will execute on the host; add() will be called from the device
C) add() will be called and executed on the host
D) add() will be called and executed on the device

Question 22
If variable a is a host variable and dev_a is a device (GPU) variable, select the correct statement to allocate memory for dev_a:

A) cudaMalloc( &dev_a, sizeof( int ) )
B) malloc( &dev_a, sizeof( int ) )
C) cudaMalloc( (void**) &dev_a, sizeof( int ) )
D) malloc( (void**) &dev_a, sizeof( int ) )

Question 23
If variable a is a host variable and dev_a is a device (GPU) variable, select the correct statement to copy input from variable a to variable dev_a:

A) memcpy( dev_a, &a, size );
B) cudaMemcpy( dev_a, &a, size, cudaMemcpyHostToDevice );
C) memcpy( (void*) dev_a, &a, size );
D) cudaMemcpy( (void*) &dev_a, &a, size, cudaMemcpyDeviceToHost );

Question 24
The triple angle brackets in a statement inside the main function indicate what?

A) A call from host code to device code
B) A call from device code to host code
C) A less-than comparison
D) A greater-than comparison
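The last four cards (the add kernel, cudaMalloc, cudaMemcpy, and the triple-angle-bracket launch) fit together as one small program. The following is a minimal sketch combining those pieces, assuming an nvcc toolchain and a CUDA-capable device; error checking is omitted for brevity:

```cuda
#include <stdio.h>

// The kernel from question 21: executes on the device, called from the host.
__global__ void add( int *a, int *b, int *c ) { *c = *a + *b; }

int main( void )
{
    int a = 2, b = 7, c = 0;
    int *dev_a, *dev_b, *dev_c;

    // Allocate device memory (the cudaMalloc pattern from question 22).
    cudaMalloc( (void**)&dev_a, sizeof( int ) );
    cudaMalloc( (void**)&dev_b, sizeof( int ) );
    cudaMalloc( (void**)&dev_c, sizeof( int ) );

    // Copy inputs host -> device (the cudaMemcpy pattern from question 23).
    cudaMemcpy( dev_a, &a, sizeof( int ), cudaMemcpyHostToDevice );
    cudaMemcpy( dev_b, &b, sizeof( int ), cudaMemcpyHostToDevice );

    // Triple angle brackets: host code calling device code (question 24).
    add<<<1,1>>>( dev_a, dev_b, dev_c );

    // Copy the result device -> host and print it.
    cudaMemcpy( &c, dev_c, sizeof( int ), cudaMemcpyDeviceToHost );
    printf( "2 + 7 = %d\n", c );

    cudaFree( dev_a );
    cudaFree( dev_b );
    cudaFree( dev_c );
    return 0;
}
```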