Document different alpaka terms and it relationships #2275
base: develop
Changes from all commits
208d8e5
76280ec
1473738
8097c57
2c53167
@@ -0,0 +1,70 @@
Terms
=====

This page provides an overview of the terms used in ``alpaka`` and the relationships between them.

Platform
--------

* A ``platform`` contains information about the system, e.g. the available devices.
* Depending on the platform, it also contains a runtime context.
* A ``platform`` can be shared by many devices.

Device
------

* A ``device`` represents a compute unit, such as a CPU or a GPU.
* Each ``device`` is bound to a specific ``platform``.
* Each ``device`` can be used by many specific ``accelerators``.

Work division
-------------

* Describes the domain decomposition of a contiguous N-dimensional index domain into ``blocks``, ``threads`` and ``elements``. A ``block`` contains one or more ``threads`` and a ``thread`` processes one or more ``elements``.
* A ``work division`` has limitations depending on the ``kernel`` function and ``accelerator``.

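
The arithmetic behind such a decomposition can be sketched in plain C++. This is a minimal 1D illustration and not alpaka's API: ``WorkDiv1D``, ``ceilDiv``, ``divideWork`` and ``firstElement`` are hypothetical names, and the sketch assumes the block count is obtained by ceiling division of the extent by the number of elements one block covers.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 1D work division; not an alpaka type.
struct WorkDiv1D
{
    std::size_t blocks;          // number of blocks
    std::size_t threadsPerBlock; // threads in each block
    std::size_t elemsPerThread;  // elements processed by each thread
};

// Ceiling division for a positive divisor.
inline std::size_t ceilDiv(std::size_t a, std::size_t b)
{
    assert(b > 0);
    return (a + b - 1) / b;
}

// Choose enough blocks to cover `extent` elements.
inline WorkDiv1D divideWork(std::size_t extent, std::size_t threadsPerBlock, std::size_t elemsPerThread)
{
    std::size_t const elemsPerBlock = threadsPerBlock * elemsPerThread;
    return WorkDiv1D{ceilDiv(extent, elemsPerBlock), threadsPerBlock, elemsPerThread};
}

// Index of the first element handled by `thread` of `block`.
inline std::size_t firstElement(WorkDiv1D const& wd, std::size_t block, std::size_t thread)
{
    return (block * wd.threadsPerBlock + thread) * wd.elemsPerThread;
}
```

With these assumptions, ``divideWork(1000, 64, 4)`` yields 4 blocks, since each block covers 256 elements. The 4 blocks cover 1024 indices, so a kernel using such a division has to skip indices beyond the extent.
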
Accelerator
-----------

* Describes "how" a kernel work division is mapped to device threads.
* N-dimensional work divisions (1D, 2D, 3D) are supported.
* Holds implementations of shared memory, atomic operations, math operations etc.
* ``Accelerators`` are instantiated only when a kernel is executed, and can only be accessed in device code.
* Each device function can (should) be templated on the accelerator type, and take an accelerator as its first argument.
* The accelerator object can be used to extract the ``work division`` and the indices of the current block and thread.
* The accelerator type can be used to implement per-accelerator behaviours.
* An ``accelerator`` is bound to a specific ``platform``.

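
The pattern of templating a device function on the accelerator type and passing the accelerator as the first argument can be illustrated without alpaka. In the sketch below, ``SerialAcc``, ``WideAcc``, ``ChunkSize`` and ``scale`` are all hypothetical stand-ins written for this example; real alpaka accelerators expose their indices through library functions rather than public members.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for two accelerator types; not alpaka classes.
struct SerialAcc
{
    std::size_t threadIdx;   // index of the current toy "thread"
    std::size_t threadCount; // total number of toy "threads"
};

struct WideAcc
{
    std::size_t threadIdx;
    std::size_t threadCount;
};

// Per-accelerator behaviour selected from the accelerator type alone.
template<typename TAcc>
struct ChunkSize
{
    static constexpr std::size_t value = 1;
};

template<>
struct ChunkSize<WideAcc>
{
    static constexpr std::size_t value = 4;
};

// Device-function style: templated on the accelerator type,
// with the accelerator object as the first argument.
template<typename TAcc>
void scale(TAcc const& acc, std::vector<double>& data, double factor)
{
    assert(acc.threadCount > 0);
    // Strided loop: this "thread" handles threadIdx, threadIdx + threadCount, ...
    for(std::size_t i = acc.threadIdx; i < data.size(); i += acc.threadCount)
        data[i] *= factor;
}
```

The same ``scale`` body compiles for both toy accelerator types, while ``ChunkSize`` shows how a type-level trait can carry a per-accelerator choice.
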
Queue
-----

Review comment: I would add that a queue can be either "synchronous" (blocking) or "asynchronous" (non-blocking), and that this may determine how the work is submitted to the device, and if a synchronisation is performed after each operation.

* Stores tasks which should be executed on a ``device``.
* Operations can be ``TaskKernels``, ``HostTasks``, ``Events``, ``Sets`` and ``Copies``.
* A ``queue`` can be ``Blocking`` (the host thread waits until the API call has finished) or ``NonBlocking`` (the host thread continues after the API call, regardless of whether the call has finished).
* All operations in a queue are executed sequentially in FIFO order.
* Operations in different queues can run in parallel, even on blocking queues.
Review comment: With blocking queues, this requires using different host tasks, right?
* ``wait()`` can be called on a queue to block the calling host thread until all previously enqueued work has finished.
* Each ``queue`` is bound to a specific ``device``.

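
The difference between the two queue kinds, the FIFO ordering and the role of ``wait()`` can be modelled in a few lines of standard C++. This is a single-threaded toy model and not alpaka's implementation: ``ToyQueue`` and ``QueueKind`` are hypothetical names, and the non-blocking case is simulated by deferring the stored tasks until ``wait()`` is called, whereas a real asynchronous queue would run them on another thread or on the device.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Toy model only; not alpaka's queue implementation.
enum class QueueKind
{
    Blocking,
    NonBlocking
};

class ToyQueue
{
public:
    explicit ToyQueue(QueueKind kind) : m_kind(kind)
    {
    }

    void enqueue(std::function<void()> task)
    {
        assert(task); // an empty std::function cannot be executed
        if(m_kind == QueueKind::Blocking)
            task(); // the caller returns only after the task has finished
        else
            m_pending.push_back(std::move(task)); // the caller continues immediately
    }

    // Block the caller until all previously enqueued work has finished.
    void wait()
    {
        while(!m_pending.empty())
        {
            auto task = std::move(m_pending.front());
            m_pending.pop_front(); // FIFO: the oldest task runs first
            task();
        }
    }

    // Number of tasks that have been enqueued but not yet executed.
    std::size_t pending() const
    {
        return m_pending.size();
    }

private:
    QueueKind m_kind;
    std::deque<std::function<void()>> m_pending;
};
```

In this model, a task enqueued on a ``Blocking`` queue has already run when ``enqueue`` returns, while tasks on a ``NonBlocking`` queue are only guaranteed to have run after ``wait()``.
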
Task
----

* A ``TaskKernel`` contains the algorithm which should be executed on a ``device``.
* A ``HostTask`` is a functor without an ``acc`` argument, which can be enqueued and is always executed on the host device.

Event
-----

* An ``event`` is a marker in a ``queue``.
* ``events`` can be used to describe dependencies between different ``queues``.
* An ``event`` allows waiting until all previously enqueued work in a queue has finished.

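
The idea of an event as a marker can be sketched with a deferred FIFO queue in standard C++. This is a toy model under stated assumptions and not alpaka's API: ``ToyEvent``, ``DeferredQueue``, ``enqueueTask``, ``enqueueEvent`` and ``waitFor`` are hypothetical names, and tasks only run when the caller waits.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <utility>

// Toy model only; not alpaka's event or queue implementation.
struct ToyEvent
{
    bool reached = false; // set once the queue has processed the marker
};

class DeferredQueue
{
public:
    void enqueueTask(std::function<void()> task)
    {
        assert(task);
        m_pending.push_back(std::move(task));
    }

    // Enqueue a marker: it is reached only after all earlier tasks have run.
    void enqueueEvent(std::shared_ptr<ToyEvent> ev)
    {
        assert(ev);
        m_pending.push_back([ev] { ev->reached = true; });
    }

    // Run tasks in FIFO order until the event is reached or nothing is left.
    void waitFor(ToyEvent const& ev)
    {
        while(!ev.reached && !m_pending.empty())
        {
            auto task = std::move(m_pending.front());
            m_pending.pop_front();
            task();
        }
    }

    // Number of entries (tasks or markers) not yet processed.
    std::size_t pending() const
    {
        return m_pending.size();
    }

private:
    std::deque<std::function<void()>> m_pending;
};
```

Waiting for the event guarantees that everything enqueued before the marker has finished, while work enqueued after it may still be pending.
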
Set
---

* A ``Set`` sets a memory region to a specific value byte-wise.

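
Byte-wise means that the value is written to every byte of the region, not to every element. The standard ``std::memset`` has the same semantics, so it serves as an illustration here; ``setBytes`` is a hypothetical helper written for this example and not part of alpaka.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Fill `count` 32-bit integers byte-wise with `byteValue`, as std::memset does.
inline void setBytes(std::uint32_t* dst, int byteValue, std::size_t count)
{
    assert(dst != nullptr);
    std::memset(dst, byteValue, count * sizeof(std::uint32_t));
}
```

Setting a buffer of 32-bit integers to the byte value ``1`` therefore produces ``0x01010101`` in each element, not ``1``. Only values such as ``0`` give the same result element-wise and byte-wise.
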
Copy
----

* A ``Copy`` performs a deep memory copy from one memory location to another.