Cooperative Matrix Multiplication Example
This example demonstrates how to use cooperative matrix operations (also known as tensor cores on NVIDIA GPUs or simdgroup matrix operations on Apple GPUs) to perform efficient matrix multiplication.
Cooperative matrices allow a workgroup to collectively load, store, and perform matrix operations on small tiles of data, enabling hardware-accelerated matrix math.
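To make the tiling concrete, here is a minimal CPU sketch of the 8x8-tiled matrix multiplication that cooperative matrices accelerate in hardware. This is an illustration of the math only, not the GPU example itself; the tile size matches the Metal 8x8 configuration listed below, and the matrix size `N` is an arbitrary multiple of the tile size chosen for the sketch.

```rust
// Tile size matching the 8x8 cooperative matrix configuration.
const TILE: usize = 8;
// Illustrative matrix side length; must be a multiple of TILE here.
const N: usize = 16;

/// Multiply two row-major N x N matrices tile by tile. Each (ti, tj)
/// tile of C accumulates the products of A- and B-tiles over tk,
/// mirroring how a workgroup accumulates into a cooperative matrix.
fn tiled_matmul(a: &[f32], b: &[f32]) -> Vec<f32> {
    let mut c = vec![0.0f32; N * N];
    for ti in (0..N).step_by(TILE) {
        for tj in (0..N).step_by(TILE) {
            for tk in (0..N).step_by(TILE) {
                for i in ti..ti + TILE {
                    for j in tj..tj + TILE {
                        let mut acc = c[i * N + j];
                        for k in tk..tk + TILE {
                            acc += a[i * N + k] * b[k * N + j];
                        }
                        c[i * N + j] = acc;
                    }
                }
            }
        }
    }
    c
}

fn main() {
    // Multiplying by the identity matrix should return A unchanged.
    let a: Vec<f32> = (0..N * N).map(|v| v as f32).collect();
    let mut id = vec![0.0f32; N * N];
    for i in 0..N {
        id[i * N + i] = 1.0;
    }
    assert_eq!(tiled_matmul(&a, &id), a);
}
```

On the GPU, each workgroup performs the inner tile accumulation collectively in one hardware instruction instead of the scalar loops shown here.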
Note: This feature requires hardware support and is currently experimental. Use `adapter.cooperative_matrix_properties()` to query supported configurations:
- Metal (Apple): 8x8 f32, 8x8 f16, mixed precision (f16 inputs, f32 accumulator)
- Vulkan (AMD): Typically 16x16 f16
- Vulkan (NVIDIA): Varies by GPU generation
Structs§
- Dimensions
- ExecuteResults