naga/back/spv/mod.rs
/*!
Backend for [SPIR-V][spv] (Standard Portable Intermediate Representation).

# Layout of values in `uniform` buffers

WGSL's ["Internal Layout of Values"][ilov] rules specify the memory layout of
each WGSL type. The memory layout is important for data stored in `uniform` and
`storage` buffers, especially when exchanging data with CPU code.

Both WGSL and Vulkan specify some conditions that a type's memory layout
must satisfy in order to use that type in a `uniform` or `storage` buffer.
For `storage` buffers, the WGSL and Vulkan restrictions are compatible, but
for `uniform` buffers, WGSL allows some types that Vulkan does not, requiring
adjustments when emitting SPIR-V for `uniform` buffers.

## Padding in two-row matrices

SPIR-V provides detailed control over the layout of matrix types, and is
capable of describing the WGSL memory layout. However, Vulkan imposes
additional restrictions.

Vulkan's ["extended layout"][extended-layout] (also known as std140) rules
apply to types used in `uniform` buffers. Under these rules, matrices are
defined in terms of arrays of their vector type, and arrays are defined to have
an alignment equal to the alignment of their element type rounded up to a
multiple of 16. This means that each column of the matrix has a minimum
alignment of 16. WGSL, and consequently Naga IR, instead specifies a column
alignment equal to the alignment of the vector type, without rounding it up
to 16.
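
As a rough illustration of the difference, the column stride of a matrix with
`f32` components can be computed under each rule set. This is a sketch rather
than code from this backend: `round_up`, `vector_alignment`,
`wgsl_column_stride`, and `std140_column_stride` are hypothetical helper names,
and the alignments are those WGSL gives for `vecN<f32>`.

```rust
/// Round `value` up to the next multiple of `multiple` (hypothetical helper).
fn round_up(value: u32, multiple: u32) -> u32 {
    (value + multiple - 1) / multiple * multiple
}

/// Alignment in bytes of a WGSL `vecN<f32>` with `rows` components:
/// `vec2<f32>` aligns to 8 bytes, `vec3<f32>` and `vec4<f32>` to 16.
fn vector_alignment(rows: u32) -> u32 {
    match rows {
        2 => 8,
        3 | 4 => 16,
        _ => panic!("vectors have 2, 3, or 4 components"),
    }
}

/// Column stride under the WGSL rules: the column vector's own alignment.
fn wgsl_column_stride(rows: u32) -> u32 {
    vector_alignment(rows)
}

/// Column stride under the extended layout rules: the column vector's
/// alignment rounded up to a multiple of 16.
fn std140_column_stride(rows: u32) -> u32 {
    round_up(vector_alignment(rows), 16)
}
```

With these definitions the two rule sets agree for three-row and four-row
matrices and disagree only for two-row matrices (8 bytes versus 16), which is
why only two-row matrices need special handling.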

To compensate for this, for any `struct` used as a `uniform` buffer which
contains a two-row matrix, we declare an additional "std140 compatible" type
in which each column of the matrix has been decomposed into the containing
struct. For example, the following WGSL struct type:

```ignore
struct Baz {
    m: mat3x2<f32>,
}
```

is rendered as the SPIR-V struct type:

```ignore
OpTypeStruct %v2float %v2float %v2float
```

This has the effect that struct indices in Naga IR for such types do not
correspond to the struct indices used in SPIR-V. A mapping of struct indices
for these types is maintained in [`Std140CompatTypeInfo`].
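
One way to picture such an index mapping is sketched below. The `Member` enum
and the `member_indices` function are hypothetical and exist only for this
illustration. The sketch assumes that each Naga IR member maps to the index of
the first SPIR-V member that represents it, which may not match the backend's
actual bookkeeping in every detail.

```rust
/// Hypothetical stand-in for a struct member: either an ordinary member,
/// or a two-row matrix with the given number of columns.
enum Member {
    Other,
    TwoRowMatrix { columns: u32 },
}

/// For each Naga IR member index, compute the index of the first SPIR-V
/// member that represents it, assuming every two-row matrix is replaced by
/// one vector member per column.
fn member_indices(members: &[Member]) -> Vec<u32> {
    let mut next = 0;
    let mut indices = Vec::with_capacity(members.len());
    for member in members {
        indices.push(next);
        next += match *member {
            Member::Other => 1,
            Member::TwoRowMatrix { columns } => columns,
        };
    }
    indices
}
```

For a struct whose members are an ordinary member, a `mat3x2<f32>`, and another
ordinary member, the last member moves from index 2 in Naga IR to index 4 in
this model, because the matrix occupies three SPIR-V members.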

Additionally, any two-row matrix that is declared directly as a uniform
buffer, without being wrapped in a struct, is declared as a struct containing a
vector member for each column. Any array of two-row matrices in a uniform
buffer is declared as an array of structs, each containing a vector member for
each column. Any struct or array within a uniform buffer that contains a
member, or has a base type, requiring a std140 compatible type declaration
itself requires a std140 compatible type declaration.
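
The recursive nature of this rule can be sketched with a simplified type
model. The `Ty` enum and `needs_std140_compat` function below are hypothetical
and far simpler than Naga's real type arena; they only illustrate how the
requirement propagates from two-row matrices to the arrays and structs that
contain them.

```rust
/// Simplified, hypothetical model of types that can appear in a uniform buffer.
enum Ty {
    Scalar,
    Vector,
    Matrix { rows: u32 },
    Array { base: Box<Ty> },
    Struct { members: Vec<Ty> },
}

/// Returns `true` if `ty` would need a std140 compatible declaration in
/// this model: it is a two-row matrix, or it contains one.
fn needs_std140_compat(ty: &Ty) -> bool {
    match ty {
        Ty::Scalar | Ty::Vector => false,
        Ty::Matrix { rows } => *rows == 2,
        Ty::Array { base } => needs_std140_compat(base),
        Ty::Struct { members } => members.iter().any(needs_std140_compat),
    }
}
```

In this model a struct containing an array of `mat3x2` values needs a
compatible declaration, while a struct containing only scalars, vectors, and
three-row or four-row matrices does not.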

Whenever a value of such a type is [`loaded`], we insert code to convert the
loaded value from the std140 compatible type to the regular type. This occurs
in `BlockContext::write_checked_load`, making use of the wrapper function
defined by `Writer::write_wrapped_convert_from_std140_compat_type`. For matrices
that have been decomposed as separate columns in the containing struct, we load
each column separately, then compose the matrix in
`BlockContext::maybe_write_load_uniform_matcx2_struct_member`.
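
A CPU-side analogue of this conversion for the `Baz` example above might look
like the following. It is only a model of the idea: `Vec2`, `BazStd140`, `Baz`,
and `convert_from_std140` are hypothetical names, and the real backend emits
SPIR-V instructions rather than Rust code.

```rust
/// A two-component `f32` vector, modelled as a plain array.
type Vec2 = [f32; 2];

/// Hypothetical model of the std140 compatible declaration of `Baz`:
/// the matrix member is replaced by one vector member per column.
struct BazStd140 {
    m_col0: Vec2,
    m_col1: Vec2,
    m_col2: Vec2,
}

/// Hypothetical model of the regular `Baz` type, with `m` stored as an
/// array of three columns.
struct Baz {
    m: [Vec2; 3],
}

/// Load each column separately, then compose them into the matrix.
fn convert_from_std140(value: &BazStd140) -> Baz {
    Baz {
        m: [value.m_col0, value.m_col1, value.m_col2],
    }
}
```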

Whenever a column of a matrix that has been decomposed into its containing
struct is [`accessed`] with a constant index, we adjust the emitted access chain
to access from the containing struct instead, in `BlockContext::write_access_chain`.

Whenever a column of a uniform buffer two-row matrix is [`dynamically accessed`],
we must first load the matrix, converting it from its std140 compatible
type as described above, then access the column using the wrapper function
defined by `Writer::write_wrapped_matcx2_get_column`. This is handled by
`BlockContext::maybe_write_uniform_matcx2_dynamic_access`.
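
Conceptually, such a column getter selects one of a fixed set of columns based
on a runtime index. The sketch below models that on the CPU with hypothetical
names (`Vec2`, `Mat3x2`, `get_column`). Returning a zero vector for an
out-of-range index is an assumption made for the sketch, not a statement about
what the backend emits.

```rust
/// A two-component `f32` vector, modelled as a plain array.
type Vec2 = [f32; 2];

/// Hypothetical model of a loaded `mat3x2<f32>` value: three columns.
type Mat3x2 = [Vec2; 3];

/// Select a column by a runtime index. Each arm extracts a column with a
/// constant index, so no dynamic indexing of the matrix value is needed.
fn get_column(matrix: &Mat3x2, index: u32) -> Vec2 {
    match index {
        0 => matrix[0],
        1 => matrix[1],
        2 => matrix[2],
        // Assumption for this sketch only: out-of-range indices yield zeros.
        _ => [0.0; 2],
    }
}
```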

Note that this approach differs somewhat from the equivalent code in the HLSL
backend. For HLSL, all structs containing two-row matrices (or arrays of such)
have their declarations modified, not just those used as uniform buffers.
Two-row matrices and arrays of such only use modified type declarations when
used as uniform buffers, or additionally when used as a struct member in any
context. This avoids the need to convert struct values when loading from uniform
buffers, but when loading arrays and matrices from uniform buffers or from any
struct the conversion is still required. In contrast, the approach used here
always requires converting *any* affected type when loading from a uniform
buffer, but consistently *only* when loading from a uniform buffer. As a result,
we only have to handle loads and not stores, as uniform buffers are read-only.

[spv]: https://www.khronos.org/registry/SPIR-V/
[ilov]: https://gpuweb.github.io/gpuweb/wgsl/#internal-value-layout
[extended-layout]: https://docs.vulkan.org/spec/latest/chapters/interfaces.html#interfaces-resources-layout
[`loaded`]: crate::Expression::Load
[`accessed`]: crate::Expression::AccessIndex
[`dynamically accessed`]: crate::Expression::Access
*/

99mod block;
100mod f16_polyfill;
101mod helpers;
102mod image;
103mod index;
104mod instructions;
105mod layout;
106mod mesh_shader;
107mod ray;
108mod reclaimable;
109mod selection;
110mod subgroup;
111mod writer;
112
113pub use mesh_shader::{MeshReturnInfo, MeshReturnMember};
114pub use spirv::{Capability, SourceLanguage};
115
116use alloc::{string::String, vec::Vec};
117use core::ops;
118
119use spirv::Word;
120use thiserror::Error;
121
122use crate::arena::{Handle, HandleVec};
123use crate::back::TaskDispatchLimits;
124use crate::proc::{BoundsCheckPolicies, TypeResolution};
125
126#[derive(Clone)]
127struct PhysicalLayout {
128 magic_number: Word,
129 version: Word,
130 generator: Word,
131 bound: Word,
132 instruction_schema: Word,
133}
134
135#[derive(Default)]
136struct LogicalLayout {
137 capabilities: Vec<Word>,
138 extensions: Vec<Word>,
139 ext_inst_imports: Vec<Word>,
140 memory_model: Vec<Word>,
141 entry_points: Vec<Word>,
142 execution_modes: Vec<Word>,
143 debugs: Vec<Word>,
144 annotations: Vec<Word>,
145 declarations: Vec<Word>,
146 function_declarations: Vec<Word>,
147 function_definitions: Vec<Word>,
148}
149
150#[derive(Clone)]
151struct Instruction {
152 op: spirv::Op,
153 wc: u32,
154 type_id: Option<Word>,
155 result_id: Option<Word>,
156 operands: Vec<Word>,
157}
158
159const BITS_PER_BYTE: crate::Bytes = 8;
160
161#[derive(Clone, Debug, Error)]
162pub enum Error {
163 #[error("The requested entry point couldn't be found")]
164 EntryPointNotFound,
165 #[error("target SPIRV-{0}.{1} is not supported")]
166 UnsupportedVersion(u8, u8),
167 #[error("using {0} requires at least one of the capabilities {1:?}, but none are available")]
168 MissingCapabilities(&'static str, Vec<Capability>),
169 #[error("unimplemented {0}")]
170 FeatureNotImplemented(&'static str),
171 #[error("module is not validated properly: {0}")]
172 Validation(&'static str),
173 #[error("overrides should not be present at this stage")]
174 Override,
175 #[error(transparent)]
176 ResolveArraySizeError(#[from] crate::proc::ResolveArraySizeError),
177 #[error("module requires SPIRV-{0}.{1}, which isn't supported")]
178 SpirvVersionTooLow(u8, u8),
179 #[error("mapping of {0:?} is missing")]
180 MissingBinding(crate::ResourceBinding),
181}
182
183#[derive(Default)]
184struct IdGenerator(Word);
185
186impl IdGenerator {
187 const fn next(&mut self) -> Word {
188 self.0 += 1;
189 self.0
190 }
191}
192
193#[derive(Debug, Clone)]
194pub struct DebugInfo<'a> {
195 pub source_code: &'a str,
196 pub file_name: &'a str,
197 pub language: SourceLanguage,
198}
199
200/// A SPIR-V block to which we are still adding instructions.
201///
202/// A `Block` represents a SPIR-V block that does not yet have a termination
203/// instruction like `OpBranch` or `OpReturn`.
204///
205/// The `OpLabel` that starts the block is implicit. It will be emitted based on
206/// `label_id` when we write the block to a `LogicalLayout`.
207///
208/// To terminate a `Block`, pass the block and the termination instruction to
209/// `Function::consume`. This takes ownership of the `Block` and transforms it
210/// into a `TerminatedBlock`.
211struct Block {
212 label_id: Word,
213 body: Vec<Instruction>,
214}
215
216/// A SPIR-V block that ends with a termination instruction.
217struct TerminatedBlock {
218 label_id: Word,
219 body: Vec<Instruction>,
220}
221
222impl Block {
223 const fn new(label_id: Word) -> Self {
224 Block {
225 label_id,
226 body: Vec::new(),
227 }
228 }
229}
230
231struct LocalVariable {
232 id: Word,
233 instruction: Instruction,
234}
235
236struct ResultMember {
237 id: Word,
238 type_id: Word,
239 built_in: Option<crate::BuiltIn>,
240}
241
242struct EntryPointContext {
243 argument_ids: Vec<Word>,
244 results: Vec<ResultMember>,
245 task_payload_variable_id: Option<Word>,
246 mesh_state: Option<MeshReturnInfo>,
247}
248
249#[derive(Default)]
250struct Function {
251 signature: Option<Instruction>,
252 parameters: Vec<FunctionArgument>,
253 variables: crate::FastHashMap<Handle<crate::LocalVariable>, LocalVariable>,
254 /// Map from a local variable that is a ray query to its u32 tracker.
255 ray_query_initialization_tracker_variables:
256 crate::FastHashMap<Handle<crate::LocalVariable>, LocalVariable>,
257 /// Map from a local variable that is a ray query to its tracker for the t max.
258 ray_query_t_max_tracker_variables:
259 crate::FastHashMap<Handle<crate::LocalVariable>, LocalVariable>,
    /// List of local variables used as counters to ensure that all loops are bounded.
261 force_loop_bounding_vars: Vec<LocalVariable>,
262
263 /// A map from a Naga expression to the temporary SPIR-V variable we have
264 /// spilled its value to, if any.
265 ///
266 /// Naga IR lets us apply [`Access`] expressions to expressions whose value
267 /// is an array or matrix---not a pointer to such---but SPIR-V doesn't have
268 /// instructions that can do the same. So when we encounter such code, we
269 /// spill the expression's value to a generated temporary variable. That, we
270 /// can obtain a pointer to, and then use an `OpAccessChain` instruction to
271 /// do whatever series of [`Access`] and [`AccessIndex`] operations we need
272 /// (with bounds checks). Finally, we generate an `OpLoad` to get the final
273 /// value.
274 ///
275 /// [`Access`]: crate::Expression::Access
276 /// [`AccessIndex`]: crate::Expression::AccessIndex
277 spilled_composites: crate::FastIndexMap<Handle<crate::Expression>, LocalVariable>,
278
279 /// A set of expressions that are either in [`spilled_composites`] or refer
280 /// to some component/element of such.
281 ///
282 /// [`spilled_composites`]: Function::spilled_composites
283 spilled_accesses: crate::arena::HandleSet<crate::Expression>,
284
285 /// A map taking each expression to the number of [`Access`] and
286 /// [`AccessIndex`] expressions that uses it as a base value. If an
287 /// expression has no entry, its count is zero: it is never used as a
288 /// [`Access`] or [`AccessIndex`] base.
289 ///
290 /// We use this, together with [`ExpressionInfo::ref_count`], to recognize
291 /// the tips of chains of [`Access`] and [`AccessIndex`] expressions that
292 /// access spilled values --- expressions in [`spilled_composites`]. We
293 /// defer generating code for the chain until we reach its tip, so we can
294 /// handle it with a single instruction.
295 ///
296 /// [`Access`]: crate::Expression::Access
297 /// [`AccessIndex`]: crate::Expression::AccessIndex
298 /// [`ExpressionInfo::ref_count`]: crate::valid::ExpressionInfo
299 /// [`spilled_composites`]: Function::spilled_composites
300 access_uses: crate::FastHashMap<Handle<crate::Expression>, usize>,
301
302 blocks: Vec<TerminatedBlock>,
303 entry_point_context: Option<EntryPointContext>,
304}
305
306impl Function {
307 fn consume(&mut self, mut block: Block, termination: Instruction) {
308 block.body.push(termination);
309 self.blocks.push(TerminatedBlock {
310 label_id: block.label_id,
311 body: block.body,
312 })
313 }
314
315 fn parameter_id(&self, index: u32) -> Word {
316 match self.entry_point_context {
317 Some(ref context) => context.argument_ids[index as usize],
318 None => self.parameters[index as usize]
319 .instruction
320 .result_id
321 .unwrap(),
322 }
323 }
324}
325
326/// Characteristics of a SPIR-V `OpTypeImage` type.
327///
328/// SPIR-V requires non-composite types to be unique, including images. Since we
329/// use `LocalType` for this deduplication, it's essential that `LocalImageType`
330/// be equal whenever the corresponding `OpTypeImage`s would be. To reduce the
331/// likelihood of mistakes, we use fields that correspond exactly to the
332/// operands of an `OpTypeImage` instruction, using the actual SPIR-V types
333/// where practical.
334#[derive(Debug, PartialEq, Hash, Eq, Copy, Clone)]
335struct LocalImageType {
336 sampled_type: crate::Scalar,
337 dim: spirv::Dim,
338 flags: ImageTypeFlags,
339 image_format: spirv::ImageFormat,
340}
341
342bitflags::bitflags! {
343 /// Flags corresponding to the boolean(-ish) parameters to OpTypeImage.
344 #[derive(Clone, Copy, Debug, Eq, Hash, PartialEq)]
345 pub struct ImageTypeFlags: u8 {
346 const DEPTH = 0x1;
347 const ARRAYED = 0x2;
348 const MULTISAMPLED = 0x4;
349 const SAMPLED = 0x8;
350 }
351}
352
353impl LocalImageType {
354 /// Construct a `LocalImageType` from the fields of a `TypeInner::Image`.
355 fn from_inner(dim: crate::ImageDimension, arrayed: bool, class: crate::ImageClass) -> Self {
356 let make_flags = |multi: bool, other: ImageTypeFlags| -> ImageTypeFlags {
357 let mut flags = other;
358 flags.set(ImageTypeFlags::ARRAYED, arrayed);
359 flags.set(ImageTypeFlags::MULTISAMPLED, multi);
360 flags
361 };
362
363 let dim = spirv::Dim::from(dim);
364
365 match class {
366 crate::ImageClass::Sampled { kind, multi } => LocalImageType {
367 sampled_type: crate::Scalar { kind, width: 4 },
368 dim,
369 flags: make_flags(multi, ImageTypeFlags::SAMPLED),
370 image_format: spirv::ImageFormat::Unknown,
371 },
372 crate::ImageClass::Depth { multi } => LocalImageType {
373 sampled_type: crate::Scalar {
374 kind: crate::ScalarKind::Float,
375 width: 4,
376 },
377 dim,
378 flags: make_flags(multi, ImageTypeFlags::DEPTH | ImageTypeFlags::SAMPLED),
379 image_format: spirv::ImageFormat::Unknown,
380 },
381 crate::ImageClass::Storage { format, access: _ } => LocalImageType {
382 sampled_type: format.into(),
383 dim,
384 flags: make_flags(false, ImageTypeFlags::empty()),
385 image_format: format.into(),
386 },
387 crate::ImageClass::External => unimplemented!(),
388 }
389 }
390}
391
392/// A numeric type, for use in [`LocalType`].
393#[derive(Debug, PartialEq, Hash, Eq, Copy, Clone)]
394enum NumericType {
395 Scalar(crate::Scalar),
396 Vector {
397 size: crate::VectorSize,
398 scalar: crate::Scalar,
399 },
400 Matrix {
401 columns: crate::VectorSize,
402 rows: crate::VectorSize,
403 scalar: crate::Scalar,
404 },
405}
406
407impl NumericType {
408 const fn from_inner(inner: &crate::TypeInner) -> Option<Self> {
409 match *inner {
410 crate::TypeInner::Scalar(scalar) | crate::TypeInner::Atomic(scalar) => {
411 Some(NumericType::Scalar(scalar))
412 }
413 crate::TypeInner::Vector { size, scalar } => Some(NumericType::Vector { size, scalar }),
414 crate::TypeInner::Matrix {
415 columns,
416 rows,
417 scalar,
418 } => Some(NumericType::Matrix {
419 columns,
420 rows,
421 scalar,
422 }),
423 _ => None,
424 }
425 }
426
427 const fn scalar(self) -> crate::Scalar {
428 match self {
429 NumericType::Scalar(scalar)
430 | NumericType::Vector { scalar, .. }
431 | NumericType::Matrix { scalar, .. } => scalar,
432 }
433 }
434
435 const fn with_scalar(self, scalar: crate::Scalar) -> Self {
436 match self {
437 NumericType::Scalar(_) => NumericType::Scalar(scalar),
438 NumericType::Vector { size, .. } => NumericType::Vector { size, scalar },
439 NumericType::Matrix { columns, rows, .. } => NumericType::Matrix {
440 columns,
441 rows,
442 scalar,
443 },
444 }
445 }
446}
447
448/// A cooperative type, for use in [`LocalType`].
449#[derive(Debug, PartialEq, Hash, Eq, Copy, Clone)]
450enum CooperativeType {
451 Matrix {
452 columns: crate::CooperativeSize,
453 rows: crate::CooperativeSize,
454 scalar: crate::Scalar,
455 role: crate::CooperativeRole,
456 },
457}
458
459impl CooperativeType {
460 const fn from_inner(inner: &crate::TypeInner) -> Option<Self> {
461 match *inner {
462 crate::TypeInner::CooperativeMatrix {
463 columns,
464 rows,
465 scalar,
466 role,
467 } => Some(Self::Matrix {
468 columns,
469 rows,
470 scalar,
471 role,
472 }),
473 _ => None,
474 }
475 }
476}
477
478/// A SPIR-V type constructed during code generation.
479///
480/// This is the variant of [`LookupType`] used to represent types that might not
481/// be available in the arena. Variants are present here for one of two reasons:
482///
483/// - They represent types synthesized during code generation, as explained
484/// in the documentation for [`LookupType`].
485///
486/// - They represent types for which SPIR-V forbids duplicate `OpType...`
487/// instructions, requiring deduplication.
488///
489/// This is not a complete copy of [`TypeInner`]: for example, SPIR-V generation
490/// never synthesizes new struct types, so `LocalType` has nothing for that.
491///
492/// Each `LocalType` variant should be handled identically to its analogous
493/// `TypeInner` variant. You can use the [`Writer::localtype_from_inner`]
494/// function to help with this, by converting everything possible to a
495/// `LocalType` before inspecting it.
496///
497/// ## `LocalType` equality and SPIR-V `OpType` uniqueness
498///
/// The definition of `Eq` on `LocalType` is carefully chosen to help us follow
/// certain SPIR-V rules. SPIR-V §2.8 requires some classes of `OpType...`
/// instructions to be unique; for example, you can't have two `OpTypeInt 32 1`
/// instructions in the same module. All 32-bit signed integers must use the
/// same type id.
504///
505/// All SPIR-V types that must be unique can be represented as a `LocalType`,
506/// and two `LocalType`s are always `Eq` if SPIR-V would require them to use the
507/// same `OpType...` instruction. This lets us avoid duplicates by recording the
508/// ids of the type instructions we've already generated in a hash table,
509/// [`Writer::lookup_type`], keyed by `LocalType`.
510///
511/// As another example, [`LocalImageType`], stored in the `LocalType::Image`
512/// variant, is designed to help us deduplicate `OpTypeImage` instructions. See
513/// its documentation for details.
514///
515/// SPIR-V does not require pointer types to be unique - but different
516/// SPIR-V ids are considered to be distinct pointer types. Since Naga
517/// uses structural type equality, we need to represent each Naga
518/// equivalence class with a single SPIR-V `OpTypePointer`.
519///
520/// As it always must, the `Hash` implementation respects the `Eq` relation.
521///
522/// [`TypeInner`]: crate::TypeInner
523#[derive(Debug, PartialEq, Hash, Eq, Copy, Clone)]
524enum LocalType {
525 /// A numeric type.
526 Numeric(NumericType),
527 Cooperative(CooperativeType),
528 Pointer {
529 base: Word,
530 class: spirv::StorageClass,
531 },
532 Image(LocalImageType),
533 SampledImage {
534 image_type_id: Word,
535 },
536 Sampler,
537 BindingArray {
538 base: Handle<crate::Type>,
539 size: u32,
540 },
541 AccelerationStructure,
542 RayQuery,
543}
544
545/// A type encountered during SPIR-V generation.
546///
547/// In the process of writing SPIR-V, we need to synthesize various types for
548/// intermediate results and such: pointer types, vector/matrix component types,
549/// or even booleans, which usually appear in SPIR-V code even when they're not
550/// used by the module source.
551///
552/// However, we can't use `crate::Type` or `crate::TypeInner` for these, as the
553/// type arena may not contain what we need (it only contains types used
554/// directly by other parts of the IR), and the IR module is immutable, so we
555/// can't add anything to it.
556///
557/// So for local use in the SPIR-V writer, we use this type, which holds either
558/// a handle into the arena, or a [`LocalType`] containing something synthesized
559/// locally.
560///
561/// This is very similar to the [`proc::TypeResolution`] enum, with `LocalType`
562/// playing the role of `TypeInner`. However, `LocalType` also has other
563/// properties needed for SPIR-V generation; see the description of
564/// [`LocalType`] for details.
565///
566/// [`proc::TypeResolution`]: crate::proc::TypeResolution
567#[derive(Debug, PartialEq, Hash, Eq, Copy, Clone)]
568enum LookupType {
569 Handle(Handle<crate::Type>),
570 Local(LocalType),
571}
572
573impl From<LocalType> for LookupType {
574 fn from(local: LocalType) -> Self {
575 Self::Local(local)
576 }
577}
578
579#[derive(Debug, PartialEq, Clone, Hash, Eq)]
580struct LookupFunctionType {
581 parameter_type_ids: Vec<Word>,
582 return_type_id: Word,
583}
584
585#[derive(Debug, PartialEq, Clone, Hash, Eq)]
586enum LookupRayQueryFunction {
587 Initialize,
588 Proceed,
589 GenerateIntersection,
590 ConfirmIntersection,
591 GetVertexPositions { committed: bool },
592 GetIntersection { committed: bool },
593 Terminate,
594}
595
596#[derive(Debug)]
597enum Dimension {
598 Scalar,
599 Vector,
600 Matrix,
601 CooperativeMatrix,
602}
603
604/// Key used to look up an operation which we have wrapped in a helper
605/// function, which should be called instead of directly emitting code
606/// for the expression. See [`Writer::wrapped_functions`].
607#[derive(Debug, Eq, PartialEq, Hash)]
608enum WrappedFunction {
609 BinaryOp {
610 op: crate::BinaryOperator,
611 left_type_id: Word,
612 right_type_id: Word,
613 },
614 ConvertFromStd140CompatType {
615 r#type: Handle<crate::Type>,
616 },
617 MatCx2GetColumn {
618 r#type: Handle<crate::Type>,
619 },
620}
621
622/// A map from evaluated [`Expression`](crate::Expression)s to their SPIR-V ids.
623///
/// When we [emit] code to evaluate a given `Expression`, we record the
/// SPIR-V id of its value here, under its `Handle<Expression>` index.
626///
627/// A `CachedExpressions` value can be indexed by a `Handle<Expression>` value.
628///
629/// [emit]: index.html#expression-evaluation-time-and-scope
630#[derive(Default)]
631struct CachedExpressions {
632 ids: HandleVec<crate::Expression, Word>,
633}
634impl CachedExpressions {
635 fn reset(&mut self, length: usize) {
636 self.ids.clear();
637 self.ids.resize(length, 0);
638 }
639}
640impl ops::Index<Handle<crate::Expression>> for CachedExpressions {
641 type Output = Word;
642 fn index(&self, h: Handle<crate::Expression>) -> &Word {
643 let id = &self.ids[h];
644 if *id == 0 {
645 unreachable!("Expression {:?} is not cached!", h);
646 }
647 id
648 }
649}
650impl ops::IndexMut<Handle<crate::Expression>> for CachedExpressions {
651 fn index_mut(&mut self, h: Handle<crate::Expression>) -> &mut Word {
652 let id = &mut self.ids[h];
653 if *id != 0 {
654 unreachable!("Expression {:?} is already cached!", h);
655 }
656 id
657 }
658}
659impl reclaimable::Reclaimable for CachedExpressions {
660 fn reclaim(self) -> Self {
661 CachedExpressions {
662 ids: self.ids.reclaim(),
663 }
664 }
665}
666
667#[derive(Eq, Hash, PartialEq)]
668enum CachedConstant {
669 Literal(crate::proc::HashableLiteral),
670 Composite {
671 ty: LookupType,
672 constituent_ids: Vec<Word>,
673 },
674 ZeroValue(Word),
675}
676
677/// The SPIR-V representation of a [`crate::GlobalVariable`].
678///
679/// In the Vulkan spec 1.3.296, the section [Descriptor Set Interface][dsi] says:
680///
681/// > Variables identified with the `Uniform` storage class are used to access
682/// > transparent buffer backed resources. Such variables *must* be:
683/// >
684/// > - typed as `OpTypeStruct`, or an array of this type,
685/// >
686/// > - identified with a `Block` or `BufferBlock` decoration, and
687/// >
688/// > - laid out explicitly using the `Offset`, `ArrayStride`, and `MatrixStride`
689/// > decorations as specified in "Offset and Stride Assignment".
690///
/// This is followed by identical language for the `StorageBuffer` storage
/// class, except that a `BufferBlock` decoration is not allowed.
693///
694/// When we encounter a global variable in the [`Storage`] or [`Uniform`]
695/// address spaces whose type is not already [`Struct`], this backend implicitly
696/// wraps the global variable in a struct: we generate a SPIR-V global variable
697/// holding an `OpTypeStruct` with a single member, whose type is what the Naga
698/// global's type would suggest, decorated as required above.
699///
700/// The [`helpers::global_needs_wrapper`] function determines whether a given
701/// [`crate::GlobalVariable`] needs to be wrapped.
702///
703/// [dsi]: https://registry.khronos.org/vulkan/specs/1.3-extensions/html/vkspec.html#interfaces-resources-descset
704/// [`Storage`]: crate::AddressSpace::Storage
705/// [`Uniform`]: crate::AddressSpace::Uniform
706/// [`Struct`]: crate::TypeInner::Struct
707#[derive(Clone)]
708struct GlobalVariable {
709 /// The SPIR-V id of the `OpVariable` that declares the global.
710 ///
711 /// If this global has been implicitly wrapped in an `OpTypeStruct`, this id
712 /// refers to the wrapper, not the original Naga value it contains. If you
713 /// need the Naga value, use [`access_id`] instead of this field.
714 ///
715 /// If this global is not implicitly wrapped, this is the same as
716 /// [`access_id`].
717 ///
718 /// This is used to compute the `access_id` pointer in function prologues,
719 /// and used for `ArrayLength` expressions, which need to pass the wrapper
720 /// struct.
721 ///
722 /// [`access_id`]: GlobalVariable::access_id
723 var_id: Word,
724
    /// The loaded value of an `AddressSpace::Handle` global variable.
726 ///
727 /// If the current function uses this global variable, this is the id of an
728 /// `OpLoad` instruction in the function's prologue that loads its value.
729 /// (This value is assigned as we write the prologue code of each function.)
730 /// It is then used for all operations on the global, such as `OpImageSample`.
731 handle_id: Word,
732
733 /// The SPIR-V id of a pointer to this variable's Naga IR value.
734 ///
735 /// If the current function uses this global variable, and it has been
736 /// implicitly wrapped in an `OpTypeStruct`, this is the id of an
737 /// `OpAccessChain` instruction in the function's prologue that refers to
738 /// the wrapped value inside the struct. (This value is assigned as we write
739 /// the prologue code of each function.) If you need the wrapper struct
740 /// itself, use [`var_id`] instead of this field.
741 ///
742 /// If this global is not implicitly wrapped, this is the same as
743 /// [`var_id`].
744 ///
745 /// [`var_id`]: GlobalVariable::var_id
746 access_id: Word,
747}
748
749impl GlobalVariable {
750 const fn dummy() -> Self {
751 Self {
752 var_id: 0,
753 handle_id: 0,
754 access_id: 0,
755 }
756 }
757
758 const fn new(id: Word) -> Self {
759 Self {
760 var_id: id,
761 handle_id: 0,
762 access_id: 0,
763 }
764 }
765
766 /// Prepare `self` for use within a single function.
767 const fn reset_for_function(&mut self) {
768 self.handle_id = 0;
769 self.access_id = 0;
770 }
771}
772
773struct FunctionArgument {
774 /// Actual instruction of the argument.
775 instruction: Instruction,
776 handle_id: Word,
777}
778
779/// Tracks the expressions for which the backend emits the following instructions:
780/// - OpConstantTrue
781/// - OpConstantFalse
782/// - OpConstant
783/// - OpConstantComposite
784/// - OpConstantNull
785struct ExpressionConstnessTracker {
786 inner: crate::arena::HandleSet<crate::Expression>,
787}
788
789impl ExpressionConstnessTracker {
790 fn from_arena(arena: &crate::Arena<crate::Expression>) -> Self {
791 let mut inner = crate::arena::HandleSet::for_arena(arena);
792 for (handle, expr) in arena.iter() {
793 let insert = match *expr {
794 crate::Expression::Literal(_)
795 | crate::Expression::ZeroValue(_)
796 | crate::Expression::Constant(_) => true,
797 crate::Expression::Compose { ref components, .. } => {
798 components.iter().all(|&h| inner.contains(h))
799 }
800 crate::Expression::Splat { value, .. } => inner.contains(value),
801 _ => false,
802 };
803 if insert {
804 inner.insert(handle);
805 }
806 }
807 Self { inner }
808 }
809
810 fn is_const(&self, value: Handle<crate::Expression>) -> bool {
811 self.inner.contains(value)
812 }
813}
814
815/// General information needed to emit SPIR-V for Naga statements.
816struct BlockContext<'w> {
817 /// The writer handling the module to which this code belongs.
818 writer: &'w mut Writer,
819
820 /// The [`Module`](crate::Module) for which we're generating code.
821 ir_module: &'w crate::Module,
822
823 /// The [`Function`](crate::Function) for which we're generating code.
824 ir_function: &'w crate::Function,
825
826 /// Information module validation produced about
827 /// [`ir_function`](BlockContext::ir_function).
828 fun_info: &'w crate::valid::FunctionInfo,
829
830 /// The [`spv::Function`](Function) to which we are contributing SPIR-V instructions.
831 function: &'w mut Function,
832
833 /// SPIR-V ids for expressions we've evaluated.
834 cached: CachedExpressions,
835
836 /// The `Writer`'s temporary vector, for convenience.
837 temp_list: Vec<Word>,
838
839 /// Tracks the constness of `Expression`s residing in `self.ir_function.expressions`
840 expression_constness: ExpressionConstnessTracker,
841
842 force_loop_bounding: bool,
843
    /// Map from an expression whose type is a ray query, or a pointer to a
    /// ray query, to its trackers.
    /// Note: this map is sparse, so it can't be a `HandleVec`.
846 ray_query_tracker_expr: crate::FastHashMap<Handle<crate::Expression>, RayQueryTrackers>,
847}
848
849#[derive(Clone, Copy)]
850struct RayQueryTrackers {
    // Initialization tracker.
    initialized_tracker: Word,
    // Tracks the t max passed to ray query initialize.
    // Unlike HLSL, SPIR-V's equivalent getter for the current committed t has UB (instead of just
    // returning t_max) if there was no previous hit (though in some places it treats the behaviour as
    // defined), so we must track the t max passed to ray query initialize ourselves.
    t_max_tracker: Word,
858}
859
860impl BlockContext<'_> {
861 const fn gen_id(&mut self) -> Word {
862 self.writer.id_gen.next()
863 }
864
865 fn get_type_id(&mut self, lookup_type: LookupType) -> Word {
866 self.writer.get_type_id(lookup_type)
867 }
868
869 fn get_handle_type_id(&mut self, handle: Handle<crate::Type>) -> Word {
870 self.writer.get_handle_type_id(handle)
871 }
872
873 fn get_expression_type_id(&mut self, tr: &TypeResolution) -> Word {
874 self.writer.get_expression_type_id(tr)
875 }
876
877 fn get_index_constant(&mut self, index: Word) -> Word {
878 self.writer.get_constant_scalar(crate::Literal::U32(index))
879 }
880
881 fn get_scope_constant(&mut self, scope: Word) -> Word {
882 self.writer
883 .get_constant_scalar(crate::Literal::I32(scope as _))
884 }
885
886 fn get_pointer_type_id(&mut self, base: Word, class: spirv::StorageClass) -> Word {
887 self.writer.get_pointer_type_id(base, class)
888 }
889
890 fn get_numeric_type_id(&mut self, numeric: NumericType) -> Word {
891 self.writer.get_numeric_type_id(numeric)
892 }
893}
894
/// Information about a type for which we have declared a std140 layout
/// compatible variant, because the type is used in a uniform but does not
/// adhere to std140 requirements. The uniform will be declared using the
/// type `type_id`, and the result of any `Load` will be immediately converted
/// to the base type. This is used for matrices with 2 rows, as well as any
/// arrays or structs containing such matrices.
pub struct Std140CompatTypeInfo {
    /// ID of the std140 compatible type declaration.
    type_id: Word,
    /// For structs, a mapping of Naga IR struct member indices to the indices
    /// used in the generated SPIR-V. For non-struct types this will be empty.
    member_indices: Vec<u32>,
}

pub struct Writer {
    physical_layout: PhysicalLayout,
    logical_layout: LogicalLayout,
    id_gen: IdGenerator,

    /// The set of capabilities modules are permitted to use.
    ///
    /// This is initialized from `Options::capabilities`.
    capabilities_available: Option<crate::FastHashSet<Capability>>,

    /// The set of capabilities used by this module.
    ///
    /// If `capabilities_available` is `Some`, then this is always a subset of
    /// that.
    capabilities_used: crate::FastIndexSet<Capability>,

    /// The set of SPIR-V extensions used.
    extensions_used: crate::FastIndexSet<&'static str>,

    debug_strings: Vec<Instruction>,
    debugs: Vec<Instruction>,
    annotations: Vec<Instruction>,
    flags: WriterFlags,
    bounds_check_policies: BoundsCheckPolicies,
    zero_initialize_workgroup_memory: ZeroInitializeWorkgroupMemoryMode,
    force_loop_bounding: bool,
    use_storage_input_output_16: bool,
    void_type: Word,
    tuple_of_u32s_ty_id: Option<Word>,
    // TODO: convert most of these into vectors, addressable by handle indices
    lookup_type: crate::FastHashMap<LookupType, Word>,
    lookup_function: crate::FastHashMap<Handle<crate::Function>, Word>,
    lookup_function_type: crate::FastHashMap<LookupFunctionType, Word>,
    /// Operations which have been wrapped in a helper function. The value is
    /// the ID of the function, which should be called instead of emitting code
    /// for the operation directly.
    wrapped_functions: crate::FastHashMap<WrappedFunction, Word>,
    /// Indexed by const-expression handle indices.
    constant_ids: HandleVec<crate::Expression, Word>,
    cached_constants: crate::FastHashMap<CachedConstant, Word>,
    global_variables: HandleVec<crate::GlobalVariable, GlobalVariable>,
    std140_compat_uniform_types: crate::FastHashMap<Handle<crate::Type>, Std140CompatTypeInfo>,
    fake_missing_bindings: bool,
    binding_map: BindingMap,

    // Cached expressions are only meaningful within a BlockContext, but we
    // retain the table here between functions to save heap allocations.
    saved_cached: CachedExpressions,

    gl450_ext_inst_id: Word,

    // Just a temporary list of SPIR-V ids
    temp_list: Vec<Word>,

    ray_query_functions: crate::FastHashMap<LookupRayQueryFunction, Word>,

    /// F16 I/O polyfill manager for handling `f16` input/output variables
    /// when the `StorageInputOutput16` capability is not available.
    io_f16_polyfills: f16_polyfill::F16IoPolyfill,

    /// ID of the `OpExtInstImport` for the non-semantic debug printf extension.
    debug_printf: Option<Word>,
    pub(crate) ray_query_initialization_tracking: bool,

    /// Limits to the mesh shader dispatch group a task workgroup can dispatch.
    ///
    /// Metal, for example, limits this to 1024 workgroups per task shader dispatch. Dispatching
    /// more is undefined behavior, so a dispatch that exceeds the limits is validated to dispatch
    /// zero workgroups instead.
    task_dispatch_limits: Option<TaskDispatchLimits>,
    /// If true, naga may generate checks that the primitive indices are valid in the output.
    ///
    /// Currently this validation is unimplemented.
    mesh_shader_primitive_indices_clamp: bool,
}

bitflags::bitflags! {
    #[derive(Clone, Copy, Debug, Eq, PartialEq)]
    pub struct WriterFlags: u32 {
        /// Include debug labels for everything.
        const DEBUG = 0x1;

        /// Flip Y coordinate of [`BuiltIn::Position`] output.
        ///
        /// [`BuiltIn::Position`]: crate::BuiltIn::Position
        const ADJUST_COORDINATE_SPACE = 0x2;

        /// Emit [`OpName`][op] for input/output locations.
        ///
        /// Contrary to spec, some drivers treat it as semantic, not allowing
        /// any conflicts.
        ///
        /// [op]: https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#OpName
        const LABEL_VARYINGS = 0x4;

        /// Emit the [`PointSize`] output builtin in vertex shaders, which is
        /// required for drawing with `PointList` topology.
        ///
        /// [`PointSize`]: crate::BuiltIn::PointSize
        const FORCE_POINT_SIZE = 0x8;

        /// Clamp [`BuiltIn::FragDepth`] output between 0 and 1.
        ///
        /// [`BuiltIn::FragDepth`]: crate::BuiltIn::FragDepth
        const CLAMP_FRAG_DEPTH = 0x10;

        /// Instead of silently failing if the arguments to generate a ray query are
        /// invalid, use the debug printf extension to print to the command line.
        ///
        /// Note: `VK_KHR_shader_non_semantic_info` must be enabled. This will have no
        /// effect if `options.ray_query_initialization_tracking` is set to false.
        const PRINT_ON_RAY_QUERY_INITIALIZATION_FAIL = 0x20;
    }
}

#[derive(Copy, Clone, Debug, Default, PartialEq, Eq, Hash)]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
pub struct BindingInfo {
    pub descriptor_set: u32,
    pub binding: u32,
    /// If the binding is an unsized binding array, this overrides the size.
    pub binding_array_size: Option<u32>,
}

// Using `BTreeMap` instead of `HashMap` so that the map itself can be hashed.
pub type BindingMap = alloc::collections::BTreeMap<crate::ResourceBinding, BindingInfo>;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ZeroInitializeWorkgroupMemoryMode {
    /// Via `VK_KHR_zero_initialize_workgroup_memory` or Vulkan 1.3
    Native,
    /// Via assignments + barrier
    Polyfill,
    /// Do not zero-initialize workgroup memory.
    None,
}

#[derive(Debug, Clone)]
pub struct Options<'a> {
    /// (Major, Minor) target version of the SPIR-V.
    pub lang_version: (u8, u8),

    /// Configuration flags for the writer.
    pub flags: WriterFlags,

    /// Don't panic on missing bindings. Instead use fake values for `Binding`
    /// and `DescriptorSet` decorations. This may result in invalid SPIR-V.
    pub fake_missing_bindings: bool,

    /// Map of resources to information about the binding.
    pub binding_map: BindingMap,

    /// If given, the set of capabilities modules are allowed to use. Code that
    /// requires capabilities beyond these is rejected with an error.
    ///
    /// If this is `None`, all capabilities are permitted.
    pub capabilities: Option<crate::FastHashSet<Capability>>,

    /// How should generated code handle array, vector, matrix, or image texel
    /// indices that are out of range?
    pub bounds_check_policies: BoundsCheckPolicies,

    /// Dictates the way workgroup variables should be zero initialized.
    pub zero_initialize_workgroup_memory: ZeroInitializeWorkgroupMemoryMode,

    /// If set, loops will have code injected into them, forcing the compiler
    /// to think the number of iterations is bounded.
    pub force_loop_bounding: bool,

    /// If set, ray queries will get a variable to track their state to prevent
    /// misuse.
    pub ray_query_initialization_tracking: bool,

    /// Whether to use the `StorageInputOutput16` capability for `f16` shader I/O.
    /// When false, `f16` I/O is polyfilled using `f32` types with conversions.
    pub use_storage_input_output_16: bool,

    pub debug_info: Option<DebugInfo<'a>>,

    /// Limits to the mesh shader dispatch group a task workgroup can dispatch.
    pub task_dispatch_limits: Option<TaskDispatchLimits>,

    /// If true, naga may generate checks that the primitive indices are valid in the output.
    ///
    /// Currently this validation is unimplemented.
    pub mesh_shader_primitive_indices_clamp: bool,
}

impl Default for Options<'_> {
    fn default() -> Self {
        let mut flags = WriterFlags::ADJUST_COORDINATE_SPACE
            | WriterFlags::LABEL_VARYINGS
            | WriterFlags::CLAMP_FRAG_DEPTH;
        if cfg!(debug_assertions) {
            flags |= WriterFlags::DEBUG;
        }
        Options {
            lang_version: (1, 0),
            flags,
            fake_missing_bindings: true,
            binding_map: BindingMap::default(),
            capabilities: None,
            bounds_check_policies: BoundsCheckPolicies::default(),
            zero_initialize_workgroup_memory: ZeroInitializeWorkgroupMemoryMode::Polyfill,
            force_loop_bounding: true,
            ray_query_initialization_tracking: true,
            use_storage_input_output_16: true,
            debug_info: None,
            task_dispatch_limits: None,
            mesh_shader_primitive_indices_clamp: true,
        }
    }
}

/// A subset of options meant to be changed per pipeline.
#[derive(Debug, Clone)]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
pub struct PipelineOptions {
    /// The stage of the entry point.
    pub shader_stage: crate::ShaderStage,
    /// The name of the entry point.
    ///
    /// If no matching entry point is found while creating a [`Writer`], an error is returned.
    pub entry_point: String,
}

pub fn write_vec(
    module: &crate::Module,
    info: &crate::valid::ModuleInfo,
    options: &Options,
    pipeline_options: Option<&PipelineOptions>,
) -> Result<Vec<u32>, Error> {
    let mut words: Vec<u32> = Vec::new();
    let mut w = Writer::new(options)?;

    w.write(
        module,
        info,
        pipeline_options,
        &options.debug_info,
        &mut words,
    )?;
    Ok(words)
}

pub fn supported_capabilities() -> crate::valid::Capabilities {
    use crate::valid::Capabilities as Caps;

    Caps::IMMEDIATES
        | Caps::FLOAT64
        | Caps::PRIMITIVE_INDEX
        | Caps::TEXTURE_AND_SAMPLER_BINDING_ARRAY
        | Caps::BUFFER_BINDING_ARRAY
        | Caps::STORAGE_TEXTURE_BINDING_ARRAY
        | Caps::STORAGE_BUFFER_BINDING_ARRAY
        | Caps::CLIP_DISTANCE
        // No cull distance
        | Caps::STORAGE_TEXTURE_16BIT_NORM_FORMATS
        | Caps::MULTIVIEW
        | Caps::EARLY_DEPTH_TEST
        | Caps::MULTISAMPLED_SHADING
        | Caps::RAY_QUERY
        | Caps::DUAL_SOURCE_BLENDING
        | Caps::CUBE_ARRAY_TEXTURES
        | Caps::SHADER_INT64
        | Caps::SUBGROUP
        | Caps::SUBGROUP_BARRIER
        | Caps::SUBGROUP_VERTEX_STAGE
        | Caps::SHADER_INT64_ATOMIC_MIN_MAX
        | Caps::SHADER_INT64_ATOMIC_ALL_OPS
        | Caps::SHADER_FLOAT32_ATOMIC
        | Caps::TEXTURE_ATOMIC
        | Caps::TEXTURE_INT64_ATOMIC
        | Caps::RAY_HIT_VERTEX_POSITION
        | Caps::SHADER_FLOAT16
        // No TEXTURE_EXTERNAL
        | Caps::SHADER_FLOAT16_IN_FLOAT32
        | Caps::SHADER_BARYCENTRICS
        | Caps::MESH_SHADER
        | Caps::MESH_SHADER_POINT_TOPOLOGY
        | Caps::TEXTURE_AND_SAMPLER_BINDING_ARRAY_NON_UNIFORM_INDEXING
        // No BUFFER_BINDING_ARRAY_NON_UNIFORM_INDEXING
        | Caps::STORAGE_TEXTURE_BINDING_ARRAY_NON_UNIFORM_INDEXING
        | Caps::STORAGE_BUFFER_BINDING_ARRAY_NON_UNIFORM_INDEXING
        | Caps::COOPERATIVE_MATRIX
        | Caps::PER_VERTEX
        // No RAY_TRACING_PIPELINE
        | Caps::DRAW_INDEX
}