// naga/back/spv/mod.rs

/*!
Backend for [SPIR-V][spv] (Standard Portable Intermediate Representation).

# Layout of values in `uniform` buffers

WGSL's ["Internal Layout of Values"][ilov] rules specify the memory layout of
each WGSL type. The memory layout is important for data stored in `uniform` and
`storage` buffers, especially when exchanging data with CPU code.

Both WGSL and Vulkan specify some conditions that a type's memory layout
must satisfy in order to use that type in a `uniform` or `storage` buffer.
For `storage` buffers, the WGSL and Vulkan restrictions are compatible, but
for `uniform` buffers, WGSL allows some types that Vulkan does not, requiring
adjustments when emitting SPIR-V for `uniform` buffers.

## Padding in two-row matrices

SPIR-V provides detailed control over the layout of matrix types, and is
capable of describing the WGSL memory layout. However, Vulkan imposes
additional restrictions.

Vulkan's ["extended layout"][extended-layout] (also known as std140) rules
apply to types used in `uniform` buffers. Under these rules, matrices are
defined in terms of arrays of their vector type, and arrays are defined to have
an alignment equal to the alignment of their element type rounded up to a
multiple of 16. This means that each column of the matrix has a minimum
alignment of 16. WGSL, and consequently Naga IR, on the other hand specify a
column alignment equal to the alignment of the vector type, without rounding
it up to 16.
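
As a rough illustration of the mismatch, the following sketch compares the
column alignment produced by the two rule sets. The function names are
hypothetical and are not part of Naga; the sketch only assumes the WGSL vector
alignments (two components for `vec2`, four for `vec3` and `vec4`) and the
rounding rule described above.

```rust
/// Round `value` up to the next multiple of `multiple`.
fn round_up(value: u32, multiple: u32) -> u32 {
    (value + multiple - 1) / multiple * multiple
}

/// Alignment in bytes of a WGSL vector with `rows` components of
/// `component_size` bytes each: `vec2` aligns to two components, while
/// `vec3` and `vec4` both align to four.
fn wgsl_vector_alignment(rows: u32, component_size: u32) -> u32 {
    match rows {
        2 => 2 * component_size,
        _ => 4 * component_size,
    }
}

/// Column alignment under the extended (std140) rules described above:
/// the vector alignment rounded up to a multiple of 16.
fn std140_column_alignment(rows: u32, component_size: u32) -> u32 {
    round_up(wgsl_vector_alignment(rows, component_size), 16)
}
```

Under these assumptions, a `vec2<f32>` column is aligned to 8 bytes by WGSL but
to 16 bytes by the extended rules, while three- and four-row `f32` columns are
aligned to 16 bytes either way.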

To compensate for this, for any `struct` used as a `uniform` buffer which
contains a two-row matrix, we declare an additional "std140 compatible" type
in which each column of the matrix has been decomposed into the containing
struct. For example, the following WGSL struct type:

```ignore
struct Baz {
    m: mat3x2<f32>,
}
```

is rendered as the SPIR-V struct type:

```ignore
OpTypeStruct %v2float %v2float %v2float
```

This has the effect that struct indices in Naga IR for such types do not
correspond to the struct indices used in SPIR-V. A mapping of struct indices
for these types is maintained in [`Std140CompatTypeInfo`].
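
To make the index shift concrete, here is a minimal sketch of how such a
mapping could be computed. The `Member` enum and `spirv_member_indices`
function are hypothetical and exist only for illustration; they are not the
actual definition of `Std140CompatTypeInfo`.

```rust
/// Hypothetical description of one member of a Naga IR struct.
enum Member {
    /// A member that is emitted as a single SPIR-V struct member.
    Plain,
    /// A two-row matrix with `columns` columns, emitted as one vector
    /// member per column.
    MatCx2 { columns: u32 },
}

/// For each Naga IR member, in declaration order, return the index of the
/// first SPIR-V struct member that holds its data.
fn spirv_member_indices(members: &[Member]) -> Vec<u32> {
    let mut next = 0;
    members
        .iter()
        .map(|member| {
            let index = next;
            next += match *member {
                Member::Plain => 1,
                Member::MatCx2 { columns } => columns,
            };
            index
        })
        .collect()
}
```

In this sketch, a struct with a plain member, a `mat3x2<f32>`, and another
plain member maps Naga IR indices 0, 1, 2 to SPIR-V indices 0, 1, 4.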

Additionally, any two-row matrices that are declared directly as uniform
buffers without being wrapped in a struct are declared as a struct containing a
vector member for each column. Any array of a two-row matrix in a uniform
buffer is declared as an array of a struct containing a vector member for each
column. Any struct or array within a uniform buffer which contains a member or
whose base type requires a std140 compatible type declaration, itself requires a
std140 compatible type declaration.
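
The rule above is recursive, which can be sketched as a predicate over a toy
type model. The `Ty` enum below is a deliberately simplified, hypothetical
stand-in for Naga's type representation, and `needs_std140_compat` is not a
function in this backend.

```rust
/// A toy type model, much simpler than Naga IR's real types.
enum Ty {
    Scalar,
    Vector,
    /// A matrix with the given number of rows per column.
    Matrix { rows: u32 },
    Array { base: Box<Ty> },
    Struct { members: Vec<Ty> },
}

/// Whether a type used in a uniform buffer would need a std140 compatible
/// declaration under the rule described above.
fn needs_std140_compat(ty: &Ty) -> bool {
    match ty {
        Ty::Scalar | Ty::Vector => false,
        // Only two-row matrices have under-aligned columns.
        Ty::Matrix { rows } => *rows == 2,
        // An array needs one if its base type does.
        Ty::Array { base } => needs_std140_compat(base),
        // A struct needs one if any of its members does.
        Ty::Struct { members } => members.iter().any(needs_std140_compat),
    }
}
```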

Whenever a value of such a type is [`loaded`] we insert code to convert the
loaded value from the std140 compatible type to the regular type. This occurs
in `BlockContext::write_checked_load`, making use of the wrapper function
defined by `Writer::write_wrapped_convert_from_std140_compat_type`. For matrices
that have been decomposed as separate columns in the containing struct, we load
each column separately then composite the matrix type in
`BlockContext::maybe_write_load_uniform_matcx2_struct_member`.

Whenever a column of a matrix that has been decomposed into its containing
struct is [`accessed`] with a constant index we adjust the emitted access chain
to access from the containing struct instead, in `BlockContext::write_access_chain`.

Whenever a column of a uniform buffer two-row matrix is [`dynamically accessed`]
we must first load the matrix type, converting it from its std140 compatible
type as described above, then access the column using the wrapper function
defined by `Writer::write_wrapped_matcx2_get_column`. This is handled by
`BlockContext::maybe_write_uniform_matcx2_dynamic_access`.
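
The two steps of a dynamic access can be pictured with a CPU-side analogy. The
sketch below is not the SPIR-V that this backend emits: the `Mat3x2Compat`
type and both functions are hypothetical, and the zero vector returned for an
out-of-range index is an assumption of the sketch rather than documented
behavior of the wrapper function.

```rust
/// Stand-in for the std140 compatible form of a `mat3x2<f32>`: one separate
/// vector member per column.
struct Mat3x2Compat {
    col0: [f32; 2],
    col1: [f32; 2],
    col2: [f32; 2],
}

/// Step 1: rebuild the regular matrix value from the separate columns.
fn convert_from_compat(compat: &Mat3x2Compat) -> [[f32; 2]; 3] {
    [compat.col0, compat.col1, compat.col2]
}

/// Step 2: select a column of the rebuilt matrix by a runtime index.
fn get_column(matrix: [[f32; 2]; 3], index: u32) -> [f32; 2] {
    match index {
        0 => matrix[0],
        1 => matrix[1],
        2 => matrix[2],
        // Assumption for this sketch only: out-of-range indices yield zeros.
        _ => [0.0; 2],
    }
}
```

The point of the analogy is that struct members cannot be selected by a runtime
index, so the columns have to be gathered into a matrix value before one of
them can be picked dynamically.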

Note that this approach differs somewhat from the equivalent code in the HLSL
backend. For HLSL, all structs containing two-row matrices (or arrays of such)
have their declarations modified, not just those used as uniform buffers.
Two-row matrices and arrays of such only use modified type declarations when
used as uniform buffers, or additionally when used as a struct member in any
context. This avoids the need to convert struct values when loading from uniform
buffers, but when loading arrays and matrices from uniform buffers or from any
struct the conversion is still required. In contrast, the approach used here
always requires converting *any* affected type when loading from a uniform
buffer, but consistently *only* when loading from a uniform buffer. As a result,
we only have to handle loads and not stores, as uniform buffers are read-only.

[spv]: https://www.khronos.org/registry/SPIR-V/
[ilov]: https://gpuweb.github.io/gpuweb/wgsl/#internal-value-layout
[extended-layout]: https://docs.vulkan.org/spec/latest/chapters/interfaces.html#interfaces-resources-layout
[`loaded`]: crate::Expression::Load
[`accessed`]: crate::Expression::AccessIndex
[`dynamically accessed`]: crate::Expression::Access
*/

mod block;
mod f16_polyfill;
mod helpers;
mod image;
mod index;
mod instructions;
mod layout;
mod mesh_shader;
mod ray;
mod reclaimable;
mod selection;
mod subgroup;
mod writer;

pub use mesh_shader::{MeshReturnInfo, MeshReturnMember};
pub use spirv::{Capability, SourceLanguage};

use alloc::{string::String, vec::Vec};
use core::ops;

use spirv::Word;
use thiserror::Error;

use crate::arena::{Handle, HandleVec};
use crate::back::TaskDispatchLimits;
use crate::proc::{BoundsCheckPolicies, TypeResolution};

#[derive(Clone)]
struct PhysicalLayout {
    magic_number: Word,
    version: Word,
    generator: Word,
    bound: Word,
    instruction_schema: Word,
}

#[derive(Default)]
struct LogicalLayout {
    capabilities: Vec<Word>,
    extensions: Vec<Word>,
    ext_inst_imports: Vec<Word>,
    memory_model: Vec<Word>,
    entry_points: Vec<Word>,
    execution_modes: Vec<Word>,
    debugs: Vec<Word>,
    annotations: Vec<Word>,
    declarations: Vec<Word>,
    function_declarations: Vec<Word>,
    function_definitions: Vec<Word>,
}

#[derive(Clone)]
struct Instruction {
    op: spirv::Op,
    wc: u32,
    type_id: Option<Word>,
    result_id: Option<Word>,
    operands: Vec<Word>,
}

const BITS_PER_BYTE: crate::Bytes = 8;

#[derive(Clone, Debug, Error)]
pub enum Error {
    #[error("The requested entry point couldn't be found")]
    EntryPointNotFound,
    #[error("target SPIRV-{0}.{1} is not supported")]
    UnsupportedVersion(u8, u8),
    #[error("using {0} requires at least one of the capabilities {1:?}, but none are available")]
    MissingCapabilities(&'static str, Vec<Capability>),
    #[error("unimplemented {0}")]
    FeatureNotImplemented(&'static str),
    #[error("module is not validated properly: {0}")]
    Validation(&'static str),
    #[error("overrides should not be present at this stage")]
    Override,
    #[error(transparent)]
    ResolveArraySizeError(#[from] crate::proc::ResolveArraySizeError),
    #[error("module requires SPIRV-{0}.{1}, which isn't supported")]
    SpirvVersionTooLow(u8, u8),
    #[error("mapping of {0:?} is missing")]
    MissingBinding(crate::ResourceBinding),
}

#[derive(Default)]
struct IdGenerator(Word);

impl IdGenerator {
    const fn next(&mut self) -> Word {
        self.0 += 1;
        self.0
    }
}

#[derive(Debug, Clone)]
pub struct DebugInfo<'a> {
    pub source_code: &'a str,
    pub file_name: &'a str,
    pub language: SourceLanguage,
}

/// A SPIR-V block to which we are still adding instructions.
///
/// A `Block` represents a SPIR-V block that does not yet have a termination
/// instruction like `OpBranch` or `OpReturn`.
///
/// The `OpLabel` that starts the block is implicit. It will be emitted based on
/// `label_id` when we write the block to a `LogicalLayout`.
///
/// To terminate a `Block`, pass the block and the termination instruction to
/// `Function::consume`. This takes ownership of the `Block` and transforms it
/// into a `TerminatedBlock`.
struct Block {
    label_id: Word,
    body: Vec<Instruction>,
}

/// A SPIR-V block that ends with a termination instruction.
struct TerminatedBlock {
    label_id: Word,
    body: Vec<Instruction>,
}

impl Block {
    const fn new(label_id: Word) -> Self {
        Block {
            label_id,
            body: Vec::new(),
        }
    }
}

struct LocalVariable {
    id: Word,
    instruction: Instruction,
}

struct ResultMember {
    id: Word,
    type_id: Word,
    built_in: Option<crate::BuiltIn>,
}

struct EntryPointContext {
    argument_ids: Vec<Word>,
    results: Vec<ResultMember>,
    task_payload_variable_id: Option<Word>,
    mesh_state: Option<MeshReturnInfo>,
}

#[derive(Default)]
struct Function {
    signature: Option<Instruction>,
    parameters: Vec<FunctionArgument>,
    variables: crate::FastHashMap<Handle<crate::LocalVariable>, LocalVariable>,
    /// Map from a local variable that is a ray query to its u32 tracker.
    ray_query_initialization_tracker_variables:
        crate::FastHashMap<Handle<crate::LocalVariable>, LocalVariable>,
    /// Map from a local variable that is a ray query to its tracker for the t max.
    ray_query_t_max_tracker_variables:
        crate::FastHashMap<Handle<crate::LocalVariable>, LocalVariable>,
    /// List of local variables used as counters to ensure that all loops are bounded.
    force_loop_bounding_vars: Vec<LocalVariable>,

    /// A map from a Naga expression to the temporary SPIR-V variable we have
    /// spilled its value to, if any.
    ///
    /// Naga IR lets us apply [`Access`] expressions to expressions whose value
    /// is an array or matrix---not a pointer to such---but SPIR-V doesn't have
    /// instructions that can do the same. So when we encounter such code, we
    /// spill the expression's value to a generated temporary variable. That, we
    /// can obtain a pointer to, and then use an `OpAccessChain` instruction to
    /// do whatever series of [`Access`] and [`AccessIndex`] operations we need
    /// (with bounds checks). Finally, we generate an `OpLoad` to get the final
    /// value.
    ///
    /// [`Access`]: crate::Expression::Access
    /// [`AccessIndex`]: crate::Expression::AccessIndex
    spilled_composites: crate::FastIndexMap<Handle<crate::Expression>, LocalVariable>,

    /// A set of expressions that are either in [`spilled_composites`] or refer
    /// to some component/element of such.
    ///
    /// [`spilled_composites`]: Function::spilled_composites
    spilled_accesses: crate::arena::HandleSet<crate::Expression>,

    /// A map taking each expression to the number of [`Access`] and
    /// [`AccessIndex`] expressions that use it as a base value. If an
    /// expression has no entry, its count is zero: it is never used as a
    /// [`Access`] or [`AccessIndex`] base.
    ///
    /// We use this, together with [`ExpressionInfo::ref_count`], to recognize
    /// the tips of chains of [`Access`] and [`AccessIndex`] expressions that
    /// access spilled values --- expressions in [`spilled_composites`]. We
    /// defer generating code for the chain until we reach its tip, so we can
    /// handle it with a single instruction.
    ///
    /// [`Access`]: crate::Expression::Access
    /// [`AccessIndex`]: crate::Expression::AccessIndex
    /// [`ExpressionInfo::ref_count`]: crate::valid::ExpressionInfo
    /// [`spilled_composites`]: Function::spilled_composites
    access_uses: crate::FastHashMap<Handle<crate::Expression>, usize>,

    blocks: Vec<TerminatedBlock>,
    entry_point_context: Option<EntryPointContext>,
}

impl Function {
    fn consume(&mut self, mut block: Block, termination: Instruction) {
        block.body.push(termination);
        self.blocks.push(TerminatedBlock {
            label_id: block.label_id,
            body: block.body,
        })
    }

    fn parameter_id(&self, index: u32) -> Word {
        match self.entry_point_context {
            Some(ref context) => context.argument_ids[index as usize],
            None => self.parameters[index as usize]
                .instruction
                .result_id
                .unwrap(),
        }
    }
}

/// Characteristics of a SPIR-V `OpTypeImage` type.
///
/// SPIR-V requires non-composite types to be unique, including images. Since we
/// use `LocalType` for this deduplication, it's essential that `LocalImageType`
/// be equal whenever the corresponding `OpTypeImage`s would be. To reduce the
/// likelihood of mistakes, we use fields that correspond exactly to the
/// operands of an `OpTypeImage` instruction, using the actual SPIR-V types
/// where practical.
#[derive(Debug, PartialEq, Hash, Eq, Copy, Clone)]
struct LocalImageType {
    sampled_type: crate::Scalar,
    dim: spirv::Dim,
    flags: ImageTypeFlags,
    image_format: spirv::ImageFormat,
}

bitflags::bitflags! {
    /// Flags corresponding to the boolean(-ish) parameters to OpTypeImage.
    #[derive(Clone, Copy, Debug, Eq, Hash, PartialEq)]
    pub struct ImageTypeFlags: u8 {
        const DEPTH = 0x1;
        const ARRAYED = 0x2;
        const MULTISAMPLED = 0x4;
        const SAMPLED = 0x8;
    }
}

impl LocalImageType {
    /// Construct a `LocalImageType` from the fields of a `TypeInner::Image`.
    fn from_inner(dim: crate::ImageDimension, arrayed: bool, class: crate::ImageClass) -> Self {
        let make_flags = |multi: bool, other: ImageTypeFlags| -> ImageTypeFlags {
            let mut flags = other;
            flags.set(ImageTypeFlags::ARRAYED, arrayed);
            flags.set(ImageTypeFlags::MULTISAMPLED, multi);
            flags
        };

        let dim = spirv::Dim::from(dim);

        match class {
            crate::ImageClass::Sampled { kind, multi } => LocalImageType {
                sampled_type: crate::Scalar { kind, width: 4 },
                dim,
                flags: make_flags(multi, ImageTypeFlags::SAMPLED),
                image_format: spirv::ImageFormat::Unknown,
            },
            crate::ImageClass::Depth { multi } => LocalImageType {
                sampled_type: crate::Scalar {
                    kind: crate::ScalarKind::Float,
                    width: 4,
                },
                dim,
                flags: make_flags(multi, ImageTypeFlags::DEPTH | ImageTypeFlags::SAMPLED),
                image_format: spirv::ImageFormat::Unknown,
            },
            crate::ImageClass::Storage { format, access: _ } => LocalImageType {
                sampled_type: format.into(),
                dim,
                flags: make_flags(false, ImageTypeFlags::empty()),
                image_format: format.into(),
            },
            crate::ImageClass::External => unimplemented!(),
        }
    }
}

/// A numeric type, for use in [`LocalType`].
#[derive(Debug, PartialEq, Hash, Eq, Copy, Clone)]
enum NumericType {
    Scalar(crate::Scalar),
    Vector {
        size: crate::VectorSize,
        scalar: crate::Scalar,
    },
    Matrix {
        columns: crate::VectorSize,
        rows: crate::VectorSize,
        scalar: crate::Scalar,
    },
}

impl NumericType {
    const fn from_inner(inner: &crate::TypeInner) -> Option<Self> {
        match *inner {
            crate::TypeInner::Scalar(scalar) | crate::TypeInner::Atomic(scalar) => {
                Some(NumericType::Scalar(scalar))
            }
            crate::TypeInner::Vector { size, scalar } => Some(NumericType::Vector { size, scalar }),
            crate::TypeInner::Matrix {
                columns,
                rows,
                scalar,
            } => Some(NumericType::Matrix {
                columns,
                rows,
                scalar,
            }),
            _ => None,
        }
    }

    const fn scalar(self) -> crate::Scalar {
        match self {
            NumericType::Scalar(scalar)
            | NumericType::Vector { scalar, .. }
            | NumericType::Matrix { scalar, .. } => scalar,
        }
    }

    const fn with_scalar(self, scalar: crate::Scalar) -> Self {
        match self {
            NumericType::Scalar(_) => NumericType::Scalar(scalar),
            NumericType::Vector { size, .. } => NumericType::Vector { size, scalar },
            NumericType::Matrix { columns, rows, .. } => NumericType::Matrix {
                columns,
                rows,
                scalar,
            },
        }
    }
}

/// A cooperative type, for use in [`LocalType`].
#[derive(Debug, PartialEq, Hash, Eq, Copy, Clone)]
enum CooperativeType {
    Matrix {
        columns: crate::CooperativeSize,
        rows: crate::CooperativeSize,
        scalar: crate::Scalar,
        role: crate::CooperativeRole,
    },
}

impl CooperativeType {
    const fn from_inner(inner: &crate::TypeInner) -> Option<Self> {
        match *inner {
            crate::TypeInner::CooperativeMatrix {
                columns,
                rows,
                scalar,
                role,
            } => Some(Self::Matrix {
                columns,
                rows,
                scalar,
                role,
            }),
            _ => None,
        }
    }
}

/// A SPIR-V type constructed during code generation.
///
/// This is the variant of [`LookupType`] used to represent types that might not
/// be available in the arena. Variants are present here for one of two reasons:
///
/// -   They represent types synthesized during code generation, as explained
///     in the documentation for [`LookupType`].
///
/// -   They represent types for which SPIR-V forbids duplicate `OpType...`
///     instructions, requiring deduplication.
///
/// This is not a complete copy of [`TypeInner`]: for example, SPIR-V generation
/// never synthesizes new struct types, so `LocalType` has nothing for that.
///
/// Each `LocalType` variant should be handled identically to its analogous
/// `TypeInner` variant. You can use the [`Writer::localtype_from_inner`]
/// function to help with this, by converting everything possible to a
/// `LocalType` before inspecting it.
///
/// ## `LocalType` equality and SPIR-V `OpType` uniqueness
///
/// The definition of `Eq` on `LocalType` is carefully chosen to help us follow
/// certain SPIR-V rules. SPIR-V §2.8 requires some classes of `OpType...`
/// instructions to be unique; for example, you can't have two `OpTypeInt 32 1`
/// instructions in the same module. All 32-bit signed integers must use the
/// same type id.
///
/// All SPIR-V types that must be unique can be represented as a `LocalType`,
/// and two `LocalType`s are always `Eq` if SPIR-V would require them to use the
/// same `OpType...` instruction. This lets us avoid duplicates by recording the
/// ids of the type instructions we've already generated in a hash table,
/// [`Writer::lookup_type`], keyed by `LocalType`.
///
/// As another example, [`LocalImageType`], stored in the `LocalType::Image`
/// variant, is designed to help us deduplicate `OpTypeImage` instructions. See
/// its documentation for details.
///
/// SPIR-V does not require pointer types to be unique - but different
/// SPIR-V ids are considered to be distinct pointer types. Since Naga
/// uses structural type equality, we need to represent each Naga
/// equivalence class with a single SPIR-V `OpTypePointer`.
///
/// As it always must, the `Hash` implementation respects the `Eq` relation.
///
/// [`TypeInner`]: crate::TypeInner
#[derive(Debug, PartialEq, Hash, Eq, Copy, Clone)]
enum LocalType {
    /// A numeric type.
    Numeric(NumericType),
    Cooperative(CooperativeType),
    Pointer {
        base: Word,
        class: spirv::StorageClass,
    },
    Image(LocalImageType),
    SampledImage {
        image_type_id: Word,
    },
    Sampler,
    BindingArray {
        base: Handle<crate::Type>,
        size: u32,
    },
    AccelerationStructure,
    RayQuery,
}

/// A type encountered during SPIR-V generation.
///
/// In the process of writing SPIR-V, we need to synthesize various types for
/// intermediate results and such: pointer types, vector/matrix component types,
/// or even booleans, which usually appear in SPIR-V code even when they're not
/// used by the module source.
///
/// However, we can't use `crate::Type` or `crate::TypeInner` for these, as the
/// type arena may not contain what we need (it only contains types used
/// directly by other parts of the IR), and the IR module is immutable, so we
/// can't add anything to it.
///
/// So for local use in the SPIR-V writer, we use this type, which holds either
/// a handle into the arena, or a [`LocalType`] containing something synthesized
/// locally.
///
/// This is very similar to the [`proc::TypeResolution`] enum, with `LocalType`
/// playing the role of `TypeInner`. However, `LocalType` also has other
/// properties needed for SPIR-V generation; see the description of
/// [`LocalType`] for details.
///
/// [`proc::TypeResolution`]: crate::proc::TypeResolution
#[derive(Debug, PartialEq, Hash, Eq, Copy, Clone)]
enum LookupType {
    Handle(Handle<crate::Type>),
    Local(LocalType),
}

impl From<LocalType> for LookupType {
    fn from(local: LocalType) -> Self {
        Self::Local(local)
    }
}

#[derive(Debug, PartialEq, Clone, Hash, Eq)]
struct LookupFunctionType {
    parameter_type_ids: Vec<Word>,
    return_type_id: Word,
}

#[derive(Debug, PartialEq, Clone, Hash, Eq)]
enum LookupRayQueryFunction {
    Initialize,
    Proceed,
    GenerateIntersection,
    ConfirmIntersection,
    GetVertexPositions { committed: bool },
    GetIntersection { committed: bool },
    Terminate,
}

// Just one supported function right now, more in the future.
#[derive(Debug, PartialEq, Clone, Hash, Eq)]
enum LookupRaytracingFunction {
    TraceRay {
        payload: Handle<crate::GlobalVariable>,
    },
}

#[derive(Debug)]
enum Dimension {
    Scalar,
    Vector,
    Matrix,
    CooperativeMatrix,
}

/// Key used to look up an operation which we have wrapped in a helper
/// function, which should be called instead of directly emitting code
/// for the expression. See [`Writer::wrapped_functions`].
#[derive(Debug, Eq, PartialEq, Hash)]
enum WrappedFunction {
    BinaryOp {
        op: crate::BinaryOperator,
        left_type_id: Word,
        right_type_id: Word,
    },
    ConvertFromStd140CompatType {
        r#type: Handle<crate::Type>,
    },
    MatCx2GetColumn {
        r#type: Handle<crate::Type>,
    },
}

/// A map from evaluated [`Expression`](crate::Expression)s to their SPIR-V ids.
///
/// When we [emit] code to evaluate a given `Expression`, we record the
/// SPIR-V id of its value here, under its `Handle<Expression>` index.
///
/// A `CachedExpressions` value can be indexed by a `Handle<Expression>` value.
///
/// [emit]: index.html#expression-evaluation-time-and-scope
#[derive(Default)]
struct CachedExpressions {
    ids: HandleVec<crate::Expression, Word>,
}
impl CachedExpressions {
    fn reset(&mut self, length: usize) {
        self.ids.clear();
        self.ids.resize(length, 0);
    }
}
impl ops::Index<Handle<crate::Expression>> for CachedExpressions {
    type Output = Word;
    fn index(&self, h: Handle<crate::Expression>) -> &Word {
        let id = &self.ids[h];
        if *id == 0 {
            unreachable!("Expression {:?} is not cached!", h);
        }
        id
    }
}
impl ops::IndexMut<Handle<crate::Expression>> for CachedExpressions {
    fn index_mut(&mut self, h: Handle<crate::Expression>) -> &mut Word {
        let id = &mut self.ids[h];
        if *id != 0 {
            unreachable!("Expression {:?} is already cached!", h);
        }
        id
    }
}
impl reclaimable::Reclaimable for CachedExpressions {
    fn reclaim(self) -> Self {
        CachedExpressions {
            ids: self.ids.reclaim(),
        }
    }
}

#[derive(Eq, Hash, PartialEq)]
enum CachedConstant {
    Literal(crate::proc::HashableLiteral),
    Composite {
        ty: LookupType,
        constituent_ids: Vec<Word>,
    },
    ZeroValue(Word),
}

/// The SPIR-V representation of a [`crate::GlobalVariable`].
///
/// In the Vulkan spec 1.3.296, the section [Descriptor Set Interface][dsi] says:
///
/// > Variables identified with the `Uniform` storage class are used to access
/// > transparent buffer backed resources. Such variables *must* be:
/// >
/// > -   typed as `OpTypeStruct`, or an array of this type,
/// >
/// > -   identified with a `Block` or `BufferBlock` decoration, and
/// >
/// > -   laid out explicitly using the `Offset`, `ArrayStride`, and `MatrixStride`
/// >     decorations as specified in "Offset and Stride Assignment".
///
/// This is followed by identical language for the `StorageBuffer` storage
/// class, except that a `BufferBlock` decoration is not allowed.
///
/// When we encounter a global variable in the [`Storage`] or [`Uniform`]
/// address spaces whose type is not already [`Struct`], this backend implicitly
/// wraps the global variable in a struct: we generate a SPIR-V global variable
/// holding an `OpTypeStruct` with a single member, whose type is what the Naga
/// global's type would suggest, decorated as required above.
///
/// The [`helpers::global_needs_wrapper`] function determines whether a given
/// [`crate::GlobalVariable`] needs to be wrapped.
///
/// [dsi]: https://registry.khronos.org/vulkan/specs/1.3-extensions/html/vkspec.html#interfaces-resources-descset
/// [`Storage`]: crate::AddressSpace::Storage
/// [`Uniform`]: crate::AddressSpace::Uniform
/// [`Struct`]: crate::TypeInner::Struct
#[derive(Clone)]
struct GlobalVariable {
    /// The SPIR-V id of the `OpVariable` that declares the global.
    ///
    /// If this global has been implicitly wrapped in an `OpTypeStruct`, this id
    /// refers to the wrapper, not the original Naga value it contains. If you
    /// need the Naga value, use [`access_id`] instead of this field.
    ///
    /// If this global is not implicitly wrapped, this is the same as
    /// [`access_id`].
    ///
    /// This is used to compute the `access_id` pointer in function prologues,
    /// and used for `ArrayLength` expressions, which need to pass the wrapper
    /// struct.
    ///
    /// [`access_id`]: GlobalVariable::access_id
    var_id: Word,

    /// The loaded value of an `AddressSpace::Handle` global variable.
    ///
    /// If the current function uses this global variable, this is the id of an
    /// `OpLoad` instruction in the function's prologue that loads its value.
    /// (This value is assigned as we write the prologue code of each function.)
    /// It is then used for all operations on the global, such as `OpImageSample`.
    handle_id: Word,

    /// The SPIR-V id of a pointer to this variable's Naga IR value.
    ///
    /// If the current function uses this global variable, and it has been
    /// implicitly wrapped in an `OpTypeStruct`, this is the id of an
    /// `OpAccessChain` instruction in the function's prologue that refers to
    /// the wrapped value inside the struct. (This value is assigned as we write
    /// the prologue code of each function.) If you need the wrapper struct
    /// itself, use [`var_id`] instead of this field.
    ///
    /// If this global is not implicitly wrapped, this is the same as
    /// [`var_id`].
    ///
    /// [`var_id`]: GlobalVariable::var_id
    access_id: Word,
}

impl GlobalVariable {
    const fn dummy() -> Self {
        Self {
            var_id: 0,
            handle_id: 0,
            access_id: 0,
        }
    }

    const fn new(id: Word) -> Self {
        Self {
            var_id: id,
            handle_id: 0,
            access_id: 0,
        }
    }

    /// Prepare `self` for use within a single function.
    const fn reset_for_function(&mut self) {
        self.handle_id = 0;
        self.access_id = 0;
    }
}

struct FunctionArgument {
    /// Actual instruction of the argument.
    instruction: Instruction,
    handle_id: Word,
}

/// Tracks the expressions for which the backend emits the following instructions:
/// - OpConstantTrue
/// - OpConstantFalse
/// - OpConstant
/// - OpConstantComposite
/// - OpConstantNull
struct ExpressionConstnessTracker {
    inner: crate::arena::HandleSet<crate::Expression>,
}

impl ExpressionConstnessTracker {
    fn from_arena(arena: &crate::Arena<crate::Expression>) -> Self {
        let mut inner = crate::arena::HandleSet::for_arena(arena);
        for (handle, expr) in arena.iter() {
            let insert = match *expr {
                crate::Expression::Literal(_)
                | crate::Expression::ZeroValue(_)
                | crate::Expression::Constant(_) => true,
                crate::Expression::Compose { ref components, .. } => {
                    components.iter().all(|&h| inner.contains(h))
                }
                crate::Expression::Splat { value, .. } => inner.contains(value),
                _ => false,
            };
            if insert {
                inner.insert(handle);
            }
        }
        Self { inner }
    }

    fn is_const(&self, value: Handle<crate::Expression>) -> bool {
        self.inner.contains(value)
    }
}

823/// General information needed to emit SPIR-V for Naga statements.
struct BlockContext<'w> {
    /// The writer handling the module to which this code belongs.
    writer: &'w mut Writer,

    /// The [`Module`](crate::Module) for which we're generating code.
    ir_module: &'w crate::Module,

    /// The [`Function`](crate::Function) for which we're generating code.
    ir_function: &'w crate::Function,

    /// Information module validation produced about
    /// [`ir_function`](BlockContext::ir_function).
    fun_info: &'w crate::valid::FunctionInfo,

    /// The [`spv::Function`](Function) to which we are contributing SPIR-V instructions.
    function: &'w mut Function,

    /// SPIR-V ids for expressions we've evaluated.
    cached: CachedExpressions,

    /// The `Writer`'s temporary vector, for convenience.
    temp_list: Vec<Word>,

    /// Tracks the constness of `Expression`s residing in `self.ir_function.expressions`.
    expression_constness: ExpressionConstnessTracker,

    /// See [`Options::force_loop_bounding`].
    force_loop_bounding: bool,

    /// Map from an expression whose type is a ray query, or a pointer to a ray query,
    /// to its trackers.
    ///
    /// Note: this is sparse, so it can't be a `HandleVec`.
    ray_query_tracker_expr: crate::FastHashMap<Handle<crate::Expression>, RayQueryTrackers>,
}

#[derive(Clone, Copy)]
struct RayQueryTrackers {
    // Tracks whether the ray query has been initialized.
    initialized_tracker: Word,
    // Tracks the `t_max` passed to ray query initialize.
    // Unlike HLSL, SPIR-V's equivalent getter for the current committed `t` has UB (instead of
    // just returning `t_max`) if there was no previous hit (though in some places it treats the
    // behaviour as defined), so we must track the `t_max` passed to ray query initialize.
    t_max_tracker: Word,
}

impl BlockContext<'_> {
    const fn gen_id(&mut self) -> Word {
        self.writer.id_gen.next()
    }

    fn get_type_id(&mut self, lookup_type: LookupType) -> Word {
        self.writer.get_type_id(lookup_type)
    }

    fn get_handle_type_id(&mut self, handle: Handle<crate::Type>) -> Word {
        self.writer.get_handle_type_id(handle)
    }

    fn get_expression_type_id(&mut self, tr: &TypeResolution) -> Word {
        self.writer.get_expression_type_id(tr)
    }

    fn get_index_constant(&mut self, index: Word) -> Word {
        self.writer.get_constant_scalar(crate::Literal::U32(index))
    }

    fn get_scope_constant(&mut self, scope: Word) -> Word {
        self.writer
            .get_constant_scalar(crate::Literal::I32(scope as _))
    }

    fn get_pointer_type_id(&mut self, base: Word, class: spirv::StorageClass) -> Word {
        self.writer.get_pointer_type_id(base, class)
    }

    fn get_numeric_type_id(&mut self, numeric: NumericType) -> Word {
        self.writer.get_numeric_type_id(numeric)
    }
}

/// Information about a type for which we have declared a std140 layout
/// compatible variant, because the type is used in a uniform but does not
/// adhere to std140 requirements. The uniform will be declared using the
/// type `type_id`, and the result of any `Load` will be immediately converted
/// to the base type. This is used for matrices with 2 rows, as well as any
/// arrays or structs containing such matrices.
pub struct Std140CompatTypeInfo {
    /// ID of the std140 compatible type declaration.
    type_id: Word,
    /// For structs, a mapping of Naga IR struct member indices to the indices
    /// used in the generated SPIR-V. For non-struct types this will be empty.
    member_indices: Vec<u32>,
}

pub struct Writer {
    physical_layout: PhysicalLayout,
    logical_layout: LogicalLayout,
    id_gen: IdGenerator,

    /// The set of capabilities modules are permitted to use.
    ///
    /// This is initialized from `Options::capabilities`.
    capabilities_available: Option<crate::FastHashSet<Capability>>,

    /// The set of capabilities used by this module.
    ///
    /// If `capabilities_available` is `Some`, then this is always a subset of
    /// that.
    capabilities_used: crate::FastIndexSet<Capability>,

    /// The set of SPIR-V extensions used.
    extensions_used: crate::FastIndexSet<&'static str>,

    debug_strings: Vec<Instruction>,
    debugs: Vec<Instruction>,
    annotations: Vec<Instruction>,
    flags: WriterFlags,
    bounds_check_policies: BoundsCheckPolicies,
    zero_initialize_workgroup_memory: ZeroInitializeWorkgroupMemoryMode,
    force_loop_bounding: bool,
    use_storage_input_output_16: bool,
    void_type: Word,
    tuple_of_u32s_ty_id: Option<Word>,
    //TODO: convert most of these into vectors, addressable by handle indices
    lookup_type: crate::FastHashMap<LookupType, Word>,
    lookup_function: crate::FastHashMap<Handle<crate::Function>, Word>,
    lookup_function_type: crate::FastHashMap<LookupFunctionType, Word>,
    /// Operations which have been wrapped in a helper function. The value is
    /// the ID of the function, which should be called instead of emitting code
    /// for the operation directly.
    wrapped_functions: crate::FastHashMap<WrappedFunction, Word>,
    /// Indexed by const-expression handle indexes.
    constant_ids: HandleVec<crate::Expression, Word>,
    cached_constants: crate::FastHashMap<CachedConstant, Word>,
    global_variables: HandleVec<crate::GlobalVariable, GlobalVariable>,
    std140_compat_uniform_types: crate::FastHashMap<Handle<crate::Type>, Std140CompatTypeInfo>,
    fake_missing_bindings: bool,
    binding_map: BindingMap,

    // Cached expressions are only meaningful within a BlockContext, but we
    // retain the table here between functions to save heap allocations.
    saved_cached: CachedExpressions,

    gl450_ext_inst_id: Word,

    // Just a temporary list of SPIR-V ids
    temp_list: Vec<Word>,

    ray_query_functions: crate::FastHashMap<LookupRayQueryFunction, Word>,

    ray_tracing_functions: crate::FastHashMap<LookupRaytracingFunction, Word>,

    has_ray_tracing_pipeline: bool,

    /// F16 I/O polyfill manager for handling `f16` input/output variables
    /// when `StorageInputOutput16` capability is not available.
    io_f16_polyfills: f16_polyfill::F16IoPolyfill,

    /// Non-semantic debug printf extension `OpExtInstImport`.
    debug_printf: Option<Word>,
    pub(crate) ray_query_initialization_tracking: bool,

    /// Whether the arguments to trace ray should be validated.
    pub(crate) trace_ray_argument_validation: bool,

    /// See docs in [`Options`]
    task_dispatch_limits: Option<TaskDispatchLimits>,
    /// See docs in [`Options`]
    mesh_shader_primitive_indices_clamp: bool,
}

bitflags::bitflags! {
    #[derive(Clone, Copy, Debug, Eq, PartialEq)]
    pub struct WriterFlags: u32 {
        /// Include debug labels for everything.
        const DEBUG = 0x1;

        /// Flip Y coordinate of [`BuiltIn::Position`] output.
        ///
        /// [`BuiltIn::Position`]: crate::BuiltIn::Position
        const ADJUST_COORDINATE_SPACE = 0x2;

        /// Emit [`OpName`][op] for input/output locations.
        ///
        /// Contrary to spec, some drivers treat it as semantic, not allowing
        /// any conflicts.
        ///
        /// [op]: https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#OpName
        const LABEL_VARYINGS = 0x4;

        /// Emit [`PointSize`] output builtin to vertex shaders, which is
        /// required for drawing with `PointList` topology.
        ///
        /// [`PointSize`]: crate::BuiltIn::PointSize
        const FORCE_POINT_SIZE = 0x8;

        /// Clamp [`BuiltIn::FragDepth`] output between 0 and 1.
        ///
        /// [`BuiltIn::FragDepth`]: crate::BuiltIn::FragDepth
        const CLAMP_FRAG_DEPTH = 0x10;

        /// Instead of silently failing if the arguments to generate a ray query are
        /// invalid, use the debug printf extension to print to the command line.
        ///
        /// Note: `VK_KHR_shader_non_semantic_info` must be enabled. This will have no
        /// effect if `options.ray_query_initialization_tracking` is set to false.
        const PRINT_ON_RAY_QUERY_INITIALIZATION_FAIL = 0x20;

        /// Instead of silently failing if the arguments to `traceRays` are
        /// invalid, use the debug printf extension to print to the command line.
        ///
        /// Note: `VK_KHR_shader_non_semantic_info` must be enabled. This will have no
        /// effect if `options.trace_ray_argument_validation` is set to false.
        const PRINT_ON_TRACE_RAYS_FAIL = 0x40;
    }
}

#[derive(Copy, Clone, Debug, Default, PartialEq, Eq, Hash)]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
pub struct BindingInfo {
    pub descriptor_set: u32,
    pub binding: u32,
    /// If the binding is an unsized binding array, this overrides the size.
    pub binding_array_size: Option<u32>,
}

// Using `BTreeMap` instead of `HashMap` so that the map itself can be hashed.
pub type BindingMap = alloc::collections::BTreeMap<crate::ResourceBinding, BindingInfo>;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ZeroInitializeWorkgroupMemoryMode {
    /// Via `VK_KHR_zero_initialize_workgroup_memory` or Vulkan 1.3
    Native,
    /// Via assignments + barrier
    Polyfill,
    None,
}

#[derive(Debug, Clone)]
pub struct Options<'a> {
    /// (Major, Minor) target version of the SPIR-V.
    pub lang_version: (u8, u8),

    /// Configuration flags for the writer.
    pub flags: WriterFlags,

    /// Don't panic on missing bindings. Instead use fake values for `Binding`
    /// and `DescriptorSet` decorations. This may result in invalid SPIR-V.
    pub fake_missing_bindings: bool,

    /// Map of resources to information about the binding.
    pub binding_map: BindingMap,

    /// If given, the set of capabilities modules are allowed to use. Code that
    /// requires capabilities beyond these is rejected with an error.
    ///
    /// If this is `None`, all capabilities are permitted.
    pub capabilities: Option<crate::FastHashSet<Capability>>,

    /// How should generated code handle array, vector, matrix, or image texel
    /// indices that are out of range?
    pub bounds_check_policies: BoundsCheckPolicies,

    /// Dictates the way workgroup variables should be zero initialized.
    pub zero_initialize_workgroup_memory: ZeroInitializeWorkgroupMemoryMode,

    /// If set, loops will have code injected into them, forcing the compiler
    /// to think the number of iterations is bounded.
    pub force_loop_bounding: bool,

    /// If set, ray queries will get a variable to track their state to prevent
    /// misuse.
    pub ray_query_initialization_tracking: bool,

    /// If set, arguments to `traceRays` calls will be validated.
    pub trace_ray_argument_validation: bool,

    /// Whether to use the `StorageInputOutput16` capability for `f16` shader I/O.
    /// When false, `f16` I/O is polyfilled using `f32` types with conversions.
    pub use_storage_input_output_16: bool,

    pub debug_info: Option<DebugInfo<'a>>,

    /// Limits on the mesh shader dispatch group that a task workgroup can dispatch.
    ///
    /// Metal, for example, limits this to 1024 workgroups per task shader dispatch.
    /// Dispatching more is undefined behavior, so when limits are given, a dispatch
    /// that exceeds them is validated to dispatch zero workgroups instead.
    pub task_dispatch_limits: Option<TaskDispatchLimits>,

    /// If true, naga may generate checks that the primitive indices are valid in the output.
    ///
    /// Currently this validation is unimplemented.
    pub mesh_shader_primitive_indices_clamp: bool,
}

impl Default for Options<'_> {
    fn default() -> Self {
        let mut flags = WriterFlags::ADJUST_COORDINATE_SPACE
            | WriterFlags::LABEL_VARYINGS
            | WriterFlags::CLAMP_FRAG_DEPTH;
        if cfg!(debug_assertions) {
            flags |= WriterFlags::DEBUG;
        }
        Options {
            lang_version: (1, 0),
            flags,
            fake_missing_bindings: true,
            binding_map: BindingMap::default(),
            capabilities: None,
            bounds_check_policies: BoundsCheckPolicies::default(),
            zero_initialize_workgroup_memory: ZeroInitializeWorkgroupMemoryMode::Polyfill,
            force_loop_bounding: true,
            ray_query_initialization_tracking: true,
            trace_ray_argument_validation: true,
            use_storage_input_output_16: true,
            debug_info: None,
            task_dispatch_limits: None,
            mesh_shader_primitive_indices_clamp: true,
        }
    }
}

/// A subset of options meant to be changed per pipeline.
#[derive(Debug, Clone)]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
pub struct PipelineOptions {
    /// The stage of the entry point.
    pub shader_stage: crate::ShaderStage,
    /// The name of the entry point.
    ///
    /// If no matching entry point is found while creating a [`Writer`], an error is returned.
    pub entry_point: String,
}

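/// Writes `module` as SPIR-V and returns the resulting words.
///
/// A minimal usage sketch, assuming this crate is consumed under the name
/// `naga` with the WGSL front end enabled. The shader source is a made-up
/// placeholder, and error handling is reduced to `expect` for brevity.
///
/// ```ignore
/// // Hypothetical one-line compute shader, used only for illustration.
/// let source = "@compute @workgroup_size(1) fn main() {}";
/// let module = naga::front::wgsl::parse_str(source).expect("WGSL should parse");
///
/// // The backend needs the `ModuleInfo` produced by validation.
/// let info = naga::valid::Validator::new(
///     naga::valid::ValidationFlags::all(),
///     naga::valid::Capabilities::all(),
/// )
/// .validate(&module)
/// .expect("module should validate");
///
/// // `None` for `pipeline_options` means no single entry point is selected.
/// let options = naga::back::spv::Options::default();
/// let words: Vec<u32> = naga::back::spv::write_vec(&module, &info, &options, None)
///     .expect("SPIR-V generation should succeed");
/// ```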
pub fn write_vec(
    module: &crate::Module,
    info: &crate::valid::ModuleInfo,
    options: &Options,
    pipeline_options: Option<&PipelineOptions>,
) -> Result<Vec<u32>, Error> {
    let mut words: Vec<u32> = Vec::new();
    let mut w = Writer::new(options)?;

    w.write(
        module,
        info,
        pipeline_options,
        &options.debug_info,
        &mut words,
    )?;
    Ok(words)
}

pub fn supported_capabilities() -> crate::valid::Capabilities {
    use crate::valid::Capabilities as Caps;

    Caps::IMMEDIATES
        | Caps::FLOAT64
        | Caps::PRIMITIVE_INDEX
        | Caps::TEXTURE_AND_SAMPLER_BINDING_ARRAY
        | Caps::BUFFER_BINDING_ARRAY
        | Caps::STORAGE_TEXTURE_BINDING_ARRAY
        | Caps::STORAGE_BUFFER_BINDING_ARRAY
        | Caps::ACCELERATION_STRUCTURE_BINDING_ARRAY
        | Caps::CLIP_DISTANCES
        // No cull distance
        | Caps::STORAGE_TEXTURE_16BIT_NORM_FORMATS
        | Caps::MULTIVIEW
        | Caps::EARLY_DEPTH_TEST
        | Caps::MULTISAMPLED_SHADING
        | Caps::RAY_QUERY
        | Caps::DUAL_SOURCE_BLENDING
        | Caps::CUBE_ARRAY_TEXTURES
        | Caps::SHADER_INT64
        | Caps::SUBGROUP
        | Caps::SUBGROUP_BARRIER
        | Caps::SUBGROUP_VERTEX_STAGE
        | Caps::SHADER_INT64_ATOMIC_MIN_MAX
        | Caps::SHADER_INT64_ATOMIC_ALL_OPS
        | Caps::SHADER_FLOAT32_ATOMIC
        | Caps::TEXTURE_ATOMIC
        | Caps::TEXTURE_INT64_ATOMIC
        | Caps::RAY_HIT_VERTEX_POSITION
        | Caps::SHADER_FLOAT16
        | Caps::SHADER_INT16
        // No TEXTURE_EXTERNAL
        | Caps::SHADER_FLOAT16_IN_FLOAT32
        | Caps::SHADER_BARYCENTRICS
        | Caps::MESH_SHADER
        | Caps::MESH_SHADER_POINT_TOPOLOGY
        | Caps::TEXTURE_AND_SAMPLER_BINDING_ARRAY_NON_UNIFORM_INDEXING
        // No BUFFER_BINDING_ARRAY_NON_UNIFORM_INDEXING
        | Caps::STORAGE_TEXTURE_BINDING_ARRAY_NON_UNIFORM_INDEXING
        | Caps::STORAGE_BUFFER_BINDING_ARRAY_NON_UNIFORM_INDEXING
        | Caps::COOPERATIVE_MATRIX
        | Caps::PER_VERTEX
        | Caps::RAY_TRACING_PIPELINE
        | Caps::DRAW_INDEX
        | Caps::MEMORY_DECORATION_COHERENT
        | Caps::MEMORY_DECORATION_VOLATILE
}