/*!
Backend for [SPIR-V][spv] (Standard Portable Intermediate Representation).

# Layout of values in `uniform` buffers

WGSL's ["Internal Layout of Values"][ilov] rules specify the memory layout of
each WGSL type. The memory layout is important for data stored in `uniform` and
`storage` buffers, especially when exchanging data with CPU code.

Both WGSL and Vulkan specify some conditions that a type's memory layout
must satisfy in order to use that type in a `uniform` or `storage` buffer.
For `storage` buffers, the WGSL and Vulkan restrictions are compatible, but
for `uniform` buffers, WGSL allows some types that Vulkan does not, requiring
adjustments when emitting SPIR-V for `uniform` buffers.

## Padding in two-row matrices

SPIR-V provides detailed control over the layout of matrix types, and is
capable of describing the WGSL memory layout. However, Vulkan imposes
additional restrictions.

Vulkan's ["extended layout"][extended-layout] (also known as std140) rules
apply to types used in `uniform` buffers. Under these rules, matrices are
defined in terms of arrays of their vector type, and arrays are defined to have
an alignment equal to the alignment of their element type rounded up to a
multiple of 16. This means that each column of the matrix has a minimum
alignment of 16. WGSL, and consequently Naga IR, on the other hand specify a
column alignment equal to the alignment of the vector type, without rounding
it up to 16.
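
As a rough illustration of the mismatch, the sketch below computes both column
alignments for `f32` matrices. The helper names are hypothetical and are not
part of this backend. It assumes 4-byte scalars and the WGSL rule that `vec2`
aligns to two scalars while `vec3` and `vec4` align to four.

```rust
/// Size in bytes of an `f32`, the only scalar this sketch considers.
const SCALAR_WIDTH: u32 = 4;

/// Round `value` up to the next multiple of `multiple`.
fn round_up(value: u32, multiple: u32) -> u32 {
    (value + multiple - 1) / multiple * multiple
}

/// WGSL alignment of a column vector with `rows` components (2, 3, or 4).
fn wgsl_column_alignment(rows: u32) -> u32 {
    // `vec2` aligns to two scalars; `vec3` and `vec4` align to four.
    let lanes = if rows == 2 { 2 } else { 4 };
    lanes * SCALAR_WIDTH
}

/// Column alignment once it is rounded up to a multiple of 16.
fn std140_column_alignment(rows: u32) -> u32 {
    round_up(wgsl_column_alignment(rows), 16)
}

fn main() {
    for rows in [2, 3, 4] {
        println!(
            "rows = {rows}: WGSL {}, std140 {}",
            wgsl_column_alignment(rows),
            std140_column_alignment(rows)
        );
    }
}
```

Under these assumptions only the two-row case differs (8 bytes versus 16),
which is why the rest of this section deals with two-row matrices.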

To compensate for this, for any `struct` used as a `uniform` buffer which
contains a two-row matrix, we declare an additional "std140 compatible" type
in which each column of the matrix has been decomposed into the containing
struct. For example, the following WGSL struct type:

```ignore
struct Baz {
    m: mat3x2<f32>,
}
```

is rendered as the SPIR-V struct type:

```ignore
OpTypeStruct %v2float %v2float %v2float
```

This has the effect that struct indices in Naga IR for such types do not
correspond to the struct indices used in SPIR-V. A mapping of struct indices
for these types is maintained in [`Std140CompatTypeInfo`].
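
The index shift can be pictured with a small sketch. The `Member` enum and the
`spirv_member_indices` function are hypothetical and only illustrate the idea;
they are not meant to reflect the actual contents of `Std140CompatTypeInfo`.

```rust
/// Hypothetical summary of a Naga struct member, for illustration only.
enum Member {
    /// A two-row matrix with the given number of columns.
    MatCx2 { columns: u32 },
    /// Any member that is emitted unchanged.
    Other,
}

/// For each Naga member index, return the index of the first SPIR-V member
/// that holds its data once two-row matrices are split into columns.
fn spirv_member_indices(members: &[Member]) -> Vec<u32> {
    let mut next = 0;
    members
        .iter()
        .map(|member| {
            let first = next;
            next += match *member {
                Member::MatCx2 { columns } => columns,
                Member::Other => 1,
            };
            first
        })
        .collect()
}

fn main() {
    let members = [
        Member::Other,
        Member::MatCx2 { columns: 3 },
        Member::Other,
    ];
    println!("{:?}", spirv_member_indices(&members));
}
```

In this sketch a member that follows a three-column matrix moves from Naga
index 2 to SPIR-V index 4, because the matrix occupies three struct members.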

Additionally, any two-row matrices that are declared directly as uniform
buffers without being wrapped in a struct are declared as a struct containing a
vector member for each column. Any array of a two-row matrix in a uniform
buffer is declared as an array of a struct containing a vector member for each
column. Any struct or array within a uniform buffer which contains a member or
whose base type requires a std140 compatible type declaration, itself requires a
std140 compatible type declaration.
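
Read as a recursive rule, the paragraph above could be sketched as follows.
The `Ty` enum is a hypothetical, heavily simplified stand-in for Naga's type
representation, and `needs_std140_compat` is not a function of this backend.

```rust
/// Hypothetical, simplified type tree used only for this illustration.
enum Ty {
    /// A matrix with the given number of rows.
    Matrix { rows: u32 },
    /// An array with the given base type.
    Array(Box<Ty>),
    /// A struct with the given member types.
    Struct(Vec<Ty>),
    /// Any other type.
    Other,
}

/// Whether a type used in a uniform buffer would need a std140 compatible
/// declaration under the rule described above.
fn needs_std140_compat(ty: &Ty) -> bool {
    match ty {
        // Two-row matrices are the root cause.
        Ty::Matrix { rows } => *rows == 2,
        // Arrays inherit the requirement from their base type.
        Ty::Array(base) => needs_std140_compat(base),
        // Structs inherit it from any member.
        Ty::Struct(members) => members.iter().any(needs_std140_compat),
        Ty::Other => false,
    }
}

fn main() {
    let ty = Ty::Struct(vec![
        Ty::Other,
        Ty::Array(Box::new(Ty::Matrix { rows: 2 })),
    ]);
    println!("{}", needs_std140_compat(&ty));
}
```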

Whenever a value of such a type is [`loaded`] we insert code to convert the
loaded value from the std140 compatible type to the regular type. This occurs
in `BlockContext::write_checked_load`, making use of the wrapper function
defined by `Writer::write_wrapped_convert_from_std140_compat_type`. For matrices
that have been decomposed as separate columns in the containing struct, we load
each column separately then composite the matrix type in
`BlockContext::maybe_write_load_uniform_matcx2_struct_member`.
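
Conceptually, the conversion for a struct like `Baz` amounts to reading each
column and recombining them into a matrix. The sketch below mirrors that idea
on the CPU in plain Rust. The types and the function are hypothetical
stand-ins, since the backend emits SPIR-V instructions rather than Rust code.

```rust
/// Hypothetical mirror of the std140 compatible form: one member per column.
#[derive(Debug, PartialEq)]
struct BazStd140 {
    m_0: [f32; 2],
    m_1: [f32; 2],
    m_2: [f32; 2],
}

/// Hypothetical mirror of the regular form: a single 3-column, 2-row matrix.
#[derive(Debug, PartialEq)]
struct Baz {
    m: [[f32; 2]; 3],
}

/// Read each column separately, then compose the matrix from the columns.
fn convert_from_std140_compat(value: &BazStd140) -> Baz {
    Baz {
        m: [value.m_0, value.m_1, value.m_2],
    }
}

fn main() {
    let loaded = BazStd140 {
        m_0: [1.0, 2.0],
        m_1: [3.0, 4.0],
        m_2: [5.0, 6.0],
    };
    println!("{:?}", convert_from_std140_compat(&loaded));
}
```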

Whenever a column of a matrix that has been decomposed into its containing
struct is [`accessed`] with a constant index we adjust the emitted access chain
to access from the containing struct instead, in `BlockContext::write_access_chain`.

Whenever a column of a uniform buffer two-row matrix is [`dynamically accessed`]
we must first load the matrix type, converting it from its std140 compatible
type as described above, then access the column using the wrapper function
defined by `Writer::write_wrapped_matcx2_get_column`. This is handled by
`BlockContext::maybe_write_uniform_matcx2_dynamic_access`.

Note that this approach differs somewhat from the equivalent code in the HLSL
backend. For HLSL all structs containing two-row matrices (or arrays of such)
have their declarations modified, not just those used as uniform buffers.
Two-row matrices and arrays of such only use modified type declarations when
used as uniform buffers, or additionally when used as a struct member in any
context. This avoids the need to convert struct values when loading from uniform
buffers, but when loading arrays and matrices from uniform buffers or from any
struct the conversion is still required. In contrast, the approach used here
always requires converting *any* affected type when loading from a uniform
buffer, but consistently *only* when loading from a uniform buffer. As a result
this also means we only have to handle loads and not stores, as uniform buffers
are read-only.

[spv]: https://www.khronos.org/registry/SPIR-V/
[ilov]: https://gpuweb.github.io/gpuweb/wgsl/#internal-value-layout
[extended-layout]: https://docs.vulkan.org/spec/latest/chapters/interfaces.html#interfaces-resources-layout
[`loaded`]: crate::Expression::Load
[`accessed`]: crate::Expression::AccessIndex
[`dynamically accessed`]: crate::Expression::Access
*/

mod block;
mod f16_polyfill;
mod helpers;
mod image;
mod index;
mod instructions;
mod layout;
mod mesh_shader;
mod ray;
mod reclaimable;
mod selection;
mod subgroup;
mod writer;

pub use nt::spv::*;

pub use mesh_shader::{MeshReturnInfo, MeshReturnMember};
pub use spirv::{Capability, SourceLanguage};

use alloc::{string::String, vec::Vec};
use core::ops;

use spirv::Word;
use thiserror::Error;

use crate::arena::{Handle, HandleVec};
use crate::back::TaskDispatchLimits;
use crate::proc::{BoundsCheckPolicies, TypeResolution};

#[derive(Clone)]
struct PhysicalLayout {
    magic_number: Word,
    version: Word,
    generator: Word,
    bound: Word,
    instruction_schema: Word,
}

#[derive(Default)]
struct LogicalLayout {
    capabilities: Vec<Word>,
    extensions: Vec<Word>,
    ext_inst_imports: Vec<Word>,
    memory_model: Vec<Word>,
    entry_points: Vec<Word>,
    execution_modes: Vec<Word>,
    debugs: Vec<Word>,
    annotations: Vec<Word>,
    declarations: Vec<Word>,
    function_declarations: Vec<Word>,
    function_definitions: Vec<Word>,
}

#[derive(Clone)]
struct Instruction {
    op: spirv::Op,
    wc: u32,
    type_id: Option<Word>,
    result_id: Option<Word>,
    operands: Vec<Word>,
}

const BITS_PER_BYTE: crate::Bytes = 8;

#[derive(Clone, Debug, Error)]
pub enum Error {
    #[error("The requested entry point couldn't be found")]
    EntryPointNotFound,
    #[error("target SPIRV-{0}.{1} is not supported")]
    UnsupportedVersion(u8, u8),
    #[error("using {0} requires at least one of the capabilities {1:?}, but none are available")]
    MissingCapabilities(&'static str, Vec<Capability>),
    #[error("unimplemented {0}")]
    FeatureNotImplemented(&'static str),
    #[error("module is not validated properly: {0}")]
    Validation(&'static str),
    #[error("overrides should not be present at this stage")]
    Override,
    #[error(transparent)]
    ResolveArraySizeError(#[from] crate::proc::ResolveArraySizeError),
    #[error("module requires SPIRV-{0}.{1}, which isn't supported")]
    SpirvVersionTooLow(u8, u8),
    #[error("mapping of {0:?} is missing")]
    MissingBinding(crate::ResourceBinding),
}

#[derive(Default)]
struct IdGenerator(Word);

impl IdGenerator {
    const fn next(&mut self) -> Word {
        self.0 += 1;
        self.0
    }
}

#[derive(Debug, Clone)]
pub struct DebugInfo<'a> {
    pub source_code: &'a str,
    pub file_name: &'a str,
    pub language: SourceLanguage,
}

/// A SPIR-V block to which we are still adding instructions.
///
/// A `Block` represents a SPIR-V block that does not yet have a termination
/// instruction like `OpBranch` or `OpReturn`.
///
/// The `OpLabel` that starts the block is implicit. It will be emitted based on
/// `label_id` when we write the block to a `LogicalLayout`.
///
/// To terminate a `Block`, pass the block and the termination instruction to
/// `Function::consume`. This takes ownership of the `Block` and transforms it
/// into a `TerminatedBlock`.
struct Block {
    label_id: Word,
    body: Vec<Instruction>,
}

/// A SPIR-V block that ends with a termination instruction.
struct TerminatedBlock {
    label_id: Word,
    body: Vec<Instruction>,
}

impl Block {
    const fn new(label_id: Word) -> Self {
        Block {
            label_id,
            body: Vec::new(),
        }
    }
}

struct LocalVariable {
    id: Word,
    instruction: Instruction,
}

struct ResultMember {
    id: Word,
    type_id: Word,
    built_in: Option<crate::BuiltIn>,
}

struct EntryPointContext {
    argument_ids: Vec<Word>,
    results: Vec<ResultMember>,
    task_payload_variable_id: Option<Word>,
    mesh_state: Option<MeshReturnInfo>,
}

#[derive(Default)]
struct Function {
    signature: Option<Instruction>,
    parameters: Vec<FunctionArgument>,
    variables: crate::FastHashMap<Handle<crate::LocalVariable>, LocalVariable>,
    /// Map from a local variable that is a ray query to its u32 tracker.
    ray_query_initialization_tracker_variables:
        crate::FastHashMap<Handle<crate::LocalVariable>, LocalVariable>,
    /// Map from a local variable that is a ray query to its tracker for the t max.
    ray_query_t_max_tracker_variables:
        crate::FastHashMap<Handle<crate::LocalVariable>, LocalVariable>,
    /// List of local variables used as counters to ensure that all loops are bounded.
    force_loop_bounding_vars: Vec<LocalVariable>,

    /// A map from a Naga expression to the temporary SPIR-V variable we have
    /// spilled its value to, if any.
    ///
    /// Naga IR lets us apply [`Access`] expressions to expressions whose value
    /// is an array or matrix---not a pointer to such---but SPIR-V doesn't have
    /// instructions that can do the same. So when we encounter such code, we
    /// spill the expression's value to a generated temporary variable. That, we
    /// can obtain a pointer to, and then use an `OpAccessChain` instruction to
    /// do whatever series of [`Access`] and [`AccessIndex`] operations we need
    /// (with bounds checks). Finally, we generate an `OpLoad` to get the final
    /// value.
    ///
    /// [`Access`]: crate::Expression::Access
    /// [`AccessIndex`]: crate::Expression::AccessIndex
    spilled_composites: crate::FastIndexMap<Handle<crate::Expression>, LocalVariable>,

    /// A set of expressions that are either in [`spilled_composites`] or refer
    /// to some component/element of such.
    ///
    /// [`spilled_composites`]: Function::spilled_composites
    spilled_accesses: crate::arena::HandleSet<crate::Expression>,

    /// A map taking each expression to the number of [`Access`] and
    /// [`AccessIndex`] expressions that use it as a base value. If an
    /// expression has no entry, its count is zero: it is never used as a
    /// [`Access`] or [`AccessIndex`] base.
    ///
    /// We use this, together with [`ExpressionInfo::ref_count`], to recognize
    /// the tips of chains of [`Access`] and [`AccessIndex`] expressions that
    /// access spilled values --- expressions in [`spilled_composites`]. We
    /// defer generating code for the chain until we reach its tip, so we can
    /// handle it with a single instruction.
    ///
    /// [`Access`]: crate::Expression::Access
    /// [`AccessIndex`]: crate::Expression::AccessIndex
    /// [`ExpressionInfo::ref_count`]: crate::valid::ExpressionInfo
    /// [`spilled_composites`]: Function::spilled_composites
    access_uses: crate::FastHashMap<Handle<crate::Expression>, usize>,

    blocks: Vec<TerminatedBlock>,
    entry_point_context: Option<EntryPointContext>,
}

impl Function {
    fn consume(&mut self, mut block: Block, termination: Instruction) {
        block.body.push(termination);
        self.blocks.push(TerminatedBlock {
            label_id: block.label_id,
            body: block.body,
        })
    }

    fn parameter_id(&self, index: u32) -> Word {
        match self.entry_point_context {
            Some(ref context) => context.argument_ids[index as usize],
            None => self.parameters[index as usize]
                .instruction
                .result_id
                .unwrap(),
        }
    }
}

/// Characteristics of a SPIR-V `OpTypeImage` type.
///
/// SPIR-V requires non-composite types to be unique, including images. Since we
/// use `LocalType` for this deduplication, it's essential that `LocalImageType`
/// be equal whenever the corresponding `OpTypeImage`s would be. To reduce the
/// likelihood of mistakes, we use fields that correspond exactly to the
/// operands of an `OpTypeImage` instruction, using the actual SPIR-V types
/// where practical.
#[derive(Debug, PartialEq, Hash, Eq, Copy, Clone)]
struct LocalImageType {
    sampled_type: crate::Scalar,
    dim: spirv::Dim,
    flags: ImageTypeFlags,
    image_format: spirv::ImageFormat,
}

bitflags::bitflags! {
    /// Flags corresponding to the boolean(-ish) parameters to OpTypeImage.
    #[derive(Clone, Copy, Debug, Eq, Hash, PartialEq)]
    pub struct ImageTypeFlags: u8 {
        const DEPTH = 0x1;
        const ARRAYED = 0x2;
        const MULTISAMPLED = 0x4;
        const SAMPLED = 0x8;
    }
}

impl LocalImageType {
    /// Construct a `LocalImageType` from the fields of a `TypeInner::Image`.
    fn from_inner(dim: crate::ImageDimension, arrayed: bool, class: crate::ImageClass) -> Self {
        let make_flags = |multi: bool, other: ImageTypeFlags| -> ImageTypeFlags {
            let mut flags = other;
            flags.set(ImageTypeFlags::ARRAYED, arrayed);
            flags.set(ImageTypeFlags::MULTISAMPLED, multi);
            flags
        };

        let dim = spirv::Dim::from(dim);

        match class {
            crate::ImageClass::Sampled { kind, multi } => LocalImageType {
                sampled_type: crate::Scalar { kind, width: 4 },
                dim,
                flags: make_flags(multi, ImageTypeFlags::SAMPLED),
                image_format: spirv::ImageFormat::Unknown,
            },
            crate::ImageClass::Depth { multi } => LocalImageType {
                sampled_type: crate::Scalar {
                    kind: crate::ScalarKind::Float,
                    width: 4,
                },
                dim,
                flags: make_flags(multi, ImageTypeFlags::DEPTH | ImageTypeFlags::SAMPLED),
                image_format: spirv::ImageFormat::Unknown,
            },
            crate::ImageClass::Storage { format, access: _ } => LocalImageType {
                sampled_type: format.into(),
                dim,
                flags: make_flags(false, ImageTypeFlags::empty()),
                image_format: format.into(),
            },
            crate::ImageClass::External => unimplemented!(),
        }
    }
}

/// A numeric type, for use in [`LocalType`].
#[derive(Debug, PartialEq, Hash, Eq, Copy, Clone)]
enum NumericType {
    Scalar(crate::Scalar),
    Vector {
        size: crate::VectorSize,
        scalar: crate::Scalar,
    },
    Matrix {
        columns: crate::VectorSize,
        rows: crate::VectorSize,
        scalar: crate::Scalar,
    },
}

impl NumericType {
    const fn from_inner(inner: &crate::TypeInner) -> Option<Self> {
        match *inner {
            crate::TypeInner::Scalar(scalar) | crate::TypeInner::Atomic(scalar) => {
                Some(NumericType::Scalar(scalar))
            }
            crate::TypeInner::Vector { size, scalar } => Some(NumericType::Vector { size, scalar }),
            crate::TypeInner::Matrix {
                columns,
                rows,
                scalar,
            } => Some(NumericType::Matrix {
                columns,
                rows,
                scalar,
            }),
            _ => None,
        }
    }

    const fn scalar(self) -> crate::Scalar {
        match self {
            NumericType::Scalar(scalar)
            | NumericType::Vector { scalar, .. }
            | NumericType::Matrix { scalar, .. } => scalar,
        }
    }

    const fn with_scalar(self, scalar: crate::Scalar) -> Self {
        match self {
            NumericType::Scalar(_) => NumericType::Scalar(scalar),
            NumericType::Vector { size, .. } => NumericType::Vector { size, scalar },
            NumericType::Matrix { columns, rows, .. } => NumericType::Matrix {
                columns,
                rows,
                scalar,
            },
        }
    }
}

/// A cooperative type, for use in [`LocalType`].
#[derive(Debug, PartialEq, Hash, Eq, Copy, Clone)]
enum CooperativeType {
    Matrix {
        columns: crate::CooperativeSize,
        rows: crate::CooperativeSize,
        scalar: crate::Scalar,
        role: crate::CooperativeRole,
    },
}

impl CooperativeType {
    const fn from_inner(inner: &crate::TypeInner) -> Option<Self> {
        match *inner {
            crate::TypeInner::CooperativeMatrix {
                columns,
                rows,
                scalar,
                role,
            } => Some(Self::Matrix {
                columns,
                rows,
                scalar,
                role,
            }),
            _ => None,
        }
    }
}

/// A SPIR-V type constructed during code generation.
///
/// This is the variant of [`LookupType`] used to represent types that might not
/// be available in the arena. Variants are present here for one of two reasons:
///
/// -   They represent types synthesized during code generation, as explained
///     in the documentation for [`LookupType`].
///
/// -   They represent types for which SPIR-V forbids duplicate `OpType...`
///     instructions, requiring deduplication.
///
/// This is not a complete copy of [`TypeInner`]: for example, SPIR-V generation
/// never synthesizes new struct types, so `LocalType` has nothing for that.
///
/// Each `LocalType` variant should be handled identically to its analogous
/// `TypeInner` variant. You can use the [`Writer::localtype_from_inner`]
/// function to help with this, by converting everything possible to a
/// `LocalType` before inspecting it.
///
/// ## `LocalType` equality and SPIR-V `OpType` uniqueness
///
/// The definition of `Eq` on `LocalType` is carefully chosen to help us follow
/// certain SPIR-V rules. SPIR-V §2.8 requires some classes of `OpType...`
/// instructions to be unique; for example, you can't have two `OpTypeInt 32 1`
/// instructions in the same module. All 32-bit signed integers must use the
/// same type id.
///
/// All SPIR-V types that must be unique can be represented as a `LocalType`,
/// and two `LocalType`s are always `Eq` if SPIR-V would require them to use the
/// same `OpType...` instruction. This lets us avoid duplicates by recording the
/// ids of the type instructions we've already generated in a hash table,
/// [`Writer::lookup_type`], keyed by `LocalType`.
///
/// As another example, [`LocalImageType`], stored in the `LocalType::Image`
/// variant, is designed to help us deduplicate `OpTypeImage` instructions. See
/// its documentation for details.
///
/// SPIR-V does not require pointer types to be unique - but different
/// SPIR-V ids are considered to be distinct pointer types. Since Naga
/// uses structural type equality, we need to represent each Naga
/// equivalence class with a single SPIR-V `OpTypePointer`.
///
/// As it always must, the `Hash` implementation respects the `Eq` relation.
///
/// [`TypeInner`]: crate::TypeInner
#[derive(Debug, PartialEq, Hash, Eq, Copy, Clone)]
enum LocalType {
    /// A numeric type.
    Numeric(NumericType),
    Cooperative(CooperativeType),
    Pointer {
        base: Word,
        class: spirv::StorageClass,
    },
    Image(LocalImageType),
    SampledImage {
        image_type_id: Word,
    },
    Sampler,
    BindingArray {
        base: Handle<crate::Type>,
        size: u32,
    },
    AccelerationStructure,
    RayQuery,
}

/// A type encountered during SPIR-V generation.
///
/// In the process of writing SPIR-V, we need to synthesize various types for
/// intermediate results and such: pointer types, vector/matrix component types,
/// or even booleans, which usually appear in SPIR-V code even when they're not
/// used by the module source.
///
/// However, we can't use `crate::Type` or `crate::TypeInner` for these, as the
/// type arena may not contain what we need (it only contains types used
/// directly by other parts of the IR), and the IR module is immutable, so we
/// can't add anything to it.
///
/// So for local use in the SPIR-V writer, we use this type, which holds either
/// a handle into the arena, or a [`LocalType`] containing something synthesized
/// locally.
///
/// This is very similar to the [`proc::TypeResolution`] enum, with `LocalType`
/// playing the role of `TypeInner`. However, `LocalType` also has other
/// properties needed for SPIR-V generation; see the description of
/// [`LocalType`] for details.
///
/// [`proc::TypeResolution`]: crate::proc::TypeResolution
#[derive(Debug, PartialEq, Hash, Eq, Copy, Clone)]
enum LookupType {
    Handle(Handle<crate::Type>),
    Local(LocalType),
}

impl From<LocalType> for LookupType {
    fn from(local: LocalType) -> Self {
        Self::Local(local)
    }
}

#[derive(Debug, PartialEq, Clone, Hash, Eq)]
struct LookupFunctionType {
    parameter_type_ids: Vec<Word>,
    return_type_id: Word,
}

#[derive(Debug, PartialEq, Clone, Hash, Eq)]
enum LookupRayQueryFunction {
    Initialize,
    Proceed,
    GenerateIntersection,
    ConfirmIntersection,
    GetVertexPositions { committed: bool },
    GetIntersection { committed: bool },
    Terminate,
}

// Just one supported function right now, more in the future.
#[derive(Debug, PartialEq, Clone, Hash, Eq)]
enum LookupRaytracingFunction {
    TraceRay {
        payload: Handle<crate::GlobalVariable>,
    },
}

#[derive(Debug)]
enum Dimension {
    Scalar,
    Vector,
    Matrix,
    CooperativeMatrix,
}

/// Key used to look up an operation which we have wrapped in a helper
/// function, which should be called instead of directly emitting code
/// for the expression. See [`Writer::wrapped_functions`].
#[derive(Debug, Eq, PartialEq, Hash)]
enum WrappedFunction {
    BinaryOp {
        op: crate::BinaryOperator,
        left_type_id: Word,
        right_type_id: Word,
    },
    ConvertFromStd140CompatType {
        r#type: Handle<crate::Type>,
    },
    MatCx2GetColumn {
        r#type: Handle<crate::Type>,
    },
}

/// A map from evaluated [`Expression`](crate::Expression)s to their SPIR-V ids.
///
/// When we [emit] code to evaluate a given `Expression`, we record the
/// SPIR-V id of its value here, under its `Handle<Expression>` index.
///
/// A `CachedExpressions` value can be indexed by a `Handle<Expression>` value.
///
/// [emit]: index.html#expression-evaluation-time-and-scope
#[derive(Default)]
struct CachedExpressions {
    ids: HandleVec<crate::Expression, Word>,
}
impl CachedExpressions {
    fn reset(&mut self, length: usize) {
        self.ids.clear();
        self.ids.resize(length, 0);
    }
}
impl ops::Index<Handle<crate::Expression>> for CachedExpressions {
    type Output = Word;
    fn index(&self, h: Handle<crate::Expression>) -> &Word {
        let id = &self.ids[h];
        if *id == 0 {
            unreachable!("Expression {:?} is not cached!", h);
        }
        id
    }
}
impl ops::IndexMut<Handle<crate::Expression>> for CachedExpressions {
    fn index_mut(&mut self, h: Handle<crate::Expression>) -> &mut Word {
        let id = &mut self.ids[h];
        if *id != 0 {
            unreachable!("Expression {:?} is already cached!", h);
        }
        id
    }
}
impl reclaimable::Reclaimable for CachedExpressions {
    fn reclaim(self) -> Self {
        CachedExpressions {
            ids: self.ids.reclaim(),
        }
    }
}

#[derive(Eq, Hash, PartialEq)]
enum CachedConstant {
    Literal(crate::proc::HashableLiteral),
    Composite {
        ty: LookupType,
        constituent_ids: Vec<Word>,
    },
    ZeroValue(Word),
}

/// The SPIR-V representation of a [`crate::GlobalVariable`].
///
/// In the Vulkan spec 1.3.296, the section [Descriptor Set Interface][dsi] says:
///
/// > Variables identified with the `Uniform` storage class are used to access
/// > transparent buffer backed resources. Such variables *must* be:
/// >
/// > -   typed as `OpTypeStruct`, or an array of this type,
/// >
/// > -   identified with a `Block` or `BufferBlock` decoration, and
/// >
/// > -   laid out explicitly using the `Offset`, `ArrayStride`, and `MatrixStride`
/// >     decorations as specified in "Offset and Stride Assignment".
///
/// This is followed by identical language for the `StorageBuffer` storage
/// class, except that a `BufferBlock` decoration is not allowed.
///
/// When we encounter a global variable in the [`Storage`] or [`Uniform`]
/// address spaces whose type is not already [`Struct`], this backend implicitly
/// wraps the global variable in a struct: we generate a SPIR-V global variable
/// holding an `OpTypeStruct` with a single member, whose type is what the Naga
/// global's type would suggest, decorated as required above.
///
/// The [`helpers::global_needs_wrapper`] function determines whether a given
/// [`crate::GlobalVariable`] needs to be wrapped.
///
/// [dsi]: https://registry.khronos.org/vulkan/specs/1.3-extensions/html/vkspec.html#interfaces-resources-descset
/// [`Storage`]: crate::AddressSpace::Storage
/// [`Uniform`]: crate::AddressSpace::Uniform
/// [`Struct`]: crate::TypeInner::Struct
#[derive(Clone)]
struct GlobalVariable {
    /// The SPIR-V id of the `OpVariable` that declares the global.
    ///
    /// If this global has been implicitly wrapped in an `OpTypeStruct`, this id
    /// refers to the wrapper, not the original Naga value it contains. If you
    /// need the Naga value, use [`access_id`] instead of this field.
    ///
    /// If this global is not implicitly wrapped, this is the same as
    /// [`access_id`].
    ///
    /// This is used to compute the `access_id` pointer in function prologues,
    /// and used for `ArrayLength` expressions, which need to pass the wrapper
    /// struct.
    ///
    /// [`access_id`]: GlobalVariable::access_id
    var_id: Word,

    /// The loaded value of an `AddressSpace::Handle` global variable.
    ///
    /// If the current function uses this global variable, this is the id of an
    /// `OpLoad` instruction in the function's prologue that loads its value.
    /// (This value is assigned as we write the prologue code of each function.)
    /// It is then used for all operations on the global, such as `OpImageSample`.
    handle_id: Word,

    /// The SPIR-V id of a pointer to this variable's Naga IR value.
    ///
    /// If the current function uses this global variable, and it has been
    /// implicitly wrapped in an `OpTypeStruct`, this is the id of an
    /// `OpAccessChain` instruction in the function's prologue that refers to
    /// the wrapped value inside the struct. (This value is assigned as we write
    /// the prologue code of each function.) If you need the wrapper struct
    /// itself, use [`var_id`] instead of this field.
    ///
    /// If this global is not implicitly wrapped, this is the same as
    /// [`var_id`].
    ///
    /// [`var_id`]: GlobalVariable::var_id
    access_id: Word,
}

impl GlobalVariable {
    const fn dummy() -> Self {
        Self {
            var_id: 0,
            handle_id: 0,
            access_id: 0,
        }
    }

    const fn new(id: Word) -> Self {
        Self {
            var_id: id,
            handle_id: 0,
            access_id: 0,
        }
    }

    /// Prepare `self` for use within a single function.
    const fn reset_for_function(&mut self) {
        self.handle_id = 0;
        self.access_id = 0;
    }
}

struct FunctionArgument {
    /// Actual instruction of the argument.
    instruction: Instruction,
    handle_id: Word,
}

/// Tracks the expressions for which the backend emits the following instructions:
790/// - OpConstantTrue
791/// - OpConstantFalse
792/// - OpConstant
793/// - OpConstantComposite
794/// - OpConstantNull
795struct ExpressionConstnessTracker {
796    inner: crate::arena::HandleSet<crate::Expression>,
797}
798
799impl ExpressionConstnessTracker {
800    fn from_arena(arena: &crate::Arena<crate::Expression>) -> Self {
801        let mut inner = crate::arena::HandleSet::for_arena(arena);
802        for (handle, expr) in arena.iter() {
803            let insert = match *expr {
804                crate::Expression::Literal(_)
805                | crate::Expression::ZeroValue(_)
806                | crate::Expression::Constant(_) => true,
807                crate::Expression::Compose { ref components, .. } => {
808                    components.iter().all(|&h| inner.contains(h))
809                }
810                crate::Expression::Splat { value, .. } => inner.contains(value),
811                _ => false,
812            };
813            if insert {
814                inner.insert(handle);
815            }
816        }
817        Self { inner }
818    }
819
820    fn is_const(&self, value: Handle<crate::Expression>) -> bool {
821        self.inner.contains(value)
822    }
823}

/// General information needed to emit SPIR-V for Naga statements.
struct BlockContext<'w> {
    /// The writer handling the module to which this code belongs.
    writer: &'w mut Writer,

    /// The [`Module`](crate::Module) for which we're generating code.
    ir_module: &'w crate::Module,

    /// The [`Function`](crate::Function) for which we're generating code.
    ir_function: &'w crate::Function,

    /// Information that module validation produced about
    /// [`ir_function`](BlockContext::ir_function).
    fun_info: &'w crate::valid::FunctionInfo,

    /// The [`spv::Function`](Function) to which we are contributing SPIR-V instructions.
    function: &'w mut Function,

    /// SPIR-V ids for expressions we've evaluated.
    cached: CachedExpressions,

    /// The `Writer`'s temporary vector, for convenience.
    temp_list: Vec<Word>,

    /// Tracks the constness of `Expression`s residing in `self.ir_function.expressions`.
    expression_constness: ExpressionConstnessTracker,

    /// See [`Options::force_loop_bounding`].
    force_loop_bounding: bool,

    /// Map from each expression whose type is a ray query, or a pointer to a
    /// ray query, to its trackers.
    ///
    /// This map is sparse, so it can't be a `HandleVec`.
    ray_query_tracker_expr: crate::FastHashMap<Handle<crate::Expression>, RayQueryTrackers>,
}

#[derive(Clone, Copy)]
struct RayQueryTrackers {
    // Tracks whether the ray query has been initialized.
    initialized_tracker: Word,
    // Tracks the `t_max` passed to ray query initialization.
    // Unlike HLSL, SPIR-V's getter for the current committed `t` has undefined
    // behavior (instead of just returning `t_max`) if there was no previous hit,
    // although in some places the behavior is treated as defined. We must
    // therefore track the `t_max` passed to ray query initialization ourselves.
    t_max_tracker: Word,
}

impl BlockContext<'_> {
    const fn gen_id(&mut self) -> Word {
        self.writer.id_gen.next()
    }

    fn get_type_id(&mut self, lookup_type: LookupType) -> Word {
        self.writer.get_type_id(lookup_type)
    }

    fn get_handle_type_id(&mut self, handle: Handle<crate::Type>) -> Word {
        self.writer.get_handle_type_id(handle)
    }

    fn get_expression_type_id(&mut self, tr: &TypeResolution) -> Word {
        self.writer.get_expression_type_id(tr)
    }

    fn get_index_constant(&mut self, index: Word) -> Word {
        self.writer.get_constant_scalar(crate::Literal::U32(index))
    }

    fn get_scope_constant(&mut self, scope: Word) -> Word {
        self.writer
            .get_constant_scalar(crate::Literal::I32(scope as _))
    }

    fn get_pointer_type_id(&mut self, base: Word, class: spirv::StorageClass) -> Word {
        self.writer.get_pointer_type_id(base, class)
    }

    fn get_numeric_type_id(&mut self, numeric: NumericType) -> Word {
        self.writer.get_numeric_type_id(numeric)
    }
}

/// Information about a type for which we have declared a std140 layout
/// compatible variant, because the type is used in a uniform but does not
/// adhere to std140 requirements. The uniform will be declared using the
/// type `type_id`, and the result of any `Load` will be immediately converted
/// to the base type. This is used for matrices with 2 rows, as well as any
/// arrays or structs containing such matrices.
#[derive(Debug)]
pub struct Std140CompatTypeInfo {
    /// ID of the std140 compatible type declaration.
    type_id: Word,
    /// For structs, a mapping of Naga IR struct member indices to the indices
    /// used in the generated SPIR-V. For non-struct types this will be empty.
    member_indices: Vec<u32>,
}
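
/*
One way to picture `member_indices`: when a two-row matrix member is split
into one vector member per column, every member after it shifts to a higher
index in the emitted struct. The sketch below is only an illustration under
stated assumptions. `Member` and `member_indices` are hypothetical names, and
it assumes that the index recorded for a decomposed matrix is that of its
first column. It does not reproduce the backend's actual bookkeeping.

```rust
// Hypothetical description of a struct member, reduced to what the index
// mapping needs.
#[derive(Clone, Copy)]
enum Member {
    // Any member that keeps a single slot in the emitted struct.
    Plain,
    // A matrix with two rows, emitted as one vector member per column.
    TwoRowMatrix { columns: u32 },
}

// For each source member, returns the index of its first emitted member.
fn member_indices(members: &[Member]) -> Vec<u32> {
    let mut next = 0;
    let mut indices = Vec::with_capacity(members.len());
    for member in members {
        indices.push(next);
        next += match *member {
            Member::Plain => 1,
            Member::TwoRowMatrix { columns } => columns,
        };
    }
    indices
}

fn main() {
    // Models a struct with a scalar, a three-column two-row matrix, and
    // another scalar.
    let members = [
        Member::Plain,
        Member::TwoRowMatrix { columns: 3 },
        Member::Plain,
    ];
    println!("{:?}", member_indices(&members)); // prints "[0, 1, 4]"
}
```

In this model a struct without two-row matrices maps every index to itself.
*/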

#[expect(missing_debug_implementations, reason = "would be way too verbose?")]
pub struct Writer {
    physical_layout: PhysicalLayout,
    logical_layout: LogicalLayout,
    id_gen: IdGenerator,

    /// The set of capabilities modules are permitted to use.
    ///
    /// This is initialized from `Options::capabilities`.
    capabilities_available: Option<crate::FastHashSet<Capability>>,

    /// The set of capabilities used by this module.
    ///
    /// If `capabilities_available` is `Some`, then this is always a subset of
    /// that.
    capabilities_used: crate::FastIndexSet<Capability>,

    /// The set of SPIR-V extensions used.
    extensions_used: crate::FastIndexSet<&'static str>,

    debug_strings: Vec<Instruction>,
    debugs: Vec<Instruction>,
    annotations: Vec<Instruction>,
    flags: WriterFlags,
    bounds_check_policies: BoundsCheckPolicies,
    zero_initialize_workgroup_memory: ZeroInitializeWorkgroupMemoryMode,
    force_loop_bounding: bool,
    use_storage_input_output_16: bool,
    emit_int_div_checks: bool,
    void_type: Word,
    tuple_of_u32s_ty_id: Option<Word>,
    //TODO: convert most of these into vectors, addressable by handle indices
    lookup_type: crate::FastHashMap<LookupType, Word>,
    lookup_function: crate::FastHashMap<Handle<crate::Function>, Word>,
    lookup_function_type: crate::FastHashMap<LookupFunctionType, Word>,
    /// Operations which have been wrapped in a helper function. The value is
    /// the ID of the function, which should be called instead of emitting code
    /// for the operation directly.
    wrapped_functions: crate::FastHashMap<WrappedFunction, Word>,
    /// Indexed by const-expression handle indices.
    constant_ids: HandleVec<crate::Expression, Word>,
    cached_constants: crate::FastHashMap<CachedConstant, Word>,
    global_variables: HandleVec<crate::GlobalVariable, GlobalVariable>,
    std140_compat_uniform_types: crate::FastHashMap<Handle<crate::Type>, Std140CompatTypeInfo>,
    fake_missing_bindings: bool,
    binding_map: BindingMap,

    // Cached expressions are only meaningful within a BlockContext, but we
    // retain the table here between functions to save heap allocations.
    saved_cached: CachedExpressions,

    gl450_ext_inst_id: Word,

    // Just a temporary list of SPIR-V ids
    temp_list: Vec<Word>,

    ray_query_functions: crate::FastHashMap<LookupRayQueryFunction, Word>,

    ray_tracing_functions: crate::FastHashMap<LookupRaytracingFunction, Word>,

    has_ray_tracing_pipeline: bool,

    /// F16 I/O polyfill manager for handling `f16` input/output variables
    /// when the `StorageInputOutput16` capability is not available.
    io_f16_polyfills: f16_polyfill::F16IoPolyfill,

    /// ID of the `OpExtInstImport` for the non-semantic debug printf
    /// extension, if it has been imported.
    debug_printf: Option<Word>,
    pub(crate) ray_query_initialization_tracking: bool,

    /// Whether the arguments to trace ray should be validated.
    pub(crate) trace_ray_argument_validation: bool,

    /// See docs in [`Options`].
    task_dispatch_limits: Option<TaskDispatchLimits>,
    /// See docs in [`Options`].
    mesh_shader_primitive_indices_clamp: bool,
}

bitflags::bitflags! {
    #[derive(Clone, Copy, Debug, Eq, PartialEq)]
    pub struct WriterFlags: u32 {
        /// Include debug labels for everything.
        const DEBUG = 0x1;

        /// Flip Y coordinate of [`BuiltIn::Position`] output.
        ///
        /// [`BuiltIn::Position`]: crate::BuiltIn::Position
        const ADJUST_COORDINATE_SPACE = 0x2;

        /// Emit [`OpName`][op] for input/output locations.
        ///
        /// Contrary to spec, some drivers treat it as semantic, not allowing
        /// any conflicts.
        ///
        /// [op]: https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#OpName
        const LABEL_VARYINGS = 0x4;

        /// Emit [`PointSize`] output builtin to vertex shaders, which is
        /// required for drawing with `PointList` topology.
        ///
        /// [`PointSize`]: crate::BuiltIn::PointSize
        const FORCE_POINT_SIZE = 0x8;

        /// Clamp [`BuiltIn::FragDepth`] output between 0 and 1.
        ///
        /// [`BuiltIn::FragDepth`]: crate::BuiltIn::FragDepth
        const CLAMP_FRAG_DEPTH = 0x10;

        /// Instead of failing silently when the arguments used to initialize a
        /// ray query are invalid, use the debug printf extension to print to
        /// the command line.
        ///
        /// Note: `VK_KHR_shader_non_semantic_info` must be enabled. This has no
        /// effect if `options.ray_query_initialization_tracking` is set to false.
        const PRINT_ON_RAY_QUERY_INITIALIZATION_FAIL = 0x20;

        /// Instead of failing silently when the arguments to `traceRays` are
        /// invalid, use the debug printf extension to print to the command line.
        ///
        /// Note: `VK_KHR_shader_non_semantic_info` must be enabled. This has no
        /// effect if `options.trace_ray_argument_validation` is set to false.
        const PRINT_ON_TRACE_RAYS_FAIL = 0x40;
    }
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ZeroInitializeWorkgroupMemoryMode {
    /// Via `VK_KHR_zero_initialize_workgroup_memory` or Vulkan 1.3
    Native,
    /// Via assignments + barrier
    Polyfill,
    None,
}

#[derive(Debug, Clone)]
pub struct Options<'a> {
    /// (Major, Minor) target version of the SPIR-V.
    pub lang_version: (u8, u8),

    /// Configuration flags for the writer.
    pub flags: WriterFlags,

    /// Don't panic on missing bindings. Instead use fake values for `Binding`
    /// and `DescriptorSet` decorations. This may result in invalid SPIR-V.
    pub fake_missing_bindings: bool,

    /// Map of resources to information about the binding.
    pub binding_map: BindingMap,

    /// If given, the set of capabilities modules are allowed to use. Code that
    /// requires capabilities beyond these is rejected with an error.
    ///
    /// If this is `None`, all capabilities are permitted.
    pub capabilities: Option<crate::FastHashSet<Capability>>,

    /// How should the generated code handle array, vector, matrix, or image
    /// texel indices that are out of range?
    pub bounds_check_policies: BoundsCheckPolicies,

    /// Dictates how workgroup variables are zero-initialized.
    pub zero_initialize_workgroup_memory: ZeroInitializeWorkgroupMemoryMode,

    /// If set, loops will have code injected into them, forcing the compiler
    /// to think the number of iterations is bounded.
    pub force_loop_bounding: bool,

    /// If set, ray queries will get a variable to track their state to prevent
    /// misuse.
    pub ray_query_initialization_tracking: bool,

    /// If set, arguments to `traceRays` calls will be validated.
    pub trace_ray_argument_validation: bool,

    /// Whether to use the `StorageInputOutput16` capability for `f16` shader I/O.
    /// When false, `f16` I/O is polyfilled using `f32` types with conversions.
    pub use_storage_input_output_16: bool,

    pub debug_info: Option<DebugInfo<'a>>,

    /// Limits on the mesh shader workgroups that a task workgroup can dispatch.
    ///
    /// Metal, for example, allows at most 1024 workgroups per task shader
    /// dispatch. Dispatching more is undefined behavior, so when limits are
    /// given, a dispatch that exceeds them is validated into a dispatch of
    /// zero workgroups.
    pub task_dispatch_limits: Option<TaskDispatchLimits>,

    /// If true, Naga may generate checks that the primitive indices are valid in the output.
    ///
    /// Currently this validation is unimplemented.
    pub mesh_shader_primitive_indices_clamp: bool,

    /// If true (the default), integer division and modulo operations emit
    /// wrapper functions that replace a zero divisor with one, and for signed
    /// integers also guard against `INT_MIN / -1` overflow. This matches the
    /// WGSL spec's requirement that these cases produce defined results.
    ///
    /// Set to `false` to emit raw `OpSDiv`/`OpUDiv`/`OpSRem`/`OpUMod`
    /// instructions without checks. This is faster but produces
    /// implementation-defined results when the divisor is zero. Appropriate
    /// for compute shaders where the developer guarantees non-zero divisors.
    pub emit_int_div_checks: bool,
}
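
/*
The behavior that `emit_int_div_checks` describes can be modeled on the CPU.
The functions below are a sketch under the assumption that the emitted
wrappers behave exactly as that field's documentation says (a zero divisor
becomes one, and signed `INT_MIN / -1` is guarded the same way). The function
names are hypothetical and are not the helpers the backend generates.

```rust
// Signed division: a zero divisor, or the overflowing i32::MIN / -1 case,
// is replaced by a divisor of one, so the result is simply `lhs`.
fn checked_div_i32(lhs: i32, rhs: i32) -> i32 {
    let unsafe_divisor = rhs == 0 || (lhs == i32::MIN && rhs == -1);
    let divisor = if unsafe_divisor { 1 } else { rhs };
    lhs / divisor
}

// Signed remainder with the same guard. Any value modulo one is zero.
fn checked_rem_i32(lhs: i32, rhs: i32) -> i32 {
    let unsafe_divisor = rhs == 0 || (lhs == i32::MIN && rhs == -1);
    let divisor = if unsafe_divisor { 1 } else { rhs };
    lhs % divisor
}

// Unsigned division only needs the zero check.
fn checked_div_u32(lhs: u32, rhs: u32) -> u32 {
    let divisor = if rhs == 0 { 1 } else { rhs };
    lhs / divisor
}

fn main() {
    println!("{}", checked_div_i32(7, 2)); // prints "3"
    println!("{}", checked_div_i32(7, 0)); // prints "7"
    println!("{}", checked_div_i32(i32::MIN, -1) == i32::MIN); // prints "true"
    println!("{}", checked_rem_i32(7, 0)); // prints "0"
    println!("{}", checked_div_u32(9, 0)); // prints "9"
}
```

Substituting a divisor of one keeps the guard to a single select before the
division instead of a branch around it.
*/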

impl Default for Options<'_> {
    fn default() -> Self {
        let mut flags = WriterFlags::ADJUST_COORDINATE_SPACE
            | WriterFlags::LABEL_VARYINGS
            | WriterFlags::CLAMP_FRAG_DEPTH;
        if cfg!(debug_assertions) {
            flags |= WriterFlags::DEBUG;
        }
        Options {
            lang_version: (1, 0),
            flags,
            fake_missing_bindings: true,
            binding_map: BindingMap::default(),
            capabilities: None,
            bounds_check_policies: BoundsCheckPolicies::default(),
            zero_initialize_workgroup_memory: ZeroInitializeWorkgroupMemoryMode::Polyfill,
            force_loop_bounding: true,
            ray_query_initialization_tracking: true,
            trace_ray_argument_validation: true,
            use_storage_input_output_16: true,
            debug_info: None,
            task_dispatch_limits: None,
            mesh_shader_primitive_indices_clamp: true,
            emit_int_div_checks: true,
        }
    }
}

// A subset of options meant to be changed per pipeline.
#[derive(Debug, Clone)]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
pub struct PipelineOptions {
    /// The stage of the entry point.
    pub shader_stage: crate::ShaderStage,
    /// The name of the entry point.
    ///
    /// If no matching entry point is found while creating a [`Writer`], an error is returned.
    pub entry_point: String,
}

pub fn write_vec(
    module: &crate::Module,
    info: &crate::valid::ModuleInfo,
    options: &Options,
    pipeline_options: Option<&PipelineOptions>,
) -> Result<Vec<u32>, Error> {
    let mut words: Vec<u32> = Vec::new();
    let mut w = Writer::new(options)?;

    w.write(
        module,
        info,
        pipeline_options,
        &options.debug_info,
        &mut words,
    )?;
    Ok(words)
}

pub fn supported_capabilities() -> crate::valid::Capabilities {
    use crate::valid::Capabilities as Caps;

    Caps::IMMEDIATES
        | Caps::FLOAT64
        | Caps::PRIMITIVE_INDEX
        | Caps::TEXTURE_AND_SAMPLER_BINDING_ARRAY
        | Caps::BUFFER_BINDING_ARRAY
        | Caps::STORAGE_TEXTURE_BINDING_ARRAY
        | Caps::STORAGE_BUFFER_BINDING_ARRAY
        | Caps::ACCELERATION_STRUCTURE_BINDING_ARRAY
        | Caps::CLIP_DISTANCES
        // No cull distance
        | Caps::STORAGE_TEXTURE_16BIT_NORM_FORMATS
        | Caps::MULTIVIEW
        | Caps::EARLY_DEPTH_TEST
        | Caps::MULTISAMPLED_SHADING
        | Caps::RAY_QUERY
        | Caps::DUAL_SOURCE_BLENDING
        | Caps::CUBE_ARRAY_TEXTURES
        | Caps::SHADER_INT64
        | Caps::SUBGROUP
        | Caps::SUBGROUP_BARRIER
        | Caps::SUBGROUP_VERTEX_STAGE
        | Caps::SHADER_INT64_ATOMIC_MIN_MAX
        | Caps::SHADER_INT64_ATOMIC_ALL_OPS
        | Caps::SHADER_FLOAT32_ATOMIC
        | Caps::TEXTURE_ATOMIC
        | Caps::TEXTURE_INT64_ATOMIC
        | Caps::RAY_HIT_VERTEX_POSITION
        | Caps::SHADER_FLOAT16
        | Caps::SHADER_INT16
        // No TEXTURE_EXTERNAL
        | Caps::SHADER_FLOAT16_IN_FLOAT32
        | Caps::SHADER_BARYCENTRICS
        | Caps::MESH_SHADER
        | Caps::MESH_SHADER_POINT_TOPOLOGY
        | Caps::TEXTURE_AND_SAMPLER_BINDING_ARRAY_NON_UNIFORM_INDEXING
        // No BUFFER_BINDING_ARRAY_NON_UNIFORM_INDEXING
        | Caps::STORAGE_TEXTURE_BINDING_ARRAY_NON_UNIFORM_INDEXING
        | Caps::STORAGE_BUFFER_BINDING_ARRAY_NON_UNIFORM_INDEXING
        | Caps::COOPERATIVE_MATRIX
        | Caps::PER_VERTEX
        | Caps::RAY_TRACING_PIPELINE
        | Caps::DRAW_INDEX
        | Caps::MEMORY_DECORATION_COHERENT
        | Caps::MEMORY_DECORATION_VOLATILE
}