naga/back/hlsl/
mod.rs

/*!
Backend for [HLSL][hlsl] (High-Level Shading Language).

# Supported shader model versions:
- 5.0
- 5.1
- 6.0 through 6.7

# Layout of values in `uniform` buffers

WGSL's ["Internal Layout of Values"][ilov] rules specify how each WGSL
type should be stored in `uniform` and `storage` buffers. The HLSL we
generate must access values in that form, even when it is not what
HLSL would use normally.

Matching the WGSL memory layout is a concern only for `uniform`
variables. WGSL `storage` buffers are translated as HLSL
`ByteAddressBuffers`, for which we generate `Load` and `Store` method
calls with explicit byte offsets. WGSL pipeline inputs must be scalars
or vectors; they cannot be matrices, which is where the interesting
problems arise. However, when an affected type appears in a struct
definition, the transformations described here are applied without
consideration of where the struct is used.

Access to storage buffers is implemented in `storage.rs`. Access to
uniform buffers is implemented where applicable in `writer.rs`.

## Row- and column-major ordering for matrices

WGSL specifies that matrices in uniform buffers are stored in
column-major order. This matches HLSL's default, so one might expect
things to be straightforward. Unfortunately, WGSL and HLSL disagree on
what indexing a matrix means: in WGSL, `m[i]` retrieves the `i`'th
*column* of `m`, whereas in HLSL it retrieves the `i`'th *row*. We
want to avoid translating `m[i]` into some complicated reassembly of a
vector from individually fetched components, so this is a problem.

However, with a bit of trickery, it is possible to use HLSL's `m[i]`
as the translation of WGSL's `m[i]`:

- We declare all matrices in uniform buffers in HLSL with the
  `row_major` qualifier, and transpose the row and column counts: a
  WGSL `mat3x4<f32>`, say, becomes an HLSL `row_major float3x4`. (Note
  that WGSL and HLSL type names put the row and column in reverse
  order.) Since the HLSL type is the transpose of how WebGPU directs
  the user to store the data, HLSL will load all matrices transposed.

- Since matrices are transposed, an HLSL indexing expression retrieves
  the "columns" of the intended WGSL value, as desired.

- For vector-matrix multiplication, since `mul(transpose(m), v)` is
  equivalent to `mul(v, m)` (note the reversal of the arguments), and
  `mul(v, transpose(m))` is equivalent to `mul(m, v)`, we can
  translate WGSL `m * v` and `v * m` to HLSL by simply reversing the
  arguments to `mul`, as the sketch after this list illustrates.

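For example, a WGSL `mat3x4<f32>` uniform and expressions using it
might translate as follows. This is an illustrative sketch only; the
actual identifiers and formatting are chosen by the writer, and `v`,
`u`, and `i` are hypothetical values of types `vec3<f32>`,
`vec4<f32>`, and `u32`:

```ignore
// WGSL:
//     var<uniform> m: mat3x4<f32>;     // three columns of vec4<f32>
//     let col: vec4<f32> = m[i];       // the i'th column
//     let a: vec4<f32> = m * v;
//     let b: vec3<f32> = u * m;

// Generated HLSL (sketch):
row_major float3x4 m;       // transposed shape: rows hold WGSL columns
float4 col = m[i];          // HLSL row i is WGSL column i
float4 a = mul(v, m);       // arguments reversed
float3 b = mul(m, u);       // arguments reversed
```
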
## Padding in two-row matrices

An HLSL `row_major floatKx2` matrix has padding between its rows that
the WGSL `matKx2<f32>` matrix it represents does not. HLSL stores all
matrix rows [aligned on 16-byte boundaries][16bb], whereas WGSL says
that the columns of a `matKx2<f32>` need only be [aligned as required
for `vec2<f32>`][ilov], which is [eight-byte alignment][8bb].

To compensate for this, any time a `matKx2<f32>` appears in a WGSL
`uniform` value or as part of a struct/array, we actually emit `K`
separate `float2` members, and assemble/disassemble the matrix from its
columns (in WGSL; rows in HLSL) upon load and store.

For example, the following WGSL struct type:

```ignore
struct Baz {
    m: mat3x2<f32>,
}
```

is rendered as the HLSL struct type:

```ignore
struct Baz {
    float2 m_0; float2 m_1; float2 m_2;
};
```

The `wrapped_struct_matrix` functions in `help.rs` generate HLSL
helper functions to access such members, converting between the stored
form and the HLSL matrix types appropriately. For example, for reading
the member `m` of the `Baz` struct above, we emit:

```ignore
float3x2 GetMatmOnBaz(Baz obj) {
    return float3x2(obj.m_0, obj.m_1, obj.m_2);
}
```

We also emit an analogous `Set` function, as well as functions for
accessing individual columns by dynamic index.

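For instance, the setter for `m` might take the following shape (a
sketch; the exact name and body are produced by the helpers in
`help.rs`):

```ignore
void SetMatmOnBaz(Baz obj, float3x2 mat) {
    obj.m_0 = mat[0];
    obj.m_1 = mat[1];
    obj.m_2 = mat[2];
}
```
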
## Sampler Handling

Due to limitations in how sampler heaps work in D3D12, we need to
access samplers through a layer of indirection. Instead of directly
binding samplers, we bind the entire sampler heap as both a standard
and a comparison sampler heap. We then use a sampler index buffer for
each bind group. This buffer is accessed in the shader to get the
actual sampler index within the heap. See the wgpu_hal dx12 backend
documentation for more information.

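As a sketch of the indirection (illustrative only: the heap size and
all names, registers, and spaces here are hypothetical, being
configured through [`Options::sampler_heap_target`] and
[`Options::sampler_buffer_binding_map`]):

```ignore
SamplerState sampler_heap[2048] : register(s0, space0);
SamplerComparisonState comparison_sampler_heap[2048] : register(s0, space1);
StructuredBuffer<uint> group0_sampler_index_buffer : register(t0, space2);

// A sampler in group 0 whose bind target's `register` is N is fetched as:
//     sampler_heap[group0_sampler_index_buffer[N]]
```
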
[hlsl]: https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl
[ilov]: https://gpuweb.github.io/gpuweb/wgsl/#internal-value-layout
[16bb]: https://github.com/microsoft/DirectXShaderCompiler/wiki/Buffer-Packing#constant-buffer-packing
[8bb]: https://gpuweb.github.io/gpuweb/wgsl/#alignment-and-size
*/
114
115mod conv;
116mod help;
117mod keywords;
118mod ray;
119mod storage;
120mod writer;
121
122use alloc::{string::String, vec::Vec};
123use core::fmt::Error as FmtError;
124
125use thiserror::Error;
126
127use crate::{back, ir, proc};
128
129#[derive(Copy, Clone, Debug, Default, PartialEq, Eq, Hash)]
130#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
131#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
132pub struct BindTarget {
133    pub space: u8,
134    /// For regular bindings this is the register number.
135    ///
136    /// For sampler bindings, this is the index to use into the bind group's sampler index buffer.
137    pub register: u32,
138    /// If the binding is an unsized binding array, this overrides the size.
139    pub binding_array_size: Option<u32>,
140    /// This is the index in the buffer at [`Options::dynamic_storage_buffer_offsets_targets`].
141    pub dynamic_storage_buffer_offsets_index: Option<u32>,
142    /// This is a hint that we need to restrict indexing of vectors, matrices and arrays.
143    ///
144    /// If [`Options::restrict_indexing`] is also `true`, we will restrict indexing.
145    #[cfg_attr(any(feature = "serialize", feature = "deserialize"), serde(default))]
146    pub restrict_indexing: bool,
147}
148
149#[derive(Clone, Debug, Default, PartialEq, Eq, Hash)]
150#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
151#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
152/// BindTarget for dynamic storage buffer offsets
153pub struct OffsetsBindTarget {
154    pub space: u8,
155    pub register: u32,
156    pub size: u32,
157}
158
159#[cfg(any(feature = "serialize", feature = "deserialize"))]
160#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
161#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
162struct BindingMapSerialization {
163    resource_binding: crate::ResourceBinding,
164    bind_target: BindTarget,
165}
166
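/// Deserialize a [`BindingMap`] from a flat sequence of
/// `(resource_binding, bind_target)` entries rather than as a map,
/// since some formats (e.g. JSON) cannot represent maps whose keys
/// are not strings.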
167#[cfg(feature = "deserialize")]
168fn deserialize_binding_map<'de, D>(deserializer: D) -> Result<BindingMap, D::Error>
169where
170    D: serde::Deserializer<'de>,
171{
172    use serde::Deserialize;
173
174    let vec = Vec::<BindingMapSerialization>::deserialize(deserializer)?;
175    let mut map = BindingMap::default();
176    for item in vec {
177        map.insert(item.resource_binding, item.bind_target);
178    }
179    Ok(map)
180}
181
// Using `BTreeMap` instead of `HashMap` so that the map itself can be hashed.
183pub type BindingMap = alloc::collections::BTreeMap<crate::ResourceBinding, BindTarget>;
184
/// An HLSL shader model version.
186#[allow(non_snake_case, non_camel_case_types)]
187#[derive(Copy, Clone, Debug, Hash, Eq, PartialEq, PartialOrd)]
188#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
189#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
190pub enum ShaderModel {
191    V5_0,
192    V5_1,
193    V6_0,
194    V6_1,
195    V6_2,
196    V6_3,
197    V6_4,
198    V6_5,
199    V6_6,
200    V6_7,
201}
202
203impl ShaderModel {
204    pub const fn to_str(self) -> &'static str {
205        match self {
206            Self::V5_0 => "5_0",
207            Self::V5_1 => "5_1",
208            Self::V6_0 => "6_0",
209            Self::V6_1 => "6_1",
210            Self::V6_2 => "6_2",
211            Self::V6_3 => "6_3",
212            Self::V6_4 => "6_4",
213            Self::V6_5 => "6_5",
214            Self::V6_6 => "6_6",
215            Self::V6_7 => "6_7",
216        }
217    }
218}
219
220impl crate::ShaderStage {
221    pub const fn to_hlsl_str(self) -> &'static str {
222        match self {
223            Self::Vertex => "vs",
224            Self::Fragment => "ps",
225            Self::Compute => "cs",
226            Self::Task | Self::Mesh => unreachable!(),
227        }
228    }
229}
230
231impl crate::ImageDimension {
232    const fn to_hlsl_str(self) -> &'static str {
233        match self {
234            Self::D1 => "1D",
235            Self::D2 => "2D",
236            Self::D3 => "3D",
237            Self::Cube => "Cube",
238        }
239    }
240}
241
242#[derive(Clone, Copy, Debug, Hash, Eq, Ord, PartialEq, PartialOrd)]
243#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
244#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
245pub struct SamplerIndexBufferKey {
246    pub group: u32,
247}
248
249#[derive(Clone, Debug, Hash, PartialEq, Eq)]
250#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
251#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
252#[cfg_attr(feature = "deserialize", serde(default))]
253pub struct SamplerHeapBindTargets {
254    pub standard_samplers: BindTarget,
255    pub comparison_samplers: BindTarget,
256}
257
258impl Default for SamplerHeapBindTargets {
259    fn default() -> Self {
260        Self {
261            standard_samplers: BindTarget {
262                space: 0,
263                register: 0,
264                binding_array_size: None,
265                dynamic_storage_buffer_offsets_index: None,
266                restrict_indexing: false,
267            },
268            comparison_samplers: BindTarget {
269                space: 1,
270                register: 0,
271                binding_array_size: None,
272                dynamic_storage_buffer_offsets_index: None,
273                restrict_indexing: false,
274            },
275        }
276    }
277}
278
279#[cfg(any(feature = "serialize", feature = "deserialize"))]
280#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
281#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
282struct SamplerIndexBufferBindingSerialization {
283    group: u32,
284    bind_target: BindTarget,
285}
286
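/// Deserialize a [`SamplerIndexBufferBindingMap`] from a flat sequence
/// of entries, in the same style as [`deserialize_binding_map`].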
287#[cfg(feature = "deserialize")]
288fn deserialize_sampler_index_buffer_bindings<'de, D>(
289    deserializer: D,
290) -> Result<SamplerIndexBufferBindingMap, D::Error>
291where
292    D: serde::Deserializer<'de>,
293{
294    use serde::Deserialize;
295
296    let vec = Vec::<SamplerIndexBufferBindingSerialization>::deserialize(deserializer)?;
297    let mut map = SamplerIndexBufferBindingMap::default();
298    for item in vec {
299        map.insert(
300            SamplerIndexBufferKey { group: item.group },
301            item.bind_target,
302        );
303    }
304    Ok(map)
305}
306
307// We use a BTreeMap here so that we can hash it.
308pub type SamplerIndexBufferBindingMap =
309    alloc::collections::BTreeMap<SamplerIndexBufferKey, BindTarget>;
310
311#[cfg(any(feature = "serialize", feature = "deserialize"))]
312#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
313#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
314struct DynamicStorageBufferOffsetTargetSerialization {
315    index: u32,
316    bind_target: OffsetsBindTarget,
317}
318
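/// Deserialize a [`DynamicStorageBufferOffsetsTargets`] map from a flat
/// sequence of `(index, bind_target)` entries, in the same style as
/// [`deserialize_binding_map`].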
319#[cfg(feature = "deserialize")]
320fn deserialize_storage_buffer_offsets<'de, D>(
321    deserializer: D,
322) -> Result<DynamicStorageBufferOffsetsTargets, D::Error>
323where
324    D: serde::Deserializer<'de>,
325{
326    use serde::Deserialize;
327
328    let vec = Vec::<DynamicStorageBufferOffsetTargetSerialization>::deserialize(deserializer)?;
329    let mut map = DynamicStorageBufferOffsetsTargets::default();
330    for item in vec {
331        map.insert(item.index, item.bind_target);
332    }
333    Ok(map)
334}
335
336pub type DynamicStorageBufferOffsetsTargets = alloc::collections::BTreeMap<u32, OffsetsBindTarget>;
337
338/// Shorthand result used internally by the backend
339type BackendResult = Result<(), Error>;
340
341#[derive(Clone, Debug, PartialEq, thiserror::Error)]
342#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
343#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
344pub enum EntryPointError {
345    #[error("mapping of {0:?} is missing")]
346    MissingBinding(crate::ResourceBinding),
347}
348
349/// Configuration used in the [`Writer`].
350#[derive(Clone, Debug, Hash, PartialEq, Eq)]
351#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
352#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
353#[cfg_attr(feature = "deserialize", serde(default))]
354pub struct Options {
    /// The HLSL shader model to be used.
356    pub shader_model: ShaderModel,
    /// Map of resource associations to binding locations.
358    #[cfg_attr(
359        feature = "deserialize",
360        serde(deserialize_with = "deserialize_binding_map")
361    )]
362    pub binding_map: BindingMap,
    /// Don't panic on missing bindings; instead, improvise a binding
    /// target from the resource's own group and binding numbers.
364    pub fake_missing_bindings: bool,
    /// Add special constants for the `SV_VertexIndex` and `SV_InstanceIndex`
    /// builtins so that, with the host's help, they behave as they do in
    /// Vulkan and Metal.
367    pub special_constants_binding: Option<BindTarget>,
368    /// Bind target of the push constant buffer
369    pub push_constants_target: Option<BindTarget>,
370    /// Bind target of the sampler heap and comparison sampler heap.
371    pub sampler_heap_target: SamplerHeapBindTargets,
372    /// Mapping of each bind group's sampler index buffer to a bind target.
373    #[cfg_attr(
374        feature = "deserialize",
375        serde(deserialize_with = "deserialize_sampler_index_buffer_bindings")
376    )]
377    pub sampler_buffer_binding_map: SamplerIndexBufferBindingMap,
378    /// Bind target for dynamic storage buffer offsets
379    #[cfg_attr(
380        feature = "deserialize",
381        serde(deserialize_with = "deserialize_storage_buffer_offsets")
382    )]
383    pub dynamic_storage_buffer_offsets_targets: DynamicStorageBufferOffsetsTargets,
384    /// Should workgroup variables be zero initialized (by polyfilling)?
385    pub zero_initialize_workgroup_memory: bool,
386    /// Should we restrict indexing of vectors, matrices and arrays?
387    pub restrict_indexing: bool,
388    /// If set, loops will have code injected into them, forcing the compiler
389    /// to think the number of iterations is bounded.
390    pub force_loop_bounding: bool,
391}
392
393impl Default for Options {
394    fn default() -> Self {
395        Options {
396            shader_model: ShaderModel::V5_1,
397            binding_map: BindingMap::default(),
398            fake_missing_bindings: true,
399            special_constants_binding: None,
400            sampler_heap_target: SamplerHeapBindTargets::default(),
401            sampler_buffer_binding_map: alloc::collections::BTreeMap::default(),
402            push_constants_target: None,
403            dynamic_storage_buffer_offsets_targets: alloc::collections::BTreeMap::new(),
404            zero_initialize_workgroup_memory: true,
405            restrict_indexing: true,
406            force_loop_bounding: true,
407        }
408    }
409}
410
411impl Options {
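    /// Resolve a WGSL resource binding to an HLSL bind target.
    ///
    /// Looks `res_binding` up in [`Options::binding_map`]. If it is missing
    /// and [`Options::fake_missing_bindings`] is set, a target is improvised
    /// from the binding's own group and binding numbers; otherwise the
    /// missing mapping is reported as an error.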
412    fn resolve_resource_binding(
413        &self,
414        res_binding: &crate::ResourceBinding,
415    ) -> Result<BindTarget, EntryPointError> {
416        match self.binding_map.get(res_binding) {
417            Some(target) => Ok(*target),
418            None if self.fake_missing_bindings => Ok(BindTarget {
419                space: res_binding.group as u8,
420                register: res_binding.binding,
421                binding_array_size: None,
422                dynamic_storage_buffer_offsets_index: None,
423                restrict_indexing: false,
424            }),
425            None => Err(EntryPointError::MissingBinding(*res_binding)),
426        }
427    }
428}
429
430/// Reflection info for entry point names.
431#[derive(Default)]
432pub struct ReflectionInfo {
433    /// Mapping of the entry point names.
434    ///
    /// Each item in the array corresponds to an entry point index. The real
    /// entry point name may be different if one of the reserved words is used.
437    ///
438    /// Note: Some entry points may fail translation because of missing bindings.
439    pub entry_point_names: Vec<Result<String, EntryPointError>>,
440}
441
442/// A subset of options that are meant to be changed per pipeline.
443#[derive(Debug, Default, Clone)]
444#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
445#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
446#[cfg_attr(feature = "deserialize", serde(default))]
447pub struct PipelineOptions {
448    /// The entry point to write.
449    ///
450    /// Entry points are identified by a shader stage specification,
451    /// and a name.
452    ///
    /// If `None`, all entry points will be written. If `Some` and the entry
    /// point is not found, an error will be returned while writing.
455    pub entry_point: Option<(ir::ShaderStage, String)>,
456}
457
458#[derive(Error, Debug)]
459pub enum Error {
460    #[error(transparent)]
461    IoError(#[from] FmtError),
462    #[error("A scalar with an unsupported width was requested: {0:?}")]
463    UnsupportedScalar(crate::Scalar),
464    #[error("{0}")]
465    Unimplemented(String), // TODO: Error used only during development
466    #[error("{0}")]
467    Custom(String),
468    #[error("overrides should not be present at this stage")]
469    Override,
470    #[error(transparent)]
471    ResolveArraySizeError(#[from] proc::ResolveArraySizeError),
472    #[error("entry point with stage {0:?} and name '{1}' not found")]
473    EntryPointNotFound(ir::ShaderStage, String),
474}
475
476#[derive(PartialEq, Eq, Hash)]
477enum WrappedType {
478    ZeroValue(help::WrappedZeroValue),
479    ArrayLength(help::WrappedArrayLength),
480    ImageSample(help::WrappedImageSample),
481    ImageQuery(help::WrappedImageQuery),
482    ImageLoadScalar(crate::Scalar),
483    Constructor(help::WrappedConstructor),
484    StructMatrixAccess(help::WrappedStructMatrixAccess),
485    MatCx2(help::WrappedMatCx2),
486    Math(help::WrappedMath),
487    UnaryOp(help::WrappedUnaryOp),
488    BinaryOp(help::WrappedBinaryOp),
489    Cast(help::WrappedCast),
490}
491
492#[derive(Default)]
493struct Wrapped {
494    types: crate::FastHashSet<WrappedType>,
495    /// If true, the sampler heaps have been written out.
496    sampler_heaps: bool,
    /// Mapping from [`SamplerIndexBufferKey`] to the name the namer returned.
498    sampler_index_buffers: crate::FastHashMap<SamplerIndexBufferKey, String>,
499}
500
501impl Wrapped {
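    /// Record that a helper for `r#type` must be emitted.
    ///
    /// Returns `true` if the type was not already recorded, i.e. the
    /// corresponding helper still needs to be written out.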
502    fn insert(&mut self, r#type: WrappedType) -> bool {
503        self.types.insert(r#type)
504    }
505
506    fn clear(&mut self) {
507        self.types.clear();
508    }
509}
510
511/// A fragment entry point to be considered when generating HLSL for the output interface of vertex
512/// entry points.
513///
514/// This is provided as an optional parameter to [`Writer::write`].
515///
516/// If this is provided, vertex outputs will be removed if they are not inputs of this fragment
517/// entry point. This is necessary for generating correct HLSL when some of the vertex shader
518/// outputs are not consumed by the fragment shader.
519pub struct FragmentEntryPoint<'a> {
520    module: &'a crate::Module,
521    func: &'a crate::Function,
522}
523
524impl<'a> FragmentEntryPoint<'a> {
525    /// Returns `None` if the entry point with the provided name can't be found or isn't a fragment
526    /// entry point.
527    pub fn new(module: &'a crate::Module, ep_name: &'a str) -> Option<Self> {
528        module
529            .entry_points
530            .iter()
531            .find(|ep| ep.name == ep_name)
532            .filter(|ep| ep.stage == crate::ShaderStage::Fragment)
533            .map(|ep| Self {
534                module,
535                func: &ep.function,
536            })
537    }
538}
539
540pub struct Writer<'a, W> {
541    out: W,
542    names: crate::FastHashMap<proc::NameKey, String>,
543    namer: proc::Namer,
544    /// HLSL backend options
545    options: &'a Options,
546    /// Per-stage backend options
547    pipeline_options: &'a PipelineOptions,
548    /// Information about entry point arguments and result types.
549    entry_point_io: crate::FastHashMap<usize, writer::EntryPointInterface>,
550    /// Set of expressions that have associated temporary variables
551    named_expressions: crate::NamedExpressions,
552    wrapped: Wrapped,
553    written_committed_intersection: bool,
554    written_candidate_intersection: bool,
555    continue_ctx: back::continue_forward::ContinueCtx,
556
    /// A reference to some part of a global variable in the [`Storage`]
    /// address space, lowered to a series of byte offset calculations.
559    ///
560    /// See the [`storage`] module for background on why we need this.
561    ///
562    /// Each [`SubAccess`] in the vector is a lowering of some [`Access`] or
563    /// [`AccessIndex`] expression to the level of byte strides and offsets. See
564    /// [`SubAccess`] for details.
565    ///
566    /// This field is a member of [`Writer`] solely to allow re-use of
567    /// the `Vec`'s dynamic allocation. The value is no longer needed
568    /// once HLSL for the access has been generated.
569    ///
570    /// [`Storage`]: crate::AddressSpace::Storage
571    /// [`SubAccess`]: storage::SubAccess
572    /// [`Access`]: crate::Expression::Access
573    /// [`AccessIndex`]: crate::Expression::AccessIndex
574    temp_access_chain: Vec<storage::SubAccess>,
575    need_bake_expressions: back::NeedBakeExpressions,
576}