Slots
A new problem
My work in ntv
exposed more issues that needed to be resolved, the most significant of which was the ability of Zink to accidentally clobber output variables. What does that actually mean though?
Inputs and outputs in shaders are assigned a location (SPIRV terminology) or a slot (mesa terminology). These are helpfully defined in mesa/src/compiler/shader_enums.h
:
/**
* Indexes for vertex shader outputs, geometry shader inputs/outputs, and
* fragment shader inputs.
*
* Note that some of these values are not available to all pipeline stages.
*
* When this enum is updated, the following code must be updated too:
* - vertResults (in prog_print.c's arb_output_attrib_string())
* - fragAttribs (in prog_print.c's arb_input_attrib_string())
* - _mesa_varying_slot_in_fs()
*/
typedef enum
{
VARYING_SLOT_POS,
VARYING_SLOT_COL0, /* COL0 and COL1 must be contiguous */
VARYING_SLOT_COL1,
VARYING_SLOT_FOGC,
VARYING_SLOT_TEX0, /* TEX0-TEX7 must be contiguous */
VARYING_SLOT_TEX1,
VARYING_SLOT_TEX2,
VARYING_SLOT_TEX3,
VARYING_SLOT_TEX4,
VARYING_SLOT_TEX5,
VARYING_SLOT_TEX6,
VARYING_SLOT_TEX7,
VARYING_SLOT_PSIZ, /* Does not appear in FS */
VARYING_SLOT_BFC0, /* Does not appear in FS */
VARYING_SLOT_BFC1, /* Does not appear in FS */
VARYING_SLOT_EDGE, /* Does not appear in FS */
VARYING_SLOT_CLIP_VERTEX, /* Does not appear in FS */
VARYING_SLOT_CLIP_DIST0,
VARYING_SLOT_CLIP_DIST1,
VARYING_SLOT_CULL_DIST0,
VARYING_SLOT_CULL_DIST1,
VARYING_SLOT_PRIMITIVE_ID, /* Does not appear in VS */
VARYING_SLOT_LAYER, /* Appears as VS or GS output */
VARYING_SLOT_VIEWPORT, /* Appears as VS or GS output */
VARYING_SLOT_FACE, /* FS only */
VARYING_SLOT_PNTC, /* FS only */
VARYING_SLOT_TESS_LEVEL_OUTER, /* Only appears as TCS output. */
VARYING_SLOT_TESS_LEVEL_INNER, /* Only appears as TCS output. */
VARYING_SLOT_BOUNDING_BOX0, /* Only appears as TCS output. */
VARYING_SLOT_BOUNDING_BOX1, /* Only appears as TCS output. */
VARYING_SLOT_VIEW_INDEX,
VARYING_SLOT_VIEWPORT_MASK, /* Does not appear in FS */
VARYING_SLOT_VAR0, /* First generic varying slot */
/* the remaining are simply for the benefit of gl_varying_slot_name()
* and not to be construed as an upper bound:
*/
VARYING_SLOT_VAR1,
VARYING_SLOT_VAR2,
VARYING_SLOT_VAR3,
VARYING_SLOT_VAR4,
VARYING_SLOT_VAR5,
VARYING_SLOT_VAR6,
VARYING_SLOT_VAR7,
VARYING_SLOT_VAR8,
VARYING_SLOT_VAR9,
VARYING_SLOT_VAR10,
VARYING_SLOT_VAR11,
VARYING_SLOT_VAR12,
VARYING_SLOT_VAR13,
VARYING_SLOT_VAR14,
VARYING_SLOT_VAR15,
VARYING_SLOT_VAR16,
VARYING_SLOT_VAR17,
VARYING_SLOT_VAR18,
VARYING_SLOT_VAR19,
VARYING_SLOT_VAR20,
VARYING_SLOT_VAR21,
VARYING_SLOT_VAR22,
VARYING_SLOT_VAR23,
VARYING_SLOT_VAR24,
VARYING_SLOT_VAR25,
VARYING_SLOT_VAR26,
VARYING_SLOT_VAR27,
VARYING_SLOT_VAR28,
VARYING_SLOT_VAR29,
VARYING_SLOT_VAR30,
VARYING_SLOT_VAR31,
} gl_varying_slot;
As seen above, there’s a total of 64 slots: 32 for builtins and 32 for other usage. In ntv
, the builtins are translated from GLSL -> SPIRV. The problem arises because SPIRV doesn’t have analogues for many of the GLSL builtins, which means they need to take up space in the latter half of the slots.
As an example, VARYING_SLOT_COL0
(GLSL’s gl_Color
or gl_FrontColor
depending on shader type) does not have a SPIRV builtin. This means it’ll get emitted as VARYING_SLOT_VAR[n]
. In such a scenario, any shader-created VARYING_SLOT[n]
created from a user-defined varying will end up clobbering the color value.
More problems
The simple solution here would be to just map the first half (builtin) of the slot range onto the second half (user), but that has its own problem: slot usage must remain within the boundaries of the enum. This means that the slot usage for GLSL builtins needs to be kept to a minimum in order to leave room for user-defined varyings.
Additionally, the slots need to be remapped consistently for all types of shaders, as ntv
has no capacity to look at any shader but the one being actively processed. So doing any kind of dynamic remapping is out.
Solutions
Ideally the GLSL slot usage needs to be compacted, so I started creating a remapping array for the builtins so that I could see what was available as a SPIRV builtin and what wasn’t. Then I went over the members lacking SPIRV builtins and assigned them a value. The result was this:
/* this consistently maps slots to a zero-indexed value to avoid wasting slots */
static unsigned slot_pack_map[] = {
/* Position is builtin */
[VARYING_SLOT_POS] = UINT_MAX,
[VARYING_SLOT_COL0] = 0, /* input/output */
[VARYING_SLOT_COL1] = 1, /* input/output */
[VARYING_SLOT_FOGC] = 2, /* input/output */
/* TEX0-7 are translated to VAR0-7 by nir, so we don't need to reserve */
[VARYING_SLOT_TEX0] = UINT_MAX, /* input/output */
[VARYING_SLOT_TEX1] = UINT_MAX,
[VARYING_SLOT_TEX2] = UINT_MAX,
[VARYING_SLOT_TEX3] = UINT_MAX,
[VARYING_SLOT_TEX4] = UINT_MAX,
[VARYING_SLOT_TEX5] = UINT_MAX,
[VARYING_SLOT_TEX6] = UINT_MAX,
[VARYING_SLOT_TEX7] = UINT_MAX,
/* PointSize is builtin */
[VARYING_SLOT_PSIZ] = UINT_MAX,
[VARYING_SLOT_BFC0] = 3, /* output only */
[VARYING_SLOT_BFC1] = 4, /* output only */
[VARYING_SLOT_EDGE] = 5, /* output only */
[VARYING_SLOT_CLIP_VERTEX] = 6, /* output only */
/* ClipDistance is builtin */
[VARYING_SLOT_CLIP_DIST0] = UINT_MAX,
[VARYING_SLOT_CLIP_DIST1] = UINT_MAX,
/* CullDistance is builtin */
[VARYING_SLOT_CULL_DIST0] = UINT_MAX, /* input/output */
[VARYING_SLOT_CULL_DIST1] = UINT_MAX, /* never actually used */
/* PrimitiveId is builtin */
[VARYING_SLOT_PRIMITIVE_ID] = UINT_MAX,
/* Layer is builtin */
[VARYING_SLOT_LAYER] = UINT_MAX, /* input/output */
/* ViewportIndex is builtin */
[VARYING_SLOT_VIEWPORT] = UINT_MAX, /* input/output */
/* FrontFacing is builtin */
[VARYING_SLOT_FACE] = UINT_MAX,
/* PointCoord is builtin */
[VARYING_SLOT_PNTC] = UINT_MAX, /* input only */
/* TessLevelOuter is builtin */
[VARYING_SLOT_TESS_LEVEL_OUTER] = UINT_MAX,
/* TessLevelInner is builtin */
[VARYING_SLOT_TESS_LEVEL_INNER] = UINT_MAX,
[VARYING_SLOT_BOUNDING_BOX0] = 7, /* Only appears as TCS output. */
[VARYING_SLOT_BOUNDING_BOX1] = 8, /* Only appears as TCS output. */
[VARYING_SLOT_VIEW_INDEX] = 9, /* input/output */
[VARYING_SLOT_VIEWPORT_MASK] = 10, /* output only */
};
Now all the GLSL builtins that need slots are compacted into 11 members of the enum, which leaves the other 21 available.
Any input or output coming through ntv
now goes through a switch statement: the GLSL builtins that can be translated to SPIRV builtins are, the builtins that can’t are remapped using this array, and and rest of the slots get mapped onto a slot after the reserved slot members.
Problem solved.
For now.