GPU programs

GPU programs are programmable parts of the GPU pipeline, in other literature often called shaders (Banshee uses the word shader for a higher level concept, so we won't call them that). They are core objects, meaning they can be used on both sim and core threads. We'll focus on the core thread version and note the differences between the two where relevant.

In Banshee they are represented with the ct::GpuProgram class. There are six types of GPU programs: vertex, hull (tesselation control), domain (tesselation evaluation), geometry, fragment (pixel) and compute programs. Each is used for a different purpose but has the same interface. We assume the user is familiar with the GPU pipeline and what the different program types do.

Note that if you are using Banshee Shading Language you do not need to create GPU programs manually - any shaders you import and materials created from those shaders will have GPU programs created internally, but they will be hidden from the normal user.

Creating GPU programs

To create a GPU program call ct::GpuProgram::create() with a GPU_PROGRAM_DESC structure. The structure needs to have the following fields populated:

For example if we wanted to create a HLSL fragment program (HLSL source not shown):

String hlslSource = "...";
GPU_PROGRAM_DESC desc;
desc.source = hlslSource;
desc.entryPoint = "main";
desc.language = "hlsl";
SPtr<GpuProgram> myProgram = GpuProgram::create(desc);

Once the GPU program has been created it is not guaranteed to be usable. The compilation of the provided source code could have failed, which you can check by calling ct::GpuProgram::isCompiled(), and retrieve the error message by calling ct::GpuProgram::getCompileErrorMessage().

if(!myProgram->isCompiled())
gDebug().logError("GPU program compilation failed with error: " + myProgram->getCompileErrorMessage());

Be aware that shader compilation happens on the core thread. Therefore if calling these methods on the sim thread GPU program, you must first ensure the GPU program is initialized by calling CoreThread::submit() with blockUntilComplete parameter set to true, or use the core-object-specific method GpuProgram::blockUntilCoreInitialized().

// If program is used on sim thread
myProgram->blockUntilCoreInitialized();
if(!myProgram->isCompiled())
gDebug().logError("GPU program compilation failed with error: " + myProgram->getCompileErrorMessage());

Using GPU programs for rendering

To use a GPU program in a draw or dispatch call, you must first create a GPU pipeline object using the relevant GPU programs. There are two types of pipeline states:

  • ct::GraphicsPipelineState - Supports vertex, hull, domain, geometry and fragment programs. At minimum requires a vertex program, while most pipelines will use vertex & fragment programs.
  • ct::ComputePipelineState - Supports only the compute GPU program type.

To create a graphics pipeline you need to fill out the ct::PIPELINE_STATE_DESC structure with references to relevant GPU programs, followed by a call to ct::GraphicsPipelineState::create().

PIPELINE_STATE_DESC desc;
desc.vertexProgram = ...;
desc.fragmentProgram = myProgram; // Program we created in the example above
desc.geometryProgram = ...;
desc.hullProgram = ...;
desc.domainProgram = ...;
SPtr<GraphicsPipelineState> graphicsPipeline = GraphicsPipelineState::create(desc);

Note that graphics pipelines also support a set of fixed (non-programmable) states we'll discuss later.

Compute pipeline states are simpler, accepting just a single compute GPU program as a parameter to their ct::ComputePipelineState::create() method.

SPtr<GpuProgram> computeProgram = ...;
SPtr<ComputePipelineState> computePipeline = ComputePipelineState::create(computeProgram);

Once created pipelines can be bound for rendering through the globally accessible ct::RenderAPI interface. This interface is the primary entry point in the low-level rendering API and it will be used for most low-level rendering operations, as we'll see throughout this set of manuals.

Call ct::RenderAPI::setGraphicsPipeline() or ct::RenderAPI::setComputePipeline() to bind a graphics or a compute pipeline state, respectively.

// Bind pipeline for use
RenderAPI& rapi = RenderAPI::instance();
rapi.setGraphicsPipeline(graphicsPipeline);
// Or: rapi.setComputePipeline(computePipeline);

Once bound any draw or dispatch calls will be executed using the bound pipeline states. Draw/dispatch calls are explained in more detail later on.

GPU program parameters

Although you can use a GPU program without any parameters, most will require some additional data in order to perform their operations. Program parameters represent data that is static throughout a single GPU program execution (a draw/dispatch call). For example, when drawing a 3D object you will usually want to provide a projection matrix that transforms the object from 3D to 2D, according to the camera the user is viewing the object through.

You can access information about GPU program parameters by calling ct::GpuProgram::getParamDesc(). This will return a GpuParamDesc structure containing information about all GPU parameters used by that GPU program. This includes primitives (int, float, etc.), textures, samplers, buffers and parameter buffers (constant/uniform buffers in DirectX/OpenGL lingo).

You generally don't need to use this information directly. It is instead automatically parsed when you create a GPU pipeline. Once you have a pipeline you can use it to create a ct::GpuParams object that allows you to assign values to all parameters of a specific pipeline.

GpuParams

GpuParams is a container for all parameters required by a single GPU pipeline (graphics or compute). It allows you to set primitive/texture/sampler/buffer parameters used by the GPU programs, which it stores in an internal buffer. You can then bind it to a RenderAPI similar to how you bind the pipeline themselves. Assigned parameter will then be using with the current pipeline in any following draw or dispatch calls.

To create a GpuParams object call ct::GpuParams::create() with either a graphics or a compute pipeline state as a parameter.

SPtr<GraphicsPipelineState> graphicsPipeline = ...;
SPtr<GpuParams> params = GpuParams::create(graphicsPipeline);

Once created you can assign values to parameter by calling any of the following methods (depending on parameter type):

Each of the methods accepts a GpuProgramType of the program whose parameter to assign, name of the parameter (as specified in the GPU program code) and a value of the parameter.

Matrix4 viewProjMat = ...;
params->setParam(GPT_VERTEX_PROGRAM, "vertProjMatrix", viewProjMat);
SPtr<Texture> someTexture = ...;
params->setTexture(GPT_FRAGMENT_PROGRAM, "mainTexture", someTexture);

If parameters are modified often you can instead use parameter handles for faster access. Use the following methods to retrieve a handle to the parameter:

Each of the methods accepts a GpuProgramType of the program whose parameter to assign, name of the parameter (as specified in the GPU program code), and outputs a handle to the parameter as specified above. Handles provide set() and get() methods that can be used for writing and reading the parameter values.

// Same result as above, only using handles
GpuParamMat4 myMatParam;
GpuParamTexture myTextureParam;
params->getParam(GPT_VERTEX_PROGRAM, "vertProjMatrix", myMatParam);
params->getTextureParam(GPT_FRAGMENT_PROGRAM, "mainTexture", myTextureParam);
Matrix4 viewProjMat = ...;
SPtr<Texture> someTexture = ...;
myVectorParam.set(viewProjMat);
myTextureParam.set(someTexture);

Parameter handles can be retrieved once and stored, then used for quick access to the parameter without an otherwise expensive lookup that happens when setting parameter directly.

Parameter blocks

All primitive (int, float, Vector3, etc.) GPU parameters are grouped in parameter blocks - each GPU program must have at least one. These are better known as constant buffers in DirectX or uniform blocks in OpenGL/Vulkan.

Before we can assign primitive parameters to GpuParams we must first allocate their parameter blocks (in the examples above we skipped this step). Parameter blocks are represented by ct::GpuParamBlockBuffer. They are essentially just blocks of memory and only require a size in bytes in their call to ct::GpuParamBlockBuffer::create().

// Buffer with 256 bytes in size
SPtr<GpuParamBlockBuffer> buffer = GpuParamBlockBuffer::create(256);

You can find out which parameter blocks (and their sizes) a GPU program contains from its GpuParamDesc structure retrieved from GpuProgram::getParamDesc().

// Create the first available parameter block, with adequate size to fit all of its fields
SPtr<GpuParamDesc> paramDesc = myProgram->getParamDesc();
SPtr<GpuParamBlockBuffer> buffer;
if(paramDesc.paramBlocks.size() > 0)
{
UINT32 sizeBytes = paramDesc.paramBlocks[0].blockSize * 4; // Block size is in multiples of 4 bytes
buffer = GpuParamBlockBuffer::create(sizeBytes);
}

Finally you can bind the parameter block to GpuParams by calling ct::GpuParams::setParamBlockBuffer() with the GPU program type, parameter block variable name, and the parameter block buffer.

params->setParamBlockBuffer(GPT_FRAGMENT_PROGRAM, "myParamBlock", buffer);

After it is bound you we free to set primitive parameters and they will be stored in the bound buffer. This buffer can then be shared between multiple GPU programs and/or GPU pipelines.

You can also write to GpuParamBlockBuffer directly by calling its ct::GpuParamBlockBuffer::write() method. In this case you must be careful to respect the layout of the variables in the buffer as expected by the shader. This layout can be determined by examining the other entries in GpuParamDesc structure.

Binding GPU params

Once GpuParams has been created and populated with necessary data, you can bind it to the GPU by calling ct::RenderAPI::setGpuParams().

// This should be called after the pipeline expecting these parameters is bound
RenderAPI::instance().setGpuParams(params);

Vertex input

Vertex GPU programs provide information about their inputs in the form of a VertexDeclaration. This is the same structure that we used for describing per-vertex components while creating a mesh. Per-vertex input declaration can be retrieved from a GPU program by calling ct::GpuProgram::getInputDeclaration().

SPtr<GpuProgram> vertProg = ...;
SPtr<VertexDeclaration> inputs = vertProg->getInputDeclaration();

Input declaration can be used for creating meshes or vertex buffers that provide per-vertex information that a GPU program expects.

Note: This method is only available on the core thread version of GpuProgram.

GLSL specifics

When declaring vertex inputs for a GPU program written in GLSL you should use the following variable names depending on the input usage:

  • bs_position - Vertex position
  • bs_normal - Vertex normal
  • bs_tangent - Vertex tangent
  • bs_bitangent - Vertex bitangent
  • bs_texcoord - Vertex UV
  • bs_color - Vertex color
  • bs_blendweights - Blend weights used for skinning
  • bs_blendindices - Blend indices used for skinning

This allows the system to map the semantics specified in VertexDeclaration of the bound mesh or vertex buffer to the GPU program inputs. This is not required for HLSL as HLSL has built-in support for semantics which are used instead.

You can append 0-8 to the names to receive more than one element of the same name. Actual types of these elements, as well as the data stored by them doesn't need to match the names and it's up to the user to provide whatever data he needs.