Monthly Archives: July 2009

From Vertex Positions to Packed Element Arrays

At my new workplace at university I'm currently porting an advanced terrain rendering engine from DirectX to OpenGL.
One of the performance optimizations the engine uses is drawing the terrain tiles straight from the index buffer without using a vertex buffer at all: it packs the vertex position into the 32-bit index and unpacks it in the vertex shader.

Why is this faster than rendering using a vertex buffer and no index buffer?

When using an index buffer, the graphics card can cache already transformed (vertex-shaded) vertices, and when an index is reused, it can take the cached result instead of running the vertex shader again. Of course, this only works if index reuse has some temporal locality, i.e. repeated indices appear close together in the index stream, but that is the case here.
If no index buffer is used, the vertex cache won't be used, because the implicit index is different for each vertex.
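To make the cache argument concrete, here is a minimal CPU-side sketch. It assumes a simple FIFO cache keyed by index; real hardware differs in size and replacement policy, and the names countShaderRuns and gridIndices are invented for this illustration, not taken from the engine.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical model: a FIFO cache of the most recently shaded indices.
// Returns how many times the vertex shader would have to run.
std::size_t countShaderRuns(const std::vector<unsigned>& indices,
                            std::size_t cacheSize) {
    assert(cacheSize > 0);
    std::deque<unsigned> cache;
    std::size_t runs = 0;
    for (unsigned idx : indices) {
        if (std::find(cache.begin(), cache.end(), idx) != cache.end())
            continue;              // cache hit: reuse the transformed vertex
        ++runs;                    // cache miss: the shader runs
        cache.push_back(idx);
        if (cache.size() > cacheSize)
            cache.pop_front();     // evict the oldest entry
    }
    return runs;
}

// Indices for a grid of w x h vertices, two triangles per cell.
std::vector<unsigned> gridIndices(unsigned w, unsigned h) {
    assert(w >= 2 && h >= 2);
    std::vector<unsigned> out;
    for (unsigned y = 0; y + 1 < h; ++y) {
        for (unsigned x = 0; x + 1 < w; ++x) {
            unsigned i0 = y * w + x;   // top-left
            unsigned i1 = i0 + 1;      // top-right
            unsigned i2 = i0 + w;      // bottom-left
            unsigned i3 = i2 + 1;      // bottom-right
            out.insert(out.end(), {i0, i2, i1, i1, i2, i3});
        }
    }
    return out;
}
```

For a grid of 4 x 4 vertices and a 16-entry cache, the 54 indices cause only 16 shader runs in this model. Feeding the same triangles with a fresh implicit index per corner (0, 1, 2, ... 53) causes 54 runs, because no index ever repeats.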

Since I need to port the engine from DirectX to OpenGL, I did some research to see if it is possible to do the same in OpenGL.
It's not possible in exactly the same way, but you can achieve something quite similar in OpenGL 3.0.

Using the Input-Assembler Stage without Buffers (OpenGL)

This is meant as the OpenGL analogue of Using the Input-Assembler Stage without Buffers (Direct3D 10).

I think the title is self-explanatory but for greater clarity let me rephrase it:
The aim is to render something without using vertex or index buffers, that is (in OpenGL speak) using neither vertex data nor an elements array.
Instead, the automatically supplied gl_VertexID attribute (vertexId in DirectX) is used to determine which vertex the shader is currently processing.

The example in MSDN simply draws a triangle using vertexId:

VSOut VSmain(VSIn input)
{
    VSOut output;

    if (input.vertexId == 0)
        output.pos = float4(0.0, 0.5, 0.5, 1.0);
    else if (input.vertexId == 2)
        output.pos = float4(0.5, -0.5, 0.5, 1.0);
    else if (input.vertexId == 1)
        output.pos = float4(-0.5, -0.5, 0.5, 1.0);

    output.color = clamp(output.pos, 0, 1);

    return output;
}

If you want to do the same thing in OpenGL, you have at least two problems:

  • OpenGL uses glVertex* as end marker for the specification of a single vertex and the array rendering commands (glDrawArrays, glDrawElements, etc.) all require an enabled vertex array.
  • gl_VertexID is only supplied if:
    • the vertex comes from a vertex array command that specifies a complete primitive (e.g. DrawArrays, DrawElements)
    • all enabled vertex arrays have non-zero buffer object bindings, and
    • the vertex does not come from a display list, even if the display list was compiled using DrawArrays / DrawElements with data sourced from buffer objects.

    (from GL_EXT_gpu_shader4)

There is no way around these requirements, but what you can do is create a dummy vertex buffer with one element, bind it as the vertex array and simply draw as many vertices as you want. If you don't access gl_Vertex, there is no way that uninitialized data can affect the shader. Rendering beyond the end of the vertex buffer is generally undefined behavior in OpenGL, but it has worked on everything I've tested it on so far.

Drawing a circle without accessing gl_Vertex

You can download the source code here.
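The shader of the circle demo is not reproduced in this post, so the following is only a CPU-side sketch of how a position could be derived from the vertex id alone. It assumes the circle is drawn as evenly spaced points (for example as a line loop); circleVertex and its parameters are hypothetical names, and a vertex shader would evaluate the same formula from gl_VertexID.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };

// Hypothetical helper: position of vertex `vertexId` out of `vertexCount`
// evenly spaced points on a circle of radius `radius` around the origin.
// No per-vertex input data is read, only the id.
Vec2 circleVertex(int vertexId, int vertexCount, float radius) {
    assert(vertexCount > 0);
    const float twoPi = 6.28318530718f;
    float angle = twoPi * static_cast<float>(vertexId)
                        / static_cast<float>(vertexCount);
    return Vec2{ radius * std::cos(angle), radius * std::sin(angle) };
}
```

With 64 vertices and radius 0.5, vertex 0 lands at (0.5, 0) and vertex 16 at (0, 0.5), up to floating-point rounding. On the OpenGL side, the vertex count passed to the draw call would play the role of vertexCount, with the dummy buffer bound only to satisfy the requirements listed above.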

Packing Vertex Positions into the Elements Array

The next step is to start packing and unpacking data in gl_VertexID. This needs an integer type and bit operations (shifting and masking at least) in the vertex shader, so it requires at least GLSL 1.30.

The code from my proof-of-concept project is quite short, so I'm pasting the shader here:

#version 130
#extension GL_EXT_gpu_shader4 : enable

out vec4 color;

vec3 unpackVertex(int index) {
    return vec3( index & 0xFF, (index >> 8) & 0xFF, (index >> 16) & 0xFF ) * (2 / 255.0) - vec3(1.0);
}

void main()
{
    vec3 unpackedData = unpackVertex( gl_VertexID );
    gl_Position = vec4( unpackedData, 1.0 );
    color = vec4( (unpackedData + 1.0) / 2.0, 1.0 );
}

main.cpp contains the matching packing code that sets up the elements array:

#define packFloat(v) (int(((v) + 1.0) / 2 * 255) & 255)
#define packVertex(x,y,z) (packFloat( x ) + (packFloat( y ) << 8) + (packFloat( z ) << 16))

void display(void) {
    [...]

    unsigned indices[] = {
        packVertex( 0.0, 0.0, -1.0 ), packVertex( 1.0, 0.0, -1.0 ), packVertex( 1.0, 1.0, -1.0 ),
        packVertex( 0.0, 0.0, -1.0 ), packVertex( -1.0, 0.0, -1.0 ), packVertex( -1.0, -1.0, -1.0 )
    };
    glDrawElements( GL_TRIANGLES, sizeof( indices ) / sizeof( *indices ), GL_UNSIGNED_INT, indices );

    [...]
}

For the code to actually make sense, I should create an index buffer (element array buffer) in OpenGL and upload the indices into it, but this was just a test.
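To check that the packing and the shader-side unpacking really match, the round trip can be run on the CPU. The sketch below rewrites the macros as functions and mirrors unpackVertex from the shader; the Vec3 struct is a helper invented for this example, and the inputs are assumed to lie in [-1, 1].

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Same scheme as the macros: each coordinate in [-1, 1] becomes one byte.
inline std::uint32_t packFloat(float v) {
    assert(v >= -1.0f && v <= 1.0f);
    return static_cast<std::uint32_t>((v + 1.0f) / 2.0f * 255.0f) & 0xFFu;
}

// x goes into bits 0-7, y into bits 8-15, z into bits 16-23.
inline std::uint32_t packVertex(float x, float y, float z) {
    return packFloat(x) | (packFloat(y) << 8) | (packFloat(z) << 16);
}

struct Vec3 { float x, y, z; };

// CPU mirror of the shader's unpackVertex().
inline Vec3 unpackVertex(std::uint32_t index) {
    const float scale = 2.0f / 255.0f;
    return Vec3{ (index & 0xFFu) * scale - 1.0f,
                 ((index >> 8) & 0xFFu) * scale - 1.0f,
                 ((index >> 16) & 0xFFu) * scale - 1.0f };
}
```

Because the conversion truncates, the unpacked value can be off by up to one quantization step (2/255). For example, 0.0 is packed as 127 and comes back as roughly -0.004 rather than exactly 0, while -1.0 and 1.0 map to the bytes 0 and 255.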

Drawing two triangles by packing their position into the index buffer

You can download the source code here.

Note:
It has been brought to my attention (thanks Yagero) that on NVIDIA cards the driver currently reduces the element array data size from int to short, which causes the third byte (i.e. the z values in this packing scheme) to always be 0.

A fix is to use an element array buffer (index buffer) which prevents the driver from messing with the data size.
You can download the adapted source code here. To show that it works, I have swapped the x and z components in the packing scheme.
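Why only z is affected can be seen from the bit layout. The sketch below models the narrowing described in the note as a plain cast to 16 bits; this is an assumption about what the driver effectively does, and packBytes, narrowToShort and zByte are names made up for the illustration.

```cpp
#include <cassert>
#include <cstdint>

// Pack three bytes as in the scheme above:
// x in bits 0-7, y in bits 8-15, z in bits 16-23.
inline std::uint32_t packBytes(std::uint32_t x, std::uint32_t y, std::uint32_t z) {
    assert(x <= 255u && y <= 255u && z <= 255u);
    return x | (y << 8) | (z << 16);
}

// Assumed model of the driver behavior: keep only the low 16 bits.
inline std::uint32_t narrowToShort(std::uint32_t index) {
    return static_cast<std::uint16_t>(index);
}

// Extract the z byte the same way the vertex shader does.
inline std::uint32_t zByte(std::uint32_t index) {
    return (index >> 16) & 0xFFu;
}
```

After narrowing, the x and y bytes survive because they sit in the low 16 bits, while the z byte in bits 16-23 is cut off and reads back as 0. Whichever component is stored in the third byte is therefore the one that reveals whether the full 32-bit index reaches the shader.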

Fast Forward

There are quite a few things I've wanted to write about for a long time. I actually started working on them and taking notes, but for one reason or another never found the time to write and publish the actual posts.
So this is meant as a fast forward through all these text bits.

Movies and Games

Memento

One awesome movie. I've seen it twice already and I still love the movie and its plot. I don't want to spoil too much of the story, but some spoilers are probably inevitable.

A few bullet points:

  • Telling a story in reverse is pretty cool in itself and perfect for the movie
  • I wonder if one could use this kind of plot for a game, too, but it would make the plot quite linear because the player could only do things that won't change the outcome of the future (i.e. what he has already played).
  • If the game is linear and goal-based, though, it should be easy to do because you would replay it like a movie. If it's plot-centric, it could be fun and create a kind of suspense similar to Memento.
  • Fahrenheit would be a game that is plot-centric like that.
  • The movie depicts a chaotic system: small imprecision leads to huge consequences ("Don't believe his lies").
  • One of the things I liked most is the way you constantly have to reevaluate everything you have seen so far because some new piece of information from the past once again changes the whole movie.
  • It's kind of difficult to remember the movie at first because the human brain is not used to this presentation of causality (seeing the effect before the cause).
  • The movie gets you thinking about what defines yourself - what are you if you can't remember things anymore and how do you define yourself through your memory.
  • A few weeks ago I read a psychology case book that contained a case similar to Memento. Alcoholics sometimes damage their brain through alcohol abuse and lose all ability to memorize anything new. Their long-term memory only works up until some point in time and after that they won't remember anything. They constantly live at that moment and will be lost forever.

Fahrenheit

Fahrenheit is pretty cool. Like many movies, it is excellent for 90% of its playtime and then suddenly starts to suck and/or becomes very weird story-wise.

It's less a typical game than a cinematic experience that does a good job of combining gaming aspects with a very advanced plot and some pretty awesome action scenes.

I've really enjoyed the game and just like Omikron: The Nomad Soul (an earlier game by developer Quantic Dream) it's positively refreshing and different.

Some random notes:

  • Vista compatibility sucks. I had to download a hacked binary to make it start at all on Vista. Otherwise it ran fine except for one crash that was due to my notebook overheating slightly.
  • Like in many other games, you can't easily skip cut scenes or dialogs, which is annoying if you just want to replay a chapter up to a certain point.
  • More annoyingly it seems that if you replay an earlier chapter, you have to replay the ones following it, too. Be careful with that :-|
  • As said, the story is awesome until you have played 90% of the game; then it drifts toward the bizarre side of things.
  • Also "Hiding at Tiffany's" is awful. If you ever play the game, be ready to replay it a few times:
    You have to hide at Tiffany's place from someone who is searching for you, and you have 30 seconds or so to find a hiding place, after which the person starts searching.
    The problem is that you don't know where the person will search for you, and if you are caught, you obviously won't see where he would have searched afterwards. So if you have bad luck, you have to replay the same part quite a few times, and I think it was one of the more frustrating parts.
  • Zero Punctuation has a good review of Fahrenheit's story issue hidden in his review of Condemned 2. It is also funny, so it's certainly worth watching :)
  • The way the controls work in Fahrenheit is also pretty interesting. The Wikipedia article about Fahrenheit has a good description of it in its Gameplay section.

Books

The Pragmatic Programmer

Last year I think I wrote that I had started reading "The Pragmatic Programmer". I actually finished reading it quite a while ago, but here are a few remarks about it:

  • It's a good and nice read and the book contains lots of helpful suggestions and things to keep in mind when coding or designing software or even just when you want to communicate with co-workers, etc.
  • It's a "common sense" book - similar to Code Complete - and when you read it, you'll often think "that's straightforward" or "that's the logical thing to do", but it is still valuable to have all that common sense written down somewhere and to be able to look at it now and then in search of inspiration.
  • It's not as useful as Code Complete though, which was a real eye opener (and still is), and it's not going to improve your coding style or the way you think about code design a lot.

OpenGL Superbible

I've bought the "OpenGL Superbible" and it's a pretty good book if you want to learn OpenGL or read a light text about certain advanced OpenGL features before rolling up your sleeves and digging around in the extension specs. It's written like a big and pretty complete tutorial and the latest edition is a lot better suited for the new features than, say, the latest edition of the OpenGL Programming Guide (which is pretty horrible - I've read through the sixth edition and it's pretty much the second edition plus a paragraph tacked on here and there and long explanations of deprecated features).

The only part of the book that is really, really weak and totally useless is the part about GLSL and shader programming. It contains a short description of GLSL, and while the chapter summary mentions functions like glUniform and co, these functions are not mentioned anywhere in the chapter, nor does the chapter provide even one example of how to set or access vertex attributes or uniforms, which is essential.

If you want to learn about GLSL and shader programming in OpenGL I can only recommend the OpenGL Shading Language and the GLSL language specifications.

Multi-Core Programming

First: don't buy this book. It's from Intel Press (you can read the book description here) and it's ridiculously expensive for the content it provides.

I got it for free at a university presentation from Intel and have read through most of it in the last few weeks, and really - if you want to learn about OpenMP and threading techniques and tools, there are better sources available online for free.

Real-Time Rendering

This book on the other hand is awesome - buy it if you are interested in computer graphics and want to understand the underlying principles better.
It's well-written and presents lots of advanced computer graphics topics in a very understandable way. The chapters about local and global illumination and their physical basis are especially good. It is a good starting point to look for resources and papers, and the book's homepage is also pretty useful: http://realtimerendering.com/.