Building a voxel engine like Minecraft in Unity (1 of 2)

This is part 1 of 2. These posts are meant to give a basic introduction to voxel engines. In later posts I will dig into optimization techniques and benchmarking.

So its that wonderful time of the year again where I play with something of my own choosing, uninterrupted for a few days. I left my work at work, ignore all the tasks I have to do and dig deep into something that interests me. This is my vacation, my relaxation, my Christmas.

In last years Christmas project I was optimizing a voxel rendering engine. I ran into some issues, and uncovered a problem in Unity 3D’s implementation of structs that I wrote a detailed blog post about and reported to Unity. I’m pleased to see that they have fixed the issue.


To give some background: I got interested in developing a voxel game back in 2013. It is one of those things where you can build everything from scratch with code. I spent quite some time on it back then, but after a long period of heavy focus on other work I had forgotten most of the source code and building upon it was difficult. It was my first attempt, and from the perspective of a game it got pretty far.

This video shows some of it. It was functional with multiplayer support.

This year, I want to take last years project one step further. This is a fun activity for playing around with benchmarking to understand the intrinsic of .Net, operating system and hardware.

But first I thought I would give an introduction into how to write a voxel engine, so that the optimizations would make sense for those not familiar with the concept.

The basics of voxel engines

Polygon meshes

Standard 3D graphics is made up of 3D polygon mesh. A polygon mesh usually consists of many triangles that make an object appear solid.

Image source: Wikipedia

There are many ways to represent a mesh. One of the most basic mesh formats is storing a list of points in 3D space (vertex), and triangles that form by connecting 3 vertices.

For example a cube would have 8 vertices (one for each corner) and 12 triangles. So a full cube can be described as:

Does it matter what order we address the vertices? Yes. It affects what direction is considered the “front” of the triangle. This affects what angle it will be visible from.


Just like an image can be made up of pixels in 2 dimensions, a volumetric shape can be made up of voxels in 3 dimensions. This picture shows one voxel highlighted in red:

Image source: Wikipedia

Graphics cards don’t render voxels directly, so a rendering engine needs to translate voxel data into meshes before it can be rendered as 3D on the screen. The choice of a cube as an example of mesh was not random, and we’ll get back to the actual translation between voxel and polygon mesh in part 2.

Although most people were introduced to voxels through Minecraft, the technology has been around for many years. In the early years a common use was medical scanners, and optimizing it was a matter of saving minutes for every zoom or rotate a doctor made to a scan. A lot of research has been made into rendering voxels, and there are papers out there that detail many different techniques. Today we use it for games, and we are optimizing in the 0-5 ms range.

Voxel storage

Lets just jump straight in and tackle this one first. Lets assume you are using voxels in a game such as Minecraft, so you need to store 1 voxel for each 1x1x1 meter in the world.

If you are storing a world that is (x, y, z) 10x10x10, and each voxel is represented by one byte (8-bit) then you need 1000 bytes of storage. If each voxel is one int (32-bit) then you need 4000 bytes of storage. You would probably want a much larger world than 10x10x10 voxels, so you may need to balance the number of voxel types you support with how much memory you can use. 2 bytes gives you 65536 unique voxel types, and could be a good balance.

If your world is 1.000.000×1.000.000×1.000.000 (1 million meter in each direction) and you use 2 bytes per voxel you end up with bytes, or gigabytes of RAM. Clearly this can be a bit much to load all at once on an average computer.

Even if you size down your world, anything bigger than a few meters means you run into memory access problems. This is because of how modern computers work. I’ll get back to that in discussing cache locality/cache optimized memory access in a later blog post.

So splitting the world into smaller areas (chunks) and loading only the chunks required will make the demand for RAM a bit lighter. One common size is 16x16x16=4096*2=8192 bytes per chunk. If you want to look 10 chunks (160 voxels) in any direction (radius) that is 20 chunks in each axis. This adds up to 20^3=8000 chunks*8192 bytes per chunk=65536000 bytes/1.000.000=65.536 MB (64MB RAM). Most computers can handle 64MB of RAM without any problem, for example my home computer has 64GB of RAM.



We can define a chunk as an array of integers:

Although accessing the voxel by a number between 0 and 16*16*16-1 is not very intuitive, so I would add a default index that translated x, y and z into index. If you ever worked on bitmap images you will recognize the calculation, only here it is in 3D.

Why don’t we just use a 3 dimensional array like UInt16[16][16][16]? Well, several reasons. One being speed, another being easy serialization. Arrays are objects and objects are scattered across random locations in memory. We’ll get back to some details on memory alignment and prefetcher in a later blog post.

And we can define a world as a list of chunks. Since we probably want to both read and write to the chunks we need a way to find the correct chunk quickly. Assuming the world will be big we can’t just allocate all the memory like we do with voxels inside the chunk, so we use a dictionary.

ChunkId composite key for Dictionary

Notice that we are using ChunkId as key for the dictionary. The reason for this is that the chunks will have a position in 3D space, so we need to address them by their x, y and z coordinates. To do this correctly we will create a struct called ChunkId that will function as our composite key.

As always, a proper implementation of a struct for use as Dictionary lookup key must implement IEquatable<T>, override GetHashCode and preferrably also override == and != operators. The struct should also be immutable (x, y and z are readonly). If you fail to implement IEquatable<T> the dictionary will be very slow.

A standard implementation looks like this

ReSharper was kind enough to generate everything inside “#region Equality members” for me. It is also documented in Microsoft .Net documentation here.

Access helpers

Accessing a voxel now requires you to know both ChunkId(x, y, z) and voxel id (x, y, z). That can be a bit difficult, so having a way of directly accessing it can be helpful. Lets assume that we have a world position of x, y, z and we want to be able to use this to get or set a voxel. We add this to ChunkId:

Since our chunks consist of 16x16x16 voxels, we need 4x4x4 bits to access them (4 bits gives 16 possible numbers). So in a 32-bit integer that makes out any axis, the 28 first bits make out the position of the chunk, and the 4 remaining make out the position of the voxel within the chunk. Therefore, to find our chunk we simply right shift the axis 4 bits.

We then expand our World class to contain default index property for getting and setting the voxel.

We use the FromWorldPos to translate world position into ChunkId, get the chunk from the dictionary, then we AND with hex 0xF (decimal 15). This will remove everything but the last 4 bits or the axis, which is the bits that describe the position within the chunk.


We looked at the fundamental differences between voxel data and polygon mesh based rendering. We looked at the problems with voxel data storage and how that is usually solved. We set up a fairly efficient system to store chunks and voxels in memory enabling us to represent a near infinite sized world. And we set up a simple way to access voxel data without calculating chunk id and voxel index for every voxel we want to access.

In the next part we will look into rendering from voxel to mesh to screen.

Continue to part 2…

1 thought on “Building a voxel engine like Minecraft in Unity (1 of 2)”

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: