Photogrammetry on Steroids - Part 4: From 2D to 3D

New Heights: Realistic Climbing and Bouldering

The ultimate climbing and bouldering simulation game. Explore and climb 250 real-world routes, create your own routes and compete against your friends from the safety of your computer.

[img]{STEAM_CLAN_IMAGE}/43535104/03e49f3b36def2ea4dc54d29768c74f412484e9e.jpg[/img]

Back in the office, with gigabytes of video footage from Al Legne, Rocher du Casino and Crèvecoeur, it was time to start processing. It will probably come as a surprise to no one that most, if not all, photogrammetry programs are built around photos, not videos. Luckily, taking still images out of a video is trivial with ffmpeg. We started by simply grabbing an image every 3 seconds, which makes sure there is enough overlap between the images. Feeding these images into Agisoft's Metashape already gave quite good results, but we had a hard time getting some images to align. When we looked at those images, we noticed they were very blurry.

[img]{STEAM_CLAN_IMAGE}/43535104/0ec7f790b3e362b84bcb731d66fcb0909b38ba5b.gif[/img]

Looking at the original video, we saw that around each blurry image there was more than enough footage to get a better frame out. So we used ffmpeg to extract more frames, ran blur detection to find the sharpest one, and used those as input to Metashape (a small script sketch of this step follows further down). To our own surprise, this method gave us a 99%+ alignment rate on the photos!

[h3]What does this actually mean though?[/h3]

Converting the images into a 3D model happens in several steps that each build on the one before. The first is the alignment of the photos. The program looks for features in a picture (think hard corners and the like) and tries to find them in multiple pictures. When enough of these features are found across images, it can triangulate the positions the photos were taken from and align a virtual camera for each image.

With the images aligned, the second step is to create a dense point cloud. Here the program uses things like curvature to find many, many more points that overlap between the features found in the previous step. All these points are again placed in 3D using triangulation. Once this step is done, you can already start to see a vague version of what the output will be.

Based on this dense point cloud the program can now build a mesh. It does this by some version of triangulation, and yes, this is confusing, because it is a totally different kind of triangulation than the one used in the previous steps: here it actually means "to divide into triangles". After the mesh is built there are most likely a bunch of floating bits and ugly edges, which can easily be removed with some filtering. This mesh is actually too detailed for us, because it breaks almost every program we tried to put it into. So we use the decimate function in Metashape to bring it back to a still very large, but not meganormous, number of vertices. Don't throw away that high-detail version though, because we still need it in a later step!

A mesh is still very boring without any textures on it. With the power of Metashape we can also generate albedo textures from the images. Here is where a little bit of good judgment comes in, because you have to pick the size and the number of textures to create for the mesh. You might think that simply picking the highest resolution and a high texture count would give the best results, but the quality of the textures is ultimately limited by the input pictures. Metashape is a very magical program, but one thing it does not do is use AI to enhance or content-aware fill parts of the textures.
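Here is a minimal sketch of the frame-selection step described earlier, assuming Python with OpenCV: extract candidate frames with ffmpeg at a higher rate than one per 3 seconds, score each frame's sharpness with the variance of the Laplacian (a common blur measure), and keep the sharpest frame per window. The paths, rates and window size are placeholder values, not our exact settings.

[code]
import glob
import os
import subprocess

import cv2  # OpenCV, used here for the blur measure

VIDEO = "al_legne.mp4"   # placeholder path to the source footage
FRAME_DIR = "frames"
WINDOW = 8               # candidate frames per kept frame

os.makedirs(FRAME_DIR, exist_ok=True)

# 1) Extract candidate frames. Instead of one frame every 3 seconds,
#    grab several per 3-second window so we can pick the sharpest one.
subprocess.run(
    ["ffmpeg", "-i", VIDEO, "-vf", "fps=8/3", f"{FRAME_DIR}/frame_%06d.jpg"],
    check=True,
)

def sharpness(path):
    """Variance of the Laplacian: higher means sharper, lower means blurrier."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    return cv2.Laplacian(gray, cv2.CV_64F).var()

# 2) In every window of WINDOW consecutive candidates, keep only the sharpest frame.
frames = sorted(glob.glob(f"{FRAME_DIR}/frame_*.jpg"))
selected = [max(frames[i:i + WINDOW], key=sharpness) for i in range(0, len(frames), WINDOW)]

print(f"Kept {len(selected)} frames out of {len(frames)} candidates")
[/code]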
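For completeness, the reconstruction steps above (align photos → depth maps → dense point cloud → mesh → decimate → UVs and textures) can also be driven from Metashape's Python scripting API (Professional edition) instead of clicking through the GUI. The sketch below is an outline under that assumption; the exact function and parameter names shift a bit between Metashape versions, so treat it as a rough map of the pipeline rather than a copy-paste recipe.

[code]
import glob

import Metashape  # Metashape Professional scripting API

photos = sorted(glob.glob("frames_selected/*.jpg"))  # the sharpest frames from the previous step

doc = Metashape.Document()
chunk = doc.addChunk()
chunk.addPhotos(photos)

# Step 1: detect features, match them across photos and align the cameras.
chunk.matchPhotos()
chunk.alignCameras()

# Step 2: depth maps and the dense point cloud.
chunk.buildDepthMaps()
chunk.buildDenseCloud()  # renamed buildPointCloud() in newer Metashape versions

# Step 3: mesh, i.e. triangulation in the "divide into triangles" sense.
chunk.buildModel()

# Keep the high-detail mesh around, we still need it for baking later...
chunk.exportModel("high_detail.obj")

# ...then decimate the working mesh to a still large but manageable count.
chunk.decimateModel(face_count=1_000_000)  # illustrative target, not a magic number

# Step 4: UVs and albedo textures generated from the source images.
chunk.buildUV()
chunk.buildTexture(texture_size=16384)

doc.save("route.psx")
[/code]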
Another thing to keep in mind: the higher the resolution and the number of textures you export, the longer it will take and the more system resources you will need. For our computer with 64GB of RAM the limit was around four 16k textures (there is a quick back-of-the-envelope calculation at the bottom of this post).

The model still looks kind of flat, though. That is because there is no normal map yet. We can easily get one by using the high-detail version of our mesh: this step bakes all of those details into the decimated mesh, so light reflects much closer to how it would off the original surface. After the normal map we also generate an occlusion map to get that really nice ambient occlusion depth right into the model.

[img]{STEAM_CLAN_IMAGE}/43535104/cfdfd294fa1c8ea52ce12d3137e32ab10ff8fd4b.gif[/img]

Now that we have the model, we can actually prepare it for use in New Heights. That is what the next part will be about. Having super detailed meshes is really amazing of course, but a good framerate is very much preferable to a slideshow in a game.
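A quick footnote on the texture budget mentioned above: even ignoring everything Metashape keeps in memory while generating the textures (which is a lot more), the raw, uncompressed pixel data of the final textures alone adds up quickly. A tiny back-of-the-envelope calculation, assuming 8-bit RGBA:

[code]
def raw_texture_megabytes(size, count, channels=4, bytes_per_channel=1):
    """Uncompressed size of `count` square textures of size x size pixels, in MiB."""
    return size * size * channels * bytes_per_channel * count / 1024 ** 2

# Four 16k textures: 16384 * 16384 * 4 bytes = 1 GiB each, 4 GiB in total,
# before any of Metashape's own working memory during texture generation is counted.
print(raw_texture_megabytes(16384, 4))  # 4096.0
[/code]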