ModLab is a free tool that can generate and fine-tune normal maps in a real-time rendering environment.
Alright, so on to the other part of this update: optimization.
First of all I need a reasonable goal. Processing a 4K x 4K texture on a four-year-old mid-tier GPU at 60 fps seems reasonable, so let's aim for that. To make it fun, let's include the latest changes with the most performance-hungry settings (using the largest possible kernels for small and medium elements).
So, initial test: load a texture, apply the Generate From Diffuse set, set the highest radius for both elements, use the most complicated kernel, aaaaand we are at 17 fps. Alright, so I need to make this roughly 3.5 times faster. In rendering terms that is kinda scary, especially considering that I was trying to make this efficient in the first place, and I am not going to compromise quality in any way.
First of all, let's do some profiling to see what's most expensive. I have a pretty good idea already, but it's good to confirm and see exactly how expensive it is. What we have learned at this point is that VS graphics debugging sucks. The moment I am under 60 fps (i.e. exactly the point where you want to profile), I am unable to capture a frame and the whole debugging process glitches. Well, it's not like I am mad, I am just furious. OK then, let's profile a smaller texture and just use the ratios between the individual processing steps. Now I can capture a frame and get the quick analysis, but the extended one fails with "An error has occurred" - thanks for the info, I almost didn't notice. *flips table* Alright, let's roll with Nsight instead. Luckily for me, Nsight works like a charm.

Profiling confirmed what I expected. While at it, I noticed a minor Color Surfaces mismatch, and fixing it gained another fps. In other words, we are at roughly 53 ms per frame, and I would need to get to approximately 16.7 ms per frame for 60 fps. The current worst offenders are the recently added processing steps, and for those there is no further optimization possible as such: they are simply blurs with large kernels (remember when I said that I won't compromise quality? So no, I am not going to undersample these, and the blur shaders used here already take advantage of linear interpolation to cut down the number of texture fetches).

The most reasonable option at this point is to run these steps only when required, i.e. on value change. It is the obvious approach, really, which I had been avoiding for a few minor reasons, but I guess it's time. TL;DR version of what's going to happen: there are several parallel processing branches running at the same time, some merging at certain points, all contributing to the final result. The goal is to update the minimum portion of this "tree" system when an input changes. After these changes we are suddenly at 46-47 fps, with a very minor drop (3-4 fps) when moving sliders, and everything still responds in real time. 13 fps / 5 ms to go, so let's give the same treatment to the bilateral processing.
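For the curious, the linear interpolation trick mentioned above is a standard one: the GPU's bilinear filter can blend two adjacent texels in a single fetch, so two Gaussian taps can be merged into one sample placed between them. ModLab's actual shaders aren't shown here, so this is just a minimal sketch of the tap precomputation; the names (Tap, linearGaussianTaps) are made up for illustration:

[code]
#include <cmath>
#include <vector>

// One merged sample: where to fetch (in texels from center) and its weight.
struct Tap { float offset; float weight; };

// Merge pairs of adjacent Gaussian taps into single bilinear fetches.
// For texels at offsets i and i+1 with weights w0 and w1, fetching at
// offset (i*w0 + (i+1)*w1) / (w0 + w1) with weight w0 + w1 lets the
// texture unit's linear filter do the blend, roughly halving fetch count.
std::vector<Tap> linearGaussianTaps(int radius, float sigma) {
    std::vector<float> w(radius + 1);
    float sum = 0.0f;
    for (int i = 0; i <= radius; ++i) {
        w[i] = std::exp(-0.5f * (i * i) / (sigma * sigma));
        sum += (i == 0) ? w[i] : 2.0f * w[i];   // mirrored taps count twice
    }
    for (float& x : w) x /= sum;                // normalize to unit weight

    std::vector<Tap> taps;
    taps.push_back({0.0f, w[0]});               // center tap stays as-is
    for (int i = 1; i <= radius; i += 2) {
        float w0 = w[i];
        float w1 = (i + 1 <= radius) ? w[i + 1] : 0.0f;
        float merged = w0 + w1;
        taps.push_back({(i * w0 + (i + 1) * w1) / merged, merged});
    }
    return taps;                                // apply mirrored on both sides
}
[/code]

In the shader, each returned tap is then sampled once on each side of the center texel along the blur direction, which is why a large kernel costs only about half as many fetches as its radius would suggest.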
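And here is roughly how an update-on-change "tree" like the one described above could look. Again, this is a hedged sketch, not ModLab's actual code; Node, markDirty and evaluate are hypothetical names, and the real system merges branches rather than just chaining them:

[code]
#include <functional>
#include <vector>

// Each node is one processing step; edges point from inputs to consumers.
struct Node {
    std::function<void()> process;   // e.g. dispatch of a blur pass
    std::vector<Node*> consumers;    // nodes that read this node's output
    bool dirty = true;               // everything runs on the first frame
};

// A slider change marks its node dirty; the flag propagates downstream,
// so only the affected branch of the tree gets reprocessed.
void markDirty(Node& n) {
    if (n.dirty) return;             // already marked, stop the walk early
    n.dirty = true;
    for (Node* c : n.consumers) markDirty(*c);
}

// Per frame: rerun only dirty nodes, visited in dependency order.
void evaluate(std::vector<Node*>& topoOrder) {
    for (Node* n : topoOrder) {
        if (!n->dirty) continue;     // reuse cached result from last frame
        n->process();
        n->dirty = false;
    }
}
[/code]

The payoff is exactly what the numbers above show: an untouched branch costs nothing per frame, and moving a slider only reruns the steps downstream of it.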
[url=https://steamcdn-a.akamaihd.net/steamcommunity/public/images/clans/31777225/01160601c1c84ff54ac31b79d076c49f3b5a17e1.png][img]https://steamcdn-a.akamaihd.net/steamcommunity/public/images/clans/31777225/65cb0842f7d782ae2680dcd26b18763c0e9bc7fc.png[/img][/url]
The changes above resulted in an improvement from 17 to 56 fps, with a worst-case drop to 49 fps for a split second (realistically unnoticeable). I've also tested some 8K x 2.5K textures and performance was still very reasonable. This concludes the optimization pass for patch 1.1. The main feature of the patch remains the changes regarding large and medium elements mentioned in the previous post: https://steamcommunity.com/games/768970/announcements/detail/1671276544921786698
So when is v1.1 going to be released? Probably during next week, but no promises; I want to clean up a few things, finish slider value normalization, and add a few other QoL improvements (scroll wheel support for the left side of the GUI incoming).
As always, I would like to thank all Patrons; you are the reason why I am working on this patch.
Patreon: https://www.patreon.com/user?u=7785848