This year Object Paging was successfully implemented, a major visual milestone for OpenMW. This thread lays out some of the discussion on how it could evolve in the future.
First, to make myself clear: Object Paging is amazing. This topic, however, is entirely focused on how it can be improved in the future, and that means talking about the weaknesses that exist right now, whether due to paging itself or other factors. Mostly, this relates to scalability. At default settings, the current distant terrain system performs well with x5 cells, and still OK with x10 cells. For the traditional Morrowind user base, this is more than enough. But if you want a viewing distance comparable to Oblivion or Skyrim (about x40 Morrowind-sized cells), that's a different story. Initial loading time increases steeply, and fps drops well below what would be acceptable. At the moment, aside from increasing the object paging min size setting to reduce the number of objects being drawn, there's not much else that can be done by OpenMW itself.
The favored solution to this (voiced by psi29a and AnyOldName3) is for users to add NiLODNode to existing Morrowind objects, which swaps the high-detail mesh with a low-detail one at greater distances. To test this, I created an experimental plugin adding these to a variety of the common medium and large-sized objects across Morrowind. The initial results (1, 2), however, were mixed: while it dramatically reduced the triangle count in test scenes, the number of drawables (and hence the average framerate) was barely affected. On the face of it, it seemed that object paging had already done the best it could, with the introduction of simplified meshes a nearly superfluous extra step. Handy if you were GPU-limited, less so for any significant reduction in draw time. In addition, the use of NiLODNode does not address the issue of long load times.
The second limitation relates to user-authored LOD as well. If one is using a higher object paging min size to maintain decent performance at high viewing distance, they will see large sections of buildings or entire towns disappear - landmarks you'd want to make sure are visible from any distance. The trouble is that many of these structures are in fact a composite of smaller objects pieced together like Lego bricks, and object paging has no inkling of which objects should be treated as part of a cohesive whole. The proposed solution (by AnyOldName3) again involves NiLODNode, this time by placing a .nif featuring a consolidated mesh (whether it's a castle, village, etc.) as a distant LOD level, so that it vanishes when the player is close enough to load the real thing. Unfortunately, NiLODNode and object paging do not talk to each other in this way, and it's hard to see how the distant mesh could appear or vanish at the appropriate time.
I have my own ideas, in which a cell or any number of selected objects could be flagged in OpenMW-CS to be rendered regardless of the object paging min size, but so far this idea has been met with little enthusiasm.
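To make the flagging idea concrete, here is a minimal Python sketch (not OpenMW code). It assumes the paging system simply drops any object whose size falls below the min size threshold; `PagedObject`, `always_render` and `select_for_paging` are hypothetical names made up for illustration, and the real size test in OpenMW may well differ.

```python
from dataclasses import dataclass


@dataclass
class PagedObject:
    name: str
    size: float                   # assumed: a size measure compared to the threshold
    always_render: bool = False   # hypothetical flag that would be set in OpenMW-CS


def select_for_paging(objects, min_size):
    """Keep objects that pass the size threshold or carry the flag."""
    return [o for o in objects if o.always_render or o.size >= min_size]


objects = [
    PagedObject("castle_wall_piece", 0.02, always_render=True),
    PagedObject("barrel", 0.01),
    PagedObject("stronghold", 0.30),
]
kept = [o.name for o in select_for_paging(objects, min_size=0.05)]
print(kept)
```

The small wall piece survives the cut only because it is flagged, which is the behaviour a composite landmark would need.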
That being said, what follows is a valuable pool of insights, collated from Discord (with minor edits for readability), distilling a number of developers' thoughts on how they see distant terrain and where it could go in the future.
Bold text is myself unless otherwise noted.
How have others fared with this NiLODNode test plugin?
Greatness7 wrote:(i7 8700k / GTX 1080 Ti)
It's a pretty big improvement on my end. The triangle count reduction is huge.
vtastek wrote:Turning on wireframe in Oblivion, you see how low poly everything is.
Try it in Morrowind, wireframes look solid.
Q. 20,000,000 triangles does seem a bit much.
https://www.g-truc.net/post-0662.html
The performance impact is something.
What about generating separate, bespoke meshes to be used for distant objects?
Greatness7 wrote:In MGE XE, Distant Land allows the use of custom _dist.nif assets, which will be rendered when an object is outside of the active cells grid.
akortunov wrote:It would eliminate the overhead from NiLODNodes for everything, but psi29a is against it. He does not like the idea of having separate meshes for the active grid and distant objects.
AnyOldName3 wrote:Because it means we have two ways of doing the exact same thing, and it doesn't eliminate any overhead.
Whether the switch is expressed within the file or by splitting the file doesn't affect the rendering performance in any way, it's how we represent the data once it's already in memory that makes a difference.
akortunov wrote:_dist meshes technically are not LODs at all, they are just separate meshes for different objects, not related to the main scene at all. There is no need for dynamic level switching. I suppose there are reasons why Bethesda did not use NiLODNodes widely in their assets, even though NetImmerse has supported this feature for ages.
Greatness7 wrote:Because their view distance was like 6ft.
"Because it means we have two ways of doing the exact same thing, and it doesn't eliminate any overhead."
I don't have any strong opinion either way. But I don't think this statement is totally fair. The _dist overrides do eliminate some overhead.
1. It obviously removes the need to poll distance for LOD state. Instead you only need to check if it's in an active cell or not. Since all the _dist objects are static in this way, it is potentially much more friendly to batching/paging.
2. It can make load times faster and reduce memory usage. E.g. say you just loaded the game in Balmora, but you have distant statics set to render 30 adjacent cells. If not using _dist overrides you must fully load all meshes used in those adjacent cells. Including potentially very large animations, various LOD levels, tons of unique textures, etc. Much of this data would not be used at all since only the farthest LOD is likely to be applicable, animations shouldn't play so far away, etc.
Also there are some other implications that make LOD distinct from _dist overrides. E.g. imagine standing at the top of a very deep pit looking down. The object may be very far away, but it's still in the active cell so it would not use a _dist override.
AnyOldName3 wrote:The reason why we don't need to treat LOD nodes particularly differently to _dist meshes is that there's basically never going to be any gameplay reason why the LOD nodes need to be super-precise. Object paging is already forming different versions of the same page depending on distance and viewing direction (either that or there's some framework there so it can easily be changed to do so), so we only need to ask the loader which mesh variant gets used at the distance the page is being generated for, and the LOD checking overhead gets baked out. We would end up with the LOD level changing based on cell transitions or whatever triggers pages to be paged rather than transitioning exactly when the LOD node changes.
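As an illustration of "baking out" the LOD check, here is a small Python sketch. It is not how OpenMW's paging code is written; `LodLevel`, `pick_level` and `build_page` are hypothetical names, and the distances are arbitrary. The point is only that the level is chosen once, when the page is built, instead of being polled per object every frame.

```python
from dataclasses import dataclass


@dataclass
class LodLevel:
    min_dist: float
    max_dist: float
    mesh: str


def pick_level(levels, distance):
    """Return the mesh whose range contains distance; fall back to the farthest."""
    for level in levels:
        if level.min_dist <= distance < level.max_dist:
            return level.mesh
    return levels[-1].mesh


def build_page(objects, page_distance):
    """Resolve every object's LOD once, at page build time."""
    return [pick_level(levels, page_distance) for levels in objects]


tree = [
    LodLevel(0, 8192, "tree_high"),
    LodLevel(8192, float("inf"), "tree_low"),
]
near_page = build_page([tree, tree], page_distance=4096)
far_page = build_page([tree, tree], page_distance=40000)
```

With this approach the LOD only changes when a page is rebuilt, which matches the trade-off described above: transitions follow paging events rather than the exact switch distance.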
As for the overhead of loading a whole Nif when only the lower LOD levels are needed, I agree that it's an irritation and waste of time right now. However, I don't see it as any different to keeping low mip levels in the same DDS as the full-quality ones. The solution is a streaming-friendly archive format like Fallout 4's BA2 archives which are somewhat content-aware. This would mean being able to load just part of a file, and also shouldn't have the overhead of dealing with multiple file handles, which is definitely a cause of performance problems in the vanilla engines. Uncompressed BSAs/BA2s will always be faster to load than loose files in vanilla, and if you've got a slow drive, compressed ones can be faster still.
Would the current very long (initial) load times go away with these changes to the paging system and addition of BA2?
Greatness7 wrote:Only after we ditch nif format. They are made in such a way that you can only really read them linearly from start to end; not ideal. Which I'm sure is part of the reason why MGE uses some baked DX format for its distant objects rather than nifs. (the _dist get converted to this during distant land generation). How high you set the draw distance seems to have little-to-no impact on MGE load times.
Q. Skyrim LOD is in nif format and loads pretty quick; quicker than OpenMW at least, despite being nif. Is it really a big deal?
Probably not. Only if you want to do what AON was suggesting and jump to specific parts of a file (e.g. the farthest LOD level). In theory nif format is flexible and you could bake the LOD block offset information when exporting, but that is not so pretty.
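To show what "baking the LOD block offset information" could look like in principle, here is a Python sketch using a made-up container layout. This is not the nif format and not anything OpenMW or an exporter does today; `write_lod_file` and `read_lod_level` are hypothetical helpers. The file starts with a small table of (offset, length) pairs so a reader can seek straight to one level.

```python
import io
import struct


def write_lod_file(stream, levels):
    """Write a table of (offset, length) pairs, then the level payloads."""
    header_size = 4 + 8 * len(levels)
    stream.write(struct.pack("<I", len(levels)))
    offset = header_size
    for payload in levels:
        stream.write(struct.pack("<II", offset, len(payload)))
        offset += len(payload)
    for payload in levels:
        stream.write(payload)


def read_lod_level(stream, index):
    """Read one level by seeking to it, without reading the other payloads."""
    stream.seek(0)
    (count,) = struct.unpack("<I", stream.read(4))
    if not 0 <= index < count:
        raise IndexError(index)
    stream.seek(4 + 8 * index)
    offset, length = struct.unpack("<II", stream.read(8))
    stream.seek(offset)
    return stream.read(length)


buf = io.BytesIO()
write_lod_file(buf, [b"HIGH-DETAIL" * 100, b"MEDIUM" * 10, b"LOW"])
farthest = read_lod_level(buf, 2)
```

Only the header and the requested payload are touched, which is the property a linear start-to-end format lacks.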
Q. The only real alternative to nif that might be on the table for OpenMW atm is collada support, is that suitable?
Never looked into it. But I think such a streaming-friendly format would have to be designed with that goal in mind from the start, so if collada is that then it would be advertised as one of the format's features.
Does OpenMW still potentially have a few tricks up its sleeve to read only the relevant LOD part of the .nif and remove overhead?
akortunov wrote:I suppose not. When the NIF reader de-serializes a NIF file, it caches this file, and it does not know if this file will be used for the active grid (so we need to read all levels) or for distant objects.
I see two alternatives:
1. Add a NiStringExtraData to LOD nodes to tell the engine that these LODs are not LODs.
In this case we will need to de-serialize such NiLODNodes to our own object instead of osg::LOD. This object should not support switching at all; it should always display only the first LOD level, but the paging system should still be able to read all levels from this object. It should reduce performance overhead, but the scene graph for the active grid will still be bloated with data that only the paging system needs. We could probably remove the unused data via a visitor when we place objects in the scene, but in this case paging will not be able to use the objects' scene nodes and will need to create separate mesh instances with full data.
2. Split data sources for "normal" objects and pages, so we do not need to use NiLODNodes at all. We will probably have to do it eventually anyway if we are going to support later Bethesda titles, which use pre-generated pages for Distant Land.
In any case, the system currently proposed by psi29a (tell modmakers to just put NiLODNodes everywhere) is not acceptable.
Q. Incidentally, why do you think NiLODNode solution is unacceptable?
Because we will need to re-work it, and all created assets would have to be remade in that case. Also keep in mind that there are a lot of existing replacers. In theory, LODs are needed for most tree, rock, ice chunk and building meshes (most of the meshes/x folder and part of the meshes/f folder). Plus tiled meshes (such as lava or ice on water) should be rendered regardless of the size threshold.
Q. You've played with osg:simplify and billboard before; what if those generated results could be saved to disk for OpenMW to use?
We have a NIF loader, but have no NIF writer.
Ezze: Why not use the native OpenSceneGraph format?
Because other devs do not like the idea of having separate meshes for the active grid and distant land.
In theory, there are osgDB::readNodeFile/osgDB::writeNodeFile methods to read/write subgraphs to files, but I do not know which nodes they support.
Q. Well, if they are generated in-engine and preserve the dynamism of OpenMW approach, I don't think separate meshes are out of the question.
With separate meshes, the LOD generator should be relatively simple:
1. Iterate over the VFS and make a list of meshes which need LODs.
2. Load every such mesh as a scene node in a loop.
3. Apply some kind of simplification algorithm to that node.
4. Save the result as a separate file via osgDB::writeNodeFile to a separate folder. If the initial mesh is named X.nif, the LOD mesh will be named something like X.dist.
On the OpenMW side, the user just plugs in this folder like any other mod, and the paging system can load these files via osgDB::readNodeFile (with caching).
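The four steps above can be sketched in Python as a plain file-walking loop. This is only an outline of the control flow under stated assumptions: a real tool would load and save through something like osgDB::readNodeFile/osgDB::writeNodeFile, whereas here `load`, `simplify` and `save` are stand-in callables, `needs_lod` is a placeholder selection rule, and `generate_dist_meshes` and the file names are hypothetical.

```python
from pathlib import Path
import tempfile


def generate_dist_meshes(mesh_root, out_root, needs_lod, load, simplify, save):
    """Walk mesh_root, simplify each selected mesh, and write X.dist files."""
    mesh_root, out_root = Path(mesh_root), Path(out_root)
    written = []
    for src in sorted(mesh_root.rglob("*.nif")):   # step 1: list candidate meshes
        if not needs_lod(src):
            continue
        node = load(src)                            # step 2: load the mesh
        low = simplify(node)                        # step 3: simplify it
        dst = out_root / src.relative_to(mesh_root).with_suffix(".dist")
        dst.parent.mkdir(parents=True, exist_ok=True)
        save(low, dst)                              # step 4: save to a separate folder
        written.append(dst)
    return written


# Stand-in callables so the sketch runs without any mesh library.
with tempfile.TemporaryDirectory() as tmp:
    meshes = Path(tmp) / "meshes"
    (meshes / "x").mkdir(parents=True)
    (meshes / "x" / "tree.nif").write_text("0123456789")
    (meshes / "x" / "cup.nif").write_text("abc")
    out = Path(tmp) / "dist"
    written = generate_dist_meshes(
        meshes, out,
        needs_lod=lambda p: p.stat().st_size >= 5,   # placeholder selection rule
        load=lambda p: p.read_text(),
        simplify=lambda data: data[::2],             # placeholder, not real decimation
        save=lambda data, p: p.write_text(data),
    )
    result = [(p.relative_to(out).as_posix(), p.read_text()) for p in written]
```

The output folder mirrors the source layout, so it could be plugged in as a separate data folder in the way described above.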
Q. What if the user subsequently installs a mesh replacer?
The same as in MGE: the user needs to generate LODs for the new replacer (if the mod author did not do that).
Ezze: With some tricks reading the file dates, it is probably possible to detect the need at runtime.
No, it is a bad decision to put LOD generation in the OpenMW executable itself. It is a content creation task, not a runtime one.
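For reference, the file-date check Ezze mentions is simple enough to sketch, and it could just as well live in an offline generator tool as in the engine. The Python below is an assumption-laden illustration: `is_stale` is a hypothetical helper and the file names are made up. Note that timestamps alone would miss a replacer whose files carry an older date than the generated LOD.

```python
import os
import tempfile
from pathlib import Path


def is_stale(source, dist):
    """True when the .dist file is missing or older than its source mesh."""
    source, dist = Path(source), Path(dist)
    if not dist.exists():
        return True
    return dist.stat().st_mtime < source.stat().st_mtime


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "tree.nif"
    dst = Path(tmp) / "tree.dist"
    src.write_text("mesh")
    missing = is_stale(src, dst)      # no LOD generated yet
    dst.write_text("lod")
    os.utime(src, (1000, 1000))       # pretend the source is old
    os.utime(dst, (2000, 2000))       # and the LOD was generated later
    fresh = is_stale(src, dst)
    os.utime(src, (3000, 3000))       # a replacer overwrites the source
    outdated = is_stale(src, dst)
```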
Q. What's the advantage of creating meshes as NiLODNode within the existing .nif files instead of just making a new nif with sensible names like _lod, _lod_2, etc. in order of mesh detail? It would be loads simpler for whoever has to make them.
AnyOldName3 wrote:Other file formats meant for use with game engines usually have some analogue of a NiLODNode, so it's not likely to be a problem that comes up when people stop using Nifs.
As for reasons why we'd rather not use separate files, there are a few. The two that I can immediately remember are that:
* It becomes easier to shoot yourself in the foot with mods if, for example, you have a replacer with LoD meshes that mostly keeps to the vanilla look, and another without LoD meshes that makes all trees bright orange. Why does the tree appear green when you first see it? Why does the tree turn orange when you get close? Who knows? Not your first-time modder.
* Having lots of file handles is slow and bad. Seeking around within a file is much, much faster.
Q. Is it really? I thought the NiLODNode overhead was significant. I'm not seeing the draw reduction I thought I'd get by LODplaning and atlasing everything.
The NiLODNode overhead was supposed to be swallowed up by object paging, but it's obviously not working like that yet.
Any other interesting tech that could help?
vtastek wrote:Q. I was really hoping the LOD would have a big impact on amount of draw calls, but they hardly changed it at all.
There is something called mesh cluster rendering. It is probably the same idea of object paging but it can handle the frustum too.
AnyOldName3 wrote:So you just stick bounding boxes for index ranges in a buffer and do indirect rendering via a compute shader?
Ooooh, cluster depth sorting...
That could be fun. Front-to-back for opaques, and we'd be able to batch the 32 two-triangle leaf meshes in each tree without alpha issues.
Although I'm maybe waiting to see how linked-list-based order-independent transparency ends up working once it's matured.
Q. Is such a thing compatible with the existing paging solution?
I wouldn't say it's particularly incompatible. However, once we get into compute shader land, we're potentially only fixing problems for the machines with the fewest problems already.
It's good to have performance scale well, so running OpenMW on more powerful hardware gives a higher framerate, or allows higher settings with the same framerate, but it's also important that we avoid upsetting people who see a screenshot taken on an RTX 3090 at 60 FPS and think they should have something comparable on a much lower-end machine.
vtastek wrote:You guys can always lock a legacy renderer and make that forward-scaling 3090 60 fps Morrowind. The life of this game is locked in until at least 2090.
Greatness7 wrote:Have any OpenMW devs ever looked into https://www.khronos.org/opengl/wiki/Array_Texture? Just reading about these makes it seem like they make atlas textures obsolete, and would work automatically with MW assets. Ideally things like the vines/branches could be the same drawcall using texture arrays.
Q. That's what Star Citizen uses iirc, to render their high detail characters/outfits in 1 draw call.
AFAIK texture array tech is pretty old, so I assume theirs has some innovation. Sounds great tho.
AnyOldName3 wrote:They're not even available with the GLSL version OpenMW uses right now, but eventually we're going to have to stop caring about that and bump the version anyway. They could be a tool in our arsenal if we add dynamic atlasing to the object paging system, but I'd expect if they were faster than using a single huge atlas texture, they'd get used instead of atlases in more engines as they're conceptually simpler.
Q. Atlases don't tile, for one thing. Otherwise you could alter a much greater swathe of existing Morrowind assets to cost just one draw call.
Atlases can tile, it's just you have to implement texture wrapping in the shader instead of relying on the GPU's fixed-function hardware. It's one of the reasons texture arrays are conceptually simpler.
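The wrapping arithmetic a shader would need can be written out in a few lines. The Python sketch below mirrors what a fragment shader would do with `fract()`; `atlas_uv` and the rectangle values are hypothetical, and a real implementation would also have to deal with filtering and mipmap bleeding at tile edges, which this ignores.

```python
import math


def atlas_uv(u, v, rect):
    """Map a tiling UV into a sub-rectangle of an atlas.

    rect = (x, y, width, height) in atlas UV space (0..1).
    """
    x, y, w, h = rect
    fu = u - math.floor(u)   # same as GLSL fract(): wrap into [0, 1)
    fv = v - math.floor(v)
    return (x + fu * w, y + fv * h)


rect = (0.5, 0.0, 0.25, 0.25)    # hypothetical tile position in the atlas
a = atlas_uv(0.5, 0.5, rect)     # inside the first repeat
b = atlas_uv(2.5, -0.5, rect)    # two repeats over, one repeat back
```

Both lookups land on the same texel of the tile, which is the repeat behaviour the fixed-function wrap mode would otherwise provide for a standalone texture.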