LukasPJ Posted February 17, 2015
Timmy wrote: "Slightly off topic, but one of the biggest things I have seen that is really holding T3D back is just how CPU bound it is. Using T3D now on a decent size level, it quickly becomes very apparent how underutilized the GPU is. It needs multithreading."
I think Timmy is right, and I just wanted to start a discussion on which parts of the engine it'd make sense to multithread.
The ParticleSystem is one, which I can probably handle in a meaningful way :)
Any other ideas?
Azaezel Posted February 17, 2015
The renderinst and renderDelegate subsystem is basically a break-off of renderable data from a given overall class, submitted to the renderer subsystem, so there's bifurcation there that may bear looking into... definitely a real spiderweb at present though.
Timmy Posted February 17, 2015
Yeah, Az is definitely right, it is a real spiderweb right now. As it stands, network and sound are the only things that are multithreaded, and they both use the thread pool functionality already present in T3D.
The render system is the most obvious place to start. It doesn't need to actually support rendering API calls from multiple threads; the idea is to break down the work the render system does so it can be distributed among the thread pools. Things like traversing the scene graph, culling, etc.
It would be a massive undertaking, but very beneficial.
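Editor's sketch: the "split the culling work, keep the API calls on one thread" idea can be illustrated with a small fork-join example. Everything here is an assumption made for illustration: `Sphere`, `isVisible` and `cullParallel` are hypothetical names, the visibility test is a toy distance check, and plain `std::thread` stands in for T3D's thread pool.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a scene object's bounds; not a T3D type.
struct Sphere { float x, y, z, radius; };

// Toy visibility test: an object is "visible" if it reaches within
// maxDist of the origin. A real engine would test against a frustum.
inline bool isVisible(const Sphere& s, float maxDist) {
    const float reach = maxDist + s.radius;
    return s.x * s.x + s.y * s.y + s.z * s.z <= reach * reach;
}

// Splits the objects into contiguous chunks, culls each chunk on its own
// thread, then joins. Each worker writes only to its own result vector,
// so no locking is needed. Returns visible indices in ascending order.
std::vector<std::size_t> cullParallel(const std::vector<Sphere>& objects,
                                      float maxDist, unsigned workerCount) {
    assert(workerCount > 0);
    std::vector<std::vector<std::size_t>> partial(workerCount);
    std::vector<std::thread> workers;
    const std::size_t chunk = (objects.size() + workerCount - 1) / workerCount;

    for (unsigned w = 0; w < workerCount; ++w) {
        const std::size_t begin = std::min(objects.size(), w * chunk);
        const std::size_t end = std::min(objects.size(), begin + chunk);
        workers.emplace_back([&objects, &partial, maxDist, w, begin, end] {
            for (std::size_t i = begin; i < end; ++i)
                if (isVisible(objects[i], maxDist))
                    partial[w].push_back(i);
        });
    }
    for (auto& t : workers) t.join();

    // Merge on the calling thread; this is where draw submission would
    // happen, still single-threaded.
    std::vector<std::size_t> visible;
    for (const auto& p : partial)
        visible.insert(visible.end(), p.begin(), p.end());
    return visible;
}
```

The main thread calls `cullParallel` once per frame and then issues draw calls for the returned indices itself, so the graphics API is only ever touched from one thread.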
andrewmac Posted February 17, 2015
I think the easiest place to multithread that will have a noticeable impact and isn't a crazy project would be the asset loading. I started the research on it one day and it didn't look too bad. The key thing is creating some kind of state an asset can be in that's essentially "Loading". It won't be displayed but will still exist as an object in the world. Once that's in place, you can just offload the asset loading to another thread and then mark the asset as ready/loaded once it's done processing. This should alleviate the hangs you experience when you quickly rotate the camera and do other things that cause hiccups.
The man who multithreads that renderer deserves a lifetime supply of whiskey. He'll also need it for the PTSD he'll surely have after the project is complete. I've looked it up and down 3 or 4 times now and I keep coming to the same conclusion: it would make more sense to gut it, build it properly, and then go through all the existing code and update it to use a new threaded render system. It's what I started concluding when doing BGFX. In some cases you can fix a house by replacing one wall at a time, but I think in this case the house should be torn down and rebuilt. Just my two cents anyway.
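Editor's sketch: a minimal version of the "Loading" state described above might look like the following. The `Asset` class, its method names and the fake "disk read" are all hypothetical; this is not T3D's resource or asset API, just one way the state machine could be arranged with an atomic flag.

```cpp
#include <atomic>
#include <string>
#include <thread>
#include <utility>

enum class AssetState { Unloaded, Loading, Ready };

// Hypothetical asset that exists in the world immediately but only
// exposes its data once a background load has finished.
class Asset {
public:
    explicit Asset(std::string path) : mPath(std::move(path)) {}
    ~Asset() { waitUntilLoaded(); }

    // Call from the main thread. Only the first call starts a load.
    void beginLoad() {
        AssetState expected = AssetState::Unloaded;
        if (!mState.compare_exchange_strong(expected, AssetState::Loading))
            return;  // already loading or ready
        mLoader = std::thread([this] {
            mData = "contents of " + mPath;  // stand-in for disk I/O and parsing
            // Release pairs with the acquire in isReady(), so a reader
            // that sees Ready also sees the finished mData.
            mState.store(AssetState::Ready, std::memory_order_release);
        });
    }

    bool isReady() const {
        return mState.load(std::memory_order_acquire) == AssetState::Ready;
    }

    // The renderer would skip any asset for which this returns nullptr.
    const std::string* dataIfReady() const {
        return isReady() ? &mData : nullptr;
    }

    // Call from the main thread, e.g. behind a loading screen or at shutdown.
    void waitUntilLoaded() {
        if (mLoader.joinable()) mLoader.join();
    }

private:
    std::string mPath;
    std::string mData;
    std::atomic<AssetState> mState{AssetState::Unloaded};
    std::thread mLoader;
};
```

One thread per asset is only for brevity; a real implementation would more likely hand the load to a shared pool.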
LukasPJ Posted February 17, 2015
We could put up a bounty for a multithreaded rendering system!? :P (Trying to incentivize you guys to utilize bounties..)
JeffH Posted February 17, 2015
How about that particle system ;)
inb4 Lukas kills me.
Timmy Posted February 17, 2015
Even with multithreaded asset loading, I think you would still get the dreaded hangs/pauses, because even though disk I/O and any CPU-intensive operations can happen on another thread, things like sending the vertex/index buffers, uploading textures, compiling shaders, etc. still have to happen on the main thread, causing delays. D3D9Ex does support sharing resources though (https://msdn.microsoft.com/en-us/library/windows/desktop/bb219800(v=vs.85).aspx#Sharing_Resources), and OpenGL is capable of this too.
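Editor's sketch: since the GPU-facing step stays on the main thread, one common mitigation is to let loader threads queue "finish on the main thread" jobs and have the main thread run only a few of them per frame. This spreads the upload cost over several frames rather than removing it. `MainThreadQueue` and its count-based budget are assumptions for illustration, not something T3D provides under that name.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical queue of jobs that must run on the main thread, such as
// creating a vertex buffer from data a worker has already parsed.
class MainThreadQueue {
public:
    // Safe to call from any thread.
    void push(std::function<void()> job) {
        std::lock_guard<std::mutex> lock(mMutex);
        mJobs.push_back(std::move(job));
    }

    // Call once per frame from the main thread. Runs at most `budget`
    // jobs and returns how many actually ran.
    std::size_t drain(std::size_t budget) {
        std::size_t ran = 0;
        while (ran < budget) {
            std::function<void()> job;
            {
                std::lock_guard<std::mutex> lock(mMutex);
                if (mJobs.empty()) break;
                job = std::move(mJobs.front());
                mJobs.pop_front();
            }
            job();  // run outside the lock so workers can keep pushing
            ++ran;
        }
        return ran;
    }

private:
    std::mutex mMutex;
    std::deque<std::function<void()>> mJobs;
};
```

A time-based budget (stop after N milliseconds) would be a reasonable alternative to the job count used here.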
andrewmac Posted February 18, 2015
http://i.imgur.com/yj8MWI2.gif
Good point. I've got nothing on that one.
rlranft Posted February 18, 2015
Yeah, but you can't share the resource until it's loaded.... ;p
Timmy Posted February 18, 2015
"Lockable resources (textures with D3DUSAGE_DYNAMIC, vertex buffers and index buffers, for instance) can experience poor performance when shared. Lockable rendertargets will fail to be shared on some hardware."
That part doesn't sound very nice for resource sharing either.
Haladrin Posted February 18, 2015
Asynchronous resource allocation in DirectX is not done with shared resources between devices (a bad idea!), but by creating the device with the D3DCREATE_MULTITHREADED flag, which allows for thread-safe API calls. The downside in D3D9 is that the critical section is global over all API calls. D3D11 is the first API to separate resource allocation and render calls.
Edited February 18, 2015 by Haladrin
Timmy Posted February 18, 2015
"The downside in D3D9 is that the critical section is global over all API calls."
Yes, this is the problem.
MangoFusion Posted February 18, 2015
From my experience, multithreading well is pretty hard. There are so many things which can go wrong.
IMO any additional multithreading should meet the following criteria:
- Core systems should be thread-safe (e.g. the logging, which currently isn't; also console execution, which isn't 100% safe).
- Memory allocations should be kept to a minimum, otherwise you'll increase the chances of memory fragmentation (e.g. if one thread is allocating differently sized blocks to the other threads on a temporary basis).
- Mutexes should be kept to a minimum, otherwise you have to deal with the overhead of a mutex lock too often (e.g. the current scenario of having 2 mutexes per simset is a bit OTT).
- Any threaded operation should be as isolated as possible from the system as a whole (debugging random timing-related crashes because thread X accessed something in the main thread without a lock is no fun).
- Any threaded operation should be designed so that it can easily be cancelled upon shutdown.
- It should actually offer a performance advantage (threading just because it sounds cool is not good enough).
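Editor's sketch: two of the criteria above, isolation and easy cancellation on shutdown, can be shown with a worker that owns its own state and checks a stop flag between small units of work. `BackgroundCounter` is a made-up example class; the "work" is just incrementing a counter, and the one-millisecond sleep is a placeholder for a real work unit.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical isolated worker: it touches nothing outside this object,
// and the main thread talks to it only through atomics.
class BackgroundCounter {
public:
    BackgroundCounter() : mThread([this] { run(); }) {}
    ~BackgroundCounter() { stop(); }

    // Requests cancellation and waits for the thread to exit.
    // Safe to call more than once.
    void stop() {
        mStop.store(true);
        if (mThread.joinable()) mThread.join();
    }

    unsigned long ticks() const { return mTicks.load(); }

private:
    void run() {
        // Check the flag between short work units so shutdown is prompt.
        while (!mStop.load()) {
            mTicks.fetch_add(1);
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    std::atomic<bool> mStop{false};
    std::atomic<unsigned long> mTicks{0};
    std::thread mThread;  // declared last so the flags exist before the thread starts
};
```

Because the destructor calls `stop()`, the thread cannot outlive the object that owns its data.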
buckmaster Posted February 19, 2015
Would it be possible to load several COLLADA shapes in parallel, even if we couldn't load new resources in the background? It might speed up that initial level load time.
Skinning has always been mentioned as a candidate for parallelism, though @MangoFusion's GPU skinning may make that obsolete.
Timmy wrote: "it doesn't need to actually support rendering API calls from multiple threads but breaking down the work the render system does to be distributed amongst the thread pools"
This seems like it could be a good way to go for many types of problem: taking discrete pieces of an algorithm that involves a lot of computation, and parallelising it within a single discrete time slice in order to make it go faster. Not long-running systems which interact with the main thread over any length of time.
As always, Mango has it right. Definitely all good things to keep in mind.
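Editor's sketch: loading several shapes in parallel at level start could be arranged as below, assuming the parse step for one file is independent of the others and touches no shared engine state. `Shape`, `parseShape` and `loadShapesParallel` are hypothetical, and the "parsing" is a placeholder that derives a fake vertex count from the path length.

```cpp
#include <cstddef>
#include <future>
#include <string>
#include <vector>

// Hypothetical parsed result; not T3D's shape class.
struct Shape {
    std::string name;
    std::size_t vertexCount;
};

// Placeholder for parsing one COLLADA file. It must not touch shared
// engine state, otherwise running copies in parallel is not safe.
Shape parseShape(const std::string& path) {
    return Shape{path, path.size() * 100};
}

// Starts one asynchronous parse per path, then collects the results in
// the original order. The caller blocks until every shape is done.
std::vector<Shape> loadShapesParallel(const std::vector<std::string>& paths) {
    std::vector<std::future<Shape>> pending;
    pending.reserve(paths.size());
    for (const auto& p : paths)
        pending.push_back(std::async(std::launch::async, parseShape, p));

    std::vector<Shape> shapes;
    shapes.reserve(paths.size());
    for (auto& f : pending)
        shapes.push_back(f.get());
    return shapes;
}
```

This fits the "parallelise within one discrete time slice" pattern: the work fans out and is fully joined before the level load continues, so nothing long-running interacts with the main thread afterwards.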
Timmy Posted February 19, 2015
buckmaster wrote: "This seems like it could be a good way to go for many types of problem. Taking discrete pieces of an algorithm that involves a lot of computation, and parallelising it within a single discrete time slice in order to make it go faster. Not long-running systems which interact with the main thread over any length of time."
Yeah, this seems to be the direction most of the "big boys" have taken with their engines. Intel has a really great video from GDC a few years back explaining the concept; I'll post the link if I find it.
LukasPJ Posted February 19, 2015
Multithreaded particle system with 25,000 particles on screen vs. non-multithreaded: 31.25 MSPF vs. 33 MSPF. There is a change, but it's a small change :P That's only the simulation though, which is pretty light. I'll try and see if I can get more of the system to be multithreaded.
buckmaster Posted February 19, 2015
I quite like the idea of getting scene traversal to use a thread pool, though I think that would involve the container system (IIRC), which is nowhere near thread-safe. I did do some exploration in that direction when working on recast/walkabout, so I could multithread container queries to build geometry. It wasn't super pretty.
Timmy Posted September 25, 2015
Out of curiosity, has anyone tried threading the CPU skinning in T3D before? I know there is GPU skinning, but in T3D's current setup I often wonder if CPU skinning wouldn't be a great place to start.
buckmaster Posted October 6, 2015
I've heard rumours about someone trying that. I agree, I think it might be a decent idea, though IIRC @MangoFusion was close to getting GPU skinning working, so it might be moot.
It might honestly be easiest to just add more documentation about T3D's threading features and add more examples of their use (maybe in the navmesh/pathfinding code?) so that game devs can decide how they want to use parallelism for their own problems. Sounds like the Life is Feudal team were having problems with parallelising their game; I wonder if better docs and guidance around what parts of the engine can and should be subject to it would have helped.
The biggest, biggest thing is obviously the resource manager, but that'd be a huge amount of effort, I reckon.
Chelaru Posted October 6, 2015
I think we need multithreading. The documentation is good, but we still need multithreading. It's 2015.
JeffR Posted October 6, 2015
A ponderance:
One of the first things I'll PR for 3.9 is the Taml/Asset/Module stuff pulled from T2D.
Assets are interesting because they're auto-managed via references. If something references an asset, it does its initializing/loading work, and then cleans up when nothing references it any more.
So rather than trying to wade into the resource code itself, would it possibly make more sense to thread the asset load/unload step?
So when an asset is referenced for the first time, it'd do its setup/loading work in a secondary thread. It wouldn't be as low-level as redoing the entire resource system, but it seems like that'd help at least a little on load times and streaming.
Chelaru Posted October 6, 2015
That would be a nice start. I hope this will help the load time on the Pacific demo.