A Cooperative Job System

This section assumes you are familiar with the difference between Multi Threading and Multi Processing.

Today’s computers have multiple processor cores, this can go into large numbers. With multi processor boards this can be as high as 64 cores on a single board at the time of this writing. It is clear that speed advancement can only be reached in today’s software by utilizing parallel processing. It is equally clear that powerful multi threading and multi processing elements must be part of any new language.

Today’s languages leave this mostly to external libraries, like C++ 11 delegates this to the STL. The only language I know of which has multi threading as part of the language is Ada. Ada does it pretty good but still on a low level which exists so that programmers can build their own multi threading universe on top of these elemental building blocks. They are better than what C++ offers but still relatively primitive.

What I’m proposing instead is a high level system based on our practical experience in a huge server application over the past years. This system has evolved over several stages until it reached today’s form and by doing so, we had to learn the hard way what does and doesn’t work with multithreading. In it’s latest stage it still has room for improvements, but it is already so powerful and practical that it has allowed us to rewrite our server system in a “building block” kind of way where the utilization of any number of processor cores is really easy.

Such a system should be part of the language itself, not an external library so that the compiler has control over all aspects especially when it comes to pointers (a topic for later). Having this all under the direct control of the language would eliminate the need for some OO stretches we had to make in order to implement this system in C++.

First, what does NOT work:

A) Having a HUGE number of Operating System Threads. Each thread uses up a lot of resources, mainly a pre-allocated stack of several MB (2MB in the Window standard setup), so that restricts the number of threads already. Now you can create several 100 Threads with that, but there are two problems: First if you have a server that must handle thousands and thousands of user requests like a game server, you still need more parallel entities than you can create even with the largest memory area and second, switching between these threads is very expensive. You create 100 threads and check the simple Windows performance tool in the Task Manager, you see the red line go up and up, which represents the system time your CPU spends or, in other words, time needed to switch between threads. Not to mention a 2MB stack per thread that is hardly used wastes most your memory for nothing in return. So this is not the answer

B) Creating threads as needed and destroying them when the task is finished. This is even worse, because you don’t know how many might be created at peak times, you may run out of memory plus creating and destroying an OS thread is EXTREMELY expensive. This this does not work at all.

What is the solution then?

Create a number of worker threads in relation with the number of processor cores and keep them alive. Then assign code to be executed to these threads. When one section of code finishes, another one is assigned (with the respective data).

In our lingo, these code/data objects are called a “Job”. A Job is basically a class derived from a baseclass that provides the functionality needed to carry out operations in parallel.

Here is where it gets interesting: Each code will eventually have to perform blocking operations, like sending a request to another thread or process and waiting for an answer. Doing this the naive way by blocking the whole thread, would lead to the problems described under A and B above. So the solution is, that the thread system (in the language) must “send data as a message” and then the code that called this “send” functionality must return control to the threading system. It must then be prepared to be called again in another state with the answer to it’s request.

This is a “cooperative Job system” because it’s the user code that relinquishes control when it realizes that it can not proceed without receiving more data from other sources. It is then prepared to react to this answer and proceed processing at another location in the code. There is no other way of doing this. Any non cooperative way would require the OS to move switch stack to the one of another function that should resume. With stacks being one cohesive area of data, this is basically not possible.

The simplest way to do this (and the way we do it) is by having an overloaded “int Run( int State)” function. The code in the Run() function consists of one big switch() statement. Each case terminates with a “return X” line where X is the state with which it wants the Run() function to be called when the answer arrives. The switch statement will then continue at “case X” when the answer arrives and voila, we have a cooperative Job system.

Of course there are many more intricate details which I won’t elaborate here and now, but suffice it to say that this system works beautifully and, when checking the processor load, we can see that all processor cores can be stretched to the full load with only very minimal calling of system code since there are no context switches necessary. All threads run constantly. A true time sharing system.

This is only half the system though. Our Job system is divided into two (actually now three) basic types of Jobs. One is called a “Server Job” and the other “Client Job”.

A server Job is a job that defines service requests and can be called from client Jobs. The server then wakes up when a request arrives, carries it out and returns to sleep. Server Job’s don’t have states in the above sense, because that would be detrimental to their responsibility of servicing many client Jobs with little wait time. Their task must be finished within one wake-up call, including sending an answer.

There are only few server Jobs and they are all what’s called a singleton. They exist only once in the system. A good example is a database Job that provides access to database entries and performs transactions on them to alter entries.

Client Jobs on the other hand are a dime a dozen. Whenever a user sends a request to our game server for instance, we create a Client Job that handles this request. The Client Job may even call other client Jobs to handle different aspects of the user request in parallel. These Child Jobs can be fire and forget or synchronized with the calling Client Job who then waits for the child Job to finish.

Finally we have recently been adding a type of “Hybrid Job” which is a Server Job that can run it’s own local Client Jobs, in case that a Server Job needs to call other Jobs and wait for an answer. It can’t really do that in the main loop, but uses it’s own local Client Jobs to do that.

All in all this system has been stress tested with 10s of thousands of Jobs in parallel. As long as one state of a Job does not occupy it’s thread for a lengthy period of time but rather performs short bursts of computation and then waits for an answer or new data, the reaction time is spectacular. We are using this system in our new game server now with great success. Not only does it allow us to use Multi Threading efficiently but it also makes it EASY to use since the programmer does not need to deal with low end constructs like signals and mutexes anymore. By creating our own database service based with this system we have also solved the problem of deadlocks when altering game objects.

Of course this system is currently Object Oriented and would have to be rewritten in order to work with a procedural programming paradigm but that should be easily possible.

In my opinion, any new language that is designed to use multi tasking must incorporate a system like this.

Advertisements