
Originally Posted by Simetrical
That's not really how it works. All eight threads are allotted time slices by the scheduler, which also decides which core to run each one on. It might be something like core 1 gets threads 1-2, core 2 gets threads 3-4, up to core 4 getting threads 7-8, distributed according to how busy each one is: if thread 1 is using a lot of CPU time, the scheduler might put it on its own core with relatively few other threads.
Then each thread gets a time slice. Assuming the simple thread distribution given above, and a time slice length of (say) 10 ms, we would have thread 1 getting 10 ms of CPU time; then thread 2 getting 10 ms of CPU time; then back to thread 1 again; and so on. On a single-core system, an eight-threaded application would work the same way, except that all eight threads would alternate on the same core. So thread 1 might get 10 ms, then thread 2, then thread 3, and so on until you get to thread 8 and back to thread 1 again, so each thread now gets an eighth of the machine's CPU time instead of half a core.
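The round-robin idea above is easy to sketch in code. This is a toy Python model, nothing more: the static thread-to-core assignment and fixed slice length are simplifications for illustration, not how any real scheduler behaves.

```python
from collections import deque

def simulate_round_robin(threads, num_cores, slice_ms, total_ms):
    """Toy round-robin scheduler: threads are assigned to cores statically,
    and each core cycles through its own run queue, handing out one fixed
    time slice per turn."""
    # Static assignment: thread i runs on core i % num_cores.
    queues = [deque() for _ in range(num_cores)]
    for i, t in enumerate(threads):
        queues[i % num_cores].append(t)

    cpu_time = {t: 0 for t in threads}
    for queue in queues:
        elapsed = 0
        while elapsed < total_ms and queue:
            t = queue.popleft()
            cpu_time[t] += slice_ms      # the thread runs for one slice...
            elapsed += slice_ms
            queue.append(t)              # ...then goes to the back of the queue
    return cpu_time

threads = [f"thread{i}" for i in range(1, 9)]
# Four cores, 10 ms slices, 40 ms of wall time: each thread shares a core
# with one other thread, so each accumulates 20 ms of CPU time.
print(simulate_round_robin(threads, 4, 10, 40))
# One core, 80 ms of wall time: all eight alternate, 10 ms each.
print(simulate_round_robin(threads, 1, 10, 80))
```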
In practice you'll have interrupts as well, which mean they won't necessarily use their whole time slice at once, and of course they all have to share with other programs that are running. Plus every once in a while the scheduler will tweak which cores get which tasks. But the basic idea holds.
As a side note, here's why this doesn't give a fourfold performance improvement, even though each thread is getting four times as much CPU time. First of all, threads won't always use their entire time slices, since they're partly I/O bound. That is, an AI thread has to figure out what your enemy should do next, but once it's figured that out, it has to tell the other threads and wait for them to finish, print the decided actions to output devices, and wait for player input before it can start processing again. (This is not the case in purely CPU-bound applications, such as something that calculates the digits of pi for five hours straight. That will do twice as well given twice the CPU power, although of course it might be difficult to multithread.)
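You can put the I/O-bound effect in back-of-the-envelope numbers. The 3 ms / 7 ms figures here are made up purely for illustration:

```python
def core_utilization(compute_ms, wait_ms, frames):
    """Toy model of a partly I/O-bound thread: each frame it computes for
    compute_ms, then blocks (waiting on other threads, output, or player
    input) for wait_ms, giving up its time slices entirely while blocked."""
    busy = compute_ms * frames
    total = (compute_ms + wait_ms) * frames
    return busy / total

# A hypothetical AI thread: 3 ms of thinking, then 7 ms of waiting, per frame.
# Its core sits idle 70% of the time, so extra CPU time buys little.
print(core_utilization(3, 7, 100))
# A purely CPU-bound task (digits of pi) never waits, so it scales with CPU.
print(core_utilization(10, 0, 100))
```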
Even to the extent that threads are purely CPU-bound, they probably aren't going to be equally so. Threads 3 and 6 might need a lot more CPU time than the other threads, say. They'll probably get cores mostly to themselves, which they might be using to 100% capacity, but the other two cores will finish their lighter threads early and then just sit waiting on them. That's why you get diminishing returns for adding more cores. The ideal multithreaded application would have a very large number of threads, each of which can run for a long time without waiting for the others, but in only a few cases is it possible to really scale CPU performance linearly with the number of cores. Threads almost always have to wait on each other's results at some point.
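This diminishing-returns effect has a standard formula, Amdahl's law: if a fraction p of the work parallelizes perfectly and the rest is serial (threads waiting on each other's results), the best possible speedup on n cores is 1 / ((1 - p) + p / n). A quick sketch, with an assumed p of 0.9:

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: upper bound on speedup when only parallel_fraction of
    the work can be spread across the given number of cores."""
    p = parallel_fraction
    return 1 / ((1 - p) + p / cores)

# Even with 90% of the work parallelizable, four cores give roughly 3.1x
# rather than 4x, and adding cores flattens out below 10x.
for n in (1, 2, 4, 8, 1000):
    print(n, round(amdahl_speedup(0.9, n), 2))
```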
Yes, pretty much. The whole idea of multithreaded code is to split up complex tasks into subtasks that are as close to independent as possible, but that's not always feasible.
That's certainly true.