Question:
Question about CPU registers...?
Daniel
2010-01-26 23:07:01 UTC
I've just become sort of "aware" of how the 'hot' CPU registers on an x86-64 processor work. I kinda had to, since I had to write a CD/DVD-ROM bootloader to meet the ISO 9660 "El Torito" spec for my WIP operating system. I've been programming in C, C++ and C# for a long time, and have extensively made use of multi-threading - really taking it for granted. Now I'm realizing how exactly all of my code boils down to ASM, bit by bit.

Now I'm a little perplexed thinking about it. The bootloader I've written is about 1000 lines of Intel instructions, but it's 16-bit "RealMode", synchronous code (just the way they boot and read the MBR). I had to be VERY careful what I stored in the registers in between interrupts and my different functions. It really hit me; what in the world goes on in multi-threaded code with the registers!?

You see, my code might boil down to:

mov eax, some_value

...on Thread 0x0.

But by chance, Thread 0x1 might do this at the exact same time:

mov eax, other_value

...is that even possible? Mind you, there are many more registers in PMode, especially on a multi-core chip. But I'm curious as to how that is ever avoided. Something might be depending on the value in eax right after this occurs! Is this all avoided by the virtual memory capabilities in PMode? I'm just wondering what safeguards the processor from nasty things like this.

Note: If you honestly don't know, don't even post. Pretty please! :)
Four answers:
Cubbi
2010-01-27 13:40:51 UTC
First of all, if you have truly *exact same time* execution of thread 0 and thread 1, they are necessarily executing on different CPU cores, so the eax's are different registers. Thread 0 moved some_value into CPU 0's eax, and thread 1 moved other_value into CPU 1's eax. Subsequent instructions in Thread 0 will see eax holding some_value; subsequent instructions in Thread 1 will see eax holding other_value.



If both threads are scheduled on the same CPU, the two movs cannot execute simultaneously. After Thread 0's mov executes, the OS will eventually perform a task switch: it reads every CPU register and stores them in the TSS (task state segment) or an equivalent structure, then loads all CPU registers from Thread 1's saved state and lets Thread 1 run, at which point your second mov executes.



Whatever depends on the value of eax in Thread 0 will see eax containing some_value after the OS reloads the CPU registers from Thread 0's TSS on the next task switch into thread 0.



If you want to write your own thread scheduler in assembly language, it's a good and instructive task, I recommend it.

Set up a table of threads, with room for all CPU registers, a few kilobytes for a copy of each thread's stack, and a variable holding its state (Running or Not Running, for a simple example). Then set up a timer interrupt handler. On entry, save all registers and the stack into the memory structure for the thread that is currently Running. Change that thread to Not Running, mark the next thread Running, copy in its saved stack, load its saved registers, modify the return address on the stack to point to the eip of the new Running thread, and return from the interrupt.
2016-05-26 20:24:31 UTC
The registers are a fundamental part of the CPU. There are typically 16 or 32 of them, depending on the architecture. Each one holds a single number (32 bits or 4 bytes on a 32-bit CPU, 64 bits on a 64-bit CPU). The instructions the CPU actually runs are things like "add register 5 to register 7" or "store register 15 at memory location 62041". Some registers have a special function (e.g. holding the memory address of the next instruction to execute); others are general purpose and can be used for anything. There is no time penalty for accessing a register: they are all available instantly to the CPU, making them the fastest possible place to store something.

Cache is a type of memory. The idea is that main memory is large but slow. However, the computer normally isn't using all of that memory; it's using a small subsection of it, and will often use the same memory location several times in a row. A cache is a smaller, faster memory that holds a copy of a section of main memory. When the CPU requests data from memory, the system first checks the cache, and if that address is in the cache it uses that copy. Only if the address isn't in the cache does the CPU stall and wait while a copy is fetched from main memory. This is the basic idea behind the CPU cache, the cache on a hard drive (where the "main memory" is the spinning disc), and every other kind of cache used in electronics; they pop up everywhere. So a cache can store anything that main memory can store.

A modern CPU has several caches, called L1 (level 1), L2 (level 2) and sometimes L3. These can be split into separate instruction and data caches (two parallel caches, divided by what the memory is used for) or be unified. The L1 cache is generally very fast, only 1 or 2 CPU clock cycles to access (so still slower than a register but a LOT faster than main memory), and is typically only a few tens of kilobytes. L2 cache is a lot larger (hundreds of kilobytes to a few megabytes) but also a lot slower than L1. Similarly, L3 cache is even bigger and slower.

The number of registers is fixed by the CPU's instruction set (e.g. every Intel x86 CPU, from the 80386 made in the mid 1980s to a brand-new Core i7, has the same number of registers in 32-bit mode, though newer ones running in 64-bit mode have twice as many available — that's why 64-bit applications run a little faster; most of the time it's nothing to do with the 64-bit arithmetic, it's almost entirely the doubled register count). Over that same time, however, the size of the CPU cache has increased by a factor of thousands. This is because the number of registers and how they are used is a fundamental part of the instruction set, and changing it would break any programs written for the old design. The cache, on the other hand, is part of the memory subsystem and is hidden from the software running on the CPU, so it can grow as the process used to build CPUs improves.
Kardinal
2010-01-26 23:33:29 UTC
The core of the operating system is basically a scheduler that assigns processor time to different threads. This scheduling can be done in many ways, but for a simple example, suppose it gives each thread a fixed amount of time to run. After this time expires (triggered by an interrupt), the operating system performs a "context switch": it saves all the data the expiring thread had at the moment of the interrupt (including, of course, the registers) and loads the data of the next thread to run.

Take into account that I used the word thread in the previous example, but it also applies to processes.
Kasey C
2010-01-26 23:15:11 UTC
Let me describe this then...



In a multithreaded environment, there are multiple "levels" of program privilege. The OS kernel runs at "ring 0":



http://en.wikipedia.org/wiki/Ring_%28computer_security%29



So the kernel will periodically switch among the programs based on CPU cycles and stuff like that. That's why "thread-safe" programs are written very carefully, so each one can be shunted off to the stack temporarily, then resumed as if nothing happened.


This content was originally posted on Y! Answers, a Q&A website that shut down in 2021.