Consider which operations are involved in each case, and how extreme a case you are willing to consider.
Slowest: Context switch. The CPU must save the state of the current task and load the state of the new routine, which involves several changes to the CPU and can take many CPU cycles. http://en.wikipedia.org/wiki/Context_switch
Next: Read disk (HDD). Must go out to a peripheral device. The disk must spin to reach the portion to be read, and the data is buffered on its way to the CPU.
Next: Read from memory. RAM is faster than an HDD, but the data must still be read in.
Fastest: Read CPU register. This can be done within a single CPU cycle.
It is actually impossible to determine exactly where to place "context switch", since it depends on your processor and on how much state has to change. In the minimum case, it is generally faster to read one byte from the HDD or memory cache than to do a context switch where you store your state and load a new routine.
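If you want to see part of this ordering on your own machine, here is a rough sketch in Python that times a one-byte read from an in-memory buffer against a one-byte read from a file. The function names (`time_memory_read`, `time_file_read`, `compare`) and the 4096-byte temporary file are my own choices for illustration. Keep in mind the limits: the file read will most likely be served from the operating system's cache rather than from a spinning disk, interpreter overhead inflates both numbers, and the sketch does not measure a context switch or a register read at all. Treat the output as a relative hint, not a benchmark.

```python
import os
import tempfile
import time


def time_memory_read(buffer: bytes, repeats: int = 10_000) -> float:
    """Average nanoseconds to read one byte from an in-memory buffer."""
    start = time.perf_counter_ns()
    for _ in range(repeats):
        _ = buffer[0]
    return (time.perf_counter_ns() - start) / repeats


def time_file_read(path: str, repeats: int = 10_000) -> float:
    """Average nanoseconds to read one byte from a file via system calls."""
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter_ns()
        for _ in range(repeats):
            os.lseek(fd, 0, os.SEEK_SET)  # rewind so every read hits byte 0
            _ = os.read(fd, 1)
        return (time.perf_counter_ns() - start) / repeats
    finally:
        os.close(fd)


def compare() -> tuple[float, float]:
    """Return (memory_ns, file_ns) using a throwaway temporary file."""
    data = os.urandom(4096)  # assumption: one small block is enough to illustrate
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
        path = tmp.name
    try:
        return time_memory_read(data), time_file_read(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    mem_ns, file_ns = compare()
    print(f"memory read: {mem_ns:.0f} ns per byte")
    print(f"file read:   {file_ns:.0f} ns per byte")
```

The file read goes through the operating system on every iteration, so it should normally come out slower than the in-memory read even when no physical disk access happens. A read that really has to wait for the platter would be slower still.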