Question:
A fair number of C programming questions on functions (sorry, a lot of questions)?
2013-09-05 03:03:51 UTC
Because I'm self-studying out of a CS textbook, I'm probably going to sound ignorant here, so please forgive me ahead of time for these stupid questions. I'm also not sure I understand the OOP paradigm or how it works, because the textbook doesn't really cover C++ and leans more toward ASM and C. Anyway, please keep this in mind.

1) First, can someone clarify for me whether all functions "always" remain on the stack (nowhere else in memory, like the heap or the code segment of the address space)? Here I mean: does the code of the called function (the currently running function) remain on the stack during the whole time of its execution?

2) What about functions that will be called later? Are they also allocated on the stack at the same time as the currently executing function, or are they handled one by one, so that after one function finishes its stack space is deallocated and reassigned to the next function, and so on in a top-down fashion? What I'm trying to work out is whether all functions get loaded at the same time or only at the time they are called, and, if they are loaded at the same time, whether they all exist in this area (the stack) at once.

2) With regard to functions, what are local variables vs. global variables? The text I am reading said that local variables are on the stack with the function code. Can someone please explain this process from a compiler's point of view and how exactly it might be handled?

3) How does the compiler generate the code for the function on the stack and keep track of which one is running and which one will come next? In C with functional programming I can see the order being divided into a hierarchical, tree-like structure with lists of functions in order on the stack, but in OOP design (again, I might not understand it), being a different paradigm, does it change the way the stack loads frames? A simple explanation, please.

4) Does the return statement have a relationship to the PC or IR registers? If so, please paint a picture so I can better see this at a hardware level.

5) What is the EBP register exactly?

6) Lastly, related to 1 and 2, does the term "stack frame" mean a unit of code for a specific function, so that there can be more than one stack frame on the stack at a time? Could I say that frames get popped one frame at a time? If not, please explain, if you didn't already in 1 and 2.

Although I realize this is a lot of questions, I really appreciate anyone who took the time to read and even respond to some of them.
Three answers:
green meklar
2013-09-07 15:12:48 UTC
1. The machine code instructions for any particular function reside in static memory, while the local data for any particular call to that function resides on the stack. The machine code instructions use memory offsets to determine where the local data is that they're supposed to be dealing with. There may be multiple versions of local data for the same function (e.g. if you are using a recursive algorithm), but generally only one copy of the machine code.



2. Space for local function data is only allocated on the stack when the function is called. That's why it's called a 'stack': it grows when the call depth becomes deeper, and shrinks in reverse when the calls return back up towards main(). This is more memory-efficient, but in any case, the program does not know beforehand what functions are going to be called and in what order (it is inherently impossible to predict all the function calls faster than the actual execution of the program). However, the compiler may perform some optimizations in this area to improve performance, such as inlining certain small functions into their callers.



2B. As mentioned in (1), the function code does not go in the stack, it goes in static memory. Only the local data goes in the stack. Every time a function is called, space for its local variables is allocated on the stack so that the function can fit its data there during execution. If there is more than one call to the same function active at once (e.g. if you are using a recursive algorithm), a separate segment is allocated for each call. In this way, different calls are able to fit different data into what is 'the same variable' as it appears in the source code, without conflicting.



3. The compiler generates nothing in the stack. It generates the contents of static memory only, and the code in there dictates how memory is to be allocated on the stack or the heap later on, in a way that matches what the source code says to do. The compiler cannot necessarily predict which functions will be called when (as mentioned in (2), this is inherently impossible in the general case). It just prepares the machine code to handle each call correctly whenever it happens to be made. Besides the logic represented in the source code, there is extra machine code inserted to allocate memory on the stack when needed. This doesn't really change in OOP, since methods are basically just functions that are given an extra pointer pointing to the object on which the method was called (again, this happens under the hood, you don't see it in the source code).



4. On each function call, a segment of memory on the stack is set to contain the value of where the program counter was (or was about to be) when the call was made. A return statement just says to pull that value back out, deallocate the stack memory, and set the program counter back to that value. The compiler is allowed to arrange the machine code however it likes in static memory so long as it guarantees that the right value is going to be there to set the program counter back to, so that execution will continue in the right place.



5. In an x86 processor, apparently the EBP register stores a pointer into the segment of memory allocated on the stack for the current function call. In (1) I mentioned memory offsets; it seems the value in EBP is what those offsets are based on. That is to say, in a given function call, a given local variable will have an address equal to that call's base address plus some particular constant (possibly negative) determined by the compiler. For any particular call, while it is executing, the EBP register will hold that base address, and constants in the machine code will tell the processor where to look for each local variable.



6. No, it means the segment of memory on the stack holding the local variables for a particular function call. There can be many frames on the stack at once, that's kinda the whole idea. Exactly how many depends on the compiler specifications and the limitations of the OS and hardware, but for a small function (i.e. one with not very much local data) it can easily be over a thousand.
Will H
2013-09-05 10:53:40 UTC
You need to ask fewer, more direct questions per post.



Memory management in C: The heap and the stack

http://www.inf.udec.cl/~leo/teoX.pdf



Memory Layout of C Programs

http://www.geeksforgeeks.org/memory-layout-of-c-program/



Return statement

http://msdn.microsoft.com/en-us/library/sta56yeb.aspx



The EBP register is the base (frame) pointer: it points to the bottom of the current function's stack frame, and that address stays fixed for as long as the function is executing. More precisely, the EBP register contains the address that the executing function uses as the base for its offsets. Depending on the task of the function, the stack size is dynamically adjusted at run time. Each time a new function is called, the old value of EBP is the first thing the function pushes onto the stack, and then the new value of ESP is copied into EBP. This new value of ESP held by EBP becomes the reference base from which the local variables in the stack section allocated for the new function call are located. As mentioned before, a stack grows downward to lower memory addresses. This is the way the stack grows on many computers, including Intel, Motorola, SPARC and MIPS processors.



http://www.tenouk.com/Bufferoverflowc/Bufferoverflow2a.html
husoski
2013-09-05 16:17:17 UTC
1. You seem to have program ("code") memory and data memory confused. Program memory contains the compiled instructions ("machine code") that the CPU runs to execute the program. Typically, there is just one copy of the code for each function and it is allocated once when the program is loaded.



Data memory is divided into three classes: static, automatic and dynamic. Each typically has its own area in the processor's memory.



All variables declared at the file level are static, and local variables in a function are static if declared so. Static variables are allocated at program load time and remain so for the life of the program. From the program's point of view, they are never allocated or deleted. They reside in an area normally called "static data". (Big surprise.)



It is the automatic local variables of a function that are allocated on "the stack", along with function arguments, plus compiler-generated data for saving and restoring CPU registers and anything else needed for the low-level implementation of function calls. Any "by value" arguments are also pushed onto the stack. Automatic variables are allocated when a function is called and released on function return.



Dynamic memory is explicitly allocated by calling a function like malloc or calloc, and stays allocated until it is explicitly released with free.



So 1) your function's code lives in the code segment, its static data lives in static data (aka "data segments" in the Intel model) and normal (automatic) local variables live on the stack, but only for the duration of the function's execution.



2) Local variables are visible only inside the function they are local to. (Actually, every {} braced compound statement block can have local variables, and they can be either automatic or static.) Global variables are declared outside of any function and are visible to all functions that follow. Global variables always have static storage allocation.



3) N/A, since code doesn't live on the stack. C doesn't do functional programming, either, but if an optimizing compiler decided to parallelize code like f(g1(), g2(), g3()) by calling g1, g2 and g3 in parallel, then yes: the compiler would have to provide separate stacks for each of the called functions. (*note below*)



4) PC, IR? "Program Counter", "Instruction Register"? If so, everything else in your question is Intel x86 except for this. An x86 CALL instruction will (a) push the address of the next instruction onto the stack, then (b) load EIP (the instruction pointer) with the address of the first instruction of the called function's code. The stack is managed by ESP (the stack pointer) and grows downward. An item is "pushed" by subtracting the size of the item from the current ESP value and then copying the contents of the item to memory starting at the new value of ESP. Blocks of memory are allocated from the stack by simply subtracting from ESP (and the new value of ESP is the pointer to the allocated memory) and deallocated by adding to ESP. It is obviously CRITICAL that stack allocations and deallocations be properly nested, like parentheses.



5) As you can see, ESP jumps around a lot during execution. EBP (the base pointer) is intended to be a copy of ESP at entry to a function, before allocating space for locals (but after saving the previous EBP value on the stack). That's a stable point of reference that doesn't change over the lifetime of the function call, and it can be used to locate both arguments and (automatic) local variables as fixed offsets from that register value.



6) One more wrinkle to complete the picture. In a function call, the code before the CALL instruction in the calling program will push all arguments onto the stack, then CALL the function. The term "stack frame" loosely refers to the arguments, saved registers and return address, plus the allocated local variables for one invocation of a function. On the x86, it extends above EBP for the arguments and below EBP for automatic locals. Yes, you can "pop a whole frame" by copying EBP to ESP and then popping to restore the caller's EBP register. That's why "PUSH EBP" is the last stack operation before copying ESP to EBP.


This content was originally posted on Y! Answers, a Q&A website that shut down in 2021.