About C++ variable addresses in Windows?

Cubbi

2011-08-01 06:49:38 UTC

Good answers already; I'll address a few specific points that I think could get lost in the discussion:

"I’ve learnt that initialized variable is stored in .data section, uninitialized variables in .bss and pointers in heap."

Wrong. Objects in C++ can have one of three storage durations: automatic ("on stack"), static (in .data/.bss), and dynamic ("on heap" / "in free store"). A pointer-to-int is an object, just like an int: it is stored in main()'s stack frame in the first sample and in .bss in the second sample, just like the int objects.

That leads to the confusion in your other question:

"why uninitialized variable is close to a pointer variable that is supposed to be in heap section."

It's not supposed to be in the heap section. Both c and d have static storage duration and have no initializer, so they are zero-initialized and belong in .bss, which is where your compiler places them.

As for what is occupying the extra space on Visual Studio's stack, WaMSie has the right idea: it's a bunch of 0xCC bytes placed on the stack around each automatic object, to detect buffer overflows caused by out-of-bounds array writes or other forms of stack frame corruption.

?

2011-08-01 08:20:53 UTC

@Blackcompe and @aren: You say 'd' is allocated on the heap. Not so: 'd' is just a normal non-initialized variable. It's the value 'd' *contains* (e.g., the result of "new" or "malloc") that points somewhere inside the heap.

Since you're using GCC I assume you have the "nm" command. If you compile the program into "a" then the command "nm a | grep '4a0' | sort" gives:

0804a024 D __data_start

0804a024 W data_start

0804a028 D __dso_handle

0804a02c D a

0804a030 D b

0804a034 A __bss_start

0804a034 A _edata

0804a040 B _ZSt4cout@@GLIBCXX_3.4

0804a0cc b completed.6155

0804a0d0 b dtor_idx.6157

0804a0d4 B c

0804a0d8 B d

0804a0dc b _ZStL8__ioinit

0804a0e0 A _end

Note that 'a' and 'b' are indeed in the data segment (denoted by "D"). 'c' and 'd' are in the BSS segment (denoted by "B"). You can also see that the BSS area starts right after the data segment, but there are variables internal to C++ (or to Devil knows whom) stored there, so variable 'c' isn't at its very beginning.

===EDIT===

"@Merc : I'm aware that 'd' is a pointer. I'm saying that its value points to the heap."

"it's value", yes, but the program doesn't print d's value (it doesn't do "cout << d"). It prints d's own address (it does "cout << &d"), which is in the BSS, not in the heap.

You think (erroneously) that 0x804a0d8 is in the heap (search your message for the word "heap": this word shouldn't even be mentioned).

Blackcompe

2011-08-01 07:08:40 UTC

Let me start off by saying it's implementation-dependent. On my machine, a, b, and c are separated by 4 bytes. I'm using GCC. The language doesn't define how locals are to be laid out on the stack. Perhaps you should consult your compiler's documentation. WaMSie also proposes a good idea.

For the second program, my output was:

0x804a02c

0x804a030

0x804a0d4

0x804a0d8

a and b are 4 bytes apart in the data segment. c is 164 bytes from b in the bss segment, and d is 4 bytes from c in the heap. It's easy for me to see why the heap starts 4 bytes after the start of the bss segment.

But now I ask myself why the bss segment starts so far away. If we lay out all initialized static and global objects during compilation in the data segment, the bss segment can start immediately after. Right? Even though some of the objects (in the data segment) will be overwritten at runtime, the size of the object won't change.

Perhaps the assembler allocates a default amount of space in the data segment, because it's likely that it won't allocate space for all the initialized globals and statics before allocating for an uninitialized global.

E.g.

//globals

int a = 1; //assembler: visit this first, allocate in data seg

int b = 2;//visit second, allocate in data seg

int c; // visit third, allocate in bss. But WAIT! The bss has to be far away because we may
       // have an unknown number of initialized globals to place in the data seg. If this weren't
       // true, we could just place 'c' directly after 'b'.

int d = 3; //Ahh... we were right!

int e = 4;

I think my theory is somewhat correct, but it wouldn't explain why the heap starts directly after the bss seg. But then again, I didn't write the assembler, so how accurate can I really be? If I were designing an assembler, that's how I'd do it. Of course, there are probably a whole bunch of other factors, like language support, that I didn't consider.

@Merc : I'm aware that 'd' is a pointer. I'm saying that its value points to the heap.

2011-08-01 06:42:49 UTC

To begin with, the difference between the first code and second code is the scope that the variables are declared in. When variables (etc) are declared inside functions, they are allocated on the stack. When they are declared globally, they are allocated on the heap.

An exception to this rule is when you explicitly allocate memory using the "new" operator or "malloc"-like functions. For example, if you allocate memory using the "new" operator inside a function, that memory is allocated on the heap; however, the pointer to the allocated memory is still stored on the stack, since you declared the pointer inside a function. That's why you get the same address printed for the pointer. If you remove the "address of" operator from the pointers when printing them, you will see that they point to the memory allocated on the heap.

peteams

2011-08-01 06:30:38 UTC

You are looking at a debug build. Change to a release build and you will find the variables are allocated as you expected.

I believe what you are seeing in the debug build is the compiler writing safe code, so if you make a memory handling error it has a good chance of catching it.