Ramblings of Narc

When the issue isn't confused enough.

That’s odd…

Now, here’s an odd one for you, courtesy of my friend, Cristian Chilipirea (Dark Hunterj):

#include <stdio.h>

int* holy_crap()
{
	/*static*/ int s;
	printf("%i\n", s);
	return &s;
}

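int main(void)
{
	int* x = holy_crap();	/* first call: prints s, returns its address */
	*x = 2;			/* store 2 through the returned pointer */
	holy_crap();		/* second call: prints s again */
	return 0;
}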

So, what are we looking at? Well, if you uncomment the “static”, you get the intended effect: holy_crap(), when called for the first time, prints “0\n” (static variables start out zeroed) and returns the address of its static variable; the value at that address is changed to 2, and then the second call to holy_crap() prints “2\n”. This is as intended. It is a very silly thing to do, but it works and is legal.
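For reference, here is a minimal sketch of that static variant, split into pieces that can be called and checked. The names static_cell() and run_static_demo() are hypothetical, made up for this illustration; the logic is the same as holy_crap() with the “static” uncommented.

```c
#include <stdio.h>

/* Hypothetical stand-in for holy_crap() with "static" enabled.
   A static object lives for the whole program, so the returned
   pointer stays valid after the function returns. */
int *static_cell(void)
{
	static int s;		/* zero-initialized before the program starts */
	printf("%i\n", s);
	return &s;
}

/* Hypothetical driver mirroring main() from the post:
   call once, write through the pointer, call again. */
int run_static_demo(void)
{
	int *x = static_cell();	/* prints 0 on the very first call */
	*x = 2;			/* legal: s still exists */
	x = static_cell();	/* prints 2 */
	return *x;
}
```

Calling run_static_demo() from main() should reproduce the two lines of output described above, and it does so on any conforming compiler, at any optimization level.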

And then, you comment out the static as I’ve done, and… it still works? That’s odd… In theory, at least, a non-static local variable stops existing when its function returns (so writing through the returned pointer is undefined behavior as far as C is concerned), and the next call is free to put its variable absolutely anywhere in memory, so the address returned by the first call to holy_crap() could be different from the address read and returned by the second call. Yet, on my compiler (gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu3)) and on Dark’s, it actually does print “2\n” on the second call to holy_crap().

Granted, gcc does complain: “warning: function returns address of local variable”, but it still works, for whatever reason.

And it gets better! Check this out:

#include <stdio.h>

int * holy_crap()
{
	int s;
	printf("%i\n",s);
	return &s;
}

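void huh_what(void)
{
	int test;	/* one local int, same as holy_crap() */
	test = 3;
}

int main(void)
{
	int* x = holy_crap();
	*x = 2;
	huh_what();	/* leaves a 3 behind in its dead stack slot */
	holy_crap();	/* prints whatever is in that slot now */
	return 0;
}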

This actually prints “3\n” on the second call to holy_crap()! We could probably reduce this further by removing the initial call to holy_crap() and the assignment of 2 to that pointer, but it doesn’t matter, because this is still some weird shit going on here.

Oh, and I probably don’t need to tell you this, but don’t ever rely on this happening for you! It very likely depends on a combination of compiler, OS, and unoptimized compilation. I haven’t tried this in any conditions other than “gcc file.c -o file”, and I definitely do not recommend writing anything like this in any application beyond the most trivial/educational.

Still, any ideas to explain why this happens?


7 Comments
  1. Apu Illapu on 2009-01-14 at 14:08:40:

    Not that odd, I should think. In fact, I believe this is typical for most combinations of non-optimizing compiler and OS.
    AFAIK it usually goes like this: upon entry into function code the stack pointer (which points initially at the return address) is moved by the exact amount it takes to store all the local variables – in your case, by sizeof (int). The second time you call the function the stack pointer should be the same. Since no initialization is done, it’s just like having a static variable.
    As for your second example: the two functions have the same number of arguments (none) and the same number of local variables (one). They are called from the same context, thus will most likely have the same stack pointer. Therefore the memory locations for the variables will overlap.
    I wonder, with more than one local variable, what would be the order in memory – compiler and OS dependent, or totally whimsical?
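The overlap described in this comment can be observed without dereferencing anything dangling. The sketch below is only an illustration: where_first(), where_second() and frames_overlap() are hypothetical names, and whether the two addresses match depends entirely on the compiler, its flags, and whether it inlines the calls.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helpers: each reports where its single local int lives.
   The address is turned into an integer while the variable is still
   alive, so no dead object is ever read or written. */
uintptr_t where_first(void)
{
	volatile int s = 0;	/* volatile keeps the variable in memory */
	uintptr_t addr = (uintptr_t)&s;
	return addr;
}

uintptr_t where_second(void)
{
	volatile int test = 3;
	uintptr_t addr = (uintptr_t)&test;
	return addr;
}

/* Returns 1 if both locals sat at the same address, 0 otherwise.
   Both answers are permitted; nothing in C promises either one. */
int frames_overlap(void)
{
	uintptr_t a = where_first();
	uintptr_t b = where_second();
	printf("first:  0x%" PRIxPTR "\n", a);
	printf("second: 0x%" PRIxPTR "\n", b);
	return a == b;
}
```

With an unoptimized build, the two calls start from the same stack pointer, so the two printed addresses will often be identical, which would match the explanation above. That is an expectation about typical compilers, not a guarantee.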

  2. Apu Illapu on 2009-01-14 at 14:19:14:

    Uh, re. the above, the Wikipedia entry on stack frames.

  3. Narc on 2009-01-14 at 16:56:16:

    “the two functions have the same number of arguments (none) and the same number of local variables (one).”

    The number of arguments doesn’t seem to matter (I’ve tried giving huh_what() an argument and it still works the same way), but I haven’t tried giving it two local variables yet.

    Let me try that…

  4. Narc on 2009-01-14 at 17:08:25:

    Okay, here’s an interesting item or two:
    - multiple ints seem to work in order of declaration, as I honestly expected.
    - and with two unsigned chars, they tend to overwrite the leftmost two bytes of the four-byte int in the first function.

    That last one is definitely dependent on the platform byte ordering. x86 is little-endian, isn’t it? Of course it is. Interestingly, there’s no word alignment that I can see — the two bytes got packed as tightly as they could be, literally one after the other. Might be interesting to test all this with optimizations.
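A well-defined way to see how byte order shapes this result is to overwrite two bytes of an int by hand. This is a sketch under assumptions: overlay_two_bytes() is a hypothetical helper, and the offset parameter stands in for wherever the compiler happened to place the two unsigned chars, which this code cannot know.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: overwrite two adjacent bytes of an unsigned int,
   starting at the given byte offset (counted from the lowest address),
   with 0x01 and 0x02. Returns the value unchanged if the two bytes
   would not fit inside the int. */
unsigned int overlay_two_bytes(unsigned int original, size_t offset)
{
	unsigned char bytes[sizeof original];

	if (offset > sizeof original - 2)
		return original;

	memcpy(bytes, &original, sizeof original);	/* view the int as raw bytes */
	bytes[offset] = 0x01;
	bytes[offset + 1] = 0x02;
	memcpy(&original, bytes, sizeof original);	/* reassemble the int */
	return original;
}
```

On a little-endian machine such as x86, overlay_two_bytes(0, 0) changes the two least significant bytes, while an offset of sizeof(unsigned int) - 2 changes the two most significant ones. On a big-endian machine the roles are swapped.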

  5. Narc on 2009-01-14 at 17:17:24:

    As for optimizations:
    - -O1, -O2 and -Os seem to block changes from a different function (the second function no longer affects local variables in the first); and
    - -O3 blocks the changes from the first example, too, which is the behavior I was expecting originally.

    It’s times like these I feel like learning x86 assembly (again, this time the right way). It’s just so fascinating.

  6. Apu Illapu on 2009-01-14 at 18:41:57:

    Aaargh!
    Been trying to find an interactive disassembler that “just works” on my Fedora 9. No luck so far. Advice?

  7. Narc on 2009-01-14 at 19:22:57:

    Interactive, no — so far, I’ve just been using gcc -S for my limited needs. I’m a PHP-er by trade, so I’m not very motivated to pursue these things (outside of curiosity).

    That said, gdb is what usually gets pointed out as a disassembler + debugger, so maybe a GUI front end to gdb might be an idea.

    Looking through the apt repositories (I’m on Ubuntu) with Synaptic gives some interesting clues, too: x86dis, a front end to libdisasm, and something called dissy, which is written in Python and uses objdump for disassembling.

    So, um, good luck?
