Symbols in crash dumps on iOS

If you distribute release builds… it’s possible your application will crash and generate a not particularly useful bunch of addresses for a variety of threads. Luckily there’s a solution: if you put the .app and the .app.dSYM in an easily findable place (like, say… the same place as your .crash file!), Xcode will do all the lookups for you! Easy peasy!

However, release builds should generally be as small as you can make them, and symbols take up space. How do we get both? The following combination of build settings gets you a nice small stripped .app while still being able to get symbols:

Debug Information Format: DWARF with dSYM File
Deployment Postprocessing: Yes
Strip Linked Product: Yes
Use Separate Strip: Yes
Generate Debug Symbols: Yes

If you’re using Unity, be sure to disable the -Wl,-S,-x flags in the additional linker options; they basically do the stripping before the dSYM gets generated, which is not what you want! A side note: the .app that comes from a signed .ipa doesn’t seem to match up with the one it was generated from (which… makes sense), so you’ll want to keep your original .app around, like Xcode does when you use Archive.

Finding iOS memory

As I get more familiar with iOS, one difference from my experience with consoles is the general mystery about where resources end up in memory.

For consoles a typical game allocates as much memory as it can from the OS, puts it into its own memory pools and then is as miserly as possible from then on.

On iOS, even figuring out how much memory you’re using is pretty difficult! There’s not too much information available at runtime, and it’s a bit off-putting when I count up the allocations my game has made, and the system reports that it is using 400% more than I expected. Ouch. The documentation seems a bit scattered, so most of this post is me documenting my discoveries as I go. I’m certainly not an expert here, so take all of it with a few grains of salt.

If you google for information about memory on iOS, the first recommendation is to look at the Allocations instrument. This is generally good advice.

Allocations

The Allocations instrument will show you, in a reasonably nice way, all of the calls your application makes to malloc. If you’re using C++ or Objective-C, calls to new or alloc end up falling through to… malloc! So any standard code-side allocations will show up here with a callstack to help you track them down, which is pretty handy. As a reference, in my game, if I disable my internal allocator, this shows a pretty stable usage of around 16 MB over 12 thousand allocations.

But wait… where are my textures? I know I’m using some memory for my textures and buffer objects. And they are nowhere to be found. And if I swap in TCMalloc or dlmalloc as my allocator, all of a sudden I’m only using 4.3 MB. And if you’re using MonoTouch or Unity, you might get suspicious about not seeing the C# heap memory being taken into account. So off to other instruments we go!

Memory Monitor

The number that’s usually the most off-putting is the one reported by the Memory Monitor instrument under the Real Memory column. This seems to include EVERYTHING! Textures, executable, shared libraries, actual allocations. You name it, it’s yours and counted against you. Of course it’s pretty much impossible to figure out what’s using it from just the one number… but it’s a start. A side note is that this number is the same as what this call returns:

 kern_return_t kerr = task_info(mach_task_self(), TASK_BASIC_INFO, (task_info_t)&info, &size);    

So it is available at runtime at least. That number also tends to match the number reported in out of memory crash reports; indeed, if you dig around Mach you’ll see the number reported is the number of pages multiplied by the page size. One fun thing about the Memory Monitor is that you can see other applications and how much memory they’re using. The biggest one I’ve found so far is Infinity Blade by Epic, which peaked at a hefty 165 MB of memory, or around 1/4 of the available memory on my iPhone 4. My game, when running, comes in at 28.62 MB, which we’ll note is significantly higher than the 15 MB that Allocations reported.
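For logging that number from inside the game, something like the following minimal sketch should do. The helper names residentMemoryBytes and toMegabytes are made up for illustration, and the non-Apple branch is only there so the snippet compiles on other platforms; it is one plausible way to wrap the call, not the only one.

```cpp
#include <cstdint>
#if defined(__APPLE__)
#include <mach/mach.h>
#endif

// Hypothetical helper: resident size of the current task in bytes,
// or 0 if the query fails or this is not a Mach-based platform.
inline std::uint64_t residentMemoryBytes()
{
#if defined(__APPLE__)
    task_basic_info_data_t info;
    mach_msg_type_number_t size = TASK_BASIC_INFO_COUNT;
    kern_return_t kerr = task_info(mach_task_self(), TASK_BASIC_INFO,
                                   (task_info_t)&info, &size);
    return (kerr == KERN_SUCCESS) ? static_cast<std::uint64_t>(info.resident_size) : 0;
#else
    return 0; // assumption: no Mach task_info here, so report nothing
#endif
}

// Hypothetical helper: bytes to megabytes, taking 1 MB = 1024 * 1024 bytes.
inline double toMegabytes(std::uint64_t bytes)
{
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}
```

Calling toMegabytes(residentMemoryBytes()) once a frame and printing it next to the frame time is usually enough to spot when a level load pushes the number up.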

VM Tracker

Next down the list is the number reported by the VM Tracker (Virtual Memory Tracker) instrument. Things get interesting with VM Tracker, as it looks to be the most finely grained method for iOS to report an application’s memory usage.

In my game we can see that it has a *Resident* size of 97.69 MB, and a *Dirty* size of 36.43 MB. The *Resident* size is presumably all of the memory/pages that the system associates with my app. That is more than 3x what the Memory Monitor reports! So clearly the Memory Monitor is not assigning the blame for all of that memory to my game. The *Dirty* size is also bigger than what Memory Monitor reports, but it’s at least a little bit closer to what’s expected. From reading through the Apple developer forums it looks like “Dirty memory usage is the most important on iOS, as this represents memory that cannot be automatically discarded by the VM system (because there is no page file).” So let’s go through the categories I’m aware of:

IOKit

IOKit is… textures you’ve uploaded with OpenGL, i.e. all those calls to glTexImage2D. If you’re not using OpenGL, your image resources will likely go under a different label; “Core Animation” and “CGImage” come up, but don’t quote me on it.

VM_ALLOCATE

The VM_ALLOCATE label is a bit of a grab bag. I set a breakpoint on the call to vm_allocate() and noticed that my glBufferData calls are in the callstack, so VBOs allocate their memory from this region (not all vertex arrays will end up in here though; some also end up hitting malloc!). In fact it seems like anything that uses mmap will also show up in here, presumably because mmap calls generate vm_allocate() under the hood. If you’re using dlmalloc, it will use mmap by default, so dlmalloc’s heap is actually accounted for under this label and not included in the malloc sections. Likewise it seems Mono uses this for its memory, so if you have a Unity game and are wondering where the rest of the heap is… VM_ALLOCATE is your culprit!

SBRK

If you see an sbrk region, it means you are using software designed for systems that expect sbrk() to do something reasonable. iOS (and OS X!) are not those platforms. They just get the lovely call found here, which shows it calling vm_allocate under the hood with a hardcoded 4 MB. Oof. If you’re using TCMalloc, make sure you modify config.h to not use sbrk, as the autoconfig will find the header and assume it’s a good idea to use it.

__TEXT

The __TEXT section is using quite a bit of memory in my example: 584 KB for the application itself, and 27.21 MB total! As one might fear, unfortunately a big chunk of this is outside my control. __TEXT contains your actual executable code, as well as other things marked read only, like literal strings and other constant data. It also contains those sections for any dynamic libraries in use by running applications, so even if you cut down on the libraries you’re using, there’s no guarantee your __TEXT section will shrink. What is under your control is the size of your executable, including constant data, as well as the libraries you use/load, although the latter may not be much help.

__DATA

The __DATA section contains static sections of your executable that are writable. So those big global buffers you shouldn’t be using will show up here. If you look at the map file for your executable, you can see which parts of the __DATA section in particular are taking up space. The bss would be your application’s static data. If you’re unfamiliar with what that might mean, in C/ObjC/C++ it’s typically made up of the following:

  1. Non-const variables that are declared static in your classes or functions.
  2. Global variables.

You can read in more detail here what the executable sections mean, including our __DATA section and __TEXT section friends.

MALLOC_TINY, MALLOC_SMALL, MALLOC_LARGE

Then we get to the malloc regions. There are three of them, and their sum will be the closest number to what Allocations reports: all our ‘standard’ allocations will end up somewhere in one of these three pools. From this division, and by browsing through the Libc code, we can see that the mach allocator is partially inspired by Hoard, and has some small thread caching behaviour to help threaded applications.

Assuming the iOS malloc is using that source file, we can see that ‘MALLOC_TINY’ means up to 496 bytes, ‘MALLOC_SMALL’ means up to 15 KB, and ‘MALLOC_LARGE’ covers everything bigger than that. I did some quick tests with iOS 5.1 and can confirm those sizes make sense: leaking 400 bytes goes to tiny, 600 bytes goes to small, and 16000 bytes ends up in large. It seems to be a generally spiffy allocator actually, so carefully consider the downsides of using your own!
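To make those cut-offs concrete, here is a small sketch that sorts a request size into the three regions. The thresholds are simply the numbers quoted above (with 15 KB taken as 15 * 1024 bytes); the names are hypothetical, and whether a request sitting exactly on a boundary lands above or below it is an assumption on my part, not something read out of the allocator.

```cpp
#include <cstddef>

enum class SizeClass { Tiny, Small, Large };

// Thresholds copied from the text above, not from the allocator source.
constexpr std::size_t kTinyMaxBytes  = 496;
constexpr std::size_t kSmallMaxBytes = 15 * 1024; // assumes "15 KB" means 15360 bytes

// Hypothetical helper: which VM Tracker malloc region a request of this
// size would be expected to show up in. Boundary handling is an assumption.
constexpr SizeClass classifyAllocation(std::size_t bytes)
{
    return bytes <= kTinyMaxBytes  ? SizeClass::Tiny
         : bytes <= kSmallMaxBytes ? SizeClass::Small
                                   : SizeClass::Large;
}

// The three test leaks described above, checked at compile time.
static_assert(classifyAllocation(400)   == SizeClass::Tiny,  "400 bytes should be tiny");
static_assert(classifyAllocation(600)   == SizeClass::Small, "600 bytes should be small");
static_assert(classifyAllocation(16000) == SizeClass::Large, "16000 bytes should be large");
```

This is only useful as a mental model for reading VM Tracker; the real allocator also rounds sizes up to its own quantum, which the sketch ignores.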

A useful rundown of OS X’s allocator (which seems to be the same!) tells us there used to be a ‘huge’ malloc, but it was removed when Snow Leopard hit, somewhere around iOS 4. Besides the standard debug calls available in malloc.h, there’s a useful undocumented function called scalable_zone_statistics. It works well for getting statistics about how you’re hitting the different malloc sections at runtime. I use it like so:

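#include <malloc/malloc.h> // malloc_zone_t, malloc_statistics_t, malloc_default_zone()
// Undocumented, so there is no header for it and we declare it ourselves.
extern "C" boolean_t scalable_zone_statistics(malloc_zone_t *zone, malloc_statistics_t *stats, unsigned subzone);
malloc_statistics_t stats;
// Subzone 0 is tiny; pass 1 or 2 for small/large. Fills in stats.blocks_in_use, stats.size_in_use, etc.
scalable_zone_statistics(malloc_default_zone(), &stats, 0);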

As it’s not public, I would recommend against using it in anything you hope to release.

Miscellaneous

TCMalloc – this seems to be WebKit using TCMalloc internally. Not using WebKit, you say? Neither am I. Seems to top out around 250 KB for me.

Memory Tag 70 – This wonderfully named section seems to be related to UIImage or other UIKit calls. For me it’s only 32 KB, but I’ve seen some apps report much higher usage. Make sure you’re loading your images with UIImage with the correct method!

LINKEDIT – The ‘link edit’ (I always read it first as Linked-It…) is for the dynamic linker to patch up calls to dynamic libraries. I think. Most of this will be under the dyld_shared_cache, which presumably means it’s for shared libraries.

WAT?

I’ll keep updating this post as I learn more about how things work on iOS, but hopefully this will be useful to other people who start to wonder where all that memory is going. If I have anything wrong or there are things that should be added here, please let me know!

Poking around Unity

Unity seems to be the hot new thing these days and for good reason! I haven’t played with it too much, but it seems to be a nice friendly cross-platform way to get a game up and running. It purrs like a kitten on my PC, and doesn’t do too shabbily on my 3 year old laptop, so it certainly seems to cover its bases!

Unfortunately the current version doesn’t let you run with Xcode 4.3.1. It seems something in iOS 5.1 has broken the check they’ve added to make sure you use their shiny Unity logo, and their only advice is to download an old version. That was going to be the end of my playing with it until I noticed the callstack. There’s a function called VerifyiPhoneSplashScreen. Takes a string, no return value. So I made a function named VerifyiPhoneSplashScreen that takes a string and immediately returns. Allowed multiply defined symbols… annnnnnnd it works just fine.

Don’t change the image though, I like Unity and feel like they deserve the marketing. :)

Updating to iOS 5.1

I had a brief lapse in judgement recently and decided to update my phone to iOS 5.1. Of course I can’t use my current version of Xcode with iOS 5.1, and I also can’t use the new version of Xcode with OS X Snow Leopard. I should mention I have no intention of using any iOS 5.1 features.

Anyways, a few hours later! Paid $30 for Lion. Install it and my shiny new Xcode from the App Store. Recompile game. Game instantly dies on mysterious “could not find partial die in cache” error in the debugger. Go through recompiling all intermediate libraries I have source code for. Same error. Clean/repeat. Same error. Ignore problems. Finally notice Xcode is downloading iOS 5.1 development libraries in the background (!), no idea when it started or what it’s doing.

Wait for it to complete, full recompile… and it works. But it was actually days between when I got Xcode and when I noticed this download bar, so who knows. But if anyone googles that like I did… make sure you have the development libs installed that you think you do. The other Google results are way more alarming… :)

Wrong On The Internet!

I saw a post on the internet yesterday saying that the SSE implementation of inverse square root was 4x faster than the ‘marvelous’ method of computing it. My internal thought process was along the lines of: “Of course it is! It’s going 4 floats at a time!” and so I decided… someone on the internet MAY be wrong!

Duty Calls

So I spent an ill-advised hour or so trying to figure out how to get vector intrinsics to work with the version of GCC that I had installed.

I eventually ended up with code that generated OK-looking x86 assembly (although it does make me miss the PS3/360 vector intrinsics!):

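// GCC vector extension types: four packed floats / ints, 16 bytes each.
typedef float v4sf __attribute__((vector_size(16)));
typedef int v4si __attribute__((vector_size(16)));

// begin, end and output must be 16-byte aligned, and (end - begin) a multiple of 4.
void inv_sqrt_marvelous_sse(float* begin, float* end, float* output)
{
    v4sf onePointFiveFloat = {1.5f, 1.5f, 1.5f, 1.5f};
    v4sf zeroPointFiveFloat = {0.5f, 0.5f, 0.5f, 0.5f};
    v4si intConst = {0x5f3759df, 0x5f3759df, 0x5f3759df, 0x5f3759df};
    v4si intShift = {1, 1, 1, 1};

    // Lets us view the same 16 bytes as either floats or ints.
    union {
        v4sf f;
        v4si i;
    } x;

    v4sf *v4out = (v4sf*)output;

    for(; begin < end; begin+=4, v4out++){
        v4sf orig = *((v4sf*)begin);
        v4sf half = zeroPointFiveFloat * orig;
        x.f = orig; // the bit trick works on the bits of the original value, not half of it
        x.i = intConst - (x.i >> intShift); // magic-constant initial guess
        *v4out = x.f * (onePointFiveFloat - half * x.f * x.f); // one Newton-Raphson step
    }
}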

I didn’t actually get the >> operator to work: my version of gcc on Mac is a bit out of date and so doesn’t support all of the magic vector operations (despite the fact someone may be wrong, I decided that potentially breaking Xcode would be taking things too far). So I subbed in a cheap nop and… those SSE implementers made that approximate inverse square root pretty fast! The marvelous version naively vectorized as above was about 2x slower with what I guess would be around an order of magnitude more error. It’d be a lot harder to tell and I’d have to use a much better benchmark than the one I was using to say for sure, but it was enough for me to concede. Regardless… go SSE!
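For comparison, here is a minimal scalar sketch of the same ‘marvelous’ trick, which is handy as a reference when checking the vector version’s output lane by lane. The function name is made up, and it uses memcpy for the float/int reinterpretation instead of a union; that is a choice for this sketch, not what the vector code above does.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical scalar reference: magic-constant initial guess followed by
// one Newton-Raphson step. Only meaningful for positive, finite inputs.
inline float invSqrtMarvelous(float value)
{
    assert(value > 0.0f);
    const float half = 0.5f * value;

    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);   // view the float's bits as an integer
    bits = 0x5f3759dfu - (bits >> 1);          // initial guess from the magic constant

    float guess;
    std::memcpy(&guess, &bits, sizeof guess);  // back to a float
    return guess * (1.5f - half * guess * guess); // one refinement step
}
```

With a single refinement step the result lands within a fraction of a percent of the exact value, so invSqrtMarvelous(4.0f) comes out just under 0.5 rather than exactly 0.5.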

So in this case… I was wrong on the internet! :) At least I learned a bit about gcc intrinsics along the way.

(Update: this guy went into it in much more thorough detail if you’re interested!)

Rendering text on iPhone with OpenGL

So far on my project I’ve been using SDL, a library I’ve used over the years for small projects. I use it as a quick way to get a cross-platform OpenGL context created, so I figured I’d use it in the same manner for my current game. SDL on iOS has a few quirks to work through, but if you’re using OpenGL with shaders it’s not much work to have the same code working on iOS with OpenGL ES 2.0.

One of the strengths of SDL is its lightweight and easy to use ‘standard’ libraries. I’ve used these before to get some basic sound, image loading and text rendering working. Image loading worked quite nicely (because this guy added a nicer version that uses the built in support, thanks man!), but SDL_ttf, the font renderer, relies on being able to link against FreeType, a nifty open source font rendering library. It’s certainly doable to link against it and some people have built it for iOS, but it’s a bit of a procedure and it’s yet another library to have to keep up to date.

It also seemed a bit… grating to have to use a separate library. Apple has historically placed an above average emphasis on having nice fonts and the iPhone is no exception. Its built-in font rendering lets you choose from the decent selection of built in fonts (Helvetica!) and also allows you to bring your own font to the dance. A roadblock arises when you realize you need to render the fonts to OpenGL textures and not to some mysterious UIGraphicsContext. Luckily I’m not the first one on the internet to have these concerns and I found a blog post about mixing UIKit and OpenGL.

Unfortunately his example code didn’t work for me. The code looks fine and seems straightforward and runs without error. It just wasn’t actually rendering anything into the buffer. After adding every CGContext state setting function I could find I stumbled upon a combo that sets up enough state to render with only a basic SDL OpenGL window set up. Hopefully Google will bring people here to save themselves the hour of randomly setting states to get it to render… :)

Caveats: it doesn’t word wrap, or do any other fancy tricks. For now, I’m using it to aid in debugging and to show frame times and the like.

//from bit twiddling hacks
inline uint32_t nextPowerOfTwo(uint32_t v)
{
    v--;
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    v++;
    return v;
}

GLuint CreateTextureFromText(const char* text, const sRGBA &rgba, int &out_width, int &out_height)
{    
    NSString *txt = [NSString stringWithUTF8String: text];
    UIFont *font = [UIFont fontWithName:@"Helvetica-Bold" size:16.0f];

    CGSize renderedSize = [txt sizeWithFont:font];

    const uint32_t height = nextPowerOfTwo((int)renderedSize.height); out_height = height;
    const uint32_t width = nextPowerOfTwo((int) renderedSize.width); out_width = width;
    const int bitsPerElement = 8;
    int sizeInBytes = height*width*4;
    int texturePitch = width*4;
    uint8_t *data = new uint8_t[sizeInBytes];
    memset(data, 0x00, sizeInBytes);

    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();

    CGContextRef context = CGBitmapContextCreate(data, width, height, bitsPerElement, texturePitch, colorSpace, kCGImageAlphaPremultipliedLast);

    CGContextSetTextDrawingMode(context, kCGTextFillStroke);

    CGFloat components[4] = { rgba.r, rgba.g, rgba.b, rgba.a }; // CGColorCreate expects CGFloat, which is double on 64-bit
    CGColorRef color = CGColorCreate(colorSpace, components);
    CGContextSetStrokeColorWithColor(context, color);    
    CGContextSetFillColorWithColor(context, color);
    CGColorRelease(color);    
    CGContextTranslateCTM(context, 0.0f, height);
    CGContextScaleCTM(context, 1.0f, -1.0f);

    UIGraphicsPushContext(context);

    [txt drawInRect:CGRectMake(0, 0, width, height) withFont:font lineBreakMode:UILineBreakModeWordWrap alignment:UITextAlignmentLeft];

    UIGraphicsPopContext();

    CGContextRelease(context);
    CGColorSpaceRelease(colorSpace);    

    GLuint textureID;
    glGenTextures(1, &textureID);    

    glBindTexture(GL_TEXTURE_2D, textureID);

    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);     

    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, data);

    delete [] data;

    return textureID;
}

You generally don’t want to be creating textures very often, so you’ll want to have some sort of caching for the texture ID that’s being returned.
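As a rough idea of what that caching could look like, here is a minimal sketch keyed on the text string. Everything in it is hypothetical: TextureId stands in for GLuint so the sketch has no OpenGL dependency, and the creation function is passed in rather than calling CreateTextureFromText directly. It also never evicts or deletes textures, which a real cache would want to do.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

typedef unsigned int TextureId; // stand-in for GLuint (assumption for this sketch)

struct CachedText { TextureId id; int width; int height; };

// Hypothetical cache: creates a texture the first time a string is
// requested and hands back the stored entry on every later request.
class TextTextureCache
{
public:
    typedef std::function<TextureId(const char*, int&, int&)> CreateFn;

    explicit TextTextureCache(CreateFn create) : m_create(std::move(create))
    {
        assert(m_create != nullptr);
    }

    const CachedText& get(const std::string& text)
    {
        auto found = m_entries.find(text);
        if (found != m_entries.end())
            return found->second; // already rendered, reuse the texture

        CachedText entry = {0, 0, 0};
        entry.id = m_create(text.c_str(), entry.width, entry.height);
        return m_entries.emplace(text, entry).first->second;
    }

    std::size_t size() const { return m_entries.size(); }

private:
    CreateFn m_create;
    std::unordered_map<std::string, CachedText> m_entries;
};
```

The creation function would typically be a small lambda that captures the colour and forwards to CreateTextureFromText, which means one cache per colour. For text that changes every frame (like a frame timer) a cache like this grows without bound, so it suits labels that repeat better than ever-changing numbers.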

A few people wrote me to ask for more details about using the texture that’s returned. It’s just a normal quad with the dimensions given by the out_width and out_height parameters. I made a simple C++ class header that should get you most of the way. If you include this header in a .mm file, create the class after your OpenGL initialization and bind a shader which has texturing, it should™ render your text! It requires you to be using OpenGL ES 2.0. I’ve also, as of 2013, updated it to use the iOS 6.0 text alignment enums, as Xcode was complaining.

simplerender.h