Audio Week!

People who worked with me at EA knew I had a special place in my heart for audio. Whether it was internally developed, from a licensed engine or third-party middleware, audio generated way more bugs than I would hope. It became something of a running joke where people would come to me for a crash they were trying to figure out and my immediate answer would be “Probably audio.”

All this to say! I’ve been spending a few hours trying to setup the audio bits of my engine. As I’ve already invested a little bit of time not getting too locked into using iOS specific features, I decided to try my hand at using OpenAL for my audio needs. Unfortunately Android and some other very distant porting goals don’t actually support it. Instead, they seem to be going with OpenSL, which seems to have been created with the rationale that mobile devices couldn’t support OpenAL. Except that you know, iOS already supports it. Boo.

Luckily it was pretty easy to get up and running with OpenAL. Sounds are loading and playing on all platforms! 3D positional audio seems like… substantial overkill for what I have in mind, but it’s nice it’s there in case I ever want to use it.

Oh Aeron…

The standard issue office chair at EA Montreal was something strongly resembling the infamous Aeron chair, notable on Wikipedia for being associated with dot-com excess. But man… sitting in my wooden chair of unknown origin for more than a few hours and I think my spine gives up and conforms to the distinctly non-ergonomic shape of the chair. If I keep working from home for long enough I think I’ll have to invest in a significantly comfier chair.

Now back to realizing why I’m not an artist… :)

Wrong On The Internet!

I saw a post on the internet yesterday saying that the SSE implementation of inverse square root was 4x faster than the ‘marvelous‘ method of computing it. My internal thought process was along the lines of: “Of course it is! It’s going 4 floats at a time!” and so I decided… someone on the internet MAY be wrong!

Duty Calls

So I spent an ill-advised hour or so trying to figure out how to get vector intrinsics to work with the version of GCC that I had installed.

I eventually ended up with code that generated OK-looking x86 assembly (although it does make me miss the PS3/360 vector intrinsics!):

void inv_sqrt_marvelous_sse(float* begin, float* end, float* output)
{
    v4sf onePointFiveFloat = {1.5f, 1.5f, 1.5f, 1.5f};
    v4sf zeroPointFiveFloat = {0.5f, 0.5f, 0.5f, 0.5f};
    v4si intConst = {0x5f3759df, 0x5f3759df, 0x5f3759df, 0x5f3759df};
    v4si intShift = {0x1u, 0x1u, 0x1u, 0x1u};

    union {
        v4sf f;
        v4si i;
    } x;

    v4sf *v4out = (v4sf*)output;

    for(; begin < end; begin+=4, v4out++){
        v4sf orig = *((v4sf*)begin);
        v4sf half = zeroPointFiveFloat * orig;
        x.f = half;
        x.i = intConst - (x.i >> intShift);
        *v4out = x.f * (onePointFiveFloat - half * x.f * x.f);        
    }
}

I didn’t actually get the >> operator to work: my version of gcc on Mac is a bit out of date and so doesn’t support all of the magic vector operations (despite the fact someone may be wrong, I decided that potentially breaking XCode would be taking things too far). So I subbed in a cheap nop and… those SSE implementers made that approximate inverse square root pretty fast! The marvelous version naively vectorized as above was about 2x slower with what I guess would be around an order of magnitude more error. It’d be a lot harder to tell and I’d have to use a much better benchmark than the one I was using to say for sure, but it was enough for me to concede. Regardless… go SSE!

So in this case… I was wrong on the internet! :) At least I learned a bit about gcc intrinsics along the way.

(Update: this guy went into it in much more thorough detail if you’re interested!)