A common cause of node deaths is memory exhaustion. Sometimes this is due to some memory-eating monster bug, but often the 2 GB virtual address space of a 32-bit process simply fills up with legitimate user data. No matter how much memory is installed on the machine, each process can address only 2 GB of it.
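For the curious, the 2 GB figure falls straight out of pointer width. The sketch below is only back-of-the-envelope arithmetic: the helper name `addressable_bytes` is made up for illustration, and the even split between process and operating system is an assumption chosen to match the 2 GB limit quoted above, not something measured on our nodes.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def addressable_bytes(pointer_bits):
    """Number of distinct byte addresses a pointer of this width can name."""
    return 2 ** pointer_bits


# A 32-bit pointer can name 4 GiB worth of addresses in total.
total_32 = addressable_bytes(32)

# Assumption: half of that range is reserved by the operating system,
# which leaves the 2 GB per process mentioned in the text.
user_space_32 = total_32 // 2

# A 64-bit pointer is the theoretical upper bound; real hardware and
# operating systems expose less than this.
total_64 = addressable_bytes(64)

print(f"32-bit total:      {total_32 // GIB} GiB")
print(f"32-bit user space: {user_space_32 // GIB} GiB (assumed half)")
print(f"64-bit total:      {total_64 // GIB} GiB (theoretical)")
```

Installing more RAM changes none of these numbers, which is why only a wider pointer helps.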
To alleviate this and buy ourselves more room for growth, we have been porting the server binaries to the new x64 architecture so that they run as 64-bit processes under Mac OS X 10.7 Lion or even Ubuntu 64.
Moving to this architecture should give us some other key benefits as well: the number of general-purpose registers doubles from 8 to 16, which allows for better code optimization, and floating-point math uses SSE2 by default instead of the older x87 unit, which should mean better FP performance.
Having previously worked with 32/64 portable software in the unix world, I was surprised at how relatively painless the switch looks like it is going to be. Plain int doesn’t change at all; only such old chestnuts as size_t and ptrdiff_t (and, on LP64 platforms such as OS X and Linux, long) grow to reflect the new address space.
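A quick way to see which types actually change width on a given build is to ask for their sizes. This is a minimal sketch using Python's standard `ctypes` module, not anything from our server code: the `type_sizes` helper is a made-up name, and since `ctypes` has no `ptrdiff_t`, `c_ssize_t` stands in for it here on the assumption that the two have the same width on the platform in question. The numbers describe the Python interpreter you run it with, so a 32-bit and a 64-bit interpreter will give different answers.

```python
import ctypes


def type_sizes():
    """Return the width in bits of a few C types for this interpreter build."""
    candidates = {
        "int": ctypes.c_int,
        "long": ctypes.c_long,
        "long long": ctypes.c_longlong,
        "size_t": ctypes.c_size_t,
        # Stand-in for ptrdiff_t, which ctypes does not expose directly.
        "ssize_t": ctypes.c_ssize_t,
        "void *": ctypes.c_void_p,
    }
    return {name: ctypes.sizeof(ctype) * 8 for name, ctype in candidates.items()}


sizes = type_sizes()
for name, bits in sizes.items():
    print(f"{name:<10} {bits} bits")
```

Run under both a 32-bit and a 64-bit build, the rows that differ are the ones worth auditing in the code: anything that stuffs a pointer or a size into a type that stayed at 32 bits.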
To test all of this, we have ordered some 64-bit machines to put into our clusters. We are going to compare offerings from both Intel and AMD, single and dual core. We expect to be able to start testing this seriously on Alive in a few weeks' time.