Blog about web scalability and performance by one of the lead engineers at Twitter.
January 10, 2012 – 1:34 PM
My Xbox 360 broke, and since my new one supported HDMI, I reworked the connection to the TV (a Samsung PN50A450 plasma). It’s tricky to get the best performance out of the combination, so I wanted to mention it here. Scalers: Even though the HDMI connection is digital, both the Xbox and the TV have [...]
September 23, 2011 – 3:22 AM
Thanks to Evan Phoenix, memcached.gem 1.3.2 is compatible with Rubinius again. I have added Rubinius to the release QA, so it will stay this way. The master branch is compatible with JRuby, but a JRuby segfault (as well as a mkmf bug) prevents it from working for most people. VM comparison: Memcached.gem makes an unusual [...]
Maximizing simplicity is the only guaranteed way to minimize software maintenance. Other techniques exist, but are situational. No complex system will be cheaper to maintain than a simple one that meets the same goals. ‘Simple’, pedantically, means ‘not composed of parts’. However! Whatever system you are working on may already be a part of a whole. [...]
April 27, 2011 – 12:30 PM
A few weeks ago I gave a performance engineering talk at QCon Beijing/Tokyo. The abstract and slides are below. Abstract: Twitter has undergone exponential growth with very limited staff, hardware, and time. This talk discusses principles by which the wise performance engineer can make dramatic improvements in a constrained environment. Of course, these apply to [...]
August 12, 2010 – 12:00 AM
Well, it’s been a long time. But! I have five papers to add to my original distributed systems primer. Coordination: “CRDTs: Consistency Without Concurrency Control”, Mihai Letia, Nuno Preguiça, and Marc Shapiro, 2009. Guaranteeing eventual consistency by constraining your data structure, rather than adding heavyweight distributed algorithms. FlockDB works this way. Partitioning: The Little Engines [...]
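To make "constraining your data structure" concrete, here is a minimal sketch of a grow-only counter (G-Counter), one of the simplest CRDTs. It is an illustration only: it is not the structure described in the paper above and not FlockDB's implementation, and the class name `GCounter` and its methods are hypothetical. Each replica increments only its own slot, and merging takes the per-node maximum, so replicas converge no matter what order merges happen in.

```ruby
# Hypothetical G-Counter sketch (assumes Ruby 2.4+ for Array#sum).
class GCounter
  attr_reader :counts

  def initialize(node_id, counts = {})
    @node_id = node_id
    @counts = Hash.new(0)   # unseen nodes count as zero
    @counts.update(counts)
  end

  # A replica may only increase its own slot.
  def increment(by = 1)
    raise ArgumentError, "a G-Counter can only grow" if by < 0
    @counts[@node_id] += by
    self
  end

  # The counter's value is the sum of every replica's slot.
  def value
    @counts.values.sum
  end

  # Merge is commutative, associative, and idempotent:
  # take the larger count for each node.
  def merge(other)
    merged = @counts.merge(other.counts) { |_node, a, b| [a, b].max }
    GCounter.new(@node_id, merged)
  end
end

a = GCounter.new(:a).increment(3)
b = GCounter.new(:b).increment(2)
puts a.merge(b).value  # 5
puts b.merge(a).value  # 5
```

Because merging is just a per-key maximum, no locking or consensus round is needed; the cost is that the structure can only express operations that commute, which is why a plain G-Counter cannot decrement.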
October 21, 2009 – 12:00 AM
How many objects does a Rails request allocate? Here are Twitter’s numbers. API: 22,700 objects per request; website: 67,500 objects per request; daemons: 27,900 objects per action. I want them to be lower. Overall, we burn 20% of our front-end CPU on garbage collection, which seems high. Each process handles ~29,000 requests before getting killed [...]
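For readers who want a rough per-block allocation count on a current Ruby, the sketch below reads the cumulative counter that `GC.stat` exposes. This is an assumption-laden illustration, not how the numbers above were gathered: `GC.stat(:total_allocated_objects)` needs Ruby 2.2 or later, which postdates the interpreter discussed in this post, and the helper name `allocations_during` is made up.

```ruby
# Hypothetical helper: how many objects were allocated while the block ran?
# The counter is cumulative, so GC runs in between do not reduce it.
def allocations_during
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Each iteration builds a new 10-character string.
count = allocations_during { 1_000.times { "x" * 10 } }
puts "allocated roughly #{count} objects"
```

Wrapping a single request handler in such a block gives a per-request figure comparable in spirit to the ones quoted, though the exact count varies by Ruby version.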
September 30, 2009 – 12:00 AM
I’ve released Scribe 0.1, a Ruby client for the Scribe remote log server.
    sudo gem install scribe
Usage is simple:
    client = Scribe.new
    client.log("I'm lonely in a crowded room.", "Rails")
Documentation is here. About Scribe: The primary benefit of Scribe over something like syslog-ng is increased scalability, because of Scribe’s fundamentally distributed architecture. Scribe also [...]
September 24, 2009 – 12:00 AM
We recently migrated Twitter from a custom Ruby 1.8.6 build to a Ruby Enterprise Edition release candidate, courtesy of Phusion. Our primary motivation was the integration of Brent’s MBARI patches, which increase memory stability. Some features of REE have no effect on our codebase, but we definitely benefit from the MBARI patchset, the Railsbench tunable [...]
August 4, 2009 – 12:00 AM
One of the hardest gems to install is no more. It’s now easy to install! Memcached 0.15 features: update to libmemcached 0.31.1; bundle libmemcached itself with the gem (antifuchs); UDP connection support; Unix domain socket support (hellvinz); AUTO_EJECT_HOSTS bugfixes (mattknox). Install with gem install memcached. Since libmemcached is bundled in, there are no longer any [...]
Cassandra is a hybrid non-relational database in the same class as Google’s BigTable. It is more featureful than a key/value store like Riak, but supports fewer query types than a document store like MongoDB. Cassandra was started by Facebook and later transferred to the open-source community. It is an ideal runtime database for web-scale domains [...]