snax

ruby performance

is twitter still the biggest rails site?

It looks pretty small:

Lots of people claim they are bigger, like Hulu and Scribd. And others have published big traffic numbers: Yellowpages and Friends for Sale.

maths

Let's break it down now, so to speak. First, we're not interested in rank, otherwise known as unique users; we're interested in pageviews. Can we get pageview data out of Compete? We can. First, take the visits per month:

Now multiply them by the pages per visit:

Yellowpages: 22957550 * 7.5 = 172M
Twitter: 8647381 * 8.3 = 71M
Revolution Health: 11465530 * 3.9 = 45M
Hulu: 2532132 * 7.6 = 19M
Scribd: 3100697 * 2.9 = 9M
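
For anyone who wants to redo the multiplication, here is a minimal sketch using the visit counts and pages-per-visit figures listed above. The names `compete_april` and `monthly_pageviews` are made up for this example. Results can differ from the listed totals by about 1M, presumably because the pages-per-visit values shown are rounded.

```python
# Monthly visits and pages per visit for April 2008, as listed in the post.
compete_april = {
    "Yellowpages": (22957550, 7.5),
    "Twitter": (8647381, 8.3),
    "Revolution Health": (11465530, 3.9),
    "Hulu": (2532132, 7.6),
    "Scribd": (3100697, 2.9),
}


def monthly_pageviews(visits, pages_per_visit):
    """Estimate monthly pageviews as visits times pages per visit."""
    return visits * pages_per_visit


for site, (visits, pages_per_visit) in compete_april.items():
    millions = monthly_pageviews(visits, pages_per_visit) / 1e6
    print(f"{site}: {millions:.1f}M")
```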

The Yellowpages number is exactly what their own developers reported, which is convenient.

However, the Scribd number seems small. Their homepage says "Over 17 million people a month". It would be great if someone from Scribd could explain what that means, but if we generously assume it's visits, not pages, we can update their number to 49M. We will also add the Facebook app Friends for Sale, since their developers are on record projecting 300M pages for May:

Scribd: 17000000 * 2.9 = 49M
Friends for Sale: n/a, 300M

api traffic

Unfortunately, we're not done. Both Twitter and Scribd have APIs, and that traffic isn't covered by Compete data.

There are a couple public sources for Twitter API traffic ratios, although they're both a year old. Blaine said 90%, and Alex said "many times our web traffic". Let's be very conservative and say that maybe 60% of the current traffic comes from the API. That leaves the web pageviews as 40% of the total, so we divide them by 0.4.

I have no idea how much API traffic Scribd turns, so we might as well use the same number, although it's possible they already reported API traffic in the 17M number on their homepage.
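
The API adjustment is just a division by the non-API share of traffic. A rough sketch, where `total_with_api` is a hypothetical helper and the 60% share is the guess from above, not a measured number:

```python
def total_with_api(web_pageviews, api_share):
    """Scale web pageviews up to total requests.

    api_share is the assumed fraction of ALL traffic that comes from the API,
    so the web pageviews make up the remaining (1 - api_share).
    """
    if not 0 <= api_share < 1:
        raise ValueError("api_share must be at least 0 and less than 1")
    return web_pageviews / (1 - api_share)


# Twitter's April web pageviews from Compete, with the assumed 60% API share.
twitter_web = 8647381 * 8.3
print(f"{total_with_api(twitter_web, 0.6) / 1e6:.0f}M")
```

Note how sensitive the result is to the guess: a 90% API share, as Blaine reported a year earlier, would multiply the web number by 10 instead of 2.5.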

final ranking

Now, Friends for Sale had reported projected May traffic, but our Compete numbers are from April. So let's multiply each of our April numbers by Compete's monthly growth rate (sorry Yellowpages):

Friends for Sale: n/a, 300M
Twitter: 8647381 * 8.3 / 0.4 * 1.38 = 248M
Yellowpages: 22957550 * 7.5 * 0.85 = 146M
Scribd: 17000000 * 2.9 / 0.4 * 1.09 = 134M
Revolution Health: 11465530 * 3.9 * 1.06 = 47M
Hulu: 2532132 * 7.6 * 1.06 = 20M
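
The whole estimate fits in a few lines. This is only a sketch of the arithmetic above: the function and variable names are invented for the example, the 0.6 API share is the same guess as before, and Friends for Sale enters as the developers' own 300M projection rather than a Compete number.

```python
# (name, April visits, pages per visit, assumed API share, monthly growth)
sites = [
    ("Twitter", 8647381, 8.3, 0.6, 1.38),
    ("Yellowpages", 22957550, 7.5, 0.0, 0.85),
    ("Scribd", 17000000, 2.9, 0.6, 1.09),
    ("Revolution Health", 11465530, 3.9, 0.0, 1.06),
    ("Hulu", 2532132, 7.6, 0.0, 1.06),
]


def projected_may_pageviews(visits, pages_per_visit, api_share, growth):
    """April web pageviews, scaled up for API traffic, then grown one month."""
    return visits * pages_per_visit / (1 - api_share) * growth


# Friends for Sale is the developers' projection, not derived from Compete.
ranking = [("Friends for Sale", 300e6)]
ranking += [
    (name, projected_may_pageviews(visits, ppv, api_share, growth))
    for name, visits, ppv, api_share, growth in sites
]
ranking.sort(key=lambda row: row[1], reverse=True)

for name, pageviews in ranking:
    print(f"{name}: {pageviews / 1e6:.0f}M")
```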

That is definitely some breakneck growth for Twitter (and even more so for Friends for Sale, considering its age).

So these are our final, wildly inaccurate values:

discussion

It is important to keep in mind how useless this information is. It doesn't even make sense to say "Rails site" or "PHP site", and the rails100 wiki was originally set up as a joke!

For example, Livejournal uses Perl, Memcached, and MySQL, among other things. Does that make it a Perl site, a MySQL site, or a C site? I don't know what Scribd uses, but it's pretty likely that their document pre-renderer is Java or C, not Ruby. Friends for Sale uses Nginx, Rails, Memcached, MySQL, and Linux. Ruby is really just a little piece of the pie.

And Alex himself has said that the recent Twitter problems are system problems, not language problems.

Just in comparison:

Our little sites ain't no thing but a chicken wing on a string.

May 27, 2008

32 comments

kitty says (May 27, 2008):

You could make it a thing on a chicken wing string...

evan says (May 27, 2008):

How did my cat manage to comment on my blog?

meh says (May 27, 2008):

Don't you work for Twitter? How do you not have exact figures for the amount of traffic the API drives?

evan says (May 27, 2008):

I know the real numbers, but we're not publishing new traffic information right now.

Chris says (May 27, 2008):

None of those sites scale.

topfunky says (May 27, 2008):

What I really want to know is, "How does blog.evanweaver.com compare?"

evan says (May 27, 2008):

You two are just taking jobs away from the real internet peanut gallery.

Ikai Lan says (May 27, 2008):

If you are going to RailsConf, please find me. I'm going to blow your mind. =)

Scott Fleckenstein says (May 27, 2008):

Just a note, Bumper Sticker (the Facebook application) is a Rails app, and it is over double the size of Friends for Sale.

evan says (May 27, 2008):

What do you mean by "size"? Facebook gets weird because an app can be installed and present on a user's profile page, but not generate any requests due to the push-oriented, proxied display model.

Scott Fleckenstein says (May 27, 2008):

The last time I talked with my buddy, who was on the team at LinkedIn that wrote Bumper Sticker, he mentioned 18 million page views per day reported in their Google Analytics account.

This was around the time that Facebook had it listed with 900,000 daily users. It is now at 1.3 million daily users.

evan says (May 27, 2008):

Nice; that makes them apparently the highest-trafficked deployment yet. Has the team made any presentations about it anywhere?

Siqi Chen says (May 27, 2008):

Evan: I heard there might be something about a certain super secret Facebook app owned by a certain large company announcing that they are the largest Rails deployment in the world at a certain conference about Rails which is happening this week. Certainly.

evan says (May 27, 2008):

Siqi, is it you? Serious Business is pretty much a humongous firm.

I see that a LinkedIn dude will be at the scaling panel, though, and this mysterious Ikai Lan is listed as a Bumper Sticker developer.

Scott's numbers would put them at around 750M pages for May. But maybe it's all a ruse...

Dan Yoder says (May 28, 2008):

I should point out that at YPC, we actually push quite a bit of data to partners via APIs that isn't counted as traffic. We don't publish those numbers and, in fact, I have no idea what they are, but John Straw (the YPC architect) is giving a talk at RailsConf (it is focused around the issues involved in migrating a major application to Rails, but does talk to some of the scalability challenges we faced).

Ikai Lan says (May 28, 2008):

Siqi, are you going to RailsConf? We never had that lunch =P

evan says (May 28, 2008):

Dan, I'd love to hear anything you or John have to say about those APIs.

Siqi Chen says (May 28, 2008):

It's the real me.

I'm the only engineer at our company not going to RailsConf, but be sure to watch out for the guys in the awesome black tees. Would love to have you guys exchange war stories.

evan says (May 28, 2008):

I meant, is it your company making the announcement. But it's good to know you're so authentic.

Siqi Chen says (May 28, 2008):

Evan, it is definitely not my company.

evan says (May 28, 2008):

It was a joke...

John Straw says (May 28, 2008):

Well, right now none of our API traffic is being handled directly by Rails -- our API is a Java application. That may change over the course of the year, but it's appropriate to exclude that traffic now.

But I don't know exactly how you would go about counting API traffic as "pageviews", anyway. Does one twitter API request equal one "page"? Aren't they pretty small? Should that matter?

Likewise, if we push a ton more bandwidth through our Rails application than Friends for Sale (hi, Alex!) -- and we do, since we're serving actual full pages -- does that make us bigger? Is there some scaling factor which says that since one of our 170M pages is four times bigger (or whatever) than one of their 300M pages, we get a multiplier?

When you get down to that level, this contest becomes pretty boring, I think. Should we all win, like in t-ball?

I am pretty interested in what the peaks are, rather than monthly averages. I know that we don't come anywhere close to the rumored 12k req/sec handled by twitter. Is that number true? If so, how much of that hits Rails? We peak at around 1600 req/sec hitting our servers (not counting what goes to Akamai), but not all of that hits Rails.

tlj says (May 29, 2008):

Doesn't Compete just estimate US traffic?

I'm sure Scribd has a lot of international users that Compete doesn't count, while Yellowpages.com has only US users.

My site with almost no US users has 1.2m visits/month, while Compete tells me it has 2k visits/month. That's a pretty big difference.

If you look at Alexa, Scribd.com has more page views than Twitter. For all its problems, I still think Alexa gives a more accurate global picture than Compete.

evan says (May 29, 2008):

Yes! We all win, like in T-ball. It's not a fight. It's a cooperation.

Right now, every request on Twitter except for XMPP outbound hits Rails. This is extremely not optimal.

Honestly, I don't know if the 12k quote is accurate. I don't think it is. I'll try to find out.

Tlj, I think you're right about US traffic—I always try to check Mixi.jp on Compete and it shows practically no traffic, which is ridiculous. I missed that in my analysis.

Alex Le says (May 31, 2008):

Evan, it was great to meet you at the ENTP offices at RailsConf.

I agree, I think everyone wins in this situation. Having personally been a part of the YPC and Friends for Sale projects, I can attest to the lack of work that goes into making Rails scale, and really, the amount of work that goes into making everything else in your architecture scale.

And @John, I just want to dispel any myth that we aren't serving full pages. The widget portions of our application are cached and served by Facebook, and aren't included in counts of our pageviews. The real meat of our application lives in a canvas page within Facebook where requests are proxied to our servers where we serve as much content as any other website out there (a full page of markup, CSS, javascript and all the normal assets).

Alexey Kovyrin says (May 31, 2008):

At this moment according to Google Analytics we (scribd.com) have about 60M Pageviews during this month (May, 2008).

TJ says (June 02, 2008):

While we're at it, I might as well mention that Warbook got ~126 M pageviews for the month of April. That kinda puts it at 5th place on your list.

Brandon Keene says (July 10, 2008):

Hey guys,

Insider Pages (http://www.insiderpages.com) is built on Rails and usually serves roughly 30M pageviews a month.

I think there are a lot of other sites out there (as others have posted) that do comparable traffic to "the big Rails sites" but do so under the radar.

I've found at this level of traffic, the other operational and architectural stuff is more important than "Rails scaling" as it is popularly discussed. People like EngineYard are making a killing by throwing hardware (read: money) at the problem when the same problem can be solved with far fewer machines with smarter caching, edge (Akamai, etc.) technology, and architecture tuning.

"Rails scaling" isn't as exotic a problem as consultants would have you believe. There were big websites before Rails, and everyone should look to tried and true solutions. Sure, things like Rubinius, Thin, etc. are great, but I think the Rails community needs to talk more about other technologies (like ESI, for example) rather than relying on hyper-tuning code that should in many cases never be run if your cache strategy is good.

Dan says (July 10, 2008):

ESI == Edge Side Includes. It allows you to break a page into fragments that each have their own caching properties. The fragments are reassembled as close to the edge as possible, usually by a CDN (Content Delivery Network) like Akamai or by ESI aware accelerators sitting in front of your app servers, like Varnish.

For example you can specify the static fragments to be cached for longer periods of time, and for dynamic fragments to be cached for short periods, or not at all. When there's a cache hit, the app only has to generate the fragments that the cache considers stale, which will usually just be the fragments that are completely dynamically generated.
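
To make the idea concrete, here is a toy sketch of per-fragment caching and reassembly. It is not how Akamai or Varnish implement ESI; it only mimics the `<esi:include src="..."/>` tag. `FragmentCache`, `assemble`, the TTL table, and the fragment paths are all made up for this example.

```python
import re
import time

# Matches tags of the form <esi:include src="/some/fragment"/>.
ESI_INCLUDE = re.compile(r'<esi:include src="([^"]+)"\s*/>')


class FragmentCache:
    """Caches rendered fragments, each with its own time-to-live in seconds."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # src -> (expires_at, body)

    def get(self, src, ttl, render):
        now = self._clock()
        entry = self._entries.get(src)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: the app is not asked to render
        body = render(src)  # stale or missing: ask the app for this fragment only
        if ttl > 0:  # ttl of 0 means "never cache" (fully dynamic fragment)
            self._entries[src] = (now + ttl, body)
        return body


def assemble(page, cache, ttls, render):
    """Replace every include tag in the page with its (possibly cached) fragment."""
    def replace(match):
        src = match.group(1)
        return cache.get(src, ttls.get(src, 0), render)
    return ESI_INCLUDE.sub(replace, page)


# Hypothetical page: a long-lived header and a fully dynamic feed.
page = '<esi:include src="/header"/><esi:include src="/feed"/>'
ttls = {"/header": 3600, "/feed": 0}
rendered = []  # records which fragments the "app" had to generate


def render(src):
    rendered.append(src)
    return f"<div>{src}</div>"


cache = FragmentCache()
first = assemble(page, cache, ttls, render)
second = assemble(page, cache, ttls, render)
```

After the two calls, `rendered` holds `/header` once and `/feed` twice: the second request only regenerates the dynamic fragment.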

Jay says (July 11, 2008):

I work for a site that does in excess of 200 million dynamic pageviews per month on Rails. We have 7 very-cheap application servers and one database server. We scale.

I'd rather not make a big fuss about it though.

ed hickey says (July 12, 2008):

We were doing around 20-30 million requests a day to our XML ad service. There was no caching except what little MySQL was doing and all requests had to be served in under 1 second. We had MySQL Cluster and (up to) 25 servers backing it all up though. In the end we rewrote that part of the application in PHP because of CPU usage (and a few other factors). In my research, most issues we had were more related to ruby than to rails.

Kind of a different beast than a user-facing HTML website, but I figured I'd throw it out there.
