<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE feed [
  <!ENTITY nbsp " ">
  <!ENTITY mdash "—">
  <!ENTITY eacute "é">
]>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title type="text">Snax</title>
  <link href="http://blog.evanweaver.com/xml/atom.xml" rel="self" />
  <link href="http://blog.evanweaver.com/" />
  
  <updated>2008-08-06T11:10:52Z</updated>
  <author>
    <name>Evan Weaver</name>
    <email>snax@evanweaver.com</email>
  </author>
  <id>http://blog.evanweaver.com/</id>
  <subtitle type="html">Ruby performance.</subtitle>
  <xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
<entry>
  <title>if you're going to san francisco</title>
  <link href="http://blog.evanweaver.com/articles/2008/08/05/if-youre-going-to-san-francisco" />
  <id>http://blog.evanweaver.com/articles/2008/08/05/if-youre-going-to-san-francisco</id>
  
  <updated>2008-08-06T10:59:40Z</updated>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
      <p>As some people already know, I'm moving to San Francisco in early September. One thing I had trouble finding was a map of the San Francisco microclimates. But here's one:</p>
<p><img src="http://blog.evanweaver.com/files/microclimates.png" /></p>
<p>It's from <a href="http://www.amazon.com/Golden-Gate-Gardening-Year-Round-California/dp/157061136X">this book</a>.</p>
<p>I'll be living on the border of North Beach and Russian Hill (which is in the warm zone, 6). No place in San Francisco is really "hot", despite what that map says. But the patterns of fog and relative coolness are correct.</p>
<h2>postscript</h2>
<p>Sorry that my blogging and open-source output has been light; I'm recovering from serious RSI problems. The fact that you see this post at all is a sign that things are getting better, though.</p>

      
    </div>
  </content>
<feedburner:awareness xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">http://api.feedburner.com/awareness/1.0/GetItemData?uri=snax&amp;itemurl=http%3A%2F%2Fblog.evanweaver.com%2Farticles%2F2008%2F08%2F05%2Fif-youre-going-to-san-francisco</feedburner:awareness></entry>

<entry>
  <title>a statement</title>
  <link href="http://blog.evanweaver.com/articles/2008/07/10/a-statement" />
  <id>http://blog.evanweaver.com/articles/2008/07/10/a-statement</id>
  
  <updated>2008-07-10T10:10:24Z</updated>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
      <p>I just want to go on record saying that none of Twitter's problems 
have ever been Rails problems.</p>
<h2>postscript</h2>
<p>Brandon Keene is <a href="http://blog.evanweaver.com/articles/2008/05/27/is-twitter-still-the-biggest-rails-site/#comment-029">right on</a>.</p>

      
    </div>
  </content>
<feedburner:awareness xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">http://api.feedburner.com/awareness/1.0/GetItemData?uri=snax&amp;itemurl=http%3A%2F%2Fblog.evanweaver.com%2Farticles%2F2008%2F07%2F10%2Fa-statement</feedburner:awareness></entry>

<entry>
  <title>echoe 3</title>
  <link href="http://blog.evanweaver.com/articles/2008/06/22/echoe-3" />
  <id>http://blog.evanweaver.com/articles/2008/06/22/echoe-3</id>
  
  <updated>2008-06-22T18:02:54Z</updated>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
      <p>Echoe 3 is out, right on the heels of RubyGems 1.2. It supports the
new runtime vs. development dependencies, and works correctly with the
RubyForge 1.0.0 gem.</p>
<p>It still supports all the usual features
like certificate chains, RDoc upload, changeset parsing, manifest
building, and cross-packaging. Documentation is <a href="http://blog.evanweaver.com/files/doc/fauna/echoe/">here</a>.</p>
<p>By the way, RubyGems 1.2 seems pretty great.</p>

      
    </div>
  </content>
<feedburner:awareness xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">http://api.feedburner.com/awareness/1.0/GetItemData?uri=snax&amp;itemurl=http%3A%2F%2Fblog.evanweaver.com%2Farticles%2F2008%2F06%2F22%2Fechoe-3</feedburner:awareness></entry>

<entry>
  <title>fauna projects on github</title>
  <link href="http://blog.evanweaver.com/articles/2008/06/13/fauna-projects-on-github" />
  <id>http://blog.evanweaver.com/articles/2008/06/13/fauna-projects-on-github</id>
  
  <updated>2008-06-13T12:23:04Z</updated>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
      <p>Fauna projects are now on <a href="http://github.com/fauna">GitHub</a>. Go to it.</p><p>The Subversion repos are going away, 
but the gems and forums will remain on <a href="http://rubyforge.org/projects/fauna/">RubyForge</a>.</p>
<p>You can also follow Snax/Fauna activity on <a href="http://twitter.com/_snax">Twitter</a>.</p>

      
    </div>
  </content>
<feedburner:awareness xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">http://api.feedburner.com/awareness/1.0/GetItemData?uri=snax&amp;itemurl=http%3A%2F%2Fblog.evanweaver.com%2Farticles%2F2008%2F06%2F13%2Ffauna-projects-on-github</feedburner:awareness></entry>

<entry>
  <title>is twitter still the biggest rails site?</title>
  <link href="http://blog.evanweaver.com/articles/2008/05/27/is-twitter-still-the-biggest-rails-site" />
  <id>http://blog.evanweaver.com/articles/2008/05/27/is-twitter-still-the-biggest-rails-site</id>
  
  <updated>2008-05-27T12:56:00Z</updated>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
      <p>It looks pretty small:</p>
<p><a href="http://siteanalytics.compete.com/yellowpages.com+revolutionhealth.com+scribd.com+twitter.com+hulu.com?metric=uv"><img src="http://media.compete.com/yellowpages.com+revolutionhealth.com+scribd.com+twitter.com+hulu.com_uv_460.png" /></a></p>
<p>Lots of people claim they are bigger, like <a href="http://gilesbowkett.blogspot.com/2008/05/twitters-scaling-problems.html">Hulu</a> and <a href="http://www.alleyinsider.com/2008/5/why_can_t_twitter_scale_blaine_cook_tries_to_explain#comment-4828905814b9b9ea009b79f9">Scribd</a>. And others have published big traffic numbers: <a href="http://www.buildingwebapps.com/articles/13-can-rails-scale-absolutely#comment_list">Yellowpages</a> and <a href="http://highscalability.com/friends-sale-architecture-300-million-page-view-month-facebook-ror-app">Friends for Sale</a>.</p>

<h2>maths</h2>

<p>Let's break it down now, so to speak. We're not interested in rank by unique visitors, which is what the chart above shows; we're interested in pageviews. Can we get pageview data out of Compete? We can. First, take the visits per month:</p>
<p><a href="http://siteanalytics.compete.com/yellowpages.com+revolutionhealth.com+scribd.com+twitter.com+hulu.com?metric=sess"><img src="http://media.compete.com/yellowpages.com+revolutionhealth.com+scribd.com+twitter.com+hulu.com_sess_460.png" /></a></p>

<p>Now multiply them by the pages per visit:</p>
<p><a href="http://siteanalytics.compete.com/yellowpages.com+revolutionhealth.com+scribd.com+twitter.com+hulu.com?metric=ppv"><img src="http://media.compete.com/yellowpages.com+revolutionhealth.com+scribd.com+twitter.com+hulu.com_ppv_460.png" /></a></p>

<table>
<tr><th>Site</th><th>Visits * pages per visit</th><th>Pageviews per month</th></tr>
<tr><td>Yellowpages:</td><td>22957550 * 7.5</td><td>172M</td></tr>
<tr><td>Twitter:</td><td>8647381 * 8.3</td><td>72M</td></tr>
<tr><td>Revolution Health:</td><td>11465530 * 3.9</td><td>45M</td></tr>
<tr><td>Hulu:</td><td>2532132 * 7.6</td><td>19M</td></tr>
<tr><td>Scribd:</td><td>3100697 * 2.9</td><td>9M</td></tr>
</table>
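<p>If you want to check the arithmetic, here is a minimal Python sketch of the same calculation. The inputs are the April Compete figures from the table above; the names <code>compete_april</code> and <code>monthly_pageviews</code> are made up for illustration and are not part of any Compete API.</p>
<pre>
```python
# Estimated monthly pageviews = visits per month * pages per visit.
# Inputs are the April Compete figures quoted in the table above.
# All names here are hypothetical, chosen only for this illustration.
compete_april = {
    "Yellowpages": (22957550, 7.5),
    "Twitter": (8647381, 8.3),
    "Revolution Health": (11465530, 3.9),
    "Hulu": (2532132, 7.6),
    "Scribd": (3100697, 2.9),
}


def monthly_pageviews(visits, pages_per_visit):
    """Return the estimated pageviews for one month."""
    return visits * pages_per_visit


pageviews = {
    site: monthly_pageviews(visits, pages)
    for site, (visits, pages) in compete_april.items()
}

# Print the sites from largest to smallest, in millions of pageviews.
for site, views in sorted(pageviews.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{site}: {views / 1e6:.1f}M")
```
</pre>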

<p>The Yellowpages number is exactly what their own developers <a href="http://www.buildingwebapps.com/articles/13-can-rails-scale-absolutely#comment_list">reported</a>, which is convenient.</p>

<p>However, the Scribd number seems small. Their <a href="http://www.scribd.com/">homepage</a> says "Over 17 million people a month". It would be great if someone from Scribd could explain what that means, but if we generously assume it's visits, not pages, we can update their number to 49M. We will also add the Facebook app Friends for Sale, since their developers are <a href="http://highscalability.com/friends-sale-architecture-300-million-page-view-month-facebook-ror-app">on record</a> projecting 300M pages for May:</p>

<table>
<tr><th>Site</th><th>Visits * pages per visit</th><th>Pageviews per month</th></tr>
<tr><td>Scribd:</td><td>17000000 * 2.9</td><td>49M</td></tr>
<tr><td>Friends for Sale:</td><td>n/a</td><td>300M</td></tr>
</table>

<h2>api traffic</h2>

<p>Unfortunately, we're not done. Both Twitter and Scribd have APIs, and that traffic isn't covered by Compete data.</p>
<p>There are a couple of public sources for Twitter API traffic ratios, although they're both a year old. <a href="http://www.slideshare.net/Blaine/scaling-twitter">Blaine said 90%</a>, and <a href="http://groups.google.com/group/twitter-development-talk/browse_thread/thread/6e045bcb6bd89877">Alex said "many times our web traffic"</a>. Let's be very conservative and say that maybe 60% of the current traffic comes from the API. That makes the web pageviews Compete measures only 40% of the total, so we divide the web number by 0.4.</p>

<p>I have no idea how much API traffic Scribd handles, so we might as well use the same ratio, although it's possible they already included API traffic in the 17M number on their homepage.</p>

<h2>final ranking</h2>

<p>Now, Friends for Sale reported projected May traffic, but our Compete numbers are from April. So let's apply the API adjustment to Twitter and Scribd, then multiply each of our April numbers by Compete's monthly growth rate (sorry, Yellowpages):</p>

<table>
<tr><th>Site</th><th>April pageviews / web share * monthly growth</th><th>Projected May pageviews</th></tr>
<tr><td>Friends for Sale:</td><td>n/a</td><td>300M</td></tr>
<tr><td>Twitter:</td><td>8647381 * 8.3 / 0.4 * 1.38</td><td>248M</td></tr>
<tr><td>Yellowpages:</td><td>22957550 * 7.5 * 0.85</td><td>146M</td></tr>
<tr><td>Scribd:</td><td>17000000 * 2.9 / 0.4 * 1.09</td><td>134M</td></tr>
<tr><td>Revolution Health:</td><td>11465530 * 3.9 * 1.06</td><td>47M</td></tr>
<tr><td>Hulu:</td><td>2532132 * 7.6 * 1.06</td><td>20M</td></tr>
</table>
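<p>The same adjustments can be sketched in a few lines of Python. This is only an illustration under the assumptions already stated: a 60% API share for Twitter and Scribd, and the monthly growth factors exactly as they appear in the table. The helper <code>projected_total</code> and its parameter names are hypothetical.</p>
<pre>
```python
def projected_total(visits, pages_per_visit, growth, api_share=0.0):
    """Scale April web pageviews to a projected May total (hypothetical helper).

    api_share is the assumed fraction of all traffic that comes from the API.
    growth is the month-over-month factor taken from the table.
    """
    web_pageviews = visits * pages_per_visit
    # If api_share of all traffic is API calls, web traffic is the remaining
    # (1 - api_share), so total = web / (1 - api_share).
    total_april = web_pageviews / (1.0 - api_share)
    return total_april * growth


estimates = {
    # The developers' own May projection, used as-is.
    "Friends for Sale": 300e6,
    "Twitter": projected_total(8647381, 8.3, 1.38, api_share=0.6),
    "Yellowpages": projected_total(22957550, 7.5, 0.85),
    "Scribd": projected_total(17000000, 2.9, 1.09, api_share=0.6),
    "Revolution Health": projected_total(11465530, 3.9, 1.06),
    "Hulu": projected_total(2532132, 7.6, 1.06),
}

# Rank the sites from largest to smallest projected monthly pageviews.
ranking = sorted(estimates.items(), key=lambda kv: kv[1], reverse=True)
for site, views in ranking:
    print(f"{site}: {views / 1e6:.0f}M")
```
</pre>
<p>Changing <code>api_share</code> is the quickest way to see how sensitive the ranking is to that one guess.</p>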

<p>That is definitely some breakneck growth for Twitter (and even more so for Friends for Sale, considering its age).</p>

<p>So these are our final, wildly inaccurate values:</p>

<p><img src="http://blog.evanweaver.com/files/traffic_graph.png" /></p>

<h2>discussion</h2>

<p>It is important to keep in mind how useless this information is. It doesn't even make sense to say "Rails site" or "PHP site", and the <a href="http://rails100.pbwiki.com/">rails100 wiki</a> was originally set up as a joke!</p>
<p>For example, LiveJournal uses Perl, Memcached, and MySQL, among other things. Does that make it a Perl site, a MySQL site, or (since Memcached is written in C) a C site? I don't know what Scribd uses, but it's pretty likely that their document pre-renderer is Java or C, not Ruby. Friends for Sale uses Nginx, Rails, Memcached, MySQL, and Linux. Ruby is really just a little piece of the pie.</p>
<p>And Alex himself has said that the recent Twitter problems are <a href="http://dev.twitter.com/2008/05/twittering-about-architecture.html">system problems</a>, not language problems.</p>

<p>Just in comparison:</p>

<p><a href="http://siteanalytics.compete.com/google.com+facebook.com+wikipedia.org+livejournal.com?metric=sess"><img src="http://media.compete.com/google.com+facebook.com+wikipedia.org+livejournal.com_sess_460.png" /></a></p>

<p>Our little sites ain't no thing but a chicken wing on a string.</p>

    </div>
  </content>
<feedburner:awareness xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">http://api.feedburner.com/awareness/1.0/GetItemData?uri=snax&amp;itemurl=http%3A%2F%2Fblog.evanweaver.com%2Farticles%2F2008%2F05%2F27%2Fis-twitter-still-the-biggest-rails-site</feedburner:awareness></entry>

<entry>
  <title>xapian search plugin</title>
  <link href="http://blog.evanweaver.com/articles/2008/05/26/xapian-search-plugin" />
  <id>http://blog.evanweaver.com/articles/2008/05/26/xapian-search-plugin</id>
  
  <updated>2008-05-26T21:29:39Z</updated>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
      <p>Francis Irving sent me a note about his work on a new Rails search plugin, <a href="http://github.com/frabcus/acts_as_xapian/tree/master">acts_as_xapian</a>. It uses the <a href="http://xapian.org/">Xapian</a> engine, which is a C++ indexer similar to Lucene. A particularly neat feature is built-in spellcheck.</p>
<p>I still plan to benchmark all these plugins on the Wikipedia dataset, but it's been delayed by the new job. If anyone has a big piece of iron I could use for a couple of weeks, I would appreciate it (16GB RAM, hundreds of GB of free disk space, no production load).</p>

      
    </div>
  </content>
<feedburner:awareness xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">http://api.feedburner.com/awareness/1.0/GetItemData?uri=snax&amp;itemurl=http%3A%2F%2Fblog.evanweaver.com%2Farticles%2F2008%2F05%2F26%2Fxapian-search-plugin</feedburner:awareness></entry>

 
<feedburner:awareness xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">http://api.feedburner.com/awareness/1.0/GetFeedData?uri=snax</feedburner:awareness></feed>
