<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments for snax</title>
	<atom:link href="http://blog.evanweaver.com/comments/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.evanweaver.com</link>
	<description>on software</description>
	<lastBuildDate>Fri, 28 Oct 2011 05:59:55 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>Comment on memcached gem performance across VMs by tobsch</title>
		<link>http://blog.evanweaver.com/2011/09/23/memcached-gem-performance-across-vms/#comment-8570</link>
		<dc:creator><![CDATA[tobsch]]></dc:creator>
		<pubDate>Fri, 28 Oct 2011 05:59:55 +0000</pubDate>
		<guid isPermaLink="false">http://blog.evanweaver.com/?p=1871#comment-8570</guid>
		<description><![CDATA[I had an email conversation with Charles and he decided to implement a JRuby extension for memcached. Find it over &lt;a href=&quot;https://github.com/headius/jruby-spymemcached&quot; rel=&quot;nofollow&quot;&gt;here&lt;/a&gt;.

So far, I haven&#039;t had the time to test it intensively. It helped me understand some JRuby internals, though. Maybe you have the time to run the benchmark again?
]]></description>
		<content:encoded><![CDATA[<p>I had an email conversation with Charles and he decided to implement a JRuby extension for memcached. Find it over <a href="https://github.com/headius/jruby-spymemcached" rel="nofollow">here</a>.</p>
<p>So far, I haven&#8217;t had the time to test it intensively. It helped me understand some JRuby internals, though. Maybe you have the time to run the benchmark again?</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on memcached gem performance across VMs by evan</title>
		<link>http://blog.evanweaver.com/2011/09/23/memcached-gem-performance-across-vms/#comment-8483</link>
		<dc:creator><![CDATA[evan]]></dc:creator>
		<pubDate>Sun, 16 Oct 2011 17:48:29 +0000</pubDate>
		<guid isPermaLink="false">http://blog.evanweaver.com/?p=1871#comment-8483</guid>
		<description><![CDATA[Those tests are against remote servers; mine are against localhost. So not comparable unfortunately.

You&#039;re right, though, my results are around 20k ops per second.]]></description>
		<content:encoded><![CDATA[<p>Those tests are against remote servers; mine are against localhost. So not comparable unfortunately.</p>
<p>You&#8217;re right, though, my results are around 20k ops per second.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on memcached gem performance across VMs by tobsch</title>
		<link>http://blog.evanweaver.com/2011/09/23/memcached-gem-performance-across-vms/#comment-8481</link>
		<dc:creator><![CDATA[tobsch]]></dc:creator>
		<pubDate>Sun, 16 Oct 2011 11:59:07 +0000</pubDate>
		<guid isPermaLink="false">http://blog.evanweaver.com/?p=1871#comment-8481</guid>
		<description><![CDATA[I talked to Charles about this issue too, and we&#039;ll see better times for calling Java code in JRuby 1.7. However, this does not seem to be the problem here:

&lt;pre&gt;                                          user     system      total        real
set: fake                             0.165000   0.000000   0.165000 (  0.165000)
set: xmemcached:bin                  13.383000   0.000000  13.383000 ( 13.383000)
get: dalli:bin                       24.084000   0.000000  24.084000 ( 24.084000)

get: dalli:bin                       22.399000   0.000000  22.399000 ( 22.399000)
get: fake                             0.194000   0.000000   0.194000 (  0.194000)
get: xmemcached:bin                  12.918000   0.000000  12.918000 ( 12.918000)
&lt;/pre&gt;

I just added a &quot;fake&quot; client (a Java class accepting the same args as the other clients). We clearly see the overhead (160ms for nothing is not that good), but it won&#039;t help turn the benchmark around.

If we look at the benchmark over &lt;a href=&quot;http://xmemcached.googlecode.com/svn/trunk/benchmark/benchmark.html&quot; rel=&quot;nofollow&quot;&gt;here&lt;/a&gt;, single-threaded mode is always below 10k ops/second, which is rather slow compared to your results using libmemcached, as far as I understand it.

Any ideas?
]]></description>
		<content:encoded><![CDATA[<p>I talked to Charles about this issue too, and we&#8217;ll see better times for calling Java code in JRuby 1.7. However, this does not seem to be the problem here:</p>
<pre>                                          user     system      total        real
set: fake                             0.165000   0.000000   0.165000 (  0.165000)
set: xmemcached:bin                  13.383000   0.000000  13.383000 ( 13.383000)
get: dalli:bin                       24.084000   0.000000  24.084000 ( 24.084000)

get: dalli:bin                       22.399000   0.000000  22.399000 ( 22.399000)
get: fake                             0.194000   0.000000   0.194000 (  0.194000)
get: xmemcached:bin                  12.918000   0.000000  12.918000 ( 12.918000)
</pre>
<p>I just added a &#8220;fake&#8221; client (a Java class accepting the same args as the other clients). We clearly see the overhead (160ms for nothing is not that good), but it won&#8217;t help turn the benchmark around.</p>
<p>If we look at the benchmark over <a href="http://xmemcached.googlecode.com/svn/trunk/benchmark/benchmark.html" rel="nofollow">here</a>, single-threaded mode is always below 10k ops/second, which is rather slow compared to your results using libmemcached, as far as I understand it.</p>
<p>Any ideas?</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on memcached gem performance across VMs by evan</title>
		<link>http://blog.evanweaver.com/2011/09/23/memcached-gem-performance-across-vms/#comment-8468</link>
		<dc:creator><![CDATA[evan]]></dc:creator>
		<pubDate>Sun, 09 Oct 2011 06:59:44 +0000</pubDate>
		<guid isPermaLink="false">http://blog.evanweaver.com/?p=1871#comment-8468</guid>
		<description><![CDATA[Charles Nutter told me that the default JRuby/Java integration is not very fast, so I assume that&#039;s the bulk of the problem.

Maybe you could test the speed of the integration itself with a do-nothing Java stub?]]></description>
		<content:encoded><![CDATA[<p>Charles Nutter told me that the default JRuby/Java integration is not very fast, so I assume that&#8217;s the bulk of the problem.</p>
<p>Maybe you could test the speed of the integration itself with a do-nothing Java stub?</p>
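<p>A minimal sketch of such a test, assuming a hypothetical pure-Ruby <code>FakeClient</code> as a stand-in for the do-nothing Java stub (under JRuby, a Java class with the same methods would be swapped in). The class name, the helper <code>time_client</code> and the loop count are illustrative assumptions, not taken from the gem&#8217;s benchmark script:</p>

```ruby
require 'benchmark'

# Hypothetical do-nothing client: accepts the same calls as a real
# memcached client but performs no I/O at all.
class FakeClient
  def set(key, value)
    nil
  end

  def get(key)
    nil
  end
end

# Times `iterations` set calls and `iterations` get calls on `client`
# and returns the wall-clock seconds for each as a Hash.
def time_client(client, iterations)
  set_time = Benchmark.realtime do
    iterations.times { |i| client.set("key#{i}", 'value') }
  end
  get_time = Benchmark.realtime do
    iterations.times { |i| client.get("key#{i}") }
  end
  { set: set_time, get: get_time }
end

iterations = 50_000 # assumed loop count, chosen only for illustration
result = time_client(FakeClient.new, iterations)
puts format('set: fake %.6f', result[:set])
puts format('get: fake %.6f', result[:get])
```

<p>Whatever time this reports is pure call overhead, so it gives a floor to subtract from the numbers measured with a real client.</p>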
]]></content:encoded>
	</item>
	<item>
		<title>Comment on memcached gem performance across VMs by tobsch</title>
		<link>http://blog.evanweaver.com/2011/09/23/memcached-gem-performance-across-vms/#comment-8454</link>
		<dc:creator><![CDATA[tobsch]]></dc:creator>
		<pubDate>Mon, 03 Oct 2011 11:07:49 +0000</pubDate>
		<guid isPermaLink="false">http://blog.evanweaver.com/?p=1871#comment-8454</guid>
		<description><![CDATA[I just wrote a simple wrapper for &lt;code&gt;xmemcached&lt;/code&gt; (JRuby) and benchmarked it against &lt;code&gt;dalli&lt;/code&gt; (because I had no compile/compatibility issues with &lt;code&gt;dalli&lt;/code&gt;).

Both libs were using the binary protocol and a single server. I did not implement any marshaling. I used the JRuby parameters &lt;code&gt;--server --fast  -J-Xmn512m -J-Xms2048m -J-Xmx2048m&lt;/code&gt;.

The result is somewhat sad:
&lt;pre&gt;
                                          user     system      total        real
set: dalli:bin                       16.387000   0.000000  16.387000 ( 16.387000)
set: xmemcached:bin                  12.662000   0.000000  12.662000 ( 12.661000)

get: dalli:bin                       17.376000   0.000000  17.376000 ( 17.376000)
get: xmemcached:bin                  12.202000   0.000000  12.202000 ( 12.202000)
&lt;/pre&gt;

That&#039;s 4k gets a second for &lt;code&gt;xmemcached&lt;/code&gt;. 

I did some tuning (disabling serialization, TCP nodelay, changing the TCP buffer size) as well, but the changes aren&#039;t significant. I&#039;m a bit disappointed with the single-threaded performance of &lt;code&gt;xmemcached&lt;/code&gt;. With 300 threads, each doing 10k loops, I get the following results:

&lt;pre&gt;
set: dalli:bin                        3.989000   0.000000   3.989000 (  3.989000)
set: xmemcached:bin                   2.537000   0.000000   2.537000 (  2.537000)
&lt;/pre&gt;

It can&#039;t be the Ruby code slowing it all down in this case.
]]></description>
		<content:encoded><![CDATA[<p>I just wrote a simple wrapper for <code>xmemcached</code> (JRuby) and benchmarked it against <code>dalli</code> (because I had no compile/compatibility issues with <code>dalli</code>).</p>
<p>Both libs were using the binary protocol and a single server. I did not implement any marshaling. I used the JRuby parameters <code>--server --fast  -J-Xmn512m -J-Xms2048m -J-Xmx2048m</code>.</p>
<p>The result is somewhat sad:</p>
<pre>
                                          user     system      total        real
set: dalli:bin                       16.387000   0.000000  16.387000 ( 16.387000)
set: xmemcached:bin                  12.662000   0.000000  12.662000 ( 12.661000)

get: dalli:bin                       17.376000   0.000000  17.376000 ( 17.376000)
get: xmemcached:bin                  12.202000   0.000000  12.202000 ( 12.202000)
</pre>
<p>That&#8217;s 4k gets a second for <code>xmemcached</code>. </p>
<p>I did some tuning (disabling serialization, TCP nodelay, changing the TCP buffer size) as well, but the changes aren&#8217;t significant. I&#8217;m a bit disappointed with the single-threaded performance of <code>xmemcached</code>. With 300 threads, each doing 10k loops, I get the following results:</p>
<pre>
set: dalli:bin                        3.989000   0.000000   3.989000 (  3.989000)
set: xmemcached:bin                   2.537000   0.000000   2.537000 (  2.537000)
</pre>
<p>It can&#8217;t be the Ruby code slowing it all down in this case.</p>
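<p>For reference, a rough sketch of how a multi-threaded run like this could be structured. This is not the script that produced the numbers above: <code>InMemoryClient</code> and <code>threaded_set_time</code> are hypothetical names, and a Mutex-guarded Hash stands in for the real network client so the example runs anywhere:</p>

```ruby
require 'benchmark'

# Hypothetical stand-in for a thread-safe memcached client; a
# Mutex-guarded Hash replaces the real network round trip.
class InMemoryClient
  def initialize
    @store = {}
    @lock = Mutex.new
  end

  def set(key, value)
    @lock.synchronize { @store[key] = value }
  end

  def get(key)
    @lock.synchronize { @store[key] }
  end
end

# Starts `thread_count` threads, each performing `loops` set calls,
# waits for all of them and returns the elapsed wall-clock seconds.
def threaded_set_time(client, thread_count, loops)
  Benchmark.realtime do
    threads = Array.new(thread_count) do |t|
      Thread.new do
        loops.times { |i| client.set("t#{t}-k#{i}", i) }
      end
    end
    threads.each { |th| th.join }
  end
end

# Small counts keep the sketch quick; the run described above used
# 300 threads with 10k loops each.
client = InMemoryClient.new
elapsed = threaded_set_time(client, 8, 1_000)
puts format('set: in-memory %.6f', elapsed)
```

<p>Swapping a real client in for the stand-in and raising the counts would approximate the setup described above.</p>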
]]></content:encoded>
	</item>
	<item>
		<title>Comment on memcached gem performance across VMs by evan</title>
		<link>http://blog.evanweaver.com/2011/09/23/memcached-gem-performance-across-vms/#comment-8453</link>
		<dc:creator><![CDATA[evan]]></dc:creator>
		<pubDate>Mon, 03 Oct 2011 01:23:24 +0000</pubDate>
		<guid isPermaLink="false">http://blog.evanweaver.com/?p=1871#comment-8453</guid>
		<description><![CDATA[The benchmark script is included in the gem. See https://github.com/fauna/memcached/blob/master/test/profile/benchmark.rb .]]></description>
		<content:encoded><![CDATA[<p>The benchmark script is included in the gem. See <a href="https://github.com/fauna/memcached/blob/master/test/profile/benchmark.rb" rel="nofollow">https://github.com/fauna/memcached/blob/master/test/profile/benchmark.rb</a> .</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on memcached gem performance across VMs by tobsch</title>
		<link>http://blog.evanweaver.com/2011/09/23/memcached-gem-performance-across-vms/#comment-8440</link>
		<dc:creator><![CDATA[tobsch]]></dc:creator>
		<pubDate>Tue, 27 Sep 2011 05:06:20 +0000</pubDate>
		<guid isPermaLink="false">http://blog.evanweaver.com/?p=1871#comment-8440</guid>
		<description><![CDATA[Could you please post/send the benchmark script you used?
]]></description>
		<content:encoded><![CDATA[<p>Could you please post/send the benchmark script you used?</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on simplicity by catwell</title>
		<link>http://blog.evanweaver.com/2011/07/25/simplicity/#comment-8350</link>
		<dc:creator><![CDATA[catwell]]></dc:creator>
		<pubDate>Wed, 27 Jul 2011 09:06:51 +0000</pubDate>
		<guid isPermaLink="false">http://blog.evanweaver.com/?p=1996#comment-8350</guid>
		<description><![CDATA[I get your point about multiplicative complexity. As for improving performance through increased coupling, I have had to do that more than once.

It&#039;s the same kind of discussion as Tanenbaum vs. Torvalds (microkernels vs. monolithic kernels). History has proved Torvalds right so far. I still think Tanenbaum&#039;s vision is theoretically better, but sometimes we have to trade off local simplicity, elegance and low coupling for performance, productivity or global simplicity.

I&#039;m originally a network engineer, so the best example I have for this is the TCP/IP stack vs. the ISO/OSI stack. TCP/IP won because it broke design rules and increased coupling, resulting in a simpler architecture overall. Yet it&#039;s still a layered architecture at its core.

Ultimately I think you&#039;re right: it&#039;s obviously better to re-use what you have than to re-invent the wheel all the time or duplicate things. We just have to be careful not to end up having all our blocks depending on all the other blocks just because it allows optimal re-use.

To sum up, I&#039;d say overall complexity arises from three main things: the number of parts, the size of parts and the number of links (dependencies) between the parts. Because you don&#039;t have to think about the whole system all the time, what really matters is global complexity (number of parts + total number of links) and maximum local complexity (for each part, size of the part and number of outgoing + incoming links). It&#039;s up to the system designer to assign weights to those things and make trade-offs.]]></description>
		<content:encoded><![CDATA[<p>I get your point about multiplicative complexity. As for improving performance through increased coupling, I have had to do that more than once.</p>
<p>It&#8217;s the same kind of discussion as Tanenbaum vs. Torvalds (microkernels vs. monolithic kernels). History has proved Torvalds right so far. I still think Tanenbaum&#8217;s vision is theoretically better, but sometimes we have to trade off local simplicity, elegance and low coupling for performance, productivity or global simplicity.</p>
<p>I&#8217;m originally a network engineer, so the best example I have for this is the TCP/IP stack vs. the ISO/OSI stack. TCP/IP won because it broke design rules and increased coupling, resulting in a simpler architecture overall. Yet it&#8217;s still a layered architecture at its core.</p>
<p>Ultimately I think you&#8217;re right: it&#8217;s obviously better to re-use what you have than to re-invent the wheel all the time or duplicate things. We just have to be careful not to end up having all our blocks depending on all the other blocks just because it allows optimal re-use.</p>
<p>To sum up, I&#8217;d say overall complexity arises from three main things: the number of parts, the size of parts and the number of links (dependencies) between the parts. Because you don&#8217;t have to think about the whole system all the time, what really matters is global complexity (number of parts + total number of links) and maximum local complexity (for each part, size of the part and number of outgoing + incoming links). It&#8217;s up to the system designer to assign weights to those things and make trade-offs.</p>
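<p>As a rough illustration of those two measures, here is a small sketch. The four-part system, its sizes and the method names are all made up for the example, and every term is weighted equally, which is only one of many possible weightings:</p>

```ruby
# Hypothetical system description: each part has a size (arbitrary
# units) and a list of the parts it depends on (outgoing links).
PARTS = {
  client: { size: 3, deps: [:api] },
  api:    { size: 5, deps: [:cache, :db] },
  cache:  { size: 2, deps: [] },
  db:     { size: 4, deps: [] }
}

# Global complexity: number of parts plus total number of links.
def global_complexity(parts)
  parts.size + parts.values.sum { |part| part[:deps].size }
end

# Local complexity of one part: its size plus its outgoing links
# plus the number of other parts that link to it (incoming links).
def local_complexity(parts, name)
  outgoing = parts[name][:deps].size
  incoming = parts.values.count { |part| part[:deps].include?(name) }
  parts[name][:size] + outgoing + incoming
end

# Maximum local complexity over all parts.
def max_local_complexity(parts)
  parts.keys.map { |name| local_complexity(parts, name) }.max
end

puts global_complexity(PARTS)    # prints 7 (4 parts + 3 links)
puts max_local_complexity(PARTS) # prints 8 (api: size 5 + 2 out + 1 in)
```

<p>Changing the weights, for example counting incoming links double, is where the designer&#8217;s trade-offs would come in.</p>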
]]></content:encoded>
	</item>
	<item>
		<title>Comment on simplicity by evan</title>
		<link>http://blog.evanweaver.com/2011/07/25/simplicity/#comment-8349</link>
		<dc:creator><![CDATA[evan]]></dc:creator>
		<pubDate>Tue, 26 Jul 2011 16:51:06 +0000</pubDate>
		<guid isPermaLink="false">http://blog.evanweaver.com/?p=1996#comment-8349</guid>
		<description><![CDATA[Boost consolidates the overhead of package management, linking, and documentation. These are secondary, not primary concerns, and hopefully the package manager would solve this problem, but there&#039;s a reason people use it.

I think you should be concerned more about multiplicative complexity than whether the coupling is loose or tight. Software is a multidimensional world where nice API boundaries are often insufficient for productivity. If your system is above the threshold of human comprehension, hope that your early encapsulation choices were right, because now you&#039;re stuck with them forever.

I have inflicted some awesome performance improvements on the world by tightly coupling two components into a single component that serves simultaneous, orthogonal goals.]]></description>
		<content:encoded><![CDATA[<p>Boost consolidates the overhead of package management, linking, and documentation. These are secondary, not primary concerns, and hopefully the package manager would solve this problem, but there&#8217;s a reason people use it.</p>
<p>I think you should be concerned more about multiplicative complexity than whether the coupling is loose or tight. Software is a multidimensional world where nice API boundaries are often insufficient for productivity. If your system is above the threshold of human comprehension, hope that your early encapsulation choices were right, because now you&#8217;re stuck with them forever.</p>
<p>I have inflicted some awesome performance improvements on the world by tightly coupling two components into a single component that serves simultaneous, orthogonal goals.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on simplicity by catwell</title>
		<link>http://blog.evanweaver.com/2011/07/25/simplicity/#comment-8348</link>
		<dc:creator><![CDATA[catwell]]></dc:creator>
		<pubDate>Tue, 26 Jul 2011 16:42:34 +0000</pubDate>
		<guid isPermaLink="false">http://blog.evanweaver.com/?p=1996#comment-8348</guid>
		<description><![CDATA[What I don&#039;t like is the &quot;merging overlapping components&quot; part.

Say you need X and Y to do A, B and C. If you combine X, Y, A, B and C because they overlap, you end up depending on A and C to do B. A good example of this is &quot;I can do everything&quot; libraries, e.g. OpenCV or Boost.

I think that if you&#039;re working on a project with a reasonable number of moving parts, then fine, reduce duplication as much as possible and make it possible for somebody to understand how the system works. But if the system becomes larger, focus on simple, well-defined APIs and work with black boxes. If that means that you have to duplicate code (e.g. in the server and its client(s) for a web service), then that&#039;s the price to pay.

I think we agree on this: paradoxically, simplicity matters are complicated.]]></description>
		<content:encoded><![CDATA[<p>What I don&#8217;t like is the &#8220;merging overlapping components&#8221; part.</p>
<p>Say you need X and Y to do A, B and C. If you combine X, Y, A, B and C because they overlap, you end up depending on A and C to do B. A good example of this is &#8220;I can do everything&#8221; libraries, e.g. OpenCV or Boost.</p>
<p>I think that if you&#8217;re working on a project with a reasonable number of moving parts, then fine, reduce duplication as much as possible and make it possible for somebody to understand how the system works. But if the system becomes larger, focus on simple, well-defined APIs and work with black boxes. If that means that you have to duplicate code (e.g. in the server and its client(s) for a web service), then that&#8217;s the price to pay.</p>
<p>I think we agree on this: paradoxically, simplicity matters are complicated.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

