<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: The Web Server Benchmarking We Need</title>
	<atom:link href="http://blog.ianbicking.org/2010/03/16/web-server-benchmarking-we-need/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.ianbicking.org/2010/03/16/web-server-benchmarking-we-need/</link>
	<description></description>
	<lastBuildDate>Fri, 06 May 2011 07:16:39 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: Edmundo Gave</title>
		<link>http://blog.ianbicking.org/2010/03/16/web-server-benchmarking-we-need/comment-page-1/#comment-187957</link>
		<dc:creator>Edmundo Gave</dc:creator>
		<pubDate>Fri, 31 Dec 2010 00:19:29 +0000</pubDate>
		<guid isPermaLink="false">http://blog.ianbicking.org/?p=198#comment-187957</guid>
		<description>Hey I just wanted to let you know, I really like the written material on your website. But I am employing Firefox on a machine running version 8.x of Ubuntu and the design doesn&#039;t seem quite as intended. Not a serious issue, I can still essentially read the articles and research for info, but just wanted to inform you about that. The navigation bar is kind of tough to use with the config I&#039;m running. Keep up the great work!</description>
		<content:encoded><![CDATA[<p>Hey I just wanted to let you know, I really like the written material on your website. But I am employing Firefox on a machine running version 8.x of Ubuntu and the design doesn&#8217;t seem quite as intended. Not a serious issue, I can still essentially read the articles and research for info, but just wanted to inform you about that. The navigation bar is kind of tough to use with the config I&#8217;m running. Keep up the great work!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dr Sophie Henshaw</title>
		<link>http://blog.ianbicking.org/2010/03/16/web-server-benchmarking-we-need/comment-page-1/#comment-183058</link>
		<dc:creator>Dr Sophie Henshaw</dc:creator>
		<pubDate>Fri, 26 Nov 2010 04:12:28 +0000</pubDate>
		<guid isPermaLink="false">http://blog.ianbicking.org/?p=198#comment-183058</guid>
		<description>I wish there was more competition among servers as well. Nicholas&#039;s benchmark is a start, and benchmarks can only progress from here. We need to keep that in mind.</description>
		<content:encoded><![CDATA[<p>I wish there was more competition among servers as well. Nicholas&#8217;s benchmark is a start, and benchmarks can only progress from here. We need to keep that in mind.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Vince</title>
		<link>http://blog.ianbicking.org/2010/03/16/web-server-benchmarking-we-need/comment-page-1/#comment-170400</link>
		<dc:creator>Vince</dc:creator>
		<pubDate>Sat, 14 Aug 2010 08:05:33 +0000</pubDate>
		<guid isPermaLink="false">http://blog.ianbicking.org/?p=198#comment-170400</guid>
		<description>Well, it seems that someone is talking less about theory and more about practice: those guys at TrustLeap not only disclosed the source code of their benchmark template but they also applied it to the market leaders.

Maybe that&#039;s because their own Web Application server, G-WAN, is smoking everybody else - whether this is in the kernel or in user-mode.

Things can be simple, after all, when one does not fear comparisons.</description>
		<content:encoded><![CDATA[<p>Well, it seems that someone is talking less about theory and more about practice: those guys at TrustLeap not only disclosed the source code of their benchmark template but they also applied it to the market leaders.</p>

<p>Maybe that&#8217;s because their own Web Application server, G-WAN, is smoking everybody else &#8211; whether this is in the kernel or in user-mode.</p>

<p>Things can be simple, after all, when one does not fear comparisons.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Guy Siverson</title>
		<link>http://blog.ianbicking.org/2010/03/16/web-server-benchmarking-we-need/comment-page-1/#comment-169406</link>
		<dc:creator>Guy Siverson</dc:creator>
		<pubDate>Tue, 03 Aug 2010 14:39:55 +0000</pubDate>
		<guid isPermaLink="false">http://blog.ianbicking.org/?p=198#comment-169406</guid>
		<description>&quot;I wish there was competition among servers not to see who can tweak their performance for entirely unrealistic situations, but to see who can implement the most fail-safe server. We’re missing good benchmarks.&quot;

What benchmarks would you like to see that are currently being missed?</description>
		<content:encoded><![CDATA[<p>&#8220;I wish there was competition among servers not to see who can tweak their performance for entirely unrealistic situations, but to see who can implement the most fail-safe server. We’re missing good benchmarks.&#8221;</p>

<p>What benchmarks would you like to see that are currently being missed?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Yaniv Aknin</title>
		<link>http://blog.ianbicking.org/2010/03/16/web-server-benchmarking-we-need/comment-page-1/#comment-157405</link>
		<dc:creator>Yaniv Aknin</dc:creator>
		<pubDate>Sun, 28 Mar 2010 11:05:05 +0000</pubDate>
		<guid isPermaLink="false">http://blog.ianbicking.org/?p=198#comment-157405</guid>
		<description>I wholeheartedly agree with your approach, and as I wrote you in a separate email, made a feeble attempt at sketching the kind of benchmark you&#039;re talking about (see my site for a short writeup of what I coded or the &quot;Labour&quot; repository in my github for the actual code). Cheers.</description>
		<content:encoded><![CDATA[<p>I wholeheartedly agree with your approach, and as I wrote you in a separate email, made a feeble attempt at sketching the kind of benchmark you&#8217;re talking about (see my site for a short writeup of what I coded or the &#8220;Labour&#8221; repository in my github for the actual code). Cheers.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sylvain Hellegouarch</title>
		<link>http://blog.ianbicking.org/2010/03/16/web-server-benchmarking-we-need/comment-page-1/#comment-156499</link>
		<dc:creator>Sylvain Hellegouarch</dc:creator>
		<pubDate>Wed, 17 Mar 2010 22:25:17 +0000</pubDate>
		<guid isPermaLink="false">http://blog.ianbicking.org/?p=198#comment-156499</guid>
		<description>&gt; Do you know what your web server does when 1% of requests go into an infinite loop?

Where do you get that 1% from? What value does that random percentage have? If a request goes into an infinite loop, most likely the request will time out on one end or the other.

&gt; But we all know these things happen.

If it&#039;s a bug, it should be unit tested, not performance tested. 

&gt; This represents exactly the kind of situation you can’t test in context (..)

If it can be shown under a certain load only, then there is a context.

&gt; we need some common understanding of performance

True, but this should not be translated into code. It should be enough to ask people with good experience in those areas what should be taken into account when discussing server performance. Why the compulsory requirement for some code?

&gt; Server developers aren’t going to tweak their servers for your one particular application

I don&#039;t expect them to but they can&#039;t expect their servers to be used without an application context either (apart from static serving). So they must be aware of the context in which their server will be used.</description>
		<content:encoded><![CDATA[<p>&gt; Do you know what your web server does when 1% of requests go into an infinite loop?</p>

<p>Where do you get that 1% from? What value does that random percentage have? If a request goes into an infinite loop, most likely the request will time out on one end or the other.</p>

<p>&gt; But we all know these things happen.</p>

<p>If it&#8217;s a bug, it should be unit tested, not performance tested. </p>

<p>&gt; This represents exactly the kind of situation you can’t test in context (..)</p>

<p>If it can be shown under a certain load only, then there is a context.</p>

<p>&gt; we need some common understanding of performance</p>

<p>True, but this should not be translated into code. It should be enough to ask people with good experience in those areas what should be taken into account when discussing server performance. Why the compulsory requirement for some code?</p>

<p>&gt; Server developers aren’t going to tweak their servers for your one particular application</p>

<p>I don&#8217;t expect them to but they can&#8217;t expect their servers to be used without an application context either (apart from static serving). So they must be aware of the context in which their server will be used.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ian Bicking</title>
		<link>http://blog.ianbicking.org/2010/03/16/web-server-benchmarking-we-need/comment-page-1/#comment-156495</link>
		<dc:creator>Ian Bicking</dc:creator>
		<pubDate>Wed, 17 Mar 2010 21:58:14 +0000</pubDate>
		<guid isPermaLink="false">http://blog.ianbicking.org/?p=198#comment-156495</guid>
		<description>&gt; What Ian expects is much more of course but I’d say that if by 2010 no one has come up with a generic load/performance/etc testing of a web server, maybe it’s because it’s not actually doable in any meaningful way. Testing always depends on the context in which it takes place.

Do you know what your web server does when 1% of requests go into an infinite loop?  That&#039;s both reasonable and easy to test.  But how many people know what the result is?  This represents exactly the kind of situation you can&#039;t test in context, because developers don&#039;t develop expecting to have that problem (it is after all a bug).  But we all know these things happen.

Right now I see primarily microbenchmarks (that benchmark one small isolated aspect of a system) and people who claim no general benchmarking is valid.  There is a middle ground.  It does not seem that difficult to me to just start to benchmark somewhat more interesting things.  Server developers aren&#039;t going to tweak their servers for your one particular application; we need some common understanding of performance because server and application are two separate lines of development.  Outside of the process model decision (async/greenlets, threaded, multiprocess), I don&#039;t see anything interesting happening in terms of performance, nor do I expect any server to accomplish anything by using the results of this kind of microbenchmark except perhaps outliers that want to get in line with the rest of the pack.</description>
		<content:encoded><![CDATA[<blockquote>
  <p>What Ian expects is much more of course but I’d say that if by 2010 no one has come up with a generic load/performance/etc testing of a web server, maybe it’s because it’s not actually doable in any meaningful way. Testing always depends on the context in which it takes place.</p>
</blockquote>

<p>Do you know what your web server does when 1% of requests go into an infinite loop?  That&#8217;s both reasonable and easy to test.  But how many people know what the result is?  This represents exactly the kind of situation you can&#8217;t test in context, because developers don&#8217;t develop expecting to have that problem (it is after all a bug).  But we all know these things happen.</p>

<p>Right now I see primarily microbenchmarks (that benchmark one small isolated aspect of a system) and people who claim no general benchmarking is valid.  There is a middle ground.  It does not seem that difficult to me to just start to benchmark somewhat more interesting things.  Server developers aren&#8217;t going to tweak their servers for your one particular application; we need some common understanding of performance because server and application are two separate lines of development.  Outside of the process model decision (async/greenlets, threaded, multiprocess), I don&#8217;t see anything interesting happening in terms of performance, nor do I expect any server to accomplish anything by using the results of this kind of microbenchmark except perhaps outliers that want to get in line with the rest of the pack.</p>
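
<p>As a minimal sketch of the "1% of requests go into an infinite loop" scenario, assuming a plain WSGI application: the names <code>make_app</code>, <code>hang_fraction</code>, <code>hang_seconds</code> and <code>call</code> are hypothetical, chosen only for illustration. A bounded <code>hang_seconds</code> is an added assumption so the application can be exercised safely; leaving it as <code>None</code> gives the genuine infinite loop.</p>

```python
import random
import time


def make_app(hang_fraction=0.01, hang_seconds=None, rng=random.random):
    """Build a WSGI app in which roughly `hang_fraction` of requests get stuck.

    hang_seconds=None means a genuine infinite loop; a number bounds the
    busy loop so the app can be tried out without losing a worker forever.
    """
    def app(environ, start_response):
        if rng() < hang_fraction:
            deadline = None
            if hang_seconds is not None:
                deadline = time.monotonic() + hang_seconds
            # Busy loop: burns CPU and never yields, like a runaway request.
            while deadline is None or time.monotonic() < deadline:
                pass
        body = b"ok\n"
        headers = [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ]
        start_response("200 OK", headers)
        return [body]
    return app


def call(app):
    """Invoke the app once without a server; return (status, body, seconds)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    started = time.monotonic()
    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": "/"}
    body = b"".join(app(environ, start_response))
    return captured["status"], body, time.monotonic() - started
```

<p>Serving <code>make_app()</code> from each server under test, with the same load generator pointed at it, would show what the healthy 99% of requests experience while the stuck ones hold on to a worker.</p>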
]]></content:encoded>
	</item>
	<item>
		<title>By: Ian Bicking</title>
		<link>http://blog.ianbicking.org/2010/03/16/web-server-benchmarking-we-need/comment-page-1/#comment-156494</link>
		<dc:creator>Ian Bicking</dc:creator>
		<pubDate>Wed, 17 Mar 2010 21:46:46 +0000</pubDate>
		<guid isPermaLink="false">http://blog.ianbicking.org/?p=198#comment-156494</guid>
		<description>Even making the example application a little more complicated would make the results more reasonable, I think.  For instance, return a large string, or the first 100 Fibonacci numbers, or something like that.  If that hurts threaded servers, so be it (I&#039;m not sure it would be that bad though).  Certainly `time.sleep()` has a particular problem, in that async servers have monkeypatched that specific function call, though arguably it might not be a bad representation of how a database call could work (if you have a genuine async database driver, or a pure-Python driver based on socket).

As for combinations, there certainly are many; adding Nginx to the mix doubles the combinations, for example.  And uWSGI can be embedded in a bunch of environments.  And there are at least several distinct ways to configure mod_wsgi, and I don&#039;t particularly know which one is best.  Because of this I think getting a testing rig that other people can use and extend is more useful than one presentation of benchmark results.  Actually putting together the results would be handy every so often; it also gives people a baseline to compare against (so they can test their new server setup against some other setup that seems good, instead of testing it against all other setups).</description>
		<content:encoded><![CDATA[<p>Even making the example application a little more complicated would make the results more reasonable, I think.  For instance, return a large string, or the first 100 Fibonacci numbers, or something like that.  If that hurts threaded servers, so be it (I&#8217;m not sure it would be that bad though).  Certainly <code>time.sleep()</code> has a particular problem, in that async servers have monkeypatched that specific function call, though arguably it might not be a bad representation of how a database call could work (if you have a genuine async database driver, or a pure-Python driver based on socket).</p>

<p>As for combinations, there certainly are many; adding Nginx to the mix doubles the combinations, for example.  And uWSGI can be embedded in a bunch of environments.  And there are at least several distinct ways to configure mod_wsgi, and I don&#8217;t particularly know which one is best.  Because of this I think getting a testing rig that other people can use and extend is more useful than one presentation of benchmark results.  Actually putting together the results would be handy every so often; it also gives people a baseline to compare against (so they can test their new server setup against some other setup that seems good, instead of testing it against all other setups).</p>
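
<p>A rough sketch of the slightly more complicated example applications mentioned above, assuming plain WSGI callables. The names <code>fibonacci</code>, <code>fib_app</code>, <code>large_string_app</code> and <code>run_once</code> are hypothetical, and the one-megabyte body size is an arbitrary assumption rather than a recommended value.</p>

```python
def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting from 0, 1."""
    numbers = []
    a, b = 0, 1
    for _ in range(count):
        numbers.append(a)
        a, b = b, a + b
    return numbers


def _respond(start_response, body):
    """Send a plain-text 200 response carrying `body` (bytes)."""
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def fib_app(environ, start_response):
    """A little CPU work per request: the first 100 Fibonacci numbers."""
    lines = "\n".join(str(n) for n in fibonacci(100))
    return _respond(start_response, lines.encode("ascii"))


def large_string_app(environ, start_response, size=1000000):
    """A large response body; `size` (in bytes) is an arbitrary assumption."""
    return _respond(start_response, b"x" * size)


def run_once(app):
    """Call a WSGI app directly, without a server; return (status, body)."""
    seen = {}

    def start_response(status, headers):
        seen["status"] = status

    body = b"".join(app({"REQUEST_METHOD": "GET"}, start_response))
    return seen["status"], body
```

<p>Either callable can be handed unchanged to each server in the comparison, so the same rig can measure a trivial response, a CPU-bound one and a large one.</p>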
]]></content:encoded>
	</item>
	<item>
		<title>By: Chris McDonough</title>
		<link>http://blog.ianbicking.org/2010/03/16/web-server-benchmarking-we-need/comment-page-1/#comment-156461</link>
		<dc:creator>Chris McDonough</dc:creator>
		<pubDate>Wed, 17 Mar 2010 15:41:23 +0000</pubDate>
		<guid isPermaLink="false">http://blog.ianbicking.org/?p=198#comment-156461</guid>
		<description>&quot;All these graphs tend to do at the moment is cause the sheep who are easily impressed to move to whatever system the results give the impression of being better than anything else for that specific test.&quot;

When you see people say &quot;wow, awesome, I&#039;ll really need to try CoolNewWSGIServerX!&quot; in a reddit comment against a set of benchmarks, you might remember:

- They aren&#039;t going to try it.  They would have just tried it instead of leaving a comment saying they&#039;re going to try it.

- Even if they do try it, they probably don&#039;t have a production application anyway. If they had a production application, they&#039;d be more concerned about stability and maintainability than raw speed, because except for a very few cases, the server isn&#039;t their bottleneck; their app is always their bottleneck.

As a result,  although it&#039;s good to stay involved and try to correct misstatements and benchmarking misconfigurations, I wouldn&#039;t worry about sheep or flocking or whatever.  At the end of the day, people are lazy, and they&#039;re likely to choose the system that has the best production feature-set and the best documentation.  Marketing also helps, of course, and positive benchmark results can be seen as a form of marketing.  But if a server doesn&#039;t have good docs and a good production feature set, it&#039;s not going to see a lot of use, even if it goes at plaid speed.</description>
		<content:encoded><![CDATA[<p>&#8220;All these graphs tend to do at the moment is cause the sheep who are easily impressed to move to whatever system the results give the impression of being better than anything else for that specific test.&#8221;</p>

<p>When you see people say &#8220;wow, awesome, I&#8217;ll really need to try CoolNewWSGIServerX!&#8221; in a reddit comment against a set of benchmarks, you might remember:</p>

<ul>
<li><p>They aren&#8217;t going to try it.  They would have just tried it instead of leaving a comment saying they&#8217;re going to try it.</p></li>
<li><p>Even if they do try it, they probably don&#8217;t have a production application anyway. If they had a production application, they&#8217;d be more concerned about stability and maintainability than raw speed, because except for a very few cases, the server isn&#8217;t their bottleneck; their app is always their bottleneck.</p></li>
</ul>

<p>As a result,  although it&#8217;s good to stay involved and try to correct misstatements and benchmarking misconfigurations, I wouldn&#8217;t worry about sheep or flocking or whatever.  At the end of the day, people are lazy, and they&#8217;re likely to choose the system that has the best production feature-set and the best documentation.  Marketing also helps, of course, and positive benchmark results can be seen as a form of marketing.  But if a server doesn&#8217;t have good docs and a good production feature set, it&#8217;s not going to see a lot of use, even if it goes at plaid speed.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sylvain Hellegouarch</title>
		<link>http://blog.ianbicking.org/2010/03/16/web-server-benchmarking-we-need/comment-page-1/#comment-156460</link>
		<dc:creator>Sylvain Hellegouarch</dc:creator>
		<pubDate>Wed, 17 Mar 2010 13:40:21 +0000</pubDate>
		<guid isPermaLink="false">http://blog.ianbicking.org/?p=198#comment-156460</guid>
		<description>Well, testing a server should always go with a context and a rationale behind said test. A benchmark isn&#039;t a test per se, it&#039;s just one aspect of a server that is pushed to the limit. So to me what Nicholas has done is fine since he explained the goal right in the premise of the article:

&gt; That benchmark looked specifically at the raw socket performance of various frameworks.

What Ian expects is much more of course but I&#039;d say that if by 2010 no one has come up with a generic load/performance/etc testing of a web server, maybe it&#039;s because it&#039;s not actually doable in any meaningful way. Testing always depends on the context in which it takes place.

Probably the only realistic test that can be performed across various servers is to test for static serving performance. Once you hit the application server, it becomes way harder. Personally I&#039;m not finding what Ian proposes to test to be actually helpful. This is just a fancier benchmark, not a real test.

For instance, when testing an intranet application, I would be interested in seeing how an operation that I consider sensitive performs under certain conditions. For example, I would like to know how logging in performs on a Monday morning by simulating 500 virtual users. How many of those can log in at once? How long does it take? Do I have failures? How long does it take for the servers to cool down? Most likely, the HTTP server performance won&#039;t impact any of those.</description>
		<content:encoded><![CDATA[<p>Well, testing a server should always go with a context and a rationale behind said test. A benchmark isn&#8217;t a test per se, it&#8217;s just one aspect of a server that is pushed to the limit. So to me what Nicholas has done is fine since he explained the goal right in the premise of the article:</p>

<p>&gt; That benchmark looked specifically at the raw socket performance of various frameworks.</p>

<p>What Ian expects is much more of course but I&#8217;d say that if by 2010 no one has come up with a generic load/performance/etc testing of a web server, maybe it&#8217;s because it&#8217;s not actually doable in any meaningful way. Testing always depends on the context in which it takes place.</p>

<p>Probably the only realistic test that can be performed across various servers is to test for static serving performance. Once you hit the application server, it becomes way harder. Personally I&#8217;m not finding what Ian proposes to test to be actually helpful. This is just a fancier benchmark, not a real test.</p>

<p>For instance, when testing an intranet application, I would be interested in seeing how an operation that I consider sensitive performs under certain conditions. For example, I would like to know how logging in performs on a Monday morning by simulating 500 virtual users. How many of those can log in at once? How long does it take? Do I have failures? How long does it take for the servers to cool down? Most likely, the HTTP server performance won&#8217;t impact any of those.</p>
]]></content:encoded>
	</item>
</channel>
</rss>


