I've been resisting posting about it for some time, I think since Patrick pointed me at the link almost six months ago. I've reached my breaking point. In my quest to learn Erlang, I've come across at least three blogs and articles that actually cite this "measurement" with attribution, as if it were some sort of legitimate claim about the scalability of Erlang.
Given that I'm reading his book, I would really expect more from Joe Armstrong, and by attribution, Ali Ghodsi.
The Apache vs. Yaws measurement is one of the most useless pieces of information produced by the Erlang community, to the point that I'd argue it does a disservice to the Erlang community and the language.
In any sort of quasi-scientific measurement (or primary school science experiment for that matter), I would expect to see:
- the actual code used to test the server
- the actual Linux kernel version
- the actual Yaws server code
Instead, we see a graph of an under-documented experiment that creates conditions for a DoS test at best, not a web server scalability test.
From the looks of it, this "measurement" is not:
- documented to any reasonable extent - what kernel was used? what was the exact Apache configuration? what was the Yaws code used to serve the files?
- repeatable - no source, little documentation, little detail on the environment
- peer reviewed - without the above, nobody else can discuss in detail or attempt to repeat the same results
- valid - it simply does not reproduce a real-world environment faced by any modern web server
Allow me to support some of those assertions from real-world experience helping large web sites:
- Not knowing what Linux kernel version was used, how Apache was configured (in detail), and how Apache was built, it's impossible to know whether Apache "was configured for maximum performance". Seeing as they decided not to share any of that vital information, I consider the entire experiment invalid. These things matter: just as it's easy to misconfigure the Erlang runtime by running BEAM on an SMP system without SMP enabled, it's easy to misconfigure Apache. For example, Apache started using epoll with the 2.6.x kernels; anything earlier would be an inappropriate kernel to pair with mpm_worker. From the sparse detail given, we can't know whether the configuration was valid.
- In a real-world environment, if I saw hundreds of connections inactive for ten seconds at a time, I would simply set the connection timeout to five seconds. I'm not sure how you would do the same in Yaws.
- Most production web servers don't serve one-byte files (the "load" requested by the clients in the measurement). I can't actually recall the last time I saw a high-volume server dish out more than one or two one-byte files. Instead, real web servers serve files that measure in the KBs or even MBs. In fact, given Erlang's empirical difficulty handling basic file I/O, I'm not surprised they chose a one-byte file to simulate "load". If the file were any larger, Yaws would likely have been swamped under its own unbuffered I/O implementation and exhibited substantial latency per request.
- A one-byte file would be cached upstream of the server at a high-volume site, particularly if your clients were operating over connections slow enough that one character per ten seconds was realistic (i.e. dial-up).
- If this were a real DoS attempt, it would be choked off at the ISP router, well before the web server saw the intentionally slow request.
- I can personally type an HTTP request faster than one character per ten seconds; this is simply not a realistic access pattern from a benign client.
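To make the configuration point concrete, here's a sketch of what a tuned mpm_worker setup with an aggressive timeout might look like. To be clear, every number below is my own illustrative assumption; the actual configuration from the experiment was never published, which is exactly the problem:

```apache
# Hypothetical httpd.conf fragment -- illustrative only; the benchmark's
# real configuration was never shared.

# Drop clients that go idle mid-request; a five-second timeout defends
# against the one-character-per-ten-seconds pattern described above.
Timeout 5

# mpm_worker tuning. On a 2.6.x kernel Apache can use epoll, which
# matters when holding large numbers of mostly idle connections.
<IfModule mpm_worker_module>
    ServerLimit          16
    StartServers          4
    ThreadsPerChild      64
    MaxClients         1024
    MaxRequestsPerChild   0
</IfModule>
```

Whether settings like these were in place is precisely what we can't know from the published graph, and it's why "Apache was configured for maximum performance" is an unverifiable claim.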
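The arithmetic behind that last point is worth spelling out. A minimal sketch (Python; the request text and function name are my own hypothetical choices, not the original test code) of how long the measurement's slow clients occupy a connection compared to a human typing:

```python
# Sketch: time a server must hold a connection open while an HTTP
# request trickles in character by character. Illustrative assumptions
# only -- this is not the code used in the Apache vs. Yaws measurement.

REQUEST = "GET / HTTP/1.0\r\n\r\n"  # a minimal 18-character request


def connection_time(request: str, seconds_per_char: float) -> float:
    """Seconds the connection stays occupied while the request arrives."""
    return len(request) * seconds_per_char


# The measurement's clients: one character every ten seconds.
slow = connection_time(REQUEST, 10.0)   # 180 seconds for one tiny request

# A human typing at a leisurely two characters per second.
typed = connection_time(REQUEST, 0.5)   # 9 seconds

# With a 5-second Timeout, the slow client is dropped at the first idle
# gap and never completes a request at all.
```

Holding a connection open for three minutes to deliver an 18-character request is an attack profile, not a client profile, which is why this reads as a DoS test rather than a scalability test.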
I put this in the same category as Microsoft FUD: there is an agenda here, and people will read it and say "Oh, yeah - told you we rock" without questioning the details. Most, however, should dismiss it as pure FUD, and FUD served from an Apache 2.2.3 server no less (maybe Yaws wasn't up to the task).