Clock Blog
Preventing 'http: Raise hangup error on destroyed socket write' from crashing your node.js server
@tomgco and I were hacking late on a new Clock node.js project. The caffeine-fueled @tomgco loves pounding browser refresh like a freaking machine gun, and then I heard “Oh, my web server has crashed!” Developers pummel refresh, it's a fact of life, but it doesn't normally cause the httpServer to crash. Earlier that day we'd upgraded node to 0.8.20, so it didn't take long to turn our attention to the changelog and then on to a tweet that Tom had spotted.
https://twitter.com/nodejs/status/303893363877363712
‘No more leaking memory’: this killer line fills me with mixed emotions. Memory leaks are our new worst enemy since switching to node.js, sneaking up on us, killing our services at peak times and keeping me up at night reading the dtrace manual. Naturally I’m ecstatic to find there will be fewer of them, but at the same time, ALL MY NODE APPS ARE LEAKING MEMORY and the fix requires a code change. Dang!
After some googling and testing we confirmed the following fix in 0.8.20 was now causing our development web server to crash:
http: Raise hangup error on destroyed socket write (isaacs)
Here is the original commit:
https://github.com/isaacs/node/commit/e261156e7386e3d870543bee4218c7f106bfcf22
It was then pulled down to the stable branch: https://github.com/joyent/node/pull/4775
We found that issues were already coming in: https://github.com/ether/etherpad-lite/issues/1541 https://github.com/LearnBoost/socket.io/issues/1160
In case you missed it, this isn’t going to get fixed properly in 0.8:
“The proper fix is to treat ECONNRESET correctly. However, this is a behavior/semantics change, and cannot land in a stable branch. So, the full-of-sad bandaid fix is to not put data into the output buffer if the socket is destroyed, and also remove anything that is in the output buffer when the HTTP request sees that it closes.” - isaacs
We just needed a ‘bandaid’ on our 0.8 apps, and I was actually glad to have a good reason to retrofit Domains around our apps.
The Problem
Below is a simple web server that waits 5 seconds before responding. This will error in 0.8.20 when the client connection hangs up.
var http = require('http')
http.createServer(function (req, res) {
  // Wait 5 seconds before responding
  setTimeout(function () {
    res.writeHead(200, {'Content-Type': 'text/plain'})
    res.end('Hello World\n')
  }, 5000)
}).listen(1337, '127.0.0.1')
setInterval(function () {
  console.log(process.memoryUsage().rss)
}, 2000)
console.log('Server running at http://127.0.0.1:1337/')
With this server running on a version of node before 0.8.20, you can run:
curl http://127.0.0.1:1337/ & sleep 2 && killall curl
This kills the connection after 2 seconds. You won't see any errors from the server, but you will get a memory leak instead.
Now switch to 0.8.20. (We use nave to quickly switch node versions.)
nave use 0.8.20
Run the server, then run the curl one-liner again:
curl http://127.0.0.1:1337/ & sleep 2 && killall curl
You'll see the server throw an error and die:
timers.js:103
if (!process.listeners('uncaughtException').length) throw e;
^
Error: socket hang up
at createHangUpError (http.js:1360:15)
at ServerResponse.OutgoingMessage._writeRaw (http.js:507:26)
at ServerResponse.OutgoingMessage._send (http.js:476:15)
at ServerResponse.OutgoingMessage.write (http.js:740:18)
at ServerResponse.OutgoingMessage.end (http.js:882:16)
at Object._onTimeout (/socket-hangup/server.js:8:9)
at Timer.list.ontimeout (timers.js:101:19)
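One plausible reading of that trace is that the response emits an 'error' event that nobody is listening for, and node throws any unhandled 'error' event. The sketch below only illustrates that EventEmitter behaviour. It uses a plain EventEmitter as a stand-in for the response, and the emitHangUp helper is a hypothetical name made up for this example. It is not the http module's own code.

```javascript
var EventEmitter = require('events').EventEmitter

// Hypothetical helper: emit a hang up error on an emitter and report
// whether it was handled by a listener or thrown back at the caller.
function emitHangUp(emitter) {
  try {
    emitter.emit('error', new Error('socket hang up'))
    return 'handled'
  } catch (e) {
    return 'thrown: ' + e.message
  }
}

// Stand-in for a response with no 'error' listener (assumption)
var unguarded = new EventEmitter()

// Stand-in for a response that has an 'error' listener attached
var guarded = new EventEmitter()
guarded.on('error', function () {})

var unguardedResult = emitHangUp(unguarded)
var guardedResult = emitHangUp(guarded)

console.log(unguardedResult) // thrown: socket hang up
console.log(guardedResult)   // handled
```

In the real server there is no try/catch around the timer callback, so the thrown error goes all the way up and takes the process down.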
Our Solution
Wrap the request and response in a domain.
var http = require('http')
  , domain = require('domain')
  , serverDomain = domain.create()
// Domain for the server
serverDomain.run(function () {
  http.createServer(function (req, res) {
    // Create a domain per request and add both emitters to it
    var reqd = domain.create()
    reqd.add(req)
    reqd.add(res)
    // On error dispose of the domain
    reqd.on('error', function (error) {
      console.error('Error', error, req.url)
      reqd.dispose()
    })
    // Wait 5 seconds before responding
    setTimeout(function () {
      res.writeHead(200, {'Content-Type': 'text/plain'})
      res.end('Hello World\n')
    }, 5000)
  }).listen(1337, '127.0.0.1')
})
setInterval(function () {
  console.log(process.memoryUsage().rss)
  // gc() is only defined when node is started with --expose-gc
  if (typeof gc === 'function') {
    gc()
  }
}, 2000)
console.log('Server running at http://127.0.0.1:1337/')
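If you have several servers to patch, the same idea can be pulled out into a small wrapper. This is a minimal sketch rather than something we have battle tested: withRequestDomain and its onError callback are hypothetical names invented for this example. It also differs from the server above in two ways. It runs the handler inside the request domain, so timers the handler creates are bound to that domain too, and it only calls dispose() when the method exists, because later node versions no longer provide it.

```javascript
var domain = require('domain')

// Hypothetical helper: wrap any (req, res) handler in a per-request domain.
// onError(error, req, res) is called for errors caught by that domain.
function withRequestDomain(handler, onError) {
  return function (req, res) {
    var reqd = domain.create()
    reqd.add(req)
    reqd.add(res)
    reqd.on('error', function (error) {
      onError(error, req, res)
      // dispose() is available in 0.8 but is missing from later versions
      if (typeof reqd.dispose === 'function') {
        reqd.dispose()
      }
    })
    // Run the handler inside the domain so its timers are bound as well
    reqd.run(function () {
      handler(req, res)
    })
  }
}

// Example wiring (not started here):
// http.createServer(withRequestDomain(handler, function (error, req) {
//   console.error('Error', error, req.url)
// })).listen(1337, '127.0.0.1')
```

Because an express app is itself a (req, res) function, the same wrapper should accept it in place of handler.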
Express
If you are using express 3, you can apply a similar fix:
var http = require('http')
  , domain = require('domain')
  , serverDomain = domain.create()
  , express = require('express')
  , app = express()
app.get('/', function (req, res) {
  // Wait 5 seconds before responding
  setTimeout(function () {
    res.send('Hello World')
  }, 5000)
})
// Domain for the server
serverDomain.run(function () {
  http.createServer(function (req, res) {
    // Create a domain per request and add both emitters to it
    var reqd = domain.create()
    reqd.add(req)
    reqd.add(res)
    // On error dispose of the domain
    reqd.on('error', function (error) {
      console.error('Error', error.code, error.message, req.url)
      reqd.dispose()
    })
    // Pass the request to express
    app(req, res)
  }).listen(1337, '127.0.0.1')
})
setInterval(function () {
  console.log(process.memoryUsage().rss)
  // gc() is only defined when node is started with --expose-gc
  if (typeof gc === 'function') {
    gc()
  }
}, 2000)
console.log('Server running at http://127.0.0.1:1337/')
We’ve not got this in production yet but this patch looks like it is going to get us by. If you have a better solution please let us know.