I think we've gotten to the bottom of this one. We didn't expect our connection "accepts" to fail non-fatally. However, accepts apparently can fail, probably when the connection is severed by a DPI (deep packet inspection) device. That would explain why we see different behavior on different ports. Each failed accept is never replaced, so once we use up the pre-allocated set of accepts in our accept pool, Relay stops responding.
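To illustrate the failure mode, here is a toy Python model. It is not the actual Relay code: `AcceptPool`, `simulate`, and the pool size of 4 are made up for the example, and it only assumes that a successful accept posts a replacement while a failed one (in the buggy case) does not.

```python
class AcceptPool:
    """Toy model of a fixed-size pool of pending accept operations."""

    def __init__(self, size, repost_on_failure):
        self.pending = size                      # accepts currently waiting for a client
        self.repost_on_failure = repost_on_failure
        self.served = 0

    def complete(self, ok):
        """Handle one accept completion; return False if no accept was pending."""
        if self.pending == 0:
            return False                         # nothing is listening: looks unresponsive
        self.pending -= 1
        if ok:
            self.served += 1
            self.pending += 1                    # success path posts a replacement accept
        elif self.repost_on_failure:
            self.pending += 1                    # hypothetical fix: replace failed accepts too
        return True


def simulate(repost_on_failure, size=4, failures=10):
    """Sever `failures` connections, then see whether a good client can still get in."""
    pool = AcceptPool(size, repost_on_failure)
    for _ in range(failures):
        pool.complete(ok=False)
    return pool.pending, pool.complete(ok=True)


print(simulate(repost_on_failure=False))  # pool drained, good client is refused
print(simulate(repost_on_failure=True))   # pool stays full, good client is served
```

Without the repost, the pool drains after as many failures as it has slots, and every later client is ignored even though the process is still running.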
Also, older unattended clients don't behave very nicely when their connections get severed like this. Since the connection was originally established correctly, they don't wait 5 seconds to retry; they retry immediately. They'll essentially DDoS your Relay. This was fixed a while back, but even 5 or 10 of the older clients reconnecting constantly can put a big load on Relay.
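As a rough sketch of the retry problem, the Python below compares the old behavior described above with one possible safer rule. This is not the actual client fix: `old_delay`, `safer_delay`, and the 30-second `stable_after_s` threshold are assumptions for illustration. Only the 5-second wait comes from the description above.

```python
RETRY_DELAY_S = 5.0  # the 5-second wait mentioned above


def old_delay(was_established):
    """Old behavior: a connection that was established once retries with no wait."""
    return 0.0 if was_established else RETRY_DELAY_S


def safer_delay(was_established, lifetime_s, stable_after_s=30.0):
    """Hypothetical rule: skip the wait only if the connection stayed up a while."""
    if was_established and lifetime_s >= stable_after_s:
        return 0.0
    return RETRY_DELAY_S


# A connection that is accepted and then severed after 0.2 seconds:
print(old_delay(True))          # retries immediately, over and over
print(safer_delay(True, 0.2))   # waits before retrying
```

The idea is that a connection which is established and then cut almost right away gets treated like a failed connect, so a client stuck behind a device that keeps severing it backs off instead of hammering Relay.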