Reverse proxy deep dive: Why HTTP parsing at the edge is harder than it looks

(startwithawhy.com)

59 points | by miggy 1 day ago

2 comments

pixl97 1 day ago
Oh, and it can get messy and lead to exploits really quick.
Incorrect parsing and parsing differences between libraries can lead to exciting exploits.
Like what do you do when there are multiple copies of the same header with odd line breaks?
GET /example HTTP/1.1
Host: bad-stuff-here
Host: vulnerable-website.com
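To make the ambiguity concrete, here is a minimal Python sketch of two deliberately naive header parsers reading that request: one keeps the first value of a repeated header, the other keeps the last. The function names and both policies are assumptions made up for illustration, not the behaviour of any particular proxy or library.

```python
from typing import List, Optional, Tuple

# Hypothetical example request with a duplicated Host header.
RAW_REQUEST = (
    b"GET /example HTTP/1.1\r\n"
    b"Host: bad-stuff-here\r\n"
    b"Host: vulnerable-website.com\r\n"
    b"\r\n"
)


def header_lines(raw: bytes) -> List[Tuple[str, str]]:
    """Split the header section into (lowercased name, stripped value) pairs.

    Deliberately lenient: no validation of names, whitespace, or folding.
    """
    head = raw.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    pairs = []
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        pairs.append((name.strip().lower(), value.strip()))
    return pairs


def host_first_wins(raw: bytes) -> Optional[str]:
    """Policy A (assumed): the first Host header is authoritative."""
    for name, value in header_lines(raw):
        if name == "host":
            return value
    return None


def host_last_wins(raw: bytes) -> Optional[str]:
    """Policy B (assumed): later headers overwrite earlier ones."""
    headers = dict(header_lines(raw))  # dict() keeps the last value per key
    return headers.get("host")


if __name__ == "__main__":
    print("first-wins sees:", host_first_wins(RAW_REQUEST))
    print("last-wins sees: ", host_last_wins(RAW_REQUEST))
```

If the front end routes on one of these values while the backend trusts the other, the two hops disagree about which site the request is for.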
- freeone3000 1 day ago
  It’s a good thing we have RFCs! For duplicate Host, you MUST respond with a 400. If the Host is different than the authority, Host must be ignored. If Host is not specified, it must be provided to upstream. See “Host” in RFC 7230:
  https://www.rfc-editor.org/rfc/rfc7230#section-5.4
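As a rough illustration of the duplicate-Host rule, the Python sketch below counts Host fields in a raw HTTP/1.1 request head and returns 400 unless there is exactly one. RFC 7230 section 5.4 also calls for a 400 when an HTTP/1.1 request has no Host field at all, so the sketch treats a missing Host the same way. The `check_host` name and the "return a status code or None" shape are assumptions made for this example, and validation of the Host value itself is left out.

```python
from typing import Optional


def check_host(raw_head: bytes) -> Optional[int]:
    """Return 400 if the request should be rejected, or None if Host looks fine.

    Hypothetical helper: it only counts Host fields and rejects folded lines.
    """
    head = raw_head.split(b"\r\n\r\n", 1)[0]
    lines = head.split(b"\r\n")[1:]  # drop the request line
    host_count = 0
    for line in lines:
        # A header line starting with space or tab is an obsolete line fold.
        # Rejecting it (rather than unfolding) is one of the options the RFC allows.
        if line[:1] in (b" ", b"\t"):
            return 400
        name, sep, _value = line.partition(b":")
        # Match the field name case-insensitively, without trimming whitespace,
        # so b"Host " is not silently accepted as Host.
        if sep and name.lower() == b"host":
            host_count += 1
    return None if host_count == 1 else 400
```

Called on the request from the parent comment, `check_host` returns 400 because it sees two Host fields; with a single Host field it returns None.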
  - ranger_danger 1 day ago
    it's a good thing all RFCs are 100% specified with no ambiguities.
    EDIT: Sorry I dropped my /s. I was only trying to say that unfortunately not all RFCs are sufficiently specified... and that I think saying "good thing we have RFCs" should not imply they will all be sufficiently specified, which is how I interpreted their comment... and didn't feel like typing all this out, but I guess it was necessary anyway.
    - necovek 1 day ago
      That's a very weird take as a reply to a point that is sufficiently specified.
      - pixl97 1 day ago
        I mean, I was pointing out one in a chain of security failures reverse proxies have had. I could probably point out 20-30 other ones that have cropped up. Adding the binary complexity to H2 has really increased the number of these coming.
      - ranger_danger 1 day ago
        Sorry, what I was implying is that "It’s a good thing we have RFCs" doesn't mean that they ARE always sufficiently specified... even if this one is.
        - necovek 17 hours ago
          I understand that: the problem is that in this example, it is, so the problem is obviously somewhere else, and that's what we should explore.
          Is it just that the RFC has not been read properly? Maybe, but even if it was, I do not think having precisely defined behaviour in RFCs is sufficient: real-world implementations have to be more flexible because of the other buggy implementations they interact with.
TechDebtDevin 1 day ago
I've been building out a very large network of reverse proxies over the last year. Very fun, and your article is very relatable. Go has been my friend. I've spent the last couple of months testing, trying to figure out all the weird things that can happen, and it's quite a bit.
- bithavoc 1 day ago
  me too, what are you building?
  - TechDebtDevin 1 day ago
    A sort of boutique mobile-first proxy, with emphasis on geographic spread/accuracy. I've been running my own proxies for a long time via friends' and family's networks, but in those instances security/safety wasn't as big of a deal. Yourself?
    - bithavoc 1 day ago
      that’s cool, I’m working on branded artifact delivery. Docker, Go, npm, PyPI repos delivered on free custom sub-domains. Vultr BGP services doing the trick so far.
      - TechDebtDevin 23 hours ago
        And my solution is primarily reverse SOCKS5, on top of Tailscale (moving away from ts, although no complaints) with lots of routing in the middle.
      - TechDebtDevin 23 hours ago
        Awesome, that sounds like it could be really useful.