The appeal of serving your web pages with a single process

(utcc.utoronto.ca)

85 points | by ingve 185 days ago

17 comments

kevmo314 182 days ago
Wholly agree. Too often we think about scale far too early.
I've seen very simple services get bogged down in needing to be "scalable" so they're built so they can be spun up or torn down easily. Then a load balancer is needed. Then an orchestration layer is needed so let's add Kubernetes. Then a shared state cache is needed so let's deploy Redis. Then we need some sort of networking layer so let's add a VPC. That's hard to configure though so let's infra-as-code it with terraform. Then wow that's a lot of infrastructure so let's hire an SRE team.
Now nobody is incentivized to remove said infrastructure because now jobs rely on it existing so it's ossified in the organization.
And that's how you end up with a simple web server that suddenly exploded into costing millions a year.
[-]
- AlotOfReading 182 days ago
  In a former job, I wrote a static PWA to do initial provisioning for robots. A tech would load the page, generate a QR code, and put it in front of the camera to program the robot.
  When I looked into having this static page hosted on internal infra, it would have also needed minimum two dedicated oncalls, terraform, LB, containerization, security reviews, SLAs, etc.
  I gave up after the second planning meeting and put it on my $5 VPS with a letsencrypt cert. That static page is still running today, having outlived not only the production line, but also the entire company.
  [-]
  - Aurornis 182 days ago
    > When I looked into having this static page hosted on internal infra, it would have also needed minimum two dedicated oncalls, terraform, LB, containerization, security reviews, SLAs, etc.
    In my experience there are two kinds of infrastructure or platform teams:
    1) The friendly team trying to help everyone get things done with reasonable tradeoffs appropriate for the situation
    2) The team who thinks their job is to make it as hard as possible for anyone to launch anything unless it satisfies their 50-item checklist of requirements and survives months of planning meetings where they try to flex their knowledge on your team by picking the project apart.
    In my career it’s been either one or the other. I know it’s a spectrum and there must be a lot of room in the middle, yet it’s always been one extreme or the other for me.
    [-]
    - stackskipton 182 days ago
      Having work as Ops person at second place, most of time it ends up like that because Ops become dumping ground so they throw up walls in vain attempt to stem the tide. Application throwing error logs? Page out Ops. We made a mistake in our deploy. Oh well, page out Ops so they can help us recover. Security has found a vulnerability in a Java library? Sounds like Ops problem to us.
    - nucleardog 181 days ago
      I've observed the same and, anecdotally, how much of a pain the ops team is is a function of how much responsibility shifts from the dev team to the ops team during the project lifecycle.
      Basically, is the ops team there to support the developer/development team, or the product?
      In the case of the development team, the ops team will tend to be willing to provide advice and suggestions but be flexible on tooling and implementation. In the case of the product, the ops team will tend to be a lot more rigid and inflexible.
      This plays out in things like:
      When the PWA becomes critical for the production line and then "is not working" at 3AM, who is getting paged? If it's the developer, then ops is "supporting the developer". If it's the ops team getting called to debug and fix some project they've never laid eyes on before at 3AM, then it's the product. They are, naturally, going to start caring a lot more about how it is set up, deployed, and supported because nobody likes getting woken for work at 3AM.
      When some project's dependencies start running past EOL, who is going to update it? If it's the developer, then ops is "supporting the developer". If the ops team isn't empowered to give a deadline and have _someone else_ responsible for keeping the project functioning, then they're supporting the product and by letting it be deployed effectively committed to maintaining it in perpetuity and they're going to start caring a lot more about what sort of languages, frameworks, etc are used and specifically how projects are set up because context switching to one of dozens of different projects at 3AM is hard enough as-is without having to also be trying to learn some new framework du jour.
      (And before anyone says "well the updates probably aren't necessary this is just ops being a pain"--think of the case of a project relying on GCP product that's being shutdown or some kubernetes resource that's been changed. In one case inaction will cause the project to fail, in the other ops' action will cause it to fail. See the first point as to who is going to get called about that. Even in the happy case, consistency brings automation and allows the team to support a _class_ of deployments instead of individual products.)
      I don't think places exist stably in the middle ground because it's a painful place to be for very long. The responsibility and the control land on separate people, and the person with the responsibility but without the control is generally going to work to wrestle control to reduce misery. In the case where ops acts as if they're supporting the developers but is in practice supporting the product, it's not going to take too many 3AM calls before they start pushing back on how the product's deployed and supported.
      I've been both of those ops guys you describe. When I was the "checklists, meetings, and picking the project apart" guy it had nothing to do with me wanting to make anyone's life difficult or flexing my knowledge. It had to do with the 3AM calls waking myself, my wife, and my newborn up. If I was taking on responsibility for keeping your _product_ functional through its useful life, yeah, I wasn't going to let people dump stuff on my plate unless I had some reasonable basis to believe it wasn't going to substantially increase my workload and result in more middle of the night calls. The checklists were my way of trying to provide consistency and visibility into the process of reducing my own pain, not my way of trying to create pain for others.
  - beng-nl 182 days ago
    I am reminded of “I forgot how to count that low”
    https://news.ycombinator.com/item?id=28988281
- roncesvalles 182 days ago
  Also, I think people vastly overestimate how much uptime their application really needs and vastly underestimate how reliable a single VPS can be.
  I currently have VPSes running on both lowend and big cloud providers that have been running for years with no downtime except when it restarts for updates.
  [-]
  - kqr 182 days ago
    > no downtime except when it restarts for updates.
    This sounds a little like saying "all of North America except the U.S."
    I don't think people are worried about random breakdowns on a single VPS, but scheduled updates are still downtime, and downtime causes revenue loss regardless of why it happened.
    Any time a service is important enough I ask for two servers and a load balancer specifically to handle deployments and upgrade windows transparently. But! I agree services are usually less important than people think.
    [-]
    - RadiozRadioz 182 days ago
      > upgrade windows
      Ok, that explains this and the above comment. The last time I had to restart anything to apply an OS update was when I moved to a new RHEL LTS version, the lifespan of which is about 10 years. And there are many ways to do similar GNU/Linux upgrades without a restart at all.
      Does Windows Server really need to restart for updates like normal Windows? If so, that's hilariously crap and I'm glad I've never had to touch it.
      Edit: not saying a single VPS is fine if it's GNU/Linux, just remaking on the "restart to update" thing they mentioned
      [-]
      - HappMacDonald 182 days ago
        > to handle deployments and upgrade windows transparently
        GP might have meant "upgrade: Windows(tm)", or he might have meant "windows of time which we have allocated to upgrading the server", and on my first reading I interpreted the second without a single shred of thought towards the possibility of the first.
      - stackskipton 181 days ago
        Most people have to apply patches and if they don't install kpatch, there is generally a restart required to make sure everything is using new versions.
        Yes, you can restart all the services that probably slightly less downtime to full reboot on most VPS these days.
- ajayvk 182 days ago
  Having a single process web/app server simplifies things operationally. I am building https://github.com/claceio/clace, which is an application server for teams to deploy internal tools. It runs as a single process, which implements the webserver (TLS certs management, request routing, OAuth etc) as well as an app server for deploying apps developed in any language (managing container lifecycle, app upgrades through GitOps etc).
- benoau 182 days ago
  Shed a tear for Heroku, they made all this go away such a long time ago but ultimately squandered their innovation and the ~decade lead they had on other thinking in this fashion.
  [-]
  - the__alchemist 182 days ago
    Could you please clarify? I haven't noticed any impact to Heroku on my web applications; it.. just works, anecdotally. They send periodic mandatory upgrade emails re database and application stack, but they have been harmless so far; going back a decade.
    [-]
    - benoau 182 days ago
      They went from leading / pioneering horizontal scalability and database deploying and scaling and orchestration to "quiet-quitting" 15 years ago and doing almost nothing ever since - today they're barely worthy of mention in any discussion on any tech that solves these problems.
      [-]
      - the__alchemist 182 days ago
        We define worth. I mention Heroku as a great way to make web applications at various scales. Reliable, and easy to use.
        [-]
        nine_k 182 days ago
        Heroku was / is like a mainframe. You get provisioning, scaling, configuration, etc all sorted out for you, as long as you pay through the nose.
        [-]
        ksec 181 days ago
        I kind of think Railway and Fly.io is taking that spirit forward? Although I would love if Salesforce would just sell Heroku to somebody else and take it forward.
        the__alchemist 181 days ago
        So, it's a price issue? My confusion / out-of-the-loop is thus: I hear that Heroku has gone downhill, is no longer recommended etc, yet I've noticed no degradation personally. Works well, is stable, has all the benefits I initially liked it for.
        [-]
        benoau 181 days ago
        I mean their prices have never been great, but the sadness is because they should have WON - Heroku should have been the standard for deploying and managing hosting instances and databases and stuff on any cloud.
  - bigfatkitten 182 days ago
    Salesforce is what happened to Heroku.
- colechristensen 182 days ago
  It was always fun reimplementing some process 100x faster with a pipeline of grep and such on my laptop than somebody's hadoop cluster or whatever it was.
- Ferret7446 182 days ago
  I don't think you're agreeing to what the article is saying.
  The article seems to be saying, instead of using CGI which spawns a process per request, to have a single Web server binary in Go/whatever. Which is totally reasonable and per my understanding what everyone already does nowadays (are any greenfield projects still using CGI?)
  CGI is a "clever 'Unixy' hack" to add dynamicism to early web servers. They stopped being "relevant" a long time ago IMO.
  In fact, I think your diatribe actually contradicts the article.
  Basically, the article is saying that they went with the "simple" CGI approach which ended up creating more complexity than using the slightly more complex dedicated binary. The author essentially followed your advice which ended up causing more complexity and hacks.
  The morale of the story is, you need to use the right tool for the job, and know when to switch. Sometimes that is the simple path, sometimes that is not.
torginus 182 days ago
I was shat on multiple times by throbbing-brain architecture guys for this, but what I've done multiple times that worked in prod, was to start with a single monolith with dependency injection, and run all services in the same process.
Once the compute and robustness demands mandated we should scale the thing across multiple machines, both horizontally and vertically, I would replace most DI services with proxies that wrapped the original functionality and called out to a remote host.
So for this scaled up version, I would only need to change how the DI container got started, which could either be in 'Manager' mode (meaning a high level functionality, with the submodules injected as proxies calling out to services on client machines), 'Worker' mode (meaning it served the proxy requests on said worker machines) or 'Standalone' mode (meaning the DI container actually injected the actual fat versions of the services, that allowed the whole thing to run in a single process, very useful for local testing and debugging).
The was only a single executable which could be run in multiple 'modes' selected via a command line switch, which made versioning and deployment trivial.
[-]
- pdimitar 181 days ago
  In what programming language did you do this? Sounds like a nice mostly future-proof architecture.
  [-]
  - torginus 179 days ago
    It was C# with ASP.NET containers. Not the sexiest tech stack in existence, but it's really versatile. I kind of simplified the architecture in my post, there were like almost a dozen different 'roles', and not all of them ran off the same executable (since some roles had specialist requirements, with some running off of Windows, the others off Linux, so there was some real world messiness to account for), but the Standalone 'god' mode was there and pretty much worked well on dev machines.
Lord_Zero 182 days ago
I guess it depends on what world you live in. For example, using ASPNET Core, I just drop in this https://learn.microsoft.com/en-us/aspnet/core/performance/ra... and boom I have rate limiting and I do not have to stress about threads or state or whatever.
[-]
- colonCapitalDee 182 days ago
  ASP.NET Core truly is a joy to work with
- porridgeraisin 182 days ago
  That's rate limiting locally i.e single instance. It is a fairly trivial thing and isn't the topic here.
  [-]
  - Lord_Zero 181 days ago
    While that is true, at that point you should be rate limiting at the reverse proxy or load balancer.
    - Nginx https://blog.nginx.org/blog/rate-limiting-nginx
    - Caddy https://github.com/mholt/caddy-ratelimit
    - Treafik https://doc.traefik.io/traefik/middlewares/http/ratelimit/
- geewee 182 days ago
  That's almost certainly per server instance though, there's no mention of any type of synchronization across multiple instances, so if you e.g run many small ones or run the service as a lambda I'd be surprised if it worked like you expected.
  [-]
  - Lord_Zero 181 days ago
    At that point you should be rate limiting at the reverse proxy or load balancer.
    - Nginx https://blog.nginx.org/blog/rate-limiting-nginx
    - Caddy https://github.com/mholt/caddy-ratelimit
    - Treafik https://doc.traefik.io/traefik/middlewares/http/ratelimit/
    IMO Lambda is kind of an unfair example because the author doesn't mention having multiple instances. Plus a hot take I have is you should not be building an entire web-app as a Lambda or series of Lambda functions... AWS does not have solutions for load balancing in things like APIG so you would have to architect that via DynamoDB or ElastiCache which is the "extra layer or two of overhead" the author mentioned.
emacsen 182 days ago
All that's old is new again. More than a decade ago, I inherited a Node application and was able to use many of these exact techniques inside the application.
Even moreso, I could introspect the entire application state, including providing myself a shell and modify application state within the application. Keeping a blocklist inside a simple array- no problem! And being able to run a shell inside the same process meant I could inspect and even modify the array while the application ran.
That made it incredibly pleasant to use and run.
On the flip side, upgrades can be very challenging.
In a modern web application it's standard practice to run (at least) two instances of the application at once and use the load balancer to test both, or to drain jobs from one to the other. This is relatively easy if the applications are stateless.
Once the application holds all the state in memory, there's a real challenge. That array that seemed so clever- you'll need to serialize it so it can be reloaded at initialization time. Keeping all the session identifiers in memory- be ready to dump that.
Worse, if the application is not designed to share this state with another application, you're now in some trouble. This is fine if you're running a small site with a few users who can accept some downtime, but if you're running a serious service, you'd like to have some kind of upgrade path other than shutting the application down and starting it back up again.
"I'll just share state with other application servers", you might think, and then use something like ZeroMQ to transmit state", but once you think about sharing state between application servers, you realize you'd probably be better off using a tool like Redis, and you're right back where we started.
[-]
- loxs 182 days ago
  You can migrate to the heavier infra later. I am serving hundreds of concurrent users from a single Rust binary backed by Sqlite. So far it hasn't shown the slightest problems and migration can happen if the service grows an order of magnitude (or two) from here.
  [-]
  - emacsen 181 days ago
    Migration/upgrades need to happen no matter the size of the service, so I'm not sure what you mean.
    [-]
    - loxs 181 days ago
      Sorry, I mean porting to more flexible infra, not DB migrations. For these I use downtime right now... it's not a big deal.
koakuma-chan 182 days ago
> I discovered some that were making repeated requests at a quite aggressive rate, such as every five minutes
5 minutes sounds reasonable
remram 182 days ago
It's easy to have two implementations of your rate-limiting thing, in-process and Redis. Or change your implementation when you need it. Just put a nice interface in front of it.
There is a cost to the network synchronisation, so you definitely want to scale vertically until you really must scale horizontally.
gregors 178 days ago
>>>> The simple great thing about doing everything through a single process (with threads, goroutines, or whatever inside it for concurrency) is that you have all the shared state you could ever want, and that shared state makes it so easy to do so many things.
If you like the sound of this, check out Elixir/Phoenix
loxs 182 days ago
I do this for my personal projects.
Rust. Axum. Single compiled binary. Even html/js/css is baked into it via RustEmbed. Sqlite + litestream to S2. Cloudflare in front.
Works extremely well.
[-]
- pdimitar 181 days ago
  Do you have a link to such a complete app? Would love to take a look and learn.
  [-]
  - loxs 174 days ago
    No, sorry, they are not open source. But they are also not that hard to build. But as far as I know there is no ready solution to build all the boilerplate for you.
j45 182 days ago
Ironically, in a good way - simple scales, complex fails.
stana 182 days ago
Would love to explore this in Python. But would it be correct to assume single process service would not be as performant due to GIL?
[-]
- simonw 182 days ago
  The GIL means you only get to use a single core for your code that's written in Python, but that limit doesn't hold for parts that use C extensions which release the GIL - and there are a lot of those.
  The GIL is also on the way out: Python 3.13 already shipped the first builds of "free threading" Python and 3.14 and onwards will continue to make progress on that front: https://docs.python.org/3/howto/free-threading-python.html
  And honestly, a Python web app running on a single core is still likely good for hundreds or even thousands of requests a second. The vast majority of web apps get a fraction of that.
- nogul 182 days ago
  I’ve been running essential production systems with pythons uvicorn and 1 worker reliably, mainly because I get to use global state and the performance is more than fine. It’s much better than having to store global state somewhere else. Just make sure you measure the usage and chose a different design when you notice when this becomes a problem.
  [-]
  - dmurray 182 days ago
    I have inherited a suite of apps that do this. I have mixed feelings about this: on the one hand, many of them work just fine and will never need a second worker, and they were relatively easy for inexperienced devs to put together.
    On the other hand, some of them do have performance problems and are a nightmare to migrate to a different solution, it's hard to reason about what parts of the application turn out to be stateful.
fsckboy 182 days ago
what was that superfast web server, opensource of some sort, from about 25 years ago, single process, single thread? it just raced around a loop taking care of many queued i/o streams
[-]
- moebrowne 182 days ago
  Perhaps it's quark?
  https://tools.suckless.org/quark/
  [-]
  - fsckboy 178 days ago
    yes that was it, thx!
- wmf 182 days ago
  You might be thinking of Nginx.
  Today I would not recommend single threading since it won't be able to use multiple cores.
AStonesThrow 182 days ago
The filmmakers of “A Quiet Place” or the “Terminator” franchises should employ cks as a technical consultant
tryauuum 182 days ago
and no race conditions whatsoever
[-]
- Deebster 182 days ago
  Well, unless you're using threads.
  Or did I miss the sarcasm?
  [-]
  - tryauuum 177 days ago
    no sarcasm, the code won't have race conditions and sometimes it's great
    I actually recently planned to write a single threaded API to free myself from thinking about race conditions
  - jeffrallen 181 days ago
    If you're using Go right, you won't have race conditions.
throwaway984393 182 days ago
[dead]
lleymrl651 182 days ago
nice
theideaofcoffee 182 days ago
Then why not go to the extreme and just build a static version and serve that? Why do you need any dynamic content at all? Then you won't need shared state, then you won't need a database, then you won't need 90% of the stuff that dynamically driven pages need. Be the l33t h4x0r you are and just go static, save the complexity for your build process, because that shows the true hacker spirit. Hell, you may even be able to wring out another few blog posts from that.
Also, why are people submitting every single post from this blog recently? Does this person actually do any work at UToronto, or is he just paid to write? There are -8000- links to various pages under this domain. I hope it's just a collective pseudonym like Nicolas Bourbaki and one person didn't write 8000 pages.
I'm desperate to use some of the insights from a navel-gazing university computing center in my infrastructure: IPv6 NAT (huh? what? What?!), custom config management driven by pathological NIH (I know precisely zilch about anything at utcc but I can already say with 100% confidence that your environment isn't special enough to do that), 'run more fibers', 'keep a list of important infrastructure contacts in case of outages', 'i just can't switch away from rc shell', and that's just in the last six months. On second thought, I'll just avoid all links to here in the future to save my sanity.
[-]
- AStonesThrow 182 days ago
  He's been working at the same job, in the same place, since he graduated from college over 30 years ago. So that's an average rate of 5 blog posts per week. Not too shabby.
  And your surface-level scans indicate a lot of specialized deep-thinking about some specific tools. Sure, but you'll also find some good generalizations that arose from the depth and breadth of experience. He knows Linux like the back of his hand, and he's been using Debian and Ubuntu, and Fedora, so perhaps we can derive some takeaways from those? And thoughts on ZFS and anti-spam email hosting, those are good too.
  cks is the guy who influenced me to run Byron's rc shell, and also to install and run MH as my mail reader in 1993, and he also singlehandedly convinced me to install Ubuntu in 2006, which I maintained through multiple computers and upgrades through 2020. I cannot say that many, if any, of his blog posts were directly helpful to me, except for his opinions on PC hardware such as PCIe, parity RAM, and the like. But his wisdom is truly inspiring, as is his ability to stay with one employer for 100% of his career, doing more or less the same sysadmin things as he did in 1995.
- no_protocol 182 days ago
  The traditional pattern seen here of serving pages under a hierarchy called `~cks` indicate this is the personal site of someone who is affiliated with the university. Unless otherwise noted you should probably assume all the content is from "cks", not an army of dozens of coders.
DaSexiestAlive 182 days ago
Single-thread could be a thing if it's like a full-stack all sitting in a web browser--like Dioxus is going toward..
If a web browser is in a glorified chromebook like a 2025 Macbook Air, indeed there's a lot of breathing room. A lot of ram. Processing power. Cores. It's nice. I get that.
And then you can do off-line first: meaning use the cached local storage available to WASM apps.
Then whatever needs to go to the mother ship, then call web apis in the cloud.
That would, in theory, basically giving power back from "net pc theory of things" back to "fat client"--if you ask the grey-haired nerds among you. And you would gain something.
But outside of a glorified chromebook like a 2025 Macbook Air--we have to remember that we are working with all kinds of web devices--everything from crap phones to satellite servers with terabytes of ram--so the scalability story as we have it isn't entirely wrong.
I have been to U of Toronto, very smart people. But honestly this is a troll piece. Doesn't go into any depth and one-sided. Unhelpful. I think U of Toronto's reputation would be better served by something more sophisticated than this asinine blog entry.