I would perhaps not so much look for spelling errors. For me writing the sermon is the goal, diving into the deep and exploring the topic, passage and such. If I preach it, it is kind of a bonus, or a driver to begin with :-)
It would be interesting to extend it to zip, which is what redbean/greenbean use to serve static assets.
Back in school, I worked on a project called Velox, with a partner - the idea was to take a bz2-compressed dump of the giant XML export of Wikipedia, and write a program to serve that copy of Wikipedia from disk (this was in 2008-2010? in my master's program, so before Kiwix and the amazing zim dumps they produce). My partner worked on the UI and indexing, and I was focusing on how to parse the bz2 compression format to locate article boundaries in the (giant) XML dump that Wikipedia provides. I ended up putting a lot of time into it because it was a bunch of fun.
Writing this just sent me back to the presentation I made. The slide I wrote back then said:
> Significant original work went into creation of archive access. The Apache BZip2 library that is part of Ant was used as a basis for archive access.
> Modified to support random access to a given byte/bit offset pair within the compressed data stream (BZip2 is not a byte-aligned format)
> Extended to index all BZip2 block positions, allowing Java-based pseudo-random access to BZip2 compressed data
> Extended to map article IDs to block numbers for constant-time article retrieval, even in BZip2 archives exceeding 5GB in size
> Current article retrieval times are ~2 seconds.
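To illustrate the block-indexing idea from the slide, here is a minimal Python sketch (not the original Java/Ant-based Velox code). It scans a bzip2 stream bit by bit for the 48-bit block header marker `0x314159265359` and records each bit offset. The function name `find_block_bit_offsets` is hypothetical, and the offsets are only candidates, since the same bit pattern could in principle occur by chance inside compressed data.

```python
import bz2

BLOCK_MAGIC = 0x314159265359  # 48-bit marker that starts every bzip2 block
MAGIC_BITS = 48


def find_block_bit_offsets(data: bytes) -> list:
    """Return the bit offset of every candidate block header in `data`.

    bzip2 packs bits most-significant-bit first and does not align blocks
    to byte boundaries, so the scan has to slide one bit at a time.
    """
    offsets = []
    window = 0
    mask = (1 << MAGIC_BITS) - 1
    for bit_index in range(len(data) * 8):
        byte = data[bit_index >> 3]
        bit = (byte >> (7 - (bit_index & 7))) & 1
        window = ((window << 1) | bit) & mask
        if bit_index >= MAGIC_BITS - 1 and window == BLOCK_MAGIC:
            offsets.append(bit_index - (MAGIC_BITS - 1))
    return offsets


if __name__ == "__main__":
    compressed = bz2.compress(b"hello wikipedia " * 1000)
    # The first block follows the 4-byte "BZh<level>" stream header.
    print(find_block_bit_offsets(compressed))
```

An index like this, plus a separate map from article ID to block number, is roughly what makes "seek to a block and decompress only that block" possible; the pure-Python bit loop is far too slow for a multi-gigabyte dump and is only meant to show the shape of the scan.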
This is back when the archive was ~7GB IIRC. My Kiwix dumps today are ~120GB, but that includes images.
This is the link to the presentation in Google Slides that we wrote back in 2008 or so. The version history shows 2013, but I think some kind of import/conversion happened around that time.
Zip isn't useful for random access here; the problem with random access in HTTP serving is that you then have to decompress the data and potentially recompress it.
The more interesting trick you can do with zip files for HTTP serving is to serve the compressed deflate stream as gzip, or use Zstd inside zip. Then you have a valid zip file from which bytes can be served directly.
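A rough Python sketch of the "serve the deflate stream as gzip" trick, assuming the member was stored with the deflate method: the raw compressed bytes are read straight out of the archive and wrapped in a fixed 10-byte gzip header plus the CRC-32 and size that the zip directory already records. The function name `zip_member_as_gzip` is made up for illustration; a real server would stream the byte range instead of building the response in memory.

```python
import struct
import zipfile
from typing import BinaryIO

# Minimal gzip header: magic, CM=8 (deflate), no flags, mtime=0, XFL=0, OS=unknown
GZIP_HEADER = b"\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\xff"
LOCAL_HEADER_SIZE = 30  # fixed part of a zip local file header


def zip_member_as_gzip(archive: BinaryIO, name: str) -> bytes:
    """Return a gzip stream for `name` without decompressing or recompressing."""
    with zipfile.ZipFile(archive) as zf:
        info = zf.getinfo(name)
    if info.compress_type != zipfile.ZIP_DEFLATED:
        raise ValueError("member is not deflate-compressed")

    # The local header has variable-length name and extra fields,
    # so read their lengths to find where the compressed data starts.
    archive.seek(info.header_offset)
    local = archive.read(LOCAL_HEADER_SIZE)
    name_len, extra_len = struct.unpack("<HH", local[26:30])
    archive.seek(info.header_offset + LOCAL_HEADER_SIZE + name_len + extra_len)
    raw_deflate = archive.read(info.compress_size)

    # gzip trailer: CRC-32 and uncompressed size modulo 2**32, little-endian.
    trailer = struct.pack("<II", info.CRC, info.file_size & 0xFFFFFFFF)
    return GZIP_HEADER + raw_deflate + trailer
```

The result can be sent with `Content-Encoding: gzip`; the only bytes the server synthesizes are the 18 bytes of header and trailer.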
Well, according to the first paragraph of the section titled "One tarball, served in place":
The whole site is a single tar file. zeroserve indexes it on load - building a path -> byte-range map - and then serves files by issuing byte-range reads against the tarball itself. Nothing is ever unpacked to disk. The site lives entirely in that one file, so there's no document root for a stray location rule to expose, and a deploy is a single atomic file swap.
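The quoted design can be sketched in a few lines of Python. This is not zeroserve's code, just an illustration of the path -> byte-range idea using the standard `tarfile` module on an uncompressed tarball; `build_index` and `read_range` are hypothetical names.

```python
import tarfile


def build_index(tar_path: str) -> dict:
    """Map each URL path to the (offset, size) of its file data in the tarball."""
    index = {}
    # "r:" opens the archive as uncompressed tar only, so offsets are real file offsets.
    with tarfile.open(tar_path, "r:") as tar:
        for member in tar:
            if not member.isfile():
                continue
            name = member.name
            if name.startswith("./"):
                name = name[2:]
            index["/" + name] = (member.offset_data, member.size)
    return index


def read_range(tar_path: str, index: dict, url_path: str) -> bytes:
    """Serve one file by reading its byte range directly from the tarball."""
    offset, size = index[url_path]
    with open(tar_path, "rb") as f:
        f.seek(offset)
        return f.read(size)
```

Nothing is unpacked: after indexing, a request is one dictionary lookup and one ranged read, and swapping in a new tarball replaces the whole site at once.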
OTOH, that could be an LLM justification, since the copy is littered with -isms like "the right shape" or "the surface is broad".
"requires" is of course subjective, there are always multiple ways to do something. But sometimes it is convenient to model a system as concurrent execution streams, for example: multiple sessions (servers), multiple entities (games, robotics), multiple in-flight transactions (any kind of i/o or concurrent compute). Agreed these are often C++ use-cases but there are obvious benefits to using Erlang or other virtual machines: memory safety, isolation, fault tolerance.
from experience, during bursts it's never the actual web/API server that is bogged down, it's the downstream IO bottlenecks.
if your accepting layer is abstracted away and implemented correctly, there is very little performance difference between different concurrency approaches and all you're exposed to as developer is implementation of your handler functions.
Not the case; good abstractions are valuable, but the performance differences between runtimes are very real.
Take the example of some simple HTTP<->blob store service that gets slammed with millions of requests when someone using the API does a backfill via some framework on their end that aggressively scales request volume up and out.
Something like, say, async Python/starlette with a coroutine per request is gonna perform slightly worse than Erlang, which in turn is gonna perform much worse than Go.
You're right that those differences are sometimes marginal when the latency of whatever IO the backend's doing dominates the equation. However, in my experience huge volume surges show issues with the runtime (the thing managing/launching multiplexed request handler routines) or the ecosystem (the backend IO libraries' ability to work with the runtime's IO multiplexing and make things like request coalescing easy or automatic) more often than you'd think.
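For readers unfamiliar with request coalescing: the idea is that concurrent requests for the same key share one backend fetch instead of each issuing their own. A minimal asyncio sketch, with a hypothetical `Coalescer` class and a caller-supplied `fetch` coroutine standing in for the real blob store client:

```python
import asyncio


class Coalescer:
    """Share one in-flight backend fetch among concurrent requests for the same key."""

    def __init__(self):
        self._inflight = {}  # key -> asyncio.Task for the fetch currently running

    async def get(self, key, fetch):
        task = self._inflight.get(key)
        if task is None:
            # First request for this key: start the fetch and publish the task
            # so that later requests can wait on it instead of hitting the backend.
            task = asyncio.ensure_future(fetch(key))
            self._inflight[key] = task
            task.add_done_callback(lambda _t: self._inflight.pop(key, None))
        return await task
```

This sketch ignores cancellation (cancelling one waiter cancels the shared fetch) and does no caching after the fetch completes; it only shows why a surge of identical requests does not have to become a surge of identical backend reads.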
It really takes surprisingly little volume to cripple a return-hello-world Phoenix app that indirects the "hello world" behind way too much middleware and message passing; it takes even less to kick over, say, a Gunicorn instance returning "hello world" at the bottom of the Django middleware stack. Golang with Gin, on the other hand, is surprisingly hard to cripple in the same way. And I say that as someone who likes Elixir and Python a lot more than I like Go!
Thank you. As a guy who made a career out of Elixir (and has begun to regret it recently but oh well) I agree that Elixir's throughput is not amazing. However, it can get very far and we should always optimize for the most common usages.
I've personally rewritten one hobby and one professional project from Elixir to Golang and loved the result; as you said, extremely difficult to bring down a Golang service to its knees.
One clarification: Phoenix server behind Caddy/nginx fares better btw. But, details. Your point stands.
I have yet to see a Rust web/API service I wrote _ever_ buckle under pressure and just crash. It was either an application bug (like the famous Cloudflare `.unwrap()` error from the last weeks/months) or the Linux OOM killer. Literally never crashed. But I did witness it brutally murder a MySQL cluster because it couldn't serve it fast enough. That was both fun and terrifying to watch on the dashboards.
> I did witness it brutally murder a MySQL cluster because it couldn't serve it fast enough. That was both fun and terrifying to watch on the dashboards.
Haha yep. In my experience, everyone running CGI/process-per-request application servers is bullish on switching to a concurrent or cooperative runtime...until they realize they just removed the primary ratelimiter on downstream DB/service accesses.
The converse war stories are also amusing: people rewrite their whole app in a concurrent/asynchronous framework and nothing changes, because the DB driver is still farming out all queries to a tiny fixed-size threadpool of connections that was the bottleneck all along.
Oh yeah, definitely. If your DB server (or any storage backend) cannot have like 200+ connections alive at all times then it's absolutely pointless rewriting your app in Elixir or Golang. You'll just serve DB timeouts in your responses.
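One way to put back the rate limiter that process-per-request servers provided implicitly is an explicit cap on in-flight downstream calls. A small asyncio sketch under that assumption; `BoundedBackend` and `run_query` are illustrative names, not part of any particular DB driver:

```python
import asyncio


class BoundedBackend:
    """Cap the number of concurrent downstream queries, however many requests arrive."""

    def __init__(self, max_in_flight: int):
        self._slots = asyncio.Semaphore(max_in_flight)
        self.in_flight = 0
        self.peak = 0  # highest concurrency observed, for dashboards or tests

    async def query(self, run_query):
        # Requests beyond the cap wait here instead of piling onto the database.
        async with self._slots:
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
            try:
                return await run_query()
            finally:
                self.in_flight -= 1
```

The cap should roughly match what the DB can actually sustain; set it too low and it becomes the same tiny-pool bottleneck described above, set it too high and the database takes the full surge.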
> You're right that those differences are sometimes marginal when the latency of whatever IO the backend's doing dominates the equation. However, in my experience huge volume surges show issues with the runtime (the thing managing/launching multiplexed request handler routines) or the ecosystem (the backend IO libraries' ability to work with the runtime's IO multiplexing and make things like request coalescing easy or automatic) more often than you'd think.
fair enough, although at this point we start talking about LB in front of the thing, consumption mechanics, autoscaling signals
i will still maintain that my simple advice for a dev worrying about scale is that they should focus their efforts on ensuring downstream IO doesn't get overwhelmed (db read replicas, caching, etc) before optimizing runtime performance or autoscaling out unnecessarily.
> focus their efforts on ensuring downstream IO doesn't get overwhelmed (db read replicas, caching, etc) before optimizing runtime performance or autoscaling out unnecessarily.
All good advice, but the choice of runtime can affect the point at which autoscaling and load balancing even need to enter the conversation at all. Optimizing, say, a mostly in-memory cache service and writing it in Golang may yield results like "we can run a single instance of this and serve three orders of magnitude of business growth; slap it behind a DNSRR or a k8s NodePort for update/replacement/fast failover if it crashes, no complex load balancer needed", where writing the same thing in, say, PHP might require discussing orchestration/load balancing/memory/worker process recycling/autoscaling early on in the service's lifetime. Being able to skip those conversations (entirely or for a long time) is a very significant business benefit.
(Upvoted for a really relevant and valuable refinement to the thread.)
Admittedly that layer is almost always abstracted, i.e. AWS / GCP and various other smaller hosted solutions that handle a good chunk of load balancing for us. In that landscape BEAM VM's strengths shine even brighter. I've seen firsthand that you can in fact bring a BEAM VM to its knees if you expose it just like that to the net. It's not pretty. Golang fares a touch better and Rust seems almost immune (provided one does not screw up their caching layer and doesn't make elementary N+1 query mistakes).
I don't think you're missing anything; the speed is the impressive thing but the model is limited. Perhaps Chatjimmy is more impressive in this respect. Most people haven't tried either.
You can get something pretty fast right now with a Cerebras Coder subscription, sadly I think the best model they had last I checked was the somewhat dated GLM 4.7: https://inference-docs.cerebras.ai/models/overview
I feel like if they got DeepSeek V4 Flash and Pro running on their hardware, even if at less than 1000 tok/s, they’d still be crushing it with any subscription they’d provide, given how generous their token limits were.
As for the demo, it's fast and extremely dumb, as expected for 2B. I asked how to stop a drinking habit and in just one follow-up message it recommended trying 8% ABV. Hilarious.
looks like it! didn't know about Yosemite's Half Dome but that makes perfect sense considering it's in the Sierra Nevada mountain range. It's gonna have to stay an orca in my headcanon though
Just got an email this morning saying my monthly $3 donation went through, and this article reminded me how the Internet Archive is truly the internet’s library and very worthwhile to support