For stateful systems, tests named after setup details often get weakened over time. Tests named after the claim they are trying to falsify are harder to water down.
The part I’d be most interested in is how well this works for business invariants like idempotent posting, no lost acknowledgements and recovery after partial failure.
I think these scripts fall down when they rely on context rather than acting as actual guardrails. What we need are siloed protocols, something like an SSH-style protocol, so the harness does its work through the protocol rather than through a bunch of loosely related bash scripts. Plus, the harness needs to live outside the environment, so you never have to install it on a remote system, whether that's a container, a VM, or an SSH host. We shouldn't base everything on running bash without a secure tunnel into the location of interest.
The failure mode of these tools is self-destructive in many cases.
Idempotency is what bites me most in practice — I've been driving these against an unreleased database I work on. The main trap is using the op_id as the idempotency key rather than a business key the client reuses on retry. When they're the same thing, the checker is trivially true and the test passes without testing anything.
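To make that trap concrete, here's a rough sketch, not the actual checker. The `Op` record, its field names, and `duplicate_keys` are all made up for illustration. The same history passes or fails depending only on which key the checker groups by.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    op_id: str         # hypothetical: unique per attempt, assigned by the harness
    business_key: str  # hypothetical: reused by the client when it retries
    amount: int


def duplicate_keys(applied, key):
    """Return the keys that show up more than once among applied ops."""
    counts = Counter(key(op) for op in applied)
    return sorted(k for k, n in counts.items() if n > 1)


# A retried payment that the system wrongly applied twice.
applied = [
    Op("op-1", "payment-42", 100),
    Op("op-2", "payment-42", 100),  # retry: new op_id, same business key
]

# Keyed by op_id, every op is unique by construction, so nothing is flagged.
by_op_id = duplicate_keys(applied, key=lambda op: op.op_id)

# Keyed by the business key, the double application is visible.
by_business_key = duplicate_keys(applied, key=lambda op: op.business_key)
```

Here `by_op_id` comes back empty while `by_business_key` flags `payment-42`, which is the "passes without testing anything" case.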
No-lost-ack is conceptually the same shape with a simpler property (every acked write shows up at the end), but it breaks the same way most checkers break — if the recorder treats timeouts as success or failure instead of "unknown," real lost writes silently disappear.
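A minimal sketch of what I mean by keeping "unknown" as its own outcome. The names (`Outcome`, `record`, `check_no_lost_acks`) are hypothetical, and which exceptions count as indeterminate is an assumption you'd have to adjust for your client library.

```python
from enum import Enum


class Outcome(Enum):
    OK = "ok"
    FAIL = "fail"
    UNKNOWN = "unknown"


def record(call):
    """Run one write and classify the result without guessing."""
    try:
        call()
        return Outcome.OK
    except (TimeoutError, ConnectionError):
        # Assumption: these mean the write may or may not have been applied.
        return Outcome.UNKNOWN
    except Exception:
        # Assumption: any other error is a definite rejection.
        return Outcome.FAIL


def check_no_lost_acks(history, final_state):
    """history: list of (value, Outcome); final_state: set of values read at the end."""
    acked = {v for v, o in history if o is Outcome.OK}
    unknown = {v for v, o in history if o is Outcome.UNKNOWN}
    lost = acked - final_state                  # acked but missing: a real bug
    unexpected = final_state - acked - unknown  # present, never acked or in doubt
    return lost, unexpected


history = [("a", Outcome.OK), ("b", Outcome.UNKNOWN), ("c", Outcome.OK)]
lost, unexpected = check_no_lost_acks(history, final_state={"a", "b"})
```

In this history `c` is reported as lost, and `b` is allowed to be either present or absent because its outcome was never known.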
Recovery after partial failure is where the AI-agent angle gets shaky, honestly. Quiescence is the hard part. Agents will declare a system "recovered" while compaction is still running in the background. The skill forces a three-part check (no in-flight ops, no pending background work, replicas converged) before the invariant runs. How reliably that holds up against a specific SUT, I'm still figuring out.
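Roughly, the three-part gate looks like the sketch below. This is not the skill's actual code: the `Probe` fields are hypothetical stand-ins for whatever the SUT exposes, and requiring several consecutive quiet polls (`stable_polls`) is my own assumption, added so a brief gap between background jobs doesn't count as recovered.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Probe:
    in_flight_ops: int            # hypothetical: client ops not yet completed
    pending_background_jobs: int  # hypothetical: compaction, repair, and so on
    replica_digests: list[str]    # hypothetical: one state digest per replica


def is_quiescent(p: Probe) -> bool:
    """All three conditions must hold at the same time."""
    return (
        p.in_flight_ops == 0
        and p.pending_background_jobs == 0
        and len(set(p.replica_digests)) <= 1  # every replica reports the same digest
    )


def wait_for_quiescence(
    probe: Callable[[], Probe],
    timeout_s: float = 30.0,
    interval_s: float = 0.5,
    stable_polls: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once the system has looked quiet for stable_polls polls in a row."""
    deadline = time.monotonic() + timeout_s
    streak = 0
    while time.monotonic() < deadline:
        streak = streak + 1 if is_quiescent(probe()) else 0
        if streak >= stable_polls:
            return True
        sleep(interval_s)
    return False  # caller should fail the run rather than check the invariant
```

The invariant check only runs if `wait_for_quiescence` returns True. A timeout is treated as a failed run, not as "probably recovered".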
If postinstall scripts are restricted, the people behind these attacks will switch to something else. Package code is executed automatically by Node when it's imported, which could be a good replacement. It would probably run when tests run rather than at install time, but it's still going to run for most people.
Limiting post install as an attack vector is still a good thing.
Node is working on a permission model, similar to Deno's, that lets you explicitly grant access to specific system resources (https://nodejs.org/api/permissions.html). Using it should help reduce the impact of malicious code, though it's unlikely to help if you grant wide permissions.
> Limiting post install as an attack vector is still a good thing.
If npm got rid of post install scripts, it would permanently break the install process of packages that use them. Affected systems would need to bypass the restriction, stay on an old npm version, or upgrade the packages to versions that work without post install. Meanwhile, attackers switch to a different attack vector and continue.
I said limit post install, not remove them. Having an allow list in package.json of packages which can run post install would work fine. Pnpm already does this.
Having said that I'm not against full on removal of post install either. It would get more pushback, but would still be possible for people to manually run the post install for the few packages that require it, or to add them as a script in package.json.
I agree with this. Moving the git repo is easy; moving the whole project surface is the hard part.
Issues, releases, CI, docs, security advisories, search and discoverability all tend to get coupled to GitHub over time.
For open-source projects, I like the idea of self-hosted as the source of truth, but still keeping a read-only GitHub mirror so people can actually find it.
Ideally, we'd stop equating "actually finding" a project with GitHub.
We let Microsoft parasitize our brains with this. The software community has long had alternate forums. GitHub isn't even a particularly good one, and it's recently just become a swamp of generated content, fake stars, and mining your content.
In the last couple months at least once a week I get some LLM generated phishing spam from some bot that "found your projects on GitHub and want to collaborate" etc.
And it's well documented now how you can just go out and "buy" GitHub stars.
...Maybe that's the answer, we need a "hub" for the smaller missing things to start, you pop in your git repository when you join, and it can sit as a thin layer over your repo with issues, releases, etc... Sounds like a lot of work, but doing it piecemeal would do it.
I think trying to re-host git itself might be more trouble than it's worth. My kingdom for someone to build this so I don't have to use ADO boards anymore.