What you're describing only applies to security or biotech downgrades. A downgrade related to the model believing that you're doing something related to model development is invisible and silent and internal.
When I reported this, Anthropic sent me an email on Tuesday saying, "You have been approved into the Cyber Verification Program", but it's still downgrading. Is this a bug? What's the point of the Cyber Verification Program if Fable 5 downgrades when you tell it to write secure code?
They've publicly apologised for the invisible PEFT that deliberately makes the model dumb on some tasks. Whether they still do it, or will once again do it in future in more subtle ways, is something we can't verify.
Personally I think they have proven themselves to be the stewards of AI in the same way Exxon Mobil are the stewards of petroleum.
> We will require 30-day retention for all traffic on Mythos-class models, on both first- and third-party surfaces. We won’t use this data to train new Claude models, or for any non-safety-related purpose
Whatever problem we might have with them, they explicitly say in the launch post that they do not do this.
It does leave me wondering about other runtimes that could be used as the go-between, though. If you're compiling Rust anyway, an approach like Cloudflare's Pingora (https://github.com/cloudflare/pingora), which I've tried before, should in _theory_ be a 'nicer' solution; it has just been awkward historically when I've tried to use it the way I'd have liked. Wish it were more library-shaped!
I don't think they 'index' videos, per se. They just point the model at the video's transcript on demand when you ask a question, I believe. Doesn't change any of your conclusions, though. You're absolutely right, they have an absolute ton of data.
> Anthropic’s CFO testified under oath this March that the company spent $10 billion on compute and made $5 billion in revenue (Ed Zitron has the math). The labs are underwater on inference. They’re raising prices to keep the lights on.
'The labs are underwater on inference' is an absurd thing to say whilst not separating the cost of _compute_ out into training and inference.
According to Dario Amodei, Anthropic are even profitable when including training costs, as long as you look at it on a per-model basis; it’s just that every model is more expensive to train than the last one.
For instance, if you have already spent $n to train a model and are currently earning $2n selling inference with it, but are concurrently spending $3n training the next model in anticipation of earning $6n with it, then you are already in the hole for $n and are currently also losing $n. But you are doubling your money with each model, because your $n investment in the first model returns $2n and your $3n investment in the second model returns $6n.
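To make the arithmetic concrete, here's a rough sketch of that example in Python. The `Model` class and the variable names are made up for illustration, and it assumes the "earning" figures are already net of the cost of serving inference, which the example above doesn't spell out.

```python
from dataclasses import dataclass


@dataclass
class Model:
    training_cost: float      # spent up front, before the model earns anything
    inference_revenue: float  # earned while the model is on sale (assumed net of serving cost)


def per_model_return(m: Model) -> float:
    """Revenue multiple on the training investment for one model."""
    return m.inference_revenue / m.training_cost


n = 1.0  # arbitrary unit; the example only depends on ratios
first = Model(training_cost=n, inference_revenue=2 * n)
second = Model(training_cost=3 * n, inference_revenue=6 * n)

# Before the first model sells anything, its training bill is already paid.
hole_before_sales = -first.training_cost

# Snapshot of the current period: model 1's revenue comes in
# while model 2's training spend goes out.
current_period_cash_flow = first.inference_revenue - second.training_cost

# Running total at the end of that period.
cumulative = hole_before_sales + current_period_cash_flow

print(per_model_return(first), per_model_return(second))
print(hole_before_sales, current_period_cash_flow, cumulative)
```

Each model returns twice its training cost, yet the company-wide snapshot is negative in both the sunk and the current figures. Which of those two views counts as "underwater" is exactly what the replies argue about.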
How is training vs inference any different than other product spaces, where all the costs of bringing a product to market have to be considered for profitability? You can't just look at marginal production cost. You are still underwater if the other development costs are not being recouped by the final sales revenue.
The whole commercial AI enterprise is not economically viable if the inference revenue will not cover both inference and the amortized training costs. Given how fast they are churning through models to compete, you cannot act like the training is an asymptotically low cost.
In any reasonable setup, hovering would be a rare, rare operation (like 30-60 seconds during takeoff and landing), with most of the time spent in wing-borne forward flight – which'd be _wildly_ lower power usage, more like 200-250kW tops. Roughly on par with staying in continuous acceleration in an EV. More for sure, but not nearly as insane as what you're pointing to.
... and this is exactly where better batteries would help – being able to hold that power level for longer so you could actually go places in earnest without untenable mass.
Is it? If we're talking about a future where EVTOL takes over for passenger cars, there will be air traffic jams with delays that require extended circling and likely hovering.
There's a reason all the EVTOL startups show individual vehicles landing in pristine fields, and it's the same reason car advertisements show one car on a closed course instead of I-95 at 3pm on a Friday.
... air traffic jams? The air is _much_ bigger than the corresponding ground.
Certainly there'd be density _at_ take-off and landing, but even that's manageable by having e.g. arrival/departure locations at multiple heights.
It also seems vanishingly unlikely (at this point) that we'd have EVTOL that's not fully autonomous, which further reduces the odds of this: near-perfect, coordinated driving, plus foreknowledge of what's happening between you and the arrival location, drastically reduces traffic.
... because the entire point of VTOL (which is what the parent commentary was about) is that you can take off and land vertically and therefore don't need one of a few, scarce, super-long runways? ... and the waiting you're talking about is entirely because of those?
On top of that, small VTOL craft that can hover and would be at lower speeds closer in (esp. autonomously flown) would just need less mutual clearance compared to jets, which also have an altitude band they have to stay in, as well as no ability to slow to a crawl and coordinate finely.
You asked me why the problem of circling waiting for your turn would vanish when using VTOL aircraft. I don't know how to respond to that with anything other than, "That's the entire point of VTOL. It doesn't need one of those scarce runways that planes circle waiting for."
My bad! You do list that you're an aeronautics person. I would genuinely love to understand what I'm missing – I'm sure there's some context here that I'm lacking!
If you want many things to land at approximately the same time and place, you need a little bit of play to schedule the arrivals/departures and ensure that you don't have collisions. There is a limit to the number of aircraft you can safely cram into any given volume of space.
Any aircraft you imagine will circle at landing and possibly loiter for minutes while waiting for its turn to use the airspace. (Edit0: see helicopters)
Building an open skyscraper for aircraft to land on will not save you, since each craft will lock down a large part of the building to land/depart safely. And it's not clear to me that it would be profitable.
Then there are many other problems with energy density and aircraft weight, which limit who could possibly use those craft.
Have a good one!
Edit1: I don't know about you, but my city doesn't have enough parking for cars. I'd be surprised if there were enough parking for EVTOL everywhere - you could very well need to loiter waiting for a spot to open, could need an emergency landing if you run out of power, many many imperfect things that make the house of cards fall apart.
This reminds me of that in a good way – a small Linux device that doesn't have to maintain a screen all the time (power) or focus on real-time but has physical buttons, connectivity, a microphone and a sealed case so it can be thrown in your pocket would be... an absolute dream.
Counter to some others here, I would buy this at whatever cost if it lived up to that intent!
As others have said, it's just a fraction. I'm at a medium-sized tech-related company and we have 7500+ in one GitHub org. We have two orgs, so altogether easily 10K+. Of course most of it is stale, obsolete, sandbox, personal tools, etc. I wouldn't be surprised if GitHub had 100K+ internal repos or even more.
Not OP, but I used to work at a large company with a similar number of repos.
When I left about a year ago, we had just started (after being on GitHub for almost 8 years) an ongoing project of first archiving old/outdated repos in place, and then moving them to an "archived" sub-org, and waiting to see if anyone complained.
Previously no one wanted to outright delete or remove repos because of the risk that someone somewhere was relying on it, and also there was no actual downside to just leaving them there (no cost savings, no imminent danger other than clutter, etc), so resources were never allocated to do it. There was always something more important to work on.
In an org with a higher floor of engineering management, a proactive program for removing unused or outdated repos would absolutely be expected, though, I think.
This is a continual fight for me. At nearly every company I've had to compromise on using a graveyard repo for packages within a monorepo, even though git has the whole history already.
The problem with history is that you need to know when to look. If you're looking for some old code that you know existed but you don't know exactly what it was, you can't just browse to go and find it.
I worked for a food retail store once. I remember going in the first day wondering, how hard can it really be... From the outside, it looks like they have a simple website. The website to order things on was an amalgamation of 300+ repos. GitHub lost less in this breach. It takes a lot of effort to keep things simple as you grow.
Something cool that I've always liked about working at GitHub is how much of the company _runs on GitHub_ -- A lot of teams, even non-technical teams, have their own repos just to organize docs/SOPs/designs/etc like a traditional knowledge work company might use SharePoint.
Personally I have over a hundred, especially from quick prototypes, studies or instances of templates so I can easily see how over 18 years and many hundreds of employees you end up with thousands.
I remember working at a company with at least 5,000 repos across five or six GitHub orgs, plus more stuff in Perforce.
Probably some old experiments in there but the company had its fingers in a few pies and some departments didn't mind creating yet another service to solve a problem.
I definitely archived the old stuff in my department (we had eight repos and that felt like enough for three people).
In my personal experience, give it a decade or two, and any corporation will accumulate hundreds (or even thousands) of abandoned internal repos containing discontinued services, POCs/prototypes that never went anywhere, etc – people forget to archive them, or aren't sure whether something is still in use or not so err on the safe side.
AI is making this even worse. With coding agents, anyone can throw together a quick internal prototype of any idea they have, even if it has no hope of ever making it to production.
Maybe AI will make it better, though: assign agents to monitor, maintain and keep repos up to date, or refer them via A2A to an agent that disposes of them in accordance with company requirements. I actually think AI will greatly help with this type of problem.
Autoarchiving repos which nobody has used in X years doesn’t require any AI - you can just write a bot to do it. People don’t, because it isn’t a priority. AI can make writing such a bot a bit easier, but can’t help much with getting approval from the powers that be to run it.
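For what it's worth, the selection half of such a bot is only a few lines. This is a minimal sketch, not anyone's actual tooling: the `Repo` record and `stale_repos` function are hypothetical, and it treats "last push" as a stand-in for "nobody has used it", which is an assumption (a repo can be read or depended on without being pushed to).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional


@dataclass
class Repo:
    # Hypothetical record; a real bot would fill this from the code host's API.
    name: str
    last_pushed: datetime
    archived: bool = False


def stale_repos(repos: Iterable[Repo], max_idle_years: float,
                now: Optional[datetime] = None) -> List[str]:
    """Names of unarchived repos with no push in the last `max_idle_years` years."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=365 * max_idle_years)
    return [r.name for r in repos if not r.archived and r.last_pushed < cutoff]


# Example with a fixed "now" so the result doesn't depend on when it is run.
utc = timezone.utc
repos = [
    Repo("old-prototype", datetime(2019, 1, 1, tzinfo=utc)),
    Repo("active-service", datetime(2024, 6, 1, tzinfo=utc)),
    Repo("already-done", datetime(2010, 1, 1, tzinfo=utc), archived=True),
]
candidates = stale_repos(repos, max_idle_years=3, now=datetime(2025, 1, 1, tzinfo=utc))
print(candidates)

# The archiving step itself is deliberately left out: it would be one API call
# per candidate against whatever code host is in use, and running it is the
# part that needs sign-off.
```

Keeping the selection as a pure function means the list can be reviewed (or posted for objections) before anything is actually archived.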
Really? I mean, these are internal repos. Probably most of them are random one-off experiments or a place to park code. Google has 2,900 "public" repos on GitHub. Microsoft has ~8k "public" ones on GitHub too. Can't even imagine how many they have on their internal systems.
No, there's no joke, you might have just misread the article (the 3,800 number is the number of internal GitHub repos the employee had downloaded on their personal computer / had access to on their own GitHub account)
Because everything in GitHub is designed for growth:
Easy to create a repo, very hard to delete it (a lot of scrolling, clicking, copy/pasting the full name of the repo, etc.)
I mean "Deleting", not "Archiving".
MS and GitHub need their numbers to go up, not people cleaning up their repos to tie off loose ends.
I have hundreds of them, and it took me a few hours to delete the unused ones.
In a medium-sized org with thousands of them, it will take weeks for security to do a cleanup.
Google's 3.5 Flash – which came out yesterday – is 200-300 tokens/second (albeit purportedly inefficient in its use of reasoning tokens) and according to Google, 800-1500+ tokens/second on their 8i TPUs when they're out!
It's... suboptimal, but that's a reason to hope... if Google get themselves together for 3.5 Pro / the next Flash.