chrismustcode's comments

You don't consider Input $0.435 Output $0.87 cache read $0.003625 per million tokens for near frontier intelligence cheap?
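For a rough sense of what those rates mean per request, here is a back-of-the-envelope sketch. The three prices are the ones quoted above; the token counts in the example call are made up purely for illustration, and `request_cost` is a hypothetical helper, not any provider's API.

```python
# Prices quoted in the comment above, in USD per million tokens.
PRICE_PER_MILLION = {
    "input": 0.435,
    "output": 0.87,
    "cache_read": 0.003625,
}


def request_cost(input_tokens: int, output_tokens: int, cache_read_tokens: int = 0) -> float:
    """Return the USD cost of one request at the quoted per-million-token prices."""
    total = (
        input_tokens * PRICE_PER_MILLION["input"]
        + output_tokens * PRICE_PER_MILLION["output"]
        + cache_read_tokens * PRICE_PER_MILLION["cache_read"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical agentic request: 20k fresh input tokens, 2k output tokens,
    # and 100k tokens served from the prompt cache.
    print(f"${request_cost(20_000, 2_000, 100_000):.6f}")
```

With those made-up numbers the request comes out at roughly one cent, most of it from the uncached input.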

No. They still have enormous profit margins on inference with these prices.

Their margins don't affect my own assessment of end-user pricing as cheap.

Any source to back up this claim, pretty please?

Source? There are countless providers serving open weight models for fun and profit.

I highly doubt there is any margin at those inference prices.

> I highly doubt there is any margin at those inference prices.

And yet, OpenCode Go offers DeepSeek flash 6 times cheaper than DeepSeek itself. And they claim they are still profitable.


Part of their model is that not everyone will use their entire quota each month. I don't think I will. I use under $1/day with deepseek v4 flash. We get $60 for the $10 sub.

It’s near the frontier meaning it’s the best intelligence for the price.

It’s not even close to frontier meaning it’s the best intelligence.


I hardly notice DeepSeek being inferior to Claude Opus unless I have it working on tricky and under-defined problems. That is, I trust Opus to reason much better when it has the choice. Otherwise, IME DeepSeek is far cheaper and more effective for anything where the solution is even somewhat obvious.

Out of curiosity, what is your stack? And is this in a legacy project or new one?

I have tried using DeepSeek Flash and Pro but they make amateur mistakes. Sonnet level at best.

However v4 flash is absolutely amazing as a generalist model and it’s what we’re using on a product built on top of LLMs. I wish I could code with it but it’s not going to happen anytime soon


I've used it across many new projects as well as many legacy ones. It does make amateur mistakes so you can't leave it unsupervised for hours like I do with Claude, but it's so much cheaper that weeks of heavy usage haven't even cost me $10 yet. Only other downside IMO is that Pro is pretty slow, even compared to frontier models; only around 120t/s IIRC.

Yes I also noticed it is pretty slow, which sort of defeated the purpose of using it for me.

Usually I'm working on a large task, typically with Opus, while also having a bunch of smaller tasks in their own independent worktrees. Those still need supervision, but less. My goal was to get deepseek to drive the cost of those down, but it was too slow and unreliable...


Yes, I could tolerate the unreliability better if it were faster, but it's really not. So it's too slow for me to actively supervise it, but too unreliable for me to trust it unsupervised. The shitty middle. I often have multiple of them open at a time and check my terminal every few minutes to lead them along. Mostly works.

I was once told that, for an internal promotion, I couldn't put anything about my current role, responsibilities and achievements in the role. I was told to put any volunteering or previous roles instead.

The reason given was that everything you do in your role is just what's expected at work; you need to show you go above and beyond.


Seems like that'd just discourage people from going above and beyond at work. Why do more than the bare minimum to avoid being fired if nothing else you do counts?

>Look, we want you to express yourself, okay? Now if you feel that the bare minimum is enough, then okay. But some people choose to wear more and we encourage that, okay? You do want to express yourself, don't you?

(This is from Office Space for those who don’t know. Hilarious scene with Jennifer Aniston)


The Flair scene? Oh seriously, that gave me so much vicarious embarrassment, I feel uneasy just at the thought of it.

[Jennifer Aniston flips Mike Judge the bird, on-screen #inLove]

>>"How's THIS for expression?!? I'm sick and TIRED of this ... job!"

----

I will never go above and beyond for any corporate entity ever again. You can blame past corporate bullies, not yourselves.


If they can block Cloudflare's IPs, what extra mechanisms would be needed to block VPN IPs?


The only viable way to even get most of them is to shut down internet access entirely. It's not a realistic solution, unlike blocking a few well known IP ranges belonging to a large corp like Cloudflare.

And even if you managed to get them all beforehand, some VPN providers will adapt and keep some servers in reserve, putting them online just as you managed to block the previous ones. Getting around internet censorship is a large chunk of their business, and some are really good at it.


You don’t really need to block all, you just need to annoy the users enough that paying is easier. And I think there are enough games to use up the IP reserve pretty quickly and getting new ones every time is pretty annoying.


I can provision a new VPS in about 5s of active work. I'd probably fully automate spinning up new servers and failing over because automatically detecting which got blocked is trivial. Bonus points if you use providers that let you attach multiple IPs to each VPS for cheap. Use some censorship resistant decentralized protocols to provide the next couple IPs to your client software and you're good.
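A minimal sketch of the detect-and-fail-over step described above, under stated assumptions: `tcp_reachable` and `pick_server` are hypothetical helpers, the addresses are documentation-range placeholders, and a failed TCP connect is treated as "blocked" even though in practice it could also mean the server is simply down. Provisioning new servers and distributing the next IPs are left out.

```python
import socket
from typing import Callable, Iterable, Optional


def tcp_reachable(host: str, port: int = 443, timeout: float = 2.0) -> bool:
    """Crude probe: can we open a TCP connection to host:port within the timeout?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, resets and unreachable-network errors alike.
        return False


def pick_server(
    candidates: Iterable[str],
    probe: Callable[[str], bool] = tcp_reachable,
) -> Optional[str]:
    """Return the first candidate that passes the probe, or None if all fail."""
    for host in candidates:
        if probe(host):
            return host
    return None


if __name__ == "__main__":
    # Placeholder addresses from the TEST-NET-1 documentation range.
    reserve = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]
    blocked = {"192.0.2.10"}
    # Simulate the probe instead of touching the network.
    print(pick_server(reserve, probe=lambda host: host not in blocked))
```

The probe is injected so the selection logic can be exercised without network access; a real client would run it on a timer and request fresh addresses once the reserve runs low.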

And then they still need to monitor hundreds of VPN providers for whether they have new IPs, which is not necessarily as easy as just grabbing a list of them. Once they have some, they then need to forward them to the ISPs and ask for them to be blocked. Their process is significantly less friendly to automation.

No country ever won this fight short of total shutdown/disconnects.


> No country ever won this fight short of total shutdown/disconnects.

Some countries also throttle pretty effectively. So you can connect but if you're trying to do more than read Hacker News it's not very usable.


It's a game. The VPN marketplace is huge so it's whack-a-mole.

Big companies don't hide their VPN ASNs. Obscure, for sure, but getting a good list isn't hard. Usually they get blocked.

Smaller companies may pass under the radar, and have higher tolerance for risky strategies.

The fringe providers are the problem. They aggressively change IP ranges, front-vs-obscure ownership, and play dirty. Shady folks will resell residential ranges. End-users often get tainted goods.

... and you still have the collateral damage game when VPNs host infra with big cloud providers vs colofarms vs self-host, etc.


Is aider even considered a thing anymore?

It was pretty much first for CLI agents and had a benchmark that was the go-to at the start of LLM coding. Now the benchmark doesn't get updated, and until now aider never got a mention when people talk about CLI tools.


Aider is dead because it's tech from the pre-function-calling era.


5 days ago OpenAI raised $122b, and Q1 '26 recorded the largest amount of startup funding in a quarter ever.

I wouldn't say it's drying.

https://x.com/OpenAI/status/2039085161971896807

https://techcrunch.com/2026/04/01/startup-funding-shatters-a...


I'm on Sky in the UK, which is marked as not safe due to no RPKI.

It's not on the list, so I imagine there are a fair few missing. It would be neat to have a table you could filter by country and provider type (cloud/ISP etc.) based on real results from users.

edit: there's a show all button to expand the table


If you're interested, Community Fibre is a yes from this website


I get the same result for A&A, but frankly I trust them more than some random site with (apparently) an axe to grind.



And here we are six years on... I have a lot of respect for A&A, but I do find it hard to sympathise with that page.


My hope would be that A&A have a process manually whitelisting the route that made the test fail because in fact (as of course it would be) it's actually deliberately not signed but it is really their route.

But on some level that's like assuming the reason the guy with the handgun is on your plane is that he's a sky marshal and not that some idiot let a concealed handgun through security. I mean, sure, maybe, but, maybe not.

Without asking it's just a guess and I haven't asked. Maybe I should.


And now thanks to jsty's sibling comment I don't have to ask, thanks! It does seem like they've been more than "cautious" enough at this point and should just implement RPKI.


This isn’t accurate even for API prices for a request/response.

Go on something like openrouter with gpt 5.1 and use the chat then check the billing and you’ll see an average joe query is like 0.00102 or something.

You’re quoting figures from articles for initial ChatGPT release in 2022


They were failing as an online IDE for several years then growth shot up after the AI pivot.


Couldn’t they just send some hardware down to Texas to co-locate there (presuming specialist hardware) and add another deployment target for their software? Would it be that hard?


The speed of light limits fibre speed which in turn limits high-frequency trading.

Flash Boys by Michael Lewis was a fun read on the subject. One memorable quote alleged that HFT traders would "sell their grandmothers for a microsecond [of edge]"


The issue is the speed of light.


For an interesting reversal of the "problem" of the speed of light, IEX is a stock exchange designed to combat HFT by adding a physical speed bump by way of 38 miles of fiber optic cable. The general idea is to level the playing field and improve market liquidity using the physical communication limits of light. https://en.wikipedia.org/wiki/IEX
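As a rough check on how much delay 38 miles of fibre adds, here is a small sketch. The refractive index of 1.47 is an assumption (a typical value for silica fibre, not a figure from IEX), and `fibre_delay_us` is a hypothetical helper that only models one-way propagation, ignoring any equipment latency.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum, km/s
FIBRE_REFRACTIVE_INDEX = 1.47      # assumption: typical silica fibre
KM_PER_MILE = 1.609344


def fibre_delay_us(length_miles: float, refractive_index: float = FIBRE_REFRACTIVE_INDEX) -> float:
    """One-way propagation delay through a fibre of the given length, in microseconds."""
    length_km = length_miles * KM_PER_MILE
    # Light travels slower in glass than in vacuum by a factor of the refractive index.
    speed_in_fibre_km_s = SPEED_OF_LIGHT_KM_S / refractive_index
    return length_km / speed_in_fibre_km_s * 1_000_000


if __name__ == "__main__":
    print(f"{fibre_delay_us(38):.0f} microseconds")
```

Under that assumption the coil works out to roughly 300 microseconds one way.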


That marketing gimmick adds hundreds of microseconds to order latency. It’s not designed to level any playing fields; it’s designed to get publicity.


Not really because anyone running a trading strategy that needs to worry about latency is already running their servers in the same datacenter as the exchange, so that just moves with it. What probably is an issue is that the datacenters required for a market don't look like AWS datacenters. I don't have any direct experience here, but I would be shocked if HFT software is something you could just deploy to a standard VM like on AWS.


They'd probably be running in an Equinix facility instead of AWS.


I thought they use GPU for learning and TPU for inference, I’m open to being corrected.


The first TPU they made was inference only. Everything since has been used for training. I think that means they weren't using it for training in 2015 but rather 2017 based on Wikipedia.


The first TPU they *announced* was for inference



No. For internal training most work is done on TPUs, which have been explicitly designed for high performance training.


I've heard it's a mixture because they can't source enough in-house compute

