
Thinking that harder-to-guess IDs will mitigate attacks is an example of security by obscurity. It's better to think of any IDs in your database as being public knowledge, because they will leak anyway. Assuming that no one can guess another ID leads to shoddy practices. I generally keep IDs sequential and build security around the basic assumption that IDs are not keys, passwords, sessions, or secrets - they're just the public matching identifier for those things.

To that end, I think it's neat to be able to improve indexing on UUIDs, but it's not a security solution.
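A minimal sketch of that assumption in code, with hypothetical names (`Invoice`, `INVOICES`, `get_invoice`, `Forbidden`) and an in-memory dict standing in for the database. The ID is treated as public, and access depends only on who the authenticated session belongs to:

```python
from dataclasses import dataclass


class Forbidden(Exception):
    """Raised when the caller may not see the requested record."""


@dataclass(frozen=True)
class Invoice:
    id: int          # sequential and assumed to be public knowledge
    owner_id: int
    total_cents: int


# Hypothetical in-memory stand-in for a database table.
INVOICES = {
    1: Invoice(id=1, owner_id=100, total_cents=5_000),
    2: Invoice(id=2, owner_id=200, total_cents=12_500),
}


def get_invoice(session_user_id: int, invoice_id: int) -> Invoice:
    """Return the invoice only if the authenticated user owns it."""
    invoice = INVOICES.get(invoice_id)
    # Same error for "missing" and "not yours", so guessing IDs reveals nothing.
    if invoice is None or invoice.owner_id != session_user_id:
        raise Forbidden(f"invoice {invoice_id} is not available")
    return invoice
```

With a check like this, user 100 iterating over `invoice_id` values gets their own invoices and `Forbidden` for everything else, whether the IDs are sequential or random.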



Having sequential IDs is more than just a security risk; it's an information risk. Competitors can use them to estimate the size of your business, the number of customers you have, and all sorts of stuff.

This was used in World War II to estimate German tank production from the sequential serial numbers on captured tanks:

https://en.wikipedia.org/wiki/German_tank_problem

So just for business intelligence you don't want to leak your IDs.
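For a sense of how little data this takes, here is a rough sketch of the standard estimator from the German tank problem. It assumes IDs start at 1, increase by 1 with no gaps, and that the observed IDs are a uniform random sample without replacement. The function name `estimate_total` is made up for illustration:

```python
def estimate_total(observed_ids: list[int]) -> float:
    """Estimate the highest ID issued so far from a sample of observed IDs.

    Uses the classic estimate m + m/k - 1, where m is the largest
    observed ID and k is the number of observations.
    """
    k = len(observed_ids)
    if k == 0:
        raise ValueError("need at least one observed ID")
    m = max(observed_ids)
    return m + m / k - 1


# Four leaked customer IDs are enough for a first guess.
print(estimate_total([19, 40, 42, 60]))  # 74.0
```

Real ID sequences often have gaps or skip ranges, so treat the result as a ballpark figure.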


I’ve heard this argument many times, but I’ve never seen anyone actually post a reference to it happening (as in, a company finding and using this information; not the German tank problem).

To me, it reeks of solving imaginary problems while causing new ones.


Years ago I wrote a library that would exaggerate sequential IDs to make our SaaS platform appear more popular than it actually was to anyone trying to pay attention. Not sure if I’m proud of the hack or embarrassed. A bit of both, I suppose.


But UUIDv7 isn’t sequential (unless I’m getting it mixed up). There’s just a time-based component, which can make sorting really nice, and some random bits at the end.

If you don’t let an attacker iterate your data, all they can tell is when the ID was created.
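A rough sketch of what "when the ID was created" means in practice, assuming the UUIDv7 layout from RFC 9562 (48-bit Unix millisecond timestamp, 4-bit version, 12 random bits, 2-bit variant, 62 random bits). The helpers `make_uuid7` and `created_at` are hand-rolled for illustration, not a library API:

```python
import os
import time
import uuid
from datetime import datetime, timezone
from typing import Optional


def make_uuid7(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Build a UUIDv7-style value by hand from a millisecond timestamp."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")   # 80 random bits, 74 used
    rand_a = (rand >> 62) & 0xFFF                  # 12 random bits
    rand_b = rand & ((1 << 62) - 1)                # 62 random bits
    value = (unix_ms & ((1 << 48) - 1)) << 80      # timestamp in the top 48 bits
    value |= 0x7 << 76                             # version 7
    value |= rand_a << 64
    value |= 0b10 << 62                            # RFC variant bits
    value |= rand_b
    return uuid.UUID(int=value)


def created_at(u: uuid.UUID) -> datetime:
    """Recover the creation time that anyone holding the ID can read."""
    unix_ms = u.int >> 80
    return datetime.fromtimestamp(unix_ms / 1000, tz=timezone.utc)


u = make_uuid7()
print(u, created_at(u))
```

Because the timestamp occupies the leading bits, sorting these IDs (as integers or as hex strings) orders them by creation millisecond. The random tail is what keeps the next ID from being guessable.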


I was responding in a sub-thread about risks/opportunities associated with sequential IDs with an anecdote on opportunities.


> If you don’t let an attacker iterate your data, all they can tell is when the ID was created.

The ordering means that you can reconstruct the sequence if you have enough of them, though.


I vaguely remember somebody figuring out from Photoshop activation IDs how many sales Adobe was making and trading options around their earnings report with that information.

I don’t remember the details, so maybe it was something else and not Photoshop/Adobe.


It's perhaps embarrassing for a new startup to have a user ID of "10".

That's about the only problem I can discern.


Depends on how far you get, I suppose. Wozniak's Apple employee badge has ID #1 on it, and that's cool as heck. I also remember ICQ users sorting themselves based on how many digits their identifier had.


I agree just knowing the number is bad, but it also makes it easier to discover far worse problems as well.

My second job was for a company that provided internet enabled phone conferencing solutions (this was years before VoIP became widespread).

The customer IDs were sequential. Couple that with an outright idiotic security flaw (the login process set the customer ID in a cookie and the app trusted it on every subsequent request. Just the ID. Nothing else), and I was able to iterate over all the customer IDs and hand over a complete list of users to my boss to illustrate the problem, starting with a list of the accounts of the complete upper management.

They could have been used to spin up huge numbers of 30-person long-distance conference calls at high costs (this company was building out nodes with 20,000-line PSTN switches before they had customers... it was crazy, and they failed, but they would've failed far faster if that had been abused and they were on the hook for costs from their carriers).

Trusting that cookie was still stupid, but had it been a long random key it would at least have been a bit harder to discover and exploit (their next attempt was to base64-encode it, and I had to explain why that didn't help; they then finally Blowfish-encrypted it, but without any time component, so it was still subject to replay attacks... I jumped at the first opportunity I got to get out of there).
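To illustrate the missing time component, here is a minimal sketch of a tamper-evident cookie value that carries an expiry. It uses an HMAC instead of the Blowfish encryption described above, so it is a swapped-in technique and not what that company did. `SECRET_KEY`, `issue_token`, and `verify_token` are hypothetical names:

```python
import base64
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical server-side secret; in practice it would come from configuration.
SECRET_KEY = b"replace-with-a-long-random-server-side-secret"


def _sign(payload: bytes) -> str:
    digest = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode()


def issue_token(customer_id: int, ttl_seconds: int = 3600,
                now: Optional[float] = None) -> str:
    """Return 'customer_id.expires.signature' for use as a cookie value."""
    current = time.time() if now is None else now
    expires = int(current + ttl_seconds)
    payload = f"{customer_id}.{expires}"
    return f"{payload}.{_sign(payload.encode())}"


def verify_token(token: str, now: Optional[float] = None) -> Optional[int]:
    """Return the customer ID if the token is authentic and unexpired, else None."""
    current = time.time() if now is None else now
    try:
        customer_id, expires, signature = token.split(".")
        expected = _sign(f"{customer_id}.{expires}".encode())
        # Constant-time comparison avoids leaking how much of the signature matched.
        if not hmac.compare_digest(expected.encode(), signature.encode()):
            return None
        if int(expires) < current:
            return None
        return int(customer_id)
    except ValueError:
        return None
```

Changing the customer ID in the cookie now invalidates the signature, and a captured token stops working after `ttl_seconds`. The expiry only bounds the replay window; a server-side session store with revocation would be needed to close it.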


Huh. That's exactly the same security flaw as Moonpig had. Tom Scott made a video about it.


Not disagreeing with the general concept - these IDs leak information - but these are sequential IDs, not auto-incrementing IDs. The leak is the time the ID was generated, not the volume of IDs generated.


That's still a competitive risk -- it does things like reveal if a given list of customers from recent orders/posts are all new customers or long-term customers.

Or from a list of most recently added customers/users, you can figure out the rate of signups.

Revealing timestamps is bad because it can reveal way too much information about the health of your business that you prefer to keep private, if a sequential list of IDs ever gets exposed (which is hard to prevent).


They’re not even strongly sequential (is there a term for this?). The gaps between them can be arbitrarily large.


They are sequential, in the sense that they form a sequence where one is clearly before or after another.

They're not monotonic.


Thanks! This is what I was looking for.


They can still find out through LinkedIn, ZoomInfo, BuiltWith, and any number of tools.

The tank problem doesn't fit when the incrementing value is time since epoch. With integers, yeah. With UUIDs, KSUIDs, or any other semi-ordered scheme meant to make your database indexes less fragmented, I haven't seen a real leak issue.


Just on Friday I had a discussion with a colleague about filenames.

We do a lot of computer vision and in his project, each processed object is assigned a UUID and he wanted to save images to files for each one.

So we took some time to go over various timestamp formats to be embedded into the filename to make the files sort chronologically. UUIDv7 is just spot-on solving our problem. In this use case, there are no real security considerations.


Doesn't that still leak (statistical) information?

It may not be technically security, but e.g. knowing your competitor just added N products to their shop, might be a security issue for the business.


It may. Certainly, for instance, sequential invoice numbers do. If a business decides to take measures to obscure that, no problem. All I'm saying is that obscuring a numbering system for data artifacts shouldn't be considered any sort of security as far as keeping your endpoints from being hacked.


The point on invoice-numbers brings another issue to mind.

We model our domain(s) using DDD, and often "The ID" really is best left a thing with meaning. Customer-id, Bank-account-number, invoice-number, email, etc. At least within the domain, it is. The business (and laws etc) already ensure there can only ever be one invoice with this number. It's terribly counterproductive to have two IDs for something. "Hey, can you have a look at invoice 20230233, because it seems the VAT was applied wrong. Hmm, do you have the UUID for that invoice and DM me that? You know, the long one with the hyphens".

I guess there isn't a one-size-fits all solution and that "it depends" very much on what e.g. "public" means.


You're absolutely right, this is also why you generally encrypt sessionized or "consistent view" pagination tokens for public APIs (save for primitives like ddb or Kafka).

The end user should know no details about your internal key space.


Security by obscurity is a necessary step in most software security.

It hardens, completes and complements other measures.

Examples of every day security using obscurity: every password and encryption key

EDIT: Thanks for the replies.

Ignore above!

Obscurity is the low bit of security. But when it’s convenient, it still helps.


Obscurity and secrecy are different things. Though I agree with you. Moderate amount of well implemented obscurity is helpful.


> Moderate amount of well implemented obscurity is helpful.

You're getting that wrong: Everything else being equal, the more obscure system will always be the safer one. It's just that obscurity can easily be lost, so your system should, if in any way possible, still be secure even if fully known. In the end, however, no system is 100% secure, but more obscurity will make it harder to find the inevitably existing issues.


I think the counterargument is that all else is not equal when obscurity is a goal of security, because it adds a maintenance burden to some greater or lesser degree, and that maintenance burden becomes time taken away from proper security practices, or other value-providing work.


I think the main argument is that security by obscurity can easily be circumvented, be it via sidechannel, secret leak, source code leak or a surprisingly small search space (for example the whole range of IPv4 being scanned by now). It's easy to assume something is secure and spend a lot of time on obscurity, which completely falls apart thanks to a small sidechannel attack. It's (usually) just a weak defense overall. Yes, it can also be a maintenance overhead and therefore risk via proxy, but it can actually be easier in other situations.

For a personal anecdote, I used to work in a small webshop and our software was horrible, to the point where minimal effort would have been able to compromise our servers, which were running software roughly as old as I was at the time (I want to note that I worked on improving the situation while I was there). Still, the only time we had a problem was when we took over a Joomla-hosted site, as we were small enough to not get any individual attention and your off-the-shelf WordPress or Joomla-scripts did not work on our home-brewed software.

In the end, I fully agree that security by obscurity is a weak concept and the usual wisdom of not relying on it is completely correct. Still, it's important to acknowledge that obscurity can and does help security and bring actual reasons on why you should not rely on it. Just saying "it's obviously bad" leads to an easily refuted argument and will not convince some developers, leading to worse software overall.


To me, the main reason to avoid obscurity in naming or numbering things, or even in code - rather than view it as a modest addendum to security - is to force yourself to do the mental exercise of what happens when that obscurity is lost.

Not doing that is how small companies seem to get away with terrible security holes for a long time, until suddenly they don't. I've seen too many cases of companies in a position where they built a small, insecure service that's now getting shared more widely than envisioned, who don't want to spend the money to make it right, because no one has compromised it yet (that they know of), and what are the chances of someone stumbling across it - where even pointing out that it's an attack vector can earn you trouble.


passwords and encryption keys are secrets, not obscurity.

Security by obscurity would be hiding your house key under a doormat for your friend to find - depending on the culture you live in you may be more or less safe but it is not security (just like hosting your ssh server on port 9384 will repel 99% of attackers but is not a security measure).


I keep SSH on Port 22. After years, I'm still amazed about the operational model of these attacking hosts.

They are completely dumb. I haven't kept records, but I have the feeling that some IPs in my fail2ban list have been in there for months or even years now.

I assume they are just sweeping the whole IPv4 range? No state, no cache. Either they successfully attack a host or they go to the next IP. Repeat 2^32 times, start again.

I'm not sure where I wanted to go with this comment. Is it _that cheap_ to constantly sweep the IPv4 range or is it _that profitable_ to do it once you have a successful attack?
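On the "that cheap" half of the question, the arithmetic is easy to sketch. The probe rates below are assumptions picked for illustration, not measurements, and the count includes reserved and private ranges, so it is an upper bound:

```python
TOTAL_IPV4 = 2 ** 32  # 4,294,967,296 addresses, including reserved ranges


def sweep_hours(probes_per_second: float) -> float:
    """Hours needed to send one probe to every IPv4 address at a given rate."""
    return TOTAL_IPV4 / probes_per_second / 3600


for rate in (10_000, 100_000, 1_000_000):  # assumed rates, one probe per address
    print(f"{rate:>9,} probes/s -> {sweep_hours(rate):7.1f} hours")
```

At an assumed one million probes per second, a full pass takes a little over an hour, which would fit the stateless "next IP, repeat" behaviour described above.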


You should think of them as public, but that doesn't mean it isn't still helpful to obscure aspects of the information they carry.

Obscurity can be helpful as part of defence in depth, to reduce the impact when someone does something stupid, or to make it more difficult to extract information that might be helpful as a means to attack the system from another angle.

If you're already thinking about the implications, you can likely ensure people don't jump to the conclusion that the IDs can be trusted just because they look complex.


Security by obscurity is a working solution if implemented with other measures. It increases the cost of attack, which in the presence of unknown vulnerabilities gives you precious time to respond.



