I'm the original author of AxCrypt. There's a new compatible iteration of it based on the original cryptography code called Xecrets, with several components. Xecrets Cli is the main fully featured command line engine, free open source GPL on https://github.com/xecrets/xecrets-cli . Xecrets Ez is a simple GUI frontend, single executable, portable. Runs on Linux, macOS and Windows. There are several more components available as nuget packages.
Very interesting! Thanks for the link, I didn't know about it and I'll have to check it out.
It still falls slightly outside of my zone of preference, though, as it's a .Net application, which limits which platforms it can be run on (I'm aiming for a tool that can readily be used on most of my devices, including on an old Raspberry Pi or a router).
Fair enough, that's a little out of main scope for Xecrets Cli.
Just for fun, I tried building and publishing for an ARM processor running Linux (linux-arm), and the build worked fine. I don't have a device to test it on though.
So... as long as it runs Linux and has an ARM or ARM64 processor and sufficient storage & memory it could work! (It's not tiny like a pure C-program unfortunately).
Yes, my article is pretty much speculation - in the absence of a proper explanation by CrowdStrike. (Now there actually is sort of an explanation, but it raises almost as many questions as before). I don't have data for a first hand investigation, but do cite the investigation by Dave Plummer - which of course also contains quite a bit of speculation.
Whether or not "Rapid Response Content" and "Template Instances" are Turing complete is unclear, but the fact of the matter is that according to CrowdStrike "problematic content in Channel File 291 resulted in an out-of-bounds memory read triggering an exception", so the interpretation of the content is at least fairly complex. CrowdStrike also states that "Each Template Instance maps to specific behaviors for the sensor to observe, detect or prevent", and mentions a "Content Interpreter". Whether it's code or configuration data is not really relevant though; the point is that it's interpreted by a kernel mode driver which did not have sufficient validation of its "Content" to prevent the crash.
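To illustrate what "sufficient validation" means in general, here is a minimal sketch in Python. The Channel File format is not public, so the layout below (a magic marker, an entry count, and a table of offset/length pairs) and all names are purely hypothetical. The only point is that every field is checked against the actual file size before it is used, so bad content is rejected instead of causing an out-of-bounds read.

```python
import struct

MAGIC = b"CHNL"                 # hypothetical magic marker, NOT the real format
HEADER = struct.Struct("<4sI")  # magic, number of entries
ENTRY = struct.Struct("<II")    # offset and length of each entry's payload


class InvalidContent(ValueError):
    """Raised when a content file fails validation."""


def parse_content(blob: bytes) -> list[bytes]:
    """Validate every field before using it; reject rather than crash."""
    if len(blob) < HEADER.size:
        raise InvalidContent("file shorter than header")
    magic, count = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise InvalidContent("bad magic (an all-zero file fails here)")
    table_end = HEADER.size + count * ENTRY.size
    if table_end > len(blob):
        raise InvalidContent("entry table extends past end of file")
    payloads = []
    for i in range(count):
        offset, length = ENTRY.unpack_from(blob, HEADER.size + i * ENTRY.size)
        # An entry must lie after the table and entirely inside the file.
        if offset < table_end or offset + length > len(blob):
            raise InvalidContent(f"entry {i} points outside the file")
        payloads.append(blob[offset:offset + length])
    return payloads
```

A caller that catches `InvalidContent` could, for example, keep running with the previous known-good content instead of loading the new file.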
Well, you don't want a fancy anti-virus update to rip through the global population of customers in the space of approximately 78 minutes either, effectively killing 8.5 million systems (according to an estimate by Microsoft), right? What possible malware threat warrants that risk? And in this case, according to CrowdStrike, it was "to detect novel attack techniques that abuse Named Pipes". That doesn't really sound like such an urgent situation.
It is kind of obvious, isn't it? But I've yet to see any hard questions asked in mainstream media about the process of simultaneous global rollout of "content updates".
But in a recent update of https://www.crowdstrike.com/falcon-content-update-remediatio... they have a long explanation of how things are supposed to work, with a lot of nice words (sounds almost AI-written...) and quite a few implications that it's really the fault of the customers who have not configured their systems to, for example, stay one version behind the latest, and still a very short explanation of what went wrong.
But... Lo and behold, what are CrowdStrike going to do to avoid this happening in the future?
"Implement a staggered deployment strategy for Rapid Response Content in which updates are gradually deployed to larger portions of the sensor base, starting with a canary deployment." About time...
My main point, and the reason for the title, is that this has not been the major takeaway in mainstream media analyses. Of course not "everyone" has missed this, but pretty much all media articles about the incident do appear to miss this.
Mainstream media are unlikely to report in such detail; they would more often ask questions such as "How could such an update cause a global outage?" or "How could this be allowed to happen?".
Most people didn't even know what CrowdStrike was, let alone understand the concept of testing updates and staggering them.
Lastly, the media are at risk of reporting the wrong thing and being a target of litigation. Therefore they often report in hyperbole and without much factual information until the facts are determined.
Yes, since CrowdStrike won't tell us, we'll have to rely on our own or third party analysis. As I write "Since as usual the company won't release any detailed information on what really happened, we'll have to rely on other sources. I found that Dave Plummer's account on YouTube was very good, and trustworthy."
But, absolutely, "probably" is a required qualifier for some statements about the details.
What is definitely known is that a WHQL kernel mode driver from CrowdStrike crashes, and removing a single file external to the driver causes it to stop crashing. Some pretty sure conclusions can be drawn from that. No "probably" required.
Interesting, do you have a link to this statement? Also, do they state what did cause the crash? At least removing the file of zeroes does solve the problem, as the instructions from both Microsoft and CrowdStrike state "Boot into safe mode. Delete C-00000291*.sys." That's the file(s) with the zeroes... See https://www.crowdstrike.com/falcon-content-update-remediatio... and https://www.youtube.com/watch?v=Bn5eRUaMZXk (3 minutes 20 seconds in).
AFAIK in one of the older crowdstrike threads, there was a tweet that said the driver checked for a sentinel value of AAAAA... before loading it, so an entirely blank value wouldn't have caused the issue. I can't find the source now, but some comments do seem to corroborate it:
Right, they write rather cryptically "This is not related to null bytes contained within Channel File 291 or any other Channel File."
That's not quite the same as saying "This is not related to Channel File 291 containing all nul bytes."...
I don't have first-hand knowledge here, but rely on Dave Plummer's statement.
Regardless of zeroes or single files or not, the fact is that bad data in C-00000291.sys in combination with bad validation in the driver causes it to crash. Deleting C-00000291.sys causes the driver to stop crashing.
Anyway, my main point isn't really about this. It's about the big bang global roll out simultaneously to at least 8.5 million systems in one go that's irresponsible.
The driver architecture is the lesser evil here, although it's bad enough!
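To make the contrast with a big bang rollout concrete, here is a minimal sketch of a staggered, canary-first deployment in Python. It is not how CrowdStrike or anyone else actually does it. The ring sizes, the failure threshold, and the `deploy_to` and `is_healthy` callbacks are all hypothetical, and a real system would also wait a while between deploying to a ring and judging its health.

```python
import random
from typing import Callable, Sequence


def staged_rollout(
    hosts: Sequence[str],
    deploy_to: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    ring_fractions: Sequence[float] = (0.001, 0.01, 0.1, 1.0),  # hypothetical ring sizes
    max_failure_rate: float = 0.01,  # hypothetical abort threshold
) -> int:
    """Deploy ring by ring and stop as soon as a ring looks unhealthy.

    Returns the number of hosts that received the update.
    """
    order = list(hosts)
    random.shuffle(order)  # avoid always hitting the same hosts first
    done = 0
    for fraction in ring_fractions:
        # Cumulative number of hosts that should have the update after this ring.
        target = max(1, int(len(order) * fraction)) if order else 0
        ring = order[done:target]
        if not ring:
            continue
        for host in ring:
            deploy_to(host)
        done = target
        failures = sum(1 for host in ring if not is_healthy(host))
        if failures / len(ring) > max_failure_rate:
            break  # halt the rollout; later rings never get the update
    return done
```

With 1000 hosts and an update that crashes every machine, this stops after the single canary host instead of taking down all 1000.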
> the fact is that bad data in C-00000291.sys in combination with bad validation in the driver causes it to crash
This is, in fact, not a fact. We really don't know yet.
CrowdStrike blue screened one of my laptops twice right as the incident was getting started, before a fix was available. There was no boot loop in my case. I was back up and in the middle of an episode of Breaking Bad the second time it got me, 30 minutes after the first. Did the agent wait that long to load a content update it had already loaded before? Maybe, but it's at least as likely that the content was loaded the whole time, and that some activity pattern set it off. Thus, I'm skeptical of the problem being simple content validation.
Yes - subsequent to my comment. Thanks.
But how can this latest statement be true, if the previous statement that the crash was not related to the zero bytes content is true?
Good question. There's some evidence that not all affected systems have seen this 'all zeroes' file; the first-hand accounts vary. But something was definitely broken in the deployed data. But, once again, CrowdStrike does not paint a clear picture and it raises new questions and only partially answers old ones.
Why is it so hard for manufacturers to just go ahead and explain what really went wrong, without a lot of corporate b..t? Probably, if they do really say what happened in so many words they might open themselves up to negligence lawsuits. Hopefully somebody files one anyway. The industry needs to learn to be better, and the only thing that talks loudly enough is probably money. Lost revenue, liability damages, and shareholder value loss.
Speculation: this "all zero" file is part of a signed batch, they have to have signatures, they are not that dumb (I hope...). By removing a file, the batch becomes incomplete, fails the check, and some corruption recovery mechanism takes over, most likely disabling the update and triggering an update. In the meantime, they fixed the content update, fixing the crash.
Nice, very ambitious! I like the twist to extract from the compiled IL code, much easier, more stable and reliable than parsing the source. My one gripe here is that the code does not follow the .NET paradigm of using resources at all. Still, very clever and a lot of functionality.
If you're referring to my app, Xecrets Ez ( https://www.axantum.com/ ), it runs in Windows, macOS and Linux.
As mentioned, the issue I'm trying to solve is not the code end. Resx works fine once it's there. It's the translator end. How to present the texts, translations, context etc. to the human translators, who are often non-technical and often many: one translator might only translate a few strings, then another one a few more, and so on. So it has to be really easy to use and gain access to. Can't for example ask them to install a piece of software. Finally, once a text has been translated, how to get it back to the app as easily, and preferably as automatically, as possible.
Quite so, and in the case of resx, that's exactly what you're getting. A list of key-value-pairs, no support for pluralisation either (as you say, it gets hard. I understand Polish for example is very complex, perhaps Russian is the same?).
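To show why a flat key-value list falls short here, this is a small Python sketch of plural selection for Polish. The rule is the one commonly used in gettext plural-form headers for Polish, as far as I know. The word forms for "file" and the function names are just my example. One key needs three values plus a rule for picking between them.

```python
def polish_plural_index(n: int) -> int:
    """Index of the plural form for n, following the usual gettext rule for Polish."""
    if n == 1:
        return 0
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1
    return 2


# Example word: "file". 1 plik, 2 pliki, 5 plików.
FORMS = ("plik", "pliki", "plików")


def files_label(n: int) -> str:
    """Format a count with the matching plural form."""
    return f"{n} {FORMS[polish_plural_index(n)]}"
```

So 2, 3, 4, 22, 23 and 24 take one form, while 5 through 21 and 112 take another, which is not something a single value per key can carry.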
I never had a problem with resx and .net satellite assemblies and all that as far as the format goes. But it's always been an issue how to involve translators in a way that's both simple for the translators, safe for the quality and as automated as possible in bringing the translations back to the app.
That may be the case, but resx is a quite bloated XML format for a simple key value pair listing. Besides that, resx is yet another format for the same thing.
I think the solution is quite simple:
- One unified key value pair format (for translators and GUI tools)
- One intermediate format that is programming language specific (it could be generated code or highly integrated formats like resx)
- A simple tool that can convert between those two formats
Workflow example:
- Export a unified format file from resx with placeholders for the translations
- Translators: Here you go, use your GUI tools on this
- Get back the translated unified format
- Import a unified format file to resx
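The round trip in the workflow above could be sketched roughly like this in Python. The "unified format" here is just a JSON list I made up for illustration (the field names `key`, `source`, `comment` and `translation` are hypothetical), and the sketch assumes string-only resources. A real .resx file also carries `resheader` elements and a schema, which are left out.

```python
import json
import xml.etree.ElementTree as ET


def export_resx(resx_xml: str) -> str:
    """Turn resx <data> entries into a flat JSON list with empty translations."""
    root = ET.fromstring(resx_xml)
    entries = []
    for data in root.iter("data"):
        entries.append({
            "key": data.get("name"),
            "source": data.findtext("value", default=""),
            "comment": data.findtext("comment", default=""),
            "translation": "",  # placeholder for the translator to fill in
        })
    return json.dumps(entries, ensure_ascii=False, indent=2)


def import_to_resx(unified_json: str) -> str:
    """Build a minimal resx-style document from the translated unified file."""
    root = ET.Element("root")
    for entry in json.loads(unified_json):
        data = ET.SubElement(
            root, "data", {"name": entry["key"], "xml:space": "preserve"}
        )
        # Fall back to the source text when a string has not been translated yet.
        ET.SubElement(data, "value").text = entry["translation"] or entry["source"]
    return ET.tostring(root, encoding="unicode")
```

The translators only ever see the JSON side (or whatever GUI tool reads it), and the resx side stays an implementation detail of the .NET app.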
Well, that's what the setup I have does, almost. If you consider .po/.pot to be a unified format file. I extract the original texts, comments and internal names from the resx into a .pot (Portable Object Template). This is then sent to a translator-centric web site service. A translated file is then exported, although I export it as a .po file, and then use a library and a little bit of my own code to implement an IStringLocalizer. As someone said elsewhere, more and more services do support .resx directly, so I could consider skipping the .po handling in my code, and just use the .resx.
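For the curious, the extraction step could look something like this. It is a rough Python sketch, not my actual code (which is .NET). It assumes string-only resources, puts the resx name in `msgctxt` (one possible convention, chosen here for illustration), and leaves out the header entry that a real .pot file starts with.

```python
import xml.etree.ElementTree as ET


def po_quote(text: str) -> str:
    """Quote a string using the escapes the PO format expects."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def resx_to_pot(resx_xml: str) -> str:
    """Emit one PO template entry per resx <data> element."""
    root = ET.fromstring(resx_xml)
    blocks = []
    for data in root.iter("data"):
        lines = []
        comment = data.findtext("comment", default="")
        for comment_line in comment.splitlines():
            lines.append(f"#. {comment_line}")  # extracted comment for translators
        # Assumption: the internal resx name is carried as the message context.
        lines.append(f"msgctxt {po_quote(data.get('name'))}")
        lines.append(f"msgid {po_quote(data.findtext('value', default=''))}")
        lines.append('msgstr ""')  # always empty in a template
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"
```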
As for bloat, I find the .po format to be quite bloated, with its use of the full texts as keys in each and every translation. I don't really like that, but in practice it appears to be working well and has been for many years. Then again, the obvious choice today would be JSON.