Fault in CrowdStrike caused airports, businesses and healthcare services to languish in ‘largest outage in history’

Services began to come back online on Friday evening after an IT failure that wreaked havoc worldwide. But full recovery could take weeks, experts have said, after airports, healthcare services and businesses were hit by the “largest outage in history”.

Flights and hospital appointments were cancelled, payroll systems seized up and TV channels went off air after a botched software upgrade hit Microsoft’s Windows operating system.

The update came from the US cybersecurity company CrowdStrike and left workers facing a “blue screen of death” as their computers failed to start. Experts said every affected PC may have to be fixed manually, but by Friday night some services had started to recover.

As recovery continues, experts say the outage underscored concerns that many organizations are not well prepared to implement contingency plans when a single point of failure such as an IT system, or a piece of software within it, goes down. But these outages will happen again, experts say, until more contingencies are built into networks and organizations introduce better back-ups.

  • NuXCOM_90Percent@lemmy.zip
    4 months ago

    So we should have five different cyber security solutions at any given site? That wheezing is the sound of every IT person on the planet queuing to swing a sock full of nickels at you.

    Crowdstrike was near ubiquitous because it was the best tool out there. And plenty of threats were prevented because of it.

    The answer isn’t to force every single site to manage everything themselves. It is to increase oversight of CI/CD models.

      • NuXCOM_90Percent@lemmy.zip
        4 months ago

        Like it or not, that is the most effective way to collect the data these solutions need.

        This isn’t Riot’s anti-cheat, where the effectiveness is questionable. Crowdstrike was demonstrably amazing at its job.

        • Riskable@programming.dev
          4 months ago

          Crowdstrike has clients that run on macOS and Linux. Only the Windows version requires kernel-level access. I believe it has something to do with the absolute shitshow that is the Windows security model, but it might also be because it runs a 31-year-old filesystem that still doesn’t allow one process to read another process’s files while they’re open.

          • NuXCOM_90Percent@lemmy.zip
            4 months ago

            There have been issues with Linux and Mac clients in the past. Not to this scale but market share is very much a factor.

            Kernel access is a mess, but it is also important to understand that even less privileged software can cause problems.

            I do firmly believe more hardware should run Linux but it is also important to understand the support burden. But, regardless, that is a different conversation.

            • bamboo@lemm.ee
              4 months ago

              Less privileged software can also cause problems, but you can limit the scope in which those problems can occur.

    • TheDemonBuer@lemmy.world
      4 months ago

      Crowdstrike was near ubiquitous because it was the best tool out there.

      I understand the reason for it, but that ubiquity comes with potential dangers, as we saw on Friday. But, no, I don’t think the solution is “five different cyber security solutions” at every site. However, different cyber security solutions for different industries might not be such a bad idea. Or, I suppose the root of the problem might be the ubiquity of the OS. Should every PC be running the same jack-of-all-trades, master-of-none OS?

      • NuXCOM_90Percent@lemmy.zip
        4 months ago

        Again, all you are doing is increasing complexity and punting it to support staff who are likely unqualified to even know what Crowdstrike did.

        This was one of those rare cases of capitalism working. There are many options. There was one that was miles ahead of all the others and that dominated.

          • sandalbucket@lemmy.world
            4 months ago

            I want to spin up a separate thread here if that’s okay.

            Please give me an example of any EDR solution produced through “public ownership structures”. I don’t think such a thing exists, but I welcome being proven wrong.

          • sandalbucket@lemmy.world
            4 months ago

            Private ownership and investment of capital created Crowdstrike as a profit-seeking venture. It also created MS Defender, SentinelOne, Trellix, Carbon Black, etc. Competition in the marketplace (and there was/is lots of competition) forced these products to be as good as they could be, and/or to self-stratify into pricing tiers. Crowdstrike, being the best (and most expensive), is the most widely used. Note that not every enterprise requires that level of security, so while CS is widely used, it is not ubiquitous. This outage could have been significantly worse.

    • fishpen0@lemmy.world
      4 months ago

      It’s not the best tool out there. It’s the laziest one that works. It’s perfectly possible to securely operate without a rootkit hacked into your kernel.

      Modern approaches involve running an eBPF module on rootless immutable images that are scanned on build. My org is PCI, SOC2, and HITRUST and we didn’t go down because we would never take such a sad, lax approach as handing off responsibility for security to a third party. The trade-off is that your head of compliance and security needs to actually learn things and work hard to push alternatives with auditors and consultants, and most companies put an MBA who can’t critically think their way out of an empty room at the helm.