As the world recovers from the largest IT outage in history, the incident shows the danger of a single point of failure in IT infrastructure

A global IT failure wreaked havoc on Friday, grounding flights and disrupting everything from hospitals to government agencies. Over all the chaos hung a question: how did a flawed update to Microsoft Windows software bring large swaths of society to a screeching halt?

The problem originated with an Austin, Texas-based cybersecurity firm called CrowdStrike, relied upon by most of the global technology industry, including Microsoft, for its Falcon program, which blocks the execution of malware and cyber-attacks. Falcon protects devices by securing access to a wide range of internal systems and automatically updating its defenses – a level of integration that means if Falcon falters, the computer is close behind. After CrowdStrike updated Falcon on Thursday night, Microsoft systems and Windows PCs were hit with a “blue screen of death” and rendered unusable as they were trapped in a recovery boot loop.

Microsoft is a juggernaut with significant market power, dominating cloud-computing infrastructure across Europe and the United States. So it wasn’t just computers that were affected, but servers and a host of other systems as well. Overwhelming requests from users, devices, services and businesses ushered in a cascading series of failures with Microsoft products – namely Azure Cloud and Microsoft 365. Failures plaguing Azure led to additional but separate disruptions with 365 services. A giant clusterfuck ensued.

  • EtherWhack@lemmy.world
    4 months ago

    Only systems running CrowdStrike were affected, but all of them were Windows-based, as that is the only OS it works with.

    I think it’s more touching on the vulnerability of infrastructure if a larger portion is run by only one OS. That is something a lot of us here may realize, but the general public has never really understood it: a scenario like this or similar can cause a widespread blackout, all from a single bug, be it from popular software or the OS itself.

    • ImADifferentBird@lemmy.blahaj.zone
      4 months ago

      That’s not correct. CrowdStrike does also work with Mac and Linux, but this particular incident only impacted the Windows sensor.

      They actually had a similar issue with the Linux sensor a couple of months ago, which… doesn’t speak well of their update process.