Hi All,
Please find below the RCA we eventually received from Microsoft regarding the last major incident (on 22 February, AKS became unstable because of maintenance/updates on the underlying Azure VMSS hosting side). This is a “safety net” statement from MS that we could present, or at least hint at, to customers in case anybody had any fallout from that incident.
Regards,
Peter
From: Aminat A
Sent: Friday, March 1, 2024 14:05
To: support@mail.support.microsoft.com; Péter Králl; supportmail@microsoft.com
Cc: v-nokwerekwu@microsoft.com; v-cnwaudu@microsoft.com; v-amajagbe@microsoft.com; v-cnnoli@microsoft.com
Subject: RE: [EXTERNAL] RE: Case 2402220050002042 Your... - TrackingID#2402220050002042
Hello Péter,
I hope this email finds you well.
Below is an updated root cause for the issue raised.
Summary of impact: The physical host node where your VM is running had a networking stack update. This might result in a brief connectivity loss.
Resource | Impact Start Time | Impact End Time | Impact Duration (Timespan)
aks-core-17988763-vmss_2 | 2024-02-22 08:59:04 UTC | 2024-02-22 08:59:08 UTC | 00:00:03.3140000
Root Cause: Azure performs updates to improve reliability, performance, and security of the VMs. Azure chooses the least impactful method, which might result in a brief connectivity loss. We are continuously working to improve and reduce impact of our updates, and we apologize for any inconvenience this may have caused you.
Mitigation: The update was completed; however, there might be some interruptions that require you to reboot the system.
===================================================================================
Apologies for any misunderstanding; we needed confirmation from you. Upon further investigation, the reboot signatures associated with the respective scale sets indicate that the reboot operation originated from the node and not from user interaction. That is why we asked whether the reboot operation was carried out by a user.
We apologize for the impact and how it has affected your environment. We are continuously taking steps to improve the Microsoft Azure Platform and our processes to help ensure such incidents do not occur in the future.
Please respond to this email to let me know if this provides enough clarification. Thank you for your continued patience and cooperation. It is much appreciated!
Looking forward to hearing from you soon.
Best Regards,
Aminat Ajagbe
Support Engineer 2
Azure | Virtual Machines
Working hours: Mon - Fri 8:00am – 5:00pm UTC+1
Email: v-amajagbe@microsoft.com
Managers:
Chinyere Nwaudu | v-cnwaudu@microsoft.com
Afeez Ojuolape | v-afojuo@microsoft.com
Chinenyenwa Omekara | v-chomek@microsoft.com
Maureen Williams-Ofurum | v-mauwi@microsoft.com
Aanuoluwapo Agboola | v-aaagboola@microsoft.com
Can’t reach me? Contact Azure Support Backup | azurebu@microsoft.com