On May 25th, 2021, VMware released a security advisory covering two new CVEs: CVE-2021-21986, a vulnerability in several of their provided plugins, and CVE-2021-21985, a remote code execution vulnerability caused by missing or weak input validation in the Virtual SAN Health Check plug-in. Both vulnerabilities carry a CVSSv3 score above 6.0, which makes Incident Response people like me take notice.
Timeline:
2021-05-25 - Security researcher privately reported vulnerabilities to VMware
2021-05-25 - VMware releases security advisory (https://packetstormsecurity.com/files/162812/VMware-Security-Advisory-2021-0010.html)
2021-06-02 - Public PoC is released
2021-06-09 - Reports of attacks in the wild come to the surface
Technical Information:
https://packetstormsecurity.com/files/162812/VMware-Security-Advisory-2021-0010.html
https://www.VMware.com/security/advisories/VMSA-2021-0010.html
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-21986
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-21985
https://attackerkb.com/topics/X85GKjaVER/cve-2021-21985
https://www.iswin.org/2021/06/02/Vcenter-Server-CVE-2021-21985-RCE-PAYLOAD
Day in the Life
For those not in Incident Response (IR), we follow a simple workflow to do our work: Preparation, Detection, Containment, and Post-Incident Response. There are many ways to solve the puzzles we face every day, but if you are not prepared to handle the incident, everything can go south quickly. Next comes Detection: if you can’t see it, did it happen at all? Once you have seen it, can you isolate the threat and stop it from spreading throughout your organization? And once the threat is isolated to one device or network, what is your next move? I plan to take you on a small IR journey and share some insights along the way. Please grab a coffee, take my hand, and let’s explore.
Preparation
Incident Response, as the name suggests, is the response to a threat to your organization, and like any good internet battle, the fight will need you to have some great hits in your arsenal to be successful.
Know your assets - You can’t isolate a machine or network properly if you don’t know that the asset even exists. Asset inventory can be a huge pain but it’s a necessary evil.
Know your escalation points - Unless you have ultimate authority, there isn’t much you can do without permission, even if you want to. Know the data and application owners in your organization so you can quickly get the permission you need while keeping key players updated.
Know the drill - Practice, Practice, Practice! Drills with your supporting teams and the teams you support will make the real incidents so much easier to execute, because under pressure we fall back on our training.
Know the norm - You have an entire network full of information and likely some application sending you tons of alerts. Know which alerts are false positives and which to investigate by simply doing the work. Investigate the incident until you’re satisfied that you can explain it then go that extra mile to help your team tune the application.
Stay Informed - Staying abreast of the latest vulnerabilities and hacks is all part of being in information security, but that information does only so much if you’re the only one with it. Release alerts to your team, or set a regular date and time to brief the team on the latest developments, both inside your organization and among external threats.
Stay ready so you don’t have to get ready - A wise man once told me this, and to this day it’s something I strive for in Incident Response. Have playbooks, workflows, and escalation paths mapped and ready so that when things go bad you don’t run around lost while your team fumbles the incident. Know what to do and be ready to do it.
Detection
When we were first notified of CVE-2021-21986 and CVE-2021-21985, the first step after reading up on them was to see if we could detect them in our environment. To do that, we looked through the released advisory for clues.
CVE-2021-21986 - Multiple Vulnerable Plugins
Virtual SAN Health Check
Site Recovery
vSphere Lifecycle Manager
VMware Cloud Director Availability plug-ins.
CVE-2021-21985 - RCE in Virtual SAN Health Check.
vSphere Client (HTML5) open externally on port 443
Now that we have the information to make the inquiries, the next step is to follow the playbook “Security Advisories Released”, which lists the application owners and network teams we need to contact. We quickly shot off an email to our VMware team asking whether we were running any of these plugins and whether port 443 was exposed on any of our VMware instances. We sent similar emails to our SOC and Engineering teams so that they could look for abnormal scans of our infrastructure or evidence of a compromise, or attempted compromise, of our VMware infrastructure. Within a few minutes, reports started to come in. Yes, there were people scanning; no, we did not run any of those plugins; and no, external port 443 was not mapped to any internal vCenter instance. In my case, this is when I moved to post a response, but for many of you the situation was much different. Let’s assume the responses from my teams had been different.
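The "is port 443 exposed?" question above can be answered with a quick scripted check while you wait for replies. What follows is a minimal sketch, not our actual tooling: the function names are hypothetical and the host list is something you would supply. It only tells you whether a TCP connection succeeds from wherever the script runs, so to test external exposure it has to be run from outside your network.

```python
import socket


def is_port_open(host: str, port: int = 443, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def exposed_hosts(hosts, port: int = 443, timeout: float = 2.0):
    """Return the subset of hosts that accept connections on the given port."""
    return [host for host in hosts if is_port_open(host, port, timeout)]
```

Calling exposed_hosts with your list of vCenter addresses returns the ones that answered on 443. An open port is only a lead to investigate, not proof that the vSphere Client is what answered.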
Containment
Good security isn’t about never getting hacked; it’s about being able to survive being hacked. You can have strong passwords, but without 2FA a leaked password is all an attacker needs. You can have VPN services, but are you restricting which accounts have access to them and from where they can log in? Hackers, even the criminal ones, know that getting in is just one stage of a multi-stage attack, and once they are in, knowledge of your network is a top priority for them.
Malware can scan your network looking for more hosts to infect while hackers will try to grab passwords locally then spray them to any machine which will accept them. With this in mind, we continue our little journey.
CVE-2021-21985
Exposed 443 - Had we found an instance that did have its port 443 exposed to the internet, the first thing I would do is close it down! I would contact our NOC and request the port be blocked at the firewall and have routing set up so only some investigative boxes can reach it and vice versa.
Access Control - We know who is SUPPOSED to access that box, so we filter their activity out of our logs and see who or what is left. Check for recent changes, permission changes, or file writes.
Clean up - If we were exposed but not exploited, then lucky us. If we find otherwise, it’s time to clean up and restore those backups. You have backups, right? You did prepare, right?
CVE-2021-21986
Plugins Found - Had we found that we were running any of them, my playbook calls for notifying the application owners and explaining why we are disabling the plugins. This is necessary because downtime isn’t always tolerable.
Access Control - We know who is SUPPOSED to access that box, so we filter their activity out of our logs and see who or what is left. Check for recent changes, permission changes, or file writes.
Clean up - If we were exposed but not exploited, then lucky us. If we find otherwise, it’s time to clean up and restore those backups. You have backups, right? You did prepare, right?
Locking down the asset and fixing anything that the attacker may have touched is often a long and tiresome job. It’s also likely a thankless one, since who cares about the name of the fireman as long as the burning has stopped? Along the way, make sure to note all your findings; the when, what, and who are especially important for the number crunchers. We, however, don’t have to stress: we are prepared, we know what’s coming, and we have done it before.
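The Access Control step above, dropping the people who are supposed to be there and looking at who is left, can be sketched in a few lines. This is only an illustration under assumptions: the log entries are already parsed into dictionaries, and the field names ("user", "action", "source_ip") and sample values are hypothetical rather than taken from any real vCenter log format.

```python
def unexpected_entries(entries, expected_users):
    """Return log entries whose user is not on the expected-access list."""
    expected = {user.lower() for user in expected_users}
    # Entries with no user field are kept, since they also need a look.
    return [e for e in entries if str(e.get("user", "")).lower() not in expected]


# Hypothetical, already-parsed log entries for illustration only.
sample_log = [
    {"user": "alice", "action": "login", "source_ip": "10.0.0.5"},
    {"user": "svc-backup", "action": "file_write", "source_ip": "10.0.0.9"},
    {"user": "unknown-admin", "action": "permission_change", "source_ip": "203.0.113.7"},
]

leftover = unexpected_entries(sample_log, ["alice", "svc-backup"])
for entry in leftover:
    print(entry["user"], entry["action"], entry["source_ip"])
```

Whatever remains is the list to chase down. Activity by an expected account can still be malicious if that account was compromised, so treat this as a first pass only.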
Post-incident response
By the time we are here, emotions are generally high. Root cause analysis is happening and the business is on high alert. As everyone spins down, however, it’s normal for people to want, and even need, to place blame. Here is where your training and preparedness can come in super handy.
After Action Report - To those of us on the front lines, that first pump of adrenaline can cause us to go into super mode. While you are multitasking, take the time to take notes: what time a piece of evidence was found, where, and by whom. Prepare a report showing a timeline, the information you found, and, most importantly, what to do to avoid a repeat of the incident. Upper management will thank you, if they don’t already require it.
Continued Monitoring - I wish we could just follow these steps and be done, but we need to stay vigilant. Here are some things to consider depending on how far the actor went.
Login/Email Account Compromised - After forcing a password change, continue to monitor the account for suspicious activity such as off-hours logins and unusual VPN use. Enable 2FA on everything.
Compromised Asset - Force password changes on all the accounts, and monitor the box for suspicious ingress/egress traffic.
Malware / Dropper - Had they gotten access and dropped some malware, there are two options.
Continued monitoring, which we are doing anyway, but this time at a much lower level, such as file writes, memory writes, and the like.
Wipe and reinstall. I’m a fan of this. If you did it once, you can do it twice, and if you have been prepared, the application’s configuration is backed up, right? You did remember backups?
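The off-hours login check mentioned under Login/Email Account Compromised is easy to automate once login timestamps are available. Here is a minimal sketch, assuming business hours of 08:00 to 18:00, Monday through Friday, in the timestamp's own time zone. Both the hours and the function name are assumptions you would tune to your organization.

```python
from datetime import datetime, time


def is_off_hours(ts: datetime,
                 start: time = time(8, 0),
                 end: time = time(18, 0)) -> bool:
    """Return True if ts falls on a weekend or outside [start, end) on a weekday."""
    # weekday() returns 5 for Saturday and 6 for Sunday.
    if ts.weekday() >= 5:
        return True
    return not (start <= ts.time() < end)


# Hypothetical login times for a monitored account.
logins = [
    datetime(2021, 6, 9, 3, 0),    # Wednesday, 03:00
    datetime(2021, 6, 9, 10, 0),   # Wednesday, 10:00
    datetime(2021, 6, 12, 10, 0),  # Saturday, 10:00
]
suspicious = [ts for ts in logins if is_off_hours(ts)]
```

A flagged login is a prompt to ask the account owner, not a verdict; people do work late.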
After you present your After Action Review (AAR) to the upper management it’s time to do the dirty work. Your AAR will describe the event and all the hard work you and your team put in but it will also highlight things you may not be so proud of.
Key Questions to Review
Was your Mean Time to Detect (MTTD) low?
What about your Mean Time to Resolve (MTTR)?
Did you find that you have too many any-any firewall rules?
Did you find that your EDR software didn’t catch it?
Did you find that your network segmentation wasn’t segmented enough?
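To answer the MTTD and MTTR questions above with numbers rather than impressions, the timestamps from your After Action notes are enough. The sketch below assumes each incident records when it started, when it was detected, and when it was resolved, and it measures MTTR from detection to resolution. Definitions of MTTR vary between teams, so treat that choice, along with the Incident type and its field names, as assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    occurred: datetime   # when the malicious activity started
    detected: datetime   # when the team first noticed it
    resolved: datetime   # when containment and cleanup finished


def mean_delta(deltas) -> timedelta:
    """Average a collection of timedeltas; raises ValueError if it is empty."""
    deltas = list(deltas)
    if not deltas:
        raise ValueError("need at least one incident")
    return sum(deltas, timedelta()) / len(deltas)


def mttd(incidents) -> timedelta:
    """Mean time from occurrence to detection."""
    return mean_delta(i.detected - i.occurred for i in incidents)


def mttr(incidents) -> timedelta:
    """Mean time from detection to resolution (one common definition)."""
    return mean_delta(i.resolved - i.detected for i in incidents)
```

Tracking these two values per quarter gives the review something concrete to compare against after each round of playbook changes.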
Repeat
While I do wish each incident would be a one-off, the best we can hope for is that this incident prepares us better for the next one. That means going back over your workflows to refine the process. It means going over the lessons learned with your teams and working out how each of us, ourselves included, can be more proactive and less reactive.
This is where tools such as Airgap Ransomware Kill Switch™ can come in very handy for Zero Trust network segmentation, ransomware containment, and post-incident investigation of all device communications using an agentless approach. With the newly announced Ransomware Kill Switch for Endpoints, organizations can now transform infected endpoints into zero trust endpoints and stop lateral movement among them. Using our example above, we could have utilized their Zero Trust isolation to block port 443, while their Ransomware Kill Switch would have stopped the attacker’s lateral reconnaissance had they gotten access before we detected them. Airgap acts as a gateway, so all devices must communicate through it. This allows a welcome amount of control over what on your network talks to whom. We hope we were able to shine a little light on both the recent VMware vulnerabilities and a few of the strategies that can be utilized during an incident.