I encountered an issue with a Hyper-V virtual machine recently that had me very confused. I still don’t have a great resolution to it but at least a functional workaround.
At a site I have a Server 2012 R2 Hyper-V host (Host), a Server 2012 R2 file server guest (FileServer) and a Server 2012 R2 domain controller (DC).
The Host has two NICs in a single switch-independent address hash team. This Team is set as the source of the vSwitch which has management capabilities enabled. We wanted to use a NIC team to provide network redundancy in case a cable was disconnected as this network is being provided by the site owner rather than my company.
Host and FileServer could reach DC, but nothing else could. DC could reach Host and FileServer, but not even it’s own default gateway.
This immediately sounded like a mis-configured virtual switch on Host, as it appeared DC could only access internal traffic. But I confirmed the vswitch was set to “External” and if this were the case the FileServer would have presumably been affected by this issue as well, but it was not.
I tried disabling VMQ on the Host NICs, as well as Large Send Offload, since both those features have been known to cause problems, but that did not resolve the issue either.
I tried changing the teaming algorithm to Hyper-V Port and Dynamic, but that didn’t resolve the issue either.
Then I decided to put one of the NICs into a Standby state in the team. This caused the accessibility to switch between the VMs; all of a sudden nothing external could reach my file server, but the DC came online to external traffic.
I tried changing which NIC was in standby but that still left me with one VM that had no connectivity.
My assumption at this point is that this issue is being caused by Port Security on the network switches; something that we are aware the site owner is doing. I suppose that the Team presents a single MAC address across multiple ports, which the port security doesn’t like and so it blocks traffic from one side of that team. Because of how traffic is balanced across the team it leaves one of the VMs in an inaccessible state.
Unfortunately we do not have control over this network or the ability to implement LACP, and so I’ve had to remove the NIC teaming and go back to segregated NICs for management and VM access.