Monday, August 22, 2011

Network Virtualization in Cisco's UCS

I would like to start this blog with my favorite project - Network Virtualization in Cisco's Unified Compute System (UCS). In particular, I would like to write about the integration of UCS with VMware's vCenter to achieve seamless network virtualization in data centers. Before I go into the details of the "solution", let me go over the "problem" first.

Issues with virtualization in the data center network

As soon as virtualization is added to a host, a software switch, or "soft switch", has to be added in the host kernel. Refer to the diagram on the left. Once multiple VMs run on a host (an ESX host in the case of VMware), switching must happen in the host kernel whenever one VM wants to communicate with another, because of Ethernet switching 101: a frame is never sent back out the interface it arrived on. It is either sent to the specific interface where the destination MAC address was learned, or, in the case of unknown unicast, broadcast, and multicast, flooded to all interfaces except the one it arrived on. So a soft switch was mandated from day one on a VMware ESX host.
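
To make that forwarding rule concrete, here is a minimal Python sketch of a learning switch's forwarding decision; the class, method, and port names are made up for illustration and are not any product's actual code.

```python
# Minimal sketch of the Ethernet forwarding rule described above.
# All names here are illustrative only.

class LearningSwitch:
    def __init__(self, ports):
        self.ports = set(ports)      # e.g. {"vm1_vnic", "vm2_vnic", "uplink"}
        self.mac_table = {}          # learned MAC address -> ingress port

    def forward(self, src_mac, dst_mac, in_port):
        """Return the set of ports a frame should be sent out of."""
        self.mac_table[src_mac] = in_port          # learn the source MAC
        out_port = self.mac_table.get(dst_mac)
        if out_port is not None and out_port != in_port:
            return {out_port}                       # known unicast
        # Unknown unicast / broadcast / multicast: flood to every port
        # except the one the frame arrived on -- never back out in_port.
        return self.ports - {in_port}

# Two VMs on the same host: an external switch would have to send the
# frame back out the same port it arrived on, which the rule above
# forbids -- hence the mandatory soft switch inside the host.
vswitch = LearningSwitch({"vm1_vnic", "vm2_vnic", "uplink"})
print(vswitch.forward("aa:aa", "bb:bb", in_port="vm1_vnic"))  # floods at first
print(vswitch.forward("bb:bb", "aa:aa", in_port="vm2_vnic"))  # {'vm1_vnic'}
```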

However, as soon as the soft switch was added to the host, another issue surfaced in the way data centers are managed. The line in the middle of the diagram is the boundary between the server admin team and the network admin team, and now a switch is sitting in the server admin's domain. In the good old days before virtualization, the port connected to the host was typically put in access mode, with one access VLAN assigned to it and specific QoS and access control policies attached to it. Now, because of the soft switch in the host, the trust boundary has to be extended to the soft switch, multiple VLANs have to be allowed on the port connected to the ESX host, the QoS trust boundary has to be extended to the soft switch (or more expensive NBAR-based classification has to be performed), and access control policy enforcement becomes complex. The link between the switch and the ESX host becomes a dumb pipe (that's why I have drawn it that way), and network policy enforcement becomes the responsibility of the soft switch, which is typically managed by the server admin.

Cisco's UCS solves this problem and streamlines data center virtualization by offering a hardware-based VNTag solution.

UCS Network Virtualization

In Cisco's Unified Compute System (UCS), customers have the choice to replace the ESX host's soft switch with hardware-based switching in the UCS access switch, aka the UCS Fabric Interconnect. In the ESX host, a Cisco kernel module, the Virtual Ethernet Module or VEM, is added.

The VEM works as a replacement for the soft switch: instead of performing switching in the host, for every north-bound virtual NIC created on the VEM, a south-bound VNIC instance is created in Cisco's M81KR virtualized adapter. In the current releases, up to 56 such VNICs can be created on the M81KR adapter. The adapter in turn requests the attached fabric interconnect (access switch) to create dynamic virtual Ethernet interfaces (vEths). vEths are logical entities created in the ASIC and enjoy the same status as physical interfaces, so frames from one VM to another are now switched in the ASIC on the access switch. The rich set of switching features of the Nexus 5000 switch is now available on the vEths, which are managed by the network admin team. M81KR is also the industry's first implementation of VMDirectPath technology, which essentially bypasses the switching module in the hypervisor and provides near bare-metal performance to VM VNICs. In addition to solving the admin-domain issue and providing faster switching in hardware, UCS provides smooth integration with vCenter and port-profile based network management.
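
To make the chain of objects easier to follow, here is a rough Python model of that flow: each VM-facing vNIC gets a VNIC instance on the adapter, which in turn asks the fabric interconnect for a dynamic vEth. Apart from the 56-VNIC figure quoted above, all names and behavior here are assumptions for illustration, not Cisco's implementation.

```python
# Rough, illustrative model of the VNIC/vEth creation flow described above.

class FabricInterconnect:
    def __init__(self):
        self._next_id = 1
        self.veths = {}                      # vEth name -> owning VNIC

    def create_dynamic_veth(self, vnic_name):
        veth = f"vEth{self._next_id}"
        self._next_id += 1
        self.veths[veth] = vnic_name         # switching and policy now live here
        return veth

class M81KRAdapter:
    MAX_VNICS = 56                           # per-adapter limit quoted in the text

    def __init__(self, fabric_interconnect):
        self.fi = fabric_interconnect
        self.vnics = {}                      # VNIC name -> vEth on the FI

    def create_vnic(self, vm_name, nic_index):
        if len(self.vnics) >= self.MAX_VNICS:
            raise RuntimeError("M81KR VNIC limit reached")
        vnic_name = f"{vm_name}-vnic{nic_index}"
        self.vnics[vnic_name] = self.fi.create_dynamic_veth(vnic_name)
        return vnic_name

fi = FabricInterconnect()
adapter = M81KRAdapter(fi)
adapter.create_vnic("web-vm", 0)   # frames between these VMs are switched
adapter.create_vnic("db-vm", 0)    # in the fabric interconnect's ASIC, per vEth
print(fi.veths)                    # {'vEth1': 'web-vm-vnic0', 'vEth2': 'db-vm-vnic0'}
```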

UCSM / vCenter integration and port-profiles

Another benefit of using UCS in the data center is its integration with VMware's vCenter and port-profile based VM network management. UCSM first registers itself with vCenter as a management extension, and it can then create multiple distributed virtual switches in vCenter. A distributed virtual switch is a VEM that spans multiple ESX hosts. At this point, the network admin can create multiple port-profiles. Port-profiles are groups of network policies and configuration identified by a name. For example, "finance" and "hr" port-profiles would contain the VLANs, QoS policies, L2 security policies, etc. for those two groups. UCSM pushes only the profile names and descriptions to vCenter, as the server admin wouldn't be interested in the nitty-gritty of network configuration.
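
A toy sketch of that split, with hypothetical class and field names: the network admin owns the full port-profile, and only its name and description cross the admin boundary into vCenter.

```python
# Illustrative model of a port-profile; not UCSM's actual object model.
from dataclasses import dataclass, field

@dataclass
class PortProfile:
    name: str                                   # what the server admin sees
    description: str
    vlans: list = field(default_factory=list)   # network-admin-only details
    qos_policy: str = "default"
    l2_security: dict = field(default_factory=dict)

    def summary_for_vcenter(self):
        # Only the name and description are pushed to vCenter.
        return {"name": self.name, "description": self.description}

finance = PortProfile("finance", "Finance VM networking",
                      vlans=[100], qos_policy="gold",
                      l2_security={"mac_limit": 1})
hr = PortProfile("hr", "HR VM networking", vlans=[200])

# What UCSM would publish to vCenter -- no VLAN IDs, no QoS details.
print([p.summary_for_vcenter() for p in (finance, hr)])
```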

When the server admin deploys a VM, she creates VNICs for the VM and assigns a port-profile to each VNIC. When the VM instantiates on the hypervisor, the VEM and the M81KR adapter request the access switch to dynamically create vEth interfaces and provide the corresponding port-profile names. As the access switch has the complete configuration of the port-profiles, the vEths inherit their configuration from the corresponding port-profiles, and thus the whole loop is closed. As you can see, this architecture has several benefits:
  • It decouples the server and network admins' workflows and eliminates inter-dependency. For example, the network admin can easily change a VLAN ID in a port-profile without asking the server admin team to make any changes on the server side.
  • When vMotion happens, the VM-VNIC's configuration also moves smoothly from one host to another. The vEth interface is detached from one access port and attached to the access port connected to the destination hypervisor. All the state and stats of the vEth are preserved across vMotion, and no external script or entity needs to change the configuration of the access ports (see the sketch after this list).
  • UCS (which includes the access switch) is aware of the virtualization. You can see the associations between physical servers, their service-profiles, hypervisors, VM instances, VNICs, and port-profiles in the UCSM UI.
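
Here is a small, purely illustrative sketch of the vMotion point above: the vEth keeps its inherited port-profile and its counters, and only the attachment point (the host) changes. The VEth class and its fields are hypothetical, not UCSM code.

```python
# Toy model of a vEth surviving vMotion; names are illustrative only.

class VEth:
    def __init__(self, veth_id, port_profile_name, host):
        self.veth_id = veth_id
        self.port_profile = port_profile_name   # config inherited by name on the FI
        self.host = host                        # hypervisor the VM currently runs on
        self.stats = {"rx_frames": 0, "tx_frames": 0}

    def vmotion(self, new_host):
        # Only the attachment point changes; the inherited VLAN/QoS
        # configuration and the counters stay with the vEth, so no
        # access-port reconfiguration is needed on either side.
        self.host = new_host

veth = VEth("vEth10", "finance", host="esx-01")
veth.stats["tx_frames"] += 1000
veth.vmotion("esx-02")
print(veth.host, veth.port_profile, veth.stats)  # esx-02 finance {'rx_frames': 0, 'tx_frames': 1000}
```
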
You can find white papers with more technical details of this solution on the Cisco website; my goal here was to walk through the latest end-to-end virtualization technology as simply as possible.

Disclaimer: Anything explained/expressed here is not an official form of communication by/from Cisco Systems, Inc.

4 comments:

  1. This is a nice approach to solving this VM problem. Mehul, can you share some statistical data, i.e., throughput with the soft switch compared to line rate, scalability, etc.?

  2. Best comparative data is available at http://www.cisco.com/en/US/solutions/collateral/ns340/ns517/ns224/ns944/white_paper_c11-593280.html

    Personally, I've observed that the efficiency improvement is noticeable when you have more VMs per host. If your VMs are CPU bound, you would also want to offload the switching load to hardware.

  3. This is a really well explained article. Assuming we can potentially run 10s of VMs per blade, we need ~1500 vEths on the Fabric Interconnect; can we scale up to that number?

  4. Yes, so far so good - current limit is 2k vEths per FI, with plans to increase it much further....
