Saturday, September 24, 2011

Kindergarten QoS of UCSM

With heavy over-subscription, and with FCoE and Ethernet sharing the same physical media, QoS tends to be quite complicated in the UCS platform. Yet in the management plane the QoS configuration is presented in such a simplified manner that it was referred to internally as "kindergarten QoS". Let's take the FCoE part out of the picture first. Fibre Channel is fundamentally a lossless L2 technology; Ethernet, by contrast, makes no such promise. However, things changed with the per-priority-pause (PPP) and backward congestion notification features of data-center Ethernet (DCE). FCoE uses PPP to simulate lossless media over Ethernet. In the UCS networking components (Fabric Interconnect aka switch, Fabric Extender and adapters), a dedicated queue is used for FCoE to ensure end-to-end PPP behavior. That takes care of FCoE.

Now, let's get back to how UCSM presents the QoS configuration to the administrator. There are two main parts:
  1. System QoS classes
  2. VNIC (egress) QoS policies
The following is the global QoS class configuration:



There are six global QoS classes defined:
  1. Default Ethernet class (best-effort)
  2. FCoE
  3. Platinum
  4. Gold
  5. Silver
  6. Bronze
Only the first two are enabled by default (and can't be disabled). There's one more hidden control class defined internally, but it's not exposed to users. The salient characteristics of the UCSM system classes are:
  • Classification and marking are combined into one step. Classification is based only on the L2 cos value (with cos 7 reserved for control traffic). Adapters mark the traffic by referring to the system class; the FI trusts the marking and classifies the frames accordingly.
  • The CBWRR queuing strategy is used on all ports of the FI. Per-port policy application is not required.
  • Bandwidth allocation per class is done using relative weights. An explicit bandwidth percentage is not exposed, to avoid user (or script) configuration errors that exceed 100% of the interface bandwidth. (Once the user chooses a weight, the system displays the percentage for all classes.)
  • Per-interface MTU is not supported; a per-class MTU is specified for the entire system.
  • Notably missing are priority queuing and class-based weighted random early drop.
The following figure shows what a VNIC QoS policy looks like:
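
The weight-to-percentage step mentioned in the bandwidth bullet can be sketched as follows. This is only an illustration: the function name, the example weights and the largest-remainder rounding rule are assumptions, and UCSM's own rounding may well differ.

```python
def weights_to_percent(weights):
    """Convert relative class weights to whole-number bandwidth percentages.

    Uses the largest-remainder method so the result always sums to 100.
    The rounding rule is an assumption, not UCSM's documented behavior.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one class needs a positive weight")
    exact = {name: 100 * w / total for name, w in weights.items()}
    percent = {name: int(share) for name, share in exact.items()}
    leftover = 100 - sum(percent.values())
    # Hand the leftover points to the classes with the largest fractional part.
    by_fraction = sorted(exact, key=lambda n: exact[n] - percent[n], reverse=True)
    for name in by_fraction[:leftover]:
        percent[name] += 1
    return percent


# Hypothetical weights, chosen only for illustration.
example = weights_to_percent(
    {"platinum": 10, "gold": 9, "silver": 8, "bronze": 7, "fcoe": 5, "best-effort": 5}
)
print(example)
```

Because only relative weights are accepted, no input can ever ask for more than 100% of the interface, which is the point of hiding the percentage from the user.
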

As you can see, not much configuration is available here. You refer to a system class and specify shaping parameters. The "cos" specified by the system class is used to mark untagged egress packets from the host. The "Host control" setting determines how the system treats packets that the host has already marked. If "full" is specified, packets tagged by the host are trusted; otherwise, any packet that is marked by the host and doesn't match the cos specified by the system class is dropped by the Cisco adapters. The shaping parameters are enforced by the adapters on egress traffic (from the host's perspective).
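
The marking and host-control rules described above can be written down as a small decision function. This is a sketch of the behavior as described in this post, not adapter firmware logic; the function name and the return convention are made up for illustration.

```python
from typing import Optional, Tuple


def egress_action(host_cos: Optional[int], class_cos: int,
                  host_control: str) -> Tuple[str, Optional[int]]:
    """Return (action, cos) for a frame leaving the host through a VNIC.

    host_cos is None for an untagged frame. host_control is "full" when the
    host's marking is trusted; any other value means it is not trusted.
    """
    if host_cos is None:
        # Untagged traffic is marked with the cos of the referenced system class.
        return ("forward", class_cos)
    if host_control == "full":
        # Host marking is trusted as-is.
        return ("forward", host_cos)
    if host_cos == class_cos:
        # Host marking happens to agree with the system class.
        return ("forward", class_cos)
    # Marked by the host, not trusted, and not matching the class cos.
    return ("drop", None)


print(egress_action(None, 4, "none"))
print(egress_action(5, 4, "full"))
print(egress_action(5, 4, "none"))
```
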

That's it! Once you have defined various VNIC QoS policies, the VNICs in the service-profile can refer to them by name. The named policy reference works as per the policy resolution mentioned in my previous post.

For comparison, the UCSM configuration described above expands to the following MQC configuration in NX-OS:

UCS-A(nxos)# show class-map


Type qos class-maps
===================

class-map type qos match-any class-fcoe
match cos 3

class-map type qos match-all class-gold
match cos 4

class-map type qos match-all class-bronze
match cos 1

class-map type qos match-all class-silver
match cos 2

class-map type qos match-any class-default
match any

class-map type qos match-all class-platinum
match cos 5

class-map type qos match-any class-all-flood
match all flood

class-map type qos match-any class-ip-multicast
match ip multicast


Type queuing class-maps
=======================

class-map type queuing class-fcoe
match qos-group 1

class-map type queuing class-gold
match qos-group 3

class-map type queuing class-bronze
match qos-group 5

class-map type queuing class-silver
match qos-group 4

class-map type queuing class-default
match qos-group 0

class-map type queuing class-platinum
match qos-group 2

class-map type queuing class-all-flood
match qos-group 2

class-map type queuing class-ip-multicast
match qos-group 2



Type network-qos class-maps
==============================

class-map type network-qos class-fcoe
match qos-group 1

class-map type network-qos class-gold
match qos-group 3

class-map type network-qos class-bronze
match qos-group 5

class-map type network-qos class-silver
match qos-group 4

class-map type network-qos class-default
match qos-group 0

class-map type network-qos class-platinum
match qos-group 2

class-map type network-qos class-all-flood
match qos-group 2

class-map type network-qos class-ip-multicast
match qos-group 2

UCS-A(nxos)# show policy-map


Type qos policy-maps
====================

policy-map type qos system_qos_policy
class type qos class-platinum
set qos-group 2
class type qos class-silver
set qos-group 4
class type qos class-bronze
set qos-group 5
class type qos class-gold
set qos-group 3
class type qos class-fcoe
set qos-group 1
class type qos class-default
set qos-group 0

Type queuing policy-maps
========================

policy-map type queuing system_q_in_policy
class type queuing class-platinum
bandwidth percent 22
class type queuing class-gold
bandwidth percent 20
class type queuing class-silver
bandwidth percent 18
class type queuing class-bronze
bandwidth percent 15
class type queuing class-fcoe
bandwidth percent 14
class type queuing class-default
bandwidth percent 11
policy-map type queuing system_q_out_policy
class type queuing class-platinum
bandwidth percent 22
class type queuing class-gold
bandwidth percent 20
class type queuing class-silver
bandwidth percent 18
class type queuing class-bronze
bandwidth percent 15
class type queuing class-fcoe
bandwidth percent 14
class type queuing class-default
bandwidth percent 11
policy-map type queuing org-root/ep-qos-HTTP
class type queuing class-fcoe
bandwidth percent 50
class type queuing class-default
bandwidth percent 50
shape 10000 kbps 10240
policy-map type queuing org-root/ep-qos-Streaming
class type queuing class-fcoe
bandwidth percent 50
class type queuing class-default
bandwidth percent 50
shape 100000 kbps 10240


Type network-qos policy-maps
===============================

policy-map type network-qos system_nq_policy
class type network-qos class-platinum

mtu 1500
pause no-drop
class type network-qos class-silver

mtu 1500
pause drop
class type network-qos class-bronze

mtu 1500
pause drop
class type network-qos class-gold

mtu 9000
pause drop
class type network-qos class-fcoe

pause no-drop
mtu 2158
class type network-qos class-default

pause drop
mtu 1500
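
To make the output above easier to read, here is a small lookup that restates it: cos value to class, then class to qos-group, MTU and drop behavior. The values are copied from the show output; the table layout and function name are only an illustration, and the flood and multicast classes are left out.

```python
# cos -> system class, from the "type qos" class-maps above.
COS_TO_CLASS = {5: "platinum", 4: "gold", 2: "silver", 1: "bronze", 3: "fcoe"}

# class -> qos-group (type qos policy-map) and MTU / pause (network-qos policy-map).
CLASS_INFO = {
    "platinum": {"qos_group": 2, "mtu": 1500, "no_drop": True},
    "gold":     {"qos_group": 3, "mtu": 9000, "no_drop": False},
    "silver":   {"qos_group": 4, "mtu": 1500, "no_drop": False},
    "bronze":   {"qos_group": 5, "mtu": 1500, "no_drop": False},
    "fcoe":     {"qos_group": 1, "mtu": 2158, "no_drop": True},
    "default":  {"qos_group": 0, "mtu": 1500, "no_drop": False},
}


def classify(cos):
    """Map a frame's cos value (or None if untagged) to its class and settings."""
    name = COS_TO_CLASS.get(cos, "default")
    return name, CLASS_INFO[name]


for cos in (3, 4, 0):
    print(cos, classify(cos))
```
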

Friday, September 9, 2011

UCS management paradigm

UCS Manager (UCSM) offers a distinctive and interesting set of features for ease of deployment in the cloud. UCSM stores configuration, device information, statistics and policies in an object-oriented data model. The "management brain" of UCS is completely data driven. The only interface to UCSM is through its XML APIs, and both the GUI and the CLI internally use XML to communicate with the core UCSM process. With that high-level background, let me get into the details of some interesting characteristics of UCSM.
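
To give a feel for what "XML in, XML out" means, here is a minimal sketch that only builds two request documents and sends nothing. The method and attribute names (aaaLogin, configResolveClass, inName, inPassword, cookie, classId, inHierarchical) are written from my recollection of the UCSM XML API and should be checked against the API reference; the helper names, credentials and cookie value are placeholders.

```python
import xml.etree.ElementTree as ET


def login_request(user: str, password: str) -> bytes:
    """Build a login request; the response would carry a session cookie."""
    return ET.tostring(ET.Element("aaaLogin", {"inName": user, "inPassword": password}))


def resolve_class_request(cookie: str, class_id: str) -> bytes:
    """Build a query for all managed objects of one class."""
    attrs = {"cookie": cookie, "classId": class_id, "inHierarchical": "false"}
    return ET.tostring(ET.Element("configResolveClass", attrs))


# Placeholder values, for illustration only.
print(login_request("admin", "example-password").decode())
print(resolve_class_request("example-cookie", "computeBlade").decode())
```
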

Named References

UCSM makes maximum use of named references. Templates, pools and policies are used to loosely bind the configuration and to easily share common data. Policies dictate behavior, and multiple configuration end-points can share the same behavior. A change in a policy does not require revisiting all the end-points that refer to it. There are standalone policies for global configuration, for example the chassis discovery policy, the VM life cycle policy, etc. VLANs are also referred to by name. For example, all database servers can be in a "dbNet" VLAN with VLAN id 10, and all the service profiles corresponding to database servers would have VNICs referring to the VLAN by the name "dbNet". Once this is in place and the servers are up and running, if the network admin changes the network architecture and the VLAN id changes from 10 to 20, he or she wouldn't have to revisit all the servers; the id would be changed in only one place.
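
The "dbNet" example boils down to an extra level of indirection. A toy sketch, with plain dictionaries standing in for UCSM managed objects (the VNIC names and field names are made up):

```python
# One place that knows the VLAN id.
vlans = {"dbNet": 10}

# VNICs store only the VLAN name, never the id.
vnics = [
    {"name": "db1-eth0", "vlan_name": "dbNet"},
    {"name": "db2-eth0", "vlan_name": "dbNet"},
]


def resolved_vlan_id(vnic, vlan_table):
    """Look up the VLAN id a VNIC currently resolves to."""
    return vlan_table[vnic["vlan_name"]]


before = [resolved_vlan_id(v, vlans) for v in vnics]

# The network admin renumbers the VLAN in one place...
vlans["dbNet"] = 20

# ...and every VNIC that refers to it by name follows along.
after = [resolved_vlan_id(v, vlans) for v in vnics]
print(before, after)
```
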

Policy resolution in hierarchical org structure

UCSM lets you reflect the hierarchical organizational structure of a company (or the tenants of a cloud service provider) in the managed object model. For example, you can have the classic "coke" and "pepsi" top-level orgs. Under "coke", you can have "operations", "research", "legal" and "marketing" orgs. Now let's say there are "streaming" and "http" QoS policies defined at the "coke" level, where "http" restricts the bandwidth to 1 Mbps. But in "research" there's a requirement to let web traffic flow at up to line rate, so the administrator can create a QoS policy with the same name at the "research" org level. When a policy is referred to by name, the reference resolves to the closest org level that has a policy with that name. So, a port-profile defined at the "research" level or in its sub-orgs would enjoy line rate if it refers to the "http" QoS policy.


Also, the policy gets re-resolved if another policy with the same name is added or removed in the org hierarchy. In the previous example, if the "http" QoS policy is deleted from the "research" level, the port-profiles referring to it would automatically resolve to the "http" policy at the "coke" level.
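
The closest-match rule and the re-resolution after a delete can be sketched in a few lines. Org paths are written as slash-separated strings and policies as a plain dictionary; both are simplifications for illustration, not UCSM's internal representation, and the "lab" sub-org is made up.

```python
def ancestors(org):
    """Yield the org itself, then each parent, up to the top level.

    "root/coke/research" -> "root/coke/research", "root/coke", "root"
    """
    parts = org.split("/")
    for i in range(len(parts), 0, -1):
        yield "/".join(parts[:i])


def resolve(policies, org, name):
    """Return (org_level, policy) for the closest policy with this name, or None."""
    for level in ancestors(org):
        if (level, name) in policies:
            return level, policies[(level, name)]
    return None


policies = {
    ("root/coke", "http"): "1 Mbps",
    ("root/coke/research", "http"): "line rate",
}

# A reference made from a sub-org of "research" picks the research-level policy.
closest = resolve(policies, "root/coke/research/lab", "http")

# After the research-level policy is deleted, the same reference falls back to "coke".
del policies[("root/coke/research", "http")]
fallback = resolve(policies, "root/coke/research/lab", "http")
print(closest, fallback)
```
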

Loose Referential Integrity

Most management systems would not let you delete a policy if it is referred to by other configuration. However, UCSM does not enforce such strict referential integrity. UCSM referential integrity works in the following loose manner:

  • For every policy/pool type, there exists a predefined policy at the root level named "default".
  • If the system doesn't find the named policy in the org scope of the configuration that refers to it, then the default policy is used.
  • If a referred policy gets deleted (and no other policy with that name exists at any other org level in scope), then the configuration referring to it resolves to the default policy.
  • If a more relevant (closer) named policy is defined with respect to the configuration, then the reference gets re-resolved to the "closer" policy in the org hierarchy.
Of course, when the system resolves policies dynamically, it must have a mechanism to inform the administrator about its resolution. Two mechanisms are used for this:
  1. An operational property is defined for every named reference, which specifies the distinguished name of the policy that the system resolved.
  2. If a specific named policy is not found and the default is used, then a fault is raised.
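
The fallback-to-default rule and the two feedback mechanisms can be sketched together. The Resolution record, its field names and the DN strings below are made up for illustration; real UCSM distinguished names and fault objects look different.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass
class Resolution:
    oper_dn: str            # stand-in for the operational property (resolved DN)
    fault: Optional[str]    # set only when the default had to be used


def resolve_with_default(defined: Set[Tuple[str, str]], org: str, name: str) -> Resolution:
    """Resolve a named reference from `org` upward, falling back to root's default.

    Assumes a policy called "default" always exists at the top-level org.
    """
    parts = org.split("/")
    for i in range(len(parts), 0, -1):
        level = "/".join(parts[:i])
        if (level, name) in defined:
            return Resolution(oper_dn=f"{level}/{name}", fault=None)
    return Resolution(
        oper_dn=f"{parts[0]}/default",
        fault=f"named policy '{name}' not found; using default",
    )


defined = {("root", "default"), ("root/coke", "http")}
found = resolve_with_default(defined, "root/coke/research", "http")
missing = resolve_with_default(defined, "root/coke/research", "video")
print(found)
print(missing)
```

The reference itself never changes; only the operational value and the fault tell the administrator what the system actually picked.
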
Asynchronous Configuration Deployment

In server management systems, many operations take a long time to finish (VM deployment, server reboot, etc.), so northbound APIs cannot be blocked for such extended periods. UCSM provides a completely asynchronous northbound experience. It differs from Cisco's networking gear in this regard. For example, when a user issues a command to create a VLAN, the management system checks the range, the maximum number of VLANs, etc.; if the user input is good, it immediately unblocks the user, effectively saying "consider it done". Later, it deploys the VLAN on the switch, which would fail only if there are other serious issues such as the control plane running out of memory; if that happens, faults are raised.
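
The "validate now, deploy later, raise a fault on failure" pattern can be sketched like this. The class, its method names and the switch_capacity knob (a stand-in for a late backend failure) are all hypothetical; the sketch is single-threaded on purpose so that the order of events is easy to follow.

```python
from collections import deque


class AsyncVlanManager:
    """Toy model: synchronous validation, deferred deployment, faults on failure."""

    def __init__(self, max_vlans, switch_capacity):
        self.max_vlans = max_vlans
        self.switch_capacity = switch_capacity  # simulates a backend limit hit later
        self.accepted = {}
        self.pending = deque()
        self.deployed = set()
        self.faults = []

    def create_vlan(self, name, vlan_id):
        # Cheap checks happen while the caller waits.
        if not 1 <= vlan_id <= 4094:
            raise ValueError("VLAN id out of range")
        if len(self.accepted) >= self.max_vlans:
            raise ValueError("maximum number of VLANs reached")
        self.accepted[name] = vlan_id
        self.pending.append(name)
        return "accepted"  # "consider it done": the caller is unblocked here

    def deploy_pending(self):
        # Runs later, outside the caller's request.
        while self.pending:
            name = self.pending.popleft()
            if len(self.deployed) >= self.switch_capacity:
                self.faults.append(f"deployment of VLAN '{name}' failed")
            else:
                self.deployed.add(name)


mgr = AsyncVlanManager(max_vlans=10, switch_capacity=1)
print(mgr.create_vlan("dbNet", 10), mgr.create_vlan("webNet", 20))
mgr.deploy_pending()
print(mgr.deployed, mgr.faults)
```
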

Putting it all together


UCSM is designed with cloud and data-center virtualization in mind. It is extremely friendly to automation given its XML APIs, data-driven model and asynchronous nature. Loosely coupled policies and maximum use of pools, policies and templates make it an ideal fit for server procurement in the cloud, especially in multi-tenant systems. The field has welcomed these characteristics very warmly. Just as a sample, here is a blog that talks about integration of the XML APIs in PowerShell.