Notes for QoS from IE’s CoD — Part 1

I really like the way Brian McG explains stuff. In just the first CoD on QoS, he cleared up a lot of the confusion I had. I still have to listen to the rest of the CoDs on QoS, but here are my notes from the first one for now…


  • when we’re talking about QoS, we’re giving different levels of service to different classes of traffic.
  • Typically when we’re talking about this, it’s Queueing features on outbound interface (output queue).
  • Output queue is a buffer before the hardware queue, or transmit ring (TxR). The TxR is always a FIFO queue. The transmit ring is the last stop before the packet is serialized on the interface.
  • Queueing is only outbound.
  • 2 ends of the output queue: the tail and the head. The tail is where packets enter the output queue; at the head, packets are getting ready to move on to the TxR.
  • CBWFQ is outbound only not inbound.
  • Queueing Mechanisms
    • Method1: bandwidth guarantee on the interface.
      • guarantee minimum space in the output queue
        • legacy custom queueing: configured as a byte-count ratio (e.g. 1 byte of FTP : 3 bytes of other traffic)
        • MQC “bandwidth”: you can specify the value directly, e.g. 25% or 2500 Kbps
      • Congestion management
        • It’s called that because we’re waiting till congestion occurs then we’re trying to deal with it.
        • Only comes into effect when there is congestion in the output queue.
        • e.g., we’re trying to reserve 25% of the output queue for FTP traffic. If the output queue is not 100% full, there’s no need for this kind of reservation. So when we’re doing a bandwidth guarantee, we’re simply saying that, at a minimum, FTP traffic is going to get 25%. However, if the output queue is not full, FTP traffic can go ahead and take 100% of the bandwidth. Unless there are other classes competing, we’re not necessarily limiting the amount of traffic that’s going into the output queue.
        • With legacy CQ, we don’t directly configure a bandwidth value. It uses a byte-count ratio to decide what share of bandwidth each kind of traffic is reserved, e.g. a ratio that works out to 25% of the bandwidth for FTP. With MQC, we can specify a percentage or a specific bandwidth value. So if we had a 10 Mbps Ethernet interface, we could say we want to reserve 2.5 Mbps (25%).
        • with legacy CQ, since it’s based on a relative ratio, we need to figure out the byte counts relative to all the rest of the traffic.
        • Key: we’re doing a “minimum” bandwidth guarantee.
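A minimal MQC sketch of this minimum-bandwidth guarantee (the class/policy names and the interface are my own assumptions, not from the CoD):

```
! classify FTP traffic (uses NBAR via "match protocol")
class-map match-all FTP
 match protocol ftp
!
! reserve a minimum of 25% of the interface bandwidth for FTP
policy-map WAN-OUT
 class FTP
  bandwidth percent 25
!
interface Serial0/0
 service-policy output WAN-OUT
```

FTP still takes more than 25% when the queue isn’t congested; the guarantee is a floor, not a ceiling.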
    • Method2: Prioritization
      • Traffic that enters the tail of the output queue is moved to the head of the queue as soon as it is admitted. For traffic that needs low latency or low delay.
        • legacy priority-queue: rarely implemented anymore, has big shortcomings. Defines 4 queue levels (high, medium, normal, low). For every packet we go to send out the interface, we look at the high queue first: if there are any packets there, we send them first; if not, we go down to medium, then normal, and so on. If there are consistently packets in a higher queue, it will starve the lower queues of bandwidth.
        • MQC “priority” (LLQ): resolves the above problem by building a policer into the prioritization. For example, we give VoIP a maximum of 640 kbps that is guaranteed low latency: as long as the traffic is at or below 640 kbps it gets the low-latency guarantee; if it exceeds that rate, it does not. If there is no congestion in the output queue and we exceed the prioritized rate, the traffic can still be sent, it just isn’t guaranteed priority. The policer only comes into effect if there is congestion AND the priority traffic is exceeding its max rate.
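The LLQ version might look like this sketch (class name, DSCP match, and interface are assumptions; 640 kbps is from the example above):

```
! classify VoIP by DSCP EF (an assumption; any match criteria work)
class-map match-all VOIP
 match ip dscp ef
!
policy-map WAN-OUT
 class VOIP
  ! up to 640 kbps gets the low-latency guarantee; under congestion
  ! the implicit policer drops traffic exceeding this rate
  priority 640
!
interface Serial0/0
 service-policy output WAN-OUT
```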
    • Method3: Traffic Shaping:
      • Used for slowing down the output rate to the TxR.
      • With traffic shaping, we’re taking traffic that is in excess of what we have configured and we’re putting it into a new shaping queue and holding it there under the assumption that at a later time interval, we’re going to transmit the traffic.
      • Shaping is different than a mechanism like policing, because we’re holding onto the traffic and smoothing the traffic rate over a longer term average.
      • Delay excess traffic for later transmission.
        • Legacy GTS & FRTS:
          • for any non-FR interface, we’re going to be using GTS.
          • in FR traffic shaping, we’ll see that we are allowed to create individual shaping queues on a per VC basis. So if we have 3 different DLCIs, we’re going to create 3 different logical queues to process them individually.
          • Now for GTS, which would be on interfaces like Ethernet or P2P link, we’re going to have just one single output queue that will be dealing with all traffic on the wire.
          • Depending on the IOS version, there will be different ways to apply the MQC when we’re configuring it for FR.
        • MQC “shape”
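A sketch of the two styles side by side (rate, names, and interfaces are my assumptions):

```
! legacy GTS: shape all traffic leaving the interface to 128 kbps
interface Serial0/0
 traffic-shape rate 128000
!
! MQC equivalent, applied to all traffic via class-default
policy-map SHAPE-OUT
 class class-default
  shape average 128000
!
interface Serial0/1
 service-policy output SHAPE-OUT
```

Either way, excess traffic is buffered in the shaping queue and sent in a later interval rather than dropped.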
    • Method4: Random Early Detection (RED), or Weighted Random Early Detection (WRED).
      • With Random Detection, we’re trying to avoid congestion before it occurs. Considered a “congestion avoidance” (as opposed to congestion management) technique.
      • It looks at traffic as it is being admitted into the tail of the output queue. Based on different configurable thresholds for different classes of traffic, random detection determines whether the traffic should be admitted to the output queue or dropped.
      • Drop traffic before congestion occurs
        • legacy “random-detect”
        • MQC “random-detect”
      • Weighted based on IPP or DSCP
      • Congestion avoidance
        • avoid congestion before it occurs.
      • the way it is set up, the higher the value of IPP or DSCP, the less likely the traffic is to be dropped on the interface.
      • WRED is typically used in environments where most of the outbound traffic is TCP, and it is also used to prevent the problem of global synchronization of TCP flows.
      • TCP has a congestion management technique built into the protocol known as TCP slow start.
      • Now the way TCP communication works is that when a TCP client creates a connection with a TCP server, they negotiate a window size. This determines how many TCP segments you can send before waiting for an ACK. The smaller the window size, the more ACKs are sent; the larger the window size, the more traffic you send before you get an ACK.
      • The higher the window size, the more efficient the communication is but at the same time, it’s going to take longer to detect a packet loss in the network.
      • so compare a window size of 1000 bytes vs. 10 bytes: if a loss happens somewhere around byte 700 or 800, we won’t detect it as soon with the larger window. TCP detects packet loss when it does not get an ACK for a segment it has sent. When TCP sees that, it knows there’s a packet loss and it will have to retransmit. At this point, the TCP window size drops back down to its minimal value and starts to build back up again.
      • With the slow start mechanism, when the client sees that there is a packet loss, it drops its window size down and starts retransmitting. Once it sees that ACKs are getting through, it slowly increases the window size again, until it hits the maximum or until there is another packet loss.
      • When you look at the graph of traffic vs. time, we’ll see traffic going up and up until it hits some max value and then drops into slow start (the graph looks kind of like shark’s teeth).
      • Each peak is where the packet loss occurs; at the low points, slow start is occurring. The problem we run into is that when packet loss occurs, it typically doesn’t occur for just one host. A switch/router in the transit path is typically dropping packets from more than one TCP flow at a time. This causes multiple clients to go into the slow start mechanism at the same time. Over a long-term average, all of these clients build up, then there is a packet loss, and then they all go into slow start together. So what we see are periods of very high utilization followed by periods of very low utilization. This condition is called global synchronization.
      • Random detection is used to prevent this case. The reason global synch happens is that when the output queue fills up, packet loss occurs at the tail of the output queue. So if we’re running a queueing mechanism like FIFO, when the output queue fills up, traffic that is trying to enter the output queue is dropped. That is known as tail drop, because we’re dropping the packet before admitting it to the output queue.
      • What random detection does is look at thresholds inside the output queue, based on IPP or DSCP. For example, we look at how many packets of IPP 5 are already in the output queue; if that number exceeds a certain threshold, we start dropping traffic for that particular precedence value. What we’re trying to prevent is the case where the output queue fills completely up and we start to have tail drop. When we run into the tail-drop situation, we run into the global synch problem with TCP, where multiple traffic flows get dropped and go into slow start at the same time.
      • Now with random detect, we randomly drop traffic before congestion occurs, so we may cause one or two hosts to go into slow start, but not lots of flows at the same time.
      • Keep the key words in mind for how these mechanisms are described; read the words for what they actually imply about what’s being asked for.
      • legacy config, one command on the interface.
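A sketch of the legacy one-liner next to the MQC form (policy name and interface assumed):

```
! legacy: one command on the interface, IP precedence-based WRED
interface Serial0/0
 random-detect
!
! MQC: per-class WRED, weighted by DSCP instead of IPP
policy-map WAN-OUT
 class class-default
  fair-queue
  random-detect dscp-based
```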
    • Method5: Traffic Policing
      • all the mechanisms we saw above have been queueing mechanisms, so they are applied outbound. But policing is not a queueing mechanism, therefore it can be applied outbound or inbound on the interface.
      • Limit the output or input rate of the interface
        • legacy “rate-limit” (CAR)
        • MQC “police”
      • Not a queueing mechanism
        • does not buffer traffic for later transmission
        • this is the reason it is not a queueing mechanism.
        • with shaping, we were hanging onto the excess (delayed) traffic; policing does not do that.
        • with policing, like shaping, we’re using it to enforce a rate, but traffic that exceeds the rate is not queued for later transmission.
      • can be used to enforce rate or remark
        • conform action vs exceed action.
        • if the traffic is at or below the configured rate, the conform action applies; if it goes above the configured rate, the exceed action applies.
        • conform usually transmit, exceed usually drop
        • can be applied as input or output feature.
      • minor differences b/w legacy and MQC
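As a sketch of both forms, policing inbound to 8 Mbps (rates, burst sizes, names, and interfaces are my assumptions):

```
! legacy CAR: rate, then normal and max burst in bytes
interface FastEthernet0/0
 rate-limit input 8000000 1500000 3000000 conform-action transmit exceed-action drop
!
! MQC equivalent
policy-map POLICE-IN
 class class-default
  police 8000000 conform-action transmit exceed-action drop
!
interface FastEthernet0/1
 service-policy input POLICE-IN
```

Note the policy is applied with `input` here, which none of the queueing mechanisms above can do.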
  • MQC Overview:
    • Modular QoS Command Line Interface
    • AKA class-based WFQ, or the MQC or CBWFQ. That is referring to class-map/policy-map based methods.
    • Allows multiple QoS methods in same direction on same interface at the same time.
    • With legacy QoS, can apply one QoS method at a time.
    • Class-map/policy-map config
    • Now with MQC, based on the different types of traffic we are matching (using class-maps), we are going to choose, what type of QoS mechanism are applied for that class of traffic. So we can say, FTP traffic is going to be guaranteed bandwidth, while VOIP will get priority, Web is going to be shaped, we can do all at the same time, in the same direction on the same interface.
    • The only limitation we have is that we cannot apply output-only features inbound on the interface. So we cannot apply shaping inbound, or bandwidth reservation inbound, or prioritization inbound. Queueing mechanisms are only outbound on the interface.
    • MQC Configuration: 3 step process
      • Define Traffic classes
        • what type of traffic do I want to apply QoS to?
        • class-map [match-all | match-any] [name]
          • match
          • acls
          • packet length
          • dscp/ipp
          • nbar – to look at things above L3-L4 headers.
      • Define traffic policy
        • what type of QoS do I want to apply?
        • policy-map
          • Bandwidth | Priority | Shape | Police | Random-detect | Set…
      • Apply Policy
        • service-policy [input | output] [policy_name]
        • queueing only outbound.
      • Verify with the “show policy-map interface” command. If we don’t see packets being matched on the classes on the interface, that indicates there’s a problem with the way the class is defined or the policy is applied. Based on the platform and type of interface, it can vary where the service policy is applied.
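Putting the 3 steps together in one sketch (all names and values are assumptions, combining the earlier examples):

```
! Step 1: define traffic classes
class-map match-all VOIP
 match ip dscp ef
class-map match-all FTP
 match protocol ftp
!
! Step 2: define the traffic policy
policy-map WAN-OUT
 class VOIP
  priority 640
 class FTP
  bandwidth percent 25
 class class-default
  fair-queue
!
! Step 3: apply the policy (queueing actions: output only)
interface Serial0/0
 service-policy output WAN-OUT
```

Then `show policy-map interface Serial0/0` should show packet/byte counters incrementing per class if classification is working.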
