Cisco SD-WAN IPsec Tunnel Configuration

This blog post describes configuring a site-to-site IPsec VPN tunnel from a Cisco SD-WAN IOS-XE-based router to a non-SD-WAN device.

How do you configure Cisco SD-WAN IPsec tunnels to a non-SD-WAN device? In a template-based Cisco SD-WAN deployment, IPsec tunnels are configured via the Cisco VPN Interface IPsec feature template. This template is then applied to the Transport VPN (VPN 0) or to one of the Service VPNs.

Cisco SD-WAN IPSec Tunnels Step-by-step

Figure 1. Configuration Map: Elements Diagram

The logical elements that need to be configured are shown in Figure 1. All pre-configured elements have check mark symbols next to them.

In our example, the edge device has a device template attached with basic configuration applied, such as system and transport interfaces, sufficient for the router to have control connections to the controllers.

The device template uses a service VPN, which is described by the Cisco VPN feature template. Despite its name, this type of template is not related to IPsec VPN settings. The Cisco VPN feature template defines VRF settings and is a container for routing and participating interface information. In our example, the user-facing interface is assumed to be configured and associated with the VRF.

While it is possible to configure all child templates from within the device template, in our example, we will pre-configure child feature templates first and then select them in the device template.

Create and Configure Cisco VPN Interface IPsec Feature Template

The first two steps deal with the configuration of the IPsec feature template.

Figure 2. Configuration Map: Cisco VPN Interface IPsec Feature Template

Step 1. Create feature template

  • Select the Configuration section of the side menu
  • Click Templates
  • Click the Feature tab
  • Click the Add Template button
  • Select the device models that this feature template will be applied to
  • Select Cisco VPN Interface IPsec
Figure 3. Create new feature template in vManage

Step 2. Configure feature template

Customize the IPsec tunnel parameters. There are 5 sections in the IPsec template:

  • Basic configuration, such as name and IP address of the tunnel interface and its underlying source (local router) and destination (remote router)
  • Dead-peer detection settings
  • IKE or Phase 1 parameters
  • IPSEC or Phase 2 parameters
  • Advanced Settings

SD-WAN requires an IP-numbered interface (/30) and supports route-based tunnels, known as VTI (Virtual Tunnel Interface) in Cisco IOS documentation.

Unlike policy-based tunnels, which use an ACL to define the interesting traffic, route-based tunnels use static or dynamic routing over a tunnel interface.
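
To make the distinction concrete, here is a rough sketch of both styles in classic IOS CLI. It is for illustration only and is not what vManage generates. The crypto map CMAP, the ACL name VPN-TRAFFIC, interface GigabitEthernet1, and the local tunnel address 169.254.23.1 are assumptions; the subnets, peer address, transform set, and IPsec profile names are borrowed from the lab example in this article and are assumed to be defined already.

```
! Policy-based (classic IOS only): an ACL defines the interesting traffic
ip access-list extended VPN-TRAFFIC
 permit ip 192.168.22.0 0.0.0.255 192.168.15.0 0.0.0.255
crypto map CMAP 10 ipsec-isakmp
 set peer 5.5.5.10
 set transform-set TRANSFORM-SET
 match address VPN-TRAFFIC
interface GigabitEthernet1
 crypto map CMAP
!
! Route-based (VTI): whatever is routed to the tunnel interface is encrypted
interface Tunnel0
 ip address 169.254.23.1 255.255.255.252
 tunnel source GigabitEthernet1
 tunnel mode ipsec ipv4
 tunnel destination 5.5.5.10
 tunnel protection ipsec profile IPSEC-PROFILE-1
ip route 192.168.15.0 255.255.255.0 Tunnel0
```

In the route-based variant, adding another remote subnet only requires another route, with no change to the crypto configuration.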

Figure 4. Configure feature template in vManage

As figure 4 shows, there are various options available for both IKE and IPSEC security parameters. These need to match between tunnel endpoints.

Adjust device template to use IPsec Feature Template

Figure 5. Configuration Map: Device Template reference to IPsec Interface Feature Template

Step 3. Add feature template to the device template.

The IPsec interface template can now be attached to a service VPN. Figure 6 shows how to modify the existing device template.

Figure 6. Add IPsec template to service-side VPN.

Routing Configuration

Figure 7. Configuration Map: Static Routes over IPsec interface

Step 4. Set up routing over the tunnel. Routing can be static or based on a dynamic routing protocol. The screenshot below shows the static route configuration.

Figure 8. Configure routing over IPsec tunnels

Step 5. Test the tunnel.

As the tunnels are VTI-based and have Layer 3 addresses on both sides, the simplest test is to ping the remote side of the tunnel.

The vManage web interface provides only limited information about these tunnels via real-time monitoring. Native SD-WAN tunnels are also IPsec-based, but their authentication and key management are centralized and handled through OMP rather than the IKE/ISAKMP protocols used for non-SD-WAN tunnels. For this article, the relevant real-time device options are the ones that contain the string IKE in their name.

Figure 9. Validate IKE tunnel status

Over an SSH connection to the router, the following commands can be used to check the operational details of the tunnel:

  • show crypto isakmp sa / show crypto ikev2 sa
  • show crypto ipsec sa

We will demonstrate the output of these commands in the practical example below.

Cisco SD-WAN IPSec Tunnels Example

Now it’s time for a practical example. We will establish an IPsec tunnel to a Cisco IOS-XE router configured to match the VPN gateway settings used in public clouds. For example, AWS provides sample configuration files for different platforms (see this URL). We will apply the configuration from the Cisco IOS sample, on the assumption that if our router works with it, it will also work with a real AWS gateway. The configuration is slightly adjusted to use IKEv2 by replacing all isakmp commands with their IKEv2 equivalents.

Figure 10 Cisco SD-WAN IPSec Tunnel Lab Diagram

External router configuration

The non-SD-WAN router shown at the top of Figure 10 has the following configuration:

interface GigabitEthernet2
 ip address 5.5.5.10 255.255.255.0

ip route 0.0.0.0 0.0.0.0 5.5.5.1

crypto ikev2 keyring KEYRING-1
 peer 21.1.1.2
  address 21.1.1.2
  pre-shared-key cisco

crypto ikev2 proposal IKE-PROPOSAL-1 
 encryption aes-cbc-256
 integrity sha1
 group 16

crypto ikev2 policy IKE-POLICY-1 
 match address local 5.5.5.10
 proposal IKE-PROPOSAL-1

crypto ikev2 profile IKE-PROFILE-1
 match address local interface GigabitEthernet2
 match identity remote address 21.1.1.2 255.255.255.255 
 authentication remote pre-share
 authentication local pre-share
 keyring local KEYRING-1

crypto ipsec transform-set TRANSFORM-SET esp-256-aes esp-sha-hmac 
 mode tunnel

crypto ipsec profile IPSEC-PROFILE-1
 set security-association lifetime kilobytes 102400000
 set transform-set TRANSFORM-SET
 set ikev2-profile IKE-PROFILE-1

interface Tunnel0
 ip address 169.254.23.2 255.255.255.252
 ip tcp adjust-mss 1400
 tunnel source GigabitEthernet2
 tunnel mode ipsec ipv4
 tunnel destination 21.1.1.2
 tunnel protection ipsec profile IPSEC-PROFILE-1

ip route 192.168.22.0 255.255.255.0 Tunnel0

SD-WAN configuration

We followed the same steps described in the first part of the article to configure vManage. To make it easier to follow, the majority of parameters are hardcoded into the template. In a real deployment, per-device variables can be used to allow for template re-use.

Figure 11 Cisco SD-WAN IPSec Feature Template Configuration

Then the feature template was added to the device template under the VPN 1 section (see Figure 6 above), and a route to 192.168.15.0/24 was added to the VPN 1 feature template (see Figure 8).
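
Once the templates are pushed, the result can be checked from the router CLI. The commands below are standard IOS-XE show commands; the VRF name 1 and the prefix come from this example. No output is shown because it depends on the deployment.

```
! Confirm that the static route is installed in the service VRF
CSR01#show ip route vrf 1 192.168.15.0
! Confirm that the IPsec tunnel interface exists and is up
CSR01#show ip interface brief | include Tunnel
```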

Testing and Validation

Let’s assume that we have access only to the SD-WAN router, and testing will be done only from one side of the connection. We will use the router’s command-line interface via SSH from the vManage web console, as it gives access to information not available via the web interface.

The first test that we perform is checking if the remote side of the tunnel is reachable. The “vrf 1” parameter makes sure that the router uses the correct interface.

CSR01#ping vrf 1 169.254.23.2
 Type escape sequence to abort.
 Sending 5, 100-byte ICMP Echos to 169.254.23.2, timeout is 2 seconds:
 !!!!!
 Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/1 ms
 CSR01#

If ping responses are not received, we can run “show crypto ikev2 sa” and “show crypto ipsec sa” commands. The first command displays if the IKEv2 security association is established, which is a prerequisite for IPSEC security associations. The troubleshooting should start here. If IKEv1 is used, the command is “show crypto isakmp sa.”

CSR01#show crypto ikev2 sa
  IPv4 Crypto IKEv2  SA 
 Tunnel-id Local                 Remote                fvrf/ivrf            Status 
 1         21.1.1.2/500          5.5.5.10/500          none/1               READY  
       Encr: AES-CBC, keysize: 256, PRF: SHA1, Hash: SHA96, DH Grp:16, Auth sign: PSK, Auth verify: PSK
       Life/Active Time: 86400/545 sec
 IPv6 Crypto IKEv2  SA 

In our example, we pinged the remote tunnel interface address, which is on a subnet directly connected to both routers. This test may succeed while connectivity between devices behind the tunnel gateways still fails.

In this case, we can use the “show crypto ipsec sa” command. It displays a set of counters for the number of encrypted and decrypted packets. 

If the encrypted packet count is not increasing, that usually suggests a local routing problem where traffic is not being sent out of the tunnel interface.

If the encrypted packet count increases but the decrypted count doesn’t, the remote router may have a routing misconfiguration.

CSR01#show crypto ipsec sa
 interface: Tunnel100001
     Crypto map tag: Tunnel100001-head-0, local addr 21.1.1.2
 protected vrf: 1
    local  ident (addr/mask/prot/port): (0.0.0.0/0.0.0.0/0/0)
    remote ident (addr/mask/prot/port): (0.0.0.0/0.0.0.0/0/0)
    current_peer 5.5.5.10 port 500
      PERMIT, flags={origin_is_acl,}
     #pkts encaps: 5, #pkts encrypt: 5, #pkts digest: 5
     #pkts decaps: 5, #pkts decrypt: 5, #pkts verify: 5
     #pkts compressed: 0, #pkts decompressed: 0
     #pkts not compressed: 0, #pkts compr. failed: 0
     #pkts not decompressed: 0, #pkts decompress failed: 0
     #send errors 0, #recv errors 0
 <output truncated>
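
To focus on the counters discussed above, the output can be filtered to a single peer and to the encaps/decaps lines. This is a small sketch using standard IOS-XE output modifiers; the peer address is the one from our lab.

```
! Show only the encapsulation and decapsulation counters for one peer
CSR01#show crypto ipsec sa peer 5.5.5.10 | include encaps|decaps
```

Running the command twice while test traffic is flowing makes it easy to see which of the two counters is not moving.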

Some useful debug commands are available, such as “debug crypto ikev2”. It can generate extensive output on a router with multiple tunnels, so be careful not to overload a production router. In the example below, we changed the key on the other side of the tunnel to break it. An “Auth exchange failed” message is logged, suggesting mismatched keys, and “show crypto ikev2 sa” does not display any tunnels.

CSR01#debug crypto ikev2
 Payload contents: 
  VID IDi AUTH SA TSi TSr NOTIFY(INITIAL_CONTACT) NOTIFY(SET_WINDOW_SIZE) NOTIFY(ESP_TFC_NO_SUPPORT) NOTIFY(NON_FIRST_FRAGS) 
 *Jul 10 00:46:36.630: IKEv2:(SESSION ID = 2,SA ID = 1):Sending Packet [To 5.5.5.10:500/From 21.1.1.2:500/VRF i0:f0] 
 Initiator SPI : EE7E2D729412F370 - Responder SPI : B37D8CA8BAB8C150 Message id: 1
 IKEv2 IKE_AUTH Exchange REQUEST 
 Payload contents: 
  ENCR 
 *Jul 10 00:46:36.633: IKEv2:(SESSION ID = 2,SA ID = 1):Received Packet [From 5.5.5.10:500/To 21.1.1.2:500/VRF i0:f0] 
 Initiator SPI : EE7E2D729412F370 - Responder SPI : B37D8CA8BAB8C150 Message id: 1
 IKEv2 IKE_AUTH Exchange RESPONSE 
 Payload contents: 
  NOTIFY(AUTHENTICATION_FAILED) 
 *Jul 10 00:46:36.633: IKEv2:(SESSION ID = 2,SA ID = 1):Process auth response notify
 *Jul 10 00:46:36.633: IKEv2-ERROR:(SESSION ID = 2,SA ID = 1):
 *Jul 10 00:46:36.633: IKEv2:(SESSION ID = 2,SA ID = 1):Auth exchange failed
 *Jul 10 00:46:36.633: IKEv2-ERROR:(SESSION ID = 2,SA ID = 1):: Auth exchange failed
 *Jul 10 00:46:36.633: IKEv2:(SESSION ID = 2,SA ID = 1):Abort exchange
 *Jul 10 00:46:36.633: IKEv2:(SESSION ID = 2,SA ID = 1):Deleting SA
 CSR01# show crypto ikev2 sa
 CSR01#
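
On a router with many tunnels, the debug output can usually be narrowed to one peer with crypto conditional debugging. The sketch below assumes the peer address from our lab. Whether the IKEv2 debugs honor the condition can depend on the software release, so treat this as something to verify in a lab first.

```
! Limit crypto debugs to a single remote peer
CSR01#debug crypto condition peer ipv4 5.5.5.10
CSR01#debug crypto ikev2
! Reproduce the problem, then turn debugging off and clear the condition
CSR01#undebug all
CSR01#debug crypto condition reset
```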

Finally, we can perform an end-to-end test from the test machines using the ping and tracert commands.

Figure 12 End-to-end testing

Cisco IP SLA IOS-XE

Cisco IP Service Level Agreements (SLAs) is a proprietary feature available on Cisco routers and switches, which actively generates monitoring traffic, processes replies, and measures network performance.

This feature can be used to perform continuous end-to-end connectivity testing with automated re-routing over failover links. It can also simulate different application behavior, such as voice and video to check if the network provides the expected level of service.

The examples and features described in this article are based on Cisco IOS-XE version 16.9.

Configuration Components

IP SLA configuration starts with defining an SLA operation and then scheduling it to run immediately or at a specific time.

SLA Operation Definition

To create or edit an SLA operation, use the “ip sla <operation-id>” global configuration mode command. It places the CLI into the IP SLA configuration sub-mode, where you can select one of the IP SLA types and provide the corresponding configuration parameters.

One of the simplest types of IP SLA operations is icmp-echo. The router pings a specified IP address and records round-trip times if the remote side is reachable. In the example below, we will set the type of SLA entry as ICMP echo with the destination’s IP address of 10.0.3.1. The sample topology is shown in Figure 1.

Figure 1. IP SLA Test Topology

The configuration mode changes to IP SLA echo where you can adjust different optional parameters, such as request sending frequency and timeout for replies.

X(config)#ip sla 1 
X(config-ip-sla)# icmp-echo 10.0.3.1 source-ip 10.0.0.1
X(config-ip-sla-echo)#?
IP SLAs Icmp Echo Configuration Commands:
  data-pattern       Data Pattern
  default            Set a command to its defaults
  exit               Exit operation configuration
  frequency          Frequency of an operation
  history            History and Distribution Data
  no                 Negate a command or set its defaults
  owner              Owner of Entry
  request-data-size  Request data size
  tag                User defined tag
  threshold          Operation threshold in milliseconds
  timeout            Timeout of an operation
  tos                Type Of Service
  verify-data        Verify data
  vrf                Configure IP SLAs for a VRF
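
As an illustration, the fragment below sets three of the optional parameters listed above. The values are arbitrary assumptions chosen for the example. Cisco's guidance is to keep the threshold no greater than the timeout, and the timeout (in milliseconds) shorter than the frequency (in seconds).

```
ip sla 1
 icmp-echo 10.0.3.1 source-ip 10.0.0.1
 ! Run the probe every 30 seconds
 frequency 30
 ! Wait up to 2000 ms for a reply
 timeout 2000
 ! Report replies slower than 500 ms as over threshold
 threshold 500
```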

Scheduling IP SLA

Once the SLA operation is defined and optional parameters are specified, start it by running the “ip sla schedule <id>” command. An IP SLA operation can be configured to start at a specific time or as soon as the command is entered; the latter is shown in the example below.

X(config)#ip sla schedule 1 start-time now life forever
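
For comparison, here are two other ways the same operation could be scheduled. These are alternatives to the command above, not additions to it, and the times and lifetimes are assumptions chosen for the example.

```
! Alternative 1: start five minutes from now and run for one hour
ip sla schedule 1 life 3600 start-time after 00:05:00
! Alternative 2: start every day at 08:00 and run for one hour each time
ip sla schedule 1 life 3600 start-time 08:00:00 recurring
```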

To check an SLA operation’s details, use the “show ip sla statistics <id> details” command.

X#show ip sla statistics 1 details 
IPSLAs Latest Operation Statistics

IPSLA operation id: 1
        Latest RTT: 1 milliseconds
Latest operation start time: 10:08:18 UTC Sat Jan 30 2021
Latest operation return code: OK
Over thresholds occurred: FALSE
Number of successes: 2
Number of failures: 0
Operation time to live: Forever
Operational state of entry: Active
Last time this entry was reset: Never

To see configuration details, including default values for various parameters, use the “show ip sla configuration” command.

X#show ip sla configuration  
IP SLAs Infrastructure Engine-III
Entry number: 1
Owner: 
Tag: 
Operation timeout (milliseconds): 5000
Type of operation to perform: icmp-echo
Target address/Source address: 10.0.3.1/10.0.0.1
Type Of Service parameter: 0x0
Request size (ARR data portion): 28
Data pattern: 0xABCDABCD
Verify data: No
Vrf Name: 
Schedule:
   Operation frequency (seconds): 60  (not considered if randomly scheduled)
   Next Scheduled Start Time: Start Time already passed
   Group Scheduled : FALSE
   Randomly Scheduled : FALSE
   Life (seconds): Forever
   Entry Ageout (seconds): never
   Recurring (Starting Everyday): FALSE
   Status of entry (SNMP RowStatus): Active
Threshold (milliseconds): 5000
Distribution Statistics:
   Number of statistic hours kept: 2
   Number of statistic distribution buckets kept: 1
   Statistic distribution interval (milliseconds): 20
Enhanced History:
History Statistics:
   Number of history Lives kept: 0
   Number of history Buckets kept: 15
   History Filter Type: None

Once an SLA operation is scheduled, it cannot be modified. Instead, delete the old operation and create a new one with the same ID.
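
A minimal sketch of that delete-and-recreate sequence for operation 1 is shown below. The new frequency value is an assumption used only to show a changed parameter.

```
! Remove the scheduled operation
no ip sla 1
! Re-create it under the same ID with the changed parameter
ip sla 1
 icmp-echo 10.0.3.1 source-ip 10.0.0.1
 frequency 30
ip sla schedule 1 start-time now life forever
```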

Different IP SLA Types

To view available types of SLA operations, use context-sensitive help as shown in the next example.

X(config-ip-sla)#?
IP SLAs entry configuration commands:
  dhcp         DHCP Operation
  dns          DNS Query Operation
  ethernet     Ethernet Operations
  exit         Exit Operation Configuration
  ftp          FTP Operation
  http         HTTP Operation
  icmp-echo    ICMP Echo Operation
  icmp-jitter  ICMP Jitter Operation
  mpls         MPLS Operation
  path-echo    Path Discovered ICMP Echo Operation
  path-jitter  Path Discovered ICMP Jitter Operation
  tcp-connect  TCP Connect Operation
  udp-echo     UDP Echo Operation
  udp-jitter   UDP Jitter Operation

The next subsections will review some of them.

Jitter operations

Jitter is the variation in delay between packets. The smaller the jitter, the better the performance of time-sensitive applications such as voice. Low jitter also means that the network delivers packets with a predictable delay and doesn’t experience congestion causing intermittent delays along the path.

There are 2 types of SLA operations performing Jitter measurements – ICMP and UDP-based.

ICMP Jitter

The ICMP Jitter SLA operation is based on the ICMP Timestamp Request and Timestamp Reply message types. The destination can be any device that supports these ICMP messages. Not all devices support them, and they can be blocked by firewalls. The previously shown ICMP Echo operation is based on the more commonly used Echo Request and Echo Reply message types.

The example below shows how to configure an ICMP jitter operation. Once launched, it reports round-trip time (RTT) and jitter statistics from the source to the destination and vice versa.

X(config)# ip sla 2 
X(config-ip-sla)# icmp-jitter 10.0.3.1 source-ip 10.0.0.1
X(config)# ip sla schedule 2 start-time now life forever

X#show ip sla statistics 2
IPSLAs Latest Operation Statistics

IPSLA operation id: 2
Type of operation: icmp-jitter
        Latest RTT: 1 milliseconds
Latest operation start time: 00:20:39 UTC Sun Jan 31 2021
Latest operation return code: OK
RTT Values:
        Number Of RTT: 10               RTT Min/Avg/Max: 1/1/1 milliseconds
Latency one-way time:
        Number of Latency one-way Samples: 0
        Source to Destination Latency one way Min/Avg/Max: 0/0/0 milliseconds
        Destination to Source Latency one way Min/Avg/Max: 0/0/0 milliseconds
Jitter Time:
        Number of SD Jitter Samples: 9
        Number of DS Jitter Samples: 9
        Source to Destination Jitter Min/Avg/Max: 0/1/1 milliseconds
        Destination to Source Jitter Min/Avg/Max: 0/1/1 milliseconds
Over Threshold:
        Number Of RTT Over Threshold: 0 (0%)
Packet Late Arrival: 0
Out Of Sequence: 0
        Source to Destination: 0        Destination to Source 0
        In both Directions: 0
Packet Skipped: 0       Packet Unprocessed: 0
Packet Loss: 0
        Loss Periods Number: 0
        Loss Period Length Min/Max: 0/0
        Inter Loss Period Length Min/Max: 0/0
Number of successes: 4
Number of failures: 0
Operation time to live: Forever

To measure jitter, a source router periodically sends a number of packets (10 by default). The time between these packets is called the interval (20 ms by default). The operation is repeated at a specified frequency (60 seconds by default).

The example below shows default configuration values – every 60 seconds the router will send 10 packets with 20 milliseconds interval between each packet.

X#show ip sla configuration 2
IP SLAs Infrastructure Engine-III
Entry number: 2
Owner: 
Tag: 
Operation timeout (milliseconds): 5000
Type of operation to perform: icmp-jitter
Target address/Source address: 10.0.3.1/10.0.0.1
Packet Interval (milliseconds)/Number of packets: 20/10
Type Of Service parameter: 0x0
Vrf Name: 
Schedule:
   Operation frequency (seconds): 60  (not considered if randomly scheduled)
<output is truncated>
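
The defaults can be overridden when the operation is created. The fragment below is a sketch with assumed values (operation ID 5, 20 packets, 40 ms interval, 30-second frequency); with these settings each burst lasts roughly 20 × 40 ms = 800 ms.

```
! 20 packets sent 40 ms apart, repeated every 30 seconds
ip sla 5
 icmp-jitter 10.0.3.1 interval 40 num-packets 20 source-ip 10.0.0.1
 frequency 30
ip sla schedule 5 start-time now life forever
```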

UDP Jitter and IP SLA responder

UDP is used in voice and video communications, so UDP-based measurements are better suited to simulating such applications. A codec type can be specified in the IP SLA operation configuration; it defines the packet size and enables various voice-specific metric calculations, such as Mean Opinion Score (MOS).

Figure 2. IP SLA Responder Configuration

To use UDP Jitter, the destination device must also be a Cisco device with the IP SLA responder feature enabled. The responder’s basic configuration is completed with a single command on router Z:

Z(config)#ip sla responder

Back on router X, the IP SLA configuration follows the example below.

X(config)#ip sla 3
X(config-ip-sla)#udp-jitter 10.0.3.1 2345 source-ip 10.0.0.1
X(config)#ip sla schedule 3 start-time now life forever
X#show ip sla statistics 
IPSLAs Latest Operation Statistics

IPSLA operation id: 3
Type of operation: udp-jitter
        Latest RTT: 1 milliseconds
Latest operation start time: 05:53:56 UTC Sun Jan 31 2021
Latest operation return code: OK
RTT Values:
        Number Of RTT: 10               RTT Min/Avg/Max: 1/1/1 milliseconds
Latency one-way time:
        Number of Latency one-way Samples: 0
        Source to Destination Latency one way Min/Avg/Max: 0/0/0 milliseconds
        Destination to Source Latency one way Min/Avg/Max: 0/0/0 milliseconds
Jitter Time:
        Number of SD Jitter Samples: 9
        Number of DS Jitter Samples: 9
        Source to Destination Jitter Min/Avg/Max: 0/1/1 milliseconds
        Destination to Source Jitter Min/Avg/Max: 0/1/1 milliseconds
Over Threshold:
        Number Of RTT Over Threshold: 0 (0%)
Packet Loss Values:
        Loss Source to Destination: 0
        Source to Destination Loss Periods Number: 0
        Source to Destination Loss Period Length Min/Max: 0/0
        Source to Destination Inter Loss Period Length Min/Max: 0/0
        Loss Destination to Source: 0
        Destination to Source Loss Periods Number: 0
        Destination to Source Loss Period Length Min/Max: 0/0
        Destination to Source Inter Loss Period Length Min/Max: 0/0
        Out Of Sequence: 0      Tail Drop: 0
        Packet Late Arrival: 0  Packet Skipped: 0
Voice Score Values:
        Calculated Planning Impairment Factor (ICPIF): 0
        Mean Opinion Score (MOS): 0
Number of successes: 1
Number of failures: 0
Operation time to live: Forever
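
In the output above, the voice score values are 0, most likely because no codec was specified for operation 3. As a sketch, a codec-based variant could look like the fragment below. The operation ID 6, the port 16384, and the choice of G.711 u-law are assumptions made for the example.

```
! Simulate a G.711 u-law voice stream towards the responder on router Z
ip sla 6
 udp-jitter 10.0.3.1 16384 codec g711ulaw source-ip 10.0.0.1
ip sla schedule 6 start-time now life forever
```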

Path operations

When measuring latency between two hosts it is useful to know how much delay is contributed by each hop along the path. Path-type SLAs work by running traceroute first and then performing echo or jitter operation against each discovered hop.

X(config)#ip sla 4
X(config-ip-sla)#path-echo 10.0.3.1 source-ip 10.0.0.1
X(config)#ip sla schedule 4 start-time now life forever

The configuration is based on the same network topology with 2 paths available to the destination.

Figure 3. IP SLA Path Operations

The “show ip sla statistics aggregated 4 details” command displays round-trip time statistics for each hop. For example, the output below shows statistics for the first hop in the path going via router B; statistics for the remaining hops are omitted.

X#show ip sla statistics aggregated 4 details 
IPSLAs aggregated statistics

Distribution Statistics:
Bucket Range: 0 to < 20 ms
Avg. Latency: 1 ms
Percent of Total Completions for this Range: 100 %
Number of Completions/Sum of Latency: 1/1 
Sum of RTT squared low 32 Bits/Sum of RTT squared high 32 Bits: 1/0
Operations completed over threshold: 0
Start Time Index: *09:54:47.234 UTC Tue Feb 2 2021
Path Index: 3
Hop in Path Index: 1
Type of operation: path-echo
Number of successes: 18
Number of failures: 0
Number of over thresholds: 0
Failed Operations due to Disconnect/TimeOut/Busy/No Connection: 0/0/0/0
Failed Operations due to Internal/Sequence/Verify Error: 0/0/0
Failed Operations due to Control enable/Stats retrieve Error: 0/0
Target Address 172.16.100.6
<output is truncated>

Other operations

IP SLA can perform application-specific monitoring. For example, it can run HTTP or FTP operations against a remote server. For generic TCP applications, the TCP Connect operation can be used; it checks whether a TCP connection can be established and how long it takes to connect to the server.

The other available operation types are DHCP (checks how long it takes to obtain an IP address) and DNS queries.
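
The fragment below sketches three of these operation types. The operation IDs, the name server 10.0.0.53, the hostname www.example.com, and the use of router Z's address as a web server are all assumptions made for illustration.

```
! TCP connect time to a server that is not an IP SLA responder
ip sla 10
 tcp-connect 10.0.3.1 443 control disable
! DNS lookup time against a specific name server
ip sla 11
 dns www.example.com name-server 10.0.0.53
! HTTP GET round-trip time
ip sla 12
 http get http://10.0.3.1/
ip sla schedule 10 start-time now life forever
ip sla schedule 11 start-time now life forever
ip sla schedule 12 start-time now life forever
```

The “control disable” keyword tells the router not to expect an IP SLA responder on the far end, which is the usual case when the target is an ordinary server.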

Reactions and Proactive Monitoring

In the previous sections, we’ve reviewed different types of SLA operations and how to check their status using the CLI. By using SNMP-based monitoring software, administrators can poll SLA data from routers periodically, which is a more scalable approach to monitoring. Cisco provides an SNMP MIB called RTTMON (CISCO-RTTMON-MIB) that gives monitoring software access to IP SLA data.

It is also possible for a router to send an SNMP trap or Syslog message when a certain event happens. IP SLA includes a feature called proactive threshold monitoring. To enable it, an IP SLA reaction must be configured.

The SLA reaction configuration command references an IP SLA operation and has a mandatory parameter specifying which element of the operation to monitor. For example, this can be a timeout or a threshold-based value. The available monitored elements vary with the type of IP SLA operation. A compatibility table is available on this page: https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/ipsla/configuration/xe-16-9/sla-xe-16-9-book/sla-threshold-mon.html

The example below uses a previously configured IP SLA operation of icmp-echo type.

ip sla 1
 icmp-echo 10.0.3.1 source-ip 10.0.0.1
ip sla schedule 1 life forever start-time now

The reaction configuration below references SLA operation 1 and selects timeout as the monitored element. Then we enable a simple SNMP configuration to send a trap when the operation times out.

X(config)#ip sla reaction-configuration 1 react timeout action-type trapOnly threshold-type immediate
X(config)#snmp-server enable traps ipsla
X(config)#snmp-server host 10.0.0.5 PUBLIC

To test it, I’ve turned off the interface on router Z and enabled debug of SNMP packets to confirm that SNMP traps were generated by the router.

X#debug snmp packets
*Feb  4 09:41:49.232: SNMP: Queuing packet to 10.0.0.5
*Feb  4 09:41:49.232: SNMP: V1 Trap, ent rttMonNotificationsPrefix, addr 10.0.0.1, gentrap 6, spectrap 2 
 rttMonCtrlAdminTag.1 =  
 rttMonHistoryCollectionAddress.1 = 0A 00  03 01    
 rttMonCtrlOperTimeoutOccurred.1 = 2
<output is truncated>
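
A threshold-based reaction follows the same pattern. The sketch below reacts to round-trip time instead of timeouts; the upper and lower thresholds (300 ms and 200 ms) and the count of 3 consecutive violations are assumed values. The second command additionally asks the router to generate Syslog messages for IP SLA notifications.

```
! Trap when RTT exceeds 300 ms for 3 consecutive probes; reset below 200 ms
ip sla reaction-configuration 1 react rtt threshold-value 300 200 threshold-type consecutive 3 action-type trapOnly
! Also log IP SLA notifications to Syslog
ip sla logging traps
```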

Enhanced Object Tracking

The final section of this document addresses how a router can change its data plane operation in response to IP SLA operational status.

The feature responsible for translating the status of IP SLA operations to the data plane protocols is called Enhanced Object Tracking. First-Hop Redundancy Protocols, such as HSRP and VRRP, use tracking objects to take over or release data forwarding roles. Static and policy-based routing can also use tracking objects to re-route around degraded paths.
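
As a sketch of the FHRP case, the HSRP fragment below lowers the router's priority when a tracked object goes down, so that a peer with preemption enabled can take over. It assumes that track object 1 already exists; the interface, group number, virtual IP address, and priority values are hypothetical.

```
interface GigabitEthernet2
 standby 1 ip 10.0.0.254
 standby 1 priority 110
 standby 1 preempt
 ! Drop the priority by 20 (to 90) while track object 1 is Down
 standby 1 track 1 decrement 20
```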

EEM (Embedded Event Manager) scripts can monitor tracking object’s status changes. This makes it possible to run powerful EEM scripting in response to track status change.

A Track object can have 2 states – Up and Down. IP SLA operations can have several return codes. For the purpose of object tracking, we are interested in the 3 return codes discussed below. All other codes that are not covered in this article translate into the Down state of the Track object.

Let’s re-use the IP SLA ICMP Echo operation from the previous example and check the code of the operation. The second command in the listing shows how to check the default threshold (5000ms):

X#show ip sla statistics 1 | incl code           
Latest operation return code: OK
X#show ip sla configuration 1 | include Threshold
Threshold (milliseconds): 5000

The operation’s return code is OK, which means that the ping operation is successful and its round-trip time is within the threshold (in our network it is 1ms). OK status always translates to the UP state of the corresponding track object.

If we turn off the remote interface on router Z, the return code will change to Timeout, as per the listing below. This is the second SLA operation code, which always translates into the Down state of the Track object.

X#show ip sla statistics 1 | incl code
Latest operation return code: Timeout

The third code that is relevant for the purpose of track operation is OverThreshold. To demonstrate it, we increased the request data size to 5000 bytes, so the operation takes longer than 1ms. We also decreased the threshold to 1ms, so the operation goes over the threshold.

The OverThreshold code translates differently to the track object state, which we will discuss in the next section.

X(config)#ip sla 2
X(config-ip-sla)#icmp-echo 10.0.3.1 source-ip 10.0.0.1
X(config-ip-sla-echo)# request-data-size 5000
X(config-ip-sla-echo)# threshold 1
X(config)#ip sla schedule 2 start-time now life forever

X#show ip sla statistics 2               
IPSLAs Latest Operation Statistics

IPSLA operation id: 2
        Latest RTT: 2 milliseconds
Latest operation start time: 09:44:48 UTC Sat Feb 6 2021
Latest operation return code: Over threshold
Number of successes: 2
Number of failures: 0
Operation time to live: Forever

IP SLA Track State vs. Reachability

The “track <id> ip sla <sla-id>” command creates a track object that references an SLA operation. There are 2 types of track objects for SLA – reachability and state. Let’s configure both types against the previously configured SLA 2 to see the difference.

X(config)#track 1 ip sla 2 reachability
X(config)#track 2 ip sla 2 state

X#show track 
Track 1
  IP SLA 2 reachability
  Reachability is Up
    1 change, last change 00:00:29
  Latest operation return code: Over threshold
  Latest RTT (millisecs) 2
Track 2
  IP SLA 2 state
  State is Down
    1 change, last change 00:00:17
  Latest operation return code: Over threshold
  Latest RTT (millisecs) 2

Track 1 is based on reachability and it is UP, even though the underlying IP SLA operation is over the threshold. As the name suggests, we are only interested in whether we can reach the remote side of the connection. Track 2 is state-based and it returns DOWN for the same IP SLA operation. A track of the state type checks both reachability and that the IP SLA operation is under the specified threshold.
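
Because a single slow or lost probe can flip a track object, it is common to dampen state changes with a delay. The sketch below adds one to Track 1; the 10-second and 30-second values are assumptions.

```
track 1 ip sla 2 reachability
 ! Wait 10 s before reporting Down and 30 s before reporting Up
 delay down 10 up 30
```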

Static Routes

The next listing demonstrates how to use the track objects with static routes:

X(config)#ip route 0.0.0.0 0.0.0.0 172.16.100.2 track 1 200
X(config)#ip route 0.0.0.0 0.0.0.0 172.16.100.6 track 2

X#sh ip route 0.0.0.0
Routing entry for 0.0.0.0/0, supernet
  Known via "static", distance 200, metric 0, candidate default path
  Routing Descriptor Blocks:
  * 172.16.100.2
      Route metric is 0, traffic share count is 1

The route via 172.16.100.6 is preferred, as it has the default administrative distance of 1. The route via 172.16.100.2 is a backup route with an administrative distance of 200. However, the routing table shows that the router is using the route via 172.16.100.2.

For a route to be installed into the routing table, the corresponding track must be UP. Because track object #2 is Down, the static route via 172.16.100.6 is withdrawn, and the backup route is chosen and installed into the routing table.

EEM Scripts with IP SLA

The final example of this article is an EEM (Embedded Event Manager) script that adjusts the router configuration and generates a Syslog message in response to a Track state change.

The script is triggered when Track object #1 goes down, with the “event track <object-id> state down” command.

The access-list is being adjusted by the script in this example only for demonstration purposes. It can be adjusted to suit real-life requirements. For example, the script can reconfigure VPN tunnel endpoints or enable a backup interface. EEM can perform many other non-CLI actions as well.

X(config)#event manager applet TRACK-1-DOWN 
X(config-applet)#event track 1 state down
X(config-applet)#action 1 cli command "enable"
X(config-applet)#action 2 cli command "conf t"
X(config-applet)#action 3 cli command "ip access-list extended INTERNET-IN"
X(config-applet)#action 4 cli command "permit ip host 1.2.3.4 any"         
X(config-applet)#action 5 syslog msg "TEST Message"

X#show run | section ^ip access-list
<none>

To test the script, I’ve turned off the remote interface on router Z, causing the IP SLA and track to go down. The router generates a test Syslog message and creates an access-list, as shown in the listing below. The “terminal monitor” command displays console messages on VTY lines, such as SSH or Telnet sessions.

X#terminal monitor
*Feb  6 10:13:50.285: %TRACK-6-STATE: 1 ip sla 2 reachability Up -> Down

*Feb  6 10:13:50.530: %HA_EM-6-LOG: TRACK-1-DOWN: TEST Message

X#sh run | section ^ip access-list
ip access-list extended INTERNET-IN
 permit ip host 1.2.3.4 any

Cisco Routers Performance

In this blog post I will summarize available information on Cisco ISR and ASR performance. The following platforms will be covered: ISR G2, ISR 1100, ISR 4000, ASR 1000.

Update: check my new article on SD-WAN routers and platforms here.

ISR G2

Let’s start with ISR G2 performance numbers. ISR G2s are legacy products running Classic IOS; however, they are still around, and it is important to know how they perform in order to properly size newer replacement routers.

Important: These are not real-world numbers. Please read further.

Model | Packets Per Second | Megabits Per Second
Cisco 860 | 25,000 | 197
Cisco 880 | 50,000 | 198
Cisco 890 | 100,000 | 1,400
Cisco 1921 | 290,000 | 2,770
Cisco 1941 | 330,000 | 2,932
Cisco 2901 | 330,000 | 3,114
Cisco 2911 | 352,000 | 3,371
Cisco 2921 | 479,000 | 3,502
Cisco 2951 | 579,000 | 5,136
Cisco 3925 | 833,000 | 6,903
Cisco 3925E | 1,845,000 | 6,703
Cisco 3945 | 982,000 | 8,025
Cisco 3945E | 2,924,000 | 8,675

Table 1. Cisco ISR G2 RFC 2544 Performance

The second column displays the number of packets per second that the platform can forward at maximum CPU utilization, just before it starts to drop packets. It takes a router’s CPU the same amount of effort to route a 64-byte packet as a 1500-byte one, so packets per second is usually the more reliable metric because it removes packet size from the equation.

The third column displays throughput in megabits per second (i.e. packet size in bytes x 8 x packets per second, divided by 1,000,000). As the results can differ by more than 20 times depending on the packet size selected, the specification must state the average packet size that was used during the test.

What is IMIX? Real traffic doesn’t consist of packets of the same size, so many tests use a mix of packet sizes called Internet Mix (IMIX). For example, in a simple IMIX sample, out of every 12 packets transmitted, 7 are 40 bytes long, 4 are 576 bytes, and 1 is 1500 bytes. The average packet size in this case is about 340 bytes.
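The average can be checked with a few lines of Python. This is a minimal sketch of the arithmetic only. The names average_packet_size and SIMPLE_IMIX are made up for this example.

```python
from typing import Dict


def average_packet_size(mix: Dict[int, int]) -> float:
    """Weighted average packet size in bytes for a {size_in_bytes: packet_count} mix."""
    total_packets = sum(mix.values())
    total_bytes = sum(size * count for size, count in mix.items())
    return total_bytes / total_packets


# Simple IMIX from the text: 7 x 40 bytes, 4 x 576 bytes, 1 x 1500 bytes.
SIMPLE_IMIX = {40: 7, 576: 4, 1500: 1}
print(round(average_packet_size(SIMPLE_IMIX)))  # 340
```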

Values provided in Table 1 are based only on IP packet routing without any additional processing, such as QoS, encryption, or NAT, so they represent the maximum performance that a platform can deliver. Real-world numbers will be significantly smaller.

Another important thing to note is how a packet is counted. For example, it could be counted twice: once as it enters the ingress interface and once as it exits the egress one. Cisco counts this as a single packet, as seen by the forwarding engine. On the other hand, when selecting a router for a specific WAN interface, bandwidth utilization in each direction must be added together. For example, for a 10Mbps WAN link with an expected 9Mbps download and 3Mbps upload, the calculation should be based on 12Mbps of load.

For G2 platforms, Cisco’s recommended WAN-link-based sizing is shown in the table below. Values are much smaller compared to plain IP forwarding. It is also expected that the router will be neither running at 99% CPU nor dropping packets.

Platform | WAN Link (Mbps)
860 | 4
880 | 8
890 | 15
1921 | 15
1941 | 25
2901 | 25
2911 | 35
2921 | 50
2951 | 75
3925 | 100
3945 | 150
3925E | 250
3945E | 350

Table 2. ISR G2 Recommended Sizing Based on WAN Link Speed
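The sizing rule can be sketched in Python by combining the "add both directions" calculation with the values from Table 2. This is only an illustration: it assumes the Table 2 values are in Mbps, and the names G2_WAN_SIZING, wan_load_mbps and smallest_platform are hypothetical.

```python
from typing import Optional

# Table 2 values as (platform, recommended WAN link), smallest first.
# Assumption: the WAN link values are in Mbps.
G2_WAN_SIZING = [
    ("860", 4), ("880", 8), ("890", 15), ("1921", 15), ("1941", 25),
    ("2901", 25), ("2911", 35), ("2921", 50), ("2951", 75),
    ("3925", 100), ("3945", 150), ("3925E", 250), ("3945E", 350),
]


def wan_load_mbps(download_mbps: float, upload_mbps: float) -> float:
    """Load used for sizing: utilization in both directions added together."""
    return download_mbps + upload_mbps


def smallest_platform(load_mbps: float) -> Optional[str]:
    """First platform in Table 2 whose recommended WAN link covers the load."""
    for platform, limit in G2_WAN_SIZING:
        if load_mbps <= limit:
            return platform
    return None  # the load exceeds every ISR G2 recommendation


load = wan_load_mbps(9, 3)            # the 10Mbps WAN example from the text
print(load, smallest_platform(load))  # 12 890
```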

ISR 4000

ISR 4000s are running IOS-XE and have introduced performance-based licensing with 3 tiers:

  • Default
  • Performance (x2-3 of default throughput level)
  • Boost (removes shaping completely)

Cisco publishes the following statistics for basic IP routing without services, measured with IMIX traffic (~330-byte packets).

Model | Default (Mbps) | Performance (Mbps @ CPU %) | Boost (Mbps @ CPU %) | Boost (pps @ CPU %) | Encryption (Mbps, AES 256)
4221 | 35 | 75 @ 8% CPU | 1,400 @ 94% CPU | 530,000 @ 94% CPU | 75
4321 | 50 | 100 @ 8% CPU | 2,000 @ 68% CPU* | 760,000 @ 68% CPU* | 100
4331 | 100 | 300 @ 16% CPU | 2,000 @ 53% CPU* | 760,000 @ 53% CPU* | 500
4351 | 200 | 400 @ 17% CPU | 2,000 @ 45% CPU* | 760,000 @ 45% CPU* | 500
4431 | 500 | 1,000 @ 18% CPU | 4,000 @ 62% CPU* | 1,520,000 @ 62% CPU* | 900
4451 | 1,000 | 2,000 @ 19% CPU | 4,000 @ 35% CPU* | 1,520,000 @ 35% CPU* | 1,600
4461 | 1,500 | 3,000 @ not published | 10,000+ @ not published | 3,790,000+ @ not published | 7,000

Table 3. ISR 4000 Performance (IP forwarding, IMIX 330 byte average packet size)

* The bottleneck was the physical interface speed, not the forwarding CPU.

As the routers are capable of forwarding significantly more traffic than the default and performance licenses allow, the numbers in Table 3 for these license tiers stay close to real-life throughput when services are added. It is safe to choose an ISR 4000 at the “factory default” or “performance” level, and in most cases a lower model with a “performance” license, if you plan to use multiple services.

The recently added boost license removes shaping completely. Table 3 displays PPS values for the ISR 4000 with this license; however, most of the routers didn’t reach high CPU utilization, as the bottleneck was the interface clock speed. The calculation is based on an IMIX packet size of 330 bytes.
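The relationship between the Mbps and pps columns can be reproduced with a short Python sketch. It assumes 1 Mbps = 1,000,000 bits per second and the 330-byte IMIX average. The function names are made up for this example.

```python
IMIX_BYTES = 330  # average IMIX packet size used for Table 3


def mbps_to_pps(mbps: float, packet_bytes: int = IMIX_BYTES) -> float:
    """Packets per second needed to fill the given throughput."""
    return mbps * 1_000_000 / (packet_bytes * 8)


def pps_to_mbps(pps: float, packet_bytes: int = IMIX_BYTES) -> float:
    """Throughput in Mbps produced by the given packet rate."""
    return pps * packet_bytes * 8 / 1_000_000


# Boost throughput values from Table 3.
for mbps in (1400, 2000, 4000):
    print(mbps, round(mbps_to_pps(mbps)))
```

The results (roughly 530,000, 758,000 and 1,515,000 pps) match the rounded pps values listed in Table 3.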

The data provided should be used only as an approximation, as there are many variables that can affect actual device performance, and performance will not scale linearly as CPU load increases.

ISR 1100

ISR 1100 is a new branch office platform running IOS-XE, positioned similarly to the Cisco 890 and 1921. Published performance numbers are listed in Table 4. IP forwarding performance of the ISR 1100 is comparable to the ISR 4221 with a boost license. Note that the ISR 1100 doesn’t support voice features.

Platform | RFC-2544 (Mbps, IMIX) | RFC-2544 (pps, IMIX) | Encryption (Mbps, AES 256, IMIX) | NAT (Mbps, IMIX) | ACL + NAT + HQoS (Mbps, IMIX)
C1100-4P | 1,252 | 475,000 | 230 | 660 | 330
C1100-8P | 1,750 | 660,000 | 335 | 960 | 510

Table 4. ISR 1100 Performance

ASR 1000

When you need more than the 10Gbps of throughput provided by the ISR 4461, the ASR 1000 is the platform of choice. All models in the ASR 1000 range have 2 dedicated hardware components: RP (Route Processor) and ESP (Embedded Service Processor). The RP is responsible for control-plane operations and the ESP for data forwarding.

Lower-end models, such as ASR1001-X and ASR1002-X, have the RP and ESP integrated into the chassis. The throughput of the system depends on the ESP, which runs Cisco-proprietary programmable ASICs called Quantum Flow Processors (QFP).

The performance of the 3 integrated models is shown in Table 5. These models require an incremental throughput license.

Model | ESP Bandwidth (Mbps) | Throughput (pps)
ASR1001-X | 20,000 | 19,000,000
ASR1002-X | 30,000 | 36,000,000
ASR1002-HX | 100,000 | 58,000,000

Table 5. ASR 1000 Performance (integrated ESP models)

Related Links

RFC-2544: Provides information on recommended way to perform testing

Portable Product Sheets Routing Performance – ISR G1, Legacy Platforms Performance

ISR 4000 Performance – 3rd Party Testing Report by Miercom

ASR 1000 FAQ 

Cisco Firewalls Performance

Cisco ACI Switch Models

Cisco SD-WAN Viptela

Overview

Cisco routers are among the most widely deployed WAN devices. Traditionally, they are managed individually, and in larger networks administrators require additional tools to monitor devices, back up configuration, and automate tasks.

Many newer Cisco technologies have some form of a central controller and managed data-plane devices. For example, ACI in the data center and SD-Access for the campus. In the WAN space, the Cisco portfolio included IWAN (Intelligent WAN) technology and cloud-managed products from the Meraki acquisition. In 2017 Cisco acquired Viptela and its SD-WAN product line. This post contains an overview of this technology and some basic terminology.

Traditional WAN design

To understand the benefits of SD-WAN, let’s consider how most Wide Area Networks are designed. Multiple branch offices connect via an MPLS network to one or two data centers, which also provide centralized Internet access. This access is secured by high-performance firewalls, intrusion protection, and web filtering platforms. Each branch or remote office has a single router or a pair of routers forwarding multiple types of traffic, such as:

  • Business applications (SAP, ERP)
  • Office 365 (Outlook, Sharepoint, etc)
  • Internet browsing
  • Video and IP telephony
  • Interactive applications, such as remote desktops

Management and Operational Issues

The device-centric approach has many challenges. For example, application performance troubleshooting requires an administrator to check every router in the path hop-by-hop, which takes a significant amount of time.

In many WAN environments, quality of service (QoS) configuration is static in nature, as a change in QoS design may take several maintenance windows to deploy across the network.

Wireless deployments went through a similar transformation from autonomous to controller-based, as many tasks require a coordinated approach to management. Radio Resource Management is one such task: channel and transmit power selection is very difficult to maintain manually on every access point.

WAN links are also relatively expensive. In many networks, standby WAN links are required for high availability. Establishing these links takes time and service providers may require a fixed-term commitment. In contrast, Internet links are affordable and have shorter lead times to provision.

With the traditional design described earlier, traffic going to workloads and applications in a data center has to compete with traffic to services reachable via the public Internet. It is cost-effective to offload Internet traffic to a branch-local Internet link.

This interface can also be used as a secondary WAN link connecting sites over VPN connections. However, it is difficult to manage multiple tunnels as the number of routers goes up while providing consistent user experience and ensuring that the security is not compromised.

SD-WAN Design Approach

SD-WAN addresses these issues. A centralized set of controller devices provides a level of abstraction, so network administrators can spend more time on creating policies and configuration templates without having to touch every device on the network.

The WAN is treated as a transport-agnostic fabric. The underlay network provides connectivity between tunnel endpoints and doesn’t need any knowledge of reachability information behind these gateways. As a result, overlay tunnels can be created dynamically, and the network can recognize application traffic and select the best path in real time.

Components and Architecture

SD-WAN operations comprise 4 planes, implemented by a set of controllers and gateways:

  • Management plane controller (vManage)
  • Orchestration plane controller (vBond)
  • Control plane controller (vSmart)
  • Data plane forwarding device (vEdge)

Controllers can be hosted and managed by Cisco as a subscription-based product or can be deployed on-premises. vManage, vBond, and vSmart are virtual machines available for download as OVA files. ESXi and KVM are the supported hypervisors.

Figure 1. SD-WAN Architecture

The first component to be configured in a new SD-WAN network is vManage, which can be deployed as a single appliance or as a cluster of at least 3 nodes. vManage implements the management plane and is the place where all configuration happens. It also performs fabric monitoring and can expose centralized API access to the SD-WAN network for external applications.

vBond is responsible for accepting registration and authenticating vSmart controllers and vEdges. Every device needs to be pointed to vBond during provisioning. It then ensures that all other elements are able to locate each other. vBond must have a public IP address and should be placed into a DMZ, so it can be accessed over the Internet.

vSmart controls all overlay routing and secure tunnel establishment between vEdges. The control protocol between vSmart and vEdge elements is called OMP (Overlay Management Protocol). It is protected by DTLS and carries not only reachability information, but also security association details for IPSec tunnels. vSmart also propagates policies to the edge devices.

vEdge devices are gateways performing data forwarding over overlay networks. These can be Viptela appliances (vEdge Routers) or Cisco devices running an SD-WAN image, such as the Cisco ISR 4000. There is also an option of software vEdge Cloud routers hosted in the public cloud (AWS or Azure).

Cisco is working on bringing routers with the SD-WAN image to feature parity with Viptela appliances, so always check the release notes, as a feature may not yet be supported on Cisco ISRs.

The next few sections explain the most important terms and concepts of SD-WAN, such as VPNs, TLOCs, and OMP.

Figure 2. SD-WAN Terminology

VPNs

Viptela SD-WAN uses the concept of a VPN as a way to segregate networks. Each VPN has its own interface allocation and a routing table isolated from other VPNs. It is similar to the Cisco VRF (Virtual Routing and Forwarding) instance. The VPN number is globally significant and must match for communication to happen. Encapsulated IP packets carry a VPN tag, so the egress gateway can determine which VPN a packet belongs to.

There are 513 VPNs, with the first and last reserved for fabric operations. VPN 0 is the transport VPN and is similar to the global VRF context. Interfaces in VPN 0 are called tunnel interfaces; they have IP addresses visible to transit networks and form the underlay of the fabric. Communication between the SD-WAN network controllers happens over VPN 0.

VPN 512 is used for the out-of-band management network.

All other VPNs (1-511) can be used to forward user data.

In Figure 2, VPN 100 and VPN 200 are created in the network. Subnets A, B, and E can communicate with each other within VPN 100. And subnets C and D can communicate with each other within VPN 200.
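The VPN numbering rules above can be captured in a small Python helper. This is only an illustrative sketch of the ranges described in this section. The function name vpn_role and the role labels are assumptions, not Viptela terminology.

```python
def vpn_role(vpn_id: int) -> str:
    """Classify a VPN number according to the ranges described in the text."""
    if vpn_id == 0:
        return "transport"   # underlay and controller communication
    if vpn_id == 512:
        return "management"  # out-of-band management
    if 1 <= vpn_id <= 511:
        return "service"     # user data
    raise ValueError(f"VPN {vpn_id} is outside the 0-512 range")


for vpn in (0, 100, 200, 512):
    print(vpn, vpn_role(vpn))
```

VPN 100 and VPN 200 from Figure 2 both fall into the service range.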

TLOCs (Transport LOCators)

One of the tasks of OMP is to distribute reachability information. Each destination can be reachable via a specific interface on one of the vEdges on the network. TLOC is a composite structure describing this interface and consists of:

  • System IP address of the vEdge, which identifies it in OMP
  • Color of the link
  • Encapsulation of the tunnel (IPSec or GRE)

TLOC is similar in concept to the next hop in BGP. Color is a pre-defined tag that describes the type of the WAN interface, for example mpls, 3g or biz-internet.
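As a data structure, a TLOC can be pictured as an immutable 3-tuple. The Python sketch below is a hypothetical model for illustration and is not taken from any Cisco software. The class name Tloc, the field names and the sample addresses are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tloc:
    """Hypothetical model of a TLOC: the three values together identify it."""
    system_ip: str      # system IP address of the device
    color: str          # pre-defined tag, e.g. "mpls", "3g", "biz-internet"
    encapsulation: str  # "ipsec" or "gre"


# The same router with two WAN links has two distinct TLOCs.
mpls = Tloc("10.0.0.1", "mpls", "ipsec")
internet = Tloc("10.0.0.1", "biz-internet", "ipsec")

print(mpls == internet)                         # False: the colors differ
print(mpls == Tloc("10.0.0.1", "mpls", "ipsec"))  # True: all three fields match
```

Because the dataclass is frozen, instances are hashable and can be used as dictionary keys, much like a next hop that routes point to.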

OMP (Overlay Management Protocol)

vSmart exchanges information with vEdges using OMP. This protocol covers all control-plane aspects required to transmit data on top of the overlays.

OMP is responsible for the exchange of 3 types of routes:

  • vRoutes. Reachability on the LAN side of the router. vEdge supports static routes as well as dynamic protocols (BGP and OSPF). Information about the source routing protocol and its metric is carried along with these routes. The VPN and Site ID are other important attributes present in vRoutes.
  • Service Routes. A way to perform service chaining and insert a firewall or a load balancer.
  • TLOC Routes. Carry information on how to reach a specific TLOC, such as the IP addresses of the interface.