Answering the question: IP SLA vs. HSRP (and other FHRPs)


There was a question posted in a Facebook CCNA group today regarding redundancy over different ISP links. Whilst the suggestions were good, most favoured HSRP, or one of the other FHRPs, over something like IP SLA.

So, first of all, the question, from Jeril Abraham is:

Hi all,
There are two static router to different ISP like this.
10.0.0.0 255.255.255.0 172.14.35.8-ISP1
11.0.0.0 255.255.255.0 172.35.26.8-ISP2
so my question is, The ISP one is went down so is it possible to divert the all traffic to ISP2 automatically?

thanks in adv


OK, so, we have two destinations, via different ISPs. Let's visualize this:
[Figure: IP SLA vs. HSRP, VRRP and GLBP]

We need to get from the Source router to the Destination router. Let's make a simple configuration, and start by confirming access:
Source(config)#int gi0/0
Source(config-if)#ip add 172.14.35.10 255.255.255.0
Source(config-if)#no shut
Source(config-if)#int gi0/1
Source(config-if)#ip add 172.35.26.10 255.255.255.0
Source(config-if)#no shut
Source(config-if)#exit
Source(config)#ip route 10.0.0.0 255.255.255.0 172.14.35.8
Source(config)#ip route 11.0.0.0 255.255.255.0 172.35.26.8
Source(config)#

ISP-1(config)#int gi 0/0
ISP-1(config-if)#ip add 172.14.35.8 255.255.255.0
ISP-1(config-if)#no shut
ISP-1(config-if)#int gi 
ISP-1(config-if)#ip add 1.1.1.1 255.255.255.0
ISP-1(config-if)#no shut
ISP-1(config-if)#
ISP-1(config-if)#ip route 0.0.0.0 0.0.0.0 1.1.1.2
ISP-1(config)#

ISP-2(config)#int gi0/0
ISP-2(config-if)#ip add 172.35.26.8 255.255.255.0
ISP-2(config-if)#no shut
ISP-2(config-if)#int gi 0/1
ISP-2(config-if)#ip add 1.1.2.1 255.255.255.0
ISP-2(config-if)#no shut
ISP-2(config-if)#ip route 0.0.0.0 0
ISP-2(config)#

Destination(config)#int gi0/0
Destination(config-if)#ip add 1.1.1.2 255.255.255.0
Destination(config-if)#no shut
Destination(config-if)#int gi 0/1
Destination(config-if)#ip add 1.1.2.2 255.255.255.0
Destination(config-if)#no shut
Destination(config-if)#int lo10
Destination(config-if)#ip add 10.0.0.1 255.255.255.0
Destination(config-if)#
Destination(config-if)#int lo11
Destination(config-if)#
Destination(config-if)#ip add 11.0.0.1 255.255.255.0
Destination(config-if)#
Destination(config-if)#ip route 172.14.35.0 255.255.255.0 1.1.1.1
Destination(config)#ip route 172.35.26.0 255.255.255.0 1.1.2.1
Destination(config)#
With this, we should be able to get from the Source to the Destination:
Source#ping 10.0.0.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 4/4/6 ms
Source#ping 11.0.0.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 11.0.0.1, timeout is 2 seconds:
.!!!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 4/4/5 ms
Source#
And we can. We can also see that the routes taken are the ones we would expect:
Source#trace 10.0.0.1 num
Type escape sequence to abort.
Tracing the route to 10.0.0.1
VRF info: (vrf in name/id, vrf out name/id)
  1 172.14.35.8 4 msec 4 msec 4 msec
  2 1.1.1.2 5 msec *  4 msec
Source#trace 11.0.0.1 num
Type escape sequence to abort.
Tracing the route to 11.0.0.1
VRF info: (vrf in name/id, vrf out name/id)
  1 172.35.26.8 3 msec 3 msec 3 msec
  2 1.1.2.2 5 msec *  5 msec
Source#
So, before we look at the solution, let's explain why HSRP (and, by extension, VRRP and GLBP, which together make up the FHRPs, or First Hop Redundancy Protocols) is not the right solution in this instance.

When is FHRP the wrong choice?

Look at the topology we have here. We connect from the source to two different ISPs. We only have one source, and will have no control over the ISP routers, so who would we create an HSRP group with? To use any FHRP solution, we would need another router within our own environment, something a little more like this:


We would have two routers, Source-1 and Source-2, and they would be joined in an HSRP group, providing the internal users and devices with a virtual IP address to connect to. In the event that one router went down, the other would take over.
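To make that concrete, here is a minimal sketch of what the HSRP side could look like on the two routers. The inside interface (Gi0/2), the 192.168.1.0/24 LAN and the virtual address 192.168.1.1 are assumptions made up for illustration; none of them come from the lab above.

```
! Source-1 (intended active router)
! Gi0/2 and the 192.168.1.0/24 addressing are hypothetical
interface GigabitEthernet0/2
 ip address 192.168.1.2 255.255.255.0
 ! Virtual IP that the internal hosts use as their default gateway
 standby 1 ip 192.168.1.1
 ! Higher than the default priority of 100, so this router is preferred
 standby 1 priority 110
 ! Take the active role back once this router recovers
 standby 1 preempt
!
! Source-2 (standby router, default priority of 100)
interface GigabitEthernet0/2
 ip address 192.168.1.3 255.255.255.0
 standby 1 ip 192.168.1.1
 standby 1 preempt
```

Note that this protects the hosts' default gateway, not the path beyond it, which is why it does not answer the original question on its own.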

Back to our conundrum. How would we provide the redundancy we need? The answer here would be floating static routes, redistribution from the ISP, or IP SLA.

Floating static routes

Floating static routes would be the easiest option but, on their own, only have limited success. We supply the initial routes (as we have already done above), then add a second route via a different next-hop, with a higher administrative distance (50 here, against the default of 1 for a static route), such as:
Source(config)#do sh run | i ip route
ip route 10.0.0.0 255.255.255.0 172.14.35.8
ip route 11.0.0.0 255.255.255.0 172.35.26.8
Source(config)#ip route 10.0.0.0 255.255.255.0 172.35.26.8 50
Source(config)#ip route 11.0.0.0 255.255.255.0 172.14.35.8 50
Source(config)#
Source(config)#do sh run | i ip route                        
ip route 10.0.0.0 255.255.255.0 172.14.35.8
ip route 10.0.0.0 255.255.255.0 172.35.26.8 50
ip route 11.0.0.0 255.255.255.0 172.35.26.8
ip route 11.0.0.0 255.255.255.0 172.14.35.8 50
Source(config)#do sh ip route | b Gate
Gateway of last resort is not set

      10.0.0.0/24 is subnetted, 1 subnets
S        10.0.0.0 [1/0] via 172.14.35.8
      11.0.0.0/24 is subnetted, 1 subnets
S        11.0.0.0 [1/0] via 172.35.26.8
      172.14.0.0/16 is variably subnetted, 2 subnets, 2 masks
C        172.14.35.0/24 is directly connected, GigabitEthernet0/0
L        172.14.35.10/32 is directly connected, GigabitEthernet0/0
      172.35.0.0/16 is variably subnetted, 2 subnets, 2 masks
C        172.35.26.0/24 is directly connected, GigabitEthernet0/1
L        172.35.26.10/32 is directly connected, GigabitEthernet0/1
Source(config)#
How well does this work? Let's test by shutting down the Gi0/0 interface on ISP-1:
ISP-1(config)#int gi0/0
ISP-1(config-if)#shut
ISP-1(config-if)#
%LINK-5-CHANGED: Interface GigabitEthernet0/0, changed state to administratively down
%LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to down
ISP-1(config-if)#
This does not really work that well, though. Simulating a failure on the ISP-1 router has no effect on our routing table, because our own Gi0/0 interface is still up in this lab and the static route stays installed:
Source#sh ip route 10.0.0.0   
Routing entry for 10.0.0.0/24, 1 known subnets
S        10.0.0.0 [1/0] via 172.14.35.8
Source#
Let's turn the ISP-1 interface back up and turn our interface down, then see what effect this has:
ISP-1(config-if)#no shut
ISP-1(config-if)#
%LINK-3-UPDOWN: Interface GigabitEthernet0/0, changed state to up
%LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to up
ISP-1(config-if)#

Source(config)#int gi0/0
Source(config-if)#shut
Source(config-if)#
%LINK-5-CHANGED: Interface GigabitEthernet0/0, changed state to administratively down
%LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to down
Source(config-if)#
Source(config-if)#do sh ip route 10.0.0.0
Routing entry for 10.0.0.0/24, 1 known subnets
S        10.0.0.0 [50/0] via 172.35.26.8
Source(config-if)#
Source(config-if)#do ping 10.0.0.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 4/4/5 ms
Source(config-if)#
This works: we can send all our traffic through ISP-2 and get to 10.0.0.1. It also shows that the floating static route only really works if the problem lies with us, not with an upstream device. Let's turn the network up a notch and set up IP SLA tracking.

IP SLA

IP SLA is a feature that allows you to implement some monitoring of a line. We can use it to ping devices and, if a failure is detected, make changes appropriately.

We will do this by adding another loopback address to the Destination router, and track that. We will also turn the interface on the Source back on, and remove the floating static routes.
Source(config-if)#no shut
Source(config-if)#
*May 14 15:34:43.044: %LINK-3-UPDOWN: Interface GigabitEthernet0/0, changed state to up
*May 14 15:34:44.043: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to up
Source(config-if)#
Source(config-if)#do sh run | i ip route
ip route 10.0.0.0 255.255.255.0 172.14.35.8
ip route 10.0.0.0 255.255.255.0 172.35.26.8 50
ip route 11.0.0.0 255.255.255.0 172.35.26.8
ip route 11.0.0.0 255.255.255.0 172.14.35.8 50
Source(config-if)#no ip route 10.0.0.0 255.255.255.0 172.35.26.8 50
Source(config)#no ip route 11.0.0.0 255.255.255.0 172.14.35.8 50
Source(config)#

Destination(config)#int lo8
Destination(config-if)#
Destination(config-if)#desc We will track this interface
Destination(config-if)#
Destination(config-if)#ip add 8.8.8.8 255.255.255.255
Destination(config-if)#

Source(config)#ip route 8.8.8.8 255.255.255.255 172.14.35.8
Source(config)#ip route 8.8.8.8 255.255.255.255 172.35.26.8
Source(config)#
Source(config)#do sh ip route 8.8.8.8
Routing entry for 8.8.8.8/32
  Known via "static", distance 1, metric 0
  Routing Descriptor Blocks:
    172.35.26.8
      Route metric is 0, traffic share count is 1
  * 172.14.35.8
      Route metric is 0, traffic share count is 1
Source(config)#do ping 8.8.8.8
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 8.8.8.8, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 4/4/5 ms
Source(config)#do trace 8.8.8.8
Type escape sequence to abort.
Tracing the route to 8.8.8.8
VRF info: (vrf in name/id, vrf out name/id)
  1 172.14.35.8 4 msec
Source(config)#do trace 8.8.8.8 num
Type escape sequence to abort.
Tracing the route to 8.8.8.8
VRF info: (vrf in name/id, vrf out name/id)
  1 172.14.35.8 4 msec
    172.35.26.8 4 msec
    172.14.35.8 6 msec
  2 1.1.2.2 4 msec
    1.1.1.2 4 msec * 
Source(config)#
Notice the routing table and the trace: both show that the two routes are in use at the same time, with traffic to 8.8.8.8 shared across both ISPs. Now let's set up tracking!
Source(config)#ip sla 10 
Source(config-ip-sla)#icmp-echo 8.8.8.8 source-interface gigabitEthernet 0/0 
Source(config-ip-sla-echo)#frequency 5
Source(config-ip-sla-echo)#exit 
Source(config)#ip sla sched 10 life forever start-time now
Source(config)#track 10 ip sla 10 reachability
Source(config-track)#exit
Source(config)#
Source(config)#ip route 10.0.0.0 255.255.255.0 172.14.35.8 track 10
Source(config)#
Source(config)#do sh run | i ip route
ip route 10.0.0.0 255.255.255.0 172.14.35.8 track 10
ip route 8.8.8.8 255.255.255.255 172.14.35.8
ip route 8.8.8.8 255.255.255.255 172.35.26.8
ip route 10.0.0.0 255.255.255.0 172.14.35.8
ip route 11.0.0.0 255.255.255.0 172.35.26.8
Source(config)#no ip route 10.0.0.0 255.255.255.0 172.14.35.8
Source(config)#end
Source#
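We start by creating an IP SLA entry, identified by a number. We then use an ICMP echo to probe the IP address 8.8.8.8, sourced from our Gi0/0 interface. The frequency of 5 sends a probe every five seconds (it is an interval, not a count of pings). We then need to schedule the entry to start, and set up a tracking object that references the SLA entry we created first. Finally, we attach the tracking object to the static route, so that the route is only installed while the probe succeeds.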
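The probe and tracking object above can be tuned so that a single lost ping does not immediately move traffic. The following is only a sketch: the timer values are assumptions picked for illustration, not recommendations, and the entry is removed first because a scheduled IP SLA entry typically cannot be edited in place.

```
! Remove and recreate the entry (values below are illustrative assumptions)
no ip sla 10
ip sla 10
 icmp-echo 8.8.8.8 source-interface GigabitEthernet0/0
 ! Threshold and timeout are in milliseconds
 threshold 1000
 timeout 2000
 ! Frequency is in seconds and must not be shorter than the timeout
 frequency 5
ip sla schedule 10 life forever start-time now
!
track 10 ip sla 10 reachability
 ! Wait 10 seconds before declaring Down, and 30 seconds before declaring Up
 delay down 10 up 30
```

The delay trades failover speed for stability, so a flapping link does not cause the route to be withdrawn and reinstalled every few seconds. The state of the probe and the tracking object can then be checked with show ip sla statistics 10 and show track 10.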

Let's test this. Firstly, by making sure that our tracking is working, then by taking the interface down on ISP-1, and seeing what happens:
Source#show track
Track 10
  IP SLA 10 reachability
  Reachability is Up
    1 change, last change 00:00:51
  Latest operation return code: OK
  Latest RTT (millisecs) 5
  Tracked by:
    Static IP Routing 0
Source#

ISP-1(config)#int gi0/0
ISP-1(config-if)#shut
ISP-1(config-if)#
ISP-1(config-if)#
*May 14 15:48:58.593: %LINK-5-CHANGED: Interface GigabitEthernet0/0, changed state to administratively down
*May 14 15:48:59.592: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to down
ISP-1(config-if)#

Source#
*May 14 15:49:08.666: %TRACK-6-STATE: 10 ip sla 10 reachability Up -> Down
Source#
Source#show track
Track 10
  IP SLA 10 reachability
  Reachability is Down
    2 changes, last change 00:00:06
  Latest operation return code: Timeout
  Tracked by:
    Static IP Routing 0
Source#

Source#sh ip route 10.0.0.0
% Network not in table
Source#
So we have lost the network, but can add it in again, using the floating static route, as before:
Source(config)#ip route 10.0.0.0 255.255.255.0 172.35.26.8 50
Source(config)#
Source(config)#do sh ip route 10.0.0.0
Routing entry for 10.0.0.0/24, 1 known subnets
S        10.0.0.0 [50/0] via 172.35.26.8
Source(config)#do ping 10.0.0.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 4/5/8 ms
Source(config)#

ISP-1(config)#int gi0/0
ISP-1(config-if)#no shut
ISP-1(config-if)#

Source(config)#
*May 14 16:13:03.755: %TRACK-6-STATE: 10 ip sla 10 reachability Down -> Up
Source(config)#
Source(config)#do sh ip route 10.0.0.0
Routing entry for 10.0.0.0/24, 1 known subnets
S        10.0.0.0 [1/0] via 172.14.35.8
Source(config)#
So now, if we lose the access through ISP-1 to the 10.0.0.0/24 network, we will go through ISP-2. Let's try once more, and see what happens (this time simulating a failure elsewhere in the network):
Destination(config)#int lo8
Destination(config-if)#shut
Destination(config-if)#
*May 14 16:14:42.302: %LINK-5-CHANGED: Interface Loopback8, changed state to administratively down
*May 14 16:14:43.302: %LINEPROTO-5-UPDOWN: Line protocol on Interface Loopback8, changed state to down
Destination(config-if)#

Source(config)#do sh ip route 10.0.0.0
Routing entry for 10.0.0.0/24, 1 known subnets
S        10.0.0.0 [1/0] via 172.14.35.8
Source(config)#
*May 14 16:14:43.761: %TRACK-6-STATE: 10 ip sla 10 reachability Up -> Down
Source(config)#                       
Source(config)#do sh ip route 10.0.0.0
Routing entry for 10.0.0.0/24, 1 known subnets
S        10.0.0.0 [50/0] via 172.35.26.8
Source(config)#
Above, if we shut down the Lo8 interface on the destination (the object we are tracking), our traffic goes through ISP-2. Let's add another tracker, and protect all our routes!
Destination(config-if)#no shut
Destination(config-if)#
Destination(config-if)#
*May 14 16:16:37.442: %LINK-3-UPDOWN: Interface Loopback8, changed state to up
*May 14 16:16:38.445: %LINEPROTO-5-UPDOWN: Line protocol on Interface Loopback8, changed state to up
Destination(config-if)#

Source(config)#do sh run | i ip route
ip route 10.0.0.0 255.255.255.0 172.14.35.8 track 10
ip route 8.8.8.8 255.255.255.255 172.14.35.8
ip route 8.8.8.8 255.255.255.255 172.35.26.8
ip route 10.0.0.0 255.255.255.0 172.35.26.8 50
ip route 11.0.0.0 255.255.255.0 172.35.26.8
Source(config)#
Source(config)#no ip route 11.0.0.0 255.255.255.0 172.35.26.8
Source(config)#ip route 11.0.0.0 255.255.255.0 172.35.26.8 track 11
Source(config)#ip route 11.0.0.0 255.255.255.0 172.14.35.8 50
Source(config)#
Source(config)#ip sla 11
Source(config-ip-sla)#icmp-echo 8.8.8.8 source-interface gigabitEthernet 0/1
Source(config-ip-sla-echo)#freq 5
Source(config-ip-sla-echo)#exi
Source(config)#ip sla schedule 11 life forever start-time now
Source(config)#
Source(config)#track 11 ip sla 11 reachability
Source(config-track)#exi
Source(config)#
So, if all is good, then traffic to 10.0.0.1 should go through ISP-1 and traffic for 11.0.0.1 should go through ISP-2. If 8.8.8.8 goes down, then this traffic should switch ISPs:
Source#trace 10.0.0.1 num
Type escape sequence to abort.
Tracing the route to 10.0.0.1
VRF info: (vrf in name/id, vrf out name/id)
  1 172.14.35.8 5 msec 3 msec 4 msec
  2 1.1.1.2 5 msec *  4 msec
Source#trace 11.0.0.1 num
Type escape sequence to abort.
Tracing the route to 11.0.0.1
VRF info: (vrf in name/id, vrf out name/id)
  1 172.35.26.8 4 msec 3 msec 3 msec
  2 1.1.2.2 4 msec *  5 msec
Source#

Destination(config)#int lo8
Destination(config-if)#shut
Destination(config-if)#
*May 14 16:30:25.463: %LINK-5-CHANGED: Interface Loopback8, changed state to administratively down
*May 14 16:30:26.464: %LINEPROTO-5-UPDOWN: Line protocol on Interface Loopback8, changed state to down
Destination(config-if)#

Source#
*May 14 16:30:28.822: %TRACK-6-STATE: 10 ip sla 10 reachability Up -> Down
*May 14 16:30:28.823: %TRACK-6-STATE: 11 ip sla 11 reachability Up -> Down
Source#
Source#trace 10.0.0.1 num
Type escape sequence to abort.
Tracing the route to 10.0.0.1
VRF info: (vrf in name/id, vrf out name/id)
  1 172.35.26.8 3 msec 3 msec 3 msec
  2 1.1.2.2 5 msec *  6 msec
Source#
Source#trace 11.0.0.1 num
Type escape sequence to abort.
Tracing the route to 11.0.0.1
VRF info: (vrf in name/id, vrf out name/id)
  1 172.14.35.8 4 msec 3 msec 3 msec
  2 1.1.1.2 5 msec *  4 msec
Source#
Source#
IP SLA works really well in conjunction with floating static routes, and can be used to redirect traffic within a few seconds.
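Pulling the steps above together, the relevant configuration on the Source router ends up looking roughly like this. It is a consolidated view of the commands already entered in the walkthrough (with abbreviated keywords written out in full), not a capture of the running configuration.

```
! One probe per ISP-facing interface, both aimed at Lo8 on Destination
ip sla 10
 icmp-echo 8.8.8.8 source-interface GigabitEthernet0/0
 frequency 5
ip sla schedule 10 life forever start-time now
ip sla 11
 icmp-echo 8.8.8.8 source-interface GigabitEthernet0/1
 frequency 5
ip sla schedule 11 life forever start-time now
!
track 10 ip sla 10 reachability
track 11 ip sla 11 reachability
!
! Routes to the probe target itself
ip route 8.8.8.8 255.255.255.255 172.14.35.8
ip route 8.8.8.8 255.255.255.255 172.35.26.8
!
! Primary routes are tied to a tracker; backups float at distance 50
ip route 10.0.0.0 255.255.255.0 172.14.35.8 track 10
ip route 10.0.0.0 255.255.255.0 172.35.26.8 50
ip route 11.0.0.0 255.255.255.0 172.35.26.8 track 11
ip route 11.0.0.0 255.255.255.0 172.14.35.8 50
```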

In the real world, though, you would track more than one address, so that if it is only a particular end-point outside of your network that has failed, you don't go reconfiguring the network for everyone. Instead, you would track three addresses (or so), and if all three fail, chances are the link is messed up, at which point failing over to a different link would be a good idea.
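One way to express "fail over only if all three targets are unreachable" is a tracked list with boolean OR, which stays Up while at least one member is Up. This is a sketch only: the 192.0.2.x targets are placeholder documentation addresses, and the SLA and track numbers are arbitrary choices, so substitute addresses you actually trust to be reachable through ISP-1.

```
! Pin each (hypothetical) probe target to ISP-1, so the probes test that path.
! As far as we know, source-interface only sets the source address of the
! probe; the routing table still chooses where it leaves the router.
ip route 192.0.2.1 255.255.255.255 172.14.35.8
ip route 192.0.2.2 255.255.255.255 172.14.35.8
ip route 192.0.2.3 255.255.255.255 172.14.35.8
!
ip sla 21
 icmp-echo 192.0.2.1 source-interface GigabitEthernet0/0
 frequency 5
ip sla schedule 21 life forever start-time now
ip sla 22
 icmp-echo 192.0.2.2 source-interface GigabitEthernet0/0
 frequency 5
ip sla schedule 22 life forever start-time now
ip sla 23
 icmp-echo 192.0.2.3 source-interface GigabitEthernet0/0
 frequency 5
ip sla schedule 23 life forever start-time now
!
track 21 ip sla 21 reachability
track 22 ip sla 22 reachability
track 23 ip sla 23 reachability
!
! Up while any one probe succeeds; Down only when all three fail
track 100 list boolean or
 object 21
 object 22
 object 23
!
ip route 10.0.0.0 255.255.255.0 172.14.35.8 track 100
ip route 10.0.0.0 255.255.255.0 172.35.26.8 50
```

With this in place, a single unreachable end-point leaves the primary route alone, and the floating static route via ISP-2 only takes over once every probe through ISP-1 has failed.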

HSRP, VRRP and GLBP are all covered in the CCNA and Beyond study guide, which you can purchase now, or read a sample.

You can download the UNetLab file for this here.

CCIE #49337, author of CCNA and Beyond, BGP for Cisco Networks, MPLS for Cisco Networks, VPNs and NAT for Cisco Networks.
