Adventures in IWAN – Part 1 – Transport Independence

For a number of reasons, and as part of wider interactions with SD-WAN, I have been having a few adventures with Cisco’s current SD-WAN offering: Intelligent WAN, or IWAN (2.1). Whether IWAN is an SD-WAN as you understand it from other vendors is a topic for another day, but I thought it might be useful to cover a few things I have come across.

This is intended to be a multi-part post, with the first 2 parts covering the first 2 pillars (hopefully with a few config examples thrown in after part 2), and the third eventually covering pillars 3 and 4.

As at least 80% of getting IWAN up and running is in the first 2 pillars, I am going to focus primarily on these.

One thing to note from dealing with Cisco IWAN so far is that a lot of the underlying mechanisms are exposed. This has helped me, personally, to improve my understanding of all SD-WAN vendors and how their solutions fit together (e.g. What is the overlay and how does it actually work? What affects the control plane? What is being provided in the data plane? Overlay routing for dynamic point-to-multipoint with encryption? How exactly are you doing encryption? What protocols are you using? How are you managing key distribution and re-keying? Is traffic diverted to the device, or is the device inline? How is performance monitoring working? App optimisation? Is it flow based? How are you looking into the flow? Application identification, and by what method? Controller for traffic control? Real time? Orchestration? etc.) Basically, what is the magic?

Are we ready?  OK. Strap in, here we go.

The Building blocks of IWAN

Have a quick look at the Cisco picture below and you can see the 4 pillars of IWAN.

[Figure: Cisco IWAN technical overview with management]

Each pillar of IWAN has underlying technology building blocks, and those technologies also have foundational components. Hopefully I will provide some clarity on the building blocks you layer on top of each other to produce a shiny, polished IWAN solution.

The first 2 pillars of IWAN are – 1) Transport Independence and 2) Intelligent Path Control.

Part 1

 Transport Independence

The fundamental technology underpinning transport independence in IWAN is DMVPN (Dynamic Multipoint VPN) as the transport overlay technology, and this also has component parts.

So what is DMVPN fundamentally?

It is a combination of 4 things:

  1. Multipoint GRE tunnels
  2. NHRP (Next Hop Resolution Protocol) – basically creates a mapping database of the spokes’ GRE tunnel interfaces to real (or public) addresses.  Think of this like tunnel overlay IPs ARPing for the “real” underlay IP addresses.
  3. IPsec tunnel protection – creates and applies encryption policies dynamically
  4. Routing – essentially the dynamic advertisement of branch networks via routing protocols, e.g. BGP, EIGRP, OSPF, RIP, ODR.

Let’s cover each one in turn, then you will have your tunnel overlay or secure transport independence sorted.

DMVPN – your overlay transport technology

Multipoint GRE tunnels

If you are familiar with GRE, you will know that you create a tunnel between two endpoints by adding an extra GRE header. You create a tunnel interface (a virtual interface) with an address, and tie this to real source and destination addresses on the actual interfaces that terminate the tunnel.

A couple of pre-canned Cisco diagrams do the trick here for the sake of illustration:

[Figure: GRE tunnel]

[Figure: tunnel]

Multipoint GRE broadens this idea by allowing a tunnel to have “multiple” destinations, so you can terminate all of the tunnels on a single interface. Handy for hub-and-spoke and spoke-to-spoke, I think you will agree.

So Multipoint GRE is your tunnel overlay SD-WAN transport in the Cisco world.  Well that was simple, so onward to the less straightforward.

Next Hop Resolution Protocol – NHRP

The next building block of DMVPN is NHRP, which provides a way of dynamically mapping all those multipoint GRE tunnel interfaces you just created to their associated real addresses on the underlay transport network.

NHRP has actually been around a while in different forms. It originated as an extension of the ATM ARP mechanism and dates back to 1998/1999 as a technology (RFC 2332 was published in 1998).

Think of NHRP (Next Hop Resolution Protocol) as being like ARP, but for the underlying real IP addresses. So you have a physical interface on your WAN router with an address, and you have a GRE tunnel address on that same router. One is your IP underlay and one is your IP tunnel overlay. You now need a way to map your IP tunnel overlay network to your IP underlay network, and NHRP does this job.

By way of visualisation, I particularly like the diagram below from Cisco, which shows very clearly which are your overlay addresses, which are your tunnel addresses, and which are your real (NBMA) addresses. As a distinction, it might help to think of GRE as your transport overlay technology (each multipoint GRE tunnel maps to a WAN transport), and of the network addresses you wish to send over this tunnel as your overlay network, so a network overlay.

[Figure: DMVPN overlay, tunnel and NBMA addresses]

A spoke router registers with a Next Hop Server (NHS) as it comes up. You give the spoke an NHS address to register with and, incidentally, a multicast mapping so that multicast and broadcast traffic can be sent over the tunnel when the underlying network does not support IP multicast (useful for routing protocols). Once the spokes have registered, the NHRP database maintains a mapping of tunnel addresses to real addresses. If a spoke then needs to discover the logical tunnel IP to physical NBMA IP mapping for another Next Hop Client (spoke) within the same NBMA network, it sends an NHRP resolution request to find it. This discovery means you do not have to go via the hub every time for spoke-to-spoke communication, which is really the Dynamic part of DMVPN. You can create dynamic GRE tunnels (and ultimately encrypted tunnels) from a spoke on the fly: query NHRP, find the real NBMA address of another spoke and, voila, you have the peer information to set up your tunnels direct.

NB: There are some interesting CEF details with NHRP across DMVPN Phases 1, 2 and 3, but that is follow-on reading I would say. Allowing a layer 2 resolution protocol to ultimately control your layer 3 direction and interactions is maybe controversial for the purist, and I will doubtless attempt to cover this when looking at some other SD-WAN techniques in other posts.

In short, all spokes register their NBMA addresses with a Next Hop Server (typically the hub). When a spoke needs to send a packet via a next hop (another spoke) on the mGRE cloud or transport overlay, it asks the NHS via a resolution request, “can I please have the real/NBMA address of this next hop?” The NHS replies with the NBMA address of the other spoke, and from this point the spokes can speak directly.
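To make the register-then-resolve flow a little more concrete, here is a minimal Python sketch of it. This is a toy model only, not how IOS implements NHRP: the class names, method names and addresses are all made up for illustration, and it ignores timers, authentication and the Phase 1/2/3 differences.

```python
class NextHopServer:
    """Toy NHS: holds the tunnel (overlay) IP -> NBMA (underlay) IP mappings."""

    def __init__(self):
        self.cache = {}

    def register(self, tunnel_ip, nbma_ip):
        # A spoke announces "my tunnel address lives behind this real address".
        self.cache[tunnel_ip] = nbma_ip

    def resolve(self, tunnel_ip):
        # Answer a resolution request; None means "never heard of it".
        return self.cache.get(tunnel_ip)


class Spoke:
    """Toy Next Hop Client that registers on start-up and resolves on demand."""

    def __init__(self, tunnel_ip, nbma_ip, nhs):
        self.tunnel_ip = tunnel_ip
        self.nbma_ip = nbma_ip
        self.nhs = nhs
        self.shortcuts = {}  # locally cached answers from the NHS
        nhs.register(tunnel_ip, nbma_ip)

    def nbma_for(self, next_hop_tunnel_ip):
        # Ask the NHS only the first time, then reuse the cached mapping.
        if next_hop_tunnel_ip not in self.shortcuts:
            nbma = self.nhs.resolve(next_hop_tunnel_ip)
            if nbma is None:
                raise LookupError(f"no NHRP mapping for {next_hop_tunnel_ip}")
            self.shortcuts[next_hop_tunnel_ip] = nbma
        return self.shortcuts[next_hop_tunnel_ip]


# Hypothetical addressing: 10.0.0.0/24 is the overlay, the rest is underlay.
hub = NextHopServer()
spoke_a = Spoke("10.0.0.11", "192.0.2.11", hub)
spoke_b = Spoke("10.0.0.12", "198.51.100.12", hub)

# Spoke A wants to reach spoke B directly, so it resolves B's real address.
print(spoke_a.nbma_for("10.0.0.12"))
```

Once spoke A holds that mapping, it has the peer address it needs to build a tunnel straight to spoke B.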

Encryption 

IPsec tunnel protection

IPsec is the suite of protocols that enables end-to-end encryption over the network in IWAN. We are using IKEv2 and IPsec. Remember you can get DMVPN working as an overlay transport without encryption; this is optional (but good practice for security). Technically you just need your routing, multipoint GRE tunnel overlay network, and NHRP, then you can add encryption once network connectivity is sorted. I have found this is a good way to build the solution in blocks to make troubleshooting easier.

It is a little involved to go into here, but essentially Phase 1 (the IKE negotiation) identifies who you want to form an encrypted tunnel with and securely authenticates the peer (and sets some parameters for Phase 2), and then Phase 2 agrees on what to use to actually encrypt the traffic. The fundamental problem is that when you have to create a lot of point-to-point IPsec tunnels, you need some way to tell each device the address of its peer so it can create an encrypted tunnel. That would mean an individual configuration for every peer-to-peer connection, plus managing keepalives (Dead Peer Detection), failover and so on. If you want on-demand dynamic spoke-to-spoke encryption, then IPsec needs some help. There are a number of ways to solve this, but DMVPN phase 3 (multipoint GRE and NHRP) has been used for some time and is the method of choice today in IWAN.

With DMVPN it is always worth covering headers and how they are used in the real world should you choose to use IPsec. This way you can visualise the overlay network.

Typically you use transport mode with DMVPN, so what does this mean, and why use it?

Header confusion

There are encryption headers and GRE headers; do not confuse or conflate the two.

IPsec uses 2 distinct protocols to encrypt and/or authenticate your Layer 3 payload. These are ESP (Encapsulating Security Payload), which can both encrypt and authenticate, and AH (Authentication Header), which only authenticates. Both add headers to your packet. They both also run in one of two modes, tunnel or transport. These modes either use the original IP header (transport) or add a new IP header (tunnel) in order to traverse the network. This is outlined clearly in the diagram below.

[Figure: IPsec headers in transport and tunnel mode]

The next level of header confusion comes with GRE – which also adds an IP header.

Your original packet might look something like:

IP hdr 1   |   TCP hdr  |    Data

GRE Encapsulation:

IP hdr 2   |    GRE hdr  |   IP hdr 1   |    TCP hdr  |   Data

GRE over IPsec Transport Mode (with ESP):

IP hdr 2   |   ESP hdr |    GRE hdr  |    IP hdr 1   |   TCP hdr   |   Data

GRE over IPsec Tunnel Mode (with ESP):

IP hdr 3   |   ESP hdr   |   IP hdr 2   |   GRE hdr   |   IP hdr 1 |   TCP hdr   |   Data

Transport mode only encrypts the payload and keeps the existing IP header, whereas tunnel mode encrypts the whole IP packet (header + payload) and adds a new IP header. Note that with GRE over IPsec, the “existing” header IPsec sees is IP hdr 2, the one GRE added.

In DMVPN both the GRE peer and IPsec peer addresses are the same, so a second outer header would essentially repeat identical information. Transport mode avoids adding it (a 20-byte IPv4 header saved right there).

Typically you use ESP with transport mode for DMVPN.
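As a quick sanity check on the header layouts above, here is a small Python sketch that builds the two GRE-over-IPsec stacks and counts the outer IP headers. It is purely illustrative: the function names are mine, the 20-byte figure assumes an IPv4 header with no options, and it deliberately says nothing about ESP overhead, which varies with the cipher and padding.

```python
IPV4_HEADER_BYTES = 20  # assumption: IPv4 header without options


def gre_over_ipsec(mode):
    """Return the header stack (outermost first) for GRE over IPsec with ESP."""
    original = ["IP hdr 1", "TCP hdr", "Data"]
    gre_packet = ["IP hdr 2", "GRE hdr"] + original
    if mode == "transport":
        # ESP is slotted in behind the IP header that GRE already added.
        return [gre_packet[0], "ESP hdr"] + gre_packet[1:]
    if mode == "tunnel":
        # The whole GRE packet is wrapped and a brand new IP header is added.
        return ["IP hdr 3", "ESP hdr"] + gre_packet
    raise ValueError(f"unknown IPsec mode: {mode}")


def ip_header_count(stack):
    return sum(1 for layer in stack if layer.startswith("IP hdr"))


transport = gre_over_ipsec("transport")
tunnel = gre_over_ipsec("tunnel")
saving = (ip_header_count(tunnel) - ip_header_count(transport)) * IPV4_HEADER_BYTES

print(" | ".join(transport))
print(" | ".join(tunnel))
print(f"transport mode saves {saving} bytes per packet")
```

The only difference between the two stacks is that extra outer IP header, which is where the saving comes from.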

Now you should have a reasonable view of the Encryption overlay and the GRE overlay and the headers that are added end to end.

Routing

Routing comes up in two areas of IWAN: once in the transport independence piece, and again in best path selection with PfR. It is important not to confuse the two. For example, PfR uses EIGRP for the Service Advertisement Framework (SAF), but for the transport piece you could use the same or a different routing protocol, e.g. BGP, EIGRP or OSPF. When EIGRP is used for your underlay and overlay routing as well (which is highly likely), conversations can get confusing.

You have a router at the customer edge, trying to get to another router at another edge. In between you have a Service Provider network.  Typically in order to get traffic to where you want to go you need to interact with the Service Provider’s BGP network, whether that is BGP advertised default routes, statics, redistribution, whatever is most suitable for you and your SP.

Now with IWAN you are adding a tunnel overlay, and this overlay network needs to be advertised into your current Enterprise network so that traffic that needs to get to another one of your sites knows which next hop to use. That next hop will now be a tunnel, i.e. you need to use a tunnel to get there. Remember NHRP is used to do the mappings here, to actually get the tunnel traffic across to the real address of the remote site that terminates the tunnel. So where previously you may have used dynamic or static routing or a default route in BGP to say, “if you want to get to an address that lives across the WAN, use the following next hop (WAN interface)”, with an overlay you are telling traffic to use your tunnel interface as its next hop. To advertise these tunnel overlay routes into your network you can either use statics or a routing protocol of choice like BGP or EIGRP. Of course, if your routing protocols are covering both the real WAN interface and your tunnel interface networks, you need to take care that the correct route gets installed into the forwarding table, and that you are learning the information from a consistent place, so your routing protocols don’t get confused and bounce the tunnel up and down (the recursive routing problem described a little further down).

As mentioned, the other use of a routing control plane in IWAN is for PfR (Performance Routing), where the EIGRP engine is used for the Service Advertisement Framework and creates its own neighbours and domains accordingly.

Of course this is logically separate from the underlay and actual traffic forwarding and relies on the overlay network to get connectivity across the WAN between members of the SAF domain for sending SAF information to each other.  That is, the tunnels provide connectivity for SAF peers.

So what does all this mean?  Well it means you can very easily have 3 routing protocol names flying around in conversation confusing everyone on a whiteboard – BGP for underlay,  EIGRP for overlay,  EIGRP for PfR (or any mixture e.g. OSPF, BGP, EIGRP for routing and EIGRP for PfR).  The one constant here is the EIGRP engine is always the mechanism for PfR SAF peering.  However if you separate the PfR / SAF process in your mind as a monitoring technology that just happens to use an EIGRP process to set up its domains (nothing to do with network connectivity) – then the rest is really just routing as normal with care taken over your DMVPN.

DMVPN: which routing protocol?

If you have ever configured DMVPN you will know that there are limitations or caveats with each routing protocol. Let’s have a brief look at these at a basic level and, for simplicity, only with DMVPN phase 3.

OSPF – You can use OSPF of course, but you need to be a little careful with network types. Point-to-point? Won’t work, because you are using a multipoint GRE tunnel interface. Broadcast? This will work, but you need to make sure the spokes will never be elected as the DR or BDR (“ip ospf priority 0” on the tunnel interfaces of the spokes should do the trick here). Non-broadcast? Yes, this will work as with broadcast, but you need to statically configure your neighbours. Point-to-multipoint? Works well with phase 3, and you don’t have to worry about DR/BDR election. With DMVPN phase 2 it is important to note that point-to-multipoint does not work so well, as it changes the next hop so that all traffic goes through the hub router, which is not ideal for dynamic spoke-to-spoke. In phase 2 you have the same issue with OSPF point-to-multipoint non-broadcast, with the addition of having to statically define your neighbours.

What are the issues with OSPF? Well, a couple that spring to mind are that in DMVPN you use the same subnet, and therefore all OSPF routers would be in the same area. Summarisation is only available on Area Border Routers (ABRs) and Autonomous System Border Routers (ASBRs), therefore the hub router would need to be an ABR for summarisation. Also, as OSPF is link-state, any link change in the area results in every router in that area running an SPF calculation. Misconfiguration of the DR/BDR will break connectivity, and traffic engineering has its issues with a link-state protocol.

So OSPF is doable, using NSSA (Not So Stubby Areas) on the spoke and careful config, but for larger scale DMVPN people drift towards BGP/EIGRP.

EIGRP – Is not link state, does not have an area concept, and you don’t have to think about the topology tweaks you need to do with OSPF above. One thing to note in DMVPN phase 2 is that you don’t want the hub setting itself as the next hop for routes, but you can configure around this with EIGRP (“no ip next-hop-self eigrp” on the hub tunnel interface). Of course you need to disable split-horizon (“no ip split-horizon eigrp”) so routing advertisements are allowed back out of the same interface they were learned on (the mGRE tunnel interface). Good advice for scale is to turn the spokes into EIGRP stubs, and also to watch the number of adjacencies the hub has, as hellos can become an issue (you can play with timers here too). Also, EIGRP can summarise and manipulate metrics at any point.

EIGRP is well-suited to DMVPN at scale.

BGP

BGP also works for DMVPN. We know it scales (the Internet), and the default timers are less onerous than those of other protocols. The choice, as ever, is iBGP vs eBGP. Whereas with iBGP you might require route reflectors at scale and an IGP to carry next hops, eBGP might need several AS numbers, or you could disable loop prevention.

With DMVPN eBGP, the next hop is changed on outbound updates, so all good there. The next question is whether to use the same AS for every site, or unique ASs. Unique AS numbers are good for preventing loops, but the 16-bit private AS range (64512 to 65534) only gives you 1023 numbers, so that caps you at roughly a thousand spokes. With 32-bit AS numbers the shortage of private AS numbers is solved, but there is still a good deal of configuration at the hub with unique AS numbers.
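The roughly-a-thousand-spokes ceiling is just arithmetic on the private AS ranges, which the following Python snippet works through. It assumes the ranges reserved for private use in RFC 6996 (64512 to 65534 for 16-bit, 4200000000 to 4294967294 for 32-bit), and the helper name and the idea of holding back one AS number for the hubs are my own illustration rather than anything IWAN mandates.

```python
# Private-use AS ranges as reserved in RFC 6996 (inclusive bounds).
PRIVATE_AS_16BIT = range(64512, 65534 + 1)
PRIVATE_AS_32BIT = range(4200000000, 4294967294 + 1)


def max_unique_spoke_asns(private_range, reserved_for_hubs=1):
    """How many spokes can have their own private AS, keeping some for hubs."""
    return len(private_range) - reserved_for_hubs


print(len(PRIVATE_AS_16BIT))                    # size of the 16-bit private range
print(max_unique_spoke_asns(PRIVATE_AS_16BIT))  # spokes left after one hub AS
print(len(PRIVATE_AS_32BIT))                    # size of the 32-bit private range
```

So with one AS per spoke the 16-bit range runs out at about a thousand sites, while the 32-bit range is effectively unlimited for any realistic DMVPN.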

Say you run the same AS at all sites. In this case the receiving router sees its own AS number in the AS path of a received BGP update, assumes the route has looped back to the AS it came from, and rejects it. To get round this you can use as-override on the hub, but this can produce loops in the control plane.
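Here is a small Python sketch of that AS path check and of what as-override does to get around it. It is a toy model with made-up AS numbers and function names, not BGP itself: it only looks at the AS path and ignores every other attribute.

```python
def accepts(local_as, as_path):
    """eBGP loop prevention: refuse any update whose AS path contains our AS."""
    return local_as not in as_path


def as_override(as_path, neighbor_as, own_as):
    """Model of as-override: swap the neighbour's AS for the sender's own AS."""
    return [own_as if asn == neighbor_as else asn for asn in as_path]


HUB_AS = 65000    # hypothetical hub AS
SPOKE_AS = 65001  # hypothetical AS shared by every spoke

# Spoke 1 originates a prefix; the hub prepends its AS before telling spoke 2.
path_from_spoke1 = [SPOKE_AS]
path_to_spoke2 = [HUB_AS] + path_from_spoke1

# Spoke 2 sees its own AS in the path and rejects the update.
print(accepts(SPOKE_AS, path_to_spoke2))

# With as-override the hub rewrites the spoke AS before sending.
overridden = [HUB_AS] + as_override(path_from_spoke1, SPOKE_AS, HUB_AS)
print(overridden, accepts(SPOKE_AS, overridden))
```

The rewritten path no longer contains the spoke AS, which is exactly why the update is accepted, and also why the usual loop protection is no longer there to save you.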

iBGP then, and back to the next hop modification issue. With phase 1 and 3 you can use “neighbor <address> next-hop-self all” for reflected routes on a route reflector. With this, iBGP probably becomes the preferred option when it comes to BGP.

iBGP is well-suited to DMVPN at scale.

From the above EIGRP or BGP tend to be the preferred choices for DMVPN.

Now the assumption often made with IWAN is that BGP and EIGRP are chosen entirely because of the typical reasons above.

However, in addition to those good reasons, remember that with IWAN you want some method of quick failover to an alternate or best path based on monitoring. EIGRP keeps a topology table with feasible successors, and BGP keeps alternate paths in its BGP table, so in both cases alternate routes are ready and waiting to populate the routing and forwarding tables on failure and facilitate a quick change of preferred path.
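For the EIGRP half of that, the “ready and waiting” idea can be sketched in a few lines of Python. This is a simplified model of the EIGRP feasibility condition with invented neighbour names and metrics: it treats the current best metric as the feasible distance (in real EIGRP the feasible distance is the lowest metric seen since the route last went passive), and it does not model BGP at all.

```python
def classify_paths(paths):
    """Pick the successor and any feasible successors for one destination.

    paths: list of (neighbor, reported_distance, cost_to_neighbor) tuples.
    A neighbour is a feasible successor if its reported distance is strictly
    lower than the feasible distance, which guarantees a loop-free backup.
    """
    totals = {neighbor: rd + cost for neighbor, rd, cost in paths}
    successor = min(totals, key=totals.get)
    feasible_distance = totals[successor]  # simplification, see lead-in
    feasible_successors = [
        neighbor
        for neighbor, rd, _ in paths
        if neighbor != successor and rd < feasible_distance
    ]
    return successor, feasible_successors


# Hypothetical metrics for one remote prefix as seen from a branch router.
paths = [
    ("hub1-tunnel", 10, 5),   # total 15, best path
    ("hub2-tunnel", 12, 10),  # total 22, reported distance 12 < 15
    ("backdoor", 30, 5),      # total 35, reported distance 30 >= 15
]

successor, backups = classify_paths(paths)
print(successor, backups)
```

If the successor fails, a feasible successor can be installed straight away without the router having to go and query its neighbours first.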

Another very good reason for the use of EIGRP and BGP with IWAN.

So there you have it, a brief tour of the 4 building blocks of DMVPN.

Finally, of course, no current discussion of DMVPN would be complete without a brief excursion into Front-Door VRFs and recursive routing.

Front Door VRFs.

These are a very useful technique in IWAN as they simplify paths and configuration a good deal. What is a VRF (Virtual Routing and Forwarding) instance? Basically it allows multiple instances of a routing table to exist on a router and work simultaneously. This is useful as it allows network paths to be segmented without using multiple devices. Effectively, in an IWAN design you put your WAN interfaces into a separate VRF (the front door, you see) and this avoids some recursive routing problems you may be familiar with from using GRE (more on that below).

Recursive routing with GRE

If you are familiar with configuring DMVPN you may be aware that you can get yourself into a pickle when it comes to routing, and in particular, recursive routing.  So if you are using a routing protocol for your overlay and another for your underlay, there could be a conflict here.  For example, if you learn your route both inside and outside of the tunnel for the same prefix, well the router gets a little confused.

If you have ever seen “Tunnel temporarily disabled due to recursive routing” then you know what I am talking about.  The first time you bump into this it can lead to furrowed brows and prolonged head scratching until the light-bulb fires.

So here is the crux of this issue:

If, for example, you have two routers with NBMA (WAN) interfaces addressed 10.10.1.10/16 at one end and 10.12.1.12/16 at the other, these are on different networks, so you use a routing protocol to get across any intermediate hops to the other end. Say we use OSPF for this, e.g. Router_A (10.10.1.10/16) – Router_B (ospf) – Router_C (ospf) – Router_D (10.12.1.12/16). These are also your tunnel endpoints, remember.

Now say you want to use EIGRP to advertise your tunnel network, and you make the easy mistake of having an overlapping network, i.e. your GRE tunnel interface addresses are 10.2.1.10 and 10.2.1.12 at each end. So you may set EIGRP for network 10.0.0.0 (which also happens to cover the NBMA or real addresses).

OK, so the problem here is that you now have the 10 network being advertised for the NBMA addresses in OSPF and then, when the tunnel comes up, you also have the 10 network being advertised through EIGRP over the tunnel. On Cisco routers an internal EIGRP route (administrative distance 90) is preferred over an OSPF route (110) for the same prefix, so the route learned through the tunnel wins. So as soon as the EIGRP neighbour comes up over the tunnel, the tunnel goes down and with it the EIGRP neighbour, and rinse and repeat. The problem of course is that the NBMA (or WAN interface) network is now being advertised over the tunnel network using EIGRP.

Given the way the tunnel gets set up, which is to rely on OSPF (to find the actual NBMA tunnel endpoint), this is simply not going to work.

In short, the EIGRP neighbour comes up and you are saying the way to get to the real address (or tunnel endpoint) is over the tunnel, while simultaneously overriding the way the tunnel actually gets connectivity to that real address (tunnel endpoint) to set itself up as a tunnel (over OSPF).  The only way the EIGRP neighbour could come up in the first place is that OSPF had already provided the underlay routing to set up a tunnel.  All clear?  Yeah, I know, this can make you rub your forehead the first time you come across it.
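If it helps, the whole pickle can be reproduced with a toy routing table in Python. This is only a sketch under stated assumptions: the interface names and the dictionary layout are invented, route selection is reduced to longest prefix match followed by the Cisco default administrative distances (EIGRP internal 90, OSPF 110), and “recursive” simply means the best route to the tunnel destination points out of the tunnel itself.

```python
import ipaddress

ADMIN_DISTANCE = {"eigrp": 90, "ospf": 110}  # Cisco defaults


def best_route(rib, dest):
    """Longest prefix match first, then lowest administrative distance."""
    address = ipaddress.ip_address(dest)
    candidates = [r for r in rib if address in ipaddress.ip_network(r["prefix"])]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (
            -ipaddress.ip_network(r["prefix"]).prefixlen,
            ADMIN_DISTANCE[r["proto"]],
        ),
    )


def tunnel_is_recursive(rib, tunnel_dest, tunnel_if):
    """True if the route to the tunnel endpoint points back into the tunnel."""
    route = best_route(rib, tunnel_dest)
    return route is not None and route["out_if"] == tunnel_if


TUNNEL_DEST = "10.12.1.12"  # NBMA address of the far-end router

# Before the EIGRP neighbour forms: only OSPF knows the far-end WAN network.
rib_before = [
    {"prefix": "10.12.0.0/16", "proto": "ospf", "out_if": "GigabitEthernet0/0"},
]

# After: EIGRP "network 10.0.0.0" has advertised the same prefix over Tunnel0.
rib_after = rib_before + [
    {"prefix": "10.12.0.0/16", "proto": "eigrp", "out_if": "Tunnel0"},
]

print(tunnel_is_recursive(rib_before, TUNNEL_DEST, "Tunnel0"))
print(tunnel_is_recursive(rib_after, TUNNEL_DEST, "Tunnel0"))
```

In the second case the router is being told to reach the tunnel endpoint through the tunnel, which is the point at which the tunnel gets disabled.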

The usual way to get round this is to be very careful with your subnets and routing to avoid the recursion.

But there is another way to avoid this: enter (or enter through) the Front-Door VRF.

The principle here is that you have a separate routing table for the physical WAN interface (the front-door), and the tunnel or overlay network – so a VRF for each.  Or most simply, a separate VRF for the WAN interface and everything behind this is in the global routing table if you so wish.  As we are not learning the routes for the tunnel and NBMA through the same routing table, bingo you have solved your recursive routing problem.

There is still some magic needed, as there must be a way to tell the tunnel you are creating to use the WAN interface as a tunnel endpoint. Create your WAN interfaces in their own VRF, then create your tunnel interfaces with these addresses as the source and destination tunnel endpoints, and finally just stitch these together with a VRF command under the tunnel interface (“tunnel vrf” followed by the front-door VRF name; the stitching is the internal pixie dust). Your network and routing over the tunnel are now separated from your transit network underneath.
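The separation can be sketched as two independent lookup tables in Python. Again this is only an illustration: the VRF name, interface names and prefixes are hypothetical, and the lookup is a bare longest prefix match rather than anything IOS actually does.

```python
import ipaddress


def lookup(table, dest):
    """Longest prefix match over a {prefix: outgoing_interface} table."""
    address = ipaddress.ip_address(dest)
    matches = [
        (ipaddress.ip_network(prefix), out_if)
        for prefix, out_if in table.items()
        if address in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# One routing table per VRF. The WAN interface lives in the front-door VRF,
# while the tunnel and everything behind it stay in the global table.
tables = {
    "FRONT-DOOR": {"0.0.0.0/0": "GigabitEthernet0/0"},
    "global": {"10.2.1.0/24": "Tunnel0", "10.0.0.0/8": "Tunnel0"},
}

TUNNEL_VRF = "FRONT-DOOR"    # the stitching: where the tunnel endpoint is looked up
TUNNEL_DEST = "10.12.1.12"   # NBMA address of the remote tunnel endpoint

# Encapsulated tunnel packets are routed using the front-door table only.
underlay_exit = lookup(tables[TUNNEL_VRF], TUNNEL_DEST)

# User traffic is routed using the global table and heads into the tunnel.
overlay_exit = lookup(tables["global"], "10.12.1.12")

print(underlay_exit, overlay_exit)
```

Even though the global table says 10.0.0.0/8 is reachable via the tunnel, the tunnel endpoint lookup never consults that table, so the overlap no longer matters.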

[Figure: Front-Door VRF (F-VRF)]

Shut the front door: that is much simpler for DMVPN.

Part 2 Intelligent Path Control

 

 
