DDOS approaches and Flowspec



Denial of Service – for once a technical term that is pretty much exactly what it says on the tin.  Essentially the service you are trying to provide (be that internet access, a corporate application, or connectivity) is denied.  I unplug your TV and you can no longer watch TV: denial of service, simple.

In a volumetric Denial of Service, where a large “volume” of traffic is created to fill up your connectivity pipe, you typically need some way of generating the traffic and sending it towards your victim.  This could be a single machine pumping out traffic, but a single source is hard to scale nowadays to the volumes needed to clog some pretty beefy pipes.

Distributed Denial of Service, again, means what it says.  Instead of using one machine, you use a lot of distributed machines to all generate traffic simultaneously and point it at your victim (or many victims) to deny a service.


What is the problem?

Denial of Service and Distributed Denial of Service have been around for some time, but with the advent of IOT (Internet of Things) and the proliferation of cheap devices with basic distros and firmware, the scale of DDOS has exploded over the last year or so.  The major headlines making people take notice were around the Dyn DNS attacks – a combination of the Mirai botnet and Bashlite – affecting the services of Twitter, GitHub, Dyn DNS, Netflix, Reddit etc.  Variants thereafter affected many Deutsche Telekom and TalkTalk routers, to name a few.

This, in turn, signalled the dawn of terabit-per-second DDOS attacks involving tens of millions of IP addresses.


Mirai and Bashlite

The Mirai botnet scans IP addresses for vulnerable IOT devices with default usernames and passwords, reports them to a control server, and from there each device can become a bot to be used for DDOS.  It also removes competing malware from memory and blocks remote admin ports.  The source code for Mirai was dropped into the open, so it effectively became open source.

Mirai Nikki (Future Diary)

Bashlite came out of a flaw in the Bash shell (Shellshock) and took advantage of devices running BusyBox.  BusyBox is software that provides Unix tools in a single binary – originally for the Debian distro – so instead of each app having its own binary, you call the single BusyBox binary under different names for the various apps.  Essentially this allows code to be shared between multiple applications without requiring a library, reducing overhead.  Useful for memory-constrained, portable operating system environments (POSIX – hence IOT).

From booters and stressers, to unpatched NTP or DNS internet services, old home routers, 4G/5G handsets, and all variety of IOT devices with basic firmware, flimsy Linux distros, default secondary passwords etc.: if these types of devices are recruited into a botnet, then you can ask them all to send traffic somewhere and the traffic scales rather quickly.  A brief internet search for booters and stressers reveals just how easy it is to rent a service to cause annoyance and disruption (not that you would generally want to do that of course, as they are mostly illegal for use in the wild, but for testing your own infrastructure with the correct permissions … maybe).

Unfortunately, in the rush to get basic functionality out there, the market has been flooded with a variety of cheap IOT devices (or IOCT – the Internet of Cheap or Crappy Things).

As many of these cheap IOT devices never get hardware, firmware or software upgrades, and are rarely swapped out, we shouldn’t expect this problem to go away any time soon. (Mirai, incidentally, means “future” in Japanese.)

This is the general problem, so where typically on your infrastructure does it present?

The Denial of Service problem typically affects several places in the network, most commonly the Internet pipe, the firewall and the actual server under attack, but SQL servers, load-balancers and IPS/IDS are also affected.  The failure of firewalls, IDS/IPS or application delivery controllers to address the problem (they tend to get hosed in a serious attack) has led to dedicated DDOS solutions in the market.  As Internet pipes tend to be hit the most, a lot of focus has ended up there.

So what is the solution, and is there a 100% effective solution today?  (I am guessing we already know the answer).

Historically, the first way to mitigate an attack would be for an enterprise being DOS’d to contact their friendly Service Provider upon detection and ask for help. The SP would then perhaps install an ACL or provide ad-hoc filtering.  This could be a short conversation, and ultimately money-based – so not altogether practical long term.

A stellar-mass black hole in orbit with a companion star located about 6,000 light years from Earth.

Another way would be to use DRTBH (destination-based remote-triggered black hole), which essentially needs the SP to accept BGP advertisements from the Enterprise as part of an agreement.  Upon detection of an attack, the Enterprise, by means of a BGP community attribute, would advertise say a /32 to the SP, and traffic to that destination would then be blocked.  Of course this effectively completes the DOS for that host, but it can be of use for investigation and damage limitation.

A related method is SRTBH (source-based remote-triggered black hole), which again involves BGP interaction, but this time the SP looks at blocking the source of the attack, typically in combination with something like uRPF (unicast Reverse Path Forwarding).  Remember BCP 38?

There are two forms of uRPF – strict mode and loose mode.  In strict mode the packet must be received on the interface that the router would use to forward the return packet. This has the potential to drop legitimate traffic if you have asymmetric routing, for example, i.e. you receive traffic on an interface that is not the router’s choice for forwarding return traffic.

In loose mode, the source address must appear in the routing table, not necessarily the actual interface, and allows default routes.  You can also configure filters to permit or deny certain sources.  There is also an option for the ISP-to-ISP edge (uRPF-VRF) to allow uRPF to query the VRF table containing all routes for a specific eBGP peering session over the interface, verifying the source addresses of packets matching the advertised routes from the peered ISP.
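The strict/loose distinction can be sketched in a few lines of Python; the routing table and interface names below are made up for illustration, and a real router would of course do this lookup against its FIB:

```python
# Toy model of uRPF strict vs loose mode. ROUTES is a hypothetical
# routing table mapping prefix -> egress interface.
import ipaddress

ROUTES = {
    "10.0.0.0/8": "Gi0/0",
    "192.0.2.0/24": "Gi0/1",
}

def route_interface(src_ip):
    """Longest-prefix match: the interface the router would use towards src_ip."""
    addr = ipaddress.ip_address(src_ip)
    best = None
    for prefix, iface in ROUTES.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, iface)
    return best[1] if best else None

def urpf_pass(src_ip, ingress_iface, strict=True):
    iface = route_interface(src_ip)
    if iface is None:
        return False                   # no route back to the source: drop in both modes
    if strict:
        return iface == ingress_iface  # must arrive on the return-path interface
    return True                        # loose mode: any route to the source will do
```

With this table, a packet sourced from 10.1.1.1 arriving on Gi0/1 fails strict mode (the return path is Gi0/0) but passes loose mode, while an unrouted source fails both.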

Not so much DDOS mitigation as a fairly manual or prescriptive dropping / blackholing of traffic really.  I tend to see this as damage limitation.


Of course, if you can identify an attack manually or by notifying your service provider, then you don’t actually have to drop the traffic.  You can redirect it to a scrubbing or cleaning service – e.g. a manual PBR redirect to drop the traffic onto a VPN/VRF – but generally it is better if there is a method to inform and push policy to redirect in real time, based on an attack in progress.

To this end we now consider BGP Flowspec, which is a little like enhanced PBR (Policy Based Routing) for more granular policy decision making and policy distribution across the infrastructure.


BGP Flowspec

If you are going to drop traffic, or at least differentiate between what you think is bad traffic with some granularity of control, then BGP Flowspec becomes a good option. If you know the type of traffic you should not be seeing, you could do worse than Flowspec to both identify traffic and distribute policy to the routers in your network that need to drop it.  Indeed, based on policy, you can also determine which traffic to redirect to a dedicated DDOS scrubbing device or service, where “clean” traffic is returned to the user and service continues despite an attack taking place.

So what does BGP Flowspec do?

Well, you can effectively create a policy to match a particular flow on source AND destination, L4 parameters, and packet specifics such as length, fragment bits etc., and allow for dynamic installation of an action at the border routers.

If this policy is matched then you can perform an action:

  • Drop the traffic
  • Redirect the traffic – e.g. inject it into a different VRF (for analysis)
  • Allow it, but police it at a specific defined rate

Much like rate-limiting policies and QoS, but applied to specific malicious flows that you recognise.

BGP Flowspec basically adds a new NLRI into BGP (AFI=1, SAFI=133).  NLRI is a field in BGP that, at its simplest, is used to identify the prefix for BGP advertisements (literally Network Layer Reachability Information), but as a variable-length field, NLRIs can be used to represent pretty much anything you wish (BGP is being overloaded with all sorts nowadays via this NLRI field – think EVPN etc.).

In this case you add information about a flow as below :

1. Destination IP address
2. Source IP address
3. IP protocol
4. Port
5. Destination port
6. Source port
7. ICMP type
8. ICMP code
9. TCP flags
10. Packet length
11. DSCP
12. Fragment

Now you can match on the above and define which characteristics of traffic are most likely DDOS. Indeed it is typical to have a Netflow feed off to an analysis engine to determine which traffic is DDOS traffic or needs cleaning, and then use BGP Flowspec to instruct the network – in real time – which precise flows to redirect to a cleaning service based on the parameters above (redirect/next-hop modification, DSCP remark, drop or police, VRF leaking).
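The match-then-act behaviour can be sketched as first-match rule processing. The dict-based rule format below is hypothetical shorthand for illustration, not the binary NLRI encoding defined in RFC 5575:

```python
# A minimal sketch of flowspec-style matching: rules pair match criteria
# with an action, and the first matching rule wins.

RULES = [
    # large UDP packets to port 80: police to a defined rate
    ({"proto": 17, "dst_port": 80, "min_len": 1000}, ("police", "10Mbps")),
    # UDP sourced from port 123 (an NTP amplification pattern): drop
    ({"proto": 17, "src_port": 123}, ("drop", None)),
]

def matches(pkt, match):
    """True if the packet satisfies every criterion present in the rule."""
    if "proto" in match and pkt["proto"] != match["proto"]:
        return False
    if "src_port" in match and pkt["src_port"] != match["src_port"]:
        return False
    if "dst_port" in match and pkt["dst_port"] != match["dst_port"]:
        return False
    if "min_len" in match and pkt["length"] < match["min_len"]:
        return False
    return True

def apply_rules(pkt):
    """Return the (action, argument) of the first matching rule."""
    for match, action in RULES:
        if matches(pkt, match):
            return action
    return ("accept", None)
```

A packet arriving from UDP source port 123 hits the drop rule; anything unmatched is accepted as normal traffic.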

BGP Flowspec uses a client-server model, so you can have the analyser as the server, dictating to the client (a router, for example) what to match and what action to take – drop, police or redirect.

Flowspec is therefore a useful tool for policy distribution, drilling into specific actions on specific flows with traffic characteristics, and as a method to inform redirection, rate-limiting or dropping, real-time. Now let’s have a look at some typical types of DDOS attack.

Types of attack

Amplification attacks (DNS, NTP, SSDP, SNMP, CharGEN, QOTD etc.).  The idea here is that the source address of the request is spoofed to be the victim’s, so a large answer is sent to the victim’s address; the attack takes advantage of vulnerable protocols on large servers where, without a full handshake, a small request yields a large response.  DNS is a prime example, where DNS responses can be much larger than the initial request – up to 4096 bytes with EDNS.
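The leverage an attacker gets is just the ratio of response size to request size; the byte counts below are illustrative ballpark figures, not measurements:

```python
# The power of an amplification attack is the ratio of response size to
# request size: the attacker spends the small spoofed request, the victim
# receives the large response.

def amplification_factor(request_bytes, response_bytes):
    return response_bytes / request_bytes

# a ~64-byte DNS query answered with a maximum-size 4096-byte EDNS response
dns_amp = amplification_factor(64, 4096)   # 64x
```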

An example mitigation for this is using rate limiters for traffic and ports that have no business crossing network boundaries – SSDP UDP 1900, NetBIOS UDP 138, NTP UDP 123, CharGEN UDP 19 (character generator stream; not that prevalent, but there have been cases) – plus fragments and large TCP SYN packets.

On some platforms you can be even more granular.  Instead of rate-limiting per class of traffic per interface, you can rate-limit per user (micro-flow policing) e.g. for DNS and NTP – if you see excessive traffic of this type then it is likely an amplification attack. Understanding your traffic patterns as best you can is of course key with these techniques.
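The per-user policing idea can be sketched as a token bucket per source address, so one host exceeding its DNS/NTP budget is limited without affecting others. The rate and burst figures here are illustrative, not recommendations:

```python
# Micro-flow policing sketch: one token bucket per source IP.

class TokenBucket:
    def __init__(self, rate_pps, burst):
        self.rate = rate_pps   # tokens replenished per second
        self.burst = burst     # bucket capacity
        self.tokens = burst
        self.last = 0.0

    def allow(self, now):
        # top up the bucket for the time elapsed, capped at the burst size
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1   # spend one token per packet
            return True
        return False

buckets = {}

def police(src_ip, now, rate_pps=10, burst=5):
    bucket = buckets.setdefault(src_ip, TokenBucket(rate_pps, burst))
    return bucket.allow(now)
```

With a burst of 5, a source firing 6 packets at the same instant gets the sixth dropped, then earns tokens back as time passes.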

At a Service Provider level, many of these amplification attacks can be blocked with router config.  There is no desperate need to send off to a cleaner, as routers can do a quicker job without the redirect if patterns are identified at the edge router with ACLs or BGP Flow-spec (maybe informed by an analyser).  You do however need to be extremely careful and know exactly what you are doing.

At the Enterprise level it is usually a bit late by the time the attack reaches you, so BGP flow-spec or cloud services might suit best.

It also makes sense to implement a bunch of security best practices on your infrastructure itself, e.g. Generalized TTL Security Mechanism (GTSM) for TTL security (RFC 5082 if you are interested), protecting the control plane and management plane on devices through policing and protections, MD5 authentication for your routing protocols, key chaining, using SSH/SFTP/SCP, and of course AAA where possible.

Layer 3/4 stateless volumetric attacks (UDP fragments, ICMP floods) are usually filtered at the edge router of an SP, or by an inline device on premise (a DDOS inline box).  If you are using an on-premise or inline device (e.g. a scrubbing device inline, maybe on a firewall) and the pipe is already hosed before you get a chance on prem, then seeking help from your service provider is a priority.


Stateful volumetric attacks – TCP SYN, HTTP, SSL, SIP (e.g. looking deep into SYN packets for legitimate replies).  For these attacks you need a more intelligent scrubbing device, ideally deployed as close to the edge as possible.  These attacks are typically scrubbed at the PE or in the SP data centre (like an SP partnering with one of the leading DDOS appliance vendors).  This is locally hosted in the provider and gets around the tricky routing – redirection with external BGP and scrubbed traffic returned using GRE tunnels is much easier when it is your own infrastructure and routing.  Another option would be cloud scrubbing of traffic, with clean traffic returned.  If you are tackling this on premise then maybe you can collapse this function into a firewall or DDOS box, as long as the DDOS inspection and mitigation is done first, before other security or traffic controls.

Slow Loris

Finally there are your slow-paced attacks, which can be stopped in the cloud or require an inline solution (Slowloris “slow and low”, HTTP floods, SSL floods, SQL injection, XSS, CSRF, app misuse, brute force, server resource hogs etc.), where an SP doesn’t typically have much visibility of the types of attacks that slowly exhaust the resources on the server.  For example Slowloris, which sends HTTP headers in tiny chunks as slowly as possible, waiting to send the next chunk until just before the server times out the request.  The server is then forced to wait for the headers to arrive – so if you quickly open enough connections, the server struggles to handle legitimate requests.  Or R.U.D.Y. (R U Dead Yet?), which sends HTTP POST requests with an abnormally long ‘content-length’ header field and then starts injecting the form with information one byte-sized packet at a time, every 10 seconds (or at random intervals, in some DDOS-protection evasion techniques); the long content-length header stops the connection being closed and exhausts the connection table.

You should consider enhanced reporting for these attacks.  Netflow is essentially a sampling technology, so in order to spot these types of attacks you might need an inline device. They need some pretty deep inspection of each packet for anomalies, so it is nigh-on impossible to do this effectively at several hundred Gbps.  Keep this in mind when paying for cloud solutions and any premiums.  The rule in general is to do this as close to the resources you would like to protect as possible – hence inline – and to understand the performance limitations.

Types of redirection

DNS redirection to the cloud is one method, and very popular at the moment (everyone loves clouds in IT but, oddly, no-one expects rain).  With DNS redirection, when you resolve to an IP address, you actually resolve to the DDOS protection service address and therefore go through the scrubbing protection before being served.  This can be on-demand or always on.

A second method is BGP-based “inter-AS” DDOS protection, similar to BGP hijacking. Effectively your Service Provider advertises a more-specific /24 block covering your site address (e.g. 1.2.3.4 covered by 1.2.3.0/24), and as this is more specific, routed traffic is magnetically drawn to this advertisement, which in turn redirects it to a scrubbing site first, before clean traffic is returned to you over a GRE tunnel. You need this tunnel of course to prevent a routing loop – otherwise clean traffic sent back to the original address would again get picked up by the /24 and end up back at the scrubber for ever and ever. It is also possible to tie up a /24 permanently to specifically cater for DDOS.  One other method some providers use is to effectively act as an IP proxy – giving away their own public IP addresses, dedicated to you, to obfuscate your own in an “always on” type service (for L3/L4 volumetric).  This is a bit like DNS redirection, but with a new advertised IP that belongs to the cloud.  When you get the traffic back you need to NAT etc. for your own range.  The caveat here is that it is always on, and all your traffic goes through the cloud provider first.

Typically the tunnel is what you are paying for from the cloud provider.  Remember, if you are doing a /24 BGP advertisement, you are essentially doing BGP hijacking. BGP origin validation could make this approach more difficult in future.

There are various flavours here: your edge router can act as a detector, sending sFlow or Netflow to your cloud provider, who, upon detection of DDOS, takes over the routing via BGP advertisement to redirect traffic to the cloud DDOS provider for the duration of the attack – an on-demand service (with GRE back to the edge router for scrubbed traffic) – or a permanent IP advertisement so everything goes to the cloud by default – an always-on service.

Service providers can of course bypass this and provide a hosted DDOS solution, where all the redirection happens internal to the service provider in a local SP Data Centre where they scrub the traffic for your Internet connectivity.  As above, this can be a premium “always on” service, or a service that kicks in under attack notification either automatically based on suspicious detected traffic or on request from the customer.

Finally let’s have a brief look at places in the network

Places in the network

Once you understand the type of attack you are protecting, then you can look at which service and place in the network is most effective for each, and whether you have covered what you need.  To be honest, whenever most people talk about DDOS they jump to volumetric – my pipe gets filled, re-direct and clean please.

To summarise some of what I have mentioned above, you have a bunch of options:

  • “In the cloud” services from a DDOS vendor, where traffic is redirected either permanently or upon detection of an attack, cleaned, and returned (your method of redirection is either DNS-based or BGP inter-AS based).
  • An ISP-hosted DDOS service: your SP can redirect you to a cloud vendor DDOS service, or stand up their own detectors and scrubbing service in their own Data Centres to monitor traffic at their Internet peering points and potentially provide a service back to a customer – like a “Clean Pipes” solution.
  • On-prem (centralised, distributed, mixed, inline).

Finally, you can of course mix and match these approaches to cover as many conceivable denial-of-service attacks as possible, at a latency that suits your applications, even down to deep packet inspection.  I guess there is no one-size-fits-all.  One thing to note: if you use a cloud-provided DDOS service, then to protect web sites/L7 a proxy-based DNS service usually works for most things, with L3/L4 attack prevention on demand should an attack circumvent the DNS and head straight for the real IP.  In truth, multi-terabit, deep-L7, slow-and-low cleaning at any kind of reasonable latency and cost doesn’t exist today, which is why you protect per asset or web address, or as a percentage of traffic.

Ultimately, I expect DDOS protection in future to be a given as part of any Internet service consumed (it mostly already is), and premiums for this kind of service will likely come down, depending on the volume of attacks, complexity and scale.  Cloudflare, for example, have just announced that DDOS protection is bundled at any scale within their Internet services as part of the usual rate.

New vectors tend to mean new solutions to problems so I would expect this to be charged for, wherever significant investment has been needed to solve the problem.

So there you have it, a brief tour of DDOS approaches being used in the market today. Implement the above, scrub yourself down, redirect yourself to the nearest good cup of coffee, and relax knowing your traffic flows gloriously uninterrupted…for now.

Service resumed





A few basic dB wireless tips

If you have done a lot of wireless, the below is bread and butter and appears in many textbooks in various guises, but for those that need a quick summary, or anyone looking for a quick “in” to further reading and practice, the below might clear a few things up.

First Logarithms

Why? (Trust me here, this is brief and worthwhile.)  Well, logarithms are useful for representing the ratio between two measurable values, and we use ratios all the time in wireless.  This is because it is so much easier than using real numbers with many decimal places when looking at signal strength, gain and power levels.  So what are they?

Logarithms are actually fairly easy at a basic level.

For example –   how many 2s do I need to multiply to get 8?

2 x 2 x 2 = 8

So I had to multiply three 2s to get 8, so the log is 3 – and the way you write this is below:

log₂(8) = 3

The base is the little 2, so log base 2 of 8 = 3.  We are essentially using three numbers here:  the number we are multiplying (2 in this case), how many times it is multiplied (3 in this case), and the number we want to end up with (8 in this example).

Ok, one more just so we are clear.

Work out the following:      log₆(1296)

so 6 x 6 x 6 x 6 = 1296, so we need four of the 6s multiplied to get 1296, so the answer is:

log₆(1296) = 4

Incidentally this also tells you the exponent – so 6 to the power 4 = 1296.
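The two worked examples above can be checked with Python's math module:

```python
import math

print(math.log2(8))        # 3.0 -- three 2s multiply to 8
print(math.log(1296, 6))   # ~4.0 -- four 6s multiply to 1296
print(6 ** 4)              # 1296 -- the log is also the exponent
```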

All well and good and a refreshing trip down memory lane but what about wireless?

Well log base 10 is used a lot in wireless, particularly when it comes to dB.

What is a dB (decibel)? Well, it is a ratio, and what is good at representing ratios? Logarithms of course.  The decibel is actually a unit of measure that came out of Bell Telephone Labs, where it was very useful for quantifying the attenuation of an audio-frequency signal down a mile-long telephone cable.

The decibel (dB) therefore is a good way to express the ratio of two power levels.  A couple of equations are coming up, nothing too difficult, but hold on for the result as that is where the trick comes in to impress your friends (most people have friends who are impressed by RF conversations right?).

When you express a power ratio in decibels, it is 10 times the base-10 logarithm of the ratio.  What on earth does that mean?

We are trying to find a ratio, so for 1 Bel (B)

RatioB = log10(P1 / P0)

To take this a step further 1 Bel is 10 decibels, so that is where you get your 10 x from. Therefore for dB:

RatiodB = 10 x log10(P1 / P0)

So this gives you a handy way to work out a ratio between two real power values, and in the wireless world you would typically use this when looking at Gain (how well an antenna converts input power into radio waves headed in a specified direction.)

Let’s have a look where this is useful.

GdB = 10 log10(P2 / P1)

P2 is the power level.

P1 is the referenced power level.

GdB is the power ratio or gain in dB.

So for the gain in dB for a system with input power of 10W and output power of 20W then

GdB = 10 log10(Poutput/Pinput) = 10 log10(20W/10W) = 3.01dB

Now remember that figure 3.01dB

The first of two decibel values an RF engineer has etched into their head is 3dB because, as you have just seen, this is a ratio of 2 (yes, we know it is actually 3.01dB, but that is close enough for RF design) – 20/10 is 2 – voila.

So this is great for working out power ratios in your head.  If you know the power level has doubled then you have a 3dB gain; if the signal level is four times higher at the output than the input you have a 6dB gain.  Equally, if you have -3dB then the ratio is 1/2, or a half.

The second figure to have hardwired into your brain is 10dB.  Remember dB is a ratio, and 10dB is handily a ratio of 10 🙂  Equally -10dB is 1/10 or 1 tenth.

For example, if the signal level at the output is 10 times higher than the input then you have a 10 times ratio (i.e. 10W input and 100W at output, a 10x gain) which is 10dB.

Even more usefully you can now combine the two.  Say you have an amplifier with a gain ratio of 20 ( 20 times or 10 x 2), then the gain value is 10 + 3 which is 13dB.  (3dB is a 2 x ratio).
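The mental-math rules above fall straight out of the gain formula; a quick check in Python:

```python
import math

# dB from a power ratio, as in the gain formula earlier
def ratio_db(p_out, p_in):
    return 10 * math.log10(p_out / p_in)

print(ratio_db(20, 10))    # ~3.01 dB: doubling the power
print(ratio_db(100, 10))   # 10.0 dB: ten times the power
print(ratio_db(200, 10))   # ~13.01 dB: a 20x ratio = 10x + 2x = 10 dB + 3 dB
```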

Got it?

So now say I want to calculate the power ratio of a given value….

P2 is equal to the reference power P1 times 10 raised to the power of the gain in GdB divided by 10.

P2 = P1 × 10^(GdB / 10)

P2 is the power level.

P1 is the referenced power level.

GdB is the power ratio or gain in dB.

log table

Basically a positive gain means there is more power at the output than the input and a negative gain means less power at the output than the input. Now consider a 40dB gain, well that is 10000 times more power at the output than the input, whereas a 20 dB negative gain is 100 times less power at the output than the input.  You can see now where these factors of 10 and logarithms can be useful for quick calculations.
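Going the other way, the inverse formula recovers a power level from a reference power and a gain in dB; the figures below match the 40 dB and negative-20 dB examples:

```python
# power level from a reference power and a gain in dB
def power_from_db(p_ref, gain_db):
    return p_ref * 10 ** (gain_db / 10)

print(power_from_db(10, 3))     # ~19.95 W: a 3 dB gain on 10 W roughly doubles it
print(power_from_db(1, 40))     # 10000.0: 40 dB is 10^4 times the power
print(power_from_db(100, -20))  # ~1.0: a -20 dB gain is 100 times less power
```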


So we have established that dB is a ratio, but what about dBm?  Well, this is also a ratio, but a ratio to a real value, i.e. the ratio of a power level relative to 1mW (1 thousandth of a Watt – 0.001W).  dBm therefore is a way to express absolute power.  10 dBm then is 10mW, or 10 x 1mW.   20 dBm is 100mW.  Remember the factor of 10 earlier?  So 20dBm references 1mW x 10 x 10 = 100mW.

Finally, you need to be careful when expressing the difference between two power levels: 20 dBm – 10 dBm = 10 dB, and not dBm, because here we are again expressing a ratio between two values, which is exactly what decibels do.
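The dBm conversions are the same power_from_db arithmetic with 1mW as the fixed reference:

```python
import math

# dBm is absolute power: a dB ratio referenced to 1 mW
def dbm_to_mw(dbm):
    return 10 ** (dbm / 10)

def mw_to_dbm(mw):
    return 10 * math.log10(mw)

print(dbm_to_mw(10))                   # 10.0 mW
print(dbm_to_mw(20))                   # 100.0 mW
print(mw_to_dbm(100) - mw_to_dbm(10))  # 10.0 -- a difference of dBm values is in dB
```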

So why use ratios and logarithms at all?  Well, look at the table below of typical 802.11 power levels and signal strength.  At -90dBm you are talking 0.000000001 mW. Calculations based on a ratio or logarithm are much easier to compare than saying “the signal strength is 0.0000something as opposed to 0.00000something, so we know the power has decreased by…? and the negative gain is…?”.  In essence it gives meaning to the low power levels involved in wireless communication and makes working out real-world designs a whole lot easier.

dBm to mW

So there you have it a few handy basic wireless tips.

A couple more tips to keep up your sleeve: 5dB is a ratio of roughly 3, and -5dB is 1/3, or a third.  Not exact of course, but close enough to work things out.


dBi is another measure around antennas that is handy to keep in mind, as this is the gain with reference to an antenna that transmits evenly around a 360 degree sphere. This is the perfect antenna which radiates power in all directions equally.  Of course real antennas like this do not exist, but it can be a useful reference point when looking at antenna gain in relation to the theoretical perfect antenna.

Isotropic radiation pattern in free space

One of the most common antennas that you will come across is the dipole antenna, or you may have heard it called a rubber duck antenna.  This has a doughnut-shaped radiation pattern (toroidal).  It is handy to know that this has a gain of 2.15dB over an isotropic antenna, so if you had a different type of antenna with a gain of 5dBi, you now know this is 2.85dB of gain higher than a common dipole – 2.15dB + 2.85dB (sounds a bit like we are bird-watching now).

Dipole antenna radiation pattern

And what is 5dB again?  Well it is a ratio, and coincidentally ratio of around 3 times the power gain over an isotropic antenna. See, I told you knowing the 5dB ratio was handy.

So remember: 3dB, 10dB (maybe 5dB) and 2.15dBi, and you can work things out like a wireless design bod in no time at all :-)

Hopefully these very basic wireless tips will send you on your way into the magical world of RF with a little less confusion when dB is thrown around like confetti.

EBITDA, Cashflow and Service Providers

It might seem odd to see a piece about cash flow in a technical blog, but looking at EBITDA and cash flow got me thinking about whether the new wave of service provision (NFV, SDN, SDO, SD-WAN etc.) has any impact on the traditional ways of reporting for Service Providers currently based on high up-front costs for future service and revenue.

For many years EBITDA has been a preferred reporting indicator of Telecommunications Providers, and in turn, Service Providers.  Execs knew the score – grow revenue, sort out your EBITDA, and get good Enterprise Value Multiples.  Enterprise Value is how much it would cost to buy a company’s business free of its debts and liabilities.  Enterprise Value Multiple is your Enterprise Value (EV) divided by your EBITDA earnings number.

But with automatic rapid growth for telecoms providers on hold for now, and the fact that even though companies may be hitting market expectations, they are not always seeing the Enterprise Value they would like in the market, the EV multiples are simply not there.

Based on that, I thought it might be interesting to look at what EBITDA is, why the preference exists, and whether changing cycles of investment with SDO/SDN/NFV/Agile change anything at all.


First up – What is EBITDA?


EBITDA – Earnings Before Interest, Taxes, Depreciation and Amortisation.

The way to calculate this is:

EBITDA = Income before taxes + Interest Expense + Depreciation + Amortisation

So basically you add the deductions (Interest Expense, Taxes, Depreciation and Amortisation) to your net income.
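The add-backs in the formula are simple enough to show directly; the figures below are hypothetical (in £m):

```python
# EBITDA = income before taxes plus the add-backs, per the formula above
def ebitda(income_before_taxes, interest_expense, depreciation, amortisation):
    return income_before_taxes + interest_expense + depreciation + amortisation

print(ebitda(100, 10, 15, 5))   # 130
```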

Let’s break this down a little.  Before the 1980s, EBIT (Earnings Before Interest and Tax) was a key indicator of a company’s ability to service its debt.  With the advent of Leveraged Buyouts (LBOs), EBITDA came to be used as a tool to measure a company’s cash.  An LBO is essentially a way of acquiring another company with a large amount of borrowed money.  The assets of the acquiring company, as well as the assets of the company being acquired, are used as collateral for the loans.  This enables acquisitions without committing lots of capital.

Some conflate operating income (or operating cash flow) with EBIT, but these are not the same: EBIT makes adjustments for items not accounted for in operating income and is thus non-GAAP (outside Generally Accepted Accounting Principles).  Briefly, operating income is gross income minus operating expenses, depreciation and amortisation, and, like EBIT, does not include taxes and interest expenses.


Let’s build this up from Cash Flow and Net Income.

Net income and Operating Cash Flow may seem very similar but they are not really the same.  Operating cash flow, or raw cash flow in general, might need converting through accrual accounting to get a more realistic measure of income, namely net income. Accrual accounting?  Very briefly, this recognises an expense when it is incurred, not when it is paid (think of buying something one month but having 60 days before you need to pay it off – the expense was really incurred the moment you bought it).  As a result, businesses can list credit and cash sales in the same reporting period the sales were made.

Cash flow or Operating Cash Flow (OCF) = Revenue minus Operating Expenses

Ideally you want your Operating Cash Flow to be higher than your operating expenses or you might not be in business for too long.  Certainly an important measure.

Net income = Gross Revenue minus Expenses.

Net income reflects the profit that is left after accounting for ALL expenditures, every pound/dollar the company earns minus every pound/dollar the company spends.

As I mentioned Cash Flow and Net Income are not the same, with Cash Flow showing not only how much a company earned and how much it spent, but when the cash actually changed hands.  This difference is significant and an illustrative example is below:

Say you sign $20 million worth of contracts in a year, complete work on $10 million of them, and collect $8 million of that work in cash during the year.  But say you have also paid out $6 million in equipment; your raw cash flow would be $2 million.  Net income, however, might look significantly different.

So say you have provided economic value to your customers (revenue) of $15 million of the contracts (you have completed $10 million of the contracts and are 50% of the way through the remaining $10 million, so another $5 million of economic value).  Of the $6 million of equipment purchased you consume only a third this year, as the equipment is estimated to last 3 years: $6 million divided over 3 years is a $2 million expense.  Your net income for the year would therefore be $15 million minus $2 million in expenses, i.e. $13 million.

$13 million is a very different number from the raw cash flow value of $2 million, and perhaps a better indication of the operating performance of the company.
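To make the arithmetic above concrete, here is a quick sketch of the two views of the same year. The figures come straight from the example; the variable names are just illustrative.

```python
# Worked example from the text: raw (cash-basis) view vs accrual (net income)
# view of the same year. All figures in $ millions; names are illustrative.

contracts_signed = 20.0
cash_collected = 8.0         # cash actually received in the year
equipment_paid = 6.0         # cash paid out for equipment

# Raw cash flow: only cash that changed hands counts.
raw_cash_flow = cash_collected - equipment_paid       # 8 - 6 = 2

# Accrual view: recognise economic value delivered and expenses incurred.
value_delivered = 10.0 + 0.5 * 10.0                   # $10m complete + 50% of the rest
equipment_life_years = 3
depreciation = equipment_paid / equipment_life_years  # 6 / 3 = 2 per year

net_income = value_delivered - depreciation           # 15 - 2 = 13

print(raw_cash_flow)  # 2.0
print(net_income)     # 13.0
```

The same year looks like $2 million or $13 million depending purely on when cash changed hands versus when value was delivered, which is the whole point of the accrual adjustment.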

Earnings and cash therefore are not the same thing, but that is not to say that operating cash flow is not important.  Companies must still pay interest and taxes, and that cash has to come from somewhere.  Using this to look at Free Cash Flow (cash flow after all capital expenditure), and looking at working capital to see whether a company can service its short-term debts, is useful.  A negative working capital might suggest a company will struggle with its short-term debts.

Working Capital = Current Assets – Current Liabilities

Current Assets – assets that can be converted to cash in less than a year.

Current Liabilities – liabilities that must be paid within the year.
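The formula above is trivial to check; a minimal sketch with made-up figures (in $ millions):

```python
# Working capital as described above: current assets minus current liabilities.
# A negative number suggests trouble covering short-term debts.
# Figures are invented for illustration.

def working_capital(current_assets: float, current_liabilities: float) -> float:
    return current_assets - current_liabilities

print(working_capital(5.0, 3.5))   # 1.5  -> can cover short-term debts
print(working_capital(2.0, 3.5))   # -1.5 -> potential short-term stress
```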


Now that we have had a quick look at why Net Income might be a preferable way to get a read of a company’s operating performance over and above raw cash flow, is there a way to calculate this that gives us a more standard view of Net Income, without all the financing decisions and conditions that are peculiar to each company?  In other words, can we find a better method of comparison?

Well EBITDA adds back (or deducts from the calculation the expenses associated with) interest, taxes, depreciation and amortization.  This therefore removes the effects of financing and accounting decisions.

The idea being that by ignoring expenses like interest, taxes, depreciation and amortization you strip away the costs that aren’t directly related to the core operations of a company.  The proposition is that what you are left with (EBITDA), is a purer measure of a company’s ability to make money.

Let’s take a brief look at what these costs are to reach EBITDA:

Interest expense – the interest payable on any borrowings such as bonds, loans, convertible debt or lines of credit (interest rate × outstanding principal amount of debt). The outstanding amount is sometimes referred to as just the principal.

Taxes – generally refers to income taxes, i.e. those levied by a state or country.  Business taxes (property, payroll taxes etc.) are considered operating expenses and are therefore not factored into EBITDA.

Depreciation – in this sense, the reduction of the value of an asset over time.  Tangible assets cover both fixed assets (land, buildings, machinery) and current assets (inventory) – basically things that can be physically touched or damaged.  Depreciation is a way of giving a cost to a tangible asset over its useful life, or of expressing how much of an asset’s value has been used up over time.

You also have something called intangible assets, which are things you can’t really touch, like copyrights, patents, brand recognition etc. (as opposed to equipment, machinery, stocks, bonds and cash for example).  Goodwill is also a part of this – a solid customer base, reputation, employee relations, patents and proprietary technology – an amount you are willing to pay over the book value of the company (the value of the assets the shareholders would theoretically receive if the company were liquidated).

Amortisation – deals with these intangible assets.  It covers the paying off of debt, such as a loan or mortgage, with a fixed regular repayment schedule over a time period, and also the spreading out of the capital expense of an intangible asset over the asset’s useful life, i.e. a fixed period.  Say you spend $10 million on a patent with a useful life of 10 years; over those 10 years you can spread the cost as a $1 million a year amortisation expense.
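Pulling the definitions above together, here is a quick sketch of the EBITDA add-back. Apart from the $10m patent amortised over 10 years from the example just given, all figures are invented for illustration.

```python
# EBITDA sketched from the definitions above: start from net income and add
# back interest, taxes, depreciation and amortisation.
# All figures in $ millions; only the patent example comes from the text,
# the rest are made up for illustration.

net_income = 13.0
interest = 1.2            # interest on outstanding borrowings
taxes = 2.0               # income taxes
depreciation = 2.0        # tangible assets spread over their useful life
amortisation = 10.0 / 10  # $10m patent spread over a 10-year useful life

ebitda = net_income + interest + taxes + depreciation + amortisation
print(round(ebitda, 1))  # 19.2
```

Notice the add-back only strips out financing and accounting effects; the cash needed to actually replace equipment or fund working capital is still gone, which is the criticism raised below.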


So EBITDA is not really a good measure of cash flow, but it can be a reasonable measure of a company’s profitability or net income.  However, it does leave out the cash required to fund working capital and the replacement of old equipment, which can be significant.

EBITDA is particularly useful for new companies, or those trying to attack a large new market, where depreciation and amortisation spread out the expenses of large capital investments, which can be considerable.

Conversely EBITDA has a bit of a bad rep as it has been used by a bunch of dangerously leveraged companies in boom times, but that is not to say it is all bad by any means.

There are many pros and cons to EBITDA, but most are out of scope of what I am trying to convey: namely that telecoms providers tend to like it as a reporting method, since they have a large initial infrastructure outlay and get their return on investment over time through telecoms or service provision.  They are also not seeing the Enterprise Value (EV) multiples they would like, despite a focus on market expectations around EBITDA (which doesn’t penalise a company in the market for having to invest in longer-term infrastructure, providing earnings are going in the right direction).


As with all accounting, it is useful to look under the covers when you get the chance, to get a better read.  In Service Providers or telecommunications companies, where there is significant expense in building out the network, buying frequency, and installing physical towers or radios, it makes a good deal of sense to judge them more fairly over a longer term, as repeat income will come some time after the initial investment.

In the boom years, however, rapid growth was a sure-fire way of getting great EV multiples.  But with growth in the industry now at GDP levels and below, fixed lines in decline, mobile revenues tailing off, and no business case in the pure transport of bits, an EBITDA focus does not automatically equate to investor or customer value.

In essence, it gets harder and harder to hide behind growth.  Knee-jerk reactions from execs can zone in on cash and EBITDA while they under-invest in the very things that provide economic value to customers (your revenue as a provider), i.e. better products and services that facilitate ROI and lead to better capital returns.  Essentially the focus needs to be on creating value and not just controlling costs.


So my question is, with SDN, NFV, Software Defined Orchestration, Agile product development etc. can you facilitate some of the above?  Are the providers who are simultaneously adopting flexible automation and orchestration to roll out new services faster (services chained to customers’ business logic), with greater visibility and with better quality, likely to be the ones who get the EV multiples they hope for?


“What is that huge fiery technology shift I see?  Should we prepare?”  “Don’t worry, it’s miles away… stop thinking, keep cutting costs and we’ll be fine!”

As customers start to play providers off against each other with shorter term contracts to drive down cost and improve service (as with SD-WAN), does flexibility to respond with new services and efficient operations become more important than high sunk costs, long contracts, eventual gain, and a relentless focus on internal cost cutting with traditional technology to justify business as usual?

Is this all doubly worrying with the reduction of some SPs to mere local cloud access providers (just give me a pipe to the nearest cloud provider thanks)?  Is it coincidence that the big cloud providers are the ones physically laying fat cables to connect their services nowadays?

Reducing operating costs and quick iterative roll-outs with new techniques and technology certainly seem to appeal to the largest and more innovative SPs.  Using the latest methods to hold vendors to account and change traditional network operating practices (e.g. vCPE), while simultaneously recognising they can provide new services to customers with more flexibility, seems primed for technology shifts like 5G.  Will all of this lead to a renewed focus on ROI rather than EBITDA growth or Capex/Sales? At the moment EBITDA growth and Capex/Sales don’t appear to be moving the EV multiple needle.

Are those who are not adopting new ways destined for extinction?

Of course EBITDA isn’t going to be ditched by telecoms providers anytime soon, but maybe a softening of the relentless internal focus on one reporting measure will lead to better value for customers, and in return better results for those providers focusing on customer value.  You never know, internal business cases might actually start to incorporate vision.

Architecture Reflections



If you follow IT news and commentary around Architectures at the moment it might seem that Enterprise Architecture is in a bit of a quandary. On the one hand there seems broad agreement that to drive IT advantages in line with business needs, at speed, a sound architectural base is needed. What is also needed is efficient processes to drive this architecture.

On the other hand there are articles talking about Enterprise Architecture being broken, or of business irrelevance of architectural frameworks as we know them today; a view of self-referential documentation which only the architecture community ever really read, while others “haven’t got time for this, there is a business to run and improve”.

As reliance on IT increases (it certainly isn’t going away), and while new models of IT consumption and service delivery are morphed all the time, I believe architecture will become more important, not less. The problem, as with so many things, is one of communication. The challenge is to communicate relevance to business stakeholders and show how we are improving or introducing new and better business relevant services, and solving business problems through architecture. There is a need to create and communicate the base architecture to achieve this.

Although the aim of this post is not to compare and contrast the different frameworks and methodologies, a brief mention of the options does highlight some of the problems we face.

We have a LOT of choice around governance, frameworks, services and methodologies (COBIT, Zachman, TOGAF, ITIL, PRINCE2, PMBOK, CMMI, MSP, FEA, DoDAF, ISO 20000, ISO 27000, Agile, Six Sigma/DMAIC, and Lean to name but a few). Of course you will notice that not all of those mentioned are directly comparable (they overlap, and also address slightly different concerns), but it does illustrate that the landscape is far from simple.

As an aside, while it is not quite accurate and too simplistic to view COBIT as the “why”, ITIL as the “how”, or with other frameworks the “how” and “what”, it can serve as a useful starting point in questioning what you can get out of each, and mapping the overlapping functions in getting them to work together.

To further illustrate the complexity let’s briefly go back to 1987 and the Zachman framework with its 4 domains and numerous sub-domains as an initial point of reference.

1) Process Domain – Business Context Engines, Planning Engines, Visualisation Engine, Business tools

2) Information/Knowledge Domain – Business Data, Business Profiles, Business Models, Data Models

3) Infrastructure Domain – Computers, Operating Systems, Display Devices, Networks

4) Organisation Domain – People, Roles, Organisational Structures, Alliances

Domains have been added in other frameworks and, as you can see, this isn’t getting any simpler even if the constructs are useful. Even a single domain can have mind boggling degrees of complexity.

If I take a component of the Infrastructure domain with which I am familiar (Network), there is a vast array of technology to architect around, each with its functional and deep technical specialists. From Software Defined Networking, control plane emulation and policy creation, to layer 3 identity separation (LISP), OTV for layer 2 over layer 3 tunneling, FabricPath and TRILL for layer 2 routing (MAC in MAC, MAC in UDP – no more spanning tree), and VXLAN (MAC in UDP) for extension of layer 2 segments over shared physical infrastructure, to name just a few recent headlining technologies. And this is just one small part of one sub-domain.

You will, of course, have spotted an error in the complexity I have just outlined. A good base architecture will not have to architect around each new technology, but identify solutions to fit seamlessly into the architecture as they solve a business problem, enable a service, or support business functions. This is why architecture is there in the first place.

So we ask ourselves what business problem are we solving, what service are we enabling, or function we are supporting? For example with SDN are you gaining greater flexibility? more open platform support? better visibility? better policy control? more customisation? lower costs? better security? reduced risk? and does this let you roll out new services more quickly and robustly to serve the business? Or with some of the other technologies, are you able to move workloads faster regardless of location with less operational overhead and cost, or spin up services more quickly, reliably, cheaper? How does it aid mobility? Public / private cloud? Security? Once you ask, and indeed have your own answers to such questions, the technology seems to slot naturally into a base architecture.

Given the complexities, how do we get everyone on the same page?

We could just throw around nebulous buzz phrases like “business outcomes” and hope everyone nods in agreement, but a more practical method might yield better results.

This is not to say that none of this is covered in the Architecture Vision or Business Architecture phases of TOGAF, for example, but it is all too easy to slip into the technicalities of the process, or to have to explain the entire process, in order to get everyone on board with this piece, let alone contribute meaningfully.  This can often be a challenge.

One practical suggestion is briefly outlined below.

As all of the above frameworks, methodologies, processes etc. were (to a greater or lesser degree) born out of the Deming cycle (Plan, Do, Check, Act), it does allow common ground to be established and serve as a foundation of getting all stakeholders on the same page. We can use this to simplify communication and create a common understanding.

The aim is to get business stakeholders involved as early in the process as possible, to build understanding, and to avoid the redundancy and time wasted on erroneous requirements.

If you allow the value stream to “pull” the process and architect with this in mind, it can really help in making architectures business relevant. By this I mean viewing the process from the perspective of customer value, business value, and demand, then working backwards to eliminate waste and improve service quality. As obvious as this sounds, it is rarely done effectively.

This brings us to the option of a process/tool that can literally get everyone on the same page: the Lean A3 process/tool.

With the A3 report everything is discussed and agreed upon in one A3 sized document, which is highly visible, has agreed reference data, and follows a simple common sense process. As this process revolves loosely around the initial Deming cycle it has instant familiarity with architects, developers, designers, manufacturers and business process professionals across the board. The idea here is to get everyone to agree on the problem being solved, the service offered, or the function supported. This in turn enables a more seamless flow into the base architecture.

Although the above might indeed sound like “common sense”, and increasingly there is a reliance on this quality in architects, by formalising and standardising this common sense in one place, with common agreed data in a concise format, and with stakeholder contribution and understanding, we can provide a solid base for the detailed architecture to really achieve what it sets out to do. It also makes it easier for anyone new to initially understand the architecture and contribute meaningfully, without wading through reams of framework documentation for weeks on end. As they say, “put talented people into average processes and you often get poor results, but put average people into great processes and you often get excellent results”.

Like anything, the A3 process/tool takes practice, (it is not something you write individually and present), but the idea of having a one-page reference that everyone has contributed to and agreed upon, can be a very powerful way of getting different functions to work together and most importantly understand why things are happening the way they are. Does it have to be the A3 process/tool? Of course not, but it does seem to be a useful reference or starting point.

Another advantage is that the components of the A3 process can quite easily be mapped to individual architecture phases in other frameworks such as TOGAF.

IT organisations will be increasingly measured by their alignment with the business; by speed and flexibility, productivity and growth, with security and risk mitigation embedded at every level. Combined with this, the idea of service managers running an IT service as a function of the business and measured as such, will be a powerful one. For me, this only puts greater emphasis on making sure everyone is referring to the same thing to avoid costly misunderstanding.

Through process/tools such as A3, allowing architectures to be pulled from the value stream, making things as simple and visible as possible, and having stakeholders embedded in the process as early as possible, maybe we can cut through some of the communication issues commonly associated with architecture relevance of late.

Some examples of the A3 process/tools can be found below:

Explanation of an A3 example – pdf

A3 example templates can be found here

Think before you leap

What Do You Know About Why You Are Doing This A3?

Is the A3 a tool, process or both?


There are several formal definitions of terminology within the various frameworks. I try, where possible, to ground myself in standard English definitions of the terminology firstly to remind myself of what I am trying to achieve, and secondly to gain common understanding. Some of these basic dictionary definitions are included below:

TOGAF defines architecture as

  1. A formal description of a system, or a detailed plan of the system at a component level to guide its implementation
  2. The structure of components, their inter-relationships, and the principles and guidelines governing their design and evolution over time

Dictionary definitions

Architecture – the complex or carefully designed structure of something

Architect – “a person responsible for inventing or realizing a particular idea or project”, from the Greek arkhitekton meaning “chief builder”

Framework – a skeletal structure designed to support or enclose something – a frame or structure composed of parts fitted together, or a set of assumptions, concepts, values, and practices that constitutes a way of viewing reality


Adventures in IWAN – Part 2 Intelligent Path Control

Intelligent Path Control

Broadly speaking, Intelligent Path Control monitors traffic live across multiple transport links and makes a next-hop decision on the fly, to make sure the traffic you define in policy always chooses the best path automagically.  Application-aware routing, really.

If your expensive backup links are underutilised, or you want to take advantage of multiple WAN transports and do any kind of load-sharing, then traditionally you would be using all your complex BGP tricks at the routing level; and if you wanted to do any kind of application monitoring to connect the two, there would be some manual Netflow checking, SLA probes etc.  The entire SD-WAN market was spawned, in part, to solve this problem.

Intelligent Path Control looks to remedy this by automatically routing traffic per class based on real-time performance of links to the most optimal transport on the fly.  This is useful, as routing protocols in the main are blissfully unaware of brownouts, soft errors or intermittent short lived flapping etc.

In a nutshell –  Intelligent Path Control is intelligent routing based on performance.

This “Performance Routing (PfR)” is what enables Intelligent Path Control, so here we can end the circuitous marketing and talk about PfR from this point on.

Under the covers, PfR consists of Cisco’s SAF (Service Advertisement Framework), monitoring mechanisms, NBAR2 and Netflow v9.  In PfRv3 it influences forwarding decisions not by altering the routing table in the main, but through dynamic route maps or Policy Based Routing to change the next hop.  It can also enable some prefix control by injecting BGP or static routes.

Before PfR, in a Cisco environment you would rely on manually using a bunch of scripts, static routes, PBR etc. to get anything like intelligent path control, and this would be a long way from dynamic or automatic. Then you would somehow be trying to stitch this together with some monitoring maybe using Netflow, or your IPSLA probes.  All very manual, all very labour intensive.

As with DMVPN, PfR is made up of a number of components, these are below and I will cover each one in turn to get an understanding of how this solution all fits together.

  • Performance Routing
  • PfR IWAN components – Controllers and Border Routers
  • Monitoring and Performance Routing – NBAR2, Netflow
  • Service Advertisement Framework (SAF)
  • Paths
  • Prefixes
  • Smart Probes (optimisation, Zero SLA)
  • Thresholds
  • Steps to set up PfR – Traffic Classes and Channels
  • Routing
  • Transit site Preference
  • Backup Master Controller
  • Controlled and uncontrolled traffic
  • Path of Last Resort
  • VRFs

Performance Routing (PfR)

PfR (Performance Routing) initially came out of OER (Optimised Edge Routing), and version 1 appeared in 2000 with prefix route optimisation.

PfRv2, with application path optimisation, then came in 2007.

PfRv3 is the latest version with new functionality, evolving a well-established Cisco IOS feature that has been around for over 10 years.

Essentially PfR monitors network performance and makes routing decisions based on policies for application performance.  It load-balances traffic based on link utilisation to use all available WAN links for the best performance per application.

PfR is really a passive engine for monitoring and gives you a superset of Netflow monitoring with around 40+ measurements.

  1. First you define your policies – there are two ways here: either by DSCP or by application.  If you use application, you enable NBAR2.  There are some very handy defaults here.
  2. Then you learn your traffic.
  3. Once the traffic is learned, the next step is to measure the traffic flow and network performance and report this to a Master Controller.
  4. Finally you choose your path.  Performance Routing, via the Master Controller, will change the next hop based on what you have learned about the traffic and how you have set up your policies for each traffic class and link.
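The define/learn/measure/choose flow can be caricatured in a few lines. This is a toy illustration of the idea, not Cisco's implementation: the class names, thresholds and path names are all assumptions.

```python
# Toy sketch of policy-driven path choice (illustrative only, not PfR code).
# Policy: per traffic class, maximum tolerated loss (%) and delay (ms),
# plus an ordered path preference.
POLICY = {
    "EF":      {"max_loss": 1.0, "max_delay": 150, "paths": ["MPLS", "INET1"]},
    "DEFAULT": {"max_loss": 5.0, "max_delay": 400, "paths": ["INET1", "MPLS"]},
}

# Measurements a monitor might report per path.
measured = {
    "MPLS":  {"loss": 2.5, "delay": 80},   # brownout: loss too high for EF
    "INET1": {"loss": 0.2, "delay": 120},
}

def choose_path(traffic_class: str) -> str:
    """Pick the first preferred path that is within policy for this class."""
    policy = POLICY[traffic_class]
    for path in policy["paths"]:
        m = measured[path]
        if m["loss"] <= policy["max_loss"] and m["delay"] <= policy["max_delay"]:
            return path
    return policy["paths"][0]  # nothing within policy: keep the preferred path

print(choose_path("EF"))       # INET1 (MPLS preferred, but out of policy)
print(choose_path("DEFAULT"))  # INET1
```

Note that a routing protocol alone would happily keep EF traffic on the degraded MPLS path; the per-class measurement is what lets the decision react to a brownout.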

PfR can automatically discover the branch once connectivity is established.  The CPE/branch router (BR) is discovered and automatically configured – it auto-starts.  The whole idea was to make the PfR configuration as simple and light-touch as possible.

PfR IWAN components

A device (Cisco ISR/ASR router, virtual or physical) can play one of four roles in PfRv3.  It is important to outline these, as the HQ Border Routers are used simultaneously as DMVPN hubs and PfR BRs.  By separating out the functionality you know which design you are talking about: the transport overlay, or performance routing.

Let’s now look at the components involved in PfR.

Master Controller

The Master Controller(MC) is the decision maker and the Border Router(BR) is the forwarding path (roughly analogous to Controllers in other vendors’ SD-WAN architectures, this one just happens to be a router, physical or virtual).

You need a Master Controller at every site, so inevitably there is some confusion when it comes to the Hub Master Controller.  I look at the Hub Master Controller as the place where all the policy is configured and then distributed to the rest of the Master Controllers in your network as they join your IWAN domain.  As the Hub MC is looking after the IWAN domain this will normally be a separate device at the hub (physical or virtual) for scale.  An IWAN domain is basically all your routing devices that participate in the Path Control.

If you look at a typical IWAN design you see a Hub Master Controller at the central site (and a second maybe on a redundant central site or DC).  Cisco should probably have called this an IWAN domain controller or something.  Instead they call these Hub Master Controllers (HUB MCs),  central sites are IWAN POPs (Points of Presence), and each central site has a unique POP-id.

In Cisco IWAN domain you must have a Hub site, you may have a Transit Site and of course you have your Branch sites.

The Master Controller functionality doesn’t do any packet forwarding or inspection, but simply applies policy, verifies and reports.  The MC can also be standalone or combined with a Border Router (BR).  You have to have a Master Controller per site (branch, hub etc.) so policy can be applied and verified.  If you only had one router at the branch with two transports, then the border router would be a combined Master Controller and Border Router (MC/BR).

Master Controller as a term comes up in several places then, but all you have to do is differentiate between the functionality of the one at the central site, and the ones everywhere else and it all becomes simpler to understand.

So ultimately there is a central place for configuration, policies, timers etc. (your Hub MC), but a completely distributed MC control plane.  It is important to know that even if you lose your Hub MC, you still have the local MCs optimising and controlling your traffic – a distributed control plane.
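The hub-and-local MC relationship can be sketched as a toy model. This is purely an illustration of the control-plane idea described above, nothing like the real implementation; all class and site names are invented.

```python
# Toy model of the distributed MC control plane (illustrative only):
# policy is configured once on the Hub MC, distributed to every site's
# local MC, and each local MC keeps working from its own copy even if
# the Hub MC is lost.

class MasterController:
    def __init__(self, site: str):
        self.site = site
        self.policy = None          # received from the Hub MC on peering

    def receive_policy(self, policy: dict) -> None:
        self.policy = dict(policy)  # local copy: survives loss of the hub

class HubMC(MasterController):
    def distribute(self, peers: list) -> None:
        for mc in peers:            # every MC peers with the Hub MC
            mc.receive_policy(self.policy)

hub = HubMC("HQ")
hub.policy = {"EF": "MPLS", "DEFAULT": "INET1"}  # illustrative policy

branches = [MasterController("branch1"), MasterController("branch2")]
hub.distribute(branches)

del hub  # even with the Hub MC gone...
print(branches[0].policy["EF"])  # ...local MCs still hold policy: MPLS
```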

Let’s briefly go through each of the components involved in PfR in turn, and then look at the monitoring:

  • Hub Master Controller:  This is where all the policies are configured; it is the Master Controller at the hub site, Data Center or HQ.  For this central site it also makes the optimisation decisions for traffic on the Border Routers (BRs).
  • Hub Border Router:  At each central site you need Border Routers to terminate WAN interfaces and PfR needs to be enabled on these.  The BR needs to know the address of its local Master Controller (the Hub Master Controller in this case), and you can have several hub BRs and indeed interfaces per BR.  As far as PfR is concerned you also need a path name for external interfaces – I will come onto Paths shortly.
  • Branch Master Controller:  I mentioned you need an MC on each site to make the optimisation decisions, but in this case there is no policy configured as with the MC at the Hub.  Instead it receives the policy from the Hub Master Controller. Obviously it therefore needs the IP address of the hub MC.  At a branch the MC can be on the same device as the border router – an MC/BR.
  • Branch Border Router:  The branch Border Router (BR) terminates the WAN interface and you need to enable PfR on it.  It also needs to know where its MC is (if the MC is on a separate box).  Once enabled for PfR, the WAN interface is detected automatically.  As noted earlier, the branch Border Router can also house the Master Controller for the site.

All Master Controllers peer with the Hub MC (or IWAN domain controller) for their configuration and policies.

Branches will always be DMVPN spokes.

Every site runs PfR to get path control configuration and policies from the IWAN domain controller through the IWAN Peering Service.

A Cisco diagram probably does a better job than a scribbling of mine to get the components across visually.

IWAN domain

One other confusing term in the architecture is Transit (this is the problem when you use terms with multiple meanings, even if the term accurately describes a functionality).  I understand a Transit site as a redundant hub Data Centre with a redundant Hub MC (IWAN domain controller).  So it is exactly the same as the Hub site; the only difference is that you do not define the policies here, they are copied from the Hub MC.  The Transit site also gets a POP-ID.

Remember each central site gets a POP-id.

Technically traffic can transit a branch site to get to another branch site if you get the routing and advertisements wrong, but this can get pretty messy and is best avoided.

As with most network architectures, a solid predictable routing design where you know your expected routed paths is the key to a stable and robust IWAN deployment.

Monitoring and Performance Routing

Unified Monitoring (Performance Monitor) provides application visibility including bandwidth, performance, correlation to QoS queues etc., and is responsible for performance collection and traffic statistics.

As I mentioned at the start, PfRv3 is an evolution of Optimised Edge Routing (OER), which was prefix route optimisation with traditional Netflow for passive monitoring and IP SLA probes for active monitoring.  This moved to PfRv2, which added application routing based on real-time performance metrics, and then PfRv3, which adds a bunch of things including smart probes, NBAR2, one-touch provisioning, Netflow v9, the Service Advertisement Framework, VRF awareness etc.

For application recognition (dynamic or manual) IWAN and PfR use NBAR2.

NBAR2 (Network Based Application Recognition) is a way of inspecting streams of packets up to layer 7 to identify applications.  It provides stateful deep packet inspection on the network device to identify applications, attributes and groupings.

Cisco defines this as a cross-platform protocol classification mechanism.  It can support and identify 1500+ applications and sub-classifications, and Cisco adds new signatures and updates through monthly released protocol packs.  It can identify IPv4, IPv6 and IPv6 transition mechanisms, and a bunch of applications like Tor, Skype, MS Lync, FaceTime, YouTube, Office 365 etc., and you can then configure policy based on this.

(IOS commands)

ip nbar protocol-discovery

ip nbar custom <name> transport

You match the protocol from NBAR2 when setting up your QoS policy for IWAN, and then your TCA levels (Threshold Crossing Alerts).  TCAs are pretty much what they sound like: you cross predefined thresholds around jitter, loss and delay, and an alert is created that the controller can then act upon for a path.
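A minimal sketch of the TCA idea, with made-up threshold values (these are not Cisco defaults):

```python
# Toy Threshold Crossing Alert check (illustrative only): compare measured
# jitter, loss and delay against thresholds and report which ones crossed,
# so a controller could act on the alert for that path.

THRESHOLDS = {"jitter_ms": 30, "loss_pct": 1.0, "delay_ms": 150}  # invented values

def check_tca(metrics: dict) -> list:
    """Return the list of metrics that crossed their threshold."""
    return [name for name, limit in THRESHOLDS.items()
            if metrics.get(name, 0) > limit]

sample = {"jitter_ms": 12, "loss_pct": 2.4, "delay_ms": 90}
print(check_tca(sample))  # ['loss_pct'] -> controller can move traffic off this path
```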

Netflow (developed by Cisco) essentially collects IP traffic information and monitors network traffic.  There are Netflow v5 and v9.  Netflow v5 is a fixed packet format with fields like source and destination IP, number of packets in the flow, source and destination ports, protocol, Type of Service, TCP flags etc.  PfR also takes advantage of Netflow v9, which adds a cornucopia of extra information to customise and report on; you can also define what you want to report on by creating your own custom flexible flow record.

For more information see Netflow V5 export format

For more information on Netflow v9 see Netflow v9 Format

IWAN Monitoring – all the PfR stats are exported using Netflow, so it is important to have a monitoring platform that supports Netflow v9 to get the most visibility out of your PfR monitoring.

Service Advertisement Framework (SAF)

PfRv3 has the concept of a peering domain, or Enterprise domain, for service coordination and exchange at the WAN edge.  SAF is the underlying technology here.

SAF creates the framework for passing service messages between routers, with SAF forwarders and clients – basically a service advertisement protocol.  There is considerable detail under the covers, but in IWAN it has been made very simple to configure.  (If you have history here you will remember SAF headers, databases, client APIs, client and forwarding protocols, transitive forwarders, services etc.  Fortunately the underlying mechanisms are now improved and hidden.)

In essence the advertisement of the SAF service uses EIGRP as the transport layer, completely separate from the IP routing protocol you are using to actually forward packets. It also uses link-local multicast for neighbour discovery.

Since you are using EIGRP as the engine for service advertisements, this also comes with split-horizon and DUAL (the Diffusing Update Algorithm) to prevent service advertisement loops.

SAF relies on the underlying network and DMVPN in order to know about and communicate with its peers, as through the tunnel they are effectively one hop away. There can be confusion here again, as it is easy to see EIGRP and assume it relates to underlying network connectivity. For SAF, however, provided there is IP connectivity to the domain Border Routers (i.e. through the DMVPN tunnel), peers can communicate and pass service advertisements between each other, having established SAF topology awareness and neighbours (peers) through the overlay EIGRP control engine.

Also SAF is efficient in that it only sends out incremental updates when advertisement changes occur and does not periodically broadcast/flood service advertisements.

In IWAN you have the concept of a Path.

A Path name in IWAN identifies a transport.

These Paths are identified with a Path-ID. You manually define the path name and path-id on the Hub and Transit BRs; they then start sending discovery probes to the branches, and these probes contain information about the path: namely the Path Name, Path-ID and POP-ID.

The Path-ID is unique per site.

Paths in IWAN also have three labels – Preferred path, Fallback path and Next Fallback path – and under each of these labels sit the actual paths.

For example you might say for this DSCP or this application (e.g. EF) your preferred path is MPLS, while for something else the preferred path is Inet1, etc.

Each transport is a Path, and understanding the concept of a Path and Path-ID certainly makes troubleshooting easier when you are looking at traffic that has changed paths, when, and the reason why.

Path of Last Resort

One final bit of confusion in IWAN is that at the Hub you need a BR per transport. At the branch/CPE you can have 2 transports per router, and if you want 3 transports you must have a second router for the 3rd transport (there are rumours this will rise to 5 transports per router in a future release, but we will wait and see). There is also this thing called Path of Last Resort, which some read as, "great, 3 transports really are supported per router." Turns out, no!

Basically if all other paths are unreachable, then we can fallback to the path of last resort.  This is not the same as the monitoring and control you get on the other paths.

PfR will not probe as usual – instead of sending 20pps, smart probes are reduced to 1 packet every 10 seconds, so really just a keepalive. The path-unreachable timer is also extended to 60 seconds. So really this is to be used if you have a 4G/LTE connection as a last resort or backup path for your traffic if all else fails. You add this config on the central site.
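As a rough sketch, the path of last resort is tagged where the path itself is defined, on the hub/transit BR tunnel interface for that transport. The interface, domain and path names below are illustrative, and the exact keyword may vary by IOS-XE release:

```
! Hub/transit BR tunnel for the 4G/LTE transport - illustrative
interface Tunnel30
 domain iwan path LTE path-id 3 path-last-resort
```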

Summary – steps of setting policy for PfR

PfR – first you define a Class (a bit like a class-map, if you are familiar with QoS policy config in Cisco IOS); this then has a match on DSCP or Application, then you have your transport preference (Preferred, Fallback, Next Fallback), then your performance thresholds based on loss, latency or jitter to decide which is the preferred path.
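As a hedged sketch of those steps (the domain, class and path names are illustrative, and the exact syntax varies slightly by release), the class/match/preference/threshold sequence looks something like:

```
domain iwan
 vrf default
  master hub
   class VOICE sequence 10
    match dscp ef policy custom
     priority 1 one-way-delay threshold 150
     priority 2 loss threshold 5
    path-preference MPLS fallback INET
```

Here EF-marked traffic prefers MPLS with the Internet path as fallback, and TCAs fire when the configured delay or loss thresholds are crossed.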

PfR actually works on the basis of a traffic class, which is not an individual flow but an aggregation of flows.

The traffic class is based on Destination Prefix, DSCP and Application name. (obviously not app name if just DSCP is used).  For each traffic class PfR will look at the individual next hop.

Performance metrics are collected per channel:

Per channel means:

  • per DSCP
  • per Source and Destination site
  • per Interface

A Channel is essentially a path between 2 sites per DSCP, or a path between 2 sites per next hop.

A Traffic Class will be mapped to a channel.

Channels are added containing the unique combination of DSCP value received, site id and exit.

PfR controlled and uncontrolled traffic

There is a concept of PfR controlled and uncontrolled traffic – when new traffic is seen at a spoke, the normal routing table controls the traffic destination for the first 30 seconds. Once the traffic comes under the control of PfR it abides by Threshold Control and is directed accordingly.

There is also an unreachable timer in PfR, determined by PfR probes, to detect the reachability of a channel. A channel is seen as down once traffic is not seen for 1 second, or if there is no traffic on the link and smart probes are not seen for 1 second. These are the defaults I believe, but there is a recommendation to set the timer to 4 seconds. I assume this will become the new default at some point.

So for failover, blackout will be the 4 seconds above; for brownout it is 30 seconds by default, but again can be reduced down to 4 or 6 seconds.

Performance Monitoring (PerfMon)

Unified Monitoring under PfR is enabled through Performance Monitor (PerfMon) which has been around a while and you might be familiar with it from Voice roll-outs.  It is responsible for performance collection and traffic statistics.

Application visibility includes bandwidth, performance,  and correlation to QoS queues

When it comes to IWAN domain policies and domain monitoring, there are 3 performance monitors to be aware of in PfR:

  • Performance Monitor 1: to learn site prefixes (applied on external interfaces on egress)
  • Performance Monitor 2: to monitor bandwidth on egress (applied on external interfaces on egress)
  • Performance Monitor 3: to monitor performance on ingress (applied on external interfaces on ingress)

IWAN uses these performance monitors to get a view of traffic in flight (traffic flowing through the interfaces) to look for performance violations and to make path decisions based on this.  Border Routers (BRs) collect data from their Performance Monitor cache, along with smart probe results (described below), aggregate the info and then direct traffic down the optimal forwarding path as dictated by the Master Controller.

Monitoring and optimisation – Smart probes

When there is no user traffic (e.g. on a backup path), probes are sent to keep the monitoring going. These are called Smart Probes.

Smart Probes are used to help with discovery, but also for measurement when there is no user traffic. These probes are generated from the data plane. Smart Probe traffic is RTP and is measured by Unified Monitoring just like other data traffic.

Smart probes add a small overhead to bandwidth on a link, but this is not performance impacting in general and can be tuned.

The probes (RTP packets) are sent over added channels to the sites discovered via the prefix database. Without actual traffic, the BR sends 10 probes spaced 20ms apart in the first 500ms and another 10 in the next 500ms, achieving 20pps for channels without traffic. With actual traffic, a much lower frequency is observed over the channel: probes are sent every 1/3 of the monitor interval, i.e. every 10 seconds by default.

That is 20pps per channel or per DSCP.  

Zero SLA is another feature that is often missed and should be mentioned. If you are concerned about a very low bandwidth link, and that you would be sending smart probes per channel/DSCP, then you can configure Zero SLA so that only DSCP 0 sends smart probes on secondary paths. All the other channels then do not get smart probes, only DSCP 0. If you have a 4G or low bandwidth link and are worried about overhead, this is definitely an option to have in the back pocket.
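Zero SLA is enabled where the path is defined, on the BR's tunnel interface for that transport – domain, interface and path names below are illustrative:

```
! Secondary/Internet transport on the hub BR - illustrative
interface Tunnel20
 domain iwan path INET path-id 2 zero-sla
```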

Smart probes are of three types:

  • Active Channel Probe: sent out to measure network delay if no probe has been sent in the past 10 seconds.
  • Unreachable Probe: used to detect channel reachability when there is no traffic being sent.
  • Burst Probe: used to calculate delay, loss and jitter on a channel that is not carrying active user traffic.

For low-bandwidth links (e.g. a DSL or 4G/LTE circuit) it is possible to tune this further for even less overhead – for example with the below command:

smart-probes burst quick number-of-packets packets every seconds seconds


The whole point of defining thresholds is to look for the crossing of a threshold or a performance violation – if we see this then an alert (a Threshold Crossing Alert, or TCA) is sent from the Border Router to the source Master Controller. It is at this point that PfR controls the route and changes the next hop to an alternative path as per the policy configuration, i.e. the traffic is re-routed to a secondary path. It is not PBR (policy based routing) as you might already be familiar with, but it is similar in that the remote site device knows what to do with this traffic class and routes it accordingly based on policy. The local Master Controller makes this decision.

All the paths are continuously monitored so we have awareness across the transports.

Routing and PfR

In part one we went through some of the choices around routing in DMVPN. Well, there are additional considerations with PfR.

One of the reasons EIGRP and BGP are preferred for IWAN is that alternate paths are available in the BGP and EIGRP topology table, and as PfR looks into these tables to make decisions and change next hops based on policy, they are well suited.

Scale. The first thing PfR does is look at the next hop of all paths, in the BGP or EIGRP table. If you show your routing table you only see the best next hop, but because PfR looks at both the routing table and the topology table, it sees the next hops for all paths.

With EIGRP you can adjust the delay to prefer MPLS, so this combined with the EIGRP stub feature means you can control routing loops.

With BGP you would have the hubs configured as the route reflector for BGP, and to prefer MPLS you can simply set a high local pref for MPLS.  If you have say OSPF on the branch then you redistribute the BGP into OSPF, and set a route tag on the spokes to identify routes redistributed from BGP.

As ever there are many ways to configure BGP, but the validated designs guide you to one relatively simple way.

If you looked at using OSPF, for example – well, PfR does not look into the OSPF database and therefore relies on the RIB (Routing Information Base), so in order to support multiple paths for decision making you would need to run ECMP (Equal Cost Multi-Path) – far from ideal.


When a site or branch is part of PfR it advertises its prefixes to the HUB-MC, which then forwards them to all the MCs in the domain.

This can be confusing, because obviously BGP or EIGRP send prefixes, but PfR also sends prefixes. One of the performance monitors collects the source prefix and mask and advertises this to all Master Controllers: it uses the domain peering to the HUB-MC, which then reflects the prefix out to all the other MCs in the domain.

Ultimately you end up with a table mapping site-id to site prefix and how it was learned, i.e. learned through IWAN peering (SAF, the Service Advertisement Framework), configured, or shared.

It is important that attention is paid to your routing (of course, it is always important that you pay attention to the routing) because in advertising the prefixes, PfR looks in the topology table based on the IP address and mask to dig out the prefix.

There are two Prefix concepts to be aware of 1) Enterprise Prefix List and 2) Site-Prefix

Enterprise Prefix list  is a summary of all your subnets, or all your prefixes in your IWAN domain.  This is defined on the HUB-MC for the domain.

A prefix that is not in this prefix-list is seen as an Internet prefix and load-balanced over the DMVPN tunnels. This matters, as if there is no Site-id for example (the site is not enabled for PfR), you don't necessarily want its traffic – Voice, say – to be load-balanced. So it is important to make sure you have a complete Enterprise Prefix List. Once a prefix is included in it, PfR knows the traffic is heading to a site where PfR is not enabled and knows not to load-balance it.

Site-Prefix – the site prefix is dynamic in PfR: on a site, Perfmon collects the outbound traffic, looks at the IP address and mask, and advertises that prefix up to the hub through PfR. On the hub and transit sites, however, you manually configure the site prefixes to be advertised.
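On the HUB-MC this ends up as two prefix-lists – one covering the whole enterprise, one for the local hub/transit site – along the lines of the below (domain name, list names and subnets are all illustrative):

```
ip prefix-list ENTERPRISE-PFX seq 10 permit 10.0.0.0/8 le 32
ip prefix-list DC1-PFX seq 10 permit 10.10.0.0/16 le 32
!
domain iwan
 vrf default
  master hub
   source-interface Loopback0
   enterprise-prefix prefix-list ENTERPRISE-PFX
   site-prefixes prefix-list DC1-PFX
```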

Prefixes specific to sites are advertised along with site-id. The site-prefix to site-id mapping is used in monitoring and optimization.

It is important that the right Site-Prefix is advertised by the right Site-id

Transit site preference

When you have multiple transit sites (or multiple DCs) with the same set of prefixes advertised from the central sites, you can prefer a specific transit site for a given prefix – the decision is made on routing metrics and advertised masks, and this preference takes priority over path preference.  The feature is called transit site affinity and is on by default (you can turn this off with no transit-site-affinity).

Traffic Class timers – if no traffic is detected for 5 minutes then the dynamic tunnel is torn down.

BACKUP Master Controller

BACKUP Master Controller – you can have a backup Master Controller, but it should be noted that today there is no stateful replication to the backup; the two are not synched. The way to do it is to configure the same PfR config on both, with the same loopback address on the backup controller but a /31 mask, so that should the primary go away the BRs will detect the failure and reconnect to the backup Master Controller – stateless redundancy.
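The /31 trick relies on longest-prefix matching – a sketch with illustrative addressing:

```
! Primary MC loopback - a /32 host route
interface Loopback0
 ip address 10.255.255.1 255.255.255.255
!
! Backup MC - same address with a /31 mask, so the
! primary's more specific /32 wins while it is alive
interface Loopback0
 ip address 10.255.255.1 255.255.255.254
```

When the primary's /32 disappears from the routing table, the BRs resolve the same MC address via the /31 and land on the backup.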

The backup MC will then re-learn the traffic.

In the meantime Branch MCs keep the same config and continue to optimise traffic as you would expect – anycast IP. We follow the routing table and do not drop packets, which is why you set the MPLS preference.

On the branch you need a direct connection between the BRs – on the HUB you just need IP connectivity.

Finally VRFs

VRF-Lite is used with all the same config ideas, but per VRF. Your overlay tunnel is per VRF (DMVPN per VRF) and your overlay routing is also per VRF (VRFs are not multiplexed in one DMVPN tunnel!). Under PfR, I mentioned that SAF (Service Advertisement Framework) was part of the magic behind PfR – well, the SAF peering for advertisements is also per VRF, as are the MC and BR config, and the policies.



Too much to learn

Ok, so that was a lot to take in, I agree. But hopefully by breaking down the component parts a little, next time you look at IWAN you will at least have a place to start, and understand what is actually going on when you select the drop downs in an IWAN GUI.

When you first look at IWAN you have terminology flying at you at an alarming rate. Much of it sounds familiar(ish), and it is easy to leap to a feeling of general understanding, until you realise that you are talking at cross purposes when it comes to EIGRP, or you are not sure exactly what Transit is, or the meaning of a Path.  Hopefully the above provides some context for deployments.

What I would say is that once you understand the components, deployment is surprisingly light touch and easy through your choice of IWAN app and GUI. In fact it is not too bad without really understanding it all, but it is always best to understand what you have just done. If you look at other SD-WAN vendors (and I will cover some of the broader protocol choices in another post), the GUIs have abstracted much of the underlying workings. This makes it all seem "oh so simple", and to be honest it should be like this. But as long as you understand that abstractions have been made and that there is no magic, you will quickly get a good feel for the various technologies. You will understand the protocols and design choices, and be able to identify the innovations that have been sprinkled along the way.

Finally you have a number of options when it comes to monitoring and orchestration with IWAN.  All take the pain away from setup and all work towards enhanced visibility. The fact you have some products marketing IWAN deployments in 10 minutes shows how the mechanisms can be streamlined through abstraction and automation.  In brief, your main options are below.

  • Orchestration – Cisco Prime, Anuta networks, Glue networks, Cisco VMS, Cisco APIC-EM, Cisco Network Service Orchestrator
  • Visualisation/monitoring – Cisco Apic-EM, Living Objects, Liveaction, Cisco Prime/VMS.

Hopefully by now you have enough of a feel for the technology to jump into the validated designs for IWAN productively, and deploy a whizzy tool with growing confidence. You never know, IWAN might be less painful than you might have feared, despite first impressions.

Adventures in IWAN – Part 1 – Transport Independence

For a number of reasons, and part of wider interactions with SD-WAN, I have been having a few adventures with Cisco’s current SD-WAN offering  – Intelligent WAN or IWAN (2.1).  Whether IWAN is an SD-WAN as you understand it from other vendors is a topic for another day, but I thought it might be useful to cover a few things I have come across.

This is intended to be a multi-part post with the first 2 parts covering the first 2 pillars (hopefully throw in a few config examples after part 2), and the third eventually covering pillars 3 and 4.

As at least 80% of getting IWAN up and running is in the first 2 pillars, I  am going to focus on these primarily.

One thing to note from dealing with Cisco IWAN so far, is a lot of the underlying mechanisms are exposed.  This has helped me, personally, to improve my understanding of all SD-WAN vendors and how their solutions fit together (e.g.  what is the overlay and how does it actually work? What affects the control plane?  What is being provided in the Data Plane?  Overlay routing for dynamic point-to-multipoint with encryption?  How exactly are you doing encryption?  What protocols are you using?  How are you managing key distribution and re-keying?  How is traffic diverted to the device or inline? How is performance monitoring working?  App optimisation?  Is it flow based?  How are you looking into the flow? Application identification – what method? Controller for traffic control?  Real time? Orchestration?  etc.)  Basically, what is the magic?

Are we ready?  OK. Strap in, here we go.

The Building blocks of IWAN

Have a quick look at the Cisco picture below and you can see the 4 pillars of IWAN


Each pillar of IWAN has underlying technology building blocks, and those technologies also have foundational components. Hopefully I will provide some clarity on the building blocks that layer on top of each other to produce a shiny polished IWAN solution.

The first 2 pillars of IWAN are – 1) Transport Independence and 2) Intelligent Path Control.

Part 1

 Transport Independence

The fundamental technology underpinning transport independence in IWAN is DMVPN (Dynamic Multipoint VPN) as the transport overlay technology, and this also has component parts.

So what is DMVPN fundamentally?

It is a combination of 4 things:

  1. Multipoint GRE tunnels
  2. NHRP (Next Hop Resolution Protocol) – basically creates a mapping database of the spokes' GRE tunnel interfaces to real (or public) addresses.  Think of this like the tunnel overlay IPs ARPing for the "real" underlay IP addresses.
  3. IPSEC tunnel protection – creates and applies encryption policies dynamically
  4. Routing – Essentially the dynamic advertisement of branch networks via routing protocols, e.g BGP, EIGRP, OSPF, RIP, ODR.

Let’s cover each one in turn, then you will have your tunnel overlay or secure transport independence sorted.

DMVPN – your overlay transport technology

Multipoint GRE tunnels

If you are familiar with GRE you will know that you create a tunnel, with an extra GRE header, between two endpoints. You create a tunnel interface (virtual interface) with an address, and tie this to a real source and destination address on the actual interfaces that terminate the tunnel.

A couple of pre-canned Cisco diagrams do the trick here for the sake of illustration:

GRE tunnel


Multipoint GRE  broadens this idea by allowing a tunnel to have “multiple” destinations and you can terminate the tunnels on a single interface.  Handy for Hub-and-Spoke, and Spoke-to-Spoke I think you will agree.

So Multipoint GRE is your tunnel overlay SD-WAN transport in the Cisco world.  Well that was simple, so onward to the less straightforward.
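A minimal mGRE tunnel sketch on a hub looks like the below – addresses and interface names are illustrative, and NHRP and encryption come next:

```
! Hub - multipoint GRE only, no NHRP or IPsec yet
interface Tunnel100
 ip address 10.0.100.1 255.255.255.0
 tunnel source GigabitEthernet0/0
 tunnel mode gre multipoint
 tunnel key 100
```

The combination of tunnel mode gre multipoint and the absence of a tunnel destination is what lets this one interface terminate many spokes.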

Next Hop Resolution Protocol – NHRP

The next building block of DMVPN is NHRP, and this provides a way of dynamically mapping all those multi-point GRE tunnel interfaces you just created with their associated real addresses or underlay transport network.

NHRP has actually been around a while in different forms and originates from an extension of the ATM ARP routing mechanism which dates back to 1998/1999 as a technology.

Think of NHRP (Next Hop Resolution Protocol) as like ARP but for the underlying real IP addresses.  So you have a physical interface on your wan router with an address, and you have a GRE tunnel address on that same router.  One is your IP underlay and one your IP tunnel overlay.  You now need a way to map your IP underlay network to your IP tunnel overlay network, and NHRP does this job.

By way of visualization, I particularly like the below diagram from Cisco which shows very clearly which are your overlay addresses, which are your tunnel addresses, and which are your real addresses or NBMA addresses.  As a distinction it might help to think of GRE as your transport overlay technology (each multipoint GRE tunnel maps to a WAN transport), and your overlay network as the network addresses you wish to send over this tunnel, so a network overlay.


A spoke router will register with a Next Hop Server (NHS) as it comes up. (You give the spoke an NHS address to register with – and, incidentally, a multicast mapping for broadcast over the tunnel if the underlying network does not support IP multicast, useful for routing protocols.) Once registered, the NHRP database maintains a mapping of real addresses to tunnel addresses. If a spoke then needs to dynamically discover the logical tunnel IP to physical NBMA IP mapping for another Next Hop Client (spoke) within the same NBMA network, it sends an NHRP resolution request. This discovery means you do not have to go via the hub every time for spoke-to-spoke communication – the Dynamic part of DMVPN, really. You can create dynamic GRE tunnels (and ultimately encrypted tunnels) from a spoke on the fly by querying NHRP to find the real NBMA address of another spoke and, voila, you have the peer information to set up your tunnels direct.

Nb. There are some interesting CEF details with NHRP between DMVPN Phase 1, 2 and 3 but that is follow on reading I would say.  Allowing a layer 2 resolution protocol to ultimately control your layer 3 direction and interactions is maybe controversial for the purist, and I will doubtless attempt to cover this when looking at some other SD-WAN techniques in other posts.

In short, all spokes register their NBMA addresses with a Next Hop Server (typically the hub), and when a spoke needs to send a packet via a next hop (spoke) on the mGRE cloud or transport overlay, it asks the NHS (via a resolution request) "can I please have the real/NBMA address of this next hop?". The NHS replies with the NBMA address of the other spoke, and from that point the spokes can speak directly.
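Sketching that registration and resolution flow in config (DMVPN phase 3, with illustrative addresses – 198.51.100.1 standing in for the hub's real NBMA address):

```
! Hub (NHS)
interface Tunnel100
 ip nhrp network-id 100
 ip nhrp map multicast dynamic
 ip nhrp redirect
!
! Spoke (NHC)
interface Tunnel100
 ip nhrp network-id 100
 ip nhrp nhs 10.0.100.1
 ip nhrp map 10.0.100.1 198.51.100.1
 ip nhrp map multicast 198.51.100.1
 ip nhrp shortcut
```

ip nhrp redirect on the hub and ip nhrp shortcut on the spokes are the phase 3 pieces that trigger the direct spoke-to-spoke tunnels.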


IPSEC tunnel protection 

IPSEC is the suite of protocols that enables end-to-end encryption over the network in IWAN. We are using IKEv2 and IPsec. Remember you can get DMVPN working as an overlay transport without encryption; it is optional (but good practice for security). Technically you just need your routing, multipoint GRE tunnel overlay network and NHRP, then you can add encryption once network connectivity is sorted. I have found this is a good way to build the solution in blocks to make troubleshooting easier.

It is a little involved to go into here, but essentially IKE Phase 1 identifies who you want to form an encrypted tunnel with, securely authenticates the peer (and sets some parameters for Phase 2), and then Phase 2 agrees on what to use to actually encrypt the traffic. The fundamental problem is that when you have to create a lot of point-to-point IPsec tunnels, you need some way to tell the devices the address of each peer so they can create encrypted tunnels. Each would then be an individual configuration for every peer-to-peer connection, managing keepalives (Dead Peer Detection), failover etc. If you want on-demand dynamic spoke-to-spoke encryption, then IPsec needs some help. There are a number of ways to solve this, but DMVPN phase 3 (multipoint GRE and NHRP) has been used for some time and is the method of choice in IWAN today.

With DMVPN it is always worth covering headers and how they are used in the real world should you choose to use IPSEC.  This way you can visualise the overlay network.

Typically you use transport mode with DMVPN – so what does this mean, and why use it with DMVPN?

Header confusion

There are Encryption Headers and GRE headers, do not confuse or conflate the two.

IPSEC uses 2 distinct protocols to either encrypt or authenticate your Layer 3 payload: the ESP header (Encapsulating Security Payload) and AH (Authentication Header), and both add headers to your packet. They both also run in one of two modes, tunnel or transport. These modes either use the original IP header (transport), or add a new IP header (tunnel) in order to traverse the network. This is outlined clearly in the diagram below.


The next level of header confusion comes with GRE – which also adds an IP header.

Your original packet might look something like:

IP hdr 1   |   TCP hdr  |    Data

GRE Encapsulation:

IP hdr 2   |    GRE hdr  |   IP hdr 1   |    TCP hdr  |   Data

GRE over IPsec Transport Mode (with ESP):

IP hdr 2   |   ESP hdr |    GRE hdr  |    IP hdr 1   |   TCP hdr   |   Data

GRE over IPsec Tunnel Mode (with ESP):

IP hdr 3   |   ESP hdr   |   IP hdr 2   |   GRE hdr   |   IP hdr 1 |   TCP hdr   |   Data

Transport mode only encrypts the data payload and uses the original IP header – whereas tunnel mode will encrypt the whole IP packet (header + payload) and use a new IP header.

In DMVPN both the GRE peer and IPsec peer addresses are the same, so typically transport mode saves on header addition which is essentially repeating identical information (20 bytes saved right there).

Typically you use ESP with Transport mode for DMVPN
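Tying this to config, a hedged sketch of ESP transport mode protecting the mGRE tunnel – the IKEv2 keyring and profile are omitted for brevity, and all names are illustrative:

```
crypto ipsec transform-set AES256-SHA esp-aes 256 esp-sha-hmac
 mode transport
!
crypto ipsec profile DMVPN-PROF
 set transform-set AES256-SHA
!
interface Tunnel100
 tunnel protection ipsec profile DMVPN-PROF
```

mode transport under the transform-set is the line that saves the extra 20-byte IP header discussed above.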

Now you should have a reasonable view of the Encryption overlay and the GRE overlay and the headers that are added end to end.


Routing comes up in two areas of IWAN – once in the transport independence piece, and again in best path selection with PfR – but it is important not to confuse the two. For example, PfR uses EIGRP for the Service Advertisement Framework (SAF), but for the transport piece you could use the same or a different routing protocol, e.g. BGP, EIGRP or OSPF. When EIGRP is also used for your underlay and overlay routing (which is highly likely) conversations can get confusing.

You have a router at the customer edge, trying to get to another router at another edge. In between you have a Service Provider network.  Typically in order to get traffic to where you want to go you need to interact with the Service Provider’s BGP network, whether that is BGP advertised default routes, statics, redistribution, whatever is most suitable for you and your SP.

Now with IWAN you are adding a tunnel overlay, and this overlay network needs to be advertised into your current Enterprise network so that traffic heading to another one of your sites knows which next hop to use – and that next hop will now be a tunnel interface, i.e. you need to use a tunnel to get there. Remember NHRP does the mappings here to actually get the tunnel traffic across to the real address of the remote site that terminates the tunnel. So where previously you may have used dynamic or static routing or a default route in BGP to say, "if you want to get to an address that lives across the WAN, use the following next hop (WAN interface)", with an overlay you are telling traffic to use your tunnel interface as the next hop. To advertise these tunnel overlay routes into your network you can either use statics or a routing protocol of choice like BGP or EIGRP. Of course, if your routing protocols are covering both the real WAN interface and your tunnel interface networks, you need to take care that the correct route gets installed into the forwarding table, and that you are learning the information from a consistent place so your routing protocols don't get confused and bounce the tunnel up and down (the recursive routing problem described a little further down).

As mentioned, the other use of a routing control plane in IWAN is for PfR (Performance Routing) , where the EIGRP engine is used for the Service Advertisement Framework and creates its own neighbours and domains accordingly.

Of course this is logically separate from the underlay and actual traffic forwarding and relies on the overlay network to get connectivity across the WAN between members of the SAF domain for sending SAF information to each other.  That is, the tunnels provide connectivity for SAF peers.

So what does all this mean?  Well it means you can very easily have 3 routing protocol names flying around in conversation confusing everyone on a whiteboard – BGP for underlay,  EIGRP for overlay,  EIGRP for PfR (or any mixture e.g. OSPF, BGP, EIGRP for routing and EIGRP for PfR).  The one constant here is the EIGRP engine is always the mechanism for PfR SAF peering.  However if you separate the PfR / SAF process in your mind as a monitoring technology that just happens to use an EIGRP process to set up its domains (nothing to do with network connectivity) – then the rest is really just routing as normal with care taken over your DMVPN.

DMVPN – which routing protocol?

If you have ever configured DMVPN you will know that there are limitations or caveats with each routing protocol.  Let’s have a brief look at these at a basic level, and for simplicity, only with DMVPN phase 3 .

OSPF – You can use OSPF of course, but you need to be a little careful with network types. Point-to-Point? Won't work, because you are using a multipoint GRE tunnel interface. Broadcast? This will work, but you need to make sure the spokes will never be elected as the DR or BDR (ip ospf priority 0 on the tunnel interfaces of the spokes should do the trick here). Non-Broadcast? Yes, this will work as with broadcast, but you need to statically configure your neighbours. Point-to-Multipoint? Works well with phase 3, and you don't have to worry about DR/BDR election. With DMVPN phase 2 it is important to note that Point-to-Multipoint does not work so well, as it changes the next hop so all traffic goes through the hub router – not ideal for dynamic spoke-to-spoke. In phase 2 you have the same issue with OSPF point-to-multipoint non-broadcast, with the addition of having to statically define your neighbours.
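In config terms the network-type choices above come down to a line or two on the tunnel interface (the two stanzas below are alternatives, with illustrative interface names):

```
! Broadcast: works, but spokes must never win the DR/BDR election
interface Tunnel100
 ip ospf network broadcast
 ip ospf priority 0
!
! Alternative - point-to-multipoint: phase 3 friendly, no DR/BDR at all
interface Tunnel100
 ip ospf network point-to-multipoint
```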

What are the issues with OSPF? A couple that spring to mind: in DMVPN you use the same subnet, and therefore all OSPF routers would be in the same area. Summarisation is only available on Area Border Routers (ABRs) and Autonomous System Border Routers (ASBRs), so the hub router would need to be an ABR for summarisation. Also, as OSPF is link-state, any change in the area will result in an SPF calculation across the area, i.e. all the routers will run an SPF calculation on a link change. Misconfiguration of the DR/BDR will break connectivity, and traffic engineering has its issues with a link-state protocol.

So OSPF is doable, using NSSA (Not So Stubby Areas) on the spoke and careful config, but for larger scale DMVPN people drift towards BGP/EIGRP.

EIGRP – Is not link-state, does not have an area concept, and you don’t have to think about the topology tweaks needed with OSPF above.  One thing to note in DMVPN phase 2 is that you don’t want the hub setting itself as the next hop for routes, but you can configure around this with EIGRP.  Of course you need to disable split-horizon so routing advertisements are allowed back out of the same interface (the mGRE tunnel interface).  Good advice for scale is to turn the spokes into EIGRP stubs and to watch the number of adjacencies the hub holds, as hellos can become an issue (you can play with the timers here too).  Also, EIGRP can summarise and manipulate metrics at any point in the network.
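A minimal sketch of those EIGRP tweaks, in classic-mode configuration – the AS number and addressing are hypothetical:

```
! Hub
interface Tunnel0
 no ip split-horizon eigrp 100       ! re-advertise spoke routes out of the mGRE interface
 no ip next-hop-self eigrp 100       ! phase 2 only: preserve the spoke next hop
 ip summary-address eigrp 100 10.0.0.0 255.0.0.0   ! summarise towards the spokes
!
router eigrp 100
 network 172.16.0.0 0.0.0.255
!
! Spoke
router eigrp 100
 network 172.16.0.0 0.0.0.255
 eigrp stub connected summary        ! limit query scope at scale
```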

EIGRP is well-suited to DMVPN at scale.


BGP also works for DMVPN – we know it scales (the Internet), and the default timers are less onerous than those of other protocols.   The choice, as ever, is iBGP vs eBGP.  Whereas iBGP might require route reflectors at scale and an IGP to carry next hops, eBGP might need several AS numbers, or you could disable loop prevention.

With DMVPN eBGP, the next hop is changed on outbound updates, so all good there.  The next question is whether to use the same AS for every site, or a unique AS per site.  Unique AS numbers are good for loop prevention, but the 16-bit private AS range (64512–65534) can limit you to around a thousand spokes.  4-byte AS numbers solve the private AS shortage, but unique AS numbers still mean a fair deal of configuration at the hub.

Say you run the same AS at all sites.  In this case the receiving router sees its own AS number in the AS path of a received BGP update, assumes the update has looped back to its own AS, and drops it.  To get round this you can use as-override, but this removes that loop prevention and can produce loops in the control plane.
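For the same-AS-at-every-site approach, the workaround is applied at the hub.  A hypothetical sketch, with illustrative AS numbers and addresses:

```
! Hub in AS 65000, every spoke site sharing AS 65001
router bgp 65000
 neighbor 172.16.0.2 remote-as 65001
 neighbor 172.16.0.2 as-override   ! rewrite the spoke AS in the path so the
                                   ! other spokes in 65001 will accept the route
```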

iBGP then – back to the next-hop modification issue.  With phase 1 and phase 3 you can use “neighbor <ip> next-hop-self all” for reflected routes on a route reflector.  With this in place, iBGP probably becomes the preferred option when it comes to BGP and DMVPN.
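An iBGP hub along those lines might be sketched as follows – the dynamic-neighbour range, peer-group name and AS number are illustrative:

```
! Hub acting as route reflector for all spokes in the same AS
router bgp 65000
 bgp listen range 172.16.0.0/24 peer-group SPOKES   ! accept dynamic spoke peers
 neighbor SPOKES peer-group
 neighbor SPOKES remote-as 65000
 neighbor SPOKES route-reflector-client
 neighbor SPOKES next-hop-self all                  ! rewrite next hop on reflected routes
```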

iBGP is well-suited to DMVPN at scale.

From the above, EIGRP or BGP tend to be the preferred choices for DMVPN.

Now the assumption with IWAN is often that BGP and EIGRP are chosen entirely for the typical reasons above.

However, in addition to the good reasons above, remember that with IWAN you want some method of quick failover to an alternate or best path based on monitoring.  With BGP and EIGRP you have alternate routes ready and waiting (the EIGRP topology table with its feasible successors, and the BGP table with its additional paths) to populate the routing and forwarding tables on failure and facilitate a quick change of preferred path.

Another very good reason for the use of EIGRP and BGP with IWAN.

So there you have it, a brief tour of the four building blocks of DMVPN.

Finally, of course, no current discussion of DMVPN would be complete without a brief excursion into Front-Door VRFs and recursive routing.

Front Door VRFs.

These are a very useful technique in IWAN, as they simplify paths and configuration a good deal.  What is VRF (Virtual Routing and Forwarding)?  Basically it allows multiple instances of a routing table to exist on a router and work simultaneously.  This is useful as it allows network paths to be segmented without using multiple devices.  Effectively, in an IWAN design you put your WAN interfaces into a separate VRF (the front door, you see), and this avoids some recursive routing problems you may be familiar with from GRE (more on that later).

Recursive routing with GRE

If you are familiar with configuring DMVPN you may be aware that you can get yourself into a pickle when it comes to routing, and in particular, recursive routing.  So if you are using a routing protocol for your overlay and another for your underlay, there could be a conflict here.  For example, if you learn your route both inside and outside of the tunnel for the same prefix, well the router gets a little confused.

If you have ever seen “Tunnel temporarily disabled due to recursive routing” then you know what I am talking about.  The first time you bump into this it can lead to furrowed brows and prolonged head scratching until the light-bulb fires.

So here is the crux of this issue:

If, for example, you have two routers whose NBMA (WAN) interfaces at each end are addressed out of the 10 network, these interfaces are on different subnets, so you use a routing protocol to get across any intermediate hops to the other end.  Say we use OSPF for this, e.g. Router_A (10.x NBMA) – Router_B (OSPF) – Router_C (OSPF) – Router_D (10.x NBMA).  These NBMA addresses are also your tunnel endpoints, remember.

Now say you want to use EIGRP to advertise your tunnel network, and you make the easy mistake of overlapping networks, i.e. your GRE tunnel interface addresses at each end are also taken from the 10 network.  So you might configure EIGRP with a network statement for the 10 network (which also happens to cover the NBMA, or real, addresses).

OK, so the problem here is that you now have the 10 network being advertised for the NBMA addresses in OSPF and then, when the tunnel comes up, you also have the 10 network being advertised through EIGRP over the tunnel.  So as soon as the EIGRP neighbour comes up over the tunnel, the tunnel goes down and with it the EIGRP neighbour – rinse and repeat.  The problem, of course, is that the NBMA (or WAN) interface is now being advertised over the tunnel network using EIGRP.

Given the way the tunnel gets set up, which is to rely on OSPF (to find the actual NBMA tunnel endpoint), this is simply not going to work.

In short, when the EIGRP neighbour comes up you are saying the way to reach the real address (the tunnel endpoint) is over the tunnel, while simultaneously overriding the route (learned via OSPF) that the tunnel actually uses to reach that real address and set itself up.  The only reason the EIGRP neighbour could come up in the first place is that OSPF had already provided the underlay routing to build the tunnel.  All clear?  Yeah, I know, this can make you rub your forehead the first time you come across it.

The usual way to get round this is to be very careful with your subnets and routing to avoid the recursion.

But there is another way to avoid this – enter (or enter through) Front Door-VRF.

The principle here is that you have a separate routing table for the physical WAN interface (the front door) and for the tunnel or overlay network – a VRF for each.  Or, most simply, a separate VRF for the WAN interface, with everything behind it in the global routing table if you so wish.  As we are no longer learning the routes for the tunnel and the NBMA addresses through the same routing table, bingo, you have solved your recursive routing problem.

There is still some magic needed, as there must be a way to tell the tunnel you are creating to use the WAN interface as its endpoint.  Create your WAN interfaces in their own VRF, then create your tunnel interfaces with these addresses as the tunnel source and destination, and finally stitch the two together with a VRF command under the tunnel interface (the stitching is the internal pixie dust).  Your network and routing over the tunnel are now separated from the transit network underneath.
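Stitching the pieces together might look something like this – the VRF name and addresses are hypothetical, and the key line is the VRF command under the tunnel:

```
vrf definition INET
 address-family ipv4
 exit-address-family
!
interface GigabitEthernet0/0
 description WAN interface - the front door
 vrf forwarding INET
 ip address 203.0.113.2 255.255.255.252
!
interface Tunnel0
 ip address 172.16.0.2 255.255.255.0    ! overlay lives in the global table
 tunnel source GigabitEthernet0/0
 tunnel mode gre multipoint
 tunnel vrf INET                        ! endpoint lookups use the INET VRF
!
ip route vrf INET 0.0.0.0 0.0.0.0 203.0.113.1   ! underlay default route
```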


Shut the front door – that is much simpler for DMVPN.

Part 2 Intelligent Path Control



It’s all about the Bayes.


The defensive side of Security technology is an interesting place to be at the moment, with a vast number of products and techniques trying to defend against an ever-changing attack landscape.

Where there is uncertainty, people want to be assured, and reduce the likelihood that they will get breached.  Is the information gathered real and actionable?  Have we been breached or not?  What is the probability?

Bayes Theorem is fashionable across a number of fields today, and the idea of ‘machine learning’ to solve a security problem seems compelling.

Bayes was an “amateur” mathematician and church minister in the 18th century – so no knowledge of computers – but he set out to solve a fundamental problem, and that is where lasting ideas come from.

So why Bayes?

If you have read Daniel Kahneman’s book “Thinking Fast and Slow” (a highly entertaining read), you will be aware that humans are not always great at instinctive decisions based on statistics.  And added context sometimes overrides the facts, when it really shouldn’t.

Consider a drug being brought to market that definitely cures a disease 99.9% of the time.  (I like it, where can I get it?)  I know what 99.9% means, that means pretty much a sure thing?  Hold on, how often does it fail and what are the consequences when it does?  Well, across a 40,000-seater stadium of patients, it fails for 40 people.  What if, when it fails, it kills the person 50% of the time?  So 20 people in that stadium would die.  Clearly not acceptable, the drug is shelved.

Extending this to the base-rate fallacy.  If I have a test for a disease that is 99% accurate, what are the odds I have the disease if I test positive?  That is, 99 out of 100 people who have the disease will test positive, and 99 out of 100 who don’t will test negative.  If the disease affects 1 in 100 people, it turns out the answer is around 50%.  Take the test again, and the odds go up to 99%.  Take a 90% accurate test instead, and even after the second test the odds are still not at 50%.  As an exercise, once you have been through the examples below and fully understand them, I encourage you to take these numbers and plug them into the Bayes equation.
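The arithmetic is easy to check with a few lines of Python.  A minimal sketch, assuming a base rate of 1 in 100 (which is what makes the 50% answer come out):

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(disease | positive test) = true positives / all positives."""
    true_pos = prior * sensitivity
    false_pos = (1 - prior) * false_positive_rate
    return true_pos / (true_pos + false_pos)

# 99%-accurate test, disease base rate of 1 in 100
p1 = posterior(0.01, 0.99, 0.01)   # 0.50 - a coin flip after one positive
p2 = posterior(p1, 0.99, 0.01)     # 0.99 - a second positive seals it

# 90%-accurate test, same base rate
q1 = posterior(0.01, 0.90, 0.10)   # ~0.083 after one positive
q2 = posterior(q1, 0.90, 0.10)     # 0.45 - still under 50% after two positives
```

Note how each retest simply feeds the previous posterior back in as the new prior.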

Another example of this flaky judgement is the conjunction fallacy.

The classic example is below:

“Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.

Which is more probable?

  1. Linda is a bank teller.
  2. Linda is a bank teller and is active in the feminist movement.”

The idea is that some may think the second is more probable, when the “and” actually makes it less likely given the base rates.  How many 31-year-old females are bank tellers?  How many of those are also active in the feminist movement?  A conjunction can never be more probable than either of its parts alone.

The additional information often confuses our instinctive noodle.

The whole idea with Bayes is we can add numbers to things that seem subjective or confusing, use any additional info, and get a more accurate read.

Bayes states that…

The prior odds times the likelihood ratio equals the posterior odds.


The formula for Bayes is not difficult; the hard part is deciding what to plug into the formula, and how.  You need to decide on your tests and events.  There is a test for a condition, and there is the event that someone actually has that condition.

You are saying “what is the probability of the event given the following test was positive?”

Also, what is the probability of the positive test being accurate?  (Think false positives – and you can use false negatives too, depending on how you frame the test and the event.)


You can see how this can get a little confusing, but let’s have a go anyway with Bayes below.

The point about Bayes is that you have some Data and you make a claim about the data, or a hypothesis.

So you have a Hypothesis and you want to know what the probability of that hypothesis is given the Data.

The notation P(A|B) can be read as the probability of A given that B is true.

So P(A|B), where A is the hypothesis and B is the Data.  You can read the “|” sign as the word “given” if you like.  Altogether this is called the Posterior.

So this is equal (=) to

P(A), the probability of the hypothesis – we call this the Prior

Multiplied by

P(B|A), the probability of the Data given a particular hypothesis – call this the Likelihood

Take all this and divide by

P(B), the probability of the Data itself.

Got that?  Good.  Let’s plug in some numbers with an example

We will do this in steps and then the equation.

We have a condition; let’s call it “Geekiness”.  What if we had a test to try to identify Geekiness?  (Aside from writing blog posts about Bayes.)

We are testing 100 students for this condition.

We know Geekiness affects 20% of the students tested.

The test for Geekiness involves watching a Star Trek film trailer and seeing whether the student’s pupils dilate excessively.

Among the students with Geekiness, 90% show dilated pupils when tested.

But among those without Geekiness, 30% also show dilated pupils when seeing the trailer.


So, out of the 100 students, what is the probability that a student with a positive test actually has “Geekiness”?

Or, as a hypothesis – what is the probability that a student has “Geekiness”, given a positive test result (excitement at a Star Trek trailer)?

Step 1: Find the probability of a true positive on the test.  That is the proportion who actually have Geekiness (20%) multiplied by the true positive rate (90%) = 0.18 (or 18 people out of the 100 students).

Step 2: Find the probability of a false positive on the test.  That equals the proportion who don’t have Geekiness (80%) multiplied by the false positive rate (30%) = 0.24 (or 24 people out of the 100).

Step 3: Figure out the probability of getting a positive result on the test.  That equals the chance of a true positive (Step 1) plus a false positive (Step 2) = 0.18 + 0.24 = 0.42.

Step 4: Finally, find the probability of actually having Geekiness given a positive result.  Divide the chance of a real positive result (Step 1) by the chance of any kind of positive result (Step 3) = 0.18/0.42 = 0.43, or 43%.  So considerably less than the 90% we started with.  With that additional information, a test that is 90% accurate for those with the condition is less than 50% accurate once you take everyone into account (the base rate).
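The four steps above translate directly into a few lines of Python, if you want to check the arithmetic:

```python
prior = 0.20        # P(Geekiness) among the 100 students
sensitivity = 0.90  # P(pupils dilate | Geekiness)
false_rate = 0.30   # P(pupils dilate | no Geekiness)

true_positive = prior * sensitivity            # Step 1: 0.18
false_positive = (1 - prior) * false_rate      # Step 2: 0.24
any_positive = true_positive + false_positive  # Step 3: 0.42
posterior = true_positive / any_positive       # Step 4: ~0.43
```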

Surprising for some but we get a real figure, and this is the power.

Let’s now plug the same info as above into the Bayes equation.


A Posterior, a Prior and a Likelihood walk into a bar….

P(A|B) is the probability the student has “Geekiness” given a positive test result.  (Posterior)

P(A) = Probability of having Geekiness = 20%  (Prior)

P(B|A) = Chance of a positive test result given the student actually has Geekiness = 90%  (Likelihood)

P(B) = Chance of a positive test in the overall student population of 100, which is 42%

Now we have all of the information we need to put into the equation:

P(A|B) = P(B|A) * P(A) / P(B)


P(A|B) = (0.9 * 0.2) / 0.42 = 0.43


Or in percentages: (90% * 20%) / 42% ≈ 43%

Another way to express this:

Prior odds * Likelihood ratio = Posterior odds
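Running the Geekiness numbers through the odds form gives the same 43% – a quick sketch:

```python
prior_odds = 0.20 / 0.80        # 20% probability expressed as odds: 1 to 4
likelihood_ratio = 0.90 / 0.30  # P(positive | Geek) / P(positive | not Geek) = 3
posterior_odds = prior_odds * likelihood_ratio          # 0.25 * 3 = 0.75 (3 to 4)
posterior_prob = posterior_odds / (1 + posterior_odds)  # back to probability: ~0.43
```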


So there you have it.  Try some examples yourself, and be patient – don’t expect to be a whizz in five minutes.

There is more to Bayes than I have covered, but hopefully you now have a feel for how taking the data, the sample size, and the accuracy into account can affect your probability.  We need to be rigorous in questioning the data.  A number of security start-ups are using these techniques to better predict and detect anomalies or breaches, and although it doesn’t promise to be a panacea, I am excited to see where all this leads over the next few years.

And remember…