Using apply-groups to insert policy in a predictable manner….

Recently I’ve been solving some interesting problems around how routing policy can be implemented. What I wanted was a way to quickly insert a policy at the start of the chain of routing policies on a BGP neighbor, to stop advertising routes to it.

For those who are not aware, you can implement multiple policies in a chain on a BGP session. You do this like so;

[edit protocols bgp group R2]
[email protected]# show 
export [ policy-1 policy-2 policy-3 ];
neighbor 192.168.3.1 {
    peer-as 2;
}

In this configuration, all policies are evaluated sequentially. Each policy with matching terms takes the actions it defines on the routes matched. Routes continue to be processed by subsequent terms and policies until a terminating action is hit. A “terminating action” is something like an “accept” or “reject” action: once you hit one of these, the actions gathered so far are applied, including the terminating action, and processing stops.

Let us say for the sake of this test that the policies we have applied to our peer do 2x as-path prepends, add a MED value, then accept all routes;

[edit policy-options]
[email protected]# show 
policy-statement policy-1 {
    then as-path-prepend "4 4";
}
policy-statement policy-2 {
    then {
        metric 400;
    }
}
policy-statement policy-3 {
    then accept;
}

This would result in the following routes being sent from this router (note the MED and as-path prepends);

[email protected]# run show route advertising-protocol bgp 192.168.3.1    

inet.0: 6 destinations, 6 routes (6 active, 0 holddown, 0 hidden)
  Prefix		  Nexthop	       MED     Lclpref    AS path
* 172.16.172.0/24         Self                 400                4 4 [4] I
* 192.168.3.0/24          Self                 400                4 4 [4] I
* 192.168.6.0/24          Self                 400                4 4 [4] I

In the scenario I described at the start, the aim was to be able to easily (with one line of configuration) insert a policy at the start of the policy-set that rejected all routes. However, as described above, this peer might have existing policies that allow routes to be sent.

Assume I have already created the following policy;

[edit policy-options]
+   policy-statement policy-reject {
+       then reject;
+   }

If I quickly activated this by adding it as an export policy, it would appear as the last policy on the policy-chain;

[email protected]# set protocols bgp group R2 export policy-reject 

[edit]
[email protected]# show | compare 
[edit protocols bgp group R2]
-    export [ policy-1 policy-2 policy-3 ];
+    export [ policy-1 policy-2 policy-3 policy-reject ];

As we can see below, this does not reject any of the routes, because the reject action is processed after a terminating accept action in policy-3;

[email protected]# run show route advertising-protocol bgp 192.168.3.1              

inet.0: 6 destinations, 6 routes (6 active, 0 holddown, 0 hidden)
  Prefix		  Nexthop	       MED     Lclpref    AS path
* 172.16.172.0/24         Self                 400                4 4 [4] I
* 192.168.3.0/24          Self                 400                4 4 [4] I
* 192.168.6.0/24          Self                 400                4 4 [4] I

To get the ordering I want with this approach, I would have to do the following;

[edit protocols bgp group R2]
[email protected]# delete export 
[email protected]# set export policy-reject
[email protected]# set export policy-1
[email protected]# set export policy-2
[email protected]# set export policy-3

You could shorten this by using the “insert” functionality, but that still means re-ordering policies rather than activating this with a one-liner.
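For reference, the “insert” variant might look something like the sketch below. It assumes the same group R2 and policy names used above, and that policy-reject is first appended with “set” before being moved (the lines starting with # are annotations only). It is still two steps that touch the ordering of the chain, which is why it did not meet my one-liner goal.

```
[edit protocols bgp group R2]
# append policy-reject to the end of the export chain
set export policy-reject
# then move it ahead of the first existing policy
insert export policy-reject before policy-1
```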

The trick I have come up with to solve this is quite cool. I add a term to an existing policy with an apply-group;

[email protected]# show | compare 
[edit]
+ groups {
+     policy-reject {
+         policy-options {
+             policy-statement policy-1 {
+                 term reject-term {
+                     then reject;
+                 }
+             }
+         }
+     }
+ }

When we activate this apply-group, the term gets inserted into policy-1, which is already ordered to be applied prior to the terminating accept action in policy-3;

[email protected]# show | compare 
[edit]
+ apply-groups policy-reject;

We can see here that it is implemented in policy-1;

[edit policy-options policy-statement policy-1]
[email protected]# show | display inheritance 
##
## 'reject-term' was inherited from group 'policy-reject'
##
term reject-term {
    ##
    ## 'then' was inherited from group 'policy-reject'
    ## 'reject' was inherited from group 'policy-reject'
    ##
    then reject;
}
then as-path-prepend "4 4";

As policy-1 is already the first policy to be processed, this ensures that without having to re-order our existing policies we can insert a reject policy at the start. We can now check the result;

[email protected]# run show route advertising-protocol bgp 192.168.3.1    

[edit]
[email protected]#

And it is working!
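As a sketch of how the toggle works day to day (assuming the group has already been defined as above, and with the # lines as annotations only), withdrawing and restoring the advertisements is one configuration line each way, plus a commit;

```
[edit]
# stop advertising: activate the group so reject-term is inherited into policy-1
set apply-groups policy-reject
commit

# resume advertising: remove the group reference again
delete apply-groups policy-reject
commit
```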

You might ask why I did not just apply a new policy in an apply-group instead of a new term? The reason for this is that new policies are added at the end of the policy chain, while a new term in a pre-existing policy uses the existing place that policy has in the policy-chain.

For my purposes, this has enabled me to give out a one-line command which a large group of people can use to disable the advertisement of certain routes at certain times. As with everything in Junos though, there are many possible uses for this set of functionality. Hope this helps!

Junos CSPF for multiple dynamic paths

A quick one today, but a few people have recently asked me how the Junos CSPF process manages to always ensure the primary and secondary paths are kept diverse *automagically*. RSVP path calculation with CSPF is an interesting topic, and if you look through the archives, you’ll find I’ve touched on this as I’ve discussed SRLGs, RSVP auto-bandwidth, and other such topics.

Let’s assume for the purposes of this post that we have an LSP that looks like this:

[edit protocols mpls]
[email protected]# show 
label-switched-path test-LSP-to-PE5 {
    to 10.0.0.5;
    primary primary-path-to-5;
    secondary secondary-path-to-5;
}
path primary-path-to-5;
path secondary-path-to-5;

This LSP is pretty simple. There are no constraints implemented for either of the primary or secondary paths, and no constraints on the LSP as a whole.

Upon configuration of an LSP such as this, Junos will attempt to stand up the primary path. This will follow the best IGP metric subject to constraints (available bandwidth, admin-groups, SRLGs, etc). Only once this has stood up will the calculation for the secondary begin.

As Junos calculates the secondary path, it will first take a copy of the topology of the RSVP network and inflate the IGP metric of every link that has been used by the primary path by a factor of 8 million. This ensures that if at all possible, a diverse path is used – while still allowing the use of a shared path between the primary and secondary if this is our only option.

When the primary path re-optimizes to a new path, this process is repeated, thus ensuring that we at all times try to keep the primary and secondary as diverse as possible.

It’s worth noting that if you turn off CSPF for the LSP, the above functionality will not be implemented, and both paths will simply follow the best IGP path.
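If you want to see this for yourself in a lab, CSPF can be disabled per LSP. A minimal sketch, using the LSP name from the example above (the # line is an annotation only):

```
[edit protocols mpls]
# disable CSPF for this LSP only; both paths then simply follow the IGP
set label-switched-path test-LSP-to-PE5 no-cspf
```

Comparing the output of “show mpls lsp name test-LSP-to-PE5 extensive” before and after the change should show whether the primary and secondary still take diverse paths.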

Hope this helps!

The 10 most useless tips to pass your JNCIE

Recently I’ve had a bunch of people asking me what to do on exam day to pass their JNCIE, from whether they should have coffee in the morning or not to what sort of breakfast they should have. Other questions have included concerns about screen size on the laptops provided. These sorts of things are mentioned on a bunch of websites and study guides. I’m going to address my view on this once and for all in this post…

I had to abandon my holiday yesterday (Easter Friday) and spend it debugging a fault. It was a particularly hard and complex one, which had left others stumped. While I’m not able to go into a lot of detail, I can disclose that to identify the cause of the fault I had to spend 15 minutes or so tcpdumping RSVP messages and examining them to find the issue. From this I was able to deduce that a particular box was doing something that was particularly odd and we were able to proceed with resolving it.

The above paragraph might seem unrelated to the point of the post… but it really isn’t. The issue I debugged yesterday was far harder than the troubleshooting elements of any JNCIE exams I have taken. This isn’t to say that the JNCIE exams aren’t incredibly challenging – they are – and I have great respect for anyone who has done one of these exams – as it displays a significant level of skill.

The point is however that, like many others, I spend most of my days doing a combination of debugging the most ugly of problems you will ever see in a service provider environment and architecting a huge range of solutions comprising technologies from many vendors to deliver to customers. A couple of times a year, people in these positions are going to face a mind-bendingly odd problem which will require significant debugging skill and an intimate understanding of the technologies and protocols involved to resolve.

If you are doing this sort of work regularly, you are not going to struggle to achieve a JNCIE. You are still going to have to do a hell of a lot of reading, labbing and learning. The breadth of topics covered in a JNCIE will force most people to cover many technologies they have never touched, to a depth they have never needed before. I personally spent a considerable amount of time reading, re-reading, and re-re-re-reading many RFCs, books, and other resources to make sure I had a solid understanding of the standards defining each protocol. Significant time was spent labbing technologies that were new to me (for example – NG-MVPN, draft-rosen, Carrier of Carriers VPN, and Interprovider MPLS VPN options B, C & E).

But what I didn’t have to learn was how this stuff fundamentally works. I get this stuff already, as do most people who successfully attempt this exam. I didn’t have to learn how to troubleshoot to do the JNCIE – I’ve been troubleshooting complex and horrible problems all of my career. Most people sitting these shouldn’t need to learn this.

About 8 weeks ago I was sitting in a hotel room drinking with a few friends, when someone piped up and had what I considered to be a long and whiny rant where they blamed everyone else for their failures. Essentially this person was saying that the questions were not worded well enough, and this was why he failed his CCIE. I tend to disagree. Both the JNCIE and CCIE exams are extensively alpha & beta tested (I recently sat the beta version of the JNCIE-ENT). During this time much feedback is given around the readability and understandability of the questions. The questions will never be written to tell you what to do – that’s not the point of an expert level exam. But they are written in a way that ensures that if you have the understanding required you will know what to do. Another person in the room pointed out that he’d lost half the time in the troubleshooting section of his CCIE because he hadn’t seen that there was a topology diagram, but still passed. Because he understood the content to an expert level, and had a huge amount of experience, he was able to get everything in that section done in the remaining 50% of the time.

What’s the point of all this? Why is this all relevant? Well it’s actually all pretty relevant – the point is that for any of these IE exams, they are designed so that an expert in the subject areas will pass. If you have an intimate understanding of how multicast works, you aren’t going to struggle to deploy and troubleshoot it.

Many websites and study guides have tips and tricks for these exams. These range from not having too much coffee (duh!) to eating a certain type of bread in the morning. Most of them are utter rubbish. Or more to the point – while they’re going to help you focus during the day, and might even make the difference between having time for that one extra question that gets you over the pass mark and not having time – this isn’t going to pass the exam for you. For the record – both times I’ve done a JNCIE the morning has begun with a massive breakfast from the Juniper cafe that makes me sleepier than normal! And both times I’ve walked out hours before the finish time with everything done.

While I don’t want to throw stones at anyone in this article, I think it’s time for a bit of a reality check for those who think that the type of bread they eat, range of pens in the exam, different colour highlighters, or the size of the laptop screen are going to make any difference at all. The best difference you can make is regular hard work in the many months leading up to the exam, and having great amounts of hands-on experience to boot.

Happy Easter everyone!

Two new certifications… and….. aspirations for another JNCIE!

Over the last month or so, I have been quietly working away at a couple more JNCIS certifications – JNCIS-QF (QFabric) and JNCIS-SEC (Security).

For those of you who have not done any Juniper certifications before – a JNCIS is roughly equivalent to a CCNP level certification. I’ve come off the back of fairly intensive study for my JNCIE-ENT, for which I was invited to sit the beta of some new test forms in February – so rather than doing 6+ hours per night, I slowed this down to an hour or two per week. It’s actually been quite odd re-figuring out what to do with all this “free time” – so doing these certifications was a good thing to do to keep my brain occupied and learning something new!

I’m pleased to say that I passed both, getting my JNCIS-QF on 14 March, and my JNCIS-SEC on 8 April. I thought both did a really good job of establishing that the candidate had a reasonable knowledge of the subject areas covered, and would feel confident setting up either a QFabric system or a Juniper SRX in a live deployment (which I guess is the point, right?).

I’ve also decided that I am in fact going to work towards a third JNCIE – the JNCIE-SEC. In many ways, this is going to be far more interesting than the other two, given the fact that I have far less experience in the area of Network Security than I have in Service Provider or Enterprise (as I have spent most of my career working for large service providers!). I’m really looking forward to learning a bunch of new and different technologies – something which is always very enjoyable!

I do however plan to take this one significantly slower than the other two. Essentially I did all the study for my last two JNCIEs in one year – and while I am glad I did it, as I wanted to prove to myself that I could; I would not do it again as I had no life at all while I was doing it. My plan is to slowly work towards this with the aim of doing the JNCIE-SEC exam sometime in the next year. Over the next couple of months I plan to sit the JNCIP-SEC and the JNCSP-SEC – though I will be doing plenty of labbing for the JNCIE as I study for these two written exams. From there I’ll make a week-by-week study plan of what I want to learn and work out a pace to approach it, and only book the exam when I’m sure that I am entirely ready.

As I do this, I’ll be blogging regularly on some of the new technologies and concepts I will be learning – and would appreciate any feedback/corrections; much of this stuff will be very new and different for me!

I also am hoping to hear back on my JNCIE-ENT result in the next few weeks – and will post this as soon as I get it!

Thanks!

RPKI on Junos is easy!

My friend Michael Fincham did a great presentation a few weeks ago at NZNOG on RPKI. I spent a fair bit of time helping him get said presentation ready to go – as we did a bunch of testing on the MX implementation of RPKI. He made a good argument that as a community we are horribly bad at the security of our prefixes. We need to be ensuring we do more than just blindly trust our peers and check that they actually have the right to send us the prefixes they advertise. Of course we are all at some level aware of this, but this verification stuff is hard right… right?….surely?!?…

Michael and I spent some time prior to him doing his presentation playing around with RPKI on an MX80 in my lab. One of the things that came out of this is how truly easy it is to get a basic RPKI deployment going. Turning on validation in Junos only requires the following single line of config;

set routing-options validation group some-awesome-rpki-server session 1.2.3.4

Of course there are a bunch of other options available, such as the priority of different RPKI servers, timeouts, port numbers, source address, etc – but this is not complex stuff to configure….
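As a rough sketch of those extra knobs, the following adds a second, less-preferred server and tunes the timers. The second server address, source address, port, and timer values are placeholders I have picked for illustration, not recommendations, and the # lines are annotations only;

```
[edit routing-options validation]
# primary server, with an explicit port and source address
set group some-awesome-rpki-server session 1.2.3.4 port 8282
set group some-awesome-rpki-server session 1.2.3.4 local-address 10.0.0.1
set group some-awesome-rpki-server session 1.2.3.4 preference 200
# timers, in seconds
set group some-awesome-rpki-server session 1.2.3.4 refresh-time 120
set group some-awesome-rpki-server session 1.2.3.4 hold-time 240
# backup server; as I understand it, the higher preference is tried first
set group some-awesome-rpki-server session 5.6.7.8 preference 100
```

“show validation session” and “show validation database” are handy for confirming that the session is up and that records are arriving.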

At this point, you’ll start to see prefixes looking like this (if they are valid);

[email protected]_RPKI_DEMO_MX80> show route
[output omitted]
2.0.0.0/16         *[BGP/170] 1w3d 03:59:37, localpref 100
                      AS path: 23655 4648 2914 5511 3215 I, validation-state: valid
                    > to 111.69.18.246 via ge-1/1/8.0
[output omitted]

Or this (if they are not valid);

[email protected]_RPKI_DEMO_MX80> show route
[output omitted]
5.10.137.0/24      *[BGP/170] 1w3d 04:01:05, localpref 100
                      AS path: 23655 4648 4134 I, validation-state: invalid
                    > to 111.69.18.246 via ge-1/1/8.0
[output omitted]

Then you can start writing policy that matches on the following;

[email protected]# set policy-options policy-statement abc from validation-database ?  
Possible completions:
  invalid              Match for invalid database validation-state
  unknown              Match for unknown database validation-state
  valid                Match for valid database validation-state

The cool thing about the JUNOS implementation of this is that you take action based on policy – so you can do anything at all with this! This might just be attaching an “untrusted” community, or it might be doing more.
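To make this concrete, here is a minimal import policy sketch that tags invalid routes and prefers valid ones. The policy name, the community name and value (65000:666), and the local-preference numbers are all made up for illustration, and the # lines are annotations only;

```
[edit policy-options]
set community untrusted members 65000:666
# valid: record the state on the route and prefer it
set policy-statement rpki-check term valid from protocol bgp
set policy-statement rpki-check term valid from validation-database valid
set policy-statement rpki-check term valid then validation-state valid
set policy-statement rpki-check term valid then local-preference 110
set policy-statement rpki-check term valid then accept
# invalid: keep the route, but mark it and de-prefer it
set policy-statement rpki-check term invalid from protocol bgp
set policy-statement rpki-check term invalid from validation-database invalid
set policy-statement rpki-check term invalid then validation-state invalid
set policy-statement rpki-check term invalid then community add untrusted
set policy-statement rpki-check term invalid then local-preference 90
set policy-statement rpki-check term invalid then accept
# unknown: no matching record in the database, accept unchanged
set policy-statement rpki-check term unknown from protocol bgp
set policy-statement rpki-check term unknown from validation-database unknown
set policy-statement rpki-check term unknown then validation-state unknown
set policy-statement rpki-check term unknown then accept
```

This would then be applied as an import policy on the relevant BGP group or neighbor.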

One concern I had, which we spent some time playing around with before his presentation, was that verifying each route with RPKI would make the MX take longer to process routes. So we did a bit of testing. We had an MX80 in my lab (and we all know that the MX80 does not have a particularly awesome CPU!) with a full BGP feed from my network. We measured the time it took for the route-table to be populated, plus the time for it to push these routes to the forwarding-table before and after implementing RPKI (with a route policy applied on import from the BGP feed inspecting RPKI attributes). The result was that the load time for the full table increased by 300-400%. This is bad, but not unmanageable – especially if you are not validating the full table – but just a few specific peers.

If you are going to set this up in your network, you are likely going to want to use a local RPKI server. A good place to start getting info on some of this stuff would be the slide deck from Michael’s talk, which can be found here; http://hotplate.co.nz/archive/nznog/2014/rpki/

Please also see the video of the talk here; http://www.r2.co.nz/20140130/michael-f.htm

Another bit of interesting reading is about the deployment of RPKI on the IX in Ecuador of all places! Read here; http://iepg.org/2013-11-ietf88/RPKI-Ecuador-Experience-v2b-1.pdf

One of the key things to note with RPKI is that while it is far from the final solution at this point (few have their routes signed yet), it’s a very good step in the right direction. And for that reason, I think we should all be looking into at least validating our routes, and perhaps assigning a better local-preference to validated routes. Like IPv6, it takes all of us ‘buying in’ to get this to the point where a large proportion of the internet routing table can be successfully validated!

CoS for RE sourced traffic

Many of you will have deployed CoS extensively on your networks. One area of a Junos CoS deployment that I am often asked about by friends is how to manipulate traffic that is sourced from the routing-engine. There are multiple catches and caveats in dealing with this traffic, and different ways to manipulate this.

At a 10,000 foot level, whenever we deploy CoS we generally want to be able to manipulate route-engine sourced traffic such as ISIS, BGP, OSPF, BFD, RSVP, etc to have various different DSCP, EXP, and 802.1p markings. Firstly, we can set a policy as to the marking used for traffic sourced from the route-engine;

set class-of-service host-outbound-traffic dscp-code-point 111000
set class-of-service host-outbound-traffic ieee-802.1 default 111

You can also specify the forwarding-class that is used for processing traffic sourced from the route-engine;

set class-of-service host-outbound-traffic forwarding-class hoffs-odd-class

It’s important to note that your rewrite rules will not take effect with this traffic (by default). Even if you have specified a forwarding-class, the “host-outbound-traffic” markings will be applied outbound for this route-engine sourced traffic.

However, as of Junos 12.3, Juniper have implemented a new option for “host-outbound-traffic” on the MX, which causes the router to use the rewrite-rule for each unit to put markings onto traffic from the RE (based on the forwarding-class it is assigned). This is particularly helpful where you might have multiple fibre providers providing access to your customers, each with a different marking scheme that you are required to use. Note that this is only available for the 802.1p markings (not DSCP). This is done as follows;

set class-of-service host-outbound-traffic ieee-802.1 rewrite-rules

Of course a rewrite-rule must be configured on the outbound unit for this to have effect. So if we have a rewrite-rule to map “hoffs-odd-class” traffic to a marking of 010, the traffic will be now marked as 010 on egress.
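For completeness, here is a minimal sketch of the pieces this relies on. The rewrite-rule name (provider-a-8021p), the queue number, and the interface are assumptions for illustration; “hoffs-odd-class” and the 010 marking come from the example above, and the # lines are annotations only;

```
[edit class-of-service]
# the forwarding-class that RE-sourced traffic is assigned to
set forwarding-classes queue 4 hoffs-odd-class
set host-outbound-traffic forwarding-class hoffs-odd-class
# honour the per-unit 802.1p rewrite-rule for RE-sourced traffic
set host-outbound-traffic ieee-802.1 rewrite-rules
# map hoffs-odd-class to 802.1p 010 and attach the rule to the outbound unit
set rewrite-rules ieee-802.1 provider-a-8021p forwarding-class hoffs-odd-class loss-priority low code-point 010
set interfaces ge-1/2/3 unit 0 rewrite-rules ieee-802.1 provider-a-8021p
```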

This of course does not help us for DSCP markings (it only applies to 802.1p markings). Often we will want to manipulate these. Also, how would we approach this problem if we wanted to assign different forwarding-classes to different types of traffic being sourced from the RE? A great example of this is that while we might want to ensure that BGP is prioritised, we probably don’t need prioritisation of http traffic sourced from the RE!

The solution for this is quite clever. Most of you will know that you can firewall off all traffic to the RE (regardless of the IP it is destined to – even if that IP is on a physical interface) by applying an inbound firewall filter to the loopback. The clever thing is that you can also apply a firewall filter to all traffic leaving the RE by applying an outbound firewall filter to the loopback. If we want to ensure that all http/https traffic is put into the best-effort forwarding-class, we could do the following;

set interfaces lo0 unit 0 family inet filter output RE-QOS
set firewall family inet filter RE-QOS term web from protocol tcp
set firewall family inet filter RE-QOS term web from port http
set firewall family inet filter RE-QOS term web from port https
set firewall family inet filter RE-QOS term web then dscp be
set firewall family inet filter RE-QOS term web then forwarding-class best-effort
set firewall family inet filter RE-QOS term web then accept
set firewall family inet filter RE-QOS term catchall then accept

This is a pretty handy tool and allows us to do a fairly fine-grained manipulation of how each traffic-type being sourced by the RE is treated. Obviously you could customise this in any way to suit your needs. However it’s worth noting that to my understanding you cannot manipulate the 802.1p markings with a firewall filter – hence why the “rewrite-rules” option becomes so important for host-outbound-traffic.
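Building on the same filter, a sketch of prioritising BGP might look like this. The term name and the choice of cs6 are my assumptions, and the # lines are annotations only. Note the “insert” at the end: a newly added term lands after “catchall” and would otherwise never match;

```
[edit firewall family inet filter RE-QOS]
# match BGP (TCP port 179) as either source or destination port
set term bgp from protocol tcp
set term bgp from port bgp
# classify and mark it as network control
set term bgp then forwarding-class network-control
set term bgp then dscp cs6
set term bgp then accept
# new terms are appended, so move this one ahead of the catch-all
insert term bgp before term catchall
```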

If you thought that this is all there is to marking/classifying traffic sourced from the RE, you would be wrong! On an MX router, the processing of certain control traffic (such as BFD) is delegated to the individual line-cards. I have learned the hard way that the markings on this traffic are not modified by any configuration you apply to normal RE-sourced traffic.

The news is not all bad though, as there is an easy workaround for this, and not many protocols are distributed to the line cards. For this traffic, you can apply an outbound firewall filter to the interface the traffic is leaving on. As an example, here is how to ensure that BFD traffic which has been distributed to the line card is placed into the correct forwarding class and marked appropriately;

set interfaces ge-1/2/3 unit 0 family inet filter output cos-bfd-link
set firewall family inet filter cos-bfd-link term 1 from protocol udp
set firewall family inet filter cos-bfd-link term 1 from port 3784
set firewall family inet filter cos-bfd-link term 1 from port 3785
set firewall family inet filter cos-bfd-link term 1 then loss-priority low
set firewall family inet filter cos-bfd-link term 1 then forwarding-class network-control
set firewall family inet filter cos-bfd-link term 1 then dscp 111000
set firewall family inet filter cos-bfd-link term 2 then accept

Hope this helps!