By Hirotaka Yamamoto (@ymmt2005)
This is supplementary material for Modular, Pure Layer 3 Network for Kubernetes: The Implementation, covering the basics of the BIRD Internet Routing Daemon.
Although there is an official user's guide, this document aims to help readers understand the core concepts of BIRD more easily. Beyond the basics, it describes in depth how to cope with Invalid NEXT_HOP errors, along with other advanced topics.
- What is BIRD?
- Architecture
- Important features
- Protocols
- Troubleshooting
- Coping with Invalid NEXT_HOP errors
- Advanced topics
- Summary
What is BIRD?
BIRD is routing software that runs on Linux and other UNIX-like operating systems. It implements various routing protocols, including BGP, OSPF, and RIP. As of November 2019, the latest release is 2.0.7, and the maintained branches are 2.0 and 1.6.
The rest of this document is written for BIRD 2.0.
Resources
- Homepage
- User's guide 2.0
- Wiki - Sample configurations, FAQs, and so on.
Architecture
BIRD runs as a single process that holds several routing tables in memory, together with protocols that exchange routing information between a routing table and some other entity, such as another routing table, a kernel routing table, or an external router.
Let's look at them one by one.
Routes and routing tables
A routing table of BIRD is an in-memory collection of routes.
A route is a set of information for routing packets to a destination network. It consists of a destination network address (net), a router address often referred to as the next hop (gw), the source protocol instance that brought the route (proto), and attributes that depend on the routing protocol that brought the route. For example, a route originating from a BGP peer has a list of AS numbers (bgp_path), a local preference value (bgp_local_pref), and so on.
BIRD routing tables are not Linux kernel routing tables. A route in a kernel routing table essentially has only two attributes: the destination address and the next hop. To distinguish the two, this document calls a kernel routing table the "Forwarding Information Base", or FIB.
BIRD can have as many routing tables as you want. The following tables are predefined.
master4
The default routing table for IPv4.
master6
The default routing table for IPv6.
To create another routing table for IPv4 routes, add the following line to bird.conf:

    ipv4 table another_table;
Protocols and channels
A protocol connects a routing table with something else. If that something is another router, the protocol is one of the standard routing protocols, such as BGP or OSPF. Besides routers, it can be the FIB, another routing table, a set of static routes, and so on. For example, the Kernel protocol connects a routing table with the FIB to exchange routes between BIRD and the kernel.
A BIRD process can have several instances of a protocol. For example, a BGP instance must be created for each peer router. To distinguish instances of a protocol, each protocol instance has a unique name. The following configuration creates two BGP protocol instances named tor1 and tor2, respectively.

    protocol bgp tor1 {
        local AS 65000;
        neighbor 10.20.30.40 AS 65001;
        ...
    }

    protocol bgp tor2 {
        local AS 65000;
        neighbor 10.20.30.41 AS 65001;
        ...
    }
A protocol instance may have channels. A channel represents how to export or import routes for a network type such as IPv4 or IPv6. Available channels vary by protocol. BGP can have both ipv4 and ipv6 channels. RIP can have either one of ipv4 or ipv6. Special protocols like BFD cannot have channels.
Each channel is associated with a routing table. If not specified, the ipv4 channel is associated with the master4 table, and the ipv6 channel is associated with the master6 table.
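As a minimal sketch of attaching a channel to a non-default table, the fragment below declares a table and a Static instance whose ipv4 channel feeds it. The instance name static_alt and the prefix 192.0.2.0/24 are illustrative assumptions, not part of any real setup.

```bird
ipv4 table another_table;

protocol static static_alt {
    ipv4 {
        table another_table;         # without this line the channel would use master4
    };
    route 192.0.2.0/24 blackhole;    # example route that ends up in another_table
}
```

With this in place, the route should be listed in another_table and not in master4.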
The next snippet is a configuration of a BGP instance that exchanges IPv4 routes between an eBGP peer running at 10.20.30.40 and the master4 routing table.
It imports all IPv4 routes from the peer router and exports all routes in master4 excluding static routes defined in static1.

    protocol bgp tor1 {
        local AS 65000;
        neighbor 10.20.30.40 AS 65001;
        ipv4 {
            import all;
            export filter {
                if proto = "static1" then reject;
                accept;
            };
        };
    }
If a channel uses the default configuration, it can be abbreviated as follows:

    protocol bgp tor1 {
        # snip
        ipv4;
    }
Except for eBGP, the default channel configuration imports all routes and exports nothing.
Important features
Protocol template
A BGP router with many peers often has nearly identical configurations across its protocol instances. BIRD has a feature called protocol template to simplify such configurations, as shown below.

    template bgp tor {
        local AS 65000;
        rr client;
        ipv4 {
            import all;
            export filter {
                if proto = "static1" then reject;
                accept;
            };
        };
    }

    protocol bgp tor1 from tor {
        neighbor IP_OF_TOR1 AS 65001;
    }

    protocol bgp tor2 from tor {
        neighbor IP_OF_TOR2 AS 65001;
    }

Here, BGP instances tor1 and tor2 inherit the same configuration from template tor.
Route filter
A route filter is a function to filter routes based on their attributes. It takes a route and determines whether to reject or accept the route.
Route attributes can be referenced as variables such as gw or proto in a route filter function.
Some attributes can be edited; for example, rewriting the dest attribute to RTD_UNREACHABLE causes packets matching the route to be answered with an ICMP unreachable message.
Control structures like if and various data types, including sets, are available in route filters, which enables highly flexible routing control.
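The points above can be combined in a named filter. The following is only a sketch: the filter name example_import and the prefixes are made up for illustration, and it relies on the filter syntax described in the BIRD 2.0 user's guide.

```bird
# Hypothetical filter; 192.0.2.0/24 is a documentation range used only as an example.
filter example_import {
    # Reject the IPv4 default route outright.
    if net = 0.0.0.0/0 then reject;

    # Accept 192.0.2.0/24 and its more specific prefixes (a prefix set),
    # but mark them unreachable instead of forwarding them.
    if net ~ [ 192.0.2.0/24+ ] then {
        dest = RTD_UNREACHABLE;
        accept;
    }

    # Everything else is accepted unchanged.
    accept;
}
```

A channel would reference it by name, for example with import filter example_import; inside an ipv4 block.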
Functions
Aside from route filters, pure functions can be defined.

    function with_parameters (int parameter)
    int local_variable;
    {
        local_variable = 5;
        return parameter + local_variable;
    }
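A function becomes useful once a filter calls it. The sketch below assumes, as the user's guide examples do, that a function invoked from a filter can read the attributes of the route being processed; the names is_default_route and drop_default are hypothetical.

```bird
# Hypothetical helper: true when the route being processed is the IPv4 default route.
function is_default_route()
{
    return net = 0.0.0.0/0;
}

# Hypothetical filter that uses the helper.
filter drop_default {
    if is_default_route() then reject;
    accept;
}
```

Keeping the test in a function lets several filters share the same condition.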
Protocols
Only protocols used in Project Neco are described below.
Device
Device is a special protocol that has no channels. It makes BIRD scan the network devices of the OS.
One Device protocol instance should be included in bird.conf.

    protocol device {
    }
BFD
BFD is, again, a special protocol that has no channels. It is always used alongside other routing protocols such as BGP to detect link failures very quickly.
The only BFD setting you normally need to care about is the interval of keep-alive packets.

    protocol bfd {
        interface "*" {
            interval 50 ms;
        };
    }
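If the receive and transmit intervals need to differ, or the number of missed packets tolerated needs tuning, the user's guide lists separate options for them. The values below are arbitrary examples, not recommendations, and this block is an alternative to the one above rather than an addition to it.

```bird
protocol bfd {
    interface "*" {
        min rx interval 50 ms;    # minimum interval at which this side accepts packets
        min tx interval 50 ms;    # minimum interval at which this side sends packets
        multiplier 3;             # missed packets tolerated before the session is declared down
    };
}
```

With these example values a failure would be detected after roughly 150 ms (interval times multiplier), assuming the peer negotiates the same timers.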
Direct
The Direct protocol imports routes that the kernel generates automatically.
Suppose that a network interface eth0 has an IP address 192.168.16.3/24. A route for the network of this address (192.168.16.0/24) is automatically registered with the FIB to route packets for this network via eth0. BIRD ignores this kind of route because there is usually another router that owns the layer-2 network and advertises itself as the destination.
In some cases, however, there is no such router. For example, one can assign a representative address of a node to a dummy interface as follows:

    $ sudo ip link add node0 type dummy
    $ sudo ip address add 192.168.16.3/32 dev node0
In this case, BIRD needs to advertise the address. Direct can be used to import the address from the dummy interface into a routing table:

    protocol direct direct1 {
        ipv4;
        interface "node0";
    }
Routes imported by Direct should not be re-exported to FIB. To filter routes from Direct, use an export filter as follows:

    protocol kernel {
        ipv4 {
            export filter {
                if proto = "direct1" then reject;
                accept;
            };
        };
    }
Kernel
The Kernel protocol exchanges routes between a BIRD routing table and a kernel routing table (FIB).
Two instances of the Kernel protocol can share neither a BIRD routing table nor a FIB. To exchange routes between two BIRD routing tables, use Pipe as described later.

    ipv4 table alt_v4tab;

    # Export routes in alt_v4tab table to FIB ID 8
    protocol kernel {
        kernel table 8;
        ipv4 {
            table alt_v4tab;
            export all;
        };
    }

    # Export routes in master4 table to the main FIB (table ID 254 in Linux)
    # Import alien routes in the main FIB to master4 by "learn"
    protocol kernel {
        learn;
        persist;
        ipv4 {
            import all;
            export all;
        };
    }

FIB ID 8 can be used to route packets by adding a table lookup rule with the ip rule command:

    $ sudo ip rule add priority 100 from all lookup 8
Pipe
Pipe protocol exchanges routes between two BIRD routing tables.
Because the two tables must be of the same network type (e.g., IPv4), channel configurations in Pipe omit the network type keyword (e.g., ipv4).

    # A table to collect all IPv4 routes
    ipv4 table bgp_v4tab;

    # Import routes in master4 to bgp_v4tab
    protocol pipe {
        table bgp_v4tab;
        peer table master4;
    }

    # Import routes in alt_v4tab to bgp_v4tab
    protocol pipe {
        table bgp_v4tab;
        peer table alt_v4tab;
    }
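The direction of a pipe is easy to get backwards, so it can help to spell it out. The sketch below is an explicit equivalent of the first pipe above, assuming the channel defaults mentioned earlier (import all, export none); the instance name master_to_bgp is hypothetical.

```bird
ipv4 table bgp_v4tab;

protocol pipe master_to_bgp {
    table bgp_v4tab;       # primary table of this pipe
    peer table master4;    # secondary (peer) table
    import all;            # peer table (master4) -> table (bgp_v4tab)
    export none;           # nothing flows from bgp_v4tab back into master4
}
```

Replacing all or none with a filter restricts which routes cross the pipe in that direction.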
Static
The Static protocol defines so-called static routes and imports them into a routing table.

    protocol static {
        ipv4;                              # import to master4 table
        route 0.0.0.0/0 via 192.168.0.1;   # default gateway
    }
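Static routes are not limited to a router next hop. The following sketch shows three forms listed in the user's guide; the instance name, the prefixes (documentation ranges), and the interface name eth0 are assumptions for illustration.

```bird
protocol static static_examples {
    ipv4;                                   # routes go to master4
    route 192.0.2.0/24 via 192.168.0.1;     # next hop is a router address
    route 198.51.100.0/24 via "eth0";       # next hop is an interface
    route 203.0.113.0/24 unreachable;       # packets are answered with ICMP unreachable
}
```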
BGP
A BGP protocol instance needs to be created for each neighbor (peer) router.
The default configuration values are different for iBGP and eBGP. Read the manual carefully.
Use the aforementioned protocol template to reduce repetitive instance configurations.
Notes on configurations:
- bfd: To use BFD along with BGP, just write this:

      protocol bgp {
          bfd;
          # snip
      }

- passive: Do not initiate a connection to the peer router.
- rr client: Mark the peer as a route reflector client.
- add paths: Enable the ADD-PATH extension.
- direct: Allow direct (layer-2) connection to the peer router for iBGP.
- multihop: Allow indirect connection to the peer router for eBGP.
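These options can be combined in one instance. The following is only a sketch of a route reflector's session toward a directly connected iBGP client; the instance name rr_client1 and the neighbor address are made up, and it assumes add paths is set inside the channel, as the BIRD 2.0 user's guide describes.

```bird
protocol bgp rr_client1 {
    local as 65000;
    neighbor 10.20.30.50 as 65000;    # same AS on both sides, so this is iBGP
    direct;                           # the client is on the same layer-2 network
    bfd;                              # detect link failures with BFD
    passive;                          # wait for the client to open the connection
    rr client;                        # treat the peer as a route reflector client
    ipv4 {
        import all;
        export all;
        add paths;                    # enable the ADD-PATH extension
    };
}
```

Because passive makes this side wait for the connection, the peer must not be passive as well.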
Troubleshooting
Diagnosing internal states
Use the birdc command to diagnose routing tables or protocol instances.
To show routes in a routing table:

    bird> show route
    Table master4:
    0.0.0.0/0            unicast [tor1 18:55:10.191] * (100) [AS65000i]
            via 10.69.64.1 on eth0
                         unicast [tor2 18:55:09.946] (100) [AS65000i]
            via 10.69.128.1 on eth1
    10.69.0.5/32         unicast [tor1 18:55:07.808] * (100) [i]
            via 10.69.64.1 on eth0
                         unicast [tor2 18:55:07.609] (100) [i]
            via 10.69.128.1 on eth1
    10.69.0.4/32         unicast [tor1 18:55:10.903] * (100) [i]
            via 10.69.64.1 on eth0
                         unicast [tor2 18:55:11.059] (100) [i]
            via 10.69.128.1 on eth1
    10.69.0.3/32         unicast [direct1 18:55:03.687] * (240)
            dev node0
To show protocol instances:

    bird> show protocols
    Name       Proto      Table      State  Since         Info
    device1    Device     ---        up     18:16:10.826
    bfd1       BFD        ---        up     18:16:10.826
    defaultgw  Static     master4    up     18:16:10.826
    kernel1    Kernel     master4    up     18:16:10.826
    rack0-tor1 BGP        ---        up     18:16:14.081  Established
    rack0-tor2 BGP        ---        up     18:16:14.686  Established
To show a BGP protocol instance in detail:

    bird> show protocols all 'rack0-tor1'
    Name       Proto      Table      State  Since         Info
    rack0-tor1 BGP        ---        up     18:16:14.081  Established
      BGP state:          Established
        Neighbor address: 10.0.1.1
        Neighbor AS:      64600
        Neighbor ID:      10.0.1.1
        Local capabilities
          Multiprotocol
            AF announced: ipv4
          Route refresh
          Graceful restart
          4-octet AS numbers
          Enhanced refresh
        Neighbor capabilities
          Multiprotocol
            AF announced: ipv4
          Route refresh
          Graceful restart
          4-octet AS numbers
          Enhanced refresh
        Session:          external AS4
        Source address:   10.0.1.0
        Hold timer:       177.403/240
        Keepalive timer:  8.268/80
      Channel ipv4
        State:          UP
        Table:          master4
        Preference:     100
        Input filter:   ACCEPT
        Output filter:  ACCEPT
        Routes:         4 imported, 2 exported
        Route change stats:     received   rejected   filtered    ignored   accepted
          Import updates:              4          0          0          0          4
          Import withdraws:            0          0        ---          0          0
          Export updates:              7          4          0        ---          3
          Export withdraws:            0        ---        ---        ---          1
        BGP Next hop:   10.0.1.0
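Take a look at the rejected count of Export updates under Route change stats in Channel ipv4.
A non-zero value means that some routes were rejected on export and never sent to the peer router. These are routes that the protocol itself refused to export, for example because BGP does not advertise a route back to the peer it was learned from; the log in the next section reports them as "rejected by protocol".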
Logging
Use the birdc command to investigate activities of a protocol instance.

    bird> debug 'rack0-tor1' all

The activities will be logged as follows:

    2018-04-20 19:14:38.143 <TRACE> rack0-tor1: BGP session established
    2018-04-20 19:14:38.143 <TRACE> rack0-tor1: State changed to up
    2018-04-20 19:14:38.143 <TRACE> rack0-tor1: Sending KEEPALIVE
    2018-04-20 19:14:38.143 <TRACE> rack0-tor1 < added 0.0.0.0/0 unicast
    2018-04-20 19:14:38.143 <TRACE> rack0-tor1 < added 10.69.0.5/32 unicast
    2018-04-20 19:14:38.143 <TRACE> rack0-tor1 < added 10.69.0.4/32 unicast
    2018-04-20 19:14:38.143 <TRACE> rack0-tor1 < added 10.69.128.0/26 unicast
    2018-04-20 19:14:38.143 <TRACE> rack0-tor1 < added 10.69.0.3/32 unicast
    2018-04-20 19:14:38.144 <TRACE> rack0-tor1: Sending UPDATE
    2018-04-20 19:14:38.144 <TRACE> rack0-tor1: Sending UPDATE
    2018-04-20 19:14:38.144 <TRACE> rack0-tor1: Sending END-OF-RIB
    2018-04-20 19:14:38.144 <TRACE> rack0-tor1: Got UPDATE
    2018-04-20 19:14:38.144 <TRACE> rack0-tor1 > added [best] 10.69.64.0/26 unicast
    2018-04-20 19:14:38.144 <TRACE> rack0-tor1 < rejected by protocol 10.69.64.0/26 unicast
Debugging with print
The print function is useful to display various information from within export/import filters.

    ipv4 {
        export filter {
            print "route: ", net, ", ", from, ", ", proto, ", ", bgp_next_hop;
            accept;
        };
    };
Coping with Invalid NEXT_HOP errors
The following log line appears when BIRD receives a route with an unreachable NEXT_HOP address.

    2018-04-20 19:14:38.253 <RMT> tor2: Invalid NEXT_HOP attribute
This problem typically happens when a route from an eBGP peer is propagated to an iBGP peer as follows:
- A router is connected with an eBGP peer using a /31 subnet as a point-to-point link.
- When it receives a route from the eBGP peer, the NEXT_HOP of the route is set to an address in the subnet.
- The router sends the route to an iBGP peer without changing NEXT_HOP.
- The iBGP peer rejects the route because it neither has an address in the /31 subnet nor knows how to reach it.
There are several options to avoid this problem, as described in each subsection.
Pre-register the point-to-point subnet
Use static routes or an IGP such as OSPF or RIP to inform iBGP peers of the point-to-point subnet.
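For the static route variant, a minimal sketch on the iBGP peer might look like the following. The subnet 10.0.1.0/31 and the gateway 10.69.64.1 are placeholder addresses, the instance name p2p_subnets is hypothetical, and the sketch assumes the BGP channel on this peer resolves next hops through the table that receives this route.

```bird
# On the iBGP peer: tell BIRD how to reach the point-to-point subnet.
# 10.0.1.0/31 and 10.69.64.1 are placeholders; the gateway must be a
# directly reachable router that can forward to the subnet.
protocol static p2p_subnets {
    ipv4;                                # route goes to master4
    route 10.0.1.0/31 via 10.69.64.1;
}
```

This has to be repeated for every point-to-point subnet, which is why an IGP scales better when there are many eBGP links.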
Use next hop self to rewrite NEXT_HOP
BGP routers, including BIRD, generally have a feature often referred to as next hop self to cope with this well-known problem.
When enabled, next hop self rewrites the NEXT_HOP attribute to the address of the sending router for iBGP peers, effectively hiding unknown addresses.
This feature has a possibly undesirable side effect. When the sending router is a route reflector, it also rewrites NEXT_HOP on routes that are reflected from one iBGP peer to other iBGP peers. This prevents those iBGP peers from communicating directly even when they are able to.
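In BIRD 2.0 the option is set per channel on the sending router. The fragment below is a sketch with a made-up instance name and neighbor address.

```bird
protocol bgp ibgp_peer1 {
    local as 65000;
    neighbor 10.20.30.50 as 65000;    # iBGP peer (same AS)
    ipv4 {
        import all;
        export all;
        next hop self;    # advertise this router's own address as NEXT_HOP
    };
}
```

The user's guide for recent 2.0 releases also describes optional ibgp and ebgp keywords after next hop self that limit the rewrite to routes learned from one kind of peer, which may avoid the route reflector side effect. Check the guide for your version before relying on them.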
Use BIRD's import filter to rewrite NEXT_HOP
Unlike next hop self, this is a receiver-side solution.
BIRD normally looks up a route for NEXT_HOP in the routing table that is connected by a channel. With igp table, however, BIRD can be configured to look up in a different routing table.
The following configuration abuses igp table to accept routes having any NEXT_HOP address, then rewrites NEXT_HOP by an import filter.
The import filter is careful not to rewrite the NEXT_HOP address if the address is directly reachable.

    # A dummy IGP routing table with a default gateway to accept any `NEXT_HOP`.
    ipv4 table dummytab;

    protocol static dummystatic {
        ipv4 {
            table dummytab;
        };
        route 0.0.0.0/0 via "lo";
    }

    protocol bgp {
        # snip
        ipv4 {
            igp table dummytab;   # reference the dummy table
            gateway recursive;    # look up the dummy table even when the peer is directly connected
            import filter {
                # If the NEXT_HOP address is in the same subnet as the sending router, use NEXT_HOP as is.
                # - "gw" is the generic NEXT_HOP for BIRD.
                # - 26 is just an example subnet mask length.
                if bgp_next_hop.mask(26) = from.mask(26) then {
                    gw = bgp_next_hop;
                    accept;
                }
                # Otherwise, rewrite NEXT_HOP ("gw") to the address of the sending router ("from").
                gw = from;
                accept;
            };
        };
    }
Advanced topics
Management IP address
A router often has a so-called management IP address in addition to the addresses assigned to physical interfaces. Other switches or servers can reach the router with the router's management IP address.
To implement a management IP address on Linux, use a dummy interface as follows:

    $ ADDR=10.20.30.40/32  # the management address of this server
    $ sudo ip link add node0 type dummy
    $ sudo ip address add $ADDR dev node0

To advertise the management IP address with BIRD, use the Direct protocol.

    # Import the management address into master4 table.
    protocol direct direct1 {
        ipv4;
        interface "node0";
    }

    protocol kernel {
        # Do not re-export routes from Direct protocol
        ipv4 {
            export filter {
                if proto = "direct1" then reject;
                accept;
            };
        };
    }
Route summarization
Route summarization is an optimization for BGP routers: several specific prefixes are replaced with one covering prefix so that fewer routes are advertised.
Unfortunately, BIRD does not have an easy way to summarize the AS_PATH of routes.
A workaround is to use another routing table with static routes to advertise virtually summarized routes.

    ipv4 table outertab;

    # Pre-calculated summarized routes
    protocol static myroutes {
        ipv4 {
            table outertab;
        };
        route ...;
    }

    protocol bgp outerpeer {
        local as ...;
        neighbor ADDRESS as ...;

        # Advertise routes in the alternative table.
        ipv4 {
            table outertab;
            import all;
            export all;
            next hop self;
        };
    }

    # Routes from external routers are piped to master4.
    protocol pipe outerroutes {
        table master4;
        peer table outertab;
        import filter {
            if proto = "myroutes" then reject;
            accept;
        };
        export none;
    }
Summary
This article introduces the essential features of BIRD and describes some advanced topics.
BIRD's excellent architecture - multiple routing tables, protocols, and channels that connect them - allows you to implement highly flexible routing rules.