The Border Gateway Protocol (BGP) is an exterior routing protocol used for exchanging routing information between autonomous systems. BGP is used for exchange of routing information between multiple transit autonomous systems as well as between transit and stub autonomous systems. BGP is related to EGP, but has more capability, greater flexibility, and less required bandwidth. BGP uses path attributes to provide more information about each route, and in particular to maintain an AS path, which includes the AS number of each autonomous system the route has transited, providing information sufficient to prevent routing loops in an arbitrary topology. Path attributes may also be used to distinguish between groups of routes to determine administrative preferences, allowing greater flexibility in determining route preference to achieve a variety of administrative ends.
BGP supports two basic types of sessions between neighbours, internal (sometimes referred to as IBGP) and external. Internal sessions are run between routers in the same autonomous system, while external sessions run between routers in different autonomous systems. When sending routes to an external peer, the local AS number is prepended to the AS path. This means that routes received from an external peer are guaranteed to have the AS number of that peer at the start of the path. In general, routes received from an internal neighbour will not have the local AS number prepended to the AS path, and hence will have the same AS path that the route had when the originating internal neighbour received the route from an external peer. Routes with no AS numbers in the path may be legitimately received from internal neighbours. These routes should be considered internal to the receiver's own AS.
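The AS path handling described above can be sketched as follows. This is a minimal illustration, not GateD's code: the function names and the list representation of an AS path are assumptions made for the example, and the loop check is the conventional test of looking for the local AS in a received path.

```python
def advertise_external(as_path, local_as):
    """AS path as sent to an external peer: the local AS is prepended."""
    return [local_as] + list(as_path)


def advertise_internal(as_path):
    """AS path as sent to an internal peer: passed on unchanged."""
    return list(as_path)


def has_loop(as_path, local_as):
    """True if the local AS already appears in a received AS path."""
    return local_as in as_path


# A route originated inside AS 65001 starts with an empty path.
path_at_65002 = advertise_external([], 65001)             # [65001]
path_at_65003 = advertise_external(path_at_65002, 65002)  # [65002, 65001]
```

In this sketch a path received from an external peer always starts with that peer's AS number, and a path that has only travelled over internal sessions may be empty, matching the two cases described above.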
The BGP implementation supports three versions of the BGP protocol, versions 2, 3 and 4. BGP versions 2 and 3 are quite similar in capability and function. They will only propagate classed network routes, and the AS path is a simple array of AS numbers. BGP 4 will propagate fully general address-and-mask routes, and the AS path structure can represent the results of aggregating dissimilar routes.
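To illustrate the difference between classed routes and general address-and-mask routes, the sketch below tests whether a prefix has the natural mask of its class A, B or C network. It assumes that "classed" means exactly this natural mask; the function names are hypothetical and are not part of GateD.

```python
import ipaddress


def classful_prefix_length(address):
    """Natural prefix length for a class A, B or C IPv4 address."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet < 128:
        return 8    # class A
    if first_octet < 192:
        return 16   # class B
    if first_octet < 224:
        return 24   # class C
    raise ValueError("class D and E addresses have no natural network mask")


def is_classed_route(prefix):
    """True if the prefix carries the natural mask of its address class."""
    network = ipaddress.ip_network(prefix)
    return network.prefixlen == classful_prefix_length(str(network.network_address))
```

Under this assumption, 172.16.0.0/16 could be carried by any of the three versions, while 10.1.0.0/16 needs the address-and-mask routes of BGP 4.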
External BGP sessions may or may not include a single metric, which BGP calls the Multi-Exit Discriminator, among the path attributes. For BGP versions 2 and 3, this metric is a 16-bit unsigned integer. For BGP version 4, it is a 32-bit unsigned integer. Smaller values of the Multi-Exit Discriminator are preferred. Currently this metric is only used to break ties between routes with equal preference from the same neighbouring AS.
Internal BGP sessions carry at least one metric in the path attributes, which BGP calls the LocalPref. The range of LocalPref is identical to the range of the MED. For BGP versions 2 and 3, a route is preferred if its value for LocalPref is smaller. For BGP version 4, a route is preferred if its value for this metric is larger. BGP version 4 internal sessions may optionally include a second metric, the Multi-Exit Discriminator, carried in from external sessions. The use of these metrics is dependent on the type of internal protocol processing which is specified.
BGP collapses routes with similar path attributes into a single update for advertisement. Routes that are received in a single update will be readvertised in a single update. The churn caused by the loss of a neighbor will be minimized and the initial advertisement sent during peer establishment will be maximally compressed. BGP does not read information from the kernel message-by-message, but fills the input buffer. It processes all complete messages in the buffer before reading again. BGP also does multiple reads to clear all incoming data queued on the socket. This feature may cause other protocols to be blocked for prolonged intervals by a busy peer connection.
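The buffered input processing can be pictured with the following sketch, which extracts every complete message from an input buffer and leaves any trailing partial message for the next read. It is an illustration only: the function name is hypothetical, and the 19-octet header layout (16-octet marker, 2-octet length, 1-octet type) is taken from the BGP 4 specification rather than from this document.

```python
import struct

HEADER_LEN = 19  # 16-octet marker + 2-octet length + 1-octet type


def split_complete_messages(buffer):
    """Return (messages, remainder) for a buffer of received bytes.

    messages holds every complete BGP message found, in order;
    remainder holds the bytes of a trailing, incomplete message.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER_LEN:
        # The length field counts the whole message, header included.
        (length,) = struct.unpack_from("!H", buffer, offset + 16)
        if length < HEADER_LEN:
            raise ValueError("message length shorter than the header")
        if len(buffer) - offset < length:
            break  # partial message: wait for the next read
        messages.append(buffer[offset:offset + length])
        offset += length
    return messages, buffer[offset:]
```

A caller would keep appending socket data to the remainder and call the function again, so that each read is followed by processing of all messages that are complete at that point.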
All unreachable messages are collected into a single message and sent prior to reachable routes during a flash update. For these unreachable announcements, the next hop is set to the local address on the connection, no metric is sent, and the path origin is set to incomplete. On external connections the AS path in unreachable announcements is set to the local AS. On internal connections, the AS path is set to length zero.
The BGP implementation expects external peers to be directly attached to a shared subnet, and expects those peers to advertise next hops which are host addresses on that subnet (though this constraint can be relaxed by configuration for testing). For groups of internal peers, however, there are several alternatives which may be selected from by specifying the group type and route reflection options. Type internal groups expect all peers to be directly attached to a shared subnet so that, like external peers, the next hops received in BGP advertisements may be used directly for forwarding. Type routing groups instead will determine the immediate next hops for routes by using the next hop received with a route from a peer as a forwarding address, and using this to look up an immediate next hop in an IGP's routes. Such groups support distant peers, but need to be informed of the IGP whose routes they are using to determine immediate next hops.
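The next hop resolution performed for type routing groups can be sketched as a longest-prefix lookup of the received BGP next hop in the IGP's routes. The table layout (a dictionary of prefix to immediate next hop) and the function name are assumptions made for this example and do not reflect GateD's internal data structures.

```python
import ipaddress


def resolve_immediate_next_hop(bgp_next_hop, igp_routes):
    """Look up the forwarding address in the IGP routes.

    igp_routes maps prefix strings to immediate next hop addresses.
    Returns the next hop of the most specific covering route, or None
    if no IGP route covers the forwarding address.
    """
    address = ipaddress.ip_address(bgp_next_hop)
    best = None
    for prefix, gateway in igp_routes.items():
        network = ipaddress.ip_network(prefix)
        if address in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, gateway)
    return best[1] if best else None
```

With IGP routes for 10.0.0.0/8 and 10.1.0.0/16, a BGP next hop of 10.1.2.3 resolves through the /16 route, which is why such groups can support distant peers.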
For internal BGP group types (and for test groups), where possible, a single outgoing message is built for all group peers based on the common policy. A copy of the message is sent to every peer in the group, with possible adjustments to the next-hop field as appropriate to each peer. This minimizes the computational load of running large numbers of peers in these types of groups. BGP allows unconfigured peers to connect if an appropriate group has been configured with an allow clause.
Generally, all border routers in a single AS need to be internal peers of each other, and in fact all non-border routers frequently need to be internal peers of all border routers. While this is usually acceptable in small networks, it may lead to unacceptably large internal peer groups in large networks. To help address this problem, BGP supports route reflection for internal peer groups (with BGP version 4 only). When using route reflection, the rule that a router may not re-advertise routes from internal peers to other internal peers is relaxed for some routers, called route reflectors. A typical use of route reflection might involve a "core" backbone of fully meshed routers ("fully meshed" means all the routers in the fully meshed group peer directly with all other routers in the group), some of which act as route reflectors for routers which are not part of the core group.
Two types of route reflection are supported. By default, all routes received by the route reflector from a client are sent to all internal peers (including the client's group but not the client itself). If the no-client-reflect option is enabled, routes received from a route reflection client are sent only to internal peers which are not members of the client's group. In this case, the client's group must itself be fully meshed. In either case, all routes received from a non-client internal peer are sent to all route reflection clients.
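The three reflection rules above can be sketched as a single function that returns the internal peers to which a reflector re-advertises a route. The representation (a mapping of peer name to group name, plus sets naming the client groups and the groups with no-client-reflect enabled) is hypothetical and chosen only for the example.

```python
def reflect_targets(source, peers, client_groups, no_client_reflect=frozenset()):
    """Internal peers that receive a route learned from internal peer source.

    peers maps each internal peer name to its group name.
    client_groups is the set of groups configured as reflector clients.
    no_client_reflect is the subset of those groups with that option set.
    """
    source_group = peers[source]
    if source_group in client_groups:
        if source_group in no_client_reflect:
            # Only internal peers outside the client's own group.
            return sorted(p for p, g in peers.items() if g != source_group)
        # All internal peers except the originating client itself.
        return sorted(p for p in peers if p != source)
    # Learned from a non-client internal peer: all reflection clients.
    return sorted(p for p, g in peers.items() if g in client_groups)
```

For example, with clients c1 and c2 in one group and core1 and core2 in a non-client group, a route from c1 is sent to c2, core1 and core2 by default, but only to core1 and core2 when no-client-reflect is enabled for the client group.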
Typically, a single router will act as the reflector for a set, or cluster, of clients. However, for redundancy two or more may also be configured to be reflectors for the same cluster. In this case, a cluster ID should be selected to identify all reflectors serving the cluster, using the clusterid keyword. Gratuitous use of multiple redundant reflectors is not advised, as it can lead to an increase in the memory required to store routes on the redundant reflectors' peers.
No special configuration is required on the route reflection clients. From a client's perspective, a route reflector is simply a normal IBGP peer. Any BGP version 4 speaker should be able to be a reflector client. (Note, however, that gated versions 3.5B3 and earlier, and 3.6A1 and earlier, contain a bug which prevents them from acting as route reflection clients.)
Readers are referred to the route reflection specification document (RFC 1966 as of this writing) for further details.
The Communities attribute allows the administrator of a Routing Domain to tag groups of routes with a community tag. The tag consists of 2 octets of Autonomous System (AS) and 2 octets of Community ID. The Community attribute is passed from routing domain to routing domain to maintain the grouping of these routes. A set of routes may have more than one community tag in its Community attribute.
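The layout of a community tag can be illustrated with a short sketch that packs and unpacks the two 2-octet fields. The function names are hypothetical; the AS-first field order follows the description above, and network byte order is assumed from the Communities specification.

```python
import struct


def pack_community(as_number, community_id):
    """Encode one community tag as 4 octets: AS number, then Community ID."""
    return struct.pack("!HH", as_number, community_id)


def unpack_communities(attribute_value):
    """Decode a Community attribute value into (AS, Community ID) pairs."""
    if len(attribute_value) % 4:
        raise ValueError("attribute length is not a multiple of 4 octets")
    return [
        struct.unpack_from("!HH", attribute_value, offset)
        for offset in range(0, len(attribute_value), 4)
    ]
```

Because each tag occupies a fixed 4 octets, a set of routes carrying several community tags simply has them concatenated in the attribute value.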
Communities import and export policy is configured using the aspath-opt clause (or mod-aspath clause) to the group, import and export statements.
Please refer to the Communities specification and its accompanying usage document (RFC 1997 and RFC 1998 as of this writing) for further details on BGP Communities.
The Multi Exit Discriminator, or MED, allows the administrator of a Routing Domain to choose between various exits from a neighboring AS. This attribute is used only for decision making in choosing the best route to the neighboring AS. If all the other factors for a path to a given AS are equal, the path with the lower MED value takes preference over other paths.
This attribute is not propagated to other neighboring ASs. It may, however, be propagated to other BGP speakers within the same AS.
The MED attribute, for BGP version 4, is a four-octet unsigned integer.
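The MED comparison can be sketched as a tie-breaker that applies only to routes learned from the same neighboring AS. The dictionary keys and the function name are hypothetical. The treatment of a missing MED is a simplification made for this sketch (no decision is taken) and should not be read as GateD's behaviour.

```python
def prefer_by_med(route_a, route_b):
    """Return the route preferred by MED, or None if MED does not decide.

    Each route is a dict with 'neighbor_as' and 'med' (an int, or None
    when no MED was received). The caller is assumed to have found the
    two routes equal on all earlier criteria.
    """
    if route_a["neighbor_as"] != route_b["neighbor_as"]:
        return None  # MED only compares exits from the same neighboring AS
    if route_a["med"] is None or route_b["med"] is None:
        return None  # sketch-only simplification
    if route_a["med"] == route_b["med"]:
        return None  # still tied: the next criterion must decide
    return route_a if route_a["med"] < route_b["med"] else route_b
```

A smaller value wins, so of two otherwise equal routes from AS 65002 with MEDs of 10 and 20, the one with MED 10 is selected.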
MED is originated using the metricout option of the export, group and/or peer statement. It is imported using the med keyword on the BGP group statement.
BGP selects the best path to an AS from all the known paths and propagates the selected path to its neighbors. Gated uses the following criteria, in order, to select the best path. If routes are equal at a given point in the selection process, then the next criterion is applied to break the tie.
Routes propagated by IBGP must include a Local_Pref attribute. By default, BGP sends the Local_Pref path attribute as 100, and ignores it on receipt. In effect, this causes rule #2 above to be ignored.
GateD BGP does not use Local_Pref as a route-preference decision maker unless the setpref option has been set. For Routing- or Internal-type groups, the setpref option allows gated's global protocol preference to be exported into Local_Pref and allows Local_Pref to be used for gated's route selection preference. Note that the setpref option is the only way for gated to send a route with a given local_pref. The local_pref is never set directly, but rather as a function of the gated preference and setpref metrics.
The translation of gated's internal preference to and from Local_Pref is done as follows. In the table below, metric is the argument to setpref, e.g., in the statement, "setpref 100", metric is 100. "Exported preference" is the gated preference of the exported route. "Imported preference" is the gated preference assigned to the imported route.
| Exported Preference | Local_Pref | Imported Preference |
| less than metric | 254 | metric |
| metric ... 254 | 254 ... metric | metric ... 254 |
| N/A | greater than 254 | metric |
In effect, any gated preference of less than metric is exported such that it will be re-imported with a preference of exactly metric. Any preference of metric or above will be exported such that it will be re-imported with the same preference it had originally.
Local_Pref, as exported to BGP peers, is calculated as
Local_Pref = 254 - (global protocol preference for this route) + metric
A value greater than 254 will be reset to 254. Gated will only send Local_Pref values between 0 and 254.
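The translation in both directions can be sketched as follows, using the formula and the table above. The function names are hypothetical. Two points are assumptions of the sketch rather than statements from this document: the lower clamp at 0 on export (inferred from the stated 0 to 254 range), and the error raised for a received Local_Pref below metric, a case the table does not cover.

```python
MAX_LOCAL_PREF = 254


def export_local_pref(preference, metric):
    """Local_Pref sent for a route with the given gated preference."""
    local_pref = MAX_LOCAL_PREF - preference + metric
    # Values above 254 are reset to 254; the floor of 0 is an assumption.
    return max(0, min(local_pref, MAX_LOCAL_PREF))


def import_preference(local_pref, metric):
    """Gated preference assigned to a route received with local_pref."""
    if local_pref > MAX_LOCAL_PREF:
        return metric  # out-of-range values import as exactly metric
    if local_pref < metric:
        # Not covered by the table; rejecting it is a sketch-only choice.
        raise ValueError("Local_Pref below metric is not covered by the table")
    return MAX_LOCAL_PREF - local_pref + metric
```

With setpref 100, a route of preference 170 is exported with Local_Pref 184 and re-imported with preference 170, while a route of preference 50 is exported with Local_Pref 254 and re-imported with preference 100.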
Note: Non-gated IBGP implementations may send Local_Prefs which are greater than 254. When operating a mixed network of this type, it is recommended that all routers restrict themselves to sending Local_Prefs in the range metric to 254, so that they will be correctly interpreted by all routers in the network. If gated receives any Local_Pref with a value greater than 254, it will import it with a gated preference of metric.
Note: All routers in the same network which are running gated and participating in IBGP should use setpref uniformly. That is, if one router has setpref set, all should set it, and all should use the same value of metric. The value for metric should be selected to be consistent with the import policy in use in the network. For example, if import policy sets gated preferences ranging from 170 to 200, a setpref metric of 170 would make sense. It is advisable to set metric high enough to avoid conflicts between BGP routes and IGP or static routes.
bgp yes | no | on | off
[ {
    preference preference ;
    defaultmetric metric ;
    traceoptions trace_options ;
    [ clusterid host ; ]
    group type
        ( external peeras autonomous_system
            [ ignorefirstashop ]
            [ med ] )
        | ( internal peeras autonomous_system
            [ ignorefirstashop ]
            [ lcladdr local_address ]
            [ outdelay time ]
            [ metricout metric ]
            [ reflector-client [ no-client-reflect ] ] )
        | ( routing peeras autonomous_system proto proto
            interface interface_list
            [ ignorefirstashop ]
            [ lcladdr local_address ]
            [ outdelay time ]
            [ metricout metric ]
            [ reflector-client [ no-client-reflect ] ] )
        | ( test peeras autonomous_system )
        [ aspath-opt ]
    {
        allow {
            network
            network mask mask
            network masklen number
            all
            host host
        } ;
        peer host
            [ metricout metric ]
            [ setpref metric ]
            [ localas autonomous_system ]
            [ ignorefirstashop ]
            [ nogendefault ]
            [ gateway gateway ]
            [ preference preference ]
            [ preference2 preference ]
            [ lcladdr local_address ]
            [ holdtime time ]
            [ version number ]
            [ passive ]
            [ sendbuffer number ]
            [ recvbuffer number ]
            [ outdelay time ]
            [ keep [ all | none ] ]
            [ show-warnings ]
            [ noaggregatorid ]
            [ keepalivesalways ]
            [ v3asloopokay ]
            [ nov4asloop ]
            [ ascount count ]
            [ logupdown ]
            [ ttl ttl ]
            [ traceoptions trace_options ]
            ;
    } ;
} ] ;
The bgp statement enables or disables BGP. By default BGP is disabled. By default, routes announced via BGP carry no metric.
BGP peers are grouped by type and the autonomous system of the peers. Any number of groups may be specified, but each must have a unique combination of type, peer autonomous system and aspath-opt options. There are four possible group types: external, internal, routing and test.
For type routing groups, proto names the interior protocol to be used to resolve BGP route next hops, and may be the name of any IGP in the configuration. By default the next hop in BGP routes advertised to type routing peers will be set to the local address on the BGP connection to those peers, as it is assumed a route to this address will be propagated via the IGP. The interface_list can optionally provide a list of interfaces whose routes are carried via the IGP for which third-party next hops may be used instead.
The BGP statement has group clauses and peer subclauses. Any number of peer subclauses may be specified within a group. A group clause usually defines default parameters for a group of peers. These parameters apply to all subsidiary peer subclauses. Any parameters from the peer subclause may be specified on the group clause to provide defaults for the whole group (which may be overridden for individual peers).
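The relationship between group-level defaults and per-peer overrides can be pictured with a small sketch. The dictionary representation and the function name are hypothetical; holdtime and passive are option names from the syntax above, but the values shown are illustrative only.

```python
def effective_peer_options(group_options, peer_options):
    """Options in force for one peer: group defaults, overridden per peer."""
    merged = dict(group_options)   # start from the group clause defaults
    merged.update(peer_options)    # the peer subclause takes precedence
    return merged


group_defaults = {"holdtime": 180, "passive": True}
peer_overrides = {"holdtime": 90}
options = effective_peer_options(group_defaults, peer_overrides)
# options == {"holdtime": 90, "passive": True}
```

A parameter given only on the group clause thus applies to every peer in the group, while one repeated on a peer subclause applies to that peer alone.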
Within a group, BGP peers may be configured in one of two ways. They may be explicitly configured with a peer statement, or implicitly configured with the allow statement. Both are described here:
Within each group clause, individual peers can be specified or a group of potential peers can be specified using allow. Allow is used to specify a set of address masks. If GateD receives a BGP connection request from any address in the set specified, it will accept it and set up a peer relationship.
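The acceptance test applied to an incoming connection can be sketched as a membership check against the allowed networks. This is an illustration only: the function name is hypothetical, and the allow list is written in prefix/length notation for brevity rather than in the network, mask and masklen forms used in the configuration.

```python
import ipaddress


def is_allowed(peer_address, allow_list):
    """True if peer_address falls inside any network in allow_list."""
    address = ipaddress.ip_address(peer_address)
    return any(address in ipaddress.ip_network(entry) for entry in allow_list)
```

In this representation a single host is a /32 entry and the all keyword corresponds to 0.0.0.0/0; a connection request from an address outside every entry would not be accepted by the group.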
The BGP peer subclause allows the following parameters, which can also be specified on the group clause. All are optional.
Note that the state option works with BGP, but does not provide true state transition information.
Packet tracing options (which may be modified with detail, send, and recv):
This section eventually will go into a troubleshooting guide.