Affinity-Model

Introduction

The current Internet consists of around 80,000 Autonomous Systems (ASes). Each AS routes and forwards traffic according to its own interests and business agreements. As the Internet grows in both scale and intricacy, precise modeling of AS routing policies and behaviors becomes increasingly important for many Internet research endeavors. Currently, researchers commonly simplify the complex AS routing policies into two relationships, namely provider-customer (p2c) and peer-to-peer (p2p) relationships. In a p2c (or its reverse, c2p) relationship, the customer pays the provider for reachability to the rest of the Internet, while in a p2p relationship, the two peers exchange traffic between their customers free of charge.

Existing methods typically rely on simplistic assumptions (p2c and p2p relationships and the valley-free principle) to model AS routing policies. However, these approaches, along with their minor extensions (e.g., hybrid relationships), still struggle to accurately capture the complexity of real-world AS routing policies. Moreover, while some AS path inference techniques can identify routing paths, they often fail to explain why an AS chooses those paths. Consequently, none of these methods offers a comprehensive and accurate model of AS routing.

To address these challenges, we propose Affinity-Model, which leverages AS affinity behaviors to explain routing behaviors and enhance routing models. Affinity-Model defines affinity behaviors by comparing actual AS routing with simulated paths, and employs machine learning techniques to interpret and infer additional affinity behaviors. Validated against the specific AS preferences documented in the Internet Routing Registry (IRR), Affinity-Model achieves 81% precision and 79% recall in affinity inference. Moreover, it can model up to 86.5% of valley-free AS routing behaviors, doubling the efficiency of existing methods, and capture 68.0% of non valley-free behaviors, an area where traditional models face significant challenges.

Challenges and Basic Ideas

example of the absence of AS RIB-in data

Limited and biased VP observations. Limited and biased observations of the Internet have been widely acknowledged. Although 77,076 ASes are active, only 1,604 serve as vantage points (VPs) reporting routing information. These VPs are split into full VPs, which provide complete routing data, and partial VPs, which offer only limited information. While there are fewer full VPs (319), they observe a broader set of ASes, AS links, and paths. In contrast, partial VPs provide fewer observations, leading to a skewed understanding of Internet routing. Moreover, full VPs are unevenly distributed, with 82% belonging to tier-1 ASes and their neighbors, while only 10% are from lower-tier stub ASes. This imbalance results in an overrepresentation of upper-layer ASes and insufficient data on lower-tier ASes, undermining the accuracy and reliability of AS behavior inference.
The absence of AS RIB-in data. The data reported by VPs sheds light on actual AS path choices but gives no insight into the RIB-in paths, which are pivotal in AS path selection decisions. This gap hampers our ability to infer the criteria ASes use to sort and select candidate paths, and thereby limits our understanding of AS routing behaviors. In the above figure, for AS paths originating from AS s and destined to AS d, the observed path <s,1,d> is believed to exist in the RIB-in table of AS s. However, we cannot confirm the existence of the simulated path <2,d> in the RIB-in table of AS s. The presence or absence of <2,d> carries significant implications for AS s's routing preferences. If <2,d> exists, AS s does not opt for the most economically optimal path via AS 2, suggesting that AS 1 is a specially preferred neighbor that diverts AS s from the GR-Model. Conversely, if <2,d> does not exist, AS s adheres to the GR-Model and selects the economically optimal path, in line with traditional assumptions. However, this second scenario introduces another layer of complexity: it remains uncertain whether AS d did not announce its prefix to AS 2 or AS 2 did not propagate the prefix to AS s. Such ambiguities have perplexed previous research, and no definitive hypothesis resolves them. To address this challenge, we use PathRadar to simulate the AS RIB-in paths and thereby avoid inappropriate assumptions.
The oversimplified AS routing model. As mentioned earlier, ASes often employ a wide array of local policies, including traffic engineering, import and export policies, special local preferences, and backup paths. However, the p2c and p2p AS relationships oversimplify these policies, inadequately reflecting actual AS routing behaviors. While some studies have introduced more complex AS relationships, they ultimately remain rooted in the existing p2c and p2p assumptions, and are far from fully elucidating AS routing behaviors. Therefore, we aim to introduce a new affinity relationship to capture special AS routing policies beyond traditional p2c and p2p assumptions, thereby complementing the AS routing model.
Non valley-free paths. Although the GR-Model introduced the valley-free principle and it became the default assumption for subsequent researchers, the increasing number of non valley-free paths now makes their inference necessary. In our observations from Route Views and RIPE RIS in January 2022, non valley-free paths constituted ~10.0% of all observed paths. However, few works currently focus on inferring such paths. We have identified two primary types that together account for the majority (~80%) of all non valley-free paths. Type 1 appends 1 or 2 hops of p2p links after a regular valley-free path, while Type 2 replaces the 1-hop p2p link in the middle of a standard valley-free path with 2 hops of p2p links. Affinity-Model introduces a novel inference method specifically tailored to these two types. We leave the inference of other non valley-free path types to future research.
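The two non valley-free path types can be made concrete with a small Python sketch. It assumes each AS path has already been annotated with one relationship label per link, read from the source toward the destination; the letter encoding, the function name, and the rule for paths that fit both types are hypothetical choices for illustration and are not taken from Affinity-Model itself.

```python
import re

# One letter per AS link, read from the source AS toward the destination AS:
# "U" = c2p (uphill), "F" = p2p (flat), "D" = p2c (downhill).
# This encoding is an assumption of the sketch.
VALLEY_FREE = re.compile(r"U*F?D*")    # uphill, at most one p2p link, downhill
TYPE2 = re.compile(r"U*FFD*")          # the middle p2p hop replaced by two p2p hops
TYPE1 = re.compile(r"U*F?D*F{1,2}")    # a valley-free path followed by 1-2 p2p hops


def classify_path(links: str) -> str:
    """Return 'valley-free', 'type1', 'type2', or 'other' for a link-label string."""
    if VALLEY_FREE.fullmatch(links):
        return "valley-free"
    # Assumption: a path matching both patterns (e.g. "UFF") is reported as type2.
    if TYPE2.fullmatch(links):
        return "type2"
    if TYPE1.fullmatch(links):
        return "type1"
    return "other"
```

For instance, under this encoding `classify_path("UDF")` yields `"type1"` (a valley-free path followed by one p2p hop), and `classify_path("UFFD")` yields `"type2"`.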

Algorithm Design of Affinity-Model

overall framework of Affinity-Model

example of aggregating policies

Affinity-Model works in four steps as follows.

Affinity-Model employs PathRadar to infer the valley-free RIB-in paths of source ASes to any destination AS. Traditional routing methods are then used to rank these paths, and a comparison is made between the highest-priority path and the actual selection of the AS. If these paths differ, it indicates that the source AS has established a special preference for the next_hop of its selected path, and this link is recorded as a pre delegator-anchor (d2a) link. Later, pre d2a links exhibiting higher occurrences and stability are identified as d2a links.
We focus on pre d2a links obtained from full VPs. We highlight essential features related to link attributes, their positions, and co-located information. Subsequently, the t-SNE method is employed to establish a threshold for discerning pre d2a links as actual d2a links. We then analyze the feature importance, and use machine learning methods to train a discriminator, for identifying other d2a links across the entire Internet.
Affinity-Model classifies two primary categories of non valley-free paths, designating non valley-free p2p segments within these paths as pre non-vf d2a links. Similar to previous steps, relevant features are presented, and pre non-vf d2a links with higher occurrences and stability are identified as actual non-vf d2a links with the help of t-SNE. Important features of these pre non-vf d2a links are then summarized, and a discriminator is trained to identify all non-vf d2a links across the Internet.
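The comparison in the first step can be illustrated with a minimal Python sketch. It assumes the simulated RIB-in paths are already available (the role PathRadar plays in Affinity-Model) and ranks them with a conventional preference: customer routes over peer routes over provider routes, then shorter AS paths. The function names, the relationship labels, the final tie-break, and the relationships assigned to the example ASes are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

Path = Tuple[str, ...]   # AS path from the source AS to the destination AS
Link = Tuple[str, str]   # (source AS, next-hop AS)

# Conventional ranking by the next hop's role: customer > peer > provider.
REL_RANK = {"customer": 0, "peer": 1, "provider": 2}


def best_conventional_path(candidates: List[Path], rel: Dict[Link, str]) -> Path:
    """Pick the simulated RIB-in path that a traditional ranking would prefer."""
    def sort_key(path: Path):
        # Relationship first, then AS-path length; the path itself is used
        # as an arbitrary deterministic tie-break (an assumption of the sketch).
        return (REL_RANK[rel[(path[0], path[1])]], len(path), path)
    return min(candidates, key=sort_key)


def pre_d2a_link(observed: Path, candidates: List[Path],
                 rel: Dict[Link, str]) -> Optional[Link]:
    """Return (source, next hop) as a pre d2a link if the observed path
    differs from the conventionally best candidate, else None."""
    expected = best_conventional_path(candidates, rel)
    if observed != expected:
        return (observed[0], observed[1])
    return None


# Example modeled on the RIB-in figure: AS s observes <s,1,d> while <s,2,d>
# is simulated. The relationships below are assumed for illustration.
rel = {("s", "1"): "provider", ("s", "2"): "customer"}
candidates = [("s", "1", "d"), ("s", "2", "d")]
link = pre_d2a_link(("s", "1", "d"), candidates, rel)
```

Here `link` is `("s", "1")`: AS s chose the path through AS 1 although the conventional ranking prefers the path through AS 2, so the link from s to 1 would be recorded as a pre d2a link. The later filtering by occurrence and stability is not covered by this sketch.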

We next analyze AS affinity routing behaviors. Initially, all affinity routing behaviors are treated as prefix policies. If an AS applies affinity policies to all prefixes of a destination AS, its prefix policies can be aggregated into destination policies. If the same affinity neighbors are used for all destination ASes and prefixes, the destination policies can be further aggregated into neighbor policies. Our analysis reveals that the proportion of prefix policies in affinity routing behavior is significantly higher than average. Additionally, we find that affinity policies are predominantly linked to European IXPs. Thus, we conclude that affinity routing behavior is more prevalent in the prefix-specific routing policies of European IXPs.
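The aggregation described above can be sketched in Python as follows. The sketch assumes one source AS, a mapping from each prefix to the affinity neighbor used for it, and a mapping from each destination AS to the prefixes it originates. The data layout, the function name, and the simplification to a single affinity neighbor per prefix are assumptions; the AS numbers and prefixes are documentation placeholders.

```python
from typing import Dict, Optional, Set


def aggregate_policies(prefix_policies: Dict[str, str],
                       prefixes_of: Dict[str, Set[str]]) -> dict:
    """Aggregate prefix policies into destination and neighbor policies.

    prefix_policies: prefix -> affinity neighbor used for that prefix.
    prefixes_of: destination AS -> all prefixes originated by that AS.
    """
    dest_policies: Dict[str, str] = {}
    remaining: Dict[str, str] = {}
    for dest, prefixes in prefixes_of.items():
        neighbors = {prefix_policies.get(p) for p in prefixes}
        if len(neighbors) == 1 and None not in neighbors:
            # Every prefix of this destination uses the same affinity neighbor.
            dest_policies[dest] = neighbors.pop()
        else:
            # Keep whatever prefix-level policies exist for this destination.
            remaining.update(
                {p: prefix_policies[p] for p in prefixes if p in prefix_policies}
            )
    neighbor: Optional[str] = None
    if len(dest_policies) == len(prefixes_of) and len(set(dest_policies.values())) == 1:
        # The same neighbor covers all destinations and prefixes.
        neighbor = next(iter(dest_policies.values()))
        dest_policies = {}
    return {"neighbor": neighbor, "destination": dest_policies, "prefix": remaining}


prefixes_of = {
    "AS64501": {"192.0.2.0/24", "198.51.100.0/24"},
    "AS64502": {"203.0.113.0/24"},
}
partial = aggregate_policies(
    {"192.0.2.0/24": "AS64510", "198.51.100.0/24": "AS64510"}, prefixes_of
)
```

In this example `partial["destination"]` is `{"AS64501": "AS64510"}`, because both prefixes of AS64501 share the same affinity neighbor while AS64502 has no affinity policy. Adding the same neighbor for `203.0.113.0/24` would collapse everything into a single neighbor policy.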

Publications

Affinity-Model: Improving AS Routing Models via AS Affinity Behavior Inference

Zitong Jin, Xingang Shi, Ying Tian, Zhiliang Wang, Xia Yin, Jianping Wu from Zhongguancun Laboratory and Tsinghua University

Affinity-Model characterizes AS affinity routing behaviors in the Internet by comparing simulated paths with actual paths, and employs machine learning techniques to interpret and predict broader affinity routing patterns, thereby significantly enhancing existing AS routing models. You can learn more about Affinity-Model in INFOCOM 2025.

[PDF]

Code

The code of Affinity-Model is under review and will be released later.


Presentation

TBA.

Authors

Zitong Jin, Zhongguancun Laboratory, jinzt@mail.zgclab.edu.cn
Xingang Shi, Tsinghua University, shixg@cernet.edu.cn
Ying Tian, Zhongguancun Laboratory, tianying@mail.zgclab.edu.cn
Zhiliang Wang, Tsinghua University, wzl@cernet.edu.cn
Xia Yin, Tsinghua University, yxia@tsinghua.edu.cn
Jianping Wu, Tsinghua University, jianping@cernet.edu.cn