Background and Motivation
Starlink's official availability and authorization footprint does not always align with what we observe in large-scale, public measurement data. In particular, signals from M-Lab's published datasets (e.g., NDT) and from Cloudflare's Aggregated Internet Measurement (AIM) data, which is also published openly via M-Lab, suggest that Starlink-associated connectivity can appear in regions where the service is not officially offered or not authorized. This mismatch raises an important research question for Internet measurement and policy: how often, where, and under what conditions does LEO satellite connectivity surface beyond official footprints, and what does its performance look like in those contexts?
The working hypothesis is that these observations can be explained by a combination of practical and socio-technical drivers: roaming-style usage and account-country mismatches, terminal circulation through grey/black markets, and users seeking reliability, higher performance, or resilience when terrestrial networks are poor or disrupted. Importantly, the goal of this thesis is not to enable circumvention. Instead, it is to measure and characterize the phenomenon from an Internet measurement perspective using open datasets, with careful attention to inference limits, ethics, and aggregation. The thesis will focus on identifying where "out-of-authorization" Starlink presence appears in data, quantifying how persistent versus transient it is, and assessing connectivity quality relative to both "official" Starlink regions and non-satellite baselines.
Expected Outcomes
The thesis will deliver a reproducible, end-to-end measurement workflow that ingests, cleans, and analyzes Cloudflare AIM and M-Lab datasets at scale (likely via BigQuery), producing a global, time-resolved view of where Starlink-like connectivity appears outside official availability. The core scientific output will be a characterization of performance and stability in those regions, including distributions and temporal patterns for latency and throughput (and AIM-style quality metrics where applicable), alongside comparisons to (i) regions where Starlink is officially available and (ii) terrestrial baseline performance inferred from the same measurement sources.
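The comparison described above can be illustrated with a minimal sketch. It assumes the measurements have already been exported (for example from BigQuery) into per-test records. The field names (`asn`, `country`, `min_rtt_ms`, `download_mbps`), the `OFFICIAL_COUNTRIES` set, and the sample rows are hypothetical placeholders, not the actual M-Lab or AIM schema. AS14593 is the ASN commonly associated with Starlink and should be verified against current routing data before use. Treating every non-Starlink ASN as a terrestrial baseline is also a simplifying assumption.

```python
from collections import defaultdict
from statistics import median

# Assumption: AS14593 is the ASN commonly associated with Starlink.
STARLINK_ASN = 14593

# Hypothetical placeholder: ISO country codes with official availability.
# A real analysis would build this from a dated availability source.
OFFICIAL_COUNTRIES = {"US", "DE"}


def label_group(row):
    """Assign a record to one of the three comparison groups."""
    if row["asn"] != STARLINK_ASN:
        # Simplification: all non-Starlink ASNs count as baseline.
        return "terrestrial_baseline"
    if row["country"] in OFFICIAL_COUNTRIES:
        return "starlink_official"
    return "starlink_outside_official"


def summarize(rows):
    """Return test counts and median latency/throughput per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[label_group(row)].append(row)
    return {
        name: {
            "tests": len(members),
            "median_rtt_ms": median(r["min_rtt_ms"] for r in members),
            "median_download_mbps": median(r["download_mbps"] for r in members),
        }
        for name, members in groups.items()
    }


# Synthetic rows for illustration only; "XX" and AS64500 are placeholders.
rows = [
    {"asn": 14593, "country": "US", "min_rtt_ms": 40.0, "download_mbps": 120.0},
    {"asn": 14593, "country": "US", "min_rtt_ms": 50.0, "download_mbps": 100.0},
    {"asn": 14593, "country": "XX", "min_rtt_ms": 90.0, "download_mbps": 60.0},
    {"asn": 64500, "country": "XX", "min_rtt_ms": 30.0, "download_mbps": 20.0},
]
summary = summarize(rows)
```

Medians are used here because speed-test distributions tend to be skewed. The full analysis would report complete distributions and temporal patterns rather than a single summary statistic.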
Beyond performance, the work will develop a cautious attribution and plausibility layer: evidence-driven indicators that are consistent with likely mechanisms (e.g., roaming constraints, short-lived spikes, persistent presence), while explicitly avoiding over-claiming causality. The final deliverables will include a cleaned and documented codebase, a compact data dictionary, and a thesis-quality narrative that clearly separates what is directly observed from what is inferred. If feasible and ethically appropriate, an additional outcome could be an aggregate-only visualization or lightweight dashboard summarizing results without exposing sensitive or identifying details.
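One possible form for such an indicator is a simple rule that labels a region's Starlink presence from its monthly test counts. The sketch below is illustrative only. The function name, the labels, and the thresholds (`min_tests`, `persistent_share`, and the two-month cutoff for a spike) are hypothetical choices that would need to be calibrated and justified in the thesis. The labels describe observed patterns and do not establish any mechanism.

```python
def classify_presence(monthly_counts, min_tests=5, persistent_share=0.75):
    """Label a region from chronological per-month Starlink test counts.

    All thresholds are hypothetical defaults, not validated values.
    """
    if not monthly_counts:
        return "no_data"

    # A month counts as active only if it has enough tests to be meaningful.
    active = [count >= min_tests for count in monthly_counts]
    n_active = sum(active)
    if n_active == 0:
        return "absent"

    # Longest run of consecutive active months.
    longest = run = 0
    for flag in active:
        run = run + 1 if flag else 0
        longest = max(longest, run)

    if n_active / len(active) >= persistent_share:
        return "persistent"
    if longest <= 2:
        return "transient_spike"
    return "intermittent"
```

For example, twelve consecutive months with ten tests each would be labeled `persistent`, while a single active month in an otherwise empty year would be labeled `transient_spike`.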
Requirements
- Strong data analysis skills: Python (pandas/NumPy), statistics, visualization, reproducible workflows.
- Comfort with large-scale datasets: SQL and especially BigQuery (strongly preferred, given M-Lab/Cloudflare AIM distribution pathways).
- Networking fundamentals: ASNs, BGP basics, latency vs throughput interpretation, measurement artifacts/biases.
- M-Lab knowledge: Understanding what NDT measures and how to use M-Lab's published "unified views" appropriately.
- Cloudflare AIM knowledge: What AIM is, what the score represents, and limitations of speed-test-derived aggregation.
- Research hygiene & ethics: Handling measurement bias, avoiding over-claiming causality, and ensuring analyses remain aggregate/non-identifying.
Related Work
Internet Censorship and Measurement
- OONI: Open Observatory of Network Interference (FOCI'12) — https://www.usenix.org/system/files/conference/foci12/foci12-final12.pdf
- ICLab: A Global, Longitudinal Internet Censorship Measurement Platform (arXiv) — https://arxiv.org/abs/1907.04245
- A Survey of Internet Censorship and its Measurement (arXiv 2025) — https://arxiv.org/html/2502.14945v1
US Connectivity / Broadband Inequity
- Characterizing Internet Access and Quality Inequities in California M-Lab Measurements (ACM COMPASS'22) — https://dl.acm.org/doi/10.1145/3530190.3534813
- Decoding the Divide: Disparities in Broadband Plans Offered by Major US ISPs (SIGCOMM'23) — https://dl.acm.org/doi/10.1145/3603269.3604831
- The Efficacy of the Connect America Fund (CAF) in Addressing US Internet Access Inequities (SIGCOMM'24) — https://dl.acm.org/doi/10.1145/3651890.3672272
Speedtest Methodology / Interpreting M-Lab
- Measurement, Meaning and Purpose: Exploring the M-Lab NDT Dataset (SSRN) — https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3898339
Cloudflare AIM + M-Lab Open Dataset
- M-Lab announcement: publishing Cloudflare AIM dataset — https://www.measurementlab.net/blog/cloudflare-aimscoredata-announcement/
Starlink / LEO Measurement Papers
- A Multifaceted Look at Starlink Performance (WWW'24) — https://dl.acm.org/doi/10.1145/3589334.3645328
- Realtime Multimedia Services over Starlink: A Reality Check (ACM) — https://dl.acm.org/doi/10.1145/3592473.3592562
- A Large-Scale IPv6-Based Measurement of the Starlink Network — https://arxiv.org/pdf/2412.18243
- Investigating Web Content Delivery Performance over Starlink (arXiv) — https://www.arxiv.org/abs/2510.13710
- Starlink in Northern Europe: Stationary and In-motion Performance — https://arxiv.org/pdf/2502.15552
- A First Look at Starlink In-Flight Performance (arXiv) — https://arxiv.org/abs/2508.09839
- An investigation of Starlink's performance during the May 2024 solar superstorm — https://bdebopam.github.io/papers/leonet25_solar.pdf