Detection and Analysis of Infrastructure Bottlenecks in a Cloud-Centric Internet
This project proposes an effort to design measurement and analysis tools that can transform our understanding of cloud connectivity performance and reachability in the U.S. and around the world. Researchers currently lack the measurement ability to even identify performance bottlenecks on paths to the cloud at scale, much less assess their impact on Internet users.
Principal Investigators: Ka Pui Mok, kc claffy, Alexander Marder
Funding source: NSF CNS-2212241. Period of performance: October 2, 2022 - September 30, 2025.
Project Summary
The COVID-19 pandemic and associated quarantines have accelerated the Internet’s fundamental shift from a peer-to-peer to a cloud-centric model. Our entire lives have moved online, now predominantly mediated by services in the cloud, and public clouds are rapidly evolving to meet increasing requirements and demands from customers and end users. The importance of clouds in the modern Internet raises questions about how well existing Internet backbone networks support the applications and content now served from the clouds. Cloud providers can afford the infrastructure upgrades to support the needs of low-latency or high-throughput applications, but their ability to adapt infrastructure to application demands ends at their network border. The economics of deploying and operating transit backbone infrastructure combine with the surge in traffic toward cloud services to induce performance bottlenecks in the changing Internet landscape.
This project proposes an ambitious effort to design measurement and analysis tools that can transform our understanding of cloud connectivity performance and reachability in the U.S. and around the world. Researchers currently lack the measurement ability to even identify such bottlenecks at scale, much less assess their impact on Internet users. The project is structured as two tasks that will combine to reveal performance bottlenecks outside the cloud networks where the high cost of deployment and operations leads to infrastructure bottlenecks for cloud applications. The first task will develop novel techniques to identify performance bottleneck links between cloud datacenters and thousands of publicly accessible speed test servers, by synthesizing active measurements with TCP flows. The second task will analyze the bottleneck links we identify with comprehensive path measurements from cloud datacenters to the entire public Internet, and we will develop new techniques to support inference of the geographic locations of bottleneck links by geolocating where paths exit cloud networks.
The intellectual merit of this project stems from the innovative methods we will develop and validate to conduct accurate, scalable, and reliable topology and performance measurements of a critical component of the modern Internet, overcoming cost barriers that have prevented measurement studies from the cloud. The measured features and labels the project generates will provide an ideal basis to address the persistent challenge in applying machine learning techniques to network infrastructure research. The project will also have broader impacts outside of the scientific research agenda. The tools and data the project generates will be valuable to enterprises and application developers deploying into the cloud, as well as policy-makers seeking to understand bottlenecks in U.S. Internet infrastructure. The data, tools, and analyses can also lead to the discovery of broadband performance inequities in the U.S. and inform future public investment in infrastructure. Experience with cloud applications and measurements will be incorporated into an undergraduate data science course and undergraduate research mentorships.
Acknowledgment of awarding agency’s support
This material is based on research sponsored by the National Science Foundation (NSF) under grant CNS-2212241. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of NSF.