Center for Applied Internet Data Analysis
DIBBs: Integrated Platform for Applied Network Data Analysis (PANDA)
Sponsored by:
National Science Foundation (NSF)

We are developing a new Platform for Applied Network Data Analysis (PANDA) that will offer researchers more accessible, calibrated, and user-friendly tools for collecting, analyzing, querying, and interpreting measurements of the Internet ecosystem.

Funding source: NSF OAC-1724853. Period of performance: September 1, 2017 - August 31, 2022.


Project Summary

For the last 20 years, CAIDA has developed many data-focused services, products, tools, and resources to advance the study of the Internet, which has permeated disciplines ranging from theoretical computer science to political science, from physics to techlaw, and from network architecture to public policy. As the Internet and our dependence on it have grown, the structure and dynamics of the network, and how they relate to the political economy in which it is embedded, are attracting increasing attention from researchers, operators, and policy makers, all of whom bring questions that they lack the capability to answer themselves. CAIDA has spent years cultivating relationships across disciplines (networking, security, economics, law, policy) with those interested in CAIDA data, but the impact thus far has been limited to a handful of researchers. The current mode of collaboration simply does not scale to the exploding interest in the scientific study of the Internet.

In response to feedback from these communities, we will integrate existing research infrastructure measurement and analysis components previously developed by CAIDA into a new Platform for Applied Network Data Analysis (PANDA). Our goal is to enable new scientific directions, experiments, and data products for a wide set of researchers from the four targeted disciplines: networking, security, economics, and public policy. We will emphasize efficient indexing and processing of terabyte-scale archives, advanced visualization tools to show geographic and economic aspects of Internet structure, and careful interpretation of displayed results. To demonstrate that our platform is easily extensible and adaptable to new opportunities, we will seek to augment it with new data products for unmet research needs: a comprehensive DNS data set (facilitating the mapping of network behavior to a human view) and anonymized residential network traffic data (supporting privacy-sensitive security monitoring of in-home networks, even by non-technical users).

We will ensure active engagement of our collaborators: organize annual workshops; develop online video tutorials targeting non-networking experts as well as classroom-focused materials; maintain an annotated bibliography and discussion forum; and institute an advisory board to provide strategic directions.

The success of our project will enable new empirical studies in the four targeted disciplines, promising innovations in: Internet mapping and path prediction; detection of route hijacking and other disruptive events; cybersecurity preparedness; economic studies of correlations between ISP characteristics, market power, performance degradations, security practices, and regional economic growth; and regulatory discourse that has thus far occurred largely without data. It will lower the threshold to use CAIDA's data products and tools for R&E needs, inform discussion of critical issues in current and future large-scale networking, and increase public awareness about Internet structure, dynamics, performance, and evolution. The developed platform will address NSF's CIF21 goal of interconnecting cyberinfrastructure components and developing a comprehensive, robust, scalable shared resource that will bridge diverse communities and integrate HPC, data, software, and facilities to expand the potential of Internet-related science.


Development of Platform for Applied Network Data Analysis (PANDA)

Task 1: Improvements to existing PANDA components

Description | Projected Date | Status
1.1: Re-architect AS-rank to use XSEDE resources on the back end.
a. develop new indexing schemes for an efficient AS path database | ---
b. implement tracking of changes to AS paths over time | ---
c. implement tracking of changes to AS relationships over time | ---
d. implement tracking of changes to customer cone sizes over time | ---
e. enable computation and archiving of the set of ASes comprising a customer cone | ---
f. improve AS-level visualizations to highlight structure from a given AS perspective | ---
1.2: Traceroute measurements and inferences
a. implement smooth transition between querying archived Ark probing data and requesting on-demand measurements in real time | ---
b. link IP- and AS-level views to display all archived and derivative data related to all networks crossed by the probed path | ---
c. combine bdrmap and MAP-IT (by UPenn) into a unified border mapping module | ---
d. enable the border mapping software to operate interactively on large archives of traceroute data | ---
1.3: Improve MANIC functionality
a. enable viewing of an interconnection link from the other direction | ---
b. enable comparative views of a given interconnect from different locations | ---
c. integrate geolocation information about networks | ---
d. integrate facility-level information about interconnections | ---
e. re-architect influxDB to work on a cluster of XSEDE nodes to support multiple users/queries | ---
1.4: Continue BGPStream development
a. implement bindings to the main BGPStream C library to facilitate use by external software modules | ---
b. create distribution-specific BGPStream packages for various OSes (Ubuntu, Debian, FreeBSD, CentOS) | ---
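
Item 1.1e refers to computing the set of ASes that make up a customer cone. As a rough illustration only (not CAIDA's implementation), the sketch below computes a simple recursive customer cone: the AS itself plus every AS reachable by following provider-to-customer links. It assumes relationships arrive as pipe-delimited `as1|as2|rel` lines, where `-1` marks `as1` as a provider of `as2` and `0` marks a peering link. The function names and the sample data are hypothetical, and the AS numbers come from the range reserved for documentation.

```python
from collections import defaultdict, deque

def parse_relationships(lines):
    """Build a provider -> customers map from 'as1|as2|rel' lines.

    Assumed convention: rel == -1 means as1 is a provider of as2;
    rel == 0 means the two ASes are peers (ignored for cone purposes).
    """
    customers = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        as1, as2, rel = line.split("|")[:3]
        if int(rel) == -1:
            customers[int(as1)].add(int(as2))
    return customers

def customer_cone(customers, asn):
    """Return asn plus every AS reachable via provider-to-customer links."""
    cone = {asn}
    queue = deque([asn])
    while queue:
        current = queue.popleft()
        for cust in customers.get(current, ()):
            if cust not in cone:  # guards against relationship cycles
                cone.add(cust)
                queue.append(cust)
    return cone

# Hypothetical relationships for illustration only.
SAMPLE = [
    "# as1|as2|rel",
    "64500|64501|-1",
    "64500|64502|-1",
    "64501|64503|-1",
    "64502|64504|0",
]

rels = parse_relationships(SAMPLE)
print(sorted(customer_cone(rels, 64500)))
```

Tracking cone sizes over time (item 1.1d) could then amount to running such a computation on each relationship snapshot and recording `len(cone)` per AS. This is a design assumption, not something the project plan specifies.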

Task 2: Linking components into a multifunctional platform

2.1: Enable use of BGPStream modules by all PANDA components
2.2: Enable consumption of Periscope BGP data through the BGPStream API
2.3: Enable cross-use of vantage points and probing results from Ark, RIPE Atlas, and Periscope

Task 3: Integrate new external data infrastructure building blocks into PANDA

3.1: Include data from home routers participating in the FCC Measure Broadband America program into MANIC
3.2: Explore the possibility of using video quality reports for cross-correlation with MANIC data
3.3: Incorporate large-scale active measurements of DNS by the OpenINTEL project
3.4: Integrate user traffic data from home networks with BGP-aware and IXP-aware functionality

PANDA Community Activities

Task 1: Increase community accessibility of unified platform and its underlying components

Description | Projected Date | Status
1.1: Improve ITDK
a. create a simplified version removing complex artifacts (MOAs, AS loops and sets, hyperlinks) | ---
b. render router graph amenable to processing by basic graph database tools | ---
c. provide documentation | ---
d. create an economist-friendly version | ---
1.2: Provide data products in easier-to-use, domain-specific formats (JSON, standard graph formats, inputs for network simulators)
a. develop custom tools for format conversions | ---
b. on request, convert our data to a specified format | ---
1.3: Create user-friendly interface to spoofer results accessible via PANDA
a. integrate published IXP membership info to report SAV compliance state at IXPs (when possible) | ---
b. integrate public BGP info on the stability of edge network address space (to evaluate the feasibility of deploying static access control lists) | ---
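
Item 1.2 mentions converting data products into formats such as JSON. The following is a minimal sketch of what such a conversion tool might look like, not a description of any existing CAIDA tool. It assumes input lines of the form `as1|as2|rel` (with `-1` for provider-to-customer and `0` for peer-to-peer) and emits a node-link JSON document; the function name, output layout, and sample data are all hypothetical.

```python
import json

# Assumed relationship codes; anything else is labeled "unknown".
REL_NAMES = {-1: "provider-customer", 0: "peer-peer"}

def relationships_to_json(lines):
    """Convert 'as1|as2|rel' lines into a node-link JSON string."""
    nodes = set()
    links = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        as1, as2, rel = (int(field) for field in line.split("|")[:3])
        nodes.update((as1, as2))
        links.append({
            "source": as1,
            "target": as2,
            "type": REL_NAMES.get(rel, "unknown"),
        })
    document = {
        "nodes": [{"id": asn} for asn in sorted(nodes)],
        "links": links,
    }
    return json.dumps(document, indent=2)

# Hypothetical input for illustration only.
example_lines = [
    "# as1|as2|rel",
    "64500|64501|-1",
    "64501|64502|0",
]

print(relationships_to_json(example_lines))
```

A node-link layout is chosen here because many graph libraries and visualization tools can read it with little adaptation. Other targets named in the item, such as network simulator inputs, would need their own writers.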

Task 2: Provide support for multidisciplinary collaborations

Description | Projected Date | Status
2.1: Regularly interact with PANDA users
a. create a mailing list (panda-interest at caida.org) | ---
b. dedicate staff time to interact with collaborators | ---
c. conduct annual surveys on usability and impact of the platform | ---
d. publish survey summaries | ---
e. solicit user feedback to guide the development of PANDA interface components and capabilities | ---
f. demonstrate and promote PANDA at workshops and conferences in targeted disciplines | ---
2.2: Engage users via CAIDA annual workshop series: Active Internet Measurement Systems (AIMS) and Workshop on Internet Economics (WIE)
a. present new data sets associated with and/or resulting from PANDA | ---
b. introduce new PANDA capabilities as they develop | ---
c. conduct hands-on tutorials | ---
d. discuss opportunities for classroom use | ---
2.3: Create and maintain an online community resource of project materials
a. tutorials for using PANDA components | ---
b. summaries of data-gathering efforts from different stakeholders | ---
c. video tutorials suitable for classroom use | ---
d. moderated forum for discussion of related empirical studies | ---
e. project wiki | ---
2.4: Organize and host annual external advisory board meetings
a. seek board members' advice on enriching linkages between PANDA and targeted communities | ---
b. identify emerging national and international issues to be tackled by PANDA | ---
c. discuss data collection and analysis developments to inform policy-making | ---
d. pursue outreach opportunities | ---
e. develop paths to sustainability by the end of the NSF support for the project | ---

Task 3: Develop and implement a Science Gateway-style interface for interactive access to PANDA


  Last Modified: Thu Oct-5-2017 16:55:00 PDT
  Page URL: http://www.caida.org/funding/dibbs-panda/index.xml