SISTER: Science of Internet Security: Technology and Experimental Research
Using the versatile Ark measurement platform, we will conduct measurements and analyses to document and explain structural and dynamic aspects of the Internet infrastructure relevant to cybersecurity vulnerabilities.
Principal Investigator: kc claffy
Funding source: HHSP233201600012C
Period of performance: May 24, 2016 - November 23, 2018.
Statement of Work
Year 1
# | Subtask | Description | Goal | Result | Date | Status | |
---|---|---|---|---|---|---|---|
Task 1: Support for Macroscopic Security and Stability Monitoring and Analysis | |||||||
Deliverable: 24/7 Stability measurement and analysis system | software | May 23, 2017 | done | ||||
1.1 | Generate the target list of prefixes | Develop and implement a system to process and sanitize BGP data from Routeviews and RIPE RIS creating a list of target prefixes based on a 1-week sliding window | 24/7 monitoring of all Routeviews and RIPE RIS collectors; keeping in memory a 1-week sliding window of targets | Software module that continuously interrogates BGP Watcher application | Jul 8, 2016 | done, paper | |
1.2 | Dynamically identify which IPs to probe with traceroute | Develop and implement selection of the target IP for each prefix in the target list based on the current state of prefixes announced on BGP | Ability to perform the selection within 1 hour | Component of the software module developed in subtask 1.1 | Sep 8, 2016 | done | |
1.3 | Improve AS path inference strategies | Improve our algorithms to infer AS paths from traceroutes, minimizing two artifacts of traceroute measurement: false inferences due to third-party addresses, and missing links due to routers responding with IP addresses from interfaces that hide the presence of an AS hop | Improved accuracy when comparing AS paths inferred from an Ark node with AS paths announced on BGP and observed by a BGP monitor located in the same AS | Software implementation of an improved algorithm | Feb 8, 2017 | done | |
1.4 | Create measurement and data processing pipeline | Run the systems to gather the target list, execute traceroutes from vantage points, infer AS paths, and curate resulting data | 24/7 monitoring of the entire IPv4 routed address space | System in production | May 8, 2017 | done | |
1.5 | Release data sets through IMPACT | Make resulting datasets available through IMPACT | Datasets indexed in IMPACT | Data | May 23, 2017 | done | |
Task 2: Mapping Peering Interconnections at the Router Level | |||||||
Deliverable: Software to infer border router ownership | software | Nov 23, 2016 | done | ||||
Deliverable: Data set of border mapping inferences | data | Jan 23, 2017 | done | ||||
2.1 | Refine algorithms to infer peering interconnections at the router level | Explore which methods yield most accurate inferences and apply heuristics to traceroute data, focusing primarily on the connectivity found in the network of the vantage point | Infer router ownership for all border routers of a VP | Documented heuristics and software to run on traces collected from a probing VP and infer the ownership of the VP's border routers | Sep 23, 2016 | done | |
2.2 | Deploy border-mapping process on Ark infrastructure | Deploy the software from subtask 2.1 on all VPs of the Ark infrastructure to infer the border links of the hosting networks | Data set indexed in IMPACT | Ongoing dataset of border mapping inferences from all Ark monitors | Oct 23, 2016 | done | |
2.3 | Validate border-mapping inferences | Make our inferences publicly available and solicit feedback from operators. Use BGP data for validation when an Ark VP is colocated with a BGP monitor. | Maximize a set of validation data; focus on covering the set of networks hosting Ark VPs | Set of ground-truth data relating to border router ownership | Feb 23, 2017 | done | |
2.4 | Refine software to support border-mapping on resource-constrained platforms | Adapt prober software of our measurement system to accept commands from a remote server, thus reducing the memory footprint on the VP | Computationally and memory-efficient probing software, secure, and robust to middlebox-initiated timeouts of idle connections | Updated probing software | May 23, 2017 | done | |
Task 3: Mapping Peering Interconnections at the Facility Level | |||||||
Deliverable: Annotated facility-aware map of peering interconnection | data | May 23, 2017 | done | ||||
3.1 | Generate map of interconnection facilities and associated peering networks | Maintain a detailed map of the interconnection facilities and the networks that have a presence there by manually assembling it from existing data sources (PeeringDB, PCH) | Data set indexed in IMPACT | Data set compiled from best available data on interconnection facilities and participating networks | Jul 23, 2016, then continue to update | pre-proposal study, done | |
3.2 | Develop and test method to infer engineering approach to interconnection | Develop and test new techniques to infer a type of engineering approach to interconnection (private peering with cross-connect, public peering, private interconnects over the public switch fabric, and remote peering) | Annotated data set indexed in IMPACT | Annotations of data set from subtask 3.1 | Feb 23, 2017 | done | |
3.3 | Alias resolution of interconnection IP addresses | Using existing state-of-the-art alias resolution techniques and strategically executed traceroute measurements, classify interconnection IP addresses according to router hardware they share | Router-level topology data set indexed in IMPACT | Data file of IP addresses and router identifier to which each IP address maps | Apr 23, 2017 | done | |
3.4 | Facility-aware map of peering interconnection | Merge the techniques developed in subtasks 3.2 and 3.3 to produce a richly annotated global map of peering facilities | Annotated data set indexed in IMPACT | Facility-aware map annotated with peering | May 23, 2017 | done | |
Task 4: Measurements of TCP Behavior to Understand Security Vulnerabilities | |||||||
Deliverable: Report on observed prevalence of TCP vulnerabilities | report | May 23, 2017 | done | ||||
4.1 | Establish the scope of study | Describe set of attacks and defenses we will study, e.g. blind in-window attack | Written description of set of attacks | Feb 23, 2017 | pre-proposal | ||
4.2 | Provide measurement techniques to test for vulnerabilities described in subtask 4.1 | Develop and test active measurement techniques to test for the vulnerabilities | Documented techniques validated against known systems and/or with network operators | Set of techniques to apply for subtask 4.3 | May 23, 2017 | done | |
4.3 | Apply technique to measure TCP ecosystem | Apply active measurement technique developed and tested in subtask 4.2 to TCP stacks deployed in a web server environment | Public data set | Data | Apr 23, 2017 | done | |
4.4 | Document findings | Assess TCP vulnerabilities of popular web server environment based on measurement, publish results in a peer-reviewed venue | Documentation of vulnerabilities | Paper | May 23, 2017 | done |
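
The 1-week sliding window of target prefixes described in subtask 1.1 can be illustrated with a minimal sketch. This is not the project's software: the class `PrefixWindow` and its methods are hypothetical names, the prefixes are documentation examples, and the sketch assumes that observations arrive in non-decreasing time order.

```python
from collections import OrderedDict
from datetime import datetime, timedelta

WINDOW = timedelta(weeks=1)  # window length taken from subtask 1.1


class PrefixWindow:
    """Hypothetical helper: keep prefixes seen in BGP within the window."""

    def __init__(self, window=WINDOW):
        self.window = window
        # prefix -> time last seen, ordered from oldest to newest.
        # Assumes observe() is called in non-decreasing time order.
        self._last_seen = OrderedDict()

    def observe(self, prefix, seen_at):
        # Remove and re-insert so the prefix moves to the newest end.
        self._last_seen.pop(prefix, None)
        self._last_seen[prefix] = seen_at

    def expire(self, now):
        # Drop prefixes whose last sighting is older than the window.
        cutoff = now - self.window
        while self._last_seen:
            prefix, seen_at = next(iter(self._last_seen.items()))
            if seen_at >= cutoff:
                break
            del self._last_seen[prefix]

    def targets(self, now):
        # Current target list: every prefix seen within the window.
        self.expire(now)
        return list(self._last_seen)


t0 = datetime(2016, 7, 1)
window = PrefixWindow()
window.observe("192.0.2.0/24", t0)
window.observe("198.51.100.0/24", t0 + timedelta(days=5))
print(window.targets(t0 + timedelta(days=8)))
```

Ordering the entries by last sighting keeps expiry proportional to the number of expired prefixes rather than to the size of the whole table, which matters for a structure that is held in memory and updated continuously.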
Year 2
# | Subtask | Description | Goal | Result | Date | Status | |
---|---|---|---|---|---|---|---|
Task 5: Identifying Grey Market IPv4 Address Transfers | |||||||
Deliverable: Software to infer address transfers from BGP data | software | Aug 23, 2017 | |||||
Deliverable: Monthly lists of candidate transferred prefixes | data | May 23, 2018 | |||||
5.1 | Refine BGP-based filters | Refine the set of BGP-based filters to rule out candidate transfers due to Provider-Aggregatable space. Iteratively validate the results against the set of transfers reported to RIRs, and modify filters to eliminate false negatives | Minimize number of false negatives in the algorithm based on the lists of transfers reported by the RIRs | Algorithms and software to analyze BGP data to detect candidate transfers and filter out transfers due to legitimate reasons | Aug 23, 2017 | done | |
5.2 | Use DNS lookups of the entire routed address space to detect transfers | Set up regular and frequent reverse DNS lookups of the entire routed IPv4 address space capturing both DNS names and SOA records to reveal information about the ownership of IP prefixes. Develop techniques to detect transfers using the DNS data. | High-frequency (at least once a month) DNS lookups of the entire routed address space | Periodic reverse DNS scans of the routed address space (data); algorithms/software to extract information about prefix transfers from DNS data | Nov 23, 2017 | done | |
5.3 | Use IP-level data to detect transfers | Devise a strategy for consistent and frequent probing of routed prefixes from Ark monitors. Build a history of router-level paths and RTT information from Ark monitors toward routed prefixes. Use collected data to develop data-plane signatures for prefix transfers, and for prefix movements due to other causes such as non-BGP speakers changing upstream providers. Use collected RTT data for constraint-based geolocation to identify prefix transfers | Frequent (at least once a week) measurements of IP paths and RTTs from each Ark monitor toward transferred prefixes | Database with history of IP paths and RTTs toward routed prefixes (data); algorithms/software to extract signatures of prefix transfers from the IP path data | Apr 23, 2018 | ||
5.4 | Combine, curate, and validate results from all methods | Combine the techniques developed in subtasks 5.1, 5.2, and 5.3 for detection of transfers and filtering of candidate transfers due to legitimate reasons | Produce monthly lists of candidate transferred prefixes, with low false negatives and false positives | Periodic lists of transferred prefixes | May 23, 2018 | ||
Task 6: Internet Router-Level Topology Mapping on Demand | |||||||
Deliverable: Software and API for the on-demand alias resolution | software | Feb 23, 2018 | done | ||||
Deliverable: Database of known alias data | database | May 23, 2018 | done | ||||
6.1 | Improve fault-tolerance of MIDAR's Estimation stage probing | If a monitor has rebooted, restart the Estimation probing on just the rebooted monitor; if a monitor is down for an extended period, either (i) restart Estimation stage with the down monitor excluded, or (ii) let the Estimation stage finish, accepting that data from the down monitor may be lost | Automated execution of the Estimation stage | Automatic recovery from monitor failures during the Estimation stage | Jul 8, 2017 | done | |
6.2 | Improve fault-tolerance of MIDAR's Discovery stage probing | Whether a monitor has rebooted due to a brief power outage or suffered an extended outage, restart the Discovery stage with the down monitor excluded | Automated execution of the Discovery stage | Automatic recovery from monitor failures during the Discovery stage | Aug 8, 2017 | done | |
6.3 | Improve fault-tolerance of MIDAR's Elimination and Corroboration "one-src" probing | When probing "one-src" candidate alias sets that can be probed entirely from a single monitor, if a monitor has rebooted, restart the Elimination/Corroboration probing on just this monitor; if a monitor is down for an extended period, redistribute its target list to the remaining monitors and conduct a follow-on Elimination/Corroboration probing round on these previously failed targets | Automated execution of "one-src" probing | Automatic recovery from monitor failures during "one-src" probing | Oct 23, 2017 | done | |
6.4 | Improve fault-tolerance of MIDAR's Elimination and Corroboration "multi-src" probing | When probing "multi-src" candidate alias sets (that must be probed from two or more monitors due to constraints of the "indir" probing method), modify the multi-src driver program to automatically detect and dynamically skip over down monitors. Investigate the feasibility of conducting a follow-on Elimination/Corroboration probing round on the alias sets involving the down monitor, after redistributing these alias sets to remaining monitors | Automated execution of "multi-src" probing | Automatic recovery from monitor failures during "multi-src" probing | Jan 23, 2018 | done | |
6.5 | Scriptable API for on-demand alias resolution | Implement a scriptable command-line tool and/or a RESTful web API for submitting addresses for on-demand alias resolution and for retrieving results | User control for the on-demand alias resolution system | Programmable API for interacting with the on-demand alias resolution system | Feb 23, 2018 (overlaps with 6.3 and 6.4) | done | |
6.6 | Query interface for known aliases | Implement a database backend for our known alias data (discovered by prior on-demand measurements or from ITDKs), and a scriptable command-line tool and/or a RESTful web API for querying known aliases of user-submitted addresses. | Query interface for currently known alias data | Database of our alias resolution data | May 23, 2018 | done | |
Final Report | |||||||
Deliverable: Final Report | report | May 23, 2018 |
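
The recovery step in subtask 6.3, where a down monitor's target list is redistributed to the remaining monitors, might look roughly like the sketch below. It is an illustration only and does not reflect MIDAR's actual code: the function `redistribute`, the monitor names, and the round-robin policy are all assumptions made for the example.

```python
def redistribute(assignments, down_monitor):
    """Hypothetical sketch: spread a down monitor's targets round-robin.

    assignments maps a monitor name to its list of target addresses.
    Returns a new mapping that omits the down monitor; the input is
    left unmodified.
    """
    remaining = sorted(m for m in assignments if m != down_monitor)
    if not remaining:
        raise ValueError("no monitors left to take over the targets")

    # Copy each surviving monitor's list so the caller's data is untouched.
    result = {m: list(assignments[m]) for m in remaining}

    # Hand out the orphaned targets one at a time, cycling over monitors.
    for i, target in enumerate(assignments.get(down_monitor, [])):
        result[remaining[i % len(remaining)]].append(target)
    return result


assignments = {
    "mon-a": ["192.0.2.1"],
    "mon-b": ["192.0.2.2"],
    "mon-c": ["192.0.2.3", "192.0.2.4", "192.0.2.5"],
}
print(redistribute(assignments, "mon-c"))
```

Round-robin is only one possible policy. A production system would presumably also weigh each monitor's current load and whether it can reach the targets at all.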
Acknowledgment of awarding agency's support
This material is based on research sponsored by the Department of Homeland Security (DHS) Science and Technology Directorate, Cyber Security Division (DHS S&T/CSD) via contract number HHSP233201600012C.