Engaging Scholars in Cybersecurity Analysis: A Laboratory for Teaching and Education
This project will develop a centralized cybersecurity community hub that provides cyberinfrastructure-ready, data-driven cybersecurity training resources using real-world datasets.
Principal Investigators: Kimberly Claffy, Ka Pui Mok
Funding source: OAC-2519416. Period of performance: July 1, 2025 - June 30, 2028.
Project Summary
Overview
In the rapidly evolving information technology landscape, network security is a cornerstone of cybersecurity. Cybersecurity workforce training empowers the next generation of IT professionals to proactively mitigate risks and secure digital infrastructure against threats. But many cybersecurity training programs are limited to the study of theories, best practices, and protocols. Few programs offer exposure to network measurements where students can discern baseline behavior from sophisticated real-world network attacks. Addressing this gap requires both cutting-edge data science techniques and discipline-appropriate skills in advanced cyberinfrastructure (CI) in the classroom.
Several institutions provide rich, publicly accessible Internet infrastructure datasets for research use, but using these real-world datasets presents two major challenges: scalability and complexity. First, the prohibitively high compute and storage requirements for processing the data pose a barrier to scale. Terabyte-scale datasets, sometimes containing sensitive information, are unsuitable for transfer to personal devices, which restricts access to those with sufficient resources. Second, educators and researchers may lack the skills to use advanced CI, and the steep learning curve discourages adoption of CI in courses and research. Our project addresses both challenges by leveraging existing NSF-funded CI to foster training and preparation of a diverse STEM cybersecurity workforce.
Building on CAIDA’s successful Internet Data Science for Cybersecurity course, our vision rests on three pillars: data-driven course materials, infrastructure support for executing and grading assignments, and a user-friendly platform to organize and share resources within the community. This project will develop and deploy a Cybersecurity Community Hub (C2Hub), a centralized catalog to catalyze the building, delivering, and sharing of CI-ready cybersecurity education and training resources in the cybersecurity training community. Our immediate goal is to provide the community a one-stop platform for CI-ready course modules enabling institutions, including under-resourced ones, to broaden the adoption of CI tools and resources in the Nation’s undergraduate and graduate cybersecurity curriculum. This resource will also enhance researchers’ abilities to efficiently use advanced CI to conduct data-intensive research. We thus pursue both major goals of the solicitation.
Our team will collaborate to optimize usability, dataset security, and use of modern software stacks that facilitate machine learning/artificial intelligence (ML/AI) techniques on the data. We will address technical and policy challenges in resource sharing across institutions and adopting advanced CI. We will seed the platform with course modules that provide hands-on experience in using CI to apply data science techniques to cybersecurity analyses. We will collaborate with PIs of a CyberTraining project (NSF grant CIP-2230127) at UC San Diego to integrate suitable modules into their training program. Two partners (JHU and Calvin U.), with different class sizes, campus CI resources, and student demographics, will help us identify pedagogical challenges in different settings. Seven collaborating institutions have committed to adopt the materials in their classes.
Project leads, key team members
kc claffy (CAIDA/SDSC/UCSD)
Ricky Ka-Pui Mok (CAIDA/SDSC/UCSD)
Alexander Marder (Johns Hopkins University)
Rocky Chang (Calvin University)
Broader Impacts
Our open-source curriculum will serve as a model for other cybersecurity programs nationwide, accelerate adoption of CI in cybersecurity analyses, and expand and diversify the nation’s cybersecurity workforce. A long-term goal of this project is to achieve lasting impact on the new School of Computing, Information, and Data Science (SCIDS) at UC San Diego, which offers undergraduate and graduate data science-related programs for ≈ 1,000 students. We will introduce our course materials into the curriculum of suitable SCIDS programs/courses at various levels.
Project Tasks
| Pillar | Task # | Task Title | Detailed Description | Target Date | Subtasks |
|---|---|---|---|---|---|
| Pillar 1: Build Internet Data Science Framework (T1) | 1.1 | Conduct stakeholder analysis | (a) Perform stakeholder interviews to identify pedagogical hurdles and technical constraints in diverse institutions; (b) Define user personas and use cases; (c) Execute CI resource and security policy surveys across partner institutions. Stakeholder interviews: CAIDA, UCSD, JHU, Calvin, Daniel Zappala (BYU), Casey Deccio (BYU), Christos Papadopoulos (Memphis), Ken Calvert (U. Kentucky), Jean (UNC-Charlotte), Raffaele Sommese (Twente), Romain Fontugne (IIJ), Arpit Gupta (UCSB), Joel Sommers (Colgate), Tijay Chung (VT), Mary Thomas (SDSC), Shouhuai Xu (UC Colorado Springs) | Mar 2026 | 1. Conduct institutional interviews with MSIs and R1 partners to identify dataset portability constraints 2. Draft user personas and use case documentation for CI Contributors and CI Users |
| | 1.2 | Define platform requirements and data sharing policies | Formalize technical specifications for cybersecurity topics, module integration standards, and resource-sharing policies | Apr 2026 | 1. Define curriculum and environment specifications; identify supported topics and required software stacks 2. Establish metadata standards aligned with CAIDA resource catalog 3. Formalize multi-institutional access policies including RBAC and PII sanitization |
| | 1.3 | Migrate CAIDA Datasets to Cyberinfrastructure (CI) | Migrate datasets to CI platforms; prototype sanitization scripts; enforce RBAC for large-scale dataset access | Sep 2026 | 1. Provision Kubernetes namespaces on Nautilus/NRP 2. Develop automated PII sanitization pipeline 3. Implement role-based access control |
| | 1.4 | Scaffold platform components | Scaffold backend database and catalog; develop UI mockups; implement user roles | Oct 2026 | 1. Design dashboard and catalog wireframes 2. Build integration hooks to Nautilus S3 and CVMFS 3. Develop lightweight Python package for Jupyter integration and module tracking 4. Develop initial Hello World module and onboarding documentation |
| | 1.5 | Implement core features | Complete search and tagging; add CI job submission UI | Nov 2026 | |
| | 1.6 | Refine platform requirements based on pilot feedback | Update requirements based on classroom pilot usage | Jan 2027 | |
| | 1.7 | Integrate CAIDA SSO for unified user authentication | Implement CAIDA SSO integration | Jun 2027 | |
| | 1.8 | Perform UI/UX polish | Finalize UI/UX; conduct accessibility audit; publish documentation and API guides | Jul 2027 | |
| Pillar 2: Enrich Course Module Catalog (T2) | 2.1 | Develop dataset recipes and seed cybersecurity modules | Identify dataset recipes; refine into documented examples; seed three cybersecurity modules; develop container templates | Jun 2026 | |
| | 2.2 | Develop bridging modules for STEM students | Create bridging modules covering packet switching, IP, Python/pandas, and CI toolchains | Sep 2026 | |
| | 2.3 | Refine partner-driven module content | Update modules via feedback from JHU, Calvin, and BYU | Aug 2026 | 1. Synthesize classroom feedback 2. Iteratively refine modules 3. Adjust curriculum using pilot metrics 4. Integrate advisor recommendations |
| | 2.4 | Establish CI resource guidelines | Develop minimum and recommended CI resource guidelines | Feb 2027 | |
| | 2.5 | Expand module catalog | Expand to six modules; implement ML/AI support; optimize workflows | Mar 2027 | |
| | 2.6 | Implement course package libraries | Finalize ten-module curriculum; package for UCSD CyberTraining and SCIDS integration | Sep 2027 | |
| Pillar 3: Deliver Courses Using C2Hub (T3) | 3.1 | Launch project website | Deploy public resource discovery website | Apr 2026 | |
| | 3.2 | Conduct community outreach | Host biannual webinars; cross-institution workshops; annual planning meetings | Feb 2026; Aug 2026; Feb 2027 | |
| | 3.3 | Implement pilot courses on NRP | Onboard JHU and Calvin; deploy Jupyter notebooks; collect usability metrics | Jun 2026 | |
| | 3.4 | Develop autograder environment | Integrate nbgrader into NRP JupyterHub | Jul 2026 (one module); Mar 2027 (remaining modules) | |
| | 3.5 | Publish educator support documentation | Create deployment and troubleshooting documentation | Jul 2026 | |
| | 3.6 | Facilitate course integration | Develop scripts and tutorials for CI setup and curriculum integration | Feb 2027 | |
| | 3.7 | Final community release | Coordinate release; host end-of-project symposium | Aug 2027 | |
| Pillar 4: Evaluation and Sustainability | 4.1 | Establish project governance | Establish faculty advisory board | Feb 2026 | |
| | 4.2 | Perform baseline surveys and metrics collection | Execute baseline surveys; conduct NRP load testing | Oct 2026 | |
| | 4.3 | Analyze evaluation results and identify gaps | Perform gap analysis; develop improvement plans | Oct 2026 | |
| | 4.4 | Identify scaling opportunities | Identify new institutional users and modules | Oct 2026 | |
| | 4.5 | Identify sustainability pathways | Identify follow-on funding sources | Jan 2027 | |
| | 4.6 | Conduct mid-project evaluation | Perform mid-project evaluation; integrate feedback mechanisms; conduct workshop | Oct 2026 | |
| | 4.7 | Update requirements for long-term sustainability | Refine multi-institutional resource-sharing requirements | Mar 2027 | |
| | 4.8 | Develop operational scaling plan | Create scaling and support plan | May 2027 | |
| | 4.9 | Document impacts and disseminate best practices | Publish final evaluation report and best practices | Oct 2027 | |
| | 4.10 | Transition to community-supported operations | Secure follow-on grants; publish maintenance roadmap | Oct 2027 | |
Acknowledgment of awarding agency’s support
This material is based on research sponsored by the National Science Foundation (NSF) grant OAC-2519416. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of NSF.
Additional Content
Proposal: Engaging Scholars in Cybersecurity Analysis: A Laboratory for Teaching and Education (ESCALATE)
An abbreviated version of the original ESCALATE proposal is shown below.


