Engaging Scholars in Cybersecurity Analysis: A Laboratory for Teaching and Education
Principal Investigators: Kimberly Claffy, Ka Pui Mok
Funding source: CyberTraining-2519416
Period of performance: July 1, 2025 - June 30, 2028
Project Summary
In the rapidly evolving information technology landscape, network security is a cornerstone of cybersecurity. Cybersecurity workforce training empowers the next generation of IT professionals to proactively mitigate risks and secure digital infrastructure against threats. But many cybersecurity training programs are limited to the study of theories, best practices, and protocols. Few programs offer exposure to network measurements where students can discern baseline behavior from sophisticated real-world network attacks. Addressing this gap requires both cutting-edge data science techniques and discipline-appropriate skills in advanced cyberinfrastructure (CI) in the classroom.
Several institutions provide rich publicly accessible Internet infrastructure datasets for research use, but using these real-world datasets presents two major challenges: scalability and complexity. First, prohibitively high compute and storage requirements for processing the data pose a barrier to scale. Terabyte-scale datasets, sometimes containing sensitive information, are unsuitable for transfer to personal devices, which restricts access to those with sufficient resources. Second, educators and researchers may lack the skills to use advanced CI. The steep learning curve discourages adoption of CI in courses and research. Our project addresses both challenges by leveraging existing NSF-funded CI to foster training and preparation of a diverse STEM cybersecurity workforce.
Building on CAIDA’s successful Internet Data Science for Cybersecurity course, our vision rests on three pillars: data-driven course materials, infrastructure support for executing and grading assignments, and a user-friendly platform to organize and share resources within the community. This project will develop and deploy a Cybersecurity Community Hub (C2Hub), a centralized catalog to catalyze the building, delivering, and sharing of CI-ready cybersecurity education and training resources in the cybersecurity training community. Our immediate goal is to provide the community with a one-stop platform for CI-ready course modules, enabling institutions, including under-resourced ones, to broaden the adoption of CI tools and resources in the Nation’s undergraduate and graduate cybersecurity curricula. This resource will also enhance researchers’ abilities to efficiently use advanced CI to conduct data-intensive research. We thus pursue both major goals of the solicitation.
Our team will collaborate to optimize usability, dataset security, and use of modern software stacks that facilitate machine learning/artificial intelligence (ML/AI) techniques on the data. We will address technical and policy challenges in sharing resources across institutions and adopting advanced CI. We will seed the platform with course modules that provide hands-on experience in using CI to apply data science techniques to cybersecurity analyses. We will collaborate with PIs of a CyberTraining project (NSF:CIP:2230127) at UC San Diego to integrate suitable modules into their training program. Two partners (JHU and Calvin U.), with different class sizes, campus CI resources, and student demographics, will help us identify pedagogical challenges in different settings. Seven collaborating institutions have committed to adopt the materials in their classes.
Broader Impacts
Our open-source curriculum will serve as a model for other cybersecurity programs nationwide, accelerate adoption of CI in cybersecurity analyses, and expand and diversify the nation’s cybersecurity workforce. A long-term goal of this project is to achieve lasting impact on the new School of Computing, Information, and Data Science (SCIDS) at UC San Diego, which offers undergraduate and graduate data science-related programs for ≈1,000 students. We will introduce our course materials into the curriculum of suitable SCIDS programs and courses at various levels.
Acknowledgment of awarding agency’s support

This material is based on research sponsored by the National Science Foundation (NSF) grant CyberTraining-2519416. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of NSF.