Software Repositories - STARDUST
Key infrastructure components
https://github.com/CAIDA/dagmulticaster STARDUST uses dagmulticaster to listen on the interface receiving traffic and distribute that traffic to a dedicated multicast group. If you don’t have a DAG card available, you can use tracemcast (https://github.com/salcock/libtrace/tree/master/tools/tracemcast) to do the same job using a standard interface on a commodity NIC or using DPDK.
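To make the multicast distribution step more concrete, the following is a minimal sketch of how a consumer could join a multicast group using only Python's standard library. The group address and port are hypothetical placeholders, and the code receives plain UDP datagrams only: it does not decode whatever encapsulation the sender applies, so it illustrates group membership rather than a working STARDUST client.

```python
import socket
import struct

# Hypothetical values for illustration; a real deployment defines its own.
EXAMPLE_GROUP = "225.0.0.225"
EXAMPLE_PORT = 44000


def build_membership_request(group: str, interface_ip: str = "0.0.0.0") -> bytes:
    """Pack the ip_mreq structure expected by IP_ADD_MEMBERSHIP.

    The structure is the 4-byte group address followed by the 4-byte
    address of the local interface (0.0.0.0 lets the OS choose).
    """
    return struct.pack(
        "4s4s", socket.inet_aton(group), socket.inet_aton(interface_ip)
    )


def open_multicast_listener(
    group: str, port: int, interface_ip: str = "0.0.0.0"
) -> socket.socket:
    """Return a UDP socket that has joined the given multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Allow several consumers on the same host to bind to the same port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_ADD_MEMBERSHIP,
        build_membership_request(group, interface_ip),
    )
    return sock
```

A consumer would then call `open_multicast_listener(EXAMPLE_GROUP, EXAMPLE_PORT).recvfrom(65535)` in a loop. Because the sender distributes to a group rather than to individual hosts, any number of consumers can join without the sender needing to know about them.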
https://github.com/CAIDA/corsaro3 Corsaro is a software suite for performing large-scale analysis of trace data. It was specifically designed to be used with passive traces captured by darknets, but the overall structure is generic enough to be used with any type of passive trace data. STARDUST uses it (among other things) to read from the multicast group and save the packets into traces. STARDUST also uses:
- corsarowdcap, to read packets directly from the multicast group and write them to storage as trace files in the pcap format.
- corsarotagger, to read from the multicast group, tag traffic with additional metadata such as prefix2AS and IP geolocation, and redistribute a tagged version of the stream on another multicast group.
- corsarotrace, to run various plugins that read from the second multicast group (the one carrying tagged traffic). For example: (1) run the plugin that generates time series, which can be sent to a Kafka instance (we can then guide you on what else needs to be done to get those time series into InfluxDB); or (2) run the flowtuple plugin to save data to file in the flowtuple format, a special, roughly flow-level representation.
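As an illustration of the kind of tagging described above, the sketch below attaches a source ASN to a packet record using a longest-prefix match. It is not corsarotagger's implementation: the prefix table, the record layout, and the `src_asn` tag name are all hypothetical, and the table uses documentation prefixes and documentation ASNs.

```python
import ipaddress
from typing import Dict, Optional

# Hypothetical prefix-to-AS table (documentation prefixes and ASNs only).
PREFIX_TO_ASN = {
    ipaddress.ip_network("192.0.2.0/24"): 64500,
    ipaddress.ip_network("198.51.100.0/24"): 64501,
    ipaddress.ip_network("198.51.100.128/25"): 64502,
}


def lookup_asn(address: str) -> Optional[int]:
    """Return the ASN of the longest matching prefix, or None if none match."""
    ip = ipaddress.ip_address(address)
    best = None
    for network, asn in PREFIX_TO_ASN.items():
        # Keep the most specific (longest) prefix that contains the address.
        if ip in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, asn)
    return best[1] if best else None


def tag_packet(packet: Dict[str, object]) -> Dict[str, object]:
    """Return a copy of the packet record with a source-ASN tag attached."""
    tagged = dict(packet)
    tagged["src_asn"] = lookup_asn(str(packet["src_ip"]))
    return tagged
```

For example, `tag_packet({"src_ip": "198.51.100.200"})` picks the /25 entry over the covering /24. The linear scan keeps the sketch short; a tagger handling live traffic would more plausibly use a trie or a similar structure.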
https://github.com/CAIDA/telegraf-friendlytagger Telegraf plugin for applying “friendly” tags based on a set of well-defined tags. This is used to augment time series datapoints with human-readable names for various fields, e.g. full country, region or county names for geo-located region IDs, or AS names for ASNs.
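The idea of augmenting a datapoint with friendly tags can be sketched in a few lines of Python. This is only a conceptual illustration, not the Telegraf plugin's code or configuration: the tag keys (`country_code`, `asn`, `country_name`, `as_name`) and the lookup tables are hypothetical.

```python
from typing import Dict

# Hypothetical lookup tables; a real deployment would load these from data files.
COUNTRY_NAMES = {"NZ": "New Zealand", "US": "United States"}
AS_NAMES = {"64500": "Example Network A"}

# Maps a source tag key to (friendly tag key, lookup table).
FRIENDLY_RULES = {
    "country_code": ("country_name", COUNTRY_NAMES),
    "asn": ("as_name", AS_NAMES),
}


def add_friendly_tags(tags: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the tag set with human-readable tags added.

    Tags whose values are missing from the lookup table are left as-is,
    and no friendly tag is added for them.
    """
    enriched = dict(tags)
    for source_key, (friendly_key, table) in FRIENDLY_RULES.items():
        value = tags.get(source_key)
        if value in table:
            enriched[friendly_key] = table[value]
    return enriched
```

Calling `add_friendly_tags({"country_code": "NZ", "asn": "64500"})` keeps the original tags and adds `country_name` and `as_name` alongside them, so dashboards can display names while queries can still filter on the compact identifiers.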
Tools for users
For usage instructions, refer to https://stardust-dev.caida.org/docs/
- https://github.com/CAIDA/stardust-tools Tools and scripts for processing and analyzing network telescope data using the STARDUST platform.
- https://github.com/CAIDA/pyavro-stardust PyAvro-stardust provides an interface for fast processing of the STARDUST Avro data files using Python. The currently supported data formats are flowtuple v3 data, flowtuple v4 data, and RSDOS attack data.
- https://github.com/CAIDA/stardust-docker Dockerfiles for the STARDUST project. These can be used to create containers where users can run analysis jobs on live traffic or historical data saved in the Swift object store.
Libraries and minor components
- https://research.wand.net.nz/software/libwandio.php C library for (among other things) reading both compressed and uncompressed files directly from a Swift object store. Required by libtrace. See https://stardust-dev.caida.org/docs/tutorials/wandiotutorial/
- https://github.com/LibtraceTeam/libtrace C library for working with network packet traces. Used by tracemcast and Corsaro.
- https://github.com/CAIDA/libflowtuple A C library to read legacy (pre-STARDUST) corsaro flowtuple files.
- https://github.com/CAIDA/libtimeseries Time-series abstraction library.
- https://github.com/CAIDA/libipmeta IP metadata lookup library.
- https://github.com/CAIDA/libndagserver nDAG Multicast Server Library.
- https://github.com/CAIDA/python-avro-streamer Python module for working with streamed Avro data.
- https://github.com/CAIDA/hermes-avrofilter Swift middleware for removing sensitive fields from streamed Avro data.
Reference architectural diagram