Wandio - STARDUST
Wandio
Wandiocat
wandiocat is a tool that behaves much like the Unix cat command:
it writes the contents of a file to standard output. The main
difference is that wandiocat can also read files over HTTP and objects stored in an OpenStack Swift object store.
To output the contents of a file, which you can then pipe into another command:
user@vm001:~$ wandiocat <filename>
or
user@vm001:~$ wandiocat <url>
- Downloads the contents of that URL and outputs it to standard out.
Wandio knows how to read a file from Swift. Just specify a path to the file using a Swift URI:
user@vm001:~$ wandiocat swift://<container name>/<object name>
Example
user@vm001:~$ wandiocat swift://telescope-ucsdnt-pcap-live/datasource=ucsd-nt/year=2020/month=09/day=27/hour=09/ucsd-nt.1601197200.pcap.gz
- wandiocat handles authenticating with Swift and streaming the contents of the object. The .gz extension means the pcap is gzip-compressed, so wandio decompresses the data as it downloads; you do not have to download the file, decompress it, and then process it as separate steps.
- Because pcap is a binary format, the output is not human-readable.
- Run as-is, this command prints raw binary data to the terminal.
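The decompress-while-downloading idea can be illustrated with a small Python sketch. This is not wandio's actual implementation; it only shows the general technique of feeding gzip data to a decompressor chunk by chunk, so the whole file never has to be stored first. The function name stream_gunzip and the in-memory payload are made up for the illustration, and the sketch assumes a single-member gzip stream.

```python
import gzip
import io
import zlib


def stream_gunzip(src, dst, chunk_size=64 * 1024):
    """Copy gzip-compressed bytes from src to dst, decompressing on the fly.

    src and dst are binary file-like objects. Returns the number of
    decompressed bytes written. Assumes a single-member gzip stream.
    """
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    written = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        data = decomp.decompress(chunk)
        dst.write(data)
        written += len(data)
    # Emit anything the decompressor is still buffering.
    tail = decomp.flush()
    dst.write(tail)
    written += len(tail)
    return written


# Stand-in for a remote object: a small gzip payload held in memory.
payload = b"packet data\n" * 1000
compressed = io.BytesIO(gzip.compress(payload))
out = io.BytesIO()
n = stream_gunzip(compressed, out, chunk_size=128)
print(n)
```

Only one chunk of compressed input is held at a time, which is why a consumer further down a pipe can start working before the download has finished.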
The hexdump command (invoked here as hd)
takes binary data and writes it out in a more human-readable form.
user@vm001:~$ wandiocat swift://<container name>/<object name> | hd | head
Example
user@vm001:~$ wandiocat swift://telescope-ucsdnt-pcap-live/datasource=ucsd-nt/year=2020/month=09/day=27/hour=09/ucsd-nt.1601197200.pcap.gz | hd | head
00000000 d4 c3 b2 a1 02 00 04 00 00 00 00 00 00 00 00 00 |................|
00000010 00 00 01 00 01 00 00 00 90 54 70 5f 00 00 00 00 |.........Tp_....|
00000020 3c 00 00 00 3c 00 00 00 3c fd fe 19 d8 00 00 de |<...<...<.......|
00000030 fb ba 06 c7 08 00 45 00 00 28 ff 9f 00 00 f2 06 |......E..(......|
00000040 91 18 2d 81 21 31 2c 6f bc f6 a0 01 0d 64 42 99 |..-.!1,o.....dB.|
00000050 4e 0d 00 00 00 00 50 02 04 00 35 bf 00 00 00 00 |N.....P...5.....|
00000060 e9 75 10 0a 90 54 70 5f 01 00 00 00 3c 00 00 00 |.u...Tp_....<...|
00000070 3c 00 00 00 3c fd fe 19 d8 00 00 de fb ba 06 c7 |<...<...........|
00000080 08 00 45 00 00 2c 31 94 00 00 e7 06 07 58 df f7 |..E..,1......X..|
00000090 99 f4 2c 2d f4 c6 c5 fe 66 76 73 42 a3 24 00 00 |..,-....fvsB.$..|
- Pipe the output of the wandiocat command into hexdump. This is useful for seeing exactly what data is in the pcap file.
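To show how the hexdump can be read, the sketch below decodes the first 40 bytes of the output above with Python's struct module, assuming the classic pcap layout: a 24-byte global header followed by a 16-byte header for each packet record. The variable names are chosen for the illustration only.

```python
import struct
from datetime import datetime, timezone

# First 40 bytes of the hexdump above: the 24-byte pcap global header
# followed by the 16-byte header of the first packet record.
raw = bytes.fromhex(
    "d4 c3 b2 a1 02 00 04 00 00 00 00 00 00 00 00 00 "
    "00 00 01 00 01 00 00 00 90 54 70 5f 00 00 00 00 "
    "3c 00 00 00 3c 00 00 00"
)

# "<" = little-endian, which is what the byte order of the magic number implies.
magic, ver_major, ver_minor, thiszone, sigfigs, snaplen, linktype = struct.unpack(
    "<IHHiIII", raw[:24]
)
ts_sec, ts_usec, incl_len, orig_len = struct.unpack("<IIII", raw[24:40])

first_packet_time = datetime.fromtimestamp(ts_sec, tz=timezone.utc)

print(hex(magic), ver_major, ver_minor, snaplen, linktype)
print(first_packet_time.isoformat(), incl_len, orig_len)
```

The magic number decodes to 0xa1b2c3d4 (pcap with microsecond timestamps), the format version is 2.4, and link type 1 is Ethernet. The first packet's timestamp is 1601197200, that is 2020-09-27 09:00:00 UTC, the same value that appears in the file name ucsd-nt.1601197200.pcap.gz; the captured length is 60 bytes.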
Piping into tcpdump:
To pipe a pcap trace from the Swift object store into tcpdump, for instance:
user@vm001:~$ wandiocat swift://telescope-ucsdnt-pcap-live/datasource=ucsd-nt/year=X/month=Y/day=Z/ucsd-nt.TIMESTAMP.pcap.gz | tcpdump -r - <other tcpdump flags> | less
Note that the traces contain a lot of packets, so use less as a sink for the output.
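When the goal is only to look at the first few packets, the same idea can be sketched in Python: read packet records from an already-decompressed pcap stream and stop after a limit. This is a minimal sketch, not a replacement for tcpdump. It assumes a little-endian, microsecond-resolution pcap stream and a file-like object whose read(n) returns n bytes unless the stream has ended; iter_packets and the synthetic trace are made up for the illustration.

```python
import io
import struct

GLOBAL_HDR = struct.Struct("<IHHiIII")  # 24-byte pcap file header
RECORD_HDR = struct.Struct("<IIII")     # 16-byte per-packet header


def iter_packets(stream, limit=None):
    """Yield (ts_sec, ts_usec, data) for each packet in a pcap stream."""
    header = stream.read(GLOBAL_HDR.size)
    if len(header) < GLOBAL_HDR.size:
        raise ValueError("stream too short to be a pcap file")
    magic = GLOBAL_HDR.unpack(header)[0]
    if magic != 0xA1B2C3D4:
        raise ValueError("not a little-endian microsecond pcap stream")
    count = 0
    while limit is None or count < limit:
        rec = stream.read(RECORD_HDR.size)
        if len(rec) < RECORD_HDR.size:
            break  # end of stream
        ts_sec, ts_usec, incl_len, _orig_len = RECORD_HDR.unpack(rec)
        data = stream.read(incl_len)
        if len(data) < incl_len:
            break  # truncated final record
        yield ts_sec, ts_usec, data
        count += 1


# Synthetic five-packet trace standing in for real telescope data.
trace = GLOBAL_HDR.pack(0xA1B2C3D4, 2, 4, 0, 0, 65536, 1)
for i in range(5):
    pkt = bytes([i]) * 60
    trace += RECORD_HDR.pack(1601197200 + i, 0, len(pkt), len(pkt)) + pkt

first_three = list(iter_packets(io.BytesIO(trace), limit=3))
for ts_sec, ts_usec, data in first_three:
    print(ts_sec, ts_usec, len(data))
```

In a pipeline fed by wandiocat, the stream would presumably be sys.stdin.buffer; that combination is not tested here.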
pywandio
Python bindings for libwandio.
pywandio is a high-level Python file I/O library.
To use pywandio to access Swift objects:
import wandio
with wandio.open("swift://<container>/<object>") as fh:
    # use fh like a normal file
    for line in fh:
        print(line.rstrip())
Methods
open
Open a file from the given file path.
input Swift file path
return a file handle which can be used to read the file.
Syntax Ex.
wandio.open(<name of swift file>)
Example
wandio.open("swift://data-telescope-meta-rsdos-daily/year=2021/month=01/day=12/ucsd-nt.rsdos-daily-attacks.2021-01-12.ts=1610409600.csv.gz")
close
Closes the file that was opened.
input none
return none
Syntax Ex.
fh.close()
next
Returns the next item from the file handle.
input none
return the next item in the file.
Syntax Ex.
fh.next()
Example
fh.next()
1.179.217.50,1,42,1,151,14949,49,1610425426,1610425715,131293,TH,AS
read
Reads all the text in the file.
input optional
return all the text in the file
Syntax Ex.
fh.read()
readline
Read the next line in the file.
input optional
return the next line in the file
Syntax Ex.
fh.readline()
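How these methods interact is easiest to see on a small example. The sketch below uses io.StringIO as a stand-in for the handle returned by wandio.open, on the assumption that the handle behaves like a normal Python file object; it does not exercise pywandio itself, and whether pywandio returns text or bytes is not shown here.

```python
import io

# Stand-in for `fh = wandio.open(...)`: an in-memory file with three lines.
fh = io.StringIO("first\nsecond\nthird\n")

a = next(fh)        # built-in next(); the text above spells this fh.next()
b = fh.readline()   # continues from where next() stopped
rest = fh.read()    # everything that has not been consumed yet
end = fh.read()     # nothing left, so this is an empty string
fh.close()

print(repr(a), repr(b), repr(rest), repr(end))
```

Each call advances the same read position, so read() after next() and readline() returns only the remainder of the file, not the whole file.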
Usage
# this script should work with both python2 and python3
import wandio

files = [
    'http://data.caida.org/datasets/as-relationships/README.txt',
    'http://example.caida.org/data/external/as-rank-ribs/19980101/19980101.as-rel.txt.bz2'
]

for filename in files:
    # the with statement automatically closes the file at the end
    # of the block
    try:
        with wandio.open(filename) as fh:
            line_count = 0
            word_count = 0
            for line in fh:
                word_count += len(line.rstrip().split())
                line_count += 1
        # print the number of lines and words in file
        print(filename)
        print(line_count, word_count)
    except IOError as err:
        print(filename)
        raise err