The Conficker worm appeared in November 2008, spread rapidly, and has
been through a series of changes since then. The initial version,
Conficker A, began on 21 Nov 08, infecting hosts by exploiting the
MS08-067 vulnerability in Microsoft Windows. Conficker B began about
29 Dec 08, and introduced some extra techniques for spreading.
Both Conficker A and B used an algorithm to generate pseudo-random
domain names, and could download new code from a web
server when they found one at such a domain name.
CAIDA has observed traffic from Conficker-infected hosts using
the [UCSD network telescope], and documented
Conficker A and B's behaviour in
[CONFICKER-AB].
Conficker has been well documented by SRI in their technical
report [SRI-CONFICKER]. Conficker C
introduced a new mechanism for downloading new code, using infected
hosts to form a peer-to-peer (p2p) network. It also introduced a
mechanism that updates the infected host's Conficker version, and
prevents any earlier versions from running. SRI observed
[SRI-CONFICKER-C] that Conficker C
activity began on 5 Mar 09 (UTC), and increased significantly on
17 Mar 09 (UTC).
There is some confusion over the names of the Conficker variants;
Wikipedia [WIKI-CONFICKER] lists A
through E. In this article we use SRI's naming scheme; the essential
difference is that where SRI refer to B++ and C, Wikipedia refer to
C and D. Both use the same names for A, B and E.
The SRI report documents all the Conficker features, and provides an
algorithm that computes port numbers based on the packet's
destination IP address and Unix epoch week. A packet whose destination
port matches one of those computed ports is highly likely to be from a
Conficker C host.
Using that algorithm, we have investigated the behaviour of Conficker
C packets from mid-March to early May of 2009.
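The shape of that test can be sketched in a few lines of Python. This is only an outline, not SRI's algorithm: `stand_in_ports` is a hypothetical placeholder for the port generator documented in [SRI-CONFICKER], and the definition of the epoch week used here (whole weeks since 1 Jan 1970 UTC) is an assumption that should be checked against the report.

```python
import ipaddress

SECONDS_PER_WEEK = 7 * 24 * 3600


def epoch_week(unix_time):
    """Whole weeks since the Unix epoch (assumed meaning of 'epoch week')."""
    return int(unix_time) // SECONDS_PER_WEEK


def stand_in_ports(dst_ip, week):
    """Hypothetical placeholder, NOT the SRI port algorithm.

    It only mimics the interface: (destination IP, epoch week) -> set of ports.
    """
    seed = int(ipaddress.IPv4Address(dst_ip)) ^ week
    return {1024 + (seed * k) % 64512 for k in (2654435761, 40503)}


def is_candidate_p2p(dst_ip, dst_port, unix_time, port_fn=stand_in_ports):
    """True if dst_port is one of the ports computed for this address and week."""
    return dst_port in port_fn(dst_ip, epoch_week(unix_time))
```

To use this on real traces, `port_fn` would have to be replaced by an implementation of the published algorithm; everything else is bookkeeping.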
Hourly telescope data volumes
Figure 1:
Conficker C p2p packets and Telescope trace file sizes each hour
Red trace shows number of unique Conficker C hosts sending p2p
packets each hour. These are detected using the
[SRI-CONFICKER] algorithm
described above.
Less than 400kHost/h before 17 Mar 09. This is a 'background'
level of false p2p packet recognition; it is investigated
further in section 4 below.
The average arrival rate for unique Conficker C p2p hosts
has declined since its peak on 17 March, with the rate of
decline slowly decreasing. This suggests a declining
population of infected hosts sending p2p packets
Blue trace shows the size of the telescope .pcap trace files
in MB
Volumes rose slightly during 17-21 March, but were fairly
steady until 14 April
On 14 April the trace file sizes rose from about 1.5GB to
about 4.5GB.
Until then we limited the data rate for packets with
destination ports 135-139, 445, 593, 1433, 1434, 3127,
2745, 4751, 5554 and 9996.
On April 14 we removed that rate limiting, hence
the sudden increase in trace volumes
The highest packet loss rate reported by pcap for
the telescope during these measurements was about 6.7kp/s
(15%), but the median was much lower, around 5p/s (0.000001%).
In short, packet loss as reported by pcap was not significant
Trace breakdown by port number, Mar 09 compared to Nov 08
Our initial efforts to look at port numbers simply counted the number
of packets seen for every possible port. That produced large PostScript
files and images in which patterns were hard to see.
Instead, we aggregate the port numbers into five ranges, labelled
wkp, 445, xp, apps and eph in the tables below.
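A minimal sketch of this kind of aggregation is shown here. The five labels are the column names used in the tables below, but the boundaries are assumptions made for illustration: well-known ports below 1024 (excluding 445), the Windows XP default dynamic range taken as 1024-4999, application ports from 5000, and ephemeral ports from 49152.

```python
RANGE_LABELS = ("wkp", "445", "xp", "apps", "eph")


def port_range(port):
    """Map a port number to one of five coarse ranges (boundaries assumed)."""
    if not 0 <= port <= 65535:
        raise ValueError(f"invalid port: {port}")
    if port == 445:
        return "445"
    if port < 1024:
        return "wkp"
    if port < 5000:
        return "xp"
    if port < 49152:
        return "apps"
    return "eph"


def percent_by_range(ports):
    """Percentage of packets in each range, given one port number per packet."""
    counts = dict.fromkeys(RANGE_LABELS, 0)
    for port in ports:
        counts[port_range(port)] += 1
    total = sum(counts.values())
    return {k: (100.0 * v / total if total else 0.0) for k, v in counts.items()}
```

Five counters per protocol and direction are far easier to tabulate and plot than 65536 of them, which is the point of the grouping.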
To test SRI's algorithm for recognizing Conficker C p2p packets,
here are the port breakdowns for the first hour of three days:
Fri, 21 Nov 2008 (early Conficker A)
Wed, 18 Mar 2009 (after Conficker C)
Sat, 25 Apr 2009 (after Conficker E)
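In these tables, the Total column gives the number of packets
seen in an hour (0000-0100); the columns to its right show the percentage
of packets in each port range. The Source row breaks down all packets
by source port; the Dst other and Conf C p2p rows break down the
non-p2p packets and the Conficker C p2p packets by destination port.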
21 Nov 08
                  | UDP %                              | TCP %
            Total |    wkp    445     xp   apps    eph |    wkp    445     xp   apps    eph
Source        18M |   1.76   0.00   9.58   5.19   1.21 |  17.72   0.00  51.38   8.36   4.81
Dst other     18M |   3.52   0.00   1.93   8.41   3.86 |   6.91  47.82   3.39  18.12   6.04
Conf C p2p    668 |  55.84   0.00   0.00   0.15   0.30 |  32.93   0.00   0.00   4.79   5.99
18 Mar 09
                  | UDP %                              | TCP %
            Total |    wkp    445     xp   apps    eph |    wkp    445     xp   apps    eph
Source        61M |   1.96   0.00  12.73   7.76   3.08 |   9.85   0.00  32.66  26.71   5.23
Dst other     47M |   3.01   0.00   1.52   9.33   1.75 |  12.92  29.08   4.87  27.79   9.74
Conf C p2p    14M |   0.00   0.00   1.14  39.97  16.67 |   0.00   0.00   0.84  29.18  12.19
25 Apr 09
                  | UDP %                              | TCP %
            Total |    wkp    445     xp   apps    eph |    wkp    445     xp   apps    eph
Source        82M |   0.83   0.00  11.30   5.07   1.29 |  18.21   0.00  52.32   8.30   2.69
Dst other     76M |   1.46   0.00   6.55   5.93   1.40 |   2.95  54.32   7.62  14.73   5.03
Conf C p2p     7M |   0.04   0.00   1.07  37.08  15.49 |   0.02   0.00   0.91  32.03  13.37
From 21 Nov 08 (early Conficker A) to 18 Mar 09 (after Conficker C)
The number of Conficker C p2p packets increased from 668 in an hour
to 14M (lilac shading). 668/hour (in Nov 08) is too low for its
percentage port breakdown to be reliable
Almost one quarter of packets reaching the telescope on 18 Mar
were Conficker p2p. Their destination ports (cyan shading)
were mostly in the Apps and Ephemeral range, i.e. 5000 and above.
Other TCP packets sent to port 445 (yellow shading) fell sharply;
Conficker C did not send packets to probe the MS08-067 vulnerability
The percentage of packets from XP ports fell sharply (pink shading),
moving instead to the apps ports
From 18 Mar 09 (after Conficker C) to 25 Apr 09 (after Conficker E)
The number of Conficker C p2p packets (lilac shading) had fallen to
half its level on 18 Mar, suggesting that the population of Conficker C
hosts had diminished
The percentage breakdown of Conficker C destination ports (cyan shading)
had changed little since 18 Mar - the remaining Conficker C hosts
were still active as before
Other TCP packets sent to port 445 (yellow shading) rose again,
indicating that Conficker was again probing the MS08-067
vulnerability
The percentage of packets from XP ports (pink shading) rose again,
returning the telescope source port breakdown to that of 21 Nov 08
When did we begin to see Conficker C p2p packets?
In the section above, we noted that on 21 Nov 08 (i.e. early in
Conficker A, well before Conficker C) a few packets (about 0.005%) were
falsely identified as Conficker C p2p. However, by 18 Mar 09 the number
of p2p packets seen in an hour had risen to 14 million. To determine
when Conficker C actually began to send p2p packets, we investigated the
proportion of p2p vs 'other,' and UDP vs TCP packets.
Figure 2:
The rise of Conficker C, showing proportions of p2p/other packets
for TCP/UDP
Figure 2 uses stacked bars to show the proportions of p2p vs
'other' packets, and also of TCP vs UDP packets. From 21 Dec 08
through 16 Jan 09 we were not able to record telescope data, as indicated
by the orange arrow. The day SRI noted an upsurge in Conficker
C activity, 17 Mar 09, is indicated by the black arrow (upper right).
The plot shows that on and after 8 Mar 09 a significant fraction of the
packets reaching the telescope were Conficker C p2p. Although we saw
a tiny fraction of such packets before then, that 'false positive' fraction
was insignificant. Interestingly, the fractions of UDP and TCP were
similar from November through April.
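The proportions behind a stacked-bar plot of this kind can be computed with a small amount of bookkeeping. The following is a minimal sketch, assuming each packet has already been reduced to a (protocol, is_p2p) pair; the function name and input format are illustrative only.

```python
from collections import Counter


def category_fractions(packets):
    """Fractions of TCP/UDP and p2p/other packets.

    packets: iterable of (protocol, is_p2p), protocol being 'TCP' or 'UDP'.
    Returns a dict keyed by (protocol, 'p2p' | 'other') whose values sum to 1.
    """
    counts = Counter()
    for proto, is_p2p in packets:
        counts[(proto, "p2p" if is_p2p else "other")] += 1
    total = sum(counts.values())
    keys = [(pr, kind) for pr in ("TCP", "UDP") for kind in ("p2p", "other")]
    return {k: (counts[k] / total if total else 0.0) for k in keys}
```

Running it once per day of trace data would give the four bar segments for that day.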
Number of Unique p2p hosts seen during 17 Mar 09
17 Mar 09 was the day when we saw the greatest number of unique
Conficker C hosts sending p2p packets.
Figure 3 (below) was generated by a C program that reads through all
the trace files for a day, finds the p2p packets, and builds
up a hash table of their source addresses.
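The counting idea can be sketched in Python as follows. This is not the C program itself: it assumes the p2p packets have already been extracted as (timestamp, address) pairs in time order, and it keys the set on the sender's address, since that is what identifies the sending host.

```python
def cumulative_unique_hosts(packets, bin_seconds=3600):
    """Cumulative count of distinct senders at the end of each time bin.

    packets: iterable of (unix_time, src_ip) for p2p packets, in time order.
    Returns a list of (bin_start_time, unique_hosts_seen_so_far).
    """
    seen = set()          # plays the role of the hash table
    result = []
    current_bin = None
    for ts, src in packets:
        bin_start = int(ts) // bin_seconds * bin_seconds
        if current_bin is not None and bin_start != current_bin:
            result.append((current_bin, len(seen)))
        current_bin = bin_start
        seen.add(src)
    if current_bin is not None:
        result.append((current_bin, len(seen)))
    return result
```

For a day with about 3M distinct senders the set fits comfortably in memory, so a single pass over the trace files is enough.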
Figure 3:
Unique Conficker C hosts appearing on 17 Mar 09
Cumulative plot shows number of unique Conficker C
hosts that sent p2p packets on 17 Mar 09
Nearly 3M infected hosts were seen during that day
A roughly steady rate of increase during the day,
though there is some slowing down visible after
about 1800 (UTC)
In [CODE-RED] Moore, Shannon and Brown observed
that some IP prefixes appeared to have high numbers of Code-Red-infected
hosts. They suggested that could be due to "DHCP inflation," i.e. a
number of infected hosts might log out and return later, with DHCP
giving them a different IP address when they logged in again. They also
pointed out that NATs and Proxy gateways could hide a population of hosts
behind a single IP address, so that a telescope would underestimate the
actual number of infected hosts.
We have not attempted to quantify either of these two effects
in this study.
Slide Show: Spread of Conficker C
Figure 4:
Global spread of Conficker C during 16-18 Mar 09
Conficker C activity in any country is highest in the
morning (local time) for that country; it decreases in
the evening.
Most active countries were China, Russia and Brazil
This slide show was developed in JavaScript, and is based on a
webmonkey JavaScript Slideshow tutorial, modified and extended to
use the Unobtrusive Slider Control V2 from frequency decoder.
Thanks to Sebastian Castro for the 'world map' images, and to
Brad Huffaker for help in making the slide show.
IP Port Observations for packets reaching the telescope
The following plots use the port groupings described in section 3 above.
Beginning of Conficker C activity
Figure 5:
Destination port packet rates, early March 09
March 17 was the day that Conficker C activity increased significantly;
these two plots show the start of that activity.
For Destination ports:
Conficker C p2p packets increased sharply through Tuesday 17
March (red and magenta)
They show a clear diurnal pattern, with its maximum at
about 1400 UTC, i.e. 0700 local time
That increase was clear for both UDP (red) and TCP
(magenta), i.e. Conficker C uses both protocols for
its p2p network
TCP port 445 packets (blue) stay steady at about 15 Mp per hour.
That's because until 14 Apr 09 we were rate-limiting packets to
port 445 to only 2 Mb/s
There was no significant increase in packets to XP hosts
(cyan). Indeed, the occasional short spikes of XP packets
don't appear to correlate with anything in the other traces
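As a rough consistency check on the port 445 figures above, the rate limit and the hourly packet count together imply a mean packet size. The sketch below assumes that 1 Mb means 10^6 bits and that the limiter was saturated for the whole hour; both are assumptions, and the function name is illustrative.

```python
def mean_packet_bytes(rate_limit_bits_per_s, packets_per_hour):
    """Mean packet size implied by a saturated rate limit and a packet count."""
    bits_per_hour = rate_limit_bits_per_s * 3600
    return bits_per_hour / packets_per_hour / 8


# 2 Mb/s sustained for an hour, spread over about 15 M packets
print(mean_packet_bytes(2_000_000, 15_000_000))  # 60.0
```

About 60 bytes per packet is what one would expect if most of these packets are TCP SYNs carrying options but no data.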
Figure 6:
Source port packet rates, early March 09
For Source ports:
There was no significant change in TCP packets from XP
hosts (orange)
Increases in UDP packets from XP hosts (green) correlate well with
increases in p2p packets (both UDP and TCP) to XP hosts
noted above (red and magenta)
There were no significant changes in UDP or TCP packets to
ephemeral ports, i.e. to non-XP hosts
Beginning of Conficker E activity
Figure 7:
Destination port packet rates, early April 09
April 9 was when Conficker E started
[CONFICKER-E].
Conficker C doesn't send MS08-067 (port 445) packets;
it only sends p2p packets. Conficker E is supposed to have
started to send MS08-067 packets, as a means of infecting
new hosts.
For Destination ports:
Conficker C p2p packets (red and magenta) still show a clear
diurnal pattern, but at an average rate about half that in the
mid-March plot above
TCP port 445 packets (blue) are again steady because of
our rate limiting. However, the trace seems to be flatter
at the top, suggesting that 445 traffic was reaching our
rate limit of 2 Mb/s
There were more, bigger, spikes of packets to other XP
ports, again uncorrelated with the other traces
Figure 8:
Source port packet rates, early April 09
For Source ports:
All four of these traces seem similar to those for the
mid-March plot above.
Overall, the only noticeable change is a possible rise in the
volume of port 445 traffic to XP systems.
A four-day period late in April (for comparison)
Figure 9:
Destination port packet rates, late April 09
On 14 April we removed our rate limiting of common attack ports,
particularly of port 445. These two plots provide a view of
the total traffic reaching the telescope.
For Destination ports:
The total volume of packets to port 445 (blue) has doubled,
and we can now see clear diurnal variations
Interestingly, they occur about three hours later than those
for the Conficker C p2p packets (red and magenta)
There seem to be slightly more packets to other XP host ports
(cyan)
Figure 10:
Source port packet rates, late April 09
For Source ports:
Now there are clear diurnal variations in TCP packets from
XP hosts (orange). This makes it clear that the flatness of the
orange trace on the earlier plots above was also caused by our
rate limiting
The orange trace here (TCP from XP ports) is well correlated
with the blue trace above (TCP to port 445), but there are
some clear differences.
Overall, it's clear that packets to port 445 make up a significant
fraction of all the packets we see arriving at the telescope.
Packet sizes for TCP and UDP Conficker C p2p packets
We investigated whether the packet size distribution for these two
protocols had changed over time. If they had, that might provide a
way to recognise the various Conficker events.
Figures 11-13 (below) compare the telescope packet size distributions for TCP
and UDP on four dates:
Sat, 14 Mar 2009 1000 (UTC) - before Conficker C
Wed, 18 Mar 2009 1000 (UTC) - rise of Conficker C
Wed, 01 Apr 2009 1000 (UTC) - after Conficker C
Sat, 25 Apr 2009 1000 (UTC) - after Conficker E
Figure 11:
TCP packet size distributions, Mar - Apr 09
We expect most TCP packets to be attempting to open connections,
i.e. `opening SYNs.' The plot shows that a clear majority of them
have lengths less than 70 bytes; that is, they carry TCP options
(e.g. MSS, timestamp, window scale), but no data
There was no discernable difference in the TCP size distribution
for these four days. Most of the incoming TCP packets really
are just opening SYNs
Figure 12:
UDP packet size distributions, Mar - Apr 09
UDP packets don't have to open a connection, hence they can carry
actual payloads. The plot shows roughly a straight-line decrease
for sizes above 60 bytes, suggesting an exponential decrease
The number of packets bigger than about 60 bytes increased with
the rise of Conficker C (red and blue traces), but had fallen back
to Pre-C levels by 25 April
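The link between a straight line and an exponential holds when the count axis is logarithmic, which we assume here: a line ln N = ln A - b*s corresponds to N(s) = A*exp(-b*s). The following minimal sketch fits A and b by least squares on the logarithm of the counts; the function name is illustrative, and any data passed to it would be the per-size packet counts above 60 bytes.

```python
import math


def fit_exponential_decay(sizes, counts):
    """Least-squares fit of ln(count) = ln(A) - b * size; returns (A, b)."""
    pts = [(s, math.log(c)) for s, c in zip(sizes, counts) if c > 0]
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two non-zero counts")
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    if sxx == 0:
        raise ValueError("all sizes are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope
```

Comparing the fitted b across the four dates would be one way to put a number on how the tail of the distribution changed.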
Figure 13:
UDP packet size distributions as lin-lin plots
This plot shows the same data as above, viewed as a set of four linear-linear
plots, one for each day. Viewed in this way, there are clear differences
between the distributions.
14 Mar 09: most UDP packets were less than 66 bytes. 21% were 115 bytes;
these do not appear in the later plots. More interesting, 3.2% were 141
bytes and 3.00% were 138 bytes.
These last two sizes (around 140 bytes) appear on the later
plots; they seem characteristic of Conficker C
18 Mar 09: as Conficker C got started, the spike at 140 bytes became
more prominent
1 Apr 09: By the time Conficker E began, the 140-byte spike had
reduced to about the same size as before Conficker C. However,
a new large spike had appeared at 286 bytes, perhaps suggesting a new
activity pattern due to Conficker E
25 Apr 09: the 286-byte spike had gone, but a new spike had appeared
at 404 bytes, accounting for 25% of the packets reaching the telescope.
We are considering how to account for it
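Spikes like these can be picked out mechanically rather than by eye. The sketch below is one minimal way to do it, assuming the UDP packet lengths for an hour are available as a list of integers; the 3% threshold is an arbitrary choice for illustration.

```python
from collections import Counter


def size_spikes(packet_sizes, min_percent=3.0):
    """Packet sizes that account for at least min_percent of all packets.

    Returns (size, percent) pairs, largest share first.
    """
    counts = Counter(packet_sizes)
    total = sum(counts.values())
    if total == 0:
        return []
    shares = ((size, 100.0 * n / total) for size, n in counts.items())
    spikes = [(size, pct) for size, pct in shares if pct >= min_percent]
    return sorted(spikes, key=lambda item: (-item[1], item[0]))
```

Applied to each of the four hours, this would list the sizes worth a closer look, such as those near 140, 286 and 404 bytes.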
Conclusion
This investigation has provided a little more information about Conficker C
and its peer-to-peer packet behaviour. I believe that the use of both
UDP and TCP is a common characteristic of peer-to-peer networks, so
it's not surprising that Conficker C uses both. Further investigation
of Conficker C's UDP packet usage would clearly be worthwhile.
Working with data from the telescope has been challenging because it
is very different from 'normal' network trace data. Since the telescope
is completely passive, it is purely one-directional (in to the telescope),
so 'traffic flow' analysis methods cannot be used with it. Also, the
data volumes are high - between 2 and 8 GB in a one-hour trace file -
so analysis code must be carefully constructed so that it can produce
results fast enough to be useful.
Lastly, this work hinges on our ability to identify the Conficker C p2p
packets. Again, we acknowledge the work of SRI on Conficker, as
published in their Technical Report
[SRI-CONFICKER].
"The default dynamic port range for TCP/IP has changed in Windows
Vista and in Windows Server 2008," January 2008
Wikipedia
Acknowledgments
This work builds on that carried out by Emile Aben
[CONFICKER-AB] in February and March 2009. My thanks to Emile,
Stefan Savage, kc claffy, Brandon Enright, Brian Kantor, Brad Huffaker,
Sebastian Castro and Ryan Koga for their insightful discussions. Conficker
has greatly increased the amount of data flowing from the network
telescope. Managing that data has required significant amounts of technical
work by Dan Anderson, Josh Polterock and Brian Kantor; I'm very
grateful for all their help.