Overview
I recently read ‘Network Security through Data Analysis: Building Situational Awareness’ by Michael Collins and found it
to be useful and a great way to carve and explore threats, one of my main
interests. The book provides a
good overview of ‘beaconing’ and offers solutions to detect and alarm on it. The book has both breadth and depth, but I thought ‘beaconing’ was worth
exploring in detail, especially for finding persistent threats.
Beaconing in the broad sense is an effort by one entity to
contact another entity repeatedly, either to provide status or to request that
a communications channel be established.
The Mars Rover uses the Deep Space Network to beacon and communicate. Cell
phones, when turned on, beacon to nearby cell towers, and your WiFi-enabled devices utilize beacon packets, which provide a lot of information. Beaconing is also how malware initiates communications. The
issue is that the average network is awash in non-malicious beacons, and each
has to be ruled out in some way in order to detect potentially threatening beacons.
Network beaconing is unidirectional and repeated over time.
It can go from one host to another or to many other hosts, and it can use any
protocol that can convey a message. A malicious beacon stems from malicious
code. Its behavior can be consistent, such as every five minutes, or it can be transient
or conditional, making it hard to find.
Luckily, most attackers don’t want to get too creative: they are
dependent on the beacon to phone home, and they know that detecting beacons is hard.
Detecting beacons is useful but not ideal. It would be far better to detect the
malware prior to execution and even better to have a solid prevention strategy.
As of yet, malware remains elusive to most forms of detection. Enough
information is available to show that the most insidious, targeted threats persist
for years, despite so much effort placed on malware download solutions. The fact is, malware still infects hosts and
they will beacon to establish a connection. Therefore, the discovery of malicious beacons is
critical, and unless you have a signature, the probability of detection remains
low.
What is a beacon
A beacon is, for the most part, defined by the ‘sleep’ or ‘wait’ state the
malware finds itself in once executed. Sometimes the sleep is a programmable variable;
other times it is static. In some
cases it may have a variance or a range, sleeping for 900 seconds and then changing to
3600 seconds. The more consistent
and limited the sleeping done by any malware, the higher the odds of detection. Malware may have a different ‘sleep’
state for various processes, such as one for ‘phone home’ and another for ‘self-update’, and
might even use a different external host for each process.
When you look at a single beacon in a graph, it is self-evident. It is sometimes called a ‘heartbeat’ for obvious reasons, and it demonstrates a
consistent interval as shown below. The beacon fired every 1800 seconds (30
minutes) and used TCP/IP on port 443.
The consistent factors were the destination port, protocol, and source and destination
IP addresses.
Figure 1 Wireshark IO graph of a malicious TCP beacon
In the simple example above, each peak represents the
beacon, which in this case is a single TCP packet with the SYN flag set, sent every 30
minutes. Packets can be aligned into a set of bins, or buckets of sorts, based on time, input size, count, or some combination of the
three. Visualizing the data based on any of these factors is useful, but for now
we will stick to time. Viewing multiple
beacons in a single graphic becomes confusing quickly and requires the use of
bins to sort the information.
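To make the idea of time bins concrete, here is a minimal sketch in Python. It is not taken from any particular tool; the function name `bin_by_time`, the bin width, and the sample timestamps are all assumptions for illustration.

```python
from collections import Counter

def bin_by_time(timestamps, bin_seconds=60):
    """Count events per fixed-width time bin.

    Returns a dict mapping the start of each bin (in seconds) to the
    number of events that fell inside it.
    """
    counts = Counter()
    for ts in timestamps:
        # Floor each timestamp to the start of its bin.
        bin_start = int(ts // bin_seconds) * bin_seconds
        counts[bin_start] += 1
    return dict(sorted(counts.items()))

# Hypothetical SYN times (seconds) for one host pair beaconing every ~1800 s.
syn_times = [5.0, 1805.2, 3604.9, 5405.1]
print(bin_by_time(syn_times, bin_seconds=600))
```

With ten-minute bins, a 30-minute beacon lands in every third bin, which is the ‘heartbeat’ shape seen in Figure 1.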
Depending on the resolution, the multiple beacons below can quickly look
like a puzzle. Even if broken into
host pairs, a large set of images takes time to review.
Figure 2 Multiple beacons, single plot FAIL
It is difficult but not impossible to identify malicious
beacons within a large network that has dozens of protocols that beacon, such as
NTP, or services like Twitter and anti-virus updates. First you have to track as much of the network traffic as possible and
use the most common properties to eliminate heavily beaconed sites.
In the evaluation of beacon traffic, look
at the timing and variance, and start with a reasonable tolerance for
both. In the ‘spectral’ plot below,
each blue circle is centered on the mean time in seconds between connections, that is, the sleep time. Each
circle represents a beacon. The variance is a simplified
allowance for deviation within the timing itself, and the count of instances
increases the size of each blue circle.
In this case the bigger blue circles need the most attention.
Figure 3 Top beacons in a single plot, success (without labels)
Most of the beacons shown above fall below 60 seconds, and
the blue dot low and to the right is at 7220 seconds, or just over 2 hours. The test
data used was limited to 100000 TCP SYN connections from a network containing
1500 hosts over a period of three days.
The traffic was known to contain actual malware, each sample attempting
external connections. The lower the sleep, the higher the tolerance for variance.
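The three values behind each circle (mean sleep, variance, and count) can be computed per host pair roughly as follows. This is a sketch under my own assumptions: the record layout `(timestamp, src_ip, dst_ip, dst_port)` and the function name `summarize_beacons` are hypothetical, and population variance is just one reasonable choice for the ‘simplified allowance’.

```python
from collections import defaultdict
from statistics import mean, pvariance

def summarize_beacons(records):
    """Summarize connection timing per (src, dst, port).

    records: iterable of (timestamp, src_ip, dst_ip, dst_port).
    Returns {(src, dst, port): (mean_interval, variance, count)}.
    """
    times = defaultdict(list)
    for ts, src, dst, port in records:
        times[(src, dst, port)].append(ts)

    summary = {}
    for key, stamps in times.items():
        stamps.sort()
        # Gaps between consecutive connections approximate the sleep time.
        gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
        if not gaps:
            continue  # a single connection has no interval to measure
        summary[key] = (mean(gaps), pvariance(gaps), len(stamps))
    return summary

# Hypothetical records: one steady 30-minute beacon.
sample = [(0, "10.0.0.5", "203.0.113.9", 443),
          (1800, "10.0.0.5", "203.0.113.9", 443),
          (3600, "10.0.0.5", "203.0.113.9", 443)]
print(summarize_beacons(sample))
```

A steady beacon shows up as a low variance relative to its mean interval, and the count drives how much attention it deserves.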
Taking a closer look at one region, overlapping beacons that
have the same characteristics can be seen. The labels have random IP addresses but give a good
indication that multiple beacons can be reviewed in a single graph and that
malicious beacons with the same characteristics are grouped by time.
Figure 4 Malicious beacons
The above shows beacons at seven seconds and at eight seconds, where
malware attempted to reach out using rotating ports and different
addresses. Two different internal
hosts were involved in reaching out.
While the display shows a randomized destination IP, the domain name
could be displayed instead, depending on preference. It is possible that beacons exhibiting the same sleep,
variance, and destination port are the same malware infecting different
internal hosts.
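One way to surface beacons that look alike across different internal hosts is to group them by sleep time and destination port. The sketch below is only an illustration: the tuple layout, the function name `group_similar`, and the one-second tolerance are assumptions, and variance is left out of the grouping key to keep the example short.

```python
from collections import defaultdict

def group_similar(beacons, sleep_tolerance=1.0):
    """Group beacons that share a sleep time and destination port.

    beacons: iterable of (internal_ip, mean_sleep_seconds, dst_port).
    Sleep times are rounded to the nearest multiple of sleep_tolerance.
    Only groups seen from more than one internal host are returned.
    """
    groups = defaultdict(set)
    for internal_ip, sleep, port in beacons:
        bucket = round(sleep / sleep_tolerance) * sleep_tolerance
        groups[(bucket, port)].add(internal_ip)
    return {key: sorted(hosts) for key, hosts in groups.items() if len(hosts) > 1}

# Hypothetical beacons: two internal hosts share a ~7 second sleep on port 443.
candidates = [("10.0.0.5", 7.1, 443),
              ("10.0.0.9", 6.9, 443),
              ("10.0.0.7", 60.0, 80)]
print(group_similar(candidates))
```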
Features of beacons
In order to have a reasonable list of beacons, a number of
filters have to be applied to the dataset. Decide the minimum and maximum number of
connections that can qualify. In this case the minimum was set to 12 and the maximum
to 5000. The next filter is based
on the time between the first packet and the last, ignoring anything less than 15
minutes.
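The connection-count and duration filters described above could be sketched as follows. The thresholds come from the text; the function and constant names are my own, and the input is assumed to be a list of connection timestamps (in seconds) for one host pair.

```python
MIN_CONNECTIONS = 12       # minimum connections to qualify (from the text)
MAX_CONNECTIONS = 5000     # maximum connections to qualify (from the text)
MIN_DURATION = 15 * 60     # seconds between first and last packet (from the text)

def passes_volume_filters(timestamps):
    """Return True if a host pair has enough (but not too many)
    connections spread over at least the minimum duration."""
    count = len(timestamps)
    if count < MIN_CONNECTIONS or count > MAX_CONNECTIONS:
        return False
    return max(timestamps) - min(timestamps) >= MIN_DURATION

# Hypothetical host pair: 12 connections, one every 5 minutes.
print(passes_volume_filters([i * 300 for i in range(12)]))
```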
One other variable tracked is the number of internal hosts
that visit any single external host.
Malware tends to affect only a few hosts, probably five or fewer, while hundreds
visit a site like Twitter.
Removing the most popular sites increases performance and keeps analysts
from chasing the obvious.
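Counting how many internal hosts visit each external host is straightforward with a dictionary of sets. In this sketch the function name `rare_destinations` and the input layout are assumptions; the default cutoff of five follows the ‘five or fewer’ estimate above and should be tuned per network.

```python
from collections import defaultdict

def rare_destinations(pairs, max_internal_hosts=5):
    """Return external hosts contacted by only a few internal hosts.

    pairs: iterable of (internal_ip, external_ip).
    """
    visitors = defaultdict(set)
    for internal_ip, external_ip in pairs:
        visitors[external_ip].add(internal_ip)
    return {ext for ext, hosts in visitors.items()
            if len(hosts) <= max_internal_hosts}

# Hypothetical traffic: seven hosts visit a popular site, two visit a rare one.
traffic = [("10.0.0.%d" % i, "198.51.100.1") for i in range(1, 8)]
traffic += [("10.0.0.1", "203.0.113.5"), ("10.0.0.2", "203.0.113.5")]
print(rare_destinations(traffic))
```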
The remaining filters control the maximum variance and the
minimum sleep time, and remove destination ports such as port 25 for
email.
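A sketch of these last filters is below. Only port 25 comes from the text; the variance and sleep thresholds are placeholder values I picked for illustration, and the names are hypothetical.

```python
EXCLUDED_PORTS = {25}    # email, as mentioned in the text
MAX_VARIANCE = 100.0     # assumed threshold, tune per network
MIN_SLEEP = 5.0          # assumed threshold in seconds, tune per network

def keep_candidate(mean_sleep, variance, dst_port):
    """Return True if a beacon candidate survives the port, sleep,
    and variance filters."""
    if dst_port in EXCLUDED_PORTS:
        return False
    if mean_sleep < MIN_SLEEP:
        return False
    return variance <= MAX_VARIANCE

print(keep_candidate(1800.0, 2.5, 443))   # steady beacon on 443
print(keep_candidate(1800.0, 2.5, 25))    # same timing, but email port
```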
If the goal is to support continuous beacon detection, the
next logical step is to remove anything trusted or found to be benign in some way
and avoid storing unnecessary data.
Analysts who inspect traffic don’t want to see it again, and a 'white list' can be appended with inspected beacons.
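A list of reviewed beacons can be as simple as a set of destination and port pairs that is checked before anything is shown to an analyst. The layout and names here are assumptions for illustration only.

```python
def drop_reviewed(beacons, reviewed):
    """Remove beacons whose destination was already inspected.

    beacons: iterable of (src_ip, dst_ip, dst_port).
    reviewed: set of (dst_ip, dst_port) pairs judged benign.
    """
    return [b for b in beacons if (b[1], b[2]) not in reviewed]

reviewed = set()
reviewed.add(("192.0.2.10", 123))   # hypothetical NTP server judged benign

found = [("10.0.0.5", "192.0.2.10", 123),
         ("10.0.0.5", "203.0.113.9", 443)]
print(drop_reviewed(found, reviewed))
```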
Beacon analytical strategy
Environmental conditions drive the analytical strategy.
Consider what is allowed to traverse the network and how much control users
have. Environments vary from heavy oversight and strict policies to networks
that resemble an unsupervised daycare. The gain in detection in one frequently targeted network was
considerable: an average of six infected hosts were found through beacon
detection per week. Yet another network had four positive detections in six
months.
Depending on the network, detecting beacons is worth a try and, with success, should become standard for analysts.
Collection
Collection is a script that parses from a network source or
flow files. Detection starts with the collection of specific network properties
from flow, which are stored in a database.
At a minimum, three days of traffic is probably enough to evaluate for
beacons, and a week is ideal. After
a week, it would be best to wipe the database and start again.
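As a rough sketch of the storage side, the standard-library `sqlite3` module is enough to hold a few days of connection records and wipe them afterwards. The table name, columns, and helper names below are assumptions for illustration and not the schema of any released tool.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) a small store for SYN connection records."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS syn_flows ("
        "ts REAL, src TEXT, dst TEXT, dport INTEGER)"
    )
    return conn

def add_flow(conn, ts, src, dst, dport):
    """Store one connection record."""
    conn.execute("INSERT INTO syn_flows VALUES (?, ?, ?, ?)",
                 (ts, src, dst, dport))
    conn.commit()

def flow_count(conn):
    """Return the number of stored records."""
    return conn.execute("SELECT COUNT(*) FROM syn_flows").fetchone()[0]

def wipe(conn):
    """Remove all stored records, for example after a week of collection."""
    conn.execute("DELETE FROM syn_flows")
    conn.commit()

store = open_store()
add_flow(store, 0.0, "10.0.0.5", "203.0.113.9", 443)
add_flow(store, 1800.0, "10.0.0.5", "203.0.113.9", 443)
print(flow_count(store))
wipe(store)
print(flow_count(store))
```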
The more collected, the more time it takes to evaluate.
Collection should be strategic, targeting the type of traffic known to be malicious by
applying filters to the flow capture in advance. In addition, a virtual ‘cleanlist’ can be stored in
a key and checked during collection.
Start with TCP packets with the SYN flag set, then try other
protocols or specific ports to get a sense of what beacons. UDP is difficult, as most of the traffic
is beacon-like in some way, from time checks using NTP to ‘keep alive’ messages for
databases.
Analysis
Analysis is driven by a simple script that parses each flow and
does nothing but evaluate for the characteristics previously described, presenting
the findings in a tabulated text view and in the graph
previously shown.
For any beacon, one has a sense of when it started, how long it has run,
and how consistent it is, which gives a starting point for analysis.
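A tabulated text view of those three facts (start, duration, consistency) might be produced as in the sketch below. The column choice, the use of standard deviation as the consistency measure, and the function names are assumptions. The function expects at least two timestamps per beacon.

```python
from statistics import mean, pstdev

def describe_beacon(timestamps):
    """Summarize when a beacon started, how long it ran, and how
    consistent it was. Requires at least two timestamps."""
    stamps = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
    return {
        "first_seen": stamps[0],
        "duration": stamps[-1] - stamps[0],
        "mean_sleep": mean(gaps),
        "jitter": pstdev(gaps),   # standard deviation of the gaps
        "count": len(stamps),
    }

def format_row(label, info):
    """Render one beacon as a tab-separated line of text."""
    return "\t".join([
        label,
        str(info["first_seen"]),
        str(info["duration"]),
        "{:.1f}".format(info["mean_sleep"]),
        "{:.1f}".format(info["jitter"]),
        str(info["count"]),
    ])

print("pair\tfirst_seen\tduration\tmean_sleep\tjitter\tcount")
print(format_row("10.0.0.5>203.0.113.9:443", describe_beacon([0, 1800, 3600])))
```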
Analysts use the list or graph of suspect beacon traffic to
evaluate the risk factors of both the internal and external hosts. The history of the associated full
packet capture between the host pairs remains a great way to identify threats. A more advanced
approach is to inspect the host itself, specifically recent log events and the
users involved. The most important analysis is a memory sample from the involved host, examined
for the presence of malware. In some enterprises, it is worthwhile to sinkhole or block any
suspected traffic if the means are available.
That is, unless you fear more dormant beacons. In that case, consider a
simple means to parse and store all the low-frequency traffic as part of the
‘arctic vortex’, a simple and untested capability available for the most
paranoid and targeted among us.
Consider a beacon that sleeps for a month before connecting.
It seems somewhat mythical and would require a very patient attacker at the helm
with long-term objectives, or it could serve as a backup to other connections. Traffic that infrequent would be
filtered out, and really most threats are immediate. Still, if you have
significant coverage or are bored, you can store the right data and hunt for the
arctic vortex of malware, lying in cold storage waiting for activation.
Beacon Bits
I wrote and released the basic beacon detection scripts a
few years ago but made some improvements last summer, including graphing the
data. The next post will cover the
tools in detail and offer some test data to get started.
Link: https://github.com/bez0r/BeaconBits
I fully expect to move the variables into a configuration file with more guidance and release a new version soon enough.
Conclusion
The book by Michael Collins called ‘Network Security through
Data Analysis: Building Situational Awareness’ inspired this blog post, and I
highly recommend it to anyone exploring network security. The book is a great place both to get a
sense of how to use the concepts presented in this article and to evaluate other
complementary analytics.
Edited on 14Mar2014 to correct spelling errors.