Here are a few quick ways to use ordinary unix command-line tools to parse logs, either to classify problems or to identify activity. The aim is to help folks find out what is causing a problem on a particular interface, not long-term analysis, which should be covered somewhere else. Part of it is covered in this package available on SourceForge.Net.

Let’s assume there is an ongoing problem with an interface on a network, say even your network, and you don’t have other tools available (sniffer, cflow/netflow) to diagnose the problem for you. You’ll need to see some of the traffic passing through the interface, and you can do this with a simple access-list that logs traffic. An example access-list:

access-list 150 permit ip any any log
access-list 150 permit udp any any eq 98

Note: This should be done with the understanding that punting packets to the route processor on your router might be stressful for the route processor. This stress might lead to an outage… so be cautious. A rule of thumb for 7500 class routers (7206VXR NPE300 as well): do not attempt to log more than five thousand packets per second, because it will definitely cause problems. For 12000 class routers the numbers depend quite a bit more on the cards in use, but in general logging less than 100,000 packets per second is fine.

This access-list will cause your device to output log messages. To actually see them, hopefully you have configured a few other things first on each router:

  1. ntp time synchronization configuration
  2. logging configuration (local and remote logging)

Once the access-list is placed on the interface you should end up with log messages either in the "show log" buffer or on your remote syslog server, which is a unix server, correct? (never deploy anything important on anything else...IMHO) You would end up with log messages somewhat like this example from a live router:

Router Logs

Often you will want to see which destination or source address is using the majority of the interface. Remember that these log messages are rate-limited on the device (not every packet is represented in the log messages), though you should get a representative sample. If a host is sending 90% of the packets through the interface, it will show up as a majority contributor to the problem.

To do some simple unix command-line filtering of this file, copy and paste it into a temporary text file somewhere on the unix host in question (/tmp/t in the commands here). We can use awk/Gawk, sed, uniq and sort to filter down the log messages. A simple one-line command to find the most common source of traffic would be:

awk '{print $10}' /tmp/t | \
awk -F\( '{print $1}' | \
sort -n | \
uniq -c | \
sort -rn

which produces a list like:

   2 61.152.90.201
   2 61.152.158.114
   2 221.202.84.227
   2 220.171.97.5
   2 220.163.11.77
   1 missed
   1 88.1.198.164
   1 86.42.131.186
   1 84.173.42.238
   1 80.253.133.10
   1 71.101.241.205
   1 70.229.251.2
   1 66.46.148.161
   1 66.249.66.199
   1 65.214.44.148
   1 63.166.115.14
   1 61.188.38.225
   1 60.188.69.112
   1 60.176.92.172
   1 219.159.236.253
   1 218.64.237.146
   1 218.25.253.18
   1 218.16.121.113
   1 216.251.225.42
   1 207.45.113.212
   1 207.113.244.166
   1 204.168.24.130
   1 202.99.159.35
   1 128.11.138.201
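
The lone "missed" entry in that list is not a host. It most likely comes from the router's own rate-limit notice ("access-list logging rate-limited or missed N packets"), where the 10th word happens to be "missed". Here is a rough sketch of a variant that skips such lines by requiring the "->" separator in field 11. The sample log lines and addresses are made up for illustration, and the layout (field 10 = source, field 12 = destination) is assumed from the commands above, so adjust the field numbers if your timestamps are formatted differently:

```shell
# Hypothetical sample log; the addresses are documentation-only and invented.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
Mar  1 12:00:01 EST: %SEC-6-IPACCESSLOGP: list 150 permitted udp 192.0.2.10(1025) -> 198.51.100.5(1434), 1 packet
Mar  1 12:00:02 EST: %SEC-6-IPACCESSLOGP: list 150 permitted udp 192.0.2.10(1026) -> 198.51.100.6(1434), 1 packet
Mar  1 12:00:03 EST: %SEC-6-IPACCESSLOGP: list 150 permitted tcp 192.0.2.77(4183) -> 198.51.100.7(80), 1 packet
Mar  1 12:00:04 EST: %SEC-6-IPACCESSLOGRL: access-list logging rate-limited or missed 3 packets
EOF

# Count source addresses, ignoring any line without "->" in field 11.
top_sources() {
    awk '$11 == "->" { split($10, a, "("); print a[1] }' "$1" |
        sort | uniq -c | sort -rn
}

top_sources "$log"
```

With the sample above only the two real sources are counted, and the rate-limit notice is dropped.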

(clearly this isn’t the best data to use as an example...)
You can see that the top 5 hosts each show up twice in the monitored 1 minute timeframe. With a real attack the results are much more striking. Looking at the destination instead of the source address is as simple as making the first awk statement print out the 12th field instead of the 10th field, which produces a resulting list like:

   1 208.218.124.154
   1 199.172.133.30
   1 199.172.133.128
   1 199.172.132.184
   1 199.172.131.81
   1 199.172.128.71
   1 198.3.191.151
   1 198.3.191.101
   1 198.3.189.156
   1 198.3.188.101
   1 198.3.186.46
   1 198.3.183.122
   1 198.3.182.71
   1 198.3.182.38
   1 198.3.182.158
   1 198.3.182.157
   1 198.3.181.52
   1 198.3.181.31
   1 198.3.180.86
   1 198.3.180.248
   1 198.3.180.234
   1 198.3.180.193
   1 198.3.178.24
   1 198.3.178.216
   1 198.3.178.207
   1 198.3.178.192
   1 198.3.177.247
   1 198.3.177.2
   1 198.3.177.161
   1 198.3.177.1
   1 198.3.176.36
   1 198.3.176.171
   1 198.3.176.15

Again, not so impressive without the actual attack… but clearly it’s simple to find the sources or destinations. How about which destination ports are most heavily hit? Simple as well! In the second awk instance print the 2nd field instead of the first, then use sed to remove the trailing characters “),” like:

awk '{print $12}' /tmp/t | \
awk -F\( '{print $2}' | \
sed 's/),//' | \
sort -n | \
uniq -c | \
sort -rn

This produces a very tell-tale list:

  14 1434
   2 80
   2 445
   2 1352
   1 53359
   1 4469
   1 4183
   1 389
   1 3813
   1 283
   1 27499
   1 162
   1 161
   1 1451
   1 135
   1 123
   1 12157
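
Note that these are counts of log lines, not of packets. If your log lines end in a packet count (for example "1 packet" or "25 packets"), summing that number per port may give a better picture. A sketch under that assumption follows; the function name packets_by_port and the sample lines are made up, and field 13 is assumed to hold the count:

```shell
# Hypothetical sample log; addresses and packet counts are invented.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
Mar  1 12:00:01 EST: %SEC-6-IPACCESSLOGP: list 150 permitted udp 192.0.2.10(1025) -> 198.51.100.5(1434), 1 packet
Mar  1 12:05:01 EST: %SEC-6-IPACCESSLOGP: list 150 permitted udp 192.0.2.10(1025) -> 198.51.100.5(1434), 25 packets
Mar  1 12:05:02 EST: %SEC-6-IPACCESSLOGP: list 150 permitted tcp 192.0.2.77(4183) -> 198.51.100.7(80), 3 packets
Mar  1 12:05:03 EST: %SEC-6-IPACCESSLOGRL: access-list logging rate-limited or missed 3 packets
EOF

# Sum the per-line packet count (field 13) for each destination port.
packets_by_port() {
    awk '$11 == "->" {
            split($12, p, /[()]/)     # "198.51.100.5(1434)," -> p[2] is the port
            pkts[p[2]] += $13
         }
         END { for (port in pkts) print pkts[port], port }' "$1" | sort -rn
}

packets_by_port "$log"
```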

So, filtering port 1434 would indeed have some effect on link utilization. What protocol is on port 1434? Let’s look:

awk '{print $9 "::"  $12}' /tmp/t | \
sed 's/::.*(\(.*\)),/\/\1/' | \
sort -n | \
uniq -c | \
sort -rn

This produces the list:

  14 udp/1434
   2 tcp/80
   2 tcp/445
   2 tcp/1352
   1 udp/162
   1 udp/161
   1 udp/123
   1 tcp/53359
   1 tcp/4469
   1 tcp/4183
   1 tcp/389
   1 tcp/3813
   1 tcp/283
   1 tcp/27499
   1 tcp/1451
   1 tcp/135
   1 tcp/12157
   1 or::packets

This clearly shows us which protocol and port to filter in this case.
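
If you would rather see each protocol/port pair as a share of the logged lines, a single awk pass can do the counting and the percentage together, and it also drops stray entries like "or::packets" (most likely the router's rate-limit notice). This is only a sketch: the function name proto_port_share and the sample data are hypothetical, and the field positions are assumed to match the commands above:

```shell
# Hypothetical sample log; the addresses are invented for illustration.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
Mar  1 12:00:01 EST: %SEC-6-IPACCESSLOGP: list 150 permitted udp 192.0.2.10(1025) -> 198.51.100.5(1434), 1 packet
Mar  1 12:00:02 EST: %SEC-6-IPACCESSLOGP: list 150 permitted udp 192.0.2.11(1026) -> 198.51.100.6(1434), 1 packet
Mar  1 12:00:03 EST: %SEC-6-IPACCESSLOGP: list 150 permitted udp 192.0.2.12(1027) -> 198.51.100.8(1434), 1 packet
Mar  1 12:00:04 EST: %SEC-6-IPACCESSLOGP: list 150 permitted tcp 192.0.2.77(4183) -> 198.51.100.7(80), 1 packet
Mar  1 12:00:05 EST: %SEC-6-IPACCESSLOGRL: access-list logging rate-limited or missed 3 packets
EOF

# Print "count percent% proto/port" for every destination protocol/port pair.
proto_port_share() {
    awk '$11 == "->" {
            split($12, p, /[()]/)         # p[2] is the destination port
            count[$9 "/" p[2]]++
            total++
         }
         END {
            for (k in count)
                printf "%d %.0f%% %s\n", count[k], 100 * count[k] / total, k
         }' "$1" | sort -rn
}

proto_port_share "$log"
```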

These command-line tricks are simple to use, quick and configurable. I’d also stress that being nimble with the command-line tools that come with your operating system is a great help during times of crisis. There aren’t many folks who can watch scrolling access-list logs and tell you what the problem is, but almost everyone can look at output like the last 2 examples and see where there is a problem.

Hope this helps make searching for problem traffic easier. Perhaps in a bit, longer-term trending and analysis will be tackled here.

-Chris