Finding abusers. List of most frequent IPs in Apache log

The Internet is full of malware and people with spare time who will hammer your server with bad intentions. Most of them try well-known URLs, looking for exploits in software like WordPress (/wp-admin.php, /edit, etc.).

If you watch your access_log for a while, unwanted behaviour is easy to spot. The following command lists the most frequent client IPs in your Apache log, sorted by number of requests:

[root@www3 ~]# awk '{print $1}' /var/log/httpd/access_log_20130620 | sort | uniq -c | sort -rn | head
 912545 95.27.xx.xx
  85151 66.249.78.72
  70448 66.249.78.139
  59450 95.27.40.10
  49649 178.121.54.212
  48295 91.203.166.250
  37894 157.56.92.165
  37028 157.56.92.152
  36094 157.56.93.62
  20707 157.55.32.87

Many of these IPs are bots like Google (66.249.xx.xx) or MSN (157.56.xx.xx), and they should be allowed to come and go as they please. But as the first line of the sample shows, sometimes an IP that is not a recognized bot generates a surprisingly high amount of traffic.
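The same count can be narrowed to requests for probe URLs like the ones mentioned earlier. The following is a minimal sketch, not part of the original setup: top_ips is a hypothetical helper name, and it assumes the common or combined log format, where the client IP is field 1 and the request path is field 7.

```shell
# top_ips LOGFILE [PATTERN] [N]
# Print "count ip" for the N busiest client IPs (default 10), counting only
# log lines whose request path (field 7) matches PATTERN, an awk regex.
# The default pattern "." matches any non-empty path.
top_ips() {
    awk -v pat="${2:-.}" '
        $7 ~ pat { hits[$1]++ }                      # one counter per client IP
        END { for (ip in hits) print hits[ip], ip }  # emit "count ip" pairs
    ' "$1" |
        sort -k1,1nr -k2,2 |   # highest count first, ties ordered by IP
        head -n "${3:-10}"
}
```

For example, top_ips /var/log/httpd/access_log_20130620 '^/wp-' would list the ten clients that most often requested paths starting with /wp-.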

To identify these IPs, use the whois.net service, or install the bind-utils package so you can use the "host" command to look up the reverse DNS. Example:

$ host 66.249.78.72
72.78.249.66.in-addr.arpa domain name pointer crawl-66-249-78-72.googlebot.com.
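A reverse DNS name on its own can be set to anything by whoever controls the address block, so a common extra check is to resolve that name forward and confirm it points back to the same IP. The sketch below assumes the output format of the "host" command shown above; ptr_of and fcrdns are hypothetical helper names.

```shell
# ptr_of IP
# Print the reverse DNS (PTR) name of IP without the trailing dot,
# or print nothing if the address has no PTR record.
ptr_of() {
    host "$1" 2>/dev/null |
        awk '/domain name pointer/ { sub(/\.$/, "", $NF); print $NF; exit }'
}

# fcrdns IP
# Succeed only if the PTR name of IP resolves forward to the same IP
# (forward-confirmed reverse DNS). Fails when there is no PTR record.
fcrdns() {
    name=$(ptr_of "$1")
    [ -n "$name" ] || return 1
    # "host NAME" prints lines such as "NAME has address 1.2.3.4";
    # the address is the last field.
    host "$name" 2>/dev/null |
        awk -v ip="$1" '$NF == ip { found = 1 } END { exit !found }'
}
```

With these defined, fcrdns 66.249.78.72 && echo verified prints "verified" only when both lookups agree.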

An IP with no reverse DNS is more likely to be someone playing nasty, although a missing record is only a hint, not proof. If you find that these IPs are abusing your system, you can always block their access:

iptables -I INPUT -s 95.27.xx.xx -j DROP

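But this is neither a permanent nor a desirable solution: the rule is lost on reboot unless you save it, and the abuser can simply come back from another IP. If you see this happening a lot, you might need something more generic, like limiting the rate of new connections to the machine (be careful with bots!). This also helps against some DDoS attacks. The example below accepts new connections to port 80 at up to 25 per minute, after an initial burst of 100, counted across all clients; the second rule is needed because the limit only takes effect if whatever exceeds it is then dropped:

iptables -A INPUT -p tcp --dport 80 --syn -m limit --limit 25/minute --limit-burst 100 -j ACCEPT
iptables -A INPUT -p tcp --dport 80 --syn -j DROP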

To better understand limits see this limits-module article.
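The limit match counts all clients together, so a single abusive IP can use up the allowance for everyone, bots included. A per-source variant can be sketched with the hashlimit match, which keeps a separate counter for each client IP. This is a sketch under assumptions: it needs an iptables build that supports --hashlimit-upto, the function name http_per_ip_limit and the IPT variable are hypothetical, and the rates are simply the ones from the example above. It prints the rules by default so they can be reviewed before being applied.

```shell
# Dry run by default: the rules are printed, not applied.
# Set IPT=iptables (and run as root) to actually apply them.
IPT=${IPT:-echo iptables}

# http_per_ip_limit [RATE] [BURST]
# Limit new connections to port 80 per source IP.
http_per_ip_limit() {
    # Accept new connections while this source IP stays at or under RATE,
    # after an initial allowance of BURST connections.
    $IPT -A INPUT -p tcp --dport 80 --syn -m hashlimit \
        --hashlimit-name http --hashlimit-mode srcip \
        --hashlimit-upto "${1:-25/minute}" --hashlimit-burst "${2:-100}" -j ACCEPT
    # Drop new connections from sources that exceed their own limit.
    $IPT -A INPUT -p tcp --dport 80 --syn -j DROP
}
```

Calling http_per_ip_limit with no arguments prints the two rules with the 25/minute rate and burst of 100; only new connections (SYN packets) are matched, so established connections are left to the rest of the ruleset.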

Finally, at Obolog we use a fantastic tool called GoAccess that analyzes your logs and presents the information in a good-looking format. See the screenshot: