I was recently asked to check whether someone had been accessing a web service from outside the office. At first I thought that would be quite simple.
I would just go and check the access logs on the server and list any IP addresses that did not match the office IP address.
To do this I would use grep.
What is grep?
Grep stands for global regular expression print and is a command-line tool for searching files for lines that match a pattern.
I love how easy grep makes it to search through log files on servers.
A simple example of grep can be found below where we search for a domain name in an access log:
$ grep "ad-nav.co.uk" access.log
ad-nav.co.uk:80 196.0.0.1 - [10/May/2019:13:15:41 +0100]...
ad-nav.co.uk:80 196.0.0.1 - [10/May/2019:13:15:41 +0100]...
ad-nav.co.uk:80 196.0.0.1 - [10/May/2019:13:15:41 +0100]...
This lists all lines in the access.log file that contain “ad-nav.co.uk”.
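One detail worth noting is that the dots in the pattern are regex wildcards, so they match any character, not just a literal dot. The following is a minimal sketch of the difference, using a small made-up log written to a temporary file (the sample lines and the variable names are assumptions for illustration). It uses the -F option to treat the pattern as a fixed string and -c to count matching lines:

```shell
# Build a small hypothetical log in a temporary directory (sample data only).
tmpdir=$(mktemp -d)
log="$tmpdir/access.log"
cat > "$log" <<'EOF'
ad-nav.co.uk:80 196.0.0.1 - [10/May/2019:13:15:41 +0100]
ad-navXco.uk:80 196.0.0.1 - [10/May/2019:13:16:02 +0100]
adamstacey.co.uk:80 196.0.0.1 - [10/May/2019:14:18:32 +0100]
EOF

# As a regex, each "." matches any character, so "ad-navXco.uk" also matches.
regex_count=$(grep -c "ad-nav.co.uk" "$log")

# With -F the pattern is a fixed string, so only the literal domain matches.
fixed_count=$(grep -F -c "ad-nav.co.uk" "$log")

echo "regex: $regex_count, fixed: $fixed_count"

# Remove the temporary sample data.
rm -rf "$tmpdir"
```

For plain domain names in a real access log the difference rarely matters, but -F is a cheap way to avoid surprises.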
This is a very basic use of grep, and I needed to do more with it.
I needed to check access to a specific site where the request was not from the IP address of the office or my home address.
I needed to build a query or pattern with the AND, OR and NOT operators.
Using AND, OR and NOT operators in grep
Simply put, grep has no dedicated AND, OR and NOT operators. However, you can still achieve what you need to.
OR operator
OR can be achieved in several ways in grep.
You can use an escaped pipe (\|) in the pattern, which GNU grep accepts as an extension to basic regex syntax:
$ grep "ad-nav.co.uk\|adamstacey.co.uk" access.log
ad-nav.co.uk:80 196.0.0.1 - [10/May/2019:13:15:41 +0100]...
adamstacey.co.uk:80 196.0.0.1 - [10/May/2019:14:18:32 +0100]...
ad-nav.co.uk:80 196.0.0.1 - [10/May/2019:14:19:24 +0100]...
You can use the grep option -E for extended regex, where the pipe does not need to be escaped.
$ grep -E "ad-nav.co.uk|adamstacey.co.uk" access.log
ad-nav.co.uk:80 196.0.0.1 - [10/May/2019:13:15:41 +0100]...
adamstacey.co.uk:80 196.0.0.1 - [10/May/2019:14:18:32 +0100]...
ad-nav.co.uk:80 196.0.0.1 - [10/May/2019:14:19:24 +0100]...
There is also a shortcut command called egrep that works like grep -E without having to specify the option. Note that recent GNU grep releases treat egrep as obsolescent and print a warning, so grep -E is the safer choice.
$ egrep "ad-nav.co.uk|adamstacey.co.uk" access.log
ad-nav.co.uk:80 196.0.0.1 - [10/May/2019:13:15:41 +0100]...
adamstacey.co.uk:80 196.0.0.1 - [10/May/2019:14:18:32 +0100]...
ad-nav.co.uk:80 196.0.0.1 - [10/May/2019:14:19:24 +0100]...
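Another way to get OR, with no alternation syntax at all, is to pass several patterns with repeated -e options. A minimal sketch, again using a made-up sample log in a temporary file (the example.org line and the variable names are assumptions for illustration):

```shell
# Build a small hypothetical log in a temporary directory (sample data only).
tmpdir=$(mktemp -d)
log="$tmpdir/access.log"
cat > "$log" <<'EOF'
ad-nav.co.uk:80 196.0.0.1 - [10/May/2019:13:15:41 +0100]
adamstacey.co.uk:80 196.0.0.1 - [10/May/2019:14:18:32 +0100]
example.org:80 196.0.0.1 - [10/May/2019:14:19:24 +0100]
EOF

# Each -e adds a pattern; a line is printed if it matches any of them.
or_matches=$(grep -e "ad-nav.co.uk" -e "adamstacey.co.uk" "$log")
echo "$or_matches"

# Remove the temporary sample data.
rm -rf "$tmpdir"
```

This form tends to be easier to read and extend than one long pattern when the list of alternatives grows.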
AND operator
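To achieve AND in grep, we can use some regex magic: putting .* between the two patterns matches any text that separates them. The -E option is used here, although this particular pattern also works without it.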
$ grep -E "ad-nav.co.uk.*10/May/2019" access.log
ad-nav.co.uk:80 196.0.0.1 - [10/May/2019:13:15:41 +0100]...
ad-nav.co.uk:80 196.0.0.1 - [10/May/2019:14:19:24 +0100]...
It is important to note that the above matches the patterns in order: the first pattern must be followed by the second. If a line had the date before the web address, it would not be matched.
To match the web address and date in any order on a line, you can use the pipe to combine both orderings in a single regex.
$ grep -E "ad-nav.co.uk.*10/May/2019|10/May/2019.*ad-nav.co.uk" access.log
ad-nav.co.uk:80 196.0.0.1 - [10/May/2019:13:15:41 +0100]...
[10/May/2019:14:18:32 +0100] - ad-nav.co.uk:80 196.0.0.1...
ad-nav.co.uk:80 196.0.0.1 - [10/May/2019:14:19:24 +0100]...
Another way to achieve AND, and in my opinion the easiest way, especially if you are not familiar with regex, is to chain the grep commands.
$ grep "ad-nav.co.uk" access.log | grep "10/May/2019"
ad-nav.co.uk:80 196.0.0.1 - [10/May/2019:13:15:41 +0100]...
ad-nav.co.uk:80 196.0.0.1 - [10/May/2019:14:19:24 +0100]...
What we are effectively saying is: run my first grep, then run another grep on the results of that, and show me the final results.
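A side benefit of chaining is that the order of the fields on the line no longer matters. The sketch below compares the chained form with the ordered .* pattern on a made-up sample log (the sample lines, including the one with the date first, and the variable names are assumptions for illustration):

```shell
# Build a small hypothetical log in a temporary directory (sample data only).
tmpdir=$(mktemp -d)
log="$tmpdir/access.log"
cat > "$log" <<'EOF'
ad-nav.co.uk:80 196.0.0.1 - [10/May/2019:13:15:41 +0100]
[10/May/2019:14:18:32 +0100] - ad-nav.co.uk:80 196.0.0.1
ad-nav.co.uk:80 196.0.0.1 - [11/May/2019:09:00:00 +0100]
adamstacey.co.uk:80 196.0.0.1 - [10/May/2019:14:19:24 +0100]
EOF

# Ordered regex: only matches when the domain comes before the date.
ordered_count=$(grep -c "ad-nav.co.uk.*10/May/2019" "$log")

# Chained greps: the line must contain both strings, in either order.
chained_count=$(grep -F "ad-nav.co.uk" "$log" | grep -F -c "10/May/2019")

echo "ordered: $ordered_count, chained: $chained_count"

# Remove the temporary sample data.
rm -rf "$tmpdir"
```

Here the chained version also picks up the line where the date appears before the domain, which the ordered pattern misses.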
NOT operator
The NOT operator in grep is achieved by using the invert-match option -v, which prints only the lines that do not match the pattern.
$ grep -v "adamstacey.co.uk" access.log
ad-nav.co.uk:80 196.0.0.1 - [10/May/2019:13:15:41 +0100]...
ad-nav.co.uk:80 196.0.0.1 - [10/May/2019:14:19:24 +0100]...
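Putting the pieces together, the original problem (requests to a specific site that did not come from the office or from home) can be sketched by chaining an AND with a NOT over two patterns. The IP addresses below are placeholders from the documentation ranges, not real office or home addresses, and the sample log and variable names are likewise assumptions. The -w option, available in GNU and BSD grep, is used so that 203.0.113.10 does not also exclude 203.0.113.100:

```shell
# Hypothetical addresses standing in for the office and home IPs.
office_ip="203.0.113.10"
home_ip="198.51.100.7"

# Build a small hypothetical log in a temporary directory (sample data only).
tmpdir=$(mktemp -d)
log="$tmpdir/access.log"
cat > "$log" <<EOF
ad-nav.co.uk:80 $office_ip - [10/May/2019:13:15:41 +0100]
ad-nav.co.uk:80 $home_ip - [10/May/2019:13:20:00 +0100]
ad-nav.co.uk:80 192.0.2.55 - [10/May/2019:14:19:24 +0100]
adamstacey.co.uk:80 192.0.2.55 - [10/May/2019:14:25:00 +0100]
ad-nav.co.uk:80 203.0.113.100 - [10/May/2019:15:00:00 +0100]
EOF

# AND: keep lines for the site. NOT: drop lines containing either known IP.
# -F treats patterns as fixed strings; -w requires whole-word matches.
outside=$(grep -F "ad-nav.co.uk" "$log" | grep -v -F -w -e "$office_ip" -e "$home_ip")
echo "$outside"

# Remove the temporary sample data.
rm -rf "$tmpdir"
```

With this sample data, the two remaining lines are the requests from 192.0.2.55 and 203.0.113.100.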