What Is grep
grep stands for “global regular expression print”.
From the docs:
The grep utility searches any given input files, selecting lines that match one or more patterns.
To me, grep is basically a tool to filter the output of some command. But as you will see later in this post, you can also use it to search through the content of files, with results selected by a regular expression.
Regular expressions (regex) are sometimes scary to write, since there’s a lot of syntax that we forget soon after writing it down. In most cases, though, you will just filter for a plain term, not an actual regex. That could be “cat”, “purr”, “.txt”, etc.
Basic syntax is:
grep [OPTIONS] PATTERN [FILE...]
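As a rough sketch of that syntax in both forms, the snippet below runs the same search standalone on a file and then on piped input. The sample file and its three words are made up for illustration.

```shell
# Made-up sample data: a temporary file with three lines.
sample_file=$(mktemp)
printf '%s\n' 'cat' 'dog' 'catfish' > "$sample_file"

# Standalone: grep PATTERN FILE reads the file itself.
from_file=$(grep cat "$sample_file")

# Piped: with no FILE argument, grep reads standard input.
from_pipe=$(cat "$sample_file" | grep cat)

printf '%s\n' "$from_file"
rm -f "$sample_file"
```

Both forms select the same lines; which one you use mostly depends on whether the data already lives in a file or comes from another command.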
grep can be used standalone or with the output of other commands piped into it. Below is an example with a pipe. When you see $ at the start of a line, that line contains a command I typed; everything else is output.
$ ls -laFh
total 0
drwxr-xr-x 8 milosgarunovic staff 256B Mar 2 20:08 ./
drwxr-xr-x 8 milosgarunovic staff 256B Mar 2 19:23 ../
-rw-r--r-- 1 milosgarunovic staff 0B Mar 2 20:08 1.json
-rw-r--r-- 1 milosgarunovic staff 0B Mar 2 20:08 1.txt
-rw-r--r-- 1 milosgarunovic staff 0B Mar 2 20:08 1.txt2
-rw-r--r-- 1 milosgarunovic staff 0B Mar 2 20:08 1.xml
$ ls -laFh | grep .txt
-rw-r--r-- 1 milosgarunovic staff 0B Mar 2 20:08 1.txt
-rw-r--r-- 1 milosgarunovic staff 0B Mar 2 20:08 1.txt2
In the example above, grep goes through the input (which is the output of the ls command), keeps every line that matches the .txt search term, and prints those whole lines as output. You can filter the output of any command this way.
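One thing worth knowing about a term like .txt: in a regex, the dot matches any single character, so the pattern is slightly broader than it looks. A small sketch with made-up file names, assuming you want the literal dot, is to escape it or use -F (fixed string):

```shell
# Made-up names: only the first one really contains ".txt".
names=$(printf '%s\n' 'notes.txt' 'notes_txt' 'data.json')

# As a regex, "." matches any character, so "_txt" matches too.
regex_hits=$(printf '%s\n' "$names" | grep '.txt')

# Escaping the dot, or -F for a fixed string, matches a literal ".txt" only.
escaped_hits=$(printf '%s\n' "$names" | grep '\.txt')
fixed_hits=$(printf '%s\n' "$names" | grep -F '.txt')

printf 'regex:\n%s\nfixed:\n%s\n' "$regex_hits" "$fixed_hits"
```

For quick filtering the looser match rarely matters, but it is good to know when a result looks surprising.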
I’ve created an alias for grep: alias grep="grep --color=auto -i". So when I type grep, I’m actually running that alias.
To check that, you can run type grep and you will get grep is aliased to `grep --color=auto -i'. I know that I’m now calling the alias every time instead of the command itself, and I’m ok with that.
--color=auto colors the term you searched for (red in my case) when the output goes to a terminal.
-i, --ignore-case makes the search case insensitive. Just be aware that a case-insensitive search is slower than a case-sensitive one. But it was never unbearably slow for me. Even when I searched through about 300,000 files (which comes next in this post), I got results in under 45 seconds.
This works for me for now; I never had to drop those two arguments for something to work.
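To see what -i changes, and how to get the plain command back when an alias like the one above is in place, here is a minimal sketch. The three input words are made up; command grep is the standard shell way to skip an alias (in an interactive shell, \grep does the same).

```shell
# Made-up input with three spellings of the same word.
words=$(printf '%s\n' 'Cat' 'cat' 'CAT')

# "command grep" skips any alias, so this is a plain case-sensitive search.
# -c prints the number of matching lines instead of the lines themselves.
sensitive_count=$(printf '%s\n' "$words" | command grep -c cat)

# Adding -i makes the match ignore case.
insensitive_count=$(printf '%s\n' "$words" | command grep -c -i cat)

echo "case sensitive: $sensitive_count, case insensitive: $insensitive_count"
```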
The whole post also uses ls -laFh, which I’ve aliased too: alias ll="ls -laFh" is the command. I always type ll, but since it’s not a standard command, I’m writing out the full ls command with flags in the examples.
Search File Content
In the company I work at, we have an integration with a third-party system. The flow is that we pick up some files from them, process them, put things into the database, and save the files (even though they are processed). The files come with a standard naming format, and the important data is inside the file, the most important being the id (of user, patient, encounter, clinic etc.). One day, some data seemed to be missing from the system. We had to figure out whether we got the files in the first place, where the bug was, and so on.
Since the data wasn’t in our database (yet), the next level was to search the files. We had the ids to search for, but those ids aren’t in the file names, only inside the files. This was the first time I needed to go through about 300,000 files to find lost information, if it was even there.
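I spent quite some time on the problem explained above (to be honest, more than 4 hours of experimenting with the command line and grep, I was still new then), but I learned a lot so I don’t regret it.
To figure out which files contain the data, I searched through them with grep -Rl $SEARCH_TERM $PATH. Both variables are placeholders here: $SEARCH_TERM is the id to look for and $PATH is the directory with the files (in a real shell, pick another name, since PATH is already the variable that lists where executables live).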
From the docs:
-l, --files-with-matches
Only the names of files containing selected lines are written to standard output. grep will only search a file until a match has been found, making searches potentially less expensive. Pathnames are listed once per file searched. If the standard input is searched, the string “(standard input)” is written.
-R, -r, --recursive
Recursively search subdirectories listed.
-R searches recursively: it descends into subdirectories and looks inside the files there. Use recursion with caution.
-l prints only the name (path) of each file that contains a match, instead of the matching lines. I didn’t research it much, but -l doesn’t depend on -R; it can be used on a plain list of files and with other flags too.
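To make the difference between the two flags concrete, here is a small sketch on a throwaway directory. The file names and contents are made up for illustration.

```shell
# Made-up tree: two files at the top level and one in a subdirectory.
workdir=$(mktemp -d)
mkdir "$workdir/sub"
printf '3\n' > "$workdir/a.txt"
printf '1\n' > "$workdir/b.txt"
printf '3\n' > "$workdir/sub/c.txt"

# -l without -R: only the files named on the command line are searched.
listed=$(cd "$workdir" && grep -l 3 a.txt b.txt)

# -R with -l: descend into subdirectories as well.
# The order of results is not guaranteed, so sort them for stable output.
recursive=$(cd "$workdir" && grep -Rl 3 . | sort)

printf 'listed:\n%s\nrecursive:\n%s\n' "$listed" "$recursive"
rm -rf "$workdir"
```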
Looking at the docs for recursive search, I could also use --recursive, but for some reason I’m used to -R; you can use any one of them. For the example, I’ve made files that each contain only a single number, such as 3, just to keep it simple. So if I want to find the number 3, or any other search term, in any of these directories, I can do something like in the example below:
$ ls -laFh
total 72
drwxr-xr-x 11 milosgarunovic staff 352B Mar 2 19:26 ./
drwxr-xr-x 8 milosgarunovic staff 256B Mar 2 19:23 ../
-rw-r--r-- 1 milosgarunovic staff 2B Mar 2 19:26 1.txt
-rw-r--r-- 1 milosgarunovic staff 2B Mar 2 19:26 2.txt
-rw-r--r-- 1 milosgarunovic staff 2B Mar 2 19:26 3.txt
-rw-r--r-- 1 milosgarunovic staff 2B Mar 2 19:26 4.txt
-rw-r--r-- 1 milosgarunovic staff 2B Mar 2 19:27 5.txt
-rw-r--r-- 1 milosgarunovic staff 2B Mar 2 19:27 6.txt
-rw-r--r-- 1 milosgarunovic staff 2B Mar 2 19:26 7.txt
-rw-r--r-- 1 milosgarunovic staff 2B Mar 2 19:27 8.txt
-rw-r--r-- 1 milosgarunovic staff 2B Mar 2 19:27 9.txt
$
$ grep -Rl 3 .
./9.txt
./8.txt
$
$ grep -rl 3 .
./9.txt
./8.txt
$
$ grep -l --recursive 3 .
./9.txt
./8.txt
So what if I have subdirectories, like we have in production (PATH/error, for example)? Here’s an example of what would happen:
$ tree .
.
├── error
│   ├── 5.txt
│   ├── 6.txt
│   ├── 7.txt
│   ├── 8.txt
│   └── 9.txt
└── success
    ├── 1.txt
    ├── 2.txt
    ├── 3.txt
    └── 4.txt

2 directories, 9 files
$ grep -Rl 3 .
./error/9.txt
./error/8.txt
$ grep -Rl 1 .
./success/2.txt
./success/1.txt
./error/5.txt
./error/7.txt
grep can be chained to filter several times. An example combining the previous ones would be to search for files containing something, and then filter by file name:
$ grep -Rl 1 .
./success/2.txt
./success/1.txt
./error/5.txt
./error/7.txt
$ grep -Rl 1 . | grep success
./success/2.txt
./success/1.txt
$ grep -Rl 1 . | grep error
./error/5.txt
./error/7.txt
Of course, this is not optimal if you have subdirectories, since grep searches both directories and then you keep results from only one of them. To optimize this example, you can use flags that include or exclude the dirs you want to search; you would usually use the latter.
$ grep -Rl --exclude-dir success 1 .
./error/5.txt
./error/7.txt
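Here is the same idea as a sketch you can run on a throwaway tree (directory and file names are made up to mirror the example). It also shows a second option that needs no extra flag: pointing grep directly at the directory you care about.

```shell
# Made-up tree mirroring the success/error layout.
workdir=$(mktemp -d)
mkdir "$workdir/success" "$workdir/error"
printf '1\n' > "$workdir/success/1.txt"
printf '1\n' > "$workdir/error/5.txt"
printf '2\n' > "$workdir/error/6.txt"

# Skip the success directory while searching everything else.
excluded=$(cd "$workdir" && grep -Rl --exclude-dir=success 1 . | sort)

# Or search only the one directory you are interested in.
direct=$(cd "$workdir" && grep -Rl 1 ./error | sort)

printf 'excluded: %s\ndirect: %s\n' "$excluded" "$direct"
rm -rf "$workdir"
```

Excluding is handy when there are many directories and you want all but one; naming the directory is simpler when you only want one.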
You can also pipe the output of grep into something else, which I do from time to time, for example to count the number of files.
$ grep -Rl 1 . | wc -l
4 # this is the response of `wc' command
wc -l stands for “word count - lines”: it prints the number of lines in its input, which here is one line per matching file.
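What gets counted depends on what grep prints. As a sketch with made-up files: with -l each matching file is one line, so wc -l counts files; without -l each matching line is printed, so wc -l counts matches. The tr -d ' ' part is only there because some wc versions pad the number with spaces.

```shell
# Made-up files: a.txt has two matching lines, b.txt one, c.txt none.
workdir=$(mktemp -d)
printf '1\n1\n' > "$workdir/a.txt"
printf '1\n'    > "$workdir/b.txt"
printf '2\n'    > "$workdir/c.txt"

# -l prints one line per matching file, so this counts files.
file_count=$(grep -Rl 1 "$workdir" | wc -l | tr -d ' ')

# -h prints matching lines without file names, so this counts matching lines.
line_count=$(grep -Rh 1 "$workdir" | wc -l | tr -d ' ')

echo "files: $file_count, matching lines: $line_count"
rm -rf "$workdir"
```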
The next example is something I use when I forget some longer command that I don’t type very often and haven’t created an alias or script for. This is usually some docker command, or an ssh to a remote server whose IP address I forget.
$ history | grep ssh
6650 ssh -f firstname.lastname@example.org -L 1443:something.com:443 -N
6683 ssh -f email@example.com -L 1443:something.com:443 -N
7365 ssh -f firstname.lastname@example.org -L 1443:something.com:443 -N
Type !6650 to execute the first command from the previous example (of course this exact number won’t work on your machine, but it will work with the numbers that history shows next to your own commands). You can also type !! to execute the last command you typed, so you don’t have to press up and enter. This is especially useful when you forget to type sudo, so you can rerun the last command as sudo !!.
Using the Mac and Linux command line (I haven’t used it on Windows, so I can’t speak for that), you will find yourself forgetting a lot of flags and similar, and for that you should use the man (short for manual) pages. You can usually type man grep to get the grep manual. Once you are in the manual, you can search through it by typing / followed by the search term.
For example, while writing this post I made a mistake with the flag --exclude-dir: I typed --exclude-dirs (notice the ’s’ at the end, it shouldn’t be there), which grep didn’t recognize. So I checked the manual with this flow: man grep, then hitting / and typing the search term. In the man page (or less), navigate between matches with n for next and N for previous. Hit q to exit.
You can also use the manual online, which you can find here.
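If you would rather stay with grep than search inside the pager, one option is to pipe the manual into grep itself. The catch is that a pattern starting with a dash looks like a flag, so it has to be passed with -e. The sketch below uses a made-up three-line excerpt in place of the real manual text so it runs anywhere; on a real system the equivalent would be man grep | grep -e '--exclude-dir'.

```shell
# Made-up excerpt standing in for the output of "man grep".
manual_excerpt=$(printf '%s\n' \
  '--exclude=GLOB  skip files matching GLOB' \
  '--exclude-dir=GLOB  skip directories matching GLOB' \
  '--include=GLOB  search only files matching GLOB')

# -e marks the next argument as the pattern, even though it starts with "-".
found=$(printf '%s\n' "$manual_excerpt" | grep -e '--exclude-dir')

echo "$found"
```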
Invert Match - Exclude
If you want to find out whether some process is running, you can type ps aux | grep $TERM, where $TERM stands for your search term. This will give you:
$ ps aux | grep postgres
milosgarunovic 73808 0.9 0.0 4268036 820 s003 S+ 2:09PM 0:00.00 grep --color=auto -i postgres
milosgarunovic 888 0.0 0.0 4343260 2144 ?? Ss 21Jan19 22:07.11 postgres: stats collector process
milosgarunovic 887 0.0 0.0 4488256 3720 ?? Ss 21Jan19 5:46.78 postgres: autovacuum launcher process
# truncated output
Look at the first line of the output: it contains the grep command itself as well. If we don’t want that, we can chain greps, which is what I usually do, and exclude it like this:
$ ps aux | grep -v grep | grep postgres
milosgarunovic 888 0.0 0.0 4343260 2144 ?? Ss 21Jan19 22:07.11 postgres: stats collector process
milosgarunovic 887 0.0 0.0 4488256 3720 ?? Ss 21Jan19 5:46.78 postgres: autovacuum launcher process
# truncated output
You can put grep -v grep anywhere after the first pipe. The problem is that if you have --color=auto set for grep, like I do, your search term won’t be highlighted if you put the exclusion at the end, like ps aux | grep postgres | grep -v grep. With --color=auto, only the grep that writes to the terminal colors its matches, and at the end of that chain it is the one excluding the term grep, so you won’t get highlighting for the term postgres. So my practice is that if I want to exclude something, I exclude it right away, and the most important search term goes in the last grep.
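An alternative that is sometimes used instead of grep -v grep is to wrap one character of the search term in brackets. The pattern [p]ostgres still matches the text postgres, but the grep process's own command line contains the literal [p]ostgres, which the pattern does not match. The sketch below uses made-up ps-style lines rather than real process output.

```shell
# Made-up lines standing in for "ps aux" output.
ps_sample=$(printf '%s\n' \
  'user 73808 grep [p]ostgres' \
  'user 888 postgres: stats collector process' \
  'user 887 postgres: autovacuum launcher process')

# Chained version: drop lines containing "grep", then filter for the term.
chained=$(printf '%s\n' "$ps_sample" | grep -v grep | grep postgres)

# Bracket version: a single grep whose own command line does not match.
bracketed=$(printf '%s\n' "$ps_sample" | grep '[p]ostgres')

printf '%s\n' "$bracketed"
```

One thing to keep in mind with grep -v grep is that it also drops any other process whose line happens to contain the word grep.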
Practical example I made while writing this post
In the middle of writing this post, I found one more use for grep, and that is to list every post that is a draft. When using Hugo (which this website is based on), you have a content directory that holds all the posts that will be rendered. Every post has some metadata, one of which is draft: true/false. So, using the example from Search File Content, I made a bash script:
#!/usr/bin/env bash
grep -Rl 'draft: true' ./content
Notice that if your search term has more than one word, you should wrap it in single or double quotes.
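To try the quoting out without a real site, here is a sketch that builds a throwaway content directory with two made-up posts and runs the same search on it.

```shell
# Made-up posts: one draft, one published.
content_dir=$(mktemp -d)
printf 'title: first\ndraft: true\n'   > "$content_dir/first.md"
printf 'title: second\ndraft: false\n' > "$content_dir/second.md"

# Quoted, the two words reach grep as a single pattern.
# Unquoted, the shell would split them: "draft:" would be the pattern
# and "true" would be treated as a file name to search.
drafts=$(grep -Rl 'draft: true' "$content_dir")

echo "$drafts"
rm -rf "$content_dir"
```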
That’s all folks
Of course there are a lot more use cases for grep, but these are the ones I use on a regular basis and that have solved most of my problems that call for this tool.
Congrats on making it to the end of my first post. If you liked the content, you can follow me on twitter. Until next time, I wish you all the best.