Overview
Find is a utility to find files on your computer given a wide range of search parameters. It recursively walks your filesystem (depth-first) and prints (by default) every filename that matches your search terms. Think of it like grep, but for file names rather than file contents.
You can search on a variety of file metadata, including name, creation time, update time, owner, group, size, and type (file, directory, link, socket, etc.).
Additionally, you can execute a variety of commands on the files find finds.
Terminology
If you read the docs, which I recommend that you do, you’ll see references to a handful of terms that may be unfamiliar.
- Expression
The docs state that an expression is composed of the “primaries” and “operands”. Put less opaquely, it is a search term (with any accompanying arguments) plus something to do to each of the results found.
- Primary
A primary is a search term used to limit the list of files returned by find, OR an action to perform on the list of files returned.
- Operand
Operands (or operators, as they’re sometimes called) are ways to combine expressions, like “or” and “and”.
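To make these terms concrete, here is a minimal annotated sketch. The scratch directory and the file names in it are made up for illustration; they are not part of the Ruby library used in the rest of the article.

```shell
# Build a throwaway directory so the example is safe to run anywhere.
demo=$(mktemp -d)
touch "$demo/notes.md" "$demo/config.yml" "$demo/main.rb"

# Everything after the starting path (.) is the expression:
#   -type f        primary: a search term (regular files only)
#   !              operand/operator: negates the primary that follows
#   -name '*.rb'   primary: a search term with its argument
#   -print         primary: an action to perform on each match
matches=$(cd "$demo" && find . -type f ! -name '*.rb' -print | LC_ALL=C sort)
echo "$matches"

rm -rf "$demo"
```

The output is piped through sort only because find makes no promise about the order in which it lists the entries of a directory.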
Primaries - Finding Files
Enough theory, let’s see some examples. For these examples, we will be investigating a Ruby library meant to query Jira’s API.
By file names
find . -name '*.yml'
./.rubocop.yml
./.travis.yml
This executes a search starting in the current directory (.) for all files whose name (-name) matches the shell-style glob pattern *.yml. The same search can be done in a case-insensitive fashion using -iname instead of -name.
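The difference between the two is easiest to see side by side. This is a small sketch using a scratch directory and hypothetical file names, not the project shown above.

```shell
demo=$(mktemp -d)
touch "$demo/settings.yml" "$demo/LEGACY.YML" "$demo/readme.md"

# Quote the pattern so the shell hands it to find unexpanded.
exact=$(cd "$demo" && find . -name '*.yml' | LC_ALL=C sort)

# -iname ignores case, so LEGACY.YML matches as well.
any_case=$(cd "$demo" && find . -iname '*.yml' | LC_ALL=C sort)

echo "-name matches:";  echo "$exact"
echo "-iname matches:"; echo "$any_case"

rm -rf "$demo"
```

Leaving the quotes off is a common mistake: if the current directory happens to contain a matching file, the shell expands *.yml before find ever sees it.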
Truth be told, this pattern of searching for files by some name glob accounts for 95% of my use of find. But there’s so much more it can do.
By access time
Consider the scenario where you were recently working on a project and want to see which files you have touched in the last 30 minutes. (Strictly speaking, -amin looks at access time; to match on modification time instead, use -mmin.)
find . -amin -30 -name '*.rb'
./lib/jira.rb
The -amin primary finds all files that were accessed fewer than 30 minutes ago. We specified -30 and not 30 because a bare 30 matches only files accessed exactly 30 minutes ago. If we wanted to only find files that were accessed more than 30 minutes ago, we could specify the primary with a + (-amin +30). This use of + and - applies to any primary that takes a numeric argument.
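Here is a minimal sketch of the + and - prefixes in action. It assumes a scratch directory with two hypothetical files, one of which has its access time pushed back to 2020 with touch so the result does not depend on what you opened recently.

```shell
demo=$(mktemp -d)
touch "$demo/fresh.rb"                      # access time is "now"
touch -a -t 202001010000 "$demo/stale.rb"   # access time set to 2020-01-01 00:00

# -30: accessed fewer than 30 minutes ago
recent=$(cd "$demo" && find . -type f -amin -30)

# +30: accessed more than 30 minutes ago
older=$(cd "$demo" && find . -type f -amin +30)

echo "recent: $recent"
echo "older:  $older"

rm -rf "$demo"
```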
If you wanted to look at the files accessed over a longer period of time without having to pull in a command substitution (-amin -$(expr 60 \* 24 \* 15)), you can instead use -atime, which accepts a number and a time specifier: s for seconds, m for minutes, h for hours, d for days, and w for weeks. (These unit suffixes are a feature of BSD find, the version that ships with macOS; GNU find’s -atime only takes a number of days.)
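If you are not sure which flavor of find you have, the following sketch sticks to forms that, as far as I know, both BSD and GNU find accept: shell arithmetic for the minute count, and a unitless -atime, which counts 24-hour periods. The scratch directory and file names are again made up for the example.

```shell
demo=$(mktemp -d)
touch "$demo/fresh.rb"                      # accessed just now
touch -a -t 202001010000 "$demo/stale.rb"   # accessed back in 2020

# 15 days expressed in minutes, computed by the shell rather than expr.
minutes=$((60 * 24 * 15))

# Accessed within the last 15 days, written two ways.
by_minutes=$(cd "$demo" && find . -type f -amin "-$minutes")
by_days=$(cd "$demo" && find . -type f -atime -15)

echo "$minutes minutes: $by_minutes"
echo "15 days: $by_days"

rm -rf "$demo"
```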
By creation time
If you were instead interested in the file creation time rather than access time, you could use -Bmin and its counterpart -Btime, which perform the same searches as -amin and -atime but for file creation time.
find . -Btime -3d
./lib/jira/api/just_created.rb
By owner and group
In some instances it can be handy to find files owned by a particular user, or particular group. Find has you covered.
find . -user root
./bin/stray_file
This finds all the files owned by root. Similarly, -group GROUP finds all the files set to the group GROUP.
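You do not need root-owned files lying around to try this. The sketch below searches for files owned by whoever runs it, using the numeric user ID from id -u (find accepts a numeric ID in place of a name). The scratch directory and file name are assumptions for the example.

```shell
demo=$(mktemp -d)
touch "$demo/mine.txt"

me=$(id -u)   # numeric ID of the current user

# Files owned by the current user.
owned=$(cd "$demo" && find . -type f -user "$me")

# Files owned by anyone else; empty here, since we created everything ourselves.
not_owned=$(cd "$demo" && find . -type f ! -user "$me")

echo "owned by me: $owned"
echo "owned by others: ${not_owned:-none}"

rm -rf "$demo"
```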
I rarely use these flags as I rarely work on machines with multiple users or exotic group membership, but if you’re a system administrator, these could be very handy. Regardless, learning more about the UNIX file permission model is very helpful, so I would recommend reading up on it if you’re curious.
By type and size
Shifting gears, let’s say you’re trying to track down files that are hogging your hard drive space. -size helps you find files larger or smaller than your given size.
find . -size +512k
./coverage/index.html
This will find all files larger than 512 kilobytes. You can also search in units of megabytes with M, gigabytes with G, terabytes with T, and, if you’re working with truly titanic files, petabytes using P.
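To see -size without hunting for a real space hog, this sketch builds one: a 600 KiB file of zeros next to a four-byte file, both with hypothetical names in a scratch directory.

```shell
demo=$(mktemp -d)

# 600 blocks of 1024 bytes = 600 KiB, comfortably over the 512k threshold.
dd if=/dev/zero of="$demo/big.bin" bs=1024 count=600 2>/dev/null
printf 'tiny' > "$demo/small.txt"

# Only the large file should be reported.
large=$(cd "$demo" && find . -type f -size +512k)
echo "$large"

rm -rf "$demo"
```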
I need to come clean about something. I’ve used the term files throughout this article to refer to the results that find returns from its searches, but that’s not exactly true. If you’ve experimented with find yourself, you may have even noticed that find doesn’t just return files. It returns everything on the filesystem: files, directories, symbolic links, sockets, and more.
You can tell find to only return results of a given type with the -type primary.
find . -type d ! -path '*git*' ! -path '*coverage*'
.
./bin
./spec
./spec/issue
./spec/api
./spec/report
./.yardoc
./.yardoc/objects
./lib
./lib/jira
./lib/jira/issue
./lib/jira/api
./lib/jira/report
./doc
./doc/css
./doc/js
./doc/Jira
./doc/Jira/Reporting
./doc/Jira/Issue
./doc/Jira/Api
./doc/Jira/Report
./doc/Jira/Client
./.idea
This returns only the directories in the current directory and its children. The -path primary matches against the whole path of the file, not just the file name itself, and the ! will be covered later when we talk about operands.
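A scaled-down version of that search can be sketched in a scratch directory. The layout below is hypothetical and only loosely mirrors the project above.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/lib/jira" "$demo/coverage/assets"
touch "$demo/lib/jira.rb" "$demo/coverage/index.html"

# Directories only, skipping any path that contains "coverage".
dirs=$(cd "$demo" && find . -type d ! -path '*coverage*' | LC_ALL=C sort)
echo "$dirs"

rm -rf "$demo"
```

Because -path matches the whole path, './coverage/assets' is filtered out even though the name 'assets' on its own contains nothing objectionable.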
Primaries - Acting on files
So far find has only been printing the results that it finds, which is implicitly the same as using the -print primary, but you can take other actions on the matched files as well. Similar to -print is -ls, which formats its output much as though you had run ls -l on the list of files returned.
find . -type f -name '*.md' -ls
30288144 8 -rw-r--r-- 1 jharder staff 622 Feb 8 2023 ./CHANGELOG.md
35271368 24 -rw-r--r-- 1 jharder staff 11914 Apr 13 2023 ./README.md
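A quick way to convince yourself that -print is the default is to run the same search with and without it. This sketch uses a scratch directory holding a single hypothetical Markdown file.

```shell
demo=$(mktemp -d)
printf '# Changelog\n' > "$demo/CHANGELOG.md"

# No action given: find prints each match.
implicit=$(cd "$demo" && find . -type f -name '*.md')

# The same search with the action spelled out.
explicit=$(cd "$demo" && find . -type f -name '*.md' -print)

# -ls prints a long listing for each match instead of just the path.
listing=$(cd "$demo" && find . -type f -name '*.md' -ls)

echo "$implicit"
echo "$explicit"
echo "$listing"

rm -rf "$demo"
```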
If instead you wanted to clean up a bunch of files, you can use find with the -delete primary to delete all the files find finds. No reaching for xargs necessary.
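Since -delete is unforgiving, a cautious habit is to run the search with -print first and only then swap in -delete. The sketch below does that in a scratch directory with made-up file names.

```shell
demo=$(mktemp -d)
touch "$demo/a.tmp" "$demo/b.tmp" "$demo/keep.rb"

# Dry run: see what would be removed.
(cd "$demo" && find . -type f -name '*.tmp' -print)

# Same search, now deleting. Keep -delete at the end: the expression is
# evaluated left to right, so tests placed after it would not filter anything.
(cd "$demo" && find . -type f -name '*.tmp' -delete)

remaining=$(cd "$demo" && find . -type f | LC_ALL=C sort)
echo "left over: $remaining"

rm -rf "$demo"
```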
You can run any utility you want on the list of files using the -exec primary, though there are two important things to keep in mind:
- You must terminate the command with an escaped ; in order to tell find when your exec command ends and the next primary begins.
- To reference the filename in the utility, use {}.
find . -type f -name '*.md' -exec echo {} \;
./CHANGELOG.md
./README.md
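With \; the utility runs once per file. POSIX also defines a variant that ends in + instead, which hands find's matches to the utility in batches. The sketch below counts output lines to make the difference visible; the scratch directory and file names are assumptions for the example.

```shell
demo=$(mktemp -d)
touch "$demo/CHANGELOG.md" "$demo/README.md"

# \; runs echo once per match, so each file lands on its own line.
per_file=$(cd "$demo" && find . -type f -name '*.md' -exec echo {} \; | wc -l | tr -d ' ')

# + appends as many matches as fit to a single echo, giving one line here.
batched=$(cd "$demo" && find . -type f -name '*.md' -exec echo {} + | wc -l | tr -d ' ')

echo "lines with \\;: $per_file"
echo "lines with +: $batched"

rm -rf "$demo"
```

For a cheap utility like echo the difference is cosmetic, but for something slow to start, one batched run is usually noticeably faster than thousands of individual ones.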
You can replicate the -delete primary by using -exec rm {} \; but I might wonder why you’re doing that. A helpful variation of -exec is -ok, which is identical to -exec but asks you for permission first. If you wanted to delete a handful of files but wanted to confirm before they get deleted, -ok is a great option.
find . -empty -ok rm {} \;
"rm ./lib/empty_file_2"? no
"rm ./lib/jira/api/just_created.rb"? no
"rm ./lib/empty_file_1"? yes
Operands
Operands are quite simple; in fact, most of these examples have been using them! The operators listed in the man pages are ( ), !, -and, and -or. When multiple primaries are provided to find, -and is implicitly used to link them together. find . -empty -atime +1w is the same as typing find . -empty -and -atime +1w, meaning: find all the files that are empty AND haven’t been accessed in at least a week.
( expression ) sets an order of precedence similar to how parentheses work in math. (1+3) * 4 would evaluate 1+3 before * 4 because of the parentheses. Because parentheses are special to the shell, escape them as \( and \) when you type them.
-or and ! (also expressed as -not) are similar, and follow boolean logic that you should be familiar with.
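Precedence matters more than it might seem, because the implicit -and binds more tightly than -or. The sketch below shows the classic surprise and the parenthesized fix. It uses -o, the POSIX spelling of -or, and a scratch directory with hypothetical file names.

```shell
demo=$(mktemp -d)
touch "$demo/jira.rb" "$demo/.rubocop.yml" "$demo/README.md"

# Read as: -name '*.rb'  OR  ( -name '*.yml' AND -print )
# so -print only ever fires for the .yml file.
ungrouped=$(cd "$demo" && find . -name '*.rb' -o -name '*.yml' -print | LC_ALL=C sort)

# Escaped parentheses group the two name tests, so -print applies to both.
grouped=$(cd "$demo" && find . \( -name '*.rb' -o -name '*.yml' \) -print | LC_ALL=C sort)

# ! negates the primary that follows it: every regular file except Markdown.
negated=$(cd "$demo" && find . -type f ! -name '*.md' | LC_ALL=C sort)

echo "ungrouped:"; echo "$ungrouped"
echo "grouped:";   echo "$grouped"
echo "negated:";   echo "$negated"

rm -rf "$demo"
```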
Conclusion
Find is a humble utility that seems simple on its face, but wields great power in the hands of those who know how to handle it. I’ve used find for years mostly to locate files with a certain name or file extension, but in researching this article I’ve found numerous other use cases.
I encourage you to poke around and consider find when you’re looking for some files and don’t know where they’re stored. Or you could use it to find files bigger than 1G and consider cleaning them up. Or to find empty files that have been sitting around for a long time.