Most novice users of Unix-like systems, Linux in particular, are not familiar with the basic command-line tools these systems provide. Let's take a closer look at what the find and grep commands do and how to use them.
Using Find and Grep commands in Linux.
FIND
The Linux find command is a command-line utility for traversing the file hierarchy. It can be used to search for files and directories and to perform subsequent operations on them. It supports searching by name, type (file or directory), timestamps such as the modification date, owner and permissions. Using -exec, other UNIX commands can be run on the files or directories found. Syntax:
$ find [where to start the search] [expression determines what to find] [-options] [what to find]
Options:
- -exec CMD {} \; - run CMD on each file that meets the preceding criteria; the test is true if CMD returns 0 as its exit status;
- -ok CMD {} \; - works the same as -exec, except that the user is prompted for confirmation first;
- -inum N - search for files with inode number "N";
- -links N - search for files with "N" hard links;
- -name demo - search for files whose name matches the pattern “demo”;
- -newer file - search for files that were modified more recently than “file”;
- -perm octal - search for files whose permission bits are exactly octal;
- -print - print the path of each file found using the other criteria;
- -empty - search for empty files and directories;
- -size +N / -N - search for files of size "N" blocks (512 bytes each); append "c" to "N" to measure the size in bytes instead; "+N" means larger than "N" blocks, and "-N" means smaller than "N" blocks;
- -user name - search for files owned by the user name or numeric ID "name";
- \( expr \) - true if "expr" is true; used to group criteria combined with OR (-o) or AND (-a).
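As a rough illustration of how these options combine, here is a minimal sketch that runs find against a throwaway directory. The directory layout and file names (notes.txt, logs/app.log and so on) are invented for the demo, and GNU or BSD find is assumed.

```shell
# Build a throwaway directory tree; every name below is made up for the demo.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/logs" "$demo_dir/empty_dir"
printf 'hello\n' > "$demo_dir/notes.txt"
printf 'error\n' > "$demo_dir/logs/app.log"
: > "$demo_dir/logs/blank.log"          # zero-byte file

# -name: match by shell pattern (quoted so the shell does not expand it first).
log_files=$(find "$demo_dir" -name '*.log' | sort)
echo "$log_files"

# -empty: zero-byte files and directories with no entries.
empty_items=$(find "$demo_dir" -empty | sort)
echo "$empty_items"

# \( ... \) groups two -name tests joined with -o (OR);
# -exec then runs wc -c on every file that passed the group.
find "$demo_dir" \( -name '*.txt' -o -name 'app.log' \) -exec wc -c {} \;

# Clean up the demo directory.
rm -rf "$demo_dir"
```

Replacing -exec with -ok in the last command would make find ask for confirmation before each wc call.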
GREP
The grep command is used to search for text in files. The name stands for “global regular expression print”, and grep is one of the most powerful and frequently used commands in Linux. It searches one or more input files for lines that match the specified pattern and writes each matching line to standard output. If no files are specified, grep reads from standard input, which is usually the output of another command. In this article, we will show you how to use the command, with practical examples and detailed explanations of the most common GNU grep options.
Command syntax
Before we start using the command, let's review the basic syntax. A grep invocation has the following form:
$ grep [OPTIONS] PATTERN [FILE ...]
Items in square brackets are optional.
- OPTIONS - zero or more options. The command provides a number of options that control its behavior.
- PATTERN - Search pattern.
- FILE - zero or more input file names.
How to enter a command to search files
The main purpose of the command is to search for text in a file. For example, to display the lines of the /etc/passwd file that contain the string bash, you can use the following command:
$ grep bash /etc/passwd
The output should look something like this:
root:x:0:0:root:/root:/bin/bash
domain:x:1000:1000:domain:/home/domain:/bin/bash
If the string contains spaces, you need to enclose it in single or double quotes:
$ grep "Gnome Display Manager" /etc/passwd
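To try this without touching the real /etc/passwd, the following sketch runs the same two searches against a small passwd-style file. The file contents and the account names in it are invented for illustration.

```shell
# A made-up passwd-style file, so the demo does not depend on the real /etc/passwd.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
gdm:x:120:125:Gnome Display Manager:/var/lib/gdm3:/bin/false
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
alice:x:1000:1000:alice:/home/alice:/bin/bash
EOF

# Single-word pattern: no quoting needed.
bash_users=$(grep bash "$sample")
echo "$bash_users"

# Pattern with spaces: quote it so the shell passes it to grep as one argument.
gdm_line=$(grep "Gnome Display Manager" "$sample")
echo "$gdm_line"

rm -f "$sample"
```

Without the quotes, grep would treat Gnome as the pattern and Display and Manager as file names.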
Invert match (exclude)
To display lines that do not match the pattern, use the -v (or --invert-match) option. For example, to display the lines of the /etc/passwd file that do not contain the string nologin, you can enter the following command:
$ grep -v nologin /etc/passwd
Output:
root:x:0:0:root:/root:/bin/bash
colord:x:124:124::/var/lib/colord:/bin/false
git:x:994:994:git daemon user:/:/usr/bin/git-shell
linuxize:x:1000:1000:linuxize:/home/linuxize:/bin/bash
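A small self-contained sketch of the inverted match, using an invented three-line account file instead of the real /etc/passwd:

```shell
# Invented sample data standing in for /etc/passwd.
sample=$(mktemp)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin' \
  'alice:x:1000:1000:alice:/home/alice:/bin/bash' > "$sample"

# -v keeps only the lines that do NOT contain "nologin".
login_users=$(grep -v nologin "$sample")
echo "$login_users"

rm -f "$sample"
```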
How to use the command to search in the output
Instead of specifying input files, you can pipe the output of another command into grep and display only the lines that match the specified pattern. For example, to find out which processes are running on your system as the www-data user, you can use the following command:
$ ps -ef | grep www-data
Output:
www-data 18247 12675 4 16:00 ? 00:00:00 php-fpm: pool www
root 18272 17714 0 16:00 pts/0 00:00:00 grep --color=auto --exclude-dir=.bzr --exclude-dir=CVS --exclude-dir=.git --exclude-dir=.hg --exclude-dir=.svn www-data
www-data 31147 12770 0 Oct22 ? 00:05:51 nginx: worker process
www-data 31148 12770 0 Oct22 ? 00:00:00 nginx: cache manager process
You can also chain multiple pipes in one command. As you can see in the output above, there is also a line containing the grep process itself. If you do not want this line to be displayed, pipe the output to another grep instance, as shown below.
$ ps -ef | grep www-data | grep -v grep
Output:
www-data 18247 12675 4 16:00 ? 00:00:00 php-fpm: pool www
www-data 31147 12770 0 Oct22 ? 00:05:51 nginx: worker process
www-data 31148 12770 0 Oct22 ? 00:00:00 nginx: cache manager process
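Because real ps output differs from machine to machine, the sketch below uses a hypothetical fake_ps function that prints a few invented process rows. It is only meant to show how the second grep in the pipeline drops the line produced by the first one.

```shell
# Stand-in for `ps -ef` so the result is reproducible; the rows are invented.
fake_ps() {
  printf '%s\n' \
    'www-data 18247 12675 4 16:00 ? 00:00:00 php-fpm: pool www' \
    'root 18272 17714 0 16:00 pts/0 00:00:00 grep www-data' \
    'www-data 31147 12770 0 Oct22 ? 00:05:51 nginx: worker process'
}

# One pipe: the grep process itself shows up in the result.
with_grep=$(fake_ps | grep www-data)
echo "$with_grep"

# Two pipes: grep -v grep removes every line that mentions grep.
without_grep=$(fake_ps | grep www-data | grep -v grep)
echo "$without_grep"
```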
Recursive search
To search for a pattern recursively, use the -r (or --recursive) option. This searches through all the files in the specified directory, skipping symbolic links that are encountered during recursion. To follow all symbolic links, use the -R (or --dereference-recursive) option. In the following example, we are looking for domain.com in all files inside the /etc directory:
$ grep -r domain.com /etc
The command will print the matching lines prefixed with the full path of the file.
/etc/hosts:127.0.0.1 node2.domain.com
/etc/nginx/sites-available/domain.com: server_name domain.com www.domain.com;
If instead of -r you use the -R option, the command will follow all symbolic links:
$ grep -R domain.com /etc
Notice the last line of the output below. It is not printed in the -r example above, because the files in the Nginx sites-enabled directory are symbolic links to configuration files inside the sites-available directory.
Output:
/etc/hosts:127.0.0.1 node2.domain.com
/etc/nginx/sites-available/domain.com: server_name domain.com www.domain.com;
/etc/nginx/sites-enabled/domain.com: server_name domain.com www.domain.com;
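The difference between -r and -R can be reproduced in a scratch directory. The sketch below assumes GNU grep (other implementations may treat symbolic links differently) and uses an invented layout that mimics the Nginx sites-available / sites-enabled convention.

```shell
# Scratch tree: one real config file plus a symlink pointing to it.
site_root=$(mktemp -d)
mkdir -p "$site_root/sites-available" "$site_root/sites-enabled"
printf 'server_name domain.com;\n' > "$site_root/sites-available/domain.com"
ln -s ../sites-available/domain.com "$site_root/sites-enabled/domain.com"

# -r: symlinks met while descending are skipped, so only the real file matches.
skip_links=$(grep -r domain.com "$site_root" | sort)
echo "$skip_links"

# -R: symlinks are followed, so the sites-enabled entry is reported as well.
follow_links=$(grep -R domain.com "$site_root" | sort)
echo "$follow_links"

rm -rf "$site_root"
```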
Show only file name
To suppress the default output and print only the names of files containing the matched pattern, use the -l (or --files-with-matches) option. For example, to search all files ending in .conf in the current working directory and print only the names of the files that contain the string domain.com, type:
$ grep -l domain.com *.conf
The output will look something like this:
tmux.conf
haproxy.conf
The -l option is usually used in conjunction with the recursive -R option:
$ grep -Rl domain.com /tmp
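Note that the option is a lowercase -l. The uppercase -L (--files-without-match) does the opposite and lists the files that contain no match. A minimal sketch with three invented files shows both:

```shell
# Invented config files; only haproxy.conf and notes.txt mention domain.com.
conf_dir=$(mktemp -d)
printf 'bind domain.com:80\n' > "$conf_dir/haproxy.conf"
printf 'set -g mouse on\n'    > "$conf_dir/tmux.conf"
printf 'domain.com\n'         > "$conf_dir/notes.txt"

# Run inside the directory so the *.conf glob expands to bare file names.
# -l: names of .conf files that contain the pattern.
matching=$(cd "$conf_dir" && grep -l domain.com *.conf)
echo "$matching"

# -L: names of .conf files that do NOT contain the pattern.
non_matching=$(cd "$conf_dir" && grep -L domain.com *.conf)
echo "$non_matching"

rm -rf "$conf_dir"
```

notes.txt is never listed because the *.conf glob does not pass it to grep.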
Case insensitivity
By default, the command is case sensitive, which means that uppercase and lowercase characters are treated as different. To ignore case when searching, use the -i (or --ignore-case) option. For example, if you search for Zebra without any option, the following command will not display any output, even though there are matching lines:
$ grep Zebra /usr/share/words
But if you perform a case-insensitive search with the -i option, the pattern will match both uppercase and lowercase letters:
$ grep -i Zebra /usr/share/words
Specifying “Zebra” will match “zebra”, “ZEbrA” or any other combination of uppercase and lowercase letters.
Output:
zebra
zebra's
zebras
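Since /usr/share/words is not installed on every system, the following sketch repeats the comparison against a tiny invented word list:

```shell
# Tiny stand-in for /usr/share/words.
words=$(mktemp)
printf '%s\n' zebra "zebra's" zebras horse > "$words"

# Case-sensitive search: "Zebra" with a capital Z matches nothing here.
case_sensitive=$(grep Zebra "$words")
echo "case-sensitive result: [$case_sensitive]"

# -i ignores case, so all three zebra entries match.
case_insensitive=$(grep -i Zebra "$words")
echo "$case_insensitive"

rm -f "$words"
```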
Exact match
When searching for gnu, grep will also print the lines where gnu is embedded in larger words, such as cygnus or magnum.
$ grep gnu /usr/share/words
Output:
cygnus
gnu
interregnum
lgnu9d
lignum
magnum
magnuson
sphagnum
wingnut
To return only those lines where the specified string is a whole word (not embedded in a larger word), you can use the -w (or --word-regexp) option.
IMPORTANT. Word characters include alphanumeric characters (a-z, A-Z and 0-9) and underscores (_). All other characters are treated as non-word characters.
If you run the same command as above with the -w option added, the command will return only those lines that include gnu as a separate word.
$ grep -w gnu /usr/share/words
Output: gnu
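The word-character rule is easiest to see with a hyphen and an underscore side by side. In this sketch the word list is invented: gnu-linux matches because a hyphen is a non-word character, while gnu_tools does not because an underscore counts as part of the word.

```shell
# Invented word list with gnu embedded in different ways.
words=$(mktemp)
printf '%s\n' cygnus gnu magnum gnu-linux gnu_tools > "$words"

# -w: gnu must be delimited by non-word characters or by the line boundaries.
whole_word=$(grep -w gnu "$words")
echo "$whole_word"

rm -f "$words"
```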
Show line numbers
To show the line numbers of the lines containing a pattern, use the -n (or --line-number) option. With this option, grep prints each match to standard output prefixed with the number of the line on which it was found. For example, to display the lines of the /etc/services file that contain the string 10000, prefixed with their line numbers, you can use the following command:
$ grep -n 10000 /etc/services
The output below shows that the matches are on lines 10423 and 10424.
Output:
10423:ndmp 10000/tcp
10424:ndmp 10000/udp
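A short sketch of the same idea on an invented three-line services file, where the matches fall on lines 2 and 3:

```shell
# Invented stand-in for /etc/services.
services=$(mktemp)
printf '%s\n' 'http 80/tcp' 'ndmp 10000/tcp' 'ndmp 10000/udp' > "$services"

# -n prefixes every matching line with its line number and a colon.
numbered=$(grep -n 10000 "$services")
echo "$numbered"

rm -f "$services"
```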
Count
To print the number of matching lines to standard output, use the -c (or --count) option. In the example below, we count the number of accounts that have /usr/bin/zsh as their shell.
$ grep -c '/usr/bin/zsh' /etc/passwd
Output: 4
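The counting can be checked on a small invented account file. Note that -c counts matching lines, not individual occurrences of the pattern.

```shell
# Invented accounts: two use zsh, one uses bash.
accounts=$(mktemp)
printf '%s\n' \
  'alice:x:1000:1000::/home/alice:/usr/bin/zsh' \
  'bob:x:1001:1001::/home/bob:/bin/bash' \
  'carol:x:1002:1002::/home/carol:/usr/bin/zsh' > "$accounts"

# -c prints only the number of lines that matched.
zsh_count=$(grep -c '/usr/bin/zsh' "$accounts")
echo "$zsh_count"

rm -f "$accounts"
```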
Multiple strings (patterns)
Two or more search patterns can be joined using the OR operator |. By default, the command interprets the pattern as a basic regular expression, in which metacharacters such as | lose their special meaning, and their backslashed versions must be used instead. In the example below, we search for all occurrences of the words fatal, error, and critical in the Nginx error log file:
$ grep 'fatal\|error\|critical' /var/log/nginx/error.log
If you use the extended regular expression option -E (or --extended-regexp), the operator | should not be escaped, as shown below:
$ grep -E 'fatal|error|critical' /var/log/nginx/error.log
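The two spellings should select exactly the same lines. The sketch below checks this on an invented log file; it assumes GNU grep, since the backslashed \| in basic mode is a GNU extension.

```shell
# Invented log lines standing in for the Nginx error log.
log=$(mktemp)
printf '%s\n' \
  '[error] upstream timed out' \
  '[notice] signal process started' \
  '[crit] fatal: bind failed' > "$log"

# Basic regular expression: the alternation operator must be written as \|
bre_hits=$(grep 'fatal\|error\|critical' "$log")

# Extended regular expression: a plain | is enough.
ere_hits=$(grep -E 'fatal|error|critical' "$log")

echo "$ere_hits"
rm -f "$log"
```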
Regular expression
GNU grep has two regular expression feature sets: Basic and Extended. By default, grep interprets the pattern as a basic regular expression; to switch to extended regular expressions, you need to use the -E option. In basic mode, all characters other than metacharacters are regular expressions that match themselves. Below is a list of the most commonly used metacharacters:
- Use the ^ (caret) symbol to match an expression at the beginning of a line. In the following example, the string kangaroo will match only if it occurs at the very beginning of a line: $ grep "^kangaroo" file.txt
- Use the $ (dollar) symbol to match an expression at the end of a line. In the following example, the string kangaroo will match only if it occurs at the very end of a line: $ grep "kangaroo$" file.txt
- Use the . (dot) symbol to match any single character. For example, to match anything that begins with kan, is followed by any two characters, and ends with roo, you can use the following pattern: $ grep "kan..roo" file.txt
- Use [ ] (brackets) to match any single character enclosed in the brackets. For example, to find the lines that contain accept or accent, you can use the following pattern: $ grep "acce[np]t" file.txt
To escape the special meaning of the next character, use the \ (backslash) symbol.
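The metacharacters above can be tried together on one small file. The sample lines are invented, and the patterns are single-quoted here so that the shell leaves $ and \ untouched.

```shell
# Invented sample lines covering each metacharacter.
file=$(mktemp)
printf '%s\n' 'kangaroo hops' 'see the kangaroo' 'kanxyroo' \
  'accept' 'accent' 'access' 'a.b' 'axb' > "$file"

starts=$(grep '^kangaroo' "$file")     # anchored at the start of the line
ends=$(grep 'kangaroo$' "$file")       # anchored at the end of the line
two_any=$(grep 'kan..roo' "$file")     # each dot stands for any one character
bracket=$(grep 'acce[np]t' "$file")    # n or p in the fifth position
literal_dot=$(grep 'a\.b' "$file")     # backslash makes the dot literal

printf '%s\n' "$starts" "$ends" "$two_any" "$bracket" "$literal_dot"
rm -f "$file"
```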
Extended regular expressions
To interpret a pattern as an extended regular expression, use the -E (or --extended-regexp) option. Extended regular expressions include all basic metacharacters, as well as additional metacharacters for creating more complex and powerful search patterns. Below are some examples:
- Match and extract all email addresses from a given file: $ grep -E -o "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b" file.txt
- Match and extract all valid IP addresses from a given file: $ grep -E -o '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)' file.txt
The -o option is used to print only matches.
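Here is a sketch of both extractions on an invented two-line text file. It assumes GNU grep (the \b word boundary is a GNU extension), and the octet variable is just a helper introduced here to avoid repeating the same group four times.

```shell
# Invented sample text with two email addresses and two IPv4 addresses.
text=$(mktemp)
printf '%s\n' \
  'Contact alice@example.com or bob@example.org.' \
  'Gateway is 192.168.1.1, DNS is 10.0.0.53.' > "$text"

# -o prints each match on its own line instead of the whole input line.
emails=$(grep -E -o '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b' "$text")
echo "$emails"

# One octet (0-255), reused four times with literal dots in between.
octet='(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
ips=$(grep -E -o "$octet\.$octet\.$octet\.$octet" "$text")
echo "$ips"

rm -f "$text"
```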
Print lines before a match
To print a certain number of lines before each match, use the -B (or --before-context) option. For example, to display 5 lines of leading context before the matching lines, you can use the following command: $ grep -B 5 root /etc/passwd
Print lines after a match
To print a specific number of lines after each match, use the -A (or --after-context) option. For example, to display 5 lines of trailing context after the matching lines, you can use the following command: $ grep -A 5 root /etc/passwd
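An easy way to keep the two options apart is B for "before" and A for "after". The sketch below uses an invented five-line file with a single matching line in the middle:

```shell
# Invented five-line file; only the third line matches.
file=$(mktemp)
printf '%s\n' one two MATCH four five > "$file"

# -B 1: the matching line plus one line of context before it.
before=$(grep -B 1 MATCH "$file")
echo "$before"

# -A 2: the matching line plus two lines of context after it.
after=$(grep -A 2 MATCH "$file")
echo "$after"

rm -f "$file"
```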
That is all the information you need to make full use of these commands. If you already use Linux and have any advice for beginners, share it in the comments under this article.