Advanced Bash-Scripting HOWTO: A guide to shell scripting, using Bash
Chapter 3. Tutorial / Reference
In order to fully utilize the power of shell scripting, you need to master regular expressions.
An expression is a set of characters that has an interpretation above and beyond its literal meaning. A quote symbol ("), for example, may denote speech by a person, ditto, or a meta-meaning for the symbols that follow. Regular expressions are sets of characters that UNIX endows with special features.
The main uses for regular expressions (REs) are text searches and string manipulation. An RE matches a single character or a set of characters.
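The asterisk * matches any number of repetitions of the preceding RE, including zero. (This differs from the shell's filename wildcard *, which by itself matches any string of characters.)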
The dot . matches any one character, except a newline.
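The question mark ? matches zero or one of the previous RE. It is generally used for matching single characters. It belongs to the extended RE set (grep -E, egrep, awk); the basic REs used by grep and sed by default do not treat a bare ? as special.
The plus + matches one or more of the previous RE. It serves a role similar to the *, but does not match zero occurrences. Like ?, it belongs to the extended RE set.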
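The four operators above can be compared side by side with a short sketch. The helper name match_words and the four sample words are made up for this illustration; grep -E selects extended REs (needed for ? and +), and -x requires the whole line to match.

```shell
#!/bin/bash
# Hypothetical helper: print the sample words that match the RE in $1,
# joined on one line. The word list is illustrative only.
match_words() {
    printf '%s\n' ct cat caat cut | grep -E -x "$1" | paste -sd ' ' -
}

match_words 'ca*t'   # c, zero or more a's, t  -> ct cat caat
match_words 'c.t'    # c, any one character, t -> cat cut
match_words 'ca?t'   # c, zero or one a, t     -> ct cat
match_words 'ca+t'   # c, one or more a's, t   -> cat caat
```

Note that ca*t matches ct, since * accepts zero occurrences of the preceding a, while ca+t does not.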
The caret ^ matches the beginning of a line. As the first character inside brackets, however, it negates the set of characters that follows.
The dollar sign $ at the end of an RE matches the end of a line.
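As a rough illustration of the two anchors, the sketch below counts matching lines with grep -c. The function name count_lines and the three input lines are assumptions chosen for the demonstration.

```shell
#!/bin/bash
# Hypothetical helper: count how many of three sample lines match the RE in $1.
count_lines() {
    printf '%s\n' 'error: disk full' 'no error' 'error' | grep -c "$1"
}

count_lines 'error'     # 3: the word appears somewhere on every line
count_lines '^error'    # 2: only lines that begin with it
count_lines 'error$'    # 2: only lines that end with it
count_lines '^error$'   # 1: only the line that is exactly "error"
```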
Brackets [...] enclose a set of characters to match in a single RE.
[xyz] matches the characters x, y, or z.
[c-n] matches any of the characters in the range c to n.
[^b-d] matches any one character except those in the range b to d. This is an instance of ^ negating or inverting the meaning of the following RE (taking on a role similar to ! in a different context).
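The three bracket forms above can be tried on single-character lines, as in the following sketch. The helper filter_chars and its input letters are hypothetical. LC_ALL=C is set because the meaning of a range such as c-n can depend on the locale's collating order.

```shell
#!/bin/bash
# Hypothetical helper: print the sample characters that match the RE in $1.
filter_chars() {
    printf '%s\n' a c h n x z | LC_ALL=C grep "$1" | paste -sd ' ' -
}

filter_chars '[xyz]'    # x z
filter_chars '[c-n]'    # c h n
filter_chars '[^b-d]'   # a h n x z  (everything except b, c, d)
```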
The backslash \ escapes a special character, which means that character gets interpreted literally.
An escaped \$ reverts to its literal meaning of "dollar sign", rather than its RE meaning of end-of-line.
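A small sketch of the difference escaping makes, using grep -c to count matching lines. The helper count_re and the four sample lines are invented for the example. The REs are single-quoted so that the shell passes the backslash and dollar sign through to grep untouched.

```shell
#!/bin/bash
# Hypothetical helper: count how many of four sample lines match the RE in $1.
count_re() {
    printf '%s\n' 'cost: $5' 'cost: 5' 'a.b' 'axb' | grep -c "$1"
}

count_re '$'      # 4: unescaped, $ is end-of-line, which every line has
count_re '\$'     # 1: escaped, it matches only a literal dollar sign
count_re 'a.b'    # 2: unescaped, the dot matches any one character
count_re 'a\.b'   # 1: escaped, it matches only a literal dot
```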
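Escaped "curly brackets" \{ \} indicate the number of occurrences of the preceding RE to match.
In basic REs, as used by grep and sed by default, the curly brackets must be escaped, since unescaped they stand for themselves as literal characters. In extended REs (grep -E, egrep), they are written unescaped.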
[0-9]\{5\} matches exactly five digits (characters in the range of 0 to 9).
Note: Curly brackets are not available as an RE in traditional versions of awk. POSIX awk and recent releases of gawk do support them.
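A minimal sketch of the five-digit RE in use, in both basic and extended syntax. The function names is_five_digits and is_five_digits_ere are hypothetical. Without anchoring, [0-9]\{5\} would also find five digits inside a longer run of digits, so grep -x is used here to require that the whole line match.

```shell
#!/bin/bash
# Hypothetical helpers: succeed only if $1 consists of exactly five digits.
is_five_digits() {
    # Basic RE: the curly brackets are escaped.
    printf '%s\n' "$1" | grep -q -x '[0-9]\{5\}'
}
is_five_digits_ere() {
    # Extended RE (grep -E): the curly brackets are written unescaped.
    printf '%s\n' "$1" | grep -E -q -x '[0-9]{5}'
}

for s in 12345 1234 123456 12a45; do
    if is_five_digits "$s"; then
        echo "$s: match"
    else
        echo "$s: no match"
    fi
done
```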
"Sed & Awk", by Dougherty and Robbins (see Bibliography) gives a very complete and lucid treatment of REs.
Sed, awk, and Perl, used as filters in scripts, take REs as arguments when "sifting" or transforming files or I/O streams.
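As a sketch of such filters, the two functions below read standard input and write standard output, so they can sit anywhere in a pipeline. The names mask_digits and drop_comments, and the sample input, are assumptions made for this example.

```shell
#!/bin/bash
# Hypothetical sed filter: replace every run of one or more digits with '#'.
mask_digits() {
    sed 's/[0-9][0-9]*/#/g'
}

# Hypothetical awk filter: print only the lines that do not begin with '#'.
drop_comments() {
    awk '!/^#/'
}

printf '%s\n' '# settings' 'port 8080 and 22' | drop_comments | mask_digits
# prints: port # and #
```

The sed RE spells "one or more digits" as [0-9][0-9]* because + is not part of the basic RE set that sed uses by default.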