Hello. In this video I would like to speak about regular expressions in general.
A lot of people asked me to record a video where I explain the basic concepts of regular expressions and I decided to record one.
Actually, there are a lot of regular expression resources on the Web.
In my opinion, one of the most important regular expression resources on the Web is the http://www.regular-expressions.info website.
It provides explanations for all basic regular expression constructs.
Let's have a look at those basic regex constructs now.
A regular expression is some text pattern that can be used to search for specific strings.
It can either validate a string, or it can be used to extract a part of a string from a longer string.
It can be used to find and replace parts of a string in longer strings.
It can also be used to split longer strings into smaller chunks.
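To make these four uses concrete, here is a minimal sketch in Python with its standard `re` module. The sample strings and the date pattern are made up for illustration and are not taken from the video.

```python
import re

# Hypothetical sample text, used only for illustration.
text = "Order 66 shipped on 2024-05-17, order 67 on 2024-06-01."
date = r"\d{4}-\d{2}-\d{2}"

# Validate: does the whole string look like a date?
is_date = re.fullmatch(date, "2024-05-17") is not None

# Extract: pull every date out of the longer string.
dates = re.findall(date, text)

# Find and replace: mask every date.
masked = re.sub(date, "<date>", text)

# Split: break a string on commas or semicolons plus optional spaces.
parts = re.split(r"[,;]\s*", "a, b;c,  d")

print(is_date, dates, masked, parts, sep="\n")
```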
A regular expression can contain many kinds of constructs: regex escapes, quantifiers, character classes, groups, and lookarounds.
Regex escapes are any special sequences of a literal backslash and a character.
For example, zero-width assertions like word boundaries or anchors.
Backreferences are also an example of regex escapes.
They can be numbered or named.
Escaped special characters, hexadecimal escapes, shorthand character classes, Unicode code points, and so on are also regex escapes.
These topics are very broad, and we can devote a separate video to each of them.
At this point, I would like to briefly show the use of these constructs.
For example, word boundaries can be used to match a whole word like "REGEX".
So only whole words are matched.
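As a rough sketch of the whole-word idea, the word boundary escape `\b` can be placed on both sides of the word. The exact pattern shown on screen is not in the transcript, so the pattern and the sample sentence below are assumptions.

```python
import re

# \b asserts a position between a word character and a non-word character.
pattern = re.compile(r"\bREGEX\b")

# Hypothetical sample text: only the standalone word should match.
whole = pattern.findall("REGEX is fun; REGEXP and MYREGEX are other words.")

# "REGEXP" alone does not match, because "P" follows "REGEX" with no boundary.
longer_word = pattern.search("REGEXP")

print(whole, longer_word)
```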
So if the text contains "REGEXP", it won't be matched by this regular expression. Anchors match line or string boundaries.
In this case, they match the line boundaries because of this "m" flag here.
So for example, I want to match any line that contains a "META" substring in it.
So this is it.
Well, we don't actually need any anchors, but they don't do any harm here.
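Here is a small Python sketch of the "META" line example. The pattern `^.*META.*$` is an assumption about what was shown on screen, and `re.M` is Python's name for the "m" (multiline) flag.

```python
import re

# Hypothetical multi-line sample text.
text = "first line\nsome META data\nlast line\nMETA again"

# With the multiline flag, ^ and $ match at the start and end of every line.
with_anchors = re.findall(r"^.*META.*$", text, flags=re.M)

# Without anchors the result is the same, because "." does not cross line breaks.
without_anchors = re.findall(r".*META.*", text)

# Without the flag, ^ only matches at the very start of the string.
no_flag = re.findall(r"^.*META.*$", text)

print(with_anchors, without_anchors, no_flag)
```

This also shows why the anchors are optional here: the greedy `.*` on both sides already stretches the match across the whole line.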
Next come special characters.
For example, if we want to match a literal dot, we must use a backslash followed by a dot.
Otherwise a dot matches any character other than a line break character.
So, yes, like this.
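The difference between an escaped and an unescaped dot can be sketched in Python like this. The sample string is made up for illustration.

```python
import re

# Hypothetical sample: a real dot, a letter, and a line break between "3" and "14".
text = "3.14 3x14 3\n14"

# Escaped dot: matches only a literal ".".
literal_dot = re.findall(r"3\.14", text)

# Unescaped dot: matches any character except a line break.
any_char = re.findall(r"3.14", text)

print(literal_dot, any_char)
```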
Now if we talk about hexadecimal escapes, we can use them like this.
For example, \x20 matches a space and \x0A matches a line feed, that is, a line break character.
You can see here.
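A minimal Python sketch of the two hexadecimal escapes, assuming a made-up sample string:

```python
import re

# Hypothetical sample text with two spaces and one line feed.
text = "two words\nnext line"

# \x20 is the hexadecimal escape for a space (code 32).
spaces = re.findall(r"\x20", text)

# \x0A is the hexadecimal escape for a line feed (code 10).
line_feeds = re.findall(r"\x0A", text)

print(len(spaces), len(line_feeds))
```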
Shorthand character classes are very well-known constructs like "\d" to match a digit, "\w" to match any word character, or "\s" to match a whitespace character.
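The three shorthand classes can be tried out in Python as follows. The sample string is hypothetical.

```python
import re

# Hypothetical sample text with digits, an underscore, spaces, and a tab.
text = "Room 42, floor_3\tEast"

digits = re.findall(r"\d", text)      # single digits
words = re.findall(r"\w+", text)      # runs of letters, digits, or underscores
whitespace = re.findall(r"\s", text)  # spaces, tabs, line breaks

print(digits, words, whitespace)
```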
The usage of Unicode code points in regular expressions depends on the regex flavor, and if we select PCRE here, we might use this pattern to match a Polish letter "ł".
This regex escape will match an emoji in the Python regex flavor. Here.
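The exact patterns from the screen are not in the transcript, so the following is only a sketch of how code point escapes look in Python. Python's `re` accepts `\uXXXX` and `\UXXXXXXXX` in string patterns; PCRE uses a different notation, which is not shown here. The chosen emoji (U+1F600) is an assumption.

```python
import re

# Hypothetical sample text containing "ł" (U+0142) and an emoji (U+1F600).
text = "złoty 😀"

# \u takes exactly four hex digits.
l_stroke = re.findall(r"\u0142", text)

# \U takes exactly eight hex digits, which is needed for code points above U+FFFF.
emoji = re.findall(r"\U0001F600", text)

print(l_stroke, emoji)
```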
The backreferences are special constructs that let us refer to the part of the regular expression that was captured.
Capturing is done with capturing groups.
For example, we can use a backreference inside a pattern.
For example, when we capture "abc" in a group and then use "\1", we want to match this pattern.
Actually, this is not a good example because "abc" is a literal.
If we do not use a literal it becomes a lot more interesting.
For example, this pattern matches any digit and then the same digit right after it. Like here, we match a zero and another zero.
Here, we do not match zero and one because one is not zero.
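The "same digit twice" idea can be sketched in Python with a capturing group and the backreference `\1`. The pattern `(\d)\1` and the sample string are assumptions based on the description above.

```python
import re

# (\d) captures one digit; \1 must match exactly the same text again.
pattern = re.compile(r"(\d)\1")

# Hypothetical sample text: "00", "22" and "77" are doubled digits, "01" is not.
doubled = [m.group(0) for m in pattern.finditer("100 and 01 and 2277")]

print(doubled)
```

`finditer` is used instead of `findall` because `findall` would return only the captured digit, not the whole two-digit match.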
Groups can be capturing (those that can be referred to using backreferences) or non-capturing (those that are used to group several patterns with an OR "|" operator, or when a whole sequence of patterns needs to be quantified, that is, repeated).
Capturing groups can be numbered when they are used like this or named when they are named this way.
Capturing groups are defined with a pair of unescaped parentheses. Non-capturing groups are also defined with a pair of parentheses, but the opening parenthesis is followed directly by a question mark and a colon.
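Here is a short Python sketch of the three kinds of groups. The named group syntax `(?P<name>...)` is the Python spelling and may differ from the one shown in the video; the sample strings are made up.

```python
import re

# Numbered capturing groups: referred to by position.
m = re.search(r"(\d{4})-(\d{2})", "Released 2024-05")
numbered = (m.group(1), m.group(2))

# Named capturing groups: referred to by name.
m = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "Released 2024-05")
named = m.groupdict()

# Non-capturing group: groups an alternation so the whole thing can be repeated.
m = re.fullmatch(r"(?:ab|cd)+", "abcdab")
non_capturing = (m.group(0), m.groups())

print(numbered, named, non_capturing)
```

The empty tuple returned by `m.groups()` in the last case shows that the non-capturing group stored nothing.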
We'll discuss more basic regex constructs in the next video.
If you liked my video, please click "Like" and subscribe to my channel if you haven't done it yet.
Thank you for watching and happy regexing.