JavaScript Regex Tutorial with Examples
In this post you will learn what regular expressions are and how to use them in JavaScript.
What is this?
The first question is: what are regular expressions?
A regular expression is a sequence of characters that specifies a search pattern in text.
I know this sounds unclear and a bit scary to hear, but here is an example.
As you can see, at the top we have a sequence of symbols. You don't know what these symbols do yet, but they are what finds all the highlighted words in the text at the bottom. This is exactly what regular expressions allow us to do.
We write a pattern, and it finds the matching pieces of the text for us.
What you see on the screen is the website https://regexr.com, and this is the best website to test your regular expressions and understand them.
How does it work in JS?
Now let's talk about the JavaScript world. How can we create regular expressions inside JavaScript?
const regexr = new RegExp('abc')
Here is how we create a regular expression inside JavaScript.
It finds the sequence abc in our text. If we don't have abc inside the text it won't find anything.
Another way to create a regular expression is to write it between 2 slashes.
const regexr = /abc/
And actually this is exactly the same as the previous code.
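As a quick check of the two forms above, here is a minimal sketch. The test method and the sample strings are not part of the original example; test is a standard RegExp method that reports whether the pattern occurs anywhere in a string.

```javascript
// Two equivalent ways to build the same pattern.
const fromConstructor = new RegExp('abc')
const fromLiteral = /abc/

// test() returns true when the pattern occurs anywhere in the string.
console.log(fromConstructor.test('xxabcxx')) // true
console.log(fromLiteral.test('xxabcxx')) // true
console.log(fromLiteral.test('ab c')) // false

// Both objects carry the same pattern source.
console.log(fromConstructor.source === fromLiteral.source) // true
```

The constructor form is mostly useful when the pattern is only known at runtime, because it accepts a string.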
Super important
Before we dive deeper into regular expressions I must tell you the most important point. If you can solve your task without regular expressions, go for it. Your regular expressions will never be perfect. There are always some edge cases that you can't cover or that take too much time to cover.
Searching for something by pattern in text is a difficult task, and there are lots of edge cases that we can run into.
const paragraph = document.querySelector('p')
paragraph.innerHTML = paragraph.innerHTML.split('.').join('.</p><p>') + '</p>'
This is why, if you can solve something without regular expressions, for example by splitting your string into words and then selecting what you need, it will be much easier and much more stable than using regular expressions.
Now let's look at the 3 most popular functions for using regular expressions in JavaScript.
Replace
The first function is replace. Sometimes you need to replace some symbols inside a string.
const result = 'Foo bar baz foo'.replace(/foo/, 'test')
console.log(result)
Here we used the replace function, but instead of passing a string to search for we passed a pattern. And actually we want to replace all occurrences of foo in the text.
Foo bar baz test
As you can see, foo was replaced with test but Foo was not.
By default everything in a regular expression is case sensitive.
We must add a flag to ignore case. This is the letter i after our regular expression.
const result = 'Foo bar baz foo'.replace(/foo/i, 'test')
console.log(result)
test bar baz foo
As you can see, the result is still not what we want. Now it replaced the first Foo but not the second. This happens because by default it looks for just one occurrence, and we must add the g flag to make the search global.
const result = 'Foo bar baz foo'.replace(/foo/ig, 'test')
console.log(result)
test bar baz test
Now we get the correct result, with every occurrence of foo replaced.
Split
Another function that you will use for sure is split, and we can pass a regular expression to it.
const result = 'Foo bar foo baz'.split(/foo/ig)
console.log(result)
Here we split our string using foo as a separator. Here is what we got.
['', ' bar ', ' baz']
So if you want to convert your text to an array of elements and the separator needs to be a pattern, split with a regular expression inside is the way to go.
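To illustrate when a pattern is handier than a plain string separator, here is a small sketch. The sentence and the variable names are made up for illustration; the point is that the i flag lets a single separator cover both AND and and.

```javascript
const shoppingList = 'bread AND butter and jam'

// A plain string separator is case sensitive, so only the lowercase " and " splits.
const byString = shoppingList.split(' and ')
console.log(byString) // ['bread AND butter', 'jam']

// A regular expression with the i flag treats " AND " and " and " the same.
const byPattern = shoppingList.split(/ and /i)
console.log(byPattern) // ['bread', 'butter', 'jam']
```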
Match
And the last function that you will use a lot is called match.
const result = 'Foo bar foo baz'.match(/foo/ig)
console.log(result)
This is our result
['Foo', 'foo']
Here we got all the matches that we were trying to find.
With the g flag, match finds all the matches inside a string.
Essentially, when we use match on a lot of text, all the occurrences that you see highlighted on the website will be written into this array.
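It may help to see how match behaves in the cases the example above does not cover. The following is a small sketch with made-up variable names: it compares the result with and without the g flag, and shows what comes back when nothing matches.

```javascript
const matchText = 'Foo bar foo baz'

// With the g flag: an array of every matched substring.
const allMatches = matchText.match(/foo/ig)
console.log(allMatches) // ['Foo', 'foo']

// Without the g flag: only the first match, plus details such as its position.
const firstMatch = matchText.match(/foo/i)
console.log(firstMatch[0]) // 'Foo'
console.log(firstMatch.index) // 0

// No match at all: match() returns null, not an empty array.
const noMatch = matchText.match(/qux/g)
console.log(noMatch) // null
```

Because of the null case, it is usually worth checking the result before reading its length or elements.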
So now you know how to create a regular expression inside JavaScript and what functions you can use for different cases. These are replace, split and match.
Now it is time to learn some symbols that we can use inside regular expressions. There are lots of them, but you can start with 10 or so, which will be enough for most searches. I will show all of them on the regexr website because it is easier to follow there, but everything works the same in plain JavaScript.
Set
The first thing that I want to show you is a set. A set is just square brackets with something inside, for example [abc].
Previously we provided abc without brackets. This is the difference. In the previous example we were looking for the string abc, but now we look for any single symbol from the set. This is why even a separate letter a or b is also highlighted.
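The same difference can be reproduced in plain JavaScript. This is only a sketch with an invented sample string; the g flag is added so that match returns every occurrence.

```javascript
const setText = 'abc a b'

// Without brackets we look for the exact sequence abc.
console.log(setText.match(/abc/g)) // ['abc']

// With brackets every single a, b or c is a match on its own.
console.log(setText.match(/[abc]/g)) // ['a', 'b', 'c', 'a', 'b']
```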
Ranges
Inside a set we can also use ranges. Typically we have 2 different types of ranges: numeric and alphabetical.
Here we look for digits in the range from 1 to 4. Any digit like 1 or 2 is highlighted but not 5. This is a range inside our set.
A range can also be alphabetic.
As you can see, all lowercase letters from a to z are highlighted, but capital letters are not. If we want to include them as well we must write it like this.
[a-zA-Z]
These are 2 different ranges which we are looking for inside our set.
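Here is a short sketch of the three ranges in JavaScript. The sample string is an assumption chosen so that each range picks up something different.

```javascript
const rangeText = 'Ab3 x5Z'

// Digits from 1 to 4: 3 is inside the range, 5 is not.
console.log(rangeText.match(/[1-4]/g)) // ['3']

// Lowercase letters only.
console.log(rangeText.match(/[a-z]/g)) // ['b', 'x']

// Two ranges in one set: lowercase and capital letters.
console.log(rangeText.match(/[a-zA-Z]/g)) // ['A', 'b', 'x', 'Z']
```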
Negation
Another important thing is negation.
Here we put the ^ symbol at the beginning of the set to say that we want to negate everything inside this set. This is why all symbols except a, b or c are highlighted.
If you want to match everything except digits you can write the negation like this.
[^0-9]
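A minimal sketch of both negated sets in JavaScript, using an invented sample string:

```javascript
const negationText = 'ab1-c2'

// Everything except a, b or c.
console.log(negationText.match(/[^abc]/g)) // ['1', '-', '2']

// Everything except digits.
console.log(negationText.match(/[^0-9]/g)) // ['a', 'b', '-', 'c']
```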
Period
Now we have a period. The dot symbol matches any character except line breaks.
This is why we simply match every single symbol.
But here you are surely thinking "But what should we do if we are looking for the dot itself?". In this case we must escape this character with a backslash.
\.
Now it will find only dot symbols.
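The difference between the plain and the escaped dot can be sketched like this. The sample strings are made up; the second one contains a line break to show that the dot skips it.

```javascript
const versionText = 'v1.2'

// A plain dot matches every character, including the dot itself.
console.log(versionText.match(/./g)) // ['v', '1', '.', '2']

// An escaped dot matches only a real dot.
console.log(versionText.match(/\./g)) // ['.']

// A line break is the one thing the plain dot does not match.
console.log('a\nb'.match(/./g)) // ['a', 'b']
```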
Start and end of the string
Another important thing is to find the start and the end of the string.
As you can see, we find abc only at the beginning of the text, but not all the abc matches that come later.
It is also important to remember that the ^ symbol here is not negation, because it is not inside a set.
If we need to match the end of the string we use a dollar sign.
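To make the two anchors concrete, here is a small sketch. The sample string and the replacement letter X are assumptions used only to make the matched position visible.

```javascript
const anchorText = 'abc xyz abc'

// ^ ties the pattern to the start of the string, so only the first abc is replaced.
console.log(anchorText.replace(/^abc/, 'X')) // 'X xyz abc'

// $ ties the pattern to the end of the string, so only the last abc is replaced.
console.log(anchorText.replace(/abc$/, 'X')) // 'abc xyz X'

// Even with the g flag there is just one match, because only one abc sits at the start.
console.log(anchorText.match(/^abc/g)) // ['abc']

// Using both anchors means the whole string must be exactly abc.
console.log(/^abc$/.test('abc')) // true
console.log(/^abc$/.test(anchorText)) // false
```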
Plus & Star
If we need to get a sequence of digits we can write a plus symbol after the set.
It means that we look for this set occurring 1 or more times. This is why we find every digit and every sequence of digits. And we don't get them as separate symbols but as a single matched string.
If we use the star symbol instead of the plus, it will match zero or more elements of the set. We use it when something inside a string is optional.
abc[0-9]*
This regular expression will match abc even without any digits afterwards, but also any sequence like abc123.
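A short sketch of plus and star side by side, on an invented sample string:

```javascript
const quantifierText = 'abc abc123 7 42'

// Without a quantifier every digit is a separate match.
console.log(quantifierText.match(/[0-9]/g)) // ['1', '2', '3', '7', '4', '2']

// With + each run of digits becomes a single match.
console.log(quantifierText.match(/[0-9]+/g)) // ['123', '7', '42']

// With * the digits after abc are optional.
console.log(quantifierText.match(/abc[0-9]*/g)) // ['abc', 'abc123']
```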
Question mark
We also have the question mark, which says that the symbol before it is optional.
Here we have a question mark after c, which makes it optional. This is why abc is a valid match, but ab is also a valid match because c is optional.
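In JavaScript this could look like the following sketch. The pattern abc? is taken from the description above and the sample string is made up.

```javascript
const optionalText = 'abc ab a'

// c is optional, so both abc and ab match. A lone a does not, because b is still required.
console.log(optionalText.match(/abc?/g)) // ['abc', 'ab']
```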
Brackets
We use curly brackets if we need to define how many times something can occur.
Here we defined that a can occur 2 or 3 times in a row. This means a match will contain only that number of symbols, not fewer and not more.
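Here is a sketch of the a{2,3} pattern described above, on a made-up sample string. Note one subtlety: in a longer run such as aaaa the pattern still finds a match of 3 letters inside it, so anchors are needed if the whole string must have the given length.

```javascript
const repeatText = 'a aa aaa aaaa'

// A single a is too short. In aaaa only the first three letters are taken.
console.log(repeatText.match(/a{2,3}/g)) // ['aa', 'aaa', 'aaa']

// With ^ and $ the whole string must consist of 2 or 3 letters a.
console.log(/^a{2,3}$/.test('aaa')) // true
console.log(/^a{2,3}$/.test('aaaa')) // false
```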
Or
The next symbol is |, which works like the OR construction inside our code.
Here we look for either aa or bb. Use this symbol when you are looking for several different alternatives in the text.
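A minimal sketch of the aa|bb pattern in JavaScript, with an invented sample string:

```javascript
const alternativesText = 'aa ab bb'

// Either alternative is accepted. ab is neither aa nor bb, so it is skipped.
console.log(alternativesText.match(/aa|bb/g)) // ['aa', 'bb']
```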
Groups
And the last symbol here is the most important. It is called grouping.
Here we used round brackets to define a group. It doesn't change the pattern that we are looking for, but it allows us to get each group separately from a match later. This is crucial if you want to be more specific.
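To show what getting a group later can look like in JavaScript, here is a sketch. The date string and the pattern are hypothetical and chosen only for illustration; match without the g flag returns the whole match at index 0 and each group at the following indexes.

```javascript
// Hypothetical example: pull the year, month and day out of a date-like string.
const datePattern = /([0-9]+)-([0-9]+)-([0-9]+)/
const dateMatch = 'Released on 2024-05-17'.match(datePattern)

console.log(dateMatch[0]) // '2024-05-17' (the whole match)
console.log(dateMatch[1]) // '2024' (first group)
console.log(dateMatch[2]) // '05' (second group)
console.log(dateMatch[3]) // '17' (third group)

// Groups can also be referenced in replace as $1, $2 and $3.
const reordered = '2024-05-17'.replace(datePattern, '$3.$2.$1')
console.log(reordered) // '17.05.2024'
```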
I understand that this is a lot of symbols and you may not yet understand how to use them or even remember them. No worries, here are some examples so you can see in practice how we can use regular expressions.
Email validation
The first example is email validation. This is surely the most popular usage of regular expressions. It is also extremely difficult to implement, which is why no perfect regular expression exists for parsing emails.
Typically you don't need to parse emails at all. You only need validation to show the user that their email is not valid.
This is why our regular expression should check whether an email is valid in the simplest way possible.
We can split our logic into 3 sections separated by the @ symbol and a dot: the first section is any text, then we have the @ symbol, then any text again, then a dot and 2 or 3 symbols as a domain extension.
Let's write a regular expression for the first section
[a-z0-9]+@
We look for any letters and digits occurring 1 or more times. After that we must have an @ symbol.
[a-z0-9]+@[a-z0-9]+\.
After @ we have exactly the same sequence, but followed by a dot. Because we look for a real dot symbol we must escape it.
[a-z0-9]+@[a-z0-9]+\.[a-z]{2,3}
Here we added a domain extension at the end, which must be 2 or 3 lowercase letters in a row.
fgege@gwegwe.com
As you can see, this valid email is highlighted completely. If we provide an invalid email it won't be highlighted.
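Here is one way this pattern might be used in JavaScript. It is a sketch under a few assumptions that are not part of the pattern above: the ^ and $ anchors are added so the whole input must be an email, the i flag is added so capital letters are accepted, and all addresses except the first are made up.

```javascript
// The pattern from above, wrapped in ^ and $ with the i flag (both are additions).
const emailPattern = /^[a-z0-9]+@[a-z0-9]+\.[a-z]{2,3}$/i

console.log(emailPattern.test('fgege@gwegwe.com')) // true
console.log(emailPattern.test('fgege@gwegwe')) // false, no domain extension
console.log(emailPattern.test('fgege.gwegwe.com')) // false, no @ symbol

// A real address with a dot before @ is rejected. This is one of the edge cases
// that such a simple pattern does not cover.
console.log(emailPattern.test('john.doe@example.com')) // false
```

The last line is a reminder of the earlier warning: a simple pattern is fine for flagging obvious typos, but it should not be treated as a complete check.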
Phone number validation
Another popular example of regular expressions is phone number validation. Let's say that we want to validate a number like this.
111-111-1111
Again we have 3 sections. The first section must have 3 digits, then we have a dash, the second section has 3 digits, then a dash again and 4 digits.
Let's write the first section
[0-9]{3}-
We look for a sequence of 3 digits followed by a dash.
[0-9]{3}-[0-9]{3}-
And our second section is the same as the first one.
[0-9]{3}-[0-9]{3}-[0-9]{4}
Our last section is 4 digits in a row. As you can see our string is highlighted correctly now.
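In JavaScript the same check could be sketched as follows. The ^ and $ anchors are an addition to the pattern above, so that extra digits around the number are rejected, and the invalid numbers are made up.

```javascript
// The pattern from above, wrapped in ^ and $ (an addition) to validate the whole input.
const phonePattern = /^[0-9]{3}-[0-9]{3}-[0-9]{4}$/

console.log(phonePattern.test('111-111-1111')) // true
console.log(phonePattern.test('111-111-111')) // false, last section is too short
console.log(phonePattern.test('1111-111-1111')) // false, first section is too long
console.log(phonePattern.test('111 111 1111')) // false, spaces instead of dashes
```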
Writing regular expressions is not easy. But with time you will get better.