Example 1 – Using a Combined Formula to Match a REGEX Pattern in Excel
REGEX will be: the total character length – 9, the first 3 – uppercase letters, the next 3 – numeric values, and the last 3 – lowercase letters.
Step 1: Creating Dynamic Named Ranges
- Go to Formulas>> Defined Names >> Name Manager.
In the Name Manager dialog box:
- Click New.
In the New Name wizard:
- Enter Letters in the Name box.
- Enter the following in Refers to.
="abcdefghijklmnopqrstuvwxyz"
- Click OK.
In the Name Manager dialog box:
- Click New.
In the New Name wizard:
- Enter Numbers in the Name box.
- Enter the following in Refers to.
="1234567890"
- Click OK.
Go back to the Name Manager wizard, and click Close.
Step 2 – Using a Formula
- Enter the following formula in C5.
=AND(LEN(B5)=9, COUNT(FIND(MID(LEFT(B5,3), ROW(INDIRECT("1:"&LEN(LEFT(B5,3)))),1), UPPER(Letters)))=LEN(LEFT(B5,3)), COUNT(FIND(MID(MID(B5,4,3), ROW(INDIRECT("1:"&LEN(MID(B5,4,3)))),1), Numbers))=LEN(MID(B5,4,3)), ISNUMBER(FIND(RIGHT(B5), Letters)))
Formula Breakdown
- COUNT(FIND(MID(LEFT(B5,3), ROW(INDIRECT(“1:”&LEN(LEFT(B5,3)))),1), UPPER(Letters)))=LEN(LEFT(B5,3)) → checks if the 3 first characters are uppercase letters.
- COUNT(FIND(MID(MID(B5,4,3), ROW(INDIRECT(“1:”&LEN(MID(B5,4,3)))),1), Numbers))=LEN(MID(B5,4,3)) → checks if the 3 middle characters are numeric digits
- ISNUMBER(FIND(RIGHT(B5), Letters)) → checks if the 3 last characters are lowercase letters.
- LEN(B5)=9 → checks if the total character length is 9
- The AND function will return TRUE if all these 4 conditions are fulfilled.
- Drag down the Fill Handle.
You will get TRUE when the pattern matches REGEX, otherwise FALSE.
Read More: How to Perform Pattern Matching in Excel
Example 2 – Getting Results Through Message Boxes for Different Patterns
Step 1: Launching the VBA Editor
- Go to the Developer tab >> Visual Basic.
In the Visual Basic Editor :
- Go to the Insert tab >> Module.
A new module will be displayed: Module 1.
- Go to Tools >> References.
In the References-VBAProject window:
- Click Microsoft VBScript Regular Expressions 5.5.
- Click OK.
It will enable the regEx object.
Step 2: Entering the Code in the Module
- Enter the following code.
Sub match_pat_1()
Dim char_form, char_renew, char_data As String
Dim regEx As New RegExp
char_form = "^[A-Z]{1,4}"
char_renew = ""
If char_form <> "" Then
char_data = "ABCD6758"
With regEx
.IgnoreCase = False
.Pattern = char_form
End With
If regEx.Test(char_data) Then
MsgBox (regEx.Replace(char_data, char_renew))
Else
MsgBox ("Can't find match")
End If
End If
End Sub
Code Breakdown
- char_form, char_renew, char_data are declared as String, and regEx as New RegExp.
- The regular expression pattern “^[A-Z]{1,4}” was assigned to the char_form variable: the first 4 letters should be uppercase. char_renew is assigned to blank.
- The IF statement created the next codes for non-blank characters.
- The input data “ABCD6758” is assigned to the char_data
- the WITH statement characterizes the components of the regEx
- Test(char_data) searches for the given pattern. If there is a match, TRUE is returned. It executes the following line with the REPLACE function which replaces the first 4 characters with a blank.
- If FALSE is returned, the MsgBox displays “Can’t find match”.
- Press F5.
The message box will return 6758.
- You can also use the following code if you want to extract the last 4 characters as letters.
Sub match_pat_1()
Dim char_form, char_renew, char_data As String
Dim regEx As New RegExp
char_form = "^[0-9]{1,4}"
char_renew = ""
If char_form <> "" Then
char_data = "6758ABCE"
With regEx
.IgnoreCase = False
.Pattern = char_form
End With
If regEx.Test(char_data) Then
MsgBox (regEx.Replace(char_data, char_renew))
Else
MsgBox ("Can't find match")
End If
End If
End Sub
Here, “^[0-9]{1,4}” is the pattern – the first 4 characters should be numeric data. “6758ABCE” is the input data.
- After pressing F5, you will get the result as ABCE.
- To extract the 6 last characters including texts and numbers, use the following code.
Sub match_pat_1()
Dim char_form, char_renew, char_data As String
Dim regEx As New RegExp
char_form = "^[0-9]{1,2}"
char_renew = ""
If char_form <> "" Then
char_data = "6758ABCE"
With regEx
.IgnoreCase = False
.Pattern = char_form
End With
If regEx.Test(char_data) Then
MsgBox (regEx.Replace(char_data, char_renew))
Else
MsgBox ("Can't find match")
End If
End If
End Sub
“^[0-9]{1,2}” is the pattern – the first 2 characters should be numeric data. “6758ABCE” is the input data.
- After pressing F5, you will get the result- 58ABCE.
Read More: How to Find and Replace RegEx Patterns in Excel
Example 3 – Creating a VBA Function to Match REGEX and Extract a Portion in Excel
To extract the characters after the first 4 letters:
Steps:
- Follow Step 1 in Example 2.
- Enter the following code in the module.
Function match_pat(val_rng As Range) As String
Dim char_form, char_renew, char_data As String
Dim regEx As New RegExp
char_form = "^[A-Za-z]{1,4}"
char_renew = ""
If char_form <> "" Then
char_data = val_rng.Value
With regEx
.IgnoreCase = False
.Pattern = char_form
End With
If regEx.Test(char_data) Then
match_pat = regEx.Replace(char_data, char_renew)
Else
match_pat = " "
End If
End If
End Function
Code Breakdown
- match_pat, declares val_rng as Range and the output of this function will be a string.
- char_form, char_renew, char_data is declared as String, and regEx as New RegExp.
- The regular expression pattern “^[A-Za-z]{1,4}” is assigned to the char_form variable: the first 4 letters should be lowercase or uppercase letters. char_renew is assigned to blank.
- The IF statement created the next codes for non-blank characters.
- The input data range val_rng is assigned to the char_data
- the WITH statement characterizes the components of the regEx
- Test(char_data) searches for the given pattern. If there is a match, TRUE is returned. It executes the following line with the REPLACE function which replaces the first 4 characters with a blank.
- If FALSE is returned, a blank will be displayed.
Go back to the main worksheet.
- Enter the following formula in C5.
=match_pat(B5)
B5 is the input data, and match_pat will return the characters after the first 4 letters.
- Drag down the Fill Handle.
You will get the following result in the Extracted Portion column.
Read More: How to Filter Using RegEx in Excel
Example 4 – Extracting a Definite Pattern after Matching REGEX for a Loop
Steps:
- Follow Step 1 in Example 2.
- Enter the following code in the module.
Sub match_pat_2()
Dim char_form, char_renew, char_data As String
Dim regEx As New RegExp
Dim val_rng As Range
Dim Val As Variant
Set val_rng = Range("B5:B12")
char_form = "^[A-Za-z]{1,4}"
char_renew = ""
For Each Val In val_rng
If char_form <> "" Then
char_data = Val.Value
With regEx
.IgnoreCase = False
.Pattern = char_form
End With
If regEx.Test(char_data) Then
Val.Offset(0, 1).Value = regEx.Replace(char_data, char_renew)
Else
Val.Offset(0, 1).Value = " "
End If
End If
Next
End Sub
Code Breakdown
- char_form, char_renew, char_data are declared as String, and regEx as New RegExp.
- B5:B12 is assigned to a variable val_rng.
- The regular expression pattern “^[A-Za-z]{1,4}” is assigned to the char_form variable- the first 4 letters should be lowercase or uppercase letters. char_renew is assigned to blank.
- the FOR NEXT Loop goes through each cell in val_rng.
- the IF statement creates the next codes for non-blank characters.
- The input data range val_rng is assigned to the char_data
- the WITH statement characterizes the components of the regEx
- Test(char_data) will search for the given pattern. If there is a match, TRUE is returned. It executes the following line with the REPLACE function which replaces the first 4 characters with a blank.
- If FALSE is returned, a blank will be displayed.
- Press F5.
You will get the output in the Extracted Portion column.
Example 5. Splitting Values Based on Patterns in Excel
The values of the Pattern column were split into 3 parts: the first portion will have the first 4 digits, the second portion will contain the next 2 letters, and the last part will contain the last 4 digits.
Steps:
- Follow Step 1 in Example 2.
- Enter the following code in the module.
Sub match_pat_3()
Dim char_form, char_data As String
Dim regEx As New RegExp
Dim val_rng As Range
Dim Val As Variant
Set val_rng = Range("B5:B12")
For Each Val In val_rng
char_form = "([0-9]{4})([a-zA-Z]{2})([0-9]{4})"
If char_form <> "" Then
char_data = Val.Value
With regEx
.IgnoreCase = False
.Pattern = char_form
End With
If regEx.Test(char_data) Then
Val.Offset(0, 1).Value = regEx.Replace(char_data, "$1")
Val.Offset(0, 2).Value = regEx.Replace(char_data, "$2")
Val.Offset(0, 3).Value = regEx.Replace(char_data, "$3")
Else
Val.Offset(0, 1).Value = " "
End If
End If
Next
End Sub
Code Breakdown
- char_form, char_renew, char_data are declared as String, and regEx as New RegExp.
- B5:B12 is assigned to a variable val_rng.
- the FOR NEXT loop goes through each cell in val_rng.
- The regular expression pattern “([0-9]{4})([a-zA-Z]{2})([0-9]{4})” is assigned to the char_form variable: the first 4 characters should be numeric digits, the next 2 characters should be lowercase or uppercase letters, and the last 4 digits should be numeric values.
- the IF statement creates the next codes for non-blank characters.
- The input data range val_rng is assigned to the char_data
- the WITH statement characterizes the components of the regEx
- Test(char_data) will search for the given pattern. If there is a match, TRUE is returned. The characters are split into the next 3 columns.
- If FALSE is returned, a blank will be displayed.
- Press F5.
This is the output.
Example 6 – Checking REGEX Patterns Through a User-Defined Function in Excel
Steps:
- Follow Step 1 in Example 2.
- Enter the following code in the module.
Function matchP(val_rng As Range, char_form As String) As Variant
Dim storeV() As Variant
Dim limit_1, limit_2, R_count, C_count As Long
On Error GoTo handleER
matchP = storeV
Set regEx = CreateObject("VBScript.RegExp")
With regEx
.IgnoreCase = False
.pattern = char_form
End With
R_count = val_rng.Rows.Count
C_count = val_rng.Columns.Count
ReDim storeV(1 To R_count, 1 To C_count)
For limit_1 = 1 To R_count
For limit_2 = 1 To C_count
storeV(limit_1, limit_2) = regEx.Test(val_rng.Cells(limit_1, limit_2).Value)
Next
Next
matchP = storeV
Exit Function
handleER:
matchP = CVErr(xlErrValue)
End Function
Code Breakdown
- matchP declares val_rng as Range, char_form as String and the output of this function will be a Variant.
- storeV() is declared as Variant, and limit_1, limit_2, R_count, C_count as Long.
- The array storeV is assigned to the function.
- The object regEx is declared, and the WITH statement characterizes the components of the regEx
- R_count, and C_count count the total number of rows and columns in the input range. The dynamic array storeV() up to is resized.
- Two FOR loops test the values from the input range to the number of total rows and total columns. The result is stored in the storeV() array, which will give the output.
6.1. Checking Combinations of Numbers and Texts
Steps:
- Enter the following formula in C5.
=matchP(B5,"\D{4}\d{4}")
\D{4}\d{4} is the pattern. D{4} represents non-digit characters, and their position will be the first 4 positions. d{4} represents any digit in the last 4 positions.
- Drag down the Fill Handle.
This is the output.
- You can also use the following formula.
=matchP(B5,"\w{8}")
The pattern w{8} represents any alphanumeric character up to 8 positions.
6.2. Checking Phone Number Patterns
Steps:
- Enter the following formula.
=matchP(B5,"(\(\d{3}\)|\d{3})[-\.\s]?\d{3}[-\.\s]?\d{4}")
Formula Breakdown
- \(\d{3}\) → checks if the first 3 digits are inside the bracket
- |\d{3} → indicates another option: the first 3 digits are without bracket.
- [-\.\s]? → ? indicates that –, ., or space characters will appear 0 times or a single time.
- d{3} is the next group of 3 digits, and d{4} is the group of the last 4.
6.3. Checking Email Patterns
Steps:
- Enter the following formula.
=matchP(B5,"[\w\.\-]+@[A-Za-z0-9]+[A-Za-z0-9\.\-]*[A-Za-z0-9]+")
Pattern Breakdown
- [\w\.\-]+ → is the first portion of an email id which may contain a name or some digits or . or a –
- @[A-Za-z0-9]+[A-Za-z0-9\.\-]*[A-Za-z0-9]+ represents the portion after @, which is the domain name. There may be different uppercase or lowercase letters with numbers, hyphens, dots, etc. Here, + and * represent multiple occurrences.
Practice Section
Practice here.
Download Practice Workbook
Related Articles
- How to Use REGEX without VBA in Excel
- How to Find & Replace Text Using Regex in Excel
- How to Count Regex with COUNTIF in Excel
- Data Validation with RegEx in Excel
- How to Find RegEx Patterns in Excel
<< Go Back to RegEx in Excel | String Manipulation | Learn Excel
Get FREE Advanced Excel Exercises with Solutions!