CS133JS Beginning Programming: JavaScript
| Topics by Week | |
|---|---|
| 1. Intro to JavaScript programming | 6. Arrays |
| 2. Functions, Operators and Expressions | 7. Objects and Object Constructors |
| 3. Conditional Statements: if and switch | 8. Web Page I/O |
| 4. Repetition: while, do while, and for | 9. Regular Expressions |
| 5. Midterm Check-Point | 10. Term Project and Review |
| 11. Final | |
Contents: Regular Expressions; Side-note on grep; How to Use RegExp; RegExp Methods (test, exec); Matching "wild card" Characters; Matching at the Beginning, Middle, or End of a String; Flags; Groups; Quantifiers; Escape Characters; Metacharacters; Choice (Logical OR); Examples; Resource; References
One way to compare strings to see if they match is to use a Regular Expression object. This object is part of the JavaScript language, and something similar exists in almost every other programming language.
grep
The command-line utility grep was added to the Unix operating system in 1973. Its name comes from the ed editor command g/re/p (globally search for a regular expression and print the matching lines). It is available in most modern operating systems:
Linux: It has always been distributed with the Linux operating system.
Mac OS: It is available in the Terminal.
Windows: PowerShell doesn't have grep, but it has the Select-String command, which uses regular expressions for searching.
RegExp
RegExp – Regular expression object. Used for pattern matching in strings. A JavaScript RegExp object can be created two ways:
Defined with forward slashes: const pattern1 = /matchThis/;
Or by using the new operator: const pattern2 = new RegExp("matchThisToo");
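Both forms build the same kind of object. The sketch below is only an illustration: searchWord is a hypothetical variable standing in for a value that is not known until the program runs, which is a common reason to use the constructor form.

```javascript
// Two equivalent ways to create a RegExp object.
const literalPattern = /matchThis/;
const constructedPattern = new RegExp("matchThis");

// Both objects hold the same pattern text.
console.log(literalPattern.source);     // matchThis
console.log(constructedPattern.source); // matchThis

// The constructor is handy when part of the pattern comes from a variable.
// searchWord is a made-up value, e.g. something a user typed.
const searchWord = "Too";
const dynamicPattern = new RegExp("matchThis" + searchWord);
const dynamicResult = dynamicPattern.test("Does matchThisToo match?");
console.log(dynamicResult); // true
```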
The real power is in finding partial matches. Regular expressions are a powerful way to find matches for complex patterns in a string.
These are the most commonly used methods. For a comprehensive list, see the description of RegExp on MDN.
test
This method returns true when you pass it a string that contains a match for the pattern defined in the RegExp object.
const pattern = /matchThis/;
let foundMatch = pattern.test("Does matchThis match?"); // foundMatch will be true
exec
If it finds a match, this method returns an array containing just the first matched sub-string. The array also has a number of properties, including the index of the first match in the string. If there is no match, it returns null.
const pattern = /th/;
let matches = pattern.exec("There are two matches in this sentence for 'th'.");
// matches: ["th"], matches.index: 25
This method has a number of other more complex features that you can read about in the MDN documentation for exec.
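Because exec can return null, it is usually worth checking the result before reading its properties. A minimal sketch, using made-up test strings:

```javascript
// exec returns an array on success and null on failure,
// so check for null before using the result.
const thPattern = /th/;

const hit = thPattern.exec("Is this a match?");
const miss = thPattern.exec("No luck here");

if (hit !== null) {
  console.log(hit[0]);    // th
  console.log(hit.index); // 3
}
console.log(miss); // null
```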
You might have used * and ? as wildcards in a search before.
With RegExp, the syntax is a little different:
Use . to match any single character.
Example: /They l.ve/ will match:
"They live"
"They love"
Adding a * will match zero-to-many of the character preceding the * .
Example: /Bo*t/ will match:
"Bt"
"Bot"
"Boot"
Adding the + will match one-to-many of the character preceding the + .
Example: /Bo+t/ will match:
"Bot"
"Boot"
But not "Bt"
The dot-star, .* combination will match zero or more occurrences of any character(s).
For example, the pattern /she jump.* high/ will match:
"she jump high"
"she jumped high"
"she jumps high"
"she jumps high all the time."
"Yes, she jumps high!"
Interestingly, all of these patterns match the same strings: /she jump/, /she jump.*/, /.*she jump/, and /.*she jump.*/.
This is because, unless a pattern specifies what must come before or after the match, anything can.
The dot-plus, .+ combination will match one or more occurrences of any character(s).
For example, the pattern, /she jump.+ high/, will match all the same strings as /she jump.* high/ except "she jump high".
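The wildcard rules in this section can be checked with a short script. This is only a sketch; the sample strings (including "Boat") and the variable names are made up for illustration.

```javascript
// Try each pattern against the same list of strings.
const samples = ["Bt", "Bot", "Boot", "Boat"];

const zeroOrMore = /Bo*t/; // zero or more "o" characters
const oneOrMore = /Bo+t/;  // one or more "o" characters
const anyOneChar = /B.t/;  // exactly one character of any kind

// filter keeps only the strings for which test returns true.
const starMatches = samples.filter((s) => zeroOrMore.test(s));
const plusMatches = samples.filter((s) => oneOrMore.test(s));
const dotMatches = samples.filter((s) => anyOneChar.test(s));

console.log(starMatches); // [ 'Bt', 'Bot', 'Boot' ]
console.log(plusMatches); // [ 'Bot', 'Boot' ]
console.log(dotMatches);  // [ 'Bot' ]

// Dot-star allows nothing between the words; dot-plus requires something.
const dotStarResult = /she jump.* high/.test("she jump high");
const dotPlusResult = /she jump.+ high/.test("she jump high");
console.log(dotStarResult); // true
console.log(dotPlusResult); // false
```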
Anchors are used to indicate that a pattern must be applied at the beginning of a string, the end, or must match the entire string.
The pattern below, without anchors, will match a string that contains “this” anywhere:
let pattern = /this/; // declared with let because it is reassigned below
let text = "Is this going to match?";
let foundMatch = pattern.test(text); // foundMatch will be true
The ^ anchor indicates the match must be at the beginning of the string. This pattern will match any string that starts with “This”:
pattern = /^This/;
text = "This should match";
foundMatch = pattern.test(text); // foundMatch will be true
The $ anchor indicates that the match must be at the end of the string. This pattern will match any string that ends with “this”:
pattern = /this$/;
text = "The pattern will match this";
foundMatch = pattern.test(text); // foundMatch will be true
RegExp flags (aka properties) include:
g – global
All matches in the string will be found.
Note: The easiest way to use this is with the match method on a string.
i – ignoreCase
Matches either upper or lower case letters.
m – multiline.
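Works with a string that has multiple lines separated by a \n (new line) character. The ^ and $ anchors then match at the beginning and end of each line, not just of the whole string.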
Flags can be applied when creating a regular expression object.
Literal RegExp object: Put the flag(s) after the slash that ends the regular expression:
const pattern1 = /this/i;
RegExp constructor: Add a second argument to the constructor for the flag(s).
const pattern2 = new RegExp("that", "gi"); // the flags are passed as a string
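To see what each flag changes, here is a small sketch that runs the same search with different flags. The sample sentence and variable names are made up for illustration.

```javascript
const sentence = "This is this, and THIS is that.";

// No flags: case-sensitive, so the first match is the lowercase "this".
const plainIndex = sentence.match(/this/).index;
console.log(plainIndex); // 8

// i flag: case is ignored, so the "This" at the start matches first.
const ignoreCaseIndex = sentence.match(/this/i).index;
console.log(ignoreCaseIndex); // 0

// g and i together: match returns an array of every match.
const allMatches = sentence.match(/this/gi);
console.log(allMatches); // [ 'This', 'this', 'THIS' ]

// m flag: ^ can match at the start of each line, not only the whole string.
const twoLines = "first line\nsecond line";
const withoutM = /^second/.test(twoLines);
const withM = /^second/m.test(twoLines);
console.log(withoutM); // false
console.log(withM);    // true

// The constructor takes the flags as a string in its second argument.
const constructed = new RegExp("this", "gi");
console.log(constructed.flags); // gi
```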
g flag and the lastIndex property
The lastIndex property of a JavaScript RegExp object:
Starting Position: It indicates the character position in the target string where the next search for a match should begin. It starts at 0.
After a match is found: it is set to one character after the match.
The Global Flag (g): The lastIndex property is only used and updated when the regular expression has the g flag (or the sticky y flag).
This is how you would count multiple matches using a RegExp object:
const str = "There are two matches in this sentence for 'th'.";
const regex = /th/g; // The regular expression with the global flag
let count = 0;
let match;
// Use a while loop that continues as long as exec() finds a match
while ((match = regex.exec(str)) !== null) {
  count++;
  // This check is important to avoid an infinite loop in some edge cases
  // like zero-length matches, though not strictly necessary for '/th/'.
  if (match.index === regex.lastIndex) {
    regex.lastIndex++;
  }
}
console.log("The total number of matches is: " + count);
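To make the role of lastIndex visible, the sketch below records where each match starts and where lastIndex points afterwards. It also shows a shorter way to count, using the string match method with the g flag. The variable names are made up for illustration.

```javascript
const text = "There are two matches in this sentence for 'th'.";
const globalPattern = /th/g;

// lastIndex starts at 0 and moves to just past each match.
const positions = [];
let found;
while ((found = globalPattern.exec(text)) !== null) {
  positions.push([found.index, globalPattern.lastIndex]);
}
console.log(positions); // [ [ 25, 27 ], [ 44, 46 ] ]

// After exec returns null, lastIndex is reset to 0.
console.log(globalPattern.lastIndex); // 0

// Shorter alternative: match with the g flag returns all the matches.
const matchCount = text.match(/th/g).length;
console.log(matchCount); // 2
```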
Character groups – a group of characters that can match one character in a string:
const pattern = /[Tt]his/; // matches capital or lower case T
A group can be negated with a caret, ^
const pattern = /[^T]his/; // matches any one character except a capital T, followed by "his"
Outside the square brackets, the caret is the beginning-of-string anchor, so it can be used to require that a group match at the beginning of a string. For example, checking for capitalization of at least the first character:
const pattern = /^[A-Z][a-z]*/
(There can be zero or more lower case letters following the capital letter at the beginning. They may be followed by anything, including upper case letters.)
The $ specifies that a char or group must be at the end of the string. For example, now only the first char can be capitalized:
const pattern = /^[A-Z][a-z]*$/
(All the characters following the first character must be lower case all the way to the end.)
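The difference between these group patterns is easiest to see by running them against a few strings. A minimal sketch, with made-up sample words:

```javascript
const startsWithCapital = /^[A-Z][a-z]*/;  // capital first, anything may follow
const onlyFirstCapital = /^[A-Z][a-z]*$/;  // capital first, then only lower case
const notCapitalT = /[^T]his/;             // any character except T, then "his"

const names = ["Oregon", "oregon", "OReGon", "O"];
const report = [];
for (const name of names) {
  report.push(name + " " + startsWithCapital.test(name) + " " + onlyFirstCapital.test(name));
}
console.log(report.join("\n"));
// Oregon true true
// oregon false false
// OReGon true false
// O true true

// A negated group still has to match one character.
const negatedResults = [
  notCapitalT.test("This"), // false: the character before "his" is a capital T
  notCapitalT.test("this"), // true
  notCapitalT.test("his"),  // false: there is no character before "his"
];
console.log(negatedResults); // [ false, true, false ]
```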
Curly braces, { }, specify the number of times a pattern must match:
Match the pattern exactly n times: {n}
Match the pattern at least n times: {n,}
Match the pattern from a minimum of n times to a maximum of x times: {n,x}
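For example, the pattern /[0-9]{5}/ will match strings that contain five digits in a row, like "97405". Without anchors, a longer run of digits also matches; /^[0-9]{5}$/ matches only a string that is exactly five digits.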
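Here is a short sketch of the three quantifier forms in use. The test strings are made up, and the anchored patterns are an added illustration, not part of the original example.

```javascript
const containsFive = /[0-9]{5}/;  // five digits in a row, anywhere in the string
const exactlyFive = /^[0-9]{5}$/; // the whole string is exactly five digits

const fiveResults = [
  containsFive.test("97405"),      // true
  containsFive.test("97405-1234"), // true
  exactlyFive.test("97405"),       // true
  exactlyFive.test("974051"),      // false: six digits
];
console.log(fiveResults); // [ true, true, true, false ]

const atLeastTwo = /^a{2,}$/; // two or more "a" characters
const twoToFour = /^a{2,4}$/; // between two and four "a" characters

const rangeResults = [
  atLeastTwo.test("a"),     // false
  atLeastTwo.test("aaaaa"), // true
  twoToFour.test("aaa"),    // true
  twoToFour.test("aaaaa"),  // false
];
console.log(rangeResults); // [ false, true, true, false ]
```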
Escape character – the backslash is an escape character that lets you use a special character, like the dot, as a literal character instead of for pattern matching. For example, check for a period at the end of a string:
const pattern = /\.$/
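The sketch below compares an escaped and an unescaped dot, using made-up sentences. It also shows one detail that applies when the pattern is written as a string for the RegExp constructor: the backslash itself has to be doubled inside the string.

```javascript
const anyCharAtEnd = /.$/;  // unescaped dot: any character at the end
const periodAtEnd = /\.$/;  // escaped dot: a literal period at the end

const escapeResults = [
  anyCharAtEnd.test("Is this a question?"), // true: "?" counts as any character
  periodAtEnd.test("Is this a question?"),  // false
  periodAtEnd.test("This is a statement."), // true
];
console.log(escapeResults); // [ true, false, true ]

// In a string, "\\" stands for one backslash, so "\\.$" is the pattern \.$
const periodFromString = new RegExp("\\.$");
const fromStringResult = periodFromString.test("This is a statement.");
console.log(fromStringResult); // true
```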
Metacharacters are characters with a special meaning. A partial listing is shown below. Notice that the upper-case versions do the inverse of the lower-case versions.
| Metacharacter | Description |
|---|---|
| \w | Find a word character (a-z, A-Z, 0-9 and _) |
| \W | Find a non-word character |
| \d | Find a digit |
| \D | Find a non-digit character |
| \s | Find a whitespace character |
| \S | Find a non-whitespace character |
| \b | Find a match at either the beginning or end of a word. |
| \B | Find a match that is not at the beginning or end of a word. |
For a complete list, see the W3Schools JavaScript RegExp Reference in the References below.
Here is an example that will only match whole words:
const pattern = /\bpick\b/;
let results = pattern.exec("How many pecks of pickled peppers did Peter Piper pick?");
// results: ["pick"], results.index: 50, meaning it matched "pick" but not "pickled"
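A few of the other metacharacters can be tried the same way. This is a sketch with a made-up sample string; it uses the string match method with the g flag to collect every match.

```javascript
const rhyme = "Peter Piper picked 12 pecks";

// \w+ matches runs of word characters, so this collects each word.
const wordsFound = rhyme.match(/\w+/g);
console.log(wordsFound); // [ 'Peter', 'Piper', 'picked', '12', 'pecks' ]

// \d+ matches runs of digits.
const numbersFound = rhyme.match(/\d+/g);
console.log(numbersFound); // [ '12' ]

const metaResults = [
  /\s/.test(rhyme),            // true: the string contains spaces
  /\bpick\b/.test(rhyme),      // false: "picked" is not the whole word "pick"
  /\bpick/.test(rhyme),        // true: a word starts with "pick"
  /\W/.test("no_spaces_here"), // false: the underscore is a word character
];
console.log(metaResults); // [ true, false, true, false ]
```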
Use the pipe character, |, to allow a choice between patterns:
const pattern = /JavaScript|C#|Python/;
console.log(pattern.test("We teach C# at LCC")); // true
If you want to add a modifier before or after the choice, put the choice inside parentheses:
const pattern = /^(JavaScript|C#|Python)/;
console.log(pattern.test("Python is an interesting language.")); // true
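The parentheses matter because, without them, the anchor belongs only to the first alternative. A minimal sketch comparing the grouped and ungrouped forms (the ungrouped pattern is added here purely for contrast):

```javascript
const anywhere = /JavaScript|C#|Python/;          // any of the three, anywhere
const atStartGrouped = /^(JavaScript|C#|Python)/; // any of the three, at the start
const atStartUngrouped = /^JavaScript|C#|Python/; // ^ applies only to "JavaScript"

const course = "We teach C# at LCC";

const choiceResults = [
  anywhere.test(course),         // true
  atStartGrouped.test(course),   // false: the string does not start with "C#"
  atStartUngrouped.test(course), // true: "C#" may appear anywhere
];
console.log(choiceResults); // [ true, false, true ]
```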
Test for a valid e-mail address:
(This pattern uses {2,} to indicate a minimum of 2 characters.)
const pattern = /^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}$/i
Rules for naming variables: Names can contain letters, digits, underscores, and dollar signs, but names cannot begin with a digit:
const pattern = /^[A-Z_$][A-Z0-9_$]*$/i
Check for a valid uoregon.edu address:
const pattern = /^[A-Z0-9._%+-]+@uoregon\.edu$/i
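The three example patterns can be tried out as follows. This is only a sketch: the addresses and variable names being tested are made-up samples.

```javascript
const emailPattern = /^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}$/i;
const variablePattern = /^[A-Z_$][A-Z0-9_$]*$/i;
const uoPattern = /^[A-Z0-9._%+-]+@uoregon\.edu$/i;

const emailResults = [
  emailPattern.test("student@example.com"), // true
  emailPattern.test("student@example"),     // false: no dot and ending
  emailPattern.test("student@example.c"),   // false: ending is only one letter
];
console.log(emailResults); // [ true, false, false ]

const variableResults = [
  variablePattern.test("total_2"), // true
  variablePattern.test("$price"),  // true
  variablePattern.test("2total"),  // false: begins with a digit
  variablePattern.test("my-var"),  // false: hyphen is not allowed
];
console.log(variableResults); // [ true, true, false, false ]

const uoResults = [
  uoPattern.test("duck@uoregon.edu"), // true
  uoPattern.test("duck@example.com"), // false
];
console.log(uoResults); // [ true, false ]
```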
Regular Expression Test Page—Regular Expressions 101
Try out regular expressions to see how they work with different test strings.
JavaScript RegExp Reference—W3Schools
JavaScript Guide: Regular Expressions—MDN
Regular Expressions—Ch. 9 in Eloquent JavaScript, 3rd Edition, by Marijn Haverbeke, No Starch Press, 2018.
The Regular Expressions Book – RegEx for JavaScript Developers—Kolade Chris, FreeCodeCamp, 2023.
Beginning JavaScript Lecture Notes by Brian Bird, written 2018, updated are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.