CS133JS Beginning Programming: JavaScript
Topics by Week | |
---|---|
1. Intro to JavaScript programming | 6. Arrays |
2. Functions, Operators and Expressions | 7. Objects and Object Constructors |
3. Conditional Statements: if and switch | 8. Web Page I/O |
4. Repetition: while, do while, and for | 9. Regular Expressions |
5. Midterm Check-Point | 10. Review |
Contents: Introduction; Announcements; Q and A; Regular Expressions; RegExp Methods (test, exec); Matching "wild card" Characters; Matching at the Beginning, Middle, or End of a String; Flags; Groups; Quantifiers; Escape Characters; Metacharacters; Choice (Logical OR); Examples; Resource; References
For fall 2024
Due dates:
Lab 7 on JavaScript events:
Code review due Tuesday, 11/26.
Production version due Sunday, 12/1. (The due date is pushed out due to the Thanksgiving holiday)
This week's quiz on regular expressions closes Sunday, 12/1.
Lab 8 on regular expressions (only part 1 is required; part 2 is extra credit):
Part 2 code review due Tuesday, 12/3.
Part 1 and part 2 production version due Thursday, 12/5.
Term project code review due Thursday, 12/5.
Term project due Tuesday, 12/10.
How is lab 7 going?
How is the term project going?
Any other questions?
One way to compare strings to see if they match is to use a regular expression object. This object is part of the JavaScript language, and something similar exists in almost every other programming language.
RegExp – Regular expression object, used for pattern matching in strings. A JavaScript RegExp object can be created two ways:
Defined with forward slashes: let pattern1 = /matchThis/;
Or by using the new operator: let pattern2 = new RegExp("matchThisToo");
The real power is in finding partial matches. Regular expressions are a powerful way to find matches for complex patterns in a string.
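As a quick check that the two ways of creating a RegExp behave the same, here is a small sketch. The variable names and sample strings are made up for illustration. The constructor form is mainly useful when the pattern text is stored in a variable and is not known until the program runs.

```javascript
// Two ways to build the same pattern (names and strings are illustrative).
const literalPattern = /cat/;                  // regular expression literal
const constructedPattern = new RegExp("cat");  // RegExp constructor

// The constructor can build a pattern from a string held in a variable.
const searchWord = "dog";
const dynamicPattern = new RegExp(searchWord);

console.log(literalPattern.test("concatenate"));      // true, "cat" appears inside the word
console.log(constructedPattern.test("concatenate"));  // true
console.log(dynamicPattern.test("hot dog"));          // true
console.log(dynamicPattern.test("concatenate"));      // false
```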
These are the most commonly used methods. For a comprehensive list, see the description of RegExp on MDN.
test
This method will return true when you pass it a string that contains a match for the pattern defined in the RegExp object.
let pattern = /matchThis/;
let foundMatch = pattern.test("Does matchThis match?"); // foundMatch will be true
exec
If it finds a match, this method returns an array containing just the first matched substring. The array also has a number of properties, including index, the position of that match in the string. If there is no match, it returns null.
let pattern = /th/;
let matches = pattern.exec("There are two matches in this sentence for 'th'.");
// matches: ["th"], matches.index: 25
This method has a number of other more complex features that you can read about in the MDN documentation for exec.
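Because exec returns null when nothing matches, it helps to check the result before reading its properties. A minimal sketch, with made-up sample strings:

```javascript
const thPattern = /th/;
const found = thPattern.exec("Is this it?");       // contains "th"
const notFound = thPattern.exec("No match here");  // does not contain "th"

if (found !== null) {
  console.log(found[0]);     // "th", the matched text
  console.log(found.index);  // 3, where the match starts
}
console.log(notFound);       // null

// Reading notFound.index here would throw an error, so check for null first.
```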
You might have used * and ? as wildcards in a search before.
With RegExp, the syntax is a little different:
Use . to match any single character.
Example: /They l.ve/ will match:
"They live"
"They love"
Adding a * will match zero or more of the character preceding the *.
Example: /Bo*t/ will match:
"Bt"
"Bot"
"Boot"
Adding a + will match one or more of the character preceding the +.
Example: /Bo+t/ will match:
"Bot"
"Boot"
But not "Bt"
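The three wildcard examples above can be tried out with test. This is only a sketch; the extra test strings ("They lve", "Boooot") are added here for illustration.

```javascript
const dotPattern = /They l.ve/;  // . matches any single character
const starPattern = /Bo*t/;      // * allows zero or more "o"
const plusPattern = /Bo+t/;      // + requires at least one "o"

console.log(dotPattern.test("They love"));  // true
console.log(dotPattern.test("They lve"));   // false, the dot needs one character
console.log(starPattern.test("Bt"));        // true
console.log(plusPattern.test("Bt"));        // false
console.log(plusPattern.test("Boooot"));    // true
```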
The dot-star combination, .*, will match zero or more occurrences of any character(s).
For example, the pattern /she jump.* high/ will match:
"she jump high"
"she jumped high"
"she jumps high"
"she jumps high all the time."
"Yes, she jumps high!"
Interestingly, all these regexp patterns match the same strings: /she jump/ is the same as /she jump.*/, /.*she jump/, or /.*she jump.*/.
This is because, unless a regexp pattern specifies that something must come before or after it, anything can.
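One way to convince yourself of this is to run all four patterns against the same strings with test. The sample strings below are made up for illustration. Note that the claim is about whether a string matches; the text returned by exec would differ, because .* pulls extra characters into the match.

```javascript
const samePatterns = [/she jump/, /she jump.*/, /.*she jump/, /.*she jump.*/];
const samples = ["she jump", "Yes, she jumps high!", "he jumps"];

// For each sample, collect the result of every pattern.
const table = samples.map((sample) => samePatterns.map((p) => p.test(sample)));

for (let i = 0; i < samples.length; i++) {
  console.log(samples[i], table[i]);
}
// The first two samples give true for every pattern; the last gives false for every pattern.
```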
The dot-plus combination, .+, will match one or more occurrences of any character(s).
For example, the pattern /she jump.+ high/ will match all the same strings as /she jump.* high/ except "she jump high".
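The difference between the two combinations shows up only when nothing sits between "jump" and " high". A short sketch using the strings from the list above:

```javascript
const zeroOrMore = /she jump.* high/;
const oneOrMore = /she jump.+ high/;

console.log(zeroOrMore.test("she jump high"));   // true, .* can match nothing
console.log(oneOrMore.test("she jump high"));    // false, .+ needs at least one character
console.log(oneOrMore.test("she jumped high"));  // true, .+ matches "ed"
```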
Anchors are used to indicate that a pattern must be applied at the beginning of a string, the end, or must match the entire string.
The pattern below, without anchors, will match a string that contains “this” anywhere:
let pattern = /this/;
let text = "Is this going to match?";
let foundMatch = pattern.test(text); // foundMatch will be true
The ^ anchor indicates the match must be at the beginning of the string. This pattern will match any string that starts with “This”:
pattern = /^This/;
text = "This should match";
foundMatch = pattern.test(text); // foundMatch will be true
The $ anchor indicates that the match must be at the end of the string. This pattern will match any string that ends with “this”:
pattern = /this$/;
text = "The pattern will match this";
foundMatch = pattern.test(text); // foundMatch will be true
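Using both anchors together requires the pattern to match the entire string, with nothing before or after it. A minimal sketch, with illustrative strings:

```javascript
const wholePattern = /^this$/;

console.log(wholePattern.test("this"));           // true, the whole string is "this"
console.log(wholePattern.test("this and that"));  // false, extra text after the match
console.log(wholePattern.test("not this"));       // false, extra text before the match
```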
RegExp flags (aka properties) include:
g – global: All matches in the string will be found.
i – ignoreCase: Matches either upper or lower case letters.
m – multiline: Works with a string that has multiple lines separated by a \n (new line) character. With this flag, the ^ and $ anchors also match at the start and end of each line.
Flags can be applied when creating a regular expression object.
Literal RegExp object: Put the flag(s) after the slash that ends the regular expression:
let pattern1 = /this/i;
RegExp constructor: Add a second argument to the constructor for the flag(s), written as a string:
let pattern2 = new RegExp("that", "gi");
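Here is a rough sketch of what each flag changes. It uses match, a String method that is not covered in these notes, only as a convenient way to count the matches found with the g flag. The sample strings are made up.

```javascript
// i: ignore upper/lower case differences
const ignoreCasePattern = /this/i;
console.log(ignoreCasePattern.test("THIS WORKS"));  // true

// g: find every match, not just the first one
const allMatches = "This is this. this too".match(/this/gi);
console.log(allMatches.length);  // 3

// m: ^ and $ also match at the start and end of each line
const twoLines = "first line\nthis is the second line";
const withoutM = /^this/.test(twoLines);
const withM = /^this/m.test(twoLines);
console.log(withoutM);  // false, "this" is not at the start of the string
console.log(withM);     // true, "this" starts the second line
```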
Character groups – a group of characters that can match one character in a string:
let pattern = /[Tt]his/; // matches capital or lower case T
A group can be negated with a caret, ^, placed just inside the opening bracket:
let pattern = /[^T]his/; // matches any single character except a capital T, followed by "his"
Outside the brackets, the caret is an anchor that ties the match to the beginning of the string. For example, checking for capitalization of at least the first character:
let pattern = /^[A-Z][a-z]*/
(There can be zero or more lower case letters following the capital letter at the beginning. They may be followed by anything, including upper case letters.)
The $ specifies that a char or group must be at the end of the string. For example, now only the first char can be capitalized:
let pattern = /^[A-Z][a-z]*$/
(All the characters following the first character must be lower case all the way to the end.)
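The group patterns above can be compared side by side. This is a sketch with made-up test words; the variable names are not from the notes.

```javascript
const notCapitalT = /[^T]his/;
const startsCapital = /^[A-Z][a-z]*/;
const onlyFirstCapital = /^[A-Z][a-z]*$/;

console.log(notCapitalT.test("this"));  // true, "t" is not a capital T
console.log(notCapitalT.test("This"));  // false

console.log(startsCapital.test("Oregon"));  // true
console.log(startsCapital.test("OREGON"));  // true, only the first letter is checked
console.log(startsCapital.test("oregon"));  // false

console.log(onlyFirstCapital.test("Oregon"));  // true
console.log(onlyFirstCapital.test("OREGON"));  // false, the rest must be lower case
```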
Curly braces, { }, specify the number of times a pattern must match. Write them without spaces inside the braces:
Match the pattern exactly n times: {n}
Match the pattern at least n times: {n,}
Match the pattern from a minimum of n times to a maximum of x times: {n,x}
For example, the pattern /[0-9]{5}/ will match only strings containing at least five digits in a row, like "97405". To accept exactly five digits and nothing else, add anchors: /^[0-9]{5}$/.
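The three quantifier forms can be sketched like this. The patterns are anchored with ^ and $ so that the counts apply to the whole string; the variable names and test strings are illustrative.

```javascript
const exactlyFive = /^[0-9]{5}$/;   // exactly 5 digits
const atLeastTwo = /^[0-9]{2,}$/;   // 2 or more digits
const twoToFour = /^[0-9]{2,4}$/;   // between 2 and 4 digits
const unanchoredFive = /[0-9]{5}/;  // 5 digits in a row anywhere in the string

console.log(exactlyFive.test("97405"));      // true
console.log(exactlyFive.test("974051"));     // false, six digits
console.log(unanchoredFive.test("974051"));  // true, the first five digits match
console.log(atLeastTwo.test("7"));           // false
console.log(twoToFour.test("974"));          // true
console.log(twoToFour.test("97405"));        // false, too many digits
```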
Escape character – the backslash is an escape character that lets you use a special character, like the dot, as a literal character rather than for pattern matching. For example, check for a period at the end of a string:
let pattern = /\.$/
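To see what the backslash changes, compare the escaped dot with an unescaped one. A small sketch with made-up strings. It also shows one detail worth knowing: when the pattern is written as a string for the RegExp constructor, the backslash itself has to be doubled.

```javascript
const endsWithPeriod = /\.$/;   // a literal period at the end
const endsWithAnything = /.$/;  // any single character at the end

console.log(endsWithPeriod.test("Done."));    // true
console.log(endsWithPeriod.test("Done!"));    // false
console.log(endsWithAnything.test("Done!"));  // true, the unescaped dot matches "!"

// In a string, "\\." is needed to produce the pattern \.
const constructedPeriod = new RegExp("\\.$");
console.log(constructedPeriod.test("Done."));  // true
console.log(constructedPeriod.test("Done!"));  // false
```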
Metacharacters are characters with a special meaning. A partial listing is shown below. Notice that the upper-case versions do the inverse of the lower-case versions.
Metacharacter | Description |
---|---|
\w | Find a word character (a-z, A-Z, 0-9 and _) |
\W | Find a non-word character |
\d | Find a digit |
\D | Find a non-digit character |
\s | Find a whitespace character |
\S | Find a non-whitespace character |
\b | Find a match at either the beginning or end of a word. |
\B | Find a match that is not at the beginning or end of a word. |
For a complete list, see the W3Schools JavaScript RegExp Reference in the References below.
Here is an example that will only match whole words:
let pattern = /\bpick\b/;
let results = pattern.exec("How many pecks of pickled peppers did Peter Piper pick?");
// results: ["pick"], results.index: 50, meaning it matched "pick" but not "pickled"
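A few more of the metacharacters from the table can be tried the same way. This is a sketch; the sample strings are made up for illustration.

```javascript
const hasDigit = /\d/;
const hasNonDigit = /\D/;
const hasWhitespace = /\s/;
const onlyWordChars = /^\w+$/;
const wholeWordPick = /\bpick\b/;

console.log(hasDigit.test("Room 204"));          // true
console.log(hasNonDigit.test("204"));            // false, every character is a digit
console.log(hasWhitespace.test("NoSpaces"));     // false
console.log(onlyWordChars.test("user_name1"));   // true
console.log(onlyWordChars.test("user name"));    // false, a space is not a word character
console.log(wholeWordPick.test("pickled"));      // false, "pick" is only part of the word
```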
Use the pipe character, |, to allow a choice between patterns:
let pattern = /JavaScript|C#|Python/;
console.log(pattern.test("We teach C# at LCC"));
If you want to add a modifier before or after the choice, put the choice inside parentheses:
let pattern = /^(JavaScript|C#|Python)/;
console.log(pattern.test("Python is an interesting language."));
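The parentheses matter because, without them, the ^ applies only to the first alternative. A minimal sketch comparing the two versions (the variable names are illustrative):

```javascript
const ungrouped = /^JavaScript|C#|Python/;  // ^ applies to "JavaScript" only
const grouped = /^(JavaScript|C#|Python)/;  // ^ applies to all three choices

const sentence = "We teach C# at LCC";
console.log(ungrouped.test(sentence));  // true, "C#" is allowed anywhere
console.log(grouped.test(sentence));    // false, the string does not start with a choice
console.log(grouped.test("C# is taught at LCC"));  // true
```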
Test for a valid e-mail address:
(This pattern uses {2,} to indicate a minimum of 2 characters.)
let pattern = /^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}$/i
Rules for naming variables: Names can contain letters, digits, underscores, and dollar signs, but names cannot begin with a digit:
let pattern = /^[A-Z_$][A-Z0-9_$]*$/i
Check for a valid uoregon.edu address:
let pattern = /^[A-Z0-9._%+-]+@uoregon\.edu$/i
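The three example patterns can be exercised with a few sample inputs. This is only a sketch: the addresses and names below are invented, and the e-mail pattern is a rough format check rather than proof that an address exists.

```javascript
const emailPattern = /^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}$/i;
const namePattern = /^[A-Z_$][A-Z0-9_$]*$/i;
const uoPattern = /^[A-Z0-9._%+-]+@uoregon\.edu$/i;

console.log(emailPattern.test("student@example.com"));  // true
console.log(emailPattern.test("student@example"));      // false, no dot and ending
console.log(emailPattern.test("student.example.com"));  // false, no @

console.log(namePattern.test("total_2"));  // true
console.log(namePattern.test("$price"));   // true
console.log(namePattern.test("2total"));   // false, begins with a digit
console.log(namePattern.test("my-var"));   // false, hyphen is not allowed

console.log(uoPattern.test("duck@uoregon.edu"));  // true
console.log(uoPattern.test("duck@example.com"));  // false
```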
Regular Expression Test Page—Regular Expressions 101
Try out regular expressions to see how they work with different test strings.
JavaScript RegExp Reference—W3Schools
JavaScript Guide: Regular Expressions—MDN
Regular Expressions—Ch. 9 in Eloquent JavaScript, 3rd Edition, by Marijn Haverbeke, No Starch Press, 2018.
The Regular Expressions Book – RegEx for JavaScript Developers—Kolade Chris, FreeCodeCamp, 2023.
Beginning JavaScript Lecture Notes by Brian Bird, written in 2018 and later updated, are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.