Joe Levi:
a cross-discipline, multi-dimensional problem solver who thinks outside the box – but within reality™

A Web Developer's Introduction to Regular Expressions (REGEX)

According to Wikipedia, “a regular expression (abbreviated as regexp or regex, with plural forms regexps, regexes, or regexen) is a string that describes or matches a set of strings, according to certain syntax rules. Regular expressions are used by many text editors and utilities to search and manipulate bodies of text based on certain patterns.”

When teamed up with JavaScript 1.2+ (present in Netscape 4+, Mozilla, and MSIE 4+), regular expressions can be a very powerful form-validation tool. ASP.NET includes a RegularExpressionValidator control, which makes client-side regex validation on input fields very easy. (You do not need ASP.NET to validate your fields; it is mentioned here only for reference.)
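As a rough sketch of what form validation with regular expressions can look like in plain JavaScript: the field names, the `rules` object, and the `findInvalidFields` helper below are hypothetical, invented for illustration only. In a real page the values would come from the form's input elements.

```javascript
// Hypothetical rules: field name -> regular expression the value must match.
const rules = {
  age: /^\d+$/,          // one or more digits, nothing else
  initial: /^[a-zA-Z]$/  // exactly one letter
};

// Hypothetical helper: returns the names of the fields that fail their rule.
function findInvalidFields(values) {
  return Object.keys(rules).filter(function (name) {
    const value = values[name] == null ? "" : String(values[name]);
    return !rules[name].test(value);
  });
}

console.log(findInvalidFields({ age: "42", initial: "J" }));  // no invalid fields
console.log(findInvalidFields({ age: "4x", initial: "JL" })); // age and initial fail
```

Keeping the patterns in one object means the same rules can be reused wherever the form is checked, for example in a submit handler.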

To get started in regex, here are a few items of interest.
Useful Rules:

Modifier/Assertion   Description                                                      Equivalent
^                    Matches at the beginning of the string
$                    Matches at the end of the string
\b                   Matches a word boundary (between \w and \W) when not inside []
\B                   Matches a non-word boundary
.                    Matches any single character except a line break (wildcard)
g                    Do global pattern matching
i                    Do case-insensitive pattern matching
m                    Multi-line matching; not supported by JavaScript 1.2
s                    Single-line matching (Perl); not supported by JavaScript 1.2
x                    Extended syntax (Perl); not supported by JavaScript 1.2
{m,n}                Must occur at least m times, but not more than n times
{m,}                 Must occur at least m times
{m}                  Must occur exactly m times
*                    Must occur 0 or more times                                       {0,}
+                    Must occur 1 or more times                                       {1,}
?                    Must occur 0 or 1 times                                          {0,1}
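A few of these rules in action, as a minimal sketch using JavaScript's built-in `test`, `match`, and `replace` methods. The sample strings and variable names are made up for illustration.

```javascript
// Anchors: ^ and $ force the pattern to cover the whole string.
const wholeMatch = /^cat$/.test("cat");        // true
const partialOnly = /^cat$/.test("concat");    // false

// \b matches only where a word starts or ends.
const wordFound = /\bcat\b/.test("a cat sat");        // true
const insideWord = /\bcat\b/.test("concatenate");     // false

// Quantifiers: {m,n}, *, + and ?
const twoOrThree = /^a{2,3}$/.test("aa");      // true
const tooMany = /^a{2,3}$/.test("aaaa");       // false
const starAllowsNone = /^ab*$/.test("a");      // true  (b occurs 0 times)
const plusNeedsOne = /^ab+$/.test("a");        // false (b must occur at least once)
const optionalOnce = /^ab?$/.test("abb");      // false (b may occur at most once)

// Modifiers: g finds every match, i ignores case.
const matchCount = "Cat cat CAT".match(/cat/gi).length;     // 3
const firstOnly = "Cat cat CAT".replace(/cat/i, "dog");     // "dog cat CAT"

console.log(wholeMatch, partialOnly, wordFound, insideWord);
console.log(twoOrThree, tooMany, starAllowsNone, plusNeedsOne, optionalOnce);
console.log(matchCount, firstOnly);
```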

Useful Patterns:

Character   Matches                                Equivalent
\n          Line-feed (LF)
\r          Carriage return (CR)
\t          Tab
\v          Vertical tab
\f          Form feed (FF)
\d          A digit                                [0-9]
\D          A non-digit                            [^0-9]
\w          A word (alphanumeric) character        [a-zA-Z_0-9]
\W          A non-word (alphanumeric) character    [^a-zA-Z_0-9]
\s          A whitespace character                 [ \t\v\n\r\f]
\S          A non-whitespace character             [^ \t\v\n\r\f]
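The shorthand classes are easiest to understand by trying them on a string. This is a small sketch with made-up sample text; each shorthand is written with its leading backslash (`\d`, `\w`, `\s`, and so on).

```javascript
// \d+ pulls out each run of digits.
const numbers = "Order 66, aisle 7".match(/\d+/g);       // ["66", "7"]

// \D with the g modifier strips everything that is not a digit.
const digitsOnly = "555-1234".replace(/\D/g, "");        // "5551234"

// \s+ splits on any run of whitespace (spaces, tabs, line breaks).
const parts = "tab\tseparated  line".split(/\s+/);       // ["tab", "separated", "line"]

// \w covers letters, digits and the underscore, but not the hyphen.
const underscoreOk = /^\w+$/.test("user_name1");         // true
const hyphenFails = /^\w+$/.test("user-name");           // false

console.log(numbers, digitsOnly, parts, underscoreOk, hyphenFails);
```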

Useful Characters:

Meta-Character   Regular Character
\\               \
\$               $
\|               |
\(               (
\)               )
\{               {
\^               ^
\*               *
\+               +
\?               ?
\.               .
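To match one of these characters literally, put a backslash in front of it. The sketch below shows the difference for the dot. It also includes a hypothetical `escapeRegExp` helper (not a built-in; the name is an assumption) that escapes arbitrary text before it is used to build a pattern.

```javascript
// An unescaped dot is a wildcard, so it also matches "x".
const wildcardDot = /^3.4$/.test("3x4");      // true
// An escaped dot matches only a literal ".".
const literalDotX = /^3\.4$/.test("3x4");     // false
const literalDot = /^3\.4$/.test("3.4");      // true

// Hypothetical helper: puts a backslash before every meta-character in the text.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a pattern that matches the price text exactly as written.
const price = new RegExp("^" + escapeRegExp("$5.00 (USD)") + "$");
const exactPrice = price.test("$5.00 (USD)"); // true
const wrongPrice = price.test("$5x00 (USD)"); // false

console.log(wildcardDot, literalDotX, literalDot, exactPrice, wrongPrice);
```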

Useful Regular Expressions:

DataType                regex                                  Accepts               Rejects
whitespace              /^\s+$/                                spaces, tabs, etc.    0 a
single letter           /^[a-zA-Z]$/                           a A                   0 . -
alpha string            /^[a-zA-Z]+$/                          aBbB                  a1 23
single digit            /^\d$/                                 1                     00 1a 12
single alphanumeric     /^([a-zA-Z]|\d)$/                      0 a                   00 1a 12
alphanumeric string     /^[a-zA-Z0-9]+$/                       a1b2                  2.3 1.a
integer                 /^\d+$/                                0 23                  3.4
signed integer          /^(\+|-)?\d+$/                         -1 +2                 -1.2 +3.4
floating-point number   /^((\d+(\.\d*)?)|((\d*\.)?\d+))$/      1.1 0.9 .8            -1.1
email address           /^.+@.+\..+$/                          a@b.c                 a@b
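These expressions can be collected into one lookup object and checked against the sample values. This is only a sketch: the `patterns` object and the `check` function are hypothetical names, and the patterns are written with explicit backslashes, anchors, and quantifiers so that they give the Accepts/Rejects results listed for each data type. The email pattern in particular is deliberately loose and only checks the general shape of an address.

```javascript
// Hypothetical lookup table: data type -> regular expression.
const patterns = {
  whitespace: /^\s+$/,
  singleLetter: /^[a-zA-Z]$/,
  alphaString: /^[a-zA-Z]+$/,
  singleDigit: /^\d$/,
  singleAlphanumeric: /^([a-zA-Z]|\d)$/,
  alphanumericString: /^[a-zA-Z0-9]+$/,
  integer: /^\d+$/,
  signedInteger: /^(\+|-)?\d+$/,
  floatingPoint: /^((\d+(\.\d*)?)|((\d*\.)?\d+))$/,
  email: /^.+@.+\..+$/
};

// Hypothetical helper: true when the value matches the named data type.
function check(type, value) {
  return patterns[type].test(value);
}

console.log(check("signedInteger", "-1"));   // true
console.log(check("signedInteger", "-1.2")); // false
console.log(check("floatingPoint", ".8"));   // true
console.log(check("floatingPoint", "-1.1")); // false
console.log(check("email", "a@b.c"));        // true
console.log(check("email", "a@b"));          // false
```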
