Jaison's Blog: Regular expressions basics

Regular expression types

PHP has offered two families of regular expression functions:

POSIX Extended
Perl Compatible

The ereg, eregi, ... functions are the POSIX versions, and preg_match, preg_replace, ... are the Perl compatible versions. With Perl compatible regular expressions, the pattern must be enclosed in delimiters, for example forward slashes (/). The Perl compatible version is also more powerful and faster than the POSIX one. Note that the POSIX functions were deprecated in PHP 5.3 and removed in PHP 7.0, so new code should use the preg_ functions.
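To make the delimiter rule concrete, here is a minimal sketch using preg_match and preg_replace. The subjects and variable names are made up for illustration; the "#" delimiter is just one possible alternative to the forward slash.

```php
<?php
$subject = "Hello world";

// Forward slashes as delimiters around the pattern "world".
$withSlashes = preg_match('/world/', $subject);

// Other non-alphanumeric characters can serve as delimiters too;
// "#" is convenient when the pattern itself contains slashes.
$withHashes = preg_match('#^https?://#', "http://example.com/");

// preg_replace uses the same delimited pattern syntax.
$replaced = preg_replace('/world/', 'Jim', $subject);

echo $withSlashes, " ", $withHashes, " ", $replaced, PHP_EOL;
```

preg_match returns 1 when the pattern matches and 0 when it does not, so both checks here yield 1, and $replaced holds "Hello Jim".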

The regular expressions basic syntax

To use regular expressions, you first need to learn the pattern syntax. The characters inside a pattern fall into these groups:

Normal characters which match themselves, like hello
Start and end indicators: ^ and $
Count indicators (quantifiers) like +, *, ? and {n,m}
Logical operator (alternation): |
Grouping with () and character classes with []
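The groups listed above can be tried out one at a time. The following is a small sketch with made-up subjects; each variable holds the return value of preg_match (1 for a match, 0 for no match).

```php
<?php
// Normal characters match themselves.
$literal = preg_match('/hello/', 'say hello');        // 1

// ^ and $ anchor the pattern to the start and end of the subject.
$anchored = preg_match('/^hello$/', 'say hello');     // 0

// Quantifiers: + (one or more), * (zero or more), ? (zero or one).
$quantified = preg_match('/^he+l*o?$/', 'heee');      // 1

// | chooses between alternatives; () groups them.
$alternation = preg_match('/^(cat|dog)s$/', 'dogs');  // 1

// [] matches one character from a set; {n,m} bounds the repetition count.
$class = preg_match('/^[0-9]{2,4}$/', '2011');        // 1

echo "$literal $anchored $quantified $alternation $class", PHP_EOL;
```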

A simplified example pattern for checking email addresses looks like this:

Code: 
^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$

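The code to check the email looks like this, first with the Perl compatible functions (code1) and then with the POSIX functions (code2):

// code1: Perl compatible version

// Single quotes keep PHP from interpreting "\" and "$" inside the pattern.
$pattern = '/^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$/';

$email = "jaison@demo.com";

// preg_match() returns 1 on a match and 0 otherwise.
if (preg_match($pattern, $email)) {
    echo "Match";
} else {
    echo "Not match";
}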

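// code2: POSIX version, without delimiters.
// eregi() was deprecated in PHP 5.3 and removed in PHP 7.0,
// so this runs only on older PHP versions.

$pattern = '^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$';

$email = "jaison@demo.com";

// eregi() matches case-insensitively.
if (eregi($pattern, $email)) {
    echo "Match";
} else {
    echo "Not match";
}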
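To see how the email pattern behaves on more than one input, the check can be wrapped in a small function. This is only a sketch: isValidEmail and the sample addresses are hypothetical names and data, not part of the original code, and the type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper wrapping the email pattern from this post.
function isValidEmail(string $email): bool
{
    $pattern = '/^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$/';

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $email) === 1;
}

// Made-up sample addresses for illustration.
$samples = ["jaison@demo.com", "jaison@demo", "no-at-sign.com", "a@b.museum"];

foreach ($samples as $sample) {
    echo $sample, ": ", isValidEmail($sample) ? "Match" : "Not match", PHP_EOL;
}
```

Only the first sample matches. The last one is rejected because the part after the dot is limited to 2 to 5 characters, which shows that this simplified pattern also turns away some addresses that are valid in practice.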

Regular expression (pattern)	Match (subject)	Not match (subject)	Comment
world	Hello world	Hello Jim	Match if the pattern is present anywhere in the subject
^world	world class	Hello world	Match if the pattern is present at the beginning of the subject
world$	Hello world	world class	Match if the pattern is present at the end of the subject
/world/i	This WoRLd	Hello Jim	Case-insensitive search; the i modifier follows the closing delimiter
^world$	world	Hello world	The subject is exactly "world"
world*	worl, world, worlddd	wor	There is 0 or more "d" after "worl"
world+	world, worlddd	worl	There is at least 1 "d" after "worl"
world?	worl, world, worly	wor, wory	There is 0 or 1 "d" after "worl"
world{1}	world	worly	There is 1 "d" after "worl"
world{1,}	world, worlddd	worly	There is 1 or more "d" after "worl"
world{2,3}	worldd, worlddd	world	There are 2 or 3 "d" after "worl"
wo(rld)*	wo, world, worldold	wa	There is 0 or more "rld" after "wo"
earth|world	earth, world	sun	The string contains "earth" or "world"
w.rld	world, wwrld	wrld	Any character in place of the dot.
^.{5}$	world, earth	sun	A string with exactly 5 characters
[abc]	abc, bbaccc	sun	There is an "a" or "b" or "c" in the string
[a-z]	world	WORLD	There is at least one lowercase letter in the string
[a-zA-Z]	world, WORLD, Worl12	123	There is at least one lowercase or uppercase letter in the string
[^wW]	earth	w, W	There is at least one character that is not a "w" or "W"
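A few rows of the table can be checked mechanically. The sketch below writes the patterns with forward-slash delimiters, as preg_match requires, and compares each result with the outcome the table gives. The $rows array and the $failures counter are illustrative names, and the array destructuring in foreach assumes PHP 7.1 or later.

```php
<?php
// Each entry: delimited pattern, subject, expected outcome from the table.
$rows = [
    ['/^world/',      'world class', true],
    ['/^world/',      'Hello world', false],
    ['/world$/',      'Hello world', true],
    ['/world/i',      'This WoRLd',  true],
    ['/world{2,3}/',  'world',       false],
    ['/earth|world/', 'sun',         false],
    ['/^.{5}$/',      'earth',       true],
    ['/[^wW]/',       'W',           false],
];

$failures = 0;
foreach ($rows as [$pattern, $subject, $expected]) {
    $actual = preg_match($pattern, $subject) === 1;
    if ($actual !== $expected) {
        $failures++;
    }
    printf("%-16s %-14s %s\n", $pattern, $subject, $actual ? "match" : "no match");
}

echo "Rows that disagree with the table: ", $failures, PHP_EOL;
```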

Sunday, 13 March 2011
