This tutorial explains regular expressions in Java: what they are, why they are useful, and how you can use them efficiently. Each concept is illustrated with a Java regex example along with the code.
Regex, or regular expressions, are used in Java to search text for a pre-defined pattern of characters.
These expressions define exactly what the text you are searching for should look like. In other words, a regex lets you define the criteria for matching or finding one string or sequence of characters inside another string. You can search, manipulate, or edit text using regex. This process is referred to as “applying” a regex to text. The Java string is scanned from left to right when a regex is applied.
Once a character has been consumed by a match, it is not used again in a later match. For example, if you search for “zzz” in “lizzzzzy”, you will find only a single match. You cannot expect to find one match on the first 3 z’s and then another on the last 3 z’s, because the two would overlap on the middle z.
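The non-overlapping behaviour described above can be checked with a short sketch. The class and method names here (NonOverlappingMatches, countMatches) are made up for illustration and are not part of the original example; the sketch simply calls Matcher.find() in a loop and counts the hits.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonOverlappingMatches {

    // Counts how many times the regex is found in the input.
    // Each call to find() resumes after the end of the previous match,
    // so characters that were already consumed are not matched again.
    static int countMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countMatches("zzz", "lizzzzzy")); // prints 1
        System.out.println(countMatches("zz", "lizzzzzy"));  // prints 2
    }
}
```

With five z’s in the input, “zzz” fits only once and “zz” fits twice; the leftover characters are too few for another match.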
You can find your name in a list of students using regex. Or, if you are on a grocery run, you can use regex to look for all the items you want to buy. You have probably also been asked to include special characters or digits when setting a password, so that it meets a certain level of security. Similarly, some browsers check whether the email address you entered follows a pre-defined format. All of this can be achieved by using regex.
Java provides us with the java.util.regex package to match, find, and manipulate text using regular expressions.
Let’s look into a simple example to see how it works.
Java Code
package com.regex.core;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexInJava {

    public static void main(String[] args) {
        String query = "World";
        String data = "Hello World!";

        Pattern pattern = Pattern.compile(query);
        Matcher matcher = pattern.matcher(data);

        // matches() returns true only if the whole input matches the pattern
        boolean result = Pattern.matches(query, data);
        System.out.println("\"" + pattern + "\"" + " matches with \""
                + data + "\" " + result);

        // lookingAt() always starts at the beginning of the region, but
        // unlike matches() it does not require the whole region to match
        result = matcher.lookingAt();
        System.out.println("\"" + pattern + "\"" + " lookingAt() \"" + data + "\" " + result);

        // find() returns true if the pattern is found "anywhere" in the input
        result = matcher.find();
        System.out.println("\"" + pattern + "\"" + " found() in \""
                + data + "\" " + result);

        // matcher.group() returns the string found by the last search
        // matcher.start() returns the index of the first matched character
        // matcher.end() returns the index just after the last matched character
        System.out.println("Found \"" + matcher.group() + "\"" + " starting at index "
                + matcher.start() + " and ending at index " + matcher.end());

        // find(int) resets the matcher and searches from the given index
        result = matcher.find(7);
        System.out.println("\"" + pattern + "\"" + " find(7) in \"" + data + "\" " + result);

        // matcher.group() returns the string found by the last search;
        // the last search found nothing, so IllegalStateException is thrown
        System.out.println("Found \"" + matcher.group() + "\"" + " starting at index "
                + matcher.start() + " and ending at index " + matcher.end());
    }
}
Output
"World" matches with "Hello World!" false
"World" lookingAt() "Hello World!" false
"World" found() in "Hello World!" true
Found "World" starting at index 6 and ending at index 11
"World" find(7) in "Hello World!" false
Exception in thread "main" java.lang.IllegalStateException: No match found
    at java.util.regex.Matcher.group(Matcher.java:536)
    at java.util.regex.Matcher.group(Matcher.java:496)
    at com.regex.core.RegexInJava.main(RegexInJava.java:44)
You can read a more simplified version of this wide topic at Java regex on CodeGym. Here is some key information for your understanding. The package comes with 3 major classes.
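Matcher: It contains the information needed to match the search string (or query, for easier understanding) against the input, and it exposes the result of each search.
Pattern: It defines the criteria for the pattern to search for, in the form of a compiled regular expression.
PatternSyntaxException: It is thrown when a regular expression contains a syntax error, and it helps the user understand and troubleshoot the error in the pattern. Note that the exception shown above (line 44) is a different one: an IllegalStateException, thrown because group() was called after an unsuccessful search. You can use the Java exception mechanism to catch both kinds of exceptions in your code.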
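As a rough sketch of handling these failures, the snippet below catches PatternSyntaxException when a pattern is malformed, and checks the result of find() before calling group() so that the IllegalStateException from the earlier output cannot occur. The class and method names (RegexErrorHandling, isValidRegex, firstMatchFrom) are hypothetical helpers introduced only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class RegexErrorHandling {

    // Returns true if the regex compiles. A malformed regex makes
    // Pattern.compile() throw PatternSyntaxException.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            System.out.println("Invalid pattern \"" + e.getPattern() + "\": "
                    + e.getDescription());
            return false;
        }
    }

    // Returns the first match at or after fromIndex, or null if there is none.
    // Checking the result of find() first avoids the IllegalStateException
    // that group() throws when the last search failed.
    static String firstMatchFrom(String regex, String data, int fromIndex) {
        Matcher matcher = Pattern.compile(regex).matcher(data);
        return matcher.find(fromIndex) ? matcher.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(isValidRegex("\\d\\d\\d"));                  // true
        System.out.println(isValidRegex("[a-z"));                       // false (unclosed character class)
        System.out.println(firstMatchFrom("World", "Hello World!", 0)); // World
        System.out.println(firstMatchFrom("World", "Hello World!", 7)); // null
    }
}
```

Returning null for "no match" is just one possible design; throwing a custom exception or returning an Optional would work equally well.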
There are some special characters called “meta characters” that are used in regex to describe certain patterns, for example a pattern that enforces password security rules. Let’s look at some commonly used meta characters in an example.
package com.regex.core;

import java.util.regex.Pattern;

public class MetaChars {

    public static void main(String[] args) {
        // 1- "." represents any single character
        // each "." accepts whatever char is at that position
        System.out.println("------------Meta Char \".\"-----------");
        boolean check = Pattern.matches(".a...", "mango");
        // true (a 5 letter word whose 2nd char is a)
        System.out.println("Pattern.matches(\".a...\", \"mango\") = " + check);
        check = Pattern.matches("..a", "apple");
        // false (needs exactly 3 chars with a as the 3rd; apple has 5)
        System.out.println("Pattern.matches(\"..a\", \"apple\") = " + check);

        // 2- "\d" represents digits [0..9]
        System.out.println("--------Meta Char \"\\d\"-----------");
        check = Pattern.matches("\\d", "9");
        // true (the input is a single digit)
        System.out.println("Pattern.matches(\"\\d\", \"9\") = " + check );
        check = Pattern.matches("\\d", "kiwi");
        // false (kiwi is not a digit)
        System.out.println("Pattern.matches(\"\\d\", \"kiwi\") = " + check );
        check = Pattern.matches("\\d", "100");
        // false (the pattern allows a single digit only)
        System.out.println("Pattern.matches(\"\\d\", \"100\") = " + check );
        check = Pattern.matches("\\d\\d\\d", "100");
        // true (the input is exactly 3 consecutive digits)
        System.out.println("Pattern.matches(\"\\d\\d\\d\", \"100\"); = " + check );
        check = Pattern.matches("\\d\\d\\d", "321yay!");
        // false (matches() needs the whole input to be 3 digits; "yay!" is extra)
        System.out.println("Pattern.matches(\"\\d\\d\\d\", \"321yay!\"); = " + check );

        // 3- "\D" represents non digits
        System.out.println("---------Meta Char \"\\D\"----------");
        check = Pattern.matches("\\D", "A");
        // true (matches a single non digit)
        System.out.println("Pattern.matches(\"\\D\", \"A\"); = " + check );
        check = Pattern.matches("\\D", "!");
        // true (matches a single non digit)
        System.out.println("Pattern.matches(\"\\D\", \"!\"); = " + check );
        check = Pattern.matches("\\D", "BB");
        // false (the pattern allows one non digit, the input has 2)
        System.out.println("Pattern.matches(\"\\D\", \"BB\"); = " + check );
        check = Pattern.matches("\\D\\D", "b1");
        // false (the 2nd char, 1, is a digit)
        System.out.println("Pattern.matches(\"\\D\\D\", \"b1\"); = " + check );
    }
}
Output
------------Meta Char "."-----------
Pattern.matches(".a...", "mango") = true
Pattern.matches("..a", "apple") = false
--------Meta Char "\d"-----------
Pattern.matches("\d", "9") = true
Pattern.matches("\d", "kiwi") = false
Pattern.matches("\d", "100") = false
Pattern.matches("\d\d\d", "100"); = true
Pattern.matches("\d\d\d", "321yay!"); = false
---------Meta Char "\D"----------
Pattern.matches("\D", "A"); = true
Pattern.matches("\D", "!"); = true
Pattern.matches("\D", "BB"); = false
Pattern.matches("\D\D", "b1"); = false
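Since Pattern.matches() requires the whole input to match, it is not the right tool for a rule such as "the password must contain a digit". The following is a minimal sketch of that kind of check using Matcher.find() with the "\d" meta character. The rule itself and the names PasswordDigitCheck and containsDigit are assumptions made for illustration, not part of the original example.

```java
import java.util.regex.Pattern;

class PasswordDigitCheck {

    // Compiled once and reused; "\\d" matches a single digit.
    private static final Pattern DIGIT = Pattern.compile("\\d");

    // Hypothetical rule for illustration: a password must contain at least one digit.
    static boolean containsDigit(String password) {
        // find() succeeds if the pattern occurs anywhere in the input,
        // whereas matches() would require the whole input to be one digit.
        return DIGIT.matcher(password).find();
    }

    public static void main(String[] args) {
        System.out.println(containsDigit("kiwi"));           // false
        System.out.println(containsDigit("kiwi9"));          // true
        System.out.println(containsDigit("321yay!"));        // true
        System.out.println(Pattern.matches("\\d", "kiwi9")); // false
    }
}
```

A real password policy would usually combine several such checks (length, special characters, and so on); this sketch covers only the digit rule.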
Conclusion
These are just a few meta characters, explained with Java regex examples to keep this post simple. You are encouraged to look into more advanced topics and practice them once you are comfortable with the concepts discussed here. Good luck and happy coding!