# Chapter 5: Regular Expressions - Perl's Superpower
> "Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." - Jamie Zawinski
> "Those people weren't using Perl." - A Perl Programmer
Regular expressions aren't bolted onto Perl as an afterthought or imported from a library. They're woven into the language's DNA. When other languages were struggling with clunky regex APIs, Perl developers were parsing complex log files with one-liners. This chapter will show you why Perl's regex implementation is still unmatched and how to wield this power responsibly.
## The Basics (But Better)
### Match Operator: m//
```perl
#!/usr/bin/env perl
use Modern::Perl '2018';
my $text = "The server at 192.168.1.100 responded in 245ms";
# Basic matching
if ($text =~ /server/) {
    say "Found 'server'";
}
# Capture groups
if ($text =~ /(\d+\.\d+\.\d+\.\d+)/) {
    say "IP address: $1"; # $1 contains first capture
}
# Multiple captures
if ($text =~ /at ([\d.]+) responded in (\d+)ms/) {
    my ($ip, $time) = ($1, $2);
    say "Server $ip took ${time}ms";
}
# The !~ operator for negation
say "No errors!" if $text !~ /error|fail|timeout/i;
# Default variable $_
$_ = "Testing 123";
say "Contains number" if /\d+/; # No need for $_ =~
```
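
The capture variables `$1`, `$2`, and so on are only meaningful after a successful match, which is why the examples above test the match inside an `if`. One alternative is to evaluate the match in list context, where it returns the captures directly (or an empty list on failure). The following is a minimal sketch of that idiom; `parse_timing` and the sample lines are hypothetical names and data made up for illustration, and the script uses only core pragmas rather than `Modern::Perl`.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';

# parse_timing is a hypothetical helper, not part of the chapter's code.
# In list context a match returns its captures, or an empty list on failure.
sub parse_timing {
    my ($line) = @_;
    my @captures = $line =~ /at ([\d.]+) responded in (\d+)ms/;
    return @captures;
}

# Made-up sample lines; the second one carries no timing information.
my @lines = (
    "The server at 192.168.1.100 responded in 245ms",
    "The server at 10.0.0.7 did not respond",
);

for my $line (@lines) {
    # The list assignment is true only when the pattern matched.
    if (my ($ip, $time) = parse_timing($line)) {
        say "Server $ip took ${time}ms";
    }
    else {
        say "No timing found in: $line";
    }
}
```

Because the variables are assigned from the match itself, there is no chance of reading a stale `$1` left over from an earlier successful match.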
### Substitution Operator: s///
```perl
my $config = "ServerName = localhost:8080";
# Basic substitution
$config =~ s/localhost/127.0.0.1/;
say $config; # ServerName = 127.0.0.1:8080
# Global substitution with /g
my $log = "Error Error Warning Error";
$log =~ s/Error/Issue/g;
say $log; # Issue Issue Warning Issue
# Capture and replace
my $date = "2024-01-15";
$date =~ s/(\d{4})-(\d{2})-(\d{2})/$3\/$2\/$1/;
say $date; # 15/01/2024
# Using the result
my $count = $log =~ s/Warning/ALERT/g; # Returns number of replacements
say "Replaced $count warnings";
# The /r modifier returns modified string without changing original
my $original = "hello world";
my $modified = $original =~ s/world/Perl/r;
say $original; # hello world (unchanged)
say $modified; # hello Perl
```
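
The date example above has to escape every `/` in the replacement because `/` is also the delimiter. Perl accepts other delimiters, including bracket pairs, which can make such substitutions easier to read. The sketch below combines `s{...}{...}` with `/r` (available from Perl 5.14); `to_dmy` is a hypothetical helper name chosen for this illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';

# to_dmy is a hypothetical helper name used only for this sketch.
sub to_dmy {
    my ($iso) = @_;
    # Brace delimiters mean the slashes in the replacement need no escaping;
    # /r returns the modified copy instead of changing $iso in place.
    return $iso =~ s{(\d{4})-(\d{2})-(\d{2})}{$3/$2/$1}r;
}

say to_dmy("2024-01-15");           # 15/01/2024
say to_dmy("released 2023-12-31");  # released 31/12/2023
say to_dmy("no date here");         # no date here (returned unchanged)
```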
### The Transliteration Operator: tr/// (or y///)
Not technically a regex, but often used alongside them:
```perl
my $text = "Hello World 123";
# Count characters
my $digit_count = $text =~ tr/0-9//;
say "Contains $digit_count digits";
# ROT13 cipher
$text =~ tr/A-Za-z/N-ZA-Mn-za-m/;
say $text; # Uryyb Jbeyq 123
# Squeeze runs of repeated characters
$text =~ tr/a-z//s; # /s collapses each run of the same letter: "Uryb Jbeyq 123"
# Delete characters
$text =~ tr/0-9//d; # /d deletes matched characters: "Uryb Jbeyq "
```
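
Like `s///`, the transliteration operator accepts `/r` (Perl 5.14 and later) to return a modified copy instead of changing the variable in place, and it can be combined with `/s` or `/d`. The sketch below shows this under that version assumption; `squeeze_lower` and `strip_digits` are hypothetical helper names invented for the example.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helpers for this sketch; /r needs Perl 5.14 or later.
sub squeeze_lower { my ($s) = @_; return $s =~ tr/a-z//sr; }  # collapse runs of a lowercase letter
sub strip_digits  { my ($s) = @_; return $s =~ tr/0-9//dr; }  # delete every digit

my $text = "bookkeeper2024";
say squeeze_lower($text);  # bokeper2024
say strip_digits($text);   # bookkeeper
say $text;                 # bookkeeper2024 (original unchanged)
```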
## Regex Modifiers: Changing the Rules
```perl
# /i - Case insensitive
say "Match!" if "HELLO" =~ /hello/i;
# /x - Extended formatting (ignore whitespace, allow comments)
my $ip_regex = qr/
    ^            # Start of string
    (\d{1,3})    # First octet
    \.           # Literal dot
    (\d{1,3})    # Second octet
    \.           # Literal dot
    (\d{1,3})    # Third octet
    \.           # Literal dot
    (\d{1,3})    # Fourth octet
    $            # End of string
/x;
# /s - Single line mode (. matches newline)
my $html = "<div>\nContent\n</div>";
$html =~ /<div>(.*?)<\/div>/s; # Captures across newlines
# /m - Multi-line mode (^ and $ match line boundaries)
my $multi = "Line 1\nLine 2\nLine 3";
my @lines = $multi =~ /^Line \d+$/gm;
# /g - Global matching
my $data = "cat bat rat";
my @words = $data =~ /\w+/g; # ('cat', 'bat', 'rat')
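# /o - Compile pattern once (legacy). Since Perl 5.6 an interpolated pattern is
# recompiled only when the variable changes, so /o is rarely needed; a
# precompiled qr// object is the usual choice. @huge_file is assumed to exist.
my $pattern = qr/ERROR/; # placeholder pattern for illustration
for my $line (@huge_file) {
    $line =~ $pattern; # Compiled once, when qr// was evaluated
}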
```
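
A pattern built with `qr//` can be stored, interpolated into other patterns, and reused. The sketch below builds an IPv4 pattern equivalent to the `/x` example above from a smaller `qr//` piece and adds a numeric range check, since `\d{1,3}` alone also accepts values such as `999`. The helper name `is_ipv4` is an assumption made for this illustration, and the check is deliberately minimal rather than a complete validator.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';

# Each qr// is compiled when it is evaluated; /x allows whitespace in the pattern.
my $octet    = qr/\d{1,3}/;
my $ip_regex = qr/^ ($octet) \. ($octet) \. ($octet) \. ($octet) $/x;

# is_ipv4 is a hypothetical helper name for this sketch.
sub is_ipv4 {
    my ($candidate) = @_;
    my @octets = $candidate =~ $ip_regex;   # list context: the four captures
    return 0 unless @octets;                # pattern did not match at all
    return 0 if grep { $_ > 255 } @octets;  # shape is right, value out of range
    return 1;
}

for my $candidate ("192.168.1.100", "999.1.1.1", "1.2.3") {
    say "$candidate: ", is_ipv4($candidate) ? "valid" : "invalid";
}
```

Splitting the work this way keeps the regex responsible for the shape of the string and leaves the arithmetic to ordinary Perl code.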
## Advanced Pattern Matching
### Non-Capturing Groups
```perl
# (?:...) doesn't create a capture variable
my $url = "https://www.example.com:8080/path";
if ($url =~ /^(https?):\/\/(?:www\.)?([^:\/]+)(?::(\d+))?/) {
    my ($protocol, $domain, $port) = ($1, $2, $3);
    $port //= $protocol eq 'https' ? 443 : 80;
    say "Protocol: $protocol, Domain: $domain, Port: $port";
}
```
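
The effect of `(?:...)` is easiest to see in list context, where a match returns exactly one value per capturing group. In the sketch below the pattern has five groups but only three of them capture, so three values come back, with `undef` for the optional port when it is absent. `url_parts` is a hypothetical helper name, and the pattern is the one from the example above rewritten with `m{...}` delimiters.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';

# url_parts is a hypothetical helper name for this sketch.
sub url_parts {
    my ($url) = @_;
    # (?:www\.)? and (?::(\d+))? group without capturing, so only the
    # protocol, domain, and port groups are returned.
    my @parts = $url =~ m{^(https?)://(?:www\.)?([^:/]+)(?::(\d+))?};
    return @parts;
}

for my $url ("https://www.example.com:8080/path", "http://example.org/index.html") {
    my ($protocol, $domain, $port) = url_parts($url);
    $port //= $protocol eq 'https' ? 443 : 80;   # default when no port was captured
    say "Protocol: $protocol, Domain: $domain, Port: $port";
}
```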
### Named Captures (Perl 5.10+)
```perl
# Much more readable than $1, $2, $3...
my $log_line = '2024-01-15 10:30:45 [ERROR] Connection timeout';
if ($log_line =~ /
    (?<date>\d{4}-\d{2}-\d{2})\s+
(?