# Chapter 5: Regular Expressions - Perl's Superpower

> "Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." - Jamie Zawinski
>
> "Those people weren't using Perl." - A Perl Programmer

Regular expressions aren't bolted onto Perl as an afterthought or imported from a library. They're woven into the language's DNA. When other languages were struggling with clunky regex APIs, Perl developers were parsing complex log files with one-liners. This chapter will show you why Perl's regex implementation is still unmatched and how to wield this power responsibly.

## The Basics (But Better)

### Match Operator: m//

```perl
#!/usr/bin/env perl
use Modern::Perl '2018';

my $text = "The server at 192.168.1.100 responded in 245ms";

# Basic matching
if ($text =~ /server/) {
    say "Found 'server'";
}

# Capture groups
if ($text =~ /(\d+\.\d+\.\d+\.\d+)/) {
    say "IP address: $1";  # $1 contains first capture
}

# Multiple captures
if ($text =~ /at ([\d.]+) responded in (\d+)ms/) {
    my ($ip, $time) = ($1, $2);
    say "Server $ip took ${time}ms";
}

# The !~ operator for negation
say "No errors!"
    if $text !~ /error|fail|timeout/i;

# Default variable $_
$_ = "Testing 123";
say "Contains number" if /\d+/;  # No need for $_ =~
```

### Substitution Operator: s///

```perl
my $config = "ServerName = localhost:8080";

# Basic substitution
$config =~ s/localhost/127.0.0.1/;
say $config;  # ServerName = 127.0.0.1:8080

# Global substitution with /g
my $log = "Error Error Warning Error";
$log =~ s/Error/Issue/g;
say $log;  # Issue Issue Warning Issue

# Capture and replace
my $date = "2024-01-15";
$date =~ s/(\d{4})-(\d{2})-(\d{2})/$3\/$2\/$1/;
say $date;  # 15/01/2024

# Using the result
my $count = $log =~ s/Warning/ALERT/g;  # Returns number of replacements
say "Replaced $count warnings";

# The /r modifier returns modified string without changing original
my $original = "hello world";
my $modified = $original =~ s/world/Perl/r;
say $original;  # hello world (unchanged)
say $modified;  # hello Perl
```

### The Transliteration Operator: tr/// (or y///)

Not technically a regex, but often used alongside them:

```perl
my $text = "Hello World 123";

# Count characters (an empty replacement list leaves the string unchanged)
my $digit_count = $text =~ tr/0-9//;
say "Contains $digit_count digits";

# ROT13 cipher
$text =~ tr/A-Za-z/N-ZA-Mn-za-m/;
say $text;  # Uryyb Jbeyq 123

# Squeeze runs of repeated characters
$text =~ tr/a-z//s;  # /s squashes adjacent duplicates: "Uryyb" becomes "Uryb"

# Delete characters
$text =~ tr/0-9//d;  # /d deletes matched characters
```

## Regex Modifiers: Changing the Rules

```perl
# /i - Case insensitive
say "Match!" if "HELLO" =~ /hello/i;

# /x - Extended formatting (ignore whitespace, allow comments)
my $ip_regex = qr/
    ^            # Start of string
    (\d{1,3})    # First octet
    \.           # Literal dot
    (\d{1,3})    # Second octet
    \.           # Literal dot
    (\d{1,3})    # Third octet
    \.           # Literal dot
    (\d{1,3})    # Fourth octet
    $            # End of string
/x;

# /s - Single line mode (. matches newline)
my $html = "<div>\nContent\n</div>";
$html =~ /<div>(.*?)<\/div>/s;  # Captures across newlines

# /m - Multi-line mode (^ and $ match line boundaries)
my $multi = "Line 1\nLine 2\nLine 3";
my @lines = $multi =~ /^Line \d+$/gm;

# /g - Global matching
my $data = "cat bat rat";
my @words = $data =~ /\w+/g;  # ('cat', 'bat', 'rat')

# /o - Interpolate and compile the pattern only once
# (a legacy optimization; modern Perl rarely needs it, and qr// is usually the better tool)
my $pattern   = 'rat';    # placeholder pattern for illustration
my @huge_file = ($data);  # placeholder for lines read from a large file
for my $line (@huge_file) {
    $line =~ /$pattern/o;  # Later changes to $pattern are ignored
}
```

## Advanced Pattern Matching

### Non-Capturing Groups

```perl
# (?:...) doesn't create a capture variable
my $url = "https://www.example.com:8080/path";

if ($url =~ /^(https?):\/\/(?:www\.)?([^:\/]+)(?::(\d+))?/) {
    my ($protocol, $domain, $port) = ($1, $2, $3);
    $port //= $protocol eq 'https' ? 443 : 80;
    say "Protocol: $protocol, Domain: $domain, Port: $port";
}
```

### Named Captures (Perl 5.10+)

```perl
# Much more readable than $1, $2, $3...
my $log_line = '2024-01-15 10:30:45 [ERROR] Connection timeout';

if ($log_line =~ /
    (?\d{4}-\d{2}-\d{2})\s+
    (?