Python Regular Expressions: How to Use Regex in Python

Introduction to Python Regular Expressions

Text processing is one of the most common tasks in programming, and regular expressions are a practical tool for much of it: extracting fragments from HTML pages, validating data formats, and processing logs.

Python's standard library includes the re module for working with regular expressions. This guide covers its most commonly used functions, including re.search(), re.findall(), and re.sub(), and shows how each one is applied in practice.

Regular Expression Basics

What Are Regular Expressions?

Regular expressions (RegEx) are a special language for describing search patterns in text. They allow you to search, validate, and replace text fragments based on specific rules.

Common Use Cases for Regular Expressions

Regular expressions in Python solve a wide range of tasks:

  • Validating email addresses and phone numbers
  • Finding all numeric values in text
  • Replacing unwanted characters and cleaning data
  • Extracting specific words or phrases from large texts
  • Parsing structured data
  • Processing logs and system files

Getting Started

Importing the re Module

Before working with regular expressions, you need to import the dedicated module:

import re

Basic Symbols and Constructs

To work effectively with regular expressions, it's important to know the basic symbols and their meanings:

Symbol   Meaning
.        Any character except newline
\d       Any digit (0-9; in str patterns, other Unicode decimal digits too)
\D       Any non-digit character
\w       Letter, digit, or underscore
\W       Any character not matched by \w
\s       Any whitespace character (space, tab, newline, and similar)
\S       Any non-whitespace character
^        Start of string
$        End of string
[]       Any one character from the specified set
*        Zero or more repetitions
+        One or more repetitions
{n,m}    Between n and m repetitions (inclusive)
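
To make the table concrete, here is a small sketch that applies several of these symbols to one string. The sample string and variable names are invented for illustration.

```python
import re

# Illustrative sample string (not from any real data set)
sample = "id_7 costs 1500 USD"

digits = re.findall(r"\d+", sample)          # runs of digits
words = re.findall(r"\w+", sample)           # runs of letters, digits, underscores
uppercase = re.findall(r"[A-Z]+", sample)    # runs of characters from the set A-Z
long_numbers = re.findall(r"\d{2,4}", sample)  # runs of 2 to 4 digits
any_char = re.findall(r"c.s", sample)        # "c", any one character, then "s"

starts_with_id = re.search(r"^id", sample) is not None   # anchored at the start
ends_with_usd = re.search(r"USD$", sample) is not None   # anchored at the end

print(digits)        # ['7', '1500']
print(words)         # ['id_7', 'costs', '1500', 'USD']
print(uppercase)     # ['USD']
print(long_numbers)  # ['1500']
print(any_char)      # ['cos']
print(starts_with_id, ends_with_usd)  # True True
```

Note that the patterns are written as raw strings (the r prefix), so Python does not treat the backslashes as its own escape sequences.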

Core Methods for Working with Regular Expressions

Using re.search() to Find the First Match

The re.search() function scans a string for the first location where the pattern matches. It returns a Match object if a match is found, or None otherwise, so check the result before using it.

import re

text = "Email: example@mail.com"
match = re.search(r'\w+@\w+\.\w+', text)

if match:
    print("Found email:", match.group())

Output:

Found email: example@mail.com
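
Beyond group(), the Match object also reports where the match was found. The sketch below reuses the example above and shows the None case as well. The second input string is made up for illustration.

```python
import re

text = "Email: example@mail.com"
pattern = r'\w+@\w+\.\w+'

match = re.search(pattern, text)
if match:
    print(match.group())  # example@mail.com
    print(match.start())  # 7, index where the match begins
    print(match.end())    # 23, index just past the end of the match
    print(match.span())   # (7, 23)

# When nothing matches, re.search() returns None rather than raising an error
no_match = re.search(pattern, "no address here")
print(no_match)  # None
```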

Breaking Down the Search Pattern

Let's analyze the pattern used, piece by piece:

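  • \w+ : one or more letters, digits, or underscores (the user name)
  • @ : the at symbol (required part of an email)
  • \w+ : one or more letters, digits, or underscores (the domain name)
  • \. : a literal dot (escaped with a backslash, since an unescaped dot matches any character)
  • \w+ : the top-level domain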
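
This pattern is a deliberate simplification. Because \w does not match a dot or a hyphen, it captures only part of some addresses. The sketch below shows this, along with a slightly broader pattern. The broader pattern is a hypothetical illustration, not a complete email validator, and both input strings are invented.

```python
import re

simple_pattern = r'\w+@\w+\.\w+'

plain = re.search(simple_pattern, "Contact: example@mail.com")
dotted = re.search(simple_pattern, "Contact: first.last@mail.example.com")

print(plain.group())   # example@mail.com
print(dotted.group())  # last@mail.example (the address is cut off on both sides)

# Hypothetical broader pattern: allows dots, plus signs, and hyphens in the
# user name, and any number of dot-separated domain labels.
broader_pattern = r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+'

full = re.search(broader_pattern, "Contact: first.last@mail.example.com")
print(full.group())    # first.last@mail.example.com
```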

Using re.findall() to Find All Matches

When you need every occurrence of a pattern rather than just the first, use the re.findall() function. It returns a list of all non-overlapping matches as strings, or an empty list if nothing matches.

text = "Prices: 100 dollars, 250 dollars, 350 dollars"
numbers = re.findall(r'\d+', text)
print(numbers)  # ['100', '250', '350']
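
Since the results are strings, they usually need converting before any arithmetic. If the pattern contains capture groups, re.findall() returns the groups instead of the whole match. A short sketch of both behaviors, using the same text:

```python
import re

text = "Prices: 100 dollars, 250 dollars, 350 dollars"

# Convert the matched strings to integers before summing them
amounts = [int(n) for n in re.findall(r'\d+', text)]
total = sum(amounts)
print(amounts)  # [100, 250, 350]
print(total)    # 700

# With two capture groups, each result is a tuple of the captured parts
pairs = re.findall(r'(\d+) (\w+)', text)
print(pairs)    # [('100', 'dollars'), ('250', 'dollars'), ('350', 'dollars')]
```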

Using re.sub() for Pattern-Based Replacement

The re.sub() function replaces all occurrences of a pattern with a replacement string and returns the result as a new string. The original string is left unchanged.
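
A minimal sketch of typical cleanup tasks with re.sub(). The input strings are invented for illustration. The optional count argument limits how many replacements are made.

```python
import re

# Strip everything that is not a digit from a phone number
phone = "Phone: +1 (555) 123-4567"
digits_only = re.sub(r'\D', '', phone)
print(digits_only)  # 15551234567

# Collapse any run of whitespace into a single space
messy = "too    many\t spaces"
collapsed = re.sub(r'\s+', ' ', messy)
print(collapsed)    # too many spaces

# Replace only the first match by passing count=1
first_only = re.sub(r'\d+', '#', "a1 b22 c333", count=1)
print(first_only)   # a# b22 c333
```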
