Split PDF Files by Page with Python

March 15, 2024

Sometimes you need to split a large PDF document into individual pages. Whether you’re extracting specific pages for sharing or processing pages separately, this Python utility makes it straightforward.

The Solution

Using the PyPDF2 library, we can iterate through a PDF file and save each page as a separate document:

import os

import PyPDF2

def split_pdf(input_pdf_path, output_folder):
    # Create the output folder if it does not exist yet
    os.makedirs(output_folder, exist_ok=True)

    # Open the PDF file in binary mode
    with open(input_pdf_path, 'rb') as input_file:
        # Create a PDF reader object
        pdf_reader = PyPDF2.PdfReader(input_file)

        # Iterate through each page in the PDF
        for page_num, page in enumerate(pdf_reader.pages):
            # Create a new PDF writer object for each page
            pdf_writer = PyPDF2.PdfWriter()
            pdf_writer.add_page(page)

            # Output PDF file name (page numbers start at 1)
            output_pdf_path = os.path.join(output_folder, f"page_{page_num + 1}.pdf")

            # Write the page to a new PDF file
            with open(output_pdf_path, 'wb') as output_file:
                pdf_writer.write(output_file)

# Example usage
input_pdf_path = 'input.pdf'  # Path to your input PDF file
output_folder = 'output_pages'  # Output folder where individual pages will be saved
split_pdf(input_pdf_path, output_folder)

How It Works

Open the PDF: Open the input file in binary mode and pass it to PyPDF2.PdfReader
Iterate through pages: Loop over pdf_reader.pages, one page at a time
Create individual files: For each page, create a new PdfWriter object and add the page to it
Save separately: Write each writer to its own file with a numbered filename
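
One side effect of the numbered filenames is that page_10.pdf sorts before page_2.pdf in a plain alphabetical listing. A minimal sketch of a zero-padded variant follows; the helper name page_filename is hypothetical and not part of the script above.

```python
def page_filename(page_num, total_pages):
    """Return a zero-padded name such as page_007.pdf (hypothetical helper)."""
    # Pad to the number of digits in the highest page number.
    width = len(str(total_pages))
    return f"page_{page_num:0{width}d}.pdf"


# Names for a 12-page document, using 1-based page numbers.
names = [page_filename(n, 12) for n in range(1, 13)]

# With padding, alphabetical order matches page order.
print(sorted(names) == names)
```

Inside split_pdf, this could replace the expression that builds the file name, called with page_num + 1 and len(pdf_reader.pages).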

Installation

First, install the required library:

pip install PyPDF2

Usage

Place your PDF file in the same directory as the script and name it input.pdf (or set input_pdf_path to its full path)
Make sure the output folder exists (output_pages in the example)
Run the script
Find your individual page files in the output folder

The output files will be named page_1.pdf, page_2.pdf, etc.
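
The first two steps can also be checked in code before any splitting happens. This is a minimal sketch, assuming a hypothetical helper named prepare_paths that is not part of the script above; it uses only the standard library.

```python
from pathlib import Path


def prepare_paths(input_pdf_path, output_folder):
    """Check the input file and make sure the output folder exists (hypothetical helper)."""
    input_path = Path(input_pdf_path)
    if not input_path.is_file():
        raise FileNotFoundError(f"Input PDF not found: {input_path}")

    output_path = Path(output_folder)
    # parents=True also creates missing parent folders;
    # exist_ok=True makes it safe to run the script more than once.
    output_path.mkdir(parents=True, exist_ok=True)
    return input_path, output_path
```

Calling prepare_paths('input.pdf', 'output_pages') before split_pdf fails early with a clear message if the input file is missing, rather than partway through the run.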

Use Cases

This utility is useful for:

Extracting specific pages from large documents
Preparing individual pages for different recipients
Processing pages separately for OCR or analysis
Creating page-by-page backups of important documents
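
For the first use case, extracting specific pages, it can help to turn a page selection such as "1-3,5" into the zero-based indices that pdf_reader.pages expects. The sketch below is one possible approach; parse_page_ranges and its spec format are assumptions for illustration, not something PyPDF2 provides.

```python
def parse_page_ranges(spec, total_pages):
    """Turn a spec like "1-3,5" into zero-based page indices (hypothetical helper)."""
    indices = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        # A single number such as "5" is treated as the one-page range "5-5".
        start_text, _, end_text = part.partition("-")
        start = int(start_text)
        end = int(end_text) if end_text else start
        if not 1 <= start <= end <= total_pages:
            raise ValueError(f"Page range out of bounds: {part!r}")
        # Convert 1-based page numbers to 0-based indices.
        indices.extend(range(start - 1, end))
    return indices


print(parse_page_ranges("1-3,5", total_pages=10))  # [0, 1, 2, 4]
```

Each returned index can be used as pdf_reader.pages[i], and adding those pages to a single PdfWriter would produce one extracted document instead of one file per page.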

Simple, effective, and ready to use!