Sometimes you need to split a large PDF document into individual pages. Whether you’re extracting specific pages for sharing or processing pages separately, this Python utility makes it straightforward.
The Solution
Using the PyPDF2 library, we can iterate through a PDF file and save each page as a separate document:
import PyPDF2

def split_pdf(input_pdf_path, output_folder):
    # Open the PDF file
    with open(input_pdf_path, 'rb') as input_file:
        # Create a PDF reader object
        pdf_reader = PyPDF2.PdfReader(input_file)
        # Iterate through each page in the PDF
        for page_num in range(len(pdf_reader.pages)):
            # Create a new PDF writer object for each page
            pdf_writer = PyPDF2.PdfWriter()
            pdf_writer.add_page(pdf_reader.pages[page_num])
            # Output PDF file name
            output_pdf_path = f"{output_folder}/page_{page_num + 1}.pdf"
            # Write the page to a new PDF file
            with open(output_pdf_path, 'wb') as output_file:
                pdf_writer.write(output_file)

# Example usage
input_pdf_path = 'input.pdf'  # Path to your input PDF file
output_folder = 'output_pages'  # Output folder where individual pages will be saved
split_pdf(input_pdf_path, output_folder)
How It Works
- Open the PDF: We use PyPDF2.PdfReader to read the input PDF file
- Iterate through pages: Loop through each page using pdf_reader.pages
- Create individual files: For each page, create a new PdfWriter object
- Save separately: Write each page to its own file with a numbered filename
Installation
First, install the required library:
pip install PyPDF2
Usage
- Place your PDF file in the same directory as the script (or provide the full path)
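- Create the output folder for the split pages before running the script (the script does not create it and fails with FileNotFoundError if it is missing)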
- Run the script
- Find your individual page files in the output folder
The output files will be named page_1.pdf, page_2.pdf, etc.
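The folder step in the list above can also be handled in code. This is a minimal sketch, assuming the same page_<n>.pdf naming as the script; the helper names prepare_output_folder and page_output_path are hypothetical and are not part of PyPDF2 or the original script.

```python
from pathlib import Path


def prepare_output_folder(output_folder):
    """Create the output folder if it is missing and return it as a Path."""
    folder = Path(output_folder)
    # parents=True creates intermediate folders; exist_ok=True makes reruns safe
    folder.mkdir(parents=True, exist_ok=True)
    return folder


def page_output_path(output_folder, page_index):
    """Build the output path for a zero-based page index (page_1.pdf, page_2.pdf, ...)."""
    return Path(output_folder) / f"page_{page_index + 1}.pdf"
```

Calling prepare_output_folder('output_pages') before split_pdf should avoid the missing-folder error, and page_output_path could stand in for the f-string that builds each file name.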
Use Cases
This utility is useful for:
- Extracting specific pages from large documents
- Preparing individual pages for different recipients
- Processing pages separately for OCR or analysis
- Creating page-by-page backups of important documents
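For the first use case, extracting specific pages, it may be enough to copy only the pages you want into one new file instead of splitting everything. The following is a rough sketch, assuming PyPDF2 3.x with the same PdfReader and PdfWriter calls used in the script above; the function names to_page_indices and extract_pages are hypothetical.

```python
def to_page_indices(page_numbers, total_pages):
    """Convert 1-based page numbers to 0-based indices, rejecting out-of-range values."""
    indices = []
    for number in page_numbers:
        if not 1 <= number <= total_pages:
            raise ValueError(f"Page {number} is outside 1..{total_pages}")
        indices.append(number - 1)
    return indices


def extract_pages(input_pdf_path, output_pdf_path, page_numbers):
    """Copy the selected 1-based pages into a single new PDF."""
    # Imported here so to_page_indices can be used without PyPDF2 installed
    import PyPDF2

    with open(input_pdf_path, 'rb') as input_file:
        pdf_reader = PyPDF2.PdfReader(input_file)
        pdf_writer = PyPDF2.PdfWriter()
        for index in to_page_indices(page_numbers, len(pdf_reader.pages)):
            pdf_writer.add_page(pdf_reader.pages[index])
        # Write while the input file is still open, as the original script does
        with open(output_pdf_path, 'wb') as output_file:
            pdf_writer.write(output_file)
```

For example, extract_pages('input.pdf', 'selected.pdf', [1, 3]) would write the first and third pages of input.pdf into selected.pdf, in that order.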
Simple, effective, and ready to use!