Python Topics : Python's pathlib Module
Path Instantiation With Python's pathlib

pathlib represents the file system as objects

Using Path Methods
  • .cwd() - returns the current working directory
    on Windows returns a WidowsPath object
    on 'nix or macOS returns a PosixPath object
  • .home() - returns user's home directory
Passing in a String
can point to a directory or file directly by passing its string representation into Path
>>> from pathlib import Path
>>> Path(r"E:\python\pytopics\pytopics61.html") # raw string as arg
WindowsPath('E:/python/pytopics/pytopics61.html')
Joining Paths
the forward slash operator (/) can join several paths or a mix of paths and strings as long as a Path object is included
use a forward slash regardless of the platform's actual path separator
from pathlib import Path

for file_path in Path.cwd().glob("*.txt"):
    new_path = Path("archive") / file_path.name
    file_path.rename(new_path)
can do the same operation with the .joinpath() method
>>> from pathlib import Path
>>> Path.home().joinpath("python", "scripts", "test.py")
PosixPath('C:/Users/gahjelle/python/scripts/test.py')
File System Operations With Paths
Picking Out Components of a Path
a file or directory path consists of different parts
these parts are conveniently available as properties of Path
  • .name - the filename without any directory
  • .stem - the filename without the extension
  • .suffix - the file extension
  • .anchor - the part of the path before the directories
  • .parent - the directory containing the file or the parent directory if the path is a directory
example
>>> from pathlib import Path
>>> path = Path(r"C:\Users\gahjelle\realpython\test.md")
>>> path
WindowsPath('C:/Users/gahjelle/realpython/test.md')

>>> path.name
'test.md'

>>> path.stem
'test'

>>> path.suffix
'.md'

>>> path.anchor
'C:\\'

>>> path.parent
WindowsPath('C:/Users/gahjelle/realpython")

>>> path.parent.parent
WindowsPath('C:/Users/gahjelle')
.parent returns a new Path object
the other properties return strings

some Path functions

Method Returns Description
.cwd() WindowsPath('C:/Users/phi/rp') get the current working directory
.home() WindowsPath('C:/Users/phi') get the user's home directory
.exists() boolean check if the path points to a file or folder
.is_dir() boolean check if the path points to a folder
.is_file() boolean check if the path points to a file
.with_suffix(".md) WindowsPath('C:/Users/phi/rp/hello.md') get the path with new extension
.with_stem("bye") WindowsPath('C:/Users/phi/rp/bye.txt') get the path with new filename
.with_name("bye.md") WindowsPath('C:/Users/phi/rp/bye.md') get the path with new filename and extension
.read_text(encoding="utf-8") 'Hello' get the contents of a file as a string
.read_bytes() b'Hello' get the contents of a file as bytes
Reading and Writing Files
the way to read or write a file in Python has been to use the built-in open() function
with pathlib can use open() directly on Path objects
from pathlib import Path

path = Path.cwd() / "shopping_list.md"
with path.open(mode="r", encoding="utf-8") as md_file:
    content = md_file.read()
    groceries = [line for line in content.splitlines() if line.startswith("*")]
print("\n".join(groceries))
the .write_ methods shown in the table above handle opening and closing the file
the code above can be simplified by using the .read_text() function
from pathlib import Path

path = Path.cwd() / "shopping_list.md"
content = path.read_text(encoding="utf-8")
groceries = [line for line in content.splitlines() if line.startswith("*")]
print("\n".join(groceries))
can also specify paths directly as filenames
they're interpreted relative to the current working directory
from pathlib import Path

content = Path("shopping_list.md").read_text(encoding="utf-8")
groceries = [line for line in content.splitlines() if line.startswith("*")]
print("\n".join(groceries))
.write_text() and .write_bytes() methods overwrite the existing file without warning

Renaming Files
the .with_suffix(), .with_stem() and .with_name() functions all return a new Path object
the methods do not change the existing file name
to apply the change must use the .replace() method
>>> from pathlib import Path
>>> txt_path = Path("C:/Users/gahjelle/realpython/hello.txt")
>>> txt_path
WindowsPath("C:/Users/gahjelle/realpython/hello.txt")

>>> md_path = txt_path.with_name("goodbye.md")
WindowsPath('C:/Users/gahjelle/realpython/goodbye.md')

>>> txt_path.replace(md_path)
Copying Files
Path doesn't have a method to copy files
example to copy a file
>>> from pathlib import Path
>>> source = Path("shopping_list.md")
>>> destination = source.with_stem("shopping_list_02")
>>> destination.write_bytes(source.read_bytes())
Moving and Deleting Files
to move a file can use .replace() to avoid possibly overwriting the destination path, test whether the destination exists before replacing
from pathlib import Path

source = Path("hello.py")
destination = Path("goodbye.py")

if not destination.exists():
    source.replace(destination)
a possible race condition is while the file may not exist when the check is done, another process may add the file before .replace() is executed
to avoid this possibility use exception handling
from pathlib import Path

source = Path("hello.py")
destination = Path("goodbye.py")

try:
    with destination.open(mode="xb") as file:
        file.write(source.read_bytes())
except FileExistsError:
    print(f"File {destination} exists already.")
else:
    source.unlink()
need to delete source with .unlink() after the copy is done
using else ensures that the source file isn't deleted if the copying fails

Creating Empty Files
to create an empty file with pathlib, can use .touch()method
method is intended to update a file's modification time
can use its side effect to create a new file
>>> from pathlib import Path
>>> filename = Path("hello.txt")
>>> filename.exists()
False

>>> filename.touch()
>>> filename.exists()
True

>>> filename.touch()
to prevent modifying files accidentally, can use the exist_ok parameter and set it to False
>>> filename.touch(exist_ok=False)
Traceback (most recent call last):
  ...
FileExistsError: [Errno 17] File exists: 'hello.txt'
Python pathlib Examples
Counting Files
can conveniently use the .iterdir() method on the given directory
combine .iterdir() with the collections.Counter class to count how many files of each file type are in the current directory
>>> from pathlib import Path
>>> from collections import Counter
>>> Counter(path.suffix for path in Path.cwd().iterdir())
Counter({'.md': 2, '.txt': 4, '.pdf': 2, '.py': 1})
can create more flexible file listings with the methods .glob() and .rglob()
Path.cwd().glob("*.txt") returns all the files with a .txt suffix in the current directory count file extensions starting with p
>>> Counter(path.suffix for path in Path.cwd().glob("*.p*"))
Counter({'.pdf': 2, '.py': 1})
to recurse use .rglob()

Displaying a Directory Tree
a function named tree() will print a visual tree representing the file hierarchy rooted at a given directory
to recurse use .rglob()
def tree(directory):
    print(f"+ {directory}")
    for path in sorted(directory.rglob("*")):
        depth = len(path.relative_to(directory).parts)
        spacer = "    " * depth
        print(f"{spacer}+ {path.name}")
Finding the Most Recently Modified File
to find the most recently modified file in a directory can use the .stat() method to get information about the underlying files
.stat().st_mtime gives the time of last modification of a file
>>> from pathlib import Path
>>> from datetime import datetime
>>> directory = Path.cwd()
>>> time, file_path = max((f.stat().st_mtime, f) for f in directory.iterdir())
>>> print(datetime.fromtimestamp(time), file_path)
2023-03-28 19:23:56.977817 /home/gahjelle/realpython/test001.txt
timestamp returned from a property like .stat().st_mtime represents seconds since January 1, 1970 (epoch)
for a different format can use time.localtime or time.ctime

Creating a Unique Filename
construct a unique numbered filename based on a template string
unique_path.py
def unique_path(directory, name_pattern):
    counter = 0
    while True:
        counter += 1
        path = directory / name_pattern.format(counter)
        if not path.exists():
            return path
to use unique_path() specify a pattern for the filename with room for a counter
function checks the existence of the file path created by joining a directory and the filename including a value for the counter
if the name already exists then function increases the counter and tries again
>>> from pathlib import Path
>>> from unique_path import unique_path
>>> template = "test{:03d}.txt"
>>> unique_path(Path.cwd(), template)
PosixPath("/home/gahjelle/realpython/test003.txt")
index