Python for DevOps Engineers — What You Actually Need to Learn



Introduction

Python is arguably the most useful programming language for DevOps work. The reason is not elegance: it runs almost everywhere, has a library for nearly every tool and cloud API, and strikes the right balance between “quick script” and “maintainable automation.” You don’t need advanced data structures or algorithms to use it effectively in infrastructure work. What you need is a focused subset: file and process handling, HTTP requests, JSON and YAML parsing, Kubernetes and cloud SDK usage, and enough structure to write scripts that don’t become maintenance nightmares. This guide covers exactly that.


The Basics You Need Cold

Variables, Functions, and Control Flow

# Variables are dynamically typed
environment = "production"
replica_count = 3
is_healthy = True

# Functions with default arguments
def get_pod_count(namespace, label_selector="app=myapp"):
    # returns int
    pass

# Control flow
if environment == "production":
    min_replicas = 3
elif environment == "staging":
    min_replicas = 1
else:
    min_replicas = 1

# Loops
namespaces = ["default", "staging", "production"]
for ns in namespaces:
    print(f"Checking namespace: {ns}")

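# List comprehension: common in DevOps scripts
# (sample data so the snippet runs on its own)
pods = [
    {"name": "web-1", "status": "Running"},
    {"name": "web-2", "status": "Pending"},
]
healthy_pods = [p for p in pods if p["status"] == "Running"]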

Error Handling

Production scripts must handle errors gracefully. A script that dies with a bare traceback at 2am tells the on-call engineer very little. One that catches the failures it expects, logs what it was doing, and exits with a non-zero status is far easier to debug.

import sys
import logging

logging.basicConfig(level=logging.INFO, format='%(asctime)s %(levelname)s %(message)s')
logger = logging.getLogger(__name__)

def deploy(image_tag: str) -> bool:
    try:
        # deployment logic
        logger.info(f"Deploying image: {image_tag}")
        return True
    except ConnectionError as e:
        logger.error(f"Failed to connect to Kubernetes API: {e}")
        return False
    except Exception as e:
        logger.error(f"Unexpected error during deploy: {e}")
        raise  # re-raise unexpected errors — don't swallow them

if not deploy("myapp:abc123"):
    logger.error("Deployment failed, exiting")
    sys.exit(1)
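The same pattern extends to transient failures such as a dropped connection, where retrying is usually better than failing on the first attempt. The following is a minimal standard-library sketch, not a definitive implementation: the helper name `retry` and its parameters are illustrative assumptions, and the `tenacity` library listed later in this guide offers a fuller version of the same idea.

```python
import logging
import time

logger = logging.getLogger(__name__)


def retry(func, attempts=3, base_delay=1.0,
          retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call func(); retry on the listed exceptions with exponential backoff.

    Hypothetical helper for illustration. Exceptions not listed in
    retry_on propagate immediately, so unexpected errors are not swallowed.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on as e:
            if attempt == attempts:
                logger.error("Giving up after %d attempts: %s", attempts, e)
                raise
            # Delay doubles each time: base_delay, 2*base_delay, 4*base_delay, ...
            delay = base_delay * 2 ** (attempt - 1)
            logger.warning("Attempt %d failed (%s); retrying in %.1fs",
                           attempt, e, delay)
            sleep(delay)
```

Passing `sleep` as a parameter is a design choice that keeps the helper testable: a test can record the delays instead of actually waiting.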

Working With Files and Configuration

Reading and Writing Files

import json
import yaml  # pip install pyyaml

# Read JSON config
with open("config.json") as f:
    config = json.load(f)

db_host = config["database"]["host"]

# Read YAML (common for Kubernetes manifests)
with open("deployment.yaml") as f:
    manifest = yaml.safe_load(f)

# Write JSON output
with open("results.json", "w") as f:
    json.dump({"status": "deployed", "version": "1.2.3"}, f, indent=2)

# Read environment variables with defaults
import os
db_password = os.environ.get("DB_PASSWORD")
if not db_password:
    raise ValueError("DB_PASSWORD environment variable is required")

environment = os.environ.get("ENVIRONMENT", "dev")  # default to "dev"
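The file and environment reads above can be combined into one loader so that every script resolves its configuration the same way. This is a sketch under stated assumptions: the function name `load_settings`, the `environment` key, and the rule that the environment overrides the file are illustrative choices, not something the snippets above prescribe.

```python
import json
import os
from pathlib import Path


def load_settings(path, env=os.environ):
    """Merge a JSON config file with environment overrides (hypothetical helper)."""
    settings = json.loads(Path(path).read_text())

    # Non-secret settings: the environment wins over the file, then a default
    settings["environment"] = env.get(
        "ENVIRONMENT", settings.get("environment", "dev")
    )

    # Secrets come only from the environment, never from the file
    required = ("DB_PASSWORD",)
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError(
            f"Missing required environment variables: {', '.join(missing)}"
        )
    settings["db_password"] = env["DB_PASSWORD"]
    return settings
```

Failing fast with a list of every missing variable tends to save a round trip compared with discovering them one at a time.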

Path Handling

import json
from pathlib import Path

# Modern Python path handling
config_dir = Path.home() / ".config" / "myapp"
config_dir.mkdir(parents=True, exist_ok=True)

config_file = config_dir / "config.json"
if config_file.exists():
    config = json.loads(config_file.read_text())
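pathlib is also handy for housekeeping tasks such as finding old logs or backups. The sketch below is one way to do it: the function name `find_stale_files`, the `*.log` default pattern, and the age-by-modification-time rule are assumptions made for illustration.

```python
import time
from pathlib import Path


def find_stale_files(directory, max_age_days, pattern="*.log", now=None):
    """Return files matching pattern whose mtime is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    return sorted(
        p for p in Path(directory).glob(pattern)
        if p.is_file() and p.stat().st_mtime < cutoff
    )
```

The function only reports candidates. Deleting them (for example with `Path.unlink()`) is left to the caller, which makes a dry-run mode straightforward.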

Running Shell Commands

DevOps Python scripts often need to run shell commands. Use subprocess — not os.system().

import json
import subprocess
import sys

# Run a command and get output
result = subprocess.run(
    ["kubectl", "get", "pods", "-n", "production", "-o", "json"],
    capture_output=True,
    text=True,
    check=False  # don't raise on non-zero exit
)

if result.returncode != 0:
    print(f"kubectl failed: {result.stderr}")
    sys.exit(1)

pods = json.loads(result.stdout)

# Run a command and let output stream to terminal
subprocess.run(
    ["terraform", "apply", "-auto-approve"],
    check=True  # raises CalledProcessError on non-zero exit
)

Never use shell=True with user input — it’s a shell injection vulnerability. Only use shell=True for trusted, hardcoded commands if you need shell features like pipes.
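To see why the list form is safe, the sketch below passes a hostile-looking string to a child process as a single argument. It uses the current Python interpreter as a stand-in for a real CLI such as kubectl, and the helper name `echo_argument` is hypothetical.

```python
import subprocess
import sys


def echo_argument(value):
    """Pass value to a child process as one argv entry; no shell is involved."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Shell metacharacters are plain text when no shell parses them
hostile = "prod; rm -rf /tmp/example"
print(echo_argument(hostile))
```

The child receives the whole string, semicolon included, as one argument, so nothing after the `;` is executed. If a trusted command really does need `shell=True`, `shlex.quote()` from the standard library can escape individual values, although keeping untrusted input out of shell strings altogether is the simpler rule.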


HTTP Requests and APIs

The requests library handles REST API calls cleanly.

import os

import requests  # pip install requests

# GET request with auth
response = requests.get(
    "https://api.github.com/repos/myorg/myapp/releases/latest",
    headers={"Authorization": f"token {os.environ['GITHUB_TOKEN']}"},
    timeout=10  # always set a timeout
)
response.raise_for_status()  # raises HTTPError on 4xx/5xx
release = response.json()
print(f"Latest release: {release['tag_name']}")

# POST request
response = requests.post(
    "https://hooks.slack.com/services/...",
    json={"text": "Deployment complete ✓"},
    timeout=10
)
response.raise_for_status()

Always set timeout. A script that hangs indefinitely waiting for an HTTP response is a production issue.


Working With the Kubernetes API

The official Kubernetes Python client is the right tool for automating Kubernetes operations.

Install the client:

pip install kubernetes

Then load a configuration and call the API:

from kubernetes import client, config

# Load kubeconfig (from ~/.kube/config or in-cluster service account)
try:
    config.load_incluster_config()  # running inside a pod
except config.ConfigException:
    config.load_kube_config()  # running locally

v1 = client.CoreV1Api()
apps_v1 = client.AppsV1Api()

# List pods in a namespace
pods = v1.list_namespaced_pod(namespace="production", label_selector="app=myapp")
for pod in pods.items:
    print(f"{pod.metadata.name}: {pod.status.phase}")

# Update a deployment's image
apps_v1.patch_namespaced_deployment(
    name="myapp",
    namespace="production",
    body={"spec": {"template": {"spec": {"containers": [
        {"name": "myapp", "image": "myapp:new-tag"}
    ]}}}}
)
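The nested patch body above is easy to mistype, so it can help to build it in a small function. This is a sketch: `image_patch` is a hypothetical helper, and it assumes the client sends the dictionary as a strategic merge patch, where entries in the containers list are matched by name.

```python
def image_patch(container_name, image):
    """Build a patch body that changes a single container's image."""
    return {
        "spec": {"template": {"spec": {"containers": [
            # "name" identifies which container to update; other containers
            # in the pod template are left untouched by a strategic merge patch
            {"name": container_name, "image": image}
        ]}}}
    }


body = image_patch("myapp", "myapp:new-tag")

# With a configured client, the body would be passed along these lines:
# apps_v1.patch_namespaced_deployment(
#     name="myapp", namespace="production", body=body
# )
```

Because the helper is a pure function, it can be unit-tested without a cluster.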

AWS and Azure SDK Usage

AWS with boto3

import boto3

# List EC2 instances
ec2 = boto3.client("ec2", region_name="us-east-1")
response = ec2.describe_instances(
    Filters=[{"Name": "tag:Environment", "Values": ["production"]}]
)

for reservation in response["Reservations"]:
    for instance in reservation["Instances"]:
        print(f"{instance['InstanceId']}: {instance['State']['Name']}")

# Upload to S3
s3 = boto3.client("s3")
s3.upload_file("backup.tar.gz", "my-backups-bucket", "2026/04/17/backup.tar.gz")
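The describe_instances response nests instances inside reservations, which is why the loop above has two levels. A small flattening function keeps that detail in one place. The sketch below runs against a hand-written sample dictionary rather than a live AWS call; `summarize_instances` and the instance IDs are made up for illustration.

```python
def summarize_instances(response):
    """Flatten Reservations -> Instances into (instance_id, state) tuples."""
    return [
        (instance["InstanceId"], instance["State"]["Name"])
        for reservation in response.get("Reservations", [])
        for instance in reservation.get("Instances", [])
    ]


# Hand-written sample in the shape describe_instances returns (trimmed to two fields)
sample_response = {
    "Reservations": [
        {"Instances": [{"InstanceId": "i-0aaa", "State": {"Name": "running"}}]},
        {"Instances": [
            {"InstanceId": "i-0bbb", "State": {"Name": "stopped"}},
            {"InstanceId": "i-0ccc", "State": {"Name": "running"}},
        ]},
    ]
}

running = [iid for iid, state in summarize_instances(sample_response)
           if state == "running"]
print(running)
```

In accounts with many instances the results can span several pages. boto3 provides `ec2.get_paginator("describe_instances")` for that case, and the same function can be applied to each page.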

Azure with azure-mgmt

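# pip install azure-identity azure-mgmt-compute
import os

from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

# Assumption: the subscription ID is supplied via AZURE_SUBSCRIPTION_ID
subscription_id = os.environ["AZURE_SUBSCRIPTION_ID"]

credential = DefaultAzureCredential()  # tries env vars, managed identity, Azure CLI login, and other sources
compute_client = ComputeManagementClient(credential, subscription_id)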

# List VMs
for vm in compute_client.virtual_machines.list_all():
    print(f"{vm.name}: {vm.location}")

Writing Scripts That Are Maintainable

Scripts that start as “quick one-offs” often become critical automation. Write them as if they’ll be maintained by someone who doesn’t know the context.

#!/usr/bin/env python3
"""
cleanup-old-images.py
Removes container images older than N days from GHCR.
Usage: python cleanup-old-images.py --days 30 --dry-run
"""

import argparse
import logging
import sys

def parse_args():
    parser = argparse.ArgumentParser(description="Clean up old container images")
    parser.add_argument("--days", type=int, default=30, help="Delete images older than N days")
    parser.add_argument("--dry-run", action="store_true", help="Show what would be deleted")
    return parser.parse_args()

def main():
    args = parse_args()
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger(__name__)

    if args.dry_run:
        logger.info("DRY RUN — no images will be deleted")

    # ... implementation

if __name__ == "__main__":
    main()

The patterns that matter: a docstring at the top explaining what the script does, argparse for CLI arguments (not hardcoded values), the logging module (not print()), and a main() function that’s callable and testable.
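One way to make main() genuinely testable is to let it accept the argument list and its side effects as parameters. The sketch below assumes a hypothetical image list of (name, age_in_days) tuples and a `delete` callable supplied by the caller. It does not talk to GHCR or any real registry.

```python
import argparse
import logging

logger = logging.getLogger("cleanup")


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Clean up old container images")
    parser.add_argument("--days", type=int, default=30,
                        help="Delete images older than N days")
    parser.add_argument("--dry-run", action="store_true",
                        help="Show what would be deleted")
    # argv=None makes argparse fall back to sys.argv[1:]
    return parser.parse_args(argv)


def select_for_deletion(images, days):
    """images: iterable of (name, age_in_days) tuples (hypothetical shape)."""
    return [name for name, age in images if age > days]


def main(argv=None, images=(), delete=None):
    args = parse_args(argv)
    for name in select_for_deletion(images, args.days):
        if args.dry_run or delete is None:
            logger.info("Would delete %s", name)
        else:
            delete(name)
            logger.info("Deleted %s", name)
    return 0  # exit code for the caller to pass to sys.exit()
```

In a real script the entry point would typically end with `sys.exit(main())` under the usual `if __name__ == "__main__":` guard, with `images` and `delete` wired to the registry client of your choice.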


Useful Libraries for DevOps Python

Library          Purpose
---------------  -----------------------------------------
requests         HTTP API calls
boto3            AWS SDK
azure-mgmt-*     Azure SDKs
kubernetes       Kubernetes Python client
pyyaml           YAML parsing
click            Better CLI argument parsing than argparse
rich             Beautiful terminal output, progress bars
python-dotenv    Load .env files into environment
tenacity         Retry logic with backoff

Conclusion

Python in DevOps is about automation and integration — calling APIs, parsing JSON/YAML, running subprocesses, and gluing tools together. You don’t need to master every Python feature. Master the fundamentals: file I/O, error handling, subprocess, HTTP requests, and the cloud/Kubernetes SDKs. Write scripts that have proper logging, handle errors explicitly, and take configuration from environment variables rather than hardcoded values. That’s the Python skill level that makes you effective in an infrastructure role.


Written by

Ashok Valakatla

10 years building production infrastructure on Azure and AWS. Azure Certified DevOps Expert, Solution Architect, and Administrator.
