Linux Fundamentals Every DevOps Engineer Must Know



Introduction

Most of the DevOps stack runs on Linux. Containers are built on Linux kernel features, most Kubernetes nodes run Linux, and so do most CI runners. When something breaks in production, the investigation starts at the Linux level: checking logs, inspecting processes, tracing network connections. This guide covers the Linux fundamentals that come up repeatedly in real DevOps work: not the full Unix philosophy, but the specific commands and concepts that will save you during an incident or a debugging session.


File System Navigation and Permissions

The Linux file system is the foundation. These commands should feel automatic.

# Navigation
pwd                   # where am I?
ls -la                # list all files including hidden, with permissions
cd /var/log           # change directory
find /etc -name "*.conf"   # find all .conf files in /etc

File permissions use the octal notation you’ll see everywhere:

# Permission breakdown: rwxr-xr-x
# Owner: rwx (7) | Group: r-x (5) | Others: r-x (5)
chmod 755 deploy.sh   # owner: read/write/execute; group and others: read/execute
chown app:app /srv/myapp  # change owner and group

Why it matters in DevOps: Container image files, SSH keys, and application config files all have specific permission requirements. A wrong chmod is a common source of “permission denied” errors in containers.
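
To see how an octal mode maps to the symbolic string, here is a minimal sketch using a throwaway file. The temp directory and the `deploy.sh` file inside it are for illustration only, and `stat -c` assumes GNU coreutils or BusyBox.

```shell
#!/bin/bash
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
script="$workdir/deploy.sh"
printf '#!/bin/bash\necho "deploying"\n' > "$script"

# 755 = rwxr-xr-x: owner can write; everyone can read and execute.
chmod 755 "$script"
mode_755=$(stat -c '%a %A' "$script")
echo "$mode_755"    # prints: 755 -rwxr-xr-x

# 600 = rw-------: owner only. ssh rejects private keys that others can read.
chmod 600 "$script"
mode_600=$(stat -c '%a %A' "$script")
echo "$mode_600"    # prints: 600 -rw-------

rm -rf "$workdir"
```

Running `stat -c '%a %A' <file>` on a file that triggers “permission denied” is often quicker than decoding `ls -la` output by eye.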


Process Management

Understanding running processes is essential during incidents.

# View running processes
ps aux                       # all processes with details
ps aux | grep nginx          # find nginx processes
top                          # interactive process viewer
htop                         # friendlier interactive viewer (often needs installing)

# Signals
kill -15 <pid>               # graceful shutdown (SIGTERM)
kill -9 <pid>                # force kill (SIGKILL) — use as last resort

# Systemd services
systemctl status nginx       # check service status
systemctl restart nginx      # also: start, stop, reload
systemctl enable nginx       # start on boot
journalctl -u nginx -f       # follow nginx logs via systemd
journalctl -u nginx --since "1 hour ago"

Why it matters in DevOps: During an incident, the first step is finding out what the process is actually doing. For systemd-managed services on a node, such as the kubelet or container runtime on most Kubernetes nodes, systemctl and journalctl are your first line of investigation.
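
The difference between SIGTERM and SIGKILL is easier to see than to describe. The sketch below uses a made-up `worker` function as a stand-in for a real service; it is an illustration of signal handling, not how any particular daemon is implemented.

```shell
#!/bin/bash
# A stand-in for a long-running service that cleans up on SIGTERM.
worker() {
  trap 'echo "worker: got SIGTERM, cleaning up"; exit 0' TERM
  while true; do sleep 0.1; done
}

worker &            # run in the background, like a daemon
pid=$!
sleep 0.5           # give it a moment to install the trap

kill -15 "$pid"     # polite request: the trap runs and the worker exits cleanly
term_status=0
wait "$pid" || term_status=$?
echo "exit status after SIGTERM: $term_status"

sleep 30 &          # a process with no handler
pid=$!
kill -9 "$pid"      # SIGKILL cannot be caught, so no cleanup happens
kill_status=0
wait "$pid" || kill_status=$?
echo "exit status after SIGKILL: $kill_status"   # 128 + 9 = 137
```

An exit status of 137 in a container or pod status usually means the process was killed with SIGKILL, which is why it tends to show up alongside out-of-memory kills.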


Network Diagnostics

This is the section that separates engineers who can trace a problem from those who can’t.

# Check what's listening on which ports
ss -tulnp                    # modern replacement for netstat (use sudo to see every process name)
netstat -tulnp               # legacy equivalent from net-tools; often absent on newer distros

# Test connectivity
nc -zv myservice.internal 8080       # is port 8080 reachable?
curl -v https://api.example.com/health   # inspect full HTTP request/response
curl -I https://api.example.com          # headers only
telnet myhost 5432           # test database port connectivity

# DNS resolution
nslookup myservice.internal  # basic DNS lookup
dig myservice.internal       # detailed DNS response including TTL
dig myservice.internal @8.8.8.8  # query specific DNS server

# Trace network path
traceroute api.example.com   # trace hops to a host
mtr api.example.com          # better traceroute with real-time stats

Why it matters in DevOps: “Service can’t connect to database” is one of the most common production issues. The diagnostic sequence is: nc to test port connectivity → nslookup/dig to verify DNS if the name may not be resolving → curl to test the full request path when the target speaks HTTP. Know this sequence by heart.
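
The sequence can be wrapped in a small helper. This is a sketch under assumptions: `check_service` is a hypothetical function name, it uses bash's built-in `/dev/tcp` instead of nc so that it runs where nc is not installed, and it uses `getent hosts` for the DNS step, which goes through the system resolver and so is not identical to what dig reports.

```shell
#!/bin/bash
# Hypothetical helper: test the port first, then use DNS to explain a failure.
# Usage: check_service <host> <port> [url]
check_service() {
  local host=$1 port=$2 url=${3:-}

  # 1. TCP: can a connection be opened within 3 seconds?
  if ! timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
    # 2. DNS: tell "name does not resolve" apart from "port closed or filtered".
    if getent hosts "$host" >/dev/null; then
      echo "FAIL tcp: $host resolves, but port $port is not reachable"
      return 2
    fi
    echo "FAIL dns: $host does not resolve"
    return 1
  fi

  # 3. HTTP (optional): only meaningful for HTTP services, not databases.
  if [ -n "$url" ] && ! curl -fsS --max-time 5 -o /dev/null "$url"; then
    echo "FAIL http: $url did not return a success status"
    return 3
  fi

  echo "OK: $host:$port"
}
```

A call such as `check_service myservice.internal 8080` covers the first two steps; passing a health URL as the third argument adds the curl check.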


Log Inspection

Fast log searching is a core production skill.

# Tail a live log
tail -f /var/log/nginx/access.log

# Search within a file
grep "ERROR" /var/log/app.log
grep -i "timeout" /var/log/app.log     # case insensitive
grep -n "connection refused" /var/log/app.log  # show line numbers
grep -A 5 "CRITICAL" /var/log/app.log  # show 5 lines after match

# Filter with awk
awk '{print $1, $9}' /var/log/nginx/access.log  # fields 1 and 9: client IP and status code in the default combined format

# Stream processing
tail -f /var/log/app.log | grep "ERROR"   # live filtering
grep -c "500" /var/log/app.log            # count lines containing "500" (loose match; tighten the pattern for accuracy)

Why it matters in DevOps: When an alert fires at 2am, you’re reading logs to find the error. Fast grep and tail skills mean the difference between a 5-minute investigation and a 30-minute one.
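
Combining awk with sort and uniq turns a raw log into a quick summary. The sketch below builds its own sample file with invented entries in nginx's default combined layout, so it can be tried anywhere.

```shell
#!/bin/bash
# Build a tiny sample log (invented entries, nginx "combined" layout).
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [12/Apr/2026:02:00:01 +0000] "GET /health HTTP/1.1" 200 12 "-" "curl/8.0"
10.0.0.2 - - [12/Apr/2026:02:00:02 +0000] "POST /orders HTTP/1.1" 500 87 "-" "app/1.0"
10.0.0.2 - - [12/Apr/2026:02:00:03 +0000] "POST /orders HTTP/1.1" 500 87 "-" "app/1.0"
10.0.0.3 - - [12/Apr/2026:02:00:04 +0000] "GET /items HTTP/1.1" 404 0 "-" "app/1.0"
EOF

# Field 9 is the status code: count requests per status, most frequent first.
status_counts=$(awk '{print $9}' "$log" | sort | uniq -c | sort -rn)
echo "$status_counts"

# Which client IPs are receiving the 5xx responses?
error_clients=$(awk '$9 ~ /^5/ {print $1}' "$log" | sort -u)
echo "$error_clients"

rm -f "$log"
```

Matching on the status field, as the second awk command does, avoids the false positives a bare `grep "500"` picks up from response sizes or timestamps.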


Disk and Resource Usage

Running out of disk is one of the most preventable production incidents.

# Disk usage
df -h                        # disk usage by filesystem, human-readable
du -sh /var/log/*            # size of each item in /var/log
du -sh * | sort -h           # sort by size

# Memory
free -h                      # memory usage summary
cat /proc/meminfo            # detailed memory info

# CPU and load
uptime                       # load averages (1, 5, 15 minute)
vmstat 1 5                   # system stats every 1 second, 5 times
iostat -x 1                  # disk I/O stats

Why it matters in DevOps: “Pod OOMKilled” and “disk full” alerts are common. Diagnosing them requires knowing where memory is going and which directories are consuming disk.
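
Two patterns cover most “disk full” investigations: ranking directories by size, and reading the used percentage as a bare number for a threshold check. This sketch uses a scratch directory with made-up file names standing in for something like /var/log.

```shell
#!/bin/bash
# Scratch tree standing in for a real log directory.
root=$(mktemp -d)
mkdir -p "$root/app" "$root/nginx" "$root/audit"
head -c 2000000 /dev/urandom > "$root/app/app.log"       # about 2 MB
head -c 200000  /dev/urandom > "$root/nginx/access.log"  # about 200 KB
head -c 1000    /dev/urandom > "$root/audit/audit.log"   # about 1 KB

# Largest directories first; -k keeps the unit fixed so a numeric sort works.
largest=$(du -sk "$root"/* | sort -rn | head -n 3)
echo "$largest"

# Percent used on the root filesystem as a bare number, ready for a comparison.
used_pct=$(df -P / | awk 'NR==2 {sub(/%/, "", $5); print $5}')
echo "root filesystem is ${used_pct}% full"

rm -rf "$root"
```

The `df -P` form forces one line per filesystem, which keeps the awk field positions stable even when device names are long.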


Shell Scripting Basics

DevOps automation requires writing shell scripts that are reliable and readable.

#!/bin/bash
set -euo pipefail  # exit on error, undefined var, pipe failure

ENVIRONMENT=${1:-"dev"}  # first arg, default to "dev"
LOG_FILE="/var/log/deploy.log"

# IMAGE_TAG must come from the environment (for example, exported by the CI job).
# Fail early with a clear message instead of an "unbound variable" error mid-deploy.
: "${IMAGE_TAG:?IMAGE_TAG must be set}"

deploy() {
  echo "$(date): Deploying to $ENVIRONMENT" | tee -a "$LOG_FILE"
  kubectl set image deployment/myapp myapp="myapp:${IMAGE_TAG}"
}

rollback() {
  echo "$(date): Rolling back" | tee -a "$LOG_FILE"
  kubectl rollout undo deployment/myapp
}

# Start the rollout
deploy

# Check if deployment succeeded
if kubectl rollout status deployment/myapp --timeout=120s; then
  echo "Deployment successful"
else
  echo "Deployment failed, rolling back"
  rollback
  exit 1
fi

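The set -euo pipefail line at the top is the most important line in a production shell script. Without it, a failed command does not stop the script, unset variables silently expand to empty strings, and a failure early in a pipeline is hidden by the exit status of the last command, so the script keeps running in a broken state.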
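
Each of the three options can be observed in isolation. In this sketch every snippet runs in its own `bash -c` process so the options do not leak into the calling shell; `UNDEFINED_TAG_EXAMPLE` is a made-up variable name assumed to be unset.

```shell
#!/bin/bash
# Without strict mode: the failing command is ignored and the script carries on.
lenient=$(bash -c 'false; echo "still running"')
echo "lenient: $lenient"

# With -e: the script stops at the first failing command.
strict=$(bash -c 'set -euo pipefail; false; echo "still running"') || strict_status=$?
echo "strict: '${strict}' (exit status ${strict_status:-0})"

# pipefail: a pipeline fails if any stage fails, not only the last one.
bash -c 'false | true' && default_pipe="success"
bash -c 'set -o pipefail; false | true' || strict_pipe="failure"
echo "default pipeline: ${default_pipe:-}, with pipefail: ${strict_pipe:-}"

# -u: referencing an unset variable is an error instead of an empty string.
bash -c 'set -u; echo "tag=$UNDEFINED_TAG_EXAMPLE"' 2>/dev/null || unset_status=$?
echo "unset variable exit status: ${unset_status:-0}"
```

One caveat worth knowing: `set -e` does not trigger for commands tested by `if`, `while`, `&&` or `||`, which is why conditions like `if kubectl rollout status ...` remain safe to use under strict mode.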


Environment Variables and Configuration

# Print environment variable
echo $HOME
printenv PATH

# Set a variable (current shell and the child processes it starts)
export DB_HOST="postgres.internal"

# Check if variable is unset or empty
if [ -z "${DB_PASSWORD:-}" ]; then
  echo "DB_PASSWORD is not set"
  exit 1
fi

# Load from a file (values stay local to the shell unless the file uses export)
source .env         # or
. .env

Why it matters in DevOps: Container configuration is injected via environment variables. Understanding how they’re set and inherited is fundamental to debugging containerized apps.
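
Inheritance is the part that most often surprises people, so here is a small demonstration. `APP_MODE` is a made-up variable name used only for this sketch.

```shell
#!/bin/bash
unset APP_MODE      # start from a known state

APP_MODE="debug"    # plain shell variable: visible here, not passed to children
before=$(sh -c 'echo "${APP_MODE:-<unset>}"')
echo "child before export: $before"    # prints: child before export: <unset>

export APP_MODE     # now part of the environment that child processes inherit
after=$(sh -c 'echo "${APP_MODE:-<unset>}"')
echo "child after export:  $after"     # prints: child after export:  debug

# A child can change its own copy, but the parent's value is untouched.
sh -c 'APP_MODE="changed in child"'
echo "parent still has:    $APP_MODE"  # prints: parent still has:    debug
```

The same rule explains a common container bug: a variable set in one script is invisible to the application unless it was exported, or injected by the container runtime in the first place.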


SSH and Secure Remote Access

# Connect to a remote host
ssh -i ~/.ssh/mykey.pem user@hostname

# Copy files
scp myfile.txt user@hostname:/tmp/
rsync -av ./mydir/ user@hostname:/opt/mydir/

# Port forwarding (useful for accessing services in a cluster)
ssh -L 5432:postgres.internal:5432 bastion.example.com
# Now localhost:5432 tunnels to postgres.internal:5432 through bastion

# SSH config for shortcuts
# ~/.ssh/config
Host bastion
  HostName bastion.example.com
  User ubuntu
  IdentityFile ~/.ssh/prod.pem

Cron and Scheduled Jobs

# Edit crontab
crontab -e

# Cron syntax: minute hour day-of-month month day-of-week
# Run at 9am Monday through Friday
0 9 * * 1-5 /opt/scripts/backup.sh >> /var/log/backup.log 2>&1

# Common patterns
0 * * * *     # every hour at minute 0
*/15 * * * *  # every 15 minutes
0 2 * * 0     # every Sunday at 2am

# List current crontab
crontab -l

The Troubleshooting Sequence

When something is broken, this is the order:

  1. Check if the process is running: ps aux | grep <service> or systemctl status <service>
  2. Check the logs: journalctl -u <service> -n 100 or tail -f /var/log/<service>.log
  3. Check network connectivity: ss -tulnp to confirm the service is listening on its port, nc to test reachability from the client side
  4. Check disk and memory: df -h, free -h
  5. Check DNS: dig <hostname> if a service can’t reach another service by name

That sequence resolves the majority of production Linux issues. Have it memorized.
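
The five steps above can be collected into one helper for a first pass. This is a rough sketch: `triage` is a hypothetical function name, it uses `getent hosts` in place of dig for the DNS step, and it skips tools that are not installed rather than failing.

```shell
#!/bin/bash
# Hypothetical helper: run the triage steps in order for one service.
# Usage: triage <service-name> [hostname-to-resolve]
triage() {
  local svc=$1 host=${2:-}

  echo "== 1. process =="
  ps aux 2>/dev/null | grep -v grep | grep -- "$svc" || echo "no process matching '$svc'"

  echo "== 2. logs =="
  if command -v journalctl >/dev/null 2>&1; then
    journalctl -u "$svc" -n 20 --no-pager 2>/dev/null || echo "no journal readable for $svc"
  else
    echo "journalctl not available; look under /var/log instead"
  fi

  echo "== 3. listening ports =="
  if command -v ss >/dev/null 2>&1; then ss -tuln; else echo "ss not available"; fi

  echo "== 4. disk and memory =="
  df -h /
  if command -v free >/dev/null 2>&1; then free -h; fi

  if [ -n "$host" ]; then
    echo "== 5. DNS =="
    getent hosts "$host" || echo "cannot resolve $host"
  fi
}
```

Something like `triage nginx api.example.com` gives a one-screen snapshot to paste into an incident channel before digging into any single step.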


Conclusion

Linux fluency isn’t about memorizing every flag of every command. It’s about having a reliable mental model of how the system works — processes, files, networking, storage — and knowing which commands to reach for when something breaks. The commands in this guide will come up in almost every production incident. Practice them in a real environment (a cheap cloud VM or local VM), not just in a course. The muscle memory is what matters.



Written by

Ashok Valakatla

10 years building production infrastructure on Azure and AWS. Azure Certified DevOps Expert, Solution Architect, and Administrator.
