How to Monitor Cron Jobs with Retry and Alerts

The problem with cron jobs

When a cron job fails silently, you might not notice until it's too late. A script error, a crash, or a network issue can stop a job from working, and cron itself offers little beyond emailing the job's output, which only helps if mail is set up on the host.

What we’ll configure

Monitoring for cron job success
Retry logic on failure
Alerting if all retries fail

1. Wrap your cron job in a shell script

This wrapper runs the job up to 3 times, waiting 10 seconds between attempts, and then reports the outcome to the monitoring service:

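#!/bin/bash
# Run the task up to MAX_RETRIES times, then report the result to the monitor.

MAX_RETRIES=3   # total number of attempts
RETRY_DELAY=10  # seconds to wait between attempts
COUNT=0
SUCCESS=0

while [ "$COUNT" -lt "$MAX_RETRIES" ]; do
  echo "Attempt $((COUNT+1))..."
  /usr/bin/python3 /scripts/my_task.py && SUCCESS=1 && break
  COUNT=$((COUNT+1))
  # Skip the delay after the final failed attempt
  [ "$COUNT" -lt "$MAX_RETRIES" ] && sleep "$RETRY_DELAY"
done

# Quote the URLs so the shell does not interpret "<", ">" or "?"
if [ "$SUCCESS" -eq 1 ]; then
  curl -fsS "https://ev.okchecker.com/p/<api-key>/backup-db" > /dev/null
else
  curl -fsS "https://ev.okchecker.com/p/<api-key>/backup-db?status=fail" > /dev/null
fi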
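
If several jobs need the same behavior, the retry loop can be factored into a reusable function. The following is a minimal sketch, not part of the original wrapper: the function name retry and its argument order are assumptions made for illustration.

```shell
#!/bin/bash
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# (hypothetical helper) Runs COMMAND until it succeeds or MAX_ATTEMPTS
# attempts have been made. Returns 0 on success, 1 if every attempt failed.
retry() {
  local max_attempts=$1 delay=$2 attempt=1
  shift 2
  while true; do
    echo "Attempt $attempt of $max_attempts..." >&2
    # Run the command exactly as given; stop at the first success
    "$@" && return 0
    # Give up without sleeping once the last attempt has failed
    [ "$attempt" -ge "$max_attempts" ] && return 1
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

With this in place, the loop in the wrapper could be reduced to something like: retry 3 10 /usr/bin/python3 /scripts/my_task.py && SUCCESS=1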

2. Add it to your crontab

Run crontab -e and add:

*/15 * * * * /scripts/task_wrapper.sh

This runs the wrapper every 15 minutes, so make sure the script is executable (chmod +x /scripts/task_wrapper.sh). On success, it sends a ping. If every attempt fails, it sends a failure ping that can trigger an alert.

3. Configure alerts

In your monitoring service's dashboard, set this up as a "cron monitor" and define the expected interval (15 minutes in this example). If no ping arrives within that interval, or a failure ping is received, you'll get notified.
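
To make the "no ping arrives" case concrete, here is a rough sketch of the kind of check a cron monitor performs. It is only an illustration, not the service's actual logic: the function name is_overdue and the grace period are assumptions that do not come from the dashboard described above.

```shell
#!/bin/bash
# is_overdue LAST_PING_EPOCH NOW_EPOCH INTERVAL_SECONDS GRACE_SECONDS
# (hypothetical model) Succeeds, i.e. returns 0, when the time since the
# last ping exceeds the expected interval plus a grace period.
is_overdue() {
  local last_ping=$1 now=$2 interval=$3 grace=$4
  [ $((now - last_ping)) -gt $((interval + grace)) ]
}
```

For a job scheduled every 15 minutes with an assumed 5 minute grace period, a call such as is_overdue "$last_ping" "$(date +%s)" 900 300 would report the job as overdue once more than 20 minutes have passed since the last ping.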

4. Security tips

Store secrets such as the API key in environment variables instead of hardcoding them in the script
Log failures so you can troubleshoot them later
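
Both tips can be sketched in a few lines of shell. The variable names OKCHECKER_API_KEY and LOG_FILE, the default log path, and the helper names ping_url and log_failure are all assumptions made for this example; adapt them to your own setup.

```shell
#!/bin/bash
# OKCHECKER_API_KEY and LOG_FILE are hypothetical names chosen for this sketch.
LOG_FILE="${LOG_FILE:-/tmp/task_wrapper.log}"

# ping_url STATUS
# Builds the ping URL from the environment instead of hardcoding the key.
# STATUS is "ok" or "fail". Aborts with an error if the key is not set.
ping_url() {
  local status=$1
  local url="https://ev.okchecker.com/p/${OKCHECKER_API_KEY:?OKCHECKER_API_KEY is not set}/backup-db"
  if [ "$status" = "fail" ]; then
    url="${url}?status=fail"
  fi
  echo "$url"
}

# log_failure MESSAGE...
# Appends a timestamped line to LOG_FILE for later troubleshooting.
log_failure() {
  printf '%s %s\n' "$(date '+%Y-%m-%dT%H:%M:%S%z')" "$*" >> "$LOG_FILE"
}
```

In the wrapper, the failure branch could then call log_failure "my_task.py failed after $MAX_RETRIES attempts" followed by curl -fsS "$(ping_url fail)" > /dev/null. Keep in mind that cron starts jobs with a minimal environment, so the variable has to be defined in the crontab itself or loaded by the wrapper from a file with restricted permissions.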

5. Alternative: systemd with restart

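If you schedule the job with a systemd timer instead of cron, you can set Restart=on-failure in the service file so that systemd reruns the job when it exits with an error. Add a curl ping to ExecStartPost to integrate with monitoring. For a Type=oneshot service, ExecStartPost runs only after the job has finished successfully, which makes it a natural place for the success ping.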
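
As a rough sketch of that setup, the function below writes a service and a timer unit into a directory of your choice. The unit names my-task.service and my-task.timer, the curl path, the 10 second RestartSec, and the helper name write_units are assumptions for illustration. Restart=on-failure on a oneshot service also needs a reasonably recent systemd (version 244 or later, to the best of my knowledge).

```shell
#!/bin/bash
# write_units DIR
# (hypothetical helper) Writes example my-task.service and my-task.timer
# unit files into DIR so they can be reviewed before installing them.
write_units() {
  local dir=$1
  cat > "$dir/my-task.service" <<'EOF'
[Unit]
Description=Run my_task.py and ping the monitor on success

[Service]
Type=oneshot
ExecStart=/usr/bin/python3 /scripts/my_task.py
# For a oneshot service this runs only after ExecStart exits successfully
ExecStartPost=/usr/bin/curl -fsS -o /dev/null https://ev.okchecker.com/p/<api-key>/backup-db
Restart=on-failure
RestartSec=10
EOF
  cat > "$dir/my-task.timer" <<'EOF'
[Unit]
Description=Run my-task.service every 15 minutes

[Timer]
OnCalendar=*:0/15

[Install]
WantedBy=timers.target
EOF
}
```

After reviewing the generated files, you would typically copy them to /etc/systemd/system, run systemctl daemon-reload, and enable the timer with systemctl enable --now my-task.timer. Unlike the shell wrapper, this sketch does not cap the number of retries and sends no failure ping, so the monitor would raise an alert only when the success ping stops arriving.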

Conclusion

Adding monitoring and retry logic to your cron jobs helps you catch failures early, reduces the risk of data loss, and makes scheduled tasks more reliable.
