How to Monitor Cron Jobs with Retry and Alerts

Make your scheduled tasks reliable with retries and alerting on failure

The problem with cron jobs

When a cron job silently fails, you might not notice until it's too late. A script error, a crash, or a network issue can cause a job to stop working, and cron has no built-in alerting beyond optionally emailing the job's output.

What we’ll configure

  • Monitoring for cron job success
  • Retry logic on failure
  • Alerting if all retries fail

1. Wrap your cron job in a shell script

This wrapper runs the job up to 3 times in total, waiting 10 seconds between attempts, and then reports the result to the monitoring service:

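#!/bin/bash
# Run the task up to MAX_RETRIES times, then report the outcome to the monitor.

MAX_RETRIES=3    # total attempts, including the first one
RETRY_DELAY=10   # seconds to wait between attempts
COUNT=0
SUCCESS=0

while [ "$COUNT" -lt "$MAX_RETRIES" ]; do
  echo "Attempt $((COUNT+1))..."
  /usr/bin/python3 /scripts/my_task.py && SUCCESS=1 && break
  COUNT=$((COUNT+1))
  # Skip the delay after the final failed attempt
  [ "$COUNT" -lt "$MAX_RETRIES" ] && sleep "$RETRY_DELAY"
done

# Quote the URLs so the shell does not treat "<", ">" or "?" as special characters
if [ "$SUCCESS" -eq 1 ]; then
  curl -fsS "https://ev.okchecker.com/p/<api-key>/backup-db" > /dev/null
else
  curl -fsS "https://ev.okchecker.com/p/<api-key>/backup-db?status=fail" > /dev/null
fi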
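
The retry loop can also be factored into a reusable function, so the same logic can wrap any command. The following is a minimal sketch, not part of the original wrapper: the function name retry is hypothetical, and it reads the same MAX_RETRIES and RETRY_DELAY variables, falling back to 3 attempts and 10 seconds.

```shell
#!/bin/bash
# Hypothetical helper: run the given command up to MAX_RETRIES times in total.
# Returns 0 as soon as one attempt succeeds, 1 if every attempt fails.
retry() {
  local max="${MAX_RETRIES:-3}"
  local delay="${RETRY_DELAY:-10}"
  local attempt=1
  while true; do
    "$@" && return 0
    if [ "$attempt" -ge "$max" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"   # wait only between attempts, never after the last one
  done
}
```

With this in place, the loop in the wrapper could be reduced to a single line such as: retry /usr/bin/python3 /scripts/my_task.py && SUCCESS=1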

2. Add it to your crontab

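Save the wrapper as /scripts/task_wrapper.sh and make it executable (chmod +x /scripts/task_wrapper.sh). Then open crontab -e and add: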

*/15 * * * * /scripts/task_wrapper.sh

This runs the wrapper every 15 minutes. On success, it sends a ping. If every attempt fails, it sends a ping with ?status=fail instead, which can trigger an alert.
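
Because the ping itself is a network call, it may help to bound how long it can take so that a slow monitoring endpoint never stalls the job. The sketch below shows one way to do that. The names PING_BASE, ping_target and send_ping are hypothetical; -m (maximum time), --retry and -o are standard curl options, and the URL is the placeholder used in the wrapper.

```shell
#!/bin/bash
# Hypothetical helpers for sending the success or failure ping.
PING_BASE="https://ev.okchecker.com/p/<api-key>/backup-db"

# Print the ping URL for a given status ("ok" or "fail").
ping_target() {
  case "$1" in
    ok)   printf '%s\n' "$PING_BASE" ;;
    fail) printf '%s?status=fail\n' "$PING_BASE" ;;
    *)    echo "unknown status: $1" >&2; return 1 ;;
  esac
}

# Send the ping: give up after 10 seconds per request,
# and let curl retry transient errors up to 3 times.
send_ping() {
  local url
  url="$(ping_target "$1")" || return 1
  curl -fsS -m 10 --retry 3 -o /dev/null "$url"
}
```

The wrapper would then call send_ping ok after a successful run and send_ping fail once all attempts are exhausted.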

3. Configure alerts

In your Monitoring SaaS dashboard, register this job as a "cron monitor" and set the expected interval (15 minutes in this example). You'll be notified if no ping arrives within that interval or if a failure ping is received.

4. Security tips

  • Store secrets such as the API key in environment variables instead of hardcoding them in the script
  • Log failures to troubleshoot issues
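
Both tips can be sketched in a few lines of shell. Everything named here is an assumption for illustration: the variable OKCHECKER_API_KEY, the default log path in LOG_FILE, and the functions ping_url and log_failure do not come from the monitoring service.

```shell
#!/bin/bash
# Hypothetical names throughout: OKCHECKER_API_KEY, LOG_FILE, ping_url, log_failure.
LOG_FILE="${LOG_FILE:-/tmp/task_wrapper.log}"

# Build the ping URL from the environment instead of hardcoding the key.
ping_url() {
  if [ -z "${OKCHECKER_API_KEY:-}" ]; then
    echo "OKCHECKER_API_KEY is not set" >&2
    return 1
  fi
  printf 'https://ev.okchecker.com/p/%s/backup-db\n' "$OKCHECKER_API_KEY"
}

# Append a timestamped failure line to the log file.
log_failure() {
  printf '%s FAIL %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$LOG_FILE"
}
```

Cron starts jobs with a minimal environment, so the variable has to be set somewhere the wrapper can see it, for example in a file readable only by the job's user that the wrapper sources before calling ping_url.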

5. Alternative: systemd with restart

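If you use systemd instead of cron, set Restart=on-failure in your service file so that systemd reruns the job when it exits with an error. Add a curl ping to ExecStartPost to integrate with monitoring. For a Type=oneshot service, ExecStartPost runs only after the main command has exited successfully, so the ping reports success, and a missing ping is what triggers the alert.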
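
As a rough sketch of such a service file, the shell function below prints one to standard output. The function name print_service_unit, the description, and the retry limits are assumptions chosen to mirror the cron wrapper (3 starts, 10 seconds apart). Restart=on-failure on a Type=oneshot service needs a reasonably recent systemd, and scheduling would come from a separate timer unit that is not shown here.

```shell
#!/bin/bash
# Hypothetical helper: print a service unit sketch to stdout.
# Redirect the output to a unit file of your choice to try it out.
print_service_unit() {
  cat <<'EOF'
[Unit]
Description=Run my_task.py with retries and a success ping
# Assumed limits: stop restarting after 3 starts within 5 minutes
StartLimitIntervalSec=300
StartLimitBurst=3

[Service]
Type=oneshot
ExecStart=/usr/bin/python3 /scripts/my_task.py
# Runs only after ExecStart succeeded; the leading "-" ignores a failed ping
ExecStartPost=-/usr/bin/curl -fsS -o /dev/null https://ev.okchecker.com/p/<api-key>/backup-db
Restart=on-failure
RestartSec=10
EOF
}
```

Note that systemd does not interpret shell redirection in ExecStartPost, which is why the sketch uses curl's -o /dev/null rather than > /dev/null.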

Conclusion

Adding monitoring and retry logic to your cron jobs helps prevent data loss and ensures task reliability. Better safe than sorry!
