About Me

Mumbai, Maharashtra, India
He has more than 7.6 years of experience in software development, most of it in web and desktop application development. He has sound knowledge of various database concepts. You can reach him at viki.keshari@gmail.com https://www.linkedin.com/in/vikrammahapatra/ https://twitter.com/VikramMahapatra http://www.facebook.com/viki.keshari

Thursday, October 31, 2024

schedule_interval in Airflow DAGs

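In Apache Airflow, the schedule_interval parameter of a DAG defines how often the DAG runs. From Airflow 2.4 onward, the same values can also be passed through the newer schedule parameter, which supersedes schedule_interval. The main options and formats are: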

1. Preset Schedule Intervals (String aliases)

  • "@once": Schedule the DAG once and only once.
  • "@hourly": Run the DAG once an hour, at the start of the hour.
  • "@daily": Run the DAG once a day at midnight (00:00 UTC).
  • "@weekly": Run the DAG once a week at midnight on Sunday (00:00 UTC).
  • "@monthly": Run the DAG once a month at midnight on the first day of the month.
  • "@quarterly": Run the DAG at midnight on the first day of each quarter (January, April, July, October).
  • "@yearly" or "@annually": Run the DAG once a year at midnight on January 1.
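
For reference, the Airflow documentation lists a cron equivalent for each preset except "@once". The sketch below records that mapping in a plain dictionary so the presets can be compared with the cron expressions in the next section. The names PRESET_TO_CRON and to_cron are illustrative only and are not part of Airflow.

```python
# Cron equivalents of the Airflow presets, as listed in the Airflow docs.
# "@once" has no cron equivalent, so it is left out.
PRESET_TO_CRON = {
    "@hourly": "0 * * * *",
    "@daily": "0 0 * * *",
    "@weekly": "0 0 * * 0",
    "@monthly": "0 0 1 * *",
    "@quarterly": "0 0 1 */3 *",
    "@yearly": "0 0 1 1 *",
}


def to_cron(schedule: str) -> str:
    """Return the cron expression for a preset, or the input unchanged."""
    return PRESET_TO_CRON.get(schedule, schedule)


print(to_cron("@weekly"))
print(to_cron("15 14 * * 1"))
```

A string that is not a known preset is returned as-is, on the assumption that it is already a cron expression.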

2. Cron Expressions (String format)

  • You can use cron syntax to define custom schedules. The format has five space-separated fields: minute, hour, day of month, month, day of week.
    • Examples:
      • "0 9 * * *": Run daily at 9:00 AM UTC.
      • "15 14 * * 1": Run every Monday at 14:15 UTC.
      • "0 0 1 * *": Run at midnight on the first day of each month.
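
To make the field order concrete, here is a minimal sketch that splits a five-field cron expression into named parts. It uses only the Python standard library and does not validate the values inside each field. CRON_FIELDS and split_cron are hypothetical helper names, not Airflow APIs.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(expression: str) -> dict:
    """Split a five-field cron expression into named fields."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}: {expression!r}")
    return dict(zip(CRON_FIELDS, parts))


# The three examples from the list above.
for expr in ("0 9 * * *", "15 14 * * 1", "0 0 1 * *"):
    print(expr, "->", split_cron(expr))
```

For "15 14 * * 1" this gives minute 15, hour 14 and day_of_week 1, which matches "every Monday at 14:15".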

3. TimeDelta (Using datetime.timedelta)

  • Use a timedelta to run the DAG at a fixed interval (minutes, hours, days, and so on) instead of at a specific time of day.
  • Example:

    ```python
    from datetime import timedelta

    schedule_interval = timedelta(hours=6)  # runs every 6 hours
    ```
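
Unlike a cron expression, a timedelta is not tied to clock boundaries such as midnight or the top of the hour; consecutive runs are simply spaced by the interval. The sketch below illustrates this with plain datetime arithmetic and does not call Airflow. The start time and the helper interval_boundaries are assumptions made for illustration, and the actual run times in Airflow also depend on settings such as start_date and catchup.

```python
from datetime import datetime, timedelta


def interval_boundaries(start: datetime, every: timedelta, count: int) -> list:
    """Return `count` times spaced `every` apart, beginning at `start`."""
    return [start + i * every for i in range(count)]


start = datetime(2024, 10, 31, 1, 30)  # hypothetical starting point
for moment in interval_boundaries(start, timedelta(hours=6), 4):
    print(moment.isoformat())
```

Starting from 01:30, the times fall at 01:30, 07:30, 13:30 and 19:30, whereas a cron schedule such as "0 */6 * * *" would fire at 00:00, 06:00, 12:00 and 18:00.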

4. None

  • Setting schedule_interval=None means the DAG will only run if manually triggered.
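
To show where the parameter goes, here is a minimal sketch of a manually triggered DAG. It assumes a recent Airflow 2.x installation (schedule_interval is not accepted by Airflow 3); the dag_id and task_id are placeholders chosen for this example.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator

# Hypothetical DAG; dag_id and task_id are placeholders.
with DAG(
    dag_id="manual_only_example",
    start_date=datetime(2024, 1, 1),
    schedule_interval=None,  # no automatic runs; trigger from the UI or CLI
    catchup=False,
) as dag:
    EmptyOperator(task_id="placeholder_task")
```

Any of the other forms described above, such as "@daily", "0 9 * * *" or timedelta(hours=6), can be passed in place of None.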

Post Reference: Vikram Aristocratic Elfin