I am trying to understand how the poke_interval and timeout parameters of Airflow Sensors work with the schedule_interval of the DAGs they are assigned to.
Let me take an example.
I have a WasbPrefixSensor, which I use to check for the existence of blobs with a given prefix in a Blob Storage Account. I need to check for the existence of these files every day.
Now, say I set the poke_interval of this sensor to 60 and the timeout to 60*60*24. I will then set schedule_interval=@daily on the DAG this task is assigned to. Let's say that the start_time is 9 AM.
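To make the numbers concrete, here is a small plain-Python sketch (no Airflow involved) of the timeline I am picturing. It only encodes my own assumptions about how the timeout and the daily schedule line up, which may well be wrong; the date and the variable names are made up for illustration.

```python
from datetime import datetime, timedelta

POKE_INTERVAL = 60        # seconds between checks, as set on the sensor
TIMEOUT = 60 * 60 * 24    # seconds before the sensor gives up (24 hours)

# Hypothetical start of one daily run; the calendar date is arbitrary.
run_start = datetime(2021, 1, 1, 9, 0)

# Assumption: the sensor times out TIMEOUT seconds after it starts poking.
timeout_at = run_start + timedelta(seconds=TIMEOUT)

# Assumption: with @daily, the next run is due one day after this one.
next_run_start = run_start + timedelta(days=1)

# Roughly how many pokes fit in the timeout window if no blob ever appears.
approx_pokes = TIMEOUT // POKE_INTERVAL

print("sensor would time out at:", timeout_at)
print("next run would be due at:", next_run_start)
print("approximate number of pokes:", approx_pokes)
```

Under these assumptions the timeout of one run falls exactly when the next run is due, which is the overlap my questions are about.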
Does this mean that, starting at 9 AM, my pipeline will check for the existence of a file with the given prefix every minute until it times out at 9 AM the next day? Then, will the pipeline start back up again as soon as it ends?
What happens if I have several files with the same prefix at a given moment? Will the pipeline run for each of these files?
Further, say that the pipeline runs for a given file. Does this mean that it will stop poking from that point forward? I am thinking of the situation where I have a file with this prefix at 10 AM, for example, but another file with the same prefix is loaded at 11 AM. Will the pipeline run for both of these files or just the first one?
I have already read the answers given here, "Confused about Airflow's BaseSensorOperator parameters : timeout, poke_interval and mode", but they do not say much about how the parameters of the Sensor work with schedule_interval.