Discovery and execution lifecycle

The lifecycle starts when an operator registers a source and ends with an execution record that contains the manifest, backend status, and provenance.

Discovery

register source
      |
      v
mark for discovery
      |
      v
scheduler_tick
      |
      v
discover_batch
      |
      v
TAP queries -> field map -> metadata rows
                         |
                         v
                discovery signature
                         |
                         v
                workflow_run_pending

Control	Source
Query templates	`project_config.discovery.queries`
Enrichment queries	`project_config.discovery.enrichments`
Field mapping	`project_config.discovery.prepare_metadata.field_map`
Signature fields	`project_config.discovery.prepare_metadata.signature`
Batch limits	`automation.discovery` and `BEAMPIPE_SHAPING_*`

Discovery signatures determine whether prepared metadata changed enough to trigger future work. Exclude volatile archive fields such as access URLs, file sizes, and timestamps when they should not cause reruns.
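The effect of excluding volatile fields can be illustrated with a minimal sketch. It assumes the signature is a hash over a configured list of field names; the function name `discovery_signature`, the row keys, and the choice of SHA-256 are hypothetical and are not taken from Beampipe itself.

```python
import hashlib
import json


def discovery_signature(row: dict, signature_fields: list[str]) -> str:
    """Hash only the listed fields so that other values cannot change the digest.

    Hypothetical helper: in a real project the field list would come from
    project_config.discovery.prepare_metadata.signature.
    """
    stable = {name: row.get(name) for name in signature_fields}
    # sort_keys gives a canonical encoding, so field order does not matter.
    payload = json.dumps(stable, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Illustrative metadata rows (all keys and values are made up).
fields = ["obs_id", "band"]
before = {"obs_id": "obs-001", "band": "L",
          "access_url": "https://archive.example/a", "size_bytes": 1024}
relinked = {**before, "access_url": "https://archive.example/b", "size_bytes": 2048}
reobserved = {**before, "band": "S"}

same = discovery_signature(before, fields) == discovery_signature(relinked, fields)
differs = discovery_signature(before, fields) != discovery_signature(reobserved, fields)
```

Under these assumptions, a changed access URL or file size leaves the signature untouched, while a change to a listed field produces a new signature and would mark the source for future work.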

Execution

pending source(s)
      |
      v
create execution
      |
      v
build manifest -> DALiuGE Graphs -> stage / translate / submit
      |                |                              |
      v                v                              v
ledger row       prepared graph                  backend run
      |                                               |
      v                                               v
poll tick <---------- REST DIM or Slurm status -------+
      |
      v
completed / failed / cancelled
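The poll step at the bottom of the diagram can be sketched as a small state mapping. The Slurm job state names used here are real Slurm states, but the ledger state names, the mapping itself, and the `poll_tick` signature are assumptions made for illustration, not Beampipe's actual implementation.

```python
from enum import Enum


class LedgerState(str, Enum):
    # Hypothetical ledger states; only the three terminal names appear in the diagram.
    ACTIVE = "active"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


TERMINAL = {LedgerState.COMPLETED, LedgerState.FAILED, LedgerState.CANCELLED}

# Assumed mapping from Slurm job states to ledger states.
SLURM_TO_LEDGER = {
    "PENDING": LedgerState.ACTIVE,
    "RUNNING": LedgerState.ACTIVE,
    "COMPLETED": LedgerState.COMPLETED,
    "FAILED": LedgerState.FAILED,
    "TIMEOUT": LedgerState.FAILED,
    "NODE_FAIL": LedgerState.FAILED,
    "CANCELLED": LedgerState.CANCELLED,
}


def poll_tick(current: LedgerState, slurm_state: str) -> LedgerState:
    """Return the next ledger state for one poll of the backend."""
    if current in TERMINAL:
        return current  # terminal states never change again
    parts = slurm_state.split()
    # sacct can report values such as "CANCELLED by 1000" or "CANCELLED+".
    token = parts[0].rstrip("+").upper() if parts else ""
    # Unknown or empty statuses leave the state unchanged until the next tick.
    return SLURM_TO_LEDGER.get(token, current)
```

Treating unknown statuses as "no change" keeps a transient backend hiccup from moving a run into a terminal state; a REST DIM backend would need its own mapping table.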

Control	Source
Grouping	`manifest.group_by`
Manifest shape	`manifest.source_template`, `dataset_template`, `path`
Graph mutation	`graph_patches` YAML, documented in DALiuGE Graphs
Backend selection	`deployment_profile_name` or project default
Admission	`automation.execution` and execution shaping variables
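As a rough illustration of the grouping control, the sketch below partitions pending sources into one bucket per execution. It assumes `manifest.group_by` is a list of metadata field names; that shape, the function name `group_pending`, and the sample rows are hypothetical.

```python
from collections import defaultdict


def group_pending(sources: list[dict], group_by: list[str]) -> dict[tuple, list[dict]]:
    """Bucket pending sources by the values of the group_by fields.

    Hypothetical helper: each bucket would become one execution with its own manifest.
    """
    groups: dict[tuple, list[dict]] = defaultdict(list)
    for source in sources:
        # A missing field maps to None, so incomplete rows still land in a bucket.
        key = tuple(source.get(field) for field in group_by)
        groups[key].append(source)
    return dict(groups)


# Illustrative pending sources (all values are made up).
pending = [
    {"source_id": "s1", "project": "p1", "band": "L"},
    {"source_id": "s2", "project": "p1", "band": "L"},
    {"source_id": "s3", "project": "p1", "band": "S"},
]
buckets = group_pending(pending, ["project", "band"])
```

With these inputs, `s1` and `s2` share a bucket and `s3` gets its own, which matches the "pending source(s)" to "create execution" step in the diagram.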

Operator model

Beampipe separates authoritative state, backend detail, and audit narrative:

Layer	Storage	Use when
Execution ledger	`batch_execution_record`	FSM truth, list/filter runs, cancel
Run record	`workflow_manifest.beampipe_run_record`	Backend integration detail, poll counters, raw excerpts
Provenance	`provenance_events`	Operator timeline: discovery changes, execution transitions, alerts
Metrics	`beampipe_*` on `:9090`	Dashboards and alert thresholds

To debug a stuck run, check in this order: readiness, metrics, provenance events, and finally `beampipe_run_record` in the execution response.

Next: choose backend behavior in Deployment profiles.