apache_beam.runners.interactive.recording_manager module
- class apache_beam.runners.interactive.recording_manager.ElementStream(pcoll: PCollection, var: str, cache_key: str, max_n: int, max_duration_secs: float)[source]
Bases:
object
A stream of elements from a given PCollection.
- property pcoll: PCollection
Returns the PCollection that supplies this stream with data.
- class apache_beam.runners.interactive.recording_manager.Recording(user_pipeline: Pipeline, pcolls: List[PCollection], result: beam.runner.PipelineResult, max_n: int, max_duration_secs: float)[source]
Bases:
object
A group of PCollections from a given pipeline run.
- stream(pcoll: PCollection) ElementStream [source]
Returns an ElementStream for a given PCollection.
- class apache_beam.runners.interactive.recording_manager.RecordingManager(user_pipeline: Pipeline, pipeline_var: str = None, test_limiters: List[Limiter] = None)[source]
Bases:
object
Manages recordings of PCollections for a given pipeline.
- record_pipeline() bool [source]
Starts a background caching job for this RecordingManager’s pipeline.
- record(pcolls: List[PCollection], *, max_n: int, max_duration: int | str, runner: PipelineRunner | None = None, options: PipelineOptions | None = None, force_compute: bool = False) Recording [source]
Records the given PCollections.
- read(pcoll_name: str, pcoll: PValue, max_n: int, max_duration_secs: float) None | ElementStream [source]
Reads an ElementStream of a computed PCollection.
Returns None if an error occurs. The caller is responsible of validating if the given pcoll_name and pcoll can identify a watched and computed PCollection without ambiguity in the notebook.