The managed I/O connector is an Apache Beam transform that provides a common API for creating sources and sinks. On the backend, Dataflow treats the managed I/O connector as a service, which allows Dataflow to manage runtime operations for the connector. You can then focus on the business logic in your pipeline, rather than managing these details.
You create the managed I/O connector using Apache Beam code, just like any other I/O connector. You specify a source or sink to instantiate and pass in a set of configuration parameters. For example, the Apache Iceberg sink requires a catalog_name parameter.
The following example shows how to create the Apache Iceberg sink by passing in a map of configuration parameters:
Java
pipeline.apply(
    Managed.write(ICEBERG)
        .withConfig(ImmutableMap.<String, Object>builder()
            .put("catalog_name", "<catalog_name>")
            .put("warehouse_location", "<warehouse_location>")
            .build()));
You can also put the configuration parameters into a YAML file and provide a URL to the file:
Java
pipeline.apply(
    Managed.write(ICEBERG)
        .withConfigUrl("<config_url>"));
For more information, see the Managed class in the Apache Beam GitHub repository.