You have a Google Cloud Dataflow streaming pipeline running with a Google Cloud Pub/Sub subscription as the source. You need to make an update to the code that will make the new Cloud Dataflow pipeline incompatible with the current version. You do not want to lose any data when making this update. What should you do?
A. Update the current pipeline and use the drain flag.
B. Update the current pipeline and provide the transform mapping JSON object.
C. Create a new pipeline that has the same Cloud Pub/Sub subscription and cancel the old pipeline.
D. Create a new pipeline that has a new Cloud Pub/Sub subscription and cancel the old pipeline.
Canceling a pipeline will result in data loss, so C and D are eliminated. There is no drain flag, so A is eliminated. Answer is B.
It should be C. You don't want to lose the data, so using the same subscription lets the new pipeline pick up the unprocessed messages.
A and B are wrong.
You cannot replace a pipeline with an incompatible one because the compatibility check fails.
I would say C.
Not sure between C and D. Can we use the same subscription?
Yes, we can update an existing subscription to send messages to the new pipeline (if it is a push subscription).
So my answer is C.
Answer is B. You can update the pipeline: submit a new job with the same job name and pass the transform mapping JSON object as a parameter to the new job. Dataflow will take care of draining the first job and starting the updated one.
https://cloud.google.com/dataflow/docs/guides/updating-a-pipeline
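For reference, an in-place update is submitted by re-running the pipeline with the same job name plus the `--update` and `--transformNameMapping` options. A minimal sketch (the main class, project, job name, and transform names below are placeholders, not from the question):

```shell
# Hypothetical sketch: re-submit a Beam (Java) pipeline to Dataflow as an
# update of a running job. --update and --transformNameMapping are real
# Dataflow pipeline options; all other values are placeholders.
mvn compile exec:java \
  -Dexec.mainClass=com.example.MyPipeline \
  -Dexec.args="--runner=DataflowRunner \
    --project=my-project \
    --region=us-central1 \
    --jobName=my-streaming-job \
    --update \
    --transformNameMapping={\"OldTransformName\":\"NewTransformName\"}"
```

Note that Dataflow still runs a compatibility check on the updated job; the transform mapping only resolves renamed transforms, not arbitrary incompatibilities.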
I would say C, reading Arun's link. Why would you want to change the source settings in the new job?
Ans should be B
A
The main requirement is "You do not want to lose any data when making this update", so you have to drain the current pipeline before starting the new one.
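Draining can be requested from the CLI before launching the replacement job. A minimal sketch (the job ID and region are placeholders):

```shell
# Hypothetical sketch: drain the running streaming job so in-flight data is
# processed before the job stops. JOB_ID and --region are placeholders.
gcloud dataflow jobs drain JOB_ID --region=us-central1
```

Drain stops pulling new input but finishes processing buffered data, which is what avoids data loss when the replacement pipeline is incompatible.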
B
We cannot update the pipeline when it is incompatible with the previous pipeline; the compatibility check fails. See the link below.
https://cloud.google.com/dataflow/docs/guides/updating-a-pipeline