
What should you do?

08/07/2019 – by Mod_GuideK

You have an Apache Kafka Cluster on-prem with topics containing web application logs. You need to replicate the data to Google Cloud for analysis in BigQuery and Cloud Storage. The preferred replication method is mirroring to avoid deployment of Kafka Connect plugins.
What should you do?
A. Deploy a Kafka cluster on GCE VM Instances. Configure your on-prem cluster to mirror your topics to the cluster running in GCE. Use a Dataproc cluster or Dataflow job to read from Kafka and write to GCS.
B. Deploy a Kafka cluster on GCE VM Instances with the PubSub Kafka connector configured as a Sink connector. Use a Dataproc cluster or Dataflow job to read from Kafka and write to GCS.
C. Deploy the PubSub Kafka connector to your on-prem Kafka cluster and configure PubSub as a Source connector. Use a Dataflow job to read from PubSub and write to GCS.
D. Deploy the PubSub Kafka connector to your on-prem Kafka cluster and configure PubSub as a Sink connector. Use a Dataflow job to read from PubSub and write to GCS.



7 thoughts on “What should you do?”

Arek says:

02/12/2022 at 6:42 PM

D

https://github.com/GoogleCloudPlatform/pubsub/tree/master/kafka-connector

The CloudPubSubConnector is a connector to be used with Kafka Connect to publish messages from Kafka to Google Cloud Pub/Sub or Pub/Sub Lite and vice versa.

CloudPubSubSinkConnector provides a sink connector to copy messages from Kafka to Google Cloud Pub/Sub. CloudPubSubSourceConnector provides a source connector to copy messages from Google Cloud Pub/Sub to Kafka. PubSubLiteSinkConnector provides a sink connector to copy messages from Kafka to Pub/Sub Lite. PubSubLiteSourceConnector provides a source connector to copy messages from Pub/Sub Lite to Kafka.
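To make the sink direction concrete, here is a minimal sketch that renders a Kafka Connect properties file for CloudPubSubSinkConnector. The property keys follow the connector's README as far as I recall, so check them against the repository linked above. The Kafka topic `web-logs`, the project `my-gcp-project`, and the Pub/Sub topic `web-logs-topic` are hypothetical names used only for illustration.

```python
def render_properties(settings: dict[str, str]) -> str:
    """Render a dict as Java-style .properties text, one key=value per line."""
    return "\n".join(f"{key}={value}" for key, value in settings.items())


# Assumed values: every name on the right-hand side below is a placeholder.
sink_settings = {
    "name": "pubsub-sink",
    # Sink connector class: copies messages from Kafka to Cloud Pub/Sub.
    "connector.class": "com.google.pubsub.kafka.sink.CloudPubSubSinkConnector",
    "tasks.max": "1",
    # Kafka topic(s) to read from (hypothetical).
    "topics": "web-logs",
    # Destination project and Pub/Sub topic (hypothetical).
    "cps.project": "my-gcp-project",
    "cps.topic": "web-logs-topic",
}

print(render_properties(sink_settings))
```

Note that a file like this is loaded by a Kafka Connect worker, which is exactly the deployment the question says it would prefer to avoid.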

Ricky says:

03/18/2020 at 6:25 AM

I think the answer should be A. I checked the Kafka PubSub connector: it requires Kafka Connect, and the question explicitly says that the company wants to avoid deploying Kafka Connect plugins.

DBT says:

12/19/2019 at 6:11 AM

A
It says “The preferred replication method is mirroring to avoid deployment of Kafka Connect plugins.” The key points are to mirror and to avoid plugins.

1. Amal says:
  
  02/03/2020 at 12:01 AM
  
  Agreed, the answer is A. Kafka mirroring is a simple way to replicate a Kafka cluster. After replication, you can use Dataproc or Dataflow to copy the data to GCS. All the other options use the Kafka Connect plugin.
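
  As a rough sketch of what mirroring involves, the snippet below assembles a command line for the legacy MirrorMaker tool that ships with Kafka. It assumes the classic `kafka-mirror-maker.sh` script and its `--consumer.config`, `--producer.config`, and `--whitelist` options (newer Kafka releases may name the topic filter differently). The host names `onprem-kafka:9092` and `gce-kafka:9092`, the group id, and the topic pattern are hypothetical.

```python
import shlex


def mirror_maker_command(consumer_config: str, producer_config: str,
                         topic_pattern: str) -> list[str]:
    """Build the argument list for the legacy MirrorMaker script."""
    return [
        "bin/kafka-mirror-maker.sh",
        "--consumer.config", consumer_config,   # points at the source cluster
        "--producer.config", producer_config,   # points at the target cluster
        "--whitelist", topic_pattern,           # regex of topics to mirror
    ]


# Contents of the two config files (hypothetical hosts and group id).
consumer_properties = "bootstrap.servers=onprem-kafka:9092\ngroup.id=gce-mirror\n"
producer_properties = "bootstrap.servers=gce-kafka:9092\n"

command = mirror_maker_command("consumer.properties", "producer.properties",
                               "web-logs.*")
print(shlex.join(command))
```

  The consumer side reads from the on-prem cluster and the producer side writes to the cluster on GCE, so no Kafka Connect worker or connector plugin has to be installed for this step.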
  
Arun says:

12/15/2019 at 2:30 PM

The answer is D, since Pubsub will act as the sink that receives messages from Kafka. More on this here:
https://github.com/GoogleCloudPlatform/pubsub/tree/master/kafka-connector

vicThor says:

10/19/2019 at 9:34 AM

Correction to the first line of my earlier comment: it is C or D. With respect to A and B, I think that if we want to replicate between clusters we’d need the Confluent plugin. (Not sure of this.)

vicThor says:

10/19/2019 at 9:29 AM

The connector itself allows two-way event traffic, Kafka to PubSub and PubSub to Kafka. So this is B or C. Now:

CloudPubSubConnector provides both a sink connector (to copy messages from Kafka to Cloud Pub/Sub) and a source connector (to copy messages from Cloud Pub/Sub to Kafka).

We are in the first case, so I’d say it is D.

Check this: cloud.google.com/blog/products/gcp/apache-kafka-for-gcp-users-connectors-for-pubsub-dataflow-and-bigquery

