What should you do?

You have an Apache Kafka Cluster on-prem with topics containing web application logs. You need to replicate the data to Google Cloud for analysis in BigQuery and Cloud Storage. The preferred replication method is mirroring to avoid deployment of Kafka Connect plugins.
What should you do?
A. Deploy a Kafka cluster on GCE VM Instances. Configure your on-prem cluster to mirror your topics to the cluster running in GCE. Use a Dataproc cluster or Dataflow job to read from Kafka and write to GCS.
B. Deploy a Kafka cluster on GCE VM Instances with the PubSub Kafka connector configured as a Sink connector. Use a Dataproc cluster or Dataflow job to read from Kafka and write to GCS.
C. Deploy the PubSub Kafka connector to your on-prem Kafka cluster and configure PubSub as a Source connector. Use a Dataflow job to read from PubSub and write to GCS.
D. Deploy the PubSub Kafka connector to your on-prem Kafka cluster and configure PubSub as a Sink connector. Use a Dataflow job to read from PubSub and write to GCS.

7 thoughts on “What should you do?”

  1. D

    https://github.com/GoogleCloudPlatform/pubsub/tree/master/kafka-connector

    The CloudPubSubConnector is a connector to be used with Kafka Connect to publish messages from Kafka to Google Cloud Pub/Sub or Pub/Sub Lite and vice versa.

    CloudPubSubSinkConnector provides a sink connector to copy messages from Kafka to Google Cloud Pub/Sub. CloudPubSubSourceConnector provides a source connector to copy messages from Google Cloud Pub/Sub to Kafka. PubSubLiteSinkConnector provides a sink connector to copy messages from Kafka to Pub/Sub Lite. PubSubLiteSourceConnector provides a source connector to copy messages from Pub/Sub Lite to Kafka.

  2. I think the answer should be A. Checked the Kafka PubSub connector, it requires Kafka Connect, and the question explicitly said that the company does not want to deploy Kafka Connect.

  3. A
    It says “The preferred replication method is mirroring to avoid deployment of Kafka Connect plugins.” The keywords are mirroring and avoiding plugins.

    1. Agreed, answer is A. Kafka mirroring is a simple way to replicate a Kafka cluster. After replication, you can use Dataproc or Dataflow to copy data to GCS. All other solutions use a Kafka Connect plugin.

  4. Correction to the first line: it is C or D. With respect to A and B, I think that if we want to replicate between clusters we’d need the Confluent plugin. (Not sure of this.)

  5. The connector itself allows two-way event traffic, Kafka to PubSub and PubSub to Kafka. This is B or C. Now:

    CloudPubSubConnector provides both a sink connector (to copy messages from Kafka to Cloud Pub/Sub) and a source connector (to copy messages from Cloud Pub/Sub to Kafka).

    We are in the first case, so I’d say it is D

    check this cloud.google.com/blog/products/gcp/apache-kafka-for-gcp-users-connectors-for-pubsub-dataflow-and-bigquery
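
To make the two approaches argued over in this thread concrete, here is a rough Python sketch that renders the configuration each one would need: consumer and producer settings for mirroring (option A), and a Kafka Connect sink connector definition (option D). The broker addresses, consumer group, topic names, and project ID are all made-up placeholders, and `render_properties` is a hypothetical helper, not part of Kafka or the connector. The connector class name is the one from the kafka-connector repository linked in comment 1.

```python
# Sketch only: every host name, topic name, group ID, and project ID below
# is a placeholder assumption, not a value taken from the question.


def render_properties(settings: dict[str, str]) -> str:
    """Render a dict as Java-style .properties text, one key=value per line."""
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


# Option A (mirroring): the mirror process consumes from the on-prem cluster
# and produces to the cluster running on GCE. No Kafka Connect plugin involved.
mirror_consumer = {
    "bootstrap.servers": "onprem-kafka:9092",  # assumed on-prem broker
    "group.id": "weblog-mirror",               # assumed consumer group
}
mirror_producer = {
    "bootstrap.servers": "gce-kafka:9092",     # assumed GCE broker
}

# Option D (connector): a sink connector copies messages from a Kafka topic
# to a Cloud Pub/Sub topic. This one does require Kafka Connect.
pubsub_sink = {
    "name": "CPSSinkConnector",
    "connector.class": "com.google.pubsub.kafka.sink.CloudPubSubSinkConnector",
    "tasks.max": "1",
    "topics": "web-logs",         # assumed Kafka topic
    "cps.project": "my-project",  # assumed GCP project ID
    "cps.topic": "web-logs",      # assumed Pub/Sub topic
}

consumer_text = render_properties(mirror_consumer)
producer_text = render_properties(mirror_producer)
sink_text = render_properties(pubsub_sink)
```

With the legacy MirrorMaker tool shipped with older Kafka releases, the first two files would be passed as `--consumer.config` and `--producer.config` to `kafka-mirror-maker.sh`; the third would be loaded by a Kafka Connect worker. That difference is the one the question's wording about avoiding Kafka Connect plugins turns on.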
