You are building a data pipeline on Google Cloud. You need to prepare data using a casual method for a machine-learning process. You want to support a logistic regression model. You also need to monitor and adjust for null values, which must remain real-valued and cannot be removed. What should you do?
A. Use Cloud Dataprep to find null values in sample source data. Convert all nulls to ‘none’ using a Cloud Dataproc job.
B. Use Cloud Dataprep to find null values in sample source data. Convert all nulls to 0 using a Cloud Dataprep job.
C. Use Cloud Dataflow to find null values in sample source data. Convert all nulls to ‘none’ using a Cloud Dataprep job.
D. Use Cloud Dataflow to find null values in sample source data. Convert all nulls to 0 using a custom script.
Agreed, it should be B.
Null, missing, and 0 values are the same. A value can be removed if the script filters on NULL, missing, or zero, hence replacing the required value with None will ensure it is not removed.
Looks like C is correct, as the question says the values cannot be removed. If you have None, it cannot be removed.
Dataprep is the tool. A or B.
Since the values need to remain real-valued, they cannot be null, N/A, or empty. They have to be 0, so it has to be B.
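To make the argument for B concrete, here is a minimal plain-Python sketch of the difference between the two replacements. It is not Dataprep code. The helper names `impute_nulls_with_zero` and `is_real_valued` and the sample column are made up for illustration only.

```python
from typing import List, Optional, Sequence


def impute_nulls_with_zero(values: Sequence[Optional[float]]) -> List[float]:
    """Replace each null (None) with 0.0; no rows are dropped."""
    return [0.0 if v is None else float(v) for v in values]


def is_real_valued(values: Sequence[object]) -> bool:
    """True only if every entry is a float, i.e. usable as a numeric feature."""
    return all(isinstance(v, float) for v in values)


# Hypothetical feature column with two nulls.
raw = [1.5, None, 3.0, None]

# Option B style: nulls become 0, the column stays numeric.
with_zero = impute_nulls_with_zero(raw)

# Option A/C style: nulls become the string 'none', the column is now mixed-type.
with_none_string = ["none" if v is None else v for v in raw]

print(with_zero, is_real_valued(with_zero))
print(with_none_string, is_real_valued(with_none_string))
```

Both replacements keep every row, but only the 0 replacement leaves a column that a logistic regression can consume as a numeric feature.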