What should you do?

08/07/2019 – by Mod_GuideK 4

You work for a bank. You have a labelled dataset that contains information on loan applications that have already been granted and whether those loans have defaulted. You have been asked to train a model to predict default rates for credit applicants.
What should you do?
A. Increase the size of the dataset by collecting additional data.
B. Train a linear regression to predict a credit default risk score.
C. Remove the bias from the data and collect applications that have been declined loans.
D. Match loan applicants with their social profiles to enable feature engineering.

4 thoughts on “What should you do?”

TL says:

02/05/2020 at 3:23 AM

Answer should be D

B is a classic mistake for a junior data scientist, because default is a binary outcome:

https://thestatsgeek.com/2015/01/17/why-shouldnt-i-use-linear-regression-if-my-outcome-is-binary/

3

TL says:

02/05/2020 at 3:22 AM

Answer should be D

B is a classic mistake for a junior data scientist, because default is a binary outcome:

https://medium.com/analytics-vidhya/insiders-view-on-logistic-regression-and-how-do-we-deploy-regression-model-in-gcp-as-batch-c62a64563210

2

Amal says:

02/02/2020 at 9:10 PM

Answer is B.
You have to work with the data you have, so A and D are incorrect.
C is an optimization, not a solution.
You can use a linear regression model to predict a probability score that a loan will default (go unpaid), using the historical data.
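
To make the disagreement in this thread concrete, here is a minimal sketch in plain Python that fits both a linear regression and a logistic regression to a binary default label. The data are hypothetical toy values (a made-up debt-to-income ratio and a made-up default flag), and the function names are illustrative only; none of this comes from the exam question. It is meant only to show how the two models behave when asked for a "probability score", not to settle which option is correct.

```python
import math

# Hypothetical toy data (assumption): debt-to-income ratio and a 0/1 default flag.
ratios = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
defaulted = [0, 0, 0, 0, 1, 0, 1, 1]


def fit_linear(xs, ys):
    """Ordinary least squares for y = a + b * x with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.5, steps=5000):
    """Logistic regression fitted by plain gradient descent on the mean log-loss."""
    a = b = 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(a + b * x)
            grad_a += p - y
            grad_b += (p - y) * x
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def predict_linear(params, x):
    a, b = params
    return a + b * x


def predict_logistic(params, x):
    a, b = params
    return sigmoid(a + b * x)


linear_params = fit_linear(ratios, defaulted)
logistic_params = fit_logistic(ratios, defaulted)

# Compare the two scores inside and outside the range of the training data.
for x in (0.0, 0.5, 1.5):
    print(x, predict_linear(linear_params, x), predict_logistic(logistic_params, x))
```

On this toy data the linear fit returns a negative score at a ratio of 0.0 and a score above 1 at a ratio of 1.5, so its output cannot be read directly as a probability. The logistic fit always stays strictly between 0 and 1.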

5

vicThor says:

10/18/2019 at 12:39 PM

A and B don't make sense.
C doesn't either: what would we do with the declined applications? 🙂

D looks correct, since it would give us new features for the model.
