What should you do?

You’re training a model to predict housing prices from a dataset of real estate properties. You plan to train a fully connected neural network, and you’ve discovered that the dataset contains the latitude and longitude of each property. Real estate professionals have told you that a property’s location strongly influences its price, so you’d like to engineer a feature that captures this dependency.
What should you do?
A. Provide latitude and longitude as input vectors to your neural net.
B. Create a numeric column from a feature cross of latitude and longitude.
C. Create a feature cross of latitude and longitude, bucketize it at the minute level, and use L1 regularization during optimization.
D. Create a feature cross of latitude and longitude, bucketize it at the minute level, and use L2 regularization during optimization.


5 thoughts on “What should you do?”

  1. My answer is B. A simple feature cross would do.
    We don’t have evidence that some houses on a given city block are priced low while others are priced high, so why do regularization at all?

  2. C.
    Use L1 regularization when you need to assign greater importance to the more influential features: it shrinks the weights of less important features to exactly 0.
    L2 regularization performs better when all input features influence the output and all the weights are of roughly equal size. (See the first sketch after this thread for a concrete comparison.)

  3. This is covered in the Machine Learning Crash Course: developers.google.com/machine-learning/crash-course/ml-intro.

    A and B don’t make sense.

    Now, the crossed feature represents a well-defined city block. If the model learns that certain city blocks (within a range of latitudes and longitudes) are more likely to be expensive than others, that is a stronger signal than the two features considered individually. BUT we’ll have far too many dimensions (and that is bad). Would L2 regularization accomplish this task? Unfortunately not: L2 regularization encourages weights to be small, but doesn’t force them to exactly 0.0.
    However, there is a regularization term called L1 regularization that serves as an approximation to L0 but has the advantage of being convex and thus efficient to compute. So we can use L1 regularization to encourage many of the uninformative coefficients in our model to be exactly 0, and thus reap RAM savings at inference time (see the second sketch after this thread).

    Ergo, I’d say it’s C.
    Also check this: developers.google.com/machine-learning/crash-course/regularization-for-sparsity/l1-regularization

  4. A raw numeric feature does not seem to be a good representation of this data in a neural network: a larger latitude or longitude does not necessarily mean a better location. For this purpose, bucketization seems a better solution.
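
To make the L1-versus-L2 contrast in comment 2 concrete, here is a minimal sketch using scikit-learn’s Lasso (L1) and Ridge (L2) on synthetic data. The feature counts, coefficient values, and alpha are illustrative assumptions, not part of the original question.

```python
# Minimal sketch (editorial addition): L1 (Lasso) vs. L2 (Ridge) on synthetic
# data where only 3 of 20 features actually matter. All numbers here are
# illustrative assumptions.
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
# Only the first three features drive the target; the other 17 are noise.
y = 4.0 * X[:, 0] - 2.0 * X[:, 1] + 1.5 * X[:, 2] + rng.normal(scale=0.1, size=200)

l1 = Lasso(alpha=0.1).fit(X, y)
l2 = Ridge(alpha=0.1).fit(X, y)

# L1 drives uninformative weights to exactly 0; L2 only shrinks them.
print("L1 coefficients at exactly zero:", int((l1.coef_ == 0.0).sum()))  # typically 17
print("L2 coefficients at exactly zero:", int((l2.coef_ == 0.0).sum()))  # typically 0
```

This is the behavior comment 3 relies on: with a high-dimensional feature cross, L1 can zero out the buckets that carry no signal.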
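
Comment 3’s crossed feature can also be sketched directly: bucketize latitude and longitude at the minute level (1/60 of a degree) and combine the two bucket indices into a single categorical id. The sample coordinates, the hashing trick, and the bucket count below are assumptions made for illustration.

```python
# Minimal sketch (editorial addition) of a minute-level feature cross of
# latitude and longitude. Coordinates and the hash-bucket count are
# illustrative assumptions.
import math

def lat_lon_cross(lat, lon, num_hash_buckets=10_000):
    """Hashed bucket id for a (lat, lon) pair crossed at minute resolution."""
    lat_bucket = math.floor(lat * 60)  # one bucket per minute of latitude
    lon_bucket = math.floor(lon * 60)  # one bucket per minute of longitude
    # The cross treats each (lat_bucket, lon_bucket) pair as one category;
    # hashing caps the dimensionality of the resulting one-hot input.
    return hash((lat_bucket, lon_bucket)) % num_hash_buckets

# Two properties on the same block share a crossed bucket...
print(lat_lon_cross(37.4221, -122.0841) == lat_lon_cross(37.4223, -122.0843))  # True
# ...while a property a couple of miles away falls in a different bucket.
print(lat_lon_cross(37.4221, -122.0841) == lat_lon_cross(37.3861, -122.0839))  # False (barring a hash collision)
```

Each crossed id would then reach the network as a one-hot vector (or an embedding lookup), and the L1 penalty from the previous sketch is what prunes the crossed buckets that turn out to be uninformative.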
