What should you do?

You’re training a model to predict housing prices from a dataset of real estate properties. You plan to train a fully connected neural network, and you’ve discovered that the dataset contains the latitude and longitude of each property. Real estate professionals have told you that a property’s location strongly influences its price, so you’d like to engineer a feature that captures this dependency.
What should you do?
A. Provide latitude and longitude as input vectors to your neural net.
B. Create a numeric column from a feature cross of latitude and longitude.
C. Create a feature cross of latitude and longitude, bucketize it at the minute level, and use L1 regularization during optimization.
D. Create a feature cross of latitude and longitude, bucketize it at the minute level, and use L2 regularization during optimization.


5 thoughts on “What should you do?”

  1. My answer is B. A simple feature cross would do.
    We don’t have evidence that some houses on a given city block are priced low while others are priced high, so why do regularization at all?

  2. C.
    Use L1 regularization when you need to assign greater importance to the more influential features: it shrinks the weights of less important features to exactly 0.
    L2 regularization performs better when all input features influence the output and all the weights are of roughly equal size. (See the first sketch after this thread for a concrete comparison.)

  3. This is covered in the Machine Learning Crash Course: developers.google.com/machine-learning/crash-course/ml-intro.

    A and B don’t make sense.

    Now, the crossed feature represents a well-defined city block. If the model learns that certain city blocks (within a range of latitudes and longitudes) are more likely to be expensive than others, that is a stronger signal than the two features considered individually. BUT we’ll have far too many dimensions (and that is bad). Would L2 regularization accomplish this task? Unfortunately not: L2 regularization encourages weights to be small, but doesn’t force them to exactly 0.0.
    However, there is a regularization term called L1 regularization that serves as an approximation to L0 but has the advantage of being convex and thus efficient to compute. So we can use L1 regularization to encourage many of the uninformative coefficients in our model to be exactly 0, and thus reap RAM savings at inference time (see the second sketch after this thread).

    Ergo, I’d say it’s C.
    Also check this: developers.google.com/machine-learning/crash-course/regularization-for-sparsity/l1-regularization

  4. A raw numeric feature does not seem to be a good representation of this data in a neural network: a larger latitude or longitude does not necessarily mean a better location. For this purpose, bucketization seems a better solution.
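
To make the L1-versus-L2 contrast in comment 2 concrete, here is a minimal sketch using scikit-learn’s Lasso (L1) and Ridge (L2) on synthetic data. The feature counts, coefficient values, and alpha are illustrative assumptions, not part of the original question.

```python
# Minimal sketch (editorial addition): L1 (Lasso) vs. L2 (Ridge) on synthetic
# data where only 3 of 20 features actually matter. All numbers here are
# illustrative assumptions.
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
# Only the first three features drive the target; the other 17 are noise.
y = 4.0 * X[:, 0] - 2.0 * X[:, 1] + 1.5 * X[:, 2] + rng.normal(scale=0.1, size=200)

l1 = Lasso(alpha=0.1).fit(X, y)
l2 = Ridge(alpha=0.1).fit(X, y)

# L1 drives uninformative weights to exactly 0; L2 only shrinks them.
print("L1 coefficients at exactly zero:", int((l1.coef_ == 0.0).sum()))  # typically 17
print("L2 coefficients at exactly zero:", int((l2.coef_ == 0.0).sum()))  # typically 0
```

This is the behavior comment 3 relies on: with a high-dimensional feature cross, L1 can zero out the buckets that carry no signal.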
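
Comment 3’s crossed feature can also be sketched directly: bucketize latitude and longitude at the minute level (1/60 of a degree) and combine the two bucket indices into a single categorical id. The sample coordinates, the hashing trick, and the bucket count below are assumptions made for illustration.

```python
# Minimal sketch (editorial addition) of a minute-level feature cross of
# latitude and longitude. Coordinates and the hash-bucket count are
# illustrative assumptions.
import math

def lat_lon_cross(lat, lon, num_hash_buckets=10_000):
    """Hashed bucket id for a (lat, lon) pair crossed at minute resolution."""
    lat_bucket = math.floor(lat * 60)  # one bucket per minute of latitude
    lon_bucket = math.floor(lon * 60)  # one bucket per minute of longitude
    # The cross treats each (lat_bucket, lon_bucket) pair as one category;
    # hashing caps the dimensionality of the resulting one-hot input.
    return hash((lat_bucket, lon_bucket)) % num_hash_buckets

# Two properties on the same block share a crossed bucket...
print(lat_lon_cross(37.4221, -122.0841) == lat_lon_cross(37.4223, -122.0843))  # True
# ...while a property a couple of miles away falls in a different bucket.
print(lat_lon_cross(37.4221, -122.0841) == lat_lon_cross(37.3861, -122.0839))  # False (barring a hash collision)
```

Each crossed id would then reach the network as a one-hot vector (or an embedding lookup), and the L1 penalty from the previous sketch is what prunes the crossed buckets that turn out to be uninformative.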
