That’s cool, but does being able to estimate the price of a house really count as “learning”?
As a human, your brain can approach most any situation
and learn how to deal with that situation without any explicit instructions. If
you sell houses for a long time, you will instinctively have a “feel” for the
right price for a house, the best way to market that house, the kind of client
who would be interested, etc. The goal of Strong AI research is to be able to replicate
this ability with computers.
But current machine learning algorithms aren’t that good
yet: they only work when focused on a very specific, limited problem. Maybe a
better definition for “learning” in this case is “figuring out an equation to
solve a specific problem based on some example data”.
Unfortunately “Machine Figuring out an equation to
solve a specific problem based on some example data” isn’t really a
great name. So we ended up with “Machine Learning” instead.
Of course if you are reading this 50 years in the future
and we’ve figured out the algorithm for Strong AI, then this whole post will
all seem a little quaint. Maybe stop reading and go tell your robot servant to
go make you a sandwich, future human.
Let’s write that program!
So, how would you write the program to estimate the
value of a house like in our example above? Think about it for a second before
you read further.
If you didn’t know anything about machine learning,
you’d probably try to write out some basic rules for estimating the price of a
house like this:
    def estimate_house_sales_price(num_of_bedrooms, sqft, neighborhood):
        price = 0

        # In my area, the average house costs $200 per sqft
        price_per_sqft = 200

        if neighborhood == "hipsterton":
            # but some areas cost a bit more
            price_per_sqft = 400
        elif neighborhood == "skid row":
            # and some areas cost less
            price_per_sqft = 100

        # start with a base price estimate based on how big the place is
        price = price_per_sqft * sqft

        # now adjust our estimate based on the number of bedrooms
        if num_of_bedrooms == 0:
            # Studio apartments are cheap
            price = price - 20000
        else:
            # places with more bedrooms are usually
            # more valuable
            price = price + (num_of_bedrooms * 1000)

        return price
If you fiddle with this for hours and hours, you might
end up with something that sort of works. But your program will never be
perfect and it will be hard to maintain as prices change.
Wouldn’t it be better if the computer could just figure
out how to implement this function for you? Who cares what exactly the function
does as long as it returns the correct number:
    def estimate_house_sales_price(num_of_bedrooms, sqft, neighborhood):
        price = <computer, plz do some math for me>

        return price
One way to think about this problem is that the price is
a delicious stew and the ingredients are the number of bedrooms,
the square footage and the neighborhood. If you
could just figure out how much each ingredient impacts the final price, maybe
there’s an exact ratio of ingredients to stir in to make the final price.
That would reduce your original function (with all those
crazy if’s and else’s) down to something really simple
like this:
    def estimate_house_sales_price(num_of_bedrooms, sqft, neighborhood):
        price = 0

        # a little pinch of this
        price += num_of_bedrooms * .841231951398213

        # and a big pinch of that
        price += sqft * 1231.1231231

        # maybe a handful of this
        price += neighborhood * 2.3242341421

        # and finally, just a little extra salt for good measure
        price += 201.23432095

        return price
Notice the magic numbers: .841231951398213, 1231.1231231, 2.3242341421, and 201.23432095.
These are our weights. If we could just figure out the perfect
weights to use that work for every house, our function could predict house
prices!
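To make the role of the weights a bit more concrete, here is a minimal sketch that pulls them out of the function body and passes them in as a parameter. The `weights` tuple is an illustrative choice, not part of the original example, and the sketch assumes `neighborhood` has already been turned into a number (a hypothetical neighborhood score), since a string can't be multiplied by a weight.

```python
def estimate_house_sales_price(num_of_bedrooms, sqft, neighborhood, weights):
    """Weighted sum of the ingredients plus a constant "salt" term.

    Assumes `neighborhood` is already a number (hypothetical encoding).
    """
    w_bedrooms, w_sqft, w_neighborhood, salt = weights

    price = 0
    price += num_of_bedrooms * w_bedrooms   # a little pinch of this
    price += sqft * w_sqft                  # and a big pinch of that
    price += neighborhood * w_neighborhood  # maybe a handful of this
    price += salt                           # a little extra salt
    return price


# The same magic numbers as above, collected in one place.
weights = (0.841231951398213, 1231.1231231, 2.3242341421, 201.23432095)

# A made-up house: 3 bedrooms, 2000 sqft, neighborhood score 1.
estimate = estimate_house_sales_price(3, 2000, 1, weights)
```

Written this way, finding a good estimator no longer means rewriting the function. It only means finding a better `weights` tuple.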
A dumb way to figure out the best weights would be
something like this:
Step 1:
Start with each weight set to 1.0:
    def estimate_house_sales_price(num_of_bedrooms, sqft, neighborhood):
        price = 0

        # a little pinch of this
        price += num_of_bedrooms * 1.0

        # and a big pinch of that
        price += sqft * 1.0

        # maybe a handful of this
        price += neighborhood * 1.0

        # and finally, just a little extra salt for good measure
        price += 1.0

        return price
Step 2:
Run every house you know about through your function and
see how far off the function is at guessing the correct price for each house:
For example, if the first house really sold for $250,000, but your function guessed it sold for $178,000, you are off by $72,000 for that single house.
Now add up the squared amount you are off for each house
you have in your data set. Let’s say that you had 500 home sales in your data
set and the square of how much your function was off for each house was a grand
total of $86,123,373. That’s how “wrong” your function currently is.
Now, take that sum total and divide it by 500 to get the
average squared error per house. Call this average error amount
the cost of your function.
If you could get this cost to be zero by playing with the weights, your function would be perfect. It would mean that in every case, your function perfectly guessed the price of the house based on the input data. So that’s our goal — get this cost to be as low as possible by trying different weights.
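Step 2 can be sketched in a few lines of Python. This is only a rough illustration: the three house sales are made up for the example, `neighborhood` is assumed to be a numeric score, and passing the weights in as a tuple is my own convenience rather than something from the functions above.

```python
def estimate_house_sales_price(num_of_bedrooms, sqft, neighborhood, weights):
    # Weighted sum of the inputs plus a constant term.
    w_bedrooms, w_sqft, w_neighborhood, salt = weights
    return (num_of_bedrooms * w_bedrooms
            + sqft * w_sqft
            + neighborhood * w_neighborhood
            + salt)


def cost(houses, weights):
    """Average of the squared error over every house in the data set."""
    total_squared_error = 0.0
    for num_of_bedrooms, sqft, neighborhood, sold_price in houses:
        guess = estimate_house_sales_price(
            num_of_bedrooms, sqft, neighborhood, weights)
        total_squared_error += (sold_price - guess) ** 2
    return total_squared_error / len(houses)


# Made-up sales: (bedrooms, sqft, neighborhood score, sold price)
houses = [
    (3, 2000, 1, 228000),
    (2, 800, 2, 112000),
    (4, 3000, 3, 339000),
]

# Step 1 weights: every guess is far too low, so the cost is large.
starting_cost = cost(houses, (1.0, 1.0, 1.0, 1.0))

# The made-up prices were generated from these weights, so the cost is zero.
perfect_cost = cost(houses, (1000.0, 100.0, 5000.0, 20000.0))
```

Squaring each error keeps over-estimates and under-estimates from cancelling each other out, and it punishes big misses more than small ones.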
Step 3:
Repeat Step 2 over and over with every single possible combination of weights. Whichever combination of weights makes the cost closest to zero is what you use. When you find the weights that work, you’ve solved the problem!
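Here is a rough sketch of that brute-force search. Since you can't literally try every possible number, it assumes a tiny hand-picked list of candidate values for each weight. The candidate list, the made-up house sales, and the helper names `predict`, `cost` and `brute_force_weights` are all hypothetical.

```python
from itertools import product


def predict(features, weights):
    # The last weight is the constant "salt" term; the rest pair up with features.
    *feature_weights, salt = weights
    return sum(f * w for f, w in zip(features, feature_weights)) + salt


def cost(houses, weights):
    # Average squared error across all houses (Step 2).
    squared_errors = [(price - predict(features, weights)) ** 2
                      for features, price in houses]
    return sum(squared_errors) / len(houses)


def brute_force_weights(houses, candidates, num_weights=4):
    """Try every combination of candidate values and keep the cheapest one."""
    best_weights, best_cost = None, float("inf")
    for weights in product(candidates, repeat=num_weights):
        current_cost = cost(houses, weights)
        if current_cost < best_cost:
            best_weights, best_cost = weights, current_cost
    return best_weights, best_cost


# Made-up sales: ((bedrooms, sqft, neighborhood score), sold price)
houses = [
    ((3, 2000, 1), 228000),
    ((2, 800, 2), 112000),
    ((4, 3000, 3), 339000),
    ((1, 600, 2), 91000),
]

# A tiny, hand-picked grid of values to try for each weight.
candidates = [1, 100, 1000, 5000, 20000]

best_weights, best_cost = brute_force_weights(houses, candidates)
```

Even this toy search checks 5 × 5 × 5 × 5 = 625 combinations, and the count multiplies with every extra candidate value or weight, which is exactly why this counts as the dumb way.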
Source: Medium.com