Multiple Linear Regression
List of TensorFlow 2.0 Tutorials
- TF2.0 - 01.Simple Linear Regression
- TF2.0 - 02.Linear Regression and How to minimize cost
- TF2.0 - 03.Multiple Linear Regression
- TF2.0 - 04.Logistic Regression
- TF2.0 - 05.Multinomial Classification
- TF2.0 - 06.Iris Data Classification
We previously discussed the simple linear regression model. This time, we will look at the hypothesis, cost function, and cost minimization of the multiple linear regression model.
(Review) Simple Linear Regression Model
Hypothesis
\[H(x) = Wx + b\]
Cost Function
\[cost(W) = {1 \over 2m} {\sum_{i=1}^m} (Wx_i-y_i)^2\]
Gradient Descent Algorithm
\[W := W - \alpha{1 \over m} {\sum_{i=1}^m} (Wx_i-y_i) x_i\]
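To make the update rule concrete, here is a minimal NumPy sketch (not part of the original post) that applies this review formula to a toy dataset whose true weight is 1; the data, starting weight, and learning rate are illustrative assumptions.
# Illustrative only: one-variable gradient descent for the simplified hypothesis H(x) = Wx
import numpy as np
x = np.array([1., 2., 3., 4., 5.])
y = np.array([1., 2., 3., 4., 5.])    # toy data whose true weight is 1
W = 5.0                               # arbitrary starting weight
alpha = 0.05                          # assumed learning rate
for step in range(200):
    grad = np.mean((W * x - y) * x)   # (1/m) * sum((W*x_i - y_i) * x_i)
    W = W - alpha * grad              # W := W - alpha * grad
print(W)                              # approaches 1.0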
Multiple Linear Regression Model
Let's take a look at a model with three features.
Hypothesis and Cost Function
\[H(x_1, x_2, x_3) = w_1x_1+w_2x_2+w_3x_3+b\]
\[cost(W, b) = {1 \over m} {\sum_{i=1}^m} (H(x_{i1}, x_{i2}, x_{i3})-y_i)^2\]
Hypothesis Using Matrix
\[H(X)=XW= \begin{pmatrix} x_1 & x_2 & x_3 \end{pmatrix} \cdot \begin{pmatrix} w_1 \\ w_2 \\ w_3 \end{pmatrix} = \begin{pmatrix} x_1w_1 + x_2w_2 + x_3w_3 \end{pmatrix}\]
Predicting exam score
Suppose we want to predict a student's final exam score from the Quiz1, Quiz2, and Midterm scores. Five data points are given as follows.
$x_1$(Quiz1) | $x_2$(Quiz2) | $x_3$(Midterm) | $y$(Final) |
---|---|---|---|
80 | 75 | 90 | 85 |
65 | 55 | 60 | 70 |
35 | 55 | 45 | 60 |
78 | 85 | 80 | 78 |
95 | 90 | 94 | 98 |
Hypothesis in this case
\[H(X) = XW\]
\[\begin{pmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \\ x_{31} & x_{32} & x_{33} \\ x_{41} & x_{42} & x_{43} \\ x_{51} & x_{52} & x_{53} \end{pmatrix} \cdot \begin{pmatrix} w_1 \\ w_2 \\ w_3 \end{pmatrix} = \begin{pmatrix} x_{11}w_{1} + x_{12}w_{2} + x_{13}w_{3} \\ x_{21}w_{1} + x_{22}w_{2} + x_{23}w_{3} \\ x_{31}w_{1} + x_{32}w_{2} + x_{33}w_{3} \\ x_{41}w_{1} + x_{42}w_{2} + x_{43}w_{3} \\ x_{51}w_{1} + x_{52}w_{2} + x_{53}w_{3} \end{pmatrix}\]
Libraries
import numpy as np
import tensorflow as tf
print("TensorFlow Version: %s" % tf.__version__)
TensorFlow Version: 2.0.0
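Before implementing the model, it may help to confirm the dimensions of the matrix hypothesis above: with five samples and three features, $X$ is $(5, 3)$, $W$ is $(3, 1)$, so $XW$ is $(5, 1)$. The check below is not part of the original post and uses an arbitrary all-ones weight vector.
# Illustrative shape check for H(X) = XW
X_check = np.array([[80., 75., 90.],
                    [65., 55., 60.],
                    [35., 55., 45.],
                    [78., 85., 80.],
                    [95., 90., 94.]], dtype=np.float32)   # (5, 3)
W_check = np.ones([3, 1], dtype=np.float32)               # (3, 1), arbitrary weights
print((X_check @ W_check).shape)                          # (5, 1)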
A. Computing without matrix operations
The problem can be solved without matrix operations, but it is tedious to manage each weight variable individually, as shown below.
# Data
x1 = [80., 65., 35., 78., 95.]
x2 = [75., 55., 55., 85., 90.]
x3 = [90., 60., 45., 80., 94.]
y = [85., 70., 60., 78., 98.]
# Weights
tf.random.set_seed(2020)
w1 = tf.Variable(tf.random.normal([1], mean=0.0))
w2 = tf.Variable(tf.random.normal([1], mean=0.0))
w3 = tf.Variable(tf.random.normal([1], mean=0.0))
b = tf.Variable(tf.random.normal([1], mean=0.0))
# Learning Rate
learning_rate = 0.0000001
# Training
for i in range(2000+1):
    with tf.GradientTape() as tape:
        hypothesis = w1*x1 + w2*x2 + w3*x3 + b
        cost = tf.reduce_mean(tf.square(hypothesis - y))
    w1_grad, w2_grad, w3_grad, b_grad = tape.gradient(cost, [w1, w2, w3, b])
    w1.assign_sub(learning_rate * w1_grad)
    w2.assign_sub(learning_rate * w2_grad)
    w3.assign_sub(learning_rate * w3_grad)
    b.assign_sub(learning_rate * b_grad)
    if i % 200 == 0:
        print("#%s \t cost: %s" % (i, cost.numpy()))
#0 cost: 19095.684
#200 cost: 5163.9473
#400 cost: 1452.9823
#600 cost: 464.43408
#800 cost: 201.04555
#1000 cost: 130.81319
#1200 cost: 112.03182
#1400 cost: 106.954956
#1600 cost: 105.52873
#1800 cost: 105.075134
#2000 cost: 104.88092
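As a quick sanity check (not part of the original post), the learned weights from approach A can be used to compare the fitted scores against the actual final scores:
# Compare fitted values with the actual final scores (illustrative only)
hypothesis = w1*x1 + w2*x2 + w3*x3 + b
for pred, actual in zip(hypothesis.numpy(), y):
    print("predicted: %.2f \t actual: %.1f" % (pred, actual))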
B. Computing with matrix operations
Unlike the element-wise method above, matrix operations solve the problem much more concisely.
# Data
data = np.array([
    [80., 75., 90., 85.],
    [65., 55., 60., 70.],
    [35., 55., 45., 60.],
    [78., 85., 80., 78.],
    [95., 90., 94., 98.]
], dtype=np.float32)
# Slice Data
X = data[:, :-1]
Y = data[:, [-1]]
X
array([[80., 75., 90.],
       [65., 55., 60.],
       [35., 55., 45.],
       [78., 85., 80.],
       [95., 90., 94.]], dtype=float32)
Y
array([[85.],
       [70.],
       [60.],
       [78.],
       [98.]], dtype=float32)
# Weights
tf.random.set_seed(2020)
W = tf.Variable(tf.random.normal([3, 1], mean=0.0))
b = tf.Variable(tf.random.normal([1], mean=0.0))
# Learning Rate
learning_rate = 0.0000001
# Hypothesis and Prediction Function
def predict(X):
    return tf.matmul(X, W) + b
# Training
for i in range(2000+1):
    with tf.GradientTape() as tape:
        cost = tf.reduce_mean(tf.square(predict(X) - Y))
    W_grad, b_grad = tape.gradient(cost, [W, b])
    W.assign_sub(learning_rate * W_grad)
    b.assign_sub(learning_rate * b_grad)
    if i % 200 == 0:
        print("#%s \t cost: %s" % (i, cost.numpy()))
#0 cost: 7769.522
#200 cost: 2113.2505
#400 cost: 606.6113
#600 cost: 205.28519
#800 cost: 98.37294
#1000 cost: 69.8826
#1200 cost: 62.28067
#1400 cost: 60.242634
#1600 cost: 59.686646
#1800 cost: 59.52544
#2000 cost: 59.46943
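Likewise, the trained matrix model can be checked against the data and, assuming new scores are on the same scale, used to predict the final score of an unseen student. The sketch below and the new student's scores are illustrative additions, not part of the original post.
# Fitted values (left column) next to the targets (right column)
print(np.hstack([predict(X).numpy(), Y]))
# Prediction for a hypothetical new student: Quiz1=90, Quiz2=88, Midterm=93
x_new = np.array([[90., 88., 93.]], dtype=np.float32)
print(predict(x_new).numpy())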