We will briefly review linear regression before implementing it using Tensorflow. Since we will not go into the details of either linear regression or Tensorflow, please read the following articles for more details:
The hypothesis of linear regression is h(x) = w * x + b, where w is a vector called the weights and b is a scalar called the bias. The weights and bias are called the parameters of the model.
All we need to do is estimate the values of w and b from the given dataset, so that the resulting hypothesis produces the smallest cost J, defined by the following cost function:

J = (1 / 2m) * Σ (h(x_i) − y_i)², summed over i = 1, …, m

where m is the number of data points in the given dataset. This cost function is also called the mean squared error.
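As a quick illustration, this cost can be computed directly with NumPy. The data points and parameter values below are made up purely for demonstration:

```python
import numpy as np

# Toy data and parameter values (hypothetical, for illustration only)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 8.1])
w, b = 2.0, 0.0

m = len(x)                       # number of data points
predictions = w * x + b          # hypothesis h(x) = w*x + b
J = np.sum((predictions - y) ** 2) / (2 * m)  # mean squared error cost
print("Cost J =", J)
```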
To find the values of the parameters for which J is minimal, we will use a commonly used optimization algorithm called gradient descent. Below is the pseudocode for gradient descent:

repeat until convergence {
    w = w - α * ∂J/∂w
    b = b - α * ∂J/∂b
}

where α is a hyperparameter called the learning rate.

Tensorflow
Tensorflow is an open-source computing library created by Google. It is a popular choice for building applications that require high-performance numerical computing and/or GPUs for computation. These are the main reasons why Tensorflow is one of the most popular choices for machine learning applications, especially deep learning. It also offers APIs such as Estimator that provide a high level of abstraction when building machine learning applications. In this article we will not use any high-level APIs; instead, we will build a linear regression model using low-level Tensorflow in lazy execution mode, in which Tensorflow builds a directed acyclic graph (DAG) that keeps track of all computations, and then performs all of them inside a Tensorflow session.
Implementation
Let's start by importing the required libraries. We will be using Numpy along with Tensorflow for calculations and Matplotlib for plotting.
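A minimal set of imports for this setup could look like the following (assuming the Tensorflow 1.x API used throughout this article):

```python
import numpy as np                # numerical computations
import tensorflow as tf           # model building and training
import matplotlib.pyplot as plt   # plotting
```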

To make the random numbers predictable, we'll define fixed seeds for Numpy and Tensorflow.
np.random.seed(101)
tf.set_random_seed(101)
Now let's generate some random data to train the linear regression model.
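One possible way to generate such data is the sketch below; the point count, value ranges, and noise levels are our assumptions, not values prescribed by the article:

```python
import numpy as np

np.random.seed(101)

# 50 roughly linear data points with uniform noise added to both axes
x = np.linspace(0, 50, 50)
y = np.linspace(0, 50, 50)
x += np.random.uniform(-4, 4, 50)
y += np.random.uniform(-4, 4, 50)
n = len(x)  # number of data points (m in the cost formula above)
```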

Let us visualize the training data.

Output:
Now we start building our model by defining the placeholders X and Y, so that we can feed our training examples x and y to the optimizer during the training process.
We now declare two trainable Tensorflow variables for the weight and bias, and initialize them randomly using np.random.randn().
X = tf.placeholder("float")
Y = tf.placeholder("float")
W = tf.Variable(np.random.randn(), name="W")
b = tf.Variable(np.random.randn(), name="b")
Now we define the model hyperparameters: the learning rate and the number of training epochs.
learning_rate = 0.01
training_epochs = 1000
Now we will build the hypothesis, the cost function, and the optimizer. We will not implement the gradient descent optimizer manually, since it is built into Tensorflow. After that, we will initialize the variables.
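One way these pieces can fit together with low-level Tensorflow is sketched below. It assumes the TF 1.x API, imported here via tf.compat.v1 so the snippet also runs under TensorFlow 2, and it regenerates the training data so it is self-contained:

```python
import numpy as np
import tensorflow.compat.v1 as tf  # TF1-style API; also available under TensorFlow 2

tf.disable_eager_execution()
tf.set_random_seed(101)
np.random.seed(101)

# Training data (same shape as generated earlier in the article)
x = np.linspace(0, 50, 50) + np.random.uniform(-4, 4, 50)
y = np.linspace(0, 50, 50) + np.random.uniform(-4, 4, 50)
n = len(x)

# Placeholders and trainable variables
X = tf.placeholder("float")
Y = tf.placeholder("float")
W = tf.Variable(np.random.randn(), name="W")
b = tf.Variable(np.random.randn(), name="b")

learning_rate = 0.01
training_epochs = 1000

# Hypothesis: y_pred = W * X + b
y_pred = tf.add(tf.multiply(X, W), b)
# Mean squared error cost
cost = tf.reduce_sum(tf.pow(y_pred - Y, 2)) / (2 * n)
# Built-in gradient descent optimizer minimizing the cost
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
# Global variables initializer
init = tf.global_variables_initializer()
```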

We will now start the training process inside a Tensorflow session.
# Start a Tensorflow session
with tf.Session() as sess:
    # Initialize the variables
    sess.run(init)
    # Loop over all epochs
    for epoch in range(training_epochs):
        # Feed each data point to the optimizer using the feed dictionary
        for (_x, _y) in zip(x, y):
            sess.run(optimizer, feed_dict={X: _x, Y: _y})
        # Display the result every 50 epochs
        if (epoch + 1) % 50 == 0:
            # Calculate the cost at the current epoch
            c = sess.run(cost, feed_dict={X: x, Y: y})
            print("Epoch", (epoch + 1), ": cost =", c, "W =", sess.run(W), "b =", sess.run(b))
    # Store the values required outside the session
    training_cost = sess.run(cost, feed_dict={X: x, Y: y})
    weight = sess.run(W)
    bias = sess.run(b)
Output:
Epoch: 50 cost = 5.8868036 W = 0.9951241 b = 1.2381054
Epoch: 100 cost = 5.7912707 W = 0.99812365 b = 1.0914398
Epoch: 150 cost = 5.7119675 W = 1.0008028 b = 0.96044314
Epoch: 200 cost = 5.6459413 W = 1.0031956 b = 0.8434396
Epoch: 250 cost = 5.590799 W = 1.0053328 b = 0.7389357
Epoch: 300 cost = 5.544608 W = 1.007242 b = 0.6455922
Epoch: 350 cost = 5.5057883 W = 1.008947 b = 0.56222
Epoch: 400 cost = 5.473066 W = 1.01047 b = 0.48775345
Epoch: 450 cost = 5.4453845 W = 1.0118302 b = 0.42124167
Epoch: 500 cost = 5.421903 W = 1.0130452 b = 0.36183488
Epoch: 550 W = 1.0141305 b = 0.30877414
Epoch: 600 cost = 5.3848577 W = 1.0150996 b = 0.26138115
Epoch: 650 cost = 5.370246 W = 1.0159653 b = 0.21905091
Epoch: 700 cost = 5.3576994 W = 1.0167387 b = 0.1812
Epoch: 750 W = 1.0174689 b = 0.14747244
Epoch: 800 cost = 5.3375573 W = 1.0180461 b = 0.11730931
Epoch: 850 cost = 5.3294764 W = 1.0185971 b = 0.090368524
Epoch: 900 cost = 5.322459 W = 1.0190892 b = 0.0663058
Epoch: 950 cost = 5.3163586 W = 1.0195289 b = 0.044813324
Epoch: 1000 cost = 5.311099 W = 1.0199214 b = 0.02561663
Now let's look at the result.
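A sketch of this step in plain NumPy, computing the fitted predictions and printing the final parameters. The weight, bias, and training cost values are hard-coded here from the figures reported below so the snippet is self-contained; in the article they come out of the training session:

```python
import numpy as np

# Values standing in for the ones computed in the session above
weight, bias, training_cost = 1.0199214, 0.02561663, 5.3110332
x = np.linspace(0, 50, 50)

# Compute predictions of the fitted line
predictions = weight * x + bias
print("Training cost =", training_cost, "Weight =", weight, "bias =", bias)
```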

Output:
Training cost = 5.3110332 Weight = 1.0199214 bias = 0.02561663
Note that in this case both the weight and the bias are scalars. This is because we considered only one independent variable in our training data. If a training dataset has multiple independent variables (say, k features), the weight will be a k-dimensional vector, while the bias will remain a scalar.
Finally, we will plot our result.

Output: