We can use the numpy.zeros() function to accomplish this task. It takes three parameters, which are discussed below:
shape: integer or sequence of integers giving the shape of the new array.
order: "C" (C-contiguous) or "F" (Fortran-contiguous). In C-contiguous order the last index varies the fastest in memory, so operating row-wise on the array will be slightly quicker. In Fortran-contiguous order the first index varies the fastest, so column-wise operations will be faster.
dtype: [optional, float by default] data type of the returned array.
Example #1:

Output:
Matrix a: [0 0 0]
Matrix b:
[[0 0 0]
 [0 0 0]
 [0 0 0]]

Example #2:

Output:
Matrix c:
[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
Matrix d:
[[0. 0.]
 [0. 0.]
 [0. 0.]
 [0. 0.]
 [0. 0.]]
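The code listings for the two examples did not survive on this page. The sketch below is one way to produce arrays with the shapes and dtypes shown in the outputs; the variable names follow the printed labels, and the use of order="F" for d is only an illustrative assumption.

```python
import numpy as np

# Example 1: integer arrays (dtype=int gives "0" rather than "0.")
a = np.zeros(3, dtype=int)
b = np.zeros((3, 3), dtype=int)
print("Matrix a:", a)
print("Matrix b:\n", b)

# Example 2: dtype defaults to float; order changes only the memory layout
c = np.zeros((5, 3))
d = np.zeros((5, 2), order="F")  # assumption: Fortran order, for illustration
print("Matrix c:\n", c)
print("Matrix d:\n", d)
```

The order argument never changes the values, only how they are laid out in memory.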
Given:
a = 1
b = 10
c = 100
How do I display a leading zero for all numbers with less than two digits?
This is the output I'm expecting:
01
10
100
I need to add leading zeros to an integer to make a string with a defined number of digits ($cnt). What is the best way to translate this simple function from PHP to Python:
function add_nulls($int, $cnt=2) {
    $int = intval($int);
    for($i=0; $i<($cnt-strlen($int)); $i++)
        $nulls .= "0";
    return $nulls.$int;
}
Is there a function that can do this?
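Both leading-zero questions above can probably be handled with the standard library alone. A minimal sketch, reusing the add_nulls name from the PHP function and assuming non-negative integers:

```python
def add_nulls(value, cnt=2):
    """Return value as a string, left-padded with zeros to at least cnt digits."""
    return str(int(value)).zfill(cnt)

for n in (1, 10, 100):
    print(add_nulls(n))

# The format mini-language does the same thing inline
print("{:03d}".format(7))
```

Numbers that already have cnt digits or more are returned unchanged, which matches the expected output 01, 10, 100.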
How can I create a list which contains only zeros? I want to be able to create a zeros list for each int in range(10).
For example, if the int in the range was 4 I will get:
[0,0,0,0]
and for 7:
[0,0,0,0,0,0,0]
I am trying to build a histogram of counts... so I create buckets. I know I could just go through and append a bunch of zeros i.e something along these lines:
buckets = []
for i in xrange(0,100):
    buckets.append(0)
Is there a more elegant way to do it? I feel like there should be a way to just declare an array of a certain size.
I know numpy has numpy.zeros but I want the more general solution.
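For the two list questions above, list repetition is the usual plain-Python idiom. A short sketch in Python 3 (so range rather than xrange); the names zeros and zero_lists are just illustrative:

```python
n = 4
zeros = [0] * n
print(zeros)

# One zero-filled list per int in range(10)
zero_lists = [[0] * size for size in range(10)]
print(zero_lists[7])

# Histogram buckets of a fixed size
buckets = [0] * 100
buckets[3] += 1
```

Repetition is safe here because integers are immutable. For nested lists, something like [[0] * 3] * 2 would repeat references to one inner list, so a comprehension is the safer choice there.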
How can I format a float so that it doesn't contain trailing zeros? In other words, I want the resulting string to be as short as possible.
For example:
3 -> "3"
3. -> "3"
3.0 -> "3"
3.1 -> "3.1"
3.14 -> "3.14"
3.140 -> "3.14"
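One possible approach is to format with a fixed number of decimals and then strip the zeros. The helper name short_float is hypothetical, and the sketch assumes six decimal places are enough for the values involved:

```python
def short_float(value):
    """Format a number with up to six decimals, dropping trailing zeros."""
    text = "{:f}".format(float(value))  # e.g. 3.14 becomes "3.140000"
    # Strip the zeros first, then a dangling decimal point if one remains
    return text.rstrip("0").rstrip(".")

for sample in (3, 3., 3.0, 3.1, 3.14, 3.140):
    print(sample, "->", short_float(sample))
```

Stripping in two steps matters: a single rstrip("0.") would also eat significant zeros before the point, turning "100.000000" into "1".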
I'm trying to convert an integer to binary using the bin() function in Python. However, it always removes the leading zeros, which I actually need, such that the result is always 8-bit:
Example:
bin(1) -> 0b1
# What I would like:
bin(1) -> 0b00000001
Is there a way of doing this?
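The built-in format() function can likely cover this. A small sketch, where to_binary is a hypothetical helper name and non-negative input is assumed:

```python
def to_binary(value, bits=8):
    """Format a non-negative int as a 0b-prefixed, zero-padded binary string."""
    # "#" keeps the 0b prefix; the width also counts the two prefix characters
    return format(value, "#0{}b".format(bits + 2))

print(to_binary(1))
print(format(1, "08b"))  # same digits without the prefix
```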
I have several alphanumeric strings like these
listOfNum = ["000231512n","1209123100000n00000","alphanumeric0000", "000alphanumeric"]
The desired output for removing trailing zeros would be:
listOfNum = ["000231512n","1209123100000n","alphanumeric", "000alphanumeric"]
The desired output for removing leading zeros would be:
listOfNum = ["231512n","1209123100000n00000","alphanumeric0000", "alphanumeric"]
The desired output for removing both leading and trailing zeros would be:
listOfNum = ["231512n","1209123100000n", "alphanumeric", "alphanumeric"]
For now I've been doing it the following way, please suggest a better way if there is one:
listOfNum = ["000231512n","1209123100000n00000","alphanumeric0000",
             "000alphanumeric"]
trailingremoved = []
leadingremoved = []
bothremoved = []
# Remove trailing
for i in listOfNum:
    while i[-1] == "0":
        i = i[:-1]
    trailingremoved.append(i)
# Remove leading
for i in listOfNum:
    while i[0] == "0":
        i = i[1:]
    leadingremoved.append(i)
# Remove both
for i in listOfNum:
    while i[0] == "0":
        i = i[1:]
    while i[-1] == "0":
        i = i[:-1]
    bothremoved.append(i)
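The three loops above can probably be replaced by the string methods rstrip, lstrip and strip, which accept the characters to remove. A sketch using the same variable names as the question:

```python
listOfNum = ["000231512n", "1209123100000n00000", "alphanumeric0000", "000alphanumeric"]

trailingremoved = [s.rstrip("0") for s in listOfNum]
leadingremoved = [s.lstrip("0") for s in listOfNum]
bothremoved = [s.strip("0") for s in listOfNum]

print(trailingremoved)
print(leadingremoved)
print(bothremoved)
```

Unlike the while loops, these calls also cope with empty strings and strings made up only of zeros.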
I can use pandas dropna() functionality to remove rows with some or all columns set as NA's. Is there an equivalent function for dropping rows with all columns having value 0?
P   kt    b   tt    mky   depth
1   0     0   0     0     0
2   0     0   0     0     0
3   0     0   0     0     0
4   0     0   0     0     0
5   1.1   3   4.5   2.3   9.0
In this example, we would like to drop the first 4 rows from the data frame.
thanks!
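One way to approach this is a boolean mask rather than a dedicated method. The sketch below rebuilds the example frame, under the assumption that P is the index and the other five names are columns:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "kt": [0, 0, 0, 0, 1.1],
        "b": [0, 0, 0, 0, 3],
        "tt": [0, 0, 0, 0, 4.5],
        "mky": [0, 0, 0, 0, 2.3],
        "depth": [0, 0, 0, 0, 9.0],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="P"),  # assumption: P is the index
)

# Keep a row if at least one of its columns is non-zero
non_zero_rows = df.loc[(df != 0).any(axis=1)]
print(non_zero_rows)
```

Swapping any for all would instead keep only rows in which every column is non-zero.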
I want to know how I can pad a 2D numpy array with zeros using python 2.6.6 with numpy version 1.5.0. These are my limitations, so I cannot use np.pad. For example, I want to pad a with zeros such that its shape matches b. The reason why I want to do this is so I can do:
b-a
such that
>>> a
array([[ 1., 1., 1., 1., 1.],
[ 1., 1., 1., 1., 1.],
[ 1., 1., 1., 1., 1.]])
>>> b
array([[ 3., 3., 3., 3., 3., 3.],
[ 3., 3., 3., 3., 3., 3.],
[ 3., 3., 3., 3., 3., 3.],
[ 3., 3., 3., 3., 3., 3.]])
>>> c
array([[1, 1, 1, 1, 1, 0],
[1, 1, 1, 1, 1, 0],
[1, 1, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 0]])
The only way I can think of doing this is appending, however this seems pretty ugly. Is there a cleaner solution, possibly using b.shape?
Edit: Thank you to MSeifert's answer. I had to clean it up a bit, and this is what I got:
def pad(array, reference_shape, offsets):
    """
    array: Array to be padded
    reference_shape: tuple of size of ndarray to create
    offsets: list of offsets (number of elements must be equal to the dimension of the array)
    will throw a ValueError if offsets is too big and the reference_shape cannot handle the offsets
    """
    # Create an array of zeros with the reference shape
    result = np.zeros(reference_shape)
    # Create a tuple of slices from offset to offset + shape in each dimension
    # (recent numpy versions require a tuple, not a list, for this kind of indexing)
    insertHere = tuple(slice(offsets[dim], offsets[dim] + array.shape[dim]) for dim in range(array.ndim))
    # Insert the array in the result at the specified offsets
    result[insertHere] = array
    return result
We initialize a numpy array with zeros as below:
np.zeros((N,N+1))
But how do we check whether all elements in a given n*n numpy array matrix are zero?
The method just needs to return True if all the values are indeed zero.
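A minimal sketch of such a check, assuming a numeric array; is_all_zero is a hypothetical name, and np.any is used because it reports whether any element is non-zero:

```python
import numpy as np

def is_all_zero(matrix):
    """Return True when every element of matrix equals zero."""
    return not np.any(matrix)

N = 3
print(is_all_zero(np.zeros((N, N + 1))))
print(is_all_zero(np.eye(N)))
```

np.count_nonzero(matrix) == 0 is an equivalent formulation.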
If you like ascii art:
"VALID" = without padding:

   inputs:         1  2  3  4  5  6  7  8  9  10 11 (12 13)
                  |________________|                dropped
                                 |_________________|

"SAME" = with zero padding:

               pad|                                      |pad
   inputs:      0 |1  2  3  4  5  6  7  8  9  10 11 12 13|0  0
               |________________|
                              |_________________|
                                             |________________|

In this example:
- Input width = 13
- Filter width = 6
- Stride = 5
Notes:
- "VALID" only ever drops the rightmost columns (or bottommost rows).
- "SAME" tries to pad evenly left and right, but if the amount of columns to be added is odd, it will add the extra column to the right, as is the case in this example (the same logic applies vertically: there may be an extra row of zeros at the bottom).
Edit:
About the name:
- With "SAME" padding, if you use a stride of 1, the layer's outputs will have the same spatial dimensions as its inputs.
- With "VALID" padding, there's no "made-up" padding inputs. The layer only uses valid input data.
Your array a defines the columns of the nonzero elements in the output array. You need to also define the rows and then use fancy indexing:
>>> a = np.array([1, 0, 3])
>>> b = np.zeros((a.size, a.max()+1))
>>> b[np.arange(a.size),a] = 1
>>> b
array([[ 0., 1., 0., 0.],
[ 1., 0., 0., 0.],
[ 0., 0., 0., 1.]])
If your main goal is to visualize the correlation matrix, rather than creating a plot per se, the convenient pandas styling options are a viable built-in solution:
import pandas as pd
import numpy as np
rs = np.random.RandomState(0)
df = pd.DataFrame(rs.rand(10, 10))
corr = df.corr()
corr.style.background_gradient(cmap="coolwarm")
# "RdBu_r", "BrBG_r", & PuOr_r are other good diverging colormaps
Note that this needs to be in a backend that supports rendering HTML, such as the JupyterLab Notebook.
You can easily limit the digit precision:
corr.style.background_gradient(cmap="coolwarm").set_precision(2)
Or get rid of the digits altogether if you prefer the matrix without annotations:
corr.style.background_gradient(cmap="coolwarm").set_properties(**{"font-size": "0pt"})
The styling documentation also includes instructions of more advanced styles, such as how to change the display of the cell the mouse pointer is hovering over.
In my testing, style.background_gradient() was 4x faster than plt.matshow() and 120x faster than sns.heatmap() with a 10x10 matrix. Unfortunately it doesn't scale as well as plt.matshow(): the two take about the same time for a 100x100 matrix, and plt.matshow() is 10x faster for a 1000x1000 matrix.
There are a few possible ways to save the stylized dataframe:
- Get the HTML by appending the render() method and then write the output to a file.
- Save as an .xlsx file with conditional formatting by appending the to_excel() method.
By setting axis=None, it is now possible to compute the colors based on the entire matrix rather than per column or per row:
corr.style.background_gradient(cmap="coolwarm", axis=None)
Since many people are reading this answer I thought I would add a tip for how to only show one corner of the correlation matrix. I find this easier to read myself, since it removes the redundant information.
# Fill diagonal and upper half with NaNs
mask = np.zeros_like(corr, dtype=bool)
mask[np.triu_indices_from(mask)] = True
corr[mask] = np.nan
(corr
 .style
 .background_gradient(cmap="coolwarm", axis=None, vmin=-1, vmax=1)
 .highlight_null(null_color="#f1f1f1")  # Color NaNs grey
 .set_precision(2))
In numpy v1.7+, you can take advantage of the "where" option for ufuncs. You can do things in one line and you don't have to deal with the errstate context manager.
>>> a = np.array([1, 0, 1, 2, 3], dtype=float)
>>> b = np.array([ 0, 0, 0, 2, 2], dtype=float)
# If you don't pass `out` the indices where (b == 0) will be uninitialized!
>>> c = np.divide(a, b, out=np.zeros_like(a), where=b!=0)
>>> print(c)
[ 0. 0. 0. 1. 1.5]
In this case, it does the divide calculation anywhere "where" b does not equal zero. When b does equal zero, then it remains unchanged from whatever value you originally gave it in the "out" argument.
You need to push a bytes-like object (bytes, bytearray, etc) to the base64.b64encode() method. Here are two ways:
>>> import base64
>>> data = base64.b64encode(b"data to be encoded")
>>> print(data)
b"ZGF0YSB0byBiZSBlbmNvZGVk"
Or with a variable:
>>> import base64
>>> string = "data to be encoded"
>>> data = base64.b64encode(string.encode())
>>> print(data)
b"ZGF0YSB0byBiZSBlbmNvZGVk"
In Python 3, str objects are not C-style character arrays (so they are not byte arrays), but rather, they are data structures that do not have any inherent encoding. You can encode that string (or interpret it) in a variety of ways. The most common (and default in Python 3) is UTF-8, especially since it is backwards compatible with ASCII (as are most widely used encodings). That is what is happening when you take a string and call the .encode() method on it: Python is interpreting the string in UTF-8 (the default encoding) and providing you the array of bytes that it corresponds to.
Originally the question title asked about Base64 encoding. Read on for Base64 stuff.
base64 encoding takes 6-bit binary chunks and encodes them using the characters A-Z, a-z, 0-9, "+", "/", and "=" (some encodings use different characters in place of "+" and "/"). This is a character encoding that is based on the mathematical construct of the radix-64 or base-64 number system, but they are very different. Base-64 in math is a number system like binary or decimal, and you do this change of radix on the entire number, or (if the radix you're converting from is a power of 2 less than 64) in chunks from right to left.
In base64 encoding, the translation is done from left to right; those first 64 characters are why it is called base64 encoding. The 65th "=" symbol is used for padding, since the encoding pulls 6-bit chunks but the data it is usually meant to encode are 8-bit bytes, so sometimes there are only two or four bits in the last chunk.
Example:
>>> data = b"test"
>>> for byte in data:
... print(format(byte, "08b"), end=" ")
...
01110100 01100101 01110011 01110100
>>>
If you interpret that binary data as a single integer, then this is how you would convert it to base-10 and base-64 (table for base-64):
base2:  01 110100 011001 010111 001101 110100 (base-64 grouping shown)
base10:                            1952805748
base64:  B      0      Z      X      N      0
base64 encoding, however, will regroup this data thusly:
base2:  011101 000110 010101 110011 011101 00(0000) <- pad w/zeros to make a clean 6-bit chunk
base10:     29      6     21     51     29      0
base64:      d      G      V      z      d      A
So, "B0ZXN0" is the base-64 version of our binary, mathematically speaking. However, base64 encoding has to do the encoding in the opposite direction (so the raw data is converted to "dGVzdA") and also has a rule to tell other applications how much space is left off at the end. This is done by padding the end with "=" symbols. So, the base64 encoding of this data is "dGVzdA==", with two "=" symbols to signify two pairs of bits will need to be removed from the end when this data gets decoded to make it match the original data.
Let's test this to see if I am being dishonest:
>>> encoded = base64.b64encode(data)
>>> print(encoded)
b"dGVzdA=="
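The left-to-right regrouping described above can also be reproduced by hand. The sketch below is only an illustration of the 6-bit chunking and "=" padding, not how the base64 module is implemented; b64_by_hand and ALPHABET are hypothetical names.

```python
import base64
import string

# The 64 symbols, in index order: A-Z, a-z, 0-9, "+", "/"
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def b64_by_hand(data):
    """Encode bytes by regrouping their bits into 6-bit chunks, left to right."""
    bits = "".join(format(byte, "08b") for byte in data)
    # Pad the bit string with zeros so the last chunk is a clean 6 bits
    bits += "0" * (-len(bits) % 6)
    chars = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    # Pad the output with "=" up to a multiple of four characters
    return "".join(chars) + "=" * (-len(chars) % 4)

print(b64_by_hand(b"test"))
print(base64.b64encode(b"test").decode())
```

Both lines print the same string, which matches the session above.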
Why use base64 encoding?
Let's say I have to send some data to someone via email, like this data:
>>> data = b"\x04\x6d\x73\x67\x08\x08\x08\x20\x20\x20"
>>> print(data.decode())
>>> print(data)
b"\x04msg\x08\x08\x08   "
>>>
There are two problems I planted:
- The transmission could end as soon as the \x04 character was read, because that is ASCII for END-OF-TRANSMISSION (Ctrl-D), so the remaining data would be left out of the transmission.
- I used three BACKSPACE characters and three SPACE characters to erase the "msg". Thus, even if I didn't have the EOF character there the end user wouldn't be able to translate from the text on screen to the real, raw data.
This is just a demo to show you how hard it can be to simply send raw data. Encoding the data into base64 format gives you the exact same data but in a format that ensures it is safe for sending over electronic media such as email.
I think all of the answers here cover the core of what the lambda function does in the context of sorted() quite nicely, however I still feel like a description that leads to an intuitive understanding is lacking, so here is my two cents.
For the sake of completeness, I'll state the obvious up front: sorted() returns a list of sorted elements, and if we want to sort in a particular way or if we want to sort a complex list of elements (e.g. nested lists or a list of tuples) we can invoke the key argument.
For me, the intuitive understanding of the key argument, why it has to be callable, and the use of lambda as the (anonymous) callable function to accomplish this comes in two parts.
Lambda syntax is as follows:
lambda input_variable(s): tasty one liner
where lambda is a python keyword.
e.g.
In [1]: f00 = lambda x: x/2
In [2]: f00(10)
Out[2]: 5.0
In [3]: (lambda x: x/2)(10)
Out[3]: 5.0
In [4]: (lambda x, y: x / y)(10, 2)
Out[4]: 5.0
In [5]: (lambda: "amazing lambda")() # func with no args!
Out[5]: "amazing lambda"
The idea behind the key argument is that it should take in a set of instructions that will essentially point the sorted() function at those list elements which should be used to sort by. When it says key=, what it really means is: As I iterate through the list, one element at a time (i.e. for e in some_list), I'm going to pass the current element to the function specified by the key argument and use that to create a transformed list which will inform me on the order of the final sorted list.
Check it out:
In [6]: mylist = [3, 6, 3, 2, 4, 8, 23] # an example list
# sorted(mylist, key=HowToSort) # what we will be doing
Base example:
# mylist = [3, 6, 3, 2, 4, 8, 23]
In [7]: sorted(mylist)
Out[7]: [2, 3, 3, 4, 6, 8, 23]
# all numbers are in ascending order (i.e.from low to high).
Example 1:
# mylist = [3, 6, 3, 2, 4, 8, 23]
In [8]: sorted(mylist, key=lambda x: x % 2 == 0)
# Quick Tip: The % operator returns the *remainder* of a division
# operation. So the key lambda function here is saying "return True
# if x divided by 2 leaves a remainder of 0, else False". This is a
# typical way to check if a number is even or odd.
Out[8]: [3, 3, 23, 6, 2, 4, 8]
# Does this sorted result make intuitive sense to you?
Notice that my lambda function told sorted to check if each element e was even or odd before sorting.
BUT WAIT! You may (or perhaps should) be wondering two things.
First, why are the odd numbers coming before the even numbers? After all, the key value seems to be telling the sorted function to prioritize evens by using the mod operator in x % 2 == 0.
Second, why are the even numbers still out of order? 2 comes before 6, right?
By analyzing this result, we'll learn something deeper about how the "key" argument really works, especially in conjunction with the anonymous lambda function.
Firstly, you'll notice that while the odds come before the evens, the evens themselves are not sorted. Why is this? Let's read the docs:
Key Functions Starting with Python 2.4, both list.sort() and sorted() added a key parameter to specify a function to be called on each list element prior to making comparisons.
We have to do a little bit of reading between the lines here, but what this tells us is that the sort function is only called once, and if we specify the key argument, then we sort by the value that key function points us to.
So what does the example using a modulo return? A boolean value: True == 1, False == 0. So how does sorted deal with this key? It basically transforms the original list to a sequence of 1s and 0s.
[3, 6, 3, 2, 4, 8, 23] becomes [0, 1, 0, 1, 1, 1, 0]
Now we're getting somewhere. What do you get when you sort the transformed list?
[0, 0, 0, 1, 1, 1, 1]
Okay, so now we know why the odds come before the evens. But the next question is: Why does the 6 still come before the 2 in my final list? Well, that's easy: it is because sorting only happens once! Those 1s still represent the original list values, which are in their original positions relative to each other. Since sorting only happens once, and we don't call any kind of sort function to order the original even numbers from low to high, those values remain in their original order relative to one another.
The final question is then this: How do I think conceptually about how the order of my boolean values gets transformed back into the original values when I print out the final sorted list?
sorted() is a built-in function that (fun fact) uses a hybrid sorting algorithm called Timsort that combines aspects of merge sort and insertion sort. It seems clear to me that when you call it, there is a mechanic that holds these values in memory and bundles them with their boolean identity (mask) determined by (...!) the lambda function. The order is determined by their boolean identity calculated from the lambda function, but keep in mind that these sublists (of ones and zeros) are not themselves sorted by their original values. Hence, the final list, while organized by odds and evens, is not sorted within each sublist (the evens in this case are out of order). The fact that the odds are ordered is because they were already in order by coincidence in the original list. The takeaway from all this is that when lambda does that transformation, the original order within each sublist is retained.
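The behaviour described above is what the Python documentation calls a stable sort: elements with equal keys keep their original relative order. A small sketch that makes the keys visible and then adds the value as a tie-breaker (the variable names are illustrative):

```python
mylist = [3, 6, 3, 2, 4, 8, 23]

# The key each element is sorted by: False (0) for odd, True (1) for even
keys = [x % 2 == 0 for x in mylist]
print(keys)

# Equal keys keep their original order, so the evens stay as 6, 2, 4, 8
by_parity = sorted(mylist, key=lambda x: x % 2 == 0)
print(by_parity)

# A tuple key breaks ties by value, which also orders each group
by_parity_then_value = sorted(mylist, key=lambda x: (x % 2 == 0, x))
print(by_parity_then_value)
```

Tuples compare element by element, so the second item is only consulted when the parity is the same.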
So how does this all relate back to the original question, and more importantly, our intuition on how we should implement sorted() with its key argument and lambda?
That lambda function can be thought of as a pointer that points to the values we need to sort by, whether it's a pointer mapping a value to its boolean transformed by the lambda function, or if it's a particular element in a nested list, tuple, dict, etc., again determined by the lambda function.
Let's try and predict what happens when I run the following code.
In [9]: mylist = [(3, 5, 8), (6, 2, 8), (2, 9, 4), (6, 8, 5)]
In[10]: sorted(mylist, key=lambda x: x[1])
My sorted call obviously says, "Please sort this list". The key argument makes that a little more specific by saying, "for each element x in mylist, return the second index of that element, then sort all of the elements of the original list mylist by the sorted order of the list calculated by the lambda function". Since we have a list of tuples, we can return an indexed element from that tuple using the lambda function.
The pointer that will be used to sort would be:
[5, 2, 9, 8] # the second element of each tuple
Sorting this pointer list returns:
[2, 5, 8, 9]
Applying this to mylist, we get:
Out[10]: [(6, 2, 8), (3, 5, 8), (6, 8, 5), (2, 9, 4)]
# Notice the sorted pointer list is the same as the second index of each tuple in this final list
Run that code, and you'll find that this is the order. Try sorting a list of integers using this key function and you'll find that the code breaks (why? Because you cannot index an integer, of course).
This was a long-winded explanation, but I hope this helps to sort your intuition on the use of lambda functions as the key argument in sorted(), and beyond.
Very simple, you create an array containing zeros using the reference shape:
result = np.zeros(b.shape)
# actually you can also use result = np.zeros_like(b)
# but that also copies the dtype not only the shape
and then insert the array where you need it:
result[:a.shape[0],:a.shape[1]] = a
and voila you have padded it:
print(result)
array([[ 1., 1., 1., 1., 1., 0.],
[ 1., 1., 1., 1., 1., 0.],
[ 1., 1., 1., 1., 1., 0.],
[ 0., 0., 0., 0., 0., 0.]])
You can also make it a bit more general if you define where your upper left element should be inserted
result = np.zeros_like(b)
x_offset = 1 # 0 would be what you wanted
y_offset = 1 # 0 in your case
result[x_offset:a.shape[0]+x_offset,y_offset:a.shape[1]+y_offset] = a
result
array([[ 0., 0., 0., 0., 0., 0.],
[ 0., 1., 1., 1., 1., 1.],
[ 0., 1., 1., 1., 1., 1.],
[ 0., 1., 1., 1., 1., 1.]])
but then be careful that you don't have offsets bigger than allowed. For x_offset = 2, for example, this will fail.
If you have an arbitrary number of dimensions you can define a list of slices to insert the original array. I've found it interesting to play around a bit and created a padding function that can pad (with offset) an arbitrarily shaped array as long as the array and reference have the same number of dimensions and the offsets are not too big.
def pad(array, reference, offsets):
    """
    array: Array to be padded
    reference: Reference array with the desired shape
    offsets: list of offsets (number of elements must be equal to the dimension of the array)
    """
    # Create an array of zeros with the reference shape
    result = np.zeros(reference.shape)
    # Create a tuple of slices from offset to offset + shape in each dimension
    # (recent numpy versions require a tuple, not a list, for this kind of indexing)
    insertHere = tuple(slice(offsets[dim], offsets[dim] + array.shape[dim]) for dim in range(array.ndim))
    # Insert the array in the result at the specified offsets
    result[insertHere] = array
    return result
And some test cases:
import numpy as np
# 1 Dimension
a = np.ones(2)
b = np.ones(5)
offset = [3]
pad(a, b, offset)
# 3 Dimensions
a = np.ones((3,3,3))
b = np.ones((5,4,3))
offset = [1,0,0]
pad(a, b, offset)
Personally, I'd go for:
(y == 0).sum() and (y == 1).sum()
E.g.
import numpy as np
y = np.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])
num_zeros = (y == 0).sum()
num_ones = (y == 1).sum()
Adapted from the docs
# -----------------------
# ----- Toy Context -----
# -----------------------
import tensorflow as tf


class Net(tf.keras.Model):
    """A simple linear model."""

    def __init__(self):
        super(Net, self).__init__()
        self.l1 = tf.keras.layers.Dense(5)

    def call(self, x):
        return self.l1(x)


def toy_dataset():
    inputs = tf.range(10.0)[:, None]
    labels = inputs * 5.0 + tf.range(5.0)[None, :]
    return (
        tf.data.Dataset.from_tensor_slices(dict(x=inputs, y=labels)).repeat().batch(2)
    )


def train_step(net, example, optimizer):
    """Trains `net` on `example` using `optimizer`."""
    with tf.GradientTape() as tape:
        output = net(example["x"])
        loss = tf.reduce_mean(tf.abs(output - example["y"]))
    variables = net.trainable_variables
    gradients = tape.gradient(loss, variables)
    optimizer.apply_gradients(zip(gradients, variables))
    return loss
# --------------------------
# ----- Create Objects -----
# --------------------------
net = Net()
opt = tf.keras.optimizers.Adam(0.1)
dataset = toy_dataset()
iterator = iter(dataset)
ckpt = tf.train.Checkpoint(
    step=tf.Variable(1), optimizer=opt, net=net, iterator=iterator
)
manager = tf.train.CheckpointManager(ckpt, "./tf_ckpts", max_to_keep=3)
# --------------------------
# ----- Train and Save -----
# --------------------------
ckpt.restore(manager.latest_checkpoint)
if manager.latest_checkpoint:
    print("Restored from {}".format(manager.latest_checkpoint))
else:
    print("Initializing from scratch.")
for _ in range(50):
    example = next(iterator)
    loss = train_step(net, example, opt)
    ckpt.step.assign_add(1)
    if int(ckpt.step) % 10 == 0:
        save_path = manager.save()
        print("Saved checkpoint for step {}: {}".format(int(ckpt.step), save_path))
        print("loss {:1.2f}".format(loss.numpy()))
# -------------------
# ----- Restore -----
# -------------------
# In another script, reinitialize objects
opt = tf.keras.optimizers.Adam(0.1)
net = Net()
dataset = toy_dataset()
iterator = iter(dataset)
ckpt = tf.train.Checkpoint(
    step=tf.Variable(1), optimizer=opt, net=net, iterator=iterator
)
manager = tf.train.CheckpointManager(ckpt, "./tf_ckpts", max_to_keep=3)
# Reuse the manager code above ^
ckpt.restore(manager.latest_checkpoint)
if manager.latest_checkpoint:
    print("Restored from {}".format(manager.latest_checkpoint))
else:
    print("Initializing from scratch.")
for _ in range(50):
    example = next(iterator)
    # Continue training or evaluate etc.
Exhaustive and useful tutorial on saved_model -> https://www.tensorflow.org/guide/saved_model
keras detailed guide to save models -> https://www.tensorflow.org/guide/keras/save_and_serialize
Checkpoints capture the exact value of all parameters (tf.Variable objects) used by a model. Checkpoints do not contain any description of the computation defined by the model and thus are typically only useful when source code that will use the saved parameter values is available.
The SavedModel format on the other hand includes a serialized description of the computation defined by the model in addition to the parameter values (checkpoint). Models in this format are independent of the source code that created the model. They are thus suitable for deployment via TensorFlow Serving, TensorFlow Lite, TensorFlow.js, or programs in other programming languages (the C, C++, Java, Go, Rust, C# etc. TensorFlow APIs).
(Highlights are my own)
From the docs:
# Create some variables.
v1 = tf.get_variable("v1", shape=[3], initializer = tf.zeros_initializer)
v2 = tf.get_variable("v2", shape=[5], initializer = tf.zeros_initializer)
inc_v1 = v1.assign(v1+1)
dec_v2 = v2.assign(v2-1)
# Add an op to initialize the variables.
init_op = tf.global_variables_initializer()
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
# Later, launch the model, initialize the variables, do some work, and save the
# variables to disk.
with tf.Session() as sess:
    sess.run(init_op)
    # Do some work with the model.
    inc_v1.op.run()
    dec_v2.op.run()
    # Save the variables to disk.
    save_path = saver.save(sess, "/tmp/model.ckpt")
    print("Model saved in path: %s" % save_path)

tf.reset_default_graph()
# Create some variables.
v1 = tf.get_variable("v1", shape=[3])
v2 = tf.get_variable("v2", shape=[5])
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, "/tmp/model.ckpt")
    print("Model restored.")
    # Check the values of the variables
    print("v1 : %s" % v1.eval())
    print("v2 : %s" % v2.eval())
simple_save
Many good answers; for completeness I'll add my 2 cents: simple_save. Also a standalone code example using the tf.data.Dataset API.
Python 3 ; Tensorflow 1.14
import tensorflow as tf
from tensorflow.saved_model import tag_constants

with tf.Graph().as_default():
    with tf.Session() as sess:
        ...
        # Saving
        inputs = {
            "batch_size_placeholder": batch_size_placeholder,
            "features_placeholder": features_placeholder,
            "labels_placeholder": labels_placeholder,
        }
        outputs = {"prediction": model_output}
        tf.saved_model.simple_save(
            sess, "path/to/your/location/", inputs, outputs
        )
Restoring:
graph = tf.Graph()
with graph.as_default():
    with tf.Session() as sess:
        tf.saved_model.loader.load(
            sess,
            [tag_constants.SERVING],
            "path/to/your/location/",
        )
        batch_size_placeholder = graph.get_tensor_by_name("batch_size_placeholder:0")
        features_placeholder = graph.get_tensor_by_name("features_placeholder:0")
        labels_placeholder = graph.get_tensor_by_name("labels_placeholder:0")
        prediction = graph.get_tensor_by_name("dense/BiasAdd:0")
        sess.run(prediction, feed_dict={
            batch_size_placeholder: some_value,
            features_placeholder: some_other_value,
            labels_placeholder: another_value
        })
The following code generates random data for the sake of the demonstration.
- We create the placeholders and, from them, the Dataset and then its Iterator. We get the iterator's generated tensor, called input_tensor, which will serve as input to our model.
- The model itself is built from input_tensor: a GRU-based bidirectional RNN followed by a dense classifier. Because why not.
- The loss is a softmax_cross_entropy_with_logits, optimized with Adam. After 2 epochs (of 2 batches each), we save the "trained" model with tf.saved_model.simple_save. If you run the code as is, then the model will be saved in a folder called simple/ in your current working directory.
- In a new graph, we then restore the saved model with tf.saved_model.loader.load. We grab the placeholders and logits with graph.get_tensor_by_name and the Iterator initializing operation with graph.get_operation_by_name.
Code:
import os
import shutil
import numpy as np
import tensorflow as tf
from tensorflow.python.saved_model import tag_constants


def model(graph, input_tensor):
    """Create the model which consists of
    a bidirectional rnn (GRU(10)) followed by a dense classifier

    Args:
        graph (tf.Graph): Tensors' graph
        input_tensor (tf.Tensor): Tensor fed as input to the model

    Returns:
        tf.Tensor: the model's output layer Tensor
    """
    cell = tf.nn.rnn_cell.GRUCell(10)
    with graph.as_default():
        ((fw_outputs, bw_outputs), (fw_state, bw_state)) = tf.nn.bidirectional_dynamic_rnn(
            cell_fw=cell,
            cell_bw=cell,
            inputs=input_tensor,
            sequence_length=[10] * 32,
            dtype=tf.float32,
            swap_memory=True,
            scope=None)
        outputs = tf.concat((fw_outputs, bw_outputs), 2)
        mean = tf.reduce_mean(outputs, axis=1)
        dense = tf.layers.dense(mean, 5, activation=None)
        return dense
def get_opt_op(graph, logits, labels_tensor):
    """Create optimization operation from model's logits and labels

    Args:
        graph (tf.Graph): Tensors' graph
        logits (tf.Tensor): The model's output without activation
        labels_tensor (tf.Tensor): Target labels

    Returns:
        tf.Operation: the operation performing a step of Adam optimizer
    """
    with graph.as_default():
        with tf.variable_scope("loss"):
            loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
                logits=logits, labels=labels_tensor, name="xent"),
                name="meanxent"
            )
        with tf.variable_scope("optimizer"):
            opt_op = tf.train.AdamOptimizer(1e-2).minimize(loss)
        return opt_op
if __name__ == "__main__":
    # Set random seed for reproducibility
    # and create synthetic data
    np.random.seed(0)
    features = np.random.randn(64, 10, 30)
    labels = np.eye(5)[np.random.randint(0, 5, (64,))]
    graph1 = tf.Graph()
    with graph1.as_default():
        # Random seed for reproducibility
        tf.set_random_seed(0)
        # Placeholders
        batch_size_ph = tf.placeholder(tf.int64, name="batch_size_ph")
        features_data_ph = tf.placeholder(tf.float32, [None, None, 30], "features_data_ph")
        labels_data_ph = tf.placeholder(tf.int32, [None, 5], "labels_data_ph")
        # Dataset
        dataset = tf.data.Dataset.from_tensor_slices((features_data_ph, labels_data_ph))
        dataset = dataset.batch(batch_size_ph)
        iterator = tf.data.Iterator.from_structure(dataset.output_types, dataset.output_shapes)
        dataset_init_op = iterator.make_initializer(dataset, name="dataset_init")
        input_tensor, labels_tensor = iterator.get_next()
        # Model
        logits = model(graph1, input_tensor)
        # Optimization
        opt_op = get_opt_op(graph1, logits, labels_tensor)
        with tf.Session(graph=graph1) as sess:
            # Initialize variables
            tf.global_variables_initializer().run(session=sess)
            for epoch in range(3):
                batch = 0
                # Initialize dataset (could feed epochs in Dataset.repeat(epochs))
                sess.run(
                    dataset_init_op,
                    feed_dict={
                        features_data_ph: features,
                        labels_data_ph: labels,
                        batch_size_ph: 32
                    })
                values = []
                while True:
                    try:
                        if epoch < 2:
                            # Training
                            _, value = sess.run([opt_op, logits])
                            print("Epoch {}, batch {} | Sample value: {}".format(epoch, batch, value[0]))
                            batch += 1
                        else:
                            # Final inference
                            values.append(sess.run(logits))
                            print("Epoch {}, batch {} | Final inference | Sample value: {}".format(epoch, batch, values[-1][0]))
                            batch += 1
                    except tf.errors.OutOfRangeError:
                        break
            # Save model state
            print("\nSaving...")
            cwd = os.getcwd()
            path = os.path.join(cwd, "simple")
            shutil.rmtree(path, ignore_errors=True)
            inputs_dict = {
                "batch_size_ph": batch_size_ph,
                "features_data_ph": features_data_ph,
                "labels_data_ph": labels_data_ph
            }
            outputs_dict = {
                "logits": logits
            }
            tf.saved_model.simple_save(
                sess, path, inputs_dict, outputs_dict
            )
            print("Ok")
# Restoring
graph2 = tf.Graph()
with graph2.as_default():
with tf.Session(graph=graph2) as sess:
# Restore saved values
print("
Restoring...")
tf.saved_model.loader.load(
sess,
[tag_constants.SERVING],
path
)
print("Ok")
# Get restored placeholders
labels_data_ph = graph2.get_tensor_by_name("labels_data_ph:0")
features_data_ph = graph2.get_tensor_by_name("features_data_ph:0")
batch_size_ph = graph2.get_tensor_by_name("batch_size_ph:0")
# Get restored model output
restored_logits = graph2.get_tensor_by_name("dense/BiasAdd:0")
# Get dataset initializing operation
dataset_init_op = graph2.get_operation_by_name("dataset_init")
# Initialize restored dataset
sess.run(
dataset_init_op,
feed_dict={
features_data_ph: features,
labels_data_ph: labels,
batch_size_ph: 32
}
)
# Compute inference for both batches in dataset
restored_values = []
for i in range(2):
restored_values.append(sess.run(restored_logits))
print("Restored values: ", restored_values[i][0])
# Check if original inference and restored inference are equal
valid = all((v == rv).all() for v, rv in zip(values, restored_values))
print("
Inferences match: ", valid)
This will print:
$ python3 save_and_restore.py
Epoch 0, batch 0  Sample value: [0.13851789 0.3087595 0.12804556 0.20013677 0.08229901]
Epoch 0, batch 1  Sample value: [0.00555491 0.04339041 0.05111827 0.2480045 0.00107776]
Epoch 1, batch 0  Sample value: [0.19321944 0.2104792 0.00602257 0.07465433 0.11674127]
Epoch 1, batch 1  Sample value: [0.05275984 0.05981954 0.15913513 0.3244143 0.10673307]
Epoch 2, batch 0  Final inference  Sample value: [0.26331693 0.13013336 0.12553 0.04276478 0.2933622 ]
Epoch 2, batch 1  Final inference  Sample value: [0.07730117 0.11119192 0.20817074 0.35660955 0.16990358]
Saving...
INFO:tensorflow:Assets added to graph.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: b"/some/path/simple/saved_model.pb"
Ok
Restoring...
INFO:tensorflow:Restoring parameters from b"/some/path/simple/variables/variables"
Ok
Restored values: [0.26331693 0.13013336 0.12553 0.04276478 0.2933622 ]
Restored values: [0.07730117 0.11119192 0.20817074 0.35660955 0.16990358]
Inferences match: True
If you understand RMSE (root mean squared error), MSE (mean squared error), RMSD (root mean squared deviation) and RMS (root mean square), then asking for a library to calculate them for you is unnecessary over-engineering. Each of these metrics is a single line of Python code, at most 2 inches long. The four metrics RMSE, MSE, RMSD and RMS are, at their core, conceptually identical.
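As a minimal sketch of the "single line" claim, assuming NumPy is available and using made-up sample arrays `a` and `b` (not from the original answer), each metric might be written like this:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])   # hypothetical sample values
b = np.array([1.0, 2.0, 5.0])   # hypothetical sample values

mse = ((a - b) ** 2).mean()       # mean squared error
rmse = np.sqrt(mse)               # root mean squared error (also called RMSD)
rms = np.sqrt((a ** 2).mean())    # root mean square of a single list

print(mse, rmse, rms)
```

RMSE is just the square root of MSE, and RMS is the same recipe applied to one list instead of the difference between two.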
RMSE answers the question: "How similar, on average, are the numbers in list1 to list2?" The two lists must be the same size. I want to "wash out the noise between any two given elements, wash out the size of the data collected, and get a single number feel for change over time".
Imagine you are learning to throw darts at a dart board. Every day you practice for one hour. You want to figure out if you are getting better or getting worse. So every day you make 10 throws and measure the distance between the bullseye and where your dart hit.
You make a list of those numbers, list1. Use the root mean squared error between the distances at day 1 and a list2 containing all zeros. Do the same on the 2nd and nth days. What you will get is a single number that hopefully decreases over time. When your RMSE number is zero, you hit bullseyes every time. If the RMSE number goes up, you are getting worse.
import numpy as np
d = [0.000, 0.166, 0.333] #ideal target distances, these can be all zeros.
p = [0.000, 0.254, 0.998] #your performance goes here
print("d is: " + str(["%.8f" % elem for elem in d]))
print("p is: " + str(["%.8f" % elem for elem in p]))
def rmse(predictions, targets):
    return np.sqrt(((predictions - targets) ** 2).mean())
rmse_val = rmse(np.array(d), np.array(p))
print("rms error between lists d and p is: " + str(rmse_val))
Which prints:
d is: ["0.00000000", "0.16600000", "0.33300000"]
p is: ["0.00000000", "0.25400000", "0.99800000"]
rms error between lists d and p is: 0.387284994115
Glyph Legend: n is a whole positive integer representing the number of throws. i represents a whole positive integer counter that enumerates the sum. d stands for the ideal distances, the list2 containing all zeros in the above example. p stands for performance, the list1 in the above example. Superscript 2 stands for numeric squared. d_{i} is the i'th index of d. p_{i} is the i'th index of p.
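The formula that this legend describes is not reproduced in the text. Assuming it is the standard RMSE definition, written with the same glyphs, it would read:

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( d_i - p_i \right)^2}
```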
The RMSE computed in small steps so it can be understood:
def rmse(predictions, targets):
    differences = predictions - targets #the DIFFERENCEs.
    differences_squared = differences ** 2 #the SQUAREs of ^
    mean_of_differences_squared = differences_squared.mean() #the MEAN of ^
    rmse_val = np.sqrt(mean_of_differences_squared) #ROOT of ^
    return rmse_val #get the ^
Subtracting one number from another gives you the distance between them.
8 - 5 = 3 #absolute distance between 8 and 5 is +3
-20 - 10 = -30 #absolute distance between -20 and 10 is +30
If you multiply any number times itself, the result is never negative, because negative times negative is positive:
3 * 3 = 9 = positive
-30 * -30 = 900 = positive
Add them all up, but wait, then an array with many elements would have a larger error than a small array, so average them by the number of elements.
But wait, we squared them all earlier to force them positive. Undo the damage with a square root!
That leaves you with a single number that represents, on average, the distance between every value of list1 and its corresponding element value of list2.
If the RMSE value goes down over time we are happy because variance is decreasing.
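The dart-board idea can be sketched as follows. The distances in `throws_by_day` are invented for illustration (they are not measurements from the original answer), and list2 is all zeros because a perfect throw lands 0 away from the bullseye:

```python
import numpy as np

def rmse(predictions, targets):
    return np.sqrt(((predictions - targets) ** 2).mean())

# Hypothetical distances from the bullseye, 10 throws per practice day
throws_by_day = {
    1: [9.0, 7.5, 8.0, 6.0, 9.5, 7.0, 8.5, 6.5, 9.0, 7.0],
    2: [6.0, 5.5, 7.0, 4.0, 6.5, 5.0, 6.0, 4.5, 5.5, 5.0],
    3: [3.0, 2.5, 4.0, 1.0, 3.5, 2.0, 3.0, 1.5, 2.5, 2.0],
}

daily_rmse = {}
for day, distances in throws_by_day.items():
    list1 = np.array(distances)
    list2 = np.zeros_like(list1)   # a perfect day: every dart on the bullseye
    daily_rmse[day] = rmse(list1, list2)
    print("day {}: rmse = {:.3f}".format(day, daily_rmse[day]))
```

With these made-up numbers the value shrinks from day to day, which is the "getting better" signal described above.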
Root mean squared error measures the vertical distance between the point and the line, so if your data is shaped like a banana, flat near the bottom and steep near the top, then the RMSE will report greater distances to points high, but short distances to points low when in fact the distances are equivalent. This causes a skew where the line prefers to be closer to points high than low.
If this is a problem the total least squares method fixes this: https://mubaris.com/posts/linearregression
If there are nulls or infinities in either input list, then the output RMSE value is not going to make sense. There are three strategies to deal with nulls / missing values / infinities in either list: ignore that component, zero it out, or add a best guess or uniform random noise to all timesteps. Each remedy has its pros and cons depending on what your data means. In general, ignoring any component with a missing value is preferred, but this biases the RMSE toward zero, making you think performance has improved when it really hasn't. Adding random noise on a best guess could be preferred if there are lots of missing values.
In order to guarantee relative correctness of the RMSE output, you must eliminate all nulls/infinities from the input.
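A minimal sketch of the "ignore that component" strategy, assuming missing values are encoded as NaN. The helper name `rmse_ignoring_missing` is made up for this example:

```python
import numpy as np

def rmse_ignoring_missing(predictions, targets):
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    # Keep only positions where both values are finite (not NaN, not +/-inf)
    keep = np.isfinite(predictions) & np.isfinite(targets)
    if not keep.any():
        raise ValueError("no finite pairs left to compare")
    diff = predictions[keep] - targets[keep]
    return np.sqrt((diff ** 2).mean())

p = [0.0, np.nan, 0.998, np.inf]   # hypothetical performance with bad entries
d = [0.0, 0.166, 0.333, 0.5]       # hypothetical targets
print(rmse_ignoring_missing(p, d))
```

Only the first and third pairs survive the mask here, so the result describes two points rather than four. That is the bias the paragraph above warns about.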
Root mean squared error relies on all data being right, and all points are counted as equal. That means one stray point that's way out in left field is going to totally ruin the whole calculation. To handle outlier data points and dismiss their tremendous influence after a certain threshold, see robust estimators that build in a threshold for dismissal of outliers.
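To see how much one stray point matters, here is a small sketch with invented data. The median absolute error used for comparison is only one simple robust alternative chosen for illustration. It is not the thresholded robust estimator mentioned above.

```python
import numpy as np

def rmse(predictions, targets):
    return np.sqrt(((predictions - targets) ** 2).mean())

def median_abs_error(predictions, targets):
    # The median is barely affected by a few extreme values
    return np.median(np.abs(predictions - targets))

targets = np.zeros(10)
clean = np.full(10, 1.0)      # every point is 1.0 away from its target
with_outlier = clean.copy()
with_outlier[0] = 100.0       # a single stray point

print(rmse(clean, targets), rmse(with_outlier, targets))
print(median_abs_error(clean, targets), median_abs_error(with_outlier, targets))
```

One bad value out of ten moves the RMSE from 1.0 to above 30, while the median absolute error stays at 1.0.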