StackOverflow | zeros

### Answer rating: 186

We initialize a numpy array with zeros as below:

```
np.zeros((N,N+1))
```

But how do we check whether all elements in a given n*n numpy array are zero?

The method just needs to return True if all the values are indeed zero.

The other answers posted here will work, but the clearest and most efficient function to use is `numpy.any()`:

```
>>> all_zeros = not np.any(a)
```

or

```
>>> all_zeros = not a.any()
```

This is preferred over `numpy.all(a == 0)` because it uses less RAM. (It does not require the temporary array created by the `a == 0` term.) ~~Also, it is faster than `numpy.count_nonzero(a)` because it can return immediately when the first nonzero element has been found.~~

**Edit:** As @Rachel pointed out in the comments, `np.any()` no longer uses "short-circuit" logic, so you won't see a speed benefit for small arrays.
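To make the check concrete, here is a minimal sketch of both outcomes (the array values are my own example):

```python
import numpy as np

# An all-zero array is falsy under np.any(), so `not np.any(a)` is True
a = np.zeros((3, 4))
print(not np.any(a))  # True

# Any single nonzero element flips the result
a[1, 2] = 5.0
print(not np.any(a))  # False
```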

Given:

```
a = 1
b = 10
c = 100
```

How do I display a leading zero for all numbers with less than two digits?

This is the output I'm expecting:

```
01
10
100
```

I need to add leading zeros to an integer to make a string with a defined quantity of digits ($cnt). What is the best way to translate this simple function from PHP to Python:

```
function add_nulls($int, $cnt=2) {
    $int = intval($int);
    for ($i = 0; $i < ($cnt - strlen($int)); $i++)
        $nulls .= '0';
    return $nulls . $int;
}
```

Is there a function that can do this?
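Yes. A direct sketch of the PHP function using `str.zfill`, which pads with leading zeros to a requested width (`'%02d' % n` and `f'{n:02d}'` are equivalent format-based spellings):

```python
def add_nulls(n, cnt=2):
    # str.zfill left-pads the string form of the number with zeros
    return str(int(n)).zfill(cnt)

print(add_nulls(1))    # 01
print(add_nulls(10))   # 10
print(add_nulls(100))  # 100
```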

How can I create a `list` which contains only zeros? I want to be able to create a zeros `list` for each `int` in `range(10)`.

For example, if the `int` in the range was `4` I will get:

```
[0, 0, 0, 0]
```

and for `7`:

```
[0, 0, 0, 0, 0, 0, 0]
```

I am trying to build a histogram of counts... so I create buckets. I know I could just go through and append a bunch of zeros i.e something along these lines:

```
buckets = []
for i in xrange(0, 100):
    buckets.append(0)
```

Is there a more elegant way to do it? I feel like there should be a way to just declare an array of a certain size.

I know numpy has `numpy.zeros`, but I want the more general solution.
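For reference, the usual idiom is list multiplication, a sketch of which follows (safe here because `0` is immutable; for mutable elements a comprehension is needed instead):

```python
# Repeat the element to declare a list of a fixed size
buckets = [0] * 100
print(len(buckets))  # 100

# One zeros-list per int in range(10)
zero_lists = [[0] * n for n in range(10)]
print(zero_lists[4])  # [0, 0, 0, 0]
print(zero_lists[7])  # [0, 0, 0, 0, 0, 0, 0]
```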

How can I format a float so that it doesn't contain trailing zeros? In other words, I want the resulting string to be as short as possible.

For example:

```
3 -> "3"
3. -> "3"
3.0 -> "3"
3.1 -> "3.1"
3.14 -> "3.14"
3.140 -> "3.14"
```
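One way to get this mapping is a small helper (a sketch; the function name is mine) that formats at fixed precision and then strips trailing zeros and any dangling decimal point:

```python
def strip_trailing_zeros(x):
    # "%f" gives six decimal places; rstrip("0") drops the trailing
    # zeros and rstrip(".") drops a now-dangling decimal point
    return ("%f" % x).rstrip("0").rstrip(".")

for value in (3, 3.0, 3.1, 3.14, 3.140):
    print(strip_trailing_zeros(value))
# 3, 3, 3.1, 3.14, 3.14
```

Note that `3.` and `3.140` are the same float objects as `3.0` and `3.14`, so they cannot format differently.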

I'm trying to convert an integer to binary using the bin() function in Python. However, it always removes the leading zeros, which I actually need, such that the result is always 8-bit:

Example:

```
bin(1) -> 0b1
# What I would like:
bin(1) -> 0b00000001
```

Is there a way of doing this?
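Yes, the format spec mini-language handles this; a sketch of three equivalent spellings:

```python
n = 1
# '08b' means: binary, zero-padded to 8 digits
print(format(n, "08b"))            # 00000001
print(f"0b{n:08b}")                # 0b00000001
# zfill on the bin() output also works
print("0b" + bin(n)[2:].zfill(8))  # 0b00000001
```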

I have several alphanumeric strings like these

```
listOfNum = ["000231512-n","1209123100000-n00000","alphanumeric0000", "000alphanumeric"]
```

The desired output for removing **trailing** zeros would be:

```
listOfNum = ["000231512-n","1209123100000-n","alphanumeric", "000alphanumeric"]
```

The desired output for removing **leading** zeros would be:

```
listOfNum = ["231512-n","1209123100000-n00000","alphanumeric0000", "alphanumeric"]
```

The desired output for removing both leading and trailing zeros would be:

```
listOfNum = ["231512-n","1209123100000-n", "alphanumeric", "alphanumeric"]
```

For now I've been doing it the following way; please suggest a better way if there is one:

```
listOfNum = ["000231512-n", "1209123100000-n00000", "alphanumeric0000",
             "000alphanumeric"]
trailingremoved = []
leadingremoved = []
bothremoved = []
# Remove trailing
for i in listOfNum:
    while i[-1] == "0":
        i = i[:-1]
    trailingremoved.append(i)
# Remove leading
for i in listOfNum:
    while i[0] == "0":
        i = i[1:]
    leadingremoved.append(i)
# Remove both
for i in listOfNum:
    while i[0] == "0":
        i = i[1:]
    while i[-1] == "0":
        i = i[:-1]
    bothremoved.append(i)
```
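A shorter approach (a sketch): `str.strip` and its one-sided variants accept a set of characters to remove from the ends, which replaces the while-loops entirely:

```python
listOfNum = ["000231512-n", "1209123100000-n00000", "alphanumeric0000",
             "000alphanumeric"]

# rstrip/lstrip/strip remove *all* matching characters from the ends
trailingremoved = [s.rstrip("0") for s in listOfNum]
leadingremoved = [s.lstrip("0") for s in listOfNum]
bothremoved = [s.strip("0") for s in listOfNum]

print(trailingremoved)
# ['000231512-n', '1209123100000-n', 'alphanumeric', '000alphanumeric']
```

Unlike the loops above, this also handles empty strings without raising an `IndexError`.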

I can use `pandas` `dropna()` functionality to remove rows with some or all columns set as `NA`'s. Is there an equivalent function for dropping rows with all columns having value 0?

```
P   kt   b    tt   mky  depth
1   0    0    0    0    0
2   0    0    0    0    0
3   0    0    0    0    0
4   0    0    0    0    0
5   1.1  3    4.5  2.3  9.0
```

In this example, we would like to drop the first 4 rows from the data frame.

thanks!
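One way to do this (a sketch, using a hypothetical frame mirroring the example): compare against 0 and keep only rows where `any` column is nonzero:

```python
import pandas as pd

# Hypothetical frame shaped like the example above
df = pd.DataFrame({
    "kt":    [0, 0, 0, 0, 1.1],
    "b":     [0, 0, 0, 0, 3.0],
    "tt":    [0, 0, 0, 0, 4.5],
    "mky":   [0, 0, 0, 0, 2.3],
    "depth": [0, 0, 0, 0, 9.0],
}, index=[1, 2, 3, 4, 5])

# Keep rows where at least one column is nonzero
df = df.loc[(df != 0).any(axis=1)]
print(df)  # only row 5 remains
```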

I want to know how I can pad a 2D numpy array with zeros using python 2.6.6 with numpy version 1.5.0. But these are my limitations, therefore I cannot use `np.pad`. For example, I want to pad `a` with zeros such that its shape matches `b`. The reason why I want to do this is so I can do:

```
b-a
```

such that

```
>>> a
array([[ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.]])
>>> b
array([[ 3.,  3.,  3.,  3.,  3.,  3.],
       [ 3.,  3.,  3.,  3.,  3.,  3.],
       [ 3.,  3.,  3.,  3.,  3.,  3.],
       [ 3.,  3.,  3.,  3.,  3.,  3.]])
>>> c
array([[1, 1, 1, 1, 1, 0],
       [1, 1, 1, 1, 1, 0],
       [1, 1, 1, 1, 1, 0],
       [0, 0, 0, 0, 0, 0]])
```

The only way I can think of doing this is appending; however, this seems pretty ugly. Is there a cleaner solution, possibly using `b.shape`?

Edit: Thank you for MSeifert's answer. I had to clean it up a bit, and this is what I got:

```
def pad(array, reference_shape, offsets):
    """
    array: Array to be padded
    reference_shape: tuple of size of ndarray to create
    offsets: list of offsets (number of elements must be equal to the dimension of the array)
    will throw a ValueError if offsets is too big and the reference_shape cannot handle the offsets
    """
    # Create an array of zeros with the reference shape
    result = np.zeros(reference_shape)
    # Create a list of slices from offset to offset + shape in each dimension
    insertHere = [slice(offsets[dim], offsets[dim] + array.shape[dim]) for dim in range(array.ndim)]
    # Insert the array in the result at the specified offsets
    # (indexing with a tuple of slices; a plain list is deprecated)
    result[tuple(insertHere)] = array
    return result
```


If you like ascii art:

`"VALID"` = without padding:

```
   inputs:      1  2  3  4  5  6  7  8  9  10 11 (12 13)
               |________________|                 dropped
                              |_________________|
```

`"SAME"` = with zero padding:

```
            pad|                                      |pad
   inputs:   0 |1  2  3  4  5  6  7  8  9  10 11 12 13|0  0
               |________________|
                              |_________________|
                                             |________________|
```

In this example:

- Input width = 13
- Filter width = 6
- Stride = 5

Notes:

- `"VALID"` only ever drops the right-most columns (or bottom-most rows).
- `"SAME"` tries to pad evenly left and right, but if the amount of columns to be added is odd, it will add the extra column to the right, as is the case in this example (the same logic applies vertically: there may be an extra row of zeros at the bottom).

**Edit**:

About the name:

- With `"SAME"` padding, if you use a stride of 1, the layer's outputs will have the **same** spatial dimensions as its inputs.
- With `"VALID"` padding, there's no "made-up" padding inputs. The layer only uses **valid** input data.
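The example above can be checked with the standard output-width formulas (a sketch; `conv_output_width` is my own helper, not a TensorFlow API):

```python
def conv_output_width(input_w, filter_w, stride, padding):
    # "VALID": no padding; windows that don't fully fit are dropped
    if padding == "VALID":
        return (input_w - filter_w) // stride + 1
    # "SAME": zero-pad so the output covers every input position
    return -(-input_w // stride)  # ceiling division

print(conv_output_width(13, 6, 5, "VALID"))  # 2
print(conv_output_width(13, 6, 5, "SAME"))   # 3
```

These match the ascii art: two windows for `"VALID"` (positions 12-13 dropped) and three for `"SAME"`.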

Your array `a` defines the columns of the nonzero elements in the output array. You need to also define the rows and then use fancy indexing:

```
>>> a = np.array([1, 0, 3])
>>> b = np.zeros((a.size, a.max()+1))
>>> b[np.arange(a.size), a] = 1
>>> b
array([[ 0.,  1.,  0.,  0.],
       [ 1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.]])
```

If your main goal is to visualize the correlation matrix, rather than creating a plot per se, the convenient `pandas` styling options are a viable built-in solution:

```
import pandas as pd
import numpy as np
rs = np.random.RandomState(0)
df = pd.DataFrame(rs.rand(10, 10))
corr = df.corr()
corr.style.background_gradient(cmap="coolwarm")
# "RdBu_r", "BrBG_r", & PuOr_r are other good diverging colormaps
```

Note that this needs to be in a backend that supports rendering HTML, such as the JupyterLab Notebook.

You can easily limit the digit precision:

```
corr.style.background_gradient(cmap="coolwarm").set_precision(2)
```

Or get rid of the digits altogether if you prefer the matrix without annotations:

```
corr.style.background_gradient(cmap="coolwarm").set_properties(**{"font-size": "0pt"})
```

The styling documentation also includes instructions of more advanced styles, such as how to change the display of the cell the mouse pointer is hovering over.

In my testing, `style.background_gradient()` was 4x faster than `plt.matshow()` and 120x faster than `sns.heatmap()` with a 10x10 matrix. Unfortunately it doesn't scale as well as `plt.matshow()`: the two take about the same time for a 100x100 matrix, and `plt.matshow()` is 10x faster for a 1000x1000 matrix.

There are a few possible ways to save the stylized dataframe:

- Return the HTML by appending the `render()` method and then write the output to a file.
- Save as an `.xlsx` file with conditional formatting by appending the `to_excel()` method.
- Combine with imgkit to save a bitmap.
- Take a screenshot (like I have done here).

By setting `axis=None`, it is now possible to compute the colors based on the entire matrix rather than per column or per row:

```
corr.style.background_gradient(cmap="coolwarm", axis=None)
```

Since many people are reading this answer I thought I would add a tip for how to only show one corner of the correlation matrix. I find this easier to read myself, since it removes the redundant information.

```
# Fill diagonal and upper half with NaNs
mask = np.zeros_like(corr, dtype=bool)
mask[np.triu_indices_from(mask)] = True
corr[mask] = np.nan
(corr
.style
.background_gradient(cmap="coolwarm", axis=None, vmin=-1, vmax=1)
.highlight_null(null_color="#f1f1f1") # Color NaNs grey
.set_precision(2))
```

In numpy v1.7+, you can take advantage of the "where" option for ufuncs. You can do things in one line and you don't have to deal with the errstate context manager.

```
>>> a = np.array([-1, 0, 1, 2, 3], dtype=float)
>>> b = np.array([ 0, 0, 0, 2, 2], dtype=float)
# If you don't pass `out` the indices where (b == 0) will be uninitialized!
>>> c = np.divide(a, b, out=np.zeros_like(a), where=b!=0)
>>> print(c)
[ 0.   0.   0.   1.   1.5]
```

In this case, it does the divide calculation anywhere "where" b does not equal zero. When b does equal zero, then it remains unchanged from whatever value you originally gave it in the "out" argument.

You need to pass a `bytes-like` object (`bytes`, `bytearray`, etc.) to the `base64.b64encode()` method. Here are two ways:

```
>>> import base64
>>> data = base64.b64encode(b"data to be encoded")
>>> print(data)
b'ZGF0YSB0byBiZSBlbmNvZGVk'
```

Or with a variable:

```
>>> import base64
>>> string = "data to be encoded"
>>> data = base64.b64encode(string.encode())
>>> print(data)
b'ZGF0YSB0byBiZSBlbmNvZGVk'
```

In Python 3, `str` objects are not C-style character arrays (so they are **not** byte arrays), but rather, they are data structures that do not have any inherent encoding. You can encode that string (or interpret it) in a variety of ways. The most common (and default in Python 3) is utf-8, especially since it is backwards compatible with ASCII (although, as are most widely-used encodings). That is what is happening when you take a `string` and call the `.encode()` method on it: Python is interpreting the string in utf-8 (the default encoding) and providing you the array of bytes that it corresponds to.

Originally the question title asked about Base-64 encoding. Read on for Base-64 stuff.

`base64` encoding takes 6-bit binary chunks and encodes them using the characters A-Z, a-z, 0-9, "+", "/", and "=" (some encodings use different characters in place of "+" and "/"). This is a character encoding that is based off of the mathematical construct of radix-64 or base-64 number system, but they are very different. Base-64 in math is a number system like binary or decimal, and you do this change of radix on the entire number, or (if the radix you're converting from is a power of 2 less than 64) in chunks from right to left.

In `base64` encoding, the translation is done from left to right; those first 64 characters are why it is called `base64` **encoding**. The 65th "=" symbol is used for padding, since the encoding pulls 6-bit chunks but the data it is usually meant to encode are 8-bit bytes, so sometimes there are only two or four bits in the last chunk.

Example:

```
>>> data = b"test"
>>> for byte in data:
...     print(format(byte, "08b"), end=" ")
...
01110100 01100101 01110011 01110100
>>>
```

If you interpret that binary data as a single integer, then this is how you would convert it to base-10 and base-64 (table for base-64):

```
base-2:   01 110100 011001 010111 001101 110100   (base-64 grouping shown)
base-10:  1952805748
base-64:  B  0      Z      X      N      0
```

`base64`

**encoding**, however, will re-group this data thusly:

```
base-2:   011101 000110 010101 110011 011101 00(0000)  <- pad w/zeros to make a clean 6-bit chunk
base-10:  29     6      21     51     29     0
base-64:  d      G      V      z      d      A
```

So, "B0ZXN0" is the base-64 version of our binary, mathematically speaking. However, `base64` **encoding** has to do the encoding in the opposite direction (so the raw data is converted to "dGVzdA") and also has a rule to tell other applications how much space is left off at the end. This is done by padding the end with "=" symbols. So, the `base64` encoding of this data is "dGVzdA==", with two "=" symbols to signify two pairs of bits will need to be removed from the end when this data gets decoded to make it match the original data.

Let's test this to see if I am being dishonest:

```
>>> encoded = base64.b64encode(data)
>>> print(encoded)
b'dGVzdA=='
```

**Why use `base64` encoding?**

Let's say I have to send some data to someone via email, like this data:

```
>>> data = b"\x04\x6d\x73\x67\x08\x08\x08\x20\x20\x20"
>>> print(data.decode())
>>> print(data)
b'\x04msg\x08\x08\x08   '
>>>
```

There are two problems I planted:

- If I tried to send that email in Unix, the email would send as soon as the `\x04` character was read, because that is ASCII for `END-OF-TRANSMISSION` (Ctrl-D), so the remaining data would be left out of the transmission.
- Also, while Python is smart enough to escape all of my evil control characters when I print the data directly, when that string is decoded as ASCII, you can see that the "msg" is not there. That is because I used three `BACKSPACE` characters and three `SPACE` characters to erase the "msg". Thus, even if I didn't have the `EOF` character there, the end user wouldn't be able to translate from the text on screen to the real, raw data.

This is just a demo to show you how hard it can be to simply send raw data. Encoding the data into base64 format gives you the exact same data but in a format that ensures it is safe for sending over electronic media such as email.
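To make the round trip concrete, here is a quick sketch with the bytes from the demo: the encoded form contains only printable ASCII, and decoding recovers the original data exactly:

```python
import base64

# The problematic raw bytes from above, with real escape sequences
data = b"\x04msg\x08\x08\x08   "
encoded = base64.b64encode(data)   # only safe, printable ASCII characters
decoded = base64.b64decode(encoded)

print(encoded)
print(decoded == data)  # True: the round trip is lossless
```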

I think all of the answers here cover the core of what the lambda function does in the context of sorted() quite nicely, however I still feel like a description that leads to an intuitive understanding is lacking, so here is my two cents.

For the sake of completeness, I'll state the obvious up front: sorted() returns a list of sorted elements, and if we want to sort in a particular way or if we want to sort a complex list of elements (e.g. nested lists or a list of tuples) we can invoke the key argument.

For me, the intuitive understanding of the key argument, why it has to be callable, and the use of lambda as the (anonymous) callable function to accomplish this comes in two parts.

- Using lambda ultimately means you don't have to write (define) an entire function, like the one *sblom* provided an example of. Lambda functions are created, used, and immediately destroyed - so they don't funk up your code with more code that will only ever be used once. This, as I understand it, is the core utility of the lambda function and its application for such a role is broad. Its syntax is purely a convention, which is in essence the nature of programmatic syntax in general. Learn the syntax and be done with it.

Lambda syntax is as follows:

```
lambda input_variable(s): tasty one liner
```

where `lambda` is a python keyword.

e.g.

```
In [1]: f00 = lambda x: x/2
In [2]: f00(10)
Out[2]: 5.0
In [3]: (lambda x: x/2)(10)
Out[3]: 5.0
In [4]: (lambda x, y: x / y)(10, 2)
Out[4]: 5.0
In [5]: (lambda: "amazing lambda")() # func with no args!
Out[5]: 'amazing lambda'
```

- The idea behind the `key` argument is that it should take in a set of instructions that will essentially point the "sorted()" function at those list elements which should be used to sort by. When it says `key=`, what it really means is: As I iterate through the list, one element at a time (i.e. `for e in some_list`), I'm going to pass the current element to the function specified by the key argument and use that to create a transformed list which will inform me on the order of the final sorted list.

Check it out:

```
In [6]: mylist = [3, 6, 3, 2, 4, 8, 23] # an example list
# sorted(mylist, key=HowToSort) # what we will be doing
```

Base example:

```
# mylist = [3, 6, 3, 2, 4, 8, 23]
In [7]: sorted(mylist)
Out[7]: [2, 3, 3, 4, 6, 8, 23]
# all numbers are in ascending order (i.e.from low to high).
```

Example 1:

```
# mylist = [3, 6, 3, 2, 4, 8, 23]
In [8]: sorted(mylist, key=lambda x: x % 2 == 0)
# Quick Tip: The % operator returns the *remainder* of a division
# operation. So the key lambda function here is saying "return True
# if x divided by 2 leaves a remainder of 0, else False". This is a
# typical way to check if a number is even or odd.
Out[8]: [3, 3, 23, 6, 2, 4, 8]
# Does this sorted result make intuitive sense to you?
```

Notice that my lambda function told `sorted` to check if each element `e` was even or odd before sorting.

**BUT WAIT!** You may (or perhaps should) be wondering two things.

First, why are the odd numbers coming before the even numbers? After all, the key value seems to be telling the `sorted` function to prioritize evens by using the `mod` operator in `x % 2 == 0`.

Second, why are the even numbers still out of order? 2 comes before 6, right?

By analyzing this result, we'll learn something deeper about how the "key" argument really works, especially in conjunction with the anonymous lambda function.

Firstly, you'll notice that while the odds come before the evens, the evens themselves are not sorted. Why is this? Let's read the docs:

Key Functions: Starting with Python 2.4, both list.sort() and sorted() added a key parameter to specify a function to be called on each list element prior to making comparisons.

We have to do a little bit of reading between the lines here, but what this tells us is that the sort function is only called once, and if we specify the key argument, then we sort by the value that key function points us to.

So what does the example using a modulo return? A boolean value: `True == 1`, `False == 0`. So how does sorted deal with this key? It basically transforms the original list to a sequence of 1s and 0s.

`[3, 6, 3, 2, 4, 8, 23]` becomes `[0, 1, 0, 1, 1, 1, 0]`

Now we're getting somewhere. What do you get when you sort the transformed list?

`[0, 0, 0, 1, 1, 1, 1]`

Okay, so now we know why the odds come before the evens. But the next question is: Why does the 6 still come before the 2 in my final list? Well that's easy - it is because sorting only happens once! **Those 1s still represent the original list values, which are in their original positions relative to each other**. Since sorting only happens once, and we don't call any kind of sort function to order the original even numbers from low to high, those values remain in their original order relative to one another.

The final question is then this: How do I think conceptually about how the order of my boolean values get transformed back in to the original values when I print out the final sorted list?

Sorted() is a built-in method that (fun fact) uses a hybrid sorting algorithm called Timsort that combines aspects of merge sort and insertion sort. It seems clear to me that when you call it, there is a mechanic that holds these values in memory and bundles them with their boolean identity (mask) determined by (...!) **the lambda function**. The order is determined by their boolean identity calculated from the lambda function, but keep in mind that these sublists (of ones and zeros) are not themselves sorted by their original values. Hence, the final list, while organized by Odds and Evens, is not sorted by sublist (the evens in this case are out of order). The fact that the odds are ordered is because they were already in order by coincidence in the original list. The takeaway from all this is that when lambda does that transformation, the original order of the sublists is retained.

So how does this all relate back to the original question, and more importantly, our intuition on how we should implement sorted() with its key argument and lambda?

That lambda function can be thought of as a pointer that points to the values we need to sort by, whether it's a pointer mapping a value to its boolean transformed by the lambda function, or if it's a particular element in a nested list, tuple, dict, etc., again determined by the lambda function.

Let's try to predict what happens when I run the following code.

```
In [9]: mylist = [(3, 5, 8), (6, 2, 8), (2, 9, 4), (6, 8, 5)]
In [10]: sorted(mylist, key=lambda x: x[1])
```

My `sorted` call obviously says, "Please sort this list". The key argument makes that a little more specific by saying, "for each element `x` in `mylist`, return the second index of that element", then sort all of the elements of the original list `mylist` by the sorted order of the list calculated by the lambda function. Since we have a list of tuples, we can return an indexed element from that tuple using the lambda function.

The pointer that will be used to sort would be:

```
[5, 2, 9, 8] # the second element of each tuple
```

Sorting this pointer list returns:

```
[2, 5, 8, 9]
```

Applying this to `mylist`, we get:

```
Out[10]: [(6, 2, 8), (3, 5, 8), (6, 8, 5), (2, 9, 4)]
# Notice the sorted pointer list is the same as the second index of each tuple in this final list
```

Run that code, and you'll find that this is the order. Try sorting a list of integers using this key function and you'll find that the code breaks (why? Because you cannot index an integer of course).

This was a long winded explanation, but I hope this helps to `sort` your intuition on the use of `lambda` functions - as the key argument in sorted(), and beyond.

Very simple, you create an array containing zeros using the reference shape:

```
result = np.zeros(b.shape)
# actually you can also use result = np.zeros_like(b)
# but that also copies the dtype not only the shape
```

and then insert the array where you need it:

```
result[:a.shape[0],:a.shape[1]] = a
```

and voila you have padded it:

```
print(result)
array([[ 1., 1., 1., 1., 1., 0.],
[ 1., 1., 1., 1., 1., 0.],
[ 1., 1., 1., 1., 1., 0.],
[ 0., 0., 0., 0., 0., 0.]])
```

You can also make it a bit more general if you define where your upper left element should be inserted

```
result = np.zeros_like(b)
x_offset = 1 # 0 would be what you wanted
y_offset = 1 # 0 in your case
result[x_offset:a.shape[0]+x_offset,y_offset:a.shape[1]+y_offset] = a
result
array([[ 0., 0., 0., 0., 0., 0.],
[ 0., 1., 1., 1., 1., 1.],
[ 0., 1., 1., 1., 1., 1.],
[ 0., 1., 1., 1., 1., 1.]])
```

but then be careful that you don't have offsets bigger than allowed. For `x_offset = 2`, for example, this will fail.

If you have an arbitrary number of dimensions you can define a list of slices to insert the original array. I've found it interesting to play around a bit and created a padding function that can pad (with offset) an arbitrarily shaped array as long as the array and reference have the same number of dimensions and the offsets are not too big.

```
def pad(array, reference, offsets):
    """
    array: Array to be padded
    reference: Reference array with the desired shape
    offsets: list of offsets (number of elements must be equal to the dimension of the array)
    """
    # Create an array of zeros with the reference shape
    result = np.zeros(reference.shape)
    # Create a list of slices from offset to offset + shape in each dimension
    insertHere = [slice(offsets[dim], offsets[dim] + array.shape[dim]) for dim in range(array.ndim)]
    # Insert the array in the result at the specified offsets
    result[tuple(insertHere)] = array
    return result
```

And some test cases:

```
import numpy as np
# 1 Dimension
a = np.ones(2)
b = np.ones(5)
offset = [3]
pad(a, b, offset)
# 3 Dimensions
a = np.ones((3,3,3))
b = np.ones((5,4,3))
offset = [1,0,0]
pad(a, b, offset)
```

Personally, I'd go for: `(y == 0).sum()` and `(y == 1).sum()`
E.g.

```
import numpy as np
y = np.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])
num_zeros = (y == 0).sum()
num_ones = (y == 1).sum()
```
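For a strictly 0/1 array, an equivalent sketch uses `np.count_nonzero`, which skips building the boolean mask entirely:

```python
import numpy as np

y = np.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])
# Every nonzero entry is a 1 here, so count_nonzero counts the ones
num_ones = np.count_nonzero(y)
num_zeros = y.size - num_ones
print(num_zeros, num_ones)  # 8 4
```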

Adapted from the docs

```
# -------------------------
# -----  Toy Context  -----
# -------------------------
import tensorflow as tf


class Net(tf.keras.Model):
    """A simple linear model."""

    def __init__(self):
        super(Net, self).__init__()
        self.l1 = tf.keras.layers.Dense(5)

    def call(self, x):
        return self.l1(x)


def toy_dataset():
    inputs = tf.range(10.0)[:, None]
    labels = inputs * 5.0 + tf.range(5.0)[None, :]
    return (
        tf.data.Dataset.from_tensor_slices(dict(x=inputs, y=labels)).repeat().batch(2)
    )


def train_step(net, example, optimizer):
    """Trains `net` on `example` using `optimizer`."""
    with tf.GradientTape() as tape:
        output = net(example["x"])
        loss = tf.reduce_mean(tf.abs(output - example["y"]))
    variables = net.trainable_variables
    gradients = tape.gradient(loss, variables)
    optimizer.apply_gradients(zip(gradients, variables))
    return loss


# ----------------------------
# -----  Create Objects  -----
# ----------------------------
net = Net()
opt = tf.keras.optimizers.Adam(0.1)
dataset = toy_dataset()
iterator = iter(dataset)
ckpt = tf.train.Checkpoint(
    step=tf.Variable(1), optimizer=opt, net=net, iterator=iterator
)
manager = tf.train.CheckpointManager(ckpt, "./tf_ckpts", max_to_keep=3)

# ----------------------------
# -----  Train and Save  -----
# ----------------------------
ckpt.restore(manager.latest_checkpoint)
if manager.latest_checkpoint:
    print("Restored from {}".format(manager.latest_checkpoint))
else:
    print("Initializing from scratch.")

for _ in range(50):
    example = next(iterator)
    loss = train_step(net, example, opt)
    ckpt.step.assign_add(1)
    if int(ckpt.step) % 10 == 0:
        save_path = manager.save()
        print("Saved checkpoint for step {}: {}".format(int(ckpt.step), save_path))
        print("loss {:1.2f}".format(loss.numpy()))

# ---------------------
# -----  Restore  -----
# ---------------------
# In another script, re-initialize objects
opt = tf.keras.optimizers.Adam(0.1)
net = Net()
dataset = toy_dataset()
iterator = iter(dataset)
ckpt = tf.train.Checkpoint(
    step=tf.Variable(1), optimizer=opt, net=net, iterator=iterator
)
manager = tf.train.CheckpointManager(ckpt, "./tf_ckpts", max_to_keep=3)

# Re-use the manager code above ^
ckpt.restore(manager.latest_checkpoint)
if manager.latest_checkpoint:
    print("Restored from {}".format(manager.latest_checkpoint))
else:
    print("Initializing from scratch.")

for _ in range(50):
    example = next(iterator)
    # Continue training or evaluate etc.
```

- exhaustive and useful tutorial on `saved_model` -> https://www.tensorflow.org/guide/saved_model
- detailed guide to save `keras` models -> https://www.tensorflow.org/guide/keras/save_and_serialize

Checkpoints capture the exact value of all parameters (tf.Variable objects) used by a model. Checkpoints do not contain any description of the computation defined by the model and thus are typically only useful when source code that will use the saved parameter values is available.

The SavedModel format on the other hand includes a serialized description of the computation defined by the model in addition to the parameter values (checkpoint). Models in this format are independent of the source code that created the model. They are thus suitable for deployment via TensorFlow Serving, TensorFlow Lite, TensorFlow.js, or programs in other programming languages (the C, C++, Java, Go, Rust, C#, etc. TensorFlow APIs).

(Highlights are my own)

From the docs:

```
# Create some variables.
v1 = tf.get_variable("v1", shape=[3], initializer = tf.zeros_initializer)
v2 = tf.get_variable("v2", shape=[5], initializer = tf.zeros_initializer)
inc_v1 = v1.assign(v1+1)
dec_v2 = v2.assign(v2-1)
# Add an op to initialize the variables.
init_op = tf.global_variables_initializer()
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
# Later, launch the model, initialize the variables, do some work, and save the
# variables to disk.
with tf.Session() as sess:
    sess.run(init_op)
    # Do some work with the model.
    inc_v1.op.run()
    dec_v2.op.run()
    # Save the variables to disk.
    save_path = saver.save(sess, "/tmp/model.ckpt")
    print("Model saved in path: %s" % save_path)
```

```
tf.reset_default_graph()
# Create some variables.
v1 = tf.get_variable("v1", shape=[3])
v2 = tf.get_variable("v2", shape=[5])
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, "/tmp/model.ckpt")
    print("Model restored.")
    # Check the values of the variables
    print("v1 : %s" % v1.eval())
    print("v2 : %s" % v2.eval())
```

**`simple_save`**

Many good answers; for completeness I'll add my 2 cents: **simple_save**. Also a standalone code example using the `tf.data.Dataset` API.

Python 3; TensorFlow **1.14**

```
import tensorflow as tf
from tensorflow.saved_model import tag_constants

with tf.Graph().as_default():
    with tf.Session() as sess:
        ...

        # Saving
        inputs = {
            "batch_size_placeholder": batch_size_placeholder,
            "features_placeholder": features_placeholder,
            "labels_placeholder": labels_placeholder,
        }
        outputs = {"prediction": model_output}
        tf.saved_model.simple_save(
            sess, "path/to/your/location/", inputs, outputs
        )
```

Restoring:

```
graph = tf.Graph()
with graph.as_default():
    with tf.Session() as sess:
        tf.saved_model.loader.load(
            sess,
            [tag_constants.SERVING],
            "path/to/your/location/",
        )
        batch_size_placeholder = graph.get_tensor_by_name("batch_size_placeholder:0")
        features_placeholder = graph.get_tensor_by_name("features_placeholder:0")
        labels_placeholder = graph.get_tensor_by_name("labels_placeholder:0")
        prediction = graph.get_tensor_by_name("dense/BiasAdd:0")

        sess.run(prediction, feed_dict={
            batch_size_placeholder: some_value,
            features_placeholder: some_other_value,
            labels_placeholder: another_value
        })
```

The following code generates random data for the sake of the demonstration.

- We start by creating the placeholders. They will hold the data at runtime. From them, we create the `Dataset` and then its `Iterator`. We get the iterator's generated tensor, called `input_tensor`, which will serve as input to our model.
- The model itself is built from `input_tensor`: a GRU-based bidirectional RNN followed by a dense classifier. Because why not.
- The loss is a `softmax_cross_entropy_with_logits`, optimized with `Adam`. After 2 epochs (of 2 batches each), we save the "trained" model with `tf.saved_model.simple_save`. If you run the code as is, then the model will be saved in a folder called `simple/` in your current working directory.
- In a new graph, we then restore the saved model with `tf.saved_model.loader.load`. We grab the placeholders and logits with `graph.get_tensor_by_name` and the `Iterator` initializing operation with `graph.get_operation_by_name`.
- Lastly we run an inference for both batches in the dataset, and check that the saved and restored model both yield the same values. They do!

Code:

```
import os
import shutil
import numpy as np
import tensorflow as tf
from tensorflow.python.saved_model import tag_constants
def model(graph, input_tensor):
    """Create the model which consists of
    a bidirectional rnn (GRU(10)) followed by a dense classifier

    Args:
        graph (tf.Graph): Tensors' graph
        input_tensor (tf.Tensor): Tensor fed as input to the model

    Returns:
        tf.Tensor: the model's output layer Tensor
    """
    cell = tf.nn.rnn_cell.GRUCell(10)
    with graph.as_default():
        ((fw_outputs, bw_outputs), (fw_state, bw_state)) = tf.nn.bidirectional_dynamic_rnn(
            cell_fw=cell,
            cell_bw=cell,
            inputs=input_tensor,
            sequence_length=[10] * 32,
            dtype=tf.float32,
            swap_memory=True,
            scope=None)
        outputs = tf.concat((fw_outputs, bw_outputs), 2)
        mean = tf.reduce_mean(outputs, axis=1)
        dense = tf.layers.dense(mean, 5, activation=None)
    return dense


def get_opt_op(graph, logits, labels_tensor):
    """Create optimization operation from model's logits and labels

    Args:
        graph (tf.Graph): Tensors' graph
        logits (tf.Tensor): The model's output without activation
        labels_tensor (tf.Tensor): Target labels

    Returns:
        tf.Operation: the operation performing a step of the Adam optimizer
    """
    with graph.as_default():
        with tf.variable_scope("loss"):
            loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
                logits=logits, labels=labels_tensor, name="xent"),
                name="mean-xent"
            )
        with tf.variable_scope("optimizer"):
            opt_op = tf.train.AdamOptimizer(1e-2).minimize(loss)
    return opt_op


if __name__ == "__main__":
    # Set random seed for reproducibility
    # and create synthetic data
    np.random.seed(0)
    features = np.random.randn(64, 10, 30)
    labels = np.eye(5)[np.random.randint(0, 5, (64,))]

    graph1 = tf.Graph()
    with graph1.as_default():
        # Random seed for reproducibility
        tf.set_random_seed(0)
        # Placeholders
        batch_size_ph = tf.placeholder(tf.int64, name="batch_size_ph")
        features_data_ph = tf.placeholder(tf.float32, [None, None, 30], "features_data_ph")
        labels_data_ph = tf.placeholder(tf.int32, [None, 5], "labels_data_ph")
        # Dataset
        dataset = tf.data.Dataset.from_tensor_slices((features_data_ph, labels_data_ph))
        dataset = dataset.batch(batch_size_ph)
        iterator = tf.data.Iterator.from_structure(dataset.output_types, dataset.output_shapes)
        dataset_init_op = iterator.make_initializer(dataset, name="dataset_init")
        input_tensor, labels_tensor = iterator.get_next()

        # Model
        logits = model(graph1, input_tensor)
        # Optimization
        opt_op = get_opt_op(graph1, logits, labels_tensor)

        with tf.Session(graph=graph1) as sess:
            # Initialize variables
            tf.global_variables_initializer().run(session=sess)
            for epoch in range(3):
                batch = 0
                # Initialize dataset (could feed epochs in Dataset.repeat(epochs))
                sess.run(
                    dataset_init_op,
                    feed_dict={
                        features_data_ph: features,
                        labels_data_ph: labels,
                        batch_size_ph: 32
                    })
                values = []
                while True:
                    try:
                        if epoch < 2:
                            # Training
                            _, value = sess.run([opt_op, logits])
                            print("Epoch {}, batch {} | Sample value: {}".format(epoch, batch, value[0]))
                            batch += 1
                        else:
                            # Final inference
                            values.append(sess.run(logits))
                            print("Epoch {}, batch {} | Final inference | Sample value: {}".format(
                                epoch, batch, values[-1][0]))
                            batch += 1
                    except tf.errors.OutOfRangeError:
                        break

            # Save model state
            print("\nSaving...")
            cwd = os.getcwd()
            path = os.path.join(cwd, "simple")
            shutil.rmtree(path, ignore_errors=True)
            inputs_dict = {
                "batch_size_ph": batch_size_ph,
                "features_data_ph": features_data_ph,
                "labels_data_ph": labels_data_ph
            }
            outputs_dict = {
                "logits": logits
            }
            tf.saved_model.simple_save(
                sess, path, inputs_dict, outputs_dict
            )
            print("Ok")

    # Restoring
    graph2 = tf.Graph()
    with graph2.as_default():
        with tf.Session(graph=graph2) as sess:
            # Restore saved values
            print("\nRestoring...")
            tf.saved_model.loader.load(
                sess,
                [tag_constants.SERVING],
                path
            )
            print("Ok")
            # Get restored placeholders
            labels_data_ph = graph2.get_tensor_by_name("labels_data_ph:0")
            features_data_ph = graph2.get_tensor_by_name("features_data_ph:0")
            batch_size_ph = graph2.get_tensor_by_name("batch_size_ph:0")
            # Get restored model output
            restored_logits = graph2.get_tensor_by_name("dense/BiasAdd:0")
            # Get dataset initializing operation
            dataset_init_op = graph2.get_operation_by_name("dataset_init")

            # Initialize restored dataset
            sess.run(
                dataset_init_op,
                feed_dict={
                    features_data_ph: features,
                    labels_data_ph: labels,
                    batch_size_ph: 32
                }
            )
            # Compute inference for both batches in dataset
            restored_values = []
            for i in range(2):
                restored_values.append(sess.run(restored_logits))
                print("Restored values: ", restored_values[i][0])

    # Check if original inference and restored inference are equal
    valid = all((v == rv).all() for v, rv in zip(values, restored_values))
    print("\nInferences match: ", valid)
```

This will print:

```
$ python3 save_and_restore.py
Epoch 0, batch 0 | Sample value: [-0.13851789 -0.3087595 0.12804556 0.20013677 -0.08229901]
Epoch 0, batch 1 | Sample value: [-0.00555491 -0.04339041 -0.05111827 -0.2480045 -0.00107776]
Epoch 1, batch 0 | Sample value: [-0.19321944 -0.2104792 -0.00602257 0.07465433 0.11674127]
Epoch 1, batch 1 | Sample value: [-0.05275984 0.05981954 -0.15913513 -0.3244143 0.10673307]
Epoch 2, batch 0 | Final inference | Sample value: [-0.26331693 -0.13013336 -0.12553 -0.04276478 0.2933622 ]
Epoch 2, batch 1 | Final inference | Sample value: [-0.07730117 0.11119192 -0.20817074 -0.35660955 0.16990358]
Saving...
INFO:tensorflow:Assets added to graph.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: b'/some/path/simple/saved_model.pb'
Ok
Restoring...
INFO:tensorflow:Restoring parameters from b'/some/path/simple/variables/variables'
Ok
Restored values: [-0.26331693 -0.13013336 -0.12553 -0.04276478 0.2933622 ]
Restored values: [-0.07730117 0.11119192 -0.20817074 -0.35660955 0.16990358]
Inferences match: True
```

If you understand RMSE (root mean squared error), MSE (mean squared error), RMD (root mean squared deviation) and RMS (root mean squared), then asking for a library to calculate them for you is unnecessary over-engineering. Each of these metrics is a single line of Python code at most 2 inches long. The four metrics RMSE, MSE, RMD and RMS are, at their core, conceptually identical.

RMSE answers the question: "How similar, on average, are the numbers in `list1` to `list2`?". The two lists must be the same size. I want to "wash out the noise between any two given elements, wash out the size of the data collected, and get a single number feel for change over time".

Imagine you are learning to throw darts at a dart board. Every day you practice for one hour. You want to figure out if you are getting better or getting worse. So every day you make 10 throws and measure the distance between the bullseye and where your dart hit.

You make a list of those numbers `list1`. Use the root mean squared error between the distances at day 1 and a `list2` containing all zeros. Do the same on the 2nd and nth days. What you will get is a single number that hopefully decreases over time. When your RMSE number is zero, you hit bullseyes every time. If the RMSE number goes up, you are getting worse.

```
import numpy as np
d = [0.000, 0.166, 0.333] #ideal target distances, these can be all zeros.
p = [0.000, 0.254, 0.998] #your performance goes here
print("d is: " + str(["%.8f" % elem for elem in d]))
print("p is: " + str(["%.8f" % elem for elem in p]))
def rmse(predictions, targets):
    return np.sqrt(((predictions - targets) ** 2).mean())
rmse_val = rmse(np.array(d), np.array(p))
print("rms error is: " + str(rmse_val))
```

Which prints:

```
d is: ['0.00000000', '0.16600000', '0.33300000']
p is: ['0.00000000', '0.25400000', '0.99800000']
rms error is: 0.387284994115
```

**Glyph Legend:** `n` is a whole positive integer representing the number of throws. `i` represents a whole positive integer counter that enumerates the sum. `d` stands for the ideal distances, the `list2` containing all zeros in the above example. `p` stands for performance, the `list1` in the above example. Superscript 2 stands for numeric squared. **d _{i}** is the i'th index of `d`. **p _{i}** is the i'th index of `p`.

**The RMSE done in small steps so it can be understood:**

```
def rmse(predictions, targets):
differences = predictions - targets #the DIFFERENCEs.
differences_squared = differences ** 2 #the SQUAREs of ^
mean_of_differences_squared = differences_squared.mean() #the MEAN of ^
rmse_val = np.sqrt(mean_of_differences_squared) #ROOT of ^
return rmse_val #get the ^
```
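To make those steps concrete, here is a walk-through using the same small arrays from the earlier example, printing every intermediate value:

```python
import numpy as np

predictions = np.array([0.000, 0.254, 0.998])  # p, your performance
targets = np.array([0.000, 0.166, 0.333])      # d, the ideal distances

differences = predictions - targets            # [0.  0.088  0.665]
differences_squared = differences ** 2         # [0.  0.007744  0.442225]
mean_of_differences_squared = differences_squared.mean()  # ~0.1499897
rmse_val = np.sqrt(mean_of_differences_squared)

print(rmse_val)  # 0.387284994115, matching the output above
```

Seeing each intermediate array makes it obvious why the metric is called Root of the Mean of the Squared Errors, read right to left.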

Subtracting one number from another gives you the difference between them; its magnitude is the distance.

```
8 - 5 = 3         # distance between 8 and 5 is 3
-20 - 10 = -30    # distance between -20 and 10 is 30 (the sign is removed in the next step)
```

If you multiply any number by itself, the result is always positive because negative times negative is positive:

```
3*3 = 9 = positive
-30*-30 = 900 = positive
```

Add them all up, but wait, then an array with many elements would have a larger error than a small array, so average them by the number of elements.

But wait, we squared them all earlier to force them positive. Undo the damage with a square root!

That leaves you with a single number that represents, on average, the distance between every value of list1 and its corresponding element in list2.

If the RMSE value goes down over time we are happy because variance is decreasing.

Root mean squared error measures the vertical distance between the point and the line, so if your data is shaped like a banana, flat near the bottom and steep near the top, then the RMSE will report greater distances to points high, but short distances to points low when in fact the distances are equivalent. This causes a skew where the line prefers to be closer to points high than low.

If this is a problem the total least squares method fixes this: https://mubaris.com/posts/linear-regression
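For a flavor of what total least squares does differently, here is a sketch of an orthogonal line fit via SVD (the data points are made up for illustration): the best-fit direction is the right singular vector of the centered data with the largest singular value, which minimizes perpendicular rather than vertical distances.

```python
import numpy as np

# Hypothetical data points, roughly along y = x
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 0.9, 2.2, 2.8, 4.1])

# Center the data, then take the SVD of the (n x 2) deviation matrix
xm, ym = x.mean(), y.mean()
A = np.column_stack([x - xm, y - ym])
_, _, vt = np.linalg.svd(A)

# The first right singular vector points along the best-fit line
direction = vt[0]
slope = direction[1] / direction[0]
intercept = ym - slope * xm
print(slope, intercept)  # slope close to 1 for this data
```

This is only a sketch of the idea behind the linked article, not a drop-in replacement for a proper regression library.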

If there are nulls or infinities in either input list, the output RMSE value will not make sense. There are three strategies to deal with nulls / missing values / infinities in either list: ignore that component, zero it out, or add a best guess or uniform random noise to all timesteps. Each remedy has its pros and cons depending on what your data means. In general, ignoring any component with a missing value is preferred, but this biases the RMSE toward zero, making you think performance has improved when it really hasn't. Adding random noise on top of a best guess could be preferred if there are lots of missing values.

In order to guarantee relative correctness of the RMSE output, you must eliminate all nulls/infinites from the input.
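As a minimal sketch of the "ignore that component" strategy, the pairs containing a NaN or infinity can be masked out with `np.isfinite` before computing the RMSE (the function name `rmse_finite` is my own, not a library API):

```python
import numpy as np

def rmse_finite(predictions, targets):
    """RMSE over only the pairs where both values are finite.
    Note: dropping pairs shrinks n, which biases RMSE toward zero
    as discussed above."""
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    mask = np.isfinite(predictions) & np.isfinite(targets)
    return np.sqrt(((predictions[mask] - targets[mask]) ** 2).mean())

# The NaN pair is skipped; only indices 0 and 2 contribute
print(rmse_finite([1.0, np.nan, 3.0], [1.0, 2.0, 5.0]))  # sqrt(2) ~ 1.41421356
```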

Root mean squared error relies on all data being correct, and all values are counted as equal. That means one stray point that's way out in left field will totally ruin the whole calculation. To handle outlier data points and dismiss their tremendous influence after a certain threshold, see robust estimators that build in a threshold for dismissal of outliers.
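One crude illustration of the thresholding idea (not a proper robust estimator such as the Huber loss) is to cap each absolute residual before averaging; the function name `clipped_rmse` and the threshold value are my own choices for the sketch:

```python
import numpy as np

def clipped_rmse(predictions, targets, threshold=3.0):
    """RMSE where each absolute residual is capped at `threshold`,
    so a single wild outlier cannot dominate the average.
    A real robust estimator (e.g. Huber) handles this more carefully."""
    residuals = np.abs(np.asarray(predictions, float) - np.asarray(targets, float))
    residuals = np.minimum(residuals, threshold)
    return np.sqrt((residuals ** 2).mean())

# The stray 100.0 is capped at 3.0 instead of ruining the result
print(clipped_rmse([0.0, 0.1, 100.0], [0.0, 0.0, 0.0]))
```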
