NumPy | Python Methods and Functions | square

**Parameters:**

arr : [array_like] Input array or object whose elements we need to square.

**Return:**

An array containing the square of each element of the input.

**Code #1: Working**

**Output:**

Square Value of arr1: [1 9 225 217156]

Square Value of arr2: [529 3136]
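A minimal sketch consistent with the output above. The input values are assumptions: their magnitudes follow from the printed squares, but the signs are guesses, since squaring discards them.

```python
import numpy as np

# Assumed inputs: the magnitudes follow from the printed squares,
# but the signs are hypothetical because squaring removes them.
arr1 = [1, -3, 15, -466]
arr2 = [23, -56]

sq1 = np.square(arr1)
sq2 = np.square(arr2)
print("Square Value of arr1:", sq1)
print("Square Value of arr2:", sq2)
```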

**Code #2: Working with complex numbers**

**Output:**

Square (4 + 3j): (7 + 24j)

Square value (16 + 13j): (87 + 416j)
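A short sketch that would produce these values, assuming the inputs are the two complex numbers named in the output labels. `np.square` squares a complex number as `(a + bj)**2 = (a*a - b*b) + 2abj`.

```python
import numpy as np

# Assumed inputs, taken from the labels in the output above.
z1 = 4 + 3j
z2 = 16 + 13j

sq_z1 = np.square(z1)
sq_z2 = np.square(z2)
print("Square (4 + 3j):", sq_z1)
print("Square value (16 + 13j):", sq_z2)
```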

**Code #3: Graphical representation of numpy.square()**

```
# Python program explaining
# square() function
import numpy as np
import matplotlib.pyplot as plt

a = np.linspace(start=-5, stop=5, num=6, endpoint=True)
print("Graphical Representation:", np.square(a))

plt.title("blue: with square, red: without square")
plt.plot(a, np.square(a))
plt.plot(a, a, color="red")
plt.show()
```

**Output:**

Graphical Representation: [25.  9.  1.  1.  9. 25.]

**Links:**

https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.absolute.html

I was doing a fun project: Solving a Sudoku from an input image using OpenCV (as in Google goggles etc). And I have completed the task, but at the end I found a little problem for which I came here.

I did the programming using Python API of OpenCV 2.3.1.

Below is what I did:

1. Read the image.
2. Find the contours.
3. Select the one with maximum area (and also roughly square in shape).
4. Find the corner points, e.g. as given below. (**Notice here that the green line correctly coincides with the true boundary of the Sudoku, so the Sudoku can be correctly warped**. Check next image.)
5. Warp the image to a perfect square, e.g. image:
6. Perform OCR (for which I used the method I have given in Simple Digit Recognition OCR in OpenCV-Python).

And the method worked well.

**Problem:**

Check out this image.

Performing step 4 on this image gives the result below:

The red line is the original contour, which is the true outline of the Sudoku boundary.

The green line is the approximated contour, which will be the outline of the warped image.

Of course, there is a difference between the green line and the red line at the top edge of the Sudoku. So while warping, I am not getting the original boundary of the Sudoku.

**My Question :**

How can I warp the image on the correct boundary of the Sudoku, i.e. the red line OR how can I remove the difference between red line and green line? Is there any method for this in OpenCV?

I know I could implement a root mean squared error function like this:

```
import numpy as np

def rmse(predictions, targets):
    return np.sqrt(((predictions - targets) ** 2).mean())
```

What I'm looking for is whether this rmse function is already implemented in a library somewhere, perhaps in scipy or scikit-learn?
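Independently of which library may provide it, here is a small worked sketch of the function above that also accepts plain lists by converting them with `np.asarray`. The sample values are made up for illustration.

```python
import numpy as np

def rmse(predictions, targets):
    """Root mean squared error between two equally shaped sequences."""
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return float(np.sqrt(np.mean((predictions - targets) ** 2)))

# Squared errors are 0, 0 and 4, so the result is sqrt(4 / 3).
error = rmse([1, 2, 3], [1, 2, 5])
print(error)
```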

```
>>> x=[1,2]
>>> x[1]
2
>>> x=(1,2)
>>> x[1]
2
```

Are they both valid? Is one preferred for some reason?
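Both index the same way; the practical difference is mutability. A brief sketch (variable names are illustrative only):

```python
x_list = [1, 2]
x_tuple = (1, 2)

x_list[1] = 5              # lists are mutable
try:
    x_tuple[1] = 5         # tuples are immutable, so this raises TypeError
    tuple_assignment_failed = False
except TypeError:
    tuple_assignment_failed = True

# A tuple of hashable items can serve as a dict key; a list cannot.
lookup = {x_tuple: "ok"}
```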

Why does Python give the "wrong" answer?

```
x = 16
sqrt = x**(.5) #returns 4
sqrt = x**(1/2) #returns 1
```

Yes, I know I could `import math` and use `sqrt`. But I'm looking for an answer to the above.
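The result `1` is what integer division would produce: in Python 2, `/` on two ints truncates, so `1/2` is `0` and `x**0` is `1` (in Python 3, `1/2` is `0.5`). A small sketch that uses `//` to make the truncation explicit on any version:

```python
x = 16

# Floor division reproduces what `1/2` evaluates to under integer division.
exponent_int = 1 // 2              # 0
root_wrong = x ** exponent_int     # 16 ** 0

# A float exponent gives the square root.
root_right = x ** (1.0 / 2)
```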

I see more and more commands like this:

```
$ pip install "splinter[django]"
```

What do these square brackets do?

I'm using Python and Numpy to calculate a best-fit polynomial of arbitrary degree. I pass a list of x values, y values, and the degree of the polynomial I want to fit (linear, quadratic, etc.).

This much works, but I also want to calculate r (coefficient of correlation) and r-squared (coefficient of determination). I am comparing my results with Excel's best-fit trendline capability and the r-squared value it calculates. Using this, I know I am calculating r-squared correctly for a linear best fit (degree equals 1). However, my function does not work for polynomials with degree greater than 1.

Excel is able to do this. How do I calculate r-squared for higher-order polynomials using Numpy?

Here's my function:

```
import numpy

# Polynomial Regression
def polyfit(x, y, degree):
    results = {}

    coeffs = numpy.polyfit(x, y, degree)
    # Polynomial Coefficients
    results["polynomial"] = coeffs.tolist()

    correlation = numpy.corrcoef(x, y)[0,1]
    # r
    results["correlation"] = correlation
    # r-squared
    results["determination"] = correlation**2

    return results
```
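One common definition of r-squared is `1 - SS_res / SS_tot`, computed from the fitted values rather than from the correlation of `x` and `y`. The sketch below applies that definition for any degree; the helper name `polyfit_r2` and the sample data are illustrative, and whether the result matches Excel's trendline value is an assumption not verified here.

```python
import numpy as np

def polyfit_r2(x, y, degree):
    """Fit a polynomial and return (coefficients, r_squared)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    coeffs = np.polyfit(x, y, degree)
    y_hat = np.polyval(coeffs, x)            # fitted values
    ss_res = np.sum((y - y_hat) ** 2)        # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)     # total sum of squares
    return coeffs, 1.0 - ss_res / ss_tot

# Made-up data lying exactly on y = x**2, so a quadratic fit is perfect
# while a straight line explains none of the variance.
xs = [-2, -1, 0, 1, 2]
ys = [4, 1, 0, 1, 4]
_, r2_linear = polyfit_r2(xs, ys, 1)
_, r2_quadratic = polyfit_r2(xs, ys, 2)
```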

I"ve noticed three methods of selecting a column in a Pandas DataFrame:

**First method of selecting a column using loc:**

```
df_new = df.loc[:, "col1"]
```

**Second method - seems simpler and faster:**

```
df_new = df["col1"]
```

**Third method - most convenient:**

```
df_new = df.col1
```

Is there a difference between these three methods? I don't think so, in which case I'd rather use the third method.

I'm mostly curious as to why there appear to be three methods for doing the same thing.
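For an ordinary column name the three forms return the same Series, but attribute access breaks down when the name collides with a DataFrame attribute or is not a valid identifier. A sketch with a hypothetical frame:

```python
import pandas as pd

# Hypothetical frame; "count" deliberately collides with DataFrame.count.
df = pd.DataFrame({"col1": [1, 2], "count": [3, 4], "col 2": [5, 6]})

# For a plain name, all three forms agree.
same = df.loc[:, "col1"].equals(df["col1"]) and df["col1"].equals(df.col1)

attr_is_method = callable(df.count)        # the method, not the column
bracket_is_column = df["count"].tolist()   # the column
spaced = df["col 2"].tolist()              # no attribute form exists for this name
```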

I think you're almost there. Try removing the extra square brackets around the `lst`s (also, you don't need to specify the column names when you're creating a dataframe from a dict like this):

```
import pandas as pd
lst1 = range(100)
lst2 = range(100)
lst3 = range(100)
percentile_list = pd.DataFrame(
    {"lst1Title": lst1,
     "lst2Title": lst2,
     "lst3Title": lst3
    })

percentile_list
    lst1Title  lst2Title  lst3Title
0          0          0          0
1          1          1          1
2          2          2          2
3          3          3          3
4          4          4          4
5          5          5          5
6          6          6          6
...
```

If you need a more performant solution you can use `np.column_stack` rather than `zip` as in your first attempt. This has around a 2x speedup on the example here, but comes at a bit of a cost in readability, in my opinion:

```
import numpy as np
percentile_list = pd.DataFrame(np.column_stack([lst1, lst2, lst3]),
                               columns=["lst1Title", "lst2Title", "lst3Title"])
```

There is a clean, one-line way of doing this in Pandas:

```
df["col_3"] = df.apply(lambda x: f(x.col_1, x.col_2), axis=1)
```

This allows `f` to be a user-defined function with multiple input values, and uses (safe) column names rather than (unsafe) numeric indices to access the columns.

Example with data (based on original question):

```
import pandas as pd

df = pd.DataFrame({"ID":["1", "2", "3"], "col_1": [0, 2, 3], "col_2":[1, 4, 5]})
mylist = ["a", "b", "c", "d", "e", "f"]

def get_sublist(sta,end):
    return mylist[sta:end+1]

df["col_3"] = df.apply(lambda x: get_sublist(x.col_1, x.col_2), axis=1)
```

Output of `print(df)`:

```
  ID  col_1  col_2      col_3
0  1      0      1     [a, b]
1  2      2      4  [c, d, e]
2  3      3      5  [d, e, f]
```

If your column names contain spaces or share a name with an existing dataframe attribute, you can index with square brackets:

```
df["col_3"] = df.apply(lambda x: f(x["col 1"], x["col 2"]), axis=1)
```

This is an update and modification to Saullo's answer that uses the full list of the current `scipy.stats` distributions and returns the distribution with the least SSE between the distribution's histogram and the data's histogram.

Using the El Niño dataset from `statsmodels`, the distributions are fit and the error is determined. The distribution with the least error is returned.

```
%matplotlib inline

import warnings
import numpy as np
import pandas as pd
import scipy.stats as st
import statsmodels.api as sm
from scipy.stats._continuous_distns import _distn_names
import matplotlib
import matplotlib.pyplot as plt

matplotlib.rcParams["figure.figsize"] = (16.0, 12.0)
matplotlib.style.use("ggplot")

# Create models from data
def best_fit_distribution(data, bins=200, ax=None):
    """Model data by finding best fit distribution to data"""
    # Get histogram of original data
    y, x = np.histogram(data, bins=bins, density=True)
    x = (x + np.roll(x, -1))[:-1] / 2.0

    # Best holders
    best_distributions = []

    # Estimate distribution parameters from data
    for ii, distribution in enumerate([d for d in _distn_names if not d in ["levy_stable", "studentized_range"]]):

        print("{:>3} / {:<3}: {}".format( ii+1, len(_distn_names), distribution ))

        distribution = getattr(st, distribution)

        # Try to fit the distribution
        try:
            # Ignore warnings from data that can't be fit
            with warnings.catch_warnings():
                warnings.filterwarnings("ignore")

                # fit dist to data
                params = distribution.fit(data)

                # Separate parts of parameters
                arg = params[:-2]
                loc = params[-2]
                scale = params[-1]

                # Calculate fitted PDF and error with fit in distribution
                pdf = distribution.pdf(x, loc=loc, scale=scale, *arg)
                sse = np.sum(np.power(y - pdf, 2.0))

                # if axis pass in add to plot
                try:
                    if ax:
                        pd.Series(pdf, x).plot(ax=ax)
                except Exception:
                    pass

                # record this distribution together with its error
                best_distributions.append((distribution, params, sse))

        except Exception:
            pass

    return sorted(best_distributions, key=lambda x:x[2])

def make_pdf(dist, params, size=10000):
    """Generate distribution's Probability Distribution Function"""

    # Separate parts of parameters
    arg = params[:-2]
    loc = params[-2]
    scale = params[-1]

    # Get sane start and end points of distribution
    start = dist.ppf(0.01, *arg, loc=loc, scale=scale) if arg else dist.ppf(0.01, loc=loc, scale=scale)
    end = dist.ppf(0.99, *arg, loc=loc, scale=scale) if arg else dist.ppf(0.99, loc=loc, scale=scale)

    # Build PDF and turn into pandas Series
    x = np.linspace(start, end, size)
    y = dist.pdf(x, loc=loc, scale=scale, *arg)
    pdf = pd.Series(y, x)

    return pdf

# Load data from statsmodels datasets
data = pd.Series(sm.datasets.elnino.load_pandas().data.set_index("YEAR").values.ravel())

# Plot for comparison
plt.figure(figsize=(12,8))
ax = data.plot(kind="hist", bins=50, density=True, alpha=0.5, color=list(matplotlib.rcParams["axes.prop_cycle"])[1]["color"])

# Save plot limits
dataYLim = ax.get_ylim()

# Find best fit distribution
best_distributions = best_fit_distribution(data, 200, ax)
best_dist = best_distributions[0]

# Update plots
ax.set_ylim(dataYLim)
ax.set_title(u"El Niño sea temp.\nAll Fitted Distributions")
ax.set_xlabel(u"Temp (°C)")
ax.set_ylabel("Frequency")

# Make PDF with best params
pdf = make_pdf(best_dist[0], best_dist[1])

# Display
plt.figure(figsize=(12,8))
ax = pdf.plot(lw=2, label="PDF", legend=True)
data.plot(kind="hist", bins=50, density=True, alpha=0.5, label="Data", legend=True, ax=ax)

param_names = (best_dist[0].shapes + ", loc, scale").split(", ") if best_dist[0].shapes else ["loc", "scale"]
param_str = ", ".join(["{}={:0.2f}".format(k,v) for k,v in zip(param_names, best_dist[1])])
dist_str = "{}({})".format(best_dist[0].name, param_str)

ax.set_title(u"El Niño sea temp. with best fit distribution\n" + dist_str)
ax.set_xlabel(u"Temp. (°C)")
ax.set_ylabel("Frequency")
```

**TL;DR**

```
def square_list(n):
    the_list = []                         # Replace
    for x in range(n):
        y = x * x
        the_list.append(y)                # these
    return the_list                       # lines
```

```
def square_yield(n):
    for x in range(n):
        y = x * x
        yield y                           # with this one.
```

Whenever you find yourself building a list from scratch, `yield` each piece instead.

This was my first "aha" moment with yield.

`yield` is a sugary way to say

build a series of stuff

Same behavior:

```
>>> for square in square_list(4):
...     print(square)
...
0
1
4
9
>>> for square in square_yield(4):
...     print(square)
...
0
1
4
9
```

Different behavior:

Yield is **single-pass**: you can only iterate through once. When a function has a yield in it we call it a generator function. And an iterator is what it returns. Those terms are revealing. We lose the convenience of a container, but gain the power of a series that's computed as needed, and arbitrarily long.

Yield is **lazy**: it puts off computation. A function with a yield in it *doesn't actually execute at all when you call it.* It returns an iterator object that remembers where it left off. Each time you call `next()` on the iterator (this happens in a for-loop) execution inches forward to the next yield. `return` raises StopIteration and ends the series (this is the natural end of a for-loop).

Yield is **versatile**. Data doesn't have to be stored all together; it can be made available one item at a time. It can be infinite.

```
>>> def squares_all_of_them():
...     x = 0
...     while True:
...         yield x * x
...         x += 1
...
>>> squares = squares_all_of_them()
>>> for _ in range(4):
...     print(next(squares))
...
0
1
4
9
```

If you need **multiple passes** and the series isn't too long, just call `list()` on it:

```
>>> list(square_yield(4))
[0, 1, 4, 9]
```

Brilliant choice of the word `yield` because both meanings apply:

**yield**: produce or provide (as in agriculture)

...provide the next data in the series.

**yield**: give way or relinquish (as in political power)

...relinquish CPU execution until the iterator advances.

This is kind of overkill, but let's give it a go. First let's use statsmodels to find out what the p-values should be:

```
import pandas as pd
import numpy as np
from sklearn import datasets, linear_model
from sklearn.linear_model import LinearRegression
import statsmodels.api as sm
from scipy import stats
diabetes = datasets.load_diabetes()
X = diabetes.data
y = diabetes.target
X2 = sm.add_constant(X)
est = sm.OLS(y, X2)
est2 = est.fit()
print(est2.summary())
```

and we get

```
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.518
Model:                            OLS   Adj. R-squared:                  0.507
Method:                 Least Squares   F-statistic:                     46.27
Date:                Wed, 08 Mar 2017   Prob (F-statistic):           3.83e-62
Time:                        10:08:24   Log-Likelihood:                -2386.0
No. Observations:                 442   AIC:                             4794.
Df Residuals:                     431   BIC:                             4839.
Df Model:                          10
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        152.1335      2.576     59.061      0.000     147.071     157.196
x1           -10.0122     59.749     -0.168      0.867    -127.448     107.424
x2          -239.8191     61.222     -3.917      0.000    -360.151    -119.488
x3           519.8398     66.534      7.813      0.000     389.069     650.610
x4           324.3904     65.422      4.958      0.000     195.805     452.976
x5          -792.1842    416.684     -1.901      0.058   -1611.169      26.801
x6           476.7458    339.035      1.406      0.160    -189.621    1143.113
x7           101.0446    212.533      0.475      0.635    -316.685     518.774
x8           177.0642    161.476      1.097      0.273    -140.313     494.442
x9           751.2793    171.902      4.370      0.000     413.409    1089.150
x10           67.6254     65.984      1.025      0.306     -62.065     197.316
==============================================================================
Omnibus:                        1.506   Durbin-Watson:                   2.029
Prob(Omnibus):                  0.471   Jarque-Bera (JB):                1.404
Skew:                           0.017   Prob(JB):                        0.496
Kurtosis:                       2.726   Cond. No.                         227.
==============================================================================
```

OK, let's reproduce this. It is kind of overkill, as we are almost reproducing a linear regression analysis using matrix algebra. But what the heck.

```
lm = LinearRegression()
lm.fit(X,y)
params = np.append(lm.intercept_,lm.coef_)
predictions = lm.predict(X)

newX = pd.DataFrame({"Constant":np.ones(len(X))}).join(pd.DataFrame(X))
# Residual degrees of freedom: observations minus fitted parameters
dof = len(newX)-len(newX.columns)
MSE = (sum((y-predictions)**2))/dof

# Note: if you don't want to use a DataFrame, replace the newX and dof lines above with
# newX = np.append(np.ones((len(X),1)), X, axis=1)
# dof = len(newX)-len(newX[0])

var_b = MSE*(np.linalg.inv(np.dot(newX.T,newX)).diagonal())
sd_b = np.sqrt(var_b)
ts_b = params/ sd_b

# Use dof here too: on a DataFrame, newX[0] selects a column, not a row
p_values =[2*(1-stats.t.cdf(np.abs(i),dof)) for i in ts_b]

sd_b = np.round(sd_b,3)
ts_b = np.round(ts_b,3)
p_values = np.round(p_values,3)
params = np.round(params,4)

myDF3 = pd.DataFrame()
myDF3["Coefficients"],myDF3["Standard Errors"],myDF3["t values"],myDF3["Probabilities"] = [params,sd_b,ts_b,p_values]
print(myDF3)
```

And this gives us.

```
    Coefficients  Standard Errors  t values  Probabilities
0       152.1335            2.576    59.061          0.000
1       -10.0122           59.749    -0.168          0.867
2      -239.8191           61.222    -3.917          0.000
3       519.8398           66.534     7.813          0.000
4       324.3904           65.422     4.958          0.000
5      -792.1842          416.684    -1.901          0.058
6       476.7458          339.035     1.406          0.160
7       101.0446          212.533     0.475          0.635
8       177.0642          161.476     1.097          0.273
9       751.2793          171.902     4.370          0.000
10       67.6254           65.984     1.025          0.306
```

So we can reproduce the values from statsmodels.
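The matrix algebra behind the standard errors and t-values can be sketched with NumPy alone. The data here is synthetic and made up for illustration (it is not the diabetes dataset), and the p-value step is left out because it needs the t distribution from `scipy.stats`.

```python
import numpy as np

rng = np.random.default_rng(0)          # synthetic data, for illustration only
X = rng.normal(size=(50, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.1, size=50)

A = np.column_stack([np.ones(len(X)), X])     # design matrix with intercept column
beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares coefficients

residuals = y - A @ beta
dof = A.shape[0] - A.shape[1]                 # observations minus parameters
mse = residuals @ residuals / dof             # unbiased residual variance
std_err = np.sqrt(mse * np.diag(np.linalg.inv(A.T @ A)))
t_values = beta / std_err
```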

The problem is the use of `aspect="equal"`, which prevents the subplots from stretching to an arbitrary aspect ratio and filling up all the empty space.

Normally, this would work:

```
import matplotlib.pyplot as plt

ax = [plt.subplot(2,2,i+1) for i in range(4)]

for a in ax:
    a.set_xticklabels([])
    a.set_yticklabels([])

plt.subplots_adjust(wspace=0, hspace=0)
```

The result is this:

However, with `aspect="equal"`, as in the following code:

```
import matplotlib.pyplot as plt

ax = [plt.subplot(2,2,i+1) for i in range(4)]

for a in ax:
    a.set_xticklabels([])
    a.set_yticklabels([])
    a.set_aspect("equal")

plt.subplots_adjust(wspace=0, hspace=0)
```

This is what we get:

The difference in this second case is that you've forced the x- and y-axes to have the same number of units/pixel. Since the axes go from 0 to 1 by default (i.e., before you plot anything), using `aspect="equal"` forces each axis to be a square. Since the figure is not a square, pyplot adds in extra spacing between the axes horizontally.

To get around this problem, you can set your figure to have the correct aspect ratio. We're going to use the object-oriented pyplot interface here, which I consider to be superior in general:

```
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(8,8)) # Notice the equal aspect ratio
ax = [fig.add_subplot(2,2,i+1) for i in range(4)]

for a in ax:
    a.set_xticklabels([])
    a.set_yticklabels([])
    a.set_aspect("equal")

fig.subplots_adjust(wspace=0, hspace=0)
```

Here's the result:

How about using `numpy.vectorize`?

```
import numpy as np
x = np.array([1, 2, 3, 4, 5])
squarer = lambda t: t ** 2
vfunc = np.vectorize(squarer)
vfunc(x)
# Output : array([ 1, 4, 9, 16, 25])
```
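For a plain square, the wrapper is not strictly needed: NumPy arrays already apply `**` element-wise, and `np.square` does the same. A short comparison sketch (variable names are illustrative):

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])

via_vectorize = np.vectorize(lambda t: t ** 2)(x)
via_operator = x ** 2          # element-wise on arrays, no wrapper needed
via_square = np.square(x)
```

`np.vectorize` remains useful when the function only works on scalars, but it loops in Python, so it is a convenience rather than a speedup.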

Use imap instead of map, which returns an iterator of processed values.

```
from multiprocessing import Pool
import tqdm
import time

def _foo(my_number):
    square = my_number * my_number
    time.sleep(1)
    return square

if __name__ == "__main__":
    with Pool(2) as p:
        r = list(tqdm.tqdm(p.imap(_foo, range(30)), total=30))
```

I'd like to shed a little bit more light on the interplay of `iter`, `__iter__` and `__getitem__` and what happens behind the curtains. Armed with that knowledge, you will be able to understand why the best you can do is

```
try:
    iter(maybe_iterable)
    print("iteration will probably work")
except TypeError:
    print("not iterable")
```

I will list the facts first and then follow up with a quick reminder of what happens when you employ a `for` loop in Python, followed by a discussion to illustrate the facts.

1. You can get an iterator from any object `o` by calling `iter(o)` if at least one of the following conditions holds true:

   a) `o` has an `__iter__` method which returns an iterator object. An iterator is any object with an `__iter__` and a `__next__` (Python 2: `next`) method.

   b) `o` has a `__getitem__` method.

2. Checking for an instance of `Iterable` or `Sequence`, or checking for the attribute `__iter__`, is not enough.

3. If an object `o` implements only `__getitem__`, but not `__iter__`, `iter(o)` will construct an iterator that tries to fetch items from `o` by integer index, starting at index 0. The iterator will catch any `IndexError` (but no other errors) that is raised and then raises `StopIteration` itself.

4. In the most general sense, there's no way to check whether the iterator returned by `iter` is sane other than to try it out.

5. If an object `o` implements `__iter__`, the `iter` function will make sure that the object returned by `__iter__` is an iterator. There is no sanity check if an object only implements `__getitem__`.

6. `__iter__` wins. If an object `o` implements both `__iter__` and `__getitem__`, `iter(o)` will call `__iter__`.

7. If you want to make your own objects iterable, always implement the `__iter__` method.

**`for` loops**

In order to follow along, you need an understanding of what happens when you employ a `for` loop in Python. Feel free to skip right to the next section if you already know.

When you use `for item in o` for some iterable object `o`, Python calls `iter(o)` and expects an iterator object as the return value. An iterator is any object which implements a `__next__` (or `next` in Python 2) method and an `__iter__` method.

By convention, the `__iter__` method of an iterator should return the object itself (i.e. `return self`). Python then calls `next` on the iterator until `StopIteration` is raised. All of this happens implicitly, but the following demonstration makes it visible:

```
import random

class DemoIterable(object):
    def __iter__(self):
        print("__iter__ called")
        return DemoIterator()

class DemoIterator(object):
    def __iter__(self):
        return self

    def __next__(self):
        print("__next__ called")
        r = random.randint(1, 10)
        if r == 5:
            print("raising StopIteration")
            raise StopIteration
        return r
```

Iteration over a `DemoIterable`:

```
>>> di = DemoIterable()
>>> for x in di:
...     print(x)
...
__iter__ called
__next__ called
9
__next__ called
8
__next__ called
10
__next__ called
3
__next__ called
10
__next__ called
raising StopIteration
```

**On point 1 and 2: getting an iterator and unreliable checks**

Consider the following class:

```
class BasicIterable(object):
    def __getitem__(self, item):
        if item == 3:
            raise IndexError
        return item
```

Calling `iter` with an instance of `BasicIterable` will return an iterator without any problems because `BasicIterable` implements `__getitem__`.

```
>>> b = BasicIterable()
>>> iter(b)
<iterator object at 0x7f1ab216e320>
```

However, it is important to note that `b` does not have the `__iter__` attribute and is not considered an instance of `Iterable` or `Sequence`:

```
>>> from collections.abc import Iterable, Sequence
>>> hasattr(b, "__iter__")
False
>>> isinstance(b, Iterable)
False
>>> isinstance(b, Sequence)
False
```

This is why Fluent Python by Luciano Ramalho recommends calling `iter` and handling the potential `TypeError` as the most accurate way to check whether an object is iterable. Quoting directly from the book:

As of Python 3.4, the most accurate way to check whether an object `x` is iterable is to call `iter(x)` and handle a `TypeError` exception if it isn't. This is more accurate than using `isinstance(x, abc.Iterable)`, because `iter(x)` also considers the legacy `__getitem__` method, while the `Iterable` ABC does not.

**On point 3: Iterating over objects which only provide __getitem__, but not __iter__**

Iterating over an instance of `BasicIterable` works as expected: Python constructs an iterator that tries to fetch items by index, starting at zero, until an `IndexError` is raised. The demo object's `__getitem__` method simply returns the `item` which was supplied as the argument to `__getitem__(self, item)` by the iterator returned by `iter`.

```
>>> b = BasicIterable()
>>> it = iter(b)
>>> next(it)
0
>>> next(it)
1
>>> next(it)
2
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
```

Note that the iterator raises `StopIteration` when it cannot return the next item and that the `IndexError` which is raised for `item == 3` is handled internally. This is why looping over a `BasicIterable` with a `for` loop works as expected:

```
>>> for x in b:
...     print(x)
...
0
1
2
```

Here's another example in order to drive home the concept of how the iterator returned by `iter` tries to access items by index. `WrappedDict` does not inherit from `dict`, which means instances won't have an `__iter__` method.

```
class WrappedDict(object): # note: no inheritance from dict!
    def __init__(self, dic):
        self._dict = dic

    def __getitem__(self, item):
        try:
            return self._dict[item] # delegate to dict.__getitem__
        except KeyError:
            raise IndexError
```

Note that calls to `__getitem__` are delegated to `dict.__getitem__`, for which the square bracket notation is simply a shorthand.

```
>>> w = WrappedDict({-1: "not printed",
...                   0: "hi", 1: "StackOverflow", 2: "!",
...                   4: "not printed",
...                   "x": "not printed"})
>>> for x in w:
...     print(x)
...
hi
StackOverflow
!
```

**On point 4 and 5: iter checks for an iterator when it calls __iter__**:

When `iter(o)` is called for an object `o`, `iter` will make sure that the return value of `__iter__`, if the method is present, is an iterator. This means that the returned object must implement `__next__` (or `next` in Python 2) and `__iter__`. `iter` cannot perform any sanity checks for objects which only provide `__getitem__`, because it has no way to check whether the items of the object are accessible by integer index.

```
class FailIterIterable(object):
    def __iter__(self):
        return object() # not an iterator

class FailGetitemIterable(object):
    def __getitem__(self, item):
        raise Exception
```

Note that constructing an iterator from `FailIterIterable` instances fails immediately, while constructing an iterator from `FailGetitemIterable` succeeds, but will throw an Exception on the first call to `__next__`.

```
>>> fii = FailIterIterable()
>>> iter(fii)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: iter() returned non-iterator of type 'object'
>>>
>>> fgi = FailGetitemIterable()
>>> it = iter(fgi)
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/iterdemo.py", line 42, in __getitem__
    raise Exception
Exception
```

**On point 6: __iter__ wins**

This one is straightforward. If an object implements `__iter__` and `__getitem__`, `iter` will call `__iter__`. Consider the following class

```
class IterWinsDemo(object):
    def __iter__(self):
        return iter(["__iter__", "wins"])

    def __getitem__(self, item):
        return ["__getitem__", "wins"][item]
```

and the output when looping over an instance:

```
>>> iwd = IterWinsDemo()
>>> for x in iwd:
...     print(x)
...
__iter__
wins
```

**On point 7: your iterable classes should implement __iter__**

You might ask yourself why most builtin sequences like `list` implement an `__iter__` method when `__getitem__` would be sufficient.

```
class WrappedList(object): # note: no inheritance from list!
    def __init__(self, lst):
        self._list = lst

    def __getitem__(self, item):
        return self._list[item]
```

After all, iteration over instances of the class above, which delegates calls to `__getitem__` to `list.__getitem__` (using the square bracket notation), will work fine:

```
>>> wl = WrappedList(["A", "B", "C"])
>>> for x in wl:
...     print(x)
...
A
B
C
```

The reasons your custom iterables should implement `__iter__` are as follows:

- If you implement `__iter__`, instances will be considered iterables, and `isinstance(o, collections.abc.Iterable)` will return `True`.
- If the object returned by `__iter__` is not an iterator, `iter` will fail immediately and raise a `TypeError`.
- The special handling of `__getitem__` exists for backwards compatibility reasons. Quoting again from Fluent Python:

That is why any Python sequence is iterable: they all implement `__getitem__`. In fact, the standard sequences also implement `__iter__`, and yours should too, because the special handling of `__getitem__` exists for backward compatibility reasons and may be gone in the future (although it is not deprecated as I write this).
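The first reason in the list above can be checked directly. In this sketch the class names `GetitemOnly` and `WithIter` are hypothetical and exist only for illustration.

```python
from collections.abc import Iterable

class GetitemOnly:
    """Iterable only through the legacy __getitem__ protocol."""
    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        return self._items[index]

class WithIter(GetitemOnly):
    """Same data, but also implements __iter__."""
    def __iter__(self):
        return iter(self._items)

legacy = GetitemOnly("AB")
modern = WithIter("AB")

legacy_values = list(legacy)    # works through the index-based fallback
modern_values = list(modern)    # works through __iter__
```

Both objects can be looped over, but only `modern` passes the `isinstance(..., Iterable)` check.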

There are lots of things I have seen make a model diverge.

Too high of a learning rate. You can often tell if this is the case if the loss begins to increase and then diverges to infinity.

I am not too familiar with the DNNClassifier, but I am guessing it uses the categorical cross-entropy cost function. This involves taking the log of the prediction, which diverges as the prediction approaches zero. That is why people usually add a small epsilon value to the prediction to prevent this divergence. I am guessing the DNNClassifier probably does this or uses the TensorFlow op for it. Probably not the issue.

Other numerical stability issues can exist, such as division by zero, where adding the epsilon can help. Another less obvious one is the square root, whose derivative can diverge if not properly simplified when dealing with finite-precision numbers. Yet again, I doubt this is the issue in the case of the DNNClassifier.

You may have an issue with the input data. Try calling `assert not np.any(np.isnan(x))` on the input data to make sure you are not introducing the NaN. Also make sure all of the target values are valid. Finally, make sure the data is properly normalized. You probably want to have the pixels in the range [-1, 1] and not [0, 255].

The labels must be in the domain of the loss function, so if using a logarithmic-based loss function all labels must be non-negative (as noted by evan pu and the comments below).
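The input checks and the epsilon trick mentioned above can be sketched with NumPy. The arrays, the scaling constant and the `eps` value are assumptions chosen for illustration, not values taken from the DNNClassifier.

```python
import numpy as np

# Hypothetical 8-bit image batch and integer class labels.
pixels = np.array([[0, 128, 255], [64, 32, 16]], dtype=np.uint8)
labels = np.array([0, 2])

x = pixels.astype(np.float32) / 127.5 - 1.0   # map [0, 255] to [-1, 1]
assert not np.any(np.isnan(x))                # no NaN in the inputs
assert np.all(labels >= 0)                    # labels valid for a log-based loss

# Clipping predictions away from 0 keeps log() finite; eps is an assumed value.
eps = 1e-7
preds = np.array([0.0, 0.5, 1.0])
log_preds = np.log(np.clip(preds, eps, 1.0))
```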
