ravel | StackOverflow

### Answer rating: 243

I have a list in Python and I want to convert it to an array to be able to use the `ravel()` function.

Use `numpy.asarray`:

```
import numpy as np
myarray = np.asarray(mylist)
```
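As a quick illustration (the list contents are made up), converting a nested list and flattening it could look like the sketch below. It also shows that `np.asarray` hands back its input unchanged when that input is already an array, whereas `np.array` copies by default:

```python
import numpy as np

# hypothetical nested list standing in for `mylist`
mylist = [[1, 2, 3], [4, 5, 6]]

myarray = np.asarray(mylist)   # builds a new 2x3 array from the list
flat = myarray.ravel()         # 1-D array holding the 6 elements

# asarray does not copy when given an ndarray; array() copies by default
same_object = np.asarray(myarray) is myarray
copied = np.array(myarray) is not myarray

print(flat.shape, same_object, copied)
```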

```
import numpy as np
y = np.array(((1, 2, 3), (4, 5, 6), (7, 8, 9)))
print(y.flatten())  # [1 2 3 4 5 6 7 8 9]
print(y.ravel())    # [1 2 3 4 5 6 7 8 9]
```

Both functions return the same result. So why are there two different functions performing the same job?


The current API is that:

- `flatten` always returns a copy.
- `ravel` returns a view of the original array whenever possible. This isn't visible in the printed output, but if you modify the array returned by `ravel`, it may modify the entries in the original array. If you modify the entries in an array returned from `flatten`, this will never happen. `ravel` will often be faster since no memory is copied, but you have to be more careful about modifying the array it returns.
- `reshape((-1,))` gets a view whenever the strides of the array allow it, even if that means you don't always get a contiguous array.
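The copy-versus-view difference can be made visible with `np.shares_memory`. This is only a small sketch on a made-up array; the variable names are mine:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

flat_copy = a.flatten()      # always a copy
flat_view = a.ravel()        # a view here, because `a` is C-contiguous

flat_copy[0] = 100           # does not touch `a`
flat_view[1] = 200           # writes through to `a`

# a transposed array is not C-contiguous, so ravel() has to copy
t_ravel = a.T.ravel()

shares_flatten = np.shares_memory(a, flat_copy)
shares_ravel = np.shares_memory(a, flat_view)
shares_t_ravel = np.shares_memory(a, t_ravel)

print(a)
print(shares_flatten, shares_ravel, shares_t_ravel)
```

So whether `ravel` gives you a view depends on the memory layout of the input, which is why it is safest not to rely on either behaviour when you intend to modify the result.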

Change this line:

```
model = forest.fit(train_fold, train_y)
```

to:

```
model = forest.fit(train_fold, train_y.values.ravel())
```

*Edit:*

`.values` will give the values in an array (shape: `(n, 1)`).

`.ravel` will convert that array shape to `(n,)`.
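To see the shape change in isolation, here is a minimal sketch that uses a plain NumPy column as a stand-in for `train_y.values` (the data is invented, and I am assuming `forest` is a scikit-learn style estimator that wants a 1-D target):

```python
import numpy as np

# stand-in for `train_y.values` of a single-column DataFrame: shape (n, 1)
column = np.array([[0], [1], [1], [0]])

flat = column.ravel()   # shape (n,), the 1-D form expected for the target

print(column.shape, flat.shape)  # (4, 1) (4,)
```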

This is an update and modification to Saullo's answer, that uses the full list of the current `scipy.stats` distributions and returns the distribution with the least SSE between the distribution's histogram and the data's histogram.

Using the El Niño dataset from `statsmodels`, the distributions are fit and error is determined. The distribution with the least error is returned.

```
%matplotlib inline
# (Jupyter/IPython magic; drop the line above when running as a plain script)

import warnings
import numpy as np
import pandas as pd
import scipy.stats as st
import statsmodels.api as sm
from scipy.stats._continuous_distns import _distn_names
import matplotlib
import matplotlib.pyplot as plt

matplotlib.rcParams["figure.figsize"] = (16.0, 12.0)
matplotlib.style.use("ggplot")

# Create models from data
def best_fit_distribution(data, bins=200, ax=None):
    """Model data by finding best fit distribution to data"""
    # Get histogram of original data
    y, x = np.histogram(data, bins=bins, density=True)
    x = (x + np.roll(x, -1))[:-1] / 2.0

    # Best holders
    best_distributions = []

    # Estimate distribution parameters from data
    for ii, distribution in enumerate([d for d in _distn_names if d not in ["levy_stable", "studentized_range"]]):

        print("{:>3} / {:<3}: {}".format(ii + 1, len(_distn_names), distribution))

        distribution = getattr(st, distribution)

        # Try to fit the distribution
        try:
            # Ignore warnings from data that can't be fit
            with warnings.catch_warnings():
                warnings.filterwarnings("ignore")

                # fit dist to data
                params = distribution.fit(data)

                # Separate parts of parameters
                arg = params[:-2]
                loc = params[-2]
                scale = params[-1]

                # Calculate fitted PDF and error with fit in distribution
                pdf = distribution.pdf(x, loc=loc, scale=scale, *arg)
                sse = np.sum(np.power(y - pdf, 2.0))

                # if an axis was passed in, add to plot
                try:
                    if ax:
                        pd.Series(pdf, x).plot(ax=ax)
                except Exception:
                    pass

                # record this distribution; the list is sorted by SSE below
                best_distributions.append((distribution, params, sse))

        except Exception:
            pass

    return sorted(best_distributions, key=lambda x: x[2])

def make_pdf(dist, params, size=10000):
    """Generate distribution's Probability Distribution Function"""

    # Separate parts of parameters
    arg = params[:-2]
    loc = params[-2]
    scale = params[-1]

    # Get sane start and end points of distribution
    start = dist.ppf(0.01, *arg, loc=loc, scale=scale) if arg else dist.ppf(0.01, loc=loc, scale=scale)
    end = dist.ppf(0.99, *arg, loc=loc, scale=scale) if arg else dist.ppf(0.99, loc=loc, scale=scale)

    # Build PDF and turn into pandas Series
    x = np.linspace(start, end, size)
    y = dist.pdf(x, loc=loc, scale=scale, *arg)
    pdf = pd.Series(y, x)

    return pdf

# Load data from statsmodels datasets
data = pd.Series(sm.datasets.elnino.load_pandas().data.set_index("YEAR").values.ravel())

# Plot for comparison
plt.figure(figsize=(12,8))
ax = data.plot(kind="hist", bins=50, density=True, alpha=0.5, color=list(matplotlib.rcParams["axes.prop_cycle"])[1]["color"])

# Save plot limits
dataYLim = ax.get_ylim()

# Find best fit distribution
best_distributions = best_fit_distribution(data, 200, ax)
best_dist = best_distributions[0]

# Update plots
ax.set_ylim(dataYLim)
ax.set_title(u"El Niño sea temp.\nAll Fitted Distributions")
ax.set_xlabel(u"Temp (°C)")
ax.set_ylabel("Frequency")

# Make PDF with best params
pdf = make_pdf(best_dist[0], best_dist[1])

# Display
plt.figure(figsize=(12,8))
ax = pdf.plot(lw=2, label="PDF", legend=True)
data.plot(kind="hist", bins=50, density=True, alpha=0.5, label="Data", legend=True, ax=ax)

param_names = (best_dist[0].shapes + ", loc, scale").split(", ") if best_dist[0].shapes else ["loc", "scale"]
param_str = ", ".join(["{}={:0.2f}".format(k,v) for k,v in zip(param_names, best_dist[1])])
dist_str = "{}({})".format(best_dist[0].name, param_str)

ax.set_title(u"El Niño sea temp. with best fit distribution\n" + dist_str)
ax.set_xlabel(u"Temp. (°C)")
ax.set_ylabel("Frequency")
```
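If you only want to see the core idea without the plotting and the loop over every distribution, a stripped-down sketch could look like this. It uses synthetic normal data and just two candidate distributions (both are my choices for illustration), and only the public `scipy.stats` API:

```python
import numpy as np
import scipy.stats as st

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=2000)  # synthetic sample (assumption)

# histogram of the data, evaluated at the bin centres
y, edges = np.histogram(data, bins=50, density=True)
x = (edges[:-1] + edges[1:]) / 2.0

def sse_for(dist):
    """Fit `dist` to the data and return the SSE against the histogram."""
    params = dist.fit(data)
    pdf = dist.pdf(x, *params[:-2], loc=params[-2], scale=params[-1])
    return float(np.sum((y - pdf) ** 2))

# two candidates only, to keep the example fast
errors = {name: sse_for(getattr(st, name)) for name in ("norm", "expon")}
best = min(errors, key=errors.get)
print(best, errors)
```

Keep in mind that the SSE depends on the binning, so the ranking between close candidates can change with `bins`.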

Disclaimer: I'm mostly writing this post with syntactical considerations and general behaviour in mind. I'm not familiar with the memory and CPU aspect of the methods described, and I aim this answer at those who have reasonably small sets of data, such that the quality of the interpolation can be the main aspect to consider. I am aware that when working with very large data sets, the better-performing methods (namely `griddata` and `RBFInterpolator` without a `neighbors` keyword argument) might not be feasible.

Note that this answer uses the new `RBFInterpolator` class introduced in `SciPy` 1.7.0. For the legacy `Rbf` class see the previous version of this answer.

I'm going to compare three kinds of multi-dimensional interpolation methods (`interp2d`/splines, `griddata` and `RBFInterpolator`). I will subject them to two kinds of interpolation tasks and two kinds of underlying functions (points from which are to be interpolated). The specific examples will demonstrate two-dimensional interpolation, but the viable methods are applicable in arbitrary dimensions. Each method provides various kinds of interpolation; in all cases I will use cubic interpolation (or something close^{1}). It's important to note that whenever you use interpolation you introduce bias compared to your raw data, and the specific methods used affect the artifacts that you will end up with. Always be aware of this, and interpolate responsibly.

The two interpolation tasks will be

- upsampling (input data is on a rectangular grid, output data is on a denser grid)
- interpolation of scattered data onto a regular grid

The two functions (over the domain `[x, y] in [-1, 1]x[-1, 1]`) will be

- a smooth and friendly function: `cos(pi*x)*sin(pi*y)`; range in `[-1, 1]`
- an evil (and in particular, non-continuous) function: `x*y / (x^2 + y^2)` with a value of 0.5 near the origin; range in `[-0.5, 0.5]`

Here's how they look:

I will first demonstrate how the three methods behave under these four tests, then I'll detail the syntax of all three. If you know what you should expect from a method, you might not want to waste your time learning its syntax (looking at you, `interp2d`).

For the sake of explicitness, here is the code with which I generated the input data. While in this specific case I'm obviously aware of the function underlying the data, I will only use this to generate input for the interpolation methods. I use numpy for convenience (and mostly for generating the data), but scipy alone would suffice too.

```
import numpy as np
import scipy.interpolate as interp

# auxiliary function for mesh generation
def gimme_mesh(n):
    minval = -1
    maxval = 1
    # produce an asymmetric shape in order to catch issues with transpositions
    return np.meshgrid(np.linspace(minval, maxval, n),
                       np.linspace(minval, maxval, n + 1))

# set up underlying test functions, vectorized
def fun_smooth(x, y):
    return np.cos(np.pi*x) * np.sin(np.pi*y)

def fun_evil(x, y):
    # watch out for singular origin; function has no unique limit there
    return np.where(x**2 + y**2 > 1e-10, x*y/(x**2+y**2), 0.5)

# sparse input mesh, 6x7 in shape
N_sparse = 6
x_sparse, y_sparse = gimme_mesh(N_sparse)
z_sparse_smooth = fun_smooth(x_sparse, y_sparse)
z_sparse_evil = fun_evil(x_sparse, y_sparse)

# scattered input points, 10^2 altogether (shape (100,))
N_scattered = 10
rng = np.random.default_rng()
x_scattered, y_scattered = rng.random((2, N_scattered**2))*2 - 1
z_scattered_smooth = fun_smooth(x_scattered, y_scattered)
z_scattered_evil = fun_evil(x_scattered, y_scattered)

# dense output mesh, 20x21 in shape
N_dense = 20
x_dense, y_dense = gimme_mesh(N_dense)
```

Let's start with the easiest task. Here's how an upsampling from a mesh of shape `[6, 7]` to one of `[20, 21]` works out for the smooth test function:

Even though this is a simple task, there are already subtle differences between the outputs. At a first glance all three outputs are reasonable. There are two features to note, based on our prior knowledge of the underlying function: the middle case of `griddata` distorts the data most. Note the `y == -1` boundary of the plot (nearest the `x` label): the function should be strictly zero (since `y == -1` is a nodal line for the smooth function), yet this is not the case for `griddata`. Also note the `x == -1` boundary of the plots (behind, to the left): the underlying function has a local maximum (implying zero gradient near the boundary) at `[-1, -0.5]`, yet the `griddata` output shows clearly non-zero gradient in this region. The effect is subtle, but it's a bias none the less.

A bit harder task is to perform upsampling on our evil function:

Clear differences are starting to show among the three methods. Looking at the surface plots, there are clear spurious extrema appearing in the output from `interp2d` (note the two humps on the right side of the plotted surface). `griddata` and `RBFInterpolator` seem to produce similar results at first glance, both producing a local minimum near `[0.4, -0.4]` that is absent from the underlying function.

However, there is one crucial aspect in which `RBFInterpolator` is far superior: it respects the symmetry of the underlying function (which is of course also made possible by the symmetry of the sample mesh). The output from `griddata` breaks the symmetry of the sample points, which is already weakly visible in the smooth case.

Most often one wants to perform interpolation on scattered data. For this reason I expect these tests to be more important. As shown above, the sample points were chosen pseudo-uniformly in the domain of interest. In realistic scenarios you might have additional noise with each measurement, and you should consider whether it makes sense to interpolate your raw data to begin with.

Output for the smooth function:

Now there's already a bit of a horror show going on. I clipped the output from `interp2d` to between `[-1, 1]` exclusively for plotting, in order to preserve at least a minimal amount of information. It's clear that while some of the underlying shape is present, there are huge noisy regions where the method completely breaks down. The second case of `griddata` reproduces the shape fairly nicely, but note the white regions at the border of the contour plot. This is due to the fact that `griddata` only works inside the convex hull of the input data points (in other words, it doesn't perform any *extrapolation*). I kept the default NaN value for output points lying outside the convex hull.^{2} Considering these features, `RBFInterpolator` seems to perform best.

And the moment we've all been waiting for:

It's no huge surprise that `interp2d` gives up. In fact, during the call to `interp2d` you should expect some friendly `RuntimeWarning`s complaining about the impossibility of the spline to be constructed. As for the other two methods, `RBFInterpolator` seems to produce the best output, even near the borders of the domain where the result is extrapolated.

So let me say a few words about the three methods, in decreasing order of preference (so that the worst is the least likely to be read by anybody).

`scipy.interpolate.RBFInterpolator`

The RBF in the name of the `RBFInterpolator` class stands for "radial basis functions". To be honest I've never considered this approach until I started researching for this post, but I'm pretty sure I'll be using these in the future.

Just like the spline-based methods (see later), usage comes in two steps: first one creates a callable `RBFInterpolator` class instance based on the input data, and then calls this object for a given output mesh to obtain the interpolated result. Example from the smooth upsampling test:

```
import scipy.interpolate as interp

sparse_points = np.stack([x_sparse.ravel(), y_sparse.ravel()], -1)  # shape (N, 2) in 2d
dense_points = np.stack([x_dense.ravel(), y_dense.ravel()], -1)  # shape (N, 2) in 2d

zfun_smooth_rbf = interp.RBFInterpolator(sparse_points, z_sparse_smooth.ravel(),
                                         smoothing=0, kernel="cubic")  # explicit default smoothing=0 for interpolation
z_dense_smooth_rbf = zfun_smooth_rbf(dense_points).reshape(x_dense.shape)  # not really a function, but a callable class instance

zfun_evil_rbf = interp.RBFInterpolator(sparse_points, z_sparse_evil.ravel(),
                                       smoothing=0, kernel="cubic")  # explicit default smoothing=0 for interpolation
z_dense_evil_rbf = zfun_evil_rbf(dense_points).reshape(x_dense.shape)  # not really a function, but a callable class instance
```

Note that we had to do some array building gymnastics to make the API of `RBFInterpolator` happy. Since we have to pass the 2d points as arrays of shape `(N, 2)`, we have to flatten the input grid and stack the two flattened arrays. The constructed interpolator also expects query points in this format, and the result will be a 1d array of shape `(N,)` which we have to reshape back to match our 2d grid for plotting. Since `RBFInterpolator` makes no assumptions about the number of dimensions of the input points, it supports arbitrary dimensions for interpolation.
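The shape handling is easier to follow on scattered points, where no flattening of a grid is needed for the input. The following is a self-contained sketch with made-up sample points (it needs SciPy 1.7.0 or later); it also checks that with `smoothing=0` the interpolant passes through the input values:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(0)
points = rng.random((30, 2)) * 2 - 1     # 30 scattered 2-D points in [-1, 1]^2
values = np.cos(np.pi * points[:, 0]) * np.sin(np.pi * points[:, 1])

rbf = RBFInterpolator(points, values, kernel="cubic", smoothing=0)

# query points must also have shape (M, 2); the result has shape (M,)
xg, yg = np.meshgrid(np.linspace(-1, 1, 5), np.linspace(-1, 1, 4))
query = np.stack([xg.ravel(), yg.ravel()], axis=-1)
z = rbf(query).reshape(xg.shape)

# with smoothing=0 the interpolant reproduces the input values
max_node_error = np.max(np.abs(rbf(points) - values))
print(z.shape, max_node_error)
```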

So, `scipy.interpolate.RBFInterpolator`

- produces well-behaved output even for crazy input data
- supports interpolation in higher dimensions
- extrapolates outside the convex hull of the input points (of course extrapolation is always a gamble, and you should generally not rely on it at all)
- creates an interpolator as a first step, so evaluating it in various output points is less additional effort
- can have output point arrays of arbitrary shape (as opposed to being constrained to rectangular meshes, see later)
- is more likely to preserve the symmetry of the input data
- supports multiple kinds of radial functions for keyword `kernel`: `multiquadric`, `inverse_multiquadric`, `inverse_quadratic`, `gaussian`, `linear`, `cubic`, `quintic`, `thin_plate_spline` (the default). As of SciPy 1.7.0 the class doesn't allow passing a custom callable due to technical reasons, but this is likely to be added in a future version.
- can give inexact interpolations by increasing the `smoothing` parameter

One drawback of RBF interpolation is that interpolating `N` data points involves inverting an `N x N` matrix. This quadratic complexity very quickly blows up memory need for a large number of data points. However, the new `RBFInterpolator` class also supports a `neighbors` keyword parameter that restricts computation of each radial basis function to `k` nearest neighbours, thereby reducing memory need.
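A minimal sketch of the `neighbors` keyword, again on invented scattered data. The value 20 is an arbitrary choice for illustration, not a recommendation:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(0)
points = rng.random((500, 2)) * 2 - 1
values = np.cos(np.pi * points[:, 0]) * np.sin(np.pi * points[:, 1])

# each evaluation only uses the 20 nearest data points,
# so the full 500 x 500 system is never built
rbf_local = RBFInterpolator(points, values, kernel="cubic", neighbors=20)

query = rng.random((10, 2)) * 2 - 1
z_local = rbf_local(query)
print(z_local.shape)
```

The result is then a patchwork of local interpolants, so it may differ slightly from what the global interpolator would give.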

`scipy.interpolate.griddata`

My former favourite, `griddata`, is a general workhorse for interpolation in arbitrary dimensions. It doesn't perform extrapolation beyond setting a single preset value for points outside the convex hull of the nodal points, but since extrapolation is a very fickle and dangerous thing, this is not necessarily a con. Usage example:

```
sparse_points = np.stack([x_sparse.ravel(), y_sparse.ravel()], -1)  # shape (N, 2) in 2d
z_dense_smooth_griddata = interp.griddata(sparse_points, z_sparse_smooth.ravel(),
                                          (x_dense, y_dense), method="cubic")  # default method is linear
```

Note that the same array transformations were necessary for the input arrays as for `RBFInterpolator`. The input points have to be specified in an array of shape `[N, D]` in `D` dimensions, or alternatively as a tuple of 1d arrays:

```
z_dense_smooth_griddata = interp.griddata((x_sparse.ravel(), y_sparse.ravel()),
                                          z_sparse_smooth.ravel(), (x_dense, y_dense), method="cubic")
```

The output point arrays can be specified as a tuple of arrays of arbitrary dimensions (as in both above snippets), which gives us some more flexibility.

In a nutshell, `scipy.interpolate.griddata`

- produces well-behaved output even for crazy input data
- supports interpolation in higher dimensions
- does not perform extrapolation, a single value can be set for the output outside the convex hull of the input points (see `fill_value`)
- computes the interpolated values in a single call, so probing multiple sets of output points starts from scratch
- can have output points of arbitrary shape
- supports nearest-neighbour and linear interpolation in arbitrary dimensions, cubic in 1d and 2d. Nearest-neighbour and linear interpolation use `NearestNDInterpolator` and `LinearNDInterpolator` under the hood, respectively. 1d cubic interpolation uses a spline, 2d cubic interpolation uses `CloughTocher2DInterpolator` to construct a continuously differentiable piecewise-cubic interpolator.
- might violate the symmetry of the input data
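The convex hull behaviour and `fill_value` can be checked on a tiny example. The four points and the linear test function below are made up for illustration:

```python
import numpy as np
from scipy.interpolate import griddata

# four corners of the unit square, sampled from f(x, y) = x + y
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
values = np.array([0.0, 1.0, 1.0, 2.0])

inside = np.array([[0.25, 0.5]])    # within the convex hull
outside = np.array([[2.0, 2.0]])    # outside the convex hull

z_inside = griddata(points, values, inside, method="linear")
z_outside = griddata(points, values, outside, method="linear")  # NaN by default
z_filled = griddata(points, values, outside, method="linear", fill_value=0.0)

print(z_inside, z_outside, z_filled)
```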

`scipy.interpolate.interp2d`/`scipy.interpolate.bisplrep`

The only reason I'm discussing `interp2d` and its relatives is that it has a deceptive name, and people are likely to try using it. Spoiler alert: don't use it (as of scipy version 1.7.0). It's already more special than the previous subjects in that it's specifically used for two-dimensional interpolation, but I suspect this is by far the most common case for multivariate interpolation.

As far as syntax goes, `interp2d` is similar to `RBFInterpolator` in that it first needs constructing an interpolation instance, which can be called to provide the actual interpolated values. There's a catch, however: the output points have to be located on a rectangular mesh, so inputs going into the call to the interpolator have to be 1d vectors which span the output grid, as if from `numpy.meshgrid`:

```
# reminder: x_sparse and y_sparse are of shape [6, 7] from numpy.meshgrid
zfun_smooth_interp2d = interp.interp2d(x_sparse, y_sparse, z_sparse_smooth, kind="cubic") # default kind is "linear"
# reminder: x_dense and y_dense are of shape (20, 21) from numpy.meshgrid
xvec = x_dense[0,:] # 1d array of unique x values, 20 elements
yvec = y_dense[:,0] # 1d array of unique y values, 21 elements
z_dense_smooth_interp2d = zfun_smooth_interp2d(xvec, yvec) # output is (20, 21)-shaped array
```

One of the most common mistakes when using `interp2d` is putting your full 2d meshes into the interpolation call, which leads to explosive memory consumption, and hopefully to a hasty `MemoryError`.

Now, the greatest problem with `interp2d` is that it often doesn't work. In order to understand this, we have to look under the hood. It turns out that `interp2d` is a wrapper for the lower-level functions `bisplrep` + `bisplev`, which are in turn wrappers for FITPACK routines (written in Fortran). The equivalent call to the previous example would be

```
kind = "cubic"
if kind == "linear":
    kx = ky = 1
elif kind == "cubic":
    kx = ky = 3
elif kind == "quintic":
    kx = ky = 5
# bisplrep constructs a spline representation, bisplev evaluates the spline at given points
bisp_smooth = interp.bisplrep(x_sparse.ravel(), y_sparse.ravel(),
                              z_sparse_smooth.ravel(), kx=kx, ky=ky, s=0)
z_dense_smooth_bisplrep = interp.bisplev(xvec, yvec, bisp_smooth).T  # note the transpose
```

Now, here's the thing about `interp2d`: (in scipy version 1.7.0) there is a nice comment in `interpolate/interpolate.py` for `interp2d`:

```
if not rectangular_grid:
    # TODO: surfit is really not meant for interpolation!
    self.tck = fitpack.bisplrep(x, y, z, kx=kx, ky=ky, s=0.0)
```

and indeed in `interpolate/fitpack.py`, in `bisplrep` there's some setup and ultimately

```
tx, ty, c, o = _fitpack._surfit(x, y, z, w, xb, xe, yb, ye, kx, ky,
                                task, s, eps, tx, ty, nxest, nyest,
                                wrk, lwrk1, lwrk2)
```

And that's it. The routines underlying `interp2d` are not really meant to perform interpolation. They might suffice for sufficiently well-behaved data, but under realistic circumstances you will probably want to use something else.

Just to conclude, `interpolate.interp2d`

- can lead to artifacts even with well-tempered data
- is specifically for bivariate problems (although there's the limited `interpn` for input points defined on a grid)
- performs extrapolation
- creates an interpolator as a first step, so evaluating it in various output points is less additional effort
- can only produce output over a rectangular grid, for scattered output you would have to call the interpolator in a loop
- supports linear, cubic and quintic interpolation
- might violate the symmetry of the input data
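One more practical point for readers on recent SciPy: `interp2d` was deprecated in SciPy 1.10 and removed in 1.14, so the `interp2d` snippets above only run on older versions. For input data on a rectangular grid, one possible replacement (it was not part of the comparison above, so treat this as a sketch rather than a recommendation) is `RectBivariateSpline`. Note that it expects `z` with shape `(len(x), len(y))`, which is the transpose of what `numpy.meshgrid` produces by default:

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline

xvec = np.linspace(-1, 1, 6)          # 6 unique x values
yvec = np.linspace(-1, 1, 7)          # 7 unique y values
xg, yg = np.meshgrid(xvec, yvec)      # both arrays have shape (7, 6)
z = np.cos(np.pi * xg) * np.sin(np.pi * yg)

# RectBivariateSpline wants z indexed as [x, y], hence the transpose
spline = RectBivariateSpline(xvec, yvec, z.T, kx=3, ky=3, s=0)

x_new = np.linspace(-1, 1, 20)
y_new = np.linspace(-1, 1, 21)
z_dense = spline(x_new, y_new).T      # back to the meshgrid layout: (21, 20)

# with s=0 the spline interpolates, so it reproduces the input nodes
node_error = np.max(np.abs(spline(xvec, yvec).T - z))
print(z_dense.shape, node_error)
```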

^{1}I'm fairly certain that the `cubic` and `linear` kind of basis functions of `RBFInterpolator` do not exactly correspond to the other interpolators of the same name.

^{2}These NaNs are also the reason for why the surface plot seems so odd: matplotlib historically has difficulties with plotting complex 3d objects with proper depth information. The NaN values in the data confuse the renderer, so parts of the surface that should be in the back are plotted to be in the front. This is an issue with visualization, and not interpolation.

To save some folks some time, here is a list I extracted from a small corpus. I do not know if it is complete, but it should have most (if not all) of the help definitions from upenn_tagset...

**CC**: conjunction, coordinating

```
& 'n and both but either et for less minus neither nor or plus so
therefore times v. versus vs. whether yet
```

**CD**: numeral, cardinal

```
mid-1890 nine-thirty forty-two one-tenth ten million 0.5 one
forty-seven 1987 twenty '79 zero two 78-degrees eighty-four IX '60s .025
fifteen 271,124 dozen quintillion DM2,000 ...
```

**DT**: determiner

```
all an another any both del each either every half la many much nary
neither no some such that the them these this those
```

**EX**: existential there

```
there
```

**IN**: preposition or conjunction, subordinating

```
astride among upon whether out inside pro despite on by throughout
below within for towards near behind atop around if like until below
next into if beside ...
```

**JJ**: adjective or numeral, ordinal

```
third ill-mannered pre-war regrettable oiled calamitous first separable
ectoplasmic battery-powered participatory fourth still-to-be-named
multilingual multi-disciplinary ...
```

**JJR**: adjective, comparative

```
bleaker braver breezier briefer brighter brisker broader bumper busier
calmer cheaper choosier cleaner clearer closer colder commoner costlier
cozier creamier crunchier cuter ...
```

**JJS**: adjective, superlative

```
calmest cheapest choicest classiest cleanest clearest closest commonest
corniest costliest crassest creepiest crudest cutest darkest deadliest
dearest deepest densest dinkiest ...
```

**LS**: list item marker

```
A A. B B. C C. D E F First G H I J K One SP-44001 SP-44002 SP-44005
SP-44007 Second Third Three Two * a b c d first five four one six three
two
```

**MD**: modal auxiliary

```
can cannot could couldn't dare may might must need ought shall should
shouldn't will would
```

**NN**: noun, common, singular or mass

```
common-carrier cabbage knuckle-duster Casino afghan shed thermostat
investment slide humour falloff slick wind hyena override subhumanity
machinist ...
```

**NNP**: noun, proper, singular

```
Motown Venneboerger Czestochwa Ranzer Conchita Trumplane Christos
Oceanside Escobar Kreisler Sawyer Cougar Yvette Ervin ODI Darryl CTCA
Shannon A.K.C. Meltex Liverpool ...
```

**NNS**: noun, common, plural

```
undergraduates scotches bric-a-brac products bodyguards facets coasts
divestitures storehouses designs clubs fragrances averages
subjectivists apprehensions muses factory-jobs ...
```

**PDT**: pre-determiner

```
all both half many quite such sure this
```

**POS**: genitive marker

```
' 's
```

**PRP**: pronoun, personal

```
hers herself him himself hisself it itself me myself one oneself ours
ourselves ownself self she thee theirs them themselves they thou thy us
```

**PRP$**: pronoun, possessive

```
her his mine my our ours their thy your
```

**RB**: adverb

```
occasionally unabatingly maddeningly adventurously professedly
stirringly prominently technologically magisterially predominately
swiftly fiscally pitilessly ...
```

**RBR**: adverb, comparative

```
further gloomier grander graver greater grimmer harder harsher
healthier heavier higher however larger later leaner lengthier less-
perfectly lesser lonelier longer louder lower more ...
```

**RBS**: adverb, superlative

```
best biggest bluntest earliest farthest first furthest hardest
heartiest highest largest least less most nearest second tightest worst
```

**RP**: particle

```
aboard about across along apart around aside at away back before behind
by crop down ever fast for forth from go high i.e. in into just later
low more off on open out over per pie raising start teeth that through
under unto up up-pp upon whole with you
```

**TO**: "to" as preposition or infinitive marker

```
to
```

**UH**: interjection

```
Goodbye Goody Gosh Wow Jeepers Jee-sus Hubba Hey Kee-reist Oops amen
huh howdy uh dammit whammo shucks heck anyways whodunnit honey golly
man baby diddle hush sonuvabitch ...
```

**VB**: verb, base form

```
ask assemble assess assign assume atone attention avoid bake balkanize
bank begin behold believe bend benefit bevel beware bless boil bomb
boost brace break bring broil brush build ...
```

**VBD**: verb, past tense

```
dipped pleaded swiped regummed soaked tidied convened halted registered
cushioned exacted snubbed strode aimed adopted belied figgered
speculated wore appreciated contemplated ...
```

**VBG**: verb, present participle or gerund

```
telegraphing stirring focusing angering judging stalling lactating
hankerin' alleging veering capping approaching traveling besieging
encrypting interrupting erasing wincing ...
```

**VBN**: verb, past participle

```
multihulled dilapidated aerosolized chaired languished panelized used
experimented flourished imitated reunifed factored condensed sheared
unsettled primed dubbed desired ...
```

**VBP**: verb, present tense, not 3rd person singular

```
predominate wrap resort sue twist spill cure lengthen brush terminate
appear tend stray glisten obtain comprise detest tease attract
emphasize mold postpone sever return wag ...
```

**VBZ**: verb, present tense, 3rd person singular

```
bases reconstructs marks mixes displeases seals carps weaves snatches
slumps stretches authorizes smolders pictures emerges stockpiles
seduces fizzes uses bolsters slaps speaks pleads ...
```

**WDT**: WH-determiner

```
that what whatever which whichever
```

**WP**: WH-pronoun

```
that what whatever whatsoever which who whom whosoever
```

**WRB**: Wh-adverb

```
how however whence whenever where whereby whereever wherein whereof why
```

The answer below pertains primarily to *Signed Cookies*, an implementation of the concept of *sessions* (as used in web applications). Flask offers both normal (unsigned) cookies (via `request.cookies` and `response.set_cookie()`) and signed cookies (via `flask.session`). The answer has two parts: the first describes how a Signed Cookie is generated, and the second is presented in the form of a Q&A that addresses different aspects of the scheme. The syntax used for the examples is Python 3, but the concepts apply also to previous versions.

`SECRET_KEY` (or how to create a Signed Cookie)?

Signing cookies is a preventive measure against cookie tampering. During the process of signing a cookie, the `SECRET_KEY` is used in a way similar to how a "salt" would be used to muddle a password before hashing it. Here's a (wildly) simplified description of the concept. The code in the examples is meant to be illustrative. Many of the steps have been omitted and not all of the functions actually exist. The goal here is to provide an understanding of the general idea; actual implementations will be a bit more involved. Also, keep in mind that Flask does most of this for you in the background. So, besides setting values to your cookie (via the session API) and providing a `SECRET_KEY`, it's not only ill-advised to reimplement this yourself, but there's no need to do so:

( 1 ) First a `SECRET_KEY` is established. It should only be known to the application and should be kept relatively constant during the application's life cycle, including through application restarts.

```
# choose a salt, a secret string of bytes
>>> SECRET_KEY = "my super secret key".encode("utf8")
```

( 2 ) create a cookie

```
>>> cookie = make_cookie(
... name="_profile",
... content="uid=382|membership=regular",
... ...
... expires="July 1 2030..."
... )
>>> print(cookie)
name: _profile
content: uid=382|membership=regular...
...
...
expires: July 1 2030, 1:20:40 AM UTC
```

( 3 ) to create a signature, append (or prepend) the `SECRET_KEY` to the cookie byte string, then generate a hash from that combination.

```
# encode and salt the cookie, then hash the result
>>> cookie_bytes = str(cookie).encode("utf8")
>>> signature = sha1(cookie_bytes+SECRET_KEY).hexdigest()
>>> print(signature)
7ae0e9e033b5fa53aa....
```

( 4 ) Now affix the signature at one end of the `content` field of the original cookie.

```
# include signature as part of the cookie
>>> cookie.content = cookie.content + "|" + signature
>>> print(cookie)
name: _profile
content: uid=382|membership=regular|7ae0e9... <--- signature
domain: .example.com
path: /
send for: Encrypted connections only
expires: July 1 2030, 1:20:40 AM UTC
```

and that's what's sent to the client.

```
# add cookie to response
>>> response.set_cookie(cookie)
# send to browser -->
```

( 5 ) When the browser returns this cookie back to the server, strip the signature from the cookie's `content` field to get back the original cookie.

```
# Upon receiving the cookie from browser
>>> cookie = request.get_cookie()
# pop the signature out of the cookie
>>> (cookie.content, popped_signature) = cookie.content.rsplit("|", 1)
```

( 6 ) Use the original cookie with the application's `SECRET_KEY` to recalculate the signature using the same method as in step 3.

```
# recalculate signature using SECRET_KEY and original cookie
>>> cookie_bytes = str(cookie).encode("utf8")
>>> calculated_signature = sha1(cookie_bytes+SECRET_KEY).hexdigest()
```

( 7 ) Compare the calculated result with the signature previously popped out of the just received cookie. If they match, we know that the cookie has not been messed with. But if even just a space has been added to the cookie, the signatures won't match.

```
# if both signatures match, your cookie has not been modified
>>> good_cookie = popped_signature==calculated_signature
```

( 8 ) If they don't match then you may respond with any number of actions: log the event, discard the cookie, issue a fresh one, redirect to a login page, etc.

```
>>> if not good_cookie:
...     security_log(cookie)
```
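Steps 3 to 7 can be condensed into a runnable toy version. To be clear, this is still the simplified scheme described above (plain SHA-1 over content plus key, and only the `content` string is signed), not what Flask actually does; the `sign` and `verify` helpers are made up for this sketch:

```python
import hashlib

SECRET_KEY = "my super secret key".encode("utf8")

def sign(content: str) -> str:
    """Append a SHA-1 signature of content + SECRET_KEY to the content."""
    signature = hashlib.sha1(content.encode("utf8") + SECRET_KEY).hexdigest()
    return content + "|" + signature

def verify(signed: str) -> bool:
    """Recompute the signature and compare it with the one that was sent."""
    content, popped_signature = signed.rsplit("|", 1)
    calculated = hashlib.sha1(content.encode("utf8") + SECRET_KEY).hexdigest()
    return popped_signature == calculated

signed_cookie = sign("uid=382|membership=regular")
# a client edits the content but cannot recompute the signature
tampered = signed_cookie.replace("regular", "premium")

print(verify(signed_cookie))  # True
print(verify(tampered))       # False
```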

The type of signature generated above that requires a secret key to ensure the integrity of some contents is called in cryptography a *Message Authentication Code* or *MAC*.

I specified earlier that the example above is an oversimplification of that concept and that it wasn't a good idea to implement your own signing. That's because the algorithm used to sign cookies in Flask is called HMAC and is a bit more involved than the simple step-by-step above. The general idea is the same, but for reasons beyond the scope of this discussion, the series of computations is a bit more complex. If you're still interested in crafting your own, as is usually the case, Python has some modules to help you get started :) Here's a starting block:

```
import hmac
import hashlib

def create_signature(secret_key, msg, digestmod=None):
    # secret_key and msg must be bytes; SHA-1 is used unless told otherwise
    if digestmod is None:
        digestmod = hashlib.sha1
    mac = hmac.new(secret_key, msg=msg, digestmod=digestmod)
    return mac.digest()
```
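As a rough sketch of how the eight steps could look with `hmac` instead of a bare `sha1` call, the following signs a cookie's content and verifies it on the way back. The key, the cookie content, and the `sign`/`verify` helper names are made up for illustration, and this is not how Flask itself lays out its cookies.

```python
import hashlib
import hmac

# Assumption: a made-up key, for illustration only.
SECRET_KEY = b"not-a-real-key"

def sign(content: str, key: bytes = SECRET_KEY) -> str:
    """Return content with a hex HMAC-SHA1 signature appended after '|'."""
    mac = hmac.new(key, msg=content.encode("utf8"), digestmod=hashlib.sha1)
    return content + "|" + mac.hexdigest()

def verify(signed: str, key: bytes = SECRET_KEY) -> bool:
    """Split off the trailing signature, recompute it, and compare."""
    content, _, popped = signed.rpartition("|")
    expected = hmac.new(key, msg=content.encode("utf8"), digestmod=hashlib.sha1)
    # compare_digest is the standard-library helper for comparing digests
    return hmac.compare_digest(popped, expected.hexdigest())

signed = sign("uid=382|membership=regular")
print(verify(signed))                                 # untouched cookie
print(verify(signed.replace("regular", "premium")))   # edited cookie
```

The second call edits only the content part, so the recomputed signature no longer matches the one that travelled with the cookie.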

The documentation for hmac and hashlib.

## What's a "signature" in this context?

It's a method to ensure that some content has not been modified by anyone other than a person or an entity authorized to do so.

One of the simplest forms of signature is the "checksum", which simply verifies that two pieces of data are the same. For example, when installing software from source it's important to first confirm that your copy of the source code is identical to the author's. A common approach is to run the source through a cryptographic hash function and compare the output with the checksum published on the project's home page.

Let's say, for instance, that you're about to download a project's source in a gzipped file from a web mirror. The SHA1 checksum published on the project's web page is "eb84e8da7ca23e9f83...."

```
# so you get the code from the mirror
download https://mirror.example-codedump.com/source_code.tar.gz
# you calculate the hash as instructed
sha1(source_code.tar.gz)
> eb84e8da7c....
```

If both hashes are the same, you know that you have an identical copy.
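In Python, the same check could be sketched with `hashlib`. The helper name `sha1_of_file` and the chunk size are my own choices; the file name in the usage comment is just the one from the example above.

```python
import hashlib

def sha1_of_file(path, chunk_size=65536):
    """Return the hex SHA-1 digest of the file at path, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b""
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Usage (hypothetical): compare against the checksum published by the project.
# print(sha1_of_file("source_code.tar.gz") == published_checksum)
```

Reading in chunks keeps memory use flat even for large archives.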

## What's a cookie?

An extensive discussion of cookies would go beyond the scope of this question. I provide an overview here since a minimal understanding helps to see how and why `SECRET_KEY` is useful. I highly encourage you to follow up with some personal reading on HTTP cookies.

A common practice in web applications is to use the client (web browser) as a lightweight cache. Cookies are one implementation of this practice. A cookie is typically some data added by the server to an HTTP response by way of its headers. It's kept by the browser, which subsequently sends it back to the server when issuing requests, also by way of HTTP headers. The data contained in a cookie can be used to emulate what's called *statefulness*, the illusion that the server is maintaining an ongoing connection with the client. Only, in this case, instead of a wire to keep the connection "alive", you simply have snapshots of the state of the application after it has handled a client's request. These snapshots are carried back and forth between client and server. Upon receiving a request, the server first reads the content of the cookie to reestablish the context of its conversation with the client. It then handles the request within that context and, before returning the response to the client, updates the cookie. The illusion of an ongoing session is thus maintained.

## What does a cookie look like?

A typical cookie would look like this:

```
name: _profile
content: uid=382|status=genie
domain: .example.com
path: /
send for: Encrypted connections only
expires: July 1 2030, 1:20:40 AM UTC
```

Cookies are trivial to peruse from any modern browser. On Firefox for example go to *Preferences > Privacy > History > remove individual cookies*.

The `content` field is the most relevant to the application. Other fields carry mostly meta instructions to specify various scopes of influence.

## Why use cookies at all?

The short answer is performance. Using cookies minimizes the need to look things up in various data stores (memory caches, files, databases, etc.), thus speeding things up on the server application's side. Keep in mind that the bigger the cookie, the heavier the payload over the network, so what you save in database lookups on the server you might lose over the network. Consider carefully what to include in your cookies.

## Why would cookies need to be signed?

Cookies are used to keep all sorts of information, some of which can be very sensitive. They're also by nature not safe and require a number of auxiliary precautions to be considered secure in any way for both parties, client and server. Signing cookies specifically addresses the problem that they can be tinkered with in attempts to fool server applications. There are other measures to mitigate other types of vulnerabilities; I encourage you to read up more on cookies.

## How can a cookie be tampered with?

Cookies reside on the client in text form and can be edited with no effort. A cookie received by your server application could have been modified for a number of reasons, some of which may not be innocent. Imagine a web application that keeps permission information about its users on cookies and grants privileges based on that information. If the cookie is not tinker-proof, anyone could modify theirs to elevate their status from "role=visitor" to "role=admin" and the application would be none the wiser.

## Why is a `SECRET_KEY` necessary to sign cookies?

Verifying cookies is a bit different from verifying source code the way it's described earlier. In the case of the source code, the original author is the trustee and owner of the reference fingerprint (the checksum), which will be kept public. What you don't trust is the source code, but you trust the public signature. So to verify your copy of the source you simply want your calculated hash to match the public hash.

In the case of a cookie, however, the application doesn't keep track of the signature; it keeps track of its `SECRET_KEY`. The `SECRET_KEY` is the reference fingerprint. Cookies travel with a signature that they claim to be legit. Legitimacy here means that the signature was issued by the owner of the cookie, that is, the application. In this case it's that claim that you don't trust, and you need to check the signature for validity. To do that you need to include an element in the signature that is only known to you: the `SECRET_KEY`. Someone may change a cookie, but since they don't have the secret ingredient to properly calculate a valid signature, they cannot spoof it. As stated a bit earlier, this type of fingerprinting, where on top of the checksum one also provides a secret key, is called a Message Authentication Code.
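To make the difference concrete, here is a small sketch contrasting a plain checksum with a keyed signature. The key values and helper names are invented for the example; the point is only that a plain hash can be recomputed by anyone, while a keyed one cannot be reproduced without the key.

```python
import hashlib
import hmac

SECRET_KEY = b"known-only-to-the-server"  # made-up value

def plain_checksum(content: str) -> str:
    """Unkeyed SHA-1: anyone holding the content can compute this."""
    return hashlib.sha1(content.encode("utf8")).hexdigest()

def keyed_signature(content: str, key: bytes) -> str:
    """HMAC-SHA1: the result depends on the key as well as the content."""
    return hmac.new(key, content.encode("utf8"), hashlib.sha1).hexdigest()

forged = "role=admin"

# With a plain checksum, the attacker simply recomputes it for forged content.
attacker_checksum = plain_checksum(forged)
print(attacker_checksum == plain_checksum(forged))

# With a keyed signature, a guess made without the real key does not match
# what the server computes.
attacker_guess = keyed_signature(forged, b"wrong-guess")
server_value = keyed_signature(forged, SECRET_KEY)
print(hmac.compare_digest(attacker_guess, server_value))
```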

## What about Sessions?

Sessions in their classical implementation are cookies that carry only an ID in the `content` field, the `session_id`. The purpose of sessions is exactly the same as that of signed cookies, i.e. to prevent cookie tampering. Classical sessions take a different approach though. Upon receiving a session cookie, the server uses the ID to look up the session data in its own local storage, which could be a database, a file, or sometimes a cache in memory. The session cookie is typically set to expire when the browser is closed. Because of the local storage lookup step, this implementation of sessions typically incurs a performance hit. Signed cookies are becoming a preferred alternative, and that's how Flask's sessions are implemented. In other words, Flask sessions *are* signed cookies, and to use signed cookies in Flask just use its `Session` API.

## Why not also encrypt the cookies?

Sometimes the contents of cookies can be encrypted before *also being signed*. This is done if they're deemed too sensitive to be visible from the browser (encryption hides the contents). Simply signing cookies, however, addresses a different need: one where there's a desire to keep cookies visible and usable in the browser while preventing them from being meddled with.

## What happens if I change the `SECRET_KEY`?

By changing the `SECRET_KEY` you're invalidating *all* cookies signed with the previous key. When the application receives a request with a cookie that was signed with a previous `SECRET_KEY`, it will try to calculate the signature with the new `SECRET_KEY`, and the two signatures won't match. The cookie and all its data will be rejected, as if the browser were connecting to the server for the first time. Users will be logged out and their old cookie will be forgotten, along with anything stored inside. Note that this is different from the way an expired cookie is handled. An expired cookie may have its lease extended if its signature checks out. An invalid signature just implies a plain invalid cookie.

So unless you want to invalidate all signed cookies, try to keep the `SECRET_KEY` the same for extended periods.

## What's a good `SECRET_KEY`?

A secret key should be hard to guess. The documentation on Sessions has a good recipe for random key generation:

```
>>> import os
>>> os.urandom(24)
b'\xfd{H\xe5<\x95\xf9\xe3\x96.5\xd1\x01O<!\xd5\xa2\xa0\x9fR"\xa1\xa8'
```

You copy the key and paste it in your configuration file as the value of `SECRET_KEY`.

Short of using a key that was randomly generated, you could use a complex assortment of words, numbers, and symbols, perhaps arranged in a sentence known only to you, encoded in byte form.

Do *not* set the `SECRET_KEY` directly with a function that generates a different key each time it's called. For example, don't do this:

```
# this is not good
SECRET_KEY = random_key_generator()
```

Each time your application is restarted it will be given a new key, thus invalidating the previous one.

Instead, open an interactive Python shell, call the function to generate the key, then copy and paste it into the config.
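As an alternative to `os.urandom`, the standard library's `secrets` module (Python 3.6 and later) can produce the same kind of random bytes. A minimal sketch of the one-off generation step, meant to be run by hand rather than at application start-up:

```python
import secrets

# Run once in an interactive shell, then paste the printed value into the
# configuration file as SECRET_KEY. Do not call this when the app starts.
key = secrets.token_bytes(24)
print(repr(key))
```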

Actually the purpose of `np.meshgrid` is already mentioned in the documentation:

Return coordinate matrices from coordinate vectors.

Make N-D coordinate arrays for vectorized evaluations of N-D scalar/vector fields over N-D grids, given one-dimensional coordinate arrays x1, x2,..., xn.

So its primary purpose is to create coordinate matrices.

You probably just asked yourself why you would need coordinate matrices in the first place.

The reason you need coordinate matrices with Python/NumPy is that there is no direct relation from coordinates to values, except when your coordinates start at zero and are purely non-negative integers. Then you can just use the indices of an array as the coordinates. However, when that's not the case you somehow need to store coordinates alongside your data. That's where grids come in.

Suppose your data is:

```
1 2 1
2 5 2
1 2 1
```

However, each value represents a 3 x 2 kilometer area (horizontal x vertical). Suppose your origin is the upper left corner and you want arrays that represent the distance. You could use:

```
import numpy as np
h, v = np.meshgrid(np.arange(3)*3, np.arange(3)*2)
```

where v is:

```
array([[0, 0, 0],
       [2, 2, 2],
       [4, 4, 4]])
```

and h:

```
array([[0, 3, 6],
       [0, 3, 6],
       [0, 3, 6]])
```

So if you have two indices, let's say `x` and `y` (that's why the return value of `meshgrid` is usually named `xx` or `xs` instead of `x`; in this case I chose `h` for horizontal!), then you can get the x coordinate of the point, the y coordinate of the point, and the value at that point by using:

```
h[x, y] # horizontal coordinate
v[x, y] # vertical coordinate
data[x, y] # value
```

That makes it much easier to keep track of coordinates **and** (even more importantly) you can pass them to functions that need to know the coordinates.
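As a small illustration of passing the coordinate arrays to a function, the sketch below reuses the 3 x 3 data and the `h`, `v` grids from above and computes each cell's distance from the origin. The 4 km threshold is an arbitrary choice for the example.

```python
import numpy as np

data = np.array([[1, 2, 1],
                 [2, 5, 2],
                 [1, 2, 1]])
h, v = np.meshgrid(np.arange(3) * 3, np.arange(3) * 2)

# Distance (in km) of every cell from the upper-left origin.
dist = np.hypot(h, v)
print(dist)

# Values of the cells that lie more than 4 km from the origin.
print(data[dist > 4])
```

Because `h`, `v`, and `data` share one shape, a boolean mask built from the coordinates can index the data directly.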

However, `np.meshgrid` itself isn't often used directly; mostly one just uses one of the *similar* objects `np.mgrid` or `np.ogrid`. Here `np.mgrid` represents the `sparse=False` case and `np.ogrid` the `sparse=True` case (I refer to the `sparse` argument of `np.meshgrid`). Note that there is a significant difference between `np.meshgrid` on the one hand and `np.ogrid` and `np.mgrid` on the other: the first two returned values (if there are two or more) are reversed. Often this doesn't matter, but you should give meaningful variable names depending on the context.

For example, in the case of a 2D grid and `matplotlib.pyplot.imshow`, it makes sense to name the first returned item of `np.meshgrid` `x` and the second one `y`, while it's the other way around for `np.mgrid` and `np.ogrid`.
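The reversal can be checked directly. The sketch below uses deliberately different axis lengths (4 and 3) so the two orderings cannot be confused, and also shows `np.meshgrid`'s `indexing="ij"` option, which returns the outputs in the same order as `np.mgrid`.

```python
import numpy as np

# Default meshgrid ("xy" indexing): the first output varies along columns.
xx, yy = np.meshgrid(np.arange(4), np.arange(3))

# mgrid follows array-index order: the first output varies along rows.
yy_m, xx_m = np.mgrid[0:3, 0:4]
print(np.array_equal(xx, xx_m), np.array_equal(yy, yy_m))

# With indexing="ij", meshgrid returns its outputs in the same order as mgrid.
rows, cols = np.meshgrid(np.arange(3), np.arange(4), indexing="ij")
print(np.array_equal(rows, yy_m), np.array_equal(cols, xx_m))
```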

## `np.ogrid` and sparse grids

```
>>> import numpy as np
>>> yy, xx = np.ogrid[-5:6, -5:6]
>>> xx
array([[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]])
>>> yy
array([[-5],
[-4],
[-3],
[-2],
[-1],
[ 0],
[ 1],
[ 2],
[ 3],
[ 4],
[ 5]])
```

As already said, the output is reversed compared to `np.meshgrid`; that's why I unpacked it as `yy, xx` instead of `xx, yy`:

```
>>> xx, yy = np.meshgrid(np.arange(-5, 6), np.arange(-5, 6), sparse=True)
>>> xx
array([[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]])
>>> yy
array([[-5],
[-4],
[-3],
[-2],
[-1],
[ 0],
[ 1],
[ 2],
[ 3],
[ 4],
[ 5]])
```

This already looks like coordinates, specifically the x and y lines for 2D plots.

Visualized:

```
import numpy as np
import matplotlib.pyplot as plt

yy, xx = np.ogrid[-5:6, -5:6]
plt.figure()
plt.title("ogrid (sparse meshgrid)")
plt.grid()
plt.xticks(xx.ravel())
plt.yticks(yy.ravel())
plt.scatter(xx, np.zeros_like(xx), color="blue", marker="*")
plt.scatter(np.zeros_like(yy), yy, color="red", marker="x")
```

## `np.mgrid` and dense/fleshed out grids

```
>>> yy, xx = np.mgrid[-5:6, -5:6]
>>> xx
array([[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]])
>>> yy
array([[-5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5],
[-4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4],
[-3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3],
[-2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2],
[-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
[ 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
[ 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4],
[ 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5]])
```

The same applies here: the output is reversed compared to `np.meshgrid`:

```
>>> xx, yy = np.meshgrid(np.arange(-5, 6), np.arange(-5, 6))
>>> xx
array([[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]])
>>> yy
array([[-5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5],
[-4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4],
[-3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3],
[-2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2],
[-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
[ 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
[ 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4],
[ 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5]])
```

Unlike `ogrid`, these arrays contain **all** `xx` and `yy` coordinates in the -5 <= xx <= 5; -5 <= yy <= 5 grid.

```
yy, xx = np.mgrid[-5:6, -5:6]
plt.figure()
plt.title("mgrid (dense meshgrid)")
plt.grid()
plt.xticks(xx[0])
plt.yticks(yy[:, 0])
plt.scatter(xx, yy, color="red", marker="x")
```

This isn't limited to 2D; these functions work for arbitrary dimensions (well, there is a maximum number of arguments that can be passed to a function in Python and a maximum number of dimensions that NumPy allows):

```
>>> x1, x2, x3, x4 = np.ogrid[:3, 1:4, 2:5, 3:6]
>>> for i, x in enumerate([x1, x2, x3, x4]):
... print("x{}".format(i+1))
... print(repr(x))
x1
array([[[[0]]],
[[[1]]],
[[[2]]]])
x2
array([[[[1]],
[[2]],
[[3]]]])
x3
array([[[[2],
[3],
[4]]]])
x4
array([[[[3, 4, 5]]]])
>>> # equivalent meshgrid output, note how the first two arguments are reversed and the unpacking
>>> x2, x1, x3, x4 = np.meshgrid(np.arange(1,4), np.arange(3), np.arange(2, 5), np.arange(3, 6), sparse=True)
>>> for i, x in enumerate([x1, x2, x3, x4]):
... print("x{}".format(i+1))
... print(repr(x))
# Identical output so it's omitted here.
```

Even though these also work for 1D, there are two (much more common) 1D grid creation functions: `np.arange` and `np.linspace`.

Besides the `start` and `stop` arguments, `np.mgrid` and `np.ogrid` also support a `step` argument (even complex steps, which represent the number of steps):

```
>>> x1, x2 = np.mgrid[1:10:2, 1:10:4j]
>>> x1 # The dimension with the explicit step width of 2
array([[1., 1., 1., 1.],
[3., 3., 3., 3.],
[5., 5., 5., 5.],
[7., 7., 7., 7.],
[9., 9., 9., 9.]])
>>> x2 # The dimension with the "number of steps"
array([[ 1., 4., 7., 10.],
[ 1., 4., 7., 10.],
[ 1., 4., 7., 10.],
[ 1., 4., 7., 10.],
[ 1., 4., 7., 10.]])
```
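In one dimension, the two kinds of step line up with the familiar 1D functions: a real step behaves like `np.arange` (stop excluded) and a complex step like `np.linspace` (stop included). A quick check, using the same bounds as above:

```python
import numpy as np

# Real step: spacing of 2, stop value excluded, like np.arange.
by_step = np.mgrid[1:10:2]
print(by_step, np.arange(1, 10, 2))

# Complex step: 4 points, stop value included, like np.linspace.
by_count = np.mgrid[1:10:4j]
print(by_count, np.linspace(1, 10, 4))
```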

You specifically asked about the purpose and in fact, these grids are extremely useful if you need a coordinate system.

For example if you have a NumPy function that calculates the distance in two dimensions:

```
def distance_2d(x_point, y_point, x, y):
    return np.hypot(x-x_point, y-y_point)
```

And you want to know the distance of each point:

```
>>> ys, xs = np.ogrid[-5:5, -5:5]
>>> distances = distance_2d(1, 2, xs, ys) # distance to point (1, 2)
>>> distances
array([[9.21954446, 8.60232527, 8.06225775, 7.61577311, 7.28010989,
7.07106781, 7. , 7.07106781, 7.28010989, 7.61577311],
[8.48528137, 7.81024968, 7.21110255, 6.70820393, 6.32455532,
6.08276253, 6. , 6.08276253, 6.32455532, 6.70820393],
[7.81024968, 7.07106781, 6.40312424, 5.83095189, 5.38516481,
5.09901951, 5. , 5.09901951, 5.38516481, 5.83095189],
[7.21110255, 6.40312424, 5.65685425, 5. , 4.47213595,
4.12310563, 4. , 4.12310563, 4.47213595, 5. ],
[6.70820393, 5.83095189, 5. , 4.24264069, 3.60555128,
3.16227766, 3. , 3.16227766, 3.60555128, 4.24264069],
[6.32455532, 5.38516481, 4.47213595, 3.60555128, 2.82842712,
2.23606798, 2. , 2.23606798, 2.82842712, 3.60555128],
[6.08276253, 5.09901951, 4.12310563, 3.16227766, 2.23606798,
1.41421356, 1. , 1.41421356, 2.23606798, 3.16227766],
[6. , 5. , 4. , 3. , 2. ,
1. , 0. , 1. , 2. , 3. ],
[6.08276253, 5.09901951, 4.12310563, 3.16227766, 2.23606798,
1.41421356, 1. , 1.41421356, 2.23606798, 3.16227766],
[6.32455532, 5.38516481, 4.47213595, 3.60555128, 2.82842712,
2.23606798, 2. , 2.23606798, 2.82842712, 3.60555128]])
```

The output would be identical if one passed in a dense grid instead of an open grid. NumPy's broadcasting makes it possible!
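That claim is easy to verify. The sketch below repeats `distance_2d` so it runs on its own and compares the result from an open grid with the result from a dense grid over the same range.

```python
import numpy as np

def distance_2d(x_point, y_point, x, y):
    return np.hypot(x - x_point, y - y_point)

ys_open, xs_open = np.ogrid[-5:5, -5:5]     # shapes (10, 1) and (1, 10)
ys_dense, xs_dense = np.mgrid[-5:5, -5:5]   # both have shape (10, 10)

open_result = distance_2d(1, 2, xs_open, ys_open)
dense_result = distance_2d(1, 2, xs_dense, ys_dense)

# Broadcasting expands the open grid to the full (10, 10) result.
print(open_result.shape, dense_result.shape)
print(np.array_equal(open_result, dense_result))
```

The open grid stores only 20 coordinate values instead of 200, which starts to matter at higher resolutions or in more dimensions.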

Let's visualize the result:

```
plt.figure()
plt.title("distance to point (1, 2)")
plt.imshow(distances, origin="lower", interpolation="none")
plt.xticks(np.arange(xs.shape[1]), xs.ravel()) # need to set the ticks manually
plt.yticks(np.arange(ys.shape[0]), ys.ravel())
plt.colorbar()
```

This is also where NumPy's `mgrid` and `ogrid` become very convenient, because they allow you to easily change the resolution of your grids:

```
ys, xs = np.ogrid[-5:5:200j, -5:5:200j]
# otherwise same code as above
```

However, since `imshow` doesn't support `x` and `y` inputs, one has to change the ticks by hand. It would be really convenient if it accepted the `x` and `y` coordinates, right?

It's easy to write functions with NumPy that deal naturally with grids. Furthermore, there are several functions in NumPy, SciPy, and matplotlib that expect you to pass in the grid.

I like images, so let's explore `matplotlib.pyplot.contour`:

```
ys, xs = np.mgrid[-5:5:200j, -5:5:200j]
density = np.sin(ys)-np.cos(xs)
plt.figure()
plt.contour(xs, ys, density)
```

Note how the coordinates are already correctly set! That wouldn't be the case if you just passed in the `density`.

Or to give another fun example using astropy models (this time I don't care much about the coordinates, I just use them to create *some* grid):

```
import numpy as np
from astropy.modeling import models

z = np.zeros((100, 100))
y, x = np.mgrid[0:100, 0:100]
for _ in range(10):
    # add a randomly placed 2D Gaussian to the image
    g2d = models.Gaussian2D(amplitude=100,
                            x_mean=np.random.randint(0, 100),
                            y_mean=np.random.randint(0, 100),
                            x_stddev=3,
                            y_stddev=3)
    z += g2d(x, y)
    # add a randomly placed Airy disk to the image
    a2d = models.AiryDisk2D(amplitude=70,
                            x_0=np.random.randint(0, 100),
                            y_0=np.random.randint(0, 100),
                            radius=5)
    z += a2d(x, y)
```

Although that's just "for the looks", several functions related to functional models and fitting in SciPy and elsewhere require grids (for example `scipy.interpolate.interp2d`; `scipy.interpolate.griddata` even shows examples using `np.mgrid`). Most of these work with both open grids and dense grids; however, some only work with one of them.

Series and DataFrame define an **`.explode()`** method that explodes lists into separate rows.

Since you have a list of comma-separated strings, split the string on the comma to get a list of elements, then call `explode` on that column.

```
df = pd.DataFrame({"var1": ["a,b,c", "d,e,f"], "var2": [1, 2]})
df
var1 var2
0 a,b,c 1
1 d,e,f 2
df.assign(var1=df["var1"].str.split(",")).explode("var1")
var1 var2
0 a 1
0 b 1
0 c 1
1 d 2
1 e 2
1 f 2
```

**Note that explode only works on a single column** (for now). To explode multiple columns at once, see below.

NaNs and empty lists get the treatment they deserve without you having to jump through hoops to get it right.

```
df = pd.DataFrame({"var1": ["d,e,f", "", np.nan], "var2": [1, 2, 3]})
df
var1 var2
0 d,e,f 1
1 2
2 NaN 3
df["var1"].str.split(",")
0 [d, e, f]
1 []
2 NaN
df.assign(var1=df["var1"].str.split(",")).explode("var1")
var1 var2
0 d 1
0 e 1
0 f 1
1 2 # empty list entry becomes empty string after exploding
2 NaN 3 # NaN left un-touched
```

**This is a serious advantage over ravel/repeat-based solutions** (which ignore empty lists completely, and choke on NaNs).

Note that `explode` only works on a single column at a time, but you can use `apply` to explode multiple columns at once:

```
df = pd.DataFrame({"var1": ["a,b,c", "d,e,f"],
"var2": ["i,j,k", "l,m,n"],
"var3": [1, 2]})
df
var1 var2 var3
0 a,b,c i,j,k 1
1 d,e,f l,m,n 2
df = (df.set_index(["var3"])
        .apply(lambda col: col.str.split(",").explode())
        .reset_index()
        .reindex(df.columns, axis=1))
df
var1 var2 var3
0 a i 1
1 b j 1
2 c k 1
3 d l 2
4 e m 2
5 f n 2
```

The idea is to set as the index all the columns that should **NOT** be exploded, then explode the remaining columns via `apply`. This works well when the lists are equally sized.
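In newer pandas releases (1.3 and later, as far as I know), `explode` also accepts a list of columns, which may make the `apply` detour unnecessary. A sketch under that version assumption, using the same sample frame:

```python
import pandas as pd

df = pd.DataFrame({"var1": ["a,b,c", "d,e,f"],
                   "var2": ["i,j,k", "l,m,n"],
                   "var3": [1, 2]})

# Split both string columns into lists, then explode them together.
# Passing a list of columns assumes pandas 1.3 or newer, and the lists in
# each row must have matching lengths.
split = df.assign(var1=df["var1"].str.split(","),
                  var2=df["var2"].str.split(","))
result = split.explode(["var1", "var2"], ignore_index=True)
print(result)
```

`ignore_index=True` gives the result a fresh 0..n-1 index instead of repeating the original row labels.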

As explained here, a key difference is that:

- `flatten` is a method of an ndarray object and hence can only be called on true NumPy arrays.
- `ravel` is a library-level function and hence can be called on any object that can successfully be parsed.

For example, `ravel` will work on a list of ndarrays, while `flatten` is not available for that type of object.
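A short check of that difference, using a made-up list of two small arrays:

```python
import numpy as np

parts = [np.array([1, 2]), np.array([3, 4])]

# np.ravel converts its argument to an array first, so a plain list works.
print(np.ravel(parts))

# A plain list has no flatten method; convert it explicitly to use one.
print(hasattr(parts, "flatten"))
print(np.asarray(parts).flatten())
```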

@IanH also points out important differences with memory handling in his answer.

I realize this is old, but I figured I'd clear up a misconception for other travelers. When interactive mode is off (`plt.pyplot.isinteractive()` returns `False`), the plot will only be drawn on specific commands to draw (i.e. `plt.pyplot.show()`). When interactive mode is on (`plt.pyplot.isinteractive()` returns `True`), every `pyplot` (`plt`) command will trigger a draw command. So what you were more than likely looking for is `plt.pyplot.show()` at the end of your program to display the graph.

As a side note, you can shorten these statements a bit by using the import command `import matplotlib.pyplot as plt` rather than `import matplotlib as plt`.
