Relationship between SciPy and NumPy

log10 | log1p | log2 | StackOverflow

SciPy appears to provide most (but not all [1]) of NumPy"s functions in its own namespace. In other words, if there"s a function named numpy.foo, there"s almost certainly a scipy.foo. Most of the time, the two appear to be exactly the same, oftentimes even pointing to the same function object.

Sometimes, they"re different. To give an example that came up recently:

  • numpy.log10 is a ufunc that returns NaNs for negative arguments;
  • scipy.log10 returns complex values for negative arguments and doesn"t appear to be a ufunc.

The same can be said about log, log2 and logn, but not about log1p [2].

On the other hand, numpy.exp and scipy.exp appear to be different names for the same ufunc. This is also true of scipy.log1p and numpy.log1p.

Another example is numpy.linalg.solve vs scipy.linalg.solve. They"re similar, but the latter offers some additional features over the former.

Why the apparent duplication? If this is meant to be a wholesale import of numpy into the scipy namespace, why the subtle differences in behaviour and the missing functions? Is there some overarching logic that would help clear up the confusion?

[1] numpy.min, numpy.max, numpy.abs and a few others have no counterparts in the scipy namespace.

[2] Tested using NumPy 1.5.1 and SciPy 0.9.0rc2.

Answer rating: 145

Last time I checked it, the scipy __init__ method executes a

from numpy import *

so that the whole numpy namespace is included into scipy when the scipy module is imported.

The log10 behavior you are describing is interesting, because both versions are coming from numpy. One is a ufunc, the other is a numpy.lib function. Why scipy is preferring the library function over the ufunc, I don"t know off the top of my head.


EDIT: In fact, I can answer the log10 question. Looking in the scipy __init__ method I see this:

# Import numpy symbols to scipy name space
import numpy as _num
from numpy import oldnumeric
from numpy import *
from numpy.random import rand, randn
from numpy.fft import fft, ifft
from numpy.lib.scimath import *

The log10 function you get in scipy comes from numpy.lib.scimath. Looking at that code, it says:

"""
Wrapper functions to more user-friendly calling of certain math functions
whose output data-type is different than the input data-type in certain
domains of the input.

For example, for functions like log() with branch cuts, the versions in this
module provide the mathematically valid answers in the complex plane:

>>> import math
>>> from numpy.lib import scimath
>>> scimath.log(-math.exp(1)) == (1+1j*math.pi)
True

Similarly, sqrt(), other base logarithms, power() and trig functions are
correctly handled.  See their respective docstrings for specific examples.
"""

It seems that module overlays the base numpy ufuncs for sqrt, log, log2, logn, log10, power, arccos, arcsin, and arctanh. That explains the behavior you are seeing. The underlying design reason why it is done like that is probably buried in a mailing list post somewhere.





Relationship between SciPy and NumPy: StackOverflow Questions

Answer #1

For those trying to make the connection between SNR and a normal random variable generated by numpy:

[1] SNR ratio, where it"s important to keep in mind that P is average power.

Or in dB:
[2] SNR dB2

In this case, we already have a signal and we want to generate noise to give us a desired SNR.

While noise can come in different flavors depending on what you are modeling, a good start (especially for this radio telescope example) is Additive White Gaussian Noise (AWGN). As stated in the previous answers, to model AWGN you need to add a zero-mean gaussian random variable to your original signal. The variance of that random variable will affect the average noise power.

For a Gaussian random variable X, the average power Ep, also known as the second moment, is
[3] Ex

So for white noise, Ex and the average power is then equal to the variance Ex.

When modeling this in python, you can either
1. Calculate variance based on a desired SNR and a set of existing measurements, which would work if you expect your measurements to have fairly consistent amplitude values.
2. Alternatively, you could set noise power to a known level to match something like receiver noise. Receiver noise could be measured by pointing the telescope into free space and calculating average power.

Either way, it"s important to make sure that you add noise to your signal and take averages in the linear space and not in dB units.

Here"s some code to generate a signal and plot voltage, power in Watts, and power in dB:

# Signal Generation
# matplotlib inline

import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(1, 100, 1000)
x_volts = 10*np.sin(t/(2*np.pi))
plt.subplot(3,1,1)
plt.plot(t, x_volts)
plt.title("Signal")
plt.ylabel("Voltage (V)")
plt.xlabel("Time (s)")
plt.show()

x_watts = x_volts ** 2
plt.subplot(3,1,2)
plt.plot(t, x_watts)
plt.title("Signal Power")
plt.ylabel("Power (W)")
plt.xlabel("Time (s)")
plt.show()

x_db = 10 * np.log10(x_watts)
plt.subplot(3,1,3)
plt.plot(t, x_db)
plt.title("Signal Power in dB")
plt.ylabel("Power (dB)")
plt.xlabel("Time (s)")
plt.show()

Generated Signal

Here"s an example for adding AWGN based on a desired SNR:

# Adding noise using target SNR

# Set a target SNR
target_snr_db = 20
# Calculate signal power and convert to dB 
sig_avg_watts = np.mean(x_watts)
sig_avg_db = 10 * np.log10(sig_avg_watts)
# Calculate noise according to [2] then convert to watts
noise_avg_db = sig_avg_db - target_snr_db
noise_avg_watts = 10 ** (noise_avg_db / 10)
# Generate an sample of white noise
mean_noise = 0
noise_volts = np.random.normal(mean_noise, np.sqrt(noise_avg_watts), len(x_watts))
# Noise up the original signal
y_volts = x_volts + noise_volts

# Plot signal with noise
plt.subplot(2,1,1)
plt.plot(t, y_volts)
plt.title("Signal with noise")
plt.ylabel("Voltage (V)")
plt.xlabel("Time (s)")
plt.show()
# Plot in dB
y_watts = y_volts ** 2
y_db = 10 * np.log10(y_watts)
plt.subplot(2,1,2)
plt.plot(t, 10* np.log10(y_volts**2))
plt.title("Signal with noise (dB)")
plt.ylabel("Power (dB)")
plt.xlabel("Time (s)")
plt.show()

Signal with target SNR

And here"s an example for adding AWGN based on a known noise power:

# Adding noise using a target noise power

# Set a target channel noise power to something very noisy
target_noise_db = 10

# Convert to linear Watt units
target_noise_watts = 10 ** (target_noise_db / 10)

# Generate noise samples
mean_noise = 0
noise_volts = np.random.normal(mean_noise, np.sqrt(target_noise_watts), len(x_watts))

# Noise up the original signal (again) and plot
y_volts = x_volts + noise_volts

# Plot signal with noise
plt.subplot(2,1,1)
plt.plot(t, y_volts)
plt.title("Signal with noise")
plt.ylabel("Voltage (V)")
plt.xlabel("Time (s)")
plt.show()
# Plot in dB
y_watts = y_volts ** 2
y_db = 10 * np.log10(y_watts)
plt.subplot(2,1,2)
plt.plot(t, 10* np.log10(y_volts**2))
plt.title("Signal with noise")
plt.ylabel("Power (dB)")
plt.xlabel("Time (s)")
plt.show()

Signal with target noise level

Answer #2

Without conversion to string

import math
digits = int(math.log10(n))+1

To also handle zero and negative numbers

import math
if n > 0:
    digits = int(math.log10(n))+1
elif n == 0:
    digits = 1
else:
    digits = int(math.log10(-n))+2 # +1 if you don"t count the "-" 

You"d probably want to put that in a function :)

Here are some benchmarks. The len(str()) is already behind for even quite small numbers

timeit math.log10(2**8)
1000000 loops, best of 3: 746 ns per loop
timeit len(str(2**8))
1000000 loops, best of 3: 1.1 µs per loop

timeit math.log10(2**100)
1000000 loops, best of 3: 775 ns per loop
 timeit len(str(2**100))
100000 loops, best of 3: 3.2 µs per loop

timeit math.log10(2**10000)
1000000 loops, best of 3: 844 ns per loop
timeit len(str(2**10000))
100 loops, best of 3: 10.3 ms per loop

Answer #3

np.log is ln, whereas np.log10 is your standard base 10 log.

Relevant documentation:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.log.html

http://docs.scipy.org/doc/numpy/reference/generated/numpy.log10.html

Answer #4

You can use negative numbers to round integers:

>>> round(1234, -3)
1000.0

Thus if you need only most significant digit:

>>> from math import log10, floor
>>> def round_to_1(x):
...   return round(x, -int(floor(log10(abs(x)))))
... 
>>> round_to_1(0.0232)
0.02
>>> round_to_1(1234243)
1000000.0
>>> round_to_1(13)
10.0
>>> round_to_1(4)
4.0
>>> round_to_1(19)
20.0

You"ll probably have to take care of turning float to integer if it"s bigger than 1.

Answer #5

Last time I checked it, the scipy __init__ method executes a

from numpy import *

so that the whole numpy namespace is included into scipy when the scipy module is imported.

The log10 behavior you are describing is interesting, because both versions are coming from numpy. One is a ufunc, the other is a numpy.lib function. Why scipy is preferring the library function over the ufunc, I don"t know off the top of my head.


EDIT: In fact, I can answer the log10 question. Looking in the scipy __init__ method I see this:

# Import numpy symbols to scipy name space
import numpy as _num
from numpy import oldnumeric
from numpy import *
from numpy.random import rand, randn
from numpy.fft import fft, ifft
from numpy.lib.scimath import *

The log10 function you get in scipy comes from numpy.lib.scimath. Looking at that code, it says:

"""
Wrapper functions to more user-friendly calling of certain math functions
whose output data-type is different than the input data-type in certain
domains of the input.

For example, for functions like log() with branch cuts, the versions in this
module provide the mathematically valid answers in the complex plane:

>>> import math
>>> from numpy.lib import scimath
>>> scimath.log(-math.exp(1)) == (1+1j*math.pi)
True

Similarly, sqrt(), other base logarithms, power() and trig functions are
correctly handled.  See their respective docstrings for specific examples.
"""

It seems that module overlays the base numpy ufuncs for sqrt, log, log2, logn, log10, power, arccos, arcsin, and arctanh. That explains the behavior you are seeing. The underlying design reason why it is done like that is probably buried in a mailing list post somewhere.

Relationship between SciPy and NumPy: StackOverflow Questions

Relationship between SciPy and NumPy: StackOverflow Questions

Answer #1

float ‚Üí float math.log2(x)

import math

log2 = math.log(x, 2.0)
log2 = math.log2(x)   # python 3.3 or later

float ‚Üí int math.frexp(x)

If all you need is the integer part of log base 2 of a floating point number, extracting the exponent is pretty efficient:

log2int_slow = int(math.floor(math.log(x, 2.0)))    # these give the
log2int_fast = math.frexp(x)[1] - 1                 # same result
  • Python frexp() calls the C function frexp() which just grabs and tweaks the exponent.

  • Python frexp() returns a tuple (mantissa, exponent). So [1] gets the exponent part.

  • For integral powers of 2 the exponent is one more than you might expect. For example 32 is stored as 0.5x2‚Å∂. This explains the - 1 above. Also works for 1/32 which is stored as 0.5x2‚Ū‚Å¥.

  • Floors toward negative infinity, so log‚ÇÇ31 computed this way is 4 not 5. log‚ÇÇ(1/17) is -5 not -4.


int ‚Üí int x.bit_length()

If both input and output are integers, this native integer method could be very efficient:

log2int_faster = x.bit_length() - 1
  • - 1 because 2‚Åø requires n+1 bits. Works for very large integers, e.g. 2**10000.

  • Floors toward negative infinity, so log‚ÇÇ31 computed this way is 4 not 5.

Answer #2

Python 3

In Python 3, this question doesn"t apply. The plain int type is unbounded.

However, you might actually be looking for information about the current interpreter"s word size, which will be the same as the machine"s word size in most cases. That information is still available in Python 3 as sys.maxsize, which is the maximum value representable by a signed word. Equivalently, it"s the size of the largest possible list or in-memory sequence.

Generally, the maximum value representable by an unsigned word will be sys.maxsize * 2 + 1, and the number of bits in a word will be math.log2(sys.maxsize * 2 + 2). See this answer for more information.

Python 2

In Python 2, the maximum value for plain int values is available as sys.maxint:

>>> sys.maxint
9223372036854775807

You can calculate the minimum value with -sys.maxint - 1 as shown here.

Python seamlessly switches from plain to long integers once you exceed this value. So most of the time, you won"t need to know it.

Answer #3

Last time I checked it, the scipy __init__ method executes a

from numpy import *

so that the whole numpy namespace is included into scipy when the scipy module is imported.

The log10 behavior you are describing is interesting, because both versions are coming from numpy. One is a ufunc, the other is a numpy.lib function. Why scipy is preferring the library function over the ufunc, I don"t know off the top of my head.


EDIT: In fact, I can answer the log10 question. Looking in the scipy __init__ method I see this:

# Import numpy symbols to scipy name space
import numpy as _num
from numpy import oldnumeric
from numpy import *
from numpy.random import rand, randn
from numpy.fft import fft, ifft
from numpy.lib.scimath import *

The log10 function you get in scipy comes from numpy.lib.scimath. Looking at that code, it says:

"""
Wrapper functions to more user-friendly calling of certain math functions
whose output data-type is different than the input data-type in certain
domains of the input.

For example, for functions like log() with branch cuts, the versions in this
module provide the mathematically valid answers in the complex plane:

>>> import math
>>> from numpy.lib import scimath
>>> scimath.log(-math.exp(1)) == (1+1j*math.pi)
True

Similarly, sqrt(), other base logarithms, power() and trig functions are
correctly handled.  See their respective docstrings for specific examples.
"""

It seems that module overlays the base numpy ufuncs for sqrt, log, log2, logn, log10, power, arccos, arcsin, and arctanh. That explains the behavior you are seeing. The underlying design reason why it is done like that is probably buried in a mailing list post somewhere.

Tutorials