Python | os.urandom() method


The os.urandom() method generates a bytes object of the requested size, containing random bytes suitable for cryptographic use.

Syntax: os.urandom(size)

Parameter:
size: The number of random bytes to generate.

Return Value: This method returns a bytes object of random bytes suitable for cryptographic use.

Example #1:

# Python program to explain the os.urandom() method

# import the os module
import os

# declare the size
size = 5

# use the os.urandom() method
result = os.urandom(size)

# print the random bytes
# (output will be different every time)
print(result)

Output:

b'\xe2\xaf\xbc:\xdd'
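Since the result is a bytes object, it is often rendered as hex when a printable token is needed. A minimal sketch (the variable name is illustrative):

import os

# 16 random bytes, rendered as 32 hexadecimal characters
token = os.urandom(16).hex()

# output will be different every time
print(token)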




Python | os.urandom() method: StackOverflow Questions

Answer #1

Talking about async/await and asyncio is not the same thing. The first is a fundamental, low-level construct (coroutines) while the latter is a library using these constructs. Consequently, there is no single ultimate answer.

The following is a general description of how async/await and asyncio-like libraries work. That is, there may be other tricks on top (there are...) but they are inconsequential unless you build them yourself. The difference should be negligible unless you already know enough to not have to ask such a question.

1. Coroutines versus subroutines in a nutshell

Just like subroutines (functions, procedures, ...), coroutines (generators, ...) are an abstraction of call stack and instruction pointer: there is a stack of executing code pieces, and each is at a specific instruction.

The distinction of def versus async def is merely for clarity. The actual difference is return versus yield. From this, await or yield from take the difference from individual calls to entire stacks.

1.1. Subroutines

A subroutine represents a new stack level to hold local variables, and a single traversal of its instructions to reach an end. Consider a subroutine like this:

def subfoo(bar):
    qux = 3
    return qux * bar

When you run it, that means

  1. allocate stack space for bar and qux
  2. recursively execute the first statement and jump to the next statement
  3. once at a return, push its value to the calling stack
  4. clear the stack (1.) and instruction pointer (2.)

Notably, 4. means that a subroutine always starts at the same state. Everything exclusive to the function itself is lost upon completion. A function cannot be resumed, even if there are instructions after return.

root -
  :    - subfoo --
  :/--<---return --/
  |
  V

1.2. Coroutines as persistent subroutines

A coroutine is like a subroutine, but can exit without destroying its state. Consider a coroutine like this:

def cofoo(bar):
    qux = yield bar  # yield marks a break point
    return qux

When you run it, that means

  1. allocate stack space for bar and qux
  2. recursively execute the first statement and jump to the next statement
    1. once at a yield, push its value to the calling stack but store the stack and instruction pointer
    2. once calling into yield, restore stack and instruction pointer and push arguments to qux
  3. once at a return, push its value to the calling stack
  4. clear the stack (1.) and instruction pointer (2.)

Note the addition of 2.1 and 2.2 - a coroutine can be suspended and resumed at predefined points. This is similar to how a subroutine is suspended during calling another subroutine. The difference is that the active coroutine is not strictly bound to its calling stack. Instead, a suspended coroutine is part of a separate, isolated stack.

root -
  :    - cofoo --
  :/--<+--yield --/
  |    :
  V    :

This means that suspended coroutines can be freely stored or moved between stacks. Any call stack that has access to a coroutine can decide to resume it.
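To make this concrete, here is a minimal sketch (not part of the original walkthrough) that drives cofoo by hand using the plain generator protocol:

coro = cofoo(2)
print(next(coro))        # runs to the yield; prints 2, the value of bar
try:
    coro.send(5)         # resumes at the yield, binding 5 to qux
except StopIteration as stop:
    print(stop.value)    # 5 - the return value travels inside StopIteration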

1.3. Traversing the call stack

So far, our coroutine only goes down the call stack with yield. A subroutine can go down and up the call stack with return and (). For completeness, coroutines also need a mechanism to go up the call stack. Consider a coroutine like this:

def wrap():
    yield "before"
    yield from cofoo(1)
    yield "after"

When you run it, that means it still allocates the stack and instruction pointer like a subroutine. When it suspends, that still is like storing a subroutine.

However, yield from does both. It suspends the stack and instruction pointer of wrap and runs cofoo. Note that wrap stays suspended until cofoo finishes completely. Whenever cofoo suspends or is sent something, it is directly connected to the calling stack.
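A minimal sketch of driving wrap by hand shows that channel (again just the generator protocol; not part of the original answer):

coro = wrap()
print(next(coro))        # "before", yielded by wrap itself
print(next(coro))        # 1, yielded by cofoo through the yield from channel
print(coro.send(3))      # resumes cofoo until it returns; wrap then yields "after"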

1.4. Coroutines all the way down

As established, yield from allows connecting two scopes across another intermediate one. When applied recursively, that means the top of the stack can be connected to the bottom of the stack.

root -
  :    -> coro_a -yield-from-> coro_b --
  :/ <-+------------------------yield ---/
  |    :
  : --+-- coro_a.send----------yield ---
  :                             coro_b <-/

Note that root and coro_b do not know about each other. This makes coroutines much cleaner than callbacks: coroutines are still built on a 1:1 relation like subroutines. Coroutines suspend and resume their entire existing execution stack up until a regular call point.

Notably, root could have an arbitrary number of coroutines to resume. Yet, it can never resume more than one at the same time. Coroutines of the same root are concurrent but not parallel!

1.5. Python"s async and await

The explanation has so far explicitly used the yield and yield from vocabulary of generators - the underlying functionality is the same. The new Python 3.5 syntax async and await exists mainly for clarity.

def foo():  # subroutine?
    return None

def foo():  # coroutine?
    yield from foofoo()  # generator? coroutine?

async def foo():  # coroutine!
    await foofoo()  # coroutine!
    return None

The async for and async with statements are needed because you would break the yield from/await chain with the bare for and with statements.

2. Anatomy of a simple event loop

By itself, a coroutine has no concept of yielding control to another coroutine. It can only yield control to the caller at the bottom of a coroutine stack. This caller can then switch to another coroutine and run it.

This root node of several coroutines is commonly an event loop: on suspension, a coroutine yields an event on which it wants to resume. In turn, the event loop is capable of efficiently waiting for these events to occur. This allows it to decide which coroutine to run next, or how long to wait before resuming.

Such a design implies that there is a set of pre-defined events that the loop understands. Several coroutines await each other, until finally an event is awaited. This event can communicate directly with the event loop by yielding control.

loop -
  :    -> coroutine --await--> event --
  :/ <-+----------------------- yield --/
  |    :
  |    :  # loop waits for event to happen
  |    :
  : --+-- send(reply) -------- yield --
  :        coroutine <--yield-- event <-/

The key is that coroutine suspension allows the event loop and events to directly communicate. The intermediate coroutine stack does not require any knowledge about which loop is running it, nor how events work.

2.1.1. Events in time

The simplest event to handle is reaching a point in time. This is a fundamental block of threaded code as well: a thread repeatedly sleeps until a condition is true. However, a regular sleep blocks execution by itself - we want other coroutines to not be blocked. Instead, we want to tell the event loop when it should resume the current coroutine stack.

2.1.2. Defining an Event

An event is simply a value we can identify - be it via an enum, a type or other identity. We can define this with a simple class that stores our target time. In addition to storing the event information, we can allow instances of the class to be awaited directly.

class AsyncSleep:
    """Event to sleep until a point in time"""
    def __init__(self, until: float):
        self.until = until

    # used whenever someone ``await``s an instance of this Event
    def __await__(self):
        # yield this Event to the loop
        yield self
    
    def __repr__(self):
        return "%s(until=%.1f)" % (self.__class__.__name__, self.until)

This class only stores the event - it does not say how to actually handle it.

The only special feature is __await__ - it is what the await keyword looks for. Practically, it is an iterator but not available for the regular iteration machinery.
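As a small sketch (not part of the original walkthrough), the generator returned by __await__ can be driven by hand like any other iterator:

import time

event = AsyncSleep(time.time() + 1)
stream = event.__await__()
print(next(stream))      # the event itself, surfaced to whoever drives the coroutine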

2.2.1. Awaiting an event

Now that we have an event, how do coroutines react to it? We should be able to express the equivalent of sleep by awaiting our event. To better see what is going on, we wait twice for half the time:

import time

async def asleep(duration: float):
    """await that ``duration`` seconds pass"""
    await AsyncSleep(time.time() + duration / 2)
    await AsyncSleep(time.time() + duration / 2)

We can directly instantiate and run this coroutine. Similar to a generator, using coroutine.send runs the coroutine until it yields a result.

coroutine = asleep(100)
while True:
    print(coroutine.send(None))
    time.sleep(0.1)

This gives us two AsyncSleep events and then a StopIteration when the coroutine is done. Notice that the only delay is from time.sleep in the loop! Each AsyncSleep only stores an offset from the current time.

2.2.2. Event + Sleep

At this point, we have two separate mechanisms at our disposal:

  • AsyncSleep Events that can be yielded from inside a coroutine
  • time.sleep that can wait without impacting coroutines

Notably, these two are orthogonal: neither one affects or triggers the other. As a result, we can come up with our own strategy to sleep to meet the delay of an AsyncSleep.

2.3. A naive event loop

If we have several coroutines, each can tell us when it wants to be woken up. We can then wait until the first of them wants to be resumed, then for the one after, and so on. Notably, at each point we only care about which one is next.

This makes for a straightforward scheduling:

  1. sort coroutines by their desired wake up time
  2. pick the first that wants to wake up
  3. wait until this point in time
  4. run this coroutine
  5. repeat from 1.

A trivial implementation does not need any advanced concepts. A list allows sorting coroutines by wake-up time. Waiting is a regular time.sleep. Running coroutines works just like before with coroutine.send.

def run(*coroutines):
    """Cooperatively run all ``coroutines`` until completion"""
    # store wake-up-time and coroutines
    waiting = [(0, coroutine) for coroutine in coroutines]
    while waiting:
        # 2. pick the first coroutine that wants to wake up
        until, coroutine = waiting.pop(0)
        # 3. wait until this point in time
        time.sleep(max(0.0, until - time.time()))
        # 4. run this coroutine
        try:
            command = coroutine.send(None)
        except StopIteration:
            continue
        # 1. sort coroutines by their desired suspension
        if isinstance(command, AsyncSleep):
            waiting.append((command.until, coroutine))
            waiting.sort(key=lambda item: item[0])

Of course, this has ample room for improvement. We can use a heap for the wait queue or a dispatch table for events. We could also fetch return values from the StopIteration and assign them to the coroutine. However, the fundamental principle remains the same.
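For instance, a heap keeps the earliest wake-up time at the front without re-sorting. A sketch of that variant, assuming the AsyncSleep event from above (the counter only breaks ties so coroutines are never compared):

import heapq
import itertools
import time

def run_heapq(*coroutines):
    """Variant of ``run`` with a heap-based wait queue"""
    tie = itertools.count()
    waiting = [(0, next(tie), coroutine) for coroutine in coroutines]
    heapq.heapify(waiting)
    while waiting:
        # pop the coroutine with the earliest wake-up time
        until, _, coroutine = heapq.heappop(waiting)
        time.sleep(max(0.0, until - time.time()))
        try:
            command = coroutine.send(None)
        except StopIteration:
            continue
        if isinstance(command, AsyncSleep):
            heapq.heappush(waiting, (command.until, next(tie), coroutine))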

2.4. Cooperative Waiting

The AsyncSleep event and run event loop are a fully working implementation of timed events.

async def sleepy(identifier: str = "coroutine", count=5):
    for i in range(count):
        print(identifier, "step", i + 1, "at %.2f" % time.time())
        await asleep(0.1)

run(*(sleepy("coroutine %d" % j) for j in range(5)))

This cooperatively switches between each of the five coroutines, suspending each for 0.1 seconds. Even though the event loop is synchronous, it still executes the work in 0.5 seconds instead of 2.5 seconds. Each coroutine holds state and acts independently.

3. I/O event loop

An event loop that supports sleep is suitable for polling. However, waiting for I/O on a file handle can be done more efficiently: the operating system implements I/O and thus knows which handles are ready. Ideally, an event loop should support an explicit "ready for I/O" event.

3.1. The select call

Python already has an interface to query the OS for read I/O handles. When called with handles to read or write, it returns the handles ready to read or write:

readable, writeable, _ = select.select(rlist, wlist, xlist, timeout)

For example, we can open a file for writing and wait for it to be ready:

write_target = open("/tmp/foo")
readable, writeable, _ = select.select([], [write_target], [])

Once select returns, writeable contains our open file.

3.2. Basic I/O event

Similar to the AsyncSleep request, we need to define an event for I/O. With the underlying select logic, the event must refer to a readable object - say an open file. In addition, we store how much data to read.

class AsyncRead:
    def __init__(self, file, amount=1):
        self.file = file
        self.amount = amount
        self._buffer = ""

    def __await__(self):
        while len(self._buffer) < self.amount:
            yield self
            # we only get here if ``read`` should not block
            self._buffer += self.file.read(1)
        return self._buffer

    def __repr__(self):
        return "%s(file=%s, amount=%d, progress=%d)" % (
            self.__class__.__name__, self.file, self.amount, len(self._buffer)
        )

As with AsyncSleep we mostly just store the data required for the underlying system call. This time, __await__ is capable of being resumed multiple times - until our desired amount has been read. In addition, we return the I/O result instead of just resuming.

3.3. Augmenting an event loop with read I/O

The basis for our event loop is still the run defined previously. First, we need to track the read requests. This is no longer a sorted schedule; we only map read requests to coroutines.

# new
waiting_read = {}  # type: Dict[file, coroutine]

Since select.select takes a timeout parameter, we can use it in place of time.sleep.

# old
time.sleep(max(0.0, until - time.time()))
# new
readable, _, _ = select.select(list(waiting_read), [], [])

This gives us all readable files - if there are any, we run the corresponding coroutine. If there are none, we have waited long enough for our current coroutine to run.

# new - reschedule waiting coroutine, run readable coroutine
if readable:
    waiting.append((until, coroutine))
    waiting.sort()
    coroutine = waiting_read[readable[0]]

Finally, we have to actually listen for read requests.

# new
if isinstance(command, AsyncSleep):
    ...
elif isinstance(command, AsyncRead):
    ...

3.4. Putting it together

The above was a bit of a simplification. We need to do some switching to not starve sleeping coroutines if we can always read. We need to handle having nothing to read or nothing to wait for. However, the end result still fits into 30 LOC.

def run(*coroutines):
    """Cooperatively run all ``coroutines`` until completion"""
    waiting_read = {}  # type: Dict[file, coroutine]
    waiting = [(0, coroutine) for coroutine in coroutines]
    while waiting or waiting_read:
        # 2. wait until the next coroutine may run or read ...
        try:
            until, coroutine = waiting.pop(0)
        except IndexError:
            until, coroutine = float("inf"), None
            readable, _, _ = select.select(list(waiting_read), [], [])
        else:
            readable, _, _ = select.select(list(waiting_read), [], [], max(0.0, until - time.time()))
        # ... and select the appropriate one
        if readable and time.time() < until:
            if until and coroutine:
                waiting.append((until, coroutine))
                waiting.sort()
            coroutine = waiting_read.pop(readable[0])
        # 3. run this coroutine
        try:
            command = coroutine.send(None)
        except StopIteration:
            continue
        # 1. sort coroutines by their desired suspension ...
        if isinstance(command, AsyncSleep):
            waiting.append((command.until, coroutine))
            waiting.sort(key=lambda item: item[0])
        # ... or register reads
        elif isinstance(command, AsyncRead):
            waiting_read[command.file] = coroutine

3.5. Cooperative I/O

The AsyncSleep, AsyncRead and run implementations are now fully functional to sleep and/or read. Same as for sleepy, we can define a helper to test reading:

async def ready(path, amount=1024*32):
    print("read", path, "at", "%d" % time.time())
    with open(path, "rb") as file:
        result = await AsyncRead(file, amount)
    print("done", path, "at", "%d" % time.time())
    print("got", len(result), "B")

run(sleepy("background", 5), ready("/dev/urandom"))

Running this, we can see that our I/O is interleaved with the waiting task:

id background round 1
read /dev/urandom at 1530721148
id background round 2
id background round 3
id background round 4
id background round 5
done /dev/urandom at 1530721148
got 1024 B

4. Non-Blocking I/O

While I/O on files gets the concept across, it is not really suitable for a library like asyncio: the select call always returns for files, and both open and read may block indefinitely. This blocks all coroutines of an event loop - which is bad. Libraries like aiofiles use threads and synchronization to fake non-blocking I/O and events on file.

However, sockets do allow for non-blocking I/O - and their inherent latency makes it much more critical. When used in an event loop, waiting for data and retrying can be wrapped without blocking anything.

4.1. Non-Blocking I/O event

Similar to our AsyncRead, we can define a suspend-and-read event for sockets. Instead of taking a file, we take a socket - which must be non-blocking. Also, our __await__ uses socket.recv instead of file.read.

class AsyncRecv:
    def __init__(self, connection, amount=1, read_buffer=1024):
        assert not connection.getblocking(), "connection must be non-blocking for async recv"
        self.connection = connection
        self.amount = amount
        self.read_buffer = read_buffer
        self._buffer = b""

    def __await__(self):
        while len(self._buffer) < self.amount:
            try:
                self._buffer += self.connection.recv(self.read_buffer)
            except BlockingIOError:
                yield self
        return self._buffer

    def __repr__(self):
        return "%s(file=%s, amount=%d, progress=%d)" % (
            self.__class__.__name__, self.connection, self.amount, len(self._buffer)
        )

In contrast to AsyncRead, __await__ performs truly non-blocking I/O. When data is available, it always reads. When no data is available, it always suspends. That means the event loop is only blocked while we perform useful work.

4.2. Un-Blocking the event loop

As far as the event loop is concerned, nothing changes much. The event to listen for is still the same as for files - a file descriptor marked ready by select.

# old
elif isinstance(command, AsyncRead):
    waiting_read[command.file] = coroutine
# new
elif isinstance(command, AsyncRead):
    waiting_read[command.file] = coroutine
elif isinstance(command, AsyncRecv):
    waiting_read[command.connection] = coroutine

At this point, it should be obvious that AsyncRead and AsyncRecv are the same kind of event. We could easily refactor them to be one event with an exchangeable I/O component. In effect, the event loop, coroutines and events cleanly separate a scheduler, arbitrary intermediate code and the actual I/O.
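A sketch of that refactoring (the class name and the reader parameter are hypothetical, not from the answer):

class AsyncReadable:
    """Hypothetical unified event: wait until ``handle`` is readable, then read"""
    def __init__(self, handle, reader, amount=1):
        self.handle = handle    # what the event loop hands to select
        self.reader = reader    # the I/O flavour, e.g. lambda: file.read(1)
        self.amount = amount
        self._buffer = b""

    def __await__(self):
        while len(self._buffer) < self.amount:
            yield self          # suspend until the loop sees the handle as readable
            self._buffer += self.reader()
        return self._buffer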

4.3. The ugly side of non-blocking I/O

In principle, what you should do at this point is replicate the logic of read as a recv for AsyncRecv. However, this is much more ugly now - you have to handle early returns when functions block inside the kernel, but yield control to you. For example, opening a connection versus opening a file is much longer:

# file
file = open(path, "rb")
# non-blocking socket
connection = socket.socket()
connection.setblocking(False)
# open without blocking - retry on failure
try:
    connection.connect((url, port))
except BlockingIOError:
    pass

Long story short, what remains is a few dozen lines of Exception handling. The events and event loop already work at this point.

id background round 1
read localhost:25000 at 1530783569
read /dev/urandom at 1530783569
done localhost:25000 at 1530783569 got 32768 B
id background round 2
id background round 3
id background round 4
done /dev/urandom at 1530783569 got 4096 B
id background round 5

Addendum

Example code at github

Answer #2

The answer below pertains primarily to Signed Cookies, an implementation of the concept of sessions (as used in web applications). Flask offers both normal (unsigned) cookies (via request.cookies and response.set_cookie()) and signed cookies (via flask.session). The answer has two parts: the first describes how a Signed Cookie is generated, and the second is presented in the form of a QA that addresses different aspects of the scheme. The syntax used for the examples is Python 3, but the concepts apply also to previous versions.

What is SECRET_KEY (or how to create a Signed Cookie)?

Signing cookies is a preventive measure against cookie tampering. During the process of signing a cookie, the SECRET_KEY is used in a way similar to how a "salt" would be used to muddle a password before hashing it. Here's a (wildly) simplified description of the concept. The code in the examples is meant to be illustrative. Many of the steps have been omitted and not all of the functions actually exist. The goal here is to provide an understanding of the general idea; actual implementations will be a bit more involved. Also, keep in mind that Flask does most of this for you in the background. So, besides setting values to your cookie (via the session API) and providing a SECRET_KEY, it's not only ill-advised to reimplement this yourself, but there's no need to do so:

A poor man"s cookie signature

Before sending a Response to the browser:

( 1 ) First a SECRET_KEY is established. It should only be known to the application and should be kept relatively constant during the application's life cycle, including through application restarts.

# choose a salt, a secret string of bytes
>>> SECRET_KEY = "my super secret key".encode("utf8")

( 2 ) create a cookie

>>> cookie = make_cookie(
...     name="_profile", 
...     content="uid=382|membership=regular",
...     ...
...     expires="July 1 2030..."
... )

>>> print(cookie)
name: _profile
content: uid=382|membership=regular...
    ...
    ...
expires: July 1 2030, 1:20:40 AM UTC

( 3 ) to create a signature, append (or prepend) the SECRET_KEY to the cookie byte string, then generate a hash from that combination.

# encode and salt the cookie, then hash the result
>>> cookie_bytes = str(cookie).encode("utf8")
>>> signature = sha1(cookie_bytes+SECRET_KEY).hexdigest()
>>> print(signature)
7ae0e9e033b5fa53aa....

( 4 ) Now affix the signature at one end of the content field of the original cookie.

# include signature as part of the cookie
>>> cookie.content = cookie.content + "|" + signature
>>> print(cookie)
name: _profile
content: uid=382|membership=regular|7ae0e9...  <--- signature
domain: .example.com
path: /
send for: Encrypted connections only
expires: July 1 2030, 1:20:40 AM UTC

and that"s what"s sent to the client.

# add cookie to response
>>> response.set_cookie(cookie)
# send to browser --> 

Upon receiving the cookie from the browser:

( 5 ) When the browser returns this cookie back to the server, strip the signature from the cookie's content field to get back the original cookie.

# Upon receiving the cookie from browser
>>> cookie = request.get_cookie()
# pop the signature out of the cookie
>>> (cookie.content, popped_signature) = cookie.content.rsplit("|", 1)

( 6 ) Use the original cookie with the application's SECRET_KEY to recalculate the signature using the same method as in step 3.

# recalculate signature using SECRET_KEY and original cookie
>>> cookie_bytes = str(cookie).encode("utf8")
>>> calculated_signature = sha1(cookie_bytes+SECRET_KEY).hexdigest()

( 7 ) Compare the calculated result with the signature previously popped out of the just received cookie. If they match, we know that the cookie has not been messed with. But if even just a space has been added to the cookie, the signatures won't match.

# if both signatures match, your cookie has not been modified
>>> good_cookie = popped_signature==calculated_signature

( 8 ) If they don"t match then you may respond with any number of actions, log the event, discard the cookie, issue a fresh one, redirect to a login page, etc.

>>> if not good_cookie:
...     security_log(cookie)

Hash-based Message Authentication Code (HMAC)

The type of signature generated above that requires a secret key to ensure the integrity of some contents is called in cryptography a Message Authentication Code or MAC.

I specified earlier that the example above is an oversimplification of that concept and that it wasn't a good idea to implement your own signing. That's because the algorithm used to sign cookies in Flask is called HMAC and is a bit more involved than the simple step-by-step above. The general idea is the same, but due to reasons beyond the scope of this discussion, the series of computations is a tad more complex. If you're still interested in crafting a DIY, as is usually the case, Python has some modules to help you get started :) here's a starting block:

import hmac
import hashlib

def create_signature(secret_key, msg, digestmod=None):
    if digestmod is None:
        digestmod = hashlib.sha1
    mac = hmac.new(secret_key, msg=msg, digestmod=digestmod)
    return mac.digest()

The documentation for hmac and hashlib.
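When verifying a received signature, prefer hmac.compare_digest over ==, since it compares in constant time; a sketch building on create_signature above (verify_signature is a hypothetical helper name):

def verify_signature(secret_key, msg, signature, digestmod=None):
    expected = create_signature(secret_key, msg, digestmod)
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(expected, signature)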


The "Demystification" of SECRET_KEY :)

What"s a "signature" in this context?

It"s a method to ensure that some content has not been modified by anyone other than a person or an entity authorized to do so.

One of the simplest forms of signature is the "checksum", which simply verifies that two pieces of data are the same. For example, when installing software from source it's important to first confirm that your copy of the source code is identical to the author's. A common approach to do this is to run the source through a cryptographic hash function and compare the output with the checksum published on the project's home page.

Let"s say for instance that you"re about to download a project"s source in a gzipped file from a web mirror. The SHA1 checksum published on the project"s web page is "eb84e8da7ca23e9f83...."

# so you get the code from the mirror
download https://mirror.example-codedump.com/source_code.tar.gz
# you calculate the hash as instructed
sha1(source_code.tar.gz)
> eb84e8da7c....

Both hashes are the same, you know that you have an identical copy.
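The same check takes a few lines of Python with hashlib (the file name and published value are the illustrative ones from above; a real check compares the full digests):

import hashlib

published = "eb84e8da7ca23e9f83..."  # truncated, as published above
with open("source_code.tar.gz", "rb") as archive:
    digest = hashlib.sha1(archive.read()).hexdigest()
print(digest.startswith(published.rstrip(".")))  # True for an identical copy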

What"s a cookie?

An extensive discussion on cookies would go beyond the scope of this question. I provide an overview here since a minimal understanding of how cookies work makes it easier to see how and why a SECRET_KEY is useful. I highly encourage you to follow up with some personal reading on HTTP Cookies.

A common practice in web applications is to use the client (web browser) as a lightweight cache. Cookies are one implementation of this practice. A cookie is typically some data added by the server to an HTTP response by way of its headers. It's kept by the browser which subsequently sends it back to the server when issuing requests, also by way of HTTP headers. The data contained in a cookie can be used to emulate what's called statefulness, the illusion that the server is maintaining an ongoing connection with the client. Only, in this case, instead of a wire to keep the connection "alive", you simply have snapshots of the state of the application after it has handled a client's request. These snapshots are carried back and forth between client and server. Upon receiving a request, the server first reads the content of the cookie to reestablish the context of its conversation with the client. It then handles the request within that context and before returning the response to the client, updates the cookie. The illusion of an ongoing session is thus maintained.

What does a cookie look like?

A typical cookie would look like this:

name: _profile
content: uid=382|status=genie
domain: .example.com
path: /
send for: Encrypted connections only
expires: July 1 2030, 1:20:40 AM UTC

Cookies are trivial to peruse from any modern browser. On Firefox for example go to Preferences > Privacy > History > remove individual cookies.

The content field is the most relevant to the application. Other fields carry mostly meta instructions to specify various scopes of influence.

Why use cookies at all?

The short answer is performance. Using cookies minimizes the need to look things up in various data stores (memory caches, files, databases, etc.), thus speeding things up on the server application's side. Keep in mind that the bigger the cookie, the heavier the payload over the network, so what you save in database lookups on the server you might lose over the network. Consider carefully what to include in your cookies.

Why would cookies need to be signed?

Cookies are used to keep all sorts of information, some of which can be very sensitive. They're also by nature not safe and require a number of auxiliary precautions to be considered secure in any way for both parties, client and server. Signing cookies specifically addresses the problem that they can be tinkered with in attempts to fool server applications. There are other measures to mitigate other types of vulnerabilities; I encourage you to read up more on cookies.

How can a cookie be tampered with?

Cookies reside on the client in text form and can be edited with no effort. A cookie received by your server application could have been modified for a number of reasons, some of which may not be innocent. Imagine a web application that keeps permission information about its users on cookies and grants privileges based on that information. If the cookie is not tinker-proof, anyone could modify theirs to elevate their status from "role=visitor" to "role=admin" and the application would be none the wiser.

Why is a SECRET_KEY necessary to sign cookies?

Verifying cookies is a tad different from verifying source code the way it's described earlier. In the case of the source code, the original author is the trustee and owner of the reference fingerprint (the checksum), which is kept public. What you don't trust is the source code, but you trust the public signature. So to verify your copy of the source, you simply want your calculated hash to match the public hash.

In the case of a cookie, however, the application doesn't keep track of the signature; it keeps track of its SECRET_KEY. The SECRET_KEY is the reference fingerprint. Cookies travel with a signature that they claim to be legit. Legitimacy here means that the signature was issued by the owner of the cookie, that is, the application, and in this case it's that claim that you don't trust, so you need to check the signature for validity. To do that you need to include an element in the signature that is only known to you: that's the SECRET_KEY. Someone may change a cookie, but since they don't have the secret ingredient to properly calculate a valid signature, they cannot spoof it. As stated a bit earlier, this type of fingerprinting, where on top of the checksum one also provides a secret key, is called a Message Authentication Code.

What about Sessions?

Sessions in their classical implementation are cookies that carry only an ID in the content field, the session_id. The purpose of sessions is exactly the same as signed cookies, i.e. to prevent cookie tampering. Classical sessions have a different approach, though. Upon receiving a session cookie the server uses the ID to look up the session data in its own local storage, which could be a database, a file, or sometimes a cache in memory. The session cookie is typically set to expire when the browser is closed. Because of the local storage lookup step, this implementation of sessions typically incurs a performance hit. Signed cookies are becoming a preferred alternative, and that's how Flask's sessions are implemented. In other words, Flask sessions are signed cookies, and to use signed cookies in Flask just use its Session API.

Why not also encrypt the cookies?

Sometimes the contents of cookies are encrypted before also being signed. This is done if they're deemed too sensitive to be visible from the browser (encryption hides the contents). Simply signing cookies, however, addresses a different need: a desire to maintain a degree of visibility and usability of cookies on the browser, while preventing them from being meddled with.

What happens if I change the SECRET_KEY?

By changing the SECRET_KEY you're invalidating all cookies signed with the previous key. When the application receives a request with a cookie that was signed with a previous SECRET_KEY, it will try to recalculate the signature with the new SECRET_KEY, and the two signatures won't match; the cookie and all its data will be rejected. It will be as if the browser were connecting to the server for the first time: users will be logged out and their old cookie will be forgotten, along with anything stored inside. Note that this is different from the way an expired cookie is handled. An expired cookie may have its lease extended if its signature checks out. An invalid signature just implies a plain invalid cookie.

So unless you want to invalidate all signed cookies, try to keep the SECRET_KEY the same for extended periods.

What"s a good SECRET_KEY?

A secret key should be hard to guess. The documentation on Sessions has a good recipe for random key generation:

>>> import os
>>> os.urandom(24)
"xfd{Hxe5<x95xf9xe3x96.5xd1x01O<!xd5xa2xa0x9fR"xa1xa8"

You copy the key and paste it in your configuration file as the value of SECRET_KEY.

Short of using a key that was randomly generated, you could use a complex assortment of words, numbers, and symbols, perhaps arranged in a sentence known only to you, encoded in byte form.
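On Python 3.6 and later, the standard library's secrets module offers an equivalent recipe (an alternative not mentioned in this answer):

>>> import secrets
>>> secrets.token_bytes(24)

As with the os.urandom recipe, run it once in an interactive shell and paste the result into your configuration.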

Do not set the SECRET_KEY directly with a function that generates a different key each time it's called. For example, don't do this:

# this is not good
SECRET_KEY = random_key_generator()

Each time your application is restarted it will be given a new key, thus invalidating the previous one.

Instead, open an interactive python shell and call the function to generate the key, then copy and paste it to the config.

Answer #3

random.random() does exactly that

>>> import random
>>> for i in range(10):
...     print(random.random())
... 
0.908047338626
0.0199900075962
0.904058545833
0.321508119045
0.657086320195
0.714084413092
0.315924955063
0.696965958019
0.93824013683
0.484207425759

If you want really random numbers, and to cover the range [0, 1]:

>>> import os
>>> int.from_bytes(os.urandom(8), byteorder="big") / ((1 << 64) - 1)
0.7409674234050893

Answer #4

np.random.seed(0) makes the random numbers predictable

>>> numpy.random.seed(0) ; numpy.random.rand(4)
array([ 0.55,  0.72,  0.6 ,  0.54])
>>> numpy.random.seed(0) ; numpy.random.rand(4)
array([ 0.55,  0.72,  0.6 ,  0.54])

With the seed reset (every time), the same set of numbers will appear every time.

If the random seed is not reset, different numbers appear with every invocation:

>>> numpy.random.rand(4)
array([ 0.42,  0.65,  0.44,  0.89])
>>> numpy.random.rand(4)
array([ 0.96,  0.38,  0.79,  0.53])

(pseudo-)random numbers work by starting with a number (the seed), multiplying it by a large number, adding an offset, then taking the modulo of that sum. The resulting number is then used as the seed to generate the next "random" number. When you set the seed (every time), it does the same thing every time, giving you the same numbers.
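That scheme is a linear congruential generator. A toy sketch of the idea (the constants are illustrative; numpy's legacy generator is actually the more elaborate Mersenne Twister):

def lcg(seed, a=1664525, c=1013904223, m=2**32):
    while True:
        seed = (a * seed + c) % m   # multiply, add an offset, take the modulo
        yield seed / m              # scale into [0, 1)

gen = lcg(0)
print([round(next(gen), 3) for _ in range(4)])  # same seed -> same numbers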

If you want seemingly random numbers, do not set the seed. If you have code that uses random numbers that you want to debug, however, it can be very helpful to set the seed before each run so that the code does the same thing every time you run it.

To get the most random numbers for each run, call numpy.random.seed(). This will cause numpy to set the seed to a random number obtained from /dev/urandom or its Windows analog or, if neither of those is available, it will use the clock.

For more information on using seeds to generate pseudo-random numbers, see wikipedia.

Answer #5

This Stack Overflow question is the current top Google result for "random string Python". The current top answer is:

"".join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))

This is an excellent method, but the PRNG in random is not cryptographically secure. I assume many people researching this question will want to generate random strings for encryption or passwords. You can do this securely by making a small change in the above code:

"".join(random.SystemRandom().choice(string.ascii_uppercase + string.digits) for _ in range(N))

Using random.SystemRandom() instead of just random uses /dev/urandom on *nix machines and CryptGenRandom() on Windows. These are cryptographically secure PRNGs. Using random.choice instead of random.SystemRandom().choice in an application that requires a secure PRNG could be potentially devastating, and given the popularity of this question, I bet that mistake has been made many times already.

If you"re using python3.6 or above, you can use the new secrets module as mentioned in MSeifert"s answer:

"".join(secrets.choice(string.ascii_uppercase + string.digits) for _ in range(N))

The module docs also discuss convenient ways to generate secure tokens and best practices.
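For tokens specifically, secrets ships ready-made helpers; a short sketch of the two most common ones:

>>> import secrets
>>> secrets.token_hex(16)       # 32 hexadecimal characters
>>> secrets.token_urlsafe(16)   # URL-safe text token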
