Creating a Proxy Server in Python | Set 1


Socket programming in Python is very user-friendly compared to C. The programmer does not need to worry about the low-level details of sockets and can focus on the application layer rather than the network layer. In this tutorial, we will develop a simple multi-threaded proxy server capable of handling HTTP traffic. It is based mostly on basic socket programming ideas. If you are unsure of the basics, I recommend brushing up on them before going through this tutorial.

This is a naive proxy implementation. We'll gradually turn it into a more useful server in the next tutorials.

Let's walk through the process in 3 easy steps.

1. Creating an incoming socket
We create a serverSocket in the __init__ method of the server class. This socket accepts incoming connections. We then bind it to a host and port and wait for clients to connect.

    def __init__(self, config):
        # Shutdown on Ctrl+C
        signal.signal(signal.SIGINT, self.shutdown)

        # Create a TCP socket
        self.serverSocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

        # Re-use the socket
        self.serverSocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

        # Bind the socket to a public host and a port
        self.serverSocket.bind((config['HOST_NAME'], config['BIND_PORT']))

        self.serverSocket.listen(10)  # become a server socket
        self.__clients = {}
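As a rough standalone illustration of the same setup, the sketch below creates a listening socket outside of any class. The config dictionary is hypothetical: its key names follow the ones used in this tutorial, but the values (including port 0, which asks the operating system for any free port) are assumptions made for the example.

```python
import socket

# Hypothetical configuration: key names mirror the tutorial, values are assumed.
config = {
    "HOST_NAME": "127.0.0.1",
    "BIND_PORT": 0,            # 0 = let the OS pick a free port
    "MAX_REQUEST_LEN": 1024,
    "CONNECTION_TIMEOUT": 5,
}


def create_server_socket(config):
    """Create a TCP socket, allow address reuse, bind it and start listening."""
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server_socket.bind((config["HOST_NAME"], config["BIND_PORT"]))
    server_socket.listen(10)   # queue up to 10 pending connections
    return server_socket


server_socket = create_server_socket(config)
host, port = server_socket.getsockname()   # the address actually bound
print(f"Listening on {host}:{port}")
server_socket.close()
```

Setting SO_REUSEADDR before bind is what lets the server restart quickly on the same port after it has been shut down.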

2. Accept clients and process requests
This is the simplest, but most important, of all the steps. We wait for a client's connection request and, once a connection is established, we hand the request to a separate thread, which leaves us available for the next request. This allows us to process multiple requests at the same time, which significantly improves server performance.

    while True:
        # Establish the connection
        (clientSocket, client_address) = self.serverSocket.accept()

        d = threading.Thread(name=self._getClientName(client_address),
                             target=self.proxy_thread,
                             args=(clientSocket, client_address))
        d.daemon = True  # same effect as setDaemon(True), which is deprecated
        d.start()
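To see the thread-per-connection pattern in isolation, here is a minimal runnable sketch. It is not the tutorial's server: handle_client is a hypothetical placeholder that simply echoes data back, and the stop_event with a short accept timeout is an assumption added so that the example can shut down cleanly.

```python
import socket
import threading


def handle_client(client_socket, client_address):
    """Placeholder handler; a real proxy would forward the request here."""
    with client_socket:
        data = client_socket.recv(1024)
        client_socket.sendall(data)   # echo back, purely for demonstration


def accept_loop(server_socket, stop_event):
    """Accept clients until stop_event is set, one daemon thread per client."""
    server_socket.settimeout(0.2)     # wake up regularly to check stop_event
    while not stop_event.is_set():
        try:
            client_socket, client_address = server_socket.accept()
        except socket.timeout:
            continue
        worker = threading.Thread(
            target=handle_client,
            args=(client_socket, client_address),
            daemon=True,
        )
        worker.start()


server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_socket.bind(("127.0.0.1", 0))  # port 0: any free port
server_socket.listen(10)

stop_event = threading.Event()
acceptor = threading.Thread(target=accept_loop, args=(server_socket, stop_event))
acceptor.start()

# Act as a client to show that the loop hands the connection to a worker.
with socket.create_connection(server_socket.getsockname()) as client:
    client.sendall(b"ping")
    reply = client.recv(1024)
print(reply)

stop_event.set()
acceptor.join()
server_socket.close()
```

Because the accept loop only starts a thread and immediately goes back to accept, a slow client does not block the others.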

3. Traffic redirection
The main job of a proxy server is to act as an intermediary between the client and the destination server. Here we fetch data from the destination server and pass it on to the client.

  • First, we will extract the URL from the received request data.
      # get the request from the browser (bytes in Python 3)
      request = conn.recv(config['MAX_REQUEST_LEN'])

      # parse the first line
      first_line = request.split(b'\n')[0]

      # get the URL as text
      url = first_line.split(b' ')[1].decode()
  • Then we find the destination address of the request. The address is a tuple of (destination_ip_address, destination_port_no). We will receive data from this address.
      http_pos = url.find("://")  # find position of ://
      if http_pos == -1:
          temp = url
      else:
          temp = url[(http_pos + 3):]  # get the rest of the URL

      port_pos = temp.find(":")  # find the port position (if any)

      # find the end of the web server name
      webserver_pos = temp.find("/")
      if webserver_pos == -1:
          webserver_pos = len(temp)

      webserver = ""
      port = -1
      if port_pos == -1 or webserver_pos < port_pos:
          # default port
          port = 80
          webserver = temp[:webserver_pos]
      else:
          # specific port
          port = int((temp[(port_pos + 1):])[:webserver_pos - port_pos - 1])
          webserver = temp[:port_pos]
  • We now establish a new connection to the destination server (or remote server) and send it a copy of the original request. The server then sends back a response. All response messages use the general RFC 822 message format.
      s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
      s.settimeout(config['CONNECTION_TIMEOUT'])
      s.connect((webserver, port))
      s.sendall(request)
  • Then we forward the server's response to the client. Here conn is the original client connection. The response can be larger than MAX_REQUEST_LEN, the most we read in one call, so we keep reading until recv returns empty data, which marks the end of the response.
      while True:
          # receive data from the web server
          data = s.recv(config['MAX_REQUEST_LEN'])

          if len(data) > 0:
              conn.send(data)  # send to browser / client
          else:
              break
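The manual string slicing used in the steps above to find the host and port can also be expressed with the standard library. The sketch below is an alternative under stated assumptions, not the tutorial's own code: destination_from_request is a hypothetical helper name, and it assumes the request line carries an absolute URL (or a bare host[:port]/path), as browsers send to a proxy.

```python
from urllib.parse import urlsplit


def destination_from_request(request: bytes, default_port: int = 80):
    """Return (host, port) for the target named in an HTTP request line."""
    # The request line looks like: GET http://example.com:8080/path HTTP/1.1
    first_line = request.split(b"\n", 1)[0].decode("latin-1").strip()
    url = first_line.split()[1]
    if "://" not in url:
        # Prefix "//" so urlsplit treats "host:port/path" as a network location
        url = "//" + url
    parts = urlsplit(url)
    return parts.hostname, parts.port or default_port


sample = b"GET http://example.com:8080/index.html HTTP/1.1\r\nHost: example.com:8080\r\n\r\n"
print(destination_from_request(sample))
```

Letting urlsplit do the work avoids off-by-one mistakes in the index arithmetic, at the cost of hiding what the parsing actually involves.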

We then close the corresponding connections and add error handling to make sure the server keeps working as expected.
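One possible shape for that cleanup is sketched below. It is an assumption about how the forwarding loop and the closing of connections might be combined, not the tutorial's actual code: relay_response is a hypothetical helper, and the demonstration uses in-process socket pairs in place of a real browser and web server.

```python
import socket


def relay_response(server_sock, client_sock, bufsize=4096):
    """Copy bytes from server_sock to client_sock until the server closes.

    Both sockets are closed afterwards, even if an error occurs.
    Returns the number of bytes forwarded.
    """
    total = 0
    try:
        while True:
            data = server_sock.recv(bufsize)
            if not data:              # empty bytes: the server closed its side
                break
            client_sock.sendall(data)
            total += len(data)
    except OSError as exc:            # covers timeouts and connection resets
        print(f"relay stopped early: {exc}")
    finally:
        server_sock.close()
        client_sock.close()
    return total


# Demonstration: socket pairs stand in for the two real connections.
server_side, fake_server = socket.socketpair()
client_side, fake_client = socket.socketpair()

payload = b"HTTP/1.0 200 OK\r\n\r\nhello"
fake_server.sendall(payload)
fake_server.close()                   # signals the end of the response

forwarded = relay_response(server_side, client_side)
received = fake_client.recv(4096)
fake_client.close()
print(forwarded, received)
```

Putting the close calls in a finally block means a failed or timed-out request does not leak sockets, which matters once the server handles many threads.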

How to check the server?
1. Start the server on the terminal. Leave it on and switch to your favorite browser. 
2. Go to the proxy settings of your browser and change the proxy server to "localhost" and the port to "12345". 
3. Now open any HTTP (not HTTPS) site, and voilà! You should be able to access the content in the browser.
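If you would rather not change browser settings, the same check can be scripted. The sketch below uses urllib from the standard library; fetch_through_proxy is a hypothetical helper, and the default proxy address assumes the server is listening on localhost port 12345 as in step 2.

```python
import urllib.request


def fetch_through_proxy(url, proxy="http://localhost:12345", timeout=10):
    """Fetch url through the given HTTP proxy and return the response body."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy})
    )
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

With the proxy running, calling fetch_through_proxy with any plain HTTP URL should return the page body as bytes, and the proxy should show the request arriving.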

When the server is running, we can monitor the requests coming from the client. We may use this data to monitor the content being fetched, or to build statistics based on it.
We can even restrict website access or blacklist an IP address. We'll be dealing with many of these features in the next tutorials.

What's next?
We will add the following features to our proxy server in the next tutorials.
  • Domain Blacklists
  • Content Monitoring
  • Logging
  • HTTP WebServer + ProxyServer

All working source code for this tutorial is available here.

