Downloading Encrypted and Compressed Files with Python

Earlier this year, I was tasked with creating an application that would download information from our organization's website using Python. The tricky part was that the data would be encrypted and gzipped, and the payload would be JSON. Could Python do all that? Well, that's what I wanted to find out. Now it's time for you to learn what I discovered.

Python and Encryption

The first order of business was to figure out the encryption stuff. The payload was supposed to be AES encrypted. While Python doesn't seem to have a built-in module for this sort of thing, there is an excellent PyCrypto package for Python 2.x that works just fine. Unfortunately, their main site doesn't list how to install it on Windows. You need to do some compiling of your own to get it to work (using Visual Studio, I think), or you can download Michael Foord's builds here. I went with the latter.

Here's the basic code I ended up using:

from Crypto.Cipher import AES

cipher = AES.new(key, AES.MODE_ECB)
gzipData = cipher.decrypt(encData).strip('\000')

The encData variable is just the data that's been downloaded using urllib2. We'll look at how to do that soon enough. Just be patient. The key was provided by one of my fellow developers. Anyway, once you've decrypted it, you end up with the gzipped data.
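The strip('\000') call suggests that the server pads the compressed data with null bytes up to AES's 16-byte block size before encrypting it. That is an inference from the code rather than something the server's developers confirmed, so treat the following as a minimal sketch under that assumption. The helper names zero_pad and zero_unpad are hypothetical, and the snippet only uses plain bytes operations, so you can try it without installing a crypto package.

```python
BLOCK_SIZE = 16  # AES always works on 16-byte blocks


def zero_pad(data, block_size=BLOCK_SIZE):
    """Pad data with null bytes up to a multiple of block_size (hypothetical helper)."""
    remainder = len(data) % block_size
    if remainder == 0:
        return data
    return data + b"\x00" * (block_size - remainder)


def zero_unpad(data):
    """Remove trailing null padding only, leaving any leading bytes alone."""
    return data.rstrip(b"\x00")


padded = zero_pad(b"hello")     # what the sender would encrypt
restored = zero_unpad(padded)   # what the receiver recovers after decrypting
```

Using rstrip instead of strip only touches the end of the data, which is where the padding lives. Keep in mind that null padding is ambiguous: if the real data happens to end in a null byte, that byte gets removed along with the padding.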

Decompressing Gzipped Files

The documentation about gzipped stuff is pretty confusing. Do you use gzip or zlib? It took quite a bit of trial and error for me to figure that out, mainly because my colleague was giving me the wrong file format. This part actually ended up being super easy to do too:

import zlib
jsonTxt = zlib.decompress(gzipData)

If you do the above, you'll end up with the decompressed data. Yes, it really is that simple.
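If you aren't sure whether your data has a zlib header or a gzip header, the zlib module can handle both when you pass it the right wbits value. Here is a minimal sketch of that idea. The sample payload is made up, and decompress_any is a hypothetical helper name, not something from the original script.

```python
import zlib


def decompress_any(data):
    """Decompress data that has either a zlib or a gzip header (hypothetical helper)."""
    # Adding 32 to the window size tells zlib to detect the header type itself
    return zlib.decompress(data, zlib.MAX_WBITS | 32)


sample = b'{"status": "ok"}'  # made-up payload

# zlib-wrapped data, which a plain zlib.decompress() call handles
zlib_data = zlib.compress(sample)

# gzip-wrapped data: adding 16 to the window size makes zlib write a gzip header
compressor = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS | 16)
gzip_data = compressor.compress(sample) + compressor.flush()

from_zlib = decompress_any(zlib_data)
from_gzip = decompress_any(gzip_data)
```

A plain zlib.decompress(gzip_data) call raises zlib.error because it expects a zlib header, which is likely the kind of confusion you run into when the file format isn't what you were told.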

JSON and Python

Starting with Python 2.6, you get a json module shipped with Python. You can read about it here. If you're stuck with an older version, then you can download the module from PyPI instead. Or you can use the simplejson package, which is what I used.

import simplejson
json = simplejson.loads(jsonTxt)

Now you'll have a set of nested dictionaries. Basically, you'll want to do something like this to use it:

data = json['keyName']

That will return another dictionary with different data. You'll want to study the data structure a bit to figure out the best way to access what you want.
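To make that a bit more concrete, here is a small sketch that walks a nested structure. The payload and its key names (calls, unit, count, updated) are made up for illustration, since the real service's layout isn't shown here. It uses the standard json module, which has the same loads function as simplejson.

```python
import json

# Made-up payload; the real service's keys and layout will differ
json_txt = '{"calls": [{"id": 1, "unit": "E12"}, {"id": 2, "unit": "M4"}], "count": 2}'

data = json.loads(json_txt)

# Top-level keys behave like any other dictionary keys
calls = data["calls"]

# Nested lists and dictionaries are walked with ordinary indexing and loops
units = [call["unit"] for call in calls]

# .get() avoids a KeyError when a key may be missing
updated = data.get("updated", "unknown")
```

Printing the top-level keys first is usually the quickest way to get your bearings in an unfamiliar payload.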

Putting it all Together

Now let's put it all together and show you the completed script:

import simplejson
import urllib2
import zlib
from Crypto.Cipher import AES
from platform import node
from win32api import GetUserName

version = "1.0.4"
uid = GetUserName().upper()
machine = node()

#----------------------------------------------------------------------
def getData(url, key):
    """
    Downloads and decrypts gzipped data and returns a JSON string
    """
    try:
        headers = {"X-ActiveCalls-Version":version,
                   "X-ActiveCalls-User-Windows-user-ID":uid,
                   "X-ActiveCalls-Client-Machine-Name":machine}
        request = urllib2.Request(url, headers=headers)
        f = urllib2.urlopen(request)
        encData = f.read()
    
        cipher = AES.new(key, AES.MODE_ECB)
        gzipData = cipher.decrypt(encData).strip('\000')
        
        jsonTxt = zlib.decompress(gzipData)
        return jsonTxt
    except Exception:
        # Catch Exception instead of using a bare except so Ctrl+C still works
        msg = "Error: Program unable to contact update server. Please check configuration URL"
        print msg

if __name__ == "__main__":
    json = getData("some url", "some AES key")

In this particular example, I needed to also let the server know which version of the application was requesting data, who the user was, and which machine the request came from. To do all that, we use urllib2's Request class to pass custom headers to the server with that information. The rest of the code should be pretty self-explanatory.
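If you want to check the headers without hitting a server, you can build the request object and inspect it on its own. The sketch below does that. The build_request helper and the example URL, user and machine values are hypothetical, and the import falls back between Python 3's urllib.request and Python 2's urllib2 so it runs on either.

```python
try:
    from urllib.request import Request  # Python 3
except ImportError:
    from urllib2 import Request  # Python 2


def build_request(url, version, user, machine):
    """Build a request carrying the custom headers (hypothetical helper)."""
    headers = {"X-ActiveCalls-Version": version,
               "X-ActiveCalls-User-Windows-user-ID": user,
               "X-ActiveCalls-Client-Machine-Name": machine}
    return Request(url, headers=headers)


# Placeholder values; nothing is sent over the network here
request = build_request("http://example.com/data", "1.0.4", "MIKE", "DESKTOP-01")

# Request stores header names in capitalized form, so look them up that way
sent_version = request.get_header("X-activecalls-version")
```

HTTP header names are case-insensitive, so the server should still see the same headers even though the request object normalizes their capitalization.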

Wrapping Up

I hope that all made sense and that it's helpful to you in your Python adventures. If not, check out the links I provided in the various sections and do a little research. Have fun!
