"Advanced" Python

NICAR 2015

http://bit.ly/advancedpycarslides

http://bit.ly/advancedpycar

Geoff Hing / @geoffhing

Getting started

In this environment, use the Anaconda shell

                    
cd c:\                    
git clone https://github.com/ghing/nicar2015_advanced_python.git
cd nicar2015_advanced_python
virtualenv venv
.\venv\Scripts\activate
pip install -r requirements.txt
                    
                  

Where we're going

  • Test-driven development (TDD)
  • Object-oriented design
  • Functional programming

Format

  • Concept
  • Example
  • Exercise *

* Show your work! Tweet @geoffhing with pastebin or gist links when you get the solution

Test-driven development (TDD)

  1. Write (failing) test
  2. Make it pass
  3. Refactor

Test-driven development (TDD)

  • unit tests
  • test case
  • test runner
  • assertion
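
A minimal sketch of how these terms fit together, using the standard library's unittest module. The shout() function and the ShoutTestCase class are hypothetical names made up for illustration; they are not part of the workshop repository.

```python
import unittest


def shout(text):
    """Hypothetical function under test: upper-case the text and add '!'."""
    return text.upper() + "!"


class ShoutTestCase(unittest.TestCase):
    """A test case groups related unit tests."""

    def test_shout_uppercases(self):
        # An assertion compares the actual result with the expected one.
        self.assertEqual(shout("nicar"), "NICAR!")


def run_shout_tests():
    """Use a test runner to execute the test case and collect the results."""
    suite = unittest.TestLoader().loadTestsFromTestCase(ShoutTestCase)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_shout_tests() returns a result object; its wasSuccessful() method reports whether every assertion held. On the command line, nosetests plays the same runner role and discovers test cases for you.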

On deadline?!?

  • Simple designs
  • Confidence
  • Ch-ch-ch-ch-changes
  • Manual testing is slow

Testing tools

Example Test

tests/test_sql.py from csvkit

                    
class TestSQL(unittest.TestCase):
    def setUp(self):
        # ...

    def test_make_column_name(self):
        c = sql.make_column(table.Column(0, 'test', [u'1', u'-87', u'418000000', u'']))
        self.assertEqual(c.key, 'test')
                    
                  

Let's run some tests

                    
nosetests
                    
                  

Let's write a test

Implement the test_slugify_whitespace() method in tests/test_utils.py to reflect the new requirements.

Make it pass

Implement pycar_advanced.utils.slugify() so your test passes!
                    
nosetests tests/test_utils.py
                    
                  

Object Oriented Design

  • Inheritance
  • Composition
  • Multiple inheritance
  • Mixin
  • Delegation
  • Magic Methods

Inheritance vs. Composition

is a vs. has a
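
A small sketch of the two relationships side by side. All class names here (Animal, Dog, Leash, Walker) are invented for illustration.

```python
class Animal(object):
    def speak(self):
        return "..."


class Dog(Animal):
    """Inheritance: a Dog *is an* Animal and overrides speak()."""

    def speak(self):
        return "Woof"


class Leash(object):
    def __init__(self, length_ft):
        self.length_ft = length_ft


class Walker(object):
    """Composition: a Walker *has a* dog and *has a* leash."""

    def __init__(self, dog, leash):
        self.dog = dog
        self.leash = leash

    def walk(self):
        # The walker uses the objects it holds instead of inheriting from them.
        return "%s on a %d ft leash" % (self.dog.speak(), self.leash.length_ft)
```

Walker(Dog(), Leash(6)).walk() returns 'Woof on a 6 ft leash', yet a Walker is not an Animal.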

Inheritance

panda/models/slugged_model.py from PANDA

                    
from django.db import models

class SluggedModel(models.Model):
    slug = models.SlugField(_('slug'), max_length=256)

    def save(self, *args, **kwargs):
        if not self.slug:
            self.slug = self.generate_unique_slug()

        super(SluggedModel, self).save(*args, **kwargs)

    def generate_unique_slug(self):
        # ...
                    
                  

Inheritance

panda/models/dataset.py from PANDA

                    
from panda.models.slugged_model import SluggedModel

class Dataset(SluggedModel):
    """
    A PANDA dataset (one table & associated metadata).
    """
    name = models.CharField(_('name'), max_length=256,
        help_text=_('User-supplied dataset name.'))
    description = models.TextField(_('description'), blank=True,
        help_text=_('User-supplied dataset description.'))
    # ...

    def save(self, *args, **kwargs):
        """
        Save the date of creation.
        """
        if not self.creation_date:
            self.creation_date = now()

        super(Dataset, self).save(*args, **kwargs)
                    
                  

Inheritance

Dataset is a SluggedModel is a Model
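
A sketch of how each save() in such a chain does its own work and then hands off to its parent with super(). The classes below are simplified stand-ins that need no Django; their names and the steps list are assumptions made for illustration, not PANDA's code.

```python
class Model(object):
    """Stand-in for a base model; records what happens during save()."""

    def __init__(self):
        self.steps = []

    def save(self):
        self.steps.append("store")


class SluggedThing(Model):
    """Stand-in for a slugged model: fills in a slug before saving."""

    def __init__(self, name):
        super(SluggedThing, self).__init__()
        self.name = name
        self.slug = None

    def save(self):
        if not self.slug:
            self.slug = self.name.lower().replace(" ", "-")
        self.steps.append("slug")
        super(SluggedThing, self).save()


class Report(SluggedThing):
    """Stand-in for a dataset: does its own step, then defers to the parent."""

    def save(self):
        self.steps.append("stamp")
        super(Report, self).save()
```

Saving a Report runs the most specific save() first and works up the chain, so steps ends up as ['stamp', 'slug', 'store'].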

Multiple Inheritance

bakery/views.py from Django Bakery

                    
class BuildableMixin(object):
    """
    Common methods we will use in buildable views.
    """
    def get_content(self):
       # ...

    def prep_directory(self, path):
       # ...

    def build_file(self, path, html):
       # ...

    def write_file(self, path, html):
       # ...
                    
                  

Multiple Inheritance

bakery/views.py from Django Bakery

                    
from django.views.generic import TemplateView

class BuildableTemplateView(TemplateView, BuildableMixin):
    @property
    def build_method(self):
        return self.build

    def build(self):
        logger.debug("Building %s" % self.template_name)
        self.request = RequestFactory().get(self.build_path)
        path = os.path.join(settings.BUILD_DIR, self.build_path)
        self.prep_directory(self.build_path)
        self.build_file(path, self.get_content())
                    
                  

Let's use a mixin!

Use the ResultsTestMixin class to avoid duplicate code in the test case classes in tests/test_oo.py.

Multiple Inheritance

Be careful!

                    
 >>> class A(object):
 ...     def whoami(self):
 ...         print('A.whoami')
 ...
 >>> class B(A):
 ...     def whoami(self):
 ...         print('B.whoami')
 ...
 >>> class C(A):
 ...     def whoami(self):
 ...         print('C.whoami')
 ...
 >>> class D(B, C):
 ...     pass
 >>> D().whoami()
                    
                  

Multiple Inheritance

Be careful!

                    
 >>> class A(object):
 ...     def whoami(self):
 ...         print('A.whoami')
 ...
 >>> class B(A):
 ...     def whoami(self):
 ...         print('B.whoami')
 ...
 >>> class C(A):
 ...     def whoami(self):
 ...         print('C.whoami')
 ...
 >>> class D(B, C):
 ...     pass
 >>> D().whoami()
 B.whoami
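
Why B and not C? Python searches the classes in method resolution order (MRO), which for new-style classes follows the order of the base classes in the class statement. A sketch that inspects it; class E and the mro_names() helper are additions for illustration.

```python
class A(object):
    def whoami(self):
        return "A.whoami"


class B(A):
    def whoami(self):
        return "B.whoami"


class C(A):
    def whoami(self):
        return "C.whoami"


class D(B, C):
    pass


class E(C, B):
    # Same parents as D, listed in the opposite order.
    pass


def mro_names(cls):
    """Return the class names in the order Python searches them."""
    return [klass.__name__ for klass in cls.__mro__]
```

mro_names(D) gives ['D', 'B', 'C', 'A', 'object'], so D finds B's method first. Swapping the base classes, as in E, makes C win.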
                    
                  

Composition

elections/ap.py from python-elections.

                    
class AP(object):
    """
    The public client you can use to connect to AP's data feed.
    """
    # ...

class BaseAPResultCollection(object):
    """
    Base class that defines the methods to retrieve AP CSV
    data
    """
    def __init__(self, client, name, results=True, delegates=True):
        self.client = client
        # ...
                    
                  

Composition

BaseAPResultCollection has a client (AP)
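
One payoff of "has a": the held object can be swapped without touching the class that uses it. A rough sketch with hypothetical classes (FakeClient, ResultCollection) and made-up rows; this is not the python-elections API.

```python
class FakeClient(object):
    """Hypothetical client that serves canned rows instead of using a network."""

    def __init__(self, rows_by_name):
        self._rows_by_name = rows_by_name

    def fetch(self, name):
        return self._rows_by_name.get(name, [])


class ResultCollection(object):
    """Holds a client and asks it for data; it does not inherit from it."""

    def __init__(self, client, name):
        self.client = client
        self.name = name

    def results(self):
        # Any object with a fetch() method will do here.
        return self.client.fetch(self.name)
```

Because ResultCollection only calls self.client.fetch(), a test can pass in FakeClient while production code passes in a real client.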

Composition

                    
class FilesystemAP(AP):
    """
    AP client that reads data from the filesystem instead of from the AP FTP
    server
    """
    def __init__(self, results_path, username=None, password=None):
        super(FilesystemAP, self).__init__(username, password)
        self._results_path = results_path

    @property
    def ftp(self):
        if not self._ftp:
            self._ftp = MockFTP()

        return self._ftp

    def _fetch(self, path):
        clean_path = path.lstrip('/')
        full_path = os.path.join(self._results_path, clean_path)
        return open(full_path)
                    
                  

Delegation

                     
class UpperOut(object):
    def __init__(self, outfile):
        self._outfile = outfile

    def write(self, s):
        self._outfile.write(s.upper())

    def __getattr__(self, name):
        return getattr(self._outfile, name)
                     
                   

Delegation

                     
>>> s = "nicar"
>>> with open('/Users/ghing/Desktop/test.txt', 'w') as f:
...     f.write(s)
...
>>> with open('/Users/ghing/Desktop/test.txt') as f:
...     print(f.read())
...
nicar
>>> with open('/Users/ghing/Desktop/test_upper.txt', 'w') as f:
...     upperf = UpperOut(f)
...     upperf.write(s)
...
>>> with open('/Users/ghing/Desktop/test_upper.txt', 'r') as f:
...     upperf = UpperOut(f)
...     print(upperf.read())
...
NICAR
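
The same idea can be tried without touching the filesystem. This sketch reuses the UpperOut class and wraps an in-memory io.StringIO buffer; the demo() function is an addition for illustration.

```python
import io


class UpperOut(object):
    def __init__(self, outfile):
        self._outfile = outfile

    def write(self, s):
        # Handled here: upper-case the text before passing it on.
        self._outfile.write(s.upper())

    def __getattr__(self, name):
        # Only called when normal lookup fails, so everything UpperOut
        # does not define itself is delegated to the wrapped object.
        return getattr(self._outfile, name)


def demo():
    buf = io.StringIO()
    out = UpperOut(buf)
    out.write(u"nicar")
    # getvalue() is not defined on UpperOut; it is delegated to buf.
    return out.getvalue()
```

demo() returns 'NICAR': write() is intercepted, while getvalue() passes straight through to the buffer.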
                     
                   

Let's compose!

Fill in the methods in pycar_advanced/oo.py, using composition to build a basic election results fetcher/loader. Then these tests should pass:

                    
nosetests tests/test_oo.py
                    
                  

Functional Programming

  • Iterators
  • Generators
  • List comprehensions
  • Decorators

Iterators

  • A Python language feature
  • Represent a stream of data
  • Have a next() method
  • Raise StopIteration when there are no more items
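
A sketch of a hand-written iterator that follows this protocol. Countdown is a made-up example class. Note that Python 3 spells the method __next__() rather than next(); defining both, and calling the built-in next() function, works on either version.

```python
class Countdown(object):
    """Iterate from start down to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__().
        return self

    def __next__(self):
        if self.current <= 0:
            # Signal that the stream is exhausted.
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    # Python 2 looks for a method named next().
    next = __next__
```

list(Countdown(3)) gives [3, 2, 1]; a for loop catches the StopIteration for you.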

Iterators

                    
>>> L = [1,2,3]
>>> it = iter(L)
>>> it.next()
1
>>> it.next()
2
>>> it.next()
3
>>> it.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>
                    
                  

Generators

  • Return an iterator
  • Functions that pick up where they left off
  • Use yield rather than return
  • Can generate infinite sequences
  • Use less memory
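
A sketch of these points: an infinite generator that resumes after each yield, consumed a few items at a time with itertools.islice. The names squares() and first_squares() are made up for illustration.

```python
from itertools import islice


def squares():
    """Yield 1, 4, 9, ... forever, computing one value at a time."""
    n = 1
    while True:
        yield n * n
        # Execution resumes here the next time a value is requested.
        n += 1


def first_squares(count):
    """Take only the first `count` values from the infinite sequence."""
    return list(islice(squares(), count))
```

first_squares(5) returns [1, 4, 9, 16, 25]. Only the values actually requested are ever computed, which is where the memory saving comes from.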

Generators

dataset/freeze/config.py from dataset.

                    
class Configuration(object):
    # ...

    @property
    def exports(self):
        if not isinstance(self.data, dict):
            raise FreezeException("The root element of the freeze file needs to be a hash")
        if not isinstance(self.data.get('exports'), list):
            raise FreezeException("The freeze file needs to have a list of exports")
        common = self.data.get('common', {})
        for export in self.data.get('exports'):
            yield Export(common, export)
                    
                  

Let's make a generator

Implement the generate_results() function in pycar_advanced/functional.py so that the following command shows passing tests.

                    
nosetests \
tests/test_functional.py:FunctionalTestCase.test_generate_results
                    
                  

Filtering and transforming iterables

                    
>>> vals = [1, 2, 3]
>>> map(lambda x: x*x, vals)
[1, 4, 9]
>>> filter(lambda x: (x % 2) == 0, vals)
[2]
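
The lists shown above are what Python 2 prints. On Python 3, map() and filter() return lazy iterators instead, so wrap them in list() when you want to see every value at once. This version works on both:

```python
vals = [1, 2, 3]

# list() forces the lazy iterator on Python 3 and is harmless on Python 2.
squared = list(map(lambda x: x * x, vals))
evens = list(filter(lambda x: (x % 2) == 0, vals))
```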
                    
                  

Lambda expressions

                      
>>> vals = [1, 2, 3]
>>> map(lambda x: x*x, vals)
[1, 4, 9]
>>> def square(x):
...     return x*x
...
>>> map(square, vals)
[1, 4, 9]
                      
                   

List comprehensions

                    
>>> vals = [1, 2, 3]
>>> [x * x for x in vals]
[1, 4, 9]
>>> [x for x in vals if (x % 2) == 0]
[2]
                    
                  

Let's use a list comprehension

Implement the candidate_first_names() function in pycar_advanced/functional.py using a list comprehension so that the following command shows passing tests:

                    
nosetests \
tests/test_functional.py:FunctionalTestCase.test_candidate_first_names
                    
                  

Generator expressions

                    
>>> vals = range(10)
>>> even_iter = (x for x in vals if (x % 2) == 0)
>>> even_iter.next()
0
>>> even_iter.next()
2
                    
                  

Generator expressions

                    
rahm = [r for r in results if r.candidate_name == "Rahm Emanuel"][0]
rahm = next(r for r in results if r.candidate_name == "Rahm Emanuel")
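
The list version builds the whole list and raises IndexError when nothing matches; the generator version stops at the first match but raises StopIteration when nothing matches. Passing a default to next() avoids the exception. A sketch, where the Result tuple, the vote counts, and find_candidate() are all made up for illustration:

```python
from collections import namedtuple

# Hypothetical result record with placeholder numbers, not real data.
Result = namedtuple("Result", ["candidate_name", "votes"])

results = [
    Result("Example Candidate", 5),
    Result("Rahm Emanuel", 10),
]


def find_candidate(results, name):
    """Return the first matching result, or None if there is no match."""
    matches = (r for r in results if r.candidate_name == name)
    return next(matches, None)
```

Note the extra parentheses needed when a generator expression is not the only argument to a call.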
                    
                  

Functions are first class objects

                    
>>> def f(x):
...     print(x)
...
>>> isinstance(f, object)
True
                    
                  

Functions can be arguments

                    
>>> from functools import reduce
>>> def concat(s1, s2):
...     return s1 + " " + s2
...
>>> strgs = ["NICAR", "is", "great"]
>>> reduce(concat, strgs)
'NICAR is great'
                    
                  

Function definitions can be nested

                    
>>> def join(bits, sep=' '):
...     def concat(s1, s2):
...        return s1 + sep + s2
...     return reduce(concat, bits)
...
>>> join(["NICAR", "is", "great"])
'NICAR is great'
>>> concat("NICAR", "is")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'concat' is not defined
                    
                  

Functions can return functions

                    
>>> def get_name_fn(first_name, last_name):
...     def name_fn():
...         print("{} {}".format(first_name, last_name))
...     return name_fn
...
>>> geoff = get_name_fn("Geoff", "Hing")
>>> geoff()
Geoff Hing
                    
                  

Decorators

From the Chicago Tribune's electioncenter2 Fabfile. Fabric makes heavy use of decorators.

                    
@roles('admin')
def deploy():
    require('branch', provided_by=BRANCH_PROVIDERS)
    sync()
    install_requirements()
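
A decorator that takes arguments, like @roles('admin'), is a function that returns a decorator. The sketch below shows the general pattern with a made-up tag() factory; it is not how Fabric implements roles.

```python
def tag(**attrs):
    """Hypothetical decorator factory: attach attributes to a function."""
    def decorator(f):
        for key, value in attrs.items():
            setattr(f, key, value)
        # Return the original function, now annotated.
        return f
    return decorator


@tag(roles=["admin"])
def deploy():
    return "deployed"
```

tag(roles=["admin"]) runs first and returns decorator, which is then applied to deploy. Afterwards deploy.roles is ['admin'] and deploy() still returns 'deployed'.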
                    
                  

Decorators

                    
>>> def log_output(f):
...     def inner():
...         output = f()
...         print(output)
...     return inner
...
>>> def me():
...     return "Geoff Hing"
...
>>> decorated = log_output(me)
>>> decorated()
Geoff Hing
                    
                  

Decorators

                    
>>> @log_output
... def me():
...     return "Geoff Hing"
...
>>> me()
Geoff Hing
                    
                  

Decorators

                    
>>> import logging
>>> def log_output(f):
...     def inner():
...         output = f()
...         logging.debug(output)
...     return inner
>>>
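
The inner() functions above take no arguments and do not return the wrapped function's output. A more general sketch accepts any arguments, passes the return value through, and uses functools.wraps to keep the original name and docstring. The full_name() function is a made-up example.

```python
import functools
import logging


def log_output(f):
    @functools.wraps(f)  # copy f's __name__ and __doc__ onto inner
    def inner(*args, **kwargs):
        output = f(*args, **kwargs)
        logging.debug(output)
        # Hand the result back so callers can still use it.
        return output
    return inner


@log_output
def full_name(first_name, last_name="Hing"):
    """Join a first and last name."""
    return "{} {}".format(first_name, last_name)
```

full_name("Geoff") still returns 'Geoff Hing', and full_name.__name__ is 'full_name' rather than 'inner'.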
                    
                  

Let's write a decorator!

Implement the run_query() function in pycar_advanced/decorators.py. We want to be able to call the candidates() function like this:

                    
from pycar_advanced.query import candidates

# Just runs query
candidates()

# Runs query and outputs the SQL text
candidates(show_sql=True)

# Only outputs the SQL text
candidates(show_sql=True, dry_run=True)
                    
                  

Let's write a decorator!

These tests should pass:

                    
nosetests tests/test_decorators.py