Topic: “Python”

# Floating Point Math is Hard (i)

I’m working a piece of legacy software, and in performing a code review came across what appeared to be some extraneous parentheses. I’ve renamed the terms as generically as possible because my employer would prefer it that way, but we have a calculation that in its original form looks like this, where every term is a floating point value:

calculated_value = a * (b * (c * d + e * f)) + g


Since multiplication is associative, the outermost set of parentheses should be redundant. Thinking that would be the case, a coworker and I rewrote the calculation so:

calculated_value = a * b * (c * d + e * f) + g


This does not always return the same results, though. I have tested this on both 2.7.4 and 3.3.1, and for a certain set of values, these two forms of the calculation definitely return different results. Here is a small, complete program that produces different results between the two forms of the calculation on both 2.7.4 and 3.3.1:

# test.py
def with_parens(a, b, c, d, e, f, g):
return (a * (b * (c * d + e * f)) + g)

def without_parens(a, b, c, d, e, f, g):
return (a * b * (c * d + e * f) + g)

the_values = (1.1523070658790489, 1.7320508075688772, 0.14068856789641426, 0.5950026782638391, 0.028734293347820326, 21.523030539704976, 2.282302370324546)

result_with = with_parens(*the_values)
result_without = without_parens(*the_values)
print((result_with == result_without), result_with, result_without)

The results:
» python test.py
(False, 3.68370978406535, 3.6837097840653494)

» python3 test.py
False 3.68370978406535 3.6837097840653494


It is fairly obvious that there is a precision difference here. I am familiar with the ways that floating point math can be odd, but I would not expect the loss of associativity to be one of them. And of course, it isn’t. It just took me this long in writing up what was originally going to be a post on comp.lang.python to see it.

What is actually going on here is that Python respects order of operations (as well it should!), and floating point math is imprecise. In the first case, a * b * (<everything else>), the first thing that happens is a is multiplied by b and the result is then multiplied by everything else. In the second case, a * (b * <everything else>), b is multiplied by everything else and the result is multiplied by a at the end. Many times, this doesn’t matter, but sometimes there is a slight difference in the results because of the loss of precision when performing floating point operations.

Lesson learned (again): floating point math is hard, and will trick you. What happens here is functionally a loss of the associative property of multiplication. The two calculations are in a pure mathematical sense equivalent. But floating point math is not the same as pure math. Not even close.

# The Great Deglobalization

For nearly all of August, I have been helping a colleague work on a massive code refactoring project. The company I am consulting for relies on a decades-old (though quite robust) piece of engineering modeling software. The software was written in FORTRAN77 in the mid-to-late 1980s, as was common for such packages at the time. It is apparent that it was not written by competent software developers, however: the entire code-base smacks of fastest-way-to-get-it-to-work thinking. All of the code—all 4,000+ lines of it—came in a single file. Every variable in the program is global.1

Yes, you read that right. Every variable in the program is global.

Since no one at the company is an expert in Fortran at this point—even a modern dialect like Fortran 952—we have been translating the whole thing to Python in order to start re-working pieces of it. (We’re using Python because that’s the company’s standard; we’re reworking this piece of code for reasons long and uninteresting.) The formal translation step was completed in mid-July; since then we have been working on refactoring the project so that we can actually change things without breaking everything.

The trick, of course, is all those global variables. In order to be able to begin making any other changes, we have to eliminate the global state. Otherwise, a change in one module can (and often does) have unexpected effects elsewhere.

Our strategy has been simple: take things in small steps, albeit often very time-consuming small steps. In each of those steps we have aimed to make single, incremental change that makes things more explicit and safer. Often, this has simply meant passing and returning inordinate numbers of variables back and forth to functions, simply so that the behavior is explicit (not to say clear). At other times, this has meant doing very bad things that would be intolerable under any other circumstances, but were less bad than what we had before and allowed us to step forward toward our ultimate goals. The prime example—and one of which I’m fairly proud, though I hope never to have to use it again—is leveraging Python’s property decorators to allow us to begin using struct-like objects for passing around large amounts of data before all the global state involved in said structs was eliminated.

Basically, we created a stopgap wrapper that made struct access result in changing global state. That way, functions could receive the struct and act on it, with the global state invisible to them: from any function’s perspective, it is performing perfectly normal access/mutation operations on the class fields. Behind the scenes, we manipulated global state, until we were able to eliminate global dependencies for that variable entirely, at which point we replaced the quirky behavior with standard field initialization.

Here is the awful pattern my colleague and I used as part of that transition, followed by some explanation. (Do not attempt at home!)

class BasicallyAStruct():
def __init__(self, some_field):
global some_field_
some_field_ = some_field

@property
def some_field(self):
return some_field_

@some_field.setter
def some_field(self, value):
global some_field_
some_field = value


Here’s how it works. For starters, we renamed every global variable to end in an underscore (like some_field_) to make it easy to distinguish between global and local variables. Then, we created a class that is basically just a struct to hold data. (It can of course be expanded to be a full-up class with methods later if that makes sense.) In the constructor, we declare every global variable that the struct needs to reference, and then assign it the value passed to the constructor. Then we use Python’s property decorators to specify the access and mutation behavior: instead of storing the value to a class property like normal, calling the setter or getter3 actually returns or sets the value in the global variable. The result:

global some_field_  # set to some old value
my_struct = BasicallyAStruct(some_field_)  # create the struct

# Set
my_struct.some_field = new_value  # assign just like normal
print(some_field_)  # new_value

# Get
print(new_value == my_struct.some_field)  # True


Once we get to a point where we’ve completely eliminated the global state, we change the class definition to look like a normal class definition, completely removing the global declaration and the @property and @some_field.setter decorators:

class BasicallyAStruct():
def __init__(self, some_field):
self.some_field = some_field


From the functions using the struct, nothing has changed; they are already using standard class property access notation.

It has worked like a charm, and it demonstrates the power of decorators in a bit of an unusual situation. It is, of course, an awful use of decorators, to the extent that I would call it an abuse in general. If I ever found this in “final” code I would probably make a horrible noise in outrage; it’s a stupid thing to do. It is, however, less stupid than keeping everything global, and it made for a good intermediate solution that allowed us to minimize changes at each step along the way and therefore minimized the potential number of places anything could break as we refactored.4

I’m happy to say that almost all of those global variables are gone, and the classes are all looking pretty much like normal classes now. And the calling functions never even noticed.

I’m incredibly happy with how that came out—and I hope never to do anything like it again.

1. Experienced FORTRAN programmers will recognize the pattern: all the variables are declared in a common block at the top of every single function, except for a couple subroutines.
2. A fun piece of trivia: my first software development project was all Fortran 95. Fortran was what the physics professor who helped a bunch of students get their projects off the ground knew, so Fortran is where I started. A bit strangely, that background has ended up being valuable to two of my three employers so far.
3. Behind the scenes, Python’s property access always calls getter and setter methods—which is why you can override them as we are here. It’s a nifty bit of metaprogramming capability the language gives you.
4. This is actually a pretty good example of the principle of information hiding put to a non-standard use.

# New Sites Live

I launched a couple of sites today, both for my friend Sarah Warren:

• The Accidental Okie — her blog, based on an existing WordPress theme but with some substantial tweaks to typography, colors, backgrounds, etc.
• Swoon Designs — a site for her business selling custom invitations, stationery, and branding.

Sarah has been blogging as “The Accidental Okie” for a couple years now, but she spent that whole time on WordPress.com using one of their upgrades for a hosted site on her own domain. I was unsurprised to hear she was outgrowing WordPress.com, though — good themes are hard to come by and impossible to customize to the extent one would like, so most people who are serious about blogging end up moving to WordPress.org (if they stay with the platform). It made sense to do that and get her professional site launched at the same time.

The design for The Accidental Okie is just a tweak of an existing theme — completely new background, header, typography, and color scheme, but someone else’s work otherwise (and generally very good work indeed.)

The design for Swoon Designs was a custom design that Sarah and I brainstormed up together after looking at a number of other sites she liked. It’s built as a child theme of a popular framework theme, Reason 2.0. That, frankly, is a mistake I won’t make again. While the theme works well (and our final outcome was one with which we’re both happy), it was a chore getting it to work. The original template designer ignored some fairly basic rules of good design, especially in his CSS, and it showed rather painfully in making a child theme. In the amount of time I spent on it, we could have built Sarah a theme from scratch (and avoided quite a few headaches that came with this setup).

The original is also monstrous as far as these things go — some 36MB when compressed. By contrast, the compressed size of all the themes for chriskrycho.com, including all the child themes is about 394KB. Yes, that’s two orders of magnitude different. Nothing screams bad design decisions like a theme that size.

This experience further persuaded me that, while I’ll be happy to keep doing WordPress projects for friends from time to time,1 I want to continue my move away from PHP and WordPress to other parts of the web. There is simply too much garbage in the PHP world, and I’m increasingly aware that one of the best ways to be getting better work is to get out of that world entirely. There are a lot of reasons for that — not least the “approachability” factor in PHP, which means a lot of non-programmers are doing it, and especially in the WordPress theme area — but moving further the way I already have been toward the Python community seems like a better plan every day. (Node.js is of course on my radar as well.) Things that require you to be a good programmer are better for getting work that pays well (as the market isn’t flooded with people who don’t know what they’re doing but are willing to work for a fifth of a reasonable developer rate), and they almost always entail better kinds of work to be doing anyway.

Speaking of which… now would be the point when I stop writing and go hammer out some more work on Step Stool, which is coming along nicely (albeit more slowly than I hoped). And of course, I’ve resigned myself to the reality that this site will certainly be getting a fresh coat of paint along the way with that project. Can’t pass up an opportunity to redesign the home page, right?

1. Just to be incredibly clear: Sarah was a great client, and I have no qualms about working with her again in the future. Likewise, the work I’ve done pro bono for Mere Orthodoxy is just grand as far as I’m concerned, and I have another couple friends with similar projects potentially in the pipeline. But unless things get really desperate, the likelihood that I’ll be taking PHP or WordPress jobs from random folks on the web is… low at best.

# step-stool.io

As of a couple hours ago, my newest personal project has a website. Step Stool, the static site generator I’m writing in Python, now has a home at step-stool.io.1 If you visit the site, you’ll see that, while it looks nice (at least: I think it does), there’s not much to it and most of the links don’t go anywhere. The link that does go somewhere tells you that the project for which the website exists doesn’t actually have any functioning code, yet.2

(That gorgeous logo? Designed by the absolutely brilliant Cameron Morgan.)

So why is it live? There are a few reasons. First, the whole point of the project is my having fun with programming. Web design may not be directly involved in finishing Step Stool, but it definitely scratches the fun itch just as effectively. Second, and more importantly, it’s allowed me to get some time in with other technologies I’ve been interested in for a while – specifically, SASS. I’ve been frustrated with all the problems SASS solves in CSS for a while now, and I’ve known I wanted to just go ahead and learn it for a while. I just haven’t had a project where it made sense – until now. Read on, intrepid explorer →

# Current plans

Things currently on my near-term radar, web development-wise:

• Finish up a web site that I’m doing pro bono for a friend. This one’s been on the back burner since we moved, but I’d like to actually get it knocked out at the beginning of the summer. Read on, intrepid explorer →

[C]ompared to Java code, XML is agile and flexible. Compared to Python code, XML is a boat anchor, a ball and chain.

# Introducing: JIRA Commit Acceptance Plugin Tweaks (Python, Batch)

One of the tasks I set for myself with JIRA at Quest was to configure it and Subversion so that people can't check in with referencing a JIRA Issue. This is a little thing, but it helps ensure people will actually use the issue tracker, instead of letting it languish. Add good version and source control policies - the kind you implement on the server, not just the kind you tell people to use (because we all know how well that works out) - and you have a solution that helps people use both version control and the issue tracker sensibly. Read on, intrepid explorer →