Published during: August 2013

The Great Deglobalization

For nearly all of August, I have been helping a colleague work on a massive code refactoring project. The company I am consulting for relies on a decades-old (though quite robust) piece of engineering modeling software. The software was written in FORTRAN 77 in the mid-to-late 1980s, as was common for such packages at the time. It is apparent that it was not written by competent software developers, however: the entire code-base smacks of fastest-way-to-get-it-to-work thinking. All of the code (all 4,000+ lines of it) came in a single file. Every variable in the program is global.[1]

Yes, you read that right. Every variable in the program is global.

Since no one at the company is an expert in Fortran at this point, even in a modern dialect like Fortran 95,[2] we have been translating the whole thing to Python in order to start re-working pieces of it. (We’re using Python because that’s the company’s standard; we’re reworking this piece of code for reasons long and uninteresting.) The formal translation step was completed in mid-July; since then we have been working on refactoring the project so that we can actually change things without breaking everything.

The trick, of course, is all those global variables. In order to be able to begin making any other changes, we have to eliminate the global state. Otherwise, a change in one module can (and often does) have unexpected effects elsewhere.
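A minimal sketch of that hazard, using hypothetical names (`pressure_`, `update_pressure`, `report`) that are not taken from the actual code base. The first pair of functions communicates through a module-level global; the second pair makes the same data flow explicit through arguments and return values.

```python
# Hypothetical illustration; none of these names come from the real project.
pressure_ = 100.0  # module-level global state


def update_pressure_global():
    """Rewrites the global; nothing at the call site reveals this."""
    global pressure_
    pressure_ = pressure_ * 2


def report_global():
    """Depends on whatever the global happens to hold right now."""
    return "pressure = %.1f" % pressure_


def update_pressure(pressure):
    """Explicit version: input arrives as an argument, result leaves as a return value."""
    return pressure * 2


def report(pressure):
    """Explicit version: formats only the value it is handed."""
    return "pressure = %.1f" % pressure


# Global version: what report_global() returns depends on the call history.
before = report_global()
update_pressure_global()
after = report_global()

# Explicit version: everything a call depends on is visible where it is made.
explicit = report(update_pressure(100.0))
```

In the global version, two identical calls to `report_global()` give different answers depending on what ran in between, which is the kind of action at a distance that makes changes risky.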

Our strategy has been simple: take things in small steps, albeit often very time-consuming small steps. In each of those steps we have aimed to make a single, incremental change that makes things more explicit and safer. Often, this has simply meant passing and returning inordinate numbers of variables back and forth to functions, simply so that the behavior is explicit (not to say clear). At other times, this has meant doing very bad things that would be intolerable under any other circumstances, but were less bad than what we had before and allowed us to step forward toward our ultimate goals. The prime example (and one of which I’m fairly proud, though I hope never to have to use it again) is leveraging Python’s property decorators to allow us to begin using struct-like objects for passing around large amounts of data before all the global state involved in said structs was eliminated.

Basically, we created a stopgap wrapper that made struct access result in changing global state. That way, functions could receive the struct and act on it, with the global state invisible to them: from any function’s perspective, it is performing perfectly normal access/mutation operations on the class fields. Behind the scenes, we manipulated global state, until we were able to eliminate global dependencies for that variable entirely, at which point we replaced the quirky behavior with standard field initialization.

Here is the awful pattern my colleague and I used as part of that transition, followed by some explanation. (Do not attempt at home!)

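class BasicallyAStruct():
    def __init__(self, some_field):
        # Initialize the global that backs this field.
        global some_field_
        some_field_ = some_field

    @property
    def some_field(self):
        # Getter: read straight from the global.
        return some_field_

    @some_field.setter
    def some_field(self, value):
        # Setter: write straight to the global.
        global some_field_
        some_field_ = value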

Here’s how it works. For starters, we renamed every global variable to end in an underscore (like some_field_) to make it easy to distinguish between global and local variables. Then, we created a class that is basically just a struct to hold data. (It can of course be expanded to be a full-up class with methods later if that makes sense.) In the constructor, we declare every global variable that the struct needs to reference, and then assign it the value passed to the constructor. Then we use Python’s property decorators to specify the access and mutation behavior: instead of storing the value in an instance attribute like normal, calling the getter or setter[3] actually returns or sets the value in the global variable. The result:

some_field_ = "old value"  # the pre-existing global, set to some old value
my_struct = BasicallyAStruct(some_field_)  # create the struct

# Set
new_value = "new value"
my_struct.some_field = new_value  # assign just like normal
print(some_field_)  # prints: new value

# Get
print(new_value == my_struct.some_field)  # True
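For a struct with many fields, the same trick could presumably be generated rather than written out by hand. The following is only a sketch under that assumption: `global_backed`, `TwoFieldStruct`, `width_`, and `height_` are hypothetical names, and the helper relies on the built-in `property()` and `globals()` functions rather than on anything specific to the project.

```python
def global_backed(name):
    """Build a property whose value lives in the module-level global `name`.

    Hypothetical helper for illustration; `name` follows the trailing
    underscore convention for globals described above.
    """
    def _get(self):
        return globals()[name]

    def _set(self, value):
        globals()[name] = value

    return property(_get, _set)


class TwoFieldStruct(object):
    # Each attribute is a thin window onto one global variable.
    width = global_backed("width_")
    height = global_backed("height_")

    def __init__(self, width, height):
        # These assignments go through the property setters,
        # so they create or overwrite the globals.
        self.width = width
        self.height = height


box = TwoFieldStruct(3, 4)
box.width = 10                  # writes the global width_
area = box.width * box.height   # reads both globals back
```

The trade-off is the same as in the hand-written version: callers see ordinary attribute access, while the storage remains global until each property is replaced by a plain attribute.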

Once we get to a point where we’ve completely eliminated the global state, we change the class definition to look like a normal class definition, completely removing the global declaration and the @property and @some_field.setter decorators:

class BasicallyAStruct():
    def __init__(self, some_field):
        self.some_field = some_field

From the perspective of the functions using the struct, nothing has changed; they are already using standard attribute access notation.
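To make that concrete, here is a small sketch in which one caller runs unmodified against both versions of the class. The names `TransitionalStruct`, `FinalStruct`, and `double_field` are hypothetical and exist only for this illustration.

```python
class TransitionalStruct(object):
    """Stopgap version: the field is backed by a module-level global."""

    def __init__(self, some_field):
        global some_field_
        some_field_ = some_field

    @property
    def some_field(self):
        return some_field_

    @some_field.setter
    def some_field(self, value):
        global some_field_
        some_field_ = value


class FinalStruct(object):
    """Cleaned-up version: the field is an ordinary instance attribute."""

    def __init__(self, some_field):
        self.some_field = some_field


def double_field(struct):
    """A caller that uses nothing but plain attribute access."""
    struct.some_field = struct.some_field * 2
    return struct.some_field


# The same caller works against either class without modification.
transitional_result = double_field(TransitionalStruct(21))
final_result = double_field(FinalStruct(21))
```

Because `double_field` never mentions the global, swapping one class for the other requires no change on the calling side.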

It has worked like a charm, and it demonstrates the power of decorators in a bit of an unusual situation. It is, of course, an awful use of decorators, to the extent that I would call it an abuse in general. If I ever found this in “final” code I would probably make a horrible noise in outrage; it’s a stupid thing to do. It is, however, less stupid than keeping everything global, and it made for a good intermediate solution that allowed us to minimize changes at each step along the way and therefore minimize the potential number of places anything could break as we refactored.[4]

I’m happy to say that almost all of those global variables are gone, and the classes are all looking pretty much like normal classes now. And the calling functions never even noticed.

I’m incredibly happy with how that came out—and I hope never to do anything like it again.

  1. Experienced FORTRAN programmers will recognize the pattern: all the variables are declared in a common block at the top of every single function, except for a couple subroutines. 
  2. A fun piece of trivia: my first software development project was all Fortran 95. Fortran was what the physics professor who helped a bunch of students get their projects off the ground knew, so Fortran is where I started. A bit strangely, that background has ended up being valuable to two of my three employers so far. 
  3. Behind the scenes, Python routes attribute access on an instance through any property defined on its class, calling that property’s getter or setter method, which is why you can override them as we are here. It’s a nifty bit of metaprogramming capability the language gives you. 
  4. This is actually a pretty good example of the principle of information hiding put to a non-standard use. 

Stop Being a Product

Or, Reflections Prompted by the One-Year Anniversary of

Yesterday, turned a year old. Most of my friends in “real life” have never even heard of the social network/platform,[1] so the anniversary rightly went by unnoticed and unheralded. For me, however, the day that went live last year was the start of a sea change in the way I approach the software I use, a sea change that is still in progress in many ways. What changed? I decided I wanted to stop being a product and start being a customer.

It’s worth reading the original proposal for a social networking service, for which at least some users pay. This was incredibly audacious for a social networking backbone (though, as founder Dalton Caldwell noted, not completely unheard of in internet services: GitHub, Dropbox, and a number of others offer the same model). Twitter was making it clear that they were following Google and Facebook’s models of selling customer info to advertisers, leaving developers and users of the ecosystem to the whims of whatever made the most sense for sales teams. Facebook’s monetization plans were becoming increasingly annoying (ads in my News Feed? No, that really isn’t why I’m here…). Google’s data-mining was well known but increasingly starting to bother me. All of these had a common thread: they were free to use, on the premise that we don’t mind being advertised to constantly and our data being sold to and analyzed by advertising companies. More and more, though, I did mind. The idea that I would pay for access to a social networking backbone would have seemed crazy at some point; no more.

The web was born and bred on free. In many ways, that has been a good thing. The democratizing effects of free access are well-documented, and people’s ability to publish and read such a wide array of information has been a great boon to the causes of education and liberty. On the other hand, those trends have engendered an expectation that everything digital ought to be free, and this expectation has had a great many deleterious effects as well. We have seen mass piracy of music and video, cutthroat rates (often consisting of “exposure”) for writing, and a flood of terrible content that always threatens to overwhelm the good content available. On balance, I think the web has been a good thing—but it has not been an unalloyed good, and in the last few years an increasing number of people have become dissatisfied with the status quo.

People deserve to be paid for their work, and at some point we have to figure out how to make that happen. Ads have proven insufficient to generate the revenue needed for sustainable business, but they have also incentivized content-providers to aim at generating the most hits, rather than providing the best content. This is not a new problem; the same issue drove a lot of the “yellow journalism” of the last century. The difference is that, coupled with the new, massive data and the disconnecting nature of the web (you don’t read a single “paper” anymore, you read a bunch of articles from various places all over the web), advertisers have had both the means and the motive to pursue practices that are increasingly proven corrosive to privacy and, in the long run, contrary to the best interests of the public.

The way out—and the way that an increasing number of voices have embraced, including those behind—is to go back to paying for things. (Shocking, I know!) With, Dropbox, GitHub, and many other web services, users pay for access to premium functionality. does not sell my data to anyone; they sell the engineering backbone as a service to me, as well as to developers who want to use that backbone to create their own services. Likewise, Dropbox doesn’t sell my data, they sell storage. GitHub doesn’t sell my data, they sell version control hosting.

I love this idea. So much so that I’m increasingly pulling out of ad-supported services where I’m the product instead of the customer, and moving toward things I pay for. It makes sure the service’s incentives are aligned with my best interests as a user, rather than orthogonally and possibly detrimentally to my desires. Not least, it makes sure the people building the service I’m using get paid. Those are good things. It costs more, to be sure, but that helps me think through what to use (and what to skip) more carefully, and that’s a good thing, too.[2]

Now, if only someone could figure out a good way to do the same thing with music, writing, and video…

Oh, and It’s the single best social network I’ve ever used—the quality of conversations I’ve had there is far, far better than that I’ve had anywhere else. You should check it out.

  1. is, strictly speaking, server infrastructure and an API—that is, a set of software tools that people can use to build services and the servers to run those services. It allows everything from Twitter-like conversation streams to personal journaling apps to chat rooms or clients to file storage to photo sharing. It is not just a social network; it is a way to build social networks. Curious? You can join for free, and only upgrade to a paid account if it makes sense. 
  2. Most of these services have free tiers, but—importantly—those free tiers are subsidized by the paying customers, not by ad sales.