Havoc's Blog

this blog contains blog posts

What Matters In Software Development

Lots of traffic on Twitter about Steve Yegge’s post defining a “software ideology” spectrum. Myles Recny made a survey to help you place yourself along said spectrum.

Thinking about it over the weekend, I can’t identify with this framing of software development. Those survey questions don’t seem to cover what I think about most in my own software work; in fact I’d be a little worried if I saw colleagues focused on some of these.

Here’s what I do think about, that could be called an ideology.

Ideological View 1: Risk = Getting Worse On Average

Whether we’re talking about dieting, finance, or software, flows matter more than stocks.

The risk I worry about is: are you adding bugs faster than you’re fixing them? Is your technical debt going up? Is this code getting worse, on average?

If the average direction is “worse” then sooner or later your code will be an incomprehensible, hopeless disaster that nobody will want to touch. The risk is descent into unmaintainable chaos where everyone involved hates their life and the software stops improving. I’ve been there on the death march.

Bugs vs. Features: Contextual Question!

In Steve’s post, he says conservatives are focused on safety (bugs in production) while liberals are focused on features. I don’t have an ideological view on “bugs in production risk”; it’s contextual.

For example: Red Hat maintains both Fedora and Enterprise Linux, two branches of the same software project with mostly the same team but with distinct “bugs in production” risk profiles and distinct processes to match. Red Hat uses the same code and the same people to support different tradeoffs in different contexts. Maybe they’re a post-partisan company?

If I were working on software for the Mars rover, I’d strenuously object to continuous deployment. (Maybe we should test that software update before we push it to Mars?) If I were working on I Can Has Cheezburger, bugs in production wouldn’t bother me much, so I’d be happy to keep the process lightweight.

But in both cases I don’t want to see the code getting worse on average, because in both cases I’d want to keep that code alive over a period of years. That’s the ideology that stays constant.

A project that’s getting worse on average will achieve neither safety nor features. A healthy project might have both (though not in the same release stream).

How to Avoid Getting Worse

To avoid the risk of steadily getting worse, a couple of issues come up every time.

Ideological View 2: Clarity and Simplicity Are Good

Can the team understand it?

This is relative to the team. If your team doesn’t know language XYZ you can’t write code in that language. If your API is intended for mainstream, general programmers, it can’t be full of niche jargon. If your team doesn’t speak German you can’t write your comments in German. Etc.

Software developers learn to make judgment calls about complexity and over- vs. under-engineering. These calls are countless, contextual, and always about tradeoffs. Experience matters.

A definition of “competent software developer” surely includes:

  • they worry about complexity and can make judgments about when it’s worth it
  • they can write both prose and code such that the rest of the team understands it

Not all teams have the same complexity capacity, but they all have some limit, and good ones use it wisely.

Ideological View 3: Process: Have One

I’ve seen many different methodologies and processes work. They optimize for different team skills, or different levels of “bugs in production” risk. My belief is that you need some method to your madness; something other than free-for-all. Examples:

  • Good unit test coverage with mandatory coverage for new code.
  • OR hardass code review. (Hardass = reviewer spends a lot of time and most patches get heavily revised at least once. Most reviews will not be “looks good to me.”)
  • OR just one developer on a codebase small enough to keep in one head.
  • OR Joel’s approach.

You don’t need all of those, but you need at least one thing like that. There has to be some daily habit or structural setup that fights entropy, no matter how smart the individual team members are.

Companies may have rule-based or trust-based cultures, and pick different processes. Lots of different approaches can work.

Summary

Ideological lines in the sand framing my thinking about software development:

  • Risk = the project becomes intractable.
  • Prerequisite to avoid this risk: you have to be understandable and understood.
  • Process to avoid this risk: have one and stick to it.

If you can write clear, maintainable code, and keep it that way, using your OS, text editor, dynamic language, static language, XML-configured framework, agile process, or whatever, then I’m open to your approach.

If you’re creating complexity that doesn’t pay its way, writing code that makes no sense to the rest of the team, or working without any process, then I’m against it.

“How many bugs in production are OK,” “static vs. dynamic languages,” “do we need a spec for this,” “do we need a schema here”, “what do I name this function”: these are pragmatic, context-dependent issues. I like to consider them case-by-case.

Postscript: Me me me

A lot of these example “liberal/conservative” statements feel ego-driven. I’d look bad if we shipped a bug, I’m smart and can learn stuff, I never write slow code, I always write small code, blah blah.

It’s not about you.

When you agree or disagree with “programmers are only newbies for a little while,” are you thinking of software creation as an IQ test for developers? The goal is neither to “dumb down” the code nor to prove that, for you, it doesn’t need to be.

Let me suggest a better framing: is this complexity worth it (in the context of our customers and our team)? If we’re trying to maximize how useful our software can be given the level of complexity our team can cope with, should we spend our brain cycles in this corner of the code or some other corner?

When you agree or disagree with “software should aim to be bug-free before it launches,” do you have the same opinion about both the Mars lander and icanhascheezburger? If you do, you might need to refocus on the outside world our software’s supposed to be serving.

Better framing: it has to work.

You get the point I guess…

This article has been translated into Serbo-Croatian by Jovana Milutinovich of Webhostinggeeks.com.

Developers: request for complaints

I’m looking for a new personal weekend project that would be useful to others. Maybe there’s a useful book or useful piece of software I could create for fellow developers, tech leads, project managers, etc. Doesn’t have to be anything related to any of my current or past work (Linux, C, GTK+, Scala, etc.), but it could be.

Here’s how you can help: complain in the comments (or in private email or on Twitter) about an aggravation, anxiety, ignorance, or unsolved problem that you encounter as part of your work. Where are you wasting time or being kept up at night?

“+1” comments are awesome: if something annoys a bunch of people, it’s that much more worthwhile to try to solve.

This is once-in-a-lifetime: I’m going to appreciate hearing you complain, and also appreciate “+1” comments! Nobody else is going to do that for you.

I’m not asking for project ideas directly (though if you have them, great); I’m just looking for crap you put up with that you wish someone would do something about, problems you had that you couldn’t find a solution to, topics you wanted to learn about that appeared to be undocumented, whatever.

No need to be original; obvious problems are great. As long as it’s still an unsolved problem for you, I don’t care how obvious it is or how many people have already tried to solve it.

No need to have a solution either; some of the most important problems are hard to solve. I’m just wondering what’s at the top of your “sure wish someone would figure out this problem” list.

Some examples to get you started; if these resonate you could “+1” them, but coming up with more is just as good:

  • Wish you knew more about <some topic>? Tell me which and maybe I could research it for everyone and report back.
  • Anything about your bug tracking, sprint planning, etc. that is tedious, ineffective, or broken?
  • Baffled by how to handle trademarks, copyrights, and patents in your projects?
  • Unhappy with how your team communicates?
  • Are there any software tools you’ve looked for only to say “all the choices suck”?
  • Wish you could write your own window manager, if only you had esoteric and rapidly-obsoleting X11 skills? (j/k … right?)

I don’t promise to solve everyone’s problems, but maybe I can solve one real-world actual problem and that would be cool.

Who knows — if we gather enough data someone other than me might run with some of these problems too. Or your fellow commenters might point you to a good existing solution. So let us know what you’re having trouble with.

Thanks!

New blog hosting

I’m trying out WPEngine instead of self-managed EC2 since my self-managed uptime stats were pretty bad.

Let me know if anything about the blog is more broken than it was before.

Desktop Task Switching Could Be Improved

In honor of GUADEC 2012, a post about desktop UI. (On Linux, though I think some of these points could apply to Windows and OS X.)

When I’m working, I have to stop and think when I flip between two tabs or windows. If I don’t stop and think, I flip to the wrong destination a high percentage of the time. I see this clunkiness every minute or two.

For me to do the most common action (flip between documents/terminals/websites) I may need to use my workspace switch hotkey (Alt+number), app switch (Alt+`), window switch (Alt+Tab), tab switch (Alt+PgUp, Alt+PgDn, C-x b), or possibly a sequence of these (like change workspace then change window, or change window then change tab).

I believe it could be reduced to ONE key which always works.

The key means “back to what I was doing last” and it works whether you were last on a tab, a window, or another workspace. There’s a big drop-off in goodness between:

  • one key that always works
  • two keys to choose from

Once you have two, you have the potential to get it wrong and you have to slow down to think.

Adding more than two (such as the current half-dozen, including sequences) makes it worse. But the big cliff is from one to two.
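To make that concrete, here’s a rough sketch of the behavior I have in mind, treating every tab, window, and workspace as one flat list of “screens” with a single focus history. The names and types below are invented purely for illustration, not a proposal for any particular toolkit.

```scala
// Illustrative sketch only: model every tab/window/workspace as a flat
// list of "screens" and keep one focus history, so a single key ("flip")
// always means "back to whatever I was looking at last."
object FlipSketch {
  case class Screen(title: String) // stands in for a tab, window, or workspace

  class Switcher {
    private var history: List[Screen] = Nil // most recently focused first

    def focus(s: Screen): Unit =
      history = s :: history.filterNot(_ == s)

    // The single "back to what I was doing" key.
    def flip(): Option[Screen] = history match {
      case current :: previous :: rest =>
        history = previous :: current :: rest
        Some(previous)
      case _ => None
    }
  }

  def main(args: Array[String]): Unit = {
    val s = new Switcher
    s.focus(Screen("editor"))
    s.focus(Screen("browser"))
    println(s.flip()) // Some(Screen(editor))
    println(s.flip()) // Some(Screen(browser))
  }
}
```

The point is simply that one toggle over one global history can’t send you to the wrong kind of destination, because there’s only one kind.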

User model vs. implementation model

Can’t speak for others, but I have roughly two layers of hierarchy in my head:

  • A project: some real-world task like “file expense report” or “write blog post” or “develop feature xyz”
  • A screen: a window/tab/buffer within the project, representing some document I need to refer to or document I’m creating

The most common action for me is to switch windows/tabs/buffers within a project, for example between the document I’m copying from and the one I’m pasting to, or the docs I’m referring to and the code I’m writing, or whatever it is.

The second most common action for me is to move among projects or start a new project.

Desktop environments give me all sorts of hierarchy unrelated to the model in my head:

  • Workspace
  • Application
  • Window
  • Tab (including idiosyncratic “tabs” like Emacs buffers)
  • Monitor (multihead)

None of these correspond to “projects” or “screens.” You can kind of build a “projects” concept from these building blocks, but I’m not sure the desktop is helping me do so. There’s no way to get a unified view of “screens.”

I don’t know what model other people have in their head, but I doubt it’s as complex as the one the desktop implements.

Not a new problem

I’m using GNOME 3 on Fedora 17 today, but this is a long-standing issue. Back when I was working on Metacity for GNOME 2, we tried to get somewhere on this, but we accepted the existing setup as a constraint (apps, windows, workspaces, etc.) and therefore failed. At litl we spent a long time wrestling with the answer and found something pretty good though perhaps not directly applicable to a regular desktop. I wish I had a good video or link to show for litl’s solution (essentially a zoomable grid of maximized windows, but lots of details matter).

iPhone has simplified things here as well. They combine windows and applications into one. But part of the simplification on iPhone is that it’s difficult to do things that involve more than one “screen” at a time. On a desktop, it wouldn’t be OK to make that difficult.

In GNOME 3, I also use the Windows key to open the overview and pick a window by thumbnail. Some issues with this:

  • It does not include tabs, only windows.
  • In practice, I have to scan all the thumbnails every time to find the one I want.

These were addressed in the litl design:

  • Tabs and windows were the same thing.
  • Windows remained in a stable, predictable location in the overview.
  • The overview was spatially related to the window, that is you were actually zooming in and out, which meant during the animation you got an indication of where you were.
  • I believe you could even click on a window before the zoom in/out animation was complete, though I could be wrong. In any case you could be moving toward it while it was coming onto the screen.

As a result, the litl design was much faster for task switching via overview key plus mouse. If you were repeatedly flipping between two tasks, you could memorize their location in space and find them quickly based on that. If other windows were opened and closed, the remaining ones might slide over, but they’d never reshuffle entirely.

I think GNOME tries to “shrink the windows in their current location” rather than “zoom out”, so it’s trying to have a spatial relationship. A problem is that I have everything maximized (or halfscreen-maximized). “Shrink to current location” ends up as “appears random” when windows don’t have any meaningful relationships on the x/y axes (they’re just in a z-axis stack). (Direction for thought: is there some way maximized windows could be presented as adjacent rather than stacked?)

Overall I vastly prefer Fedora 17 to my previous GNOME 2 setup and I think it’s a step on the path to cleaning this up for good. In the short term, a couple things seem to make the problem worse:

  • The “application” layer of hierarchy (Alt+Tab vs. Alt+`) adds one more way to switch “screens,” though for me this just made an existing problem slightly worse (the bulk of the problem is longstanding and we were already far from one key).
  • The window list on the panel had a fixed order and was always onscreen, so it was faster than the thumbnail overview. I believe the thumbnail overview approach could be fixed; on the litl, for me zoom-out-to-thumbnails was as fast as the window list. The old window list was an ugly kluge (it creates an abstraction/indirection where you have to match up two objects, a button and a window — direct manipulation would be so much better). But its fixed spatial layout made it fast.

GNOME 3 opens the door to improving matters; GNOME 2’s technology (e.g. without animation and compositing) made it hard to implement ideas that might help. GNOME 3 directions like encouraging maximized apps, automatic workspace management, the overview button, etc. may be on the path to the solution.

Can it be improved?

I’ll limit this post to framing the problem and hinting at a couple of directions. I don’t know the right design answer. I’m definitely going to omit speculation on how to implement (for example, getting tabs into the rotation would be possible, but require some implementation heroics).

I know everything is the way it is now for good historical reasons, valid technical and practical constraints, and so on. But I bet there’s a way to get past those with enough effort.

ACA constitutionality doesn’t hinge on what you call it

TL;DR I got it right, you may now send me your offers for lucrative legal consulting work. Note: I am not a lawyer.

I’ve had a little series of posts, first and second, arguing that the tax code already punishes you for failure to purchase health insurance and health care. (Because any tax credit can be framed as an equivalent tax increase + tax penalty.) Thus, the individual mandate should be constitutional in the same way that existing credits are, because practically and economically speaking, it’s the same thing as many credits already in the tax code, including health insurance and health care credits.

I’ve only read the syllabus of the Supreme Court decision so far, but it looks like John Roberts bought the argument that something can’t be unconstitutional just because it’s named the wrong thing:

4. CHIEF JUSTICE ROBERTS delivered the opinion of the Court with respect to Part III–C, concluding that the individual mandate may be upheld as within Congress’s power under the Taxing Clause. Pp. 33–44.

(a) The Affordable Care Act describes the “[s]hared responsibility payment” as a “penalty,” not a “tax.” That label is fatal to the application of the Anti-Injunction Act. It does not, however, control whether an exaction is within Congress’s power to tax. In answering that constitutional question, this Court follows a functional approach, “[d]isregarding the designation of the exaction, and viewing its substance and application.” United States v. Constantine, 296 U. S. 287, 294. Pp. 33–35.

(b) Such an analysis suggests that the shared responsibility payment may for constitutional purposes be considered a tax. The payment is not so high that there is really no choice but to buy health insurance; the payment is not limited to willful violations, as penalties for unlawful acts often are; and the payment is collected solely by the IRS through the normal means of taxation. Cf. Bailey v. Drexel Furniture Co., 259 U. S. 20, 36–37. None of this is to say that payment is not intended to induce the purchase of health insurance. But the mandate need not be read to declare that failing to do so is unlawful. Neither the Affordable Care Act nor any other law attaches negative legal consequences to not buying health insurance, beyond requiring a payment to the IRS. And Congress’s choice of language—stating that individuals “shall” obtain insurance or pay a “penalty”—does not require reading §5000A as punishing unlawful conduct. It may also be read as imposing a tax on those who go without insurance. See New York v. United States, 505 U. S. 144, 169–174. Pp. 35–40.

(c) Even if the mandate may reasonably be characterized as a tax, it must still comply with the Direct Tax Clause, which provides: “No Capitation, or other direct, Tax shall be laid, unless in Proportion to the Census or Enumeration herein before directed to be taken.” Art. I, §9, cl. 4. A tax on going without health insurance is not like a capitation or other direct tax under this Court’s precedents. It therefore need not be apportioned so that each State pays in proportion to its population. Pp. 40–41.

On a more serious note, this law will have huge positive consequences for my family, and I’m grateful that it held up in court. I was going to be particularly upset to suffer giant practical problems in my own life just because someone failed to open their search-and-replace function in a word processor and change “penalty” to “tax.” I’m very happy we weren’t screwed on that technicality.

While I haven’t read the whole decision yet, it looks like those looking for limitations on federal power will be happy with the discussion of commerce powers and the precedents established in that area.

The best answer requires some aggravation

Once you think you have a good answer to an important problem, it’s time to drive everyone crazy looking for an even better answer.

Here’s a scenario I’ve been through more times than I can count:

  • I thought I had a pretty good approach, or didn’t think anything better was possible, and wasn’t looking to spend more time on the problem.
  • Someone had the passion to keep pushing, and we either stayed in the room or kept the email thread going beyond a “reasonable” amount of effort.
  • We came up with a much better approach, often reframing the problem to eliminate the tradeoff we were arguing about at first.

Steve Jobs was legendarily cruel about pushing for more. But in my experience good results come from more mundane aggravation; there’s no need to make people cry, but there probably is a need to make them annoyed. Annoyed about spending three extra hours in the meeting room, annoyed about the length of the email thread, annoyed about compromising their artistic vision… if the human mind thinks it already has an answer, it will fight hard not to look for a new answer.

That might be the key: people have to be in so much pain from the long meeting or thread or harsh debate or Jobsian tongue-lashing that they’re willing to explore new ideas and even commit to one.

It shows just how much we hate to change our mind. I often need to be well past dinnertime or half a novel into an email thread before my brain gives up: “I’ll set aside my answer and look for a new one, because that’s the fastest way out of here.”

The feeling that you know the answer already is a misleading feeling, not a fact.

Some people use brainstorming rules, like the improv-inspired “yes, and…” rule, trying to separate generative thinking from critical thinking. First find and explore lots of alternatives, then separately critique them and select one. Avoid sticking on an answer prematurely (before there’s been enough effort generating options). Taking someone else’s idea and saying “I like this part, what about this twist…” can be great mental exercise.

To know you’ve truly found the best decision possible, your team might need to get fed up twice:

  • Brainstorm: stay in the room finding more ideas, long after everyone thinks they’re tapped out.
  • Decide: stay in the room debating, refining, and arguing until everyone thinks a decision should have been made hours ago.

A feeling of harmony or efficiency probably means you’re making a boring, routine decision. Which is fine, for routine stuff. But if you have an important decision to make, work on it until the whole team wants to kill each other. Grinding out a great decision will feel emotional, difficult, and time-consuming.

Binding an implicit to a Scala instance

In several real-world cases I’ve had a pair of types like this:
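(A simplified sketch of the shape I mean; the names and method signatures here are made up, not the real code.)

```scala
// Made-up names, purely to show the shape: a class whose methods all
// require the same implicit parameter...
trait CacheContext

class Cache {
  def get(key: String)(implicit ctx: CacheContext): Option[String] = None
  def put(key: String, value: String)(implicit ctx: CacheContext): Unit = ()
}

// ...and the "bound" variant I end up writing by hand, which captures the
// context once and drops the implicit from every signature.
class BoundCache(cache: Cache, context: CacheContext) {
  def get(key: String): Option[String] = cache.get(key)(context)
  def put(key: String, value: String): Unit = cache.put(key, value)(context)
}
```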

An implicit often leaves a policy decision undecided. At some layer of your code, though, you want to make the decision and stick to it.

Passing around a tuple of an object plus the implicit needed to invoke its methods can be awkward. If you want to work with two such tuples in the same scope, you can’t import both of their associated implicits (they’d be ambiguous), so it’s verbose too.

It would be nice to have bind[A,I](a: A, asImplicit: I), where bind(cache, cacheContext) would return the equivalent of BoundCache.

I guess this could be done with macro types someday, but probably not with the macro support in Scala 2.10.

If implemented in the language itself, it’s possible BoundCache wouldn’t need any Java-visible methods (the delegations to Cache could be generated inline).

However, one use of “bound” classes could be to adapt Scala APIs to Java. In Java you could bind one time, instead of explicitly passing implicit parameters all over the place.

Has anyone else run into this?

practice and belief

This NYTimes blog post scrolled past the other day, a discussion of an article by John Gray. John Gray has this to say:

The idea that religions are essentially creeds, lists of propositions that you have to accept, doesn’t come from religion. It’s an inheritance from Greek philosophy, which shaped much of Western Christianity and led to practitioners trying to defend their way of life as an expression of what they believe.

The most common threads of religion, science, and philosophy I learned about in school shared this frame; their primary focus was accurate descriptions of outside reality. Which is fine and useful, but perhaps not everything. In some very tiresome debates (atheism vs. religion, “Truth” vs. “relativism strawman”), both sides share the assumption that what matters most is finding a set of words that best describe the world.

There is at least one alternative, which is to also ask “what should we practice?” not only “what should we believe?”

If you’re interested in this topic, I’ve stumbled on several traditions that have something to say about it so I thought I’d make a list:

  1. Pragmatist philosophy, for example this book is a collection of readings I enjoyed, or see Pragmatism on Wikipedia.
  2. Unitarian Universalism, which borrows much of the format and practice of a Protestant church but leaves the beliefs up to the individual. I’ve often heard people say that their belief is what matters but they don’t like organized religion; UU is the reverse of that. (Not that UU is against having beliefs, it just doesn’t define its membership as the set of people who agree on X, Y, and Z. It is a community of shared practice rather than shared belief.)
  3. Behavioral economics and psychology. For example, they have piled on the evidence that one’s beliefs might flow from one’s actions (not the other way around), and in general made clear that knowing facts does not translate straightforwardly into behavior.
  4. Buddhism, not something I know a lot about, but as explained by Thich Nhat Hanh for example in The Heart of the Buddha’s Teaching. Themes include the limitations of language as a way to describe reality, and what modern bloggers might call “mind hacks” (practical ways to convince the human body and mind to work better).

A few thoughts on open projects, with mention of Scala

Most of my career has been in commercial companies related to open source. I learned to code just out of college at a financial company using Linux. I was at Red Hat from just before the IPO when we were selling T-shirts as a business model, until just before the company joined the S&P500. I worked at litl making a Linux-based device, and now I’m doing odd jobs at Typesafe.

I’ve seen a lot of open source project evolution and I’ve seen a lot of open-source-related commercial companies come and go.

Here are some observations that seem relevant to the Scala world.

Open source vs. open project

In some cases, one company “is” the project; all the important contributors are from the company, and they don’t let outsiders contribute effectively. There are lots of ways to block outsiders: closed infrastructure such as bug trackers, key decisions made in private meetings, taking forever to accept patches, lagged source releases, whatever. This is “one-way-ware,” it’s open source but not an open project.

In other cases, a project is bigger than the company. As noted in this article, Red Hat has a mission statement “To be the catalyst in communities of partners and customers and contributors building better technology the open-source way” where the key word is “catalyst.” And when Linux Weekly News runs their periodic analysis of Linux kernel contributions, Red Hat is a large but not predominant contributor. This is despite a freaking army of kernel developers at Red Hat. Red Hat has many hundreds of developers, while most open source startups probably have a dozen or two.

In a really successful project, any one company will be doing only a fraction of the work, and this will remain true even for a billion-dollar company. As a project grows, an associated company will grow too; but other companies will appear, more hobbyists and customers will also contribute, etc.  The project will remain larger than any one company.

(In projects I’ve been a part of, this has gone in “waves”; sometimes a company will hire a bunch of the contributors and become more dominant for a time, but this rarely lasts, because new contributors are always appearing.)

Project direction and priorities

Commercial companies will tend to do a somewhat random grab-bag of idiosyncratic paying-customer-driven tasks, plus maybe some strategic projects here and there. The nature of open projects is that most work is pretty grab-bag; because it’s a bunch of people scratching their own itches, or hiring others to scratch a certain itch.

In the Scala community for example, some work is coming from the researchers at EPFL, and (as I understand it) their itch is to write a paper or thesis.  Given dictatorial powers over Scala, one could say “we don’t want any of that work” but one could never say “EPFL people will work on fixing bugs” because they have to do something suitable for publication. Similarly, if you’re building an app on Scala, maybe you are willing to work on a patch to fix some scalability issue you are encountering, but you’re unlikely to stop and work on bugs you aren’t experiencing, or on a new language feature.

An open project and its community are the sum of individual people doing what they care about. It’s flat-out wrong to think that any healthy open project is a pool of developers who can be assigned priorities that “make sense” globally. There’s no product manager. The community priorities are simply the union of all community-member priorities.

It’s true that contributors can band together, sometimes forming a company, and help push things in a certain direction. But it’s more like these bands of contributors are rowing harder on one side of the boat; they aren’t keeping the other side of the boat from rowing, or forcing people on the other side of the boat to change sides.

Commercial diversity

My experience is that most “heavy lifting” and perhaps the bulk of the work overall in big open projects tends to come  from commercial interests; partly people using the technology who send in patches, partly companies that do support or consulting around the technology, and partly companies that have some strategic need for the technology (for example Intel needs Linux to run on its hardware).

There’s generally a fair bit of research activity, student activity, and hobbyist activity as well, but commercial activity is a big part of what gets done.

However, the commercial activity tends to be from a variety of commercial entities, not from just one. There are several major “Linux companies,” then all the companies that use Linux in some way (from IBM to Google to Wall Street), not to mention all the small consulting shops. This isn’t unique to Linux. I’ve also been heavily involved in the GNOME Project, where the commercial landscape has changed a lot over the years, but it’s always been a multi-company landscape.

The Scala community will be diverse as long as it’s growing

With the above in mind, here’s a personal observation, as a recent member of the Scala community: some people have the wrong idea about how the community is likely to play out.

I’ve seen a number of comments that pretty much assume that anything that happens in the Scala world is going to come from Typesafe, or that Typesafe can set community priorities, etc.

From what I can tell, this is currently untrue; there are a lot more contributors in the ecosystem, both individuals and companies. And in my opinion, it’s likely to remain untrue. If the technology is successful, there will be a never-ending stream of new contributors, including researchers, hobbyists, companies building apps on the technology, and companies offering support and consulting. Empirically, this is what happens in successful open projects.

I’ve seen other comments that assume the research aspect of the Scala community will always drive the project, swamping us in perpetual innovation. From what I can tell, this is also currently untrue, and likely to remain untrue.

Some open communities do get taken over by narrow interests. This can kill a community, or it can happen to a dead community because only one narrow interest cares anymore. But the current Scala ecosystem trend is that it’s growing: more contributors, more different priorities, more stuff people are working on.

How to handle it

Embrace growth, embrace more contributors, embrace diversity.

The downside is that more contributors means more priorities and thus more conflicts.

When priorities conflict, the community will have to work it out. My advice is to get people together in-person and tackle conflicts in good faith, but head-on. Find a solution. In-person meetings are critical. If you have a strong opinion about Scala ecosystem priorities, you must make a point of attending conferences or otherwise building personal relationships with other contributors.

Never negotiate truly hard issues via email.

As the community grows and new contributors appear, there will be growing pains figuring out how to work together. All projects that get big have to sort out these issues. There will be drama; it’s best taken as evidence that people are passionate.

Structural solutions will appear. For example, in the Linux world, the “enterprise Linux” branches are a structural solution allowing the community to roll forward while offering customers a usable, stable target. Red Hat’s Fedora vs. Enterprise Linux split is a structural solution to separate its open project from its customer-driven product. In GNOME, the time-based release was a structural solution that addressed endless fights about when to release. Most large projects end up explicitly spelling out some kind of governance model, and there are many different models out there.

Whatever the details, the role of Typesafe — and every other contributor, commercial or not — will be to discuss and work on their priorities. And the overall community priorities will include, but not be limited to, what any one contributor decides to do. That’s the whole reason to use an open project rather than a closed one — you have the opportunity, should you need it, to contribute your own priorities.

When talking about an open project, it can be valuable (and factually accurate) to think “we” rather than “they.”

(Hopefully-unnecessary note: this is my personal opinion, not speaking for anyone else, and I am not a central figure in the Scala community. If I got it wrong then let me know in the comments.)


The Java ecosystem and Scala ABI versioning

On the sbt mailing list there’s a discussion of where to go with “cross versioning.” Here’s how I’ve been thinking about it.

Disclaimer

I’m a relative newcomer to the Scala community. If I push anyone’s buttons it’s not intentional. This is a personal opinion.

Summary

Two theories:

  • The largest problem created by changing ABI contracts is an explosion of combinations rather than the ABI change per se.
  • The ABI of the Scala standard library is only one of the many ABIs that can cause problems by changing. A general solution to ABI issues would help cope with ABI changes to any jar file, even those unrelated to Scala.

Proposal: rather than attacking the problem piecemeal by cross-versioning with respect to a single jar (such as the Scala library), cross-version with respect to a global universe of ABI-consistent jars.

This idea copies from the Linux world, where wide enterprise adoption has been achieved despite active hostility to a fixed ABI from the open source Linux kernel project, and relatively frequent ABI changes in userspace (for example from GTK+ 1.2, to 2.0, to 3.0). I believe there’s a sensible balance between allowing innovation and providing a stable platform for application developers.

Problem definition: finding an ABI-consistent universe

If you’re writing an application or library in Scala, you have to select a Scala ABI version; then also select an ABI version for any dependencies you use, whether they are implemented in Scala or not. For example, Play, Akka, Netty, slf4j, whatever.
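Concretely, a typical build declares something like the following, where sbt’s existing %% operator appends the Scala version to the artifact name; the artifact names and version numbers below are just placeholders:

```scala
// build.sbt -- placeholder names and versions, for illustration only.
scalaVersion := "2.9.2"

libraryDependencies ++= Seq(
  // %% appends the Scala version, picking e.g. some-scala-lib_2.9.2;
  // this cross-versions against the Scala library ABI, but only that one jar.
  "com.example" %% "some-scala-lib" % "1.0.0",
  // Plain Java jars use %, with no cross-versioning at all.
  "org.example" %  "some-java-lib"  % "2.3.1"
)
```

Every version number there is an independent choice, and nothing checks that the resulting set is mutually consistent.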

Not all combinations of dependencies exist and work. For example, Play 1.2 cannot be used with Akka 1.2 because Play depends on an SBT version which depends on a different Scala version from Akka.

Due to a lack of coordination, identifying an ABI-consistent universe involves trial-and-error, and the desired set of dependencies may not exist.

Projects don’t reliably use something like semantic versioning so it can be hard to even determine which versions of a given jar have the same ABI. Worse, if you get this wrong, the JVM will complain very late in the game (often at runtime — unfortunately, there are no mechanisms on the JVM platform to encode an ABI version in a jar).

Whenever one jar in your stack changes its ABI, you have a problem. To upgrade that jar, anything which depends on it (directly or transitively) also has to be upgraded. This is a coordination problem for the community.

To see the issue on a small scale, look at what happens when a new SBT version comes out. Initially, no plugins are using the new version so you cannot upgrade to it if you’re using plugins. Later, half your plugins might be using it and half not using it: you still can’t upgrade. Eventually all the plugins move, but it takes a while. You must upgrade all your plugins at once.

Whenever a dependency, such as sbt, changes its ABI, then the universe becomes a multiverse: the ecosystem of dependencies splits. Changing the ABI of the Scala library, or any widely-used dependency such as Akka, has the same effect. The real pain arrives when many modules change their ABI, slicing and dicing the ecosystem into numerous incompatible, undocumented, and ever-changing universes.

Developers must choose among these universes, finding a working one through trial and error.

For another description of the problem, see this post from David Pollak.

Often, projects are reluctant to have dependencies on other projects, because the more dependencies you have the worse this problem becomes.

One solution: coordinate an explicit universe

This idea shamelessly takes a page from Linux distributions.

We could declare that there is a Universe 1.0. This universe contains a fixed ABI version of the Scala standard library, of SBT, of Akka, of Play — in principle, though initially not in practice, of everything.

To build your application, rather than being forced to specify the version of each individual dependency, you could specify that you would like Universe 1.0. Then you get the latest release for each dependency as long as its ABI remains Universe-1.0-compatible.
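In sbt terms that might look roughly like the sketch below; the repository URL is hypothetical, but the syntax (one resolver per universe plus Ivy’s latest.release specifier) is ordinary sbt:

```scala
// Sketch only: assumes a hypothetical repository that contains exactly the
// ABI-frozen, internally consistent Universe 1.0 artifacts and nothing else.
resolvers := Seq("Universe 1.0" at "https://repo.example.org/universe-1.0")

libraryDependencies ++= Seq(
  // With the whole repository guaranteed ABI-consistent, individual version
  // numbers stop mattering; the newest release within the universe is safe.
  "com.example" % "some-lib"       % "latest.release",
  "org.example" % "some-other-lib" % "latest.release"
)
```

Moving to Universe 2.0 would then be a one-line change to the resolver, rather than a coordinated bump of every version number in the build.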

There’s also a Universe 2.0. In Universe 2.0, the ABI can be changed with respect to Universe 1.0, but again Universe 2.0 is internally consistent; everything in Universe 2.0 works with everything else in Universe 2.0, and the ABI of Universe 2.0 does not ever change.

The idea is simple: convert an undocumented, ever-changing set of implicit dependency sets into a single series of documented, explicit, testable dependency sets. Rather than an ad hoc M versions of Scala times N versions of SBT times O versions of Akka times P versions of whatever else, there’s Universe 1.0, Universe 2.0, Universe 3.0, etc.

This could be straightforwardly mapped to repositories; a repository per universe. Everything in the Universe 1.0 repository has guaranteed ABI consistency. Stick to that repository and you won’t have ABI problems.

One of the wins could be community around these universes. With everyone sharing the same small number of dependency sets, everyone can contribute to solving problems with those sets. Today, every application developer has to figure out and maintain their own dependency set.

How to do it

Linux distributions and large multi-module open source projects such as GNOME provide a blueprint. Here are the current Fedora and GNOME descriptions of their process for example.

For these projects, there’s a schedule with a development phase (not yet ABI frozen), freeze periods, and release dates. During the development phase incompatibilities are worked out and the final ABI version of everything is selected.

At some point in time it’s all working, and there’s a release. Post-release, the ABI of the released universe isn’t allowed to change anymore. ABI changes can only happen in the next version of the universe.

Creating the universe is simply another open source project, one which develops release engineering infrastructure. “Meta-projects” such as Fedora and GNOME involve a fair amount of code to automate and verify their releases as a whole. The code in a Universe project would convert some kind of configuration describing the Universe into a published repository of artifacts.
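For illustration (nothing like this exists, and the names and versions are placeholders), the input configuration might be as simple as a pinned list per universe:

```scala
// Hypothetical universe descriptor: the release engineering tooling would
// verify that this set is mutually ABI-consistent, then publish exactly
// these artifacts as the "Universe 1.0" repository.
case class Artifact(group: String, name: String, version: String)
case class Universe(version: String, artifacts: Seq[Artifact])

val universe1_0 = Universe(
  version = "1.0",
  artifacts = Seq(
    Artifact("org.scala-lang",    "scala-library", "2.9.2"),
    Artifact("com.typesafe.akka", "akka-actor",    "2.0.5"),
    Artifact("org.slf4j",         "slf4j-api",     "1.6.6")
  )
)
```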

There are important differences between the way the Linux ecosystem works today and the way the Java ecosystem works. Linux packages are normally released as source code by upstream open source developers, leaving Linux distributions to compile against particular system ABIs and to sign the resulting binaries. Java packages are released as binaries by upstream, and while they could be signed, often they are not. As far as I know, however, there is nothing stopping a “universe repository” project from picking and choosing which jar versions to include, or even signing everything in the universe repository with a common key.

I believe that in practice, there must be a central release engineering effort of some kind (with automated checks to ensure that ABIs don’t change, for example). Another approach would be completely by convention, similar to the current cross-build infrastructure, where individual package maintainers could put a universe version in their builds when they publish. I don’t believe a by-convention-only approach can work.

To make this idea practical, there would have to be a “release artifact” (which would be the entire universe repository) and it would have to be tested as a whole and stamped “released” on a certain flag day. There would have to be provisions for “foreign” jars, where a version of an arbitrary already-published Java jar could be included in the universe.

It would not work to rely on getting everyone on earth to buy into the plan and follow it closely. A small release engineering team would have to create the universe repository independently, without blocking on others. Close coordination with the important packages in the universe would still be very helpful, of course, but a workable plan can’t rely on getting hundreds of individuals to pay attention and take action.

Scala vs. Java

I don’t believe this is a “Scala” problem. It’s really a Java ecosystem problem. The Scala standard library is a jar which changes ABI when the major version is bumped. A lot of other jars depend on the standard library jar. Any widely-used plain-Java jar that changes ABI creates the same issues.

(Technicality: the Scala compiler also changes its code generation which changes ABIs, but since that breaks ABIs at the same time that the standard library does, I don’t think it creates unique issues.)

Thinking of this as a “Scala problem” frames it poorly and leads to incomplete solutions like cross-versioning based only on the Scala version. A good solution would also support ABI changes in something like slf4j or commons-codec or whatever example you’d like to use.

btw, it would certainly be productive to look at what .NET and Ruby and Python and everyone else have done in this area. I won’t try to research and catalog all those in this post (but feel free to discuss in comments).

Rehash

The goal is that rather than specifying the version for every dependency in your build, you would specify “Universe 1.0”; which would mean “the latest version of everything in the ABI-frozen and internally consistent 1.0 universe of dependencies.” When you get ready to update to a newer stack, you’d change that to “Universe 2.0” and you’d get another ABI-frozen, internally-consistent universe of dependencies (but everything would be shinier and newer).

This solution scales to any number of ABI changes in any number of dependencies; no matter how many dependencies or how many ABI changes in those dependencies, application developers only have to specify one version number (the universe version). Given the universe, an application will always get a coherent set of dependencies, and the ABI will never change for that universe version.

This solution is tried and true. It works well for the universe of open source C/C++ programs. Enterprise adoption has been just fine.

After all, the problem here is not new and unique to Java. It wasn’t new in Linux either; when we were trying to work out what to do in the GNOME Project in 1999-2001 or so, in part we looked at Sun’s longstanding internal policies for Solaris. Other platforms such as .NET and Ruby have wrestled with it. There’s a whole lot of prior art. If there’s an issue unique to Java and Scala, it seems to be that we find the problem too big and intimidating to solve, given the weight of Java tradition.

I’m just writing down half-baked ideas in a blog post; making anything like this a reality hinges on people doing a whole lot of work.

Comments

You are welcome to comment on this post, but it may make more sense to add to the sbt list thread (use your judgment).