Software Design and First Principles–Part 0: Concepts of Object Orientation

I often compare software development with building houses or woodworking.  I sometimes even compare software development with the vocation of electrician.  In each of these other vocations, craftspeople need to go through a period of apprenticeship and mentoring before being “allowed” to practice their craft.  In each of these vocations there is a series of rules that apply to a lot of the basics of what they do.  With building houses there are techniques and principles that are regulated by building codes; with electricians there are techniques and fundamentals that are effectively regulated by electrical codes and standards.  It’s one thing to learn the techniques, principles, and fundamental laws of physics; it’s another thing to be able to call yourself an electrician or a carpenter.

Now, don’t get me wrong; I’m not advocating that software development be a licensed trade—that’s an entirely different conversation.  But I do believe that many of the techniques and principles around software development take a lot of mentorship to get right.  Just like electricity, they’re not the most intuitive of techniques and principles.  But, just like electricity, it’s really good to know why you’re doing something so you can know its limits and better judge “correctness” in different scenarios.

To that effect, I think that in the rush to get hands-on experience with certain software design techniques and patterns, the principles behind them are being somewhat ignored.  I think it’s important that we remember and understand what I’m deeming “first principles”.

A First Principle is a foundational principle of whatever it applies to.  Some of the principles I’m going to talk about may not all be foundational; but I view them as almost as important as foundational, so I’m including them in First Principles.

From an object-oriented standpoint, there are lots of principles that we can apply.  Before I get too deeply into these principles, I think it’s useful to remind ourselves what object-orientation is.  I’m not going to get too deep into OO here; I’ll assume you’ve got some experience writing and designing object-oriented programs.  But I want to associate the principles to the OO concepts that guide them; so it’s important that you as the reader are on the same page as me.

OO really involves various concepts.  These concepts are typically outlined as: encapsulation, abstraction, inheritance, polymorphism (at least subtype, but usually parametric and ad-hoc as well), and “message passing”.  I’m going to ignore message passing in this part, other than to say it is typically implemented as method calls.

You don’t have to use all the OO concepts when you’re using an OO language; but you could argue that encapsulation is the one concept that is fundamental.  Encapsulation is sometimes referred to as information hiding; but I don’t think that term does it justice.  Sure, an object with private fields and methods “hides” information; but the fact that the type exposes those privates through a public interface of methods isn’t even alluded to by “information hiding”.  Encapsulation is, thus, a means to keep privates private and to provide a consistent public interface to act upon or access those privates.  The interface is an abstraction of the implementation details (the private data) of the class.
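A minimal sketch of encapsulation, in TypeScript for illustration (the names here are hypothetical, not from any particular codebase): the private field is inaccessible from the outside, and the public methods are the only interface through which it can be acted upon.

```typescript
// Encapsulation: the balance is a private implementation detail; the
// public methods form the consistent interface to act upon it.
class Account {
  private balance = 0; // the "privates" — hidden from callers

  deposit(amount: number): void {
    // The interface can enforce invariants the raw data could not.
    if (amount <= 0) throw new Error("deposit must be positive");
    this.balance += amount;
  }

  getBalance(): number {
    return this.balance;
  }
}

const account = new Account();
account.deposit(100);
// account.balance = -50;  // compile error: 'balance' is private
```

Note that a caller can never put an `Account` into an invalid state, because every path to the private data goes through the public interface.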

The next biggest part of OO is abstraction.  As we’ve seen, encapsulation is a form of abstraction (data abstraction); but the abstraction we’re focusing on now is one that decouples code from other implementation details.  Abstraction can be implemented with inheritance in many languages (e.g. code can know how to deal with a Shape and not care that it’s given a Rectangle), and that inheritance can use abstract types.  Some OO languages expand abstraction abilities to include things like interfaces (although you could technically do the same thing with an abstract type that had no implementation).
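The Shape/Rectangle idea can be sketched like this (again in TypeScript, with hypothetical names): the calling code depends only on the abstract type and never learns which concretion it was handed.

```typescript
// Abstraction via an abstract type: callers deal with Shape and
// don't care that they're given a Rectangle or a Circle.
abstract class Shape {
  abstract area(): number;
}

class Rectangle extends Shape {
  constructor(private width: number, private height: number) {
    super();
  }
  area(): number {
    return this.width * this.height;
  }
}

class Circle extends Shape {
  constructor(private radius: number) {
    super();
  }
  area(): number {
    return Math.PI * this.radius * this.radius;
  }
}

// This function is decoupled from every concrete implementation;
// a new Shape subtype needs no changes here.
function totalArea(shapes: Shape[]): number {
  return shapes.reduce((sum, s) => sum + s.area(), 0);
}
```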

Inheritance is key to many of the other concepts in OO: abstraction, subtype polymorphism, interfaces, etc.  (If we view an interface as an abstract type with no code, then something that “implements” an interface is really just inheriting from an abstract type; but my focus isn’t these semantics.)  We often let our zeal to model artefacts in our design run away with us and run into problems with the degree and the depth of our inheritance; a point I hope to revisit in a future post in this series.

Although you could technically use an OO language and not use polymorphism in any way, I think OO languages’ greatest feature is polymorphism.  Subtype polymorphism, as I’ve noted, is a form of abstraction (Shape, Rectangle…).  But all the other types of polymorphism are also abstractions: they’re replacing something concrete (implementation details) with something less concrete (abstract).  With subtype polymorphism that abstraction is an abstract type or a base type; with parametric polymorphism we generally create an algorithm abstraction that is decoupled from the data involved (Generics in .NET); and ad-hoc polymorphism is overloading, a decoupling of one particular method call to one of many implementations.
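The two non-subtype flavours can be sketched briefly (TypeScript here, though the .NET equivalents would be generics and method overloads; all names are made up for the example):

```typescript
// Parametric polymorphism: the algorithm (find the first match) is
// abstracted away from the type of data it operates on.
function firstWhere<T>(items: T[], predicate: (item: T) => boolean): T | undefined {
  for (const item of items) {
    if (predicate(item)) return item;
  }
  return undefined;
}

// Ad-hoc polymorphism (overloading): one method name resolves to one
// of several implementations based on the argument's type.
function describe(value: number): string;
function describe(value: string): string;
function describe(value: number | string): string {
  return typeof value === "number"
    ? `number: ${value}`
    : `string: ${value}`;
}
```

In both cases the caller is coupled to something less concrete: an algorithm that works for any element type, or a name that stands in for several implementations.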

I quickly realized the scope of this topic is fairly large and that one post on it would be too much like drinking from a firehose, as well as potentially protracted (risking never getting done at all :).  So I’ve split up what I wanted to talk about into chunks.  I’m not entirely sure what the scope actually is yet; I’ll figure that out as I go, or let feedback guide me.  Now that we’ve got most of the OO concepts in our heads, the next post will begin detailing the principles I wanted to talk about.


The Flawed Eventually-upgrade Software Model

I think Windows XP was the first real release of Windows–it had finally reached a point of usability and stability that people could accept.  The Microsoft support model changed shortly after Windows XP was released to basically support any piece of software for as long as ten years (if you paid extra for support roughly two years after a successive version was released).  To paraphrase a famous law: software becomes obsolete every 18 months.  That was true for a long time; but hardware and software aren’t improving at that rate any more.  Software has basically caught up with existing hardware design and now has the capability of sustaining itself, without upgrade, for much longer than it did ten years ago.

To paraphrase once again: you can make some of the people happier all of the time, but you can’t make all of the people happier all of the time.  Releasing new versions of software nowadays is more about attempting to make more people happier than were happy before.  Approaching your solution or your technology from a 100% buy-in point of view is unrealistic.  I think we’ve seen the fallout of that model for at least the last ten years.  People have said that successors to software like Windows XP, on their own, aren’t enough to make people happier than they already are.  Trying to force a change only results in push-back.  The friction that once kept people on a particular brand of OS, or even a particular architecture, is gone–people are exercising their options if they’re unable to use what they’re happy with.

I think it’s time for software companies to change their model so customers can buy into an indefinite support model for software.  I think businesses are more than willing to spend more money on longer support for some software packages than on buying the latest version every x number of years.  If you look at the TCO of upgrading away from XP, it’s very much more than what a business pays Microsoft for the OS.  Companies are willing to offset that cost and buy support for XP rather than upgrade away from it.  It just so happens that Microsoft extended support for XP rather than change their core model.

I think the more the current model effectively gives customers a choice between abandoning XP and going to the latest version of an operating system (because you’re effectively forcing them to make that evaluation), the more likely it is that you end up forcing people away from Windows entirely.  People and businesses are re-evaluating why they need their computers, and thus the operating system installed on them.  There’s much more of a need to consume data over the Internet than there was ten years ago.  People and companies are recognizing that, and they’re also recognizing there are many more options for doing just that.

With this model, moving forward, innovation would drive software sales more than it does now.  People would upgrade not because it’s the latest version, and not because they have to upgrade their hardware, but because the innovation in the software is pervasive enough to justify upgrading.  Different wouldn’t be enough to sell upgrades.

What do you think?  Do you think the eventually-upgrade software model is out of date?


The Rat-hole of Object-oriented Mapping

Mark Seemann recently had a great post that, as most of his posts seem to do, challenges the average reader to re-think their reality a bit.  The post is titled “Is Layering Worth the Mapping”.  In the post Mark essentially details some of the grief and friction that developers face when they need to decouple two or more concepts by way of an abstraction.

Mark takes the example of layering.  Layering is a one-way coupling between two “layers”, where the “higher-level” layer takes a dependency on abstractions in a “lower-level” layer.  In his example, a UI layer communicates with a domain layer about musical track information.  The track information that is communicated lives in a hand-crafted Track abstraction.  Typically this abstraction would live in the lower-level layer to maintain the unidirectional coupling.  Of course, the UI layer needs a Track concretion to do its job and must map between the higher-level layer and the lower-level layer.  To further complicate things, other decoupling may occur within each layer to manage dependencies: the UI may implement an MVx pattern, in which case there may be a specific “view-model” track abstraction; the data layer may employ object-relational mapping; etc.
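A sketch of that layering, in TypeScript with hypothetical names (Mark's example is .NET, but the shape of the problem is the same in any language): the Track abstraction lives in the lower-level layer, and the UI layer maintains mapping code at the boundary to produce its own view-model type.

```typescript
// --- lower-level (domain) layer: owns the Track abstraction ---
interface Track {
  title: string;
  durationSeconds: number;
}

// --- higher-level (UI) layer: depends one-way on the domain layer ---
class TrackViewModel {
  constructor(public title: string, public duration: string) {}
}

// The mapping the UI layer must hand-write and maintain at the seam.
function toViewModel(track: Track): TrackViewModel {
  const minutes = Math.floor(track.durationSeconds / 60);
  const seconds = track.durationSeconds % 60;
  return new TrackViewModel(
    track.title,
    `${minutes}:${seconds.toString().padStart(2, "0")}`
  );
}
```

Every additional decoupling (a view-model here, an ORM entity below) adds another seam like `toViewModel` that someone has to write and keep in sync.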

Mark goes on to describe some “solutions” that often fall out of scenarios like this in a need to help manage the sheer magnitude of the classes involved: shared DTOs as cross-cutting entities, POCO classes, classes with only automatic properties, etc.

It’s not just layering.  Layering lives in a grey area between in-memory modules and out-of-process “tiers”.  Layering, I think, is an attempt to get the benefits of out-of-process decoupling without the infrastructure concerns of connecting and communicating between processes.  Of course, over and above the module/assembly separation, the only thing enforcing this decoupling in layers is skill and being methodical.

I’m convinced layering often is, or often becomes, a “speculative generality” meant to give some “future proofing” to the application, since layering so closely resembles “tiering” (not to be confused with the eventual response of “tearing”) as to make it easy to make the application tiered should there ever be a need.  To be clear, this is the wrong impetus for designing a software solution.  You’re effectively setting yourself up to fail by essentially “making up” requirements that are more than likely going to be wrong.  If the requirements the design is based on are fallacies, the design too is wrong.  But you have to continue to maintain this design until you re-write it (ever noticed that anti-pattern?).

But implementing tiers, or any sort of communication between processes, often ends up in the same state.  You have internal “domain” entities within the processes (and even within logical boundaries within those processes) that end up spawning the need for “DTO” objects that live at the seams, on one or either side of the communication.  Further, many times that communication is facilitated by frameworks like WCF that create their own DTOs (SOAP envelopes, for example).  Except now you’re mandated by the physical boundaries of processes, and you’re forced into things like shared-type assemblies to model the cross-cutting “entities” (if you choose that cross-cutting “optimization”), introducing a whole new level of effort and a massive surface area for attracting human error (you’ve technically introduced the need for versioning, potentially serialization, deployment issues, etc.).

Creating an object-oriented type to simply act as a one-way container for something that lives on the other side of some logical or physical boundary has appeared to me to be a smell for quite a while.  E.g. the UI layer in Mark’s original example has this concept of a “Track” DTO-like type that, when used, is only used in one direction at a time.  When moving from the UI to the domain layer, it’s only written to.  If the UI layer gets a “track” back from the domain layer, it only reads from it.  Abstracting this into an OO class seems pointless and, as Mark says, not particularly beneficial.

Let’s look specifically at the in-memory representation of something like a “Track”.  We’ll limit ourselves and say that we need four Track abstractions: one for the view model, one for the domain layer abstraction, one for the data layer abstraction, and one for the object-relational mapping.  (I’ve assumed that the data layer may not have a track “entity” and is only responsible for pushing data around.)  So, in effect, we have four Track DTO classes in our system (and two or three Track “entities”).  But if we look at the in-memory representation of instances of these objects, they’re effectively identical; each one can’t really have more data than another, otherwise there’s something wrong.  If we look at what’s actually happening with this data, we’re really writing a lot of code to copy memory around in a really inefficient way.  The DTO classes in part become the way to copy memory.  To be fair, this is a side-effect of the fact that we’re manually mapping from one abstraction to another, or from one abstraction to an entity (or vice versa).
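Laid out side by side (TypeScript, all names hypothetical), the four Tracks and their mappers make the point visually: four classes with identical shape, and boundary code whose only job is copying the same fields around.

```typescript
// Four Track DTOs whose in-memory representations are effectively
// identical — one per layer/seam in the hypothetical system.
class OrmTrack       { constructor(public id: number, public title: string) {} }
class DataTrack      { constructor(public id: number, public title: string) {} }
class DomainTrack    { constructor(public id: number, public title: string) {} }
class TrackViewModel { constructor(public id: number, public title: string) {} }

// Each boundary crossing is another field-by-field copy: ceremony,
// not domain logic.
const ormToData    = (t: OrmTrack)    => new DataTrack(t.id, t.title);
const dataToDomain = (t: DataTrack)   => new DomainTrack(t.id, t.title);
const domainToView = (t: DomainTrack) => new TrackViewModel(t.id, t.title);

// Moving one record from the ORM up to the UI touches three mappers
// and allocates four objects holding the same two values.
const vm = domainToView(dataToDomain(ormToData(new OrmTrack(1, "Track A"))));
```

Add a field to Track and you get to update four classes and three mappers; none of that work is “value-add”.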

This type of thing isn’t entirely unknown; it sometimes goes by the name of ceremony.

For the most part, I think computer languages are hindering our ability to address this.  Languages in general tend to maintain a specific way of messaging, called method-calling, that limits us to communicating only information that can be encapsulated by the language’s or platform’s type system.  But to a certain extent we’re also hindered by our myopia of “everything must be a type in language X”.  Maybe this is another manifestation of Maslow’s hammer.

Imagine if you removed all the mapping code in a complex system, especially a distributed system, and were left with just the “domain” code.  I’ve done this with one system, and I was astounded that over 75% of the code in the system had nothing to do with the system’s “domain” (the “value-add”) and was “ceremony” to facilitate data mapping.

I sometimes hear that this isn’t so much of a problem with specific frameworks.  I’m often told that these frameworks do all the heavy lifting like this for us.  But they really don’t.  The frameworks really just introduce another seam.  The issue of impedance mismatch isn’t just related to object-relational mapping.  It has to do with any mapping where both sides aren’t constrained by the same rules.  I’ve blogged about this before, but I can use some “data framework” to generate “entities” based on a data model, or even based on POCOs.  Some view this as solving the problem; but it doesn’t.  Each side operates under different rules.  The generated classes can only match the impedance of what they have to communicate with, and you have to plan for that being different than the impedance you’ll end up mapping from/to.  The only real solution is to introduce another DTO to map between your domain and the classes generated by the framework, so you are decoupled from the eventual “gotchas” where your domain has different expectations or rules than the framework you’re communicating with.  When people don’t do this, you see all sorts of complaints like “the date/time in X isn’t the same as what I need”, etc.

Don’t fall into this rut.  Think about what you’re doing; if you’ve got four DTOs to hold the same data, maybe there’s a better way of doing it.  Try to come up with something better and blog about it, or at least talk about the problem out in the open, like Mark.