Is Architecture? Is Not Architecture?

Encapsulation is a long-standing, "tried and true" principle of architecture. It dates back over 35 years to David Parnas' work on the importance of information hiding, which was a cornerstone of function-based and structured programming. Twenty years later, in the middle of the object-oriented revolution, Gamma, Helm, Johnson, and Vlissides (the "Gang of Four") amplified the importance of this concept with their advice to "encapsulate the concept that varies."

As you might imagine, encapsulation depends heavily on "separation of concerns" and "clearly drawn boundaries." All three concepts are enshrined in the "Architecture Hall of Fame."

A few years ago, I was working at a software outsourcing company, and we were doing a project to write the control software for an excimer laser engine (i.e. the component that powers many applications, such as laser printers or laser-beam photomask generators for semiconductor chips). While trying to pin down requirements and scope, one of my colleagues engaged the client in an exercise of "is a laser?" vs. "is not a laser?" Somebody would mention a capability, and the group would debate the question, "Is this capability the responsibility of the laser, or not? Is it in scope (technically) or out of scope?" This was a simple, effective exercise for getting the entire team (including product marketing, development, and QA people) to reach a common understanding of the system's scope.

Today, I work at a foreign language learning company, building a repository to manage language content assets. We played a variant of the "is a laser" game.

We determined early on that the repository would hold "application-neutral, language content". This gave us two specific dimensions to consider when addressing the "is the content repository" vs. "is not the content repository" question.

o  "language content vs. non-language content" (e.g. words and translations vs. users, learning activities and outcomes)

o  "application-neutral" vs. "application-specific" (e.g. words and translations vs. containers and lessons)

Excimer lasers and language content are both fascinating subjects, but here, their purpose is to illustrate encapsulation, as a launching pad for a slightly different experiment.

 

What Does Architecture Encapsulate?

Architecture is both a process and a result. As a process, architecture should be able to encapsulate a set of discovery and formulation activities that yield an effective solution design from a variety of inputs (deployment contexts, stakeholder expectations, technology opportunities, etc.). As a result, architecture seeks to identify "architecturally-significant decisions" and gather them into a pile, so that they all can be considered as a whole. Clements, Kazman, and Klein of the SEI write:

"The software architecture of a program or computing system is the structure or structures of the system, which comprise software components, the externally visible properties of those components, and the relationships among them."

"By externally-visible properties, we are referring to those assumptions other components are allowed to make of a component, like its provided services, performance characteristics, fault handling, shared resource usage, ..."

"Our criteria for something to be architectural is [that it] needs to be externally visible in order to reason about the system’s ability to meet its quality requirements or to support decomposition of the system into independently implementable pieces."

Please pardon the oversimplification, but architecture encapsulates the reasoning behind:

o  how a system is organized,

o  how the parts collaborate,

o  how it adapts to significant changes between contexts (variation in situation),

o  how it evolves gracefully over time

To slightly extend an observation made by Gerry Weinberg, these four things represent a system's being, behaving, balancing, and becoming.

So that's it. We're done, right? That wasn't so bad.

Not so fast, amigo.

 

What Should Architecture Encapsulate?

I am a firm believer in everything that has been said so far. But something is gnawing at me, hinting that the story cannot end here. What's missing?

If I don't believe that the definition of architecture, or what it encapsulates, is fundamentally flawed, then perhaps my concern lies in how we answer the question: "what is the system?" Before considering this question, let's digress for a moment and consider what Russell Ackoff has to say about systems.

After all, if architecture is about a system's being, behaving, balancing, and becoming, we should be clear about "what is the system?" and "what isn't the system?"

Ackoff asserts "A system is a collection of interdependent elements; each is related, directly or indirectly to every other." Further, "a purposive system is a system that has two or more goals, related by a common purpose, and is able to choose the means to achieve them" while a purposeful system "adds the ability to choose its own goals."

In other words, a person driving a car is a purposeful system (can choose destination, route, speed, etc.). A car alone is merely purposive (accelerate, brake, absorb shocks), unless, of course, it is "My Mother the Car" (which, by the way, is one of the few TV shows from the 1960s and 1970s that hasn't been made into a feature movie).

But the key insight here is Ackoff's observation about "interdependent elements, each related to one another."  What this means is that if we encapsulate at the wrong place, we might push certain elements "outside" of the system, which are really "inside the system".

This is a troubling thought, because it means that our system scope could easily grow without bound.  Consider the following example:

o  A business needs a system - an excimer laser, or a language content repository, or thousands of other possibilities

o  They form a project team to work on it

o  The project team leaders discover and clarify the requirements (some may be mandated)

o  The project team creates some development processes (e.g. agile vs. waterfall, continuous vs. periodic integration, outsource vs. internal)

o  They define the architecture for the system (organization, component interfaces, technology choices, policies)

o  They staff their team, define the work needed, break it down into tasks and dependencies

o  They develop and test the system and prepare to deploy it

o  The business users prepare for the new system (training, workflow changes, migration plan, etc.)

o  The system is deployed, and everybody lives happily ever after (not counting a middle of the night crisis or three)

How many systems can you find in this simple description? Just find the ones that are mentioned. Don't even bother to look for the ones that may be implied.

The following figure shows at least 6 interdependent systems which combine to make at least 3 others.

The 2 layers in this diagram (the development project and the deployment environment) are systems in their own right.  And each of them contains (at least) 3 systems:

o  the technology system,

o  the social (or team) system,

o  and the project (or business process) system.

So, 3 x 2 = 6, plus 2 = 8, plus the business that contains the development and deployment environments, plus, plus...

System and Architecture Revisited

"Project architecture? Social architecture? Who are you trying to kid? These things have absolutely nothing to do with software architecture. "

The fact is that how the project is organized and how the teams are formed has everything to do with software architecture. Conway's Law, first published by Melvin Conway in 1968, concludes:

"The basic thesis of this article is that organizations which design systems (in the broad sense used here) are constrained to produce designs which are copies of the communication structures of these organizations. We have seen that this fact has important implications for the management of system design. Primarily, we have found a criterion for the structuring of design organizations: a design effort should be organized according to the need for communication. "

What goes on in the project team with the team members or in the deployment environment is unrelated?

Au contraire, mon frere. Conway's Law directly addresses communication. However, it is also closely linked to physical location, organization structure, and project team structure. To verify this, imagine the differences working with:

o  another team down the hall,

o  one located in the same building two floors away,

o  one in another building across town, and

o  an offshore outsourcing partner, 10,000 miles and 10 time zones away.

The effects of Conway's Law also extend to the choice of project process. An agile method, like Scrum or XP, chooses to focus on short iterations, refactoring, and adaptability over longer-range planning. A waterfall or spiral approach focuses more on up-front planning and coordination of effort. Is it possible that this choice affects the software architecture and the system being built? You think?

The reach of Conway's Law also extends to cultural norms about decision-making authority, peer reviews, raising and resolving objections, and abiding by unpopular decisions.

Technical architecture is too complex as it is. We'll get buried under this avalanche.

What if I need to architect a system today?  If my system is linked to the deployment environment, what happens when the company brings in a new CEO that reorganizes the whole operation? What happens if a competitor releases an unexpected product that redefines the market? What happens if .com mania strikes and the three leaders of my design team leave to form a startup? We'll never decide whether to use Java or C#, Oracle or MySQL, SOAP or REST-ful Web Services if we need to worry about these things. Just wake me up when the nightmare is over.

Back in the 1980's, Fram Oil Filters ran a popular TV ad that showed an auto mechanic finishing an expensive repair job on a car engine. The mechanic concluded, "You can pay me now. Or... you can pay me later." You can spend a little every 3000 miles to change your oil and filter, or you can press your luck and overhaul your engine. The choice is yours.

So, what if I choose to focus inwardly on my software architecture problem? What if I ignore what's going on with the formation of my project team? What if I push for a new technology like Ruby on Rails without considering whether the developers have the right skills? What if I ignore competitive trends in my industry, or how it is regulated?

Ackoff said that all elements are interrelated, but he didn't say how tightly. If a spider captures an insect in Montana, it's not necessarily going to result in a bug in my Web Server.

 

Synthesis, Risk Management and Hypotheses

There are three very powerful tools that every architect needs to have in his toolbox, and know how to use: synthesis, risk management, and hypotheses.

Synthesis

Ackoff wrote extensively on the distinction between analysis and synthesis. Both concepts date back to the ancient Greeks, and are mirror images of each other.

Analysis starts with a thing and looks inward. It partitions the thing into its component parts, and tries to explain how the parts work (or will work). Architects and designers are very familiar with analysis. It is what they do on a daily basis. Scientists do it, too - biological classification, the periodic table of elements, and geological classifications are examples.

By contrast, synthesis starts with a thing and looks outward. Synthesis asks about the neighboring and containing systems. It seeks to understand what the role of this thing is (or should be) in its environment. Synthesis focuses on the context, environment, and role/purpose of a thing.

Synthesis is extremely important to an architect, because it provides a balanced perspective on the big picture. Synthesis that is performed well gives an architect the ability to anticipate how things might change or develop. Good anticipation is not uninformed guesswork. It is grounded in a solid understanding of higher-level patterns, much like the way that capable meteorologists use movements of warm/cold fronts, temperature, wind and high/low pressure systems to forecast the weather.

Synthesis cannot be performed effectively in a vacuum. Typically, there is far too much "conceptual distance" between the neighboring and containing domains and the specifics of the system being developed. This is where it is essential for the architect to identify experts and enlist their assistance. The architect's job is to:

o  learn enough about the subject matter to be conversant,

o  provide context about the system being developed,

o  ask enough questions to guide the subject matter expert, then

o  ensure that the architect understands the potential implications of the responses.

Risk Management:

From time to time (or seemingly more frequently), the system we are seeking to develop will be affected by events in a related system. These events can vary according to a number of things:

o  impact: between nonexistent and critical,

o  frequency: between one-time and continuous,

o  likelihood: between highly unlikely and certain.

This is a classic risk management question, and it is important to combine these three dimensions to separate the things that aren't significant from the ones that are worth worrying about. The marginal risks are the ones we might elect to leave alone and accept. The critical risks are the ones that jeopardize our critical success factors: key capabilities, quality, schedule, or cost. We assess and develop contingency plans for the critical risks. Sometimes these contingency plans only require incremental adjustments. Often, however, they can cause us to rethink some of our fundamental approaches. It is also worth noting that risks change over time, and threats need to be reviewed periodically.
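One way to make this triage concrete is to score each risk on the three dimensions and sort. The sketch below is only an illustration: the 0.0 to 1.0 scales, the formula that combines them, the threshold, and the numbers assigned to each risk are all my assumptions, not an established method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    impact: float      # assumed scale: 0.0 = nonexistent, 1.0 = critical
    frequency: float   # assumed scale: 0.0 = one-time, 1.0 = continuous
    likelihood: float  # assumed scale: 0.0 = highly unlikely, 1.0 = certain

    def exposure(self) -> float:
        # Assumed combination rule: impact weighted by likelihood,
        # scaled up (at most doubled) for events that keep recurring.
        return self.impact * self.likelihood * (0.5 + 0.5 * self.frequency)


CRITICAL_THRESHOLD = 0.25  # assumed cut-off between marginal and critical


def triage(risks):
    """Split risks into (critical, marginal), most exposed first."""
    ranked = sorted(risks, key=lambda r: r.exposure(), reverse=True)
    critical = [r for r in ranked if r.exposure() >= CRITICAL_THRESHOLD]
    marginal = [r for r in ranked if r.exposure() < CRITICAL_THRESHOLD]
    return critical, marginal


# Illustrative scores only; a real team would estimate and revisit these.
risks = [
    Risk("new CEO reorganizes the company", impact=0.8, frequency=0.1, likelihood=0.2),
    Risk("design leads leave for a startup", impact=0.9, frequency=0.1, likelihood=0.6),
    Risk("spider catches an insect in Montana", impact=0.0, frequency=1.0, likelihood=1.0),
]

critical, marginal = triage(risks)
for risk in critical:
    print("plan a contingency for:", risk.name)
for risk in marginal:
    print("accept for now:", risk.name)
```

Because risks change over time, a list like this is something to re-score periodically rather than compute once.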

Hypothesis:

As we all learned in high school, hypotheses are the foundation of the scientific method. Hypotheses are testable assertions, frequently used when uncertainty is present. Because they are testable, experiments can be conducted to confirm or refute hypotheses.

Hypotheses can be formed about many things:

o  interdependencies

o  failure conditions (timing, frequency, impact)

o  effectiveness or suitability of design approaches

The key thing about hypotheses is that they are statements about a system that can be recorded and traced to other things.  Consider a systems engineering environment for a jet engine, such as one to power an Airbus A320.

There are several large, powerful external systems that interact here:

o  The airlines who buy the planes interact with the airframe manufacturer (Boeing, Airbus) to specify the system requirements for the airplane.

o  The FAA (and other international air travel regulating bodies) interact with both the airlines and airframe manufacturers to add requirements, by mandating safety regulations for both outcomes and processes

o  The airframe manufacturer interacts with the jet engine manufacturers (GE, Rolls-Royce).  System engineering for the airplane as a whole (a containing system) creates derived requirements for the engine.

Note that these derived requirements may be made before anything concrete is known about the jet engine, or whether these requirements are even feasible.

o  The process is repeated when the jet engine manufacturer performs its system engineering, and creates derived requirements for the electronics control board, control software, or turbine hydraulics.

More hypotheses need to be made about the ability of these underlying components to come together to satisfy the requirements that have been imposed on them.

The important point to emphasize here is about the traceability of hypotheses.  Safety regulated systems engineering processes mandate that requirements are recorded and traced to each other.  This provides the audit trail that helps ensure: a) that all upper level requirements are backed up by lower level requirements, and b) there are no stray lower level requirements that can’t be traced to one or more upper level ones.

Hypotheses can be treated the same way.  They can be linked to risks, requirements, or architectural approaches.  Further, the results of experiments meant to verify the hypotheses can be linked to the hypotheses themselves.
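The two audit checks described above, (a) and (b), are easy to automate once the trace links are recorded. Here is a minimal sketch, assuming the links are stored as a simple mapping from each lower-level item to the upper-level items it traces to. The IDs (AIR-1, ENG-1, HYP-1, and so on) are hypothetical, and real requirements tools will store this differently.

```python
def audit(upper_ids, traces):
    """Check trace links between two levels.

    upper_ids: set of upper-level item IDs.
    traces: dict mapping each lower-level ID to the set of upper-level
            IDs it traces to.
    Returns (unbacked, stray):
      unbacked = upper-level items that no lower-level item traces to
      stray    = lower-level items that trace to no known upper-level item
    """
    covered = set().union(*traces.values())
    unbacked = sorted(upper_ids - covered)
    stray = sorted(low for low, ups in traces.items() if not (ups & upper_ids))
    return unbacked, stray


# Hypothetical airplane-level and engine-level requirement IDs.
airplane_reqs = {"AIR-1", "AIR-2", "AIR-3"}
engine_traces = {
    "ENG-1": {"AIR-1"},
    "ENG-2": {"AIR-1", "AIR-2"},
    "ENG-3": set(),  # traces to nothing: a stray requirement
}
print(audit(airplane_reqs, engine_traces))

# Hypotheses treated the same way: each one links to the requirements it is about.
hypothesis_traces = {
    "HYP-1": {"ENG-1"},
    "HYP-2": set(),  # recorded, but never linked to anything
}
print(audit(set(engine_traces), hypothesis_traces))
```

The same function serves both levels because the question is the same each time: what is unsupported from below, and what is dangling from above.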

Combination:

In conclusion, let's take a quick look at how synthesis, risk management, and hypothesis are combined. Suppose that we are trying to architect a large, complex system. The following set of steps makes sense:

Step 1: Synthesis: We look outward at containing and neighboring systems and ask, "how does this system support them?" or "how is this system influenced by them?"

Step 2: Analysis: We identify related systems and assess the nature of our relationships with each.

Step 3: Risk Management: Based on this assessment, we identify the major threats and opportunities.

Step 4: Synthesis: We use our synthesized understanding of the surrounding systems to prioritize them according to impact, frequency, and likelihood.

Step 5: Hypothesis: We make and record testable hypotheses about these threats and opportunities. Note that failing to capitalize on an opportunity is a form of risk.

Step 6: Analysis: We prioritize the hypotheses and determine how testable each is. We also identify the risks we will accept.

Step 7: Verify: We specify the experiments and perform the most important ones that can be tested.

What action to take next depends on the situation:

o  how do the results of the experiment affect the hypotheses?

o  how do the hypotheses affect the risks?

o  how do the risks affect the requirements and/or approaches they are linked to?

o  how much of a chain reaction is created in the system and its architecture?
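If the links between experiments, hypotheses, risks, requirements, and approaches have been recorded, the size of the chain reaction can at least be enumerated. The sketch below assumes the links are kept as a plain dictionary and walks them breadth-first. Every item name in it is hypothetical, and it only lists what needs to be re-examined; it cannot say what the answer should be.

```python
from collections import deque

# Hypothetical trace links: each item points to the items that depend on it.
links = {
    "experiment: load test": ["hypothesis: cache handles peak traffic"],
    "hypothesis: cache handles peak traffic": ["risk: slow response at launch"],
    "risk: slow response at launch": [
        "requirement: 2 second page load",
        "approach: in-memory cache",
    ],
    "approach: in-memory cache": ["requirement: hardware budget"],
}


def affected_by(start, links):
    """Return every item reachable from start, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for dependent in links.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                order.append(dependent)
                queue.append(dependent)
    return order


# Everything that should be reviewed if the load test gives a surprising result.
for item in affected_by("experiment: load test", links):
    print("re-examine:", item)
```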

While these questions cannot be answered in the general case, certain facts remain:

o  You must do a good job of synthesis to understand "the big picture" for the system being created.  Ignoring the neighboring and containing systems is inviting trouble.

o  You don't have to specify everything in order to build anything

o  You must have a way to identify what you don't know, and be able to assess how much danger it creates

o  When there is a critical requirement or approach that must be made, and it carries risk that is sufficiently large, you must form testable hypotheses so that you can verify them later.

Many people believe that risk management is the responsibility of the project manager.  I have mixed feelings about this.

First, risk management has to live wherever there is knowledge or awareness of risks.  The project manager might be aware of schedule risks or people behavior risks, but might not have a deep enough technical knowledge to know about technical risks.

Second, as we've discussed above, a project is a big system that encapsulates schedule, resources, social behavior, and technology concerns.  In many large projects, these areas are further divided into responsibility areas like:

o  project management,

o  product (or business operation) management, and

o  technical architecture.

The trouble is that the activities and results are intertwined, and don't stay as nicely organized.  This is a major risk.  For a complex system, the job is frequently too big for one individual.  However, dividing authority and responsibility can be an impediment to communication.

What are we to do?  Why not make some hypotheses?

o  Functional specialties (like project management, product management, and architecture) give the specialists different responsibilities, perspectives, values and comprehension.

o  These differences between specialists are a result of "conceptual distance" and create a barrier to communication and shared understanding.

o  Unless we can reduce the complexity or shrink the size of the problem, the only solution is to be more effective at reducing conceptual distance.

o  Reducing conceptual distance requires domain experts to work closely together, educate each other, and focus on how smaller decisions impact the big picture.

Conceptual Distance - Its Importance to System Architecture

Linguistics has a term called “semantic distance”.  It is a measure of how closely related two words in a language are.  “May” and “might” have a much shorter semantic distance than “college” and “geriatric”.

If semantic distance is a measure of how closely related two words are, then conceptual distance can be seen as an assessment of how closely related two pieces of knowledge are.  In other words, if 100 individuals understand X very well, what percentage of them will also understand Y, and vice versa?  To illustrate, consider the following concept pairs:

Concept X          | Concept Y            | Conceptual Distance
Pre-Compiled Query | Integrity Constraint | Low
Relational View    | Stored Procedure     | Medium-Low
Outer Join         | XSLT Transform       | Medium
Database Trigger   | Ajax or Flex         | Medium

·       Both concepts in the first row are related to relational database management systems (RDBMS).  Since both concepts are relatively elementary, it is reasonable to expect that somebody who understands one will also understand the other.  Further, it is reasonable to expect that somebody who knows a little bit about an RDBMS is pretty likely to understand both (or learn them quickly).

·       Both concepts in the second row are also related to an RDBMS.  The difference is that the two concepts are slightly more advanced.  Views are powerful, but simple, data hiding techniques for queries.  However, their use in insert and update operations is much more complex.  Similarly, stored procedures are a simple abstraction, but require an in-depth understanding of when to use them and how to design them effectively.

·       The concepts in the third and fourth rows cross software disciplines.  Some database developers also understand XML/XSLT.  Other DB developers understand web-based user interface technologies.  Some cross over all three.  However, some very good database developers might only have a surface understanding of Javascript, XML, or XSL transforms.

Next, let’s repeat this experiment, but this time associate software with non-software disciplines:

Concept X                 | Concept Y              | Conceptual Distance
Thread Priority Inversion | Vacuum Chamber Control | Moderately-High
Stored Procedures         | Radiology              | High
Interface Inheritance     | Japanese Linguistics   | High
XSD Target Namespace      | Credit Default Swap    | High

·       In the first row, thread priority inversion, and how to avoid it, is a relatively advanced topic in realtime operating systems.  Vacuum chamber control is a physics problem, involving relatively advanced electro-mechanical control of actuators/sensors for pumps and vents.  While some experts understand both concepts, this understanding required a significant learning investment in both domains.

·       In the last three rows, the subject matter in column Y is far removed from software.  Someone who has become expert in radiology, linguistics, or swap and derivative markets is unlikely to have more than a passing knowledge of software.  Similarly, someone who has made their career in software development is unlikely to have more than a passing knowledge of radiology, linguistics, or financial instruments (I can personally attest to this).
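The "100 individuals" question posed earlier suggests a rough way to put a number on conceptual distance: look at how much the groups of people who understand each concept overlap. The sketch below is only one possible reading. Averaging the two overlap fractions and subtracting from 1 is my assumption, and the names and group memberships are invented for illustration.

```python
def conceptual_distance(knows_x, knows_y):
    """Estimate distance from the overlap between two groups of people.

    knows_x, knows_y: sets of people who understand concept X / concept Y well.
    Returns 0.0 when everyone who knows one concept also knows the other,
    and 1.0 when nobody knows both.
    """
    if not knows_x or not knows_y:
        return 1.0  # assumption: with no experts to compare, treat as maximal
    both = len(knows_x & knows_y)
    x_also_knows_y = both / len(knows_x)  # "what percentage will also understand Y"
    y_also_knows_x = both / len(knows_y)  # "and vice versa"
    return 1.0 - (x_also_knows_y + y_also_knows_x) / 2


# Invented survey results.
relational_views = {"ann", "bob", "cho", "dee"}
stored_procedures = {"bob", "cho", "dee", "eli"}
radiology = {"fay", "gus"}

print(conceptual_distance(relational_views, stored_procedures))  # 0.25
print(conceptual_distance(stored_procedures, radiology))         # 1.0
```

The "Low" to "High" labels in the tables are judgment calls; a measure like this would only be as good as the survey behind it.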

Why is conceptual distance important?

There are four basic reasons why conceptual distance is important. 

1.     The first reason why conceptual distance is important is that systems are becoming significantly more complex.  In other words, the easy ones have already been built.  Nobody I know is spending much time writing text editors, email readers, or file transfer programs.  Yet, it wasn’t too long ago when many people were focusing a significant amount of effort on exactly these types of things.

2.    The second reason is that as system development efforts become more complex, specialization increases.  In other words, when expertise is demanded, experts must be supplied.  When time is a major constraint and the needed expertise broadens, specialization is a natural response.  As an illustration, consider the rise of specialization over the past 30 years in medicine, computer systems, telecommunications, and many other areas.

In the computer software business, expertise frequently is required from many areas, including:

o      Market experts, who study consumer buying behavior (for products), or business processes (for IT systems) in order to increase adoption and satisfaction rates.

o      Domain experts, who understand deeply the nature of the problem being solved, how things might change, and which approaches have the best chance of being successful.

o      Compliance experts, who advise on legal and regulatory constraints and specify what actions are necessary to comply.

o      Systems experts, who understand systems, both automated and people-based, and understand how they work, how they fail, and how they change over both time and space.

o      Project managers, who know how to coordinate the efforts of many people, and understand work breakdown structures, scheduling dependencies, risk management, communication and team dynamics.

o      Technology experts, who understand the intricacies of databases, networks, user interfaces, algorithms and a vast array of technologies, computing platforms, and programming languages.

It is important to note that these effects are not limited to computing systems.  A similar phenomenon occurs in many other areas where complex systems are deployed, such as health care, mechanical systems, telecommunications, air travel, agriculture, manufacturing, and others.

3.    Encapsulation (information hiding) is the third reason why conceptual distance is important.  Each area of specialization exposes a thin interface that describes what it can do, and hides the expertise that is needed to determine how best to do it.  This often is an effective technique to permit different subject matters to collaborate without requiring each to be expert in the other.

However, in reality encapsulation is a lever.  Most times it magnifies power in our favor, but sometimes it doesn’t.  As much as we try to contain it, complexity has a way of leaking out.  Anyone who has debugged an exception condition deep inside of a component understands how encapsulation can be a fickle ally.  Parts of a system have inter-dependencies and often have conflicting interactions.  Furthermore, the outcome might depend on contextual factors. 

4.    Scalability of Knowledge Transfer.  So, the level of expertise required in many subject matters is expanding.  The number of subject matters is expanding. And encapsulation is not able to hide significant portions of the complexity between inter-dependent subject matters.  Sounds gloomy, doesn’t it?

When conceptual distance is small and not changing rapidly, old time-proven techniques work pretty well.  Establish goals, identify the key obstacles and risks, determine the architectural approaches, define the requirements, organize the work, etc.

However, in many cases, deep expertise is required across several domains, and these areas are evolving and/or interdependent.  This is where scalability of knowledge transfer comes into play.  Enough individual expertise to cover all important subject matters is necessary, but not sufficient.  A high-bandwidth way to communicate expertise between subject matter experts and have it be understood at the receiving end is of utmost importance!
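The third reason above, that complexity leaks out through even a thin interface, is easy to see in code. The sketch below is a made-up illustration (the class names and the tiny word list are hypothetical): the first repository hides its storage behind one method, yet a failure exposes the hidden detail; the second reports the same failure in the caller's vocabulary.

```python
class TranslationNotFound(Exception):
    """Domain-level error: part of the (hypothetical) public interface."""


class LeakyRepository:
    """Thin interface: callers are only supposed to know about translate()."""

    def __init__(self, entries):
        self._entries = dict(entries)  # hidden detail: storage is a dict

    def translate(self, word):
        # A missing word raises KeyError, which exposes the dict underneath.
        return self._entries[word]


class SealedRepository(LeakyRepository):
    """Same interface, but failures are reported in the caller's vocabulary."""

    def translate(self, word):
        try:
            return super().translate(word)
        except KeyError:
            raise TranslationNotFound(f"no translation for {word!r}") from None


def failure_type(repo, word):
    """Return the name of the exception a lookup raises, or None on success."""
    try:
        repo.translate(word)
    except Exception as exc:
        return type(exc).__name__
    return None


entries = {"hello": "hola"}
print(failure_type(LeakyRepository(entries), "goodbye"))   # KeyError
print(failure_type(SealedRepository(entries), "goodbye"))  # TranslationNotFound
```

Note that the second version does not remove the complexity. The word is still missing; the interface has only translated the failure into terms the other side can understand, which is the same job that falls to people when subject matters collide.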

Dimensions of conceptual distance

Conceptual distance appears to have three important dimensions. 

·       First, are the two concepts in the same subject, in related subjects, or in far-removed subjects?  Databases are closer to Web Services or User Interfaces than any of these are to Radiology or Linguistics.

·       Second, how basic or advanced is a concept within its domain?  Time value of money is a simple concept in finance.  The pricing and risk management issues of credit default swaps are much more advanced.

·       Third, how much is already known about each subject (by the people who need to make decisions about it)?  To further magnify this, do these people know what they don’t know, or are they blissfully unaware?

The following diagram illustrates four key subject matter areas in an IT application for developing new content and some cross-dependencies between these areas.

The following table shows how conceptual distance varies significantly depending on how context changes affect the individual subject matters.

Example | Conceptual Distance | Description
3.2.1 maintenance release for a software system | Low | Bug fixes improve quality but don’t change the fundamental value proposition.  Tasks are straightforward and workload is balanced.  Most bugs have simple fixes.  A few require tricky or widespread changes.
2.0 release of an existing software system | Medium | Significant improvements to the value proposition.  Some capabilities fit into the 1.0 architecture, others need significant refactoring.  Some new technologies must be integrated and pose notable risks.
1.0 release of a multi-system framework | High | Several existing products are expected to be integrated into a new, common framework to increase interoperability and reuse.  This effort has technical, product management, and project complexity.

Interestingly, there are many cases where the problem domain is a very close cousin of the subject domain.  For example, Integrated Development Environments, such as Eclipse, NetBeans, and Microsoft Visual Studio, are designed by developers for use by developers.  The same is true for programming language environments, such as Ruby on Rails, Python, J2EE, and .NET, or open source software tools environments, such as Apache and SourceForge.  In each case, the system creators are delivering technology to a market that they can relate to directly.  This is a powerful, but accidental, way to lower conceptual distance.

In summary, conceptual distance is important because:

o      The simpler problems have already been solved,

o      Deeper levels of expertise from more subject matters are needed to solve the more complex problems.

o      Encapsulation helps contain complexity, but cross-dependencies between subject matters can trump information hiding.

o      Scalability of knowledge transfer requires a new way for groups of subject matter experts to think about communication and comprehension.

Conceptual distance and complexity

Imagine two beams about 2 feet apart, that extend out into the distance.  The beams appear to be parallel, but in reality are about one degree off.  As you walk forward, one foot on each beam, you cannot sense that they are gradually getting further apart.  After 50 feet, they are about 10 inches further apart.  However, by 200 feet they are 3.5 feet further apart!  Conceptual distance is similar.  It sneaks up on you, and then when you can least afford it, it attacks.
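For anyone who wants to check the arithmetic, the extra gap is the distance walked times the tangent of the angle between the beams. This is a quick sketch, assuming one beam runs straight and the other veers off by the full one degree; the function name is mine.

```python
import math


def extra_gap_feet(distance_feet, angle_degrees=1.0):
    """Extra separation after walking distance_feet along beams that diverge by angle_degrees."""
    return distance_feet * math.tan(math.radians(angle_degrees))


for distance in (50, 200):
    gap = extra_gap_feet(distance)
    print(f"after {distance} ft: {gap:.2f} ft ({gap * 12:.1f} in) further apart")
```

This works out to roughly 10.5 inches at 50 feet and just under 3.5 feet at 200 feet, in line with the figures above.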

As Frederick Brooks aptly observed in his article "No Silver Bullet", there are two types of complexity: essential and accidental.  Essential complexity is part of the problem domain.  It is what it is.  You can’t reduce it or eliminate it.  Accidental complexity is what is added by the solution, as design decisions are made and code is written.

Conceptual distance suggests that essential and accidental complexity represent one important dimension.  The other dimension is visible vs. transparent vs. invisible complexity, as illustrated below:

Visible complexity is complexity that can be recognized, communicated and understood by whomever it affects.  It occurs when conceptual distance is small – where one expert recognizes the potential for a problem and is able to explain “what” and “why” so that others can understand both the immediate and potential impact.  This implies three critical things that all must be present:

·       The subject matter expert who detects the potential problem must know enough about the other subject matters to recognize whether this is something that is significant.

·       The subject matter expert must know enough about the other subject matters to be able to express his concerns in terms that the other experts can follow.

·       The other subject matter experts must know enough about the subject with the potential problem to understand its implications on their areas.

Transparent complexity is complexity that is noticed (or noticeable), but is not acted on, because it is not communicated or understood properly.  The Challenger shuttle disaster is a well-known case.  The Wikipedia page on the Challenger Space Shuttle Disaster says:

NASA managers had known that contractor Morton Thiokol's design of the SRBs contained a potentially catastrophic flaw in the O-rings since 1977, but they failed to address it properly. They also ignored warnings from engineers about the dangers of launching on such a cold day and had failed to adequately report these technical concerns to their superiors.

In this case, NASA managers did not understand the risks in the O-ring design and the impact of cold weather on these risks.  At the same time, the engineers who were aware of the risks did not communicate them in a way that made the risk impossible to ignore.

Invisible complexity is unknowable, at least given the current state of our knowledge.  Predicting the timing and magnitude of a natural disaster falls into this category.  Financial risk offers another example: most investment firms treat U.S. Treasury Bills as “risk-free” investments.  The argument is that the U.S. Government can always implement taxes to pay its debt obligations, therefore it will never default.  The use of “always” and “never” in this sentence is a bit worrisome.

Shrinking conceptual distance

As conceptual distance expands, misunderstandings, risk, and rework increase.  Quality declines, and the time and resources needed to design and implement a new system skyrocket.  In short, as complexity grows, our ability to deal with it doesn’t seem to keep pace.

On the surface, the problem seems to be caused by the limits of our ability to absorb and comprehend information.  Over the past century, our bodies of knowledge have exploded in both number and size.  Fifty years ago, a high school diploma was enough to qualify someone for a decent job.  Today, a bachelor’s degree might not be sufficient.

Clearly, availability and accessibility of information are not the problem.  Internet search engines put vast quantities of information at our fingertips in seconds.  No, rate of absorption is the limiting factor.  When 10 inches of rain fall in 5 hours, some of it is absorbed into the ground, but most of it runs off and floods low-lying areas.

All subject matter experts must be strong in each of the following skills:

·       Modeling.  UML class diagrams, activity diagrams, and finite state models are not just for software.  These are very useful techniques for communicating types, relationships, collaboration, process flow and state-dependent behavior.

·       Synthesis.  Subject matter experts need to be able to begin at a system boundary and look outward to discover its purpose, contexts, and forces.  The ability to communicate why is essential to determining the appropriate what and how.

·       System statics.  Subject matter experts must be able to quickly and clearly communicate how any system is organized and how it operates.  Here, we go beyond the techniques discussed under Modeling, and focus on guiding principles of organization and operation.

·       System dynamics.  All interesting systems change in both space and time.  Subject matter experts must be able to explain how feedback loops change the forces that affect a system.  They also must be able to identify and assess external contexts and communicate how common or variable their impact is.

·       Risk Analysis.  Development of any sort is a forward looking activity, and is subject to uncertainty.  Risk is what happens when uncertainty collides with outcome.  Risk analysis involves estimating the probabilities of a set of uncertain events, and the impact on something valued of each event (positive or negative).

·       Communication.  None of the five skills mentioned above are truly useful to a group of subject matter experts, unless:

o      Each individual can communicate, by writing or speaking, in a way that enables others to comprehend the complexities.

o      More importantly, each individual is open to receiving information from others, by listening or reading, and is capable of comprehending the complexities in their message(s).

·       Self-awareness.  It is essential for each subject matter expert to know what they don’t know and to engage somebody who does.

·       Commitment to Reducing Conceptual Distance.  Finally, subject matter experts must realize that success is a team effort, not an individual showcase.  This requires more than just cooperation.  This requires each expert to take time to describe how their subject matter relates to the “big picture”: goals, opportunities, obstacles, risks, and tradeoffs.  It requires an active commitment to contribute to the “collective body of knowledge.”
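To make the Risk Analysis skill in the list above a bit more concrete, here is a minimal sketch of the bookkeeping it describes: each uncertain event gets an estimated probability and an impact (positive or negative) on something valued.  The event names, the numbers, and the choice of person-days as the unit are all invented for illustration, and weighting impact by probability is only one simple way to combine the two.

```python
from dataclasses import dataclass

@dataclass
class RiskEvent:
    name: str
    probability: float  # estimated chance the event occurs, 0.0 to 1.0
    impact: float       # effect in person-days; negative = loss, positive = gain

def expected_impact(events):
    """Sum of probability-weighted impacts across all events."""
    return sum(e.probability * e.impact for e in events)

def rank_by_exposure(events):
    """Order events by the size of probability * impact, largest first."""
    return sorted(events, key=lambda e: abs(e.probability * e.impact), reverse=True)

# Hypothetical events and estimates, for illustration only.
events = [
    RiskEvent("key vendor API changes", 0.30, -20.0),
    RiskEvent("reusable component fits as-is", 0.50, 10.0),
    RiskEvent("performance target missed", 0.10, -40.0),
]

print(f"expected impact: {expected_impact(events):.1f} person-days")
for e in rank_by_exposure(events):
    print(f"{e.name}: {e.probability * e.impact:+.1f}")
```

The arithmetic is trivial; the hard part, and the part that depends on small conceptual distance, is getting experts from different subject matters to agree on the events and the estimates.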

 

 

View Article  A Tale of Three Product Types

Imagine, if you will, three projects that run concurrently in three different companies which develop and market software-intensive products.

One is technology-centric. Another is application-centric. The third is deployment-centric.

What does this mean?

The following figure summarizes the essential characteristics of the three types:

A technology-centric project seeks a breakthrough solution to a complex technical problem that has broad utility. GPS receivers, wireless network adapters, and biometrics are all examples of technology-centric developments.

An application-centric project seeks to apply one or more technologies to improve the productivity or enjoyment of a few types of users. For example, a company might:

combine a GPS receiver, color graphic display with touch screen, embedded computer, and street map data to develop a handheld GPS unit.

create a patient monitoring device that can continuously monitor a patient's vital signs and transfer raw data and alerts to a central monitoring station.

integrate biometrics with hand-held, laptop, and desktop computers to provide user authentication.

A deployment-centric project seeks to coordinate the activities and improve the effectiveness of an organization. Examples include departments, cross-functional teams, an enterprise, or a supply chain.

Of course, these three categories are archetypes. There are several additional examples that fit in between. One example is an application such as instant messaging, which is targeted at relatively small groups of users. Another is where existing technology must be adapted and improved in order to create a viable application. For instance, miniaturized video cameras mounted in catheters can be inserted into a patient's bloodstream to allow a cardiologist to view the inside of the patient's coronary arteries. In any case, these three archetypes provide three useful points of view about software architecture formulation.

Why are these archetypes significant?

These archetypes are significant because they represent very different situations, and require different approaches to software architecture and project management. In other words, the conventional wisdom that there is a typical software development process is, quite simply, flawed. Let's use some common metaphors to illustrate why this is true.

Technology-centric projects are like climbing a 200-foot sheer rock face. In both cases, the goal is to discover a solution that overcomes a set of complex obstacles. Experimentation, discipline, persistence, and the ability to cope with side-effects are keys to success. Elegance is nice, but results and urgency dominate.

Application-centric projects are like driving solo from Boston to San Diego this year. Modern applications leverage platforms like Java and .NET, and a host of commercial and open source components and frameworks. Similarly, the quality of both the Interstate Highway System and modern automobiles removes a great deal of complexity (by contrast, consider the same trip 150 years ago in a covered wagon). Many applications deal with minimal levels of variation, and rely on actors, use cases, features, and functional decomposition. In the same way, cross-country drivers expect different states to have similar traffic laws, road conditions, and gas stations.

Deployment-centric projects are like being a manager for a Major League Baseball team. Deployment-centric projects experience high levels of variability, both across deployment contexts and over time. Similarly, many of the decisions a baseball manager makes are dictated by the game situation and the batter-pitcher matchups. Just as deployment-centric projects must get the most out of the efforts of many individuals in an organization, so must a baseball manager find the right roles for each of his players. A deployment-centric system must accommodate different policies and collaboration styles of its users, just as a baseball manager must find different methods to motivate and teach different players. Finally, a deployment-centric system must interoperate with a wide variety of external systems, just as a baseball manager's performance is scrutinized by the fans, media, and owners (often not with the same criteria).

So, why do these archetypes matter in the real world?

As organizations and products grow older, they often evolve from one archetype to another. Unfortunately, since this evolution occurs over months or years, organizations tend to cling to their old processes and ways of thinking. And this is where the seeds of woe are sown. Let's use two examples to illustrate this point.

First, consider a company that is evolving from technology-centric to application-centric. As a technology-centric organization, the primary struggle was overcoming external constraints: logical, electrical, chemical, or mechanical. Experimentation was necessary to discover the solution, and innovation was valued over following the rules. Finally, since the technology usually represented core intellectual property, in-house solutions tended to be preferred over external ones.

Now think about this. How many of the approaches and tendencies that work well for technology-centric development don't fare as well in an application-centric context? Each of the ones mentioned in the previous paragraph morphs from a strength into a weakness:

A focus on constraints rooted in science and math versus user experience and functionality leads to introverted design, which may ignore user needs.

Innovation over rules and guidelines can lead to expediency and a decay of architecture and design rules (making code brittle and hard to understand).

Internal IP leads to preferring to build software in-house instead of searching outside for a viable solution. This leads to non-standard capabilities and more code to write, test, review, and support.

Now, consider a company that is evolving from application-centric to deployment-centric. As an application-centric organization, the primary struggle was developing applications that were fast to learn, helped the users to be more productive, and gave advanced users ways to do sophisticated things and automate repetitive tasks. The Microsoft Office family of applications follows this formula. Now, interesting things happen when an application-centric product evolves into a deployment-centric product.

Consider a source code control product. Initially, these products are developed as groupware applications that serve a limited set of actors (developers, project managers, and quality assurance). Emphasis is placed on use cases and features, such as version history and differences, check in/out, and branch/merge. The system is partitioned functionally; the user interface, controller, database layer and difference engine are encapsulated.

Over time, as the source code control system gets deployed, users appreciate this functionality, but want more. They want to make the development process run more smoothly by connecting the source code control system to some related applications. These include: defect tracking, programmer IDE, automated build, continuous integration, requirements management, project tracking, and change management. Of course, there are several products in each of these areas, and they all do similar things in a slightly different way. Since there are no industry standards to govern the collaboration with these types of systems, a source code control vendor has little choice but to figure out a proprietary solution that is satisfactory to all parties. After working out an approach with the first vendor of one of these partner applications, what are the odds that this interface will be acceptable to the next vendor of the same application, or the vendor after that?

Variability occurs in other areas, also. In the beginning, source code control systems are used by a moderate number of developers (10-40), who work at a single site. Next, developers want the ability to access the source code control system from home, so thin clients are implemented to support Internet access. Next, companies begin to outsource projects to India, China, and Eastern Europe. Now, there are dozens of developers half a world away from the primary team, making concurrent changes, and requiring fast access to the source code control system.

What are the main implications?

Under these types of stresses, many application-centric systems quickly turn into giant piles of difficult-to-understand spaghetti. The software becomes hard to modify and brittle. One successful defense is refactoring. Instead of anticipating future changes, the development team addresses the real ones as they come. When a new set of requirements would create an unstable solution, the code is reorganized into a design that would have been appropriate if the new requirements had been anticipated. This approach relies heavily on a thorough complement of automated unit and functional regression tests. These are necessary to make sure that the refactoring process didn't result in defects in previous capabilities. Unfortunately, refactoring and a full complement of regression tests require a substantial amount of effort to develop and maintain. While the effort and risk are demonstrably less than manual testing or no testing, schedule pressure can undermine the best intentions.
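As a small illustration of how regression tests protect a refactoring, the sketch below keeps an original function and its reorganized replacement side by side and checks that they agree on a fixed set of cases. The function names and the line-counting behavior are hypothetical, invented to echo the "differences" feature of a source code control system; they are not taken from any real product.

```python
def summarize_changes_legacy(old_lines, new_lines):
    """Original implementation: counts added and removed lines with nested loops."""
    added = 0
    for line in new_lines:
        found = False
        for other in old_lines:
            if line == other:
                found = True
        if not found:
            added += 1
    removed = 0
    for line in old_lines:
        found = False
        for other in new_lines:
            if line == other:
                found = True
        if not found:
            removed += 1
    return {"added": added, "removed": removed}

def summarize_changes(old_lines, new_lines):
    """Refactored implementation: same behavior, expressed with set lookups."""
    old_set, new_set = set(old_lines), set(new_lines)
    return {
        "added": sum(1 for line in new_lines if line not in old_set),
        "removed": sum(1 for line in old_lines if line not in new_set),
    }

# Fixed inputs that capture the behavior existing users depend on.
REGRESSION_CASES = [
    ([], []),
    (["a", "b"], ["a", "b"]),
    (["a", "b"], ["b", "c"]),
    (["a", "a", "b"], ["b"]),
    ([], ["x", "x"]),
]

def test_refactoring_preserves_behavior():
    for old, new in REGRESSION_CASES:
        assert summarize_changes(old, new) == summarize_changes_legacy(old, new)

test_refactoring_preserves_behavior()
```

Even in a toy like this, the test cases are nearly as long as the code they protect, which hints at why a full regression suite is expensive to build and maintain.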

The point here is that things change, in complex ways. Because of the very nature of improving productivity in significant-sized organizations, the architectures of deployment-centric systems will almost always be exposed to greater stress than other types. Refactoring and regression testing are very useful tools, in the same way that cement foundations and cross beams are useful ways to support a house. But if you are building in California and are concerned about earthquake damage, then they might not be sufficient.

A different approach is required, one that helps you to deal with contextual variation. One example of such an approach is the topic of the next entry.

View Article  Lift Ticket Refunds

This post is a short interlude in the "Alignment" series, but it illustrates an interesting aspect of value models, so I thought I'd discuss it.

While driving to work one morning about a week ago, I heard a radio ad for a ski resort called Loon Mountain, located in northern New Hampshire (thanks to a co-worker, Matt Allen, for his efforts in tracking down the name of the resort).

Background

Before describing the specifics of the ad, let me first provide a little background.  So far, this winter has had relatively little snowfall in New England, even in the mountains.  On top of that, during much of the month of December, the temperatures have been warmer than normal.  As a result, the ski resorts have not been faring well.  Even with their snow making equipment and some cold weather in '07, the resorts are operating only a fraction of their trails, and I'm told that the conditions are not as good as when there is real snowfall.

Some avid skiers I know tell me that when they go, they arrive early.  Since there are so few trails open, there is an overload of skiers on each trail.  This can make skiing like driving in rush hour traffic, only in this case, the numerous skiers can degrade the conditions on the trails.

So all of this complicates matters further for the ski resort owners.  Many prospective skiers, facing a multiple hour drive from their home to the mountain, look out their windows and don't see snow in their backyard.  So they wonder whether or not it is worth making the trip, paying for the lift tickets, only to discover that the conditions aren't very good.  Some decide to go, many to quiet the kids, who have been looking forward to skiing all week.  Many decide to stay home, and go bowling or go to a movie instead.

Given the economics of operating a typical ski resort, this is not an enviable situation.  Ski resorts carry a heavy burden of fixed costs (lifts, snow making, grooming, property taxes, debt service).  They earn some amount of "fixed revenue" from people who buy season passes, but depend on the revenue from daily pass buyers to cover expenses and make money.  Since there are only a fixed number of days in ski season (at most 5 months, from the end of November to the end of April), it's pretty easy to see how mild winters create financial hardships.

The Ad

So, Loon Mountain ran the radio spot in the greater Boston area, and also posted the offer on their web site.  It says:

Loon's Unconditional Satisfaction Guarantee assures that you, your friends, and your family will be completely satisfied with your visit. If, for any reason you are not, bring your lift ticket to our Guest Information Desk by 11:00 a.m. to receive a voucher for an equivalent ticket. Simply put, if you're not satisfied, we're not satisfied.

So the intent of the ad seems clear.  Loon can tell that business is down, and that the lack of natural snow, the limited number of trails, and risk-averse daily-pass skiers are contributing factors.  So they want to tilt the risk equation of these skiers, and cause a few more of them to decide to go skiing, instead of staying home and doing something else.

Value Modeling Relevance

There are many factors that enter into someone's decision about whether or not to go skiing.  We'll focus on the following:

o  Pre-disposed to ski (season pass, on a ski vacation, etc.) vs. Daily decision

o  Shorter drive vs. longer drive (to get to the resort)

o  Avid skier vs. occasional skier (how high a priority going skiing is)

o  Tolerant of conditions vs. preference for good conditions

The radio ad is targeted at daily fee skiers who are put off by inferior conditions.  This lets us ignore the season pass skiers, and those who accept inferior conditions.  Also, we can take the people on ski vacations and combine them in with the short drive crowd.

This leaves us with 4 interesting value contexts:

    A.  Skiers who live relatively close to the mountain (say 1 hour or less)

        A.1  Avid skiers who are somewhat intolerant of inferior conditions

        A.2  Occasional skiers who are somewhat intolerant of inferior conditions

    B.  Skiers who live relatively far away (say more than 1 hour drive)

        B.1  Avid skiers who are somewhat intolerant of inferior conditions

        B.2  Occasional skiers who are somewhat intolerant of inferior conditions

Now, let's consider this ad from the point of view of the people in these groups who are making the decision of whether to ski or stay home.

The skiers who live relatively far away have an interesting dilemma.  They have two significant uncertainties driving their decision.  First, are the conditions going to be acceptable?  Second, if they aren't, how much time will they waste driving up and back?  Several of those with significant drive time may reject the ad's offer, because a free ski pass isn't worth the time and gas expense.  For those who are influenced by the offer, many will be reluctant to ski for an hour, get undressed, and drive home.  So, they'll probably end up staying for the day, in spite of the conditions.

The skiers who live relatively close have a slightly different decision.  Because of the shorter drive time, the risk of wasting a significant amount of travel time is less.  This makes the ad's offer somewhat more attractive.  In addition, because they live closer, it is easier to get to the mountain earlier, giving them more opportunity to ski before the 11:00 AM cutoff.  In fact, it would be possible for less scrupulous skiers to parlay the offer into skiing for two mornings for the price of one day.

Summary

In summary, the ad represents a creative attempt by Loon Mountain to overcome a difficult winter.  It directly addresses one of the major risks that many skiers are concerned about - will the conditions be to my liking, and if not, do I waste the price of my lift tickets?

However, it does nothing to address one of the other significant risks - I cannot recover my round trip travel time if the conditions are unacceptable.   When this uncertainty has a higher priority, the ad will have little or no effect.  When this uncertainty has a lower priority, the ad's effect is likely to be greater.
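The two risks can be put side by side in a rough value model.  The sketch below is my own simplification, not anything Loon Mountain published: a skier goes if the expected value of the trip beats staying home (valued at zero), a day with unacceptable conditions is worth nothing, and the voucher is assumed to be worth the full ticket price to the skier.  Every dollar figure and probability is made up for illustration.

```python
def expected_trip_value(p_good, day_value, ticket_price, travel_cost, guarantee=False):
    """Expected net value (in dollars) of driving to the mountain.

    p_good       -- estimated probability that conditions are acceptable
    day_value    -- what a day of good skiing is worth to this skier
    ticket_price -- cost of a daily lift ticket
    travel_cost  -- time and gas for the round trip, never refunded
    guarantee    -- if True, a bad day returns a voucher assumed to be
                    worth the full ticket price
    """
    value = p_good * day_value - ticket_price - travel_cost
    if guarantee:
        value += (1 - p_good) * ticket_price
    return value

# Hypothetical skiers who differ only in the cost of the drive.
scenarios = {"close (1 hour or less)": 20.0, "far (more than 1 hour)": 60.0}
for label, travel_cost in scenarios.items():
    without = expected_trip_value(0.5, 150.0, 70.0, travel_cost)
    with_offer = expected_trip_value(0.5, 150.0, 70.0, travel_cost, guarantee=True)
    decision = "go" if with_offer > 0 else "stay home"
    print(f"{label}: without offer {without:+.0f}, with offer {with_offer:+.0f} -> {decision}")
```

With these invented numbers, the guarantee flips the nearby skier from staying home to going, while the distant skier still stays home because the unrecoverable travel cost dominates.  Different estimates would move the break-even point, but the shape of the result matches the argument above.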

An interesting question is whether there is anything Loon Mountain could do to address the travel time risk.  One might be to offer some information on trail conditions on its web site that is updated in near real time.  Another might be to offer coupons, worth 50% of the price of the daily passes, that are good at nearby restaurants or shopping outlets.

The bottom line is that this is an excellent, simple example of th