Your Team as a Distributed System - Andrew Harvey
The presentation by Andrew Harvey, the CTO in Residence at Microsoft for
Startups, focussed on the people aspect of tech.
A number of the topics that Andrew covered are close to my heart, especially the Peter Principle, which states that
people are promoted to their level of incompetence. Andrew talked about the chance of promotion increasing as a
developer’s technical ability increases; this ultimately leads to them moving into a management position, and as many
people will say, technical proficiency has no bearing on management ability. With this in mind, organisations need to
ensure that the people they move into management roles have the right skills to be a manager, and they need to support
them with training and mentoring so that they are able to lead and manage. I really appreciated Andrew converting this
problem into
technical terms by saying “people don’t throw stack traces; instead they silently fail, segfault and then uninstall
Andrew then described how a team can be considered a distributed system. A distributed system has multiple processes
(in this case team members), there are inter-process communications (team members talk), there is a disjoint address
space (each team member is maintaining a different state in their mind) and there is a collective goal (hopefully the
team has this). With the knowledge that a team is a distributed system, we can apply the Fallacies of Distributed
Computing to understand, in a technical sense, some of the challenges
of leading and managing people.
Fallacy 1: The network is reliable
The first fallacy is easy to prove in a team environment. We just need to remember the last time we accidentally
missed an email, forgot to respond to a Slack message, or forgot something that was said in a meeting. Once we accept
that our network is unreliable (if it were a computer network, we’d fire the network admin because it’s so bad) we can
start to recognise that we need to use different communication protocols for different types of information and for
different recipients. If the wrong protocol is used, then the communication will likely be lost. Again, in networking
terminology, we need to use TCP not UDP; if we use UDP we have no idea if our communication was received. With TCP we
have some additional overhead but get confirmation that the message was received.
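To make the TCP analogy concrete, here is a minimal Python sketch of acknowledged delivery: a message is re-sent until
the recipient confirms it, up to a limit. The names `send_with_ack`, `channel` and `flaky_channel` are hypothetical,
invented for this illustration; they are not from Andrew's talk.

```python
from typing import Callable


def send_with_ack(message: str, channel: Callable[[str], bool], max_attempts: int = 3) -> int:
    """Send `message` until `channel` acknowledges it.

    `channel` is a hypothetical stand-in for any medium (email, chat, a
    meeting); it returns True only when the recipient confirms receipt.
    Returns the number of attempts used, or raises if none was acknowledged.
    """
    for attempt in range(1, max_attempts + 1):
        if channel(message):
            return attempt  # confirmed: stop re-sending
    raise TimeoutError(f"no acknowledgement after {max_attempts} attempts")


def flaky_channel(drops: int) -> Callable[[str], bool]:
    """Build a channel that silently drops the first `drops` messages."""
    remaining = [drops]

    def deliver(message: str) -> bool:
        if remaining[0] > 0:
            remaining[0] -= 1
            return False  # message lost, no acknowledgement
        return True

    return deliver


if __name__ == "__main__":
    attempts = send_with_ack("Release moved to Friday", flaky_channel(drops=2))
    print(f"acknowledged after {attempts} attempts")
```

The fire-and-forget (UDP-style) equivalent would call `channel(message)` once and ignore the result; the repeated
attempts are the additional overhead that buys confirmation.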
Fallacy 2: Latency is zero
If you need proof that latency isn’t zero, simply send an email to someone and don’t do anything else until you get a
response; you will quickly discover there can be significant latency in our human network. The members of our team
have to prioritise their tasks, so a reply may not arrive immediately. The latency caused by a communication delay can
trigger team members (or teams) to perform conflicting work, it can create deadlocks where people are unable to proceed
due to unfulfilled dependencies, and it can become quite wasteful without appropriate management.
Fallacy 3: Bandwidth is infinite
Unfortunately, as humans, we do not have unlimited bandwidth. We have a limited ability to express ourselves, our ideas
and our knowledge. Our communication mediums are also limited: when we talk face-to-face we have more bandwidth
available as we can get a better idea of body language; in video calls we lose the body language but still have access
to facial expressions; on a phone call we can still hear vocal inflections even though we can’t see the person’s face;
and when we get down to text communications we often have to rely on emojis. As we receive information, we fill in
the blanks to make up for lost bandwidth. This can lead to miscommunication and misunderstandings.
Fallacy 4: The network is secure
In most situations we can make an assumption that our human network is relatively secure and no one is actively
sabotaging it, although sometimes people are looking out for their own self-interests which can be a form of
What is more likely is that our network is prone to corruption. Every time information is communicated it changes to
some degree; the more points of relay, the less likely it is that the information being conveyed will be accurate. It is
important to ensure that we verify what we heard with the person telling us some information to reduce the chance of
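The effect of relays can be put into rough numbers. The sketch below assumes, purely for illustration, that each relay
independently passes a message on unchanged with a fixed probability; the function name `chance_intact` and the 0.9
figure are hypothetical, not measurements.

```python
def chance_intact(per_hop_fidelity: float, hops: int) -> float:
    """Probability that a message survives `hops` relays unchanged.

    Assumes each relay independently preserves the message with
    probability `per_hop_fidelity` (an illustrative assumption).
    """
    if not 0.0 <= per_hop_fidelity <= 1.0:
        raise ValueError("per_hop_fidelity must be between 0 and 1")
    if hops < 0:
        raise ValueError("hops must be non-negative")
    return per_hop_fidelity ** hops


if __name__ == "__main__":
    for hops in (1, 3, 5):
        print(hops, round(chance_intact(0.9, hops), 3))
```

Under that assumption, even a fairly faithful 0.9 per relay leaves only about a 59% chance that a message arrives
intact after five relays, which is why checking back with the original source is worthwhile.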
Fallacy 5: Topology doesn’t change
The topology of our human network changes frequently. This could be driven by corporate restructures, by the scaling
(either up or down) of a business or team, or by staff turnover. I’ve never seen a team remain stable; there will
always be changes in the topology of our human network, and we need to ensure this is accounted for in all actions of
the team. This could mean documenting processes, removing single points of failure, and having easy onboarding
processes to ensure a change in team topology has the minimum possible impact.
Fallacy 6: Only one administrator
Many businesses are structured to have a single administrator, but in practice there are always multiple people
controlling the flow of information. The board or CEO may set a direction for the business, but as the message is
communicated down through the layers of the hierarchy, each layer adds or removes information to better align with its
own agenda. Even at the individual contributor level, each person will focus on what’s important to them. Because of
our ability to individualise each requirement we need to ensure that consensus is reached and that all parties are
working toward the same goal.
Fallacy 7: Transport is free
Many of us are acutely aware of the cost of communications, from the time spent in a meeting to the context switching
and loss of productivity from interruptions. It is important to balance the need for communication against the required
productivity. With too little communication everyone is working toward a different goal; with too much communication
no one can achieve anything.
Fallacy 8: Network is homogeneous
The final fallacy is that the network is homogeneous. Every team and every team member has a different driver or
motivator; we need to learn what each of these is and optimise the interactions as early as possible. The closer
we can get to a homogeneous environment, the easier the communications become. We don’t have to align the drivers of
every team member, but helping them to understand each other will make communications more effective.
Conclusions I’ve Drawn from the Fallacies
Recognising that our team or human network is a distributed system, and is subject to the fallacies of distributed
computing, leads me to the belief that our primary role as leaders and managers is to coordinate the communications
between the components of our system (the people). If our team is not delivering in a coordinated fashion at the
expected rate of delivery, we are failing as managers to optimise our network.
Our primary role as a leader or manager is to manage and facilitate communication. To achieve this we need to ensure
that messages are delivered clearly and via appropriate means, that delays are kept to a minimum, that messaging is
understood and acknowledged, that we can cope with change, that we don’t under- or over-communicate, and that we align
as much as possible.
If we treat our team as a distributed system and account for the deficiencies of these systems in all that we do, then we
can be successful as a leader and our team can be successful. If we fail to allow for the deficiencies, or if we are
unable to negate the negative impacts of these deficiencies, then we have failed as a leader and manager.
Andrew talked about his thoughts on scaling teams and the need to monitor team health and resolve issues quickly,
whilst ensuring that we know the how, what, why and where we are going. He reminded us that conflict resolution is
a required skill for a leader, and if conflict is left unattended it will only get worse.
Especially when managing managers, it is important to have skip-level one-on-ones. Although I have talked about
one-on-ones in “One-on-Ones Don't Exist in the Scrum Guide - Why do we do them?”, I didn’t cover the skip-level
one-on-one. A skip-level one-on-one
is important as it enables team members to tell you the information they want you to hear when an intermediate manager
may be acting as a filter. As a senior leader it can help you to understand the real pain points from an individual,
and may help to seed ideas for improvements within the organisation. It may also show up a deficiency in an
intermediate manager’s ability.
Another aspect Andrew talked about was ensuring there is no single point of failure. This is something I’ve been
passionate about throughout my career, and it likely stems from when I ran my own business. If a single point of
failure exists, be it a core system, a team member with a large amount of knowledge, or a CTO who needs to approve
every decision, a small issue can have a massive effect. If we look at a team member with a large amount of undocumented
knowledge, what happens if they are sick or they resign? If a CTO has to approve every decision then that CTO can’t
take a holiday, or if they do then the entire team will cease working because the decision maker is unavailable. As
leaders we have to ensure we empower people to make independent decisions; to do this we need to ensure they have enough
context to make the decision, and that all the required information is easily available to them.
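One rough way to spot single points of failure is to map each area of knowledge or responsibility to the people who
can cover it and flag any area that only one person covers. The sketch below is a hypothetical illustration; the
function name `single_points_of_failure` and the example team are made up.

```python
from typing import Dict, Set


def single_points_of_failure(knowledge: Dict[str, Set[str]]) -> Dict[str, str]:
    """Return each area that exactly one person covers, mapped to that person."""
    return {
        area: next(iter(people))
        for area, people in knowledge.items()
        if len(people) == 1
    }


if __name__ == "__main__":
    # Hypothetical team: names and areas are made up for illustration.
    team_knowledge = {
        "deployments": {"Asha"},
        "billing": {"Asha", "Ben"},
        "architecture approval": {"CTO"},
    }
    for area, person in sorted(single_points_of_failure(team_knowledge).items()):
        print(f"{area}: only {person}")
```

Any area the function reports is a candidate for documentation, cross-training or delegated decision making.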
Coupled with this is the need to ensure there is clarity of roles. Who is leading the team? Who is driving the team?
Are they the same person? What are the goals and drivers for each team member and how can they be coordinated for the
maximum benefit and minimum friction of the team? We need to ensure that collective goals are communicated clearly
and that actions align with these goals. There is no point communicating that the focus should be on the customer if
all the actions being taken are focussed on profitability with little regard given to the customer.
As Andrew’s presentation drew to a close, he talked about what culture is. It’s almost impossible to miss the focus on
culture, especially within IT organisations, but very rarely does anyone define culture and what it does. If there is
one thing you take from this article about culture, it should be the following paragraph:
Culture eats strategy for breakfast! I know I’ve heard this before, although I cannot recall where, but it is
something that needs to be remembered. It doesn’t matter what strategies you put in place; if they don’t align with
the culture they will fail. Remember that as a company scales, culture will set the scene: a culture of laziness will
breed more laziness, and a culture that allows unethical behaviour will generate more unethical behaviour. Once a culture
has been created it is hard to change; it is built through decisions, and it will take time for decisions to influence
and change an ingrained culture. When building a team, remember that culture is created by decisions such as who we
hire, who we fire (and why), the actions that we reward, but also the actions that we accept.