Advice to Gen Z training in data science: more cohesion, less morale

Becoming a data scientist requires training, and there’s more to that than learning lists of techniques.

Pass me my pipe and slippers, and I’ll tell you the story of how much better it was in the good old days …

… I’ll stop there. Firstly, we Xennials aren’t that old, even though some of our cultural references are starting to look a bit dated. Daddy-Oh.

More importantly the world, or at least the UK, seems to be improving culturally (if not in many other ways). As Gen Z joins the workplace, they seem far more focused and emotionally stable than we were back in the amped-up, alcoholic noughties. Few of us were headed to the gym after work in 2003.

But with all this talent flooding into data science, more experienced practitioners have a responsibility to provide an environment in which people can grow and thrive. Meanwhile Gen Z data scientists have a responsibility to make the most of their potential. The parable of the talents and all that.

My concern is that we might all be failing.

To unpick this, let’s take a look at how we were trained back in the day. Then we’ll take that apart, and contrast it with today’s received wisdom. Finally we’ll search for any babies that might have been thrown out with the bathwater. Spoiler: I’m going to suggest that it comes down to that distinction between morale and cohesion. More on that later.

What was noughties-style training trying to achieve?

Back in the day, the concept of “data scientist” didn’t exist, but the concept of “scientist who works with data” very much did. We were trained as scientists.

It’s probably worth pausing at this point to take a look at what that means. Scientists share a set of behaviours that are frankly unnatural. Humans in the wild are Earth’s champion pattern recognition machines. We constantly learn and generalise from tiny amounts of data, classifying and predicting threats and opportunities. Overfitting is a price worth paying for survival. Additionally, we rely on social groups bonded by sets of shared beliefs and rituals. To an outsider, these beliefs look odd, but they exist to reinforce group stability.

So we have a natural human tendency to believe things that aren’t supported by evidence. However there’s good news: society has developed a bastion against this. Evidence-based science. And the defenders of that bastion are “the scientists”.

The Unnatural Triad

Now what behaviours should we want scientists to exhibit? Well firstly they shouldn’t believe things without evidence, because “someone said so” — no matter who that person is. Annoying if you’re their manager, right?

However it’s not enough to be skeptical about other people’s claims. That’s the easy bit! You have to be skeptical about your own claims! It’s far more likely that you’ve made a mistake than have discovered something new. So caution is a part of the job description.

Finally, a scientist should be resistant to spreading false knowledge. They must carefully match their claims to the quality of the evidence — though in a “publish or perish” world, that seems to be a standard that has slipped.

So we now have this triad—skepticism about what you’re being told, constant probing of your own beliefs, and dispassionate dissemination of your work. These are unnatural behaviours for a social ape that evolved on a savannah. And to get people to do unnatural things requires training.

Training, noughties style

So how did they do it back in the noughties? Well frankly it was: break ’em down and if you were lucky, build ’em up again. Intellectual hazing was pretty much universal. We all dreaded the Friday afternoon one-to-one meeting with our line manager. People on the team regularly threw up, or had stress-induced migraines. Afterwards we would drift one by one to the pub and drown our frustrations.

Toxic, right? And yet, I still remember the chewing out I received for not labelling my axes. We quickly learnt the standards expected of us, and made sure we went into that meeting prepared.

Then there were the presentations. Experienced team leads would pick away at the work done by junior members of competing teams. Clashes of egos masqueraded as clashes of ideas.

By today’s standards it was all a pointless exercise in patriarchal power (and it was mostly patriarchal). But then again, you sure as hell examined why you believed what you believed, and dialled back your claims accordingly. The alternative was a ritualised public shaming.

No-one in their right mind would design a workplace like that today. But when I hear someone mumbling self-criticism while looking over their own work (“I’d never have got away with this back in the day … ”) or being hyper-cautious about the claims they’re willing to make, I know that they’ve been through a similar training period to the one I had.

And so, I trust them more.

How do we do it today?

We’ll come back to trust. However before we do, I want to be clear: instinctively I hate this hazing model. Deep down I want it to be possible to train people in a more positive way. And that’s pretty much the consensus. People quote results showing that “positive reinforcement” is more effective (though naturally with academia there’s a lot of “maybe this, maybe that, more research is needed”). Good managers are supposed to inspire their team with the “why” of their work: making the world a better place, saving the environment, democratising whatever. Then, so the theory goes, people will gladly transform themselves into scientists, motivated by a higher purpose to learn all the unnatural behaviours that scientists exhibit.

The trouble is that whilst I want to believe it, I’m not sure I do. I see environments where young data scientists violate all parts of the unnatural triad. I even see, horribile dictu, graphs without their axes labelled!

Training as a scientist changes who you are. That is no easy thing, and I’m not sure you can be inspired into doing the work necessary to make that change.

Of course, there’s a contradiction here. I’m convinced the negative methods are the wrong thing to do, but I’m not convinced the positive methods work. It’s something I’ve struggled with for a long time …

… until I read this post. In it, the author, Bret Devereaux, points to a distinction that military historians make between “Morale” and “Cohesion”.

Morale vs Cohesion

Bret Devereaux is a much better writer than I am, so his piece will repay your reading of it. Anyone who can cover Battlestar Galactica and medieval smelting techniques in the same blog is alright by me. However for those short on time, the basic idea is that morale is what got historical armies into battle, and cohesion is what kept people fighting once they got there. All the marching, drilling and general unpleasantness was designed to build cohesion, which was what got people to do all the unnatural things expected of soldiers standing shoulder-to-shoulder in a historical army. Commanders of enthusiastic but not very well trained armies (high morale, low cohesion) would find their units disintegrating on contact with the enemy, only to coalesce again to fight another day. Meanwhile commanders of miserable but disciplined armies (low morale, high cohesion) often found those armies pointing their weapons in the wrong direction.

Now there’s a limit to how much this analogy can be pushed, but I want to suggest there’s a similar distinction to be made in the workplace. All the inspirational stuff builds the equivalent of morale. It gets us out of bed on a Monday morning because we know why we’re getting out of bed. However once we’re at work, there’s this other thing, let’s call it cohesion, that comes into play. It’s what gets people to do the unnatural things expected of scientists. Through some combination of habit, pride, not wanting to let your team down, cohesion makes us hold ourselves to high standards, and grind through all the miserable stuff that makes up most of real-life data science.

If you accept this, it’s not too much of a leap to say that in the bad old days, the scientists’ equivalent of morale was taken as a given. Serving science was a noble end, and if you didn’t like it, you could go do something else. All the hazing was designed to build the scientist’s equivalent of cohesion, instilling that triad of unnatural behaviours. And once you’d been through the process, your personality had changed, and your colleagues were more willing to trust you.

Remember that person from earlier, muttering self-criticism under their breath?

So I want to suggest to senior data scientists and data science managers that yes, all the positivity is important for morale, and you can’t build a well-functioning team with good morale without giving them the “why”.

However it’s not enough on its own. We have to give some thought to building cohesion too.

So here’s the challenge

Traditionally, in different times, places and contexts, cohesion has been built, at least in part, with negative techniques: hazing, shaming, group punishments. It’s unpleasant, but historically, it seems to have worked.

I’m just going to pause here, because we’re getting into difficult territory. I am definitely not saying that teaching someone how to do a linear regression, or time series analysis, should involve pain and humiliation. Technical skills can and should be taught positively.

However there’s more to becoming a data scientist, and I emphasise the word scientist, than technical skills. There’s a deeper personality change that occurs as you imbibe that unnatural triad. And because deep-rooted personality changes are hard, it’s not at all obvious to me that people can be “inspired” into making them. On the other hand, like it or not, negative methods have historically been shown to work.

Now here’s the problem. No decent, humane person would want to go back to the old way of building what I’m calling “cohesion” but someone else might call “standards”. And even if we did want to go back, it wouldn’t fly in today’s culture. However if Gen-Z’ers are serious about becoming data scientists, it will involve their personalities changing, because the behaviours of scientists are unnatural.

So I’m going to suggest that data scientist trainees and their managers need to agree on some “structured negativity”.

Structuring negativity

The first responsibility is on the trainee data scientist. They’ll have to ask themselves some hard questions about whether they want to change. If they feel they’re awesome just the way they are, that’s fine, go them! But they probably won’t become a scientist. The chances of them exhibiting those unnatural behaviours are very low indeed.

Secondly we have to create a space for (sometimes harsh-sounding) negative criticism. That space has to be private. Hopefully the days of public humiliation are behind us. This means managers have to devote time to their trainees.

Time on its own won’t be enough, of course. It’s certainly possible that trainees will have come through their education system without receiving much in the way of negative feedback. Equally managers aren’t necessarily practised at delivering it. I know I’m not.

So thirdly, I think it’s necessary to agree on a finite training period with lots of attention from the manager. During this time, data science trainees will hear a lot of: “no, that’s not how scientists do it”. Meanwhile managers will have to devote some time to thinking hard about “how scientists do it” and actually pass on that thinking.

The final factors are, as ever, organisational. The trainer / trainee relationship has to be clear, which it isn’t always, particularly in matrix-based organisations. Receiving criticism isn’t ever going to be pleasant, and I’ve seen people skate off to another manager who is more amenable. There have to be organisational consequences to not taking on board the criticism and adapting to it.

On top of this, managers are sometimes judged on their retention rates. My prediction is that as people get hit with the reality of becoming scientists, the early drop-out rates will be high. However anyone who sticks around will stick around for a long time, once they truly “get it”.

So finally …

Look, I don’t think it’s going to be pleasant for anyone, but I think there’s a place for negative criticism, designed to build towards a positive outcome. In the end criticism is a gift. It means a manager has taken some time to think about the person and is trying to help them become something better. If you hear me say “Mmmm, yes that’s great”, I’m just trying to get you out of the door.

But either way, please label your ****ing axes.

Dave Dale, December 7, 2022