The first rule of using analytics in Major League Soccer is, apparently, that you don’t talk about using analytics in Major League Soccer. The second rule is, of course, the same.
It began with a curiosity and a hunch. The central question was whether the sudden hunger for advanced statistics in European soccer – according to the Guardian, every single English Premier League club had a statistical analyst on staff last season; Manchester City had 11 – had gripped MLS clubs as well. The U.S. is, after all, where the box score was invented and newfangled metrics such as the Value Over Replacement Player (VORP) and Batting Average on Balls In Play (BABIP) came into being.
Had MLS teams matured to the point where they could devote resources to exploiting edges on the fringe of the game?
As for the hunch, it seemed like analytics – which we’ll use as a catch-all for next-level statistical player performance and fitness analysis – were a perfect fit for our poky domestic league. Despite what many believe, the foundational structure of the Moneyball movement wasn’t so much on-base percentage and eschewing hit-and-runs and base-stealing as it was exploiting market inefficiencies. The market just happened to be for baseball players.
MLS remains a small feeder league – albeit one with an abnormal capacity for luring big, aging stars, relative to its size – where teams operate on extremely tight budgets in a vast, global, and almost psychotically competitive talent exchange. What’s more, that market is already incredibly efficient. You get more or less what you pay for. But if you can redefine what it is you are and are not, what you need and don’t, and how your thin wad of money is best spent, effective analytics can offer real advantages. Especially since everybody is working under the constraints of the same salary cap system – budgets for Designated Players excepted – evening the playing field and increasing the incentive to recruit better.
So what, then, were MLS teams doing about analytics?
Well, as it turned out, talking about what they were doing about analytics was not something many of them were keen on. We reached out to 16 of the 20 MLS teams – the ones with whom we had good, pre-existing relationships. They had always been helpful and eager to accommodate requests. But when we asked about analytics, 10 of them never responded. We followed up several times, over the course of a few months, and got nothing back.
Of the teams that did respond, some explained that they weren’t into analytics. FC Dallas was one of them. The Chicago Fire told us it had dabbled, but stopped because it found no useful applications. Having spoken to Paul McDonough, Orlando City Soccer Club’s Vice President of Soccer Operations, a club spokeswoman said in an email that the club “utilizes a variety of resources, including recruiting databases, websites, and other such tools. We also rely on our network of relationships to provide relevant intelligence about a prospective player.”
But, she wrote, “While statistics and data points are nice, nothing can replace actually meeting a player, understanding his character on and off the field, and seeing if the chemistry works well with the organization.”
The San Jose Earthquakes, who share an owner with the Oakland Athletics, Moneyball’s patient zero under general manager Billy Beane, said they use a variety of services to provide data. The club has contracted with Wyscout for international scouting services, Match Analysis for in-game analytics, and Catapult to monitor players for overuse at practices. But when we sent follow-up questions on how the data is used, a ’Quakes spokesman wrote back: “We have some analysts in house that work with our coaching staff on those programs. That is about all we will release about our analytics.”
(Beane, by the way, was briefly an advisor to the Earthquakes but isn’t any longer, according to the club.)
Toronto FC, which recently hired an analyst in Devin Pleuler, who previously worked for the Opta statistical service and wrote the Central Winger analytics column at MLSsoccer.com, didn’t respond to any of our messages either. Pleuler politely rejected an interview request.
But when Pleuler was hired in February, TFC General Manager Tim Bezbatchenko told the Canadian Press, “There’s more information available to coaches and GMs. You need to collect it, organize it, and then look at it and try to figure out patterns and new ways of looking at the game.”
“And really, you don’t know what you don’t know,” Bezbatchenko continued. “So you’ve got to keep an open mind and try to think of doing things a little bit differently. And if there’s more data out there, I don’t understand why you wouldn’t want to have it.”
But the club clearly had no appetite to elaborate further.
To a certain extent, all of this is fair. Teams that are investing time and resources in analytics have no good reason to share what they’re doing with the outside world, lest they give away breakthroughs to their competitors. It wasn’t long after Moneyball came out, after all, that Beane and his A’s had to completely revise their market strategies, since the recipe to their secret sauce, which let a small-market team compete with the much richer clubs, had been painstakingly laid out in a bestseller.
(Manchester City, on the other hand, releases its data to whoever requests it. When it first open-sourced its database in 2012, more than 5,000 people downloaded it within three weeks.)
Still, two MLS clubs were willing to talk: the New England Revolution and Sporting Kansas City. Incidentally, or perhaps not, both were MLS Cup finalists in the last two seasons, competing through shrewd roster building and match planning.
With the Colorado Rapids late on in his playing career, Peter Vermes had a manager in Mooch Myernick who felt that if his team fouled more, it meant it was more aggressive than the other team. In the embryonic MLS of the late ’90s, this sort of simplistic thinking probably passed for highfalutin stuff, but it nevertheless exposed Vermes, now Sporting technical director and head coach, to the idea of a correlation between data and outcomes in a fluid game.
In charge of an ultra-modern club owned by a technology company, Vermes exploits every bit of data he can find some use for. Like many clubs, Sporting uses science to predict and prevent injury, and to keep players performing at optimal levels. But it has also taken that a step further, seeking to better understand what kind of fitness its possession-oriented 4-3-3 playing style demands.
“You’re looking at what type of training sessions, or even exercises, that you may do with the players that are either extremely useful for your model of play, or actually detrimental,” Vermes explains. “No matter how you play, if you understand the science of it, you can train your team in the model of your play.
“If you want to sit back and counter, you need to train your team that way, to replicate those actions many times over so it can perform it that way,” Vermes says.
That counter-attacking team, for instance, would train more for explosiveness than for the stamina a possession-oriented team might prioritize. So, Vermes says, you work out how long the sprints are that your players tend to make during games when executing your playing style, and tailor practice to best prepare them.
That, in turn, helps inform the fitness side of what he looks for in potential new signings.
“I want our guys to be ‘Sporting fit,’” Vermes says. “There’s a certain level of fitness that we want from our guys. And truly what that is, is they have to have a certain gas tank.”
This is where analytics can prove useful: when you don’t yet control a player, you can monitor his capacity for labor from afar.
The other role data plays in Sporting’s recruiting is to ascertain exactly how a player plies his craft at his particular position, making it easier to predict how he’ll do at Kansas City.
Vermes gives an example: “Can you find out, if he’s a defensive midfielder, is he a guy who recovers balls because of challenges? Does he recover balls because he’s reading the passing lanes? Those are all things that you’d know.”
Provided, of course, that the player’s current league keeps sufficient data. If not, Vermes says, you have to do it the “old-school way” of just watching games.
But for all the progress analytics have made, for the scientific gloss they have applied to the inherently unscientific job of coaching a team in a sport played on a blank canvas, they remain as problematic as they are useful. It’s simply impossible to transpose exact order onto the run of play.
“It’s very hard to take that information – statistics, analytical data, anything from an analytical perspective – without also looking at the video and understanding what transpired in the game,” says Vermes. “You have to use the two together. Soccer is really difficult to sit there and do the Billy Beane theory, like in baseball.
“First off, the sizes of fields aren’t exactly the same [in every stadium], surfaces are different,” Vermes says. “You can go in so many different ways – formations are different – there’s so many things you can change in soccer.”
The variables are just about endless. So you’re better off taking a macro view of the sport.
“What you can do is look at trends.”
Some stats, Vermes argues, are useful in a large context, when comparing, for instance, a player’s passing accuracy to that of every other player in his position in the league. But here, Vermes again points out, the variables come into play. Not every team likes to build out of the back, for instance, so some just have their central defenders lump the ball forward, throwing off the curve for the passing accuracy of defenders.
So what Sporting does instead is rank the players on each team in a point system derived from a range of statistical categories – Vermes won’t say which, secrets of the trade and all that.
“If we had a positive result [a win], a majority of the time we had at least six or more players in that top 10,” he says. “If we have six out of 10, we’ve played very well and dominated in a lot of categories. And very rarely have we won a game where we haven’t had half of our guys in that category.”
The formula works for the most part. But then again, the variables.
“There have been many, many games at Sporting Park where we have dominated in every single category and we make one mistake and we either tied or lost that game,” Vermes says.
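The tally Vermes describes can be sketched in a few lines. This is only an illustration under stated assumptions: Sporting does not disclose its categories or scoring, so the category names and weights below (`WEIGHTS`, `player_points`, `top_n_by_team`) are hypothetical stand-ins, not the club's formula.

```python
from collections import Counter

# Hypothetical categories and weights; the real ones are not public.
WEIGHTS = {"passes_completed": 0.1, "recoveries": 1.0, "shots_on_target": 2.0}


def player_points(stats, weights=WEIGHTS):
    """Weighted sum of a player's match stats; missing categories count as 0."""
    return sum(weight * stats.get(category, 0) for category, weight in weights.items())


def top_n_by_team(players, n=10, weights=WEIGHTS):
    """Rank every player from both teams by points and count, per team,
    how many finished in the top n."""
    ranked = sorted(
        players,
        key=lambda p: player_points(p["stats"], weights),
        reverse=True,
    )
    return Counter(p["team"] for p in ranked[:n])
```

Fed one match's worth of player records (dicts with a `team` label and a `stats` mapping), `top_n_by_team` returns how many of each side's players landed in the top 10, the number Vermes says has usually been six or more in Sporting's wins.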
Three years ago, Tim Crawford was a math teacher and tutor in Los Angeles. Then the UConn graduate with a major in math and a minor in statistics was hired as an analyst for the Revs. This hardly happened on a whim. Club president Brian Bilello is an MIT graduate with a past in analytics. And Crawford was brought on with the strong support of incoming manager Jay Heaps, who had spent the two years between his playing and coaching careers advising and managing portfolios for high-net-worth individuals at Morgan Stanley. He was instantly receptive to analytics.
“He likes the idea of data telling him things,” Crawford says.
The Revs have a different approach than Sporting KC. Whereas Sporting uses analytics to fine-tune its fitness and assess its performances, the Revs employ them to scout opponents and establish best practices for themselves, both in playing style and the player market.
“For us, we use it [analytics] somewhat in player recruitment,” says Crawford. “But more largely, we use it in our big-picture, overall approach to the game – trying to understand what’s successful in MLS: what works, what we do in the first place, and how we could be better at it. Just kind of trying to grasp the game in a different way.
“And then from there we can use it to evaluate players and see if they can fit into our system or if they’re doing the things we want them to be doing if they’re already on our team,” Crawford adds. “That kind of can frame how we approach any signings. Largely what we’re using data for is to define our playing style and to define what we think is successful, how we can best approach each game.”
If Sporting is trying to develop more of a causal formula, the Revs take a more academic approach, coming up with a thesis and trying to find proof for it.
“We try to find things that correlate to success,” Crawford says. “There isn’t one answer to that.”
They try to analyze which kinds of set pieces result in the most goals, for instance.
“A lot of times we go in with a theory and see if that theory holds true or not. We take a subject and try to find out what the best way is to approach it. With recoveries, what’s the best way to get the ball back? And then once we kind of determine that, we can go from there and say, ‘How do we want to mold our team to meet that best practice?’”
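A question like the set-piece one reduces to a simple tally. The sketch below is an assumption-laden illustration, not the Revs' method: the function name `conversion_by_type` and the set-piece labels are invented, and it measures only goals per attempt.

```python
from collections import defaultdict


def conversion_by_type(set_pieces):
    """Given (kind, scored) pairs, return (kind, goals per attempt) pairs
    sorted from most to least productive."""
    attempts = defaultdict(int)
    goals = defaultdict(int)
    for kind, scored in set_pieces:
        attempts[kind] += 1
        goals[kind] += int(scored)
    rates = {kind: goals[kind] / attempts[kind] for kind in attempts}
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)
```

With only a handful of attempts per type the rates are noisy, which is one reason a theory that looks right in one season's data may not hold in the next.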
The other function Crawford has is as a kind of statistical advance scout. He breaks down upcoming opponents with data, which essentially boils down to confirming or disproving a preexisting notion of what that other team is, and then expanding upon it.
“A lot of times, you’ll say, ‘Okay, they’re a counter-attacking team.’ Or, ‘They’re a possession-team.’ But to put teams in two labels is not very good; to define them by their formations is not enough,” Crawford explains. “We’ve definitely had scenarios where we’ve gone into the beginning of the week thinking we need to play a team a certain way and then kind of flipped it a couple of days later, once we saw some things, and it’s worked to our benefit.”
As for the Revs’ remarkably strong record in the MLS draft and in both the domestic and international player market, where they’ve unearthed as many bargains as anybody, there the analytics serve to provide guidance on historical indicators of success. Scouting for an MLS team is hard. The franchises hardly get the pick of the litter, leaving them to scout leagues with less media exposure and, therefore, less data. The college game, meanwhile, is a morass of inconsistency in playing level, amateurish tactics and spotty development.
So by analyzing data, the Revs can avoid as much risk as possible, just as Billy Beane demurred on drafting high school pitchers because they were the least likely to succeed. Likewise, some types of players from some specific backgrounds are more likely to offer added value. But then that’s an equation with a lot of variables as well.
“The tough thing,” says Crawford, “is MLS is always evolving. What has worked before may not work again.”
This is just as true for the analytics themselves. In soccer, mining data for competitive edges is a practice still in its infancy. But then so is MLS.