Man-Months to Quality

Posted March 11th, 2009.

One assumption that designers and developers naively make is that the highest quality game will lead to the highest profit for the company. Unfortunately, that may not be true and I will show you an example of this. Keep in mind that this is a clearly hypothetical example, but I will be disclosing the assumptions made along the way so that you can see my steps. There are some wildly controversial assumptions here and I want anyone reading this to understand my analysis.

Let’s take two individuals: a lead designer and an executive producer. The lead designer wants to create the highest quality game possible and the executive producer wants to create the most profitable game possible. They are sitting down attempting to discuss the scope of the project, which will be boiled down to a spend of man-months. The game being discussed is a mid-size project in a genre that has had a wide array of quality from critical panning to critical acclaim. Assume there is a base level of functionality that must be delivered to ship a game in this genre. This means that there are only so many features that they can cut and still fulfill the requirements of the genre.

Metacritic is a widely used but also widely disputed measure of quality. The site metacritic.com gives what is called a “Metacritic Score” by aggregating the review scores of as many legitimate review sites and magazines as possible. Below we see a possible relationship of Man-Month spend to resulting quality. Extremely low spends result in broken and terribly rated games. But each marginal man-month spent increases quality by smaller and smaller amounts (decreasing marginal utility) until the project reaches a point of maximum quality. Beyond this point, the team is adding some of its less exciting and impactful features, and the glut detracts from the impressiveness, usability and/or functionality of the other, better features, so quality drops.


Fig 1 – Metacritic vs. Man-Months

This graph will vary wildly depending on the market expectations and the subject matter. For instance, World of Warcraft can continue adding quality features without diluting the product for a very long time. A casual market Sudoku title can only do so much before additions become trivial or distracting.

Also, this relationship only holds looking forward from the beginning. A team cannot base its design, code and assets on a plan of N man-months, have an additional X man-months added later, and expect the same bang-for-the-buck in terms of quality as if the original plan had been based on N+X man-months. While a team that suddenly has X more man-months will be able to move to the right on the curve, it will not be able to move the full X over. There is always some issue of refactoring code, data or design that makes this “bonus” time inefficient. In fact, sometimes it is so inefficient that it yields no quality improvements at all. This depends wholly on the team and the circumstances of the project. This is related to the widely quoted (by engineers) assertion that changes made during design are orders of magnitude cheaper than changes made during testing.

Further assume, as in Figure Two, that the revenue the project amasses rises with the Metacritic measure of quality. There are numerous counterexamples to this statement: both of critical darlings that couldn’t turn a profit and of dreck that raked in huge revenues. But by and large, the correlation of Metacritic Score and Sales has been reasonably tight (especially for non-licensed titles), so we will use that to aid in our analysis.


Fig. 2 – Revenue vs. Metacritic

There is a certain score in each genre on each platform where any score below is considered a statement of terrible quality. For instance, the difference between a 30-rated game and a 40-rated game is unnoticeable for most examples, while the difference between an 80-rated game and a 90-rated game is significant. You see this in the example in Figure Two where up until a score of 60, revenues are mostly flat. Then the 60-75 range (highlighted by the bracket) is the “bang for the buck” area where each additional point of “quality” convinces a larger and larger audience that the game is worth buying. Then, eventually, a title reaches a level of quality where additional points do not matter as much and the “bang for the buck” dies off.

Cost is a fixed amount plus a variable amount that scales linearly with man-months.

Now that we have revenues and costs, we can make a graph of profits (Figure Three). As you can see, there is a point of maximum profit beyond which the cost of an additional man-month is greater than the revenue that man-month will bring in.


Fig. 3 – Profit vs. Man-Months

But if we look at both the points of max quality and max profit on the same graph, we see that they are not the same! The point of max profit occurs before the point of max quality.


Fig. 4 – Comparing Profit and Metacritic on the same graph
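
The gap between the two points can be sketched with a little calculus. The symbols are shorthand introduced for this sketch, not taken from the figures: Q(m) is the Metacritic score after m man-months, R(Q) is revenue as a function of score, F is the fixed cost and c is the cost per man-month. Assume both curves are smooth, revenue never falls as score rises, and c is positive.

```latex
\begin{aligned}
\pi(m) &= R\bigl(Q(m)\bigr) - F - c\,m \\
\frac{d\pi}{dm} &= R'\bigl(Q(m)\bigr)\,Q'(m) - c \\
\left.\frac{d\pi}{dm}\right|_{m=m^{*}} &= 0
  \;\Longrightarrow\; R'\bigl(Q(m^{*})\bigr)\,Q'(m^{*}) = c > 0
\end{aligned}
```

So Q'(m*) has to be positive, meaning quality is still climbing at the profit-maximizing spend m*. At or past the quality peak Q'(m) is zero or negative, so the slope of profit is at most -c and every extra man-month loses money. Under these assumptions the profit peak sits strictly to the left of the quality peak.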

What happens when we do a sensitivity analysis varying the cost of labor? We find that as the cost gets closer to zero, the point of max profit approaches the point of max quality. But as we increase our variable costs, the point of max profit moves away from the point of max quality! Eventually, costs reach a level where no amount of man-month effort is profitable, and then the profit-maximizing choice is to not do the project at all.
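
This sensitivity analysis can be roughly reproduced with a small Python sketch. Everything numeric here is a made-up assumption chosen only to mimic the shapes of the figures: the quality peak at 100 man-months, the top score of 90, the revenue ceiling, the fixed cost, and the function names `quality`, `revenue`, `profit` and `best_spend` are all hypothetical.

```python
import math

# All constants are hypothetical, picked to mimic the shapes in the figures.
PEAK_MONTHS = 100          # spend at which quality peaks
PEAK_SCORE = 90.0          # best score this imaginary project can reach
FIXED_COST = 1_000_000     # cost incurred regardless of man-months
MAX_REVENUE = 20_000_000   # revenue ceiling for a top-rated title


def quality(months):
    """Score climbs with diminishing returns, peaks, then drops (Fig. 1 shape)."""
    return PEAK_SCORE - 80.0 * ((months - PEAK_MONTHS) / PEAK_MONTHS) ** 2


def revenue(score):
    """S-curve: nearly flat for low scores, steepest in the 60s to mid-70s."""
    return MAX_REVENUE / (1.0 + math.exp(-(score - 68.0) / 4.0))


def profit(months, cost_per_month):
    """Revenue minus fixed cost and a cost that is linear in man-months."""
    return revenue(quality(months)) - (FIXED_COST + cost_per_month * months)


def best_spend(cost_per_month, max_months=200):
    """Profit-maximizing man-months, or None if no spend turns a profit."""
    best = max(range(max_months + 1), key=lambda m: profit(m, cost_per_month))
    return best if profit(best, cost_per_month) > 0 else None


# Sweep the cost of labor and watch the profit-maximizing spend move.
for rate in (0, 1_000, 10_000, 50_000, 100_000, 1_000_000):
    print(f"cost per man-month {rate:>9,}: best spend = {best_spend(rate)}")
```

With free labor the best spend lands on the quality peak. Each increase in the cost per man-month pulls it further to the left, and at the highest rate in the sweep no spend is profitable, so `best_spend` returns `None`. Different made-up curves would move the numbers but, under the same assumptions, not the direction of the effect.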

So this has huge implications if the assumptions are valid. Companies where it is very expensive to make games (assuming they are profit-maximizing enterprises as well) will choose a number of man-months that results in games farther from their point of max quality than a similar, cheaper outfit would under the same conditions.

But also consider our two developers. Even if the designer has the perfect plan for how to make the highest quality game, the producer will almost always be forced to cut him short. This is sort of comforting for those of us who have this happen on every project.

So, as designers, how do we act on this information?

Conclusion One: Find Your Happy Place
There is some level of satisfaction where you can feel good about releasing the project into the wild. A common trait among designers is the desire to be a perfectionist. Shake that desire. You likely won’t be able to hit your game’s full potential in terms of quality, so figure out the minimum level of quality you would be satisfied with and do everything in your power to at least hit that level. Everything else is gravy. Obsessing about making the complete, perfect experience will only cause pain and strife when your beloved features are cut to lower costs.

Conclusion Two: Figure out the way to make every man-month count.
Many do not realize how truly expensive overhead can be. If you believe that the higher the monthly costs of operating, the farther away the point of maximum profit is from the point of maximum quality, then you will want to do everything in your power to make those precious man-months count. This is essentially changing Figure One into a steeper curve, where top quality happens with less effort. (Note that I didn’t say “where top quality happens sooner”. Many managers in this industry seem to believe that working their charges to complete features by such-and-such date no matter what is productive. What they don’t realize is how little productivity death marches yield.) Productivity is a cause everyone can rally behind.

I realize that this analysis boils a very complex market down to a few variables, but the purpose of this was not to make a proof or a law, but to get thinking about the differences between maximum quality and maximum profit, which many assume must necessarily be one and the same.

Resolving Idea Surpluses

Posted February 11th, 2009.

Ideas are super-cheap: a dime a dozen, or so they would have you believe. But regardless of the actual value of creatives’ ideas, the quantity supplied is much greater than the market’s demand (this being the number and scope of ideas actually implementable). So how do developers decide which ideas are worth transacting? There are three models:

1. Authoritarian

Some central figure decides what is right and what is wrong. This is the one where the lead designer or executive producer says “Because I Said So”. Apple’s Steve Jobs uses this to great market success. This is likely successful because the ideas that Jobs finds palatable resonate with the public so well. You are very lucky if you share that quality with Mr. Jobs.

Pros:
Team wastes little time arguing and money researching.
Singular vision.
It’s easy.

Cons:
Junior folks are left out of the decision-making process.
Risk that the authority is wrong.
No buy-off from line folks can mean uninspired implementation.
Can cause bad blood.

2. Scientific

You do a study and use the results of the study to validate ideas. Playtests are a form of this in the game industry. A problem with this approach is understanding how granular your research must be. Do we test what color the submit button is? Or how about the entire genre of the game? In doing research you have to present multiple options, so this often leads to having work thrown away. This can be expensive if the research is too broad.

Also, there is the risk that your method for measuring preference is unsound. Oftentimes, customers don’t know what they want until you give it to them. You can’t count on market research for creativity because most often these reports come back with “bigger, better, shinier” instead of resonating with innovative and scary ideas.

Research is expensive and takes time. Groups can always say “We will do research when we get to an appropriate point and then make a decision,” but more often than not that point is never reached and the group defaults back to an authoritarian model. “Oh well, we didn’t have time to try it out; let’s just go with Proposal A.”

Pros:
Decisions are based on quantitative results.
No arguing.

Cons:
Expensive.
Difficult to do correctly.
Can’t create innovation on its own.
Often subjects don’t know what they really want.

3. Collaborative

Groups of creatives get together and riff off each other to figure out what is best. The difficulty with this method is that consensus has to be reached. If everyone wants to be the authority, this cannot work. And since ideas are so personal, treading wrong can cause a lot of dissonance and hurt feelings. Your team has to gel well together to create collaboratively. They also have to be willing to be accountable for other people’s risky ideas. That’s tough in a lot of organizations.

Pros:
Most organic and innovative model.
Ideas can build off of each other leading to real discoveries.

Cons:
Groupthink.
Difficult to successfully mediate differences unless the team is balanced well.
Group may never reach consensus.
Can take significant courage, time and energy.

The best method is likely some combination of the three tailored to your project and your team members. I realize that is a bit of a cop-out, but I have been part of teams using all three and each has had its successes and grand catastrophic failures.


PS – I’m getting tired of having to update the Games Industry Death Toll post every day, people. Can you just stop publicly releasing people and sneak them out the back door so we can pretend that the economy isn’t completely broken?