Ridere, ludere, hoc est vivere.

Friday, April 25, 2014

Luck, skill, and research

Last week I opened a discussion on my effort to quantify game characteristics.  I had in mind that I would explore this question on my own, somewhat in a vacuum, based on my own experience and opinions, as something of an exercise to see what defensible conclusions I might reach.

But then a few days ago, I received a tweet from Christopher "CardboardEdison" Zinsli that reminded me that I need to do my homework.  He sent me a link to a BoardGameGeek guild called "Game Genome Project," an effort revived in the last couple of years to characterize "a detailed set of game features that can be used to classify, compare, and recommend games."  That guild includes a number of detailed and thoughtful posts on game taxonomy and characteristics, including a link to a Richard Garfield lecture at New York University on luck and skill in games.  That lecture, in turn, made me realize that I have a lot to learn, and the guild made me realize that I have a lot to contribute, and not just in a vacuum.

So my intent is to continue my research while capturing an initial outline of ideas here, recognizing that the more I learn, the more I may seek to change or correct what I've written as my understanding of the state of the art of games improves.

Among the many game characteristics that might be quantified is luck. Now, until recently, I had considered luck to reside on one end of a spectrum, the opposite end of which was "skill."  On that one-dimensional spectrum, Candyland would reside at the left end of the spectrum ("100% luck") and Chess would reside at the right end ("100% skill").  My thinking was that I would come up with a metric for assigning every game a luck-skill spectrum value between zero (all luck) and one (all skill).

But Garfield's lecture describes this concept as a false dichotomy.  He draws a two-dimensional chart in which a game like poker is high in both luck and skill, whereas tic-tac-toe is low in both luck and skill.  So that made me decide to come up with a metric for luck alone, independent from skill.

My notion of measuring luck in a game is based on the premise that the play of a game from beginning to end constitutes a sequence of discrete game states and that each sequential change is determined either randomly or by a set of decisions by one of the players (but not a combination of both).  I would measure the "first-order luck" of a game as the average fraction of game state changes that are determined randomly.
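To make the definition concrete, here is a minimal Python sketch of the metric for a single play of a game.  It assumes each state change has already been labelled as either random or decided by a player; the labels, the function name, and the 40-step game length are illustrative assumptions rather than anything taken from a real game log.  Averaging the result over many recorded plays would give the "average fraction" described above.

```python
from fractions import Fraction

# Hypothetical labels for the two kinds of state change.
RANDOM = "random"
DECISION = "decision"

def first_order_luck(changes):
    """Return the fraction of state changes in one play that were random."""
    changes = list(changes)
    if not changes:
        raise ValueError("a play needs at least one state change")
    random_count = sum(1 for kind in changes if kind == RANDOM)
    return Fraction(random_count, len(changes))

# A play in which every change is random (assumed length: 40 changes).
all_random_play = [RANDOM] * 40
# A play in which every change is a player decision (nine placements).
all_decision_play = [DECISION] * 9

print(first_order_luck(all_random_play))    # 1
print(first_order_luck(all_decision_play))  # 0
```

Using exact fractions instead of floats keeps small values, such as one random change among more than a hundred decisions, easy to read and compare.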

Given that metric, we can explore some simple cases and then start looking at more complex games, evaluating their "first order luck factor," and seeing whether it's a measure that makes sense or one that needs more refining.

So if we look at Candyland, the initial game state consists of all the player pieces at start and the deck of cards shuffled (randomly).  Every subsequent intermediate game state is the result of turning over the top card from the deck and moving the next player's piece to the square indicated by the card.  Every subsequent game state change is determined by the initial deck shuffle.  The game ends when a player's piece reaches the end of the track.  So all of the game state changes are determined by the initially randomized deck, and the first-order luck factor of Candyland would evaluate as "100%."  Other games with 100% first-order luck factor include Bingo and War.

If we look at Tic-tac-toe, the initial game state is a blank 3x3 grid.  Each subsequent game state change is the addition of an 'X' or 'O' whose placement is the decision of a player.  The game ends when a player achieves three symbols in a row or when the board is full of symbols.  Every game state is the product of a player decision - nothing is randomized - so the first order luck factor of Tic-tac-toe would evaluate as "0%."  Other games of zero luck factor include Checkers, Chess, Go, tafl, Yinsh, Hive, and Chicago Express.

Let's consider something a little more complicated.  Consider a trick-taking game like Hearts.  The initial deck shuffle randomizes the starting state of the game.  Except for the opening play of the Two of Clubs, every subsequent card play for the first twelve tricks is the product of decisions by each of the four players.  The 13th trick involves no decision-making and is essentially determined by the card play of the 12th trick.  So each hand of Hearts has one randomized game state change followed by 3*4*11 = 132 player decisions.  So every hand of Hearts - and therefore a game of Hearts - has a first-order luck factor of 1/133 = 0.752%, counting the one random change against all 133 state changes.  Now, that might come as a surprise, but it must be remembered that card games in general are won based on, literally, how players play with the hands they are dealt.
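As a quick check of the Hearts arithmetic, the sketch below takes the counts from the paragraph above at face value (one random change and 132 player decisions) and applies the definition of the metric, which divides the random changes by all state changes.  The variable names are illustrative.  Note that dividing by all 133 changes, rather than by the 132 decisions alone, is what the definition calls for, and it gives a marginally smaller figure.

```python
from fractions import Fraction

random_changes = 1              # the shuffle and deal
player_decisions = 3 * 4 * 11   # the decision count used in the text (132)

total_changes = random_changes + player_decisions
hearts_luck = Fraction(random_changes, total_changes)

print(hearts_luck)                    # 1/133
print(f"{float(hearts_luck):.3%}")    # 0.752%
```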

Let's look at Sid Sackson's Acquire.  Every turn consists of a tile play, an option to buy shares, and then a tile draw from a bag.  Everything in the turn is the decision of one player (except in the case when a tile play results in a merger, in which case every player with shares in the taken-over company makes a decision as well) - until the end of the turn, when that player draws a tile from the bag.  So, in a non-merger turn, each game state change consists of a set of one player's decisions followed by a randomized event.  So non-merger turns have a first-order luck factor of 50%.  We would need to collect data on a number of games (or do some ridiculously deep analysis) to estimate how many turns are merger turns and how many player decisions each would involve, but I'm going to estimate that 12 mergers happen in a game of about 72 turns, and each merger averages four player decisions (so a merger turn is one random event out of five state changes, or 20%), so that the first-order luck factor in Acquire might come out to something like (50% x 5/6) + (20% x 1/6) = 45%.
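The Acquire estimate is a weighted average of two kinds of turn, and the sketch below reproduces it from the guessed figures in the paragraph above: 72 turns, 12 of them mergers, one set of decisions in an ordinary turn, and four decisions in a merger turn.  These are the rough estimates from the text, not measured data, and the helper name is hypothetical.  The last lines show, as an aside, that pooling every state change in the game instead of averaging turn by turn gives a somewhat different number under the same guesses, so the choice of averaging method matters.

```python
from fractions import Fraction

def turn_luck(decisions, random_events=1):
    """Fraction of a turn's state changes that are random."""
    return Fraction(random_events, random_events + decisions)

turns = 72          # estimated turns per game (guess from the text)
merger_turns = 12   # estimated merger turns per game (guess from the text)

non_merger = turn_luck(decisions=1)   # 1/2: one decision set, one tile draw
merger = turn_luck(decisions=4)       # 1/5: four decisions, one tile draw

merger_share = Fraction(merger_turns, turns)   # 1/6
per_turn_average = (1 - merger_share) * non_merger + merger_share * merger
print(per_turn_average, float(per_turn_average))   # 9/20 0.45

# Alternative: pool all state changes in the game before dividing.
random_total = turns
decision_total = (turns - merger_turns) * 1 + merger_turns * 4
pooled = Fraction(random_total, random_total + decision_total)
print(pooled, float(pooled))   # 2/5 0.4
```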

Right away, it seems that there is some failing in the metric I've described: it rates Acquire (about 45%) as far more luck-driven than Hearts (under 1%).  While it is certainly true that among comparably skilled players, tile luck can sway the results of a game, I would certainly consider Acquire less of a luck-driven game than Hearts.  So perhaps this metric is indeed something that I need to take back to the drawing board.

And in the meantime, I will continue to research what others have said and written about trying to quantify game characteristics like luck.

2 comments:

  1. So can I simplify your definition of luck to when something happens to a player where they have no control over the event?

    The reason I highlight that is because it creates a clear definition that has a formula to construct your metric and helps boil down the definition of your metric.

    For example, if the last card laid down in a game of Hearts is "luck," then shouldn't the last move made by the last player in Tic-tac-toe also count as luck?

    If the definition is also based on whether or not players have a choice, then that would also enable you to separate Acquire from Hearts on the last move. In Hearts, you must play the card, whereas Acquire still gives players the choice to lay down their tile or resign.

    A challenge to this approach is that skilled Hearts players and skilled Tic-tac-toe players look ahead in the game and anticipate the last move of the game. So even though players had to make a move at the end of the game, that move was not made randomly; it was predetermined by the way they played their previous moves and therefore might not really count as being "luck."

    Another definition to consider for luck is how randomness limits player choices. In Chess there is no random draw that endows the player with an action or a resource, whereas in Acquire actions are constrained by the tiles in your hand.

    This perspective, I think, would view Acquire and Hearts as being very similar in one dimension. They are both games where players' actions are limited by the hand they draw.

    And different in a second dimension. In Hearts all points are earned from how you lay down your cards. In Acquire, points are earned based on the stock players own. Stock can be bought by any player and is not decided by the cards each player holds. Points to win the game come from stock, and the game played over stock is much closer to the setup of a chess board.

    Both processes listed above could be used to construct an index. One is much simpler to construct; the other, I bet, will be closer to what you intuitively think of as "luck."

    Replies
    1. Aaron, great points. I should have defined what I mean by "luck." (I know some people define "randomness" differently from the way I do; I plan to address that in a subsequent post.)

      I use "luck" to mean a change in the game state that can have multiple possible outcomes but that is not decided by any player. So the last card played in Hearts is a product of the decision made in the previous trick (as you point out), and therefore is not "luck" as I define it. Likewise, the last move in Tic-tac-toe is a product of the other player's previous move, and therefore is not "luck."

      You're right that Hearts and Acquire share the characteristic of randomizing (and limiting) the resources with which a player can work and therefore his decision option space. It occurs to me that Carcassonne falls into the same category, and would have a "first order luck factor" very close to 50%.

      You are right that tile play in Acquire only indirectly affects a player's score, which probably accounts for our intuitive sense that it is a more skill-driven game than Hearts. And I agree that it would be more difficult (but more interesting) to try to construct an index that better reflects the relationship between luck and score.
