-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathpaper.Rnw
346 lines (256 loc) · 33.1 KB
/
paper.Rnw
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
\documentclass[twoside]{article}
\usepackage{multicol}
\usepackage[hmarginratio=1:1, top=32mm, columnsep=20pt]{geometry}
\usepackage{amsmath}
\usepackage{natbib}
\usepackage{url}
\usepackage[normalem]{ulem}
\usepackage{graphicx}
\newcommand{\hh}[1]{{\color{magenta} #1}}
\newcommand{\sam}[1]{{\color{blue} #1}}
\author{Sam Helmich}
\title{Board Game Recommendation}
\begin{document}
\maketitle
\begin{abstract}
This project implements portions of Netflix-Prize finalist's algorithms in an attempt to predict how much a person will enjoy board games. Drawing on user's past predictions and data on the specifics of each game, a hierarchical method is used to link a specific user's ratings to ratings from other users, and allowing the use of data on other users who may have not rated any of the same games to predict enjoyment of unrated games. Through this model, we also can derive user-specific scores on how much they like certain aspects of games.
\end{abstract}
<<loader, echo = FALSE>>=
library(xtable)
load("paper_stuff")
@
\section{Introduction}
Board gaming, while being very popular among children, is a niche hobby beyond adolescence. Everyone may have grown up with Monopoly, but most people may have never heard of Citadels or Twilight Imperium. And as such, board gaming is a very difficult hobby to break into. ``Gateway Games" such as Settlers of Catan or Ticket to Ride offer a gradual transition between the ultra-popular and mechanics-lite games like monopoly, to more rules-heavy games like Puerto Rico. For most people, the difficult part of the transition is actually finding new games to play. New games are often expensive for people who may be apprehensive about trying new games (a copy of Settlers of Catan goes for about \$40 on Amazon). For this reason, a method to recommend games to people would be invaluable.
A few methods of varying usefulness already exist. Boardgamegeek.com, a popular site among frequent board gamers, has a recommendation system that allows users to find a particular game they like, and then from there it recommends more games that a person may like. A pretty thorough analysis of it (and its issues) can be found at \url{http://boardgamegeek.com/wiki/page/Game_Recommendation_Algorithm}. The bigger issue is that it cannot be tailored to an individual. You, an individual user, may have different preferences than the user base average, and it might be more useful to tailor recommendations to the individual.
Recommendations for the individual are not easy to come by. There are a few automated systems out there (such as the board game
recommender bot on reddit), but for the most part they are time intensive and often lack interactivity or any sort of information that would be useful to the user beyond a few single games that it might recommend.
In this paper, we seek to build a system that recommends board games to users based on their own recommendations as well as the recommendations of others. This system offers the user not only recommendations for a very large number of games, but it also offers insights to the user about what sorts of games they like, not only scores for particular games.
\section{Data Collection}
For data, we turn to \url{boardgamegeek.com}, a website that holds information, ratings, and reviews for over 75,000 games. Users can create an account and then rate games they have played as well as keep an inventory of games they own. All of this information is accessible and scrape-able, and is the data that drives our analysis.
There are two types of particular information that we are interested in: user ratings of the games and characteristics about the game (referred to as classifiers in the modeling section). The characteristics we're interested in fall into these categories as outlined in table~\ref{tab:categories}.
\begin{table}[ht]
\centering
\begin{tabular}{|l|c|}
\hline
Category & This covers the broad categories a game\\
& might fall into (ex.: Humor, Puzzle, Sports, etc.) \\
\hline
Family & Some games a naturally part of a ``family" \\
& that share a name or common element \\
& (ex.: Animals:Bats, Ancient Wars Series, Hello Kitty)\\
\hline
Mechanic & This is what mechanisms the game uses\\
& (ex.: Dice Rolling, Card Drafting, Worker Placement)\\
\hline
Subdomain & More general than Category, but loosely bins games\\
& (ex.: Family Games, Strategy Games, etc.)\\
\hline
\end{tabular}
\caption{\label{tab:categories} Overview of the relevant characteristics of board games considered in this project.}
\end{table}
% \hh{Refer to table~\ref{tab:classifiers} somewhere in the text, add a label to the ratings table as well.}
<<classifiers, echo = FALSE, results = 'asis'>>=
print(xtable(paper_stuff$classifiers, caption = c("Example lines of classifiers data, as seen in Database", "Classifiers Table"), label="tab:classifiers"), include.rownames =FALSE )
@
The ratings for each game are on a scale from 1 to 10, with a score of 1 indicating a really bad game and a score of 10 being the best possible. Each of the ratings are paired with a unique user identification number as well as a game number. Combined with the classifiers data, we can provide very specific ratings for each individual user.
<<ratings, echo = FALSE, results = 'asis'>>=
xtable(paper_stuff$ratings, caption = c("Ratings Data, as seen in Database", "Ratings Table"))
@
To keep the information as compact as possible, individual players and games are referred to by an ID number, which corresponds to a row in one of the key tables. These key tables contain information about the players and game that we do not necessarily need to include with each rating or classifier. This greatly reduces the size of the database needed to store and work with the information.
<<player_key, echo = FALSE, results = 'asis'>>=
xtable(paper_stuff$player_key, caption = c("The Player Key relates important info about the individual to the userid", "Player Key Table"))
@
<<game_key, echo = FALSE, results = 'asis'>>=
xtable(paper_stuff$game_key[,1:6], caption = c("The Game Key gives useful information about the game", "Game Key Table"))
@
Not all of the data was scraped before creating the application. Data for 6,088 games was scraped and stored across the four tables. The reason for not scraping all of the data is twofold. First of all, for those 6,088 games there are over 3 million unique ratings. If we were to expand it to all 75,756 games known to BoardGameGeek, then our data set would be cripplingly large for all but very high-powered machines. %the small desktop I have to work on.
Instead, we %Secondly, this will force us to
implement a continuous data scraping procedure and update the recommender model dynamically: Each time a person uses the app, %all we have to do is
we make sure to check that all of their current information is correct. We look through known ratings and scrape their user page on BoardGameGeek. Based on this, we revise old ratings and add new ones, as well as check to see that all of the games that this user has rated are included in the database!
\section{Model}
\subsection{Model Selection}
We base our implementation of a recommender system %To begin implementing such a model, we can look to
on two papers which aimed to improve upon one the most famous recommender systems of all: Netflix. In 2006, Netflix offered a \$1 Million prize for an improvement on their algorithm for predicting user rating on movies they had not seen. In 2008, two teams: BigChaos and BellKor (the eventual winner), published papers on their methodology. In the end, these teams ended up taking something of a shotgun approach, using a linear blend of many different models to come up with a single super-model.
Our approach is to borrow from a single element of their shotgun modeling. Where BigChaos and BellKor use many different models of many different types, we choose a simple implementation of a single model.
There are a few things that make sense to model first. Certain games are more popular than other games, and certain people are naturally more or less generous than others in their ratings. These two effects are very important to measure, as the first allows us to provide estimates for people that have rated no games (and therefore we know nothing about), and the second allows us to fine tune estimates for people who have rated many games. We implement these in a similar manner to the way that BigChaos implemented the Global Effects model \citep{bigchaos}.
While they implemented 14 steps including Movie Effect, User Effect, and those variables crossed with variables like time since rating, average rating for the movie, production year, standard deviations, as well as others, we stick to using only two steps: one effect for each game and one effect for each player.
To compare models to each other we use root mean square error (RMSE). We take our data and split it into a training and test set. The model is fit with the training data, and then we predict the values in the test set. From those values, we can then find the RMSE using the equation below, where $y$ are the observed values from the test set, $\widehat{y}$ are the predicted values from the model built with the training set, and $n$ is the number of observations in the test dataset.
\begin{equation}\label{eq:rmse}
\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^n \left(y_i - \widehat{y_i}\right)^2}
\end{equation}
It is important to note that RMSE is a relative measure, and unless we are comparing models on the same data and subset, a similar magnitude of change in RMSE between two different datasets can be meaningless. Therefore when we look at the RMSE results on the Netflix data, we look to the RMSE change as guidance for our model knowing that we may see different RMSE changes when we implement it with different data.
% \hh{XXX explain RMSE; make sure in the next paragraph to always point out what is yours and what is netflix. You know all those distinctions, but for any other reader it is not that obvious.}
From the RMSE of the probe set of the Netflix data, the difference between using only user/movie effects and using all 14 effects was .02, a decrease of about 2\%. Noting this negligible increase in accuracy and recognizing my lack of significant computing resources, we only use user and game effects. This model is fit on my home desktop, which has an Intel Core i7-4770K CPU @ 3.5GHz and 32GB of RAM running 64-bit Windows 8. With the help of additional computing cores, RAM, and better parallelization, we could hope to achieve the 2\% accuracy increase seen in the Netflix Data, but more than likely it would not be worth it, as a quick cost/benefit analysis reveals that I do not need to hit the nail on the head every time, I just need to implement a method that reliably works a good enough percentage of the time. To this coder, a 2\% increase in accuracy just is not worth the money.
After we fit the general effects, we wish to model the remaining residuals, ideally in some way that gives us informative numeric summaries about the individual users in the process. In addition, we'd also like to compare a users ratings to other users, and hopefully use the behavior of similar users to predict new ratings for someone. A tempting first step is to, for a particular game, find other people who have rated that game who have rated games that you have rated. Then, you could find people who rated games similarly to you, and use their ratings of the new game as an estimate. This often does not work, as sometimes the number of people who have rated the same games as you can be relatively low, if not 0 (show a graph for this part).
To get around the problem of low crossover in game ratings, we turn to the characteristics of the games themselves. This kills two birds with one stone. We create, for each user, a rating for each game mechanic and genre. Then, since there are so many mechanics and genres of game, a player only needs to rate a small, albeit diverse set of games before we can correlate that player's habits with other players, even if they have never played any of the same games! Then we can use those to filter rating results, as we know what board game mechanics that a particular player enjoy.
We now have a very simple two step model that allows us to predict ratings for new board games that a person may not have even played! Also, it allows us to rate games that nobody has played before: We can use the overall mean as a starting point, then for an individual player, we can adjust the rating based on their ratings of similar games from other genres.
\subsection{Model Sequence}
The specifics of the model follow very closely to those outlined in the Global Effects and knn sections of the BigChaos paper [need bib]. Here, we cover the specifics of both. We step through the model in sequence, removing the effects that games and players naturally place on ratings (some games are naturally rated higher than others, some players are more or less generous than others). After removing those larger effects, we model the remaining nuance using characteristics about the games that were rated. From there, we strive to find users who have similar tastes in game characteristics, and then finally use that closeness to find ratings for games that a user has not yet rated.
\subsubsection{Global Effects}
The general idea behind global effects is to account for the natural biases present in particular games or players. For example, we might have two players who both play and enjoy the same game the same amount. However, one person might give this game a score of 10 while the other gives it a score of 9. Despite their overall enjoyment being the same, perhaps the latter player just tends to be a harsher rater in general. The global effect for this person would then be lower than that of the much more cheery fellow who gave the game a 10. Similarly, this effect can be present for games due to any number of reasons.
We estimate these effects one at a time using a hierarchical structure. Define $r_{u,i}^{(1)}$ be the rating from user $u$ for game $i$. The first effect we fit is a global mean, which we denote with the parameter $\theta_g$, and we model the ratings using the following formula:
\begin{equation}
r_{u,i}^{(1)} = \theta_g + \epsilon_{u,i}^{(1)}
\end{equation}
$\theta_g$ is estimated using a simple mean of the $r_{u,i}^{(1)}$'s. After removing a global mean, we model the global effect for each game using the residuals from the global mean step. For this purpose, define $r_{u,i}^{(2)} = \epsilon_{u,i}^{(1)}$, which we model using global effect for each game, denoted $\theta_i$. The residuals are then modeled using the following, where $x_i$ is an indicator variable that the residual belongs to game $i$:
\begin{equation}
r_{u,i}^{(2)} = \theta_i x_{i} + \epsilon_{u,i}^{(2)}
\end{equation}
To estimate $\theta_i$, we want something that is like a mean, but also contains an offset to avoid overfitting. Following from \cite{scalecollab}, $\widehat{\theta}_{i}$ is estimated using the following, where $\alpha_{game}$ is a tuning parameter, and the sum is over all residuals for game $i$ and $n_{i}$ is the number of residuals for game $i$:
\begin{equation}
\widehat{\theta}_{i} = \frac{\sum r_{u,i}^{(2)}}{n_{i} + \alpha_{game}}
\end{equation}
The final step in the global effects stage is finding the global effect for each user denoted $\theta_{u}$. In a similar manner to the game global effects, once again taking the residual from the previous step and model them. To proceed, let $r_{u,i}^{(3)} = \epsilon_{u,i}^{(2)}$, and those residuals are modeled using:
\begin{equation}
r_{u,i}^{(3)} = \theta_{u}x_u + \epsilon_{u,i}^{(3)}
\end{equation}
And again, $\theta_{u}$ is estimated with $\widehat{\theta}_{u}$, which is calculated with:
\begin{equation}
\widehat{\theta}_{u} = \frac{\sum r_{u,i}^{(2)}}{n_{i} + \alpha_{player}}
\end{equation}
Where $\alpha_{player}$ is a tuning parameter to prevent overfitting.
Noting that this can produce estimates that are above 10 and below 1, ratings that fall outside of $[1,10]$ are truncated to the closest interval endpoint.
% The general idea is to remove the global mean from all ratings to get a residual for each observation. We will group these residuals by game, and then find an effect for that game. Subtracting this effect from the residual, we will take those residuals and repeat the same for each user. The motivations for using the formulas that follow can be found in Scalable Collaborative Filtering with Jointly Derived Neighborhood Interpolation Weights (make this a bib reference, \hh{you still need to paraphrase the motivation here.}).
%
% To begin $r_{u,i}$ will denote the rating by user $u$ for game $i$, where $y_{u,i}$ is a number between 1 and 10. Then, we have $$y_{u,i} = \mu_{global} + \mu_{i} + \mu_{u} + error$$
%
% However, we will approach this in steps, as follows: \hh{what is the benefit of using a hierarchical approach?}
%
% \begin{center}
% \begin{align}
% y_{u,i} &= \mu_{global} + error_{global} \\
% error_{global} &= \mu_{i} + error_{game} \\
% error_{game} &= \mu_{u} + error_{user}
% \end{align}
% \end{center}
%
% And we will use the estimation method outlined in the paper (need ref) for each, as follows below:
%
% \begin{center}
% \begin{align}
% \mu_{global} = \frac{\sum_{u = 1}^{N_{u}}\sum_{i = 1}^{N_{games}}}{1}
% \end{align}
% \end{center}
% \hh{something went wrong in the formula - what are you summing over?}
\subsubsection{Classifier Aggregation}
After finding the global effects for the games and the users, we're left with residuals for every rating that every user has made. These residuals represent the last of what differentiate the users based on their preferences for the nuances of each game. We need a way to quantify these nuances, and then attempt to model these residual so we can get a better understanding of each individual user. Each game in the dataset can be described by things like what mechanics it has, or what family of game it falls under, when it was published, etc. All in all, there are up to 1,851 different quantifiable ways that were chose to describe games. For example, the game Dragonmaster was published in 1981, has a playing time of 30 minutes, is under the family ``Animals: Dragons", falls under the categories ``Card Game", ``Fantasy Game", is in the subdomain ``Strategy Games", uses the mechanic ``Trick-taking", and seats 3 or 4 players. Each one of these is treated as a classifier for the game. A table storing this information by GAME\_ID can be found in Table 2.
In order to model the remaining residuals, we give each player a score based on how they rated games with similar classifiers; that is, how did they rate games with the ``Trick-taking" mechanic, or how did they rate ``Card Games"?
Define $C_{u,z}$ to the be the classifier score for user $u$ for classifier $z$ where $z$ is an index of the 1,851 classifiers. Then:
\begin{equation}
C_{u,z} = \frac{\sum \epsilon_{u,i \in z}^{(3)}}{n_{z}}
\end{equation}
where the sum is over all residuals that belong to a game that can be described by classifier $z$. For example, if the classifier was the mechanic ``Trick-taking" and the user was $u = 3$, the classifier score would be the mean of all of the $\epsilon_{3,i}^{(3)}$ for \textit{all} games that the user has rated that have the ``Trick-taking" mechanic. If it is the case that a user has rated no games with a particular classifier, we leave it as an unobserved value.
\subsubsection{k-nearest-neighbors}
Now we have a few classifier values defined for each user who has rated at least one game. In order to create predictions for other games that the user has not played, we find the correlation between players based on their classifier values. For each pair of players, the correlation between them can be found by finding all of the classifiers that have scores for both players, and then finding the correlation between those values. To make sure that we do not find high correlations between players who only have very few classifier scores in common, we consider a correlation between players, if they have more than six common classifier scores.
Now to generate an estimate for user $u$ for game $i$, we find all of the players who have a correlation value with user $u$ who have also rated game $i$. The rating for game $i$ is then the weighted average of the $k$ most highly correlated (or nearest) neighbors of user $u$.
The weights for this are $(1 - \rho_{u,k})^{-1}$ where $\rho_{u,k}$ is the correlation between user $u$ and user $k$. The number of neighbors to use $k$ is determined by cross-validation on a user-to-user basis. In the end $z_{u,i}$ is the weighted average of the k-nearest-neighbors scores for user $u$ and game $i$, as given in the equation below (where $u_{[j]}$ is the user with the $j^{\text{th}}$ highest correlation between themselves and user $u$):
\begin{equation}
z_{u,i} = \frac{\sum_{j = 1}^{k} \epsilon_{u_{[j]},i}^{(3)} \cdot \frac{1}{1 - \rho_{u,u_{[j]}}}}{\sum_{j = 1}^{k}\frac{1}{1 - \rho_{u,u_{[j]}}}}
\end{equation}
\subsubsection{Back to Rating}
From what we have produced above in equations (2)-(9), of the estimated rating for user $u$ and game $i$ is given by:
\begin{equation}
\widehat{r}_{u,i} = \widehat{\theta}_g + \widehat{\theta}_{i} + \widehat{\theta}_{u} + z_{u,i}
\end{equation}
\subsection{Parameter Optimization \& Selection}
There are three parameters that the model relies on: The two $\alpha$ values that balance out biasedness in the global effects section, and the $k$ in the k-nearest neighbors portion of the algorithm. The $\alpha$s are fitted globally (that is, every individual uses the same values for $\alpha_{game}$ and $\alpha_{player}$), while the values for $k$ are done locally (each user has their own value of $k$).
\subsubsection{Optimizing tuning parameters $\alpha$}
To optimize the tuning parameter $\alpha$ values, we follow the model fitting procedure up to the point in which the parameter is used, estimate the model parameter using a grid of possible values for the turning parameter, and then find which value for the tuning parameter yields the model parameter estimate which has the lowest RMSE (equation \ref{eq:rmse}).
RMSE is calculated by finding the difference between the rating and the rating that is estimated using only the global effects parameter(s). Since the game parameter is fit first, we fit the $\alpha$ tuning parameter for the game effect first. After finding an optimal value, we fit the $\alpha$ tuning parameter for the player effect, as the game effect is needed to estimate the player effect.
The optimization uses ten-fold cross validation, wherein the data is randomly divided into 10 separate sections of roughly equal size. We then treat each of the 10 sections as a test set and the other 9 to use as a training set. $\widehat{\theta}_g$ is fit first, and then from a grid of $\alpha_{game}$ values (where $\alpha_{game} \in [1,10000]$), a value of $\widehat{\theta}_i$ is produced for each of the $\alpha_{game}$ in the grid. For each of the $\widehat{\theta}_i$ that are estimated, we use the section that was not in the training set, the RMSE is calculated here by:
\begin{equation}
\text{RMSE} = \sqrt{\frac{1}{n} \sum \left(r_{u,i}^{(1)} - (\theta_g + \theta_i)\right)^2}
\end{equation}
where we sum over all rated games in the test set, and $n$ is the number of ratings in the test set. The value of $\alpha_{game}$ that minimizes the overall RMSE is used. In figure~\ref{fig:gamealpha}, we see the optimal value for $\alpha_{game}$ sits in a nice trough, with the minimum at $\alpha_{game} = 1.802$.
\begin{figure}
\centering
<<game_alpha_optim, echo = F, fig.width = 4, fig.height = 4, out.width='0.5\\textwidth'>>=
load("game_graph2")
game_graph
@
\caption{\label{fig:gamealpha} RMSE of candidate $\alpha_{game}$ values.}
\end{figure}
After finding the optimal value for $\alpha_{game}$ we can move on to optimizing $\alpha_{player}$. Once again, the data is divided randomly into approximately 10 equally sized sections, and each is treated as an individual test set while the other 9 combined will serve as a training set. A grid of $\alpha_{player} \in [1,10000]$ is used to create a value for $\widehat{\theta}_{u}$, and for each of those values, we calculate the RMSE as below:
\begin{equation}
\text{RMSE} = \frac{ \sum r_{u,i}^{(1)} - (\theta_g + \theta_i + \theta_u)}{n}
\end{equation}
Once again, the value of $\alpha_{player}$ with the lowest RMSE is the value used, which as can be seen in ~\ref{fig:playeralpha} is $\alpha_{player} = 2.022$.
\begin{figure}
\centering
<<player_alpha_optim, echo = F, fig.width = 4, fig.height = 4, out.width='0.5\\textwidth'>>=
load("player_graph2")
player_graph
@
\caption{\label{fig:playeralpha} RMSE of candidate $\alpha_{player}$ values.}
\end{figure}
One may immediately notice that different $\alpha$ candidates have a larger effect on RMSE for players than they have for games. The reason for this is that there are far more ratings for each game than there are for each player. This results in less overfitting for each game and more overfitting for each player, as is seen in the histograms for the log number of ratings for each game and player. Log was used as the number of ratings for player and game is highly right skewed.
\begin{figure}
\centering
<<hists, echo = F, fig.width = 4, fig.height = 4, out.width='0.45\\textwidth', fig.show='hold'>>=
load("game_n_rats2")
load("plyr_n_rats2")
game_n_plot
plyr_n_plot
@
\caption{\label{fig:ratn} Frequency of log number of ratings for games (left) and players (right).}
\end{figure}
\begin{figure}\centering
<<k_optim, echo = F, out.width='0.45\\textwidth'>>=
load("k_plot")
k_plot
@
\caption{\label{fig:k} RMSE of candidate k values. For this player, 18 is the optimal value.}
\end{figure}
\section{Technical Implementation}
This algorithm is implemented in R \citep{rcite} using shiny \citep{shiny} as a vehicle for a simple, easy to use user interface. Additionally, the packages RSQLite \citep{RSQLite}, dplyr \citep{dplyr}, and magrittr \citep{magrittr} were used. Behind the scenes, the data is stored in a data base with six tables: A player key table, which relates the players information to a unique player userid; A game key table, which relates game information to a unique game id; A ratings table which stores the ratings (a score from 1-10) along with the userid of the person who rated it and the game id which the rating is for; the classifiers table, relates the game id to characteristics about the game that is useful in the classifier aggregation and k-nearest-neighbors section; the predicted ratings table, which stores predicted ratings if ratings have already been fit for a particular user; and the aggregated classifier ratings for each user. This data storage allows for ease of implementation: we only really need the ratings table and the classifier table to do all the work, and the key tables are really only needed to display the information to the user in a format which they understand.
In an implementation of this, the application begins by having the user enter the username they have registered on \url{boardgamegeek.com}. As seen in (~\ref{fig:blanksc}), there is an area to enter the users information in, and a button just below that, when clicked, begins the process.
\begin{center}
\begin{figure}\centering
\includegraphics[scale=0.4]{blank_login.png}
\caption{\label{fig:blanksc} A blank log in screen}
\end{figure}
\end{center}
Instead of logging the users ratings directly through the application, we leverage the existing infrastructure of board game geek. After checking that its a valid username, the application scours {\tt boardgamegeek} to see if the database's information is complete: that is, do we have this user's data in the player key? Have they rated games that are not in our database? Have they re rated games that are in our database? Have they rated games that are not in our database? If so, we also need to bring in other users ratings for any games that we may not have had in our database (as a single rating for a game is not very useful and greatly skews the results for that game). After this initial step and database update, we can see if they already have saved predicted ratings in the system, and if not, we proceed to fit the model.
We first apply the global effects strategy as described in the modeling section. The optimization of the $\alpha$ tuning parameters will have occurred sometime before this and the algorithm uses those stored values. A weekly or monthly automatic updating procedure is to be used to keep these values up to date as more data comes into the system. From the global effects, we then aggregate those based on the classifier aggregation procedure as described above. These are then stored in the database for future use by the system. After that, before we move onto the k-nearest neighbors implementation, we must first find an optimal value for $k$ using the 10-fold cross validation as described above.
Finally, after the 10-fold cross-validation, we're ready to run the other games through the algorithm to get a predicted score. The games that the user has already rated will then be replaced with those ratings, and the full result is listed for the user in the ``My Recommendations" Tab, which can be seen in Figure ~\ref{fig:act_use}.
\begin{figure}\centering
\includegraphics[scale=0.4]{active_use.png}
\caption{\label{fig:act_use} The ``My Recommendations" tab, populated for user cwmassey.}
\end{figure}
Since we saved the aggregated classifier ratings for the individual user, we can suggest ways for the user to subset this data. By providing the graph in Figure ~\ref{fig:subdplot} below to the player they can see for themselves their not-necessarily obvious preferences in game family or mechanic (in this case Subdomain). For this particular user, this would indicate them that they tend to prefer Customizable or Party games rather than Children's Games or Strategy Games. They can then search by their preferred subdomain, and get ratings for games that fit that description. This plot is provided to the user within the ``My Category Ratings" tab. The plot for the specific category rating can be chosen by the user, so the user can select from Category, Family, Mechanic, Subdomain, Number of Players, Playing Time, or Year Published.
\begin{figure}\centering
<<subdom_plot, echo = FALSE, fig.width = 4, fig.height = 4, out.width='0.5\\textwidth'>>=
load("classifiers_plot")
out
@
\caption{\label{fig:subdplot} A player preference plot for subdomains for the user cwmassey.}
\end{figure}
\begin{figure}\centering
\includegraphics[scale=0.4]{cat_ratings.png}
\caption{\label{fig:cat_rat_tab} The ``My Category Ratings" tab.}
\end{figure}
If a user wants to rate a new game, they have to go back to BoardGameGeek and rate games there. The app then needs to be restarted, which will pull the new rating, and update the database.
\section{Conclusion}
While the result was not as smooth as hoped, the Netflix recommender system made an excellent base for building a recommender system for board games. It would not be too much of a stretch to imagine that, with a reasonable amount of user rating data about something, and enough descriptive data about that thing, that a similar implementation would be of great use there as well.
There are a few directions that this project is naturally headed in. The first of which is to speed up the prediction generating process. Currently, each time a user starts the app with new data, the whole model is fit again, including generating a truly massive correlation matrix. Instead of doing this each time a person logs in, it may be more advantageous to fit the correlation matrix daily, and then only update the global effects portion of the model for each user, as that portion is relatively snappy. Hopefully this would bring the fitting time down from 20 minutes to a matter of seconds. Additionally, we do not have the full data that is contained on BoardGameGeek. There is much more data to scrape than time for this paper allowed, so going and continually updating the database would be advantageous.
Another possible step is to generalize the whole recommendation system procedure, so that if you had rating and classifying information on any manner of subjects, you could implement a recommendation system with little to no effort. This could have a whole host of applications from what restaurants you might enjoy to what books you might like to read.
Finally, there are hosts of other recommender systems out there. Augmenting this approach with additional models as done in the Netflix Prize papers would undoubtedly be a logical step forward for this kind of modeling.
\bibliographystyle{asa}
\bibliography{references}
\def\thebibliography#1{\section*{References\@mkboth
{REFERENCES}{REFERENCES}}\list
%% default is no labels, for those not using \cite or BibTeX
% undent first line of a reference by using negative \itemindent.
{}{\setlength{\leftmargin}{3em}\setlength{\labelsep}{0pt}\itemindent=-\leftmargin}
\def\newblock{\hskip .11em plus .33em minus -.07em}
\sloppy\clubpenalty4000\widowpenalty4000
\sfcode`\.=1000\relax}
\def\@biblabel#1{\hfill}
% do not box citations, add space between multiple citations, separate with ;
\def\@citex[#1]#2{\if@filesw\immediate\write\@auxout{\string\citation{#2}}\fi
\def\@citea{}\@cite{\@for\@citeb:=#2\do
{\@citea\def\@citea{; }\@ifundefined
{b@\@citeb}{{\bf ?}\@warning
{Citation `\@citeb' on page \thepage \space undefined}}%
{\csname b@\@citeb\endcsname}}}{#1}}
\end{document}