
Originally Posted by Karamazovmm
interesting, can you plot some tendencies as well?
Good idea. If the data does fit the pattern a × b^x, then we can plot log(players) against the day number x and fit a line using linear least-squares regression. The intercept and slope of that line give us log(a) and log(b) respectively. I should say at this point that, although I originally presented an example with exactly this formula, I don't believe it will be a good fit for the data. My original expectation was that the data would fit the sum of three such series (i.e. a × b^x + c × d^x + e × f^x). The original example was illustrative only, and I kept to a single series for simplicity. It is also important to point out that, with some very small caveats, all of the points about the original example still hold with a sum of series.
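As a rough illustration of the log-linear fit described above, here is a minimal Python sketch. The function name `fit_exponential` and the synthetic data are my own assumptions for illustration; the numbers are not the real player counts.

```python
import math

def fit_exponential(days, players):
    """Fit players ~ a * b**x by ordinary least squares on log(players)."""
    logs = [math.log(p) for p in players]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    # Standard closed-form slope and intercept for a straight-line fit
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The line is log(players) = log(a) + x * log(b)
    return math.exp(intercept), math.exp(slope)

# Synthetic, noise-free data generated from a known a and b (NOT real data)
days = list(range(30))
players = [50000 * 0.97 ** x for x in days]

a, b = fit_exponential(days, players)
print(a, b)
```

Because the fit is done in log space, it minimises relative rather than absolute error, so the large early values do not dominate the fit the way they would in a direct fit.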
That said, if we do model the data as a simple a × b^x, we get the following equations (using the data from my previous post, plus an additional 7 days' worth of data):
Rome II: 79686.1091119406 × 0.9749601828^x
Shogun 2: 34178.518191299 × 0.9744597318^x
Which looks like this (including a full half-year of Shogun 2 data for comparison only; it was not used to fit the equation):
This is clearly nonsense. But what happens if we move to less pure statistical methods, model the data as a sum of two series, and fit the data by eye? I came up with the following (fitting the equation to the full set of available data: a half-year of Shogun 2 and everything to date for Rome II):
Rome II: 30000 × 0.998^x + 90000 × 0.91^x
Shogun 2: 10000 × 0.998^x + 30000 × 0.95^x
Which gives:

I think that's looking pretty good. If we accept this as a good fit to the data, the implication is that there are two groups of players: the 'hardcore', which is three times as large for Rome II as for Shogun 2 but otherwise identical, and the 'casuals', which are also three times as large for Rome II as for Shogun 2 but are leaving at 9% per day for Rome II compared to 5% per day for Shogun 2.
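To make the two-group reading concrete, here is a small Python sketch that evaluates the by-eye two-series model and finds the first day on which the 'casual' term drops below the 'hardcore' term. The coefficients are the ones quoted above; the function names `two_term` and `crossover_day` are hypothetical helpers of mine.

```python
def two_term(x, a, b, c, d):
    """Sum of two decaying series: a * b**x (hardcore) + c * d**x (casuals)."""
    return a * b ** x + c * d ** x

def crossover_day(a, b, c, d, max_days=3650):
    """First whole day on which the casual term is smaller than the hardcore term."""
    for x in range(max_days):
        if c * d ** x < a * b ** x:
            return x
    return None  # no crossover within max_days

# By-eye coefficients from the post: (a, b, c, d)
ROME2 = (30000, 0.998, 90000, 0.91)
SHOGUN2 = (10000, 0.998, 30000, 0.95)

for name, params in (("Rome II", ROME2), ("Shogun 2", SHOGUN2)):
    curve = [round(two_term(x, *params)) for x in (0, 7, 30, 90)]
    print(name, curve, "casuals < hardcore from day", crossover_day(*params))
```

The daily loss rate of each group is simply one minus its base, which is where the 9% (1 - 0.91) and 5% (1 - 0.95) figures come from.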
We can take this a step further by including an approximation for the weekly cycle, modelling the data as (a × b^x + c × d^x) × f(x), where f gives a different multiplier for each of the seven days of the week. Again, fitting the data by eye (my f functions were derived last week by modelling a subset of the data as a × b^x and taking the average deviation between that and the real data - I could probably do slightly better now with a different model):
Rome II: (26500 × 0.999^x + 85000 × 0.91^x) × f_{Rome II}(x)
Shogun 2: (9500 × 0.998^x + 30500 × 0.945^x) × f_{Shogun 2}(x)
Where:
Code:
f_{Rome II}(x) =
    0.978412443,  x = 0 (mod 7)
    0.9366720102, x = 1 (mod 7)
    0.9345420418, x = 2 (mod 7)
    0.9756896093, x = 3 (mod 7)
    1.1355291687, x = 4 (mod 7)
    1.232026096,  x = 5 (mod 7)
    1.0825580142, x = 6 (mod 7)

f_{Shogun 2}(x) =
    0.9804374065, x = 0 (mod 7)
    0.948469858,  x = 1 (mod 7)
    0.9471135231, x = 2 (mod 7)
    0.9758807701, x = 3 (mod 7)
    1.1021159256, x = 4 (mod 7)
    1.1781349105, x = 5 (mod 7)
    1.0915161481, x = 6 (mod 7)
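For anyone who wants to reproduce the curves, the weekly-cycle model could be sketched in Python roughly as follows. The coefficients and multipliers are copied from the post; the name `weekly_model` and the tuple layout are my own assumptions, and which weekday x = 0 corresponds to is not stated in the post.

```python
# Weekly multipliers indexed by x mod 7 (values copied from the post)
F_ROME2 = [0.978412443, 0.9366720102, 0.9345420418, 0.9756896093,
           1.1355291687, 1.232026096, 1.0825580142]
F_SHOGUN2 = [0.9804374065, 0.948469858, 0.9471135231, 0.9758807701,
             1.1021159256, 1.1781349105, 1.0915161481]

# Two-series coefficients (a, b, c, d) for the weekly-cycle fit
ROME2 = (26500, 0.999, 85000, 0.91)
SHOGUN2 = (9500, 0.998, 30500, 0.945)

def weekly_model(x, a, b, c, d, f):
    """(a * b**x + c * d**x) scaled by the multiplier for day x mod 7."""
    return (a * b ** x + c * d ** x) * f[x % 7]

# Predicted players for the first two weeks of each game
for name, params, f in (("Rome II", ROME2, F_ROME2),
                        ("Shogun 2", SHOGUN2, F_SHOGUN2)):
    print(name, [round(weekly_model(x, *params, f)) for x in range(14)])
```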
This gives us:

Which looks like a pretty good fit to me. How good? Let's look at the model as a proportion of the actual data:

The vast majority of the time the prediction is within ±10%, which is not bad at all. Obviously there's a lot less certainty in the model for Rome II, but if we accept these models we can conclude that the Rome II 'hardcore' is slightly less than three times the size of the Shogun 2 'hardcore' but is declining more slowly, and that the Rome II 'casuals' are also slightly less than three times the size of the Shogun 2 'casuals' but are declining a fair bit faster.
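The "model as a proportion of the actual data" check can be expressed in a few lines of Python. This is only a sketch: `fraction_within_tolerance` is a hypothetical helper, and the sample numbers below are made up purely to show the calculation, not taken from the real player counts.

```python
def fraction_within_tolerance(model, actual, tol=0.10):
    """Share of days on which model/actual lies within 1 +/- tol."""
    ratios = [m / a for m, a in zip(model, actual)]
    hits = sum(1 for r in ratios if abs(r - 1.0) <= tol)
    return hits / len(ratios)

# Made-up placeholder values (NOT real data): four days of model vs. actual
model_values = [100.0, 108.0, 95.0, 130.0]
actual_values = [100.0, 100.0, 100.0, 100.0]

share = fraction_within_tolerance(model_values, actual_values)
print(f"{share:.0%} of days within ±10%")
```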