Hi everyone,
As originally discussed here, we want to let players know that scores may change slightly, namely by going up. Obviously everyone likes getting more points, but we also want to be clear with you about why this is happening, particularly since we’re not changing the points system itself.
Following that last post, where we had to walk back some of our original thinking and re-analyze, here is an update! Bear with me, because the scoring change is actually part of fixing the discrepancies between Review Mode and the Activity Tracker. The TL;DR version: Review Mode (following one bug fix) has wound up being the more accurate of the two, so we’re going to fix the Activity Tracker so that it syncs up with what Review Mode shows you. As a result, you’ll sometimes get more points and won’t be negatively affected by sloppy TBs. If you would like to understand exactly what goes into all this, read on!
1. First, what is the remaining issue with Review Mode and the Activity Tracker?
Well, in a nutshell, although your accuracy stats balance out in terms of cubes’ final consensus, the code for Review Mode and the code for the Activity Tracker currently look at different consensus calculations. The Activity Tracker does not correctly account for how your trace actually gets put into consensus. So, naturally, we want to fix that.
2. To have the Activity Tracker match the % you see in Review Mode, we have to alter the scoring code.
Again, this doesn’t mean changing the actual metrics that go into the points you receive. But we do have to change the scoring code so that everyone is scored against what actually wound up in consensus; this is overdue, and the Activity Tracker relies on the scoring info. Right now, if you are Player 2, you are scored against the TBer as if the TBer had singlehandedly produced consensus. This means that even when your trace as Player 2 is accepted into the real consensus, you can be scored as if it wasn’t.
When we alter the scoring code, your score as Player 2 will tend to be higher, because you will be penalized only for overcoloring, not for undercoloring. So if the TBer adds something that Player 2 doesn’t, there’s no scoring penalty to Player 2; but if Player 2 adds something that the TBer doesn’t, Player 2 will still receive the same slightly penalized score as under the old scoring code.
However, while everyone might appreciate a higher score, we know that it’s much more common for a TBer to miss stuff and for Player 2 to correctly add it, which is exactly the case that would still be penalized. In general, we would rather penalize undercoloring than overcoloring. So…
3. We’d like to additionally change the % agreement threshold at which segments are added to consensus.
We want to make it so that if 2 enfranchised players play a cube, only 1 of them has to add a segment for it to be included in consensus. (For 3 players, 2 of them would have to add it; for 4, still 2; in the rare case of 5, we’d need 3 to add.)
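For anyone who likes to see the numbers spelled out, here is a minimal sketch of the proposed threshold. The function name `required_adds` is made up for illustration, and the "half, rounded up" formula is only inferred from the four cases listed above; it is not taken from the actual consensus code, which may compute this differently.

```python
def required_adds(num_players: int) -> int:
    """Number of enfranchised players who must add a segment for it to
    enter consensus, under the proposed threshold.

    Assumption: "half the players, rounded up", which reproduces the
    cases given in the post (2 -> 1, 3 -> 2, 4 -> 2, 5 -> 3).
    """
    if num_players < 1:
        raise ValueError("a cube needs at least one player")
    return (num_players + 1) // 2  # integer ceiling of num_players / 2


if __name__ == "__main__":
    for n in range(2, 6):
        print(f"{n} players: {required_adds(n)} must add the segment")
```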
This would flip things around: Player 2 would be penalized for missing something the TBer found, but would not be penalized for finding something the TBer missed. So Player 2 would still earn more points, and would be penalized only when it makes the most sense to penalize. Player 3 would also wind up being less penalized in the long run by a bad trace from the TBer or Player 2.
Possible tradeoffs include easier merger growth and someone being able to deliberately overcolor for more points. But since it’s always easier to spot a merger than to spot a missing branch, and since you would have to know exactly which player number you were on a cube in order to make overcoloring work for you, HQ isn’t super concerned about these issues. One other thing to consider is that with the threshold alteration, disenfranchised players would wind up affecting consensus less than they currently do, but HQ would like to stress that those players already barely affect it.
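To make the flip concrete, here is a rough sketch for a two-player cube. Everything in it is an assumption for illustration: the segment labels, the helper names `consensus` and `disagreements`, and the idea that the current rule for two players amounts to "both must add" (which is inferred from the description above, not stated by the real code). It also ignores player weighting and does not model the actual points formula; it only shows which segments Player 2 would be counted as wrong on.

```python
from collections import Counter


def consensus(traces, threshold):
    """Segments added by at least `threshold` of the given traces."""
    counts = Counter(seg for trace in traces for seg in set(trace))
    return {seg for seg, n in counts.items() if n >= threshold}


def disagreements(trace, cons):
    """Where one player's trace differs from consensus.

    "missed" = in consensus but not in the trace (undercoloring);
    "extra"  = in the trace but not in consensus (overcoloring).
    """
    trace = set(trace)
    return {"missed": cons - trace, "extra": trace - cons}


# Hypothetical traces: the TBer misses "d", Player 2 misses "c".
tber = {"a", "b", "c"}
player2 = {"a", "b", "d"}

for label, threshold in (("both must add", 2), ("one is enough", 1)):
    cons = consensus([tber, player2], threshold)
    print(label, sorted(cons), disagreements(player2, cons))
```

With the "both must add" rule, Player 2’s only disagreement is the extra segment `d`; with the "one is enough" rule, it is the missed segment `c` instead.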
4. So, we’re pretty convinced of the benefits on this end, but do you have any questions, any concerns we haven’t addressed here, etc.?
Let us know in the comments and we’ll try to clarify the above! If a visual example would help, I have something in mind based on what Chris showed the rest of us today. We don’t expect to be ready to make these two changes (scoring and consensus threshold) for at least a few more days, because we still have to make certain database preparations.