Better feedback and more transparent scoring

Hi everyone,

Sometimes I have a cube I’m a bit proud of. Some difficult areas, but with a bit of work, the 3D structure looks like a picture from a biology textbook and I’m reasonably sure I got it mostly right. Then I click “submit” and get a measly 23 points, wondering what I did wrong.

I couldn’t find much information about your scoring system, but I guess it’s something along the lines of “the more people agree with you, the higher your score” with some impressive statistical math behind it that I wouldn’t understand anyway. If that is the case, would it be possible to generate a kind of “heat map” of a cube, where controversial areas are rendered as red and undisputed areas as green?

Players could then compare their solution with the general population and maybe get an idea of what went wrong.
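Roughly, I imagine something like this sketch (all the names and the per-voxel vote counts are made up, I obviously don’t know how your data actually looks):

```python
def agreement_color(votes, n_players):
    """Return an RGB triple: green where players agree, red where they disagree."""
    frac = votes / n_players                   # fraction of players who colored this voxel
    controversy = 1.0 - abs(2.0 * frac - 1.0)  # 0 = unanimous, 1 = a 50/50 split
    red = int(255 * controversy)
    green = int(255 * (1.0 - controversy))
    return (red, green, 0)

# Example: 11 of 12 players colored a voxel -> near-unanimous, so mostly green.
print(agreement_color(11, 12))
```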

Thanks,
Ben

That’s actually a great idea, Ben! We’ll look into it!


There is one thing to note, however: just because everyone else doesn’t agree with you doesn’t mean that you are wrong. You could have found something correct that everyone else missed! Unfortunately, we have no way of knowing whether what you did was right or wrong until well after the fact. We opted for an inferior but more immediate point system rather than a more correct but more delayed one.

Matt,

I have had Ben’s experience; I have also had several cubes that took a long time and produced a 0 (zero) or a 20 for a score. I was puzzled.

Here are two suggestions to help you and Ben and myself:

  1. Please give us feedback in the 3D window of the mode score and mean score, not the average score, because the average is not a helpful description here of what we want to know. This would address Ben’s request directly and provide feedback on what everyone is up against with a simple numeric display, rather than adding more graphic computation. Also, do not show how many people have worked on a cell; that would raise the number of people who would skip the cell.

  2. For your quality assurance and to prevent majority false positives, provide a feedback tool, such as a lasso paint tool in red, to ‘flag’ or ‘mark’ a questionable area. For example, I’ve done enough cubes to have come across the AI painting across boundaries, or the opposite; I would have liked to draw a red line around the questionable area and then move on with my coloring.

Providing a tool like this would not add to your workload; rather, we would do more of your quality inspection for you, which would reduce your workload at the end of the process when checking the complete cell for accuracy. For example, a graphic of the completed cell could show dots, ‘x’s’, or some meaningful symbol where most colorers ran into trouble interpreting the slices. You could then go in by hand and make a final decision. The feedback flags would tally, so you could inspect the highest-tally areas on a cell for accuracy. We ought not to have a cell ID or a way to go back to a cell; you need us to be blind to the cell’s identity when we work on it, rather than getting involved in interpretation.

Hope this is helpful!

Flamingo Marty

Hey Flamingo Marty,


Those are interesting ideas too. I’m not sure how I feel about 1. On the one hand, it lets you know if you are in the right ballpark, but on the other hand, I can see it being very discouraging to find out that you are doing much worse than everyone else who did this cell. We’ve been trying to avoid giving negative feedback.

Number 2 is very hard. I agree that in the long run it might make things easier for us, but in the short term it’s a large amount of work for very little gain. The workarounds of you guys skipping mergers, and us cleaning up seeds with mergers, while annoying, are a lot less work than allowing users to fix them. That having been said, if we did have a feature like that, it would be basically exactly as you described.

How about a paint tool, where one could choose a thinner or thicker line and color an area manually? Several times I’ve tried to color a thin area that I think is inside a line, but when I click on it, the AI makes a big huge splotch outside of the line. If I could click a pencil tool with a thinner or wider line, I could apply color where I want it instead of where the AI thinks it should go.

@karens


I would definitely not provide this ‘out of the box’; the current tool generally works very smoothly. But it is a nice idea for more specific input to correct the AI when it makes big mistakes. However, I’m not certain whether implementing such a tool is that trivial.

@Karens, I know it’s frustrating when you try to color a small piece inside the line and it pulls up something huge outside the line, but what we’d like people to do in cases like that is just skip the little pieces that are merged. One of the reasons we don’t want to give you guys a paint tool is that it’d slow you down a lot (I actually had to paint a couple of these cubes to help train the AI, and it takes FOREVER). Basically, we’d rather you guys miss the few pieces the AI messed up on than spend an hour trying to perfect one branch.


@Whathecode, you’re right in thinking that implementing that tool would not be an easy feat at any rate.

Dear Matt,


Thanks for the explanation above. I didn’t anticipate the psychology of negative feedback in 1), and it makes sense to keep gamers ‘blind’ and not introduce any extraneous influence.

About 2) above: my second suggestion doesn’t make sense either. I understand now that any time a feature is added, gamer behavior must be anticipated for negative effects on your overall goals.

So maybe you can brainstorm a method to produce useful feedback tallies from existing resources, such as the current statistics you keep on cubes. The two ‘errors’ to detect are gamer mistakes and mergers. Could your cube statistics help you detect gamer disagreement above some threshold, after a sample of 5 to 30 gamers, and ‘flag’ that cube for your examination? That could be very quick to program, hopefully reduce the back-end workload, and maybe lead to greater cell accuracy, which is the ultimate goal.
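Just to make the idea concrete, something like this rough sketch (the names, numbers, and the notion of a per-player score per cube are all my guesses, not your actual statistics):

```python
def should_flag(scores, min_players=5, disagreement_threshold=0.3):
    """Flag a cube for manual review once enough players have done it
    and their scores disagree too much."""
    if len(scores) < min_players:
        return False                            # not enough players yet
    mean = sum(scores) / len(scores)
    spread = sum(abs(s - mean) for s in scores) / len(scores)
    # A large spread relative to the mean suggests the players disagree about this cube.
    return spread > disagreement_threshold * max(mean, 1.0)

# Six players, split into two camps -> flag the cube for a closer look.
print(should_flag([80, 75, 20, 90, 15, 85]))    # True
```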

Hey Marty,


So here at the lab, we like to talk about things in terms of false positives and false negatives.  A false positive is where we identify something as being part of the cell which does not actually belong.  A false negative is where something is not included which should be included.  In general, the whole system is set up to produce more false positives than false negatives.  The reason for this is that it’s much easier to see false positives by visual inspection than it is to see that something is missing.  It’s also easier to correct false positives (i.e. pruning is much easier than finding missing stuff).
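To make those terms concrete, here is a toy example with made-up voxel IDs (not how our pipeline actually represents things):

```python
reference = {1, 2, 3, 4, 5}        # voxels that truly belong to the cell
trace     = {2, 3, 4, 6, 7}        # voxels a player (or the AI) marked as part of the cell

false_positives = trace - reference   # marked, but not actually part of the cell: {6, 7}
false_negatives = reference - trace   # part of the cell, but missed: {1, 5}

print(false_positives, false_negatives)
```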

Now, there is also a distinction to be made between errors made by the players and errors made by the AI. With the AI, we can set a threshold which determines how much to join stuff together. The goal with the AI is to make segments (pieces of neuron) which are as large as possible while making as few mistakes as possible. Where we currently have the threshold set, we are willing to accept the number of errors that the AI makes. We could set that threshold lower, but then all of the segments (not just the mistakes) that the AI makes would be smaller. Probably much smaller. This would mean that it takes much longer for you guys to do each cube, and as such, less science would get done with the same effort from all of you. The trade-off that we make in this bargain is that sometimes there will be small mergers which you guys can’t fix. Most of these mergers won’t make a big difference in terms of the science. The very few that do, we can fix manually on our end.

We know that you guys want to make everything perfect, and we really appreciate that, because it’s part of why you guys are giving us such good results. However, there are always trade-offs to be made. “The best is the enemy of the good.” We decided to make a trade-off that will amplify your value to science substantially, and the cost is that you will never be able to trace these cells 100% perfectly. But that’s okay, because you guys were never going to be able to reach 100% anyway. It goes back to those sneaky false negatives. It is difficult to impossible for us to find absolutely every branch that comes off the cell. Sometimes these things are just really hard to see or judge. Also, we need to set thresholds on the number of users who find something before we include it, to prevent vandalism and to minimize the impact that users who are just learning can have on the end result. So I would put it this way to you all: whenever you see that the AI made a merger, don’t worry about those false positives just because you can see them. Take a few extra seconds to look over the branch that you just did again to make sure that you didn’t miss anything. We’ll take care of the false positives, but we need you guys to find all those false negatives.
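As a toy illustration of that user threshold (made-up numbers and names, not our actual code or the real threshold value):

```python
from collections import Counter

def consensus(traces, min_votes=2):
    """Keep only the voxels that at least `min_votes` independent players selected."""
    votes = Counter(voxel for trace in traces for voxel in trace)
    return {voxel for voxel, count in votes.items() if count >= min_votes}

# Three players trace the same cube; the voxel that only one player (a vandal
# or a beginner) selected gets dropped from the consensus.
print(consensus([{1, 2, 3}, {2, 3, 4}, {2, 3, 99}]))   # {2, 3}
```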

Matt

P.S. Sorry this turned into a sort of rant. A lot of people have been talking about this stuff on the forums recently, and I wanted to address the general concern.

That actually helped, Matt, because I didn’t know anything about the AI or the thresholds for false pos. and false neg. I’m not a “gamer” (though I do love me some Crash Bandicoot). Lately I’ve spent a lot of time on an arty / photo editing site where I’m able to color inside the lines using paint tools. So I found it very frustrating when the AI created mergers and I knew they shouldn’t be there but I couldn’t get rid of them without taking my target branch with them; I felt like I was doing something wrong. Now that I know more about how the AI works and how you guys have it set up, I won’t feel like I’m doing something wrong if there are mergers, as long as my target cell is colored in.

OK, I can see the difficulty of identifying false negatives, and therefore the need for tolerant AI settings, and why it won’t work to let users play with the tolerance settings of the AI coloring.

But I think this solution fits your requirements (and was already mentioned somewhere, maybe in a slightly different form):
Add a small checkbox for us labeled “possible merger”. As soon as the click count for a specific cube hits the threshold you are using right now, drop the cube, and all incident cubes that were conquered after trailblazing the merger cube, from the to-do list until a mod/admin has had a closer look at the suspicious cube, cut the wrong branch, and added the remaining, correct branch back to the queue.

Of course, I am assuming that some people simply do not skip mergers as intended but try to follow each branch as well as they can. If that is not the case, and we added this huuuuuge branch yesterday because the original merger was just not detectable by mere human eyes… then this solution won’t work either, yeah.

Hopefully, the number of people who skip those cubes and click the “possible merger” button would exceed the number of people coloring everything. And hitting a simple button hopefully lets you progress faster than coloring cubes that were erroneously added due to a former merger and erasing them later on…
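In rough code, the idea would look something like this (all names, the threshold value, and the cube/queue structures are made up; I don’t know your actual data model):

```python
FLAG_THRESHOLD = 3          # stand-in for the click-count threshold you already use

flag_counts = {}            # cube_id -> number of "possible merger" reports
on_hold = set()             # cubes pulled from the to-do list pending mod/admin review

def children(cube_id):
    """Hypothetical lookup: cubes conquered after trailblazing this cube."""
    return []               # placeholder for the real adjacency data

def report_possible_merger(cube_id):
    flag_counts[cube_id] = flag_counts.get(cube_id, 0) + 1
    if flag_counts[cube_id] >= FLAG_THRESHOLD:
        hold_subtree(cube_id)

def hold_subtree(cube_id):
    """Pull the flagged cube and everything downstream of it from the queue
    until a mod/admin cuts the wrong branch and requeues the correct one."""
    stack = [cube_id]
    while stack:
        current = stack.pop()
        if current in on_hold:
            continue
        on_hold.add(current)
        stack.extend(children(current))
```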

Obligatory edit:
A problematic situation would be one where all branches are blocked by a “possible merger” while all the mods/admins are celebrating Christmas :smiley:
I would give it a try nonetheless…

Edit 2: Today we conquered (again) another cell in SAC#1 :wink:

@nkem You make an excellent point about people needing to be able to report a possible merger. This would work a lot better for mergers caused by the AI than it would for mergers caused by players. AI mergers are usually pretty obvious in the cube (think of when those cell bodies randomly appear); mergers caused by players may actually look correct in the cube, and not be obviously wrong until the branch starts going the opposite way.

Another thing: the mergers that are visible in the overview are annoying, but the bigger ones are waaaay easier for us to find :)

Yeah, we used to have a “merger flag” but we dropped it at some point.  I think it’s time for it to make a comeback!