Hey everyone! Long time no see. There haven’t been many developments with the site the past couple months (since I removed the ads), but there are a couple things I want to bring to light today.
Word on the street is that there were prereleases last weekend. If you went to one (or are going to one during an upcoming weekend), it would be awesome if you could help contribute card scans! You can find instructions on how to calibrate your scanner for optimal results here.
You do not need to crop the scans; I’ll be able to find someone to take care of that, unless you’re good at it. Last time Spike P. helped out with the image editing, and he did a stellar job.
I’ve got all the cards in the system as drafts right now; we just need the images and I’ll be able to publish them. I know there are some low quality scans floating around, but I’d rather wait for better ones since I know I’d eventually have to replace them.
I mentioned in the past that I thought the current rating system was going pretty well, and that the ratings would even out over time to become fairly accurate. This, however, hasn’t really panned out, as it seems like the first couple people that rate a card have a huge influence on its score.
For example, if a card starts off with a couple 10 ratings, subsequent raters usually assume the card is good, and give it a good rating as well (even if the card actually sucks). Vice-versa happens as well (bad ratings for good cards).
It pisses me off that the rating system hasn’t worked out because I really wanted this to be a good resource for newer players, so they could actually get a decent idea how good or bad a card really is. Right now, the ratings are a pretty lousy indicator of a card’s playability.
However, I do have another (better) rating system in mind. This is an early idea I had for how to get the most accurate ratings, but it was going to be more complicated to implement, so I took the easy way out and went with a pre-packaged ratings plugin (which is what we’re using now).
While I’m lying in bed trying to fall asleep, I often philosophize and think about deep topics I don’t quite yet understand. One night last week I thought about the alternate rating system, and actually figured out a way to build it. I ran the scenarios through my head, then scribbled down the components necessary to make it work (so I wouldn’t forget anything), and set the paper aside to let the plan marinate a little longer in my brain.
Here’s basically how it’ll work (and PLEASE give me input if you have any ideas on how to make it better):
Right now you are shown one card, and you pick a score from 1 to 10. Numbers are extremely arbitrary; even I, after playing this game for years, don’t know how to accurately rate a card. There are so many variables to consider, and the numbers don’t really mean anything.
How do you quantify the “goodness” of a card, you know? “Gyarados SF is a 9/10.” Ok… what does that mean?
Instead, I think it’s a lot better to compare cards. If you show me Gyarados SF vs Volbeat TM, I can say with certainty that overall, Gyarados SF is the better card. Maybe not in every single game situation, but overall, Gyarados is the more playable card. Gyarados SF vs Professor Juniper? That’s a little more difficult, but I would say Juniper is the more playable (and overall stronger or game changing) card.
You can be a lot more certain about which cards are good when you do one-on-one comparisons. If you do enough of these comparisons (maybe a few hundred for each card), I think you can start to build an accurate picture of where cards rank in terms of playability.
The way this will work is that each card will have a link that says “Click to Rate,” and when you click it, an overlay window will display with the current card vs a random card. You will be prompted to click which card you think is better, then the window will refresh and a new random card will be matched up against our hero.
A new random card will be displayed until your mouse breaks, or you decide to click off the window. I want to get as much data as possible, so I’m not going to limit the number of matchups that show up. The more data we can collect, the better.
I was considering limiting the random card that appears to be in the same set as the initial card, but the issue with doing that is not all sets are created equal. Power Keepers is a much weaker set than Next Destinies, for example.
If Power Keepers cards were only matched up against Power Keepers cards, then some cards would end up seeming a lot better than they really are. Vice-versa applies with Next Destinies (good cards would seem worse than they really are).
I know with the power creep that has infested the game, newer cards are for the most part going to seem better than older cards… but that’s accurate, I think. It might be better to compare cards only to others that are Modified-legal at the time, but cards always end up being in multiple formats, and either lose or gain power over time. Comparing vs any random card should be good enough.
Side note: I’m also considering maybe keeping track of both a card’s rating compared to all cards AND compared to its set. That might be the way to go.
Each time a card is displayed in a matchup, it will get +1 to its number of impressions. If a card is picked, it gets +1 to its score. The card that isn’t picked gets 0 added to its score. The rating will simply be displayed as the card’s score divided by its number of impressions, multiplied by 100 to get a percentage.
For example, if Gyarados SF is pitted in 100 matchups, and is picked as the better card 87 times, it will have a rating of 87%.
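To make the mechanic concrete, here’s a minimal Python sketch of the score/impression bookkeeping described above. The class and function names are my own placeholders, not anything from the site’s actual code:

```python
class CardRating:
    """Tracks one card's matchup stats (placeholder name, not the site's code)."""

    def __init__(self, name):
        self.name = name
        self.impressions = 0  # times this card was shown in a matchup
        self.score = 0        # times this card was picked as the better one

    def rating(self):
        """Score divided by impressions, times 100 to get a percentage."""
        if self.impressions == 0:
            return None  # no data yet
        return 100 * self.score / self.impressions


def record_matchup(winner, loser):
    # Both cards were displayed, so both get an impression;
    # only the picked card gets a point added to its score.
    winner.impressions += 1
    winner.score += 1
    loser.impressions += 1
```

So after 100 matchups with 87 wins, `rating()` comes out to 87.0, matching the Gyarados SF example.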
I think it’s important each card receives a minimum number of impressions before its rating is displayed, in order to prevent rating bias. If a card is rated once, and receives a 100% rating, then subsequent raters might think the card is godly, and keep picking it, even though it’s not that hot.
My initial thoughts were to make a minimum of 100 impressions before the rating is displayed. That seems like it should be a decent sample size, but I’m not sure. I should have paid better attention during statistics class… I forget how to tell what sample size makes a number “statistically significant.” I can tell you though that there are almost 6,000 cards in the database.
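On the statistics question: instead of a fixed impression cutoff, one standard trick for ranking items with differing vote counts is the lower bound of the Wilson score interval, which penalizes small sample sizes automatically. This is me suggesting an alternative technique, not something from the original plan; a sketch:

```python
import math


def wilson_lower_bound(wins, n, z=1.96):
    """Lower bound of the Wilson score interval for a win proportion
    (z=1.96 gives ~95% confidence). A card with few impressions gets a
    conservative rating that rises as more data comes in."""
    if n == 0:
        return 0.0
    p = wins / n
    denom = 1 + z * z / n
    center = p + z * z / (2 * n)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * n)) / n)
    return (center - margin) / denom
```

With this, a card that went 1-for-1 scores far lower than one that went 87-for-100, so you could display ratings immediately without the first few votes looking “godly.”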
Lastly, I’m considering only letting authorized registered users vote, at least to start off. I want to prevent trolls like J-Wittz from giving Hoppips perfect ratings. I don’t like making people register, but it might be the best way to keep things legitimate.
At first I thought the trolling might be funny and went along with it, but it’s bad for the site. The database becomes a lot more helpful when the ratings are accurate.
The main thing I’m not sure how to deal with is repeat ratings. What I mean by that is the same two cards getting matched up against one another, repeatedly, before they’ve been matched up with unrated cards.
If Gyarados SF got matched up against Gust of Wind 9 times in a row, then finally got paired up with a meager Magikarp, it might have only a 10% rating when it’s really not that bad. Ideally, it would be matched up against every card out there one time, then repeat the cycle.
With a smaller database, I feel it would be a bigger issue, but with 6,000 cards, my theory is that things should even out. I’m sure there is some way to prevent the same two cards from being matched up before all 6,000 are cycled through, so I’ll have to look into this.
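One simple way to guarantee the full cycle would be to keep a shuffled queue of opponents per card and only refill it once it’s exhausted. This is just my own sketch of that idea, with made-up names, assuming card objects that can be compared for equality:

```python
import random


class OpponentCycle:
    """Serves each opponent exactly once, in random order, before any
    repeats -- avoiding the Gyarados-vs-Gust-of-Wind-nine-times-in-a-row
    problem. (A hypothetical sketch, not the site's actual code.)"""

    def __init__(self, card, all_cards):
        self.card = card
        # Every other card in the database is a potential opponent.
        self.pool = [c for c in all_cards if c != card]
        self.queue = []  # opponents not yet shown in the current cycle

    def next_opponent(self):
        if not self.queue:
            # Cycle exhausted: reshuffle the full pool and start over.
            self.queue = list(self.pool)
            random.shuffle(self.queue)
        return self.queue.pop()
```

The tradeoff is storing per-card state (roughly one queue per card being rated), which should be manageable even at 6,000 cards.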
I think that’s about it… I’ll try to start coding tonight, though playoff hockey has put a damper on my productivity the past week and a half. Expect it to be done sometime in May.
Please leave feedback if you have any, and thanks for reading!