This is an assumption without any substantial evidence. There has been only a single contest in which the changes were implemented. That's a current sample size of 1 which is nowhere near enough to show anything.
I mean, apart from the people literally saying that it's not a coincidence that they didn't give the critique they normally would...

First, the sample size isn't one, it's forty, because you count every member of both groups when counting a sample size. The group size is one, and also [insert problems with observational studies here], but the sample size is... I mean, I wouldn't trust a scientific study with a sample size of 40, but I don't have much of a choice and I'm not trying to be massively precise. Second, there's nothing stopping a single result in one group from being meaningful when set against a known prior trend; if it's multiple standard deviations out of line, it can easily land in the critical region (assuming a significance level of 0.05) on its own. Third, the fact that the change is noticeable after a single result actually implies that it was a very drastic one. If it weren't, it would only become noticeable after a long time.
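
To put rough numbers on the second point, here's a minimal sketch of checking one new result against a known prior trend. It assumes the prior results are roughly normally distributed. The mean, standard deviation and observation are made-up illustrative values, not figures from the contest, and `two_sided_p_value` is just a name I picked for the helper.

```python
import math

def two_sided_p_value(observation, prior_mean, prior_sd):
    """Return (z, p) for one observation against a known normal prior.

    z is the distance from the prior mean in standard deviations;
    p is the two-sided tail probability of a result at least that extreme.
    """
    z = (observation - prior_mean) / prior_sd
    # For a standard normal, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

ALPHA = 0.05  # significance level

# Hypothetical values: prior contests averaged 10 critiques with an SD of 2,
# and the one contest after the change got 4.
z, p = two_sided_p_value(observation=4.0, prior_mean=10.0, prior_sd=2.0)
print(f"z = {z:.2f}, p = {p:.4f}, in critical region: {p < ALPHA}")
```

With those invented numbers, the single result sits three standard deviations below the prior mean, which is well inside the critical region at 0.05. A result only one standard deviation out wouldn't be, which is the third point: you only see it in one result if the shift is big.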

But then there's the fact that you're trying to apply scientific rigour to something that was a relatively trivially obvious observation about something that doesn't really merit the scientific method, so I don't know what that says about you.