I'm not sure MCMC is the best way to go. If I understand correctly, you could run into problems because resources get drained over the course of the fight, so for anyone expending resources, future performance is negatively correlated with past performance.

I haven't looked at your code, but does it handle advantage, and are you happy with how it does so? I tried to do something similar a while ago, and the thing I found difficult was deciding when a shove action was worthwhile. I never got round to implementing spells, so combat abilities like shoving were very important, and that meant the most important factor in a lot of combats was initiative. For short combats, what mattered was initiative being high. For longer combats, what mattered was initiative being the same across the party.

Following the logical path of actions was difficult. If player 1 shoves, there is some probability that the target ends up prone, which diminishes the value of player 2 shoving. Player 3 may be a rogue who gets a big benefit from attacking a prone target, but they may be incapacitated before they get the chance. If player 1 or 2 knocks an enemy prone, the enemy should focus on stopping the rogue from attacking, and so on.
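To make the shove trade-off concrete, here is a rough sketch of the first step of that chain only. It assumes the usual 5e mechanics as I understand them (natural 1 misses, natural 20 hits, advantage takes the higher of two d20s, a shove is a contested check that the defender wins on a tie). The function names are mine, every attacker is assumed to share one attack bonus, the target is assumed to stay prone for all follow-up attacks, and damage and crits are ignored, so treat it as an illustration rather than a rules engine.

```python
from fractions import Fraction


def hit_chance(attack_bonus, armor_class, advantage=False):
    """Chance that one attack roll hits (natural 1 misses, natural 20 hits)."""
    hits = sum(
        1
        for roll in range(1, 21)
        if roll != 1 and (roll == 20 or roll + attack_bonus >= armor_class)
    )
    p = Fraction(hits, 20)
    # With advantage the attack hits unless both dice would have missed.
    return 1 - (1 - p) ** 2 if advantage else p


def contest_win_chance(attacker_bonus, defender_bonus):
    """Chance the shover wins the contested check; ties go to the defender."""
    wins = sum(
        1
        for a in range(1, 21)
        for d in range(1, 21)
        if a + attacker_bonus > d + defender_bonus
    )
    return Fraction(wins, 400)


def expected_hits_if_attack(attack_bonus, armor_class, followups):
    """Player 1 attacks normally, then `followups` allies attack normally."""
    return hit_chance(attack_bonus, armor_class) * (1 + followups)


def expected_hits_if_shove(attack_bonus, armor_class, shove_bonus,
                           defender_bonus, followups):
    """Player 1 shoves instead; allies get advantage only if the shove worked."""
    p_prone = contest_win_chance(shove_bonus, defender_bonus)
    p_adv = hit_chance(attack_bonus, armor_class, advantage=True)
    p_norm = hit_chance(attack_bonus, armor_class)
    return followups * (p_prone * p_adv + (1 - p_prone) * p_norm)


if __name__ == "__main__":
    for n in (1, 3, 5):
        attack = expected_hits_if_attack(5, 15, followups=n)
        shove = expected_hits_if_shove(5, 15, 3, 3, followups=n)
        print(n, float(attack), float(shove))
```

Even this one-step version shows the value of the shove depending on how many allies act before the target stands up, which is exactly why equal initiative mattered so much in my attempt. It says nothing about the enemy's response, which is where the branching really gets out of hand.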

The complexity of the paths makes backward induction impractical, and programming in all of the rules was far, far beyond me.

If you have the resources and some pre-existing code, you might be better off just collecting all the visible characteristics of your party and using a genetic algorithm (GA) to eliminate rubbish tactics. I don't have much experience with GAs myself, but some of my colleagues seem to be able to do some pretty awesome things with them.
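For what it's worth, the sort of GA I have in mind looks roughly like the sketch below. Everything in it is an assumption for illustration: a tactic is just a fixed-length list of actions, the action set is made up, and `toy_fitness` is a placeholder where your existing combat simulation would go. It uses truncation selection, one-point crossover and per-gene mutation, which is only one of many possible GA designs.

```python
import random

ACTIONS = ("attack", "shove", "dodge")  # hypothetical action set


def toy_fitness(tactic):
    """Placeholder for a real combat simulator.

    Each attack scores 1.0, or 1.5 once an earlier shove has knocked the
    target prone. Dodging and repeated shoves score nothing.
    """
    score = 0.0
    prone = False
    for action in tactic:
        if action == "attack":
            score += 1.5 if prone else 1.0
        elif action == "shove" and not prone:
            prone = True
    return score


def evolve(fitness, length=6, pop_size=30, generations=40,
           mutation_rate=0.1, seed=0):
    """Return the best tactic found by a very small genetic algorithm."""
    rng = random.Random(seed)
    population = [
        [rng.choice(ACTIONS) for _ in range(length)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Truncation selection: keep the better half, drop the rubbish tactics.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            mum, dad = rng.sample(survivors, 2)
            cut = rng.randrange(1, length)  # one-point crossover
            child = mum[:cut] + dad[cut:]
            # Mutation: occasionally replace an action with a random one.
            child = [
                rng.choice(ACTIONS) if rng.random() < mutation_rate else action
                for action in child
            ]
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = evolve(toy_fitness)
    print(best, toy_fitness(best))
```

The appeal is that the GA only needs a score for each tactic, so you never have to reason through the action tree yourself. The catch is that the score is only as good as the simulation behind it, and with a noisy simulation you would want to average several runs per tactic.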