I hadn't considered a genetic algorithm, which is probably a better idea. When I get time, I'll look into it.
I was thinking of an assessment function for each possible action that takes a weighted sum of a few factors, such as the ratio of baddies to PCs (turn economy), number of weenies, number of fatties, personal death probability and so forth, with the weights determined through a multidimensional walk (probably a hill-climbing function).
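A rough sketch of that weighted-sum idea, just to make it concrete. The factor names, the `candidates` layout and the `evaluate` callback are all placeholders I made up for illustration; in practice `evaluate` would presumably be something like the win rate over a batch of simulated battles.

```python
import random

# Hypothetical factor names, loosely following the list above.
FACTORS = ("enemy_to_pc_ratio", "weak_enemies", "tough_enemies", "own_death_probability")


def score_action(factors, weights):
    """Weighted sum of the factor values for one candidate action."""
    return sum(weights[name] * factors[name] for name in FACTORS)


def choose_action(candidates, weights):
    """Return the name of the candidate action with the highest score.

    `candidates` maps an action name to its dict of factor values.
    """
    return max(candidates, key=lambda name: score_action(candidates[name], weights))


def hill_climb(evaluate, weights, steps=200, step_size=0.1, rng=None):
    """Simple stochastic hill climb over the weight vector.

    Each step nudges one randomly chosen weight and keeps the change
    only if `evaluate(weights)` (higher is better) improves.
    """
    if rng is None:
        rng = random.Random()
    best = dict(weights)
    best_value = evaluate(best)
    for _ in range(steps):
        trial = dict(best)
        name = rng.choice(FACTORS)
        trial[name] += rng.uniform(-step_size, step_size)
        value = evaluate(trial)
        if value > best_value:
            best, best_value = trial, value
    return best, best_value
```

Since a simulated win rate is noisy, each `evaluate` call would likely need enough battles for small weight changes to show up, and a plain hill climb like this can get stuck on a local optimum.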
The weird thing is that if it did, it might be better than actual players! In one campaign, I made the mistake of keeping a kill tally; it brought out the worst in the players, and their tactics were really poor.
Currently, the stats get reset after each of the thousands of battle-simulation iterations, the casters are targeted first, and only heals and buffs are present, so spell depletion might not be an issue any time soon.
Yes, it handles crits, fumbles, advantage and disadvantage, albeit very inelegantly, as conditions are a poorly done patch-over.
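For comparison, here is a minimal sketch of how those four cases could be kept separate from the condition handling. It is not how the simulator currently does it. It assumes the usual d20 conventions (roll twice and keep the higher or lower die, advantage and disadvantage cancel each other, a natural 20 is a crit and a natural 1 is a fumble), and the function names are made up.

```python
import random


def roll_d20(advantage=False, disadvantage=False, rng=random):
    """Return the natural d20 result, honouring advantage/disadvantage.

    Assumes that having both at once cancels out to a single plain roll.
    """
    first, second = rng.randint(1, 20), rng.randint(1, 20)
    if advantage and not disadvantage:
        return max(first, second)
    if disadvantage and not advantage:
        return min(first, second)
    return first


def resolve_attack(natural, attack_bonus, armor_class):
    """Classify an attack roll as 'crit', 'fumble', 'hit' or 'miss'."""
    if natural == 20:
        return "crit"      # natural 20 always hits
    if natural == 1:
        return "fumble"    # natural 1 always misses
    return "hit" if natural + attack_bonus >= armor_class else "miss"
```

Conditions would then only need to decide which of the two flags to set before the roll, rather than patching the result afterwards.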