
Power, Frequency, Versatility, Reliability -- four ability balance metrics



PhoenixPhyre
2021-08-18, 01:54 PM
Thinking about balancing abilities (read broadly, anything from basic things anybody can do to class features to abilities you can buy with points to spells to...), it seems like there're basically four major parameters at any particular "power level" (which isn't necessarily D&D-style levels):

Power -- when a character uses <ability>, how much does it do? A "deal 1d4 damage" ability is (all else equal) weaker than a "deal 10d10 damage" ability.

Frequency -- how often can a character use this ability? This might be due to a cooldown, limited resources, or just it being a niche power that only is enabled when the moon is full and the day name starts with T, during the summer, while standing on a glyph drawn in unicorn blood.

Versatility -- how many situations is this useful for? A power that can damage objects and creatures is more versatile than one that can only damage one or the other; a power that is both a dessert topping and a floor wax is more versatile.

Reliability -- when you use this power, does it <do thing> all the time? Or just some of the time. A power that requires some form of attack or defensive or casting check is less reliable than one that just happens; a power that bypasses immunities or "just works" is more reliable than one that has an element of random chance.

-----------

My basic sense is that (comparing powers of equivalent "power level"), every power should have some kind of bounded aggregate total on those four areas. If (ranking things on a 1-10 scale, higher meaning higher), one power gets 10/10/10/10 and another gets 1/1/1/1, those two either aren't of equivalent power or are unbalanced. But one that totals 20 (7/7/3/3) and another that totals 20 (5/5/5/5) are, if not totally equivalent or equal, at least in the same ballpark.
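
As a minimal sketch of the additive version of this idea, the snippet below sums the four 1-10 ratings and compares totals. The `AbilityProfile` class, the `same_ballpark` helper, and its tolerance of 2 are assumptions made up for illustration; the post does not commit to any particular aggregation rule or threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbilityProfile:
    """Hypothetical 1-10 ratings on the four metrics from the post."""
    power: int
    frequency: int
    versatility: int
    reliability: int

    def total(self) -> int:
        # Plain additive aggregate; other aggregation rules are possible.
        return self.power + self.frequency + self.versatility + self.reliability


def same_ballpark(a: AbilityProfile, b: AbilityProfile, tolerance: int = 2) -> bool:
    """True when the two totals differ by at most `tolerance` (an assumed threshold)."""
    return abs(a.total() - b.total()) <= tolerance


spiky = AbilityProfile(7, 7, 3, 3)
even = AbilityProfile(5, 5, 5, 5)
maxed = AbilityProfile(10, 10, 10, 10)
floor = AbilityProfile(1, 1, 1, 1)

print(spiky.total(), even.total())   # 20 20
print(same_ballpark(spiky, even))    # True
print(same_ballpark(maxed, floor))   # False
```

The sum is only the simplest possible choice; the point is the comparison of totals, not the specific numbers.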

And the same thing goes for power-sets for a character--if a character's power set is full of super-powerful things, that character shouldn't also have high marks in all the other areas. You could have a powerful and versatile, but unreliable and infrequent power set, or a frequent, powerful, but limited versatility and unreliable power set.

Does this framework make sense? Are there things I'm missing?

KorvinStarmast
2021-08-18, 01:59 PM
Does this framework make sense? Are there things I'm missing.
The reliability one changes a bit if you use a different dice mechanic than the d20 system does.
As an example, 3d6 and 2d10 (and other bell curve approaches) make for more predictable results and thus lean toward higher reliability. Dice pools and exploding dice are another approach that can play havoc with reliability. (Tunnels and Trolls offering but one example). The framework looks like a good start, though how one assigns the values of 1 through 10 might be tricky to nail down.
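
One way to see the difference, as a rough sketch: the snippet below enumerates every outcome and computes the exact chance of meeting a target number on 1d20 versus 3d6. The function name `success_chance` and the sample targets are assumptions chosen for illustration, not taken from any rulebook.

```python
from fractions import Fraction
from itertools import product


def success_chance(dice: int, sides: int, target: int) -> Fraction:
    """Exact probability that the sum of `dice` dice with `sides` sides meets or beats `target`."""
    rolls = list(product(range(1, sides + 1), repeat=dice))
    hits = sum(1 for roll in rolls if sum(roll) >= target)
    return Fraction(hits, len(rolls))


# Raising the target by one always costs a flat 5% on 1d20,
# while on 3d6 the cost depends on where the target sits on the curve.
for target in (11, 12, 15, 16):
    d20 = success_chance(1, 20, target)
    bell = success_chance(3, 6, target)
    print(f"target {target}: 1d20 {float(d20):.3f}, 3d6 {float(bell):.3f}")
```

With 3d6, results cluster around the middle, so a one-point shift matters a lot near 10-11 and very little out in the tails.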

I played a variant of Dungeon World (Fellowship) a while back. Applying your idea to that would be an interesting test case. The DW resolution (2d6 + mods) has a 'degrees of success' where 10+ gives a wonderful result, 7-9 a success, and 6 or less a not great result. The flat modifiers were usually 0, 1, or 2 (though they could go higher if the game went on a lot longer than ours did), and there was some in-game currency for a teammate to assist the player attempting a success.
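
For a feel of how reliable that resolution is, here is a small sketch that enumerates 2d6 plus a flat modifier and reports the exact odds of each band (10+, 7-9, 6 or less). The function name `outcome_bands` is an assumption for illustration; the band boundaries are the ones described above.

```python
from fractions import Fraction
from itertools import product


def outcome_bands(modifier: int) -> dict[str, Fraction]:
    """Exact odds of each 2d6 + modifier result band."""
    counts = {"10+": 0, "7-9": 0, "6-": 0}
    for a, b in product(range(1, 7), repeat=2):
        total = a + b + modifier
        if total >= 10:
            counts["10+"] += 1
        elif total >= 7:
            counts["7-9"] += 1
        else:
            counts["6-"] += 1
    # 36 equally likely outcomes for two six-sided dice.
    return {band: Fraction(count, 36) for band, count in counts.items()}


for modifier in (0, 1, 2):
    odds = outcome_bands(modifier)
    print(modifier, {band: str(p) for band, p in odds.items()})
```

Going from +0 to +2 moves the "not great" band from 5/12 down to 1/6, so even small flat modifiers change the Reliability of an ability quite a bit on this curve.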

Not sure how that system would fit into your concept, but I think it would. I always had to think through my options.

PhoenixPhyre
2021-08-18, 02:15 PM
The reliability one changes a bit if you use a different dice mechanic than the d20 system does.
As an example, 3d6 and 2d10 (and other bell curve approaches) make for more predictable results and thus lean toward higher reliability. Dice pools and exploding dice are another approach that can play havoc with reliability. (Tunnels and Trolls offering but one example). The framework looks like a good start, though how one assigns the values of 1 through 10 might be tricky to nail down.

I played a variant of Dungeon World (Fellowship) a while back. Applying your idea to that would be an interesting test case. The DW resolution (2d6 + mods) has a 'degrees of success' where 10+ gives a wonderful result, 7-9 a success, and 6 or less a not great result. The flat modifiers were usually 0, 1, or 2 (though they could go higher if the game went on a lot longer than ours did), and there was some in-game currency for a teammate to assist the player attempting a success.

Not sure how that system would fit into your concept, but I think it would. I always had to think through my options.

The actual quantification would be highly system-dependent and system-relative. So what counts as being "10/10 reliable" in D&D may be very different (mechanically) than a 10/10 reliable ability in a d100-roll-under skill-based system or an exploding dice 3d6 system.

Xervous
2021-08-18, 02:15 PM
While conceptually this is nice and neat I’m not sure how you could practically apply it for design purposes as a rule. Is targeting their mental shield rather than targeting their armor worth a +1 or a +2? What’s the cap in any given category?

PhoenixPhyre
2021-08-18, 02:18 PM
While conceptually this is nice and neat I’m not sure how you could practically apply it for design purposes as a rule. Is targeting their mental shield rather than targeting their armor worth a +1 or a +2? What’s the cap in any given category?

Honestly, I'm not so much worried (or interested) in the quantification (at this stage), but in the idea of balancing based on trying to consider the four metrics in aggregate. But I'm not a quantify-type person myself.

My big question is "is thinking in these terms complete? Are there pieces that heavily overlap or are redundant? Is it useful?" (that last being the biggest question).

Xervous
2021-08-18, 02:43 PM
Well if you’re not quantifying then these are great baskets to lump concerns into. The trick will be asking the right questions that lead you to the appropriate baskets.

The haziest categories are either frequency of use or reliability. At least with how they relate to abilities with a limited range of valid targets. Sure you can assume there might be X every so often, but Joe’s campaign has X all day while Sally’s never does.

PhoenixPhyre
2021-08-18, 03:05 PM
Well if you’re not quantifying then these are great baskets to lump concerns into. The trick will be asking the right questions that lead you to the appropriate baskets.

The haziest categories are either frequency of use or reliability. At least with how they relate to abilities with a limited range of valid targets. Sure you can assume there might be X every so often, but Joe’s campaign has X all day while Sally’s never does.

I agree on that last paragraph. It's something for DMs (assuming some form of tuning of adventures to people or vice versa) to be aware of, which could be "warn players that their anti-undead character is unlikely to face undead" or "add some undead" or "make sure that having heavy anti undead powers doesn't break things for the party".

It's ideally a framework for GMs, not just system designers. In my mind, there's always some level of "system design" that has to be done at the GM level to have a satisfying game. May only be communication, not actual changes. But having a set of buckets and the idea of balancing the total/balancing via tradeoffs is, I think, useful.

Satinavian
2021-08-18, 03:14 PM
What is with "Cost of use" ? That is not really the same as frequency as it is more about "what else could i do in that time/with those components".

PhoenixPhyre
2021-08-18, 04:04 PM
What is with "Cost of use" ? That is not really the same as frequency as it is more about "what else could i do in that time/with those components".

Hmmm. Not sure. It's certainly a relevant factor--an ability that eats 1k gp and 1k XP each time isn't the same as one that only costs an action. Or an ability that has a chance of summoning a warp demon to eat you vs one that only costs a slot.

Maybe it fits (poorly) into frequency? As in--you can use it whenever, but instead of a resource cost or a limit on uses, it's a [money|time|risk] cost.

Telok
2021-08-18, 04:19 PM
What is with "Cost of use" ? That is not really the same as frequency as it is more about "what else could i do in that time/with those components".

There's a cost to acquire abilities too, in most games. Taking power A means you didn't take power B. Sometimes it could be very, very temporary, like D&D casters who swap out different spells by day or by level. Or it can be permanent, like with a point-buy system or D&D 5e's Resilient feat.

RandomPeasant
2021-08-18, 04:31 PM
Does this framework make sense? Are there things I'm missing.

I don't think it's the right way to approach a problem. You don't want to try to balance all the abilities against each other, or even the classes against each other directly. What you want to do is think about the balance point you want, then ensure that people hit that. What is a character who has 100 Karma or 5 levels or however much of whatever your system uses to measure advancement supposed to be able to deal with? Can the characters your system outputs (at whatever level of granularity you're testing at) all handle those things?

Trying to go directly to "power" or "versatility" will confuse things. It is very, very hard to tell, a priori, a set of random abilities that don't add up to anything worth caring about from a collection of silver bullets that trivialize everything a character is going to be asked to deal with. Don't try to solve that problem. Don't try to come up with a complete list of properties abilities can have. Don't try to figure out whether "four enemies become unconscious" is better than "six enemies become nauseated" or "one enemy takes seven boxes of damage" or "two enemies become stunned" (let alone how vastly more complicated those comparisons become when you start talking about non-combat abilities). Just figure out the problems characters are supposed to solve and if the total packages of what they get add up to solving those problems.

PhoenixPhyre
2021-08-18, 05:10 PM
I don't think it's the right way to approach a problem. You don't want to try to balance all the abilities against each other, or even the classes against each other directly. What you want to do is think about the balance point you want, then ensure that people hit that. What is a character who has 100 Karma or 5 levels or however much of whatever your system uses to measure advancement supposed to be able to deal with? Can the characters your system outputs (at whatever level of granularity you're testing at) all handle those things?

Trying to go directly to "power" or "versatility" will confuse things. It is very, very hard to tell, a priori, a set of random abilities that don't add up to anything worth caring about from a collection of silver bullets that trivialize everything a character is going to be asked to deal with. Don't try to solve that problem. Don't try to come up with a complete list of properties abilities can have. Don't try to figure out whether "four enemies become unconscious" is better than "six enemies become nauseated" or "one enemy takes seven boxes of damage" or "two enemies become stunned" (let alone how vastly more complicated those comparisons become when you start talking about non-combat abilities). Just figure out the problems characters are supposed to solve and if the total packages of what they get add up to solving those problems.

That's ok...at a full system design level (ie if you have control over both inputs and outputs). That's not what this is for as much. If I'm homebrewing a creature, or a class, or a spell, or an item, I don't determine what the system assumptions are. But I do care "is this out of band compared to existing features/abilities/powers". Or "what level of spell should this be?" Or "what rarity should this item be?"

Or when I'm making a campaign and looking at the characters my players have submitted, I need to be able to evaluate whether the campaign as designed will work for those characters as designed.

Or when I'm building a character and trying to balance to the table (hi Quertus!). If I've taken a bunch of high-power abilities, I should probably evaluate whether they're also high reliability, versatility, and frequency. Or the inverse.

That's the primary use for this framework, because very few of us are actually ab initio game developers. But many more of us are players and GMs.

------
And if we're only looking at benchmarks, "does this hit the benchmark" is a lousy way of telling overall balance. Because there aren't many benchmark campaigns. Unless you strongly constrain the adventure or party design or make everything homogenous (so everyone has a set of abilities that does X, a set of abilities that does Y, etc). Interactions between powers and campaign elements mean that a group of characters that all meet the benchmark in a white-room scenario may either drastically overshoot (meaning trivialize, which isn't much fun) campaigns that differ or drastically undershoot. And without an analytical framework, there's no way to tell why or where the benchmarking failed. This framework intends to provide some backing for thinking about why things work or don't work, rather than just the binary works/doesn't work.

OldTrees1
2021-08-18, 10:43 PM
My basic sense is that (comparing powers of equivalent "power level"), every power should have some kind of bounded aggregate total on those four areas. If (ranking things on a 1-10 scale, higher meaning higher), one power gets 10/10/10/10 and another gets 1/1/1/1, those two either aren't of equivalent power or are unbalanced. But one that totals 20 (7/7/3/3) and another that totals 20 (5/5/5/5) are, if not totally equivalent or equal, at least in the same ballpark.

Does this framework make sense? Are there things I'm missing.

The basic framework makes sense. The way to aggregate is unlikely to be additive but the concept of 4 metrics + aggregation makes sense.



Power -- when a character uses <ability>, how much does it do? A "deal 1d4 damage" ability is (all else equal) weaker than a "deal 10d10 damage" ability.

Frequency -- how often can a character use this ability? This might be due to a cooldown, limited resources, or just it being a niche power that only is enabled when the moon is full and the day name starts with T, during the summer, while standing on a glyph drawn in unicorn blood.


Power and Frequency have an interesting, non-intuitive relationship where we need to factor in the opportunity cost.

Say there is an at-will ability that does 4 damage and 2 limited use abilities (one does 2 damage, one does 6 damage). We can immediately tell the 2 damage limited ability is irrelevant because the opportunity cost is the baseline 4 damage. We can also tell that the 6 damage limited ability is +2 damage a limited number of times rather than +6 damage ever.

Now what if you received a functional duplicate of that 6 damage limited ability. It was 6 fire damage for some N number of uses. Now you get 6 ice damage for some N number of uses. Is the second ability worth roughly the same as the first? It depends on if you can use all 2N uses. Generally you can, so generally limited abilities have small diminishing returns.

Now what if you received a functional duplicate of the 4 damage at will ability. It was 4 acid damage at will. Now you get 4 electric damage at will. Is the second ability worth roughly the same as the first? No. The second at will ability expanded the versatility but did not impact the power (outside of cases where the versatility creates power*).

Even more oddly, the overlap of at will abilities extends further than one might expect. Since you can only use one at will ability at the opportunity cost of using another at will ability, even at will abilities that are unrelated but equally applicable to the scenario (attack XOR heal ally for example) will still have this overlap.


* In cases where the at will abilities are not equally applicable, the net benefit of using ability A (the benefit of ability A - the opportunity cost of not using ability B) can be non zero. This is the effect of versatility on power.
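
The opportunity-cost arithmetic above can be sketched in a few lines. The function names, the dictionary layout, and the choice of 3 uses for N are assumptions for illustration only; the damage values (4 at will, 2 and 6 limited) are the ones from the example.

```python
def marginal_value(ability_damage: int, baseline_damage: int) -> int:
    """Gain from using an ability instead of the best at-will alternative (never below zero)."""
    return max(0, ability_damage - baseline_damage)


def total_gain(limited: dict[str, tuple[int, int]], baseline_damage: int) -> int:
    """Sum of per-use gains across all limited abilities.

    `limited` maps a name to (damage, uses) and assumes every use actually gets spent.
    """
    return sum(marginal_value(dmg, baseline_damage) * uses for dmg, uses in limited.values())


at_will = 4
print(marginal_value(2, at_will))  # 0: the 2-damage limited ability never beats the baseline
print(marginal_value(6, at_will))  # 2: the 6-damage ability is really +2 per use

uses = 3  # hypothetical value for N
fire_only = {"fire": (6, uses)}
fire_and_ice = {"fire": (6, uses), "ice": (6, uses)}
print(total_gain(fire_only, at_will), total_gain(fire_and_ice, at_will))  # 6 12

# A second at-will ability of equal damage adds nothing on the power axis.
print(marginal_value(4, at_will))  # 0
```

The duplicate limited ability doubles the gain only because all 2N uses are assumed to be spent; the duplicate at-will ability adds versatility but no power.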



Cost to Use is readily factored into Power and Frequency in a similar way the opportunity cost is factored into Power.

Don't worry about quantifying. The main benefit is understanding the different aspects exist and how the different aspects interplay.

Glorthindel
2021-08-19, 04:02 AM
Does this framework make sense? Are there things I'm missing.

I think this is a very solid way to look at things, and definitely would be a good approach in my opinion for a rebalance of things.

One of the things to consider is how perception can differ from reality, and how playstyle can create different results for the same value.

Power is straightforward - a player can look at different damage or bonus values and immediately get a clear impression of their value "weight". But Frequency and Reliability are more woolly, particularly at the lower end.

For example, I personally do not value 1/day abilities, even the really powerful stuff, because of "use remorse". If I have a super powerful 1/day ability, 9 times out of 10 it won't get used, because using it means I won't have it for the next encounter, when I might really need it. So even if its low Frequency is offset by high Power and Reliability, that low frequency weighs higher than it should, because its Frequency is in reality far lower than expected. This was one of my issues with 4th ed - I found Daily abilities to just not work right, as I witnessed players hold on to them as long as they could, frequently overlooking ideal opportunities to use them, then as soon as one person used their Daily, everyone else unloaded theirs as it was assumed that the party would call it for a rest immediately afterwards.

Likewise, I personally value spells that require a Saving Throw much lower than an equivalent-odds spell that has a To-Hit roll. Why? Well, I control the To-Hit roll. I roll the dice, and I can arrange as best to stack the odds. Meanwhile, the DM controls the Saving Throw. He rolls the dice (in secret, usually), and any modifiers are provided by effects that are in place that I will be unaware of. And he might have Legendary Resistance anyway. So an attack that 'by book' has a 50% chance to hit, and an attack that 'by book' has a 50% chance to be saved are not equal, as the 50% save might be modified by things I am unaware of, whilst I am aware of everything modifying the 50% to hit, and even if the save is failed, the DM might apply Legendary Resistance or just straight cheat and call a save when it failed. So, despite having the same Reliability value, one will be perceived by me as more Reliable than the other.

False God
2021-08-19, 06:08 AM
I swear there was an edition that did this, a four-something edition.

The versatility one is always a tricky one depending on the setting, adventure, or campaign in question. Sure, we can say an ability has one or more clear uses, but there will always be corner cases that may crop up more at any given table. So at best all you can do (from a design perspective) is see how well it applies to all aspects of the game.

Personally I'd add a very hard category: Synergy. One thing that is often overlooked, and IMO leads to power-builds, multiclassing shenanigans and questionable splat releases, is how well any given ability synergizes with any other. Even on a basic level, how well two classes work together. It requires a lot of real-gameplay theorycrafting rather than whiteroom simulation.

Your current number of categories works for any individual power on its own, but IMO it's an incomplete view without considering any given power's interaction with any other power.

Cluedrew
2021-08-19, 07:39 AM
To False God: I was actually struggling with that last night and my solution to the synergy issue is a bit different: This evaluation needs to be done on sets of powers as well as individual powers. Eventually adding up to the evaluation of a character's ability set, but you can break it down so that analysis is more manageable.

RandomPeasant
2021-08-19, 08:54 AM
That's ok...at a full system design level (ie if you have control over both inputs and outputs). That's not what this is for as much. If I'm homebrewing a creature, or a class, or a spell, or an item, I don't determine what the system assumptions are.

No. But you have to work within those assumptions. So your benchmarks should be based on whatever the system you're working in uses to define appropriate challenges for characters.


Or when I'm building a character and trying to balance to the table (hi Quertus!). If I've taken a bunch of high-power abilities, I should probably evaluate whether they're also high reliability, versatility, and frequency. Or the inverse.

Once you are talking about a specific table, any kind of general rubric is of extremely marginal utility. If you want to match three other data points, just match those datapoints. Don't try to build a general framework, map those three data points onto it, then find another data point that does the same thing.


And if we're only looking at benchmarks, "does this hit the benchmark" is a lousy way of telling overall balance. Because there aren't many benchmark campaigns.

Benchmarks are the only way to tell overall balance. "How does this fit into every possible campaign" is an unanswerable question. The way to produce a balanced system is to balance it against robust benchmarks, and to understand how deviating from those benchmarks will affect balance. And you should not be benchmarking at the level of campaigns. You should not even be benchmarking at the level of adventures. You should be benchmarking at the level of challenges, so that when those challenges are composed into an adventure or campaign, balance is preserved, or at least broken in predictable ways individual DMs can compensate for.


This framework intends to provide some backing for thinking about why things work or don't work, rather than just the binary works/doesn't work.

If your test suite produces only binary feedback, it is a bad test suite.

Quertus
2021-08-19, 08:49 PM
Or when I'm building a character and trying to balance to the table (hi Quertus!). If I've taken a bunch of high-power abilities, I should probably evaluate whether they're also high reliability, versatility, and frequency. Or the inverse.

Hi :smallbiggrin:

So, your main question(s) seem to be

My big question is "is thinking in these terms complete? Are there pieces that heavily overlap or are redundant? Is it useful?" (that last being the biggest question).

Is it complete? Absolutely not.

Suppose I decided to make a sport, and made the teams perfectly balanced according to a perfect version of your rules.

But then you find that the sport is "team hunger games". And, on another team, I've placed your mom / your child / the love of your life.

You'd definitely have cause to complain about the teams, despite them being perfectly balanced.

But that's probably too abstract. Let's try a more concrete example.

Let's say you bring a character to a mid-level D&D 3e game: a highly optimized 2-weapon SA DPS skill monkey Rogue, complete with UMD. You've even had the forethought to set money aside to buy consumables to UMD as needed, in addition to starting with a few eternal wands, partially charged wands, and scrolls.

What were your categories? Oh, look - you put them in the title (kudos!): Power, Frequency, Versatility, Reliability.

Your SA is 10/10 Power, able to turn anything even remotely CR-appropriate into chunky salsa. Similarly, being usable at-will, and having tricks (like flanking with yourself, or one of several others I've seen) (and you were smart enough to have a source of flight), it's 10/10 on Frequency. It's really only useful in combat… which would be 3/10 Versatility if all pillars were equal; let's say that this group calls it 5/10. Reliability? You're highly optimized, so you'll hit most everything, but… you didn't take anything to let you use SA on plants / undead / constructs / etc, so maybe 5/10 there. Total: 30 points.

General Rogue skill monkey? Although it lets you bypass the epic challenge of the locked door, let's say that everything but your Diplomacy is only a 3/10 on the Power scale. But it's 10/10 Frequency, being at-will, and, as you can use those everywhere (you can Hide and Tumble (and more) in combat), it's 10/10 Versatility, where you're generally left wondering which skill to use, rather than having nothing to do. Reliability? You're pretty optimized to hit expected DCs (and have an Eternal Wand of Wield Skill for those hard to reach places) - let's call it 7/10. Total: 30 points.

Your UMD? Boy, it depends. You could definitely hit above your weight class with higher level spells than the party Arcanist / Divine Caster / Psion / whatever could cast / manifest / invoke / whatever. And they're generally considered the powerhouses, right? So it's hard not to give it 10/10 on Power. But it's all one-shots, with a few 2/day effects. So we'll say 1/10 on Frequency. You can do absolutely any spell/power, and you've set aside the money to do so, so it's 10/10 Versatility. But you can fail checks on a 1, or when you punch above your pay grade, so not perfect Reliability - maybe 5/10? Total: 26 points.

Great. We've got 3 main lines of abilities, worth a level-scaled 30, 30, and 26 points. What does this tell us? How does that compare to a character with one ability at 35, or another with a handful in the low 20's?

More importantly, let's say that the GM decides to run Necromancy on Bone Hill. There's no real opportunity to buy items to adapt to the scenario, and your SA is nearly useless. Looking at the adventure, there's almost no chance for you to get much spotlight time - and certainly not to show off the bits of the character that you're proudest of.

So, if your numbers look better than those of the übercharger, the Mailman, and the turning specialist Cleric, what should the GM do?

-----

Is it useful? It's a great tool for Players to have, to develop the lingo to use to explain why they aren't having fun. It's a terrible tool for GMs (game refs, scenario designers) or system designers to use, because it will fill them with false confidence that they know what's going on.

I know of no shortcut beyond evaluating "how can this character participate in each of these scenes" to accurately measure such things.

Or, if you prefer,

I don't think it's the right way to approach a problem. You don't want to try to balance all the abilities against each other, or even the classes against each other directly. What you want to do is think about the balance point you want, then ensure that people hit that. What is a character who has 100 Karma or 5 levels or however much of whatever your system uses to measure advancement supposed to be able to deal with? Can the characters your system outputs (at whatever level of granularity you're testing at) all handle those things?

Trying to go directly to "power" or "versatility" will confuse things. It is very, very hard to tell, a priori, a set of random abilities that don't add up to anything worth caring about from a collection of silver bullets that trivialize everything a character is going to be asked to deal with. Don't try to solve that problem. Don't try to come up with a complete list of properties abilities can have. Don't try to figure out whether "four enemies become unconscious" is better than "six enemies become nauseated" or "one enemy takes seven boxes of damage" or "two enemies become stunned" (let alone how vastly more complicated those comparisons become when you start talking about non-combat abilities). Just figure out the problems characters are supposed to solve and if the total packages of what they get add up to solving those problems.
And you should not be benchmarking at the level of campaigns. You should not even be benchmarking at the level of adventures. You should be benchmarking at the level of challenges, so that when those challenges are composed into an adventure or campaign, balance is preserved, or at least broken in predictable ways individual DMs can compensate for.



This says something balanced with my opinion (although I doubt I could compare their Power, Frequency, Versatility, and Reliability).

Lorsa
2021-08-21, 02:18 PM
This is incredibly interesting and I will try to read through the thread and answer in more detail during next week (when I should be working, most likely).

If I understood correctly, you are mostly interested in this conceptually so:

In regards to the completeness, I doubt it is complete. While I can't think directly of any one category that is missing, there seems to be a lot of overlap that you have to look into.

For example, what really differentiates frequency from versatility or reliability? Perhaps frequency isn't really what you are after, but rather something like "complicatedness of use". Otherwise when trying to quantify one has to ask: "what frequency are you looking for?". Is it frequency in game time? Frequency in real time? Frequency compared to other abilities? Frequency per session?

Likewise, power seems to have quite a bit of overlap with reliability as well. Even if you clearly think of "power" in numerical terms (i.e. 1d4 vs. 10d4), the reliability metric would only affect this value into an "average score" anyway.

Unfortunately I have to stop now, but I think you really should think of only two main categories, one thing "power" and the other "versatility", where this reliability and frequency can be sub-categories (among other subcategories).

Lastly, is it useful. Most theoretical frameworks are useful to some degree. Even the wrong ones. So yes. :smallsmile:

OldTrees1
2021-08-21, 03:06 PM
For example, what really differentiates frequency from versatility or reliability? Perhaps frequency isn't really what you are after, but rather something like "complicatedness of use". Otherwise when trying to quantify one has to ask: "what frequency are you looking for?". Is it frequency in game time? Frequency in real time? Frequency compared to other abilities? Frequency per session?

Likewise, power seems to have quite a bit of overlap with reliability as well. Even if you clearly think of "power" in numerical terms (i.e. 1d4 vs. 10d4), the reliability metric would only affect this value into an "average score" anyway.

Unfortunately I have to stop now, but I think you really should think of only two main categories, one thing "power" and the other "versatility", where this reliability and frequency can be sub-categories (among other subcategories).

Lastly, is it useful. Most theoretical frameworks are useful to some degree. Even the wrong ones. So yes. :smallsmile:

My understanding

Frequency: How often can the character use this ability? What limits the use from happening infinite times over 0 seconds? For example in 5E the Attack action costs an Action and thus is usable roughly once per turn. In contrast each of a 5E Warlock's Mystic Arcanum can be used once per day. If a gun has 6 bullets, then it has a frequency of 6 uses per reload. Which particular reference frame you are using might shift depending on context.

Reliability: When the ability is used, how often does its effect(s) occur? Is there a check that causes a chance of failure? Does it have any guaranteed outcome? For example consider an ability that does 10 Poison damage plus Con Save vs Stun. With that ability the poison damage is more reliable than the stun effect.

Versatility: How broadly applicable is the ability? Is this an ability that is always useful, or an ability that is only useful during combat?

These 3 categories are not about the "complicatedness of use" of the ability; rather, they measure the projected power beyond the raw power.

Reliability vs Power:
If you want to display both Reliability and Power you would create a probability distribution. The shape of the distribution is the reliability. A sword that deals 2d6 damage is different from a sword that deals 7 damage. If you use expected value as the measurement for power, then the shape of the probability distribution is the reliability.
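As a minimal sketch of that comparison (the function names are made up for illustration), the exact distribution of 2d6 can be set next to a flat 7. Both have the same expected value, so the same "power" under that measure, but only one of them has any spread:

```python
from fractions import Fraction
from itertools import product

def distribution(dice):
    """Exact probability of each total for a list of die sizes, e.g. [6, 6] for 2d6."""
    faces = [range(1, d + 1) for d in dice]
    outcomes = list(product(*faces))
    dist = {}
    for roll in outcomes:
        total = sum(roll)
        dist[total] = dist.get(total, Fraction(0)) + Fraction(1, len(outcomes))
    return dist

def mean(dist):
    """Expected value: the 'power' reading of the distribution."""
    return sum(value * p for value, p in dist.items())

def variance(dist):
    """Spread around the mean: one possible 'reliability' reading."""
    m = mean(dist)
    return sum(p * (value - m) ** 2 for value, p in dist.items())

two_d6 = distribution([6, 6])
flat_7 = {7: Fraction(1)}

print(mean(two_d6), variance(two_d6))   # 7 35/6
print(mean(flat_7), variance(flat_7))   # 7 0
```

Variance is only one way to summarize the shape; the chance of rolling below some threshold would serve equally well, depending on what the table cares about.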

Versatility vs Reliability + Frequency:
One Wizard knows Fireball. Another Wizard knows "Fire or Ice ball: As per Fireball, but you could do cold/ice damage instead". The two spells have the same Reliability (Dex Save DC 13 for half damage) and Frequency (2 times per day, for example). However, the second spell is more versatile (albeit barely).

I can see how one could fold Raw Power, Reliability, and Frequency into just "Power", however you can also fold Versatility in at the same time. Which categories make sense to be explicit/separate rather than implicit/folded in might depend on context.

Quertus
2021-08-22, 07:41 PM
I know of no shortcut beyond evaluating "how can this character participate in each of these scenes" to accurately measure such things.

To expand on this a bit…

If I'm bringing an übercharger / Street Samurai / combat monster that I *know* will take center stage in most combat scenes, if I'm trying to balance to the table, I will try to design for a more passive role in other scenes / try to make sure that there are places where the other characters can shine.

Or, if I build the Sarak of diplomats, who is also a warrior, maybe he'll be a reluctant warrior, who is slow to join the fight, or takes penalties from disarming and subduing his foes, or from not being left-handed, or some such.

But it's a matter of, well, what *matters* to a given table.

At a table where a Monk is considered OP and has to be nerfed, or where the only thing that counts is damage and a high-op BFC Tainted Sorcerer is considered "not contributing", I'll build differently, with different considerations in mind, when I'm trying to balance to that table, than I would trying to balance to "the Playground", or to one of my tables.

And that's the rub. The high-op BFC Tainted Sorcerer is considered Power 0 at one table, and OP at another, because people don't measure the same things.

Or, put another way, what does it matter how many points a character's abilities are worth, if the only thing that the table cares about is getting to deliver one-liners?

Vahnavoi
2021-09-03, 06:25 AM
These are all real measures, but on the highest abstracted level, you only have the first two, power and frequency. The last two, versatility and reliability (the latter of which is effectively randomness; without a random component it reduces into versatility) are factors of frequency.

Where I disagree is measuring these on an arbitrary 1 to 10 scale. With many games, it's perfectly possible to measure them using some existing real or game unit.

Quertus
2021-09-03, 11:24 AM
These are all real measures, but on the highest abstracted level, you only have the first two, power and frequency. The last two, versatility and reliability (the latter of which is effectively randomness; without a random component it reduces into versatility) are factors of frequency.

Where I disagree is measuring these on an arbitrary 1 to 10 scale. With many games, it's perfectly possible to measure them using some existing real or game unit.

Interesting.

A) do you have any examples of game/metric sets that you might use? Like… you could try to use "casting cost" to estimate "power" in MtG, but there would definitely be some outliers on the accuracy of that metric.

B) how well do these metrics measure and encapsulate what is actually important in the game?

PhoenixPhyre
2021-09-03, 12:37 PM
Where I disagree is measuring these on an arbitrary 1 to 10 scale. With many games, it's perfectly possible to measure them using some existing real or game unit.

I should have been clearer that the 1 to 10 scale was a throwaway scale for pure convenience on that particular abstract example. If I were to actually implement/quantify this, the scales would have to be crafted better and would almost certainly be system-specific.

The one quibble there is that since one of the fundamental principles here is that you're considering the aggregate of these, you have to use metrics that are comparable in some way. 1-10 scales are easy for that, but you could have some kind of mapping function between <power units> and <frequency units>.

As for frequency including versatility/etc, the distinction I had in my head was that frequency was a maximum in in-game time units--how frequently can you push that button in a given amount of in-game time? So something like a 1x/day ability would have a frequency of 1x/day. On the other hand, versatility measures how many different things can this one ability do? A 1x/day "deal X fire damage to one target" and a 1x/day "deal X fire OR cold damage to one target" and a 1x/day "deal X fire damage to a target AND banish target to hell for 1 minute" ability are very different, despite all being useable 1x/day and all having the same basic "use this in combat" specifier. Reliability goes to "how often does pressing the button actually produce the effect." Which could be rolled into frequency, but the variations are much faster so it seems reasonable to have it as a separate factor in the aggregate.

Note: There's no reason why the aggregate would necessarily be the sum of the scores; these could have different weights (as a simple difference) or even be a state machine with arbitrary complexity. The only criteria for the aggregate function are that
a) it's computable reasonably simply (because people have to do it)
b) its output can be totally ordered over the ability space of the system. That is, for any two abilities supported by the system, the aggregate function has to be able to implement all of the comparison functions (==, <, >, >=, <=, !=). Most of those can be derived from the others, so basically you need equality, negation, and one of > or <. This is easiest with numerical values, but you can do it with other values as long as you can compare them.
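To make those two criteria concrete, here is a small sketch assuming a weighted-sum aggregate. The `Ability` class, the weights, and the three sample abilities are hypothetical; the scores reuse the throwaway 1-10 scale from the first post. Because the aggregate is a single number, every comparison operator comes for free:

```python
from dataclasses import dataclass

# Hypothetical weights; a plain sum is the special case where every weight is 1.
WEIGHTS = {"power": 1.0, "frequency": 1.0, "versatility": 1.0, "reliability": 1.0}

@dataclass(frozen=True)
class Ability:
    name: str
    power: int
    frequency: int
    versatility: int
    reliability: int

def aggregate(ability, weights=WEIGHTS):
    """Collapse the four scores into one number so abilities can be compared."""
    return sum(weight * getattr(ability, axis) for axis, weight in weights.items())

spiky = Ability("spiky", 7, 7, 3, 3)
even = Ability("even", 5, 5, 5, 5)
weak = Ability("weak", 1, 1, 1, 1)

# Sorting by the aggregate works because numbers are totally ordered.
ranked = sorted([spiky, weak, even], key=aggregate)
print([a.name for a in ranked])             # ['weak', 'spiky', 'even']
print(aggregate(spiky) == aggregate(even))  # True
```

Two different abilities can tie (as "spiky" and "even" do here), so strictly speaking this orders the aggregate values rather than the abilities themselves.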

KorvinStarmast
2021-09-03, 01:40 PM
This is easiest with numerical values, but you can do it with other values as long as you can compare

In engineering speak, the units need to match up. :smallsmile:

PhoenixPhyre
2021-09-03, 01:52 PM
In engineering speak, the units need to match up. :smallsmile:

Or have a dimensionful transformation that makes them match.

So if power was a 1-10 scale (example only) and frequency was a uses/in-game-day scale, then one of the two would have to have some function f(frequency) such that [f(frequency)] == [power] or [f(power)] == [frequency]. So you could normalize to f(frequency) = c * uses/day, where c might be something like 1 (day/use) (so something with 10 uses per day would be a 10 on the power-unit-scale), etc.
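A rough sketch of that normalization, assuming the example constant c = 1 power-unit per (use/day). The function names and the additive aggregate are illustrative choices only, not a proposed final formula:

```python
# Assumed conversion constant from the example above:
# c = 1 power-unit per (use per in-game day).
C_POWER_UNITS_PER_USE_PER_DAY = 1.0

def frequency_to_power_units(uses_per_day, c=C_POWER_UNITS_PER_USE_PER_DAY):
    """f(frequency) = c * uses/day, expressed on the power scale."""
    return c * uses_per_day

def aggregate(power_score, uses_per_day):
    """Add the two only after both are expressed in power units."""
    return power_score + frequency_to_power_units(uses_per_day)

print(frequency_to_power_units(10))              # 10.0
print(aggregate(power_score=4, uses_per_day=3))  # 7.0
```

Changing `c` is where the real design decision lives: it states how many uses per day are worth one point of power.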

Quertus
2021-09-04, 08:19 AM
So, this actually gets kinda weird.

Because you have to account for playstyle. Player > Class, as they say.

For some, having a 1/day ability is actually a net negative: they'll hoard it & never use it, forget about it & never use it, or use it "at the wrong time" & be filled with and distracted by regret afterwards.

For some, an "at-will" ability is a rut that they'll fall into, making them less likely to notice their other abilities.

Handing *me* an item with limited charges is different from handing it to most people. I'll sit there and hoard the charges, seemingly having forgotten about it. Then one day, I'll suddenly spam it. Maybe it's because we "needed" it. But more likely, everyone will be confused. I'll just have my character shrug, and say, "it rained yesterday, the ground was still wet - I didn't want to get my boots muddy" or some such. (Depends on the character, of course; point is, my reasoning is not Determinator approved)

Quertus, my signature academia mage for whom this account is named, is ridiculously OP from a pure power standpoint but is balanced (or even "The Load") because of his personality. And there's tables where Monks are considered OP and need to be nerfed, or where totally OP BFC Tainted Sorcerer Arcane Spellcaster builds would be considered "not contributing" because they dealt 0 damage.

Point is, I think that each individual table would need to make their own heuristic to determine exactly how powerful an individual character was.

So let's look at the simplest possible example of a real table: the table that only counts damage dealt.

Let's measure the expected damage output of some 10th level characters over, say, 40 rounds.

The totally OP BFC Tainted Sorcerer Arcane Spellcaster build (also party healer) deals 0 damage (unless you count damage he deals to himself). Well, that was easy.

Based on the AC and DR of the foes the GM has in the module, the Half-Dragon Monk… who gets Haste 10 rounds/day… based on expected AC and DR… is looking at around 700 damage.

The ½-Ogre Rogue… based on AC, DR, and SA vulnerability… if he chooses his targets to maximize his damage… is looking at maybe around 2,100 damage. In one rough patch, though, those numbers drop to about 210.

The blaster Sorcerer… who runs out of gas, so how many *days* those 40 rounds cover matter… based on expected touch AC, saves, clumping, and energy resistances… could be looking at over 4,000 damage. In a rough patch of the adventure, though, those numbers drop to about 530.

Obviously, the blaster is the strongest, and the totally OP BFC Tainted Sorcerer Arcane Spellcaster build the weakest.

Yet I'd think that the totally OP BFC Tainted Sorcerer Arcane Spellcaster build would be very effective at their job, and the blaster Sorcerer the least fun to actually play (having run out of spells and sitting there doing nothing much of the game).

So, even if this metric were right for this table, I'm concerned that a GM using it would be predisposed to Nerf the Sorcerer, rather than listen to their concerns that their character wasn't fun to play, and be disinclined to acquiesce to their request (to retrain / to add to the whitelist / whatever) to get a reserve feat to give them something to do when they're out of gas.

So, again, my suspicion and assertion is that this would make a great tool for the Players to use, to give a name to their pain. But a bad tool for a GM who will get a false sense of security that they understand the issue. And a horrible thing for system designers, who don't know what the actual table looks like or cares about.

LibraryOgre
2021-09-04, 11:41 AM
While conceptually this is nice and neat I’m not sure how you could practically apply it for design purposes as a rule. Is targeting their mental shield rather than targeting their armor worth a +1 or a +2? What’s the cap in any given category?

Some of this can be managed by looking at some other systems; Ars Magica, for example, lets you grade powers on their total ability... range, duration, level of effect, and so on, resulting in a level of spell.

If you could design some reliable parameters for this, you could probably use a 0-9 scale, with the mean turning into the level of the spell in D&D.

RandomPeasant
2021-09-04, 12:05 PM
Some of this can be managed by looking at some other systems; Ars Magica, for example, lets you grade powers on their total ability... range, duration, level of effect, and so on, resulting in a level of spell.

But that doesn't result in particularly balanced spells. Unless you go full effect-based for your system, it's very hard to tell the difference between a broken spell and a pointless one. Conjuring a wooden table is a modestly impressive utility effect, while conjuring a wooden cage is a quite powerful lockdown effect. Even in an effects-based system, mechanics don't always compose neatly. The range that is necessary for remote viewing to be useful is game-changing on blasting spells, but puzzlingly useless on buffs. Asking "how does this feel in comparison to other effects" can be useful, but ultimately you have to test things, and breaking stuff down too far into the details is often counter-productive because it gives a false sense of clarity.

Telok
2021-09-04, 02:56 PM
Some of this can be managed by looking at some other systems; Ars Magica, for example, lets you grade powers on their total ability... range, duration, level of effect, and so on, resulting in a level of spell.

If you could design some reliable parameters for this, you could probably use a 0-9 scale, with the mean turning into the level of the spell in D&D.

One thing I considered doing when I cared about D&D magic was to build a lot/all/some % of the spells in a couple different point buy systems. The difference in effect valuation between the different pb systems should be relatively stable and able to be expressed as ratios, which would allow you to normalize across the systems. You'd end up with a "score" for each spell based on the point buys.

Vahnavoi
2021-09-05, 03:44 AM
@PhoenixPhyre, KorvinStarmast: quite often, the most useful number for comparisons would simply be product of these measures.
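For illustration, here is how a product differs from a sum on the two example score sets from the first post (7/7/3/3 and 5/5/5/5 on the throwaway 1-10 scale). The helper names are hypothetical:

```python
from math import prod

def total(scores):
    """Additive aggregate: the sum of the four scores."""
    return sum(scores)

def product_score(scores):
    """Multiplicative aggregate: the product of the four scores."""
    return prod(scores)

spiky = (7, 7, 3, 3)
even = (5, 5, 5, 5)

print(total(spiky), total(even))                  # 20 20
print(product_score(spiky), product_score(even))  # 441 625
```

The sum treats the two as equal, while the product penalizes the low scores, which matches the intuition that an ability that rarely works or rarely applies drags down everything else.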

Power, frequency and reliability are easiest to track, because often you can get the relevant number directly from the game system. Versatility is the hardest, for the simple reason that game systems rarely control for scenario. If an ability can be used for situation X but the question of how often X comes up is unanswered or, worse, unanswerable, you can't get a good measure.

@Quertus:

When you're mathematically balancing abilities, you do not account for player skill, or lack thereof, at that stage. At most, you can do that after several passes of playtesting, but trying to do so at an earlier or abstract stage makes the problem recursive in a really nasty way.

For example, if somebody's failing to use their 1/day abilities due to psychological loss aversion, the best answer to that is not to muck with the game system, it's to point out the player is following a bad strategy.

Pauly
2021-09-05, 07:23 AM
The 4 variables assume a PvE approach. But some tables /games are more of a PvP approach. In which case a 5th variable “counter-ability” is needed.

“Counter-ability” is how easily a player can counter the effect. This can be considered part of reliability in a PvE approach where the NPCs/monsters usually aren’t actively seeking to avoid effects/exploit weaknesses. However, in a PvP environment, where both players are actively seeking to counter the other’s effects, it becomes an important consideration.

RandomPeasant
2021-09-05, 07:54 AM
Power, frequency and reliability are easiest to track, because often you can get the relevant number directly from the game system.

Do they? Consider balancing the abilities Accurate Strike and Powerful Strike. Accurate Strike gets a to-hit boost. Powerful Strike gets a damage boost. Those numbers are "from the game system", but they depend on the opposition people are going to face in the same way that you note versatility does. Or how about the trade-off between an ability that is at-will and one that is X/encounter. That depends on how long encounters take, which is again something that is not an inherent property of the system. The properties of an ability are only meaningful based on the context in which that ability is used. As I said originally, this sort of ivory-tower number crunching is simply insufficient to produce balance. You have to define your balance point.
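A quick sketch of the Accurate Strike / Powerful Strike point, under loudly hypothetical numbers: a d20 attack that hits on d20 + bonus >= AC with no auto-hit or auto-miss rules, a baseline attacker with +5 to hit and 8 damage, Accurate Strike worth +2 to hit, and Powerful Strike worth +2 damage. None of these values come from a real system:

```python
def hit_chance(attack_bonus, armor_class):
    """Chance that d20 + attack_bonus >= armor_class (no auto-hit/auto-miss rules)."""
    needed = armor_class - attack_bonus   # minimum d20 face that hits
    hitting_faces = 21 - needed           # faces from 'needed' up to 20
    return min(20, max(0, hitting_faces)) / 20

def expected_damage(attack_bonus, damage, armor_class):
    return hit_chance(attack_bonus, armor_class) * damage

BASE_BONUS, BASE_DAMAGE = 5, 8   # hypothetical baseline attacker

for ac in (10, 15, 20, 24):
    accurate = expected_damage(BASE_BONUS + 2, BASE_DAMAGE, ac)
    powerful = expected_damage(BASE_BONUS, BASE_DAMAGE + 2, ac)
    if accurate > powerful:
        better = "Accurate"
    elif powerful > accurate:
        better = "Powerful"
    else:
        better = "tie"
    print(ac, round(accurate, 2), round(powerful, 2), better)
```

With these numbers Powerful Strike comes out ahead against the low-AC targets and Accurate Strike against the high-AC ones, so which ability is "stronger" is decided by the opposition, not by the ability text.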


For example, if somebody's failing to use their 1/day abilities due to psychological loss aversion, the best answer to that is not to muck with the game system, it's to point out the player is following a bad strategy.

Experimentally, telling people they're behaving irrationally is a really bad way to get them to stop behaving irrationally. A design that is mathematically beautiful, but depends on players acting in ways they won't psychologically, is not a good design.

Quertus
2021-09-05, 08:35 AM
@Quertus:

When you're mathematically balancing abilities, you do not account for player skill, or lack thereof, at that stage. At most, you can do that after several passes of playtesting, but trying to do so at an earlier or abstract stage makes the problem recursive in a really nasty way.

For example, if somebody's failing to use their 1/day abilities due to psychological loss aversion, the best answer to that is not to muck with the game system, it's to point out the player is following a bad strategy.

While I do not disagree, I'm more pointing out how one cannot simply *stop* at that step. Or, rather, one *should not* do so.

Most people I've known IRL are unwilling or unable to comprehend things at an appropriate depth (without liberal application of the (verbal) clue-by-four, and often not even then). So I'm just doing due diligence, and trying to set the expectation at "this isn't the only step - if you can't go through all the steps, don't bother starting".

Seemed very appropriate given that the OP asked, "is this complete?", and the answer is a resounding "no" - and not just in the dimensions one might expect.

I'm working to make sure anyone expecting 2-dimensional thinking realizes that the problem is more appropriately approached as 5d Wizard Chess. :smallwink:

And you are very wrong regarding what is the best strategy in that scenario.

OldTrees1
2021-09-05, 09:33 AM
For example, if somebody's failing to use their 1/day abilities due to psychological loss aversion, the best answer to that is not to muck with the game system, it's to point out the player is following a bad strategy.

The best answer is to realize the 1/day ability is not valuable to the player. Ask them why. If it is due to loss aversion, then the best answer is replacing the ability with one that better fits the player preferences.

The best answer is not to try to convince your player to have a bad time and like it. Pointing out that the psychological loss aversion is causing a bad strategy will do nothing (they already knew that duh!) or will make them feel even worse (the new player now understands their aversion has a negative impact) or convinces them to pretend they don't have the aversion (aka suffering in silence because the GM told them to shut up).

Cluedrew
2021-09-05, 10:22 AM
On Player Psychology: Ignoring player psychology is kind of like ignoring the fact you will have players. If some habit is required to play your game (or similarly if a habit breaks it) that is OK, but you should be prepared to lead them to (or away from) it. And if it's something deeper than a habit you might not be able to.

On Dimensions: I think "oh it's actually a 5d problem" is the wrong way to think about adding detail. The problem is more like how describing a place's latitude and longitude (2 dimensions) doesn't actually tell you how to get there.

For instance flexibility could further be sub-divided into two parts: flexibility across axes you can predict and flexibility across axes that are effectively random. This is a fairly important distinction, especially for anything you prepare ahead of time, but do I think that this split should be described in a high level system like this? Not really. For one, things don't always stay in one box or the other; they can shift back and forth due to a variety of situations and even other abilities. How long you have to find out and how long it takes to set-up/swap options are dimensions, but are hard to pin down at a high level.

OK how about fire damage? Does changing to fire damage increase or decrease an ability's power? Assuming any sort of elemental system in this game, well it depends on the enemies you are facing. Are they vulnerable to fire or resistant to it?

RandomPeasant
2021-09-05, 11:20 AM
OK how about fire damage? Does changing to fire damage increase or decrease an ability's power? Assuming any sort of elemental system in this game, well it depends on the enemies you are facing. Are they vulnerable to fire or resistant to it?

And it depends on the abilities characters can have. Is there a feat or talent or something that adds extra damage to fire abilities? One that adds some kind of rider? Is there some kind of "use any fire power" ability that would gain versatility by making a new fire-based ability? The idea that you can do anything useful by breaking things down onto a four-axis (or N-axis for any reasonable N) scale is something I'm deeply skeptical of. Ultimately, the only way to achieve balance is to define a balance point, then test and iterate until you reach that balance point. That's a lot of work, but designing and manipulating complex systems so they do what you want is a lot of work. It's why doctors, lawyers, and engineers make so much money.

Quertus
2021-09-06, 11:24 AM
I think people may have taken my "dimensions" comment a bit too seriously / literally.

Let's try this instead: this isn't a BDF problem, that can be solved by hitting it harder. It requires thought, adaptation, customization, and conversation: 5d Wizard chess, as played by a charismatic Bard.

You literally cannot make balance at the system level for something like, say, 3e D&D, when you've got tables that will nerf the Monk, and tables that only count damage as contribution, and tables like mine, and those one would expect of the Playground. Whichever balance metric you choose, everyone else will know that you were wrong, and that you built a horribly unbalanced system.

And they will be right. At least as right as we are claiming that 3e isn't balanced.

Thing is, 3e provided the tools for people to make their own balance. If you only count damage as contribution, you can say, "make a character with roughly X DPS", and everyone can build that. Even if I might build a totally OP Tainted Sorcerer Arcane Spellcaster BFC build, whose familiar / animal companion / summons actually "contributes".

Balance to the table. Because you cannot actually balance anywhere else. Attempting to do so will result in games that are less balanced, less capable of balance, and likely less capable of having meaningful conversations about balance, for any other paradigm of "Balance" other than the one you choose for your system.

Unless, of course, you want to try to create one balance fix for 3e that will satisfy me, tables that nerf the Monk, tables that only count damage as contribution, and the stereotypical Playgrounder table, all with one set of rules.

Never mind the many diverse effects of "player > character" (including psychological, playstyle, preference, and role-playing ones).

Don't get me wrong, I absolutely *love* statistics like these. Trying to create good rules for Power, Frequency, Versatility, and Reliability, and then looking at various characters through those lenses? Great fun, better than cats.

But it's just not a tool one can use to balance a system by more than one paradigm - and certainly not something one can use to balance a system blind for the unknown paradigm of a particular table.

But if you want to talk about dimensions? You've got the dimension of "psychology", to account for loss aversion, risk aversion, hoarding, etc. You've got the dimension of "spotlight sharing", measuring the distribution of how often and for how long your character gets to participate / shine / solo, vs sit there and twiddle your thumbs (and some individual variation on how individuals value different distributions). You've got the dimension of "playstyle", from "kick in the door" to "5d Wizard Chess", CaW vs CaS (vs…), D&D as dungeon crawl vs murder mystery vs horror vs comedy, etc. You've got the dimension of "values", from only counting damage towards contribution, to (over)valuing the Monk's diverse array of at-will abilities gained every level, to counting how many one-liners they got to deliver.

The way to make a game more balanced is not to say that people *must* play a heal-bot Cleric / blaster Mage / golf cart Fighter, *must* play a vampire filled with angst as their humanity slowly drains away, *must* stop at level 6, *must* have a kryptonite that makes them unable to play the game.

IMO, the best way to balance a game (or, at least, one as robust as 3e) is to make even *more* wildly unbalanced options that the players can mix and match, allowing them to both create their concept, and balance to the table, for whatever definition of balance their table uses.

And then these fun metrics can be used when players need to communicate: "I built a DPS UMD skill monkey Rogue, and even set aside some of my starting WBL for adaptive mid-game purchases, expecting I was set for power and versatility, but the samey nature of the encounters, and lack of downtime, has left me unable to use more than a single repetitive, suboptimal tool from my toolkit, participating less than the übercharger's Legacy Weapon's permanent summons".

Cluedrew
2021-09-06, 09:24 PM
Actually to cut through a lot of the previous discussion (also, to be quick), I thought I would mention why I kind of liked this system: while I'm not sure if there is a good way to measure each axis or combine them, I do think there is value in having the major ways something can be good defined. First because it's a good communication tool to have important terms shared between people. Second as a kind of sanity check: if something is coming in unusually high or low on a bunch of these scales then you should have some special note explaining what is not captured in these systems that makes it balanced. So it's a good outlier check. Or if something radically changes how other things rate then it should also be examined very carefully.

Quertus
2021-09-07, 12:49 AM
it's a good communication tool to have important terms shared between people.

Strongly agree.


Second as a kind of sanity check: if something is coming in unusually high or low on a bunch of these scales then you should have some special note explaining what is not captured in these systems that makes it balanced. So it's a good outlier check. Or if something radically changes how other things rate then it should also be examined very carefully.

This feels… overloaded. So I'm gonna babble for a bit.

One thing that's been on my mind for a bit is, what kind of characters would be balanced / work well together.

It's tricky, conceptually, because many characters live in quite a range. D&D characters go from zero to slaying gods. Jedi go from farmboy Luke to TK flattening armies or Force storming entire fleets. Whereas Conan… is Conan. And a CoC character will never get over not being the center of the universe.

A D&D Wizard with "all the spells" (or, you know, a D&D Cleric) can solve any problem… tomorrow. Whereas a WoD Mage can invent and cast new spells on the fly. Even if they're a novice, and are only capable of Forces/Prime effects.

A D&D Wizard can learn a new spell from a scroll in hours, or research an entirely new spell from scratch in weeks or months. A WoD Mage invents spells in… as long as it takes them to think of it. A Matrix character can learn new skills (like Kung Fu or helicopter pilot) in moments.

A Star Trek character gets communication, sensory, blasting, and Teleportation powers, generally at-will. Oh, and a star ship. And matter replication. And probably more.

A superhero character… can live almost anywhere from zero to deity… but doesn't usually progress the way a D&D character can.

A Battletech mech eats peasants and soldiers for breakfast, but is pretty useless outside combat, and struggles to fight stereotypical Dragons.

A Time Lord can, as the name suggests, travel through time. And fly and teleport, and other nifty tricks, like self-resurrect. But they're pretty much just "a guy".

Naruto ninjas, I'm told, have quite a huge variance in their power, and most are not just 1-trick ponies. How much of their power is "level" vs "bloodline", and how quickly they can gain power, though, I don't know enough to answer. (Although the phrase, "younger than you, stronger than me", or something like that, seems to have been spoken early on…)

IIRC, early edition D&D Wizards can only cast 1 spell per minute (as rounds were 1 minute long), putting them at a severe disadvantage vs most beings. IIRC, my first Exalted and Scion characters each acted once per second, one high-level Heroes superhero I built acted twice per second, and some Battletech mechs can make an even higher rate of attacks than that.

And then there are things in D&D (including the Hecatoncheires) that can make 100+ attacks per round. Or infinite.

Harry Potter Wizards… seem to run on "at will, effect vs counter" logic, making them crazy powerful vs things that don't have counters.

And I'd still expect a sniper to one-shot almost every single character I've just described.

I don't know if there's anything useful in there. But I think I care about Growth Range, Growth Rate, Action Rate, and Adaptation Rate as valid ratings, and as balance concerns.

MoiMagnus
2021-09-07, 11:00 AM
Reliability -- when you use this power, does it <do thing> all the time? Or just some of the time. A power that requires some form of attack or defensive or casting check is less reliable than one that just happens; a power that bypasses immunities or "just works" is more reliable than one that has an element of random chance.

Slight subtlety, reliability should be about "being actually useful", not just doing the thing it is supposed to do.
Let me explain myself:
A power saying "+2 to your next attack roll" is significantly less reliable than a power saying "when you would fail an attack by 2 or less, succeed instead". Because literally 90% of the time the first one is useless [either you failed despite the bonus, or you would have succeeded even without the bonus] while the second one is always useful when used.
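The 90% figure can be checked by enumerating the faces of a d20. This is a small sketch with hypothetical helper names, where `needed` is the unmodified d20 face required to hit:

```python
def bonus_matters(face, needed, bonus=2):
    """True if the bonus turns this d20 face from a miss into a hit."""
    return face < needed <= face + bonus

def useful_fraction(needed, bonus=2):
    """Fraction of d20 faces on which the bonus changes the outcome."""
    return sum(bonus_matters(face, needed, bonus) for face in range(1, 21)) / 20

# Hypothetical target numbers; the result is the same for each of them.
for needed in (5, 11, 16):
    print(needed, useful_fraction(needed))   # 0.1 each time
```

Only the two faces just below the target number are affected, so the flat +2 matters on 2 faces out of 20. The "fail by 2 or less" version triggers on exactly those faces, which is why it is never wasted.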

PhoenixPhyre
2021-09-07, 11:32 AM
Slight subtlety, reliability should be about "being actually useful", not just doing the thing it is supposed to do.
Let me explain myself:
A power saying "+2 to your next attack roll" is significantly less reliable than a power saying "when you would fail an attack by 2 or less, succeed instead". Because literally 90% of the time the first one is useless [either you failed despite the bonus, or you would have succeeded even without the bonus] while the second one is always useful when used.

I think that's a valid point.

There's subtlety around timing--for example the 5e Paladin's Divine Smite ability (choose to trigger it on hit) is quite reliable--every time you use it, it adds damage[1]. Compare that to the paladin's smite spells (e.g. Searing Smite), which are bonus actions with concentration that take effect on your next hit. Which means that if you use it and miss both times that turn (assuming level 5+), you could get hit and lose concentration before you get to actually benefit from the spell. In return, the smite spells generally do more than just damage. Is it enough to make them worth it? Meh, opinions vary.


This is a case of a clear tradeoff at the pair of abilities level (using searing smite because that way I can just do 1st level comparisons):
1) Power: 9 (2d8) radiant damage vs 3.5 (1d6) fire + CON save each turn or 3.5 (1d6) fire damage. Target can also take an action to end it. To make Searing Smite stronger, it needs to burn for 2 rounds (totaling 10.5 damage). Generally, however, prompt damage is better than delayed damage, even if the delayed damage is guaranteed (which this is not).

2) Frequency: At level 2[1], they have the same frequency. You can use them each once per turn, both cost the same resource. At level 5+, Divine Smite is more frequent (you can use it on each hit, including hits not on your turn such as OAs) at the cost of more resources.

3) Versatility: Push. Basically both do damage and that's it. Same targeting restrictions, no other conditions. One tiny benefit to Searing Smite is that by doing recurring fire damage, you can (in theory) keep a troll from regenerating for multiple turns with one ability. One tiny benefit to Divine Smite is that you can blast through a zombie's annoying Undead Fortitude ability.

4) Reliability: Divine Smite wins hands down. You'll rarely, if ever, use Divine Smite and do nothing[2]. Searing Smite costs concentration, which is a big cost, and it can be utterly wasted. Divine Smite also has the more reliable damage type--many more things resist or are immune to fire than to radiant. Not only that, but there's an additional save (and it's the save most monsters are best at) that can completely negate the recurring damage. Without that recurring damage, Searing Smite stinks.

Verdict: DS >> SS. In each category, divine smite wins or ties with searing smite, and is significantly better in reliability and (at higher levels) frequency.

[1] paladins don't get spell slots or Divine Smite at level 1.
[2] The one exception is if you misjudged the enemy's HP and the base weapon damage would have killed it anyway. But this exists for Searing Smite as well, so it's not a significant difference.
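
For anyone who wants to poke at the Power numbers above, here's a rough sketch of the averages. It's my simplified reading, not a full model: it assumes the burn ends on the first successful CON save, caps the burn at a `max_rounds` parameter I picked for illustration, and ignores lost concentration, the action to put out the flames, and resistances. All function names are hypothetical.

```python
def die_avg(sides):
    """Average roll of one die with the given number of sides."""
    return (sides + 1) / 2


def divine_smite_avg():
    """1st-level Divine Smite: 2d8 radiant, all up front."""
    return 2 * die_avg(8)


def searing_smite_avg(fail_chance, max_rounds=10):
    """Expected Searing Smite damage: 1d6 on the hit, plus 1d6 per failed save.

    Assumption: the burn stops on the first successful save, and the target
    fails each CON save independently with probability `fail_chance`.
    """
    total = die_avg(6)          # damage on the hit itself
    still_burning = 1.0         # chance the burn has not ended yet
    for _ in range(max_rounds):
        still_burning *= fail_chance
        total += still_burning * die_avg(6)
    return total


print(divine_smite_avg())                     # 2d8 average
print(searing_smite_avg(1.0, max_rounds=2))   # burns for exactly 2 rounds
print(searing_smite_avg(0.5))                 # target fails half its saves
```

Even against a target that fails half its saves, the expected total stays below Divine Smite's 9 under these assumptions, and that's before counting the reliability problems.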


Another example from 5e about timing is the Protection fighting style (when an enemy attacks an ally within 5 feet of you, use your reaction to impose disadvantage on the attack). The default interpretation is that you have to do this before the attack is rolled. And that's not very reliable--if the target has high AC anyway, it's likely the attack would miss, so disadvantage doesn't do much. If they have low AC, then disadvantage isn't a huge help. There's another interpretation (which is the one I play with as a house rule) that says you can do this after the roll, but before damage is applied. That's way stronger--you can save it for times when the first roll hit. Or even turn a crit into...probably not a crit. That second interpretation turns a really lackluster ability (one generally not worth taking) into one that's quite good (as good defensively as the Archery style is offensively). Which is why I use it at my tables.

Both interpretations have the same Power -- impose disadvantage on one attack roll. They have the same Frequency -- once per round (costing a reaction) and the same restrictions on when you can activate it (when an ally within 5 feet of you is attacked and you have a shield). They have the same Versatility -- both can only be used for that one thing (reducing the chance an ally takes damage). But they have radically different Reliability, and that makes a huge difference in their overall usefulness.
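
Here's a back-of-the-envelope sketch of that Reliability gap. It assumes every attack against the ally hits with the same probability `hit_chance`, ignores crits, and treats "disadvantage after the roll" as one extra roll that also has to hit. The function names are hypothetical, just for illustration.

```python
def before_roll(hit_chance):
    """Chance a use declared before the roll turns a hit into a miss.

    The first roll has to hit (otherwise the reaction was wasted) and the
    second roll has to miss.
    """
    return hit_chance * (1 - hit_chance)


def after_roll_reliability(hit_chance):
    """Chance a use declared after seeing a hit negates it (house rule)."""
    return 1 - hit_chance


def after_roll_per_round(hit_chance, attacks=1):
    """Expected hits negated per round when the reaction is held for a hit.

    Assumption: `attacks` identical attacks target the ally that round, and
    the reaction is spent on the first one that hits.
    """
    some_hit = 1 - (1 - hit_chance) ** attacks
    return some_hit * (1 - hit_chance)


# Enemy hits half the time, two attacks per round.
print(before_roll(0.5))                # reaction burned on the first attack
print(after_roll_reliability(0.5))     # per use, once a hit is showing
print(after_roll_per_round(0.5, 2))    # per round, holding the reaction
```

With a single attack the two versions negate the same number of hits on average; the difference is that the house-ruled version never burns the reaction on an attack that was going to miss anyway, which starts to matter as soon as there's more than one attack in the round.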

LibraryOgre
2021-09-07, 12:54 PM
As an aside, in the old Bard's Tale Construction Set, I got tired of playing at level 1. So, I created a monster with several overwhelmingly powerful attacks... 100d100 and things like that... but then gave it a mediocre AC and HP, and set its AI so it never attacked. Roll in, kill it quickly, and gain a bunch of XP right off the bat.