Suggestion to compare Infantry Combat
Posts: 3114 | Subs: 2
Pre-release version for power calculator
Download
Please refer to this post for instructions on how to use the script.
Original post from 13.04.21:
I am going to present to you an improved way of comparing infantry combat performance that - when properly handled - takes into account what we currently cannot: the role of DPS retention and even model sniping, which are probably the most important factors in infantry vs infantry combat.
I was not sure if I wanted to post this for quite some time, but after I did first tests today that looked promising to me, I decided to put this out at least as a thought.
I typed out a proper text with introduction and everything, but then decided to just make a TL;DR version since no one will read it otherwise. I'll put the lengthy description of the model building into a spoiler at the end. If you have more detailed questions, I would kindly ask you to look at the spoiler first to check if I already answered it or not.
First off, I made some tests with early game Sturmpios vs IS, and with some classic matchups of late game Conscripts vs Grens at different ranges. We know Cons win close and Grens win long, but where exactly is the point where the squads are equivalent? Last but not least, there is Tightrope's recent video about PPSh Cons vs Assgrens where the strength flips at different vet levels. My model, under the right settings, was able to get the "correct" or at least close to correct answer. This means that it COULD be used as a method to estimate balance changes on weapons and infantry.
But what is it all about?
The base idea is to treat EHP (effective hit points, calculated by HP/received accuracy) not as a mere value, but as a measure of time that allows the squad to dish out damage. Figure 1 shows the plot of DPS vs EHP of double BAR Rifles vs LMG Grens (both vet3) at range 10 (the DPS values for the graph are slightly off since I was still using serealia at the time I made this). It is a step function: Losing EHP only matters when a model dies. If so, the DPS will lower since the model cannot contribute anymore.
The area under this plot is the main metric for my model. It is a measure of both the capability to dish out damage and of sturdiness. We take into account BOTH defining factors: First, HOW MUCH damage can a squad do, and second, for HOW LONG can it keep up the damage. I will term it "power" from now on. The elegance of this approach is that we get a standardized metric for all squads. Since DPS changes with range, we need to calculate the power (the area under the plot) at all possible distances. This is our metric for the fighting potential of a given squad at these ranges.
Figure 1
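The area under the step function can be written down in a few lines. This is only a minimal sketch of the idea, not the released script: it assumes that models die strictly one after another in list order (i.e. 100% model sniping, no HP shifting, no retreat), and the function names and the 5 DPS per model are made up for illustration.

```python
def effective_hp(hp, received_accuracy):
    """EHP as defined in the post: hit points divided by received accuracy."""
    return hp / received_accuracy


def squad_power(model_dps, model_ehp):
    """Area under the DPS-vs-EHP step function.

    Assumption: models die in list order, and every living model deals
    its full DPS until the moment it dies.
    """
    if len(model_dps) != len(model_ehp):
        raise ValueError("need one DPS and one EHP value per model")
    power = 0.0
    remaining_dps = sum(model_dps)
    for dps, ehp in zip(model_dps, model_ehp):
        # One step of the plot: width = EHP of the dying model,
        # height = DPS of everyone still alive.
        power += remaining_dps * ehp
        remaining_dps -= dps  # the model is dead, its DPS is lost
    return power


# Hypothetical four-man squad: 80 HP and received accuracy 1.0 per model,
# 5 DPS per model (illustrative numbers, not game data).
example_ehp = [effective_hp(80, 1.0)] * 4
example_dps = [5.0, 5.0, 5.0, 5.0]
example_power = squad_power(example_dps, example_ehp)
```

Repeating this calculation with the DPS values of every range gives the power curve over distance.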
Figure 2 shows the DPS and power graph of late game Grens vs Cons. We can see that Grenadiers always have higher DPS at all ranges. We know that they lose at closer ranges because of less EHP. On the other hand they also have better DPS retention. Just by knowing this, we cannot say where the "turning" point would be. My model however calculates it to be slightly above 25 meters. When I tested it in game, I saw Cons slightly winning below 25 while Grenadiers usually won at 30. While I only tested 4-6 fights per range, it is some evidence that my assumptions are probably not that far off the truth.
Figure 2
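Finding the "turning point" between two squads amounts to finding the range where their power curves cross. A minimal sketch, assuming the power has already been calculated at a few sample ranges and that linear interpolation between samples is good enough; the function name and all numbers below are placeholders, not the values behind Figure 2.

```python
def crossover_ranges(ranges, power_a, power_b):
    """Ranges at which (power_a - power_b) changes sign.

    Uses linear interpolation between neighbouring sample ranges.
    """
    diffs = [a - b for a, b in zip(power_a, power_b)]
    points = []
    for i in range(len(ranges) - 1):
        d0, d1 = diffs[i], diffs[i + 1]
        if d0 == 0:
            points.append(float(ranges[i]))
        elif d0 * d1 < 0:
            t = d0 / (d0 - d1)  # fraction of the way to the next sample
            points.append(ranges[i] + t * (ranges[i + 1] - ranges[i]))
    if diffs and diffs[-1] == 0:
        points.append(float(ranges[-1]))
    return points


# Placeholder power curves (NOT game data): squad A is stronger up close,
# squad B is stronger at long range.
sample_ranges = [0, 10, 20, 30, 40]
power_close_squad = [900, 800, 700, 600, 500]
power_long_squad = [500, 550, 600, 650, 700]
crossings = crossover_ranges(sample_ranges, power_close_squad, power_long_squad)
```

With these placeholder curves the single crossing lands between the 20 and 30 samples, which is the kind of answer the model gives for the Cons vs Grens question.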
On a second occasion, I tested the early game situation of Sturmpios assaulting IS behind sandbags (Figure 3). The current sentiment is that SPs lose when they lose a model on the approach, otherwise they win. I did some simplified tests and put them directly in front of each other across the same sandbags. My model predicted that SPs will win even with 3 men if they are all at full health (Sturmpioneers_3m), which they did in game as well. Even at 80% health per model (_low), SPs usually won the duel, which is consistent with the model. My model also predicted the point of equivalency at roughly 3 men with 57% health each (vlow), whereas 2 men will lose even at full health (2m). In the game, SPs won 3/6 fights under the predicted equivalency settings, whereas they lost when they were only two men.
Figure 3
Finally, I wanted to check if my model can also explain the situation in Tightrope's video. In Figure 4, you can see 4 squads: Assgrens (5 men) and PPSh Conscripts that reflect the early game. This situation is interesting: Conscripts have about 5% EHP more, but 8% DPS less. They also have better DPS retention, but if you are presented with these numbers, could you really tell that Assgrens should win as clearly as in Tightrope's test? The DPS level and EHP value alone are not that clear on the matter. The modeled power level of early PPSh Conscripts however is 36% higher, suggesting a larger advantage. The late game situation plays out differently. Here, Assgrens are predicted to win convincingly (35% higher power), but this would have also been predicted by the old method. However, if we assume that this model is capable of making predictions, we can also use it to get a first impression of how much impact a balance change will have. I therefore added the live version of late game PPSh Conscripts with 3x PPShs.
Obviously, there are some parameters to set. I will touch on this in more detail in the lengthy description, so I will keep this short: We can model the retreat of a squad so that the last models will not add any power once a certain EHP threshold is reached (they will flee and not shoot anymore). We can also to some extent model the damage spread across multiple soldiers by shifting EHP between them. What it obviously can't do is evaluate the whole process of approaching etc. But it can at least make a prediction of how many units must survive for closing in to be worth it.
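The retreat parameter can be pictured as cutting off the right-hand end of the DPS-vs-EHP plot. The following is a rough sketch under my own assumptions (threshold given as remaining squad EHP, models dying in list order); the function name and numbers are hypothetical and the released script may handle this differently.

```python
def squad_power_with_retreat(model_dps, model_ehp, retreat_ehp=0.0):
    """Area under the DPS-vs-EHP step function up to the retreat point.

    Assumption: the squad stops shooting once its remaining total EHP
    has dropped to retreat_ehp. retreat_ehp=0 means "fight to the death".
    """
    budget = sum(model_ehp) - retreat_ehp  # EHP the squad may lose before fleeing
    remaining_dps = sum(model_dps)
    power = 0.0
    for dps, ehp in zip(model_dps, model_ehp):
        if budget <= 0:
            break
        spent = min(ehp, budget)  # the last step may be cut short by the retreat
        power += remaining_dps * spent
        budget -= spent
        remaining_dps -= dps
    return power


# Hypothetical squad: 4 models, 80 EHP and 5 DPS each (illustrative only).
dps_values = [5.0] * 4
ehp_values = [80.0] * 4
power_to_death = squad_power_with_retreat(dps_values, ehp_values)
power_with_retreat = squad_power_with_retreat(dps_values, ehp_values, retreat_ehp=120.0)
```

In this example the retreat at 120 EHP (1.5 models left) removes the part of the plot where only the last models would still be shooting.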
So, finally, that's it from my side. I want to put it out here as an idea, a train of thought. I am interested in what you think about it. I was quite skeptical myself when making it, but at least the first tests gave me some confidence that this might be a decent idea to quickly check on unit balance. Not within 1-2%, but at least to check how much of a buff/nerf a unit needs to be in line with others. It could also help to increase unit diversity. For example, instead of adding EHP to a glass cannon squad, it could estimate how much more DPS the squad needs to become competitive again and thereby keep the uniqueness. Similarly, we can also alter veterancy parameters to see if a different veterancy bonus fits better.
A script will probably be released when it is ready. Currently I am still checking everything and streamlining it a bit.
Hannibal
(lengthy version)
Posts: 13496 | Subs: 1
Posts: 999 | Subs: 1
i agree that this approach has some clear advantages over numerical simulations, most notably the much reduced computational effort and strictly deterministic results.
however, with respect to the latter this may also be seen as a disadvantage, as a set of simulations can provide some valuable info on the variance and probability distribution of all possible ingame outcomes.
hence, i was wondering if your model could be tweaked to produce some kind of upper and lower bounds for the power level, pretty much like the standard deviation in a set of simulated results would (e.g. by running it once under the most favorable conditions (focus fire on single models, entities with non-transferable weapons die last, etc) and once under the most unfavorable)?
another question i had was how you handle weapon upgrades that are transferable (like the lmg42) and those that are bound to a specific model (officer thompsons)? clearly, for the former the order in which models get killed doesn't matter, while for the latter there would be quite a significant difference in power level depending if the model dies right at the beginning vs at the end of the fight.
Posts: 578
I'd like to see you add scenarios, which people could always refer to for future buff comparisons:
1. Early game long range
2. Early short
3. Mid long
4. Mid short
5. Late long
6. Late short
These scenarios would create a valuable baseline comparison for people to refer to and the model could be used to check impact of changes. Any scripts you create would follow the scenarios. You have to ensure the script gives people this functionality, else they'll 'creatively interpret' it for their own ends.
Consider adding manpower cost as a third axis, or as a weighting factor in the ehp calc, where the average of all unit costs in a given scenario is +0. This would help include manpower cost in the comparative analysis.
Please define EHP in your post above.
Posts: 13496 | Subs: 1
...
Please define EHP in your post above.
EHP stands for effective hit points. In this case it means total HP divided by target size (and multiplied by armor, if any).
Posts: 3114 | Subs: 2
only had time to quickly glance over it, but i think this model has great potential. awesome piece of work and clear presentation, kudos!
i agree that this approach has some clear advantages over numerical simulations, most notably the much reduced computational effort and strictly deterministic results.
however, with respect to the latter this may also be seen as a disadvantage, as a set of simulations can provide some valuable info on the variance and probability distribution of all possible ingame outcomes.
hence, i was wondering if your model could be tweaked to produce some kind of upper and lower bounds for the power level, pretty much like the standard deviation in a set of simulated results would (e.g. by running it once under the most favorable conditions (focus fire on single models, entities with non-transferable weapons die last, etc) and once under the most unfavorable)?
Thank you very much.
Indeed, variance is barely possible. But as I pointed out in another post: If we want to run simulations, we need to know how a model decides which of the enemy models it shoots at. Range is important, but I am fairly certain that if multiple models are in range at the same time, there is a chance that a model will be targeted that is not the closest one.
From what I see, variance in infantry fights comes mostly from this model sniping and less from RNG on the weapon stats.
Unless we understand that, we cannot understand variance in infantry fights. Neither with simulations nor with my model.
You could tweak the model by just lowering the DPS output/health/whatever for one of the squads to apply some kind of debuff to them. For example, not shifting HP between models will calculate the power for being 100% model sniped.
another question i had was how you handle weapon upgrades that are transferable (like the lmg42) and those that are bound to a specific model (officer thompsons)? clearly, for the former the order in which models get killed doesn't matter, while for the latter there would be quite a significant difference in power level depending if the model dies right at the beginning vs at the end of the fight.
They don't handle well. For simplicity, there is no "transferable" weapon. This whole thing is a table calculation, it is not a "simulation" in any way.
For non transferable weapons, we have two options: Either run all possible setups and take the average, or put the non-transferable weapon into "the middle" of the squad. For example, the USF officer Thompson goes to the third place of an unupgraded squad. This is at least somewhat close.
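The two options for non-transferable weapons can be compared directly. A minimal sketch, assuming the same "models die in list order" table calculation as in the opening post; the helper names, the five-man squad and all DPS/EHP numbers are hypothetical.

```python
def squad_power(model_dps, model_ehp):
    """Area under the DPS-vs-EHP step function, models dying in list order."""
    power = 0.0
    remaining_dps = sum(model_dps)
    for dps, ehp in zip(model_dps, model_ehp):
        power += remaining_dps * ehp
        remaining_dps -= dps
    return power


def power_for_slot(base_dps, special_dps, slot, model_ehp):
    """Power when the model carrying the bound weapon dies at position `slot`."""
    dps = list(base_dps)
    dps[slot] = special_dps
    return squad_power(dps, model_ehp)


def average_over_slots(base_dps, special_dps, model_ehp):
    """Option 1: run every possible position and take the average."""
    n = len(base_dps)
    return sum(power_for_slot(base_dps, special_dps, s, model_ehp) for s in range(n)) / n


# Hypothetical five-man squad: 4 DPS per rifle, 10 DPS for the bound weapon,
# 80 EHP per model (illustrative numbers only).
rifles = [4.0] * 5
ehp = [80.0] * 5
averaged = average_over_slots(rifles, 10.0, ehp)
middle = power_for_slot(rifles, 10.0, slot=2, model_ehp=ehp)  # Option 2: third place
```

When every model has the same EHP, as here, the "middle" placement gives the same number as the average over all positions; with unequal EHP per model the two can drift apart.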
Impressive, nice piece of work.
I'd like to see you add scenarios, which people could always refer to for future buff comparisons:
1. Early game long range
2. Early short
3. Mid long
4. Mid short
5. Late long
6. Late short
These scenarios would create a valuable baseline comparison for people to refer to and the model could be used to check impact of changes. Any scripts you create would follow the scenarios. You have to ensure the script gives people this functionality, else they'll 'creatively interpret' it for their own ends.
Consider adding manpower cost as a third axis, or as a weighting factor in the ehp calc, where the average of all unit costs in a given scenario is +0. This would help include manpower cost in the comparative analysis.
Please define EHP in your post above.
Thanks for the note with EHP, I have added it to the main post.
What exactly do you mean by the scenarios? The power is already being calculated for all ranges. What the comparison estimates is the strength of the squads if they met at a given range and stayed at that range.
The issue is that there need to be some hand-made assumptions. The squads in all my tests did not retreat. As I briefly mentioned, this is not a real in-game situation, it was just to test the model. We can add this retreat to the model without problems. But how? When the squad hits 200 EHP? When there is a specific number of models left? It is hard to tell. For example, I'd retreat earlier with Conscripts than with Rifles. These values are to some extent arbitrary, and tweaking them is important. We additionally have the case that we might want to "simulate" a certain setup: Obers vs Conscripts and Volks vs Conscripts. However, in a given fight I retreat earlier vs Obers because I know they will still deal a lot of damage at long range, whereas I am fairly safe vs Volks once my Conscripts are 20+ meters away. A fixed value can't capture it. But it potentially also does not have to, because no squad gets balanced against one very specific other squad. A generalized value gives us a general power level. For a specific setup, we maybe should tweak it slightly. How? A little bit arbitrarily.
Other calcs of mine (not shown) always assumed that e.g. Grens retreat at 1,5 models while Rifles and Volks retreat at 2 models and Cons at 2,5. Is that exactly the case? Probably not. But I think it is at least a reasonable starting point. Disclosing this information when running a setup is important. In my examples I used "fight to death" scenarios since they do not rely on this retreat assumption.
I have two different metrics that are not shown for brevity:
- power per pop (in my eyes the most important one), which divides the power at any point by the population of the squad.
- power per MP: divides the power by the reinforcement MP. However only valuable when the retreat assumption is made. For example, late game Grens lose 2,5 models before retreating -> 2,5*28 = 70 MP -> All power values are divided by 70.
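The two derived metrics are plain divisions. A small sketch that reproduces the Grenadier arithmetic from the list above (2,5 models lost at 28 MP reinforce cost each); the function names, the power value of 3500 and the population of 7 are placeholders for illustration.

```python
def power_per_pop(power, population):
    """Power divided by the population cost of the squad."""
    return power / population


def power_per_mp(power, models_lost_before_retreat, reinforce_cost):
    """Power divided by the manpower needed to reinforce the lost models.

    Only meaningful together with a retreat assumption, because that
    assumption defines how many models are lost.
    """
    mp_lost = models_lost_before_retreat * reinforce_cost
    return power / mp_lost


# Retreat assumption and reinforce cost taken from the post: 2.5 * 28 = 70 MP.
gren_mp_lost = 2.5 * 28
# Placeholder power and population values, not model output.
gren_power_per_mp = power_per_mp(3500.0, 2.5, 28)
gren_power_per_pop = power_per_pop(3500.0, 7)
```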
Again, the point of this model is not to discover if a unit is a couple of % stronger than the other. But under the right assumptions it can evaluate some balance changes. In the current example of PPSh Cons, we can potentially decide what to do: What is the effect of the fourth PPSh during the game? Does the squad become too strong or is it still weak? Is it better to add an RA modifier to PPSh Cons? Or just make the weapon itself stronger?
We won't be able to tell if the modeled value is the perfect result, but at least if the change is in the right order of magnitude. Fine tuning must be done within the game.
Posts: 1708 | Subs: 2
The number of hits required to kill a model is something to take into consideration. A while back you would have the instance of a conscript squad volley (6 shots) where 5 out of the 6 hit a single gren model, killing the model outright. This is on the basis that successful mosin hits used to do 16 damage per hit and a model has 80hp.
Posts: 3114 | Subs: 2
DPS is a bit of an odd one as the interaction between infantry is more complicated than the average damage output.
The number of hits required to kill a model is something to take into consideration. A while back you would have the instance of a conscript squad volley (6 shots) where 5 out of the 6 hit a single gren model, killing the model outright. This is on the basis that successful mosin hits used to do 16 damage per hit and a model has 80hp.
I agree with this. However this does not allow for a generalized metric at all. Additionally, there is no easy way to calculate mixed weapons in a squad.
Posts: 486
What do you mean by mixed weapons? Adding lmgs and various should be straightforward as long as range is fixed as those weapons always stay around till the end, so use their DPS for the bottom 1-to-2 troops (or am i missing something). The biggest outlier will be non-focus fire weapons which have weird characteristics, which is why Falls sometimes feel bad. But you predict SMG results well so it looks fine.
I'll be waiting on that script, it'll be nice to have more good tools.
Posts: 3114 | Subs: 2
Great work!
What do you mean by mixed weapons? Adding lmgs and various should be straightforward as long as range is fixed as those weapons always stay around till the end, so use their DPS for the bottom 1-to-2 troops (or am i missing something). The biggest outlier will be non-focus fire weapons which have weird characteristics, which is why Falls sometimes feel bad. But you predict SMG results well so it looks fine.
I'll be waiting on that script, it'll be nice to have more good tools.
The mixed weapons remark was a response to the suggestion to calculate the number of bullets needed to kill. But this is hugely complicated, especially if there is more than one weapon in a squad. E.g. an LMG and a rifle have different damage values. There is no fixed number of bullets that is necessary to kill, because we don't know how many bullets each weapon will contribute. Therefore we'd need the 'most likely' number of bullets. But to calculate this there are further assumptions etc.
Since my model uses DPS instead, everything is simplified. Mixing weapons is no issue and already 'implemented'. I have not cared about focus fire so far because no one really knows how it works. Not just in the sense that some formula is slightly off, but really in the sense that we only have very small pieces of information.
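To illustrate why there is no single bullets-to-kill number with two weapons, here is a small enumeration. It is only a sketch: the 16 damage per rifle hit and 80 HP are the old Mosin figures quoted above, while the 20 damage for the second weapon and the function names are made up for the example.

```python
import math


def hits_to_kill(hp, damage):
    """Hits needed when only one weapon type is shooting."""
    return math.ceil(hp / damage)


def minimal_kill_combinations(hp, damage_a, damage_b):
    """All (hits_a, hits_b) pairs that kill with no hit to spare.

    A pair counts if the combined damage reaches hp and taking away any
    single hit of either weapon would leave the model alive.
    """
    combos = []
    for hits_a in range(hits_to_kill(hp, damage_a) + 1):
        for hits_b in range(hits_to_kill(hp, damage_b) + 1):
            total = hits_a * damage_a + hits_b * damage_b
            if total < hp:
                continue
            a_needed = hits_a == 0 or total - damage_a < hp
            b_needed = hits_b == 0 or total - damage_b < hp
            if a_needed and b_needed:
                combos.append((hits_a, hits_b))
    return combos


# 80 HP model, 16 damage per rifle hit (from the post),
# 20 damage per hit for a hypothetical second weapon.
rifle_only = hits_to_kill(80, 16)
combos = minimal_kill_combinations(80, 16, 20)
```

Depending on which weapon lands how many hits, the kill takes four or five bullets here, and which combination is "most likely" would need further assumptions about rate of fire and accuracy.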
Posts: 486
Posts: 999 | Subs: 1
Thank you very much.
Indeed, variance is barely possible. But as I pointed out in another post: If we want to run simulations, we need to know how a model decides which of the enemy models it shoots at. Range is important, but I am fairly certain that if multiple models are in range at the same time, there is a chance that a model will be targeted that is not the closest one.
From what I see, variance in infantry fights comes mostly from this model sniping and less from RNG on the weapon stats.
Unless we understand that, we cannot understand variance in infantry fights. Neither with simulations nor with my model.
You could tweak the model by just lowering the DPS output/health/whatever for one of the squads to apply some kind of debuff to them. For example, not shifting HP between models will calculate the power for being 100% model sniped.
thanks for clearing that up. though i don't necessarily agree that a complete understanding of the targeting algorithm would be necessary to simulate things - we can always make assumptions and simplifications to get a rough idea how things line up with the in-game results and tweak accordingly. after all, your model also requires certain informed simplifications to be made, which isn't necessarily a bad thing if the results work out nicely.
anyway, i guess that wasn't really the point of my question. what i meant was pretty much what you outlined in the last sentence. if i understand this correctly the amount of hp shifting kind of simulates the degree of 'focus fire' towards single models? in this case (and assuming this is the major factor that defines the randomness of the outcome of a firefight) it would be interesting to see if the variance in a set of calculations with randomized values for hp transfer would match up with the variance in a set of in-game tests (to some extent at least). well, maybe i'm fetching things a bit too far but i thought i just put this out here nonetheless...
Posts: 3114 | Subs: 2
thanks for clearing that up. though i don't necessarily agree that a complete understanding of the targeting algorithm would be necessary to simulate things - we can always make assumptions and simplifications to get a rough idea how things line up with the in-game results and tweak accordingly. after all, your model also requires certain informed simplifications to be made, which isn't necessarily a bad thing if the results work out nicely.
anyway, i guess that wasn't really the point of my question. what i meant was pretty much what you outlined in the last sentence. if i understand this correctly the amount of hp shifting kind of simulates the degree of 'focus fire' towards single models? in this case (and assuming this is the major factor that defines the randomness of the outcome of a firefight) it would be interesting to see if the variance in a set of calculations with randomized values for hp transfer would match up with the variance in a set of in-game tests (to some extent at least). well, maybe i'm fetching things a bit too far but i thought i just put this out here nonetheless...
Maybe it is possible, but it will be a lot of work, and cross-checking with in-game measurements will be quite a pain to get some real variance data (for a bigger set of matchups regarding DPS centralization and squad size). Every time you tweak your model, you need to re-run the simulation. It will probably be doable to find settings that mimic CoH2 behaviour, but it'll take ages.
My points for model sniping being the main factor are twofold: First, whenever I tried to verify my DPS calculations with in-game measurements, I often tested a carbine ranger model shooting at Partisans. And this dude was surprisingly consistent in killing times despite the carbine having a certain amount of RNG delays. He had a lot of shots though to even things out, but any slower shooting squad would have that as well. I had some other tests too that I don't remember that well anymore, but I still kept that as the bottom line. Second, we have seen for ages that we get way more variance in infantry tests when we test them in cover. This also hints that it probably is more important that a model got killed than how the RNG rolls turn out. Is it true? I never tested it in detail, but at least this is my assumption.
Anyway, back to your actual question: The idea behind this HP shifting is rather easy. Say three Grenadiers shoot at Conscript Yuri and the fourth at Conscript Ivan. Yuri will still get killed first, but it takes more time, since he does not eat all the damage alone anymore. The DPS dropoff due to the loss of Yuri is delayed, because Ivan takes some of the damage. This is functionally similar to Ivan shifting some of his HP to Yuri, thereby prolonging the overall time that a higher DPS of the Soviet squad can be maintained.
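The Yuri and Ivan example can be put into numbers. This is a rough sketch of one possible way to shift HP, under my own assumption that each model donates a fixed share of its EHP to the model dying directly before it; the share of 20%, the function names and the DPS values are hypothetical, and the actual script may distribute HP differently.

```python
def shift_ehp(model_ehp, share):
    """Move `share` of each model's EHP to the model that dies before it.

    Total squad EHP stays the same, only its distribution changes.
    """
    shifted = list(model_ehp)
    for i in range(len(shifted) - 1):
        donated = share * shifted[i + 1]
        shifted[i] += donated
        shifted[i + 1] -= donated
    return shifted


def squad_power(model_dps, model_ehp):
    """Area under the DPS-vs-EHP step function, models dying in list order."""
    power = 0.0
    remaining_dps = sum(model_dps)
    for dps, ehp in zip(model_dps, model_ehp):
        power += remaining_dps * ehp
        remaining_dps -= dps
    return power


# Yuri dies first, Ivan second: 80 EHP and 5 DPS each (illustrative numbers).
dps = [5.0, 5.0]
ehp = [80.0, 80.0]
sniped_power = squad_power(dps, ehp)                   # no shifting: 100% model sniping
shifted_power = squad_power(dps, shift_ehp(ehp, 0.2))  # Ivan gives 20% of his HP to Yuri
```

Because the shifted EHP is spent while both models are still shooting, the power goes up even though the total EHP of the squad is unchanged.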
The way you set this up is fully arbitrary. Does Ivan give 20% of his HP? 50%? 90%? We don't know. In the game I have seen my Rifles running back with 5 models and what must have been no more than 5 HP judging by the health bar. On the other hand I sometimes drop models one by one. What is the middle ground between those? No idea. I think out of pure luck I found some settings that are "good enough" to align roughly with what I tested. We can surely find better ones down the line, but I assume they might be okay as a starting point.
The amount of shifting can be tested. There probably are multiple setups of this shifting that yield similar results. I just checked my current setup: Conscripts here get about 20% better when shifting is allowed, Grens about 10%. This could be personal bias in my manual setup, but to some extent it makes sense in my eyes (Conscripts "offer" more targets, so damage is more likely to be spread. Grenadiers also centralize more DPS, so they shift HP to models that have comparatively little contribution to the overall damage. Therefore, power does not increase as much).
Posts: 3114 | Subs: 2
Pre-release version for power calculator
Download
Contains
- 1 ipynb file containing the script
- 1 csv file containing weapon data
- 1 xlsx file containing squads
Needs Python 3.X, pandas, numpy, math, plotly.express libraries
Functions:
- calculation of Squad Power
- calculation of Squad Power/POP and Power/MP
- includes Weapon DPS and Squad DPS script (calculation of weapon DPS and total squad DPS)
Bear in mind this is still a pre-version so some things will still change, especially regarding the squads file. The method of adding squads is rough around the edges and could be solved more elegantly in the future. For the time being, it works.
I think this can be fairly straightforward. I previously explained what "power" is and how it is calculated in the opening post, as well as some hints why this metric could be a good indicator to compare squads. If you have questions in that regard, please read post #1 first. If this does not answer your question, feel free to ask.
Setting up a new squad
What is retreat weight and what is HP weight?
Running the script
Weapon comparison
Posts: 13496 | Subs: 1
one should probably use the square when calculating the power of squads. more about it here:
https://www.gamasutra.com/blogs/PeterQumsieh/20150115/233644/Balancing_Multiplayer_Games__Intuition_Iteration_and_Numbers.php
https://www.gamasutra.com/blogs/PeterQumsieh/20150414/240950/Balancing_Multiplayer_Games__Opportunity_Power_and_Relativity.php
Posts: 3114 | Subs: 2
Great work
one should probably use the square when calculating the power of squads. more about it here:
https://www.gamasutra.com/blogs/PeterQumsieh/20150115/233644/Balancing_Multiplayer_Games__Intuition_Iteration_and_Numbers.php
https://www.gamasutra.com/blogs/PeterQumsieh/20150414/240950/Balancing_Multiplayer_Games__Opportunity_Power_and_Relativity.php
Thanks, I did not know about these articles.
Looks like my method calculates what the author terms the "raw value" of a unit. This is good information since Relic seems to use it as a balance indicator when they were taking care of balancing units. Given the tests I made, I doubt that this principle was patched out over the years, so we can still use it as an indicator.
Posts: 13496 | Subs: 1
Thanks, I did not know about these articles.
Looks like my method calculates what the author terms the "raw value" of a unit. This is good information since Relic seems to use it as a balance indicator when they were taking care of balancing units. Given the tests I made, I doubt that this principle was patched out over the years, so we can still use it as an indicator.
Yes they are interesting articles, glad that I could help.