by MikeC_81 » Wed Mar 20, 2019 3:41 am
First of all, thanks to Pete and the others who ran the tests. I am just going to recap the mechanics of combat, the basic math behind it, and predicted results vs actual data before going on to more discussion. Melm already had a good post on this but I'll make sure everything is visible.
What I expected
From my time doing tests a year ago for the Beginner's guide, I knew that when 2 units with the same PoA fight each other in close combat, each side loses approximately 14% of the time and draws happen around 72% of the time. Every time a unit loses in close combat, it has to take a Cohesion Test (CT), and this mathematical function is public information. Simply roll two dice: a score of 6 or more passes, and a score of 2 or less means you get the dreaded "double drop" where you not only fail the test but lose 2 ranks of cohesion from it. When bad things happen, you subtract from the dice roll for various factors. The relevant factors for this test are: losing to Impact Foot, suffering more than 5% combat damage, and total combat damage suffered this turn exceeding total damage inflicted by a large margin. The last two are kind of nebulous and I never could get a good answer as to what those meant from RBS other than "It's complicated", but the relevant fact at hand here is that I have never seen two steady, non-light foot units fight each other, produce a loser, and not have those two penalties apply. There is no such thing as a "bad loss" vs a "not bad loss" between these units as far as the empirical data shows.
The test set up by Pete involved smashing 2 Warband units together repeatedly to see if double drops occurred more often than they were supposed to, as was charged by 76mm. The expected result is therefore easy to calculate. Each Warband has the same PoA, and thus the odds of either of them losing are 14% + 14% = 28%. This gives us, for any given Warband vs Warband Impact round, the % chance that someone has to take a CT test. We also know all the factors involved in such a test, which amounts to rolling two dice and subtracting 3 from the result. One needs to score 3 or higher on those two dice after modifiers are applied to avoid the dreaded double drop. So in effect, if you roll 5 or lower in this scenario, a double drop will happen (5-3=2). The odds of rolling a 5 or lower on two dice are 10 in 36, or 27.78%. You can either just believe me on this one or you can work it out yourself; I am not going to prove that here.
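For anyone who would rather check the 27.78% figure than take it on trust, here is a minimal Python sketch that enumerates all 36 two-dice outcomes. It assumes only what is described above: a total modifier of -3, a pass on a modified 6 or more, and a double drop on a modified 2 or less. The function name and the outcome labels are made up for illustration and are not taken from the game's code.

```python
from fractions import Fraction
from itertools import product


def ct_outcome_odds(modifier=-3, pass_score=6, double_drop_score=2):
    """Enumerate two six-sided dice and return exact odds per CT outcome.

    Thresholds follow the description in the post; labels are illustrative.
    """
    counts = {"pass": 0, "single drop": 0, "double drop": 0}
    for a, b in product(range(1, 7), repeat=2):
        score = a + b + modifier
        if score >= pass_score:
            counts["pass"] += 1
        elif score <= double_drop_score:
            counts["double drop"] += 1
        else:
            counts["single drop"] += 1
    return {name: Fraction(n, 36) for name, n in counts.items()}


odds = ct_outcome_odds()
for name, p in odds.items():
    print(f"{name}: {p} = {float(p):.2%}")
```

With the -3 modifier this works out to 10/36 for a pass, 16/36 for a single drop, and 10/36 for a double drop.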
So to find out how often we get double drops per Impact combat round between two Warbands, we simply multiply the odds of either Warband losing, and thus having to take a test, by the odds of them rolling a 5 or lower on that test. 28% x 27.78% = 7.7784%. Since Pete smashed 30 Warbands vs 30 Warbands twenty-five times, we would expect each run to yield an average of 2.33 double drops (frags) out of 8.4 CT tests.
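The same arithmetic as a short Python sketch, so the numbers can be re-run with a different loss chance. The 14% per-side loss chance is the empirical estimate quoted above, not a value from the game's code, and the variable names are just for illustration.

```python
combats_per_run = 30               # 30 Warbands vs 30 Warbands
p_side_loses = 0.14                # empirical estimate, per side
p_double_drop_given_ct = 10 / 36   # roll of 5 or lower on two dice

# Chance that a combat produces a loser who must take a CT test.
p_ct = 2 * p_side_loses
# Chance that a combat ends in a double drop.
p_double_drop = p_ct * p_double_drop_given_ct

expected_cts = combats_per_run * p_ct
expected_double_drops = combats_per_run * p_double_drop

print(f"CT tests per run:     {expected_cts:.2f}")
print(f"Double drops per run: {expected_double_drops:.2f}")
```

Changing `p_side_loses` to 0.15 shows how sensitive both expectations are to the estimated loss chance.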
Here is Pete's Data condensed (cut and paste into a spreadsheet for easier viewing or just skip it and read after) :
Frag  Disrupted  CT Passed  CT Total
3     4          1          8
3     6          1          10
6     3          2          11
2     5          5          12
3     3          0          6
5     0          3          8
2     5          2          9
1     2          3          6
1     2          3          6
2     4          1          7
3     6          4          13
2     2          3          7
2     6          2          10
0     3          3          6
2     3          1          6
5     7          3          15
2     5          0          7
3     4          3          10
2     6          2          10
2     7          5          14
2     3          3          8
3     1          5          9
2     1          4          7
7     4          0          11
5     3          0          8
2.8   3.8        2.36       8.96
Last line is Averages for the 25 runs.
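If a spreadsheet is not handy, the table can be checked with a few lines of Python. This sketch just re-types the 25 rows above, confirms that each row's three outcomes add up to its CT total, and recomputes the totals and averages. The variable names are arbitrary.

```python
# (frag, disrupted, passed, total) for each of the 25 runs, copied from the table
runs = [
    (3, 4, 1, 8), (3, 6, 1, 10), (6, 3, 2, 11), (2, 5, 5, 12), (3, 3, 0, 6),
    (5, 0, 3, 8), (2, 5, 2, 9), (1, 2, 3, 6), (1, 2, 3, 6), (2, 4, 1, 7),
    (3, 6, 4, 13), (2, 2, 3, 7), (2, 6, 2, 10), (0, 3, 3, 6), (2, 3, 1, 6),
    (5, 7, 3, 15), (2, 5, 0, 7), (3, 4, 3, 10), (2, 6, 2, 10), (2, 7, 5, 14),
    (2, 3, 3, 8), (3, 1, 5, 9), (2, 1, 4, 7), (7, 4, 0, 11), (5, 3, 0, 8),
]

# Every row should be internally consistent.
rows_consistent = all(f + d + p == t for f, d, p, t in runs)

labels = ("frag", "disrupted", "passed", "total")
totals = [sum(column) for column in zip(*runs)]
averages = [total / len(runs) for total in totals]

print("rows consistent:", rows_consistent)
for label, total, average in zip(labels, totals, averages):
    print(f"{label:>9}: total {total:3d}, average {average:.2f}")
```

The totals are 70 frags, 95 disrupts and 59 passes out of 224 CT tests, which matches the averages line.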
Gee Mike, is 2.8 a lot higher than 2.33?
No, it is actually well within reason. On average in each run, 8.96 Warbands were forced to take a CT test rather than the 8.4 we expected. The reason may be that the 14% chance to win is not accurate. The exact interaction between units in combat remains an overly complex calculation that even RBS can't fully explain to me. The numbers for the win/draw/loss table I posted on page 2 of this thread were obtained by empirical trial and error, right-clicking the attack tooltip and coming up with a "close number". It could be that the odds of winning and losing are closer to 15%. Or this is just a case of variance.
In any case, the percentage of Warbands taking a CT test that actually double dropped in Pete's scenario is 2.8/8.96 = 31.25% when we are expecting 27.78%. At 27.78% of the 8.96 CT tests actually taken per run, we would expect a mean of about 2.49 double drops per run against the 2.8 observed. The difference here is negligible over a sample size of 224 events (total number of CTs taken over the whole test). A layman's way of looking at it is to suppose, for example, that RBS's code was wrong and somehow required an unmodified roll of 7 instead of 6. This would mean the expected percentage of failing would shoot up to 41.67%, and the tests do not come close to that.
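To put a rough number on "negligible", here is a minimal sketch of an exact binomial tail calculation using only the Python standard library. It assumes the 224 CT tests are independent and all taken at the same -3 modifier, which is a simplification of mine. The two helper functions are illustrative names, and 15/36 is the off-by-one scenario from the paragraph above (a double drop on a roll of 6 or lower).

```python
from math import comb


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def binom_lower_tail(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(0, k + 1))


n_tests = 224          # total CT tests over the 25 runs
observed = 70          # total double drops (2.8 per run x 25 runs)
p_advertised = 10 / 36  # roll of 5 or lower
p_off_by_one = 15 / 36  # hypothetical bug: roll of 6 or lower

# How likely is a result at least this high if the rules are as advertised?
p_high = binom_upper_tail(observed, n_tests, p_advertised)
# How likely is a result this low if the threshold were off by one?
p_low = binom_lower_tail(observed, n_tests, p_off_by_one)

print(f"P(>= {observed} double drops | 27.78%) = {p_high:.4f}")
print(f"P(<= {observed} double drops | 41.67%) = {p_low:.4f}")
```

Under these assumptions the first figure should come out comfortably above the usual 5% cut-off and the second well below 1%, which is the same point made in words above.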
As an aside, the theoretical number of hold firms per run is about 2.49 vs 2.36 observed. Disrupts were about 3.98 vs 3.8 observed.
One can also look at the distribution of double drops. In 25 runs, only 5 entries show significant deviation from the expected norm of 2.33 double drops. I am not going to do a standard deviation analysis on this, but it is not in any way abnormal to the point where you can say something unintended is happening.
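For readers who do want a rough yardstick for "significant deviation", the sketch below treats each run as 30 independent combats, each with a 28% x 10/36 chance of a double drop, and computes the mean and standard deviation of the resulting binomial count. The independence assumption and the two-standard-deviation cut-off are choices made for illustration, not anything established in the thread.

```python
from math import sqrt

combats_per_run = 30
p_double_drop = 0.28 * 10 / 36   # per combat, from the figures above

# Mean and standard deviation of a Binomial(30, p) count.
mean = combats_per_run * p_double_drop
sd = sqrt(combats_per_run * p_double_drop * (1 - p_double_drop))

# Frag column of the table, one entry per run.
frags = [3, 3, 6, 2, 3, 5, 2, 1, 1, 2, 3, 2, 2, 0, 2, 5, 2, 3, 2, 2, 2, 3, 2, 7, 5]

# Runs more than two standard deviations away from the expected mean.
beyond_two_sd = [f for f in frags if abs(f - mean) > 2 * sd]

print(f"expected mean {mean:.2f}, standard deviation {sd:.2f}")
print("runs beyond two standard deviations:", beyond_two_sd)
```

A stricter or looser cut-off can be tried by changing the multiplier on `sd`.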
But it appears one side was favoured over the other across the 15 battles?
This is difficult to say. Unfortunately, we are not given all the data on exactly who won, how many times, in each run. We merely got a lot of "RNG neutral situations". Since each Warband vs Warband encounter is independent, we would need to know exactly how many hold, disrupted, and fragmented results each side had for each combat loss to do a real number crunch here. Besides, chopping the 750 combats into 25 battle runs is an entirely arbitrary affair. You can't say whether one side was favoured or not from this alone; it could be that Side A had the lion's share of small advantageous runs but suffered the majority of the unfortunate ones where large numbers of double drops happened. In other words, if we sliced the results differently, into say segments of 5, it could turn out that Side A was the one being favoured. Suffice to say that MVP7's comment that it is exceedingly unlikely for there to be balanced luck is correct.
So what does this mean?
It means that the argument Cunningcairn, 76mm and others are advancing, that getting far more double drops than the numbers predict is evidence that something is broken, is almost certainly unfounded. Extreme events like the ones described by 76mm at the start of this thread remain exceptionally rare. They are either victims of extreme cases of variance (which can happen, otherwise people would never win the lottery, for example) or their playstyle involves frequently subjecting their troops to additional factors that cause morale drops by adding more negative CT die roll modifiers that they may not be aware of.
Also, the sample sizes are not small as some have claimed. While they are not large enough to drive the overall data to the exact theoretical numbers we would get if we ran the tests towards infinity, they are large enough to show that there is no systemic issue in the background which would lead us to believe the number crunching in the back is broken in any way, shape, or form. The win/draw/loss numbers remain very close to my previously observed estimates (within 1% of overall outcomes) and the CT tests are producing results consistent with our understanding of how CT tests work, especially when examining the disrupt and hold firm numbers. If the back end formulas were not as advertised, it is unlikely we would get such close conformity in the frag, disrupt, and hold firm numbers to what we were expecting.
Players should plan, and be expected to plan, for occasional unexpected events to the best of their abilities. Truly outrageous events that torpedo a game beyond any control of a player are, as shown, exceedingly rare. Players may simply not be aware of the risks or potential domino effects they expose themselves to.
Ok fine, but I still think there is too much RNG
I have yet to see a logically coherent argument that RNG is overriding skill on a consistent basis. Several arguments to that effect presented here have already been debunked. Those include:
1) Lopsided game scores - can be explained by the snowball nature of the game
2) Games between evenly skilled players are determined by RNG - true if both players are Skynet and play to the theoretical skill and execution cap. No one is at this cap so one can always improve skill to mitigate RNG.
3) Games within an FoG2DL are between evenly skilled players and RNG affects those games negatively - even if true, low skilled players are unlikely to play anywhere close to the execution cap, meaning they are introducing RNG into their own games by lack of skill. That is a player driven problem, not a game driven problem. The answer is to tell players to improve their play if they don't like RNG screwing them over.
I would like to know how RNG can be reduced even if you think it isn't a problem
Several possibilities exist, but most create problems without actually solving anything. Or worse, they introduce a whole slew of new problems or phenomena which then need to be looked at or combated, or they fundamentally change the way the game plays.
For example, let's say we simply reduce the scores required to pass CT tests by 1. This would mean that troops all around hold their battle lines longer. The logical strategy adjustment would be to reduce reserves to compensate, as troops in the back have less chance to plug a hole or exploit one that forms. This would only mean that battle lines get pushed out longer, and the player who does suffer the catastrophic event has been incentivized to have fewer options to deal with it, effectively re-magnifying that effect into one that occurs less often but is more catastrophic than before. This is a perfect example of a change made to accommodate more skill that has the potential to actually strip skill from the game, in the form of accurately judging how much to spend on which units to form said reserve. It accentuates luck instead by pushing out longer, narrower battle lines where any breakthrough due to luck leaves fewer reserves to deal with it and most likely a longer distance to travel to reach the point of crisis.
Stratford Scramble Tournament
http://www.slitherine.com/forum/viewtopic.php?f=494&t=99766&p=861093#p861093
FoG 2 Post Game Analysis Series on Youtube:
https://www.youtube.com/channel/UCKmEROEwX2fgjoQLlQULhPg/