"And that takes us to the principle I wrote earlier on my todo list:"
#2 - Separate experiments are usually supposed to avert 'conditional-dependencies'; watch out for when that isn't true
"What I mean here is that - when you are otherwise doing things correctly - it should usually be the case that, for the likelihoods that the 'published-experimental-report' is summarizing for different hypotheses, if some replicators came along and did their own experiment, their likelihoods should be something they can calculate independently of your data. It shouldn't be the case that they have to look at your data, to figure out the likelihoods given their hypotheses."
"This in fact is the property that lets us compute the joint likelihood of a hypothesis across two experiments, by multiplying the likelihoods together from the separate experiments. Symbolically:"
P( data from 1st and 2nd experiments ◁ the hypothesis )
= P( data from 1st experiment ◁ the hypothesis ) * P( data from 2nd experiment ◁ the hypothesis )
if and only if
P( data from 2nd experiment ◁ the hypothesis ) = P( data from 2nd experiment ◁ the hypothesis & data from 1st experiment )
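A minimal sketch of that factorization, in Python with made-up YES/NO data: under a fixed-propensity hypothesis, the likelihood of a specific ordered sequence of results is a product over trials, so the two experiments' likelihoods multiply into the joint likelihood.

```python
# A sketch with made-up data: under a fixed-propensity hypothesis, the
# likelihood of a specific ordered sequence of YES/NO results factors
# across trials, so separate experiments' likelihoods simply multiply.

def likelihood(results, p):
    """P(this exact YES/NO sequence ◁ propensity p)."""
    out = 1.0
    for solved in results:
        out *= p if solved else 1 - p
    return out

p = 0.6                              # hypothesis: all-14s solve with 60% propensity
exp1 = [True, True, False, True]     # my experiment's data (made up)
exp2 = [False, True, True]           # a replicator's data (made up)

joint = likelihood(exp1 + exp2, p)
product = likelihood(exp1, p) * likelihood(exp2, p)
assert abs(joint - product) < 1e-15  # identical: the experiments factor
```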
"When you say, 'maybe all-14s have 60% propensity to solve in time, and all-10s have 10% propensity to solve in time', you're describing a way reality can be, where the likelihood of my found pattern of YES and NO responses, if that's true, is just the same no matter what data you found in your own experiment. Maybe your data made that world look incredibly improbable, but that doesn't matter; I can still answer the question of how likely my data would be, if that world was the case, without looking at your data."
"When you say, 'maybe all-14s have at least five times the propensity of all-10s to solve 2-4-6 in 30 minutes', that is a way the world can be; but it's a way the world can be, where calculating the likelihood of my data in that world, requires me to make up a bunch of prior-probabilities, and then those probabilities change depending on the data that you got."
"Which makes it immensely complicated to quickly look over the summaries of what different people's experiments said about different worlds, and combine them together into a joint summary of what reality has told us about them all."
"It would, in fact, be possible to combine a lot of little experiments all of which suggested that - if you wrote the summary this way - the data was more likely if all-10s had 50% propensity to solve, versus less-than-50% propensity to solve, and get out a new update that the combined data was more likely if all-10s had less-than-50% to solve. If you multiplied enough 0.035 likelihoods from the 40%-propensity hypothesis, compared to the 0.031 likelihoods from the 50%-propensity hypothesis, then eventually the 40%-propensity hypothesis would come to dominate the predictions of its bucket, and then its bucket would start to dominate the other hypothesis."
"Which paradoxical-seeming combination, again, doesn't happen if you consider the 40%-propensity hypothesis separately, because then it's clear from the start that 40% propensity is gaining on 50% propensity in each experiment."
"Hence again the proverb: Different effect sizes are different hypotheses. Which argues again against thinking that 'all-14s are at least five times as likely to solve as all-10s' is a good way to split up the world for purposes of SCIENCE! Even though, in terms of 'truth-functional' scaffolds, it is a way the world can be. It could even be the metafact that is useful and that we're interested in. We should still ask the Science! question first, what are the exact real effectsizes, and then check the useful metafact afterwards."