If you worked in a very particular section of the hidden Basement of the World inside dath ilan, you would have other metaphors than that, to provide models for slightly less oversimplified stories of what is happening inside Esta.
One of the fundamental dimensions for understanding Thought, among those who seek to understand it in sufficient detail to create it knowingly, is the axis of Memorization versus Generalization. (This, to be clear, is not yet a secret; this is early-teenager comp-sci in dath ilan.)
Imagine a car-driver who repeatedly faces the task: Drive, within your city, from some varying point X to some varying destination Z.
One way you could build a car-driver like this is to memorize a set of turns and directions for every X and every Z, separately. Any time the car gets a call to drive between some new origin-destination pair, it has to stop and make a call to an oracle that knows the directions. But the car's memory is perfect, and it never has to call the oracle a second time for the same X-Z pair.
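A minimal sketch of that pure-Memorization car, if you like concreteness. The class name `MemorizingCar`, the `toy_oracle` function, and the route format are all illustrative assumptions, not anything the story specifies; the only point is that each origin-destination pair costs exactly one oracle call, ever.

```python
from typing import Callable, Dict, List, Tuple

Route = List[str]  # assumed format: a list of turn instructions


class MemorizingCar:
    """Hypothetical car that stores one route per (origin, destination) pair."""

    def __init__(self, oracle: Callable[[str, str], Route]) -> None:
        self.oracle = oracle
        self.routes: Dict[Tuple[str, str], Route] = {}
        self.oracle_calls = 0

    def drive(self, origin: str, destination: str) -> Route:
        key = (origin, destination)
        if key not in self.routes:
            # New pair: stop and ask the oracle, then remember the answer forever.
            self.oracle_calls += 1
            self.routes[key] = self.oracle(origin, destination)
        return self.routes[key]


def toy_oracle(origin: str, destination: str) -> Route:
    # Stand-in oracle (an assumption): a real one would know the actual turns.
    return [f"directions from {origin} to {destination}"]


car = MemorizingCar(toy_oracle)
car.drive("X", "Z")
car.drive("X", "Z")  # same pair again: answered from memory, no oracle call
car.drive("Z", "X")  # a different pair, even the reverse trip, is a fresh query
print(car.oracle_calls, len(car.routes))
```

Nothing the car learns about X-to-Z helps it with any other pair; the memory grows by one entry per distinct request.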
If the car only ever has to drive a set of 10 fixed origin-destination pairs, memorization isn't a bad algorithm. Anything more than pure Memorization would be overkill, really.
But for a city with ten thousand addresses that each visit a hundred of the other addresses, it would take a full million memorized pathways to drive; more than most human-level human-style minds can be asked to learn.
(This does require some oracle to provide the answers that the car then memorizes. You can imagine a car that works without the oracle; you can imagine, say, that from point X it starts driving and making random turns, noting each point Y1, Y2, Y3 that it reaches by those random turns and memorizing those pathways too, and even that this memorization detects and deletes loops from the memorized paths. Eventually the car will randomly wander into Z, and then the car knows a path, though maybe not a very good path, from X to Z.)
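The oracle-free car in that parenthetical could be sketched roughly as follows. This is one way of doing "random turns with loops deleted", assuming the city is given as a hypothetical `roads` adjacency table; the function name, the step limit, and the toy four-point city are made up for illustration.

```python
import random
from typing import Dict, List, Optional


def random_walk_path(
    roads: Dict[str, List[str]],
    origin: str,
    destination: str,
    rng: random.Random,
    max_steps: int = 100_000,
) -> Optional[List[str]]:
    """Wander randomly from origin; return a loop-free path once destination is hit."""
    path = [origin]
    position = {origin: 0}  # index of each point on the current loop-free path
    here = origin
    for _ in range(max_steps):
        if here == destination:
            return path
        here = rng.choice(roads[here])  # take a random turn
        if here in position:
            # We came back to a point already on the path: delete the loop.
            cut = position[here]
            for dropped in path[cut + 1:]:
                del position[dropped]
            del path[cut + 1:]
        else:
            position[here] = len(path)
            path.append(here)
    return path if here == destination else None


# Toy city (an assumption): four points, two-way roads.
roads = {
    "X": ["A", "B"],
    "A": ["X", "B", "Z"],
    "B": ["X", "A"],
    "Z": ["A"],
}
path = random_walk_path(roads, "X", "Z", random.Random(0))
print(path)

# Every prefix of the result is also a memorizable path from X to some Y_i.
memory = {("X", path[i]): path[: i + 1] for i in range(1, len(path))}
```

The path that comes out is valid but not necessarily short, which is the "maybe not a very good path" caveat above.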
Now consider a different approach, all the way over at the other end of the spectrum, at Generalization.
As the car moves around, it builds a map of the city, a map whose representation and numbers correspond by locally simple rules to the actual road-distances and intersection-angles of the city.
When the car is asked to drive from X to Z, it runs some search algorithm (say, combined forward-chaining and backward-chaining until the two expanding frontiers between origin and destination meet, if you like concreteness; or A* search if you like concreteness and also not-totally-inefficient search algorithms) over its map, to see if it can plan a path from X to Z.
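For those who like concreteness, here is a small sketch of the A* option. It assumes the learned map is stored as a hypothetical `coords` table (a position for each intersection) plus a `roads` table of road lengths, and it uses straight-line distance as the heuristic, which is safe only so long as no road is shorter than the straight line between its endpoints. All names and the toy map are illustrative.

```python
import heapq
import math
from typing import Dict, List, Optional, Tuple


def a_star(
    coords: Dict[str, Tuple[float, float]],
    roads: Dict[str, Dict[str, float]],
    origin: str,
    destination: str,
) -> Optional[Tuple[List[str], float]]:
    """Plan a shortest route over the map; returns (path, length) or None."""

    def straight_line(a: str, b: str) -> float:
        (ax, ay), (bx, by) = coords[a], coords[b]
        return math.hypot(ax - bx, ay - by)

    # Heap entries: (estimated total length, length so far, point, path so far).
    frontier = [(straight_line(origin, destination), 0.0, origin, [origin])]
    best_cost = {origin: 0.0}
    while frontier:
        _, cost, here, path = heapq.heappop(frontier)
        if here == destination:
            return path, cost
        if cost > best_cost.get(here, math.inf):
            continue  # stale entry; a cheaper way to reach `here` was found later
        for neighbor, length in roads.get(here, {}).items():
            new_cost = cost + length
            if new_cost < best_cost.get(neighbor, math.inf):
                best_cost[neighbor] = new_cost
                estimate = new_cost + straight_line(neighbor, destination)
                heapq.heappush(frontier, (estimate, new_cost, neighbor, path + [neighbor]))
    return None  # the map contains no route


# Toy learned map (an assumption): a unit square where the B-Z road is long and winding.
coords = {"X": (0, 0), "A": (1, 0), "B": (0, 1), "Z": (1, 1)}
roads = {
    "X": {"A": 1.0, "B": 1.0},
    "A": {"X": 1.0, "Z": 1.0},
    "B": {"X": 1.0, "Z": 3.0},
    "Z": {"A": 1.0, "B": 3.0},
}
print(a_star(coords, roads, "X", "Z"))  # (['X', 'A', 'Z'], 2.0)
```

The planner was never told the X-to-Z route; it falls out of the map plus the search.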
This car is vastly more sample-efficient and also needs a less powerful oracle. The car doesn't need to be told every possible path, or every probable path, in order to learn each exact sequence of turns between each X-Z pair. It only needs to visit a bunch of points in the city, once, in order to build enough of a map that it can navigate the probable requests for X-Z pathways.
If the city changes, if a bridge collapses, the Car That Plans Using A Learned Map of Locally Correspondent Truths only needs to visit and see the blockage one time, and then updates its map once, and then all the planner's plans come out different. The Car That Memorizes Turn-By-Turn Sequences Between Origin-Destination Pairs has to requery its oracle for all the source-destination pairs that routed through the collapsed bridge, and for that matter, won't even realize which of its pathways are broken until it comes to the collapsed bridge yet again.
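The collapsed-bridge contrast can be sketched in a few lines. A plain breadth-first search stands in for the planner here to keep the example short; the six-point city, the X-Y bridge, and the helper names `plan`, `collapse`, and `uses_road` are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def plan(roads: Dict[str, Set[str]], origin: str, destination: str) -> Optional[List[str]]:
    """Breadth-first search over the current map; fewest-intersections route or None."""
    came_from: Dict[str, Optional[str]] = {origin: None}
    queue = deque([origin])
    while queue:
        here = queue.popleft()
        if here == destination:
            path = []
            while here is not None:
                path.append(here)
                here = came_from[here]
            return path[::-1]
        for neighbor in sorted(roads[here]):
            if neighbor not in came_from:
                came_from[neighbor] = here
                queue.append(neighbor)
    return None


def collapse(roads: Dict[str, Set[str]], a: str, b: str) -> None:
    """One map update: the road between a and b no longer exists."""
    roads[a].discard(b)
    roads[b].discard(a)


def uses_road(path: List[str], a: str, b: str) -> bool:
    return any({p, q} == {a, b} for p, q in zip(path, path[1:]))


# Toy city (an assumption): X-Y is the bridge, X-N-M-Y is the long way around.
roads = {
    "W": {"X"}, "X": {"W", "Y", "N"}, "N": {"X", "M"},
    "M": {"N", "Y"}, "Y": {"X", "M", "Z"}, "Z": {"Y"},
}
requests = [("W", "Z"), ("X", "Y"), ("W", "N")]

# The memorizing car's table, filled in before the collapse.
memorized = {pair: plan(roads, *pair) for pair in requests}

collapse(roads, "X", "Y")  # the planning car sees the blockage once

replanned = {pair: plan(roads, *pair) for pair in requests}
stale = [pair for pair, path in memorized.items() if uses_road(path, "X", "Y")]
print(replanned[("W", "Z")])  # ['W', 'X', 'N', 'M', 'Y', 'Z']
print(stale)                  # [('W', 'Z'), ('X', 'Y')]
```

The planning car made one change to one data structure. The memorizing car has two broken entries in its table, and nothing inside the table marks them as broken; the `stale` list here is computed from outside, with knowledge the memorizing car does not have.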
Which is to say: On the spectrum from inefficient memorization to efficient generalization, the key to moving in the direction of more sample-efficient generalization is means-end planning.
The way in which this all relates back to Esta is that Esta's verbal thought patterns are more like something memorized -- not literally memorized as word-sequences, but closer to the memorization end of the spectrum -- because they are less like means-end planning. Contrast Esta's desires, what he finds painful or pleasurable, what good-feeling or bad-feeling events his mind anticipates happening to him in a choice-dependent way; the influences this part of himself produces on him are more toward the end of the spectrum that is about means-end reasoning using a world model.
It is why the Church of Asmodeus has trouble balancing two simultaneous desiderata: (1) forcing people to conform by thinking particular thoughts in words that the Church wants them to think, and (2) having those people be smart, quick-learning, fast-adapting to changed situations; having the verbal thoughts hammered into that shape smartly and flexibly navigate to destinations that serve the Church.