Reinforcement Learning and the Ergodic Principle across the Multiverses
How do we improve a chatbot today? Instead of building a model of the mind, we just use the universe to test the answers: the reaction of one person, a thousand, a million. That is how a reinforcement learning chatbot learns today. Now, let’s apply the ergodic principle, which trades an average over time for an average over an ensemble. Suppose that instead of learning over time, you learn over zillions of simultaneous trials. Now suppose, on top of that, that you can try an answer not on a million people, not on a googolplex of people, but on an infinity of individuals: all the individuals that compose the multiverse. Suppose that the chatbot doesn’t even CARE about the “meaning” of the conversation, but only about making the person smile, be happy, and think the chatbot is intelligent, and not only right after its answer, but across all the multiverses that emerge from that interaction (perhaps also towards the past, but I digress). The point is: meaning may emerge from the blind measurement of a single quantity, the purpose of the conversatio