Reinforcement Learning and the Ergodic Principle across the Multiverses

How do we improve a chatbot today? Instead of creating a model of the mind, we just use the universe to test the answers: the reaction of one person, a thousand, a million... That’s how a reinforcement learning chatbot would learn today.

Now, let’s apply the ergodic principle: averaging over time is equivalent to averaging over many copies at a single instant. Suppose that instead of learning over time, you learn over zillions of simultaneous trials. Now suppose, on top of that, that you can try an answer not on a million people, not on a googolplex of people, but on an infinity of individuals: all the individuals that compose the multiverse.
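To make the time-versus-ensemble idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function names, the probability `p_smile`, and above all the idea that a listener's reaction is an independent coin flip, which is the simplest case in which the two averages agree.

```python
import random


def smile(rng, p_smile):
    """Hypothetical reaction: 1 if the listener smiles at the answer, else 0."""
    return 1 if rng.random() < p_smile else 0


def time_average(p_smile, steps, seed=0):
    """One listener, the same answer tried again and again over time."""
    rng = random.Random(seed)
    return sum(smile(rng, p_smile) for _ in range(steps)) / steps


def ensemble_average(p_smile, listeners, seed=1):
    """Many listeners, the answer tried once on each, all 'at the same time'."""
    rng = random.Random(seed)
    # Each draw stands for a different, independent listener.
    return sum(smile(rng, p_smile) for _ in range(listeners)) / listeners


if __name__ == "__main__":
    p = 0.7  # assumed chance that this particular answer produces a smile
    print("over time:     ", time_average(p, 20000))
    print("over listeners:", ensemble_average(p, 20000))
```

Both numbers should land close to the assumed 0.7. With real people the reactions are neither independent nor stationary, so the equivalence is an idealization, not a guarantee.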

Suppose that the chatbot doesn’t even CARE about the “meaning” of the conversation, but only about making the person smile, be happy, and think the chatbot is intelligent - and not only right after its answer, but in all the multiverses that emerge from that interaction (perhaps also towards the past... but I digress).

The point is: meaning may emerge from the blind measurement of a single quantity. The purpose of the conversation can be to create a persistent smile, a cascade of dopamine in the maximum number of people for the maximum length of time (yes, multiverse-utilitarian morality perhaps, but this is not the point. The point is: it will look smart, perhaps like the smartest chatbot). This is not a proposal for constructing such a machine, but rather a thought experiment that makes me think that there is an equivalence between infinite introspection and infinite “extrospection”. There may be no mind “inside”, only outside - the mind inside is a reflection of the FACTS outside. Yes, you guessed right: it’s very late.


PS: I just saw this on Facebook:
