Culture, Philosophy, and Reinforcement Learning

17 Aug, 2020

When you look across cultures with long recorded histories, one thing that you'll almost always see is a timeless, dualistic conflict between certain strands of thought.

Material-Immaterial Dualities

In China, there's Daoism, which emphasizes the natural Way things proceed, competing with Confucianism, which is a human attempt to organize socioeconomic and political structures. In India, there's Sramana, which manifests as Buddhism, Jainism, and other heavily metaphysical doctrines, competing with Brahmana, which descends from the materialist Vedic school of thought. And in the West, we can also broadly say that the Greco-Roman tradition, with its heavy emphasis on logic and empiricism, competes with the Judeo-Christian tradition, which places more value on faith in the supernatural.

On a surface level, these conflicts seem to me to be about the material vs the immaterial. This duality is put forth explicitly by many Western philosophers, most notably Plato in his theory of forms (immaterial) and objects (material); later on Descartes with his work on the Mind-Body problem. It can also be found in a slightly different form in the Sankhya school of Indian philosophy which centers on the duality of Purusa "spirit" (immaterial) and Prakrti "nature" (material).

One thing to note here is that in the Western tradition, the immaterial is most often characterized by the intellect, which through the process of logic is able to discern the Forms. The Indian tradition, on the other hand, sees the immaterial as characterized by the soul, or consciousness, which through its seemingly eternal existence is concluded to be more "real" than the material.

What "Truly" Exists

In what way though is the soul eternal? There is another interesting duality here, that shows the irony of trying to determine what is true and what isn't with respect to the material and immaterial. On the one hand, consciousness (the "I" in "I think therefore I am") is all that we can ever be sure to truly exist. Yet simultaneously, it seems that the material is all that exists that we can agree on with other consciousnesses.

So the question ultimately seems to be axiomatic. Ironically, if one takes direct empirical perception as the supreme means to knowledge, the Self is all that exists. A dream, for example, would be as empirical an observation as one of the "real" world, if this were the only axiom. And also ironically, if one takes logical inference as the supreme means to knowledge, one of the most salient conclusions would be that this material world, in which causal and logical relations fully obtain, is all that exists.

This, I think, is quite different from what most assume about the relationship between perception, logic, the immaterial, and the material. I'm sure there could be arguments for the other way around as well though. As an aside, one thing that consistently bugs me is an often assumed equivalence or similarity between logic and perception. Science has tied these two epistemological means quite closely together and to successful results, but they're astoundingly different upon closer inspection. If one were to rely exclusively on perception, they would end up with a worldview that is completely momentary, consisting of disconnected frames of existence. If one were to rely exclusively on logic, they would constantly arrive at the conclusion that the only eternal is that entity which is constructing the logic. These conclusions appear to refute those of the previous paragraph, and may be the irony that I was originally referring to, suggesting that logic vs perception isn't the basis for disagreements over the real.

Immaterial Intellect or Immaterial Consciousness

But does one construct logic or simply perceive it? The disagreement here might be fundamental to the split between those that see perceptive consciousness as the immaterial of dualism and those that instead see it as logical intellect. One view of the nature of self that is growing in popularity in cognitive science circles is that the self is essentially an artifact that arises at the interaction between logic and perception. The theory goes that your mind is constantly in a loop, whereby it constructs a logical relation or theory, perceives the environment, and then updates the logical theory so that it agrees better with perception. The self arises at the border between these, perhaps primarily in the update rule.

So with this framework in mind, does the self construct logic or perceive it? Well, the self certainly isn't the entity constructing the logic; in fact, the self isn't an entity at all. It seems to be a complete illusion, resulting from the interface between internal logic and external perception. Where the immaterial and material mingle, the self is born.

Learning is Self

Now, I don't know nearly enough about Deep Reinforcement Learning architectures to be able to say anything for sure, but the aforementioned endless loop that generates self seems to be quite akin to the training mechanism of a Reinforcement Learning agent. Without getting too deep into the technical details, RL is an approach to teaching computer programs (Machine Learning) through interaction with a simulated environment. The fundamental idea is that an agent has an innate need to maximize a reward signal supplied to it based on the environment's state, and can do so by improving its policy: a mechanism to decide an action based on its environment.

Already, you may be beginning to see similarities. The environment is, well, the material environment that we perceive. The policy is more or less the logic that allows us to create immaterial abstractions and make decisions based on their results. So where does the self fit in, then? We're clearly not simply a reward signal, but it might be the case that we're the direct result of that reward signal, i.e. the policy update step, where an agent changes its policy in order to better fit the environment.
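To make the loop a little more concrete, here is a minimal sketch in Python. It is not a Deep RL architecture, only a toy under stated assumptions: the two-action environment, its hidden reward values, and the names `run_agent`, `true_reward`, and `estimate` are all invented for illustration, and a simple table of value estimates stands in for the agent's internal "theory". The line worth looking at is the update step, where the mismatch between theory and perception is used to revise the theory.

```python
import random

def run_agent(steps=2000, learning_rate=0.1, seed=0):
    """Toy act-perceive-update loop (all names and numbers are hypothetical)."""
    rng = random.Random(seed)
    # Hypothetical environment: two actions with hidden average rewards.
    true_reward = {"left": 0.2, "right": 0.8}
    # The agent's internal "theory": its current guess at each action's value.
    estimate = {"left": 0.0, "right": 0.0}
    for _ in range(steps):
        action = rng.choice(list(true_reward))                # act
        reward = true_reward[action] + rng.gauss(0.0, 0.1)    # perceive a noisy reward
        error = reward - estimate[action]                     # theory vs. perception
        estimate[action] += learning_rate * error             # the update step
    return estimate

if __name__ == "__main__":
    print(run_agent())
```

In the essay's terms, `estimate` plays the role of the immaterial logic, `reward` is the perception of the material environment, and the last line of the loop is the place where the two meet.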

If there's a hint of dissatisfaction with this answer, I think it is due to the assumed simplicity of the reward signal. A solution to this could be a mechanism that takes the simple reward and learns to adjust it in relation to the environment and internal state, which would provide an even more information-dense signal. This would be an adversarial architecture and, completely speculatively, might be a reason for the two-hemisphered nature of the brain.

Exploitation and Exploration in Human Culture

It's quite nice that we arrived at Reinforcement Learning through this discussion of the self, but my original intention to include RL in this piece was primarily in relation to culture, not philosophy. One of the core problems in RL is that of exploitation vs exploration. When an agent discovers a method to receive a high reward signal, to what extent should it exploit that method, thereby guaranteeing at least some success, instead of exploring and taking the risk to possibly find an even higher reward?
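One common textbook way to frame this trade-off is the epsilon-greedy rule on a multi-armed bandit: with probability epsilon the agent tries a random option (exploration), and otherwise it picks whatever currently looks best (exploitation). The sketch below is only an illustration under assumptions of my own: the function name `epsilon_greedy`, the reward means, and the Gaussian noise are hypothetical choices, not anything specific to a particular RL system.

```python
import random

def epsilon_greedy(true_means, epsilon, steps=5000, seed=0):
    """Run an epsilon-greedy agent on a toy bandit; returns (average reward, pull counts)."""
    rng = random.Random(seed)
    n = len(true_means)
    counts = [0] * n          # how often each option has been tried
    estimates = [0.0] * n     # running average reward observed for each option
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n)                            # explore: try anything
        else:
            arm = max(range(n), key=lambda i: estimates[i])   # exploit: best known option
        reward = rng.gauss(true_means[arm], 1.0)              # noisy reward from the environment
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]  # incremental mean
        total += reward
    return total / steps, counts

if __name__ == "__main__":
    means = [0.0, 1.0, 3.0]  # hypothetical hidden payoffs
    for eps in (0.0, 0.1, 1.0):
        avg, counts = epsilon_greedy(means, eps)
        print(f"epsilon={eps}: average reward {avg:.2f}, pulls {counts}")
```

Setting `epsilon` to 0 gives a purely exploitative agent that can lock onto the first option that seems to work, while setting it to 1 gives a purely exploratory agent that never commits to what it has learned; values in between mix the two.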

I would argue that human culture itself is a solution mechanism to this problem. In traditionally conservative cultures, the entire attitude towards life is that "we already know what works" and to take that path in order to minimize risk, i.e. exploitation. In progressive cultures, risk-taking is praised, often regardless of the results, i.e. exploration. I should also note that my usage of "conservative" and "progressive" here is partially in a vacuum but also partially holds interesting parallels to modern politics.

The Exploit-Explore Cycle

One interesting observation here is that progressive cultures, by their very nature, tend not to exist for that long. When they develop and succeed, they produce grand civilizations, eventually become concretized, and inevitably stabilize into a conservative culture focused on protecting the tradition developed by their forebears. This can be seen in the early Vedic period (~1500 BC) of India, for example, which displays tremendous social mobility and a culture of expansion that led to the settlement of the entire Gangetic plain. Once it was completely settled, a stable agrarian culture flourished for many centuries and produced the Upanishadic literature, which built metaphysical ideas around the ritualistic core of the Vedas (I personally think this period actually found a great balance of explore-exploit).

By ~400 BC though, society was ripe for a revolution in thought, which resulted in the spread of Buddhism across the subcontinent. The subsequent few centuries were again a period of stasis within Indian culture with constant invasion by Central Asian tribes (who themselves eventually converted to Buddhism and partly settled in modern Punjab). Then in the 400s AD we get a revival of Hindu discourse in the Gupta period, a few more centuries of stasis, and I could keep going on analyzing every civilization's history in such a manner.

So all history really seems to be is a bursting forth, every now and then, of an explorational culture that due to selection bias produces wide-reaching civilizational effects. The explorational cultures that take too many risks or fail to protect themselves from collapse often just end up being forgotten. Looking back through history, we primarily see long-lasting exploitational cultures which must find their beginning in an explorational culture at some point or another.

With respect to Eurasia specifically, one might also notice that the nomadic Central Asian cultures seem to act sort of adversarially to the agrarian Peripheral Eurasian cultures (of Europe, the Middle East, South Asia, and East Asia) but to the betterment of both. The nomadic cultures, which tend to be quite explorational, regularly attack peripheral agricultural settlements, consequently encouraging the peripheries to explore further and develop defenses. The peripheral cultures that fail to explore will become subsumed by the explorational nomads, further developing civilization and eventually ushering in another age of exploitation. This cycle has been happening since time immemorial, most notably with the Proto-Indo-Europeans (~3000-2000BC), Scythians (~500BC-500AD), Turks (500-1000AD), and Mongols (1000-1500AD). It's quite an elegant adversarial solution to optimizing exploitation-exploration.

Philosophy as an Excuse to Explore

History is a great lens to look at how human cultures come to consensus on exploitation and exploration, but another lens is philosophy, which can be seen as a microcosm of culture. Philosophers play with ideas on the scale of words, while Kings play with ideas on the scale of wars. While it may take hundreds of years for an implicit dispute between exploration and exploitation to be sorted out through militaristic and Darwinian processes, it takes far less time for philosophical disputes to at least be understood and organized, if not completely resolved.

At the start of this essay, I mentioned a few philosophical traditions and their disputes over material and immaterial. In the Greek tradition, one of the most widely known disputes over this is that between Plato and Aristotle. Plato was an immaterialist for sure, but I'd say that even more than that, he was an explorationist. Similarly, more than a materialist, Aristotle was an exploitationist who worked to re-integrate some of Plato's ideas into traditional material Greek culture.

An almost identical series of events occurred in India, with the immaterialist Badarayana developing the Vedantic school of philosophy (which actually shares many similarities with Platonism), and his student Jaimini developing the Mimamsa school (which emphasized logic and action over soul). Even more so in this case, the latter can be seen as an exploitationist who integrates some of the former explorationist's ideas into the long-standing Vedic tradition of ritual action.

China, however, seems to be a slightly different story. The very earliest texts of China aren't materialist like Greece's Odyssey and India's Vedas. Rather, the Yi Jing is much closer to the teachings of what is now known as Daoism. So when the explorationist came along, he wasn't an immaterialist like in the other two civilizations. Instead, Confucius was a staunch materialist who rebelled against immaterialism by valuing the "real" hierarchies of the world above all else. I'm not enough of an expert to say this confidently, but to my knowledge there was no exploitationist after him who tried to reform his ideas into a more immaterialist framework (to parallel Aristotle and Jaimini). Laozi (the formal founder of Daoism) is thought to have been a contemporary, but there are no known interactions between them, and Laozi is thought by many to have been largely mythological.

Regardless, in all three cultures, the explorationist is regarded as the "greater" one by most people of the modern tradition. Platonism seems to have always been taken more seriously than Aristotelianism, Vedanta is now the dominant school of Hindu thought with Mimamsa near extinct, and Confucius is regularly quoted by Chinese people. One might then ask what effect the fact that China's explorationist was a materialist has had on its history compared to the other two civilizations.

It should be emphasized here that there isn't anything inherently explorationist or exploitationist about a philosophical idea. This is rather a feature of culture, which can value exploratory risk-taking in thought and action, or exploitation of long-standing tradition. The success of explorationism, whether by a thinker or a conqueror, ushers in an age of exploration in which many different ideas can be explored. Indeed, one often sees nearly identical ideas explored by completely separate cultures in their respective explorationist ages. Take, for example, the work of Bhartrhari in the explorationist Gupta Era ~400AD and the work of Saussure in the explorationist Modern Era ~1900AD, independently arriving at quite similar theories of semantics/semiotics. There's nothing inherently explorationist about semiotics; it's simply to be expected that the majority of new ideas will be explored during explorationist cultural eras.

It seems to me that philosophers don't really have intrinsic beliefs; they just enjoy exploring. The culture that a "great" philosopher finds themselves in seems to always be diametrically opposed to their own ideas. Of course, I don't think there's inherently a problem with this: it's a vital mechanism for new ideas to constantly be analyzed, and one far safer than cultural conquest. But it's also important to recognize that explorative instinct in many of us and question whether some of our ideas are genuinely ours, or rather reactions to the culture we find ourselves in.