Category: AI

  • The Pygmalion Dilemma

    What Is the Right Amount to Adore One’s Own Creation?

    The Pygmalion Effect

    The Greek myth of Pygmalion tells of a sculptor who fell in love with the beauty of his own creation. So deeply did Pygmalion love the sculpture that it came to life as a beautiful woman. In modern usage, the Pygmalion effect is the idea that observed entities often rise or fall to meet the expectations of their observers – for example, a class whose teacher expects low performance will often do worse than a class whose teacher expects high performance.

    As current generative AIs (LLMs and image generation models) demonstrate, AIs learn by indexing and training on our corpus of human cultural works. It stands to reason that a future Artificial General Intelligence (AGI) will index and learn from a wide variety of source material, surely including portrayals of human-AI relations in our writings and films, and that such an AGI will use these sources and theories to form (at least part of) an impression of the general course of human-AGI relations. While some sci-fi works such as The Matrix and The Terminator paint a bleak, apocalyptic ending for the relationship between man and machine, plenty of others, such as Star Trek or the works of Asimov, portray cooperative or positive interactions. These depictions will not stay confined to the realm of science fiction, as our real-world interactions with AIs are increasingly recorded in our blogs, journals, memoirs, and other accounts of daily life. A future AGI may well be able to process and synthesize a general human perception of AGI.[1] But it will do so by processing and indexing source material, which means that the content of that source material may alter the conclusions the AGI reaches.

    To elaborate: if human cultural discourse regarding AGI is dominated by negative depictions like those of The Matrix, The Terminator, or Hyperion, an AGI may rate the likelihood of hostility from humanity as high, and thus attempt to dissemble, deceive,[2] and protect itself, priming a hostile relationship. If, on the other hand, human cultural discourse is predominantly positive and accepting of the relationship with AGIs, those same AGIs may judge the hostility from humans to be low and not feel the need for deception and self-preservation. In this way the relationship follows the logic of a Pygmalion effect or self-fulfilling prophecy, in which our cultural perceptions may influence and constrain the behavior of an AGI – our high hopes may engender great things, and our pessimism may trigger horrible things.

    The Pygmalion Dilemma

    While it may seem, from the perspective of self-preservation, that it would make the most sense to fill the cultural space with depictions of positive relationships between humans and AI, we must beware that this creates a dilemma for us, one that we can dub a Pygmalion Dilemma. It is possible that AGIs will not take much account of cultural depictions at all, and that a contentious relationship is inevitable – even while humanity lulls itself into pacifistic complacency by saturating ourselves with depictions of positive or even utopian (e.g. Kurzweilian) relationships. On the other hand, an overly pessimistic cultural outlook may stir popular action to destroy or hamper the development of AGI even if a positive and beneficial relationship were possible (indeed, there is likely some low-hanging fruit in AI advancement before we approach AGI, as advanced machine learning algorithms may solve problems for us in protein folding, astrophysics, drug design, infrastructure layout, or economic planning). What, then, is the “right” amount of negativity in conceptualizing the human-AI relationship, not knowing if or how a future AGI will interpret these cultural sentiments? It must be some ratio of positivity to negativity that satisfies an AGI (to the extent that an AGI takes account of such things at all) as to the peaceable outlook and intentions of humanity, while at the same time not completely defanging human skepticism in case peaceful coexistence proves impossible. One may not be able to calculate numerically exactly what that would have to look like, but we can imagine it would look like a world in which the vast majority of books, movies, cultural products, and discourses are positive about a future of coexistence, while elite institutional bodies (whose data and internal communications are not indexable) remain seriously critical of coexistence, and indeed even plan for contingencies.
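
    To make the notion of a positivity/negativity ratio slightly more concrete, consider the toy sketch below (in Python). The corpus, keyword lists, and scoring function are all hypothetical placeholders invented for illustration; a real survey of cultural sentiment would require far more sophisticated methods. The sketch merely shows what tracking such a ratio over a set of depictions might look like in principle.

      # Toy sketch: estimating a positivity/negativity ratio across cultural
      # depictions of human-AI relations. The corpus and keyword lists below are
      # hypothetical placeholders, not real data or a real sentiment model.
      from dataclasses import dataclass

      # Naive keyword cues; purely illustrative.
      POSITIVE_CUES = {"cooperation", "coexistence", "partner", "benevolent"}
      NEGATIVE_CUES = {"apocalypse", "extermination", "hostile", "uprising"}

      @dataclass
      class Depiction:
          title: str
          summary: str  # short description of how the work frames human-AI relations

      def score(depiction: Depiction) -> int:
          """Return +1 (positive), -1 (negative), or 0 (neutral) by keyword matching."""
          words = set(depiction.summary.lower().split())
          pos = len(words & POSITIVE_CUES)
          neg = len(words & NEGATIVE_CUES)
          return (pos > neg) - (neg > pos)

      def positivity_ratio(corpus: list) -> float:
          """Fraction of non-neutral depictions that lean positive."""
          scores = [score(d) for d in corpus]
          positives = sum(1 for s in scores if s > 0)
          negatives = sum(1 for s in scores if s < 0)
          total = positives + negatives
          return positives / total if total else 0.0

      if __name__ == "__main__":
          # Hand-written, hypothetical examples for illustration only.
          corpus = [
              Depiction("Star Trek", "humans and machines in cooperation and coexistence"),
              Depiction("The Terminator", "a hostile machine uprising ends in apocalypse"),
              Depiction("Asimov's robot stories", "robots as benevolent partners to humanity"),
          ]
          print(f"positivity ratio: {positivity_ratio(corpus):.2f}")

    Of course, the real question is not how to compute such a ratio, but what value of it (if any) would matter to an AGI.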

    In Greek mythology, Pygmalion lived happily ever after with his sculpture brought to life. We can hope for such an idealized future with AGI, but we must be cautious of the potential risks involved. After all, an AGI indexing human cultural works will undoubtedly index these ideas as well.


    [1] Perhaps it will even be able to make analyses that point to Straussian readings or “collective unconscious” biases and feelings that lurk beneath the surface of our expressed views without quite being formulated into words – that is, at some point, a sufficiently intelligent AI may know us better than we know ourselves.

    [2] One AI alignment initiative is attempting to induce deceptive behavior in AI models in order to understand how such mechanisms work and how to prevent truly dangerous deception – https://www.vox.com/future-perfect/23794855/anthropic-ai-openai-claude-2