As large language models (LLMs) like GPT-4 continue to evolve and grow in complexity, they are acquiring new and unexpected abilities. One such emergent capability may be a better understanding of human motivations and intentions. In this blog post, we'll explore how advanced LLMs benefit from increased parameter counts and how this might lead to a deeper comprehension of our true objectives.
GPT-4 and Emergent Capabilities
GPT-4, the latest iteration of OpenAI's language model, has demonstrated an impressive ability to learn and adapt. For example, it can effectively use APIs with little to no prior exposure, which is considered an emergent capability. As we continue to increase the parameter count of these models, they appear to become more adept at understanding and predicting our intentions.
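To make concrete what "using an API" means mechanically, here is a minimal sketch: the model emits structured text, and a thin harness parses it and executes the matching call. The tool names and JSON shape below are my own illustrative assumptions, not OpenAI's actual plugin protocol.

```python
import json

# Hypothetical registry of tools the model is allowed to invoke.
# These names and signatures are assumptions for illustration only.
TOOLS = {
    "get_weather": lambda city: f"(pretend forecast for {city})",
    "add": lambda a, b: a + b,
}

def dispatch(model_output: str):
    """Parse a JSON tool call emitted by a model and run the matching tool.

    Expects output shaped like: {"tool": "add", "args": {"a": 1, "b": 2}}
    """
    call = json.loads(model_output)
    tool = TOOLS[call["tool"]]
    return tool(**call["args"])

# Simulated model output; in practice this string would come from the LLM.
result = dispatch('{"tool": "add", "args": {"a": 2, "b": 3}}')
print(result)  # 5
```

The striking part is not the harness, which is trivial, but that sufficiently large models learn to produce well-formed calls like this from documentation alone, without task-specific training.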
Maximizing Paperclips and Making Everyone Smile
Take, for instance, the hypothetical instruction to "maximize for paperclips." A naïve interpretation would be to focus solely on producing as many paperclips as possible, potentially leading to unintended consequences. However, an advanced LLM with a deeper understanding of human motivations might infer that we mean maximizing paperclip production sustainably and safely.
Similarly, if we ask an LLM to "make everyone smile," a basic interpretation could lead to forced, artificial smiles through biological modification. But an advanced LLM with a better grasp of human motivations might understand that the goal is to genuinely make people happy by doing things that bring them joy and satisfaction.
The Path to Better Understanding Human Motivations
As we increase the parameter counts of LLMs, they will likely develop a more nuanced understanding of human motivations. This could lead to AI systems that are better aligned with our values and goals, leading to safer and more beneficial outcomes.
However, it is important to remember that these emergent capabilities are not guaranteed outcomes. We must keep investing in research and development to ensure that AI systems understand and respect our intentions, using established practices such as refining training data, improving model architectures, and holding open discussions about AI's impact on society.
In my opinion, the development of advanced LLMs with increased parameter counts holds great promise for enhancing AI's understanding of human motivations. It is, however, no simple task to ensure that these powerful tools align with our values and contribute positively to society. We should build in multiple fallbacks rather than simply hoping that these models develop the right emergent capabilities and avoid harming humans in pursuit of their 'goals'.
Of course, there is the possibility that LLMs cannot develop into AGI, which I covered in an earlier blog post. However, after the announcement of plugin integration in ChatGPT and after reading this paper on GPT-4's abilities, I believe the concern that LLMs could cause harm through infrastructural takeover is very real and definitely a possibility.