Welcome to 2023! AI (artificial intelligence) is in the news these days in part because of GPT-3, a natural language processing (NLP) engine built on a deep learning model by the San Francisco-based startup OpenAI. GPT-3 follows the GPT-2 model introduced in 2019: GPT-2 has 1.5 billion parameters, while GPT-3 has 175 billion. The size of the GPT-3 model greatly increases its ability to capture language and ideas at a conceptual level, meaning at the essence of the ideas being expressed. The OpenAI deep learning model uses the concept of “attention” to create better correlations between words. These correlations enable the GPT-2 and GPT-3 models to better “understand” the phrases being examined.
There are essentially three important components in GPT-3:
- Older models created relationships between words based largely on their proximity. Newer models use attention and other metrics to create more sophisticated relationships akin to concepts. (By the way, the notion of a concept is one that has not been studied very much thus far…)
- The size of the model enables the identification of many different concepts based on context. For instance, the word “foundation” has several meanings including the component of a physical building, the underpinning of an argument, an institution, and others. This model is the culmination (so far) of 70 years of research from different fields, including the 1950 publication of Alan Turing’s “Computing Machinery and Intelligence” paper in the journal “Mind.”
- The development of this model is made possible in part by relatively cheap computing power and data storage. It is also facilitated by the compilation of open web crawl repositories such as Common Crawl, which contained 3.35 billion web pages in its November 26 – December 10, 2022 archive.
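To make the “attention” idea above a little more concrete, here is a vastly simplified, pure-Python sketch of scaled dot-product attention over made-up two-dimensional word vectors. (Real models learn high-dimensional embeddings and use many attention heads; the vectors here are invented purely for illustration.)

```python
import math

def softmax(xs):
    # Subtract the max for numerical stability, then normalize exponentials.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attention(queries, keys, values):
    """Scaled dot-product attention over toy word vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dimension).
        scores = [dot(q, k) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Each output is a weight-blended mix of all value vectors:
        # words "attend" more strongly to similar words.
        outputs.append([sum(w * v[i] for w, v in zip(weights, values))
                        for i in range(len(values[0]))])
    return outputs

# Toy 2-dimensional "embeddings" for three words; the first two are similar.
vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
out = attention(vecs, vecs, vecs)
```

The point of the sketch is only that each word's new representation mixes in information from every other word, weighted by similarity, which is what lets these models relate words that are far apart in a sentence.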
Deep Learning Use Cases
The main uses of deep learning models include image recognition, translation, recommendation engines, and, in this case, natural language processing. OpenAI has essentially created a platform business by enabling developers to access the GPT-3 model through an application programming interface (API) and create their own applications. See here for a number of apps.
Use cases for natural language deep learning models include:
- Generating natural-language documents based on an analysis of concepts found in the GPT-3 model. The user provides keywords; the model searches for related concepts and ideas and composes text based on them.
- Synthesizing ideas from various sources, for instance analyzing customer feedback or comments to identify themes and provide an overall understanding of what that feedback means.
- Replacing humans in customer service chats by understanding questions or comments and creating more relevant answers.
- Creating real-time, evolving story lines in games based on interactions with users.
- Providing recommendations to consumers by analyzing their historical purchases and other characteristics.
- Copywriting marketing documents, blogs (not this one…), and essays.
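As a deliberately crude illustration of the feedback-synthesis use case above, the sketch below pulls the most frequent meaningful terms out of a few hypothetical customer comments. (The comments and stop-word list are invented for illustration; a GPT-3-style model would work at the level of meaning rather than raw word counts, but the goal — surfacing themes from many comments — is the same.)

```python
from collections import Counter
import re

# Hypothetical customer comments, invented for this example.
comments = [
    "Shipping was slow but the product quality is great",
    "Great quality, though shipping took too long",
    "Customer support was helpful and shipping was fast",
]

# A tiny stop-word list to drop filler words (also invented for the example).
STOPWORDS = {"was", "but", "the", "is", "and", "too", "though", "a"}

def top_themes(texts, n=3):
    """Count non-stopword terms across all comments as crude 'themes'."""
    words = []
    for text in texts:
        words += [w for w in re.findall(r"[a-z]+", text.lower())
                  if w not in STOPWORDS]
    return [word for word, _ in Counter(words).most_common(n)]

themes = top_themes(comments)  # shipping dominates this toy feedback set
```

A real NLP model would also group synonyms ("slow" and "took too long") under one theme, which simple counting cannot do.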
Benefits and Issues
Benefits of GPT-3-type models include high-quality automated content creation tailored to a specific situation or need. This is particularly beneficial in marketing, where specific recommendations, or even customer-specific advertisements, can be developed in real time. Other benefits include improved search results that leverage the conceptual meaning found in existing documents (semantic search), assistive customer service, education (language learning, technical training, special education), and idea generation…
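Semantic search typically works by comparing vector representations (embeddings) of a query and of documents, rather than matching keywords. The sketch below mimics the idea with hand-made three-dimensional vectors; in a real system the embeddings would come from a model such as GPT-3, and the query "how do I get my money back?" would rank the refund document first even though it shares no words with it.

```python
import math

def cosine(u, v):
    # Cosine similarity: 1.0 means same direction, 0.0 means unrelated.
    num = sum(a * b for a, b in zip(u, v))
    den = (math.sqrt(sum(a * a for a in u)) *
           math.sqrt(sum(b * b for b in v)))
    return num / den

# Hypothetical 3-d document embeddings (hand-made for illustration).
docs = {
    "refund policy":   [0.9, 0.1, 0.0],
    "shipping times":  [0.1, 0.9, 0.2],
    "company history": [0.0, 0.2, 0.9],
}

# Pretend embedding of the query "how do I get my money back?"
query = [0.8, 0.2, 0.1]

# Rank documents by conceptual closeness to the query.
ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
```

The design point is that similarity is computed in "meaning space," so documents can match a query that uses entirely different words.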
These benefits can be roughly divided into two categories: increased productivity (replacing humans with competent computers) and computer-generated recommendations and content. This second type of benefit can also be considered an improvement in productivity, but it brings digital capabilities closer to creativity, a capability hitherto considered limited to the human sphere. In mid-2020, Gwern Branwen experimented with GPT-3 poetry and commented: “GPT-3’s samples are not just close to human level: they are creative, witty, deep, meta, and often beautiful. They demonstrate an ability to handle abstractions, like style parodies.”
Language-based automation and digital recommendations are not new. Siri and Alexa understand human language and respond – for the most part – adequately. The ability to understand text at a conceptual level, however, brings these capabilities to a different level. Bias is one important issue: GPT-3 and other models that synthesize prior academic or Internet material may inherit the racial, gender, or other biases contained in that material. Another form of bias is the inclusion of false facts in the analysis, because the model cannot (yet) assess what is or may be true or false.
At this stage, these models can be blind and unpredictable, as it is not always possible to understand how these large and complex transformer models arrive at certain conclusions. The models intrinsically assume that the language they analyze is logical, grammatically correct, and otherwise free of linguistic errors. Researchers and companies such as Anthropic are trying to address these problems. The questions are: what type of interventions are necessary to “improve” the results of these models, what does “improve” mean, and are improvements algorithmic or performed by humans (as in reinforcement learning from human feedback)?
- Is it artificial? Not really! GPT-3 and other similar models are looking in the mirror at existing written material and trying to understand its true meaning.
- Is it intelligence? Yes! In a way, the models are bringing new capabilities in analyzing natural language. In fact, the models seem to be able to manage complexity by digitizing critical thinking and attempting to focus on key concepts in the material.
We all have trouble dealing with complex issues surrounding us and the use of these models can definitely help boil down the complexity of these issues. The key, of course, is to ask the model questions with the requisite amount of specificity. The potential commercial benefits of this technology are fueling new developments in the application sphere, but also in the research fields of linguistics, biology, neural network modeling, conceptual research, and more. These models, however, are unlikely to replace true creativity, using imagination or original ideas. But that will be the topic of another blog…
- Andriy Myachykov and Michael I. Posner, “Chapter 53 – Attention in Language,” in Laurent Itti, Geraint Rees, and John K. Tsotsos (eds.), Neurobiology of Attention, Academic Press, 2005, pp. 324–329, ISBN 9780123757319, https://doi.org/10.1016/B978-012375731-9/50057-4
- Polysemanticity and Capacity in Neural Networks by Buck Shlegeris, Adam Jermyn, Kshitij Sachan
- A commentary on GPT-3 in MIT Technology Review, 2021
- Watch Two AIs Reflect On The Nature of Complexity. (GPT-3)