Can machines think? How do we even define thinking, and how can we recognize it in anyone other than ourselves? These questions are becoming increasingly important as recent advancements in artificial intelligence blur the line between human cognition and digital simulation. The progress made in language processing through the use of neural networks, which simulate the functioning of a human brain, has yielded results that have surprised many, including myself. In this article, I will share my experience playing with two of these new tools over the past several weeks.
On the most basic level, recent breakthroughs in AI fall into two broad categories: text processing and image generation. New tools and apps employing this technology are popping up everywhere in different variations, so there are many to try out, but this article is based on my experience with two of the most popular: ChatGPT and MidJourney.
First, let’s discuss the technology behind these advances in artificial intelligence and why they represent such a significant leap forward in achieving human-like responses from a computer. Traditional AI was primarily rules-based, with programmers coding specific instructions for situations the computer might encounter. While effective, this approach was limited by the programmer’s foresight. If the computer encountered an unplanned situation, it would be at a loss. Rules-based AI worked well for simple cause-and-effect relationships between human actions and computer responses, but programming a computer to perform more subtle tasks, such as distinguishing between the images of a dog and a cat, or writing an original essay, proved to be more challenging.
Rather than attempting to develop a program capable of handling every conceivable situation, researchers focused instead on creating learning machines inspired by the human mind and utilizing statistical modeling techniques. Terminology used to describe this technology often reflects the human physiology that inspired it. This new AI is constructed from layers of neurons that transmit information from one layer to the next, while applying sophisticated statistical modeling in order to “understand” the data it is processing.
Connections between these artificial neurons, similar to the connections between neurons in the human brain, are referred to as “synapses” and determine the flow of information within the neural network. Each neuron can be thought of as a mathematical function or algorithm that processes input data, performs a calculation, and generates a result. This output is then used as input for other neurons, which carry out additional calculations, apply further processing, and pass the results along to neurons in subsequent layers.
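To make this concrete, here is a minimal sketch of a single artificial neuron in Python. The specific numbers and the sigmoid activation are illustrative assumptions, not taken from any particular system; the point is only that each neuron is a small function whose output feeds the next layer.

```python
import math

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs, squashed into the range (0, 1)
    # by a sigmoid "activation" function.
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))

# One neuron's output becomes the input to a neuron in the next layer.
layer1_out = neuron([0.5, 0.8], [0.4, -0.6], 0.1)
layer2_out = neuron([layer1_out], [1.2], -0.3)
```

A real network chains millions of these simple functions together, layer after layer, which is where the surprising behavior emerges.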
This process converts any kind of digital input into numerical data that the computer can work with. That input can be diverse, ranging from a sentence in English to a picture of a squirrel. Once the input has been transformed into millions of computational data points, the AI uses concepts from statistical modeling to identify patterns within the result. Returning to our earlier example of distinguishing dogs, developers feed thousands of dog images into the computer. The AI then analyzes each image and detects patterns the images have in common. With feedback from developers on correct and incorrect outputs, the artificial intelligence can adjust its own models to improve results without requiring additional human programming. This process of providing data to the AI in order to improve its modeling is known as training. Once the AI has been trained to recognize specific patterns common to dog images, it can successfully identify previously unseen pictures of dogs.
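The adjust-from-feedback loop described above can be sketched with a deliberately tiny example. This toy classifier reduces each "image" to just two made-up numeric features and nudges its weights whenever it guesses wrong — a vast simplification of real training, but the same basic idea.

```python
# Toy "images" reduced to two invented features, e.g. (ear length, snout length).
# Label 1 = dog, 0 = cat. Purely illustrative numbers.
examples = [((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.2, 0.1), 0), ((0.1, 0.3), 0)]

weights = [0.0, 0.0]
bias = 0.0

for _ in range(20):                      # repeated passes over the training data
    for (x1, x2), label in examples:
        prediction = 1 if x1 * weights[0] + x2 * weights[1] + bias > 0 else 0
        error = label - prediction       # feedback: was the guess right?
        # Nudge the weights toward the correct answer.
        weights[0] += 0.1 * error * x1
        weights[1] += 0.1 * error * x2
        bias += 0.1 * error

# A previously unseen "dog" is now classified correctly.
new_dog = (0.85, 0.75)
result = 1 if new_dog[0] * weights[0] + new_dog[1] * weights[1] + bias > 0 else 0
print(result)  # → 1
```

Notice that no one wrote a rule saying "long ears mean dog" — the model discovered the pattern from examples and corrections alone.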
Since the AI has been trained to recognize patterns and features specific to dog images, it can then utilize this knowledge to create new, original images of dogs. This is achieved through a process called generative modeling. The AI essentially learns the underlying structure and distribution of the data it has been trained on, which allows it to generate new data samples that share similar characteristics. In the case of dog images, the AI has learned the various features, such as shapes, colors, and textures, that are commonly found in pictures of dogs. By combining and manipulating these features, the AI can generate entirely new images of dogs that, while unique, still resemble real dogs in appearance. This creative capability has numerous applications, only some of which we are beginning to understand or apply.
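The idea of "learning a distribution and sampling new examples from it" can be shown in miniature. This sketch learns the average and spread of two invented numeric features from a handful of "training dogs," then draws a brand-new sample from that learned distribution — real generative models are vastly more sophisticated, but this is the core intuition.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Training "dogs", each reduced to two invented numeric features
# (say, body length and ear size). Purely illustrative values.
training_dogs = [(1.0, 0.30), (1.2, 0.35), (0.9, 0.28), (1.1, 0.32)]

# Learn the distribution (mean and spread) of each feature.
means = [statistics.mean(f) for f in zip(*training_dogs)]
stdevs = [statistics.stdev(f) for f in zip(*training_dogs)]

# Generate a brand-new "dog" by sampling from the learned distribution.
new_dog = [random.gauss(m, s) for m, s in zip(means, stdevs)]
print(new_dog)  # resembles the training dogs, but is not one of them
```

The generated sample is statistically plausible without being a copy of any training example — the same property that lets an image model produce a dog no one has ever photographed.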
The same modeling techniques used for processing images of dogs can be applied to understanding language or analyzing other complex input. When applied on a large scale, the AI can recognize a wider variety of objects and interpret intricate input data. For instance, instead of merely examining pictures of dogs, the AI can analyze entire movies, identifying all the objects in each scene. Furthermore, it can detect patterns not only in the objects themselves but also in the relationships between them. In this manner, the AI can begin to understand the world in ways that mirror our own.
Since the AI perceives language and image information as interchangeable forms of data, it can translate textual descriptions into image data and vice versa. This means that users can describe a scene to an AI in natural language, and it can then generate an image based on their specifications. This capability has led to the development of various tools for creating artwork, of which MidJourney and Leonardo AI are two of the most popular and advanced. For example, I used the MidJourney AI tool to create the accompanying original image of a rogue robot standing in the rubble of an American city. The photo’s caption shows the description I used to generate it.
Unlike MidJourney, which generates images, ChatGPT focuses on producing text. At its core, it shares similarities with the predictive text functionality found on modern smartphones. As users begin typing messages on their devices, the interface attempts to predict the intended word to speed up typing. ChatGPT operates on a similar principle, but with far greater complexity and sophistication. Instead of merely suggesting individual words, ChatGPT is capable of creating entire sentences and paragraphs. The applications for this are almost limitless. It can engage in knowledgeable conversations on almost any subject in a manner that is strikingly human-like and contextually responsive. It can compose essays with minimal input. In the business world, this technology has the potential to replace customer support representatives, secretaries, and any job requiring language understanding and interpretation, an area that has long eluded automation efforts.
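The predictive-text comparison can be made concrete with the simplest possible version of the idea: counting which word tends to follow which in some training text, then predicting the most common successor. This "bigram" sketch is an assumption-laden miniature — ChatGPT's actual model is enormously more complex — but it is the same next-word-prediction principle scaled down.

```python
from collections import Counter, defaultdict

# A tiny training corpus; real models train on vast amounts of text.
corpus = "the dog chased the cat and the dog barked at the mailman"

# Count which word follows which.
follows = defaultdict(Counter)
words = corpus.split()
for current, nxt in zip(words, words[1:]):
    follows[current][nxt] += 1

def predict_next(word):
    # Return the word most often seen after `word` in the training text.
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))  # → "dog"
```

Chain predictions like this one word at a time, with a model sophisticated enough to track context across whole paragraphs, and you get something that can write essays.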
This article has only provided a glimpse into the intricate and expansive world of this new technology. Numerous topics remain unexplored, such as the countless uses to which this technology can be put, the ethical and legal ramifications, and the potential issues surrounding AI bias or manipulation. A more in-depth discussion of these aspects will be reserved for future articles. Nevertheless, it is evident that this technology will transform our world in ways we cannot yet fathom, and it will likely do so in the very near future.
The personal computer took 20 years to revolutionize our lives, while the smartphone achieved a similar impact in under a decade. These latest advancements in artificial intelligence are poised to bring about even more profound changes in a much shorter timeframe – possibly within a few years or even months. So, buckle up, my friends, the AI apocalypse has arrived!
Eric W. Austin writes about technology and community topics. Contact him by email at firstname.lastname@example.org.