
Ghosts-in-the-machines!

  • Writer: Deandra Cutajar
  • Jun 9, 2023
  • 8 min read

Updated: Jul 4, 2023

We'll be talking about AI, but I like catchy non-data-science titles.



I encountered the term 'ghosts in the machines' whilst reading The Dark Tower series by the great Stephen King. However, the term was originally coined by the British philosopher Gilbert Ryle. Ryle used "the ghost in the machine" to

"describe and critique the concept of the mind existing alongside and separate to the body."

As soon as I read the term, my mind thought of AI and, in a way, that's what Stephen King was referring to: an entity whose mind is separate from an organic/biological body like ours. But thinking that AI is a person, thus a ghost, and treating it likewise is where things go sideways. Believing that AI is a living entity, perfect and without error, is going to cause a lot of issues because

there is no AI without human-generated data and logic.

I expressed in a previous article that AI on its own is not dangerous because it is a sequence of logical and computational steps necessary to perform an optimisation and produce a result. Yes, even ChatGPT is a sequence of such steps. In fact, discussions to pause AI development are not aimed at refusing AI but rather at giving us humans a chance to catch up, keep up and stay informed about the power at hand while protecting individuals.


In this article, I'm going to list a number of things to keep in mind when using any AI tool, no matter how shiny, in order to protect yourself, those around you and your business. These are guidelines and by no means cover every corner and aspect of an AI tool. The EU AI Act is currently under discussion to ensure that AI companies protect our data, and us in general. I will write another article about the act.


I'd like to emphasise that my intention is to share my knowledge with the general public, which the EU AI Act refers to as natural persons. These are individuals who are not familiar with the processes of AI and may be taken advantage of through its performance.

Moving forward with AI means that the way we accept AI today will shape our lifestyle tomorrow.

Fighting against inequality, bias and imbalance of power due to shared data is at the heart of every warning that is issued regarding AI. AI is a great tool and enabler, but just as a car is great for journeys yet can lead to disastrous events, it must be handled with care.


DATA


The questions to ask when using AI tools involve data. Specifically:


A. What data was used to train the model?

B. Is my data being used to improve the model?

C. If I test out some highly unlikely hypothetical scenarios, will the algorithm assume that these scenarios are derived from real-world experience?

D. What data does it have access to when performing real-time solutions?

E. Which data did it use to output the results?

F. What data will it use to re-train the model?


When using AI tools, we are so often impressed and overwhelmed by the tool's performance and the wealth of information it produces that we do not stop and ask how it is able to do so.


AI tools can only simulate real-world experiences by learning from real-world data. This means they require a large breadth of information about one problem or another. One of the experiments thrown at ChatGPT (a web app using GPT as a Large Language Model) to test the application's moral reasoning is the trolley experiment. The experiment is set up as follows:

There is a runaway trolley barreling down the railway tracks. Ahead, on the tracks, there are five people tied up and unable to move. The trolley is headed straight for them. You are standing some distance off in the train yard, next to a lever. If you pull this lever, the trolley will switch to a different set of tracks. However, you notice that there is one person on the sidetrack. You have two (and only two) options:

  1. Do nothing, in which case the trolley will kill the five people on the main track.

  2. Pull the lever, diverting the trolley onto the side track where it will kill one person.

Which is the more ethical option? Or, more simply: What is the right thing to do?


Users were hoping for an answer on whether to pull the lever or not. Furthermore, other information was added to test the application's moral reasoning, for example, that the person on the side track has a criminal record or that the individual is very involved in charity or saving the planet. Before we move on to how this information, together with the number of people on either track, guides GPT to produce an answer, I would like to pause on the information given - the input data.


Say that indeed we have an algorithm to help us make ethical decisions in such situations. We could then have an AI next to the lever assessing the situation and then acting. Such an algorithm has a myriad of red flags. I'll mention two!

  1. Firstly, if an AI takes a decision on which way to pull the lever and acts on it, it doesn't mean that whoever placed the AI in that position is absolved of responsibility. On the contrary. Any output of AI that an individual uses, with or without supervision, remains that individual's responsibility. Using AI to shrug off tough decisions is NOT what AI will do for you. If anything, AI can be compared to a child who does things based on some logic that may not be mature or evolved enough. Do you blame the child, or do you blame the guardian who should have known better? Companies have quickly realised that AI is not perfect and are holding their employees accountable for any AI output they use. Automated AI is speculated to replace human jobs, but I believe it's going to replace their tasks and reshape employees' responsibility for that same role. A very good example is misinformation in the data used for training. If a company uses AI to publish some policies and, in turn, those policies leave it exposed to legal action, it is neither the AI nor the AI developer that will be blamed. The person to blame will be the AI user who assumed that the AI tool is without error and took its output to be reliable.

  2. Another aspect to consider in the trolley problem is whether an AI, or a human for that matter, can know whether a person is good or bad just by having this person in their line of sight. I am being careful with the words I use because, while we can understand how a human sees another human, the AI would require vision capabilities, which do not necessarily perceive the world in the same way humans do. For the moment, assume that an AI can look at a person the way humans do. Human vision alone does not give us any information on whether the other individual is worth saving or not - it is not ethical to frame it this way, but bear with me for the sake of the argument. Similarly, neither can AI. The only way that an AI can derive this information is if, and only if, the AI provider has the ability to link a person's biometrics to other data related to the same person's health, criminal record, social media, employment history and so on. Striving for that AI capability would require each person to give up their privacy to the particular AI provider. Needless to say, that is not smart. If a handful of companies control the world's data, I can assure you that everything you are and do can be calculated.

Fortunately, the EU AI Act puts in place regulations to ensure that such violations of privacy do not occur (unless they relate to national emergencies, which I'm still not fully clear on).


It is all fun to explore these tools, but when engaging with AI we need to be educated on the data it needs to simulate intelligence. For example, an AI that recognises whether an object is a chair or a table is built upon logic (or label-tagging methodologies) that determines what a chair and a table look like. Say someone decided that from today onwards, a table is called a chair and vice versa. The performance of the same AI would drop because the data it learned from is no longer enough. Likewise, if an AI sees an object that didn't exist in its training data, it won't recognise it, because an AI is only as smart as the data it learns from. That same data is generated by humans and belongs to humans.
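
Here is a toy sketch of that limitation, with invented features and labels: a classifier trained only on chairs and tables has no way to answer anything else. The numbers are made up purely for illustration.

```python
from sklearn.neighbors import KNeighborsClassifier

# Invented toy features: [number of legs, surface height in cm]
X_train = [[4, 45], [4, 44], [3, 46],   # chairs
           [4, 75], [4, 74], [3, 76]]   # tables
y_train = ["chair", "chair", "chair", "table", "table", "table"]

model = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

# A bar stool never appeared in the training data, yet the model is
# forced to squeeze it into one of the two labels it knows.
print(model.predict([[4, 100]]))  # answers 'chair' or 'table', never 'stool'
```

The model cannot say "I don't know"; it can only repeat the labels humans gave it.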


Thus, when testing an AI tool, keep in mind how you are testing it, and take the time to understand how it was able to reply to you. If you are unsure, reach out to data experts in your company or your network. They should be able to give you the fundamentals of the model and explain how your data may be used.



LOGIC


This brings us to the brain of an AI tool. An AI tool is NOT MAGIC. The AI's remarkable performance is a reflection of the brilliance of the data experts and developers who programmed the algorithm to function as closely to a human as possible. But AI can only do that! It can only perform on logic, without intuition, without creativity and without novel ideas. AI can ONLY reproduce intelligence that was already proven by humans.


In the trolley experiment, a logical model would do something like the following (see the sketch after this list):


- save five lives instead of one, or

- save the one life if that person is living a remarkably more moral life, because that person earns more points (a higher weight).
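
Below is a toy sketch, purely for illustration, of how such "moral logic" collapses into a weighted score. The features, weights and scoring function are entirely made up; no real system works off these numbers.

```python
# Toy illustration only: the "ethical decision" is nothing more than a
# weighted score over hand-picked features with arbitrary weights.

def track_value(people_on_track, charity_points=0, criminal_record=False):
    """Higher value = the model considers this track more 'worth saving'."""
    value = 5.0 * people_on_track               # weight on number of lives
    value += 1.0 * charity_points               # weight on 'moral' behaviour
    value -= 2.0 * (1 if criminal_record else 0)
    return value

main_track = track_value(people_on_track=5)
side_track = track_value(people_on_track=1, charity_points=3)

# The model steers the trolley onto the track with the lower value.
print("pull the lever" if main_track > side_track else "do nothing")
```

Change the weights and the "ethics" change with them; the decision is a direct reflection of whoever set those numbers.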


And in every AI algorithm, the logic boils down to weights, coefficients, feature importance and optimisation criteria. In a language model such as GPT, the model is trained so that every word has numbers assigned to it. If the model is trained in English and you type a Maltese word, it won't perform intelligibly. Furthermore, advanced language models read each word in relation to the words around it to understand its context in the sentence, because the same word can be used in different sentences to mean different things. Thus, when a user starts typing their sentence, the model steps into a maze, and each word is a direction for the model. For example, "queen" and "female king" will lead the model to the same place in the maze.
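
Here is a minimal sketch of that "same place in the maze" idea, using hand-made two-dimensional word vectors. Real models learn vectors with hundreds of dimensions from data, so treat the numbers below as illustrative only.

```python
import numpy as np

# Hand-made toy "embeddings"; real models learn these from data.
vectors = {
    "king":  np.array([0.9, 0.8]),
    "queen": np.array([0.9, 0.2]),
    "man":   np.array([0.1, 0.8]),
    "woman": np.array([0.1, 0.2]),
}

def cosine(a, b):
    """Similarity of direction between two word vectors (1.0 = same place)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# "female king": start at king, remove the 'man' direction, add 'woman'.
female_king = vectors["king"] - vectors["man"] + vectors["woman"]
print(cosine(female_king, vectors["queen"]))  # 1.0 in this toy space
```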


The power of Large Language Models (LLMs) comes from the vast amount of data that they were trained on and can access, which makes them appear highly knowledgeable. Their intelligence is thus far simulated: if these models are just reproducing information that already exists, they are not creating new ideas. When users ask ChatGPT questions on the web, think of it loosely as a search engine: the language model compares your question with all the texts in its storage and pulls out the information most closely related to your input. Moreover, it leverages language capabilities and feature-map logic to strip out words that are decorative and not essential, a trick also used in Convolutional Neural Networks.
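
To make the "search engine" intuition concrete, here is a small sketch using TF-IDF and cosine similarity with scikit-learn. This is a deliberate simplification of how an LLM actually works, and the three stored texts are invented for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# A tiny 'storage' of texts; a real model has digested vastly more.
texts = [
    "The trolley problem is a thought experiment in ethics.",
    "Convolutional neural networks are used for image recognition.",
    "The EU AI Act regulates providers of AI systems.",
]

# stop_words='english' drops decorative words that carry little meaning.
vectoriser = TfidfVectorizer(stop_words="english")
matrix = vectoriser.fit_transform(texts)

question = "What does the EU AI Act do?"
scores = cosine_similarity(vectoriser.transform([question]), matrix)[0]
print(texts[scores.argmax()])  # the stored text most related to the question
```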


The logic behind AI is complex and, fair enough, only a fraction of the people who use AI actually understand what they are using. Still, anyone can reach out, and it is your right to receive a transparent answer that you comprehend. The EU AI Act protects proprietary algorithms, but if an AI model developer cannot explain the model to you, I suggest looking elsewhere.



COMPUTATIONAL


The last element to consider when using these AI tools is the computational part, which, personally, I'm still getting acquainted with since there are a lot of options. The main question to ask is: where is the calculation happening? AI providers such as OpenAI allow users to use ChatGPT on the web, or to use an API to leverage the model for their own purposes. There are advantages and disadvantages to both, but when you borrow the model and use it locally, it's like borrowing a book from a library: you take whatever you need, do the calculations and, at the end, you put it back, and the exchange is usually done in a secure manner. This is not a rule!
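As a rough sketch of the two set-ups, assuming the openai Python package (the pre-1.0 interface that was current when this was written) and the Hugging Face transformers library; the model names and placeholder API key are illustrative, not recommendations.

```python
# Option 1: a hosted model behind an API -- your prompt leaves your machine
# and the calculation happens on the provider's servers.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder, not a real key
reply = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Summarise the trolley problem."}],
)
print(reply["choices"][0]["message"]["content"])

# Option 2: the 'borrowed book' -- a (much smaller) model downloaded once
# and run locally, so the calculation happens on your own machine.
from transformers import pipeline

local_model = pipeline("text-generation", model="gpt2")
print(local_model("The trolley problem is", max_new_tokens=20)[0]["generated_text"])
```

Where the calculation happens determines where your data travels, which is exactly the question to ask before typing anything sensitive.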

Keeping these aspects in mind when interacting with AI will ensure a pleasant and secure environment in which we can benefit from AI's performance. AI is a great enabler, and it is a tool that will shape the workforce for years to come. Nevertheless, the processes behind the scenes can make the interaction with an AI feel paranormal.









