
Sharing Data with AI

  • Writer: Deandra Cutajar
  • Jul 18, 2023
  • 3 min read

Updated: Oct 11, 2023

Over the past couple of months, our lives have been rocked by a new player in town: AI. Artificial Intelligence has been researched for decades but has only recently become accessible to the broader public. From a data scientist's perspective, this is exciting!


AI will enable data scientists to focus on critical thinking and out-of-the-box solutions. Less time can be spent optimising an algorithm when an AI can suggest alternative, better-performing approaches. It can also help document code more thoroughly and in greater detail. Generative AI can even improve communication between data scientists and the business by suggesting ways to explain complex techniques to non-technical colleagues.


So why are companies banning specific AI algorithms?

If AI is so great, why are Big Tech companies issuing policies and regulations around it?


On July 11, 2023, Insider published an article listing 14 companies that have "restricted employees from using ChatGPT". The list includes:

  • Apple

  • Spotify

  • Samsung

  • JPMorgan Chase

  • Amazon

  • Citigroup

  • Bank of America

  • Goldman Sachs

A lawsuit claims that personal data was used to make the AI act like a human, as described in this article. The common concern most companies share is that ChatGPT uses input data for training. With every interaction a user has with ChatGPT, there is a risk that the information in that interaction, private or public, is used to train the model. Once the model has learned from that data, any other user may be able to retrieve it.


Amazon's and Samsung's proprietary information was leaked when their employees uploaded code, algorithms, data and other sensitive details to the platform, exposing trade secrets to everyone. Both have since put strict guidelines in place.


These companies are developing their own AI solutions, similar to the very software they are banning. It may seem contradictory, but the key point is that the company controls any AI solution developed internally. By extension, any data uploaded to it adheres to the same data protection and security standards the company already enforces.


Big tech companies are NOT against AI technology.

Big tech companies, and other companies for that matter, are primarily concerned with the AI provider's data pipeline. Any data shared with a platform goes through a process that should safeguard the customer's proprietary information from anyone not permitted to access it. Moreover, most companies are mature enough to understand that AI models must continuously learn from new data to maintain performance over time. If proprietary data is used to update the model, then any user interacting with the new model may gain access to that same proprietary data.


Companies risk exposing their trade secrets to competitors when using a third-party solution.

Does that mean that companies should NOT use AI altogether? Should companies have their own internal AI team? Not necessarily!


Depending on the company's maturity, creating a data strategy for internal development may exceed the budget and take longer than engaging a consulting company. That, however, doesn't mean one should proceed without understanding the risk. If there is no data team, or only a small one, to take on such an advanced project, a company can partner with consulting firms and reduce the risk to nearly nothing, as the sketch below illustrates.
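
Whatever route a company takes, one concrete, low-cost safeguard is to redact proprietary details from any text before it is sent to a third-party model. The Python sketch below is a minimal illustration of the idea; the patterns and the redact helper are hypothetical examples, not a production-grade filter.

import re

# Hypothetical patterns for the kinds of proprietary details discussed
# above; a real filter would be built from the company's own inventory
# of secrets, identifiers and internal hostnames.
SENSITIVE_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "API_KEY": re.compile(r"\bsk-[A-Za-z0-9]{16,}\b"),
    "INTERNAL_HOST": re.compile(r"\b[\w-]+\.internal\.example\.com\b"),
}

def redact(prompt: str) -> str:
    """Replace sensitive substrings with placeholders before the text
    ever leaves the company network."""
    for label, pattern in SENSITIVE_PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt

raw = "Summarise the outage on db1.internal.example.com reported by ops@example.com"
print(redact(raw))
# Summarise the outage on [INTERNAL_HOST] reported by [EMAIL]

Because the scrubbing happens before the prompt leaves the company network, the placeholders are all the provider ever sees, even if it retains prompts for training.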


AI is here to stay. It promises to help us solve trivial problems and take over mundane tasks, allowing us to focus on more pressing issues that require more effort. Instead of being distracted by small tasks, we can allocate budgets and resources to game-changing projects, objectives and goals. However, we must ensure that our data is well protected, safeguarding both our trade secrets and ourselves.

