Integrity Unleashed: Embracing a New Era

“`html

We must discuss the importance of data integrity.

Specifically, the term signifies ensuring that data is not altered, whether during transmission or while stored. Altering account balances in banking databases, erasing entries from criminal records, and deleting mentions of allergies from medical documents are all attacks on integrity.

More generally, integrity pertains to verifying that data remains correct and precise from the moment it is gathered, across all the ways it is utilized, modified, transformed, and ultimately disposed of. Incidents involving integrity can include malicious acts as well as unintentional errors.

Although we may not perceive them this way, numerous basic integrity measures are integrated within our computer systems. The reboot function, which returns a computer to a previously verified state, acts as an integrity safeguard. The undo function serves as another integrity protection. Any systems capable of detecting hard drive malfunctions, file corruption, or lost internet packets serve as integrity measures.

Similar to how a website exposing personal information, even with no access, constitutes a privacy breach, a system that does not assure the precision of its data represents an integrity breach—regardless of whether that data was intentionally altered.

Integrity has always held significance, but with our increasing reliance on vast amounts of data to train and operate AI systems, the necessity for data integrity is more pressing than ever.

The majority of assaults on AI systems are integrity breaches. Placing small stickers on traffic signs to deceive AI driving systems is an integrity infringement. Prompt injection attacks represent another integrity issue. In both situations, the AI model fails to differentiate between valid data and malevolent inputs: visual in the former instance, textual instructions in the latter. Even more concerning, the AI model cannot discern between legitimate data and harmful commands.

Any assaults that manipulate training data, the model, the input, the output, or the feedback from the interaction that feeds back into the model constitute an integrity violation. If you are developing an AI system, integrity represents your largest security challenge. This is something we must contemplate, discuss, and find solutions for.

Web 3.0—the decentralized, distributed, intelligent web of the future—centers on data integrity. It’s not limited to AI. Reliable, verifiable, and accurate data and computation are essential components of cloud computing, peer-to-peer social networking, and decentralized data storage. Envision a scenario with autonomous vehicles that communicate among themselves regarding intentions and road conditions. This system cannot function without integrity. Similarly, neither can a smart power grid or dependable mesh networking. Without integrity, trustworthy AI agents cannot exist.

However, we first need to address a minor linguistic issue. Just as confidentiality corresponds to confidential, and availability corresponds to available, integrity relates to what? The corresponding term is “integrous,” but it’s such a rare word that it doesn’t appear in the Merriam-Webster dictionary, not even in the unabridged edition. I suggest we work to revive the usage of this term, beginning here.

We require research into integrous system design.

We need investigations into a range of complex problems that cover both data and computational integrity. How can we assess and quantify integrity? How can we create verifiable sensors with auditable system outputs? How do we construct integrous data processing units? How do we respond to an integrity breach? These are just a few of the inquiries we must address as we begin to explore integrity.

There are profound questions at hand, as deep as the internet itself. Back in the 1960s, the internet was developed to address a fundamental security query: Can we establish an available network amidst a reality of availability failures? More recently, we have shifted to the question of privacy: Can we create a confidential network in a landscape of confidentiality failures? I propose that the current iteration of this inquiry should be: Can we develop an integrous network in an environment rife with integrity failures? Like the previous versions of this question, the answer isn’t evidently “yes,” but it’s not clearly “no” either.

Let’s begin contemplating integrous system design. And let’s incorporate the term in our discussions. The more frequently we utilize it, the less peculiar it will seem. And, who knows, perhaps one day the American Dialect Society will select it as the word of the year.

This essay was initially published in IEEE Security & Privacy.

“`

Leave a Reply Cancel reply