While today’s digital infrastructure is based on the internet, tomorrow’s will be based on artificial intelligence (AI). The internet has governance structures and security measures in place to protect enterprise resources from malicious attacks, but no equivalent exists for AI. There is an urgent need to develop structure around the creation of AI models, their deployment, and their subsequent management and deprecation. As the use of AI grows, so does the risk of a catastrophic failure. Without AI governance, AI cannot be trusted, and may in fact open the door to a range of new attacks and vulnerabilities.
Providers must be able to assure users that the model was built from correct, high-quality data; was trained using sound methods; has well-understood origins and constraints for use (that is, the conditions under which the model is effective or ineffective); has its weaknesses accounted for; and has not been tampered with. The model should accurately reflect the “ground truth,” that is, the actual nature of the problem the machine learning model targets. It should also be assessed whether the ground truth changes over time, and if so, there should be a process to retrain the model to reflect those changes.
Let’s look at the factors that help assess the trustworthiness of an AI model: the source of training data, model quality, and model constraints.
3 Factors In Determining AI Model Trustworthiness
- Source of data
- Model quality
- Model constraints
Source of Data
At the core of AI security is protecting the integrity of AI models to reduce vulnerabilities, such as preventing third parties from exploiting a model’s statistical properties to “fool” it. AI models are black boxes: after a model is trained, it is very hard to ascertain key model properties or to understand which specific elements are responsible for its predictions. That is why it is so important to train it on data from an unbiased, trusted source. AI models can be hacked, and adversarial attacks fool AI models with deceptive data. Changing the training data or altering the model’s architecture changes the model’s statistical properties, and with them its ability to make correct decisions.
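As a minimal sketch of how an attacker can exploit a model’s statistical properties, consider a toy linear classifier whose weights are known. Nudging each input feature slightly in the direction that lowers the score flips the prediction. The weights, inputs, and step size below are all invented for illustration.

```python
# Toy adversarial perturbation against a hypothetical linear classifier.

def predict(weights, x):
    """Linear score; positive means 'dog', negative means 'cat'."""
    return sum(w * xi for w, xi in zip(weights, x))

weights = [0.9, -0.4, 0.3]       # made-up model weights
x = [1.0, 0.5, 0.2]              # scores positive: classified as 'dog'

# An attacker who knows the weights shifts each feature a small amount
# in whichever direction lowers the score (a sign-based step).
eps = 0.5
x_adv = [xi - eps * (1 if w > 0 else -1) for xi, w in zip(x, weights)]

print(predict(weights, x))       # positive: still 'dog'
print(predict(weights, x_adv))   # negative: flipped to 'cat'
```

Real attacks work the same way in spirit, but against high-dimensional models and often without direct access to the weights, probing the model statistically instead.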
AI models depend on algorithms with clearly defined constraints, high-quality training data and, if applicable, pre-trained models. The purpose for which the model was intended must also be taken into account. High-quality datasets are needed to create high-quality models; this entails not only having enough data, but also having unbiased, representative data from a reliable source.
Model Quality
Model quality is measured statistically, but there are important caveats. If the data used to train a model is biased or incorrect, the data used for validation is likely biased or incorrect in the same way. The model will appear to have high accuracy when, in fact, it is simply a bad model. For example, Google trained an AI to recognize faces, but because Black people were underrepresented in the training data, the model misclassified faces with dark skin tones.
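A minimal sketch of this caveat, with entirely synthetic data: a degenerate model that always predicts the majority class looks excellent when validated on the same skewed distribution it was trained on, and useless when validated on a representative sample.

```python
# Why validating on biased data hides a bad model. All data is synthetic.

def always_majority(_sample):
    return "A"                   # degenerate model: ignores its input

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

# Biased sample: 95 of 100 examples are class "A".
biased_val = [(i, "A") for i in range(95)] + [(i, "B") for i in range(5)]
# Representative sample: the real world is 50/50.
fair_val = [(i, "A") for i in range(50)] + [(i, "B") for i in range(50)]

print(accuracy(always_majority, biased_val))  # 0.95: looks like a good model
print(accuracy(always_majority, fair_val))    # 0.5: no better than a coin flip
```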
The choice of algorithm can also have an impact on the final model because the combination of the algorithm and dataset determines the overall effectiveness of the model. A model’s effectiveness needs to be defined and is often determined by the problem being solved. For example, are false positives riskier than false negatives?
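The false-positive versus false-negative question can be made concrete by attaching a (hypothetical) cost to each kind of error. The same two error counts then favor different models depending on the problem:

```python
# Which error is riskier depends on the application. Counts and costs
# below are hypothetical, purely to illustrate the tradeoff.

def expected_cost(fp, fn, cost_fp, cost_fn):
    return fp * cost_fp + fn * cost_fn

fp_a, fn_a = 40, 5               # model A: many false alarms, few misses
fp_b, fn_b = 5, 40               # model B: few false alarms, many misses

# Spam filtering: a false positive (lost real mail) hurts more.
print(expected_cost(fp_a, fn_a, cost_fp=10, cost_fn=1))   # 405
print(expected_cost(fp_b, fn_b, cost_fp=10, cost_fn=1))   # 90  -> B wins

# Tumor screening: a false negative (missed tumor) hurts more.
print(expected_cost(fp_a, fn_a, cost_fp=1, cost_fn=10))   # 90  -> A wins
print(expected_cost(fp_b, fn_b, cost_fp=1, cost_fn=10))   # 405
```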
Unlike traditional computer programs, whose code an experienced software developer can read, AI models reveal nothing about their behavior by inspection. An AI model is simply a set of numbers, and it is impossible to determine the model’s behavior from those numbers. AI models behave statistically, and their predictions are a matter of probability. If a model’s probability “profile” has been changed, it is difficult to determine its accuracy.
All AI models give predictions associated with a probability. If an AI model classifies a photo of a dog as a dog, that isn’t a definitive statement; it is simply saying there is an X percent probability that the photo is of a dog. Testing an AI model can therefore only be done statistically: for example, by taking a large number of photos and seeing how often the model successfully predicts that each image is of a dog.
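The photo-counting test above can be sketched as a standard statistical measurement: treat each prediction as a pass/fail trial and report accuracy with a confidence interval rather than a single number. The trial counts are invented for illustration, and the interval uses the simple normal approximation.

```python
import math

# Accuracy with a 95% confidence interval (normal approximation).
# The counts are hypothetical: of 1,000 dog photos, the model said
# "dog" 940 times.

def accuracy_with_ci(correct, total, z=1.96):
    p = correct / total
    half = z * math.sqrt(p * (1 - p) / total)
    return p, (p - half, p + half)

p, (lo, hi) = accuracy_with_ci(940, 1000)
print(f"accuracy {p:.3f}, 95% CI ({lo:.3f}, {hi:.3f})")
```

The interval, not the point estimate, is what a trustworthy evaluation should report: with only 50 test photos instead of 1,000, the same 94% accuracy would come with a far wider interval.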
Model Constraints
Understanding the purpose for which the model was created plays an important part in determining its quality. A model can only be used in a very limited set of circumstances, and it can only predict based on the data it has been exposed to. A model trained to recognize dogs and cats cannot detect squirrels; worse, it will misclassify squirrels as dogs or cats. A model trained to detect tumors in the prostate gland cannot be used to detect tumors in a lung.
A model is trained for a specific purpose, so if it is given data from a class it was not trained on, it will try to match that data to the classes it does know. Suppose a model has been trained to detect dogs and cats, and someone passes it a photo of a squirrel. The model will try to classify it as a dog or a cat because it has never been taught what a squirrel is.
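The mechanics behind this are easy to see in the softmax layer that typical classifiers use for their final output: the probabilities over the known classes always sum to 1, so there is no “none of the above.” The logits below are invented to stand in for a squirrel photo that fits neither class well.

```python
import math

# Why a cat/dog classifier must put a squirrel somewhere: softmax
# distributes all probability mass over the known classes.

def softmax(logits):
    exps = [math.exp(v) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

classes = ["dog", "cat"]
squirrel_logits = [0.3, 0.1]     # hypothetical: neither class really fits
probs = softmax(squirrel_logits)

print(dict(zip(classes, (round(p, 2) for p in probs))))
print(sum(probs))                # always 1: no "none of the above" option
```

Note that low, nearly even probabilities like these can hint that an input is outside the model’s training distribution, but the model will still emit a “dog” or “cat” answer.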
Models may also be built from “pre-trained” models, using an existing model as a base for training a new one. Take our dog-and-cat model, for example: it can be “extended” to recognize squirrels. When a pre-trained model is used, it is important that the quality of that pre-trained model is understood.
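Schematically, extending a pre-trained model means keeping its learned feature extractor frozen and training only a small new “head” for the extra class. The sketch below is purely structural (the weights are placeholder numbers), but it shows why the base model’s quality matters: the new model shares, rather than copies, everything the base learned.

```python
# Schematic transfer-learning setup; weights are placeholders.

pretrained = {
    "feature_weights": [0.8, -0.3, 0.5],   # learned on dogs vs. cats; frozen
    "head_classes": ["dog", "cat"],
}

# Extended model: reuse the frozen features, add a head that also
# knows about squirrels. Only the new head would be trained.
extended = {
    "feature_weights": pretrained["feature_weights"],  # shared, not copied
    "head_classes": pretrained["head_classes"] + ["squirrel"],
}

print(extended["head_classes"])
# Any bias or weakness in the frozen features is inherited wholesale,
# which is why the pre-trained model's quality must be understood.
```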
One last check to make is on the model’s recency. An AI model may become less accurate over time. Suppose an AI is designed to create exciting headlines for news articles; over time the model may become less effective due to ever-changing trends and news cycles. Likewise, a model used to predict the optimal insurance premium for a customer may become stale as the customer’s demographics, environment, or habits change.
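In practice, recency is checked by monitoring the model’s accuracy over time and retraining when it falls below an acceptance bar. A minimal sketch, with made-up monthly figures imitating a slowly degrading model and a hypothetical threshold:

```python
# Drift monitoring sketch: synthetic monthly accuracy for a model that
# slowly degrades as the world changes.

monthly_accuracy = [0.91, 0.90, 0.89, 0.86, 0.83, 0.79]
RETRAIN_THRESHOLD = 0.85         # hypothetical acceptance bar

for month, acc in enumerate(monthly_accuracy, start=1):
    flag = "  <- retrain" if acc < RETRAIN_THRESHOLD else ""
    print(f"month {month}: accuracy {acc:.2f}{flag}")

# First month that falls below the bar:
first_bad = next(m for m, a in enumerate(monthly_accuracy, 1)
                 if a < RETRAIN_THRESHOLD)
print(first_bad)  # 5
```

The same monitoring loop doubles as the trigger for the retraining process mentioned earlier, closing the governance cycle from deployment back to training.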