Machine learning adoption has exploded over the past decade, driven in part by the rise of cloud computing, which has made high-performance computing and storage accessible to businesses of all sizes. As vendors integrate machine learning into products across industries and users come to rely on its output in their decision-making, security experts warn of adversarial attacks designed to abuse the technology.
Poison In The Machine
Most social networking platforms, online video platforms, large shopping sites, search engines and other services have some sort of recommendation system based on machine learning. The movies and shows that people like on Netflix, the content that people like or share on Facebook, the hashtags and likes on Twitter, the products consumers buy or view on Amazon, the queries users type in Google Search are all fed back into these sites' machine learning models to make better and more accurate recommendations.
Data poisoning, or model poisoning, attacks involve polluting a machine learning model's training data. Data poisoning is considered an integrity attack because tampering with the training data impacts the model's ability to output correct predictions. Other types of attacks can be similarly classified based on their impact: confidentiality attacks, for example, let attackers infer information about the training data or the model itself, while evasion attacks craft inputs that trick an already trained model into misclassifying them without touching the training data at all.
The difference between an attack meant to evade a model's prediction or classification and a poisoning attack is persistence: with poisoning, the attacker's goal is to get their inputs accepted as training data. The timescale also differs, because poisoning depends on the model's training cycle; it might take weeks for the attacker to achieve their goal.
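To make the mechanics concrete, here is a minimal, self-contained sketch of label-flipping poisoning against a toy scikit-learn classifier. The dataset, model and 20% flip rate are illustrative assumptions, not a description of any real-world attack.

```python
# Toy sketch of label-flipping data poisoning (illustrative assumptions throughout).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clean_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# The attacker gets mislabeled samples accepted as training data: here, 20% of the
# training labels are flipped so the poison persists through the next training cycle.
rng = np.random.default_rng(0)
poison_idx = rng.choice(len(y_train), size=int(0.2 * len(y_train)), replace=False)
y_poisoned = y_train.copy()
y_poisoned[poison_idx] = 1 - y_poisoned[poison_idx]

poisoned_model = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned)

print("clean accuracy:   ", accuracy_score(y_test, clean_model.predict(X_test)))
print("poisoned accuracy:", accuracy_score(y_test, poisoned_model.predict(X_test)))
```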
Data poisoning can be achieved either in a black-box scenario, against classifiers that rely on user feedback to update their learning, or in a white-box scenario, where the attacker gains access to the model and its private training data, possibly somewhere in the supply chain if the training data is collected from multiple sources.
In a cybersecurity context, the target could be a system that uses machine learning to detect network anomalies that might indicate suspicious activity. If an attacker understands that such a model is in place, they can attempt to slowly introduce data points that decrease its accuracy, so that eventually the things they want to do are no longer flagged as anomalous, Andrew Patel, a researcher at security vendor F-Secure, tells CSO. This is also known as model skewing.
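A hedged sketch of how such model skewing might play out against a simple anomaly detector: the attacker gradually feeds points that drift toward the activity they want to hide, so each retraining cycle absorbs a little more of it. The detector, synthetic data and injection schedule below are assumptions for illustration only.

```python
# Sketch of gradual model skewing against an anomaly detector (illustrative only).
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)
normal_traffic = rng.normal(loc=0.0, scale=1.0, size=(2000, 2))  # benign baseline features
attack_point = np.array([[6.0, 6.0]])                            # activity the attacker wants unflagged

training_data = normal_traffic
for cycle in range(5):
    model = IsolationForest(random_state=1).fit(training_data)
    flagged = model.predict(attack_point)[0] == -1
    print(f"cycle {cycle}: attack point flagged as anomaly = {flagged}")

    # The attacker slowly injects points that creep toward their target region,
    # so each retrain treats a little more of that region as "normal" behaviour.
    crept = rng.normal(loc=(cycle + 1) * 1.5, scale=0.6, size=(400, 2))
    training_data = np.vstack([training_data, crept])
```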
A real-world example of this is the attacks against the spam filters used by email providers. In a 2018 blog post on machine learning attacks, Elie Bursztein, who leads the anti-abuse research team at Google, said: "In practice, we regularly see some of the most advanced spammer groups trying to throw the Gmail filter off-track by reporting massive amounts of spam emails as not spam [...] Between the end of Nov 2017 and early 2018, there were at least four malicious large-scale attempts to skew our classifier."
The main problem with data poisoning is that it's not easy to fix. Models are retrained with newly collected data at certain intervals, depending on their intended use and their owner's preference. Since poisoning usually happens over time, and over some number of training cycles, it can be hard to tell when prediction accuracy starts to shift.
Reverting the poisoning effects would require a time-consuming historical analysis of inputs for the affected class to identify all the bad data samples and remove them, followed by retraining from a version of the model that predates the attack. When dealing with large quantities of data and a large number of attacks, however, retraining in that way is simply not feasible and the models never get fixed, according to F-Secure's Patel.
"There's this whole notion in academia right now that I think is really cool and not yet practical, but we'll get there, that's called machine unlearning," Hyrum Anderson, principal architect for Trustworthy Machine Learning at Microsoft, tells CSO. "For GPT-3 [a language prediction model developed by OpenAI], the cost was $16 million or something to train the model once. If it were poisoned and identified after the fact, it could be really expensive to find the poisoned data and retrain. But if I could unlearn, if I could just say 'Hey, for these data, undo their effects and my weights,' that could be a significantly cheaper way to build a defense. I think practical solutions for machine unlearning are still years away, though. So yes, the solution at this point is to retrain with good data and that can be super hard to accomplish or expensive."
According to Anderson, data poisoning is just a special case of a larger issue called data drift that affects deployed systems. Everyone gets bad data for a variety of reasons, and there is a lot of research on how to deal with data drift, as well as tools to detect significant changes in operational data and model performance, including from the large cloud computing providers: Azure Machine Learning's data drift monitoring and Amazon SageMaker Model Monitor are examples of services with such capabilities.
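A minimal sketch of the kind of check such tooling performs, assuming you retain a reference sample from training time and compare it against each production batch. A per-feature two-sample statistical test is a generic approach, not the specific mechanism either cloud service uses.

```python
# Generic data-drift check: flag features whose distribution shifted significantly.
import numpy as np
from scipy.stats import ks_2samp

def drifted_features(reference: np.ndarray, current: np.ndarray, p_threshold: float = 0.01):
    """Return indices of features that fail a two-sample Kolmogorov-Smirnov test."""
    flagged = []
    for j in range(reference.shape[1]):
        stat, p_value = ks_2samp(reference[:, j], current[:, j])
        if p_value < p_threshold:
            flagged.append(j)
    return flagged

rng = np.random.default_rng(0)
reference_batch = rng.normal(0, 1, size=(5000, 4))   # snapshot kept from training time
current_batch = rng.normal(0, 1, size=(5000, 4))     # latest production data
current_batch[:, 2] += 0.5                           # simulate drift (or slow poisoning) in one feature

print("features flagged for drift:", drifted_features(reference_batch, current_batch))
```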
"If your model's performance after a retraining takes a dramatic hit, whether or not it's a poisoning attack or just a bad batch of data is probably immaterial and your system can detect that," Anderson says. "If you manage to fix that, then you can either root out that targeted poisoning attack or the bad batch of data that inadvertently got inside your data aperture when you trained your model. So those kinds of tools are a good start and they're kind of in this AI risk management framework that's beginning to materialize in the industry."
"A lot of security in AI and machine learning has to do with very basic read/write permissions for data or access to models or systems or servers," Anderson says. "It's a case where a small over permissive data provider service or file in some directory could lead to a poisoning attack."
Just as organizations run regular penetration tests against their networks and systems to discover weaknesses, they should expand that practice to their machine learning deployments and treat machine learning as part of the security of the larger system or application.
"I think the obvious thing that developers should do with building a model is to actually attack it themselves to understand how it can be attacked and by understanding how it can be attacked, they can then attempt to build defenses against those attacks," Patel says. "Your detection is going to be based on what you found from the red teaming so when you put together attacks against the model, you can then understand what the data points would look like, and then accordingly, you would build mechanisms that are able to discard the data points that look like poisoning."
Anderson is actively involved with this at Microsoft. In a recent talk at the USENIX Enigma conference, he presented a red team exercise at Microsoft where his team managed to reverse-engineer a machine learning model that was being used by a resource provisioning service to ensure efficient allocation and mapping of virtual resources to physical hardware.
Without having direct access to the model, the team found enough information about how it collected data to create a local model replica and test evasion attacks against it without being detected by the live system. This allowed them to identify which combinations of virtual machines and databases, with which sizes and replication factors, at what times of day and in what regions, they should request from the real system to ensure, with high probability, that the machine learning model would overprovision the resources they requested on physical hosts that also hosted high-availability services.
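A rough sketch of the substitute-model idea behind that exercise: query a black-box target, train a local replica on the observed input/output pairs, then probe evasion candidates against the replica instead of the live system. The `query_target` function is a hypothetical stand-in for whatever interface the red team could observe, not the actual provisioning service.

```python
# Sketch of training a local replica of a black-box model to rehearse evasion offline.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def query_target(X: np.ndarray) -> np.ndarray:
    # Hypothetical placeholder for the real system's observable decisions.
    return (X.sum(axis=1) > 0).astype(int)

rng = np.random.default_rng(0)
probes = rng.normal(size=(2000, 5))          # attacker-chosen queries
labels = query_target(probes)                # observed responses

replica = GradientBoostingClassifier().fit(probes, labels)   # local stand-in model

candidates = rng.normal(size=(10000, 5))
scores = replica.predict_proba(candidates)[:, 1]
# Keep only candidates the replica is confident about, then try that short list
# against the real system to minimise the chance of being detected.
promising = candidates[scores > 0.95]
print(len(promising), "candidate requests to try against the live service")
```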
Modern machine learning often relies on open-source datasets, pretrained models, and machine learning libraries from across the internet, but are those resources safe to use? Previously successful digital supply chain attacks against cyber infrastructure suggest the answer may be no. This report introduces policymakers to these emerging threats and provides recommendations for how to secure the machine learning supply chain.
Progress in machine learning depends on trust. Researchers often place their advances in a public well of shared resources, and developers draw on those to save enormous amounts of time and money. Coders use the code of others, harnessing common tools rather than reinventing the wheel. Engineers use systems developed by others as a basis for their own creations. Data scientists draw on large public datasets to train machines to carry out routine tasks, such as image recognition, autonomous driving, and text analysis. Machine learning has accelerated so quickly and proliferated so widely largely because of this shared well of tools and data.
It is becoming standard practice for researchers to share systems that have been trained on data from real-world examples, enabling the systems to perform a particular task. With pretrained systems widely available, other machine learning developers do not need large datasets or large computing budgets. They can simply download those models and immediately achieve state-of-the-art performance and use those capabilities as a foundation for training even more capable machine learning systems. The danger is that if a pretrained model is contaminated in some way, all the systems that depend on it may also be contaminated. Such poison in a system is easy to hide and hard to spot.
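Checksums will not catch poison baked into a model by its publisher, but pinning and verifying the hash of every downloaded artifact at least rules out tampering in transit or on a mirror. A minimal sketch follows, with a placeholder file name and digest; a real pipeline would take the expected hash from a trusted, signed manifest.

```python
# Sketch of pinning and verifying a pretrained-model artifact before loading it.
import hashlib
from pathlib import Path

EXPECTED_SHA256 = "replace-with-the-hash-published-by-the-model-provider"  # placeholder

def verify_artifact(path: Path, expected: str) -> bool:
    """Compare the file's SHA-256 digest against the pinned value."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return digest == expected

model_file = Path("pretrained_model.bin")   # placeholder file name
if not verify_artifact(model_file, EXPECTED_SHA256):
    raise RuntimeError(f"{model_file} does not match the pinned checksum; refusing to load")
```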
The primary aim of this pilot study was to develop a machine learning algorithm to predict and distinguish eight poisoning agents based on clinical symptoms. Data were drawn from the National Poison Data System for 2014 to 2018, covering patients 0-89 years old with single-agent exposure to eight drugs or drug classes (acetaminophen, aspirin, benzodiazepines, bupropion, calcium channel blockers, diphenhydramine, lithium and sulfonylureas). Four classifier prediction models were applied to the data: logistic regression, LightGBM, XGBoost, and CatBoost. There were 201,031 cases used to develop and test the algorithms. Among the four models, accuracy ranged from 77% to 80%, with precision and F1 scores of 76%-80% and recall of 77%-78%. Overall specificity was 92% for all models. Accuracy was highest for identifying sulfonylurea, acetaminophen, benzodiazepine and diphenhydramine poisoning. F1 scores were highest for correctly classifying sulfonylurea, acetaminophen and benzodiazepine poisonings. Recall was highest for sulfonylureas, acetaminophen and benzodiazepines, and lowest for bupropion. Specificity was >99% for the sulfonylurea, calcium channel blocker, lithium and aspirin models. For single-agent poisoning cases among the eight possible exposures, machine learning models based on clinical signs and symptoms moderately predicted the causal agent. CatBoost and LightGBM classifier models had the highest performance of those tested.
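A hedged sketch of the study's general setup using its simplest listed model, multinomial logistic regression: clinical-sign indicators in, one of eight agents out, scored with the kinds of per-class metrics the abstract reports. The synthetic features and labels below are placeholders, not National Poison Data System records.

```python
# Illustrative multiclass setup mirroring the study's simplest model (synthetic data only).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

AGENTS = ["acetaminophen", "aspirin", "benzodiazepines", "bupropion",
          "calcium channel blockers", "diphenhydramine", "lithium", "sulfonylureas"]

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(4000, 30))          # binary clinical-sign indicators (synthetic)
y = rng.integers(0, len(AGENTS), size=4000)      # causal-agent label (synthetic)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LogisticRegression(max_iter=2000).fit(X_train, y_train)

# Per-class precision, recall and F1, as reported in the abstract.
print(classification_report(y_test, model.predict(X_test),
                            target_names=AGENTS, zero_division=0))
```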