Strengthening AI-based defenses against the structural advantage of offensive AI

Real-world applications of artificial intelligence have grown so quickly and become so commonplace that they are now hard to avoid in everyday life; anyone who drives a car or sends a text message has felt their influence. The same is true in cybersecurity, where both attackers and defenders are racing to make AI work in their favor. AI's rise coincides with the proliferation of data itself, and as we increasingly rely on AI to make sense of this new data-centric world, we also need to understand its security.

For decades, defenders have fended off attacks by detecting signatures, specific patterns that indicate malicious activity. This bottom-up approach is reactive: new attacks require new signatures to be written and deployed, so attackers stay one step ahead in the digital cat-and-mouse game. The next generation of AI-based solutions addresses this problem by taking a top-down approach, feeding large volumes of activity data into statistical models. This shift from signatures to statistics means defenses can be proactive and generalize better to novel attacks.

AI-based defensive technology is booming and is now commonly applied to classic problems such as spam filtering and the detection of malicious files or URLs. These models typically rely on supervised machine learning algorithms, which learn a function from inputs (for example, the domain name "https://google.com" or "http://google-phishpage.com") to outputs (for example, "benign" or "malicious"). While supervised learning maps cleanly onto the defender's need to distinguish benign from malicious, it is expensive and time-consuming to implement because it relies on pre-existing labels. Labeling data requires up-front effort and domain expertise, and the labels cannot be reused elsewhere, which means there is a fundamental bottleneck in building effective AI-based defenses.
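To make the supervised setup concrete, here is a minimal sketch of such a URL classifier, assuming scikit-learn is available: character n-grams fed into a logistic regression, with a tiny inline dataset standing in for the expensive expert-labeled corpus described above. The URLs and labels are illustrative, not real training data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled data: the costly, expert-produced tags the
# article describes. Real systems need many thousands of examples.
urls = [
    "https://google.com",
    "https://wikipedia.org",
    "http://google-phishpage.com",
    "http://paypa1-login-verify.net",
]
labels = ["benign", "benign", "malicious", "malicious"]

# Character n-grams pick up suspicious substrings ("phish", "1" for "l")
# without hand-written rules -- statistics instead of signatures.
model = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(2, 4)),
    LogisticRegression(),
)
model.fit(urls, labels)

print(model.predict(["http://secure-google-phishpage.com/login"]))
```

The model learns which character patterns correlate with each label, so a previously unseen URL can be scored without anyone writing a signature for it.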

Structural advantage of offensive AI

AI-based defenses suffer from other exploitable weaknesses. Because a model's accuracy is bounded by the fidelity of its labels, an attacker can poison the model by getting its creator to train it on a dataset injected with purposefully corrupted labels, which lets the attacker craft specific samples that bypass detection. Other models are systematically susceptible to slightly perturbed inputs that cause them to produce embarrassingly high-confidence errors. These so-called adversarial examples are best illustrated by physical attacks, such as placing stickers on a stop sign to fool the object recognizer in a self-driving car, or embedding hidden voice commands to trick the speech recognizer in a smart speaker.
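To illustrate how small an evading perturbation can be, here is a sketch of the minimal-norm evasion attack against a linear classifier. Everything here is an illustrative assumption: synthetic feature vectors stand in for, say, numeric file features, and logistic regression stands in for the deployed detector.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic benign (0) and malicious (1) samples: 100 numeric features
# with heavily overlapping distributions.
X = np.vstack([rng.normal(-0.2, 1.0, (500, 100)),
               rng.normal(+0.2, 1.0, (500, 100))])
y = np.array([0] * 500 + [1] * 500)

clf = LogisticRegression(max_iter=1000).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

# Take a malicious sample the model currently catches, then slide it
# just across the decision boundary along the weight vector -- the
# smallest perturbation that flips a linear model's verdict.
mal = X[500:]
x = mal[clf.predict(mal) == 1][0]
score = w @ x + b
x_adv = x - 1.05 * (score / (w @ w)) * w

print("before:", clf.predict([x])[0], "after:", clf.predict([x_adv])[0])
print("mean per-feature change: %.3f (features have std 1.0)"
      % np.abs(x_adv - x).mean())
```

The per-feature change is typically a fraction of the natural feature noise, yet the verdict flips from malicious to benign, which is exactly the failure mode adversarial examples exploit.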

While these examples may hit close to home for ordinary citizens, for cybersecurity professionals a similar mistake can mean the difference between a breach and a promotion. Attackers are increasingly turning to automation, and they will soon turn to AI to exploit these weaknesses. In short, "red team" attackers stand to benefit from data just as much as "blue team" defenders do.

Proof-of-concept AI-based red team workflows are multiplying around spear phishing, password cracking, CAPTCHA subversion, steganography, Tor deanonymization, and antivirus evasion. In each demonstration, the attacker takes advantage of easily accessible data, which suggests that the data labeling bottleneck makes AI-based attacks easier to get off the ground than their defensive counterparts.

At first glance, this may look like history repeating itself. Attackers have always enjoyed an advantage, simply because of the asymmetry of the task: the blue team only wins by detecting attacks with nearly 100% success, while the red team wins with a single success in a hundred attempts.

This asymmetry is compounded by a broader industry trend that, unfortunately, favors the red team. One reason we have made such great progress on problems like image recognition is that researchers in those fields are rewarded for collaborating openly. Cybersecurity researchers, by contrast, are often constrained because their data is too sensitive or even illegal to share, or is treated as intellectual property, a secret weapon that gives vendors a foothold in a fiercely competitive market. Attackers can exploit this fragmented landscape and lack of data sharing to outpace defenses.

Aggravating this asymmetry, it is only a matter of time before the barrier to entry for applying AI drops from a doctoral degree to a high school classroom. Free educational resources, readily available datasets and pre-trained models, on-demand access to powerful cloud resources such as GPUs, and open source software libraries all lower the bar for AI novices, and for would-be attackers. Deep learning is in fact more user-friendly than the older approaches it replaces, and in many cases it no longer requires expert manual feature engineering to reach state-of-the-art accuracy.

The calm before the storm...

Given these realities, the saying "a dollar of offense beats a dollar of defense" seems an apt summary of the malicious use of artificial intelligence. So far, good old-fashioned manual attacks still dominate, and there is no credible evidence of attacks powered substantially by AI in the wild. But it is precisely now that we should consider how to relieve the data labeling bottleneck in order to blunt its future impact.

While the odds may seem stacked against them, defenders do have tools available to reduce the cost and time of labeling data. Crowdsourced labeling services provide a cheap, on-demand workforce whose consensus votes approach expert-level accuracy.
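A minimal sketch of that consensus mechanism, assuming three hypothetical workers vote on each sample and a simple majority stands in for the expert:

```python
from collections import Counter

# Hypothetical votes from three crowdsourced annotators per sample.
worker_labels = {
    "http://paypa1-login-verify.net": ["malicious", "malicious", "benign"],
    "https://wikipedia.org": ["benign", "benign", "benign"],
}

# Majority vote: the most common label wins, smoothing over the
# occasional mistake by any single low-cost worker.
consensus = {
    item: Counter(votes).most_common(1)[0][0]
    for item, votes in worker_labels.items()
}
print(consensus)
```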

Other key strategies from industry and academia can accelerate the deployment of AI-based defenses:

Active learning, in which relatively slow and expensive human experts label only the most informative data points (a sketch follows this list).

Semi-supervised learning, in which a model trained on limited labeled data learns the structure of the problem from more plentiful unlabeled data.

Transfer learning, in which a model previously trained on a problem with abundant labeled data is customized for a new problem with limited labeled data.
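Here is a minimal sketch of the first of these strategies, active learning via uncertainty sampling: a model trained on a small seed set ranks a large unlabeled pool by how ambiguous each sample is, and only the most ambiguous ones go to the human expert. The synthetic data, the labeling rule, and the budget of ten queries are all illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X_seed = rng.normal(0, 1, (20, 5))             # tiny expert-labeled seed set
y_seed = (X_seed.sum(axis=1) > 0).astype(int)  # hypothetical ground truth
X_pool = rng.normal(0, 1, (1000, 5))           # large, cheap unlabeled pool

clf = LogisticRegression().fit(X_seed, y_seed)

# Uncertainty = predicted probability closest to 0.5. Only the ten most
# ambiguous pool samples are sent to the slow, expensive human expert;
# their new labels would then be folded in and the model retrained.
proba = clf.predict_proba(X_pool)[:, 1]
uncertainty = np.abs(proba - 0.5)
to_label = np.argsort(uncertainty)[:10]
print("pool indices to send to the expert:", to_label)
```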

Finally, the best defense is a good offense. Handled with care, generating adversarial samples lets companies harden their AI-based defenses: defenders can preemptively attack their own models to help plug the holes.
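A sketch of that preemptive self-attack, assuming the same synthetic setup as the evasion example earlier: generate evading variants of known-malicious samples, keep their true label, and retrain on the augmented set, a simple form of adversarial training.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-0.2, 1.0, (500, 100)),
               rng.normal(+0.2, 1.0, (500, 100))])
y = np.array([0] * 500 + [1] * 500)

clf = LogisticRegression(max_iter=1000).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

# Attack our own model: push every malicious sample just past the
# decision boundary, exactly as an evasion attacker would.
mal = X[500:]
scores = mal @ w + b
X_adv = mal - 1.05 * (scores[:, None] / (w @ w)) * w

# Retrain with the adversarial variants correctly labeled malicious.
X_hard = np.vstack([X, X_adv])
y_hard = np.concatenate([y, np.ones(len(X_adv), dtype=int)])
hardened = LogisticRegression(max_iter=1000).fit(X_hard, y_hard)

print("old model catches adversarial variants:",
      (clf.predict(X_adv) == 1).mean())
print("hardened model catches adversarial variants:",
      (hardened.predict(X_adv) == 1).mean())
```

In practice this loop is repeated: attack the hardened model again, retrain again, and so on until the remaining holes are acceptably hard to find.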

Although the data labeling bottleneck hands AI-based attacks a tactical advantage, defenders can, and should, take steps now to level the playing field before attackers pull away.
