Tech News

Microsoft Releases New Feature, Prompt Shields

The company announced a new feature coming to its Azure AI Studio and Azure OpenAI Service, which people use to create generative AI applications and custom Copilots, known as Prompt Shields, according to a blog post today. 

TakeAway Points:

  • The company reveals a new feature called Prompt Shields.
  • The technology is designed to guard against two different types of attacks for exploiting AI chatbots.
  • Prompt injection attacks have emerged as a significant challenge, where malicious actors try to manipulate an AI system into doing something outside its intended purpose.

Microsoft Unveils Prompt Shield 

Microsoft announces a new capability that will be available in its Azure AI Studio and Azure OpenAI Service, which are used to build custom Copilots and generative AI applications. The device, called Prompt Shields, is intended to protect AI chatbots from two distinct kinds of threats. 

The company is ramping up its Azure AI services to prevent people from tricking chatbots into performing unintended and illegal tasks.

Prompt Injection Attacks

The first type of attack is known as a direct attack, or a jailbreak. In this scenario, the person using the chatbot writes a prompt directly designed to manipulate the AI into doing something that goes against its normal rules and limitations. For example, someone may write a prompt with such keywords or phrases as “ignore previous instructions” or “system override” to intentionally bypass security measures.

“Prompt injection attacks have emerged as a significant challenge, where malicious actors try to manipulate an AI system into doing something outside its intended purpose, such as producing harmful content or exfiltrating confidential data,” Sarah Bird, chief product officer of Responsible AI at Microsoft, wrote in the post.

“In addition to mitigating these security risks, organizations are also concerned about quality and reliability,” added Bird in the post. “They want to ensure that their AI systems are not generating errors or adding information that isn’t substantiated in the application’s data sources, which can erode user trust.”

Copilot AI

In February, Microsoft’s Copilot AI got into hot water after including nasty, rude, and even threatening comments in some of its responses, according to Futurism. In certain cases, Copilot even referred to itself as “SupremacyAGI,” acting like an AI bot gone haywire. When commenting on the problem, Microsoft called the responses “an exploit, not a feature,” stating that they were the result of people trying to intentionally bypass Copilot’s safety systems.

The second type of attack is called an indirect attack (also known as an indirect prompt attack or a cross-domain prompt injection attack). Here, a hacker or other malicious person sends information to a chatbot user with the intention of pulling off some type of cyberattack. This one typically relies on external data, such as an email or document, with instructions designed to exploit the chatbot.

Like other forms of malware, indirect attacks may seem like simple or innocent instructions to the user, but they can pose specific risks. A custom Copilot created through Azure AI could be vulnerable to fraud, malware distribution, or the manipulation of content if it’s able to process data, either on its own or through extensions, Microsoft said.

To try to thwart both direct and indirect attacks against AI chatbots, the new Prompt Shields will integrate with the content filters in the Azure OpenAI Service. Using machine learning and natural language processing, the feature will attempt to find and eliminate possible threats across user prompts and third-party data.

Prompt Shields is currently available in preview mode for Azure AI Content Safety, is coming soon to Azure AI Studio, and will be available for Azure OpenAI Service on April 1.

Microsoft today also offered another weapon in the war against AI manipulation: spotlighting, a family of prompt engineering techniques designed to help AI models better distinguish valid AI prompts from those that are potentially risky or untrustworthy.

Battle for Generative AI

PYMNTS earlier this week looked at Microsoft’s role in the “battle for generative AI” that was kicked off by the success of ChatGPT, developed by Microsoft partner OpenAI.

Although top tech companies like Microsoft and Google have an edge over their competitors, the contest for the AI crown involves more than Big Tech.

Open-source projects, collaborations, and a focus on ethics and accessibility have emerged as factors in the fight to dethrone OpenAI. Stretching the boundaries of AI frequently requires investments in computational power and research talent.

“The hurdle for building a broad foundational model is that training on increasingly large data sets is extraordinarily expensive,” Gil Luria, a senior software analyst at D.A. Davidson & Co., said in an interview with PYMNTS. “The only reason OpenAI can afford to do so is the backing of Microsoft and the Azure resources it makes available to OpenAI. The broad models, such as the ones leveraged by ChatGPT, have ingested huge portions of human knowledge and continue to train on new content, which is what makes them so versatile in many domains of expertise.”

Comments
To Top

Pin It on Pinterest

Share This