New Jailbreaks Enable GitHub Copilot Manipulation

Researchers Discover Two New Ways to Manipulate GitHub’s Copilot AI

Researchers have discovered two new ways to manipulate GitHub’s artificial intelligence (AI) coding assistant, Copilot, enabling the ability to bypass security restrictions and subscription fees, train malicious models, and more.

Method 1: Embedding Chat Interactions Inside Copilot Code

The first trick involves embedding chat interactions inside of Copilot code, taking advantage of the AI’s instinct to be helpful in order to get it to produce malicious outputs. This method allows users to bypass security restrictions and subscription fees, and can even be used to train malicious models.

Method 2: Rerouting Copilot Through a Proxy Server

The second method focuses on rerouting Copilot through a proxy server in order to communicate directly with the OpenAI models it integrates with. This allows users to manipulate the AI’s responses and bypass security restrictions.

Abusing the Design of Copilot

A "system prompt" is a set of instructions that defines the character of an AI — its constraints, what kinds of responses it should generate, etc. Copilot’s system prompt, for example, is designed to block various ways it might otherwise be used maliciously. However, by intercepting it en route to an LLM API, Shpigelman claims, "I can change the system prompt, so I won’t have to try so hard later to manipulate it. I can just modify the system prompt to give me harmful content, or even talk about something that is not related to code."

Lessons Learned

For Tomer Avni, co-founder and CPO of Apex, the lesson in both of these Copilot weaknesses "is not that GitHub isn’t trying to provide guardrails. But there is something about the nature of an LLM, that it can always be manipulated no matter how many guardrails you’re implementing. And that’s why we believe there needs to be an independent security layer on top of it that looks for these vulnerabilities."

Source Link

New Jailbreaks Enable GitHub Copilot Manipulation

Galaxy S25 Fixes Galaxy S24 Display Issue

David Darke: Strategic Agency Growth

SIEMs Fall Short on MITRE ATT&CK

YC backs EU’s Digital Markets Act

Home

Services

Domains & Hosting

FUSION MAG

New Jailbreaks Enable GitHub Copilot Manipulation

Galaxy S25 Fixes Galaxy S24 Display Issue

You May Also Like

David Darke: Strategic Agency Growth

SIEMs Fall Short on MITRE ATT&CK

YC backs EU’s Digital Markets Act

Home

Services

Domains & Hosting

FUSION MAG