Is Coding with AI Secure? A Guide to Safe AI-Assisted Development

In a recent Gartner survey of more than 240 senior enterprise executives, generative AI was the second most frequently named emerging risk for companies. In software development, the concerns appear to be justified.

A 2023 study on the use of coding tools powered by artificial intelligence (AI) found that developers who had access to these tools wrote significantly less secure code than those without access to AI tools.

Furthermore, developers using AI coding assistants were more likely to believe their code was secure, while those who relied less on AI wrote code with fewer security vulnerabilities.

Security is emerging as a top priority for businesses, with 93% of information technology (IT) and security leaders polled by the Neustar International Security Council (NISC) stating that investments in development, security, and operations (DevSecOps) will be prioritized in the coming year.

And while generative AI is certainly not the only security threat companies face, it’s one of the most rapidly evolving ones. To understand what you can do to mitigate the risks of coding with AI, it’s first essential to know why code security is important and what threats AI-assisted coding potentially brings.

Why is Code Security Important?

Insecure code can lead to security vulnerabilities, which, in the worst-case scenario, lead to data breaches. According to IBM, the cost of data breaches for companies has risen consistently over the last several years and continues to do so.

Their 2023 Cost of a Data Breach report found that the worldwide average data breach cost in 2023 was $4.45 million, a $100,000 increase from their 2022 report. Since 2020, the average overall data breach cost has increased by 15.3%.

A breakdown of the expenses shows that companies spend the most on data breach detection and escalation. This trend points to an increase in the complexity of attacks and their corresponding investigations.


Another sign of the increased financial impact of data breaches is that, according to IBM, the cost of notifying customers and any other affected parties increased by almost 20% since 2022.

Loss of business is another significant cost related to breaches, as they often damage an organization’s reputation and erode customer, partner, and stakeholder trust.

Security breaches can also impact your business continuity and cause operational disruption. If your applications and services aren’t available and functional, the resulting downtime and expenses related to service restoration can also cause you to lose clients and partners.

Another, often overlooked, consequence of security struggles is the stress it causes within your organization. Gartner predicts that almost 50% of cybersecurity leaders will change jobs by 2025 due to work-related stress.

Using AI Coding Tools: What Are the Potential Risks?

The risks of writing code with AI are already top of mind for organizations. A recent Gartner survey shows that 57% of company leaders are concerned about leaked secrets in AI-generated code, and 58% say incorrect or biased outputs from AI-powered coding tools are their most significant concern.

It’s important to remember that AI tools that generate code are trained on existing data and information—mainly publicly available open-source code of varying quality, reliability, and security.

And while we’ve seen that AI tools can help coders and many other professionals increase productivity by automating manual and repetitive tasks, they can be just as valuable to cybercriminals.


One emerging security concern is AI’s tendency to present fabricated information, which can result from being trained on outdated data. Security platform Vulcan recently published a study documenting so-called “hallucinations” in the output of the large language models (LLMs) behind OpenAI’s popular ChatGPT.

The study identified an attack technique called “AI package hallucination,” in which ChatGPT generated questionable code snippets as fixes to common vulnerabilities and exposures (CVEs), sometimes recommending coding libraries that don’t actually exist. Hackers can then hijack these hallucinated library names by publishing a malicious package under them and waiting for developers to install the “fix” the AI suggested.
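
A lightweight defense is to verify that any library an AI assistant recommends is actually published before you install it. The sketch below is a minimal example that assumes the public PyPI JSON API at pypi.org; it only confirms that a name exists, so provenance and maintainer reputation still need a human look, especially since attackers may already have squatted a hallucinated name.

```python
# Minimal sketch: confirm an AI-suggested package name is actually published on
# PyPI before installing it. Assumes the public JSON API (pypi.org/pypi/<name>/json);
# adapt the URL for private or mirrored indexes. Existence alone is not proof of
# safety: review the project before trusting it.
import urllib.error
import urllib.request


def package_exists_on_pypi(name: str) -> bool:
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        # A 404 here means no package with this name is published.
        return False


if __name__ == "__main__":
    for suggested in ["requests", "some-hallucinated-package-name"]:
        if package_exists_on_pypi(suggested):
            print(f"{suggested}: published (still review before installing)")
        else:
            print(f"{suggested}: not found, likely a hallucinated dependency")
```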

This unique attack method is just one of many ways cybercriminals are getting creative with generative AI.

Hackers can use AI code generators to create and carry out attacks, but even routine, everyday use by developers can produce incorrect and insecure outputs that open the door to several common types of attacks, which often result in security breaches, including:

  • Phishing attacks: According to the 2023 IBM data breach report, 15% of reported data breaches in 2023 resulted from phishing—social engineering attacks in which hackers try to trick you into providing access to your most sensitive data. Cloud data analytics platform Netenrich recently discovered an AI bot sold on Dark Web marketplaces called “FraudGPT” that claims to be able to write malicious code and create phishing pages.

  • Injection attacks: If your AI-generated code doesn’t correctly validate or sanitize user inputs, it can lead to SQL injection, cross-site scripting (XSS), or other injection attacks (see the parameterized-query sketch after this list).

  • Insecure dependencies: If the AI-generated code relies on third-party libraries or dependencies that are outdated or have known vulnerabilities, your security can be compromised.

  • Model poisoning and adversarial attacks: Attackers may attempt to manipulate AI models by injecting malicious data during training or by providing inputs specifically crafted to mislead the model into generating malicious code. According to IBM’s report, malicious software (malware) continues to be the most severe data breach threat, particularly ransomware, which had an average cost of $5.11 million per breach in 2022.

  • Resource exhaustion: Poorly optimized AI models or generated code may consume excessive computational resources, leading to resource exhaustion attacks, such as denial-of-service (DoS), which, according to Zayo’s 2023 report, saw a 200% increase over the last year.
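
To make the injection risk concrete, here is a minimal, self-contained sketch using Python’s standard-library sqlite3 module. The table and the input value are hypothetical; the point is the difference between concatenating user input into a query string (the kind of shortcut AI-generated code sometimes takes) and passing it as a bound parameter.

```python
# Minimal sketch of an SQL injection flaw and its fix, using Python's built-in
# sqlite3 module. The table, column, and input values are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users (email) VALUES ('alice@example.com')")

# Input an attacker might supply instead of a real email address.
user_input = "nobody@example.com' OR '1'='1"

# Vulnerable: the input is concatenated straight into the SQL string, so the
# attacker's OR clause becomes part of the query and matches every row.
vulnerable = f"SELECT id, email FROM users WHERE email = '{user_input}'"
print("concatenated query returns:", conn.execute(vulnerable).fetchall())

# Safer: a parameterized query lets the driver treat the input strictly as data.
safe = "SELECT id, email FROM users WHERE email = ?"
print("parameterized query returns:", conn.execute(safe, (user_input,)).fetchall())
```

The same principle applies to any database driver or ORM: prefer bound parameters or prepared statements over string building, whether the code was written by a person or by an AI assistant.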

Addressing these security concerns requires a holistic approach. If your organization wants to introduce AI code completion tools into development, close collaboration between your team’s developers, DevOps engineers, and cybersecurity experts is required.

Mitigating the Risks of AI-Assisted Coding

Like it or not, developers are using AI to code. According to our State of Software Quality survey, 64% of developers have already integrated AI into their code production workflows, and 62% use AI to review their code.

The biggest problem with AI code is that it usually flies under the radar. When a developer commits code, it’s unclear what portion was written by them and which was written by AI. In a sense, you’re already allowing a trust escalation attack, where code written by external actors gets elevated permissions to be committed by an internal author.

This lack of clarity makes a separate, higher-attention review process for AI code impossible because it’s intermingled with human code. Even when developers self-review their code, many weaknesses can slip through: missing error handling, inconsistency with project idioms and rules, slow performance, and more.

To ensure the successful and secure integration of AI in your development workflows, it’s essential to create a comprehensive plan beforehand.

Vet AI Tools Thoroughly

Begin by understanding your organization’s specific requirements and goals for AI-assisted coding. Identify the coding tasks where AI can provide the most value before exploring and evaluating available AI tools and platforms.

Pay close attention to their features, security track record, and compatibility with your existing tech stack. Look for an AI coding assistant that prioritizes security, offers regular updates, and has a strong reputation for addressing vulnerabilities promptly.

Ensure that the chosen AI tools comply with relevant regulations and industry standards, especially if your software deals with sensitive data. Check for certifications and compliance with relevant data protection and security standards like the General Data Protection Regulation (GDPR) and the Health Insurance Portability and Accountability Act (HIPAA).

Once implemented, prioritize keeping AI tools up-to-date with the latest security patches from the vendor.

Create and Implement Staff Training

Develop training programs to familiarize your development team with the chosen AI tools. Focus on both their practical use and potential security implications.

Ensure your staff knows about the security risks associated with AI-assisted coding. Since AI technology evolves rapidly, reevaluate your training continuously and update it when necessary.

Define Code Review and Validation Procedures

Define and implement coding standards and guidelines that the AI code generation tool should adhere to, and continuously monitor and assess the quality of the code it produces.

Utilize automated static code analysis tools to help identify potential vulnerabilities and security issues more effectively. Establish feedback mechanisms for developers to report issues and provide feedback on AI-generated code quality after performing automated code reviews.
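
As one illustration, a lightweight gate like the sketch below can run an open-source scanner over AI-assisted changes before they merge. It assumes the Bandit static analyzer for Python is installed and that your source lives in a src directory; substitute whatever analyzer and paths your stack actually uses.

```python
# A minimal CI-style gate, sketched with Python's subprocess module: run the
# open-source Bandit static analyzer over the codebase and stop the pipeline
# if it reports issues. The "src" path is a placeholder for your own layout.
import subprocess
import sys

result = subprocess.run(
    ["bandit", "-r", "src", "-lll"],  # -lll: report only high-severity findings
    capture_output=True,
    text=True,
)
print(result.stdout)

if result.returncode != 0:
    # Bandit exits non-zero when findings are reported, so treat that as a
    # failed check and block the merge until the issues are reviewed.
    sys.exit("Static analysis flagged high-severity issues in this change.")
```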

Implement version control systems to track code changes and regularly back up your codebase to protect against accidental data loss or corruption.

Create an Incident Response Plan

Develop clear incident response procedures outlining how to detect, evaluate, and respond to security incidents related to AI-generated code.

Establish a communication plan to notify stakeholders, including customers and regulatory authorities, in case of a security breach or data leak.

Conduct regular incident response drills to ensure your team is well-prepared to handle AI-related security incidents.

Take Responsibility for Your Software’s Security

Ultimately, if your IT leadership team approves the introduction of AI tools into the development process, it must also be responsible for dealing with any potential pitfalls.

A recent Gartner survey found that 93% of IT and security leaders are already involved in their company’s AI risk management processes, while 24% said they own this responsibility outright.

And while AI tools come with inherent risks, they can also help mitigate many of them. Of the organizations polled by Gartner, 34% said they are already using AI security tools to minimize the risks of generative AI.


CodacyAI is one such tool. Remember, if you’re analyzing your code quality and trying to catch errors and vulnerabilities in the build stage, you’re already too late. The best phase for meaningful code analysis and correction is during pull request reviews.

CodacyAI runs on top of our powerful analysis engine, catching code errors and providing actionable suggestions for issue resolution. And unlike most AI coding tools, our Quality AI model never uses your private code for training.

If your team uses AI tools for development and is looking for an AI-powered partner to identify and fix security vulnerabilities, check out Codacy with a 14-day free trial.
