Using AI to Normalize and Improve Scan Rule Documentation at Codacy

At Codacy, we integrate 34 open-source tools to provide insights into code quality and security across 43 different programming languages. Many of these tools come with plugins and add-ons, giving our users access to over 22,000 unique rules. These rules are frequently updated: new ones are added, and existing ones are improved or deprecated.
To help users navigate this ever-evolving landscape, Codacy imports and displays the official documentation for each tool’s rules. But this raises a critical question:
How can we consistently provide clear, up-to-date, and standardized documentation for thousands of rules coming from dozens of different sources?
The Challenge
There are several challenges:
- Tools don't follow a common documentation format; some provide code examples, others don't.
- The level of detail and tone can vary wildly between tools.
- Plugins and add-ons often lack the same documentation standards as their parent linters.
Beyond displaying documentation, Codacy also normalizes rules by assigning categories and severities and by tagging each rule as "recommended" or not. However, many tools don’t define these attributes themselves, or when they do, they may use different criteria than Codacy. As a result, our team had to invest significant manual effort to:
- Curate and enrich documentation for thousands of rules,
- Implement complex logic to import and standardize content,
- Continuously adapt to tool updates and changes.
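To make the normalization target concrete, here is a minimal sketch of what a normalized rule record could look like. The class and field names (`NormalizedRule`, `needs_curation`) are hypothetical and are not Codacy's actual data model; the category and severity given to the sample rule are illustrative values only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Severity(Enum):
    MINOR = "Minor"
    MEDIUM = "Medium"
    HIGH = "High"
    CRITICAL = "Critical"


@dataclass(frozen=True)
class NormalizedRule:
    """Hypothetical record: one rule from one tool, with normalized metadata."""
    tool: str
    rule_id: str
    title: str
    category: str
    severity: Severity
    recommended: bool
    description: Optional[str] = None  # None when the tool ships no explanation


def needs_curation(rule: NormalizedRule) -> bool:
    """Flag rules whose imported documentation is missing or blank."""
    return not (rule.description and rule.description.strip())


# GoSec's G101 arrives with a title but no explanation.
# The category and severity below are illustrative assumptions.
g101 = NormalizedRule(
    tool="GoSec",
    rule_id="G101",
    title="Look for hardcoded credentials",
    category="Security",
    severity=Severity.HIGH,
    recommended=True,
)
print(needs_curation(g101))  # True
```

A record like this makes the gaps explicit: any rule where `needs_curation` returns `True` is one that previously required manual enrichment.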
Using AI to Normalize Documentation
To reduce this manual workload and improve consistency, we built and trained an AI assistant to help us automatically normalize and enhance rule documentation.
Here are a few examples of the improvements we’ve achieved—before and after applying the AI assistant:
Analyzer | Rule before improvements | Rule after improvements |
GoSec (Golang) | G101: Look for hardcoded credentials (No explanation) | Avoid Hardcoded Credentials. Detects hardcoded credentials in the code. Why is this a problem? Hardcoded credentials can lead to security vulnerabilities as they may be exposed in source code repositories or logs, allowing unauthorized access to systems. How do you fix it? Remove hardcoded credentials from the code and use secure methods such as environment variables, configuration files with proper access controls, or secret management tools to handle sensitive information. |
Pylint (Python) | not-in-loop (E0103): %r not properly in loop. Used when break or continue keywords are used outside a loop. Explanation: %r not properly in loop. Used when break or continue keywords are used outside a loop. | Avoid using break or continue outside loops. Detects usage of break or continue statements outside of loops. Why is this a problem? Using break or continue outside of a loop causes a syntax error and indicates a logical mistake in the code structure. How do you fix it? Ensure that break or continue statements are only used inside loops such as for or while loops. |
Revive (Golang) | time-equal: This rule warns when using == and != for equality check time.Time and suggest to time.time.Equal method, for about information follow this link. Explanation: Description: This rule warns when using == and != for equality check time.Time and suggest to time.time.Equal method, for about information follow this link Configuration: N/A | Enforce time.Time Equality Using Equal Method. Warns when using == or != operators to compare time.Time values and suggests using the time.Time.Equal method instead. Why is this a problem? Using == or != operators to compare time.Time values can lead to incorrect results because these operators compare the entire struct including unexported fields, which may not reflect actual time equality. The time.Time.Equal method correctly compares the time instant represented by the values. How do you fix it? Replace any usage of == or != operators for comparing time.Time values with the time.Time.Equal method to ensure correct equality checks. Good Examples Bad Examples |
Cppcheck (C/C++) | Array index 'i' is used before limits check. Defensive programming: The variable 'i' is used as an array index before it is checked that is within limits. This can mean that the array might be accessed out of bounds. Reorder conditions such as '(a[i] && i < 10)' to '(i < 10 && a[i])'. That way the array will not be accessed if the index is out of limits. (No explanation) | Avoid Using Array Index Before Limits Check. Detects when an array index variable is used before confirming it is within valid bounds, which can lead to out-of-bounds array access. Why is this a problem? How do you fix it? Good Examples Bad Examples |
Semgrep (22 languages) | unsafe-path-combine: String argument $A is used to read or write data from a file via Path.Combine without direct sanitization via Path.GetFileName. Explanation: String argument $A is used to read or write data from a file via Path.Combine without direct sanitization via Path.GetFileName. If the path is user-supplied data this can lead to path traversal. | Avoid Unsafe Path.Combine Usage Without Sanitization. Detects usage of Path.Combine with string arguments that are not sanitized via Path.GetFileName, which can lead to path traversal vulnerabilities. Why is this a problem? Using Path.Combine with unsanitized user-supplied input can allow attackers to perform path traversal attacks, potentially accessing or modifying unauthorized files on the filesystem. How do you fix it? Ensure that any string arguments used in Path.Combine are properly sanitized using Path.GetFileName or equivalent methods before combining paths to prevent path traversal. Good Examples Bad Examples |
Checkov | Ensure that SSL validation isn't disabled with dnf. Explanation: More information [here]. | Ensure SSL Validation Is Not Disabled with dnf. Checks that SSL validation is enabled when using dnf to avoid insecure package downloads. Why is this a problem? How do you fix it? Good Examples Bad Examples |
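The improved entries above share a recurring layout: a title, a one-line summary, "Why is this a problem?", "How do you fix it?", and optional good and bad examples. As a rough sketch of how such a layout could be assembled, the hypothetical helper below (`render_rule_doc` is an illustrative name, not part of Codacy's tooling) joins those sections into plain text, using the GoSec G101 wording from the table.

```python
def render_rule_doc(title, summary, problem, fix, good_examples=(), bad_examples=()):
    """Assemble the recurring sections of an improved rule page as plain text."""
    lines = [
        title,
        summary,
        "Why is this a problem?",
        problem,
        "How do you fix it?",
        fix,
    ]
    # Example sections are optional; they are only emitted when examples exist.
    if good_examples:
        lines.append("Good Examples")
        lines.extend(good_examples)
    if bad_examples:
        lines.append("Bad Examples")
        lines.extend(bad_examples)
    return "\n".join(lines)


doc = render_rule_doc(
    title="Avoid Hardcoded Credentials",
    summary="Detects hardcoded credentials in the code.",
    problem=(
        "Hardcoded credentials can lead to security vulnerabilities as they may "
        "be exposed in source code repositories or logs."
    ),
    fix="Use environment variables or secret management tools instead.",
)
print(doc)
```

Keeping the section headings fixed is what makes entries from different tools read alike, regardless of how the upstream documentation was written.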
Smarter Categorization and Tagging with AI
Beyond improving the documentation itself, our AI assistant also helped tackle one of the biggest pains in managing thousands of rules: consistent categorization and tagging.
In the past, assigning a category (e.g., "Code Style", "Security", "Performance"), a severity level (e.g., "Minor", "Medium", "Critical"), or identifying relevant topics (like "ReactJS" or "Accessibility") relied on a mix of hardcoded heuristics, keyword matching, and manual overrides. This approach was fragile, hard to scale, and often missed the nuance of what each rule was truly about.
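To illustrate why keyword matching is fragile, here is a minimal sketch of that style of heuristic. The keyword table and the "Error Prone" fallback are assumptions made for this example, not Codacy's former logic.

```python
# Hypothetical keyword table; a real one would be far longer and still incomplete.
KEYWORD_CATEGORIES = [
    ("credential", "Security"),
    ("injection", "Security"),
    ("unused", "Code Style"),
]


def categorize_by_keyword(description: str, default: str = "Error Prone") -> str:
    """Return the category of the first matching keyword, or a fallback."""
    text = description.lower()
    for keyword, category in KEYWORD_CATEGORIES:
        if keyword in text:
            return category
    return default


# A rule that happens to contain a listed keyword is categorized correctly...
print(categorize_by_keyword("Look for hardcoded credentials"))  # Security

# ...but a security rule phrased differently silently falls through to the default.
print(categorize_by_keyword("Ensure that SSL validation isn't disabled with dnf"))  # Error Prone
```

Every miss like the second one needs either a new keyword or a manual override, which is the maintenance burden described above.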
Now, instead of relying on scripted rules to infer this metadata, we have an AI assistant that reads and understands each rule’s documentation, just like a human reviewer would. This means:
- Categories are assigned based on actual rule intent and context, not just keywords.
- Severities reflect the potential impact of the rule, taking into account language norms and best practices.
- Tags are more precise, helping users filter and discover rules relevant to their goals or concerns.
We no longer have to "guess" a rule's purpose based on pattern matching. Instead, we get a much richer and more coherent catalogue, with metadata that actually reflects the diversity and complexity of the tools we integrate.
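This post does not describe how the assistant is invoked, so the sketch below leaves the model call out entirely. It only shows, under assumed names and an assumed JSON reply format, how a prompt might be built from a rule's documentation and how a reply could be validated against a fixed vocabulary before being stored.

```python
import json

# Assumed vocabularies, taken from the categories and severities named in this post.
ALLOWED_CATEGORIES = {"Code Style", "Security", "Performance", "Error Prone"}
ALLOWED_SEVERITIES = {"Minor", "Medium", "High", "Critical"}


def build_prompt(tool: str, rule_title: str, documentation: str) -> str:
    """Build a classification prompt; the wording is purely illustrative."""
    return (
        f"Tool: {tool}\n"
        f"Rule: {rule_title}\n"
        f"Documentation: {documentation}\n"
        "Reply with JSON containing 'category', 'severity' and 'tags'.\n"
        f"Allowed categories: {sorted(ALLOWED_CATEGORIES)}\n"
        f"Allowed severities: {sorted(ALLOWED_SEVERITIES)}"
    )


def parse_metadata(raw_reply: str) -> dict:
    """Validate a model reply so only known categories and severities get through."""
    data = json.loads(raw_reply)
    if data.get("category") not in ALLOWED_CATEGORIES:
        raise ValueError(f"unknown category: {data.get('category')!r}")
    if data.get("severity") not in ALLOWED_SEVERITIES:
        raise ValueError(f"unknown severity: {data.get('severity')!r}")
    tags = data.get("tags", [])
    if not isinstance(tags, list) or not all(isinstance(tag, str) for tag in tags):
        raise ValueError("tags must be a list of strings")
    return {"category": data["category"], "severity": data["severity"], "tags": tags}


# Hypothetical reply for an accessibility rule; not real assistant output.
sample_reply = '{"category": "Error Prone", "severity": "High", "tags": ["Accessibility"]}'
print(parse_metadata(sample_reply))
```

Validating against a closed set keeps the catalogue coherent even when the model's free-form output varies between runs.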
Analyzer | Rule before improvements | Rule after improvements |
ESLint 9 | Lodash: Consistent compose. Lodash has two ways to compose functions: left to right ( Severity: Critical 🛑 | Enforce Consistent Lodash Compose Method. Using inconsistent function composition methods can lead to confusion and reduce code readability, as the order of function application differs between left-to-right and right-to-left composition styles. Choose one composition method (either left-to-right using flow/pipe or right-to-left using flowRight/compose) and consistently use it throughout the codebase. Severity: Minor ℹ️ |
ESLint 9 | Jsx a11y: Aria role. Enforce that elements with ARIA roles must use a valid, non-abstract ARIA role. Elements with ARIA roles must use a valid, non-abstract ARIA role. A reference to role definitions can be found at WAI-ARIA site. Severity: Minor ℹ️ | Enforce Valid ARIA Roles in JSX. Using invalid or abstract ARIA roles can lead to accessibility issues because assistive technologies rely on correct ARIA roles to interpret and interact with UI elements properly. Invalid roles may confuse users relying on such technologies and reduce the accessibility of the application. Use only valid, non-abstract ARIA roles as defined by the WAI-ARIA specification. Avoid empty or invalid role attributes. For custom roles, configure the rule to allow specific invalid roles if necessary, and optionally ignore non-DOM components if applicable. Severity: High ☣️ |
ESLint 9 | Vuejs accessibility: Anchor has content. Enforce that anchors have content and that the content is accessible to screen readers. Accessible means that it is not hidden using the aria-hidden prop. Refer to the references to learn about why this is important. Severity: Critical 🛑 | Enforce Anchor Has Accessible Content. Anchors without accessible content are problematic for users relying on screen readers, as they cannot understand the purpose or destination of the link, leading to poor accessibility and user experience. Make sure every anchor element contains readable content or accessible child elements that are not hidden with aria-hidden. Use text, components recognized as accessible children, or directives that provide accessible content. Severity: Medium ⚠️ |
PHP CodeSniffer (PHP) | DB: Restricted Functions (No description or explanation). Severity: Minor ℹ️ | Avoid Restricted Database Functions. Using restricted database functions can lead to security vulnerabilities, compatibility issues, or maintenance problems within WordPress projects. Replace restricted database functions with recommended alternatives that adhere to WordPress coding standards and best practices. Severity: High ☣️ |
To visualize the impact, we also analyzed the overall distribution of categories and severities before and after using the AI assistant. The charts below show how much better balanced and representative our metadata has become:
These improvements make it significantly easier for users to browse, filter, and focus on what matters most for their codebases—whether that’s fixing critical security issues, improving maintainability, or enforcing style consistency.
Take a single tool, Checkov, and the impact this change had on its patterns:
Previously, all Checkov patterns were labeled Medium severity; with the new classification, most become High, with smaller shares of Medium and Critical. They were also mostly categorized as Error Prone, with only some categorized as Security, even though most of the patterns address security problems. For example:
Analyzer | Rule before improvements | Rule after improvements |
Checkov (Infra-as-code) | Ensure terraform is not sending SSM secrets to untrusted domains over HTTP. Severity: Medium ⚠️ | Ensure Terraform Does Not Send SSM Secrets to Untrusted Domains Over HTTP. Severity: Critical 🛑 |
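A before-and-after comparison like the one described for Checkov boils down to computing the share of each severity label. The sketch below shows one way to do that; the two label lists are a hypothetical four-pattern sample chosen to mirror the described shift, not real Checkov figures.

```python
from collections import Counter


def severity_distribution(labels):
    """Return the share of each severity label, as fractions that sum to 1."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {severity: count / total for severity, count in counts.items()}


# Hypothetical sample of four patterns, not real Checkov numbers.
before = ["Medium", "Medium", "Medium", "Medium"]
after = ["High", "High", "Medium", "Critical"]

print(severity_distribution(before))  # {'Medium': 1.0}
print(severity_distribution(after))   # {'High': 0.5, 'Medium': 0.25, 'Critical': 0.25}
```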
Where We Are and What’s Next
The improvements powered by our AI assistant are already making a difference. The updated rule documentation and enriched tagging are now live and available in the Codacy platform. Users can see these enhancements when browsing and selecting patterns, and also when viewing details for issues detected in their code.
Categories and severities haven’t yet been updated in the product; we are currently validating these changes to ensure they can be rolled out safely and without disrupting workflows. This step is critical, as these attributes are tied to many parts of the Codacy experience.
Once the new severities and categories are deployed, users will begin to see gradual changes in their dashboards and historical metrics as their repositories are reanalyzed. These shifts reflect a more accurate and consistent classification of rules, which will help teams focus on what truly matters—whether that’s fixing critical issues or improving long-term maintainability.
This update will also directly impact our Coding Standard assistant, which uses categories and severities to recommend a tailored set of rules for each team. With better metadata in place, the assistant will be able to generate more relevant and effective default configurations.
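As a rough illustration of why metadata quality matters here, the sketch below selects rules by category and minimum severity. It is not the Coding Standard assistant's actual logic: the function name, the rule identifiers, and the categories assigned to the sample rules are all assumptions; only the severities come from the "after" columns above.

```python
# Severities ordered from least to most severe.
SEVERITY_ORDER = ["Minor", "Medium", "High", "Critical"]


def recommend_rules(rules, focus_categories, min_severity="Medium"):
    """Keep the ids of rules in the requested categories at or above a severity."""
    threshold = SEVERITY_ORDER.index(min_severity)
    return [
        rule["id"]
        for rule in rules
        if rule["category"] in focus_categories
        and SEVERITY_ORDER.index(rule["severity"]) >= threshold
    ]


# Hypothetical ids and categories; severities follow the "after" values in this post.
rules = [
    {"id": "checkov-ssm-secrets-http", "category": "Security", "severity": "Critical"},
    {"id": "lodash-consistent-compose", "category": "Code Style", "severity": "Minor"},
    {"id": "phpcs-db-restricted-functions", "category": "Security", "severity": "High"},
]

print(recommend_rules(rules, {"Security"}, min_severity="High"))
# ['checkov-ssm-secrets-http', 'phpcs-db-restricted-functions']
```

With the old labels (Medium for the Checkov rule, Minor for the PHP CodeSniffer rule), the same filter would have returned nothing, which is why more accurate severities translate directly into better default configurations.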
We’re excited about what this unlocks: a smarter, more intuitive Codacy, built to help developers make better decisions, faster. And this is just the beginning—by combining automation with AI understanding, we’re laying the foundation for the next generation of intelligent developer tooling.