Ask 10 developers what code quality means, and you’ll get 15 different answers.
Code quality is an elusive subject. Sometimes, it means the style of the code on the page. Sometimes, it means the simplicity of the code. Sometimes, it means the robustness of the code. Sometimes, it means something completely different.
This ambiguity is a problem if you are trying to design and build a product dedicated to helping developers improve the quality of their code. This is why, at Codacy, we have a precise, quantitative understanding of code quality. We believe that without strongly defining code quality, a development team can’t possibly deploy high-quality code.
Here, we will take you through our definition of code quality at Codacy to help you better understand what high-quality code means, whether you use Codacy or not.
The 4 Core Code Quality Metrics
At Codacy, we evaluate four critical areas to assess the overall quality of your codebase:
- The issues within your codebase
- The complexity of your codebase
- The amount of duplication in your codebase
- The code coverage of your codebase
Let’s go through each of these in detail to truly understand code quality.
The Quality Issues With Your Code
“Issues” is a catch-all term in static code analysis encompassing many potential problems affecting the code's quality, security, and performance.
This is what makes static code analysis a vital tool in software development. No developer can get everything correct, and even senior developers can’t catch every issue in code review. Automating this process almost immediately increases the quality of your code.
At Codacy, here are the issues we look for.
Code Style
This will differ from programming language to programming language, but we’re looking at the code formatting and syntax quality here. Are the variable names styled correctly? Is there the correct spacing? The correct brackets? The right quotation marks?
Here’s a somewhat realistic example of code style issues, incorporating common mistakes developers might make, including naming conventions and other stylistic concerns.
def CalculateSum(x,y):
    result=x+y  # Missing spaces around operators
    if(result>10):print("Result is big")  # Improper spacing and inline statement
    return result
Here are the stylistic issues in even this short block of code:
- Function naming. The function name CalculateSum does not follow the Python convention of using snake_case for function names. It should be calculate_sum.
- Parameter spacing. There's a lack of space after the comma in the function parameters (x,y), which should be (x, y).
- Operator spacing. The expression result=x+y lacks spaces around the = and + operators. It should be result = x + y.
- Conditional parentheses. Python does not require parentheses around conditions. The statement if(result>10) is not wrong but is not idiomatic Python. It should be if result > 10.
- Inline print statement. The print statement sits on the same line as the if condition. Python favors readability, and it would be clearer on a separate line.
- Missing documentation. There are no comments or docstrings explaining the function's purpose and logic, which becomes even more problematic with complex logic.
Here’s the same function styled correctly:
def calculate_sum(x, y):
    """
    Calculate the sum of two numbers and print a message if the sum is large.

    Args:
        x (int): The first number.
        y (int): The second number.

    Returns:
        int: The sum of the two numbers.
    """
    result = x + y
    if result > 10:
        print("Result is big")
    return result
The corrections:
- Function naming. The function name is in snake_case, adhering to Python's naming convention.
- Parameter spacing. Proper spacing is used in the function parameters.
- Operator spacing. Spaces are added around operators for better readability.
- Conditional parentheses. The condition is written in Python's idiomatic style without unnecessary parentheses.
- Inline print statement. The print statement is on a separate line, improving readability.
- Documentation. A docstring is added to explain the function's purpose, parameters, and return type, enhancing clarity and maintainability.
This high-quality example follows Python's PEP 8 style guide, essential for maintaining readability and consistency in Python codebases.
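Checks like these don’t need to happen by hand. As a minimal sketch, here’s how you might run the third-party pycodestyle package against a file containing the low-quality snippet above (the filename is hypothetical):

import pycodestyle

style = pycodestyle.StyleGuide()
# check_files prints each violation it finds (e.g., missing whitespace after ',')
result = style.check_files(["bad_style.py"])  # hypothetical file with the snippet above
print(f"{result.total_errors} style issue(s) found")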
Error Prone
Error-prone code produces bugs at runtime or compile time. Static code analysis can find these problems before they cause failures.
Here’s the type of error you might find in Python code:
def get_user_data(user_list, user_id):
    try:
        return [user for user in user_list if user["id"] == user_id][0]
    except IndexError:
        return None
This function aims to find a user in a list of user dictionaries based on the user's ID. It uses a list comprehension followed by indexing [0] to get the first match. However, this approach is error-prone because it assumes there will always be at least one matching user, leading to an IndexError if no matches are found.
If get_user_data is called with a user_id that does not exist in user_list, the list comprehension will result in an empty list. Attempting to access the first element of this empty list with [0] will raise an IndexError. Although the function uses a try-except block to handle this, relying on exception handling for normal control flow is not a best practice and can lead to less readable and potentially error-prone code.
A better example:
def get_user_data(user_list, user_id):
    return next((user for user in user_list if user["id"] == user_id), None)
This improved version uses the next() function with a generator expression. It's a more Pythonic way to achieve the same goal, and it safely returns None if no matching user is found without the risk of an IndexError.
This approach is clearer and avoids the pitfalls of the initial implementation.
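To see the behavior, here’s a quick sketch with hypothetical user records:

users = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]

print(get_user_data(users, 2))   # {'id': 2, 'name': 'Grace'}
print(get_user_data(users, 99))  # None, with no exception handling required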
Performance
If code has performance issues, this is a problem for end users and the organization. For end users, performance issues in code can lead to slow and frustrating experiences, often resulting in dissatisfaction and disengagement with the product.
For organizations, these performance problems damage reputation and user retention and increase operational costs due to inefficiencies and the need for more computing resources.
It can be worryingly easy for individual pieces of code to cause huge performance problems. Take this seemingly simple Python function:
def find_duplicates(items):
    return [x for x in items if items.count(x) > 1]
The function find_duplicates uses a list comprehension to iterate over each element of items. For each element x, it calls items.count(x) to count how many times x appears in items. The count method iterates over the entire list to find matching elements, giving a linear time complexity O(n) for each call. Since count is called for every element in items, the overall time complexity is O(n²), where n is the length of items.
This quadratic time complexity means that the function's execution time increases significantly as the size of items grows, making it inefficient for large lists.
This version uses collections.Counter to count occurrences of each element in items.
from collections import Counter

def find_duplicates(items):
    item_count = Counter(items)
    return [item for item, count in item_count.items() if count > 1]
Creating the Counter object involves iterating over items once, resulting in O(n) time complexity. The Counter object item_count is a dictionary mapping each element to its count. The list comprehension then iterates through item_count.items(), which is O(n) in the worst case (if all elements are unique). For each item, the list comprehension checks if the count is greater than 1 and adds it to the resulting list if true.
Since both major operations (creating Counter and iterating through its items) are linear, the overall time complexity remains O(n). This linear time complexity is more efficient, especially for large lists, as it does not involve nested iterations like in the low-quality example.
Assume we have a dataset where n is the number of elements:
- For n = 10 (a small dataset):
  - O(n²) complexity: 10² = 100 operations.
  - O(n) complexity: 10 operations.
- For n = 100 (a moderate dataset):
  - O(n²) complexity: 100² = 10,000 operations.
  - O(n) complexity: 100 operations.
- For n = 1,000 (a large dataset):
  - O(n²) complexity: 1,000² = 1,000,000 operations.
  - O(n) complexity: 1,000 operations.
- For n = 10,000 (a very large dataset):
  - O(n²) complexity: 10,000² = 100,000,000 operations.
  - O(n) complexity: 10,000 operations.
So, for small datasets, the difference in performance might not be very noticeable. However, as the size of the dataset increases, the impact of the algorithm's complexity becomes dramatically more significant. At n = 10,000, an O(n²) algorithm requires 10,000 times more operations than an O(n) algorithm. This stark difference can lead to severe performance degradation, especially in larger datasets.
This comparison underscores the importance of choosing efficient algorithms, particularly for applications dealing with large amounts of data, as the choice can drastically affect the performance and scalability of the software.
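If you want to see the gap for yourself, here’s a rough benchmark sketch using Python’s timeit module (absolute timings will vary by machine):

import timeit
from collections import Counter

def find_duplicates_quadratic(items):
    return [x for x in items if items.count(x) > 1]

def find_duplicates_linear(items):
    item_count = Counter(items)
    return [item for item, count in item_count.items() if count > 1]

data = list(range(5_000)) * 2  # 10,000 items, every value appearing twice

print(timeit.timeit(lambda: find_duplicates_quadratic(data), number=1))  # seconds
print(timeit.timeit(lambda: find_duplicates_linear(data), number=1))     # typically orders of magnitude faster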
Compatibility
Compatibility is a problem that mostly manifests in front-end frameworks. Is your code going to be compatible with different browsers and browser versions?
The challenge lies in dealing with the varying levels of support for web standards and features across these platforms. For example, newer JavaScript ES6 features might work seamlessly in modern browsers but could cause issues in older versions. In a world where users access web content through diverse devices and browsers, ensuring compatibility is essential for providing a universally accessible and consistent user experience. Failure to do so can lead to parts of a website or application malfunctioning or being entirely unusable for a segment of users, significantly impacting user satisfaction and reach.
Take this code:
// Using ECMAScript 6 features
const getUser = (userId) => {
  fetch(`https://api.example.com/users/${userId}`)
    .then(response => response.json())
    .then(data => console.log(data))
    .catch(error => console.error('Error:', error));
};
This snippet uses ECMAScript 6 (ES6) features like arrow functions ((userId) => {...}) and template literals (the ${userId} interpolation in the URL string). While these features provide cleaner and more concise code, they are not supported in older browsers like Internet Explorer 11. If this script runs in a browser that doesn't support ES6, it will result in syntax errors, leading to the script's failure.
Additionally, the fetch API, used for making network requests, is not supported in Internet Explorer. In unsupported browsers, fetch will be undefined, and attempting to call it will throw an error.
Here is a more compatible version:
// Compatible with older browsers (ES5)
function getUser(userId) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', 'https://api.example.com/users/' + userId, true);
  xhr.onload = function() {
    if (this.status >= 200 && this.status < 300) {
      console.log(JSON.parse(xhr.responseText));
    } else {
      console.error('Request failed with status:', this.status);
    }
  };
  xhr.onerror = function() {
    console.error('Network error');
  };
  xhr.send();
}
This example is written in ECMAScript 5 (ES5), which is widely supported across all major browsers, including older ones like Internet Explorer. Instead of arrow functions, it uses traditional function declarations, which are universally supported. It concatenates strings using + instead of using template literals, ensuring compatibility with older JavaScript interpreters.
The XMLHttpRequest API is used in place of fetch. XMLHttpRequest has been supported in browsers for a long time and provides a way to make network requests compatible with almost all browsers.
By adhering to ES5 and using XMLHttpRequest, this code ensures broader compatibility, reducing the likelihood of encountering issues across different browser versions.
While newer JavaScript features offer various advantages, they can lead to compatibility issues in older browsers. Ensuring broad compatibility often involves using older standards and more universally supported APIs.
Unused Code
This is exactly what it says. Code is written that never gets called or used. This leads to unnecessary bloat and potential confusion in the codebase. Such redundant code can not only make maintenance more challenging but also detract from the overall efficiency and clarity of the application, potentially obscuring the flow and logic of functional code segments.
A low-quality example:
def greet(name):
    hello = "Hello, "  # Unused variable
    message = "Hello, " + name
    return message
The variable hello is declared but never used, contributing to code clutter and potential confusion.
A better example:
def greet(name):
    hello = "Hello, "
    return hello + name
This version eliminates the dead assignment by actually using the variable, making the code cleaner and more focused.
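Static analysis tools detect unused code by walking the program's syntax tree rather than running it. Here's a toy sketch of the idea using Python's built-in ast module (real analyzers handle far more cases):

import ast

SOURCE = '''
def greet(name):
    hello = "Hello, "  # assigned but never read
    message = "Hello, " + name
    return message
'''

def unused_locals(source):
    """A simplistic sketch: yield names assigned in a function but never read."""
    tree = ast.parse(source)
    for func in ast.walk(tree):
        if not isinstance(func, ast.FunctionDef):
            continue
        assigned, loaded = set(), set()
        for node in ast.walk(func):
            if isinstance(node, ast.Name):
                if isinstance(node.ctx, ast.Store):
                    assigned.add(node.id)
                else:
                    loaded.add(node.id)
        yield func.name, assigned - loaded

for name, unused in unused_locals(SOURCE):
    print(name, unused)  # prints: greet {'hello'}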
Security
Security is probably the most important quality component of code. You can get away with esoteric styling, but if you have security concerns within your codebase, you put your users and your company at risk.
If not promptly and properly addressed, security concerns in a codebase can lead to significant vulnerabilities, exposing users and the organization to potential data breaches, financial losses, and reputational damage. Development teams are already well aware of the importance of security. According to our recent State of Software Quality report, almost 89% of teams have a dedicated security team or person. You must implement and regularly update security best practices in coding, including input validation, encryption, and secure authentication protocols.
Here’s an example of one of OWASP’s top ten security threats, SQL injection:
import sqlite3

def get_user(user_id):
    conn = sqlite3.connect('example.db')
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM users WHERE id = " + str(user_id))  # SQL Injection risk
    return cursor.fetchone()
This code is vulnerable to SQL injection because it concatenates user input directly into the query string.
Instead, you can use parameterized queries to prevent SQL injection, securing the database operations.
import sqlite3

def get_user(user_id):
    conn = sqlite3.connect('example.db')
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))  # Parameterized query
    return cursor.fetchone()
The integration of continuous security testing and monitoring within the development lifecycle is not just a precaution but a fundamental aspect of responsible and sustainable software development in the modern digital landscape.
Documentation
The final issue we look for at Codacy is good documentation within the codebase. This isn't documentation in a README or on your website; we're looking for documented code.
Say you had a function like this:
def process_data(data):
    # Processing logic here
    return processed_data
This function lacks any documentation or comments explaining its purpose, parameters, return type, or the processing logic it employs. Without proper documentation, it's unclear what type of data the function expects (e.g., a list, dictionary, or another data structure), what it does with the data, and what it returns.
The absence of comments within the function body also makes it difficult for other developers (or even the original author at a later time) to understand the logic and intention behind the code, especially if the processing logic is complex. This lack of clarity can lead to misuse of the function, difficulties in debugging and maintaining the code, and challenges in onboarding new team members.
Contrast with this:
def process_data(data):
    """
    Processes the given data and returns the result.

    This function applies specific transformations to the input data.
    It's designed to work with data in the format of a dictionary where
    each key represents a data attribute.

    Args:
        data (dict): The data to be processed. Expected to be a dictionary
            with a specific structure.

    Returns:
        dict: The processed data, maintaining the same structure as the input.
    """
    # Detailed processing logic here
    return processed_data
This example includes a docstring that clearly explains the function's purpose, the expected format of the input data, and what the function returns. It specifies the parameter type (dict) and the return type, providing crucial information for developers using the function.
By documenting the expectations and behavior of the function, this example facilitates easier maintenance, debugging, and collaborative development. It also aids in auto-generating documentation tools and improves code readability.
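Because Python exposes docstrings at runtime, documentation written this way is immediately useful in the REPL and to documentation generators:

help(process_data)           # prints the docstring above
print(process_data.__doc__)  # the raw docstring text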
Proper documentation like this is essential in a collaborative development environment, especially for complex or non-intuitive functions. It is crucial in code for understanding, maintaining, and correctly utilizing functions, particularly in shared codebases and complex projects.
The Complexity of Your Code Impacts Quality
Complexity in code is often a silent detractor of quality, making maintenance and understanding challenging. Codacy's metrics around cyclomatic complexity provide insights into the intricacies of your codebase, highlighting areas that may benefit from simplification.
Cyclomatic complexity is calculated based on the program's control flow graph. This graph represents the flow of control in the program, where nodes correspond to blocks of code, and edges represent the flow between these blocks.
Cyclomatic complexity essentially measures the number of linearly independent paths through the program's source code. A higher number of these paths indicates a higher complexity and potentially more test cases needed to ensure adequate coverage.
- A value of 1 implies a simple program with no control flow decisions.
- A value between 1 and 10 indicates a program with moderate complexity, generally manageable and understandable.
- Values over 10 indicate higher complexity, suggesting a need for simplification or refactoring for better maintainability and testability.
The value of cyclomatic complexity can be used to estimate the required effort for testing. It gives the minimum number of paths you would need to test to ensure that each part of the program is executed at least once.
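You don't have to compute this by hand. As a minimal sketch, the third-party radon package (pip install radon) can report cyclomatic complexity for Python source:

from radon.complexity import cc_visit

source = """
def check(x):
    if x > 10:
        return "big"
    elif x > 5:
        return "medium"
    return "small"
"""

for block in cc_visit(source):
    # Each reported block carries a name and a cyclomatic complexity score.
    print(block.name, block.complexity)  # check 3: base 1 + if + elif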
Let’s go through a high-complexity, low-quality example:
def analyze_data(data, analysis_type, options):
    if analysis_type == "statistical":
        if options["mean"]:
            # calculate mean
            pass
        if options["median"]:
            # calculate median
            pass
        if options["mode"]:
            # calculate mode
            pass
        # Additional statistical analysis options...
    elif analysis_type == "graphical":
        if options["bar_chart"]:
            # create bar chart
            pass
        if options["line_chart"]:
            # create line chart
            pass
        if options["pie_chart"]:
            # create pie chart
            pass
        # Additional graphical analysis options...
    # More analysis types...
The function analyze_data starts with a base complexity of 1, and each if or elif decision adds 1.
We can count the decision points in each branch:
- The "statistical" branch: 1 (if analysis_type == "statistical") + 3 (one for each inner if: "mean", "median", and "mode") = 4 decision points.
- The "graphical" branch: 1 (elif analysis_type == "graphical") + 3 (one for each inner if: "bar_chart", "line_chart", and "pie_chart") = 4 decision points.
Adding the base of 1, this function already has a cyclomatic complexity of 9.
Any additional branches would each have a similar additive effect on complexity.
This structure results in high cyclomatic complexity, as the number of conditional branches increases significantly with each additional analysis type and option. High cyclomatic complexity indicates that the function is doing too many things and will likely be hard to test, understand, and maintain. This complexity could be reduced by breaking the function into smaller, more focused functions, each handling a specific type of analysis.
One effective way to achieve this is by using the Strategy Pattern and breaking down the function into smaller, more focused functions. Here's a refactored version:
def calculate_mean(data):
    # Logic to calculate mean
    pass

def calculate_median(data):
    # Logic to calculate median
    pass

def calculate_mode(data):
    # Logic to calculate mode
    pass

def create_bar_chart(data):
    # Logic to create bar chart
    pass

def create_line_chart(data):
    # Logic to create line chart
    pass

def create_pie_chart(data):
    # Logic to create pie chart
    pass

def analyze_statistical_data(data, options):
    if options["mean"]:
        calculate_mean(data)
    if options["median"]:
        calculate_median(data)
    if options["mode"]:
        calculate_mode(data)
    # Additional statistical analysis options...

def analyze_graphical_data(data, options):
    if options["bar_chart"]:
        create_bar_chart(data)
    if options["line_chart"]:
        create_line_chart(data)
    if options["pie_chart"]:
        create_pie_chart(data)
    # Additional graphical analysis options...

def analyze_data(data, analysis_type, options):
    analysis_strategies = {
        "statistical": analyze_statistical_data,
        "graphical": analyze_graphical_data,
        # More analysis types can be added here
    }
    analysis_function = analysis_strategies.get(analysis_type)
    if analysis_function:
        analysis_function(data, options)
Here we have:
- Separation of concerns. Each type of analysis (statistical and graphical) is handled by a separate function (analyze_statistical_data and analyze_graphical_data). This makes the code more modular and easier to understand and maintain.
- Strategy pattern. The analyze_data function uses a dictionary (analysis_strategies) to map analysis_type to the corresponding function. This pattern allows for the easy addition of new analysis types without modifying the core analyze_data function.
- Reduced complexity. By delegating the specific processing logic to different functions, the cyclomatic complexity of analyze_data is significantly reduced. Each function is responsible for a single task, making the code more readable and testable.
- Flexibility and scalability. This approach makes it easier to add new analysis types or options. For example, adding a new analysis type is as simple as defining and adding a new function to the analysis_strategies dictionary.
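As a quick usage sketch (with hypothetical sample data; the stub functions above don't compute anything yet):

sample = [3, 7, 7, 1]
analyze_data(sample, "statistical", {"mean": True, "median": False, "mode": True})
analyze_data(sample, "graphical", {"bar_chart": True, "line_chart": False, "pie_chart": False})
analyze_data(sample, "unknown", {})  # Unrecognized analysis types are simply ignored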
Reducing complexity makes code more readable, testable, and less prone to errors. Cyclomatic complexity is a valuable metric for understanding the complexity of a program's control flow, aiding in testing and maintenance efforts. However, it's most effective when used alongside other code quality and readability measures.
Reducing Duplication to Enhance Code Quality
Code duplication is a common issue that negatively impacts software quality. Codacy identifies duplicated code blocks, encouraging developers to abstract and refactor repetitive patterns. Eliminating duplication cleans up the codebase, reduces potential bugs, and makes the code easier to update and maintain.
In this example, the calculation of the area for a square is technically a specific case of the rectangle area calculation (where length equals width).
def calculate_area_of_square(side_length):
    return side_length * side_length

def calculate_area_of_rectangle(length, width):
    return length * width

def calculate_area_of_triangle(base, height):
    return 0.5 * base * height
The functions calculate_area_of_square and calculate_area_of_rectangle are conceptually similar, leading to duplicated logic: multiplying two numbers. While seemingly minor, this duplication can add up in larger codebases, leading to unnecessary repetition and potential inconsistencies.
In the refactored code, the calculate_area_of_square function now calls calculate_area_of_rectangle, passing the square's side length as the length and width.
def calculate_area_of_rectangle(length, width):
    return length * width

def calculate_area_of_square(side_length):
    return calculate_area_of_rectangle(side_length, side_length)

def calculate_area_of_triangle(base, height):
    return 0.5 * base * height
This change reduces duplication by leveraging the fact that a square is a special case of a rectangle. It makes the code more maintainable and consistent. If the logic for area calculation needs to be updated or fixed, it only needs to be done in calculate_area_of_rectangle. This approach also adheres to the DRY (Don't Repeat Yourself) principle, a key practice in software development for reducing redundancy and improving code quality.
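A quick check that the refactored functions behave as expected:

print(calculate_area_of_rectangle(4, 5))  # 20
print(calculate_area_of_square(4))        # 16, delegates to calculate_area_of_rectangle
print(calculate_area_of_triangle(6, 3))   # 9.0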
By identifying and refactoring duplicated code, we:
- Increase test efficiency. When code is duplicated, any test for that logic must also be duplicated or broadened to cover all instances. By centralizing shared logic, fewer tests can provide broader coverage, making the testing process more efficient and less error-prone.
- Facilitate code reusability. Refactoring duplicated code into common methods or classes encourages reusability. Developers can easily reuse these well-tested and trusted components across different application parts, leading to faster development and a more consistent codebase.
- Simplify code reviews. Reducing duplication makes the codebase smaller and more straightforward. Reviewers can focus on the logic and design rather than parsing through redundant code, enhancing team collaboration and knowledge sharing.
- Scale better. A codebase with less duplication is generally easier to scale and extend. New features and changes can be implemented more quickly and with less risk of introducing inconsistencies or bugs.
Tackling code duplication significantly improves overall code quality, efficiency, and maintainability.
Quality Needs to be Tested With Code Coverage
Code coverage is a critical measure of software quality, indicating how much of the codebase is tested by automated tests. A high percentage of code coverage, especially when combined with well-designed tests, significantly reduces the chances of undetected bugs and ensures that more scenarios are evaluated for correct behavior.
The primary goal of code coverage is to enhance the reliability of software by ensuring that each line of code has been executed and tested under various conditions. It provides a quantifiable metric that teams can aim to improve, leading to a more thorough examination of the codebase. This systematic approach to testing fosters confidence in the software's stability and functionality. Automated tests that cover a wide range of the codebase can reveal issues that might not be immediately apparent, including edge cases and potential error conditions.
By aiming for high code coverage, teams are encouraged to consider and test these less obvious scenarios, reducing the likelihood of bugs in production. High code coverage makes it safer and easier to refactor and maintain code. Developers can make changes with the assurance that a robust set of tests will catch any inadvertent issues introduced by their modifications. This safety net is vital for ongoing development and adapting the software to new requirements or technologies.
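As a minimal sketch, here's what a test file for the earlier calculate_sum function might look like with pytest, along with the coverage.py commands to measure coverage (the module name is hypothetical):

# test_calculate.py
from my_module import calculate_sum  # hypothetical module holding calculate_sum

def test_small_sum():
    assert calculate_sum(2, 3) == 5

def test_large_sum_prints_message(capsys):
    assert calculate_sum(6, 7) == 13
    assert "Result is big" in capsys.readouterr().out

# To measure coverage:
#   coverage run -m pytest test_calculate.py
#   coverage report -m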
While high code coverage is desirable, balancing the quantity of tests with their quality is crucial. Code coverage alone doesn't guarantee the effectiveness of tests. Tests must be meaningful and well-designed to ensure they accurately assess the code's functionality and not just artificially inflate coverage metrics.
While code coverage is not the sole indicator of software quality, it plays a crucial role in the software development lifecycle. It encourages thorough testing, uncovers hidden issues, and supports robust and maintainable code, thereby significantly contributing to the overall quality and reliability of the software.
Grading The Quality of Your Code
All of the above is a huge amount to measure and analyze to understand your code quality. That is why automating these tests is so critical. At Codacy, not only do we automate these tests, but we also give each file and repo a grade.
This grade is a weighted average of all of the issues, complexity, duplication, and coverage across your codebase. This allows you to have an immediate understanding of the level of your code quality and where the issues exist in your codebase.
No codebase is going to be perfect. But by measuring quality and putting it at the forefront of your development process, you can proactively identify and address areas of improvement, continuously refine and elevate your coding standards, and ensure that your software not only meets but exceeds the expectations of both your team and your users.
This ongoing commitment to quality not only enhances the technical robustness of your projects but also builds trust and credibility in your development process, fostering a culture of excellence that permeates through every line of code.
To give Codacy a try, sign up for a 14-day free trial today.