Regular Expression without Anchors

The software uses a regular expression to perform neutralization, but the regular expression is not anchored and may allow malicious or malformed data to slip through.


Description

When performing tasks such as validating against a set of allowed inputs (allowlist), data is examined and possibly modified to ensure that it is well-formed and adheres to a list of safe values. If the regular expression is not anchored, malicious or malformed data may be included before or after any string matching the regular expression. The type of malicious data that is allowed will depend on the context of the application and which anchors are omitted from the regular expression.

Background

Regular expressions are typically used to match a pattern of text. Anchors are used in regular expressions to specify where the pattern should match: at the beginning, the end, or both (the whole input).

Demonstrations

The following examples help to illustrate the nature of this weakness and describe methods or techniques which can be used to mitigate the risk.

Note that the examples here are by no means exhaustive and any given weakness may have many subtle varieties, each of which may require different detection methods or runtime controls.

Example One

Consider a web application that supports multiple languages. It selects messages for an appropriate language by using the lang parameter.

$dir = "/home/cwe/languages";
$lang = $_GET['lang'];
if (preg_match("/[A-Za-z0-9]+/", $lang)) {
  include("$dir/$lang");
}
else {
  echo "You shall not pass!\n";
}

The previous code attempts to match only alphanumeric values so that language values such as "english" and "french" are valid while also protecting against path traversal, CWE-22. However, the regular expression anchors are omitted, so any text containing at least one alphanumeric character will now pass the validation step. For example, the attack string below will match the regular expression.

../../etc/passwd

If the attacker can inject code sequences into a file, such as the web server's HTTP request log, then the attacker may be able to redirect the lang parameter to the log file and execute arbitrary code.

See Also

Comprehensive CWE Dictionary

This view (slice) covers all the elements in CWE.

Weaknesses without Software Fault Patterns

CWE identifiers in this view are weaknesses that do not have associated Software Fault Patterns (SFPs), as covered by the CWE-888 view. As such, they represent gaps in...

Weaknesses Introduced During Implementation

This view (slice) lists weaknesses that can be introduced during implementation.


Common Weakness Enumeration content on this website is copyright of The MITRE Corporation unless otherwise specified. Use of the Common Weakness Enumeration and the associated references on this website are subject to the Terms of Use as specified by The MITRE Corporation.