Permissive Regular Expression

The product uses a regular expression that does not sufficiently restrict the set of allowed values.


Description

As a result, the regular expression may accept input that merely contains a substring matching the pattern, which amounts to a partial comparison against the target. In some cases, this can lead to other weaknesses. Common errors include:

not identifying the beginning and end of the target string

using wildcards instead of acceptable character ranges

others
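The first two errors in the list above can be illustrated with a minimal Python sketch. The order-ID format, the pattern names, and the sample input are hypothetical and chosen only for illustration. Here re.search stands in for any unanchored match, while re.fullmatch requires the pattern to cover the entire string.

```python
import re

# Hypothetical format for illustration: two capitals, a hyphen, four digits.
ORDER_ID = re.compile(r"[A-Z]{2}-[0-9]{4}")

hostile = "AB-1234; rm -rf /"

# Error 1: an unanchored search accepts any string that merely
# contains a matching substring somewhere inside it.
unanchored_ok = ORDER_ID.search(hostile) is not None

# fullmatch only succeeds if the pattern covers the whole string.
anchored_ok = ORDER_ID.fullmatch(hostile) is not None

# Error 2: a wildcard admits characters the format never intended,
# even when the whole string is required to match.
WILDCARD = re.compile(r"AB-.*")
wildcard_ok = WILDCARD.fullmatch(hostile) is not None

print(unanchored_ok, anchored_ok, wildcard_ok)
```

In this sketch the hostile input is accepted by the unanchored search and by the wildcard pattern, and rejected only when an explicit character range is combined with a full-string match.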

Demonstrations

The following examples help to illustrate the nature of this weakness and describe methods or techniques which can be used to mitigate the risk.

Note that the examples here are by no means exhaustive and any given weakness may have many subtle varieties, each of which may require different detection methods or runtime controls.

Example One

The following code takes phone numbers as input, and uses a regular expression to reject invalid phone numbers.

$phone = GetPhoneNumber();
if ($phone =~ /\d+-\d+/) {
  # looks like it only has hyphens and digits
  system("lookup-phone $phone");
}
else {
  error("malformed number!");
}

An attacker could provide an argument such as "; ls -l ; echo 123-456". This passes the check because the pattern is not anchored: the substring "123-456" alone is enough to match "\d+-\d+", and the rest of the input reaches the system() call unexamined.
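One possible mitigation, sketched here in Python rather than Perl, is to require the entire input to match and to avoid passing the value through a shell. The function names is_valid_phone and build_lookup_command are hypothetical. The digits-hyphen-digits format is assumed from the intent of the check above and is not a complete phone number grammar.

```python
import re
from typing import List

# Assumed format: digits, one hyphen, digits.
# [0-9] is used instead of \d because \d also matches non-ASCII
# digits when applied to str patterns in Python 3.
PHONE = re.compile(r"[0-9]+-[0-9]+")


def is_valid_phone(phone: str) -> bool:
    """Return True only if the whole string is digits-hyphen-digits."""
    return PHONE.fullmatch(phone) is not None


def build_lookup_command(phone: str) -> List[str]:
    """Build an argument list for the lookup-phone program.

    An argument list (rather than a single shell string) can be handed
    to subprocess.run without a shell interpreting metacharacters.
    """
    if not is_valid_phone(phone):
        raise ValueError("malformed number!")
    return ["lookup-phone", phone]
```

Using fullmatch instead of a pattern ending in $ also rejects an input with a trailing newline, which $ alone would tolerate in Python.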

Example Two

This code uses a regular expression to validate an IP string prior to using it in a call to the "ping" command.

import subprocess
import re

def validate_ip_regex(ip: str):
  ip_validator = re.compile(r"((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}")
  if ip_validator.match(ip):
    return ip
  else:
    raise ValueError("IP address does not match valid pattern.")


def run_ping_regex(ip: str):
  validated = validate_ip_regex(ip)
  # The ping command treats zero-prepended IP addresses as octal
  result = subprocess.call(["ping", validated])
  print(result)

Because the regular expression has no anchors (CWE-777), that is, it is not bounded by ^ and $ characters, prepending a 0 or 0x to the IP address will still result in a matched regex pattern. Since the ping command accepts IP addresses written with octal and hexadecimal prefixes, it will use the unexpectedly valid IP address (CWE-1389). For example, "0x63.63.63.63" would be considered equivalent to "99.63.63.63". As a result, the attacker could potentially ping systems that the attacker cannot reach directly.
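One way to avoid depending on a hand-written pattern for this check is to parse the address with Python's standard ipaddress module. The following is a minimal sketch, not a drop-in replacement. The function name validate_ip_strict is hypothetical, and only IPv4 is assumed to be in scope.

```python
import ipaddress


def validate_ip_strict(ip: str) -> str:
    """Return the canonical dotted-decimal form, or raise ValueError."""
    # IPv4Address parses the whole string and raises AddressValueError
    # (a subclass of ValueError) for input that is not plain
    # dotted-decimal IPv4, such as hexadecimal octets or trailing text.
    return str(ipaddress.IPv4Address(ip))
```

Returning the parsed address converted back to a string, rather than the caller's original input, means the value passed on to ping is the one that was actually validated. How octets with leading zeros are handled has differed between Python releases, so that case is worth checking on the target version.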

See Also

Comprehensive Categorization: Comparison

Weaknesses in this category are related to comparison.

SFP Secondary Cluster: Tainted Input to Command

This category identifies Software Fault Patterns (SFPs) within the Tainted Input to Command cluster (SFP24).

Data Processing Errors

Weaknesses in this category are typically found in functionality that processes data. Data processing is the manipulation of input to retrieve or save information.

Comprehensive CWE Dictionary

This view (slice) covers all the elements in CWE.

Weaknesses Introduced During Implementation

This view (slice) lists weaknesses that can be introduced during implementation.

Weakness Base Elements

This view (slice) displays only weakness base elements.


Common Weakness Enumeration content on this website is copyright of The MITRE Corporation unless otherwise specified. Use of the Common Weakness Enumeration and the associated references on this website is subject to the Terms of Use as specified by The MITRE Corporation.