Improper Control of Generation of Code ('Code Injection')

The product constructs all or part of a code segment using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify the syntax or behavior of the intended code segment.


Description

When a product allows a user's input to contain code syntax, it might be possible for an attacker to craft the code in such a way that it will alter the intended control flow of the product. Such an alteration could lead to arbitrary code execution.

Injection problems encompass a wide variety of issues -- all mitigated in very different ways. For this reason, the most effective way to discuss these weaknesses is to note the distinct features which classify them as injection weaknesses. The most important issue to note is that all injection problems share one thing in common -- i.e., they allow for the injection of control plane data into the user-controlled data plane. This means that the execution of the process may be altered by sending code in through legitimate data channels, using no other mechanism. While buffer overflows, and many other flaws, involve the use of some further issue to gain execution, injection problems need only for the data to be parsed. The most classic instantiations of this category of weakness are SQL injection and format string vulnerabilities.

Demonstrations

The following examples help to illustrate the nature of this weakness and describe methods or techniques which can be used to mitigate the risk.

Note that the examples here are by no means exhaustive and any given weakness may have many subtle varieties, each of which may require different detection methods or runtime controls.

Example One

This example attempts to write user messages to a message file and allow users to view them.

$MessageFile = "messages.out";
if ($_GET["action"] == "NewMessage") {
  $name = $_GET["name"];
  $message = $_GET["message"];
  $handle = fopen($MessageFile, "a+");
  fwrite($handle, "<b>$name</b> says '$message'<hr>\n");
  fclose($handle);
  echo "Message Saved!<p>\n";
}
else if ($_GET["action"] == "ViewMessages") {
  include($MessageFile);
}

While the programmer intends for the MessageFile to only include data, an attacker can provide a message such as:

name=h4x0r
message=%3C?php%20system(%22/bin/ls%20-l%22);?%3E

which will decode to the following:

<?php system("/bin/ls -l");?>

The programmer thought they were just including the contents of a regular data file, but PHP parsed it and executed the code. Now, this code is executed any time people view messages.

Notice that XSS (CWE-79) is also possible in this situation.

Example Two

edit-config.pl: This CGI script is used to modify settings in a configuration file.

use CGI qw(:standard);

sub config_file_add_key {

  my ($fname, $key, $arg) = @_;

  # code to add a field/key to a file goes here


}

sub config_file_set_key {

  my ($fname, $key, $arg) = @_;

  # code to set key to a particular file goes here


}

sub config_file_delete_key {

  my ($fname, $key, $arg) = @_;

  # code to delete key from a particular file goes here


}

sub handleConfigAction {

  my ($fname, $action) = @_;
  my $key = param('key');
  my $val = param('val');

  # this is super-efficient code, especially if you have to invoke


  # any one of dozens of different functions!

  my $code = "config_file_$action_key(\$fname, \$key, \$val);";
  eval($code);

}

$configfile = "/home/cwe/config.txt";
print header;
if (defined(param('action'))) {
  handleConfigAction($configfile, param('action'));
}
else {
  print "No action specified!\n";
}

The script intends to take the 'action' parameter and invoke one of a variety of functions based on the value of that parameter - config_file_add_key(), config_file_set_key(), or config_file_delete_key(). It could set up a conditional to invoke each function separately, but eval() is a powerful way of doing the same thing in fewer lines of code, especially when a large number of functions or variables are involved. Unfortunately, in this case, the attacker can provide other values in the action parameter, such as:

add_key(",","); system("/bin/ls");

This would produce the following string in handleConfigAction():

config_file_add_key(",","); system("/bin/ls");

Any arbitrary Perl code could be added after the attacker has "closed off" the construction of the original function call, in order to prevent parsing errors from causing the malicious eval() to fail before the attacker's payload is activated. This particular manipulation would fail after the system() call, because the "_key(\$fname, \$key, \$val)" portion of the string would cause an error, but this is irrelevant to the attack because the payload has already been activated.

Example Three

This simple script asks a user to supply a list of numbers as input and adds them together.

def main():

  sum = 0
  numbers = eval(input("Enter a space-separated list of numbers: "))
  for num in numbers:

    sum = sum + num

  print(f"Sum of {numbers} = {sum}")
main()

The eval() function can take the user-supplied list and convert it into a Python list object, therefore allowing the programmer to use list comprehension methods to work with the data. However, if code is supplied to the eval() function, it will execute that code. For example, a malicious user could supply the following string:

__import__('subprocess').getoutput('rm -r *')

This would delete all the files in the current directory. For this reason, it is not recommended to use eval() with untrusted input.

A way to accomplish this without the use of eval() is to apply an integer conversion on the input within a try/except block. If the user-supplied input is not numeric, this will raise a ValueError. By avoiding eval(), there is no opportunity for the input string to be executed as code.

def main():

  sum = 0
  numbers = input("Enter a space-separated list of numbers: ").split(" ")
  try:

    for num in numbers:

      sum = sum + int(num)

    print(f"Sum of {numbers} = {sum}")
  except ValueError:

    print("Error: invalid input")


main()

An alternative, commonly-cited mitigation for this kind of weakness is to use the ast.literal_eval() function, since it is intentionally designed to avoid executing code. However, an adversary could still cause excessive memory or stack consumption via deeply nested structures [REF-1372], so the python documentation discourages use of ast.literal_eval() on untrusted data [REF-1373].

See Also

Comprehensive Categorization: Injection

Weaknesses in this category are related to injection.

OWASP Top Ten 2021 Category A03:2021 - Injection

Weaknesses in this category are related to the A03 category "Injection" in the OWASP Top Ten 2021.

Validate Inputs

Weaknesses in this category are related to the design and architecture of a system's input validation components. Frequently these deal with sanitizing, neutralizing a...

Comprehensive CWE Dictionary

This view (slice) covers all the elements in CWE.

Weaknesses in the 2023 CWE Top 25 Most Dangerous Software Weaknesses

CWE entries in this view are listed in the 2023 CWE Top 25 Most Dangerous Software Weaknesses.

Weaknesses Addressed by ISA/IEC 62443 Requirements

This view (slice) covers weaknesses that are addressed by following requirements in the ISA/IEC 62443 series of standards for industrial automation and control systems...


Common Weakness Enumeration content on this website is copyright of The MITRE Corporation unless otherwise specified. Use of the Common Weakness Enumeration and the associated references on this website are subject to the Terms of Use as specified by The MITRE Corporation.