Decoding The Web Application Protection: Understanding Input Handling

Sep 16, 2023

Numerous attacks targeting web applications revolve around the submission of unconventional input, meticulously designed to trigger unintended behaviors within the application. Consequently, a fundamental necessity for bolstering an application’s security lies in its ability to process user input safely and securely.

Vulnerabilities stemming from input can emerge in various parts of an application’s features, spanning across a wide range of commonly used technologies. “Input validation” is frequently touted as the essential safeguard against such attacks. Nevertheless, it’s important to recognize that there isn’t a one-size-fits-all protective measure that can be universally applied, and defending against malicious input is often more complex than it may initially appear.

Let us consider an application’s input validation checks with different use cases.

Use Case 1:

Gather the user’s postal address for various reasons, such as sending items or confirming their whereabouts.

Checks:

Required Field: Make the “Address” field mandatory, so users must provide their address during registration.

Maximum Length: Set a character limit (e.g., 100 characters) on the address input to prevent excessively long entries that could disrupt the database or the user interface.

Valid Characters: Permit only legitimate characters in the address field, typically alphanumeric characters, spaces, commas, and a few special characters such as hyphens and periods.

Address Format: Check that the address follows a standardized format appropriate for your target region or country. This could mean requiring expected elements such as street numbers, street names, city names, and postal/ZIP codes in the prescribed format.

Avoid HTML/Script Tags: Sanitize the input so that HTML or script tags cannot be included in the address field, since these could enable cross-site scripting (XSS) attacks.
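The required-field, length, and character checks above can be sketched in a few lines of Python. This is only an illustration: the function name, the error messages, and the exact character set are assumptions, and the region-specific format check is left out because it depends on the target country.

```python
import re
from typing import List, Optional

MAX_ADDRESS_LENGTH = 100  # example cap from the text; adjust to your schema

# Assumed allow-list: ASCII letters, digits, spaces, commas, periods, hyphens.
ALLOWED_ADDRESS_CHARS = re.compile(r"[A-Za-z0-9 ,.\-]+")


def validate_address(raw: Optional[str]) -> List[str]:
    """Return a list of validation errors; an empty list means the input passed."""
    # Required field: missing or whitespace-only input is rejected outright.
    if raw is None or not raw.strip():
        return ["Address is required."]

    address = raw.strip()
    errors = []

    # Maximum length check.
    if len(address) > MAX_ADDRESS_LENGTH:
        errors.append("Address is too long.")

    # Valid characters check: the whole string must match the allow-list.
    if not ALLOWED_ADDRESS_CHARS.fullmatch(address):
        errors.append("Address contains characters that are not allowed.")

    return errors


print(validate_address("221B Baker Street, London"))   # []
print(validate_address("<script>alert(1)</script>"))   # one error: disallowed characters
```

Because angle brackets are not in the allowed set, the character check already rejects HTML and script tags. An ASCII-only set would also reject many legitimate international addresses, so a real application would likely need a wider one.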

Use Case 2:

Ensure that the user provides a valid email address during registration.

Check:

Verify that the input contains the “@” symbol and a valid domain name.
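As a rough sketch of this check in Python, the pattern below requires a local part, a single “@”, and a dotted domain name. The pattern and the function name are assumptions made for illustration; the pattern is deliberately simpler than the full email address grammar and will reject some valid addresses.

```python
import re

# Simplified pattern (an assumption, not the full RFC grammar).
EMAIL_PATTERN = re.compile(
    r"[A-Za-z0-9._%+-]+"      # local part
    r"@"                      # exactly one "@" separator
    r"(?:[A-Za-z0-9-]+\.)+"   # one or more domain labels, each followed by a dot
    r"[A-Za-z]{2,}"           # top-level domain
)


def looks_like_email(value: str) -> bool:
    """Syntactic check only: says nothing about whether the mailbox exists."""
    return len(value) <= 254 and EMAIL_PATTERN.fullmatch(value) is not None


print(looks_like_email("user@example.com"))   # True
print(looks_like_email("user.example.com"))   # False: no "@"
print(looks_like_email("user@localhost"))     # False: no dotted domain
```

A check like this only confirms the shape of the input. Confirming that the address really belongs to the user usually means sending a confirmation message to it.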

The various approaches:

1. Reject known bad:

This approach involves maintaining a blacklist or deny list containing known strings, characters, or patterns commonly used in various types of attacks. These entries are typically gathered from threat intelligence sources as Indicators of Compromise (IOCs). The technique operates by prohibiting any input or entities that match the items in the blacklist while permitting everything else to proceed without restriction.

In general, this is regarded as the least effective approach to validating input, for two reasons:

Vulnerabilities within an application can often be exploited through many different inputs, which may be encoded or represented in various ways. Consequently, a blacklist is likely to omit some input patterns that can still be used to compromise the application.

Exploitation tactics, techniques, and procedures evolve rapidly. As a result, existing blacklists are unlikely to prevent new methods of exploiting known categories of vulnerability.
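A small Python sketch shows how easily a literal blacklist is sidestepped. The deny list and the function name here are hypothetical and chosen only for illustration; a real filter would be larger, but it would have the same structural weakness.

```python
from urllib.parse import unquote

# Hypothetical deny list, for illustration only.
DENY_LIST = ["<script>", "drop table", "../"]


def naive_blacklist_allows(value: str) -> bool:
    """Return True when no deny-list entry appears literally in the input."""
    lowered = value.lower()
    return not any(bad in lowered for bad in DENY_LIST)


blocked = "<script>alert(1)</script>"
encoded = "%3Cscript%3Ealert(1)%3C/script%3E"   # same payload, URL-encoded
variant = "<img src=x onerror=alert(1)>"        # script execution without a script tag

print(naive_blacklist_allows(blocked))   # False: literal match is caught
print(naive_blacklist_allows(encoded))   # True: the encoded form slips through
print(unquote(encoded) == blocked)       # True: it decodes to the blocked payload
print(naive_blacklist_allows(variant))   # True: a pattern the list never anticipated
```

The encoded case illustrates the first reason (alternative representations of the same input), and the `onerror` variant illustrates the second (techniques the list's authors did not foresee).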

Examples of input that a blacklist would typically try to block:

SQL Injection Attempt:

Input: '; DROP TABLE Users--

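Explanation: This input attempts a SQL injection attack. The single quote closes the string literal in the original query, the semicolon ends that statement, and the command that follows deletes the “Users” table. The trailing double hyphen comments out the rest of the original query.

Cross-Site Scripting (XSS) Payload: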

Input: <script>alert('XSS Attack');</script>

Explanation: This input contains a JavaScript script tag, which can be used to execute malicious scripts on the user’s browser.

Path Traversal Attempt:

Input: ../../../../../etc/passwd

Explanation: This input tries to perform a directory traversal attack by attempting to access sensitive files on the server, such as the password file.

Command Injection Attempt:

Input: ; ls -la

Explanation: This input aims to execute arbitrary commands on the server by appending a semicolon followed by a system command (ls -la in this case).

2. Accept known good:

This method entails using a whitelist, which consists of specific literal strings, patterns, or criteria that are recognized as only matching benign inputs. The validation process permits data that aligns with the whitelist while rejecting any other inputs.

For instance: Before querying the database for a requested product code, the application may validate that the code consists of exactly six alphanumeric characters. Given how the product code is processed afterwards, developers can be confident that input meeting these criteria will not cause problems.

Another example: Before processing a user’s payment information, the application may validate that the credit card number contains only digits (0 to 9) and has the length expected for the card type, for example 16 digits for many cards. Given the subsequent payment processing steps, developers can be confident that input meeting these criteria will not lead to issues or errors.
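Both examples translate directly into whole-string pattern matches. The following Python sketch assumes the six-character product code and a 16-digit card number as described; the function and constant names are hypothetical.

```python
import re

# Allow-list patterns: anything that does not match in full is rejected.
PRODUCT_CODE = re.compile(r"[A-Za-z0-9]{6}")   # exactly six alphanumeric characters
CARD_NUMBER = re.compile(r"[0-9]{16}")         # exactly sixteen ASCII digits


def is_valid_product_code(value: str) -> bool:
    return PRODUCT_CODE.fullmatch(value) is not None


def is_valid_card_number(value: str) -> bool:
    return CARD_NUMBER.fullmatch(value) is not None


print(is_valid_product_code("AB12cd"))               # True
print(is_valid_product_code("AB12c'"))               # False: quote is not allowed
print(is_valid_card_number("4111111111111111"))      # True
print(is_valid_card_number("4111-1111-1111-1111"))   # False: hyphens are not digits
```

Two details matter here. `fullmatch` is used so that the entire input must match, since `match` with a trailing `$` would still accept a value ending in a newline. The class `[0-9]` is used instead of `\d` because `\d` also matches non-ASCII digits in Python string patterns.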

While a whitelist approach is highly effective in handling potentially malicious input, it’s not a universal solution. When meticulously constructed, a whitelist prevents attackers from using crafted input to disrupt an application’s behavior. However, in many cases, applications must accept data that doesn’t conform to predefined criteria. For example, names with apostrophes or hyphens can be legitimate but also pose database security risks. There are situations where the application needs to accommodate such data, making the whitelist approach not suitable for all scenarios.

Sanitization

This approach involves recognizing that there are situations where it’s necessary to accept data that may not be inherently safe and then taking steps to make it safe. This is achieved through sanitization techniques, which can involve removing potentially harmful characters, retaining only known safe elements, or appropriately encoding and escaping the data before proceeding with further processing.

This method is highly effective in many scenarios and can often serve as a general solution to the problem of malicious input, provided the encoding or escaping matches the context in which the data is later used. For instance, the usual defense against cross-site scripting is to HTML-encode potentially harmful characters before they are embedded in the application’s web pages.
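As a minimal sketch of HTML encoding, Python's standard library provides `html.escape`. The `render_comment` helper below is hypothetical and simply stands in for whatever code builds the page.

```python
import html


def render_comment(user_text: str) -> str:
    """Build an HTML fragment with the user text encoded for an element body."""
    # quote=True also encodes double and single quotes, which matters
    # if the value is ever placed inside a quoted attribute.
    return f"<p>{html.escape(user_text, quote=True)}</p>"


print(render_comment("<script>alert('XSS Attack');</script>"))
# <p>&lt;script&gt;alert(&#x27;XSS Attack&#x27;);&lt;/script&gt;</p>
```

The browser displays the encoded text literally instead of executing it. This encoding is only appropriate for HTML element content and quoted attributes; data placed inside JavaScript, URLs, or CSS needs encoding suited to that context.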

Safe Data Handling

Vulnerabilities in web applications often stem from insecure handling of user-supplied data. To address these vulnerabilities, it’s essential not just to validate user input but also to guarantee that subsequent data processing follows secure protocols. In specific instances, secure programming methods can supplant conventional practices to thwart attacks. For instance, utilizing parameterized queries when communicating with databases can protect against SQL injection attacks. Similarly, in different scenarios, application functionality can be structured to sidestep inherently risky practices, such as passing user input directly to an operating system command interpreter.
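To illustrate parameterized queries, here is a short sketch using Python's built-in `sqlite3` module with an in-memory database. The table layout and the `find_user` helper are assumptions made for the example.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str):
    # The "?" placeholder passes the value separately from the SQL text,
    # so the input is always treated as data and never parsed as SQL.
    cursor = conn.execute(
        "SELECT id, username FROM users WHERE username = ?",
        (username,),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("INSERT INTO users (username) VALUES (?)", ("alice",))

print(find_user(conn, "alice"))                   # [(1, 'alice')]
print(find_user(conn, "'; DROP TABLE users--"))   # []
print(find_user(conn, "alice' OR '1'='1"))        # []
```

The injection strings are simply compared against the `username` column as literal values, so they match nothing and the table is left intact. No blacklist of SQL keywords is needed for this to hold.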

Semantic Checks

In certain cases, malicious input submitted by threat actors can appear identical to input from non-malicious users, making it challenging to discern the difference based solely on syntax or structure. The distinguishing factor lies in the context in which the input is provided. For instance, an attacker might attempt to access someone else’s bank account by altering an account number sent through a hidden form field. Conventional syntactic validation alone cannot distinguish between the legitimate user’s data and the attacker’s input. To thwart unauthorized access, the application must validate that the submitted account number actually belongs to the user who is making the submission.
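The ownership check described above might look like the following Python sketch. The in-memory `ACCOUNT_OWNERS` mapping, the exception class, and the function name are all hypothetical; a real application would look up ownership in its datastore and take the user identity from the server-side session, never from the request.

```python
# Hypothetical ownership data: account number -> owning user.
ACCOUNT_OWNERS = {"10001": "alice", "10002": "bob"}


class AuthorizationError(Exception):
    """Raised when a user requests an account they do not own."""


def get_account_for(session_user: str, requested_account: str) -> str:
    """Return the account number only if it belongs to the authenticated user."""
    owner = ACCOUNT_OWNERS.get(requested_account)
    # Unknown accounts and accounts owned by someone else are treated alike,
    # so the response does not reveal which account numbers exist.
    if owner is None or owner != session_user:
        raise AuthorizationError("Account not available for this user.")
    return requested_account


print(get_account_for("alice", "10001"))   # 10001
try:
    get_account_for("alice", "10002")      # syntactically valid, but Bob's account
except AuthorizationError as exc:
    print("rejected:", exc)
```

Both account numbers pass any syntactic check. Only the comparison against the authenticated user separates the legitimate request from the tampered one.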

Let’s consider an online voting system as another example within the same context. A threat actor might submit a vote that syntactically matches a legitimate user’s vote, making it difficult to distinguish between the two based on syntax alone. However, the malicious aspect lies in the intent — the attacker may want to manipulate the election results by casting multiple votes.

To prevent this type of fraudulent activity, the application must validate not just the syntax of the vote but also the context in which it is submitted. This could involve tracking unique identifiers for each voter, checking for patterns of repeated votes, and employing additional security measures like CAPTCHA tests or voter authentication to ensure that each vote is cast by an eligible and legitimate user.
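A minimal sketch of the "one vote per eligible voter" part of this check is shown below in Python. The `Ballot` class and its in-memory storage are assumptions for illustration, and `voter_id` is assumed to come from an authenticated session. CAPTCHA and pattern detection are outside the scope of the sketch.

```python
from collections import Counter


class Ballot:
    """Accept at most one vote per eligible voter (in-memory sketch)."""

    def __init__(self, eligible_voters, candidates):
        self._eligible = set(eligible_voters)
        self._candidates = set(candidates)
        self._votes = {}  # voter_id -> candidate

    def cast(self, voter_id: str, candidate: str) -> bool:
        """Record the vote and return True, or return False if it is rejected."""
        if candidate not in self._candidates:
            return False          # syntactic check: unknown candidate
        if voter_id not in self._eligible:
            return False          # semantic check: voter is not eligible
        if voter_id in self._votes:
            return False          # semantic check: voter has already voted
        self._votes[voter_id] = candidate
        return True

    def tally(self) -> Counter:
        return Counter(self._votes.values())


ballot = Ballot(eligible_voters={"v1", "v2"}, candidates={"A", "B"})
print(ballot.cast("v1", "A"))   # True
print(ballot.cast("v1", "A"))   # False: same voter, syntactically identical vote
print(ballot.cast("v9", "B"))   # False: not an eligible voter
print(ballot.tally())           # Counter({'A': 1})
```

The second call is indistinguishable from the first by syntax alone. It is rejected only because the application remembers who has already voted.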

Reference:

The Web Application Hacker’s Handbook: Finding and Exploiting Security Flaws.

Let’s connect?
LinkedIn: www.linkedin.com/in/ravitejmbandlekar