Technical Observation: The Delimiter Discrepancy
During a recent red-team engagement for a Tier-1 Indian e-commerce platform, I identified a critical session hijacking vector stemming from how their edge Nginx nodes handled URI normalization compared to the backend Python/Django microservices. While testing the /api/v1/userinfo endpoint, I observed that the edge cache was configured to treat any URL ending in .js as a static asset, regardless of the presence of session-specific data.
By appending a semicolon and a fake file extension—/api/v1/userinfo;/.js—I forced the edge cache to store a dynamic JSON response containing the user's PII, including their saved UPI IDs and partial Aadhaar numbers. The backend ignored the ;/.js suffix and served the profile data, while the Nginx cache, seeing the .js extension, cached the response for 10 minutes. Any unauthenticated attacker requesting that specific URL could then download the previous user's sensitive data.
$ curl -s -o /dev/null -D - "https://target-ecommerce.in/api/v1/userinfo;%2f.js" \
    -H "User-Agent: Mozilla/5.0" \
    -H "Cookie: sessionid=REDACTED"
HTTP/1.1 200 OK
Content-Type: application/json
X-Cache-Status: MISS
Cache-Control: public, max-age=600
...
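The disagreement between the two layers can be reproduced offline. The following is a minimal Python sketch, not the platform's actual logic: `origin_route` and `edge_treats_as_static` are hypothetical stand-ins that assume the origin drops everything after the first semicolon and the edge only looks at the end of the raw path.

```python
# Assumed edge rule: any of these suffixes marks a "static" asset.
STATIC_SUFFIXES = (".js", ".css", ".png")


def origin_route(path: str) -> str:
    """Hypothetical origin: everything after the first ';' is ignored."""
    return path.split(";", 1)[0]


def edge_treats_as_static(path: str) -> bool:
    """Hypothetical edge: cache whenever the raw path ends in a static suffix."""
    return path.lower().endswith(STATIC_SUFFIXES)


path = "/api/v1/userinfo;/.js"
print("origin serves:", origin_route(path))
print("edge caches:  ", edge_treats_as_static(path))
```

When both functions "succeed" on the same path, the dynamic response is served and stored, which is the condition the request above was probing for.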
What is Web Caching and How Does it Work?
Web caching is the process of storing copies of files or data in temporary storage locations to reduce latency and server load. In the context of Indian e-commerce, where traffic spikes during "Big Billion" style festivals can reach millions of hits per second, caching is not optional. It is the primary defense against infrastructure collapse.
Caches operate based on a "Cache Key." This key is typically a string derived from specific parts of the HTTP request, such as the Method, Host, and Path. If a subsequent request generates the same Cache Key, the cache serves the stored response instead of forwarding the request to the origin server.
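As a rough illustration of that lookup, here is a toy Python cache keyed on method, host, and path. The class name `ToyCache`, the host `shop.example`, and the choice of key fields are assumptions made for the sketch; real caches build their keys from whatever their configuration specifies.

```python
from typing import Callable, Dict, Tuple

CacheKey = Tuple[str, str, str]


def cache_key(method: str, host: str, path: str) -> CacheKey:
    """Build the key from method, host and path only (an assumed key layout)."""
    return (method.upper(), host.lower(), path)


class ToyCache:
    def __init__(self) -> None:
        self._store: Dict[CacheKey, str] = {}

    def fetch(self, method: str, host: str, path: str,
              origin: Callable[[str], str]) -> Tuple[str, str]:
        key = cache_key(method, host, path)
        if key in self._store:
            return "HIT", self._store[key]   # served without touching the origin
        body = origin(path)                  # forward to the origin on a miss
        self._store[key] = body
        return "MISS", body


def fake_origin(path: str) -> str:
    return f"content of {path}"


cache = ToyCache()
first = cache.fetch("GET", "shop.example", "/index.html", fake_origin)
second = cache.fetch("get", "SHOP.example", "/index.html", fake_origin)
print(first[0], second[0])
```

Anything that is not part of the tuple returned by `cache_key` has no influence on whether a stored response is reused.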
The Role of CDNs and Reverse Proxies in Modern Infrastructure
Modern Indian tech stacks utilize a multi-layered caching strategy. A request first hits a global Content Delivery Network (CDN), then potentially a regional ISP's transparent proxy (common in rural India via providers like BSNL or Jio), and finally the organization's own Nginx or Varnish reverse proxies.
These layers often have conflicting rules. I've observed many "Kirana-tech" startups using aggressive Nginx configurations that prioritize bandwidth savings over security. They often use proxy_ignore_headers Cache-Control to force-cache assets that the backend developer intended to be dynamic, creating a massive surface for Web Cache Deception (WCD).
Understanding Web Cache Poisoning
Web Cache Poisoning occurs when an attacker manipulates an "unkeyed" input to change the cached response served to other users. Unlike Web Cache Deception, which targets a specific user's data, poisoning targets the cache itself to distribute malicious content to the entire user base. Identifying these vulnerabilities is the first step, but implementing SIEM rules to detect malicious traffic is crucial for long-term defense.
I look for unkeyed inputs—headers or parameters that the cache ignores when generating the Cache Key but the backend uses to generate the response. If the backend uses the X-Forwarded-Host header to generate absolute URLs in the page, I can point that header to my own domain and "poison" the cache with my scripts.
Identifying Unkeyed Inputs and Cache Keys
To identify these vulnerabilities, I use a process of elimination. I send two requests to the same URL that differ only in the value of one header. If the second request results in an X-Cache: HIT, that header is unkeyed.
- Keyed Inputs: Host header, Request URI, sometimes the Accept-Encoding header.
- Unkeyed Inputs: X-Forwarded-Host, X-Forwarded-Proto, User-Agent (often), and custom headers like X-Origin-Secret.
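The two-request probe can be sketched against a simulated cache. Everything here is hypothetical: `ToyCache` keys on Host and path only, and `origin` is a made-up backend that builds absolute URLs from X-Forwarded-Host. Each probe uses its own path so that earlier tests do not interfere.

```python
from typing import Dict, Tuple

Headers = Dict[str, str]


def origin(path: str, headers: Headers) -> str:
    """Hypothetical backend that trusts X-Forwarded-Host for absolute URLs."""
    host = headers.get("X-Forwarded-Host", "shop.example")
    return f'<script src="https://{host}/app.js"></script>'


class ToyCache:
    """Keys on Host and path only; every other header is unkeyed."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], str] = {}

    def fetch(self, path: str, headers: Headers) -> Tuple[str, str]:
        key = (headers.get("Host", ""), path)
        if key in self._store:
            return "HIT", self._store[key]
        body = origin(path, headers)
        self._store[key] = body
        return "MISS", body


def is_unkeyed(cache: ToyCache, path: str, header: str) -> bool:
    """Send two requests differing only in `header`; a HIT on the second means unkeyed."""
    base = {"Host": "shop.example"}
    cache.fetch(path, {**base, header: "probe-one.example"})
    status, _ = cache.fetch(path, {**base, header: "probe-two.example"})
    return status == "HIT"


cache = ToyCache()
print("X-Forwarded-Host unkeyed:", is_unkeyed(cache, "/?cb=1", "X-Forwarded-Host"))
print("Host unkeyed:            ", is_unkeyed(cache, "/?cb=2", "Host"))

# A clean request to the first probe path now receives the poisoned body.
status, body = cache.fetch("/?cb=1", {"Host": "shop.example"})
print(status, body)
```

In this toy setup the first probe value is what ends up stored, so a later visitor who sends no X-Forwarded-Host at all still receives markup pointing at `probe-one.example`.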
Exploiting Unkeyed Headers (X-Forwarded-Host, X-Forwarded-Proto)
In many legacy Indian banking or e-commerce portals, the application logic relies on X-Forwarded-Host to redirect users after login. If this header is unkeyed, I can poison the homepage cache so that every user clicking "Login" is redirected to my phishing page.
$ curl -H "X-Forwarded-Host: attacker-controlled.in" https://target-ecommerce.in/ -I
HTTP/1.1 200 OK
X-Cache: MISS
$ curl https://target-ecommerce.in/ -I
HTTP/1.1 200 OK
X-Cache: HIT
# The response now contains links pointing to attacker-controlled.in
Delivering Malicious Payloads via Poisoned Responses
The most dangerous payloads involve injecting <script> tags or manipulating src attributes of JavaScript files. If I can poison a common library like jquery.min.js, I gain execution on the browser of every visitor to the site. In the context of the DPDP Act 2023, such a breach would constitute a failure to implement "reasonable security safeguards," potentially leading to fines up to ₹250 crore.
Web Cache Deception Attacks
Web Cache Deception (WCD) is the inverse of poisoning. Here, I trick the cache into storing a private response as if it were a public static asset. This is achieved by exploiting discrepancies in how the cache and the origin parse the URL path, similar to the techniques used when hardening session security against prefix bypasses.
Web Cache Poisoning vs. Web Cache Deception
- Poisoning: Attacker sends a malicious input -> Cache stores it -> All users receive the malicious response.
- Deception: Attacker tricks a victim (or the cache) into requesting a private URL with a static extension -> Cache stores the private data -> Attacker retrieves it.
Path Mapping and Extension Confusion
The core of WCD is "Path Mapping Discrepancy." Many frameworks (like Spring or Django) support "Path Variables" or ignore trailing parts of the URL. For example, /api/user/profile and /api/user/profile/test.js might both resolve to the same profile controller.
However, a CDN like Cloudflare or an Nginx proxy will see the .js and assume it is a static, non-sensitive file. If the cache is configured to ignore the Set-Cookie header or if the backend fails to send Cache-Control: private, the profile data is cached under the /test.js key.
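A small Python sketch makes the two conditions explicit. The route table, `origin_resolve`, and `edge_would_cache` are hypothetical simplifications: the origin matches on the longest registered prefix, and the edge caches static-looking paths unless the response says `private` or `no-store`.

```python
from typing import Dict, Optional

ROUTES = {"/api/user/profile": "profile_controller"}  # hypothetical route table


def origin_resolve(path: str) -> Optional[str]:
    """Hypothetical origin router: longest registered prefix wins, trailing segments ignored."""
    candidates = [r for r in ROUTES if path == r or path.startswith(r + "/")]
    return ROUTES[max(candidates, key=len)] if candidates else None


def edge_would_cache(path: str, response_headers: Dict[str, str]) -> bool:
    """Hypothetical edge: cache static-looking paths unless the origin forbids it."""
    looks_static = path.lower().endswith((".js", ".css", ".png"))
    cache_control = response_headers.get("Cache-Control", "").lower()
    forbidden = "private" in cache_control or "no-store" in cache_control
    return looks_static and not forbidden


path = "/api/user/profile/test.js"
print(origin_resolve(path))                                      # handled by the profile controller
print(edge_would_cache(path, {}))                                # no Cache-Control: stored
print(edge_would_cache(path, {"Cache-Control": "private"}))      # origin opts out: not stored
```

The middle case is the dangerous one: the origin serves private data and nothing tells the edge not to keep it.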
How Attackers Trick Caches into Storing Private User Data
I often use a "double-dot" or semicolon technique. In the Indian context, I've seen many custom-built PHP applications running on Apache behind Nginx. Nginx sees /profile.php/nonexistent.css and caches it as a CSS file. Apache sees /profile.php and executes the script.
Attacker sends this link to a logged-in victim
https://target-ecommerce.in/my-account/settings.php/style.css
Impact: From Information Disclosure to Account Takeover
Once the victim clicks the link, their browser fetches the page. The Nginx cache sees the .css extension, misses, fetches the private settings.php from the backend, and stores it. I then simply visit the same URL to download the victim's session tokens, CSRF tokens, and personal details.
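The whole sequence can be simulated in a few lines of Python. This is a toy model under stated assumptions: `SESSIONS`, `origin`, and `ExtensionCache` are invented for illustration, the origin behaves like a PATH_INFO-style script, and the cache stores any path ending in .css keyed on the path alone.

```python
from typing import Dict, Optional, Tuple

SESSIONS = {"victim-session": "victim@example.com"}  # hypothetical session store


def origin(path: str, session_id: Optional[str]) -> str:
    """Hypothetical origin: anything under settings.php runs the settings script."""
    if path.startswith("/my-account/settings.php"):
        user = SESSIONS.get(session_id or "")
        return f"settings for {user}" if user else "please log in"
    return "not found"


class ExtensionCache:
    """Hypothetical edge cache: stores any .css response, keyed on path alone."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def fetch(self, path: str, session_id: Optional[str] = None) -> Tuple[str, str]:
        if path in self._store:
            return "HIT", self._store[path]
        body = origin(path, session_id)
        if path.endswith(".css"):
            self._store[path] = body
        return "MISS", body


cache = ExtensionCache()
url = "/my-account/settings.php/style.css"
victim = cache.fetch(url, session_id="victim-session")  # victim clicks the link
attacker = cache.fetch(url)                             # attacker, no cookie at all
print(victim)
print(attacker)
```

The attacker's request carries no session, yet it is answered from the entry the victim's request created.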
This is particularly effective against Indian e-commerce sites using "Quick Login" features via OTP, where the session token is often reflected in the initial dashboard load.
Advanced Exploitation Techniques
As caches become smarter, exploitation requires more nuance. We are no longer just looking for .js extensions; we are looking for normalization bugs and internal cache discrepancies.
Cache Key Normalization and Path Traversal
Different layers of the stack normalize URLs differently. Nginx might decode %2f to /, but the backend might not. When the two layers disagree, the cache keys the request under one path while the origin resolves another. If I can reach an internal-only API by traversing out of a cached directory, I can sometimes expose internal metrics or administrative panels.
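To see how far two views of the same raw path can drift apart, consider this Python sketch. The `normalize` helper is hypothetical; it only models one difference between layers, namely whether percent-escapes are decoded before dot segments are collapsed. Which real component decodes and which does not depends on the deployment.

```python
import posixpath
from urllib.parse import unquote


def normalize(raw_path: str, percent_decode: bool) -> str:
    """Collapse dot segments, optionally percent-decoding first (assumed layer behaviour)."""
    path = unquote(raw_path) if percent_decode else raw_path
    return posixpath.normpath(path)


raw = "/static/..%2fapi/admin/stats"
non_decoding_layer = normalize(raw, percent_decode=False)
decoding_layer = normalize(raw, percent_decode=True)
print(non_decoding_layer)  # still looks like something under /static/
print(decoding_layer)      # resolves outside /static/
```

A layer that applies its caching rule to the first result and a layer that routes on the second are no longer talking about the same resource.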
# Testing for normalization discrepancies
$ curl -I "https://target-ecommerce.in/static/..%2fapi/admin/stats"
Internal Cache Poisoning vs. External Cache Poisoning
Internal poisoning occurs within the application's own caching layer (like Redis or Memcached). If the application caches the result of a database query based on a user-controlled input without sanitization, I can poison the internal data structure. This is often harder to detect because it doesn't show up in standard X-Cache headers.
Chaining Cache Poisoning with Cross-Site Scripting (XSS)
If a site has a "Self-XSS" (an XSS that only affects the user who inputs it), it is usually considered low impact. However, by using Web Cache Poisoning, I can turn a Self-XSS into a Stored XSS. I input the payload, trigger the cache to store my "poisoned" version of the page, and now that XSS executes for everyone.
Exploiting Cookie-Based Cache Keys
Some sophisticated caches include parts of the cookie in the cache key, like the language preference or currency. In India, this is common for multi-lingual sites (Hindi, Tamil, Bengali). If I can find an unkeyed cookie or a cookie that is partially keyed, I can still achieve poisoning by finding a collision.
Tools and Methodology for Cache Security Testing
Manual testing is essential because automated scanners often miss the subtle timing and header discrepancies required for cache exploitation.
Automating Discovery with Burp Suite and Param Miner
I rely heavily on the "Param Miner" extension in Burp Suite. It automates the process of guessing unkeyed headers and parameters.
- Step 1: Right-click a request -> Guess headers.
- Step 2: Look for "Cache poisoning" issues in the Dashboard.
- Step 3: Manually verify by sending the request via Repeater and checking for an X-Cache: HIT header.
Manual Testing Workflows for Cache Vulnerabilities
My manual workflow involves identifying the "cache buster." A cache buster is a unique parameter (like ?cb=123) that forces a cache miss. This allows me to test the backend response without interference from previous tests.
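A cache buster is easy to generate programmatically. The helper below is a minimal Python sketch; the function name `with_cache_buster` and the default parameter name `cb` are my own choices, and it only works as intended if the target cache includes the query string in its key.

```python
import uuid
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_cache_buster(url: str, param: str = "cb") -> str:
    """Return `url` with a fresh random value for `param`, replacing any old one."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != param]
    query.append((param, uuid.uuid4().hex))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_cache_buster("https://target-ecommerce.in/my-account/?lang=hi"))
```

Calling it once per test case gives every probe its own cache entry, so a HIT can only come from the request sent immediately before.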
# Using ffuf to find extensions that trigger caching
$ ffuf -u https://target-ecommerce.in/my-account/FUZZ \
    -w extensions_list.txt \
    -mc 200 \
    -H "Cookie: sessionid=VALID_SESSION" \
    -X GET
Analyzing Cache Headers: X-Cache, Age, and CF-Cache-Status
I always inspect the response headers for clues:
- X-Cache / X-Cache-Hit: Indicates if the response came from cache.
- Age: How long the object has been in the cache (in seconds).
- CF-Cache-Status: Specific to Cloudflare (HIT, MISS, DYNAMIC, REVALIDATED).
- Vary: Tells the cache which headers must match for a HIT.
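When triaging many responses, I find it convenient to reduce these headers to a single verdict. The Python function below is a heuristic sketch, not a standard: the name `cache_verdict`, the order in which headers are consulted, and the rule that a positive Age implies a cached response are all assumptions.

```python
from typing import Mapping, Optional


def cache_verdict(headers: Mapping[str, str]) -> Optional[str]:
    """Summarise cache-related response headers as HIT, MISS, another status, or None."""
    h = {k.lower(): v.strip() for k, v in headers.items()}
    for name in ("cf-cache-status", "x-cache-status", "x-cache"):
        if name in h:
            value = h[name].upper()
            if "HIT" in value:
                return "HIT"
            if "MISS" in value:
                return "MISS"
            return value  # e.g. DYNAMIC, REVALIDATED
    # No explicit status header: a positive Age suggests a cache served the response.
    if h.get("age", "").isdigit() and int(h["age"]) > 0:
        return "HIT"
    return None


print(cache_verdict({"X-Cache": "Hit from cloudfront"}))
print(cache_verdict({"CF-Cache-Status": "DYNAMIC"}))
print(cache_verdict({"Age": "42"}))
```

A return value of `None` means the response gave no clue either way, in which case timing comparisons are the fallback.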
Prevention and Mitigation Strategies
Securing a cache requires a "Secure by Default" posture where only explicitly defined assets are cached. Monitoring for cache-related anomalies requires deploying an enterprise-grade SIEM solution that can parse complex log discrepancies.
Configuring Robust Cache Keys
The cache key must include all factors that can change the response. For dynamic content, the session ID or a unique user hash must be part of the key. However, this negates the benefits of caching for performance. The real solution is to separate static and dynamic content entirely.
Proper Use of the Vary Header
The Vary header is an underused defense. By setting Vary: Cookie, you tell the cache that the response differs for every distinct Cookie header value. On caches that honor Vary, this prevents Web Cache Deception, but it can significantly reduce the hit rate.
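The effect of Vary on the key can be modelled in Python. This is a simplified sketch of how a Vary-aware cache might extend its key; the function `vary_aware_key` and the host `shop.example` are hypothetical, and real caches differ in how (and whether) they honor the header.

```python
from typing import Dict, Tuple


def vary_aware_key(method: str, host: str, path: str,
                   request_headers: Dict[str, str], vary: str) -> Tuple:
    """Extend the base key with the request's value for each header named in Vary."""
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    req = {k.lower(): v for k, v in request_headers.items()}
    varied = tuple((n, req.get(n, "")) for n in names)
    return (method, host, path, varied)


victim = vary_aware_key("GET", "shop.example", "/account/x.css",
                        {"Cookie": "sessionid=abc"}, vary="Cookie")
attacker = vary_aware_key("GET", "shop.example", "/account/x.css",
                          {}, vary="Cookie")
print(victim == attacker)  # different keys, so the attacker cannot hit the victim's entry
```

The same property explains the hit-rate cost: two users with different cookies can never share an entry, even for content that is identical.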
Implementing Strict Cache-Control Policies
I recommend a strict Cache-Control policy for all dynamic endpoints. Instead of relying on the cache's default behavior, the backend should explicitly state:
Cache-Control: no-store, no-cache, must-revalidate, private
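One way to enforce this centrally in a Python backend is a small WSGI wrapper, sketched below using only the standard calling convention. The names `no_store_for`, `demo_app`, and `capture` are hypothetical, and the `/api/` prefix is an assumption about where dynamic endpoints live; a Django project would more likely use its own middleware or decorators for the same purpose.

```python
from typing import Tuple

NO_STORE = "no-store, no-cache, must-revalidate, private"


def no_store_for(prefixes: Tuple[str, ...], app):
    """Wrap a WSGI app so responses under `prefixes` always carry the strict policy."""
    def middleware(environ, start_response):
        if not environ.get("PATH_INFO", "").startswith(prefixes):
            return app(environ, start_response)

        def patched_start(status, headers, exc_info=None):
            # Drop whatever the view set and replace it with the strict value.
            kept = [(k, v) for k, v in headers if k.lower() != "cache-control"]
            kept.append(("Cache-Control", NO_STORE))
            return start_response(status, kept, exc_info)

        return app(environ, patched_start)

    return middleware


def demo_app(environ, start_response):
    """Stand-in application that (wrongly) marks everything as publicly cacheable."""
    start_response("200 OK", [("Content-Type", "application/json"),
                              ("Cache-Control", "public, max-age=600")])
    return [b"{}"]


captured = {}


def capture(status, headers, exc_info=None):
    captured["headers"] = dict(headers)


wrapped = no_store_for(("/api/",), demo_app)
wrapped({"PATH_INFO": "/api/v1/userinfo"}, capture)
print(captured["headers"]["Cache-Control"])
```

This only helps if the cache in front of the application respects the header, which is why the proxy configuration matters as much as the backend.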
Disabling Caching for Sensitive or Dynamic Content
In the Nginx configuration, ensure that static file matching is done strictly. Avoid regex that can be bypassed with delimiters.
# SECURE CONFIGURATION
location /static/ {
    alias /var/www/static/;
    expires 30d;
    add_header Cache-Control "public";
    # Ensure no proxying happens here to prevent WCD
}

location /api/ {
    proxy_pass http://backend;
    proxy_cache off;  # Explicitly disable cache for API
    add_header Cache-Control "no-store";
}
The Vulnerable Configuration Pattern
I frequently see this pattern in the wild, which is a recipe for disaster:
location ~* \.(js|css|jpg|jpeg|png|gif|ico|woff|pdf)$ {
    # VULNERABLE: Caches based on extension regardless of Origin headers
    proxy_cache static_cache;
    proxy_pass http://backend_nodes;
    proxy_ignore_headers Cache-Control Expires Set-Cookie;
    add_header X-Cache-Status $upstream_cache_status;
}
The proxy_ignore_headers directive is particularly dangerous as it overrides the developer's intent to keep data private.
The Evolving Landscape of Web Cache Security
As Indian e-commerce shifts toward "Headless Commerce" and micro-frontend architectures, the number of caching layers is increasing. Each layer introduces a new opportunity for delimiter confusion or normalization discrepancies. We are seeing a move away from simple path-based caching toward more complex logic based on JWT claims or custom headers.
The DPDP Act 2023 will necessitate a fundamental shift in how Indian DevOps teams handle caching. Accidentally caching PII (Personally Identifiable Information) is now a legal liability, not just a technical bug. Security researchers must look beyond the application layer and analyze the entire delivery pipeline, from the CDN edge to the internal Redis cluster.
Summary of Best Practices for Developers and DevOps
- Never ignore Cache-Control headers from the backend.
- Use an "Origin-Centric" caching model where the backend dictates cacheability.
- Ensure the cache and the origin server use the same URL normalization libraries.
- Avoid using regex to identify static files; use directory-based routing instead.
- Regularly audit cache keys using tools like Param Miner.
To verify whether your current Nginx setup is vulnerable to this kind of path confusion, run the following test against your staging environment:
$ printf 'GET /dashboard/settings.css HTTP/1.1\r\nHost: target-ecommerce.in\r\nConnection: close\r\n\r\n' | openssl s_client -connect target-ecommerce.in:443 -quiet
Observe if the response contains the dashboard HTML or a 404. If it returns the dashboard and an X-Cache: MISS (followed by a HIT on the second attempt), your session data is likely leakable.
