Incoming request paths

Casing and URL encoding

Casing and URL encoding add significant complexity to the jobs IIS and ASP.NET must perform. ASP.NET automatically decodes and lowercases the scheme, host, and port. The Path (and PathInfo) portions are decoded, but case is not changed. The querystring is not modified. When the querystring is parsed to populate the QueryString collection, names and values get URL decoded once.

There is an almost infinite number of ways to represent any given URI. Consider the following URIs, and how they are sanitized before your application receives them:

URL encoded scheme (http)

URI
%68%74%74%70://localhost:87/content/pages/test.aspx
Response
IIS7 returns HTTP Error 400.0 - Bad Request

Double URL encoded host

URI
http://%25%34%63%25%34%66%25%34%33%25%34%31%25%34%63%25%34%38%25%34%66%25%35%33%25%35%34:87/Content/pages/test.aspx
Response
IE7 returns "Host not found" (browser only performs one level of URL decoding, apparently)
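The "one level of decoding" behavior is easy to reproduce outside the browser. The following is a minimal Python sketch that uses the standard library's `urllib.parse.unquote` as a stand-in for a single decoding pass. It does not model IE7 or IIS; it only shows why a double-encoded host is still encoded after one pass.

```python
from urllib.parse import unquote

# The double-encoded host from the example above, split for readability.
double_encoded = (
    "%25%34%63%25%34%66%25%34%33%25%34%31%25%34%63"
    "%25%34%38%25%34%66%25%35%33%25%35%34"
)

once = unquote(double_encoded)  # one pass: still a percent-encoded name
twice = unquote(once)           # second pass: the actual host name

print(once)   # %4c%4f%43%41%4c%48%4f%53%54
print(twice)  # LOCALHOST
```

A client that stops after the first pass tries to resolve the still-encoded name, which would explain the "Host not found" result.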

URL Encoded host (UPPERCASE)

URI
http://%4c%4f%43%41%4c%48%4f%53%54:87/Content/pages/test.aspx
Request.Url.OriginalString
http://localhost:87/Content/pages/test.aspx

IE7 decodes the host name automatically

URL Encoding the host (lowercase)

URI
(URI not preserved in the source; the hyperlink did not survive conversion)
Request.Url.OriginalString
http://localhost:87/Content/pages/test.aspx

Mixed case

URI
HttP://LOcalHoSt:87/CoNtEnT/pAgEs/tEsT.asPx
Request.Url.OriginalString
http://localhost:87/CoNtEnT/pAgEs/tEsT.asPx
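The same normalization (lowercase scheme and host, path casing untouched) can be sketched in Python. This is an illustration of the rule rather than ASP.NET's implementation; `normalize_authority` is a hypothetical helper name, and it relies on `urllib.parse.urlsplit` lowercasing the scheme and exposing a lowercased `hostname`.

```python
from urllib.parse import urlsplit

def normalize_authority(url: str) -> str:
    """Lowercase the scheme and host; leave path and query casing alone."""
    parts = urlsplit(url)
    scheme = parts.scheme        # urlsplit already lowercases the scheme
    host = parts.hostname or ""  # .hostname is returned lowercased
    port = f":{parts.port}" if parts.port is not None else ""
    query = f"?{parts.query}" if parts.query else ""
    return f"{scheme}://{host}{port}{parts.path}{query}"

normalized = normalize_authority("HttP://LOcalHoSt:87/CoNtEnT/pAgEs/tEsT.asPx")
print(normalized)  # http://localhost:87/CoNtEnT/pAgEs/tEsT.asPx
```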

URL Encoding the entire URI (http://localhost:87/content/pages/test.aspx)

URI
%68%74%74%70%3a%2f%2f%6c%6f%63%61%6c%68%6f%73%74%3a%38%37%2f%63%6f%6e%74%65%6e%74%2f%70%61%67%65%73%2f%74%65%73%74%2e%61%73%70%78
Response
IIS7 returns HTTP Error 400.0 - Bad Request

URL Encoding the path (FilePath + PathInfo), but not the query string

URI
http://localhost/%55%72%6c%54%65%73%74%73%2f%63%6f%6e%74%65%6e%74%2f%70%61%67%65%73%2f%74%65%73%74%2e%61%73%70%78%2f%66%6f%6c%64%65%72%2f%66%69%6c%65?querystring
Request.Url.OriginalString
http://localhost:80/UrlTests/content/pages/test.aspx/folder/file?querystring

URL Encoding the path, querystring, and the "?" in-between.

URI
http://localhost/%55%72%6c%54%65%73%74%73%2f%63%6f%6e%74%65%6e%74%2f%70%61%67%65%73%2f%74%65%73%74%2e%61%73%70%78%2f%66%6f%6c%64%65%72%2f%66%69%6c%65%3f%71%75%65%72%79%73%74%72%69%6e%67
Response
Server Error in '/UrlTests' Application.
'/UrlTests/content/pages/test.aspx/folder/file?querystring' is not a valid virtual path.

Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.
Exception Details: System.Web.HttpException: '/UrlTests/content/pages/test.aspx/folder/file?querystring' is not a valid virtual path.
Source Error:
Source File: c:\Users\Nathanael\desktop\webapplication2\Content\pages\test.aspx.cs    Line: 912308
Stack Trace:
Version Information: Microsoft .NET Framework Version:2.0.50727.312; ASP.NET Version:2.0.50727.833
When customErrors is set to Off, hackers can exploit this error page to discover:
   1. The physical location of the web site
   2. The physical location of the .NET framework and the
Of course, I'm sure there are other ways to generate errors - this is just one you can't patch.

Also, since the entire URL was parsed as the virtual path, this looks like a way to circumvent the splitting of FilePath, PathInfo, and the querystring.

If you want the querystring to be parsed correctly, you cannot encode the delimiting question mark.
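The effect is not specific to ASP.NET: any parser that splits on the literal "?" before decoding will miss an encoded one. Below is a rough Python sketch of that ordering, using `urllib.parse.urlsplit` and `unquote`. The function name `split_then_decode` is made up for this example, and the code only models the order of operations, not the IIS pipeline.

```python
from urllib.parse import urlsplit, unquote

def split_then_decode(url: str) -> tuple[str, str]:
    """Split on the literal '?' first, then decode the path portion."""
    parts = urlsplit(url)
    return unquote(parts.path), parts.query

literal = "http://localhost/UrlTests/content/pages/test.aspx/folder/file?querystring"
encoded = "http://localhost/UrlTests/content/pages/test.aspx/folder/file%3fquerystring"

# Literal '?': the query string is separated as expected.
print(split_then_decode(literal))
# Encoded '?': the query ends up inside the decoded path, and the query is empty.
print(split_then_decode(encoded))
```

In the second case the decoded path contains a "?" character, which matches the "is not a valid virtual path" error shown above.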

URL Encoding the path and querystring separately

URI
http://localhost/%55%72%6c%54%65%73%74%73%2f%63%6f%6e%74%65%6e%74%2f%70%61%67%65%73%2f%74%65%73%74%2e%61%73%70%78%2f%66%6f%6c%64%65%72%2f%66%69%6c%65?%71%75%65%72%79%73%74%72%69%6e%67
Request.Url.OriginalString
http://localhost:80/UrlTests/content/pages/test.aspx/folder/file?%71%75%65%72%79%73%74%72%69%6e%67

Note: HttpUtility.UrlEncode() only encodes non-alphanumeric characters. To get a complete encoding like the one used above, where every character is percent-encoded, you'll have to use something else.
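For reference, here is one way to produce a complete encoding of the kind used in the test URIs, sketched in Python rather than .NET. `encode_every_byte` is a hypothetical helper, not a library function; the standard `urllib.parse.quote` is shown alongside it because it also leaves letters and digits untouched.

```python
from urllib.parse import quote, unquote

def encode_every_byte(text: str) -> str:
    """Percent-encode every byte of the UTF-8 form, letters and digits included."""
    return "".join(f"%{byte:02x}" for byte in text.encode("utf-8"))

print(quote("http"))              # http  (alphanumerics are left alone)
print(encode_every_byte("http"))  # %68%74%74%70

# Decoding restores the original text.
path = "/content/pages/test.aspx"
assert unquote(encode_every_byte(path)) == path
```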

So, I think you can see why basing security on the incoming path is a bad idea.

Differences between Root and Subfolder applications

IIS allows websites to be hosted in virtual folders that do not correspond to the on-disk organization. For example, if you publish a website located in C:\Websites\MyWebsite2\ on virtual folder /Web1/Test/, the URI to access it would be http://mycomputer/Web1/Test/.

The following table contains data from a single web site being accessed from two different locations. One channel is the virtual folder /UrlTests on port 80, and the other is the root website on port 87.

| Property | http://localhost:87/content/pages/test.aspx/pathinfo?query=value | http://localhost/UrlTests/content/pages/test.aspx/pathinfo?query=value |
| --- | --- | --- |
| Request.ApplicationPath | / | /UrlTests |
| Request.RawUrl | /content/Pages/test.aspx/pathinfo?query=value | /UrlTests/content/Pages/test.aspx/pathinfo?query=value |
| Request.AppRelativeCurrentExecutionFilePath | ~/content/Pages/test.aspx | ~/content/Pages/test.aspx |
| Request.CurrentExecutionFilePath | /content/Pages/test.aspx | /UrlTests/content/Pages/test.aspx |
| Request.FilePath | /content/Pages/test.aspx | /UrlTests/content/Pages/test.aspx |
| Request.Path | /content/Pages/test.aspx/pathinfo | /UrlTests/content/Pages/test.aspx/pathinfo |
| Request.PathInfo | /pathinfo | /pathinfo |
| Request.PhysicalApplicationPath | C:\Users\Nathanael\Desktop\WebApplication2\ | C:\Users\Nathanael\Desktop\WebApplication2\ |
| Request.PhysicalPath | C:\Users\Nathanael\Desktop\WebApplication2\content\Pages\test.aspx | C:\Users\Nathanael\Desktop\WebApplication2\content\Pages\test.aspx |
| Request.Url.OriginalString | http://localhost:87/content/Pages/test.aspx/pathinfo?query=value | http://localhost:80/UrlTests/content/Pages/test.aspx/pathinfo?query=value |

Note: CurrentExecutionFilePath (and the AppRelative version) differ from FilePath when Server.Execute or Server.Transfer is used. CurrentExecutionFilePath changes to reflect the executing file, while FilePath remains unchanged.

This should illustrate why application-relative paths (~/file.aspx) should be used wherever possible: they permit the site to be hosted in a virtual folder as well as in a site root. Can you imagine the maintenance costs if you needed to move your site and were using absolute paths such as /vdir/file.aspx? You may not currently need to host the site on a virtual folder for dev purposes, but what about 5 years from now, when you want to keep the old system online in a subfolder of the new site?
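To make the benefit concrete, here is a small Python sketch of how an application-relative path resolves against the two ApplicationPath values from the table. `to_absolute` is a hypothetical helper written for this illustration. In ASP.NET itself, VirtualPathUtility.ToAbsolute is the usual tool for this job, and this sketch does not claim to match its edge-case behavior.

```python
def to_absolute(app_relative: str, application_path: str) -> str:
    """Resolve '~/...' against an application path such as '/' or '/UrlTests'."""
    if not app_relative.startswith("~/"):
        raise ValueError("expected an application-relative path starting with '~/'")
    # Drop any trailing slash on the application path so the root ('/')
    # and a virtual folder ('/UrlTests') are joined the same way.
    return application_path.rstrip("/") + "/" + app_relative[2:]

print(to_absolute("~/content/Pages/test.aspx", "/"))
# /content/Pages/test.aspx
print(to_absolute("~/content/Pages/test.aspx", "/UrlTests"))
# /UrlTests/content/Pages/test.aspx
```

The same "~/" path yields the correct absolute path in both hosting setups, which is exactly what a hard-coded /vdir/file.aspx cannot do.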


About Nathanael

Nathanael Jones is a software engineer, father, consultant, and computer linguist with unreasonably high expectations of inanimate objects. He refines .NET, ruby, and javascript libraries full-time at Imazen, but can often be found on stack overflow or participating in W3C community groups.
