Tuesday, July 08, 2008

The case for case sensitive OpenID URL checking

URLs on the Internet are, apart from the scheme and host name, case sensitive by definition.  Some web servers choose to be case insensitive.  To treat OpenID URLs as anything but case sensitive for purposes of identifying a user introduces a grave security risk.  Implementers of OpenID should be cautious when using case-insensitive string comparisons and be aware that in most cases checks should be case sensitive.

OpenID Claimed Identifiers

Consider the tale of three URLs:

  1. http://MYPROVIDER.org/myuser
  2. http://myprovider.org/myuser
  3. http://myprovider.org/MYUSER

If entered into an OpenID login box, which combination of these URLs is guaranteed to represent the same identifier and therefore the same person?  (go ahead and think about it)...

If you decided all three, from an intuitive user perspective, you'd be right.  But from a security perspective, only 1 and 2 should be recognized as the same.  An RFC-abiding, self-respecting web server may be case sensitive for the path in its URL.  Linux/Apache servers may be configured this way quite easily, perhaps even by default in some cases.  Therefore an OP that does not take specific precautions to prevent multiple identifiers that differ only in casing from existing simultaneously would be open to the following attack:

  1. User A visits myprovider.org and acquires the myprovider.org/myuser identifier.
  2. User A visits somerp.com and logs in as myprovider.org/myuser.
  3. User B seeks to spoof User A's identity on somerp.com. 
  4. User B visits myprovider.org and acquires the myprovider.org/MYUSER identifier.
  5. User B visits somerp.com and logs in as myprovider.org/MYUSER.
  6. somerp.com does a case insensitive comparison and considers User B to be the same person as User A.  somerp.com grants all access that User A should get to User B.

Now, who broke a rule here?  Although some might argue that myprovider.org is a poor Provider, it hasn't actually violated any RFCs or the OpenID spec.  And since somerp.com has no way of knowing whether a given Provider is case sensitive or not, to prevent the above identity spoofing scenario it MUST perform a case sensitive comparison.

Now the one exception to this is the authority part of the URI.  The authority is the host name portion of the URL, such as myprovider.org.  The authority is case insensitive because DNS is case insensitive and the authority is not sent as part of the GET line in the HTTP protocol (it may be sent as one of the HTTP headers afterward, but no server can rightly be case sensitive on that particular header).  So to be fully correct, safe, and as user-friendly as possible, the URL should be compared in two parts: the authority, which is compared case insensitively, and the path+query+fragment segment, which is compared case sensitively.
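The two-part comparison described above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the function name same_identifier is hypothetical, and no other normalization (default ports, percent-encoding, a missing trailing slash) is attempted.  It relies on the standard library's urlsplit, which already reports the host name in lower case.

```python
from urllib.parse import urlsplit


def same_identifier(url_a: str, url_b: str) -> bool:
    """Hypothetical helper: compare two identifier URLs in two parts."""
    a, b = urlsplit(url_a), urlsplit(url_b)

    # Scheme: compared case insensitively.
    if a.scheme.lower() != b.scheme.lower():
        return False

    # Authority: urlsplit lower-cases .hostname for us; any userinfo
    # is kept as-is and therefore compared case sensitively.
    authority_a = (a.username, a.password, a.hostname, a.port)
    authority_b = (b.username, b.password, b.hostname, b.port)
    if authority_a != authority_b:
        return False

    # Path, query and fragment: exact, case sensitive match.
    return (a.path, a.query, a.fragment) == (b.path, b.query, b.fragment)


# The three URLs from the example above.
print(same_identifier("http://MYPROVIDER.org/myuser",
                      "http://myprovider.org/myuser"))
print(same_identifier("http://myprovider.org/myuser",
                      "http://myprovider.org/MYUSER"))
```

With this sketch, URLs 1 and 2 from the list compare as equal while URL 3 does not, which is the behavior argued for above.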

Realm-return_to validation

And if you're not yet convinced, consider the next concern regarding casing: realm-return_to validation.  Part of an OP's responsibility is verifying that a return_to URL falls somewhere at or beneath the realm URL.  There are security reasons for this validation that fall outside the scope of this post.  But case sensitive testing here is also very important.

Consider this scenario:

  1. Shared hosting provider www.yourpages.com allows subscribers to host their own web sites as virtual app directories under its domain, such that one subscriber might choose www.yourpages.com/user1/ as the root of their site and another subscriber might choose www.yourpages.com/user2/ as the root of their site.
  2. User 1 subscribes to www.yourpages.com and chooses to host his OpenID Provider site at www.yourpages.com/RadProvider.
  3. User 2 wants an OpenID URL and gets this one from RadProvider: www.yourpages.com/RadProvider/User2IsCoolGuy
  4. User 2 visits somerp.com and logs in with www.yourpages.com/RadProvider/User2IsCoolGuy and establishes private information with somerp.com.
  5. User 3 sets up his own OpenID Provider at www.yourpages.com/radprovider/.  Notice he chose the name to match an existing provider but with different casing.  This case sensitive shared hosting service didn't think of the security ramifications of its innocent web server allowing multiple users to have such similar paths, and so User 3 exploits this.
  6. Now consider an OpenID authentication request in which the realm is www.yourpages.com/radprovider and the return_to URL is www.yourpages.com/RadProvider.  If the OP compares the two case insensitively, the RP discovery step, which is a very important security improvement in OpenID 2.0, is totally thwarted because it will be done on the wrong Provider site.  Several authentication hacks can be done without RP discovery to thwart the security offered by OpenID.


Realm-return_to validation checks and Claimed Identifier matching MUST be done in a case sensitive way for the path+query+fragment pieces, and SHOULD be case insensitive for the scheme and authority parts.
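As a rough sketch of that rule applied to realm-return_to validation, the Python below checks that a return_to URL sits at or beneath a realm, comparing scheme and host case insensitively and the path case sensitively.  The function name return_to_matches_realm is hypothetical, the http:// prefixes on the example URLs are an assumption, and wildcard realms and default-port normalization are deliberately left out, so treat it as an illustration rather than a complete validator.

```python
from urllib.parse import urlsplit


def return_to_matches_realm(realm: str, return_to: str) -> bool:
    """Hypothetical check: is return_to at or beneath realm?"""
    r, t = urlsplit(realm), urlsplit(return_to)

    # Scheme and authority: case insensitive
    # (urlsplit already lower-cases .hostname).
    if r.scheme.lower() != t.scheme.lower():
        return False
    if (r.hostname, r.port) != (t.hostname, t.port):
        return False

    # Path: case sensitive. An empty path is treated as "/".
    realm_path = r.path or "/"
    target_path = t.path or "/"
    if target_path == realm_path:
        return True

    # Require a directory boundary so that /Rad does not match /RadProvider.
    prefix = realm_path if realm_path.endswith("/") else realm_path + "/"
    return target_path.startswith(prefix)


# User 3's realm must not validate against User 1's return_to URL.
print(return_to_matches_realm("http://www.yourpages.com/radprovider/",
                              "http://www.yourpages.com/RadProvider/login"))
print(return_to_matches_realm("http://www.yourpages.com/RadProvider/",
                              "http://www.yourpages.com/RadProvider/login"))
```

Here the mismatched-case realm from the scenario is rejected, while a realm whose path matches exactly is accepted.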

My argument here is not that this is common, but that it is possible.  And possible is all it takes for a security hole to be exploited and people's identity to be compromised.

In a follow-up post, I will present what OPs can do to give their users a great user experience even if all the RPs are properly case sensitive, such that their users can enter their OpenID URLs in any case they like and still authenticate properly.


  1. Any programmer or tech worth their salt knows that "case insensitive" comparisons are actually more laborious to perform. Also, the only time URLs are not case sensitive (apart from the domain name) is on MS servers (as the file system is not case sensitive for some poorly construed reason dating back to MS DOS).

    Despite this obvious fact, you point out that we should take care. However, I do not see the point.
    A system where case insensitive URLs are done as such is not going to have, for example, user names that are NOT unique by case!

    WTF is this article about really?

  2. SPO256,

    I think you're missing several critical pieces to why this article relates to everyone:

    Microsoft operating systems are only usually case insensitive, although they can be case sensitive. Linux servers can absolutely be case insensitive. It's a switch in Apache's configuration file.

    But file systems have nothing to do with case sensitivity as it relates to OpenID. Claimed Identifiers are almost never stored on the file system. They're generally stored in a database, and therein lies one of the chief sources of a potential break. Many SQL databases perform case insensitive string comparisons by default. If an incoming request for claimed identifier "foo" came into your server (Windows or Linux alike) and you did a SQL lookup for that user with:
    SELECT * FROM USERS WHERE ClaimedId = 'foo'
    You would very likely get users "Foo", "foo", and "FOO" back, if they all existed. But only "foo" should be considered as matching.

    The other part you might not be catching is that Linux servers need to accommodate Windows servers and vice versa. An OpenID implementation must perform correctly on either operating system, and must account for either operating system being on the remote end. This also takes special care.

    If you end up implementing your own OpenID library, start a discussion on an OpenID mailing list to get more of the details.

    Please avoid the profanity. Thanks.

  3. This took me a while to figure out: to make SQL Server treat your OpenID claimed identifier lookups as case sensitive, make sure that column's collation is set to a case sensitive one:
    COLLATE SQL_Latin1_General_CP1_CS_AS

  4. Andrew, which databases did you have in mind? (That perform case insensitive string comparisons by default)