Sunday, November 02, 2008

OpenID on Classic ASP

Several people have asked for an OpenID library for Classic ASP.  Yes, it's several years deprecated, but there are still some major and smaller sites using it.  Classic ASP allows the consumption of COM/ActiveX objects, so it turns out DotNetOpenId can be used by Classic ASP!

But DotNetOpenId isn't a COM server.  Not in its released form anyway.  But there is now a CTP of DotNetOpenId that does offer a COM interface that Classic ASP can call into, and it includes an actual sample of a classic ASP OpenID relying party.

You can download the CTP here, on the condition that you then leave feedback on your experience on our mailing list.

Tuesday, October 21, 2008

DotNetOAuth source code to be released imminently

Microsoft attorneys have signed off on the open source release of the DotNetOAuth source code that I've been building in my spare time.  It should show up in the public git repo in the next 24 hours.  What's up there right now is several weeks old and tons of progress has been made that I'm eager to publish.

Once it's published I'd very much like to get some feedback on the public API, and any holes in what it supports. 

And now a review of what's coming...

I'll publish the DotNetOAuth source code first as a separate library just so you can get at it sooner, since that's its current form.  I'll eventually merge it into the DotNetOpenId library for its final release.  But I hope many of you will try it out in its separate library form and give lots of feedback, particularly on its public API including ease/difficulty of use and discovery, and scenarios that are not yet allowed or difficult given what I've exposed.

I have samples demonstrating both Service Provider and Consumer roles.  The Consumer sample demonstrates downloading your Gmail address book using OAuth.  Then both Consumer and SP samples work in tandem to show off WCF with OAuth authorization.  For those of us working more in the .NET or SOAP worlds, WCF is a great API for making data queries from one web site to another, and adding OAuth is really exciting.

And of course unit tests (currently 140 of them) verifying correct behavior.  The tests are written to use mstest.exe rather than NUnit.  Whether it stays that way is still flexible, but since I'm writing all the code at this point and find mstest to be more convenient and easier to measure code coverage for better testing, that's what I've chosen for now.  With the right set of #ifdefs, I just might be able to get compiles to work against either unit test library, but it hasn't been a priority yet.

As revealed before, this library takes dependencies on .NET 3.5.

Saturday, October 18, 2008

Your security is inversely proportional to the number of OpenID Providers you use

Just a quick note if you're familiar with OpenID's XRDS documents and how they allow you to have one 'omni-Identifier' that lists all your other OpenID providers and identifiers so that you can use this one Identifier to log in anywhere with any OP and yet maintain just a single identity.  Although there's great convenience in tying your several Identifiers into a single Identifier using an XRDS document, one should be cautious about just which Providers are listed inside your Identifier's XRDS document.

Any individual OP listed in your XRDS file has the capability of asserting your identity both through the identifier it assigned to you and through your omni-identifier.  If that OP was evil, or compromised, or just plain poorly written, your identity on all sites you log into with that Identifier is equally compromised. Your identity is only as secure as the weakest OP in your XRDS file.  Since you typically don't know which OP will fail first, a simple equation sums it up: the strength of your identity's security is inversely proportional to the number of Providers in your XRDS document.  Each one increases the surface area of your risk.
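To put rough numbers on that intuition, here is a minimal sketch in Python. It assumes each Provider fails independently with some small probability; the 2% figure is invented purely for illustration, and "inversely proportional" is meant loosely rather than as a literal formula.

```python
from math import prod

def compromise_probability(failure_probs):
    """Chance that at least one listed Provider fails, assuming independent failures."""
    return 1 - prod(1 - p for p in failure_probs)

# Hypothetical per-Provider failure probability, chosen only for illustration.
for count in (1, 3, 10):
    risk = compromise_probability([0.02] * count)
    print(f"{count} Providers listed: {risk:.1%} chance that at least one fails")
```

Whatever the real probabilities are, the risk only grows as Providers are added, because a single failure is enough.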

What does this mean?  Be cautious.  I would advise that you have no more than 3 Providers listed in your XRDS file.  One might be all you need.  In my case, I have a favorite OP, and then a couple of others I include with lower priority values so that RPs I sign into that have whitelists of OPs can still use my omni-Identifier.
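Here is a small sketch of how an RP with a whitelist might pick among the Providers in an XRDS document. The provider names and the dictionary layout are hypothetical, and it assumes the usual XRDS convention that a numerically lower priority means "more preferred".

```python
def choose_provider(services, whitelist=None):
    """Return the most preferred service the RP is willing to use, or None."""
    candidates = [
        s for s in services
        if whitelist is None or s["provider"] in whitelist
    ]
    # Assumption: a smaller priority number is tried first.
    return min(candidates, key=lambda s: s["priority"], default=None)

# Hypothetical contents of an omni-Identifier's XRDS document.
services = [
    {"provider": "favorite-op.example", "priority": 10},
    {"provider": "backup-op.example", "priority": 20},
    {"provider": "whitelisted-op.example", "priority": 30},
]

print(choose_provider(services))
print(choose_provider(services, whitelist={"whitelisted-op.example"}))
```

An RP without a whitelist lands on the favorite OP, while a picky RP still finds a Provider it accepts.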

You can be sure I won't add any community group's Provider to my XRDS file.  We should all keep only very reliable Providers as our identity providers.

Tuesday, October 14, 2008

Enable international characters in OpenID domain names

If you're using DotNetOpenId, you should consider enabling .NET's support for IDN (Internationalized Domain Names), so that users with international characters in the host name of their OpenID can log into your site.  Enabling IDN support requires .NET 3.5. 

First, add this to your web.config or machine.config file's <configuration><configSections> area.

<section name="uri" type="System.Configuration.UriSection, System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" />

Then within the <configuration> tag of your web.config file, add this snippet:

<uri>
	<idn enabled="All" />
	<iriParsing enabled="true" />
</uri>

And that's all there is to it.  Your DotNetOpenId web site will automatically start supporting OpenIDs from around the world.

In DotNetOpenId 2.5.1 and later, these snippets will be included in the sample sites included in the distribution.

For more information, see the section entitled "International Resource Identifier Support" in the MSDN Magazine article Get Connected With The .NET Framework 3.5.
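If you're curious what IDN does to a host name, this short Python sketch shows the general idea: a host with international characters is mapped to an ASCII-compatible form before DNS ever sees it. It uses Python's built-in codec and a made-up host name. It is not how .NET implements the feature, only an illustration of the mapping.

```python
# A hypothetical host name containing a non-ASCII character (u-umlaut).
host = "b\u00fccher.example"

# Convert to the ASCII-compatible (Punycode) form used on the wire.
ascii_host = host.encode("idna").decode("ascii")
print(ascii_host)  # xn--bcher-kva.example

# The mapping is reversible, so the friendly form can be shown to the user.
round_trip = ascii_host.encode("ascii").decode("idna")
print(round_trip == host)
```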

Monday, October 06, 2008

Why OAuth can't be ignored

Do you realize that your email password is probably your most sensitive piece of information?  You should never, ever give it away.  Not to another "trustworthy" web site, and not to a desktop app that wants to perform some service for you.  Web-based email these days is hosted by providers that offer many other services using the same user account.  For instance, if you use a desktop app to write blog posts and publish them to your Blogger account, do you realize you're giving that desktop app the full ability to read your email, write email as if it were you, download and publish your Google search history, and otherwise impersonate you on just about every web site you belong to?

Web sites and desktop apps like to offer services that extend or require the services on another web site, using your account.  It's your account!  Why do they need to impersonate you?  Likely because they need to access some data that is in your account and the web site holding that data will only let "you" access it, so obviously this new web site will need to pretend to be you to get to it.  This is what impersonation is about.

They don't need to impersonate you.  There are two reasons why trusting someone else to impersonate you is a Very Bad Idea:

  1. It usually involves giving away your password to that other web site.  Giving away your password is a free-for-all pass to your account.  It lets the other web site pretend to be you for an unlimited duration (or until you change your password, but that's a pain) and do unlimited things as you.  Oh, and they could publish your password to the world (yes, it actually happens!) either deliberately or by accident.
  2. Your password is very often the same password you use on many other sites.  Not only are you giving this arbitrary site the ability to impersonate you on just one site, but you're probably giving them the ability to impersonate you on many other sites as well.

There is a better way.  OAuth is a protocol that can give web sites and desktop applications the ability to access a constrained subset of your information or abilities on some web site, without giving away your password, and therefore without actually impersonating you, as the site holding your protected information knows that it's not really you, but some 3rd party you've authorized that is actually accessing your data.  Oh, and with OAuth you can revoke a 3rd party's privileges to access your protected data at any time without any inconvenience around changing your password.

Let's look at a couple of real-world scenarios:

Windows Live Writer Today

I love Windows Live Writer.  I'm using it now to write this post.  But it demands to know my Blogger username and password in order to publish posts when I'm done writing them.  So yes, Live Writer could theoretically read my email, write emails, read my address book, and ultimately break into all web sites that I have accounts on.  Oh, but Microsoft wrote Live Writer, so I'm safe, right?  Perhaps.  But suppose someone were to steal my laptop.  They could decrypt Live Writer's cache of my password and do all these Bad Things. 

Windows Live Writer Tomorrow

What I'd love to see is the next version of Live Writer offering OAuth access to my Blogger account.  Google already supports OAuth so there's nothing stopping Live Writer from doing this.  Here's how it would work when first setting up Live Writer to publish to my blog:

LW: What's your blog URL?

Me: http://blog.nerdbank.net

LW: I see that that blog is hosted by Blogger.  I know how to work with that.  May I request permission to publish to it?

Me: Yes.

LW opens a browser window that I see google.com come up in.  I'm already logged into google.com since I read my email a lot, so I'm not asked to log in.

google.com: Live Writer says it wants to publish to your blog.  Is that ok with you?

Me: Yes.

google.com: Ok, LW has been authorized.  Close this window and return to LW.

Me: [I close the browser].  LW, Google.com has authorized you.

LW: Great.  You can start writing your first post now.

In this process neither I, nor Google, gave Live Writer my password.  Google generated a cryptographically strong token and token secret (essentially a username and password for computers to use) just for Live Writer to use, that is limited to just managing my blog. 

If someone were to steal my laptop now, the most they could do is update my blog.  But in that they would very likely fail because I could just log into google.com on any computer and revoke Live Writer's permission to publish to my blog and then the token that was stolen along with the laptop is worthless.
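For the programmers reading, the idea of a scoped, revocable token can be sketched in a few lines of Python. This is a toy model, not Google's implementation and not the OAuth wire protocol: the class name, the scope strings and the storage are all hypothetical, and real OAuth also has the client sign each request with the token secret, which is left out here.

```python
import secrets

class TokenStore:
    """Toy service-provider registry of tokens issued to third-party apps."""

    def __init__(self):
        self._tokens = {}  # token -> (token_secret, consumer, allowed scopes)

    def issue(self, consumer, scopes):
        # Random, hard-to-guess values: a username and password for computers.
        token = secrets.token_urlsafe(16)
        token_secret = secrets.token_urlsafe(32)
        self._tokens[token] = (token_secret, consumer, frozenset(scopes))
        return token, token_secret

    def is_allowed(self, token, scope):
        entry = self._tokens.get(token)
        return entry is not None and scope in entry[2]

    def revoke(self, token):
        # Revoking touches only this token; the user's password is unaffected.
        self._tokens.pop(token, None)

store = TokenStore()
token, token_secret = store.issue("Live Writer", {"blog:publish"})
print(store.is_allowed(token, "blog:publish"))  # may publish to the blog
print(store.is_allowed(token, "mail:read"))     # may not read email
store.revoke(token)
print(store.is_allowed(token, "blog:publish"))  # a stolen token is now useless
```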

LinkedIn / Facebook

Social networking web sites are notorious for asking for your email password.  And they range from very reputable to downright nasty, and people give their passwords away to all of them because we've trained them to.  Shame on us.

LinkedIn and Facebook should lead the way following this process:

LI: If you use any of the following email programs, click on one [list of logos].

Me: [click on Gmail]

LI: [quietly redirects me to google.com]

google: LinkedIn wants to download your address book.  Is that ok?

Me: Yes.

LI: Thanks!  We've spammed all your friends and recorded their email addresses so we can invite you to spam them again later if they don't join our web site. :)

You can see that this is even simpler for the user than the desktop client app example.  It is left as an exercise for the reader to justify giving away your friends' email addresses to some web site they may not want to join.

Why OAuth cannot be ignored

While the above are compelling reasons why OAuth should be used instead of impersonation, there is an unavoidable reason why OAuth will have to be used in the future: no passwords.  Passwords need to die, and are dying slowly already.  They can be phished, forgotten, lost, etc.  And they make impersonation too easy.  InfoCards and OpenID make great password replacements for logging into web sites.  But if there is no password, there is nothing for you to possibly share with these 3rd party apps that want to impersonate you.  Their only option left will be OAuth (or some similar technology) so they can get their own access token.  It's available today, so why wait?  Help train your users against being phished instead of training them to be phished, and future-proof your app today. 

Add OAuth support to your apps for accessing your users' data on other web sites!

Friday, September 26, 2008

What's coming for DotNetOpenId and DotNetOAuth

Did you even know DotNetOAuth existed?  It's alpha quality right now but that will change soon.  OAuth and OpenID serve two orthogonal needs, and although Google is trying to combine the two, I haven't figured out how they are going to do it yet.  In the meantime I have been writing the DotNetOAuth library for you .NET programmers out there.  It's coming along nicely.

When I joined the DotNetOpenId team a couple of years ago, the library had already been mostly ported from Janrain's Boo version, which was itself ported from Janrain's Python library.  Eventually I refactored the entire DotNetOpenId library to not only look and feel more like a native C# library but also be much easier to discover via Intellisense for the end user and made most of the classes internal so that people wouldn't use the wrong stuff.  It was a great exercise for the budding architect in me.  But in retrospect, when I look at the implementation of the code I still see quite a few of its original Python roots.  Writing unit tests was really tough as the thing wasn't originally designed with testability in mind, it seemed.  I've overcome much of that with dozens of test helper methods, but it still feels cumbersome at times.

Starting DotNetOAuth has been a real treat for me, personally.  I got to start from the ground up.  I read the spec over a few times and tried to get the spirit of it without letting the way it expressed ideas directly imply how I would implement it.  I saw a great many concepts and requirements similar to OpenID's.  In particular the most glaring one was this need to pass messages between two parties both directly and indirectly (which means from one server to another, and via redirects through the browser). 

I wanted to capture the concept of messages being passed around in such a way that one could write code that would write up a message, call up a channel and just say "send" in essence and the right thing would happen and the author would not need to know exactly what kind of transport the message needed or anything.  When I thought of this, WCF seemed like the obvious solution.  It has an extensible message pipe and would be great... or so I thought.  And frankly maybe I'm wrong: maybe WCF still could do it.  But I found the need for messages to be passed via redirect to be a bit beyond WCF's designers' plans and it seemed to fall short.  I bet you could do it by writing your own transport binding element, but there were other requirements that WCF didn't quite meet here and there, and eventually I thought I might as well try writing my own mini-WCF just for passing messages directly and indirectly and see where that took me.

The important goal for me the whole while was that I wanted to write an extensible message passing library that could be used to reconstruct the insides of DotNetOpenId and build up DotNetOAuth, as well as make any future redirection-based protocols very easy and quick to implement. 

I'm very happy to report that I succeeded.  I have an extensible channel with binding elements including support for message tamper detection, expiring messages and replay detection.  The messages can be passed directly or indirectly.  The transport itself is very customizable, and most importantly, very testable.  Mock transports can be easily injected so that tests can simulate full scenarios of two party communication without actually launching an HTTP server.  I went on to write a small test simulation framework so you can write each party's side of the test separately as two different thread methods, and the framework will allow them to interact as if they were live, yet much faster and easier to step through offline.
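To give a feel for the shape of such a design, here is a rough Python sketch of a channel built from stackable binding elements. It is not DotNetOAuth's API: every class and method name is invented for illustration, the signature is a plain HMAC over the sorted fields, and the "transport" is just a Python list standing in for a mock.

```python
import hashlib
import hmac
import secrets
import time

class SigningElement:
    """Tamper detection: signs all fields except the signature itself."""
    def __init__(self, key):
        self._key = key
    def _sign(self, message):
        payload = "&".join(f"{k}={message[k]}" for k in sorted(message) if k != "sig")
        return hmac.new(self._key, payload.encode(), hashlib.sha256).hexdigest()
    def outgoing(self, message):
        message["sig"] = self._sign(message)
    def incoming(self, message):
        if not hmac.compare_digest(message.get("sig", ""), self._sign(message)):
            raise ValueError("message was tampered with")

class ExpirationElement:
    """Expiring messages: stamps on send, rejects stale messages on receive."""
    def __init__(self, max_age=300):
        self._max_age = max_age
    def outgoing(self, message):
        message["timestamp"] = str(int(time.time()))
    def incoming(self, message):
        if time.time() - int(message["timestamp"]) > self._max_age:
            raise ValueError("message expired")

class ReplayElement:
    """Replay detection: each nonce is accepted only once."""
    def __init__(self):
        self._seen = set()
    def outgoing(self, message):
        message["nonce"] = secrets.token_hex(8)
    def incoming(self, message):
        if message["nonce"] in self._seen:
            raise ValueError("message replayed")
        self._seen.add(message["nonce"])

class Channel:
    def __init__(self, elements, transport):
        self._elements = elements
        self._transport = transport  # any callable; a mock in tests
    def send(self, message):
        message = dict(message)
        for element in self._elements:            # signing runs last
            element.outgoing(message)
        self._transport(message)
    def receive(self, message):
        for element in reversed(self._elements):  # signature is checked first
            element.incoming(message)
        return message

wire = []  # mock transport: no HTTP server needed
channel = Channel(
    [ReplayElement(), ExpirationElement(), SigningElement(b"shared-secret")],
    wire.append,
)
channel.send({"mode": "authorize"})
print(channel.receive(dict(wire[0]))["mode"])
```

The caller just says "send"; the elements decide what gets added and checked, and swapping the transport for a mock keeps the whole exchange in memory.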

So what does this mean for DotNetOpenId?  I expect that DotNetOpenId 3.0 will be largely the same on the customer-facing side.  But internally (I hope) I'll have had the time to tear out its innards to use this new messaging framework.  It should make it much easier for people who like to read the source code of the open source libraries they consume to understand what's going on and even write their own tests.

And where is DotNetOAuth?  I've got a bit of work left to do on it.  I'm writing a few more tests and I still have the sample sites to write.  It's probably a few weeks away from going to beta.  And I still need to get Microsoft lawyers to sign off on my ability to publish this work that I've done in my spare time since I work at Microsoft and they like to know what their employees are giving away for free.  There's a lot more work on the source code than what I've published up at the Google Code web site.  So don't think the work has stalled.  I'm just waiting for sign off before I push up all the code changes I've made for public viewing.

Watch here for more news.

Friday, September 19, 2008

NoTrailingWhitespace StyleCop rule, and others

After a period of initial bad taste, I've come to like StyleCop.  I dislike a few of their rules, and I turn those off.  But overall I like how it helps many developers maintain a consistent coding style within a project.  I wrote a few rules of my own, and offer them here for free.

Source code here: NerdBank StyleCop Rules

And here are the rules that are currently in the library, with more likely to be added soon.  Remember you can pick and choose which rules you want to use within the Settings.StyleCop file.

NoTrailingWhiteSpace

This rule will make sure that lines of code don't end with extra spaces or tabs that don't do anything.  What does it hurt?  Well, it doesn't much.  But just like almost any rule in StyleCop, it helps maintain consistent formatting.  I like enabling visible whitespace in Visual Studio so that I can make sure I'm consistent with tabs vs. spaces within a file.  And seeing whitespace characters at the end of a line is tacky.  But the bigger reason is that auto-formatting in C# will remove that trailing whitespace, which means if you're using revision control you'll often get diffs on lines that have nothing but removed whitespace changes, making your diff bigger than it needs to be.  This rule helps you avoid the whitespace to begin with.

IndentUsingTabs

This is the inverse of the SA1027 rule: TabsMustNotBeUsed.  It ensures that you do use tabs instead of spaces for indentation.  Beautiful.  I like tabs instead of spaces so that whoever is viewing the code can choose their preferred indentation size rather than having to use whatever the author thought was readable.

NoSpacesBeforeTabs

Sometimes when one person is using tabs with size 4 and another has size 3 tabs, extra spaces can make formatted code look wrong.  This rule just makes sure you don't have a tab, a space, and another tab or something strange like that.
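The logic behind these three checks is simple enough to sketch in a few lines of Python. This is only an approximation of what the rules look for, not the StyleCop rule source: the function is hypothetical, and the exact conditions (for example, treating any indentation that starts with a space as a violation) are my simplification.

```python
import re

def check_line(line):
    """Return the names of the rules that a single source line violates."""
    line = line.rstrip("\r\n")
    violations = []
    # Spaces or tabs at the end of the line that do nothing.
    if line != line.rstrip(" \t"):
        violations.append("NoTrailingWhiteSpace")
    indent = re.match(r"[ \t]*", line).group()
    # Indentation that begins with a space instead of a tab.
    if indent.startswith(" "):
        violations.append("IndentUsingTabs")
    # A space immediately followed by a tab inside the indentation.
    if " \t" in indent:
        violations.append("NoSpacesBeforeTabs")
    return violations

samples = ["\tint x;", "\tint x;  ", "    int x;", "\t \tint x;"]
for sample in samples:
    print(repr(sample), check_line(sample))
```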

Monday, September 15, 2008

A rebuttal to the concerns surrounding OpenID

OpenID is still largely unknown to most Internet users.  Those who already know something about it are mostly very tech savvy and have a strong opinion either for or against it.  Those for it see it as a realistic vision for single sign-on for the web.  Those who are against it range from simply not seeing a need for it to adamantly thinking it's one of the worst protocols because of the security risks it (supposedly) introduces.  Many of these concerns are based on old information or false assumptions.  I'd like to review some of the more commonly repeated concerns and explain why OpenID does more good than harm, and why I'd like to see its universal adoption in the next 2-3 years.

This post assumes you already have a grasp of what OpenID is all about.  If you don't, stop by openid.net and read about it first, then come back here.  Another good read is Kim Cameron's Laws of Identity (or in brief form), which I'll discuss in the context of OpenID in a later post.

And a quick review of terminology used throughout the post:

  • SSO = single sign-on: the capability to log in just once and have anyone you wish to reveal your identity to implicitly recognize you without you having to re-enter your credentials.
  • IdP = Identity Provider: an entity that "hosts" your identity, and can make assertions regarding your identity and attributes to other entities.
  • RP = Relying Party: an entity that trusts an IdP to assert your identity--that is, a site that accepts your OpenID Identifier to login.
  • OpenID Identifier = the URL or XRI that you can type in to log into an RP.

Now on to the concerns I keep hearing...

OpenID fails at SSO

First let's discuss the merits and demerits of SSO.  If you logged into your computer and every single web site you visited immediately knew who you were, this would indeed be convenient.  Unfortunately it would also be convenient to snoops who want to find out more about you to know who you were without your knowledge.  The Laws of Identity include this provision.  So SSO on the wild, wild web must include a conscious decision to reveal your identity -- and only as much as is absolutely necessary. 

OpenID requires you to deliberately sign in to every RP you visit.  If you trust an RP you can tell your IdP that it can stop asking you if you want to log in to make it more streamlined.  OpenID thus takes you as close to SSO on the web as you really want to be.

Verdict: A pro instead of a con.

Users often have multiple OpenID Identifiers

The mantra of OpenID is presumed to be that an individual has just one Identifier that gets him/her into everything.  Then why do so many users have multiple OpenID Identifiers?

Having more than one Identifier is not a sign of a problem, but a sign that one of OpenID's powers is being put to use.  With OpenID allowing anyone to be an identity provider (IdP), we have competition in the marketplace to become the best IdP to win loyal web visitors.  Until we see a "best" out there (likely never), there will always be trade-offs to picking one IdP over another.  But if you choose multiple IdPs, you can combine the advantages of many into your own identity management.  Since OpenID lets you collect all your IdPs under one OpenID Identifier, your identity isn't splintered by having multiple IdPs.  It just makes your omni-Identifier more potent. 

Let's say you have three OpenIDs: one from openid.yahoo.com, one from myvidoop.com and one from linksafe.com.  Each IdP has its advantages.  Go with me in this example (it may not be strictly true for you, but true for many):

  • You're almost always logged into Yahoo!, so using them as an IdP makes sense because it increases the likelihood that logging into a web site that accepts OpenID (henceforth called RP, for "relying party") won't require you to log into your Provider.
  • When you're away from your own (trusted) computer, and you're at an untrusted computer (an enemy's, or an Internet cafe) that may have a key logger, you can use myvidoop.com to login with their one-time-use passwords.
  • When you're trying to log into HealthVault which has a whitelist of only two providers, your linksafe.com provider (which is on HealthVault's whitelist) will automatically be selected and authenticate you.

And all this without having to remember your several Identifiers.  Just remember the one that ties them together.  One Identifier to Rule Them All.

Verdict: A pro instead of a con.

OpenID is too hard

No password to login?  That's the hardest intellectual hurdle in my opinion.  When I first tried OpenID I remember how I felt while typing some URL into a single box and wondering "How in the world will my typing in a URL prove who I am to this web site?"  There was a mind block there for several moments.  But I tried it and it worked.

Since then it's gotten way easier.  I don't even have to remember a moderately easy URL like http://andrewarnott.myopenid.com.  I only have to type in myopenid.com or yahoo.com and I'm done.  How easy is that?!

In fact, with some popular Providers like Yahoo! issuing their "Log in with your Yahoo! ID" buttons, OpenID falls completely behind the scenes.  The user just clicks a familiar button, they log into Yahoo (from their perspective) and suddenly they're logged into the RP web site.  How could that be any easier?

There is room to improve, of course.  Yahoo! and some other providers have to make creating an OpenID look simpler.  But that is no fault of the protocol.  The sites that offer OpenID can and will get better.  Remember how hard the web was when it was new?  Was anyone ready to type in http://www.yourfavoritebrand.com back then?  Heck, household product vendors didn't even know what to do with a web site yet.  OpenID is just the next new thing and it will take some time for people to "get it" mentally.  But that is normal, not a bad sign that OpenID is headed for the dump heap.

Verdict: There is improvement to be made, but many IdP's are already quite user-friendly.

OpenID allows IdP's to track all your web site visits

This is true.  Part of the OpenID protocol's security features include that an IdP is aware of each RP a user logs into.  While the protocol does not require the IdP to track all your visits over time, an IdP is free to do so.  But this concern is resolved when you remember that anyone (even you!) can be an IdP, and if you don't trust the IdP you're using to either not track you or to keep your private data private, pick a different one, or become your own IdP. 

Verdict: Completely resolvable by an individual user.

OpenID allows different RPs to link data they have on you

This concern is based on the assumption that a visitor uses the same Identifier at every RP he/she logs into.  If I am known by https://andrewarnott.myopenid.com at every site I log into, then those sites can exchange user data about me and gain more information than I intended for them to have. 

This was a valid concern with OpenID 1.x, but OpenID 2.0 solved that by introducing the directed identity feature.  Directed Identity allows IdP's to transform the Identifier the user typed in into some other Identifier.  It is both a convenience feature (so users can just type myopenid.com instead of andrewarnott.myopenid.com) and a security feature so that RPs (optionally, based on the IdP you use) never see andrewarnott.myopenid.com at all, but instead some randomly generated Identifier that the IdP will always use for that individual RP and no others. 

So OpenID, once again, puts the option into the user's hands by letting the user decide whether the RP will see a standard Identifier you use when you wish to be identifiable across sites vs. a unique Identifier for that individual RP or small group of RPs.

To date however, I have not seen a single IdP that actually uses directed identity to generate pair-wise unique Identifiers so the user's identity cannot be tied across RPs.  It's absolutely technically feasible given the spec, and not that hard either.  But I guess customers have not been clamoring for this feature enough for IdP's to actually add support for it.
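To show how little it takes, here is one way an IdP could derive pair-wise unique Identifiers, sketched in Python. This is purely an illustration: the function, the secret, the URL layout and the choice of HMAC-SHA256 are all assumptions of mine, not something the spec or any particular IdP prescribes.

```python
import hashlib
import hmac

def pairwise_identifier(idp_secret, local_user, rp_realm,
                        base_url="https://idp.example/id/"):
    """Derive a stable Identifier that is unique to one (user, RP) pair."""
    material = f"{local_user}\n{rp_realm}".encode()
    digest = hmac.new(idp_secret, material, hashlib.sha256).hexdigest()
    return base_url + digest[:32]

secret = b"known-only-to-the-idp"  # hypothetical server-side secret
first = pairwise_identifier(secret, "andrewarnott", "https://rp-one.example/")
second = pairwise_identifier(secret, "andrewarnott", "https://rp-two.example/")

print(first)
print(second)
print(first != second)  # the two RPs cannot match the user by Identifier
```

Because the value is derived rather than stored, the same RP always sees the same Identifier for the same user, while two RPs comparing notes see unrelated strings.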

(Don't be fooled into thinking Yahoo!'s hash-looking Identifier it creates for you is pair-wise unique... it's not.  The same crazy Identifier is shared across all the RPs you log into.)

Verdict: Totally resolved by OpenID 2.0, although IdP's need to catch up and leverage 2.0 to solve this.

OpenID is vulnerable to phishing attacks

First the bad news: OpenID multiplies the severity of the phishing problem.  If someone else controls your OpenID Identifier, they can impersonate you everywhere you use OpenID to login.  Since an untrusted RP that you're trying to log into is the same entity that redirects you to your IdP to prove your identity through alternate means (traditional username/password for example), there is room for the RP to pretend to be your IdP and capture your credentials and gain control of your OpenID Identifier.  It was bad enough when phishing emails with almost-normal-looking URLs tried to steal your username/password to your bank account by luring you to a copy-cat site where one username/password could get stolen.  But now web sites can take you, without your ever seeing or clicking on a link, to a site that you think is yours, and steal the one key that will open every one of your locks (at least those that accept your OpenID).  Wow, that's bad...

Although OpenID somewhat forces centralization and unification of your one-size-fits-all key, I'll bet 95% of the common users out there are doing it informally already.  Be honest: do you really use a unique username and password for each and every web site you log into?  If you have a password manager installed in your browser you just might.  But you probably don't.  In fact, most less tech-savvy people reuse the same one or two passwords at every site they visit.  This is not just guessing.  It is a real problem.  Web sites can and do use their username/password database to try to steal your identity or assets on other web sites and succeed!  This is all without the help of OpenID.

But of course this cloud has a silver lining, or else I wouldn't be advocating OpenID. :) 

While an IdP can take a username/password as its credential, it can (and should) also take stronger, less phishable credentials.  Some terrific alternatives are InfoCard, X.509 client certificates, one-time-use passwords, and two-factor authentication like calling you on your cell phone.  By using InfoCard or X.509 client certificates as your primary authentication to an IdP, your proof of identity becomes completely unphishable.  Yes, an evil RP could trick you into logging into their pretend-IdP site, but that will not buy them control of your OpenID hosted at the real IdP at all.  The very most they could phish out of you is whatever attributes about yourself you explicitly added to your InfoCard or X.509 cert, which may consist of your name and email address -- but never the ability to pretend to be you to your IdP.  In fact several of the big-name OpenID Providers out there already offer the option to let you log in using InfoCard or X.509 and even allow you to turn off the ability to log in with a password so it can never happen to you. 

Verdict: Completely resolvable by an individual user, but Providers can improve security for their users by encouraging more users to turn off their passwords. 

We already have InfoCard and X.509 client certificates.  Why do we need OpenID?

X.509 client certificates are not a great solution because you either have one for every web site or you share certificates across web sites which means these web sites can collaborate and share information about you.  Neither of these is desirable.  X.509 client certificates are also not easily ported between computers, and you'd never want to copy them to an untrusted computer to temporarily log in as that computer could steal the private certificate without your knowledge.

InfoCard allows you to have just one Card (or perhaps a few) that you can use across all sites you visit, without those sites being able to link your information across multiple sites.  But accepting InfoCard securely requires that your site both have an HTTPS server certificate and that your web site has access to the private certificate to decrypt the token, which many web hosting providers disallow for security reasons.   This makes accepting InfoCard technically infeasible for most sites out there.  It also requires special software on the client, and the Cards are not easily ported between computers and again should not be done on untrusted hardware, leaving you unable to log in as yourself on someone else's computer.

OpenID on the other hand does not require RPs to have HTTPS certificates, although RPs should have one if they ask for sensitive personal information about yourself as all web sites should.  It can be done with "safe" server code (partial trust ASP.NET, Java, PHP, Perl, Python, Ruby, ...) and is thus easily deployable in shared hosting environments.  Logging in using OpenID can be done from any computer, subject to the limitations of the IdP you choose, and requires only a web browser.  But that's the awesome bit: it's up to you to choose your IdP and how to configure it, so you choose between the trade-offs between high security and high portability instead of every other web site you visit.  Plus, some IdP's (like myopenid.com and myvidoop.com) are getting quite creative in getting the best of both worlds between portability and security.  With OpenID you can leverage that to benefit you on every site that accepts OpenID.  Sweet.

I'll just add that any web site you log into with a username/password really should use HTTPS to protect your password. But with OpenID an RP can safely allow you to log in without having an HTTPS certificate of its own.  With the great libraries out there for adding OpenID to your web site, adding OpenID actually becomes potentially easier than adding username/password in a secure way. 

Verdict: InfoCard and X.509 lack some important features that OpenID makes easily available to users.

OpenID Identifier URLs are subject to others gaining (or spoofing) control of them and stealing your identity

If you are using a URL from a domain you own as your OpenID Identifier (say http://blog.yourname.com) there is the danger that one year you'll forget, or decide not to renew that domain name.  Someone can then buy that domain, set up an OpenID Identifier at the same location, and then impersonate you on all the RP sites that you left dangling accounts on.  To avoid this you would have to be very careful to delete your account or change OpenID Identifiers on every single RP you ever signed into.  That level of diligence is unlikely, which makes this a real security risk.  (But read on...)

You don't even have to let your domain registration expire.  If you don't always use a secure https Identifier, an attacker can use a DNS poisoning attack to impersonate you on some RP by tricking the RP into thinking the attacker is hosting the user's IdP.  That's really bad.  And although not all sites are vulnerable to DNS poisoning attacks, as just a visitor to an RP web site, there's no good way for you to establish their vulnerability to this (nor would you want to laboriously check this with every visit).

Finally, even if you aren't hosting your own Identifier and have chosen a good HTTPS-supporting IdP, there is a concern that after you cancel your georgeduncan account at that IdP, the IdP may recreate the account for a new George Duncan who comes along, allowing the same Identifier to be created and that George to impersonate you on all the RP sites you've visited in the past.

OpenID 2.0 helps to relieve most of these concerns that arose out of use of OpenID 1.x.  HTTPS Identifiers are highly encouraged and many Providers already support them, although most can improve their security by normalizing Identifiers users type in to be https even if the user typed in http.  IdPs that wish to reuse usernames as existing users cancel accounts and new ones arrive may do so without sacrificing the security of the old accounts by using OpenID 2.0's "Identifier recycling", which adds a #fragment to the URL Identifier that RPs will recognize and use to differentiate the new and old user accounts.
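
To make Identifier recycling a little more concrete, here is a minimal sketch of how an RP might tell the old and new georgeduncan apart.  The idp.example URLs, the fragment values, and the IsSameAccount helper are all made up for illustration and are not taken from any particular library.  The one .NET detail worth knowing is that Uri.Equals ignores the fragment, so the comparison has to use the full identifier text.

```csharp
using System;

public static class IdentifierRecyclingSketch
{
    // Hypothetical RP-side check: compare the full identifier text, fragment included.
    public static bool IsSameAccount(Uri stored, Uri asserted)
    {
        return string.Equals(stored.AbsoluteUri, asserted.AbsoluteUri, StringComparison.Ordinal);
    }

    public static void Main()
    {
        // Made-up identifiers: same URL, different recycling fragments.
        var oldGeorge = new Uri("https://idp.example/georgeduncan#gen1");
        var newGeorge = new Uri("https://idp.example/georgeduncan#gen2");

        // Uri.Equals disregards the fragment, so it cannot tell the two accounts apart.
        Console.WriteLine(oldGeorge.Equals(newGeorge));         // True
        // Comparing the complete text keeps the two accounts distinct.
        Console.WriteLine(IsSameAccount(oldGeorge, newGeorge)); // False
    }
}
```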

Finally, the best solution in my opinion is that OpenID 2.0 introduced the use of XRIs as Identifiers that we can use instead of URLs.  XRIs come in two forms: i-names and i-numbers.  My i-name is =arnott, and my i-number is =!9B72.7DD1.50A9.5CCD.  My i-name, like a domain name, only belongs to me as long as I maintain an account with the service I registered it with.  But my i-number that came with that i-name is universally unique and mine forever, long after I cancel my i-name.  Since RPs I log into using my i-name actually store my i-number as my Identifier, I can change my i-name anytime I like and still log in because my i-number is mine forever.  And a future owner of =arnott will not be able to impersonate me because their i-number must be different from mine.  And no, I do not have to memorize my i-number.  I just remember =arnott, and the infrastructure does all the work of resolving that to an i-number and authenticating me.  Now even if I'm hosting my own IdP I can rest assured that no one can ever buy off my Identifier.

Verdict: Valid security concerns exist, but solutions to all of them already exist and are well supported.

There are not enough relying party web sites that accept OpenID

I agree with this sentiment.  Although there are already tens of thousands of sites that accept OpenID, on the scale of the web that is a very small percentage.  And most of the sites that I personally use do not yet accept it. 

This of course is not a shortcoming of the OpenID Authentication protocol, but just a sign of its newness.  I encourage everyone to help in the effort of spreading OpenID.  Web developers can adopt OpenID on their web sites.  Everyone can visit http://demand.openid.net and add a bookmarklet to their web browser toolbars and click "Demand OpenID" on each site they visit that does not accept OpenID. Think of it as a vote counter that web developers can view to see how many of their visitors want OpenID.  It's just one click and you can go on logging in the traditional way until that site adds OpenID support.

Verdict: True.  But sites are adding OpenID support at an encouraging rate.

OpenID has no "single sign off"

Let's say you're at an Internet kiosk and log into an OpenID-supporting web site.  To complete that process you actually logged into two web sites: your IdP and the RP.  Each subsequent RP you log into adds one more web site to the list of web sites you've logged into.  No big deal, since OpenID makes logging into each successive RP quick and easy.  But now you're ready to leave the kiosk and you certainly don't want the next person in line to impersonate your identity, so you responsibly Log Off of the last RP you signed into.  Guess what?  Even if you logged out of every single RP you logged into, you may have forgotten to explicitly log out of your IdP since you only visited that site long enough to log into another site.  The next person in line may even be innocent when they type in "yahoo.com" as their OpenID Identifier to log themselves into bankofamerica.com, but when they see your account automatically come up instead of being prompted to log in as themselves, even an honest man can be tempted.  If you logged out of your IdP explicitly but forgot to log out of an RP, your identity can still be impersonated at that RP. 

Leaving yourself logged into RPs when you walk away from a computer is not a concern introduced by OpenID.  Obviously any site you log into with a standard username/password is still dangerous to walk away from without logging out first.  But the fact that inadvertently leaving yourself logged into your IdP can allow someone to impersonate you everywhere is horrible.

I regret to say that OpenID does not have a good answer for this -- yet.  A future version of OpenID or an extension to an existing version could allow for a "single sign off" feature that would automatically log you out of every web site when you click Log Off of just one (or some variant of that).  In the meantime, you need to remember to log out of your IdP before you walk away -- or don't click that "Remember Me" checkbox and be sure to close your browser window when you're done.

Verdict: A real security concern (for now), but individual users can protect themselves by explicitly logging out when they leave a computer.

My personal concern: Insecure implementations of OpenID

I actually haven't heard anyone else voice this concern, but to me this is the most serious one and the most difficult to resolve.  There are several good, reputable libraries that make adding OpenID support to a web site fairly trivial and secure at the same time.  As an OpenID library author myself I know that getting it correct, interoperable with other sites, and secure is difficult and takes time and a lot of code and reviews. 

But there are some web developers out there that resist adding libraries to their web sites and would rather code up OpenID support themselves.  I've seen several blog posts on "How to add OpenID to your site without a library in just 30 lines of code" or to that effect.  It's horrendous the security holes these implementations have.  In just a couple minutes of reviewing these ~30 lines I can usually find two or three gaping holes that would allow me to impersonate anybody.  Yet these blog samples unfortunately become the source code to real, live web sites.  The fact that a web site can get OpenID login "working" in 30 lines of code in ideal conditions does not automatically mean that it is secure against attackers or interoperable with other sites that do things slightly differently. 

You wouldn't want to implement the HTTPS protocol yourself (I hope!) just to avoid a dependency on software that should be built into your web server anyway.  OpenID is a highly security-sensitive protocol that deserves the same respect and care. 

Yet given some arbitrary web site you choose to visit that accepts OpenID, there is no way for you to tell whether a well-written library is driving that feature or a cheap 30 lines from some blog.  If that site offers username/password and OpenID and lets you choose, do you choose username/password because that's much easier to get right than an OpenID implementation just to play it safe?  Not that passwords are all that safe and you're still trusting the site to use them responsibly, but still, it's a way to hedge your bets.  The trouble is that end users have no way to know whether this site's OpenID is well done or if its implementation would allow someone else to impersonate them. 

There is unfortunately no solution whatsoever right now to this problem.  I imagine the only way to solve this would be to get a reputable security company (like Verisign or Thawte) to issue "OpenID security checked and approved" seals that sites can display next to their OpenID login box to help assure you of a good implementation.  But until someone does that, well, don't use a podunk-looking site to store anything you wouldn't want someone impersonating you to access.  That's largely true whether you're using username/password or OpenID anyway.

Verdict: A real concern, and unfortunately adoption-stifling caution is the best mitigation for now.

Summary

Logging in with passwords needs to go the way of all the earth; there just is no way around it.  Passwords are phishable, forgettable, and weak.  But in the short term they are the only truly portable identity.  The stronger credentials all require either hardware or special software on a computer to let you prove your identity (which is what makes them so secure).  Until the stronger credentials become as easy as passwords to use when you're on a stranger's computer, passwords will survive. 

OpenID can help us in this transition by letting users choose when they are personally ready to make the jump to no passwords by choosing an OpenID Provider that suits their security requirements.  Those decisions, made by each individual, can then span to all the web sites the user logs into automatically.

Tuesday, September 09, 2008

My OpenID Provider wishlist

I have yet to find an OpenID Provider that offers all that OpenID has to offer.  Though some come awfully close, myopenid.com most notably. 

  1. Multiple personas
    1. Attribute Exchange support (using the correct type URIs), including both ordinary persona information and allowing RPs to push attributes up.
    2. Simple registration support
  2. Unsolicited assertions to any RP I name
    1. ordinary, and
    2. using a user-supplied Claimed Identifier (that is, one that delegates to this OP, but isn't controlled by the OP)
    3. A home page full of customizable bookmarks, including to RPs in which an unsolicited assertion rests in the link so I just click and I'm logged into the site automatically.
  3. Authentication options that include:
    1. InfoCard,
    2. X.509,
    3. telephone,
    4. one-time use passwords, where more can be obtained using cell phone text messages, etc., and
    5. other password alternatives that allow login from strange and untrusted consoles.
  4. Good normalization of identity page URL, always to an https: URL, removing/correcting unnecessary path and querystring variations via redirects
  5. Highly customizable trust settings and history of RPs logged into.
  6. URI recycling
  7. Built-in support for XRIs and free community i-names.
  8. Directed identity support, including optional pair-wise unique claimed identifiers for select RPs
  9. A page that generates an HTML snippet to copy into a delegate URL.
  10. Allow customization of the identity XRDS file to allow for refs so that the file can be imported into another (larger) XRDS file of the user's own hosting.

Ideal OP, are you out there?

Monday, August 18, 2008

DotNetOpenId adds an OpenID AJAX login control

DotNetOpenId may be feature complete per the OpenID spec, but it's far from stagnant. The next version will be packed with new enhanced security features that can be optionally turned on by relying party sites with high security requirements, and a new AJAX login control! I'm pleased to announce as well that it will be just as easy to drop in and use as the non-AJAX version.  Ah, the joys and power of ASP.NET.  Gotta love it.  That's why I think .NET is the best platform for developing OpenID-enabled web sites.

You can preview the new AJAX control early. And please, leave comments here on this blog on how you like/dislike it. Don't leave comments on the preview comment page just linked to, because those comments are not actually stored anywhere...

Saturday, August 09, 2008

Make your ASP.NET custom controls work with the validation controls

Do you have a custom ASP.NET control that you want to allow your users to attach the standard ASP.NET validation controls to?  A quick web search turns up a lot of forum topics that seem to suggest it's not possible.  But it is!  And it's so easy.

Just add a ValidationPropertyAttribute to your custom control's type declaration:

[ValidationProperty("Text")]
public class YourCustomControl : WebControl // or CompositeControl or whatever
{
	/// <summary>
	/// Gets/sets the textbox area of the control
	/// </summary>
	public string Text { get; set; }

	// more code here
}

Of course this only makes sense if your control has a field that can sensibly be validated, such as a textbox area. Your property does not need to be named Text, but can be anything as long as you keep the property name and the value you pass to the ValidationPropertyAttribute's constructor in sync.

Saturday, August 02, 2008

How to use a library without taking a hard dependency on it

First let me lay out the problem: you're writing a library that so far is self-contained as a single DLL.  You want to add some functionality to your library that is available in another library that your users may or may not have or wish to deploy with your library.  You can either add a reference to that library and code against it in your library and require that your users deploy both libraries together from now on, or you can embed the functionality of the second library directly into your own library (by copying source code or rewriting the behavior yourself) so your users only have the one DLL to deploy.  Both have their pros and cons.  Isn't there a way to capture the best of both worlds?  Yes.  Two, in fact.

Pros and cons to adding an external dependency

I've already written a post on An argument for the extra dependency of a library.  Deploying another DLL to a web site's Bin directory is trivial in cost and time.  So what's the big deal?  Perhaps server administrators have a review process that every binary must go through before it is allowed in their web site's Bin directory for security reasons.  Or perhaps your customers are looking for a cool feature that your library implements, but when they discover that your library requires the presence of several other libraries as well (perhaps for features they don't even care to use) they look elsewhere.

On the flip side, if you just copy the code (or rewrite it yourself) into your library so that everything is in one big DLL, well now you have a versioning nightmare because every time the other library gets serviced (maintenance release, security bug found and fixed, etc.) you have to diff the changes and merge them into your own product.  Also, if your Library A includes Library B in its DLL and your customers aren't even aware of it, then when the author of Library B issues a broad security warning that your customer sees he may not even realize it applies to him, and even if he did, what could he do about it since he cannot modify Library A with the patch for Library B?

How to minimize the downsides of the external dependency

So let's assume I've convinced you to use the external library as-is without bundling it into your own DLL.  But you still want that other library to be optional to your customers.  This may be because the other library provides a rarely used feature, or perhaps a common feature that is used everywhere but can be altogether turned off (like logging). 

Logging in fact is the scenario that drove me to develop the pattern I'm about to present to you.  I found .NET's System.Diagnostics.Trace class completely incapable of running when in a partial trust environment, so I had to find another logging mechanism.  Log4net is a pretty well-known one that works in partial trust environments so it fit the bill.  The only trouble was many of my users didn't want to have to deploy log4net.dll alongside my own library.  So here were my requirements:

  1. If log4net.dll was present, the product should use it for its logging mechanism.
  2. If log4net.dll was not present, the product should quietly switch to using System.Diagnostics.Trace if the product was running with full trust.
  3. If log4net.dll was not present and the product was running under partial trust, logging would be disabled.
  4. This was logging so it had to be fast and could not slow the product down significantly whether logging was enabled or not.

The first obvious way that occurred to me to use log4net.dll if it was present but not actually require it to be deployed with my library was to use .NET reflection to discover the DLL and load it if it was present, and then using reflection invoke the various methods necessary to log messages.  Using reflection to invoke methods is painfully slow however, so that was out. 
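
For readers who haven't seen it, here is a rough sketch of what that late-bound approach looks like.  It is only an illustration: the InvokeStatic helper is made up, System.Convert stands in for the optional library so that the example runs anywhere, and the "Hypothetical.Missing" names are deliberately nonexistent.  Every call pays for a type lookup, a method lookup, and boxed arguments, which is where the slowness comes from.

```csharp
using System;
using System.Reflection;

public static class LateBoundSketch
{
    // Looks up a public static method taking one string and invokes it.
    // Returns null when the type or the method cannot be found.
    public static object InvokeStatic(string typeName, string methodName, string argument)
    {
        Type type = Type.GetType(typeName, false); // false: return null rather than throw when not found
        if (type == null) return null;

        MethodInfo method = type.GetMethod(methodName, new[] { typeof(string) });
        if (method == null) return null;

        // Arguments travel in an object[] and the result comes back boxed.
        return method.Invoke(null, new object[] { argument });
    }

    public static void Main()
    {
        // System.Convert is only a stand-in for a type in the optional library.
        Console.WriteLine(InvokeStatic("System.Convert", "ToInt32", "42"));

        // A type from an assembly that does not exist simply yields null here.
        object missing = InvokeStatic("Hypothetical.Missing.Logger, Hypothetical.Missing", "Debug", "ignored");
        Console.WriteLine(missing == null);
    }
}
```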

Another way to do it without the full cost of reflection at every logged message was to use Reflection.Emit to generate code on-the-fly that would call into log4net.  This generated code would consist of generated method stubs that would call into log4net in an early-bound way so that it was much faster.  But Reflection.Emit still has an upfront cost, and besides it isn't available at all in most partial trust environments, which was the whole point of using log4net.  So Reflection.Emit was out.

Finally I came to the following solution, which consists of a reusable pattern for writing fast code that calls into a library that may or may not be there and you can decide what to do in either case.  The pattern follows.

The pattern for using a library that may not be present

I'll be using log4net throughout the description of this pattern for illustrative purposes, but the pattern works perfectly well for other external libraries (even if those external libraries come with external dependencies of their own).

First, add an assembly reference to the external library (log4net.dll). 

We need to define an interface within our own project that exposes all the functionality in the external library that we will need to access.  There is one interface in log4net that is interesting throughout our project: log4net.ILog.  So we'll define DotNetOpenId.Loggers.ILog and since both DotNetOpenId and log4net have liberal open source licenses we'll copy ILog from log4net into DotNetOpenId and change the namespace.  Although in our case the two interfaces will be identical, they need not be, and when working with less liberal licenses you probably should avoid copying the interface right out of someone else's assembly.

namespace DotNetOpenId.Loggers
{
	interface ILog
	{ 
		void Debug(object message); 
		void DebugFormat(string format, params object[] args); 
		bool IsDebugEnabled { get; }
		// many more members, skipped for brevity
	}
}

Now we need to implement this DotNetOpenId.Loggers.ILog interface with our own class that does nothing but forward all calls to log4net.  This class must be smart about handling cases when log4net.dll is missing however, so we very carefully design it thus: [Update 8/6/08: fixed bug that prevented code from working when log4net.dll was not present]

using System.Globalization;
using System.IO;
using System.Reflection;

namespace DotNetOpenId.Loggers {
	class Log4NetLogger : ILog {
		private log4net.ILog log4netLogger;

		private Log4NetLogger(log4net.ILog logger) {
			log4netLogger = logger;
		}

		/// <summary>
		/// Returns a new log4net logger if it exists, or returns null if the assembly cannot be found.
		/// </summary>
		internal static ILog Initialize() {
			return isLog4NetPresent ? CreateLogger() : null;
		}

		static bool isLog4NetPresent {
			get {
				try {
					Assembly.Load("log4net");
					return true;
				} catch (FileNotFoundException) {
					return false;
				}
			}
		}

		/// <summary>
		/// Creates the log4net.LogManager.  Call ONLY once log4net.dll is known to be present.
		/// </summary>
		static ILog CreateLogger() {
			return new Log4NetLogger(log4net.LogManager.GetLogger("DotNetOpenId"));
		}

		#region ILog Members

		public void Debug(object message) {
			log4netLogger.Debug(message);
		}

		public void DebugFormat(string format, params object[] args) {
			log4netLogger.DebugFormat(CultureInfo.InvariantCulture, format, args);
		}

		public bool IsDebugEnabled {
			get { return log4netLogger.IsDebugEnabled; }
		}

		// Again, many members skipped for brevity.

		#endregion
	}
}

There are several critical techniques used in the Log4NetLogger class shown above. 

  1. The Log4NetLogger class implements the DotNetOpenId.Loggers.ILog interface instead of the log4net.ILog interface, because again that would introduce an exposed reference to a log4net.dll type, which would defeat the purpose. 
  2. All static members on the class make no reference to any types inside log4net.dll. 
  3. The constructor is private and we use a static factory method that only allows instantiation of this class once log4net.dll is confirmed to be present, otherwise null is returned.
  4. Once the class is instantiated, all communication with log4net.dll types is done within the instance and not exposed outside the class.  It's ok that we have a private instance field of type log4net.ILog in our class because our class is only instantiated, and that field only gets touched, if we already know that log4net.dll is present.

At this point we have a class that provides safe access to log4net.dll when it is present, and only returns null if the referenced assembly is missing rather than having the CLR throw an assembly load exception. 

Now we need another implementation of DotNetOpenId.Loggers.ILog that will be used when log4net.dll is not present.  We'll write one called NoOpLogger that silently does nothing.

namespace DotNetOpenId.Loggers {
	class NoOpLogger : ILog {

		/// <summary>
		/// Returns a new logger that does nothing when invoked.
		/// </summary>
		internal static ILog Initialize() {
			return new NoOpLogger();
		}

		#region ILog Members

		public void Debug(object message) {
			return;
		}

		public void DebugFormat(string format, params object[] args) {
			return;
		}

		public bool IsDebugEnabled {
			get { return false; }
		}

		// Again, many members skipped for brevity.

		#endregion
	}
}

Now we need a centralized Logger class that manages the use of our ILog implementing classes so that throughout our project we can log with such a simple line as:

Logger.Debug("Some debugging log message");

From this use case, we know Logger must be a static class.  Static classes cannot implement interfaces, so rather than actually implementing DotNetOpenId.Loggers.ILog, it will define all the same members as are in ILog but make them static.   These static methods will forward the call on to some instance of ILog.

using System.Globalization;
using DotNetOpenId.Loggers;

namespace DotNetOpenId {
	/// <summary>
	/// A general logger for the entire DotNetOpenId library.
	/// </summary>
	/// <remarks>
	/// Because this logger is intended for use with non-localized strings, the
	/// overloads that take <see cref="CultureInfo" /> have been removed, and 
	/// <see cref="CultureInfo.InvariantCulture" /> is used implicitly.
	/// </remarks>
	static class Logger {
		static ILog facade = initializeFacade();

		static ILog initializeFacade() {
			ILog result = Log4NetLogger.Initialize() ?? TraceLogger.Initialize() ?? NoOpLogger.Initialize();
			return result;
		}

		#region ILog Members
		// Although this static class doesn't literally implement the ILog interface, 
		// we implement (mostly) all the same methods in a static way.

		public static void Debug(object message) {
			facade.Debug(message);
		}

		public static void DebugFormat(string format, params object[] args) {
			facade.DebugFormat(CultureInfo.InvariantCulture, format, args);
		}

		public static bool IsDebugEnabled {
			get { return facade.IsDebugEnabled; }
		}

		// Again, many members skipped for brevity.

		#endregion
	}
}

Here are the specific techniques used in the above Logger class:

  1. We have a static field that indicates the ILog instance to be used that is initialized automatically at first use of the Logger class. 
  2. The static ILog initializeFacade() method first tries to initialize the Log4NetLogger, fails over to the TraceLogger (which we haven't talked about yet), and finally gives up and reverts to the fail-safe NoOpLogger.
  3. All the members of ILog also appear as public (or internal if you choose) static members on this class, allowing extremely convenient use of whichever logger happens to be active.
  4. If the ?? syntax you see above is new to you, it's a C# binary operator that returns the first operand that evaluates to non-null, or null if both are.  For example, a ?? b is equivalent to (a != null) ? a : b.  And it gets really powerful when you compare a ?? b ?? c to ((a != null) ? a : (b != null ? b : c)).  It stacks much more elegantly than the more common ?: ternary operator as you can tell.
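
As a quick standalone illustration of how the ?? chain falls through, here is a small sketch.  The three methods are made-up stand-ins for the Initialize() methods above, and they return strings rather than loggers to keep the example short.  Note that a right-hand operand is only evaluated when everything to its left came back null.

```csharp
using System;

public static class NullCoalescingSketch
{
    public static int ThirdCalls;

    // Stand-ins for the three Initialize() methods; the first two "fail" by returning null.
    public static string First() { return null; }
    public static string Second() { return null; }
    public static string Third() { ThirdCalls++; return "fallback"; }

    // Returns the first non-null candidate, evaluating from left to right.
    public static string Choose()
    {
        return First() ?? Second() ?? Third();
    }

    public static void Main()
    {
        Console.WriteLine(Choose());                 // fallback
        Console.WriteLine("ready" ?? Third());       // ready (Third is not called here)
        Console.WriteLine(ThirdCalls);               // 1
    }
}
```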

Why it works

At runtime, .NET only loads assemblies when they are first used.  So although your library references log4net.dll, log4net.dll won't be loaded (or noticed as missing) until the execution in your library draws very near to a call that actually requires log4net.dll to be loaded.  By carefully surrounding all references to types found in log4net.dll with these forwarding classes in your library, you can avoid .NET ever trying to load log4net.dll if you know that it's missing.
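
Here is a stripped-down sketch of that idea which runs on its own.  The assembly name "Hypothetical.Optional.Library" is made up and deliberately missing, and the method that would touch the optional library's types just returns a placeholder string.  The NoInlining attribute is a precaution I'm assuming here rather than something the code above uses: if the JIT were to inline that method into its caller, the caller itself would need the optional assembly's types to be resolvable.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Runtime.CompilerServices;

public static class OptionalAssemblySketch
{
    // Probes for an assembly by simple name without referencing any of its types.
    public static bool IsAssemblyPresent(string simpleName)
    {
        try {
            Assembly.Load(simpleName);
            return true;
        } catch (FileNotFoundException) {
            return false;
        }
    }

    // In the real pattern, this would be the only method that names the optional
    // library's types.  NoInlining (an assumed precaution) keeps it a separate method.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static string CreateFromOptionalLibrary()
    {
        return "optional library in use"; // placeholder for the real forwarding object
    }

    // Only reaches the method above once the probe has succeeded.
    public static string Choose(string simpleName)
    {
        return IsAssemblyPresent(simpleName) ? CreateFromOptionalLibrary() : "fallback in use";
    }

    public static void Main()
    {
        Console.WriteLine(Choose("Hypothetical.Optional.Library")); // fallback in use
    }
}
```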

What was that TraceLogger I saw?

Well, since we have this fail-over mechanism for choosing which logger to use, it seemed a shame to fail immediately to logging nothing just because log4net.dll wasn't present.  System.Diagnostics.Trace is an adequate logging mechanism if you happen to be running in full trust so that's worth a shot.  Here's a snippet of what TraceLogger looks like:

using System.Diagnostics;
using System.Security;
using System.Security.Permissions;

namespace DotNetOpenId.Loggers {
	class TraceLogger : ILog {
		TraceSwitch traceSwitch = new TraceSwitch("OpenID", "OpenID Trace Switch");

		/// <summary>
		/// Returns a new logger that uses the <see cref="Trace" /> class 
		/// if sufficient CAS permissions are granted to use it, otherwise returns null.
		/// </summary>
		internal static ILog Initialize() {
			return isSufficientPermissionGranted ? new TraceLogger() : null;
		}

		static bool isSufficientPermissionGranted {
			get {
				PermissionSet permissions = new PermissionSet(PermissionState.None);
				permissions.AddPermission(new KeyContainerPermission(PermissionState.Unrestricted));
				permissions.AddPermission(new ReflectionPermission(ReflectionPermissionFlag.MemberAccess));
				permissions.AddPermission(new RegistryPermission(PermissionState.Unrestricted));
				permissions.AddPermission(new SecurityPermission(SecurityPermissionFlag.ControlEvidence | SecurityPermissionFlag.UnmanagedCode | SecurityPermissionFlag.ControlThread));
				var file = new FileIOPermission(PermissionState.None);
				file.AllFiles = FileIOPermissionAccess.PathDiscovery | FileIOPermissionAccess.Read;
				permissions.AddPermission(file);
				try {
					permissions.Demand();
					return true;
				} catch (SecurityException) {
					return false;
				}
			}
		}

		#region ILog Members

		public void Debug(object message) {
			Trace.TraceInformation(message.ToString());
		}

		public void DebugFormat(string format, params object[] args) {
			Trace.TraceInformation(format, args);
		}

		public bool IsDebugEnabled {
			get { return traceSwitch.TraceVerbose; }
		}

		// Again, many members skipped for brevity.

		#endregion
	}
}

The only new thing in TraceLogger that you haven't seen in the other two ILog classes in this post is that its success is based on whether sufficient permissions are granted for trace logging to actually succeed.

If you're using some external reference other than a logger, you may choose to provide some other default implementation of your facade interface rather than a no-op one.  This pattern is very flexible to accommodate various libraries and interfaces as you can hopefully see.

What was that second possible solution?

So I said in my opening paragraph that there are actually two solutions to this problem.  The other solution is to use ILMerge.  It combines two compiled managed DLLs into one.  This allows you to just build your own library as if it had an external dependency, then you can merge the two DLLs together so your customers only see a single DLL.  You still have to service your distribution every time the other one issues a release, but it's not as bad as if you'd copied the other's source code into your own and have to merge changes into your product by hand.

Wednesday, July 30, 2008

How to cleanly log messages without wasting cycles when not logging

Whether you use System.Diagnostics.Trace, log4net, or any other logger, it's often the case that you want to allow the logging to be turned on or off at runtime. To keep logging from slowing down your app unnecessarily when the log messages are not being collected, it's common to use conditionals to only execute the logging paths if someone is listening. This leads to logging logic that clutters up your processing code and distracts someone reading the code from what is really going on. This post describes a way to try to capture the best of both worlds.

First let's look at the most basic logging examples and discuss their problems. For simplicity I'll just write the logging as if System.Diagnostics.Trace is what I'm using, but all the concepts and problems apply to all logging mechanisms. Keep in mind that although the Trace class has methods with the [Conditional("TRACE")] attribute on them so that calls to them don't slow down your code if tracing is turned off, this is a compile-time switch rather than the runtime switch that we're looking for. If you want trace logging as an option for the customer who receives your program/library, you need to define TRACE at compile-time, which means that even if tracing is turned off by default in your running code, your code still has all those calls to the Trace class in it.
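
To see what "compile-time switch" means in practice, here is a small sketch using the same attribute.  HYPOTHETICAL_LOGGING is a made-up symbol that is never defined, and the Log and ExpensiveMessage methods are made up too.  Because the symbol is undefined, the compiler drops the call to Log entirely, including the evaluation of its argument.

```csharp
using System;
using System.Diagnostics;

public static class ConditionalSketch
{
    public static int Evaluations;

    // Calls to this method are compiled only when HYPOTHETICAL_LOGGING is defined.
    [Conditional("HYPOTHETICAL_LOGGING")]
    public static void Log(string message)
    {
        Console.WriteLine(message);
    }

    // Counts how many times the "expensive" message is actually built.
    public static string ExpensiveMessage()
    {
        Evaluations++;
        return "expensive message";
    }

    // Returns how many times the message was built after one Log call.
    public static int Run()
    {
        Evaluations = 0;
        Log(ExpensiveMessage()); // removed by the compiler, argument and all
        return Evaluations;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // 0
    }
}
```

The flip side is exactly the problem described above: once the symbol is defined at build time, every one of those calls is back in the compiled code, whether or not anyone is listening at runtime.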

Simplest example

In this example, we unconditionally call Trace.TraceInformation with logging messages.

var someDictionary = new Dictionary<string, int>(/* some data*/);
// some logic
Trace.TraceInformation("The dictionary has {0} elements in it.", someDictionary.Count);
Trace.TraceInformation("Dictionary contents: {0}", WriteDictionaryAsString(someDictionary));
// more logic

Although we call Trace.TraceInformation here unconditionally, internally the Trace class is only actually logging to somewhere if some trace listener is interested in these log messages. But consider what we've already done just by calling TraceInformation in the first place. We've called TraceInformation twice, with strings that must be formatted, and the second call always executes an expensive call to a method that reads through the entire dictionary and creates a large string to represent its whole contents. Now, if Trace.TraceInformation is well-written, the String.Format call it makes internally should only be called if loggers are actually recording the messages, which will save you some cycles if logging is turned off. But that will not prevent WriteDictionaryAsString from executing with its expensive code. The most straightforward way to solve this leads us to our next example.
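
The underlying issue is that C# evaluates a method's arguments before the method gets a chance to decide whether it cares.  The sketch below shows that with a made-up Log method that discards its message when disabled, and a made-up version of WriteDictionaryAsString (the real one isn't shown in this post) that counts how often it runs.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class EagerArgumentSketch
{
    public static int Calls;

    // Hypothetical stand-in for WriteDictionaryAsString: walks the whole dictionary.
    public static string WriteDictionaryAsString(Dictionary<string, int> dictionary)
    {
        Calls++;
        var builder = new StringBuilder();
        foreach (KeyValuePair<string, int> pair in dictionary) {
            builder.Append(pair.Key).Append('=').Append(pair.Value).Append(';');
        }
        return builder.ToString();
    }

    // Stand-in logger that skips formatting and output entirely when disabled.
    public static void Log(bool enabled, string format, object arg)
    {
        if (enabled) Console.WriteLine(format, arg);
    }

    public static void Main()
    {
        var someDictionary = new Dictionary<string, int> { { "a", 1 } };

        // Logging is off, yet the argument is still computed before Log is entered.
        Log(false, "Dictionary contents: {0}", WriteDictionaryAsString(someDictionary));
        Console.WriteLine(Calls); // 1
    }
}
```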

Being smarter about how we log

Here we surround the logging calls in a conditional to prevent the expensive call to WriteDictionaryAsString if no one is listening:

TraceSwitch traceSwitch = new TraceSwitch("YourLoggingSwitch", "Some Description");
var someDictionary = new Dictionary<string, int>(/* some data*/);
// some logic
if (traceSwitch.TraceInfo)
{
    Trace.TraceInformation("The dictionary has {0} elements in it.", someDictionary.Count);
    Trace.TraceInformation("Dictionary contents: {0}", WriteDictionaryAsString(someDictionary));
}
// more logic

This is an improvement, because a quick boolean check is really cheap which makes the non-logging scenario very fast by avoiding the calls to TraceInformation altogether and especially that expensive WriteDictionaryAsString call. But look how we now have 6 lines of logging code instead of 2. Yuck. We can improve on this by finding a middle-ground.

Building the smarts into the system

Note that WriteDictionaryAsString is necessary because Dictionary<TKey, TValue> doesn't have a useful ToString() method. If it did, we could just pass the dictionary instance to TraceInformation, which could call ToString() on the object only if tracing were turned on. Then we'd be back to just two quick calls to TraceInformation, which don't really do anything inside unless logging is turned on. This would be ideal, but Dictionary doesn't support it, and often you have complex structures of your own to emit that don't have ToString() methods that behave this way.

So let's come up with a new way to accomplish the same thing. Imagine what we could do if we did:

var someDictionary = new Dictionary<string, int>(/* some data*/);
// some logic
Trace.TraceInformation("The dictionary has {0} elements in it.", someDictionary.Count);
Trace.TraceInformation("Dictionary contents: {0}", someDictionary.DeferredToString());
// more logic

That looks downright readable again. Now here's how we make this work well and fast:

/// <summary>
/// Extension methods for deferred serialization of various object types to strings.
/// </summary>
public static class DeferredToStringTools {
    /// <summary>
    /// Prepares a dictionary for printing as a string.
    /// </summary>
    /// <remarks>
    /// The work isn't done until (and if) the 
    /// <see cref="Object.ToString"/> method is actually called, which makes it great
    /// for logging complex objects without being in a conditional block.
    /// </remarks>
    public static object DeferredToString<K, V>(this IEnumerable<KeyValuePair<K, V>> keyValuePairs) {
        return new CustomToString<IEnumerable<KeyValuePair<K, V>>>(keyValuePairs, dictionarySerializer);
    }

    // Add as many overloads of DeferredToString as you want to this class,
    // one for each type of object you want to emit as part of logging. 

    private static string dictionarySerializer<K, V>(IEnumerable<KeyValuePair<K, V>> pairs) {
        var dictionary = pairs as IDictionary<K, V>;
        StringBuilder sb = new StringBuilder(dictionary != null ? dictionary.Count * 40 : 200);
        foreach (var pair in pairs) {
            sb.AppendFormat(CultureInfo.CurrentCulture, "\t{0}: {1}{2}", pair.Key, pair.Value, Environment.NewLine);
        }
        return sb.ToString();
    }

    /// <summary>
    /// Wraps an object in another object that has a special ToString() implementation.
    /// </summary>
    /// <typeparam name="T">The type of object to be wrapped.</typeparam>
    private class CustomToString<T> {
        T obj;
        Func<T, string> toString;
        public CustomToString(T obj, Func<T, string> toString) {
            if (toString == null) throw new ArgumentNullException();
            this.obj = obj;
            this.toString = toString;
        }
        public override string ToString() {
            return toString(obj);
        }
    }
}

Note that the code above uses a few C# 3.0 features that shipped with .NET 3.5: extension methods, var, and the Func<T, TResult> delegate. If you are targeting .NET 2.0 you can still do this, but the syntax will be slightly different.
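For readers stuck on .NET 2.0, here is a rough sketch of how the same idea might look without extension methods, var, or Func. The names DeferredToStringTools20, Serializer<T>, and SerializePairs are made up for this illustration and are not part of any library; treat it as one possible shape rather than the definitive port.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical .NET 2.0-style variant: no extension methods, no var, no Func<>.
public static class DeferredToStringTools20 {
    // Stand-in for Func<T, string>, which .NET 2.0 does not provide.
    public delegate string Serializer<T>(T value);

    // Called as an ordinary static method rather than as an extension method.
    public static object DeferredToString<K, V>(IEnumerable<KeyValuePair<K, V>> pairs) {
        return new CustomToString<IEnumerable<KeyValuePair<K, V>>>(
            pairs,
            new Serializer<IEnumerable<KeyValuePair<K, V>>>(SerializePairs<K, V>));
    }

    private static string SerializePairs<K, V>(IEnumerable<KeyValuePair<K, V>> pairs) {
        StringBuilder sb = new StringBuilder();
        foreach (KeyValuePair<K, V> pair in pairs) {
            sb.AppendFormat(CultureInfo.CurrentCulture, "\t{0}: {1}{2}",
                pair.Key, pair.Value, Environment.NewLine);
        }
        return sb.ToString();
    }

    // The serializer only runs if something actually calls ToString().
    private class CustomToString<T> {
        private readonly T obj;
        private readonly Serializer<T> toString;

        public CustomToString(T obj, Serializer<T> toString) {
            if (toString == null) throw new ArgumentNullException("toString");
            this.obj = obj;
            this.toString = toString;
        }

        public override string ToString() {
            return toString(obj);
        }
    }
}
```

The call site then becomes Trace.TraceInformation("Dictionary contents: {0}", DeferredToStringTools20.DeferredToString<string, int>(someDictionary)), which is a little wordier but keeps the deferred behavior.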

So there you have it. You can call Trace.TraceWarning, TraceInformation, etc. all you want (ok, still within reason) and pass it strings with {0} placeholders and objects that must be written out as strings, and those objects without adequate ToString() methods can leverage this facility to benefit from deferred serialization just like any other type with an adequate ToString() method.

Tuesday, July 22, 2008

How I have taken control of my own identity, part 2

In my last post, I discussed how I made http://blog.nerdbank.net my one OpenID URL that allows me to link my several accounts with various OpenID Providers into a single URL that I may use anywhere.  In this post, I'll talk about some of the problems that remain with the system, and how XRI i-names can solve them.

Why use an XRI/i-name?

I purchased =Arnott from 1id.com, one of the many XRI accredited brokers.  It costs me $7/year I think, which is slightly less than a domain name from most resellers.  Along with that i-name I got an associated CanonicalID (=!9B72.7DD1.50A9.5CCD) which is mine forever. Even if I cancel with 1id.com, my CanonicalID will never be re-assigned to anyone else.

I can even change my i-name from =Arnott to =SomebodyElse and transfer my CanonicalID to that new i-name, and all my identity transfers automatically.  OpenID 2.0 includes support for XRI i-names, and requires that web sites that allow people to log in with "=Arnott" actually store this canonical ID as the primary key instead of the "=Arnott" string.  Not only does this allow me to change my i-name periodically, but it guarantees that if someone else later buys "=Arnott", they cannot log in as me anywhere. 

Contrast this identity security against the standard OpenID URL like http://blog.nerdbank.net.  If I stop paying for the nerdbank.net domain name, someone else can buy it, put up an OpenID endpoint at the same URL that I used to, and then log into countless web sites and impersonate me.  Clearly, XRIs with their non-reassignable canonical IDs are superior.

While most OpenID-supporting web sites support URLs, only a small handful seem to support XRIs.  That isn't too bad though, since any XRI can be written out as a URL like this: https://xri.net/=Arnott.  Now, to a relying party web site, that's just a URL and will very likely work if the site has support for XRDS documents. 

But using the URL form of an XRI is not equivalent to using the =Arnott XRI.  That is, when the URL form is used the primary key on the web site is the URL rather than my XRI's Canonical ID.  I cannot use =Arnott interchangeably with https://xri.net/=Arnott on the same web site and expect to be treated as the same person.

And who knows?  Maybe I'll grow tired of the URL I use for my blog.  An XRI is just the better way to go if you're trying to consolidate your identity online.

Setting up your i-name

As I said earlier, I happen to host my i-name with 1id.com.  I do not like 1id.com's user interface though and it doesn't provide many of the authentication options that myopenid.com does.  But myopenid.com doesn't offer XRI hosting.  No problem.  XRDS documents can bring in the best of both worlds. 

I took the same XRDS document I wrote and linked to from my blog and entered it into 1id.com's XRDS management interface.  This was not as easy as it should have been, because 1id.com doesn't give direct access to the XRDS doc, only a web interface of push buttons and text fields.  I removed the services that 1id.com offered my XRI by default, or gave them a very high priority number (which means low priority, because these things are sorted ascending).  I could test my changes by visiting https://xri.net/=Arnott?_xrd_r=application/xrds%2Bxml;sep=false to see the full XRDS doc as I was building it up to compare it with the one I had previously hosted on my blog.

With my customized XRDS doc set up, when I use my =Arnott XRI (hosted by 1id.com) to log into an OpenID relying party, I am redirected to myopenid.com instead of 1id.com for authentication.  But my Canonical ID is still the primary key with that web site.  That means I have the best of everything: I'm using a primary key that is universally mine forever, and I can choose whatever authentication Provider I want from time to time without disrupting my identity on any web site.  Sweet. 

What about my old blog OpenID url?

Well, for those web sites that don't yet support XRIs, I can use either the URL form of my XRI I mentioned earlier, or I can continue using my blog URL.  I chose to use my blog URL for non-XRI supporting web sites.  But to avoid having to maintain two XRDS documents (one at 1id.com and one hosted on my blog), I changed my blog's HEAD tags to point directly at the XRDS document hosted for my =Arnott identity!

<meta http-equiv='X-XRDS-Location' content='https://xri.net/=Arnott?_xrd_r=application/xrds%2Bxml;sep=false' />

Then I realized that instead of using =Arnott, which is really only a convenient short-hand for my XRI CanonicalID, I'd go ahead and use the canonical ID here, so that if I ever drop =Arnott in favor of some other i-name, so long as I transfer my CanonicalID to the new alias the link will still work.  So I changed it to this:

<meta http-equiv='X-XRDS-Location' content='https://xri.net/=!9B72.7DD1.50A9.5CCD?_xrd_r=application/xrds%2Bxml;sep=false' />

Summary

And that wraps up my identity.  I would encourage you to pick up an i-name for yourself, customize the XRDS, and take control of your identity.  Although I picked 1id.com, you should hop over and check out freexri.com, which as its name implies, gives out free 'community' i-names.  I don't use freexri.com to save myself $7/year (yet) because they don't seem to issue me a Canonical ID along with my i-name, which makes it useless in my opinion.  [7/23/08 Update] I found out that i-names freexri.com generated over six months ago don't have them, but new ones do, so check out their service!  Their interface is more friendly and powerful at the same time.  So hopefully we'll be able to get Canonical IDs there soon (if they don't already).

Hopefully someday soon this will be all so natural and easy that people will do it just as comfortably as they Set Up their Internet Connection when they get a new PC.

Editorial note:

At the time of this writing, 1id.com has a bug in their XRDS implementation that I just found out about today, where instead of <openid:Delegate> tags in my XRDS services, it emits <openid:delegate>, which breaks all OpenID 1.x relying parties.  Dang.  I've written to 1id.com and so has John Bradley (=jbradley) so I hope they fix this soon.

[Update 7/23/08] 1id.com fixed it within hours of my reporting it.  But existing 1id.com customers will have to go into their XRDS management page and re-save all their i-services in order for the change to affect them.

Friday, July 18, 2008

How I have taken control of my own identity, part 1

First I obtained an OpenID account with www.myopenid.com.  I actually have several other accounts with other OpenID Providers, such as pip.verisignlabs.com and yahoo.com because some relying parties allow only white-listed Providers, and some services offer me an OpenID whether I use it or not. 

But to avoid an identity crisis of appearing all over the web as http://andrew.arnott.myopenid.com, http://andrewarnott.signon.com, http://aarnott.pip.verisignlabs.com, etc., I wanted to tie all these logins together under one identifier that I would always use, so people would be able to recognize the same person is behind all these identifiers.  Using an XRDS document, I can do this. I created this document to describe all my OpenID Provider accounts:

<%@ Page ContentType="application/xrds+xml" %><?xml version="1.0" encoding="UTF-8"?>
<xrds:XRDS
	xmlns:xrds="xri://$xrds"
	xmlns:openid="http://openid.net/xmlns/1.0"
	xmlns="xri://$xrd*($v*2.0)">
	<XRD>
		<Service priority="10">
			<Type>http://specs.openid.net/auth/2.0/signon</Type>
			<Type>http://openid.net/signon/1.0</Type>
			<Type>http://openid.net/sreg/1.0</Type>
			<Type>http://openid.net/extensions/sreg/1.1</Type>
			<Type>http://specs.openid.net/extensions/pape/1.0</Type>
			<URI>https://www.myopenid.com/server</URI>
			<LocalID>http://andrew.arnott.myopenid.com</LocalID>
			<openid:Delegate>http://andrew.arnott.myopenid.com</openid:Delegate>
		</Service>
		<Service priority="20">
			<Type>http://specs.openid.net/auth/2.0/signon</Type>
			<Type>http://specs.openid.net/extensions/pape/1.0</Type>
			<URI>https://open.login.yahooapis.com/openid/op/auth</URI>
			<LocalID>https://me.yahoo.com/a/cJASAdp4x5Rx6CU9olKi7rMkG1TX_7Yl1kQ-</LocalID>
		</Service>
		<Service priority="30">
			<Type>http://specs.openid.net/auth/2.0/signon</Type>
			<Type>http://openid.net/signon/1.0</Type>
			<URI>https://www.signon.com/partner/openid</URI>
			<LocalID>https://andrewarnott.signon.com</LocalID>
			<openid:Delegate>https://andrewarnott.signon.com</openid:Delegate>
		</Service>
		<Service priority="40">
			<Type>http://specs.openid.net/auth/2.0/signon</Type>
			<Type>http://openid.net/signon/1.0</Type>
			<URI>https://pip.verisignlabs.com/server</URI>
			<LocalID>https://aarnott.pip.verisignlabs.com</LocalID>
			<openid:Delegate>https://aarnott.pip.verisignlabs.com</openid:Delegate>
		</Service>
	</XRD>
</xrds:XRDS>

There is a little ASP.NET tag at the beginning to make sure the server sends down the proper Content-Type HTTP header with the document so Relying Party web sites know what they're looking at.  I was also careful to include the right <Type> tags in each service, as some services support just OpenID 1.x, some just 2.0, and some both.  Some support extensions as well so I included those.  Finally, each service has a priority attribute on it that allows the RP to sort the list based on my preferences and then choose the first Provider that fulfills the RP's requirements.
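To make the priority rule concrete, here is a small sketch of how an RP might pick a service from a list like the one above. XrdsService and ServiceSelector are hypothetical types invented for this example, not DotNetOpenId classes, and the sketch assumes every service carries a priority and that the RP's requirements can be expressed as a set of required Type URIs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified model of one <Service> element from an XRDS document.
public class XrdsService {
    public int Priority;
    public string ProviderEndpoint;
    public List<string> TypeUris = new List<string>();
}

public static class ServiceSelector {
    // Sorts ascending by priority (lower number = more preferred) and returns
    // the first service that advertises every required type URI, or null.
    public static XrdsService Choose(IEnumerable<XrdsService> services,
                                     IEnumerable<string> requiredTypes) {
        return services
            .OrderBy(s => s.Priority)
            .FirstOrDefault(s => requiredTypes.All(t => s.TypeUris.Contains(t)));
    }
}
```

With the document above, an RP that only speaks OpenID 1.x would skip nothing and land on myopenid.com (priority 10), while an RP whose requirements only my priority-40 entry satisfies would walk down the list until it reached Verisign.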

This XRDS document, at a URL, could serve as my OpenID URL itself.  But http://someserver/somexrds.aspx would be an ugly OpenID URL.  So I needed to refer to this document from a URL that was easier on the eyes and the fingers.

First I had to choose an Identifier that I would always use.  I did not want to use any Identifier that is specific to an individual OpenID Provider for two reasons:

  1. I might want to stop using that Provider at some point in the future.
  2. Most Providers do not let me add additional services to the XRDS document that they host for me.

A common choice is a blog URL.  I had already been using this snippet on my blog to host an OpenID identity:

<link rel="openid.server" href="https://www.myopenid.com/server"/>
<link rel="openid.delegate" href="http://andrew.arnott.myopenid.com"/>
<link rel="openid2.provider" href="https://www.myopenid.com/server"/>
<link rel="openid2.localid" href="http://andrew.arnott.myopenid.com"/>

This link style has the limitation of only allowing one of my Providers to be listed.  But some RPs only support this syntax and cannot read XRDS documents, so while leaving this snippet here, above it I inserted the following snippet (within my Blogger-hosted blog's HEAD section):

<meta content='http://nerdbank.net/openid_xrds.aspx' http-equiv='X-XRDS-Location'/>

Great.  Now http://blog.nerdbank.net is my omni-identity.   I can log into any OpenID relying party web site, and (assuming that site uses a decent implementation of OpenID) the RP will let me log in even if my first choice Provider isn't strong enough by looking down the list until it finds one that is.  If I don't have any that are good enough, I can just create an account at one more Provider and add it to my list and I don't have to worry about changing my OpenID URL that I use everywhere.

For example, Microsoft HealthVault has a whitelist of only two allowed Providers (including Verisign) that their visitors may log in with.  No problem.  I just log in as http://blog.nerdbank.net and the RP should (HealthVault actually doesn't support all this yet) find the Verisign service in my XRDS document and automatically direct me to Verisign for authentication, even though it's nearly last on my preferred list of Providers. 

To summarize: I now have http://blog.nerdbank.net as my OpenID that I can use to log into any OpenID 1.x or 2.0 relying party web site (provided it has a decent implementation), regardless of what Providers each site may require I use.

In my next post, I'll discuss how to use XRI i-names to further secure your identity.

Monday, July 14, 2008

DotNetOpenId gets a new face

Javier Román has completed his work on the new DotNetOpenId logo.  I think he has done excellent work, and he did it for free in the spirit of contributing to this open source project.  This logo is now on the project home page, and will soon be added to the samples.


A big thank you to Javier!

Wednesday, July 09, 2008

How to make your OpenID Provider case insensitive

In my last post, I discussed why case sensitive OpenID URLs are so important for security.  But case sensitive OpenID URLs are no fun at all for users.  For instance, most users probably expect they could alternate between logging in as www.myprovider.com/myopenidurl and www.myprovider.com/MYOPENIDURL at an RP and be considered the same person.  But if OpenID URLs must be case sensitive, how can this be accommodated?  Let's investigate.

John Bradley suggested the idea to solve the problem (for me anyway): use redirects.  By observing the exact URL of an incoming discovery from a relying party site at a given OpenID URL, the Provider should redirect the discovery if the requested URL was not in the canonical casing for that identifier.  For instance:

  1. Canonical OpenID: myprovider.org/SomeUser
  2. Incoming discovery request: myprovider.org/someuser
  3. OP redirects RP to: myprovider.org/SomeUser
  4. Incoming discovery request: myprovider.org/SomeUser
  5. OP responds: sends OpenID LINK tags or XRDS document.

If a request comes in for myprovider.org/someuser but the canonical form of the identifier is myprovider.org/SomeUser, then the server should respond to the request with a redirect to myprovider.org/SomeUser.  That way the RP will know immediately what the correct Claimed Identifier should be, will store it in its canonical cased form, and next time the user logs in, this time with myprovider.org/SOMEUSER, it will again be redirected to myprovider.org/SomeUser and the user will be identified as the same person.  John's a smart guy. :)

When you have a case insensitive web server, it will be an easy thing to do this check and redirect because all casing forms of the request will be responded to with the same processing page, which can check the request URL and redirect if necessary.

If you're using a case sensitive web server, you're going to have to capture every single incoming request and do your own checking to see if the URL needs to be redirected, since /someuser and /SomeUser won't both be sent to the same processing page automatically.  Hopefully your web application platform offers some common point to intercept every incoming request and perform special processing on it.  Beware though that it is not enough to just rewrite the URL to the proper casing and send it on its way up your web application stack.  An actual HTTP 301 Redirect must be sent back to the caller to get the right casing to come back as a new request.  This is the only way to get the RP to notice that the Claimed Identifier changed casing to the canonical form.
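Here is a minimal sketch of the casing check itself, kept separate from any particular web framework. CanonicalCasing and its hard-coded lookup table are assumptions made for illustration; a real OP would consult its user database instead.

```csharp
using System;
using System.Collections.Generic;

public static class CanonicalCasing {
    // Case-insensitive lookup whose values carry the canonical casing.
    // Hypothetical in-memory store standing in for the OP's user database.
    private static readonly Dictionary<string, string> canonicalPaths =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase) {
            { "/SomeUser", "/SomeUser" },
        };

    // Returns the path to send back in an HTTP 301 redirect, or null when the
    // requested path is already canonical or is not a known identifier.
    public static string GetRedirectPath(string requestedPath) {
        string canonical;
        if (canonicalPaths.TryGetValue(requestedPath, out canonical)
            && !string.Equals(requestedPath, canonical, StringComparison.Ordinal)) {
            return canonical;
        }
        return null;
    }
}
```

Whatever hook your platform offers for intercepting every request would call GetRedirectPath with the incoming path and, when the result is non-null, respond with status 301 and a Location header pointing at the canonical URL instead of continuing to process the request.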

Note that using this method, no work has to be done at the RP.  So if you're an RP just study and make sure you're compliant with my last post.  If you're an OP you can implement this post's suggestion and your users logging in to any RP will automatically benefit from the convenience of their case insensitive OpenID URLs.

Tuesday, July 08, 2008

The case for case sensitive OpenID URL checking

URLs on the Internet are case sensitive by definition.  Some web servers choose to be case insensitive.  To treat OpenID urls as anything but case sensitive for purposes of identifying a user introduces a grave security risk.  Implementers of OpenID should be cautious when using case-insensitive string comparisons and be aware that in most cases checks should be case sensitive.

OpenID Claimed Identifiers

Consider the tale of three URLs:

  1. http://MYPROVIDER.org/myuser
  2. http://myprovider.org/myuser
  3. http://myprovider.org/MYUSER

If entered into an OpenID login box, which combination of these URLs is guaranteed to represent the same identifier and therefore the same person?  (go ahead and think about it)...

If you decided all three, from an intuitive user perspective, you'd be right.  But from a security perspective, only 1 and 2 should be recognized as the same.  An RFC-abiding, self-respecting web server may be case sensitive for the path in its URL.  Linux/Apache servers may be configured this way quite easily, perhaps even by default in some cases.  Therefore an OP that does not take specific precautions to prevent multiple identifiers that differ only in casing from existing simultaneously would be open to the following attack:

  1. User A visits myprovider.org and acquires the myprovider.org/myuser identifier.
  2. User A visits somerp.com and logs in as myprovider.org/myuser.
  3. User B seeks to spoof User A's identity on somerp.com. 
  4. User B visits myprovider.org and acquires the myprovider.org/MYUSER identifier.
  5. User B visits somerp.com and logs in as myprovider.org/MYUSER.
  6. somerp.com does a case insensitive comparison and considers User B to be the same person as User A.  somerp.com grants all access that User A should get to User B.

Now, who broke a rule here?  Although some might argue that myprovider.org is a poor Provider, it hasn't actually violated any RFCs or the OpenID spec.  And since somerp.com has no way of knowing whether a given Provider is case sensitive or not, to prevent the above identity spoofing scenario it MUST perform a case sensitive comparison.

Now the one exception to this is in the authority area of the URI.  The authority is the host name area of the URL (myprovider.org in the examples above).  The authority is case insensitive because DNS is case insensitive and the authority is not sent as part of the GET line in the HTTP protocol (it may be sent as one of the HTTP headers afterward, but no server can rightly be case sensitive on that particular header).  So to be fully correct, safe, and as user-friendly as possible, the URL should be compared in two parts: the authority which would be case insensitive, and the path+query+fragment segment which would be case sensitive.
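The two-part comparison might be sketched like this using System.Uri. ClaimedIdentifierComparer is a name invented for this example, and the sketch ignores normalization details (such as user-info and percent-encoding) that a production RP would also need to think about.

```csharp
using System;

public static class ClaimedIdentifierComparer {
    // Scheme and authority are compared case-insensitively;
    // path, query, and fragment are compared case-sensitively.
    public static bool AreSame(Uri a, Uri b) {
        return string.Equals(a.Scheme, b.Scheme, StringComparison.OrdinalIgnoreCase)
            && string.Equals(a.Host, b.Host, StringComparison.OrdinalIgnoreCase)
            && a.Port == b.Port
            && string.Equals(a.PathAndQuery, b.PathAndQuery, StringComparison.Ordinal)
            && string.Equals(a.Fragment, b.Fragment, StringComparison.Ordinal);
    }
}
```

Applied to the three URLs above, this treats 1 and 2 as the same identifier and 3 as a different one.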

Realm-return_to validation

And if you're not yet convinced, consider the next concern regarding casing: realm-return_to validation.  Part of an OP's responsibility is verifying that a return_to URL falls somewhere at or beneath the realm URL.  There are security reasons for this validation that fall outside the scope of this post.  But case sensitive testing here is also very important.

Consider this scenario:

  1. Shared hosting provider www.yourpages.com allows subscribers to host their own web sites as virtual app directories under their domain, such that one subscriber might choose www.yourpages.com/user1/ as the root of their site and another subscriber might choose www.yourpages.com/user2/ as the root of their site.
  2. User 1 subscribes to www.yourpages.com and chooses to host his OpenID Provider site at www.yourpages.com/RadProvider.
  3. User 2 wants an OpenID URL and gets this one from RadProvider: www.yourpages.com/RadProvider/User2IsCoolGuy
  4. User 2 visits somerp.com and logs in with www.yourpages.com/RadProvider/User2IsCoolGuy and establishes private information with somerp.com.
  5. User 3 sets up his own OpenID Provider at www.yourpages.com/radprovider/.  Notice he chose the name to match an existing provider but with different casing.  This case sensitive shared hosting service didn't think of the security ramifications of its innocent web server allowing multiple users to have such similar paths, and so User 3 exploits this.
  6. Now consider an OpenID authentication request in which realm is www.yourpages.com/radprovider and the return_to url is www.yourpages.com/RadProvider.  The RP discovery step, which is a very important security improvement in OpenID 2.0, is totally thwarted because it will be done on the wrong Provider site.  Several authentication hacks can be done without RP discovery to thwart the security offered by OpenID.
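A case sensitive realm check along these lines could look like the sketch below. RealmValidator is a hypothetical helper written for this post's scenario; it deliberately leaves out wildcard realms and the other rules in the OpenID 2.0 spec, and only shows where the case sensitive comparison belongs.

```csharp
using System;

public static class RealmValidator {
    // True when returnTo is at or beneath realm. Scheme and host are compared
    // case-insensitively; the path is compared case-sensitively.
    public static bool IsReturnToUnderRealm(Uri realm, Uri returnTo) {
        if (!string.Equals(realm.Scheme, returnTo.Scheme, StringComparison.OrdinalIgnoreCase)) return false;
        if (!string.Equals(realm.Host, returnTo.Host, StringComparison.OrdinalIgnoreCase)) return false;
        if (realm.Port != returnTo.Port) return false;

        string realmPath = realm.AbsolutePath;
        string returnPath = returnTo.AbsolutePath;
        if (string.Equals(realmPath, returnPath, StringComparison.Ordinal)) return true;

        // Require a directory boundary so /Rad does not match /RadProvider.
        if (!realmPath.EndsWith("/", StringComparison.Ordinal)) realmPath += "/";
        return returnPath.StartsWith(realmPath, StringComparison.Ordinal);
    }
}
```

With this check, the request in step 6 (realm www.yourpages.com/radprovider, return_to www.yourpages.com/RadProvider) is rejected, because the two paths differ in casing.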

Summary

Realm-return_to validation checks, and Claimed Identifier matching MUST be done in a case sensitive way for the path+query+fragment pieces, and SHOULD be case insensitive for the scheme and authority parts. 

My argument here is not that this is common, but that it is possible.  And possible is all it takes for a security hole to be exploited and people's identity to be compromised.

In a follow-up post, I will present what OPs can do to give their users a great user experience even if all the RPs are properly case sensitive, such that their users can enter their OpenID URLs in any case they like and still authenticate properly.