Ars Technica has an interesting article titled "New algorithm guesses SSNs using date and place of birth." It describes how a person's date and place of birth can be gleaned from social networking sites such as Facebook, then fed into a new algorithm that guesses the person's SSN "with a startling degree of accuracy." An inference attack, in other words.
Per reference.com: "An Inference attack occurs when a user is able to infer from trivial information more robust information about a database without directly accessing it. The object of Inference attacks is to piece together information at one security level to determine a fact that should be protected at a higher security level." So consider Facebook or MySpace as one big database; the adversary just needs to put the pieces together, or in this case, use those pieces in an intelligent way to get the golden ring: the SSN.
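To put rough numbers on "putting the pieces together," here is a back-of-the-envelope sketch in Python. It assumes only the standard three-part SSN layout (3-digit area, 2-digit group, 4-digit serial). The function name search_space and the figure of 5 plausible area numbers for a given birthplace are made up for illustration; they are not taken from the article or its algorithm.

```python
# Back-of-the-envelope: how partial knowledge shrinks the SSN search space.
# Assumed layout: AAA-GG-SSSS (area, group, serial).
# search_space() and its parameters are hypothetical, for illustration only.

AREA_DIGITS, GROUP_DIGITS, SERIAL_DIGITS = 3, 2, 4


def search_space(area_candidates=None, group_known=False, serial_known=False):
    """Return how many 9-digit candidates remain to be tried."""
    # With no hint about the area number, all 3-digit values are possible.
    areas = 10 ** AREA_DIGITS if area_candidates is None else area_candidates
    groups = 1 if group_known else 10 ** GROUP_DIGITS
    serials = 1 if serial_known else 10 ** SERIAL_DIGITS
    return areas * groups * serials


# Nothing known: every 9-digit string is a candidate.
print(search_space())

# Birthplace narrows the area number to an (assumed) handful of values.
print(search_space(area_candidates=5))

# Same as above, plus the serial (last four digits) learned elsewhere.
print(search_space(area_candidates=5, serial_known=True))
```

Each piece of "trivial" information multiplies down the number of guesses an adversary needs, which is the whole point of an inference attack.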
Some questions came to mind after reading this:
- Should we be careful about over-sharing?
- Why are we still so reliant on SSN as our unique identifier?
- Why do companies continue to trust without verifying when processing credit card and loan applications?
Companies and Congress have (slowly) been taking steps to limit the use of, and reliance on, SSNs in response to growing public frustration. Credit card companies apparently do not find it worth their time to impose stronger security measures against fraud, so it is consumers who pay the price. A safe position may be to assume that all of our personal information is already exposed and for sale somewhere in the dark recesses of the Internet, so that we stay on the defensive by default. Otherwise, we only react after the damage is done. We just have to be smarter, and perhaps more vocal, about this issue.

The article assumes that one was issued their SSN near the state where they were born. This is more of an issue nowadays, with the whole "enumeration at birth" initiative, but many people in the working world moved to another state prior to getting an SSN.
More troubling, however, is this: While the last four digits are statistically harder to guess for a given individual, the last four digits are also commonly used in documents (as in xxx-xx-1234, or just 1234) to indicate to the addressee of a document that it pertains to them. I could probably find half a dozen different source documents in my files, some issued by the Social Security Administration, that use this convention.
This just means that a blended attack would be trivial. Who here has NEVER thrown away a piece of paper that had just the last four digits of their SSN on it?
Posted by: David Fitzgerald | 07 July 2009 at 08:37 PM
Good point. Knowing just the last 4 could be good enough to launch a successful phishing attack.
Also, per the book Zero Day Threat: "A prospective borrower filling out an online loan application can submit less than nine correct digits of [a] Social Security number and just three matching letters of the first name of someone of good credit standing...The three letters of the first name don't even have to be in the same order or sequence." So for some systems just having partial information is just as good. Scary.
Posted by: Don Franke | 08 July 2009 at 10:24 PM