Wednesday, July 31, 2013

A Review of Real World Security Questions & Answers

Yesterday I delivered my presentation titled A Review of Real World Security Questions and Answers at PasswordsCon 13.  I have written about security questions (AKA challenge questions) before, but this presentation is based on my analysis of thousands of actual user choices that were included in hacker database dumps from three different organizations this past year.

We don't often have an opportunity to see how people are actually using security questions outside of controlled surveys, so I was thrilled to dig into the data to see what additional insights it offered about their strengths and weaknesses.  Hopefully you will find it to be a useful resource when making decisions about the use of security questions in your own organization.

You can find a copy of my presentation slides with my analysis at www.passwordresearch.com/securityqs.html.  If you would like to talk about these findings you are welcome to contact me through email or on Twitter @PwdRsch.  You can also post your comments or questions on this blog entry.


Thursday, February 7, 2013

Passwords Found in the Wild for January 2013

Studying the passwords dumped on the Internet by hackers back in December provided a good opportunity for me to measure the scope of the problem. Following that experience I decided to collect and correlate some new information when analyzing password dumps from January.

Overview of Password Dumps


Last month I found 110 password dumps which met my criteria* for analysis, down from 154 in December. A few of the dumps contained data from multiple sites. There were 90 specific organizations or domains named as the source of the passwords. The remaining dumps didn't identify the source of their data, or were gathered from multiple personal computers instead of from a centralized web site.

From this collection, 40 dumps consisted primarily of plaintext passwords, exposing roughly 61,000 passwords (36% of the monthly total). Another 64 dumps primarily contained hashed passwords, adding approximately 101,000 passwords (59% of the total). Six more dumps had a mixture of plaintext and hashed passwords, accounting for 9,000 passwords (5% of the total).
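As a quick sanity check on those shares, the arithmetic can be reproduced in a few lines of Python. This is only a sketch using the rounded counts quoted above; the variable and function names are illustrative.

```python
# Approximate password counts per dump type, as reported for January 2013.
counts = {"plaintext": 61_000, "hashed": 101_000, "mixed": 9_000}

# Sum of the rounded counts (about 171,000 passwords).
total = sum(counts.values())


def share_percent(count: int, total: int) -> int:
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


shares = {kind: share_percent(n, total) for kind, n in counts.items()}

for kind, pct in shares.items():
    print(f"{kind}: {counts[kind]:,} passwords ({pct}%)")
```

Because the inputs are themselves rounded to the nearest thousand, the percentages are approximate.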

Compared to 450,000 passwords dumped in December, this month's total of 170,000 passwords was significantly lower. A contributing factor to this was the smaller number of dumps, but maybe more importantly there also tended to be fewer large password dumps. This month only had two dumps containing more than 10,000 passwords, while last month had 17.

I wasn't really surprised to see this result. The number of sites that are compromised and the number of passwords they disclose will always change based on what is getting hacked that month and how large the vulnerable sites happen to be.

Over time a baseline average of 150,000 - 300,000 passwords dumped each month might emerge, but this number would skyrocket every time a large site was affected by a security breach. In June of 2012 we watched LinkedIn lose 6.5 million passwords and eHarmony lose around 1.5 million. The next month 450,000 passwords were leaked from Yahoo Voices. Just this past Friday Twitter announced a forced password change for around 250,000 users whose password hashes may have been accessed by hackers (although these have yet to be publicly dumped).

There were 40 different hackers or hacker groups claiming credit for January's password dumps, plus others who chose to dump their data anonymously. So even the retirement, capture, or waning motivation of any particular hacker seems unlikely to have a large impact on the flow of monthly password dumps.

The biggest deterrent to future password dumps is more likely to be improvements in the security of the code and development frameworks used by the vulnerable sites, or a widespread adoption of specific security countermeasures (e.g. web application firewalls or intrusion prevention systems). This brings us to the subject of what types of sites I found to be vulnerable today.

Sites Vulnerable to Password Dumps


In January I decided to visit all the sites named in the password dumps and gather information on the software coding language they used, the category of the function they served, and the country in which they were hosted. I hoped this might provide some further insight on whether these attacks were targeted in any way or simply opportunistic.

SQL injection attacks appear to be the primary supplier of the database dumps containing passwords. Many of the dumps include actual output from the tools (like Havij) that hackers can use to automate the extraction of database contents. This seemed pretty intuitive since SQL injection is a common web application vulnerability and one that may not require hackers to gain any other illicit access on the targeted site.

What did surprise me was that the majority of the 87 named sites targeted with password dumps were developed using the PHP programming language, as shown in the following chart.

Coding language used by sites experiencing password dumps in January 2013
 
Netcraft's Web Server Survey for this same month shows that 39% of all web sites (around 244 million) are running PHP. While that is a large portion of the Internet, the market share by itself doesn't seem to justify PHP sites making up 91% of the total. After all, sites using ASP and JSP can be just as vulnerable to SQL injection attacks as PHP. If I didn't know that SQL injection was the primary attack method I would suspect that hackers were exploiting some PHP-specific vulnerabilities.

A more likely explanation is that the popularity of the language has led to the rapid deployment of PHP sites and PHP-based content management systems (CMS) by people who lack an education in web application security. Even though the risk of SQL injection in PHP should be fairly well understood, some organizations still end up deploying code that doesn't implement proper security controls.
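To make the missing control concrete, here is a minimal sketch of the difference between building a query by string concatenation and using a parameterized query. It is written in Python with the standard sqlite3 module purely for illustration; the table, column, and function names are made up, and the same principle applies to prepared statements in PHP.

```python
import sqlite3

# In-memory database with a hypothetical users table for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password_hash TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "hash-a"), ("bob", "hash-b")],
)


def find_user_unsafe(conn, username):
    # Vulnerable: the input is pasted into the SQL text, so it can alter the query.
    query = f"SELECT username FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, username):
    # Parameterized: the driver passes the input as data, never as SQL.
    return conn.execute(
        "SELECT username FROM users WHERE username = ?", (username,)
    ).fetchall()


malicious = "nobody' OR '1'='1"
print(find_user_unsafe(conn, malicious))  # matches every row in the table
print(find_user_safe(conn, malicious))    # matches no rows
```

With the unsafe version the crafted input turns the WHERE clause into a condition that is always true, which is the kind of flaw automated tools exploit to walk through an entire user table.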

Interestingly enough, I found that one of the organizations suffering a January password dump had actually shown up previously in a hacker's report of sites vulnerable to SQL injection, posted on Pastebin.com more than six months ago. So either they never learned they were vulnerable in the subsequent months, or they were unable to completely fix the problem before it was exploited to dump their entire user database.

Another possible explanation is that some of these sites might be unintentionally allowing attackers to connect directly to the site database and download records that way. This should be prevented by host firewall segmentation and proper database authentication, but sites may have been deployed without these precautions. I see evidence within the posted password dump files that this is happening, but I believe it is secondary to the more popular SQL injection attacks.

While all software developers and web admins should learn about SQL injection prevention and other secure site management practices, it appears that the PHP community needs the most help catching up.
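To put the PHP figures in perspective, here is a back-of-the-envelope check in Python. It rests on two assumptions that are not in the data itself: that the count of PHP sites is 79 (inferred by rounding 91% of the 87 named sites), and that, if attackers ignored the language entirely, each site would independently have a 39% chance of running PHP. That second assumption is a deliberate simplification, so treat the result as a rough indication rather than a rigorous test.

```python
from math import comb

named_sites = 87          # named sites reviewed for January
php_market_share = 0.39   # Netcraft figure quoted above
php_sites = round(0.91 * named_sites)  # assumption: 91% of 87 rounds to 79 sites

# Number of PHP sites expected if hits simply tracked market share.
expected_php = php_market_share * named_sites


def binomial_tail(n: int, k: int, p: float) -> float:
    """Probability of at least k successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


p_value = binomial_tail(named_sites, php_sites, php_market_share)
print(f"expected PHP sites: {expected_php:.0f}, observed: {php_sites}")
print(f"chance of {php_sites} or more PHP sites out of {named_sites}: {p_value:.1e}")
```

Under these assumptions roughly 34 PHP sites would be expected, and the chance of seeing 79 or more is vanishingly small, which supports the view that market share alone does not explain the skew.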

Categories of Targeted Sites


When I visited these vulnerable sites I also assigned them a category based on the site's purpose. Most of these category labels should be self-explanatory, but I'll provide a description for a few of them.

Education sites were mainly universities, although there was one primary school. Business sites were a corporate web presence that mainly published data and didn't offer online services to customers. Entertainment sites could be a discussion forum or a site sharing information on a specific topic (e.g. movies or sports).

An Online Store was a business site where customers could make purchases. This type of site would be particularly valuable to attackers looking to capture stored billing information or to order merchandise and charge it to customers. As mentioned last month, sites like this often don't make it into public password dumps because attackers can sell the account data in the underground marketplace.

Info Services are the sites that sell information as a single-use or subscription service. These sites could also be valuable to attackers, depending on what type of data they make available to customers. Finance sites are banks, credit unions, or other related institutions, but in this month's case it was a single investment firm.

The chart below shows how many sites matched each category in the January password dumps.

Category of sites experiencing password dumps in January 2013

I didn't really have many preconceived ideas about the categories of sites I expected to see targeted. Government, Business, Medical, Education, News, and Political certainly make sense if the attacker hopes to gain media attention with their data dump. 

The Online Stores, Info Services, and Finance sites make sense if the attacker hoped to make a financial gain from the attack, although going after these sites could certainly also be just for bragging rights.

There is probably some ratio of opportunistic and targeted attacks in this mix, but it is difficult to detect unless the hackers specifically outline their motivations in the dump descriptions.

Countries Hosting the Targeted Sites


I was able to identify the country of origin for 96 of the January password dumps. There were 35 different countries represented in that total. The top 10 countries that experienced the most password dumps are shown in the chart below.

Country hosting the sites experiencing password dumps in January 2013

Seven other countries tied the Philippines with two vulnerable sites each. South Africa was given a bit more attention in January than normal due to one hacker group specifically targeting organizations in that country as part of a political statement.

Another observation was that half of the vulnerable sites in India were Education category sites. India was the only country that had such a large percentage of its sites in the same category. This might indicate greater web site security problems at universities in India, or it could just be chance.

Impacts on Users of Targeted Sites


At least some users of the hacked sites are likely to experience problems as a result of these password dumps. If passwords were stored in a plaintext format then any account is vulnerable to misuse by unauthorized individuals. Even if the passwords were hashed, password cracking software can recover the plaintext fairly quickly for all but the stronger passwords. Whether hackers care to use these stolen credentials will depend partially on what the account can be used for and partially on how motivated they are to annoy users.

Some web sites use an email address for the username, or record the email address during the account registration process. Email addresses of users were found in 96 (87%) of the analyzed password dumps in January.

The use of email addresses isn't necessarily a problem by itself, but the reality of user password practices can result in email address leaks endangering their identities on other Internet sites. If one organization leaks email addresses and passwords this allows attackers to try those credentials elsewhere. In fact, hackers have developed software tools that automate the process of trying discovered email and password combinations against a list of popular web sites.

Password reuse makes this a real problem, although mainly for the users and not necessarily for the site that was vulnerable to the password dump in the first place. Even if a user chose a stronger password to reuse, it only takes one site storing that password in plaintext to potentially expose all of their accounts.

My suggestion to users is that passwords should never be reused across any sites where you would care if your account gets hacked. Always choose a strong and unique password for each site and then store it in a password manager if you are concerned about forgetting it.
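Most password managers will generate these unique passwords for you, but the idea is simple enough to sketch in Python with the standard secrets module. The 20-character default, the 12-character minimum, and the site names below are illustrative assumptions, not recommendations from any standard.

```python
import secrets
import string

# Letters, digits, and punctuation; some sites reject certain symbols,
# so the alphabet may need trimming in practice.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 20) -> str:
    """Return a random password drawn from a cryptographically secure source."""
    if length < 12:  # assumed minimum for this sketch
        raise ValueError("use at least 12 characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# One independent password per site, so a leak at one site exposes nothing else.
sites = ["forum.example", "shop.example", "mail.example"]  # hypothetical sites
passwords = {site: generate_password() for site in sites}
```

Since each password is generated independently, a plaintext leak at one site reveals nothing about the credentials used anywhere else.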

Web sites often can't do much to prevent password reuse other than enforcing good password selection controls that might eliminate the worst of the reused passwords. However, they can make sure that they use adequate password hashing and salting techniques that make the task of cracking user passwords much more difficult for hackers. Sites should also notify users when a database breach is detected and warn them to change their password anywhere else it was used, in addition to forcing a change on the hacked site itself.
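As a rough illustration of what salted hashing looks like in practice, here is a minimal Python sketch using PBKDF2 from the standard library. The choice of PBKDF2-HMAC-SHA256, the 16-byte salt, and the iteration count are assumptions made for this example; a real deployment should pick an algorithm and work factor suited to its own hardware and threat model.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 600_000  # assumed work factor; tune to your hardware


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key) using PBKDF2-HMAC-SHA256 with a per-user salt."""
    if salt is None:
        salt = os.urandom(16)  # unique random salt for every stored password
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Recompute the key with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)
```

The site stores the salt and derived key for each user instead of the password. The per-user salt means identical passwords produce different stored values, and the iteration count makes every guess more expensive for anyone cracking a stolen database.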

Conclusion


I wasn't able to complete my analysis of all the information provided by the password dumps in January, but I'll continue to work on it and will post new results here on the blog. In the meantime, if you have any questions or comments leave them below, or contact me on Twitter @PwdRsch.

* Study Methodology


I monitored Twitter and other sites for notices that a data dump had been publicly posted. Some data dumps contained user or customer information but not passwords. Others contained only the administrator password or the passwords of a very limited number of users. I ignored these and focused only on sources that contained passwords (hashed or plaintext) of at least a dozen users.

I also attempted to eliminate duplicate dumps, a practice where one hacker copies a full or partial dump from someone else and reposts it as their own. Sometimes these dumps are from the same month and sometimes they are from previous months.
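One way this kind of duplicate check could be automated is to compare the overlap between the lines of two dumps. The Python sketch below is a hypothetical approach, not a description of the process actually used for this study; the 0.8 threshold and the sample records are invented for illustration.

```python
from typing import Set


def normalize(dump_text: str) -> Set[str]:
    """Reduce a dump to a set of lowercase, whitespace-stripped non-empty lines.

    Lowercasing is a crude choice that also folds case in passwords; it is
    acceptable here because the goal is matching reposts, not preserving data.
    """
    return {line.strip().lower() for line in dump_text.splitlines() if line.strip()}


def overlap(a: Set[str], b: Set[str]) -> float:
    """Fraction of the smaller dump's lines that also appear in the other dump."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def is_probable_repost(dump_a: str, dump_b: str, threshold: float = 0.8) -> bool:
    """Flag two dumps as duplicates when most of the smaller one is shared."""
    return overlap(normalize(dump_a), normalize(dump_b)) >= threshold


# Invented sample records for the demonstration.
original = "alice@example.com:hash1\nbob@example.com:hash2\ncarol@example.com:hash3\n"
partial_repost = "  BOB@example.com:hash2\nalice@example.com:hash1\n"
unrelated = "dave@example.org:hash9\nerin@example.org:hash8\n"

print(is_probable_repost(original, partial_repost))  # partial copy is flagged
print(is_probable_repost(original, unrelated))       # unrelated dump is not
```

Measuring overlap against the smaller dump is what lets a partial repost match its source, at the cost of occasionally flagging two small dumps that merely share a few common records.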

In a few cases the dump poster noted that they had included only a subset of the available user passwords. While I didn't count the unposted passwords, we should assume that the attacker had access to the complete user database, which would increase the total number of passwords exposed.

When reviewing these figures keep in mind that they account only for the publicly posted data which I was able to discover. Hackers certainly compromised the passwords of other sites and kept this activity secret, or shared the data over more private channels.

Monday, January 7, 2013

Passwords Found in the Wild for December 2012

In the late 1990s when I started analyzing passwords it was much harder to find samples to review.  My password collection routine consisted mainly of begging colleagues to share data or volunteering to perform the cracking for their security assessments.  Occasionally I would get lucky and find a publicly readable password file on the Internet.  Then I could dedicate a computer for several months to cracking each password database because it would certainly take at least that long before another sample showed up.

Today I find that I am actually overwhelmed with the opportunities to gather passwords.  The raw number of Internet sites that register users and collect their passwords is huge.  Correspondingly, the number of these sites that are susceptible to SQL injection or other vulnerabilities that allow attackers to extract their user databases has also grown.  Hackers are regularly exploiting these flaws and publishing password dumps to embarrass companies, to attract attention to their causes, or to simply stroke their egos.

I decided to monitor password dumps in December 2012 to get a better idea of how widespread this practice has become.  I monitored several sources (though mainly the Pastebin.com web site) for announcements of dumps and analyzed the data posted.

Study Methodology

Snippet of Password Dump Tracking Data

Some data dumps contained user or customer information but not passwords. Others contained only the administrator password or the passwords of a very limited number of users.  I ignored these and focused only on sources that contained passwords (hashed or plaintext) of at least a dozen users.  I also attempted to eliminate duplicate dumps, a practice where one hacker copies a full or partial dump from someone else and reposts it as their own.

In some cases the dump poster also noted that they included only a subset of the available user passwords.  However, we should assume that the attacker had access to the complete user database, which would increase the actual number of passwords exposed.

When reviewing these figures keep in mind that they account only for the publicly posted data of which I was made aware.  Hackers certainly compromised the passwords of other sites and kept this activity secret, or shared the data over more private channels.  Brian Krebs covered the underground marketplace for the more valuable passwords in his recent blog post.

Password Dump Findings

In December I found 154 dumps which met my criteria for analysis.  A few of the dumps contained data from multiple sites.  They named more than 125 different organizations and domains as the source of the leaks.  Passwords belonged to users at businesses, governments, schools, industry groups, and discussion forums.  Some dumps didn't identify the source of their data, or were gathered from multiple personal computers instead of from a centralized web site.

From this collection, 66 dumps consisted primarily of plaintext passwords, exposing roughly 221,000 passwords.   Another 82 dumps primarily contained hashed passwords, adding approximately 222,000 passwords to the count. So while the number of hashed password dumps was greater than the plaintext dumps, the number of passwords exposed was nearly equal.  Six more dumps had a mixture of plaintext and hashed passwords, but only accounted for 6,000 passwords.

Altogether, I found that almost 450,000 passwords were publicly exposed during the month. There were 103 dumps containing fewer than 1,000 passwords, and 17 dumps containing more than 10,000 passwords.  About 184,000 passwords (41% of the total) came from several dumps simultaneously released as part of Team GhostShell's Project WhiteFox on December 10th.

Finding that half of the exposed passwords lack the security provided by basic password hashing is disappointing.  While some of the affected sites likely have low security requirements, storing only password hashes is a pretty standard security practice that should be followed by almost every site.

Without password hashing both the poorly and well constructed passwords are exposed during leaks like these.  A user may think their password is secure only to find that their account has been compromised due to insecure password storage that was beyond their control.

Even hashed passwords offer only limited resistance to attack once they have been stolen from an organization.  Password crackers have become faster and more proficient at trying common words, names, phrases, and other combinations of guesses that can disclose a password after it has been hashed.
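To show why a common password gets little protection from a fast, unsalted hash, here is a toy Python sketch of dictionary guessing. Everything in it is an assumption for illustration: the use of unsalted SHA-1, the two made-up accounts, and the four-word list. Real cracking tools use far larger wordlists and mangling rules.

```python
import hashlib


def sha1_hex(password: str) -> str:
    """Return the unsalted SHA-1 hash of a password as a hex string."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


# Hypothetical leaked table of unsalted SHA-1 hashes (built here for the demo).
leaked = {"user1": sha1_hex("sunshine"), "user2": sha1_hex("V7#qLp2!xWz9")}

wordlist = ["password", "123456", "sunshine", "letmein"]  # tiny sample wordlist

# Hash each guess once and look it up: without salts, a single hashed guess
# can be checked against every account in the dump at the same time.
guess_by_hash = {sha1_hex(word): word for word in wordlist}
recovered = {
    user: guess_by_hash[digest]
    for user, digest in leaked.items()
    if digest in guess_by_hash
}
print(recovered)
```

The common word is recovered immediately while the long random password is not, which is why both the strength of the password and the way it is stored matter.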

Conclusion

My feelings are mixed when it comes to the results of this study.  On one hand I'm frustrated with the security vulnerabilities that continue to plague many Internet sites, and on the other hand I'm eager to see what wisdom is provided by examining these leaked passwords.  The wide variety of passwords from these different sources allows researchers like me to pick and choose the password samples that seem most interesting or likely to produce the information we seek.

So these days instead of begging for passwords I'm finding myself begging for help to sort through all the password data that is available to me.