Studying the passwords dumped on the Internet by hackers back in December provided a good opportunity for me to measure the scope of the
problem. Following that experience I decided to collect and
correlate some new information when analyzing password dumps from
January.
Overview of Password Dumps
Last month I found 110 password dumps
which met my criteria* for analysis, down from 154 in December. A
few of the dumps contained data from multiple sites. There were 90
specific organizations or domains named as the source of the
passwords. The remaining dumps didn't identify the source of their
data, or were gathered from multiple personal computers instead of
from a centralized web site.
From this collection, 40 dumps
consisted primarily of plaintext passwords, exposing roughly 61,000
passwords (36% of the monthly total). Another 64 dumps primarily
contained hashed passwords, adding approximately 101,000 passwords
(59% of the total). Six more dumps had a mixture of plaintext and
hashed passwords, accounting for 9,000 passwords (5% of the total).
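As a quick check on those percentages, the arithmetic can be reproduced in a few lines of Python. This is only a sketch using the rounded counts quoted above; the dictionary layout and variable names are my own and not part of the original analysis.

```python
# Rounded dump and password counts for January, as quoted in the text above.
counts = {
    "plaintext": {"dumps": 40, "passwords": 61_000},
    "hashed": {"dumps": 64, "passwords": 101_000},
    "mixed": {"dumps": 6, "passwords": 9_000},
}

total_dumps = sum(c["dumps"] for c in counts.values())
total_passwords = sum(c["passwords"] for c in counts.values())

# Share of the monthly password total for each storage format, in whole percent.
shares = {
    kind: round(100 * c["passwords"] / total_passwords)
    for kind, c in counts.items()
}

print(total_dumps, total_passwords, shares)
```

The three dump counts add up to the 110 dumps analyzed, and the rounded password counts add up to slightly more than the 170,000 total quoted for the month.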
Compared to the 450,000 passwords dumped in
December, January's total of roughly 170,000 passwords was significantly
lower. The smaller number of dumps contributed to this, but perhaps
more importantly there were also fewer large password dumps. January
had only two dumps containing more than 10,000 passwords, while
December had 17.
I wasn't really surprised to see this
result. The number of sites that are compromised and the number of
passwords they disclose will always change based on what is getting
hacked that month and how large the vulnerable sites happen to be.
Over time a baseline average of 150,000
- 300,000 passwords dumped each month might emerge, but this number
would skyrocket every time a large site was affected by a security
breach. In June of 2012 we watched LinkedIn lose 6.5 million
passwords and eHarmony lose around 1.5 million. The next month
450,000 passwords were leaked from Yahoo Voices. Just this past
Friday Twitter announced a forced password change for around 250,000
users whose password hashes may have been accessed by hackers
(although these have yet to be publicly dumped).
There were 40 different hackers or
hacker groups claiming credit for January's password dumps, plus more
hackers who chose to dump their data anonymously. So the
retirement, capture, or loss of motivation of any particular hacker
seems unlikely to have a large impact on the flow of monthly password
dumps.
The biggest deterrent to future
password dumps is more likely to be improvements in the security of
the code and development frameworks used by the vulnerable sites, or
a widespread adoption of specific security countermeasures (e.g. web
application firewalls or intrusion prevention systems). This brings
us to the subject of what types of sites I found to be vulnerable
today.
Sites Vulnerable to Password Dumps
In January I decided to visit all the
sites named in the password dumps and gather information on the
software coding language they used, the category of the function they
served, and the country in which they were hosted. I hoped this might
provide some further insight on whether these attacks were targeted
in any way or simply opportunistic.
SQL injection attacks appear to be the
primary supplier of the database dumps containing passwords. Many of
the dumps include actual output from the tools (like Havij)
that hackers can use to automate the extraction of database contents.
This seemed pretty intuitive since SQL injection is a common web
application vulnerability and one that may not require hackers to
gain any other illicit access on the targeted site.
What did surprise me was that the
majority of the 87 named sites targeted with password dumps were
developed using the PHP programming language, as shown in the
following chart.
Netcraft's Web Server Survey for this same month
shows that 39% of all web sites (around 244 million) are running PHP.
While that is a large portion of the Internet, the market share by
itself doesn't seem to justify PHP sites making up 91% of the total.
After all, sites using ASP and JSP can be just as vulnerable to SQL
injection attacks as PHP. If I didn't know that SQL injection was
the primary attack method I would suspect that hackers were
exploiting some PHP-specific vulnerabilities.
A more likely explanation is that the
popularity of the language has led to the rapid deployment of PHP
sites and PHP-based content management systems (CMS) by people who
lack an education in web application security. Even though the risk
of SQL injection in PHP should be fairly well understood, some
organizations still end up deploying code that doesn't implement
proper security controls.
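To make the missing control concrete, here is a minimal sketch of the difference between building a query by string concatenation and using a parameterized query. It is written in Python with the standard sqlite3 module purely for illustration; the sites discussed here ran PHP, where prepared statements play the same role. The users table, the find_user helper, and the sample data are all hypothetical.

```python
import sqlite3


def find_user(conn, username):
    """Look up a user with a parameterized query."""
    # The ? placeholder sends username to the database as data,
    # so it is never interpreted as part of the SQL statement.
    cur = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cur.fetchall()


# Hypothetical in-memory table standing in for a site's user database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, pw_hash TEXT)"
)
conn.executemany(
    "INSERT INTO users (username, pw_hash) VALUES (?, ?)",
    [("alice", "h1"), ("bob", "h2")],
)

payload = "' OR '1'='1"

# Vulnerable pattern: attacker-controlled input is pasted into the SQL text,
# turning the WHERE clause into a condition that is always true.
unsafe_sql = "SELECT id, username FROM users WHERE username = '" + payload + "'"
leaked = conn.execute(unsafe_sql).fetchall()

# Safe pattern: the same input is treated as a literal username.
safe = find_user(conn, payload)

print(len(leaked), safe)
```

With the concatenated query the payload returns every row in the table, which is the kind of behavior automated extraction tools rely on. With the parameterized query the same payload matches no user.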
Interestingly enough, I found that one
of the organizations suffering a January password dump had actually
shown up previously in a hacker's report of sites vulnerable to SQL
injection, posted on Pastebin.com over six months earlier. So either
they never learned they were vulnerable in the subsequent months or
they were unable to completely fix the problem before it was exploited
to dump their entire user database.
Another possible explanation is that
some of these sites might be unintentionally allowing attackers to
connect directly to the site database and download records that way.
This should be prevented by host firewall segmentation and proper
database authentication, but sites may have been deployed
without these precautions. I see evidence within the posted password
dump files that this is happening, but I believe it is secondary to the more popular SQL injection attacks.
While all software developers and web
admins should learn about SQL injection prevention and other secure
site management practices, it appears that the PHP community needs
the most help catching up.
Categories of Targeted Sites
When I visited
these vulnerable sites I also assigned them a category based on the
site's purpose. Most of these category labels should be
self-explanatory, but I'll provide a description for a few of them.
Education sites
were mainly universities, although there was one primary school.
Business sites were a corporate web presence that mainly published
data and didn't offer online services to customers. Entertainment
sites could be a discussion forum or a site sharing information on a
specific topic (e.g. movies or sports).
An Online Store
was a business site where customers could make purchases. This type
of site would be particularly valuable to attackers looking to
capture stored billing information or to order merchandise and charge
it to customers. As mentioned last month, sites like this often
don't make it into public password dumps because attackers can sell the account data
in the underground marketplace.
Info Services are
the sites that sell information as a single-use or subscription
service. These sites could also be valuable to attackers,
depending on what type of data they make available to customers. Finance sites
are banks, credit unions, or other related institutions, although the
only one in January's dumps was a single investment firm.
The chart below
shows how many sites matched each category in the January password
dumps.
I didn't really have many preconceived
ideas about the categories of sites I expected to see targeted.
Government, Business, Medical, Education, News, and Political
certainly make sense if the attacker hopes to gain media attention
with their data dump.
The Online Stores, Info Services, and
Finance sites make sense if the attacker hoped to make a financial
gain from the attack, although going after these sites could also
just be for bragging rights.
There is probably some ratio of
opportunistic and targeted attacks in this mix, but it is difficult
to detect unless the hackers specifically outline their motivations
in the dump descriptions.
Countries Hosting the Targeted Sites
I was able to identify the country of
origin for 96 of the January password dumps. There were 35 different
countries represented in that total. The top 10 countries that
experienced the most password dumps are shown in the chart below.
Seven other countries tied the
Philippines with two vulnerable sites each. South Africa was given a
bit more attention in January than normal due to one hacker group
specifically targeting organizations in that country as part of a
political statement.
Another observation was that half of
the vulnerable sites in India were Education category sites. India
was the only country with such a large percentage of its sites
in the same category. This might indicate greater web site security
problems at universities in India, or it could just be chance.
Impacts on Users of Targeted Sites
At least some users of the hacked sites
are likely to experience problems as a result of these password
dumps. If passwords were stored in plaintext, then any
account is vulnerable to misuse by unauthorized individuals. Even if
the passwords were hashed, password cracking software can recover the
plaintext fairly quickly for all but the stronger passwords. Whether
hackers care to use these stolen credentials will depend partially on
what the account can be used for and partially on how motivated they
are to annoy users.
Some web sites use an email address for
the username, or record the email address during the account
registration process. Email addresses of users were found in 96
(87%) of the analyzed password dumps in January.
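For anyone wanting to run a similar count, one rough way to flag dumps that contain email addresses is a simple pattern match over each dump's text. This is only a sketch and not necessarily how the figure above was produced; the regular expression is deliberately loose, and the helper names and sample strings are made up for illustration.

```python
import re

# Loose pattern for things that look like email addresses. It is good enough
# for flagging a dump, but it is not a full address validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def contains_email(dump_text):
    """Return True if the dump text contains at least one email-like string."""
    return EMAIL_RE.search(dump_text) is not None


def email_share(dump_texts):
    """Return (number of dumps with emails, whole-percent share of all dumps)."""
    flagged = sum(contains_email(text) for text in dump_texts)
    return flagged, round(100 * flagged / len(dump_texts))


# Made-up sample dumps, one string per dump.
samples = [
    "alice@example.com:hunter2",
    "admin:letmein",
    "bob@example.org password1",
]

print(email_share(samples))
```

Applied to the January figures, 96 flagged dumps out of 110 works out to the 87% quoted above.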
The use of email addresses isn't
necessarily a problem by itself, but the reality of user password
practices can result in email address leaks endangering their
identities on other Internet sites. If one organization leaks email
addresses and passwords this allows attackers to try those
credentials elsewhere. In fact, hackers have developed software
tools that automate the process of trying discovered email and
password combinations against a list of popular web sites.
Password reuse makes this a real
problem, although mainly for the users and not necessarily for the
site that was vulnerable to the password dump in the first place.
Even if a user chose a stronger password to reuse, it only takes one
site storing that password in plaintext to potentially expose all of
their accounts.
My suggestion to users is that
passwords should never be reused across any sites where you would
care if your account gets hacked. Always choose a strong and unique
password for each site and then store it in a password manager if you
are concerned about forgetting it.
Web sites often can't do much to
prevent password reuse other than enforcing good password selection
controls that might eliminate the worst of the reused passwords.
However, they can make sure that they use adequate password hashing
and salting techniques that make the task of cracking user passwords
much more difficult for hackers. Sites should also notify users when a database breach is detected and warn them to change their password
anywhere else it was used, in addition to forcing a change on the
hacked site itself.
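As one illustration of what salting and hashing can look like, the sketch below uses PBKDF2 from Python's standard library with a random per-user salt. The function names and the iteration count are my own assumptions rather than a recommendation from this analysis; a real deployment should tune the work factor for its hardware or use a vetted password hashing library.

```python
import hashlib
import hmac
import os

# Assumed work factor for illustration only; tune it for your own hardware.
ITERATIONS = 200_000


def hash_password(password, salt=None):
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        # A fresh random salt per user means identical passwords
        # produce different stored hashes.
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

The site stores the salt and digest for each user instead of the password itself. The deliberately slow key derivation makes each cracking guess more expensive, and the per-user salt prevents one precomputed table from being reused across accounts.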
Conclusion
I wasn't able to complete my analysis
of all the information provided by the password dumps in January, but
I'll continue to work on it and will post new results here on the
blog. In the meantime, if you have any questions or comments leave
them below, or contact me on Twitter @PwdRsch.
* Study Methodology
I monitored Twitter and other sites for
notices that a data dump had been publicly posted. Some data dumps
contained user or customer information but not passwords. Others
contained only the administrator password or the passwords of a very
limited number of users. I ignored these and focused only on sources
that contained passwords (hashed or plaintext) of at least a dozen
users.
I also attempted to eliminate duplicate
dumps, a practice where one hacker copies a full or partial dump from
someone else and reposts it as their own. Sometimes these dumps are
from the same month and sometimes they are from previous months.
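The two selection rules above could be sketched roughly as follows. The filtering was not necessarily automated this way; the data layout (a dump as a dictionary with a list of user and password pairs), the subset test used to spot reposts, and the select_dumps name are all assumptions made for illustration.

```python
MIN_USERS = 12  # "at least a dozen" users, per the methodology above


def select_dumps(dumps):
    """Keep dumps with enough users that are not copies of a larger dump."""
    kept = []
    # Visit larger dumps first so that a partial repost is always compared
    # against the fuller dump it may have been copied from.
    for dump in sorted(dumps, key=lambda d: len(d["credentials"]), reverse=True):
        creds = frozenset(dump["credentials"])
        users = {user for user, _ in creds}
        if len(users) < MIN_USERS:
            continue  # admin-only or very small dumps are ignored
        # Assumption: a dump whose entries all appear in an already kept
        # dump is treated as a full or partial repost.
        if any(creds <= frozenset(k["credentials"]) for k in kept):
            continue
        kept.append(dump)
    return kept


# Hypothetical example data.
original = {
    "source": "a",
    "credentials": [(f"user{i}", f"pw{i}") for i in range(20)],
}
repost = {"source": "b", "credentials": original["credentials"][:15]}
tiny = {"source": "c", "credentials": [("admin", "secret")]}
other = {
    "source": "d",
    "credentials": [(f"x{i}", f"p{i}") for i in range(12)],
}

selected = select_dumps([repost, tiny, original, other])
print([d["source"] for d in selected])
```

An exact subset test will miss reposts that were reformatted or lightly edited, so a manual review is still needed for borderline cases.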
In a few cases the dump poster noted
that they had included only a subset of the available user passwords.
While I didn't count the unposted passwords, we should assume that
the attacker had access to the complete user database, which would
increase the total number of passwords exposed.
When reviewing these figures keep in
mind that they account only for the publicly posted data which I was
able to discover. Hackers certainly compromised the passwords of
other sites and kept this activity secret, or shared the data over
more private channels.