Over on Reddit, someone posted a link to more than 50,000 MySpace user e-mails and passwords collected by a phishing site. I'm not going to post the link to those passwords. You'll have to find it yourself.
I wrote a quick Ruby script to pore through the text. Normally I would have done this in Perl, but Ruby seems to be the language du jour.
Here's a breakdown of things I noticed about people's passwords. I tallied how many passwords contained a lowercase letter, an uppercase letter, a number, or a symbol. I also checked for passwords that were the same as the user name (the part of the e-mail address before the @), and for passwords that followed the pattern of "word-then-number".
total: 52692
lowercase: 50818 (96.44%)
numbers: 42570 (80.79%)
symbols: 2959 (5.62%)
uppercase: 2330 (4.42%)
Same as username: 197 (0.37%)
Word-Number pattern: 35800 (67.94%)
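The percentages in this breakdown are each count divided by the total. Here is a minimal Ruby sketch of that arithmetic, using three of the counts reported above. The `pct` helper is a hypothetical name for illustration and is not part of the original script.

```ruby
# Hypothetical helper: format a count as a percentage of the total,
# rounded to two decimal places.
def pct(count, total)
  return '0.00%' if total.zero? # avoid dividing by zero on an empty file
  format('%.2f%%', 100.0 * count / total)
end

total = 52_692 # total reported in the breakdown

puts "lowercase: 50818 (#{pct(50_818, total)})"
puts "numbers: 42570 (#{pct(42_570, total)})"
puts "Word-Number pattern: 35800 (#{pct(35_800, total)})"
```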
Almost everyone uses lowercase letters in their password, and four out of every five users use numbers. Fewer than 6% of users use symbols, and fewer than 6% use uppercase letters. Just over a third of a percent of these MySpace users are dumb enough to use the user name from their e-mail address as their password. Two-thirds of users use a "word followed by a number" pattern for their password. If I were to write a password cracker, I would first try passwords that follow this very common pattern. (I must admit: most of my own passwords follow this pattern.)
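To make the "word-then-number" category concrete, here is a small sketch using the same regular expression as the script at the end of this post. The helper name and the sample passwords are made up for illustration; none of them come from the leaked list.

```ruby
# Same regular expression the script uses for the "word-then-number" check:
# one or more ASCII letters followed by one or more digits, and nothing else.
WORD_THEN_NUMBER = /^[a-zA-Z]+[0-9]+$/

# Hypothetical helper name; returns true or false.
def word_then_number?(password)
  WORD_THEN_NUMBER.match?(password)
end

# Made-up sample passwords, not taken from the leaked list.
%w[soccer12 a123456 12soccer soccer iloveyou2! Monkey7].each do |pw|
  puts "#{pw}: #{word_then_number?(pw)}"
end
```

Note how narrow the pattern is: a number in front, a trailing symbol, or no number at all means the password is not counted.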
I went on to check for the most common words. This was done by stripping out any non-letter characters from each password and counting what was left in a hash table.
password 334
a 299 (The single letter 'a' is commonly used with a sequence of numbers.)
soccer 285
iloveyou 273
fuckyou 173
love 139
abc 137
football 135
baseball 125
myspace 122
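The stripping step explains why a single letter like 'a' can rank so high: every password of the form "a" plus digits collapses to the same key. Here is a rough sketch of that tally on a handful of made-up passwords. The sample data is hypothetical, and `Hash.new(0)` is just one convenient way to count.

```ruby
# Made-up sample passwords, not taken from the leaked list.
samples = %w[soccer10 a123456 soccer7 a1 iloveyou 12345 password1]

counts = Hash.new(0) # missing keys start at 0
samples.each do |pw|
  letters = pw.gsub(/[^a-zA-Z]/, '') # drop everything but ASCII letters
  next if letters.empty?             # e.g. "12345" leaves nothing behind
  counts[letters] += 1
end

# Print the most common words first.
counts.sort_by { |_word, n| -n }.each { |word, n| puts "#{word} #{n}" }
```

Here "soccer10" and "soccer7" both count toward "soccer", and "a123456" and "a1" both count toward "a".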
There are a few surprises in this list. Why on earth would anyone make their password "password"? For that matter, if you are on MySpace, why make your password "myspace"? There seem to be a lot of sports fans on MySpace: "soccer", "baseball", and "football" all appear in the top ten most common words. There is also a love/hate drama going on between many users: "love", "iloveyou", and "fuckyou" are common choices for passwords.
So where's the code?!?
#!/usr/bin/ruby
# Expects the path to a file with one "email:password" pair per line.
file = ARGV[0]
abort "usage: #{$0} FILE" if file.nil?

total = 0
wordnum = 0
numbers = 0
lowercase = 0
uppercase = 0
symbol = 0
samename = 0
words = Hash.new(0)   # letters-only word => count, defaulting to 0

IO.foreach(file) do |line|
  line.strip!
  # Split on the first colon only, so a password containing ':' stays whole.
  (user, password) = line.split(':', 2)
  next if password.nil? || password.empty?
  (name, _domain) = user.split('@', 2)
  total += 1

  # Letters followed by digits and nothing else, e.g. "soccer12".
  wordnum += 1 if password =~ /^[a-zA-Z]+[0-9]+$/
  numbers += 1 if password =~ /[0-9]/
  lowercase += 1 if password =~ /[a-z]/
  uppercase += 1 if password =~ /[A-Z]/
  # Only the symbols listed here are counted; others such as '.' or '?' are not.
  symbol += 1 if password =~ /[`~!@#\$%^&*()_\+\-\\=]/
  samename += 1 if password == name

  # Tally the password with every non-letter stripped out.
  lettersonly = password.gsub(/[^a-zA-Z]/, '')
  next if lettersonly.empty?
  words[lettersonly] += 1
end

puts "total: #{total}"
puts "numbers: #{numbers}"
puts "lowercase: #{lowercase}"
puts "uppercase: #{uppercase}"
puts "symbols: #{symbol}"
puts "Same as username: #{samename}"
puts "Word-Number pattern: #{wordnum}"

# Ten most common words, or fewer if the file has fewer distinct words.
words_ary = words.sort { |a, b| b[1] <=> a[1] }
words_ary.first(10).each do |word, count|
  puts "#{word} #{count}"
end