20061121

A Review of MySpace Passwords using Ruby

Over on the site Reddit, someone posted a link to some 50,000+ MySpace user e-mails and passwords from a phishing site. I'm not going to post the link to those passwords. You'll have to find it yourself.

I wrote a quick Ruby script to pour through the text. Normally I would have done this in Perl, but Ruby seems to be the language du jour.

Here's a breakdown of things I noticed about people's passwords. I tallied the number of passwords that had a lowercase letter, uppercase letter, a number, or a symbol in the password itself. I also checked for usernames that were the same as their passwords, and passwords that followed the pattern of "word-then-number".

total: 52692
lowercase: 50818 (96.44%)
numbers: 42570 (80.79%)
symbols: 2959 (5.62%)
uppercase: 2330 (4.21%)
Same as username: 197 (0.37%)
Word-Number pattern: 35800 (67.94%)

Almost everyone is using lowercase letters in their password. 4 out of every 5 users are using numbers. Less than 6% of all users are using symbols or uppercase letters in their passwords. Almost 1/3rd of a percent of MySpace's users are dumb enough to use their e-mail's user name as their password. Two-thirds of users use a "word followed by a number" pattern for their password. If I were to write a password cracker, I would first try to hack accounts that followed this very common pattern. (I must admit: most of my own passwords follow this pattern.)

I went on to check for the most common word usage. This was done by stripping out any non-letter characters and dumping the results in a hash table.

password 334
a 299 (The single letter 'a' is commonly used with a sequence of numbers.)
soccer 285
iloveyou 273
fuckyou 173
love 139
abc 137
football 135
baseball 125
myspace 122

There are a few surprises in this list. Why on earth would anyone make their password "password"? For that matter, if you are on MySpace, why make your password "myspace". There seem to be a lot of sports fans on MySpace. "soccer","baseball", and "football" all appear in the top ten most common words. There is also a love/hate drama going on between many users. "love", "iloveyou" and "fuckyou" are common choices for passwords.

So where's the code?!?

#!/usr/bin/ruby

file = ARGV[0]

total     = 0
wordnum   = 0
numbers   = 0
lowercase = 0
uppercase = 0
symbol    = 0
samename  = 0

words = Hash.new

IO.foreach(file) do |line|

    line.strip!

    (user,password) = line.split /:/
    next if password == nil

    (name,domain)   = user.split /@/
    total = total + 1

    if password =~ /^[a-zA-Z]+[0-9]+$/
        wordnum = wordnum + 1
    end

    if password =~ /[0-9]/
        numbers = numbers + 1
    end

    if password =~ /[a-z]/
        lowercase = lowercase + 1
    end

    if password =~ /[A-Z]/
        uppercase = uppercase + 1
    end

    if password =~ /[`~!@#\$%^&*()_\+\-\\=]/
        symbol = symbol + 1
    end

    if password == name
        samename = samename + 1
    end

    lettersonly = password.gsub(/[^a-zA-Z]/, '')

    next if lettersonly =~ /^$/

    if false == words.key?(lettersonly)
        words[lettersonly] = 1
    else
        words[lettersonly] = words[lettersonly] + 1
    end

end

puts "total:               #{total}"
puts "numbers:             #{numbers}"
puts "lowercase:           #{lowercase}"
puts "uppercase:           #{uppercase}"
puts "symbols:             #{symbol}"
puts "Same as username:    #{samename}"
puts "Word-Number pattern: #{wordnum}"

words_ary = words.sort {|a,b| b[1]<=>a[1]}

10.times do |i|
    puts "#{words_ary[i][0]}  #{words_ary[i][1]}"
end

20061113

Bits of Java

Note: Today's coding gem is one that probably most Java programmers have experienced at least once.

A student of mine came to me with a problem from a different class she was taking. She had to write a program to generate truth tables for certain electrical gates. Because she's in my Java class, Java was her choice to complete the assignment. While I teach Java, I'm very unfamiliar with many of its low level aspects. This was a unique challenge for me.

These particular electrical gates required 4 binary inputs. How could I quickly generate all 16 possible binary combinations of 1's and 0's without resorting to spaghetti code? My first thought was to write a simple 'for' loop to generate the binary bits.

Solution #1:

    for (int i = 0; i < 16; i++) { // Combination Number
        for (int j = 0; j < 4; j++) { // Nth bit in combination
            int bit = (int) Math.pow(2,j);
            System.out.print( (i&bit)/bit );
        } // End For
        System.out.println("");
    } // End For

It just looks ugly, but it's mathematically correct (or at least plausible). We call "Math.pow", convert double to integer, there seems to be division taking place (but it's anyone's guess as to why), and we still have to perform the logical "and" (&). Bleh! How is anyone suppose to understand what's going on?

Solution #2:

    for (int i = 0; i < 16; i++) { // Combination Number
        for (int j = 0; j < 4; j++) { // Nth bit in combination
            System.out.print( (i>>>j)&1 );
        } // End For
        System.out.println("");
    } // End For

It's a little cleaner. There's a unsigned shift which is then compared and-wise to the number 1. It's only two operations, both of which happen at the binary level. Works for me. Too bad I figured this out after she came up with her own solution using large, ugly 2D-arrays.

20061102

The Impossible FFTW Bug

Ever had a computer bug that you spent weeks working on it? You might try every trick you know to solve it: debugging messages, profilers, Google, professors, coworkers, fellow students. You might try to explain the code to someone who doesn't even program just so you have to describe the logic in simple terms. Nothing seems to fix the problem.

I had this bug in my code for about two weeks. I had to write a program to compute the 2 dimensional FFT of an image. That's it. I'll just use FFTW, the most popular open source library for computing FFTs.

The Wrong Solution for computing FFTs with FFTW:

int j;
fftw_plan myplan;
fftw_complex in[x*y], out[x*y];

for (j = 0; j < x*y; j++) {
    in[j][0] = r[j];
    in[j][1] = i[j];
} // End For

myplan = (dir == 1)? fftw_plan_dft_2d(x, y, in, out, FFTW_FORWARD,  FFTW_MEASURE) :
                     fftw_plan_dft_2d(x, y, in, out, FFTW_BACKWARD, FFTW_MEASURE) ;

fftw_execute(myplan);

In these few lines of code lies a problem that I've spent weeks trying to find. There are no syntax errors, nor are there logic errors. Every time I ran this code, the result would be all zeros. If I ran it multiple times, I got the correct results for every execution after the first. I dropped the data into a prepared array, made a plan (as defined by the FFTW C library), then executed that library. Should work, right? Wrong.

The Correct Solution for computing FFTs with FFTW:

int j;
fftw_plan myplan;
fftw_complex in[x*y], out[x*y];

myplan = (dir == 1)? fftw_plan_dft_2d(x, y, in, out, FFTW_FORWARD,  FFTW_MEASURE) :
                     fftw_plan_dft_2d(x, y, in, out, FFTW_BACKWARD, FFTW_MEASURE) ;

for (j = 0; j < x*y; j++) {
    in[j][0] = r[j];
    in[j][1] = i[j];
} // End For

fftw_execute(myplan);

I found buried in the FFTW FAQ that I'm suppose to make a plan before giving it the data. I switched two sets of lines around and everything magically works for reasons I fail to understand. Let this be a lesson to anyone having to use FFTW.