20061012

Finding Identical Files using bash and find

I wanted to start a blog of all the little bits of code and tricks that I've learned over the years. I don't know how often I'll update it, but any time I use a bit of code to complete a complicated task, I'll describe the task and show the code used to solve it.

My first example is an identical file finder. In the Java programming class that I teach, I suspected two students of turning in the exact same work. While grading, I came across code that I was sure I had seen in another submission, but I couldn't remember whose. I keep all the students' work in separate directories on my office computer in case I need to look at their old homework solutions. Rather than check every solution manually for an identical file, I wrote this little bash script to do the work for me:
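#!/bin/bash
# findIdent: list files matching a name pattern that are identical to a given file.

if [ $# -eq 2 ]
then
    # Quote "$1" so that find, not the shell, expands the pattern; -exec avoids
    # word splitting on file names that contain spaces.
    # diff -q suppresses the line-by-line output; -s reports identical files.
    find . -type f -name "$1" -exec diff -qs {} "$2" \; | grep -v ' differ$'
else
    echo "USAGE: findIdent [SOME FILE EXPRESSION] [SOME FILE]" >&2
    exit 1
fi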
This script, "findIdent", takes two arguments: a file name pattern (say, "*.mp3") and the file you want to find duplicates of. Quote the pattern on the command line so that the shell passes it to the script unexpanded. So did I find any duplicates of students' work? No. It turns out I had just found some very similar code, and it was nothing to worry about. But I kept this script in case I ever need it again.
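For anyone who only wants the matching file names, here is a minimal sketch of the same idea as a shell function. It swaps diff for cmp, which compares byte by byte and prints nothing with -s. The function name find_ident and its optional third argument (the directory to search) are my own assumptions for illustration and are not part of the script above.

```bash
# Sketch: print each file under a directory whose name matches a pattern
# and whose contents are byte-identical to a reference file.
# find_ident is a hypothetical name; $3 (search directory) is an added assumption.
find_ident() {
    # $1 = name pattern, $2 = reference file, $3 = directory to search (default: .)
    # -exec ... \; acts as a test here: cmp -s exits 0 only for identical files,
    # so -print runs only for those files.
    find "${3:-.}" -type f -name "$1" -exec cmp -s {} "$2" \; -print
}
```

It would be called as find_ident "*.java" Homework.java, where Homework.java stands in for whatever file you are checking. If that file itself sits under the searched directory and matches the pattern, it will be listed too, since it is trivially identical to itself.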
