Today in IRC suseROCKS needed to find all duplicate files in a directory by their content, not by their file name, so we whipped up this fancy little bash one-liner to do the trick:
find . -type f -exec md5sum '{}' \; | sort | awk 'dup[$1]++{print $2}'
EDIT:
As Andreas suggested, using xargs instead of -exec is much faster. Here is the updated command:
find . -type f -print0 | xargs -0 md5sum | sort | awk 'dup[$1]++{print $2}'
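One caveat with print $2: awk splits fields on whitespace, so a file name that contains spaces gets cut off at the first space. Here is a rough sketch of a variant that prints the whole name instead. It assumes GNU md5sum (a 32-character hash followed by a two-character separator) and file names without newlines or backslashes; list_dupes is just a name I made up for illustration.

```shell
# list_dupes: print every file under a directory whose content matches an
# earlier file in sorted order (hypothetical helper, not from the original).
list_dupes() {
    local dir="${1:-.}"
    find "$dir" -type f -print0 \
        | xargs -0 -r md5sum \
        | sort \
        | awk 'seen[$1]++ { print substr($0, 35) }'   # name starts at column 35
}
```

Run it as list_dupes /some/directory. As with the one-liner, the first file of each group is treated as the original and only the later copies are printed.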
My friend Sam posted a blog entry on his top 15 commands used from the command line, so here are mine:
sontek@inspidell:~> history | awk '{print $2}' | awk 'BEGIN {FS="|"} {print $1}'|sort|uniq -c | sort -n | tail -n 15 | sort -nr
143 ls
135 cd
84 vim
69 exit
57 ssh
56 su
35 svn
25 man
24 rm
24 python
22 sudo
22 jhbuild
18 make
17 grep
16 xrandr
You can tell a lot about a person by their top 15 commands and as you can see with mine, the majority of mine are used for coding!
You can see a breakdown of the command I used to list these here: http://czarism.com/my-top-ten-linux-comments-history
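If you want to reuse the pipeline, here is a small sketch that wraps it in a function. The name top_commands and the count argument are my own additions; it reads history-style lines on stdin, since the history builtin is not available inside a non-interactive script.

```shell
# top_commands: read `history` output on stdin and print the N most used
# commands with their counts (hypothetical wrapper around the pipeline above).
top_commands() {
    local n="${1:-15}"
    awk '{ print $2 }' \
        | awk -F'|' '{ print $1 }' \
        | sort \
        | uniq -c \
        | sort -nr \
        | head -n "$n"
}
```

Usage: history | top_commands 15. The first awk keeps the command word, the second cuts off anything glued on after a pipe character, and sort -nr | head replaces the sort -n | tail | sort -nr dance.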
What are your top 15 commands?
I was helping a friend debug a problem with gksu (a gnomesu alternative) today, and we chose to use strace, which lets you trace the system calls an application makes.
To monitor all system calls an application makes you can redirect the output to a file like so:
strace <command> 2> <file name>
or
strace -o <file name> <command>
Both commands capture the same trace. strace sends all of its output to stderr (standard error, file descriptor 2) by default, so the first command simply redirects stderr to the file. The second command uses the built-in -o argument, which is much cleaner. Note that -o has to come before <command>; anything after the command is passed to the program being traced instead of to strace.
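If the stderr redirection is new to you, here is a tiny sketch you can run without strace installed. fake_tracer is a made-up stand-in that behaves like strace in one respect only: it writes its report to stderr and leaves stdout alone.

```shell
# fake_tracer stands in for strace: it prints the "program output" on stdout
# and its "trace" on stderr (hypothetical helper for illustration only).
fake_tracer() {
    echo "program output"
    echo "trace: open called" >&2
}

log=$(mktemp)

# Send stderr (file descriptor 2) to a file; stdout is untouched.
fake_tracer 2> "$log"

# Merge stderr into stdout so a pipe can filter the trace lines.
matches=$(fake_tracer 2>&1 | grep -c open)
```

After this runs, the log file holds only the trace line, and matches counts the trace lines that grep could see once stderr was merged into stdout.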
One of the first things I like to do with strace is check whether the application is having trouble accessing a file. I see this a lot, either because the file doesn’t exist or because the user executing the command does not have permission to access it. You can check with these commands:
strace <command> 2>&1 |grep open
or
strace -e open <command>
Again, these commands return similar results. The first command redirects stderr to stdout so you can use grep to filter the output. The second command is the preferred method because it uses the built-in -e argument, which traces only the named system calls (it takes a comma-separated list, so you can do strace -e open,read). As with -o, the option goes before <command>.
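To go one step further and show only the opens that failed, something like the sketch below can filter a saved trace. failed_opens is a name I made up. It assumes the usual strace line layout, where a failed call ends in " = -1 ERRNO (message)", and it also looks for openat, since current Linux systems open most files through that call.

```shell
# failed_opens: read strace output on stdin and print the open/openat calls
# that failed with "no such file" or "permission denied"
# (hypothetical helper, assumes the default strace output format).
failed_opens() {
    grep -E '(^| )open(at)?\(' \
        | grep -E ' = -1 (ENOENT|EACCES) '
}
```

Usage: strace -e trace=open,openat -o trace.log <command>, then failed_opens < trace.log.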
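The only other arguments that I’ve found really helpful are -ff, which when used with -o appends the pid (process id) to the file name so each process gets its own file, and -f, which also traces child processes (many versions accept -F too, but it is deprecated in favor of -f).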
I’m lazy, so I just have this basic script I run that upgrades my wordpress:
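#!/bin/bash
# Set both of these before running the script.
blog_directory=
update_url=
# Download and unpack the latest WordPress release.
wget http://wordpress.org/latest.tar.gz
tar -xzvf latest.tar.gz
# Back up the current blog before overwriting it.
tar -zcvf "blog-backup-$(date +'%F').tar.gz" "$blog_directory"
# Copy the new files over the old ones, then open the upgrade page.
cp -rv wordpress/* "$blog_directory"
links "$update_url"
rm latest.tar.gz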
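Since the backup is the step you really don’t want to get wrong, here is a sketch of it as a standalone function that refuses to run when the blog directory is empty or missing. backup_blog and its second argument (where to put the archive) are my own additions, not part of the script above.

```shell
# backup_blog: tar up the blog directory into a dated archive and print the
# archive path (hypothetical helper; fails if the directory is unset/missing).
backup_blog() {
    local blog_directory="$1"
    local dest="${2:-.}"
    if [ -z "$blog_directory" ] || [ ! -d "$blog_directory" ]; then
        echo "backup_blog: not a directory: '$blog_directory'" >&2
        return 1
    fi
    local archive="$dest/blog-backup-$(date +'%F').tar.gz"
    tar -czf "$archive" "$blog_directory" && echo "$archive"
}
```

Usage: backup_blog "$blog_directory" || exit 1 before the cp line, so a failed backup stops the upgrade.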
How do you upgrade your wordpress?
There’s a discussion going on at reddit about PS1 (here).
Mine is:
PS1='\d \t\n\[\033[01;32m\]\u@\h\[\033[01;34m\] \w \n\$\[\033[00m\] '
and it looks like:
Wed Feb 20 01:09:15
sontek@inspidell ~
$
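If the escape soup is hard to read, here is the same prompt built from named pieces. The variable names are mine; the value of PS1 should come out identical to the one above. \d is the date, \t the time, \u the user, \h the host, \w the working directory, and \$ shows # for root and $ otherwise. The \[ and \] pairs tell bash that the color codes take up no space on screen.

```shell
# Color codes wrapped in \[ \] so bash does not count them in the prompt width.
green='\[\033[01;32m\]'   # bold green
blue='\[\033[01;34m\]'    # bold blue
reset='\[\033[00m\]'      # back to the default color

# Line 1: date and time. Line 2: user@host and directory. Line 3: the prompt.
PS1="\d \t\n${green}\u@\h${blue} \w \n\\\$${reset} "
```

The \\\$ looks odd, but inside double quotes it is what it takes to end up with a literal \$ in the prompt string.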
Thanks to Chris Crummer for pointing out the reddit post.
Today I found a very informative post on how to get a progress bar with the cp command in Linux.
You can find that blog post here. But I’ll repost the information here in case his blog ever disappears.
With the following bash script:
#!/bin/sh
cp_p()
{
    set -e
    strace -q -ewrite cp -- "${1}" "${2}" 2>&1 \
        | awk '{
            count += $NF
            if (count % 10 == 0) {
                percent = count / total_size * 100
                printf "%3d%% [", percent
                for (i=0;i<=percent;i++)
                    printf "="
                printf ">"
                for (i=percent;i<100;i++)
                    printf " "
                printf "]\r"
            }
        }
        END { print "" }' total_size=$(stat -c '%s' "${1}") count=0
}
You will get a progress bar like this:
% cp_p /mnt/raid/pub/iso/debian/debian-2.2r4potato-i386-netinst.iso /dev/null
76% [===========================================> ]
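The trick in that script is that strace reports the return value of every write call, which is the number of bytes written, and awk adds those up and draws the bar. To play with just the drawing part, here is a sketch that takes the byte counts as arguments. progress_bar and its optional width argument are my own invention, not part of the original script.

```shell
# progress_bar: print one line of progress bar for COPIED out of TOTAL bytes,
# WIDTH characters wide (hypothetical helper, default width 50).
progress_bar() {
    awk -v copied="$1" -v total="$2" -v width="${3:-50}" 'BEGIN {
        percent = int(copied / total * 100)
        filled = int(percent * width / 100)
        bar = ""
        for (i = 0; i < filled; i++) bar = bar "="
        bar = bar ">"
        for (i = filled; i < width; i++) bar = bar " "
        printf "%3d%% [%s]\n", percent, bar
    }'
}
```

Usage: progress_bar 380 500 prints the bar at 76%. Swap the \n for \r if you want it to redraw in place like cp_p does.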
If you would like to parse torrent RSS feeds on a schedule and don’t want to bog your server down with Azureus, I wrote a basic shell script that you can drop into your cron jobs and have it do all the work for you.
You can download the latest release here or check out the latest code with svn co http://devtoo.net/svn/shtorrent
Then all you have to do is copy shtorrent-cron into your cron jobs and you’re set!
You can submit bugs or feature requests at http://devtoo.net/projects/shtorrent/
The original concept was taken from BashT