Say you have a file with many repeated, unnecessary lines that you want to remove. For safety's sake, you would rather write an abbreviated copy of the file than overwrite the original. Ruby makes this a cinch. You iterate over the file, recording every line the computer has already "seen" in a hash (Ruby's dictionary). If a line is not in the hash yet, it must be new, so you write it to the output file. Here's the code, designed with .tex files in mind but easily adaptable (enter the filename without the .tex extension when prompted):

puts 'Filename?'
filename = gets.chomp
input = File.open(filename + '.tex')
output = File.open(filename + '2.tex', 'w')
seen = {} # every line already written to the output
input.each do |line|
  if (seen[line])
    # duplicate line: skip it
  else
    output.write(line)
    seen[line] = true
  end
end
input.close
output.close
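
The same idea can also be packaged as a reusable method. The following is only a sketch, not part of the original script: the names unique_lines and dedupe_file are made up for illustration, and the block form of File.open is used so the output file is closed automatically.

```ruby
# Hypothetical helper (not from the original script): returns the lines
# in their original order, dropping any line that has already been seen.
def unique_lines(lines)
  seen = {}
  lines.reject do |line|
    duplicate = seen[line] # nil the first time a line appears
    seen[line] = true
    duplicate
  end
end

# Hypothetical wrapper: reads `source` line by line and writes the
# deduplicated copy to `target`, leaving `source` untouched.
def dedupe_file(source, target)
  File.open(target, 'w') do |output|
    unique_lines(File.foreach(source)).each { |line| output.write(line) }
  end
end
```

With these defined, a call such as dedupe_file('plot.tex', 'plot2.tex') (placeholder filenames) would do the same job as the script. For the plain case, Ruby's built-in Array#uniq gives the same result as unique_lines; the explicit hash is kept here because it is easy to extend with extra conditions.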

Where would this come in handy? Well, the .tex extension probably already gave you a clue that I am reducing redundancy in a $\LaTeX$ file. In particular, I have an R plot generated as a TikZ graphic. The plot includes a rug at the bottom (tick marks indicating data observations), but the data set has over 9,000 observations, so many of the tick marks are drawn right on top of each other. The $\LaTeX$ compiler got peeved at having to draw so many lines, so Ruby helped it out by eliminating the redundancy. One special tweak for using the script above on TikZ graphics files is to change the line

if (seen[line])

to

if (seen[line]) && !(line.include? 'node') && !(line.include? 'scope') && !(line.include? 'path') && !(line.include? 'define')

if your plot has multiple panes (e.g. par(mfrow=c(1,2)) in R), so that Ruby won't drop seemingly redundant lines that actually set up a new pane. With this change, any line containing node, scope, path, or define is always written, even if an identical line appeared earlier. The modified line is a little long and messy, but it works, and that was the main goal here. The resulting $\LaTeX$ file compiles easily and more quickly than it did with all those redundant lines, thanks to Ruby.
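
If the long condition bothers you, one possible tidier form keeps the four keywords in a list. This is a sketch under the assumption that the logic should match the long condition exactly; the method names protected_line? and unique_tikz_lines are made up for illustration.

```ruby
# Hypothetical helper: true if the line contains one of the keywords from
# the long condition, meaning it must be kept even when it is a repeat.
def protected_line?(line, keywords = %w[node scope path define])
  keywords.any? { |word| line.include?(word) }
end

# Hypothetical helper: keeps a line if it is new or protected, and drops
# it only when it has been seen before and contains none of the keywords.
def unique_tikz_lines(lines)
  seen = {}
  lines.select do |line|
    keep = !seen[line] || protected_line?(line)
    seen[line] = true
    keep
  end
end
```

Adding another keyword then means adding one word to the list rather than another clause to the condition.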