Shell script mistakes

117 points| reinhardt | 12 years ago |pixelbeat.org | reply

34 comments

[+] barrkel|12 years ago|reply

The single biggest shell script mistake is not handling whitespace in file names correctly, and it's almost impossible to do correctly if you have weird file names: embedded newlines, leading and trailing spaces, embedded tabs. Embedded quotes can be tricky too, especially if you're writing a script that generates a script.
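
One way to sidestep most of these cases, sketched here under the assumption that `find` is acceptable for the job: let `find` pass each path to an inner shell as its own argument, so no name is ever re-split on whitespace. The function name `count_lines` is made up for illustration.

```sh
#!/bin/sh
# Print "<line count><TAB><path>" for every regular file under a directory.
# find hands each path to the inner shell as a separate argument, so names
# containing spaces, tabs, or newlines are never re-split.
count_lines() {
    find "$1" -type f -exec sh -c '
        for f do
            # Reading from stdin keeps wc from echoing the file name;
            # tr strips the padding that some wc implementations add.
            n=$(wc -l < "$f" | tr -d " ")
            printf "%s\t%s\n" "$n" "$f"
        done
    ' sh {} +
}
```

Calling `count_lines .` walks the current directory. The important parts are the quoted `"$f"` and the `-exec ... {} +` form; the rest is only there to make the output readable.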

That bit, writing a script that generates a script, happens surprisingly often in bash. It's cheaper to pipe a stream to sed that converts it into a shell command than it is to iterate over all the lines, and individually pluck out the arguments for the commands you want to execute. Leaving the script as something that outputs shell commands also lets you inspect what it does before committing to it (by piping it to bash).
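
A minimal sketch of that workflow, assuming the generated commands are simple renames. `shell_quote` and `gen_backup_cmds` are hypothetical helper names, and the sed expression is the usual single-quote escaping trick rather than anything taken from the article.

```sh
#!/bin/sh
# Wrap a string in single quotes so the shell reads it back as one word.
# An embedded ' becomes '\'' (close quote, escaped quote, reopen quote).
shell_quote() {
    printf '%s\n' "$1" | sed "s/'/'\\\\''/g; 1s/^/'/; \$s/\$/'/"
}

# Emit one "mv" command per argument, renaming each file to add ".bak".
# Nothing is executed here; the output is only text.
gen_backup_cmds() {
    for f do
        q=$(shell_quote "$f")
        printf 'mv -- %s %s.bak\n' "$q" "$q"
    done
}
```

Running `gen_backup_cmds *.txt` prints the commands for inspection; `gen_backup_cmds *.txt | sh` then commits to them. Quoting every argument at generation time is what keeps embedded quotes and spaces from changing the meaning of the generated script.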

[+] lifthrasiir|12 years ago|reply

Related: Fixing Unix/Linux/POSIX filenames. http://www.dwheeler.com/essays/fixing-unix-linux-filenames.h... (HN discussion at https://news.ycombinator.com/item?id=6644955)

[+] yogsototh|12 years ago|reply

The problem with file names is the main reason I use zsh.

    for fic in **/*(.); do
        doSomethingWith $fic
    done

will take way more time than using find, but will work even with files that have strange characters in their names.

[+] BCM43|12 years ago|reply

I find that after a shell script gets to be over 3 or so lines, it's easier to switch over to python or perl. Do others feel the same?

[+] laumars|12 years ago|reply

I don't agree with this. I love Perl, it's up there as one of my favourite languages (despite its many shortcomings), but in my opinion shell scripts make more sense if you're writing scripts which depend upon a number of additional programs for the bulk of their processing.

For me, the point at which Perl makes more sense is if your script requires more internal logic than it does depends on spawning other programs.

For example, if I'm writing a routine to auto snapshot ZFS / Btrfs volumes and delete any over a certain age, the script would be dependent on your file system CLI tools. So it makes more sense to have an 80+ line shell script than it does to write that in Perl / Python.

However, if I was writing a routine which requires users to input details, where those details need to be sanity checked and then stored somewhere (such as a database), then the core logic of that program resides within your script (where you'd have to read inputs, do your sanity checks and then write to the database). So a Perl or Python script makes more sense.

Obviously you can do either of those examples in each of those languages (crudely speaking as I know shell scripts aren't technically a programming language); but that's just a basic example of where I personally draw the line.

I also think this is one of those occasions where it doesn't massively matter which approach you take just so long as the code works and is maintainable (though I draw the line at one maintenance script I saw last year. It was a Python script where every other line was os.system. It just struck me as rather pointless starting a Python interpreter if you're just going to use it like a shell script - you might as well do the whole lot in the shell to begin with).

[+] chrismonsanto|12 years ago|reply

For me, it has nothing to do with the number of lines, but the amount of actual computation I am doing. If all I need to do is wire up the inputs and outputs of X commands, then shell script is the best tool for the job.

If I actually need to do something with an input or output before sending it to the next command, I'll switch to Python.

For CLI tasks in Python, http://docopt.org/ and http://shell-command.readthedocs.org/en/latest/ are invaluable tools. Yes, I know about Envoy, but shell_command is much nicer to use in my opinion. Envoy doesn't automatically escape arguments, which is annoying for all the reasons that SQL injections are annoying.

[+] barrkel|12 years ago|reply

It entirely depends on what the script is doing. If it's orchestrating a pipeline of processes, shell script is ideal. If it needs much in the way of complicated temporary data structures, shell is the worst way to go.

[+] joshguthrie|12 years ago|reply

I was gonna write this.

Shell script syntax has always been hard for me to remember (I find it less intuitive than Lisp, Perl, JS, C, Ruby, Python...), it is subject to different behaviors between shells (csh vs ksh IIRC), string manipulation is close to non-existent, and types are more difficult to use than in PHP.

Okay, it works, we can do stuff with it, we can hack a quick thingie here and there... I know the main appeal is, "it just runs everywhere, no install required" (minus the csh/ksh differences), but is it really the tool we need for all we use it for?

[+] emmelaich|12 years ago|reply

Yep pretty much, but it can be up to 100 lines depending on the nature of the script.

To the article, I've never used a shell that doesn't understand [[ (guaranteed to be builtin, not sensitive to the hyphen issue) and {,} expansion.

As for the "for file in *; ... " example, that is also susceptible to the shell argument limits and is fairly hairy.

Lastly if you're concerned about performance -- well, it's definitely time to leave shell for something better.

[+] elwin|12 years ago|reply

It depends what it's doing. Other languages are much better at data processing. They make it more complicated to start child processes, use pipes, set environment variables, etc. If you're doing system administration tasks, you'll end up with a lot of shell command lines embedded in the program. (Try editing a crontab in Python...)

[+] nilved|12 years ago|reply

It depends on the goal of the script. I tend to rewrite complicated shell scripts in Ruby, but working with subprocesses in Ruby is terrible, so those scripts are destined to end in `.sh` forever. bash has a very specific purpose where it excels over scripting languages like Python and Ruby: wiring input and output and communicating with subprocesses.

[+] jmount|12 years ago|reply

Yes. I just promised myself yesterday to stop writing bash scripts for things other than setting up an environment and launching another program. What pushed me over the edge is that most of the script was incorporating the kind of fixes from the article (forcing a variable to be treated as an int, adding dollar signs to access it as a variable, and so on).

[+] coolsunglasses|12 years ago|reply

I've gotten in the habit of using Python or Haskell for my automation. Pretty happy with it.

[+] ams6110|12 years ago|reply

Depends on the target. Python, and even Perl, are not always available. /bin/sh is.

[+] gwu78|12 years ago|reply

This example

  for file in *;do wc -l "$file";done

could be reduced to

  for file in *; { wc -l "$file" ;}

in some POSIX-like shells.

Is the for loop even necessary?

    echo wc -l * |sh

But...

http://www.in-ulm.de/~mascheck/various/argmax/
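
The linked page is about the limit on the total size of arguments passed to one command. A hedged sketch of how to look the limit up and how to stay under it without a loop: `arg_max` and `wc_all` are illustrative names, and `-maxdepth` is an option supported by GNU and BSD find that may not exist everywhere.

```sh
#!/bin/sh
# Report the system limit, in bytes, on the arguments plus environment
# handed to a single exec call.
arg_max() {
    getconf ARG_MAX
}

# Count lines in every regular file directly inside a directory.
# "-exec ... {} +" makes find split the file list into batches that
# each fit under the limit, so wc may run more than once.
wc_all() {
    find "$1" -maxdepth 1 -type f -exec wc -l {} +
}
```

`wc_all .` behaves roughly like `wc -l *`, except that it also picks up dot files and, when the list is split into several batches, prints one "total" line per batch.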

[+] bloat|12 years ago|reply

This is a great page in the same vein - bash specific, but quite a bit more comprehensive.

http://mywiki.wooledge.org/BashPitfalls

[+] knweiss|12 years ago|reply

I recommend the shell script static analyzer ShellCheck: https://github.com/koalaman/shellcheck

[+] sateesh|12 years ago|reply

One subtle shell script mistake I was unaware of is that if a shell script is modified while it is running, the currently running instances of the script might fail [1].

1. http://stackoverflow.com/questions/2285403
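
A common guard, sketched here as an assumption rather than a guarantee for every shell: put the whole body in a function and call it on the same line as `exit`, so the shell has parsed everything it will ever run before it runs any of it. The demo writes such a script to a temporary file (the variable `script` is illustrative) and has the script append a line to itself to simulate an edit made mid-run.

```sh
#!/bin/sh
# Write a small guarded script to a temporary file.
script=$(mktemp "${TMPDIR:-/tmp}/guard.XXXXXX") || exit 1
cat > "$script" <<'EOF'
#!/bin/sh
# The whole body lives in main(), so it is parsed before anything runs.
main() {
    echo "start"
    # Simulate someone editing the script while it is running.
    echo 'echo "this line was appended mid-run"' >> "$0"
    echo "done"
}
# The call and the exit share one line, so both are parsed together and
# the shell never reads further into the (now modified) file.
main "$@"; exit
EOF

# Run it: the appended line ends up in the file but is never executed.
sh "$script"
```

The run prints only `start` and `done`. Without the trailing `exit`, a shell that reads its script incrementally could go on to read and execute the appended line.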

[+] memracom|12 years ago|reply

Let's not forget unit testing. After all a shell script is code and code should be unit tested.

https://code.google.com/p/shunit2/
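
To give a rough idea of what a shell unit test looks like, here is a minimal sketch that does not use shunit2 at all: `assert_equals` is a hand-rolled stand-in for a framework's assertion, and `strip_ext` is a made-up function under test.

```sh
#!/bin/sh
# Function under test (hypothetical): strip a trailing ".sh" from a name.
strip_ext() {
    printf '%s\n' "${1%.sh}"
}

# Minimal stand-in for a test framework's equality assertion.
failures=0
assert_equals() {
    # $1 = description, $2 = expected value, $3 = actual value
    if [ "$2" = "$3" ]; then
        printf 'ok   - %s\n' "$1"
    else
        printf 'FAIL - %s (expected "%s", got "%s")\n' "$1" "$2" "$3"
        failures=$((failures + 1))
    fi
}

assert_equals "removes .sh"       "deploy"    "$(strip_ext deploy.sh)"
assert_equals "keeps other names" "notes.txt" "$(strip_ext notes.txt)"
assert_equals "handles spaces"    "my script" "$(strip_ext 'my script.sh')"
```

A real framework adds test discovery, setup and teardown hooks, and a summary report; the point here is only that shell functions can be exercised and checked like any other code.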

[+] LukeShu|12 years ago|reply

I prefer http://bmizerany.github.io/roundup/ . It works in fewer shells, but when you know which shell(s) you are targeting, that doesn't matter. It is far less magic than shunit2--writing the tests actually feels like writing shell.

[+] sigzero|12 years ago|reply

No, just no.