> git bisect is the "real" way to do this, but it's not something I've ever needed
git bisect is great and worth trying; it does what you're doing in your bash loop, but faster and with more capabilities, such as logging, visualizing, skipping, etc.
The syntax is: $ git bisect run <command> [arguments]
https://git-scm.com/docs/git-bisect
Yes, git bisect is the way to go: in addition to the stuff you mentioned, his method only dives into one parent branch of merge commits. git bisect handles that correctly. A gem of a tool, git bisect.
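As a concrete sketch of the `git bisect run` workflow described above (a hypothetical throwaway repo with made-up file and tag names, not the original poster's setup):

```shell
# Hypothetical demo: build a tiny repo where `grep -q good f` starts
# failing at some commit, then let `git bisect run` find that commit.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo

echo good > f;   git add f; git commit -qm "good 1"
git tag known-good
echo good >> f;  git commit -qam "good 2"
echo broken > f; git commit -qam "bug introduced here"
echo more >> f;  git commit -qam "later commit"

git bisect start
git bisect bad HEAD          # the current commit fails the check
git bisect good known-good   # the last known good point
git bisect run grep -q good f  # exit 0 = good, non-zero = bad
git bisect reset             # back to the original HEAD when done
```

The command given to `git bisect run` can be any script or test; git runs it at each probed commit and uses its exit status to narrow down the first bad commit in O(log n) steps.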
I've always had trouble getting `for` loops to work predictably, so my common loop pattern is this:
grep -l -r pattern /path/to/files | while read x; do echo $x; done
or the like.
This uses bash read to pull in one line per loop iteration; with a single variable, the whole line lands in `$x` (give read multiple variables and the line is split into words across them). It's pipe friendly and doesn't require futzing around with arrays or the like. One caveat: in bash, the receiving end of a pipeline runs in a subshell, so variables set inside the loop don't persist after it.
One place I do use bash for loops is when iterating over args, e.g. if you create a bash function:
function my_func() {
    for arg; do
        echo "$arg"
    done
}
This'll take a list of arguments and echo each on a separate line. Useful if you need a function that does some operation against a list of files, for example. Also, bash expansions (https://www.gnu.org/software/bash/manual/html_node/Shell-Par...) can save you a ton of time for various common operations on variables.
Once I discovered functions are pipeline-friendly, I started pipelining all the things. Almost any `for arg` can be re-designed as a `while read arg` with a function.
Here's what it can look like. Some functions I wrote to bulk-process git repos. Notice they accept arguments and stdin:
# ID all git repos anywhere under this directory
ls_git_projects ~/src/bitbucket |
# filter ones not updated for pre-set num days
take_stale |
# proc repo applies the given op. (a function in this case) to each repo
proc_repos git_fetch

Source: https://github.com/adityaathalye/bash-toolkit/blob/master/bu...
The best part is that sourcing pipeline-friendly functions into a shell session allows me to mix-and-match them with regular unix tools. Overall, I believe (and my code will betray it) functional programming style is a pretty fine way to live in shell!
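The helper functions above aren't shown here; as a minimal sketch of the pattern they describe, here is a hypothetical `greet` function (not from the author's toolkit) that accepts both arguments and stdin, so it can sit anywhere in a pipeline:

```shell
# Hypothetical pipeline-friendly function: iterate over arguments
# if given any, otherwise read items line by line from stdin.
greet() {
  if [ "$#" -gt 0 ]; then
    for arg; do printf 'hello %s\n' "$arg"; done
  else
    while IFS= read -r arg; do printf 'hello %s\n' "$arg"; done
  fi
}

# usage:
#   greet alice bob
#   printf 'alice\nbob\n' | greet
```

Because the stdin branch reads line by line, the function composes with any command that emits one item per line, exactly like the `ls_git_projects | take_stale | proc_repos` chain above.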
> I've always had trouble getting `for` loops to work predictably, so my common loop pattern is this:
for loops were exactly the pain point that led me to write my own shell > 6 years ago.
I can now iterate through structured data (be it JSON, YAML, CSV, `ps` output, log file entries, or whatever) and each item is pulled intelligently rather than having to consciously consider a tonne of dumb edge cases like "what if my file names have spaces in them"
eg
» open https://api.github.com/repos/lmorg/murex/issues -> foreach issue { out "$issue[number]: $issue[title]" }
380: Fail if variable is missing
379: Backslashes and code comments
378: Improve testing facility documentation
377: v2.4 release
361: Deprecate `swivel-table` and `swivel-datatype`
360: `sort` converts everything to a string
340: `append` and `prepend` should `ReadArrayWithType`

Github repo: https://github.com/lmorg/murex
Docs on `foreach`: https://murex.rocks/docs/commands/foreach.html
For me, I always used for loops and only recently (after a decade of using Linux daily) learned about the power of piped loops. It's strange to me that you're more comfortable with those than with for loops, but I think it does make sense, as you're letting a program generate the list to iterate over. A pain point with for loops is getting that right; e.g. there isn't a good way to iterate over files with spaces in them using a for loop (which is why I only recently learned about piped loops).
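For completeness, one piped-loop shape that does handle awkward file names is NUL-delimited input; this sketch uses bash's `read -d ''` (bash-specific, not POSIX sh):

```shell
# NUL-delimited piped loop: file names containing spaces (or even
# newlines) arrive intact, because find separates them with NUL
# bytes and read -d '' splits on NUL. Requires bash.
find . -name '*.txt' -print0 |
while IFS= read -r -d '' f; do
  printf 'found: %s\n' "$f"
done
```

`IFS=` prevents trimming of leading/trailing whitespace and `-r` stops backslash mangling, so each `$f` is exactly the path find emitted.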
Installing GNU stuff with the 'g' prefix (gsed instead of sed) means having to remember to include the 'g' when you're on a Mac and leave it off when you're on Linux, or use aliases, or some other confusing and inconvenient thing, and then if you're writing a script meant for multi-platform use, it still won't work. I find it's a much better idea to install the entire GNU suite without the 'g' prefix and use PATH to control which is used. I use MacPorts to do this (/opt/local/libexec/gnubin), and even Homebrew finally supports this, although it does it in a stupid way that requires adding a PATH element for each individual GNU utility (e.g. /usr/local/opt/gnu-sed/libexec/gnubin).
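The PATH approach amounts to one line in a shell profile; e.g. for the MacPorts location mentioned above (assuming the GNU packages are installed):

```shell
# Prepend MacPorts' unprefixed GNU tools so `sed`, `date`, etc.
# resolve to the GNU versions ahead of the BSD ones in /usr/bin.
export PATH="/opt/local/libexec/gnubin:$PATH"
```

Scripts then call plain `sed` on both macOS and Linux, with the platform difference handled entirely by PATH.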
One issue with that is it won't reflect the failure of those commands.
In bash you can fix that by looping around, checking for status 127, and using `-n` (which waits for the first job of the set to complete), but not all shells have `-n`.
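An alternative that sidesteps both the missing `-n` and the status-127 bookkeeping is to record each background PID and `wait` on them individually; a bash sketch (the jobs here are stand-ins for real commands):

```shell
# Start jobs in parallel, then wait on each recorded PID so that
# every child's exit status is observed. `wait <pid>` returns that
# child's status, so failures aren't silently swallowed.
pids=()
true      & pids+=("$!")
false     & pids+=("$!")   # deliberate failure
sleep 0.1 & pids+=("$!")

status=0
for pid in "${pids[@]}"; do
  wait "$pid" || status=$?
done
echo "worst exit status: $status"
```

This reports the last failure's status; portability note: arrays and `$!`-tracking like this are bash, not POSIX sh.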
> 1. Find and replace a pattern in a codebase with capture groups
> git grep -l pattern | xargs gsed -ri 's|pat(tern)|\1s are birds|g'
Or, in IDEA, Ctrl-Shift-r, put "pat(tern)" in the first box and "$1s are birds" in the second box, Alt-a, boom. Infinitely easier to remember, and no chance of having to deal with any double escaping.
What you are doing here is proposing a very specialist approach ("Why not just use a SpaceX Merlin engine?") when a slightly more cumbersome general approach ("This is how you get from A to B") was described.
IDEA is nice if you have IDEA.
"But not everyone uses Bash" - very correct (more fond of zsh, personally), but this article is specifically about Bash.
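For reference, the capture-group substitution from the quoted one-liner can be tried on stdin without touching any files (same pattern, `-E` extended regex instead of gsed's `-r`):

```shell
# \1 refers to the text captured by (tern), so "pattern" becomes
# "terns are birds" while the rest of the line is untouched.
echo "pattern matching" | sed -E 's/pat(tern)/\1s are birds/g'
# prints: terns are birds matching
```

The `-i` flag in the original is what makes it edit files in place; dropping it, as here, is a safe way to preview a substitution first.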
> git bisect is the "real" way to do this, but it's not something I've ever needed
uh, yeah, you did need it, that's why you came up with "2. Track down a commit when a command started failing". Seriously though, git bisect is really useful to track down that bug in O(log n) rather than O(n).
For the given command, if the assumption is the command failed recently, it's likely faster than bisect. You can start it and go grab a coffee. It's automatic.
I wish my usage of bisect were that trivial, though. Usually I need to find a bug in a giant web app. Which means finding a good commit, doing the npm install/start dance, etc. for each round.
When you're running a script, what is the expected behaviour if you just run it with no arguments? I think it shouldn't make any changes to your system, and it should print out a help message with common options. Is there anything else you expect a script to do?
Do you prefer a script that has a set of default assumptions about how it's going to work? If you need to modify that, you pass in parameters.
Do you expect that a script will lay out the changes it's about to make, then ask for confirmation? Or should it just get out of your way and do what it was written to do?
I'm asking all these fairly basic questions because I'm trying to put together a list of things everyone expects from a script. Not exactly patterns per se, more conventions or standard behaviours.
> When you're running a script, what is the expected behaviour if you just run it with no arguments? I think it shouldn't make any changes to your system, and it should print out a help message with common options. Is there anything else you expect a script to do?
Most scripts I use, custom-made or not, should be clear enough in their name for what they do. If in doubt, always call with --help/-h. But for example, it doesn't make sense that something like 'update-ca-certificates' requires arguments to execute: it's clear from the name it's going to change something.
> Do you prefer a script that has a set of default assumptions about how it's going to work? If you need to modify that, you pass in parameters.
It depends. If there's a "default" way to call the script, then yes. For example, in the 'update-ca-certificates' example, just use some defaults so I don't need to read more documentation about where the certificates are stored or how to do things.
> Do you expect that a script will lay out the changes it's about to make, then ask for confirmation? Or should it just get out of your way and do what it was written to do?
I don't care too much, but give me options to switch. If it does everything without asking, give me a "--dry-run" option or something that lets me check before doing anything. On the other hand, if it's asking a lot, let me specify "--yes" as apt does so that it doesn't ask me anything in automated installs or things like that.
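Both flags are cheap to support; here is a hypothetical skeleton (the flag handling and the `stale.lock` action are made up for illustration):

```shell
# Hypothetical skeleton for a mutating script: --dry-run previews
# the change, --yes skips the confirmation prompt.
main() {
  dry_run=0 assume_yes=0
  for arg in "$@"; do
    case "$arg" in
      --dry-run) dry_run=1 ;;
      --yes|-y)  assume_yes=1 ;;
    esac
  done

  # Interactive confirmation unless previewing or pre-approved.
  if [ "$dry_run" -eq 0 ] && [ "$assume_yes" -eq 0 ]; then
    printf 'Proceed? [y/N] '
    read -r answer
    [ "$answer" = y ] || return 1
  fi

  if [ "$dry_run" -eq 1 ]; then
    echo "would run: rm -f ./stale.lock"
  else
    rm -f ./stale.lock
  fi
}
# call as: main "$@"
```

The same `dry_run` switch can gate every mutating command, which keeps the preview output honest about what the real run would do.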
IMO any script that makes any real changes (either to the local system or remotely) should take some kind of input.
It's one thing if your script reads some stuff and prints output. Defaulting to the current working directory (or whatever makes sense) is fine.
If the script is reading config from a config file or envvars then it should still probably get some kind of confirmation if it's going to make any kind of change (of course with an option to auto-confirm via a flag like --yes).
For really destructive changes it should default to dry run and require an explicit --execute flag, but for less destructive changes I think a path as input on the command line is enough confirmation.
That being said, if it's an unknown script I'd just read it. And if it's a binary I'd pass --help.
A script is just another command, the only difference in this case is that you wrote it and not someone else. If its purpose is to make changes, and it's obvious what changes it should make without any arguments, I'd say it can do so without further ado. poweroff doesn't ask me what I want to do - I already told it by executing it - and that's a pretty drastic change.
Commands that halt halfway through and expect user confirmation should definitely have an option to skip that behavior. I want to be able to use anything in a script of my own.
What you're getting at seems to be more about CLI conventions as opposed to script conventions specifically. As such, you might want to have a look at https://clig.dev/ which is a really comprehensive document describing CLI guidelines. I can't say I've read the whole thing yet, but everything I _have_ read very much made sense.
What is your intended audience? Is it you? A batch job, or another program? Or a person who may or may not be able to read bash and will call it manually?
A good and descriptive name comes first, then the action and the people that may have to run it are next.
I am confused about how this works. I would assume `SECONDS` would just be a shell variable: it was first assigned `0`, so it should stay the same. Why does it keep counting the seconds?
> SECONDS
bash: SECONDS: command not found
> SECONDS=0; sleep 5; echo $SECONDS;
5
> echo "Your command completed after $SECONDS seconds";
Your command completed after 41 seconds
> echo "Your command completed after $SECONDS seconds";
Your command completed after 51 seconds
> echo "Your command completed after $SECONDS seconds";
Your command completed after 53 seconds
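What the transcript shows: `SECONDS` is one of bash's special variables, not an ordinary one. Each expansion reports the seconds elapsed since the shell started, and assigning to it resets that baseline, which is why `SECONDS=0` at the start of a command gives you its wall-clock duration:

```shell
# bash treats SECONDS specially: assignment resets the timer, and
# each subsequent expansion reports seconds elapsed since then.
SECONDS=0
sleep 2
echo "elapsed: $SECONDS seconds"
```

(That also explains the `command not found` line above: bare `SECONDS` is a variable to expand, not a command to run.)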
I really love this style of blog post. Short, practical, no backstory, and not trying to claim one correct way to do anything. Just an unopinionated share that was useful to the author.
It seems like a throwback to a previous time, but honestly can't remember when that was. Maybe back to a time when I hoped this was what blogging could be.
We need a simpler regex format, one that allows easy searching and replacing in source code. Of course, some IDEs already do that pretty well, but I'd like to be able to do it from the command line with a standalone tool I can easily use in scripts.
The simplest thing I know that is able to do that is coccinelle, but even coccinelle is not handy enough.
I think what we really need is one regex format. You have POSIX, PCRE, and also various degrees of needing to double-escape the slashes to get past whatever language you're using the regex in. It always adds a large element of guesswork, even when you are familiar with regular expressions.
The cURL project said it never properly supported HTTP/1.1 pipelining, and in 2019 it said the feature was removed once and for all: https://daniel.haxx.se/blog/2019/04/06/curl-says-bye-bye-to-...
Anyway, curl is not needed. One can write a small program in their language of choice to generate HTTP/1.1, but even a simple shell script will work. Even more, we get easy control over SNI, which the curl binary does not have.
There are different and more concise ways, but below is an example, using the IFS technique.
This also shows the use of sed's "P" and "D" commands (credit: Eric Pement's sed one-liners).
Assumes valid, non-malicious URLs, all with same host.
Usage: 1.sh < URLs.txt
#!/bin/sh
(
  IFS=/
  while read -r w x y z; do
    case $w in http:|https:) ;; *) exit ;; esac
    case $x in "") ;; *) exit ;; esac
    echo "$y" > .host
    printf '%s\r\n' "GET /$z HTTP/1.1"
    printf '%s\r\n' "Host: $y"
    # add more headers here if desired
    printf 'Connection: keep-alive\r\n\r\n'
  done | sed 'N;$!P;$!D;$d'  # drop the last request's keep-alive (final two lines)
  printf 'Connection: close\r\n\r\n'
) > .http
read -r x < .host
# with SNI:
#openssl s_client -connect "$x:443" -ign_eof -servername "$x" < .http
# without SNI:
openssl s_client -connect "$x:443" -ign_eof -noservername < .http
rm .host .http
The curl binary will reuse the TCP connection when fed multiple URLs. In fact, it can even use HTTP/2 and make the requests in parallel over a single TCP connection. A common pattern I use is to construct URLs with a script and feed them to curl with xargs.
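A sketch of that pattern (hypothetical URLs; `--parallel` needs curl 7.66+, and `file://` URLs are used here only so the sketch runs offline):

```shell
# Construct URLs with the shell and hand them all to one curl
# invocation via xargs; curl reuses connections across them, and
# --parallel multiplexes transfers (over a single connection on
# HTTP/2). Swap the file:// URL for real https:// ones.
printf 'file:///dev/null\n' |
  xargs curl --silent --parallel
```

Because all URLs land in a single curl process, there is one connection pool for the whole batch, unlike launching curl once per URL in a loop.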
I want to like this, but the for loop is unnecessarily messy, and not correct.
for route in foo bar baz do
curl localhost:8080/$route
done
That's just begging to go wonky. It should be:
stuff="foo bar baz"
for route in $stuff; do
echo curl localhost:8080/$route
done
Some might say that it's not absolutely necessary to abstract the list into a variable, and that's true, but it sure does make edits a lot easier. And the original is missing a semicolon before the 'do'.
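If the list lives in a variable anyway, a bash array is a step safer still, since `for route in $stuff` breaks the moment an element contains a space; a sketch:

```shell
# bash array: quoting "${stuff[@]}" preserves each element intact,
# even "baz qux" with its embedded space.
stuff=("foo" "bar" "baz qux")
for route in "${stuff[@]}"; do
  echo "curl localhost:8080/$route"
done
```

(The `echo` is kept from the comment above as a dry-run; drop it to actually run curl.)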
I think it's one reason I dislike lists like this: a newb might look at these and stuff them into their toolkit without really knowing why they don't work. It slows down learning. Plus, faulty tooling can be unnecessarily destructive.
These are the types of stuff you typically run on the fly in the command line, not full blown scripts you intended to be reused and shared.
Here's a few examples from my command history:
for ((i=0;i<49;i++)); do wget https://neocities.org/sitemap/sites-$i.xml.gz ; done
for f in img*.png; do echo $f; convert $f -dither Riemersma -colors 12 -remap netscape: dit_$f; done
Hell if I know what they do now, they made sense when I ran them. If I need them again, I'll type them up again.
Seems like you could say the same thing about every snippet. E.g., running jobs is the original purpose of a shell: (3) can be achieved using job specifications (%n) and the `jobs` command for oversight.
If I find myself running a set of commands in parallel, I'd keep a cheap Makefile around: individual commands I want to run in parallel will be written as phony targets:
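The Makefile itself is elided above; a hypothetical sketch of the shape described (made-up target names):

```make
# Hypothetical sketch: each parallel command is a phony target, so
# `make -j4` runs up to four at once, and make's exit status
# reflects any failed job (-k keeps going past failures).
.PHONY: all fetch-foo fetch-bar fetch-baz
all: fetch-foo fetch-bar fetch-baz

fetch-foo:
	curl -s localhost:8080/foo

fetch-bar:
	curl -s localhost:8080/bar

fetch-baz:
	curl -s localhost:8080/baz
```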
MainJane | 4 years ago:
If you are looking for a limit or the failing part of a file, have a look at: https://gitlab.com/ole.tange/tangetools/-/tree/master/find-f...
jolmg | 4 years ago:
The problem with the pipe-while-read pattern is that you can't modify variables in the loop, since it runs in a subshell.
michaelhoffman | 4 years ago:
I didn't know about `$SECONDS` so I'm going to change it to use that.
auno | 4 years ago:
It's been discussed here on HN before: https://news.ycombinator.com/item?id=25304257
gcmeplz | 4 years ago:
https://www.oreilly.com/library/view/shell-scripting-expert/...
js2 | 4 years ago:
https://gist.github.com/jaysoffian/0eda35a6a41f500ba5c458f02...
Uses perl instead of gsed, defaults to fixed strings but supports perl regexes, and properly handles filenames with whitespace.
usefulcat | 4 years ago:
I definitely use this all the time. Also, generating the list of things over which to iterate using the output of a command: