top | item 40983316

ryapric | 1 year ago

On mobile so no idea if this a) looks good or b) runs (especially considering the command substitutions, but you could also redirect to temp files instead), but it's just something like this:

    N=0
    while read -r URL; do
      data="$(curl "$URL")" || { printf 'error fetching data\n' && exit 1 ; }
      prop="$(jq '.data[].someProp' <<< "$data")" || { printf 'error parsing JSON response\n' && exit 1 ; }
      grep -v "filter" <<< "$prop" > "$N.data" || { printf 'error searching for filter text\n' && exit 1 ; }
      N="$((N+1))"
    done < ./links.txt
bash also has e.g. functions, so you could abstract that error handling out. Like I said, not that weird.
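For instance, the repeated `|| { printf ...; exit 1 ; }` could be pulled into a helper, maybe something like this (a sketch; the `die` name is my own, not from the snippet above, and the `echo` stands in for the real `curl` call):

```shell
# Hypothetical helper that centralizes the error handling from the loop above
die() { printf '%s\n' "$1" >&2; exit 1; }

# Same pattern as the loop body, with the handler abstracted out
data="$(echo '{"ok":true}')" || die 'error fetching data'
printf '%s\n' "$data"
```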

Edit: oh good, at least the formatting looks ok.

theamk | 1 year ago

You forgot the "-f" flag to curl, which means it won't fail if the server returns an error. Also, "jq" pretty much always returns success on empty input. Together, this might mean that networking errors get completely ignored, and some of your data files silently end up empty when the server is busy. Good luck debugging that!

And yes, you can fix those pretty easily... as long as you're aware of them. But this is a great illustration of why you have to be careful with bash: it's a 6-line program that already has 2 bugs which can cause data corruption. I fully agree with the other commenter: switch to Python if you have any sort of complex code!
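Concretely, the fixes would look roughly like this (a sketch; the exact flag choices are mine). The `||` handling itself is sound, since a command substitution's exit status does propagate to the assignment, which is easy to demonstrate:

```shell
# The two flag fixes, shown as comments since they need a live server:
#   curl -fsS "$URL"           # -f: treat HTTP errors (>= 400) as failures
#   jq -e '.data[].someProp'   # -e: nonzero exit on null/false/empty output
#
# A command substitution's exit status propagates to '||':
data="$(false)" || echo "caught the failure"
data="$(echo ok)" || echo "never printed"
echo "data=$data"
```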

1-more | 1 year ago

plus the delightful mechanism of

    set -euo pipefail # stop execution if anything fails
    cleanup () {
        rm -f "$tmp_file"
        popd # assumes the script pushd'd somewhere earlier
        # anything else you want to think of here to get back to the state of the
        # world before you ran the script
    }
    trap cleanup EXIT # no matter how the program exits, run that cleanup function.
A really good rundown of the different options for set: https://www.howtogeek.com/782514/how-to-use-set-and-pipefail...
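To see the trap actually fire, here's a minimal self-contained sketch (the subshell and filenames are my own, used so the simulated failure doesn't kill your session):

```shell
tmp_file="$(mktemp)"
(
  # cleanup runs however the subshell exits, success or failure
  trap 'rm -f "$tmp_file"' EXIT
  exit 1  # simulate a mid-script failure
) || echo "script failed, but cleanup still ran"
[ ! -e "$tmp_file" ] && echo "temp file is gone"
```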

Too | 1 year ago

But why bother? The moment you start doing all that, all the arguments of "oh look how much I can solve with my cool one-liner" go away. The Python version of that code is not only safe by default, it's also shorter and actually readable. Finally, it is a lot more malleable, in case you need to process the data further in the middle of it.

    import requests  # third-party: pip install requests
    from pathlib import Path

    for N, url in enumerate(Path("links").read_text().splitlines()):
        resp = requests.get(url)
        resp.raise_for_status()
        prop = resp.json()["data"]["someProp"]
        matches = (line for line in prop.splitlines() if "filter" not in line)
        Path(f"{N}.data").write_text("\n".join(matches))
I'm sure there is something about the jq [] operator I am missing, but whatever. An iteration there would be a contrived use case, and the difficulty of understanding it at a glance just proves I'm not interested. As someone else mentioned, both curl and jq require some extra flags to not ignore errors; I can't say if that was intentional or not. Either way, it would be equally easy to solve.

ryapric | 1 year ago

I never said anything about one-liners?