Linux tool to show progress for cp, mv, dd

540 points| jxi | 2 years ago |github.com

178 comments

[+] emidoots|2 years ago|reply
Wow, didn't expect to see @xfennec pop up on Hacker News while drinking my coffee this morning! I don't know if he'll see this (to be honest, I didn't know he was still doing things), but this person basically got me into programming and game development. I really can't believe it.

xfennec (and some friends?) I think built a game engine called Raydium, and one of their games, Mania Drive (a TrackMania clone), got distributed with openSUSE installation CDs back in the day. When I was just like 12 years old, my dad installed that on the family computer and it was all we had; Mania Drive was one of the coolest games on there. Me and my siblings played that for literally days and months on end, making crazy levels we couldn't beat without knowing every turn. It was a huge part of our childhood.

Their game engine was in C with PHP scripting, I remember posting some levels to their forums and asking, in retrospect super dumb, questions and they were so polite and friendly. I remember us joking at the time that the French seemed like these god-like game developers, it had such a profound impact on us, I even wrote about it last year and linked a video of Mania Drive first[0]. I went on to learn Python and then lower-level languages as a result. I'm not sure I'd be coding today without them, to be honest.

Sorry it's off-topic, just really blown away to see a username like that pop up in my feed. Really goes to show that kindness + some cool open source software can have a profound effect on people.

[0] https://devlog.hexops.com/2021/increasing-my-contribution-to...

[+] xfennec|2 years ago|reply
Xfennec here, thank you so much for this message. ManiaDrive was a small game made with a bunch of friends, I'm so glad it had an impact on you. I'm now a dad, and it makes me very emotional to read this. Thanks again.
[+] dilap|2 years ago|reply
Such a great story. This part made me lol:

> // Don't remove this print statement. Game will crash!

:-)

[+] yobert|2 years ago|reply
I'd never heard of that game. Trying it now and it's great fun. Really enjoying the soundtrack.
[+] ktm5j|2 years ago|reply
I love stuff like this, thanks for sharing!!
[+] jmclnx|2 years ago|reply
This is one thing I really miss on Linux when compared with the BSDs, especially with dd(1). On BSD you can press ^t to see the status of a command. All you need to do is issue this command to activate it:

% stty status '^t'

btw, for Linux dd(1) I know about "status=progress", but for me it is a bit hard to remember and specific to dd(1). But, nice little utility :)

[+] jagged-chisel|2 years ago|reply

    dd if=<input> | pv | dd of=<output>
to get a count of bytes passing through, or

    pv <input> | dd of=<output>
to get actual completion progress. For tarchives,

    pv <tarchive> | tar x
Compression progress

    pv <file> | bzip2 > <file>.bz2
[+] mat_epice|2 years ago|reply
You can get status on-the-fly from dd on Linux as well by sending the USR1 signal.

  [user@machine ~]$ dd if=/dev/zero of=/dev/null &
  [1] 3254428
  [user@machine ~]$ kill -USR1 %1
  19061394+0 records in
  19061393+0 records out
  9759433216 bytes (9.8 GB, 9.1 GiB) copied, 6.06968 s, 1.5 GB/s
  [user@machine ~]$ kill -USR1 %1
  25868762+0 records in
  25868762+0 records out
  13244806144 bytes (13 GB, 12 GiB) copied, 8.97352 s, 1.5 GB/s
  [user@machine ~]$ kill %1
  [1]+  Terminated             dd if=/dev/zero of=/dev/null
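
To avoid retyping the kill command, the same idea can be wrapped in a small polling loop. This is only a sketch: `poll_progress` is a made-up name, and it assumes the target handles the signal the way GNU dd handles USR1. A process that does not handle USR1 is terminated by it, so only point this at dd. The statistics appear on dd's own stderr, not in the shell running the loop.

```shell
#!/usr/bin/env bash
# poll_progress PID [INTERVAL] [SIGNAL]   (hypothetical helper name)
# Sends SIGNAL (default USR1, which GNU dd answers by printing its statistics)
# to PID every INTERVAL seconds (default 5) until the process is gone.
poll_progress() {
    local pid=$1 interval=${2:-5} sig=${3:-USR1}
    # kill -0 sends no signal; it only checks that the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        kill -s "$sig" "$pid" 2>/dev/null || break
        sleep "$interval"
    done
}
```

Usage would look like `dd if=/dev/zero of=/dev/null & poll_progress $! 5`. On the BSDs and macOS the third argument would be INFO instead of USR1.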
[+] j33zusjuice|2 years ago|reply
How often do you use dd that this really matters? Just curious. I’ve run dd maybe 20 times in the 5-6 years I’ve worked with Linux professionally.
[+] kazinator|2 years ago|reply
I'm astonished that the BSD projects are merging un-Unix-like fluff like this.

Meanwhile, the Linux kernel has removed Shift-PgUp scrollback from the console.

[+] Ballas|2 years ago|reply
Perhaps an alias in your .bashrc for dd would solve it? I usually just use ddrescue most of the time (primarily because I prefer its usage syntax, but it also reports status).

Similarly you could alias rsync instead of copy and move:

   alias pcp='rsync -au --info=progress2'
   alias pmv='rsync -aP --info=progress2 --remove-source-files'
[+] thedougd|2 years ago|reply
Ctrl-t works for me with dd on Linux.
[+] loeg|2 years ago|reply
Can't speak to the others, but on FreeBSD you don't need to activate ^t (SIGINFO); it just works that way out of the box.
[+] _joel|2 years ago|reply
kill -USR1 {dd pid} ;)
[+] emmelaich|2 years ago|reply
FWIW, dd will give a status if signalled. SIGUSR1 on Linux, SIGINFO on macOS.
[+] Nursie|2 years ago|reply
Linux dd also spits out status if you kill -USR1 it, which is useful when you forgot to status=progress it.
[+] tssva|2 years ago|reply
Don't try to remember adding status=progress. Add a shell alias for dd.
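
A minimal sketch of what that could look like in ~/.bashrc, written as a function rather than an alias so it also works in scripts that source the file. The name `ddp` is made up, and it assumes a dd that understands `status=progress` (GNU coreutils 8.24 or later).

```shell
# ddp: hypothetical wrapper name. Takes the same operands as dd and always
# asks for periodic transfer statistics on stderr.
ddp() {
    # "command" bypasses any alias or function that is itself named dd.
    command dd status=progress "$@"
}
```

For example: `ddp if=image.iso of=/dev/sdX bs=4M` (the file and device names here are placeholders).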
[+] sillystuff|2 years ago|reply
You can send a USR1 signal to dd and it will print its progress.
[+] lathiat|2 years ago|reply
I use the pv “Pipe Viewer” tool to do the same. You can either put it in the middle of a pipe, or point it at a running process with -d PID (--watchfd): http://www.ivarch.com/programs/pv.shtml

It works by reading /proc/PID/fdinfo/*

[+] mavhc|2 years ago|reply
I used pv's rate limit when doing a multi-terabyte zfs send, to slow it down and speed it back up as required.
[+] foreigner|2 years ago|reply
I use pv all the time and didn't know you could use it to watch an existing process. Thanks for the tip!
[+] zamubafoo|2 years ago|reply
That's interesting! I found myself forgetting to turn on progress flags on data transfer jobs, and the occasional data transform batch job, so often that I looked into something like this.

I found that `iotop` is great for this kind of thing. Sure, you have to either start it before your process starts or your accumulated total is off, but usually I'm not tracking progress for files less than 1GB so being off by kilobytes is fine.

My go-to's are `sudo iotop -aoP` for general monitoring, adding the `-p` flag if it's just a specific process, or `-u` if I'm monitoring something that is possibly transient.

[+] eps|2 years ago|reply
> It simply scans /proc for interesting commands, and then looks at directories fd and fdinfo to find opened files and seek positions, and reports status for the largest file.

Wasn't expecting something as simple as that at all. Bloody ingenious.
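
The idea is small enough to sketch in shell. This is not the tool's actual code, just an illustration of the /proc lookup the quote describes: the function names are made up, it is Linux-only, it assumes GNU `stat`, and it lists every open regular file instead of picking the largest one.

```shell
#!/usr/bin/env bash
# percent POS SIZE -> integer percentage (0 when SIZE is 0)
percent() {
    local pos=$1 size=$2
    if [ "$size" -gt 0 ]; then
        echo $(( pos * 100 / size ))
    else
        echo 0
    fi
}

# fd_progress PID -> one line per open regular file: "<pct>% <pos>/<size> <path>"
fd_progress() {
    local pid=$1 fd path pos size
    for fd in /proc/"$pid"/fd/*; do
        # Each entry in fd/ is a symlink to whatever the descriptor has open.
        path=$(readlink "$fd") || continue
        [ -f "$path" ] || continue      # skip pipes, sockets, devices, deleted files
        # fdinfo/<n> contains a line such as "pos:	12345" (the current offset).
        pos=$(awk '$1 == "pos:" { print $2 }' "/proc/$pid/fdinfo/${fd##*/}" 2>/dev/null) || continue
        [ -n "$pos" ] || continue
        size=$(stat -c %s "$path") || continue
        echo "$(percent "$pos" "$size")% $pos/$size $path"
    done
}
```

Running `fd_progress "$(pgrep -n cp)"` while a large copy is in flight would print one line for the source and one for the destination. Reading another user's /proc entries needs root.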

[+] pstoll|2 years ago|reply
Random aside - I know formatting discussions border on the religious (and why something like gofmt is the only correct answer & yet I am also good with spaces for Python) but..

Did anyone else look at the code and ask themselves- what is the actual formatting standard being used?

Looked like a mix of “open brace on same line, 4 chars indent for code” then “open brace on new line, code at same zero indent”

Not a big deal obviously. Just something that tripped up my eyes scanning the code.

[+] akritid|2 years ago|reply
From a quick look at one file, it seems fairly consistent. Perhaps most unorthodox but at the same time easiest to justify: function bodies skip one level of indentation. The curly brace of function body at column zero is actually a separate, very traditional style. It's just that both styles apply to function bodies; there is no other causal relationship. Finally, indentation is 4 spaces, and tabs are expanded. Of course a very few places may be mis-styled, as happens with hand-crafted code. While definitely sinful for not indenting with tabs as God intended, the style is not messy.
[+] antihero|2 years ago|reply
Yeah it just looks messy and harder to read when it’s inconsistent
[+] reaperducer|2 years ago|reply
> Did anyone else look at the code and ask themselves- what is the actual formatting standard being used?

Looks like a combination of Whitesmiths and ChatGPT.

[+] jtode|2 years ago|reply
I've been adding status=progress to my dd commands and getting progress reports for years now, not sure when it started working like that but any current Linux should have it.
[+] shric|2 years ago|reply
Newer versions of Linux dd support status=progress.

iirc, dd on *BSD will show progress on ^T because it sends a SIGUSR1

[+] mort96|2 years ago|reply
I actually believe it sends a SIGINFO, not a SIGUSR1. It's unfortunate that SIGINFO never made it to Linux, it's an incredibly useful feature.
[+] JdeBP|2 years ago|reply
It's SIGINFO, and this is a common thing on the BSDs.
[+] kazinator|2 years ago|reply
If you have a program that can spew lots of output about the progress it is making, you can redirect it to Pipe Watch:

https://www.kylheku.com/cgit/pw/about/

Pipe Watch continues to read from the pipe even when backgrounded.

Pipe Watch shows you snapshots of the text that is passing through it. You can set triggers and filter and such. The triggers work even when it's in the background, not refreshing the display.

[+] igtztorrero|2 years ago|reply
I use pv, Pipe Viewer, since forever:

pv largefile.sql | mysql -u root -psecret

[+] asn1parse|2 years ago|reply
i use rsync. even locally. it's one of the best tools ever written, my goto for diffing whole directories. have a nice day
[+] abotsis|2 years ago|reply
The method this uses is cute, hacky, and useful. Makes me want to write an osx background thing that uses the same scheme and pops up progress windows whenever I run a coreutils thing.
[+] e40|2 years ago|reply
The "how does it work" section doesn't make sense for macOS since there is no /proc there. How does it work on macOS?! I tested it and it works like a charm!
[+] rollcat|2 years ago|reply
Look at progress.c, start with "#ifdef __APPLE__" and keep looking from there. Something called libproc, there's some headers[1] I've found but I couldn't find any man pages unfortunately. You need some way to look at open FDs and every system will have such an API, even if it looks slightly (or very) different from the next one.

[1]: https://opensource.apple.com/source/xnu/xnu-7195.81.3/libsys...

[+] dale_glass|2 years ago|reply
Nice, but I wonder why the actual tools don't already include this.

I even recall cp having been patched with a progress bar maybe a decade back in Gentoo, but for some reason that didn't stick.

[+] simula67|2 years ago|reply
I think it is probably because of the UNIX philosophy of the virtue of silence: 'The program should say nothing if it has nothing interesting to say'.

It is an open question whether progress is 'interesting' or not. My own opinion is that it is not interesting if the operation is nearly instantaneous. If any operation can take more than 5 seconds, it should have a progress bar and an estimated time of completion.

Earlier versions of Windows did this very well. Often these progress bars were a joke, but sometimes they were useful. It gave us valuable input on whether there is enough time to go get a cup of coffee, which as we all know, is the most important question for us all.

[+] rurban|2 years ago|reply
I do maintain the mv cp progress patches, but they recently broke, when they added another feature.

coreutils refused the patches, saying they are feature complete, whilst they don't have progress support, nor Unicode support. Stubborn

https://github.com/rurban/coreutils/

[+] manuelabeledo|2 years ago|reply
I’m going to go out on a limb here, but I would say that coreutils' main users aren’t humans, but scripts. GUIs have had progress bars for ages as well.
[+] coxley|2 years ago|reply
"Make each program do one thing well"

The philosophy surrounding these tools wouldn't lend itself well to each implementing that. Fortunately `pv` exists, is perfect for this use-case, and is included in many distributions. :)

[+] w0m|2 years ago|reply
`alias cpProgress='rsync --progress -ravz'`

has been in my ~/.bashrc for the majority of my career for large file transfers.

[+] mmh0000|2 years ago|reply
Oh this is fun! It reminds me of a Bash script I wrote years ago that does something very similar[1].

[1] https://xn0.co/rp

[+] anonymousiam|2 years ago|reply
Great to see this here! I've never used it until now because instead I used a 20 year old shell script (later converted to Python by a friend) to do the same thing.

I gave it a try on a ddrescue job (unfortunately not recognized by default) and the estimated remaining time varies quite a bit between what ddrescue says and what progress says. I think ddrescue uses a larger moving average window and it seems to give more accurate estimates, although they are still far from perfect.

[+] pstoll|2 years ago|reply
Cute, clever. There may be edge cases it doesn’t get right (parallel downloads/copies) but pretty useful when you’ve launched a job and didn’t think ahead of time to ask for progress.

I’ve used a signal handler (SIGINT, aka Ctrl-C) before as a “show progress” mechanism, which I found very useful for monitoring long-running processes that were compute-bound rather than I/O-bound (launched in screen, fwiw, to stay the active process).
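
A rough sketch of that pattern in shell, for anyone curious. It is not my original script: `run_job` is a made-up name, and it uses SIGUSR1 rather than a keyboard signal so that it cannot be confused with a real interrupt.

```shell
#!/usr/bin/env bash
# run_job TOTAL CMD [ARGS...]   (hypothetical helper name)
# Runs CMD TOTAL times. While the loop is running, sending SIGUSR1 to the
# shell prints "progress: done/total" on stderr without stopping the work.
run_job() {
    local total=$1 completed=0
    shift
    # The trap body is evaluated when the signal is handled, so it sees the
    # current values of the function's local variables.
    trap 'printf "progress: %d/%d\n" "$completed" "$total" >&2' USR1
    while [ "$completed" -lt "$total" ]; do
        "$@"                          # one unit of work
        completed=$((completed + 1))
    done
    trap - USR1                       # restore default handling
}
```

From another terminal, `kill -USR1 <pid of the shell running run_job>` triggers the report. The shell only runs the trap once the current unit of work returns, so a single long-running command delays the output until it finishes.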