CPU Energy Meter: A tool for measuring energy consumption of Intel CPUs

170 points | todsacerdoti | 1 year ago | github.com

60 comments

[+] sirn|1 year ago|reply
Since kernel 3.13 or so, RAPL is also exposed through `/sys/devices/virtual/powercap/intel-rapl/*/energy_uj` in microjoules (if not, `modprobe intel_rapl`). So if you want to do a quick power measurement, it can be done using just POSIX sh (root required):

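    # prints package power in milliwatts (1000 = 1 W) because shell arithmetic doesn't do floating point
    RAPL=/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/energy_uj
    UJ=$(cat "$RAPL")   # prime the first reading so the first delta is meaningful
    while true; do
        sleep 1
        LAST_UJ=$UJ
        UJ=$(cat "$RAPL")
        # microjoules per second = microwatts; divide by 1000 for milliwatts
        echo $(((UJ - LAST_UJ) / 1000))
    done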
Despite the powercap name being intel-rapl, the powercap interface is also available on AMD machines.
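
One caveat with a bare loop like this is that `energy_uj` is a finite counter and eventually wraps back to zero, which shows up as one large negative reading. Below is a minimal sketch of a wrap-aware delta. It assumes the zone also exposes `max_energy_range_uj` (the powercap sysfs file giving the counter's range) and that at most one wrap happens between two reads. The name `energy_delta_uj` is invented for this example.

```shell
# energy_delta_uj PREV CUR MAX
# Prints the energy used between two counter readings, in microjoules.
# PREV and CUR are successive energy_uj values; MAX is max_energy_range_uj.
# Assumes the counter wrapped at most once between the two readings.
energy_delta_uj() {
    prev=$1
    cur=$2
    max=$3
    if [ "$cur" -ge "$prev" ]; then
        echo $((cur - prev))
    else
        # counter wrapped: distance to the top of the range plus the new value
        echo $((max - prev + cur))
    fi
}

# Hypothetical usage (needs read access to the sysfs files):
#   zone=/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0
#   max=$(cat "$zone/max_energy_range_uj")
#   a=$(cat "$zone/energy_uj"); sleep 1; b=$(cat "$zone/energy_uj")
#   echo "$(( $(energy_delta_uj "$a" "$b" "$max") / 1000 )) mW"
```

With a one-second interval the wrap is rare, but on long-running loggers it is the difference between one bogus sample and none.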

For more detailed readings covering several other CPU metrics, I think pcm[1] may be a better tool (it's a successor to the Intel Power Gadget that this project was forked from). However, it only works on Intel CPUs.

[1]: https://github.com/intel/pcm

[+] 149765|1 year ago|reply
There is also perf:

    perf stat -e 'power/energy-pkg/' -I 1000 --interval-count 3
    #           time             counts   unit events
         1.001064377              11.00 Joules power/energy-pkg/
         2.002605466              10.98 Joules power/energy-pkg/
         3.003726824              11.01 Joules power/energy-pkg/
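
Each row above is the energy counted during one interval, so joules divided by elapsed seconds gives watts (about 11 W here). As a rough sketch, the helper below averages such output. It assumes the four-column layout shown above, with the timestamp in column 1 and the count in column 2. `avg_watts` is a made-up name, not a perf feature.

```shell
# avg_watts: read "perf stat -I" interval lines on stdin and print the
# mean power in watts (total joules divided by the last timestamp).
avg_watts() {
    awk '
        $1 ~ /^#/ { next }          # skip the column header line
        $3 == "Joules" {
            joules += $2            # energy counted during this interval
            elapsed = $1            # timestamp of the latest interval, in seconds
        }
        END { if (elapsed > 0) printf "%.2f\n", joules / elapsed }
    '
}
```

perf stat normally writes its counts to stderr, so a pipeline would look something like `perf stat -e 'power/energy-pkg/' -I 1000 --interval-count 3 2>&1 | avg_watts`.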
[+] lathiat|1 year ago|reply
AMD have an equivalent in uProf: https://www.amd.com/en/developer/uprof.html

Power profiling is listed as supported on all CPUs though a bunch of features (including memory bandwidth, one that I had wanted) are limited to EPYC CPUs and don't exist in Ryzen or Threadripper.

[+] jeffbee|1 year ago|reply
Another easy tool that may already be on your system is "turbostat".
[+] Sweepi|1 year ago|reply
[+] aftbit|1 year ago|reply
Erm what? So silly!

>Long story short, since last year the AMD Energy sensor information has been limited to root due to the PLATYPUS security vulnerability. HWMON maintainer Guenter Roeck proposed slightly limiting and randomizing the sensor data so it couldn't be used for nefarious purposes but still accurate enough for genuine use-cases and no longer needing to be root-only access. However, AMD engineers didn't like that approach.

>With the hardware monitoring subsystem maintainer not wanting the information to be restricted to root-only and AMD not wanting the limiting/randomization approach, Guenter went ahead and removed the driver.

So... we're better off without having this system at all than we would be if it were limited to root OR if it were randomized? Sounds like silly kernel politicking to me. "You don't like my plan? Oh well, I guess I'll take the ball and go home, have fun losers!"

[+] sirn|1 year ago|reply
This only applies to hwmon, i.e. `sensors`. You can still read this through powercap/intel-rapl (even on AMD systems).
[+] speedgoose|1 year ago|reply
Are the energy consumption values reported by Intel CPUs accurate? Measuring energy consumption for cheap is hard, so I wonder whether they are big approximations or they have some magic tricks.
[+] ngneer|1 year ago|reply
Yes. Much earlier architectures (e.g., Sandy Bridge) used event counters as a rough approximation of energy consumption. These days, however, we use calibrated current sensors rather than approximations, and they are rather accurate; accurate enough to mount a side-channel attack, too. If software opts in for security, we also add a little bit of randomness to the readings, so that measurements are not data-dependent enough to break crypto (the PLATYPUS attack), but not so much that accuracy suffers for normal use cases.
[+] formerly_proven|1 year ago|reply
As far as I know RAPL is implemented entirely in the CPU and is an estimate of CPU power using a complex model of CPU state, temperature and such. I don't believe it's an actual power measurement like e.g. SVI telemetry is.
[+] ngneer|1 year ago|reply
This was true for earlier implementations, but newer ones actually measure power. There is an ADC in there, at least for Intel. Not sure about AMD's implementation.
[+] teleforce|1 year ago|reply
I really wish there were a similar tool for measuring the energy consumption of the transceiver power amplifier (PA) inside a wireless device, because its efficiency is abysmal: less than 50% in real-life scenarios due to impedance matching, skin effect, etc. That is not unlike the internal combustion engine (ICE), but at least the ICE does not have to deal with mismatch and high-frequency issues. In fact, the PA is increasingly the main culprit for wasted energy in connected devices, especially wireless ones, and it is normal for the PA to account for about 50% of the power consumption of the entire device. With IoT and machine-to-machine (M2M) communications, data transmissions are regular and frequent. Humans sleep at night, but machines mostly never do, which makes the PA's inefficiency even more of a problem than it is for human communications.
[+] gnufx|1 year ago|reply
Two systems I know from HPC that more usefully expose various architectures' RAPL etc. to userland via a daemon for application profiling are https://variorum.readthedocs.io/ and https://hpc.fau.de/research/tools/likwid/. Of course other sources of power consumption than CPU/uncore and GPU may be significant.

For whole-node power on typical racked systems, I'd expect to interrogate the power strips or similar supplies with SNMP or otherwise.

[+] reportgunner|1 year ago|reply
Why measure just the CPU and not the whole machine?
[+] dannyw|1 year ago|reply
Because PSUs (sadly) don't have a simple interface that transmits to your OS.
[+] gnufx|1 year ago|reply
You don't just want to measure CPU consumption, but whole-system power is only useful for application performance if only one application of significance runs on it. I'd expect to measure it anyway for system management purposes.
[+] robertheadley|1 year ago|reply
Wild, I just came across this while doing some research on power consumption. I got an AMD 5950X and an Nvidia 4080 Super and I was concerned about using too much power on my 750 Watt power supply. lol.

This was yesterday. Wild.

[+] steve1977|1 year ago|reply
A tool for monitoring Intel RAPL data would probably be a bit more accurate, as this tool is not really measuring anything.
[+] jhrmnn|1 year ago|reply
In general, how does CPU utilization correlate with CPU power draw?
[+] sandworm101|1 year ago|reply
More utilization = more power draw. Generally.
[+] imvetri|1 year ago|reply
Wouldn't the meter also consume excess energy?
[+] jeffbee|1 year ago|reply
It's just a coulomb counter you can read from an MSR. But yes, monitoring it inevitably consumes some amount of energy. It won't cost anything on a busy system, but waking up an idle system to read it will be more noticeable. This is why I no longer use background metrics monitors like atop or netdata. An Intel client CPU can idle below 100 mW if you leave it be, but something like netdata will raise that to 5 W or worse.
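
For anyone reading the counter straight from the MSR, here is a rough sketch of the unit conversion. It assumes Intel's documented layout: bits 12:8 of MSR_RAPL_POWER_UNIT (0x606) hold the energy status unit (ESU), and each count of MSR_PKG_ENERGY_STATUS (0x611) is worth 1/2^ESU joules. The function names are invented for illustration, and other vendors use different registers.

```shell
# rapl_esu UNIT_REG
# Extracts the energy status unit exponent (bits 12:8) from the raw value
# of the RAPL power-unit register.
rapl_esu() {
    echo $(( ($1 >> 8) & 31 ))
}

# rapl_raw_to_uj RAW ESU
# Converts a raw energy counter value to microjoules, where one count
# equals 1/2^ESU joules. Integer arithmetic, so the result is truncated.
rapl_raw_to_uj() {
    echo $(( ($1 * 1000000) >> $2 ))
}
```

The raw register values would have to come from a tool such as rdmsr (msr-tools) run as root. With an ESU of 14, one count is roughly 61 microjoules.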
[+] chickenchase-rd|1 year ago|reply
Someone should make a monitor for the monitor
[+] aljgz|1 year ago|reply

[deleted]

[+] silotis|1 year ago|reply
> does not switch to 200MHz for a minute during video calls

I had a Dell work laptop that did the same thing. As far as I was able to tell, the system had a bug/fault that continuously asserted the CPU's BD PROCHOT line when the integrated webcam was active. I don't think it was an Intel bug; the CPU was just responding to the external signal that (falsely) indicated the system was overheating.

[+] Almondsetat|1 year ago|reply
What does this have to do with the content of the article?
[+] nottorp|1 year ago|reply
What's the point, they already consume more than entry level space heaters...
[+] navjack27|1 year ago|reply
Yes, computers consistently sip 1200 W from the wall. That's a normal thing. Said no one ever.