NB10 CPU thermal throttling – unanswered questions

Like the popular MacBook Air and many other laptops, the NB10 cannot run its CPUs and GPUs at maximum effort indefinitely without overheating. Unlike the MacBook Air, the NB10 is fanless.

The NB10 CPUs and GPUs will throttle as the system temperature rises, reducing both performance and heat output. The notebookcheck.net review notes that under a stress test, the dual-core N2810 variants will throttle down from 2GHz to 1.6GHz and the GPU down from 756MHz to 711MHz.

The same article notes that the CPU cores’ maximum temperature is 69°C.

Under Linux, this throttling appears to happen automatically, and without any effect on usability or perceived performance that I have been able to observe. When the CPU throttles and then unthrottles, Machine Check Exception (MCE) lines are generated in the kernel ring buffer (“dmesg”), and further details are logged that can be collected and examined with the “mcelog” tool.

The throttling events look like this in the syslog:

Mar 30 08:12:14 hostname kernel: [78131.019009] CPU0: Core temperature/speed normal
Mar 30 08:12:14 hostname kernel: [78131.019019] CPU1: Core temperature/speed normal
Mar 30 08:12:35 hostname kernel: [78151.388917] mce: [Hardware Error]: Machine check events logged
Mar 30 08:23:17 hostname mcelog: Processor 1 heated above trip temperature. Throttling enabled.
Mar 30 08:23:17 hostname mcelog: Please check your system cooling. Performance will be impacted
Mar 30 08:23:17 hostname mcelog: Processor 0 heated above trip temperature. Throttling enabled.
Mar 30 08:23:17 hostname mcelog: Please check your system cooling. Performance will be impacted
Mar 30 08:23:17 hostname mcelog: Processor 1 below trip temperature. Throttling disabled
Mar 30 08:23:17 hostname mcelog: Processor 0 below trip temperature. Throttling disabled
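Since these events are rare, it may help to pull them out of the syslog automatically. The following is a minimal sketch, not a tested tool: it assumes the exact line format shown above (a syslog timestamp, a hostname, then an "mcelog:" tag without a PID suffix), and the names EVENT_RE, SAMPLE and parse_throttle_events are my own.

```python
import re

# Matches the mcelog lines shown above, e.g.
# "Mar 30 08:23:17 hostname mcelog: Processor 1 heated above trip temperature. ..."
# Assumption: the tag is exactly "mcelog:"; some syslog setups add a PID in brackets.
EVENT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+mcelog:\s+"
    r"Processor (?P<cpu>\d+) (?P<what>heated above|below) trip temperature"
)

# Three lines copied from the syslog excerpt above.
SAMPLE = """\
Mar 30 08:23:17 hostname mcelog: Processor 1 heated above trip temperature. Throttling enabled.
Mar 30 08:23:17 hostname mcelog: Please check your system cooling. Performance will be impacted
Mar 30 08:23:17 hostname mcelog: Processor 0 below trip temperature. Throttling disabled
"""


def parse_throttle_events(lines):
    """Return a list of (timestamp, cpu, throttled) tuples, one per matching line."""
    events = []
    for line in lines:
        m = EVENT_RE.match(line)
        if m:
            throttled = m.group("what") == "heated above"
            events.append((m.group("stamp"), int(m.group("cpu")), throttled))
    return events


if __name__ == "__main__":
    for stamp, cpu, throttled in parse_throttle_events(SAMPLE.splitlines()):
        print(stamp, f"CPU{cpu}", "throttled" if throttled else "unthrottled")
```

In practice the lines would come from the syslog file rather than SAMPLE; where that file lives depends on the distribution.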

And in the mcelog output:

mcelog: failed to prefill DIMM database from DMI data
mcelog: Unsupported new Family 6 Model 37 CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 0
CPU 1 THERMAL EVENT TSC 4fe066a71219
TIME 1396163533 Sun Mar 30 08:12:13 2014
Processor 1 heated above trip temperature. Throttling enabled.
Please check your system cooling. Performance will be impacted
STATUS 8832000f MCGSTATUS 0
MCGCAP 806 APICID 2 SOCKETID 0
CPUID Vendor Intel Family 6 Model 55
mcelog: Unsupported new Family 6 Model 37 CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 1
CPU 0 THERMAL EVENT TSC 4fe066a7467b
TIME 1396163533 Sun Mar 30 08:12:13 2014
Processor 0 heated above trip temperature. Throttling enabled.
Please check your system cooling. Performance will be impacted
STATUS 8833000f MCGSTATUS 0
MCGCAP 806 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 55
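The records above are regular enough to parse. Here is a rough sketch that splits mcelog output into one dictionary per "MCE n" record and decodes the TIME field. It assumes the record layout is exactly as pasted above; the function name parse_mcelog and the dictionary keys are hypothetical, and the Python version needs to be 3.8 or later.

```python
import re
from datetime import datetime, timezone

# One record copied from the mcelog output above.
SAMPLE = """\
Hardware event. This is not a software error.
MCE 0
CPU 1 THERMAL EVENT TSC 4fe066a71219
TIME 1396163533 Sun Mar 30 08:12:13 2014
Processor 1 heated above trip temperature. Throttling enabled.
Please check your system cooling. Performance will be impacted
STATUS 8832000f MCGSTATUS 0
MCGCAP 806 APICID 2 SOCKETID 0
CPUID Vendor Intel Family 6 Model 55
"""


def parse_mcelog(text):
    """Split mcelog output into one dict per 'MCE n' record."""
    records = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if re.fullmatch(r"MCE \d+", line):
            current = {}
            records.append(current)
        elif current is None:
            continue  # preamble before the first record
        elif m := re.match(r"CPU (\d+) THERMAL EVENT", line):
            current["cpu"] = int(m.group(1))
        elif m := re.match(r"TIME (\d+)", line):
            # The number is seconds since the Unix epoch; keep it in UTC.
            current["time"] = datetime.fromtimestamp(int(m.group(1)), tz=timezone.utc)
        elif m := re.match(r"STATUS ([0-9a-f]+)", line):
            current["status"] = int(m.group(1), 16)
        elif line.startswith("Processor "):
            current["throttled"] = "Throttling enabled" in line
    return records


if __name__ == "__main__":
    for rec in parse_mcelog(SAMPLE):
        print(rec)
```

The epoch value 1396163533 decodes to 07:12:13 UTC on 30 March 2014, one hour behind the local time mcelog printed next to it, so the machine's clock was at UTC+1 when the event was logged.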

And the unthrottling:

mcelog: failed to prefill DIMM database from DMI data
mcelog: Unsupported new Family 6 Model 37 CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 0
CPU 1 THERMAL EVENT TSC 4fe066a71219
TIME 1396163533 Sun Mar 30 08:12:13 2014
Processor 1 heated above trip temperature. Throttling enabled.
Please check your system cooling. Performance will be impacted
STATUS 8832000f MCGSTATUS 0
MCGCAP 806 APICID 2 SOCKETID 0
CPUID Vendor Intel Family 6 Model 55
mcelog: Unsupported new Family 6 Model 37 CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 1
CPU 0 THERMAL EVENT TSC 4fe066a7467b
TIME 1396163533 Sun Mar 30 08:12:13 2014
Processor 0 heated above trip temperature. Throttling enabled.
Please check your system cooling. Performance will be impacted
STATUS 8833000f MCGSTATUS 0
MCGCAP 806 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 55

I’ve not been able to generate these errors reliably; I’ve seen only a handful. Typical load-testing tools such as “stress” do not consistently trigger them.
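One way to check whether a load test triggered any throttling, without waiting for log lines, might be the kernel's per-CPU throttle counters. The sketch below assumes the kernel exposes thermal_throttle/core_throttle_count under /sys/devices/system/cpu/cpuN/, as Intel x86 builds commonly do; I have not confirmed that these files are present on the NB10, and read_throttle_counts is a name of my own.

```python
from pathlib import Path


def read_throttle_counts(base="/sys/devices/system/cpu"):
    """Return {cpu_name: core_throttle_count} for every CPU that exposes the counter."""
    counts = {}
    pattern = "cpu[0-9]*/thermal_throttle/core_throttle_count"
    for counter in sorted(Path(base).glob(pattern)):
        try:
            # counter.parent is thermal_throttle/, its parent is cpuN/
            counts[counter.parent.parent.name] = int(counter.read_text().strip())
        except (OSError, ValueError):
            continue  # unreadable or malformed counter; skip it
    return counts


if __name__ == "__main__":
    # Prints an empty dict on systems without the counters.
    print(read_throttle_counts())
```

Reading the counters before and after a “stress” run and comparing the two dictionaries would show whether any throttle events occurred during that run.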

The “sensors” command output may not be giving accurate data; certainly the 100°C thresholds it reports do not appear to be accurate. After running ‘stress’ CPU hogs for 20 minutes, I see no MCE log events, the bottom of the case remains cool to the touch, and ‘sensors’ reports:

acpitz-virtual-0
Adapter: Virtual device
temp1:        +29.0°C  (crit = +98.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Core 0:       +39.0°C  (high = +100.0°C, crit = +100.0°C)
Core 1:       +39.0°C  (high = +100.0°C, crit = +100.0°C)

There seems to be little to no impact on interactive usage of the machine from this occasional throttling. What I’m curious about is:

  • What is the actual maximum temperature of the CPU?
  • Is “sensors” truly displaying the current temperature?
  • Is the throttling configurable in any way?