The officially official Devuan Forum!

#1 2018-12-05 20:05:34

Altoid
Member
Registered: 2017-05-07
Posts: 1,429  

Shutdown problem - e1000 driver bug?

Hello:

My Sun Microsystems Ultra24 rig has a problem which up to now I’ve chalked up to a crap BIOS.
It happened with the previous original version it came with and with this one, which is the latest one available.

I'm on the latest Devuan (ASCII), fully up to date:

groucho@devuan:~$ uname -a
Linux devuan 4.9.0-8-amd64 #1 SMP Debian 4.9.130-2 (2018-10-27) x86_64 GNU/Linux
groucho@devuan:~$ 
[root@devuan groucho]# apt-get update
Hit:1 http://deb.devuan.org/merged ascii InRelease
Get:2 http://deb.devuan.org/merged ascii-updates InRelease [25.6 kB]
Hit:3 http://deb.devuan.org/merged ascii-security InRelease
Fetched 25.6 kB in 5s (4944 B/s)
Reading package lists... Done
[root@devuan groucho]#
[root@devuan groucho]# apt-get upgrade
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Calculating upgrade... Done
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
[root@devuan groucho]# 
[root@devuan groucho]# 
[root@devuan groucho]# apt-get check
Reading package lists... Done
Building dependency tree       
Reading state information... Done
[root@devuan groucho]# 

Description:

On shutdown, the rig will do one of three things:

1. shut down properly
2. shut down properly and, after about 5 s, start to reboot and then freeze
3. freeze during the shutdown at this point ...

e1000e: EEE Tx LPI Timer
Preparing to enter sleep state S5
Reboot: Power Down 

… with the fans blowing at full speed.
 
This happens occasionally and I have not been able to reproduce it reliably; I have no idea what causes it.
It happened when I only had wireless access, i.e. before I had a wired connection, and it also happens now.

Unfortunately, there’s no way of disabling the on-board e1000e controller (you see, as this Sun MoBo has IME, that would be a no-no).

Could it be that the issue is caused, at least in part, by this bug?

https://sourceforge.net/p/e1000/mailman … e/34986431

Apparently that bug was fixed from kernel 3.16.49 onwards (I am on 4.9), but the behaviour I see is very similar.

See:
https://www.systutorials.com/linux-kern … ux-3-16-49

My power settings (Xfce Power Manager) are:

General
When power button is pressed: Shutdown
When sleep button is pressed:  Do nothing
When hibernate button is pressed: Do nothing

System
System sleep mode: Suspend
Put system to sleep when inactive for: Never

This also happened with other distros I have tried before settling here with Devuan.
The previous owner of the rig used a Gates OS, but I don’t think he would have mentioned this anyway.

I have not found a way to log what happens during shutdown, so I cannot get a better idea of what is going on.
I’d appreciate any suggestions you may have as this is a real PITA.

Thanks in advance.

A.

#2 2018-12-17 23:18:25

Altoid
Member
Registered: 2017-05-07
Posts: 1,429  

Re: Shutdown problem - e1000 driver bug?

Hello:

While waiting to see if anyone here had a clue about this, I was able to find a few leads.

It seems that the problem is related to the Intel e1000e driver, WoL and the EEE settings on the e1000e on-board NIC.

My rig's BIOS has no setting to disable WoL, or the on-board GbE for that matter. (!)

I am disabling WoL at boot with a line in /etc/rc.local:

#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.
#
# Disable WoL - 20181212
/sbin/ethtool -s eth0 wol d
exit 0

The setting survives a reboot and disconnecting/reconnecting the interface while logged in.
But it has not been enough, as the problem persists.
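
A minimal sketch for checking that the rc.local line took effect. It assumes the usual `ethtool <iface>` output, in which the current setting appears on a line starting with "Wake-on:". The helper name wol_state is made up for illustration and is not part of ethtool.

```shell
#!/bin/sh
# wol_state: hypothetical helper, not part of ethtool.
# Reads the output of `ethtool <iface>` on stdin and prints the current
# Wake-on value (e.g. "d" for disabled, "g" for magic packet).
# The "Supports Wake-on:" line is deliberately not matched.
wol_state() {
    awk -F': *' '/^[ \t]*Wake-on:/ { print $2 }'
}
```

Used as `/sbin/ethtool eth0 | wol_state`, it should print d once WoL is disabled (eth0 as in the rc.local line above).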

Seeing that this involved the e1000e: EEE Tx LPI Timer, I searched and found this:

https://en.wikipedia.org/wiki/Energy-Efficient_Ethernet

My guess was that if the EEE TX LPI timer was what was giving me trouble (?), turning it off would do away with the issue.

Searching around I found this on the web:

https://unix.stackexchange.com/question … n-ethernet

It seems that this EEE thing is also a source of grief to others.
So I tried seeing what the EEE setting was for my on-board NIC using ethtool:

[root@devuan groucho]# ethtool --show-eee eth0
Cannot get EEE settings: Operation not supported
[root@devuan groucho]#

Just to try and see, I attempted to turn EEE off:

[root@devuan groucho]# ethtool --set-eee eth0  eee off
Cannot get EEE settings: Operation not supported

From the last link it seems that the EEE settings can be modified with the Windows driver but not with the Linux driver, at least not using ethtool.

Which was really unexpected. 8 ^/
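
For anyone wanting to run the same check on several interfaces, here is a rough sketch that classifies the ethtool reply. The "Operation not supported" text is the message shown above. The "EEE status:" line is an assumption about what ethtool prints on drivers that do support EEE, and the helper name eee_status is made up for illustration.

```shell
#!/bin/sh
# eee_status: hypothetical helper, not part of ethtool.
# Reads the combined stdout/stderr of `ethtool --show-eee <iface>` on stdin.
# Prints "unsupported" when the driver rejects the query, the text after
# "EEE status:" when such a line is present (assumed output layout),
# and "unknown" for anything else.
eee_status() {
    out=$(cat)
    case $out in
        *"Operation not supported"*)
            echo "unsupported" ;;
        *"EEE status:"*)
            printf '%s\n' "$out" | sed -n 's/^[[:space:]]*EEE status:[[:space:]]*//p' ;;
        *)
            echo "unknown" ;;
    esac
}
```

Used as `ethtool --show-eee eth0 2>&1 | eee_status`. On this rig it would print unsupported, which matches the output above.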

The ethtool version installed in Devuan is 4.8 (20161004)

[root@devuan groucho]# ethtool -h | grep version
ethtool version 4.8
[root@devuan groucho]#

The latest available version is 4.19 (20181102)

The installed e1000e driver version is 3.2.6-k and it has a SmartPowerDownEnable parameter; whether it is enabled by default I cannot tell from this output. (?)

[root@devuan groucho]# modinfo e1000e
filename:       /lib/modules/4.9.0-8-amd64/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko
version:        3.2.6-k
license:        GPL
description:    Intel(R) PRO/1000 Network Driver
author:         Intel Corporation, <linux.nics@intel.com>
srcversion:     5EA033004005330EC3218BD
alias:          pci:v00008086d000015D6sv*sd*bc*sc*i*
---- snip ---
depends:        ptp
retpoline:      Y
intree:         Y
vermagic:       4.9.0-8-amd64 SMP mod_unload modversions 
parm:           debug:Debug level (0=none,...,16=all) (int)
parm:           copybreak:Maximum size of packet that is copied to a new buffer on receive (uint)
parm:           TxIntDelay:Transmit Interrupt Delay (array of int)
parm:           TxAbsIntDelay:Transmit Absolute Interrupt Delay (array of int)
parm:           RxIntDelay:Receive Interrupt Delay (array of int)
parm:           RxAbsIntDelay:Receive Absolute Interrupt Delay (array of int)
parm:           InterruptThrottleRate:Interrupt Throttling Rate (array of int)
parm:           IntMode:Interrupt Mode (array of int)
parm:           SmartPowerDownEnable:Enable PHY smart power down (array of int)
parm:           KumeranLockLoss:Enable Kumeran lock loss workaround (array of int)
parm:           WriteProtectNVM:Write-protect NVM [WARNING: disabling this can lead to corrupted NVM] (array of int)
parm:           CrcStripping:Enable CRC Stripping, disable if your BMC needs the CRC (array of int)
[root@devuan groucho]#

Any ideas as to how to get around this?
Turning off the EEE feature is the only thing I can think of now.

But how?
Maybe I need the latest version of ethtool and/or the latest version of the Intel driver, but they are not available in the Devuan repo.

EDIT:

This is the specific on-board hardware in my Ultra24 rig:

[root@devuan groucho]# lspci | grep -i ethernet
00:19.0 Ethernet controller: Intel Corporation 82566DM-2 Gigabit Network Connection (rev 02)
[root@devuan groucho]# 

This is the latest version of the Intel e1000e driver:

https://downloadcenter.intel.com/downlo … duct=34459

---
For Intel® 82566DM Gigabit Ethernet PHY
Intel® Network Adapter Driver for PCIe* Intel® Gigabit Ethernet Network Connections Under Linux*
Version: 3.4.2.1 (Latest) Date: 8/26/2018
---

Can an update to this driver be requested from the maintainers?
If not, how can the download from Intel be installed in Devuan ASCII?

Thanks in advance.

A.

Last edited by Altoid (2018-12-20 09:59:22)

#3 2018-12-25 18:18:45

fanderal
Member
Registered: 2017-01-14
Posts: 54  

Re: Shutdown problem - e1000 driver bug?

Altoid wrote:

it seems that the EEE settings can be modified with the Windows driver

Maybe Hiren's BootCD?

#4 2021-04-22 13:39:10

Altoid
Member
Registered: 2017-05-07
Posts: 1,429  

Re: Shutdown problem - e1000 driver bug?

Hello:

Altoid wrote:

My Sun Microsystems Ultra24 rig has a problem which up to now I’ve chalked up to a crap BIOS.
It happened with the previous original version it came with and with this one, which is the latest one available.

For an update on the status of this problem, see https://dev1galaxy.org/viewtopic.php?id=4274

tl;dr
Apparently, having EEE enabled on these NICs leaves the EEE TX LPI timer active at shutdown.
EEE works on the basis of auto-negotiation with the device it is connected to, and if that device does not support EEE, the timer ends up waiting for a signal it won't receive.
The result is an unresponsive system requiring a hard shutdown.
I have not been able to find out why this happens in a seemingly random manner, and I have found no reliable way to reproduce it.

ethtool (4.19) is not able to query or change the Intel 82566DM-2 Gigabit NIC's EEE settings because the e1000e driver does not support it.
See the rest in the thread linked above.

Best,

A.
