The officially official Devuan Forum!

#76 2021-05-09 15:01:19

geki
Member
Registered: 2019-02-04
Posts: 103  

Re: Linux e1000e module removal and e1000e EEE timer - Part II

That does not sound like an NIC (driver) issue. At least, if boot already goes noisy.

#77 2021-05-09 16:21:05

Altoid
Member
Registered: 2017-05-07
Posts: 1,415  

Hello:

geki wrote:

... does not sound like an NIC (driver) issue.
... if boot already goes noisy.

I agree.

I have always wondered if the bad_boot and the bad_shutdown problems were independent of each other or if they shared more than the fans at full blast symptom.

Out of fear of some part of the filesystem getting borked, I had never allowed the boot sequence to go on, aborting it and getting a clean boot afterwards.
It will reboot after a time-out unless you explicitly allow the boot sequence to continue.

This time, not aborting the sequence revealed a bad_shutdown right after a bad_boot.
Then, going back to the Sun Microsystems *.pdf on the matter and seeing the diagnostic put forth (S0, S3, etc.) again made my doubts resurface.

ie:

"If you power on the workstation before the system enters the S3 ... "

Q: why would the system be entering S3, or any state other than S5, in the first place?

This probably has to do with my not being able to disable ME "Firmware Power Control" and "Host Sleep States" in BIOS.
It is true that I pressed the power button immediately after power off and blank screen.
But I have not been able to reproduce it.

Like I mentioned, I have edited the shutdown script to set WoL to disabled as before, just without removing the e1000e module.

With respect to system states, dmesg states:

groucho@devuan:~$ sudo dmesg | grep S0
[    0.729378] ACPI: (supports S0 S1 S3 S4 S5)
groucho@devuan:~$ 

I'm only interested in S0 and S5 but ...   

groucho@devuan:/sys/power$ cat /sys/power/state
freeze standby mem disk
groucho@devuan:/sys/power$ 

I have seen how, if needed, freeze, standby, mem and disk can be added to /sys/power/state to enable power states S0, S1, S3 and S4 respectively.
eg: # echo mem > /sys/power/state
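
Note: for what it's worth, the kernel's power-management documentation describes /sys/power/state the other way round. Reading the file lists the sleep states the kernel supports, and writing one of those strings starts a transition into that state rather than adding it to the list. A minimal read-only sketch, assuming the freeze/standby/mem/disk to S0/S1/S3/S4 correspondence given above; describe_sleep_states is a made-up name, not an existing tool:

```shell
#!/bin/sh
# describe_sleep_states: hypothetical helper, not part of any package.
# Takes the content of /sys/power/state as one argument and prints one
# "keyword -> state" line per entry, using the correspondence from the post.
describe_sleep_states() {
    for state in $1; do
        case "$state" in
            freeze)  echo "freeze -> S0 (suspend-to-idle)" ;;
            standby) echo "standby -> S1" ;;
            mem)     echo "mem -> S3" ;;
            disk)    echo "disk -> S4" ;;
            *)       echo "$state -> unknown" ;;
        esac
    done
}

# Read-only use: nothing is written to /sys/power/state here.
if [ -r /sys/power/state ]; then
    describe_sleep_states "$(cat /sys/power/state)"
fi
```

Because the file is only ever read here, running this cannot put the box to sleep.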

And have found how to disable power states in systemd distributions. 
eg: sudo systemctl mask sleep.target suspend.target hibernate.target hybrid-sleep.target

But I have not found out how to do that in Devuan.
I am guessing that not having those values (specifically S3) could help finding out what is going on.

How to do the opposite of # echo mem > /sys/power/state ?

Thanks in advance.

A.

#78 2021-05-09 21:46:21

geki
Member
Registered: 2019-02-04
Posts: 103  

Well, the /sys/power stuff belongs to Kernel CONFIG_PM I guess. Feel free to disable that. :) You may also try shutdown -h -P to halt and power off.

Just to verify:

# cat /etc/default/halt
# Default behaviour of shutdown -h / halt. Set to "halt" or "poweroff".
HALT=poweroff

Last edited by geki (2021-05-09 22:03:30)

#79 2021-05-09 22:03:08

geki
Member
Registered: 2019-02-04
Posts: 103  

Oh, and I remembered just now, if you got to the frozen "reboot: Power down", press Alt + SysRq (Print Screen key) + o for shutdown
See: http://blog.kember.net/articles/reisub- … x-restart/
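
Note: whether Alt + SysRq + o is honoured depends on the kernel.sysrq setting. A minimal sketch, assuming the bitmask described in the kernel's sysrq documentation (1 enables everything, bit 128 covers reboot/poweroff); sysrq_allows_poweroff is a made-up name:

```shell
#!/bin/sh
# sysrq_allows_poweroff: hypothetical helper for illustration.
# Argument: the number found in /proc/sys/kernel/sysrq.
# Returns 0 (true) when the value permits the reboot/poweroff keys.
sysrq_allows_poweroff() {
    value=$1
    # 1 means "all functions enabled"
    [ "$value" -eq 1 ] && return 0
    # otherwise the value is a bitmask; 128 is the reboot/poweroff bit
    [ $(( $value & 128 )) -ne 0 ]
}

if [ -r /proc/sys/kernel/sysrq ]; then
    if sysrq_allows_poweroff "$(cat /proc/sys/kernel/sysrq)"; then
        echo "SysRq poweroff key is permitted"
    else
        echo "SysRq poweroff key is masked out"
    fi
fi
```

If the key turns out to be masked, that would be one possible reason for the combination doing nothing; a keyboard that is no longer serviced at that stage would be another.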

Last edited by geki (2021-05-09 22:19:02)

#80 2021-05-09 22:18:11

geki
Member
Registered: 2019-02-04
Posts: 103  

Just for the fun of it, check the kernel command-line parameter[0] pcie_port_pm=off. Maybe that helps to disable sleep states for your PCI Express slots and NIC. There is also the parameter apm, I wonder. Maybe there is some suspend software installed in /etc/pm, /etc/apm or /etc/acpi; also check for such in /etc/default. :D

[0] https://www.kernel.org/doc/html/v4.19/a … eters.html
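
Note: after adding such a parameter it is worth confirming that the running kernel really received it. A minimal sketch that looks for an exact word on the kernel command line; has_kernel_param is a made-up name:

```shell
#!/bin/sh
# has_kernel_param: hypothetical helper for illustration.
# $1 = a kernel command line (e.g. the content of /proc/cmdline)
# $2 = the exact parameter to look for (e.g. pcie_port_pm=off)
has_kernel_param() {
    for word in $1; do
        [ "$word" = "$2" ] && return 0
    done
    return 1
}

if [ -r /proc/cmdline ]; then
    if has_kernel_param "$(cat /proc/cmdline)" "pcie_port_pm=off"; then
        echo "pcie_port_pm=off is active"
    else
        echo "pcie_port_pm=off is not on the command line"
    fi
fi
```

On a GRUB setup the parameter would normally go into GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by update-grub and a reboot; that is the usual Debian-style procedure and is assumed, not confirmed, for this box.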

#81 2021-05-09 22:21:02

Altoid
Member
Registered: 2017-05-07
Posts: 1,415  

Hello:

geki wrote:

... the /sys/power stuff belongs to Kernel CONFIG_PM I guess.
Feel free to disable that.

Hmm ...
Sure.

Q: how do I go about that.
ie: the opposite of # echo mem > /sys/power/state, effectively removing mem?
Cannot find anything specific about that for non-systemd distributions.

My idea is that if I remove anything S3 related from the system, it may (?) keep whatever system state is set in BIOS from activating.

geki wrote:

... also try shutdown -h -P to halt and power off.

Just as you point out:

groucho@devuan:~$ cat /etc/default/halt
# Default behaviour of shutdown -h / halt. Set to "halt" or "poweroff".
HALT=poweroff
groucho@devuan:~$ 
geki wrote:

... if you got to the frozen "reboot: Power down", press Alt + SysRq (Print Screen key) + o for shutdown

Hmm ...
I'm not sure whether I tried it; if I did, it was without the expected result.
I'll remember for next time but I think (?) the kb was totally unresponsive. 

geki wrote:

Thanks for the heads up.

Best,

A.

#82 2021-05-10 14:39:34

Altoid
Member
Registered: 2017-05-07
Posts: 1,415  

Hello:

geki wrote:

... check kernel commandline parameter pcie_port_pm=off.
... disabling sleep states for your pci express slots and NIC.

I think I used something pci=ish without results.
Have to see my notes.

geki wrote:

... also the parameter apm ...
... some suspend software installed in /etc/pm, /etc/apm or /etc/acpi and check for such in /etc/default.

I don't have pm-utils installed, removed it some time ago.

groucho@devuan:~$ apt list | grep -i installed | grep -i pm-utils
--- snip ---
groucho@devuan:~$

Notwithstanding, I do have these:

groucho@devuan:~$ ls -R /etc/apm/
/etc/apm/:
event.d

/etc/apm/event.d:
20hdparm
groucho@devuan:~$ 

This for spinning down HDDs if not on AC.
Always on AC but I presume it works as intended.

groucho@devuan:~$ ls -R /etc/acpi/
/etc/acpi/:
events  powerbtn-acpi-support.sh

/etc/acpi/events:
powerbtn-acpi-support
groucho@devuan:~$ 

This to initiate shutdown when the power button is pressed.
I disabled this in /etc/default/acpid because I inadvertently touched the recessed power button more than a few times.  8^/
Also because I'd rather shutdown via terminal or script.

groucho@devuan:~$ ls -R /etc/default/
/etc/default/:
acpid         cacerts        dbus    grub.d         hwclock          locale~               ntpdate        rsyslog          su              useradd
anacron       console-setup  devpts  grub.ucf-dist  intel-microcode  networking            rcS            saned            su~             wicd
autofs        cpufrequtils   exim4   halt           keyboard         networking.dpkg-dist  rcS.dpkg-dist  saned.dpkg-dist  sysstat
avahi-daemon  crda           gdomap  haveged        keyboard~        nfs-common            rkhunter       saned~           timeshift.json
bsdmainutils  cron           grub    hddtemp        locale           nss                   rsync          smartmontools    tmpfs

/etc/default/grub.d:
init-select.cfg
groucho@devuan:~$ cat /etc/default/acpid
# Options to pass to acpid
#
# OPTIONS are appended to the acpid command-line
# enabled 20181108 to log events to syslog 
OPTIONS="-l"

# Linux kernel modules to load before starting acpid
#
# MODULES is a space separated list of modules to load, or "all" to load all
# acpi drivers, or commented out to load no module
#MODULES="battery ac processor button fan thermal video"
#MODULES="all"
groucho@devuan:~$ 

I added OPTIONS="-l" back in 2018 to see if I could get anything written to a(ny) log.

Thanks for your input.

Best,

A.

Last edited by Altoid (2021-05-10 14:41:14)

#83 2021-05-10 19:02:16

geki
Member
Registered: 2019-02-04
Posts: 103  

For disabling CONFIG_PM, you have to build your own kernel. :D There does not seem to be any pm tooling installed. :rolleyes: Otherwise, I can only remind you not to use a 4.x kernel, nor any kernel version < 5.5. Unless we see some NIC hang, I am mostly out of thoughts. :D

A last idea is to check /etc/init.d/halt for hddown= and netdown=, which you can disable through the respective configuration settings in /etc/default/halt. On my box, netdown and hddown are set. You may set those configuration parameters so that they are disabled. I put

        read -p "Press enter to halt ($netdown $poweroff $hddown)" reply

before

        halt -d -f $netdown $poweroff $hddown

to see what is set.

#84 2021-05-10 20:16:57

Altoid
Member
Registered: 2017-05-07
Posts: 1,415  

Hello:

Note:
Somehow this whole post got lost.
I recovered it but the posting order may have been altered.
Sorry ... 

geki wrote:

    For disabling CONFIG_PM, you have to build your own kernel.

Hmm ...
Not on my list.

geki wrote:

    ... does not seem to be any pm tooling installed.

How is it that it is done in systemd distributions?

There's no pre-systemd / Linux Devuan equivalent to sudo systemctl mask sleep.target suspend.target hibernate.target hybrid-sleep.target?

I was reading here: https://answers.launchpad.net/acpi-supp … tion/36260
Since my system had no /etc/default/acpi-support file, I added one:

groucho@devuan:~$ cat /etc/default/acpi-support
# Comment the next line to disable ACPI suspend to RAM
ACPI_SLEEP=false

# Comment the next line to disable suspend to disk
ACPI_HIBERNATE=false
groucho@devuan:~$ 

Don't know if it does anything at all.

geki wrote:

    ... remind you not to use a 4.x kernel and neither kernel version < 5.5.

I do have your suggestion on my desk.
Not forgotten, just postponed.  8^)

geki wrote:

    Unless we see some NIC hang ...
    ... mostly out of thoughts.

Well ...

Your efforts did unearth a lot of e1000e fun which could be put to good use by the module's maintainers.
If nothing else, the quality of the code will have been significantly improved by your work.
No one can thumb their nose at that.

Now, what could happen?

1. another bad shutdown.
It's been ~10 days since I started using the last version of the e1000e module ie: three patches + a std. shutdown.
The third patch was then edited further to accommodate various debug scenarios, but I understand that it was basically the same as v2.
Up to that point I had been without a bad_shutdown for at least a week.

I think that the bad_shutdown I had is linked to the bad_boot.
And the bad boot may (?) have been linked to not setting WoL to disabled on shutdown.
Like I mentioned previously, my shutdown script now sets WoL to disabled on shutdown but does not remove the e1000e module.

But there's that S3 that is bothering me and that I'd like to know how to get rid of.

2. another bad_boot followed by a bad_shutdown.
My money is on getting rid of S3 to prevent that.
Maybe setting WoL to disabled serves the same purpose?

3. none of the above.
If in 30/45 days' time we are still at 3. then it could mean that something you have edited/changed with your patches to the e1000e module has had an effect. 8^D!

But for now, we have to wait and see.

geki wrote:

    ... check /etc/init.d/halt for hddown= and netdown=, which you can disable by respective configuration settings from /etc/default/halt.

Let's see:

groucho@devuan:~$ cat /etc/default/halt
# Default behaviour of shutdown -h / halt. Set to "halt" or "poweroff".
HALT=poweroff
groucho@devuan:~$ 
groucho@devuan:~$ cat /etc/init.d/halt
#! /bin/sh
### BEGIN INIT INFO
# Provides:          halt
# Required-Start:
# Required-Stop:
# Default-Start:
# Default-Stop:      0
# Short-Description: Execute the halt command.
# Description:
### END INIT INFO

NETDOWN=yes

PATH=/sbin:/usr/sbin:/bin:/usr/bin
[ -f /etc/default/halt ] && . /etc/default/halt

. /lib/lsb/init-functions

do_stop () {
	if [ "$INIT_HALT" = "" ]
	then
		case "$HALT" in
		  [Pp]*)
			INIT_HALT=POWEROFF
			;;
		  [Hh]*)
			INIT_HALT=HALT
			;;
		  *)
			INIT_HALT=POWEROFF
			;;
		esac
	fi

	# See if we need to cut the power.
	if [ "$INIT_HALT" = "POWEROFF" ] && [ -x /etc/init.d/ups-monitor ]
	then
		/etc/init.d/ups-monitor poweroff
	fi

	# Don't shut down drives if we're using RAID.
	hddown="-h"
	if grep -qs '^md.*active' /proc/mdstat
	then
		hddown=""
	fi

	# If INIT_HALT=HALT don't poweroff.
	poweroff="-p"
	if [ "$INIT_HALT" = "HALT" ]
	then
		poweroff=""
	fi

	# Make it possible to not shut down network interfaces,   <-------- | x |               
	# needed to use wake-on-lan                               <-------- | x |                        
	netdown="-i"
	if [ "$NETDOWN" = "no" ]; then
		netdown=""
	fi

	log_action_msg "Will now halt"
	halt -d -f $netdown $poweroff $hddown
}

case "$1" in
  start|status)
	# No-op
	;;
  restart|reload|force-reload)
	echo "Error: argument '$1' not supported" >&2
	exit 3
	;;
  stop)
	do_stop
	;;
  *)
	echo "Usage: $0 start|stop" >&2
	exit 3
	;;
esac

:
groucho@devuan:~$ 
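
Note: to preview which flags the script above would hand to halt, without actually shutting down, the flag logic of do_stop can be copied into a small function. This is only a sketch: halt_flags is a made-up name, and the RAID check is reduced to a yes/no argument instead of reading /proc/mdstat.

```shell
#!/bin/sh
# halt_flags: hypothetical dry-run copy of the flag logic in do_stop.
# $1 = NETDOWN value, $2 = INIT_HALT value, $3 = "yes" if an md RAID is active
halt_flags() {
    netdown="-i"
    [ "$1" = "no" ] && netdown=""
    poweroff="-p"
    [ "$2" = "HALT" ] && poweroff=""
    hddown="-h"
    [ "$3" = "yes" ] && hddown=""
    # Re-use the positional parameters so that empty flags simply vanish.
    set -- $netdown $poweroff $hddown
    printf '%s\n' "$*"
}

halt_flags yes POWEROFF no   # stock settings: -i -p -h
halt_flags no POWEROFF no    # NETDOWN=no in /etc/default/halt: -p -h
```

With the /etc/default/halt shown above (HALT=poweroff, no NETDOWN line) the script's own default NETDOWN=yes applies, so the first call matches the current state of the box.
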
geki wrote:

    On my box, netdown and hddown are set.
    You may set those configuration parameters so that they are disabled.

geki wrote:

    I put

            read -p "Press enter to halt ($netdown $poweroff $hddown)" reply

    before

            halt -d -f $netdown $poweroff $hddown

    to see what is set.

Right.
I'll edit that into /etc/init.d/halt and see how it behaves.
If I understand correctly, it will shut down on Enter.

Thanks for your input.

Best,

A.

Last edited by Altoid (2021-05-10 21:22:29)

#85 2021-05-10 20:58:58

geki
Member
Registered: 2019-02-04
Posts: 103  

Well, you want to put NETDOWN=no into /etc/default/halt to disable WoL handling of /sbin/halt. Quite a strange parameter name for what it does in the end. :D

Though, you still want to disable WoL via ethtool. The more disabling, the better. :cool:

And yes, we want to disable WoL, too. We can get a patch to forcefully disable WoL by module parameter. I already spotted the flag in the code which activates WoL. We can, hypothetically for now, reset that in src/param.c, e heh. :D But even with that hypothetical patch, we want to disable WoL from /sbin/halt. And ethtool will be superfluous, then.

Last edited by geki (2021-05-10 21:08:02)

#86 2021-05-10 21:12:45

geki
Member
Registered: 2019-02-04
Posts: 103  

Since you got no pm tooling, I wonder who enters S3. The only one visible for now is /sbin/halt with its unset NETDOWN. If not that, it must be the kernel on its own, then?! :rolleyes:

Anyone here know about sleep state handling? :)

#87 2021-05-10 21:17:17

geki
Member
Registered: 2019-02-04
Posts: 103  

Aha, and what does the dmesg output show for the freeze?

#88 2021-05-10 21:23:33

Altoid
Member
Registered: 2017-05-07
Posts: 1,415  

Hello:

Altoid wrote:

I'll edit that into /etc/init.d/halt and see how it behaves.
If I understand correctly, it will shut down on Enter.

Never got that far ...

Here's the part that I edited:

        # Make it possible to not shut down network interfaces,
        # needed to use wake-on-lan
        netdown="-i"
        if [ "$NETDOWN" = "no" ]; then
                netdown=""
        fi

        log_action_msg "Will now halt"
        read -p "Press enter to halt ($netdown $poweroff $hddown)" reply     <--- just this line
        halt -d -f $netdown $poweroff $hddown
}

On shutdown, the box froze at this point:

nofan-frz.png

Good sign is that there were no fans blowing.
Only way out was a hard shutdown.
REISUB did not work or maybe it was that I could not type and hold Alt+SysRq at the same time?

This happens whether I shut down with my script or with sudo shutdown -h now from a terminal.
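
Note: if the prompt itself is what the box hangs on, a variant that only prints the flags and pauses for a fixed time would avoid waiting for keyboard input that late in the shutdown. That is an untested guess at the cause, not a diagnosis; show_halt_flags and the delay value are assumptions made up for this sketch.

```shell
#!/bin/sh
# show_halt_flags: hypothetical replacement for the added "read -p" line.
# Prints the flags, then waits a fixed number of seconds instead of
# blocking until Enter is pressed.
# $1 = seconds to pause, remaining arguments = flags about to go to halt
show_halt_flags() {
    pause=$1
    shift
    printf 'About to run: halt -d -f %s\n' "$*"
    sleep "$pause"
}

# Inside do_stop this would sit just before the halt call, e.g.:
#   show_halt_flags 10 $netdown $poweroff $hddown
show_halt_flags 0 -p -h
```

The flags stay on screen for the length of the pause and the shutdown then carries on by itself, so a dead keyboard can no longer stall it.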

Thanks in advance.

A.

#89 2021-05-10 21:36:05

Altoid
Member
Registered: 2017-05-07
Posts: 1,415  

Hello:
There's some alteration in the posting order ...

geki wrote:

... you want to put NETDOWN=no into /etc/default/halt to disable WoL handling of /sbin/halt.

Right.

groucho@devuan:~$ cat /etc/default/halt
# Default behaviour of shutdown -h / halt. Set to "halt" or "poweroff".
HALT=poweroff
NETDOWN=no
groucho@devuan:~$ 
geki wrote:

Quite strange parameter to what it is doing in the end.

I'll take your word for it.
Don't have a clue as to what is strange here ...

geki wrote:

Though, you still want to disable WoL via ethtool. The more disabling, the better.

That is supposed to be the tool to use.
When the HW or the driver does not undo it behind your back.

geki wrote:

We can get a patch to forcefully disable WoL by module parameter. I already spotted the flag in code, which activates WoL.
We can, hypothetically for now, reset that in src/param.c ...

Thanks for sharing.
But it's you doing the patching.  8^)

geki wrote:

... even with that hypothetical patch, We want to disable WoL from /sbin/halt.

Right.
Would be like disabling it in the BIOS, which for whatever reason cannot be done without screwing up everything.

I discovered an old Sun Microsystems thread.
Had to find a cached copy.

Seems that this was a hard cookie from the very start of the Sun Ultra 24's market debut:
https://webcache.googleusercontent.com/ … clnk&gl=us

Best,

A.

#90 2021-05-10 21:45:03

Altoid
Member
Registered: 2017-05-07
Posts: 1,415  

Hello:

geki wrote:

... you got no pm tooling, I wonder who enters S3.

That's what I think is the question here.
I think the answer is in the U24's BIOS.

Remember that I have not been able to disable ME "Firmware Power Control" and "Host Sleep States" in BIOS.
It is set to S3, I can set it back to S0 but I cannot disable it.

geki wrote:

The only one visible for now is /sbin/halt with its unset NETDOWN.

geki wrote:

If not that, it must be the kernel on its own, then?!

Hmm ...
Why would that be?

geki wrote:

Anyone here knows about sleep state handling?

No one else has pitched in ...
But you have gathered quite a following.

Thanks for your input.

A.

Last edited by Altoid (2021-05-10 21:55:27)

#91 2021-05-10 21:54:42

Altoid
Member
Registered: 2017-05-07
Posts: 1,415  

Hello:

geki wrote:

... and the dmesg output shows what for the freeze?

Screen was frozen, needed a hard shutdown.
dmesg0 does not show anything strange.

I could reproduce the whole thing but how to get what dmesg recorded, if anything at all?
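
Note: the dmesg ring buffer is lost at power-off, so one place to look after a hard shutdown is whatever the syslog daemon managed to write to disk beforehand. A minimal sketch, assuming rsyslog writes kernel messages to /var/log/kern.log (the usual Debian-style default, not confirmed for this box); shutdown_clues is a made-up name:

```shell
#!/bin/sh
# shutdown_clues: hypothetical filter for illustration.
# Reads log lines on stdin and keeps those that mention the NIC driver,
# ACPI or power management.
shutdown_clues() {
    grep -i -E 'e1000e|acpi|PM:'
}

# Assumed log location; adjust if the syslog daemon is configured differently.
LOG=/var/log/kern.log
if [ -r "$LOG" ]; then
    shutdown_clues < "$LOG" | tail -n 40
fi
```

Messages printed very late in the shutdown may never reach the file, because the syslog daemon has usually been stopped by then.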

Thanks in advance,

A.

#92 2021-05-11 07:27:42

geki
Member
Registered: 2019-02-04
Posts: 103  

You got that PCI REMOVE prints once out of dmesg? A normal shutdown would not enter S3? Though, I am out with that. :-D

I patch WOL and PM FREEZE detach thingie and that's it from me.  :-)

#93 2021-05-11 11:21:10

Altoid
Member
Registered: 2017-05-07
Posts: 1,415  

Hello:

geki wrote:

You got that PCI REMOVE prints once out of dmesg?

Yes.
I posted them at your request.

But I got them only after a rmmod e1000e and modprobe -v e1000e cycle:

groucho@devuan:~$ sudo dmesg
--- snip ---
[    1.805846] e1000e: loading out-of-tree module taints kernel.    
[    1.808975] ACPI: Power Button [PWRB]
[    1.851745] e1000e: module verification failed: signature and/or required key missing - tainting kernel
[    1.851839] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input3
[    1.873915] ACPI: Power Button [PWRF]
[    1.894607] SCSI subsystem initialized
[    1.906926] ACPI: bus type USB registered
[    1.917645] usbcore: registered new interface driver usbfs
[    1.934103] Fusion MPT base driver 3.04.20
[    1.944731] Copyright (c) 1999-2008 LSI Corporation
[    1.945571] usbcore: registered new interface driver hub
[    1.966487] usbcore: registered new device driver usb
[    1.979636] e1000e: Intel(R) PRO/1000 Network Driver - 3.8.7-NAPI
[    1.990206] e1000e: Copyright(c) 1999 - 2020 Intel Corporation.
[    2.002032] e1000e 0000:00:19.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
[    2.013157] i801_smbus 0000:00:1f.3: SMBus using PCI interrupt
[    2.021243] e1000e 0000:00:19.0: PHY Smart Power Down Disabled
[    2.039474] e1000e 0000:00:19.0: EEE Support was initialized to be enabled
[    2.050107] e1000e 0000:00:19.0: EEE Support has been reset to be disabled
--- snip ----
[  431.874537] e1000e: PCI REMOVE PTP     <-- here is rmmod e1000e
[  431.874542] e1000e: PCI REMOVE TIMER
[  431.874546] e1000e: PCI REMOVE CANCEL WORK SYNC
[  431.874547] e1000e: PCI REMOVE HW TIMESTAMP
[  431.874572] e1000e: NETDEV CLOSE ENTERED
[  431.874573] e1000e: NETDEV CLOSE WAIT DONE
[  431.874575] e1000e: NETDEV CLOSE DEV IS PRESENT
[  432.104194] e1000e: NETDEV CLOSE DEV IS DOWN
[  432.104210] e1000e: NETDEV CLOSE FREE IRQ
[  432.104215] e1000e 0000:00:19.0 eth0: NIC Link is Down
[  432.104217] e1000e: NETDEV CLOSE LINK DOWN MSG
[  432.104218] e1000e: NETDEV CLOSE NAPI DISABLED
[  432.104228] e1000e: NETDEV CLOSE FREE TX RES
[  432.104251] e1000e: NETDEV CLOSE FREE RX RES
[  432.104252] e1000e: NETDEV CLOSE VLAN DONE
[  432.104254] e1000e: NETDEV CLOSE HW CTRL RELEASED
[  432.104257] e1000e: NETDEV CLOSE DONE
[  432.120067] e1000e: PCI REMOVE UNREGISTER NETDEV
[  432.120072] e1000e: PCI REMOVE WAKE NO RESUME
[  432.120075] e1000e: PCI REMOVE RELEASE HW CONTROL
[  432.120110] e1000e: PCI REMOVE INT AND TX RX RING
[  432.120123] e1000e: PCI REMOVE SELECTED REGIONS
[  432.140044] e1000e: PCI REMOVE FREE NETDEV
[  432.140048] e1000e: PCI REMOVE DISABLE ERR REPORTING
[  432.140164] e1000e: PCI REMOVE DISABLE DEVICE
-----------------------------------------------------------------------------------------------------------------------------------
[  443.741628] e1000e: Intel(R) PRO/1000 Network Driver - 3.8.7-NAPI      <--- here is modprobe -v e1000e
[  443.741632] e1000e: Copyright(c) 1999 - 2020 Intel Corporation.
[  443.741822] e1000e 0000:00:19.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
[  443.741824] e1000e 0000:00:19.0: PHY Smart Power Down Disabled
[  443.741826] e1000e 0000:00:19.0: EEE Support was initialized to be enabled
[  443.741827] e1000e 0000:00:19.0: EEE Support has been reset to be disabled
[  444.060221] e1000e 0000:00:19.0 eth0: (PCI Express:2.5GT/s:Width x1) 00:14:4f:4a:a2:81
[  444.060227] e1000e 0000:00:19.0 eth0: Intel(R) PRO/1000 Network Connection
[  444.060251] e1000e 0000:00:19.0 eth0: MAC: 7, PHY: 6, PBA No: FFFFFF-0FF
[  446.780888] e1000e 0000:00:19.0 eth0: NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
[  446.780997] e1000e 0000:00:19.0 eth0: 10/100 speed: disabling TSO
[  447.562186] e1000e: NETDEV CLOSE ENTERED
[  447.562190] e1000e: NETDEV CLOSE WAIT DONE
[  447.562192] e1000e: NETDEV CLOSE DEV IS PRESENT
[  447.792198] e1000e: NETDEV CLOSE DEV IS DOWN
[  447.792211] e1000e: NETDEV CLOSE FREE IRQ
[  447.792216] e1000e 0000:00:19.0 eth0: NIC Link is Down
[  447.792218] e1000e: NETDEV CLOSE LINK DOWN MSG
[  447.792219] e1000e: NETDEV CLOSE NAPI DISABLED
[  447.792227] e1000e: NETDEV CLOSE FREE TX RES
[  447.792251] e1000e: NETDEV CLOSE FREE RX RES
[  447.792252] e1000e: NETDEV CLOSE VLAN DONE
[  447.792254] e1000e: NETDEV CLOSE HW CTRL RELEASED
[  447.792257] e1000e: NETDEV CLOSE DONE
[  448.311058] e1000e: NETDEV CLOSE ENTERED
[  448.311062] e1000e: NETDEV CLOSE WAIT DONE
[  448.311064] e1000e: NETDEV CLOSE DEV IS PRESENT
[  448.532200] e1000e: NETDEV CLOSE DEV IS DOWN
[  448.532213] e1000e: NETDEV CLOSE FREE IRQ
[  448.532218] e1000e 0000:00:19.0 eth0: NIC Link is Down
[  448.532220] e1000e: NETDEV CLOSE LINK DOWN MSG
[  448.532221] e1000e: NETDEV CLOSE NAPI DISABLED
[  448.532230] e1000e: NETDEV CLOSE FREE TX RES
[  448.532254] e1000e: NETDEV CLOSE FREE RX RES
[  448.532255] e1000e: NETDEV CLOSE VLAN DONE
[  448.532257] e1000e: NETDEV CLOSE HW CTRL RELEASED
[  448.532260] e1000e: NETDEV CLOSE DONE
[  450.408891] e1000e 0000:00:19.0 eth0: NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
[  450.409000] e1000e 0000:00:19.0 eth0: 10/100 speed: disabling TSO
groucho@devuan:~$ 

I have found no other record of e1000e activity on a bad_shutdown.

geki wrote:

A normal shutdown would not enter S3?
Though, I am out with that.

I don't know.
There's the issue of not being able to disable sleep state in the BIOS.
And that being set to S3.

The thing would be to prevent the system from picking it up, overriding what the BIOS says.
ie: the proper way to do things

geki wrote:

I patch WOL and PM FREEZE detach thingie and that's it from me.

Sorry to hear that, but I understand.
You've done more than enough already.  8^)
And no one else has made any suggestions.

Maybe not patch WoL, as that seems to be taken care of via ethtool at boot and shutdown.

BTW: you have this in your system?

        # Make it possible to not shut down network interfaces,
        # needed to use wake-on-lan
        netdown="-i"
        if [ "$NETDOWN" = "no" ]; then
                netdown=""
        fi

        log_action_msg "Will now halt"
        read -p "Press enter to halt ($netdown $poweroff $hddown)" reply     <--- this line added
        halt -d -f $netdown $poweroff $hddown
}

Does it shut down properly?

Thanks for your input.

Best,

A.

Last edited by Altoid (2021-05-11 11:29:24)

#94 2021-05-12 20:32:16

geki
Member
Registered: 2019-02-04
Posts: 103  

Yes, here it shuts down properly...

Last patch:
https://geki.selfhost.eu/hacks/1005-e10 … tach.patch

On second thought, if your system entered sleep and e1000e called the pm_freeze function, things could be corrupted, indeed. And the next shutdown and start may go boom?! Another wild guess. :D

And for WoL, it is actually too tricky to patch. I just recommend that you run the ethtool disable WoL command early in the boot process and not before shutdown. :cool: e heh. Just because we want it to stay put from the beginning. :rolleyes:

And yes, with this I am out of thoughts....

Last edited by geki (2021-05-12 20:35:49)

#95 2021-05-12 21:42:17

Altoid
Member
Registered: 2017-05-07
Posts: 1,415  

Hello:

geki wrote:

... here it shuts down properly...

Any idea why it would not shut down properly in my box?
How can I troubleshoot that?

Right.
Thanks. 8^)

geki wrote:

... if your system entered sleep and e1000e called the pm_freeze function things could be corrupted, indeed.

Which ones?

geki wrote:

... next shutdown and start may go boom?!

Nah !   8^D

geki wrote:

... WoL, too tricky ...
... disable WoL command early in boot process ...

That's what I am already doing, remember?

First at /etc/rc.local ...

groucho@devuan:~$ cat /etc/rc.local
#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.

# to set all /proc/acpi/wakeup entries to 'disabled'
# no wakeup from S4 for anything
# does not survire reboot that's why it is here
# see https://dev1galaxy.org/viewtopic.php?pid=29113#p29113
/usr/local/bin/acpi_wakeups.sh

# to disable wol via ethtool at boot
# does not survire reboot that's why it is here
/sbin/ethtool -s eth0 wol d
groucho@devuan:~$ 
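
Note: the content of acpi_wakeups.sh is not shown in this thread. As a rough idea of what such a script has to look at, here is a sketch that lists the devices still marked *enabled in /proc/acpi/wakeup; enabled_wakeups is a made-up name and not the script from the post:

```shell
#!/bin/sh
# enabled_wakeups: hypothetical helper, not the acpi_wakeups.sh from the post.
# Reads /proc/acpi/wakeup-formatted text on stdin and prints the name of
# every device whose status column is "*enabled".
enabled_wakeups() {
    awk '$3 == "*enabled" { print $1 }'
}

if [ -r /proc/acpi/wakeup ]; then
    enabled_wakeups < /proc/acpi/wakeup
fi
```

An empty result after boot would suggest that the rc.local step did its job; writing a device name back to /proc/acpi/wakeup as root toggles that device's status.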

... and just in case some #%&?¡ whatever decides differently, with my shutdown script:

groucho@devuan:~$ cat /usr/bin/shutdown.sh
#!/bin/sh
# added to shutdown directly - no shutdown helper 
# options added to troubleshoot nic related bad shutdown 
PATH=/sbin:/bin:/usr/sbin:/usr/bin:

# 2
# sync
# disable onboard eth wol
# shutdown system directly 
sync && sudo ethtool -s eth0 wol d && sudo shutdown -h now
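
Note: whether the wol d setting actually sticks can be checked by reading it back from ethtool. A minimal sketch: wol_setting is a made-up helper that pulls the Wake-on: line out of ethtool's report, assuming the usual "Wake-on: <letters>" layout of that output.

```shell
#!/bin/sh
# wol_setting: hypothetical helper for illustration.
# Reads the output of "ethtool <iface>" on stdin and prints the current
# Wake-on value ("d" means disabled). The "Supports Wake-on:" line is
# skipped because its first word is "Supports".
wol_setting() {
    awk '$1 == "Wake-on:" { print $2 }'
}

# Only attempted when ethtool is available; root may be needed for the
# Wake-on lines to appear at all.
if command -v ethtool >/dev/null 2>&1; then
    ethtool eth0 2>/dev/null | wol_setting
fi
```

Running it right after boot and again just before shutdown would show whether something flips the value back in between.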

I've come across a couple of pages related to PM and /etc/default/acpi-support.
As acpi-support seems to have been deprecated long ago, I had to look around for oldish pages or get systemd only hits.

I added the links to the file so as to remember where it all came from:

groucho@devuan:~$ cat /etc/default/acpi-support
# comment the next line to disable ACPI suspend to RAM
# ACPI_SLEEP=true
ACPI_SLEEP=false

# comment the next line to disable suspend to disk
# ACPI_HIBERNATE=true
ACPI_HIBERNATE=true

# added 20210511
# see https://forums.linuxmint.com/viewtopic.php?t=43068
# https://askubuntu.com/questions/47311/how-do-i-disable-my-system-from-going-to-sleep
SUSPEND_METHODS="none"
groucho@devuan:~$ 

With respect to the >5.5 kernel upgrade:
I installed 5.10 but there are no drivers for my Nvidia cards.
It seems that the nvidia-340xx-legacy driver used by my perfectly working Nvidia Quadro FX 580 cards has fallen out of favour with the powers that be.
And from what I have read, caused a lot of noise.

I think Manjaro and Ubuntu have put out patches, I expect Debian/Devuan will follow suit. (?)
So I'll have to wait.

I'm not going to give up my 580's or use nouveau, unless it improves.
They still have a few years left in them.

I'll apply patch 1005 and report back with whatever results I get.

Thank you very much for your help.

Best,

A.

#96 2021-05-13 00:03:58

Altoid
Member
Registered: 2017-05-07
Posts: 1,415  

Hello:

Altoid wrote:

I'll apply patch 1005 and report back with whatever results I get.

Right, here we go.

groucho@devuan:/usr/src/e1000e-3.8.7$ sudo patch -p0 -i /usr/src/e1000e-patch/1005-e1000e_387_pm_freeze_sane_detach.patch
patching file src/netdev.c
groucho@devuan:/usr/src/e1000e-3.8.7$
groucho@devuan:/usr/src/e1000e-3.8.7/src$ sudo make
make[1]: Entering directory '/usr/src/linux-headers-4.19.0-16-common'
make[2]: Entering directory '/usr/src/linux-headers-4.19.0-16-amd64'
  CC [M]  /usr/src/e1000e-3.8.7/src/netdev.o
/usr/src/e1000e-3.8.7/src/netdev.c: In function 'e1000e_pm_freeze':
/usr/src/e1000e-3.8.7/src/netdev.c:7401:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
   int count = E1000_CHECK_RESET_COUNT;
   ^~~
  LD [M]  /usr/src/e1000e-3.8.7/src/e1000e.o
  Building modules, stage 2.
  MODPOST 1 modules
  CC      /usr/src/e1000e-3.8.7/src/e1000e.mod.o
  LD [M]  /usr/src/e1000e-3.8.7/src/e1000e.ko
make[2]: Leaving directory '/usr/src/linux-headers-4.19.0-16-amd64'
make[1]: Leaving directory '/usr/src/linux-headers-4.19.0-16-common'
groucho@devuan:/usr/src/e1000e-3.8.7/src$
groucho@devuan:/usr/src/e1000e-3.8.7/src$ sudo modinfo /usr/src/e1000e-3.8.7/src/e1000e.ko
filename:       /usr/src/e1000e-3.8.7/src/e1000e.ko
version:        3.8.7-NAPI
license:        GPL
description:    Intel(R) PRO/1000 Network Driver
author:         Intel Corporation, <linux.nics@intel.com>
srcversion:     7C3F06437067F7EF077432C
alias:          pci:v00008086d00001A1Dsv*sd*bc*sc*i*
--- snip ---
alias:          pci:v00008086d0000105Esv*sd*bc*sc*i*
depends:
retpoline:      Y
name:           e1000e
vermagic:       4.19.0-16-amd64 SMP mod_unload modversions
parm:           copybreak:Maximum size of packet that is copied to a new buffer on receive (uint)
parm:           TxIntDelay:Transmit Interrupt Delay (array of int)
parm:           TxAbsIntDelay:Transmit Absolute Interrupt Delay (array of int)
parm:           RxIntDelay:Receive Interrupt Delay (array of int)
parm:           RxAbsIntDelay:Receive Absolute Interrupt Delay (array of int)
parm:           InterruptThrottleRate:Interrupt Throttling Rate (array of int)
parm:           IntMode:Interrupt Mode (array of int)
parm:           SmartPowerDownEnable:Enable PHY smart power down (array of int)
parm:           KumeranLockLoss:Enable Kumeran lock loss workaround (array of int)
parm:           CrcStripping:Enable CRC Stripping, disable if your BMC needs the CRC (array of int)
parm:           EEE:Enable/disable on parts that support the feature (array of int)
parm:           Node:[ROUTING] Node to allocate memory on, default -1 (array of int)
parm:           debug:Debug level (0=none,...,16=all) (int)
groucho@devuan:/usr/src/e1000e-3.8.7/src$

dmesg at boot

groucho@devuan:~$ sudo dmesg | grep -i "e1000e\|00:19"
[    1.079755] pci 0000:00:19.0: [8086:10bd] type 00 class 0x020000
[    1.079770] pci 0000:00:19.0: reg 0x10: [mem 0xf5fc0000-0xf5fdffff]
[    1.079776] pci 0000:00:19.0: reg 0x14: [mem 0xf5ffe000-0xf5ffefff]
[    1.079783] pci 0000:00:19.0: reg 0x18: [io  0xac00-0xac1f]
[    1.079830] pci 0000:00:19.0: PME# supported from D0 D3hot D3cold
[    2.147352] e1000e: loading out-of-tree module taints kernel.
[    2.215009] e1000e: module verification failed: signature and/or required key missing - tainting kernel
[    2.316026] e1000e: Intel(R) PRO/1000 Network Driver - 3.8.7-NAPI
[    2.333730] e1000e: Copyright(c) 1999 - 2020 Intel Corporation.
[    2.357402] e1000e 0000:00:19.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
[    2.368687] e1000e 0000:00:19.0: PHY Smart Power Down Disabled
[    2.390290] e1000e 0000:00:19.0: EEE Support was initialized to be enabled
[    2.412085] e1000e 0000:00:19.0: EEE Support has been reset to be disabled
[    2.897162] e1000e 0000:00:19.0 eth0: (PCI Express:2.5GT/s:Width x1) 00:14:4f:4a:a2:81
[    2.897164] e1000e 0000:00:19.0 eth0: Intel(R) PRO/1000 Network Connection
[    2.897182] e1000e 0000:00:19.0 eth0: MAC: 7, PHY: 6, PBA No: FFFFFF-0FF
[   27.302085] e1000e 0000:00:19.0 eth0: NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
[   27.314226] e1000e 0000:00:19.0 eth0: 10/100 speed: disabling TSO
groucho@devuan:~$

dmesg for rmmod -v e1000e

groucho@devuan:~$ sudo dmesg
--- snip ---
[10140.513586] e1000e: PCI REMOVE PTP
[10140.513590] e1000e: PCI REMOVE TIMER
[10140.513593] e1000e: PCI REMOVE CANCEL WORK SYNC
[10140.513595] e1000e: PCI REMOVE HW TIMESTAMP
[10140.513617] e1000e: NETDEV CLOSE ENTERED
[10140.513619] e1000e: NETDEV CLOSE WAIT DONE
[10140.513620] e1000e: NETDEV CLOSE DEV IS PRESENT
[10140.741067] e1000e: NETDEV CLOSE DEV IS DOWN
[10140.741081] e1000e: NETDEV CLOSE FREE IRQ
[10140.741086] e1000e 0000:00:19.0 eth0: NIC Link is Down
[10140.741087] e1000e: NETDEV CLOSE LINK DOWN MSG
[10140.741089] e1000e: NETDEV CLOSE NAPI DISABLED
[10140.741099] e1000e: NETDEV CLOSE FREE TX RES
[10140.741121] e1000e: NETDEV CLOSE FREE RX RES
[10140.741123] e1000e: NETDEV CLOSE VLAN DONE
[10140.741125] e1000e: NETDEV CLOSE HW CTRL RELEASED
[10140.741128] e1000e: NETDEV CLOSE DONE
[10140.756902] e1000e: PCI REMOVE UNREGISTER NETDEV
[10140.756908] e1000e: PCI REMOVE WAKE NO RESUME
[10140.756911] e1000e: PCI REMOVE RELEASE HW CONTROL
[10140.756951] e1000e: PCI REMOVE INT AND TX RX RING
[10140.756968] e1000e: PCI REMOVE SELECTED REGIONS
[10140.772962] e1000e: PCI REMOVE FREE NETDEV
[10140.772965] e1000e: PCI REMOVE DISABLE ERR REPORTING
[10140.773065] e1000e: PCI REMOVE DISABLE DEVICE
--- snip ---
groucho@devuan:~$

dmesg for modprobe -v e1000e

groucho@devuan:~$ sudo dmesg
--- snip ---
[10200.634814] e1000e: Intel(R) PRO/1000 Network Driver - 3.8.7-NAPI
[10200.634819] e1000e: Copyright(c) 1999 - 2020 Intel Corporation.
[10200.635004] e1000e 0000:00:19.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
[10200.635007] e1000e 0000:00:19.0: PHY Smart Power Down Disabled
[10200.635008] e1000e 0000:00:19.0: EEE Support was initialized to be enabled
[10200.635010] e1000e 0000:00:19.0: EEE Support has been reset to be disabled
[10200.949099] e1000e 0000:00:19.0 eth0: (PCI Express:2.5GT/s:Width x1) 00:14:4f:4a:a2:81
[10200.949104] e1000e 0000:00:19.0 eth0: Intel(R) PRO/1000 Network Connection
[10200.949129] e1000e 0000:00:19.0 eth0: MAC: 7, PHY: 6, PBA No: FFFFFF-0FF
[10202.829745] e1000e 0000:00:19.0 eth0: NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
[10202.829854] e1000e 0000:00:19.0 eth0: 10/100 speed: disabling TSO
groucho@devuan:~$

Seems everything is where/how it should be?
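
One way to check the EEE part without reading the whole log is to pull out the last EEE message the driver printed. This is only a minimal sketch: it assumes the message wording shown in the dmesg output above, and the function name `eee_state` is made up for illustration, not something provided by the driver.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name): read dmesg text on stdin and print
# the last word of the most recent e1000e "EEE Support" message, i.e.
# "enabled" or "disabled". Prints "unknown" if no such message is found.
# Assumes the message wording seen in the logs above.
eee_state() {
    awk '/e1000e .*EEE Support/ { state = $NF }
         END { print (state == "" ? "unknown" : state) }'
}

# Example (dmesg usually needs root): sudo dmesg | eee_state
```

With the boot log above, this should print `disabled`, since the "reset to be disabled" line comes after the "initialized to be enabled" one.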

Thanks in advance,

A.

Offline

#97 2021-05-13 07:36:22

geki
Member
Registered: 2019-02-04
Posts: 103  

Re: Linux e1000e module removal and e1000e EEE timer - Part II

Did you try nvidia drivers from backports / non-free?
See: https://pkginfo.devuan.org/cgi-bin/pack … 10~bpo10+1

There are other packages for that in backports/non-free as well. That seems to be supported by kernel 5.10.

Last edited by geki (2021-05-13 07:40:37)

Offline

#98 2021-05-13 17:17:36

Altoid
Member
Registered: 2017-05-07
Posts: 1,415  

Re: Linux e1000e module removal and e1000e EEE timer - Part II

Hello:

geki wrote:

Did you try nvidia drivers from backports / non-free?

Yes.
I do have that in my /etc/apt/sources.list.
Must be a question of package priorities.

Will have to specifically install that: apt install nvidia-legacy-340xx-kernel-dkms/stable-backports
I'll get it done and report back.

Thanks for your input.

Best,

A.

Offline

#99 2021-05-13 19:56:13

Altoid
Member
Registered: 2017-05-07
Posts: 1,415  

Re: Linux e1000e module removal and e1000e EEE timer - Part II

Hello:

Altoid wrote:

... and report back.

Done.

groucho@devuan:~$ uname -a
Linux devuan 5.10.0-0.bpo.3-amd64 #1 SMP Debian 5.10.13-1~bpo10+1 (2021-02-11) x86_64 GNU/Linux
groucho@devuan:~$ 
groucho@devuan:~$  apt policy nvidia-legacy-340xx-kernel-dkms
nvidia-legacy-340xx-kernel-dkms:
  Installed: 340.108-10~bpo10+1
  Candidate: 340.108-10~bpo10+1
  Version table:
 *** 340.108-10~bpo10+1 100
        100 http://deb.devuan.org/merged beowulf-backports/non-free amd64 Packages
        100 /var/lib/dpkg/status
     340.108-3~deb10u1 500
        500 http://deb.devuan.org/merged beowulf/non-free amd64 Packages
groucho@devuan:~$ 

Q: what does *** after Version table: mean?
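
For reference: in `apt policy` output, `***` marks the version that is currently installed, and the number after the version string is its priority. In the output above the backports build sits at priority 100 and the plain beowulf build at 500, which fits with having had to ask for the backports package explicitly. A minimal sketch for pulling that line out of the output follows; the function name `installed_entry` is illustrative only and not part of apt.

```shell
#!/bin/sh
# Illustrative helper (not part of apt): read `apt policy <package>` output on
# stdin and print the installed version together with its priority, i.e. the
# entry marked with '***' in the version table.
installed_entry() {
    awk '$1 == "***" { print $2, $3; exit }'
}

# Example: apt policy nvidia-legacy-340xx-kernel-dkms | installed_entry
```

Fed the output above, it prints `340.108-10~bpo10+1 100`.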

With respect to e1000e-3.8.7, I guess it's now time to wait and see whether there's another bad_shutdown or bad_boot, and whether they leave any trace.
Maybe the additions to /etc/default/acpi-support will have some effect?

---
Edit:
Last night I realised that the e1000e driver module put into use by the 5.10 kernel upgrade was most probably the current one, i.e. e1000e-3.8.7.
But I had not applied the five patches, so that's done now.
---

Thanks for your input.

Best,

A.

Last edited by Altoid (2021-05-15 11:32:47)

Offline
