The officially official Devuan Forum!


#1 2023-04-04 17:12:14

dp
Member
Registered: 2017-11-10
Posts: 15  

Loss of on-hover response

I've downloaded a live copy of CHIMAERA and run it from a thumb drive. I have found a problem with the interaction of Firefox and Thunderbird. It appears that the "on-hover" function/interrupt gets confused and locks up the machine, requiring a power cycle to regain control. My conjecture is that the "on-hover" function is "single-threaded" (for lack of a proper term?).

Firefox opens full screen with two tabs and waits for the "Get Started" button on the second tab. Loading Thunderbird afterwards, which also opens full screen, brings up an interactive window(?) that's waiting for the "Next Tip" button. At this point the cursor is locked: it can be moved around, but no selection can be made, not even from the (system?) icons at the top of the page. It appears that the on-hover function is locked up.

While this occurs with Firefox and Thunderbird, I think that if, at boot, two apps are loaded full screen and both require input, the system will lock up because the "on-hover" processor is confused. (IBM had a similar problem with OS/2, where all apps were "single-threaded" through the keyboard code. If one failed, no one had a keyboard.) So I think this is a more generic problem than just this specific case. If Firefox is loaded and resized to half screen, and Thunderbird is loaded and resized to half screen, then it mostly works, but it can easily be confused, locking up the system and requiring a power cycle.

How do I go about further isolating this problem?

And is there a key sequence that can rescue this situation? Once upon a time, I remember "somewhere" there was an "Alt-F4" to close the current window.

P.S. What code processes the on-hover interrupts? Is this the HTML processor? And who might that be? Thank you.

Last edited by dp (2023-04-04 17:17:25)


#2 2023-04-05 16:32:55

chris2be8
Member
Registered: 2018-08-11
Posts: 264  

Re: Loss of on-hover response

ctrl-alt-f1 (press all 3 keys at once) should get you to a text screen (there are 6 of them, ctrl-alt-f1 to ctrl-alt-f6). ctrl-alt-f7 should get you back to the GUI (try this out *before* you get another hang). If that works you should be able to log on and try commands like killall firefox and killall thunderbird.

If you can't free the GUI you could use sudo shutdown -r now as a slightly gentler way to force a reboot.
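As a minimal sketch of the recovery step above, assuming you have reached a text console (or any other shell on the machine) and know the PID of the stuck program: the helper below asks the process to exit with SIGTERM and only falls back to SIGKILL if it is still alive after a grace period. The name terminate_pid and the 5-second default are illustrative choices, not something from this thread.

```shell
#!/bin/sh
# terminate_pid PID [GRACE_SECONDS]
# Hypothetical helper: send SIGTERM, wait up to GRACE_SECONDS (default 5),
# then send SIGKILL if the process is still there.
terminate_pid() {
    pid=$1
    grace=${2:-5}

    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    kill -0 "$pid" 2>/dev/null || return 0   # already gone, nothing to do

    # Polite request first, so the program can clean up after itself.
    kill -TERM "$pid" 2>/dev/null || return 0

    while [ "$grace" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        grace=$((grace - 1))
    done

    # Still alive after the grace period: force it.
    kill -KILL "$pid" 2>/dev/null
}
```

For example, terminate_pid "$(pgrep -o firefox)" would target the oldest process whose name matches firefox. The killall commands suggested above remain the quicker one-liner when no grace period is needed.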

Last edited by chris2be8 (2023-04-05 16:33:42)


#3 2023-04-05 16:51:20

golinux
Administrator
Registered: 2016-11-25
Posts: 3,137  

Re: Loss of on-hover response

I have an xkill shortcut in my panel for occasions like you describe.


#4 2023-04-06 20:14:00

dp
Member
Registered: 2017-11-10
Posts: 15  

Re: Loss of on-hover response

The best way to induce this problem is, upon boot (with a fresh live copy), to load the browser (Firefox), which will open two tabs, the second of which shows a "no network found" message and a "Try Again" button. Then load LibreOffice Writer, which will open a small window with a "Next Tip" button. At that point, the "on-hover" function no longer works for any of the icons on any of the panels. Additionally, Ctrl-Alt-F1 through F7 do not work, so there is no way to get to the command line to recover the system. Sometimes, when both apps are loaded in one window and I switch from one tab to the other, the title switches to Firefox but the rest of the picture still shows Writer. On some occasions, when switching around, multiple instances of the Devuan Launcher are attempted before it locks up. (I don't know how to drop a screen pic into this message.)

It seems that two apps are waiting for their on-hover interrupt to be processed (in sequence?). If the second app covers the first screen, you can't get to the first button, so the interrupt processor cannot issue a return-from-interrupt and the interrupt stack cannot get back in sync. Any debug suggestions are appreciated. Progress is slow due to all the rebooting required...

Interestingly, if a network connection is established before loading the apps, everything seems to work OK.

Also, loading the apps in the reverse order changes the way the failure manifests. Thanks.


#5 2023-04-07 16:23:43

chris2be8
Member
Registered: 2018-08-11
Posts: 264  

Re: Loss of on-hover response

Do you have another system? If so, can you log on to the system with the problem via ssh (preferably before causing it to hang)? That should give you a way to enter commands (I've had to do this).

Just to confirm, does Ctrl-Alt-F1 to F7 work when the system is not hung?


#6 2023-04-07 23:42:53

dp
Member
Registered: 2017-11-10
Posts: 15  

Re: Loss of on-hover response

chris2be8 wrote:

Do you have another system? If so, can you log on to the system with the problem via ssh (preferably before causing it to hang)? That should give you a way to enter commands (I've had to do this).

Just to confirm, does Ctrl-Alt-F1 to F7 work when the system is not hung?

I can bring up another laptop system. I have to learn how to use ssh... That's just where I am at this point.

Yes, Ctrl-Alt-F1 to F7 work when the system is not hung.
Thank you.


#7 2023-04-25 17:21:47

dp
Member
Registered: 2017-11-10
Posts: 15  

Re: Loss of on-hover response

I'm thinking I may have a hardware problem.
Learned rsync and have backed up my home directory.  Will reinstall the OS and retry.


#8 2023-04-25 22:51:31

GlennW
Member
From: Brisbane, Australia
Registered: 2019-07-18
Posts: 582  

Re: Loss of on-hover response

Hi, I've been having mouse problems as well, "focus" being the main dysfunction.




#9 2023-06-26 16:31:37

dp
Member
Registered: 2017-11-10
Posts: 15  

Re: Loss of on-hover response

A simpler way to induce this problem is to load FF, go to www.cepher.net, and do a look-inside. Sometimes it locks up while loading the PDF pages. Sometimes it only affects the buttons in that window: I can't X out. With some messing around, all on-hover functions fail, requiring a power cycle to recover. Other times, with the window locked, cycling through different workspaces clears the on-hover within the window, and sometimes FF dies. This happens with a live CHIMAERA or an older installed Refracta. However, it works just fine using a slightly older live copy of Knoppix. Any ideas as to how to isolate this problem further?


#10 2023-07-12 16:14:05

dp
Member
Registered: 2017-11-10
Posts: 15  

Re: Loss of on-hover response

I have two laptops. One has an RJ-45 Ethernet port; the other has only USB ports. I can tether both machines to my iPhone, note their IP addresses, and ping between them. Is there a way to ssh via the tether connection? If so, I need a little hand-holding to get this done so I can do a remote login and see if progress can be made with the on-hover problem.


#11 2023-07-13 10:07:10

PedroReina
Member
From: Madrid, Spain
Registered: 2019-01-13
Posts: 267  

Re: Loss of on-hover response

dp wrote:

Is there a way to ssh via the tether connection?

Have you tried?


#12 2023-07-14 17:37:33

dp
Member
Registered: 2017-11-10
Posts: 15  

Re: Loss of on-hover response

The answer to MY question is yes.
The CHIMAERA machine can ssh into the Refracta machine via the iPhone tether. (This does not appear to be the case when both machines are connected to the public wifi; could be different routers?) However, going the other way gives a "connection refused" error. This appears to be caused by port 22 not being available. So far I haven't figured out how to bring up port 22 on the Chimaera machine. (It is presumably closed for security reasons.) But I can ssh into the Refracta machine from the Chimaera machine, at least before the window freezes (I haven't tried after); not even the clock gets updated. What can I do next to try and isolate the failure? FF usually ends up crashing and wanting to send a message to Mozilla.


#13 2023-07-15 11:14:31

fsmithred
Administrator
Registered: 2016-11-25
Posts: 2,409  

Re: Loss of on-hover response

Refracta isos have openssh-server installed; the devuan-live isos do not. That's why you can't ssh to the devuan box.
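A hedged sketch of how one might confirm this from the other laptop: the function below only tests whether anything accepts TCP connections on a given port (22 by default) of a given address. It relies on bash's /dev/tcp pseudo-device and on the coreutils timeout command. The function name port_open and the 3-second limit are made up for illustration. A "connection refused" result usually means that no SSH server is listening there.

```shell
#!/usr/bin/env bash
# port_open HOST [PORT]
# Hypothetical helper: succeed (exit 0) if a TCP connection to HOST:PORT
# can be opened within 3 seconds, fail otherwise. PORT defaults to 22 (ssh).
port_open() {
    local host=$1 port=${2:-22}
    # /dev/tcp/HOST/PORT is interpreted by bash itself, not by the kernel,
    # so the check is run in a child bash with host and port passed as $0 and $1.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

Calling port_open with the address noted for the Devuan box would be expected to fail until an SSH server is running there. Installing one in the live session with sudo apt install openssh-server should make the port available, assuming the session has network access and enough free memory for the install.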

How did you start Thunderbird in a live session? It's not in the isos. Did you install it in the live session? If so, you may have run out of memory.

I went to www.cepher.net and looked inside. I'd say there's a problem with that website. Turning the pages of the book uses 213% of the cpu in a VM that has been allotted 2 cores. It didn't lock up, but it was acting like it was close to doing so.


#14 2023-07-17 18:27:11

dp
Member
Registered: 2017-11-10
Posts: 15  

Re: Loss of on-hover response

Sorry about the Thunderbird confusion. It's not needed to induce the problem.
I've tried to collect some data that may help someone suggest a next move. At this point, I'm more interested in learning how to troubleshoot a problem like this than in actually fixing it. This will help me learn my way around the system better (and perhaps help someone else too). Here's what I've found with my limited skills:
---------------------------------------
This looks like some sort of memory DMA access problem:

[ 8647.079594] nouveau 0000:01:00.0: fifo: DMA_PUSHER - ch 4 [Xorg[2108]] get 000019bef0 put 000019cbc0 ib_get 000003a0 ib_put 000003fa state 80000000 (err: INVALID_CMD) push 00406040

[ 8647.079632] nouveau 0000:01:00.0: gr: DATA_ERROR 00000004 [INVALID_VALUE]

[ 8647.079638] nouveau 0000:01:00.0: gr: 00100000 [] ch 4 [000f749000 Xorg[2108]] subc 3 class 8297 mthd 0ec8 data 00146200

[ 8647.079667] nouveau 0000:01:00.0: fifo: DMA_PUSHER - ch 4 [Xorg[2108]] get 000029b004 put 000029b00c ib_get 000003a1 ib_put 000003fa state 80000000 (err: INVALID_CMD) push 00406040

….

[ 9234.234212] perf: interrupt took too long (2503 > 2500), lowering kernel.perf_event_max_sample_rate to 79750

Looks like it gave up waiting for a return from interrupt.

zenos@E6400:~$

--------------------------------

Here's the process 2500 that got offended: (a disk access failure?)

zenos  2485  2394  0 08:54 ?  00:00:29 /usr/bin/python -O /usr/share/wi
zenos  2493     1  0 08:54 ?  00:00:00 /usr/lib/i386-linux-gnu/xfce4/no
root   2500     1  0 08:54 ?  00:00:01 /usr/lib/udisks2/udisksd
zenos  2524     1  0 08:54 ?  00:00:00 /usr/lib/gvfs/gvfs-goa-volume-mo
zenos  2529

-----------------------------------
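A note on reading the perf line quoted earlier: as far as I know, the two numbers in parentheses are durations in nanoseconds (the measured time of the perf interrupt handler and the allowed limit), not a process ID, so the match with PID 2500 is probably a coincidence. The sketch below pulls the three numbers out of such a line so they can be compared across several occurrences. The function name is hypothetical.

```shell
#!/bin/sh
# perf_throttle_fields
# Hypothetical helper: read kernel log lines on stdin and, for each
# "perf: interrupt took too long" message, print
#   <measured_ns> <limit_ns> <new_max_sample_rate>
# Lines that are not perf throttle messages are ignored.
perf_throttle_fields() {
    sed -n 's/.*perf: interrupt took too long (\([0-9][0-9]*\) > \([0-9][0-9]*\)), lowering kernel\.perf_event_max_sample_rate to \([0-9][0-9]*\).*/\1 \2 \3/p'
}
```

A typical use would be dmesg | perf_throttle_fields. A single message with values this close together (2503 against 2500) reads more like routine throttling of the sampling rate than like a failed disk access, though that is an interpretation rather than something the log proves.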

"TOP" shows where kworker settles down(up) to with around 35% cpu and up to 3-4% mem:

  PID USER   PR NI   VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
 6846 root   20  0      0      0     0 R 33.4  0.0 15:12.50 kworker/0+
 9072 zenos  20  0 712392 130768 94440 S  3.0  6.4  1:32.75 Isolated +
 2108 root   20  0 171132  66468 32604 S  1.0  3.2  9:59.85 Xorg
  520 root  -51  0      0      0     0 S  0.7  0.0  0:46.71 irq/27-iw+
 8701 zenos  20  0   8400   2948  2516 S  0.7  0.1  0:16.25 top

Killing FF caused FF to go to 100%, and it eventually died.

zenos@E6400:~$ kill 9072

zenos@E6400:~$ kill 8791

zenos@E6400:~$ dmesg | tail

[12115.107292] iwlwifi 0000:0c:00.0: EVT_LOGT:1132408922:0x0000000c:0512

[12115.107307] iwlwifi 0000:0c:00.0: EVT_LOGT:1133408960:0x00000000:0125

[12115.107350] iwlwifi 0000:0c:00.0: Command SENSITIVITY_CMD failed: FW Error

[12115.118904] ieee80211 phy0: Hardware restart was requested

[12115.166666] iwlwifi 0000:0c:00.0: Radio type=0x1-0x2-0x0

[12115.295444] iwlwifi 0000:0c:00.0: Radio type=0x1-0x2-0x0

Looks like it lost the wifi connection but successfully reconnected without loss of ssh sessions.

[12115.463718] volumeicon[2482]: segfault at 50 ip b7622f75 sp bfdf5708 error 4 in libgdk-3.so.0.2404.1[b75e9000+7f000]

….

[12115.466343] panel-17-xkb[2555]: segfault at 50 ip b7614f75 sp bf8d9568 error 4 in libgdk-3.so.0.2404.1[b75db000+7f000]

This segfault could be caused by the DMA failure...?
----------------------------
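To compare the timing of the graphics-driver errors and the segfaults, it can help to strip everything else out of the kernel log. The filter below is a minimal sketch that keeps only nouveau fifo/gr reports and user-space segfault reports. The function name and the exact patterns are assumptions based on the lines quoted in this post, so other drivers or message formats would need their own patterns.

```shell
#!/bin/sh
# gpu_and_segfault_lines
# Hypothetical filter: read kernel log lines on stdin and keep only
#   - nouveau driver reports from its fifo or gr engines, and
#   - "segfault at ..." reports for user-space programs.
# The timestamps in square brackets are left intact so the order of
# events can still be read off the output.
gpu_and_segfault_lines() {
    grep -E 'nouveau .*(fifo|gr): |segfault at '
}
```

Run as dmesg | gpu_and_segfault_lines, this makes it easier to see whether the segfaults follow closely after a burst of nouveau errors or are separated from them by a long gap.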

Did not lose the wifi this time and thus the ssh sessions survived.

At the point of this failure, kworker consistently increases incrementally from around 25% to 36% mem over one to two minutes.
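To follow a trend like this without watching the screen, the figures can be pulled out of top's batch output. The helper below is a sketch under the assumption that top uses the default column layout shown earlier in this post (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND). The name cpu_of is hypothetical.

```shell
#!/bin/sh
# cpu_of PATTERN
# Hypothetical helper: read batch-mode top output on stdin and print
# "PID %CPU %MEM COMMAND" for every process line whose COMMAND column
# starts with PATTERN. Header and summary lines are skipped because
# their first field is not a number.
cpu_of() {
    awk -v pat="$1" '
        $1 ~ /^[0-9]+$/ && index($12, pat) == 1 {
            print $1, $9, $10, $12
        }
    '
}
```

For example, top -b -n 1 | cpu_of kworker prints one line per kworker thread. Repeating that every few seconds over an ssh session and saving the output would give a record of how the values climb before the lock-up.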

_____________________________

On previous failures, at the point of failure, Xorg bounces around 100% CPU, 3-4% mem. When it dies, FF bounces around 99% with similar mem use before being killed (it takes a while for the system to find time to get to the kill command). The machine recovered.

Thanks.

-------------------------------

Last edited by dp (2023-07-17 18:30:57)

