The officially official Devuan Forum!


#1 2020-03-05 20:23:22

rolfie
Member
Registered: 2017-11-25
Posts: 253  

Issues with very new hardware

Having invested in some hot new hardware intended as the replacement for my wife's PC, I need some advice to get it fully working.

The hardware concerned is an AMD X570 chipset board with a Ryzen 7 3700X CPU and a Radeon RX 5500 XT graphics card.

The initial CLI installation was done with a Sapphire GPRO 2200 installed, based on ASCII 2.1 with OpenRC, then upgraded to Beowulf, updated with kernel 5.4 from backports, with xorg/lightdm/MATE desktop added on top. Finally I swapped the graphics card for the 5500XT. The PC boots and seems to be usable, though a few issues remain.

1.) Minor issue: lspci shows the VGA controller only as "Device 7340", which seems to be the raw PCI device ID rather than a product name.

lspci
0a:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 7340 (rev c5)
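
The bare number is easier to interpret next to the vendor ID. A minimal sketch, assuming `lspci -nn` (which prints numeric IDs in brackets) is available; the helper name `pci_ids` is made up for illustration:

```shell
# Hypothetical helper: pull every [vendor:device] pair out of
# `lspci -nn` style text read from stdin.
pci_ids() {
    # Class codes such as [0300] contain no colon, so they are skipped.
    grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | tr -d '[]'
}
```

Typical use would be `lspci -nn -s 0a:00.0 | pci_ids`. A pair starting with 1002 is AMD/ATI. A bare "Device 7340" usually just means the local pci.ids database is older than the card, which running `update-pciids` may fix.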

2.) There are iommu errors shown during boot.

Mar  1 16:42:52 rh060 kernel: [    1.825144] AMD-Vi: Completion-Wait loop timed out
Mar  1 16:42:52 rh060 kernel: [    1.945170] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
Mar  1 16:42:52 rh060 kernel: [    1.945170] pci 0000:00:00.2: AMD-Vi: Extended features (0x58f77ef22294ade):
Mar  1 16:42:52 rh060 kernel: [    1.945171]  PPR X2APIC NX GT IA GA PC GA_vAPIC
Mar  1 16:42:52 rh060 kernel: [    1.945172] AMD-Vi: Interrupt remapping enabled
Mar  1 16:42:52 rh060 kernel: [    1.945173] AMD-Vi: Virtual APIC enabled
Mar  1 16:42:52 rh060 kernel: [    1.945173] AMD-Vi: X2APIC enabled
Mar  1 16:42:52 rh060 kernel: [    1.945264] AMD-Vi: Lazy IO/TLB flushing enabled
Mar  1 16:42:52 rh060 kernel: [    1.945276] iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT device=0a:00.0 address=0x7fb570a40]
Mar  1 16:42:52 rh060 kernel: [    1.945279] iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT device=0a:00.0 address=0x7fb570a60]
Mar  1 16:42:52 rh060 kernel: [    1.945982] amd_uncore: AMD NB counters detected
Mar  1 16:42:52 rh060 kernel: [    1.945985] amd_uncore: AMD LLC counters detected
Mar  1 16:42:52 rh060 kernel: [    1.946171] LVT offset 0 assigned for vector 0x400
Mar  1 16:42:52 rh060 kernel: [    1.946238] perf: AMD IBS detected (0x000003ff)
Mar  1 16:42:52 rh060 kernel: [    1.946242] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
Mar  1 16:42:52 rh060 kernel: [    1.946610] Initialise system trusted keyrings
Mar  1 16:42:52 rh060 kernel: [    1.946618] Key type blacklist registered
Mar  1 16:42:52 rh060 kernel: [    1.946666] workingset: timestamp_bits=40 max_order=23 bucket_order=0
Mar  1 16:42:52 rh060 kernel: [    1.947333] zbud: loaded
Mar  1 16:42:52 rh060 kernel: [    1.947476] Platform Keyring initialized
Mar  1 16:42:52 rh060 kernel: [    1.947477] Key type asymmetric registered
Mar  1 16:42:52 rh060 kernel: [    1.947477] Asymmetric key parser 'x509' registered
Mar  1 16:42:52 rh060 kernel: [    1.947482] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 250)
Mar  1 16:42:52 rh060 kernel: [    1.947517] io scheduler mq-deadline registered
Mar  1 16:42:52 rh060 kernel: [    1.951955] AMD-Vi: Completion-Wait loop timed out
Mar  1 16:42:52 rh060 kernel: [    1.951955] AMD-Vi: Completion-Wait loop timed out
Mar  1 16:42:52 rh060 kernel: [    1.951955] AMD-Vi: Completion-Wait loop timed out
Mar  1 16:42:52 rh060 kernel: [    2.323671] AMD-Vi: Completion-Wait loop timed out
Mar  1 16:42:52 rh060 kernel: [    2.447853] AMD-Vi: Completion-Wait loop timed out
Mar  1 16:42:52 rh060 kernel: [    2.575032] iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT device=0a:00.0 address=0x7fb570a90]
Mar  1 16:42:52 rh060 kernel: [    2.579018] AMD-Vi: Completion-Wait loop timed out
Mar  1 16:42:52 rh060 kernel: [    2.703212] AMD-Vi: Completion-Wait loop timed out
Mar  1 16:42:52 rh060 kernel: [    2.827398] AMD-Vi: Completion-Wait loop timed out
Mar  1 16:42:52 rh060 kernel: [    2.951446] AMD-Vi: Completion-Wait loop timed out
Mar  1 16:42:52 rh060 kernel: [    3.075596] AMD-Vi: Completion-Wait loop timed out
Mar  1 16:42:52 rh060 kernel: [    3.199711] AMD-Vi: Completion-Wait loop timed out
Mar  1 16:42:52 rh060 kernel: [    3.323794] AMD-Vi: Completion-Wait loop timed out
Mar  1 16:42:52 rh060 kernel: [    3.443723] tsc: Refined TSC clocksource calibration: 3593.250 MHz
Mar  1 16:42:52 rh060 kernel: [    3.443732] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x33cb6addeae, max_idle_ns: 440795225061 ns
Mar  1 16:42:52 rh060 kernel: [    3.443753] clocksource: Switched to clocksource tsc
Mar  1 16:42:52 rh060 kernel: [    3.447731] AMD-Vi: Completion-Wait loop timed out
Mar  1 16:42:52 rh060 kernel: [    3.577101] iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT device=0a:00.0 address=0x7fb570ad0]
Mar  1 16:42:52 rh060 kernel: [    3.577104] iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT device=0a:00.0 address=0x7fb570af0]
Mar  1 16:42:52 rh060 kernel: [    3.577135] pcieport 0000:00:07.1: AER: enabled with IRQ 31
Mar  1 16:42:52 rh060 kernel: [    3.577246] pcieport 0000:00:08.1: AER: enabled with IRQ 32
Mar  1 16:42:52 rh060 kernel: [    3.577390] pcieport 0000:00:08.2: AER: enabled with IRQ 33
Mar  1 16:42:52 rh060 kernel: [    3.577532] pcieport 0000:00:08.3: AER: enabled with IRQ 34
Mar  1 16:42:52 rh060 kernel: [    3.578516] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
Mar  1 16:42:52 rh060 kernel: [    3.578522] efifb: probing for efifb
Mar  1 16:42:52 rh060 kernel: [    3.578542] efifb: framebuffer at 0xe0000000, using 3072k, total 3072k
Mar  1 16:42:52 rh060 kernel: [    3.578542] efifb: mode is 1024x768x32, linelength=4096, pages=1
Mar  1 16:42:52 rh060 kernel: [    3.578543] efifb: scrolling: redraw
Mar  1 16:42:52 rh060 kernel: [    3.578543] efifb: Truecolor: size=8:8:8:8, shift=24:16:8:0
Mar  1 16:42:52 rh060 kernel: [    3.578593] Console: switching to colour frame buffer device 128x48
Mar  1 16:42:52 rh060 kernel: [    3.579610] fb0: EFI VGA frame buffer device
Mar  1 16:42:52 rh060 kernel: [    3.579636] Monitor-Mwait will be used to enter C-1 state
Mar  1 16:42:52 rh060 kernel: [    3.580669] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
Mar  1 16:42:52 rh060 kernel: [    3.580894] Linux agpgart interface v0.103
Mar  1 16:42:52 rh060 kernel: [    3.581248] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
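
To see at a glance how many of these events occur and which device they point at, the log can be condensed. A rough sketch, assuming a POSIX awk; `amdvi_summary` is a hypothetical helper, not an existing tool:

```shell
# Hypothetical helper: summarise AMD-Vi trouble in a kernel log on stdin.
amdvi_summary() {
    awk '
        /AMD-Vi: Completion-Wait loop timed out/ { cw++ }
        /IOTLB_INV_TIMEOUT/ {
            # Pick out the PCI address that follows "device=".
            if (match($0, /device=[0-9a-f:.]+/)) {
                dev[substr($0, RSTART + 7, RLENGTH - 7)]++
            }
        }
        END {
            printf "completion-wait timeouts: %d\n", cw
            for (d in dev) printf "IOTLB_INV_TIMEOUT %s: %d\n", d, dev[d]
        }
    '
}
```

It could be fed with `dmesg | amdvi_summary` or `amdvi_summary < /var/log/kern.log`. In the excerpt above every IOTLB_INV_TIMEOUT names 0a:00.0, which is the slot lspci reports for the graphics card.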

3.) The latest amdgpu firmware package does not yet support the 5500 card. I found navi14 firmware on github that should support the 5500XT card, downloaded the files and placed them in /lib/firmware/amdgpu, but it looks like they are not loaded by kernel 5.4/5.5.

Mar  1 16:42:52 rh060 kernel: [   13.634934] amdgpu 0000:0a:00.0: remove_conflicting_pci_framebuffers: bar 0: 0xe0000000 -> 0xefffffff
Mar  1 16:42:52 rh060 kernel: [   13.634936] amdgpu 0000:0a:00.0: remove_conflicting_pci_framebuffers: bar 2: 0xf0000000 -> 0xf01fffff
Mar  1 16:42:52 rh060 kernel: [   13.634937] amdgpu 0000:0a:00.0: remove_conflicting_pci_framebuffers: bar 5: 0xfcb00000 -> 0xfcb7ffff
Mar  1 16:42:52 rh060 kernel: [   13.634938] checking generic (e0000000 300000) vs hw (e0000000 10000000)
Mar  1 16:42:52 rh060 kernel: [   13.634939] fb0: switching to amdgpudrmfb from EFI VGA

Mar  1 16:42:52 rh060 kernel: [   13.635031] amdgpu 0000:0a:00.0: vgaarb: deactivate vga console

Mar  1 16:42:52 rh060 kernel: [   13.660960] amdgpu 0000:0a:00.0: firmware: direct-loading firmware amdgpu/navi14_gpu_info.bin
Mar  1 16:42:52 rh060 kernel: [   13.660961] amdgpu 0000:0a:00.0: Failed to validate gpu_info firmware "amdgpu/navi14_gpu_info.bin"
Mar  1 16:42:52 rh060 kernel: [   13.660966] amdgpu 0000:0a:00.0: Fatal error during GPU init
Mar  1 16:42:52 rh060 kernel: [   13.660968] [drm] amdgpu: finishing device.

Mar  1 16:42:52 rh060 kernel: [   13.660981] ------------[ cut here ]------------
Mar  1 16:42:52 rh060 kernel: [   13.660982] sysfs group 'fw_version' not found for kobject '0000:0a:00.0'
Mar  1 16:42:52 rh060 kernel: [   13.660991] WARNING: CPU: 5 PID: 954 at fs/sysfs/group.c:280 sysfs_remove_group+0x76/0x80
Mar  1 16:42:52 rh060 kernel: [   13.660992] Modules linked in: snd_hda_codec_realtek snd_hda_codec_generic pcc_cpufreq(-) amdgpu(+) ledtrig_audio snd_hda_codec_hdmi edac_mce_amd gpu_sched snd_hda_intel ttm efi_pstore snd_intel_nhlt drm_kms_helper kvm_amd eeepc_wmi asus_wmi snd_hda_codec drm kvm snd_hda_core battery snd_hwdep sparse_keymap snd_pcm rfkill sp5100_tco evdev irqbypass video serio_raw wmi_bmof mxm_wmi pcspkr efivars k10temp mfd_core watchdog snd_timer snd ccp soundcore rng_core button acpi_cpufreq ext4 crc16 mbcache jbd2 crc32c_generic algif_skcipher af_alg dm_crypt dm_mod sg sr_mod cdrom sd_mod hid_generic usbhid hid uas usb_storage crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ahci libahci xhci_pci xhci_hcd libata aesni_intel igb nvme usbcore scsi_mod crypto_simd cryptd glue_helper i2c_algo_bit dca i2c_piix4 nvme_core ptp pps_core usb_common wmi
Mar  1 16:42:52 rh060 kernel: [   13.661016] CPU: 5 PID: 954 Comm: udevd Not tainted 5.4.0-0.bpo.3-amd64 #1 Debian 5.4.13-1~bpo10+1
Mar  1 16:42:52 rh060 kernel: [   13.661017] Hardware name: System manufacturer System Product Name/PRIME X570-PRO, BIOS 1405 11/19/2019
Mar  1 16:42:52 rh060 kernel: [   13.661018] RIP: 0010:sysfs_remove_group+0x76/0x80
Mar  1 16:42:52 rh060 kernel: [   13.661019] Code: 48 89 df 5b 5d 41 5c e9 e8 bd ff ff 48 89 df e8 e0 ba ff ff eb cb 49 8b 14 24 48 8b 75 00 48 c7 c7 40 d2 ac 8a e8 e3 85 d5 ff <0f> 0b 5b 5d 41 5c c3 0f 1f 00 0f 1f 44 00 00 48 85 f6 74 31 41 54
Mar  1 16:42:52 rh060 kernel: [   13.661020] RSP: 0018:ffffae6e0075ba20 EFLAGS: 00010282
Mar  1 16:42:52 rh060 kernel: [   13.661021] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff8ac4d7e8
Mar  1 16:42:52 rh060 kernel: [   13.661022] RDX: 0000000000000001 RSI: 0000000000000092 RDI: 0000000000000247
Mar  1 16:42:52 rh060 kernel: [   13.661022] RBP: ffffffffc0da3de0 R08: 000000000000043b R09: 0000000000000004
Mar  1 16:42:52 rh060 kernel: [   13.661023] R10: 0000000000000000 R11: 0000000000000001 R12: ffff9aceba1170b0
Mar  1 16:42:52 rh060 kernel: [   13.661023] R13: ffff9aceb7b94da0 R14: 0000000000000000 R15: ffff9acead9d8b10
Mar  1 16:42:52 rh060 kernel: [   13.661024] FS:  00007f722b6df880(0000) GS:ffff9acebe940000(0000) knlGS:0000000000000000
Mar  1 16:42:52 rh060 kernel: [   13.661025] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar  1 16:42:52 rh060 kernel: [   13.661025] CR2: 00007ffc55325fc8 CR3: 00000007ee7b4000 CR4: 0000000000340ee0
Mar  1 16:42:52 rh060 kernel: [   13.661026] Call Trace:
Mar  1 16:42:52 rh060 kernel: [   13.661109]  amdgpu_device_fini+0x445/0x479 [amdgpu]
Mar  1 16:42:52 rh060 kernel: [   13.661169]  amdgpu_driver_unload_kms+0x4a/0x90 [amdgpu]
Mar  1 16:42:52 rh060 kernel: [   13.661245]  amdgpu_driver_load_kms.cold.12+0x38/0x6e [amdgpu]
Mar  1 16:42:52 rh060 kernel: [   13.661257]  drm_dev_register+0x10d/0x150 [drm]
Mar  1 16:42:52 rh060 kernel: [   13.661315]  amdgpu_pci_probe+0x15a/0x1d0 [amdgpu]
Mar  1 16:42:52 rh060 kernel: [   13.661318]  local_pci_probe+0x42/0x80
Mar  1 16:42:52 rh060 kernel: [   13.661320]  pci_device_probe+0xfc/0x1b0
Mar  1 16:42:52 rh060 kernel: [   13.661322]  really_probe+0x1c2/0x3e0
Mar  1 16:42:52 rh060 kernel: [   13.661324]  driver_probe_device+0xb4/0x100
Mar  1 16:42:52 rh060 kernel: [   13.661325]  device_driver_attach+0x4f/0x60
Mar  1 16:42:52 rh060 kernel: [   13.661326]  __driver_attach+0x86/0x140
Mar  1 16:42:52 rh060 kernel: [   13.661327]  ? device_driver_attach+0x60/0x60
Mar  1 16:42:52 rh060 kernel: [   13.661328]  bus_for_each_dev+0x77/0xc0
Mar  1 16:42:52 rh060 kernel: [   13.661331]  ? klist_add_tail+0x3b/0x70
Mar  1 16:42:52 rh060 kernel: [   13.661332]  bus_add_driver+0x14d/0x1f0
Mar  1 16:42:52 rh060 kernel: [   13.661333]  ? 0xffffffffc0ff1000
Mar  1 16:42:52 rh060 kernel: [   13.661334]  driver_register+0x6b/0xb0
Mar  1 16:42:52 rh060 kernel: [   13.661335]  ? 0xffffffffc0ff1000
Mar  1 16:42:52 rh060 kernel: [   13.661337]  do_one_initcall+0x46/0x1f4
Mar  1 16:42:52 rh060 kernel: [   13.661340]  ? _cond_resched+0x15/0x30
Mar  1 16:42:52 rh060 kernel: [   13.661342]  ? kmem_cache_alloc_trace+0x1d9/0x220
Mar  1 16:42:52 rh060 kernel: [   13.661344]  do_init_module+0x5a/0x220
Mar  1 16:42:52 rh060 kernel: [   13.661346]  load_module+0x222d/0x24a0
Mar  1 16:42:52 rh060 kernel: [   13.661349]  ? __do_sys_finit_module+0xa8/0x110
Mar  1 16:42:52 rh060 kernel: [   13.661350]  __do_sys_finit_module+0xa8/0x110
Mar  1 16:42:52 rh060 kernel: [   13.661352]  do_syscall_64+0x52/0x160
Mar  1 16:42:52 rh060 kernel: [   13.661354]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Mar  1 16:42:52 rh060 kernel: [   13.661355] RIP: 0033:0x7f722bc20f59
Mar  1 16:42:52 rh060 kernel: [   13.661356] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 07 6f 0c 00 f7 d8 64 89 01 48
Mar  1 16:42:52 rh060 kernel: [   13.661357] RSP: 002b:00007ffc5532e958 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
Mar  1 16:42:52 rh060 kernel: [   13.661358] RAX: ffffffffffffffda RBX: 00005585f19bd080 RCX: 00007f722bc20f59
Mar  1 16:42:52 rh060 kernel: [   13.661358] RDX: 0000000000000000 RSI: 00007f722bd01cad RDI: 000000000000000f
Mar  1 16:42:52 rh060 kernel: [   13.661359] RBP: 00007f722bd01cad R08: 0000000000000000 R09: 0000000000000000
Mar  1 16:42:52 rh060 kernel: [   13.661359] R10: 000000000000000f R11: 0000000000000246 R12: 0000000000000000
Mar  1 16:42:52 rh060 kernel: [   13.661360] R13: 00005585f199f460 R14: 0000000000020000 R15: 00005585f19bd080
Mar  1 16:42:52 rh060 kernel: [   13.661361] ---[ end trace c007050759878ea8 ]---
Mar  1 16:42:52 rh060 kernel: [   13.661474] amdgpu: probe of 0000:0a:00.0 failed with error -22

4.) As in this thread https://dev1galaxy.org/viewtopic.php?id=3336 the boot messages stop at "waiting for /dev to be fully populated". When the system switches to the login screen, a scrambled display appears for about a second, then the LightDM login is presented as expected.

5.) I dug around a bit on the internet and found a release candidate for kernel 5.5 in Debian experimental. I downloaded the .deb package from Debian and installed it with dpkg -i. The PC boots, but the issues remain as they were.
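
Since the package was installed by hand with dpkg -i, it may be worth confirming that the machine really booted the new kernel rather than an older boot entry. A small sketch; `version_ge` is a hypothetical helper and relies on `sort -V` from GNU coreutils:

```shell
# Hypothetical helper: succeed if version $1 is greater than or equal to $2.
version_ge() {
    # After a version sort, the smaller (or equal) string comes first.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

For example, `version_ge "$(uname -r)" 5.5 && echo "running 5.5 or newer"` compares the running kernel against whatever minimum is of interest.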

6.) navi14 firmware: downloaded them and copied them into what I think is the right directory, access rights seem to be set correctly. The message is "Failed to validate gpu_info firmware "amdgpu/navi14_gpu_info.bin"". What did I miss?
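
One thing that can be checked mechanically is whether every file the driver asks for is present. A sketch, assuming `modinfo -F firmware amdgpu` lists the firmware names the installed module declares; `missing_firmware` is a made-up helper:

```shell
# Hypothetical helper: read firmware names (one per line) from stdin and
# report those that do not exist below the given directory.
missing_firmware() {
    fwdir=${1:-/lib/firmware}
    while IFS= read -r name; do
        [ -e "$fwdir/$name" ] || printf 'missing: %s\n' "$name"
    done
}
```

A possible invocation is `modinfo -F firmware amdgpu | grep navi14 | missing_firmware`. The log shows that navi14_gpu_info.bin itself was found, so this only rules out gaps among the other files; it says nothing about whether a file's content is valid.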

Thanks, rolfie


#2 2020-03-06 08:46:22

ToxicExMachina
Member
Registered: 2019-03-11
Posts: 201  

Re: Issues with very new hardware

Are you sure you have downloaded firmware correctly?


#3 2020-03-06 18:01:51

rolfie
Member
Registered: 2017-11-25
Posts: 253  

Re: Issues with very new hardware

Good question. I downloaded the 18 individual files again from a different PC, as well as the latest complete firmware package as a tarball, which I found on github. I could not find any checksums.
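
Without checksums, one cheap sanity check is whether the saved files are binary at all: saving a link from a git web interface can store the HTML page instead of the raw blob. Whether that happened here is only a guess. A minimal sketch, assuming `head -c` and `grep -a` are available (GNU or BSD tools); `looks_like_html` is a hypothetical name:

```shell
# Hypothetical helper: succeed if the file starts like a web page
# rather than like binary firmware.
looks_like_html() {
    # Only the first 512 bytes are inspected; -a treats binary data as text.
    head -c 512 "$1" | LC_ALL=C grep -qaiE '<!doctype html|<html'
}
```

It could be run over the copied files with `for f in /lib/firmware/amdgpu/navi14_*.bin; do looks_like_html "$f" && echo "$f looks like HTML"; done`. No output would mean none of the files starts like a web page, which still does not prove they are intact.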

Replacing the 18 individual files did not seem to make a difference. Then I extracted the files from the tarball and copied them into the firmware directory.

Then I noticed a new phenomenon: with every reboot the clock is reset to 1.1.2019 00:00 (this looks like the BIOS minimum). It is not caused by a bad BIOS battery; replacing it did not help. Other settings do not seem to be affected, e.g. Power fail: ON.
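
To narrow down when the clock goes wrong, the hardware clock can be compared with the system clock while Linux is running. A rough sketch, assuming the RTC is exposed as /sys/class/rtc/rtc0/since_epoch; `clock_offset` is a made-up helper:

```shell
# Hypothetical helper: seconds by which the system clock ($2) is ahead
# of the hardware clock ($1); both arguments are Unix epoch seconds.
clock_offset() {
    rtc_epoch=$1
    sys_epoch=$2
    echo $((sys_epoch - rtc_epoch))
}
```

Called as `clock_offset "$(cat /sys/class/rtc/rtc0/since_epoch)" "$(date +%s)"`, once right after boot and once just before shutdown, it would show whether the RTC is already off during the session or only loses its setting across the restart.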

The PC is now stuck at "waiting for /dev to be fully populated"; the login window no longer appears.

What is this? Any suggestions?

rolfie


#4 2020-03-06 23:07:50

Head_on_a_Stick
Member
From: London
Registered: 2019-03-24
Posts: 556  

Re: Issues with very new hardware

From where did you obtain the firmware?

I would try https://git.kernel.org/pub/scm/linux/ke … ree/amdgpu

Have you tested this hardware with Arch Linux yet? They have kernel 5.5.8 and the firmware from 2020-02-24.


"Il semble que la perfection soit atteinte non quand il n'y a plus rien à ajouter, mais quand il n'y a plus rien à retrancher." — Antoine de Saint-Exupéry


#5 2020-03-07 20:10:43

rolfie
Member
Registered: 2017-11-25
Posts: 253  

Re: Issues with very new hardware

Head_on_a_Stick wrote:

From where did you obtain the firmware?

I would try https://git.kernel.org/pub/scm/linux/ke … ree/amdgpu

Have you tested this hardware with Arch Linux yet? They have kernel 5.5.8 and the firmware from 2020-02-24.

That's exactly where I got the individual files and the tarball from.

I loaded the latest available Arch onto a stick and booted the PC from that. I did not see the same errors, maybe because Arch uses systemd, but the boot ends on the command line with the wrong locale and no network, both of which have to be configured in a different way. I am not familiar with Arch/systemd.

I checked the journal for indications of whether the navi14 firmware is loaded, but could not find anything.
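
A narrower question that a log can answer is whether the amdgpu probe failed at all. A minimal sketch that only looks for the two failure messages seen in the Devuan log above; `amdgpu_probe_status` is a hypothetical helper:

```shell
# Hypothetical helper: read a kernel log from stdin and report whether
# it contains one of the amdgpu failure messages quoted in this thread.
amdgpu_probe_status() {
    if grep -qE 'amdgpu.*(Fatal error during GPU init|probe of .* failed)'; then
        echo "amdgpu probe failed"
    else
        echo "no amdgpu probe failure logged"
    fi
}
```

On a systemd system it could be fed with `journalctl -k -b | amdgpu_probe_status`. As far as I can tell, the "firmware: direct-loading firmware" line comes from a Debian kernel patch, so its absence in another distribution's journal may not mean the firmware was not loaded.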

What can I learn from this experiment? Is it worth investing time in learning something I neither want nor need as a stable working environment?

Thanks, rolfie

Last edited by rolfie (2020-03-07 20:27:34)


#6 2020-03-07 20:22:39

Head_on_a_Stick
Member
From: London
Registered: 2019-03-24
Posts: 556  

Re: Issues with very new hardware

FWIW:

rolfie wrote:

the boot ends on the command line with wrong locale and no network

localectl set-locale en_GB.UTF-8 # for example
wifi-menu

rolfie wrote:

What can I learn from this experiment?

That kernel 5.5.8 supports your third generation Ryzen better than any of the Debian or Devuan kernels that are available atm. You can switch to ceres and wait a bit but that's not a rolling release so it won't be as reliable as Arch.



#7 2020-03-09 08:24:19

ralph.ronnquist
Administrator
From: Clifton Hill, Victoria, AUS
Registered: 2016-11-30
Posts: 393  

Re: Issues with very new hardware

rolfie  wrote:

The PC is now stuck at "waiting for /dev to be fully populated"; the login window no longer appears.

That same thing happens for me with the (forthcoming) beowulf installation on qemu, but this recovers by using ctrl-alt-f2 for an alternate login. Maybe it's the same for you?


#8 2020-03-09 10:48:43

rolfie
Member
Registered: 2017-11-25
Posts: 253  

Re: Issues with very new hardware

Thanks, the keyboard works (reacts to NUMLOCK), but no reaction to any ctrl-alt-fn combination.

A single boot wipes the time setting; it is back at 1.1.2019 00:00.

Can reboot with ctrl-alt-del.

rolfie

Last edited by rolfie (2020-03-09 10:49:02)

