The officially official Devuan Forum!

#1 2019-04-02 10:41:06

zero
Member
Registered: 2019-01-18
Posts: 10  

EXT4 BUG? Data loss when I have launched sigil :s

I launched Sigil (displayed in full screen) to write something, then I got an error message in a dialog box saying something like "Sigil cannot run". So I clicked on "close" and noticed that the background image was gone…

I realized that my pictures were gone… :s I checked my home folder and indeed, my pictures were gone, and it seems a few other files were too. Luckily, I backed up my system a few days ago.

Now, how can I track down what happened? Nothing is logged under /var, and dmesg yields nothing. Moreover, neither smartctl nor Samsung Magician reports anything.

And how can I track this in the future?

---EDIT---
Files on another drive are missing too. Possibly an EXT4 bug? How can I check this?
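
One possible way to track this in the future is to record a regular inventory of the files and compare it with the previous one. The following is only a minimal sketch in Python; the function names, the JSON format and the choice of recording size and modification time are assumptions made for illustration, not something any tool mentioned in this thread provides.

```python
import json
import os


def take_snapshot(root):
    """Map each file path under root to its [size, mtime] pair."""
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            snapshot[path] = [info.st_size, int(info.st_mtime)]
    return snapshot


def save_snapshot(snapshot, target):
    """Write a snapshot to a JSON file."""
    with open(target, "w", encoding="utf-8") as handle:
        json.dump(snapshot, handle)


def load_snapshot(source):
    """Read a snapshot previously written by save_snapshot."""
    with open(source, encoding="utf-8") as handle:
        return json.load(handle)


def missing_files(old, new):
    """Return paths present in the old snapshot but absent from the new one."""
    return sorted(set(old) - set(new))
```

Run periodically (for example from cron) with the snapshot file stored on a different drive, this would show which files disappeared and between which two runs. It would not explain why they disappeared.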

Last edited by zero (2019-04-13 12:04:08)

#2 2019-04-08 17:15:13

rolfie
Member
Registered: 2017-11-25
Posts: 133  

Re: EXT4 BUG? Data loss when I have launched sigil :s

I am pretty sure that ext4 does not have any significant bug. Like many others, I have been using it for many years and can't complain. You should look at your hardware and at this program you are talking about.

Run a full check of your drives with smartmontools. Maybe gsmartcontrol makes things easier to display.

Rolf

Last edited by rolfie (2019-04-09 19:58:16)

#3 2019-04-08 20:07:17

ChuangTzu
Member
Registered: 2018-06-13
Posts: 135  

Re: EXT4 BUG? Data loss when I have launched sigil :s

definitely not ext4. Report back with the results of rolfie's suggestion(s); if all looks good, then you might need to clean out the dust, check the connections, etc.

#4 2019-04-09 07:00:40

zero
Member
Registered: 2019-01-18
Posts: 10  

Re: EXT4 BUG? Data loss when I have launched sigil :s

Summary from bug#313

There are 4 drives of two different brands and 3 different technologies (SSD, hard drive, and NVMe), and data loss occurred on the HDD and the NVMe… So different technologies, "different controllers", different brands, but they all have EXT4 in common.
The data vanished in a few seconds…
I tried PhotoRec without success. The files are not recoverable.

I could add that it's a relatively new PC (2 months old) and that I have tried:

  • nvme-cli

  • smartmontools

  • Samsung Magician

All these tools say that EVERYTHING is OK, and strangely, PhotoRec does not find those files.

Finally, and I think this is very important: I was not logged in as root in any way (terminal, etc.), nor did I use sudo. Yet files that belonged to root or to other users/groups have also vanished, even though my user did not have any "super rights". So more than 100GB on two disks vanished instantly, which is not possible and far too fast for an "rm" command (which I never ran anyway).

Last edited by zero (2019-04-09 12:02:55)

#5 2019-04-09 19:18:28

ChuangTzu
Member
Registered: 2018-06-13
Posts: 135  

Re: EXT4 BUG? Data loss when I have launched sigil :s

Let's start with the basics:

as root:

apt install inxi

then post output of:

inxi -Fr

#6 2019-04-09 19:37:30

zero
Member
Registered: 2019-01-18
Posts: 10  

Re: EXT4 BUG? Data loss when I have launched sigil :s

System:    Host: devuan Kernel: 4.19.0-0.bpo.4-amd64 x86_64 (64 bit) Desktop: MATE 1.20.4
           Distro: Devuan GNU/Linux ascii
Machine:   Device: desktop Mobo: Micro-Star model: B450M PRO-VDH (MS-7A38) v: 4.0
           UEFI: American Megatrends v: M.40 date: 01/25/2019
CPU:       Hexa core AMD Ryzen 5 2600 Six-Core (-HT-MCP-) cache: 3072 KB
           clock speeds: max: 3400 MHz 1: 1390 MHz 2: 1518 MHz 3: 1383 MHz 4: 1398 MHz 5: 1379 MHz 6: 1399 MHz
           7: 1549 MHz 8: 1592 MHz 9: 1356 MHz 10: 1323 MHz 11: 1457 MHz 12: 1386 MHz
Graphics:  Card: Advanced Micro Devices [AMD/ATI] Oland PRO [Radeon R7 240/340]
           Display Server: X.Org 1.19.2 drivers: ati,radeon (unloaded: modesetting,fbdev,vesa)
           Resolution: 1920x1200@59.95hz
           GLX Renderer: AMD OLAND (DRM 2.50.0, 4.19.0-0.bpo.4-amd64, LLVM 7.0.1)
           GLX Version: 4.5 (Compatibility Profile) Mesa 18.3.6
Audio:     Card-1 Advanced Micro Devices [AMD] Family 17h (Models 00h-0fh) HD Audio Controller
           driver: snd_hda_intel
           Card-2 Advanced Micro Devices [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
           driver: snd_hda_intel
           Sound: Advanced Linux Sound Architecture v: k4.19.0-0.bpo.4-amd64
Network:   Card: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller driver: r8169
           IF: eth0 state: up speed: 100 Mbps duplex: full mac: 00:xx:xx:xx:xx:xx
Drives:    HDD Total Size: 1132.3GB (79.0% used)
           ID-1: /dev/nvme0n1 model: N/A size: 500.1GB
           ID-2: /dev/sda model: SAMSUNG_SSD_830 size: 128.0GB
           ID-3: /dev/sdb model: WDC_WD10EZEX size: 1000.2GB
           ID-4: USB /dev/sdc model: USB_2.0_FD size: 4.0GB
Partition: ID-1: / size: 9.8G used: 1.2G (13%) fs: ext4 dev: /dev/dm-0
           ID-2: /usr size: 32G used: 6.9G (23%) fs: ext4 dev: /dev/dm-1
           ID-3: /boot size: 3.7G used: 175M (5%) fs: ext4 dev: /dev/sdc1
           ID-4: /var size: 14G used: 1.4G (11%) fs: ext4 dev: /dev/dm-2
           ID-5: /tmp size: 5.5G used: 25M (1%) fs: ext4 dev: /dev/dm-3
           ID-6: /home size: 458G used: 153G (36%) fs: ext4 dev: /dev/dm-4
Sensors:   System Temperatures: cpu: No active sensors found. Have you configured your sensors yet? mobo: N/A gpu: 33.0
Repos:     Active apt sources in file: /etc/apt/sources.list
           deb http://pkgmaster.devuan.org/merged ascii main contrib non-free
           deb http://pkgmaster.devuan.org/merged ascii-updates main
           deb http://pkgmaster.devuan.org/merged ascii-security main
           deb http://pkgmaster.devuan.org/merged ascii-backports main contrib non-free
           Active apt sources in file: /etc/apt/sources.list.d/ceres.list
           deb http://fr.deb.devuan.org/merged ceres main non-free contrib
Info:      Processes: 393 Uptime: 12:10 Memory: 19808.3/32183.9MB Client: Shell (bash) inxi: 2.3.5

The nvme is the Samsung SSD 970 EVO 500GB.

#7 2019-04-09 19:48:07

ChuangTzu
Member
Registered: 2018-06-13
Posts: 135  

Re: EXT4 BUG? Data loss when I have launched sigil :s

Any noises coming from the drive? Is it running hot? Are you able to open the case and see if it is full of dust, has loose connections, etc.? Is the fan working? Also note, just because it's a new drive does not mean it is not faulty.

Are you pulling packages from ceres into stable/ascii?

#8 2019-04-09 20:14:33

rolfie
Member
Registered: 2017-11-25
Posts: 133  

Re: EXT4 BUG? Data loss when I have launched sigil :s

@OP: Don't jump to conclusions too easily. I am 99.9999% sure ext4 isn't your problem.

Evidently you run ASCII with the kernel and MATE desktop from backports. That is similar to my setup except for MATE: I have an X470pro chipset and a Ryzen7, running from an NVMe drive with ext4 as the only file system.

Since the PC is new, make sure all power supply and SATA connections are safely seated. Is the NVMe screwed down OK?

Have you run an extended self-test with smartmontools? smartmontools should also indicate whether there are issues with dodgy SATA cables.

I would also look at Sigil. It's an HTML editor for ebooks, as far as I can tell from Wikipedia. Are you sure that the version you are using is free of bugs and that the HTML code you handle is free of strange/faulty code?

I think there are a lot of things that need to be checked, including your RAM. Faulty RAM can also cause data loss.

Good luck, Rolf

Last edited by rolfie (2019-04-21 19:12:19)

#9 2019-04-10 07:48:59

zero
Member
Registered: 2019-01-18
Posts: 10  

Re: EXT4 BUG? Data loss when I have launched sigil :s

Everything is "perfectly seated, and so on".

No strange noises, no dust (very clean inside), more than adequately ventilated. I repeat: according to all tools, smartmontools included, everything is OK (even after multiple long tests).

Yes, I use some Ceres packages (mainly for lxc).

I installed Sigil from the Devuan ASCII repository (0.9.7+dfsg-1). I had not used it before. Indeed, I switched from Debian to Devuan after ~15 years of mainly using sid. (I used to use FreeBSD too.)

I only tried to start Sigil for the first time on Devuan, but it complained that it could not run. As if it were unable to write anything to the NVMe drive at the moment the bug occurred?

I suspect, but do not claim, that EXT4 is the problem. Maybe the RAM is faulty, but that seems strange, as:

  • two different disks are affected

  • more than 100GB has been lost, while the system has at most 32GB of memory

To get corrupted, this data (files, folders, etc.) from these two disks would have had to be "randomly loaded into RAM" (say 30GB at most) more than three times (3*30 = 90GB), getting corrupted each time. How and why would it have been loaded into RAM?
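
The arithmetic behind this argument can be written out as a small sketch. It uses only the figures from the post (about 100GB lost, at most about 30GB of RAM usable at once) and the function name is made up for illustration. It covers file contents only; directory and inode metadata are much smaller and are not part of this estimate.

```python
import math

LOST_GB = 100        # "more than 100GB" reported lost
USABLE_RAM_GB = 30   # the post's "say 30GB at most"


def passes_needed(lost_gb, ram_gb):
    """Minimum number of full RAM loads needed to hold lost_gb of data."""
    return math.ceil(lost_gb / ram_gb)


# Three loads cover only 90GB, so a fourth would be required.
print(passes_needed(LOST_GB, USABLE_RAM_GB))
```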

Anyway, I will test the RAM as soon as I can.

Last edited by zero (2019-04-10 08:15:02)

#10 2019-04-10 10:18:57

cynwulf
Member
Registered: 2017-10-09
Posts: 234  

Re: EXT4 BUG? Data loss when I have launched sigil :s

ChuangTzu wrote:

definitely not ext4.

rolfie wrote:

@OP: Don't jump to conclusions very easily. To 99.9999% ext4 isn't your problem.

While I have to admit that it's unlikely to be an ext4 bug, that's not to say that ext4 is without bugs:

https://bugzilla.kernel.org/buglist.cgi … t=advanced

https://arstechnica.com/information-tec … nel-panel/

Ts'o doesn't see it as a major step forward. He dismisses it as a rehash of outdated "1970s technology" and describes it as a conservative short-term solution. He believes that the way forward is Oracle's open source Btrfs filesystem, which is designed to deliver significant improvements in scalability, reliability, and ease of management.

That was 10 years ago; ext4 is still around, but there is no ext5 and there is probably never going to be one.

It would be useful to see your /etc/fstab, to check how the partitions are mounted at boot time.

#11 2019-04-10 11:38:10

zero
Member
Registered: 2019-01-18
Posts: 10  

Re: EXT4 BUG? Data loss when I have launched sigil :s

I think it would be better to show the output of lsblk:

NAME            MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda               8:0    0 119.2G  0 disk  
├─sda1            8:1    0   1.9G  0 part  /boot/efi
├─sda2            8:2    0  41.9G  0 part  
├─sda3            8:3    0    10G  0 part  
│ └─devuan_root 254:0    0    10G  0 crypt /
├─sda4            8:4    0    14G  0 part  
├─sda5            8:5    0    14G  0 part  
│ └─devuan_var  254:2    0    14G  0 crypt /var
├─sda6            8:6    0   5.6G  0 part  
│ └─devuan_tmp  254:3    0   5.6G  0 crypt /tmp
└─sda7            8:7    0  31.9G  0 part  
  └─devuan_usr  254:1    0  31.9G  0 crypt /usr
sdb               8:16   0 931.5G  0 disk  
├─sdb1            8:17   0   922G  0 part  
│ └─sde1_crypt  254:5    0   922G  0 crypt /home/zero/public
├─sdb2            8:18   0     1K  0 part  
├─sdb5            8:21   0   7.7G  0 part  
└─sdb6            8:22   0   1.9G  0 part  
sdc               8:32   1   3.8G  0 disk  
└─sdc1            8:33   1   3.8G  0 part  /boot
nvme0n1         259:0    0 465.8G  0 disk  
└─nvme_crypt    254:4    0 465.8G  0 crypt /home

/dev/sda is the SAMSUNG_SSD_830 (no data loss)
/dev/sdb is the WDC_WD10EZEX (data loss)
/dev/nvme0n1 is the Samsung SSD 970 EVO 500GB (data loss)

#12 2019-04-10 15:46:31

cynwulf
Member
Registered: 2017-10-09
Posts: 234  

Re: EXT4 BUG? Data loss when I have launched sigil :s

/etc/fstab would be more useful to those attempting to offer advice, as it would show the mount parameters for the block devices and any vfs devices.  (/etc/crypttab may be of use as well, though I'm not at all knowledgeable in that area)
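
Besides /etc/fstab, the options actually in effect can be read from /proc/mounts, which uses the same field order (device, mount point, type, options). A minimal sketch, assuming Python is available; the function names are invented for this example:

```python
def parse_mounts(text, fstype="ext4"):
    """Return {mount point: [options]} for entries of the given type."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Fields: device, mount point, type, options, dump, pass
        if len(fields) < 4 or fields[2] != fstype:
            continue
        result[fields[1]] = fields[3].split(",")
    return result


def live_ext4_options(path="/proc/mounts"):
    """Read the live mount table and keep only the ext4 entries."""
    with open(path, encoding="utf-8") as handle:
        return parse_mounts(handle.read())
```

Calling live_ext4_options() on the affected machine would list, per mount point, the options the kernel is really using, which can differ from the "defaults" written in fstab.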

It's highly unlikely, though not impossible, that you have two failing devices. Is it more likely that cached data is not being written back for whatever reason, hence the loss? What the devices with data loss seem to have in common is that they contain encrypted partitions which are mounted as or under /home.

As an aside, you seem to have an encrypted /tmp on an SSD...?  Not sure why you'd want to wear out your SSD and not just use tmpfs(5)?

Last edited by cynwulf (2019-04-10 15:48:55)

#13 2019-04-10 19:05:22

zero
Member
Registered: 2019-01-18
Posts: 10  

Re: EXT4 BUG? Data loss when I have launched sigil :s

Well, here is the fstab (I have removed the UUIDs):

UUID= /boot           ext4    defaults        0       2

UUID=  /boot/efi       vfat    umask=0077      0       1

#/dev/mapper/devuan_root /               ext4    errors=remount-ro 0       1
UUID= /               ext4    errors=remount-ro 0       1

#/dev/mapper/devuan_usr /usr            ext4    defaults         0       2
UUID= /usr            ext4    defaults         0       2

#/dev/mapper/devuan_var /var            ext4    defaults         0       2
UUID= /var            ext4    defaults         0       2

#/dev/mapper/devuan_tmp /tmp            ext4    defaults         0       2
UUID= /tmp            ext4    defaults         0       2

#/dev/mapper/nvme_crypt  /home           ext4    defaults        0       2
UUID= /home           ext4    defaults        0       2

#/dev/mapper/sde1_crypt /home/zero/public ext4    defaults        0       2
UUID= /home/zero/public ext4    defaults        0       2

Indeed, the two "faulty drives" are mounted under /home, while the other drives are mostly "read-only".

As for wearing out my SSD: a new and better SSD like the Kingston A400 120GB costs ~20€, while the SAMSUNG SSD 830 cost me ~90€ more than 7 years ago.

As I recall, the TBW of the SAMSUNG SSD 830 120GB is around 30; unfortunately, it seems the spec sheet for this drive was among the vanished files that I have definitely lost :s.

After more than 7 years, my SAMSUNG SSD 830 displays ~11GB of total writes… So, theoretically, it should last 14 more years…

So, wearing out my SSD is not a real problem.
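
The estimate above can be reproduced with a short sketch. It assumes both figures are in TB (the post writes "GB" for total writes), takes the rated TBW of about 30 as recalled from memory in the post, and assumes the write rate stays what it has been so far. The function name is made up for illustration.

```python
def remaining_years(rated_tbw, written_tb, age_years):
    """Years left at the historical write rate before reaching rated_tbw."""
    yearly_writes = written_tb / age_years
    return (rated_tbw - written_tb) / yearly_writes


# "More than 7 years" is not exact, so try both 7 and 8 years of age.
for age in (7, 8):
    print(age, round(remaining_years(30, 11, age), 1))
```

Under these assumptions the result is roughly 12 more years for a drive exactly 7 years old and nearly 14 for one that is 8 years old, which is in line with the figure quoted above.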

Last edited by zero (2019-04-13 12:03:14)

#14 2019-04-21 18:43:48

zero
Member
Registered: 2019-01-18
Posts: 10  

Re: EXT4 BUG? Data loss when I have launched sigil :s

Well I have finally tested the RAM; all memtest86+ tests are OK (4/4 pass).
