<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
	<channel>
		<atom:link href="https://dev1galaxy.org/extern.php?action=feed&amp;tid=3788&amp;type=rss" rel="self" type="application/rss+xml" />
		<title><![CDATA[Dev1 Galaxy Forum / [SOLVED] [hardware] Initramfs mdadm assembly woes, udev not triggered?]]></title>
		<link>https://dev1galaxy.org/viewtopic.php?id=3788</link>
		<description><![CDATA[The most recent posts in [SOLVED] [hardware] Initramfs mdadm assembly woes, udev not triggered?.]]></description>
		<lastBuildDate>Sun, 06 Sep 2020 13:30:27 +0000</lastBuildDate>
		<generator>FluxBB</generator>
		<item>
			<title><![CDATA[Re: [SOLVED] [hardware] Initramfs mdadm assembly woes, udev not triggered?]]></title>
			<link>https://dev1galaxy.org/viewtopic.php?pid=24494#p24494</link>
			<description><![CDATA[<div class="quotebox"><cite>Marjorie wrote:</cite><blockquote><div><p>Horses for courses. If a drive is rated for 70C I&#039;d not want to load it to run at that continuously.</p></div></blockquote></div><p>I fully expect <em>any</em> drive to be able to operate at 100% load continuously, provided its external environmental limits are maintained.<br />It&#039;s over to the manufacturer to ensure that <em>internal</em> components don&#039;t overheat in said external environment.</p><p>The manual does not say &quot;never write more than 100GB/hr&quot;, nor does it say &quot;use only in LN2 cooled enclosures&quot;. It just boasts about transfer speeds, and those are mostly lies anyway. <br />A drive that overheats under load even when <em>more than adequately cooled</em> is a pile of junk, end of story. Even with a plastic case, those SSDs more than likely would have survived had Crucial spent the few cents to add a thermal transfer pad.</p><p>These were in a fan-cooled aluminium drive cage, and that cage was maintaining a ~25c environment. The case of the SSD itself couldn&#039;t have been much above that.<br />Other drives in the same cage, including an OCZ model from 2013 and possibly the cheapest Kingston ever made, have never exceeded 35c under exactly the same workload. </p><p>I wasn&#039;t writing to them at anywhere near the fictitious &quot;up to 500MB/s&quot; specification either, not even 1/4 that in fact.<br />All I was doing was moving 100GB-ish batches of ~5MB files to the drives and moving them off again. Not &quot;typical desktop usage&quot;, but certainly nothing I wouldn&#039;t expect any old drive to handle.</p><p>In short, it&#039;s Crucial&#039;s (complete lack of) internal thermal design that killed these, not my usage. <br />They&#039;re literally just a plastic box with a bare PCB rattling around inside. No screws, no heatsinking, nothing. 
Not even a bit of extra copper on the board for thermal mass.<br />I own USB3 <em>pen drives</em> that have thermal pads on the chips FFS.</p><div class="quotebox"><cite>Marjorie wrote:</cite><blockquote><div><p>Maybe I should add disk monitoring</p></div></blockquote></div><p>Smartd is good for many things, temperature included.</p>]]></description>
			<author><![CDATA[dummy@example.com (steve_v)]]></author>
			<pubDate>Sun, 06 Sep 2020 13:30:27 +0000</pubDate>
			<guid>https://dev1galaxy.org/viewtopic.php?pid=24494#p24494</guid>
		</item>
		<item>
			<title><![CDATA[Re: [SOLVED] [hardware] Initramfs mdadm assembly woes, udev not triggered?]]></title>
			<link>https://dev1galaxy.org/viewtopic.php?pid=24490#p24490</link>
			<description><![CDATA[<p>Horses for courses. If a drive is rated for 70C I&#039;d not want to load it to run at that continuously.<br />I have a BX500 myself in my email server but it&#039;s very lightly, if continuously, loaded and currently sits at 28C (max 31C). Hopefully it will continue to provide good service. Maybe I should add disk monitoring (currently I just monitor the CPU).</p>]]></description>
			<author><![CDATA[dummy@example.com (Marjorie)]]></author>
			<pubDate>Sun, 06 Sep 2020 11:34:25 +0000</pubDate>
			<guid>https://dev1galaxy.org/viewtopic.php?pid=24490#p24490</guid>
		</item>
		<item>
			<title><![CDATA[Re: [SOLVED] [hardware] Initramfs mdadm assembly woes, udev not triggered?]]></title>
			<link>https://dev1galaxy.org/viewtopic.php?pid=24483#p24483</link>
			<description><![CDATA[<div class="quotebox"><cite>brocashelm wrote:</cite><blockquote><div><p>I actually own a couple of Crucial MX500s, and they&#039;ve yet to disappoint me in the two years I&#039;ve been using them.</p></div></blockquote></div><p>MX is the mid-range line, and I haven&#039;t heard of any problems with them either.</p><div class="quotebox"><cite>brocashelm wrote:</cite><blockquote><div><p>Temperature spikes have never been an issue, and it&#039;s been a pretty bad summer in my area.</p></div></blockquote></div><p>This was no spike, this was pegging at 60-70c continuously, and it only took about 50GB of sustained sequential write to get there. <br />I probably got ~5x drive capacity of write endurance total before the first one croaked.</p><div class="quotebox"><cite>brocashelm wrote:</cite><blockquote><div><p>HP is a shit brand</p></div></blockquote></div><p>HP make pretty decent server-grade stuff, but their consumer products are total garbage, always have been.<br />That said, those SSDs were almost certainly OEM jobs, and good luck figuring out who really made them.</p><div class="quotebox"><cite>brocashelm wrote:</cite><blockquote><div><p>I might try out a Kingston SSD next time.</p></div></blockquote></div><p>I have several (including a pair being thrashed as cache drives), and I have absolutely no problems to report. <br />Even the DRAM-less budget models seem to be okay, if understandably slow as molasses. And yes, even the very cheapest have metal cases and thermal pads.<br />Just be aware that they&#039;re one of the (several) brands known to lie about performance; if stated speeds seem too good for the price, that&#039;s exactly what they are.</p>]]></description>
			<author><![CDATA[dummy@example.com (steve_v)]]></author>
			<pubDate>Sun, 06 Sep 2020 08:44:05 +0000</pubDate>
			<guid>https://dev1galaxy.org/viewtopic.php?pid=24483#p24483</guid>
		</item>
		<item>
			<title><![CDATA[Re: [SOLVED] [hardware] Initramfs mdadm assembly woes, udev not triggered?]]></title>
			<link>https://dev1galaxy.org/viewtopic.php?pid=24482#p24482</link>
			<description><![CDATA[<div class="quotebox"><cite>steve_v wrote:</cite><blockquote><div><p>[rant]<br />Moral of the story: The Crucial BX500 series SSDs are utter crap.</p><p>Of that pair of SSDs, one is dead as a doornail (doesn&#039;t register on the bus), the other is clearly dying. <br />Both are less than a year old and still under warranty, but I&#039;ll be throwing them in the trash rather than returning them - the last thing I want is more of the same.<br />Both have a history of hitting ~70c and thermal throttling under sustained write loads, despite the enclosure and the drives directly above them never breaking 26c. <br />The nasty plastic casing never even gets warm, and now that I&#039;ve dissected one I see somebody at Crucial thinks putting a thermal pad on the controller is a luxury. 70c is listed as max operating and they never exceeded it, but I&#039;ll bet a cookie heat is why they died.</p><p>Crucial, your budget SSD line gets a solid F from me. The Kingston A400 series is not only cheaper, it&#039;s also built properly and doesn&#039;t constantly try to cook itself.<br />I wanted budget SSDs for that filesystem, and expected budget performance. What I didn&#039;t expect was something so shitty it can&#039;t even sustain the already mediocre performance numbers without overheating. <br />FWIW, a BX500 reporting 70c writes at ~<strong>7</strong>MB/s.<br />[/rant]</p></div></blockquote></div><p>That&#039;s disappointing. I actually own a couple of Crucial MX500s, and they&#039;ve yet to disappoint me in the two years I&#039;ve been using them. Temperature spikes have never been an issue, and it&#039;s been a pretty bad summer in my area. The HP SSDs I&#039;ve owned, however, were already having multiple bad sectors within a few months of using them. Now, those are the cheaply produced SSDs (not surprised, since HP is a shit brand).</p><p>I might try out a Kingston SSD next time. 
I do use their HyperX Fury line for DDR3 memory, which is pretty nice and fast.</p>]]></description>
			<author><![CDATA[dummy@example.com (brocashelm)]]></author>
			<pubDate>Sun, 06 Sep 2020 07:50:54 +0000</pubDate>
			<guid>https://dev1galaxy.org/viewtopic.php?pid=24482#p24482</guid>
		</item>
		<item>
			<title><![CDATA[Re: [SOLVED] [hardware] Initramfs mdadm assembly woes, udev not triggered?]]></title>
			<link>https://dev1galaxy.org/viewtopic.php?pid=24478#p24478</link>
			<description><![CDATA[<p>Okay, I think we can close this one. It&#039;s hardware.</p><p>The counterpart to that failed drive was taking 27s to initialise and appear on the bus. It&#039;s certainly not ideal for the initrd to spam scary errors instead of backing off and retrying, but the initrd isn&#039;t at fault either.</p><p>[rant]<br />Moral of the story: The Crucial BX500 series SSDs are utter crap.</p><p>Of that pair of SSDs, one is dead as a doornail (doesn&#039;t register on the bus), the other is clearly dying. <br />Both are less than a year old and still under warranty, but I&#039;ll be throwing them in the trash rather than returning them - the last thing I want is more of the same.<br />Both have a history of hitting ~70c and thermal throttling under sustained write loads, despite the enclosure and the drives directly above them never breaking 26c. <br />The nasty plastic casing never even gets warm, and now that I&#039;ve dissected one I see somebody at Crucial thinks putting a thermal pad on the controller is a luxury. 70c is listed as max operating and they never exceeded it, but I&#039;ll bet a cookie heat is why they died.</p><p>Crucial, your budget SSD line gets a solid F from me. The Kingston A400 series is not only cheaper, it&#039;s also built properly and doesn&#039;t constantly try to cook itself.<br />I wanted budget SSDs for that filesystem, and expected budget performance. What I didn&#039;t expect was something so shitty it can&#039;t even sustain the already mediocre performance numbers without overheating. <br />FWIW, a BX500 reporting 70c writes at ~<strong>7</strong>MB/s.<br />[/rant]</p>]]></description>
			<author><![CDATA[dummy@example.com (steve_v)]]></author>
			<pubDate>Sun, 06 Sep 2020 01:56:33 +0000</pubDate>
			<guid>https://dev1galaxy.org/viewtopic.php?pid=24478#p24478</guid>
		</item>
		<item>
			<title><![CDATA[Re: [SOLVED] [hardware] Initramfs mdadm assembly woes, udev not triggered?]]></title>
			<link>https://dev1galaxy.org/viewtopic.php?pid=24400#p24400</link>
			<description><![CDATA[<div class="quotebox"><cite>fsmithred wrote:</cite><blockquote><div><p>You could try adding &#039;sleep 1&#039; to /etc/init.d/eudev as indicated</p></div></blockquote></div><p>I could, but all this is going on in the initrd before init starts or /etc/init.d is available...</p><p>I&#039;ll do the same to the initrd script that starts udevd and see what happens, but it&#039;ll have to wait a little as I can&#039;t really take this box offline right now.<br />It occurs to me that there were also some hardware changes which might have fallen in the &quot;I just noticed it&quot; window, and those could probably be reverted temporarily.</p><p>When I get a window long enough to risk borking the boot process and reverting backups without undue screaming, I&#039;ll have another poke about.</p>]]></description>
			<author><![CDATA[dummy@example.com (steve_v)]]></author>
			<pubDate>Mon, 31 Aug 2020 23:25:43 +0000</pubDate>
			<guid>https://dev1galaxy.org/viewtopic.php?pid=24400#p24400</guid>
		</item>
		<item>
			<title><![CDATA[Re: [SOLVED] [hardware] Initramfs mdadm assembly woes, udev not triggered?]]></title>
			<link>https://dev1galaxy.org/viewtopic.php?pid=24384#p24384</link>
			<description><![CDATA[<p>Maybe this bug: <a href="https://bugs.devuan.org/cgi/bugreport.cgi?bug=483" rel="nofollow">https://bugs.devuan.org/cgi/bugreport.cgi?bug=483</a></p><p>You could try adding &#039;sleep 1&#039; to /etc/init.d/eudev as indicated in message #10 or #20 or apply the patch in the last message.</p>]]></description>
			<author><![CDATA[dummy@example.com (fsmithred)]]></author>
			<pubDate>Sat, 29 Aug 2020 21:29:02 +0000</pubDate>
			<guid>https://dev1galaxy.org/viewtopic.php?pid=24384#p24384</guid>
		</item>
		<item>
			<title><![CDATA[[SOLVED] [hardware] Initramfs mdadm assembly woes, udev not triggered?]]></title>
			<link>https://dev1galaxy.org/viewtopic.php?pid=24381#p24381</link>
			<description><![CDATA[<p>At some point recently (I&#039;m not sure when, it&#039;s headless so I rarely see early boot messages), a machine of mine (beowulf) began spewing:</p><div class="codebox"><pre><code>mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: error opening /dev/md?*: No such file or directory
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.</code></pre></div><p>From the initrd during boot.</p><p>Curiously, the arrays appear to come up correctly, the root device (RAID1) is found and mounted, the system boots, and apart from those messages everything seems fine...</p><p>md0 : active raid1 sdl1[3] sdk1[2]&#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160;&lt;----- /boot<br />&#160; &#160; &#160; 2095040 blocks super 1.2 [2/2] [UU]<br />&#160; &#160; &#160; <br />md1 : active raid1 sdl3[3] sdk3[2]&#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160;&lt;----- /<br />&#160; &#160; &#160; 105775104 blocks super 1.2 [2/2] [UU]<br />&#160; &#160; &#160; <br />md2 : active raid1 sda[0]&#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &lt;----- Not currently mounted, pending disk replacement.<br />&#160; &#160; &#160; 117154240 blocks super 1.2 [2/1] [U_]&#160; <br />&#160; &#160; &#160; bitmap: 1/1 pages [4KB], 65536KB chunk</p><p>Array uuids all match as they should, I generated a new mdadm.conf and initramfs anyway, the problem persists...</p><p>Dropping into the initramfs shell with break=mount reveals something rather odd:<br />/proc/partitions is empty. No wonder mdadm can&#039;t find any devices.</p><p>A quick poke about in scripts/local-block/mdadm (booting with &quot;text&quot; on the command line reveals local-block as the source of the squealing):</p><div class="codebox"><pre class="vscroll"><code>#!/bin/sh

PREREQ=&quot;multipath&quot;

prereqs()
{
        echo &quot;$PREREQ&quot;
}

case $1 in
# get pre-requisites
prereqs)
        prereqs
        exit 0
        ;;
esac

. /scripts/functions

# Poor man&#039;s mdadm-last-resort@.timer
# That kicks in 2/3rds into the ROOTDELAY

if [ ! -f /run/count.mdadm.initrd ]
then
    COUNT=0

    # Unfortunately raid personalities can be registered _after_ block
    # devices have already been added, and their rules processed, try
    # triggering again.  See #830770
    udevadm trigger --action=add -s block || true
    wait_for_udev 10
else
    COUNT=$(cat /run/count.mdadm.initrd)
fi
COUNT=$((COUNT + 1))

echo $COUNT &gt; /run/count.mdadm.initrd

# Run pure assemble command, even though we default to incremental
# assembly it is supported for users to export variables via
# param.conf such as IMSM_NO_PLATFORM.  See #830300
mdadm -q --assemble --scan --no-degraded || true

MAX=30
if [ ${ROOTDELAY:-0} -gt $MAX ]; then
    MAX=$ROOTDELAY
fi
MAX=$((MAX*2/3))

if [ &quot;$COUNT&quot; = &quot;$MAX&quot; ]
then
    # Poor man&#039;s mdadm-last-resort@.service for incremental devices
    mdadm -q --run /dev/md?*

    # And last try for all others
    mdadm -q --assemble --scan --run

    rm -f /run/count.mdadm.initrd
fi

exit 0</code></pre></div><p>Sure enough, &#039;mdadm -q --assemble --scan --no-degraded&#039; and &#039;mdadm -q --run /dev/md?*&#039; spit out the same errors I&#039;m seeing in a normal (for some definition of normal) boot.<br />Running &#039;udevadm trigger --action=add -s block&#039; populates /proc/partitions, after which mdadm is happy and all is well.</p><p>I&#039;m not particularly familiar with the initramfs scripts, but it looks to me like that command should be run before mdadm tries to scan for devices? Right?</p><p>I did try rootdelay=10, but that makes no difference, and I&#039;m not getting anything useful from the &#039;net at large on this one. </p><p>Any idea what&#039;s going on here, or where I should be looking?</p>]]></description>
			<author><![CDATA[dummy@example.com (steve_v)]]></author>
			<pubDate>Sat, 29 Aug 2020 19:18:01 +0000</pubDate>
			<guid>https://dev1galaxy.org/viewtopic.php?pid=24381#p24381</guid>
		</item>
	</channel>
</rss>
