2024-04-19
19:03:18 <Guest19> nah, not worried about getting hacked on this basically single-purpose device; they can hack the guest account all they want, it gets wiped out on boot. Browser bloat is a problem for me though. But thanks for the browser suggestions guys, I'll check those out.
19:05:19 <n4dir> Guest19: netsurf is really very basic, but at least not a command-line-webbrowser. falkon, last time i checked, is a modern browser and works on all sites, but you don't really save a lot of resources. Worth a try though
19:16:27 <systemdlete> gnarface (and anyone else): I'm seeing intermittent problems with "apt update" from apt-cacher-ng server: e.g, W: Failed to fetch http://security.debian.org/debian[...] Connection failed [IP: 172.XX.YY.ZZ 3142]
19:18:13 <systemdlete> I have a log at https://dpaste.com/9GGU7RQRR
19:19:18 <systemdlete> I swear on a stack of Linux Manuals that I have not mucked (much) with the acng.conf file. I followed the directions gnarface offered me, and consulted the man pages (and some web pages also)
19:19:31 <systemdlete> https://unix.stackexchange.com/questions/623174/apt-cacher-ng-random-download-failures-with-apt-update-acgn
19:19:59 <systemdlete> I disabled ipv6 EVERYWHERE on my home network, but it did not help. (I was just following their suggestion)
19:20:37 <onefang> I use falkon when all my firefox-esr protective plugins mean some crappy web site won't work.
19:20:49 <systemdlete> Apparently, I am not the only one having this issue. It is intermittent, which makes it hard to figure out.
19:21:05 <systemdlete> I think my network is OK, since I DO get successful apt-get's
19:22:25 <systemdlete> And no other programs or applications are complaining about network issues (recently I mean).
19:22:46 <gnarface> systemdlete: mine hasn't made a peep lately
19:23:28 <systemdlete> gnarface, this seems to be most frequent when hitting a lot of packages at once, such as the dist-upgrade I am attempting on one system
19:24:15 <systemdlete> The first one (apt-get upgrade, after repointing the sources file to chimaera from beowulf) was 640 packages, the second one over 1000.
19:24:34 <systemdlete> But I've also seen it, a bit less often maybe, with small updates
19:24:44 <gnarface> hmm, i also haven't updated that much at once for a while, i suspect
19:24:45 <systemdlete> Overall, I am VERY pleased with the cacher!
19:25:12 <gnarface> i wonder if it's a problem where the cache can invalidate due to age while the update is still in progress?
19:25:13 <systemdlete> I'm wondering if I should try to throttle apt-cacher-ng somehow
19:25:24 <systemdlete> hmmm.
19:25:33 <systemdlete> how often are packages released to the repos?
19:25:45 <systemdlete> I mean, this happens several times within, say, an hour
19:26:23 <gnarface> i could only speculate... doesn't seem like quite the error i ever got, the only transient errors i ever got it was clearly complaining about some checksum mismatch, and it'd always just go away if i waited until the next cache invalidation
19:26:54 <gnarface> (or if i logged into the control panel and flushed it manually if i was feeling impatient)
19:26:59 <systemdlete> I have seen a lot of web pages re your issue (checksum mismatch), but I really haven't noticed that one myself here.
19:27:51 <gnarface> some other hidden environmental cause we must be missing
19:28:19 <systemdlete> It's really strange, because I was in the middle of the upgrade and dist-upgrade steps, and the process would start getting hit with these errors. Then, I simply restarted it as suggested on devuan's upgrade instructions page. Then, no problem with same packages, only moments later. But then others would/might fail, and rinse and repeat...
19:28:59 <gnarface> so, i wonder if it could just be something caused by the repos updating during your downloads
19:29:16 <gnarface> you said it was in the security section after all, which is what i'd expect to be changing most
19:29:18 <systemdlete> as you were saying, above
19:29:26 <systemdlete> no, that was an "e.g." ^^^^
19:29:41 <systemdlete> there are many different errors; that was just the general format
19:30:06 <systemdlete> (I'm trying hard to be as precise as possible here...)
19:30:39 <gnarface> is there any firewall between you and the apt-cacher-ng server? i wonder if it could just be an issue with it failing to hold enough connection states?
19:30:49 <systemdlete> I'm afraid that the actual messages I was seeing rolled off the bottom of my copy-paste buffer
19:31:01 <systemdlete> yes, there are a few
19:31:16 <systemdlete> but!
19:31:27 <systemdlete> this happens (sometimes) even on the cacher server itself!!!
19:31:58 <systemdlete> That server DOES have a firewall running, but I don't think that will matter for local connections, right?
19:32:32 <gnarface> depends on how you set it up
19:32:38 <systemdlete> that's why I am not very suspicious of the network, at least not for the cacher clients
19:32:52 <systemdlete> It's a basic gufw generated fw
19:33:16 <gnarface> but i know that if you're tracking state of inbound or outbound connections, there's a theoretical possibility of running up against a kernel memory limit
19:33:16 <onefang> We tell package mirrors to update every 30 minutes, but some updated less often.
19:33:26 <systemdlete> I've added a few pinholes for ports like the cacher's
19:33:59 <gnarface> like, if you end up tracking too many states at once you could start losing connections randomly
19:34:26 <gnarface> usually this is not a problem with default kernel builds unless you're doing something weird that is causing connections to stick around too long though
19:34:34 <systemdlete> ok, so maybe bumping up the limit might help?
19:34:52 <gnarface> sure, in theory
19:35:10 <gnarface> although often needing to is the sign of something else wrong
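A minimal Python sketch of how one might check whether connection tracking is anywhere near its limit, as speculated above. It assumes the nf_conntrack module is loaded and exposes nf_conntrack_count and nf_conntrack_max under /proc/sys/net/netfilter; the function name and the proc_dir parameter are illustrative and not part of any tool mentioned in the channel.

```python
from pathlib import Path

# Assumption: the kernel exposes conntrack counters here when nf_conntrack is loaded.
PROC_DIR = Path("/proc/sys/net/netfilter")


def conntrack_usage(proc_dir=PROC_DIR):
    """Return (tracked connections, table limit, fraction of the limit in use)."""
    count = int((proc_dir / "nf_conntrack_count").read_text())
    limit = int((proc_dir / "nf_conntrack_max").read_text())
    return count, limit, count / limit
```

Calling conntrack_usage() on the cacher host while an upgrade is running would show how full the table is. A fraction close to 1.0 would support the theory; a small one would argue against it. A FileNotFoundError would suggest connection tracking is not loaded at all.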
19:35:14 <systemdlete> I mean, this is overall rather minor. It's mainly a nuisance.
19:35:58 <systemdlete> just restarting the upgrade has always been enough to get it working again (so far)
19:36:32 <systemdlete> gnarface, did you look at the paste I put up?
19:37:00 <gnarface> systemdlete: no, and i doubt it will give me any insight but i will if you use paste.debian.net or just /msg it to me
19:37:18 <systemdlete> you don't trust dpaste.bin?
19:37:39 <gnarface> i just stopped adding new domains to my personal "trust" list years ago
19:37:53 <gnarface> it's not about any particular domain, it's about reducing attack surface
19:38:06 <gnarface> not just for me, but also for them
19:38:19 <gnarface> (i'm a mafia target and sometimes that gets innocents involved unnecessarily)
19:38:46 <systemdlete> https://paste.debian.net/hidden/779bf2a3/
19:38:55 <systemdlete> (sorry to hear that bad news)
19:39:30 <gnarface> hmm, this is an error i certainly never have seen
19:39:48 <gnarface> failed to move stale item out of the way... no such file or directory
19:39:59 <systemdlete> hmmm, let me look at inodes
19:40:09 <gnarface> scattered across several hours
19:40:33 <gnarface> so it can't move it out of the way because... it's gone already?
19:41:01 <gnarface> you don't have two apt-cacher-ng instances running on the same cache directory, do you?
19:41:02 <systemdlete> sneaky devils.
19:41:10 <systemdlete> uh, I don't think so...
19:41:39 <systemdlete> no, just one
19:41:40 <gnarface> double check, but this seems like quite a puzzle
19:41:56 <systemdlete> I have over 1.4M available inodes on /var
19:42:00 <gnarface> do you think disk I/O is really high at those points?
19:42:09 <systemdlete> good point...
19:42:15 <gnarface> i wonder if it could be some sort of race condition caused by disk bottlenecking
19:42:28 <gnarface> just guessing
19:42:32 <systemdlete> maybe, but wouldn't that be a problem on the upstream side also?
19:42:40 <gnarface> dunno
19:42:51 <gnarface> never seen this before but my cache server is not under load
19:43:19 <systemdlete> I mean, I'm only seeing the problem from the client's POV. On the other hand, that log file seems to indicate :I: and :O: which I am guessing is input and output
19:43:24 <gnarface> not only is it not under load but i actually just recently upgraded it to a hardware raid with two SSDs
19:43:58 <systemdlete> I'm still on mech disks. They are still a good value for the money here.
19:44:02 <systemdlete> but anyway
19:44:25 <gnarface> i ran the old one right into the ground, it died making grinding noises :D
19:45:26 <gnarface> i figured it was a good idea to have drives that were less sensitive to vibration
19:45:39 <gnarface> (my downstairs neighbors bang on the walls a lot)
19:46:05 <systemdlete> Some of these drives (2TB Seagate and 2TB WD) are up to 3 years old, maybe older. Since I started leaving my PCs on 24x7, the frequency of hardware replacements has gone waaaaay down (MB, disks, etc). So I've become a believer.
19:46:39 <systemdlete> well, that can't be the only annoying thing arising from that problem
19:48:47 <gnarface> well, if it were the drives you'd see i/o errors in dmesg around the same time
19:48:59 <gnarface> but i'm suspecting this is not hardware related
19:49:07 <systemdlete> most of the clients are VMs
19:49:17 <gnarface> it looks like a race condition but i can't imagine why it'd be happening
19:49:46 <systemdlete> and I have a thruk installation here so I will find out fairly quickly if there are problems.
19:50:07 <systemdlete> and I also check, periodically, the host log files for hw errors
19:53:45 <gnarface> how many VMs you usually run the updates with at a time?
19:54:12 <gnarface> i'm rarely updating more than 2 things at once, i wonder if you're seeing something i'm not seeing just because i'm not throwing 1000 clients at it at once
19:54:37 <systemdlete> only one running apt upgrade at a time, otherwise the whole network crawls. However, there can be more than one running apt update at a time
19:55:14 <systemdlete> well, as I said, even casual upgrades of just a few packages can see this happen... iirc
19:57:54 <systemdlete> when I upgrade from chimaera to daedalus, I will try to monitor the cacher server with htop
19:58:41 <systemdlete> (I'm upgrading beowulf -> chimaera -> daedalus because skip upgrades are not officially supported)
19:59:15 <systemdlete> the chimaera upgrade is almost complete. It is unpacking and installing the dist-upgrade part right now.
19:59:58 <systemdlete> So if all goes well, and the resulting chimaera doesn't have problems for a day or so, I'll do the daedalus upgrade.
20:13:52 <systemdlete> gnarface, how do I msg you a file? I have it gzip'd down to about .5MB (I cannot paste it to deb paste bin)
20:14:14 <systemdlete> this file is the actual log, whereas what I pasted was the error log
20:15:06 <gnarface> systemdlete: oh, i just meant literally paste it and just wait for the flood protect throttling to let it finish going through
20:15:17 <systemdlete> oh...
20:15:23 <systemdlete> it's 8MB
20:15:38 <gnarface> is all 8MB relevant?
20:16:17 <gnarface> i mean, large pastes will take a while but it also will all get here, but i'm not sure if 8MB will actually fit in your clipboard
20:16:23 <systemdlete> of course not
20:16:27 * systemdlete duh
20:16:34 <systemdlete> I can snip it
20:19:18 <systemdlete> last 1000 lines?
20:19:40 <systemdlete> it's about 100k
20:19:56 <gnarface> i think it'll work, just say something when it's done
20:20:12 <systemdlete> how can I paste it?
20:20:21 <systemdlete> it might end up here in this channel!
20:20:31 <systemdlete> wait
20:20:36 <systemdlete> let me try pasting this smaller file
20:20:45 <gnarface> i just sent you 1 line, do you see that open in a new tab on your IRC client?
20:21:21 <gnarface> it would just be: /msg gnarface [paste]
20:21:33 <gnarface> but if your client lets you reply to the message i just sent, that might be easier
20:21:36 <systemdlete> here: https://paste.debian.net/hidden/e7acf4f1/
20:22:10 <systemdlete> (yes, I see the tab, but I think the paste is easier)
20:22:29 <gnarface> 1713568491|E|108923|192.168.40.41|devrep/merged/pool/DEBIAN/main/p/python-pip/python-pip-whl_20.3.4-4+deb11u1_all.deb [HTTP error, code: 502]
20:22:38 <systemdlete> yeah. Lots of errors
20:22:49 <gnarface> 502 though
20:23:30 <gnarface> lemme double check, but i think 502 means remote server replied "aak, i'm misconfigured!"
20:23:46 <systemdlete> I just looked it up
20:23:59 <gnarface> i only see a couple 502 errors... the rest looks completely normal
20:24:14 <systemdlete> it means gateway interference, which is hard to believe when I'm getting the errors on the cacher server itself!
20:24:24 <gnarface> bad gateway yea
20:24:44 <systemdlete> unless
20:24:48 <gnarface> wait, but is that the cache server? or is that actually the debian mirror responding?
20:24:51 <systemdlete> unless it is on the upstream side?
20:24:56 <gnarface> that's what i was thinking
20:25:00 <systemdlete> yeah
20:25:21 <gnarface> might be a problem with a mirror that's just rippling down into your cache
20:25:27 <systemdlete> now, between the cacher and the Internet, there be some gateways, routers, firewalls...
20:26:04 <gnarface> hmm, but i think 502 means it had to actually get some sort of reply though...
20:26:06 <systemdlete> but as I mentioned, I don't have many network issues here (other than stupid things I caused, which are now corrected, to the best of my knowledge)
20:26:17 <onefang> If it's a mirror, please try to identify which one.
20:26:48 <gnarface> yea, see if it always happens while apt-cacher-ng is hitting the same remote mirror
20:26:58 <gnarface> it might be all their fault
20:27:01 <systemdlete> how can I tell?
20:27:15 <gnarface> first thing that comes to mind is tail the log file while watching the raw network traffic
20:27:16 <onefang> And if it's all their fault, I might have to do something.
20:27:18 <systemdlete> the log file (I just pasted) doesn't give that info
20:27:28 <systemdlete> onefang
20:27:40 <onefang> Find the IP of that mirror.
20:27:41 <systemdlete> onefang: don't fret, this happens to a lot of people outside devuan
20:27:48 <gnarface> yea, you'd just have to be tailing it and watching the network traffic at the same time, then see which remote ip is in the tcpdump at the same time that error hits the logs
20:28:04 <systemdlete> tail which log file?
20:28:10 <gnarface> the one you just pasted
20:28:14 <systemdlete> (i have a wide assortment of them)
20:28:28 <systemdlete> but the log doesn't show the upstream, does it? (Maybe I missed that)
20:29:09 <gnarface> no, but my guess is the error will show up in that log file within milliseconds of the actual remote connection being made, so if you're just running tcpdump at the same time you'll see the corresponding IP
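To line log errors up with a packet capture as suggested above, it helps to pull out just the 502 entries and turn their epoch timestamps into readable times. The following is a rough Python sketch. The field layout (epoch|type|number|client|path) is an assumption inferred only from the single log line quoted earlier in the channel, and the function names are made up for illustration.

```python
import re
from datetime import datetime, timezone

# Assumed layout, inferred from one quoted line only:
#   epoch|type|number|client_ip|path [HTTP error, code: NNN]
ERROR_RE = re.compile(r"\[HTTP error, code: (\d{3})\]\s*$")


def parse_error_line(line):
    """Return a dict for an 'E' record carrying an HTTP error code, else None."""
    parts = line.rstrip("\n").split("|", 4)
    if len(parts) != 5 or parts[1] != "E":
        return None
    match = ERROR_RE.search(parts[4])
    if match is None:
        return None
    return {
        "time": datetime.fromtimestamp(int(parts[0]), tz=timezone.utc),
        "client": parts[3],
        "path": parts[4][:match.start()].rstrip(),
        "status": int(match.group(1)),
    }


def find_status(lines, status=502):
    """Collect parsed error records whose HTTP status equals `status`."""
    hits = []
    for line in lines:
        record = parse_error_line(line)
        if record is not None and record["status"] == status:
            hits.append(record)
    return hits
```

Feeding the log file's lines to find_status() gives a list of times (in UTC), clients and paths to compare against the timestamps in a capture taken on the cacher host.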
20:29:21 <systemdlete> ah
20:29:23 <systemdlete> tcpdump
20:29:33 <gnarface> in fact, with the right filter you would be able to even see the 502 error in the response
20:29:43 <gnarface> then you'd be sure which side of the network it was coming from
20:29:48 <systemdlete> drat.
20:30:04 <gnarface> actually, what am i thinking? you could certainly filter tcpdump output just for http errors
20:30:04 <systemdlete> I forgot to obfuscate my network addresses in that paste
20:30:22 <systemdlete> but probably not an issue since my network is behind a residential firewall anyway
20:30:26 <gnarface> i noticed, but they're all private anyway
20:30:32 <systemdlete> right
20:31:48 <systemdlete> now, why would it be that apt is able to round-robin the repo servers, but apt-cacher has a problem doing the same--don't they both use the same code, ultimately?
20:32:05 <gnarface> uh, i don't know that they do
20:32:29 <gnarface> and apt may just be coded to retry without complaint
20:32:33 <onefang> Maybe not. I know debootstrap doesn't use apt, but mmdebstrap does.
20:32:45 <systemdlete> ok
20:33:10 <gnarface> but with tcpdump you should be able with some fiddling to isolate the source of these 502 errors explicitly
20:33:35 <gnarface> there might be a simpler way that i'm not thinking of, but tcpdump is definitely up to this task
20:33:36 <systemdlete> right. I will do that
20:34:12 <onefang> And I definitely know that apt-panopticon doesn't use apt directly, but that's a different case. It's testing every step of the apt process on the package mirrors.
20:34:27 <gnarface> if you find it, the payload might even actually make the cause obvious
20:35:44 <systemdlete> tcpdump port https
20:36:00 <systemdlete> (and src host...)
20:37:03 <systemdlete> I wonder if wireshark might be easier in order to easily examine the packets
20:37:48 <systemdlete> but either way, it looks like it should be easy enough to gather the stream
20:42:17 <gnarface> wireshark might be easier but i have less familiarity with it
20:42:41 <gnarface> i'm not exactly a tcpdump pro but when i learned it there weren't alternatives
20:43:27 <gnarface> all you should have to do is filter for http 502 error headers, https might sabotage that though
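The idea of filtering for 502 response headers can be sketched in Python as a check on a raw TCP payload. This is only an illustration: it assumes the payload bytes have already been captured by some other means (tcpdump, wireshark), that the response is plain HTTP/1.x, and the function names are hypothetical. As noted above, an HTTPS payload is encrypted and would never match.

```python
import re

# Matches the status line at the very start of a plain-text HTTP/1.x response.
STATUS_LINE = re.compile(rb"HTTP/1\.[01] (\d{3}) ")


def http_status(payload):
    """Return the HTTP status code if `payload` starts with a status line, else None."""
    match = STATUS_LINE.match(payload)
    return int(match.group(1)) if match else None


def is_bad_gateway(payload):
    """True when the payload is the start of an HTTP 502 response."""
    return http_status(payload) == 502
```

Whichever packet carries a payload for which is_bad_gateway() returns True also carries the source IP of the machine that generated the 502, which is the piece of information being looked for here.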
20:43:55 <systemdlete> I just realized something. When I looked at https://unix.stackexchange.com/questions/623174/apt-cacher-ng-random-download-failures-with-apt-update-acgn, I failed to distinguish what they meant by disabling ipv6.
20:44:15 <systemdlete> They meant, in the acng.conf file! Not the actual network interfaces. (though no harm there)
20:45:05 <gnarface> hmm, i do also have ipv6 disabled everywhere afaik, though i don't see any particular indication that's what's causing this problem
20:45:06 <systemdlete> the cacher can be configured to only listen to ipv4 (or ipv6) as desired. I missed that, and I think I will try that before starting the next step of the upgrade to daedalus
20:46:44 <systemdlete> Yeah, I agree. And besides, it really OUGHT to work for ipv6-enabled networks.
20:47:09 <systemdlete> It's really sad if those networks' users are deprived of this functionality
20:47:13 <gnarface> yea, but there could be an issue where just one mirror in the round-robin is missing the ipv6 dns entries or something weird like that
20:47:50 <gnarface> we've seen stuff like that cause problems before
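One way to test the "missing ipv6 dns entries" theory is to ask the resolver separately for IPv4 and IPv6 addresses of a mirror hostname. A small Python sketch follows, using the standard socket.getaddrinfo call. The function name and the resolver parameter are illustrative, and which hostname to query is left to the reader (whatever appears in sources.list or the cacher's backend configuration).

```python
import socket


def addresses_by_family(host, resolver=socket.getaddrinfo):
    """Return {'ipv4': [...], 'ipv6': [...]} with the addresses found for `host`."""
    result = {}
    for label, family in (("ipv4", socket.AF_INET), ("ipv6", socket.AF_INET6)):
        try:
            infos = resolver(host, None, family, socket.SOCK_STREAM)
        except socket.gaierror:
            # No records of this family (or the lookup failed outright).
            infos = []
        # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
        result[label] = sorted({info[4][0] for info in infos})
    return result
```

An empty 'ipv6' list alongside a populated 'ipv4' list only shows what the local resolver returned at that moment. With round-robin DNS the answer can differ between lookups, so repeating the query a few times would be sensible.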
20:48:06 <systemdlete> gnarface, some of the devuan mirrors are supported by users like us, right? I mean, some might be on big hardware in a DC somewhere, but maybe not all, and some might not be correctly configured for ipv6?
20:48:35 <systemdlete> ok
20:48:39 <gnarface> it has come up before
20:48:47 <gnarface> as have https issues
20:48:56 <systemdlete> not meaning to repeat you, just trying to get clear on your meaning
20:49:13 <onefang> That's why if there's some mirror issue, I want to know the IP of the errant mirror. Especially if it's something apt-panopticon isn't finding.
20:49:53 <systemdlete> onefang, the "errant mirror"-- do you mean a debian mirror, or only devuan?
20:50:30 <gnarface> i think in theory it could be either
20:50:34 <systemdlete> because debian users are hitting this too, if my survey of ddg hits on this topic is an indicator
20:50:43 <onefang> Could be either, but I'm only in charge of Devuan package mirrors, though if it's a Debian mirror problem, that's good to know as well.
20:50:44 <gnarface> though obviously scrutiny will be on the ones we can do something about first...
20:50:56 <systemdlete> as well as people using the cacher for openwrt (which I do as well)
20:51:14 <gnarface> is openwrt also a deb-based distro?
20:51:19 <systemdlete> no!
20:51:39 <systemdlete> I think it is freebsd, but I always forget. It's one of the BSDs
20:52:38 <onefang> There's at least one package mirror running from someones home server. I even once had an offer for a home based mirror server running over that PsaceX satellite network. lol
20:52:49 <systemdlete> I know they will be moving to pkg in the future; they're already migrating some of their tools that way. But for now at least, apt-cacher-ng works for openwrt
20:53:51 <onefang> How did I manage to typo SpaceX twice in a row, once while trying to fix the first typo. lol
20:54:14 <systemdlete> you must have borrowed my fingers for a moment...
20:54:35 <onefang> You can have them back. Keep 'em.
20:54:36 <systemdlete> not your fault. You expected mine to work correctly I think.
20:54:40 <systemdlete> lol
20:55:12 <systemdlete> I've typo'd here about 4 or 5 times today already
20:56:04 <systemdlete> gnarface: yeah, I followed the directions at openwrt.org wiki to set up the cacher for openwrt packages.
20:57:46 <systemdlete> It's great with openwrt. I have several routers, and by upgrading them in the order so that my gateway is last, by that time, all the packages are cached for that upgrade, and I don't have to do any special pre-downloading (gateway needs some firmware and other stuff not available in the release iso)
20:58:27 <systemdlete> so I'm really indebted to you for having me set this up. Even with this annoying issue...
20:59:07 <onefang> gnarface is always very helpful, we should all thank them.
20:59:22 <systemdlete> So onefang and gnarface, I will do my utmost to track down that repo for you.
20:59:27 <systemdlete> (yes, they are!!!)
20:59:58 <systemdlete> and you are too, and so are the rest of the folks here
21:08:11 <systemdlete> The 502 Bad Gateway error is an HTTP status code that occurs when a server acting as a gateway or proxy receives an invalid or faulty response from another server in the communication chain. This error indicates a problem with the communication between the involved servers and can result in disruption of internet services. Wikipedia
21:08:31 <systemdlete> That sounds exactly like what we are thinking here.
21:08:51 <systemdlete> So it is almost definitely an upstream-side issue
21:27:49 <gnarface> systemdlete: no problem, i hope you can identify it
21:28:22 <gnarface> i don't know if apt-cacher-ng can be made to add the remote server IPs to the log files or not
21:29:00 <gnarface> might be worth looking into, but packet sniffing will work, albeit probably quite tediously
21:29:48 <systemdlete> I just made that change to only listen on ipv4. But I don't see why that will make any difference, that is, if the problem is almost certainly on the upstream side.
21:30:02 <systemdlete> But no one can say I didn't at least try it.
21:30:12 <gnarface> time will tell
21:30:16 <systemdlete> this upgrade is taking all day, literally.
21:30:37 <gnarface> heh, yea that's expected
21:30:58 <gnarface> i did upgrade a full kde desktop, beowulf->chimaera->daedalus recently, i think something like 18 hours?
21:31:45 <gnarface> i might have been sleeping for part of that, but it took a long time either way
21:31:53 <onefang> My upgrade to daedalus has been ongoing since last year. This time I want to write configs from scratch instead of just copying them across, and I'm doing a shit load of testing. Once it's done testing, I'll roll it out to all my Linux computers.
21:32:28 <onefang> I'm also using my own script for the system building.
21:33:53 <systemdlete> same here
21:34:13 <systemdlete> still in development, just the same way. I've used it a few times, but it needs work...
21:35:10 <onefang> That's the other reason mine's taking so long, I'm doing major surgery to the scripts. Not to mention my crappy life keeping me busy this last year. lol
21:35:17 <systemdlete> It would have been more expeditious, I think, to simply clone a new daedalus VM from a template VM I created months ago and just update/upgrade to current package levels and do restores to the home areas
21:35:57 <systemdlete> and add in all the programs and configuration from restores... ay uh
21:36:07 <systemdlete> maybe not better, idk
21:36:48 <systemdlete> I really have done a lot of customization on this system I'm upgrading. So maybe this will be worth it.
21:46:55 <systemdlete> https://askubuntu.com/questions/119298/apt-get-using-apt-cacher-ng-fails-to-fetch-packages-with-hash-sum-mismatch#answer-431764
21:47:31 <systemdlete> (of course, that is a 10-year-old answer)
21:47:43 <AlexLikeRock> hahahahhahah
21:47:57 <AlexLikeRock> nice nick " systemdlete "
21:47:59 <AlexLikeRock> hahhahahaha
21:48:36 <systemdlete> def systemdlete: "systemd: delete from all systems, immediately!"
21:49:06 <systemdlete> (i.e., nothing to do with "lete" or "l3t3" or whatever meme it is...)
21:49:10 <AlexLikeRock> yeh!!
21:49:11 <AlexLikeRock> i know
21:49:15 <AlexLikeRock> jhahahah
21:49:25 <systemdlete> thank you AlexLikeRock
21:49:57 <AlexLikeRock> nice
---------- 2024-04-20 ----------
00:27:42 <systemdlete> gnarface, onefang: oopsies. Looks like some hard drive errors on host as it turns out. I hadn't been alerted about them, and I could have sworn I had thruk set up to alert me upon hardware problems.
00:28:23 <systemdlete> So I might have just wasted your time, but I am not 100% sure about that. None of the articles I read about the cacher problem indicated a hard disk error.
00:48:37 <onefang> No problem for me, I'm in weekend mode, so I was mostly letting you and gnarface work on it. Was waiting to see if you had found an actual broken mirror for me to sort out.
00:50:17 <systemdlete> onefang, that could take some time. I am still trying to finish the upgrade to chimaera and that has taken ALL DAY. So it may be some time before I can explore that further. For one thing, I need to get my kernel logging configured to make thruk alert me for hard disk problems (which I thought I had already done...)
00:50:37 <systemdlete> I'll be focusing on that to avoid more problems going forward.
00:50:50 <systemdlete> But I will definitely be back here to let you know what I find out.
00:50:54 <onefang> Fair enough.
00:50:58 <systemdlete> I promise you.
00:51:55 <systemdlete> I'm sort of hoping, though, that these hard disk errors are the smoking gun. The first of the errors seems to have begun around April 14 (PST time), and that was about the time I began to notice the errors, but I did not enter them into my journal here...
00:52:48 <systemdlete> I'm suspicous of them but not really sure.
00:52:54 <systemdlete> *suspicious
01:00:19 <CueXXIII> systemdlete: anything in smartctl of that harddisk? otherwise it might be bad cabling producing those errors
01:43:43 <systemdlete> CueXXIII, not sure yet. I'm juggling a few things...
01:49:26 <systemdlete> I did look at the cell values on that harddisk and did not see anything that looked like hard errors. I only see evidence in the kern.log
01:51:01 <systemdlete> I will take the system down in a while, just as soon as my disks resync (RAID1). The system has been up 46 straight days, so maybe it is "tired"...
01:51:43 <systemdlete> I'll remove the bad disk and test it on a test box. I have some spares (good thinking, systemdlete, for once).
01:52:53 <systemdlete> I usually run badblocks for a day or two to see if it errors. Since the test box has its own cables, that will help to eliminate that possibility.
02:46:17 <gnarface> systemdlete: which filesystem are you using on these?
02:50:51 <systemdlete> ext4, everywhere
02:51:25 <systemdlete> ext4 on VMs and hosts
02:51:45 <systemdlete> I've tried some of the more exciting stuff, I think xfs? or something like that
02:52:01 <systemdlete> but I had problems with it, but years ago, so I should prob try again
02:52:38 <systemdlete> oh, no. it was btrfs
02:53:04 <systemdlete> not xfs. I don't think I've ever tried that one
02:55:34 <gnarface> hmm, well being not btrfs or anything similarly experimental, i dunno, but ext4 did have one bad corruption bug that was affecting upgrades a few releases back, only i thought that was before beowulf
02:55:56 <gnarface> (and it did manifest itself exactly like a physical drive failure)
02:56:08 <systemdlete> right
02:56:17 <systemdlete> I vaguely recall that...
02:56:39 <gnarface> the issue was to do with using older e2fsprogs with newer kernels or something like that
02:57:00 <systemdlete> yeah, I think that's right
02:57:16 <systemdlete> I am pretty sure I stumbled into that one at some point.
02:58:09 <systemdlete> Thing is, before I started this upgrade, I cloned the VM, and upon boot I have all filesystems set up to fsck every time (I don't reboot any systems much).
02:59:03 <systemdlete> So I believe I have a clean file system to start with. But still, if the actual hardware backing the virtual FS has actual hard errors, then maybe there are still issues.
02:59:27 <systemdlete> I've been carefully checking the kern.log's on both VM and host for any suspicious errors
02:59:40 <systemdlete> since I noticed it earlier
04:43:10 <systemdlete> ok now what am I doing wrong? I tried upgrading rsyslog on a beowulf system with the backports so I could get a more recent version. But there is no /etc/init.d/rsyslog
04:43:19 <systemdlete> I checked the package; it says it is included
04:44:07 <systemdlete> maybe a trigger is not running? I forget the details of how that happens...
04:52:49 <fsmithred> what does 'apt policy rsyslog' tell you?
04:53:12 <systemdlete> I removed the upgrade and tried re-installing the package from the regular repo, but same thing happens
04:53:20 <fsmithred> I just tried to install rsyslog from beowulf-backports and apt tells me that i already have the newest version
04:53:41 <fsmithred> but apt policy tells me that version is in beowulf. No rsyslog in backports.
04:53:58 <fsmithred> want mine?
04:54:37 <systemdlete> apt policy shows the correct versions, and the one I have installed has a star
04:55:01 <systemdlete> sorry, I meant chimaera backports, not beowulf
04:55:14 <systemdlete> (I'm exhausted from all of this today...)
04:55:25 <onefang> Go and rest.
04:55:49 <systemdlete> but it doesn't matter; the rsyslog script does not get installed
04:56:00 <systemdlete> either version
04:56:24 <systemdlete> trouble is, I now have NO rsyslog running on that system!
04:56:41 <systemdlete> I can launch it manually
04:56:45 <systemdlete> but this is sick
04:59:32 <rrq> I haven't read backlog, but which script are you talking about?
04:59:38 <gnarface> hmm, did you debootstrap? i seem to recall a problem with rsyslog and debootstrap at some point
05:00:28 <gnarface> i think it was in ascii though, and maybe only on arm64
05:01:09 <systemdlete> no, nothing to do with installing the system. The version of rsyslog I had was 2102, and I wanted the 23.02 version.
05:01:09 <gnarface> yea, according to my notes i had to exclude rsyslog and udev and include syslog-ng instead to successfully debootstrap ascii on arm64
05:01:24 <systemdlete> ascii... that's long time ago.
05:01:31 <systemdlete> I don't have any ascii systems here
05:01:33 <gnarface> never had any other problems with rsyslog that i can recall
05:01:36 <systemdlete> (or jessie)
05:02:04 <gnarface> does it work if you just reinstall it?
05:02:07 <systemdlete> I removed all rsyslog versions from the apt-cacher and I'll see if this works
05:02:14 <systemdlete> yeah, tried re-installing
05:02:33 <systemdlete> I'm going to see if maybe it will cache a fresh copy from the repos
06:14:59 <systemdlete> well reboot did not help, even shutting it down completely and restarting it "cold" (and swapping out the bad drive for a new one)
06:15:39 <systemdlete> then I tried to apt remove rsyslog and apt install rsyslog, but for whatever reason, it does not install the /etc/init.d/rsyslog script
06:16:02 <systemdlete> I also note that the archive does not contain the deb file for rsyslog
06:16:30 <systemdlete> (I am using the cacher, but I didn't figure that would affect the client as far as grabbing a copy of the deb file)
06:16:36 <systemdlete> (but what do I know, really?)
06:17:03 <gnarface> wait maybe that's because it's in beowulf-security now? or chimaera-security or whichever you're on?
06:17:15 <gnarface> make sure your sources.list is complete
06:17:51 <gnarface> if you have to, you can always switch to syslog-ng but i was pretty sure rsyslog worked...
06:18:32 <gnarface> i would be tempted to examine the preinst/postinst scripts
06:19:15 <gnarface> but rsyslog is definitely working on my beowulf systems
06:19:32 <systemdlete> chimaera here
06:19:48 <systemdlete> this is a different sea of trouble than the one we were dealing with before.
06:22:42 <systemdlete> and rsyslog was installed just fine previously. What happened was that I was running into some odd error messages from rsyslogd and I thought maybe upgrading to a more recent version could correct that (very hopeful thinking here)
06:23:02 <systemdlete> but after doing the upgrade to chimaera-backports
06:23:23 <systemdlete> I noticed that the script was missing (and maybe other stuff, idk)
06:24:28 <systemdlete> this is what happens when you got up early the day before your birthday, ran into technical problems, stayed up all night trying to fix them, and kept running into more problems...
06:25:03 <systemdlete> so I am totally exhausted, but I won't be able to sleep wondering about this.
06:30:46 <systemdlete> gnarface, should I expect to see a deb file for rsyslog in the client's cache directory?
06:31:11 <systemdlete> or does using the cacher cause different behavior?
06:31:22 <systemdlete> (I never thought to check this)
06:34:29 <systemdlete> I don't have the https_proxy set in my env???!!!
06:40:44 <systemdlete> or maybe that was only for openwrt clients of the cacher...
07:22:35 <gnarface> systemdlete: yes, it should still use /var/cache/apt/archives normally, if that's what you're asking
07:22:52 <gnarface> apt-cacher-ng shouldn't affect that
07:24:11 <gnarface> i'm setting the proxy in /etc/apt/apt.conf.d/
07:24:36 <gnarface> only using it for apt
07:46:46 <systemdlete> but I'm NOT seeing it there
09:36:21 <CueXXIII> oh dang, american holiday today… it's 4 20…
09:36:30 <CueXXIII> ah, wrong tab -.-
14:33:52 <Harzilein> hi :)
14:36:40 <Harzilein> jaromil: grats for getting a new dyne release out of the door. i think that should showcase devuan very nicely.