2024-04-19    
09:03:18 <Guest19> nah, not worried about getting hacked on this basically single-purpose device; they can hack the guest account all they want, it gets wiped out on boot. Browser bloat is a problem for me though. But thanks for the browser suggestions guys, I'll check those out.
09:05:19 <n4dir> Guest19: netsurf is really very basic, but at least not a command-line web browser. falkon, last time i checked, is a modern browser and works on all sites, but you don't really save a lot of resources. Worth a try though
09:16:27 <systemdlete> gnarface (and anyone else): I'm seeing intermittent problems with "apt update" from apt-cacher-ng server, e.g., W: Failed to fetch http://security.debian.org/debian[...] Connection failed [IP: 172.XX.YY.ZZ 3142]
09:18:13 <systemdlete> I have a log at https://dpaste.com/9GGU7RQRR
09:19:18 <systemdlete> I swear on a stack of Linux Manuals that I have not mucked (much) with the acng.conf file. I followed the directions gnarface offered me, and consulted the man pages (and some web pages also)
09:19:31 <systemdlete> https://unix.stackexchange.com/questions/623174/apt-cacher-ng-random-download-failures-with-apt-update-acgn
09:19:59 <systemdlete> I disabled ipv6 EVERYWHERE on my home network, but it did not help. (I was just following their suggestion)
09:20:37 <onefang> I use falkon when all my firefox-esr protective plugins mean some crappy web site won't work.
09:20:49 <systemdlete> Apparently, I am not the only one having this issue. It is intermittent, which makes it hard to figure out.
09:21:05 <systemdlete> I think my network is OK, since I DO get successful apt-get's
09:22:25 <systemdlete> And no other programs or applications are complaining about network issues (recently I mean).
09:22:46 <gnarface> systemdlete: mine hasn't made a peep lately
09:23:28 <systemdlete> gnarface, this seems to be most frequent when hitting a lot of packages at once, such as the dist-upgrade I am attempting on one system
09:24:15 <systemdlete> The first one (apt-get upgrade, after repointing the sources file to chimaera from beowulf) was 640 packages, the second one over 1000.
09:24:34 <systemdlete> But I've also seen it, a bit less often maybe, with small updates
09:24:44 <gnarface> hmm, i also haven't updated that much at once for a while, i suspect
09:24:45 <systemdlete> Overall, I am VERY pleased with the cacher!
09:25:12 <gnarface> i wonder if it's a problem where the cache can invalidate due to age while the update is still in progress?
09:25:13 <systemdlete> I'm wondering if I should try to throttle apt-cacher-ng somehow
09:25:24 <systemdlete> hmmm.
09:25:33 <systemdlete> how often are packages released to the repos?
09:25:45 <systemdlete> I mean, this happens several times within, say, an hour
09:26:23 <gnarface> i could only speculate... doesn't seem like quite the error i ever got, the only transient errors i ever got it was clearly complaining about some checksum mismatch, and it'd always just go away if i waited until the next cache invalidation
09:26:54 <gnarface> (or if i logged into the control panel and flushed it manually if i was feeling impatient)
09:26:59 <systemdlete> I have seen a lot of web pages re your issue (checksum mismatch), but I really haven't noticed that one myself here.
09:27:16 <gnarface> odd
09:27:51 <gnarface> some other hidden environmental cause we must be missing
09:28:19 <systemdlete> It's really strange, because I was in the middle of the upgrade and dist-upgrade steps, and the process would start getting hit with these errors. Then, I simply restarted it as suggested on devuan's upgrade instructions page. Then, no problem with same packages, only moments later. But then others would/might fail, and rinse and repeat...
09:28:59 <gnarface> so, i wonder if it could just be something caused by the repos updating during your downloads
09:29:16 <gnarface> you said it was in the security section after all, which is what i'd expect to be changing most
09:29:18 <systemdlete> as you were saying, above
09:29:26 <systemdlete> no, that was an "e.g." ^^^^
09:29:33 <gnarface> oh
09:29:41 <systemdlete> there are many different errors; that was just the general format
09:30:06 <systemdlete> (I'm trying hard to be precise as possible here...)
09:30:39 <gnarface> is there any firewall between you and the apt-cacher-ng server? i wonder if it could just be an issue with it failing to hold enough connection states?
09:30:49 <systemdlete> I'm afraid that the actual messages I was seeing rolled off the bottom of my copy-paste buffer
09:31:01 <systemdlete> yes, there are a few
09:31:16 <systemdlete> but!
09:31:27 <systemdlete> this happens (sometimes) even on the cacher server itself!!!
09:31:58 <systemdlete> That server DOES have a firewall running, but I don't think that will matter for local connections, right?
09:32:32 <gnarface> depends on how you set it up
09:32:38 <systemdlete> that's why I am not very suspicious of the network, at least not for the cacher clients
09:32:52 <systemdlete> It's a basic gufw generated fw
09:33:16 <gnarface> but i know that if you're tracking state of inbound or outbound connections, there's a theoretical possibility of running up against a kernel memory limit
09:33:16 <onefang> We tell package mirrors to update every 30 minutes, but some updated less often.
09:33:26 <systemdlete> I've added a few pinholes for ports like the cacher's
09:33:59 <gnarface> like, if you end up tracking too many states at once you could start losing connections randomly
09:34:26 <gnarface> usually this is not a problem with default kernel builds unless you're doing something weird that is causing connections to stick around too long though
09:34:34 <systemdlete> ok, so maybe bumping up the limit might help?
09:34:52 <gnarface> sure, in theory
09:35:10 <gnarface> although often needing to is the sign of something else wrong
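The connection-state limit gnarface describes can be checked before deciding whether to raise it. The following is a minimal sketch, assuming the firewall uses netfilter connection tracking and that the `nf_conntrack` module is loaded, so that the two procfs counters below exist. The helper names `read_int` and `conntrack_usage` are hypothetical, introduced only for illustration.

```python
from pathlib import Path
from typing import Optional

# procfs counters exposed while the nf_conntrack module is loaded
# (assumption: the gufw-generated firewall relies on connection tracking).
COUNT_PATH = Path("/proc/sys/net/netfilter/nf_conntrack_count")
MAX_PATH = Path("/proc/sys/net/netfilter/nf_conntrack_max")


def read_int(path: Path) -> Optional[int]:
    """Return the integer stored in a procfs file, or None if unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def conntrack_usage(count: int, maximum: int) -> float:
    """Fraction of the connection-tracking table currently in use."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return count / maximum


if __name__ == "__main__":
    count, maximum = read_int(COUNT_PATH), read_int(MAX_PATH)
    if count is None or maximum is None:
        print("conntrack counters not available (module not loaded?)")
    else:
        usage = conntrack_usage(count, maximum)
        print(f"{count}/{maximum} states tracked ({usage:.1%})")
```

If the reported usage stays far below 100% while the fetch errors occur, the state table is probably not the cause.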
09:35:14 <systemdlete> I mean, this is overall rather minor. It's mainly a nuisance.
09:35:58 <systemdlete> just restarting the upgrade has always been enough to get it working again (so far)
09:36:32 <systemdlete> gnarface, did you look at the paste I put up?
09:37:00 <gnarface> systemdlete: no, and i doubt it will give me any insight but i will if you use paste.debian.net or just /msg it to me
09:37:18 <systemdlete> you don't trust dpaste.bin?
09:37:39 <gnarface> i just stopped adding new domains to my personal "trust" list years ago
09:37:53 <gnarface> it's not about any particular domain, it's about reducing attack surface
09:38:06 <gnarface> not just for me, but also for them
09:38:19 <gnarface> (i'm a mafia target and sometimes that gets innocents involved unnecessarily)
09:38:46 <systemdlete> https://paste.debian.net/hidden/779bf2a3/
09:38:55 <systemdlete> (sorry to hear that bad news)
09:39:30 <gnarface> hmm, this is an error i certainly never have seen
09:39:48 <gnarface> failed to move stale item out of the way... no such file or directory
09:39:59 <systemdlete> hmmm, let me look at inodes
09:40:09 <gnarface> scattered across several hours
09:40:33 <gnarface> so it can't move it out of the way because... it's gone already?
09:41:01 <gnarface> you don't have two apt-cacher-ng instances running on the same cache directory, do you?
09:41:02 <systemdlete> sneaky devils.
09:41:10 <systemdlete> uh, I don't think so...
09:41:39 <systemdlete> no, just one
09:41:40 <gnarface> double check, but this seems like quite a puzzle
09:41:56 <systemdlete> I have over 1.4M available inodes on /var
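The inode check mentioned here can also be scripted. This is a small sketch using `os.statvfs`; the function name `inode_usage` is hypothetical, and `/var` is used only because that is the filesystem named in the conversation.

```python
import os
from typing import Tuple


def inode_usage(path: str) -> Tuple[int, int]:
    """Return (free_inodes, total_inodes) for the filesystem holding path."""
    st = os.statvfs(path)
    # f_favail: inodes available to unprivileged users; f_files: total inodes.
    return st.f_favail, st.f_files


if __name__ == "__main__":
    free, total = inode_usage("/var")
    print(f"/var: {free} of {total} inodes free")
```

Some filesystems report zero for both values, so a result of 0 does not by itself mean the inodes are exhausted.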
09:42:00 <gnarface> do you think disk I/O is really high at those points?
09:42:09 <systemdlete> good point...
09:42:15 <gnarface> i wonder if it could be some sort of race condition caused by disk bottlenecking
09:42:28 <gnarface> just guessing
09:42:32 <systemdlete> maybe, but wouldn't that be a problem on the upstream side also?
09:42:40 <gnarface> dunno
09:42:51 <gnarface> never seen this before but my cache server is not under load
09:43:19 <systemdlete> I mean, I'm only seeing the problem from the client's POV. On the other hand, that log file seems to indicate :I: and :O: which I am guessing is input and output
09:43:24 <gnarface> not only is it not under load but i actually just recently upgraded it to a hardware raid with two SSDs
09:43:58 <systemdlete> I'm still on mech disks. They are still a good value for the money here.
09:44:02 <systemdlete> but anyway
09:44:25 <gnarface> i ran the old one right into the ground, it died making grinding noises :D
09:45:26 <gnarface> i figured it was a good idea to have drives that were less sensitive to vibration
09:45:39 <gnarface> (my downstairs neighbors bang on the walls a lot)
09:46:05 <systemdlete> Some of these drives (2TB Seagate and 2TB WD) are up to 3 years old, maybe older. Since I started leaving my PCs on 24x7, the frequency of hardware replacements has gone waaaaay down (MB, disks, etc). So I've become a believer.
09:46:39 <systemdlete> well, that can't be the only annoying thing arising from that problem
09:48:47 <gnarface> well, if it were the drives you'd see i/o errors in dmesg around the same time
09:48:59 <gnarface> but i'm suspecting this is not hardware related
09:49:07 <systemdlete> most of the clients are VMs
09:49:17 <gnarface> it looks like a race condition but i can't imagine why it'd be happening
09:49:46 <systemdlete> and I have a thruk installation here so I will find out fairly quickly if there are problems.
09:50:07 <systemdlete> and I also check, periodically, the host log files for hw errors
09:53:45 <gnarface> how many VMs you usually run the updates with at a time?
09:54:12 <gnarface> i'm rarely updating more than 2 things at once, i wonder if you're seeing something i'm not seeing just because i'm not throwing 1000 clients at it at once
09:54:37 <systemdlete> only one running apt upgrade at a time, otherwise the whole network crawls. However, there can be more than one running apt update at a time
09:54:57 <gnarface> hmm
09:55:14 <systemdlete> well, as I said, even casual upgrades of just a few packages can see this happen... iirc
09:57:54 <systemdlete> when I upgrade from chimaera to daedalus, I will try to monitor the cacher server with htop
09:58:41 <systemdlete> (I'm upgrading beowulf -> chimaera -> daedalus because skip upgrades are not officially supported)
09:59:15 <systemdlete> the chimaera upgrade is almost complete. It is unpacking and installing the dist-upgrade part right now.
09:59:58 <systemdlete> So if all goes well, and the resulting chimaera doesn't have problems for a day or so, I'll do the daedalus upgrade.
10:13:52 <systemdlete> gnarface, how do I msg you a file? I have it gzip'd down to about .5MB (I cannot paste it to deb paste bin)
10:14:14 <systemdlete> this file is the actual log, whereas what I pasted was the error log
10:15:06 <gnarface> systemdlete: oh, i just meant literally paste it and just wait for the flood protect throttling to let it finish going through
10:15:17 <systemdlete> oh...
10:15:23 <systemdlete> it's 8MB
10:15:38 <gnarface> is all 8MB relevant?
10:16:17 <gnarface> i mean, large pastes will take a while but it also will all get here, but i'm not sure if 8MB will actually fit in your clipboard
10:16:23 <systemdlete> of course not
10:16:27 * systemdlete duh
10:16:34 <systemdlete> I can snip it
10:19:18 <systemdlete> last 1000 lines?
10:19:25 <gnarface> sure
10:19:40 <systemdlete> it's about 100k
10:19:56 <gnarface> i think it'll work, just say something when it's done
10:20:12 <systemdlete> how can I paste it?
10:20:21 <systemdlete> it might end up here in this channel!
10:20:31 <systemdlete> wait
10:20:36 <systemdlete> let me try pasting this smaller file
10:20:45 <gnarface> i just sent you 1 line, do you see that open in a new tab on your IRC client?
10:21:21 <gnarface> it would just be: /msg gnarface [paste]
10:21:33 <gnarface> but if your client lets you reply to the message i just sent, that might be easier
10:21:36 <systemdlete> here: https://paste.debian.net/hidden/e7acf4f1/
10:22:10 <systemdlete> (yes, I see the tab, but I think the paste is easier)
10:22:27 <gnarface> hmmm
10:22:29 <gnarface> 1713568491|E|108923|192.168.40.41|devrep/merged/pool/DEBIAN/main/p/python-pip/python-pip-whl_20.3.4-4+deb11u1_all.deb [HTTP error, code: 502]
10:22:38 <systemdlete> yeah. Lots of errors
10:22:49 <gnarface> 502 though
10:23:30 <gnarface> lemme double check, but i think 502 means remote server replied "aak, i'm misconfigured!"
10:23:46 <systemdlete> I just looked it up
10:23:59 <gnarface> i only see a couple 502 errors... the rest looks completely normal
10:24:14 <systemdlete> it means gateway interference, which is hard to believe when I'm getting the errors on the cacher server itself!
10:24:24 <gnarface> bad gateway yea
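To pick the 502 entries out of a long log, the line quoted above can be split on its `|` separators. The sketch below infers the field meanings (epoch timestamp, entry type, size, client address, target) from that single quoted line, so treat them as assumptions rather than the documented log format. `LogEntry` and `parse_line` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: int            # seconds since the epoch (assumed)
    kind: str                 # "I", "O" or "E", as seen in the pasted log
    size: int                 # third field, assumed to be a byte count
    client: str               # address of the requesting client
    target: str               # path of the requested file
    http_code: Optional[int]  # code from a trailing "[HTTP error, code: N]"


CODE_RE = re.compile(r"\[HTTP error, code: (\d+)\]\s*$")


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one log line; return None if it does not have five fields."""
    parts = line.rstrip("\n").split("|", 4)
    if len(parts) != 5:
        return None
    ts, kind, size, client, rest = parts
    if not (ts.isdigit() and size.isdigit()):
        return None
    match = CODE_RE.search(rest)
    code = int(match.group(1)) if match else None
    target = CODE_RE.sub("", rest).strip()
    return LogEntry(int(ts), kind, int(size), client, target, code)


if __name__ == "__main__":
    import sys

    # Usage: python3 script.py < logfile
    for raw in sys.stdin:
        entry = parse_line(raw)
        if entry is not None and entry.http_code == 502:
            print(entry.timestamp, entry.client, entry.target)
```

The timestamps it prints are the ones worth lining up against a packet capture.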
10:24:44 <systemdlete> unless
10:24:48 <gnarface> wait, but is that the cache server? or is that actually the debian mirror responding?
10:24:51 <systemdlete> unless it is on the upstream side?
10:24:56 <gnarface> that's what i was thinking
10:25:00 <systemdlete> yeah
10:25:21 <gnarface> might be a problem with a mirror that's just rippling down into your cache
10:25:27 <systemdlete> now, between the cacher and the Internet, there be some gateways, routers, firewalls...
10:26:04 <gnarface> hmm, but i think 502 means it had to actually get some sort of reply though...
10:26:06 <systemdlete> but as I mentioned, I don't have many network issues here (other than stupid things I caused, which are now corrected, to the best of my knowledge)
10:26:17 <onefang> If it's a mirror, please try to identify which one.
10:26:48 <gnarface> yea, see if it always happens while apt-cacher-ng is hitting the same remote mirror
10:26:58 <gnarface> it might be all their fault
10:27:01 <systemdlete> how can I tell?
10:27:15 <gnarface> first thing that comes to mind is tail the log file while watching the raw network traffic
10:27:16 <onefang> And if it's all their fault, I might have to do something.
10:27:18 <systemdlete> the log file (I just pasted) doesn't give that info
10:27:28 <systemdlete> onefang
10:27:40 <onefang> Find the IP of that mirror.
10:27:41 <systemdlete> onefang: don't fret, this happens to a lot of people outside devuan
10:27:48 <gnarface> yea, you'd just have to be tailing it and watching the network traffic at the same time, then see which remote ip is in the tcpdump at the same time that error hits the logs
10:28:04 <systemdlete> tail which log file?
10:28:10 <gnarface> the one you just pasted
10:28:14 <systemdlete> (I have a wide assortment of them)
10:28:28 <systemdlete> but the log doesn't show the upstream, does it? (Maybe I missed that)
10:29:09 <gnarface> no, but my guess is the error will show up in that log file within milliseconds of the actual remote connection being made, so if you're just running tcpdump at the same time you'll see the corresponding IP
10:29:21 <systemdlete> ah
10:29:23 <systemdlete> tcpdump
10:29:33 <gnarface> in fact, with the right filter you would be able to even see the 502 error in the response
10:29:43 <gnarface> then you'd be sure which side of the network it was coming from
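The correlation gnarface suggests (match the time of each logged error against the remote addresses seen on the wire at that moment) can be sketched as follows. This assumes the error timestamps and the capture timestamps have already been extracted into plain Python values. The function name, the two-second window, and the example addresses (taken from the documentation ranges) are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def remote_ips_near_errors(
    error_times: Iterable[float],
    connections: Iterable[Tuple[float, str]],
    window: float = 2.0,
) -> Dict[float, List[str]]:
    """Map each error time to the remote IPs seen within `window` seconds."""
    conns = list(connections)
    result: Dict[float, List[str]] = {}
    for err in error_times:
        seen: List[str] = []
        for ts, ip in conns:
            # Keep each address once, in the order it was first observed.
            if abs(ts - err) <= window and ip not in seen:
                seen.append(ip)
        result[err] = seen
    return result


if __name__ == "__main__":
    errors = [100.0]
    capture = [(99.5, "203.0.113.7"), (150.0, "198.51.100.9")]
    print(remote_ips_near_errors(errors, capture))
```

An address that shows up next to most of the errors would be the mirror to report.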
10:29:48 <systemdlete> drat.
10:30:04 <gnarface> actually, what am i thinking? you could certainly filter tcpdump output just for http errors
10:30:04 <systemdlete> I forgot to obfuscate my network addresses in that paste
10:30:22 <systemdlete> but probably not an issue since my network is behind a residential firewall anyway
10:30:26 <gnarface> i noticed, but they're all private anyway
10:30:32 <systemdlete> right
10:31:48 <systemdlete> now, why would it be that apt is able to round-robin the repo servers, but apt-cacher has a problem doing the same--don't they both use the same code, ultimately?
10:32:05 <gnarface> uh, i don't know that they do
10:32:29 <gnarface> and apt may just be coded to retry without complaint
10:32:33 <onefang> Maybe not. I know debootstrap doesn't use apt, but mdebstrap does.
10:32:45 <systemdlete> ok
10:33:10 <gnarface> but with tcpdump you should be able with some fiddling to isolate the source of these 502 errors explicitly
10:33:35 <gnarface> there might be a simpler way that i'm not thinking of, but tcpdump is definitely up to this task
10:33:36 <systemdlete> right. I will do that
10:34:12 <onefang> And I definitely know that apt-panopticon doesn't use apt directly, but that's a different case. It's testing every step of the apt process on the package mirrors.
10:34:27 <gnarface> if you find it, the payload might even actually make the cause obvious
10:35:44 <systemdlete> tcpdump port https
10:36:00 <systemdlete> (and src host...)
10:37:03 <systemdlete> I wonder if wireshark might be easier in order to easily examine the packets
10:37:48 <systemdlete> but either way, it looks like it should be easy enough to gather the stream
10:42:17 <gnarface> wireshark might be easier but i have less familiarity with it
10:42:41 <gnarface> i'm not exactly a tcpdump pro but when i learned it there weren't alternatives
10:43:27 <gnarface> all you should have to do is filter for http 502 error headers, https might sabotage that though
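Filtering for 502 responses comes down to reading the status line at the start of an HTTP response. The sketch below does that for a raw payload that has already been captured by some other tool. The function name `http_status` is hypothetical, and, as gnarface notes, it only works on plaintext HTTP: an encrypted HTTPS payload will not begin with a readable status line.

```python
from typing import Optional


def http_status(payload: bytes) -> Optional[int]:
    """Return the status code if payload begins with an HTTP/1.x status line."""
    first_line = payload.split(b"\r\n", 1)[0]
    parts = first_line.split(b" ", 2)
    if len(parts) < 2 or not parts[0].startswith(b"HTTP/1."):
        return None  # a request, TLS data, or a mid-stream segment
    if not parts[1].isdigit():
        return None
    return int(parts[1])


if __name__ == "__main__":
    reply = b"HTTP/1.1 502 Bad Gateway\r\nContent-Length: 0\r\n\r\n"
    if http_status(reply) == 502:
        print("upstream answered with 502")
```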
10:43:55 <systemdlete> I just realized something. When I looked at https://unix.stackexchange.com/questions/623174/apt-cacher-ng-random-download-failures-with-apt-update-acgn, I failed to distinguish what they meant by disabling ipv6.
10:44:15 <systemdlete> They meant, in the acng.conf file! Not the actual network interfaces. (though no harm there)
10:45:05 <gnarface> hmm, i do also have ipv6 disabled everywhere afaik, though i don't see any particular indication that's what's causing this problem
10:45:06 <systemdlete> the cacher can be configured to only listen to ipv4 (or ipv6) as desired. I missed that, and I think I will try that before starting the next step of the upgrade to daedalus
10:46:44 <systemdlete> Yeah, I agree. And besides, it really OUGHT to work for ipv6-enabled networks.
10:47:09 <systemdlete> It's really sad if those networks' users are deprived of this functionality
10:47:13 <gnarface> yea, but there could be an issue where just one mirror in the round-robin is missing the ipv6 dns entries or something weird like that
10:47:50 <gnarface> we've seen stuff like that cause problems before
10:48:06 <systemdlete> gnarface, some of the devuan mirrors are supported by users like us, right? I mean, some might be on big hardware in a DC somewhere, but maybe not all, and some might not be correctly configured for ipv6?
10:48:32 <gnarface> yes
10:48:35 <systemdlete> ok
10:48:39 <gnarface> it has come up before
10:48:47 <gnarface> as have https issues
10:48:56 <systemdlete> not meaning to repeat you, just trying to get clear on your meaning
10:49:00 <gnarface> sure
10:49:13 <onefang> That's why if there's some mirror issue, I want to know the IP of the errant mirror. Especially if it's something apt-panopticon isn't finding.
10:49:53 <systemdlete> onefang, the "errant mirror"-- do you mean a debian mirror, or only devuan?
10:50:30 <gnarface> i think in theory it could be either
10:50:34 <systemdlete> because debian users are hitting this too, if my survey of ddg hits on this topic is an indicator
10:50:43 <onefang> Could be either, but I'm only in charge of Devuan package mirrors, though if it's a Debian mirror problem, that's good to know as well.
10:50:44 <gnarface> though obviously scrutiny will be on the ones we can do something about first...
10:50:56 <systemdlete> as well as people using the cacher for openwrt (which I do as well)
10:51:14 <gnarface> is openwrt also a deb-based distro?
10:51:19 <systemdlete> no!
10:51:24 <gnarface> hmm
10:51:39 <systemdlete> I think it is freebsd, but I always forget. It's one of the BSDs
10:52:38 <onefang> There's at least one package mirror running from someone's home server. I even once had an offer for a home based mirror server running over that PsaceX satellite network. lol
10:52:49 <systemdlete> I know they will be moving to pkg in the future; they're already migrating some of their tools that way. But for now at least, apt-cacher-ng works for openwrt
10:53:51 <onefang> How did I manage to typo SpaceX twice in a row, once while trying to fix the first typo. lol
10:54:14 <systemdlete> you must have borrowed my fingers for a moment...
10:54:35 <onefang> You can have them back. Keep 'em.
10:54:36 <systemdlete> not your fault. You expected mine to work correctly I think.
10:54:40 <systemdlete> lol
10:55:12 <systemdlete> I've typo'd here about 4 or 5 times today already
10:56:04 <systemdlete> gnarface: yeah, I followed the directions at openwrt.org wiki to set up the cacher for openwrt packages.
10:57:46 <systemdlete> It's great with openwrt. I have several routers, and by upgrading them in the order so that my gateway is last, by that time, all the packages are cached for that upgrade, and I don't have to do any special pre-downloading (gateway needs some firmware and other stuff not available in the release iso)
10:58:27 <systemdlete> so I'm really indebted to you for having me set this up. Even with this annoying issue...
10:59:07 <onefang> gnarface is always very helpful, we should all thank them.
10:59:22 <systemdlete> So onefang and gnarface, I will do my utmost to track down that repo for you.
10:59:27 <systemdlete> (yes, they are!!!)
10:59:58 <systemdlete> and you are too, and so are the rest of the folks here
11:08:11 <systemdlete> "The 502 Bad Gateway error is an HTTP status code that occurs when a server acting as a gateway or proxy receives an invalid or faulty response from another server in the communication chain. This error indicates a problem with the communication between the involved servers and can result in disruption of internet services." (Wikipedia)
11:08:31 <systemdlete> That sounds exactly like what we are thinking here.
11:08:51 <systemdlete> So it is almost definitely an upstream-side issue
11:27:49 <gnarface> systemdlete: no problem, i hope you can identify it
11:28:22 <gnarface> i don't know if apt-cacher-ng can be made to add the remote server IPs to the log files or not
11:29:00 <gnarface> might be worth looking into, but packet sniffing will work, albeit probably quite tediously
11:29:48 <systemdlete> I just made that change to only listen on ipv4. But I don't see why that will make any difference, that is, if the problem is almost certainly on the upstream side.
11:30:02 <systemdlete> But no one can say I didn't at least try it.
11:30:12 <gnarface> time will tell
11:30:16 <systemdlete> this upgrade is taking all day, literally.
11:30:37 <gnarface> heh, yea that's expected
11:30:58 <gnarface> i did upgrade a full kde desktop, beowulf->chimaera->daedalus recently, i think something like 18 hours?
11:31:45 <gnarface> i might have been sleeping for part of that, but it took a long time either way
11:31:53 <onefang> My upgrade to daedalus has been ongoing since last year. This time I want to write configs from scratch instead of just copying them across, and I'm doing a shit load of testing. Once it's done testing, I'll roll it out to all my Linux computers.
11:32:28 <onefang> I'm also using my own script for the system building.
11:33:53 <systemdlete> same here
11:34:13 <systemdlete> still in development, just the same way. I've used it a few times, but it needs work...
11:35:10 <onefang> That's the other reason mine's taking so long, I'm doing major surgery to the scripts. Not to mention my crappy life keeping me busy this last year. lol
11:35:17 <systemdlete> It would have been more expeditious, I think, to simply clone a new daedalus VM from a template VM I created months ago and just update/upgrade to current package levels and do restores to the home areas
11:35:57 <systemdlete> and add in all the programs and configuration from restores... ay uh
11:36:07 <systemdlete> maybe not better, idk
11:36:48 <systemdlete> I really have done a lot of customization on this system I'm upgrading. So maybe this will be worth it.
11:46:55 <systemdlete> https://askubuntu.com/questions/119298/apt-get-using-apt-cacher-ng-fails-to-fetch-packages-with-hash-sum-mismatch#answer-431764
11:47:31 <systemdlete> (of course, that is a 10-year-old answer)
11:47:43 <AlexLikeRock> hahahahhahah
11:47:57 <AlexLikeRock> nice nick " systemdlete "
11:47:59 <AlexLikeRock> hahhahahaha
11:48:36 <systemdlete> def systemdlete: "systemd: delete from all systems, immediately!"
11:49:06 <systemdlete> (i.e., nothing to do with "lete" or "l3t3" or whatever meme it is...)
11:49:10 <AlexLikeRock> yeh!!
11:49:11 <AlexLikeRock> i know
11:49:15 <AlexLikeRock> jhahahah
11:49:25 <systemdlete> thank you AlexLikeRock
11:49:57 <AlexLikeRock> nice

14:27:42 <systemdlete> gnarface, onefang: oopsies. Looks like some hard drive errors on host as it turns out. I hadn't been alerted about them, and I could have sworn I had thruk set up to alert me upon hardware problems.
14:28:23 <systemdlete> So I might have just wasted your time, but I am not 100% sure about that. None of the articles I read about the cacher problem indicated a hard disk error.
14:48:37 <onefang> No problem for me, I'm in weekend mode, so I was mostly letting you and gnarface work on it. Was waiting to see if you had found an actual broken mirror for me to sort out.
14:50:17 <systemdlete> onefang, that could take some time. I am still trying to finish the upgrade to chimaera and that has taken ALL DAY. So it may be some time before I can explore that further. For one thing, I need to get my kernel logging configured to make thruk alert me for hard disk problems (which I thought I had already done...)
14:50:37 <systemdlete> I'll be focusing on that to avoid more problems going forward.
14:50:50 <systemdlete> But I will definitely be back here to let you know what I find out.
14:50:54 <onefang> Fair enough.
14:50:58 <systemdlete> I promise you.
14:51:55 <systemdlete> I'm sort of hoping, though, that these hard disk errors are the smoking gun. The first of the errors seems to have begun around April 14 (PST time), and that was about the time I began to notice the errors, but I did not enter them into my journal here...
14:52:48 <systemdlete> I'm suspicous of them but not really sure.
14:52:54 <systemdlete> *suspicious
15:00:19 <CueXXIII> systemdlete: anything in smartctl of that harddisk? otherwise it might be bad cabling producing those errors

15:43:43 <systemdlete> CueXXIII, not sure yet. I'm juggling a few things...
15:49:26 <systemdlete> I did look at the cell values on that harddisk and did not see anything that looked like hard errors. I only see evidence in the kern.log
15:51:01 <systemdlete> I will take the system down in a while, just as soon as my disks resync (RAID1). The system has been up 46 straight days, so maybe it is "tired"...
15:51:43 <systemdlete> I'll remove the bad disk and test it on a test box. I have some spares (good thinking, systemdlete, for once).
15:52:53 <systemdlete> I usually run badblocks for a day or two to see if it errors. Since the test box has its own cables, that will help to eliminate that possibility.

16:46:17 <gnarface> systemdlete: which filesystem are you using on these?
16:50:51 <systemdlete> ext4, everywhere
16:51:25 <systemdlete> ext4 on VMs and hosts
16:51:45 <systemdlete> I've tried some of the more exciting stuff, I think xfs? or something like that
16:52:01 <systemdlete> but I had problems with it, but years ago, so I should prob try again
16:52:38 <systemdlete> oh, no. it was btrfs
16:53:04 <systemdlete> not xfs. I don't think I've ever tried that one
16:55:34 <gnarface> hmm, well being not btrfs or anything similarly experimental, i dunno, but ext4 did have one bad corruption bug that was affecting upgrades a few releases back, only i thought that was before beowulf
16:55:56 <gnarface> (and it did manifest itself exactly like a physical drive failure)
16:56:08 <systemdlete> right
16:56:17 <systemdlete> I vaguely recall that...
16:56:39 <gnarface> the issue was to do with using older e2fsprogs with newer kernels or something like that
16:57:00 <systemdlete> yeah, I think that's right
16:57:16 <systemdlete> I am pretty sure I stumbled into that one at some point.
16:58:09 <systemdlete> Thing is, before I started this upgrade, I cloned the VM, and upon boot I have all filesystems set up to fsck every time (I don't reboot any systems much).
16:59:03 <systemdlete> So I believe I have a clean file system to start with. But still, if the actual hardware backing the virtual FS has actual hard errors, then maybe there are still issues.
16:59:27 <systemdlete> I've been carefully checking the kern.log's on both VM and host for any suspicious errors
16:59:40 <systemdlete> since I noticed it earlier

18:43:10 <systemdlete> ok now what am I doing wrong? I tried upgrading rsyslog on a beowulf system with the backports so I could get a more recent version. But there is no /etc/init.d/rsyslog
18:43:19 <systemdlete> I checked the package; it says it is included
18:44:07 <systemdlete> maybe a trigger is not running? I forget the details of how that happens...
18:52:49 <fsmithred> what does 'apt policy rsyslog' tell you?
18:53:12 <systemdlete> I removed the upgrade and tried re-installing the package from the regular repo, but same thing happens
18:53:20 <fsmithred> I just tried to install rsyslog from beowulf-backports and apt tells me that i already have the newest version
18:53:41 <fsmithred> but apt policy tells me that version is in beowulf. No rsyslog in backports.
18:53:58 <fsmithred> want mine?
18:54:37 <systemdlete> apt policy shows the correct versions, and the one I have installed has a star
18:55:01 <systemdlete> sorry, I meant chimaera backports, not beowulf
18:55:14 <systemdlete> (I'm exhausted from all of this today...)
18:55:25 <onefang> Go and rest.
18:55:49 <systemdlete> but it doesn't matter; the rsyslog script does not get installed
18:56:00 <systemdlete> either version
18:56:24 <systemdlete> trouble is, I now have NO rsyslog running on that system!
18:56:41 <systemdlete> I can launch it manually
18:56:45 <systemdlete> but this is sick
18:59:32 <rrq> I haven't read backlog, but which script are you talking about?
18:59:38 <gnarface> hmm, did you debootstrap? i seem to recall a problem with rsyslog and debootstrap some point
19:00:28 <gnarface> i think it was in ascii though, and maybe only on arm64
19:01:09 <systemdlete> no, nothing to do with installing the system. The version of rsyslog I had was 2102, and I wanted the 23.02 version.
19:01:09 <gnarface> yea, according to my notes i had to exclude rsyslog and udev and include syslog-ng instead to successfully debootstrap ascii on arm64
19:01:24 <systemdlete> ascii... that's long time ago.
19:01:31 <systemdlete> I don't have any ascii systems here
19:01:33 <gnarface> never had any other problems with rsyslog that i can recall
19:01:36 <systemdlete> (or jessie)
19:02:04 <gnarface> does it work if you just reinstall it?
19:02:07 <systemdlete> I removed all rsyslog versions from the apt-cacher and I'll see if this works
19:02:14 <systemdlete> yeah, tried re-installing
19:02:33 <systemdlete> I'm going to see if maybe it will cache a fresh copy from the repos

20:14:59 <systemdlete> well reboot did not help, even shutting it down completely and restarting it "cold" (and swapping out the bad drive for a new one)
20:15:39 <systemdlete> then I tried to apt remove rsyslog and apt install rsyslog, but for whatever reason, it does not install the /etc/init.d/rsyslog script
20:16:02 <systemdlete> I also note that the archive does not contain the deb file for rsyslog
20:16:30 <systemdlete> (I am using the cacher, but I didn't figure that would affect the client as far as grabbing a copy of the deb file)
20:16:36 <systemdlete> (but what do I know, really?)
20:17:03 <gnarface> wait maybe that's because it's in beowulf-security now? or chimaera-security or whichever you're on?
20:17:15 <gnarface> make sure your sources.list is complete
20:17:51 <gnarface> if you have to, you can always switch to syslog-ng but i was pretty sure rsyslog worked...
20:18:32 <gnarface> i would be tempted to examine the preinst/postinst scripts
20:19:15 <gnarface> but rsyslog is definitely working on my beowulf systems
20:19:32 <systemdlete> chimaera here
20:19:48 <systemdlete> this is a different sea of trouble than the one we were dealing with before.
20:22:42 <systemdlete> and rsyslog was installed just fine previously. What happened was that I was running into some odd error messages from rsyslogd and I thought maybe upgrading to a more recent version could correct that (very hopeful thinking here)
20:23:02 <systemdlete> but after doing the upgrade to chimaera-backports
20:23:23 <systemdlete> I noticed that the script was missing (and maybe other stuff, idk)
20:24:28 <systemdlete> this is what happens when you got up early the day before your birthday, ran into technical problems, stayed up all night trying to fix them, and kept running into more problems...
20:25:03 <systemdlete> so I am totally exhausted, but I won't be able to sleep wondering about this.
20:30:46 <systemdlete> gnarface, should I expect to see a deb file for rsyslog in the client's cache directory?
20:31:11 <systemdlete> or does using the cacher cause different behavior?
20:31:22 <systemdlete> (I never thought to check this)
20:34:29 <systemdlete> I don't have the https_proxy set in my env???!!!
20:40:44 <systemdlete> or maybe that was only for openwrt clients of the cacher...

21:22:35 <gnarface> systemdlete: yes, it should still use /var/cache/apt/archives normally, if that's what you're asking
21:22:52 <gnarface> apt-cacher-ng shouldn't affect that
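Looking for a downloaded package in `/var/cache/apt/archives` can be done with a few lines. This is a sketch only: the helper names are hypothetical, and matching on the part of the file name before the first underscore relies on the usual `name_version_arch.deb` naming.

```python
import os
from typing import List


def deb_belongs_to(filename: str, package: str) -> bool:
    """True if filename looks like <package>_<version>_<arch>.deb."""
    return filename.endswith(".deb") and filename.split("_", 1)[0] == package


def cached_debs(package: str, archive_dir: str = "/var/cache/apt/archives") -> List[str]:
    """List the .deb files for package found in apt's archive directory."""
    try:
        names = os.listdir(archive_dir)
    except OSError:
        return []  # directory missing or unreadable
    return sorted(n for n in names if deb_belongs_to(n, package))


if __name__ == "__main__":
    print(cached_debs("rsyslog"))
```

An empty list is not necessarily a fault: the archive may have been emptied with `apt clean`, and the `apt` front end can be configured (via `APT::Keep-Downloaded-Packages`) to delete packages once they are installed.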
21:24:11 <gnarface> i'm setting the proxy in /etc/apt/apt.conf.d/
21:24:36 <gnarface> only using it for apt
21:46:46 <systemdlete> but I'm NOT seeing it there

23:36:21 <CueXXIII> oh dang, american holiday today… it's 4 20…
23:36:30 <CueXXIII> ah, wrong tab -.-

---------- 2024-04-20 ----------
04:33:52 <Harzilein> hi :)

04:36:40 <Harzilein> jaromil: grats for getting a new dyne release out of the door. i think that should showcase devuan very nicely.