The officially official Devuan Forum!

#1 2024-10-06 19:06:18

Alverstone
Member
Registered: 2024-10-06
Posts: 24  

Excalibur, runit's supervision tree inherits tty1, misbehavior

<sysrq>+K on tty1 brings the system down. Happens because runsvdir gets SIGKILLed by the kernel. Happens because runsvdir inherits tty1. I think this is a bug and should be fixed. One solution I am aware of is
exec </dev/null >/dev/null 2>/dev/null
in /etc/runit/2 just before runsvdir. What do you think?

#2 2024-10-06 19:17:45

Alverstone
Member
Registered: 2024-10-06
Posts: 24  

Re: Excalibur, runit's supervision tree inherits tty1, misbehavior

Another misbehavior here is that tty1 gets cluttered with log messages.

#3 2024-10-21 21:22:47

Lorenzo
Member
Registered: 2020-03-03
Posts: 46  

Re: Excalibur, runit's supervision tree inherits tty1, misbehavior

Hi!

<sysrq>+K on tty1 brings the system down

One solution I am aware of is
exec </dev/null >/dev/null 2>/dev/null

But this discards all output from runsvdir, right? Or are you able to get it in some other way?

Anyway, the clutter on tty1 is a known issue; <sysrq>+K is new to me. There is also this other issue
https://github.com/void-linux/runit/issues/14
reported at Void.

I would like to address them all, and keep a log of runsvdir if possible.

Could you test the following in stage 2?

PIPE=/run/runit/runsvdirpipe
mkfifo "$PIPE"
exec <> "$PIPE"

exec env - PATH="$PATH" runsvdir -P "${SVDIR}" >"$PIPE" 2>"$PIPE"

Note that the log argument for runsvdir (the dots) is removed.
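
The reason for opening the FIFO with <> can be shown in isolation. This is a sketch that assumes Linux FIFO semantics (POSIX leaves a read-write open of a FIFO undefined); the scratch path and descriptor number 3 are arbitrary choices, not part of the proposal.

```sh
# Scratch FIFO standing in for /run/runit/runsvdirpipe.
dir=$(mktemp -d)
fifo="$dir/pipe"
mkfifo "$fifo"

# A plain '>' open would block here until a reader appears.
# Opening read-write returns at once because the shell is its own reader.
exec 3<>"$fifo"

# An early writer succeeds as long as the message fits in the pipe buffer.
echo "message written before any logger runs" >"$fifo"

# A logger that starts later still finds the buffered line.
read -r line <"$fifo"
echo "late reader got: $line"

exec 3>&-
rm -rf "$dir"
```

The descriptor held by stage 2 is what keeps the buffered data alive between the writer and the late reader.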

A log of runsvdir could then be caught with a simple service like this (run file):

#!/bin/sh

NAME="runsvdir-log"
LOG=/run/runit/runsvdir-log
test -d "$LOG" || mkdir "$LOG" && chown -R _runit-log:adm "$LOG"

exec 2>&1

exec < /run/runit/runsvdirpipe
echo "Starting runsvdir-log"

exec chpst -u _runit-log svlogd -tt "$LOG"

I tested it quickly and it seems to do the job; please let me know if it works for you.

Lorenzo

#4 2024-10-22 15:33:50

Alverstone
Member
Registered: 2024-10-06
Posts: 24  

Re: Excalibur, runit's supervision tree inherits tty1, misbehavior

Lorenzo wrote:

but this discards all output from runsvdir right?

Correct.

Or are you able to get it in some other way?

Yep, I've put /dev/tty12 instead of /dev/null there. <sysrq>+K on tty12 still brings the system down, but I don't care because it doesn't get in the way of my routine workflow, unlike tty1, which I use heavily. I suspect that part of the output gets lost somehow, though... I need to test that more thoroughly.

I would like to address them all, and keep a log of runsvdir if possible.

I should tell you right away that I (very personally) find runsvdir's cmdline logging feature useless. Ever since I put an svlogd on every service that needs it, and >/dev/null 2>&1 on every service that maintains logs on its own (hello mullvad-daemon), my ps -ef | grep runsvdir is

runsvdir -P /etc/service log: ...........................................................................................................................................................................................................................................................................................................................................................................................................

and my tty12 console output is as good as non-existent.

Regarding your FIFO approach, I mostly agree with it, but will still give you some input that popped up in my head.

First, with a large number of services, will the default 16 pages be enough to hold all the logs in the FIFO before svlogd starts? I think they should be, but if you settle on this solution, you may want to keep your finger on the pulse for possible bug reports. If it turns out to be a problem, you can start svlogd before runsvdir and then restart it under runsv.

Second, consistency. Stage 1 and sysvinit scripts log to tty1. Are runsvdir logs important enough to be concerned about, or is nobody interested in seeing them anyway? sysvinit logs are very important because they notify you about possible errors during startup of critical parts; I presume many people rely on them, and I am no exception. On the other hand, I (very personally) do not find runsvdir logs as useful. As long as I get a getty I will be fine. I can imagine a few use cases for them. One would be not using svlogd for getty, so that if getty fails, you immediately know why. Is this useful? No, because since you can't log in you would need separate boot media anyway, so you could just as well check the log. On the other hand, some critical, not very verbose services could still make use of it. Want to check errors/warnings? Switch to the tty and read the text on your screen. Easy, no hassle, no logging in as root, etc.

Lastly, why /run/runit/runsvdir-log? This defeats the purpose: /run is a tmpfs, so the log does not survive a reboot. I suggest changing it to /var/log/runit instead.

I humbly suggest the following modifications.

SVNAME=runsvdir-log
DEF=/etc/default/runsvdir-log
PIPE=/run/runit/runsvdirpipe
is_number () {
    case "$1" in
        *[!0-9]*|"") return 1;;
        *) return 0;;
    esac
}

if [ -L "$SVDIR/.$SVNAME" ]; then
    mv "$SVDIR/.$SVNAME" "$SVDIR/$SVNAME"
fi

[ -f "$DEF" ] && . "$DEF"

if is_number "$LOG_TTY" && [ -c /dev/tty$LOG_TTY ]; then
    echo "runsvdir log goes to /dev/tty$LOG_TTY"
    PIPE=/dev/tty$LOG_TTY
    [ -L "$SVDIR/$SVNAME" ] && mv "$SVDIR/$SVNAME" "$SVDIR/.$SVNAME"
elif ! [ -L "$SVDIR/$SVNAME" ]; then
    PIPE=/dev/console # default
elif ! mkfifo "$PIPE"; then
    echo "Failed to create runsvdir pipe for logging: $PIPE"
    echo "runsvdir will run with logging disabled"
    PIPE=/dev/null
    [ -L "$SVDIR/$SVNAME" ] && mv "$SVDIR/$SVNAME" "$SVDIR/.$SVNAME"
fi

# no blocking for FIFO, please, so <> is mandatory
exec <> "$PIPE"
# must redirect twice for FIFO
exec env - PATH="$PATH"  runsvdir -P "${SVDIR}" >"$PIPE" 2>"$PIPE"

and

#!/bin/sh

SVNAME="runsvdir-log"
LOG=/var/log/runit/"$SVNAME"
if ! mkdir -p "$LOG"; then
    # mkdir has already printed an error; this line adds context
    echo "$SVNAME: $LOG: mkdir failed" >&2
    sv d .
    exit 1 # do not carry on without a log directory
fi
chown -R _runit-log:adm "$LOG"
chmod 750 "$LOG"

exec 2>&1
exec < /run/runit/runsvdirpipe

echo "Starting $SVNAME"
exec chpst -u _runit-log svlogd -tt "$LOG"

I find this quite involved, to be honest, and I am not sure how good the idea is. Also, if svlogd fails here, you'll never know why because of the loop (its error output goes back into the FIFO it is supposed to read), which might confuse somebody. I honestly fail to see a flawless solution here. I suspect the scripts above are flexible enough to let people do whatever they want. Put LOG_TTY=1 in /etc/default/runsvdir-log and you get the default behavior.
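
How the is_number guard treats different LOG_TTY values can be checked outside of stage 2. In this sketch only the function is taken from the script above; the sample values are made up for illustration.

```sh
# Same test as in the stage 2 script: succeed only for a non-empty
# string made up entirely of decimal digits.
is_number () {
    case "$1" in
        *[!0-9]*|"") return 1;;
        *) return 0;;
    esac
}

# Illustrative values only; an empty LOG_TTY counts as "not a number".
for value in 1 12 "" tty1 1a -3; do
    if is_number "$value"; then
        echo "LOG_TTY='$value' -> candidate /dev/tty$value"
    else
        echo "LOG_TTY='$value' -> ignored"
    fi
done
```

In the script itself a value that passes this test must also pass the [ -c /dev/tty$LOG_TTY ] check before it is used.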

Also, maybe somehow add an option to the runsvdir-log service to duplicate the log to a tty. I'm afraid my head is not clear enough now to suggest anything more sensible.

Also, the scripts above work, I've just tested them.

#5 2024-10-22 15:40:13

Alverstone
Member
Registered: 2024-10-06
Posts: 24  

Re: Excalibur, runit's supervision tree inherits tty1, misbehavior

Also, here

elif ! mkfifo "$PIPE"; then

maybe use /dev/console instead of /dev/null, it would be more sensible.

Last edited by Alverstone (2024-10-22 15:40:27)

#6 2024-10-23 15:10:59

Alverstone
Member
Registered: 2024-10-06
Posts: 24  

Re: Excalibur, runit's supervision tree inherits tty1, misbehavior

Lorenzo,

I gave it more thought and here is what I have. Append this to the current default /etc/runit/2:

LOGSVNAME=runsvdir-log
[ -L "$SVDIR/.$LOGSVNAME" ] && mv "$SVDIR/.$LOGSVNAME" "$SVDIR/$LOGSVNAME"
LOGFIFO=/run/runit/runsvdirpipe
LOG="$LOGFIFO"
if ! mkfifo "$LOGFIFO"; then
    # mkfifo already output error
    echo "runsvdir will log to /dev/console" >&2
    LOG=/dev/console
    # just in case
    rm -f "$LOGFIFO"
fi

# Use <> for stdin, otherwise FIFO blocks
exec <>"$LOG" >"$LOG" 2>"$LOG"
exec env - PATH="$PATH" runsvdir -P "${SVDIR}"

The logger service must be called runsvdir-log, the run file is

#!/bin/sh

# do not use echo to report error
# if this service is needed and logs go to FIFO, all error
# messages will get lost in the FIFO anyway

SVDIR="$(dirname "$(readlink -f .)")"
SVNAME="runsvdir-log" # must be runsvdir-log, MUST
DEF=/etc/default/runsvdir-log
LOGDIR=/var/log/runit/"$SVNAME"
LOGFIFO=/run/runit/runsvdirpipe
LOGDUP=no

is_number () {
    case "$1" in
        *[!0-9]*|"") return 1;;
        *) return 0;;
    esac
}

is_yes () {
    case "$1" in
        y|ye|yes) return 0;;
        *) return 1;;
    esac
}

msg_start () {
    echo "Starting $SVNAME"
}

[ -f ./conf ] && . ./conf
[ -f "$DEF" ] && . "$DEF"

if ! [ -p "$LOGFIFO" ]; then
    mv "$SVDIR/$SVNAME" "$SVDIR/.$SVNAME"
    exit
fi

if is_number "$LOGTTY" && [ -c /dev/tty"$LOGTTY" ]; then
    LOGTTY=/dev/tty"$LOGTTY"
else
    LOGTTY=""
fi

if ! is_yes "$LOGDUP" && [ -n "$LOGTTY" ]; then
    msg_start
    exec cat < "$LOGFIFO" > "$LOGTTY" 2> "$LOGTTY"
fi

if ! mkdir -p "$LOGDIR"; then
    sv d . # down instead of 'mv' as a kind of error reporting
    exit 1 # do not start svlogd without a log directory
fi
chown -R _runit-log:adm "$LOGDIR"
chmod 750 "$LOGDIR"

msg_start
if is_yes "$LOGDUP" && [ -n "$LOGTTY" ]; then
    DUPPIPE=/run/runit/duppipe
    rm -f "$DUPPIPE"
    if ! mkfifo "$DUPPIPE"; then
        # at least do something
        rm -f "$DUPPIPE" # just in case
        exec chpst -u _runit-log svlogd -tt "$LOGDIR" < "$LOGFIFO"
    fi
    chpst -u _runit-log svlogd -tt "$LOGDIR" < "$DUPPIPE" &
    # this could deadlock... still it would at least be an indication of an error
    # unlike writing into a FIFO nobody reads, so don't use <> on $DUPPIPE
    exec 8>"$DUPPIPE"
    exec 9<"$LOGFIFO"
    # this way after exit svlogd will receive EOF and exit
    rm -f "$DUPPIPE"
    while read -r logline <&9; do
        echo "$logline" >&8
        echo "$logline" > "$LOGTTY"
    done
else
    exec chpst -u _runit-log svlogd -tt "$LOGDIR" < "$LOGFIFO"
fi

This

  1. Keeps /etc/runit/2 as small as possible.

  2. Prevents https://github.com/void-linux/runit/issues/14 because the cmdline logging feature is disabled.

  3. Prevents <sysrq>+K from bringing the system down. It probably kills the logger when it's cat, but that is not critical.
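
The duplication loop in the run file can be tried on ordinary files first. The sketch below is a variation, not the run file's exact code: it uses IFS= read -r and printf so that leading whitespace and backslashes in a log line pass through unchanged. The function name dup_lines and the scratch files are placeholders for the svlogd pipe and the tty.

```sh
# Copy stdin line by line to two sinks.
dup_lines () {
    while IFS= read -r logline; do
        printf '%s\n' "$logline" >>"$1"
        printf '%s\n' "$logline" >>"$2"
    done
}

dir=$(mktemp -d)
printf '%s\n' 'first line' '  indented \t line' | dup_lines "$dir/a" "$dir/b"
cat "$dir/a"
```

Because each line is written to the first sink before the second, a blocked first sink also stalls the copy to the second one, which matches the deadlock caveat in the run file's comments.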

What do you think?

Last edited by Alverstone (2024-10-23 15:14:23)

#7 2024-10-23 15:16:23

Alverstone
Member
Registered: 2024-10-06
Posts: 24  

Re: Excalibur, runit's supervision tree inherits tty1, misbehavior

Consequently, /etc/default/runsvdir-log sets up two variables.

  • LOGTTY: VT number to which log is written.

  • LOGDUP: If set to yes (or ye or y) and LOGTTY is defined, write log to svlogd and duplicate it to VT number $LOGTTY.

If LOGTTY is undefined, or no VT with that number exists, the log goes to svlogd. You could easily add support for logging/duplicating to /dev/console, if you think this solution is any good at all.
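
For illustration, a defaults file and the decision it leads to could look like the sketch below. The values are examples, and log_mode is a made-up helper that mirrors the checks in the run file but skips the [ -c /dev/ttyN ] device test so that it can run anywhere.

```sh
# Example contents of /etc/default/runsvdir-log (values are illustrative):
#   LOGTTY=12
#   LOGDUP=yes

is_number () {
    case "$1" in
        *[!0-9]*|"") return 1;;
        *) return 0;;
    esac
}

is_yes () {
    case "$1" in
        y|ye|yes) return 0;;
        *) return 1;;
    esac
}

# Hypothetical helper: print which branch the run file would take
# for a given LOGTTY (first argument) and LOGDUP (second argument).
log_mode () {
    if ! is_number "$1"; then
        echo "svlogd only"
    elif is_yes "$2"; then
        echo "svlogd plus copy on /dev/tty$1"
    else
        echo "/dev/tty$1 only"
    fi
}

log_mode 12 yes
log_mode 12 no
log_mode "" yes
```

Note the last case: LOGDUP without a usable LOGTTY has no effect.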

Last edited by Alverstone (2024-10-23 15:31:24)

#8 2024-10-25 15:33:22

Lorenzo
Member
Registered: 2020-03-03
Posts: 46  

Re: Excalibur, runit's supervision tree inherits tty1, misbehavior

Hi Alverstone,

I redirected runsvdir output to /dev/tty10 on my system and tested <sysrq>+K on tty1, and it still brings down the system.
I tested again, and it's actually the

exec <>

that does the trick and shields runsvdir from <sysrq>+K

I would like to address them all, and keep a log of runsvdir if possible.

I should have been more precise; apologies if this wasted your time in search of a solution with a proper log.
What I mean is that it's preferable not to discard the runsvdir output completely by default; it doesn't necessarily have to be a proper log file.

Since output to a tty plus exec <> does the trick, I think I prefer that to a proper FIFO.

The issue with the FIFO is what happens when the buffer is full and nothing is reading on the other side. I'm not sure whether that can affect runsvdir in a bad way; that would be a problem.
What I fear (more than the race between lines written and runsvdir-log startup) is that a user disables the runsvdir-log service because they don't care, and then, due to some service malfunction later, the buffer slowly fills up.
runsvdir outputs one line per second for each non-functional service (for example, run not executable or not found); how many lines does it take before the FIFO buffer is full?
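
A rough estimate is possible with shell arithmetic. The figures are assumptions: a Linux FIFO with the default capacity of 16 pages, a 4096-byte page, and an average message of 80 bytes (the real length depends on the service name and the error).

```sh
# fifo_lines PAGE_SIZE LINE_LENGTH: how many lines of LINE_LENGTH bytes
# fit into a FIFO of 16 pages (the Linux default capacity).
fifo_lines () {
    echo $(( 16 * $1 / $2 ))
}

lines=$(fifo_lines 4096 80)
echo "capacity: $lines lines"
# At one line per second per broken service:
echo "one broken service fills it in about $(( lines / 60 )) minutes"
```

With these numbers the FIFO holds about 819 lines, so a single failing service would fill it in under a quarter of an hour, and several failing services proportionally faster.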

On the other hand, it is hard to predict whether a ttyN is available on a system; the only thing that is guaranteed to be there is /dev/console (the reason why runit outputs there, I think), but /dev/console is connected to tty1, so we are back at the starting point.

I agree with the logic you proposed for stage 2, but I prefer something simpler; your proposed code could be fine if I roll runit-specific boot scripts, but for now I prefer to keep stage 2 simpler. I'm thinking of a hardcoded list of /dev/* with a fallback to /dev/console.
I think standard systems (desktop, laptop, server) have plenty of ttys; I'm less sure about embedded hardware (Raspberry Pi, Arduino or even more minimal hardware); there is also the container case to cover.

Something like this (recycling your code, simplified):

for out in tty12 ttyS1 tty1 console; do
  if [ -c /dev/"$out" ] ; then
     PIPE="/dev/$out"; break
  fi
done
exec <>"$PIPE" >"$PIPE" 2>"$PIPE"
exec env - PATH="$PATH" runsvdir -P "${SVDIR}"

Your code is certainly more configurable from the user's point of view,
but unfortunately I fear it is also more fragile if the user is not aware of the FIFO.
Let me know if you see shortcomings or have more thoughts on this; I intend to
add a fix for this in the next runit upload to unstable.
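
The device selection can be tried outside of stage 2 with a small function. This is a sketch, not the code planned for the package: first_chardev is a made-up name, and the explicit /dev/console fallback is an assumption about what should happen when no candidate exists.

```sh
# Hypothetical helper: print the first argument that is a character
# device, or /dev/console if none of them is.
first_chardev () {
    for candidate in "$@"; do
        if [ -c "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    echo /dev/console
}

out=$(first_chardev /dev/tty12 /dev/ttyS1 /dev/tty1)
echo "runsvdir output would go to $out"
```

Keeping the fallback outside the candidate list means the variable is never left empty, even where none of the listed devices is present.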

Thanks for your ideas!
Lorenzo

#9 2024-10-26 14:06:06

Alverstone
Member
Registered: 2024-10-06
Posts: 24  

Re: Excalibur, runit's supervision tree inherits tty1, misbehavior

Hi Lorenzo,

I redirected runsvdir output to /dev/tty10 on my system and  tested  <sysrq>+K on tty1 and it brings down the system;

Weird. If you redirected to /dev/tty10 then sysrq only brings down the system when invoked on the tty in question. This is the way my system behaves. The kernel's 6.6.56, pristine. Ehh it's a bit bothersome to reboot several times while writing a post roll, but...

exec env - PATH="$PATH" runsvdir -P "${SVDIR}" >/dev/tty12 2>/dev/tty12

sysrq brings the system down both on tty1 and tty12. I wish I remembered which kernel function does it so I could have a look, but as far as trial, error and memory go, the kernel checks which processes "sit" on the current tty (perhaps just have it open, I don't remember), locks them (to prevent TOCTOU bugs), and kills them recursively down to the last child. So yes, it is imperative that not only stdout and stderr are redirected, but stdin as well.

As for <>, I do not observe the said behavior.

exec env - PATH="$PATH" runsvdir -P "${SVDIR}" </dev/tty12 >/dev/tty12 2>/dev/tty12

prevents sysrq from bringing the system down on tty1 and moves the "bug" (quotes because not sure if it's really a bug) to tty12. On my system.
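
To check which processes would be affected on a given VT, one can list everything that holds the device open. This is a sketch assuming Linux with /proc, run as root so that other users' processes are visible; holders is a made-up name, and whether the kernel goes by open descriptors, by session, or both is not verified here.

```sh
# Hypothetical helper: print the PIDs that have PATH open on any descriptor.
holders () {
    target=$1
    for fd in /proc/[0-9]*/fd/*; do
        [ "$(readlink "$fd" 2>/dev/null)" = "$target" ] || continue
        pid=${fd#/proc/}
        echo "${pid%%/*}"
    done | sort -un
}

holders /dev/tty1
```

If the PID of runsvdir shows up for tty1, one of its three standard descriptors still points there.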

apologies if this wasted your time in search of a solution with a proper log

So much for enthusiasm!

that does the trick and shields runsvdir from <sysrq>+K

Please either double check or clarify.

exec env - PATH="$PATH" runsvdir -P "${SVDIR}" <>/dev/tty12 >/dev/tty12 2>/dev/tty12

<sysrq>+K on tty12 is followed by a shutdown all right.

I'm not sure if that can affect runsvdir in a bad way

This isn't hard to check.

mkfifo test
exec 4<>test
head -c 65536 < /dev/urandom > test
echo 1 > test

The final echo never returns: the FIFO already holds 65536 bytes and nothing is reading it. So unless runsvdir specifically avoids blocking on writes (poll, O_NONBLOCK, whatever), it's gonna block too. I more or less knew this, but didn't pay much attention. I guess I understand your concerns here.
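
The same check can be scripted so that it does not hang the terminal. This sketch assumes GNU coreutils timeout, which exits with status 124 when it has to kill the command; instead of writing exactly 65536 bytes it lets cat write until it blocks, so the FIFO ends up full whatever its capacity is.

```sh
dir=$(mktemp -d)
fifo="$dir/test"
mkfifo "$fifo"

# Keep the FIFO open read-write so that writers never wait for a reader.
exec 4<>"$fifo"

# Fill the buffer completely, whatever its size: cat writes until it
# blocks, and timeout kills it after two seconds.
timeout 2 cat /dev/zero >"$fifo"

# One more small write now hangs; GNU timeout reports that with status 124.
timeout 1 sh -c 'echo 1 >"$1"' sh "$fifo"
status=$?
echo "exit status of the extra write: $status"

exec 4>&-
rm -rf "$dir"
```

A status of 0 instead would mean the write went through, that is, the buffer was not full.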

your proposed code could be fine if I roll runit-specific boot scripts, but for now I prefer to keep stage 2 simpler

I'm not sure I fully understand what you mean here. Elaborate, please?

Something like (recycling your code, simplified)

Do you mean to try and guess the most likely unused tty and put logs there?

I dunno. It looks like runit uses buffer_unixwrite() for stdout and stderr, which is just write(). Furthermore, buffer_1 and buffer_2 are mainly used with buffer_puts(), and at first glance the return status isn't really checked. This is very thin ice, but theoretically, if we could open the FIFO in non-blocking mode, it could work. I can't say I like it.

The core of the problem here is that we attach to a VT. Do the OpenRC guys have this problem? I dunno how it works, but it looks like they use startup scripts for their services, so they probably don't have any critical processes attached to any VTs.

Listen, since runit is an almost trivial (in terms of UX at least) init system, can we really hope to avoid tinkering completely? What do you think about leaving the default behavior alone and instead leaving a BIG and FAT piece of documentation somewhere everybody is very likely to look and explain the situation to people? We could then make a stage 2 logger around FIFO and just tell them to enable it manually or whatever. Maybe even put something at /etc/default, though I don't know all the pitfalls of packaging. Better than /etc/default, we could just have a conf file. It feels like backing off, but it doesn't look like we can cope without some black magic.

Thank you for reaching out.
