#osdev on 2022-02-11 — irc logs at libera.irclog.whitequark.org

2021-05-23 01:57 klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books

00:09 bas1l is now known as basil

00:32 gog has quit [Quit: byee]

00:54 zid` has joined #osdev

00:55 zid has quit [Ping timeout: 256 seconds]

00:59 <klange> While I still don't have the music, classic Doom is so much better with the sound at least. Even if I am cheating slightly by using an AC97 on an ARM VM.

01:02 <dmh> lol nice

01:05 myon98 has joined #osdev

01:10 <geist> ah did you punch that through PCI?

01:10 <geist> guess so, unless there's a virtio-audio or something (which i haven't seen references to somewhere)

01:11 <geist> have seen that is

01:12 <geist> oh huh it is: https://www.kraxel.org/virtio/virtio-v1.1-cs01-sound-v7.html

01:12 <bslsk05> www.kraxel.org: Virtual I/O Device (VIRTIO) Version 1.1

01:15 <klange> I don't see support for it in qemu yet; I think the most platform-appropriate option is usb-audio.

01:16 <klange> And frankly none of this is reflective of the audio on either of the two actual hardware setups I am targeting, so *shrug*

01:20 pretty_dumm_guy has quit [Quit: WeeChat 3.4]

01:35 <Griwes> aaaaand I have working syscalls

01:35 <Griwes> well, just one now, but all's fitting together correctly

01:36 <Griwes> last thing remaining for this commit to be ready is to fix some weird build system behavior

01:42 <heat> congrats

01:43 <heat> i'm feeling lazy so I've just looked into riscv traps and context switching

01:47 nyah has quit [Quit: leaving]

01:51 <geist> heat: not a lot there huh?

01:51 <geist> only thing i find particularly annoying with the riscv solution is there's no separate vector for 'from user space' vs 'from supervisor mode'

01:51 <geist> so it gets a little tricky to restore the cpu state
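
Because every trap lands on the same entry point, the handler has to work out for itself where it came from. A minimal sketch in C, assuming the supervisor-mode layout from the RISC-V privileged spec, where sstatus.SPP (bit 8) records the privilege level before the trap. The `trap_frame` struct and the helper name are hypothetical and only stand in for whatever the kernel saves at trap entry:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* sstatus.SPP: privilege level before the trap (0 = user, 1 = supervisor). */
#define SSTATUS_SPP (UINT64_C(1) << 8)

/* Hypothetical saved state; a real frame would also hold the GPRs. */
struct trap_frame {
    uint64_t sstatus; /* copy of sstatus read at trap entry */
    uint64_t sepc;    /* copy of sepc read at trap entry */
};

/* Returns true when the trap interrupted user-mode code. */
bool trap_came_from_user(const struct trap_frame *frame)
{
    assert(frame != NULL);
    return (frame->sstatus & SSTATUS_SPP) == 0;
}
```

A shared entry stub can branch on this result to decide which stack and which register-restore path to use on the way out.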

01:52 <heat> yeah i kinda got lazy once I realised how not-x86 the context switching is

01:53 <heat> geist, I think there are vectors for that if you use the vectored mode

01:55 <heat> actually no huh

01:55 <heat> what's that for then

01:57 <geist> the separate vectors are i think only for external irqs

01:57 <geist> er wait maybe not even that. i didn't find it to be particularly compelling

01:57 <heat> yeah I didn't really get them

01:58 <geist> lemme re-read it

01:58 <geist> but since it's optional and there's not a huge win and additional complexity to support both modes i didn't think it worth implementing

01:58 <heat> it's for IRQs

01:58 <heat> says asynchronous interrupts

02:00 <geist> ah yeah. so only the irqs that have the top bit set

02:00 <geist> software interrupts, timers, external

02:01 <geist> the non interrupts (where a syscall would be vectored from) still all go through the base + 0 vector

02:02 <geist> if you note what about 'user software interrupt', which is also 0, how would you tell if it's vectored or not? and i think the answer is 'that can't happen because user software interrupt can't be triggered that way' or something
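
The dispatch rule discussed above can be sketched in C. This assumes RV64 and the mtvec/stvec layout from the privileged spec: MODE in the low two bits (0 = direct, 1 = vectored), the interrupt flag in the top bit of the cause register, and vectored interrupts landing at BASE + 4 * cause. The function name `trap_target` is made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define TVEC_MODE_MASK     UINT64_C(3)
#define TVEC_MODE_VECTORED UINT64_C(1)
#define CAUSE_INTERRUPT    (UINT64_C(1) << 63) /* top bit of mcause/scause on RV64 */

/* Computes the address the hart jumps to for a given tvec and cause value. */
uint64_t trap_target(uint64_t tvec, uint64_t cause)
{
    uint64_t mode = tvec & TVEC_MODE_MASK;
    uint64_t base = tvec & ~TVEC_MODE_MASK;

    assert(mode <= TVEC_MODE_VECTORED); /* MODE values 2 and 3 are reserved */

    /* Only asynchronous interrupts are vectored. */
    if (mode == TVEC_MODE_VECTORED && (cause & CAUSE_INTERRUPT))
        return base + 4 * (cause & ~CAUSE_INTERRUPT);

    /* Synchronous exceptions (including ecall) always enter at BASE. */
    return base;
}
```

With interrupt code 0 the vectored address is BASE + 0, the same address that synchronous exceptions use, which is the overlap geist is asking about.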

02:02 <geist> but this part of the privileged spec is a great example of what i don't like about the riscv docs

02:02 <geist> basically the docs are written to describe the registers

02:03 <geist> and then when they get to the part where they describe the mtvec register (where you put in the interrupt/exception base address) it then goes ahead and describes how interrupts/exceptions work

02:03 <heat> oh it's described in the machine stuff?

02:03 <geist> 'here's a register that holds the vector table, and oh i guess you should know what that's for, here's how interrupts work'

02:03 <geist> and yes. it's described in the machine section. when you get to the supervisor section it assumes you've read the machine section

02:04 <geist> i was bitten by that a few times too. for things that are implemented basically the same in supervisor mode it's assumed (or maybe pointed out) that the real descriptions are in the machine mode section

02:04 <geist> and supervisor mode section is just describing basically how it applies there

02:05 <geist> basically there seems to be no real high level 'here's how the arch works' and then a later section describing registers that let you configure the arch

02:05 <geist> you have to just read the whole thing and then infer how the arch works

02:05 <heat> yea I agree, it's weird

02:05 <heat> especially coming from intel

02:05 <geist> yah or even ARM. both of them treat control registers as a thing in service of the thing they're describing (the feature)

02:10 <geist> but all of this is mitigated by the fact that the whole thing is so darn simple you can simply just read it end to end and basically remember it

02:10 <geist> or at least remember the basic concepts

02:10 <geist> as the arch inevitably gets more complicated the docs will definitely need to get restructured a bit

02:11 <heat> something I've been meaning to ask you: is the only difference between riscv32 and 64 that registers are larger and you have more instructions (the dword variants)?

02:27 <heat> i just had the horrible idea of copying arch/riscv to arch/arm64 and go for that for now

02:51 <klange> at least with arm there's, like, actual hardware that's easy to get and use virtualization on

03:12 <dmh> so true

03:12 <dmh> how many years in

03:13 <dmh> kinda crazy considering you can run softcore in fpga and traceout that way

03:26 <kazinsal> someone talk me out of buying a vax

03:31 [itchyjunk] has quit [Ping timeout: 240 seconds]

03:32 sonny has joined #osdev

03:33 <dmh> well how much

03:35 <kazinsal> $280 USD + $105 USD for shipping to canada

03:35 <moon-child> wow that's pretty good

03:35 <kazinsal> microvax 3100, 32 MB

03:35 <moon-child> get it

03:36 [itchyjunk] has joined #osdev

03:36 <kazinsal> hrm, okay, that's a problem, no hard drive

03:37 <kazinsal> wonder if I can just shove any ol' SCSI drive in there

03:41 <kazinsal> ooh, multiple reports of SCSI2SD working in it

03:45 sonny has quit [Quit: Ping timeout (120 seconds)]

03:46 <kingoffrance> https://www.netbsd.org/docs/network/netboot/intro.vax.html where we're going, we dont need hard drives (check models first of course)

03:46 <bslsk05> www.netbsd.org: Introduction (vax-specific), Diskless NetBSD HOW-TO

03:47 sonny has joined #osdev

04:05 heat has quit [Remote host closed the connection]

04:06 <geist> heat: yah basically the same

04:06 <geist> annnnnd they left

04:06 <geist> kazinsal: oooooh which one?

04:06 <geist> 3100/?

04:07 <kazinsal> 3100/40

04:07 <geist> mine is a 40 i believe... yay

04:07 <geist> exact same setup as mine. 3100/40 32MB

04:07 <geist> and yeah scsi2sd works fine

04:09 <geist> caveat: you'll need a AUI to ethernet box (easy to find on ebay) and little DEC serial port to real serial port cable

04:09 <geist> they used this little almost rj45 but with a key offset thing

04:09 <geist> also easy to get on ebay

04:10 <geist> i had managed to get it kinda working with a straight rj11 crammed into it with the pins lined up but the dec plug is better

04:10 <klange> Something I want of the same vintage is a VT340.

04:10 <geist> yah those also take the same plug

04:10 <geist> i'd probably get a 220 or a 420. seem to be fairly available

04:10 <geist> 340s i think were an odd duck

04:11 <klange> They are _the_ color DEC!

04:11 <geist> those still seem to be fairly expensive on ebay. i'm looking for a cheapo one but it hasn't happened yet

04:14 sonny has quit [Quit: Client closed]

04:21 <geist> kazinsal: re: any old scsi drive. i found it to be pretty tolerable too. same with scsi cdroms. i also installed VMS from 'cdrom' by writing a iso image directly to a scsi drive and booting off it

04:21 <geist> same with netbsd

04:22 <geist> though a scsi2sd supports multiple LUNs so that makes it easy

04:24 <geist> power wise it's not too bad either. iirc it pulls like 95W

04:24 <geist> (not doing a good job talking you out of this am i?)

04:25 <kazinsal> not in the slightest haha

04:26 <geist> well, when you do it i can hook you up with the magic i had to cobble together to net boot one

04:26 <geist> you had to speak some proprietary protocol and wrap the image it loads with some header that i figured out

04:26 <kazinsal> ah, neat

04:26 <geist> but once you do you can just take your bits and slam it in ram and start outputting chars almost from the get go

04:27 <geist> one neat thing: the firmware on the vax *always* respects a BRK on the serial port

04:27 <geist> it will absolutely drop you out of whatever you're doing, even if you've completely hosed the state of the system

04:28 <geist> so you can pretty much just hook it up in a closet somewhere and connect a serial cable to a linux box or whatnot

04:29 <kingoffrance> ^ im not certain, but i wonder if alpha also had this

04:29 <geist> i've seen some sparc boxes that mostly honor it. dunno precisely how it works there

04:29 <kazinsal> yeah, that's a handy feature to have

04:30 <geist> in the original VAX 11/780 the serial/console stuff was a separate cpu that could obviously watch the traffic go by, but i think the later ones kept the same model, even if it was built into the cpu itself

04:30 <kazinsal> cisco boxes do too, or at least used to. if something was all messed up you could send three BRKs in a row and it'd immediately halt the machine and go into rommon

04:30 <geist> i had remembered reading a 11/750 or 730 had an 8085 based console, etc

04:31 <geist> also for that reason at least on the bigger VAXen you could read/write the console via writing to a pair of dedicated control registers in the cpu itself

04:31 <geist> which was nice. you could basically single instruction output a char

04:31 <geist> but iirc by the time these later microvaxes came around that was nerfed and you talk to the serial port like a regular peripheral

04:32 <geist> anyway. yay join the vax club!

04:32 <geist> honestly it might be from the same seller. iirc there's someone that's just perpetually selling that exact model with that exact config

04:32 <geist> like they have some motherlode they're slowly dishing out

04:56 <mxshift> Aspeed BMC SoCs have a magic string you send on a UART that drops it into a ROM monitor with direct access to AHB

04:56 <geist> +++ATH

04:57 <kazinsal> NO CARRIER

05:01 * moon-child decides to just write //todo not horrible and move on

05:01 <mxshift> Oxide used the same string as the office wifi password for a while

05:01 <klange> /* TODO make this less shit */

05:04 ElectronApps has joined #osdev

05:11 dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]

05:32 <Griwes> ha, randomizing the order of my two (with one functioning) syscalls has already allowed me to diagnose a stupid bug in my build system

05:33 <Griwes> neat

05:35 k8yun has joined #osdev

05:59 [itchyjunk] has quit [Remote host closed the connection]

06:05 zaquest has quit [Remote host closed the connection]

06:11 k8yun has quit [Quit: Leaving]

06:17 zaquest has joined #osdev

06:24 <kazinsal> geist: what do you use for a console for your vax? looks like it uses some DEC 6P6C cable for RS-232

06:24 <geist> yah that's what i was talking about, you need to at least get that

06:25 <geist> i bought one on ebay for $10 or so

06:25 <geist> then i can just plug it into a PC or whatnot

06:25 <kazinsal> neat

06:26 <geist> like i said in a pinch you can jam in a rj11 and make the pins line up, but nothing holds that in place

06:27 <geist> AFAICT the DEC plug is the same size and pinout as a RJ11 but it has the tab offset

06:27 <geist> presumably just to be DEC

06:27 <kazinsal> oho, looks like someone makes a DEC MMJ to RJ45 rollover cable

06:27 <geist> yah i remember that not being a big deal

06:28 <kazinsal> so I can slap it in the aux port on one of the routers I keep around and dial "through" that over SSH to the VAX

06:28 <geist> and like i also said it has an AUI on the back, so you'll want one of those too

06:28 <kazinsal> yeah

06:28 <geist> or a thinnet i think, but i didn't have coax cables floating around

06:28 <geist> and i only have one old 10base T hub that takes coax anyway

06:29 <kazinsal> haha yeah, I'd rather AUI into an actual switchport

06:29 <geist> yah i was surprised to also find an aui to ethernet box for cheap on ebay too. one of those grey ones with the colors next to the lights

06:30 <geist> forget the brand but i instantly recognized it

06:30 <kazinsal> I used to have one of those around here somewhere

06:30 <geist> must have been pretty ubiquitous

06:30 <kazinsal> came with some old sparc box my uncle gave me when I was a youngin

06:30 <kazinsal> which I unfortunately cannot find anymore

06:30 <geist> Allied Telesis it appears

06:31 <geist> AT-210TS

06:31 <kazinsal> yep, that looks familiar

06:31 <geist> yah looks like you can get em for $15-$20 on ebay

06:49 <bradd_> hi. in uefi, GOP, can I somehow get the default monitors native resolution?

06:50 <moon-child> I would just pick the biggest resolution

06:51 <bradd_> ok

06:51 sprock has quit [Ping timeout: 256 seconds]

06:54 elastic_dog has quit [Ping timeout: 250 seconds]

06:57 <kazinsal> you could get the EDID info for the active display from GOP

06:59 elastic_dog has joined #osdev

07:01 <kazinsal> if you do a LocateProtocol on... I think it's EFI_EDID_DISCOVERED_PROTOCOL and it succeeds it should spit out a typedef struct {uint32_t sizeOfEdid; const uint8_t* edid}

07:01 <kazinsal> may want to double check the UEFI docs on that though

07:03 <bradd_> yeah, I'm looking through it now. seems doable. thanks
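
Once the EDID bytes are in hand, the native resolution can be read from the first detailed timing descriptor, which EDID 1.3 and later treat as the preferred mode. A rough sketch in C, assuming a standard 128-byte base block; the function name is hypothetical, the checksum is not verified, and a freestanding UEFI build would have to supply its own `memcmp`:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EDID_BLOCK_SIZE 128
#define EDID_FIRST_DTD  54 /* offset of the first detailed timing descriptor */

/* Extracts the active width/height of the preferred mode; false if unusable. */
bool edid_preferred_mode(const uint8_t *edid, size_t size,
                         uint32_t *width, uint32_t *height)
{
    static const uint8_t header[8] =
        { 0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00 };

    assert(width != NULL && height != NULL);

    if (edid == NULL || size < EDID_BLOCK_SIZE)
        return false;
    if (memcmp(edid, header, sizeof header) != 0)
        return false;

    const uint8_t *dtd = edid + EDID_FIRST_DTD;

    /* A pixel clock of 0 means this slot is not a timing descriptor. */
    if (dtd[0] == 0 && dtd[1] == 0)
        return false;

    /* Active sizes are 12 bits: low 8 bits plus the upper nibble of a shared byte. */
    *width  = dtd[2] | ((uint32_t)(dtd[4] & 0xF0) << 4);
    *height = dtd[5] | ((uint32_t)(dtd[7] & 0xF0) << 4);
    return true;
}
```

The result can then be compared against the modes GOP reports via QueryMode to pick a matching one, falling back to the largest mode when there is no EDID.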

08:30 vancz_ is now known as vancz

08:34 <geist> hah was trying to get virtio working on the 68k virt machine

08:34 <geist> was puzzling why it couldn't detect the devices

08:34 <geist> someone forgot to endian swap the registers!

08:35 <zid`> was that someone you

08:36 <geist> maaaaaaaybe

08:36 <geist> is kinda interesting though that the virtio mmio interface isn't host native

08:37 <geist> though i guess in a cross endian situation like this you gotta pick one or the other, the host machine or the emulated machine
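
The swap itself is small. A sketch in C, assuming the non-legacy virtio-mmio transport, whose registers the virtio spec defines as little-endian, and a GCC/Clang-style compiler that provides `__BYTE_ORDER__`. The helper names are illustrative, not taken from any particular kernel:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VIRTIO_MMIO_MAGIC_OFFSET 0x000
#define VIRTIO_MMIO_MAGIC        0x74726976u /* "virt" read as little-endian */

/* Reverses the byte order of a 32-bit value. */
uint32_t bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Converts a little-endian register value to CPU byte order. */
uint32_t le32_to_cpu(uint32_t raw)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return bswap32(raw); /* big-endian CPU, e.g. m68k */
#else
    return raw;          /* little-endian CPU (or unknown, assumed little) */
#endif
}

/* Reads one 32-bit register with a single aligned access, then fixes endianness. */
uint32_t virtio_mmio_read32(const volatile void *base, uint32_t offset)
{
    assert(base != NULL && (offset & 3) == 0);
    const volatile uint32_t *reg =
        (const volatile uint32_t *)((const volatile uint8_t *)base + offset);
    return le32_to_cpu(*reg);
}
```

Probing a device then comes down to checking that `virtio_mmio_read32(base, VIRTIO_MMIO_MAGIC_OFFSET)` equals `VIRTIO_MMIO_MAGIC` on either kind of CPU.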

09:03 MrBonkers has quit [Remote host closed the connection]

09:05 MrBonkers has joined #osdev

09:09 the_lanetly_052 has joined #osdev

09:49 bauen1_ has quit [Ping timeout: 256 seconds]

10:10 bauen1 has joined #osdev

10:20 dormito has quit [Quit: WeeChat 3.3]

10:47 GeDaMo has joined #osdev

10:49 masoudd has joined #osdev

11:02 gog has joined #osdev

11:05 dormito has joined #osdev

11:36 gog has quit [Read error: Connection reset by peer]

11:38 masoudd has quit [Quit: Leaving]

11:41 mjg has joined #osdev

12:04 gog has joined #osdev

12:04 <gog> mew

12:04 masoudd has joined #osdev

12:06 <zid`> wem.

12:07 <zid`> https://cdn.discordapp.com/attachments/417023075348119556/941407186473611274/unknown.png Progress report of my own

12:07 <gog> wow

12:07 <zid`> I should just start copy pasting shit from boros

12:08 gog` has joined #osdev

12:08 gog` has left #osdev [#osdev]

12:08 <gog> whoops

12:08 <zid`> phew, thought you were breeding for a second there

12:08 <gog> no longer a problem

12:08 <gog> :D

12:09 <zid`> gog mitosis

12:09 <zid`> latest article on the scarypasta wiki

12:09 <gog> haha

12:09 <GeDaMo> Why are you passing some struct to main?

12:09 <gog> why not?

12:09 <zid`> why is multiboot passing a struct to main

12:09 <GeDaMo> Ah

12:09 <gog> why does anything do anything

12:10 <zid`> which way around do typedefs go

12:10 <zid`> same as define or backwards

12:11 <zid`> I never use typedefs except once every 10 years when I implement types.h and make u64 and u32

12:11 <GeDaMo> I think it's typedef int my_int;

12:11 <g1n> hello

12:11 <zid`> backwards then

12:11 <GeDaMo> Yeah, it is https://en.cppreference.com/w/c/language/typedef

12:11 <bslsk05> en.cppreference.com: Typedef declaration - cppreference.com

12:12 <GeDaMo> Hi g1n :)

12:12 <gog> just use stdint.h

12:12 <zid`> can't be bothered to type uint64_t

12:12 <gog> the extra 5 keystrokes are worth it

12:12 <gog> or use autocomplete

12:12 <zid`> u64 for life

12:12 <zid`> also makes function protos shorter so I have to split fewer of them
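
For reference, the two orderings side by side, as a small C sketch that builds zid`'s short names on top of stdint.h as gog suggests. The names `u32`, `u64`, `U32_MACRO` and `make_u64` are only examples:

```c
#include <assert.h>
#include <stdint.h>

/* #define puts the new name first: #define NEW_NAME replacement */
#define U32_MACRO uint32_t

/* typedef puts the existing type first: typedef existing_type new_name; */
typedef uint32_t u32;
typedef uint64_t u64;

/* Compile-time checks that the short names have the expected widths (C11). */
static_assert(sizeof(u32) == 4, "u32 must be 32 bits");
static_assert(sizeof(u64) == 8, "u64 must be 64 bits");

/* Example prototype using the short names: combine two halves into one u64. */
u64 make_u64(u32 hi, u32 lo)
{
    return ((u64)hi << 32) | lo;
}
```

Basing the typedefs on stdint.h keeps the short spelling while leaving the width guarantees to the compiler's own header.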

12:27 nyah has joined #osdev

12:28 <zid`> Me and the VGA BOIS going on the vengabios. boom boom boom boom, I want you in my room. we'll spend the night together, together playing doom.

12:30 <gog> when you're alone and you need a friend, someone to help you kill the demons

12:30 <gog> just come along baby take my gun, we'll shoot some demons tonight

12:50 dennis95 has joined #osdev

13:17 catern has joined #osdev

13:23 ElectronApps has quit [Remote host closed the connection]

13:29 <zid`> gog watch out there's a spider on your keyboard

13:29 <zid`> wait nevermind, he's under control

13:29 <gog> he's my friend it's ok

13:31 <zid`> under control, gog

13:33 <zid`> like a keyboard

13:34 <gog> yes

13:34 <gog> yes i get it

13:34 <gog> :p

13:34 <zid`> UNDER CONTROL GOG

13:35 <gog> THANK YOU ZID

13:38 <sham1> If you have to explain the joke, it's not funny

13:38 <zid`> Explaining a joke is like dissecting a frog

13:38 <zid`> nobody laughs and the frog dies

13:38 <gog> my wife and i have an ongoing gag where we explain obvious jokes to one another

13:39 <gog> it's funny every time imo

13:41 <zid`> I find it funny too

13:41 <gog> we dissected lungfish in hs bio

13:44 <zid`> I'm dicking with page tables, did we ever discover if you need NX set all the way down, or just on the pml4e containing it

13:44 <zid`> we discussed it before but I don't remember the actual answer

13:45 <gog> so like for example, if you set NX in a PDE that whole 2MB is NX

13:45 <gog> in a PDPE the whole GB

13:45 <sham1> We had to dissect a frog or something. I didn't make it

13:45 <gog> permissions enforce down the tree

13:45 <zid`> but that seems like it'd not be very cacheable

13:45 <gog> idk if that's how it works in actual hardware is the thing

13:45 <zid`> you'd have to keep rechecking the pml4e even if you have the pte cached

13:46 <gog> but in qemu i set NX on a PDPE and my kernel #PF'd on entry

13:46 <gog> even though the PTE was not NX

13:46 <zid`> I think maybe it only applies if it's a short entry

13:46 <zid`> so it only marks that 2MB entry as NX if it's a huge page

13:46 <zid`> it only marks the 1G range as NX if it's a giant page

13:46 <gog> no, it was a regular page directory

13:46 <gog> or PDPE rather

13:47 <gog> PDPT*

13:47 <zid`> I guess it does have to do a full walk the /first/ time though

13:47 <zid`> and maybe it bails early

13:47 <zid`> so all bits matter

13:47 <gog> i think so

13:47 <gog> but i'd have to dig into the docs again to make sure i understood them correctly

13:48 <zid`> confusing notes like `With PAE paging, the PDPTEs do not determine access rights.`

13:48 <zid`> don't help

13:48 <gog> hm

13:48 <gog> maybe i misremembered, i'll have to do an experiment later

13:49 [itchyjunk] has joined #osdev

13:50 <zid`> I guess the question is

13:50 <zid`> "What happens if I set NX in a PTE, but not in a PD"

13:51 <zid`> The access rights from the paging-structure entries used to translate linear addresses with the page number: The logical-OR of the XD flags (necessary only if IA32_EFER.NXE = 1).

13:51 <zid`> from the TLB section

13:56 <gog> so qemu's behavior is right then

13:56 <zid`> and what's qemu's behavior?

13:57 <gog> logical or with the bits of its containing table
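
The rule quoted from the TLB section can be written down directly. A simplified model in C, assuming IA32_EFER.NXE = 1 and that bit 63 of each paging-structure entry is the XD flag. It takes the entries visited during one walk (for example PML4E, PDPTE, PDE, PTE) and ignores present bits and large pages; the function name is made up:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_XD (UINT64_C(1) << 63) /* execute-disable flag in each entry */

/* Effective no-execute for a translation: OR of XD over every level walked. */
bool effective_no_execute(const uint64_t *entries, size_t levels)
{
    assert(entries != NULL);

    bool xd = false;
    for (size_t i = 0; i < levels; i++)
        xd = xd || (entries[i] & PTE_XD) != 0;
    return xd;
}
```

Under this model, XD set only in the PTE and XD set only in a higher-level entry both make the page non-executable, which matches the #PF gog saw after setting NX on a PDPTE.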

13:57 <zid`> (I mean, it's a hypervisor right so it just does.. what your cpu does anyway?)

14:16 epony has quit [Read error: Connection reset by peer]

14:18 pretty_dumm_guy has joined #osdev

14:35 blockhead has quit []

15:13 epony has joined #osdev

15:23 <GeDaMo> It's just occurred to me I'm the same age as Christopher Walken was when he made the Weapon of Choice video :|

15:25 <gog> fatboy slim > moby

15:27 <gog> he was 57 when he did that wow

15:27 <gog> wait

15:27 <gog> 47?

15:27 <gog> no 57

15:27 <GeDaMo> 0x39

15:27 <gog> lol

15:27 <gog> yes

15:29 k8yun has joined #osdev

15:31 <zid`> GeDaMo: Can I have half your stuff when you die in the next couple of years?

15:31 <GeDaMo> Which half? :P

15:31 <zid`> down the middle vertically

15:31 <GeDaMo> You could have half my stuff now, it's mostly crap :P

15:31 <zid`> a keyboard with QWERT

15:32 <zid`> a monitor with the start button but not the systray

15:36 epony has quit [Ping timeout: 240 seconds]

15:44 <sham1> QWERTZ

15:54 k8yun has quit [Quit: Leaving]

16:12 epony has joined #osdev

16:17 epony has quit [Ping timeout: 240 seconds]

16:22 epony has joined #osdev

16:30 the_lanetly_052 has quit [Ping timeout: 256 seconds]

17:03 gwizon has joined #osdev

17:08 vdamewood has quit [Remote host closed the connection]

17:08 vdamewood has joined #osdev

17:19 gwizon has quit [Quit: Lost terminal]

17:29 k8yun has joined #osdev

17:36 heat has joined #osdev

18:13 dennis95 has quit [Quit: Leaving]

18:14 k8yun has quit [Ping timeout: 240 seconds]

18:19 k8yun has joined #osdev

18:21 k8yun_ has joined #osdev

18:24 k8yun has quit [Ping timeout: 250 seconds]

18:26 k8yun_ has quit [Quit: Leaving]

18:39 heat_ has joined #osdev

18:39 heat has quit [Read error: Connection reset by peer]

18:41 heat_ is now known as heat

18:58 epony has quit [Ping timeout: 240 seconds]

19:00 the_lanetly_052 has joined #osdev

19:00 epony has joined #osdev

19:08 k8yun has joined #osdev

19:13 tiotags has joined #osdev

19:24 epony has quit [Ping timeout: 240 seconds]

19:47 <heat> screw it i'm going for arm64

19:47 <heat> riscv has bored me

19:47 <geist> hah too easy?

19:47 <geist> come on at least get it fully working (if you haven't already)

19:54 <heat> yeah i know but it's boring me

19:54 <heat> i want a good manual again ;_;

19:54 <j`ey> heat: and you can test on your new rpi!

19:55 <heat> i'll stick to kvm for now xd

19:55 <j`ey> well even that is cooler than tcg :P

20:02 [itchyjunk] has quit [Read error: Connection reset by peer]

20:02 <heat> hmm yeah idk

20:02 <heat> arm64 seems fun but actually completing something would be fun

20:08 <heat> something really funny I found out: because fedora can't cross compile packages they had to compile the whole of fedora inside qemu-system-riscv tcg

20:09 <j`ey> :|

20:21 epony has joined #osdev

20:24 <catern> relevant for here too https://lwn.net/Articles/869140/

20:24 <bslsk05> lwn.net: x86 User Interrupts support [LWN.net]

20:24 <catern> seeing the string "IPI" makes me think "slow"

20:25 <catern> but realizing this is basically support for explicit inter-core message passing makes me think "fast"

20:25 <catern> (i guess IPIs are not necessarily bad? they're basically message passing anyway)

20:26 <zid`> did anyone read the code?

20:26 <zid`> How does it erm, work

20:26 <zid`> presumably this is hardware accelerated, so now the cpu needs to know how threads work inside your OS!?

20:27 <catern> check the "application interface" section

20:27 <catern> surprisingly excellent

20:29 <catern> (by that I mean the fact that they have an actual fd interface which nicely allocates indices and all that. very cool!)

20:30 epony has quit [Ping timeout: 240 seconds]

20:30 <zid`> so it just sets a flag associated with a file descriptor, and the kernel can check it when it wakes the task up to wake it up into effectively a signal handler instead?

20:31 <zid`> or userspace can poll the fd, either way

20:31 <catern> If the receiver is running (CPL=3), then the user interrupt is delivered directly without a kernel transition. If the receiver isn't running the interrupt is delivered when the receiver gets context switched back. If the receiver is blocked in the kernel, the user interrupt is delivered to the kernel which then unblocks the intended receiver to deliver the interrupt.

20:31 <catern> seems pretty clear

20:31 <zid`> except it's all abstract?

20:33 <catern> doesn't seem that abstract, the implementation is pretty obvious - set some "pending_ipi" bit on the struct task and deliver when you switch into that task

20:33 <catern> that's for if the receiver isn't running, I mean

20:33 <catern> if the receiver is running... it sends an IPI, seems pretty clear

20:33 <catern> it receives the IPI*

20:33 <zid`> clearly we have different interpretations of clear

20:34 <zid`> I understand the semantics of it fine

20:34 <zid`> I wanted to know the mechanics, which you just on the spot made up as a guess, sort of proving my point

20:34 mrvn has joined #osdev

20:34 <catern> fair enough

20:35 <mrvn> moin

20:35 tiotags has quit [Quit: Leaving]

20:37 <mrvn> Anyone know if gcc/clang can produce code coverage information for const expressions?

20:37 <mrvn> constexpr/consteval/constinit

21:13 <heat> i don't think so

21:13 <heat> sancov/gcov is fully runtime

21:14 <mrvn> and misses all constevals

21:16 <heat> wow looks like user interrupts are crap

21:16 <mrvn> user interrupts?

21:17 <heat> the 10x performance gain is only when you spin in user-space

21:17 <heat> mrvn, <catern> relevant for here too https://lwn.net/Articles/869140/

21:18 <heat> when you (reasonably) block in the kernel, it's only 10% faster than eventfd and 40% faster than signals(which aren't really meant to be used for ipc anyway)

21:21 <mrvn> What's the hardware part in this? Reads more like a different interface to sending signals.

21:22 <heat> new instructions

21:22 <mrvn> "The interrupt state of each task is referenced via MSRs which are saved and restored by the kernel during context switch." Sounds like a lot of overhead on task switch for little gain

21:23 <heat> do not fear, the micro benchmark is here

21:23 <catern> heat: that's not crap

21:23 <heat> it's 10x faster than eventfd!!!!!!!!!!!!

21:23 <heat> (on a 1M ping-pong IPC with 1 byte messages and where you spin the whole time in user-space)

21:24 <catern> if you have a cpu-intensive thing that's always running on a core

21:24 <catern> you can easily reach the 10x gain

21:24 <catern> i mean, it's just a better alternative to a spinlock

21:24 <catern> (well, plus it has actual interrupts)

21:25 <heat> you should never ever use a spinlock in user-space ever

21:25 <catern> spoken like a true person-who-does-not-write-high-performance-userspace-code

21:25 <mrvn> "If the receiver is running (CPL=3), then the user interrupt is delivered

21:25 <heat> if you need "high performance userspace code" don't run on a conventional OS

21:25 <mrvn> directly without a kernel transition." Is that only if the receiver task is running?

21:29 <mrvn> I think for me this feature will be 100% pointless. Violates the "race to sleep" principle I'm going for.

21:29 <heat> someone should tell intel futexes exist

21:30 <mrvn> heat: futexes don't interrupt

21:30 dormito has quit [Quit: WeeChat 3.3]

21:30 <heat> interrupting is just a cute word for "worker thread that waits for an event"

21:30 GeDaMo has quit [Remote host closed the connection]

21:30 <mrvn> heat: no. interrupt means it doesn't have to wait

21:31 <mrvn> interrupting vs. polling

21:31 <heat> you don't have to wait if you use a futex and sleep on an address

21:31 <heat> there's no polling here

21:31 <mrvn> sleep is a wait

21:31 <heat> it's like waiting for an interrupt but less asynchronous

21:32 * kingoffrance watches catern have nightmare flashbacks as this was more or less discussed in another channel

21:32 <kingoffrance> its happening again...

21:32 <mrvn> user interrupts compare to signals, uintr_fd compares to event_fd and futexes.

21:33 <catern> kingoffrance: lol you're right haha

21:33 <mrvn> Urgs, looks like user interrupts don't save state. The app has to do that manually.

21:34 <mrvn> How will that work with lazy fpu saving in the kernel?

21:35 <heat> x86 doesn't do lazy fpu saving anymore

21:35 <mrvn> heat: kernels do

21:35 <heat> no, linux doesn't

21:35 <heat> it's actually slower

21:35 <mrvn> linux is not all kernels

21:35 <heat> doesn't do so since like 2016 or so

21:37 <heat> i don't understand why you choose this option

21:37 <heat> instead of doing something sensible like a worker thread waiting for futexes in shared memory

21:37 <mrvn> anyway, how is the app going to save FPU/SSE state?

21:38 <mrvn> heat: because waiting has a big kernel overhead

21:38 <heat> spinning also has a big kernel overhead

21:38 <mrvn> heat: hence the wish to have interrupts instead

21:38 <heat> the 10x speedup is only when you spin in user-space lol

21:39 <heat> it's marginally faster than eventfd when it's not spinning but blocking as usual

21:39 <mrvn> heat: what do you mean by "spin" anyway? I don't see any spinning in the proposed API

21:39 masoudd has quit [Quit: Leaving]

21:39 <heat> https://lwn.net/ml/linux-kernel/BYAPR11MB33203044CD5D7413846655F9E5DA9@BYAPR11MB3320.namprd11.prod.outlook.com/

21:39 <bslsk05> lwn.net: Re: [RFC PATCH 00/13] x86 User Interrupts support [LWN.net]

21:40 <heat> "The 10x gain is only seen when the receiver is spinning in User space - waiting for interrupts."

21:41 <mrvn> heat: that seems to be very bad wording unless he means the task is doing "while(true) if (testui()) { ... }"

21:42 <mrvn> heat: The description of the hardware feature reads more like that the receiving task just has to be running. E.g. doing some calculations or whatever.

21:42 <heat> they also fail to give out any examples of latency, it's all relative

21:44 <mrvn> heat: note: "If the receiver isn't running the interrupt is delivered when the receiver gets context switched back." Latency can be days.

21:48 <mrvn> On the hardware side this seems like a nightmare to implement race free. Descriptor tables that need to be read and modified from multiple cores, permission checked, MSR registers changed, stack manipulated and all of it atomic.

21:49 moon-child is now known as bowl-of-petunias

21:49 bowl-of-petunias is now known as moon-child

21:49 <mrvn> 22:37 < mrvn> anyway, how is the app going to save FPU/SSE state?

21:50 <mrvn> Does gcc/clang "interrupt" attribute handle that magically?

21:50 <heat> xsave

21:50 <mrvn> heat: syntax error

21:51 <heat> ?

21:52 <mrvn> heat: it's user code and supposed to call C code from the description. If you have to mess around with asm for the handler then it failed.

21:53 <heat> it has intrinsics I guess

21:53 <heat> I would go with the "if you need to use __attribute__((interrupt)) in user-space, your feature failed" but that works too ;)

21:53 <mrvn> heat: The article only mentions 'interrupt' attribute

21:53 <mrvn> My question was if that makes gcc save FPU/SSE state as needed too

21:54 <heat> no

21:54 <heat> although there is going to be a new flag, muintr for the interrupt attribute

21:54 <heat> though they explicitly mention "Applications can choose to save floating point registers as part of the interrupt handler as well."

21:55 <heat> because x86_64 user space obviously never uses the FPU

21:55 * mrvn thinks this feature will cause tons and tons of hidden bugs where UIPI will corrupt fpu/sse state when it happens at a bad time.

21:57 <heat> i agree

21:57 <heat> but i bet no one is really going to use this

21:57 <mrvn> That makes it even worse. Because everyone will have to pay for it.

21:58 <mrvn> (in linux)

21:58 <zid`> KCONFIG_UIPI=N

21:58 <zid`> ftfy

21:58 <heat> hopefully they only save and restore in processes where the feature is actually enabled

21:58 <mrvn> heat: unless that would leak information then

21:59 <heat> wrmsr when switching to the thread and wrmsr when switching out

22:00 <mrvn> And who came up with the idea of having a "senduipi <index>" instruction instead of "senduipi <index>, data"? One should be able to at least send one word of data and not just interrupt.

22:00 <zid`> make a new index, there's your one bit of data

22:00 <mrvn> yeah, indexes 0-64. Only needs 8 UIPIs per word then.

22:01 <mrvn> s/64/63/

22:01 <heat> this was really poorly thought out IMO

22:02 <mrvn> or at least it has a very narrow target

22:02 <heat> for the context switching latency, the fans of this feature clearly don't like other threads all that much

22:02 <heat> so that was never an issue ;)

22:03 masoudd has joined #osdev

22:03 <heat> maybe they were waiting for fsgsbase to be merged so they could add another wrmsr to replace it :v

22:03 <zid`> I think it's basically just there to implement WaitForObjectEx(thread2);

22:03 <mrvn> it could be useful to replace signals, at least user defined ones.

22:03 <mrvn> zid`: no. if you wait the feature fails to deliver.

22:03 masoudd has quit [Max SendQ exceeded]

22:04 masoudd has joined #osdev

22:04 <mrvn> "No feedback on whether the interrupt was sent or received." Oh that sounds like fun.

22:05 <mrvn> So I do "senduipi 23" and cross my fingers.

22:05 <heat> there are no privilege checks on senduipis either

22:06 <geist> oh wow a mrvn appears

22:06 <heat> they assume everything is trusted

22:06 <mrvn> heat: The descriptors the index points to aren't checked?

22:06 <mrvn> geist: hi geist. I'm back. Had to make some tea, took me a while.

22:06 <geist> hah hadn't seen you since the bounce to the new server, thought maybe you got lost in the shuffle

22:06 <heat> "The current implementation expects only trusted and cooperating processes to communicate using user interrupts."

22:07 <mrvn> geist: yeah, missed that move.

22:07 <heat> this literally can't replace signals

22:07 <geist> i wonder how something like riscv's user interrupts were supposed to help with this

22:08 <geist> or if it was for a separate thing. i dunno if it was ever actually implemented

22:08 <mrvn> heat: In the API you need the uintr_fd to create the needed descriptor tables.

22:08 <mrvn> .oO(Everything is a file, even permissions)

22:09 mahmutov has joined #osdev

22:12 <mrvn> What happens when 2 cores use senduipi? Does it block one core till the receiver does uiret? Does it wake up the kernel to queue them? Does it drop it on the floor?

22:12 <mrvn> (I assume the last)

22:12 <heat> i think it queues them

22:13 <heat> that would be the most interrupty behaviour

22:13 <mrvn> heat: queues them where?

22:13 <mrvn> interrupts just get dropped when they fire twice.

22:14 joe9 has quit [Ping timeout: 256 seconds]

22:14 <heat> it's the upid I think

22:15 <mrvn> heat: assume both cores send the same index.

22:15 <mrvn> one probably shouldn't share indexes between senders.
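
The queue-or-drop question above can be made concrete with a toy model. This is only a sketch under an assumption the log does not confirm: that pending user interrupts are recorded as one bit per index in the receiver's descriptor (the "upid" heat mentions). `ToyReceiver`, `send` and `deliver` are hypothetical names for illustration, not any real API.

```python
class ToyReceiver:
    """Toy model of a receiver with one pending bit per interrupt index (0..63)."""

    def __init__(self):
        self.pending = 0  # 64-bit bitmap of requested indexes (assumed layout)

    def send(self, index):
        """Model a sender posting `index`; a repeat of a pending index is merged."""
        if not 0 <= index < 64:
            raise ValueError("index must be in 0..63")
        self.pending |= 1 << index  # setting an already-set bit changes nothing

    def deliver(self):
        """Model the receiver draining the bitmap: return pending indexes, then clear."""
        indexes = [i for i in range(64) if self.pending >> i & 1]
        self.pending = 0
        return indexes


rx = ToyReceiver()
rx.send(23)  # core A
rx.send(23)  # core B, same index, before the receiver ran
rx.send(5)   # a third sender with its own index
delivered = rx.deliver()
print(delivered)  # [5, 23]: the two sends of 23 were coalesced into one
```

In this model nothing is queued per send: distinct indexes survive side by side, but two senders sharing an index cannot be told apart, which matches the advice not to share indexes between senders.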

22:16 <heat> honestly idk

22:16 <heat> another wart in x86 i guess

22:17 <mrvn> yeah, lots of details left hanging in the air in the article.

22:17 <mrvn> geist: do you know of something to get code coverage for constexpr/consteval/constinit in c++?

22:18 dormito has joined #osdev

22:18 <mrvn> I saw this nice talk about using constexpr for unit tests but I really don't see how to check if all code is covered then.

22:23 epony has joined #osdev

23:08 AeroNotix has joined #osdev

23:08 ShahNaim has joined #osdev

23:09 masoudd has quit [Read error: Connection reset by peer]

23:21 masoudd has joined #osdev

23:22 heat has quit [Remote host closed the connection]

23:26 <moon-child> anyone know of good resources on implementing very robust persistent storage systems?

23:30 <dmh> can ya quantify how robust

23:30 <kazinsal> maybe read some whitepapers from netapp?

23:33 masoudd has quit [Remote host closed the connection]

23:33 <moon-child> dmh: like, _so_ robust

23:34 <moon-child> kazinsal: have a pointer? I see https://www.netapp.com/atg/publications/, but nothing there seems directly relevant

23:34 <bslsk05> www.netapp.com: Academic Industry Research Contributions Publications | NetApp

23:35 <kazinsal> https://hack.org/mc/texts/wafl.pdf

23:35 <mrvn> robust against what?

23:35 <kazinsal> https://www.netapp.com/pdf.html?item=/media/19939-tr-3298.pdf

23:35 <bslsk05> www.netapp.com: RAID-DP: NetApp Implementation of RAID Double Parity for Data Protection | NetApp

23:35 <moon-child> mrvn: hardware failure

23:36 <moon-child> perhaps also software failure

23:36 <mrvn> moon-child: run linux software raid with enough drives or better zfs.

23:36 <moon-child> sure. That's what I would do if I wanted to _use_ such a system. I want to know about how to implement one

23:37 <moon-child> kazinsal: thanks!

23:37 <mrvn> against software failure there really is just one thing: make backups and test them.

23:37 <mrvn> well, the algorithms for raid and zfs are known, read up.

23:38 <moon-child> can write defensively and try to layer abstractions well, so that bugs are less likely to cause unrecoverable data loss

23:38 <mrvn> checksum everything. there is nothing worse than creeping errors on storage
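
A minimal sketch of what "checksum everything" can look like at the block level, assuming a CRC32 stored in front of each payload. `seal` and `unseal` are hypothetical helper names, and a real storage system would more likely use a stronger hash and keep the checksum apart from the data it covers.

```python
import zlib


def seal(payload: bytes) -> bytes:
    """Prefix the payload with its CRC32 (4 bytes, big endian)."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def unseal(block: bytes) -> bytes:
    """Return the payload, or raise if it no longer matches the stored CRC32."""
    stored = int.from_bytes(block[:4], "big")
    payload = block[4:]
    if zlib.crc32(payload) != stored:
        raise ValueError("checksum mismatch: block is corrupt")
    return payload


block = seal(b"important data")

# Simulate a creeping error: one bit of the stored payload flips.
flipped = bytearray(block)
flipped[7] ^= 0x01

try:
    unseal(bytes(flipped))
    detected = False
except ValueError:
    detected = True

print(unseal(block), detected)
```

The checksum only detects the damage; getting the data back still needs a redundant copy or parity.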

23:38 dude12312414 has joined #osdev

23:39 <moon-child> sure, yes. That's a bit trivializing though

23:39 <kazinsal> https://web.archive.org/web/20150927213917/https://atg.netapp.com/wp-content/uploads/2012/12/RTP_Goel.pdf

23:39 <bslsk05> web.archive.org: Wayback Machine

23:40 dude12312414 has quit [Remote host closed the connection]

23:41 <mrvn> It also really depends on what access you want to grant. Is this an archive that you basically never read? Tape formats work very well for that and any error just corrupts single files or means you have to scan a bit for the next header.
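
The "scan a bit for the next header" property can be sketched with a made-up record format: a magic marker, a length and a CRC32 in front of every payload. The layout and the names `MAGIC`, `pack` and `scan` are assumptions for illustration, not tar or any real tape format.

```python
import struct
import zlib

MAGIC = b"REC1"                   # hypothetical record marker
HEADER = struct.Struct(">4sII")   # magic, payload length, CRC32 of payload


def pack(payload: bytes) -> bytes:
    """Serialize one record as header + payload."""
    return HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload


def scan(stream: bytes) -> list:
    """Return every intact payload, resynchronizing on MAGIC after damage."""
    found = []
    pos = 0
    while True:
        pos = stream.find(MAGIC, pos)
        if pos < 0 or pos + HEADER.size > len(stream):
            break
        _, length, crc = HEADER.unpack_from(stream, pos)
        start = pos + HEADER.size
        payload = stream[start:start + length]
        if len(payload) == length and zlib.crc32(payload) == crc:
            found.append(payload)
            pos = start + length   # jump past the good record
        else:
            pos += 1               # damaged record: keep scanning for the next header
    return found


archive = pack(b"alpha") + pack(b"bravo") + pack(b"charlie")

# Corrupt the first payload byte of the second record.
damaged = bytearray(archive)
damaged[len(pack(b"alpha")) + HEADER.size] ^= 0xFF

print(scan(bytes(damaged)))  # the damaged record is skipped, its neighbours survive
```

Because each record is self-describing, damage stays local: only the record that fails its checksum is lost.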

23:42 <mrvn> read/write/modify access is a totally different beast

23:43 * moon-child nods

23:43 <moon-child> I want to tolerate write-heavy workloads

23:44 <geist> probably the right strategy is to read up and learn the basics first and then work up to it

23:44 <geist> ie, as mrvn was saying read up and understand basic RAID and techniques first

23:45 <geist> which i think are still fairly ubiquitous

23:45 <geist> and work up to more integrated solutions like zfs or btrfs

23:45 <geist> but really the answer is this stuff is *incredibly* complicated

23:45 <mrvn> or really simple.

23:45 <geist> though like many things generally built with layers of simpler things

23:46 <geist> but the whole package ends up being a highly complex layer cake

23:46 <mrvn> A raid with a tar file on it or a common FS is really trivial stuff. But if your FS is buggy then bye bye data.

23:46 <geist> indeed. though if you also want checksumming of data you need a less trivial fs, etc

23:47 <geist> it starts to build up

23:47 <dh`> how robust do you mean by "very robust"?

23:48 <mrvn> If you value your data you don't want any single point of failure. So redundancy in the disk drives (raid, zfs). But also redundancy in the software: raid+ext4 on one set, btrfs on another, zfs on a third?

23:48 <moon-child> haha

23:48 <mrvn> And to protect against creeping error keep backups.

23:48 <geist> or physical redundancy, not having data in the same physical location

23:48 <kazinsal> or metaphysical redundancy (having an expensive 24X7 support contract)

23:49 <geist> noice

23:49 <dh`> there's one set of techniques for not losing filesystem volumes on crashes, there's another set for protecting against disk failures, and a third set for things like forest fires burning down your machine room

23:49 <geist> and if you combine them you get a Voltron of Storage

23:49 <mrvn> The most important thing I feel is protecting your data while you recover from a bad drive.

23:50 <dh`> these techniques don't really intersect that much because of the different operational constraints at the different levels of concern

23:50 <mrvn> A lot of people fail to recover from a recoverable failure. And I mean a lot.

23:51 <kazinsal> I've definitely seen people lose data because they thought RAID-5 was enough, and their drives were all from the same few lots

23:51 <kazinsal> one failed, then another failed mid-rebuild

23:51 <kazinsal> RIP your data, gone to the great bit bucket in the sky
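
Why a second failure mid-rebuild is fatal follows from single parity. The sketch below assumes one stripe with three data blocks and one XOR parity block; `xor_blocks` and `rebuild` are hypothetical names, and real RAID-5 additionally rotates the parity block across drives.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(stripe):
    """Fill in at most one missing block (None) from the surviving ones."""
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) > 1:
        raise RuntimeError("more than one drive lost: stripe is unrecoverable")
    repaired = list(stripe)
    if missing:
        survivors = [block for block in stripe if block is not None]
        repaired[missing[0]] = xor_blocks(survivors)
    return repaired


data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# One drive dies: its block is the XOR of everything that is left.
one_lost = rebuild([b"AAAA", None, b"CCCC", parity])

# A second drive dies during the rebuild: there is nothing left to XOR against.
try:
    rebuild([b"AAAA", None, None, parity])
    survived = True
except RuntimeError:
    survived = False

print(one_lost[1], survived)
```

Double-parity schemes such as the RAID-DP paper linked earlier exist to survive that second failure.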

23:51 <mrvn> You should also buy different drive models and drives from different batches. Rotate out some good drives after a while so drives in the array all have different amounts of use.

23:52 <mrvn> kazinsal: that happens a lot. A raid rebuild is usually a MAJOR load on the array. Way past the usual work load. And then the old flaky drives just go down like flies.

23:52 <kazinsal> yerp

23:52 <mrvn> kazinsal: most people don't scrub their raids.

23:53 <dh`> anyway, not losing filesystem volumes on crash is well understood and, theoretically, textbook material, but given the ... shortcomings of readily available implementations that might be an optimistic perspective

23:53 <kazinsal> I do datacenter stuff for a living so I tend to see a lot of storage nightmares

23:53 <mrvn> dh`: hardware screws that up.

23:53 <mrvn> dh`: don't trust your drives to adhere to standards and actually write your data when they say they do.

23:53 <kazinsal> unreliable power is one of the most deadly things to storage systems

23:53 <dh`> theoretically, your hardware robustness layer should take care of that (to the extent possible)

23:54 <kazinsal> all the code in the world will not save you if your drives keep losing power during workloads once a week

23:54 <mrvn> If you want robustness you want a write-once design.

23:54 <dh`> there is nothing you can do with SSDs that revert to an inconsistent mix of prior states on power failure

23:54 <dh`> other than use those disks only for /var/obj