#osdev on 2023-01-24 — irc logs at libera.irclog.whitequark.org

2021-05-23 01:57 klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books

00:05 danilogondolfo has quit [Remote host closed the connection]

00:48 nyah has quit [Quit: leaving]

00:51 Burgundy has quit [Ping timeout: 260 seconds]

01:04 Vercas has quit [Quit: Ping timeout (120 seconds)]

01:05 Vercas has joined #osdev

01:10 smach has quit [Ping timeout: 252 seconds]

01:10 Matt|home has quit [Quit: Leaving]

01:15 gog has quit [Ping timeout: 255 seconds]

01:35 Brnocrist has quit [Ping timeout: 252 seconds]

01:37 Brnocrist has joined #osdev

01:45 craigo has joined #osdev

02:04 xvmt has quit [Ping timeout: 264 seconds]

02:08 xvmt has joined #osdev

02:15 joe9 has quit [Quit: leaving]

02:30 Vercas has quit [Quit: Ping timeout (120 seconds)]

02:34 Vercas has joined #osdev

02:39 wxwisiasdf has joined #osdev

02:39 <wxwisiasdf> Hiiiiii

02:39 <wxwisiasdf> today is the day we consume RISCV 64 and embrace the greatness of RISCV 128

02:47 smach has joined #osdev

02:54 fedorafansuper has joined #osdev

02:54 fedorafan has quit [Ping timeout: 252 seconds]

03:07 masoudd has quit [Quit: Leaving]

03:10 heat has quit [Ping timeout: 256 seconds]

03:12 smach has quit [Read error: Connection reset by peer]

03:13 sugarbeet has joined #osdev

03:13 sugarbeet has left #osdev [#osdev]

03:24 CryptoDavid has quit [Quit: Connection closed for inactivity]

03:43 fedorafansuper has quit [Quit: Textual IRC Client: www.textualapp.com]

03:45 small_ has quit [Quit: Konversation terminated!]

03:59 <mrvn> You went from riscv 32 to riscv 64 and it wasn't enough. What makes you think doing the same again will be any better? Come one, go up to the next operand. 64 * 64 = riscv 4096

03:59 <mrvn> 64^2

04:02 dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]

04:17 srjek has quit [Ping timeout: 268 seconds]

04:25 smeso has quit [Quit: smeso]

04:27 zxrom has quit [Quit: Leaving]

04:30 Vercas has quit [Quit: buh bye]

04:31 Vercas has joined #osdev

04:32 smeso has joined #osdev

04:44 <wxwisiasdf> mrvn: riscv 96-bit

04:55 wxwisiasdf has quit [Ping timeout: 264 seconds]

05:31 <geist> i think there is a prototype riscv128 in work though, i should dig up infos on it to see

05:54 bradd has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]

05:54 bradd has joined #osdev

06:02 <sham1> What reason would there be

06:03 <sham1> What would be the use of a 128 bit ISA? I mean, okay, arithmetic, but other than that

06:04 <geist> well, the arithmetic could be a thing

06:06 <moon-child> I thought it was just about addr space

06:06 <moon-child> I mean, you could do just 128 bit arithmetic and smaller address space. That would be fine imo

06:06 <moon-child> (though not too useful in practice--multiword arithmetic is fine when you need it)

06:07 * geist nods

06:09 <geist> but a 64bit aspace we're fairy close to exhausting in some extreme situations, so i can see extending that out in a natural way to be a thing to consider

06:15 <moon-child> which conditions?

06:15 <moon-child> I mean, you could have more than 2^64 bytes of data

06:15 <geist> big ass machines

06:15 <moon-child> but it's not clear to me that you can practically exceed a 64-bit address space

06:15 <geist> also mapping large storage things into the aspace

06:16 <moon-child> and you don't want to double the size of your regular pointers

06:16 <geist> it's already enough that arm and x86 are extending from 48 to 57, etc

06:16 <AttitudeAdjuster> moon-child: bring back weird segment addressing of the old days maybe?

06:16 <geist> i'm not saying it's something anyone needs right now, but in 10-20 years easy

06:16 <AttitudeAdjuster> 16bit segment pointer with 64bit addr pointer

06:16 <moon-child> that's my point; the considerations for large storage things are different than for main memory. And doubling the size of all pointers seems like a bad tradeoff

06:16 <moon-child> AttitudeAdjuster: pls

06:16 <geist> so the riscv folks at least left in a nice forward compatibility mechanism

06:17 * geist shrugs

06:17 <AttitudeAdjuster> moon-child: fine i'll see myself out :'(

06:17 AttitudeAdjuster has left #osdev [dirty imposter]

06:17 AttitudeAdjuster has joined #osdev

06:17 <AttitudeAdjuster> jk

06:19 <moon-child> I wonder how many captchas on shady sites are fronts for captcha solving services

06:19 <moon-child> considering they get to sell a solved captcha and still get the benefit of a regular captcha

06:33 slidercrank has joined #osdev

06:45 truy has joined #osdev

06:47 <sham1> moon-child: doesn't even need to be a shady site. reCaptcha and thus nowadays Google does it even now to get training data

06:48 <moon-child> obviously. That's different

06:54 potash has quit [Quit: ZNC 1.8.2 - https://znc.in]

06:56 micttyl has joined #osdev

07:08 lockna has joined #osdev

07:12 bradd has quit [Ping timeout: 252 seconds]

07:14 bgs has joined #osdev

07:16 lockna has quit [Quit: lockna]

07:19 potash has joined #osdev

07:20 <epony> that implies that GOOG is not a shady business operaton.. but it is

07:21 <epony> nothing that happens in the inside is well understood and verifiable or validity evaluated by the public, it does some things that people speculate about and that's it

07:21 foudfou has quit [Quit: Bye]

07:22 foudfou has joined #osdev

07:24 <epony> the primary purpose of that is to rate limit the concurrency overloaded (enum) n:M (many) problem of servers that everyone uses, but that "concentration" is not really natural or meaningful, it's artificial (and not very intelligent)

07:28 <epony> it only obstruct regular usage people, not intentional violators of policies and limitations, nor business nor criminals, not mechanised and serviceable solvers and bypasses, as with copy protection and copyright and patents (and other "intellectual property") in general.. and GOOG steals secrets from your cornputers, that's why it's banned in research and development institutions and facilities outside USA (for example in German universities and other places)

07:28 Left_Turn has joined #osdev

07:28 <geist> can you just stop

07:28 <epony> yes

07:29 <geist> then please do

07:29 <epony> ok

07:30 Turn_Left has quit [Ping timeout: 268 seconds]

07:38 bradd has joined #osdev

07:44 truy has left #osdev [#osdev]

07:47 valerius_ is now known as valerius

07:51 potash has quit [Read error: Connection reset by peer]

08:14 smach has joined #osdev

08:15 <dinkelhacker> does anyone know how to make qemu start at EL3?

08:17 <dinkelhacker> nvm, found it -machine secure-on,virtualization=on

08:18 <geist> bingo. yep

08:18 <geist> aso means it wont emulate PSCI or whatnot, that's now your job (if you want to)

08:20 <dinkelhacker> as I don't know what it is I think I don't need it right now :D

08:22 <geist> yeah, if you just use virtualization=on you start at EL2 though

08:22 <geist> with PSCI emulated at a pseudo EL3

08:25 fedorafan has joined #osdev

08:38 craigo has quit [Ping timeout: 255 seconds]

08:47 danilogondolfo has joined #osdev

08:50 DonRichie has quit [Quit: bye]

08:54 smach has quit []

08:57 foudfou has quit [Remote host closed the connection]

08:59 foudfou has joined #osdev

09:11 gog has joined #osdev

09:45 gxt has quit [Remote host closed the connection]

09:46 gxt has joined #osdev

09:54 unimplemented has joined #osdev

09:56 unimplemented has quit [Read error: Connection reset by peer]

09:57 <ddevault> is this slide comprehensible https://l.sr.ht/0MNQ.png

09:57 <zid`> not really

09:58 <zid`> Usually you'd put blank lines in and make a sort of diagram showing the switches

09:58 <ddevault> blank lines?

09:59 <zid`> making y the time axis

09:59 <ddevault> Y is the time axis here

09:59 <zid`> so like your (blocked) lines

09:59 <ddevault> but, hm

09:59 <ddevault> maybe a table is better than an enumeration

09:59 <zid`> your thing does two things at the same time on multiple rows, so it isn't cpu time on y

10:00 <ddevault> fixed some of the timing issues https://l.sr.ht/2Ke9.png

10:00 <ddevault> could be multiple cores, the key is not CPU time but task states

10:00 <zid`> I'd say just outright remove (blocked)

10:00 <zid`> it's just making the screen busier

10:00 <zid`> -- at best

10:01 <zid`> I'd get rid of line 12 for similar reasons

10:02 <zid`> and is task 1 line 6 doing anything?

10:02 <zid`> seems like it could be folded into 5, and remove another 'two things on same line' case

10:02 <ddevault> latest https://l.sr.ht/d8L8.png

10:02 <ddevault> not sure what these numbers refer to after several edits

10:02 <zid`> same

10:05 <ddevault> any better? https://l.sr.ht/ccq5.png

10:05 <zid`> much nicer

10:05 <zid`> I prefer the old colours I think though

10:06 <zid`> no idea what they were trying to express

10:06 <zid`> but they were prettier

10:06 <ddevault> https://l.sr.ht/53gf.png

10:06 <ddevault> orange is kernel, black is userspace

10:06 <ddevault> to be explained in narration

10:07 nyah has joined #osdev

10:12 <ddevault> here's the whole slide deck, still not done expanding it for the full hour slot https://l.sr.ht/04Q9.pdf

10:14 GeDaMo has joined #osdev

10:16 <dinkelhacker> So if one compiled with -pic for pa space and you switch to va space you can't just write the va offset to the pc and sp, right? I mean it works as long as you don't have any static function pointer arrays which will contain the pa addresses...

10:19 <zid`> well if it's pic it's pic

10:19 <zid`> if it's not pic it's not

10:19 <zid`> tautology best ology

10:20 <zid`> If it's PIC, you can.. position it wherever you want, if it's not, you cannot

10:24 <dinkelhacker> I'll have to check later but I think I compiled with -fpic

10:35 bradd has quit [Ping timeout: 248 seconds]

10:36 <dinkelhacker> zid`: even if a global array contains pointers to functions? I mean the addresses stored in memory can only be one value?

10:39 <zid`> you'd need to process the GOT for that

10:39 <zid`> and do relocations

10:43 <dinkelhacker> Hmm.. seems like it would be much easier if the bootloader already sets up the vaspace and you directly compile the kernel for that?

10:44 <zid`> I get it easy because I use two binaries

10:44 <zid`> I turn the mmu on and jump to the pre-prepared kernel binary built to run at a specific VA

10:44 <zid`> achieveable with a linker script fine though

10:44 <zid`> even as a single binary

10:46 <dinkelhacker> as a single binary? How? Tell the linker that this one part of the code is at pa and the rest at va?

10:46 <zid`> it's just two sections with two different virtual addresses

10:47 <zid`> . = 1M; .text.low : { bootstrap.o } . = -2GB; .text.high : { kernel.o } or such

10:49 <dinkelhacker> bootstrap.o would be at a physical address, then you turn on the mmu und jump to kernel.o which is at a virtual address? I don't get the "two different _virtual_ addresses part.

10:50 <zid`> VA = PA

10:50 <zid`> you can still consider it a virtual address

10:50 <zid`> it's just identity mapped until the mmu is on

10:50 <zid`> your code doesn't give a shit about the physical address, just which virtual address things are visible through

10:51 <dinkelhacker> okay but wouldn't the image grow if you they are far apart?

10:51 <zid`> We're only changing the virtual addressing

10:52 <zid`> the physical is still the load address of the ELF (1MB for me, text.low would be at like 0x1001000 and text.high would be at 0x1002000)

10:53 <dinkelhacker> and what exactely tells the linker that you are changing the virtual address?

10:54 <zid`> . =

10:54 bradd has joined #osdev

10:54 <zid`> I made a test setup I can show you

10:58 <zid`> https://github.com/zid/test_va

10:58 <bslsk05> zid/test_va - Example (0 forks/0 stargazers)

10:58 <zid`> There

11:00 <zid`> https://cdn.discordapp.com/attachments/417023075348119556/1067398430315532309/image.png

11:00 <zid`> f() and g() both know which address they will be running from, as shown by the disassembly

11:02 <zid`> you can also use AT() to disjoint what ends up in the program headers, if needed

11:02 xenos1984 has quit [Read error: Connection reset by peer]

11:02 <zid`> or >

11:03 fedorafan has quit [Ping timeout: 256 seconds]

11:04 <dinkelhacker> thx! I'll take a look. I thought I did that at some point and it ended up groving my image a lot. But now that you explained it I don't kow why it should.

11:05 <zid`> you did . inside the {}

11:05 <zid`> so you had 'start of section at x, end of section at y'

11:06 <zid`> so it had to pad it

11:06 <zid`> https://cdn.discordapp.com/attachments/417023075348119556/1067399889593577492/image.png

11:06 fedorafan has joined #osdev

11:07 <zid`> That's a weird binary that says .text.low will be in physical memory at 10M but expects to run from 1M, and .text.high will be in physical memory at 20M but expects to run at 128M

11:08 <zid`> I have a 1M = 1M, and a 1.1M = 510TB for my acutal thing, the 1M=1M low code runs with paging disabled, I use it to set up the 510TB -> 1.1MB mapping, then jump to 510TB

11:08 Burgundy has joined #osdev

11:08 <dinkelhacker> oh okay! I think I got it

11:20 xenos1984 has joined #osdev

11:25 Burgundy has left #osdev [#osdev]

11:27 bauen1 has quit [Ping timeout: 256 seconds]

11:32 truy has joined #osdev

11:33 smach has joined #osdev

11:41 gxt has quit [Remote host closed the connection]

11:41 gog has quit [Ping timeout: 246 seconds]

11:44 gxt has joined #osdev

12:03 heat has joined #osdev

12:13 elastic_dog has quit [Ping timeout: 252 seconds]

12:13 elastic_dog has joined #osdev

12:22 smach has quit [Read error: Connection reset by peer]

12:30 <dinkelhacker> zid`: thx, btw ;)

12:35 <ddevault> final slide deck https://l.sr.ht/Lw4Y.pdf

12:40 bauen1 has joined #osdev

12:49 bauen1 has quit [Ping timeout: 255 seconds]

12:59 bauen1 has joined #osdev

13:03 dutch has quit [Quit: WeeChat 3.8]

13:10 fedorafan has quit [Ping timeout: 252 seconds]

13:11 <dinkelhacker> zid`: ok so I've actually done it like you mentioned. But when I create a binary I used objcopy -O binary out.elf out.img. The -O binary makes it bigger actually

13:13 Gooberpatrol_66 has joined #osdev

13:13 <dinkelhacker> at least when I have sections with addresses far apart. Without that my binary is actually smaller (30k instead of 250k)

13:13 TkTech7 has joined #osdev

13:14 xvmt_ has joined #osdev

13:14 sebonirc_ has joined #osdev

13:14 Patater has joined #osdev

13:14 pounce_ has joined #osdev

13:14 puck__ has joined #osdev

13:15 fedorafan has joined #osdev

13:15 zhiayang_ has joined #osdev

13:15 samis has joined #osdev

13:15 childlikempress has joined #osdev

13:16 nyah_ has joined #osdev

13:17 outfox_ has joined #osdev

13:17 doppler_ has joined #osdev

13:18 corank has joined #osdev

13:20 nyah has quit [*.net *.split]

13:20 xvmt has quit [*.net *.split]

13:20 sortie has quit [*.net *.split]

13:20 TkTech has quit [*.net *.split]

13:20 DrPatater has quit [*.net *.split]

13:20 CompanionCube has quit [*.net *.split]

13:20 corank_ has quit [*.net *.split]

13:20 _koolazer has quit [*.net *.split]

13:20 Gooberpatrol66 has quit [*.net *.split]

13:20 ebb has quit [*.net *.split]

13:20 puck has quit [*.net *.split]

13:20 Clockface has quit [*.net *.split]

13:20 sebonirc has quit [*.net *.split]

13:20 mahk has quit [*.net *.split]

13:20 outfox has quit [*.net *.split]

13:20 stux has quit [*.net *.split]

13:20 doppler has quit [*.net *.split]

13:20 zhiayang has quit [*.net *.split]

13:20 moon-child has quit [*.net *.split]

13:20 pounce has quit [*.net *.split]

13:20 xvmt_ is now known as xvmt

13:20 sebonirc_ is now known as sebonirc

13:20 pounce_ is now known as pounce

13:20 TkTech7 is now known as TkTech

13:20 zhiayang_ is now known as zhiayang

13:21 sortie has joined #osdev

13:22 koolazer has joined #osdev

13:22 <heat> dinkelhacker, how does your linker script look?

13:23 <heat> for a regular elf if you start jumping around the vaddr when objcopying to binary you are forced to have padding

13:24 <heat> so PHDR [1MiB, 2MiB], PHDR [4MiB, 4MiB + 4] will objcopy to ~3MiB + 4 bytes

13:25 <heat> sorry, not vaddr but probably paddr

13:25 ebb has joined #osdev

13:27 <zid`> unrelated

13:34 <zid`> honestly the phys field in an elf loader is *incredibly* rarely useful

13:37 <dinkelhacker> heat: it looks like so https://pastebin.com/N4fQ6rX5

13:37 <bslsk05> pastebin.com: ENTRY(_start)__stack_core_0 = 0x160000 - 0x10000;__stack_core_1 = 0x1600 - Pastebin.com

13:39 <dinkelhacker> so if I `obcopy -O binary` that I get roughly 30k. Without -O binary I have 250k.. just tried running that on the pi which did not work

13:41 <zid`> show readelf -l

13:41 <zid`> end is a bit of a mess btw

13:42 <dinkelhacker> I mean probably because without that its just an ELF file right? The pi expects a binary?

13:42 <zid`> . = align(4096); . = align(4096); bss_end = .; end = .;

13:42 <heat> I don't know. maybe?

13:42 <heat> they usually expect a flat binary

13:42 <heat> but idk about the pi

13:42 <zid`> idk what pi expects, qemu can probably deal with elf at least

13:42 <zid`> but, show readelf -l

13:42 <heat> btw, let me guess, your elf has debug info/syms

13:42 <heat> :))

13:43 <zid`> /DISCARD/ ho

13:44 <heat> btw, quick linker script tips: you can ALIGN(4096) when declaring your sections (like .text ALIGN(0x1000) : ...), you should do *(.data*), *(.text*) because the compiler sometimes generates stuff like .text.hot, etc

13:45 <zid`> .text.startup

13:45 <zid`> is a classic

13:45 <zid`> and arm type devices always have a bunch of weird shit

13:45 <dinkelhacker> yeah qemu can but not the pi.. so that's my problem I can't compile it in a way where I have some code in the pa space and some in the va space to get around the problem I head when switching to va space

13:45 <zid`> like .data.constpool.rel8

13:45 <zid`> dinkelhacker: readelf -l plskthx

13:45 <heat> dinkelhacker, why can't you

13:46 <zid`> You just need to make the rom be what the elf would be sans header, which will probably just be.. to do nothing besides move . around

13:47 <dinkelhacker> heat: bc. the binary whould be huge?

13:47 <heat> it would not

13:47 <zid`> no, you're confusing file offsets with virtual addresses

13:47 <heat> you just need to do it properly

13:47 <zid`> file offsets should be linear and packed, you use the mmu to map some of that file into memory at high addresses

13:47 <heat> ELF supports vaddr != paddr

13:48 gildasio has quit [Ping timeout: 255 seconds]

13:48 <zid`> paddr doesn't even matter here, heat

13:48 <zid`> we won't be using a physical loader for the elf

13:48 <heat> in the linker script you can use AT(...) to set up the paddr for your sections

13:48 <zid`> if paddr mattered, which it won't

13:48 <heat> wdym "we won't be using a physical loader for the elf" ?

13:48 <zid`> elf will be flashed to a rom or whatever

13:49 <zid`> nobody is then going to 'load' the elf section by section into physical memory

13:49 <zid`> it'll get splatted there in an -O binary blob

13:49 <clever> but objcopy to .bin, uses paddr rather then vaddr when laying out sections

13:49 <zid`> which is why you should ignore it

13:49 <clever> so you may have a gig of gap in the vaddr, but no gap in paddr

13:49 gog has joined #osdev

13:49 <zid`> just shove everything into .text starting at 0 if you're blobbing

13:49 <heat> sure, but you need to do it properly to get a usable ELF you can easily objcopy or use for debugging, etc

13:50 <zid`> ignore paddr, let the linker sort it out

13:50 gildasio has joined #osdev

13:50 <zid`> yea that tracks

13:50 <clever> zid`: the difference matters most in XIP targets, where you want the linker to put .data in ram, but objcopy to put .data into the ROM with .text

13:50 <dinkelhacker> https://pastebin.com/s12YfWtd

13:50 <bslsk05> pastebin.com: Elf file type is EXEC (Executable file)Entry point 0x160000There is 1 progra - Pastebin.com

13:50 <zid`> clever: that's a loader

13:51 <zid`> jesus lol

13:51 <zid`> oh, -ffunction-sections?

13:51 <heat> what the fuck

13:51 <heat> why are you putting everything in one phdr?

13:51 epony has quit [Read error: Connection reset by peer]

13:51 <zid`> I mean, that's what I do, unless like you said, I need to run it through other tools

13:51 <zid`> like deboogers

13:51 <clever> looks like the linker script didnt merge .text.* into .text

13:52 <heat> it did not

13:53 <zid`> but I use the wildcard :p

13:53 <heat> dinkelhacker, btw your load address is bogus

13:53 <zid`> .text : { *.o (.text*); }

13:53 <zid`> --wide exists also btw

13:53 <dinkelhacker> guys I can't follow anymore >D

13:53 <kaichiuchi> hi

13:53 <zid`> sections go in, sections go out

13:53 <heat> ok everyone shut the fuck up

13:53 <heat> including kaichiuchi

13:53 <heat> fuck you

13:54 <kaichiuchi> fuck you too

13:54 <zid`> oh I have him ignroed, makes sense

13:54 <heat> <3

13:54 <kaichiuchi> zid`: me!

13:54 <kaichiuchi> ?

13:54 <heat> dinkelhacker, pressing concerns: your load address is nothing a rpi will ever load

13:54 <zid`> pfft I checked logs he was being fine

13:55 <heat> dinkelhacker, https://github.com/raspberrypi/documentation/blob/develop/documentation/asciidoc/computers/config_txt/boot.adoc#kernel_address

13:55 <bslsk05> github.com: documentation/boot.adoc at develop · raspberrypi/documentation · GitHub

13:55 <kaichiuchi> wonder why zid` would ignore me

13:55 dude12312414 has joined #osdev

13:55 bauen1 has quit [Ping timeout: 252 seconds]

13:56 <kaichiuchi> i’d love to target an OS to rpi

13:56 <heat> dinkelhacker, I've also heard "0x80000 for older 64-bit kernels ("arm_64bit=1" set, flat image)"

13:56 <gog> hi

13:56 <kaichiuchi> hi

13:56 <heat> hell

13:56 <gog> did i mss some darama

13:56 <heat> no

13:56 <kaichiuchi> no

13:56 <gog> boring

13:56 <clever> heat: you can also just set kernel_address= in config.txt to force a certain load addr, as long as it doesnt conflict with other parts

13:57 <heat> dinkelhacker, TLDR your load address makes no sense and that would explain it if your thingy doesn't work

13:57 <heat> clever, any insight into the load address spaghetti fuckery?

13:57 <clever> i would need to see the linker script

13:57 <dinkelhacker> heat: The binary the pi will load is a small bootloader I have on the sd card which allows me to send the actualy binary via uart. However, this bootloader has the some load address. I mean, the pi just loads the binary to address 0x80000 and jumps to it.

13:58 <dinkelhacker> and it works fine ... i don't think the load address matters at all for the pi?

13:59 <clever> dinkelhacker: its more, that if your not writing PIC code, and your binary is loaded to a different address from where you linked it, things malfunction in fun ways

13:59 <heat> yes, the address you link your binary to run at needs to more or less match or here be dragons

14:00 <clever> but if the bootloader is loading your binary to 0x160000, it should be fine

14:00 <heat> s/more or less//

14:01 <dinkelhacker> yeah that is what I'm saying. The bootloader is linked to 0x80000, the pi laods that and executes it and loads the binary to 0x160000

14:01 <sham1> PIC without IP-relative addressing seems "fun". I hope that ARM has that

14:03 <clever> dinkelhacker: what is not working?

14:04 <dinkelhacker> but I still don't get how i should have one linker script where one code section is at 0x160000 and one at 0x40000000, objcopy that to a flat binary without it not beeing like 1GiB in size?

14:04 Vercas has quit [Ping timeout: 255 seconds]

14:04 <clever> dinkelhacker: thats what AT and paddr is for, to tell objcopy how to layout the things in the .bin, something else (mmu or memcpy) then has to move them to the "right" addr later

14:05 bradd has quit [Ping timeout: 260 seconds]

14:06 <heat> sham1, i think arm and riscv are mostly PIC

14:07 <clever> ive recently been looking into the encoding more, `b label` is always PC-relative, but bits0/1 of the addr are missing, because the target must be 32bit aligned

14:07 <clever> but if your jumping to something that might be thumb, you need `bx r0`, and now you need to get the addr into r0 first, `ldr` is typical, but thats not usually PIC

14:08 <clever> and i vaguely remember an `adc` opcode, that is basically just `r0 = pc + offset`

14:08 <heat> dinkelhacker, https://github.com/heatd/Onyx/blob/master/kernel/arch/x86_64/linker.ld

14:08 <bslsk05> github.com: Onyx/linker.ld at master · heatd/Onyx · GitHub

14:09 <heat> this linker script has code at 16MiB and -2GiB (almost +256TiB)

14:09 <heat> as you may guess, I don't get a 256TiB blob :))

14:09 antranigv has quit [Quit: ZNC 1.8.2 - https://znc.in]

14:10 <dinkelhacker> I thought AT was irrelevant? o.O

14:10 <dinkelhacker> but that makes a lot more sense ^^

14:10 <clever> objcopy uses the paddr, AT sets the paddr

14:10 antranigv has joined #osdev

14:11 <dinkelhacker> Ok... that makes sense... THANK YOU

14:11 <sham1> I dislike this immensely. Why would you physically link your kernel at 16MiB heat

14:11 bauen1 has joined #osdev

14:12 <sham1> Just make a separate thing that puts your kernel at -2TiB vaddr from the outset

14:13 dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]

14:17 mlombard has joined #osdev

14:21 <zid`> 16MB gives you loves of space for activities underneath

14:22 <zid`> lots

14:22 <zid`> like, 128 stacks

14:23 valerius has quit [Killed (NickServ (GHOST command used by theophilus!~corvus@user/theophilus))]

14:23 valerius_ has joined #osdev

14:25 <clever> dinkelhacker: also, a handy trick, if you pass qemu a .elf file, it will respect the load addresses (i forget if its paddr or vaddr) and i think the entry-point

14:25 <clever> dinkelhacker: so you could skip your bootloader when in qemu

14:26 <dinkelhacker> I do that.

14:31 antranigv has quit [Quit: ZNC 1.8.2 - https://znc.in]

14:35 epony has joined #osdev

14:35 antranigv has joined #osdev

14:38 Vercas has joined #osdev

14:42 Vercas has quit [Client Quit]

14:42 <dinkelhacker> how do you actually set the load address?

14:43 <clever> dinkelhacker: https://github.com/cleverca22/gba-template/blob/master/src/linker.ld#L1-L5

14:43 Vercas has joined #osdev

14:43 <bslsk05> github.com: gba-template/linker.ld at master · cleverca22/gba-template · GitHub

14:43 <clever> this defines various regions of memory that the linker should know about

14:43 <clever> and the > later on, says which region a section belongs in

14:44 <clever> .data has a vaddr within iwram, but a paddr within rom

14:45 <clever> any time c/asm refers to a symbol in .data, it will get the vaddr of the symbol

14:45 <clever> but objcopy -O binary, uses the paddr

14:45 Burgundy has joined #osdev

14:46 Vercas has quit [Client Quit]

14:46 Vercas has joined #osdev

14:47 <dinkelhacker> So my load address was weird because I never set a paddr in the linker script?

14:47 <clever> there was probably a default load addr somewhere

14:48 <clever> or your using that trick others do, where they just shove 16mb of zeros into the .o file, via .space

14:48 <clever> and praying it all lines up

14:48 <dinkelhacker> where could that default load address be?

14:48 uzix has joined #osdev

14:49 <clever> somewhere in the binutils source

14:49 uzix is now known as mahk

14:49 <dinkelhacker> kk

14:50 <dinkelhacker> and that load address whould be normaly used by a proper loader?

14:51 <clever> when using a .bin file, the load address is basically lost

14:51 <clever> objcopy just gives you a binary, that spans the lowest addr to the highest addr

14:51 <clever> and its your responsibility, to ensure its loaded at the addr the linker was expecting

14:58 <clever> dinkelhacker: another option, is to just implement elf in the bootloader, and send it that

14:58 <clever> then the bootloader will respect the elf headers

14:58 <dinkelhacker> Yeah that lines up with what I knew. I think I just got completely confused bc. I didn't know about about the paddr/vaddr/objcopy thing.

14:59 srjek has joined #osdev

14:59 <dinkelhacker> Maybe once I have usb running. sending more than a couple of KB via uart is so slow

14:59 <dinkelhacker> heat: thx for the link of your linker file. That helps!

15:00 <zid`> Imagine having more than a couple of kb of project

15:00 <clever> i implemented xmodem a while back, for loading an entire .elf

15:00 <clever> it wound up taking 2 minutes to load

15:00 * zid` hides his 3.2MB kernel image

15:00 <clever> so i went back to using the official netboot

15:00 <zid`> It's water weight I swear

15:00 <clever> dinkelhacker: which reminds me, you can just have your rpi boot kernel.img over tftp, 100mbit or more!

15:00 <zid`> It *might* be the giant background bmp.

15:01 <sham1> I wouldn't imagine XModem being particularly fast

15:01 <dinkelhacker> zid`: I bet you don't send it via uart^^

15:01 dutch has joined #osdev

15:01 <zid`> who needs a uart when you have have a background bmp

15:01 <clever> sham1: yeah, all it added was error detection and retry

15:02 <clever> i was also only running at 115200 baud

15:02 <zid`> lto breaks --wide even, stupid lto

15:02 <clever> but i have ran at 1,000,000 baud before, and could have tried that

15:03 <zid`> .gnu.lto_ayame.0.9b1d301769837a9b

15:03 <zid`> good section name

15:04 Vercas has quit [Ping timeout: 255 seconds]

15:05 <sham1> Thanks .gnu

15:09 <clever> dinkelhacker: have you looked into the netboot on the pi yet? it works on every model

15:09 <clever> (that has ethernet)

15:10 <dinkelhacker> no i haven't

15:10 <clever> it lets you just throw start(4).elf + kernel.img onto a tftp server, whack reset on the pi, and boom, its running

15:10 <clever> no need to swap SD cards, no need to wait on uart

15:11 <dinkelhacker> seems like I tend to set life difficulty to `hard`....

15:11 <dinkelhacker> "wack reset" please don't tell me it has a reset button...

15:11 <clever> it has a reset pin

15:12 <clever> in the past, ive wired it to a giant arcade console button

15:12 <clever> so i can just smack it every time the build is done

15:12 <clever> but lately, ive wired reset to a pin on my uart adatper

15:12 <dinkelhacker> I've setup an interrupt on one of the gpios which i trigger through openocd and than let the watchdog timeout

15:13 <clever> openocd could also just halt the arm, and then write to the watchdog

15:14 <clever> and with the arm halted, it cant fight back!

15:14 <dinkelhacker> funny you say that ... i realized that today and tried which seems to segfault my openocd version

15:15 <clever> that sounds like a bug in openocd

15:15 <clever> also beware of the arm mmu, you might need to turn it off, if your not sure where the mmio is mapped

15:17 <clever> also, with just 3 opcodes (and knowing which registers can be clobbered), you can write a single byte to the uart

15:17 <dinkelhacker> Follow up question on the linker topic: So i get now how i can compile the code so that one portion uses pa and the other va. If one version uses a function that is normally in the other world that won't work? Or will it with PIC?

15:17 small has joined #osdev

15:18 <clever> in the past, ive made a putc ASM macro, so i could just print a char anywhere, to debug things

15:18 <zid`> It's all VA.

15:18 <zid`> Just sometimes the mmu is disabled, such that PA=VA

15:18 <zid`> (or identity mapped)

15:18 <dinkelhacker> okay right

15:19 <clever> the linker always acts on vaddr, and assumes the vaddr is always right

15:19 <clever> so when the mmu is off, you need to ensure the binary is loaded to that vaddr (or exclusively use PIC asm)

15:19 <clever> when the mmu is on, you then need to ensure the binary is mapped to that vaddr

15:20 <clever> 2023-01-24 10:09:07 < heat> this linker script has code at 16MiB and -2GiB (almost +256TiB)

15:20 <clever> this example, has 2 chunks of code, a pre-mmu code, with an addr that is in valid phys memory

15:20 <clever> and some post mmu code, that lives at the top of the virt addr space

15:21 <dinkelhacker> and you can't call the post mmu code from the pre mmu code

15:21 <clever> if the mmu is on, and youve mapped both to the respective addresses, you can call back and forth

15:21 <dinkelhacker> yeah of course

15:21 <dinkelhacker> well that might be the way to go

15:21 <clever> but typically, the pre-mmu half is only mapped for a short time, until you jmp to the post-mmu code

15:21 masoudd has joined #osdev

15:21 <clever> then the pre-mmu half is discarded

15:22 <zid`> va -> what address I want to jump to to run this code

15:22 <zid`> if your mmu is off at the time, that locks you into "it has to be the same as the physical address it is loaded to", if it's on, it can be whatever you like

15:23 <zid`> You know what the case is

15:23 <dinkelhacker> Right. So the first thing the pre-mmu part would do is map the post-mmu part. Now nothing can go wrong at this point and you can jump wherever. After that I jump to post-mmu code and disable the pre-mmu mapping

15:24 <dinkelhacker> Is that more or less what you would do?

15:24 <clever> yep

15:24 <zid`> <zid`> I have a 1M = 1M, and a 1.1M = 510TB for my acutal thing, the 1M=1M low code runs with paging disabled, I use it to set up the 510TB -> 1.1MB mapping, then jump to 510TB

15:24 <zid`> and we're full circle again :p

15:26 Vercas has joined #osdev

15:29 <dinkelhacker> Yeah.. I'm a bit slow today.. woke up at 5 bc our central heating died

15:33 Vercas has quit [Quit: Ping timeout (120 seconds)]

15:34 Vercas has joined #osdev

15:40 slidercrank has quit [Ping timeout: 265 seconds]

15:43 bauen1 has quit [Ping timeout: 260 seconds]

15:44 <mrvn> dinkelhacker: I found it becomes far easier to understand and implement if you separate the pre-mmu and post-mmu parts fully. Build your kernel to run in virtual address space and make a blob of that. Then make a tiny loader that has a bit of ASM code and the kernel blob and just activates the MMU, sets the page table and then calls into the actual kernel.

15:45 <mrvn> there shouldn't really be any shared code between the two.

15:45 <clever> that split design also makes it far simpler to have a pre-mmu printf, and a post-mmu printf

15:45 <clever> and you can just printf() from either, and it will call the right variant

15:46 <zid`> If you're really disgusting it can be the same printf twice

15:46 <zid`> using cool ifdefs to stop it including the wrong headers, yum yum

15:46 <clever> another option (little-kernel for example), is to hand write the pre-mmu part as PIC asm

15:47 <mrvn> not like the pre-mmu stuff should need a full printf. a puts() and put_hex() at most.

15:47 <clever> zid`: headers shouldnt matter, it can even be the same printf.o, its purely what the linker script does

15:47 <clever> mrvn: that as well

15:47 <clever> with LK, the pre-mmu part is as dumb a a brick, and i dont think it even has a stack

15:47 <clever> and because its PIC, the load addr can be "wrong"

15:47 <clever> and it will just configure the mmu to fix that

15:47 <mrvn> that's the ideal case.

15:48 <zid`> clever: Depends where your printf is

15:48 <clever> yep

15:48 <zid`> mine's in basically "string/stdlib except malloc.o"

15:48 <zid`> so you'd need to massage the source a small amount with some light ifdefs to stop it trying to pull in the rest of my kernel

15:48 <dinkelhacker> mrvn: I was thinking about that but wouldn't I end up with 2 binaries ?

15:48 <clever> zid`: ive had trouble getting newlib to work on my latest project, so i just grabbed the old rpi-open-firmware printf

15:48 <clever> https://github.com/cleverca22/gba-template/blob/master/src/xprintf.c

15:48 <bslsk05> github.com: gba-template/xprintf.c at master · cleverca22/gba-template · GitHub

15:49 <mrvn> dinkelhacker: sort of. You build kernel.elf -> kernel.blob and that you link into the loader.

15:49 <clever> zid`: this basically just turns into a xprintf.o with a .text, and in theory, the linker could then include that in both the pre-mmu and post-mmu binaries

15:49 <zid`> https://github.com/zid/bootstrap/blob/master/boot/print.c I just wrote one, ignore the ghetto as fuck ega text parts :p

15:49 <bslsk05> github.com: bootstrap/print.c at master · zid/bootstrap · GitHub

15:49 <zid`> but I started the kernel one by just copy pasting this file

15:49 <dinkelhacker> mrvn: okay so only one binary in the end?

15:49 <zid`> so really I could hve just done stupid incestuous linking

15:50 <clever> dinkelhacker: yeah, the post-mmu binary gets baked into the second binary

15:50 <clever> either with cat, or .incbin

15:50 <zid`> lame

15:50 <mrvn> dinkelhacker: yes. In many cases you only have the option of a kernel and initrd. With multiboot you can do loader, kernel, initrd, other-blobs, ... but that is rare.

15:50 <clever> https://github.com/librerpi/lk-overlay/blob/master/platform/bcm28xx/arm/payload.S#L3-L6

15:50 <bslsk05> github.com: lk-overlay/payload.S at master · librerpi/lk-overlay · GitHub

15:50 <clever> dinkelhacker: here is a .incbin example, where i'm taking the objcopy output of another build, and including it into the .rodata

15:51 <mrvn> dinkelhacker: On some hardware you even have to attach the initrd to the loader/kernel for a single file alltogether.

15:52 <clever> xen under grub, abuses the initrd api, to pass the true kernel to the xen "bootloader" kernel

15:52 <dinkelhacker> clever: so you just branch to bcm2835_payload_start and - abrekadabra - you

15:52 <dinkelhacker> are in the other binary?

15:52 <clever> dinkelhacker: you would want to configure the mmu, so something like -2GiB maps to bcm2835_payload_start

15:53 <clever> and then turn on the mmu and jump to -2GiB

15:53 <clever> .align can be used, to ensure bcm2835_payload_start is page-aligned

15:54 <mrvn> dinkelhacker: in the simplest case the included binary just starts with the entry point and you just jump to it. But you can also have a blob that contains structured data telling you where the the .text, .rodata, .data, .bss section of the kernel is. Where the entry point is. A whole lot of relocation data so you can do address space randomization. But just calling the payload_start is a good begining.

15:59 <dinkelhacker> Okay... man I just wanted to run the thing in qemu, which made me realize all kinds of things have just worked by accident because of quirks of the pi and now I'm basically back to the start >D But that's good I feel like the picture gets much clearer.

16:00 <zid`> ye I rewrote my boot setup shit several times

16:00 <zid`> until I got something I was only vaguely unhappy with

16:00 <dinkelhacker> Haha yeah sometimes it's one step forward, 2 miles back

16:02 <clever> some things i need to look into in the future

16:02 <clever> 1: usb-device bootloader, for the device capable models

16:02 <clever> 2: usb-host bootloader, with msd/tftp support

16:02 <clever> 3: fixing u-boot

16:03 <clever> 4: implementing psci

16:04 stux has joined #osdev

16:07 Matt|home has joined #osdev

16:07 gildasio has quit [Ping timeout: 255 seconds]

16:10 gildasio has joined #osdev

16:10 Vercas has quit [Quit: Ping timeout (120 seconds)]

16:11 Vercas has joined #osdev

16:16 gildasio has quit [Ping timeout: 255 seconds]

16:17 joe9 has joined #osdev

16:25 gildasio has joined #osdev

16:26 Vercas has quit [Ping timeout: 255 seconds]

16:27 <dinkelhacker> a lot to do

16:28 pretty_dumm_guy has joined #osdev

16:32 Vercas has joined #osdev

16:41 terminalpusher has joined #osdev

16:48 bauen1 has joined #osdev

16:53 <mjg> heat: https://www.phoronix.com/forums/forum/phoronix/latest-phoronix-articles/1359948-trying-out-the-bsds-on-the-intel-core-i9-13900k-raptor-lake?p=1369206#post1369206

16:53 <bslsk05> www.phoronix.com: Trying Out The BSDs On The Intel Core i9 13900K "Raptor Lake" - Phoronix Forums

16:53 <kaichiuchi> you forgot to highlight me as well

16:53 <kaichiuchi> :(

16:54 <kaichiuchi> since i am a bsd fan

16:54 masoudd_ has joined #osdev

16:54 masoudd has quit [Read error: Connection reset by peer]

16:55 <mjg> kaichiuchi: https://www.phoronix.com/forums/forum/phoronix/latest-phoronix-articles/1359948-trying-out-the-bsds-on-the-intel-core-i9-13900k-raptor-lake?p=1369206#post1369206

16:55 <mjg> well in short it is already resolved, just not present in the release they tested

16:55 <kaichiuchi> thanks

16:56 <mjg> and it was not even a freebsd bug per se

16:56 small has quit [Ping timeout: 252 seconds]

16:56 <mjg> even so, makes you wonder how come openbsd did not have the problem

17:03 craigo has joined #osdev

17:03 craigo has quit [Read error: Connection reset by peer]

17:03 <zid`> That's exactly as working as I expected freebsd to be

17:03 gog has quit [Quit: Konversation terminated!]

17:04 <mjg> :)

17:05 craigo has joined #osdev

17:07 truy has left #osdev [#osdev]

17:13 craigo has quit [Ping timeout: 252 seconds]

17:19 masoudd_ has quit [Quit: Leaving]

17:20 craigo has joined #osdev

17:21 linearcannon has joined #osdev

17:23 terminalpusher has quit [Remote host closed the connection]

17:23 terminalpusher has joined #osdev

17:24 masoudd has joined #osdev

17:26 small has joined #osdev

17:30 xenos1984 has quit [Ping timeout: 246 seconds]

17:31 xenos1984 has joined #osdev

17:31 craigo has quit [Ping timeout: 246 seconds]

17:34 Vercas has quit [Quit: Ping timeout (120 seconds)]

17:35 Vercas has joined #osdev

17:37 joe9 has quit [Quit: leaving]

17:41 craigo has joined #osdev

17:41 childlikempress is now known as moon-child

17:43 Vercas has quit [Ping timeout: 255 seconds]

17:44 Vercas has joined #osdev

17:57 small has quit [Ping timeout: 260 seconds]

18:00 fedorafan has quit [Read error: Connection reset by peer]

18:01 fedorafan has joined #osdev

18:04 gog has joined #osdev

18:11 joe9 has joined #osdev

18:18 Vercas has quit [Quit: Ping timeout (120 seconds)]

18:18 Vercas has joined #osdev

18:20 xenos1984 has quit [Ping timeout: 246 seconds]

18:23 <ddevault> back to EFI grief

18:29 <ddevault> yeah this ain't it

18:29 <ddevault> $ git add boot

18:29 <ddevault> $ git commit -m "some garbage that doesn't work"

18:30 <ddevault> $ git checkout master

18:32 <gog> taht's programming

18:34 xenos1984 has joined #osdev

18:39 <ddevault> would be nice if someone wrote a good linker

18:39 <ddevault> a halfway decent linker that can build hare programs and/or helios is probably only a few weeks of work

18:39 <ddevault> hmm...

18:44 genpaku has quit [Read error: Connection reset by peer]

18:45 vexmane has joined #osdev

18:48 fedorafansuper has joined #osdev

18:48 genpaku has joined #osdev

18:51 fedorafan has quit [Ping timeout: 252 seconds]

19:03 <zid`> ddevault: Yea I've considered a quick and dirty linker as a fun project

19:07 micttyl has quit [Quit: leaving]

19:11 Vercas has quit [Quit: Ping timeout (120 seconds)]

19:11 Vercas has joined #osdev

19:13 <sham1> Replace the GNU ecosystem from your OS build process one by one

19:13 <sham1> Where GNU of course is Giant, Nasty and Unavoidable

19:15 <mjg> and BSD is Bad, Stale and Dead

19:17 <sham1> Right. That's why we should all just use TempleOS

19:18 Vercas has quit [Quit: Ping timeout (120 seconds)]

19:18 Vercas has joined #osdev

19:20 fedorafansuper has quit [Read error: Connection reset by peer]

19:23 fedorafansuper has joined #osdev

19:26 <heat> sortie, linker when????

19:29 <mjg> sortild

19:29 <mjg> not a good name

19:29 <sham1> sortie-link

19:30 <sham1> Could also say that it's an exit of some kind

19:30 <mjg> i would totally use Elon Musk linker

19:30 <mjg> would probably be named linkex

19:32 <heat> sortie-link is very microsoft

19:32 <heat> ...perfect for MAXSISTRING

19:33 <heat> CONST STATIC MAXSI_STRING gOutputName

19:34 <heat> mjg: mjg's object link editor

19:34 <heat> mold for short

19:35 <mjg> you are just jelly onyx does not run on a toaster

19:36 <heat> NetOnyx when

19:36 srjek has quit [Ping timeout: 260 seconds]

19:37 <mjg> here is a historical lolfact concerning netbsd

19:37 <mjg> when they decided to larp as a smp-capable os a bunch of code showed up which required the CAS instruction

19:37 <mjg> around 2009 or so

19:38 <mjg> apparently however the instruction is implemented on *VAX* it sucks terribly over there

19:38 <mjg> and some dude started protesting the smp effort becaue ofi t

19:39 <heat> they said it runs everywhere

19:39 <heat> they did not say it runs everywhere, *well*

19:39 <mjg> "of course it ruins netbsd"

19:40 <mjg> the official slogan misses a letter by accident

19:40 <heat> lol

19:41 <heat> still can't believe none of you idiots have /bin/python3

19:43 <zid`> I have a /usr/bin/python3

19:43 <zid`> does that help

19:43 <heat> no

19:43 <heat> the bsd idiots don't

19:43 fedorafa_ has joined #osdev

19:45 fedorafansuper has quit [Ping timeout: 252 seconds]

19:45 <zid`> if you want it in /bin you need to root me first

19:45 * zid` passwd -L heat

19:46 <heat> i don't run bsd

19:46 <heat> what the fuck do you think I am

19:46 <zid`> heat did you ever figure out how to use mkisofs

19:46 <heat> no

19:51 <heat> i can't connect my xbox one controller to linux thru bluetooth

19:51 <heat> thank you desktop linux

19:52 <heat> https://i.imgur.com/K5jmIwD.png

19:52 <heat> look at this shit

19:53 <mrvn> heat: lib is a link to usr/lib in most modern linuxes so a lot of people have it

19:59 <heat> the best part about using linux is that everything is fucking broken

20:00 <sham1> It's not broken, if you define it as not broken

20:01 bgs has quit [Remote host closed the connection]

20:02 <heat> ok so apparently I need to boot to windows to fix this shit

20:02 <heat> poggers

20:02 <heat> kill me now

20:03 heat has quit [Remote host closed the connection]

20:08 <sortie> what u do to our heat

20:15 <sham1> Made 'em launch Windows

20:15 <zid`> You can't leave a conga line, only form a rival conga line that is in competition with the original

20:19 <gog> i run bsd

20:19 <gog> just kidding i don't hate myself

20:21 <zid`> too busy leading a rival conga line to run bsd

20:22 elastic_dog has quit [Ping timeout: 252 seconds]

20:24 elastic_dog has joined #osdev

20:29 puck__ is now known as puck

20:42 <kaichiuchi> sometimes being a programmer is annoying

20:43 <kaichiuchi> definitely feels like you can’t write a hello world without 500,000 lunatics criticizing it

20:43 xenos1984 has quit [Read error: Connection reset by peer]

20:45 <mrvn> kaichiuchi: you are missing punctuation. :)

20:45 <kaichiuchi> :)

20:49 <jimbzy> Constructive criticism doesn't bother me.

20:50 <kaichiuchi> that’s fine

20:50 <kaichiuchi> there’s nothing wrong with that

20:50 <kaichiuchi> it’s when you get completely shit on no matter what you do

20:50 <kaichiuchi> not that i’m a victim of that

20:51 <jimbzy> Yeah, I give those people a standard, ":D" response and go about my business.

20:51 <kaichiuchi> but I saw something at work that I did not want to see

20:52 Vercas has quit [Quit: Ping timeout (120 seconds)]

20:53 Vercas has joined #osdev

20:54 danilogondolfo has quit [Remote host closed the connection]

20:55 <geist> yah also jokingly shit comments bugs me sometimes too

20:55 <jimbzy> ?

20:55 Vercas has quit [Client Quit]

20:56 Vercas has joined #osdev

20:56 <kaichiuchi> essentially, there is an intern who is legitimately trying to learn and get better

20:56 <kaichiuchi> but his boss is completely shitting all over him

20:56 <kaichiuchi> not a good look

20:57 <gog> definitely not

20:58 <gog> the point of an internship is to learn, not to get beat up

20:58 <gog> and if the boss is just beating up somebody who has no power in the arragnement then the boss is a massive jerk

20:59 <gog> if the internship is unpaid double my condemnation

20:59 <jimbzy> Unpaid internships should be illegal.

21:02 xenos1984 has joined #osdev

21:06 <gog> agreed

21:06 <gog> they are in many places anyway

21:13 GeDaMo has quit [Quit: That's it, you people have stood in my way long enough! I'm going to clown college!]

21:19 Vercas has quit [Ping timeout: 255 seconds]

21:25 pretty_dumm_guy has quit [Quit: WeeChat 3.5]

21:26 vexmane has quit [Quit: Leaving]

21:33 Vercas has joined #osdev

21:40 <immibis_> while watching emerge update my system I wonder why some kind of throughput scheduler isn't more common. Instead of running `make -j5` the system should have a queue of all remaining work, and it should pick the next item from the list whenever the CPU is idle.

21:41 <immibis_> it shouldn't be make's job to guess how many concurrent processes to run. It should queue them all as soon as they are ready to run, and the system decides when to start them

21:42 <immibis_> this scales properly when make runs make (or emerge runs make) without the need for a "job server"

21:42 ptrc has quit [Remote host closed the connection]

21:42 <sham1> Wouldn't the kernel in that case count as a job server

21:42 <immibis_> recursive make normally uses a "job server" process which just hands out "concurrent process tokens" so that you get 5 concurrent processes instead of 25

21:43 ptrc has joined #osdev

21:44 AttitudeAdjuster is now known as MorallyFlexible

21:45 <immibis_> sham1: only if you consider it to already be a job server since it already schedules processes

21:45 <mrvn> "This is the time when you run."

21:46 <immibis_> the time when I run is when a velociraptor is chasing me.

21:48 <mrvn> immibis_: If you start a new build whenever the cpu is idle then every time the compiler waits for a file to load from disk a new compiler spawns. YOu end up with all file being build in parallel.

21:48 MorallyFlexible is now known as EthicsGradient

21:49 <mrvn> Better would be to put all jobs into a group and always run the lowest PID in running state when a cpu is idle.

21:51 <mrvn> Picking the job that will run longest would be even better. Otherwise you end up with all jobs finished except one that takes forever.

21:51 EthicsGradient is now known as AttitudeAdjuster

21:51 <mrvn> and jobs that block many other jobs.

22:08 Vercas has quit [Ping timeout: 255 seconds]

22:10 <\Test_User> could you just start the next one ahead of time and wait() for the previous?

22:10 <\Test_User> or would that totally break if one of 'em took an absurd amount of time

22:11 <moon-child> immibis_: see discussion of a few days ago. Kernel has limited knowledge of what userspace is actually doing

22:11 <\Test_User> or actually, have make itself be multithreaded wait()ing on stuff, then fork and start the next when the last ends

22:12 <\Test_User> waitpid*

22:13 <\Test_User> actually no, generic wait would do from a single thread bc it'd detect if any exit, so yeah

22:14 <mrvn> \Test_User: how many can you start before you run out of resouces?

22:14 <\Test_User> ...but it should already be doing that, so where's the extra delay..

22:16 <mrvn> "start the next one ahead of time and wait()" is kind of what "make -j5" does. Every fork does a read() on the jobserver pipe instead of your wait but that's basically the same.

22:16 Vercas has joined #osdev

22:16 <mrvn> just fewer resources invested before the read()

22:16 <\Test_User> make -j5 runs 5 actively at once though, so more ram eaten

22:17 <mrvn> as it should. But all the extra ones wait on a read()

22:17 <\Test_User> though yeah... why would read be delaying long enough for an extra thread to make the difference ig isw the quest

22:18 <\Test_User> *question

22:18 <mrvn> The read blocks till one of the running 5 writes back a token.

22:18 <mrvn> Only then the new process starts up and allocates resources.

22:19 <\Test_User> and it's not writing as soon as it's done? or...

22:19 <clever> mrvn: that jobserver stuff might explain that weird bug ive noticed, where make sometimes hangs

22:19 <clever> but its been years since i saw it happen

22:19 <mrvn> it is. The only difference is that the resource allocation is after write insted on your start+wait idea it would be before

22:19 <clever> if i just whack the process with a non-fatal signal, it unhangs

22:20 <mrvn> clever: you should never loose tokens so make should never hang.

22:20 <clever> hence it being a bug

22:20 <clever> i never got good details on it, because it was so rare

22:20 <mrvn> kernel bug then, the read()s should wake up with pending data.

22:20 <clever> and now that i mention it, i realize i havent seen the fault in years

22:21 <mrvn> clever: did it maybe happen when 2+ processes finished and then you only wake up one read() even though that only processes 1 byte?

22:21 <clever> dont remember

22:21 <clever> i just know that make had no children, and wasnt using any cpu

22:22 <clever> its safe to assume its been fixed by now

22:33 <immibis_> \Test_User: starting the next one ahead of time and then waiting, seems equivalent to just running a certain number in parallel, like make already does

22:33 <immibis_> yes, RAM usage is a problem

22:33 <immibis_> CPU and I/O throughput are in some sense queue-able resources; if they are not available now, you can delay the task and get it later. Memory does not work that way.

22:34 <immibis_> of course this is a well-known fact in scheduler design

22:36 <mrvn> except it kind of does. you can swap out processes and run fewer compilers in parallel when ram gets tight.

22:36 <immibis_> What if the compiler was segregated into input/process/output phases - you could start a new input or output phase whenever the disk drive wasn't busy, and a processing phase whenever the CPU wasn't busy. With limits on the number of pending tasks in each state.

22:36 <mrvn> immibis_: use c++. I/O is quite irrelevant then.

22:36 <immibis_> you can swap processes out, but it seems slower than not having started them to begin with

22:37 <immibis_> mrvn: segregating the I/O phase avoids the problem of starting a new processing phase whenever a processing phase does I/O

22:37 <\Test_User> immibis_: having more waiting rather than running means less process switching

22:37 <mrvn> immibis_: only when you have to swap. if you have enough ram then running one compiler per core is worth it.

22:37 <mrvn> swapping is just to recover when you guessed wrong

22:37 <\Test_User> also removes the chance of enough ending at the same time

22:37 <immibis_> running more than one compiler per core can be better if they are I/O bound

22:37 fedorafa_ has quit [Read error: Connection reset by peer]

22:38 <immibis_> or rather, partially I/O bound. If they are fully I/O bound you might want to run one per disk drive :)

22:38 fedorafan has joined #osdev

22:38 * immibis_ 's system currently has 7 disk drives attached

22:38 <clever> that reminds me, twice now (on both linux and macos), ive seen bugs where not calling fsync on a file, and then copying it with cp, pokes giant holes in the file

22:39 <clever> https://github.com/openzfs/zfs/pull/12724

22:39 <clever> the linux case, was a zfs bug

22:39 <bslsk05> github.com: Fix lseek(SEEK_DATA/SEEK_HOLE) mmap consistency by behlendorf · Pull Request #12724 · openzfs/zfs · GitHub

22:39 <clever> i dont know how macos had nearly the identical bug

22:39 <mrvn> In most cases the whole thing is a non-issue anyway. Just run one compiler per core. They have enough ram and all file I/O will just use caches or close enough with ssd.

22:40 <immibis_> that's not a terrible heuristic. I tend to configure N+1 parallel processes.

22:40 <clever> but in both cases, the hole detection api lied, and then cp copied around the fake hole

22:40 <clever> resulting in giant nulls in a file

22:40 <immibis_> either way the kernel should still be responsible for the parallel processing limit

22:40 <immibis_> or at least for avoiding extra context switches of processes tagged for throughput

22:41 <mrvn> clever: asking the FS where holes are and then copying around them is race prone.

22:41 <immibis_> if I start 5 compilers on 4 cores, and they all want to use the CPU, suspend whichever one is last, until one of the earlier ones does I/O

22:41 <immibis_> copying a file that's currently being written to is race-prone

22:41 <mrvn> immibis_: yes, you should have a process group like that

22:42 <clever> mrvn: in the zfs case, the problem is that after you close() a file, but it only exists journal, the kernel reports holes where data actually exists

22:42 <mrvn> clever: if you don't fsync() then there is no sequence point. So I would say user error

22:42 <immibis_> close, or rather munmap, should probably clear up such inconsistencies

22:42 <immibis_> if it doesn't I'd say that's a bug

22:42 <clever> yeah, fsync or even plain /bin/sync was enough to mask the problem

22:42 <immibis_> if you are copying the file while still mapped, that's user erro

22:43 <mrvn> "A successful close does not guarantee that the data has been success‐ fully saved to disk, as the kernel uses the buffer cache to defer writes.

22:43 <clever> immibis_: in both cases, it occured after the file was close()'d

22:43 <mrvn> " just close without sync is not enough

22:43 <immibis_> mrvn: but the kernel cache should be consistent

22:43 <clever> if you closed the file, then immediately copied with cp, it had chunks missing

22:43 <mrvn> immibis_: it should.

22:43 <clever> but if you closed the file, `sleep 120`, then cp, it didnt have chunks missing

22:43 <immibis_> apparently the ZFS bug is that ZFS did not update holes immediately on close/munmap

22:44 <clever> yeah

22:44 <mrvn> which is totally fine if the cp is not in the same process

22:44 <immibis_> no, it's not fine, because all processes use the same kernel cache

22:44 <mrvn> fine as in by specs

22:44 gildasio has quit [Remote host closed the connection]

22:44 <immibis_> close/munmap (whichever one it was) should behave as a sequence point. anything else is crazy

22:44 <clever> macos is more of a black box, and bisection pointed to a commit where coreutils had sparse support re-added

22:45 <mrvn> immibis_: yeah. but the specs explicitly say it's not

22:45 <clever> which implied macos was always broken, and just removing sparse support from cp fixed it

22:45 gildasio has joined #osdev

22:45 <immibis_> mrvn: the specs are stupid then. It's excusable for cache to not be written back on close, but it's not excusable for the cache itself to be inconsistent

22:46 <clever> yeah, i agree with that

22:46 <clever> if read() says there is data at a given offset

22:46 <mrvn> immibis_: might not be kernel cache but per process IO buffers

22:46 <clever> then lseek should not claim there is a hole at that offset

22:46 <immibis_> mrvn: per-process I/O buffers after closing and unmapping?

22:47 <mrvn> immibis_: sure. they take time to flush

22:47 <clever> mrvn: userland buffers where not the issue, it was basically a bash script that did: ghc foo.hs -o foo ; cp foo $out/bin/foo

22:47 <clever> and random holes appeared in the file

22:47 <immibis_> mrvn: explain where these per-process I/O buffers are implemented?

22:47 <mrvn> immibis_: anywhere between your code and the disk.

22:47 <mrvn> clever: in that case the process ending is a sequence point

22:48 <immibis_> mrvn: and where is that?

22:49 <mrvn> immibis_: in hypothetical land

22:49 fedorafan has quit [Read error: Connection reset by peer]

22:49 fedorafansuper has joined #osdev

22:52 <mrvn> immibis_: close can also fail before data is flushed to the FS.

22:52 <mrvn> (but has already closed the FD, so don't close it again)

22:53 <mrvn> Fun fact: If close() is interrupted by a signal that is to be caught, it

22:53 <clever> but in this case, it hasnt failed, because just running sync in a shell between ghc and cp fixes it

22:53 <mrvn> shall return -1 with errno set to EINTR and the state of fildes

22:53 <mrvn> is unspecified.

22:53 <mrvn> clever: obviously your case was a bug

22:53 <clever> yeah

22:54 <mrvn> clever: as said the process ending (the shell running waitpid) and starting the cp is a sequence/synchronization point.

22:54 <immibis_> there's another OS design problem here about flushing in general: how to square the desire that a process has really finished when it thinks it's finished, with the conflicting desire for efficiency when the file is temporary

22:54 <immibis_> when I run `cp -r ~/homework /mnt/usb/` I would like the command to finish when the copying has really finished

22:55 <immibis_> but when I run `cp foo.o build/foo.o` I would like the command to finish immediately so the command stream can run ahead. In fact I don't even care if the data is ever on the disk as I can remake it

22:55 <mrvn> immibis_: sync on close on removable drives?

22:55 <clever> immibis_: in the past, i wasnt aware of how much usb will buffer the crap out of things, and often thought "oh it crashed again" and forcibly remove the usb

22:55 <clever> i still dont know why usb lets the dirty memory hit 500mb+, while a hdd doesnt

22:55 <immibis_> probably because your hdd is faster to write back

22:55 <immibis_> because it's just faster

22:56 <mrvn> The "eject USB device" should show a popup with progress bar showing the amount of buffers to be written.,

22:56 <clever> immibis_: na, ive seen cp take 10 minutes to run before

22:56 <clever> its definitely blocking on the writes, and refusing to get dirty, heh

22:57 <mrvn> clever: dirty memory is kind of broken in linux. You get some 30% and then the data is flushed. While that happens you rack up gigabyte of more dirty data for the USB stick without it getting blocked.

22:57 <mrvn> clever: but not at first. Takes a few dirty/flush cycles before that happens. At first it blocks future writes correctly.

22:58 <clever> ah

22:58 <mrvn> Happens with USB sticks or NFS.

22:58 <mrvn> Somehow I don't see it with local disks, they might just be fast enough.

22:58 <clever> ive not seen it happen on nfs

22:58 <mrvn> write a few TB to NFS.

22:58 <clever> the cp on the nfs client always blocks for me

22:59 <clever> but ive not tried copying TB

22:59 <mrvn> Always worked for me for some 100GB and then suddenly it flips and has no limit.

22:59 <clever> the nfs server is also configured as async, for that client

22:59 <clever> so it should just lie and take everything

22:59 <clever> but i have noticed the write speed varies based on free space

23:00 <mrvn> that's likely the FS at fault if you are talking >90% full

23:00 <clever> 6gig free, out of ~8tb

23:01 <mrvn> any reserved free space?

23:01 <clever> just the usual zfs slop space

23:01 <clever> *looks*

23:01 <mrvn> zfs definetly has that slowdown when it gets full

23:01 <clever> [root@nas:~]# cat /sys/module/zfs/parameters/spa_slop_shift

23:01 <clever> 5

23:01 <mrvn> ext3/4 reserves 5% per default that only root can use and that isn't included in the free stats.

23:02 <clever> i forget the math, but this tunes how much zfs reserves, so the CoW doesnt hard jam from a full disk

23:02 <clever> if i echo an 8 into there, i suddenly have 105gig free

23:02 <clever> because i told it to reserve less

23:02 <mrvn> both ext and zfs slow down towards the end. zfs gets really slow.

23:03 <clever> so df may claim i have 6gig free, but its actually over 100gig

23:03 <mrvn> .oO(Gives you time to buy more disks before it fails :)

23:03 <clever> in the zfs case, the major slowdown is the spacemap histograms

23:04 <clever> and zfs_metaslab_try_hard_before_gang being turned on

23:04 <clever> each metaslab (like an ext block group) has its own free space list, and they are rather memory costly

23:04 <clever> so zfs only has a few loaded at once

23:05 Vercas has quit [Ping timeout: 255 seconds]

23:05 <clever> if zfs_metaslab_try_hard_before_gang is enabled, and zfs cant find a big enough hole, it will "try hard" (load more metaslab spacemaps) to find a properly sized hole

23:05 <mrvn> buy more disks and make some big holes.

23:05 <clever> without that, it can give up early (faster) and create a fragmented record, which harms performance more down the road

23:07 <clever> https://i.imgur.com/P8dtvd2.png

23:07 <clever> i also wrote a patch to zfs, that lets me generate these graphs cheaply

23:07 <clever> if that orange line hits zero, then even with zfs_metaslab_try_hard_before_gang, it will fragment most writes

23:07 bxh7 has joined #osdev

23:10 terminalpusher has quit [Remote host closed the connection]

23:11 <immibis_> cp over NFS reminds me of yet another OS design problem which is how to efficiently accelerate things that can be more efficiently done by external hardware or other computers

23:11 xenos1984 has quit [Read error: Connection reset by peer]

23:11 srjek has joined #osdev

23:11 <immibis_> you could expect to tell an NFS server "copy this byte range to that byte range" without downloading the entire byte range and uploading it again

23:11 <immibis_> and maybe NFS has that ability, and maybe it's even supported in cp, but it's all special-cased

23:12 <clever> i also have had other fun bombs go off with nfs

23:12 <immibis_> there's absolutely no ability for e.g. gcc -E to rewrite the unchanged segments of include files through that special case

23:12 <mrvn> immibis_: NFS doesn't but some filesystems have smart links for that

23:12 <clever> my server was graphing free disk space, and that involved running df in cron

23:12 <immibis_> and it would be completely absurd to expect gcc to write special-case code for it

23:12 <clever> the "server" had the laptop mounted over nfs (as an nfs client)

23:13 <clever> when i left for a trip, the laptop went with me

23:13 <clever> df then hung, because the nfs server was missing

23:13 <clever> cron kept forking out new df's, and swap just ate them all harmlessly

23:13 <clever> then the laptop returned....

23:13 <immibis_> on Windows, that would get you an ERROR_NETNAME_DELETED I think

23:13 <clever> every single df, woke up at once, and all demanded a share of the cpu, and ram

23:13 <immibis_> the decision of which errors to return to clients and which to attempt to paper over has no universal right answers

23:14 <clever> immibis_: thats what the soft vs hard mount flag controls, in nfs

23:14 <immibis_> I believe in MS-DOS, you could simulate a dual-drive system with a single drive. When accessing B: after accessing A:, the system would pause the running "process" and ask you to swap disks.

23:14 <clever> hard means retry forever

23:14 <mrvn> clever: I now that behavior. Takes a while but everything eventually recovers just fine

23:14 <clever> soft means give an io error if there is network problems

23:14 <immibis_> such emulation seems rather useful in odd cases and anti-useful in othres

23:15 <clever> immibis_: ah, i had seen that on YT recently, to get 3 floppy drives working on 1 machine

23:15 <mrvn> immibis_: AmigaOS has disk names so you can open "fonts:bla.ttf" and it will acess whatever drive thas the fonts floppy in it or ask you to insert it.

23:15 <clever> he had physical switches to re-route the drive select lines

23:15 <clever> and enabled that DOS feature, and then re-routed things manually

23:15 <mrvn> immibis_: you can even remove a floppy during write operations and reinsert it in another drive and it will just keep going.

23:15 <clever> neat!

23:15 <immibis_> every problem can be solved by adding more abstraction except the problem of too much abstraction

23:16 <immibis_> Linux also has this ability, if you were to set up something to automount floppies, but mount them at consistent paths - but it wouldn't block on access, you'd probably need something like FUSE for that

23:16 <clever> zfs can recover from a block device going missing mid-write, but only if it comes back at the same /dev/ path

23:16 <immibis_> the Linux behaviour of "an unmounted drive is an empty folder" is not a particularly sensible default

23:16 <clever> renaming or symlinks can fool it enough to work, its a limitation of the userland tooling

23:16 <mrvn> immibis_: no it doesn't. You can't umount and remount a device and have open files continue to work

23:16 <immibis_> it just falls out of the design of how Linux mounts go over the top of existing folders

23:17 <immibis_> mrvn: as we see with clever's df thing, hanging the process until the drive comes back isn't always a good idea either

23:17 <mrvn> immibis_: starting a cron job again while the previous is still running is just plain broken.

23:17 <mrvn> cron should never do that as default.

23:18 <clever> systemd timers dont do that!

23:18 Vercas has joined #osdev

23:18 <immibis_> also not universally true

23:18 <clever> because its less of a cron job, and more of a service, that starts (if not already running) on a schedule

23:18 <immibis_> and I bet if it was a flag, clever would've had a 50% chance of setting it to the wrong value because why would you even think to consider that?

23:18 <mrvn> immibis_: hence the "as default"

23:19 <clever> the df cronjob, was part of the cacti polling setup

23:19 craigo has quit [Ping timeout: 252 seconds]

23:19 <clever> but ive since moved to prometheus based graphing, which doesnt spawn a new process on every poll

23:19 <clever> so it would never fork-bomb the same way

23:19 <clever> more likely to just hang the entire exporter

23:19 <immibis_> instead it would just freeze the entire graphing system until the drive came back?

23:19 <mrvn> also bad

23:19 <clever> yeah, for that one machine

23:20 <clever> but not DoS level bad

23:20 <mrvn> you want to create a thread per resource so all other graphs still process

23:20 <immibis_> what we all want is a highly abstracted system, so everything is very flexible, with no abstractions so everything is very efficient

23:20 <clever> and thats why i just soft-mount everything now

23:20 <immibis_> the Cheetah/XOK webserver stores your static HTML files as pre-formatted TCP packets on disk

23:20 <mrvn> immibis_: sendfile to the rescue

23:20 <immibis_> sendfile is not the same level of abstractionlessness

23:20 <clever> mrvn: the exporter, is basically just an http endpoint, that returns all of the metrics

23:21 <clever> the central graphing server, has http timeouts, so it wont 100% die

23:21 mlombard has quit [Read error: Connection reset by peer]

23:21 <clever> it will just consider that 1 host as down

23:21 <immibis_> even being able to "store your HTML as TCP packets" requires cutting through a lot of abstractions and writing code that only works for the specific case of serving static files over TCP

23:21 <immibis_> and then you have to rebuild your static pages folder if the link MTU changes

23:21 <mrvn> how does that even work? You need the right sequence number

23:22 <clever> immibis_: oh, that reminds me, you can configure some http servers, to send a foo.html.gz file, but slap a content-encoding header on it

23:22 <immibis_> I assume it filled in the dynamic fields at runtime

23:22 <clever> so the client will decompress it on the fly

23:22 <clever> that then saves you cpu cycles on the server, having to re-compress the file for every request

23:22 <immibis_> some network cards might support TCP Segmentation Offload, and then you have code that not only only works for TCP, but only works for your specific card and DMA controller, but it's very fast because it DMAs directly from the disk to the network card

23:23 <mrvn> immibis_: using writev() seems like a better way. Splice the ip packets together from the header and chunks of the file.

23:23 <clever> ive looked at the genet (bcm2711 ethernet) driver before, the tx ring is a big array of addr+size+flag sets, and flags include "start of packet" and "end of packet"

23:24 <immibis_> depending on the segment size the kernel might do more work to process a writev than it would cost to just copy the bytes

23:24 <immibis_> now, if you could store a DMA descriptor chain on disk...

23:24 <clever> so scatter-gather dma, is just giving it multiple addr+length pairs

23:24 <mrvn> immibis_: splice should be able to directly DMA stuff

23:24 <immibis_> I don't know consumer ethernet drivers but I worked on some kind of industrial router and this sometimes involved peeking under the hood. The engine takes a linked list of DMA descriptors.

23:25 Vercas has quit [Quit: Ping timeout (120 seconds)]

23:25 <clever> so your writev() call, could prepend a few buffers for packet data, and gather-dma pieces out of both kernel and userland ram

23:25 <clever> but, if userland modifies the buffer mid-write, your checksums wont be right

23:25 <immibis_> you would, of course, want it to gather the same packet header over and over, while gathering different data pieces

23:26 Vercas has joined #osdev

23:26 <immibis_> maybe you add some kind of modification thing to the DMA chain, telling it how to increment sequence numbers and decrement checksums

23:26 <mrvn> clever: that's why my OS doesn't even allow that all. If you write a buffer you give up rights to the buffer. you can't modify it. Having to COW every buffer on write is too much work.

23:26 <immibis_> or you hardcode logic in the network card telling it how to generate TCP packets from a plain old data stream

23:26 gabi-250 has quit [Ping timeout: 255 seconds]

23:27 <immibis_> mrvn: how's the overhead of handling many smallish buffers?

23:27 <clever> mrvn: yep, that sounds like a valid solution

23:27 <mrvn> immibis_: On modern cards you write 64k frames to the NIC and it internally splits it into MTU chunks and generates all the right headers for it.

23:28 <immibis_> that would be called TCP Segmentation Offload

23:28 <mrvn> immibis_: horrible. 1 page for the message, 1 page for the buffer. Two INVLPGs. I'm not writing my OS to be fast, just simple.

23:28 gabi-250 has joined #osdev

23:29 xenos1984 has joined #osdev

23:30 <immibis_> that seems to be a common problem in message passing systems (and what is the difference between a message and a buffer?)

23:30 <mrvn> immibis_: and everything is a process. So every subsystem the message passes through is another INVLPG fest. You should not use an external buffer for short stuff. Better to include it in the message itself.

23:30 <clever> mrvn: i would say, if the buffer is under some size threshold (maybe under 1 page), just copy it

23:30 <immibis_> a silly thought: maybe messages should be passed in XMM/YMM registers

23:31 <mrvn> clever: that would require allocating pages inside interrupts. not a possibility.

23:31 <clever> or that, include the data in the same page as the message

23:31 <mrvn> clever: that's what I do.

23:31 <clever> the zfs journal does similar

23:31 <clever> for small writes, the data is in the journal itself

23:31 <clever> but for large writes, the data goes to its usual final destination, and the journal just holds the pointer

23:31 <immibis_> in XMM registers you have 256 bytes, with direct access from CPU instructions, that won't be stomped on by context switching code and in fact not stomping on them makes the context switch *faster*

23:32 <mrvn> If you write under 4000 bytes then just include it in the message itself.

23:32 <clever> so it can put off the more costly updates, until later

23:32 Turn_Left has joined #osdev

23:32 <clever> and the journal is enough of a promise to userland, that the data is secure

23:32 <immibis_> mrvn: software interrupts or real interrupts?

23:33 <clever> immibis_: that sounds like lazy fpu context switching

23:33 <clever> defer the fpu context switch, and set access control registers so it faults upon any access

23:33 <mrvn> Another thing I plan to do is to have buffers attached to a message but not map them. You just get a handle for the buffer and can pass that around and if you actually want to access the buffer you have to map it.

23:33 <mrvn> immibis_: yes

23:33 <immibis_> even lazier. deliberately leaking FPU context between processes as message passing

23:34 <clever> mrvn: that sounds like linux dma_buf api

23:34 <clever> but in linux, each buffer, is a seperate fd handle

23:34 <clever> so you need to be passing whole fd's around, potentially 1-3 per video frame

23:34 <mrvn> clever: for me it would be addr+size specifying a bunch of physical pages that then get turned into some VM handle.

23:34 <clever> immibis_: heh, thats one way, just read it, and context switch to the right destination

23:35 Left_Turn has quit [Ping timeout: 256 seconds]

23:35 <immibis_> can Memory Protection Keys be used to switch between 16 different processes without changing page tables?

23:35 <mrvn> immibis_: ARM has 256 ASIDs you can switch between

23:35 <mrvn> x86_64 has 4096, right?

23:35 <immibis_> the ideal message-passing context switch is like "ASID = new_proc->ASID; jmp new_proc->MessageReceiver;"

23:36 <clever> immibis_: now youve reminded me of the centurion cpu6, it has 16 banks of registers, each with its own mmu config, and can switch between them freely, an irq will also force a switch to a specific set (each irq is bound to a diff one)

23:36 <immibis_> mrvn: don't know

23:36 <mrvn> immibis_: you still load a new page table but you don't loose any TLB or cache content.

23:37 <immibis_> oh well that's good. Last time I knew about context switches, page table flushing was the main overhead.

23:37 gildasio has quit [Remote host closed the connection]

23:37 <clever> https://github.com/Nakazoto/CenturionComputer/wiki/Instructions

23:37 <bslsk05> github.com: Instructions · Nakazoto/CenturionComputer Wiki · GitHub

23:37 <mrvn> immibis_: with a microkernel you should definetly look into ASIDs.

23:38 <clever> i should get off to bed, its getting late here

23:38 <mrvn> then you still have time. it's not early yet.

23:38 <clever> lol

23:43 gildasio has joined #osdev

23:55 <kaichiuchi> I cannot believe CMake doesn't have reasonable line break support