#osdev on 2022-08-24 — irc logs at libera.irclog.whitequark.org

2021-05-23 01:57 klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books

00:00 ZipCPU has quit [*.net *.split]

00:00 FreeFull has quit [*.net *.split]

00:00 heat has quit [*.net *.split]

00:00 scoobydoo has quit [*.net *.split]

00:00 terrorjack has quit [*.net *.split]

00:00 Brnocrist has quit [*.net *.split]

00:00 DanDan has quit [*.net *.split]

00:00 ptrc has quit [*.net *.split]

00:00 genpaku has quit [*.net *.split]

00:00 elastic_dog has quit [*.net *.split]

00:00 dennisschagt has quit [*.net *.split]

00:00 wgrant has quit [*.net *.split]

00:00 ThinkT510 has quit [*.net *.split]

00:00 dormito has quit [*.net *.split]

00:00 thatcher has quit [*.net *.split]

00:00 woky has quit [*.net *.split]

00:00 CYKS has quit [*.net *.split]

00:00 k0valski1889 has quit [*.net *.split]

00:00 PapaFrog has quit [*.net *.split]

00:00 sprock has quit [*.net *.split]

00:00 Ram-Z has quit [*.net *.split]

00:00 ckie has quit [*.net *.split]

00:00 mahk has quit [*.net *.split]

00:00 nj0rd has quit [*.net *.split]

00:00 _xor has quit [*.net *.split]

00:00 FreeFull has joined #osdev

00:00 genpaku has joined #osdev

00:00 dennisschagt has joined #osdev

00:01 the_lanetly_052_ has quit [Remote host closed the connection]

00:02 ptrc has joined #osdev

00:02 heat has joined #osdev

00:02 heat has quit [Remote host closed the connection]

00:02 scoobydoo has joined #osdev

00:02 PapaFrog has joined #osdev

00:02 leah_ has joined #osdev

00:02 ZipCPU has joined #osdev

00:02 Ram-Z has joined #osdev

00:02 the_lanetly_052 has joined #osdev

00:03 heat has joined #osdev

00:03 woky has joined #osdev

00:03 dormito has joined #osdev

00:04 terrorjack has joined #osdev

00:05 elastic_dog has joined #osdev

00:05 Brnocrist has joined #osdev

00:05 sprock has joined #osdev

00:07 FreeFull has quit []

00:07 ckie has joined #osdev

00:08 gog` has quit [Ping timeout: 248 seconds]

00:16 <geist> ETOOHARD

00:17 nyah has quit [Ping timeout: 268 seconds]

00:20 <klange> Don't give me ideas...

00:20 <klange> So many syscalls I could "implement" by returning such an error code ;)

00:24 frkzoid has joined #osdev

00:35 <gog> i just do that for all coding tasks rn

00:35 <gog> i refuse to grow as a person and learn new things

00:38 <zid> same tbh

00:41 frkzoid has quit [Ping timeout: 244 seconds]

00:49 DanDan has joined #osdev

00:52 gildasio has quit [Remote host closed the connection]

00:56 gildasio has joined #osdev

01:06 <heat> why is the grub decompressor so slow?

01:06 <heat> it's soo odd

01:07 <klange> Given your previous complaint, are you in TCG? I've found it's particularly bad at decompression algorithms.

01:08 <klange> I think they're unfriendly to the JIT by nature.

01:08 <heat> no, KVM

01:08 <klange> Ah, then, probably just because it's designed for size and is shit. Which format?

01:08 <heat> xz

01:08 <heat> 9MB -> 90MB approx

01:08 <heat> using the linux cmd line util is much faster

01:10 <klange> Probably a bunch of compounding factors with grub's io interfaces, a different xz implementation designed for size... and if this is bios grub that runs in protected mode, so maybe 32-bit instructions vs. 64-bit?

01:30 foudfou has quit [Remote host closed the connection]

01:30 foudfou has joined #osdev

01:42 <geist> Maybe the cache is disabled?

01:42 <geist> Is this on x86?

01:46 <heat> yes

01:46 <heat> although I'm no longer sure on the "much" part

01:46 <heat> maybe that was just placebo

01:52 gildasio has quit [Quit: WeeChat 3.6]

01:56 frkzoid has joined #osdev

02:00 jjuran_ has joined #osdev

02:01 <geist> Yeah, could also be various levels of compression too, though in general xz is fast to decompress

02:02 jjuran has quit [Ping timeout: 252 seconds]

02:02 jjuran_ is now known as jjuran

02:02 gxt has quit [Remote host closed the connection]

02:02 foudfou has quit [Remote host closed the connection]

02:02 opal has quit [Remote host closed the connection]

02:02 foudfou has joined #osdev

02:02 opal has joined #osdev

02:02 gxt has joined #osdev

02:02 jjuran has quit [Remote host closed the connection]

02:03 jjuran has joined #osdev

02:43 [itchyjunk] has quit [Remote host closed the connection]

03:08 gog has quit [Ping timeout: 268 seconds]

03:13 heat has quit [Remote host closed the connection]

03:14 heat has joined #osdev

04:09 gxt has quit [Remote host closed the connection]

04:10 gxt has joined #osdev

04:20 pretty_dumm_guy has joined #osdev

04:20 moberg has joined #osdev

04:25 heat has quit [Remote host closed the connection]

04:25 heat has joined #osdev

04:40 heat has quit [Ping timeout: 260 seconds]

05:13 gxt has quit [Remote host closed the connection]

05:14 gxt has joined #osdev

06:09 ThinkT510 has joined #osdev

06:09 \Test_User is now known as Test_User

06:10 Test_User is now known as \Test_User

06:17 the_lanetly_052_ has joined #osdev

06:20 the_lanetly_052 has quit [Ping timeout: 252 seconds]

06:21 gxt has quit [Remote host closed the connection]

06:22 gxt has joined #osdev

06:28 scoobydoob has joined #osdev

06:30 scoobydoo has quit [Ping timeout: 248 seconds]

06:30 scoobydoob is now known as scoobydoo

06:41 bradd has quit [Quit: No Ping reply in 180 seconds.]

06:41 bradd has joined #osdev

06:46 gxt has quit [Remote host closed the connection]

06:46 gxt has joined #osdev

07:11 mahk has joined #osdev

07:31 GeDaMo has joined #osdev

07:49 bauen1 has quit [Ping timeout: 248 seconds]

07:55 wolfshappen has quit [Ping timeout: 248 seconds]

07:56 wolfshappen has joined #osdev

08:00 wolfshappen has quit [Client Quit]

08:09 wolfshappen has joined #osdev

08:51 bauen1 has joined #osdev

09:13 <mrvn> can you even disable caches on x86?

09:15 <Mutabah> Yes, MTRRs

09:15 <Mutabah> but pretty rare

09:24 puck has quit [Excess Flood]

09:24 puck has joined #osdev

09:48 opal has quit [Remote host closed the connection]

09:48 opal has joined #osdev

09:51 gelatram has joined #osdev

09:58 sprock has quit [Ping timeout: 252 seconds]

10:00 sprock has joined #osdev

10:27 gildasio has joined #osdev

10:43 <moon-child> can I manually do hot/cold ordering?

10:43 <moon-child> just put stuff in .text.cold or something? (What is the section name?)

10:46 zaquest has quit [Remote host closed the connection]

10:48 zaquest has joined #osdev

10:50 frkzoid has quit [Ping timeout: 244 seconds]

10:51 heat has joined #osdev

11:12 <heat> Mutabah, also CR0.CD

11:12 <heat> moon-child, attribute((cold))?

11:13 <heat> also x86 has four caching modes controlled by CR0.CD and CR0.NW

11:13 <heat> those bits + MTRR are what allow cache-as-ram
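
The four modes heat mentions come from the two CR0 bits taken together. A minimal sketch of decoding them, assuming the bit positions from the Intel SDM (NW is bit 29, CD is bit 30); the enum and function names are hypothetical and only for illustration:

```c
#include <stdint.h>

/* CR0 cache-control bits (positions as documented in the Intel SDM). */
#define CR0_NW (UINT64_C(1) << 29) /* Not Write-through */
#define CR0_CD (UINT64_C(1) << 30) /* Cache Disable */

/* Hypothetical names for the four CD/NW combinations. */
enum cr0_cache_mode {
    CACHE_NORMAL,       /* CD=0, NW=0: normal caching */
    CACHE_INVALID,      /* CD=0, NW=1: invalid, loading it into CR0 raises #GP */
    CACHE_NO_FILL,      /* CD=1, NW=0: no new cache fills */
    CACHE_NO_COHERENCY  /* CD=1, NW=1: caching restricted, coherency not maintained */
};

/* Classify a CR0 value by its CD and NW bits; all other bits are ignored. */
static enum cr0_cache_mode cr0_cache_mode(uint64_t cr0)
{
    int cd = (cr0 & CR0_CD) != 0;
    int nw = (cr0 & CR0_NW) != 0;

    if (!cd)
        return nw ? CACHE_INVALID : CACHE_NORMAL;
    return nw ? CACHE_NO_COHERENCY : CACHE_NO_FILL;
}
```

This only decodes a value that was already read; reading or writing CR0 itself needs ring 0 and a `mov` from/to `%cr0`, which is left out here.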

11:13 <moon-child> heat: in assembly

11:13 <heat> .section .text.cold?

11:15 gog has joined #osdev

11:17 gareppa has joined #osdev

11:21 gareppa has quit [Remote host closed the connection]

11:21 <heat> sorry, .section .text.cold,"ax"

11:21 <heat> I've screwed myself over quite a few times over not specifying that stuff

11:27 <dminuoso> mjg: Curious, criticizing a company should *exactly* be done when you are there, not a good idea afterwards. *shrugs*

12:07 gildasio has quit [Quit: WeeChat 3.6]

12:20 gildasio has joined #osdev

12:26 MiningMarsh has quit [Quit: ZNC 1.8.2 - https://znc.in]

12:29 MiningMarsh has joined #osdev

12:35 opal has quit [Remote host closed the connection]

12:35 opal has joined #osdev

12:36 smach has joined #osdev

12:47 xenos1984 has joined #osdev

13:12 bauen1 has quit [Ping timeout: 260 seconds]

13:32 puck has quit [Excess Flood]

13:32 puck has joined #osdev

13:38 bliminse has quit [Ping timeout: 248 seconds]

13:39 bliminse has joined #osdev

13:44 gog has quit [Quit: byee]

13:57 gog has joined #osdev

13:59 bauen1 has joined #osdev

14:10 gog has quit [Quit: byee]

14:10 gog has joined #osdev

14:17 gelatram has quit [Ping timeout: 252 seconds]

14:22 Celelibi has quit [Read error: Connection reset by peer]

14:29 Celelibi has joined #osdev

14:35 <heat> anyone familiar with decompression?

14:35 <heat> I'd like to know how much slower is streamed decompression vs decompressing everything at once

14:35 <heat> (generally)

14:36 <heat> I implemented zstd initrd decompression in the kernel itself yesterday (instead of relying on GRUB, which is slow and also doesn't support zstd)

14:36 <clever> heat: decompression that can be done with threads seems like a good option for gaining massive speed

14:36 <heat> the problem is that I inevitably decompress everything at once (and do it on a try by try basis)

14:36 <clever> some compression formats are based on blocks, and each block has a header detailing the compressed and uncompressed size

14:37 <moon-child> I don't think threading is incompatible with streaming; you just shard

14:37 <heat> zstd doesn't have that I think

14:37 <clever> if you want to be smarter about decompression, you need a compression format that supports seeking

14:37 <clever> while most gzip libraries (for example) don't directly support seeking, you can still add it in

14:37 <heat> moon-child, the important bit here is that I have the whole source buffer, but I don't have the whole dst buffer

14:38 * moon-child nods

14:38 <clever> if you scan the headers for each block, you can convert a byte offset into the start of a block, and an offset within that block

14:38 <clever> then you can skip to decompressing just that block

14:39 <clever> you can also do what things like zfs do, where each FS extent is a completely self-contained compressed object

14:39 <clever> so you just read the FS metadata, and decompress the right block

14:39 <heat> homie this is a compressed tarball

14:39 <heat> i'm not going around that

14:39 <clever> but in the zfs case, things are complicated, you have 128kb extents, being compressed separately, and then turned into a series of 4k blocks

14:39 <heat> at least not until I figure out squashfs

14:40 <mrvn> didn't we figure out that the SD card is slower than a single core can decompress?

14:40 <clever> ah, for a simple .tar.gz, i would recommend 2 things

14:40 <heat> ideally I would craft a squashfs image and be done with it - the problem is that the documentation isn't great, as usual

14:40 <clever> 1: do a streaming decompress of the entire file, parse the .tar headers as you go, make note of the byte-offset&filename of every file in the tar

14:41 <heat> most linux distros use squashfs images for their livecd

14:41 <clever> 2: as you're doing that, also make note of the uncompressed->compressed byte offsets for each block of gzip data

14:41 <clever> then you can resume decompression in the middle of the .tar.gz at any time, and skip ahead to the right tar entry
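
Step 1 of clever's suggestion (walking the tar headers and noting where each file's data starts) can be sketched as below. This is a simplified illustration, not anyone's actual initrd code: it runs over an already-decompressed buffer instead of a stream, handles only the classic 100-byte name and octal size fields, and ignores the ustar prefix field, long-name extensions and checksums. `tar_entry` and `tar_index` are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TAR_BLOCK 512

/* One recorded file: its name and where its data lives in the archive. */
struct tar_entry {
    char   name[101];    /* name field is 100 bytes, not always NUL-terminated */
    size_t data_offset;  /* byte offset of the file data within the tar */
    size_t size;         /* file size in bytes */
};

/* Walk the tar headers in buf and fill out[] with up to max entries.
 * Returns the number of entries recorded. */
static size_t tar_index(const unsigned char *buf, size_t len,
                        struct tar_entry *out, size_t max)
{
    size_t off = 0, n = 0;

    while (off + TAR_BLOCK <= len && n < max) {
        const unsigned char *hdr = buf + off;
        if (hdr[0] == '\0')
            break; /* an all-zero block marks the end of the archive */

        /* Size is stored as octal text at offset 124, 12 bytes wide. */
        char size_field[13];
        memcpy(size_field, hdr + 124, 12);
        size_field[12] = '\0';
        size_t size = (size_t)strtoul(size_field, NULL, 8);

        size_t data_off = off + TAR_BLOCK;
        if (size > len - data_off)
            break; /* truncated or corrupt archive */

        memcpy(out[n].name, hdr, 100);
        out[n].name[100] = '\0';
        out[n].data_offset = data_off;
        out[n].size = size;
        n++;

        /* File data is padded up to a multiple of 512 bytes. */
        off = data_off + (size + TAR_BLOCK - 1) / TAR_BLOCK * TAR_BLOCK;
    }
    return n;
}
```

In a streaming setup the same loop would consume 512-byte blocks as the decompressor produces them, keeping only the index and discarding the file data.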

14:42 <heat> this is not gz, but zstd

14:42 <heat> i'm not sure I can do that

14:42 <heat> seems risky at least

14:42 <mrvn> clever: you can only start in the middle when you know the dictionary at that point

14:42 <clever> check if zstd is block based, and if you can uncompress just one block

14:42 <heat> I'm probably looking into just doing it all in one pass

14:42 <j`ey> heat: is this for the initrd or?

14:42 <heat> j`ey, ack

14:42 <mrvn> You can record all the places in a gz file where the dictionary is flushed.

14:43 <clever> mrvn: and where does gzip store the dictionary? how does `cat foo.gz bar.gz > baz.gz` work, is the dictionary a record within the stream and bar.gz updates it?

14:43 <heat> j`ey, the problem is that the problem here is when I want to generate a livecd environment

14:43 <heat> whoa, looped a bit there

14:43 <mrvn> clever: gzip builds the dictionary from your input

14:43 <clever> ah, that answers half of that

14:43 <heat> I can get huge livecd env, 200MB for instance

14:43 <clever> so you need to record the position of each dictionary, the position of each block, and how much the block expands to

14:43 <heat> by decompressing everything at once, I take up around 250MB

14:43 <clever> then you should be able to seek within gzip

14:44 <mrvn> clever: as said the dictionary is built from your input. You have to start at a place where the dictionary is flushed.

14:44 <clever> mrvn: ah, so treat the offset immediately after a flush, as a new block?

14:44 <mrvn> yep.

14:44 <clever> and then that whole chunk, up to the next flush, is one unit

14:45 <clever> and you can skip to the start of any of those units

14:45 <clever> and then just record the byte offset to the start of each, and how much it unpacks to

14:45 <mrvn> gzip also has a rsyncable option where it flushes the dictionary a few extra times when the adler32 checksum of the input is 0.

14:45 <clever> so you can convert an output offset, to an input offset pair
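
The index clever and mrvn arrive at (one record per flush point, holding where the unit starts in the output and in the input) can be looked up with a binary search. A minimal sketch, assuming the records were collected during one full streaming pass and are sorted by uncompressed offset; `seek_point` and `seek_lookup` are hypothetical names, and the decompressor call that would follow is not shown:

```c
#include <stddef.h>
#include <stdint.h>

/* One restart point: a place where the dictionary was flushed. */
struct seek_point {
    uint64_t out_offset; /* uncompressed offset where this unit starts */
    uint64_t in_offset;  /* compressed offset where this unit starts */
};

/* Find the last restart point at or before `target` (an uncompressed offset).
 * On success, *skip is how many decompressed bytes to discard after
 * restarting at the returned point. Returns NULL if no point qualifies. */
static const struct seek_point *seek_lookup(const struct seek_point *points,
                                            size_t count, uint64_t target,
                                            uint64_t *skip)
{
    if (count == 0 || points[0].out_offset > target)
        return NULL;

    /* Invariant: points[lo].out_offset <= target, and hi is either count
     * or an index whose out_offset is greater than target. */
    size_t lo = 0, hi = count;
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (points[mid].out_offset <= target)
            lo = mid;
        else
            hi = mid;
    }

    *skip = target - points[lo].out_offset;
    return &points[lo];
}
```

As mrvn points out, this only works if every recorded `in_offset` is a position where decompression can start without earlier history.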

14:45 <mrvn> Makes it flush at the same places in a file even if the start of the file changes.

14:45 <clever> ah, neat

14:46 <clever> i assume that makes the binary diff smaller?

14:46 <mrvn> means the compressed files have identical parts.

14:46 <clever> yeah

14:46 <mrvn> more of them anyway.

14:46 <clever> zfs gets similar (assuming no insertions, only overwrites), by splitting the file into 128kb chunks first, then compressing each chunk separately

14:47 <clever> but zfs is also designed to allow seeking within a compressed object

14:47 <heat> https://dr-emann.github.io/squashfs/squashfs.html

14:47 <bslsk05> dr-emann.github.io: Squashfs Binary Format

14:47 <clever> heat: and now you're making me want to add squashfs to little-kernel, lol

14:48 <clever> at a glance, that looks well documented

14:51 <clever> bbl

15:00 bauen1 has quit [Ping timeout: 252 seconds]

15:01 <heat> you didn't even add ext4 yet!

15:01 <heat> also that was reverse engineered apparently

15:01 <heat> so... unclear if its great

15:11 gildasio has quit [Ping timeout: 268 seconds]

15:11 gildasio has joined #osdev

15:22 frkzoid has joined #osdev

15:26 <clever> heat: as for why that idea popped into my head, i was recently mentioning on the rpi forums, about how my baremetal code can do things like a slide show you might find on an advertising sign, at very short boot times and perfect vsync'd swaps

15:26 <clever> and the current limitations, are no hdmi (solved by running on the arm side, with the official firmware)

15:27 <clever> and using an initrd for the image file payload seems like a way to simplify config

15:31 xenos1984 has quit [Quit: Leaving.]

15:51 <mats1> https://pierrekim.github.io/blog/2022-08-24-2-byte-dos-freebsd-netbsd-telnetd-netkit-telnetd-inetutils-telnetd-kerberos-telnetd.html

15:51 <bslsk05> pierrekim.github.io: 2-byte DoS in freebsd-telnetd / netbsd-telnetd / netkit-telnetd / inetutils-telnetd / telnetd in Kerberos Version 5 Applications - Binary Golf Grand Prix 3 - IT Security Research by Pierre

15:55 <gog> dang

15:58 <heat> oh no

15:58 <heat> not telnet!

16:06 gildasio has quit [Remote host closed the connection]

16:06 gildasio has joined #osdev

16:15 <mats1> the wonders of 30y/o open sores

16:20 frkzoid has quit [Ping timeout: 260 seconds]

16:21 <gog> love me some programmatic ulcers

16:22 <heat> so everyone copied their telnet implementations from each other

16:23 <heat> and now they all have an exploit

16:23 <heat> this sounds so BIOS its not even funny

16:23 <mjg> man

16:23 <heat> man

16:23 <mjg> i was at a workplace which refused to retire telnet in 2011

16:24 <heat> swear to fucking god

16:24 <mjg> despite me pointing out the daemon is unused and it is waiting for someone to fuzz it for lulz

16:24 <heat> you need to start to name drop

16:24 <mjg> this one was leading polish webhosting provider, nazwa.pl

16:24 <heat> your employers are all getting blacklisted

16:24 <mjg> anyho

16:24 <mjg> it was all nice and dandy until someone did precisely that -- a root priv 0day dropped around that time for telnetd

16:25 <mjg> they turned something which should have been a mere curiosity into an actual threat

16:26 <mjg> make no mistake though, webhosting companies are a shithole

16:26 <clever> the only time i ever really had an interest in telnet, was when i was trying to run commands remotely from another script, on windows, but the non-deterministic binary junk in telnet got in the way at the time

16:26 <clever> i just made a custom tcp protocol instead

16:26 <mjg> for them security is an old african word for irrelevant

16:27 <clever> which is a better choice, then it can only trigger the actions i approve of, and cant just run anything

16:27 <zid> I likes telnet

16:27 <mjg> heat: btw that place also used sendmail :-P

16:27 <zid> It's up at 0xFF so it's utf-8 clean, has useful but not stupidly complex commands

16:27 <mjg> i don't know what you heard about that unit of a mta

16:27 <heat> custom tcp protocol is almost as scary as custom udp protocol that implements tcp

16:28 <heat> mjg, they're very old-unix, i like it

16:28 <heat> mckusick would be proud

16:28 <mjg> would not, that was linux

16:28 <mjg> debian

16:29 <heat> debian is indeed old

16:29 <mjg> that was debian woody, which was already obsolete when i joined

16:29 <mjg> 8)

16:31 <heat> debian is obsolete from its release onwards

16:38 <mjg> wait, there is no openbsd on that list

16:38 <mjg> did they whack telnetd?

16:39 <mjg> ... yes they did

16:39 <mjg> but the client is still there

16:40 <clever> heat: ive made countless custom line-based tcp protocols, and one binary one (that was wrapped in tls)

16:46 <mrvn> "... and because *physics* ..."

16:47 nyah has joined #osdev

16:59 carbonfiber has joined #osdev

17:06 matt__ has joined #osdev

17:15 dude12312414 has joined #osdev

17:42 the_lanetly_052_ has quit [Ping timeout: 252 seconds]

17:42 the_lanetly_052 has joined #osdev

17:43 dionys has quit [Ping timeout: 255 seconds]

17:45 eck has quit [Quit: PIRCH98:WIN 95/98/WIN NT:1.0 (build 1.0.1.1190)]

17:46 dionys has joined #osdev

17:46 matt__ is now known as freakazoid333

17:48 eck has joined #osdev

17:50 dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]

17:59 the_lanetly_052_ has joined #osdev

18:02 the_lanetly_052 has quit [Ping timeout: 264 seconds]

18:04 FreeFull has joined #osdev

18:05 the_lanetly_052 has joined #osdev

18:06 the_lanetly_052_ has quit [Ping timeout: 260 seconds]

18:32 seer has quit [Quit: quit]

18:33 gildasio has quit [Ping timeout: 268 seconds]

18:34 seer has joined #osdev

18:34 gildasio has joined #osdev

18:38 freakazoid333 has quit [Ping timeout: 260 seconds]

18:53 <geist> TIL about -ffinite-loops

18:53 <geist> useful: nerfs the ability of gcc and clang to mark infinite loops as UB

18:54 <mrvn> somehow when I write "while(true) { }" it doesn't eliminate that

18:56 <geist> seems to only be on clang

18:56 <geist> actually -fno-finite-loops. which seems inverted polarity

18:56 <mrvn> it just seems to detect obviously left empty loops and leaves them

18:57 <geist> https://gcc.godbolt.org/z/r4fq164Y1

18:57 <geist> yep. trouble is recentlyish clang has started to treat infinite loops as UB, and usually just elides them entirely, and silently

18:57 <geist> this has caused problems in zircon on more than one occasion

18:58 <geist> usually specialized 'die here' sort of loops in the kernel or early bringup code

18:58 <geist> but the fact that it just chooses to elide them is particularly heinous

18:58 <mrvn> it's UB in C++ as programs must make progress.

18:58 <geist> right. so this switch nerfs that
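
Besides the compiler switch, a common way to keep such loops alive is to give every iteration something the compiler must treat as observable. The sketch below shows two variants, assuming GCC or Clang for the inline asm; the function names are hypothetical. For context: C11 only lets the compiler assume termination when the controlling expression is not a constant and the body does no I/O, volatile access, or atomic/synchronization operation, while C++ applies the forward-progress assumption more broadly.

```c
/* Spin until *flag becomes nonzero. Each iteration does a volatile read,
 * which is observable behavior, so the loop cannot be assumed to terminate
 * and silently removed. */
static void wait_for_flag(const volatile int *flag)
{
    while (*flag == 0) {
        /* busy-wait */
    }
}

/* "Die here" loop for panic or early bring-up paths. The empty volatile asm
 * is something the compiler may not delete, so in practice the loop is kept
 * by GCC and Clang (assumption: this relies on compiler behavior, not on a
 * guarantee from the language standard). */
__attribute__((noreturn)) void halt_forever(void)
{
    for (;;) {
        __asm__ volatile("" ::: "memory");
    }
}
```

A real kernel would usually put a halt or wait-for-interrupt instruction in the asm instead of leaving it empty.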

18:59 <MrBonkers> LittleFox I know I'm late, but considering you mentioned mlibc the other day, any reason why you didn't pick that? We're more than happy to help fix issues that you ran into

18:59 <LittleFox> didn't know it when I started looking for a libc to use and now just stuck with newlib as it's mostly good enough for now

19:00 papaya has joined #osdev

19:00 <MrBonkers> well if you want to switch, feel free to give us a shout if anything comes up (#managarm-mlibc on libera or the Managarm discord server)

19:02 <dzwdz> infinite loops being UB is weird

19:04 <dzwdz> i'm probably a bit ignorant, but it seems like the optimizations it enables are pretty niche

19:05 <geist> yeah also AFAICT gcc doesn't do anything with this, so unclear precisely what it's useful for

19:05 <mrvn> It neatly solves the halting problem. Every correct c++ program halts.

19:05 smach has quit []

19:05 <geist> and further moves C++ away from being acceptable for bare metal or non user-space-program centric

19:06 <\Test_User> what about when it uses external input for whether or not to halt; it may never halt if the input says not to

19:06 <\Test_User> s/whether or not/when/

19:06 <geist> it's strange too, since it seems to not admit that signals exist? AFAIK that's part of the language spec

19:06 <geist> that something like signals can come along

19:06 <mrvn> yeah, I never got that part.

19:07 <mrvn> alarm(1); while(true) { } is perfectly fine code

19:07 <geist> exactly, was going to say that

19:07 <dzwdz> geist: isn't it a thing in C too?

19:07 <dzwdz> N1528 seems to suggest that

19:08 <geist> indeed. C++ even acknowledges that stuff like signals exist because of things like atomic_signal_fence and whatnot

19:08 <dzwdz> \Test_User: then that doesn't fit the definition of an infinite loop because it does io

19:08 <\Test_User> but it's still potentially infinite, so halting problem not solved

19:08 <dzwdz> eh, is it

19:08 <dzwdz> the user will die eventually

19:09 <mrvn> \Test_User: I don't think the halting problem includes infinite input

19:09 <\Test_User> so will the CPU, so therefore while (true) {} is also non-infinite?

19:09 <dzwdz> indeed

19:09 <\Test_User> mrvn: ah

19:09 <dzwdz> https://xkcd.com/1266/

19:09 <bslsk05> xkcd - Halting Problem

19:10 <GeDaMo> while (!heat_death()) {}

19:11 <mrvn> don't quantum effects prevent the heat death?

19:11 <heat> no

19:11 <heat> i am live

19:12 <dzwdz> i think so

19:13 <LittleFox> MrBonkers, thanks :)

19:13 <LittleFox> does it work with libc++, anything known?

19:14 <MrBonkers> unsure if that is tested, but it should. If it doesn't, we consider that a pretty major bug ngl

19:14 <mrvn> heat death is when everything is in thermodynamic equilibrium. So I can measure the temperature, and therefore speed, of a gas. And then I can determine the position of an atom and that's impossible to know both accurately.

19:14 <MrBonkers> works for sure with libstdc++ tho

19:14 <dzwdz> i don't think the laws are meant to apply at that point

19:15 <mrvn> dzwdz: because any observer is dead at that point?

19:15 <dzwdz> yeah, there's no police

19:15 <dzwdz> it's free for all

19:16 <dzwdz> i think quantum fluctuations are the thing that prevents the heat death

19:17 <dzwdz> but when a friend explained them to me it kinda went over my head, so idk

19:18 <heat> i die will never

19:18 <mrvn> " Sean M. Carroll, originally an advocate of this idea, no longer supports it,[24][25] arguing that the virtual particles produced by quantum fluctuation cannot become real particles without an external input of energy. "

19:18 <dzwdz> heat: not quite, you'll die but then appear again

19:18 <heat> like obi wan

19:18 <heat> ?

19:19 <mrvn> dzwdz: those fluctuations could create a new universe in 10^10^10^56 years but " Such a scenario, however, has been described as "highly speculative, probably wrong, [and] completely untestable". but but but I wanna test that, let me test that

19:19 <dzwdz> like, infinitely many times

19:19 <dzwdz> i'm purposefully not reading any more about it

19:19 <dzwdz> i'm probably heavily misunderstanding it and i don't want to ruin it

19:20 frkzoid has joined #osdev

19:20 <GeDaMo> https://en.wikipedia.org/wiki/Boltzmann_brain

19:20 <bslsk05> en.wikipedia.org: Boltzmann brain - Wikipedia

19:24 <mrvn> GeDaMo: there is a certain probability that all the atoms in your underwear will move 1m to the right at the same time.

19:24 * mrvn turns on the probability engine.

19:25 <GeDaMo> I don't get invited to those sort of parties :|

19:25 k0valski1889 has joined #osdev

19:26 <heat> operating system

19:26 <moon-child> party(6)

19:26 <heat> so how about those kernels eh

19:26 * moon-child grabs popcorn

19:27 <dzwdz> i got my garbage tcp stack to the point where it just barely works

19:27 <heat> thats the spirit

19:28 <dzwdz> doing a http file transfer at .5mbps is enough to pin the cpu

19:29 <heat> fucken what

19:29 <dzwdz> who needs more

19:30 <dzwdz> it was just meant to be a prototype to see how networking would fit into the overall system, i purposefully didn't care about optimizing it at all

19:30 <heat> what are you doing and why is it O(n!)

19:32 <moon-child> n‽ That's extreme. Usually you only get accidentally quadratic

19:32 <moon-child> or even just crap constant factors

19:32 <dzwdz> i'm making a lot of unneeded memory allocations

19:33 <dzwdz> i think there's one per each incoming tcp/udp packet

19:33 <moon-child> bet your memory allocator also sucks

19:33 <dzwdz> dlmalloc

19:33 <dzwdz> so, yeah

19:33 <moon-child> oh, eh, not amazing by modern standards, but dlmalloc is probably ok, esp. on one thread

19:34 <dzwdz> i'm actually running two threads

19:34 <moon-child> I was thinking tinyalloc or something

19:34 <dzwdz> and i didn't enable the thread safety thing at all

19:34 <dzwdz> lol

19:34 <moon-child> well

19:34 <heat> one per incoming packet is super standard

19:34 <moon-child> get thee some mimalloc!

19:35 <dzwdz> probably the bigger issue is that my ipc isn't that great

19:37 <dzwdz> is it bad if i manually parse the paging structures when i copy memory between different address spaces?

19:37 <heat> define manually?

19:38 <moon-child> as in you don't have your own model in-kernel of the virtual address mappings?

19:38 <dzwdz> i'd assume that the best way to do larger transfers is to map the pages i'm copying from/to in memory

19:38 <dzwdz> so i could then do a single contiguous transfer

19:39 <moon-child> ideally you would have shared memory buffers

19:39 <moon-child> and let the applications negotiate what 'transfer' means

19:39 <dzwdz> i'm also thinking about that

19:39 <heat> yeah sure that sounds fine

19:39 <moon-child> (maybe I shouldn't say 'ideally'. That is one common strategy, and if you are unsure of what to do, following orthodoxy may be a good idea)

19:43 <dh`> remapping pages as part of IPCs is expensive

19:43 <dzwdz> i have filesystems in userspace, and i was thinking about making read()/write() work on pages instead of arbitrary buffers

19:43 <dzwdz> dh`: ok then

19:44 <dh`> copying data is also expensive

19:44 <heat> well yes, depends on the size

19:44 <dh`> there's no one right answer (other than, perhaps, don't build things as microkernels)

19:45 <dzwdz> literally the only reason i'm doing osdev in the first place is that i wanted to experiment with an idea i've had for a microkernel

19:45 <dzwdz> so, no

19:45 <dh`> sure

19:45 <dh`> it just puts you on the back foot for performance

19:49 <heat> if you wanted to be on the back foot for performance you could've just used openbsd instead

19:50 <zid> heat forgetting solaris exists smh

19:50 <heat> cc mjg

19:53 <mjg> just copy paste my last rant

19:57 <mjg> https://people.freebsd.org/~mjg/illumos/smartos-bzImage-j40-inc.svg

19:57 <mjg> 's what you get when you have lock-protected reference counts on vnodes

19:57 <mjg> and there are parallel lookups in the same dir

19:58 <ThinkT510> dzwdz: you are making a microkernel? do you have a repo one can look at or are you just in planning stages?

19:58 <geist> mjg: i sat down to fiddle with getting a memcpy & memset benchmark going last night. surprisingly my zen 3 has both ERMS and FSRM bits in cpuid set

19:58 <geist> i didn't know about the latter

19:58 <mjg> geist: oh?

19:58 <dzwdz> well, i do, but the readme is severely outdated

19:58 <dzwdz> lemme update it, i'll send a link

19:58 <geist> zen 2 has neither, of course

19:58 Gooberpatrol66 has quit [Quit: Leaving]

19:58 <geist> and somewhat predictably has fairly bad spin up on rep stosb as you say

19:59 <mjg> geist: are you sure about fsrm? does cpuid confirm it?

19:59 <mjg> i mean the cpuid tool

19:59 <mjg> it recognizes the bit

19:59 <geist> yep

19:59 <mjg> nice

19:59 <mjg> so what are the results :)

19:59 <geist> oh i dont have them handy right now, just was going to point out

19:59 <mjg> i almost got an fsrm capable box yesterday

20:00 <mjg> at work

20:00 <geist> still working on the tool, etc. was mostly getting it in place so i can then do side by side comparisons of memcpy and memset with various alignments/sizes, etc

20:00 <mjg> but some usb bullshit prevented it

20:00 <papaya> perhaps a dumb question, but what is the FSRM bit?

20:00 <mjg> "fast short rep mov"

20:00 <dzwdz> so do you know how complex some memcpy() functions can get? if you have FSRM you can replace your memcpy with like 4 asm instructions

20:00 <geist> yah it's the bit that says 'we *really* mean the ERMS bit now'

20:00 <heat> it's a big fat lie

20:01 <heat> dzwdz, 1) very 2) no

20:01 <dzwdz> supposedly, i don't have the hardware to test it on

20:01 <mjg> the supposedly bit is what makes me worried :)

20:01 <dzwdz> i mean

20:01 <dzwdz> you CAN, can't you?

20:01 <moon-child> you could do that anyway

20:01 <mjg> also, even if it spins up fast enough for short bufs, the question is how it handles misalignment

20:01 <dzwdz> isn't that what linux does

20:01 <moon-child> it's part of the base isa
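
The "few instructions" version dzwdz and moon-child are talking about is essentially a bare `rep movsb`. A minimal sketch using GCC/Clang inline asm, with a plain byte loop as a fallback on other architectures; `memcpy_rep_movsb` is a hypothetical name, and whether this is fast for short or misaligned copies on a given CPU is exactly what the discussion leaves open:

```c
#include <stddef.h>

/* Copy n bytes with `rep movsb` on x86, or a byte loop elsewhere.
 * Assumes the direction flag is clear, as the usual ABIs guarantee. */
static void *memcpy_rep_movsb(void *dst, const void *src, size_t n)
{
    void *ret = dst;

#if defined(__x86_64__) || defined(__i386__)
    /* RDI/EDI = destination, RSI/ESI = source, RCX/ECX = count.
     * All three registers are modified by the instruction. */
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(n)
                     :
                     : "memory");
#else
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
#endif

    return ret;
}
```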

20:02 <papaya> @mjg thank you

20:02 <heat> dzwdz, https://github.com/heatd/Onyx/blob/master/musl/src/string/x86_64/memmove.S

20:02 <bslsk05> github.com: Onyx/memmove.S at master · heatd/Onyx · GitHub

20:02 <mjg> dzwdz: linux has been doing plain erms for some time now for several funcs

20:02 <heat> and this is pretty simple compared to glibc

20:02 <mjg> dzwdz: *before* fsrm became a thing

20:02 <mjg> dzwdz: as in, believe it or not, they don't have very optimized routines

20:02 <dzwdz> heat: come on, what isn't complex in glibc

20:03 <heat> they = kernel here

20:03 <mjg> is that bionic?

20:03 <heat> dzwdz, good point

20:03 <heat> yes

20:03 <mjg> it is going to lose to erms past *some* size

20:03 <mjg> i have not benchmarked yet

20:04 <heat> believe it or not, bionic's sse memcpy doesn't really win against musl's crapshoot

20:04 <heat> it turns out my CPU probably does ERMS on rep movsq

20:04 <mjg> for what sizes and what alignment

20:04 <heat> a whole bunch of em

20:04 <heat> we've gone through this before

20:04 <geist> also yeah there's two sets of solutions too: ones where vectors are available and ones where they aren't

20:05 <geist> the former is what i'm interested in right now

20:05 <dh`> it's always seemed to me that it would be better to write memcpy in C and teach the compiler to emit all the magic

20:05 <heat> geist, I think AVX is the super big win

20:05 <zid> That's what gc already does

20:05 <zid> gcc*

20:05 <moon-child> people have tried

20:05 <moon-child> compiler ain't good enough

20:05 <mjg> https://elixir.bootlin.com/linux/latest/source/arch/x86/lib/usercopy_64.c#L17 for weirdly pessimized linux right there

20:05 <geist> heat: indeed, but again without them available it doesn't matter if it's a win

20:05 <zid> if you write out a memcpy it spits out a kilobyte of avx :p

20:05 <bslsk05> elixir.bootlin.com: usercopy_64.c - arch/x86/lib/usercopy_64.c - Linux source code (v5.19.3) - Bootlin

20:06 <geist> anyway i dont mean to once again drag this channel into another memcpy thing, but mjg did get me interested again in how bad zircon has pessimized things (in the kernel)

20:06 <mjg> they had a good idea: when you have reads from /dev/zero or similar, instead of copying zeroes into the target buffer, you can just zero it out

20:06 <dh`> then the compiler should be made better

20:06 <mjg> ... except the routine is so botched....

20:07 <mrvn> Is all that unrolling even saving any time? Isn't the loop instruction eliminated by the hardware through pipelining and branch prediction?

20:07 <geist> i had an idea the other day that on x86 if you take a page fault for writes, then check what instruction did it and it's a rep stosb you could potentially emulate it in the VM if it crosses a few pages

20:07 <dzwdz> mjg: how is copying zeros different than zeroing a buffer out

20:07 <dh`> that's a neat idea

20:08 <mjg> dzwdz: for starters you don't waste cache to store the source buffer

20:08 <moon-child> mrvn: yes, unrolling is helpful

20:08 <mrvn> geist: you mean don't fault a page in from disk if it's going to overwrite it?

20:08 <geist> that would be one case yeah

20:08 <geist> or if it's just some code memsetting something large

20:08 <dzwdz> wait, so the normal way to do it is to have /dev/zero be backed by a buffer full of zeroes?

20:08 <mjg> yea

20:08 <heat> yea?

20:08 <heat> not really

20:08 <dh`> no, that's not the normal way, but it can happen that way

20:08 <mjg> so how do you do it

20:08 <dh`> depends on how you have your devices set up

20:09 <mrvn> moon-child: how so?

20:09 <heat> user_memset

20:09 <mjg> freebsd has a 4KB page mapped over a 2MB area

20:09 <mjg> user_memset is the same idea i described above

20:09 carbonfiber has quit [Quit: Connection closed for inactivity]

20:09 <heat> yes, because it makes sense

20:09 <mjg> i completely agree

20:09 <mjg> just saying it is not the norm

20:09 <heat> if reading always returns 0's, just memset it
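The two approaches being compared can be sketched in user-space C. This is only an illustration under assumptions: `zero_page`, `ZERO_PAGE_SIZE`, `zero_read_copy` and `zero_read_clear` are hypothetical names, and plain `memcpy`/`memset` stand in for whatever user-copy and user-clear primitives a real kernel would use.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a kernel's shared buffer of zeroes. */
enum { ZERO_PAGE_SIZE = 4096 };
static const unsigned char zero_page[ZERO_PAGE_SIZE];

/* Approach 1: serve the read by copying from a zero-filled source buffer,
 * one chunk at a time. The source buffer is read on every chunk. */
size_t zero_read_copy(void *dst, size_t len)
{
    unsigned char *out = dst;
    size_t done = 0;

    while (done < len) {
        size_t chunk = len - done;
        if (chunk > ZERO_PAGE_SIZE)
            chunk = ZERO_PAGE_SIZE;
        memcpy(out + done, zero_page, chunk);
        done += chunk;
    }
    return done;
}

/* Approach 2: no source buffer at all, just clear the destination. */
size_t zero_read_clear(void *dst, size_t len)
{
    memset(dst, 0, len);
    return len;
}
```

Both functions leave the destination in the same state; the second one never touches a source buffer, which is the cache argument made above.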

20:09 <heat> that's LSD code from 4.4

20:09 <mjg> yes

20:09 <moon-child> mrvn: there is still loop overhead

20:10 <mjg> bb in 15

20:10 <moon-child> mrvn: also consider the loop counter/index; on every iteration, the value of the counter depends on its value the previous iteration, and the body of the loop will likely depend on that index

20:10 <dh`> in a BSD kernel because of the uio abstraction the path of least resistance is to call uiomove() with a buffer of zeros

20:10 <mrvn> moon-child: and those instructions are not executed in parallel with waiting for the memory to load/store?

20:10 <dh`> and I think that's how it was in 4.4

20:10 <moon-child> so that limits your parallelism

20:11 <moon-child> they are executed in parallel. But for a sequential thing, you're probably not waiting very long

20:11 <moon-child> load/store are rather fast when hot
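For reference, a minimal C sketch of the kind of unrolling under discussion, assuming non-overlapping buffers of 64-bit words. The function names are made up for illustration, and an optimizing compiler may well unroll or vectorize the simple loop on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* One word per iteration: one index update and one loop branch per 8 bytes. */
void copy_words_simple(uint64_t *dst, const uint64_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Four words per iteration: the index update and loop branch are paid once
 * per 32 bytes, and the four loads do not depend on each other. */
void copy_words_unrolled4(uint64_t *dst, const uint64_t *src, size_t n)
{
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        uint64_t a = src[i];
        uint64_t b = src[i + 1];
        uint64_t c = src[i + 2];
        uint64_t d = src[i + 3];
        dst[i] = a;
        dst[i + 1] = b;
        dst[i + 2] = c;
        dst[i + 3] = d;
    }
    /* Remainder loop for the last 0 to 3 words. */
    for (; i < n; i++)
        dst[i] = src[i];
}
```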

20:14 GeDaMo has quit [Quit: A program is just a bunch of functions in a trenchcoat.]

20:16 <mrvn> moon-child: one thing I'm wondering is branch prediction. Say you have a loop with one branch. Now it gets unrolled 4 times. Doesn't that mean the loop now uses 4 branch predictors (let's assume no collision) and learns 4 times slower?

20:18 <moon-child> yes

20:19 <mrvn> I noticed clang is generally more conservative with unrolling. gcc code is often a lot longer.

20:19 <moon-child> really? I observe the opposite--gcc is usually more conservative

20:20 <heat> i've heard clang generates larger code at least

20:20 <heat> that's why they can't use it in firmware

20:20 <ThinkT510> dzwdz: found it. what made you settle on the name camellia?

20:21 <dzwdz> just picked a random flower name

20:21 <ThinkT510> neato

20:22 <dzwdz> i think the readme is from before i even wrote any code

20:22 <dzwdz> and the docs are super outdated too

20:23 <ThinkT510> any inspirations from other microkernel designs?

20:23 <mrvn> worst example I had was the destructor for a binary tree. If there is a left child call the destructor for the left child. Same for right child. clang just makes recursive calls. While gcc inlined 4 levels of recursion before doing a recursive call.
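The shape of the function mrvn describes, rendered in C rather than as a C++ destructor. This is a hypothetical stand-in (`struct node`, `node_new` and `node_destroy` are illustrative names), and the return value exists only so the behaviour can be checked. How much of the recursion gets inlined depends on the compiler and flags.

```c
#include <stddef.h>
#include <stdlib.h>

struct node {
    struct node *left;
    struct node *right;
    int value;
};

/* Allocate a node; returns NULL if malloc fails. */
struct node *node_new(int value, struct node *left, struct node *right)
{
    struct node *n = malloc(sizeof *n);
    if (n) {
        n->value = value;
        n->left = left;
        n->right = right;
    }
    return n;
}

/* Destroy the subtree rooted at n (n must not be NULL): if there is a left
 * child, destroy it; same for the right child; then free the node itself.
 * Returns the number of nodes freed. */
size_t node_destroy(struct node *n)
{
    size_t freed = 1;

    if (n->left)
        freed += node_destroy(n->left);
    if (n->right)
        freed += node_destroy(n->right);
    free(n);
    return freed;
}
```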

20:24 <mjg> dh`: so netbsd is not doing it?

20:24 <heat> my kernel is inspired on unix

20:24 <heat> and svr4, the best kernel ever

20:24 <mjg> unix sucks man

20:24 <heat> and linux, kinda shit

20:24 <mjg> check out "dnlc"

20:24 <heat> u succ fuk u bsd man go cry 2 berkly boohoo u dont like att i fuk u up

20:25 <mrvn> heat: 2 things were invented at Berkeley: LSD and BSD. I think there is a connection.

20:25 <kof123> where is was/att? is this like east const west const all over again

20:25 <kof123> *is/was

20:25 <heat> lerkeley software distribution

20:25 <heat> just saying

20:26 <mjg> sounds like someone is on the stuff

20:26 <heat> yes

20:26 <mrvn> mjg: not enough

20:26 <heat> i have a big will to bring in mach's VM

20:26 <heat> also share my fucking code all over

20:26 <mjg> i wanted to say dnlc does not have backpointers from vnodes to nc entries

20:26 <mjg> which has a funny result where sometimes you need to SCAN THE ENTIRE CACHE to find them

20:27 <dzwdz> ThinkT510: not really, i looked around but didn't find anything i particularly liked

20:27 <mjg> and yes, solaris is doing it

20:27 <heat> d o o r s

20:27 <heat> slightly worse than STREAMS but I'll take it

20:28 <dzwdz> i wanted a mechanism for privilege separation which i'd find easy to grok and use

20:29 <ThinkT510> dzwdz: have you looked at managarm? they made a fosdem video in 2022 which was interesting

20:29 joe9 has quit [Quit: leaving]

20:29 <dzwdz> i honestly don't remember

20:30 <mjg> heat: you are laughing at a TRUE UNIX

20:30 <dzwdz> oh actually i didn't mention, this is influenced by plan 9 quite a bit

20:30 <heat> has marvell open sourced svr4 yet?

20:30 <heat> i wanna take a look

20:30 <dzwdz> at first i was thinking about doing this on top of its codebase

20:31 <heat> the best part about plan9 is the GARBLED FUCKING IDENTIFIERS GOD DAMN IT WHY IS EVERYONE ON LSD AGAIN

20:31 <mjg> i love their asm

20:31 <mjg> also love how it made its way to golang

20:31 <heat> link???????????

20:31 <heat> pls

20:31 <mjg> which part

20:32 <heat> asm

20:32 <dzwdz> plan9 is full of great ideas

20:32 <dzwdz> each with this ONE weird quirk which'll make you hate it

20:33 <dzwdz> still, it grew on me

20:33 <mjg> just search for golang on github or whatever, for example memmove_amd64.s

20:33 <mjg> TEXT runtime·memmove<ABIInternal>(SB), NOSPLIT, $0-24

20:34 <dh`> mjg: dunno (re zeros)

20:34 <heat> is it fast?

20:34 <mjg> oh heh

20:34 <mjg> geist:

20:34 <mjg> // REP instructions have a high startup cost, so we handle small sizes

20:34 <mjg> // with some straightline code. The REP MOVSQ instruction is really fast

20:34 <mjg> // for large sizes. The cutover is approximately 2K.

20:34 <mjg> tail:
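A rough C sketch of the size dispatch that comment describes. It is not the Go runtime's code: `copy_dispatch`, `copy_small` and `copy_large` are made-up names, the byte loop merely stands in for the hand-written straight-line code, and the 2048 cutover is just the approximate figure from the quoted comment, not something measured here. The inline assembly is only used with GCC or Clang on x86-64 and assumes the direction flag is clear, as the calling convention requires; elsewhere the sketch falls back to `memcpy`.

```c
#include <stddef.h>
#include <string.h>

/* Approximate cutover taken from the quoted comment; not benchmarked here. */
enum { REP_CUTOVER = 2048 };

/* Stand-in for the small-size path (the real code is straight-line moves). */
static void copy_small(unsigned char *dst, const unsigned char *src, size_t n)
{
    while (n--)
        *dst++ = *src++;
}

/* Large-size path: REP MOVSQ where available, memcpy otherwise. */
static void copy_large(void *dst, const void *src, size_t n)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    size_t qwords = n / 8;

    /* Copies qwords * 8 bytes and advances dst and src past them. */
    __asm__ volatile("rep movsq"
                     : "+D"(dst), "+S"(src), "+c"(qwords)
                     :
                     : "memory");
    /* Copy the remaining 0 to 7 tail bytes. */
    memcpy(dst, src, n % 8);
#else
    memcpy(dst, src, n);
#endif
}

/* Forward copy of n bytes between non-overlapping buffers. */
void *copy_dispatch(void *dst, const void *src, size_t n)
{
    if (n < REP_CUTOVER)
        copy_small(dst, src, n);
    else
        copy_large(dst, src, n);
    return dst;
}
```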

20:34 <dh`> but if you're doing something performance-critical with /dev/zero you're already in the wrong place :-)

20:34 <geist> linux and splice() and /dev/zero is pretty slick though

20:35 <mjg> geist: from golang's src/runtime/memmove_amd64.s , perhaps you can bring these people to a fuchsia meeting

20:35 <geist> have had pretty good luck with that

20:35 <heat> you mean vmsplice?

20:35 <heat> geist

20:35 <geist> mjg: yeah part of the problem honestly is there *is* a lot of internal research at google with regards to memcpy/memmove

20:35 <geist> but it tends to be *highly* server workload oriented

20:35 <geist> so it's a bit hard to get folks out of that mindset

20:35 <geist> ie, it's not a blank slate to have a good discussion, because there's already a lot of heavy ammo on the table

20:36 <heat> where's the avx512 memcpy

20:36 <geist> precisely

20:36 <moon-child> avx512 is like easy-mode for memcpy

20:36 <moon-child> load masked, store masked, done

20:37 <mjg> it's like erms would like to be11

20:37 <mjg> what*

20:37 <geist> heat: anyway i was just thinking about splice() syscall and /dev/zero

20:37 <geist> kinda fun to move a lot of data around with that

20:37 <mjg> does it work though?

20:37 <geist> pv /dev/zero is particularly nice for just generating a crapton of data on a pipe since it uses splice()

20:38 <mjg> i distinctly remember someone from G implementing reads from sparse files from tmpfs with clear_user

20:38 <mjg> .. only to find it causes a slowdown cause the routine is atrocious

20:38 * geist nods

20:38 <mjg> if you end up using it... :)

20:38 <geist> key though is as usual: what version, what sort of hardware are you on, etc etc

20:39 <geist> sometimes especially for Big Iron stuff, seem that copying around may have completely different crossover points with pissant consumer shit

20:39 <mjg> any hw, the routine has not been updated in over a decade and has been dodgy from the getgo

20:39 * geist nods

20:39 <mjg> i traced it a little bit with ebpf

20:39 <mjg> it is mostly used to store just few bytes, then there is a cutoff to over 1KB

20:39 <geist> i've noticed, for example, on derpy arm hardware on linux, that splice(/dev/zero) is pretty good

20:39 <geist> but then, i have no numbers to back it up, etc etc

20:39 <mjg> oh right

20:39 <mjg> it's amd64 :)

20:40 <geist> see, thats *precisely* what i mean by 'what hardware are you on, etc'

20:40 <geist> as in, yes that shit *actually matters*

20:40 <heat> vmsplice writes to pipes don't copy

20:40 <heat> they just COW pages

20:40 <heat> its distinctly cool

20:40 <heat> sadly they do regular old memcpy on reads

20:40 <mjg> geist: well i did paste the amd64 asm for that routine higher up, so...

20:40 <geist> that's precisely the kinda thing i tend to hit at google. folks are very big on Big Iron x86 and then you have this conversation and you have to remind them not all problems look like that

20:41 <mjg> and we are talking amd64 here with all the rep stuff, so the context... ;)

20:41 <geist> sure sure

20:41 <mjg> anyhow i found the file, here is part of it https://dpaste.com/5XBVNSLLG

20:42 <bslsk05> dpaste.com <no title>

20:42 <mjg> the number after : is the call count

20:42 <mjg> size on the left

20:42 <mjg> it jumps from small to really big, and the latter SUCKS

20:42 <mjg> which is really weird given how easy it is to drastically improve this

20:42 <mjg> not to be confused with making it optimal for all uarchs

20:42 jafarlihi has joined #osdev

20:43 Gooberpatrol66 has joined #osdev

20:43 <jafarlihi> What do you think of single-header libraries in C++? Is it good idea to write your library as single-header when it is 20kloc+?

20:44 <heat> its a bad idea please dont

20:44 <griddle> single header libs are just there for people who don't value their time enough to sort out integrating into your build system

20:44 <moon-child> 'single-header' is a curiosity

20:44 <mjg> sudo bpftrace -e 'kprobe:__clear_user { @[kstack(), arg1] = count(); }' # if you want to take a look at your craplinux

20:44 <griddle> use cpp files and dont require any fancy compiler flags to work

20:45 <griddle> test your lib w/ pedantic or whatever idk :)

20:46 <kof123> "single header libs are just there " yep and sometimes people do that with C too for same reason

20:47 <heat> header-only lib in C sounds funny

20:47 <griddle> what is that c lib that implements like png, etc?

20:47 <heat> libpng

20:47 <griddle> the famous one that has a name close to stl

20:48 <griddle> its header only

20:48 <moon-child> stb?

20:48 <griddle> yeah that one

20:48 <griddle> idk, single-header in C is probably fine

20:49 <griddle> in C++ you are trading up front one-time cost for constant cost at every compile

20:49 <griddle> (the up front cost of getting library TUs setup in your project)

20:52 <geist> also in general header only libraries imply that most things are inlined

20:52 <griddle> or, you compile multiple times and then have to do symbol resolution in the linker later

20:53 <geist> indeed

20:53 <griddle> i mean, c++'s big compile-time footgun of templates make header only libs the default in that language
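For context, the stb-style single-header pattern in C looks roughly like the sketch below. `tiny_sum.h`, `tiny_sum` and `TINY_SUM_IMPLEMENTATION` are hypothetical names; here the header contents are shown inline in one file so the example is self-contained, whereas in a real project they would live in their own `.h` file.

```c
/* In exactly one .c file of the project, define the implementation macro
 * before including the header. Every other file includes it without it. */
#define TINY_SUM_IMPLEMENTATION

/* ---- begin contents of the hypothetical "tiny_sum.h" ---- */
#ifndef TINY_SUM_H
#define TINY_SUM_H

#include <stddef.h>

/* Declaration: seen by every file that includes the header. */
int tiny_sum(const int *values, size_t count);

#endif /* TINY_SUM_H */

#ifdef TINY_SUM_IMPLEMENTATION
/* Definition: compiled only in the one file that defined the macro,
 * so the linker sees a single copy of the function. */
int tiny_sum(const int *values, size_t count)
{
    int total = 0;

    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}
#endif /* TINY_SUM_IMPLEMENTATION */
/* ---- end contents of "tiny_sum.h" ---- */
```

Because the definitions are compiled in only one translation unit, this avoids the recompile-everywhere cost mentioned above for C++ header-only libraries, where templates and inline functions are parsed in every includer.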

20:54 joe9 has joined #osdev

21:00 <dzwdz> ThinkT510: i wrote a slightly better readme, so you can have some idea how it actually works

21:13 <ThinkT510> dzwdz: thanks

21:20 justache is now known as justDeez

21:22 MarchHare has joined #osdev

21:23 MarchHare has left #osdev [#osdev]

21:32 dude12312414 has joined #osdev

21:37 papaya has quit [Quit: Lost terminal]

21:42 frkzoid has quit [Ping timeout: 244 seconds]

21:45 pretty_dumm_guy has quit [Quit: WeeChat 3.5]

21:47 <Griwes> Hey geist, why are things like "can create a new process" job policies instead of handles to objects that carry kernel permissions in zircon? Is it "let's reduce the number of handles people need to pass around", "this is the way we initially figured it", or a deeper reason?

21:48 <geist> it's both actually

21:48 <geist> the job policy is a defense in depth, a second layer that operates under a different mechanism, meant to be used as an additional way to lock down a process

21:48 <geist> the primary mechanism for things like that is needing a handle to a thing that lets you make a thing

21:49 <geist> ie, making a new job needs a handle to a job with sufficient rights

21:49 <geist> same for processes, threads, etc

21:51 <Griwes> ...right, creating a process wasn't a good example, because it does need a handle. I guess creating a VMO is a better one

21:51 jafarlihi has quit [Quit: WeeChat 3.6]

21:52 <Griwes> So the idea is that if someone guesses a handle, they can't just use it? (your handles are global, or am I misremembering?)

21:56 <geist> they're per process
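A toy C model of per-process handles with rights, to make the point about guessing concrete. This is not Zircon's API: every name here (`struct process`, `handle_add`, `handle_lookup`, the `RIGHT_*` bits) is hypothetical. A handle value is only an index into the calling process's own table, so the same number means nothing in another process, and each entry carries a rights mask that is checked on use.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rights bits. */
enum { RIGHT_READ = 1, RIGHT_MANAGE_JOB = 2 };
enum { MAX_HANDLES = 8 };

struct object {
    int kind;
};

struct handle_entry {
    struct object *obj;   /* NULL means the slot is free */
    uint32_t rights;
};

/* Each process owns its own table; handle values index into it. */
struct process {
    struct handle_entry table[MAX_HANDLES];
};

/* Install obj in the first free slot; returns the handle value or -1. */
int handle_add(struct process *p, struct object *obj, uint32_t rights)
{
    for (int h = 0; h < MAX_HANDLES; h++) {
        if (p->table[h].obj == NULL) {
            p->table[h].obj = obj;
            p->table[h].rights = rights;
            return h;
        }
    }
    return -1;
}

/* Resolve a handle in process p, requiring all bits in `required`.
 * Returns NULL for out-of-range values, empty slots, or missing rights. */
struct object *handle_lookup(struct process *p, int h, uint32_t required)
{
    if (h < 0 || h >= MAX_HANDLES)
        return NULL;
    if (p->table[h].obj == NULL)
        return NULL;
    if ((p->table[h].rights & required) != required)
        return NULL;
    return p->table[h].obj;
}
```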

22:08 <raggi> It'd make bootstrap harder, and bootstrap is already quite painful

22:08 <raggi> Userspace bootstrap that is

22:22 bauen1 has joined #osdev

22:25 [itchyjunk] has joined #osdev

22:39 dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]

22:44 theruran has quit [Quit: Connection closed for inactivity]

22:47 heat has quit [Remote host closed the connection]

23:23 nyah has quit [Ping timeout: 252 seconds]

23:24 FreeFull has quit [Ping timeout: 268 seconds]

23:34 epony has joined #osdev

23:57 theruran has joined #osdev