#osdev on 2023-02-01 — irc logs at libera.irclog.whitequark.org

2021-05-23 01:57 klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books

00:14 <heat> omg we're getting a new glibc today

00:16 <geist> ah 2.37?

00:17 <heat> yes

00:19 <heat> oh shit, RELR support finally!

00:20 <mrvn> Relative relocations?

00:21 <heat> actually no RELR was in 2.36

00:21 <heat> fuck

00:21 <heat> either they haven't filled in NEWS yet or the glibc people are FRAUDS that do absolutely nothing

00:22 <heat> mrvn, https://groups.google.com/g/generic-abi/c/bX460iggiKg/m/GxjM0L-PBAAJ

00:22 <bslsk05> groups.google.com: Proposal for a new section type SHT_RELR

00:22 <mrvn> great, I haven't even grokked REL/RELA yet and now I need to add RELR too

00:27 <Griwes> neat

00:27 <heat> sir, my name is heat

00:28 <mrvn> "off with his head" and your neat.

00:28 <zid> shit relocator?

00:29 <heat> SHT_NOBITS moment

00:31 dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]

00:39 * mrvn goes and implements the Executable Verification Intermediate Language.

00:46 sauce has quit [Remote host closed the connection]

00:47 sauce has joined #osdev

00:55 * kof123 works on formula for entropylax pill at bootup

01:25 epony has quit [Ping timeout: 268 seconds]

01:33 heat has quit [Ping timeout: 248 seconds]

01:47 elastic_dog has quit [Killed (zirconium.libera.chat (Nickname regained by services))]

01:47 elastic_dog has joined #osdev

01:49 * gog glibcs

01:49 <gog> glad i waited to update

01:49 <gog> updating early is for chumps

01:49 <gog> the longer you wait to update the better

01:51 <\Test_User> does never count as a longer wait? :P

01:54 <gog> it's the longest possible wait and is thus the most ideal

01:54 <gog> logic++

02:06 <zid> couple of small updates available on my gentoo

02:06 <zid> no glibc though

02:06 <geist> they're not gonna pick it up that fast

02:09 <zid> there's a glibc-9999 though if I wanna use glibc from git :P

02:10 <gog> bigger version number better

02:21 small has joined #osdev

02:22 gorgonical has quit [Ping timeout: 268 seconds]

02:32 Vercas has quit [Quit: Ping timeout (120 seconds)]

02:37 small has quit [Quit: Konversation terminated!]

02:39 dutch has quit [Quit: WeeChat 3.8]

02:51 Vercas has joined #osdev

03:37 fedorafan has quit [Ping timeout: 252 seconds]

03:42 terrorjack has quit [Quit: The Lounge - https://thelounge.chat]

03:44 terrorjack has joined #osdev

03:51 gog has quit [Ping timeout: 252 seconds]

04:26 dude12312414 has joined #osdev

04:29 m5zs7k has quit [Ping timeout: 268 seconds]

04:29 m5zs7k has joined #osdev

04:31 dude12312414 has quit [Client Quit]

04:37 m5zs7k has quit [Ping timeout: 252 seconds]

04:43 m5zs7k has joined #osdev

05:04 linearcannon has quit [Read error: Connection reset by peer]

05:06 bradd has joined #osdev

05:09 Left_Turn has joined #osdev

05:12 Turn_Left has quit [Ping timeout: 260 seconds]

05:13 Turn_Left has joined #osdev

05:14 Left_Turn has quit [Ping timeout: 252 seconds]

05:19 gorgonical has joined #osdev

05:24 fedorafan has joined #osdev

06:09 <sham1> We need glibc-\infty

06:12 bgs has joined #osdev

06:14 dza has quit [Quit: ]

06:23 gorgonical has quit [Quit: #optee]

06:36 small has joined #osdev

06:37 simpl_e has quit [Read error: Connection reset by peer]

06:38 simpl_e has joined #osdev

07:12 gorgonical has joined #osdev

07:12 <gorgonical> god damn I hate working on these proprietary sbcs

07:13 <gorgonical> why won't rockchip tell me what the fucking registers in sgrf do

07:25 bgs has quit [Remote host closed the connection]

07:29 <sham1> Thus is working with proprietary stuff

07:30 <gorgonical> I can't even figure out why op-tee even works in secure mode at this point

07:30 <gorgonical> As in, in theory when you have DDR security regions and you set the SIF bit in SCR_EL3, secure code trying to access non-secure memory regions should create a fault

07:31 <gorgonical> But op-tee doesn't mark the region of ram it seems to be running in until some part of the boot process

07:41 dza has joined #osdev

07:42 bradd has quit [Ping timeout: 268 seconds]

07:47 <geist> ugh

07:48 <gorgonical> indeed

07:52 <geist> also ugh, looks like the device tree that is passed from opensbi into the kernel on riscv doesn't use the DTs reserve memory region mechanism

07:52 <geist> guess that's just not what it does

07:53 <geist> Heat had suggested using fdt_get_mem_rsv() but it returns nothing because the table is empty

07:54 <geist> which i guess makes sense. it's not that descriptive, only address+size, whereas the more complex mechanism linux uses is actual tree nodes with more flexible metadata

07:54 <gorgonical> I think my conclusion is that SCR_EL3.SIF refers to ifetch from pages marked NS=1. I can't find any elaboration at all, so I have to assume this bit doesn't magically propagate to all sorts of bus accesses

07:54 <gorgonical> little kernel?

07:55 <geist> yah

07:55 <geist> already have code to parse the more complex memory reserve stuff, just thought i could remove that and use the simpler mechanism

07:55 <geist> alas. no

07:55 <gorgonical> dts supposed to have a reserved element in some place?

07:56 <gorgonical> e.g. reserved for the bootloader?

07:56 <geist> in this case yeah. the sbi firmware itself carves off a chunk of memory and marks it unavailable

07:56 <geist> not unlike secure memory on ARM

07:56 <geist> and marks it as such in a device tree section so the supervisor mode kernel doesn't touch it

07:57 <gorgonical> that's typically the reserved-memory tree node, right

07:57 <geist> SBI firmware running in machine mode is pretty similar to the PSCI firmware in EL3

07:57 <geist> yeah

07:57 <geist> heat had pointed out there's a simpler reserve memory thing in the DT structure itself that libfdt can walk for you, but alas it doesn't seem to be filled in

07:57 <gorgonical> yeah I have familiarity with risc-v bootloaders and stuff. Not the internals of opensbi though

07:57 <gorgonical> oh really, simpler?

07:58 <geist> it's in the FDT header itself. just a pointer to an array of struct { address, size }

07:58 <geist> where 0, 0 terminates the list

07:58 <geist> seems to be outside of the tree itself
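
The memory reservation block geist describes can be walked by hand. The following is a minimal sketch, assuming the standard flattened device tree header layout (magic 0xd00dfeed at offset 0, off_mem_rsvmap at byte offset 16, all fields big-endian). The names rsv_get and rsv_count are hypothetical helpers written for illustration; real code would more likely call libfdt's fdt_num_mem_rsv() and fdt_get_mem_rsv().

```c
#include <stddef.h>
#include <stdint.h>

#define FDT_MAGIC          0xd00dfeedu
#define FDT_HEADER_MIN     40u  /* size of a version 17 header */
#define FDT_OFF_MEM_RSVMAP 16u  /* byte offset of off_mem_rsvmap in the header */

/* The blob is big-endian regardless of the CPU, so assemble values bytewise. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint64_t load_be64(const uint8_t *p)
{
    return ((uint64_t)load_be32(p) << 32) | load_be32(p + 4);
}

/* Hypothetical helper (not libfdt API): fetch reservation entry n.
 * Returns 0 on success, -1 if n is at or past the {0, 0} terminator
 * or the blob is malformed. */
int rsv_get(const void *blob, size_t blob_size, unsigned n,
            uint64_t *addr, uint64_t *size)
{
    const uint8_t *p = blob;

    if (blob_size < FDT_HEADER_MIN || load_be32(p) != FDT_MAGIC)
        return -1;

    size_t off = load_be32(p + FDT_OFF_MEM_RSVMAP);
    for (unsigned i = 0;; i++, off += 16) {
        if (off > blob_size || blob_size - off < 16)
            return -1;                  /* entry would run off the blob */
        uint64_t a = load_be64(p + off);
        uint64_t s = load_be64(p + off + 8);
        if (a == 0 && s == 0)
            return -1;                  /* terminator: no more entries */
        if (i == n) {
            *addr = a;
            *size = s;
            return 0;
        }
    }
}

/* Hypothetical helper: number of entries before the terminator. */
unsigned rsv_count(const void *blob, size_t blob_size)
{
    unsigned n = 0;
    uint64_t a, s;

    while (rsv_get(blob, blob_size, n, &a, &s) == 0)
        n++;
    return n;
}
```

A blob whose table holds only the terminator makes rsv_count return 0, which matches what geist saw from OpenSBI: the table is present but empty, and the reservations live in tree nodes instead.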

07:58 <gorgonical> oh

07:58 <gorgonical> weird

07:58 bradd has joined #osdev

07:58 <geist> yah didn't know about this either until i traced the libfdt code

07:59 <geist> https://android.googlesource.com/kernel/lk/+/qcom-dima-8x74-fixes/lib/libfdt/fdt.h#11 basically

07:59 <bslsk05> android.googlesource.com: lib/libfdt/fdt.h - kernel/lk - Git at Google

07:59 <geist> points to a list of https://android.googlesource.com/kernel/lk/+/qcom-dima-8x74-fixes/lib/libfdt/fdt.h#25

07:59 <bslsk05> android.googlesource.com: lib/libfdt/fdt.h - kernel/lk - Git at Google

07:59 <gorgonical> oh wow that's interesting

07:59 <geist> yeah huh

07:59 <gorgonical> I can only imagine it's not being filled in because of the ubiquity of reserved-memory

08:00 <gorgonical> Because this way it's not only duplicated but not human readable either

08:00 <geist> yeah my guess is the reserved-memory stuff is just more complex and powerful

08:00 <geist> whether or not linux actually reads both formats and tries to union them i dunno

08:01 <gorgonical> an exciting idea, that someone would feed you a dt with inconsistent headers and content

08:01 <geist> heh 'exciting'

08:02 <gorgonical> I wonder why they put it in there at all

08:02 <gorgonical> isn't the whole point of the device tree that you can define sort of arbitrary nodes with syntax?

08:03 <gorgonical> why force it into the header with a fixed, brittle syntax

08:03 <geist> well, my guess is since it's in the header it goes way back to the beginning of OF

08:03 <geist> probably late 80s

08:03 <geist> so probably just way predates the reserve-memory stuff

08:03 <gorgonical> i didn't realize dt was that old

08:04 <geist> yah it came out of openfirmware from what i understand. is why it's big endian and whatnot

08:04 <geist> first time i bumped into it was on old sun workstations and PPC macs

08:05 <gorgonical> what exactly does flattened mean here

08:05 <gorgonical> does it just mean that all the references and stuff are inlined and collapsed wherever possible?

08:06 small has quit [Ping timeout: 252 seconds]

08:06 <gorgonical> it just occurred to me that I don't really know why it's called a flattened dt

08:06 <geist> i think it just means the in memory format, 'flattened' into a binary run

08:06 <gorgonical> oh yes maybe

08:13 small has joined #osdev

08:18 <moon-child> everyone agrees you should be able to memcpy 0 bytes to a null or invalid pointer. But why shouldn't you be able to read/write 0 bytes from/to an invalid fd?

08:19 <Mutabah> Well... a `NULL` pointer is UB to access, even with a zero-sized memcpy iirc?

08:19 <Mutabah> wait... is it?

08:20 <moon-child> I don't remember if it's UB, but I do think everyone agrees it _shouldn't_ be
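
On the "is it UB?" question: as far as I can tell, the C standards up to and including C23 require valid pointer arguments for the string.h functions even when the count is zero, so memcpy(NULL, NULL, 0) is formally undefined there. A common workaround is a thin wrapper that skips the call for empty copies. The sketch below assumes that reading of the standard; copy_bytes is a hypothetical name, not a library function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: an empty copy never reaches memcpy, so a null or
 * otherwise invalid pointer paired with n == 0 is harmless here. */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    if (n == 0)
        return dst;             /* nothing to copy, pointers are not touched */
    return memcpy(dst, src, n); /* n > 0: both pointers must be valid */
}
```

This encodes moon-child's position (an empty copy should be fine) without depending on what a particular libc or compiler does with the undefined case.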

08:20 <sham1> The only reason I can imagine is that the kernel does something silly like CurrentProcess()->GetFileDescriptors()[fildes]->Read(/*params*/)

08:20 <sakasama> I don't agree. It should just be invalid.

08:21 <Mutabah> I don't see a disparity, because the two are different concepts

08:21 <Mutabah> a memcpy is a copy from a region of memory - an empty region has a size of zero, so the copy is valid

08:22 <Mutabah> but a `fd` is a stream handle - there isn't (usually) a separately known size

08:22 <Mutabah> thus, any attempt to use an invalid FD should fail - because doing so is a bug

08:23 <Mutabah> (or indicates a bug)

08:24 <moon-child> you might argue by the same token that any attempt to use a null or invalid pointer also indicates a bug

08:24 <moon-child> (though a pointer to one past the end of a valid allocation should be fine for a zero-sized memcpy, obviously)

08:24 <Mutabah> for null, I would

08:24 <Mutabah> (as null is a common sentinel for "invalid")

08:25 <Mutabah> But for a properly-aligned nonzero but invalid pointer, it can be considered to be the one-past-end pointer of an empty range

08:25 <moon-child> it can't

08:25 <moon-child> but null might be what you use for the pointer in an actually empty array

08:27 <Mutabah> Worth noting: My rustacean is showing - a null pointer is never a valid value for a safe pointer to have, but `1` is valid (for a pointer to a byte aligned zero-sized type)

08:27 <moon-child> we are in c-land :)

08:28 <Mutabah> We're talking opinions

08:28 <moon-child> I think david chisnall explained to me at some point why a random pointer can't count as a pointer to one past the end of a zero-sized allocation in c, but can't find it now

08:31 <moon-child> lobsters search is really useless

08:32 gorgonical has quit [Remote host closed the connection]

08:37 danilogondolfo has joined #osdev

08:48 slidercrank has joined #osdev

09:17 gog has joined #osdev

09:22 <mrvn> moon-child: you can not write 0 bytes to a pointer. that's UB.

09:25 <mrvn> moon-child: for latest standards a memcpy(nullptr, nullptr, 0); is a type error I would say.

09:27 <mrvn> As for read/write of 0 bytes to an FD I would assume the access check to the FD is done before checking the size. Reading 0 bytes from an FD I think also has meaning. Shouldn't it block till there is data ready to read from a socket or pipe?

09:28 x8dcc has joined #osdev

09:28 gildasio2 has quit [Remote host closed the connection]

09:29 <mrvn> moon-child: Note: pointers can be tagged by the hardware so you can't even form a pointer that isn't pointing at a valid object (AS400 has something like that). So just casting a random int to pointer would throw a cpu exception.

09:29 gog has quit [Quit: byee]

09:30 <mrvn> ARM can tag pointers too, if supported, but I don't know how seriously the cpu takes those.

09:30 gog has joined #osdev

09:31 <klange> Enforcement can be enabled by a flag.

09:32 <klange> But it's kinda complicated, I think it's more common to just set the "ignore top byte" flag and have, eg., free() slap you if you give it an untagged pointer

09:32 <mrvn> klange: but that would be when you use a register as a pointer. Just loading the value into a register does nothing.

09:33 <mrvn> klange: or does lea fault then too?

09:33 gildasio2 has joined #osdev

09:39 <klange> I do not believe anything not actually going through the MMU can trap for tagging, so only a dereference / use as a memory operand would do so.

09:39 <klange> But the ARMARM is hot garbage, so I can't easily check individual instructions.

09:40 <gog> nya

09:40 <mrvn> klange: it would break pretty much every program since compilers use lea a lot to load integers and not just pointers.

09:41 <mrvn> so I'm pretty sure lea won't fault.

09:41 Vercas has quit [Ping timeout: 255 seconds]

09:44 <klange> C23 got rid of a lot of verbiage around trap representations; I think outside of floating point values, the intention is that having what is now called a "non-value representation" exist is not itself something that should trap or be undefined behavior?

09:47 <mrvn> did C23 also get rid of one's-complement integers or is that only c++?

09:49 <klange> Strictly speaking C simply did not specify, but now it's two's-complement only.

09:51 <mrvn> signed overflow being undefined pretty much spelled it out that integers could be one's-complement, two's-complement, sign+magnitude or whatever.

09:52 <mrvn> What annoys me a bit is that c++ still says integer overflow is UB while with two's-complement being required it's now perfectly defined.

09:54 <klange> It's possible that the C++ UB is accounting for some possibility that signed integers have special interpretation and overflow traps while that's not permitted for unsigned? But it is probably just a holdover from C.

09:54 GeDaMo has joined #osdev

09:57 <mrvn> klange: no, they have to be two's-complement now. My guess is that they didn't want to break compiler optimizations. Knowing there is no overflow allows loops to be simplified a bunch.

09:58 Celelibi has quit [Ping timeout: 248 seconds]

09:58 <mrvn> and you don't have to truncate results to int all the time.

10:01 <mrvn> .oO(you are right on the overflow traps though, that remains too)

10:02 <moon-child> '''optimizations'''

10:02 <moon-child> such bullshit

10:02 <moon-child> it literally makes no actual difference

10:03 <moon-child> and everyone else gets shafted with cves for it

10:07 <mrvn> moon-child: and yet you get different (slower) code when you make your loop with an unsigned.

10:08 <Griwes> there's measurable difference between codegen for integers and unsigned integers

10:08 <Griwes> there's multiple talks about this on youtube

10:08 nyah has joined #osdev

10:20 <froggey> stop using c/c++

10:21 <mrvn> that's what the NSA tells us

10:23 <mrvn> cpp2 has some nice changes, maybe I should be using that.

10:24 <ddevault> I'm going to print out the ACPI standard and self-immolate in front of Microsoft headquarters with it held in my lap

10:30 <moon-child> Griwes: no one has shown me a real workload which gets appreciably slower with -fwrapv

10:30 <moon-child> people handwave at asm snippets, which have fuck all to do with anything

10:32 <mrvn> moon-child: "This flag enables some optimizations and disables others."

10:32 <mrvn> moon-child: and who says that has any effect on the optimizations that work because the compiler can assume there is no overflow?

10:33 <moon-child> the whole point of -fwrapv is to make the compiler assume that things can overflow

10:33 <sakasama> ddevault: They'll just toss you in with the others in their catacombs.

10:33 <moon-child> somebody once pointed me at a chandler carruth talk where he says 'such-and-such tight loop in bzip2 is slow because it uses unsigned instead of signed'. So I humoured them; checked out bzip2, compiled with the relevant bits replaced with signed, and the performance was exactly the same

10:34 <moon-child> actually, the loop was super slow, but not because of signed vs unsigned :)

10:34 <mrvn> moon-child: the description sounds more like telling the compiler that integers are two's-complement. Which would be default in the latest standard anyway.

10:34 <sham1> People should stop relying on these pieces of undefined behaviour, damn it

10:35 <gog> but i like tech debt

10:35 <moon-child> mrvn: https://godbolt.org/z/YP1hKjEzr

10:35 <gog> got more of it than student loan debt

10:35 <bslsk05> godbolt.org: Compiler Explorer

10:35 <moon-child> that's literally all it does. Can substitute -fno-strict-overflow if you like

10:35 <sham1> As for 2s complement being default, yeah, it means that there is only one thing that you get out of INT_MAX+1, which is a proper value, but there are certain invariants that still break

10:35 <sham1> For example, x < x+1

10:36 <moon-child> two's-complement is the only _representation_, but _behaviour_ on overflow is still undefined in the latest c (and I assume c++) spec

10:36 <mrvn> moon-child: also note that the signed and unsigned code might very well run at the same speed because the cpus are optimized on a hardware level. You might just increase the utilization of the cpu and not run faster or slower.

10:36 <moon-child> yes, that's (part of) the point

10:37 <moon-child> the other is that the function was super slow for reasons that had nothing to do with signed/unsigned, and if you made it fast, it would be fast regardless of signed/unsigned

10:37 <mrvn> moon-child: that flag does more. And those changes will have ripple effects.

10:37 <sham1> Related to the point of not relying on UB, quit writing non-char words onto unaligned addresses, damnit

10:38 <moon-child> show me the benchmarks

10:38 <mrvn> moon-child: I'm not talking about speed. I'm talking about behavior.

10:40 <moon-child> I don't follow

10:40 <moon-child> -fwrapv defines behaviour which was otherwise undefined

10:40 <mrvn> moon-child: https://godbolt.org/z/1YzvKjhb6

10:40 <bslsk05> godbolt.org: Compiler Explorer

10:40 <mrvn> moon-child: see how you get different code for signed and unsigned

10:40 dutch has joined #osdev

10:41 <moon-child> your point being?

10:41 <mrvn> that was my point

10:41 <moon-child> signed and unsigned integers are different. You also get different code for 'x < y' for signed vs unsigned x and y

10:41 <moon-child> so what?

10:42 <mrvn> They didn't remove that signed overflow is UB because (a) the traps you mentioned and (b) it would change the codegen a lot and break existing code (even if that code is bad)

10:42 <moon-child> what???

10:42 <moon-child> defining behaviour which was previously undefined can't be a breaking change

10:44 <mrvn> moon-child: lol. Do you have any idea how much code in the wild depends on behavior that's strictly speaking undefined?

10:44 <sham1> Too much

10:44 <mrvn> way way way too much

10:44 <moon-child> this is not that sort of thing though

10:44 <sham1> But they shouldn't rely on it. That's the entire point of it being undefined

10:44 <moon-child> I mean, *(double*)&some_int, fine

10:45 <moon-child> but the only behaviour you could possibly be reasonably relying on with signed overflow is wraparound

10:45 <mrvn> moon-child: memset(struct_with_pointer, 0, sizeof(struct_with_pointer));

10:45 <moon-child> esp. as anything else will break when you compile without opts

10:47 <mrvn> anyway, there are cpus out there that trap on signed integer overflow so the rest of the argument is moot.

10:47 <moon-child> they might support a 'signed add' instruction that does that

10:48 <mrvn> unless you want to write the specs as "signed integer will overflow like two's-complement does except where hardware will trap on it"

10:48 <moon-child> but 'unsigned add' is literally the same function, and has to be supported and can't trap

10:48 <moon-child> so you can just use that

10:56 <mrvn> not so easy for mul

10:59 kof123 has quit [Ping timeout: 268 seconds]

10:59 <moon-child> low mul is the same for signed/unsigned

10:59 <moon-child> it's just the high part that's different

10:59 <moon-child> and c doesn't have high multiply. When you do have a widening multiply, though--it can't overflow; you get the full product

11:00 Left_Turn has joined #osdev

11:00 <mrvn> C only has high multiply. and no widening

11:00 <moon-child> no

11:00 <moon-child> c has low multiply

11:00 <mrvn> signed * signed = signed

11:01 <moon-child> the high multiply of (say) two 64-bit integers is mathematically floor(x*y / 2^64)

11:01 <ddevault> I'm going to print out hard copies of the ACPI spec and start showing up at RISC-V events to ritually burn it on the lawn outside of the conference

11:01 <moon-child> mrvn: yes; that's the same for signed/unsigned

11:01 <mrvn> moon-child: then your statement is just wrong. the low multiply is different for signed.

11:02 <mrvn> it multiplies the absolute value of the integer and then puts the sign back.

11:02 Turn_Left has quit [Ping timeout: 252 seconds]

11:03 <moon-child> mrvn: find me an example of a case where, given uint32_t x, y, x*y is different from (uint32_t)((int32_t)x*(int32_t)y)

11:04 <mrvn> moon-child: x = INT_MAX + 1;

11:04 <mrvn> the second code is simply undefined

11:04 elastic_dog has quit [Read error: Connection reset by peer]

11:05 <moon-child> mrvn

11:05 elastic_dog has joined #osdev

11:05 kof123 has joined #osdev

11:06 <moon-child> the _whole point_ of all this is that if we make signed overflow well-defined, we can implement basic signed arithmetic ops as unsigned arithmetic ops to get around trapping on cpus that do that

11:06 <moon-child> division is the one exception, but that only 'overflows' for INT_MIN/-1. And is slow so it's easy enough to check for it
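
moon-child's suggestion can be sketched in portable C: do the arithmetic on unsigned values, where wraparound is defined, convert back, and special-case the single division that overflows. This is an illustration under assumptions, not how any particular compiler implements -fwrapv; to_signed, wrapping_add and wrapping_div are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Map a 32-bit pattern back to int32_t without relying on the
 * implementation-defined behaviour of an out-of-range narrowing cast. */
static int32_t to_signed(uint32_t u)
{
    return u <= (uint32_t)INT32_MAX
               ? (int32_t)u
               : (int32_t)(u - 0x80000000u) + INT32_MIN;
}

/* Signed add with two's-complement wraparound, computed on unsigned values
 * so no signed overflow (and no overflow trap) can occur. */
int32_t wrapping_add(int32_t a, int32_t b)
{
    return to_signed((uint32_t)a + (uint32_t)b);
}

/* Signed divide; INT32_MIN / -1 is the only quotient that does not fit. */
int32_t wrapping_div(int32_t a, int32_t b)
{
    assert(b != 0);         /* division by zero stays the caller's bug */
    if (a == INT32_MIN && b == -1)
        return INT32_MIN;   /* the wrapped two's-complement result */
    return a / b;
}
```

The explicit check in wrapping_div is cheap relative to the division itself, which is the point moon-child makes about it being easy enough to test for.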

11:07 <mrvn> moon-child: -1 * -1 = -1 (0xFFFFFFFF), 0xFFFFFFFF * 0xFFFFFFFF = 0xfffffffe00000001 = 0x00000001 != 0xFFFFFFFF

11:08 <mrvn> moon-child: a signed mul is not just an unsigned mul and ignoring overflows.

11:08 <moon-child> where are you getting a result of 0xFFFFFFFF from?

11:09 <moon-child> https://0x0.st/oFdJ.txt

11:10 <mrvn> C-style arbitrary precision calculator (version 2.12.7.2)

11:10 <moon-child> oops I forgot to compile with -fwrapv. Same result though

11:10 <moon-child> mrvn: explain my result, then

11:12 <mrvn> ups, my error, -1 * -1 = 1 obviously.

11:13 <moon-child> :)

11:14 <moon-child> you'll find you get the same result in all cases

11:14 fedorafan has quit [Ping timeout: 248 seconds]

11:14 <mrvn> moon-child: uint32_t * int32_t?

11:15 <moon-child> I would expect the uint gets promoted to an int before doing the multiply

11:15 <mrvn> nope, can't promote to a "smaller" type

11:16 <moon-child> idk, I never had the promotion rules straight

11:16 <mrvn> they are a mess

11:16 <moon-child> but my recollection is it has to promote them to some common type and then do the multiplication there

11:17 <moon-child> indeed

11:17 <mrvn> can't find a case where the bits after mul are different but the flags certainly are.

11:17 <moon-child> good thing c doesn't have flags

11:18 <sham1> moon-child: slight correction, multiplication would be X*Y mod 2^64

11:18 <mrvn> sham1: obviously, or 2^32 in this case of uint32_t

11:18 <sakasama> From the LLVM docs for 'mul': "Because LLVM integers use a two’s complement representation, and the result is the same width as the operands, this instruction returns the correct result for both signed and unsigned integers. If a full product (e.g. i32 * i32 -> i64) is needed, the operands should be sign-extended or zero-extended as appropriate to the width of the full product."
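
The claim in the LLVM text is easy to check in C. A small sketch follows: the products are computed in 64 bits so that no signed overflow occurs, then split into low and high halves. The function names are hypothetical and chosen for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low 32 bits of the unsigned product. */
uint32_t mul_low_unsigned(uint32_t x, uint32_t y)
{
    return (uint32_t)((uint64_t)x * y);
}

/* Low 32 bits of the signed product; int64_t holds any int32 * int32. */
uint32_t mul_low_signed(int32_t x, int32_t y)
{
    return (uint32_t)((int64_t)x * y);
}

/* High 32 bits of the unsigned product. */
uint32_t mul_high_unsigned(uint32_t x, uint32_t y)
{
    return (uint32_t)(((uint64_t)x * y) >> 32);
}

/* High 32 bits of the signed product, read from its two's-complement
 * bit pattern to avoid right-shifting a negative value. */
uint32_t mul_high_signed(int32_t x, int32_t y)
{
    return (uint32_t)((uint64_t)((int64_t)x * y) >> 32);
}

/* True when the same operand bits give the same low half either way. */
bool same_low_product(int32_t x, int32_t y)
{
    return mul_low_signed(x, y) == mul_low_unsigned((uint32_t)x, (uint32_t)y);
}
```

For the example from the discussion, -1 * -1 and 0xFFFFFFFF * 0xFFFFFFFF both have a low half of 1, while the high halves are 0 and 0xFFFFFFFE respectively. Only the high half depends on signedness.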

11:18 <moon-child> sham1: the _high_ multiply

11:18 <sham1> There's no divide semantic there. Instead you're on the modular ring just doing modular ring things

11:18 <moon-child> low multiply is mod, yes

11:18 <sham1> But there's no high multiply in C tho

11:19 <sham1> At least not standard

11:19 <sham1> The low multiply is the only multiply

11:19 <mrvn> moon-child: the flags become relevant when you have if (x * y < 0)

11:19 <moon-child> that's what I _said_

11:19 <moon-child> ffs

11:20 <sham1> Hm, must be not enough caffeine

11:20 <mrvn> But what that means on an overflow .... hard to say

11:20 <moon-child> mrvn: nope

11:20 <moon-child> x*y < 0 will be true if the high bit of the _low_ product is 1

11:20 <moon-child> but that might not be the same as the sign bit of the true product

11:21 <mrvn> moon-child: no. it will be true when the negative flag is set. Which makes no sense for unsigned mul.

11:21 fedorafan has joined #osdev

11:21 <mrvn> moon-child: with mulu instead of muls the compiler has to add code to test the sign bit

11:21 <moon-child> sure, if you have one

11:22 <sakasama> The sign matters for div/rem, but not the other basic arithmetic operations.

11:22 <moon-child> ANYWAY

11:22 <moon-child> the point is that there's no reason not to make signed overflow well defined in c

11:22 <mrvn> moon-child: and now you have optimization problems. if the compiler knows x and y are positive it still has to test because suddenly the code can overflow into negative numbers and behave differently.

11:23 <moon-child> show me the benchmarks

11:23 <mrvn> this has nothing to do with speed

11:23 <moon-child> what optimisation problems are there, then?

11:23 <mrvn> the one I described

11:24 <moon-child> please describe concretely the problem, because it's not clear to me

11:24 <mrvn> moon-child: bool f(unsigned x, unsigned y) { int sx = x; int sy = y; return sx * sy < 0; }

11:25 <mrvn> make that uint16_t x to avoid the UB on cast

11:26 <moon-child> and?

11:27 <mrvn> That function is guaranteed to return false for all valid inputs and your change makes it so it has to actually compute the value and may return true

11:28 <mrvn> It can no longer be optimized out.

11:28 <moon-child> it is defined for more inputs. That is not a compatibility break

11:28 <moon-child> the behaviour on the inputs for which it was previously defined is exactly the same

11:28 Celelibi has joined #osdev

11:29 <mrvn> moon-child: but you change the number of inputs that are defined.

11:29 <mrvn> and yes, there is code out there that relies on that UB.

11:30 <moon-child> show me this code

11:30 xenos1984 has quit [Read error: Connection reset by peer]

11:30 <moon-child> also: that is the sort of behaviour which exhibits non-locality; which varies according to compiler switches; which varies according to compiler and compiler version

11:30 <moon-child> it is not the purview of the language standard to maintain compatibility with such code

11:31 <mrvn> no argument there

11:31 <moon-child> c23 had explicit compatibility breaks (and it was fucking stupid, but)

11:31 <mrvn> unfortunately such code is way way way too common

11:31 <moon-child> so what is your argument?

11:32 <mrvn> same as all this time. they didn't want to break stuff too much (and they couldn't anyway due to the cpus that trap on overflow and code that relies on those traps)

11:48 xenos1984 has joined #osdev

12:03 <mrvn> moon-child: apropos integer promotion being insane: uint16_t f(uint16_t x, uint16_t y) { return x * y; } is UB for half of its inputs.
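
The trap mrvn points at comes from integer promotion: with a 32-bit int, both uint16_t operands are promoted to signed int before the multiply, and a product such as 0xFFFF * 0xFFFF exceeds INT_MAX. A minimal sketch of the usual fix is to force the arithmetic into unsigned first; mul16_wrapping is a hypothetical name.

```c
#include <stdint.h>

/* Multiply two uint16_t values with wraparound modulo 2^16.
 * Casting to unsigned keeps the multiply out of signed int, so the
 * intermediate product wraps instead of overflowing. */
uint16_t mul16_wrapping(uint16_t x, uint16_t y)
{
    return (uint16_t)((unsigned)x * (unsigned)y);
}
```

The result is the same whether unsigned is 16 or 32 bits wide, because the final conversion keeps only the low 16 bits either way.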

12:04 SGautam has joined #osdev

12:08 xenos1984 has quit [Ping timeout: 248 seconds]

12:10 xenos1984 has joined #osdev

12:24 <SGautam> Why does the magnitude of the fourier transform of e^-x look like a normal distribution?

12:24 <SGautam> Ooops wrong channel

12:26 <mrvn> SGautam: you mean half a normal distribution?

12:26 <mrvn> \_

12:36 elastic_dog has quit [Ping timeout: 252 seconds]

12:37 elastic_dog has joined #osdev

12:55 x8dcc has quit [Quit: leaving]

12:56 foudfou has quit [Quit: Bye]

12:56 foudfou has joined #osdev

13:05 [itchyjunk] has joined #osdev

13:11 joe9 has quit [Quit: leaving]

13:12 joe9 has joined #osdev

13:33 gildasio2 has quit [Remote host closed the connection]

13:43 papaya has joined #osdev

14:07 heat has joined #osdev

14:10 <heat> geist, aw the riscv mem rsv thing you added is different?

14:10 <heat> sucks

14:11 small has quit [Ping timeout: 252 seconds]

14:12 gildasio has joined #osdev

14:12 dutch has quit [Ping timeout: 246 seconds]

14:24 <Jari--_> BIOS is stuck on this machine

14:25 <Jari--_> Windows 11 works

14:25 <Jari--_> cant install Ubuntu for development

14:25 <GeDaMo> Something to do with SecureBoot?

14:26 <heat> thank your firmware

14:26 <heat> whatever keeps you as far away from linux as possible

14:26 <heat> i wish my firmware was broken as well smh

14:29 <kof123> "I'm going to print out the ACPI standard and self-immolate in front of Microsoft headquarters with it held in my lap" "They'll just toss you in with the others in their catacombs." is there a bios/uefi pile?

14:33 <zid> my garlic naans are too hot to eat with my fingers :9

14:34 <GeDaMo> What about your toes? :|

14:34 <zid> osdev me a new garlic naan firmware heat

14:34 <sham1> Don't you mean NaNs

14:35 * sham1 plays laugh track

14:35 * gog claps

14:35 <zid> is a naan an snan or nan?

14:35 <gog> i disable secure boot because what's the point

14:36 <gog> if somebody gains access to my machine and installs an unauthorized boot thing good on them

14:36 <zid> yea that's what I told heat

14:36 <zid> blew his mind

14:36 <sham1> gog: nooo, but now someone could tamper with your bootloader in an Evil Maid attack, noook

14:36 <zid> I directed him to the xkcd comic about printer drivers vs my email account

14:36 <heat> zid, i still don't agree

14:37 <gog> oh noooo

14:37 <zid> https://xkcd.com/1200/

14:37 <bslsk05> xkcd - Authorization

14:38 <sham1> Also, can we just acknowledge that "evil maid attack" is both an unfortunate name for a security threat and also a great band name

14:38 <zid> evil made attack is my new light novel series

14:38 <gog> pretty sure the cleaning lady here knows we're not worth tampering with

14:38 <sham1> The name's not long enough

14:38 <gog> she sees the shit we get up to and knows we're a half-assed outfit

14:39 <gog> we have nothing

14:39 <zid> we're even half assed?

14:39 <gog> i mean

14:39 <gog> at best

14:39 <kof123> well, evil maid will plant a backdoor rather than just pawn your stuff, yeah......

14:39 <zid> It feels at *best* like a treehouse with a "no noobs" sign outside

14:39 <kof123> this too is like the xkcd $5 wrench password extractor

14:39 <gog> yeah it'd be more profitable to steal and fence the shit we have in unlocked cabinets and don't even use

14:40 <zid> sham1: The story of the evil maids who want to wipe out humanity - Attack on dust

14:40 <sham1> Nice

14:40 <heat> if you replace the n in "no noobs" with b you get "bo boobs"

14:41 <heat> which is a hell of a lot funnier

14:41 <gog> booba

14:41 <heat> bobs

14:41 <zid> heat has boobs though

14:41 <heat> mammal moment

14:41 <zid> cringe r.r. heat

14:50 Burgundy has joined #osdev

14:55 slidercrank has quit [Ping timeout: 268 seconds]

15:21 dutch has joined #osdev

15:36 janemba has quit [Ping timeout: 252 seconds]

15:56 <Jari--_> zid: heat a girl?

15:57 <heat> no

15:57 janemba has joined #osdev

15:57 <heat> girls are not allowed in the osdev treehouse

15:57 <heat> because they are weird and eat boogers

16:02 <zid> heat has boobs cus he's addicted to watching sports

16:08 <zid> geist: time to dig out your A750, big driver update apparently

16:15 pg12 has quit [Ping timeout: 255 seconds]

16:17 pg12 has joined #osdev

16:17 xenos1984 has quit [Read error: Connection reset by peer]

16:20 <marshmallow> how would you proceed if you were to perf a C++ program? like, finding out if I'm using the correct data structures, if there is any kind of bottleneck..

16:21 <marshmallow> could time(1) help here?

16:21 <heat> use perf

16:21 <heat> :))

16:22 <SGautam> marshmallow: valgrind, although that's mostly for profiling purposes and finding leaks. I don't think we'll have optimizeGPT for a long time judging from ChatGPT performance on coding.

16:23 <marshmallow> OK perf is a powerful tool, and no doubt it would help here. but say we'd like to give time(1) a try. how could I make sure I'm not taking advantage of cache hits during the subsequent run of the program?

16:24 <heat> you don't

16:25 <heat> ... because time is not a profiling tool

16:25 <heat> time can give you a HINT (run your prog with large input, 10s, change your prog to be fast, run, 5s)

16:32 <marshmallow> now I actually wonder how perf can actually get something useful out of billions of instructions executed per second

16:34 <zid> driver overhead seems cut down on the intel driver at least, +77% fps in cs:go et

16:35 <heat> marshmallow, because it samples your code with a timer and can effectively use CPU performance monitoring counters

16:36 <marshmallow> heat, but if it samples then it might likely miss some instructions or some functions that could be more or less interesting, no?

16:37 xenos1984 has joined #osdev

16:37 <heat> yes, which is why you sample often and run it for a good bit of time

16:37 <zid> heat disregard oprofile, acqurie ds

16:38 <heat> zid zid

16:38 <zid> heat hat

16:38 <zid> het heet heat haat hat

16:38 <heat> u good ad palying video game dark soul or not yet?

16:39 <zid> I am okay

16:44 <marshmallow> heat: so perf itself runs the program multiple times or you need to run it under perf more times?

16:44 <heat> you run it once and make it do something for a good bit of time. the more time it takes, the more accurate data is
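
A rough illustration of why a longer run gives a sampling profiler better data. This is a toy Monte Carlo model, not how perf is implemented: the function names and their time shares are made up, and each "sample" just picks a function with probability equal to its assumed share of run time.

```python
import random

def sample_profile(true_shares, num_samples, seed=0):
    """Estimate per-function time shares from random 'timer tick' samples."""
    rng = random.Random(seed)
    names = list(true_shares)
    weights = [true_shares[n] for n in names]
    hits = dict.fromkeys(names, 0)
    # Each sample lands in a function with probability equal to its true share.
    for name in rng.choices(names, weights=weights, k=num_samples):
        hits[name] += 1
    return {n: hits[n] / num_samples for n in names}

# Hypothetical program: names and shares are assumptions for the demo.
TRUE_SHARES = {"parse": 0.70, "hash_lookup": 0.25, "log": 0.05}

for n in (100, 10_000, 100_000):
    est = sample_profile(TRUE_SHARES, n)
    worst = max(abs(est[k] - TRUE_SHARES[k]) for k in TRUE_SHARES)
    print(f"{n:>7} samples: worst error {worst:.4f}")
```

In this model the error shrinks roughly with the square root of the sample count, which is the statistical reason behind "sample often and run it for a good bit of time".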

16:45 <heat> with intel-pt you get the accuratest of data since it does not sample I believe (or it samples with a stupidly high freq timer, not sure)

16:45 bgs has joined #osdev

16:58 Burgundy has quit [Remote host closed the connection]

16:59 gareppa has joined #osdev

17:00 <marshmallow> heat: could in principle a simple x86-64 rdtsc be used here?

17:00 <heat> rdtsc of what?

17:01 gareppa has quit [Remote host closed the connection]

17:02 <marshmallow> sorry, rdtsc at function prologue and epilogue to measure the execution time of a function

17:03 terminalpusher has joined #osdev

17:05 <heat> yes? but that will affect the results
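
A small sketch of the "it will affect the results" point. It uses Python's `time.perf_counter_ns` as a stand-in for rdtsc (an assumption for portability, not the same mechanism) and wraps a deliberately tiny function with entry/exit timestamps, so the cost of the measurement can be compared against the uninstrumented run. The names `timed`, `tiny`, and `run` are made up for the demo.

```python
import time
from functools import wraps

def timed(stats):
    """Wrap a function with entry/exit timestamps, like a prologue/epilogue probe."""
    def decorate(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter_ns()
            try:
                return fn(*args, **kwargs)
            finally:
                entry = stats.setdefault(fn.__name__, [0, 0])
                entry[0] += 1                                  # call count
                entry[1] += time.perf_counter_ns() - start     # accumulated ns
        return wrapper
    return decorate

stats = {}

def tiny(x):
    return x + 1

timed_tiny = timed(stats)(tiny)

def run(fn, iterations=200_000):
    """Time a whole loop from the outside, the way time(1) sees a program."""
    start = time.perf_counter_ns()
    for i in range(iterations):
        fn(i)
    return time.perf_counter_ns() - start

plain_ns = run(tiny)
instrumented_ns = run(timed_tiny)
print(f"plain: {plain_ns} ns, instrumented: {instrumented_ns} ns")
print(f"calls recorded: {stats['tiny'][0]}")
```

For a function this small the probe typically costs more than the work being measured, which is why per-function timestamps distort exactly the short, hot functions one usually cares about.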

17:05 <heat> why are you trying to half-ass profiling?

17:06 <heat> you asked "how would you proceed?" and I said "use proper tooling (perf)" and now you want to stick rdtsc in functions or use time(1)

17:08 <gog> proper tooling?

17:08 <gog> breh this is osdev

17:14 gog has quit [Quit: Konversation terminated!]

17:20 <marshmallow> heat: yeah sorry, was just freaking out :P

17:23 dude12312414 has joined #osdev

17:26 xenos1984 has quit [Ping timeout: 246 seconds]

17:26 xenos1984 has joined #osdev

17:32 hmmmm has quit [Remote host closed the connection]

17:33 hmmmm has joined #osdev

17:57 smach has joined #osdev

18:00 smach has quit [Client Quit]

18:08 gorgonical has joined #osdev

18:12 gog has joined #osdev

18:22 xenos1984 has quit [Ping timeout: 252 seconds]

18:38 xenos1984 has joined #osdev

18:53 heat has quit [Ping timeout: 252 seconds]

19:03 slidercrank has joined #osdev

19:23 terminalpusher has quit [Remote host closed the connection]

19:23 terminalpusher has joined #osdev

20:03 demindiro has joined #osdev

20:20 terminalpusher has quit [Remote host closed the connection]

20:21 terminalpusher has joined #osdev

20:25 GeDaMo has quit [Quit: That's it, you people have stood in my way long enough! I'm going to clown college!]

20:28 <gorgonical> what's everyone doing this first day of february

20:34 <gog> trying to figure out why drawing bitmap fail

20:34 <mjg> first day of slacking this month

20:34 <gog> the short answer is idk yet

20:34 <mjg> got a 31 day streak last month

20:34 <geist> more workin

20:37 <gorgonical> I'm recovering from the shellshock of finding out that yet again arm64 manufacturers hate me

20:39 <gorgonical> any arm64 experts mind cluing me in on exactly what SCR_EL3.SIF does? The description is "forbids ifetch from non-secure memory when in secure state" but that's a little vague

20:40 <geist> which part? the ifetch part?

20:40 <geist> probably means 'no instructions can be run from non secure memory' which seems like another safety/security bit

20:40 <gorgonical> mostly what is meant by "non-secure memory"

20:40 <geist> lots of those going in

20:40 <gorgonical> As far as I can tell it must refer to the NS bit in the page tables

20:41 <geist> almost certainly

20:41 <gorgonical> There's so many confusing interlocking hardware components of trustzone that it's not very easy to tell what bits affect what

20:41 <gorgonical> the secure/non-secure state of the PE is taken into account for DRAM bus accesses but seemingly not this SIF bit

20:41 * geist nods

20:42 <gorgonical> and based on my understanding it has to be that way because op-tee for example doesn't even configure its memory as secure so SIF can't be doing that

20:42 <gorgonical> anyway, I digress

20:45 bgs has quit [Remote host closed the connection]

20:49 <demindiro> I'm trying to update my Rust cross compiler to the latest version

20:49 <demindiro> Took me a while to figure out why it supposedly couldn't find compiler_builtins

20:49 <demindiro> And also always fun when a bunch of APIs get changed

20:50 <geist> eep, what version to what? was it a major jump?

20:51 <demindiro> idk what the last version I used was, but there are ~20000 new commits apparently

20:51 <geist> ah from source. got it

20:53 <geist> i stopped building from source once i discovered rustup

20:53 <demindiro> You kinda need to build from source if you want to port stdlib

20:53 <geist> good to know yeah

20:54 heat has joined #osdev

20:54 <geist> and then theres heat

20:54 <heat> and then theres geist

20:54 * geist shakes head, sadly

20:54 <geist> heat heat heat when are you ever gonna learn

20:55 <heat> learn what

20:56 <geist> exactly.

20:56 <heat> damn

20:56 <geist> here's my masochistic mission for the day: build qemu on ubuntu-riscv64 on top of qemu

20:56 <heat> build qemu under tcg?

20:57 <heat> or qemu under kvm?

20:57 <geist> though really it's not *too* bad. riscv on qemu tcg on a ryzen 3950x seems to be approx 500mhz ARM class

20:57 <geist> so it's pretty good all things considered

20:57 <heat> i mean, fuck

20:57 <geist> haha under kvm. that's funny heat

20:57 <heat> that is not that good

20:57 <geist> yah but when you think about it it's about right: about a 5-8x slowdown because TCG

20:58 <heat> my rpi zero 2 w struggles to compile e.g gn, which isn't that large

20:58 <heat> iirc it even OOMs (and it has 512MB of ram)

20:58 <mjg> what's tcg?

20:58 <geist> at least in the case of the qemu instance i can add tons of cores and lots of ram so that's not an issue

20:58 <heat> qemu software thing

20:58 <geist> 'tiny code generator' i think is the full name

20:59 <heat> geist, is a 128T threadripper the fastest riscv CPU? :v

20:59 <heat> oh yeah Intel cancelled their riscv project!

20:59 <geist> well interestingly, on a high end x86 core on qemu tcg it's pretty close if not exceeds the speed of a sifive unleashed board

20:59 <geist> which also benchmarked around an 800mhz arm or so, according to my informal benchmarks

21:00 <geist> and in qemu you can get much better io throughput

21:00 <demindiro> I imagine it's a fair bit less power efficient though

21:00 <geist> indeed

21:00 <geist> also my guess is there are things that benchmark terribly, probably lots of context switches where TCG would conceivably throw away jitted stuff and start over

21:01 <geist> i honestly dunno how effective TCG jit caches are

21:01 <heat> ugh

21:01 <heat> riscv sucks

21:02 <geist> it's more of a 'reset of expectations'

21:02 <geist> there was a time when this was a fast and efficient machine, it was just some time ago

21:02 <geist> it's like modern retrocomputing

21:03 <gorgonical> heat: no u

21:04 <heat> i still dont understand why the damn vendors dont just get something fast

21:04 <geist> they are. it's just taking a while to catch up. basically think of riscv socs as reset back about 10 years in ARM world

21:04 <geist> but the slope of performance is fairly high, so it's catching up to the status quo

21:04 <geist> just needs another 5 years or so

21:04 <mrvn> can't make it fast, stupid patents.

21:05 <geist> nah it's just money and the bootstrapping effect. one cannot simply bust out a high end core from scratch, you have to go through some iterations, build a team, etc

21:05 <geist> and find a market, so takes some number of years to bootstrap it

21:05 <heat> but in theory couldn't an arm vendor just take out the ARM frontend for the RISCV one and call it a day?

21:05 <geist> only vendor that could do that is something like apple

21:05 <geist> maybe they will

21:06 <heat> why?

21:06 <geist> why what?

21:06 <heat> why only apple?

21:06 <geist> because they build their own designs

21:06 <geist> if you are licensing arm cores, like most other vendors, they can't modify it

21:06 <heat> really?

21:06 <heat> so what does i.e qualcomm do?

21:07 <geist> that's a good question. perhaps they'll make the move on RV first

21:07 <geist> or something like samsung

21:07 <geist> would need to be a big company that otherwise has expertise to build high end cores decide to switch

21:07 <heat> seriously now im worried wtf do SoC vendors do

21:07 <geist> but AFAIK both of those companies got rid of their cpu design team some years ago because they're too short sighted

21:07 <geist> and went back to just licensing arm cores

21:07 <mrvn> heat: add block around the core?

21:07 <geist> what do they do? keep licensing arm cores

21:07 <mrvn> blocks

21:08 <heat> so what makes qualcomm faster than exynos or vice versa?

21:08 <heat> if its the same design and similar fabs

21:08 <geist> they're not necessarily. depends. they have a good GPU team

21:08 <geist> well there's all that *not* core stuff that also matters

21:08 <geist> memory controllers, gpu, peripherals, etc

21:08 <geist> how aggressive they push the clocks, etc

21:09 <heat> hmm

21:09 <geist> there's infinity options there that dont just involve the cpu core

21:09 <heat> so what you're saying is that ARM sets the low bar for performance and the "CPU Companies" just take it?

21:10 <geist> but to rip out the ARM front end and replace it with a RV front end, you'd have to have your own in house cpu core to do it, and both of those companies kinda got out of that biz a few years back

21:10 <geist> uh. what?

21:10 <geist> 'set the low bar for performance?' what are you talking about

21:10 <zid> building rust takes me about 4-7 days

21:10 <zid> pls stop porting things to rust

21:11 <heat> geist, well they give you a design, they could try and do better, but they don't. so the only things they can change are memory controller stuff and peripherals (which I assume have no effect on raw CPU perf)

21:11 <geist> sure, it's just money

21:11 <geist> arm has multiple levels of licensing. you can license a core, in which case you can't really modify it outside of the params you pay for it

21:11 <heat> so it sounds like for many years it was just one company doing all the effort and then "vendors" did not really compete? until apple ofc

21:12 <geist> you pay $X per unit for that

21:12 <geist> you can become an architectural licensee, in which case you pay some N * X$ but then you're free to build your own cores provided they pass the arm architecture tests

21:12 <geist> and now you need a team of folks to build it

21:12 <geist> that's what apple, ampere, etc do

21:13 <geist> in that case you dont pay per unit sold, so you can arguably break even after you ship N units
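
The break-even arithmetic geist describes can be written out as a small sketch. All figures here are invented for illustration (actual ARM licensing terms are not stated in the log), and the optional `team_cost` parameter is an assumption standing in for the "team of folks" an architectural licensee has to pay for.

```python
def royalty_total(units, royalty_per_unit):
    """Total paid under a per-unit core license."""
    return units * royalty_per_unit

def break_even_units(flat_fee, royalty_per_unit, team_cost=0):
    """Smallest unit count at which flat fee + team cost <= per-unit royalties."""
    total_fixed = flat_fee + team_cost
    return -(-total_fixed // royalty_per_unit)  # ceiling division on integers

# Hypothetical numbers, in cents: X = 50 per unit, flat fee = N * X with N = 10 million.
X = 50
N = 10_000_000
FLAT_FEE = N * X

print(break_even_units(FLAT_FEE, X))                          # fee alone
print(break_even_units(FLAT_FEE, X, team_cost=2 * FLAT_FEE))  # fee plus a design team
```

With the flat fee expressed as N * X, the fee alone breaks even at exactly N units; adding the cost of a design team pushes the break-even point out proportionally.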

21:13 <geist> and you can do what you want, provided it still is an ARM core

21:13 <geist> er i mean is an architecturally compatible ARM core i mean

21:13 <geist> not ARM the company owning it

21:13 <heat> yes

21:13 <geist> in the past both qualcomm (snapdragon, etc) and samsung (mongoose, etc) made their own cores

21:14 <heat> why tf does samsung try with exynos then?

21:14 <geist> but both of them pivoted back to using ARM provided cores the last 5 years or so, presumably because they couldn't justify paying a high end team of cpu designers, and ARM is/was building good enough cores

21:14 <geist> try what?

21:14 <heat> to build a SoC

21:14 <geist> they do, exynos is a soc

21:14 <heat> I know

21:15 <geist> gotta be more specific, what do you think samsung should do?

21:15 <heat> but they use (or used) qualcomm at least in NA while doing exynos elsewhere

21:15 <geist> they as in used where?

21:15 <heat> phones

21:15 <heat> (SoC)

21:15 <geist> oh that's a completely different part of the company

21:15 <geist> samsung competes with itself.

21:16 <geist> samsung semiconductor and samsung phones and samsung washing machines are all completely different business units

21:16 <geist> what do they call that in korean, one of those megaconglomerates

21:16 <geist> dont confuse different parts of samsung with another

21:17 <geist> Chaebol

21:17 <mrvn> Did they split it up so it isn't a monopoly?

21:17 <geist> 재벌

21:18 <mrvn> c7ac bc8c?

21:18 <heat> even then, they use qualcomm for NA and exynos for rest-of-world. what's the point?

21:18 * geist shrugs

21:18 <heat> I thought they actually did do CPU design stuff and they were trying to, you know, build a nice CPU core

21:18 <zid> Yea korea is weeeird, japan mostly broke up the zaibatsus

21:18 <geist> they did, and then they decided to stop doing it

21:18 <geist> i was dissapoint to, to be honest

21:18 <zid> but korea is still "yea your house, shoes, air and water are all owned by samsung, deal with it"

21:18 <mrvn> heat: don't try, do.

21:18 <geist> but probably like qualcomm they decided it was too expensive to pay these cpu designers, so they stopped and laid them off

21:19 <geist> and so they dont have the expertise

21:19 <gorgonical> zid: they did, but there's still echoes of that system in japan. banks give companies like sony 0.1% loans or whatever

21:19 <zid> well yea

21:19 <gorgonical> it is certainly not as aggressively integrated as the zaibatsu were though, to be sure

21:19 <geist> yah as far as i know the samsung and hyundai and whatnot in SK still pretty much run the show

21:20 <heat> praise big corp

21:20 <zid> korea is living the amazon dream

21:20 <gorgonical> please deliver us to our cyberpunk dystopian dream

21:20 <geist> oh LG too

21:20 <zid> of having everything owned by amazon and having to drink your verification can

21:20 <heat> when do we start having sponsored IRC messages? - this message was sponsored by raycon

21:20 <mrvn> gorgonical: do you really want to have an oxygen bill every month?

21:20 <zid> Click this message to recover trace amounts of paperclips.

21:21 <gorgonical> if the invisible hand of the free market will improve the quality of oxygen available

21:21 <gorgonical> then yes

21:21 <gorgonical> no price is too high

21:21 <geist> this is getting pretty Philip K. Dickian

21:21 <mrvn> gorgonical: now with 5% more O in the O2 atoms.

21:22 <heat> btw https://en.wikipedia.org/wiki/Sheriff_(company)

21:22 <heat> "Sheriff has grown to include nearly all forms of profitable private business in the unrecognised country, and has even become significantly involved in local politics and sport,[2] with some commentators saying that company loyalists hold most main government positions in the territory"

21:22 <heat> Sheriff owns a chain of petrol stations, a chain of supermarkets, a TV channel, a publishing house, a construction company, a Mercedes-Benz dealer, an advertising agency, a spirits factory, two bread factories, a mobile phone network, the football club FC Sheriff Tiraspol and its home ground Sheriff Stadium, a project which also included a five-star hotel.[5]

21:23 <heat> samsung is a wannabe compared to Sheriff

21:23 <gorgonical> They must be exceptionally influential. Only 3% of transnistrians work for them

21:23 <gorgonical> Or their influence is hidden in subsidiaries etc.

21:24 * geist wanders off to do some work

21:24 <demindiro> Isn't that the same region that has a bunch of old weapons/munitions?

21:25 <demindiro> Where Sheriff is

21:25 <heat> it's a separatist region of moldova yes

21:25 <heat> protected by a small old-soviet-now-russian regiment

21:25 <geist> huh TIL: The modern word “Sheriff”, which means keeper or chief of the County, is derived from the Anglo-Saxon words “Shire-Reeve”.

21:25 <heat> based on an old weapons silo of some sorts, probably what you're referring to

21:26 <netbsduser`> i always thought it came from sharif

21:26 <geist> yah i was surprised to find it's a proper english word

21:26 <geist> and not a loaner

21:26 <zid> are we counting french as loaners here

21:27 <geist> yah

21:27 <zid> wow, we're nearly at a thousand years and geist still not forgiven 1066

21:27 <zid> (worst day of my life)

21:28 <netbsduser`> take up the good old cause, throw off the norman yoke

21:31 gildasio has quit [Ping timeout: 255 seconds]

21:31 gildasio has joined #osdev

21:32 <heat> remember yesterday where I wrote a patch for a rando to save a feature?

21:32 <heat> https://lore.kernel.org/linux-acpi/CAJZ5v0iXcRFamA+mE837=zHReBT-+8WmMeRDR7L9R+FVpLr25A@mail.gmail.com/T/#t I failed

21:32 <bslsk05> lore.kernel.org: [PATCH] ACPI: Make custom_method use per-open state

21:33 <zid> can we remove all of acpi next

21:33 <heat> noooooOOOOooooooOOOooooooooo

21:33 <heat> acpi very god i promise

21:36 <netbsduser`> i have some nice words to say about NXP: no idea who they are, they seem to own the 68k series now, but they've uploaded fully texted PDFs of the manuals to all the 68k series cpus

21:37 <netbsduser`> just been following the 68040 manual to start work on a 68k port of my kernel

21:44 <mjg> :)

21:44 <mjg> how is netbsd doing on the cpu?

21:44 <mjg> i think it was supported up to some point(?)

21:44 <heat> netbsduser` doesn't use netbsd

21:44 <heat> no one uses it, it's a myth

21:44 <mjg> makes sense

21:45 <gorgonical> but didn't they do that thing where they put lua into the kernel

21:45 <gorgonical> or was that someone else

21:45 <heat> people using netbsd is like freebsd devs using freebsd

21:45 <mjg> it was them but i don't think it got anywhere

21:45 <mjg> heat: OH i'm using it!

21:45 <kof123> yes, and nowadays there are county sheriffs. or city has the same redundancy. reeve of the shire shire

21:45 <kof123> re: sheriff

21:45 <gorgonical> that's a shame. what's not to love about the lua stuff?

21:46 <kof123> *shire reeve of the shire

21:46 <heat> oh oh i'll start

21:46 <heat> 1) lua

21:46 <heat> 2) lua in the kernel

21:46 <gorgonical> i have yet to hear about something you like, heat

21:46 <heat> you

21:46 <gorgonical> aww

21:46 <heat> <3

21:47 <heat> that is also wrong btw, I like linux and glibc

21:47 <netbsduser`> mjg: still runs well

21:47 <netbsduser`> i have an sd card with it installed on my actual amiga

21:47 <mjg> is that a current version?

21:47 <heat> real talk i also enjoy freebsd and netbsd

21:48 <mjg> i know of a guy who is running a netbsd 5 fork on sparcstation

21:48 <mjg> and does not update because things got too slow after that

21:48 <netbsduser`> it's an old -current, i think about 2 years old

21:48 <mjg> what's the amiga and do you have an accelerator card or some other fpga to fake it?

21:48 <heat> mjg, things that run on SPARC tend to get slow, you should know that

21:49 <netbsduser`> i'll upgrade to the latest -current at some point (there's wsfb with xorg support now)

21:49 <netbsduser`> it's an a2000 with a tekmagic/060 accelerator

21:50 <mjg> that changes things does not it

21:50 <mjg> Processor: 040@40MHz or 060@50MHz(66MHz)

21:50 <mjg> Max Ram: 128MB

21:50 <netbsduser`> i have the latter and a full 128mb

21:50 <mjg> i hear vampire or whatever the name can do way better, but then again it is almost a complete replacement

21:51 <netbsduser`> it's definitely the envy of anyone living in 1994 or thereabouts

21:51 <netbsduser`> vampire has no MMU, or rather they claim to have a custom MMU but there's no documentation of it nor have they released a driver

21:51 <mjg> but amigaos works on it?

21:52 <mjg> i thought it has vm separation

21:52 <netbsduser`> yeah, and i've heard with better compatibility for older apps and games than does the 68060 have

22:03 <heat> freebsd is *fast*, netbsd is *portable*, openbsd, dragonfly bsd is *interesting*

22:04 <mjg> are you serious mate

22:04 <mjg> you sound like a reddit user

22:05 <mjg> from time to time someone asks what's the diff between the bsds and the above rolls out

22:05 <mjg> with openbsd being secure

22:05 <heat> nothing I said is wrong

22:05 dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]

22:06 <mjg> i don't know if netbsd is really portable

22:06 <gorgonical> isn't that their whole motto though

22:06 <mjg> it did get several ports, but how (not)easy was to create them

22:06 <gorgonical> a good point. the distinction between "portable" and "well-ported"

22:07 <mjg> is it really more portable than freebsd or people just sat through it

22:07 <mjg> it is a genuine inquiry and i would not take the portability claim at face value

22:07 <gorgonical> That argument really boils down to the same as asking if L4 is more portable than Linux. Is netbsd less complex than freebsd?

22:07 <mjg> similarly, the 'freebsd == f4s5 bro' is not exactly true, is it

22:08 <heat> is it not?

22:08 <heat> you yourself say it competes with our good old commie unix

22:08 <mjg> if by fast you mean on par with linux or at least in the ballpark, it is true for select workloads

22:08 <mjg> but i can easily point at stuff where it fucking DIES

22:08 <mjg> for example mount --bind on linux scales

22:09 <mjg> while the freebsd equivalent (nullfs) does not perform for shit

22:09 <mjg> if you have to use it... hrhrhr

22:09 <mjg> *vm* does not scale either

22:09 <heat> ok so you're telling me it's objectively bad because it doesn't scale sometimes

22:09 <heat> ??

22:10 <mjg> you got this backwards

22:10 <heat> "fast" does not mean "fast in every possible way"

22:10 <mjg> it does scale *sometimes*

22:11 <mjg> to be fair, if you have a real workload on a box < 40-ish cores, it will probably be perfectly ok

22:11 <heat> ....

22:11 <mjg> and even today this is the typical hardware

22:11 <heat> OK SO ITS SLOW WHEN YOU HAVE A 3K XEON SO NOT FAST AT ALL

22:11 <mjg> but if you have more serious needs, things quickly get murky and snowball to total crap

22:11 <mjg> dude being fast at laptop scale is like... so 2010

22:12 <mjg> i can tell you for example that pmap does not scale for shit

22:12 <heat> how is 40 core laptop scale

22:12 <mjg> laptop scale is 8 or so

22:12 <mjg> this is where even netbsd mostly performs

22:12 <mjg> openbsd of course does not

22:13 <mjg> totally serious claim: you have to set some hardware standards. for better or worse i think behavior at about 100 cpu threads covers scalability for the vast majority of real world deployments

22:13 <mjg> and consequently makes for a sensible base

22:13 <mjg> for example supposed "big" bare metal boxes on amazon are 96 threads

22:14 <mjg> at that scale there are woes

22:14 SGautam has quit [Quit: Connection closed for inactivity]

22:15 <mjg> of course there is hardware way core-y than that and there freebsd does not perform for shit

22:15 <mjg> but that mostly does not matter

22:16 <mjg> all that said, if you throw in a vm with few cores somewhere, freebsd wont be a perf problem

22:16 <mjg> similarly if you benchmark at laptop scale

22:17 <mjg> even fucking solaris scales to laptop scale :p

22:17 <heat> if you use "lets see if this shit scales" on 100 core boxes as the "sensible base" then you'll be sad to realize only linux is fast

22:17 <mjg> not true

22:17 <mjg> i *beat* linux with my vfs work

22:18 <heat> I highly doubt windows does

22:18 <mjg> and then i made it back on top after fixing lockref

22:18 <mjg> i'm gonna beat it again later

22:18 <heat> and I don't see any other possible contender here

22:18 <mjg> contrary to popular belief linux drowns in slowness

22:18 <mjg> but nobody is comparing it to other systems

22:18 <mjg> see the lockref debacle, how was this unpatched for 9 years is beyond me

22:19 <heat> ok so the only other OS that is fast is freebsd?

22:19 <heat> didn't you literally call it "not fast"

22:19 <mjg> fast as in faster than linux in *some* cases, freebsd is the only one i know of

22:20 <heat> 100 cores is not a sensible base man, sorry

22:20 <mjg> but it is not that freebsd is doing anything amazing

22:20 <mjg> what is a sensible base

22:20 <heat> I can tell you that no cloudflare box has more than 60-something/70 cores

22:21 <heat> maybe 50?

22:21 <mjg> you may note that's already way past laptop scale though

22:21 <heat> it is

22:21 <mjg> you can expect vm to be a problem at that scale for example

22:21 <heat> yes but is it slow?

22:22 <demindiro> If you're at the point you actually have a use for 100 cores, aren't you then already at the point you can use multiple machines?

22:22 <demindiro> At which point I assume core count per machine matters less?

22:23 <mjg> demindiro: you may have a workload which needs a lot of parallelism and fast transfer across nodes

22:23 <demindiro> And you can have a more sensible number like 16 cores/machine if you use e.g. Ryzen VPS

22:23 <mjg> heat: freebsd vm, on that scale, yes

22:23 <mjg> heat: i do have patches for it though

22:23 <heat> freebsd works fine for netflix's scale

22:23 <heat> so...

22:23 <mjg> that's a common fallacy you are falling into dawg

22:23 <heat> if it did not, I would assume they would just use linux

22:24 <mjg> netflix has a highly specific workload and they optimized their own freebsd fork to perform for that workload

22:24 <mjg> it does not translate for more generic cases

22:24 <heat> sure

22:25 <mjg> most notably, netflix workload does not mmap/munmap files

22:25 <mjg> and this is where the primary bottleneck lies, along with page fault handling

22:25 <heat> the problem is that you point to synthetic benchmarks and say "look, $os is slow!"

22:25 <mjg> i did not point at synthetic benchmarks here

22:25 <mjg> or any bench for that matter

22:25 <gorgonical> quick question: what's the purpose of a memsiz=0,filesiz=0 loadable program header in an elf file?

22:25 <heat> if you're at the point where the OS is used in production for a large-ass CDN, it is not slow, period

22:25 <mjg> if you want i can point you at a real workload which shows how it sucks

22:25 <heat> gorgonical, none, where did you see it?

22:26 <gorgonical> in the kernel I have

22:26 <mjg> the workload being building linux

22:26 <gorgonical> So possibly a weird linker artifact from a wonky script?

22:26 <heat> gorgonical, pastebin

22:27 <heat> i actually had a bug with some wonky program headers in a linker script a few weeks back because that phdr was empty and bfd still wanted to spit it out

22:28 <gorgonical> https://pastebin.com/1hz7XekX

22:28 <bslsk05> pastebin.com: Elf file type is EXEC (Executable file)Entry point 0xffffffc000080040There a - Pastebin.com

22:28 <heat> gorgonical, Section to Segment mapping: too

22:28 <gorgonical> whoops

22:29 <gorgonical> https://pastebin.com/JFvnMBqV

22:29 <bslsk05> pastebin.com: Elf file type is EXEC (Executable file)Entry point 0xffffffc000080040There a - Pastebin.com

22:30 <heat> ok you have an empty segment

22:30 <heat> probably from a linker script?

22:30 <gorgonical> Must be

22:30 <heat> do you have a PHDRS ?

22:31 <gorgonical> I don't really mess with the linker script so I don't know much about it

22:31 <heat> which one is it? linux?

22:31 <gorgonical> our own kernel

22:31 <gorgonical> Kitten

22:31 <heat> (i guess, from the section names)

22:31 <heat> oss?

22:31 <gorgonical> it's very linux-y

22:31 <gorgonical> yes

22:31 <heat> linc

22:31 <gorgonical> https://github.com/hobbesosr/kitten

22:31 <bslsk05> HobbesOSR/kitten - Kitten Lightweight Kernel (19 forks/37 stargazers/NOASSERTION)

22:32 <gorgonical> this is the arm64 build

22:32 <gorgonical> oh there is a phdrs section in the script, you were right

22:32 <heat> yes

22:32 <heat> your user segment is empty

22:32 <heat> bfd doesn't care and still outputs it. gold and lld do not
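
For spotting a segment like this without reading readelf output by eye, a minimal sketch that scans ELF64 program headers for PT_LOAD entries with both filesz and memsz equal to zero. It assumes a little-endian ELF64 image; `build_demo_image` is a made-up helper that fabricates a header-only image so the example runs without a real kernel binary.

```python
import struct

PT_LOAD = 1
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF64 file header, little-endian (64 bytes)
PHDR = struct.Struct("<IIQQQQQQ")          # ELF64 program header (56 bytes)

def empty_load_segments(image: bytes):
    """Return indexes of PT_LOAD headers whose filesz and memsz are both zero."""
    fields = EHDR.unpack_from(image, 0)
    ident, e_phoff, e_phentsize, e_phnum = fields[0], fields[5], fields[9], fields[10]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    empty = []
    for i in range(e_phnum):
        (p_type, _flags, _offset, _vaddr, _paddr,
         p_filesz, p_memsz, _align) = PHDR.unpack_from(image, e_phoff + i * e_phentsize)
        if p_type == PT_LOAD and p_filesz == 0 and p_memsz == 0:
            empty.append(i)
    return empty

def build_demo_image(segments):
    """Demo helper: header-only ELF64 image from (p_type, filesz, memsz) tuples."""
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)  # ELFCLASS64, little-endian, version 1
    ehdr = EHDR.pack(ident, 2, 183, 1, 0, EHDR.size, 0, 0,
                     EHDR.size, PHDR.size, len(segments), 0, 0, 0)
    phdrs = b"".join(PHDR.pack(t, 0, 0, 0, 0, fsz, msz, 0x1000)
                     for t, fsz, msz in segments)
    return ehdr + phdrs

# Segment 0: normal load, segment 1: empty load, segment 2: a non-load header.
demo = build_demo_image([(PT_LOAD, 0x1000, 0x2000), (PT_LOAD, 0, 0), (4, 0x20, 0x20)])
print(empty_load_segments(demo))
```

A segment with filesz 0 but a nonzero memsz (a bss-only segment) is deliberately not flagged, since that one does describe memory to be set up at load time.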

22:33 <gorgonical> I wonder what that's a holdover from

22:33 <heat> also what's up with your fucking segment permissions

22:33 <heat> data RWX??

22:34 <gorgonical> This is usually used for hpc where security is less important

22:34 <gorgonical> And also I don't know, this code is basically reused for the last forever

22:34 <gorgonical> I'm not offering excuses, only what I know lol

22:35 <heat> your kernel is very weird

22:35 <gorgonical> it is indeed

22:35 <heat> looks half copied from linux

22:35 <heat> or more than half

22:35 <gorgonical> you're right, it is. the basic idea was to take linux and strip out all the extraneous shit that isn't useful for hpc applications

22:35 <heat> huh

22:36 <gorgonical> so like there's no interactivity support at all. you pass a single initrd and that's your only program

22:37 <gorgonical> there's an extension that allows you to run other programs in a shell-like fashion, but its typical use-case is just run-and-done a single binary

22:37 <heat> i mean I guess in theory you stop being able to use large pages if you do proper perms, unless you large-page align segments

22:39 <heat> actually don't know if large/huge pages are a big win in the kernel

22:39 <heat> interesting question

22:40 <moon-child> I heard hugepages really suck on linux

22:40 <moon-child> because when you 'downgrade' a big page to a bunch of small ones, it sends a separate tlb shootdown for each one

22:40 <moon-child> instead of ipi and batching it

22:41 <heat> that should not be true

22:47 <heat> and even then, in this case, you're not downgrading anything

22:49 <heat> well, my "that should not be true" assertion comes from the fact that linux does indeed support shootdowns with more than one page. so unless their large pages breakup code is really shit, it should not be an issue

22:50 <moon-child> idk I just heard that somewhere. Could be wrong

22:51 <geist> or you just globally flush

22:51 <geist> it depends a lot on the arch

22:51 <geist> most sane arches make it pretty clear that a TLB flush on a large page should shoot down any matching TLBs, including the entire page

22:52 <geist> even if internally the TLB 'cracked' the entries into multiple smaller ones

22:52 <geist> but there is probably some dumb edge cases

22:52 <geist> on ARM for example you need to do a break-before-make there which is pretty unfortunate

22:53 <heat> what's the algo to break up a large page pte? alloc a page table, fill it with the entries you want, atomically set?

22:53 <heat> then invlpg or whatever

22:54 <heat> (also fwiw doing this on x86 will always trigger a global flush since the tlb invlpg ceiling is 33)

22:54 <geist> depends. for break before make on ARM you must kill the old page, shoot it down, sync, then add new pages
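
The two orderings being compared can be modeled in a few lines. This is a toy model and not kernel code: the page-table "slot" is a dict, the TLB operations are just strings appended to a log, and the 2 MiB / 4 KiB / 512-entry geometry is an assumption matching the common 4 KiB-granule layout.

```python
PAGE_4K = 4096
ENTRIES_PER_TABLE = 512
LARGE_PAGE = PAGE_4K * ENTRIES_PER_TABLE  # 2 MiB

def split_large_entry(phys_base, perms):
    """Build the 512 small-page entries covering the same range as one 2 MiB entry."""
    if phys_base % LARGE_PAGE:
        raise ValueError("large page base must be 2 MiB aligned")
    return [(phys_base + i * PAGE_4K, perms) for i in range(ENTRIES_PER_TABLE)]

def replace_in_place(slot, new_table, log):
    """heat's ordering: fill the new table first, swap the entry, then invalidate."""
    slot["entry"] = ("table", new_table)  # stands in for a single atomic store
    log.append("invalidate")

def break_before_make(slot, new_table, log):
    """geist's ARM ordering: kill the old entry, shoot it down, sync, then install."""
    slot["entry"] = ("invalid", None)     # window where accesses to the range fault
    log.append("invalidate")
    log.append("barrier")
    slot["entry"] = ("table", new_table)

base = 0x4020_0000
table = split_large_entry(base, "rw")

slot_a, log_a = {"entry": ("large", (base, "rw"))}, []
replace_in_place(slot_a, table, log_a)

slot_b, log_b = {"entry": ("large", (base, "rw"))}, []
break_before_make(slot_b, table, log_b)
print(log_a, log_b)
```

Both orderings end with the same table installed; the difference is the window in the break-before-make variant where the range is unmapped, which is where transient faults would come from.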

22:54 <clever> i just had a bit of a crazy idea, what if the TLB could test if 2 slots are consecutive in both physical and virtual?

22:54 <geist> for coherency reasons with the A and D bits basically

22:54 <clever> and auto-merge them into one TLB entry?

22:55 <clever> and they also have the same perms
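
clever's merge condition, written out as a predicate. This is only a sketch of the idea as stated (adjacent in virtual space, adjacent in physical space, same perms); the `TlbEntry` fields are made up for the example, and no bookkeeping for accessed/dirty state is modeled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TlbEntry:
    virt: int    # virtual base address
    phys: int    # physical base address
    size: int    # bytes covered by the entry
    perms: str   # e.g. "rw", "rx"

def can_merge(a: TlbEntry, b: TlbEntry) -> bool:
    """True if b directly follows a in both address spaces with identical perms."""
    return (a.size == b.size
            and a.perms == b.perms
            and b.virt == a.virt + a.size
            and b.phys == a.phys + a.size)

def merge(a: TlbEntry, b: TlbEntry) -> TlbEntry:
    """Combine two mergeable entries into one covering both ranges."""
    if not can_merge(a, b):
        raise ValueError("entries are not mergeable")
    return TlbEntry(a.virt, a.phys, a.size + b.size, a.perms)

first = TlbEntry(0x1000, 0x8000, 0x1000, "rw")
second = TlbEntry(0x2000, 0x9000, 0x1000, "rw")
print(merge(first, second))
```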

22:55 <geist> problem with auto merge is things like the A and D bits. you have to remember what the original page was to write it back properly

22:55 <heat> actually you only need 1 invlpg in all cases

22:55 <heat> it Just Works for the large pages as you said

22:55 <heat> nice.

22:55 <clever> geist: does the cpu update the accessed bit on every access, or maybe only upon TLB miss when it reads the tables?

22:55 <geist> it's a lot of the complexity of the ARM page tables, which is why in general merging pages like the contig pages on arm, and a new riscv extension i was just reading about always say 'you cant predict which of the sub pages A/D gets written back to'

22:55 <geist> clever: the latter

22:56 <clever> so merging TLB entries would potentially result in less TLB misses, and the access bit updating less

22:56 <geist> for example

22:56 <geist> the A bit isn't so bad, but the D bit is more of an issue

22:56 <geist> since D is written back potentially after an existing TLB entry has been floating around a while

22:56 <clever> what was D again?

22:56 <geist> dirty, modified

22:57 <heat> i bet handling transient page faults on break-before-make sucks

22:57 <clever> ahh

22:57 <geist> heat: yeah, that's the issue

22:57 <clever> yeah, thats more of an issue

22:57 <heat> i mean, you can just get the PTE on page fault and check for the perms so, not really?

22:57 <heat> idk

22:57 <clever> and i assume the TLB also holds the addr of the page table entry its matched to?

22:57 <clever> so A/D can update without having to re-walk?

22:57 <geist> in user space i think it's more of a transient thing. for kernel space it's a real problem

22:58 <heat> geist, is the solution "don't mprotect in kernel space"?

22:58 <geist> basically

22:58 <heat> lol

22:58 <geist> or at least dont split large pages there

22:58 <geist> it's actually kinda a zircon problem that we haven't really solved yet

22:59 <geist> ie, the physmap is set up with large pages, but there may be situations where we want to 'punch out' a spot in it, which involves breaking a large page

22:59 <heat> don't you need it on perm changes too?

22:59 <geist> and then you have break-before-make

22:59 <geist> yes

22:59 <moon-child> clever: interesting

22:59 <heat> I remember you talked about it briefly in the fuchsia discord about ASAN

23:00 <heat> (or KASAN I mean)

23:00 <geist> yah we up until now have only had to do this at boot, where there's a single cpu running

23:00 <moon-child> clever: seems like it would be kinda finicky to get right. EG you map y in between x and z; now you need to merge on both sides. But probs still doable

23:00 <geist> ie, for marking kernel pages as XN after doing some boottime patchups

23:00 <heat> would it just be a "write protected vs writable" thing on KASAN?

23:00 <geist> if it's a single cpu you're 'safe' because you can just make sure you dont fault on what you're fixing up

23:00 <heat> and the big q is how tf do you solve this

23:00 <geist> but it's when another cpu comes along and faults on what you're BBMing

23:01 <geist> short of halting all the other cpus, i dunno

23:01 <heat> sucks

23:02 terminalpusher has quit [Remote host closed the connection]

23:02 <heat> the gang checks linux source

23:02 <geist> i'm sure there are more and more elaborate strategies you can use to fix up the problem after it happens

23:02 <geist> but in general the best solution is to avoid having the problem in the first place

23:09 <clever> geist: oh, but i can see one solution to this, a new flag in the paging tables, that allows TLB merging and tells the cpu you dont care about A/D being out of date

23:09 <clever> it could be used for kernel mappings, mlock() stuff, and stuff you dont plan to swap

23:10 * geist nods

23:10 <clever> then the entire kernel .text would always be 1 TLB slot

23:10 <clever> and would basically never miss

23:11 <clever> and it would be more flexible than just having 16mb pages

23:11 <clever> because it wont be limited to a power of 2

23:11 <heat> geist, seems that the idea around linux kasan is that they just keep the dynamic shadow bits (vmalloc) unmapped I think

23:11 <heat> so this is a non issue

23:11 <clever> but then the TLB check has to be range based

23:11 <heat> i guess you can intercept this on a page fault and kasan still works

23:11 <geist> (sorry, a bit busy today, trying to ignore irc. please dont tag me, makes my phone bloop in the other room)

23:11 <clever> kk

23:12 <heat> sorry

23:12 <geist> no worries, just trying to heads down on something

23:12 <clever> is risc-v based on a soft mmu?

23:12 danilogondolfo has quit [Remote host closed the connection]

23:12 <clever> that would let you implement such an idea without changing the cpu

23:12 <heat> like "if (is_kasan_shadow(fault_address) && is_vmalloc_addr(shadow2addr(fault_address))) bad_kasan();"

23:12 <heat> no

23:12 <clever> if the tlb is range based

23:13 <heat> erm s/bad_kasan/bad_vmalloc_access/

23:13 <heat> they actually do this for NULL accesses as well
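[editor's sketch] heat's one-liner, expanded into something compilable. The helper names `is_kasan_shadow`, `shadow2addr` and `is_vmalloc_addr` come from the message above; `addr2shadow`, `classify_fault` and every constant are assumptions made for illustration (the values are loosely modelled on a 4-level x86-64 layout and should not be relied on). The only structural assumption is a shadow mapping of the form `(addr >> 3) + offset`, with one shadow byte per 8 bytes of memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All values below are illustrative assumptions, not authoritative. */
#define SHADOW_SCALE_SHIFT 3
#define SHADOW_OFFSET      0xdffffc0000000000ull
#define KERNEL_START       0xffff800000000000ull
#define VMALLOC_START      0xffffc90000000000ull
#define VMALLOC_END        0xffffe90000000000ull
#define PAGE_SIZE          4096ull

enum fault_kind { FAULT_OTHER, FAULT_BAD_VMALLOC_ACCESS, FAULT_NULL_DEREF };

static uint64_t addr2shadow(uint64_t addr)
{
    return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

static uint64_t shadow2addr(uint64_t shadow)
{
    return (shadow - SHADOW_OFFSET) << SHADOW_SCALE_SHIFT;
}

/* True if the address lies in the shadow of the kernel half. */
static bool is_kasan_shadow(uint64_t addr)
{
    return addr >= addr2shadow(KERNEL_START) &&
           addr <= addr2shadow(UINT64_MAX);
}

static bool is_vmalloc_addr(uint64_t addr)
{
    return addr >= VMALLOC_START && addr < VMALLOC_END;
}

/* Hypothetical fault-handler hook: decide what a faulting address means. */
static enum fault_kind classify_fault(uint64_t fault_address)
{
    /* Shadow of a vmalloc address that was left unmapped. */
    if (is_kasan_shadow(fault_address) &&
        is_vmalloc_addr(shadow2addr(fault_address)))
        return FAULT_BAD_VMALLOC_ACCESS;

    /* Shadow of the first page: the original access was near NULL.
     * Unsigned wraparound makes this false for addresses below the offset. */
    if (fault_address - SHADOW_OFFSET < (PAGE_SIZE >> SHADOW_SCALE_SHIFT))
        return FAULT_NULL_DEREF;

    return FAULT_OTHER;
}
```

The point of the design is that the shadow for dynamic regions never has to be mapped ahead of time: the fault address alone is enough to recover which original address was being checked.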

23:22 <zid> heat stop pming me your null access

23:29 air has quit [Remote host closed the connection]

23:30 air has joined #osdev

23:30 slidercrank has quit [Ping timeout: 252 seconds]

23:46 demindiro has quit [Quit: Client closed]

23:59 <mrvn> wouldn't the cpu simply only merge TLB entries when the A/D bits match and also split entries when A/D changes?
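[editor's sketch] The merge rule from this thread can be written down as a predicate: clever's conditions (consecutive in both virtual and physical space, same perms) plus mrvn's condition that the A/D bits match. This models a hypothetical range-based TLB in software; `struct tlb_entry`, `tlb_can_merge` and `tlb_try_merge` are invented names, and no real CPU is claimed to work this way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A range-based TLB entry: maps [va, va + len) to [pa, pa + len). */
struct tlb_entry {
    uint64_t va;
    uint64_t pa;
    uint64_t len;
    uint32_t perms;
    bool accessed;
    bool dirty;
};

/* lo and hi can merge only if hi starts exactly where lo ends, both
 * virtually and physically, and perms and A/D state are identical. */
static bool tlb_can_merge(const struct tlb_entry *lo, const struct tlb_entry *hi)
{
    return lo->va + lo->len == hi->va &&
           lo->pa + lo->len == hi->pa &&
           lo->perms == hi->perms &&
           lo->accessed == hi->accessed &&
           lo->dirty == hi->dirty;
}

/* Extends lo to cover hi when allowed; the caller would then drop hi. */
static bool tlb_try_merge(struct tlb_entry *lo, const struct tlb_entry *hi)
{
    if (!tlb_can_merge(lo, hi))
        return false;
    lo->len += hi->len;
    return true;
}
```

Because the merged entry is a plain range, it is not limited to a power-of-two size, which is the flexibility clever mentions. The splitting half of mrvn's idea (undoing a merge when a later write changes D for only part of the range) is not modelled here.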