#osdev on 2022-06-22 — irc logs at libera.irclog.whitequark.org

2021-05-23 01:57 klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books

00:01 qubasa has joined #osdev

00:18 heat has quit [Ping timeout: 248 seconds]

00:36 arcadewise has left #osdev [#osdev]

00:38 pretty_dumm_guy has quit [Ping timeout: 240 seconds]

00:55 Likorn has quit [Quit: WeeChat 3.4.1]

01:01 gog has quit [Ping timeout: 256 seconds]

01:04 <geist> hmm, so FWIW since i moved the server board to a new case (and thus a new power supply and better cooling) it's been up for 3 and a half days

01:05 <geist> but one variable i hadn't isolated before was when it was failing and when it was starting to fail more and more often, was if i had let it cool off before running it some more

01:05 <geist> so maybe something was slowly overheating somewhere that if i just hit the reset button would cause it to take a few more days to get up to

01:06 \Test_User has joined #osdev

01:56 <sbalmos> odd. what about maybe bad fan speed sensor in the PSU or case fan, causing the CPU to go into thermal protection?

02:00 gildasio has joined #osdev

02:36 liz has quit [Ping timeout: 240 seconds]

02:43 <geist> or the PSU maybe

02:44 <geist> well, we'll see. still got another 4 or 5 days till i get it to the usual first failure time (7 or 8 days)

02:44 <geist> got a batch of noctua case fans. ❤️ noctua

02:50 <sbalmos> yeah, I've had two PSUs go bad over my life. and it's always that wacky behavior shite, like it starts dropping a voltage on a rail /just enough/ to make for o_O behavior like that.

02:58 <geist> yah and maybe it's a heating up thing, because if i test it cold it looks fine

03:10 <pounce> my work just gave me a desktop with a hot swappable PSU, it's bonkers

03:11 <pounce> also dual Xeon Gold processors, and 24 dimm slots

03:11 <pounce> really want to do NUMA testing on it

03:15 <zid`> can you requisition me a w-2125

03:19 sonny has joined #osdev

03:19 sonny has left #osdev [#osdev]

03:55 <moon-child> zid`: I got a w-2123 for really cheap on craigslist

03:55 <moon-child> just today

03:55 <moon-child> then I found out sourcing compatible motherboards is a complete pain

03:56 <moon-child> current plan is to get a used dell workstation mobo from ebay and hope it's compatible with a standard power supply

04:16 sham1_ has joined #osdev

04:16 sham1 has quit [Ping timeout: 246 seconds]

04:16 onering has joined #osdev

04:16 zhiayang_ has joined #osdev

04:16 Santurysim has joined #osdev

04:16 janemba has quit [Ping timeout: 246 seconds]

04:16 ThinkT510 has quit [Ping timeout: 246 seconds]

04:16 Ermine has quit [Ping timeout: 246 seconds]

04:17 Jari-- has quit [Ping timeout: 246 seconds]

04:17 Beato has quit [Ping timeout: 246 seconds]

04:17 zhiayang has quit [Ping timeout: 246 seconds]

04:17 zhiayang_ is now known as zhiayang

04:18 ThinkT510 has joined #osdev

04:29 janemba has joined #osdev

05:07 opal has quit [Ping timeout: 268 seconds]

05:07 brynet has quit [Ping timeout: 248 seconds]

05:21 opal has joined #osdev

06:01 dmh has quit [Quit: rip]

06:04 <mrvn> it's 8 in the morning and still too hot to work

06:10 srjek has quit [Ping timeout: 255 seconds]

06:16 mzxtuelkl has joined #osdev

06:18 <moon-child> remote?

06:18 <moon-child> if so, wet shirt

07:10 <vdamewood> I know people who wet their pants, but not their shirts.

07:11 <mrvn> wet shirt only works if the humidity is low.

07:13 <moon-child> I live in the pacific northwest. Works fine for me

07:13 <vdamewood> The pacific northwet?

07:19 <vdamewood> Dude. It's like midnight in the PNW.

07:19 <vdamewood> What are you doing up so late?

07:24 <Mutabah> Late? Midnight?

07:25 <\Test_User> it's a nice 3:25 for me

07:25 <\Test_User> 3:25 am ofc

07:31 <Mutabah> (To clarify - midnight isn't late)

07:33 <vdamewood> Mutabah: Actually, you're right. Midnight is just a few hours before bed time.

07:40 terminalpusher has joined #osdev

07:44 bauen1 has quit [Ping timeout: 240 seconds]

08:05 gog has joined #osdev

08:21 <Griwes> I've almost convinced myself that I should just use llvm's libunwind for the time being and only come back to the idea of writing my own once I'm much further into the project, as a side thing instead of a blocker

08:21 <Griwes> This is probably much healthier too, isn't it

08:21 <ddevault> the correct way to unwind stacks is by using %rbp

08:23 <Griwes> Only in languages where the only way to handle errors is to weave error handling around every line of the proper code

08:24 <ddevault> exceptions are super dumb

08:24 * moon-child grabs popcorn

08:24 <ddevault> exceptions are longjmp as a good practice

08:25 <moon-child> longjmp is greenspunned RETURN-FROM

08:25 * moon-child grabs more popcorn

08:28 GeDaMo has joined #osdev

08:30 <ddevault> in other news, got message passing working reliably

08:31 <klange> congrats; envy is settling in as I see you deliver a multitude of projects

08:31 <ddevault> ty, I get a lot of help

08:36 <mrvn> ddevault: why would %rbp have any sensible value?

08:36 <moon-child> sysv abi technically mandates that you maintain a frame pointer

08:37 <mrvn> The advantage of exceptions is supposedly that you don't have to handle them everywhere, they will just magically propagate to where you catch them. But with RAII you have to handle them at every } anyway. So what actually is the point?

08:37 <mrvn> moon-child: then it's a good thing I'm not doing sysv abi, nor even C.

08:38 <ddevault> I don't like magic

08:38 <ddevault> %rbp is the frame pointer

08:38 <ddevault> you can define any ABI you like but I like frame pointers

08:38 <mrvn> Basically, in any language with scopes and destructors the exception can't be using longjmp, making it a bit pointless.

08:39 <moon-child> 'what actually is the point' two things. 1, it is handled automatically. Even comparing eg the way rust does it with c++, c++ is more modular, since I can call you and you can call me back, and I can catch in the outer stack frame an exception that I threw in the inner stack frame, and you don't have to care about it

08:39 <mrvn> ddevault: frame pointers are too little information for stack unwinding

08:39 <ddevault> not really

08:39 <moon-child> 2, in a language with tracing gc, the implementation doesn't need to spend nearly so much time with raii

08:39 <ddevault> it doesn't deal with inlining, sure

08:39 <moon-child> s/language/implementation/

08:40 <ddevault> but a frame pointer plus, well, a frame, is enough to walk over satck frames

08:40 <ddevault> stack*

08:40 <mrvn> moon-child: a language with GC doesn't have scopes and destructor. They are separated.

08:40 <moon-child> it can. Why can't it?

08:40 <mrvn> ddevault: if you know the frame layout you don't need the frame pointer, the SP will do the same

08:40 <moon-child> the only distinction is that you don't have to use destructors to manage lifetimes of pointers to allocated memory

08:40 <ddevault> yeah, but only if you know the frame layout

08:41 <ddevault> which calls for DWARF or something like it

08:41 <ddevault> much more complicated

08:41 <mrvn> moon-child: exactly. you don't call destructors at the end of the scope so you can just longjmp

08:41 <moon-child> you might have destructors for other things

08:41 <moon-child> such as files or mutexes

08:41 <mrvn> ddevault: you need the frame layout to unwind the stack and call all the destructors. That's my point.

08:42 <ddevault> not necessarily

08:42 <ddevault> hare does this by calling destructors before propagating errors

08:42 <ddevault> well, s/destructors/defers/

08:42 <moon-child> you can maintain a shadow stack. But that's just dwarf with extra steps

08:43 <mrvn> ddevault: ==> no longjmp for the exception.

08:43 <ddevault> again

08:43 <ddevault> exceptions are bad

08:43 <mrvn> that's an opinion

08:43 <ddevault> aye

08:44 <moon-child> error values are anti-modular. anti-modularity is a popular meme among the unix crowd, and I am not making a value judgement, but it is important to acknowledge the consequences of such a view

08:45 <ddevault> not defining your constraints within the type system is reckless

08:45 <ddevault> it's less modular, sure, but more reliable and predictable

08:45 <mrvn> Why do you think exceptions are about error values. I think that's the first mistake. Why should exceptions be exceptional and errors?

08:45 <moon-child> not defining your constraints within the type system is a _lot_ less reckless than permitting use-after-free

08:46 <mrvn> Not having exceptions as part of a function's signature is the second mistake imho.

08:46 <moon-child> also what mrvn said

08:46 <ddevault> different trade-offs

08:46 bauen1 has joined #osdev

08:47 moon-child has left #osdev [#osdev]

08:49 <ddevault> in any case, that would just make us both hypocrites :)

08:49 <mrvn> I think exceptions should be much more like std::expected.

08:51 <mrvn> So what if you have to check for error on every level? All the cases that shouldn't better call abort() are things you catch very quickly anyway. Make exceptions not exceptional.

08:51 <mrvn> most of it you can hide behind syntactic sugar.

08:51 <ddevault> agreed

08:51 <ddevault> errors are just errors

08:52 <mrvn> Something like not_found isn't even a real error. That can be very much be the expected result.

08:53 <mrvn> if (s.find(foo) == s.end()) who wants to write that instead of try s.find(foo) except not_found ?

08:53 <ddevault> if (s.find(foo) is void), rather, but: me

08:53 xenos1984 has quit [Read error: Connection reset by peer]

08:54 <mrvn> or in ocaml you have this nice syntax with pattern matching: match Map.find(map, key) with IntLit i -> ... | StringLit s -> ... | NotFound -> ....

08:54 <mrvn> exceptions are just another case in the pattern matching.
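A minimal C++17 sketch of the point made in the last few messages, assuming a lookup whose "not found" outcome is just one more case to match on rather than an error. The names `NotFound`, `Value`, `Lookup`, `find`, and `describe` are hypothetical and exist only for illustration. It imitates the OCaml-style match with `std::variant` and `std::visit`; it is not `std::expected` and not anyone's actual kernel code.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical marker type: "key is absent" is an ordinary result, not an error.
struct NotFound {};

using Value  = std::variant<int, std::string>;            // what the map stores
using Lookup = std::variant<int, std::string, NotFound>;  // what a lookup yields

// Look up a key; absence is returned as the NotFound alternative.
Lookup find(const std::map<std::string, Value>& m, const std::string& key) {
    auto it = m.find(key);
    if (it == m.end()) return NotFound{};
    // Widen the stored alternative into the three-way result type.
    return std::visit([](const auto& v) -> Lookup { return v; }, it->second);
}

// One branch per alternative, similar in spirit to a pattern match.
std::string describe(const Lookup& r) {
    return std::visit([](const auto& v) -> std::string {
        using T = std::decay_t<decltype(v)>;
        if constexpr (std::is_same_v<T, int>) {
            return "int " + std::to_string(v);
        } else if constexpr (std::is_same_v<T, std::string>) {
            return "string " + v;
        } else {
            return "not found";
        }
    }, r);
}
```

Because the absent case is part of the return type, a caller that forgets to handle it is visible in the source, which is roughly the property being asked for in the discussion.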

08:55 Starfoxxes has quit [Ping timeout: 260 seconds]

08:57 <mrvn> Note to self: port std::expected to my kernel stl.

09:07 Starfoxxes has joined #osdev

09:09 <Griwes> "with raii exceptions are handled at every brace" is a nonsense take. It's just a mechanism that allows you to forget that any form of early return exists for the purposes of cleanup

09:12 <Griwes> Anyway, I'm not planning to have a pointless conversation trying to convince people who have their opinions set and aren't interested in ever being convinced, so I'm afraid any popcorn grabbed for this will be wasted

09:12 xenos1984 has joined #osdev

09:14 <mrvn> Griwes: it's not so much RAII but the destruction at end of scope
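To make the "destruction at end of scope" point concrete, here is a small C++ sketch, assuming a hypothetical `Tracer` type whose destructor only records that it ran. It shows that when an exception propagates through two scopes, each scope's destructor runs before the handler does, so unwinding cannot be a plain `longjmp`. The function names are made up for this example.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: records its own destruction in a shared log.
struct Tracer {
    std::vector<std::string>& log;
    std::string name;
    Tracer(std::vector<std::string>& l, std::string n)
        : log(l), name(std::move(n)) {}
    ~Tracer() { log.push_back("~" + name); }
};

void inner(std::vector<std::string>& log) {
    Tracer t(log, "inner");
    throw std::runtime_error("boom");  // leaves this scope early
}

void middle(std::vector<std::string>& log) {
    Tracer t(log, "middle");
    inner(log);  // no try/catch here: the exception passes through
}

// Returns the order of events: destructors first, then the handler.
std::vector<std::string> run_example() {
    std::vector<std::string> log;
    try {
        middle(log);
    } catch (const std::runtime_error& e) {
        log.push_back(std::string("caught ") + e.what());
    }
    return log;
}
```

`middle` contains no error-handling source code, which is the convenience being defended above; the cleanup work still happens at each closing brace, which is the cost being pointed out.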

09:18 <mrvn> Griwes: For me the problem is the requirement in languages like C++ that exceptions must have 0 cost unless you throw one. That makes throwing them usually very expensive and makes exceptions unsuitable for everything but the exceptional, usually stuff that aborts.

09:18 <ddevault> getting two OS developers to agree on anything is an exercise in frustration

09:18 <ddevault> building an OS is the ultimate exercise in NIH

09:19 <Griwes> ...exceptions are meant to be for the exceptional stuff? Almost like it's in the name!

09:19 <Griwes> Hard disagree on the claim that it's then "usually stuff that aborts" though

09:19 <mrvn> So we need a different name. What would you call something to do an early return that isn't exceptional?

09:20 <mrvn> Griwes: ever written an app that caught bad_alloc and kept going?

09:20 <Griwes> Me? No. But I know people who have, with great success

09:21 <mrvn> Not that bad_alloc even gets thrown in most cases with overcommit.

09:23 <mrvn> Griwes: so what exceptions do you regularly catch and handle without having your program terminate eventually due to it?

09:29 <Griwes> depends on the domain, though once again, "regularly" is a funny word to use for something explicitly exceptional

09:29 <Griwes> the biggest boon of them is when you're writing a library and you aren't the one handling the error, they become transparent to anyone but whoever decides to catch them

09:30 <mrvn> Griwes: but for that case the fact that exceptions are not part of a function's signature makes them rather bad.

09:30 <Griwes> hard disagree

09:30 <Griwes> it allows middleware libraries to ignore error handling entirely and get it transparently handled a layer above

09:30 <Griwes> it's what makes them rather good

09:31 <Griwes> anyway, that's as far into this discussion as I'll allow myself to be dragged in

09:33 <mrvn> it makes it impossible for the compiler to see if exceptions are handled or not, e.g. if the exceptions thrown by a function change. Maybe the middleware library should handle a new exception but it just silently propagates and terminates the program and you won't find out for years because it's exceptional and doesn't happen till then.

09:34 <mrvn> I agree that it should be possible to pass exceptions along transparently. Like say "int foo() [throws everything LibBla::blub() throws]"

09:35 <kingoffrance> "anti-modularity is a popular meme among the unix crowd, and I am not making a value judgement, but it is important to acknowledge the consequences of such a view" some philosophies, everything contains the seeds of its own destruction. for that, the consequences are surely that it eventually leads to modularity :D

09:35 <mrvn> but it should also be possible to say "int foo [throws A | B | C]" and give an error if it can throw anything else.

09:36 <mrvn> what is anti-modularity?

09:36 <Griwes> "throws a b c" has been tried and it sucked major donkey balls

09:37 <Griwes> it's one of the really major pain points of java

09:37 <Griwes> it's also bad because it applies function coloring

09:37 <kingoffrance> sorry, was quoting "error values are anti-modular"

09:37 <mrvn> Griwes: not really. They only tried: "turn everything but a b c into terminate()"

09:37 <Griwes> though not as absurdly bad as making everything return std::expected, which is function coloring cubed

09:37 <Griwes> mrvn, yeah, and other languages tried the other option which happens to be even worse

09:38 <Griwes> anyway "terminate called because of an uncaught exception" is a fine thing to happen vOv

09:39 <mrvn> Griwes: that isn't what "throws a b c" does.

09:39 <mrvn> or rather it's missing the "fail to compile if there is a new exception d"

09:40 <mrvn> terminate is about the worst thing to happen for a lib

09:40 RAMIII has joined #osdev

09:40 RAMIII has quit [Client Quit]

09:42 <mrvn> 'The “color” of a function is a metaphor for segmenting functions into two camps: async and normal functions.' How does that apply to "throws a b c"?

09:43 <Griwes> anyway "throws a b c" is rightly dead and shall never be alive in C++ again and that's good

09:43 <mrvn> Griwes: c++ throws was horrible

09:43 <Griwes> mrvn, the color of a function is whether you can just simply call it and get a result or whether you need to handle it in its own special way. with checked exceptions, you *have* to handle everything it may throw, which means it has a color

09:44 <mrvn> Griwes: you can have to handle it or throw it.

09:44 <Griwes> throwing it means handling it

09:44 <mrvn> but it in no way limits you what color of function you can call

09:44 <Griwes> there must exist code that handles the color

09:45 <mrvn> The point, generally, of exception is that there is no code to handle the exception, only the exception handler does that.

09:45 <Griwes> whether it's a catch {} or an annotation propagating the thrown exception info, that's handling a color

09:45 <mrvn> it's purely a compile time thing, no code generated.

09:45 <Griwes> you're missing the point I'm making

09:46 <mrvn> I'm not sure what point you want to make

09:46 <Griwes> `void foo() throw (whatever bar() throws) { bar(); }` <- `throw (whatever bar() throws)` is a piece of code that handles the color

09:46 <mrvn> it's a bit of source, no runtime component

09:47 <Griwes> yes

09:47 <mrvn> ok

09:47 <Griwes> colors are about programming overhead, not runtime overhead

09:47 <mrvn> I see that as no different as: int foo(float); that's a color too

09:47 <mrvn> should we go back to implicit prototypes?

09:47 <Griwes> sigh

09:48 <Griwes> I have no interest in continuing a discussion that's not being made in good faith

09:48 <Griwes> bye

09:49 <mrvn> Griwes: do you agree that what a function throws is part of its contract?

09:51 <mrvn> Because in my mind I'm just asking for the function's contract to be machine parsable.

09:53 * mrvn likes google images for "function coloring"

09:55 <clever> mrvn: that's something i liked about java, where you had to formally declare what you can throw, and also what you're not catching that can be thrown from further down your call graph

09:55 <clever> it made it trivial to know what exceptions you can expect, and need to either choose to handle or let pass on

09:56 <mrvn> clever: did java fail if you didn't declare something or convert it into uncaught_exception or terminate?

09:56 <clever> i think it was a compile time error only

09:56 <clever> and some build systems didnt enable it

09:59 <mrvn> that's what I want. I can see the complained about it being too noisy. Do I really want to specify bad_alloc for every function that uses the heap? How many functions will your C++ code have that do not throw bad_alloc?

09:59 <mrvn> s/complained/complaint/.

10:00 <clever> i think there was a whitelist of exceptions it didnt care about, like divide by zero

10:00 <mrvn> That's not really one you "throw"

10:00 <clever> exactly

10:00 <clever> but it can still be caught

10:01 <clever> its more of the runtime throwing it for you, when you do something bad

10:01 <clever> or the runtime not even checking, and converting the signal into an exception

10:02 <mrvn> I can see that as an option: Some exceptions are global like bad_alloc and div_by_zero. I could also see class define a list of exception that would then apply to all methods.

10:02 <clever> there is also the question of should malloc ever return 0?

10:02 <clever> maybe the process should just die instead?

10:03 <clever> depends on the use-case

10:03 <mrvn> clever: you can handle it if you have the need for it. So yes, it should.

10:03 <clever> for large allocations, i can see that being valid

10:04 <clever> but for tiny allocations, just printing an error with some frameworks needs heap space

10:04 <mrvn> kind of should be an attribute in ELF so the kernel disables overcommit to binaries that handle malloc returning 0

10:04 <clever> and if a tiny allocation fails, more are going to fail soon

10:05 <mrvn> hehe, how do you allocate the bad_alloc exception when new fails? That needs new and that fails again.

10:05 <clever> for a more embedded case (kernel or mcu), you're more likely to avoid touching the heap whenever possible

10:05 <clever> exactly

10:05 <mrvn> bad_alloc kind of needs to be pre-allocated somewhere so you can throw an existing address.

10:07 <Griwes> <mrvn> hehe, how do you allocate the bad_alloc exception when new fails? That needs new and that fails again.

10:07 <Griwes> C++ abis are very specific about this

10:08 <Griwes> https://itanium-cxx-abi.github.io/cxx-abi/abi-eh.html#imp-allocate

10:08 <bslsk05> itanium-cxx-abi.github.io: C++ ABI for Itanium: Exception Handling
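A rough C++ sketch of the "pre-allocate it somewhere" idea from the messages above: try the heap first, and fall back to a small reserved pool when the heap fails. Everything here is hypothetical (the `sketch` namespace, the slot sizes, the `simulate_oom` switch used to exercise the fallback path). It is only loosely inspired by the notion of a reserved buffer and does not reproduce any real C++ runtime's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

namespace sketch {

// Assumed sizes, chosen arbitrarily for the example.
constexpr std::size_t kSlotSize = 128;
constexpr std::size_t kSlots = 4;

// Storage reserved up front, so it is available even when malloc fails.
alignas(std::max_align_t) unsigned char pool[kSlots][kSlotSize];
bool in_use[kSlots] = {};

// True if p is the start of one of the reserved slots.
bool from_pool(const void* p) {
    for (std::size_t i = 0; i < kSlots; ++i) {
        if (p == pool[i]) return true;
    }
    return false;
}

// Heap first; reserved pool as a fallback. simulate_oom skips the heap
// so the fallback path can be exercised without exhausting memory.
void* allocate_exception(std::size_t size, bool simulate_oom = false) {
    if (!simulate_oom) {
        if (void* p = std::malloc(size)) return p;
    }
    if (size <= kSlotSize) {
        for (std::size_t i = 0; i < kSlots; ++i) {
            if (!in_use[i]) {
                in_use[i] = true;
                return pool[i];
            }
        }
    }
    return nullptr;  // a real runtime would have to give up at this point
}

// Release either kind of allocation.
void free_exception(void* p) {
    for (std::size_t i = 0; i < kSlots; ++i) {
        if (p == pool[i]) {
            in_use[i] = false;
            return;
        }
    }
    std::free(p);
}

}  // namespace sketch
```

The sketch is not thread-safe; a real implementation would need locking or atomics around the slot bookkeeping.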

10:09 azul has joined #osdev

10:10 <mrvn> Griwes: another of those things imposed by the "exceptions are exceptional" design.

10:11 <clever> stack unwinding and frame pointers are another tricky thing

10:12 <clever> my rough understanding of framepointers on x86/arm, is that the frame pointer forms a linked list

10:12 <clever> where each frame pointer, points to the previous framepointer on the stack, which is at the "middle" of a stack frame

10:12 <clever> positive offsets point to arguments (beyond those that fit in the first few regs), negative offsets point to local vars

10:13 <clever> and a fixed offset from there, is the return addr, varying by platform

10:13 <mrvn> clever: the frame pointer is the top of each frame while the SP is the bottom of the frame. And there is a defined way to get the previous frame pointer given a frame pointer.

10:14 <clever> for x86, it would be a positive offset, because the call opcode pushes the return addr right below the args, and the prologue then saves the framepointer, and creates locals

10:14 <mrvn> On m68k frame pointers use the "link/unlink" opcodes so they are handled in hardware.

10:14 <HeTo> my rough understanding of frame pointers on x86 is that usually they don't exist. or at least you can't find the head of the list in a register. and I think it's the same for ARM too, actually (usually you can't get usable backtraces on ARM without debug symbols)

10:15 azul has quit [Quit: leaving]

10:15 <mrvn> HeTo: the compiler can optimize them away and then perf doesn't work right.

10:15 <clever> x86 for example, the stack can look like an array of: [ local1, local2, framepointer, returnaddr, arg1, arg2, arg3 ] (exact numbers will be wrong)

10:15 <mrvn> There are options to force and eliminate all frame pointers in gcc/clang.

10:15 <clever> because the caller first pushes all args to the stack, then runs the "call" opcode, that pushes the return addr

10:16 <clever> and the first thing the prologue does, is push the old framepointer onto the stack, and copy sp->fp, to create a new stack frame

10:16 <clever> and then sp -= $locals_size to make room for local1/local2
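The layout described above (locals, saved frame pointer, return address, arguments) can be sketched as a walk over a simulated stack. This is a toy model in C++, assuming the stack is a vector of 64-bit words whose indices stand in for addresses and that each saved frame pointer holds the index of the caller's saved frame pointer slot. The names `kEndOfChain`, `make_example_stack`, and `walk_frames` are hypothetical, and no real stack is touched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sentinel stored in the outermost frame's saved-fp slot (an assumption
// of this model; real ABIs commonly use a null frame pointer instead).
constexpr std::uint64_t kEndOfChain = UINT64_MAX;

// Two frames, innermost first (lowest index = lowest address).
std::vector<std::uint64_t> make_example_stack() {
    return {
        11, 12,       // [0],[1] inner: local1, local2
        7,            // [2]     inner: saved fp, points at the outer fp slot
        0x1000,       // [3]     inner: return address into the outer function
        1, 2,         // [4],[5] arguments pushed by the outer function
        21,           // [6]     outer: one local
        kEndOfChain,  // [7]     outer: saved fp, end of the chain
        0x2000,       // [8]     outer: return address
    };
}

// Follow the linked list of saved frame pointers, collecting the return
// address that sits one word above each saved fp.
std::vector<std::uint64_t> walk_frames(const std::vector<std::uint64_t>& stack,
                                       std::uint64_t fp) {
    std::vector<std::uint64_t> return_addrs;
    while (fp != kEndOfChain && fp + 1 < stack.size()) {
        return_addrs.push_back(stack[fp + 1]);
        std::uint64_t prev = stack[fp];
        // A sane chain only moves toward higher addresses; stop otherwise.
        if (prev != kEndOfChain && prev <= fp) break;
        fp = prev;
    }
    return return_addrs;
}
```

Starting from the inner frame's pointer (index 2 in this model), the walk needs no knowledge of how many locals or arguments each frame has, which is the property being argued for earlier in the log.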

10:16 <HeTo> mrvn: perf can alternatively use dwarf for the backtrace (not sure if it consults the symbols at runtime. I think it just saves a bunch of the stack at runtime, and maybe leaves interpreting that for the analysis?)

10:16 <mrvn> clever: yes. the frame pointer is always pushed at a fixed offset and the frame pointer register is then updated.

10:16 <GeDaMo> Frame pointers were necessary on 8086 because you couldn't do sp-relative addressing

10:17 <clever> mrvn: for arm, the return address is slightly more complicated, because of the lr register, and it being on the stack is optional

10:17 <clever> *looks*

10:17 <mrvn> clever: what stack? ARM (hardware) doesn't have a stack.

10:17 <clever> yeah, the hardware doesnt enforce one, but gcc still has one

10:17 <mrvn> doesn't ruse have no stack?

10:18 <mrvn> rust

10:18 <mrvn> or was that go?

10:18 <clever> haskell's stack is a linked list on the heap, rather than the traditional stack

10:18 <HeTo> clever: I think the return address will be on the stack if you have a stack frame. leaf functions that don't use much stack might not have their own stack frame though

10:19 <clever> HeTo: i'm checking some disassembly to confirm things there

10:19 <mrvn> HeTo: on ARM the return address will only be on the stack if the return register gets clobbered, i.e. if you call other functions.

10:19 <clever> yeah, leaf vs non-leaf functions

10:19 <clever> but, is the return addr at a positive or negative offset from the framepointer?

10:20 <mrvn> and leaf functions might have an implicit stackframe with the red zone.

10:20 <mrvn> clever: hardware dependent

10:20 <clever> it feels more abi dependent to me?

10:20 <clever> whatever rule gcc set

10:20 <mrvn> clever: positive on x86 because CALL puts it on the stack before the function prolog saves the ebp

10:21 <clever> yep

10:21 <clever> but on arm, the prologue is responsible, and can do whatever it wants

10:21 <mrvn> theoretically the compiler you save the address of the return address or any other offset into the minimal stackframe. but normally you would just "push ebp"

10:22 <mrvn> s/compiler you/compiler could/

10:22 <clever> ok, r14 == linkreg, r15==pc

10:22 <mrvn> and then ebp = sp; sp -= size

10:22 <clever> 80000f1c: e92d4010 push {r4, r14}

10:22 <clever> 80000f38: e8bd8010 pop {r4, r15}

10:22 <clever> this is a non-leaf function, its saving r4+lr, but then restoring into r4+pc

10:23 <mrvn> clever: that's a nice way to pop and ret in one go

10:23 <clever> yep

10:23 <clever> but i think this was built without frame pointers

10:23 <clever> so my answer is missing

10:23 <HeTo> also really confusing reading disassembly if you aren't used to it

10:23 <mrvn> r4 is the 4th argument register, so no frame pointer

10:23 <clever> its also not clear, if that pushes r4 then r14, or r14 then r4

10:24 <HeTo> when you're looking for some form of branch or return instruction, you don't expect "pop" to be one if you're not familiar with ARM

10:24 <clever> HeTo: its clearer when it says pop {r4,pc}

10:24 <clever> but objdump can decode r15 as either r15 or pc, and this disassembly went for the confusing option

10:25 <mrvn> clever: for PC that's true. for some other registers the number is clearer for code that doesn't use the register in a conventional way

10:25 <clever> mrvn: but there is a 3rd arch, where framepointers and stockholme will drive you mad!

10:26 <clever> on VPU, register+immediate-offset doesnt pack negative offsets well

10:26 <mrvn> Does aarch still have a pop {pc}?

10:26 <mrvn> aarch64

10:26 <clever> so framepointer + -123 would be expensive

10:26 <clever> and the author of the gcc port, decided to violate the framepointer rules some

10:26 <clever> and now the framepointer is total nonsense

10:26 <mrvn> So I guess you don't have a red-zone on the VPU?

10:26 <clever> ive not seen any sign of a redzone

10:27 <mrvn> why does it have a frame pointer at all?

10:27 <clever> probably just because gcc generates one by default

10:28 <mrvn> so make the no-framepointer option default for VPU

10:28 <clever> let me find an example...

10:28 <mrvn> no need

10:29 <clever> ah, seems framepointer is already off

10:29 <clever> 80002f32: a1 03 stm r6-r7,lr,(--sp)

10:29 <clever> 80002f34: 59 c0 7c cf add sp,sp,-4

10:29 <clever> 8000309a: 59 c0 44 cf add sp,sp,4

10:29 <clever> 8000309e: 21 03 ldm r6-r7,pc,(sp++)

10:29 <mrvn> optimizer fail

10:29 <clever> a non-leaf function, it pushes r6/r7/lr, and decrements sp by 4 for locals, then undoes it all at the end, restoring lr into pc

10:30 <clever> mrvn: where is the fail? i'm not seeing one immediately

10:30 <mrvn> clever: why doesn't it push an extra register?

10:31 <clever> ah, as-in, "save" r8, just to get the sp another 32bits lower?

10:31 <mrvn> yep.

10:31 <clever> that could work, for small stack frames

10:31 <clever> but there is a range limit on store-many

10:31 <clever> 800002a0: a9 02 stm r6-r15,(--sp)

10:31 <clever> 800002a2: c7 02 stm r16-r23,(--sp)

10:32 <mrvn> sure, there is a limit for it and at some point writing to the stack costs more time than the extra opcode to add to sp.

10:32 zaquest has quit [Remote host closed the connection]

10:32 <clever> yeah

10:32 <clever> which reminds me, this cpu is also dual-issue

10:32 <mrvn> but for 4 byte the extra opcode and writing a register should even out

10:32 <clever> not sure about this case of modifying sp back2back, but certain combinations of opcodes can run in the same clock cycle

10:33 <mrvn> the stm and add have a register dependency

10:33 xenos1984 has quit [Quit: Leaving.]

10:33 <clever> yeah

10:33 <clever> that complicates things, and it would have to get really clever to merge them

10:33 xenos1984 has joined #osdev

10:34 <mrvn> which is why I think storing an extra register would be better. Makes SP available for other use earlier.

10:34 <mrvn> and the opcode after the "add" might not use SP at all

10:34 <mrvn> e.g. "xor r0, r0, r0"

10:35 <clever> there are 4 opcodes after the add, leading to a branch+link

10:35 <clever> 80002f38: 00 e8 04 30 00 7e mov r0,0x7e003004

10:35 <clever> and the very first one, is a rather fat load 32bit immediate

10:35 <clever> 48bit opcodes, something arm just cant do

10:35 SGautam has joined #osdev

10:35 <clever> at the cost of decoder complexity, of course

10:36 <mrvn> nothing compared to m68k. Or x86 with its 15 byte opcode limit.

10:36 <clever> yeah, vpu maxes out at 80bits (10 byte) for its vector opcodes

10:37 <mrvn> I miss the "(--An)" from m68k. auto increment/decrement is such a useful thing when working with arrays or strings.

10:37 <clever> the syntax of vpu asm implies it can do the same

10:37 <clever> but i think ive tried using it, and it actually cant

10:37 <mrvn> on x86 I mean

10:38 <clever> it only works on the stack pointer, and only in one direction

10:38 <clever> so store can only decrement sp, and load can only increment sp

10:38 <clever> its just being verbose about what its doing

10:39 <clever> ghidra has an abnormally good opcode decoder, where it clearly explains every bitfield in the opcode

10:40 <mrvn> should have just said: push/pop

10:40 RAMIII has joined #osdev

10:40 <mrvn> I like the way on ARM how you can increment/decrement and toggle write-back of the result.

10:41 <clever> with that, i can see that `stm r6-r7, lr, (--sp)` has 3 operands encoded into it, r6 is a 2bit value of 1, r7 is a 5bit value of 1, and lr is a 12bit value of 101100000011

10:41 <clever> and then `stm r6-r10, lr, (--sp)` has a 100 (4 decimal) in the r10 slot

10:42 <mrvn> it's not a bitset of the registers?

10:42 <clever> so, r7=1, r8=2, r9=3, r10=4

10:42 <clever> i think its 2 ints, for a start and end register

10:42 <clever> hence the r6-r10 syntax

10:42 <mrvn> you don't always have just one range.

10:43 <clever> in that case, you use multiple stm's

10:43 <mrvn> I believe on ARM the stm just has a bitset.

10:43 dennis95 has joined #osdev

10:43 <mrvn> r0-r4 is just syntactic sugar for r0, r1, r2, r3, r4
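The bitset view of the 32-bit ARM encoding can be checked against the two opcodes quoted earlier in the log (`e92d4010` and `e8bd8010`). A minimal C++ sketch, assuming only that the low 16 bits of these instructions form a register list with bit N standing for rN; the function name `reg_list` is made up for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Decode the low 16 bits of an ARM (A32) load/store-multiple instruction
// as a register list: bit N set means rN is part of the transfer.
std::string reg_list(std::uint32_t insn) {
    std::string out = "{";
    for (unsigned r = 0; r < 16; ++r) {
        if (insn & (1u << r)) {
            if (out.size() > 1) out += ", ";
            out += "r" + std::to_string(r);
        }
    }
    return out + "}";
}
```

For `0xe92d4010` the low half is `0x4010` (bits 4 and 14), and for `0xe8bd8010` it is `0x8010` (bits 4 and 15), matching the `push {r4, r14}` and `pop {r4, r15}` disassembly. Since the list is a set, the instruction cannot express an ordering: the lowest-numbered register always goes to the lowest address.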

10:44 <clever> in this case, there are 32 registers, r0 thru r31, some of them having special names like sp/pc/lr, just like arm

10:44 <clever> so you would need 32bits just to allow specifying every reg

10:44 <mrvn> indeed

10:44 <mrvn> or 10 bits for start/end of a range.

10:45 <clever> vpu complicates things, by only allowing 2bits for the start, and i think its more of an enum

10:46 <mrvn> There are probably some registers you stm far more often than others. Maybe it's logarithmic too: start at 0, 1, 2, 4

10:46 <clever> the first field is a 2bit int, where 1==r6, 0==r0, not finding other examples yet

10:46 <clever> i believe that is why the abi says that r6 and up are the preserved regs

10:46 <clever> and r0-r5 are clobbered

10:47 <clever> because `stm r6-r??, lr` is cheaper to encode

10:47 <mrvn> or just a "looks random" lookup table like 0==r0, 1==r6, 2==r9, 3==lr

10:47 <clever> yeah

10:47 <clever> the designer picked some random values, to suit an ABI

10:48 <mrvn> Clear sign of the CPU designers having some calling convention in mind and the STM is meant to save the clobber registers.

10:48 <clever> exactly

10:49 <mrvn> 0 == big function saving everything, 1 == normal function just saving clobbers, 2 == small function, 3 == leaf function

10:49 <clever> searching thru an example binary, i can see 3 forms of stm

10:49 <clever> a: just 1 register, is not many!

10:49 <clever> b: r0-r??, or r6-r??, storing just a range

10:49 <clever> c: r0-r??,lr or r6-r??,lr storing a range plus lr

10:50 <mrvn> ahh, start == 3 might mean just the end register, no range.

10:50 <clever> oh, and a 4th form

10:50 <clever> stm lr, (--lr)

10:51 <clever> again, its not many, but the range has been omitted, its now just lr!

10:51 zaquest has joined #osdev

10:51 <clever> oh, theres an odd decoding, but this looks like garbage binary data

10:51 <clever> stm gp-r12, (--sp)

10:52 <clever> where gp is an alias of r24

10:52 <clever> that makes no sense at all, the range is backwards

10:53 <clever> which makes sense! :P, this doesnt look like vpu asm, its some other form of binary data

10:54 <mrvn> Could that be r24-r31,r0-r12?

10:55 <clever> let me throw together some asm to brute-force it

10:55 <mrvn> have fun

10:56 <clever> mrvn: https://gist.github.com/cleverca22/9f0e424c01709c103836ed99d260a788 left-over code from my last brute-forcing session

10:56 <bslsk05> gist.github.com: gist:9f0e424c01709c103836ed99d260a788 · GitHub

10:57 <clever> 0x1234 encodes as a 16bit immediate tacked onto a 16bit opcode

10:57 <clever> but 0x12345 encodes as a 32bit immediate on a 16bit opcode, now wasting 1 byte

10:57 <clever> while 0x1c and below are more complicated, sharing the 16bits between both opcode and immediate

10:58 <clever> and you can see how the encoding varies wildly, depending on both the destination register and the immediate size
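
The three size classes clever describes can be summarised in a small sketch. This is only a restatement of the observations above, not the real VPU encoder: the function name is hypothetical, the immediate is assumed to be unsigned, the 0x1c cutoff is the value reported in the log rather than a documented limit, and the dependence on the destination register is ignored.

```cpp
#include <cstdint>

// Hypothetical helper: total instruction length in bytes for one
// immediate-carrying instruction, following only the three cases seen
// in the log. ASSUMPTION: unsigned immediate, one fixed destination.
constexpr unsigned imm_insn_length(std::uint32_t imm) {
    if (imm <= 0x1c)   return 2; // opcode and immediate share one 16-bit word
    if (imm <= 0xffff) return 4; // 16-bit opcode + 16-bit immediate
    return 6;                    // 16-bit opcode + 32-bit immediate
}
```

Under these assumptions 0x12345 needs only 20 bits but occupies a 32-bit immediate slot, which is the wasted byte mentioned above.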

10:59 dennis95 has quit [Remote host closed the connection]

11:00 dennis95 has joined #osdev

11:00 terminalpusher has quit [Remote host closed the connection]

11:03 <clever> mrvn: went thru the entire range, for the single form (stm r?, (--sp)), it only supports 5 registers, r0, r6, r16, gp, and lr

11:03 <clever> ghidra claimed the first operand (for the range form) was 2 bits, 0-3, which would explain r0/r6/r16/gp, and lr is a special case ive seen elsewhere

11:05 <clever> and looking at the bytes i can confirm that, r0/r6/r16/gp have a 00, 01, 10, and 11 pattern in one of the bytes, and are otherwise identical

11:06 <clever> while the lr variant, is vastly different

11:12 sympt has quit [Read error: Connection reset by peer]

11:14 sympt has joined #osdev

11:16 <clever> mrvn: oh wow, at least at the binutils layer, r0-r1 all the way thru to r0-r31 encodes into something!

11:17 <clever> explains the 5bit int i saw in ghidra, 0-31

11:21 <clever> and a value of 0 for the end reg, is used for just r0, without a range

11:21 <clever> so its encoded more as r0-r0
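
A minimal sketch of how the two fields found so far might decode. The start table (r0, r6, r16, gp) comes from the brute-force result above; everything else is an assumption: the names are hypothetical, the bit positions inside the 16-bit opcode are left out because the log does not give them, and the 5-bit field is guessed to be an offset from the start register that wraps modulo 32.

```cpp
// Hypothetical decoded form of an stm register range.
struct StmRange {
    unsigned first; // first register number stored
    unsigned last;  // last register number stored
};

// 2-bit start selector -> register number; 24 is gp (alias of r24).
constexpr unsigned kStartRegs[4] = {0, 6, 16, 24};

// ASSUMPTION: `span` is an offset added to the start register and wrapped
// modulo 32. With start r0 this matches the observed r0-r0 .. r0-r31.
constexpr StmRange decode_stm_range(unsigned start_sel, unsigned span) {
    unsigned first = kStartRegs[start_sel & 0x3];
    unsigned last = (first + (span & 0x1f)) & 0x1f;
    return {first, last};
}
```

If the wrap-around guess holds, selector 3 with a span of 20 yields gp through r12, which would be consistent with mrvn's r24-r31,r0-r12 reading of the odd `stm gp-r12, (--sp)` decoding. That remains unverified.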

12:21 brynet has joined #osdev

12:33 gog has quit [Quit: byee]

12:37 gorgonical has quit [Quit: Client closed]

12:43 azul has joined #osdev

12:44 azul has quit [Quit: leaving]

13:09 Matt|home has joined #osdev

13:11 <mrvn> what about the other bits?

13:19 <clever> mrvn: https://gist.github.com/cleverca22/9f0e424c01709c103836ed99d260a788 updated

13:19 <bslsk05> gist.github.com: gist:9f0e424c01709c103836ed99d260a788 · GitHub

13:20 <clever> if i tell it to save r6-r6, it assembles fine, but then disassembles as just r6, no range

13:21 <clever> and keep in mind, not all 16bits of this can be used by this one opcode

13:21 <clever> there are other 16bit opcodes, and bigger opcodes that need the first 16bits to not look like a 16bit opcode

13:21 <clever> so some of those bits are just going to be constants

13:28 toluene has joined #osdev

13:51 <mrvn> When I use std::for_each(std::execution::par, std::begin(a), std::end(a), [&](auto x) { ... }); then what is creating threads? Or choosing how many threads?

13:52 <sbalmos> random uneducated guess is some automatic tie-in to the pthreads lib?

13:53 <mrvn> sbalmos: std::jthread uses pthread under the hood, yes.

13:53 blockhead has quit []

13:53 <mrvn> the question is what creates the std::(j)thread objects

13:55 <ddevault> I said I got message passing working reliably

13:55 <ddevault> then I expanded the test suite

13:55 <mrvn> And how would I do the same for my BigNum add/sub/mul/div/sqr/sqrt?

13:55 <mrvn> ddevault: that is usually how it goes: If it passes all tests then you didn't test enough.

13:56 <ddevault> to be fair, I knew my original statement was a qualified one

13:57 <kingoffrance> if it compiles on first attempt, be scared, be very scared

13:59 <sbalmos> void kmain() { start_reactor(); }

13:59 <sbalmos> whoops, missed the comment before start_reactor(). // Quaid

14:15 xenos1984 has quit [Quit: Leaving.]

14:19 FireFly has quit [Ping timeout: 260 seconds]

14:26 wand has joined #osdev

14:28 xenos1984 has joined #osdev

14:35 Matt|home has quit [Ping timeout: 268 seconds]

14:40 \Test_User has quit [Ping timeout: 240 seconds]

14:41 Likorn has joined #osdev

15:14 Santurysim is now known as Ermine

15:15 SGautam has quit [Quit: Connection closed for inactivity]

15:15 vdamewood has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]

15:28 <mjg_> hrmpf

15:28 <mjg_> does gcc provide a way to tag a struct or a pointer as misaligned?

15:29 \Test_User has joined #osdev

15:29 <mjg_> oh, there is an attribute aligned

15:31 <mrvn> mjg_: [[gnu::packed]], aligned can only increase alignment

15:31 <mrvn> you can pack and align but I don't think there is a way to mis-align but not pack.

15:32 <mrvn> on the other hand packed isn't recursive so you can double bag a struct

15:32 <mjg_> well let's see what's going to happen

15:33 <mrvn> pointers can't be mis-aligned at all, which I consider a bug in the packed extension

15:33 <mrvn> Tip: never ever pack something you access more than once. It's faster to copy it to a not-packed / aligned struct and work with that.
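
mrvn's tip can be sketched as follows, assuming GCC or clang (for `__attribute__((packed))`). Both struct layouts and the `unpack` name are made up for illustration: the packed struct is touched once to copy the fields out, and all later work happens on the naturally aligned copy.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical packed layout, e.g. as it appears in a buffer.
struct __attribute__((packed)) WireHeader {
    std::uint8_t  tag;
    std::uint32_t length; // at offset 1, so misaligned
    std::uint16_t flags;  // at offset 5
};

// Naturally aligned working copy of the same fields.
struct Header {
    std::uint32_t length;
    std::uint16_t flags;
    std::uint8_t  tag;
};

// Read the packed bytes once, then hand back an aligned struct.
inline Header unpack(const unsigned char* bytes) {
    WireHeader wire;
    std::memcpy(&wire, bytes, sizeof wire); // valid for any source alignment
    return Header{wire.length, wire.flags, wire.tag};
}
```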

15:35 <mjg_> how about pre-existing big codebase which sometimes traps on 32 bit arm

15:35 <mjg_> where playing whack-a-mole is a non-starter

15:36 <mjg_> i already tried memcpy, does not help me as the target gets modified later

15:36 <mjg_> so i would have to memcpy the change back

15:36 <mjg_> and make sure i caught all the cases
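
The copy-in/copy-back pattern mjg_ describes could be wrapped in one helper so the write-back cannot be forgotten at an individual call site. This is a sketch under assumptions: `modify_unaligned` is a hypothetical name, and it does nothing about the harder part of the problem, which is finding every access in an existing codebase.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: run fn on an aligned temporary, then copy the
// result back to the possibly misaligned location.
template <typename T, typename Fn>
void modify_unaligned(void* where, Fn&& fn) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy requires a trivially copyable type");
    T tmp;
    std::memcpy(&tmp, where, sizeof(T)); // copy in
    fn(tmp);                             // work on the aligned copy
    std::memcpy(where, &tmp, sizeof(T)); // copy the change back
}
```

A call would look like `modify_unaligned<std::uint32_t>(p, [](std::uint32_t& v) { v |= 1; });`.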

15:39 <mjg_> aand aligned(1) did not help, bummer but was worth giving it a shot

15:45 _xor has joined #osdev

15:49 gxt has quit [Ping timeout: 268 seconds]

15:50 vdamewood has joined #osdev

15:51 gxt has joined #osdev

15:55 Matt|home has joined #osdev

15:56 vinleod has joined #osdev

15:59 vdamewood has quit [Ping timeout: 248 seconds]

16:03 vinleod has quit [Quit: Life beckons]

16:09 terrorjack has quit [Quit: The Lounge - https://thelounge.chat]

16:21 Andrew is now known as GNU\Andrew

16:21 Matt|home has quit [Ping timeout: 248 seconds]

16:23 <mrvn> mjg_: as said aligned can only increase. iirc it's actually UB to try to lower it.

16:23 <mrvn> but packed should fix stuff, it's just horribly slow

16:24 <mrvn> can't you find out what traps? Did you align the stack right for double register loads?

16:25 <mjg_> the specific trap is now fixed, but i expect new cases to pop up here and there

16:25 <mjg_> well let me restate, clang 10 happened to generate code which did not trap (one byte loads for the struct)

16:26 <mjg_> but it would sometimes trap if the code changed

16:26 <mjg_> clang 11 and later always traps

16:26 <mjg_> in the specific place, which is now fixed

16:26 <mjg_> but i expect there will be more cases down the road and not easily fixable

16:26 <mjg_> having a simple hammer for them would be nice

16:29 bauen1 has quit [Ping timeout: 256 seconds]

16:31 <ddevault> aha, I think I know the problem

16:32 <ddevault> I bet both threads have the same IPC buffer because I was lazy

16:32 Likorn has quit [Quit: WeeChat 3.4.1]

16:33 Likorn has joined #osdev

16:40 Likorn has quit [Quit: WeeChat 3.4.1]

16:42 dennis95 has quit [Quit: Leaving]

16:49 andreas303 has quit [Quit: fBNC - https://bnc4free.com]

16:57 blockhead has joined #osdev

17:00 <mrvn> gcc and clang have different opinions about unaligned loads. iirc clang does it wrong and assumes the CPU doesn't trap on unaligned load.

17:12 andreas303 has joined #osdev

17:21 bauen1 has joined #osdev

17:37 mzxtuelkl has quit [Quit: Leaving]

17:52 X-Scale` has joined #osdev

17:54 X-Scale has quit [Ping timeout: 248 seconds]

17:54 X-Scale` is now known as X-Scale

18:20 gorgonical has joined #osdev

18:24 <gorgonical> Getting close to a successful build

18:24 <gorgonical> Step 1 nearly completed

18:30 terminalpusher has joined #osdev

18:32 dude12312414 has joined #osdev

18:46 scoobydoo_ has joined #osdev

18:47 scoobydoo has quit [Ping timeout: 240 seconds]

18:47 Likorn has joined #osdev

18:47 scoobydoo_ is now known as scoobydoo

19:13 terminalpusher has quit [Remote host closed the connection]

19:14 Likorn has quit [Quit: WeeChat 3.4.1]

19:43 <geist> mrvn: depends on the architecture

19:43 <geist> modern arm it's just assumed you have the allowed unaligned access bit set

19:43 <geist> you can argue that's a bad idea, but that ship sailed years ago

19:55 <mrvn> there should be an option for the compiler

19:56 mahmutov has joined #osdev

19:57 Likorn has joined #osdev

19:59 <geist> there is, though i think we went through all of this before

19:59 <geist> there's a switch that among other things disables the assumption that you can do unaligned accesses

19:59 <mrvn> iirc i read Android doesn't have the bit set

19:59 <geist> but i think it may only be arm64

20:00 <geist> of course because it generates shitty code

20:00 <geist> you use it for things like firmware before the mmu is brought up

20:00 <geist> *the point* of allowing unaligned accesses is it generates better code, and the architecture fully supports it and recommends it

20:01 <geist> you mean android doesn't have the 'allow unaligned accesses' bit? i seriously doubt that

20:01 <geist> but we have to be clear: do you mean arm32 or arm64?

20:01 <mrvn> arm32

20:01 <geist> i'm about 98% sure they have the bit set. i went through this fight years ago at a company that was a competitor to android, and the fact that android went ahead and set it sealed the deal

20:02 <mrvn> I think on some cpus the double register load/store still fails with unaligned addresses

20:02 mahmutov has quit [Ping timeout: 268 seconds]

20:02 <geist> you have to be very explicit about which cpus and which ones you're running your code on, android, etc

20:03 <geist> which versions. all the modern 'big' ones dont have the problem

20:03 <geist> i think some of the earlier embedded versions (armvN-m) do

20:03 <geist> and there *are* alignment of atomic issues you have to be aware of

20:04 <mrvn> anyway, my point was that gcc assumes the "allow unaligned access" bit is not set and clang assumes it's set.

20:04 mahmutov has joined #osdev

20:05 <geist> i dont think that's right

20:05 <mrvn> if you access a packed struct then one does byte-by-byte access and the other just loads registers.

20:09 <mrvn> probably depends on the compiler version too

20:09 <geist> but i did just double check: indeed, in armv7 at least there's a whole table of what can cause unaligned faults with SCTLR.A=0 (alignment checks disabled)

20:09 <geist> notably, atomics, load/store double, load/store multiple

20:10 <geist> armv8 seems to have seriously relaxed it to basically just load/store multiple (which isn't really supported in 64bit) and atomics

20:10 <geist> which is why i hadn't thought about it. unclear if that means an armv8 core in 32bit mode also has less traps

20:12 <geist> ah never mind, found the 32bit thing. its the same on armv8

20:13 <mrvn> https://godbolt.org/z/8Mo1a6joz I remembered it wrong. gcc screws up

20:13 <bslsk05> godbolt.org: Compiler Explorer

20:14 <geist> basically 64bit has very few unaligned restrictions, aside from atomics

20:14 <qookie> you can tell gcc to emulate unaligned accesses with aligned ones only with -mno-unaligned-access

20:14 <qookie> or have it fail on unaligned accesses with -mstrict-align

20:15 <geist> exactly. also looks like the behavior started with gcc 11

20:15 <qookie> but the latter is aarch64 specific it seems

20:15 <geist> yah was gonna say

20:15 dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]

20:16 <mrvn> Passing a packed struct by value produces odd code too: https://godbolt.org/z/j7ov168dz

20:16 <bslsk05> godbolt.org: Compiler Explorer

20:17 <mrvn> What is gcc thinking there?

20:18 mahmutov has quit [Ping timeout: 264 seconds]

20:19 <mrvn> doesn't even need packed. Why is gcc storing a struct passed as arg in regs to the stack just to ignore it?

20:19 mahmutov has joined #osdev

20:22 eroux has quit [Ping timeout: 248 seconds]

20:25 eroux has joined #osdev

20:25 ZipCPU_ has joined #osdev

20:27 ZipCPU has quit [Ping timeout: 260 seconds]

20:27 ZipCPU_ is now known as ZipCPU

20:28 <geist> yeah that's pretty odd

20:42 skipwich has quit [Ping timeout: 244 seconds]

20:42 ZipCPU has quit [Ping timeout: 256 seconds]

20:53 wxwisiasdf has joined #osdev

20:53 GeDaMo has quit [Quit: There is as yet insufficient data for a meaningful answer.]

20:58 cultpony has joined #osdev

21:04 <gorgonical> I have gotten a build all the way up to undefined symbol errors. Unfortunately there's 1000 errors so I've done something wrong evidently

21:15 Terlisimo has quit [Quit: Connection reset by beer]

21:19 Terlisimo has joined #osdev

21:45 gxt has quit [Remote host closed the connection]

21:47 gxt has joined #osdev

21:52 mahmutov has quit [Ping timeout: 264 seconds]

21:54 funac has joined #osdev

21:58 wxwisiasdf has quit [Ping timeout: 240 seconds]

22:12 gxt has quit [Remote host closed the connection]

22:13 gxt has joined #osdev

22:19 funac has left #osdev [#osdev]

22:22 gog has joined #osdev

22:23 FireFly has joined #osdev

22:27 opal has quit [Remote host closed the connection]

22:28 opal has joined #osdev

22:34 heat has joined #osdev

23:00 <heat> https://pbs.twimg.com/media/EWEqV_fWsAIv3tj?format=png&name=small

23:01 <psykose> cute selfie

23:05 <heat> I DO NOT USE FUCKING NIXOS

23:05 <heat> i use arch btw

23:05 <gog> how can you tell if somebody uses arch linux

23:06 <gog> they'll tell you

23:06 <gog> heh heh heh heh

23:07 <heat> sadly not anymore :(

23:07 <heat> now it's all about nixos

23:07 <mjg_> vegan arch crossfitter

23:07 <heat> you know what solves every problem ever? THE NIXOS PACKAGE MANAGER

23:08 * kingoffrance throws gog elephant meat for that joke

23:09 <FireFly> "the nixos package manager" sounds like a roundabout way to say "nix" :p

23:09 <gog> what is the nixos package manager

23:09 <gog> i'm not a fucken nerd so i wouldn't know

23:09 <heat> its the nixos package manager

23:10 <heat> you clearly can't read

23:10 Vercas9 has joined #osdev

23:10 Vercas has quit [Remote host closed the connection]

23:10 Vercas9 is now known as Vercas

23:11 <gog> why would i read

23:12 <heat> becuz you're a fucken nerd

23:14 <heat> screw it I'll shamelessly copy r/programmerhumor

23:14 <heat> i merged my first PR for my new job at cloudflare haha

23:14 <heat> so happy

23:14 <j`ey> congrats!

23:14 <j`ey> my first PR at my job was a single character change

23:15 <heat> didn't even test it lol just pushed it

23:15 <heat> hopefully nothing went wrong lol

23:15 <heat> (I like how you didn't get the joke, or maybe you're pretending you didn't get the joke and this is all some sort of meta joke)

23:15 <heat> what is joke

23:17 <j`ey> uh, i didnt get your joke ?_?

23:17 <heat> there was a huge outage yesterday

23:18 <j`ey> ohhhh lol

23:18 <heat> https://blog.cloudflare.com/cloudflare-outage-on-june-21-2022/

23:18 <bslsk05> blog.cloudflare.com <out of privacy tokens, poke puck>

23:18 <puck> oh ughhhhh

23:18 <psykose> lmao

23:18 <j`ey> brought back online and by 07:42 UTC

23:18 <j`ey> no wonder I didnt notice

23:23 srjek has joined #osdev

23:27 <heat> it brought down cloudflare on and near every big population center

23:28 <heat> i would never have noticed because there's a PoP right here in lisbon, which wasn't affected

23:29 <heat> this sort of stuff makes me very happy that I'm not a network engineer

23:30 <j`ey> are you on call though?

23:31 <heat> no

23:31 <heat> my team doesn't have anyone on call afaik

23:31 <j`ey> nice

23:31 <heat> we're no goddamn SREs

23:34 <heat> and for my real first PR, it was like a week and a half ago and nothing broke yet sooo

23:34 <heat> i'm clear

23:34 * heat knows on wood

23:35 * heat learns how to type and this time knocks on wood

23:35 * psykose emits knowledge towards the wood

23:36 <heat> i am one with the wood and the wood is with me

23:37 mrvn has quit [Ping timeout: 256 seconds]

23:40 * kingoffrance .oO( bruce lee spinning in his grave "i said water! be water, water!" )

23:42 <kingoffrance> well cremated, who knows, anyways, adios

23:42 * kingoffrance zzz