pretty_dumm_guy has quit [Ping timeout: 240 seconds]
Likorn has quit [Quit: WeeChat 3.4.1]
gog has quit [Ping timeout: 256 seconds]
<geist>
hmm, so FWIW since i moved the server board to a new case (and thus a new power supply and better cooling) it's been up for 3 and a half days
<geist>
but one variable i hadn't isolated before, back when it was failing more and more often, was whether i had let it cool off before running it some more
<geist>
so maybe something was slowly overheating somewhere, such that if i just hit the reset button it would take a few more days to heat back up to the failure point
\Test_User has joined #osdev
<sbalmos>
odd. what about maybe bad fan speed sensor in the PSU or case fan, causing the CPU to go into thermal protection?
gildasio has joined #osdev
liz has quit [Ping timeout: 240 seconds]
<geist>
or the PSU maybe
<geist>
well, we'll see. still got another 4 or 5 days till i get it to the usual first failure time (7 or 8 days)
<geist>
got a batch of noctua case fans. ❤️ noctua
<sbalmos>
yeah, I've had two PSUs go bad over my life. and it's always that wacky behavior shite, like it starts dropping a voltage on a rail /just enough/ to make for o_O behavior like that.
<geist>
yah and maybe it's a heating up thing, because if i test it cold it looks fine
<pounce>
my work just gave me a desktop with a hot swappable PSU, it's bonkers
<pounce>
also dual Xeon Gold processors, and 24 dimm slots
<pounce>
really want to do NUMA testing on it
<zid`>
can you requisition me a w-2125
sonny has joined #osdev
sonny has left #osdev [#osdev]
<moon-child>
zid`: I got a w-2123 for really cheap on craigslist
<moon-child>
just today
<moon-child>
then I found out sourcing compatible motherboards is a complete pain
<moon-child>
current plan is to get a used dell workstation mobo from ebay and hope it's compatible with a standard power supply
sham1_ has joined #osdev
sham1 has quit [Ping timeout: 246 seconds]
onering has joined #osdev
zhiayang_ has joined #osdev
Santurysim has joined #osdev
janemba has quit [Ping timeout: 246 seconds]
ThinkT510 has quit [Ping timeout: 246 seconds]
Ermine has quit [Ping timeout: 246 seconds]
Jari-- has quit [Ping timeout: 246 seconds]
Beato has quit [Ping timeout: 246 seconds]
zhiayang has quit [Ping timeout: 246 seconds]
zhiayang_ is now known as zhiayang
ThinkT510 has joined #osdev
janemba has joined #osdev
opal has quit [Ping timeout: 268 seconds]
brynet has quit [Ping timeout: 248 seconds]
opal has joined #osdev
dmh has quit [Quit: rip]
<mrvn>
it's 8 in the morning and still to hot to work
srjek has quit [Ping timeout: 255 seconds]
mzxtuelkl has joined #osdev
<moon-child>
remote?
<moon-child>
if so, wet shirt
<vdamewood>
I know people who wet their pants, but not their shirts.
<mrvn>
wet shirt only works if the humidity is low.
<moon-child>
I live in the pacific northwest. Works fine for me
<vdamewood>
The pacific northwet?
<vdamewood>
Dude. It's like midnight in the PNW.
<vdamewood>
What are you doing up so late?
<Mutabah>
Late? Midnight?
<\Test_User>
it's a nice 3:25 for me
<\Test_User>
3:25 am ofc
<Mutabah>
(To clarify - midnight isn't late)
<vdamewood>
Mutabah: Actually, you're right. Midnight is just a few hours before bedtime.
terminalpusher has joined #osdev
bauen1 has quit [Ping timeout: 240 seconds]
gog has joined #osdev
<Griwes>
I've almost convinced myself that I should just use llvm's libunwind for the time being and only come back to the idea of writing my own once I'm much further into the project, as a side thing instead of a blocker
<Griwes>
This is probably much healthier too, isn't it
<ddevault>
the correct way to unwind stacks is by using %rbp
<Griwes>
Only in languages where the only way to handle errors is to weave error handling around every line of the proper code
<ddevault>
exceptions are super dumb
* moon-child
grabs popcorn
<ddevault>
exceptions are longjmp as a good practice
<moon-child>
longjmp is greenspunned RETURN-FROM
* moon-child
grabs more popcorn
GeDaMo has joined #osdev
<ddevault>
in other news, got message passing working reliably
<klange>
congrats; envy is settling in as I see you deliver a multitude of projects
<ddevault>
ty, I get a lot of help
<mrvn>
ddevault: why would %rbp have any sensible value?
<moon-child>
sysv abi technically mandates that you maintain a frame pointer
<mrvn>
The advantage of exceptions is supposedly that you don't have to handle them everywhere, they will just magically propagate to where you catch them. But with RAII you have to handle them at every } anyway. So what actually is the point?
<mrvn>
moon-child: then it's a good thing I'm not doing sysv abi, nor even C.
<ddevault>
I don't like magic
<ddevault>
%rbp is the frame pointer
<ddevault>
you can define any ABI you like but I like frame pointers
<mrvn>
Basically in any language with scopes and destructors the exception can't be using longjmp, making it a bit pointless.
<moon-child>
'what actually is the point' two things. 1, it is handled automatically. Even comparing eg the way rust does it with c++, c++ is more modular, since I can call you and you can call me back, and I can catch in the outer stack frame an exception that I threw in the inner stack frame, and you don't have to care about it
<mrvn>
ddevault: frame pointers are too little information for stack unwinding
<ddevault>
not really
<moon-child>
2, in a language with tracing gc, the implementation doesn't need to spend nearly so much time with raii
<ddevault>
it doesn't deal with inlining, sure
<moon-child>
s/language/implementation/
<ddevault>
but a frame pointer plus, well, a frame, is enough to walk over stack frames
<mrvn>
moon-child: a language with GC doesn't have scopes and destructor. They are separated.
<moon-child>
it can. Why can't it?
<mrvn>
ddevault: if you know the frame layout you don't need the frame pointer, the SP will do the same
<moon-child>
the only distinction is that you don't have to use destructors to manage lifetimes of pointers to allocated memory
<ddevault>
yeah, but only if you know the frame layout
<ddevault>
which calls for DWARF or something like it
<ddevault>
much more complicated
<mrvn>
moon-child: exactly. you don't call destructors at the end of the scope so you can just longjmp
<moon-child>
you might have destructors for other things
<moon-child>
such as files or mutexes
<mrvn>
ddevault: you need the frame layout to unwind the stack and call all the destructors. That's my point.
<ddevault>
not necessarily
<ddevault>
hare does this by calling destructors before propagating errors
<ddevault>
well, s/destructors/defers/
<moon-child>
you can maintain a shadow stack. But that's just dwarf with extra steps
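The %rbp walk being debated here can be sketched in a few lines. This is a hedged sketch assuming the conventional x86-64 prologue (push %rbp; mov %rsp, %rbp), where the saved caller frame pointer lives at [rbp] and the return address just above it; the `Frame` struct and `walk_frames` name are illustrative, not any real unwinder's API.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical frame layout for the conventional x86-64 prologue:
// the saved caller %rbp is at [rbp], the return address at [rbp+8].
struct Frame {
    Frame*         prev; // saved caller frame pointer
    std::uintptr_t ret;  // return address pushed by the call instruction
};

// Walk the linked list of frames, writing return addresses into out[].
// Stops at a null frame pointer or after max entries; returns the count.
std::size_t walk_frames(const Frame* fp, std::uintptr_t* out, std::size_t max) {
    std::size_t n = 0;
    while (fp && n < max) {
        out[n++] = fp->ret;
        fp = fp->prev;
    }
    return n;
}
```

On a live stack you would seed this with `__builtin_frame_address(0)`; as noted in the channel, it only works if the compiler kept frame pointers (e.g. `-fno-omit-frame-pointer`) and it silently skips inlined frames.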
<mrvn>
ddevault: ==> no longjmp for the exception.
<ddevault>
again
<ddevault>
exceptions are bad
<mrvn>
that's an opinion
<ddevault>
aye
<moon-child>
error values are anti-modular. anti-modularity is a popular meme among the unix crowd, and I am not making a value judgement, but it is important to acknowledge the consequences of such a view
<ddevault>
not defining your constraints within the type system is reckless
<ddevault>
it's less modular, sure, but more reliable and predictable
<mrvn>
Why do you think exceptions are about error values? I think that's the first mistake. Why should exceptions be exceptional, and why only for errors?
<moon-child>
not defining your constraints within the type system is a _lot_ less reckless than permitting use-after-free
<mrvn>
Not having exceptions as part of a function's signature is the second mistake imho.
<moon-child>
also what mrvn said
<ddevault>
different trade-offs
bauen1 has joined #osdev
moon-child has left #osdev [#osdev]
<ddevault>
in any case, that would just make us both hypocrites :)
<mrvn>
I think exceptions should be much more like std::expected.
<mrvn>
So what if you have to check for errors on every level? All the cases that aren't better off just calling abort() are things you catch very quickly anyway. Make exceptions not exceptional.
<mrvn>
most of it you can hide behind syntactic sugar.
<ddevault>
agreed
<ddevault>
errors are just errors
<mrvn>
Something like not_found isn't even a real error. That can very much be the expected result.
<mrvn>
if (s.find(foo) == s.end()) who wants to write that instead of try s.find(foo) except not_found ?
<ddevault>
if (s.find(foo) is void), rather, but: me
xenos1984 has quit [Read error: Connection reset by peer]
<mrvn>
or in ocaml you have this nice syntax with pattern matching: match Map.find(map, key) with IntLit i -> ... | StringLit s -> ... | NotFound -> ....
<mrvn>
exceptions are just another case in the pattern matching.
Starfoxxes has quit [Ping timeout: 260 seconds]
<mrvn>
Note to self: port std::expected to my kernel stl.
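A freestanding kernel STL won't have C++23's &lt;expected&gt;, but the core of it is small. A hedged sketch of roughly the subset mrvn would need (names are illustrative and the real std::expected carries far more machinery):

```cpp
#include <cassert>
#include <utility>

// Minimal expected<T, E> sketch for a freestanding kernel STL.
// Holds either a value or an error, tagged by a bool; no exceptions needed.
// Note: the two converting constructors are ambiguous if T == E.
template <typename T, typename E>
class expected {
    bool ok_;
    union { T val_; E err_; };
public:
    expected(T v) : ok_(true), val_(std::move(v)) {}
    expected(E e) : ok_(false), err_(std::move(e)) {}
    ~expected() { if (ok_) val_.~T(); else err_.~E(); }
    explicit operator bool() const { return ok_; }
    T& value() { assert(ok_); return val_; }
    E& error() { assert(!ok_); return err_; }
};

enum class lookup_error { not_found };

// not_found modelled as an ordinary result rather than an exception.
expected<int, lookup_error> find_value(int key) {
    if (key == 42) return 7;
    return lookup_error::not_found;
}
```

This is the "exceptions are just another case" style from the OCaml example above, expressed in C++.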
Starfoxxes has joined #osdev
<Griwes>
"with raii exceptions are handled at every brace" is a nonsense take. It's just a mechanism that allows you to forget that any form of early return exists for the purposes of cleanup
<Griwes>
Anyway, I'm not planning to have a pointless conversation trying to convince people who have their opinions set and aren't interested in ever being convinced, so I'm afraid any popcorn grabbed for this will be wasted
xenos1984 has joined #osdev
<mrvn>
Griwes: it's not so much RAII but the destruction at end of scope
<mrvn>
Griwes: For me the problem is the requirement in languages like C++ that exceptions must have 0 cost unless you throw one. That makes throwing them usually very expensive and makes exceptions unsuitable for everything but the exceptional, usually stuff that aborts.
<ddevault>
getting two OS developers to agree on anything is an exercise in frustration
<ddevault>
building an OS is the ultimate exercise in NIH
<Griwes>
...exceptions are meant to be for the exceptional stuff? Almost like it's in the name!
<Griwes>
Hard disagree on the claim that it's then "usually stuff that aborts" though
<mrvn>
So we need a different name. What would you call something to do an early return that isn't exceptional?
<mrvn>
Griwes: ever written an app that caught bad_alloc and keeps going?
<Griwes>
Me? No. But I know people who have, with great success
<mrvn>
Not that bad_alloc even gets thrown in most cases with overcommit.
<mrvn>
Griwes: so what exceptions do you regularly catch and handle without having your program terminate eventually due to it?
<Griwes>
depends on the domain, though once again, "regularly" is a funny word to use for something explicitly exceptional
<Griwes>
the biggest boon of them is when you're writing a library and you aren't the one handling the error, they become transparent to anyone but whoever decides to catch them
<mrvn>
Griwes: but for that case the fact that exceptions are not part of a function's signature makes them rather bad.
<Griwes>
hard disagree
<Griwes>
it allows middleware libraries to ignore error handling entirely and get it transparently handled a layer above
<Griwes>
it's what makes them rather good
<Griwes>
anyway, that's as far into this discussion as I'll allow myself to be dragged in
<mrvn>
it makes it impossible for the compiler to see if exceptions are handled or not. If the exceptions thrown by a function change, maybe the middleware library should handle a new exception, but it just silently propagates and terminates the program, and you won't find out for years because it's exceptional and doesn't happen till then.
<mrvn>
I agree that it should be possible to pass exceptions along transparently. Like say "int foo() [throws everything LibBla::blub() throws]"
<kingoffrance>
"anti-modularity is a popular meme among the unix crowd, and I am not making a value judgement, but it is important to acknowledge the consequences of such a view" some philosophies, everything contains the seeds of its own destruction. for that, the consequences are surely that it eventually leads to modularity :D
<mrvn>
but it should also be possible to say "int foo() [throws A | B | C]" and give an error if it can throw anything else.
<mrvn>
what is anti-modularity?
<Griwes>
"throws a b c" has been tried and it sucked major donkey balls
<Griwes>
it's one of the really major pain points of java
<Griwes>
it's also bad because it applies function coloring
<kingoffrance>
sorry, was quoting "error values are anti-modular"
<mrvn>
Griwes: not really. They only tried: "turn everything but a b c into terminate()"
<Griwes>
though not as absurdly bad as making everything return std::expected, which is function coloring cubed
<Griwes>
mrvn, yeah, and other languages tried the other option which happens to be even worse
<Griwes>
anyway "terminate called because of an uncaught exception" is a fine thing to happen vOv
<mrvn>
Griwes: that isn't what "throws a b c" does.
<mrvn>
or rather it's missing the "fail to compile if there is a new exception d"
<mrvn>
terminate is about the worst thing to happen for a lib
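mrvn's "fail to compile if there is a new exception d" can be approximated today with error values in the type: put the error set in a std::variant and visit it without a catch-all, so adding a new alternative breaks the build at every handler. A sketch with illustrative names:

```cpp
#include <string>
#include <variant>

// The error set is part of the function's signature, machine-parsable.
struct io_error    { int errno_val; };
struct parse_error { int line; };
using config_result = std::variant<std::string, io_error, parse_error>;

config_result load_config(bool exists, bool valid) {
    if (!exists) return io_error{2};
    if (!valid)  return parse_error{14};
    return std::string("ok");
}

// Overload set for std::visit; deliberately no catch-all, so adding a
// new error type to config_result makes every caller fail to compile.
template <class... Fs> struct overloaded : Fs... { using Fs::operator()...; };
template <class... Fs> overloaded(Fs...) -> overloaded<Fs...>;

int classify(const config_result& r) {
    return std::visit(overloaded{
        [](const std::string&) { return 0; },
        [](const io_error&)    { return 1; },
        [](const parse_error&) { return 2; },
    }, r);
}
```

Unlike Java's checked exceptions, the check lives in ordinary overload resolution, at the cost of weaving the result type through every call (the "function coloring" Griwes objects to).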
RAMIII has joined #osdev
RAMIII has quit [Client Quit]
<mrvn>
'The “color” of a function is a metaphor for segmenting functions into two camps: async and normal functions.' How does that apply to "throws a b c"?
<Griwes>
anyway "throws a b c" is rightly dead and shall never be alive in C++ again and that's good
<mrvn>
Griwes: c++ throws was horrible
<Griwes>
mrvn, the color of a function is whether you can just simply call it and get a result or whether you need to handle it in its own special way. with checked exceptions, you *have* to handle everything it may throw, which means it has a color
<mrvn>
Griwes: you either have to handle it or declare that you throw it.
<Griwes>
throwing it means handling it
<mrvn>
but it in no way limits you what color of function you can call
<Griwes>
there must exist code that handles the color
<mrvn>
The point, generally, of exceptions is that there is no code to handle the exception, only the exception handler does that.
<Griwes>
whether it's a catch {} or an annotation propagating the thrown exception info, that's handling a color
<mrvn>
it's purely a compile time thing, no code generated.
<Griwes>
you're missing the point I'm making
<mrvn>
I'm not sure what point you want to make
<Griwes>
`void foo() throw (whatever bar() throws) { bar(); }` <- `throw (whatever bar() throws)` is a piece of code that handles the color
<mrvn>
it's a bit of source, no runtime component
<Griwes>
yes
<mrvn>
ok
<Griwes>
colors are about programming overhead, not runtime overhead
<mrvn>
I see that as no different than: int foo(float); that's a color too
<mrvn>
should we go back to implicit prototypes?
<Griwes>
sigh
<Griwes>
I have no interest in continuing a discussion that's not being made in good faith
<Griwes>
bye
<mrvn>
Griwes: do you agree that what a function throws is part of its contract?
<mrvn>
Because in my mind I'm just asking for the function's contract to be machine parsable.
* mrvn
like google images for "function coloring"
<clever>
mrvn: that's something i liked about java, where you had to formally declare what you can throw, and also what you're not catching that can be thrown from further down your call graph
<clever>
it made it trivial to know what exceptions you can expect, and need to either choose to handle or let pass on
<mrvn>
clever: did java fail if you didn't declare something or convert it into uncaught_exception or terminate?
<clever>
i think it was a compile time error only
<clever>
and some build systems didnt enable it
<mrvn>
that's what I want. I can see the complaint about it being too noisy. Do I really want to specify bad_alloc for every function that uses the heap? How many functions will your C++ code have that do not throw bad_alloc?
<clever>
i think there was a whitelist of exceptions it didnt care about, like divide by zero
<mrvn>
That's not really one you "throw"
<clever>
exactly
<clever>
but it can still be caught
<clever>
its more of the runtime throwing it for you, when you do something bad
<clever>
or the runtime not even checking, and converting the signal into an exception
<mrvn>
I can see that as an option: Some exceptions are global like bad_alloc and div_by_zero. I could also see a class define a list of exceptions that would then apply to all methods.
<clever>
there is also the question of should malloc ever return 0?
<clever>
maybe the process should just die instead?
<clever>
depends on the use-case
<mrvn>
clever: you can handle it if you have the need for it. So yes, it should.
<clever>
for large allocations, i can see that being valid
<clever>
but for tiny allocations, just printing an error with some frameworks needs heap space
<mrvn>
kind of should be an attribute in ELF so the kernel disables overcommit for binaries that handle malloc returning 0
<clever>
and if a tiny allocation fails, more are going to fail soon
<mrvn>
hehe, how do you allocate the bad_alloc exception when new fails? That needs new and that fails again.
<clever>
for a more embedded case (kernel or mcu), you're more likely to avoid touching the heap whenever possible
<clever>
exactly
<mrvn>
bad_alloc kind of needs to be pre-allocated somewhere so you can throw an existing address.
<Griwes>
<mrvn> hehe, how do you allocate the bad_alloc exception when new fails? That needs new and that fails again.
<bslsk05>
itanium-cxx-abi.github.io: C++ ABI for Itanium: Exception Handling
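The Itanium ABI answer that link points at is an emergency buffer: the runtime reserves space up front so __cxa_allocate_exception can still hand out a bad_alloc after the heap is gone. A toy version of the idea (sizes and names are illustrative, not the real libsupc++ implementation, which also handles thread contention):

```cpp
#include <cstddef>
#include <new>

// Toy emergency pool: a static buffer reserved at startup so an
// out-of-memory exception object never needs the (exhausted) heap.
alignas(std::max_align_t) static unsigned char emergency_pool[1024];
static std::size_t emergency_used = 0;

void* allocate_exception(std::size_t sz) {
    // Round up so successive carve-outs stay aligned.
    sz = (sz + alignof(std::max_align_t) - 1) & ~(alignof(std::max_align_t) - 1);
    if (void* p = ::operator new(sz, std::nothrow)) return p;
    if (emergency_used + sz <= sizeof emergency_pool) { // heap failed: fall back
        void* p = emergency_pool + emergency_used;
        emergency_used += sz;
        return p;
    }
    return nullptr; // truly out of options; real runtimes terminate here
}
```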
azul has joined #osdev
<mrvn>
Griwes: another of those things imposed by the "exceptions are exceptional" design.
<clever>
stack unwinding and frame pointers are another tricky thing
<clever>
my rough understanding of framepointers on x86/arm, is that the frame pointer forms a linked list
<clever>
where each frame pointer, points to the previous framepointer on the stack, which is at the "middle" of a stack frame
<clever>
positive offsets point to arguments (beyond those that fit in the first few regs), negative offsets point to local vars
<clever>
and a fixed offset from there, is the return addr, varying by platform
<mrvn>
clever: the frame pointer is the top of each frame while the SP is the bottom of the frame. And there is a defined way to get the previous frame pointer given a frame pointer.
<clever>
for x86, it would be a positive offset, because the call opcode pushes the return addr right below the args, and the prologue then saves the framepointer, and creates locals
<mrvn>
On m68k frame pointers use the "link/unlink" opcodes so they are handled in hardware.
<HeTo>
my rough understanding of frame pointers on x86 is that usually they don't exist. or at least you can't find the head of the list in a register. and I think it's the same for ARM too, actually (usually you can't get usable backtraces on ARM without debug symbols)
azul has quit [Quit: leaving]
<mrvn>
HeTo: the compiler can optimize them away and then perf doesn't work right.
<clever>
x86 for example, the stack can look like an array of: [ local1, local2, framepointer, returnaddr, arg1, arg2, arg3 ] (exact numbers will be wrong)
<mrvn>
There are options to force and eliminate all frame pointers in gcc/clang.
<clever>
because the caller first pushes all args to the stack, then runs the "call" opcode, that pushes the return addr
<clever>
and the first thing the prologue does, is push the old framepointer onto the stack, and copy sp->fp, to create a new stack frame
<clever>
and then sp -= $locals_size to make room for local1/local2
<HeTo>
mrvn: perf can alternatively use dwarf for the backtrace (not sure if it consults the symbols at runtime. I think it just saves a bunch of the stack at runtime, and maybe leaves interpreting that for the analysis?)
<mrvn>
clever: yes. the frame pointer is always pushed at a fixed offset and the frame pointer register is then updated.
<GeDaMo>
Frame pointers were necessary on 8086 because you couldn't do sp-relative addressing
<clever>
mrvn: for arm, the return address is slightly more complicated, because of the lr register, and it being on the stack is optional
<clever>
*looks*
<mrvn>
clever: what stack? ARM (hardware) doesn't have a stack.
<clever>
yeah, the hardware doesnt enforce one, but gcc still has one
<mrvn>
doesn't rust have no stack?
<mrvn>
or was that go?
<clever>
haskell's stack is a linked list on the heap, rather than the traditional stack
<HeTo>
clever: I think the return address will be on the stack if you have a stack frame. leaf functions that don't use much stack might not have their own stack frame though
<clever>
HeTo: i'm checking some disassembly to confirm things there
<mrvn>
HeTo: on ARM the return address will only be on the stack if the return register gets clobbered, i.e. if you call other functions.
<clever>
yeah, leaf vs non-leaf functions
<clever>
but, is the return addr at a positive or negative offset from the framepointer?
<mrvn>
and leaf functions might have an implicit stackframe with the red zone.
<mrvn>
clever: hardware dependent
<clever>
it feels more abi dependent to me?
<clever>
whatever rule gcc set
<mrvn>
clever: positive on x86 because CALL puts it on the stack before the function prolog saves the ebp
<clever>
yep
<clever>
but on arm, the prologue is responsible, and can do whatever it wants
<mrvn>
theoretically the compiler could save the address of the return address or any other offset into the minimal stackframe. but normally you would just "push ebp"
<clever>
ok, r14 == linkreg, r15==pc
<mrvn>
and then ebp = sp; sp -= size
<clever>
80000f1c: e92d4010 push {r4, r14}
<clever>
80000f38: e8bd8010 pop {r4, r15}
<clever>
this is a non-leaf function, its saving r4+lr, but then restoring into r4+pc
<mrvn>
clever: that's a nice way to pop and ret in one go
<clever>
yep
<clever>
but i think this was built without frame pointers
<clever>
so my answer is missing
<HeTo>
also really confusing reading disassembly if you aren't used to it
<mrvn>
r4 is the 4th argument register, so no frame pointer
<clever>
its also not clear, if that pushes r4 then r14, or r14 then r4
<HeTo>
when you're looking for some form of branch or return instruction, you don't expect "pop" to be one if you're not familiar with ARM
<clever>
HeTo: its clearer when it says pop {r4,pc}
<clever>
but objdump can decode r15 as either r15 or pc, and this disassembly went for the confusing option
<mrvn>
clever: for PC that's true. for some other registers the number is clearer for code that doesn't use the register in a conventional way
<clever>
mrvn: but there is a 3rd arch, where framepointers and Stockholm syndrome will drive you mad!
<clever>
on VPU, register+immediate-offset doesnt pack negative offsets well
<mrvn>
Does aarch still have a pop {pc}?
<mrvn>
aarch64
<clever>
so framepointer + -123 would be expensive
<clever>
and the author of the gcc port, decided to violate the framepointer rules some
<clever>
and now the framepointer is total nonsense
<mrvn>
So I guess you don't have a red-zone on the VPU?
<clever>
ive not seen any sign of a redzone
<mrvn>
why does it have a frame pointer at all?
<clever>
probably just because gcc generates one by default
<mrvn>
so make the no-framepointer option default for VPU
<clever>
let me find an example...
<mrvn>
no need
<clever>
ah, seems framepointer is already off
<clever>
80002f32: a1 03 stm r6-r7,lr,(--sp)
<clever>
80002f34: 59 c0 7c cf add sp,sp,-4
<clever>
8000309a: 59 c0 44 cf add sp,sp,4
<clever>
8000309e: 21 03 ldm r6-r7,pc,(sp++)
<mrvn>
optimizer fail
<clever>
a non-leaf function, it pushes r6/r7/lr, and decrements sp by 4 for locals, then undoes it all at the end, restoring lr into pc
<clever>
mrvn: where is the fail? i'm not seeing one immediately
<mrvn>
clever: why doesn't it push an extra register?
<clever>
ah, as-in, "save" r8, just to get the sp another 32bits lower?
<mrvn>
yep.
<clever>
that could work, for small stack frames
<clever>
but there is a range limit on store-many
<clever>
800002a0: a9 02 stm r6-r15,(--sp)
<clever>
800002a2: c7 02 stm r16-r23,(--sp)
<mrvn>
sure, there is a limit for it and at some point writing to the stack costs more time than the extra opcode to add to sp.
zaquest has quit [Remote host closed the connection]
<clever>
yeah
<clever>
which reminds me, this cpu is also dual-issue
<mrvn>
but for 4 byte the extra opcode and writing a register should even out
<clever>
not sure about this case of modifying sp back2back, but certain combinations of opcodes can run in the same clock cycle
<mrvn>
the stm an add have a register dependency
xenos1984 has quit [Quit: Leaving.]
<clever>
yeah
<clever>
that complicates things, and it would have to get really clever to merge them
xenos1984 has joined #osdev
<mrvn>
which is why I think storing an extra register would be better. Makes SP available for other use earlier.
<mrvn>
and the opcode after the "add" might not use SP at all
<mrvn>
e.g. "xor r0, r0, r0"
<clever>
there are 4 opcodes after the add, leading to a branch+link
<clever>
80002f38: 00 e8 04 30 00 7e mov r0,0x7e003004
<clever>
and the very first one, is a rather fat load 32bit immediate
<clever>
48bit opcodes, something arm just cant do
SGautam has joined #osdev
<clever>
at the cost of decoder complexity, of course
<mrvn>
nothing compared to m68k. Or x86 with its 15 byte opcode limit.
<clever>
yeah, vpu maxes out at 80bits (10 byte) for its vector opcodes
<mrvn>
I miss the "(--An)" from m68k. auto increment/decrement is such a useful thing when working with arrays or strings.
<clever>
the syntax of vpu asm implies it can do the same
<clever>
but i think ive tried using it, and it actually cant
<mrvn>
on x86 I mean
<clever>
it only works on the stack pointer, and only in one direction
<clever>
so store can only decrement sp, and load can only increment sp
<clever>
its just being verbose about what its doing
<clever>
ghidra has an abnormally good opcode decoder, where it clearly explains every bitfield in the opcode
<mrvn>
should have just said: push/pop
RAMIII has joined #osdev
<mrvn>
I like the way on ARM how you can increment/decrement and toggle write-back of the result.
<clever>
with that, i can see that `stm r6-r7, lr, (--sp)` has 3 operands encoded into it, r6 is a 2bit value of 1, r7 is a 5bit value of 1, and lr is a 12bit value of 101100000011
<clever>
and then `stm r6-r10, lr, (--sp)` has a 100 (4 decimal) in the r10 slot
<mrvn>
it's not a bitset of the registers?
<clever>
so, r7=1, r8=2, r9=3, r10=4
<clever>
i think its 2 ints, for a start and end register
<clever>
hence the r6-r10 syntax
<mrvn>
you don't always have just one range.
<clever>
in that case, you use multiple stm's
<mrvn>
I believe on ARM the stm just has a bitset.
dennis95 has joined #osdev
<mrvn>
r0-r4 is just syntactic sugar for r0, r1, r2, r3, r4
<clever>
in this case, there are 32 registers, r0 thru r31, some of them having special names like sp/pc/lr, just like arm
<clever>
so you would need 32bits just to allow specifying every reg
<mrvn>
indeed
<mrvn>
or 10 bits for start/end of a range.
<clever>
vpu complicates things, by only allowing 2bits for the start, and i think its more of an enum
<mrvn>
There are probably some register you stm far more often than others. Maybe it's logarithmical too: start at 0, 1, 2, 4
<clever>
the first field is a 2bit int, where 1==r6, 0==r0, not finding other examples yet
<clever>
i believe that is why the abi says that r6 and up are the preserved regs
<clever>
and r0-r5 are clobbered
<clever>
because `stm r6-r??, lr` is cheaper to encode
<mrvn>
or just a "looks random" lookup table like 0==r0, 1==r6, 2==r9, 3==lr
<clever>
yeah
<clever>
the designer picked some random values, to suit an ABI
<mrvn>
Clear sign of the CPU designers having some calling convention in mind and the STM is ment to save the clobber registers.
<clever>
exactly
<mrvn>
0 == big function saving everything, 1 == normal function just saving clobbers, 2 == small function, 3 == leaf function
<clever>
searching thru an example binary, i can see 3 forms of stm
<clever>
a: just 1 register, is not many!
<clever>
b: r0-r??, or r6-r??, storing just a range
<clever>
c: r0-r??,lr or r6-r??,lr storing a range plus lr
<mrvn>
ahh, start == 3 might mean just the end register, no range.
<clever>
oh, and a 4th form
<clever>
stm lr, (--lr)
<clever>
again, its not many, but the range has been omitted, its now just lr!
zaquest has joined #osdev
<clever>
oh, theres an odd decoding, but this looks like garbage binary data
<clever>
stm gp-r12, (--sp)
<clever>
where gp is an alias of r24
<clever>
that makes no sense at all, the range is backwards
<clever>
which makes sense! :P, this doesnt look like vpu asm, its some other form of binary data
<mrvn>
Could that be r24-r31,r0-r12?
<clever>
let me throw together some asm to brute-force it
<clever>
0x1234 encodes as a 16bit immediate tacked onto a 16bit opcode
<clever>
but 0x12345 encodes as a 32bit immediate on a 16bit opcode, now wasting 1 byte
<clever>
while 0x1c and below are more complicated, sharing the 16bits between both opcode and immediate
<clever>
and you can see how the encoding varies wildly, depending on both the destination register and the immediate size
dennis95 has quit [Remote host closed the connection]
dennis95 has joined #osdev
terminalpusher has quit [Remote host closed the connection]
<clever>
mrvn: went thru the entire range, for the single form (stm r?, (--sp)), it only supports 5 registers, r0, r6, r16, gp, and lr
<clever>
ghidra claimed the first operand (for the range form) was 2 bits, 0-3, which would explain r0/r6/r16/gp, and lr is a special case ive seen elsewhere
<clever>
and looking at the bytes i can confirm that, r0/r6/r16/gp have a 00, 01, 10, and 11 pattern in one of the bytes, and are otherwise identical
<clever>
while the lr variant, is vastly different
sympt has quit [Read error: Connection reset by peer]
sympt has joined #osdev
<clever>
mrvn: oh wow, at least at the binutils layer, r0-r1 all the way thru to r0-r31 encodes into something!
<clever>
explains the 5bit int i saw in ghidra, 0-31
<clever>
and a value of 0 for the end reg, is used for just r0, without a range
<clever>
if i tell it to save r6-r6, it assembles fine, but then disassembles as just r6, no range
<clever>
and keep in mind, not all 16bits of this can be used by this one opcode
<clever>
there are other 16bit opcodes, and bigger opcodes that need the first 16bits to not look like a 16bit opcode
<clever>
so some of those bits are just going to be constants
toluene has joined #osdev
<mrvn>
When I use std::for_each(std::execution::par, std::begin(a), std::end(a), [&](auto x) { ... }); then what is creating threads? Or choosing how many threads?
<sbalmos>
random uneducated guess is some automatic tie-in to the pthreads lib?
<mrvn>
sbalmos: std::jthread uses pthread under the hood, yes.
blockhead has quit []
<mrvn>
the question is what creates the std::(j)thread objects
<ddevault>
I said I got message passing working reliably
<ddevault>
then I expanded the test suite
<mrvn>
And how would I do the same for my BigNum add/sub/mul/div/sqr/sqrt?
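On the "who creates the threads" question: for libstdc++ the parallel execution policies delegate to a backend (TBB, typically), which picks the thread count for you. For the BigNum case, carry propagation makes add/sub hard to split naively, but limb-wise operations chunk cleanly across plain threads, roughly what the backend does underneath. A sketch with a made-up chunking policy, not what libstdc++/TBB actually chooses:

```cpp
#include <algorithm>
#include <cstdint>
#include <thread>
#include <vector>

// Chunked parallel apply over a limb vector, roughly what a parallel STL
// backend does for std::for_each(std::execution::par, ...): split the
// range across hardware threads and join them all at the end.
void parallel_apply(std::vector<std::uint64_t>& limbs,
                    void (*op)(std::uint64_t&)) {
    unsigned n = std::max(1u, std::thread::hardware_concurrency());
    std::size_t chunk = (limbs.size() + n - 1) / n;
    std::vector<std::thread> pool;
    for (std::size_t lo = 0; lo < limbs.size(); lo += chunk) {
        std::size_t hi = std::min(lo + chunk, limbs.size());
        pool.emplace_back([&limbs, op, lo, hi] {
            for (std::size_t i = lo; i < hi; ++i) op(limbs[i]);
        });
    }
    for (auto& t : pool) t.join();
}
```

Carry-propagating ops like add need an extra pass to fix up carries across chunk boundaries, which is part of why there is no drop-in parallel BigNum add.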
<mrvn>
ddevault: that is usually how it goes: If it passes all tests then you didn't test enough.
<ddevault>
to be fair, I knew my original statement was a qualified one
<kingoffrance>
if it compiles on first attempt, be scared, be very scared
<sbalmos>
void kmain() { start_reactor(); }
<sbalmos>
whoops, missed the comment before start_reactor(). // Quaid
xenos1984 has quit [Quit: Leaving.]
FireFly has quit [Ping timeout: 260 seconds]
wand has joined #osdev
xenos1984 has joined #osdev
Matt|home has quit [Ping timeout: 268 seconds]
\Test_User has quit [Ping timeout: 240 seconds]
Likorn has joined #osdev
Santurysim is now known as Ermine
SGautam has quit [Quit: Connection closed for inactivity]
vdamewood has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
<mjg_>
hrmpf
<mjg_>
does gcc provide a way to tag a struct or a pointer as misaligned?
\Test_User has joined #osdev
<mjg_>
oh, there is an attribute aligned
<mrvn>
mjg_: [[gnu::packed]], aligned can only increase alignment
<mrvn>
you can pack and align but I don't think there is a way to mis-align but not pack.
<mrvn>
on the other hand packed isn't recursive so you can double bag a struct
<mjg_>
well let's see what's going to happen
<mrvn>
pointers can't be mis-aligned at all, which I consider a bug in the packed extension
<mrvn>
Tip: never ever pack something you access more than once. It's faster to copy it to a not-packed / aligned struct and work with that.
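mrvn's tip in code form: declare the packed wire layout once, memcpy the bytes in, and work with a naturally aligned copy for repeated access. Assumes GCC/Clang's [[gnu::packed]]; the field names are illustrative. (It doesn't solve mjg_'s case, where the target gets modified later and would need copying back.)

```cpp
#include <cstdint>
#include <cstring>

// On-the-wire layout: misaligned fields, safe only for one-shot access.
struct [[gnu::packed]] wire_header {
    std::uint8_t  type;
    std::uint32_t length;   // at offset 1: misaligned
    std::uint16_t checksum; // at offset 5: misaligned
};
static_assert(sizeof(wire_header) == 7, "packed layout");

// Working copy: naturally aligned, cheap to access repeatedly.
struct header {
    std::uint8_t  type;
    std::uint32_t length;
    std::uint16_t checksum;
};

header unpack(const void* buf) {
    wire_header w;
    std::memcpy(&w, buf, sizeof w); // byte copy: no unaligned wide loads
    // The one-time field reads below are the byte-by-byte accesses gcc
    // generates for packed members; after this, everything is aligned.
    return header{w.type, w.length, w.checksum};
}
```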
<mjg_>
how about pre-existing big codebase which sometimes traps on 32 bit arm
<mjg_>
where playing whack-a-mole is a non-starter
<mjg_>
i already tried memcpy, does not help me as the target gets modified later
<mjg_>
so i would have to memcpy the change back
<mjg_>
and make sure i caught all the cases
<mjg_>
aand aligned(1) did not help, bummer but was worth giving it a shot
<mrvn>
gcc and clang have different opinions about unaligned loads. iirc clang does it wrong and assumes the CPU doesn't trap on unaligned load.
andreas303 has joined #osdev
bauen1 has joined #osdev
mzxtuelkl has quit [Quit: Leaving]
X-Scale` has joined #osdev
X-Scale has quit [Ping timeout: 248 seconds]
X-Scale` is now known as X-Scale
gorgonical has joined #osdev
<gorgonical>
Getting close to a successful build
<gorgonical>
Step 1 nearly completed
terminalpusher has joined #osdev
dude12312414 has joined #osdev
scoobydoo_ has joined #osdev
scoobydoo has quit [Ping timeout: 240 seconds]
Likorn has joined #osdev
scoobydoo_ is now known as scoobydoo
terminalpusher has quit [Remote host closed the connection]
Likorn has quit [Quit: WeeChat 3.4.1]
<geist>
mrvn: depends on the architecture
<geist>
modern arm it's just assumed you have the allowed unaligned access bit set
<geist>
you can argue that's a bad idea, but that ship sailed years ago
<mrvn>
there should be an option for the compiler
mahmutov has joined #osdev
Likorn has joined #osdev
<geist>
there is, though i think we went through all of this before
<geist>
there's a switch that among other things disables the assumption that you can do unaligned accesses
<mrvn>
iirc i read Android doesn't have the bit set
<geist>
but i think it may only be arm64
<geist>
of course because it generates shitty code
<geist>
you use it for things like firmware before the mmu is brought up
<geist>
*the point* of allowing unaligned accesses is it generates better code, and the architecture fully supports it and recommends it
<geist>
you mean android doesn't have the 'allow unaligned accesses' bit? i seriously doubt that
<geist>
but we have to be clear: do you mean arm32 or arm64?
<mrvn>
arm32
<geist>
i'm about 98% sure they have the bit set. i went through this fight years ago at a company that was a competitor to android, and the fact that android went ahead and set it sealed the deal
<mrvn>
I think on some cpus the double register load/store still fails with unaligned addresses
mahmutov has quit [Ping timeout: 268 seconds]
<geist>
you have to be very explicit about which cpus and which ones you're running your code on, android, etc
<geist>
which versions. all the modern 'big' ones dont have the problem
<geist>
i think some of the earlier embedded versions (armvN-m) does
<geist>
and there *are* alignment of atomic issues you have to be aware of
<mrvn>
anyway, my point was that gcc assumes the "allow unaligned access" bit is not set and clang assumes it's set.
mahmutov has joined #osdev
<geist>
i dont think that's right
<mrvn>
if you access a packed struct then one does byte-by-byte access and the other just loads registers.
<mrvn>
probably depends on the compiler version too
<geist>
but i did just double check: indeed, in armv7 at least there's a whole table of what can cause unaligned faults with SCTLR.A=0 (alignment checks disabled)