wlemuel has quit [Quit: Ping timeout (120 seconds)]
wlemuel has joined #osdev
<ddevault>
this serial port is really giving me a hard time
<ddevault>
cannot tell when the fifo is empty
<mrvn>
geist: testing beforehand makes the code slower, a result you can test for? All result values are valid so you would need some condition register somewhere. A div-by-zero exception makes the cpu designer do the work to catch bad code/data.
<mrvn>
I think if you have an overflow exception then a div-by-zero exception makes a lot of sense.
<mrvn>
If not then why are you asking for an exception for this one narrow case and ignoring all the other cases?
<moon-child>
most arches (x86 included) don't have overflow exceptions
<moon-child>
(except for floats)
<moon-child>
I do agree it makes the most sense to have either all or none
<mrvn>
floats have infty so there is no overflow
<nortti>
did MIPS have divide by zero exceptions?
<moon-child>
mrvn: no
<moon-child>
one of the floating-point exceptions is overflow
<moon-child>
the default behaviour on this exception is to produce infinity
<moon-child>
but you can configure other ways to handle it
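To make the default behaviour concrete: under IEEE 754 default exception handling, an overflowing floating-point operation raises the overflow flag and produces infinity instead of trapping. A minimal Rust sketch (the function name is illustrative, and Rust's standard library offers no way to reconfigure the trap behaviour, so only the default is shown):

```rust
/// Multiply the largest finite f64 by 2. With default IEEE 754 exception
/// handling the overflow is not trapped and the result is +infinity.
fn overflow_to_infinity() -> f64 {
    let big = f64::MAX;
    big * 2.0
}
```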
<mrvn>
I think (on x86 and similar old archs) the problem is that add has no signed/unsigned/carry mode. You would need 3 different adds to generate the right overflow exceptions.
<moon-child>
yeah, that is one issue
<mrvn>
There are apparently archs with two's-complement and overflow exception or the C/C++ standards wouldn't still say overflow is UB.
<moon-child>
no because they say unsigned overflow is not ub
<mrvn>
There is the "add + trapv" design that probably can fuse the two opcodes.
Jari-- has joined #osdev
<moon-child>
and an unsigned add is the same as a signed add if you're not handling overflow
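A small sketch of the point about signed and unsigned adds: the result bits are the same, and only the overflow condition depends on the interpretation, which is why a trapping add would need to know which one is meant. The helper name `add_both_ways` is made up for this illustration.

```rust
/// Add two 8-bit patterns once as signed and once as unsigned.
/// Returns (result bits, signed overflow, unsigned overflow i.e. carry).
fn add_both_ways(a: u8, b: u8) -> (u8, bool, bool) {
    let (signed_sum, signed_overflow) = (a as i8).overflowing_add(b as i8);
    let (unsigned_sum, carry) = a.overflowing_add(b);
    // Two's complement: both additions produce identical bits.
    debug_assert_eq!(signed_sum as u8, unsigned_sum);
    (unsigned_sum, signed_overflow, carry)
}
```

For example, `0x7F + 0x01` overflows as signed (127 + 1) but not as unsigned, while `0xFF + 0x01` overflows as unsigned (255 + 1) but not as signed (-1 + 1).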
<mrvn>
moon-child: yes, but signed overflow still is UB despite it being well defined what happens now that ints are two's-complement.
<mrvn>
==> some archs have overflow exceptions on signed add
<Griwes>
signed overflow is UB because it allows compilers to generate measurably better code
<moon-child>
mrvn: that doesn't follow
<moon-child>
Griwes: bull shit
<Griwes>
lol
<Griwes>
okay
<Jari-->
quitting smoking helps osdev apparently
<moon-child>
show me a codebase that sees a significant performance reduction from -fwrapv
<Griwes>
we've made the decision to keep it UB, because people have shown codegen differences between the two modes and they are significant
<moon-child>
show me the benchmarks
Goodbye_Vincent has joined #osdev
<mrvn>
Griwes: "we"? You are on the standards committee?
<moon-child>
I keep asking people to do this. No one can. Someone pointed me at a video of a talk by chandler carruth where he pointed at some codegen in bzip2 from signed vs unsigned integers and said 'look how bad it is! It's so bad!'
<moon-child>
I checked out the code and rebuilt it both ways. Performance was exactly the same
<Griwes>
mrvn, indeed
<moon-child>
and the routine in question was very questionably coded anyway
<mrvn>
Griwes: it's important to remember that different codegen doesn't always mean different speed
<Griwes>
moon-child, if a compiler engineer explaining how things work doesn't convince you, I won't waste my time looking for the data that was presented
<moon-child>
Griwes: I understand very well the sorts of transformations that a compiler can perform based on overflow being ub vs wrapping
<moon-child>
what I'm asking for is benchmarks
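For readers unfamiliar with the kind of transformation being referred to, here is a sketch of the semantics only; it says nothing about whether the difference matters for performance, which is the open question in this exchange. With wrapping arithmetic (what `-fwrapv` gives C), `x + 1 > x` is not always true, so a compiler cannot fold it to `true`; when overflow is excluded, it can. The function names are illustrative.

```rust
/// Wrapping semantics: false for i32::MAX, because MAX + 1 wraps to MIN.
fn successor_is_greater_wrapping(x: i32) -> bool {
    x.wrapping_add(1) > x
}

/// Overflow excluded: whenever the addition is defined at all, the
/// comparison is true, so it could be folded to a constant.
fn successor_is_greater_checked(x: i32) -> Option<bool> {
    x.checked_add(1).map(|y| y > x)
}
```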
<Jari-->
multiple cores but it shares same memory tables like paging etc. ?
<mrvn>
moon-child: I'm betting the difference looks big to the eye but is actually below the noise level.
<Jari-->
what do you see threads or cores ?
<moon-child>
if your 'optimisations' have negligible effect on performance, but a very real cost to security and programmer sanity, then they are in fact pessimisations
<mrvn>
Jari--: sockets
<Jari-->
mrvn so some automation is involved in multitasking à la cores and threading
<mrvn>
moon-child: huh? If you want security then you should ask for overflow exceptions.
<bslsk05>
lists.gnu.org: Re: changing "configure" to default to "gcc -g -O2 -fwrapv ..."
<mrvn>
I also don't buy into the "have to do extra bookkeeping". Every instruction can have an IRQ happen. That already needs tons of bookkeeping. What extra bookkeeping do you need to add overflow? That just changes the reason code for the exception.
d5k has quit [Quit: leaving]
<ddevault>
ayy there we go
<Griwes>
anyway, idk if the data presented back when wg21.link/p0907 was being discussed was ever publicly published or not, but there was data
<bslsk05>
redirect -> www.open-std.org: P0907R4: Signed Integers are Two’s Complement
<moon-child>
Griwes: thanks. Unfortunately, spec is not publicly available, so I have no way to replicate the numbers or examine further ...
<ddevault>
working serial driver
<moon-child>
mrvn: no
<dminuoso>
moon-child: I would suspect the real cost of overflow detection isn't as much bookkeeping, as it should be data dependencies that inhibit some Tomasulo parallelism.
<moon-child>
mrvn: if an irq happens, you take your time and throw it wherever you want
<moon-child>
mrvn: whereas if an add overflows, you have to throw the exception from that instruction in particular
<dminuoso>
(well that and any additional branching or conditional instructions, but these I consider granted)
<moon-child>
dminuoso: what dependencies do you think would be added?
<mrvn>
moon-child: add can throw a bus error or page fault too. Can't wait on that
<dminuoso>
moon-child: Im thinking MSR bits.
<moon-child>
mrvn: since when?
<moon-child>
if you're doing a memory access, that can fault, sure
<dminuoso>
moon-child: Unless MSRs tend to be allocated and renamed.
<mrvn>
moon-child: since you can add memory
<dminuoso>
(I dont know whether thats actually the case, but somehow I expect not to)
<moon-child>
mrvn: on x86, sure
<Griwes>
every instruction is doing a memory access
<mrvn>
Griwes: that's not what he means :)
<dminuoso>
Because MSRs very deeply influence the silicon, I somehow expect them to be hardwired and not allocated..
<moon-child>
most instances of load-op are split up into two uops
<Griwes>
mrvn, I know, I'm being facetious with that particular point
<moon-child>
in particular, instruction fetch is heavily queued
<Griwes>
ha, /efi\boot\bootaa64.efi, someone's been naughty with their slashes
<moon-child>
lol
<mrvn>
Griwes: I wonder how much of the compiler optimizations with signed integer overflow is because nobody looked into optimizing loops without that UB behavior.
<ddevault>
¯\_(ツ)_/¯
<moon-child>
windows actually lets you mix, right?
<Griwes>
yes, windows is bad at things like this
<Griwes>
but we can be better :P
<ddevault>
hm, this is crashing in setjmp
<ddevault>
ah
<Griwes>
mrvn, it's been a while since I've tried to figure out loop optimization conditions
bauen1 has quit [Ping timeout: 265 seconds]
<Griwes>
in a completely different topic. I think I want both async send/receive and sync call/receive/reply w/ scheduling context delegation for my main IPC primitive, but that makes me wonder if that means that I really want to have two separate kinds of kernel objects for those two or not
<moon-child>
i think the more usual strategy is you have async thingies you can either poll or wait for
<moon-child>
but it's the same object either way
<Griwes>
but that doesn't really answer the question that made me spend the last week or two considering a lot of options here, which is: how the heck do I deal with that in my IPC definition language and in the higher level abstraction library for those
<Griwes>
moon-child, I have not found a satisfying answer to "how do I do scheduling context delegation with async send/receive"
<Griwes>
for handling the async cases, I want to multiplex many channels onto few threads. for handling priority/timeslice/whatever else delegation, I want to have a server thread per channel, always ready to immediately be switched to if needed
<Griwes>
I have veered into considering having the kernel spawn threads within the server process a few times, but that way lies madness, because of nonsense like TLS and whatnot
<moon-child>
if the single server thread is already busy handling a message, it won't be ready to handle the next as it comes in
<moon-child>
why does the kernel care if a thread is handling one channel or many?
<moon-child>
re spawning threads: isn't that how fuchsia and windows handle signals?
<Griwes>
imagine I have a low priority server with 4 threads handling incoming messages
<Griwes>
none of them is currently blocked on a receive, because they are working on handling requests
<Griwes>
at the same time there's several high priority processes, and they are significantly starving the low prio server
<Griwes>
one of the high prio processes sends a message to that server, and blocks waiting for a reply
<Griwes>
how does the kernel know what threads to bump in priority, and when to lower their priority to what it was before?
* moon-child
nods
<moon-child>
priority inversion is a whole thing
<Griwes>
yes
<mrvn>
Griwes: all my IPC is async but you can tell the kernel to put you to sleep till something specific happens (e.g. a reply to the IPC you just sent)
<Griwes>
I've spent some time trying to figure out a way, but I just don't see how you can possibly have both priority delegation *and* multiplexing of the channels onto threads in the server
<moon-child>
because you don't know which thread is handling the request of the high-priority thread
<moon-child>
I see
<Griwes>
exactly - or worse, which threads *could* try to handle it, if there's none currently blocked in receive
<mrvn>
Griwes: attach the timecard to the request. Whatever thread picks up the request will book it to that requests time card.
<moon-child>
I feel like this is a problem that you solve at a higher level
<Griwes>
mrvn, doesn't work if none of the threads are currently blocked
<moon-child>
could add lower-level hooks to help out, sure
<Griwes>
because, due to priority inversion, they may never get to the call to receive
<mrvn>
Griwes: I assume requests aren't interruptible and don't take long.
<Griwes>
if there's a flavor of the mechanism where the client knows there's a backing thread elsewhere (and that there's exactly a single backing thread elsewhere I guess?), and doesn't share its handle to the channel with lower prio threads, then this could reasonably work
<Griwes>
but then the server would need to keep a thread dedicated to each channel of this flavor
<Griwes>
mrvn, but of course they can take long and are interruptible
<mrvn>
Griwes: if you have interruptible requests then you could just create a new thread
<Griwes>
like, read() is the immediate example that comes to mind, because it goes client -> vfs -> fs driver -> storage driver and back
<mrvn>
i.e. run in the calling threads struct thread.
<mrvn>
Griwes: client -> vfs -> fs driver -> storage driver are all async calls. so each stage goes back to processing IPCs all the time.
<Griwes>
but vfs could get preempted while traversing the mount table, fs could get preempted while traversing the directory tree cache
<mrvn>
Griwes: not for me. those are all uninterruptible and never take long.
<Griwes>
no matter how I slice it, it seems to me that you can either have async + multiplexing, or sync + priority delegation
<mrvn>
Might become a problem if you have a dir with billions of entries.
<Griwes>
...so your fs drivers can disable preemption?
<mrvn>
Griwes: no. it just never interrupts scanning a directory to process another IPC, which might add or delete a directory entry making the interrupted job invalid.
<Griwes>
oh, but that is not the problem
<Griwes>
the problem is that the server may get preempted, never get back to a blocking call that tells the kernel it's trying to handle IPC, and then get starved by high priority threads, which in turn means other high prio threads can get starved if they need a reply from the server
<mrvn>
Griwes: when a high prio request goes to a thread doing a low prio job (and being blocked due to that) it takes over the higher prio and wakes up.
<Griwes>
that only works if the low prio thread was blocked in receive when the request came in
<mrvn>
So the low prio job gets a little boost from the high prio request that's waiting in the queue. and then things clear up.
<mrvn>
No, that works exactly when the job isn't in receive but computing stuff.
<Griwes>
otherwise the kernel can't know what thread to bump (and as a separate issue, it doesn't know when to stop bumping it, really)
<Griwes>
mrvn, how?
<mrvn>
Griwes: the kernel picks a thread to process the request and that thread gets its prio raised.
<Griwes>
how does the scheduler tell "this thread will eventually call channel_receive to receive and handle that message"?
<mrvn>
In my case I have mailboxes that the IPC goes to. A multi-threaded queue would have a mailbox address with e.g. a round-robin scheduler between a bunch of mailboxes and whatever mailbox the request goes to that thread wakes up.
<Griwes>
are your mailboxes tied to specific threads?
<mrvn>
yes
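A rough sketch of the mailbox arrangement mrvn describes: each mailbox belongs to one thread, and a group address spreads incoming messages round-robin across its mailboxes, so the kernel always knows which thread a request landed on. All type and field names here are hypothetical, not taken from mrvn's kernel.

```rust
use std::collections::VecDeque;

/// One mailbox, tied to exactly one server thread.
struct Mailbox {
    thread_id: usize,
    queue: VecDeque<u32>,
}

/// A group address that distributes messages round-robin.
struct MailboxGroup {
    boxes: Vec<Mailbox>,
    next: usize,
}

impl MailboxGroup {
    fn new(threads: usize) -> Self {
        assert!(threads > 0, "a group needs at least one mailbox");
        let boxes = (0..threads)
            .map(|t| Mailbox { thread_id: t, queue: VecDeque::new() })
            .collect();
        MailboxGroup { boxes, next: 0 }
    }

    /// Queue `msg` and return the owning thread, i.e. the one to wake
    /// (and, in the discussion here, the one whose priority gets raised).
    fn send(&mut self, msg: u32) -> usize {
        let idx = self.next;
        self.next = (self.next + 1) % self.boxes.len();
        self.boxes[idx].queue.push_back(msg);
        self.boxes[idx].thread_id
    }
}
```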
<Griwes>
ah, that's the disconnect :)
<Griwes>
still, you can't guarantee a thread will call mailbox_receive
<mrvn>
The whole syscall system is just sending messages to mailboxes.
<Griwes>
if it's not yet blocked in mailbox_receive
<mrvn>
what else will the thread do?
<Griwes>
a malicious thread could receive from its mailbox once, and then do while (true) ;
<Griwes>
or, idk, mine bitcoin
<mrvn>
sure. and then you kill the driver author.
<Griwes>
doesn't help the user whose system just locked up, and also may be illegal in your jurisdiction (I am not a lawyer and this is not legal advice)
<mrvn>
There is a space in the US where there is no law. :)
<Griwes>
I mean I guess the thread could also actually receive the prio bumping message and *then* start mining bitcoin
<Griwes>
but that could be bounded by attaching a higher prio timeslice budget to the prio bump
<Griwes>
so while it's bad, it's not catastrophically bad?
<Griwes>
aaaaaaa, I just wanted to start doing IPC and deal with a proper scheduler design later (:
<Griwes>
but actually starting to prototype the higher level IPC library has just thrown me into a series of deep rabbit holes
<mrvn>
Griwes: why do you start by assuming threads will be stuck doing computations? Most drivers should just wake up for a fraction of a timeslice and go right back to sleep.
<Griwes>
I'm trying to design this so that you don't have to trust the other end of the IPC endpoint
<mrvn>
you can't. The other endpoint can just take your request and then mine bitcoins.
slidercrank has joined #osdev
<mrvn>
when it does it gets preempted but you are still stuck waiting for it.
<Griwes>
right, but as I said, that could be bounded by attaching a budget to the prio bump (and giving you a wakeup when a reply to the call doesn't come within that budget)
<mrvn>
How about this: When a thread gets a high prio request its prio gets raised. But only for a limited time, e.g. one time slice.
<mrvn>
exactly :)
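The idea both of them converge on, a priority bump bounded by a budget, might be sketched like this. It is a toy model under stated assumptions: higher numbers mean higher priority, the budget is counted in timer ticks, and `Thread`, `receive_request`, `tick` and `reply` are invented names rather than anyone's actual kernel API.

```rust
struct Thread {
    base_prio: u8,
    /// (boosted priority, remaining ticks of the boost), if any.
    boost: Option<(u8, u32)>,
}

impl Thread {
    fn new(base_prio: u8) -> Self {
        Thread { base_prio, boost: None }
    }

    fn effective_prio(&self) -> u8 {
        match self.boost {
            Some((prio, _)) => prio.max(self.base_prio),
            None => self.base_prio,
        }
    }

    /// A request from a client running at `client_prio` lands on this thread.
    fn receive_request(&mut self, client_prio: u8, budget_ticks: u32) {
        if budget_ticks > 0 && client_prio > self.effective_prio() {
            self.boost = Some((client_prio, budget_ticks));
        }
    }

    /// One timer tick while this thread runs. Returns true when the budget
    /// has just run out, which is where the client would get its wakeup.
    fn tick(&mut self) -> bool {
        match self.boost {
            Some((_, 1)) => {
                self.boost = None;
                true
            }
            Some((prio, left)) => {
                self.boost = Some((prio, left - 1));
                false
            }
            None => false,
        }
    }

    /// The reply was sent, so the boost is no longer needed.
    fn reply(&mut self) {
        self.boost = None;
    }
}
```

A server that takes the request and then mines bitcoin keeps the raised priority only until the budget expires, after which it falls back to its own priority.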
remexre has quit [Read error: Connection reset by peer]
<Griwes>
hmmmmmmm I guess that also works if the thread is not, in fact, actively blocked (if there's an association of the mailbox with a thread or a group of threads in the kernel)
<Griwes>
alright, need to sleep on that a few times, thanks for the food for thought
<nortti>
I think sleeping on it a few times would be quite bad for performance
<mrvn>
Griwes: if all IPC is async you get very little that blocks. All the computations between calls don't take long.
<Griwes>
nortti, that's the least of things causing bad performance of my brain, sooo
<mrvn>
And when they do you have to rewrite the code to check the mailbox periodically.
<Griwes>
I guess the recommendation can be "if you expect to take a bunch of actual timeslice on this, pass it off to another thread"
<Griwes>
and then the attempted murder charges just turn into "hey your server is written *really* badly, git gud"
<mrvn>
I haven't done it but I want my mailboxes to be more like linux urings. So you can check them from userspace very quickly without a syscall.
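What "check them from userspace without a syscall" could look like, as a single-producer/single-consumer ring sketch: the producer (the kernel, in this scenario) and the consumer share head and tail indices, and checking for mail is just two loads. This is not the io_uring layout; `MailboxRing`, `SLOTS` and the method names are assumptions for illustration.

```rust
use std::sync::atomic::{AtomicU32, AtomicUsize, Ordering};

const SLOTS: usize = 4; // must be a power of two

struct MailboxRing {
    head: AtomicUsize, // next slot the consumer reads
    tail: AtomicUsize, // next slot the producer writes
    slots: [AtomicU32; SLOTS],
}

impl MailboxRing {
    fn new() -> Self {
        MailboxRing {
            head: AtomicUsize::new(0),
            tail: AtomicUsize::new(0),
            slots: std::array::from_fn(|_| AtomicU32::new(0)),
        }
    }

    /// Producer side. Returns false when the ring is full.
    fn push(&self, msg: u32) -> bool {
        let tail = self.tail.load(Ordering::Relaxed);
        let head = self.head.load(Ordering::Acquire);
        if tail.wrapping_sub(head) == SLOTS {
            return false;
        }
        self.slots[tail % SLOTS].store(msg, Ordering::Relaxed);
        // Publish the slot only after it has been written.
        self.tail.store(tail.wrapping_add(1), Ordering::Release);
        true
    }

    /// Consumer side: the cheap check, no kernel entry needed.
    fn has_mail(&self) -> bool {
        self.head.load(Ordering::Relaxed) != self.tail.load(Ordering::Acquire)
    }

    /// Consumer side: take the oldest message, if any.
    fn pop(&self) -> Option<u32> {
        let head = self.head.load(Ordering::Relaxed);
        if head == self.tail.load(Ordering::Acquire) {
            return None;
        }
        let msg = self.slots[head % SLOTS].load(Ordering::Relaxed);
        self.head.store(head.wrapping_add(1), Ordering::Release);
        Some(msg)
    }
}
```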
<mrvn>
Yeah, you should have a maximum of #cores threads that do IPC and long computations should be offloaded to worker threads then.
<Griwes>
okay, I'm going to peace off now to try to get the puzzle pieces to fall into their place, thanks again :)
<mrvn>
Griwes: I think a lot if not most problems with priority inversions are more caused by locking than by threads not reading the request. If a low prio thread holds a lock then all other threads will be blocked by it. If you can avoid locks and maybe even threads and just do the work quickly you are better off.
pmaz has joined #osdev
bauen1 has joined #osdev
Brnocrist has quit [Ping timeout: 265 seconds]
Vercas6947 has quit [*.net *.split]
gildasio2 has quit [*.net *.split]
foudfou has quit [*.net *.split]
pharonix71 has quit [*.net *.split]
gabi-250 has quit [*.net *.split]
gxt__ has quit [*.net *.split]
Brnocrist has joined #osdev
gog has quit [Ping timeout: 250 seconds]
GeDaMo has joined #osdev
nyah has joined #osdev
bauen1 has quit [Ping timeout: 276 seconds]
bauen1 has joined #osdev
gog has joined #osdev
<moon-child>
whaaat
<moon-child>
why does freebsd have both 32- and 64-bit shared futexes, but only 32-bit process-local futexes
<bslsk05>
cgit.freebsd.org: src - FreeBSD source tree
<ddevault>
the sad realization that one of my ideas is blocked behind writing a USB stack
<immibis>
isn't this just a non issue? if a high priority thread decides to send a message to a bitcoin mining thread and block for a response, then it is now mining bitcoins at high priority because that's what you told it to do. It's no different than if the high priority thread mines bitcoins itself. If you don't like that then don't run bitcoin mining code.
<immibis>
maybe when calling untrusted code you have a flag that says don't raise the receiver's priority, but that doesn't really help because the high priority thread is still blocked
<immibis>
presumably it is high priority because you would like it to not be blocked
<ddevault>
is it possible to enumerate the available protocols in an EFI program
<moon-child>
not afaik. Why would you want to do that?
<ddevault>
just curious to know if GOP is around on this configuration
<ddevault>
but wanted this question answered in the form of "is it among this list of available things" rather than "is this specific thing available"
bauen1 has quit [Ping timeout: 246 seconds]
<ddevault>
oh good, I don't actually need to write a USB driver before I can try this fun thing
<klange>
You can iterate through all handles and then get the list of protocols they support; this is indirect, but should be equivalent - a protocol can only be supported insofar as there is a handle that supports it.
<ddevault>
klange: yeah, figured that one out already :)
<ddevault>
but I was saddened by this limitation, alas
<immibis>
many interfaces don't allow you to get a list of supported things because it's often abused by clients
<Ermine>
ddevault: what is that funny thing?
<immibis>
like how windows no longer lets you get the current windows version, only ask if it's version X or later, where X has to be one that was created before your development kit
<ddevault>
like what is my ultimate goal?
<ddevault>
I've got my kernel running on my phone
<ddevault>
I want to send an SMS message
<ddevault>
and I want to bring up the display and touchscreen, maybe do a fun little UI
<ddevault>
side project
<Ermine>
Woah
<ddevault>
I had thought the modem was wired up through USB but it's also connected directly to a UART so that's nice
<ddevault>
I'll probably need GOP to get the display up, not particularly interested in writing a GPU driver right now
<ddevault>
and for touchscreen I need i2c and then the rest is straightforward
<Ermine>
Is it pinephone?
<ddevault>
yeah
kof123 has quit [Ping timeout: 268 seconds]
<immibis>
what is GOP?
<ddevault>
graphics output protocol, gets you a framebuffer on EFI
<immibis>
wait pinephone? there's no EFI on pinephone
<zid>
republican graphs
<ddevault>
yes there is
<ddevault>
via u-boot
<immibis>
oh
<ddevault>
might also be able to use simple-framebuffer via the device tree, found some riggings for it in the u-boot source
<ddevault>
we'll see what works (if anything)
<immibis>
my own pinephone experimentation doesn't use u-boot so that's why
<immibis>
poor documentation and reverse engineering all the way
<ddevault>
EFI works OOTB if you jiggle it in the u-boot prompt over serial
<immibis>
still haven't got the screen up (although turning the backlight on is trivial and makes it LOOK like you got the screen up :) )
<ddevault>
setenv devtype mmc; setenv devnum 0; run scan_dev_for_efi to boot EFI from SD
<ddevault>
but I flashed tow-boot today which provides a menu
<immibis>
i should probably try getting the display up by removing everything else from a working bootloader and see what is different from my code
<Ermine>
ddevault: do you plan to implement modesetting?
<ddevault>
eventually, yes, presently, no
<Ermine>
Yeah, I meant eventually
<ddevault>
I have a lot of things I want to do before I get around to a proper GPU driver
<ddevault>
immediate priorities are allocating large pages, finishing malloc and implementing free, refactoring the process manager, then drivers: PCI, AHCI, maybe some kind of filesystem; for the pinephone i2c
<ddevault>
I guess I have to finish interrupts on ARM, w/e
<ddevault>
someone is working on virtio blk and someone else is working on RTC, might be nice to do some ethernet stuff once we have enough of a userspace environment to support programs which make use of it
pmaz has quit [Quit: Konversation terminated!]
goliath has quit [Quit: SIGSEGV]
bauen1 has joined #osdev
Burgundy has joined #osdev
<klange>
ugh, Netsurf's "nsfb" backend is so broken that porting Webkit will probably be less effort than trying to fix this up enough to be usable...
<klange>
I don't even know why I'm spending so much time on this, since Netsurf is abandonware now - has been for two years.
<klange>
Struggling to figure out why I couldn't log in to the forum... turns out you can't type 0, -, or + due to a missing jump after processing any of those keys as potentially being pressed with modifiers to zoom in/out.
<klange>
I also had a cursed experience in the redirect handler where I was getting extra garbage in the redirect URL, and I tried printing the pointer value for it and it _moved_ in a place it shouldn't
<klange>
cursed ub on a loop looking for nil bytes, it would appear; swapping for a malloc for a calloc magically fixed it - as did printing somewhere else, that's how you really know it's cursed
slidercrank has quit [Quit: Why not ask me about Sevastopol's safety protocols?]
wlemuel has quit [Ping timeout: 255 seconds]
wlemuel has joined #osdev
Ermine has quit [Remote host closed the connection]
Ermine has joined #osdev
bnchs has quit [Read error: Connection reset by peer]
<mjg>
moon-child: why me bro
<mjg>
moon-child: i suspect it had something to do with not providing 64 bit atomics for 32 bit archs
heat has joined #osdev
foudfou has joined #osdev
<heat>
mjg, is a branch per division pessimal?
<heat>
i would say its PESSIMAL but I need confirmation from the master thyself
pharonix71 has joined #osdev
gxt__ has joined #osdev
gabi-250 has joined #osdev
gildasio2 has joined #osdev
eroux has joined #osdev
bauen1 has quit [Ping timeout: 248 seconds]
<gog>
hi
<zid>
no
<gog>
yes
<zid>
oh
<lav>
hii
* gog
wink at lav
<zid>
flert
* lav
uwu
<heat>
warm greeetings
<Ermine>
hi gog, may I pet you
<gog>
yes
* Ermine
pets gog
<heat>
oh wow seems that that msvc thing is just a hardwired code sequence
<heat>
it cannot detect a div by zero constant, it always emits the branch
<heat>
but it can detect me doing if (b == 0) return 0; so something's off
<heat>
msvc is a weird compiler that does not have the Great Testing of a GNU toolchain
<zid>
it's too warm in here to have music playing
<GeDaMo>
I've got Winter again :|
* gog
prr
<sham1>
Delayed prr
<mrvn>
heat: does it matter? it gets an exception and never takes the branch.
vdamewood has joined #osdev
amine8 has joined #osdev
amine has quit [Ping timeout: 250 seconds]
amine8 is now known as amine
zxrom has quit [Remote host closed the connection]
zxrom has joined #osdev
Jari-- has quit [Remote host closed the connection]
slidercrank has joined #osdev
heat has quit [Remote host closed the connection]
heat has joined #osdev
<mjg>
heat: branching is normally worse than not branching, same goes for division
<mjg>
heat: how bout you show whatcha doin
<mjg>
heat: believe it or not, i do claim there is a thing such as "premature optimization". that is, not all branches are worth worrying about
<heat>
instead of having the ud2 in the middle of the codepath and just jnz'ing over it
<heat>
GCC/clang perfectly shows this pattern in normal codegen. i.e. all the ubsan handling paths get stuck at the tail of the function; when the hot codepaths detect UB they just branch away
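The layout heat describes (hot path falls straight through, failure handling lives out of line) can be requested in source as well. A minimal Rust sketch, with the caveat that `#[cold]` is only a hint and the compiler decides the final placement; the function names are made up.

```rust
/// Out-of-line failure path. `#[cold]` tells the optimizer that calls to
/// this function are unlikely, so it can be kept away from the hot code.
#[cold]
#[inline(never)]
fn divide_by_zero_trap() -> ! {
    panic!("attempt to divide by zero");
}

/// Hot path: one test, and a branch away only in the failure case.
fn divide(a: i32, b: i32) -> i32 {
    if b == 0 {
        divide_by_zero_trap();
    }
    // wrapping_div also defines i32::MIN / -1, which would otherwise overflow.
    a.wrapping_div(b)
}
```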
<mrvn>
isn't it highly CPU stepping specific what the branch predictor will do?
<heat>
versus... whatever the fuck msvc is doing here
<mrvn>
Also I wonder, if the cpu predicts the wrong path and hits the UD2 opcode does it then pause and speculatively execute the other side of the branch?
dutch has quit [Quit: WeeChat 3.8]
<mrvn>
or does the instruction decoding maybe fall through the UD2 so "test %rax, %rax; jnz .skip; ud2; idiv ..." will actually speculatively execute the idiv no matter what branch is taken?
<mrvn>
heat: I know what you mean. Generally you want the code path somewhere out of the way so it doesn't fill the instruction cache. But then you might need a larger jump and having the "ud2" opcode inline might be even shorter.
<mrvn>
s/code path/cold path/
<heat>
and because jumping over 4 bytes is expensive af
lav has quit [Ping timeout: 255 seconds]
<heat>
while it is indeed the hottest of paths
<mrvn>
.oO(And I really want a compiler hint that "||" should be evaluated non-lazily.)
<heat>
also fyi arm64 div by 0 = 0, which is why msvc does the hacky trap
<mrvn>
4 byte? ud2 is 0f 0b, 2 byte.
<heat>
even worse. but arm64 instructions are 4 bytes :)
<mrvn>
So for 2 byte cold path you add 2 "0f 84 R_X86_64_PC32" linker references for a total of 12 byte instead of a local jump.
<mrvn>
Does the linker manage to relax those jumps in a case like this?
<mrvn>
https://godbolt.org/z/zbYo8h4cc Why does x86_64 use "divide.cold" and ARM64 use ".L4"? Why is picking the name of a label architecture specific in gcc?
<bslsk05>
godbolt.org: Compiler Explorer
<GeDaMo>
Something to do with asm syntax?
<mrvn>
GeDaMo: it's using the same GNU as
<GeDaMo>
as is a collection of architecture-specific assemblers in a trenchcoat :P
<mrvn>
I don't think "divide.cold" is illegal on arm64.
<mrvn>
I love how ARM64 only needs one branch for the two tests because of "ccmp"
<mrvn>
Does risc-v have a ccmp too? Is that something modern designs have embraced?
<ddevault>
how does one "eagerly" evaluate ||, do you mean you want a non-short-circuiting version?
<mrvn>
ddevault: I want a non-lazy version, yes, a non-short-circuiting version. Like gcc does on ARM64
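For comparison, some languages let you spell the non-short-circuiting form directly. In Rust, `|` on `bool` always evaluates both operands, so the two tests can be combined and followed by a single branch. Whether that actually produces different machine code from `||` is up to the compiler, since both operands here are side-effect free. The function names are illustrative.

```rust
/// Short-circuiting: the second test only runs if the first one is false.
fn any_zero_lazy(a: i64, b: i64) -> bool {
    a == 0 || b == 0
}

/// Non-short-circuiting: both comparisons are always evaluated, then ORed.
fn any_zero_eager(a: i64, b: i64) -> bool {
    (a == 0) | (b == 0)
}
```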
<heat>
sweet!
<heat>
what was the bug?
<netbsduser`>
heat: i implemented nonblocking for recvmsg in my unix sockets, but completely forgot about it in the read() vnode op for sockets
vdamewood has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
<netbsduser`>
i added debug tracing around epoll (which is what my select()/poll() internally use) and noticed, oh, xorg stopped using epoll around the time it stopped doing anything further; no more connections were accepted from other x clients; what could be going on?
<netbsduser`>
well, right at the outset of this i added a function that iterates every thread and prints detailed information about why it's waiting (i obsessively add this info to every wait-for-objects call). so i do that and noticed xorg was waiting in sock_read()
<netbsduser`>
and when i went there to look i realised: by god, i've completely forgotten to implement nonblocking in unix socket read() - i did in recvmsg() because i had a problem earlier where x clients wouldn't even finish connecting
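Seen from the client side, the contract that was missing looks roughly like this: a `read()` on an empty non-blocking Unix socket must return immediately with a "would block" error instead of putting the caller to sleep. This sketch uses Rust's standard `UnixStream` on a Unix host and is not netbsduser`'s kernel code; the helper name is made up.

```rust
use std::io::{ErrorKind, Read};
use std::os::unix::net::UnixStream;

/// Read without blocking: Ok(None) means "nothing available right now".
fn read_nonblocking(sock: &mut UnixStream, buf: &mut [u8]) -> std::io::Result<Option<usize>> {
    match sock.read(buf) {
        Ok(n) => Ok(Some(n)),
        Err(e) if e.kind() == ErrorKind::WouldBlock => Ok(None),
        Err(e) => Err(e),
    }
}
```

The socket has to be switched to non-blocking mode first, for example with `set_nonblocking(true)`. A server such as Xorg relies on this to get back to its event loop.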
<netbsduser`>
after adding support, i was greeted with this wondrous sight
goliath has joined #osdev
lav has joined #osdev
<mrvn>
netbsduser`: it's always nice when things come together.
<netbsduser`>
mrvn: it really is. now is probably the time to work on cleaning up the various work i've done in pursuit of this
TkTech has quit [Ping timeout: 265 seconds]
wlemuel has quit [Ping timeout: 250 seconds]
wlemuel has joined #osdev
<geist>
mrvn: riscv doesn't have condition flags, so no
<geist>
re: the ccmp thing
<gog>
it's depessimized
* gog
dabs
* lav
bads
xenos1984 has quit [Ping timeout: 265 seconds]
dude12312414 has joined #osdev
xenos1984 has joined #osdev
stellarskylark has joined #osdev
Left_Turn has joined #osdev
slidercrank has quit [Ping timeout: 252 seconds]
gog has quit [Quit: Konversation terminated!]
stellarskylark has quit [Ping timeout: 252 seconds]
<sakasama>
Unintentional. I cancelled it but apparently the bridge doesn't support that.
<GeDaMo>
Ah :D
xenos1984 has joined #osdev
<zid>
what's a bsad though
innegatives has joined #osdev
<innegatives>
Is there any resource out there listing every 8086 instruction encoding or am I supposed to find out myself using an assembler?
<mrvn>
Geist: they could have ccmp r1, r2, r3 (, #imm) where r3 gets the result of the comparison ored together or replaced by the imm if they have the bits in the opcode.
<geist>
well i guess they do have the slt/etc style instructions, but you can’t have that format you suggested because there’s only slots in the instruction for 3 operands
<innegatives>
mrvn: that doesnt list the bit patterns
<geist>
sandpile.org has kinda what you want too
<geist>
if you’re interested in the formats
<geist>
but it’s not so much in a list form
<mrvn>
innegatives: it gives you the opcodes and encoding of the args. That should be enough to get the full bit sequence if you look up how the x86 encodes args in the first place.
<mrvn>
innegatives: you do have to know what e.g. a REX prefix means
<zid>
How hard/easy is an 8086 to get? I've never looked
<zid>
My guess doesn't help, I can foresee them either being unobtanium because significant + old, or absolutely everywhere because common at the time
<bslsk05>
'486 Breadboard Computer - Part 1' by FoxTech (00:17:36)
<zid>
486 is probably the sweet spot tbh
<zid>
modern enough to be fun (and not just an MCU with awkward pins) and not so big as to be dumb
<innegatives>
I'm making 8086 emulator in Rust, should I represent segment + offset address as u32 or should I look to use non-standard u20 type?
<nortti>
as in, the result of the computation?
<innegatives>
yes
<nortti>
I'd personally use a newtype on u32, which I could then attach convenience methods for memory access and 20-bit wrapping add on
<zid>
I can't see the u20 being useful, you'll use it for like.. one line?
<mrvn>
nortti: with or without A20 gate?
<zid>
most of your stuff will be case some_op: mem[cs<<16 | reg];
<zid>
rather than actually storing the value anywhere
<nortti>
mrvn: there's not A20 gate in an 8086 per se, it just has 20 address lines
<nortti>
+a
<zid>
s/cs/ds
<mrvn>
zid: s/|/+/ and mod 2^20
foudfou has quit [Ping timeout: 240 seconds]
<sakasama>
Wasn't it only times 16, so a 4 bit shift?
foudfou has joined #osdev
<nortti>
aye
<mrvn>
it's 16 bit base and the result is 20 bit. So <<4 sounds right
<zid>
err right my brain had "ds contains numbers like 8, those go on the end of the address" :P
<zid>
you need to <<4 and add
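Putting the corrections together (shift the segment left by 4, add the offset, wrap at 20 bits) with nortti's newtype suggestion gives something like the sketch below. `PhysAddr` and its methods are illustrative names, not part of innegatives' emulator.

```rust
/// A 20-bit 8086 physical address stored in a u32 newtype.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PhysAddr(u32);

impl PhysAddr {
    const MASK: u32 = 0xF_FFFF; // 20 address lines

    /// segment * 16 + offset, wrapped to 20 bits as on a real 8086.
    fn from_seg_off(segment: u16, offset: u16) -> Self {
        PhysAddr((((segment as u32) << 4) + offset as u32) & Self::MASK)
    }

    /// Advance the address, wrapping around at 1 MiB.
    fn wrapping_add(self, n: u32) -> Self {
        PhysAddr(self.0.wrapping_add(n) & Self::MASK)
    }
}
```

For example, `FFFF:0010` computes `0xFFFF0 + 0x10 = 0x100000`, which wraps to physical address 0.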
goliath has quit [Quit: SIGSEGV]
joe9 has quit [Quit: leaving]
Bitweasil- is now known as Bitweasil
danilogondolfo has quit [Quit: Leaving]
<geist>
yeah i was thinking of doing something kinda like that with a 68030. had started to work on a breakout board in kicad a while back, need to pick it up and continue
joe9 has joined #osdev
FreeFull has joined #osdev
elastic_dog has quit [Killed (calcium.libera.chat (Nickname regained by services))]
elastic_dog has joined #osdev
v28p has joined #osdev
<heat>
i need a pl011 driver
<zid>
what's a ploll
<heat>
arm serial port
<zid>
what's an 4rm 54r1a1 p0r7 though
goliath has joined #osdev
<geist>
it's not too bad
<geist>
pl011 that is
<geist>
iirc the PL stuff is basically something like 'peripheral library' and you'll see various PLNNN things
<geist>
but pl011 is *old*, hence the early number
<heat>
yeah seems simple, just a fancier 16550 with some shuffled regs
<geist>
yep
<heat>
what's primecell?
<geist>
oh okay, primecell. yeah that was the marketing name for their library of things
<geist>
not sure they even use it anymore, but it was what it was called at the time
<heat>
its in the device tree nodes' compatible too
<geist>
ie, you license the primecell pl011 for your project, and you get the verilog, etc
<heat>
i was wondering how you could ever say "i'm compatible with arm,primecell" and then have the driver understand what pl it's talking to
<heat>
i.e pl031 (RTC it seems) and pl011 both have arm,primecell
<geist>
ah yeah 031
<geist>
the 031 is super dumb. iirc it's just a second counter of unix time
<geist>
as simple as it gets
<heat>
i have an annoying problem where my early serial console (hacky ish stuff) gets registered as a console and then when initializing the proper driver, I don't replace the earlycon, so I get doubled output sometimes
slidercrank has quit [Ping timeout: 255 seconds]
<innegatives>
So in 8086, INC with mod/rm is 1111,1110, where the final 0 can be 1 to denote "16 bit operand" instead of "8 bit operand". Say mod/rm is in indirect addressing mode, which means we have to INC something in memory. If w == 0 (so 8 bit mode) does that mean having value of "1111,1111" at that memory location will make it "0000,0000" without incrementing MEM_ADDR + 1, whereas having w == 1 (16 bit mode) will make it "0000,0000" but also increment MEM_ADDR + 1?
<GeDaMo>
Yes
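A rough sketch of that behaviour, assuming a flat byte slice as emulator memory. The function inc_mem and its signature are made up for illustration, flag updates are left out, and wrap-around at the end of a segment is not handled:

```rust
/// Increment a memory operand in place.
/// `w == false`: 8-bit operand, only mem[addr] changes.
/// `w == true`: 16-bit little-endian operand at addr / addr + 1,
/// so a carry out of the low byte lands in mem[addr + 1].
/// (Hypothetical helper; flags such as ZF/OF/AF are not modelled.)
pub fn inc_mem(mem: &mut [u8], addr: usize, w: bool) {
    if w {
        let value = u16::from_le_bytes([mem[addr], mem[addr + 1]]).wrapping_add(1);
        let [lo, hi] = value.to_le_bytes();
        mem[addr] = lo;
        mem[addr + 1] = hi;
    } else {
        mem[addr] = mem[addr].wrapping_add(1);
    }
}
```

Starting from the bytes FF 00, the 8-bit form leaves 00 00 while the 16-bit form leaves 00 01, which matches the question as asked.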
GeDaMo has quit [Quit: That's it, you people have stood in my way long enough! I'm going to clown college!]
pharonix71 has quit [Remote host closed the connection]
<mjg>
like intel did not have tons of stupid errata
* mjg
f00fs some fentanyl
<heat>
90% of the arm errata I see are for the thunderx
<heat>
a *single* CPU, not 40 years of CPUs
<zid>
thunderx is a weird beastie to be fair
<zid>
like the first real attempt at a performant arm cpu
<zid>
with multiple cores
<mjg>
performant aint a word
<heat>
you can't even enable KPTI on it due to "icache corruption of kernel text"
* mjg
thus proclaims "native speakers" are overrated
<mjg>
heat: lol
<zid>
performant is a word.
<mjg>
heat: i got a lol armv7 where l2 prefetcher can corrupt it in smp setting
<zid>
-ant turns a word into an adjectival form, denoting attribution of a state
<mjg>
zid: do you also misuse "begging the question"? :X
<zid>
no
* mjg
presses X
<zid>
The only americanism retcon that I knowingly use is gotten
<zid>
which is very out of fashion in enGB compared to enUS, but I like conjugating it that way
dutch has quit [Quit: WeeChat 3.7.1]
bauen1 has joined #osdev
<geist>
heat: oh yeah the thunderx1 had some nice errata. x2 was a lot more solid
<zid>
mjg: By similar lexical rules, heat is a pissant
<heat>
" this triggers a known erratum on ThunderX, which does not tolerate non-global mappings that are executable at EL1, as this appears to result in I-cache corruption. "
dutch has joined #osdev
<heat>
😭😭😭
<zid>
I've never mapped anything non-global anyway so that's fine
<zid>
SMP is overrated
<heat>
how do you switch processes?
<heat>
toggle cr4.PGE? :v
<zid>
ofc
<zid>
(reload cr3)
<geist>
interesting, no non global X mappings
<mjg>
zid: bruh SMP is turbo crap
<heat>
reloading cr3 doesn't flush global mappings
<geist>
must be some way the icache tracks it
<geist>
or lack of, like it doesn't properly track it
<zid>
heat: Never give out the same address twice and it's fiine
<heat>
i think you need to toggle PGE off, reload cr3, toggle PGE back on, ez
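A toy model of why the CR3 reload alone is not enough, assuming PCIDs are disabled: a write to CR3 drops only non-global TLB entries, while changing CR4.PGE drops everything. ToyTlb and its methods are invented for illustration; this simulates the rule and does not touch real control registers:

```rust
/// One cached translation; `global` mirrors the G bit in the page table entry.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct TlbEntry {
    pub vaddr: u64,
    pub global: bool,
}

/// Hypothetical software model of a TLB, not a real hardware interface.
#[derive(Default)]
pub struct ToyTlb {
    entries: Vec<TlbEntry>,
}

impl ToyTlb {
    pub fn insert(&mut self, vaddr: u64, global: bool) {
        self.entries.push(TlbEntry { vaddr, global });
    }

    pub fn contains(&self, vaddr: u64) -> bool {
        self.entries.iter().any(|e| e.vaddr == vaddr)
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }

    /// mov to cr3 (PCID off): global entries survive.
    pub fn reload_cr3(&mut self) {
        self.entries.retain(|e| e.global);
    }

    /// Flipping CR4.PGE in either direction invalidates all entries.
    pub fn toggle_cr4_pge(&mut self) {
        self.entries.clear();
    }
}
```

In this model a global kernel mapping is still cached after reload_cr3() and only disappears after toggle_cr4_pge().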
<geist>
and/or you can use the new flush instructions
<zid>
or spam invlpg
<geist>
there's one combination that does a full flush + global
<geist>
invpcid
<geist>
but then it has to be present, etc etc
<heat>
i want to play with those :(
<geist>
yah finally landed the PCID stuff in zircon
<zid>
what does zen4 have
<heat>
no, not those, the amd ones
<zid>
you can use my qemu
<heat>
invlpgb
<geist>
yah those too
<heat>
and then tlbsync
<heat>
i'm very curious to see if there's actually a nice performance improvement or if it sucks ass
<zid>
I'm just mad I upgraded cpu and I still don't have FRED
<heat>
since, erm, no one seems to be using it
<zid>
despite it being announced years ago
<heat>
you'll never have FRED, it's a myth
<zid>
apparently
<heat>
enjoy descriptor tables and swapgs, reject FRED
<heat>
i can't hear you over the sound of arm64 banked registers
<heat>
even the interrupt controller's registers are banked
<zid>
That's because of the jet engine cooling fans it needs
<zid>
to hit 1ghz
<heat>
sick burn
<zid>
how many bogomips does your arm have
<heat>
none, it's arm not mips
<zid>
my 2011 cpu had 58000
<zid>
my current cpu also has 58000 :(
<zid>
how can I increment my SHA1s to find collisions if I still can only do 58 billion inc per second
<nortti>
how do you have a bogomips that high? mine is 4788.86
<heat>
cheating
<zid>
Easy, have 12 of those
<zid>
also your blck is drooping badly
<zid>
bclk
<heat>
noOOOOOOoooooo not my bclk
innegatives has quit [Quit: WeeChat 3.8]
gildasio2 has quit [Remote host closed the connection]
gildasio2 has joined #osdev
heat has quit [Read error: Connection reset by peer]
heat has joined #osdev
<heat>
geist, apparently there have been further versions of the pl031 that do more stuff
<heat>
although this is really ancient stuff. maybe they should scrap it for goldfish rtc
foudfou has quit [Quit: Bye]
foudfou has joined #osdev
<heat>
this thing is even y2038 vulnerable
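For reference, a small sketch of where a 32-bit seconds counter runs out, assuming the count is interpreted as Unix time. Whether the limit is 2038 or 2106 depends on whether software treats the value as signed or unsigned, which is an assumption here and not something stated in this log. The helper names are hypothetical and the date conversion is the usual days-to-civil-date arithmetic:

```rust
/// Convert days since 1970-01-01 to (year, month, day) in the
/// proleptic Gregorian calendar.
pub fn civil_from_days(days: i64) -> (i64, i64, i64) {
    let z = days + 719_468; // shift epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index, March = 0
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Convert Unix seconds to (year, month, day, hour, minute, second) in UTC.
pub fn utc_from_unix(secs: i64) -> (i64, i64, i64, i64, i64, i64) {
    let days = secs.div_euclid(86_400);
    let rem = secs.rem_euclid(86_400);
    let (y, m, d) = civil_from_days(days);
    (y, m, d, rem / 3_600, rem % 3_600 / 60, rem % 60)
}
```

utc_from_unix(i32::MAX as i64) gives 2038-01-19 03:14:07, the last second a signed 32-bit count can represent; utc_from_unix(u32::MAX as i64) gives 2106-02-07 06:28:15 for an unsigned one.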
wlemuel has quit [Read error: Connection reset by peer]
[itchyjunk] has joined #osdev
wlemuel has joined #osdev
Left_Turn has quit [Read error: Connection reset by peer]
xenos1984 has quit [Quit: Leaving.]
Brnocrist has quit [Ping timeout: 264 seconds]
dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]