<bslsk05>
github.com: menuetos/menuetos-kernel-sources-32b/STACK.INC at master · marcosptf/menuetos · GitHub
<mjg>
tcp fuckin' stack
<dostoyevsky2>
heat: isn't it a bit more straightforward for smaller OSes? You can call the bios directly, whereas in C you awkwardly need to avoid libc
<nikolapdp>
how do you need to avoid libc
<heat>
you can't call the bios
<heat>
usually
gog has quit [Quit: byee]
<Ermine>
Back In The Day there were people who wrote GUI windows programs in asm by calling win32 and were very proud of it
<heat>
was that person you?
<mjg>
i booted the livecd, it has a chess program, but no engine
<mjg>
bummer
<heat>
you're in luck
<heat>
onyx has a stockfish port
<heat>
Somewhere(tm)
* kof673
points at steve gibson
<kof673>
and randall hyde IIRC was the name of the person with "high level assembly"
<kof673>
i...don't really want to run, because the goal is run my own stuff, but i noted dreamcast linux lets you slip or similar over serial (i do not have the lan/broadband adapter) lol
<kof673>
that is not the software stack, but all you need for hardware lol
<kof673>
netbsd should too, but linux had a guide lol
<dostoyevsky2>
kof673: does it slip via the controller port?
<kof673>
no, it has serial. i believe parallel PLIP existed too
<kof673>
this is not surprising, just i am not that old so never saw them in use i guess ...
<kof673>
other systems controller ports are 2-way, so i do not see why not :D
<dostoyevsky2>
kof673: what was the serial port originally intended for on the dreamcast?
<kof673>
possibly connecting 2 systems, i dunno ;D "accessories"
<kof673>
i have no idea what the "dev systems" looked like if it was a leftover from that or what
<dostoyevsky2>
kof673: like a printer... so you can prove to your mates that you got that high score
<geist>
dreamcast also had a ethernet port accessory
<geist>
i hacked on it years ago, it was a rtl8139 with on board 32k iirc
<kof673>
yes, "broad band adapter" and another LAN one
netbsduser has quit [Ping timeout: 245 seconds]
<dostoyevsky2>
geist: and with a mozilla cd for dreamcast you could browse the webosphere
<geist>
yah i still have my old dreamcast with the BBA and keyboard and mouse
<geist>
hacked some code on it long ago
<heat>
okay im hacking some printk now
<heat>
lets go
<geist>
oh daaamn
<heat>
how hard can a ringbuffer be
<heat>
famous last words
<dostoyevsky2>
when I wrote my own OS based on OpenBSD I replaced all the printk()s with one character
<heat>
my strat is basically going to be: printk writes to a ringbuffer (with some structured-ish layout), messages are sequenced, consoles keep the last seq they saw
Arthuria has quit [Ping timeout: 245 seconds]
<heat>
printk tries to write to the console, if trylock fails whoever releases it next will update the whole thing
<heat>
if recursing (printk inside printk) or NMI, we write to a percpu buffer and flush it in a delayed manner
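A minimal sketch of the scheme heat describes, assuming a fixed-size array of sequenced records and a single console. Every name here (`log_store`, `console_flush`, `printk_sketch`, `struct console`) is hypothetical and not taken from Onyx. Only the console lock is atomic; the log store itself is not SMP-safe, and the per-CPU buffer for recursion and NMI is left out.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOG_SLOTS 4u   /* power of two: slot = seq & (LOG_SLOTS - 1) */
#define MSG_MAX   32u

struct log_record {
    uint64_t seq;
    char text[MSG_MAX];
};

static struct log_record log_buf[LOG_SLOTS];
static uint64_t log_next_seq;   /* sequence number the next message gets */

struct console {
    atomic_flag lock;
    uint64_t next_seq;   /* first sequence number not yet printed */
    char out[256];       /* stand-in for the real output device */
    size_t out_len;
};

/* Append one message to the log and give it the next sequence number. */
static void log_store(const char *msg)
{
    struct log_record *r = &log_buf[log_next_seq & (LOG_SLOTS - 1)];

    r->seq = log_next_seq;
    strncpy(r->text, msg, MSG_MAX - 1);
    r->text[MSG_MAX - 1] = '\0';
    log_next_seq++;
}

/* Pretend to write to the device by appending to con->out. */
static void console_emit(struct console *con, const char *s)
{
    size_t n = strlen(s);

    if (con->out_len + n < sizeof(con->out)) {
        memcpy(con->out + con->out_len, s, n + 1);
        con->out_len += n;
    }
}

/* Print everything the console has not seen yet, if the lock is free. */
static void console_flush(struct console *con)
{
    do {
        if (atomic_flag_test_and_set(&con->lock))
            return;   /* lock is held: the holder catches up on release */

        /* Skip records that were overwritten before we got to them. */
        if (log_next_seq - con->next_seq > LOG_SLOTS)
            con->next_seq = log_next_seq - LOG_SLOTS;

        while (con->next_seq != log_next_seq) {
            console_emit(con, log_buf[con->next_seq & (LOG_SLOTS - 1)].text);
            con->next_seq++;
        }
        atomic_flag_clear(&con->lock);
        /* Re-check: a message may have arrived while we held the lock. */
    } while (con->next_seq != log_next_seq);
}

static void printk_sketch(struct console *con, const char *msg)
{
    log_store(msg);
    console_flush(con);
}
```

The re-check after releasing the lock is what lets a failed trylock simply return: whoever held the lock either printed the new record already or will see it on the next loop iteration.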
<bslsk05>
five-embeddev.com: RISC-V Instruction Set Manual, Volume I: RISC-V User-Level ISA | Five EmbedDev
<Ermine>
> The 'virtio-blk' device has gained true multiqueue support
<Ermine>
heat: does it need any changes on onyx side?
<heat>
probably not
<heat>
i don't know what true multiqueue means here
<heat>
did it expose queues that weren't really queues? maybe? i don't know, never looked
<geist>
i assume it lets you have multiple ring buffers, a-la nvme
netbsduser has quit [Ping timeout: 245 seconds]
<geist>
but it may have already had that?
<heat>
"true" here implies it had some sort of emulation-ish thing
<heat>
i *really* don't know how qemu block IO works
<Ermine>
There are more details in release notes
<geist>
yah i dont know the ramifications of the io schedulers or async stuff you can set
<geist>
i was fiddling with proxmox and they expose all of those things as checkboxes
<heat>
"different queues of a single disk can be processed by different I/O threads. This can improve scalability in cases where the guest submitted enough I/O to saturate the host CPU running a single I/O thread processing the virtio-blk requests. Multiple I/O threads can be configured using the new 'iothread-vq-mapping' property"
<heat>
aha ok
<heat>
does qemu use O_DIRECT
<geist>
there are ways to set the cachability of the underlying file, and there's an uncached mode
<geist>
actually is what proxmox defaults to, which is actually probably not a bad idea
<geist>
presumably that means O_DIRECT or something like that
<geist>
iirc it's something like cache= on the device line
<heat>
yeah
<heat>
the guest does its own caching anyway
<geist>
presumably the metadata portion of a qcow or whatnot is kept entirely in memory so that it's not constantly reading it back
<geist>
but in O_DIRECT i dunno how qemu orders writes to the metadata part as it's writing otherwise
<geist>
presumably it safely updates the table before writing new data, or whatnot
xal has quit [Quit: No Ping reply in 180 seconds.]
<geist>
at some point i sat down and tried to grok the qcow2 format and it does pretty much what you expect, fairly clever
Gooberpatrol_66 has joined #osdev
Gooberpatrol66 has quit [Ping timeout: 245 seconds]
kof673 has quit [Ping timeout: 245 seconds]
LittleFox has quit [Quit: ZNC 1.8.2+deb3.1 - https://znc.in]
valshaped7424880 has quit [Quit: Ping timeout (120 seconds)]
LittleFox has joined #osdev
Ram-Z has quit [Remote host closed the connection]
xal has joined #osdev
<heat>
mixing O_DIRECT and cached is risky
valshaped7424880 has joined #osdev
Ram-Z has joined #osdev
<heat>
"Applications should avoid mixing O_DIRECT and normal I/O to the same file, and especially to overlapping byte regions in the same file."
<bslsk05>
riscv.org: Specifications – RISC-V International
<geist>
yeah, i'm seeing more and more stuff generally do that
mavhq has joined #osdev
kof673 has joined #osdev
<heat>
maybe riscv doesn't want to pay the egress fees for a storage bucket :v
<heat>
this reminds me i should look at the oracle ampere thing
<geist>
yeah i do wonder how much egress you get for free for using google drive
<geist>
maybe i can put my toolchain tarballs over there
<heat>
you can't wget google drive links i'm pretty sure
<geist>
seemed to work
<geist>
it did a 302 to a new location but it did work
<heat>
oh yeah?
<heat>
cool
<heat>
the links suck and are non predictable but if it works it works
<geist>
oh wait, duh that was the github link
<heat>
yep doesn't work for gdrive
<geist>
yah it gave me 90KB of javascript
<heat>
this reminds me, someone wrote a fuse gdrive fs :p
<geist>
no reference to fred yet in qemu
<heat>
you know what really pissed me off the other day?
<heat>
i was trying to fetch your printf_tests from gitiles and it doesnt support a plain text download
<heat>
it can only do base64
<heat>
for some reason
<geist>
you can probably do a direct git fetch
<geist>
i forget the syntax but fairly sure you can just grab a blob directly from it
<heat>
can you fetch a single file?
<heat>
i don't keep a fuchsia repo locally anymore, it's just too damn big
<geist>
looking
<geist>
`git archive` may be what it is
<heat>
isn't git archive what ppl use to make tarballs from repos?
<geist>
yah but you might be able to spec it tight enough to grab a single file
<geist>
oh i dunno, i give up
<geist>
i thought there was a way, but dont see it
<heat>
git is herd
<Ermine>
fuse gdrive fs probably uses some api to get files
<geist>
yah
Arthuria has joined #osdev
netbsduser has joined #osdev
<geist>
oh when you're talking about my printf tests you mean the fuchsia version?
<geist>
the LK version i should upconvert to the unittest framework
Fingel has quit [Quit: Fingel]
<heat>
yep i used the fuchsia version
<heat>
my internal kunit stuff is very similar to gtest and yours, so it just works
netbsduser has quit [Ping timeout: 268 seconds]
Gooberpatrol_66 has quit [Read error: Connection reset by peer]
Gooberpatrol66 has joined #osdev
netbsduser has joined #osdev
<GreaseMonkey>
learnt another stupid thing about my Brio's BIOS today: it supports at least enabling the A20 line via INT 0x15 AX=0x2401 and reporting the status via AX=0x2402, but AX=0x2403 reports it as not supported. go figure.
<GreaseMonkey>
i checked the other methods (the chipset is an i440EX), turns out both the traditional way and the <h1>FAST</h1> A20 way work fine
<kof673>
i think my "bootloader" asm code...has optional define to force whatever method :/
netbsduser has quit [Ping timeout: 245 seconds]
<kof673>
not that it has been tested at all on various machines
<ddevault>
Hare strings are unicode, not ascii/undefined
<ddevault>
so it's a bit more involved than that
<ddevault>
but tbh not being able to easily work with ASCII strings is maybe a design defect in Hare
<ddevault>
it would be nice to just say "I declare that my kernel normatively only ever works with ASCII strings and turn off some guardrails so I can work with them as such"
* kof673
sees bslsk05 stare at helios > the Castillo pyramid [...] was designed so that [...] its ceremonial stairway was transformed into the image of a snake > at the equinoxes, the sun beams > this serpent of conjoined light and stone slithers up the stairway in spring, and down in autumn
<nikolar>
ddevault can't you just cast them to a byte array
<ddevault>
yeah
<ddevault>
but then you don't have any of the stdlib's strings:: functions, like strings::index or what have you
volum has quit [Quit: Client closed]
<ddevault>
and Hare does not have pointer arithmetic per-se, which also makes it a bit more annoying than C in some respects
<nikolar>
I'd reimplement some of that for byte arrays
<bslsk05>
nullprogram.com: A Branchless UTF-8 Decoder
<zid>
I need a small loan of a million euroes
rustyy has joined #osdev
<FireFly>
€foo variables when
<zid>
GeDaMo: fuz wrote that avx one link that
<GeDaMo>
I don't remember that one :|
bauen1 has quit [Ping timeout: 264 seconds]
zetef has joined #osdev
<kof673>
i believe that is the license for the other link: https://www.cs.princeton.edu/~bwk/tpop.webpage/code.html just wants > You may use this code for any purpose, as long as you leave the copyright notice and book citation attached.
<ddevault>
which is requiring some legwork to get C and Hare to play well in a freestanding environment
<nikolapdp>
neat
<ddevault>
and at the moment I'm getting a GP fault because of an unaligned stack on an SSE register load
<ddevault>
and I am not sure why
<ddevault>
may be a compiler bug...
<netbsduser>
i considered butchering up lwext4 to make an instant ext4 driver but i think i found it needed big refactoring
<nikolapdp>
which compiler then
<nikolapdp>
hare?
<ddevault>
it would benefit from some refactoring indeed, but it _is_ an instant ext4 driver
<nikolapdp>
zid: what an odd bunny
<ddevault>
yeah, hare, specifically it looks like a codegen issue with qbe
<zid>
I don't think adding 5 to rsp in a linker script counts as a compiler boog :P
<zid>
can rust disable sse yet
<netbsduser>
i think the last straw was that it was completely non parallel and would need a lot of work to make it so
<ddevault>
I fixed the linker script
<ddevault>
now there's a second bug
<nikolapdp>
kek
<nikolapdp>
always another
<ddevault>
not really sure why qbe is emitting SSE instructions here in any case
<ddevault>
maybe because this is a variadic function?
<ddevault>
I don't actually know how that ABI works
<nikolapdp>
what's the code
<ddevault>
it's just a printf implementation
<zid>
last I checked, rust code couldn't be built -mgeneral-regs-only and uefi only left the boot processor with sse enabled, so all the APs would instantly fault :D
<ddevault>
stack is aligned properly when entering the function
<ddevault>
but then the stack frame unaligns it
<zid>
aligned before the call, or after?
<ddevault>
both
<zid>
afaik it needs to be unaligned before, but unaligned after
<zid>
it can't be both, you push rip
<zid>
jmp printf
<zid>
you need to be on +8, then call needs to put it to +0
<ddevault>
well, after the jump it's aligned on 16
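The alignment arithmetic can be checked with a small sketch, assuming the System V AMD64 calling convention: `rsp` is 16-byte aligned immediately before a `call`, the `call` pushes an 8-byte return address, so at function entry `rsp` is 8 past a 16-byte boundary. The helper names are hypothetical. A callee whose frame is sized on that assumption ends up misaligned if it is instead entered with `rsp` already on a 16-byte boundary, which would match the symptom described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the given stack pointer value sits on a 16-byte boundary. */
static bool aligned16(uint64_t rsp)
{
    return (rsp & 15u) == 0;
}

/* A call instruction pushes the 8-byte return address. */
static uint64_t rsp_after_call(uint64_t rsp)
{
    return rsp - 8;
}

/*
 * Bytes a callee must subtract from rsp (pushes included) so that, starting
 * from the ABI-mandated entry state (rsp % 16 == 8), the stack is 16-byte
 * aligned again and has room for at least `locals` bytes.
 */
static uint64_t frame_size_for(uint64_t locals)
{
    return ((locals + 8 + 15) & ~(uint64_t)15) - 8;
}
```

For example, a function with no locals still has to adjust `rsp` by 8 (a single `push rbp` does it) before it may call anything else or use aligned SSE loads and stores on stack slots.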
<bslsk05>
github.com: Float-free libcore (for embedded systems and kernel drivers, among other things) · Issue #1364 · rust-lang/rfcs · GitHub
<zid>
9 year old bug that you can't use rust for embed x86
rustyy has joined #osdev
<nikolapdp>
lol are you kidding me
<nikolapdp>
systems language my ass
<netbsduser>
the rust as systems language is a recent claim
<nikolapdp>
this "bug" isn't
<netbsduser>
the language wasn't designed as one and they have mountains of technical debt to overcome to be one
<zid>
They're dranking their own koolaid on that one
<netbsduser>
i tried and rejected rust for some systems software (user land stuff, not kernel) because it simply couldn't cope with things like malloc() failing
<zid>
yea that was the main contention for linux
<netbsduser>
they want to change that but it will be a slow slog for a language that was designed for writing a web browser in, not systems software
<nikolapdp>
didn't they accept panicking alloc into the kernel
<zid>
I think so :/
<nikolapdp>
lollers
Arthuria has joined #osdev
bauen1 has quit [Ping timeout: 240 seconds]
bauen1 has joined #osdev
Arthuria has quit [Ping timeout: 245 seconds]
<ddevault>
how tf do variadic arguments work
<ddevault>
how is this gcc code not hosing the stack
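For reference, a minimal variadic function in C, with a hypothetical format where `i` consumes an `int` and `d` consumes a `double`. Under the System V AMD64 ABI, `double` arguments to a variadic function travel in XMM registers and the caller passes an upper bound on the number of vector registers used in `al`; the callee's prologue commonly spills `xmm0` to `xmm7` into a register save area on the stack so that `va_arg` can find them later. That spill is one plausible reason SSE instructions show up in an integer-only printf, and aligned stores to that area fault if the frame is misaligned. Whether this is what QBE emits in ddevault's case is an assumption.

```c
#include <stdarg.h>

/*
 * Sum a variable argument list described by fmt:
 *   'i' -> next argument is an int
 *   'd' -> next argument is a double
 * Other characters are ignored.
 */
static double sum_args(const char *fmt, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, fmt);
    for (; *fmt; fmt++) {
        if (*fmt == 'i')
            total += va_arg(ap, int);
        else if (*fmt == 'd')
            total += va_arg(ap, double);
    }
    va_end(ap);
    return total;
}
```

A call such as `sum_args("idi", 1, 2.5, 3)` mixes both register classes: the two ints go in general-purpose registers and the double goes in `xmm0`.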
<heat>
mjg, mofer are you trying to defend "light" smoking?
<heat>
do you realize how geezer you're sounding right now?
<ddevault>
hm?
<ddevault>
what do you mean new thing
<heat>
oh wait, my bad
<heat>
i mixed up fosstodon with... that conference thing you did last year
<mjg>
heat: i defend smoking a pipe every 2 weeks
<mjg>
heat: i don't know if that's your idea of "light smoking"
<mjg>
or rather, would defend, if it was not for the risk of becoming a cigarette smoker
<mjg>
like what happened to my friend
<heat>
well that is *the* problem isn't it
<heat>
it's highly addictive
<mjg>
i don't know how many pipe smokers end up like that
<mjg>
i was never tempted by cigarettes after several months of smokin'
stolen has joined #osdev
<kof673>
"wisdom is double-sided" -- job "xyz only loves people with wisdom" -- book of wisdom "not found on the earth of the living" -- job beware of false redefinitions of "word of wisdom" lol
<kof673>
this is not to argue smoking either way, but illiteracy is rampant lol
<mjg>
anyhow stern words from the generation addicted to tiktok
<mjg>
:XX
<heat>
i don't have a tiktok
<heat>
i also don't smoke and i rarely drink
<mjg>
well then it's not a common problem among genz i guess
<gog>
i'm still having a few cigarettes a week :|
<nikolapdp>
as a genz, i don't have tiktok either
<gog>
i really need to just stop
<mjg>
gog: oh?
<mjg>
bummer
<gog>
yeahhhh i've quit completely a few times and then a night of partying gets me back on them
<mjg>
ye i know people who only smoke at parties
<gog>
it's a lack of self-discipline
<mjg>
pretty weird imo
<gog>
i value the momentary relief from tension more than my health i guess
<mjg>
short term satisfaction is the bane of human existence innit
<gog>
yuuuuuup
<mjg>
when did you start? hs?
<gog>
yeh 16
<mjg>
that's some millennial shit
<heat>
like 90% of smokers i know are in a permanent struggle to stop
<heat>
iz insane
<heat>
dem nicotine be wildin
<gog>
i quit after age 22 for about 7 years
<gog>
then i've been off and on them since
<mjg>
here is a funny story how a guy i knew in middle school(!) got found out
<mjg>
both of his parents were smokers
<mjg>
one day his mom was talking with some other mom how much youth is smoking today
<mjg>
dude was passing by
<mjg>
she was liek "lol i wonder if my son is smoking"
<mjg>
she got close to him and sniffed him out, while dude was just coming home from a smoke break
<mjg>
:d
<mjg>
talk about unlucky
<heat>
she had to be fucking oblivious, can't believe it
<mjg>
i easily believe because their entire house smelt like an ash tray
<heat>
smokers have a distinct smell hard to wash off
<mjg>
did i mention both parents smoked
<mjg>
A LOT
<heat>
oh i guess
<ddevault>
does anyone know of something similar to mcopy for ext4 (populate a file as an ext4 image with content)
<heat>
mkfs.ext4 -L i think
<ddevault>
mkfs.ext4 can create an empty ext4 system
<bslsk05>
github.com: Onyx/scripts/create_disk_image.sh at master · heatd/Onyx · GitHub
<heat>
give it a -d <directory> and it'll prefill your ext2/3/4 fs
<ddevault>
nice, thanks
<heat>
yw
<mjg>
here is a better story: i had a bunch of smokers in class in high school. one of them was at the blackboard, holding a piece of chalk
<heat>
*annoyingly* most mkfses do not have this
<mjg>
... like a cigarette
<mjg>
he even tapped it few times to drop the ash
<mjg>
the teacher pretended to not see
<heat>
does everyone smoke in eastern europe?
<mjg>
no
<mjg>
there was a large windowless wall on one side of the school
<mjg>
it was a "designated" smoking area
<mjg>
every smoker would go there during recess and teachers would pretend the area does not even exist
<mjg>
even tho anyone coming in or out of the building could see people going there
<ddevault>
works perfectly, thanks again heat
<heat>
:)
bauen1 has joined #osdev
m257 has joined #osdev
<nikolar>
That's a neat trick, I should do that for my fs too
gog has quit [Quit: Konversation terminated!]
<Bitweasil>
mjg, I don't smoke cigarettes, but I do miss aspects of the "social smoker" culture. It was a solid way to meet people, and most of 'em didn't mind if you were smoking a tobacco pipe either.
<Bitweasil>
I knew people who didn't smoke but carried a lighter regularly, just as a social lubricant to start talking to people.
gog has joined #osdev
gog has quit [Quit: Konversation terminated!]
d5k has joined #osdev
m257 has quit [Quit: Client closed]
d5k has quit [Ping timeout: 252 seconds]
m257 has joined #osdev
zetef has quit [Remote host closed the connection]
m257 has quit [Ping timeout: 250 seconds]
zetef has joined #osdev
stolen has quit [Quit: Connection closed for inactivity]
<geist>
yah, though it's a nasty habit and i'll never do it etc, it also must be kinda nice to be able to take a few minutes to go relax with something like nicotine
zetef has quit [Read error: Connection reset by peer]
dude12312414 has joined #osdev
<kof673>
i had a book with pictures of old computers...there was a mainframe thing in a bunker underground for military or navy...built-in ashtray :D
Cindy has joined #osdev
<Bitweasil>
Relax, and be social.
<Bitweasil>
It at least used to be a sort of "instant friends group" - show up, light up, and the expectation was that you were up for talking about [whatever].
<Ermine>
heat: in eastern europe vodka is dominant (j/k)
<kof673>
a cigarette is just a miniature hoopoe bird (black/white ash, red fire) </alchemy joke>
Gooberpatrol66 has joined #osdev
<heat>
Ermine, no jk
<heat>
the sweetest grandmas have their own fucking alcohol production
<Ermine>
moonshine is not vodka though
<Bitweasil>
A still would be fun. Not poisoning myself, and not getting myself into serious legal issues, discourage me from doing such things, though.
<Bitweasil>
I'd probably just use it for fuel.
<GeDaMo>
Alcohol is good for cleaning too :P
<Ermine>
Don't try to clean soft touch stuff with alcohol though!
<heat>
Ermine, moonshine is just redneck vodka isn't it
<Ermine>
There are differences in technology
<Ermine>
It's important
<Bitweasil>
I'm not familiar with distilling at any serious level, all I know is "stills are a thing" and the basic physics of how they work.
<Ermine>
If you want into eastern europe family, you need to start feeling such differences with heart
<Cindy>
hi
<Cindy>
i'm cleaning up some old C code
<Bitweasil>
And... ?
<Cindy>
i want to understand how red-black trees work
<Cindy>
my dumbass self back then kept violating the strict aliasing rules
<heat>
Ermine, yo mr math i need some help
<heat>
theres this fun trick i picked up for ring buffers
<heat>
where instead of doing e.g head = head % size you just do head++ and tail++ and then mask later
dude12312414 has quit [Ping timeout: 260 seconds]
<heat>
mask, mod, whatever you prefer
<Ermine>
okay, that would work
<heat>
with this trick you don't need to reserve the last element, because you can always know if head == tail mod n is full or not (if tail - head == n, it's full)
<heat>
HOWEVER
<heat>
i can't tell if mod 2^64 (or ^32 or whatever) is going to fuck me over in any case
<heat>
i dont think so, but i find it hard to think this through
<zid>
just do it on an unsigned char in simulation
<zid>
but, it shouldn't even matter right, if you're just only allowing tail to chase
<Ermine>
if stuff is signed, you need to avoid negative numbers
<zid>
it will fuck up if you allow tail++s to pend
<zid>
but if it's just "tail can never exceed head" then you can let it wrap however it wants
<Ermine>
because cpus don't mod correctly negative numbers
<heat>
no signedness, signed overflow is UB even
dude12312414 has joined #osdev
<zid>
heat: long as they don't get out of sync you don't have to care about wrapping unless you try to calculate 'how full'
<zid>
if head wraps at 4 billion, tail will also wrap at 4 billion etc, if the only check you ever do is if(tail == head) full(); then nothing cares
<Bitweasil>
Assuming nothing stupid happens during operation. :) I've invented a long list of clever ways to subtly screw up ringbuffers...
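A sketch of the free-running index trick, following zid's suggestion to simulate it with `unsigned char` indices so the wraparound happens after only 256 operations. It assumes the capacity is a power of two: the index type then wraps at a multiple of the capacity, so `index & (size - 1)` stays continuous across the wrap, and `tail - head` in unsigned arithmetic is always the fill level. With a capacity that does not divide the index range, the masked position would jump at the wrap. Here `tail` is the write index and `head` the read index; all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define RB_SIZE 8u   /* must be a power of two */

struct ringbuf {
    uint8_t head;            /* read index, free-running, wraps at 256 */
    uint8_t tail;            /* write index, free-running, wraps at 256 */
    uint8_t data[RB_SIZE];
};

/* Fill level; the uint8_t return type reduces the difference mod 256. */
static uint8_t rb_used(const struct ringbuf *rb)
{
    return (uint8_t)(rb->tail - rb->head);
}

static bool rb_empty(const struct ringbuf *rb)
{
    return rb->head == rb->tail;
}

/* No slot is reserved: full and empty are distinguished by the fill level. */
static bool rb_full(const struct ringbuf *rb)
{
    return rb_used(rb) == RB_SIZE;
}

static bool rb_push(struct ringbuf *rb, uint8_t v)
{
    if (rb_full(rb))
        return false;
    rb->data[rb->tail & (RB_SIZE - 1)] = v;   /* mask only when indexing */
    rb->tail++;
    return true;
}

static bool rb_pop(struct ringbuf *rb, uint8_t *out)
{
    if (rb_empty(rb))
        return false;
    *out = rb->data[rb->head & (RB_SIZE - 1)];
    rb->head++;
    return true;
}
```

The same reasoning carries over to 32-bit or 64-bit indices, since 2^32 and 2^64 are also multiples of any power-of-two capacity.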
<Bitweasil>
Yeah, the last... 10-15 years have been i{3,5,7,9}-XXnnn, where XX is the generation (possibly one digit), and then the revision within that.
<geist>
not that that still isn't a very useful chip
<Bitweasil>
Unfortunately, the revision within that is literally useless, without a lookup table.
<heat>
Ermine, what was that weird cpu again you had?
<heat>
the really new one
<Bitweasil>
But *in general,* the bigger the number within a generation, the more powerful the chip.
<childlikempress>
honestly there was no need to make anything past skylake
<geist>
they now just dropped the 'i' part, but i think the logic is still the same
<childlikempress>
skylake bestlake
<Bitweasil>
There's no easy way to see what the difference between an 8600 and 8650 and 8700 are, without lookup tables, though.
<geist>
sandy bridge or gtfo
<heat>
KABY
<zid>
SANDY
<Bitweasil>
idk, Ivy Bridge added unrestricted guest virtualization. Which is *properly* nice.
<heat>
kabylake is actually a really solid uarch
<childlikempress>
snb was good but didn't have nearly enough avxes
<heat>
still used
<childlikempress>
kbl was just an skl refresh no?
<heat>
yeah
<geist>
problem with first gen skylake is it has all the meltdown/spectre stuff
<childlikempress>
yeah
<childlikempress>
just keep refreshing skylake forever tbh
<heat>
i'm pretty sure facebook is still running kabylake
<geist>
you need a few gens after it to get some fixes
<heat>
or at least they were running kabylake a few years ago
<childlikempress>
'''fix'''
<Ermine>
heat: i7-1360P
<heat>
exactly
<geist>
well, some of the fixes are real
<Ermine>
it's raptor lake
<Bitweasil>
i7-5600U is my laptop... 2C/4T gutless wonder
<geist>
i'm not sold on their new big/little stuff though, though i guess AMD is about to get into that too
<zid>
1360p sounds like a xeon
<zid>
I really really want a 1390P
<childlikempress>
what's wrong with big.little
<zid>
which is a xeon
<Ermine>
Tbh I want them to stop doing *lakes and switch to something else
<childlikempress>
tbh my hope (probably overly idealistic) is
<zid>
cove?
<gog>
bay
<zid>
LANDING
<geist>
not that there's a problem with big/little in general, it's that i dunno if x86 stuff is properly tuned for it
<zid>
yea I'm avoiding those cpus if I can
<childlikempress>
that big.little will drive software tooling to make it really easy to deal with extremely heterogeneous arches
<zid>
they sound like they'd have issues
<geist>
even running linux on the one test machine i have i dont think it really groks it
<zid>
someone asked in #gcc today how to make ld not fucking take forever if the final link of their make ends up on the e core, for example
<childlikempress>
it definitely hasn't happened yet but if it did that would be a Good Thing
<zid>
the scheds need to schedule shit off the e cores if they hit 100% cpu imo
<heat>
AIUI the upstream linux scheduler is really not that great with arm big.little either
<Bitweasil>
big.LITTLE is fine, IFF you have a scheduler that isn't idiotic with it. And it's not helped by the "Well, the little CPUs don't support everything the big ones do, so we'll just disable those features on the big ones to avoid needing feature-aware scheduling."
<geist>
Bitweasil: precisely
<geist>
the AMD solution will be much nicer since the cores will be much closer to each other
<geist>
at the expense of maybe it doesn't net as many gains, i guess
<childlikempress>
cus things are headed towards increasing heterogeneity and special-purpose hardware
<Bitweasil>
A feature aware scheduler would be fine, IMO. "Oh, you faulted AVX on an e-core, fine, you're pinned to the p-cores now."
<Bitweasil>
Like we do lazy switching for vector and FPU.
<Ermine>
heat: that's surprising, android would be interested in good big.little scheduler
<childlikempress>
but the software is Not There because no one (except apple and android) are interested in performance on client parts
<Bitweasil>
Most of the pieces are already there in the kernel with the lazy switching. It would just have to be added into the scheduler.
<heat>
AIUI they used to have tons of patches
<geist>
well, i think android is okay with it, they just have a lot of patches
<childlikempress>
Ermine: i'm pretty sure android does have good support for it, just not upstreamed
<childlikempress>
yeah
<geist>
and it's tuned for a specific use case, android
<Bitweasil>
Apple's big.LITTLE support in the OS is bloody amazing.
<heat>
both android and, before GKI, the vendors
<Bitweasil>
I miss my M1 Mini. :(
<geist>
that doesn't mean good general purpose support necessarily
<childlikempress>
the fabled 'general purpose' workload
<heat>
yep
<childlikempress>
:^)
<acidx>
Bitweasil: 387 emulation was a thing a while back. would be funny to have that for those cases. :P
<heat>
the android kernel has some weird patches that aren't upstreamable
<childlikempress>
'ah yes, i run general-purpose code'
<geist>
well, yes you do
<geist>
ie, you browse the web, etc
<geist>
ie, what regular people do
<acidx>
"e-core but it's not efficient in power or time or anything really, it's just funny because all AVX is now SWAR"
<Ermine>
eh
<childlikempress>
geist: my point is that everything is specialised
<geist>
disagree
<heat>
disagree
<Bitweasil>
acidx, there's no reason to emulate it. Trap the undefined instruction fault, see if it's something the other cores have, and re-schedule on the other cores.
<geist>
heh [X] Doubt
<heat>
the collection of everything specific gives you a more general-purpose workload
<Ermine>
people say that linux, namely pmOS, on thinkpad x13s drains battery a bit faster than windows
<Bitweasil>
We *literally* do that for FPU instructions and vector instructions, with the "disable it" bit, so you don't waste time saving FPU registers for a task that won't need them.
<Bitweasil>
It faults, you swap the registers out, enable the FPU, resume the task.
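The feature-aware scheduling idea Bitweasil sketches (trap the undefined-instruction fault on an e-core, then pin the task to cores that have the feature) can be modelled in a few lines. This is a toy model, not kernel code: the CPU table, `FEAT_AVX512`, `struct task` and `handle_undefined_insn` are all hypothetical, and a real handler would first have to decode the faulting instruction to learn which feature it needs.

```c
#include <stdbool.h>
#include <stdint.h>

#define FEAT_AVX512 (1u << 0)
#define NCPUS 4

struct cpu {
    uint32_t features;       /* bitmask of optional ISA features */
};

struct task {
    uint32_t allowed_cpus;   /* affinity bitmask, bit i = CPU i */
};

/* Assumed topology: CPUs 0-1 are p-cores with AVX-512, 2-3 are e-cores. */
static const struct cpu cpus[NCPUS] = {
    { FEAT_AVX512 }, { FEAT_AVX512 }, { 0 }, { 0 },
};

/*
 * Called when a task takes an undefined-instruction fault on cur_cpu for an
 * instruction that needs the features in `needed`.
 * Returns true if the task was re-pinned and can be resumed on a capable CPU,
 * false if the fault is genuine and should be delivered to the task.
 */
static bool handle_undefined_insn(struct task *t, int cur_cpu, uint32_t needed)
{
    uint32_t capable = 0;

    /* This CPU already has the feature, so the fault has another cause. */
    if ((cpus[cur_cpu].features & needed) == needed)
        return false;

    for (int i = 0; i < NCPUS; i++)
        if ((cpus[i].features & needed) == needed)
            capable |= 1u << i;

    /* Respect any affinity the task already had. */
    capable &= t->allowed_cpus;
    if (capable == 0)
        return false;

    t->allowed_cpus = capable;   /* pinned to capable cores from now on */
    return true;
}
```

As noted in the discussion, the cost of this design is that a task which touches the feature once stays on the p-cores afterwards unless the pin is relaxed again.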
<heat>
if i take the bits of a program that hammer memory, the bits of a program that hammer the SIMD, and put them all together, it's a general purpose workload
<geist>
yah i was thinking about exactly that the other day, i think that's only really feasible on x86
<geist>
since the AVX instructions are their own class of instruction
<heat>
Bitweasil, we actually don't, no one does lazy FPU for x86 anymore
<childlikempress>
xsaveopt?
<Bitweasil>
Oh, because it leaked? Bleh. Was wondering about that. In any case, the concept is well established.
<heat>
not really, it's just slower
<geist>
yeah. and with xsaves, etc its pretty efficient
<Bitweasil>
Fair. I mean, just about everything is hammering the vector engine for memset and such these days.
<geist>
exactly that too, so the lazy save ends up being not that much of a win
<Bitweasil>
The downside is that if it only needs it for a little while, you're pinning it on a p-core for no good reason, but... beats disabling the features entirely, IMO.
<geist>
since between any two given context switches it's a pretty good chance the code will use it
<childlikempress>
meh the power requirements is mostly from big vectors
<Bitweasil>
Or at least give me a firmware config option to enable those features.
<childlikempress>
strings fns can use small vectors and still get big wins
<geist>
yah right
<geist>
and the xsaves/etc actually tracks all of that. tracks if the top of the registers were dirtied, etc
<childlikempress>
yeah
<geist>
so if you're just running AVX256 code on a 512 machine it ends up being not really any less efficient
<Bitweasil>
Ah, okay. I've been in the ARM world for a solid few years now. :)
<geist>
yah that's the trick with xsaves, it has in hardware bitmaps for what is dirtied and whatnot from last time
<zid>
don't tell anybody but avx-512 is fake on amd
<childlikempress>
zid: no it's not
<geist>
and then it has a compressed storage format, that basically doesn't bother writing out zeros
<zid>
childlikempress: that's the spirit
<childlikempress>
not in any meaningful sense
<geist>
yes i know what you're trying to say: it has 256 bit wide vector ALU
<Bitweasil>
The ARMv8/v9 scalable vector stuff is pretty cool. You can write for a 2048 vector, and if it's a 128, 256, 512 hardware... whatever, it'll do what's needed to create the correct result. Just takes longer on some chips.
<childlikempress>
you get the same number of gemm flops with avx2 vs avx512 ops, but saying it's fake is an extreme oversimplification
<Bitweasil>
(at least as I understand it, I've not had to implement it yet)
<geist>
but that doesn't really mean you can't get to the instructions. do the AVX512 general purpose instructions net you any more stuff from 256? maybe not in that situation, but there are also a lot of additional non just 'wide ass vector' bits in avx512
<zid>
childlikempress: glad i can count on you
<childlikempress>
you still save decode/rename slots. and you do have a full-width shuffle unit
<geist>
exactly
<childlikempress>
and intel has some things half width in avx512, some things even quarter width
<childlikempress>
yet no one accuses its avx512 of being fake :p
<zid>
I do
<childlikempress>
cool😎
<geist>
heh you do you zid
<geist>
i forget did you get a newer avx512 cpu zid?
<geist>
you upgraded iirc
<zid>
nah it's avx256
<zid>
avx512 doesn't even exist, smh
<zid>
it's a propaganda piece by intel to sell more.. laptops!?
<geist>
right right, because it's fake. forgot
<geist>
well most of my current machines are AMD, but pre-zen 4, so they dont even have not-real avx512
<zid>
I wonder if the fact mainly only laptops had it was coincidence or planned
<childlikempress>
not planned
<zid>
avx10 also fake, we're up to avx10.1 already
<zid>
yea I assume it was just coincidence too
<zid>
they got the process working for the laptop skus but not the desktop ones, oopsie, guess laptops get the HPC feature lol
<Bitweasil>
Hey, they're good at retracting non-working features on desktop CPUs too. See transactional memory and SGX...
<childlikempress>
i would guess it's that limited capacity on the newer processes was prioritised for server/hpc (obvious), but then laptops over desktops because power efficiency
<childlikempress>
Bitweasil: transactional memory came back
<childlikempress>
because it's actually a really good idea :p
<childlikempress>
sgx on the other hand ... lol
<childlikempress>
'SGX is wonderful, it’s led to an entire industry of hackers finding bugs in SGX'
<Bitweasil>
Is it back? They gave up for a couple generations of CPU, I thought.
<heat>
avx512 being possibly omitted from future CPUs blew up on the intel people making up SysV feature levels' faces
<Bitweasil>
After several generations of "Release, whoopsie!"
<Bitweasil>
IMO, around the time you're using AVX512, doing GPU compute on the integrated GPU starts looking like a feasible approach...
<childlikempress>
wasn't part of the point of avx10 that they were committing to putting it in every future cpu?
<zid>
avx10 dead already
<zid>
we're 10.1
volum has joined #osdev
<childlikempress>
Bitweasil: the advantage of avx512 is that latencies are on the order of a nanosecond
<childlikempress>
so it's easy to integrate with scalar code
<CompanionCube>
heat: they can always redefine v4, not like feature levels are actually too important, anyway
<childlikempress>
just generally easy interoperability
<childlikempress>
if you have a massively parallel problem, sure, go ham on the gpu
<heat>
CompanionCube, feature levels are definitely important as they are seeing deployment
<heat>
and glibc ldso knows how to use them, and so does everything else
<heat>
you can't switcheroo a whole feature level
<Bitweasil>
Yeah, that's always the problem with GPU - transfer latency. But integrated GPUs are sharing memory, cache, etc, so you don't have as bad of a hit.
<Bitweasil>
If you're doing "one vector thing," sure, keep it on the CPU, but once you get into heavily vectorized code, it'll probably benefit from GPU.
<heat>
in fact, IIRC -v4 vs -v3 is mostly a difference of AVX512 :))
<geist>
yah
<childlikempress>
heat: yeah i remember felix was going off for a bit trying to convince intel to take out 128-bit vector support for avx10
<geist>
-v3 is everything up to and including avx2
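
The -v3 / -v4 split mentioned here can be sketched as a small lookup. This is a minimal sketch, assuming the per-level feature lists from the x86-64 psABI; the flag spellings and the names `LEVEL_FEATURES` and `highest_level` are made up for illustration and do not match `/proc/cpuinfo` or any real library. It shows that what -v4 adds on top of -v3 is the AVX512 subset.

```python
# Features each x86-64 microarchitecture level adds, per the x86-64 psABI.
# Flag spellings here are illustrative (hypothetical), not /proc/cpuinfo names.
# Every level also requires all features of the levels before it.
LEVEL_FEATURES = {
    "x86-64-v2": {"cmpxchg16b", "lahf-sahf", "popcnt",
                  "sse3", "sse4.1", "sse4.2", "ssse3"},
    "x86-64-v3": {"avx", "avx2", "bmi1", "bmi2", "f16c",
                  "fma", "lzcnt", "movbe", "osxsave"},
    "x86-64-v4": {"avx512f", "avx512bw", "avx512cd",
                  "avx512dq", "avx512vl"},
}


def highest_level(cpu_flags):
    """Return the highest level whose cumulative requirements are met.

    The baseline "x86-64" level is assumed to be present.
    """
    flags = set(cpu_flags)
    level = "x86-64"
    # Dicts keep insertion order, so levels are visited from v2 up to v4.
    for name, needed in LEVEL_FEATURES.items():
        if not needed <= flags:
            break  # a missing feature caps the level here
        level = name
    return level
```

A CPU with everything in the v2 and v3 sets but no AVX512 lands on `x86-64-v3`, which is why a CPU line that drops AVX512 can never be -v4.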
<childlikempress>
so it would be a more reasonable feature level to target--only have to support 256 and 512-bit vectors, not 128 too
<heat>
why?
<heat>
if you have 256 and 512 isn't 128 mostly free?
<Bitweasil>
*lobs a molotov-MMX instruction in for lulz*
<Bitweasil>
You still need the stuff to track that you're using less of the register, clear bits and pieces, etc.
<childlikempress>
heat: no i mean the max vector size
<heat>
oh yes that's insane
<heat>
who's that for, bochs?
<childlikempress>
avx10 is basically avx512 instructions, but the cpu is allowed to just support 256-bit vectors (and fault if you try to use a 512-bit vector)
<childlikempress>
so you have to be prepared for 512 or 256 (or just lowest-common-denominator target 256). the goal was to make it so you don't also have to support 128 (or, make it so the lowest common denominator is 256 not 128)
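
The two options described here (be prepared for 512 or 256, or just target the lowest common denominator) can be sketched as a dispatch policy. This is a rough sketch only: it models the choice of width, not real CPUID detection, and `common_vector_width`, `pick_kernel` and the kernel table are hypothetical names.

```python
def common_vector_width(max_widths_bits):
    """Lowest-common-denominator width: usable on every CPU in the set."""
    return min(max_widths_bits)


def pick_kernel(cpu_max_bits, kernels):
    """Pick the widest kernel that fits the CPU's maximum vector length.

    `kernels` maps a vector width in bits to an implementation.
    """
    usable = [width for width in kernels if width <= cpu_max_bits]
    if not usable:
        raise ValueError("no kernel fits a %d-bit vector unit" % cpu_max_bits)
    return kernels[max(usable)]
```

With `kernels = {256: ..., 512: ...}`, a 256-bit-only part gets the 256-bit kernel and a 512-bit part gets the 512-bit one; shipping only the 256-bit entry is the lowest-common-denominator approach.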
<childlikempress>
heat: i mean for software emulators wider vectors is basically free :p
<bslsk05>
www.qemu.org: QEMU version 7.2.0 released - QEMU
<heat>
IIRC it was a gsoc project and everything
<heat>
AVX wasn't implemented for the longest time
<heat>
-cpu <cpu model with AVX> never enabled avx in cpuid
Matt|home has quit [Remote host closed the connection]
<geist>
FRED isn't implemented either :(
<geist>
qemu 9 was released yesterday or day before
<zid>
without fred? disgraceful
<zid>
send it back
<dostoyevsky2>
geist: any new great features... couldn't find any in the readme, but I guess I have only used like 1% of the features qemu offers
<geist>
lots of new riscv stuff. but curiously the whole x86 side is completely blank
<geist>
note that they really roll the major numbers fairly fast now, so it's more like a yearly or 9 months cadence they snapshot and make a new major
<geist>
so it's not *that* different from 8.x
<heat>
x86 considered obsolete
<childlikempress>
x86 is OLD and LAMEz0rs
<zid>
I'm inventing a new cpu, it's x86, but the TSS gets a huffman table for the instructions that the program uses
<zid>
so that the .text density is way higher
<zid>
It's also 10x slower to execute, you're welcome
<heat>
i'm inventing new entry/exit functionality for the x86
<heat>
its better than before, but without an IO bitmap just to piss the microkernel people off
<zid>
every syscall instruction takes a table for which pages need to stay mapped
<zid>
on the user stack
<zid>
that have to be verified
<childlikempress>
i'm inventing a new arch, it's riscv but there's a separate feature flag for every instruction
<heat>
genius
<heat>
how are you naming the feature flags?
<childlikempress>
after #osdev users
<childlikempress>
you're not allowed to have more instructions than there are #osdev users
<childlikempress>
because MINIMALISM!
<childlikempress>
REDUCED
<heat>
Zid
<zid>
That actually sounds useful, childlikempress
<zid>
Can you convert all those feature flags to base64
<zid>
and tack it onto the end of the letters 'avx'
<zid>
then it'd be double perfect
<Ermine>
I'm supposed to be pissed off?
<zid>
about what
<zid>
did heat call you names again
<heat>
the heck did i do
<zid>
If he's upsettingly bad at it, I can try, or if he upset you because he's good at it, I can shout at him
<zid>
either work for me, I get to shout at someone
<childlikempress>
ughhhhh crap. i ran out of disc space in the middle of a brew upgrade and now everything is broken and it's confused
<zid>
nice
<zid>
That's why you should use a real package manager
<childlikempress>
i should! too bad macos doesn't have one :<
<zid>
install gentoo on your mac, simple
<childlikempress>
fuck you
<dostoyevsky2>
geist: they released qemu 8 exactly the same time last year IIRC
<zid>
have they cracked the sun code and figured out how long years last
navi has quit [Ping timeout: 268 seconds]
<heat>
the sun code??
<heat>
don't tell mjg
<heat>
is that why a year takes so long?
<dostoyevsky2>
slowlaris
<gog>
hi
chiselfu1e has quit [Remote host closed the connection]
chiselfuse has joined #osdev
<heat>
hi gog happy 25th of april
<gog>
it's not that day yet
<heat>
yes it is
<heat>
just saw the fireworks too
<gog>
oh
<zid>
do you get fireworks every night
<zid>
or just 25th of april
<zid>
personally I think the moon should make the westminster chimes when midnight happens
<zid>
and the sun should do something cool at noon
<kof673>
(from earlier, i am obligated to point out): moonshine: Its meaning derives from the notion of light without heat, or light from the moon (elsewhere "the house of light" lol): > But those that would rightly understand it should first learn the difference between fire and light.
Left_Turn has quit [Read error: Connection reset by peer]
<geist>
actually i guess ubuntu 24.04 will be any day now