<zid>
clever: Yea the pic I link was a thumbnail from one of these vids
<clever>
zid: that exact vid, i just took the vid ID from your link
<mrvn>
zid, clever: Now I will have nightmares about some helicopter pilot going postal and flying through NY City with that thing. :)
<clever>
:D
mrlemke has joined #osdev
gog has joined #osdev
nyah has quit [Ping timeout: 246 seconds]
<heat>
how do you set the LBA size in nvme? is it when you're formatting the drive (namespaces)?
<heat>
woah linux can get 13M IOPS on a single core, 12900K, optane
Likorn has quit [Quit: WeeChat 3.4.1]
<zid>
I wouldn't be surprised if you were the only one who knew nvme at all
<zid>
we regularly discuss the pros and cons of 1979 intel
<geist>
heat: i think it's when formatting
<geist>
or at least when i did it on the one drive i had, it had to reformat it
<heat>
ah right
<heat>
that makes sense
<zid>
which is sort of odd
<zid>
a layer separation of sorts, but trim kinda murders that
<geist>
well, also nvme has the notion of multiple namespaces and you can create/delete/resize/etc them individually
<geist>
so it sort of makes sense that that's when you specify the block size
<geist>
unclear if you can mix/match block sizes on any firmware or if all namespaces have to have the same
<geist>
and/or if different namespaces are allowed to mix the underlying flash blocks
<geist>
presumably no and yes
<heat>
different namespaces can have different lba formats
<zid>
nvme is silly and should feel silly
<heat>
it's part of the identify command for the individual namespace
<geist>
yah whether or not real hardware allows that i dunno
<geist>
see `nvme id-ns`
<heat>
fun silly(tm) fact: the identify command can list up to 64 lba formats
<heat>
you then choose one
<heat>
they even have a 2-bit performance rating on them
<zid>
ARGB_8_8_8-8 please
<geist>
yah i think thats what i remember with this one too
<geist>
alas most of my nvmes dont support more than one format
<zid>
I don't have any :P
<zid>
sata is life
<geist>
but i think it makes sense, most of mine are samsungs, and the samsungs all have onboard dram
<heat>
me neither :|
<heat>
i need to get one though
<zid>
yea I was surprised, my 850 which is.. ancient now, has 512MB of DDR3
<geist>
so, i dont think there's a fundamental reason why they would bother supporting less than the 512 byte lbas
<geist>
because they have enough ram to hold that table in memory
<heat>
less?
<geist>
but, the WD blue i have that *does* support reformatting
<heat>
like 256 byte lbas?
<geist>
also has the dram-sharing 'feature' because it doesn't have any ram on board
gog has quit [Ping timeout: 244 seconds]
<zid>
When are we rooting an SSD and running ponyos on it
<geist>
so in that case it actually does make some sense to support 4K because it then doesn't need as big of a translation table, and it just asks for correspondingly less stolen ram
<geist>
heat: less blocks is what i really mean (via larger LBAs)
<heat>
the nvme spec explicitly disallows < 512 byte lba
<heat>
ah yeah
<geist>
since the size of the translation table is inversely proportional to the block size
<clever>
geist: what about 512 byte virtual block sizes, and 1mb physical block sizes, smaller translation table, at the cost of bigger rmw cycles
<geist>
i remember doing some math and it kinda lines up if you think of there being a large in memory table with say 4 bytes per block
<geist>
clever: that's basically what you get with SD cards
<heat>
my 870 evo (sata) has 1GB of dram
<heat>
pretty impressive
<geist>
and fundamentally why they're slower
<zid>
yea I was super surprised by how much mine had
<clever>
ah
<zid>
70 is 2 gens lower and only has 1GB
<zid>
newer*
<clever>
geist: that could also explain why SD cards deal with sequential write loads better, it's not a rmw if you write the physical block in one shot
<geist>
right
<clever>
i have been thinking about testing a log based fs on the rpi
<clever>
so all write load is sequential
<heat>
f2fs!
<clever>
there have been many mentions on the rpi forums, about the warranty on your SD card being void if you use it as an OS disk on any system
<clever>
because they are designed for highly sequential loads like photo or video
<geist>
right, my general experience with that is samsung SD cards are pretty good for random io
<clever>
but a log based fs would entirely solve that, and possibly increase both lifetime and performance
<geist>
whether or not that's because they're intrinsically less safe i dunno
<geist>
but sandisks, for example, run like ass
<clever>
i have a sandisk that i managed to kill with one too many gcc builds
<clever>
it went internally read-only
<geist>
heat: my math is off so i dont remember what i was thinking but consider something like 1TB
<geist>
2^40. that has 2^(40 - 9) 512 byte blocks (2^31)
<geist>
so that is already 2GB of memory if you have one byte per, so i dunno how a translation table could completely hold in ram
<geist>
answer is it probably doesn't and it swaps it in and out as it's in use
<heat>
yeah it can't
<geist>
for say 4 bytes per block that'd be 2^33
<heat>
my 870 evo is 1TB, 1GB of ram
<geist>
with 4K blocks it's a little nicer though: 40 - 12 + 2
<geist>
2^30
<clever>
and now it fits within a 32bit int!
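The arithmetic above can be reproduced with a small sketch. It assumes a flat table with one fixed-size entry per logical block; the 1 and 4 bytes per entry are the guesses from the discussion, not a known firmware detail, and `ftl_table_bytes` is a hypothetical name.

```c
#include <stdint.h>

/* Size in bytes of a flat logical-to-physical table with one entry per block.
 * capacity_log2 and block_log2 are log2 of the drive capacity and of the
 * block size in bytes; entry_bytes is a guess, not a real firmware value. */
static uint64_t ftl_table_bytes(unsigned capacity_log2, unsigned block_log2,
                                uint64_t entry_bytes)
{
    uint64_t blocks = UINT64_C(1) << (capacity_log2 - block_log2);
    return blocks * entry_bytes;
}
```

With a 1TB drive (2^40 bytes), `ftl_table_bytes(40, 9, 4)` is the 2^33 case and `ftl_table_bytes(40, 12, 4)` is the 2^30 case: going from 512 byte to 4K blocks shrinks the table by a factor of 8.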
<heat>
I wonder what NVMEs run as a CPU
<heat>
I think western digital has been using riscv
<heat>
they contribute quite a lot to riscv code
<geist>
right, i had read somewhere that one of the samsung controllers a few years ago was something like 3 arm cortex-r5s
<geist>
one acts like the interface and the other two run the wear levelling/drive the flash or something
<bslsk05>
www.anandtech.com: The Inland Performance Plus 2TB SSD Review: Phison's E18 NVMe Controller Tested
<geist>
3 cores. i dont think any nvme has remotely any space to fit more than one soc
<clever>
> This confused me for a bit... I expected a single tap, for the single ARM core that's in there... but instead, I found three taps... does that mean this chip has three ARM-cores?
<clever>
> There's two Feroceons, which are quite powerful arm9-like cores, and a Cortex-M3 core, which is a bit smaller, more microcontroller-ish core
<clever>
this guy also found 3 cores on the same jtag ring, in a mechanical hdd
<geist>
sure. not surprising at all
<clever>
he then modified the firmware, so the drive lied about the contents of /etc/shadow when you write a magic string to any block
<geist>
maybe odd to folks that haven't looked at this sort of thing before, but it's totally not odd to find piles of dissimilar cores on a single soc, even on the same jtag chain
<clever>
and boom, free root
<heat>
"free" root
<clever>
heat: well, yeah, you need to mitm the supply chain first, lol
<clever>
so not really free
<geist>
unclear what each of these would even run anyway, but it's entirely possible you have 3 cores like that so each one of them simply runs a big loop of command processing
<clever>
a checksummed fs like zfs would also prevent this, as would luks
<heat>
and you need root to know where /etc/shadow is stored
<geist>
kinda makes sense doing that than trying to build some sort of RTOS and then deal with it
<clever>
heat: pattern match any block that starts with root::something
<heat>
geist, maybe the new stuff runs lk :)
<clever>
you could also expose a full api, where you can talk to the backdoor by writing blocks to disk
<geist>
possibly, but as far as i know no one is using LK for riscv stuff
<clever>
ive even seen a flash->floppy adapter earlier
<clever>
where you are basically emulating a serial port, by reading/writing to sector 0
<bslsk05>
'The first thing that ever used MPEG4' by Cathode Ray Dude [CRD] (00:50:59)
<clever>
thats basically like those cd->tape adapters
<clever>
but it cant emulate tracks, and it cant know the movements of the head
<clever>
so its instead a kind of mmio over the floppy interface, where the tracks are dynamically changing their own data
<zid>
It's 3am, so I made egg fried rice.
<zid>
Made sense at the time
<klange>
I think I had one of those at some point, I recognized it in the video when I watched it earlier.
<clever>
zid: been there, made pizza at 5am, lol
<zid>
I'm going out for coffee with a friend "saturday morning"
<zid>
guess that means.. stay awake until lunch?
<zid>
Either that, or try to bed soon and then set a possibly needlessly early alarm and hate myself all morning. Decisions
<heat>
ditch the friend and sleep
<heat>
antisocial behaviour 101
<zid>
no, I begged him to pay attention to me because he brushed me off the last 2 weekends because he was busy :P
<zid>
we usually go out every sunday to window shop antique shops and stuff
floss-jas has joined #osdev
floss-jas has quit [Excess Flood]
<zid>
This egg fried rice isn't bad, I am an amazing chef.
<geist>
yah i think ripping on old video recording things from early 2000s is a bit much
<geist>
yeah it doesn't look great, but damn, any sort of camera that records digital video back then is pretty great
<geist>
i'd have been pretty chuffed to have something like that
<clever>
around that time, i had a toy photo camera for kids
<clever>
it was horrible resolution, and it downloaded over serial
<clever>
still got it
<geist>
i think i had a Kodak DC260 or something like that in around 1999 that was like VGA rez and it was pretty cool. 16MB card i think
<zid>
when I was in hs one of the teachers took pictures of our projects with a camera that saved to a floppy disk, I thought it was pretty rad
<zid>
must have been old as fuck at the time though
<clever>
mine is from nickelodeon
<zid>
floppies were on the way out when I was in hs
<geist>
by 2005 i think i had gotten a tiny little olympus camera that took the then new SD card format. still have that somewhere, was a neat little camera
<bslsk05>
'Nick Click: The 90s Nickelodeon Digital Camera Experience' by LGR (00:14:24)
<geist>
yah late 90s they were starting to get slightly better than toys.
<clever>
no removable storage, less for a kid to break
<geist>
or at least in the toy to maybe usable range
<geist>
ah yeah the DC260 was actually 2 megapixel. was farther along than i thought
<clever>
i think mine was 640x480 max
<clever>
at best
<zid>
The 384kbit video looks a lot better than he's shitting on it
<zid>
the sensors and stuff back then were garbage anyway
<clever>
zid: some of that is the re-encoding improving it
<zid>
anything pre mobile phone cameras was pretty garbage
<clever>
half way thru the video, he says that re-encoding to h264, then upscaling, improves the quality
<geist>
but yeah i sort of like watching him discover things that were fairly commonplace when i was younger
<clever>
if he first upscales, then re-encodes, it looks far worse, the same as just playing it directly
<geist>
like zip disks and whatnot. sort of interesting to see how someone presumably younger sees stuff like that
<clever>
i found the video editing deck that CRD played with rather interesting
<zid>
I view a camera like that as basically like a dictaphone but with some crappy video attached, but you don't have to pay for film
<zid>
film is expensive to take notes with :p
<clever>
zid: some of the cameras CRD talked about in that vid, only handle 2 or 3 minutes of video
<zid>
I suppose magnetic tape would have just been better though
<clever>
before the storage is entirely full
<geist>
reminds me i always kinda wanted one of those tiny tape recorders
<zid>
magnetic tape is surprisingly good
<geist>
with the little micro tape. i dunno why
<geist>
though i guess the real auteurs of tape recorders all go for that particular 'portable' model that folks used back in i guess the 70s
<geist>
it's like a little mini reel to reel that you carry out when recording wildlife or whatnot
<zid>
What about a wire recorder? :P
<zid>
Who needs magnetic tape when you can just use a spool of steel wire
<zid>
2.2km/hr
heat_ has joined #osdev
heat has quit [Ping timeout: 248 seconds]
heat has joined #osdev
heat_ has quit [Read error: Connection reset by peer]
<heat>
geist, trippy: the NVMe queues can be non-contiguous
<geist>
oh yeah?
<heat>
yeah, it's optional to support but it's a thing
<geist>
wonder if that has something to do with mapping them into virtual machines
<geist>
ie a guest thinks it's giving you a contiguous run of physical but the host says 'aww shit that's actually discontig'
heat_ has joined #osdev
<heat_>
love m'internet
<heat_>
<heat> i was thinking that the contiguous bit in the capabilities meant that completion queues and submission queues needed to be contiguous (one after the other in physical memory)
<heat_>
<heat> it actually means that if !contiguous, you pass in a sg-list of queue pages
heat has quit [Read error: Connection reset by peer]
heat_ is now known as heat
<heat>
yeah maybe passthrough
<heat>
good point
<heat>
because I don't see how this can be fast at all
<heat>
but it's still way faster than not having NVMe inside the VM
<geist>
though i think for that to work i guess the iommu would solve that
<Griwes>
iommu is such a nice thing for so many reasons
<geist>
i'd implement some sort of context switching so you can run more than one thread
<geist>
oh they left, okay
mahmutov has joined #osdev
the_lanetly_052 has joined #osdev
<kazinsal>
yeah, I'd say memory management -> tasking -> user mode -> syscalls
<kazinsal>
at that point you're basically just hooking bits up to the user space as you go
<geist>
i'd generally do tasking first since you dont really need any more than a heap (if that)
<geist>
you can switch threads you're basically at a useful embedded rtos level
<geist>
then you can keep going
<kazinsal>
true
<kazinsal>
then your threads can mmap in additional address space
<geist>
you can run without paging, etc and work on cpus that dont have a mmu, etc
<geist>
it's one of the reasons it's easy to port LK to a bunch of different arches: mmu is optional
GeDaMo has joined #osdev
mahmutov has quit [Ping timeout: 260 seconds]
<mrvn>
you don't even need mmap, just give every process 4GB heap and implement demand paging.
<mrvn>
zid: aren't wire recorders more the precursor to vinyl?
Burgundy has joined #osdev
zid has quit [Ping timeout: 250 seconds]
Likorn has joined #osdev
wolfshappen has joined #osdev
wootehfoot has joined #osdev
No_File has quit [Quit: Client closed]
zid has joined #osdev
<zid>
I did syscalls first but that's because I've got nothing useful to actually run so I just wanted to test I had set the descriptor tables and stuff up properly for them to work seeing as the code was in the repo :P
<mrvn>
making an echo syscall is useful
<mrvn>
or log
<kazinsal>
yeah, "blast sev + message into syslog" is a good initial syscall
<kazinsal>
lets you test both argument passing and make sure your userspace code for formatting strings etc works nicely
<cookie>
hey, here once more to request someone tells me to "just fucking start it"
<cookie>
i have all my ~~ducks~~ ideas in order, but starting is hard
<kazinsal>
starting is hard
<kazinsal>
I plan to eventually restart work this weekend! hopefully my undiagnosed unmedicated adhd brain can accomplish that
<kazinsal>
(I say, for the nth weekend in a row)
<cookie>
i did another project recently that happened to be a good test for rust and i don't think it'll work unfortunately
<cookie>
(it was a very over engineered static site generator, and i've written enough rust to not fall in the common pitfalls, i just had a loooot of types and it got unsustainable to fix multi page type errors)
<kazinsal>
my largest barrier to entry for rust is rust people
<cookie>
the rrir crowd?
<j`ey>
you mean the trolls that pretend to be rust people? :P
<cookie>
the actual rust people i've talked to are okay-nice, though there's a fair amount of variance
<j`ey>
as with any community
gog has joined #osdev
<zid>
that's why you should trick yourself and not actually start
<zid>
just prototype a bunch of pieces
<zid>
and accidentally end up compiling them all together
<gog>
mew
* kazinsal
gives gog headpats
* gog
prrs
* cookie
scritch gog
* kazinsal
scritchies
<gog>
o:
<gog>
my gsoc proposal was accepted now wtf do i do
<cookie>
uh, code?
<gog>
idk if i can work full time and do that
<kazinsal>
learn go
<kazinsal>
or whatever googlers are into these days
<gog>
i'm kinda on a sigma grindset and i don't know if i can add anything to my grinding set
<cookie>
zid: i like that
gog has quit [Ping timeout: 246 seconds]
<cookie>
stupid printer issue came up
<cookie>
android refuses to connect to CUPS because i'm not serving it with a TLS cert
<cookie>
so now i have to add a lot of jank to serve a valid tls cert.. in my home network
blockhead has quit []
nyah has joined #osdev
dude12312414 has joined #osdev
dude12312414 has quit [Remote host closed the connection]
gog has joined #osdev
<zid>
gog: what was the proposal?
<gog>
code sharing for EDK core
<gog>
every firmware image has a bunch of duplicated code and they want to not have that
<zid>
Buy bigger hard drives, done, that'll be $8000 consultation fee.
<gog>
true
doorzan has joined #osdev
ethrl has joined #osdev
heat has joined #osdev
<heat>
congrats gog
tomaw has quit [Quit: Quitting]
tomaw has joined #osdev
mahmutov has joined #osdev
puck has quit [Excess Flood]
puck has joined #osdev
Teukka has quit [Read error: Connection reset by peer]
<bslsk05>
betanews.com: HP chooses Ubuntu-based Pop!_OS Linux for its upcoming Dev One laptop -- could System76 be an acquisition target?
No_File has joined #osdev
the_lanetly_052 has quit [Ping timeout: 276 seconds]
No_File has quit [Quit: Client closed]
Likorn has quit [Quit: WeeChat 3.4.1]
<mrvn>
yeah, why change a distribution with long time security support? Lets pick something obscure that will disappear next year?
<mrvn>
s/change/choose, what am I typing?
<heat>
pop_os isn't obscure
<heat>
but you're right, they should've used hp-ux
cookie is now known as ckie
<mrvn>
4 years old, better than I thought. But very much looks like made by a manufacturer to sell its own systems because the marketing division couldn't have something labeled Ubuntu on the system.
doorzan has quit [Remote host closed the connection]
<mrvn>
I wish people would call these clones something else than an OS. It's more like a theme for Ubuntu.
<heat>
it's not just a theme
<j`ey>
they said they're going to write a new DE to replace GNOME
<mrvn>
No, not just. But quite like it. How many extra packages does it add? Maybe 10? Some patched packages for a different look&feel? 99% is pure Ubuntu I bet.
<mrvn>
A new DE to replace GNOME is like a new theme to an app.
<heat>
no
<heat>
it's new software
<heat>
it's not "like a theme to an app"
<mrvn>
so is a theme
<heat>
GNOME is super complex
<mrvn>
It's also super simple: It shows me icons and when I click them something starts.
<mrvn>
New desktop. Oh now it's green background instead of blue. I still click at an icon and something starts.
<heat>
you do realise GNOME isn't just the desktop right?
<mrvn>
Might be a million lines of code that changed but it's still just a desktop.
<j`ey>
so all OS's with a desktop are themes of ubuntu?
<mrvn>
if it's based on Ubuntu then I would say so pretty much.
<heat>
ubuntu is just a theme of debian
<mrvn>
That's how it started. But they make a lot of their own packages now with their own security support, their own kernels, ...
<mrvn>
That's where I would draw the line. When they have their own security support for packages not developed in-house then they've completed the fork to OS / distribution status.
<j`ey>
but at the end of the day, it's still an OS heh
<heat>
arch doesn't really have security support for packages
<heat>
is it a theme?
<heat>
also no custom kernels
<j`ey>
well they still package it / have a custom config
<mrvn>
no idea what arch is. A random collection of unmaintained binaries?
<heat>
arch linux
<heat>
which i do in fact use
vai has joined #osdev
<heat>
j`ey, that's nothing compared to the patch fest redhat and canonical have on their kernels
<j`ey>
true
<mrvn>
I know hat arch linux is. I just don't know what to call it. It's not a "theme based on xyz". If it has no security support it would fail my "distribution" threshold.
<mrvn>
+w
<heat>
they're maintained
<heat>
but there's no security support
<heat>
no backporting of patches, etc
<j`ey>
it's clearly a distro
<j`ey>
(btw)
<mrvn>
security support can mean updating to newer versions. backporting isn't needed.
<heat>
your definition of distro is clearly skewed by the far more corp-y redhat and canonical distros
gog has quit [Ping timeout: 260 seconds]
<mrvn>
heat: except my point is there are no canonical distros (except maybe a handful), they are just something like a theme is to an app but for distributions.
<heat>
it's a distribution if you're distributing the packages
<heat>
they distribute the packages from their own servers
<mrvn>
so every Debian mirror is a distribution?
<j`ey>
im not really sure what is gained by saying a certain linux-thing is not a distro
gog has joined #osdev
<heat>
you're being deliberately obtuse
<mrvn>
I just wish for a better name that reflects the difference between maintaining thousands of packages and just adding 10 packages to someone elses work.
<j`ey>
remix
<heat>
I just wish for a better name that reflects the difference between remembering to update a package/filing bugs and actually reviewing code
<heat>
:)
Likorn has joined #osdev
<mrvn>
j`ey: remix is a good word for it. Now make everyone else use it please.
<sortie>
jafarlihi, the interesting case is when the source and destination memory overlaps, in which case you don't want to lose the original data, so you need to be careful. memcpy(3) has undefined behavior in that case (it can assume the memory does NOT overlap), but memmove(3) has to handle that case per the standard :)
<sortie>
The easy trick I went with here is to simply copy backwards
<sortie>
Inefficient to only do byte copies though
<sortie>
But hey it's an initial skeleton, it's meant to be obviously correct and simple :)
<jafarlihi>
How does copying backwards matter for overlapping?
<GeDaMo>
[---src---]
<GeDaMo>
    [--dest---]
<mrvn>
Line 6 should check if src + size < dst (with overflow protection)
<sortie>
Think about it for a bit :) I imagine a source buffer that is before the destination buffer, and they overlap, so you copy one byte at a time forward, and then you overwrite the data you were supposed to copy, so you have to do it backwards which works in that case
<GeDaMo>
If you copy from the beginning of src to the beginning of dest, you would overwrite the middle of src
<jafarlihi>
Oh, I get it now. Thanks!
<GeDaMo>
But copying from the end of src to the end of dest is safe
<zid>
which is why memcpy on overlapping regions is ill advised
<zid>
you don't know if it copies forwards or backwards
<zid>
hence memmove
<sortie>
s/ill advised/undefined behavior that WILL blow up in your face/g :)
<mrvn>
I would just have memcpy call memmove
<heat>
adding needless branches 101
pretty_dumm_guy has joined #osdev
<mrvn>
heat: you think that matters with that implementation of memmove?
xenos1984 has joined #osdev
<heat>
*shrug*
<heat>
memcpy is the thing you use 99% of the time
<heat>
memmove isn't
<mrvn>
if (dst < src | src + size < dst) should branch predict very well.
<mrvn>
and in user space it can't overflow.
<heat>
now, making memmove call memcpy would be a good idea
<sortie>
memcpy implemented using memmove is fine
<sortie>
Honestly all that matters is what performance you want, how actual machines perform, and how simple / complicated / large / small you want the code
<sortie>
YMMV
<mrvn>
if (dst < src | src + size < dst) memcpy(); else reverse_memcpy(); works too
<mrvn>
Just stick an asm("memcpy:") label after the if. :)
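The direction choice discussed above can be sketched as a byte-wise memmove. `my_memmove` is a hypothetical name chosen to avoid clashing with libc, and the single unsigned comparison is one possible way to get the overflow protection mrvn mentions; it is not the code from the linked page.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte-wise memmove sketch (hypothetical name, not the libc routine). */
void *my_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Unsigned distance from src to dst. If dst is below src this wraps to a
     * huge value, so one comparison covers both "dst before src" and
     * "dst at or past the end of src" without computing src + n. */
    if ((uintptr_t)d - (uintptr_t)s >= n) {
        /* Forward copy: nothing still to be read gets overwritten. */
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    } else {
        /* dst starts inside [src, src + n): copy from the end backwards. */
        for (size_t i = n; i > 0; i--)
            d[i - 1] = s[i - 1];
    }
    return dst;
}
```

Copying one byte at a time keeps the sketch obviously correct; a real implementation would move words or use the architecture's string instructions once the direction is known.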
terminalpusher has quit [Remote host closed the connection]
terminalpusher has joined #osdev
terminalpusher has quit [Remote host closed the connection]
<bslsk05>
github.com: Onyx/memcpy.c at master · heatd/Onyx · GitHub
<Ameisen>
geist: I don't remember seeing one, but it's possible that there's one somewhere.
<geist>
but if there were it'd probably be something like the delayed spin up thing i was talking about
<geist>
downclocking i think is more implicit, due to TDP
<heat>
a quick ctrl+f on wikipedia says no, there's no such bit
<geist>
yah was thinking it'd be a similar bit to ERMS: basically warning so you can tune your code
<heat>
it may be one of those details that compilers just know about
<geist>
it's also possibly a general notion that if you're using avx512 it's probably for some sort of long running calculation so you generally dont just use it for one off things
<heat>
llvm-mca could probably tell you more if you feed it some code
<geist>
this is a really interesting blog as i get into reading it. it seems to directly assert that the instant you run one of the 512s (or even the 256s) it instantly starts to transition to a lower clock while spooling up the 512 units
<geist>
at least on that class of skylake he's testing on
<geist>
dunno what the current state of the art is on modern cores
<geist>
the interesting thing is its actually throttling the dispatch of all ALU ops while it's waiting for the voltage regulator to stabilize, so it's actually running the instruction but it just runs at a reduced rate
vai has quit [Ping timeout: 272 seconds]
<jafarlihi>
Is there any resource out there for implementing terminal scrolling?
<heat>
no
<heat>
it's just something you need to do
<heat>
imagine, you need to scroll things up
<heat>
so move every line upwards, discard the top one and empty out the bottom one
<geist>
right. it may be possible to use hardware scrolling in some very specific situations, but that also usually involves eventually running out of framebuffer and needing to reset to the top
<geist>
depends on hardware, etc etc. but it's an optimization
<clever>
ive also done this just recently
<clever>
in the case of the rpi, you can specify multiple bitmap/w/h/x/y sets to display
<clever>
so i never run out and have to reset to the top of the bitmap
<heat>
geist, right, this is text mode
<clever>
so i treat the bitmap like one big circular buffer, and then i just split it at the wrap point, and tell the hw to render the 2 halves
<heat>
specifically the "exercise is left to the reader" in the meaty skeleton page
<geist>
ah that's odd, seems like writing a little routine that uses a memcpy + memset would be easy enough to leave in
<geist>
since that's basically what it boils down to
<geist>
copy everything up a line, clear the last line
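A minimal sketch of "copy everything up a line, clear the last line" for an 80x25 text buffer. It assumes the usual VGA text cell layout (character in the low byte, attribute in the high byte) and takes the buffer as a parameter, so it is not tied to any address. The names are hypothetical, and it uses explicit 16-bit loops instead of memcpy + memset because a byte-wise memset cannot write a character/attribute pair.

```c
#include <stddef.h>
#include <stdint.h>

enum { TEXT_COLS = 80, TEXT_ROWS = 25 };

/* One text cell: character in the low byte, attribute in the high byte. */
static inline uint16_t text_cell(char c, uint8_t attr)
{
    return (uint16_t)(((uint16_t)attr << 8) | (uint8_t)c);
}

/* Scroll the screen up by one line and blank the bottom line.
 * fb points at TEXT_ROWS * TEXT_COLS cells. */
void text_scroll_up(volatile uint16_t *fb, uint8_t attr)
{
    /* Copy low to high: every cell is read before anything overwrites it. */
    for (size_t i = 0; i < (TEXT_ROWS - 1) * TEXT_COLS; i++)
        fb[i] = fb[i + TEXT_COLS];

    /* Fill the last line with spaces in the given attribute. */
    for (size_t i = (TEXT_ROWS - 1) * TEXT_COLS; i < TEXT_ROWS * TEXT_COLS; i++)
        fb[i] = text_cell(' ', attr);
}
```

With attribute 0x07 (light grey on black) the blank cell comes out as 0x0720. Reading back from video memory is the slow part, which is why keeping an off-screen copy is the usual next optimization.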
<heat>
you can't use those, it's video memory
<heat>
but yeah, it's like a "let's see if you can do more than copying code from the wiki"
<geist>
sure you can, it's just not great
<geist>
it's a good starting point, then you optimize (by keeping off screen buffer and dirty lines, etc) later
<heat>
not if it's virtualized
* geist
shrugs
<heat>
(well, you can try but things may crash)
<geist>
if they crash your VM is broken
<heat>
i've crashed qemu before when doing a popcnt on ahci mmio memory
<geist>
right, using exotic instructions on mmio may do it
<geist>
i've personally found bugs with HVF with that
jafarlihi has quit [Quit: WeeChat 3.5]
<geist>
anyway, it's a general observation that you should avoid reading from the framebuffer, but it's not *illegal*
<geist>
well they left anyway
<heat>
yup
<heat>
possibly not in text mode though
<heat>
a much smaller buffer
<mrvn>
geist: memmove, the src+dst overlap
<geist>
yah my experience is that text mode buffers are small enough that the slowdown isn't bad
<geist>
mrvn: indeed, but not in the direction that matters, though you're strictly right
<mrvn>
With a framebuffer on old hardware it's also worth checking if the char in one line is the same as the char in the next line and then not move that part.
<geist>
ie a sane memcpy with a stride less than a single line of the text mode would still work fine if it's copying lower to high
X-Scale has quit [Ping timeout: 240 seconds]
X-Scale` has joined #osdev
<mrvn>
geist: scrolling down could be a problem, scrolling up the size you move at once shouldn't matter.
X-Scale` is now known as X-Scale
<geist>
yah, tis true. i'm assuming scrolling up
<mrvn>
if the stride is larger than the width you've already read the data you overwrite. no harm done.
<mrvn>
Does scrolling the console on x86_64 turn into a single rep movs?
<geist>
probably + a clear at the end for the new line
<mrvn>
just copy an extra 80 bytes that you memset to ' ' at boot.
<geist>
i suppose you could store just off the end of the framebuffer a blank line of ' '
<geist>
and then do a memcpy that reaches off the end
<geist>
scrolls one extra line back
<mrvn>
Soon you want to have a scroll back buffer though and then it's just copying blocks from the ring buffer to the screen.
<geist>
especially since clearing a line on vga text mode is really 0x2000 (or 0x0020 i forget)
<geist>
or something like that
<mrvn>
or framebuffer and it all changes
<clever>
geist: and if you were doing scrollback in LK's gfxconsole, would you maintain the scrollback as text or graphics?
<geist>
text
<clever>
text would need more complex cpu and re-rendering, while graphics needs more ram and is just a pure memcpy bandwidth job
wootehfoot has quit [Quit: Leaving]
<clever>
and then there is the question of the scrollback just being flat char[80]'s or if its just a mess of "foo\nbar\n" and you have to re-parse it to find the lines
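The "flat char[80]" option could look roughly like the sketch below: a ring of fixed-width lines where the oldest line is overwritten once the ring is full. The sizes and names are assumptions for illustration, not how LK's gfxconsole works.

```c
#include <stddef.h>
#include <string.h>

enum { SB_COLS = 80, SB_LINES = 256 };

struct scrollback {
    char lines[SB_LINES][SB_COLS]; /* fixed-width lines, space padded */
    size_t head;                   /* slot the next line will go into */
    size_t count;                  /* number of valid lines, at most SB_LINES */
};

/* Append one line, truncating to SB_COLS and padding with spaces. */
void sb_push(struct scrollback *sb, const char *text, size_t len)
{
    char *slot = sb->lines[sb->head];

    if (len > SB_COLS)
        len = SB_COLS;
    memcpy(slot, text, len);
    memset(slot + len, ' ', SB_COLS - len);

    sb->head = (sb->head + 1) % SB_LINES;
    if (sb->count < SB_LINES)
        sb->count++;
}

/* Line `back` steps before the newest one (0 = newest), or NULL if none. */
const char *sb_line(const struct scrollback *sb, size_t back)
{
    if (back >= sb->count)
        return NULL;
    return sb->lines[(sb->head + SB_LINES - 1 - back) % SB_LINES];
}
```

Redrawing the screen is then a loop over `sb_line` for the visible rows. The lines are not NUL-terminated, so a renderer has to treat each one as exactly SB_COLS characters.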
<bslsk05>
github.com: lk/gfx.c at master · littlekernel/lk · GitHub
<clever>
this line randomly crashes the system
<geist>
patches welcome
<geist>
would initialize it to zero and then in the wrapper function test for null
<clever>
its already testing for null
<clever>
it just isnt initialized
<clever>
i'll see about filing a PR for that tonight
<geist>
change the malloc to a calloc then
<mrvn>
clever: I like having 64k text scrollback buffer or so. Doing it as gfx would cost too much ram.
<geist>
i'm much better about zero initializing things nowadays due to doing that a lot in zircon
<geist>
especially with C++ where you can easily zero initialize
<mrvn>
I really like that you can write "Foo *bla{CAFEBABE};" for class members now.
<clever>
geist: ehhh, every single byte in the surface (except flush) is written to in create_surface, so calloc feels like a waste of bandwidth
<mrvn>
geist: Have you tried out the new compiler options to initialize (zero fill) padding in structs?
<geist>
yeah, it's one of those compromises for future expansion
<mrvn>
and stack
<geist>
mrvn: yes we use it in zircon kernel, much to the detriment of performance
<geist>
but all is not lost, there's an attribute you can use judiciously to turn it off in particular situations
<mrvn>
You have that much padding that it is noticeable?
<geist>
we actually measured a fair amount of performance loss in some benchmarks as a result of the compiler folks just turning it on one day and not telling us
* geist
grumbles
<geist>
mostly little temporary objects on the stack that now suddenly go through a more complicated setup
<geist>
objects with buffers in them etc
<heat>
linux also uses it
<mrvn>
well, buffers are not padding. Sounds more like a force zero fill of everything than just padding.
<geist>
for example, we have a neat little object that you pass around when doing mmu operations that tracks up to N pending invlpgs so you can queue up some pending TLB invalidations and then flush at the end
<geist>
so it ends up with an array of uintptr_ts and a count
<mrvn>
so basically a vector with fixed capacity
<geist>
it's intended to be very fast to initialize, since we create one preemptively before diving into the mmu code for any reason
<geist>
but now with the zero fill its dumping down 300 bytes or so on the stack every time
<geist>
so that's a good case where it actually shows up in benchmarks. we dont need it zero filled, because the inner array is totally safe because of the counter (which is constructed with 0)
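A rough C sketch of the pattern described above, not the actual Zircon code: a fixed-capacity list of addresses guarded by a count, where initialization only touches the count. The capacity, the names, the counting stub standing in for the real invalidate instruction, and the drain-when-full policy are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>

enum { PENDING_MAX = 32 }; /* capacity picked arbitrarily for the sketch */

struct pending_tlb {
    uintptr_t addr[PENDING_MAX]; /* deliberately not initialized */
    size_t count;                /* only addr[0..count) is ever read */
};

/* Stub for the real per-page invalidate (invlpg on x86); it only counts. */
static size_t invalidated_pages;
static void invalidate_page(uintptr_t va) { (void)va; invalidated_pages++; }

/* Cheap init: one store, independent of PENDING_MAX. */
static void pending_init(struct pending_tlb *p) { p->count = 0; }

/* Issue every queued invalidation and empty the list. */
static void pending_flush(struct pending_tlb *p)
{
    for (size_t i = 0; i < p->count; i++)
        invalidate_page(p->addr[i]);
    p->count = 0;
}

/* Queue one page. Assumption: drain early when the list is full. */
static void pending_add(struct pending_tlb *p, uintptr_t va)
{
    if (p->count == PENDING_MAX)
        pending_flush(p);
    p->addr[p->count++] = va;
}
```

The outer function creates the object on its stack, passes it down so inner functions can call `pending_add`, and calls `pending_flush` once on the way out. Automatic stack initialization turns the one-store init into a fill of the whole array, which is the cost being described.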
<mrvn>
alloca() it?
<geist>
a) we absolutely forbid all forms of local stack allocations and b) the point is it doesn't know the size so it allocates a thing up front in the wrapping function
<geist>
that's the point, you create it in the outer stack and pass it into the inner functions so it can accumulate a list of pages to flush
<mrvn>
sometimes I wish for "Bla * foo[[uninitialized]]" for function arguments to signal the function will initialize the data.
<geist>
there are a few patterns like that in the kernel that
<geist>
but again there's an attribute you can put on things that tells the compiler to not do the zero fill, so all is not lost
<mrvn>
do you happen to remember what it's called?
<geist>
and actually we dont do a zero fill, it does some sort of pattern fill. with a pattern that's designed to trigger exceptions if you deref it etc. i forget it
<geist>
mrvn: i think it's uninitialized or something like that
<bslsk05>
fuchsia.googlesource.com: zircon/kernel/arch/arm64/mmu.cc - fuchsia - Git at Google
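A minimal sketch of the kind of object geist describes: a fixed-capacity batch of pending TLB invalidations whose array is deliberately left uninitialized, because the counter guards every access. The class name, the capacity of 32, the `NO_AUTO_INIT` macro and `UnmapRangeSketch` are assumptions for illustration, not Zircon's actual code. The opt-out is assumed to be the `uninitialized` variable attribute that Clang and recent GCC pair with `-ftrivial-auto-var-init`; the macro expands to nothing where the compiler does not report it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Expand to the opt-out attribute only where the compiler reports support.
#if defined(__has_attribute)
#if __has_attribute(uninitialized)
#define NO_AUTO_INIT __attribute__((uninitialized))
#endif
#endif
#ifndef NO_AUTO_INIT
#define NO_AUTO_INIT
#endif

// Hypothetical batch of pending invalidations (not the real Zircon type).
class PendingTlbInvalidations {
 public:
  static constexpr std::size_t kMaxPending = 32;  // assumed capacity

  // Only the counter is initialized; addrs_ is never read past count_.
  PendingTlbInvalidations() : count_(0) {}

  // Queues one page address; returns false when the batch is full.
  bool Enqueue(std::uintptr_t vaddr) {
    if (count_ == kMaxPending) return false;
    addrs_[count_++] = vaddr;
    return true;
  }

  // Calls invalidate_page for each queued address, then empties the batch.
  template <typename Fn>
  std::size_t Flush(Fn invalidate_page) {
    assert(count_ <= kMaxPending);
    const std::size_t n = count_;
    for (std::size_t i = 0; i < n; ++i) invalidate_page(addrs_[i]);
    count_ = 0;
    return n;
  }

 private:
  std::uintptr_t addrs_[kMaxPending];
  std::size_t count_;
};

// The outer function owns the batch on its stack and flushes at the end.
// Returns how many per-page invalidations were issued.
inline std::size_t UnmapRangeSketch(std::uintptr_t base, std::size_t pages) {
  PendingTlbInvalidations pending NO_AUTO_INIT;  // skip the auto pattern/zero fill
  std::size_t issued = 0;
  auto invalidate = [&issued](std::uintptr_t) { ++issued; };  // stand-in for invlpg/tlbi
  for (std::size_t i = 0; i < pages; ++i) {
    const std::uintptr_t vaddr = base + i * 4096;
    if (!pending.Enqueue(vaddr)) {
      pending.Flush(invalidate);  // batch full: flush early, then retry
      pending.Enqueue(vaddr);
    }
  }
  pending.Flush(invalidate);
  return issued;
}
```

The attribute only removes the compiler-inserted fill; the constructor still runs, so the counter starts at 0 either way.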
<geist>
i should actually look into that next week. we did it elsewhere, maybe this was a missed case
terminalpusher has joined #osdev
<mrvn>
geist: careful though. I believe the initialize everything was designed as a security mitigation so previous data from the stack doesn't leak. Don't compromise safety for speed. :)
<geist>
right! SECURITY > *
<geist>
but yeah basically my experience is generally initializing everything in the object is a good idea except in cases where you can prove there's no need
<geist>
and it's a performance thing, so it's usually objects with arrays in them where some other variable controls access to the array
<geist>
i am usually okay with leaving out initialization for that
<geist>
mrvn: anyway i cant find it offhand, but it's something like uninitialized or unused or one of those attributes
<geist>
easy enough to figure it out in godbolt
<mrvn>
Maybe this should be solved from the other side. Mark everything security relevant and wipe the stack when leaving the function instead of initializing
<geist>
we do also use safe stack and/or shadow call stack in the kernel
<geist>
which is kinda a pain, since each thread now has two stacks, but it gives you a lot of that benefit
<mrvn>
shadow stack is where you place return addresses on one stack and stack frames on another one, right?
<geist>
yah i dont think it's implemented on x86, but we use it on arm
<geist>
basically it's an upwards growing stack of 8 byte values, simply the return addresses
<mrvn>
is it predictable where the two are?
<geist>
x18 points to it at all times
<geist>
no. we randomly allocate them in the kernel
<geist>
x86 it's harder to do something like that, but x86 has the whole safe stack/regular stack thing, and the return addresses are on the safe stack
<mrvn>
yeah, can't push/pop anymore with shadow stack
<mrvn>
unless you swap the SP between the two I guess
<geist>
right. off the top of my head the idea is you leave RSP looking at the safe stack, which still has a traditional stack frame, but you also use TLS to store an unsafe stack where you put any locals that have any possibility of escaping
<geist>
so the compiler knows what locals have no pointers to them, etc and can still put them on the regular safe stack
<mrvn>
doesn't rust have this implicitly since the stack frame is on the heap? Or am I confusing something there?
<heat>
the safe stack is the stack where you put big vulnerable stuff right?
<geist>
but anything you get a pointer to goes on the unsafe stack
<heat>
or yeah the unsafe stack
<heat>
ah
<geist>
or has any sort of possibility of you doing an overflow, etc
<geist>
arrays of things, etc
<mrvn>
every buffer needs to be on the unsafe stack so over/underflows can't change the return address
<geist>
it'll still registers on the safe stack too, so even if you could overflow on the unsafe stack you can't trash the register/return state
<geist>
s/still/spill
<mrvn>
yeah, anything not accessed with pointer arithmetic should be safe.
<geist>
iirc the shadow call stack is basically superior in the sense that it's simpler and avoids the ROP exploits, but is only really useful on architectures where there's not any real cost to it (ie, risc machines that return from functions via register indirection)
<mrvn>
just an odd though: can you prefix push/pop with %fs or %gs?
<mrvn>
+t
<mrvn>
21:58 < geist> right! SECURITY > *
<geist>
yah so it's a more freebie one
<geist>
but only on arches where it's a freebie
<mrvn>
isn't push/pop on x86 extra fast? accessing a second stack is slower, right?
<geist>
probably
<geist>
SECURITY > *!
<geist>
just repeat that every time your silly reptile brain starts to consider performance as a consideration
<geist>
otherwise the beatings will recommence
<mrvn>
geist: my whole kernel is designed kind of like that. KISS, must work > fast
* geist
gets out the taser
<geist>
i guess that's what i get for working at google for 10 years. hardware is free! security > *!
<heat>
no, you get p r e b u i l t s
<heat>
:P
<geist>
(i'm just being snarky, obviously security is important, and its part of the tension of engineering to work with multiple constraints)
<mrvn>
My experience is that improving your algorithm will gain you so much more speed than any little security or unoptimized loops or such will cost.
<mrvn>
Saving 5% on memcpy can't beat not calling memcpy at all etc.
<geist>
heat: actually was reading something that i had never considered before: idea is that all tools should be rebuilt at least every say 6 weeks. reasoning is it wont be able to pick up new compiler features, etc without that
<geist>
ie, leaving old tools around built last year is another path for security sploits, etc
<geist>
and kinda makes sense, i had never considered it before
<mrvn>
geist: it also tests the compiler for bugs and makes sure the source still compiles and conforms to modern syntax.
<geist>
vs conventional wisdom of finding a solid release of some thing and sticking with it and then roll when new release is out, features, etc
<heat>
everyone updating tools is nice but it only works if most people are on the same page (i.e same company)
<geist>
oh 100%
<GeDaMo>
Don't forget to recompile the compiler :P
<mrvn>
Try compiling a 6 year old c++ source with todays clang
<geist>
GeDaMo: absolutely
<heat>
tianocore still supports like VS2010 and GCC 4.8
<mrvn>
And when you have to fix a security bug is not the time to try to update sources to the current compiler.
<geist>
but i think it's even further. it also means *all* libraries and all applications you use internally for your company/etc should be constantly rebuilt
<geist>
and something older than N units of time is a cause for alarm
<mrvn>
geist: I would consider that the test suite for every (major) compiler update.
<geist>
and this is versus the notion of only rebuilding something if the source changes
<geist>
anyway, a thing i hadn't really thought about, but i'm just a low level kernel person. i dont think about those things much
<mrvn>
Some years back there was a group that would rebuild all of Debian every month.
<geist>
so here's a completely unrelated x86 question
<geist>
if i were to say breadboard up a 386 or whatnot, would it be possible to wire up the address space such that there is *no* ram below 1MB
<geist>
idea being that you put some rom at the start address (0xfff0... something)
<GeDaMo>
Interrupt vectors?
<geist>
and the first thing it does is immediately bounce into protected mode
<geist>
so question is can you get to protected mode with no ram and no cache
<mrvn>
geist: and go straight to 32bit mode in the bios? Or do you have an UEFI for that?
<geist>
i think so, since you can pre-can a GDT in the rom and then load it
<geist>
mrvn: right. but not a regular bios. i'm saying something you build yourself, not attempting to make a PC clone
<geist>
and it'd be simpler if you say started RAM at some higher address and just left < 1MB to rom or whatnot
<mrvn>
don't see why you can't constexpr the whole boot process up to starting your kernel in long mode and make that your bios image.
<geist>
so question is can you write code that uses no ram at all and gets to protected mode. i think so
<geist>
again. you're missing the point
<geist>
did you read the problem statement?
<mrvn>
hardcoded GDT, page tables, ... for 32bit and 64bit
<geist>
no. i said 386
<geist>
like literally an 80386
<mrvn>
ups, drop the 64bit. The answer remains though. nothing needs ram there.
<geist>
yes but really? i'm worried something implicitly needs a stack or whatnot
<geist>
since you'd have to operate in a few instructions that literally only have the registers or read only memory to operate
<mrvn>
geist: only thing that needs stack would be building page tables dynamically with some recursive function.
<geist>
yah if you wanted to get to 64bit you'd have to at least pre-can a few page tables in the rom
<geist>
which is a bummer, since it'd use up a few pages, but so it goes
<mrvn>
recursive page table. only needs 1 page.
<geist>
but once you're in protected mode you can run >1MB and then you have ram
<mrvn>
And the parts you don't use you can put other stuff in.
<geist>
the idea is if you're wiring up your own x86 you have no need to reproduce any of the 640k memory hole or bios or whatnot
<mrvn>
Did the 386 have 2MB pages?
<geist>
and i'd just re-layout memory to do something similar to arm64 qemu or whatnot: put hardware in low addresses and start ram somewhere higher
<geist>
no not at all. large pages came at least 10 years later
<mrvn>
what about graphics memory? Your own gfx card with dedicated ram?
<geist>
of course
<mrvn>
So that could still be at A000/B000
<geist>
no. the point is not to do that
<mrvn>
I thought hardware in low addresses
<geist>
the whole point is you are breadboarding something that has *no* backwards compatibility
<geist>
well i'm thinking low addresses as in say < 1GB
<geist>
no need to cram it low there. can really leave say a gigantic run of memory for framebuffer. no reason to think so small
psykose has quit [Remote host closed the connection]
<mrvn>
Just don't put the ram at 2GB. that makes going higher half more complicated.
<mrvn>
start of the ram
psykose has joined #osdev
<geist>
anyway i think the answer is yeah, the only thing it'd really need is a pre-canned GDT and an LGDT pointer in rom to get up to protected mode
<geist>
and then you can use ram
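A pre-canned GDT of the sort discussed here can be computed entirely at compile time, so it lands in read-only data. The sketch below assumes a flat 0-4G code and data segment; `MakeDescriptor`, `kRomGdt`, `Gdtr` and `MakeGdtr` are illustrative names, and the table's address in ROM is left to the caller since the source does not fix a memory map.

```cpp
#include <cstdint>

// Packs a legacy 8-byte segment descriptor from base, 20-bit limit,
// access byte and the 4 flag bits (G, D/B, L, AVL).
constexpr std::uint64_t MakeDescriptor(std::uint32_t base, std::uint32_t limit,
                                       std::uint8_t access, std::uint8_t flags) {
  return (static_cast<std::uint64_t>(limit) & 0xFFFF) |
         ((static_cast<std::uint64_t>(base) & 0xFFFFFF) << 16) |
         (static_cast<std::uint64_t>(access) << 40) |
         (((static_cast<std::uint64_t>(limit) >> 16) & 0xF) << 48) |
         ((static_cast<std::uint64_t>(flags) & 0xF) << 52) |
         (((static_cast<std::uint64_t>(base) >> 24) & 0xFF) << 56);
}

// Null entry, flat ring-0 code, flat ring-0 data (4K granularity, 32-bit).
constexpr std::uint64_t kRomGdt[] = {
    0,
    MakeDescriptor(0, 0xFFFFF, 0x9A, 0xC),
    MakeDescriptor(0, 0xFFFFF, 0x92, 0xC),
};

static_assert(kRomGdt[1] == 0x00CF9A000000FFFFull, "flat code descriptor");
static_assert(kRomGdt[2] == 0x00CF92000000FFFFull, "flat data descriptor");

// The 6-byte operand LGDT reads: table size minus one, then linear base.
struct [[gnu::packed]] Gdtr {
  std::uint16_t limit;
  std::uint32_t base;
};
static_assert(sizeof(Gdtr) == 6, "LGDT operand must be 6 bytes");

// rom_address is wherever the board maps kRomGdt (an assumption of the caller).
constexpr Gdtr MakeGdtr(std::uint32_t rom_address) {
  return Gdtr{static_cast<std::uint16_t>(sizeof(kRomGdt) - 1), rom_address};
}
```

Nothing here needs a stack or writable memory at run time; the boot code only has to execute LGDT on the stored operand, set CR0.PE and far-jump.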
<mrvn>
nod
<geist>
idea is a 386 is pretty easy to breadboard
<heat>
geist, offtopic but I genuinely don't know if current chipsets can use those tables if they're in ROM
<mrvn>
Are there any 386 clones that can power up into 32bit mode?
<geist>
and nothing really there aside from the cpu starting in real mode has any real PC legacy
<mrvn>
Another stray thought: Have you ever put a 386 into an FPGA?
<geist>
could
<geist>
assuming something like that exists and intel hasnt squashed it
<mrvn>
should make it simple to modify it to go right to 32bit on power up.
<geist>
that would be interesting. if a 386 compatible machine started in 32bit mode it'd have to contend with the fact that there's no GDT. yet
<geist>
so i guess it'd have to define the starting state as 'already in 32bit mode, no GDT pointer, but segment registers are set up with 32bit mode as if they had been loaded'
<heat>
you could hardcode the segment bases and limits
<geist>
so presumably the very first thing you do is load a real GDT so that it can continue
<mrvn>
straight 1:1 memory map, all segment registers loaded with 0-4G.
<geist>
i guess you'd have a NMI hazard right off the bat, since you have no IDT or whatnot loaded on instruction #1
<geist>
whereas that problem doesn't exist for real mode since the vectors are implicitly 0
<mrvn>
interrupts disabled. Why would you get an NMI?
<geist>
because that's the point of NMI
<geist>
one cannot disable it
<mrvn>
yes, but what do you expect to throw one?
<geist>
not the cpu's problem
<geist>
still have to consider the design, since nothing prevents a system designer from doing it
<mrvn>
true
<mrvn>
Nothing stops you from pre-loading an IDT that points into the rom though.
<mrvn>
(on FPGA)
<geist>
so you'd probably have to do something like have some sort of window where nmis can't fire, or require the system designer deal with the hazard by putting an external NMI gate (a-la a20)
<geist>
yeah in an fpga you can do what you want. i'm thinking more like if intel had designed a 386 back in the day that started in protected mode. say via a pin strapping
<mrvn>
How does this work on ARM?
<geist>
ARM has no nmi, so it avoids the problem
<mrvn>
.oO(take a wire cutter and cut the NMI pin)
<geist>
and i assume when it starts in EL3 (or highest EL) it starts with everything masked
<geist>
irq/fiq/serror/debug
<heat>
geist, you can mask NMIs if you really want to
<heat>
it's in one of the RTC registers
<heat>
you would just start with that pre-masked, easy solution
<geist>
yah but that assumes a RTC exists and is wired up the way PCs are
<geist>
from the point of a raw early x86 (before half of the PC arch got integrated into it) RTC/PIC/etc were all just part of the syscal
<geist>
system
<geist>
nmi on first instruction could be like, there's an nmi button on the board that some person holds down when releasing reset
<geist>
boom, cpu starts nmi is asserted on first instruction
terminalpusher has quit [Remote host closed the connection]
<heat>
yes, you'd need to make an NMI register a platform detail
<geist>
z80 has some logic dealing with nmis and whatnot, but the key there is x86 post real mode has all this *state* that implicitly relies on stuff already existing in memory
<geist>
hence why in general it's hard to start the cpu > real mode
<geist>
and perhaps why it has never been removed thus far even though it'd make a lot of sense
<geist>
whereas no other modern (or even contemporary architecture to x86 in the 80s) had all this in-memory state
<heat>
there's not that much state, just the IDT
<geist>
ie, data structures that live in memory that the cpu reads when it feels like it
<geist>
not true: the IDT refers to segments that must live in a GDT/LDT
<heat>
you can hack your way to a memoryless GDT if you change the way the CPU works at startup
<geist>
but the cpu re-fetches data from the GDT upon exception
<heat>
hmm
<heat>
right
<geist>
note other arches like 68k or whatnot had a table (VBR) but they were just addresses
<geist>
and that can safely just start at an implicit address (0)
<heat>
well, you curl up in a ball and cry
<geist>
yah further 'x86 sucks'
<heat>
maybe you just put it in ROM and hopefully it works
<geist>
and pretty much 100% of this can be traced to the 286. that's when they had been really affected by the apx432 koolaid that had been flowing in the breakrooms at intel
<geist>
which was like this except on steroids
<mrvn>
What does the IDT point to on power up?
<heat>
nothing
<geist>
nothing, cpu starts in real mode, which doesn't use it
<mrvn>
it still has the register with some context
<mrvn>
contents
<geist>
probably either 0 or UNDEFINED
<heat>
^^
<geist>
since its not used until software makes it so
<mrvn>
yeah, but that is an important difference.
<geist>
not really, since the cpu doesn't use it
<mrvn>
You can map rom to 0 but not to UNDEFINED
<geist>
doesn't matter, because it's not used until you enter protected mode, which software is required to do, and software is supposed to set up the IDT before doing so
<geist>
prior to 286 x86 simply existed and could run meaningful stuff at instruction 0. no additional state to set up
<mrvn>
but you want to go straight to 32bit with a pin. And then you could have an IDT at 0 for the NMI.
<geist>
mrvn: oh sure. that was my point of the whole discussion. there's all this extra shit you have to set up before going to protected mode
<geist>
anyway i think we may have beaten this horse
<geist>
more of a thought experiment in the challenges of starting an x86 in > real mode
<heat>
ok right so
<heat>
all the descriptor tables can be in ROM, and are in ROM when switching from 16 to 32-bit in firmware
<geist>
yah would just have to define what the starting addresses of the tables are
<geist>
an ugly hack but so it goes
<heat>
what's a platform without ugly hacks
<geist>
like 'entry point is at X, IDT is at Y, GDT is at Z, suggested contents for GDT is WWWW'
<geist>
heat: riscv!
<heat>
yet!
<heat>
how's lk-user coming along?
<mrvn>
heat: all my hacks are beautiful :)
<geist>
heat: nothing today, going to get LK working with gcc 12.1 first, and debating what to do with the shared riscv thing but i dont need it yet
<geist>
i had wired up a little file descriptor table though
<geist>
probably the next thing to do is either decide to go the musl route or continue to newlib for now
<geist>
probably the second for a bit, since i can wire up more stuffs that way
<geist>
musl is a huge task and it's highly linux centric
<heat>
if you call it a handle table you're 50% less UNIX-y :P
GeDaMo has quit [Quit: There is as yet insufficient data for a meaningful answer.]
<geist>
yah trouble is really it's the whole 'how does the table allocate and pack its values'
<geist>
posix has very specific first pack which is pretty bad honestly, but its fairly baked into the design
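The packing rule in question is that POSIX calls such as open() and dup() return the lowest-numbered descriptor that is not currently open, which rules out cheaper schemes like a free list or generation-tagged handles. A minimal sketch of that policy, with a hypothetical `FdTable` class and a fixed capacity chosen only for illustration:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical descriptor table using POSIX-style lowest-free allocation.
class FdTable {
 public:
  static constexpr int kMaxFds = 16;  // assumed capacity

  // Stores object in the lowest unused slot and returns its index,
  // or -1 if object is null or the table is full.
  int Allocate(void* object) {
    if (object == nullptr) return -1;
    for (int fd = 0; fd < kMaxFds; ++fd) {
      if (slots_[static_cast<std::size_t>(fd)] == nullptr) {
        slots_[static_cast<std::size_t>(fd)] = object;
        return fd;
      }
    }
    return -1;
  }

  // Frees a slot; returns false for an out-of-range or already-free fd.
  bool Release(int fd) {
    if (fd < 0 || fd >= kMaxFds) return false;
    void*& slot = slots_[static_cast<std::size_t>(fd)];
    if (slot == nullptr) return false;
    slot = nullptr;
    return true;
  }

 private:
  std::array<void*, kMaxFds> slots_{};  // nullptr marks a free slot
};
```

The linear scan is the cost of the rule: a freed low number must be reused before any higher one, so the table cannot simply hand out the next unused index.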
<mrvn>
geist: 1342 endianness
<geist>
i'm really debating in my head what i'm trying to build aside from hello world
<mrvn>
mandelbrot? computing PI?
<mrvn>
frogger
<j`ey>
qemu
<heat>
irc client
<mrvn>
tetrinet
<heat>
by the way, uclibc-ng is a thing, although I've heard it's of lesser quality than musl (and LGPL licensed)
<heat>
geist, by the way why do you want to switch from newlib?
<heat>
is it that bad?
<geist>
it's fine for embedded, but it seems to be lacking
<geist>
things like internal locking, it's pretty unclear how well it does any of that
<geist>
i should work with it a bit more though, to be fair. it's more of a basic libc in the stdio + heap sense, it seems
Likorn has quit [Quit: WeeChat 3.4.1]
<heat>
have you tried building some simpler packages with it
<heat>
klange used it for quite a bit of time so it must be capable of something
<geist>
yah i probably should stick with it honestly
<geist>
the musl thing was more of a 'lets see how hard this would be' and it seems like it's at least straightforward, just non trivial
<geist>
haha looking at the fragmentation of some of my VM disk files on my nas server
<geist>
401k fragments for my windows 10 img file!
<heat>
i think that if you want to avoid having a posixy interface, you should avoid musl, at least for now
<geist>
i think so too to be honest
<heat>
it did influence quite a bit of my design
<heat>
you could also go full posix and make the perfect svr4 clone
<geist>
i'm torn between 'lets just do posix so i can be like sortix' and 'meh more fun to build a lower level more embeddedy but user space thing'
<geist>
and the latter is probably an actually useful thing
<heat>
how lower level?
<geist>
but since i'm keeping the lkuser stuff a separate project, there's nothing that keeps someone from building something else on it
<geist>
ie, still maintaining a kernel vs user space implementation as separate layers
<geist>
oh i dunno, i mean lower level as in doesn't need fork() signals, etc. more of 'here are some files, here's a way to access the network, here's a way to create more processes and threads'
<geist>
'here's a way to get to devices other than /dev nodes'
<geist>
'here's some ipc and futexes to build stuff out of'
<mrvn>
like an exokernel or lib kernel?
<geist>
mrvn: what are those?
<geist>
or more specifically what do you mean when you say those (everyone has different definitions of that)
<mrvn>
Basically the whole kernel as a library you link against.
<geist>
and run in user space?
<mrvn>
kernel space
ozarker_ has joined #osdev
<geist>
i guess? basically? depends on if you consider linking a bunch of .o files as 'linking against' or whatnot
<mrvn>
I think the big point is that you replace system calls with function calls
<heat>
well, that's not it
<geist>
lk build system is very modular, so in this case what i'm doing is providing another module called 'lkuser' that implements the syscall layer
<mrvn>
hmm, so the equivalent of libc but not posix-y
<geist>
so it's basically building a veneer routine, an 'executive' so to speak that acts as a user space interface to the kernel that is largely unconcerned with user space
<mrvn>
I kind of envision that as ldso segment every user space program gets in my kernel.
<geist>
so no it's not at all the equivalent of a libc, it's more of adding a module to an existing modular system
<geist>
no i'm talking a layer below that as in LK currently has no concept of user space. it doesn't switch out of protected mode. you can write embedded stuff with that
<mrvn>
vdso even
<geist>
but you can simply add another layer that adds a user space and then provides syscalls with a 'personality' that interfaces with the inner LK code
<geist>
and you could build multiple kinds of these if you want, even at the same time
<mrvn>
Would you still link against .o files for that personality in the user space progs?
<geist>
this is functionally what we did for zircon, we started with LK and then started building up the zircon syscall interface
<geist>
no... i think you're thinking wayy too hard about this
<geist>
user space is user space. just like any other
<geist>
100% of what i've been talking about the last 5 minutes is how the kernel code is organized
<mrvn>
just wondering how the user space side is going to talk with the kernel side
<geist>
via syscalls
<geist>
of which the kernel side is implemented in the lkuser module
<geist>
which is linked with the LK kernel
<heat>
via lcall 0x7, per i386 svr4 abi
<geist>
(just to be clear, not what heat just suggested)
<mrvn>
seems to fit with what linux / bsd have as personality then
<heat>
nooooooooooooooooooooooooooooooo
<geist>
right
<geist>
basically add a particular user space personality to LK as a module you add to LK kernel
<geist>
where it'll get nasty is when i want to start implementing process termination and i need to be able to unblock threads in the LK kernel, which you currently can't do, i think
<heat>
anyway geist I think you're overthinking it
<heat>
both of those two designs can be useful
<geist>
which two designs?
<heat>
the posix thing and the light embedded userspace thing
<geist>
oh sure. exactly. it's mostly which ones i want to fiddle with first
<geist>
i think a point i was trying to make a while ago and got lost in the noise is i can do *both*
<geist>
because it's a personality that should be logically separate from the core kernel
<heat>
damn right
<heat>
suck it people that write budget linux
<geist>
at the expense of perhaps an added layer of abstraction
<mrvn>
don't you have that layer anyway because you have to copy the syscall args from user space memory to kernel and then call the actual kernel functions that do the job?
<heat>
what if you write an nt executive layer?
<mrvn>
heat: then you will get stoned
<mrvn>
The L4 microkernel has an optional linux portability layer.
<geist>
mrvn: sure but i mean a layer as in user space will say treat handles or file descriptors this way or use futexes for blocking, but the inner kernel may not have any of those concepts
<geist>
but in general good modular designs work that way anyway
<geist>
so it's not that much of a stretch
<mrvn>
geist: sure. My point is you already have that for a posix layer. Your kernel won't implement everything posix and the layer has to emulate it already. Adding an lkuser layer just changes what you have to emulate or even reduces it because it will be finetuned to lk
<geist>
right
<geist>
this is versus some sort of designs where the kernel *is* posix
<geist>
you could really bake a lot of that all the way down
<geist>
if you didn't care to implement anything else
<geist>
like say directly testing for pending signals inside interrupt handlers, instead of at least making some sort of veneer routine that separates those two layers
<geist>
i think that's generally one of my good engineering practices i'm pretty good at: building separation of layers and concerns between layers
<mrvn>
it's layers, all the way down. No, wait, that was turtles.
<dh`>
depends what you mean by "posix"
<dh`>
that is, any kernel that's supposed to be able to do complex things needs a way to interrupt stuff in progress
<geist>
usually the big standouts are: signals, fork, file descriptors that work in a particular way, notion of everything being in a fs namespace
<dh`>
and asynchronous notifications
<mrvn>
dh`: why?
<geist>
signals i think tend to infect a bunch of the core kernel pretty quickly, though not usually inexorably
<geist>
if nothing else because you have to test for various things at various points
<mrvn>
geist: is signalfd cleaner?
<geist>
but that can usually be abstracted with a layer of 'test for whatever the other layer wants here' sort of things you sprinkle around
<geist>
which is kinda a layering violation, but at least it's abstracted
<dh`>
mrvn: because in production uses you end up in situations where something's spending time/resources doing something useless and you want to stop it
<dh`>
you can get away without any kind of kill/interrupt mechanism in a special-purpose system but it's a significant limitation
<mrvn>
dh`: so check a signalfd every now and then. No need to interrupt the process.
<dh`>
what does "interrupt" mean in this context? that you poke the process while it's busy and it responds
<dh`>
there are a lot of ways to implement that
<mrvn>
dh`: stopping it and changing the IP/SP
<clever>
mrvn: i found signalfd handy to easily deal with ctrl+c in a select() loop, without having to deal with volatile vars and checking them on every iteration
<dh`>
no signal implementation I know of does that directly
<dh`>
(then again, I'm sure there are lots I don't know of)
<clever>
epoll actually
<mrvn>
dh`: that's basically the posix way. you set a function to be executed when interrupted
<bslsk05>
github.com: rpi-open-firmware/uart-manager.cpp at master · librerpi/rpi-open-firmware · GitHub
<dh`>
uh
<dh`>
*userland* needs a function to be executed when interrupted
<dh`>
that has little effect on the kernel internals
<mrvn>
dh`: signalfd says otherwise
<dh`>
how? signalfd is just an alternate mechanism for delivery to userland
<mrvn>
I think there are many ways to implement it both for user and kernel.
<clever>
depends on what you want done with the signal
<heat>
sigqueue also works
<dh`>
the part of signals that actually affects the kernel architecture is the machinery that causes a blocked process to unblock and bail out
<geist>
yah and really you need that the instant your user space has the notion of a forceful termination of a thread or process
<clever>
signalfd doesnt do that, and just converts the SIGINT into a write on an FD
<geist>
ie, thread A in process A calls proc_exit() and takes out thread B
<mrvn>
dh`: so start fixing the underlying issue: blocking. Most kernels do async IO internally nowadays.
<geist>
that's the part i'll have to plumb through LK that i'm not looking forward to
<clever>
forceful termination doesnt really need userland signals
<geist>
but... already done it at least once, for zircon
<clever>
it just needs a way to wake the thread and have the kernel side clean up after itself
<geist>
right
<dh`>
I don't claim to understand what signalfd does and doesn't do, but in order to deliver a signal you still need to interrupt the target process
<clever>
forcing the userland to temporarily run a sig-handler is entirely separate
<geist>
functionally it means every place you block on an event_t or whatnot (in LK terminology) has to handle it returning with a specific error code
<geist>
like ERR_UNBLOCKED
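The shape of that requirement can be sketched with a small auto-reset event whose wait reports why it returned. This is not LK's event_t API: `Event`, `WaitStatus`, `ForceUnblock` and `ReadSketch` are hypothetical names, and the cancellation flag lives on the event itself for brevity, whereas a real kernel would more likely track it per thread.

```cpp
#include <condition_variable>
#include <mutex>

// Why a blocking wait returned.
enum class WaitStatus { kSignaled, kUnblocked };

// Hypothetical auto-reset event that can be forcibly unblocked.
class Event {
 public:
  // Blocks until signaled or forcibly unblocked; forced unblock wins.
  WaitStatus Wait() {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return signaled_ || unblocked_; });
    if (unblocked_) return WaitStatus::kUnblocked;
    signaled_ = false;  // auto-reset for the next waiter
    return WaitStatus::kSignaled;
  }

  // Normal wakeup: the awaited condition happened.
  void Signal() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      signaled_ = true;
    }
    cv_.notify_one();
  }

  // Termination path: every current and future Wait() returns kUnblocked.
  void ForceUnblock() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      unblocked_ = true;
    }
    cv_.notify_all();
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  bool signaled_ = false;
  bool unblocked_ = false;
};

// Every blocking call site has to check the status and unwind cleanly.
// Returns 0 when data arrived, -1 when the wait was torn down.
inline int ReadSketch(Event& data_ready) {
  if (data_ready.Wait() == WaitStatus::kUnblocked) {
    return -1;  // release locks/buffers here, then propagate upward
  }
  return 0;
}
```

The cost geist is pointing at is in the call sites, not the event: each one needs an error path that releases whatever it holds before returning.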
<clever>
dh`: signalfd basically just routes certain signals to an fd, which you then read/poll/select/epoll as a normal fd, and it no longer interrupts your process
<clever>
you just block on it along with all of your other inputs
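A minimal Linux-only sketch of what clever describes: SIGINT is blocked for normal delivery, routed to a signalfd, and that fd is waited on through epoll like any other input. `signalfd`, `sigprocmask`, `epoll_create1`, `epoll_ctl` and `epoll_wait` are the real Linux calls; the three helper function names are made up for the example.

```cpp
#include <signal.h>
#include <sys/epoll.h>
#include <sys/signalfd.h>
#include <unistd.h>

// Blocks SIGINT for ordinary delivery and returns an fd that becomes
// readable when SIGINT is pending. Returns -1 on failure.
inline int OpenSigintFd() {
  sigset_t mask;
  sigemptyset(&mask);
  sigaddset(&mask, SIGINT);
  if (sigprocmask(SIG_BLOCK, &mask, nullptr) != 0) return -1;
  return signalfd(-1, &mask, SFD_CLOEXEC);
}

// Creates an epoll instance watching fd for readability. Returns -1 on failure.
inline int MakeEpollWith(int fd) {
  int epfd = epoll_create1(EPOLL_CLOEXEC);
  if (epfd < 0) return -1;
  epoll_event ev{};
  ev.events = EPOLLIN;
  ev.data.fd = fd;
  if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) != 0) {
    close(epfd);
    return -1;
  }
  return epfd;
}

// One loop iteration: returns the signal number read from sfd,
// 0 on timeout or unrelated activity, -1 on error.
inline int WaitForSignal(int epfd, int sfd, int timeout_ms) {
  epoll_event ev{};
  int n = epoll_wait(epfd, &ev, 1, timeout_ms);
  if (n < 0) return -1;
  if (n == 0 || ev.data.fd != sfd) return 0;
  signalfd_siginfo info{};
  if (read(sfd, &info, sizeof(info)) != static_cast<ssize_t>(sizeof(info))) return -1;
  return static_cast<int>(info.ssi_signo);
}
```

A real loop would add its sockets or UART fds to the same epoll instance and treat a SIGINT result as the cue to shut down. In a multithreaded program the mask would need to be set with pthread_sigmask before other threads are created.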
<dh`>
mrvn: you can write your kernel so _nothing_ blocks, but that's very expensive from a code structure standpoint
<geist>
oh yeah that's another posixy thing that's a pain in the ass: select()
<geist>
poll() is a little easier, but ugh. select
<clever>
geist: why not just do epoll only?
<dh`>
doing some bulk I/O ops asynchronously is not like e.g. doing ops like mkdir asynchronously
<geist>
because if you're doing posix you gotta do it all
<clever>
geist: userland wrapper around epoll?
<geist>
possibly
<mrvn>
dh`: with signalfd you no longer block on select/poll/epoll because they return activity on the signalfd instead of your program being interrupted
<heat>
poll and select aren't easy to emulate with epoll
<heat>
definitely isn't fast
<geist>
yah hadn't looked into it
<heat>
and most software out there uses poll/select and not epoll
<clever>
heat: slap any program doing that and tell them to get with the times :P
<dh`>
mrvn: I don't understand what you mean
<geist>
yah maybe i'll just build another message passing kernel :)
<mrvn>
dh`: the interruption of a signal turns into a normal return of the syscall.
<dh`>
application calls poll, goes to sleep, signal comes in, you still need to wake up the sleeping process
<mrvn>
dh`: yes, you wake up. But with activity on the FD, not with EINTR
<dh`>
but also, application is in the middle of its select loop doing something else and blocked in say mkdir, you still need to wake it up
<dh`>
and as I've been trying to say, EINTR is a userland-facing phenomenon
<mrvn>
dh`: no, that gets redirected into the FD. you don't get interrupted anymore.
<mrvn>
dh`: if you block in mkdir you are screwed.
<dh`>
if you do that, then the signal maybe never gets delivered
<dh`>
and at least for unix signals, that's not part of the expectation
<dh`>
maybe you don't care, but in that case you also need a way to unstick a process that's sitting in mkdir forever on a dead network volume
<heat>
that's generally not interruptible in linux, only interruptible for kill signals
<mrvn>
dh`: it's for processes that use a select/poll/epoll loop. And you pick which signals you want to keep as interrupts and which to redirect to the signalfd
<dh`>
heat: that's a matter of nfs mount options
<dh`>
at least in nfs
<dh`>
it's also linux-specific from what I can see
<clever>
i had my first thundering herd in years yesterday
<clever>
turns out, my nas was hard nfs mounted to my irc box
<clever>
and when the nas hung, the df's from cacti hung
<clever>
and when the nas recovered, some 300 df's came to life at once
<geist>
noice
<dh`>
> it's also linux-specific from what I can see <-- that is, signalfd
<heat>
write a 4.4BSD clone
<heat>
the peak of unix
<heat>
everything went downhill from there on
<geist>
dh`: oh hey you know a thing about binutils and riscv
<bslsk05>
github.com: binutils-gdb/elf32lriscv-defs.sh at master · bminor/binutils-gdb · GitHub
<geist>
as far as i can tell riscv (along with microblaze) are the only arches i've tried that have that particular nerf in place
<geist>
i hit it the other day when trying to make a shared lib with my -elf toolchain
<geist>
nuking that test seems to have no ill effect. i wonder why that's there?
<dh`>
that looks useless and broken
<heat>
that settles it
<geist>
as someone on another discord put it it's another case of discrimination against embedded elves
<dh`>
it's not like shared libraries don't work on riscv or something
<dh`>
nor are elf shared libraries os-specific; I mean, it _says_ "elf"
<geist>
exactly. it seems to work just fine, and though sure maybe the gcc defaults for -elf are not useful (though they seem to be) but you can drive ld with all the switches manually if you really want
<geist>
except when it doesn't support -shared
<geist>
i guess i should file a bug about it then
<geist>
i tried about 10 other arches i maintain toolchains for and i think the only other one i saw that had a similar thing is microblaze
<clever>
something else i was thinking about, relocations and shared vs static libraries
<geist>
but that tends to be a *highly* embedded target
<dh`>
yeah I would file a bug on it
<geist>
yay validation!
<clever>
for say a static userland binary, would relocations usually be missing, and the kernel just loads it to a fixed addr and job done?
<dh`>
that kind of thing creeps in by accident and then nobody will notice it's there unless they happen to step on it themselves
<heat>
geist, maybe just send a patch
<mrvn>
clever: before security mitigations, yes
<heat>
from my experience, things go slowly in GNU toolchain land
<dh`>
(also, binutils configury is very extra so it's very hard to avoid having stuff like this happen)
<clever>
mrvn: and how would i convince the linker to include relocation data anyways?
<mrvn>
clever: PIE
<geist>
or, it has relocations but if the loader puts it where it's natively linked, the relocations all end up evaluating to a NOP
<geist>
that gives you the best of both worlds: something you *could* relocate, but if you dont have to you dont do any of the work or dirty any COW pages
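A minimal sketch of the idea above, assuming a simplified loader that only handles "relative" fixups stored as Elf32_Rel-style (r_offset, r_info) records with the addend held in place in the image. The name `apply_relative_relocs` and the `relative_type` parameter are illustrative assumptions, not any real loader's API; the relocation type number is passed in because it differs per architecture.

```python
import struct


def apply_relative_relocs(image, rel_table, link_base, load_base, relative_type):
    """Patch `image` (a bytearray linked at link_base) for running at load_base.

    rel_table holds packed little-endian 32-bit (r_offset, r_info) pairs,
    where r_offset is the linked virtual address of the word to fix up and
    the low 8 bits of r_info are the relocation type.
    Returns the number of words patched.
    """
    delta = load_base - link_base
    if delta == 0:
        # Loaded exactly where it was linked: every fixup would add zero,
        # so skip the table entirely and leave the pages untouched.
        return 0

    patched = 0
    for r_offset, r_info in struct.iter_unpack("<II", rel_table):
        r_type = r_info & 0xFF
        if r_type != relative_type:
            # This sketch deliberately supports only one relocation kind.
            raise ValueError(f"unsupported relocation type {r_type}")
        pos = r_offset - link_base  # position of the word inside the image
        (word,) = struct.unpack_from("<I", image, pos)
        struct.pack_into("<I", image, pos, (word + delta) & 0xFFFFFFFF)
        patched += 1
    return patched
```

A bootloader stage that already parses ELF could run something like this after copying the segments, passing the address it actually chose as `load_base`.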
<heat>
clever, ld has switched to keep relocations
<dh`>
traditionally ever since virtual memory first appeared statically linked programs have a fixed load address
<heat>
s/switched/switches/g
<geist>
clever: generally i think just -dynamic i think? or something like that
<clever>
geist: but wont that also then require an interpreter and a runtime linker?
<clever>
and its not static anymore
<mrvn>
clever: who else would do relocations?
<clever>
even if you have 0 DT_NEEDED
<geist>
well not necessarily. depends on definition of static here
<geist>
i'd say static is if you have no external references
<geist>
but that's independent of relocations
<clever>
mrvn: trying to extrapolate into a static kernel, with relocations, and the bootloader patching it
<dh`>
it used to be that it was impossible with elf binutils to generate an executable image that still had ordinary relocations in it
<geist>
however, ELF combines the two mechanisms, so it's sometimes confusing
<mrvn>
clever: do you have binaries that don't use ANY dynamic libs?
<heat>
"--emit-relocs" or -q
<heat>
clever, ^^
<geist>
all an external patch looks like to ELF is a relocation that refers to a non-local symbol
<dh`>
if that has changed, that's a nice plus
<mrvn>
clever: that's what geist is doing with some magic
<clever>
heat: *checking*
<dh`>
because it made it extremely painful to build executables for nommu systems
<geist>
in the past i used to just manually drive the toolchain, specifically ld, to make what i want
<geist>
vs using a higher level pre-canned notion of what gcc wants me to do
<mrvn>
clever: So far I'm just writing my boot.S by hand to be 100% PC relative and setup the kernel to run in higher half and then the kernel is linked to a fixed address.
<geist>
but that tends to be nasty business, but usually doable since ld usually lets you get what you want if you try hard enough
<geist>
though that doesn't apply necessarily to gold or lld alas.
<clever>
mrvn: in my case, there is no mmu, and the top of the address space moves based on how much ram i have
<geist>
haha oh that's the classic CP/M problem
<geist>
SYSGEN.COM to fix that!
<clever>
either i put it at a fixed addr and create a hole in the middle of my ram
Likorn has joined #osdev
<mrvn>
clever: then go write an elf loader with relocations and add that as a stub before the kernel (or equivalent). That also lets you do kernel address space randomization.
<clever>
or i apply relocation patches to it
<geist>
actually apple ][ DOS 3.3 had that problem too: if you booted a disk on a 48k machine that was formatted at 64k it wouldn't boot because the dos it loaded off the disk was linked at the wrong address
<mrvn>
clever: or run the kernel at a fixed low address and the user space higher up
<clever>
mrvn: ive already got an elf loader in the previous stage, which i can extend to support relocations, so i just need the linker to emit them
<geist>
at least x86 DOS didn't have this problem because segments FTW
<mrvn>
clever: look at the PIC/PIE output to see if that suits you. Be aware that the relocation format changes from arch to arch and depending on flags.
<clever>
yeah, segmentation was basically a cheaper MMU
<zid>
I was about to joke just add segments
<geist>
they were trashy, but actually kinda helpful when you think about it. lets you easily build relocatable stuff
<geist>
for that particular era of hardware/software at least
<zid>
I mean that was always the point right
<mrvn>
geist: you mean in 16bit mode?
<zid>
it solves the issue of how to run multiple programs very neatly
<geist>
yah. it served two purposes: extend the address space and also build relocatable stuff
<zid>
virtual memory is way too galaxy brain for that era (too many gates if nothing else)
<geist>
yah and drivers too. DOS TSRs and drivers just got loaded somewhere modulo 16 bytes and had segments for their base/etc
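The arithmetic behind this can be sketched as follows, assuming plain 8086 real mode (20 address lines, so linear addresses wrap at 1 MiB). The helper name `real_mode_linear` is hypothetical.

```python
def real_mode_linear(segment, offset):
    """Linear address for a 16-bit segment:offset pair in 8086 real mode."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # The segment register counts 16-byte paragraphs, which is why code can
    # be loaded at any 16-byte boundary without patching its offsets.
    return ((segment << 4) + offset) & 0xFFFFF


# The same in-segment offset works wherever the loader put the segment:
entry_offset = 0x0100
first_load = real_mode_linear(0x1234, entry_offset)
second_load = real_mode_linear(0x2000, entry_offset)
```

Only the segment base changes between the two loads; the offsets baked into the program stay the same, so no relocation pass is needed.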
<mrvn>
I think they used segments in 32bit just because they already had them.
<geist>
or 16 bit protected mode
<dh`>
the 80286 segments were clearly intended to run multics but done by people who didn't read the directions adequately
xenos1984 has quit [Read error: Connection reset by peer]
<heat>
zid, virtual memory is the galaxy brainiest idea known to man
<dh`>
and the 80386 segments were basically structurally identical
<heat>
even now
<geist>
yah. i was saying earlier they had too much iapx432 koolaid in the water coolers around Intel at the time
<heat>
"imagine memory, but it's not really there, until it is, but they it might not be, and if you write to it, it might switch to something else"
<geist>
which was segments but galaxy brain version
<zid>
"First, imagine a full 32bit lookup table to map any integer to any other integer"
<geist>
'why dont we just implement OO in the hardware directly' <mind blown> 'lets throw in garbage collection too!'
<heat>
Jazelle was also great
<geist>
but when you think about it, for example, a non mmu 68k machine (say a mac) would have to deal with relocating binaries and whatnot
<geist>
since there was no hardware to do it for you. presumably when they loaded apps either it was at a fixed spot (one app on early macs) and/or had to be relocated on load. and the OS and drivers probably did
<clever>
amiga is also non-mmu 68k and nearly everything is relative addressing
<j`ey>
heat: BXJ!
<heat>
"java is slow in our processors" "what if we <hits bong> run java bytecode directly on the CPU"
<geist>
yah amiga too
<clever>
there is a magic pointer to the description of a library
<zid>
heat: I cry every time
<clever>
negative offsets give you a function pointer table, but i think the table is relative to that root pointer
<zid>
Every time someone mentions sim cards I cringe internally
<clever>
positive is a struct describing the library
<dh`>
in amigaos the call to load and relocate an executable was "LoadSeg"
<dh`>
not sure why I remember this
<geist>
yah i remember looking at a jazelle once at a company i was at, but at the time it was heavily restricted
<geist>
was some binary blob you weren't really allowed to look at, and it would trap at the drop of a hat
<clever>
isnt that the java on arm thing?
<j`ey>
yeah
<geist>
yah was an actual mode bit in the CPSR and everything
<geist>
i think it ran some subset of java bytecodes and would trap out as soon as something tricky happened
<clever>
from what ive read, hw support for every single java bytecode is optional
<clever>
and its possible for it to support not a single opcode, and still technically support the mode
<geist>
but it was very hidden under layers of licenses and whatnot, so we didn't go for it and just wrote our interpreter in assembly
<geist>
this was back when ARM was much harder to deal with. they were being hyper secret about everything like it mattered
<clever>
explains why i couldnt find much info online
<heat>
could you write a bootloader in java
<zid>
Should you though
<heat>
obviously yes
<heat>
if the cpu supports it
<heat>
I mean, it's right there
<heat>
why would you not use it
<zid>
heat are you feeling okay
<geist>
it's bad idea jeans day
<kazinsal>
also might be a bad idea to wear jeans today. it's actually above 20 C for the first time this year :toot:
<clever>
i just finished wiring that software into mqtt and home-assistant, so now i can write automation rules to change the temp
<clever>
just in time for summer, so i can not use any of that for a few months, lol
mahmutov has quit [Ping timeout: 246 seconds]
xenos1984 has joined #osdev
<geist>
you know i'm starting to think this 3950x is simply unstable. now that i think about it it was always vaguely crashy when it was my main desktop cpu
<zid>
shame cus that's a nice cpu
<zid>
give it more voltage?
<zid>
might just be a crappy bin
<geist>
i generally assumed it was the usual graphics stuff, etc
<zid>
does prime95 exist on linux
<geist>
absolutely
<geist>
but it doesn't seem to crash under load
<geist>
generally just sort of sitting there. after a few days
<geist>
i had a while back replaced the 3950x with a 5950x on my main desktop, and the latter is rock solid. that's when i had moved the 3950x down to my server
<geist>
and then it started getting more unstable over the next 6 months or so
<geist>
but it doesn't run hot or anything
<zid>
sounds like a clear candidate for MOAR VOLTS
<heat>
microcode?
<zid>
does microcode give you more volts
<heat>
no but it might fix bugs
<geist>
yah have that updated, but... part of that problem may be it's a relatively old AM4 mobo that hasn't gotten a bios update in a few years and probably wont any more
<geist>
so it'll be out of date except what linux uploads
<geist>
though i guess linux will do the right thing there
<dh`>
how much ram and is it ecc?
<geist>
64GB and yes
<dh`>
not that then
<geist>
i do have a 3900x i can stuff in and see though, will probably try that next
<geist>
but if it doesn't fail it doesn't mean a lot honestly
<geist>
though it's closer, power profile, to the 3950x so if it's a vreg on the mobo or whatnot i'd expect it to push it
<geist>
yay consumer level hardware being pressed into server usage. sigh.
<raggi>
geist: they should be still publishing update bios for the amd stuff I would have thought
<geist>
yah but if motherboard vendor doesn't actually put out a patch you dont get it
<geist>
but linux should have its own microcode patches
<clever>
yeah, ive heard that windows relies on the bios to patch microcode
<clever>
while linux just prepends the microcode to the initrd, and linux uploads it on boot
<raggi>
My gigabyte one I had to go root around in their "ftp" servers to find the more recent ones, they're just terrible at putting them where they're supposed to be
<raggi>
I swear the mobo manufacturers have some really weird software practices, no idea why but they just can't write and manage software in any normal ways
<raggi>
Yeah, my am4 even has an update from this month, and they finally published on the main site again
<clever>
my motherboard is capable of reaching out to the internet on its own, and updating its own firmware
<clever>
its just a button in the bios config
<raggi>
Yeah, that should be standard now, but alas. Mine still has this weird thing where when it boots in UEFI mode the GUI is super laggy, 1s pause then .5s run, repeating
<clever>
my gui has bloody twinkling stars in the background artwork, lol
<geist>
oh actually there is an updated bios, i should grab it
<raggi>
My am4 board is stable tho, which is better than the one my haswell is in, so they successfully drove my bar down to "MVP"
<geist>
yah honestly i'm really really sad if it turns out this cpu is fundamentally unstable
<geist>
since i'm pretty much team AMD at the moment. dont let me down!
<geist>
i'm trying hard to blame everything else. actually the fact that they've stuck with the same socket for at least 5 years makes things really nice to debug
<raggi>
Yeah, my 3850 I got on your recommendation and *touches wood 3 times* is still working real nice
<raggi>
You did reseat everything I presume?
<geist>
well, i swapped cpus
<geist>
so i should put the 3950x back and see again
<geist>
the annoying thing is the MTBF is like 3 days so it's a long term solution
<raggi>
I mean reseat the ram, and the gpu, etc
<raggi>
At this point could be arbitrary electrical fault
<geist>
yah i did that too
<geist>
i thought i was onto something by removing the vid card and it appearing stable, but it blew up eventually
<geist>
note it blowing up is a hard lockup. everything just stops. only a power cycle or reset fixes it
andreas303 has joined #osdev
<dh`>
I have an (older) machine that does that, eventually figured it was the motherboard not really supporting ecc dram
<geist>
possible. yes. maybe it's an ecc fault that the mobo then explodes on
<geist>
though in the past i've had ecc faults from time to time that showed up at least in linux's dmesg log, but maybe the 3950x (or zen 2 in general) has a different fault mechanism that the mobo can't deal with
<dh`>
characteristic symptom was that it would lock cold under load, including when rebooting into memtest86+, but a power cycle would clear the problem for weeks
<dh`>
or months
<geist>
whereas before i was using a zen 1 era cpu... that may be onto something
<geist>
i have generally found that ECC ram does generate recoverable errors fairly frequently
<geist>
though not usually every few days
<dh`>
in my case what appears to happen is that the error gets generated (might only be an unrecoverable error?) but nothing happens until you access that memory, so the system dies, usually under load, at intervals of days
<gamozo>
My Piledriver box would throw like 2-3 ECC errors a day (cheap ram, many sticks). Was kinda wild
<zid>
my friend's an ecc nut, he links me machines that have 500 ecc errors logged since they last booted
<geist>
hmm, also possible this 1500x i put back in it doesn't support ECC
<geist>
and thus doesn't have the problem
<geist>
i dont see any mention of it in the dmesg, but that may not be meaningful. but i thought linux would at least mention it
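Besides dmesg, one place to look is the Linux EDAC sysfs tree, which exposes per-memory-controller corrected and uncorrected error counters when an EDAC driver is loaded. A small sketch, assuming the usual `/sys/devices/system/edac/mc/mcN/{ce_count,ue_count}` layout; the function name `read_edac_counts` is made up for illustration, and an empty result only means no controller was registered, not that the hardware lacks ECC.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: (corrected, uncorrected)} from EDAC sysfs counters."""
    base = Path(root)
    if not base.is_dir():
        # No EDAC driver loaded (or not Linux): nothing to report.
        return {}

    counts = {}
    for mc in sorted(base.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text())
            ue = int((mc / "ue_count").read_text())
        except (OSError, ValueError):
            # Skip controllers whose counters are missing or unreadable.
            continue
        counts[mc.name] = (ce, ue)
    return counts


if __name__ == "__main__":
    found = read_edac_counts()
    if not found:
        print("no EDAC memory controllers registered")
    for name, (ce, ue) in found.items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

Polling this periodically and logging changes would show whether corrected errors pile up in the days before a lockup.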
<zid>
I'm still on the hunt for those 32GB of 933MHz ECC UDIMMs
<dh`>
anyway there's all manner of possible similar problems
<geist>
yah that's a good suggestion
<zid>
I like to crash hardware so that it's broken through reboots
<zid>
I kept doing it when I was testing my CPU with prime95