klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books
FreeFull has quit []
_xor has quit [Quit: bbiab]
heat_ has joined #osdev
heat has quit [Read error: Connection reset by peer]
<gog> i know a guy
<gog> i call him hraunmann cause hes got the lava
heat_ is now known as heat
sonny has joined #osdev
PSedlacek has joined #osdev
pretty_dumm_guy has quit [Quit: WeeChat 3.5]
azu has joined #osdev
sonny has quit [Remote host closed the connection]
sonny has joined #osdev
heat has quit [Ping timeout: 248 seconds]
<mrvn> great, I finally know a guy who knows a guy.
<moon-child> did the guys you knew previously not know guys?
<mrvn> ask gog
<moon-child> hmm, is knowing somebody reflexive? Because if so...
Likorn has quit [Quit: WeeChat 3.4.1]
<mrvn> bool knows(const Guy&, const Guy&) [[reflexive]];
Likorn has joined #osdev
sonny has quit [Ping timeout: 252 seconds]
gildasio has quit [Remote host closed the connection]
gildasio has joined #osdev
<ckie> came here to paste a little journal entry from my project. might help any other new people who are lurking
<ckie> I think actually going for barebones x86_64 was stupid and a waste of time. Should've just started with the compiler, linked against SDL2 and went from there since I don't really care about where the SMP bit is on the intel core 2 uf02urtf or whatever. I wanna see it alive.
<ckie> .. yes i do know the wiki mentions this. i read it all, but i did not listen
sonny has joined #osdev
srjek has quit [Ping timeout: 255 seconds]
\Test_User is now known as |Test_User
Gooberpatrol66 has quit [Ping timeout: 248 seconds]
PSedlacek has quit [Quit: Connection closed for inactivity]
<mrvn> ckie: what is 2 * 3 + 4?
<ckie> mrvn: i'm not a robot ya nerd
<ckie> (that is also how i interact with captchas. habit)
<ckie> last time i was here i asked about the segment registers because i was trying to change the gdt layout even though it was already valid
<mrvn> wow, that sentence has less word salad than your "journal entry".
<ckie> mrvn: well the word salad is because i wasn't expecting to actually show it to anyone except myself
<ckie> probably should've cleaned it up
* ckie is tired
<ckie> also using "ya nerd" as a derogatory was stupid. further evidence i should sleep instead of reading my code and placing ranty comments about how i did things stupidly
<zid> I also have absolutely 0 idea what the context is :P
<zid> just that you seem like you cleared some hurdle
<zid> in terms of a mistaken approach
<ckie> zid: yup, and i am very annoyed at past me for being so dumb with the previous design
<zid> whatever that may have been :P
<mrvn> zid: building a car was stupid for him because he needs a blender so he can butter the pizza and go from there because who cares if the tires are rubber.
<ckie> when i'm not tired i churn out a ton of code and then when i get tired my thoughts slow down and i find the architectural problems in what i programmed
<mrvn> zid: scroll up 30m. That sentence makes about as much sense as his original one.
<ckie> s/his/their
<ckie> mrvn: the annoying part is i can't tell which sentences make sense and which don't
<ckie> i can only see how i basically wrote a giant bug
<mrvn> "Where we are going we don't need roads"
<ckie> the quotes are nice for my melted brain
<ckie> do continue
<ckie> also was 2*3+4 supposed to be evaluated with operator precedence
<ckie> s/$/\?/
<ckie> s/\?/?/
<ckie> i should really sleep
<ckie> i'm gonna do that
<ckie> mrvn, zid: thanks for enduring my brain puke. good night
sonny has quit [Remote host closed the connection]
GeDaMo has joined #osdev
<Mutabah> Ugh... why did I decide to implement EHCI
<Mutabah> The hardware interface is quirky and has complex documentation
<Mutabah> and then you get to the USB2/USB1 split transaction handling
<Mutabah> (tip: Run, run very far)
<mrvn> run silent, run deep
bliminse has quit [Quit: leaving]
Likorn has quit [Quit: WeeChat 3.4.1]
mahmutov has joined #osdev
bliminse has joined #osdev
zaquest has quit [Quit: Leaving]
zaquest has joined #osdev
Gooberpatrol66 has joined #osdev
azu has quit [Ping timeout: 248 seconds]
<Mutabah> For those who don't know: EHCI needs to know what USB2 hub (device ID and port) is used for any non-usb2 device you talk to
<Mutabah> And needs to know the speed of that device
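The bookkeeping Mutabah describes can be sketched as bit-packing. This is a small Python model rather than driver code: the helper name is made up, and the bit positions (endpoint speed in bits 13:12 of the queue head's endpoint characteristics dword, hub address in bits 22:16 and hub port in bits 29:23 of the endpoint capabilities dword) are an assumption based on the EHCI queue head layout, so check them against the specification before relying on them.

```python
# Hypothetical helper: pack the split-transaction fields of an EHCI queue head.
# The bit positions are an assumption and should be verified against the spec.

EPS_FULL, EPS_LOW, EPS_HIGH = 0, 1, 2  # endpoint speed encodings


def split_fields(speed, hub_addr, hub_port):
    """Return (characteristics_bits, capabilities_bits) for one endpoint."""
    if not 0 <= hub_addr < 128 or not 0 <= hub_port < 128:
        raise ValueError("hub address and port are 7-bit fields")
    characteristics = speed << 12  # endpoint speed, bits 13:12
    capabilities = 0
    if speed != EPS_HIGH:
        # Only full/low-speed devices behind a USB2 hub need the hub
        # address and port, which is the part the driver has to track.
        capabilities = (hub_addr << 16) | (hub_port << 23)
    return characteristics, capabilities
```

For a low-speed device on port 3 of the hub at address 2, `split_fields(EPS_LOW, 2, 3)` fills in both fields; for a high-speed device the hub fields stay zero.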
Likorn has joined #osdev
<zid> Have you considered OHCI
<Mutabah> Oh, I've implemented that already
<Mutabah> I should just dump EHCI and go to XHCI
<zid> huh interesting
<zid> I guess technically qemu's q35 is OCHI, but I don't imagine they implement it?
<zid> wait I don't mean OCHI that's why
<zid> fuck
<zid> I meant UHCI
the_lanetly_052_ has joined #osdev
ElementW has quit [Ping timeout: 248 seconds]
ElementW has joined #osdev
SGautam has joined #osdev
Burgundy has joined #osdev
ketan_ has joined #osdev
dennis95 has joined #osdev
<Mutabah> ohci/uhci are USB 1 controllers
<Mutabah> ehci is USB2, and a pile of hack
<Mutabah> xhci is USB3+ and apparently is not too bad
nyah has joined #osdev
pretty_dumm_guy has joined #osdev
pretty_dumm_guy has quit [Quit: WeeChat 3.5]
Burgundy has quit [Ping timeout: 255 seconds]
mahmutov has quit [Quit: WeeChat 3.1]
mavhq has quit [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.]
mavhq has joined #osdev
mahmutov has joined #osdev
mahmutov has quit [Client Quit]
SGautam has quit [Quit: Connection closed for inactivity]
xenos1984 has quit [Read error: Connection reset by peer]
mahmutov has joined #osdev
pretty_dumm_guy has joined #osdev
xenos1984 has joined #osdev
jafarlihi has joined #osdev
<jafarlihi> Is it possible to fuzz OpenBSD running in KVM using syzkaller without creating another VM inside OpenBSD?
vdamewood has joined #osdev
bauen1 has joined #osdev
ketan_ has quit [Read error: Connection reset by peer]
heat has joined #osdev
<heat> jafarlihi, yes but you need to change syzkaller
<heat> fuchsia and another system run in that mode
<jafarlihi> Is it worth it? That is, will I find bugs or is it fuzzed to death already?
<heat> fuzzed to death
<bslsk05> ​syzkaller.appspot.com: syzbot
jafarlihi has quit [Quit: WeeChat 3.5]
sortie has quit [Quit: Leaving]
sortie has joined #osdev
eroux has joined #osdev
srjek has joined #osdev
bauen1 has quit [Remote host closed the connection]
heat has quit [Remote host closed the connection]
heat has joined #osdev
pretty_dumm_guy has quit [Ping timeout: 255 seconds]
mahmutov has quit [Ping timeout: 240 seconds]
mahmutov has joined #osdev
sebonirc has quit [Remote host closed the connection]
sebonirc has joined #osdev
the_lanetly_052_ has quit [Ping timeout: 248 seconds]
vdamewood has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
<ddevault> this kernel project is the most fun I've had programming in some time
<heat> welcome to the biggest timesink of the rest of your life
<ddevault> this is my second kernel and, uh, fifth huge project
<zid> weren't you raging yesterday about segment selectors? :P
<heat> it's part of the experience
<ddevault> oh, naturally there are frustrating parts
<ddevault> I did choose an implementation language which doesn't even have DWARF support
<ddevault> I am dreading ACPI though
<heat> import ACPICA
<ddevault> I will, eventually, in userspace
<ddevault> but I intend to defer it as long as possible
<heat> microkernel?
<ddevault> yeah
* heat cries in /unix
<ddevault> I currently don't have any provisions for running C code yet
<ddevault> so it will take a little bit of effort to set up C drivers like ACPICA
<zid> 'C drivers like acpica'?
<ddevault> ACPICA is, in fact, written in C
<heat> #fact
<zid> I mean sure, but
<ddevault> I will have to write up a little C library for interfacing with the syscall API and kernel services for it
<ddevault> not libc, thankfully
<zid> ah in that sense
heat_ has joined #osdev
heat has quit [Ping timeout: 248 seconds]
heat_ is now known as heat
<mrvn> ddevault: do you have inline asm?
<ddevault> no
<ddevault> we dpm
<ddevault> we don't* intend to add it, either
<mrvn> You need some asm, no way around that. So better get started on that C interface.
<ddevault> we write assembly files separately and link to them
<ddevault> works fine
<ddevault> would not have gotten this far without it
<mrvn> then why can't you link to C files the same way?
<ddevault> well, I could, but it would require some build system stuff
<ddevault> and in any case, I don't want to
<ddevault> ACPICA should be implemented as a standalone driver, not linked to Hare code imo
<mrvn> one entry to compile C source, big deal. :)
<ddevault> communicate over IPC
<j`ey> ddevault: does sr.ht not have search ability?
<ddevault> what are you searching for?
<j`ey> stuff inside a repo
<ddevault> no, not yet
<j`ey> ddevault: https://git.sr.ht/~sircmpwn/helios/tree/master/item/caps/%2Bx86_64.ha in this view it shows the commit '970eee3d'.. but that doesnt touch this file
<bslsk05> ​git.sr.ht: ~sircmpwn/helios: caps/+x86_64.ha - sourcehut git
<ddevault> yeah, that's just the latest commit on that branch
<j`ey> confusing (for github users at least..)
<ddevault> agreed
<j`ey> anyway, LAST = PAGE looks wrong?
<ddevault> good call
<zid> what's a .ha out of interest
<bslsk05> ​harelang.org: The Hare programming language
<zid> I don't like their example code so clearly it's a terrible language with no redeeming features
<j`ey> lol
<ddevault> I designed this language -_-
<zid> I wasn't talking about the design
<j`ey> buf: *[*]u8: uintptr: u64,
<j`ey> wut
<ddevault> context?
<bslsk05> ​git.sr.ht: ~sircmpwn/helios: vulcan/rt/syscalls.ha - sourcehut git
<ddevault> yeah, this is not great
<ddevault> can skip the *[*]u8 cast
<zid> I still can't figure out what 0z means, the quick language guide has a section on inferred suffices but it isn't there, i u u8 and f32 are
<ddevault> it's size, which is equivalent to size_t
<mrvn> zid: follow the yellow brick road
<zid> surprised it isn't ssize_t but it makes sense to use z for it at least
<citrons> why ssize_t?
<zid> because he has a u for unsigned
<zid> and z is from the printf specifier for size_t %zu
<ddevault> z is from the "z" in "size"
<ddevault> which is also where printf gets it from
<zid> so z without u to me says "ssize_t"
<ddevault> hare does not link to libc, we do not need any of its baggage
<citrons> ah. well, ssize_t wouldn't be very useful for hare
<zid> maybe not
<mrvn> why do syscalls always return a pair of u64?
<zid> nor was I suggesting anything
<ddevault> efficiency, we have two return registers
<zid> I just said it was surprising
<citrons> sure
<mrvn> It should be i8, i16, i32, i64, iz, u8, u16, u32, u64, uz
<mrvn> just for consistency
<ddevault> we don't have a signed size type at all
<ddevault> something cannot have a negative size
<ddevault> it exists only as a hack for libc
<mrvn> ddevault: nothing for a ptrdiff_t?
<ddevault> not presently, no
<mrvn> Actually, is it i64 or s64?
<ddevault> i64
<zid> Okay so the *actual* problem I have with the example, is that it seems to be throwing grammar at the screen in order to demonstrate it, *I hope*
<zid> Because if it hasn't done that this code is kind of crazy
Jari-- has joined #osdev
<citrons> which example
<zid> the one.. on the front page
<zid> that I got linked
<mrvn> 14 lines for hello world. hare sucks.
<ddevault> I would be more prepared to listen to your arguments if you had spend more than 90 seconds with the language
<ddevault> spent*
<zid> I wasn't talking about the language
<zid> for the nth time
<zid> You're getting defensive over something and I don't know what or why
<j`ey> then whats wrong with the example?
<mrvn> ddevault: no i++ or ++i in hare?
<zid> I said the example looks like it's using grammar just to use it, not that anything is wrong with it
<ddevault> mrvn: no, we haven't bothered but I'm not entirely opposed
<mrvn> ddevault: how about range-for?
<ddevault> no
<zid> if the z there is *required* to make it compile, then I'd maybe have something to say about the /language/ but I don't know anything about the language so I can't
<mrvn> generators?
<mrvn> list comprehension?
<ddevault> not as a first-class language feature
<ddevault> and also no
<ddevault> hare is closer to C than to anything else
<ddevault> knock it off, zid
<mrvn> are strings 0 terminated or length prefixed?
<zid> Literally no idea what you're talking about
<ddevault> neither
<ddevault> the length and data are stored separately
<mrvn> so more like c++ then
<ddevault> a str type contains a length and a pointer
<mrvn> First problem I have with hare just from the front page: "const"? Why isn't that the default. Mutable should be the exception that you mark down.
<ddevault> there is no default
<ddevault> let and const are different keywords
<ddevault> it's not let const ... or let mut ...
<mrvn> and let is mutable and const is not?
<zid> heat: I love your reddit name :D
<heat> thanks
<ddevault> correct
<heat> zid, lots of doomers in that post btw
<mrvn> ddevault: so I could write: for (const i = 0z; ; )?
<ddevault> no, the minimum required for a for loop is the condition
<ddevault> you can write for (const i = 0z; true) though
<zid> that post being the nvidia gpu one?
<mrvn> ddevault: strange choice to use "let" there and not "var" or something
* ddevault shrugs
<ddevault> I like my bike sheds painted blue, how about you
<mrvn> ddevault: clashes with my tardis
<heat> zid, yes
<zid> I assume he meant to add "Given that the newer nvidia cards will be moving to firmware based drivers, do you think the DRI layer will get thinner, and if so, will it allow hobby OSs to possibly interface nvidia cards?"
<Jari--> Does anyone still use Objective C? I used to work at a telecom company, doing GTK+/Objective C, Maemo, etc.
<zid> but he asked "CAN OS USE NVIDIA???/"
<citrons> I wish lua used `let` instead of `local`
<citrons> so much typing
<mrvn> Don't want to run from some monster and end up in my bike shed by accident
<zid> I wish lua had 'continue'
<citrons> it has `function` and not `fn` as well
<heat> zid, OS could always use nvidia
<heat> like, it has been done
<citrons> apparently `continue` is weird with lexical scoping
<mrvn> both lua and hare sucks, they have no "fun".
<heat> it's not easy, but you can do it and will be able to do it
<zid> I'm not aware of any hobby OS that implements say, accelerated DX though
<heat> zid, DX?
<zid> directx
<ddevault> DX is proprietary you dork
<ddevault> you mean GL or VK
<zid> no I don't
<zid> I mean DX
<ddevault> right
<Jari--> you get sued really well
<ddevault> good luck with that
<heat> you theoretically can if you use wine
<mrvn> Just compile wine for the OS and you have DX
<Jari--> whine
<citrons> Wine Is Not A Hobby OS
<heat> you'll never get sued if you implement DX
<zid> wine doesn't do it natively does it? afaik for older dx at least it was rewriting it as gl
<heat> obviously
* Jari-- is working on Commodore Basic virtual machine project
<zid> you can do native dx if you have a windows host and microsoft's balloon driver though
<zid> in a vm
<citrons> I'm sure reactos aspires to have an open source directx
<heat> it does
<heat> ...
<ddevault> reactos is largely based on wine
<zid> which is basically the solution you *might* be able to get away with, if nvidia's plan is set up to allow it to be easy
<ddevault> it will probably just use the same OpenGL translation layer
<citrons> probably
<zid> I've not seen the details of what nvidia's up to, other than "driver now lives in rom"
<mrvn> ddevault: when you specify a return type of ((u64, u64) | syserror) how does the code know which of the two it is?
gog has quit [Read error: Connection reset by peer]
gog` has joined #osdev
<ddevault> it's a tagged union
<zid> how big the userspace portion would have to be to submit dx commands to that, but I was hopeful for "idk, less than before though!"
<ddevault> so it checks the tag
<heat> zid, probably still the same interface
<heat> so, probably still the same userspace portion
<mrvn> ddevault: I see no tag there. what if I want ((u64, u64) | (u64, u64) | syserror)?
<heat> as far as I understand it, the idea is to get an open source nvidia vulkan driver
<citrons> `(a | b)` is the annotation for a tagged union type
<j`ey> mrvn: the compiler adds the tag
<citrons> it's a language feature
<ddevault> mrvn: that collapses to ((u64, u64) | syserror)
<ddevault> the tags are implicitly assigned
<heat> then use zink to get opengl over vulkan
<mrvn> so you can't have 2 things in an union that have the same structure?
<heat> it wouldn't be a bad idea to get direct3D over vulkan as well
<mrvn> No explicit tags?
dennis95 has quit [Quit: Leaving]
<ddevault> no explicit tags
<zid> hopefully the details are fun rather than boring
<ddevault> you can have two things with the same structure by defining new type aliases
<ddevault> new type, same storage and semantics
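The behaviour discussed here (implicit tags, duplicate members collapsing, aliases acting as distinct members) can be modelled in a few lines. This is a Python sketch, not Hare and not how the Hare compiler represents tags: the names `Tagged`, `make_union`, `wrap` and `handle` are invented for illustration, and plain strings stand in for the compiler-assigned tags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tagged:
    tag: str       # names the member type currently stored
    value: object  # the payload itself


def make_union(*member_types):
    """Model a union as a set of member type names, so duplicates collapse."""
    return frozenset(member_types)


def wrap(union, tag, value):
    """Store a value in the union, recording which member it is."""
    if tag not in union:
        raise TypeError(f"{tag} is not a member of this union")
    return Tagged(tag, value)


def handle(result):
    """Callers inspect the tag before touching the payload."""
    if result.tag == "syserror":
        return f"error {result.value}"
    lo, hi = result.value
    return f"ok {lo} {hi}"
```

In this model `make_union("(u64, u64)", "(u64, u64)", "syserror")` has two members, while giving the two pairs different alias names yields three, which mirrors the "define new type aliases" answer above.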
<Jari--> Level of Windows 95 support? ReactOS? Or is it 2000, 2003, etc. what could be relative Windows product in features it supports currently?
<zid> if all they're doing is literally using a BAR to implement mmap("nvidia.sys") then that's super super boring
<heat> zid, there's a whole API built around the firmware
<heat> which is RISCV btw, suck it zid
<heat> RISCV is a real architecture
<zid> riscv is free, and exists
<heat> facts
<zid> and nvidia own fabs
<Jari--> BSD386 FTW !
<zid> so it makes sense
<heat> what
<ddevault> I have a RISC-V machine right here :)
<heat> you mean 386BSD?
<ddevault> helios will be ported to it in the foreseeable future
<heat> what does that have to do with anything
<citrons> I'll have a riscv system someday
<heat> anyway, it's literally just a blob of code that the nvidia co-processor executes and you interface with the blob using an API
<heat> it's not mmap("nvidia.sys")
<zid> Yea I said I didn't know, I wasn't suggesting it was
<Jari--> I would prefer to connect and dock my personal hand phone (Android) to a desktop PC's monitor, keyboard and mouse...
<zid> mmap(nvidia.sys) is what *used* to happen
<zid> so you're back to front anyway
elastic_dog has quit [Ping timeout: 248 seconds]
<heat> executing arbitrary code sounds like a good way to not get your driver signed
<zid> I said I hope they're not just going to do what used to happen via mmap, loading the 'installed nvidia driver into memory' via mmap, but now as a BAR, and that they're actually going to do cool things
<zid> It won't be a driver though it'll be firmware!
<heat> yes i know
<zid> so who needs to sign anything, muahaha
<ddevault> most drivers use firmware
<ddevault> similar designs have already shipped thousands of times
pretty_dumm_guy has joined #osdev
<mrvn> firmware -- code where other people have installed backdoors in
<heat> you need to have your driver signed else your driver won't load on secure booted machines
<zid> yes, again, not what I said
<ddevault> and firmware does not run on the host CPU
<ddevault> it runs on the device itself
<zid> Why would they need to *submit* it for WHQL, if the point of this is to move the driver code into the firmware
<heat> submit the fw? they wouldn't
<zid> So why did you say what you said
<zid> I said they wouldn't need it signed
<heat> yes, the fw won't
<zid> you said "it needs to be signed though"
<heat> but the driver will
<zid> yes, but the driver won't change
<zid> they get it signed once then just keep doing 'firmware updates'
<ddevault> or just get new drivers signed like they do in a normal release cycle
<heat> at the first sight of "oh this runs native code on your native CPU fetched from firmware?" it would get blacklisted pretty quickly
<ddevault> that's *not* how it works
<heat> that's why kernels and bootloaders behave the way they do
<zid> and presumably why they have the risc-v'y bit
<heat> yes it is, any piece of code that can load arbitrary code and is signed will get blacklisted
<mrvn> if it doesn't require the firmware to be signed then it should get banned
<mrvn> ddevault: "firmware does not run on the host CPU". If it has access to memory, e.g. DMA, then that distinction is meaningless.
<ddevault> well, that's where something like IOMMU comes in
elastic_dog has joined #osdev
<mrvn> one can hope
gog` is now known as gog
dude12312414 has joined #osdev
lainon has joined #osdev
<mrvn> does anyone have inline asm stubs for adcx and adox?
<moon-child> there are intrinsics
<mrvn> named what?
<mrvn> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79173 gcc still has a bug about it open
<bslsk05> ​gcc.gnu.org: 79173 – add-with-carry and subtract-with-borrow support (x86_64 and others)
<bslsk05> ​www.intel.com: Intel® Intrinsics Guide
<moon-child> idk I saw them in the manual, don't use intrinsics much
<heat> heres a tip: they're hard to use in kernels
<heat> i.e you can't use SSE intrinsics in SSE-disabled code
<mrvn> <source>:8:9: error: '_addcarry_u64' was not declared in this scope
<heat> include the header
<heat> #include <x86intrin.h> ?
<heat> ah no, #include <immintrin.h>
<mrvn> both work on godbolt
<heat> what are you doing with those? internet checksum?
XgF has quit [Remote host closed the connection]
XgF has joined #osdev
<mrvn> https://godbolt.org/z/3KaqMvE5P the intrinsic doesn't produce adox
<bslsk05> ​godbolt.org: Compiler Explorer
<mrvn> nor adcx for that matter
<mrvn> heat: big nums
<GeDaMo> Do you need to specify a machine to the compiler?
<heat> mrvn, it's addcarryx
<heat> you're using the adc intrinsic :)
<heat> hmm even then
<mrvn> both produce just adc
<heat> i can't get gcc or clang to gen those instructions
<heat> wtf
<mrvn> From what I saw in gcc bug reports gcc simply doesn't have the capability to track C and O chains for carry. It only has a flag that something modifies the CC flags.
<heat> icc does support it
<heat> just use icc, ez
<heat> you're probably better off using inline asm though
<mrvn> And what is wrong with clang here? https://godbolt.org/z/WKT1M7MYx
<bslsk05> ​godbolt.org: Compiler Explorer
<j`ey> restrict in the "wrong" place
<j`ey> "Big & __restrict__ a"
<mrvn> thx. And nox clang complains my target has no adx feature. How do I turn on cpu features in clang?
<mrvn> s/nox/now/
aejsmith has quit [Remote host closed the connection]
<bslsk05> ​clang.llvm.org: Clang command line argument reference — Clang 15.0.0git documentation
aejsmith has joined #osdev
<mrvn> No adcx nor adox with clang either.
<bslsk05> ​github.com: clang/adx-builtins.c at master · microsoft/clang · GitHub
<mrvn> GeDaMo: that generates adc, not adcx/adox
<mrvn> Also: uint64_t != unsigned long long which makes those macros horrible to use. Who came up with that?
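For context, the big-number use case mrvn mentions comes down to a carry chain across 64-bit limbs. The following Python model shows what a chain of add-with-carry steps computes; it is an illustration only (the function names are invented, and it says nothing about which instructions a compiler emits). The appeal of adcx/adox is that one uses only the carry flag and the other only the overflow flag, so two such chains can be interleaved.

```python
MASK64 = (1 << 64) - 1


def add_with_carry(carry_in, a, b):
    """One add-with-carry step on 64-bit limbs: returns (carry_out, sum)."""
    total = a + b + carry_in
    return total >> 64, total & MASK64


def add_limbs(a, b):
    """Add two equal-length little-endian limb lists.

    Returns (result_limbs, carry_out), propagating the carry from each
    limb into the next one.
    """
    if len(a) != len(b):
        raise ValueError("operands must have the same number of limbs")
    out = []
    carry = 0
    for x, y in zip(a, b):
        carry, limb = add_with_carry(carry, x, y)
        out.append(limb)
    return out, carry
```

Adding 1 to a two-limb number whose low limb is all ones carries into the high limb: `add_limbs([MASK64, 0], [1, 0])` gives `([0, 1], 0)`.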
gildasio1 has joined #osdev
gildasio has quit [Ping timeout: 240 seconds]
dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]
<heat> if you're unhappy with clang you can always submit a patch and wait indefinitely for the patch to get merged without any feedback whatsoever
<heat> </butthurt>
<ddevault> how can I determine what regions of physical memory are used for mmio on x86, as distinguished from general purpose RAM?
<heat> ddevault, MMIO isn't marked as available on the memory map
<zid> you can't, really, but you could ask someone to tell you
<zid> like the e820
<ddevault> is it not more granular than "all physical addresses which are not marked as available"?
<heat> depends
<heat> on the EFI memory map? possibly
<zid> all physical addresses may or may not refer to a device
<mrvn> Is any region outside of some pci mapped device used for MMIO?
<zid> the cpu can't really tell
<heat> mrvn, local apic, io apic
<ddevault> alright
xenos1984 has quit [Read error: Connection reset by peer]
<heat> $chipset_stuff
<ddevault> I'll just assume all unavailable memory is potentially useful for devices
<heat> why do you care?
<zid> yea seems kind of inside out
<heat> any not-available memory is not available :)
<zid> normally you'd just collect up your system's information and use it, not try to reverse engineer it
<ddevault> distinguishing device memory from non-device memory for page allocation
<ddevault> microkernel, so userspace should be able to map device memory
<heat> <heat> any not-available memory is not available :)
<zid> finding out that 0xDEADBF0012 *doesn't* do anything is less useful than finding out which memory *is* useful
<geist> generally what you do is start off by figuring out what is memory. that's what e820/efi/etc tell you
<geist> so that's good, you know now that anything that's outside of that is potentially device memory
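The "figure out what is memory first" step can be sketched as a lookup against the firmware memory map. This is a Python model with a made-up map: the entries below are hypothetical, and only the convention that E820 type 1 means usable RAM is taken as given. Anything the map does not mark usable is treated as "not RAM", which is not the same as "known to be a device".

```python
USABLE = 1    # E820 type 1: usable RAM
RESERVED = 2  # E820 type 2: reserved

# Hypothetical memory map as (base, length, type) entries.
MEMORY_MAP = [
    (0x00000000, 0x0009FC00, USABLE),
    (0x0009FC00, 0x00000400, RESERVED),
    (0x00100000, 0x7FEF0000, USABLE),
]


def is_ram(addr, memory_map=MEMORY_MAP):
    """True only if addr falls inside an entry marked usable."""
    return any(
        base <= addr < base + length and kind == USABLE
        for base, length, kind in memory_map
    )
```

With this map, `is_ram(0x1000)` is true, while a reserved address such as `0x9FC00` or an address no entry covers is not, so it could only become mappable after some bus-specific discovery step vouches for it.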
<heat> also, there's memory which is there but you can't touch it
<geist> finding device memory is then a case of going through bus specific mechanisms to discover or allocate
<ddevault> I assume if I map some bogus physical address into an address space and userspaces tries to use it, it will just create a fault (from which I can recover)
<heat> see ACPI NVS, EFI runtime services data, SMM data (which the chipset disallows to touch)
<mrvn> ddevault: no, it will just do something undefined
<ddevault> oh
<ddevault> well that's great
<mrvn> most likely ignore write and read 0
<heat> in x86 you usually get all-ones
<geist> ddevault: that's right. so you shouldn't allow that. or if something tries to map a random physical address there should be some amount of pre-determination that it's something valid
<ddevault> ah that's fine
<mrvn> or 1
<geist> ie, a PCI bus scanner that discovers all of the BARs and then only lets drivers map the bars
<ddevault> the issue, geist, is that the PCI driver is in userspace
<geist> sure. but you know where the starting point is from parsing ACPi/etc
<mrvn> ddevault: so? it should ask the PCI Scanner for mapped memory
<heat> let the PCI driver map everything and let it carve out parts of the address space for client drivers
<geist> so the bus driver starts by mapping the ECAM/etc, then scans the pci bus and as a result of that knows all the BARs
<geist> the trick is of course ACPI/etc but it's not turtles all the way down. at some point you get to some root of truth where to start a search
<ddevault> yeah, and ideally it's rooted in userspace
<geist> yep. searching for ACPI is kinda annoying, but actually UEFI tells you where it is
<geist> or the root RSDP is in a known range of memory so you can map that and search it (the 640k hole)
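The legacy search geist mentions amounts to scanning a mapped range for the RSDP signature. A minimal Python sketch, assuming the region has already been read into a bytes object: the function name is invented, and it checks only the ACPI 1.0 part of the structure (the "RSD PTR " signature on a 16-byte boundary, with the first 20 bytes summing to zero modulo 256).

```python
SIGNATURE = b"RSD PTR "
RSDP_V1_LENGTH = 20  # signature, checksum, OEM ID, revision, RSDT address


def find_rsdp(memory):
    """Return the offset of the first valid RSDP in memory, or None.

    Candidates are only considered on 16-byte boundaries, and the first
    20 bytes must sum to zero modulo 256.
    """
    for offset in range(0, len(memory) - RSDP_V1_LENGTH + 1, 16):
        candidate = memory[offset:offset + RSDP_V1_LENGTH]
        if candidate[:8] == SIGNATURE and sum(candidate) % 256 == 0:
            return offset
    return None
```

A real implementation would go on to check the revision field and, for newer revisions, the extended checksum; this sketch stops at locating the structure.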
<ddevault> eh, it's not that annoying
<geist> yah, exactly.
<ddevault> but it still doesn't semantically belong in the kernel
<ddevault> so if I can avoid it, I shall
<ddevault> (but I probably can't, because SMP)
<geist> but anyway the end result is that drivers shouldn't just willy nilly map things, or even better be *allowed* to map things
<heat> in fuchsia you get mmio as a vmo as well right?
<geist> for fuchsia, for example, only the root drivers have the ability to synthesize a physical memory object, so part of say a PCI driver startup, it's *handed* the objects it needs to map
<mrvn> ddevault: You should have some secure "token" to allow mapping memory ranges. And a owner of such a token can create new tokens for sub-ranges. Then you can start the PCI Bus Scanner with a token for the whole PCI memory range. That then creates sub tokens for the individual devices and gives them to each driver.
<geist> exactly. so the driver itself doesn't have any rights to just map something anywhere
<ddevault> aye, mrvn, I understand
<geist> the PCI bus driver has the necessary authority to construct the physical mappings on behalf of the drivers
<zid> and tbh, I'd just prefer that as an API anyway, "give me the mmio region for device 4" rather than "is address 37 good or bad so I can remap a pci device there?"
<geist> right
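mrvn's token scheme can be modelled as a range object that can only hand out sub-ranges of itself. This is a Python sketch under loud assumptions: `MemToken`, `derive` and `allows` are invented names, the addresses in the usage note are made up, and a plain dataclass is of course forgeable, whereas a real kernel would keep the token as an object that userspace can only reference by handle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemToken:
    base: int    # first physical address covered
    length: int  # number of bytes covered

    def derive(self, offset, length):
        """Create a token for a sub-range; it can never exceed the parent."""
        if offset < 0 or length <= 0 or offset + length > self.length:
            raise PermissionError("sub-range escapes the parent token")
        return MemToken(self.base + offset, length)

    def allows(self, addr, size=1):
        """Would a mapping of [addr, addr + size) be covered by this token?"""
        return self.base <= addr and addr + size <= self.base + self.length
```

The bus scanner would start with one token for the whole window, for example `MemToken(0xE0000000, 0x10000000)`, and pass `derive(0x1000, 0x1000)` to a single device driver, which then cannot map anything outside those 4 KiB.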
<ddevault> anyway, I am starting to conclude that PCI will probably have to live in the kernel, at least partially
<heat> no
<geist> it can be done in user space, it's just kinda messy
<mrvn> ddevault: no, it just has to ask the kernel to map the memory
<ddevault> yes, but the kernel has to determine if the physical address the user wants to map is sane
<geist> and there may be some necessary evil, especially as it pertains to interrupt mapping, where the pci driver has some special perms to talk to kernel space
<heat> no it doesn't
<heat> it *may*, but it doesn't
<geist> if you simply trust the pci driver to not be busted, you give it the authority to just synthesize mappings of any physical address
<ddevault> well, it depends on what happens when you write to or read from an invalid physical address
<ddevault> hence the original question
<mrvn> ddevault: look at what I wrote before. The kernel only limits the PCI Bus Scanner to the PCI memory region. Everything else is user space.
<geist> you've distributed a bit of trust around, but the pci driver has a lot of authority
<ddevault> mrvn: yes, I understand
<heat> ddevault, anything that has the capability to map random physical memory should be trusted
<heat> it's the only way
<ddevault> my god
<ddevault> forget it
<heat> ok
<geist> oh? what's wrong? too many answers?
<mrvn> ddevault: you will need a similar abstraction for DMA
<geist> trying to be helpful
* geist queues up the Too Many Cooks youtube
<mrvn> geist: now I'm hungry
<geist> heh
JanC has quit [Remote host closed the connection]
JanC has joined #osdev
<geist> okay so now in my new server board the cpu fan is turned sideways, which means it's generally blowing against the top of the case
<geist> i have another identical case with an actual fan mount there, so i guess i should swap the motherboard with it and get a top exhaust fan which should help
<geist> server definitely runs hotter under load. easily bumps past 80c and then the cooler fan spins up pretty loud
<heat> any fan is already miles better than a macbook's thermal design
<heat> yes yes I know I won't shut up about it, let me rant about macbooks while I have it
<geist> dunno have you used one of the new M1 macbooks? you gotta really work at it to get it to heat up
<heat> no I have the last intel macbook pro
<geist> though i haven't used an air, maybe they're a bit worse, though AFAIK they just dont have a fan
<geist> well if it makes you feel any better the M1s basically just dont spin the fan up, or if they do it's basically silent
<heat> the avg on-load temp is like 96C
<heat> right, I feel like that's the issue
<heat> make it spin pls
<mrvn> heat: if it's designed to run that hot what is the problem?
<geist> their new solution seems to just be 'make the cpu so efficient it doesn't need much cooling'
<heat> it deeply annoys me
<mrvn> the hotter the cpu the more efficient the cooling
<heat> also touching really hot aluminium isn't pleasant
<zid> It's never going to be good just because there's less laptop
<geist> i have a lenovo thinkpad 11th gen for fuchsia testing upstairs, and it almost instantly starts heating up under load
<mrvn> the aluminium isn't 96°C
<geist> and silly they put i think the intake vents on the bottom
<geist> so if you have it on your lap it really heats up fast
<heat> mrvn, it's not but it's still warm
<zid> if it were 2-3kg of laptop it'd heat up a LOT slower :p
<mrvn> heat: so what you really want is the fan to be controlled by the case's outside temp.
<heat> if the CPU never reached 100C comfortably this just wouldn't be an issue
<zid> what you need to do is remove the thermal paste
<mrvn> it's not a laptop even if named thus, don't put it on your lap. :)
<zid> the cpu will throttle more and the body won't heat up as much
<zid> best way to increase testicle thermals
<heat> yeah right idk
<heat> it's only a laptop if you're not doing actual heavy work on it
<heat> and if you're not, why do you have a 2000 euro laptop
xenos1984 has joined #osdev
<heat> I should try an M1 one though
<heat> since that's massively better
<mrvn> I don't think they made any laptops in the last decade, only mobile systems with a screen+keyboard.
<mrvn> laptops are now called phones.
<heat> i can put my cheap laptop on my lap no problem
<mrvn> heat: plastic case?
<heat> yes
<heat> also hopefully the one I'm hopefully getting will not have these issues
<mrvn> heat: still surprises me. No heat vents on the bottom?
<heat> dell latitude 7420
<heat> mrvn, hrrrm
<heat> kinda?
<heat> it's usable-hot
<zid> heat: What about we water-cool your legs?
<zid> here's a 3L bottle of water, finish it quickly
<mrvn> Mostly you either block the vents or the 50°C air blowing out of it gets annoying.
<heat> zid, no liquid nitrogen?
<zid> Okay here's a genuine idea, tape an asbestos tile to the bottom
<mrvn> Every laptop should have a designated hot plate to place your coffee on.
<GeDaMo> Lick the tile first so it will stay in place :P
<zid> lick it all you want just don't take a hand-file to it and start huffing :p
<heat> my nuts look swollen and purple
<heat> you sure that's a genuine solution?
<zid> Your medical issues don't impact my solution's efficacy dw
<zid> You need a doctor not an engineer
<heat> ok wonderful
<heat> sgtm
<heat> just wanted to check
<mrvn> Does x86_64 have the opposite of setc/seto?
<GeDaMo> setnc/stno?
<GeDaMo> Er setno
<mrvn> I mean set the CC to what's in a register
<zid> sets the no flag, which causes all calls to fail
<zid> "call" "computer says no."
<GeDaMo> Not directly
<GeDaMo> neg al; add al, 1; maybe? Assuming al is 0 or 1
<mrvn> GeDaMo: neg already sets CF
<mrvn> The CF flag set to 0 if the source operand is 0; otherwise it is set to 1. The OF, SF, ZF, AF, and PF flags are set according to the result.
<mrvn> I need to set both CF and OF to specific values.
<GeDaMo> Even better! :P
<mrvn> looks like the only way for that is an extra adcx/adox.
<moon-child> mrvn: use the 'loop' instruction for your loops, then you don't have to save/restore c and o
<moon-child> :)
<moon-child> (don't actually do this)
<heat> USE IT
<heat> also use AAA and AAD
<zid> I wish
<heat> and enter + leave
* |Test_User complains about how loop is in its own loop of reasons to not use - no one uses it so they didn't bother optimizing it, not optimized so no one uses it
<GeDaMo> You can push and pop the flags
<moon-child> leave is fine
<moon-child> enter is crap
<moon-child> pushf/popf are slow
<heat> pushf and popf aren't slow
<zid> heat: That's how I emulate DAA on SM83, I set up a 32bit compat segment selector, lgdt it, reload cs, do daa, then do the reverse
<mrvn> moon-child: Oh, I thought loop would still change cf. but that's even better
<heat> pushad and popad are slow though
<heat> and thus, great
<mrvn> heat: ASCII Adjust After Addition?
<heat> yes
<moon-child> heat: popf is 13 cycles
<moon-child> (on zen2, at any rate)
<heat> ok
<heat> but you need it
<heat> how about that
<mrvn> heat: whatever for should i use that?
<heat> idk i'm just listing my top 10 x86 instructions
<mrvn> My top instruction is sex
<heat> i've never used that
<moon-child> mrvn: you can instruct me any time you want
<mrvn> heat: oh, a virgin. :)
<heat> :D
<heat> that is also not a thing wtf
<heat> lying bastard
<zid> my top 10 is a wildcard and mod/rm
<bslsk05> ​hbfs.wordpress.com: Branchless Equivalents of Simple Functions | Harder, Better, Faster, Stronger
<mrvn> sex == sign extend on some archs but x86 chickened out and calls that something else.
<Griwes> well if you go with sex then you also have to go with zex and that's just weird
<heat> movsx
<heat> and movzx
* Griwes is "looking forward to" writing his exception handling code that will need to sign-extend n*7-bit integers
<moon-child> not that hard, really
<moon-child> shift left, then right
<heat> Griwes, why?
<Griwes> heat, because C++ language specific exception tables use LEB128 to encode numbers
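For readers following the LEB128 remark: a minimal C sketch of decoding one signed LEB128 value, using the "shift left, then right" sign extension moon-child describes. The function name sleb128_decode is hypothetical and not taken from anyone's code in the channel. The sketch assumes the compiler implements right shift of a negative signed value as an arithmetic shift, which GCC and Clang do.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one signed LEB128 value starting at p.
 * Stores the number of bytes consumed in *len. */
static int64_t sleb128_decode(const uint8_t *p, size_t *len)
{
    uint64_t result = 0;
    unsigned shift = 0;   /* number of payload bits collected so far */
    size_t i = 0;
    uint8_t byte;

    do {
        byte = p[i++];
        /* Low 7 bits are payload, bit 7 says "more bytes follow". */
        result |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
    } while ((byte & 0x80) && shift < 64);

    *len = i;

    if (shift < 64) {
        /* Sign-extend an n*7-bit value: move its top bit up to bit 63,
         * then shift back down arithmetically.
         * Assumption: signed >> is an arithmetic shift (true on GCC/Clang). */
        unsigned pad = 64 - shift;
        return (int64_t)(result << pad) >> pad;
    }
    return (int64_t)result;
}
```

With this sketch the single byte 0x7f decodes to -1 and the byte pair 0x80 0x7f decodes to -128.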
<klys> store and exchange
GeDaMo has quit [Quit: There is as yet insufficient data for a meaningful answer.]
mahk has joined #osdev
<geist> yah iirc SEX is one of the 6809 instructions, at least
<heat> Griwes, oh right, those exceptions
Likorn has quit [Quit: WeeChat 3.4.1]
mahk has quit [Quit: mahk]
mahk has joined #osdev
mahk has quit [Client Quit]
mahk has joined #osdev
<mrvn> how do I pass a "const uint64_t *pb" in the "s" register to inline asm?
<mrvn> "rsi" register
Gooberpatrol66 has quit [Quit: Leaving]
<bslsk05> ​gcc.gnu.org: Machine Constraints (Using the GNU Compiler Collection (GCC))
<heat> good luck making sense of that
<heat> oh that's easy
<mrvn> never mind, it's "S", not "s"
<heat> the S constraint
<heat> it's not 's' because fuck you
<zid> The constraints for rsi, rax, rdi are SaD
<zid> makes mi cry every tim
<gog> alexa play despacito
<mrvn> that's because there is also a "d" register
<mrvn> a,b,c,d for int registers, D,S for pointers
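As an illustration of the "S" constraint discussed above, here is a small sketch in GNU C. The function load_first is a hypothetical name. It asks the compiler to place the pointer in rsi and adds an "m" input so the compiler knows the asm reads the pointed-to memory. On anything other than x86-64 with a GNU-compatible compiler it falls back to plain C.

```c
#include <stdint.h>

/* Hypothetical example: read *pb with the pointer pinned to rsi. */
static uint64_t load_first(const uint64_t *pb)
{
#if defined(__x86_64__) && defined(__GNUC__)
    uint64_t out;
    __asm__ ("movq (%1), %0"      /* %1 expands to %rsi because of "S" */
             : "=r"(out)          /* any general-purpose register */
             : "S"(pb),           /* "S" = rsi ("s" means something else) */
               "m"(*pb));         /* tell the compiler *pb is read */
    return out;
#else
    return *pb;                   /* portable fallback */
#endif
}
```

The other single-register letters follow the same pattern: "a", "b", "c", "d" select rax, rbx, rcx, rdx, and "D" selects rdi.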
mahk has quit [Ping timeout: 248 seconds]
mahmutov has quit [Ping timeout: 246 seconds]
knusbaum has joined #osdev
knusbaum has quit [Quit: ZNC 1.8.2 - https://znc.in]
knusbaum has joined #osdev
<mrvn> How bad is it to go through a large uint64_t[] backwards instead of forward?
<mrvn> Has the predictive memory prefetcher advanced to the point where it doesn't matter?
Likorn has joined #osdev
<heat> i think the C copy backwards thing in that memcpy bench suite is a good bit slower
<bslsk05> ​godbolt.org: Compiler Explorer
<moon-child> mrvn: afaik it's fine
<moon-child> I think they will even detect strided access
<mrvn> If I want to use loop then I kind of need to work backwards through the array
<moon-child> so like if you're touching every other cache line, or every n
<moon-child> loop is slow though
<heat> s/slow/very fast/
<heat> there should be an llvm mode to use crap instructions
<heat> one better than -O0 that is
<mrvn> Is there a syntax for %rdi + 8*%rcx - 8?
<heat> yes?
<moon-child> gcc will do an actual division by constant in __attribute__((cold)) code
<moon-child> mrvn: I think -8(%rdi,%rcx,8), but idk att
<mrvn> moon-child: cold code is optimized for size
<heat> 8($rdi, 8, %rcx)
<moon-child> disp is -8, not 8
<moon-child> also rdi is %, not $
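To make the AT&T form concrete: disp(base,index,scale) means base + index*scale + disp, so -8(%rdi,%rcx,8) is the address of element rcx-1 of a uint64_t array at rdi. The sketch below (hypothetical function name last_elem_addr) computes that address with lea, which neither touches memory nor modifies flags. The scale can only be 1, 2, 4 or 8, so a base minus index*8 form cannot be encoded directly. Non-x86-64 builds use the plain C equivalent.

```c
#include <stdint.h>

/* Hypothetical example: address of base[count - 1] via one lea. */
static const uint64_t *last_elem_addr(const uint64_t *base, uint64_t count)
{
#if defined(__x86_64__) && defined(__GNUC__)
    const uint64_t *addr;
    __asm__ ("leaq -8(%1,%2,8), %0"   /* addr = base + count*8 - 8 */
             : "=r"(addr)
             : "r"(base), "r"(count));
    return addr;
#else
    return base + count - 1;          /* same arithmetic in C */
#endif
}
```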
<heat> oops
<bslsk05> ​godbolt.org: Compiler Explorer
<heat> also oops
<heat> i'm writing asm in IRC give me a break
<heat> :P
* moon-child pats heat
<moon-child> there, there
<mrvn> hmm, that broke the result.
* heat mtrrs moon-child
<bslsk05> ​godbolt.org: Compiler Explorer
<mrvn> I want %rdi - 8 * %rcx though :(
<moon-child> hmm
<moon-child> store your bigints in big endian
<bslsk05> ​godbolt.org: Compiler Explorer
<heat> compiler smart, know syntax
<moon-child> so did I, apparently
<heat> SO DID I
<heat> SUCK IT
<mrvn> moon-child: and how do you index that then with the loop variable?
<heat> you're very into the loop instruction
<moon-child> mrvn: then you just index with the loop variable
<moon-child> another option: don't index at all; instead, walk your pointers forward. use lea whatever,[whatever + 8] (lea doesn't set flags)
<heat> inc 8 times
<heat> manual add unrolling :P
<mrvn> moon-child: wait. I'm already doing big endian. I want little endian
<mrvn> heat: inc changes CF iirc
<dminuoso> mrvn: Intel is documented to detect forward and backward strides in E.3.4.2 of Intel® 64 and IA-32 Architectures Optimization Reference Manual
<moon-child> agner sez of k8/k10 'Data streams can be prefetched automatically with positive or negative strides'
<moon-child> presumably applies to newer parts too
<dminuoso> Oh but wait
<dminuoso> That's for the IP prefetcher, not the DCU prefetcher
<mrvn> instructions can be prefetched with negative stride?
<dminuoso> As far as documentation goes, only by the IP prefetcher
<dminuoso> So as long as you explicitly have load instructions
<mrvn> I have mov (read), adox (read), mov (write)
<moon-child> dminuoso: 'explicitly have load instructions' as opposed to what, regular instructions w/memory operands?
<dminuoso> moon-child: Presumably, yes. The IP prefetcher, as far as I can read it, triggers on loads but only if a particular set of conditions are met.
<dminuoso> The DCU prefetcher seems to be triggered on just any memory operand
<dminuoso> It's not clear whether DCU can work out negative strides, but the IP prefetcher is documented to support them (again but only under special circumstances)
<dminuoso> Well thats to L1 anyway
<dminuoso> There's also L2 prefetching
<dminuoso> For L2 prefetching it works in both directions
<dminuoso> Really just read the manual
<dminuoso> E.3.4.2 and E.3.4.3
ptrc_ has joined #osdev
psykose_ has joined #osdev
Test_User has joined #osdev
lanodan_ has joined #osdev
Ellenor has joined #osdev
<mrvn> Finally addition of 2 Big nums in parallel: https://godbolt.org/z/eWjhPd33j
<bslsk05> ​godbolt.org: Compiler Explorer
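The linked code is not reproduced in this log. As a portable point of reference only, here is a sketch of a single carry chain over little-endian limbs (least significant word first) in plain C. It is not the adcx/adox version mrvn describes, which runs two independent additions in parallel using CF and OF as separate carries. The name bigadd and its signature are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: dst = a + b over n 64-bit limbs, least
 * significant limb first. Returns the final carry (0 or 1). */
static unsigned bigadd(uint64_t *dst, const uint64_t *a,
                       const uint64_t *b, size_t n)
{
    unsigned carry = 0;
    for (size_t i = 0; i < n; ++i) {
        uint64_t s = a[i] + carry;
        unsigned c1 = s < carry;      /* wrapped when adding the carry in */
        s += b[i];
        unsigned c2 = s < b[i];       /* wrapped when adding b[i] */
        dst[i] = s;
        carry = c1 | c2;              /* at most one of the two can be set */
    }
    return carry;
}
```

A compiler may or may not turn this into an adc chain; that is worth checking in the generated assembly rather than assuming.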
<heat> ok backwards copy doesn't seem to have a big impact on my CPU
<heat> i was wrong
psykose has quit [*.net *.split]
thinkpol has quit [*.net *.split]
|Test_User has quit [*.net *.split]
Starfoxxes has quit [*.net *.split]
Raito_Bezarius has quit [*.net *.split]
lanodan has quit [*.net *.split]
simpl_e has quit [*.net *.split]
ccx has quit [*.net *.split]
ptrc has quit [*.net *.split]
SarahMalik has quit [*.net *.split]
nog0x7cd has quit [*.net *.split]
jeaye has quit [*.net *.split]
noocsharp has quit [*.net *.split]
phr3ak has quit [*.net *.split]
psykose_ is now known as psykose
ptrc_ is now known as ptrc
<geist> yah i think pretty much any halfway modern design can prefetch reverse just as well
Test_User is now known as \Test_User
phr3ak has joined #osdev
<geist> maybe it takes a bit longer to train it, possibly
<heat> seems to have a tiny, tiny impact (around 40MB/s)
<heat> might just be noise ofc
thinkpol has joined #osdev
<geist> i also remember reading that the cortex-a53 can detect N strides in parallel, etc, and that's a pretty dumb machine. i think it's just a given nowadays
<mrvn> reading 4 Big nums and writing 2 back might throw it off
lanodan_ is now known as lanodan
<mrvn> geist: it should at least detect 2 strides. Reading one and writing another is very common.
<mrvn> 3 strides is common too
dude12312414 has joined #osdev
jeaye has joined #osdev
Starfoxxes has joined #osdev
foudfou has quit [Remote host closed the connection]
Raito_Bezarius has joined #osdev
foudfou has joined #osdev
ccx has joined #osdev
Raito_Bezarius has quit [Max SendQ exceeded]
<geist> yah
doug16k has joined #osdev
Raito_Bezarius has joined #osdev
Raito_Bezarius has quit [Max SendQ exceeded]
<mrvn> heat: how fast is it that 40MB/s is tiny?
<heat> 4000MB
<heat> /s
<heat> and this is relatively slow, just a laptop with a ULP kabylake R
<bslsk05> ​gist.github.com: gist:e83005662c837800fd5273934923a42b · GitHub
<mrvn> 1% then
dude12312414 has quit [Remote host closed the connection]
rustyy has quit [Quit: leaving]
rustyy has joined #osdev
Raito_Bezarius has joined #osdev
<doug16k> neat to compare to 3950x with "slow" dual channel 2400 ECC memory with TSME enabled: https://gist.github.com/doug65536/bba271d46469e73790679ace14b6c408
<bslsk05> ​gist.github.com: 3950x, 2400 ECC, dual channel, TSME enabled · GitHub
<doug16k> it beat me at memset
<zid> my sandy gets 40GB/s on this with cheap ram if memory serves
<heat> doug16k, hey!
<heat> long time no see!
<doug16k> yeah
<doug16k> client says june 2021
<doug16k> sorry, july
<doug16k> I wish I could get 64GB 3200 ECC, but everyone has buffered unbuffered registered unregistered in the product details, to spam search indexes, so I can't find any
<zid> correct
<zid> finding ram is *impossible*
<zid> My favourite is "32GB ECC" "Showing results for 3x2GB non-ECC"
<mrvn> 3rd shelf, half way up in the box labeled RAM
<mrvn> pretty easy to find
<zid> Why can't people just properly mark things as UDIMM, etc :(
divine has joined #osdev
<doug16k> heat, were you saying your memcpy was 40MB/s? sounds uncached
<mrvn> doug16k: backwards is 40MB/s slower than forward
<doug16k> ah
<zid> oh I forgot I pulled some dimms when I was testing why my cpu was crashing, crap
<zid> someone remind me to fix that next time I say I am bored kthx
<doug16k> is the forward copy using big moves and backward one always uses bytes?
<mrvn> why would it?
<doug16k> because backward one might need to be byte
<doug16k> right?
<mrvn> we are talking about memcpy, not memmove
<doug16k> then why is it ever backward?
<mrvn> because then it can use loop and use rcx as index
<doug16k> you can already
<doug16k> all you have to do is point past the end of the memory and count from negative up to zero
<mrvn> loop is counting down, not up
<doug16k> if you point rdi to the *end* of the array, and negate the count in rcx, then you could access [rdi+rcx*4] and inc rcx jnz
<doug16k> is that what you mean by using rcx for index and count?
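doug16k's trick, expressed in C as a rough sketch: point at one past the end of the array and run a negative index up to zero, so a single register serves as both the index and the loop counter while memory is still walked forward. The function name sum_neg_index is hypothetical and the loop body (a sum) is only a stand-in for whatever work is done per element.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: sum n elements using a negative index that
 * counts up to zero, relative to the one-past-the-end pointer. */
static uint64_t sum_neg_index(const uint64_t *a, size_t n)
{
    const uint64_t *end = a + n;          /* one past the last element */
    uint64_t sum = 0;

    /* i runs -n, -n+1, ..., -1; end[i] visits a[0], a[1], ..., a[n-1]. */
    for (ptrdiff_t i = -(ptrdiff_t)n; i != 0; ++i)
        sum += end[i];

    return sum;
}
```

In assembly the loop tail becomes an increment plus jnz, which is the part mrvn objects to because inc writes OF and so disturbs an adox carry chain.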
<mrvn> doug16k: you can. But that's not loop
<doug16k> why not?
<doug16k> ok loop then
<mrvn> Plus inc changes CF which breaks my case of looping adcx/adox
<doug16k> if you want it to take more cycles for nothing
<doug16k> ok use loop then*
<mrvn> doug16k: the question was whether it would be slower or not.
nyah has quit [Ping timeout: 246 seconds]
<doug16k> historically, loop has been intentionally slow
<mrvn> doug16k: it's strange. loop should be faster than everything else since it doesn't add any dependencies.
knusbaum has quit [Ping timeout: 248 seconds]
knusbaum has joined #osdev
<doug16k> if the bigint were that big, I'd have at least 8 adc unrolled, then use setc at the bottom then cmp the something to get carry back?
<doug16k> ...at the top
<bslsk05> ​godbolt.org: Compiler Explorer
<mrvn> doug16k: The goal was to not have to save/restore any flags.
<doug16k> me too, that's why it didn't pushf popf, but I know what you mean
<doug16k> it will be free though, out of order will put that through for nothing
<doug16k> almost. it will kind of overlap with the loop overhead
<mrvn> If you use anything but "loop" then I think you have to setc, seto at the end and an extra adcx, adox at the start.
<doug16k> how big is the bigint. if it is so big this loop overhead matters, you might be bottlenecked on memory anyway
<mrvn> doug16k: anywhere from millions to 2 words.
<mrvn> is popf/pushf slower than adcx, adox, setc, seto?
<mrvn> +2 xor
<heat> you should profile things
<heat> also llvm-mca
<doug16k> mrvn, you unroll some, you don't hammer the setc seto for each one
<doug16k> needs to be like memcpy where it has large and small behaviour
<mrvn> doug16k: obviously. That's something to measure too. Can't unroll forever or the icache runs dry.
<doug16k> I mean 8
<doug16k> has to fit the uop cache
<mrvn> doug16k: It's 6 opcodes per iteration.
<doug16k> or more. just make it high enough that the setc seto disappears because it fit it through for nothing, overlapping the loop counter
<mrvn> 4 opcodes to advance the pointers, 6 opcodes for the flags.
<mrvn> If I unroll it for 2 iterations it's 12 opcodes + 10 opcodes overhead. That's probably slower.
<mrvn> 8x unroll would be 48+10
<doug16k> there are no opcodes once it predicts that branch at the bottom taken. it will stream already-decoded ops into the reorder buffer