klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books
kivikakk1 has quit [Ping timeout: 268 seconds]
slidercrank has quit [Ping timeout: 276 seconds]
nyah has quit [Quit: leaving]
brunothedev has joined #osdev
<brunothedev> why dont i use "multiboot_tag_vbe" ?
brunothedev has quit [Quit: WeeChat 3.6]
thatcher has joined #osdev
xvanc has joined #osdev
<klange> That tag is for locating the VESA BIOS extensions, with which you could theoretically perform modesetting operations, except VBE 2 only has a BIOS interface to do that, and VBE 3 doesn't actually exist.
<klange> Sorry, misspoke. VBE 2 has a protected mode interface nothing implements, and VBE 3 data is not provided by GRUB through that interface (and doesn't really exist anyway)
xvanc has quit [Ping timeout: 276 seconds]
<klange> If you were thinking of using the mode information to understand your framebuffer, note that this information is only provided if VBE was used to perform modesetting by GRUB which may not necessarily be the case. GRUB actually has a handful of other modesetting drivers, or may be running through EFI and using GOP rather than VBE.
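A minimal sketch of what klange describes: rather than assuming the VBE tag is present, walk the Multiboot2 tag list and look for the framebuffer tag, which is filled in whichever driver (VBE, GOP, or another) did the modesetting. The field layout follows the Multiboot2 specification as best recalled; the struct and function names (`mb2_tag`, `mb2_tag_framebuffer`, `mb2_find_tag`) are invented for this example and are not GRUB's own identifiers.

```c
#include <stddef.h>
#include <stdint.h>

/* Tag type numbers as given in the Multiboot2 specification. */
#define MB2_TAG_END          0
#define MB2_TAG_VBE          7
#define MB2_TAG_FRAMEBUFFER  8

/* Header shared by every tag (hypothetical name). */
struct mb2_tag {
    uint32_t type;
    uint32_t size;      /* tag size including this header, excluding padding */
};

/* Common part of the framebuffer tag (hypothetical name). */
struct mb2_tag_framebuffer {
    uint32_t type;
    uint32_t size;
    uint64_t addr;      /* physical address of the framebuffer */
    uint32_t pitch;     /* bytes per scanline */
    uint32_t width;     /* in pixels */
    uint32_t height;    /* in pixels */
    uint8_t  bpp;       /* bits per pixel */
    uint8_t  fb_type;   /* 0 = indexed, 1 = direct RGB, 2 = EGA text */
    uint16_t reserved;
};

/* Walk the tag list, which starts 8 bytes into the boot information
 * structure (after total_size and reserved). Returns the first tag of the
 * wanted type, or NULL if the bootloader did not provide one. */
static const struct mb2_tag *mb2_find_tag(const void *info, uint32_t wanted)
{
    const uint8_t *base = info;
    uint32_t total_size = *(const uint32_t *)base;
    uint32_t off = 8;

    while (off + sizeof(struct mb2_tag) <= total_size) {
        const struct mb2_tag *tag = (const struct mb2_tag *)(base + off);

        if (tag->type == wanted)
            return tag;
        if (tag->type == MB2_TAG_END || tag->size < sizeof(*tag))
            break;
        off += (tag->size + 7) & ~7u;   /* each tag starts on an 8-byte boundary */
    }
    return NULL;
}
```

A kernel would call `mb2_find_tag(info, MB2_TAG_FRAMEBUFFER)` first and treat a missing `MB2_TAG_VBE` as a normal situation rather than an error.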
[itchyjunk] has joined #osdev
antranigv has joined #osdev
xvanc has joined #osdev
xvanc has quit [Remote host closed the connection]
<heat> again, what a great nap
<heat> this is neat
<zid> heat what do I do now I am an amd fanboi
troseman has joined #osdev
<heat> OH
<heat> I have a benchmark for you to run zid
<zid> kk
<heat> 1) install google benchmark (probably just "benchmark" on your gentoo thingy)
zaquest has quit [Remote host closed the connection]
<zid> I've not got vmware installed yet
<zid> but now I can do vmware 17!
<heat> ok do that then
<heat> i wanna test how rep movsb works in your new shiny thing
<heat> 5800x right?
<zid> *extracts some zips to rar parts* *extracts some rars*
zaquest has joined #osdev
<bslsk05> ​dubz.co: Dubz
<heat> me and the lads 'avin some fun down at the footie
slidercrank has joined #osdev
<heat> turn it on
<zid> I have to reboot just for heat smh
zid has quit [Remote host closed the connection]
<bnchs> hi
<bnchs> i misread amd fanboi as amd femboi
zid has joined #osdev
<zid> Gigabyte is dumb, option is called "SVM" no wonder I missed it the first time.
<heat> yeah cuz svm is the real name
<heat> amd-v is marketing bs
<zid> It should be called intel vt-d emulation
heat has quit [Read error: Connection reset by peer]
heat has joined #osdev
<zid> makes me reboot then just quits on me smh
<heat> 8gb of ram moment
<zid> That's a good point, how many cpus did I boot this with, 12 I assume
<heat> btw no, svm is different from intel vmx afaik
<zid> instead of 16
<heat> why?
<zid> cus that's what it was set to before
<zid> 1650 was 12
<heat> ok i have the stuff
<zid> when do I get ~the stuff~
<bslsk05> ​gist.github.com: bench.cpp · GitHub
<heat> download both of those files and do "g++ memcpy_ours.S bench.cpp -lbenchmark -O2"
<heat> then run it
<zid> how do I get this into vm with a name like that
<heat> fuck
xvanc has joined #osdev
<heat> don't you have the shared clipboard thing
<zid> oh yea I should do now
<zid> doesn't work
<zid> keyboard encoding issues and it stops half way through, sick
<zid> Time to set a share back up maybe
xvanc has quit [Remote host closed the connection]
<bslsk05> ​redirect -> gist.github.com: bench.cpp · GitHub
xvanc has joined #osdev
<heat> i dont know if this shortener is any good
<heat> rip goo.gl
<zid> k I have them
<heat> u run it or wat
<zid> yea I was trying to remember github password
<bslsk05> ​gist.github.com: out.txt · GitHub
<zid> did I win
<heat> 5800x right?
<zid> yea but with a couple cores missing
<zid> 1800MHz ram
<zid> aida gives it ~50GB/s read speeds
<zid> (less than my 1650)
<moon-child> 1800mhz? Isn't everything at least 2666mhz for a long time now?
<zid> that's probably 1333Mhz
<heat> aha this is very funny
<zid> they love to pretend DDR means you can double the clock speed a couple of times every time you chinese whisper it
<zid> any standout results heat?
<heat> 1) borislav was fucking lying and rep movsb still sucks on amd
<heat> 2) seems that rep movsb speed goes off the cliff after ~8K
<heat> I assume glibc avoids that by doing nt stores
<zid> BM_string_erms/1024 14.6 ns 14.6 ns 41961244 bytes_per_second=65.2261G/s
<zid> BM_string_erms/8192 1837 ns 1837 ns 380748 bytes_per_second=4.15265G/s
<zid> after 1k
<zid> I assume maybe after 4096
<moon-child> heat: you aren't aligning your movsb dst
<heat> zid, add ->Arg(4 * KB) \ to the AT_COMMON_SIZES thing
<zid> I'll wait until you argue with moon about whether your dest is aligned or not
<zid> and whether I need to change something
<heat> i am not aligning the movsb dst, its a fact
<heat> doesn't really matter here
<heat> everything is good and aligned anyway
<moon-child> pretty sure it does
<moon-child> should just replace new with aligned_alloc or w/e
<heat> malloc gives ya 16-byte aligned
<moon-child> but not 64-byte aligned
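A short sketch of the point being argued here: `malloc` typically guarantees 16-byte alignment on x86-64, while C11 `aligned_alloc` lets the benchmark pin the buffer to a cache-line (64-byte) boundary. The helper names `round_up`, `alloc_aligned`, and `is_aligned` are made up for illustration; the size is rounded up because C11 asks for the size to be a multiple of the alignment.

```c
#include <stdint.h>
#include <stdlib.h>

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* Allocate at least size bytes starting on an align-byte boundary.
 * Release the result with free(). */
static void *alloc_aligned(size_t size, size_t align)
{
    return aligned_alloc(align, round_up(size, align));
}

/* Non-zero when p is a multiple of align (align must be a power of two). */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

With this, a benchmark can allocate both source and destination via `alloc_aligned(len, 64)` and then add a deliberate offset when it wants to measure the misaligned case.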
<heat> why do I want that?
<heat> zid, in any case do what I told ya, the results should be consistent with what I've measured here
<moon-child> because erms apparently wants it
<zid> I was right
<heat> wth
<heat> this is wiiiiiiiiiiiild
<bslsk05> ​gist.github.com: out2.txt · GitHub
<zid> 4k is fastest
ne0ac2g has joined #osdev
<heat> this is fucking wild
<heat> your rep movsb is ASS
<heat> can you grep for erms and fsrm in /proc/cpuinfo?
<moon-child> oh btw I found that rep stosq could be decent at some sizes
<moon-child> on zen 2
ne0ac2g_ has joined #osdev
<bslsk05> ​gist.github.com: out3.txt · GitHub
<bslsk05> ​gist.github.com: gist:92f088b4451132105521f169b527452b · GitHub
<zid> erms yes, fsrm no
ne0ac2g has quit [Ping timeout: 260 seconds]
[itchyjunk] has quit [Remote host closed the connection]
<heat> i'm wondering if any of this can be getting influenced by the vm
<zid> It's mainly influenced by amd's crappy memory controller and dual channel being shit
<zid> it's higher latency and lower bw
xvanc has quit [Remote host closed the connection]
ne0ac2g has joined #osdev
eddof13 has joined #osdev
ne0ac2g_ has quit [Ping timeout: 276 seconds]
<heat> zid, https://tinyurl.com/567ajjry new bench.cpp
<bslsk05> ​redirect -> gist.github.com: newbench.cpp · GitHub
<heat> this time with aligned allocs
<geist> general observation: if you're benchmarking your routines you *really* want to also benchmark unaligned bits
<geist> doubleplus so on memcpy. it's easy to build super fast ass routines that work great when everything is aligned and collapse to nothing otherwise
<heat> yeah
<heat> but I think that in this case you just want to do unaligned stores
<geist> what i have done for my test harness is allocate a large buffer by some overshoot and then run with varying src and dst offsets
<heat> per mjg
<geist> well, sure, but that's the point. if you're making a fast implementation you have to be able to deal with all of them
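The harness geist describes might look roughly like the sketch below: over-allocate both buffers, then run the routine under test at every combination of source and destination offset. This version only checks correctness (including guard bytes on either side of the destination); timing each combination is left out. `MAX_OFFSET`, `copy_fn`, and `sweep_offsets` are names invented for this example, not anything from geist's code.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_OFFSET 64   /* overshoot: offsets 0..63 are tried on each side */

typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Run fn for every (src offset, dst offset) pair below MAX_OFFSET.
 * Returns the number of combinations that copied wrongly or wrote outside
 * the destination range, or -1 if allocation failed. */
static int sweep_offsets(copy_fn fn, size_t len)
{
    /* Room for len bytes plus the largest offset, rounded to 64 bytes. */
    size_t cap = (len + MAX_OFFSET + 63) / 64 * 64;
    unsigned char *src = aligned_alloc(64, cap);
    unsigned char *dst = aligned_alloc(64, cap);
    int failures = 0;

    if (!src || !dst) {
        free(src);
        free(dst);
        return -1;
    }

    for (size_t i = 0; i < cap; i++)
        src[i] = (unsigned char)(i * 31 + 7);

    for (size_t s = 0; s < MAX_OFFSET; s++) {
        for (size_t d = 0; d < MAX_OFFSET; d++) {
            memset(dst, 0xAA, cap);             /* guard pattern */
            fn(dst + d, src + s, len);
            if (memcmp(dst + d, src + s, len) != 0 ||
                dst[d + len] != 0xAA ||         /* wrote past the end */
                (d > 0 && dst[d - 1] != 0xAA))  /* wrote before the start */
                failures++;
        }
    }

    free(src);
    free(dst);
    return failures;
}
```

Passing the routine as a function pointer means the same sweep can be pointed at `memcpy`, at a `rep movsb` wrapper, or at a hand-written implementation.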
<heat> btw could you run this too geist? on your zen
<geist> what kinda zen are you interested in? i dont have a 4, only up through a 3
<heat> yeah 3 works, just want to make sure zid's vm isn't influencing the results
<geist> ah
<heat> not that it *should*
<geist> yah can't quite this minute but might in a little bit
<heat> i'm fairly sure the benching hot loop doesn't do rdtsc, etc. but it's really weird that speed just falls off a cliff in 4096 -> 8192
<heat> yeah thats fine
eddof13 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
xvanc has joined #osdev
<heat> surely this cant be a cache thing, 32KiB L1 per core
<geist> i need benchmark.h
<geist> and probably whatever that came with
<heat> google benchmark
<heat> probably just sudo apt install benchmark? or similar, usually named "benchmark"
<geist> need implementation of our_memcpy
<bslsk05> ​IRCCloud pastebin | Raw link: https://irccloud.com/pastebin/raw/fk3luVYg
<geist> that is a 3950x, which is a zen 2
<heat> alright thanks
<ghostbuster> dumb question but what does it mean to say the stack is aligned, eg. on a 16-byte boundary. is it that ESP must be evenly divisible by 16? or EBP-ESP? or both?
<heat> yeah those results seem to match up
<heat> with, erm, logic
<Mutabah> ghostbuster: ESP must be evenly divisible
<ghostbuster> ty
<Mutabah> and EBP will most likely also be... if frame pointers are in use
<heat> no, ebp will most likely be misaligned
<heat> wait, actually
<heat> per x86_64 sysv, push %rbp should re-align the stack
<Mutabah> yep.
<heat> but ebp should still be misaligned i think
<geist> not so sure. or, it may be that there is no such requirement for ebp
<geist> but rsp being aligned is
<Mutabah> The prelude is usually `push %rbp; mov %rsp, %rbp`
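The disagreement above can be settled with arithmetic. The sketch below models a standard prologue (push the old frame pointer, then copy the stack pointer into it) under the assumption that the stack pointer is 16-byte aligned at the call instruction, as the x86-64 System V ABI requires. The 32-bit case assumes the same 16-byte rule at the call site, which is a toolchain convention rather than something stated in the conversation. Function names are illustrative only.

```c
#include <stdint.h>

/* Non-zero when addr is a multiple of 16. */
static int aligned16(uint64_t addr)
{
    return (addr & 15) == 0;
}

/* Stack pointer on function entry: `call` pushed one return address. */
static uint64_t sp_at_entry(uint64_t sp_at_call, unsigned word)
{
    return sp_at_call - word;
}

/* Frame pointer after the prologue: one more word was pushed (the saved
 * frame pointer) and the stack pointer was copied into the frame pointer. */
static uint64_t frame_pointer_after_prologue(uint64_t sp_at_call, unsigned word)
{
    return sp_at_entry(sp_at_call, word) - word;
}
```

With 8-byte words the two pushes total 16 bytes, so the frame pointer ends up 16-byte aligned on x86-64. With 4-byte words they total 8 bytes, so under the same assumption a 32-bit EBP lands 8 bytes off a 16-byte boundary.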
ne0ac2g_ has joined #osdev
ne0ac2g has quit [Ping timeout: 260 seconds]
slidercrank has quit [Ping timeout: 248 seconds]
heat has quit [Ping timeout: 248 seconds]
bradd has joined #osdev
sympt5 has joined #osdev
xvanc has quit []
sympt has quit [Ping timeout: 255 seconds]
sympt5 is now known as sympt
slidercrank has joined #osdev
vdamewood has joined #osdev
levitating has quit [Ping timeout: 276 seconds]
levitating has joined #osdev
ne0ac2g has joined #osdev
LostCarcosa has joined #osdev
ne0ac2g_ has quit [Ping timeout: 248 seconds]
bgs has joined #osdev
nlocalhost has joined #osdev
sympt has quit [Ping timeout: 276 seconds]
LostCarcosa has quit [Remote host closed the connection]
xvmt has quit [Read error: Connection reset by peer]
xvmt has joined #osdev
slidercrank has quit [Quit: Why not ask me about Sevastopol's safety protocols?]
slidercrank has joined #osdev
ne0ac2g has quit [Quit: WeeChat 3.8]
ne0ac2g has joined #osdev
ne0ac2g has quit [Client Quit]
danilogondolfo has joined #osdev
marshmallow has quit [Quit: ZNC 1.9.x-git-170-9be0cae1 - https://znc.in]
GeDaMo has joined #osdev
marshmallow has joined #osdev
gog has joined #osdev
marshmallow has quit [Quit: ZNC 1.9.x-git-170-9be0cae1 - https://znc.in]
marshmallow has joined #osdev
bauen1 has quit [Ping timeout: 268 seconds]
vdamewood has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
marshmallow has quit [Quit: ZNC 1.9.x-git-170-9be0cae1 - https://znc.in]
nyah has joined #osdev
theWeaver has quit [Quit: fuck fuck fucking fuck shit fuck]
marshmallow has joined #osdev
slidercrank has quit [Remote host closed the connection]
slidercrank has joined #osdev
<zid> heat's not gonna believe what the new bench.cpp does lol
<zid> it now drops off a cliff still, but at 16k instead of 4k
Terlisimo has quit [Quit: Connection reset by beer]
Terlisimo has joined #osdev
bauen1 has joined #osdev
wand has quit [Ping timeout: 255 seconds]
craigo has joined #osdev
brunothedev has joined #osdev
bauen1 has quit [Ping timeout: 248 seconds]
kivikakk has quit [Ping timeout: 255 seconds]
kivikakk has joined #osdev
levitating has quit [Ping timeout: 255 seconds]
levitating has joined #osdev
IdentityInFlux has joined #osdev
bauen1 has joined #osdev
<brunothedev> i think i am gonna make a virtual 8086 mode to write my vesa driver
<brunothedev> multiboot framebuffer is too much of a hassle
<gog> i guarantee you v86 mode is going to be more of a hassle
<brunothedev> gog: then pure real-mode
<gog> more hassle
<gog> and neither technique is forward-compatible
<brunothedev> ok, so can you reference me some os who uses multiboot_tag_framebuffer, the example code by grub is very weird, and dont draw actual things
<gog> i don't have a reference
<gog> personally i use EFI GOP to get the framebuffer properties
<gog> the principle is the same though
<brunothedev> gog: so, no simple os that uses multiboot_tag_framebuffer
<gog> idk maybe
<bslsk05> ​github.com: sophia/video.c at main · adachristine/sophia · GitHub
<gog> this is the code that actually does the drawing
<gog> if you can get the framebuffer properties it's all the same after that
bradd has quit [Ping timeout: 255 seconds]
<brunothedev> gog: i already got the struct, which means i have most things
<gog> ok then you can use this code to see how to actually do the thing
<gog> it'll be exactly the same if you have 32bpp
<gog> idk about the order of the fields tho
<gog> i don't support any other mode than bgra32bpp rn because i don't have to
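As a rough illustration of "it's all the same after that": once the base address, pitch, width, and height are known, plotting a pixel in a 32bpp BGRA mode is one address computation and a four-byte store. The `fb_info` struct and `put_pixel` function are hypothetical names for this sketch and are not taken from gog's video.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Framebuffer properties, however they were obtained (hypothetical struct). */
struct fb_info {
    uint8_t *base;     /* mapped address of the first pixel */
    uint32_t pitch;    /* bytes per scanline; may be larger than width * 4 */
    uint32_t width;    /* in pixels */
    uint32_t height;   /* in pixels */
};

/* Write one pixel in a 32bpp mode whose bytes in memory are
 * blue, green, red, reserved. Out-of-range coordinates are ignored. */
static void put_pixel(const struct fb_info *fb, uint32_t x, uint32_t y,
                      uint8_t r, uint8_t g, uint8_t b)
{
    if (x >= fb->width || y >= fb->height)
        return;

    uint8_t *p = fb->base + (size_t)y * fb->pitch + (size_t)x * 4;
    p[0] = b;
    p[1] = g;
    p[2] = r;
    p[3] = 0;
}
```

Using `pitch` rather than `width * 4` for the row stride matters because firmware is free to pad each scanline.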
<brunothedev> according to the multiboot2 specification, the bpp dont change many things: https://github.com/cloudblaze/multiboot2
<bslsk05> ​cloudblaze/multiboot2 - multiboot2规范的示例代码 (0 forks/1 stargazers/GPL-3.0)
<gog> well, anyway, int10h v86 or real mode is going to be more difficult and slower than a planar bitmap
<brunothedev> oh well, still, i am gonna try vesa, think i am gonna write assembly code who sets the thing in real mode and then provide as args on a c function
<gog> that's what grub is doing for you but ok
<brunothedev> i've also seen multiboot_tag_vbe too
<brunothedev> vbe has more docs, but there is no docs on ^
<gog> ok
<gog> i'm not gonna tell you not to do something but the framebuffer tag is way easier if all you want to do is draw some stuff on screen
<gog> the vbe tag is lower-level and i guarantee you're going to have a worse time with that than you will understanding what you're doing wrong with the framebuffer tag
sylviabsd has joined #osdev
Left_Turn has joined #osdev
<brunothedev> what is the best resolution for a 80x24 terminal?
<brunothedev> so that i can write a good "findmode" function
<gog> depends on the font size
les has quit [Quit: Adios]
slidercrank has quit [Read error: Connection reset by peer]
<brunothedev> 8x8 sounds good
<gog> 80x24 iirc was the 640x400 CGA
les has joined #osdev
<gog> that would be 8x12 for 640x400
<gog> 8x13 for 640x480
<gog> but these are legacy modes
<gog> my code just uses 8x13 font on whatever size it gets because i just used a font i found
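Since the figures above are quoted from memory, it may help to derive the text grid from the mode and the font actually in use instead of searching for a mode to fit a grid. A minimal sketch, with `struct grid` and `text_grid` as made-up names:

```c
/* Character cells that fit on screen for a given mode and glyph size. */
struct grid {
    unsigned cols;
    unsigned rows;
};

static struct grid text_grid(unsigned width, unsigned height,
                             unsigned glyph_w, unsigned glyph_h)
{
    struct grid g = { width / glyph_w, height / glyph_h };
    return g;
}
```

For example, an 8-pixel-wide font always yields 80 columns at 640 pixels across, while the row count depends entirely on the glyph height, so a terminal can simply clamp itself to 80x24 inside whatever grid the mode provides.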
brunothedev has quit [Quit: WeeChat 3.6]
funno has joined #osdev
<funno> checking for bison 3.0.4 or newer... 3.8.2, bad
<funno> configure: error: Building gprofng requires bison 3.0.4 or later.
<funno> Hello, I'm trying to build binutils on Ubuntu as described here https://wiki.osdev.org/Building_GCC but my bison version seems too new? I get this error ^
<bslsk05> ​wiki.osdev.org: Building GCC - OSDev Wiki
<gog> which version of binutils?
<funno> 2.40
<bslsk05> ​www.mail-archive.com: [Bug gprofng/29148] New: bison version too new??
<funno> wow a year old
<funno> can i downgrade bison on ubuntu?
<funno> or should i build an older bison from source and put it in my path?.
Left_Turn has quit [Ping timeout: 248 seconds]
Left_Turn has joined #osdev
sylviabsd has quit [Remote host closed the connection]
Left_Turn has quit [Ping timeout: 248 seconds]
<Ermine> gog: may I pet you
<gog> ye
* Ermine pets gog
* gog prr
<sham1> Hell
<sham1> Helo
<Ermine> ehlo
funno has quit [Quit: I used to think I was indecisive, but now I'm not too sure.]
<nikolar> Aloha
<sham1> Do you mean "hello" or "goodbye"
<sham1> We need statically typed language
<gog> hellogoodbye
<mrvn> Who writes a "findmode" function anymore? It's not like TFTs have more than one resolution and CRTs are hopefully truly dead.
<gog> idk this kid doesn't really seem to understand that the thing they want to do is actually very easy
<gog> and that the things they think will be easier are actually worse
<zid> hi gog
<gog> hi zid
<zid> when do we eat the moss?
<gog> we don't
<gog> the moss is protected by law
<zid> Come on, just a little bit
<gog> we eat fish
<zid> a part nobody will notice is missing
* gog chomp fishy
sinvet has joined #osdev
<sham1> I hope that fishy isn't fermented
<mrvn> don't we need the moss to grow shrooms?
[itchyjunk] has joined #osdev
<gog> haha that's right it's almost the time of year to go collect psilocybin
heat has joined #osdev
<zid> hello mr. heaterson
<heat> hello zed
<zid> I ran your new bench.cpp
<zid> You're not going to like it.
<heat> i've heard
<heat> send out.txt pls
<zid> how did you hear
<zid> are you log watching, that's gross af
<heat> yes
<bslsk05> ​gist.github.com: gist:4e444b66dd9dea03081fed2283c9e5c7 · GitHub
<heat> ok
<heat> these results make more sense
<zid> why would 16k being the limit make more sense
<zid> than 4k
<heat> I guess amd rep movsb REALLY wants 64-aligned rep movsb
<heat> zid, because your L1 is 32KiB
<zid> oh is it
<zid> who knew
Left_Turn has joined #osdev
<zid> emerge -e @world, 587 packages remain
<zid> dawn of the first day
<gog> why
<zid> -march=native
<zid> is no longer native
<sham1> Why would the name lie like that
<zid> sham1: I'm now an amd fanboi so I need to expunge all this intel nonsense
<sham1> Well march=native is still your (now) superior chipset (AMD all the way). But okay yeah, I see why you'd have to recompile everything
<sham1> AMD ruuls, Intel druuls
<zid> Xeon was great, so was coal
<zid> time to move on
<heat> sheesh
<heat> never thought zid would ever say this
<zid> I'll say anything you want if you give me a couple of hundred of quid worth of processor to say it about
<heat> "zid: ooga booga xeon good tdp large"
<heat> mjg, amd cpus are wild with ERMS
<gog> ryzen ryzen ryzen
<heat> definitely deserves extra care
<gog> zid is ryzen gang with me now
<heat> gog are you in a gang
<zid> gog: what's your passmarks
<heat> gong
<gog> zid: idk
<zid> I'm allowed to run it now
<zid> ryzen too fast, even with -march=sandy gcc, we're up to package 90
<heat> zid is everyone in the UK on cocaine
<IdentityInFlux> heat: yes
<heat> thank you zid
<zid> more or less
<IdentityInFlux> you're welcome zid
<zid> who are you
<heat> np gog
<zid> oh
<zid> The person from before
<sham1> gog: Ryzen Gang!
<gog> zid: 15606 lol
<zid> That's surprisingly high
<zid> what's the single?
<gog> 2517
<zid> My xeon just cracked like 2100 with overclock
<gog> and keep in mind this is a mobile cpu that's actually a zen2
<zid> yea
<zid> that's why I didn't try hard to end up with a 1950x or whatever
<heat> because your xeon sucked man
<zid> that xeon was incredibly dumbly good
<zid> for what it cost to put it together
<zid> compared to ryzen2
<gog> i have no complaints about my computer's performance after almost a year with it so
<zid> 2500 is more than fine
<gog> i'm curious what my boss' new computer does
<gog> he's gone at a trade show rn and it's sitting here in a box
<gog> i should open it
<zid> You should swap it for a different one, in a pink theme case
<gog> yes
<gog> hahaha cute
<zid> I saw some disgusting NSFL shit on reddit the other day
<zid> someone destroying an Evangelion edition gpu in order to put a water block on it instead
<gog> noooo
<zid> I almost shit his pants
<zid> and his shoes
<heat> his?
<heat> that's some dedication
<zid> I will walk to his house, and poop in his presumably, ugly sandals
<zid> cus nobody with sane footwear would do that
bauen1 has quit [Ping timeout: 255 seconds]
<bnchs> hi sid
<zid> sidden infant death?
slidercrank has joined #osdev
<sham1> Either that or the unstable branch of Debian
<bnchs> i hate retroarch's cheats feature
<zid> retroarch is gross
<bnchs> set something to 0x25970F40
<bnchs> guess what
<bnchs> it ends up being 0x0F402597
<bnchs> they fucked up the byte swapping
<zid> retroarch is a frontend for stolen emulator cores btw
<bnchs> yes
<zid> so it's up to the emulator to do that bit, so no wonder it doesn't match retroarch's ui
<bnchs> i barely use it, than to see how shit it is
<bnchs> including the times it crashed my DE
<bnchs> general protection fault
<bslsk05> ​github.com: RetroArch/cheat_manager.c at 7e74d830ca97ccfdf18395e95bd9130b8046ff5f · libretro/RetroArch · GitHub
<bnchs> holy shit, they don't use shifts, they use MULTIPLY
<bnchs> like * 256 * 256 * 256 * 256
<sham1> That's horrible, although it does work obviously
<bnchs> yeah, it does work, but that sounds horribly inefficient than shifting
<bnchs> lest the compiler doesn't optimize it
<sham1> How would it be inefficient? The compiler would optimise that. It knows that that's 2^8 * 2^8 * 2^8 * 2^8 = 2^32
<bnchs> i mean if the compiler doesn't optimize it
<nortti> I think that's pretty unlikely. it can be implemented as two peephole passes (constant folding, conversion of multiplies to shifts), and I'd expect any compiler able to build retroarch to implement peephole optimization
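To make the two points in this exchange concrete: assembling a 32-bit value from bytes gives the same result whether written with shifts or with multiplications by 256, and the symptom bnchs reports (0x25970F40 turning into 0x0F402597) is what swapping the two 16-bit halves of the value produces. This is a sketch only and makes no claim about what RetroArch's cheat manager actually does; all function names are invented.

```c
#include <stdint.h>

/* Big-endian bytes to a 32-bit value, using shifts. */
static uint32_t be32_shift(const uint8_t b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* The same computation written with multiplications by 256. A compiler
 * that folds constants and strength-reduces multiplies can turn this into
 * the shift form. */
static uint32_t be32_mul(const uint8_t b[4])
{
    return (uint32_t)b[0] * 256u * 256u * 256u +
           (uint32_t)b[1] * 256u * 256u +
           (uint32_t)b[2] * 256u +
           (uint32_t)b[3];
}

/* Swap the upper and lower 16-bit halves of a 32-bit value. */
static uint32_t swap_halves(uint32_t v)
{
    return (v << 16) | (v >> 16);
}
```

The same half swap also turns the value 5 into 327680, which is 5 shifted left by 16 bits.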
<bnchs> true
Brnocrist has quit [Ping timeout: 246 seconds]
Vercas has quit [Remote host closed the connection]
Vercas has joined #osdev
<heat> i'm a HACKER
<heat> I use shifts
<bnchs> heat: i shift my ass in chairs, is that how it works?
<heat> yes
theboringkid has joined #osdev
shikhin has quit [Quit: Quittin'.]
shikhin has joined #osdev
theboringkid has quit [Ping timeout: 255 seconds]
bauen1 has joined #osdev
eddof13 has joined #osdev
slidercrank has quit [Ping timeout: 268 seconds]
Arthuria has joined #osdev
gog has quit [Quit: Konversation terminated!]
sylviabsd has joined #osdev
sylviabsd is now known as sylvbsd
xenos1984 has quit [Ping timeout: 248 seconds]
xenos1984 has joined #osdev
<bnchs> hi friends
<bnchs> and people who are not friends
<zid> heat: I need more benchmarks
sylvbsd has quit [Remote host closed the connection]
demindiro has joined #osdev
<mjg> did someone say benchmark?
<mjg> > Benchmarking; by which I mean any computer system that is driven by a controlled workload, is the ultimate in performance testing and simulation. Aside from being a form of institutionalized cheating, it also offer countless opportunities for systematic mistakes in the way the workloads are applied and the resulting measurements interpreted.
<mjg> :S
foudfou has joined #osdev
<heat> mjg, did you see the weird unaligned ERMS results on ryzen?
<heat> its bizarre
<heat> i guess i'll need to find a way to align it here
<mjg> now
<mjg> no
<mjg> where
<demindiro> heat: which ryzen CPU?
<mjg> did not i mention rep likes aligned bufs?
<mjg> ignoring fsrm
<mjg> 16 bytes minimum
<demindiro> With ermsb it should matter less / not at all though
<mjg> erms does not help with alignment issues
<bslsk05> ​gist.github.com: out2.txt · GitHub
<bslsk05> ​gist.github.com: gist:4e444b66dd9dea03081fed2283c9e5c7 · GitHub
<heat> mjg, erms is blind rep movsb, memcpy is glibc memcpy, our_memcpy = my GPR-only+erms memcpy
<mjg> so what's the alignment of the target buf
<heat> whatever new[] was giving me
<mjg> 's not the way to bench dawg
<heat> it's very interesting how ERMS starts sucking after 4K
<mjg> you aligned_alloc to control it
<heat> like look at that shit, why
<mjg> this format is shit to read mate
<heat> why
<mjg> also does not help mb vs gb
<mjg> would be best to graph it
<heat> lmao
<heat> do you want a powerpoint too?
<mjg> dude get some basic gnuplot fu
<mjg> it really is not difficult
<heat> anyway i probably want to align erms
<heat> but to what? I see you align to 16
<heat> it doesn't help that the amd opt manual is completely silent about any of this
<sham1> Why do a PowerPoint when Jupyter do the job
<mjg> i found haswell is affected by anything < 32
<mjg> newer archs are fine with just 16
<mjg> no clue about amd
<mjg> i picked 16 as a tradeoff
<mjg> it already helps even on haswell
<mjg> and i could not be fucked to implement more and runtime switch on it
<mjg> on that note, i gave slight thought to that comparison of addresses target vs source upfront
<mjg> even if direction of copying affects stuff, i'm 99% confident the very fact there is a branch and a possible mispredict on it
<mjg> makes the check pessimal
<mjg> it possibly makes sense if you are resorting to rep
<mjg> that will is ot verified, along with the abve claim
<mjg> that is to be verified*
<mjg> wtf
<mjg> i'm trying to declutter my backlog here though
<mjg> and the fact that automemcpy machinery does not work ootb is not helping
<mjg> re gnuplot, split this one func per file
<heat> i'm not plotting for you, sorry
<heat> i can read this just fine
<mjg> lol
<mjg> i'm sayin give yourself a shot at plottin, you will see it is EZAF
<mjg> and if you don';t like it anyway, so be it
<heat> i use google benchmark because it's the easiest and highest quality framework for me to use
<heat> i'm not going to handroll some shitty code for gnuplot
<heat> in fact this thing can spit out json and csv so if you really want to, glhf
<heat> you can literally use this in ppt :))
eddof13 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
xenos1984 has quit [Ping timeout: 246 seconds]
<mjg> i would think that's boomer
<mjg> grew up without a graphical interface and now does not want graphs
<mjg> btw gnuplot can do ascii :p
<mjg> and here you are, genz. is that neoboomer?
<mjg> anyhow a benchmark where you did not control alignment for the target buffer is totally geezered
<mjg> also you probably want -falign-loops=32
<heat> you're so fucking annoying it's not even funny
<mjg> see the above quote
<heat> you are the real unix geezer
<mjg> > Benchmarking; by which I mean any computer system that is driven by a controlled workload, is the ultimate in performance testing and simulation. Aside from being a form of institutionalized cheating, it also offer countless opportunities for systematic mistakes in the way the workloads are applied and the resulting measurements interpreted.
<mjg> it is hard to do a sensible test for things like this. by not controlling for the stuff like i mentioned above you are not even trying
demindiro has quit [Ping timeout: 260 seconds]
<bnchs> did retroarch just suddenly become worse?
<bnchs> >trying to modify the memory value for current level to 5
<bnchs> >get taken to the 8th circle of hell, because retroarch makes it 327680
xenos1984 has joined #osdev
<zid> I use gnuplot as a debugger occasionally
Brnocrist has joined #osdev
<zid> debugging my normal calculations, just dumped the results to a file instead of writing code to try view them somehow :P
<bslsk05> ​gist.github.com: bench3.cpp · GitHub
<heat> this one will take a good bit longer to run
<zid> I'm EMERGING
<zid> not sure how well it'll react to ctrl-z
<mjg> what does this do? benchmark::ClobberMemory();
<mjg> just clflushes?
<heat> no
<heat> just a compiler barrier
<heat> asm volatile("":::"memory")
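For readers unfamiliar with the idiom heat quotes: an empty inline-asm statement with a "memory" clobber emits no instructions, but it makes the compiler assume memory may have been read and written at that point, so earlier stores cannot be dropped or moved past it. It is not a cache flush and not a CPU fence. The sketch below uses GCC/Clang extended asm; `compiler_barrier`, `sink`, and `fill_repeatedly` are names invented for illustration.

```c
/* Compiler-only barrier (GCC/Clang syntax). Generates no machine code. */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__("" ::: "memory");
}

static unsigned char sink[64];

/* Fill the buffer `rounds` times. Without the barrier an optimizer would
 * be entitled to keep only the final round of stores, which would make a
 * benchmark loop measure nothing. */
static void fill_repeatedly(unsigned rounds)
{
    for (unsigned i = 0; i < rounds; i++) {
        for (unsigned j = 0; j < sizeof sink; j++)
            sink[j] = (unsigned char)i;
        compiler_barrier();
    }
}
```

In a microbenchmark the barrier goes right after the operation being timed, so the work inside each iteration is observable to the compiler.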
<zid> "memory" in le clobbers
<zid> this gigabyte mobo doesn't know what 100MHz is
<zid> it's at 99.98MHz and makes all my speeds look super ugly
<heat> mjg, fyi locally it seems that 32 alignment is ideal
<heat> doubles speed for erms pretty much
<mjg> how does it differ vs 16
<mjg> that's on your kaby?
<heat> yes
<mjg> i'm gonna revisit this bit as well
<mjg> note all the stringop work i did was almost 5 years ago
<mjg> i'm gettin fuzzy on details
<mjg> the code to perform alignment is kind of crap afair
<mjg> im gonna hack it up in c and check what clang comes up with
sortie has quit [Quit: Leaving]
<mjg> target bufs like to be misaligned big time, but i don't remember how often that happens of the range which falls under rep usage
<mjg> s/of/for
sortie has joined #osdev
<mjg> heat: dtrace -n 'fbt::memcpy:entry,fbt::memset:entry,fbt::memmove:entry /arg2 > 256/ { @alignment[arg0 & 0x1f] = count(); }' for buildkernel
<bslsk05> ​dpaste.com <no title>
<zid> heat always tracing the d
<mjg> so for these routines vast majority is already aligned
<heat> i don't use dtrace, im not cringe
<mjg> heat gets his data the old fashion way: just assume whatever you take out of your ass is correct and roll with it
<mjg> if someone questions it later "you had reasons"
<heat> no, i just don't have dtrace
<mjg> for copyin/copyout
<mjg> 24 32636
<mjg> 16 521734
<mjg> 8 33705
<mjg> 0 2168248
<mjg> the rest is very little
<mjg> but it does happen
<bslsk05> ​dpaste.com <no title>
<heat> I have realised my rename is not implemented correctly and is not atomic
<heat> yay?
<mjg> ok mjg@
eddof13 has joined #osdev
<heat> i'm fairly scared of touching any of my dcache code
<heat> basically so full of fucking locks, most of which I don't even remember what they cover
<mjg> they cover pragmatically picked vars
<zid> mjg is broken
<zid> he didn't say pessimal in that sentence
[itchyjunk] has quit [Ping timeout: 255 seconds]
[itchyjunk] has joined #osdev
<heat> mjg is pessimal
<zid> my audio is dropping out, look at my VM to see what it's doing
<zid> "emerge: (433 of 589) dev-lang/rust-1.68.1 Compile"
theboringkid has joined #osdev
Turn_Left has joined #osdev
Left_Turn has quit [Ping timeout: 255 seconds]
Lumia has joined #osdev
<geist> rUUUUST
Lumia has quit [Ping timeout: 252 seconds]
Lumia has joined #osdev
theboringkid has quit [Quit: Bye]
theboringkid has joined #osdev
<zid> I ran out of thermal paste at last, got some mx-4 for my Q6600
<zid> I quickly googled for a graph in case any of the modern brands are any better than 20 year old stuff
<zid> nope, mayonnaise beats noctua.
<geist> huh really? was gonna say noctua is okay
<zid> It's basically "Is it wet? Yes? Then within a tenth of a degree"
<geist> well, sure. but there is also whether or not it'll remain that way in a week or a year
<geist> obviously mayonnaise will work, but it'll dry out, etc
<zid> yea that's why they exist
<zid> but it's funny still
<geist> yah
<geist> my guess is the way you apply it is probably generally more important
<geist> and i honestly dunno if i ever do the right thing nowadays. too much? not enough? beats me!
<geist> also i wonder if i should every year or so reapply the stuff
<zid> I think it's basically self correcting
<zid> as long as you have enough mounting pressure you're good to go
<zid> as theoretically it only exists to fill voids
<geist> yah, unless you put way too little i guess
<geist> but i usually do the fairly decent sized blob in the middle and it seems to spread out properly, so i think i'm basically doing the right thing
<geist> doesn't seem that smearing it out ahead of time or whatnot really makes a difference
<zid> yea it's funny to watch people fight over the best 'technique'
<zid> I've always done 'just squirt it around a bit until it looks like the last frame of a porno'
<zid> works great
<geist> it's possible there's a teeensy bit of difference if you're extreme overclocking
<geist> like too much may actually not let it touch as efficiently as it could, etc etc. but yeah, i think for the vast majority of cases you're within a C or two of ideal if you just get some of it basically on there
<zid> Butter the toats.
<zid> I think the weird pattern people might actually end up having air bubbles sometimes
<geist> but annoyingly i think it might matter a teensy bit for Zen 4 vs other cores at the moment, since zen 4 will run right up to the thermal limit and then throttle itself, and it's designed to do that
<zid> yea all modern cpus are
<geist> not entirely, actually. zen 4 takes it to the next level
<zid> der8auer dropped a chat with an intel engineer.. today
<zid> saying they do exactly the same thing too
<zid> and any wasted heat below the limit is unused potential
<geist> zen 4 *wants* you to go right to the limit, and that's part of the design, whereas a lot of the other fairly modern stuff is more like they'll throttle to avoid damage
<geist> possible the 13th gen intel stuff is more okay with it
<geist> and laptops yeah have been doing that sort of thing for a while
<zid> That they used to design for 100C once you saturated the part entirely in heat over 20 minutes, but now the power density is so high and the heat governor exists etc so they just turbo up to 'hot' then stay there by modulating freq
<geist> but in that particular situation, the cooler and efficiency of it actually does equate to more 'speed' out of your core
<zid> I have package temp 60C but 30C core temps cus.. power density hotspots
<geist> yah when i unlock PBO on my 5950x it'll eventually get up to 80C or so after a bit of running, but i'm okay with that
<geist> normally it runs a nice 60-70C when under load
<zid> I'm up to 75C, building llvm
<zid> but the L3 is only at 40C, millimeters away
<zid> power density ho
<geist> yah
<zid> I think I'm good to ignore my thermal paste situation until I can be bothered to play around with overclocking
<zid> 70C on a chip that's actively trying to be hot is great
<geist> yah agreed
<zid> I was going to get a 'new' cooler for it but prices were absolutely nuts
<zid> £150 for an aio, I can get the full radiator from a ford focus for £25
<zid> So I ended up finding a £6 bracket adapter for my 212
<zid> (and it was delivered within 12 hours, heh)
Turn_Left has quit [Ping timeout: 276 seconds]
Lumia has quit [Remote host closed the connection]
Lumia has joined #osdev
<mjg> zid: can you strace -fo /tmp/crap emerge media-sound/rexima
<mjg> zid: it is an almost hello-world port
<zid> (562 of 589)
<zid> maybe soon
slidercrank has joined #osdev
GeDaMo has quit [Quit: That's it, you people have stood in my way long enough! I'm going to clown college!]
ZombieChicken has joined #osdev
<heat> zid: can you send me one of your kidneys
<heat> i need it for a benchmark
ZombieChicken has quit [Quit: WeeChat 3.8]
theboringkid has quit [Quit: Bye]
brunothedev has joined #osdev
<brunothedev> to write my vbe driver, i am going to real mode, make a call, store the info, go into protected mode, make a call to my c function and provide the info as args, does this work?
<heat> 🚨🚨🚨 bad idea alert 🚨🚨🚨
<brunothedev> heat: i love bad ideas, i am one myself, tho, does it compile and work?
<nortti> if you want to make your life worse for zero benefit, I think so, though you need to maintain real mode environment intact into your kernek
<heat> fuck do i know
<heat> please stop
<nortti> *kernel
<nortti> which might necessitate writing your own bootloader since I don't know if grub might have already trampled over that by the time you hit your kernel entry point
<heat> do understand that you're very much alone if everyone tells you "its a bad idea" and you go ahead and do it
<brunothedev> heat: it's that i have no other options
<heat> yes you fucking do
<brunothedev> heat: what?
<heat> klange has been through it with you like 4 times
<heat> multiboot 1 and 2 give you a framebuffer
<heat> please use it
<brunothedev> ok so what simple os uses it? The grub example is not a good reference
<heat> yes it is
<nortti> what have you gotten working thus far? I've not fully followed all this conversation but you are able to get the framebuffer address and its parameters, right? have you been able to blast some garbage onto the screen yet?
<nortti> (not a value judgement on what your OS is going to do, literally just "can you write random bytes and have that cause an effect on-screen?")
<brunothedev> nortti: the grub example at best nearly hides the software to write a pixel, and at worst doesn't have an implementation for it
<brunothedev> the maximum it does is to tell the address to write the pixel on: https://github.com/cloudblaze/multiboot2/blob/master/kernel.c#L163
<bslsk05> ​github.com: multiboot2/kernel.c at master · cloudblaze/multiboot2 · GitHub
<heat> you do realize you just need to write bytes to it
<heat> *framebuffer = 0xRGBA;
<brunothedev> heat: ok, lemme try a rando value
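What heat describes (writing pixel values straight into the framebuffer) can be sketched roughly as below. This is only a minimal sketch: `struct fb_info` and `fb_put_pixel` are hypothetical names, the fields loosely mirror what a Multiboot 2 framebuffer tag reports, and a 32 bpp direct-colour mode with an already-mapped framebuffer is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a linear framebuffer. The field names are
 * illustrative; a real kernel would fill them from the bootloader's tag. */
struct fb_info {
    void    *base;   /* mapped address of the first pixel */
    uint32_t pitch;  /* bytes per scanline (may be larger than width * 4) */
    uint32_t width;  /* visible pixels per row */
    uint32_t height; /* visible rows */
};

/* Write one pixel. Assumes 32 bits per pixel; the channel order inside
 * `colour` depends on the mode the bootloader actually set. */
static void fb_put_pixel(const struct fb_info *fb, uint32_t x, uint32_t y,
                         uint32_t colour)
{
    assert(fb != NULL && fb->base != NULL);
    if (x >= fb->width || y >= fb->height)
        return; /* clip out-of-range coordinates instead of scribbling */

    /* Step to the row in bytes (pitch), then index pixels within the row. */
    uint8_t *row = (uint8_t *)fb->base + (size_t)y * fb->pitch;
    ((volatile uint32_t *)row)[x] = colour;
}
```

Stepping by `pitch` rather than `width * 4` matters because scanlines may be padded; the `volatile` store keeps the compiler from dropping writes to memory it never reads back.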
<bslsk05> ​elixir.bootlin.com: dcache.c - fs/dcache.c - Linux source code (v2.6.16) - Bootlin
brunothedev has quit [Read error: Connection reset by peer]
<heat> goto considered harmless
<Ermine> heat: why are you lurking in 2.6 kernel?
ThinkT510 has quit [Quit: WeeChat 3.8]
<heat> i'm trying to see a simpler example of a rename operation
<heat> this fucking sucks
<heat> you need to grab like 400 locks for it
SpikeHeron has quit [Quit: WeeChat 3.8]
SpikeHeron has joined #osdev
ThinkT510 has joined #osdev
thatcher has quit [Remote host closed the connection]
Lumia has quit [Remote host closed the connection]
thatcher has joined #osdev
Lumia has joined #osdev
eddof13 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<heat> random prefetches in the middle of vfs code are simply brilliant
danilogondolfo has quit [Remote host closed the connection]
<mjg> rename is notoriously shit to implement
<mjg> i also refer you to the netbsd ufs code
<mjg> with all the comments inside
<heat> /* XXX FUCK THIS SHIT FUCK SHIT MAN WTF BULLSHIT */ goto bullcrap_fuckyou;
<heat> mjg, found the best intel feature ever
<heat> Fast Zero Length REP MOVSB
<heat> The latency of a zero length REP MOVSB is now the same as the latency of lengths 1 to 128 bytes.
brunothedev has joined #osdev
<mjg> ulala
<mjg> +#define X86_FEATURE_FSRS (12*32+11) /* Fast short REP STOSB */
<mjg> +#define X86_FEATURE_FZRM (12*32+10) /* Fast zero-length REP MOVSB */
<mjg> +#define X86_FEATURE_FSRC (12*32+12) /* Fast short REP {CMPSB,SCASB} */
nur has quit [Remote host closed the connection]
<mjg> the q is how fast is the sort thing tho
<heat> sort?
<mjg> short
<mjg> as noted previously it was still slower than the overlapping stores
<heat> ah yes
<mjg> for up to 64 afair on ice lake
<heat> welp on zen3 it's still slow
<sham1> "12 * 32 + 11"
<mjg> i would totes welcome rep which is just doing it right kthx
<sham1> But why is it spelt like that
<sham1> Why use a multiplication
<heat> cuz they're doing it based on CPUID words
<heat> so leaf 0 gets word 0 1 2, leaf 1 gets 3 4 5, etc
<sham1> Even then, wouldn't it be more fluent to use hexadecimals and shifts?
<mjg> you know what would be funny
<heat> sham1, why?
<mjg> if Fast zero-length REP MOVSB was still so expensive you should branch on it
<sham1> heat: no real reason other than aesthetics
<heat> hex doesn't work here
<heat> nor shifts nor anything else
<heat> it's just a "bit number" that you then go and index into a bitmap
<mjg> where are these ops tho
<mjg> i don't see intel saying anything
<heat> feature_map[feature / 32] & (1UL << (feature % 32))
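The "word * 32 + bit" encoding heat explains can be sketched as below. The three `X86_FEATURE_*` values are the ones quoted in the patch above; `NCAP_WORDS`, `cpu_has` and `cpu_set` are hypothetical names for illustration and are not the kernel's actual helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical size of the capability bitmap, in 32-bit words. */
#define NCAP_WORDS 16

/* Feature numbers are "word index * 32 + bit index", as in the patch. */
#define X86_FEATURE_FZRM (12*32 + 10) /* Fast zero-length REP MOVSB */
#define X86_FEATURE_FSRS (12*32 + 11) /* Fast short REP STOSB */
#define X86_FEATURE_FSRC (12*32 + 12) /* Fast short REP {CMPSB,SCASB} */

/* Test one feature bit: the quotient picks the word, the remainder the bit. */
static bool cpu_has(const uint32_t *caps, unsigned feature)
{
    assert(feature / 32 < NCAP_WORDS);
    return (caps[feature / 32] >> (feature % 32)) & 1u;
}

/* Record one feature bit, e.g. after reading the matching CPUID register. */
static void cpu_set(uint32_t *caps, unsigned feature)
{
    assert(feature / 32 < NCAP_WORDS);
    caps[feature / 32] |= UINT32_C(1) << (feature % 32);
}
```

Writing the constant as `12*32 + 10` keeps the word and the bit visible at a glance, which is the point heat makes about hex or shifts not fitting here.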
<sham1> C23 can't come fast enough to grant us binary literals
<heat> mjg, what ops?
<mjg> aand there it is
<zid> I mean, every C compiler has supported it since the 90s anyway sham
<sham1> But it's not standard!
<zid> waiting for C23 doesn't really gain you anything
<mjg> new version of optimization manual states it
<zid> 90s compilers won't support C23
<zid> there's 0 downside to just using 0b now
<heat> mjg, yes honey that's what I'm looking at
<bnchs> heat: i like leaving comments like that
<sham1> And I don't support 90s compilers. What's the point?
<zid> You can just go "Yea well this is C23 code"
<mjg> REP MOVSB performance of zero length operations is enhanced. The latency of a zero length REP MOVSB
<mjg> is now the same as the latency of lengths 1 to 128 bytes. When both Fast Short REP MOVSB and Fast Zero
<mjg> Length REP MOVSB features are enabled, REP MOVSB performance is flat 9 cycles per operation, for all
<zid> if anyone says "omg that's a compiler extension!!"
<mjg> strings 0-128 byte long whose source and destination operands reside in the processor first level cache.
<heat> oh yeah wanna copy? i'll copy
<mjg> Lol
<mjg> so it still makes sense to branch on it
<heat> REP CMPSB and SCASB performance is enhanced. The enhancement applies to string lengths between 1
<heat> and 128 bytes long. When the Fast Short REP CMPSB and SCASB feature is enabled, REP CMPSB and REP
<heat> SCASB performance is flat 15 cycles per operation, for all strings 1-128 byte long whose two source
<heat> operands reside in the processor first level cache.
<heat> Support for fast short REP CMPSB and SCASB is enumerated by the CPUID feature flag:
<mjg> When Fast Short REP STOSB feature is enabled, REP STOSB performance
<mjg> is flat 12 cycles per operation, for all strings 0-128 byte long whose destination operand resides in the
<mjg> processor first level cache.
<sham1> I mean, C23 also added typeof to the standard, which is nice
<mjg> that's still unusable
<heat> mjg, have you seen the """new""" recommendation
<heat> well, one of
<mjg> which page
<heat> just a sec
<heat> A two-socket Sapphire Rapids system can have up to 224 (2 sockets x 56 cores/socket x 2
<heat> threads/core) hardware threads. Scalability and performance bottlenecks may happen when all of these
<heat> hardware threads compete for the same addr
<heat> WHAT
<sham1> Also, #embed is nice in general to have, because unlike doing weird linker hacks, there's less chance of accidental UB
<sham1> It's also more portable
<heat> mjg, the "how to fix it" is even more hilarious
<brunothedev> error: invalid use of void expression: "*fb = *where;"
<heat> learn C please
<bnchs> brunothedev: wtf are you doing?
<nortti> brunothedev: what are the types of "fb" and "where" and why?
<brunothedev> heat: learning it
<heat> wrong channel, see ##c
IdentityInFlux is now known as Amorphia
<bnchs> hi Amorphia
<heat> how tf do you pronounce your name?
<heat> benches? bancs?
<brunothedev> nortti: fb is "void *", where is "uint32_t *"
<bnchs> heat: bunches
<heat> oooooooooooooohhh
<bnchs> brunothedev: you don't dereference a void *
<zid> quick, write his code for him one line at a time
<heat> yep, pretty much
<heat> explain him the whole of C
<bnchs> yes, i'll explain to him the C lore
<brunothedev> lmao
<heat> please go through the k&r book and if needed go to ##c
<nortti> modern c is also pretty good
<heat> i'm a rust purist so I struggle every time someone talks about this
<brunothedev> heat: r*st(vomit emoji)
<bslsk05> ​gustedt.gitlabpages.inria.fr: Modern C
<heat> >int printf ( char const format [ static 1] , ...) ;
<heat> NOPE
<bnchs> heat: so what if you listen to the alphabet
<bnchs> A, B, C.....
<heat> D
<heat> heheheheehehehehehe
<sham1> heat: tbf, that signature for printf is semantically valid, you do want at least to have a '\
<sham1> A \0 there
<nortti> yeah, the coding style advocated by modern c makes sense, even if it is unconventional
<brunothedev> most learning c resources should be called: Learning libc (and a little bit of c)
<heat> it's a travesty
<heat> it is barely C
<nortti> in the sense that C sucks majorly and that sucks marginally less, yes
<heat> how does it suck less?
<heat> please parse the syntax "char const format[static 1]"
<nortti> it makes a part of the interface that would traditionally be left implicit part of the type system
<heat> wth is the static and the 1?
<heat> why are keywords inside []
<nortti> "at least one char"
<nortti> it's part of c99 iirc
<heat> how is it at least? this is not consistent with any other bit of the language
<nortti> yeah, it's a messy syntax due to backcompat (can't have new reserved words unless they're _Reserved, aiui ppl were already putting shit in there and expecting it to do nothing)
<nortti> so why not repurpose the one that already has two different meanings
<heat> because it looks horrible and makes no sense
<brunothedev> it is funny how a recommendation that i read an old c book sparked a c specifications argument
<heat> file scope static makes sense, block scope static isn't too great but still makes some sense, static to denote "at least N elements" does not make sense
<nortti> do you have a better suggestion for how to mark it, given the constraints?
<heat> define a new keyword and stop being stupidly backwards compat
<heat> or new syntax
<mjg> heat: so according to agner that up/down thing is about bypassing false memory dependency problems
<mjg> i can slap together a bench which intentionally runs into them
<sham1> heat: static there is yet another way that keyword is being overloaded. But yeah, it means that the pointer points to a thing that has at least one element. I.e. it's not a NULL
<mjg> we will see what happens
<heat> mjg, what up/down thing?
<sham1> Also IIRC the correct one would be "int printf(char format[const 1], ...);"
<nortti> wait, printf is not defined to not mutate the format string?
<sham1> Because if printf modifies the format string, that'd be just some veritable BS
<heat> it cant mutate the format string
<nortti> yeah, posix uses const char *restrict format, even
<sham1> That's some veritable BS on the standard's part
<mjg> heat: if (dst < src)
<mjg> in memcpy
<nortti> sham1: hm?
<brunothedev> is there any os that uses multiboot_tag_framebuffer? Literally any one simple enough to understand the code
<mjg> totally gonna check for it now
<heat> learn C
<sham1> nortti: it's absolutely disgusting and it shouldn't be like that.
<heat> mjg, wdym false memory dependency problems?
<heat> i am very confused
<brunothedev> heat: i asked for a os that uses the thing, this is not a c problem
<heat> yes it's a C problem because you clearly don't understand pointers
<nortti> sham1: wait, I just read you wrote `[const 1]` and not `[static 1]`, are you saying the standard should express a const char* with const inside the brackets instead of outside them?
<brunothedev> heat: pointers are a reference to a memory address
<bnchs> brunothedev: the hint is, you cast the void pointer to a pointer of your type, aka uint32_t
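bnchs's hint, spelled out as a minimal sketch: `void` has no size, so `*fb` on a `void *` is not a valid expression, and the pointer has to be converted to a typed pointer before the store. `store_pixel` is a hypothetical helper name; `fb` and `where` match the types brunothedev gave.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy the 32-bit value at `where` to the start of the buffer `fb`.
 * `*fb = *where;` fails to compile when fb is `void *`, because the
 * compiler cannot know how many bytes a `void` occupies. */
static void store_pixel(void *fb, const uint32_t *where)
{
    assert(fb != NULL && where != NULL);

    uint32_t *pixels = fb; /* void * converts to any object pointer in C */
    *pixels = *where;      /* now a well-typed 4-byte store */
}
```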
<sham1> nortti: yes.
xenos1984 has quit [Read error: Connection reset by peer]
<heat> i'm fairly sure char str[const 10] is correct
<heat> no idea if char const str[10] is semantically the same thing
<brunothedev> bnchs: oh ok, i thought pointers in this context is just a pointer to a certain memory address that i can write literally anything
<nortti> heat: I think char const str[10] is semantically same as char const *str, that is, without static it's just ignored
<sham1> In a parameter listing like this one, char format[const 1] is basically just const char *format, except that the compiler is allowed to assume that there is at least one element there
<sham1> And yeah, const str[10] is just a lie in a parameter context
Lumia has quit [Ping timeout: 248 seconds]
<heat> mjg, anyway erm, isn't all that stuff supposed to be memmove?
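For reference, the `dst < src` direction check in its memmove form looks roughly like the sketch below. This is a plain byte-at-a-time illustration under the hypothetical name `my_memmove`, not any libc's actual code, and it says nothing about the false-dependency effect mjg wants to benchmark.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Overlap-safe copy: pick the direction so that source bytes are read
 * before the copy overwrites them. */
static void *my_memmove(void *dst, const void *src, size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));

    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Compare as integers: relational comparison of pointers into
     * unrelated objects is not defined by the C standard. */
    if ((uintptr_t)d < (uintptr_t)s) {
        for (size_t i = 0; i < n; i++)  /* copy upward */
            d[i] = s[i];
    } else if ((uintptr_t)d > (uintptr_t)s) {
        for (size_t i = n; i > 0; i--)  /* copy downward */
            d[i - 1] = s[i - 1];
    }
    return dst;
}
```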
<sham1> Or was it equivalent to char *const format. I don't remember, I'll admit
<sham1> But yeah, projects like LLVM treat foo bar[static 1] as basically saying "this cannot be NULL" and for example clang will give you a warning if it can prove that you're maybe possibly passing one in
<sham1> You can even put "restrict" within the square brackets there and suddenly it's restrict-qualified
<nortti> sham1: afaict, looking at C17, `char foo[const 10]` is equivalent to `char * const foo`
<nortti> that is, the function body is allowed to do *foo = 0, but not foo = 0
<sham1> Yeah okay, so for printf the actual thing to do would be "int printf(const char format[static 1], ...)"
<nortti> yeah
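The parameter forms discussed above, as a minimal C99 sketch. The function names are hypothetical and chosen only to show each qualifier position; whether a compiler warns about a violated `[static N]` promise varies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* [static 1]: the caller promises at least one element, so `s` is never
 * null. `const char` makes the characters themselves read-only. */
static size_t my_strlen(const char s[static 1])
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* [static 8]: the caller promises room for at least 8 chars in dst. */
static void copy_name(char dst[static 8], const char src[static 1])
{
    size_t n = my_strlen(src);
    assert(n < 8);           /* the string plus its terminator must fit */
    memcpy(dst, src, n + 1);
}

/* [const 4]: same as `int *const a`. The elements stay writable, but the
 * pointer itself cannot be reassigned inside the function. */
static void clear4(int a[const 4])
{
    for (size_t i = 0; i < 4; i++)
        a[i] = 0;
    /* a = NULL;  <- would not compile: `a` is a const pointer */
}
```

Without `static`, the number in the brackets is ignored in a parameter declaration, which is why `char const str[10]` behaves like `char const *str`.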
<sham1> Or at least that's my opinionated take on it, because to me modifying the format string is heresy
<heat> we should make rodata writeable
<nortti> read-ourite data
<sham1> On a related note, I need to figure out how to have ld separate .rodata from .text, just so I can make .rodata not executable
<sham1> So it'd be its very own program segment
xenos1984 has joined #osdev
<heat> doesn't yours do it by default?
<sham1> Nah, it just shoves it into the R-X segment alongside .text. Not to mention that also happening outside of OSDev as well like with Linux binaries
<heat> -z separate-code
<sham1> Ah. Good to know
<heat> i think that should work, but please try
dude12312414 has joined #osdev
craigo has quit [Ping timeout: 252 seconds]
kof123 has left #osdev [#osdev]
brunothedev has left #osdev [j'aime de onion frite a huile!]
brunothedev has joined #osdev
bgs has quit [Remote host closed the connection]
brunothedev has left #osdev [as a troll, my mission is to fill libera.chat #osdev room with some thing, here a song: I was born in a Dublin street, where the loyal drums did beat,]
brunothedev has joined #osdev
<moon-child> 'Obvious oversights. Textbook mistakes. Surely not in the OpenBSD console code?'
dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]
Lumia has joined #osdev
<heat> moon-child, is that a quote from the bible
<kazinsal> The Gospel According to tedu
<moon-child> https://research.exoticsilicon.com/articles/unbreaking_utf8_on_the_console I got bored and skipped to the end when they started explaining what utf8 is, but apparently they had a turbo fucked utf8 decoder
<bslsk05> ​research.exoticsilicon.com: ExoticSilicon.com - fixing cringeworthy bugs in the OpenBSD console code
<kazinsal> christ on a bike that CSS
<moon-child> come on kazinsal it's cute
<kazinsal> it's like a da share z0ne meme explaining a bug in an esoteric unix clone
<kazinsal> da mothafuckin /usr/share z0ne
<heat> what is that
Left_Turn has joined #osdev
<sham1> That background is cringe