klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books
<uplime> t/b 7
<uplime> erm
jjuran has quit [Ping timeout: 272 seconds]
Sos has quit [Quit: Leaving]
chartreuse has quit [Read error: Connection reset by peer]
pretty_dumm_guy has quit [Quit: WeeChat 3.2-rc1]
furan has quit [Ping timeout: 272 seconds]
<doug16k> disable builtin for what? how could you possibly beat builtin sincos? https://man7.org/linux/man-pages/man3/sincos.3.html#NOTES
<bslsk05> ​man7.org: sincos(3) - Linux manual page
<doug16k> builtin sincos is there so it can just instantly know the answer is 0.0 and 1.0, for constant 0 angle, for example
<doug16k> or any constant angle
<doug16k> that's one of the dumbest things I have ever seen in a man page
<bslsk05> ​github.com: freebsd-src/e_fmodf.S at master · freebsd/freebsd-src · GitHub
<doug16k> the ONE function that you would think would be twiddling the bits of the mantissa. nope
<doug16k> must be faster than screwing with the mantissa
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
<kingoffrance> something makes me think you are implementing a libm
<kingoffrance> cant quite put my finger on it
<klange> I was overthinking this, why spend the time getting a filesystem parser set up in real mode when I can just _load the entire CD somewhere_ and keep the C code I have parsing ISO9660...
aerona has quit [Quit: Leaving]
<vdamewood> Can we refer to protected and long mode together as prolonged mode?
<klange> So my latest legacy loader iteration hops into unreal mode, loads sectors from the boot disk and copies them to 64MiB until the BIOS throws an error, does the e820 stuff, and then hops on over to a C menu to configure boot parameters.
<klange> Next I think I'll bother actually getting a video mode from VESA and switch the menu to some shiny LFB graphics, and then I need to port my line editor so the kernel command line can be edited directly instead of relying on the check boxes.
<doug16k> kingoffrance, let's call it a micro libm
<doug16k> barely-enough-math-to-do-3d-stuff libm
<doug16k> I am basically copying the public domain stuff at this point
<doug16k> just stuff like trunc, sincos, fmodf
<doug16k> that's probably most of it
<doug16k> oh, sqrtf
<doug16k> it's one instruction on a couple of machines, but I'm going to throw in the fallback for rest
<doug16k> I am using fast inverse sqrt every place I need sqrt so far anyway
<doug16k> you should look at this, it's the most jaw dropping bit twiddle trick I have seen yet: https://en.wikipedia.org/wiki/Fast_inverse_square_root
<bslsk05> ​en.wikipedia.org: Fast inverse square root - Wikipedia
<doug16k> it's accurate to about 1e-6 as shown, 1e-8 if you do that 2nd iteration
<doug16k> for reasonable numbers
<doug16k> +/- 10M
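For readers who have not seen the trick discussed above, here is a minimal C sketch of the fast inverse square root. The name `fast_rsqrt` is made up for illustration; the magic constant is the one from the linked Wikipedia article. It assumes `float` is a 32-bit IEEE-754 value and uses `memcpy` for the bit reinterpretation. With a single Newton step, the article gives the worst-case relative error as roughly 0.2%; a second step tightens it considerably.

```c
#include <stdint.h>
#include <string.h>

/* Approximate 1/sqrt(x) for positive, finite x.
 * Assumption: float is 32-bit IEEE-754 binary32. */
static float fast_rsqrt(float x)
{
    uint32_t bits;
    float y;

    memcpy(&bits, &x, sizeof bits);       /* view the float's bit pattern as an integer */
    bits = 0x5f3759dfu - (bits >> 1);     /* halve and negate the exponent, roughly */
    memcpy(&y, &bits, sizeof y);          /* back to float: a crude first guess */

    y = y * (1.5f - 0.5f * x * y * y);    /* one Newton-Raphson refinement step */
    return y;
}
```

Repeating the last assignment once more gives the "2nd iteration" mentioned in the discussion.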
<moon-child> doug16k: I think the idea is, if you wanted to benchmark, you would want to disable constant folding for those functions
<moon-child> not that the lib functions would actually be faster than the builtins
<moon-child> (for the sincos junk)
<doug16k> maybe they got tired of people asking why sincos is "so slow" sometimes when their little benchmark that computes a constant angle sincos over and over is so fast
<doug16k> not knowing the answer already tends to be slower, yeah
<bslsk05> ​github.com: sleef/sleefsimdsp.c at master · shibatch/sleef · GitHub
<doug16k> we made machines for computing and they actually kind of suck at it
<doug16k> you have to transform what you want to do into something completely unnatural
<moon-child> I dunno if that's entirely fair
<moon-child> 1. physics
<moon-child> 2. there are lots of different kinds of computing, and hard to make one machine good at all of them
<doug16k> yeah, math libs could be way less ugly if they sacrificed some performance. they make them as hideous as necessary to take advantage of every possible symmetry
<moon-child> i mean, you could say the same thing about CPUs. 'sqrtss' is nice, but who knows what contortions the hardware goes through to make that happen
<doug16k> yeah, my divider works by making successively better estimates with multiplication, doing guess and check convergence
<doug16k> not even close to what schoolbook algorithm says
<doug16k> you can parallel the hell out of multiplication
<doug16k> division is sequential
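The "successively better estimates with multiplication" idea can be sketched in software as Newton-Raphson reciprocal iteration. This is only an illustration of the general technique, not a description of any particular hardware divider; the function name `nr_divide` and the choice of four iterations are assumptions made for the example. The loop itself uses only multiplies and subtractions, and each step roughly squares the error.

```c
#include <math.h>

/* Approximate n / d for finite d > 0 without a division in the loop.
 * Hypothetical helper for illustration only. */
static double nr_divide(double n, double d)
{
    int exp;
    double m = frexp(d, &exp);                    /* d = m * 2^exp with m in [0.5, 1) */

    /* Linear first guess for 1/m; the two ratios are compile-time constants. */
    double x = 48.0 / 17.0 - (32.0 / 17.0) * m;

    for (int i = 0; i < 4; i++)
        x = x * (2.0 - m * x);                    /* Newton step: x converges to 1/m */

    return ldexp(n * x, -exp);                    /* undo the scaling: 1/d = (1/m) * 2^-exp */
}
```

The first guess is off by at most 1/17, so four steps are more than enough to reach double precision in this sketch.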
iorem has quit [Quit: Connection closed]
nur has joined #osdev
<doug16k> I wonder why cpu manufacturers think it's okay to not even bother saying how many cycles some instructions can be?
<doug16k> why is it so cool for half a table to have blank in cycles field?
<doug16k> I gotta go see how bad it is first hand? why does that help?
<doug16k> people will go make the most asinine perf test you can imagine and get a completely wrong result
<doug16k> of course linux puts the output side of the sincos math function on the right. I wonder what is with the obsession of wrong parameter order
<moon-child> maybe it has pathological behaviour on some inputs and they don't wanna advertise that (or be responsible when people figure out their 99% perf numbers aren't always accurate)?
<moon-child> who knows
yuu has joined #osdev
<doug16k> I would be completely happy with a range like 18-3600. just best and worst value case, assuming all L1 hits and stuff like all the numbers assume
nyah has quit [Ping timeout: 272 seconds]
<moon-child> yeah, I would be happy with that too, but it might lead to bad press. Sensationalization. W/e.
<doug16k> if that were the case, then nobody would use intel processors. just look at how many cases they stall
<doug16k> when I first read about pentium pro, I started wondering if it ever *didn't* stall
<doug16k> in reality it was amazing
<doug16k> they can be honest
<doug16k> until recently, optimization guides just went on and on about all the awful stalls that happen if you do it wrong. now they are almost saying, do whatever, it's all fast
<doug16k> some things are almost completely solved now
<doug16k> you used to worry about the instruction lengths. now it is almost negligible because the cpu caches the micro-ops and runs them full speed no matter how big they were, in a loop
<doug16k> the max loop length is over 1000 instructions now, very big
aerona has joined #osdev
<doug16k> you intermittently have infinite instruction fetch bandwidth
<doug16k> s/fetch/decode/
<doug16k> it's not even decoding them again, already done. memoization
srjek has quit [Ping timeout: 272 seconds]
ZipCPU has joined #osdev
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
<doug16k> lol, __double_t. why?
<doug16k> what if the gravitational constant of the universe changes? is there a typedef for when all planets collapse into points?
<doug16k> you might be all forgiving, and say that is in case it is a microcontroller or something. nope, this is the body: return ((double)(x + 0x1.8p52) - 0x1.8p52);
<Mutabah> ... what?
<Mutabah> Wait, what is that meant to do? Obtain some sort of floating point noise error?
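As a possible answer to the question above: adding and then subtracting 0x1.8p52 (that is, 1.5 * 2^52) is a common way to round a double to the nearest integer without a function call. At that magnitude doubles are spaced exactly 1.0 apart, so the addition itself does the rounding. The sketch below assumes IEEE-754 doubles, the default round-to-nearest-even mode, inputs well below 2^51 in magnitude, and no `-ffast-math`. The name `round_via_magic` is hypothetical, and the `volatile` is only there to force the intermediate value to real double precision.

```c
/* Round x to the nearest integer (ties to even) using the 1.5 * 2^52 trick.
 * Assumptions: IEEE-754 double, default rounding mode, |x| well below 2^51. */
static double round_via_magic(double x)
{
    volatile double shifted = x + 0x1.8p52;   /* fraction bits fall off the end here */
    return shifted - 0x1.8p52;                /* exact subtraction recovers the integer */
}
```

Under `-ffast-math` the compiler is allowed to cancel the two constants algebraically, which is why that flag breaks the code.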
flx has joined #osdev
<bslsk05> ​github.com: freebsd-src/math_private.h at main · freebsd/freebsd-src · GitHub
<doug16k> how stupid is that for it to return double, and parameter is __double_t
<doug16k> like windows SHORT. as if it will ever not be just short
bsdbandit01 has joined #osdev
<geist> hang on a sec, there's a whole block of comment talking about why
bsdbandit01 has quit [Read error: Connection reset by peer]
<doug16k> how does it being __double_t minimize conversions?
<geist> depends on what it's typedefed to doesn't it?
<doug16k> it's adding a double to it right?
<doug16k> so they convert it or we convert it, same?
<doug16k> I mean assuming it wasn't already double
<doug16k> if it were already double, wouldn't you get even more conversions? down to float then up to double again for add?
<doug16k> if __double_t were float
<geist> i dunno, i'm looking in the freebsd headers and there's some non trivial definitions of it
<doug16k> I don't see how it can help for the parameter to not be double
<geist> including on x86_32 it seems to be typedefed to a 'long double'
<geist> there's an x86/_types.h where it's defined that way
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
<doug16k> fno-associative-math breaks that code
<doug16k> should try in godbolt
<doug16k> would be funny if it special cases that
<geist> yah keep in mind it may be for some arch that you dont care about
<doug16k> ah, -ffast-math to break it https://www.godbolt.org/z/ncjMxG4Mv
<doug16k> yeah, I should just one-instruction it all for x86 targets
<doug16k> fsincos rsqrtss sqrtss
<doug16k> been ages since I wrote an inline asm constraint
<doug16k> for x87
dissident has joined #osdev
<doug16k> oh nice, I can use "t" constraint to have gcc arrange for the angle to be in x87 top of stack
<moon-child> x87 D:
kori has joined #osdev
chin123 has quit [Remote host closed the connection]
chin123 has joined #osdev
<doug16k> lol, x87 fsincos well optimized into the code is 25ns each, calling sinf and cosf (and builtin calling sincosf for me) is 5ns each!
<doug16k> I checked the assembly. it's really doing it
<doug16k> sweeping -pi to pi, so easiest/fastest domain
<bslsk05> ​gist.github.com: testsincos.cc · GitHub
<doug16k> I forgot a loop to rev up the cpu and do a burnout to warm up the tires, though :P
<moon-child> lol, gcc turns the sin-followed-by-cos into a call to sincos
<moon-child> if I tell it to not do that, the times get really variable
<doug16k> I wonder how
<doug16k> if I fno-builtin (zen2) it does become 6-7ns for separate sin then cos call
<doug16k> seems like it just overlaps the two
<doug16k> one sin is too easy
<moon-child> hardcoded. Def-use edges. Looks like they don't have to be in the same block either, e.g. http://ix.io/3pTz turns into a sincosf call
<doug16k> I would be pretty mad if it didn't CSE that out
<doug16k> when you do sin you practically get cos for free. similar to how you get remainder for free if you div on x86
<doug16k> it had to get remainder to do the div
<doug16k> couldn't help it
bradd has quit [Remote host closed the connection]
<doug16k> I expect similar magical things to happen for / and %
<moon-child> handled differently. E.G. http://ix.io/3pTB calls sin and cos separately, but uses a single div
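The quotient-and-remainder point can be illustrated with a tiny C function. The name `div_mod` is made up for the example. On x86 the `div` instruction leaves the quotient and the remainder in two different registers, so a compiler will usually emit a single `div` for the pair of expressions below, though that is a code generation detail rather than something the language guarantees.

```c
/* Compute quotient and remainder of n / d together.
 * With a runtime divisor, x86 compilers typically use one div for both. */
static void div_mod(unsigned n, unsigned d, unsigned *quot, unsigned *rem)
{
    *quot = n / d;
    *rem  = n % d;
}
```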
<moon-child> maybe would be faster to do sincosf because then you don't have to deal with the branch?
<doug16k> I think if it uses sse, it has a cheaper sin-only and cos-only
<doug16k> x87 you have to sincos, there is no separate sin and cos
<moon-child> sse doesn't have trig instructions
<doug16k> if you know you need both, you can get other one for nearly free
<doug16k> I know, but it uses sse to do sin nonetheless
<moon-child> huh, how?
<doug16k> it's just a bunch of mul add and stuff
justyb11 has joined #osdev
<bslsk05> ​github.com: freebsd-src/k_sincos.h at main · freebsd/freebsd-src · GitHub
<doug16k> taylor expansion on steroids that I can't comprehend
<bslsk05> ​github.com: dgos/sin.cc at master · doug65536/dgos · GitHub
<doug16k> my ascii art too :P
<doug16k> unicode art?
<doug16k> has crap loops instead of fmod yes, forgive that bit
<doug16k> fmod not there yet
<doug16k> awful compared to how fast you could make it though
<doug16k> at least I made that one though
<doug16k> 20 divides is pretty rough
<doug16k> rest is free
<doug16k> oh I fixed the hideous fmod hack I had before. it's not asinine anymore
<doug16k> I don't claim to know anywhere near enough math theory to make a math lib
iorem has joined #osdev
bradd has joined #osdev
iorem has quit [Ping timeout: 268 seconds]
aerona has quit [Quit: Leaving]
jjuran has joined #osdev
<klange> This is going to end up being just as hacked up and terrible as the old one, but differently so, but I have at least added simple UDP sockets and moved dns lookups to userspace...
<klange> Still patching together TCP again, managed to get a SYN SYN-ACK'd, so that's a start.
iorem has joined #osdev
bradd has quit [Remote host closed the connection]
<jjuran> His vorpal blade went snicker-SYNACK
<moon-child> T_T
bradd has joined #osdev
Sos has joined #osdev
bradd has quit [Remote host closed the connection]
bradd has joined #osdev
bradd has quit [Remote host closed the connection]
sortie has joined #osdev
bradd has joined #osdev
<doug16k> wow, check out avx512f 4x4 matrix mul: https://www.godbolt.org/z/fdTsjx519
<doug16k> that's matrix * matrix
<moon-child> ooh, slick
<moon-child> also gets misaligned cause the instruction names are so long :P
<kazinsal> always brings a smile to my face when I see the finvsqrt() // what the fuck?
<doug16k> you either say it when you read that line, or when you see how accurate it is
<kazinsal> the switch from casts to memcpys is interesting. does the compiler freak out when trying to vectorize with the evil floating point bit level hacking?
<doug16k> I used memcpy because it is UB to do that aliasing nonsense carmack did
<doug16k> gcc knows what I mean. it's one instruction
<moon-child> -fno-strict-aliasing is your friend
<doug16k> not my friend
<moon-child> ¯\_(ツ)_/¯
<kazinsal> undefined behaviour is like goatse to me. I've grown so used to it that it's like a warm hug from an old friend
<doug16k> I'll do -fyes-actually-optimize-kthxbye
<moon-child> undefined behaviour is bananas. if I need it fast I'll write the loop in asm. Usually I don't need it fast
<doug16k> if you do that, then you make the compiler constantly think maybe every variable changed
<doug16k> completely kills it
<doug16k> every deref it goes, "hmmm, can't completely rule out that it accesses any of these, better restructure so I push those out of register variables to memory, in case these derefs to types I can't check hit one"
<doug16k> with strict aliasing, it can be sure that everything that is a different type doesn't alias and can't change
<doug16k> which adds up
<doug16k> it cripples it partially back to the old days, when it had to push everything out to memory before making a call, just in case
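A small C sketch of what the strict aliasing assumption buys the optimizer. The function name `store_both` is hypothetical. Because an `int` object may not legally be accessed through a `float` lvalue, a compiler building with strict aliasing may assume the store through `f` leaves `*i` untouched.

```c
/* Store through two pointers of unrelated types, then read the first back. */
static int store_both(int *i, float *f)
{
    *i = 1;
    *f = 2.0f;    /* under strict aliasing this cannot legally modify *i */
    return *i;    /* so the compiler may fold this to `return 1` without a reload */
}
```

With `-fno-strict-aliasing` the compiler generally has to reload `*i` after the float store, and the same conservatism applies to every dereference in larger functions.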
<doug16k> just use memcpy when you would aliasing hack, and the compiler does it perfectly
<doug16k> I haven't caught it once calling memcpy when I do it for that
<doug16k> every time it is doing the thing I would do if hand writing the assembly
sortie has quit [Ping timeout: 272 seconds]
<moon-child> largely, but in exchange you have to be hypervigilant every time you're writing not-inner-loop code
<moon-child> (you know, I think we may have had this exact argument at some point in the past :P)
<doug16k> don't do type punning hacks and you won't need to turn off strict aliasing
<doug16k> type pun hacks are UB
<doug16k> I just don't write the UB in the first place, so don't need no-strict-aliasing
<klange> I want to say "not all of them" but I guess the ones that aren't also aren't "hacks"...
<HeTo> just don't reinterpret_cast, use C-style casts or void* or unions. use headers and a proper build system. I think that gets rid of all strict aliasing problems doesn't it?
<HeTo> and it's not like you'll accidentally write reinterpret_cast without meaning to
<doug16k> all you need to do is play like you are memcpy-ing it to a local variable, and use the local variable. builtin_memcpy will see what you did there and it will skip the copy and use it directly, as if you had punned it, but not UB now
<doug16k> the optimizer is required to understand that and not screw up
<doug16k> the optimizer is *allowed* to completely screw up if you type pun
<klange> You can also do that with unions, which may be cleaner.
<doug16k> it's UB in C++ to do that
<doug16k> if you accessed it with one member, all further accesses must be through the same member
<klange> Well C++ is no fun. [Is it still UB if it's extern C?]
<doug16k> if C codegen did it, it's ok
<doug16k> C optimizer isn't allowed to screw it up
<doug16k> structs don't have linkage right?
<klange> There's a pair of evil-but-legal union assignments as casts in my vm that look like this: (((KrkValueDbl){.val = (value)}).dbl)
<moon-child> I think klange means, what if you say extern "C" { void f() { union { int i; float f; } x; x.f = 1; x.i++; } }
<doug16k> I think that would be UB
<doug16k> extern c just changes the linking
<doug16k> right?
<j`ey> you can use C++ code inside an extern C function
<klange> Hm, well, don't try to use this header from C++ then :)
<doug16k> gcc makes it work, in C++
<klange> None of these even have the boilerplate wrappers anyway.
<doug16k> if you are pedantic though, you don't pun through unions
<HeTo> (KrkValueDbl){.val = (value)} wouldn't work in C++ anyway
<HeTo> although GCC supports it in g++ language standard I think, but then again, I think GCC allows type punning through unions in C++ as well
<HeTo> in C++20 you could do KrkValueDbl{.val = (value)}
<klange> This was at one point a static inline with a memcpy.
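For reference, the union-as-cast pattern discussed above looks roughly like this in C. The union type `ExampleBits` and both macro names are invented for illustration and are not taken from klange's code. Reading a union member other than the one last written is permitted in C (the bytes are reinterpreted), while the same access is undefined behaviour in standard C++. The expected bit patterns in the usage note assume IEEE-754 binary64 doubles.

```c
#include <stdint.h>

/* Hypothetical union for reinterpreting a double's bits in C. */
typedef union {
    double   dbl;
    uint64_t val;
} ExampleBits;

/* Compound literals with designated initializers: valid C99, not standard C++. */
#define BITS_TO_DOUBLE(v) (((ExampleBits){ .val = (v) }).dbl)
#define DOUBLE_TO_BITS(d) (((ExampleBits){ .dbl = (d) }).val)
```

For example, `DOUBLE_TO_BITS(1.0)` yields `0x3FF0000000000000` on an IEEE-754 machine.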
<moon-child> (linus torvalds calls anyone who compiles with -fstrict-aliasing a 'fucking moron'. https://lkml.org/lkml/2018/6/5/769. My opinion of the flag aside, I don't think I would want to work on linux)
<doug16k> ever since I saw how well and aggressively builtin memcpy skips the copy and uses it directly, I'm hooked
<bslsk05> ​lkml.org: LKML: Linus Torvalds: Re: [GIT PULL] Device properties framework update for v4.18-rc1
<doug16k> you get no UB, ubsan doesn't trap it, and it goes just as fast, and works properly every time
<klange> Well, the flag is enabled by default if other conditions are satisfied, and if those conditions aren't satisfied you shouldn't demand it.
sortie has joined #osdev
<doug16k> moon-child, yes, coming from the person who proclaims the most insecure, leaky language ever is the best for kernels
<doug16k> (Linux)
<doug16k> Linus*
<doug16k> if you asked what is a portable completely unstable language that leaks all resources and crashes at the drop of a hat, I'd say, really promptly, "C!"
<sortie> That question is undefined behavior
<kazinsal> I think if I were an accidental living icon of the cult of free software like linus is I would probably also be having freakouts at the drop of a hat
<sortie> omg it's kazinsal
<kazinsal> it me
<sortie> The icon of osdev
<kazinsal> oh no
<sortie> All praise the great one
<doug16k> I love C too
<sortie> I have read and memorized all your codes
<doug16k> my point is, Linus' opinion on a topic doesn't matter
<kazinsal> it does to GNU Plus Linux cultists
* sortie is technically not a card carrying FSF member anymore
<kazinsal> actually, GNU Plus Linux cultists are probably more opinionated about that godawful brace style the GNU coding style uses
<moon-child> doug16k: 'opinion' doesn't matter that much; the arguments are actually largely valid. But calling people fucking morons is not really a great way to interact
<moon-child> I mean, I'm not going to take his word for it regardless; 'appeal to authority' et al
<sortie> kazinsal: It's a feature. “Don't use this code”
<kazinsal> GPLv4 will require you to reformat any code licensed under it to GNU style
<doug16k> he has an excuse for his behaviour. he thinks where he's from gives him blanket authorization to treat people that way
<j`ey> moon-child: this was before he stepped back for a bit and worked on his communication skills :P
<moon-child> doug16k: do you happen to know: I remember hearing that gnu make has some kind of job server that makes recursive make perform correctly. Is that right? Do you need to explicitly set it up or does it happen automatically?
<kazinsal> now he only blows up on people who deserve it
<moon-child> lol
<doug16k> moon-child, yeah if you call the make with $(MAKE) it does the magic
<moon-child> cool
<doug16k> that's my understanding from the docs, I don't use recursive makes so I am not 100% about it
<moon-child> ok
<sortie> moon-child: Yeah it happens automatically via the MAKEFLAGS variable if the command looks like a make invocation, I believe
gog has joined #osdev
pretty_dumm_guy has joined #osdev
Belxjander has joined #osdev
Lucretia has joined #osdev
GeDaMo has joined #osdev
gareppa has joined #osdev
pg12 has quit [Ping timeout: 252 seconds]
gareppa has quit [Quit: Leaving]
tenshi has joined #osdev
farcas has joined #osdev
dormito has quit [Ping timeout: 268 seconds]
farcas has quit [Ping timeout: 244 seconds]
piotr_ has quit [Remote host closed the connection]
piotr_ has joined #osdev
piotr_ has quit [Ping timeout: 264 seconds]
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
nyah has joined #osdev
farcas has joined #osdev
Arthuria has joined #osdev
gog has quit [Ping timeout: 252 seconds]
dormito has joined #osdev
pg12 has joined #osdev
piotr_ has joined #osdev
Mikaku has joined #osdev
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
iorem has quit [Ping timeout: 252 seconds]
piotr_ has quit [Ping timeout: 264 seconds]
iorem has joined #osdev
ahalaney has joined #osdev
<klange> appropriate screenshot from Windows, what with that weather widget panel thing: https://cdn.discordapp.com/attachments/711112727426367571/853978105446400010/qemu_weather_tool.PNG
isaacwoods has joined #osdev
bsdbandit01 has joined #osdev
opios2 has joined #osdev
<klange> Need to fix ACKing FINs and other general socket closure stuff to at least get this up to the same lack of quality as the old stack, as well as remove or disable all the debug prints...
bsdbandit01 has quit [Read error: Connection reset by peer]
<Belxjander> klange: I need to write a network stack... can we collab ?
vdamewood has quit [Quit: Life beckons]
iorem has quit [Ping timeout: 268 seconds]
kspalaiologos has joined #osdev
hgoel[m] has quit [Ping timeout: 244 seconds]
medvid has quit [Read error: Connection reset by peer]
paulusASol has quit [Read error: Connection reset by peer]
carmysilna has quit [Read error: Connection reset by peer]
carmysilna has joined #osdev
OSdever has joined #osdev
<OSdever> Hello. A question that will seem stupid: how is it that videos are smoothly played on a modern desktop operating system? It seems to me that the hullabaloo of preemption would cut up playback and make it un-smooth.
paulusASol has joined #osdev
hgoel[m] has joined #osdev
medvid has joined #osdev
<GeDaMo> Buffering
<GeDaMo> Also, computers are fast
<Mutabah> and sometimes some level of hardware acceleration
<Mutabah> even just the video card converting YUV into RGB for you
<OSdever> I had thought it must involve some kind of buffering. But yet even with a basic framebuffer driver on a relatively slow computer, I can cleanly watch a video. So the video playback software is being given time to run frequently enough to push out the video to the framebuffer 50 times a second, at the right time? It is a thing I know nothing about, though I make good progress with my own OS, and soon to port GNU Bash
<Mutabah> What basic framebuffer driver?
<OSdever> NetBSD WSCons VESA-FB on a Pentium III 500 MHz
<Mutabah> also, on some systems (thinking WinXP and earlier) video playback had some interesting direct framebuffer access hacks - allowing the video decoder to blit directly to the screen
<OSdever> Yes, I know nothing about video and graphics, yet this is fascinating me
<OSdever> Mutabah: I thank you for inventing your Acess 2 OS in days of yore, it was for me an interesting project to follow, and inspired me that I could also write an operating system
gareppa has joined #osdev
gareppa has quit [Remote host closed the connection]
iorem has joined #osdev
srjek has joined #osdev
iorem has quit [Quit: Connection closed]
Vercas has quit [Remote host closed the connection]
Vercas has joined #osdev
bsdbandit01 has joined #osdev
<graphitemaster> OSdever, Generally the way modern graphics works (and how videos are played smoothly) is through a presentation protocol that involves a swapchain and device timestamps, the application is almost always double buffered (these images come from the swapchain) and your process should ideally be presenting an image every 16 milliseconds, the hardware (GPU) has a legitimate interrupt that is fired for each display's vsync (since there can be
<graphitemaster> multiple), these interrupts wakeup the kernel mode driver in the OS and the KMD is responsible for putting all applications _waiting_ on vblank (usually with a swapbuffers call) on the run queue to schedule again, presenting the back of the buffer (one not rendered to) and then flips (for double buffering)
dennis95 has joined #osdev
<graphitemaster> The cycle of images in the swapchain behaves like a ring buffer conceptually but depending on the way the application requests the type of presentation it may be relaxed.
<graphitemaster> This is also how it avoids vsync tear.
<graphitemaster> The scheduling quanta of the OS itself needs to be consistent enough to ensure a graphics application can present every 33ms or so.
<graphitemaster> If the OS cannot maintain that then you're not going to have smooth anything.
bsdbandit01 has quit [Read error: Connection reset by peer]
<graphitemaster> This entire system is different for high framerate displays and variable refresh rate rendering.
<graphitemaster> Also yes, HW accelerated decoding and presenting of video is very common too.
<graphitemaster> There's also HW overlays
<graphitemaster> Which skip the OS completely, they're done completely on GPU.
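A very rough C model of the double-buffered flip described above. Everything here is hypothetical: the `struct swapchain` layout, the function names, and the idea that a vblank handler calls `on_vblank` directly are all simplifications for illustration. Real drivers track fences, timestamps, and per-display state that this sketch leaves out.

```c
enum { IMAGE_COUNT = 2 };   /* double buffering: one front image, one back image */

/* Hypothetical, heavily simplified swapchain state. */
struct swapchain {
    int front;      /* index of the image currently being scanned out */
    int pending;    /* nonzero once the application has finished the back image */
};

/* The image the application is allowed to render into. */
static int back_index(const struct swapchain *sc)
{
    return (sc->front + 1) % IMAGE_COUNT;   /* images cycle like a ring buffer */
}

/* Called by the application when it has finished drawing a frame. */
static void queue_present(struct swapchain *sc)
{
    sc->pending = 1;
}

/* Called once per vblank. Flips only between scanouts, which avoids tearing.
 * Returns 1 if a flip happened, 0 if the previous frame is shown again. */
static int on_vblank(struct swapchain *sc)
{
    if (!sc->pending)
        return 0;                   /* the application missed this refresh */
    sc->front = back_index(sc);     /* the finished back image becomes the front */
    sc->pending = 0;
    return 1;
}
```

In this model a process that blocks until the next flip only needs to be scheduled once per refresh interval, which fits the point that the scheduler merely has to be consistent at the tens-of-milliseconds scale.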
<Bitweasil> Also, 33ms is a long time in modern computers.
<Bitweasil> brb, need to reboot.
srjek has quit [Ping timeout: 272 seconds]
<Bitweasil> <3 ZNC.
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
ephemer0l has joined #osdev
justyb11 has quit [Quit: Leaving]
justyb11 has joined #osdev
mahmutov has joined #osdev
gruetze_ is now known as gruetzkopf
piotr_ has joined #osdev
Arthuria has quit [Ping timeout: 272 seconds]
matt|home has quit [Remote host closed the connection]
nur has quit [Read error: Connection reset by peer]
nur has joined #osdev
<geist> doug16k: oooh matrix multiply
<geist> you mean not in 4 instructions like on the SH-4? noice
gog has joined #osdev
tenshi has quit [Quit: WeeChat 3.1]
OSdever has quit [Quit: CGI:IRC (Session timeout)]
farcas has quit [Ping timeout: 264 seconds]
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
myon98 has quit [*.net *.split]
myon98 has joined #osdev
m3a has joined #osdev
benjif has joined #osdev
<doug16k> geist, yeah like that except with about 1000x more cache, running about 20 times the clock speed
<doug16k> SH-4 nop might keep up with avx-512 matrix mul yeah
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
m3a has quit [Quit: leaving]
<kazinsal> woohoo, macbook arrived
GeDaMo has quit [Quit: Leaving.]
<kazinsal> it's going to take a while to get used to the command key being next to space
<kazinsal> whew
<j`ey> M1?
dormito has quit [Ping timeout: 272 seconds]
<kazinsal> yep
<Bitweasil> ooooh
<Bitweasil> Fancy!
<Bitweasil> I want one, just can't justify it in the slightest.
<kazinsal> My archaic thinkpad finally gave up the ghost so I decided to give Apple a go
kspalaiologos has quit [Quit: Leaving]
<Bitweasil> Super fast, should outlast even a battery-heavy Thinkpad in battery life.
<Bitweasil> And runs x86 code faster than an awful lot of x86 machines.
<kazinsal> 30 minutes of using it has already made all my desktop's monitors look like cheap garbage in comparison :(
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
<Bitweasil> There is that, yes. :(
<geist> doug16k: heh yeah but more like it was neat that it was just a single instruction for vector * matrix
<geist> kazinsal: oh woot
<geist> yah i've been a mac laptop user since about 2003. solid pieces of kit
<geist> always harder to justify a mac pro, but as far as laptops are concerned they Just Work
<geist> i have zero interest in trying to maintain a windows or linux laptop and deal with whatever idiosyncrasies they may have
<Bitweasil> Back in 2003, they were about the only thing that would actually sleep and wake 10 times in a row reliably.
benjif has quit [Quit: Leaving]
<kazinsal> So now I'm fighting with authenticator apps and the like, of course
<kazinsal> and attempting to multitask with my desktop, my macbook, my iPhone, and now also powering up my old Pixel because I appear to have forgotten to transfer the blizzard authenticator off of it...
vdamewood has joined #osdev
dormito has joined #osdev
<geist> Bitweasil: yeah that's true too
pretty_dumm_guy has quit [Quit: WeeChat 3.2-rc1]
<doug16k> geist, yeah, that is pretty neat
<moon-child> apparently powerpc has an instruction to transpose a bitmatrix
<doug16k> that code I pointed out was matrix * matrix, not matrix * vector
<doug16k> as in 64 multiplies and 48 adds
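The operation count above can be checked against a plain scalar 4x4 matrix multiply. This is a reference sketch, not the AVX-512 code from the godbolt link. The name `mat4_mul` and the row-major `float[4][4]` layout are assumptions made for the example. Each of the 16 outputs needs 4 multiplies and 3 adds, which gives 64 multiplies and 48 adds in total.

```c
/* out = a * b for row-major 4x4 matrices. out must not alias a or b. */
static void mat4_mul(float out[4][4], float a[4][4], float b[4][4])
{
    for (int r = 0; r < 4; r++) {
        for (int c = 0; c < 4; c++) {
            float sum = a[r][0] * b[0][c];      /* 1 multiply */
            for (int k = 1; k < 4; k++)
                sum += a[r][k] * b[k][c];       /* 3 more multiplies, 3 adds */
            out[r][c] = sum;
        }
    }
}
```

A vectorized version computes whole rows or the whole result at once, but it performs the same arithmetic.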
<moon-child> doug16k: right, that's why it would be 4 instructions
<doug16k> the x86 hate in this room is a real bummer you know
dennis95 has quit [Quit: Leaving]
<kazinsal> sounds like Intel needs to do something fuckin' interesting for once
<moon-child> doug16k: x86 is my guilty pleasure
<doug16k> you know the argument is weak when people compare a cpu with near zero clock speed against pushing 5GHz
<j`ey> I think you might have taken the sh4 comment a little too seriously!
<doug16k> funny though how the "CISC" processor uses several instructions, and the RISC one does one
<kc8apf> x86 took forever to gain anything close to ppc vperm
Sos has quit [Quit: Leaving]
srjek has joined #osdev
<moon-child> doug16k: I like fuzxxl's take: so-called 'risc' has converged on cisc, because everything about original risc was a bad idea except for load/store and simplified encodings
Sos has joined #osdev
<doug16k> which one is more risc-like, x86, or arm64?
<doug16k> the one with only two registers per instruction is the less complex one right?
<moon-child> lol
<doug16k> instructions have less information on x86
<doug16k> you might say the x86 instructions are 33% simpler than arm64 ones
<Bitweasil> Hm.
<Bitweasil> movs pc, r0
<Bitweasil> That *shouldn't* work in HYP mode on ARMv7, correct?
<Bitweasil> Linux kernel seems to be doing this in response to a HVC call, which isn't correct per the manual, unless it's trying to trap into the undefined instruction handler, or I'm decoding stuff badly wrong. :/
<Bitweasil> Actually... hrm.
<Bitweasil> That's not even a valid return address in R0.
* Bitweasil has confused the Linux kernel somewhere...
froggey has quit [Ping timeout: 272 seconds]
ahalaney has quit [Quit: Leaving]
froggey has joined #osdev
NieDzejkob_ has quit [Ping timeout: 244 seconds]
NieDzejkob has joined #osdev
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
sortie has quit [Quit: Leaving]
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]