#osdev on 2022-08-06 — irc logs at libera.irclog.whitequark.org

2021-05-23 01:57 klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books

00:14 <heat> i was going to ask why jmp couldnt just alias to mov ..., pc but I realize that it's a very inefficient jump opcode

00:17 [itchyjunk] has quit [Ping timeout: 268 seconds]

00:17 <geist> Yah, there are a few arches where PC is a regular register like that; PDP-11 and arm32 were two of them

00:18 <geist> However, arm32 still has regular branch instructions too, mostly so you can get access to more bits for an offset

00:18 <geist> So you had both: b<cc>, bl <offset>, blx <register>, and a bunch of mov/add/sub/etc pc stuff

00:18 <geist> Actually kinda a mess when you think about it, since there are a bunch of ways to do the same thing

00:18 <geist> Neat when you’re writing assembly, but hard to optimize

00:19 dude12312414 has quit [Remote host closed the connection]

00:20 dude12312414 has joined #osdev

00:20 <geist> Though it’s sort of a generic RISC thing to do (the arm32 solution), in another way it’s sort of anti-RISC because you have multiple ways to accomplish the same thing

00:20 <geist> So it depends on how you look at it

00:21 [itchyjunk] has joined #osdev

00:21 <geist> In the case of arm64 at least all of the regular registers have no special purpose *except* x30, which is also known as lr. It’s implicitly used in bl and ret instructions

00:22 <geist> Riscv is sort of cleaner and worse since none of the registers have any special case *except* the arch highly recommends you use certain regs for SP, RA, etc. It goes out of its way to say that if you use them for those functions, implementations may optimize for it

00:23 <geist> Which really means they may as well be special regs. I think this was the crux of ARM’s decision in arm64 to do that

00:27 dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]

00:57 <heat> yeah

00:57 <heat> also that reminds me

00:57 <heat> how do call stacks look?

00:57 <heat> how do you unwind the stack?

00:58 <heat> is it assumed you're saving lr at the top of the stack before doing bl?
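
For reference, one common arm64 answer, assuming code is built with frame pointers as AAPCS64 describes: x29 points at a two-word frame record holding the caller's x29 and the saved x30, so an unwinder just follows that linked list. The sketch below walks such a chain in plain C; `struct frame_record` and `unwind_frames` are illustrative names, and a real unwinder would also have to take the innermost return address from x30 itself when it interrupts a leaf function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* AAPCS64 frame record: x29 (fp) points at a {previous fp, saved lr} pair. */
struct frame_record {
    const struct frame_record *prev_fp; /* caller's record, NULL at the outermost frame */
    uintptr_t saved_lr;                 /* return address into the caller */
};

/* Illustrative helper: collect up to max return addresses, innermost first. */
size_t unwind_frames(const struct frame_record *fp, uintptr_t *out, size_t max) {
    assert(out != NULL);
    size_t n = 0;
    while (fp != NULL && n < max) {
        out[n++] = fp->saved_lr;
        fp = fp->prev_fp;   /* step to the caller's frame record */
    }
    return n;
}
```

On real hardware the starting pointer would come from x29 (for example via `__builtin_frame_address(0)`), and each `prev_fp` should be range-checked against the stack bounds before it is dereferenced.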

01:01 <zid> I tend to use tweezers and a slightly blunt scalpel for scraping

01:11 <dh`> well no, a special reg is really one that's not in the general register file so you can't address it in most instructions, and that's definitely bad for the stack pointer

01:11 <dh`> for ra, not so clear

01:23 heat has quit [Ping timeout: 240 seconds]

01:31 vai has joined #osdev

02:11 freakazoid333 has quit [Ping timeout: 255 seconds]

02:11 <geist> dh`: depends. lets the cpu optimize it better if push/pops are always against a particular special register. key is riscv says 'the stack can be anything' but then it basically says 'it really needs to be this one'

02:11 <geist> same with ra

02:12 <geist> so it's competing goals of a generic ISA that's clean and pure and one that is designed for high speed implementations, but of course i'm sure these topics have been beaten to death elsewhere

02:15 <geist> but i think really a lot of these boil down to 'i like how arm64 does it' vs 'riscv is designed to be extremely simple' so it's a give and take. both are nice in various ways

02:20 <clever> geist: how did riscv deal with hypervisors and nested virtualization?

02:20 <geist> nested virt i dunno, hypervisors it's kinda more similar to x86 style than ARM style

02:21 <geist> basically in supervisor mode you can (if you're allowed to) set up a set of banked registers and then switch to virtualized mode, which then switches modes such that the cpu then appears to be running in supervisor but is really using the banked copies

02:21 <geist> traps out as usual, nested paging as usual

02:21 <geist> so it's kinda a side step, or maybe if you considered supervisor to be EL1, you drop to EL0.9 or something

02:22 <geist> nested seems like it'd be kinda straightforward though you'd have to trap fiddling with those regs

02:23 <geist> i think initially they were planning on doing hard nested levels, a la armv8.

02:23 <geist> there's even an unused mode bit combination for it

02:24 <geist> note banking the control registers like that on riscv is fairly straightforward since there are only a handful of them. fewer than 10 or so

02:24 <geist> so it's pretty simple, at least for now. as more state is inevitably added to the architecture, we'll see

02:25 <clever> ah

02:26 <geist> dunno if linux has KVM support for that yet, but qemu has emulation support for it at least

02:29 <clever> i was talking in #arm on the osdev discord, about how to handle the traps, for my LK hypervisor idea i mentioned before

02:29 <clever> and the basic conclusion i came to, is that you can treat it like `eret` returns, kinda

02:29 <clever> create a function that is split in half

02:29 <geist> https://github.com/kvm-riscv/linux/tree/master/arch/riscv/kvm looks like it's pretty recent, probably still being worked on

02:29 <bslsk05> github.com: linux/arch/riscv/kvm at master · kvm-riscv/linux · GitHub

02:30 <vai> Sane virtualization A.I. start up am planning.

02:30 <clever> the 1st half, will save all EL2 gprs, restore the EL1 gprs, restore the spsr/elr, and eret down to EL1, leaving a bunch of saved state on the EL2 stack

02:30 <geist> yep, that seems like a pretty standard way to do it

02:30 <clever> the "exception from lower level" vector, will then assume the EL2 stack is in that state, save the EL1 regs, restore the EL2 regs, and then `bx lr`

02:31 <clever> so the function from the 1st half will return, like a normal function
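
The "function split in half" idea can be tried out in user space. The sketch below is only an analogy, assuming a POSIX system with `<ucontext.h>` (Linux/glibc): `swapcontext` stands in for the GPR save/restore plus `eret`, and `guest_trap` stands in for the "exception from lower level" vector. All names are hypothetical and none of this is LK code.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t host_ctx;            /* plays the saved EL2 register state */
static ucontext_t guest_ctx;           /* plays the saved EL1 register state */
static char guest_stack[64 * 1024];
static int exit_reason;

/* Second half ("exception from lower level"): save guest, restore host. */
static void guest_trap(int reason) {
    exit_reason = reason;
    swapcontext(&guest_ctx, &host_ctx);
}

/* Stand-in guest: traps twice, then keeps trapping with reason 3. */
static void guest_main(void) {
    guest_trap(1);
    guest_trap(2);
    for (;;)
        guest_trap(3);
}

void guest_init(void) {
    int rc = getcontext(&guest_ctx);
    assert(rc == 0);
    (void)rc;
    guest_ctx.uc_stack.ss_sp = guest_stack;
    guest_ctx.uc_stack.ss_size = sizeof guest_stack;
    guest_ctx.uc_link = &host_ctx;
    makecontext(&guest_ctx, guest_main, 0);
}

/* First half: save host state and "eret" into the guest.
 * To the caller it looks like a normal function that returns on the next trap. */
int run_guest(void) {
    swapcontext(&host_ctx, &guest_ctx);
    return exit_reason;
}
```

After `guest_init()`, each `run_guest()` call resumes the guest where it last trapped and returns the next exit reason, which is the property clever describes: the host side never sees the split.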

02:31 <geist> and you see how the EL2 virtualization feature bit makes that even simpler

02:31 <geist> since it removes the need to bounce through EL2 for the host kernel

02:31 <clever> yeah, just run LK entirely in EL2

02:31 <clever> and either use that optional feature to alias EL1->EL2, or rewrite half of LK to use the EL2 names

02:32 <geist> yah the latter would probably involve macroizing most of it

02:32 <geist> so you could set a build flag

02:32 <clever> exactly

02:32 <geist> probably not too terribly difficult
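
A rough sketch of what "macroizing" behind a build flag could look like. `KERNEL_EL`, `ELX`, `SYSREG_NAME`, `READ_SYSREG` and `WRITE_SYSREG` are made-up names for illustration, not existing LK macros. The accessor macros use GNU C inline asm and are only meaningful in an AArch64 kernel build; since they are never expanded here, the rest compiles on any host.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical build flag: exception level the kernel is built for. */
#ifndef KERNEL_EL
#define KERNEL_EL 1
#endif

#if KERNEL_EL == 1
#define ELX(reg) reg##_el1
#elif KERNEL_EL == 2
#define ELX(reg) reg##_el2
#elif KERNEL_EL == 3
#define ELX(reg) reg##_el3
#else
#error "KERNEL_EL must be 1, 2 or 3"
#endif

#define STRINGIFY_(x) #x
#define STRINGIFY(x) STRINGIFY_(x)
#define SYSREG_NAME(reg) STRINGIFY(ELX(reg))   /* e.g. vbar -> "vbar_el1" */

/* AArch64 kernel builds only (GNU C statement expression). */
#define READ_SYSREG(reg) ({                                        \
    unsigned long v_;                                              \
    __asm__ volatile("mrs %0, " SYSREG_NAME(reg) : "=r"(v_));      \
    v_; })
#define WRITE_SYSREG(reg, val) \
    __asm__ volatile("msr " SYSREG_NAME(reg) ", %0" :: "r"((unsigned long)(val)))

/* Host-side sanity check: the selected name ends in the configured level. */
void check_sysreg_names(void) {
    assert(strcmp(SYSREG_NAME(vbar), "vbar_el" STRINGIFY(KERNEL_EL)) == 0);
}
```

Kernel code would then write `READ_SYSREG(vbar)` or `WRITE_SYSREG(sctlr, v)` everywhere and pick the level with `-DKERNEL_EL=2`. Registers that do not exist at every level still need per-level handling; a suffix swap alone does not cover them.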

02:32 <clever> and once its a build flag, you could also run LK in EL3 as well

02:32 <geist> had to basically do that for riscv so it can run in M and S mode

02:33 <geist> indeed

02:33 <clever> and macro the "guest" stuff a bit, and boom, you now have a secure firmware

02:33 <geist> that's a fun project, get to it!

02:33 <geist> and do it in lots of commits, that's much easier for me to take

02:33 <geist> build the mechanism first, then commits to start converting parts

02:34 <clever> i was thinking the first step, would be to just allow LK to run normally in EL2 or EL3

02:34 <clever> without actually doing anything different while in those levels

02:34 <geist> right

02:34 <geist> but even more so than that, figure out the mechanism to statically build it for the different modes

02:34 <clever> yeah

02:35 <geist> find all the places where registers are referred to and see about making it a compile time thing

02:35 <geist> prototype a mechanism, then find all the places where it'd need to get changed and see what happens

02:35 <clever> yeah

02:35 <geist> EL2 is difficult to generically run in, however, just as a warning. it doesn't support all of the features that you think exist in EL1

02:35 <clever> for example?

02:35 <geist> or at least not without the feature that makes it look like EL1

02:35 <geist> like TTBR1

02:36 <geist> 'high' virtual memory

02:36 <clever> ah, for the high/low split

02:36 <clever> kinda makes sense, you would just have a single virtual view for the whole hypervisor, and EL2 isnt meant to be dealing with EL0

02:36 <geist> basically that EL2 feature (forget the FEAT_* name) basically 'completes' EL2 while also giving it an ability to alias EL1 registers

02:36 <clever> and then its EL1's job to deal with EL0 and need a split
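
To make the high/low split concrete: in the EL1&0 translation regime, addresses whose upper bits are all zeros are translated through TTBR0_EL1 and addresses whose upper bits are all ones through TTBR1_EL1. Without the Virtualization Host Extensions (FEAT_VHE, ARMv8.1), the EL2 regime has only TTBR0_EL2, so there is no 'high' half. The sketch below classifies an address under simplifying assumptions: the same VA width for both halves and no address tagging. The enum and function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

enum va_region { VA_TTBR0, VA_TTBR1, VA_INVALID };

/* va_bits: virtual address width of each half, e.g. 48 when T0SZ = T1SZ = 16. */
enum va_region classify_va(uint64_t va, unsigned va_bits) {
    assert(va_bits > 0 && va_bits < 64);
    uint64_t top = va >> va_bits;            /* bits above the translated range */
    uint64_t all_ones = UINT64_MAX >> va_bits;
    if (top == 0)
        return VA_TTBR0;                     /* low half */
    if (top == all_ones)
        return VA_TTBR1;                     /* high half */
    return VA_INVALID;                       /* would take a translation fault */
}
```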

02:37 <geist> anyhoo, gotta go. doing an escape room in a bit. should be fun!

02:37 <geist> ttyl! unless i dont make it out

02:37 <clever> yep, laters

02:37 <clever> lol

02:48 gog has quit [Ping timeout: 245 seconds]

03:17 <vai> Well, so that was 30 minutes or something spent walking.

03:17 <vai> oops :) sorry

03:19 frkzoid has joined #osdev

03:23 <vai> ramdisk driver is not good, but having block drivers buggy is much easier to fix than core file system corruptions, no file system corruptions

03:58 [itchyjunk] has quit [Remote host closed the connection]

04:17 gildasio has quit [Remote host closed the connection]

04:22 gildasio has joined #osdev

04:26 srjek has quit [Ping timeout: 244 seconds]

04:41 vai has quit [Remote host closed the connection]

05:17 <geist> Yay i escaped

05:17 <geist> 3 minutes to spare

05:54 poyking16 has joined #osdev

06:00 MiningMarsh has quit [Quit: ZNC 1.8.2 - https://znc.in]

06:04 MiningMarsh has joined #osdev

06:12 ThinkT510 has quit [Quit: WeeChat 3.6]

06:17 ThinkT510 has joined #osdev

06:37 the_lanetly_052 has joined #osdev

07:43 poyking16 has quit [Ping timeout: 245 seconds]

07:45 poyking16 has joined #osdev

08:57 opal has quit [Remote host closed the connection]

08:58 opal has joined #osdev

09:01 GeDaMo has joined #osdev

09:09 gildasio has quit [Remote host closed the connection]

09:10 gildasio has joined #osdev

09:14 gildasio has quit [Remote host closed the connection]

09:14 gildasio has joined #osdev

09:15 <moon-child> froggey: follow up to your question from many months ago: https://weinholt.se/scheme/alignment-check.pdf says 'Using alignment checking does therefore not appear to incur any overhead as long as programs do not generate an excessive number of exceptions.'

09:39 the_lanetly_052_ has joined #osdev

09:41 the_lanetly_052 has quit [Ping timeout: 240 seconds]

09:44 jjuran has quit [Quit: Killing Colloquy first, before it kills me…]

09:45 jjuran has joined #osdev

10:14 vdamewood has joined #osdev

10:20 gxt___ has quit [Remote host closed the connection]

10:21 gxt___ has joined #osdev

10:24 terminalpusher has joined #osdev

10:24 Vercas6 has quit [Quit: Ping timeout (120 seconds)]

10:25 Vercas6 has joined #osdev

10:50 terminalpusher has quit [Remote host closed the connection]

10:51 <froggey> moon-child: oh interesting, thanks!

10:56 poyking16 has quit [Ping timeout: 268 seconds]

10:58 poyking16 has joined #osdev

11:01 gog has joined #osdev

11:08 socksonme_ has joined #osdev

11:09 poyking16 has quit [Ping timeout: 268 seconds]

11:11 poyking16 has joined #osdev

11:19 poyking16 has quit [Ping timeout: 240 seconds]

11:19 Vercas6 has quit [Remote host closed the connection]

11:20 Vercas6 has joined #osdev

11:22 poyking16 has joined #osdev

11:25 [itchyjunk] has joined #osdev

11:45 poyking16 has quit [Ping timeout: 268 seconds]

11:47 poyking16 has joined #osdev

12:04 gildasio has quit [Remote host closed the connection]

12:04 gxt___ has quit [Remote host closed the connection]

12:05 gxt___ has joined #osdev

12:05 gildasio has joined #osdev

12:13 node1 has joined #osdev

12:14 <node1> Hi

12:14 <node1> Can we consider a snapshot to be (full + incremental) data?

12:15 poyking16 has quit [Ping timeout: 245 seconds]

12:16 <mjg> i think you may want to provide some context for the question

12:18 poyking16 has joined #osdev

12:20 terminalpusher has joined #osdev

12:20 <node1> sure, there is an option in the hypervisor for full and linked clones. And for a linked clone it says `link clone has new snapshot will be created in the original virtual machine`

12:21 <node1> So i would like to understand whether this snapshot is incremental copies, differential copies, or full + incremental copies?

12:29 <mjg> that probably depends on the hv, you should ask on the related channel

12:29 <mjg> there are funny hv-specific formats for achieving this

12:30 poyking16 has quit [Ping timeout: 268 seconds]

12:30 node1 has quit [Ping timeout: 252 seconds]

12:30 node1 has joined #osdev

12:36 node1 has quit [Ping timeout: 252 seconds]

12:57 SpikeHeron has quit [Quit: WeeChat 3.0]

13:11 vai has joined #osdev

13:11 <vai> way back

13:12 <kazinsal> in most hypervisor platforms a "snapshot" consists of functional metadata + the delta

13:13 <kazinsal> in which the delta of the persistent storage is only valid assuming the source is unchanged

13:14 <kazinsal> in dangerously common parlance, that condition not being met means that the resulting system is "undefined"

13:14 <kazinsal> nobody clones the base disk for a couple of reasons

13:14 <kazinsal> most notably it means hitting the snapshot button immediately doubles the storage requirements of the VM

13:16 <kazinsal> whether or not you implement that by freezing the VM and cloning the disk on backing media then unfreezing and forking doesn't realistically matter
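
The "metadata + delta" model can be sketched as a copy-on-write overlay: reads come from the delta if the block was rewritten after the snapshot, otherwise from the untouched base. This is a toy in-memory model with invented names and sizes, not any particular hypervisor's on-disk format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16u   /* tiny blocks keep the example readable */
#define NUM_BLOCKS 4u

/* Hypothetical layout: a frozen base disk plus a sparse delta of rewritten blocks. */
struct snapshot_disk {
    const uint8_t *base;                     /* must stay unchanged after the snapshot */
    uint8_t delta[NUM_BLOCKS][BLOCK_SIZE];   /* blocks written since the snapshot */
    bool in_delta[NUM_BLOCKS];               /* the "functional metadata" */
};

void disk_init(struct snapshot_disk *d, const uint8_t *base) {
    memset(d, 0, sizeof *d);
    d->base = base;
}

/* Writes never touch the base; they land in the delta. */
void disk_write_block(struct snapshot_disk *d, unsigned blk, const uint8_t *data) {
    assert(blk < NUM_BLOCKS);
    memcpy(d->delta[blk], data, BLOCK_SIZE);
    d->in_delta[blk] = true;
}

/* Reads prefer the delta and fall through to the base. */
void disk_read_block(const struct snapshot_disk *d, unsigned blk, uint8_t *out) {
    assert(blk < NUM_BLOCKS);
    const uint8_t *src = d->in_delta[blk]
        ? d->delta[blk]
        : d->base + (size_t)blk * BLOCK_SIZE;
    memcpy(out, src, BLOCK_SIZE);
}
```

Because unmodified blocks are read straight from the base, changing the base after the snapshot silently changes what the snapshot contains, which is the "undefined" condition described above.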

13:25 Piraty has quit [Quit: -]

13:31 Piraty has joined #osdev

13:33 heat has joined #osdev

13:34 Piraty has quit [Client Quit]

13:38 Piraty has joined #osdev

13:47 <vai> Writing FAT takes a long time, so I am using a 2 MB HD image :) Qemu.

13:49 heat has quit [Remote host closed the connection]

13:49 heat has joined #osdev

13:50 dutch has joined #osdev

13:58 Piraty has quit [Quit: -]

14:08 poyking16 has joined #osdev

14:22 Piraty has joined #osdev

14:25 terminalpusher has quit [Remote host closed the connection]

14:26 Vercas6 has quit [Ping timeout: 268 seconds]

14:30 Piraty has quit [Quit: -]

14:31 vai has quit [Ping timeout: 245 seconds]

14:31 Piraty has joined #osdev

14:49 Vercas6 has joined #osdev

14:49 Messier81 has joined #osdev

14:56 Messier81 has quit [Ping timeout: 245 seconds]

14:57 the_lanetly_052_ has quit [Ping timeout: 245 seconds]

15:01 carbonfiber has quit [Quit: Connection closed for inactivity]

15:14 gildasio has quit [Ping timeout: 268 seconds]

15:16 gildasio has joined #osdev

15:19 the_lanetly_052_ has joined #osdev

15:32 the_lanetly_052_ has quit [Ping timeout: 245 seconds]

15:37 Vercas6 has quit [Remote host closed the connection]

15:38 Vercas6 has joined #osdev

16:07 <heat> here's a weird gcc detail: it will not use cmpxchg16b for __atomic intrinsics but will for __sync

16:07 <heat> it's explicitly documented that way

16:07 <heat> why?

16:08 <zid> C++ abi has some weird peculiarities wrt primitives

16:08 <zid> but don't ask me what they are

16:11 <heat> what are they

16:11 * heat is a rebel

16:11 <psykose> i'm telling mom

16:28 gildasio has quit [Quit: WeeChat 3.6]

16:30 gildasio has joined #osdev

16:33 gildasio has quit [Client Quit]

16:40 <heat> how usable is the rpi 400?

16:40 <heat> for both desktop lunix and osdev

16:45 <psykose> it's an overclocked rpi4, i'd imagine it's fine

16:45 <heat> yes but erm, how usable are those

16:45 <heat> like is it actually decently powerful

16:45 <psykose> it depends on your expectations

16:45 <psykose> no, it fucking sucks

16:45 <psykose> i host a bunch of shit out of mine

16:46 <psykose> but it's good for what it is and what it cost

16:46 <heat> define "fucking sucks"

16:46 <psykose> for me to define that you'd have to define decently powerful

16:47 <heat> upper range is "gives you a decent desktop experience and with firefox", lower range is "can it compile things in a reasonable amount of time"

16:47 <psykose> i can bench a compile for ya if you want

16:47 <psykose> the former.. definitely not

16:48 <psykose> (though i haven't tried? maybe it's magic)

16:48 <j`ey> heat: my linux kernel build on rpi takes 30mins

16:48 <j`ey> for a cut down kernel

16:48 <heat> fucking what

16:48 <heat> oh

16:49 <psykose> yeah that sounds about right

16:49 <heat> that's horrible

16:49 <j`ey> and that's a specific kernel, with as much as possible turned off

16:49 <psykose> it's like a pentium4 or so lmao, isn't it

16:49 <heat> what's the gold standard for a relatively affordable arm64 machine then?

16:49 <psykose> there isn't one

16:49 <psykose> all the sbcs are in this range

16:50 <psykose> if you want to pay more into the larger category.. probably those clearfog boards

16:50 <heat> I don't need an sbc, just some sort of board or laptop even

16:50 <heat> well, I don't need one, I want one

16:50 <j`ey> m1? :P

16:51 <psykose> >affordable

16:51 <heat> hahahaha

16:51 <heat> you gave me an idea

16:51 <j`ey> im sure heat can afford that!

16:51 <heat> correct

16:51 <heat> otoh

16:51 <heat> it's a macbook

16:52 <j`ey> pinebook pro? but I dont think its much faster

16:53 <GeDaMo> https://www.theregister.com/2022/08/02/pinebook_pro_finally_starts_shipping/

16:53 <bslsk05> www.theregister.com: Open source laptop PineBook Pro is shipping again • The Register

16:54 <heat> j`ey, make arm64_defconfig sounds about right?

16:55 <heat> wait, is that even a thing?

16:56 <heat> looks like it

16:58 <heat> ok no jk

16:58 <heat> ah, it's make defconfig

16:58 <j`ey> yeah that ^

16:59 <heat> what's the kernel's target for arm64?

16:59 <heat> it's not bzImage

16:59 <Ermine> rpi 400 reminds of commodore 64

16:59 <j`ey> Image

17:00 <j`ey> heat: ^

17:00 <heat> yeah

17:00 <heat> because you geniuses got rid of compression

17:00 <heat> smh

17:01 <j`ey> let the bootloder do it

17:01 <Ermine> GeDaMo: UK keyboard version is out of stock already

17:03 <heat> oh well

17:03 <heat> i'm fucked

17:03 <heat> I was trying to compile linux on my phone

17:03 <heat> mission failed

17:04 <Ermine> oom?

17:04 <heat> no

17:04 <heat> there's some libc/kernel header fuckery going on inside termux

17:04 <heat> could I use an adb shell? shrug

17:05 <Ermine> and use stuff from termux?

17:05 <heat> no

17:05 <heat> and just compile it

17:05 <heat> like, can base AOSP compile linux

17:05 <Ermine> ah, you have root access

17:06 <heat> hm?

17:07 <Ermine> if you want to compile linux with adb shell, you need to install gcc, and you would need to have root rights for it

17:08 <heat> termux has a copy of LLVM

17:08 <heat> you do not need root to install software

17:09 <Ermine> ah

17:10 <psykose> writing files to the disk needs root

17:10 <heat> no it doesn't

17:10 <psykose> reading them back needs root too

17:10 <psykose> looking at your phone? that needs root as well

17:10 <Ermine> on android, there's a whole bunch of partitions

17:11 <heat> unlocking your phone needs root

17:11 <psykose> oi, you got a root license mate?

17:11 <Ermine> buying the phone needs root

17:11 <heat> ok the answer seems to be "no, it's not possible"

17:12 <psykose> see? you need root

17:12 <heat> however, I found out I have an ld installed

17:12 <heat> for some reason

17:12 <heat> MCLinker

17:12 <Ermine> cursed idea: subscription access to root

17:12 <heat> love the idea

17:14 <heat> ok i think i was just missing packages

17:20 <heat> ok no im kinda fucked

17:20 <Ermine> go to ur computer

17:21 <Ermine> typing is tiring af

17:21 <heat> im on my computer

17:21 <heat> i just wanted to benchmark it

17:21 <heat> theoretically the cpu should be pretty decent

17:45 <zid> heat how tall are you

17:50 <gog> meow

17:51 <gog> good evening friendos

17:51 <zid> I know gog is 290cm tall

17:51 <gog> yes

17:51 <gog> 293 after yoga

17:57 <psykose> lorge fish

17:58 poyking16 has quit [Ping timeout: 240 seconds]

18:00 poyking16 has joined #osdev

18:09 srjek has joined #osdev

18:11 srjek|home has joined #osdev

18:15 srjek has quit [Ping timeout: 244 seconds]

18:31 rorx has quit [Ping timeout: 244 seconds]

18:43 <geist> re: rpi400. think of it as like mini netbook class stuff. hard to give you a good frame of reference

18:43 <geist> but it's like say a very lowly clocked modern x86 with 4 cores and no SMT

18:46 <clever> SMT?

18:51 <geist> hyperthreading

18:51 <geist> SMT (simultaneous multithreading) is the generic term for that

18:51 <zid> HT is an intel trademark thingy yea

18:54 <Ermine> I'm thinking how to get pine64 tech in a country which they do not ship to

18:55 <psykose> i would guess you kinda just don't, unless you know someone that can bring one or has

18:56 <psykose> because if anyone actually did it with the intention of resale, it would be more expensive than something that already does ship

19:00 <Ermine> or travelling to a country which is on the list

19:00 <psykose> yea

19:02 poyking16 has quit [Ping timeout: 245 seconds]

19:20 <heat> zid, three football fields

19:21 <psykose> that's tiny

19:21 <heat> geist, yeah but if it takes 30 minutes to compile a very cut down linux kernel then it's super super underpowered I'd say

19:21 <heat> psykose, it gets big when I get hard, I swear

19:21 <psykose> prove it

19:21 <heat> 😳

19:22 chartreuse has joined #osdev

19:22 <heat> I thought the a72s were decent though :/

19:22 <psykose> nah

19:22 <psykose> i mean it's fine, you're not going to be building linux on it

19:23 <heat> I would like something capable of building my OS

19:23 <heat> especially under itself for the fun of it

19:23 <psykose> i can bench it for you if you want

19:23 <psykose> though mine is overclocked to 2ghz

19:24 <j`ey> heat: gcc takes an hour or so to build

19:24 <heat> yeah erm forget it lol

19:24 <heat> theoretically building my OS would require building a toolchain, building the base system (pretty quick), and then building packages (which builds a toolchain again, plus a bunch of shit)

19:25 <geist> heat: sure. but then that's exactly expected

19:26 <geist> i'm not really apologizing for it or whatnot, but think of it as a mid range desktop cpu from say 2010. runs on a few watts

19:26 <geist> downclocked 2010 cpu more

19:26 <heat> yeah sure

19:27 <geist> things like compiling linux on it are probably not a thing you should do much, or at least use ccache

19:27 <zid> everything post 2011 is trash, anything pre 2011 is antique, change my mind

19:27 <heat> where are you supposed to get a decent arm64 machine without buying a server?

19:27 <geist> OTOH one of the big downsides to the rpi line is storage is achingly slow, so you also need to factor that out of it. using MMC as a root is terrible

19:27 <geist> heat: again define decent

19:27 <heat> modern x86-like performance

19:27 <geist> pay for a VM

19:27 <geist> or get a mac

19:28 <heat> that's kinda depressing

19:28 zhiayang has quit [Quit: oof.]

19:28 <zid> arm's never had performance

19:28 <zid> it's just had cheap

19:28 <geist> such are many things in life

19:28 <heat> are the arm chromebooks that bad?

19:28 <psykose> adding usb3 storage helps

19:28 <geist> right. apple of course changed that completely, at least proved that you *can* get performance out of the arm arch

19:28 <geist> it's still just a matter of money and the desire to do so and building a team that does

19:28 <zid> yea m1 is really rad

19:28 <zid> whatever team they assembled knocked it out of the park

19:28 <zid> whoever was in charge really knew what they were talking about

19:29 <geist> in the sense that building a team to build intel/amd class desktop stuff only exists in a few places

19:29 <zid> and got all the bottleneck predictions correct

19:29 <geist> and so the ISA is really not the biggest part. it's the money/time/team/people power

19:29 <zid> hardest part of a big ass project like that is invariably going to be "not knowing what the issues you will run into will be, ahead of time"

19:29 zhiayang has joined #osdev

19:30 <geist> yes. and apple spent 10 years on that

19:30 <zid> even 10 years is amazing

19:30 <geist> you gotta build up the team, you can't just will it into being with money

19:30 <geist> right

19:30 <zid> they clearly struck gold on getting the right things earmarked as future issues

19:30 <geist> but not to turn things into a love fest, it's just that it *can* be done

19:31 <zid> you're not going to realize what you REALLY want is a super overpowered memory controller, just looking at the ISA or whatever

19:31 <geist> heat: but seriously if you want a non x86 desktop experience apple is it. you might be able to start getting used M1 stuff as folks upgrade to M2

19:31 <mjg> i thought non-x86 desktop experience is that nothing works

19:31 <zid> it sorta works

19:31 <mjg> in which case get yourself a talos box

19:31 <geist> and of course there are lots of arm server stuff now, though they're server stuff so you really wont have one physically, but you can get a slice

19:31 <zid> thanks to marcan

19:32 <geist> mjg: what do you mean 'nothing works'?

19:32 <geist> i take somewhat of an exception to that, lots of stuff works, just depends on precisely what you mean

19:32 <mjg> i see that was rather unclear. up until recently even apple was x86

19:32 <geist> like 'running x86 binaries directly?' probably not so well

19:32 <geist> mjg: sure except when it wasnt, etc

19:32 <mjg> so any non-x86 desktop was a bizzaro "pray it works"

19:33 <mjg> people running sparcstations, powerpc and other stuff

19:33 <geist> again i disagree with that. a rpi400 runs a pretty nice desktop, for example

19:33 <geist> it's just not super fast

19:33 <psykose> pretty much everything works in my experience, even web browsers etc had 99% of everything working just because phones share the same architecture

19:33 <mjg> it does?

19:33 <geist> sure it is

19:33 <psykose> the main thing people complain about is that they can't run random third party programs lol

19:33 <geist> lots of people put in a lot of time making stuff work fine on non x86

19:33 <mjg> i'm positively surprised, i stopped following rpis a long time ago

19:33 <geist> it's a bit disingenuous to just write it off as 'pray it works'

19:34 <mjg> well, as a counter point, i got 2 different arm boards at work and some x86

19:34 <geist> if it's open source it probably compiles fine on non x86 right now

19:34 <mjg> wtf errata is rampant on the former

19:34 <geist> what kinda errata?

19:34 <geist> like, 'dont turn it on it catches on fire' or 'the cpu has a bug that you need a workaround for?'

19:35 <mjg> for example the cpu might fuck up its own cache if you enable l2 prefetching

19:35 <geist> that sounds like a piece of shit soc. i can assure you that most of them are pretty good

19:35 <geist> but there are some vendors that do their own thing and fuck up stuff like caches *cough* broadcom

19:35 <mjg> not saying no to the first sentence :P

19:36 <mjg> i used to work for a company doing 100% embedded work, vast majority of it arm

19:36 <geist> also gotta remember there are a bazillion vendors and whatnot out there. it's part of the problem that ARM has been trying to standardize on

19:36 <geist> alas it takes a few bad apples to sour everything a bit

19:36 <mjg> the cpu bugs & doc bugs combined shortened lives by 10 years for each month of work

19:36 * geist nods

19:37 <mjg> fortunately i did not have to deal with that at the time

19:37 xenos1984 has quit [Read error: Connection reset by peer]

19:37 <geist> anyway, also depends a bit on the class of board, class of soc, how cheap they are, quality of the vendor, etc

19:37 <mjg> my fav bug reported by coworker: you had to write to a reg as little endian

19:37 <mjg> but you read it as big endian

19:37 <mjg> :s

19:37 <geist> well anyway, i wouldn't let that cloud your whole view of the ecosystem

19:37 <geist> shitty socs have very very little to do with the ARM ISA

19:37 <mjg> i like to rant man

19:38 <geist> or the quality of arm cores

19:38 <mjg> i have no opinion of arm itself, apart from not liking ll/sc

19:38 <geist> sure but also there are folks reading this that might not be commenting, but get a bad takeaway

19:38 <geist> so i like to make sure there's a bit of balance to the rants

19:38 <mjg> but that is already addressed with LSE, so...

19:38 <heat> henlo

19:38 <mjg> geist: fair

19:39 <geist> i'm kinda torn about the ll/sc style atomics. i first really hit it on PPC and it seems like a pretty good solution. or at least a pretty flexible solution

19:40 <geist> main downside is it's hard to fairly do it in a big.LITTLE situation and i think it has scaling issues

19:40 <geist> which is why arm puts the new atomics under 'LSE' which stands for 'large system extensions'

19:40 rorx has joined #osdev

19:41 <geist> in general the trend with a lot of these things in modern designs is to do with fewer, more powerful instructions. ie, the CISCy style view sort of wins in the end for certain things since it gives the cpu more ability to do what it needs to do

19:41 <zid> and also hardware accelerate things directly

19:41 * geist goes to mention some new arm thing but then wonders if it's public, so doesn't

19:41 <mjg> well the key point is that a more powerful instruction can be optimized in microcode

19:41 <zid> rather than having to *guess* what the high level operation being done is

19:42 <mjg> to give you an example

19:42 <mjg> say you atomic add in multiple cpus at the same time

19:42 <geist> mjg: yeah but instead of doing it across the board, can do it selectively. so atomics i think are a win win, even riscv uses single instruction atomics

19:42 <mjg> they can cooperate and agree on end value

19:42 <mjg> without every single one doing an explicit increment

19:42 <mjg> huge deal in numa

19:43 <geist> yah that's *probably* not what happens, but at least it keeps the cache line exclusive stuff to a more constrained situation

19:43 <mjg> it does happen

19:43 <geist> and yeah the ll/sc stuff is already a problem on big.LITTLE, since the big cores win the race every time

19:43 <geist> we actually have observed it in zircon already. spinlocks are 'won' by big cores by some large margin

19:43 <geist> much more so than the relative performance delta of the cores

19:44 <mjg> have you tried fair locks like mcs?

19:44 <geist> that's on the list of things to do, for this reason

19:44 <geist> part of the complexity is doing it with ll/sc style locks, etc

19:44 <zid> geist: That's just overlooked in general, that high clockrates etc give you better latency

19:44 <geist> since those tend to work better with atomics

19:44 <mjg> so you really got a big.LITTLE board with small cores not doing LSE?

19:45 <geist> yes

19:45 <mjg> now that sounds like cheap shit :-P

19:45 <zid> running a 2GHz cpu for half the time, vs a 1GHz cpu all of the time isn't *actually* the same thing in practice

19:45 <geist> mjg: yes.

19:45 <geist> basically all of the a53s, a57s, a72s, a73s out there, which are probably still the majority of A-class arm cores out there, are not LSE

19:46 <geist> it's a55+ and a75+ that went LSE (v8.1+)

19:46 <mjg> [re that atomic optimization, it's a known idea: bunch of cpus on 1 node "agree" to bump something by n on another socket and then one goes ahead to do the work, assuming there is contention]

19:46 <geist> and though that's been out for a while, there's lots of hardware out there that's pre that

19:46 <geist> mjg: fairly certain none of the arm designs do that

19:46 <mjg> i don't know what arm does

19:46 <geist> they do atomics based on cache line exclusivity guarantees, even in LSE stuff

19:47 <mjg> i do know arm suffers greatly when faced with cmpxchg loops et al

19:47 <geist> especially since it's far more than add/subtract that you can do with atomics, and frequently *do* with atomics

19:47 <geist> compare and swap is generally the primary thing you do

19:47 <mjg> (e.g., the graviton cpus on amazon cloud)

19:47 <geist> note FWIW those will be LSE for sure

19:47 <mjg> i know graviton is lse, it still sucks :p

19:47 <geist> neoverse-n1 and whatnot is a derivative of either a75 or a76 i forget

19:48 <geist> there *is* however a cmpxchg instruction in LSE at least, so if they suck i suspect it's in the way the mesh of cpus work

19:48 <geist> and how the caches are distributed

19:49 <mjg> i don't have numbers handy, but bottom line: there was a performance bug in freebsd where a certain rw lock would get taken/released A LOT with a cmpxchg loop

19:49 <geist> one of the caveats with LSE is they haven't removed the ll/sc stuff, so it has to coexist

19:49 <geist> so the underlying mechanism has to be the same

19:49 <mjg> 32 or whatever core amd64 suffered 10%-ish slowdown

19:50 <mjg> arm around 50%

19:50 <mjg> similar core count
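
For reference, the kind of cmpxchg loop mjg describes on a rw lock might look roughly like the sketch below. This is not the FreeBSD code; the state layout (`RW_WRITER` in the top bit, reader count in the low bits) and the function names are assumptions made for illustration, written with portable C11 atomics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout: top bit = a writer holds the lock, low bits = reader count. */
#define RW_WRITER 0x80000000u

/* Take a read lock with a compare-and-swap retry loop. */
static bool rw_try_read_lock(atomic_uint *state)
{
    unsigned old = atomic_load_explicit(state, memory_order_relaxed);
    do {
        if (old & RW_WRITER)
            return false;   /* a writer owns it; the caller has to back off */
        /* On failure the CAS refreshes `old`, so each retry touches the line again. */
    } while (!atomic_compare_exchange_weak_explicit(
                 state, &old, old + 1,
                 memory_order_acquire, memory_order_relaxed));
    return true;
}

/* Drop a read lock; a single fetch-and-sub, no loop needed. */
static void rw_read_unlock(atomic_uint *state)
{
    unsigned prev = atomic_fetch_sub_explicit(state, 1, memory_order_release);
    assert((prev & ~RW_WRITER) != 0);   /* must have held a read lock */
    (void)prev;
}
```

Under contention every failed compare-exchange is another round trip for the cache line, which is one plausible reason a hot loop like this scales differently across interconnects.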

19:50 * geist nods

19:50 <geist> yah would be interesting to see what the interconnect looked like on whatever core it is

19:50 <geist> also keep in mind there are multiple vendors building their own implementation. Ampere, Cavium, etc

19:50 <mjg> i googled around a little, apparently it was not just me :-P

19:50 <geist> they may have radically different response to stuff

19:51 <geist> which itself is a total PITA for kernel people

19:51 <geist> since you can't really optimize it for one thing, necessarily

19:51 <mjg> ye would be curious to bench realities of top-of-the-field cpus

19:51 <mjg> in this aspect

19:51 <geist> but yeah moving to ticket based spinlocks is on our list of things to do

19:52 <geist> there's a key i haven't figured out with ARM and ticket locks with and without LSE (which we have to deal with)

19:52 <geist> how to efficiently do it. with non LSE a plain spin based spin lock has a particular power optimization

19:52 <geist> that involves looping, then failing to acquire, then WFEing until someone releases it, and then spinning again

19:52 <mjg> ticket spinlocks is one type of fair locks

19:52 <geist> it's quite efficient, because you aren't actually spinning

19:52 <mjg> another one is queues, like mcs
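
A minimal ticket lock sketch in C11 atomics, to make the "fair lock" idea concrete. The type and function names are hypothetical, and the wait loop is a plain spin; it does not attempt the WFE power optimization geist is asking about.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical ticket lock: `next` hands out tickets, `owner` is the ticket being served. */
typedef struct {
    atomic_uint next;
    atomic_uint owner;
} ticket_lock_t;

static void ticket_lock_init(ticket_lock_t *l)
{
    atomic_init(&l->next, 0);
    atomic_init(&l->owner, 0);
}

/* Take a ticket, then wait until it is served. Returns the ticket number. */
static unsigned ticket_lock_acquire(ticket_lock_t *l)
{
    unsigned me = atomic_fetch_add_explicit(&l->next, 1, memory_order_relaxed);
    while (atomic_load_explicit(&l->owner, memory_order_acquire) != me) {
        /* plain spin; a kernel would put its cpu-relax hint here */
    }
    return me;
}

/* Serve the next ticket. Only the current holder may call this. */
static void ticket_lock_release(ticket_lock_t *l)
{
    unsigned cur = atomic_load_explicit(&l->owner, memory_order_relaxed);
    assert(atomic_load_explicit(&l->next, memory_order_relaxed) != cur); /* lock is held */
    atomic_store_explicit(&l->owner, cur + 1, memory_order_release);
}
```

Waiters are served strictly in arrival order, which is the fairness property; the cost is that every waiter polls the same `owner` word.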

19:53 <geist> but i dont understand precisely how to get the same optimization with the mcs or ticket locks. i'm sure it's doable, but the ARM ARM hasn't updated it

19:53 <geist> all their examples still show ll/sc locks

19:53 <mjg> if it is anything like "pause" from x86, mcs should come with it for free

19:53 <geist> that part of the arch is *extremely* tricky, and if you mess it up there are fairly dire power consequences

19:53 <geist> it's closer to mwait

19:53 <j`ey> mjg: WFE sounds like pause

19:53 <mjg> you queue yourself in and then chill waiting for your turn
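
The "queue yourself in and wait for your turn" scheme can be sketched as an MCS lock in C11 atomics. This is an illustrative version, not the concurrencykit implementation; `mcs_node_t`, `mcs_lock_t` and the function names are assumptions. Each waiter spins on a flag in its own node, which is what keeps the polling local to one cache line per cpu.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-waiter queue node; each waiter spins only on its own `locked` flag. */
typedef struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_bool locked;
} mcs_node_t;

typedef struct {
    _Atomic(mcs_node_t *) tail;   /* last waiter in the queue, or NULL when free */
} mcs_lock_t;

static void mcs_acquire(mcs_lock_t *lock, mcs_node_t *me)
{
    assert(me != NULL);
    atomic_store_explicit(&me->next, NULL, memory_order_relaxed);
    atomic_store_explicit(&me->locked, true, memory_order_relaxed);

    /* Swap ourselves in as the new tail; the old tail is our predecessor. */
    mcs_node_t *prev = atomic_exchange_explicit(&lock->tail, me, memory_order_acq_rel);
    if (prev == NULL)
        return;                   /* queue was empty: we own the lock */

    /* Link behind the predecessor, then wait on our own flag. */
    atomic_store_explicit(&prev->next, me, memory_order_release);
    while (atomic_load_explicit(&me->locked, memory_order_acquire)) {
        /* local spin; this is where a wait hint would go */
    }
}

static void mcs_release(mcs_lock_t *lock, mcs_node_t *me)
{
    mcs_node_t *succ = atomic_load_explicit(&me->next, memory_order_acquire);
    if (succ == NULL) {
        /* No visible successor: try to swing the tail back to empty. */
        mcs_node_t *expected = me;
        if (atomic_compare_exchange_strong_explicit(
                &lock->tail, &expected, NULL,
                memory_order_acq_rel, memory_order_relaxed))
            return;
        /* Someone swapped in behind us but has not linked yet; wait for the link. */
        while ((succ = atomic_load_explicit(&me->next, memory_order_acquire)) == NULL) {
        }
    }
    atomic_store_explicit(&succ->locked, false, memory_order_release);
}
```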

19:53 <j`ey> 'wait for event'

19:54 <geist> right, it's closer to monitor/mwait

19:54 <geist> but the exact sequence is very very tricky

19:54 <mjg> but maybe there is some arm-diff getting in the way

19:54 <mjg> i only know few tidbits here and there

19:55 <geist> for ll/sc it's straightforward: when you do the atomic grab of the lock you gain the exclusive, which you can then WFE on

19:55 <geist> when someone writes to it on another cpu it breaks the exclusive which is treated as an 'event' in the arm cores that are waiting on that cache line, which fall out of the WFE and then try the loop again
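
A rough sketch of the ll/sc plus WFE sequence geist describes, using GCC/Clang inline asm on arm64. This follows the commonly shown load-exclusive/wfe pattern and is not taken from zircon or any other kernel; the function names are made up, and the non-arm64 branch is only a host fallback so the sketch compiles elsewhere. Treat it as illustrative, since geist's point is that the exact sequence is easy to get wrong.

```c
#include <assert.h>
#include <stdint.h>

/* Acquire: 0 = free, 1 = held. */
static inline void wfe_spin_lock(uint32_t *lock)
{
#if defined(__aarch64__)
    uint32_t val, fail;
    __asm__ volatile(
        "   sevl\n"                    /* set the local event so the first wfe falls through */
        "1: wfe\n"                     /* sleep until an event, e.g. our monitor being cleared */
        "2: ldaxr   %w0, [%2]\n"       /* load-acquire exclusive: read the lock, arm the monitor */
        "   cbnz    %w0, 1b\n"         /* held: go wait for the owner's store to wake us */
        "   stxr    %w1, %w3, [%2]\n"  /* free: try to claim it */
        "   cbnz    %w1, 2b\n"         /* lost the exclusive: retry without sleeping */
        : "=&r"(val), "=&r"(fail)
        : "r"(lock), "r"(1u)
        : "memory");
#else
    /* Host fallback (assumption, not part of the arm sequence): plain test-and-set spin. */
    while (__atomic_exchange_n(lock, 1u, __ATOMIC_ACQUIRE) != 0) {
    }
#endif
}

static inline void wfe_spin_unlock(uint32_t *lock)
{
    assert(__atomic_load_n(lock, __ATOMIC_RELAXED) == 1u);
    /* Store-release; on arm64 this write is what clears the waiters' exclusive
     * monitors and so wakes them out of wfe. */
    __atomic_store_n(lock, 0u, __ATOMIC_RELEASE);
}
```

The waiter only re-runs the loop when its monitored line is written, so it is not burning power polling. Carrying that over to ticket or MCS locks means making sure each waiter has an exclusive load armed on the word it is waiting for.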

19:55 xenos1984 has joined #osdev

19:55 <mjg> fwiw concurrencykit has mcs locks (bsd or similar licensed, basically you can take it if you want)

19:55 <mjg> it is *plausible* it is doing trt on arm

19:56 <geist> oh i'm sure linux has solved it but i dont want to look at what linux does (please dont tell me)

19:56 <geist> but the BSDs in my experience are far behind on the ARM64 front

19:56 <mjg> concurrencykit is not linux

19:56 <geist> even freebsd is fairly behind every time i look at it

19:56 <mjg> bsd as in license

19:56 <geist> yes yes i know

19:56 <mjg> not tied to it

19:56 <mjg> ye freebsd is kind of crap, but i have wip patches to fix some of it

19:56 <geist> what i mean is the bsd kernels and arm64 are generally fairly behind state of the art

19:57 <mjg> fwiw i got the vfs layer to do faster path lookups than linux

19:57 <heat> freebsd is kind of crap - bsd dev

19:57 <heat> finally, we got em

19:57 <geist> and netbsd openbsd are fairly simple ports. not knocking it, but they just haven't had a lot of work on it

19:57 <mjg> heat: bro, i have been saying this for a decade

19:57 <geist> there are a few places where i think even zircon is ahead of freebsd in particular arch optimizations (though very few)

19:57 <mjg> oh?

19:57 <geist> but spinlocks, probably not

19:57 <mjg> well i can't vouch for arm ports

19:57 <geist> linux of course is the gold standard here, but it's also. well linux

19:58 <mjg> i can tell you some of the amd64 stuff is faster on freebsd than on linux

19:58 <geist> yes i'm 100% talking about ARM ports

19:58 <mjg> but mostly because they refuse to fix it

19:58 <mjg> as opposed to me doing anything special

19:58 <mjg> most notably memset and memcpy

19:58 <geist> yes 100% arm

19:58 <mjg> ye, ye, ok

19:58 <heat> memset and memcpy?

19:58 <heat> the userspace one?

19:58 <heat> ones*

19:58 <mjg> kernel

19:58 <mjg> linux has plain ERMS use, if supported

19:59 <mjg> that's terribly slow for small sizes

19:59 <mjg> i don't understand why they do this

19:59 FreeFull has joined #osdev

19:59 <geist> except for the *new* second ERMS bit that says 'use it anyway'

19:59 <geist> that may be a case where $vendor adds patch because they added the bit

19:59 <mjg> glibc, maintained by intel, is doing regular stores until it can do simd

19:59 <moon-child> doesn't musl do that too?

19:59 <mjg> it's literally intel disagreeing with itself

19:59 <geist> that's user space though. different constraints

20:00 <geist> can use simd, etc

20:00 <heat> moon-child, no

20:00 <moon-child> well, no, musl uses rep movs/stos regardless of erms :P

20:00 <mjg> geist: small sizes which *don't* do simd
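
To illustrate what "regular stores for small sizes" can mean, here is a sketch of a memset that handles short lengths with a couple of possibly overlapping stores and hands larger sizes to another path. It is not the glibc, FreeBSD or Linux code; `small_memset` and the 16-byte cutoff are assumptions, and the libc `memset` call stands in for whatever rep stos or SIMD path a real implementation would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical small-size memset: two overlapping stores cover any length in
 * [2^k, 2^(k+1)) without a byte loop. The fixed-size memcpy calls are the usual
 * portable way to express unaligned stores. */
static void *small_memset(void *dst, int c, size_t n)
{
    unsigned char *d = dst;
    uint64_t v = 0x0101010101010101ull * (unsigned char)c;  /* byte replicated 8 times */

    assert(dst != NULL || n == 0);
    if (n >= 16)
        return memset(dst, c, n);   /* stand-in for the rep stos / SIMD path */
    if (n >= 8) {
        memcpy(d, &v, 8);           /* first 8 bytes */
        memcpy(d + n - 8, &v, 8);   /* last 8 bytes, may overlap the first store */
        return dst;
    }
    if (n >= 4) {
        uint32_t w = (uint32_t)v;
        memcpy(d, &w, 4);
        memcpy(d + n - 4, &w, 4);
        return dst;
    }
    if (n >= 2) {
        uint16_t h = (uint16_t)v;
        memcpy(d, &h, 2);
        memcpy(d + n - 2, &h, 2);
        return dst;
    }
    if (n == 1)
        d[0] = (unsigned char)c;
    return dst;
}
```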

20:00 <heat> musl's stringops are pretty non-optimal

20:00 <moon-child> heat: that was meant as a put-down on musl

20:00 <geist> yes, just because a libc exists doesn't mean it's already optimized everything to the wall

20:00 <heat> moon-child, downvote

20:00 <heat> musl is largely like a 3 or 4 people effort

20:01 <mjg> i asked someone from intel about it, they told me to just use erms and be done with it

20:01 <mjg> ... except see above

20:01 <geist> thing is there's a second erms bit now right?

20:01 <mjg> fast short rep, ye

20:01 <geist> it says 'really actually use rep movsb'
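
The two bits under discussion live in CPUID leaf 7, subleaf 0: ERMS is EBX bit 9 and FSRM ("fast short rep mov") is EDX bit 4. A small sketch of checking them, assuming GCC or Clang's `<cpuid.h>`; the struct and function names are made up, and on non-x86 hosts the query just reports both as absent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

#define CPUID7_EBX_ERMS (1u << 9)   /* enhanced rep movsb/stosb */
#define CPUID7_EDX_FSRM (1u << 4)   /* fast short rep mov */

/* Hypothetical result type for this sketch. */
typedef struct {
    bool erms;
    bool fsrm;
} rep_features_t;

/* Decode the two feature bits from the raw leaf 7 registers. */
static rep_features_t decode_rep_features(uint32_t ebx, uint32_t edx)
{
    rep_features_t f;
    f.erms = (ebx & CPUID7_EBX_ERMS) != 0;
    f.fsrm = (edx & CPUID7_EDX_FSRM) != 0;
    return f;
}

/* Ask the running cpu; reports neither feature if leaf 7 is unavailable. */
static rep_features_t query_rep_features(void)
{
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
#if defined(__x86_64__) || defined(__i386__)
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        ebx = edx = 0;
#endif
    (void)eax;
    (void)ecx;
    return decode_rep_features(ebx, edx);
}
```

A memcpy/memset could use this once at boot to pick its small-size strategy, though as mjg says it is worth benchmarking before trusting the bit.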

20:01 <mjg> i have not benched it yet, i do find it suspicious so far though

20:01 <geist> trouble with intel is they're probably mostly tasked with making the newest hottest thing go fast

20:01 <geist> since old stuff they dont sell anymore doesn't make them money

20:01 <moon-child> there's also avx512

20:02 <moon-child> which aside from the power stuff is kinda cheating for memcpy/memset type tasks

20:02 sprock has quit [Ping timeout: 268 seconds]

20:02 <geist> even with zircon i have to fight a bit to get folks to care about stuff older than about 5 years

20:02 <mjg> btw zircon is just arm64 and amd64?

20:02 <geist> my usual get out of that argument card is 'yeah but folks run fuchsia on qemu with kvm on older machines' which will work for about a 10 year window

20:03 <geist> mjg: correct. there's a rv64 port floating around too

20:03 <geist> but not mainlined

20:03 <mjg> cool

20:03 <j`ey> what about m68k?

20:03 <mjg> bigger than 64

20:03 <j`ey> just copy those bits from LK, and im sure it'll work

20:03 <geist> if it's 64bit we can probably port to it

20:03 <mjg> motorola 68k?

20:03 <geist> and i know you're kidding, but we do draw the line at 64bits

20:04 <geist> which really simplifies things. gives you a fresh playing field

20:04 <geist> at the expense of i think the kernel would work fairly well in certain embedded situations that we just cant even try

20:05 <geist> i think ppc64 would be problematic though. there are probably some implicit little endian assumptions

20:05 <geist> though i can't think of any off the top of my head in the kernel itself

20:05 <geist> in user space, yeah i bet

20:09 <heat> in drivers

20:09 <heat> filesystems
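
The usual way drivers and filesystems avoid an implicit little-endian assumption is to assemble on-disk or on-wire fields byte by byte instead of casting a pointer to an integer type. A minimal sketch, with hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian field; works the same on LE and BE hosts,
 * and does not care about alignment. */
static inline uint32_t load_le32(const void *p)
{
    const uint8_t *b = p;
    assert(b != NULL);
    return (uint32_t)b[0] |
           ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) |
           ((uint32_t)b[3] << 24);
}

/* Write a 32-bit value as a little-endian field. */
static inline void store_le32(void *p, uint32_t v)
{
    uint8_t *b = p;
    assert(b != NULL);
    b[0] = (uint8_t)v;
    b[1] = (uint8_t)(v >> 8);
    b[2] = (uint8_t)(v >> 16);
    b[3] = (uint8_t)(v >> 24);
}
```

Code that instead does `*(uint32_t *)buf` on such a field is exactly the kind of hidden assumption that would surface on a big-endian ppc64 port.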

20:09 <mjg> ppc has even weaker real world memory ordering than arm

20:09 <mjg> as in the cpu is more likely to fuck you over if you mess up your barriers

20:10 <mjg> for this reason alone it is worth porting to

20:10 <heat> geist, do you want a new rv64 port? 👀

20:11 <moon-child> mjg, have to acquire hardware in order to do that

20:11 <mjg> see TALOS

20:11 <mjg> :)

20:12 <moon-child> are they still 8k a pop?

20:12 <moon-child> :P

20:12 <heat> brb need to reboot, kde is acting up

20:12 <heat> year of the linux desktop

20:12 heat has quit [Remote host closed the connection]

20:13 heat has joined #osdev

20:14 sprock has joined #osdev

20:17 <heat> ldp and stp are the most non-RISC RISC instructions I've seen

20:18 <geist> what do you mean?

20:18 <geist> you must not have seen ldm/stm!

20:19 <heat> i have not

20:19 <heat> that is arm32 right?

20:19 <geist> yes

20:19 <geist> a pretty standard: load/store this bitmap of registers in order style instruction

20:19 <geist> lots of arches had it

20:20 <geist> but it's a very non risc thing. ldp/stp is a step back from that
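
As a rough illustration of the "bitmap of registers" idea, here is a small C model of what an arm32 `ldmia` does: for each set bit in the register list, lowest register first, load one word from ascending addresses. This is a software model for explanation only; `ldmia_model` is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of arm32 LDMIA: regs[i] is loaded for every set bit i in reg_mask,
 * lowest-numbered register from the lowest address. Returns the address just
 * past the last word read, i.e. what base writeback would leave behind. */
static const uint32_t *ldmia_model(uint32_t regs[16], const uint32_t *base,
                                   uint16_t reg_mask)
{
    assert(regs != NULL && base != NULL);
    for (int i = 0; i < 16; i++) {
        if (reg_mask & (1u << i))
            regs[i] = *base++;
    }
    return base;
}
```

One instruction can therefore touch anywhere from 1 to 16 registers, whereas arm64's ldp/stp always move exactly two, which is the "step back" geist mentions.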

20:46 Vercas6 has quit [Remote host closed the connection]

20:48 Test_User has joined #osdev

20:48 Vercas6 has joined #osdev

20:49 \Test_User has quit [Ping timeout: 245 seconds]

20:58 carbonfiber has joined #osdev

21:04 \Test_User has joined #osdev

21:05 GeDaMo has quit [Quit: A program is just a bunch of functions in a trenchcoat.]

21:06 Test_User has quit [Ping timeout: 268 seconds]

21:11 <carbonfiber> I am trying to understand the timing diagrams on PDF-page 138 (internal-page 124) in the following ATA standard https://web.archive.org/web/20140722012229/http://www.t13.org/Documents/UploadedDocuments/project/d2008r7b-ATA-3.pdf

21:11 <bslsk05> web.archive.org: Wayback Machine

21:11 <carbonfiber> does anyone here know how to understand those timing diagrams or know of a guide that explain those type of timing diagrams in more depth?

21:12 <carbonfiber> i tried searching but am only able to find other types of timing diagrams.

21:20 <zid> 404 for me

21:20 <heat> after a brief look, that looks irrelevant

21:20 <zid> oh nevermind cus the link fucked up my end

21:20 <zid> I assume it was an edge trace on the hw level that nobody cares about?

21:20 <zid> unless they're making a controller chip

21:21 <heat> yea

21:21 <heat> the ATA spec (and other similar specs) have a lot of that

21:22 <heat> even the e1000 docs have a solid 2 or 3 chapters that are pretty irrelevant

21:22 <heat> (to us)

21:22 <zid> I found an e1000 doc that was 50 pages of electricals

21:22 <zid> because it was mainly the manual for how to put one onto a pci-e card

21:23 <zid> such that the LEDs worked properly

21:32 <mrvn> carbonfiber: those timing diagrams are relevant when you design hardware.

21:44 kkd has joined #osdev

21:57 <CompanionCube> geist: eh, ppc64le is a thing

22:04 <zid> humans are the way cat genes decided to colonize mars

22:07 <mrvn> zid: cats are clearly advanced tool users

22:18 <heat> I have just needed to input a MM/DD/YYYY date

22:18 <heat> my day is ruined

22:18 FreeFull has quit []

22:18 <zid> correct

22:28 <geist> CompanionCube: true. though iirc my G5 powermac (which i'd hack on) didn't support LE mode

22:28 <geist> i *think*

22:47 <zid> ARM processors are the way neanderthal genes try to ruin humanity

22:47 <geist> mkay

22:48 <heat> riscv64 is the superior architecture

22:48 <zid> found the gorilla

22:48 <geist> sounds like everyone is just trying to trigger each other

22:48 <heat> virgin arm arm with its 6000 pages vs chad riscv manual written in LaTeX with 200

22:49 <zid> that's because you're american geist

22:49 <geist> wow, really going for it today

22:49 <zid> come be a chill dutch person or whatever instead

22:49 <heat> or a sarcastic british person

22:49 <zid> what? it's a known fact that americans have no bants or sarcasm

22:52 <geist> mkay

23:17 socksonme_ has quit [Ping timeout: 252 seconds]

23:22 immibis has joined #osdev

23:26 [itchyjunk] has quit [Ping timeout: 255 seconds]

23:30 [itchyjunk] has joined #osdev

23:32 vdamewood has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]

23:43 <heat> geist, did you know there are builtins for isb and writing to system registers?

23:43 <heat> mind = moderately blown

23:44 <geist> yeah

23:44 <geist> i think they're fairly new

23:44 <geist> we generally switched zircon code to it

23:44 <geist> not sure precisely which toolchain they showed up in, so be a bit careful about it

23:45 <heat> 2014

23:45 <heat> the isb one

23:45 <heat> __builtin_arm_wsr64 seems newer?

23:45 <geist> yeah possibly

23:46 <geist> the isb or dsb one i can kinda take or leave. it's nice to use the builtin, but then it's also just as easy to have a inline asm thing

23:46 <geist> but yah, use it if it's avail

23:46 LittleFox has quit [Quit: ZNC 1.8.2+deb2+b1 - https://znc.in]

23:46 LittleFox has joined #osdev

23:47 <heat> msr and mrs seem easy as well?

23:47 <heat> I don't see the point in this except a e s t h e t i c s

23:47 <geist> yah

23:47 <geist> though if you didn't have to type them in in the first place that's nice

23:47 <geist> so if you have new code it makes sense to use em
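
A sketch of the two styles side by side, assuming Clang's `__builtin_arm_isb`, `__builtin_arm_rsr64` and `__builtin_arm_wsr64` on arm64, with inline asm for other arm64 compilers. The wrapper names are hypothetical, `tpidr_el0` is picked only because it is accessible from EL0, and the last branch is a host stub so the sketch builds on non-arm machines.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__clang__)
/* Clang builtins: no inline asm needed. */
static inline void cpu_isb(void) { __builtin_arm_isb(0xF); }   /* 0xF = SY */
static inline uint64_t read_tpidr_el0(void) { return __builtin_arm_rsr64("tpidr_el0"); }
static inline void write_tpidr_el0(uint64_t v) { __builtin_arm_wsr64("tpidr_el0", v); }
#elif defined(__aarch64__)
/* Equivalent inline asm for toolchains without those builtins. */
static inline void cpu_isb(void) { __asm__ volatile("isb" ::: "memory"); }
static inline uint64_t read_tpidr_el0(void)
{
    uint64_t v;
    __asm__ volatile("mrs %0, tpidr_el0" : "=r"(v));
    return v;
}
static inline void write_tpidr_el0(uint64_t v)
{
    __asm__ volatile("msr tpidr_el0, %0" :: "r"(v));
}
#else
/* Host stub (assumption for this sketch): a plain variable plays the register. */
static uint64_t fake_tpidr_el0;
static inline void cpu_isb(void) { __asm__ volatile("" ::: "memory"); }
static inline uint64_t read_tpidr_el0(void) { return fake_tpidr_el0; }
static inline void write_tpidr_el0(uint64_t v) { fake_tpidr_el0 = v; }
#endif
```

Either branch should end up as the same mrs/msr/isb instructions, so as heat says the difference is mostly aesthetic; the builtin version just saves writing the asm wrappers.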

23:52 netbsduser` has quit [Quit: Leaving]

23:55 <heat> geist, you have a zen 3 right?

23:55 <heat> do you know if you have pcid, tlbsync, invlpgb?

23:56 <heat> i think they're there but I'm not sure, and I'm not sure if they're there on desktops

23:56 <heat> at least I haven't seen anyone use them yet