<uplime>
oh ok, that's actually good news because I did a lot of work to prepare it
<doug16k>
you will end up much happier if you go through that
<uplime>
sweet
<doug16k>
and you won't be asking us about weird stuff that doesn't happen when you have a proper toolchain
<doug16k>
distros patch the hell out of it and totally force security stuff whether you like it or not, and it will screw you right up, sooner or later
<doug16k>
my system compiler isn't even gcc anymore, it's ubuntucc
<uplime>
yeah that makes sense
<uplime>
my system compiler is actually clang so I had to install a system gcc to begin with
<geist>
doug16k: yeah on top of that, last i checked the pl011 emulation in qemu isn't even the 'pretend there's a fifo and back pressure' type
<geist>
it simply synchronously dumps the char in a qemu internal buffer
<geist>
so the tx fifo appears always completely drained
<geist>
internally to qemu there's a 'simple synchronous, blocking' serial driver api and a much more complicated one that has notions of internal fifos that are filled so proper hardware flow control can appear to work, etc
<geist>
anyway TL;DR you can just jam chars out of the pl011 in qemu like there's no tomorrow
Sos has quit [Quit: Leaving]
<doug16k>
uplime, geist has a script that produces a very good toolchain for starting out with kernel dev in a snap: https://github.com/travisg/toolchains
<bslsk05>
travisg/toolchains - Shell script to build gcc for various architectures (30 forks/49 stargazers/MIT)
<geist>
and some prebuilts at newos.org/toolchains if you happen to be using linux or mac or freebsd
<bslsk05>
newos.org: Index of /toolchains
<uplime>
I actually am using mac
<doug16k>
geist, I was hoping so
<uplime>
thanks guys
<geist>
some prebuilts for mac then, up through 7.5.0. i haven't been able to build newer ones yet for Reasons
<bslsk05>
github.com: qemu/pl011.c at master · qemu/qemu · GitHub
<kazinsal>
note to self: just ssh to my build server instead of trying to build a modern gcc on my new laptop when it arrives
<doug16k>
geist, yeah, I saw that trace_pl011 and stopped there and reran my thing with that on :)
Arthuria has quit [Ping timeout: 264 seconds]
<doug16k>
yeah, just throws it at the fd
<doug16k>
at the chardev and subsequently at the fd I guess
<geist>
saw it when debugging some guest pauses on some test code while running fuchsia test cases
<geist>
basically if the serial pipe isn't drained fast enough it'll eventually start blocking the guest
<doug16k>
ah you mean it stalls you so you can write all the GB/s?
<doug16k>
so it practically *is* isa-debugcon
<doug16k>
that is the nice bit of isa-debugcon
<geist>
pretty much. i've seen more sophisticated serial device implementations in qemu that properly emulate a TX fifo being full, etc
qookie has quit [Ping timeout: 265 seconds]
<geist>
but in this case there's no actual bother pretending there is a TX fifo, even though a real pl011 is sufficiently complex to have a reasonable implementation of it
<geist>
but the nice thing is for debugging it's a single store and done. no looping to wait for the fifo to drain
<bslsk05>
github.com: qemu-rom/pci_arch.cc at master · doug65536/qemu-rom · GitHub
Oli has quit [Ping timeout: 252 seconds]
<doug16k>
one line driver. that's a new one
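A minimal sketch of the "single store and done" debug output described above, assuming `base` points at a mapped PL011 register window. The function name is hypothetical, and the base address mentioned in the comment is an assumption about QEMU's aarch64 virt machine that should be checked against the device tree.

```c
#include <stdint.h>

/* Offset of the PL011 data register (UARTDR) from the UART base. */
#define PL011_UARTDR 0x00u

/*
 * Assumption: `base` is the mapped PL011 window. On QEMU's aarch64
 * "virt" machine this is commonly physical 0x09000000, but verify it
 * for your machine instead of trusting this comment.
 */
static inline void pl011_putc_unchecked(volatile uint32_t *base, char c)
{
    /* One MMIO store; deliberately no wait on the TX-FIFO-full flag. */
    base[PL011_UARTDR / 4] = (uint8_t)c;
}
```

This only makes sense where the emulated FIFO always looks drained, as discussed for QEMU. On real hardware a driver would normally check the TX-FIFO-full flag in the flag register before each store.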
isaacwoods has quit [Read error: Connection reset by peer]
<geist>
yah, sadly unless you globally burn a register (like x18) that always holds a pointer to the mmio window for it you need two instructions and a trashed register to be able to write to it from *any* context
<geist>
especially when debugging deep asm where you want to make a macro and move it around
<geist>
i've done DEBUG('a') DEBUG('b'), etc a bazillion times
<geist>
on that topic: gcc has very rich support for -ffixed-regN on arm64, so you can generally speaking pick one to globally burn if you want
sprock has quit [Ping timeout: 252 seconds]
<geist>
suggest x18 first, since it's architecturally set aside for your use, but after that i'd start with the callee saved ones (x19+) since the abi will already properly save them when making firmware calls, etc
<doug16k>
the isa-debugcon originates in bochs, where the serial port is emulated so realistically, it limits you to the actual baud rate and lets you overrun and everything
<doug16k>
if you make an infinite bandwidth serial port, isa-debugcon becomes obsolete
<doug16k>
unlimited I should say
<doug16k>
I actually added a feature in my bochs fork, you can set "turbo" to true on each serial port, and it makes it just go as fast as it can
<doug16k>
transmitter holding register is empty forever
<doug16k>
when you receive a character, the next one is ready to receive before it starts the next instruction :P
Oli has joined #osdev
<doug16k>
turbo as in galaxy quest. it holds down the turbo
<doug16k>
if I did msr daifset, #0xf then wfi - is that stop forever?
<geist>
yah funny, actually another 'bug' we had in the serial driver that actually showed up under load is making it loop while sending a lot of data if the fifo still has bits in it
Arthuria has joined #osdev
<geist>
seems like it makes sense right? while (fifo_not_empty) shove_char;
<doug16k>
I expect that to send one at a time
<doug16k>
wait
<geist>
er fifo_not_full
<doug16k>
ah
<geist>
problem is in a bottomless serial port it may sit there and vmexit in a loop for kinda an unbounded amount of time if you're generating lots of data
iorem has joined #osdev
<geist>
if you do it with interrupts off, etc you can end up with weird stalls
<clever>
when i was merging the pl011 drivers in LK, i found something interesting/useful, on the rx direction
<doug16k>
shove_char is thousands of instructions or way worse?
<geist>
Real Hardware you'd fill up the fifo pretty fast and then you're off into TX interrupt land to tell you when you can shove more in, maybe a millisecond later
<geist>
shove_char is *one* instruction, but vmexits
<doug16k>
because of vmexit
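One way to avoid the unbounded `while (fifo_not_full) shove_char;` loop on a bottomless emulated port is to cap how many characters a single pass may send. The sketch below is an illustration under that assumption, not how any particular kernel does it; `struct uart_ops`, `uart_tx_some`, and the two fake-port helpers are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical UART abstraction; names are illustrative only. */
struct uart_ops {
    bool (*tx_full)(void *ctx);
    void (*put)(void *ctx, char c);
    void *ctx;
};

/*
 * Push at most `budget` characters while the FIFO has room.
 * Returns how many were sent so the caller can queue the rest
 * and continue later (for example from a TX interrupt).
 */
static size_t uart_tx_some(const struct uart_ops *u, const char *buf,
                           size_t len, size_t budget)
{
    size_t sent = 0;
    while (sent < len && sent < budget && !u->tx_full(u->ctx)) {
        u->put(u->ctx, buf[sent]);
        sent++;
    }
    return sent;
}

/* A fake "bottomless" port: it never reports a full FIFO. */
static bool never_full(void *ctx) { (void)ctx; return false; }
static void count_put(void *ctx, char c) { (void)c; ++*(size_t *)ctx; }
```

With the fake port, a call with a budget of 16 returns after 16 characters even though the FIFO never fills, which is the property the plain loop lacks.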
<clever>
when the LK fifo is getting full, it turns off the rx interrupts
<clever>
the "hardware fifo" in the uart then fills up
<clever>
and then qemu can apply back-pressure to the fifo feeding it
<geist>
clever: RX is a different kettle of fish
<geist>
TX is where things get iffy. but actually RX has a similar problem: if the host has say just sent you a few thousand bytes for some reason
<geist>
and your RX interrupt logic has one of those while loops in it
<doug16k>
geist, it just blows my mind that there isn't a device where you do a store and it tells it the physaddr of a descriptor to just send a {base,length} range of physical address space
<geist>
it can end up sitting there for an unusually long time with irqs off
jaevanko has joined #osdev
<doug16k>
an mmio store*
jaevanko has quit [Client Quit]
<geist>
clever: but that's the point, not all qemu serial devices are made equal
<geist>
some of them aren't smart enough to really properly emulate a fifo
valerius_ has joined #osdev
<doug16k>
I keep thinking I should make a simple-uart device which is one register, where you write to the register the physaddr of a {physaddr_t base, physaddr_t length} pair in memory. it sends that range
<geist>
virtio-serial i guess has some of that
<doug16k>
yeah but you have to jump through hoops enumerating it
<geist>
oh yeah also... i just saw a new one like that. in qemu
<geist>
hang on.
mctpyt has quit [Ping timeout: 265 seconds]
<doug16k>
fw_cfg has a "DMA register" which roughly does what I described. not as simple - that one deals with a file id and offset
<bslsk05>
github.com: qemu/goldfish_tty.c at master · qemu/qemu · GitHub
<geist>
it's used in the new 68k based 'virt' machine, added like this year
sprock has joined #osdev
<geist>
normally it's just a e9 style thing: shove char down the first register (REG_PUT_CHAR)
<geist>
but check the REG_DATA_PTR, REG_DATA_LEN registers: you basically load up a pointer and a length, and then one of the commands you shove down the REG_CMD register is CMD_WRITE_BUFFER
<bslsk05>
github.com: m68k/uart.c at master · travisg/m68k · GitHub
<geist>
it seems to do what it says
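A rough sketch of the pointer-plus-length write path just described for the goldfish tty. The register offsets and the command value are assumptions believed to match QEMU's goldfish_tty.c and should be verified against the source; the `GF_` names and `gf_write_buffer` are made up for this example.

```c
#include <stdint.h>

/*
 * Assumed register offsets and command value; check them against
 * goldfish_tty.c in your QEMU version before relying on them.
 */
enum {
    GF_REG_PUT_CHAR = 0x00,
    GF_REG_CMD      = 0x08,
    GF_REG_DATA_PTR = 0x10,
    GF_REG_DATA_LEN = 0x14,
};
enum { GF_CMD_WRITE_BUFFER = 2 };

/* Ask the device to send `len` bytes starting at physical `buf_phys`. */
static void gf_write_buffer(volatile uint32_t *regs,
                            uint32_t buf_phys, uint32_t len)
{
    regs[GF_REG_DATA_PTR / 4] = buf_phys;
    regs[GF_REG_DATA_LEN / 4] = len;
    /* The command store is what triggers the transfer. */
    regs[GF_REG_CMD / 4] = GF_CMD_WRITE_BUFFER;
}
```

The pointer is a 32-bit physical address here, which suits the 68k virt machine mentioned above; a 64-bit guest would need the high half handled as well.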
<radens>
What's the best way to cite a section or table of the intel manual in code comments? It looks like they occasionally move sections into appendices or insert new chapters in the middle?
<doug16k>
going to take a stab at riscv64 target next in my rom project
<doug16k>
radens, just include the heading with the section number as checksum
<doug16k>
3.4.5 How to not screw up your sysretq
<doug16k>
oh and always say volume number
<doug16k>
vol 3 3.4.5 whatever...
<doug16k>
if they aren't used to it they might be lost in wrong volume
<doug16k>
(you make sure rcx is canonical is how you don't screw up sysretq on intel's broken implementation)
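A small sketch combining both points: the citation style suggested above (volume, section number, heading text as a checksum) and the canonical-address check for the sysretq case. The citation line uses placeholders on purpose, and `is_canonical48` is a hypothetical helper that assumes 48-bit virtual addresses (4-level paging).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Citation style from the discussion: volume, section number, and the
 * heading text, so the heading still identifies the section if the
 * numbering moves:
 *
 *   Intel SDM vol <N>, <section number> "<section heading>"
 *
 * Placeholders on purpose; fill them in from the edition you read.
 */

/*
 * True if `va` is canonical for 48-bit virtual addresses, that is,
 * bits 63..47 are all copies of bit 47. Assumes 4-level paging.
 */
static inline bool is_canonical48(uint64_t va)
{
    uint64_t top = va >> 47;   /* the 17 bits 63..47 */
    return top == 0 || top == 0x1FFFF;
}
```

A kernel would run such a check on the user return address before taking the sysretq path and fall back to a different return path if it fails.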
<radens>
I've usually just been leaving the volume off because most of the time it's in volume 3, but I just cited something in volume 2 and am now updating a few comments.
<doug16k>
as usual, amd did it correctly
<radens>
amd does it right except when they decide to extend the architecture and not throw it away
<radens>
oh look another extension which is almost but not quite the same between the two vendors
nyah has quit [Quit: leaving]
<doug16k>
avx-512 is coming soon on zen4 they say
<doug16k>
avx-512 is quite good
<doug16k>
compiler can do brilliant autovectorization tricks
<geist>
interesting. AMD was kinda badmouthing it a bit
<doug16k>
that's what the ads say anyway :D
<geist>
but then that's just corpspeak
<doug16k>
it does have limited applicability
<doug16k>
256 is huge already
<doug16k>
if you want the autovectorizer to do more than nothing, and your stuff is loops of vectorizable stuff, it's great
<geist>
to me it's kinda like intel insisting on burning 30% of the die with GPU stuff i dont want: i dont want the avx512, plz use the space for something else
<doug16k>
without those mask registers, the compiler looks at what it has to do to autovectorize something, and half way into it, says "yeah right", and can't, because too many edge cases or alignments
<doug16k>
if it does make it to the end and decide to vectorize, someone is going to see it and go "oh my, look how much code it generated!"
<doug16k>
unless it is utterly trivially perfectly aligned and vectorizable
<doug16k>
with the mask registers it can make one loop do the beginning, middle part, and end
<doug16k>
it just does fancy things with mask at start and end for weird starting and ending places
<clever>
with how masking in the VPU works, i would need 4 32bit stores, a vector load, and a vector compare, to setup the masks, thats getting to the point that it may out-do the cost of an 8-wide vector
<doug16k>
(when it starts or ends in the middle of an aligned vector)
<clever>
so for certain things under the lane-width, it would be better to use scalar for the mis-aligned start/end
<doug16k>
yeah I expect anything getting serious with vectors will have some masking capability
<clever>
the masking i have in mind, is more of abusing the per-lane conditional execution
<bslsk05>
gamozolabs.github.io: Vectorized Emulation: Hardware accelerated taint tracking at 2 trillion instructions per second | Gamozo Labs Blog
<doug16k>
yeah exactly - mask means per lane enable
<doug16k>
if not enabled and it stores, no store
<doug16k>
for that lane's word
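To make the per-lane enable idea concrete, here is a scalar C stand-in for a masked vector loop: one loop covers the full chunks and the ragged tail, with the mask deciding which lanes are loaded and stored. It only handles a ragged end, not a misaligned start. `LANES`, `masked_store`, and `add_masked` are illustrative names, not a real intrinsic API.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 8  /* illustrative vector width */

/* Scalar stand-in for a masked vector store: lane i is written only
 * if bit i of `mask` is set. Disabled lanes leave memory untouched. */
static void masked_store(int32_t *dst, const int32_t *v, uint8_t mask)
{
    for (int i = 0; i < LANES; i++)
        if (mask & (1u << i))
            dst[i] = v[i];
}

/* dst[i] = a[i] + b[i] for i in [0, n), one loop for body and tail. */
static void add_masked(int32_t *dst, const int32_t *a,
                       const int32_t *b, size_t n)
{
    for (size_t base = 0; base < n; base += LANES) {
        size_t left = n - base;
        /* All lanes on for full chunks, low `left` lanes on for the tail. */
        uint8_t mask = left >= LANES ? 0xFF : (uint8_t)((1u << left) - 1);
        int32_t v[LANES] = {0};
        for (int i = 0; i < LANES; i++)
            if (mask & (1u << i))      /* masked load and add */
                v[i] = a[base + i] + b[base + i];
        masked_store(dst + base, v, mask);
    }
}
```

With n = 11 the second pass runs with only three lanes enabled, so nothing past the end of the arrays is read or written.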
<clever>
i dont know if the masking i have access to, applies to load/store
<doug16k>
that's where you really benefit for autovectorizing
<doug16k>
for just the execution of if else, sure, what you said is great
<clever>
radens: you just gave me a crazy idea, implement an emulator, using exclusively vector opcodes..., but load/store might be tricky
<clever>
as-in, emulate $lanes cores at once
<radens>
That's what the dude did
<radens>
He streams on twitch sometimes
<clever>
aha!
<radens>
Received an award at blackhat for it
<clever>
i was guessing based on the title alone, before reading
<radens>
You can find the source on github somewhere
<radens>
anyway, not to dissuade you, there's probably more to explore
<doug16k>
when you vectorize if else, it executes both, with the mask set at if start, and inverted at else
<doug16k>
if it's a lot you can skip over a block of stuff if mask is zeros
<doug16k>
jmp to the else
<clever>
doug16k: hmmm, now that youve said that, i can see how emulating 16 cores, each doing a diff opcode, could be mighty complex
<doug16k>
na, it's just an AND gate for the register write
<clever>
doug16k: for the QPU, there is a special all/some/none flag for all conditional things
<doug16k>
mux in previous value or new value, mux controlled by mask
<clever>
so you can skip an else{} clause, if all of the lanes are in the right state
<doug16k>
in each lane
<doug16k>
yeah exactly
<doug16k>
that is what every gpu needs to not be stupid
<clever>
and thats likely why the QPU has it, since it runs shaders
<doug16k>
without it, you execute an awful lot more nops
<clever>
but the VPU seems more like a generic DSP, not a GPU
<doug16k>
if all lanes are false on a vectorized if, you nop the entire true arm
Arthuria has quit [Ping timeout: 245 seconds]
<doug16k>
but with good mask tricks you can jump over true on zero mask
<doug16k>
you land at else and invert mask
<doug16k>
or land at endif and forget everything
<doug16k>
done if
<doug16k>
after the else invert it could jmp_if_mask_is_zero to the endif
<doug16k>
if some were on and some were off it'd execute both true and else case (as intended)
<doug16k>
you would have some threshold to know when it is enough stuff to be worth the jump
<doug16k>
if they are small bodies then it would just let both run
<doug16k>
in compiler I mean
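The vectorized if/else described above can be sketched in scalar C: build a mask from the condition, run the true arm under the mask, invert the mask, run the else arm, and skip an arm entirely when its mask is all zeros. This is only an illustration of the control flow; `predicated_if_else` and its return value (the number of arms executed) are inventions for this example.

```c
#include <stdint.h>

#define LANES 8

/* Per lane: out = (x < 0) ? -x : x * 2, done the "vector" way.
 * Returns how many arms actually ran (1 or 2), to show the skip. */
static int predicated_if_else(int32_t out[LANES], const int32_t x[LANES])
{
    uint8_t mask = 0;
    int arms_run = 0;

    for (int i = 0; i < LANES; i++)       /* vector compare -> mask */
        if (x[i] < 0)
            mask |= (uint8_t)(1u << i);

    if (mask != 0) {                      /* jump over true arm on zero mask */
        for (int i = 0; i < LANES; i++)
            if (mask & (1u << i))
                out[i] = -x[i];
        arms_run++;
    }

    mask = (uint8_t)~mask;                /* land at else: invert mask */
    if (mask != 0) {                      /* jump to endif on zero mask */
        for (int i = 0; i < LANES; i++)
            if (mask & (1u << i))
                out[i] = x[i] * 2;
        arms_run++;
    }
    return arms_run;
}
```

When every lane agrees, only one arm runs; with mixed lanes both run, each writing only its enabled lanes.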
<doug16k>
on big giant cpu, probably at least 20 cycles worth of stuff, so maybe 60 instructions ballpark. on actual gpu, fewer
Oli has quit [Quit: leaving]
<clever>
thats part of why i'm trying to profile every opcode down to the clock cycle
<clever>
so i could then add those up, and know how a given piece of code compares to another
<doug16k>
if you are pretty sure the branch mostly goes one way, branch is faster on giant cpu. probably not nearly as smart predictor on real gpu
<kingoffrance>
IOW "branchless" may not help so much with modern predictors?
<doug16k>
faster on big cpu with small body going same way every time, I mean
<doug16k>
kingoffrance, it helps drastically for things like binary tree comparisons, where it is hopelessly impossible to predict the branch and it's 50% mispredict
<doug16k>
other than that, yeah, modern predictor is so amazing, assume it handles your stuff
<kingoffrance>
i just remember "hackers delight" is a case where ive seen it shown, but even then it was "for some cpus..."
<kingoffrance>
thats kind of an old book now
<doug16k>
it's all about whether a branch mostly is taken or not
<doug16k>
when in the middle, it's bad
<doug16k>
it's like power dissipation of a transistor. all the way on and off, no power dissipation. half on: worst case
<doug16k>
if it is taken half the time, it is hopeless
<doug16k>
it will mispredict approximately every other time
<moon-child>
only if it's random. If there's a pattern it may figure it out
<doug16k>
right
<doug16k>
up to a certain length
<doug16k>
if you don't confuse things with aliasing (it can't perfectly remember the addresses)
<clever>
something i saw on youtube, is that a modern cpu, will build a branch prediction table, based on the source of the branch
<clever>
and a normal switch-case, with one common branch at the top, is poor performing
<clever>
the trick in that video, was to replace every `break;` with a duplicate indexed jump, for cases like while(true) { switch
<clever>
so, the cpu is now building a separate prediction table, based on a pair of case blocks
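A minimal sketch of that trick, assuming GCC or Clang: instead of `break;` back to one shared `switch`, each handler ends with its own indexed jump, so each jump site gets its own prediction history. Labels-as-values is a GNU C extension, and the tiny opcode set here is made up for illustration. The bytecode is assumed to be well formed and to end with `OP_HALT`.

```c
#include <stdint.h>

enum { OP_INC, OP_DEC, OP_DOUBLE, OP_HALT };

/* Runs a tiny accumulator bytecode and returns the final value. */
static int run(const uint8_t *code)
{
    /* Labels-as-values is a GNU C extension (GCC and Clang). */
    static void *const table[] = {
        &&do_inc, &&do_dec, &&do_double, &&do_halt
    };
    int acc = 0;

/* Each use expands to a separate indirect jump instruction. */
#define DISPATCH() goto *table[*code++]
    DISPATCH();

do_inc:    acc += 1; DISPATCH();
do_dec:    acc -= 1; DISPATCH();
do_double: acc *= 2; DISPATCH();
do_halt:   return acc;
#undef DISPATCH
}
```

Whether this beats a plain `switch` depends on the CPU and compiler, so it is worth measuring rather than assuming.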
<doug16k>
my avl tree comparison has a specialization for T=pair<uintptr_t,uintptr_t> - it converts them to 128 bit integers and returns lhs128<rhs128. codegen ends up doing an inlined sub,sbb and using carry to know if it is less, does setcc then neg then merges the true and false pointer. it runs at ludicrous speed. two instruction comparison then mask merge
<doug16k>
it becomes 100% correctly predicted because the only involved conditional branch is the depth loop, which stays the same for long periods of time, and is so few that it can know the whole pattern
<doug16k>
oh and the not taken until found branch
<doug16k>
that's the only one with some cost to it
<doug16k>
once per lookup or insert
<doug16k>
assuming you are hammering it
<doug16k>
er, supposed to be taken once per lookup or insert
<doug16k>
I checked. I saw a ton of mispredicts with my hardware perf counter module that gives me a "perf top" of my kernel. I fixed the compare and it disappeared and tree perf went through the roof
<doug16k>
other than that pathologically difficult branch, yeah, predictor is pretty amazing
<doug16k>
the one that would be there if you didn't make it branch free
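The branch-free compare and pointer merge described above might look roughly like this. It is a sketch, not the actual tree code: it uses `uint64_t` fields in place of `pair<uintptr_t,uintptr_t>`, relies on `unsigned __int128` (a GCC/Clang extension on 64-bit targets), and the function names are hypothetical. Whether the compiler emits the sub/sbb sequence mentioned is up to the compiler.

```c
#include <stdint.h>

/* Lexicographic (first, second) compare by widening each pair to one
 * 128-bit integer. */
static inline int pair_less(uint64_t a_first, uint64_t a_second,
                            uint64_t b_first, uint64_t b_second)
{
    unsigned __int128 a = ((unsigned __int128)a_first << 64) | a_second;
    unsigned __int128 b = ((unsigned __int128)b_first << 64) | b_second;
    return a < b;
}

/* Branch-free select: if_true when cond is 1, if_false when cond is 0. */
static inline void *select_ptr(int cond, void *if_true, void *if_false)
{
    uintptr_t mask = (uintptr_t)0 - (uintptr_t)cond;  /* all ones or zero */
    return (void *)(((uintptr_t)if_true & mask) |
                    ((uintptr_t)if_false & ~mask));
}
```

In a tree descent the result of `pair_less` would feed `select_ptr` to pick the left or right child without a conditional branch on the comparison.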
<doug16k>
committed mmap will acquire a lock, lower bound an index of free virtual ranges, by size then address, best-lowest-fit the allocation, find the corresponding index entry in the by-physical index, check for adjacent allocated block and coalesce with it, or edit out the underlying free range, update both indexes, create page tables if needed, do physical allocation, populate PTEs . then unlock the lock
<doug16k>
it does that in less time than one NOP instruction on my vic-20
<doug16k>
when warmed up a bit of course
<doug16k>
probably over 10,000 instructions per vic-20 nop nowadays
<doug16k>
remember when 2 microseconds was a short period of time? ages now
<clever>
i was recently asking in a c64 channel, about how i could put the rpi sprite hw to use
<clever>
and the basic answer is, you cant
<clever>
because c64 games, often race the crt beam, and edit things in the middle of a scanline
<clever>
so you need to emulate the cpu and "gpu" in lock-step
<doug16k>
if you introduce a bit of latency it's trivial
<doug16k>
I doubt people want latency. whole point of classic hardware like that is latency not being a thing
<clever>
my rough plan, was to just translate the c64 sprites into rpi sprites
<doug16k>
like when mario flashes when you take damage. it is utterly deterministic when it turns him on and off every frame
<clever>
the problem, is that the game expects to be able to edit sprite control flags, in the middle of a frame, and get certain results
<doug16k>
it can't miss it
<kingoffrance>
lol theres crazy ppl speed runs and such try to jump inside blocks. yes, the bugs are timing based too
<kingoffrance>
or at least, deterministic if you jump at just the right time and speed and hit the right spot lol
<doug16k>
yeah, it's *nothing* like a modern engine that wonders how long the previous frame took
<doug16k>
the frame duration is determined by the vertical scanning frequency of the TV
<doug16k>
it could be, but often just assumes it is holding perfect framerate. then you get the "slowdown" that was common in the day
<kingoffrance>
when there are too many things on screen, yep
<clever>
doug16k: there was a guy on youtube, doing a repair of a game demo unit, it had something like 20 cart slots, and a customized console
<clever>
one problem he had, is that the hsync pulse, is routed to the cart, for on-cart chips to do extra logic
<clever>
and that trace was broken on a few slots
<doug16k>
hsync pulse. high tech stuff
<doug16k>
normally it'd be considered a tad fast
<doug16k>
unless you are doing a countdown like c64 raster interrupt
<doug16k>
...in hardware
<doug16k>
31 clocks per scanline is pretty tight (assuming 2 phase 1MHz)
lg_ has joined #osdev
<doug16k>
probably just enough to let foreground make a bit of progress if you sprayed every hsync at irq
lg_ has quit [Read error: Connection reset by peer]
<doug16k>
there are extreme cases like Atari 2600, where you bitbang the screen during the scanout
<doug16k>
there is no video ram. there is no crtc
<clever>
doug16k: for this snes stuff, it wasnt an irq, it was a dedicated pulse sent to the cart, so something non-standard could then take action
<doug16k>
there is no ramdac
lg has quit [Ping timeout: 252 seconds]
<doug16k>
yeah, 65xx time loved their irq driven fiddling with video chip
<doug16k>
it is a simple way to open up huge possibilities of really impressive stuff
lg has joined #osdev
<doug16k>
you can do "impossible" things no problem, as long as you don't want to do them all across the same row :P
<clever>
doug16k: i think the snes sound chip, is also its own cpu, talking over a pair of fifo's?
<doug16k>
and even then, you could ficker multiplex unlimited sprites
<clever>
and the boot rom, is a very dumb loader, that lets you push bytes over into the sound chips ram
<doug16k>
flicker*
<doug16k>
not sure what "ficker" means, but it doesn't sound good
<bslsk05>
'SPC700 & ARAM - Super Nintendo Entertainment System Features Pt. 10' by Retro Game Mechanics Explained (00:15:05)
<kingoffrance>
yes IIRC thats basically how spc files are "played" too
<clever>
kingoffrance: yeah, you can just emulate the SPC alone, and replay a sequence of pushes to its fifo
<clever>
kingoffrance: ive also heard of exploits involving other chip-tunes
<clever>
c64 ones i think?
<clever>
basically, the "audio player" was just a full blown emulator, complete with rom bank switching
<clever>
but, it didnt enforce that your rom-bank# was within the size of your rom file
<doug16k>
snes is offload, offload, offload
<clever>
so if you switch to bank 65535, you overflow the buffer
<kingoffrance>
well ive heard of ringtone hacks to "homebrew" ancient phones so :)
<clever>
you now have a 6502 emulator, running hostile code, that can peek/poke host ram
<kingoffrance>
it might be useful in some cases :)
<clever>
you may now wave goodbye to ASLR
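The missing check in that kind of player is small. A hedged sketch of what validating the bank number against the loaded file could look like, with a made-up `struct cart`, an illustrative bank size, and a hypothetical `cart_select_bank`:

```c
#include <stddef.h>
#include <stdint.h>

#define BANK_SIZE 0x4000u  /* illustrative bank size */

struct cart {
    const uint8_t *rom;
    size_t rom_size;       /* bytes, assumed a multiple of BANK_SIZE */
    const uint8_t *bank;   /* currently mapped bank, always inside rom */
};

/* Returns 0 on success, -1 if the guest asked for a bank the file does
 * not contain. The mapping is left unchanged on failure. */
static int cart_select_bank(struct cart *c, uint16_t bank_no)
{
    size_t banks = c->rom_size / BANK_SIZE;
    if (bank_no >= banks)
        return -1;
    c->bank = c->rom + (size_t)bank_no * BANK_SIZE;
    return 0;
}
```

With a two-bank file, selecting bank 65535 is rejected instead of producing a pointer far outside the buffer.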
<klange>
I'm poking around for examples of how raw sockets are used on Linux, et al., to do low-level things like implement dhcp clients ("but that's just UDP" you say, but not so fast... none of the dhclient implementations I've found actually use UDP sockets, they're all RAW!)
<clever>
and for extra fun, the file browser auto-plays music when you select it
<doug16k>
maybe due to extreme brokenness of dhcp servers that can't tolerate the slightest newer thing in the frame?
<klange>
One of my joke distros of ToaruOS did intro music on startup.
<klange>
doug16k: Actually it seems to be because negotiating the necessary IP packet with no source address is weird.
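To show what "an IP packet with no source address" means in practice, here is a sketch that fills in the 20-byte IPv4 header a DHCP client would hand-build for a broadcast: source 0.0.0.0, destination 255.255.255.255, protocol UDP. It does not show any socket API, and the function names and the TTL value are arbitrary choices for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Internet checksum: one's-complement sum of 16-bit big-endian words. */
static uint16_t inet_checksum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (len & 1)
        sum += (uint32_t)p[len - 1] << 8;
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Fill a 20-byte IPv4 header for a DHCP broadcast carrying `udp_len`
 * bytes of UDP header plus payload. */
static void build_dhcp_ip_header(uint8_t h[20], uint16_t udp_len)
{
    uint16_t total = (uint16_t)(20 + udp_len);
    memset(h, 0, 20);
    h[0] = 0x45;                   /* version 4, header length 5 words */
    h[2] = (uint8_t)(total >> 8);
    h[3] = (uint8_t)total;
    h[8] = 64;                     /* TTL, arbitrary */
    h[9] = 17;                     /* protocol: UDP */
    /* h[12..15] stay 0.0.0.0: the client has no address yet. */
    memset(h + 16, 0xFF, 4);       /* limited broadcast destination */
    uint16_t csum = inet_checksum(h, 20);
    h[10] = (uint8_t)(csum >> 8);
    h[11] = (uint8_t)csum;
}
```

Re-running the checksum over the finished header yields zero, which is the usual way a receiver validates it.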
<bslsk05>
ahti.space: Building a GCC System Compiler for MacOS - Ahti Gogs
<doug16k>
uplime, unrecognized command-line option '-macosx_version_min'; did you mean '-mmacosx-version-min='?
<doug16k>
_ vs -
<moon-child>
not the double m?
<doug16k>
that too
<doug16k>
can't ask for better error message than that
<uplime>
klange pointed out in another channel that something is feeding it clang options (which is probably what that is) like that but I'm not sure how to tell its gcc
<doug16k>
start with not-completely-invalid parameters
<uplime>
I'm not the one passing -macosx_version_min though
<doug16k>
ah, then grep for it
<uplime>
oh good idea
<doug16k>
you are doing slightly harder mode (mac build)
<doug16k>
fully hard mode is trying to build it on windows :P
<uplime>
yeah, i'm not going to have access to my linux vm any time soon unfortunately
<klange>
So you already have a system gcc from brew, and it's 10.2, so I suggest you pass a --target flag and actually build a cross-compiler here and skip trying to build a regular gcc.
<doug16k>
10.2 is great
<moon-child>
doug16k: I don't think it builds with cl, so at least on win it won't get confused between system vs native compiler
<klange>
Though at the same time... that brew gcc might be at fault for some of this? There's a lot of litter in your make vars.
<doug16k>
moon-child, yeah but on windows you invariably end up doing a recursive traversal, building the dependencies for everything
<uplime>
oh ok so i don't need to worry about potential security patches and what-not with it?
<doug16k>
moon-child, can't just go, boom, apt get libsomething
<moon-child>
msys!
<doug16k>
it's starting to be like that though
<klange>
You may want to convince GCC to build with the system clang instead; I think it prefers a gcc if it finds one.
<klange>
I don't know what to tell configure for that, though >_>
<klange>
maybe just CC=clang and it'll skip that step
<doug16k>
uplime, you expect blackhats attacking your stuff soon?
<doug16k>
oh
<doug16k>
you mean all modified. gotcha
<uplime>
right, that stuff
<uplime>
klange: ill give that a shot. i read in the building gcc article that building with llvm is unsupported but a lot of the mac stuff seemed older anyways, so maybe thats changed
<klange>
unsupported meaning they won't help you :)
<uplime>
haha good point
<doug16k>
can you crosscompile host=xxx-macos(?) ?
<klange>
They say any sufficiently c++11-capable compiler is fine?
<klange>
Also did you do binutils yet? You should build it first.
<uplime>
yeah binutils built fine
<doug16k>
just to powerpc?
<uplime>
it was the system one. is that the target name for macos it uses internally?
<doug16k>
I hear darwin a lot
<uplime>
oh yeah darwin is what i normally see and would expect
<doug16k>
old powerpc one
<doug16k>
the one now is that codebase though yeah
<klange>
darwin is the kernel :)
<doug16k>
ah
<doug16k>
my exposure to mac is mostly just grumbling that you have to say . when doing find -name 'foo'. MUST say find . -name 'foo'
<doug16k>
other than that in text editor
<doug16k>
M1 broke sshfs thing, since it now requires signed and blessed kernel modules
<kingoffrance>
well thats why you want $find instead of "$FIND" :)
<doug16k>
had to fix some stuff to scp
<uplime>
oh goody, now binutils doesn't build
<uplime>
about ready to just spin up a debian vm and start building on that
<doug16k>
I'm still amused to no end that my mac vm presents 4 quad core xeons to osx, to express the 16 cpus I assigned the guest :P
<moon-child>
haha
iorem30 has quit [Quit: Connection closed]
<doug16k>
I guess the code looks and figures, geez, that many cpus? gotta be xeons, and says I have xeons
<doug16k>
maybe less dumb, maybe the custom firmware figures out something believable to say in the acpi tables
<kingoffrance>
s/thats why/stuff like that in general/
valerius_ is now known as valeriusN
<doug16k>
anyone know off hand what the GPIO is for in aarch64 virt machine?
<doug16k>
there is a pl061
<doug16k>
do you need to fiddle with cpu pins or something?
<doug16k>
I feel like it must be easier to get serial to stdout than this
<doug16k>
`qemu-system-aarch64: -global pl011.chardev=... conflicts with chardev=serial0` yeah? what serial0 is that?
<doug16k>
you mean the one I am trying to control? `-global driver=pl011,property=chardev,value=debug-out`
<doug16k>
:(
<doug16k>
-serial stdout means add a new serial to the dead one it had already? or what?
<doug16k>
this is how you get stupidity like ovmf spraying the terminal out of every serial port
<clever>
linux also sprays the boot console to every console= on the cmdline
<clever>
and reading /dev/console, goes to whichever is last
<clever>
something in userland, will also spawn a getty on each of them, separately
<doug16k>
it doesn't enumerate the machine and send it out every port in existence though, right?
<clever>
correct, it only uses the ports you listed in the cmdline
<doug16k>
ovmf enumerates the machine
<doug16k>
*every* port
<clever>
i could see that causing problems with some ports
<clever>
how confused is the bluetooth controller going to be?
<doug16k>
I tried to move my bootloader log to serial on ovmf. it sprays the terminal out every one. ovmf takes over serial completely, unusable
<doug16k>
tried com2, nope
<doug16k>
still spraying there
<doug16k>
not on each screen change either
<doug16k>
it sits in an infinite loop, sending the redraw of the entire screen as fast as it can, forever
<doug16k>
in the bios
<doug16k>
once you are out to your code, the api calls cause necessary output
<doug16k>
I wouldn't send the whole screen over and over when I was 15 year old programmer
<doug16k>
ridiculous
<doug16k>
I am so cuckoo, that I hook the idle vector in the real mode IDT so I can halt when it is waiting for IRQ
<doug16k>
in icount deterministic emulation, it makes it just skip time to the completion
<doug16k>
I can't even just let seabios spin, if I can help it
<clever>
that reminds me, the rpi has an irq, to signal that the AXI bus is idle
<clever>
arm itself (the raw verilog), also has an output signal, to state that all cores are in sleep mode, on either wfe or wfi
<clever>
which can then signal external logic, to do something
<riverdc>
wow usb looks complicated...
<riverdc>
wasn't expecting that :(
<doug16k>
riverdc, it's 2nd hardest thing to do. hardest is hw accelerated gpu
<doug16k>
wait no, put "real browser works" as 1st and bump those down :P
<riverdc>
i had put "real browser works" in the "essentially impossible" category
<doug16k>
it's about right
<doug16k>
xhci isn't that bad
<doug16k>
at first it is completely overwhelming, but if you keep alternating between xhci and usb spec, it starts to make sense
<doug16k>
usb is a series of tubes. bush missed that one
<doug16k>
control endpoints that do send+recv request/response, then zero or more half duplex endpoint contexts that can send or receive
<doug16k>
per device
<riverdc>
the machine i was planning to test on (raspi) doesn't even have a PS/2 port. no idea what i should do for keyboard input now
<riverdc>
guess just serial everything
drakonis has joined #osdev
<doug16k>
and a device can present one of several interfaces, so it can act as the crappy simple version (like usb block storage) or the super fast good one (like USB Attached SCSI)
<clever>
riverdc: you could implement ps2, if you use a level shifter on the gpio
<doug16k>
you can bitbang ps2 no problem
<doug16k>
it was bitbanged in the real pc!
<doug16k>
the protocol is designed for bitbang
<clever>
oh yeah, back when the keyboard controller was an MCU?
<doug16k>
exactly
<doug16k>
there's a way to say "shut up I'm not listening" when you stop monitoring the receive
<doug16k>
so you can go do something
<doug16k>
then when you get back you release it
<doug16k>
it's very robust
<doug16k>
you can tell IBM was crazy about making sure keyboard input was infallible
<doug16k>
nobody will ever say I pressed one number and computer got another (they hoped)
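Part of that robustness is visible in the frame format itself. A small sketch of decoding one device-to-host PS/2 frame once the bits have been sampled: a start bit (0), eight data bits LSB first, an odd parity bit, and a stop bit (1). The packing of the sampled bits into a `uint16_t` and the function name are choices made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decode one 11-bit device-to-host PS/2 frame. Bit 0 of `frame` is the
 * first bit sampled: start (0), eight data bits LSB first, odd parity,
 * stop (1). Returns false if framing or parity is wrong.
 */
static bool ps2_decode_frame(uint16_t frame, uint8_t *out)
{
    bool start = frame & 1u;
    uint8_t data = (uint8_t)(frame >> 1);
    bool parity = (frame >> 9) & 1u;
    bool stop = (frame >> 10) & 1u;

    /* Odd parity: data bits plus the parity bit hold an odd count of ones. */
    unsigned ones = parity;
    for (int i = 0; i < 8; i++)
        ones += (data >> i) & 1u;

    if (start || !stop || (ones & 1u) == 0)
        return false;
    *out = data;
    return true;
}
```

A bit-banged receiver would shift in one bit per clock edge and call something like this after the eleventh bit, discarding the byte on failure.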
<doug16k>
the keyboard itself had an internal self diagnostic - in the 80's. that tells you how freaked out about erroneous keyboard input they were
<clever>
lol
GeDaMo has joined #osdev
<clever>
doug16k: that reminds me, linux has a lot of drivers, to bit-bang spi or i2c on gpio
<bslsk05>
'So how does a PS/2 keyboard interface work?' by Ben Eater (00:33:06)
<klange>
< klange> I wonder if Ben Eater is going to do more videos on USB or if this was a one-off to answer a question in a comment...
<GeDaMo>
I'm just watching the USB one now :P
<klange>
Somehow I don't think he'll be adding a USB controller to the 6502 machine.
<doug16k>
put a comment asking for an xhci host controller with streams support on a breadboard :P
iorem has joined #osdev
<doug16k>
and a pcie root complex on a breadboard
<doug16k>
if anyone can do it, ben can
<doug16k>
got one answer: from qemu doc files: GPIO lines for triggering a system reset or system poweroff (on aarch64 virt vm)
Mutabah has joined #osdev
riposte has joined #osdev
riposte has quit [Ping timeout: 268 seconds]
riposte has joined #osdev
<kingoffrance>
yeah, as soon as you say "somehow i dont think" then "hold my beer" takes effect and it is a matter of time until someone does it
<kingoffrance>
the very act of uttering that, jinxes it
johnjay has quit [Ping timeout: 268 seconds]
johnjay has joined #osdev
<doug16k>
it's a bad sign when "complex" is the best name you can come up with for your design :P
<doug16k>
pcie root "complex"
<klange>
oops I forgot to implement stack growth, quake was mad
<doug16k>
as perf nuts say, stack is hot in the cache
<klange>
old code got stale, had to fight with configure/libtool to get SDL to build shared again, but, ay, quake is back on toaru 2
<klange>
why... does there appear to be a limit to how far right I can turn...
<klange>
wonder if that's a 64-bit bug in quake
<klange>
I also continue to have this bug where resizing and not restarting screws up the palette?
<klange>
So we've got gcc, binutils, doom, quake, SDL1.2, the only thing on my "must have" list is mbedTLS.
<klange>
which is kinda dependent on the network stack existing, so, we're back to there again, I have pretty much exhausted my list of distractions
<klange>
_unless_... I want to get the audio in quake working on the thinkpad
<klysm>
pci express root neurotic conundrum
riposte has quit [Quit: Quitting]
riposte has joined #osdev
KREYREEN has quit [Remote host closed the connection]
KREYREEN has joined #osdev
iorem has quit [Ping timeout: 272 seconds]
<graphitemaster>
toaru 2?
<klysm>
yeah that's what he's calling it
MarchHare has quit [Ping timeout: 244 seconds]
iorem has joined #osdev
srjek has quit [Ping timeout: 268 seconds]
MiningMarsh has quit [*.net *.split]
simpl_e has quit [*.net *.split]
merry has quit [*.net *.split]
Amanieu has quit [*.net *.split]
moon-child has quit [*.net *.split]
Griwes has quit [*.net *.split]
ids1024 has quit [*.net *.split]
Benjojo has quit [*.net *.split]
HeTo has quit [*.net *.split]
Lucretia has quit [*.net *.split]
Patater has quit [*.net *.split]
Affliction has quit [*.net *.split]
jstoker has quit [*.net *.split]
bleb has quit [*.net *.split]
pbx has quit [*.net *.split]
manawyrm has quit [*.net *.split]
Stary has quit [*.net *.split]
yuu has quit [*.net *.split]
remexre has quit [*.net *.split]
yuriks has quit [*.net *.split]
meisaka has quit [*.net *.split]
gruetzkopf has quit [*.net *.split]
simpl_e has joined #osdev
HeTo has joined #osdev
MiningMarsh has joined #osdev
Benjojo has joined #osdev
Amanieu has joined #osdev
Lucretia has joined #osdev
moon-child has joined #osdev
Patater has joined #osdev
jstoker has joined #osdev
Affliction has joined #osdev
yuu has joined #osdev
bleb has joined #osdev
pbx has joined #osdev
Stary has joined #osdev
remexre has joined #osdev
yuriks has joined #osdev
manawyrm has joined #osdev
gruetzkopf has joined #osdev
meisaka has joined #osdev
merry has joined #osdev
ids1024 has joined #osdev
Griwes has joined #osdev
<klange>
2.x was always going to be "when I finally port it to x86-64", which is why even the relatively huge project of "toaru-nih" was only a minor version bump from 1.2
zhiayang has quit [Ping timeout: 245 seconds]
zhiayang has joined #osdev
tenshi has joined #osdev
<graphitemaster>
twoaru
gmacd has joined #osdev
<klange>
I will write a lengthy blogpost on the whole adventure to go along with the release.
<geist>
woot put in my first JLCPCB order using Kicad
<geist>
in the past have used osh park + eagle so it's a new thing this time
<j`ey>
fun!
<j`ey>
m68k?
<geist>
naw, simpler for now, but i have started slowly working on a 68k based thing
<geist>
the amount of glue logic you need for a 68k is substantially more than i'm used to
<geist>
this is a z80 based thing
<j`ey>
what package is the z80?
<geist>
68k basically needs a CPLD to really make heads or tails of the bus logic
<geist>
z80 is a 40 pin dip
<j`ey>
oh nice, no hard soldering there
<geist>
16 bit address, 8 data bits, some control logic. very straightforward
<j`ey>
the PCBs I designed were qfp, and after multiple attempts to solder that, i just got someone else to assemble them
<geist>
i have a 68008 and 68000 (mostly differing by number of pins, because D bus is 8 vs 16)
<geist>
and a 68030 with all the 128 pins brought out. that one has a but-ton of traces to deal with
<geist>
well, relatively speaking.
<geist>
i snagged off mouser a few 512k x 8 sram chips + a few 128k x 8 bit roms, so can easily put together a few MB of ram
<klange>
panel widget is now looking at a new API for network status, dhclient can set the address on the interface it's configuring... could probably do udp sockets right now with this... but still thinking about data flow...
Arthuria has joined #osdev
<doug16k>
geist, drilling and routing and masks and all that?
<geist>
yah jlcpcb is mega cheap, though to be fair they're giving the first board a big discount
<doug16k>
did you get any slots cut out or stuff?
<geist>
but it seems a lot cheaper than oshpark
<geist>
nothing fancy, just a new board for my rc2014
Sos has joined #osdev
<doug16k>
yeah, they know if they can get you to try one board you will get more, so might as well give away first one
<geist>
yep
<geist>
but even that it seems mega cheap. but i know folks that have used them and it has been pretty quality
<geist>
so far the ordering experience has been pretty nice. very slick
<doug16k>
yeah they use the real machines and real pros operating it and stuff, apparently
KREYREEN has quit [Remote host closed the connection]
SwitchToFreenode has joined #osdev
SwitchToFreenode is now known as KREYREN
mctpyt has quit [Ping timeout: 245 seconds]
mctpyt has joined #osdev
<seds>
how would one represent the architecture and the OS communication in a UML diagram? I was thinking to represent the OS as a package and the architecture (RISC-V) as a package too
<seds>
but I don't like the idea of representing RISC-V as a package
<seds>
I wonder if anyone has some OS+architecture communication uml diagram I could take a look
<moon-child>
seds: why do you want to do that? What information do you want to convey with the diagram? 'architecture and communication' is kind of vague
<seds>
moon-child: for my thesis, i have to represent the general solution i am proposing, basically an overview how components communicate with the hardware
<seds>
so i wonder how "hardware components" are represented in a uml diagram
<seds>
perhaps i just answered my question – components
<moon-child>
:)
<seds>
thanks!
<moon-child>
another thing--presumably your thesis involves designing some novel aspect of an os. That being the case, you might want to design your diagrams to showcase that aspect
MarchHare has joined #osdev
<seds>
moon-child: yeah, although i have the general idea of what i am designing, there are still some doubts on how i will organize it, so i am sketching a little simple overview of how things should be placed
<seds>
(still the first year of my thesis), so i have enough time to go back and resketche it
<seds>
s/resketche/resketch/
xenos1984 has quit [Remote host closed the connection]
xenos1984 has joined #osdev
srjek has joined #osdev
sortie has quit [Remote host closed the connection]
sortie has joined #osdev
V has joined #osdev
iorem has quit [Quit: Connection closed]
Mutabah has quit [Ping timeout: 264 seconds]
mctpyt has quit [Ping timeout: 264 seconds]
dennis95 has quit [Quit: Leaving]
KREYREN has quit [Ping timeout: 252 seconds]
SwitchToFreenode has joined #osdev
SwitchToFreenode is now known as KREYREN
<doug16k>
do these look reasonable: -march=rv64imafdc -mabi=lp64d
<sortie>
No
<sortie>
The string rv64imafdc will never look reasonable.
<doug16k>
lol
<kazinsal>
Yeah, holy shit GCC, how did you guys forget about the letter G
<kazinsal>
does -march=rv64gc not work?
gareppa has joined #osdev
<doug16k>
valid arguments to ‘-mabi=’ are: ilp32 ilp32d ilp32e ilp32f lp64 lp64d lp64f; did you mean ‘lp64’?
<doug16k>
d means pass up to double in hard float reg
<kazinsal>
makes sense
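A minimal sketch of the shorthand kazinsal is asking about, assuming the conventional meaning of `g` as the general-purpose bundle `imafd`. The helper name `expand_march` is hypothetical and only handles the single-letter part of the string; it is not how GCC parses `-march`.

```python
def expand_march(march: str) -> str:
    """Expand the 'g' shorthand in a simple -march string such as 'rv64gc'.

    Assumption: 'g' stands for the 'imafd' bundle. Newer toolchains also
    fold Zicsr/Zifencei into 'g'; that detail is ignored here.
    """
    base, exts = march[:4], march[4:]
    if base not in ("rv32", "rv64"):
        raise ValueError(f"unexpected base ISA in {march!r}")
    # Only rewrite the single-letter extensions before any '_' separator,
    # so multi-letter extension names after '_' are left untouched.
    letters, sep, rest = exts.partition("_")
    return base + letters.replace("g", "imafd") + sep + rest


if __name__ == "__main__":
    print(expand_march("rv64gc"))
```

Under that assumption `rv64gc` and `rv64imafdc` name the same extension set, so either spelling should select the same target.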
mctpyt has joined #osdev
<doug16k>
funny that for my first attempt at riscv64, it won't assemble the very first instruction, and I'm stuck already. at instruction zero
<doug16k>
not promising
<doug16k>
whole thing compiles and links. can't get ONE instruction of assembly to work. ONE
<klysm>
doug16k is this with qemu-system-riscv64 ?
<doug16k>
yes
<doug16k>
but it's the compile, not execution problem
<kazinsal>
I have this weird feeling that GNU doesn't consider risc-v free enough to assemble
<doug16k>
:)
<kazinsal>
refusing to work on a berkeley product out of spite
tenshi has quit [Quit: WeeChat 3.1]
<klysm>
doug16k well as long as you know what's going on. otherwise I'd ask about that particular instruction.
<doug16k>
where's the pseudo-op to just load a damn constant into a register? I MUST do ldu addiu ? ok mom
<klysm>
oh I made a macro for that
<doug16k>
ya macro
<doug16k>
macro five is more like it
<doug16k>
maybe the five comes from needing to write five macros by the time you finish one piece of assembly
<kazinsal>
Extremely Reduced Instruction Set Computing
<meisaka>
maybe they wanted RISC 4, but the roman numeral "IV" has too many letters
<bslsk05>
github.com: riscv-asm-manual/riscv-asm.md at master · riscv/riscv-asm-manual · GitHub
<doug16k>
technical writing today man, wow
<doug16k>
leave out everything to save space
<kazinsal>
Every once in a while I see USB and think "I should write a USB stack", but then my brain immediately goes, "no, dumbass, you need to write an actual block device interface and an AHCI or ATA driver first".
<kazinsal>
doug16k: iirc it's rt <op> rs
<doug16k>
you should make an ahci driver just for the fun of it. it's elegant enough to be fun
<klange>
impossible
<doug16k>
you should be able to dual-wield ahci drivers before you even think about usb driver :P
<sortie>
klange, this is osdev, we haven't even implemented word impossible in the dictionary yet
<kazinsal>
AHCI, but make it distributed
<sortie>
AHCI as a service
<kazinsal>
sorry y'all, I am no longer writing a networking platform. it's all about S T O R A G E
<klange>
hardware is not allowed to be elegant and fun
<klange>
this is illegal
<kazinsal>
*implements btrfs, everyone's data disappears*
<sortie>
storage go btrt
<kazinsal>
btrfs and refs are the only two filesystems I've ever lost data on, and they're the only two filesystems I've used that explicitly claim to be scalable, performant, and reliable
<klange>
icbinbtrfs
<kazinsal>
(in the modern era, under regular workload conditions; I have definitely killed floppy disks and experimental filesystems before)
<doug16k>
what about the ones that say they will eventually corrupt it if you don't have ecc. what's up with that?
<gog>
that's a feature
<doug16k>
everyone else's filesystem gets away with all the bad ram
<kazinsal>
It's a stealthy way of forcing you to shut off the computer and go outside once in a while
<doug16k>
windows actually keeps checksums on some critical stuff so it can tell it got corrupted
gmacd has joined #osdev
<kazinsal>
yeah, NTFS dates back far enough (yet impressively is still a solid filesystem) that it has a whole bunch of checks for "what if the universe decides your computer isn't allowed to work right today"
<klange>
well it _is_ windows after all
<geist>
doug16k: any luck getting riscv to assemble? 'li' is what you want
<doug16k>
la worked
<geist>
la works too
<geist>
li is a constant, la is an address
<doug16k>
it's an address in my case
<doug16k>
but a constant address, so, it's philosophical
<gog>
doug16k: ah the filesystem equivalent of telling you to "go touch grass"
<geist>
it changes whether or not it tries to assemble it as a fixed literal or something that's PC relative
<doug16k>
ah
<geist>
note that asm manual you linked is not the manual for the instruction set
<doug16k>
gotcha. worse relocation on li
<geist>
that's 'how to write asm'
<geist>
as in 'some pseudo instructions, some assembler intrinsics'
<geist>
the actual ISA manual is the user instruction spec
<doug16k>
la might just move with everything and not fixup
<geist>
la probably only takes a label
<doug16k>
ya but I mean, if I ASLR'd it to a different place, la doesn't need fixup, li does?
<geist>
auipc is the key, it's much like adrp in ARM. given some N bit offset from PC it computes the page aligned address relative to PC for that symbol
<geist>
then subsequent call/jalr/ld/sd/addi/etc provides the rest of the 12 bits
<doug16k>
does %lo and %hi work for you?
<geist>
right
<geist>
also note the pseudoinstruction like 'call': it does the auipc to find the target and then the jalr instruction provides the rest of it
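A rough sketch of the split geist describes: the upper 20 bits go into `auipc` and the following instruction supplies a signed 12-bit remainder. This is plain Python arithmetic for illustration, not assembler output; the function names `split_pcrel` and `auipc_then_addi` are made up here. The key detail is that the low part is sign-extended, so the high part has to be rounded to compensate.

```python
def split_pcrel(offset: int) -> tuple[int, int]:
    """Split a signed PC-relative offset into (hi20, lo12).

    lo12 is a signed 12-bit value; hi20 is the raw 20-bit field that
    would go into auipc. Assumes the offset fits the auipc+addi range.
    """
    lo = ((offset & 0xFFF) ^ 0x800) - 0x800    # sign-extend the low 12 bits
    hi = ((offset - lo) >> 12) & 0xFFFFF       # remainder, as a 20-bit field
    return hi, lo


def auipc_then_addi(pc: int, hi: int, lo: int) -> int:
    """Model 'auipc rd, hi' followed by 'addi rd, rd, lo' on a 64-bit hart."""
    upper = hi << 12
    if upper & 0x80000000:                     # auipc sign-extends hi << 12
        upper -= 1 << 32
    return (pc + upper + lo) & 0xFFFFFFFFFFFFFFFF


if __name__ == "__main__":
    pc, target = 0x80000000, 0x80001804
    hi, lo = split_pcrel(target - pc)
    print(hi, lo, hex(auipc_then_addi(pc, hi, lo)))
```

Here the low 12 bits `0x804` have the sign bit set, so `lo` comes out negative and `hi` is bumped up by one to make the sum land on the target.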
xenos1984 has quit [Remote host closed the connection]
<geist>
i'm sure you've figured it out but the 'branch on <logical operation>' are all comparing two registers
<doug16k>
yeah, lack of flags
<geist>
also since the x0 reg is zero, a bunch of them are pseudo instructions like 'beqz, branch on equal zero, where it compares your reg to x0'
<doug16k>
oh the zero reg, lol
<doug16k>
startup code tries to use it
<doug16k>
how do you store?
<geist>
sb/sh/sw/sd
xenos1984 has joined #osdev
<geist>
only addressing mode it supports is reg + 12 bit signed immediate
<bslsk05>
github.com: riscv-asm-manual/riscv-asm.md at master · riscv/riscv-asm-manual · GitHub
<geist>
almost no code at all in riscv uses the raw registers, it's all a0, t0, s0, plus the special case ones (like zero, or ra, or s)
<geist>
sp
<j`ey>
geist: does x0 to sp too?
<geist>
j`ey: unable to parse
<j`ey>
geist: does x0 encode to sp too? (like arm)
<geist>
sp == x2
<j`ey>
missed quite a key word
<geist>
see table i just linked
<j`ey>
looking
<geist>
there are no magic encodings. simply 32 registers, one of which is fixed to zero
<geist>
and any instruction that intrinsically deals with PC does it because the instruction is PC related
<doug16k>
%lo works now!
<geist>
j`ey: the layout of pseudonames to real registers is a mess, but there's a reason for it: there's an embedded spec that only includes the first 16 registers, so they had to make sure that < 16 gets a medley of t, a, and s registers
<j`ey>
I like that secondary naming scheme
<geist>
yah it's pretty natural once you get a hang of it
<geist>
what *does* complicate it is riscv has a thumb2 like thing where some instructions can be 16 bit
<geist>
and one of the things is it relies on you using a subset of the registers
<geist>
it's something like x8-x15
<geist>
neat thing is the compressed instructions are 1:1 map to the larger ones, so the assembler itself actually can substitute
<geist>
and there's no penalty for using the compressed instructions, so it's one of those extra levels to the register allocator and whatnot, much like x86, where you're trying to more heavily use those 8 registers so you can get compressed instructions
<geist>
looks like it. note you've forgotten to write a 0 into a0
<doug16k>
which reg is zero register?
<geist>
'zero'
<doug16k>
oh I thought a0 was stuck as zero black hole
<geist>
no. you probably oughta spend like 5 minutes just skimming the docs
<j`ey>
a0 is x10 (from that list)
<geist>
unless you're intentionally trying to stub your toe, which can be fun!
<geist>
but yah remember a,t,s are just ABI level labels to three classes of registers
<geist>
a and t are both temporary (callee trashed) but as are used to pass args (and return)
<geist>
ses are callee saved
<geist>
and t are used for neither, and the assembler will sometimes use them, so it's not generally a good idea to use t registers to hold values across your asm function
<geist>
so using as for work like this is totally fine, but if you want a hard zero register use 'zero' which is an alias for x0
<j`ey>
ohhhh thats actually a name
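For reference, the ABI names discussed above can be written out as a small table. This is a sketch in Python built from the standard RISC-V calling convention names; the helper `abi_names` and the `RVC_REGS` list are illustrative names, and the x8-x15 range is the register subset that the 3-bit register fields in compressed instructions can encode.

```python
def abi_names() -> dict[int, str]:
    """Map register numbers x0..x31 to their ABI names."""
    names = {0: "zero", 1: "ra", 2: "sp", 3: "gp", 4: "tp"}
    names.update({5 + i: f"t{i}" for i in range(3)})        # x5-x7   = t0-t2
    names.update({8: "s0", 9: "s1"})                        # x8 is also fp
    names.update({10 + i: f"a{i}" for i in range(8)})       # x10-x17 = a0-a7
    names.update({18 + i: f"s{i + 2}" for i in range(10)})  # x18-x27 = s2-s11
    names.update({28 + i: f"t{i + 3}" for i in range(4)})   # x28-x31 = t3-t6
    return names


NAMES = abi_names()

# Registers reachable from the 3-bit fields of compressed instructions.
RVC_REGS = [NAMES[n] for n in range(8, 16)]

if __name__ == "__main__":
    for n in range(32):
        print(f"x{n:<2} {NAMES[n]}")
    print("compressed subset:", RVC_REGS)
```

The first 16 entries show the mix geist mentions: a few t, s and a registers each, so a 16-register embedded profile still has every class available.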
<geist>
side note if you want to see how well the compiler/assembler actually used compressed instructions, here's a random block of some LK code: https://pastebin.com/Qw9g7Vjk
<geist>
compiler actually gets good usage of it. basically thumb2 level compression
<j`ey>
yeah, looks pretty good
<geist>
yah it's really powerful when you can do the instruction size per instruction. ARM really missed an opportunity there with ARM64
<doug16k>
riscv64 virt machine only gets 64KB boot rom image at 0x1000, and RAM is so far away, you can't reach it with 20 bits, so I copy the rom into ram and jump to it before I do anything, and abandon the rom image altogether, too far away
riposte has quit [Quit: Quitting]
<geist>
doug16k: makes sense. note if you're compiling for 64bit you probably want to use -mcmodel=medany
<doug16k>
on aarch64 and x86 and i386 it keeps using the rom
<geist>
honestly medany should be default
<geist>
without it the default is 'medlow' i think, and medlow assumes all code/data is +/-2GB of 0
<geist>
rv32 is implicitly medlow because that's the entire address space
riposte has joined #osdev
<geist>
doug16k: oh also fun thing you can do on riscv: you can use atomics from the get go, so a common trick to trap everything but the first cpu is to immediately do an atomic and let only the first one in
<geist>
well i cut it off a bit early, that test only lets the first 8 cores in, but there's a test later that then lets the one that got ticket 0 through
<geist>
so a detail you'll find on Real riscv machines that's annoying but makes sense: the boot cpu # is not always 0 based
<geist>
in your case you're probably booting in machine mode and not using opensbi because you're trying to write low level firmware?
<doug16k>
ah, riscv scoffs and says, no *you* assign cpu numbers :D
<geist>
basically
<geist>
worse, the sifive machines actually number the cpus oddly: cpu 0 is a nerfed, machine mode only cpu, cpus 1-4 are the Real ones
<geist>
then doubleplus weird: opensbi arbitrarily lets one cpu through to boot the OS
<geist>
and it's not always cpu 1
<geist>
so anyway, long story short linux actually dynamically assigns logical cpus and I needed to do something the same in LK
<geist>
broke assumptions that cpu 0 == machine cpu id 0 and then everythings happy
<doug16k>
how do you do IPI if the cpus don't have a hardwired "number"
<doug16k>
or semihardwired
<doug16k>
(like x86 letting you change it if you insist)
<geist>
well, depends. the cpus *do* have a hard wired number, it's readable in the mhartid register
<geist>
(HART is riscv term for 'hardware thread')
<doug16k>
ah ok good
<doug16k>
so I just let a race randomize first cpu and it is as intended?
<geist>
when you're in machine mode most stuff is hart id based. and when you're talking to the interrupt controller that's what you use
<geist>
it's more complicated: when you boot in machine mode, all the cpus will probably just start simultaneously
<geist>
that's what you're testing right now, right?
<doug16k>
yeah I should be very first instruction at power up
<geist>
it's very much like in the ARM world: if you're the firmware you have all the problems, and it may be soc specific
<geist>
kay, so on qemu virt it'll just start all the cores at the same time, and it'll number them hart 0-N
<geist>
so no sweat, trap them and you can continue on as usual
<doug16k>
assuming they all come flying in, I can pick one, use it, and send the rest to the trap, and they all can read their own cpu number and populate some array?
<geist>
but.... if you boot on Real Hardware, or qemu's emulation of it, cpu 0 is special and weird, so you have to deal with that, etc
<geist>
right. but, if you boot on SBI in supervisor mode, which is the -bios default mode and how linux runs
<geist>
then, SBI boots first, and it provides a fiction of what the cpu numbers are and handles things like IPIs and timers for you
gareppa has quit [Quit: Leaving]
<geist>
(and even a console!)
<geist>
ie, you make firmware calls to send IPIs, etc
<geist>
supervisor mode has no shartid, when a cpu is booted in supervisor mode (via the firmware) it is assigned an id and it's passed in a0
<geist>
and you have to remember it. it's like in The Hudsucker Proxy: you're given an id, it will not be repeated
<geist>
also for whatever reason SBI itself arbitrarily lets exactly one cpu through to boot the supervisor kernel, and it's not always the lowest numbered one
gareppa has joined #osdev
<geist>
so basically: if you want to support all the boot modes eventually (which LK does, which is why it's fresh on my mind) you have to deal with both boot situations
<geist>
a) you get all the cpus at once, not necessarily 0 indexed
<geist>
b) you get exactly one cpu, and it can be any of them
<geist>
hence why i do the atomic thing: the first cpu that passes _start gets assigned 0 and then boots the kernel, all subsequent cpus go to purgatory but then get assigned a logical # as they're woken up
<geist>
basicaly what linux does. it solves both boot situations nicely
<doug16k>
yep, that should cover all cases
<doug16k>
possible cases
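The ticket scheme geist describes can be modelled in a few lines. This is a sketch only: Python threads stand in for harts, a lock stands in for a single atomic fetch-and-add, and the names `BootTicket` and `boot` are hypothetical rather than anything from LK or Linux. Whichever hart arrives first gets logical id 0 and would go on to boot the kernel; the rest receive later tickets and would park until woken.

```python
import threading


class BootTicket:
    """Hands out logical CPU numbers in arrival order."""

    def __init__(self) -> None:
        self._lock = threading.Lock()   # stands in for one atomic add
        self._next = 0
        self.logical_to_hart: dict[int, int] = {}

    def take(self, hartid: int) -> int:
        with self._lock:
            logical = self._next
            self._next += 1
            self.logical_to_hart[logical] = hartid
        return logical


def boot(hartids: list[int]) -> tuple[dict[int, int], dict[int, int]]:
    """Let every hart race through 'entry' and record its logical id."""
    ticket = BootTicket()
    hart_to_logical: dict[int, int] = {}

    def entry(hartid: int) -> None:
        hart_to_logical[hartid] = ticket.take(hartid)

    threads = [threading.Thread(target=entry, args=(h,)) for h in hartids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return hart_to_logical, ticket.logical_to_hart


if __name__ == "__main__":
    # Case (a): all harts arrive at once, not 0-indexed.
    print(boot([1, 2, 3, 4]))
    # Case (b): firmware lets exactly one arbitrary hart through.
    print(boot([3]))
```

Both boot situations end up with a dense logical numbering starting at 0, independent of which hardware ids were present, which is the property the rest of the kernel can then rely on.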
<geist>
note qemu virt machine the cpus are 0 based, so you can use your existing 'cpu 0 is special' logic and that's fine
<geist>
it's really when you start dealing with real sifive hardware where you get the case where cpus are not 0 numbered, which is what really fouled up my existing system
<doug16k>
I'd rather make it vaguely realistic, just in case someone somewhere refers to it thinking this means it works on real machine
<geist>
right, that's generally my idea, plus it's fun to handle all the boot situations. i think you understand this. kinda fun to check off all the boxes even if you never use it
<doug16k>
exactly
<geist>
it's an interesting strategy that they actually hide some core things from supervisor mode. the fact that s mode can't read the current cpu id is annoying and kinda genius when you think about it
<geist>
it's all set up such that the amount of stuff you have to virtualize is very low
<geist>
like, does the cpu *need* to know what id it is? no. software can deal with that as long as it was told
<geist>
riscv you can't even read the current ISA capabilities (misa) register in supervisor mode. that's supposed to be a contract between you and a higher authority (SBI, hypervisor, etc)
<doug16k>
hey you just reminded me, I need -mstrict-align on aarch64 if I don't enable the cache, right?
<geist>
right
<doug16k>
to be realistic
<geist>
yah qemu i dont think will trap on unaligned, but on real hardware you'd probably make it a hundred instructions in before it traps. compiler loves to emit unaligned stuff
<doug16k>
I don't do anything misaligned but I bet the compiler did, yeah
<doug16k>
I cleaned it up quite a bit more
<doug16k>
now it just builds a table of pci devices and caches it after configuring everything
<geist>
noice
<doug16k>
it's like a little scaffold where you can do something with a just-barely-initialized machine with a framebuffer, on several architectures
<geist>
the riscv virt machine is pretty similar. not exactly the same layout, but clearly heavily inspired by the arm one
<geist>
sooo. you know which one you gotta do next right?
<doug16k>
which one?
<geist>
qemu 6.0 just added a virt machine for m68k
<doug16k>
yeah!
<doug16k>
that's nostalgic as hell
<geist>
that one has the goldfish_tty thing i was pointing at yesterday
<doug16k>
never had one but drooled over it from afar
<geist>
dunno what sort of framebuffer you can get on it
<geist>
yeah i started fiddling with a basic hello world for it, on github
<geist>
i think it's my treat to port LK to 68k once i finish the current round of things i should do first
<geist>
my toolchain script will happily build a 68k gcc as well
<doug16k>
probably secondary-vga works if it is anything like the aarch64 or x86 virt
<doug16k>
that bochs interface stuck because it is so simple and does what you need directly
<geist>
yah question is whether or not it brings out PCI
<doug16k>
now that architectures are all needing same MMIO thing, I'm realizing I just have two machines: x86, virt
<doug16k>
I'm guessing I just point 68k at the virt mmios and away you go
<doug16k>
awww, come on: qemu-system-riscv64: unable to find CPU model 'max'
<doug16k>
each one is special?
<doug16k>
so max is just a hack
<doug16k>
it's ok though, I'll plumb it into the per-arch nitpick switch
<doug16k>
oh I have it already for cortex-a72
<riverdc>
doug16k: when you bitbanged ps/2, did you set things up so that a single key press/release trigerred a single interrupt somehow? or were you getting an interrupt for every clock cycle (which would be 11 or so times for each press)
<geist>
doug16k: yah i think 'rv64' is all you need
<geist>
it doesn't really emulate much else anyway
mctpyt has quit [Ping timeout: 264 seconds]
<doug16k>
riverdc, in a loop advancing through a state machine pretty much
<bslsk05>
github.com: lk/do-qemuriscv at master · littlekernel/lk · GitHub
<doug16k>
riverdc, is that what you mean?
<riverdc>
yep, exactly, thansk
<riverdc>
thanks*
<doug16k>
riverdc, you will outrun the bus by miles
<doug16k>
you are the flash. bus is slow
<riverdc>
you don't know how fast my cpu is :)
<riverdc>
kidding
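A sketch of the kind of state machine doug16k mentions, assuming the usual PS/2 device-to-host frame of 11 bits: a start bit (0), eight data bits LSB first, an odd parity bit, and a stop bit (1). The class name `Ps2Receiver` and its method are hypothetical; on real hardware the bit would be sampled from the data line on each falling clock edge, whether from an interrupt or a polling loop.

```python
class Ps2Receiver:
    """Assemble one byte from 11 sampled bits, one per falling clock edge."""

    def __init__(self) -> None:
        self.reset()

    def reset(self) -> None:
        self.count = 0   # position within the frame
        self.data = 0    # data bits collected so far
        self.ones = 0    # number of 1s seen in data + parity

    def on_falling_edge(self, bit: int):
        """Feed one sampled bit; return the byte when a valid frame ends."""
        n = self.count
        self.count += 1
        if n == 0:                       # start bit must be 0
            if bit != 0:
                self.reset()
            return None
        if n <= 8:                       # data bits, LSB first
            self.data |= bit << (n - 1)
            self.ones += bit
            return None
        if n == 9:                       # parity bit (odd parity overall)
            self.ones += bit
            return None
        # n == 10: stop bit must be 1 and parity must check out
        ok = bit == 1 and self.ones % 2 == 1
        byte = self.data
        self.reset()
        return byte if ok else None


if __name__ == "__main__":
    rx = Ps2Receiver()
    frame = [0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1]   # 0x1C with odd parity
    for b in frame:
        result = rx.on_falling_edge(b)
    print(hex(result))
```

Driven this way, the receiver sees one call per clock edge and yields a complete byte only on the eleventh, which matches the "11 or so times for each press" count in riverdc's question.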
<doug16k>
geist, so if I went with sifive, that approximates the real I/O and stuff that would be there on real one?
gareppa has quit [Quit: Leaving]
<doug16k>
oh sifive is just the cpu
gmacd has quit [Remote host closed the connection]
<doug16k>
sorry I was confusing it with a desktop computer
<geist>
no, that's not true, it actually selects a machine
<doug16k>
sifive does mean a desktop?
<geist>
there are two machines it emulates: sifive_e which is the sifive hifive, 32bit, embedded
<doug16k>
biggest one I mean
<geist>
sifive_u is the unleashed, which is a real 4 core 64bit machine. i have one on my desk
<doug16k>
I can sifive_u qemu?
<geist>
see thing i linked you
<geist>
one of the options in the script is to tell it to emulate either a sifive u or e
<doug16k>
ok I'll search it
<geist>
the _u is what you want, unless you also want to do some embedded, machine mode only 32bit riscv
<geist>
which is also fun
<doug16k>
so if you hardcoded all the addresses that you saw in qemu, that could work on real one? it's that closely simulated?
<geist>
yes
<geist>
well, depends on what you mean by 'all the addresses'
<doug16k>
cool
<geist>
you can also just look at the sifive unleashed manual
<doug16k>
what you see at power up
<geist>
it's just a qemu model of real hardware, done by sifive themselves
<geist>
also the sifive hardware looks kinda like virt. ram is roughly in the same place, etc
<doug16k>
example: at the very 1st instruction of virt vm, you can write to serial
<geist>
and of course the sifive folks i think did most of the work to add 'virt' so it's pretty similar
<geist>
oh i dunno
<geist>
gosh you want it to be a golden platter?
<doug16k>
no
<doug16k>
I mean there is some subset of stuff that just works from instruction 0
<geist>
i dunno!
<doug16k>
it at least makes them show up at the expected address is the question
<geist>
note with that i *always* use OpenSBI which probably smooths all that over anyway
<geist>
i have no real interest in running machine mode bare metal on mmu capable RISCV machines
<geist>
but most likely the sifive uart emulation is sloppy enough that at worst you might have to hit an enable bit and then be going
<doug16k>
yeah I don't need that much to get a basic port up, I need to deal with weird far away addresses at rom start, forcing things to certain places, copying data image to ram, and zeroing bss, and trapping the APs
<bslsk05>
github.com: lk/uart.c at master · littlekernel/lk · GitHub
mctpyt has joined #osdev
<geist>
but that code has also run on actual hardware, so i dunno if the enable bit is required for qemu, for example
<geist>
but frankly if you dont have real hardware i wouldn't bother writing code against an emulation of it. it's not that different from the virt machine, just different
<doug16k>
yeah my ambitions are restricted mostly to getting lots of arches booting to a framebuffer and PCI bar initialized scaffold on virt vm, so I can cheat and make most of them use the same virt mmio window code with per-arch addresses and sizes
<geist>
yah the sifive pci stuff i dont think is emulated
<geist>
the real hardware the pci bits are kinda a mess because it's mostly implemented on a fairly large external FPGA
<geist>
and i think the firmware (uboot, etc) has to load up the firmware and whatnot
<geist>
fairly certain qemu just doesn't emulate any of that, or maybe emulates it already being magically configured
<geist>
also to add complexity there's a second version of it now. sifive unleashed and sifive unmatched
<geist>
the unmatcheds are just starting to ship now
<geist>
it's pretty different physically though software wise it seems to be fairly similar. newer riscv cores that are faster though
<geist>
unmatched has much better PCIe support though. has an actual nvme slot and a real PCIex4 i believe
<geist>
i have a nvidia card in it and it actually brings it up and uses it for reals
<doug16k>
I think the next levels of ambition should be to get context switching working, then irq handling, then a scheduler, then bringing out APs
<klysm>
doug16k, it's 64-bit
<geist>
sure. basically starting over and build a simple OS?
<geist>
in a sense that was kinda my rationale for LK back in 2008 or so
<geist>
start over build something simple and very portable
<doug16k>
geist, not really. just mostly exploring ways to make my stuff way more cross platform
<doug16k>
with way better custom configure+makefile build
<doug16k>
so I want to expose myself to all the little nitpicking you need across platforms better
<geist>
cool, yeah
<geist>
it's fun, i love that kinda stuff. to a certain extent that's where a lot of the LK energy goes: it's very depth first. make work on All The Things but not very deeply
<geist>
it's a very different thought process than i've seen a lot of hobby stuff here, which is totally fine.
sortie has quit [Quit: Leaving]
<klange>
breadth first?
<geist>
yah breadth first
<geist>
(this is where i do like things like slack and discord better, can go back and edit typos)
<klange>
describing os projects through graph traversal approaches...
<geist>
but it has worked well in LKs favor. get basic but pretty polished basic support on lots of things, provide a bunch of customization options, and then get out of the way
<geist>
hence why lots of companies and whatnot pick it up and then hack it up for their purpose
<geist>
of course things like 68k or vax support dont matter in this case, but as long as i can keep those toy ports safely hidden away behind an arch/ dir it's no real harm
<klange>
what does that make toaru... it's not depth-first as I don't try to dig deep into doing things, and it's not breadth-first because I don't try to cover a lot of hardware... maybe it's A*
<geist>
since there's virtually no ARCH_* stuff outside of an arch dir or the corresponding platform/* it works pretty well
<doug16k>
klange, probably cost/benefit ratio driven in your case
<doug16k>
you get an awful lot of functionality done in short time periods
<geist>
depends on what kinda graph it is. i think in the case where the graph is 'usability as a desktop' things like toaru and sortix are very depth first
<geist>
whereas LK is almost anti that, i'm not interested that much in a desktop/interactive experience as much as a building block to build other things out of
<geist>
but then i want very solid SMP, architecture, etc support
<geist>
ultimately of course it's self serving: my interests lie in low level stuff, so the project reflects my interests
<geist>
from talking to sortie and probably klange i think their interests are not there, right?
<klange>
I wouldn't say I'm not interested, just that my overall approach is less focused on it.
<geist>
sure. that's totally fine btw. it's just a different way of approaching it
<geist>
in no way is this a dig on anything
<klange>
sortie and I are aiming for the same area, which is more userspace / high-level functionality vs. kernel, hardware, etc.
<geist>
LK/newos/etc (my projects) are always held back because i just want to write kernels
<geist>
so i try to pair myself with companies or whatnot where they'll let me do the bottom parts and someone else does the higher bit
<klange>
but we also take fundamentally different, perhaps even opposite approaches: sortie has always been about maximum formality and correctness, and it's served him well for complicated ports
<geist>
yah you can see these different venn diagrams start to form
<klange>
my approach with toaru has been to eschew correctness and accuracy in favor of maximizing apparent functionality
<geist>
or some sort of tree of approaches, etc
<geist>
also i think it's a bit deeper for me. i like to prepare to do something much more so than doing it.
<geist>
ie, setting up the build system, getting the basic port working, organizing my workbench, buying the tools, etc
<geist>
i do it all the time in my personal life. always preparing for some task that i dont necessarily get to
<geist>
even sitting around here right now talking about it is another form of it. its like i mentally prepare for some task by talking about it, reading the docs, learning everything i can about a topic before even starting
<geist>
most of the time i never get to step 2
<klange>
I have trouble finishing things, so that's kinda baked into the design philosophy here: nothing is ever done, it's all just MVP.
<geist>
same here, though honestly i'm okay with that sometimes. i am fine with half finished things in the sense that it's an aspect of engineering: building something that works version 1, then v2, then v3
<geist>
i build stuff in phases, bootstrapping the next phase. you'll almost never ever see me go through a formal large design and then spend a lot of time going directly to the end, because i just dont think that way
<geist>
means i'm a total mismatch for lots of stuff at work that wants that sort of stuff
bsdbandit01 has joined #osdev
<geist>
that has worked very well for me at various work. i dont get fixated on the end goal as much as i get fixated on the *path* to the end goal, and how to properly plan a route such that at all points in the design/implementation the solution is still useful
<geist>
i personally think that's a very useful skill that's hard to teach
<doug16k>
haha, in riscv, stepping assembly source is "compiled code" and you can see what it really did by going into step by instruction :P
<doug16k>
ah the weird stepping is all the cpus flying in. oops
<doug16k>
runs and debugs though! \o/
<geist>
yay
<geist>
yo dawg i heard you like cpus so here's 8 of them at the same time
V has quit [Ping timeout: 252 seconds]
<radens>
Forgot how nice Kconfig is
mahmutov has quit [Ping timeout: 268 seconds]
bsdbandit01 has quit [Read error: Connection reset by peer]
nyah has quit [Quit: leaving]
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
qookie has quit [Ping timeout: 245 seconds]
V has joined #osdev
<geist>
doug16k: FWIW, the 68k virt machine doesn't do PCI
<geist>
what it does is a bunch of virtio mmio apertures
<geist>
so could probably get virtio-gpu working pretty quickly, though that's of course non trivial
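A rough sketch of what finding a virtio-gpu among those mmio apertures might look like. The register offsets, the "virt" magic value, and device ID 16 for the GPU come from the virtio-mmio transport; the function names, the slot size, and the slot count are hypothetical and would have to come from the machine's actual memory map. Registers are treated as little-endian, so a big-endian guest such as the 68k has to byte-swap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register offsets within one virtio-mmio aperture. */
#define VIRTIO_MMIO_MAGIC_VALUE 0x000u      /* reads 0x74726976 ("virt") */
#define VIRTIO_MMIO_VERSION     0x004u
#define VIRTIO_MMIO_DEVICE_ID   0x008u      /* 0 means the slot is empty */
#define VIRTIO_MMIO_MAGIC       0x74726976u
#define VIRTIO_ID_GPU           16u

/* Convert a little-endian register value to host byte order.
 * The swap is symmetric, so it also converts host to little-endian. */
static uint32_t le32_to_cpu(uint32_t v)
{
    const uint16_t probe = 1;
    if (*(const uint8_t *)&probe == 1)
        return v;                           /* little-endian host */
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* One 32-bit load from a register at byte offset 'off' inside a slot. */
static uint32_t reg_read(const volatile void *slot, size_t off)
{
    const volatile uint8_t *p = (const volatile uint8_t *)slot + off;
    return le32_to_cpu(*(const volatile uint32_t *)p);
}

/* Hypothetical helper: walk 'slot_count' apertures of 'slot_size' bytes
 * starting at 'base'. Stores each slot's device ID in ids_out (0 if the
 * magic is missing or the slot is empty) and returns how many slots hold
 * a device. 'base' must be 4-byte aligned. */
size_t virtio_mmio_scan(const volatile void *base, size_t slot_size,
                        size_t slot_count, uint32_t *ids_out)
{
    size_t found = 0;

    assert(base != NULL && ids_out != NULL);
    for (size_t i = 0; i < slot_count; i++) {
        const volatile uint8_t *slot =
            (const volatile uint8_t *)base + i * slot_size;
        uint32_t id = 0;

        if (reg_read(slot, VIRTIO_MMIO_MAGIC_VALUE) == VIRTIO_MMIO_MAGIC)
            id = reg_read(slot, VIRTIO_MMIO_DEVICE_ID);
        ids_out[i] = id;
        if (id != 0)
            found++;
    }
    return found;
}
```

A kernel would call this with the platform's aperture base and then hand any slot whose ID equals VIRTIO_ID_GPU to a virtio-gpu driver; the driver itself (feature negotiation, virtqueues, the GPU command set) is the non-trivial part and is not shown here.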