Gooberpatrol66 has quit [Quit: Konversation terminated!]
Gooberpatrol66 has joined #osdev
gog is now known as ghoulg
y0m0n has quit [Ping timeout: 245 seconds]
Dead_Bush_Sanpa1 has joined #osdev
Fingel has joined #osdev
Dead_Bush_Sanpai has quit [Ping timeout: 252 seconds]
Dead_Bush_Sanpa1 is now known as Dead_Bush_Sanpai
heat has quit [Ping timeout: 246 seconds]
ghoulg has quit [Quit: byee]
goliath has quit [Quit: SIGSEGV]
kline is now known as spookline
edr has quit [Quit: Leaving]
innegatives has joined #osdev
innegatives has left #osdev [WeeChat 4.4.2]
kof673 is now known as kof674
kof674 is now known as kof673
Fingel has quit [Quit: Fingel]
Fingel has joined #osdev
Fingel has quit [Client Quit]
Fingel has joined #osdev
Vercas9 has quit [Remote host closed the connection]
Vercas9 has joined #osdev
craigo has joined #osdev
Gooberpatrol66 has quit [Ping timeout: 252 seconds]
Gooberpatrol66 has joined #osdev
innegatives has joined #osdev
<innegatives>
Hey, say you write i686 emulator (without any translation magic like in qemu) in Rust, how performant would it end up being when run on a modern i9 CPU? Could you get usable Windows 2000 experience out of it just by software emulation?
<the_oz>
I think questions like that are silly
<innegatives>
the_oz: why?
<the_oz>
measuring performance of a non-thing
<innegatives>
just some ballpark
<the_oz>
it can be what you want, pixie dust and
<the_oz>
you need to drill down on how the language is implemented
<the_oz>
how performant is this calling convention via that, but even stuff like this is dependent on how you design the OS to be
<the_oz>
or, stuff is tied to the hardware, or how the resultant asm is via gcc, clang, handrolled, what even is rust's compiler?
<the_oz>
as far as names go
<innegatives>
in general though you need to execute certain amount of instructions for each emulated instruction, i want to know if these are generally excessive enough to not be feasible with modern CPUs. say you have the most optimized hand-rolled asm implementation
<innegatives>
i686s were like 1GHz
<the_oz>
that's a whole nother indirection
<the_oz>
the most performance is not emulation
<the_oz>
performant* the most performant will be Rust generation considering that's where most of the work is done
<the_oz>
but I mean, what are you really EMULATING if you care about performance
<zid`>
interpreting x86 would be fairly slow
<zid`>
just divide your clockspeed by the target clockspeed, to see how many cycles you'd have to fetch, execute, retire each instruction emulated
<zid`>
5ghz cpu targeting 1ghz -> 5 cycles
<innegatives>
zid`: but given how many generations are between i686 and 14th gen, do clock speeds map that directly?
<zid`>
but thankfully we don't have to do any of that, we have it built into our cpus to run i686 code in a hardware emulator :p
<zid`>
innegatives: ipc only matters if you're emulating exact timings, it'd just make the emulated code slower and the emulator faster
<zid`>
to do so
<zid`>
i.e the emulator would get more 'work' done so your 5 cycles would be 'worth' more, when you account for it
<innegatives>
ah, got it
<innegatives>
thanks
innegatives has quit [Quit: WeeChat 4.4.2]
<kof673>
qemu used to have "kqemu" for windows, bsd IIRC, linux surely...before "hardware virtualization" cpu features..........so there are other ways to cheat i suppose...the secret is to cheat > kqemu is a kernel module that "accelerates" QEMU virtualization by allowing guests to run some operations directly on the host's CPU.
<kof673>
but if you leave too early, no cheat for you
Fingel has quit [Quit: Fingel]
<geist>
well, that was a non answer. i think the answer is expect about 100:1 if you raw interpret
Fingel has joined #osdev
<geist>
based on past experience with this stuff
<geist>
that's like if you sort of naively do it. if you translate to something else you can start to drill down to the tens:1
<geist>
and then JITting gets you down into the single digits
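The ratios quoted above can be turned into rough numbers. The following is a minimal sketch in C, assuming a 5 GHz host and treating each slowdown ratio as host cycles spent per emulated instruction. The function names are hypothetical, and real throughput also depends on IPC, memory decoding and device emulation.

```c
#include <stdint.h>

/* Host cycles available per guest instruction if the guest should run at its
 * native clock, assuming one guest instruction per guest cycle. */
static inline uint64_t cycle_budget(uint64_t host_hz, uint64_t guest_hz)
{
    return host_hz / guest_hz;
}

/* Effective guest clock when each guest instruction costs `ratio` host cycles. */
static inline uint64_t effective_guest_hz(uint64_t host_hz, uint64_t ratio)
{
    return host_hz / ratio;
}
```

With a 5 GHz host, `cycle_budget(5000000000, 1000000000)` is 5. A 100:1 interpreter gives `effective_guest_hz(5000000000, 100)`, which is 50000000, or about 50 MHz, and 10:1 gives about 500 MHz.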
<clever>
kof673: and then there was the whole co-linux cheating
<clever>
essentially, it used a windows kernel driver to get ring0 perms, then it just context-switched the whole damn cpu to linux, directly in bare ring0
<clever>
but with some careful rules, like allocating physical memory from the windows memory manager first, and never touching IO, linux can run without disrupting things
<klys>
is there a free hyperv server?
<clever>
when colinux tries to r/w a block device, it just forwards the request back to the windows kernel, and to a normal file on ntfs
<clever>
network packets flow between a virtual nic in linux and a virtual nic in windows
<vai>
hi all morning
<zid`>
geist: It's all of course, completely pointless, because x86 has hardware support for not having to write any of this :P
<zid`>
and it's a non-answer because it's a non-question, it doesn't mention how accurate he wants to be, and that has a 1x to 100000x performance penalty range
op has joined #osdev
<geist>
well, i dunno i saw the original question as 'if i were to write an emulator could you reasonably expect to run stuff on it'
<geist>
and the answer is probably yeah
<geist>
win2k may be a little slow but probably usable
<zid`>
I'd expect, using only interp, to get a few tens of mhz, and that's only because the instruction emulation is effectively 'free' in lots of cases
<geist>
eh, depends. i wrote an emulator a while back (granted it was arm) and with not a ridiculous amount of effort got it pretty close to 10:1
<zid`>
100 cycles to fetch, decode, and implement the memory decoding
<zid`>
depends how complicated your memory map is if you're doing an interp
<geist>
was a full mmu and everything
<geist>
but. x86 is more complicated for sure, but then that was also like 15 years ago and the host cpus weren't as sophisticated
<geist>
i was pleasantly surprised at how fast it ran considering
<geist>
but of course inlined the crap out of everything, one bigass loop, etc. did some basic amount of work to make it not terrible
<zid`>
10:1 is insane
<zid`>
for an interp
<zid`>
unless the test code was doing a lot of idle looping
<geist>
i was surprised too
<zid`>
you can emulate jz -2 in 10 cycles
<geist>
again this was arm, it was simpler, and i had pulled a few tricks
<zid`>
10 cycles just isn't many cycles though
<geist>
but the tricks i pulled would put it at least within an order of magnitude of a straight interp
<zid`>
you can fetch the instruction, do a branch to some code, run a native instruction or two
<zid`>
tops
<geist>
so say 100:1 if it was more naive. that's not bad
<zid`>
if it has to then walk a tree or do a bunch of if()s for memory decoding, you need a lot more than 10 cycles
<geist>
even interpreting arm is kinda gnarly, it's not a ridiculously clean thing
<geist>
the trick that i did that really sped it up over a brute force emulator is pre-decode everything into essentially a much more regular VLIW word
<geist>
and then interpreted that. so that at the end of the day the raw dispatch was basically just a giant case statement
<geist>
but again i started with an emulator and then added that as an optimization
<zid`>
so for example on gameboy, which is super simple, per memory access, I need to check if dma is pending, switch on the address to filter off the mmio etc, check if it's within the mirrored memory range and adjust it, then actually do the write
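The per-access checks zid` lists can be sketched roughly as follows. This is a simplified illustration, not zid`'s emulator: the struct, the `mmio_write` hook and the rule that DMA blocks everything below HRAM are assumptions made for the example, and cartridge mapper writes are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

struct gb {
    uint8_t mem[0x10000];   /* flat 64 KiB backing store (simplification) */
    bool    dma_pending;    /* hypothetical flag: OAM DMA in flight */
    uint8_t last_mmio_reg;  /* low byte of the last MMIO register written */
};

/* Hypothetical MMIO hook; a real emulator would dispatch per register. */
static void mmio_write(struct gb *gb, uint16_t addr, uint8_t val)
{
    gb->last_mmio_reg = (uint8_t)(addr & 0xFF);
    gb->mem[addr] = val;
}

static void gb_write(struct gb *gb, uint16_t addr, uint8_t val)
{
    if (gb->dma_pending && addr < 0xFF80)
        return;                      /* assumption: only HRAM is reachable during DMA */
    if (addr >= 0xFF00 && addr <= 0xFF7F) {
        mmio_write(gb, addr, val);   /* filter off the I/O register range */
        return;
    }
    if (addr >= 0xE000 && addr <= 0xFDFF)
        addr -= 0x2000;              /* echo RAM mirrors 0xC000-0xDDFF */
    if (addr < 0x8000)
        return;                      /* ROM area: mapper writes ignored in this sketch */
    gb->mem[addr] = val;             /* finally, the actual write */
}
```

Even in this reduced form, every store passes through several compares before touching memory, which is where the cycle budget goes.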
<geist>
and it sped it up a lot, an order of magnitude or so, and got it within 10:1
<zid`>
yea an interp basically means "giant switch"
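A rough sketch of the pre-decode idea geist describes: irregular guest encodings are translated once into fixed-size records, and the hot loop is just a switch over those records. The toy instruction set, struct layout and names here are hypothetical and far smaller than ARM or x86.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pre-decoded form: one regular record per guest instruction. */
enum op { OP_ADD_IMM, OP_SUB_IMM, OP_JNZ, OP_HALT };

struct decoded {
    enum op op;
    uint8_t reg;   /* register index, already extracted from the encoding */
    int32_t imm;   /* immediate, or branch target as an index into the array */
};

struct cpu {
    uint32_t regs[4];
    size_t   pc;   /* index into the decoded array, not a guest address */
};

/* Dispatch loop: one switch per instruction. Returns instructions executed. */
static uint64_t run(struct cpu *cpu, const struct decoded *code, size_t count)
{
    uint64_t executed = 0;
    while (cpu->pc < count) {
        const struct decoded *d = &code[cpu->pc++];
        executed++;
        switch (d->op) {
        case OP_ADD_IMM: cpu->regs[d->reg & 3u] += (uint32_t)d->imm; break;
        case OP_SUB_IMM: cpu->regs[d->reg & 3u] -= (uint32_t)d->imm; break;
        case OP_JNZ:
            if (cpu->regs[d->reg & 3u] != 0)
                cpu->pc = (size_t)d->imm;
            break;
        case OP_HALT:
            return executed;
        }
    }
    return executed;
}
```

Because decoding happened up front, each iteration does one load of a regular record and one indirect branch, which is the part a JIT would later remove.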
<geist>
anyway so even if it was like 1000:1, a modern core i9 is probably about 1000 times faster than something that could run win2k
nitrix has joined #osdev
<zid`>
yea, I'd run w2k on a few tens of mhz
<zid`>
it wouldn't be *fun* but it'd run
<geist>
it might not be toooo fast, but it would run
op has left #osdev [#osdev]
<geist>
but you're right, dealing with device emulation really puts a kink in things
<zid`>
yea if you can get away with full fastmem, you can sustain 10:1
<geist>
though i'd hope that the vast majority of stuff is memory
<zid`>
(fastmem is an emulation term where you just do mmap(4GB) and do as little memory decoding as you can)
<geist>
yah that was some of the tricks i had in the emulator. the mmu TLB held direct pointers into the memory bank and whatnot
<geist>
so the translation from the TLB -> location in memory was really only one level
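One way to picture a software TLB that holds direct host pointers, as geist describes, is the sketch below. The entry layout, the 4 KiB page size, the direct-mapped 64-entry table and the function names are assumptions for illustration, not details of geist's emulator.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_MASK  0xFFFu
#define TLB_SIZE   64u             /* hypothetical size, power of two */

struct tlb_entry {
    uint32_t vpage;                /* guest virtual page number */
    uint8_t *host_page;            /* host pointer to that page, NULL if invalid */
};

/* Hit: return a host pointer for the guest address in one step.
 * Miss: return NULL; the caller walks the guest page tables and refills. */
static inline uint8_t *tlb_lookup(struct tlb_entry *tlb, uint32_t vaddr)
{
    uint32_t vpage = vaddr >> PAGE_SHIFT;
    struct tlb_entry *e = &tlb[vpage & (TLB_SIZE - 1)];
    if (e->host_page != NULL && e->vpage == vpage)
        return e->host_page + (vaddr & PAGE_MASK);
    return NULL;
}

/* Install a translation after a successful (slow) page table walk. */
static inline void tlb_fill(struct tlb_entry *tlb, uint32_t vaddr, uint8_t *host_page)
{
    struct tlb_entry *e = &tlb[(vaddr >> PAGE_SHIFT) & (TLB_SIZE - 1)];
    e->vpage = vaddr >> PAGE_SHIFT;
    e->host_page = host_page;
}
```

On a hit the guest address becomes a host pointer with one compare and one add, so no second lookup into a memory bank table is needed.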
<zid`>
if you can get away with if(addr < 2B) fastpath();
<zid`>
as your entire fastpath for memory access, that helps a *lot*
<zid`>
2GB*
vdamewood has joined #osdev
<zid`>
you might actually keep to your 10 cycle budget there
<zid`>
because it just turns into a jump if signed to the slowpath, which gets predicted well too
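The single-compare fast path zid` describes might look like the following sketch. It assumes `ram` points at a host reservation covering the whole 2 GiB fast range (for example one obtained with mmap); `slow_read8` and the struct are hypothetical stand-ins for the MMIO path.

```c
#include <stdint.h>

#define FAST_LIMIT (UINT32_C(1) << 31)   /* 2 GiB: below this is plain RAM */

struct guest_mem {
    uint8_t *ram;        /* must back the entire fast range, e.g. an mmap reservation */
    uint32_t slow_hits;  /* counts slow-path accesses, for illustration only */
};

/* Hypothetical slow path: MMIO, mirrors, unmapped regions and so on. */
static uint8_t slow_read8(struct guest_mem *m, uint32_t addr)
{
    (void)addr;
    m->slow_hits++;
    return 0xFF;         /* placeholder value for anything not handled here */
}

static inline uint8_t read8(struct guest_mem *m, uint32_t addr)
{
    if (addr < FAST_LIMIT)        /* equivalent to testing the top bit of addr */
        return m->ram[addr];
    return slow_read8(m, addr);
}
```

Since `addr < 0x80000000` is the same as "top bit clear", a compiler can implement the check as a sign test and a rarely taken branch, which matches the prediction-friendly behaviour mentioned above.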
<geist>
at the time (about 2005-2006) there was someone here that was writing a jitting emulator
<geist>
and he was outperforming mine by quite a bit, but then it was a jitter
<geist>
was in the single digit:1 range
<geist>
i forgot his name, he was a smart guy. i learned a lot about emulators from him
<zid`>
yea they're surprisingly not as fast as you like, if you're going for accuracy, that's only a 2x or 4x speedup vs an interp
<zid`>
like, even if the JIT is as good as say, gcc -O3
<zid`>
gameboy is just unjittable, it's *slower* to JIT it
vampiredamewood has quit [Ping timeout: 248 seconds]
<zid`>
because in that era, pre-cache, people *relied* on instruction timings
<zid`>
so all you gain by the jit is removing the switch(instr), the vast bulk of the code still gets ran, slowly walking through each instruction running it in pieces and updating all the timers etc
<geist>
yeah totally
<zid`>
Actually, there is a gameboy jit
<zid`>
but it exists because the target platform didn't have enough ram to hold the emulator, the rom, and the ram, all at the same time
<zid`>
so it just.. doesn't have an emulator, it jits the code to native code, combining the 'emulator' and 'rom' part into one
<clever>
related, there is static recompilation
<zid`>
yea, basically what it did
<zid`>
it blurred the lines between static recompiling and dynarecing
kfv has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
kfv has joined #osdev
aejsmith has quit [Quit: Lost terminal]
aejsmith has joined #osdev
Burgundy has joined #osdev
bessieTheBoy has joined #osdev
kfv has quit [Remote host closed the connection]
kfv has joined #osdev
<bessieTheBoy>
I have been having issues with my idt and have quite literally been staring at the function for 30 minutes. Below is my code and some runtime values provided by gdb:
<bessieTheBoy>
this still doesn't seem to work. the target is:
<bessieTheBoy>
0x0008:0x05a005a0
<bessieTheBoy>
which is an issue obviously
<bessieTheBoy>
when it should be:
<bessieTheBoy>
0x0008:0x2005A0
<zid`>
so post a godbolt that demonstrates the issue
<zid`>
I can't fix code I can't see, and I did my utmost to describe what was wrong with your previous code, that you didn't understand type promotion / operator precedence
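The code in question is not shown in the log, so the following is only a sketch of how a gate's handler address is usually split, assuming a 32-bit protected-mode IDT with 8-byte gates. The struct and function names are hypothetical. A target of 0x05a005a0 would be consistent with the low 16 bits ending up in both halves, but that is a guess at the symptom, not a diagnosis.

```c
#include <stdint.h>

/* 32-bit protected-mode interrupt gate (8 bytes). */
struct idt_gate32 {
    uint16_t offset_low;    /* handler address bits 15:0 */
    uint16_t selector;      /* code segment selector, e.g. 0x0008 */
    uint8_t  zero;          /* reserved, must be 0 */
    uint8_t  type_attr;     /* e.g. 0x8E: present, DPL 0, 32-bit interrupt gate */
    uint16_t offset_high;   /* handler address bits 31:16 */
} __attribute__((packed));

static inline struct idt_gate32 make_gate(uint32_t handler, uint16_t selector,
                                          uint8_t type_attr)
{
    struct idt_gate32 g;
    g.offset_low  = (uint16_t)(handler & 0xFFFFu);
    g.selector    = selector;
    g.zero        = 0;
    g.type_attr   = type_attr;
    g.offset_high = (uint16_t)((handler >> 16) & 0xFFFFu);  /* shift first, then mask */
    return g;
}

/* Reassemble the target the CPU would jump to, for checking in a debugger. */
static inline uint32_t gate_target(const struct idt_gate32 *g)
{
    return ((uint32_t)g->offset_high << 16) | g->offset_low;
}
```

For a handler at 0x2005A0 this gives `offset_low == 0x05A0` and `offset_high == 0x0020`, and `gate_target` returns 0x2005A0 again. Parenthesising the shift before the mask keeps the expression independent of operator precedence.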
<zid`>
yea and I'm not american, I can still speak the language
<zid`>
portugoose (simplified)
cow321 has joined #osdev
<heat>
portugoose (simplified) is brazilian geese
<zid`>
portugoose (advanced) then
<zid`>
I guess yours would be nóóóóóóóóóós?
<heat>
yeah
<Ermine>
portugoose (traditional)
xenos1984 has quit [Ping timeout: 264 seconds]
hwpplayer1 has joined #osdev
xenos1984 has joined #osdev
Arthuria has quit [Ping timeout: 272 seconds]
fedaykin has quit [Quit: leaving]
fedaykin has joined #osdev
Gooberpatrol66 has quit [Read error: Connection reset by peer]
Gooberpatrol66 has joined #osdev
hwpplayer1 has quit [Quit: I'll be back]
Dead_Bush_Sanpai has joined #osdev
goliath has quit [Quit: SIGSEGV]
pjals has joined #osdev
<pjals>
Hello. Is there a libc without functions that wrap syscalls? Just basic things like strlen, strcat, etc. I know I can implement them myself but there are optimizations people much smarter than me can do to implement such functions. (or maybe not, C compilers *are* pretty good at optimizing)
craigo has quit [Quit: Leaving]
<zid`>
Could just steal one then?
<zid`>
if all you want is strlen, memcpy, memcmp, etc
<zid`>
go look at the freebsd libc source or something
<heat>
or musl
<zid`>
heat so mean
<zid`>
suggesting musl's code to someone is a warcrime
xenos1984 has quit [Ping timeout: 248 seconds]
<pjals>
Yea, I guess. I don't really like borrowing code because I come from a package """maintainer""" background where seeing an extern folder is pain.
<bslsk05>
libc.llvm.org: String Functions — The LLVM C Library
bauen1 has quit [Ping timeout: 252 seconds]
<zid`>
extern folder? pfft
<zid`>
I meant steal.
<Ermine>
gcc builtin strops when
<pjals>
Doesn't it inline strlen if you don't include the header?
xenos1984 has joined #osdev
<zid`>
nothing to do with the header, the header just gives a declaration so that the compiler can do type checks
<zid`>
and use the right abi
<zid`>
but gcc will silently replace external calls to certain string/memory functions with builtins, yes
<zid`>
i.e it might decide it's faster to drop a 'rep movsb' for a smallish copy rather than farming out through a function pointer to a .so
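For the "just the string functions" case pjals asks about, a syscall-free subset is small enough to sketch directly. The `k_` prefix is a hypothetical naming choice to avoid clashing with a hosted libc. Note that GCC expects `memcpy`, `memmove`, `memset` and `memcmp` to exist even in freestanding builds, so a kernel would normally also provide them under their standard names.

```c
#include <stddef.h>

/* Length of a NUL-terminated string, excluding the terminator. */
size_t k_strlen(const char *s)
{
    const char *p = s;
    while (*p)
        p++;
    return (size_t)(p - s);
}

/* Byte-wise copy of n bytes; the regions must not overlap. */
void *k_memcpy(void *restrict dst, const void *restrict src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Compare n bytes as unsigned char; <0, 0 or >0 like memcmp. */
int k_memcmp(const void *a, const void *b, size_t n)
{
    const unsigned char *x = a;
    const unsigned char *y = b;
    for (size_t i = 0; i < n; i++) {
        if (x[i] != y[i])
            return (int)x[i] - (int)y[i];
    }
    return 0;
}
```

These are deliberately naive loops. Tuned versions (word-at-a-time, SIMD, `rep movsb`) are what the libc sources mentioned above provide.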
jedesa has joined #osdev
heat_ has joined #osdev
heat has quit [Ping timeout: 252 seconds]
heat_ has quit [Remote host closed the connection]
heat_ has joined #osdev
goliath has joined #osdev
pjals has quit [Quit: Lost terminal]
xal has quit [Quit: bye]
xal has joined #osdev
strategictravele has joined #osdev
strategictravele has quit [Client Quit]
strategictravele has joined #osdev
ghoulg has quit [Quit: byee]
Fingel has quit [Quit: Fingel]
Fingel has joined #osdev
gog has joined #osdev
vdamewood has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
theruran has quit [Quit: Connection closed for inactivity]
npc has quit [Remote host closed the connection]
vdamewood has joined #osdev
vdamewood is now known as vampiredamewood
cow321 has quit [Remote host closed the connection]
cow321 has joined #osdev
strategictravele has quit [Quit: strategictravele]
GeDaMo has quit [Quit: 0wt 0f v0w3ls.]
goliath has quit [Quit: SIGSEGV]
goliath has joined #osdev
karenw has joined #osdev
* karenw
spends a couple of hours fiddling with assembly code, can't find a solution. Logs in a few days later and solution just pops into her head and works first time.
<karenw>
Why is my brain like this.
d5k has joined #osdev
youcai has quit [Ping timeout: 260 seconds]
<dormito>
thats just how brains work
<dormito>
anyone know of documentation on the on-disk layout of "sun disk labels"? Oracle is very very good at making sure searches related to sun things are useless
d5k has quit [Ping timeout: 248 seconds]
youcai has joined #osdev
<Ermine>
karenw: that's totally fine
<the_oz>
karenw, I call it the backburner effect
<the_oz>
not thinking about a problem merely shifts the attention phase, it's not NOT working on the problem. There's still work being done, like when you go to the bathroom and POP there's the answer, or during sleep or phase down to/from sleep, it's applying filters and reorganization, all without conscious thought, in fact it works better that way
<karenw>
It's not super important, but I was annoyed that after I reloaded segment selectors I did a lret directly to the address of another ret. And then I come back today and am like "Oh, I just pop the old address off the stack into a scratch register duh"
<karenw>
And then push it back after the selector so I lret straight out of the function.
<the_oz>
don't be annoyed, your brain did figure it out
<the_oz>
:)
<karenw>
Why couldn't it work it out the other day though! Stupid thing.
<the_oz>
left brain yells at right brain
<the_oz>
who helped via hippocampus
<karenw>
Half my brain is literally trying to sabotage me though (neurological problems, not a metaphor)
<netbsduser`>
only two days till my copy of aix/6000 internals arrives
<netbsduser`>
disappointed to learn it's about an older version than i thought (3.2) but what can one do
gog is now known as ghoulg
<nikolar>
i got a copy of the pdp-11/70 processor handbook
<nikolar>
pretty cool :)
EmanueleDavalli has quit [Quit: Leaving]
<dormito>
karenw: thanks
<dormito>
I was/am assuming they are different due to this sentence from fdisk(8): "It understands GPT, MBR, Sun, SGI and BSD partition tables."
<dormito>
but yeah, I guess the source is the only place to look
<karenw>
sun-pt.h in util-linux seems to have what you are after
<karenw>
And I'm guessing they are both based on some classic UNIX format and diverged at some point in history.
<karenw>
pt-sun.h, gah, thanks for that spoonerism
<dormito>
yeah, wouldn't be surprised if they are nearly identical
hwpplayer1 has joined #osdev
<dostoyevsky2>
netbsduser`: that's why I lost interest in the aix book, as it's like twenty years old
Gooberpatrol_66 has joined #osdev
Gooberpatrol66 has quit [Ping timeout: 276 seconds]
<karenw>
`__asm volatile("rdmsr" : "=d"(valueh), "=a"(valuel) : "c"(msr));` What's wrong with this for rdmsr? I seem to be invoking UB because clang then just does `xor %eax, %eax; ret` instead of the maths to convert edx/eax into a 64-bit value
<karenw>
Or maybe it's the next line? `return (((uint64_t)valueh) << 32) & valuel;`
<ghoulg>
karenw: there's another constraint for that
<ghoulg>
=A
<karenw>
I thought that only worked with 16bit returns in ax:dx. I'll give it shot!
<ghoulg>
but yeah doing it the =d =a way you need to shift and bitwise or
<karenw>
I'm doing & not | so it is (correctly) optimizing that to 0. Doh.
<karenw>
And =A does not work, it matches what the manual says and just places it in ax or dx.
<karenw>
*gcc manual
<ghoulg>
ah ok
<ghoulg>
my bad
<Mondenkind>
karenw: you want | not &
<Mondenkind>
oh you said that already
<ghoulg>
sniped
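Putting the fix from this exchange together: the two 32-bit halves have to be combined with `|`, because `(high << 32) & low` has no bits in common and is always 0. A sketch follows, with the combining step split into a hypothetical helper `combine_msr` so it can be checked outside ring 0; `rdmsr` itself is privileged and faults in user mode.

```c
#include <stdint.h>

/* Join EDX:EAX into one 64-bit value. The cast must happen before the shift,
 * and the halves are OR-ed, not AND-ed. */
static inline uint64_t combine_msr(uint32_t high, uint32_t low)
{
    return ((uint64_t)high << 32) | low;
}

#if defined(__x86_64__) || defined(__i386__)
/* Read a model-specific register. Only valid at CPL 0. */
static inline uint64_t rdmsr(uint32_t msr)
{
    uint32_t low, high;
    __asm__ volatile("rdmsr" : "=a"(low), "=d"(high) : "c"(msr));
    return combine_msr(high, low);
}
#endif
```

With the original `&`, the compiler is entitled to fold the whole expression to 0, which matches the `xor %eax, %eax; ret` karenw saw.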
hwpplayer1 has quit [Quit: I'll be back]
<karenw>
Woo, I finally got a lapic interrupt to fire. It correctly panics in generic_interrupt_handler as I don't have a handler set up.
<karenw>
Now to get the IOAPIC unmasked correctly and we might finally have access to real (well, qemu virtualised) hardware.
chibill has quit [Quit: ZNC 1.9.0+deb2build3 - https://znc.in]
<karenw>
TIL: mov foo,%di does not clear the high bits of %di (but moving to %edi does)
<karenw>
err, high bits of %rdi
<Mondenkind>
what happens to you when you're an architecture and you get built up haphazardly over decades of uarch evolution
heat_ has quit [Ping timeout: 246 seconds]
<karenw>
Hmm, I tried to set the lapic timer in repeat mode, but qemu (tcg) just hangs. Trying to work out what I missed. IDT gate is type 0xE which should block nested interrupts. Returning from the ISR should return IF back to how it was. I'm not sure if I should write to EOI, I think I should? But doing so doesn't help.
heat has joined #osdev
<karenw>
I assume I should always write to EOI unless it's an exception (vector < 32) or spurious (vector 255)?
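The rule karenw proposes can be written down as a small predicate. This is a sketch under her stated assumptions (exceptions below vector 32, spurious vector programmed as 255), plus the assumption of xAPIC MMIO mode, where the EOI register sits at offset 0xB0 from the LAPIC base. The function names are hypothetical, and delivery modes such as NMI are not covered.

```c
#include <stdbool.h>
#include <stdint.h>

#define LAPIC_EOI_OFFSET 0xB0u   /* xAPIC EOI register, byte offset from LAPIC base */

/* Should the handler for `vector` signal end-of-interrupt to the local APIC? */
static inline bool vector_needs_eoi(uint8_t vector, uint8_t spurious_vector)
{
    if (vector < 32)
        return false;            /* CPU exceptions are not delivered by the LAPIC */
    if (vector == spurious_vector)
        return false;            /* spurious interrupts are returned from without EOI */
    return true;
}

/* Signal EOI by writing 0 to the EOI register. */
static inline void lapic_eoi(volatile uint32_t *lapic_base)
{
    lapic_base[LAPIC_EOI_OFFSET / 4] = 0;
}
```

A generic handler would call `lapic_eoi` only when `vector_needs_eoi(vector, 255)` is true. For a periodic LAPIC timer, a missing EOI is one plausible reason for seeing only a single tick, though the hang described above may have another cause.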
<johnjaye>
what is the best textbook for os stuff i can buy for less than $100?
netbsduser` has quit [Ping timeout: 246 seconds]
<karenw>
Mondenkind: It caused some funky behaviour in C. Because it was being treated as garbage by a switch(uint8_t) statement as the value was 0xA80FE. But then it would hit the `default` case and print the value as `0xFE` and I initially couldn't work out why 0xFE wasn't hitting `case 0xFE`...
chibill has joined #osdev
<karenw>
Because mov $0xFE,%di then jumping into C resulted in a uint8_t with a value of 0xA80FE. The result of UB is UB.
<geist>
yeah that's just an implicit thing about the x86-64 extension
<Mondenkind>
oh, funny
<Mondenkind>
(i assume you mean dil? or a800fe?)
<geist>
which makes sense when you think about it, because it keeps the contract the same with 32bit stuff
Fingel has quit [Quit: Fingel]
<Mondenkind>
geist: no you would have gotten that anyway
<Mondenkind>
the point was to improve performance
Fingel has joined #osdev
<Mondenkind>
with the benefit of hindsight, you would only have 64-bit regs, and if you were to not have 64-bit regs, you'd have the zeroing in all cases. as is you can't get the zeroing in all cases without breaking compat, but you can at least get it in _one_ case where there is no compat to keep
<karenw>
0xA800FE sounds right yeah. Was 100% a `mov ,%di` at fault.
<karenw>
Having a uint8_t with a >0xFF value is pretty cursed.
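The register behaviour behind this bug can be modelled in plain C. The sketch below treats a 64-bit register as a `uint64_t` and mimics x86-64 partial writes: 16-bit writes leave the upper bits alone, while 32-bit writes zero bits 63:32. The helper names are hypothetical, and the starting value 0xA80000 is chosen only to reproduce the 0xA800FE seen above.

```c
#include <stdint.h>

/* mov val,%di: only bits 15:0 change, bits 63:16 keep their old contents. */
static inline uint64_t write16(uint64_t reg, uint16_t val)
{
    return (reg & ~UINT64_C(0xFFFF)) | val;
}

/* mov val,%edi: bits 31:0 are written and bits 63:32 are zeroed. */
static inline uint64_t write32(uint64_t reg, uint32_t val)
{
    (void)reg;                   /* the old contents do not survive */
    return val;
}

/* Defensive truncation for a C function called from hand-written assembly:
 * take the full register and mask, instead of trusting a uint8_t parameter. */
static inline uint8_t vector_from_raw(uint64_t raw)
{
    return (uint8_t)(raw & 0xFF);
}
```

`write16(0xA80000, 0xFE)` yields 0xA800FE, while `write32` on the same register yields 0xFE. Taking the argument as a full-width integer and masking it, as in `vector_from_raw`, sidesteps the question of who is responsible for clearing the upper bits.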
chibill has quit [Quit: ZNC 1.9.0+deb2build3 - https://znc.in]
<geist>
and yeah that's part of the ABI as well. where it's called out who is responsible for zeroing out the top of registers when it's a short thing
chibill has joined #osdev
<Mondenkind>
does inline asm constraints count as ad hoc abi? in this essay i will deconstruct the socio-political phenomenon of 'abi' as neocolonialism driven by the same powers that forced instruction set architectures onto us
<karenw>
Once I managed to write a kprint in such a way it didn't clear the high bits for me, and it printed a value >0xFF, it was obvious. But depending on exactly how and where kprint was invoked changes the assembly output and could end up correctly truncating it.
<karenw>
Mondenkind: This wasn't inline asm, this was a .s file containing a legit function, so ABI is the correct term to use.
<Mondenkind>
STILL
<Ermine>
the what
<Ermine>
As they say, great engineers see the root of their problems in the society?
<Mondenkind>
oh i'm not an engineer anymore. hate computers. i just stick around for the memes
<Mondenkind>
was not at all joking though i just don't care enough to actually go and write the essay :P
<kof673>
he who controls the abi spice controls the universe?
<kof673>
it would be nice if you could get a spice in there