epony has quit [Remote host closed the connection]
epony has joined #osdev
dzwdz is now known as [lostTheGame]
[lostTheGame] is now known as AHHiLostTheGame
AHHiLostTheGame is now known as AAAAiLostTheGame
<dormito>
Hmmm. The wiki's interrupts page recently had an edit removing a comment claiming real mode was obsolete (the edit claimed it was removing 'opinion'). It is my opinion that this is a bad move. Last I heard Intel is threatening to remove real mode from their hardware (maybe they have even done so, I haven't been checking their chips). While I wouldn't say people should not be able to write an OS in
<dormito>
real mode (which would be insane), we do have a real mode warning page for a reason (though it does seem disproportionately focused on the BIOS interface)
<gog>
obsolete is a relative thing
<gog>
legacy is a better term
<heat>
gog
<heat>
you're legacy
<gog>
heat
<gog>
i have a legacy mode
<gog>
i use it when i need to traverse airport security
<heat>
gog32
<heat>
Gog on Gog 64
<gog>
yes
<heat>
i went to the gym yesterday and now my muscles are owie ouch
<gog>
i shoveled the snow and chopped at the ice today
<gog>
that was a workout
<heat>
iceland moment
<gog>
yes'
<heat>
yes's
<dormito>
maybe it should be replaced with a small warning about not using protected mode (and a link to the real mode OS warning page)
<heat>
protected mode is also legacy
<dormito>
long mode is an extension of protected mode :p
<heat>
well, i'd say long mode is separate, and 32-bit compat is an extension of long mode
<heat>
long mode doesn't mean 64-bit anyway
<dormito>
IIRC you set an additional bit, and you can't NOT enable the protected mode bit.
<heat>
you have to be in 32-bit, but you don't have to be in 64-bit
<heat>
and the GDT, IDT work differently
<heat>
(so does paging)
<dormito>
well, IIRC long mode requires paging, whereas plain protected mode does not
gildasio has quit [Remote host closed the connection]
gildasio has joined #osdev
Gooberpatrol66 has quit [Ping timeout: 245 seconds]
AAAAiLostTheGame is now known as dzwdz
<moon-child>
gog subsystem for gog
navi has quit [Quit: WeeChat 4.0.4]
<moon-child>
gog is not gog
heat has quit [Read error: Connection reset by peer]
heat_ has joined #osdev
stazthebox has joined #osdev
zid has quit [Ping timeout: 245 seconds]
xenos1984 has quit [Read error: Connection reset by peer]
heat_ is now known as heat
zid has joined #osdev
<heat>
no idea if this is common but i've just used a fun technique to debug a problem where i devised some sort of improvised journal with an improvised ring buffer, in order to easily trace certain points in the code post-mortem
<heat>
and it worked well
<heat>
like literally just "struct journal_entry[1024]; u32 pos;"
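A minimal sketch of that kind of improvised trace journal, fleshing out the snippet above. The names (journal_entry, journal_log, the fields) are made up for illustration, and it assumes a single writer; multiple CPUs would want an atomic increment of the index instead.

    #include <stdint.h>

    /* Fixed-size ring of trace records, inspected post-mortem from a core
     * dump or debugger. Old entries are simply overwritten. */
    struct journal_entry {
        const char *point;      /* which code path was hit */
        uint64_t arg0, arg1;    /* whatever state is worth recording */
    };

    static struct journal_entry journal[1024];
    static uint32_t journal_pos;

    static void journal_log(const char *point, uint64_t a0, uint64_t a1)
    {
        struct journal_entry *e = &journal[journal_pos++ & 1023];
        e->point = point;
        e->arg0 = a0;
        e->arg1 = a1;
    }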
Gooberpatrol66 has joined #osdev
heat_ has joined #osdev
heat has quit [Read error: Connection reset by peer]
experemental has quit [Ping timeout: 255 seconds]
epony has quit [Remote host closed the connection]
epony has joined #osdev
<moon-child>
yeah that's not uncommon I think
epony has quit [Remote host closed the connection]
xenos1984 has joined #osdev
epony has joined #osdev
epony has quit [Remote host closed the connection]
jbowen has quit [Excess Flood]
jbowen has joined #osdev
epony has joined #osdev
heat_ has quit [Remote host closed the connection]
heat has joined #osdev
goliath has quit [Quit: SIGSEGV]
heat_ has joined #osdev
heat has quit [Read error: Connection reset by peer]
Left_Turn has quit [Read error: Connection reset by peer]
elderK has joined #osdev
dude12312414 has joined #osdev
dude12312414 has quit [Remote host closed the connection]
heat_ has quit [Ping timeout: 276 seconds]
heat has joined #osdev
gog has quit [Ping timeout: 276 seconds]
gbowne1 has quit [Remote host closed the connection]
gbowne1 has joined #osdev
heat has quit [Ping timeout: 240 seconds]
kfv has joined #osdev
gbowne1 has quit [Remote host closed the connection]
kfv has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
experemental has joined #osdev
kfv has joined #osdev
exit70 has quit [Quit: ZNC 1.8.2+deb2build5 - https://znc.in]
exit70 has joined #osdev
kfv has quit [Read error: Connection reset by peer]
kfv has joined #osdev
kfv has quit [Remote host closed the connection]
kfv has joined #osdev
GeDaMo has joined #osdev
kfv has quit [Read error: Connection reset by peer]
kfv has joined #osdev
kfv has quit [Read error: Connection reset by peer]
kfv has joined #osdev
kfv has quit [Remote host closed the connection]
kfv has joined #osdev
[_] has joined #osdev
kfv has quit [Client Quit]
[itchyjunk] has quit [Ping timeout: 260 seconds]
drakonis has quit [Quit: WeeChat 3.6]
kfv has joined #osdev
zid has quit [Ping timeout: 245 seconds]
Left_Turn has joined #osdev
<Ermine>
heat: fwiw on my phone CONFIG_VIRTUALIZATION is not set, so there's no KVM
Cindy is now known as bnchs
<kazinsal>
could be a build config problem. I spent a solid half hour today fiddling with ffmpeg because the ubuntu build defaults didn't have a specific late 80s container format
zid has joined #osdev
<Ermine>
Also the kernel is built with clang 8
epony has quit [Remote host closed the connection]
epony has joined #osdev
elderK has quit [Quit: Connection closed for inactivity]
goliath has joined #osdev
navi has joined #osdev
gog has joined #osdev
epony has quit [Remote host closed the connection]
epony has joined #osdev
sbalmos has quit [Ping timeout: 245 seconds]
sbalmos has joined #osdev
bitoff has quit [Remote host closed the connection]
bitoff has joined #osdev
epony has quit [Remote host closed the connection]
Turn_Left has joined #osdev
epony has joined #osdev
Left_Turn has quit [Ping timeout: 260 seconds]
Arthuria has joined #osdev
elderK has joined #osdev
Arthuria has quit [Remote host closed the connection]
<immibis>
dormito: it's wikipedia, so you need a reputable source saying real mode is obsolete
<immibis>
"wikipedia is not the place for original research"
heat has joined #osdev
<heat>
Ermine, yeah that's not surprising
dude12312414 has joined #osdev
Arthuria has joined #osdev
dude12312414 has quit [Remote host closed the connection]
Arthuria has quit [Remote host closed the connection]
larsjel has joined #osdev
kfv has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
freakazoid332 has quit [Read error: Connection reset by peer]
frkazoid333 has joined #osdev
elderK has quit [Quit: Connection closed for inactivity]
<Ermine>
heat: old clang or no virtualization?
<heat>
yes :)
<heat>
clang 8 isn't that old
<heat>
i'm spending more time doing brain surgery on my kernel's mistakes than writing code
<heat>
this is annoying :/
<gog>
UEFI
<heat>
you want to annoy me further?
<heat>
smh
<gog>
yes
<gog>
being annoying is my job
<gog>
i take it very seriously
experemental has quit [Ping timeout: 245 seconds]
<heat>
i'm proud of you gog
<heat>
you're a true professional
Loxsin has joined #osdev
<Loxsin>
Hello, they are saying that if you want to have a modern OS, you have to lock very granularly. But at what point does granular become too granular?
<heat>
that's a good question
<heat>
but ideally you want to have no locks
<heat>
like a "struct mutex big_memory_management_lock;" and having all memory management routines lock it is awful
<heat>
"struct process_address_space { struct mutex lock; /* ... */ }; struct memory_allocator { struct mutex lock; /* ... */ };" is better, but still bad, but it really depends on your workload and the contention you're realistically getting
<heat>
and how much you care, ofc
<Loxsin>
In my virtual memory subsystem, I have one SpinLock per process for page table manipulation, one RWLock to protect the splay tree of memory segments mapped in the process, and every memory segment has a Mutex to protect its offset-to-page associations.
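A rough sketch of the locking layout being described, one lock per level; all the type and field names here are illustrative, not taken from any real kernel:

    struct vm_segment {
        struct mutex pages_lock;       /* guards the offset -> page associations */
        /* ... backing object, page lookup structure ... */
    };

    struct address_space {
        struct spinlock pgtable_lock;  /* page table manipulation, one per process */
        struct rwlock   segments_lock; /* guards the tree of mapped vm_segments */
        /* ... splay tree of struct vm_segment ... */
    };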
<heat>
splay-tree? why?
<heat>
anyway yeah that sounds... ok
<heat>
it will not scale
<heat>
but it's better than a giant lock
<Loxsin>
I picked the splay tree because I thought: "Page faults will often cluster together within the same segment."
<heat>
can you even use a splay tree with a rwlock? i don't think so
<heat>
because it'll rebalance on lookups too
Celelibi has quit [Ping timeout: 260 seconds]
<heat>
right?
<Loxsin>
Yes, so if it can't immediately upgrade to a write-lock, it won't splay on lookup.
xenos1984 has quit [Ping timeout: 276 seconds]
<Loxsin>
So it is not a true splay-tree. It is a best-effort one, if there isn't contention.
<heat>
ah, ok, that works i guess
<heat>
but splay is really a weird choice, because splay just isn't that good of a tree
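A sketch of the "best-effort splay" Loxsin describes: lookups hold the rwlock shared and only splay when the write lock can be taken without blocking. The primitives (rw_read_lock, rw_try_upgrade, rw_downgrade, splay_find_no_splay, splay_to_root) and the segments field are placeholders continuing the illustrative structs above, not a real API.

    struct vm_segment *segment_lookup(struct address_space *as, unsigned long addr)
    {
        struct vm_segment *seg;

        rw_read_lock(&as->segments_lock);
        seg = splay_find_no_splay(&as->segments, addr);   /* plain BST search */
        if (seg && rw_try_upgrade(&as->segments_lock)) {
            splay_to_root(&as->segments, seg);            /* rebalance only if uncontended */
            rw_downgrade(&as->segments_lock);
        }
        rw_read_unlock(&as->segments_lock);
        return seg;   /* the segment's own lock/refcount takes over from here */
    }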
<gog>
splay, queen
<heat>
i would be surprised if that design is faster than a good old rb tree
<gog>
yaaaas
xenos1984 has joined #osdev
<heat>
but yeah your design is... fine, it just won't scale at all
<Loxsin>
I liked the splay tree because it was the first tree structure I implemented when I was a student. But I think a radix tree would be a better solution. Moreover, the radix tree brings with it the possibility of fine-grained locking.
<heat>
no, a radix tree is a bad idea
<heat>
(here)
Celelibi has joined #osdev
<heat>
say, for a 1024 * PAGE_SIZE mapped segment, you'll need 1024 entries pointing at the same segment struct
<heat>
it's immensely wasteful
<heat>
linux has this thing called a maple tree, it's a modified btree that can do RCU and as such doesn't need any sort of locks for lookups
<heat>
as such, that part scales really well nowadays on linux
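To illustrate the wastefulness point: a range-keyed node (what an rb-tree, btree, or maple-tree-style structure stores) describes the whole mapping in one entry, whereas a page-granular radix tree needs a slot per page all pointing at the same struct. Field names here are illustrative, continuing the vm_segment sketch above.

    struct vm_segment {
        unsigned long start;   /* page aligned, inclusive */
        unsigned long end;     /* exclusive: a 1024-page mapping is still one node */
        /* ... tree linkage, protection, backing object ... */
    };

    static int segment_contains(const struct vm_segment *seg, unsigned long addr)
    {
        return addr >= seg->start && addr < seg->end;
    }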
<Loxsin>
I thought about trying one of the unusual RWLock designs that heavily penalise writers, because I couldn't think of a good way to do any kind of lock-free management of the segment mapping tree.
<heat>
binary trees can't really be lock-free
<Loxsin>
Even setting the data structure aside, I wasn't smart enough to figure out how to avoid ending up with inconsistent state if the mapping state changed, for example, mid-page-fault.
<heat>
yeah, the problem is hard. i haven't bothered with locklessness there, yet
<heat>
it's tricky enough that being lockless there is fairly new in linux, and had a few bugs
Turn_Left has quit [Read error: Connection reset by peer]
Left_Turn has joined #osdev
vdamewood has joined #osdev
DanielNechtan is now known as bombuzal
xenos1984 has quit [Ping timeout: 256 seconds]
<Loxsin>
How unscalable do you think my current locking system will be?
vinleod has joined #osdev
<Loxsin>
I think, excluding programs with many threads, the biggest contender will be the segment locks for key files, such as the libc SO. I already abolished what could have been a lot of contention on the kernel page table spinlock by adding a system of per-CPU slab caches.
vdamewood has quit [Killed (calcium.libera.chat (Nickname regained by services))]
vinleod is now known as vdamewood
<heat>
Loxsin, very
<heat>
heavily multithreaded programs with lots of memory allocation and page faults will die on either the rwlock or the mapping spinlock
<heat>
and if they don't die there, they will definitely choke on the segment mutex
<heat>
particularly, as you said, on stuff like libc.so
<heat>
but, you know, it won't be horrible in the real world, but it's easy to devise ways to break that design down
<heat>
also, it really depends on how your locks work. if they spin (even mutexes should spin!), it's much better than if they go to sleep at the sight of contention
<heat>
oh, also: your splay tree makes your rwlock effectively a mutex, since you always try to grab it if it's uncontended (meaning that threads that get there while you're splaying that shit will block)
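A sketch of the "even mutexes should spin" point: optimistically spin for a bounded number of iterations before parking the thread. mutex_try_lock, cpu_relax and mutex_lock_slow are placeholder primitives, not a real API.

    void adaptive_mutex_lock(struct mutex *m)
    {
        for (int spins = 0; spins < 1000; spins++) {
            if (mutex_try_lock(m))
                return;          /* got it without sleeping */
            cpu_relax();
        }
        mutex_lock_slow(m);      /* contention: queue up and go to sleep */
    }

A fancier version would only keep spinning while the current owner is actually running on another CPU.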
xenos1984 has joined #osdev
Turn_Left has joined #osdev
<Loxsin>
I know it won't be too horrible because, looking at OpenBSD, until not that long ago they had only one grand lock which protected all VM activity, and much more
<bslsk05>
www.nmedia.net: Martin Pieuchot: OpenBSD kernel lock contention
<heat>
like, the most they can ever scale is to 8 cpus. seriously. simple stuff like make builds choke on it
<Loxsin>
Even at only 17 threads, make build world spent 30% of its time building and 70% of its time stuck waiting for locks
<heat>
mjg: ^^look at that link
<heat>
fun stuff
Left_Turn has quit [Ping timeout: 245 seconds]
<Loxsin>
Looking at their results, one problem I have is that I also have something like their uvm_pageqlock. I have a free pages lock, a dirty pages lock, and an evictable pages lock, and they are replicated for each NUMA domain, but on a typical machine they are global
<heat>
what do you need a dirty pages lock for?
<Loxsin>
When a page which would go onto the list of eviction candidates is found to be dirty, it goes instead onto the dirty list, and the writeback daemon processes that list by writing back the page to disk and then makes the page an eviction candidate proper
<heat>
see, i don't agree with that design
<heat>
writeback should be done on a per-inode basis
<heat>
queueing pages is too granular
<heat>
i use a radix tree and efficiently mark sections of the inode's data dirty. writeback runs on an inode until that inode gets cleaned out
<heat>
i queue dirty inodes on a per-block-device thread's list, so i have a small spinlock there, but that's no biggie
<heat>
i don't have LRU yet, but LRU won't care if a page is dirty or not
<heat>
also, tip for LRU: batch promotion/demotion, else you'll be grabbing the lock too much
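A very rough sketch of the per-inode writeback idea described above: dirtying a page marks the range in the inode's own structure, and the first dirty page puts the inode on its block device's writeback list, which a per-bdev thread drains. Every name here (radix_tree_set, spin_lock, the structs) is invented for illustration.

    struct blockdev {
        struct spinlock  wb_lock;        /* small lock, only for list manipulation */
        struct inode    *wb_list_head;   /* dirty inodes, drained by a per-bdev thread */
    };

    struct inode {
        struct radix_tree dirty_ranges;  /* which ranges of the file are dirty */
        struct inode     *wb_next;
        struct blockdev  *bdev;
        bool              on_wb_list;
    };

    void inode_mark_dirty(struct inode *ino, unsigned long pgoff)
    {
        radix_tree_set(&ino->dirty_ranges, pgoff);   /* per-inode state, no global lock */

        spin_lock(&ino->bdev->wb_lock);
        if (!ino->on_wb_list) {                      /* first dirty page: queue the inode */
            ino->wb_next = ino->bdev->wb_list_head;
            ino->bdev->wb_list_head = ino;
            ino->on_wb_list = true;
        }
        spin_unlock(&ino->bdev->wb_lock);
    }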
<mjg>
heat: no i'm confident real bottleneck is i/o
<mjg>
heat: not LOCKING!
<mjg>
heat: the bigger -j the more i/o therefore the bigger problem it is
<mjg>
heat: trust me, i'm a geezer
<heat>
if everything is pessimal, what's the real crapper
<heat>
maybe the real crapper is the friends we made along the way
<mjg>
the real crapper is the girl kissers we dissed along the way
<heat>
bmc@eng.sun.com
<Loxsin>
I worry that the eviction candidates list locking will be punishing, but maybe OpenBSD has such evil contention because its uvm_pqlock seems to lock the freelist as well as the paging queues, and I don't think OpenBSD engages in sophisticated strategies to avoid the need to allocate pages
<mjg>
vast majority of openbsd contention is their equivalent of big kernel lock
<mjg>
note they are not even running at some ungodly scale here either
<heat>
mjg, isn't it usually the case that if you remove the big lock you'll just contend on the smaller locks instead?
<heat>
oh my, sleeping takes a global lock
<heat>
ew
<mjg>
it is, but it normally scales better
<mjg>
in fact there is a funny trick: if you have a global lock and contend on it big time, you add another lock just shift some of the traffic
<heat>
what?
<mjg>
you do understand throughput actively *degrades* past some threshold
<mjg>
as you add cpus fucking with a lock
<mjg>
so a lol hack is to add another lock for some of the consumers in hopes of not reaching that threshold
<heat>
yes
<mjg>
there you go
<heat>
i just didn't get what you meant
<heat>
isn't that process just... breaking up a lock?
<mjg>
not sayin' i recommend it, but it is a thing
<mjg>
it's not breaking the lock, because you still have to take the origianl
<mjg>
you just gate some of the fuckers
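A sketch of the "gate" hack mjg describes: some callers take an extra lock first, so fewer CPUs ever pile onto the heavily contended global lock at once and throughput stays below the collapse threshold. Names are made up, and this is a description of the trick rather than a recommendation.

    struct spinlock global_lock;
    struct spinlock gate_lock;    /* taken only by the less critical callers */

    void gated_op(void)
    {
        spin_lock(&gate_lock);    /* throttles how many CPUs reach the global lock */
        spin_lock(&global_lock);  /* the original lock still has to be taken */
        /* ... do the work ... */
        spin_unlock(&global_lock);
        spin_unlock(&gate_lock);
    }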
<Loxsin>
Yes, I see that OpenBSD only got finer locks recently, and before that the entire kernel was under lock and key. But I worry that the paging queues being globally locked is problematic.
<mjg>
of course it's problematic
<mjg>
or rather, will be
<heat>
having a global queue is problematic
<Loxsin>
Technically I have per-NUMA-domain queues. I could partition further. I wonder how scalable kernels carry out page replacement efficiently.
<heat>
page replacement is itself inefficient by nature
<heat>
what you really don't want to pay is the cost in the "fast path"
<mjg>
you have numa support?
<mjg>
what kernel is it
<Loxsin>
I have the beginnings of NUMA support. I read the SRAT and keep a separate instance of a Domain class for each NUMA domain. Domain instances include freelists, a dirty page queue, and an evictable queue. So far I only make domain-specific allocations for per-CPU structures.
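The shape of the per-domain state being described, with each NUMA domain (discovered from the SRAT at boot) carrying its own queues and locks; all names here are illustrative placeholders:

    struct numa_domain {
        struct spinlock   free_lock;
        struct page_list  free_pages;

        struct spinlock   dirty_lock;
        struct page_list  dirty_pages;      /* drained by the writeback daemon */

        struct spinlock   evict_lock;
        struct page_list  evictable_pages;  /* second-chance / eviction candidates */
    };

    static struct numa_domain domains[MAX_NUMA_DOMAINS];  /* one per SRAT domain */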
<Loxsin>
It is my hobby kernel. I am interested in scaling about as well as a high quality kernel implemented in the year 2000 would scale. So I want it to be close to linear in typical cases at least for a few dozen cores.
<heat>
kernels in 2000 didn't scale
<heat>
</end>
<heat>
the vmem paper was only published in 2001 as some novelty big brained thing, rcu wasn't really a thing yet, etc etc
<heat>
linux was full of huge locks, bsd was full of huge locks
<heat>
solaris was optimal
<Loxsin>
Linux and BSD were trinkets in that day.
<heat>
not really
<Loxsin>
But IRIX, wallahi, or z/OS, those must be a different matter.
zxrom has quit [Ping timeout: 252 seconds]
<heat>
IRIX lmfao
dude12312414 has joined #osdev
<Loxsin>
Haven't you seen how big Silicon Graphics machines were?
<heat>
anything that came from the sysv codebase probably didn't scale, anything that came from the BSD codebase definitely didn't scale
<heat>
anything in the linux codebase didn't scale
<heat>
windows... no clue, probably didn't scale
<heat>
>In the early 1990s, IRIX was a leader in Symmetric Multi-Processing (SMP), scalable from 1 to more than 1024 processors with a single system image.
<heat>
(X) Doubt
<Loxsin>
My feeling as well is that with 4096-core machines, I don't see how they are very compatible with traditional approaches to scaling; there the emphasis has to be on replication and partitioning, and on relaxing coherency. So I think a traditional kernel design becomes unsuitable for such systems.
<mjg>
i had seen irix sources
<mjg>
it could not possibly scale
<mjg>
modulo maybe to the ballpark of 32
<mjg>
sounds like the same mythos which surrounds solaris
<heat>
SOLARIS WAS OPTIMAL in 2000
<mjg>
the claim is it scaled to hundreds of cpus
<mjg>
which it clearly did not
<Loxsin>
What I read about IRIX is that they wanted to handle scaling to many processors by emphasising replication. It would be like having multiple IRIXes on one system, maybe some load-balancing between them, but effectively running one IRIX per group of 64 CPUs or whatever the number.
<mjg>
so in fact not scaling
<heat>
yeah if you run 1 openbsd on 1 CPU it scales perfectly
<mjg>
just make sure oepnbsd isnot also the hypervisor
<Loxsin>
They argued that you don't scale by traditional means to 4096 CPUs. It's not the right OS design anymore.
<mjg>
old papers are mostly stupid
<mjg>
so there is that
<heat>
linux can do it, somewhat
<mjg>
1. apply fine-grained locking
<mjg>
2. oops realize some of the objs are de facto global
<mjg>
3. .. now you have global locks
<mjg>
4. fuck it claim great perf anyway
<mjg>
any kernel without rcu or an equivalent
<heat>
EPOCH GREATEST IDEA EVER
<heat>
EPOCH(9)*
<heat>
what if rcu but crapper
<Ermine>
epoch?
<mjg>
i'm heading off
<mjg>
fuck off nerds
<heat>
fuck you
<gog>
hi
<Loxsin>
Yes, it's the problem I mentioned. Take the second-chance page queue in my kernel. It's locked separately from the dirty queue and the free lists, but it's still a global data structure. Well, actually per NUMA domain
<heat>
Ermine, yeah see the nice FBSD manpage epoch(9)
<heat>
or the EBR papers
<Ermine>
heck
<heat>
darn
<gog>
meow
<heat>
ok gog, close your eyes and pretend you're a southern cat
<Loxsin>
So one retro idea I have, which I learned from early 1990s Unix books, is to find a natural and effective divisor, e.g. physical page address colour, and divide up the dirty and free lists by colour. But that deglobalises page replacement, which carries its own problems.
<Ermine>
it has 'caveats' section
<heat>
i told you how to smoothen the pain of page replacement
<heat>
if you're not following it, it's your fault
<heat>
but doing the actual page replacement will always suck ass if LRU
<Ermine>
IRIXen actually
<heat>
IRIXES
<Ermine>
Irices
<Ermine>
But really, sometimes -en feels more natural
<Loxsin>
In regard to that, heat: I have a separate page-laundering process because it allows me to evict clean pages without locking the owner segment, or the owner process for private memory.
<Loxsin>
I do carry out the emplacement of pages onto the second-chance list in batches, and if paging dynamics are healthy, I return a few surrounding pages from the second chance list back to active use at the same time. Maybe that alone will help me to beat OpenBSD.
Loxsin has quit [Quit: CGI:IRC (EOF)]
Loxsin has joined #osdev
gabi-250_ has quit [Quit: WeeChat 3.0]
gabi-250 has joined #osdev
gabi-250 has quit [Client Quit]
gabi-250 has joined #osdev
pieguy128 has joined #osdev
epony has quit [Remote host closed the connection]
vdamewood has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]
netbsduser has quit [Ping timeout: 256 seconds]
gbowne1 has joined #osdev
<geist>
re: the page color stuff loxsin was talking about, I *think* it's irrelevant now
<geist>
it was generally based on caches at the time, which were generally direct mapped
<geist>
so if you have say a 32kb direct mapped cache, then everything 32kb apart will map to the same cache lines, so there's an advantage to making sure allocators spread things across the lines
<heat>
yeah i have a comment in my slab allocator referencing you directly
<geist>
caches of modern era are more and more associative
<heat>
// TODO: Colouring? But geist said that maybe it's not necessary these days
<heat>
maybe i should remove the TODO bit
<geist>
well, i'm not positive it's not still useful in some ways, but the original slab paper definitely was trying to skew allocations so they didn't align on the same cache blocks, which makes sense. cpus of the time probably had direct mapped L1s/L2s
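A sketch of the classic slab colouring idea as geist describes it: each new slab starts its objects at a different small offset (a "colour"), so object N of every slab doesn't land on the same set of a direct-mapped cache. The names and the line size are placeholders, not the actual Bonwick code.

    #define CACHE_LINE 64

    struct slab_cache {
        size_t obj_size;
        size_t max_colour;   /* leftover space in a slab, in cache lines */
        size_t next_colour;  /* rotates as new slabs are created */
    };

    static size_t slab_pick_colour(struct slab_cache *c)
    {
        size_t colour = c->next_colour * CACHE_LINE;
        c->next_colour = (c->next_colour + 1) % (c->max_colour + 1);
        return colour;       /* first object in the new slab starts at this offset */
    }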
<heat>
given that no one does coloring it's probably true? no one in the malloc space at least
<heat>
let me check linox
<geist>
yeah fairly certain that's the case
<geist>
maybe there's some reason to do it if you knew how memory was striped across DRAM blocks
<geist>
say you had a 4 channel DRAM system, and it was striping memory, every 1K per channel, then hypothetically if you knew that you might try for it, but then that's already hidden by the L1/L2 cache layer so i guess still doesn't really help you
<geist>
er erase the last part
<geist>
but if the striping was say >=4K then you might try to color your pmm allocations, maybe
<geist>
to ensure that all dram channels are evenly used
<heat>
ok SLUB doesn't do colouring, SLAB does
<geist>
but i dunno if any of that info is exposed anywhere so i dont know if you can act on it
<heat>
but SLAB was removed
<heat>
and there's no real regression from SLAB to SLUB, so there's that
<heat>
(SLAB is also old)
<geist>
yah
<geist>
been playing with my supersparc 20 workstation yesterday, and i think it's exactly contemporaneous with the original Bonwick paper. early/mid 90s, high end 32bit sparcs
xenos1984 has quit [Read error: Connection reset by peer]
<geist>
ah the external 1MB cache is direct mapped
<geist>
page 108
gabi-250 has quit [Ping timeout: 240 seconds]
<mcrod>
i hate the holidays. why? i get sick, every time
<mcrod>
i am full of misery and now I will reap your souls
<zid>
play more dark souls
<mcrod>
i don't have the strength
<zid>
That's what the dark souls is for
<mcrod>
dark souls is to play dark souls when you don't even have the strength to get out of the bed?
<geist>
you keep playing and get stronger
<geist>
through constant toil and misery you acquire strength
<mcrod>
what I wish I could do right now more than anything, is remove my nose and let everything drain
<geist>
huh reading about the mmu and whatnot on the sun4m is pretty neat. it's certainly nicer than other stuff at the time
<zid>
You too can beocme JOHN DARKSOUL
<geist>
i played dark souls enough to know that that is not my cup of tea
<mcrod>
dark souls I was the hardest for me
<mcrod>
everything else was a joke by comparison
<geist>
call me an old fart, but really i enjoy my games enough when they give you a little bit of challenge, but aren't frustrating
<mcrod>
except elden ring. malenia is hard.
<zid>
dark souls is exactly at that level for me, geist
<zid>
or rather, was, it's now easy
<mcrod>
also
<geist>
i think i played the remix version released a few years ago
<mcrod>
my copy of TAOCP is coming tomorrow
<geist>
i made it fairly far but it just kept going, so after a while i just lost interest
<geist>
i totally get the appeal of the precise combat though, but meh.
<mcrod>
hopefully I can grab the package tomorrow...
<mcrod>
i'm very sad that knuth didn't release compiler techniques, and might not even do so
<geist>
hmm, not sure i ever got a copy of TAOCP
<mcrod>
you would know, it's a ton of bricks
<mcrod>
and it's *extremely* math heavy
<mcrod>
unfortunately some people blindly implement what the book says (i.e., the abstract algorithm itself) and think there's no more optimization to be done, but each of those blind implementors forgets they're running on a practical computer now; the capabilities of the OS/hardware are important