* kof123
drops an origami unicorn next to mrvn, and walks off
SpikeHeron has joined #osdev
nyah has quit [Quit: leaving]
<moon-child>
table lookup rather than a polynomial? No multiplier?
<gorgonical>
just realized that for instructions like adr, objdump lies to you and presents the operand in a way that isn't how it's actually encoded
<gorgonical>
adr x21, ffffffc000082710 is a lie
<gorgonical>
That's the address it will end up with, but the actual value encoded is 0x26d0
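A minimal sketch of the decoding described above, assuming the standard AArch64 ADR layout (Rd in bits 4:0, immhi in bits 23:5, immlo in bits 30:29). The instruction word and address are the ones quoted a little further down in this log; the function name `decode_adr` is made up for illustration.

```python
def decode_adr(word: int, pc: int) -> tuple[int, int, int]:
    """Decode an AArch64 ADR instruction word fetched from address pc."""
    # ADR has bit 31 clear and bits 28:24 equal to 0b10000.
    if ((word >> 31) & 1) != 0 or ((word >> 24) & 0x1F) != 0b10000:
        raise ValueError("not an ADR instruction")
    rd = word & 0x1F                 # destination register, bits 4:0
    immlo = (word >> 29) & 0x3       # low 2 bits of the offset, bits 30:29
    immhi = (word >> 5) & 0x7FFFF    # high 19 bits of the offset, bits 23:5
    imm = (immhi << 2) | immlo       # 21-bit signed byte offset
    if imm & (1 << 20):              # sign-extend from bit 20
        imm -= 1 << 21
    # The disassembler prints pc + imm, not the encoded imm itself.
    target = (pc + imm) & 0xFFFFFFFFFFFFFFFF
    return rd, imm, target


rd, imm, target = decode_adr(0x10013695, 0xFFFFFFC000080040)
print(f"adr x{rd}, pc+{imm:#x} -> {target:#x}")
```

The encoded value is the PC-relative offset, and the address objdump prints is that offset added to the instruction's own address.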
<heat>
hah
<heat>
it should be commented like the rip-rel accesses
<heat>
like mov 0x2314f7(%rip),%rdi # ffffffff814be340 <acpi_gbl_gpe_lock>
<gorgonical>
and also I think my endianness is backward
<heat>
oopsie
<gorgonical>
why would that be though
<gorgonical>
No I manually flipped all the bytes for the decoder
<gorgonical>
but I'm using aarch64-buildroot-linux-gnu-objdump for it
<gorgonical>
...
<gorgonical>
am I dumb or something? When I xxd -g 1 I get the correct byte sequence of 95 36 01 10. But when I objdump they are printed in that reverse order: 10013695. Is this because it's printing the leftmost byte as the first one at the address, and the rightmost byte as the last one?
<gorgonical>
The file *is* little-endian
<gorgonical>
Oh crap, reverse what I said about objdump
<gorgonical>
The RIGHTMOST byte is the first one at the address
<gorgonical>
e.g. it says ffffffc000080040: 10013695 adr x21, ...
wand has joined #osdev
<gorgonical>
Have I just been misreading objdump's hex output for years?
<moon-child>
it does that for jumps too
<moon-child>
presents the absolute address instead of the relative offset
<heat>
gorgonical, for arm64 yes
<heat>
because they are 32 bit words and not individual bytes
<gorgonical>
heat: I think I know what you mean. In my head it makes more sense to think of it all as a long sequence of bytes though
<gorgonical>
So seeing it written as 4-3-2-1; 8-7-6-5 can be strange to me
<zid`>
gorgonical learns endian exists day today?
<moon-child>
endianness for simd shuffles is really annoying
Iris_Persephone has joined #osdev
<Iris_Persephone>
hiiiiiiiii
<moon-child>
because the numbers are written as big endian in source, but interpreted as little endian--as it were--by the instruction
<moon-child>
sup
<gorgonical>
wait oh no
<gorgonical>
I am a giant dingus
<Iris_Persephone>
have I ranted about how annoying the windows gfx shell is here yet
<gorgonical>
So xxd -g 1 shows 95 36 01 10. That's how the bytes are laid out in memory. little endian means that the word is 0x10013695. But then this online converter is wrong
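The relationship between the two views can be checked in a few lines. This is only a sketch using the four bytes quoted above; the variable names are arbitrary.

```python
raw = bytes([0x95, 0x36, 0x01, 0x10])   # byte order as shown by `xxd -g 1`

little = int.from_bytes(raw, "little")  # first byte is least significant
big = int.from_bytes(raw, "big")        # first byte is most significant

print(f"{little:#010x}")  # 0x10013695, the 32-bit word objdump prints
print(f"{big:#010x}")     # 0x95360110, the same bytes read left to right
```

A converter that answers `0x95360110` for these bytes is treating them as big-endian (or just concatenating them), which is not how a little-endian AArch64 instruction word is read.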
<gorgonical>
That's what's confusing me
<zid`>
usually the little online tools have a toggle for le or 'no spaces'
<zid`>
if it's showing as 95360110 then it's just bytes but without spaces
<zid`>
if it's showing 0x95.. it's wrong
<gorgonical>
I think this converter just isn't very good
<Iris_Persephone>
what are you trying to convert
<zid`>
32bit words
<gorgonical>
I'm just poking at my head.S and making sure I'm doing some variable addressing right
k8yun has joined #osdev
Burgundy has left #osdev [#osdev]
<gorgonical>
I'm glad none of you know me in real life. Today's events would be embarrassing otherwise
<gorgonical>
lol
<gorgonical>
The shame of forgetting how endianness works is great
wand has quit [Remote host closed the connection]
<heat>
i'm starting to think that allowing memory allocation in IRQ context is a good idea
<dh`>
difficult to avoid
wand has joined #osdev
<heat>
is it? a quick look at freebsd malloc(9) hints that it does not support "fast interrupt handlers"
<heat>
my problem here is that I want to make all my block IO drivers follow a strict io-queue method where you queue requests and on IRQ you complete() them and submit the next one, if it exists
<heat>
currently this may involve memory allocation and deallocation for bounce buffers, sg lists, etc
<heat>
which may just hint that I'm trying to do too much on the top half IRQ and doing some softirq here would be a good idea
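A toy model of the queue-then-complete-on-IRQ pattern described above, not any real driver API. The class `IoQueue`, the `start_hw` hook and the request strings are all hypothetical, and the sketch assumes whatever buffers a request needs are set up at submit time, which is only one of the options under discussion.

```python
from collections import deque


class IoQueue:
    """Toy model: one request in flight, the rest wait in FIFO order."""

    def __init__(self, start_hw):
        self.pending = deque()
        self.in_flight = None
        self.start_hw = start_hw   # hypothetical "kick the device" hook
        self.completed = []

    def submit(self, req):
        # Normal (non-IRQ) context. Assumption: bounce buffers, sg lists
        # and the like would be prepared here, before queueing.
        self.pending.append(req)
        if self.in_flight is None:
            self._start_next()

    def on_irq(self):
        # Completion interrupt: finish the current request, then hand the
        # next queued one (if any) to the hardware.
        if self.in_flight is None:
            return                 # spurious interrupt, nothing to complete
        done, self.in_flight = self.in_flight, None
        self.completed.append(done)
        self._start_next()

    def _start_next(self):
        if self.pending:
            self.in_flight = self.pending.popleft()
            self.start_hw(self.in_flight)


started = []
q = IoQueue(started.append)
for r in ("read A", "write B", "read C"):
    q.submit(r)
q.on_irq()
q.on_irq()
print(started, q.completed, q.in_flight)
```

With this shape the interrupt path only moves requests between queues, so whether it needs to allocate depends entirely on what completion and submission do.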
<heat>
honestly, I don't know
<dh`>
in general you don't want anything slow on the irq path code
<dh`>
both because interrupt latency is bad in general and also because you can drop inputs if you don't react to the hardware fast enough
<dh`>
my usual thought is capture what you need from the hardware immediately and defer the rest for further processing
<dh`>
but it's been a long time since I wrote a real device driver
<heat>
right, I think the real question here is "what is slow?"
<heat>
malloc could be slow-ish (rare) or it could be blazingly fast (hopefully frequently)
<dh`>
traditionally malloc is slow
<dh`>
anyway for anything besides incoming network packets you probably already have a place to put what you get from the hardware
<dh`>
it's incoming network packets that are a headache in this context
<heat>
you can allocate upfront for rx no?
<dh`>
only if you allocate the maximum input size every time
<dh`>
maybe that's not a problem
<heat>
oh right, yes, I see what you mean. I was thinking about raw NIC rx buffers for DMA
<heat>
yes, that is usually all under softirq or threaded interrupts
<dh`>
I have no idea how rx works for network devices with tcp offload
<dh`>
even when I did write a few real drivers long ago, none of them were anything like that
matthews has quit [Quit: ZNC 1.8.2+deb2+b1 - https://znc.in]
matthews has joined #osdev
craigo has quit [Ping timeout: 252 seconds]
<mrvn>
heat: if you allocate in the IRQ then you have to disable IRQs in malloc. And for SMP that means you have to lock it globally. Bad idea.
heat has quit [Remote host closed the connection]
heat has joined #osdev
<mrvn>
Why can't you allocate all the memory when someone submits a read or write? E.g. the TCP stack would allocate bounce buffers and stuff and submit that to the NIC to add to the descriptors for receiving data.
k8yun has quit [Ping timeout: 268 seconds]
<mrvn>
The descriptors having buffers to read into activates the IRQ for the NIC.
Arthuria has joined #osdev
<mrvn>
dh`: a frame is 1500 bytes and a NIC has space for a limited number of descriptors so you don't need that much memory to max that out. With tcp offloading you might need 64k per frame which would eat a lot more memory. Still not all that much in servers that have that offloading and frame merging.
<mrvn>
1024 jumbo frames would be just 64MB for systems with >64GB of ram.
<mrvn>
peanuts
<mrvn>
.oO(and give you about 1s to allocate more in the soft irq when busy)
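The arithmetic behind the figures above, as a quick check. The ring size of 1024 descriptors is taken from the message; treating "64k" as 64 KiB is an assumption.

```python
KIB = 1024
MIB = 1024 * KIB

descriptors = 1024           # ring size used in the estimate above
plain_frame = 1500           # bytes per standard Ethernet frame
offload_buffer = 64 * KIB    # bytes per buffer with offload/frame merging

plain_total = descriptors * plain_frame
offload_total = descriptors * offload_buffer

print(f"{plain_total / MIB:.2f} MiB")    # about 1.46 MiB
print(f"{offload_total / MIB:.0f} MiB")  # 64 MiB
```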
<moon-child>
don't have to disable irqs
<moon-child>
if you do the rop trick
k8yun has joined #osdev
<heat>
what rop trick
<heat>
mrvn, did you miss the last 20 years of malloc advancements? no need for a global lock on SMP at all
<heat>
and it would also still be there without irq-safe malloc soooo, no idea what you're on about
mctpyt has joined #osdev
<moon-child>
heat: isr probes stack to see if malloc is currently running. If not, then it can malloc freely. Otherwise, it overwrites malloc's return address with its own continuation (stashing the original return address somewhere)
<moon-child>
so the isr runs right after malloc finishes
<moon-child>
this can be seen as an implicit mutex, and a very simple scheduler (in particular, you have to deal with the case when another isr runs and wants to malloc). In the limit, it's an actual scheduler and mutex. But there is some interesting space to play with ahead of the limit
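A toy model of the idea, with heavy caveats: the real trick patches a return address on the stack, while this sketch stands in for that with a `busy` flag and a list of deferred callables. All names (`Allocator`, `isr`, `deferred`) are invented for illustration.

```python
class Allocator:
    """Toy model of 'run the ISR's work right after malloc returns'."""

    def __init__(self):
        self.busy = False      # stands in for "malloc is on the stack"
        self.deferred = []     # continuations queued by interrupting ISRs
        self.next_addr = 0x1000

    def malloc(self, size, interrupt=None):
        self.busy = True
        if interrupt is not None:
            interrupt(self)    # simulate an IRQ arriving mid-malloc
        addr = self.next_addr  # trivial bump allocation
        self.next_addr += size
        self.busy = False
        # Stand-in for "return into the ISR's continuation": run whatever
        # was queued while the allocator was busy.
        while self.deferred:
            self.deferred.pop(0)(self)
        return addr

    def isr(self, work):
        if self.busy:
            self.deferred.append(work)  # cannot re-enter: defer
        else:
            work(self)                  # allocator idle: run immediately


log = []
heap = Allocator()


def isr_work(a):
    log.append(a.malloc(64))


first = heap.malloc(32, interrupt=lambda a: a.isr(isr_work))
print(hex(first), [hex(x) for x in log])
```

The deferred list is exactly the "very simple scheduler" mentioned above: the interrupted allocation finishes first, then the ISR's allocation runs.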
<heat>
thanks, i hate it
mctpyt has quit [Ping timeout: 246 seconds]
<moon-child>
:<
<moon-child>
another avenue which might be interesting is to make the allocator 'lock-free' (relying on instruction-level atomicity). That sound annoying though
<moon-child>
sounds*
<heat>
that's not the right lock-free
<moon-child>
probably the actual right thing to do is, if there's some reason you might need to malloc, shove an event in a queue somewhere and let someone else get to it in due time. But both of the above ideas are a lot cuter
<moon-child>
heat: wym
<heat>
you can have a totally percpu allocator
<heat>
no need for shared state
<moon-child>
it's a different meaning of 'lock-free' from the normal one, but the right one for this situation
<heat>
in fact, I think slub actually does this
<heat>
SLUB's slabs are all percpu AIUI
<moon-child>
yes. per cpu. But 'lockfree' in that it's totally reentrant
<moon-child>
so you can malloc concurrently from regular code and an isr
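A minimal sketch of the per-CPU part of this idea, assuming each CPU owns a private free list. It does not model reentrancy against an interrupt on the same CPU, which is the part still being argued about here; `PerCpuCache` and its methods are hypothetical names, not SLUB's interface.

```python
class PerCpuCache:
    """Toy per-CPU object cache: no state is shared between CPUs."""

    def __init__(self, ncpus, objects_per_cpu):
        # One private free list per CPU, holding made-up object ids.
        self.free = [
            [cpu * objects_per_cpu + i for i in range(objects_per_cpu)]
            for cpu in range(ncpus)
        ]

    def alloc(self, cpu):
        # Only this CPU's list is touched, so no cross-CPU lock is needed.
        lst = self.free[cpu]
        return lst.pop() if lst else None

    def release(self, cpu, obj):
        self.free[cpu].append(obj)


cache = PerCpuCache(ncpus=2, objects_per_cpu=2)
a = cache.alloc(0)
b = cache.alloc(1)
c = cache.alloc(0)
d = cache.alloc(0)   # CPU 0's list is now empty
print(a, b, c, d)
```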
k8yun has quit [Quit: Leaving]
[_] has joined #osdev
<moon-child>
probably this is way easier on x86 than other arches, since you have lots of instruction-level-atomic rmw without synchronisation guarantees
<heat>
dude just cli and sti?
<moon-child>
laaaaame
<moon-child>
:^)
<heat>
lol
<moon-child>
what even is the point of writing your own os if you can't commit awful, terrible crimes?
<moon-child>
normally we just have to live with the crimes of existing os authors
<moon-child>
need some equity, ne?
<heat>
writing poor man's UNIX is already a bad enough crime isn't it
<moon-child>
heat
[itchyjunk] has quit [Ping timeout: 246 seconds]
<heat>
moon-child
gxt_ has quit [Remote host closed the connection]
gxt_ has joined #osdev
heat has quit [Ping timeout: 246 seconds]
[_] is now known as [itchyjunk]
<zid`>
heatchilder
<zid`>
warumkinder
[itchyjunk] has quit [Read error: Connection reset by peer]
<zid`>
someone needs to make a video player that doesn't suck
<zid`>
vlc can't play shit properly on its best days, doesn't hw accelerate by default. mpc-hc is dead, and doesn't downmix to stereo by default and has bad hotkeys etc
bgs has joined #osdev
Arthuria has quit [Remote host closed the connection]
<dh`>
disabling interrupts in malloc in no way means you need a global lock
Turn_Left has joined #osdev
Left_Turn has quit [Ping timeout: 248 seconds]
<Terlisimo>
zid`: mpv?
<zid`>
Isn't that a venereal disease
<zid`>
>mpv is a free (as in freedom) media player for the command line
<Amorphia>
zid`: it's not PC to say venereal disease anymore :L
<zid`>
why, did people stop having sex
<Amorphia>
yeah sex is not PC anymore
<zid`>
what do they have instead?
<Amorphia>
"intimacy"
<zid`>
that's not what I am talking about though
<Amorphia>
what are you talking about
<zid`>
I'm talking about diseases you get from shagging
<Amorphia>
lmao
* Amorphia
hands zid` a test kit for "intimately sourced infections"
<zid`>
like, TB?
<zid`>
I hear that's rife in prisons
<Amorphia>
not "inmately sourced"
<zid`>
sure it is
<zid`>
I have to cough on you, a lot
<Amorphia>
hahahaha
slidercrank has joined #osdev
bgs has quit [Remote host closed the connection]
wand has quit [Ping timeout: 255 seconds]
wand has joined #osdev
arminweigl_ has joined #osdev
arminweigl has quit [Ping timeout: 255 seconds]
arminweigl_ is now known as arminweigl
Left_Turn has joined #osdev
Turn_Left has quit [Ping timeout: 248 seconds]
danilogondolfo has joined #osdev
shinbeth has quit [Remote host closed the connection]
bauen1 has quit [Ping timeout: 255 seconds]
gxt__ has joined #osdev
gxt_ has quit [Ping timeout: 255 seconds]
gog has joined #osdev
<Iris_Persephone>
did I just walk into a discussion about the clap
awita has joined #osdev
GeDaMo has joined #osdev
<zid`>
no, TB
<Iris_Persephone>
the terminology for this changes a lot for some reason, probably the euphemism treadmill
<Iris_Persephone>
is it STI or STD that's in vogue these days?
<FireFly>
sti::cout
<Iris_Persephone>
lmao
<sham1>
Standard Incantation
<zid`>
none of these are euphemisms though so idk why the euphemism treadmill is applicable, I think it's just it used to be VD then became STD for reasons unknown, maybe better public understanding? Then STI later because more accurate?
<zid`>
That's my hearsay anyway.
<gog>
STD isn't necessarily inaccurate but it'd be more applicable for chronic conditions like herpes, hepatitis or HIV
<zid`>
yea
<zid`>
it's.. less accurate, but not inaccurate, imo
<zid`>
hence, more accurate
<gog>
yes
<gog>
but even those aren't necessarily sexually transmitted only
<zid`>
with hep and hiv etc not being sexually transmitted a lot of the time, maybe we'll go change again in future
<gog>
yeah
<gog>
lol
<Iris_Persephone>
I mean it's the whole "we need a more scientific term for this because the older one is now used as an attack"
<zid`>
yea that's what the treadmill is
<Iris_Persephone>
like "retard" and shit used to be a technical term
<zid`>
but none of these get used as euphemisms
<zid`>
so as far as I am concerned, this must be a different process
<Iris_Persephone>
I'd say it's a different manifestation of the same process
<Iris_Persephone>
also "VD" turning into "STI" is literally listed on the wiki page about the euphemism treadmill lmao
<zid`>
destigmatisation I'd go for
<zid`>
euphemism treadmill o
<zid`>
no
sympt5 has joined #osdev
<Iris_Persephone>
like if you want to get technical I think the proper linguistic term is "pejoration"
sympt has quit [Ping timeout: 246 seconds]
sympt5 is now known as sympt
<zid`>
we're talking about the inverse process though
<Iris_Persephone>
the inverse is "melioration" but I don't think that applies because that's like reclaiming a word
slidercrank has quit [Ping timeout: 255 seconds]
bauen1 has joined #osdev
craigo has joined #osdev
netbsduser` has quit [Ping timeout: 255 seconds]
netbsduser has joined #osdev
netbsduser has quit [Client Quit]
terminalpusher has joined #osdev
Starfoxxes has quit [Ping timeout: 248 seconds]
<mrvn>
moon-child: if you do the rop trick then what you basically do is a soft-irq. Might as well just do that from the start. Note: even with the rop trick you still need a global lock, it's just done in hardware when you do atomic cmpxchg.
<mrvn>
a percpu allocator as heat suggested helps
<mrvn>
Note: the isr probing the stack actually only works with percpu alloc or you have to probe all cores' stacks and that's rather racy.
<mrvn>
s/works/works well/
SpikeHeron has quit [Quit: WeeChat 3.8]
Starfoxxes has joined #osdev
marshmallow has quit [Remote host closed the connection]
gog has quit [Quit: Konversation terminated!]
gog has joined #osdev
<lav>
mow
<gog>
meo
<sham1>
mov
<mrvn>
moo
<gog>
are there any architectures with the mnemonic "moo"
<gog>
that'd be cool
<mrvn>
"I accidentally took my cat's medicine. Don't ask meow."
<mrvn>
gog: it's a Memory-Out-of-Order read. :)
<mrvn>
aka prefetch
novasharper has joined #osdev
zxrom has joined #osdev
<gog>
:o
<zid`>
move offsettable object
<gog>
moo
<lav>
oom
<zid`>
I am listening to japanese pop jazz stuff
<zid`>
It's surprisingly good
<gog>
i've got madonna on again
<zid`>
with or without an intrusive r
<gog>
:P
<gog>
MA'DONN
<zid`>
maddonna on sounds like a prescription medication to me cus of the intrusive r
slidercrank has joined #osdev
dutch has joined #osdev
<mrvn>
lols @ The Ark. An LSD-like compound that's smaller than a water molecule.
craigo has quit [Ping timeout: 255 seconds]
awita has quit [Ping timeout: 246 seconds]
nyah has joined #osdev
bgs has joined #osdev
[itchyjunk] has joined #osdev
pbx has joined #osdev
<pbx>
2~/wind 21
pbxvax has joined #osdev
<pbxvax>
after writing some drivers i managed to get this VAX11/750 on the net
pbxvax has quit [Remote host closed the connection]
<geist>
oh neat
gog has quit [Remote host closed the connection]
gog has joined #osdev
<gog>
fuuuuuuuuuuuucking god damn can i please have a stable internet
<gog>
i'm trying to get work done and my db connection times out
<gog>
i have one more feature to test before i'm done with this stupid thing T_
<bslsk05>
lore.kernel.org: Re: Deprecating and removing SLOB - Yosry Ahmed
<mjg>
aight
<mjg>
i guarantee the irq trips *are* slower. it may be there are other properties down below which make a difference, for example how it reacts to changing load
<mjg>
how many elements to fill etc.
<mjg>
and it may be they work better for G
<mjg>
m
<mjg>
you could load a toy kernel module on your laptop
<mjg>
just sayin
<mjg>
:]
<mjg>
maybe some git logging would explain why they roll with irqs over there
<heat>
because they always have?
<heat>
I don't think kmalloc has ever been banned in hardirq context
<heat>
at least not in the last 20 years (2.4?)
<heat>
in any case, even slub slow(er) paths do irqsave
<heat>
it's all irq safe stuff that needs to be called and can be called from hardirq context
<heat>
versus freebsd explicitly saying "no hard irq stuff" in malloc(9)
<mjg>
i'll hack up simple code, give me few
<heat>
whether that's a lie is beyond me, since freebsd manpages love to lie
<mjg>
you can't use malloc in interrupts
awita has quit [Ping timeout: 246 seconds]
<heat>
CringeBSD
<heat>
i could probably just replace my preemption disabling with irqs in slab.cpp and see what happens
<mjg>
as i said, it is faster to not fuck with interrupts
<mjg>
frankly i'm confused how that's even a question. is it because linux is clearly doing it at least in slab?
<heat>
it's because linux does it all the time
<mjg>
not in slub fast path, if the comment is to be believed
<heat>
and I don't know if this is some legacy thing they're stuck with for the slab allocators, or something else
<mjg>
i would guess some of it is indeed used from interrupt handlers
<heat>
sure, maybe not the slub fast path, but for sure the other paths
<mjg>
but instead of dedicating a bucket for that purpose they use irqs to synchronize access
<mjg>
shitty tradeoff if you ask me
<heat>
btw what's 20 rdtscp "cycles" going to amount to?
<mjg>
dawg plz
<heat>
feline plz
<heat>
seriously, how much time is that?
<mjg>
i can tell you i already see uma_zalloc/uma_zfree on the profile when pushing packets on freebsd
<mjg>
there are branches which don't need to be there
<mjg>
it would be much worse if it was rolling with interrupts
<mjg>
so that's what
<mjg>
wait maybe i can share one
<heat>
what's your tsc's frequency?
<mjg>
... no i can't
<mjg>
look we can flame tomorrow
<mjg>
i'm bailing from this crap for the day
<mjg>
got an email backlog :[
<heat>
can you literally just show me the tsc frequency or am I going to have to bench this myself
<heat>
i want to understand how much impact this shit can have
<heat>
i don't want to be distracted with "hey look, flamegraphs!"
<mjg>
have to boot it
<mjg>
again 5 fucking years
<mjg>
regardless of that i defo encourage you to run a similar test on your machine
<mjg>
you can write a lol module and load it
gorgonical has quit [Remote host closed the connection]
<mjg>
just flip to the console first in case it panics :p
<mjg>
tell you what though, sometime in next 2 months i suspect i'll patch the allocator to be optimal
<mjg>
once that happens, i'll pessimize it just for you with irq instead of preemption trip
<mjg>
and test
<mjg>
Timecounter "TSC" frequency 2100000221 Hz quality 1000
<mjg>
hw.model: Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz
<mjg>
aka skylake
DynamiteDan has quit [Excess Flood]
DynamiteDan has joined #osdev
<sham1>
yawn
terminalpusher has quit [Remote host closed the connection]
Left_Turn has joined #osdev
levitating has quit [Ping timeout: 246 seconds]
<heat>
ok so in theory it amounts to ~9ns
<heat>
for any sort of fast path
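The conversion behind that estimate, using the TSC frequency quoted above. The helper name `cycles_to_ns` is made up, and the result is only the raw cycle-to-time conversion, not a measurement of real impact.

```python
def cycles_to_ns(cycles: float, tsc_hz: float) -> float:
    """Convert a TSC cycle count to nanoseconds at the given frequency."""
    return cycles / tsc_hz * 1e9


tsc_hz = 2_100_000_221   # Timecounter "TSC" frequency from the log
ns = cycles_to_ns(20, tsc_hz)
print(f"{ns:.2f} ns")    # 9.52 ns
```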
<heat>
i wonder, does this verify if you do "cli; cli; cli" etc?
levitating has joined #osdev
<heat>
as in, do you get a performance penalty by disabling IRQs once they are already disabled
<Amorphia>
smh not using mode-locked Ti:sapphire laser optics for fast computation
<Amorphia>
nanosecond timescales are cringe
Turn_Left has quit [Ping timeout: 264 seconds]
slidercrank has quit [Ping timeout: 260 seconds]
<mjg>
heat: that's a too primitive calculation for real impact. as noted, the above is good enough to show there is a difference, but it most likely *underplays* is
<mrvn>
here is an idea: don't enable irqs in the kernel. problem solved.
<mjg>
it
<mjg>
same shit with rolling with atomics on a loop for a bench
<mrvn>
If you have problems with the irq handler allocating memory have you considered giving it a SLAB of its own (per-cpu)?
<mjg>
that's literally what i recommended above
<mjg>
and no, i don't have the problem
<mjg>
:]
<mrvn>
My IRQ driver gets 4k of memory per irq that it moinors. The driver requests an irq by sending a message, which is where the 4k come from, and the IRQ driver replies with a message when the irq happens waking up the driver that asked for the irq.
<mrvn>
s/moinors/monitors/
<mrvn>
So basically all my IRQs are soft irqs which totally avoids the problem too and makes the IRQ handler really small and fast.
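A rough model of that message-based design, assuming the registration message carries the memory the interrupt path will use (represented here by a mailbox the driver supplies). `IrqDispatcher`, `request_irq` and `hard_irq` are hypothetical names, not the actual interface being described.

```python
from collections import deque


class IrqDispatcher:
    """Toy model: the hard IRQ path only posts a message to a driver."""

    def __init__(self):
        self.subscribers = {}   # irq number -> driver-supplied mailbox

    def request_irq(self, irq, mailbox):
        # The registering driver brings the storage with its request, so
        # the interrupt path itself never has to allocate.
        self.subscribers[irq] = mailbox

    def hard_irq(self, irq):
        mailbox = self.subscribers.get(irq)
        if mailbox is not None:
            mailbox.append(("irq", irq))   # wake the driver, nothing more


dispatcher = IrqDispatcher()
nic_mailbox = deque()
dispatcher.request_irq(5, nic_mailbox)
dispatcher.hard_irq(5)   # delivered to the registered driver
dispatcher.hard_irq(7)   # nobody asked for this one, so it is dropped
print(list(nic_mailbox))
```

All real work then happens in the driver when it processes its mailbox, which is what makes every interrupt effectively a soft IRQ in this scheme.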
bliminse has quit [Quit: leaving]
awita has joined #osdev
awita has quit [Remote host closed the connection]
bgs has quit [Remote host closed the connection]
k8yun has quit [Quit: Leaving]
GeDaMo has quit [Quit: That's it, you people have stood in my way long enough! I'm going to clown college!]
sinvet has quit [Ping timeout: 252 seconds]
<moon-child>
mrvn: not percpu malloc is strawman
elastic_dog has quit [Ping timeout: 248 seconds]
elastic_dog has joined #osdev
dude12312414 has joined #osdev
dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]