michaelni changed the topic of #ffmpeg-devel to: Welcome to the FFmpeg development channel | Questions about using FFmpeg or developing with libav* libs should be asked in #ffmpeg | This channel is publicly logged | FFmpeg 7.0.1 has been released! | Please read ffmpeg.org/developer.html#Code-of-conduct
witchymary has quit [Read error: Connection reset by peer]
witchymary has joined #ffmpeg-devel
mkver has quit [Ping timeout: 252 seconds]
thilo has quit [Ping timeout: 260 seconds]
thilo has joined #ffmpeg-devel
thilo has quit [Changing host]
thilo has joined #ffmpeg-devel
arch1t3cht0 has joined #ffmpeg-devel
arch1t3cht has quit [Ping timeout: 255 seconds]
arch1t3cht0 is now known as arch1t3cht
lemourin has quit [Quit: The Lounge - https://thelounge.chat]
lemourin has joined #ffmpeg-devel
NotWarcop has quit [Ping timeout: 248 seconds]
Marth64 has quit [Quit: Leaving]
Warcop has joined #ffmpeg-devel
rvalue has quit [Ping timeout: 244 seconds]
Marth64 has joined #ffmpeg-devel
cone-951 has quit [Quit: transmission timeout]
rvalue has joined #ffmpeg-devel
jamrial has quit []
Martchus_ has joined #ffmpeg-devel
Martchus has quit [Ping timeout: 260 seconds]
Warcop has quit [Ping timeout: 245 seconds]
Kei_N has quit [Read error: Connection reset by peer]
AbleBacon has quit [Read error: Connection reset by peer]
haihao has joined #ffmpeg-devel
Livio has joined #ffmpeg-devel
Livio has quit [Ping timeout: 252 seconds]
b50d has joined #ffmpeg-devel
derpydoo has joined #ffmpeg-devel
b50d has quit [Remote host closed the connection]
mkver has joined #ffmpeg-devel
Krowl has joined #ffmpeg-devel
Livio has joined #ffmpeg-devel
SuperFashi has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
SuperFashi has joined #ffmpeg-devel
SuperFashi has quit [Client Quit]
SuperFashi has joined #ffmpeg-devel
Krowl has quit [Read error: Connection reset by peer]
cubicibo has joined #ffmpeg-devel
<unlord> Hi folks, I'm a little confused about how CPU feature detection works in ffmpeg
<unlord> This has a run-time check of flags & AV_CPU_FLAG_RVV_I32 gated by a compile-time flag #if HAVE_RVV
<nevcairiel> the compile time check assures the compiler/assembler support this, and of course the flags are for runtime hardware support
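The split nevcairiel describes can be sketched roughly as follows: the preprocessor guard decides whether the vector routine is built at all, and the flag test decides at run time whether it is used. This is only an illustration. `HAVE_RVV_SKETCH`, `CPU_FLAG_RVV_I32_SKETCH`, `add_c`, `add_rvv` and `select_add` are hypothetical stand-ins, not FFmpeg's actual macros or functions.

```c
#include <stddef.h>

/* Hypothetical build-time switch: 1 only if the toolchain can assemble RVV. */
#ifndef HAVE_RVV_SKETCH
#define HAVE_RVV_SKETCH 0
#endif

/* Hypothetical run-time CPU flag bit. */
#define CPU_FLAG_RVV_I32_SKETCH (1 << 0)

typedef void (*add_fn)(int *dst, const int *src, size_t n);

/* Portable C fallback, always available. */
void add_c(int *dst, const int *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}

#if HAVE_RVV_SKETCH
/* Would live in an assembly file that only builds with RVV support. */
void add_rvv(int *dst, const int *src, size_t n);
#endif

/* Pick an implementation: compile-time guard first, run-time flag second. */
add_fn select_add(int cpu_flags)
{
    add_fn fn = add_c;
#if HAVE_RVV_SKETCH
    if (cpu_flags & CPU_FLAG_RVV_I32_SKETCH)
        fn = add_rvv;
#else
    (void)cpu_flags; /* no vector code was built, so the flag is irrelevant */
#endif
    return fn;
}
```

With the guard at 0, `select_add()` returns the C version whatever the hardware reports, which is one way a build can end up showing very few optimized functions.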
<unlord> I'm trying to compile ffmpeg on RISC-V with RVV hardware, but when I run checkasm --bench only 4 functions show up
Krowl has joined #ffmpeg-devel
<unlord> I wonder if the problem may be this is an #elif and not #endif \ #if, https://github.com/FFmpeg/FFmpeg/blob/master/libavutil/riscv/cpu.c#L93
<nevcairiel> the two blocks seem to check and set the same flags, so it seems correct
<unlord> but these are compile time checks, at runtime (on say, a different computer) you could not have hwprobe() but do have HWCAP
<unlord> are you supposed to be able to package ffmpeg on a computer that is not the one you run it on?
<nevcairiel> within reason, sure
<unlord> falling back to hwcap if you don't have hwprobe() seems like a totally reasonable thing to do, but that is a runtime check
<nevcairiel> but the function is linked to anyway, so it not being present would not just hide cpu flags, but rather fail launching entirely; how would that otherwise ever happen?
<unlord> which function?
<nevcairiel> __riscv_hwprobe
<unlord> it is defined on line 39
<nevcairiel> oh i missed that, i saw the same name in some libc mentions and figured it was coming through that
<nevcairiel> but i guess it didnt want to depend on a newer libc
<unlord> well, this is why I asked about packager being on a different computer
<unlord> if your packager has a newer libc, it could depend on a function that is not present on a different computer
<nevcairiel> thats fine, if you build against a new libc and want to run against an old one, thats on you
<unlord> yeah okay, but this syscall can still not be present in the kernel on the runtime computer
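The point unlord is making, that the fallback should happen at run time and not through `#elif`, could look something like the sketch below. The linked diff is not reproduced in this log, so this is not that patch. The probe functions here are hypothetical stubs that stand in for a hwprobe-style syscall (which may fail on an older kernel) and an HWCAP-style lookup.

```c
#include <errno.h>

/* A probe fills *flags and returns 0 on success, -1 on failure. */
typedef int (*probe_fn)(unsigned long *flags);

/* Hypothetical stub: behaves like a syscall the running kernel lacks. */
int probe_syscall_missing(unsigned long *flags)
{
    (void)flags;
    errno = ENOSYS;
    return -1;
}

/* Hypothetical stub: behaves like an older mechanism that always works. */
int probe_auxv_stub(unsigned long *flags)
{
    *flags = 0x5; /* made-up feature bits for the example */
    return 0;
}

/* Try each probe in order at run time; the first that succeeds wins. */
unsigned long detect_flags(const probe_fn *probes, int count)
{
    for (int i = 0; i < count; i++) {
        unsigned long flags = 0;
        if (probes[i](&flags) == 0)
            return flags;
    }
    return 0; /* nothing worked: report no optional features */
}
```

Both probes are compiled in, and the choice between them is made on the machine that runs the binary, not the one that built it.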
cubicibo has quit [Ping timeout: 256 seconds]
tufei has quit [Remote host closed the connection]
tufei has joined #ffmpeg-devel
<unlord> yeah, so making that change, now I get 422 functions run https://paste.debian.net/1324719/
<unlord> should I submit patches for this diff? https://paste.debian.net/1324720/
derpydoo has quit [Ping timeout: 248 seconds]
<wbs> unlord: line 29 should be |, not ||
<unlord> wbs: oh yeah!
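The log does not show line 29 of the paste, so the following is only a guess at the kind of mistake wbs spotted: when flag bits are combined with `||` the result collapses to 0 or 1, whereas `|` keeps every bit. The flag names and values are hypothetical.

```c
/* Hypothetical flag bits, chosen so that neither of them equals 1. */
enum {
    SKETCH_FLAG_I32 = 1 << 1,
    SKETCH_FLAG_F32 = 1 << 2
};

/* Correct: bitwise OR merges the individual bits. */
int combine_bitwise(int have_i32, int have_f32)
{
    return (have_i32 ? SKETCH_FLAG_I32 : 0) | (have_f32 ? SKETCH_FLAG_F32 : 0);
}

/* Typo variant: logical OR yields only 0 or 1, so the bits are lost. */
int combine_logical(int have_i32, int have_f32)
{
    return (have_i32 ? SKETCH_FLAG_I32 : 0) || (have_f32 ? SKETCH_FLAG_F32 : 0);
}
```

Here `combine_bitwise(1, 1)` gives 6, while `combine_logical(1, 1)` gives 1, which matches neither flag. Whether such a typo changes the visible result depends on which bits are tested later.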
HarshK23 has joined #ffmpeg-devel
Krowl has quit [Read error: Connection reset by peer]
MetaNova has quit [Quit: quit]
MetaNova has joined #ffmpeg-devel
feiw1 has joined #ffmpeg-devel
feiw2 has quit [Read error: Connection reset by peer]
Livio has quit [Ping timeout: 245 seconds]
jamrial has joined #ffmpeg-devel
Krowl has joined #ffmpeg-devel
<unlord> wbs: fixing that typo it still shows 422 tests run
<unlord> there are a few places where the RVV is slower than C though
<Lynne> a lot of functions were written for a 908
<unlord> yeah, I'm seeing different numbers on c908
<unlord> I guess I will finish fixing this for dav1d before I give suggestions here
* Sean_McG peeks in
<Sean_McG> Trac is still misbehaving for me today
lean58 has joined #ffmpeg-devel
<lean58> hi
<Sean_McG> hi lean
lean58 has left #ffmpeg-devel [#ffmpeg-devel]
<Sean_McG> short visit *shrug*
lean has joined #ffmpeg-devel
lean has left #ffmpeg-devel [#ffmpeg-devel]
lean has joined #ffmpeg-devel
<kurosu> ugh, if there are RVV functions that are slower than C, are we going to need RVV_${EXTENSION}_SLOW flags ?
<unlord> no, you just need to specialize for VLEN > 128
<kurosu> meaning, 2 versions of the same function, akin to SSEx vs AVX2 vs AVX512?
<kurosu> I imagine SVE2 is going to need that too
<unlord> not exactly, I show how you can do this with only 10 bytes for one of the MC blend functions in AV1 here: https://people.videolan.org/~unlord/Optimizing%20Software%20for%20RISC-V.pdf#page=27
<unlord> I have not looked at SVE2, but AFAICT there is only one vector length in hardware now?
<unlord> has anyone tried Graviton 4?
<wbs> unlord: for SVE1, there's 256 bit (on Graviton 3, and 512 bit for fujitsu a64fx), but only 128 bit for all known SVE2 implementations
<courmisch> kurosu: I have a few of those slow funcs but they are not merged
<unlord> wbs: he asked about SVE2 though
<unlord> does it make sense to add support for SVE1 ?
<kurosu> Well, SVE1 is already a case, and I would expect SVE2 implementations to have a similar variety of vector widths ?
<courmisch> the only functions that look slower now are those that don't scale down well (same issue on x86) and those where the benchmark is messed up (loop filters)
<wbs> kurosu: yes, in principle. the boring thing is that nobody has released an implementation with SVE2 with a vector length over 128 bit, yet
<unlord> vp9_dc_8x8_8bpp_c: 27.6
<unlord> vp9_dc_8x8_8bpp_rvv_i64: 37.9
Krowl has quit [Read error: Connection reset by peer]
<wbs> kurosu: so it's a bit hard to tune/measure to make sure it would behave reasonably for those vector lengths
<wbs> unlord: as for SVE1, if some function can be expressed with only those instructions, then it definitely makes sense to do that as well
<wbs> unlord: someone contributed a couple SVE1/2 opts for x264, where some were SVE1 and others were SVE2
<kurosu> Only if it makes sense for a company to fund SVE1 optimizations?
<unlord> wbs: true, I have not spent much time with SVE yet :(
<kurosu> I mean, that sounds mostly like AWS so far
<wbs> unlord: (most of the gains there came from widening loads and such, nothing really interesting)
<courmisch> vp9_dc_8x8 numbers are 46 for C and 41 for RVV, which is not great but still better than nothing. According to the author anyway (not me)
lean has left #ffmpeg-devel [#ffmpeg-devel]
<unlord> courmisch: yeah, I read the code and it can probably be improved
<courmisch> kurosu: SVE does not have group multipliers so that T-Head RVV bug won't affect it - no need to specialise
psykose has quit [Remote host closed the connection]
lean has joined #ffmpeg-devel
<courmisch> and even for RVV I'm pretty sure it's a teething issue. the intent of the spec is to follow VL, not LMUL
psykose has joined #ffmpeg-devel
<courmisch> SiFive optimisations don't specialise on VLEN either, so there's hoping their designs won't be broken like Alibaba's
<unlord> yeah, though I'm hoping for a "least common denominator" solution that isn't terrible everywhere
IndecisiveTurtle has joined #ffmpeg-devel
lean has left #ffmpeg-devel [#ffmpeg-devel]
Krowl has joined #ffmpeg-devel
<courmisch> unlord: and anyway, we can agree that the VP9 code is not best (not me)
psykose_ has joined #ffmpeg-devel
psykose has quit [Ping timeout: 244 seconds]
psykose_ is now known as psykose
<Sean_McG> Sebastinas: if I create patches to mask out the badly behaved AltiVec pieces, would you re-enable it for the next release? it really sucks that it had to be disabled wholesale for this one. I don't feel like I have the skill currently to fix the broken ones.
<Sean_McG> ppc64el is still considered a "release architecture", yes?
<Sean_McG> and also for everyone else, maybe FFmpeg next release can/should remove the big-endian PPC stuff. As much as it will break my heart to do it, I even volunteer for the job.
<JEEB> yea if it has been broken for quite a while and nobody has noticed it... q66 was at one point doing powerpc builds, but not sure if he cared about BE or just LE
<courmisch> for *years* VLC crashed if you don't have Altivec because some Mac developer hard-coded the Altivec flag
<Sean_McG> it wasn't noticeable until I added the QEMU PPC nodes, there was no big-endian coverage on FATE for quite some time before that
Traneptora_ has joined #ffmpeg-devel
Traneptora has quit [Read error: Connection reset by peer]
<q66> JEEB: *she and also we're still doing builds for both LE and BE but also we don't run the testsuite for ffmpeg in particular, in practice it works ok at least on LE
<q66> we do patch out a little altivec bit which does not build correctly under clang
<q66> idk if that has any impact on what you are facing
<q66> i do vaguely recall there being practical altivec-related breakage related to vec_xl so maybe?
<Sean_McG> yes because vec_xl has different semantics than vec_ld
<q66> this is only for the non-vsx case though, ie big endian
<q66> (because there is no LE ppc64 that does not have vsx in practice, as both VSX and LE are only really usable on power8 or maaaybe power7)
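The log does not say which semantic difference caused the breakage. One commonly documented difference is that `vec_ld` ignores the low four bits of the address and always loads from a 16-byte boundary, while `vec_xl` loads from the exact address given. The sketch below emulates just that difference in portable C; the function names are hypothetical and no AltiVec intrinsics are used.

```c
#include <stdint.h>
#include <string.h>

/* Emulates a vec_ld-style load: the address is rounded down to 16 bytes. */
void load_truncating16(uint8_t out[16], const uint8_t *p)
{
    const uint8_t *base = (const uint8_t *)((uintptr_t)p & ~(uintptr_t)15);
    memcpy(out, base, 16);
}

/* Emulates a vec_xl-style load: the 16 bytes start exactly at p. */
void load_exact16(uint8_t out[16], const uint8_t *p)
{
    memcpy(out, p, 16);
}
```

For a pointer 3 bytes past a 16-byte boundary, the first function returns data starting at the boundary and the second returns data starting at the pointer. Swapping one intrinsic for the other without adjusting the surrounding code can therefore change results.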
<Sean_McG> I am told that there are POWER8 configs that don't have a vector unit, but I imagine they are rare
<q66> there aren't
<q66> there are some chips that implement the same ISA level and don't have a VSX unit, but these also cannot run little endian because their VMX unit is big endian only (the NXP qoriq stuff)
<q66> (e6500)
<Sebastinas> Sean_McG: ppc64el still is, yes. If altivec support is fixed, I can reenable it.
<Sean_McG> Sebastinas: not sure I have the skill to fix the IDCT, but maybe it can just be disabled for the configs where it fails
<q66> and on power7 you can technically do little endian but it's wonky because one of the guarantees power8 brought and everything in practice relies on is that you have safe unaligned memory access
<Sean_McG> q66: ouch, yeah.
<q66> on power7 it will trap and i think either the kernel does not have a fixup handler, or it will be very slow
<Sebastinas> But if it starts failing again, I will just disable it. If it takes two years for somebody to start looking at it, I'd better spend my time elsewhere than to chase fixes for ppc64el.
* Sean_McG sighs
<Sean_McG> OK.
<Sean_McG> tempted to just `git rm lib*/ppc` then
<q66> if i set up my home power9 box again i can maybe try running the tests myself
<q66> i haven't gotten around to it yet
<q66> but i wanted to test some things on that machine so
<Sean_McG> I have access to a POWER9 via the GNU compile farm, I'm just not permitted to use it as a CI node
<q66> i have two boxes at home actually and neither is set up rn :p
<q66> i wanted to uhh
<q66> try getting the intel dedicated gpus to work
<Sean_McG> lucky, I've flirted with buying a Raptor Blackbird but I can't stomach $5K for it
<q66> i used to run it as a workstation but nowadays i just run an 80-core ampere altra because it's like twice as fast and uses fifth of the power
<q66> even though gpus on altra is a special type of hell because unlike ibm's impl on power, somehow every pcie implementation in arm is scuffed as hell
<Sebastinas> Sean_McG: sigh indeed. It's frustrating when bugs sit unattended for that log. But to me it looks like that nobody (except you) seems to care about ppc64el.
<q66> so it needs some pretty cursed kernel patches involving fixup handlers and pcie errata fixes to get gpus to work without displaying random garbage around your screen
<Sebastinas> s/log/long/
<Sean_McG> Sebastinas: probably accurate assessment.
<q66> i do care but i'm stretched a bit thin
<q66> and ffmpeg hasn't really been on my radar too much because i maintain a whole distro and i can't put time into everything
<Sean_McG> fair enough
mkver has quit [Ping timeout: 276 seconds]
Krowl has quit [Read error: Connection reset by peer]
Livio has joined #ffmpeg-devel
ccawley2011 has joined #ffmpeg-devel
rvalue- has joined #ffmpeg-devel
rvalue has quit [Ping timeout: 255 seconds]
rvalue- is now known as rvalue
<courmisch> hmm, better than GCC ICE, GCC OOM
<Sean_McG> yikes
<courmisch> Linux opensoars. nohup gcc -> kernel panic
<JEEB> q66: sorry for the mistake, that was not on purpose :) and OK, so things worked with some minor patching for you with both LE and BE
<q66> JEEB: well
<q66> the tests probably don't pass
<q66> but in practice it does not look broken
<q66> at least for LE anyway
<q66> at least for the videos i dealt with
<q66> but yea when i have the machine up maybe i can give the suite a spin on both endians
Krowl has joined #ffmpeg-devel
<q66> assuming i can dig up a GPU that actually works on BE
<q66> i think i took at least one to spain
<JEEB> yea, that sounds like something that gets very little testing so really easy to break
<q66> the set of GPUs that actually work on BE is like, super narrow
<JEEB> (or: never worked)
<JEEB> yea
<q66> even then it's "works" under some definitions of "works"
<q66> anything newer than ati terascale (i.e. gcn 1.0 and newer == radeon 8xxx and newer) == bust as in it does not probe
<q66> anything older == bust as in it's too buggy
<q66> anything nvidia == don't even try lol
<q66> and that narrow band is like, good enough to run a desktop and some videos but 3d applications gonna have random bugs
<q66> specific old nvidias at one point "worked" but haven't for years
<courmisch> big endian is dead
<JEEB> effectively pretty much, yea
<courmisch> the only outstanding use case is routers
<courmisch> and even then, I feel like the love of big endian is cargo cult
<Sean_McG> so if I was to excise the BE powerpc stuff from FF, I wouldn't get any resistance/grief on it?
<q66> you would because big endian fans like to be very loud about it :P
<q66> myself i'm like uhh
<Lynne> there were some users who ran ffmpeg on their wii
<Sean_McG> the Wii can't do AltiVec, it has no vector unit
<JEEB> yea but if the stuff doesn't build right now :D
<q66> i recognize that few people care anymore (i don't really care that much about BE specifically either)
<q66> but also like
<Sean_McG> (it's basically G3-derived)
<JEEB> (or work, one or the other)
<q66> i like to make sure stuff works correctly anyway
<q66> because usually it results in code that's less of a mess
<Lynne> but that was years ago, and IMO, if users really need ffmpeg on BE, they should pull old versions
<Sean_McG> ^
<courmisch> my religion does not outlaw big endian, but there is pretty much nothing using it and thus it's impossible to test
<JEEB> Sean_McG: basically if you're starting to remove something that doesn't seem to work for ages, I don't think anyone will hurr a durr much
<Sean_McG> OK.
<q66> also, most stuff actually still works reasonably well, it's just that the stuff that does not work is pretty visible
<q66> if you really wanted you could have a reasonably working big endian workstation
<courmisch> JEEB: bold statement. You seem to forget that this is FFmpeg
<JEEB> :D
<courmisch> where somebody cares about Alpha and AVR32
<Sean_McG> I've been looking for a G5, no luck yet (those don't do VSX though, too early a POWER ISA version)
<JEEB> if altivec has literally been broken for years and we haven't received reports then... eh
<q66> G5s are pretty easy to find and kinda fun to fuck around with, at least the PCIe ones
<courmisch> 20 years ago, I tried to boot NetBSD for PDP-11 on emulator
<courmisch> didn't even work
<Curid> add a bug that breaks everything on BE and see how long it takes until someone notices
<Sean_McG> JEEB: it's not _totally_ broken, just some unfortunate failures
<q66> i set up one that had uhh... a 2013-era GPU, USB3, root on NVMe, and 10G networking
<q66> : ^ )
<q66> (it was still slow as hell)
<q66> they're barely fast enough to play 1080p h264 video in software :)
<Sean_McG> and I hear the system fans are obnoxious
<courmisch> big endian GPU is even more dead than big endian CPU
<JEEB> Sean_McG: right. so the broken things would get das boot
<Sean_McG> but I can just put it in my basement
<courmisch> it's deader than dead
<Lynne> qemu has big endian x86 emulator if you're interested in forbidden satanic summoning rituals
<JEEB> lol BE x86
<q66> i've been looking for some of my old pics
<Sean_McG> BE x86, oh dear
<JEEB> doom3 <3
<Sean_McG> indeed
<courmisch> I mean, if you want to one-up, you can try mixed-endian
<q66> Sean_McG: the fans are quiet in idle and very loud under load
<q66> they're worse on G4s
<Sean_McG> I had a Quicksilver G4 many moons ago
<q66> <courmisch> big endian GPU is even more dead than big endian CPU
<q66> probably wouldn't be that hard to fix up amdgpu to at least baseline work
<Sean_McG> and I have a G4 Mac Mini sitting right here on my desk, but I think the GPU is stone cold dead on it
<courmisch> I don't think AMD/ATI ever made big endian GPUs
<q66> doesn't really matter
<Sean_McG> which is weird, because it does not Sad Mac on boot
<q66> it means that stuff will run inherently slower (no direct gpu buffers and whatever for you) but other than that the driver will abstract that away for you
Teukka has quit [Read error: Connection reset by peer]
<q66> chances are you won't be able to get opengl beyond 4.1 either
<q66> but baseline functionality is doable
<Sean_McG> the project I am working on does not require substantial grunt
<q66> the issue with amdgpu right now is that some of the firmware loading logic is bust and as a result the hw won't probe
<q66> works fine on ppc64le
<Sean_McG> I help with ONScripter-EN, a drop-in replacement game engine for Japanese visual novels
<q66> fun
<Lynne> amdgpu has no shortage of issues, I can tell you that better than anyone
<q66> amdgpu is an *awful* codebase
<q66> and the development practices are totally wack
<Lynne> its fine, its driver-quality
<Lynne> the issue simply is it's buggy
<q66> just throwing shit at a wall until it sticks
<q66> (and in the process breaks something else for somebody else)
Teukka has joined #ffmpeg-devel
Teukka has quit [Changing host]
Teukka has joined #ffmpeg-devel
<q66> (then repeat the process)
<Lynne> they don't give enough information to developers
<q66> especially the display core is total garbage
<q66> it even requires hardware floating point
<Sean_McG> q66: https://everymac.com/systems/apple/powermac_g5/specs/powermac_g5_dual_2.3.html <-- this is/was yours? I did like those cases
<q66> in the kernel lol
<q66> and until relatively recently it was mixing fp and non-fp code in the same TUs, with guards like
<q66> DC_FP_ENABLE(); call_some_float_shit(); DC_FP_DISABLE();
<q66> which broke hilariously on e.g. aarch64 because the compiler is allowed to spill everywhere
<Sean_McG> ouch.
<q66> so you'd get stuff yielding
<q66> and you'd have some code finishing on different cores than where it started
<q66> which resulted in some really fun corruption
<q66> when people started caring about aarch64 amdgpu more, it finally got moved into separate TUs
<q66> but that was done by people outside AMD, AMD don't give a shit
<q66> in fact they just kept breaking it the same way for newer gens :P
<q66> Sean_McG: yea that's the one i had
<q66> or well, still have
<q66> it's just in a basement in another country
mkver has joined #ffmpeg-devel
Workl has joined #ffmpeg-devel
<q66> besides the float stuff the DC also has some hilariously large structures which totally blows up the stack size depending on your compiler/configuration
Krowl has quit [Ping timeout: 265 seconds]
<q66> nowadays not enough to be unsafe, but still takes up like 2k
<q66> used to be like 5k with some older versions of clang and LTO on some archs
cone-503 has joined #ffmpeg-devel
<cone-503> ffmpeg James Almer master:afb06aef7ebe: avcodec/decode: remove unused argument from ff_frame_new_side_data_from_buf()
feiw2 has joined #ffmpeg-devel
feiw1 has quit [Remote host closed the connection]
Workl has quit [Read error: Connection reset by peer]
Livio has quit [Ping timeout: 264 seconds]
IndecisiveTurtle has quit [Ping timeout: 264 seconds]
Livio has joined #ffmpeg-devel
Warcop has joined #ffmpeg-devel
<cone-503> ffmpeg Rémi Denis-Courmont master:56fc5fc6ce9b: lavc/vp9dsp: restrict vertical intra pointers
<cone-503> ffmpeg Rémi Denis-Courmont master:c98127c00eab: lavc/vp9dsp: use restrict qualifier for copy/avg MC
<cone-503> ffmpeg Rémi Denis-Courmont master:7aa6510fe1fc: lavc/vp9dsp: copy 8 pixels at once
<cone-503> ffmpeg Rémi Denis-Courmont master:7b24f96c8793: lavc/vp9dsp: remove R-V I intra functions
<unlord> courmisch: thanks for your comments on my patch, what is the next step? Do I need to send a second patch to the ML updating the units?
IndecisiveTurtle has joined #ffmpeg-devel
<courmisch> unlord: I think a separate patch is OK? it's really a bug from the clock_gettime addition, not RISC-V
<courmisch> `if (&clock_gettime)` WTF?
<courmisch> does the compiler magically guess that clock_gettime is a weak symbol and *can* be NULL? in ISO C, that statement is a truism
<unlord> courmisch: right, I see where FF_TIMER_UNITS should be fixed on line 87 of libavutil/timer.h
<unlord> courmisch: that if test is for __APPLE__ only right?
<courmisch> I would guess so, because I don't think clock_gettime() can be a weak symbol on Linux
<courmisch> but that does not change the fact that the expression is sketchy from C language standpoint
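For context on why an address test like this can be meaningful at all: with the GCC/Clang `weak` attribute (an extension, not ISO C), an undefined function resolves to a null address at link time on ELF targets, so the compiler keeps the check. A minimal sketch, assuming an ELF platform such as Linux. `sketch_maybe_missing` is a hypothetical function that nothing defines, and this is not the code in libavutil/timer.h.

```c
#include <stddef.h>

/* Weak declaration: if no library defines it, its address is NULL. */
extern int sketch_maybe_missing(void) __attribute__((weak));

/* Call the function only if the linker actually found it. */
int call_if_present(int fallback)
{
    if (sketch_maybe_missing != NULL)
        return sketch_maybe_missing();
    return fallback;
}
```

Without the attribute, the compiler is entitled to assume the address of a declared function is never null and to drop the test, which is the concern courmisch raises.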
feiw1 has joined #ffmpeg-devel
feiw2 has quit [Ping timeout: 260 seconds]
Livio has quit [Ping timeout: 276 seconds]
Livio has joined #ffmpeg-devel
MikhailAMD has joined #ffmpeg-devel
MikhailAMD has quit [Client Quit]
aaabbb_ has quit [Ping timeout: 272 seconds]
IndecisiveTurtle has quit [Ping timeout: 245 seconds]
aaabbb_ has joined #ffmpeg-devel
cone-503 has quit [Quit: transmission timeout]
Livio has quit [Ping timeout: 248 seconds]
cone-930 has joined #ffmpeg-devel
<cone-930> ffmpeg James Almer release/5.1:a937b3c58bab: swsresample/swresample: error out on invalid layouts
ccawley2011 has quit [Read error: Connection reset by peer]
feiw2 has joined #ffmpeg-devel
feiw1 has quit [Read error: Connection reset by peer]
AbleBacon has joined #ffmpeg-devel