michaelni changed the topic of #ffmpeg-devel to: Welcome to the FFmpeg development channel | Questions about using FFmpeg or developing with libav* libs should be asked in #ffmpeg | This channel is publicly logged | FFmpeg 7.1 has been released! | Please read ffmpeg.org/developer.html#Code-of-conduct
System_Error has quit [Remote host closed the connection]
uau has joined #ffmpeg-devel
System_Error has joined #ffmpeg-devel
System_Error has quit [Remote host closed the connection]
System_Error has joined #ffmpeg-devel
qwertyttyert has joined #ffmpeg-devel
jamrial has quit []
jamrial has joined #ffmpeg-devel
Marth64 has quit [Quit: Leaving]
<qwertyttyert>
Windows LAV 0.79.2 cannot decode output from ffmpeg-7.0.2 "-ac 1 -c:a aac -profile:a aac_he -b:a 24k" or "-ac 1 -c:a aac -profile:a aac_he_v2 -b:a 32k". VLC 3.0.21 on Windows can. VLC 3.0.30 on Linux Mint cannot.
<haasn>
TD-Linux: turns out my register stack spilling was a non-issue
<haasn>
because I never need both the input and output pointers at the same time except for the merged read_write fast path, but those don't get called per chunk but per line
<haasn>
so the worst case currently is 6 parameters to a function call
<haasn>
4 plane pointers, 1 output pointer and a priv pointer
<haasn>
I can also get away without needing a stride, ever, because I found a very clever design for how to do the filter steps
<haasn>
define a ring buffer like chunk_t buffer[TAPS]; populate it first using the single-line read() callback and then run filter() directly on the ring buffer
<haasn>
that way it will have a stride fixed at compile time
<haasn>
and a pretty small one too
<haasn>
so this should be great for cache locality
<haasn>
although it does require iterating the input image in vertical stripes instead of horizontal rows, this is trivial to do because all of the functions just operate on 32 pixel chunks
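A minimal C sketch of the ring-buffer idea described above, not haasn's actual code. It assumes float chunks and an unpadded 4-tap vertical filter; the names `read_chunk`, `filter`, `filter_stripe` and the value of `TAPS` are hypothetical. The point it illustrates is that the filter only ever indexes `ring[]`, whose row distance is `sizeof(chunk_t)` and therefore fixed at compile time.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK 32  /* pixels per chunk, as mentioned in the discussion */
#define TAPS   4  /* hypothetical number of vertical filter taps */

typedef struct { float px[CHUNK]; } chunk_t;

/* Hypothetical single-line read(): load one 32-pixel chunk from row y. */
static void read_chunk(chunk_t *dst, const uint8_t *src, int stride,
                       int x0, int y)
{
    for (int i = 0; i < CHUNK; i++)
        dst->px[i] = src[y * stride + x0 + i];
}

/* Vertical filter over the ring buffer. Rows are sizeof(chunk_t) apart,
 * so no stride parameter is needed. `oldest` is the slot that holds the
 * topmost row of the current window. */
static void filter(chunk_t *out, const chunk_t ring[TAPS], int oldest,
                   const float w[TAPS])
{
    for (int i = 0; i < CHUNK; i++) {
        float acc = 0.0f;
        for (int t = 0; t < TAPS; t++)
            acc += w[t] * ring[(oldest + t) % TAPS].px[i];
        out->px[i] = acc;
    }
}

/* Process one 32-pixel-wide vertical stripe starting at column x0.
 * Writes height - TAPS + 1 output chunks (no edge padding in this sketch). */
static void filter_stripe(chunk_t *out, const uint8_t *src, int stride,
                          int x0, int height, const float w[TAPS])
{
    chunk_t ring[TAPS];
    assert(height >= TAPS);

    /* Prime the ring buffer with the first TAPS - 1 rows. */
    for (int y = 0; y < TAPS - 1; y++)
        read_chunk(&ring[y], src, stride, x0, y);

    /* Each new row overwrites the oldest slot, then the window is filtered. */
    for (int y = TAPS - 1; y < height; y++) {
        read_chunk(&ring[y % TAPS], src, stride, x0, y);
        filter(&out[y - (TAPS - 1)], ring, (y + 1) % TAPS, w);
    }
}
```

Calling `filter_stripe()` once per 32-pixel column block walks the image in vertical stripes; only the read callback ever sees the image stride.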
<haasn>
so far the only thing about the whole approach that gives me a slight pause is the insane number of function calls we're making
<haasn>
this is completely fine on even older x86 CPUs I tested, since they collapse down to a call/ret pair in practice
<haasn>
but maybe other archs have problems with massive amounts of function calls (on the order of 100k per frame)
<haasn>
(it's still faster than swscale either way)
cone-080 has joined #ffmpeg-devel
<cone-080>
ffmpeg Andreas Rheinhardt master:3797e9239e6e: swscale/x86/swscale: Move some constants to rgb2rgb.c
<cone-080>
ffmpeg Andreas Rheinhardt master:452d6738b507: fftools/ffmpeg_opt: Remove audio_drift_threshold
<cone-080>
ffmpeg Andreas Rheinhardt master:10047fea1c47: avutil/cpu: Disable ff_getauxval() on x86
<cone-080>
ffmpeg Andreas Rheinhardt master:4afe61ea6c15: swscale/x86/swscale: Make M24 variables static
ngaullier has joined #ffmpeg-devel
ccawley2011 has quit [Ping timeout: 244 seconds]
tufei has quit [Remote host closed the connection]
realies has quit [Quit: ~]
ccawley2011 has joined #ffmpeg-devel
realies has joined #ffmpeg-devel
<fflogger>
[newticket] Jay123210599: Ticket #11443 ([undetermined] PNG and APNG Incorrect Colors Because of Tags) created https://trac.ffmpeg.org/ticket/11443
<fflogger>
[newticket] IncrediBlame: Ticket #11444 ([ffmpeg] Overflow-2 in start_time_realtime calculation in rtsp.c) created https://trac.ffmpeg.org/ticket/11444
tufei has joined #ffmpeg-devel
sdc has quit [Ping timeout: 248 seconds]
kylophone has quit [Ping timeout: 248 seconds]
kylophone has joined #ffmpeg-devel
sdc has joined #ffmpeg-devel
haasn has quit [Ping timeout: 248 seconds]
kurosu has quit [Ping timeout: 248 seconds]
kurosu has joined #ffmpeg-devel
haasn has joined #ffmpeg-devel
sr55 is now known as s55
s55 has quit [Changing host]
s55 has joined #ffmpeg-devel
ccawley2011 has quit [Ping timeout: 252 seconds]
<haasn>
Is there a way to compile a file multiple times with different compile-time definitions (-D) set using ffmpeg makefiles?
<haasn>
I want to avoid having a file to #include the template twice because that requires wrapping all static symbols to avoid collisions
<jamrial>
don't think so
<haasn>
I suppose I could just create multiple implementation files that just include the template once
Mirarora has quit [Quit: Mirarora encountered a fatal error and needs to close]
<Traneptora>
haasn: why are we #including C files anyway
<Traneptora>
if there would be collisions then #including them twice is going to duplicate code
<haasn>
Traneptora: types differ between implementations
<haasn>
this is for templating the same file as 8 and 16 bpc
<haasn>
if I define static helper(pixel_t x); it collides
<haasn>
ditto for "static" types (which aren't a thing)
<BizzaroLeader>
boost your C skills today!, free course next corner on street!
<BizzaroLeader>
alternative is C++/Rust, patches welcome!
<Traneptora>
haasn: if you're #including it twice, once at 8bit once at 16bit why can't you just do `#define helper(depth) helper_##depth`
<Traneptora>
and then do static helper(8)(pixel_t x);
<Traneptora>
or something similar to that
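A rough C sketch of the suffix-pasting approach suggested above. Because a single listing cannot #include a separate template file, a function-generating macro (`TEMPLATE`, hypothetical) stands in for the template here; `FN`, `clip` and `add_clipped` are likewise made-up names, not FFmpeg symbols.

```c
#include <stdint.h>

/* Paste a bit-depth suffix onto a symbol name: FN(clip, 8) -> clip_8. */
#define FN(name, depth) name##_##depth

/* Stand-in for the template file. In a real tree this body would live in
 * its own file and be #included once per depth with different defines. */
#define TEMPLATE(depth, pixel_t)                                        \
    static pixel_t FN(clip, depth)(int v)                               \
    {                                                                   \
        const int max = (1 << (depth)) - 1;                             \
        return (pixel_t)(v < 0 ? 0 : v > max ? max : v);                \
    }                                                                   \
    /* Static helpers call each other through the same FN() wrapper. */ \
    static pixel_t FN(add_clipped, depth)(pixel_t a, pixel_t b)         \
    {                                                                   \
        return FN(clip, depth)((int)a + (int)b);                        \
    }

TEMPLATE(8,  uint8_t)   /* defines clip_8,  add_clipped_8  */
TEMPLATE(16, uint16_t)  /* defines clip_16, add_clipped_16 */
```

Every static symbol has to go through `FN()`, which is the wrapping overhead haasn wanted to avoid; the sketch only shows that the two instantiations no longer collide.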
<Traneptora>
also, if FFmpeg is on C11, why not use generics?
<Traneptora>
then later uint16_t foo; helper(foo); will replace helper with helper16
<Traneptora>
it's a C11 feature and I wasn't sure if we had adopted those yet. I think we did switch to C11 in other parts of the code but we're still on C90 in some ways
<BizzaroLeader>
well, it works for very trivial cases, but what if you need to have different expressions depending on type, for example if you use same template for float and ints?
<BizzaroLeader>
hmm, this may be even better, dunno how compilers handle it...
<haasn>
Traneptora: that gets awkward for types
<haasn>
Which was the breaking point for me
<Traneptora>
haasn: wdym it gets awkward for types
<Traneptora>
this is only awkward if you typedef pixel_t to a specific actual type
<Traneptora>
which you would be avoiding if you used _Generic to dispatch instead
<BizzaroLeader>
but would using generics allow generating specialized functions, so the code gets faster instead of slower as more types are used?
<Traneptora>
if you were to use generics you'd basically have preprocessor macros create copies of the code with _uint8_t suffixes, copies with _uint16_t suffixes, etc.
<Traneptora>
and then you just dispatch to them with _Generic at the highest level possible
<Traneptora>
it may not be worth it, was just a thought
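A small C11 sketch of the `_Generic` dispatch described above, under the assumption that type-suffixed copies are generated by a macro. `DEFINE_TO_FLOAT` and `to_float` are hypothetical names chosen for illustration.

```c
#include <stdint.h>

/* Generate one type-suffixed copy per pixel type, e.g. to_float_uint8_t(). */
#define DEFINE_TO_FLOAT(type, maxval)                \
    static float to_float_##type(type x)             \
    {                                                \
        return (float)x / (float)(maxval);           \
    }

DEFINE_TO_FLOAT(uint8_t,  255)
DEFINE_TO_FLOAT(uint16_t, 65535)

/* Pick the matching copy from the static type of the argument. */
#define to_float(x) _Generic((x),                    \
        uint8_t:  to_float_uint8_t,                  \
        uint16_t: to_float_uint16_t)(x)
```

With `uint16_t foo;`, `to_float(foo)` resolves to `to_float_uint16_t(foo)` at compile time. Note that an expression such as `foo + 1` has type `int` after integer promotion and would match neither association, so the dispatch only works on unpromoted values.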
<BtbN>
jamrial: something I just wondered: Couldn't ff_isom_write_hvcc now detect if an lhvC is needed? It's parsing enough of the vps extension now for that. That'd make manually setting the disposition unnecessary.
<jamrial>
BtbN: i guess. it would need to also ensure there are sps/pps for the second layer
<haasn>
BizzaroLeader: btw, I came up with a good idea for how to do filtering, but I didn't yet implement it to see if it's fast enough
ngaullier has quit [Ping timeout: 244 seconds]
<BizzaroLeader>
silent!, make a patent first!
<haasn>
BizzaroLeader: btw, did you test if using float intermediates instead of integer math for everything is faster?
<BizzaroLeader>
i use float for floats, fixed for ints
<haasn>
I am thinking that it may even be faster to use 32-bit float internally instead of 15-bit integer math like swscale currently does
<BizzaroLeader>
mixing will not work well
<BizzaroLeader>
at least for scaling - bilinear
<haasn>
for the simple reason that you need fewer float ops to do the equivalent of several integer ops in a row
<BizzaroLeader>
hmm, but float is still slow on CPUs
<haasn>
you skip shifting, zero extending, rounding
<haasn>
I will implement it and benchmark
<haasn>
with float intermediate
<haasn>
it would also simplify codegen because you would need only a single float path rather than a separate high bit and low bit depth path
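To make the "fewer ops" argument concrete, here is a hedged C sketch of one two-sample blend done both ways. It is not swscale code: the 14-bit weight precision (`SHIFT`) and the function names are assumptions, and the sketch says nothing about which path is actually faster, which is what the benchmark is meant to find out.

```c
#include <stdint.h>

#define SHIFT 14  /* hypothetical fixed-point weight precision */

/* Fixed-point path: widen, multiply, add rounding bias, shift, clip. */
static uint8_t blend_fixed(uint8_t a, uint8_t b, int wa /* 0..1 << SHIFT */)
{
    int acc = a * wa + b * ((1 << SHIFT) - wa);
    acc = (acc + (1 << (SHIFT - 1))) >> SHIFT;
    return (uint8_t)(acc < 0 ? 0 : acc > 255 ? 255 : acc);
}

/* Float path: the same blend with no shifting or explicit widening.
 * Rounding happens once, at the final conversion back to 8 bits. */
static uint8_t blend_float(uint8_t a, uint8_t b, float wa /* 0..1 */)
{
    float acc = a * wa + b * (1.0f - wa);
    return (uint8_t)(acc + 0.5f);
}
```

In a longer chain of operations the float version would keep `acc` in float throughout and convert only at the end, while the integer version repeats the round-and-shift step after each stage.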
<wbs>
it probably depends heavily on the cpu type. old old x86 were slow on floats. modern x86 are decent. the question is how it behaves on e.g. i686 builds that use x87 (not sse2), and e.g. small arm cores. the x86 case is probably not important, mostly good to know. the small arm core case may be more relevant
<wbs>
that is, i686 x87 build, but on not ancient hw (<15 years old)
bencoh has joined #ffmpeg-devel
Mirarora has joined #ffmpeg-devel
DauntlessOne4 has joined #ffmpeg-devel
zsoltiv_ has joined #ffmpeg-devel
zsoltiv has joined #ffmpeg-devel
microlappy has joined #ffmpeg-devel
BizzaroLeader has quit [Ping timeout: 240 seconds]
<haasn>
wbs: I was planning on compiling for AVX2 as a baseline, with a separate (possibly smaller) fallback implementation only for ancient CPUs
microlappy has quit [Quit: Konversation terminated!]
<haasn>
speaking of which, how can I coax certain files to be built with differing CFLAGS? I tried foo.o: CFLAGS += -skdjf and $(SUBDIR)foo.o: CFLAGS += -kkjdk
<haasn>
but neither triggers
<haasn>
I see other makefiles doing it so it must work _somehow_
BizzaroLeader has joined #ffmpeg-devel
<cone-080>
ffmpeg Diego de Souza master:7454a07d583a: avutil/hwcontext_cuda: add 4:2:2 pixel format support
<cone-080>
ffmpeg Diego de Souza master:30e6effff94c: avcodec/nvdec: add 4:2:2 decoding and 10-bit support
<cone-080>
ffmpeg Diego de Souza master:7e9655800da3: avcodec/cuviddec: add HEVC/H.264 4:2:2 and H.264 10-bit support
<cone-080>
ffmpeg Diego de Souza master:2cfef29f9780: avcodec/nvenc: add 4:2:2 encoding and H.264 10-bit support
<cone-080>
ffmpeg Diego de Souza master:ed80e5558601: avcodec/nvenc: add UHQ to AV1 for NVENC
<cone-080>
ffmpeg Diego de Souza master:a583f7e2fd45: avcodec/nvenc: add Temporal Filtering for AV1 and H.264 in NVENC
<cone-080>
ffmpeg Timo Rothenpieler master:89b37b4dcb2d: avcodec/nvenc: use encoder level options for qmin/qmax
<cone-080>
ffmpeg Timo Rothenpieler master:b37606e56220: avcodec/nvenc: finalize SDK 13.0 support