michaelni changed the topic of #ffmpeg-devel to: Welcome to the FFmpeg development channel | Questions about using FFmpeg or developing with libav* libs should be asked in #ffmpeg | This channel is publicly logged | FFmpeg 7.1 has been released! | Please read ffmpeg.org/developer.html#Code-of-conduct
iive has quit [Quit: They came for me...]
System_Error has quit [Remote host closed the connection]
System_Error has joined #ffmpeg-devel
lemourin has joined #ffmpeg-devel
lemourin has quit [Killed (molybdenum.libera.chat (Nickname regained by services))]
realies has joined #ffmpeg-devel
lemourin is now known as Guest8647
lemourin has joined #ffmpeg-devel
Guest8647 has quit [Killed (erbium.libera.chat (Nickname regained by services))]
realies has quit [Ping timeout: 244 seconds]
haihao has quit [Ping timeout: 248 seconds]
haihao has joined #ffmpeg-devel
realies has joined #ffmpeg-devel
arch1t3cht7 has joined #ffmpeg-devel
cone-288 has quit [Quit: transmission timeout]
arch1t3cht has quit [Ping timeout: 252 seconds]
arch1t3cht7 is now known as arch1t3cht
<pross>
what does constant linesize mean? does your encoder expect every frame to have identical frame->linesize[] values?
<pross>
most, if not all encoders, assume the input dimensions will not change after the init call
thilo has quit [Ping timeout: 248 seconds]
thilo has joined #ffmpeg-devel
thilo has quit [Changing host]
thilo has joined #ffmpeg-devel
<Traneptora>
pross: I mean specifically that frame->linesize won't change from frame to frame
<Traneptora>
more specifically, if subsequent frames have a negative linesize but the first frame is positive, or vice versa, it may cause issues
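For readers unfamiliar with the convention: a negative linesize describes a bottom-up image, where the data pointer addresses the last row stored in memory and each step to the next row moves backwards. A minimal Python sketch of that addressing, assuming a flat byte buffer; `row_offset` and `read_rows` are hypothetical helpers, not FFmpeg API:

```python
def row_offset(base: int, linesize: int, y: int) -> int:
    """Byte offset of row y, given the offset of row 0 and a signed stride."""
    return base + y * linesize


def read_rows(buf: bytes, base: int, linesize: int, width: int, height: int):
    """Return the rows of an image in display order."""
    rows = []
    for y in range(height):
        start = row_offset(base, linesize, y)
        rows.append(buf[start:start + width])
    return rows


# Three rows of four bytes each, stored one after another in memory.
buf = bytes([0] * 4 + [1] * 4 + [2] * 4)

top_down = read_rows(buf, base=0, linesize=4, width=4, height=3)
# Same memory viewed bottom-up: row 0 starts at the last stored row.
bottom_up = read_rows(buf, base=8, linesize=-4, width=4, height=3)
```

Code that caches the stride (or its sign) from the first frame and reuses it would address the wrong rows if a later frame flips the sign, which is presumably the kind of issue meant here.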
lemourin has joined #ffmpeg-devel
lemourin has quit [Killed (lead.libera.chat (Nickname regained by services))]
Traneptora has quit [Quit: Quit]
Marth64 has quit [Quit: Leaving]
realies has quit [Quit: ~]
realies has joined #ffmpeg-devel
System_Error has quit [Remote host closed the connection]
<kurosu>
the rv patch to rv40dsp makes it sound like the h264chroma test is broken, and it does perplex me: for 8 bits, it fills pixels with [0..3] and for >8, [0..255]
<kurosu>
Not going to test much overflow
<elenril>
Lynne: so why was b6a6e2b19da needed?
<elenril>
it conflicts with the documentation for avcodec_get_hw_frames_parameters(), which says "The function is stateless, and does not change the AVCodecContext"
<elenril>
commenting out that block does not seem to break vulkan decoding
Cheetahze has quit [Quit: Connection closed for inactivity]
Raz- has quit [Remote host closed the connection]
System_Error has quit [*.net *.split]
j45_ has joined #ffmpeg-devel
j45 has quit [Ping timeout: 248 seconds]
j45_ is now known as j45
j45 has quit [Changing host]
j45 has joined #ffmpeg-devel
<fflogger>
[newticket] gegege: Ticket #11333 ([undetermined] Using c# code to call version 7.1 ffmpeg, the video will stop after converting only a small portion) created https://trac.ffmpeg.org/ticket/11333
<fflogger>
[editedticket] gegege: Ticket #11333 ([undetermined] Using c# code to call version 7.1 ffmpeg, the video will stop after converting only a small portion) updated https://trac.ffmpeg.org/ticket/11333#comment:1
kasper93 has quit [Ping timeout: 272 seconds]
kasper93 has joined #ffmpeg-devel
jamrial has joined #ffmpeg-devel
^Neo has joined #ffmpeg-devel
Daemon404 has left #ffmpeg-devel [#ffmpeg-devel]
ngaullier has joined #ffmpeg-devel
ngaullier has quit [Ping timeout: 265 seconds]
<Lynne>
elenril: I believe that was needed for mpv to work
<elenril>
yeah, I figured it out eventually
<elenril>
it breaks things
<Lynne>
specifically, we want to keep state allocated either during frame_params or decoder init
ngaullier has joined #ffmpeg-devel
<Lynne>
basically to avoid reinitializing everything between those two functions
Sean_McG- has joined #ffmpeg-devel
<ngaullier>
Hello, it seems subscription to the ML is broken: no confirmation email was sent (I tried to subscribe an alternative email yesterday to work around an SMTP issue I currently have).
Sean_McG has quit [Ping timeout: 260 seconds]
Sean_McG- is now known as Sean_McG
<elenril>
Lynne: is it actually expensive?
ngaullier has quit [Ping timeout: 244 seconds]
ngaullier has joined #ffmpeg-devel
<Compn>
ngaullier, what email did you try to subscribe?
<Compn>
ngaullier, i can check it, or add an email manually to the list
<ngaullier>
This is ffnicolasg@sfr.fr ; I tried to post anyway, so I have some emails pending and it would be nice to make them pass. Thanks to you!
<Compn>
that email is not a member
<Compn>
and there are no mails in the moderation queue from that email either.
<Compn>
i sent a subscription invite
<Compn>
michaelni will have to look at the logs to see what's up.
cone-682 has quit [Quit: transmission timeout]
Traneptora has joined #ffmpeg-devel
ngaullier has quit [Remote host closed the connection]
ngaullier has joined #ffmpeg-devel
<BBB>
haasn: dropping 2/6 sounds correct to me. I'm not trying to be difficult, I just believe the commit does the wrong thing :)
<Compn>
ngaullier, did you get any invite/subscribe mails from the list? i sent both
<BBB>
maybe I'm not explaining myself enough in the review (same for Traneptora?), I can explain further if that's helpful
<Traneptora>
I just logged into IRC, is this the trc function thing?
<BBB>
yes
<Traneptora>
my complaint was that if we haven't added something to FFmpeg yet but it later gets added to H.273, someone using FFmpeg might call with a value >= AV_TRC_NB
<BBB>
that check is left in though
<haasn>
BBB: I’d be curious to hear what you think the commit does differently
<BBB>
haasn: nothing :) I'm re-reading it now
<Traneptora>
simply returning the function is identical to doing if (!func) return NULL; else return func;
<BBB>
I read it as "it's accessing the NULL pointer" but I'm not reading it correctly
<Traneptora>
I care more about the bounds-check
<BBB>
the bounds check is kept, so maybe it's fine
<BBB>
haasn: sorry, coffee
<BBB>
haasn: (i.e. I take back that the commit is wrong)
<Traneptora>
oh, if the bounds check is left in then I have no issue. there's no reason to change if (!func) return NULL; else return func; into just "return func;"
<Traneptora>
I was probably misreading the +/- issues
<Traneptora>
cause in the case where func is null, return func; also returns null;
<BBB>
yes
<Traneptora>
haasn: I misread this as you dropping the bounds check, I no longer have an issue with the commit
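The point both reviewers converge on can be shown in a few lines: once the bounds check is in place, returning the table entry directly behaves the same as testing it for null first. A small Python sketch of that pattern; `TRC_FUNCS`, `TRC_NB` and `get_trc_func` are hypothetical names, and the table contents are placeholders rather than the patch under review:

```python
from typing import Callable, List, Optional

# Hypothetical lookup table: index = transfer characteristic code,
# None = no function implemented for that code.
TRC_FUNCS: List[Optional[Callable[[float], float]]] = [
    None,
    lambda x: x ** 2.4,   # placeholder curve
    None,
    lambda x: x,          # placeholder: linear
]
TRC_NB = len(TRC_FUNCS)


def get_trc_func(trc: int) -> Optional[Callable[[float], float]]:
    # The bounds check protects against codes newer than the table.
    if trc < 0 or trc >= TRC_NB:
        return None
    # Returning the entry directly already yields None for missing entries,
    # so an explicit "if entry is None: return None" adds nothing.
    return TRC_FUNCS[trc]
```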
<haasn>
Okay :)
<Traneptora>
haasn: the only issue I potentially have is that eotf_bt1886 isn't an inverse of bt709 (which I know is true by spec) but it could matter if something tries to roundtrip to linear
<JEEB>
yea, just like sRGB has ended up like that since the reference display is pure gamma 2.2
<Traneptora>
yea, but as far as I understand everyone inverts sRGB by just taking the inverse function
<ngaullier>
Compn: I did not receive any mail.. I am not used to this new mailbox I created temporarily, but I tested it can receive email and its spam folder is empty... I will try another email provider to make sure, but it seems mails are not sent.
<JEEB>
OETF being the two-part one but it seems like consensus now is that the EOTF is pure 2.2
<BBB>
with the function pointers that are being returned, how does that work for simd?
<Traneptora>
problem with these non-inverse functions is when you have digital manipulation of sRGB content (e.g. I want to scale an sRGB image in linear light)
<BBB>
the idea of most of these tables is to provide a set of coefficients so that we can implement a simd conversion from A to B, or at worst a lookup-table conversion from A to B
<BBB>
how does that work if the data identifiers return a function?
<JEEB>
Traneptora: yea it's a fun dilemma isn't it
<JEEB>
you need to somehow differentiate between "I just want to round-trip to linear"
<JEEB>
(and back from it)
<JEEB>
and actually converting from X to Y
<Compn>
ngaullier, if your email has a web interface, could you check the spam folder in that? sometimes if you are using pop3 for access, the spam folder will not be sent. it's annoying but it happens
<ngaullier>
Compn: OK, I successfully subscribed with ffnicolasg@mailo.com, so there is something mysterious with french operator sfr... Sorry
<JEEB>
like, if you want to convert BT.709 marked content to linear for any other purpose than going linear and back, then you utilize BT.1886
<ngaullier>
Compn: yes, I did check the web interface and all subfolders... well, the emails provided by the internet providers are really shitty.
<JEEB>
but yea, MS picked the two-step sRGB transfer function for EOTF as well, but then people hit that their applications didn't look like they used to look on 2.2 monitors (which makes sense since the content was mastered on 2.2 screens with 1:1 sample values)
<JEEB>
Apple I think defaults to pure gamma 2.2 and has a nice settings menu where you can decide between two-stage, pure gamma 2.2 and BT.1886 for "generic SDR" flagged content
<fflogger>
[editedticket] MasterQuestionable: Ticket #11333 ([undetermined] Incomplete conversion of certain Apple H.265 MOV "hstack") updated https://trac.ffmpeg.org/ticket/11333#comment:2
<haasn>
Traneptora: none of the EOTFs are inverses of the OETF
<haasn>
except for the pure power curves on a display with Lw = 1, Lb = 0
<haasn>
and sRGB although this is arguably not correct
<haasn>
(I just don't have a spec to quote on that)
<haasn>
BBB: typically you would use the returned function pointers to construct a LUT which you can then apply in SIMD
<haasn>
applying a LUT is orders of magnitude faster than even the simplest of these functions
<haasn>
in sw
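A rough illustration of the approach haasn describes: evaluate the scalar transfer function once per possible input code, store the results, and index the table per pixel afterwards. This Python sketch assumes 8-bit input and 16-bit output, and uses the standard piecewise sRGB decoding curve as the example function; `build_lut` and `apply_lut` are hypothetical names, not FFmpeg functions:

```python
from typing import Callable, List, Sequence


def srgb_piecewise_decode(v: float) -> float:
    """Piecewise sRGB decoding curve, v in [0, 1]."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4


def build_lut(func: Callable[[float], float],
              in_bits: int = 8, out_bits: int = 16) -> List[int]:
    """Tabulate func for every input code at the input's precision."""
    in_max = (1 << in_bits) - 1
    out_max = (1 << out_bits) - 1
    return [round(func(code / in_max) * out_max) for code in range(in_max + 1)]


def apply_lut(lut: Sequence[int], pixels: Sequence[int]) -> List[int]:
    """Per-pixel work is a single table read; no pow() at runtime."""
    return [lut[p] for p in pixels]


lut = build_lut(srgb_piecewise_decode)
linear = apply_lut(lut, [0, 128, 255])
```

The table has only as many entries as the input has codes, which is why the cost of the original function stops mattering once the table is built.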
<BBB>
right
<BBB>
LUTs in SIMD aren't really a thing, but I understand what you mean
<JEEB>
the wayland discussion threads quote the sRGB spec quite a bit I think
<BBB>
(gather be damned)
<haasn>
Traneptora: if you want to scale in linear light you just want to make sure you use the correct pair of inverses
<Traneptora>
how would you do a LUT if it's float -> float? or would you not
<haasn>
so either trc + trc_inv
<haasn>
or eotf + eotf_inv
<BBB>
you'd approximate the float conversion using the precision of the input
<haasn>
or oetf + oetf_inv (didn't add a patch for this yet as it is more complicated and I don't currently have an established use case for them)
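To make the "correct pair of inverses" point concrete, the following Python sketch compares a matched pair (piecewise sRGB decode and its exact inverse) against a mismatched pair (pure gamma 2.2 decode, piecewise sRGB encode). The function names are hypothetical; the curves are the standard piecewise sRGB formulas and a plain power law:

```python
def srgb_decode(v: float) -> float:
    """Piecewise sRGB, encoded value -> linear."""
    return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4


def srgb_encode(l: float) -> float:
    """Exact inverse of srgb_decode, linear -> encoded value."""
    return 12.92 * l if l <= 0.0031308 else 1.055 * l ** (1 / 2.4) - 0.055


def gamma22_decode(v: float) -> float:
    """Pure power-law display curve."""
    return v ** 2.2


def roundtrip_error(decode, encode, steps: int = 256) -> float:
    """Largest |encode(decode(v)) - v| over an 8-bit style ramp."""
    worst = 0.0
    for i in range(steps):
        v = i / (steps - 1)
        worst = max(worst, abs(encode(decode(v)) - v))
    return worst


matched = roundtrip_error(srgb_decode, srgb_encode)
mismatched = roundtrip_error(gamma22_decode, srgb_encode)
```

With the matched pair the round trip reproduces the ramp to within floating-point noise, while the mismatched pair shifts dark values visibly, which is the hazard when scaling in linear light and converting back.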
ngaullier has quit [Ping timeout: 252 seconds]
<BBB>
so if the input is 8bit, your LUT becomes 8bit approximation of the float functions
<Traneptora>
no but like, if your input is FLOAT
<Traneptora>
do you just not have a lut in that case
<BBB>
that's like cursing
<haasn>
I think float inputs are rare and usually what you want to avoid for performance on sw
<BBB>
I suppose you either approximate, or you don't LUT?
<Traneptora>
or it's just slow ig
<haasn>
probably I would go through a first conversion pass to integers
<BBB>
same
<haasn>
and then have the core scaling pipeline all in fixed point integers
<haasn>
honestly I'd say if you want fast float ops you should be using a shader
<Traneptora>
is 32-bit integer arithmetic actually faster than float ops
<Traneptora>
I'd imagine no
<BBB>
Traneptora: probably not
<Traneptora>
then why would you convert to integers first or do you mean 16-bit integers
<BBB>
you said you wanted a LUT
<Traneptora>
I don't want a LUT, specifically
<BBB>
right, then you don't need to convert
<haasn>
to have a single pipeline
<Traneptora>
I just said how would you do it and "you wouldn't" seems like the answer
<BBB>
I meant "either convert to int, or don't use LUT"
<haasn>
instead of bothering with a bunch of special cases only for float inputs
<Traneptora>
well if you're converting to INT then it would have to be 16-bit ints probably
<haasn>
swscale currently would use 19 bit ints for that
<haasn>
inside 32-bit integers, but with headroom to avoid overflow on scaling etc
<haasn>
I mean it's at the very least not _slower_ than doing 32 bit float math
<haasn>
but saves you from having to reimplement your scaling pipeline in float logic
<BBB>
a lot of sws is the way it is because of historical reasons
<haasn>
and you still want to use a LUT rather than several transcendental float ops + conditionals
<BBB>
yeah
<haasn>
unless gathers really are that slow
<BBB>
they are
<BBB>
but LUT is fast enough
<BBB>
since the LUT can combine all operations together
<BBB>
if all you do is conversion
<haasn>
right
<haasn>
I mean we use a 3D LUT
<haasn>
inside swscale
<BBB>
:)
<haasn>
for color conversions
<BBB>
right
<haasn>
that embeds the EOTF, yuv matrix, xyz decoding, ootf, tone mapping, gamut mapping etc
<Traneptora>
a 3D 16-bit lut though seems kinda memory chonky
<BBB>
the color combination can be done in math if you want small LUTs
<BBB>
but even then
<haasn>
so all you need to do is a single tetrahedral lookup
<BBB>
LUT plus simple math SIMD is not that complex
<haasn>
and you get all operations for free
<Traneptora>
how do you do a 3D lut without memory issues
<Traneptora>
can you even do that in 16-bit
<haasn>
reduce precision and interpolate smartly
<BBB>
nobody was suggesting a 16bit 3d LUT :)
<BBB>
even 10bit is problematic
<Traneptora>
so you do an 8-bit 3D lut, ic
<BBB>
(30bit = 2^30 entries, i.e. 1 GiB * sizeof(type))
<BBB>
you can do a 10-16bit 1D LUT and math for color combination
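The size arithmetic behind these numbers, as a short Python sketch. The helper names are hypothetical, and the 3-channel, 2-bytes-per-sample layout is an assumption chosen only for illustration:

```python
def full_lut3d_entries(bits_per_channel: int) -> int:
    """Entries in a 3D LUT with one node per input code on each axis."""
    return (1 << bits_per_channel) ** 3


def grid_lut3d_entries(points_per_axis: int) -> int:
    """Entries in a reduced 3D LUT that is interpolated between nodes."""
    return points_per_axis ** 3


def lut_bytes(entries: int, channels: int = 3, bytes_per_sample: int = 2) -> int:
    return entries * channels * bytes_per_sample


full_8bit = lut_bytes(full_lut3d_entries(8))     # 16.7M entries
full_10bit = lut_bytes(full_lut3d_entries(10))   # 2**30 entries
grid_65 = lut_bytes(grid_lut3d_entries(65))      # 274625 entries
lut_1d_16bit = lut_bytes(1 << 16, channels=1)    # one 16-bit 1D table
```

Under these assumptions a full 10-bit table needs 6 GiB, while a 65-point grid stays under 2 MB and a 16-bit 1D table is 128 KiB.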
<haasn>
in my current implementation I think I default to a 6-bit 3D LUT
<haasn>
so 65x65x65
<haasn>
(+1 to avoid midpoint issues)
<haasn>
well, and to simplify the math
<Traneptora>
this sounds like it loses a lot of precision
<Traneptora>
how do you interpolate it back
<haasn>
you get both an exact 0.5 and the ability to get the index and fractional part by just shifting
<haasn>
tetrahedral interpolation on 16-bit coefficients
<haasn>
65x65x65 is a pretty typical LUT size for e.g. display calibration
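One way to read the "index and fractional part by just shifting" remark, sketched in Python. It assumes 16-bit input samples and a grid of 2^6 + 1 = 65 nodes per axis; `split_index_frac` is a hypothetical helper and not the swscale code:

```python
GRID_BITS = 6                      # 2**6 intervals -> 65 nodes per axis
INPUT_BITS = 16                    # assumed sample precision
FRAC_BITS = INPUT_BITS - GRID_BITS
GRID_POINTS = (1 << GRID_BITS) + 1


def split_index_frac(v: int):
    """Split a 16-bit sample into (lower node index, fraction in 1/1024 units)."""
    index = v >> FRAC_BITS
    frac = v & ((1 << FRAC_BITS) - 1)
    return index, frac


lo = split_index_frac(0)           # (0, 0)
mid = split_index_frac(32768)      # (32, 0): 0.5 lands exactly on a node
hi = split_index_frac(65535)       # (63, 1023): upper neighbour is node 64
```

The extra node is what keeps `index + 1` in range for the largest input, so the interpolation never needs a special case at the top of the range.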
<haasn>
the overall LUT is also designed to be quasilinear
<BBB>
for interpolation between points? That's reasonable I suppose
<haasn>
if you were to take for example linear light input it's clear that you would need a lot more precision
<haasn>
I currently don't have a special case for this but it would be easy enough to preprocess linear light input with an integer square root or something
<haasn>
tetrahedral interpolation is really fast and good at dealing with RGB 3DLUTs
<haasn>
and actually higher quality than more "expensive" interpolation methods
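For reference, a compact Python sketch of tetrahedral interpolation on a 3D LUT. It assumes the LUT is a nested list `lut[i][j][k]` of 3-tuples with inputs normalized to [0, 1], and it uses floats for clarity rather than the 16-bit fixed-point coefficients mentioned above; `tetrahedral_lookup` is a hypothetical name:

```python
def tetrahedral_lookup(lut, r: float, g: float, b: float):
    """Interpolate lut at (r, g, b) using the 4 corners of one tetrahedron."""
    n = len(lut) - 1                       # number of intervals per axis

    def split(x: float):
        x = min(max(x, 0.0), 1.0) * n
        i = min(int(x), n - 1)             # keep i + 1 inside the grid
        return i, x - i

    (i, fr), (j, fg), (k, fb) = split(r), split(g), split(b)

    # Sort axes by fractional part, largest first. Walking from the lower
    # corner along the axes in this order visits the tetrahedron's vertices.
    order = sorted(((fr, 0), (fg, 1), (fb, 2)), reverse=True)
    f0, f1, f2 = order[0][0], order[1][0], order[2][0]
    weights = [1.0 - f0, f0 - f1, f1 - f2, f2]

    idx = [i, j, k]
    verts = [tuple(idx)]
    for _, axis in order:
        idx[axis] += 1
        verts.append(tuple(idx))

    return tuple(
        sum(w * lut[a][bb][c][ch] for w, (a, bb, c) in zip(weights, verts))
        for ch in range(3)
    )


# Identity LUT with 5 nodes per axis, used only to sanity-check the math.
N = 5
identity = [[[(i / (N - 1), j / (N - 1), k / (N - 1)) for k in range(N)]
             for j in range(N)] for i in range(N)]
out = tetrahedral_lookup(identity, 0.3, 0.7, 0.55)
```

Only four LUT reads are needed per pixel, versus eight for trilinear interpolation, which is a large part of why it is fast.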
<haasn>
actually, I have two cases
<haasn>
the fast path is a single 3DLUT lookup
<haasn>
but I also have a "slow" path which is used when doing per frame dynamic tone mapping
<haasn>
that works like this: first do the 65x65x65 3DLUT lookup that just encodes a transformation to IPT space
<fflogger>
[editedticket] MasterQuestionable: Ticket #11333 ([undetermined] Incomplete conversion of certain Apple H.265 MOV "hstack") updated https://trac.ffmpeg.org/ticket/11333#comment:3
<haasn>
then apply a single 256-element 1D tone mapping LUT (recomputed per frame/scene)
<haasn>
and then finally go through a 65x129x129 output 3DLUT using a full trilinear interpolation, this one goes from IPT to the final output space
<haasn>
this is higher quality overall because most of the logic is moved into the output LUT which is interpolated in IPT input space
<haasn>
and that space is very good at decorrelating input channels
<haasn>
or rather, most of the nonlinearity is moved there
<haasn>
but slower, especially the trilinear step
<haasn>
what we could do in theory is do the RGB->IPT transform in pure logic
<Traneptora>
haasn: I'm asking cause I'm trying to convert RGB to XYB, which is fairly expensive
mkver has joined #ffmpeg-devel
<haasn>
but it involves several complicated operations that are not really easy in integer math
<haasn>
what was XYB again?
<Traneptora>
it's an absolute space. the RGB -> XYB transform takes linear RGB (bt709/srgb primaries), runs an affine matrix transform on it, then takes the cube root, and then adds a constant
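The shape of the transform as described in that message (affine matrix step, cube root, additive constant) can be sketched as follows in Python. The matrix, offset and bias values below are placeholders chosen for testing only and are not the constants from the specification; `xyb_like_transform` is a hypothetical name:

```python
import math
from typing import Sequence


def signed_cbrt(v: float) -> float:
    """Cube root that also accepts negative inputs."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)


def xyb_like_transform(rgb: Sequence[float],
                       matrix: Sequence[Sequence[float]],
                       offset: float,
                       bias: float):
    """Linear RGB -> affine mix -> cube root -> add constant."""
    mixed = [sum(m * c for m, c in zip(row, rgb)) + offset for row in matrix]
    return [signed_cbrt(v) + bias for v in mixed]


# Placeholder parameters (identity matrix, zero offset and bias).
IDENTITY = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]
result = xyb_like_transform([0.125, 1.0, 0.0], IDENTITY, offset=0.0, bias=0.0)
```

The per-pixel cost is dominated by the three cube roots, which is why a LUT or a cheap approximation is attractive on small devices.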
<haasn>
GPUs eat this kind of operation for breakfast
<Traneptora>
design goal is high portability to embedded devices
<Traneptora>
for this particular piece of software
<Traneptora>
no simd, no threading
<Traneptora>
low memory
<Traneptora>
by low memory I'm talking in the order of 2 MiB for the library itself in the lowest-memory mode
<Traneptora>
but as you can clearly see I'm already approximating inverse-sRGB with a polynomial and I'm approximating cbrtf with a float-bit-level-hack
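The code Traneptora refers to is not shown in the log. As a generic illustration of the float-bit-level idea, this Python sketch divides the IEEE 754 single-precision bit pattern by three, adds a constant derived from the exponent bias, and refines the estimate with Newton steps. The constant is the untuned value (2/3) * (127 << 23), and `fast_cbrt` is a hypothetical name:

```python
import struct


def fast_cbrt(x: float, newton_steps: int = 2) -> float:
    """Approximate cube root via float32 bit manipulation plus Newton steps.

    Sketch only: denormals and values outside float32 range are not handled.
    """
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0.0 else 1.0
    x = abs(x)

    # Reinterpret the float32 bit pattern as an unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Dividing the pattern by 3 roughly divides log2(x) by 3; the constant
    # restores the exponent bias: (2/3) * (127 << 23) = 0x2A555555.
    bits = bits // 3 + 0x2A555555
    y = struct.unpack("<f", struct.pack("<I", bits))[0]

    # Newton's method for y**3 = x roughly squares the relative error each step.
    for _ in range(newton_steps):
        y = (2.0 * y + x / (y * y)) / 3.0
    return sign * y


approx = fast_cbrt(27.0)
```

The initial estimate is within a few percent, so two Newton steps are enough for roughly five significant digits; a tuned constant or fewer steps trades accuracy for speed.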
<Traneptora>
so a 65x65x65 lut wouldn't be that strange to implement and it's only like 200k
<haasn>
if you don't mind a certain amount of error you could approximate it more easily as a 3x3 matrix multiplication followed by a single LUT
<haasn>
perhaps
<haasn>
I would also consider going down to 33x33x33
<haasn>
presumably the design goal is to be bit exact at 8 bits precision?
<Traneptora>
haasn: it would certainly be nice
<Traneptora>
but it's not critically important
<Traneptora>
but a 3x3-matrix-multiplication would be in which space?
<Traneptora>
or rather, at what precision?
ngaullier has joined #ffmpeg-devel
DEATH has quit [Ping timeout: 265 seconds]
DEATH has joined #ffmpeg-devel
DEATH has quit [Ping timeout: 246 seconds]
DEATH has joined #ffmpeg-devel
ngaullier has quit [Ping timeout: 264 seconds]
<haasn>
your gamma rgb input into some optimized intermediate space
<haasn>
you'd have to numerically solve the system ideally
<haasn>
maybe it's not worth the effort
<Traneptora>
haasn: what do you mean by numerically solve the system? I can invert the matrix (in fact, the inverted matrix is in the spec, not the forward one)
<Traneptora>
and having the matrix inverse is basically solving the system
System_Error has joined #ffmpeg-devel
<Traneptora>
since it's an invertible square matrix
ngaullier has joined #ffmpeg-devel
rvalue has quit [Read error: Connection reset by peer]