narmstrong changed the topic of #linux-amlogic to: Amlogic mainline kernel development discussion - our wiki http://linux-meson.com/ - ml linux-amlogic@lists.infradead.org - official channel moved from Freenode - publicly logged on https://libera.irclog.whitequark.org/linux-amlogic
<pulpoff> i have updated regulator values in dtb file
<pulpoff> but still same error
<pulpoff>    regulator-vddcpu-b {
<pulpoff>             compatible = "pwm-regulator";
<pulpoff>             regulator-name = "VDDCPU_B";
<pulpoff>             regulator-min-microvolt = <0xa6040>;
<pulpoff>             regulator-max-microvolt = <0xfde80>;
<pulpoff>             pwm-supply = <0x49>;
<pulpoff>             pwms = <0x4c 0x01 0x5dc 0x00>;
<pulpoff>             pwm-dutycycle-range = <0x64 0x00>;
<pulpoff>             regulator-boot-on;
<pulpoff>             regulator-always-on;
<pulpoff>             phandle = <0x44>;
<pulpoff>     };
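The node above was pasted from a decompiled dtb, so every cell is shown in hex. As a small sketch for reading it, the following converts the cells to decimal. The labels for the `pwms` and `pwm-dutycycle-range` cells are assumptions based on the usual three-cell PWM specifier (channel, period in ns, flags), not something stated in the channel.

```python
# Cell values copied verbatim from the pasted node (hex, as a decompiled dtb prints them).
cells = {
    "regulator-min-microvolt": 0xA6040,
    "regulator-max-microvolt": 0xFDE80,
    "pwms period, ns (assumed meaning of 3rd cell)": 0x5DC,
    "pwm-dutycycle-range[0]": 0x64,
}

# Plain decimal view of the same numbers.
decoded = {name: int(value) for name, value in cells.items()}

for name, value in decoded.items():
    print(f"{name}: {value}")
```

The maximum works out to 1040000 uV, which is the figure xdarklight questions later in the log.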
pulpoff has quit [Quit: Client closed]
pulpoff has joined #linux-amlogic
pulpoff has quit [Client Quit]
pulpoff has joined #linux-amlogic
sputnik__ has joined #linux-amlogic
sputnik__ has quit [Remote host closed the connection]
sputnik has joined #linux-amlogic
sputnik is now known as Guest5462
Guest5462 has quit [Remote host closed the connection]
sputnik has joined #linux-amlogic
sputnik is now known as Guest939
Guest939 has quit [Read error: Connection reset by peer]
sputnik has joined #linux-amlogic
sputnik is now known as Guest3214
chewitt_ has joined #linux-amlogic
chewitt has quit [Ping timeout: 268 seconds]
camus has joined #linux-amlogic
chewitt_ is now known as chewitt
chewitt_ has joined #linux-amlogic
chewitt has quit [Ping timeout: 260 seconds]
vagrantc has joined #linux-amlogic
<xdarklight> pulpoff: I don't understand your question, we're not using any OPP with 1040000uV on G12B (S922X). if your board has the same regulator as meson-g12b-w400.dtsi (according to meson-g12b-gtking-pro.dts it has since it's not overriding any value) then I doubt that the regulator can output more than 1022000uV (sure, one can put any value in .dts - but it doesn't mean that this represents the reality)
chewitt_ has quit [Read error: Connection reset by peer]
chewitt has joined #linux-amlogic
lyudess has quit [Quit: WeeChat 3.2]
vagrantc has quit [Quit: leaving]
Lyude has joined #linux-amlogic
tdebrouw has joined #linux-amlogic
warpme__ has joined #linux-amlogic
chewitt has quit [Read error: Connection reset by peer]
chewitt_ has joined #linux-amlogic
Darkmatter66 has quit [Quit: ZNC 1.8.2 - https://znc.in]
Darkmatter66 has joined #linux-amlogic
cottsay08 has joined #linux-amlogic
Stricted- has joined #linux-amlogic
JerryXia1 has joined #linux-amlogic
orkid_ has joined #linux-amlogic
cottsay has quit [Ping timeout: 260 seconds]
orkid has quit [Ping timeout: 260 seconds]
Stricted has quit [Ping timeout: 260 seconds]
mcirsta has quit [Ping timeout: 260 seconds]
rektide has quit [Ping timeout: 260 seconds]
JerryXiao has quit [Ping timeout: 260 seconds]
cottsay08 is now known as cottsay
rektide_ has joined #linux-amlogic
GNUtoo has quit [Ping timeout: 276 seconds]
GNUtoo has joined #linux-amlogic
tdebrouw has quit [Quit: Leaving.]
Darkmatter66 has quit [Quit: ZNC 1.8.2 - https://znc.in]
Darkmatter66 has joined #linux-amlogic
camus has quit [Ping timeout: 256 seconds]
camus has joined #linux-amlogic
camus has quit [Quit: camus]
Darkmatter66 has quit [Ping timeout: 265 seconds]
Darkmatter66 has joined #linux-amlogic
mcirsta has joined #linux-amlogic
Darkmatter66 has quit [Ping timeout: 245 seconds]
Darkmatter66 has joined #linux-amlogic
buzzmarshall has joined #linux-amlogic
camus has joined #linux-amlogic
vagrantc has joined #linux-amlogic
<gbisson> is anyone having issues with gstreamer video decoding on a311d/s922x?
<gbisson> What i see on 5.14 is that it works great with ffmpeg, but gstreamer is problematic
<gbisson> So I wonder if someone knows where the difference is ;)
<chewitt_> @ndufresne ^
chewitt_ is now known as chewitt
<chewitt> might be a little late for him being around .. it's POETS day after all
<ndufresne> threads I would say
<gbisson> Note that the kernel only throws errors like this: meson-vdec ff620000.video-decoder: VIFIFO usage (16780194) > VIFIFO size (16777216)
<ndufresne> gstreamer stateful decoder is threaded, so that might cause issue in fragile decoders, but for me it worked on S905
<ndufresne> well "only" this is pretty fatal, your bitstream will be corrupted
<ndufresne> gbisson: question is which codec ? cause they are not equal, or similar
<gbisson> ha ok, problem is that it shows that, then it's stuck for some time, then it goes fine
<gbisson> h.264
<ndufresne> that looks like the effect of stream corruption, it hangs till the next IDR
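To put numbers on the kernel message quoted above, here is a minimal arithmetic sketch. Only the two figures come from the log; the helper name `vififo_overrun` is made up for illustration and is not part of the meson-vdec driver.

```python
# Figures taken from the meson-vdec error message in the log.
VIFIFO_SIZE = 16777216
VIFIFO_USAGE = 16780194


def vififo_overrun(usage: int, size: int) -> int:
    """Return by how many bytes `usage` exceeds `size` (0 if it fits)."""
    return max(0, usage - size)


overrun = vififo_overrun(VIFIFO_USAGE, VIFIFO_SIZE)
is_16_mib = VIFIFO_SIZE == 16 * 1024 * 1024

print(f"ring buffer is exactly 16 MiB: {is_16_mib}")
print(f"bytes written past the ring buffer size: {overrun}")
```

Even a small overrun like this means queued data no longer fits in the fixed-size buffer, which is consistent with the corruption ndufresne describes.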
<ndufresne> what gst version ?
<gbisson> let me boot the board back up
<gbisson> it's debian bullseye with 5.14 kernel basically
<gbisson> 1.18.4-3
<gbisson> I tried 2 different streams:
<gbisson> same result (both H.264)
<gbisson> Tried different sinks (glimagesink, waylandsink, kmssink) => same result
<gbisson> actually the error message doesn't always appear, but the stream getting stuck happens every single time, I can record what I see if it helps
vagrantc has quit [Quit: leaving]
<ndufresne> gbisson: we can investigate further, e.g. you can trace with GST_DEBUG="v4l2*:7" and share the output ?
<ndufresne> the goal is as usual, to spot if the driver has gone nuts, or if it's inside gstreamer
<ndufresne> my env is too outdated for me to be able to repro myself today (though I'd be testing on git master)
<gbisson> ndufresne: thanks for your time on this, here is the output: http://linode.boundarydevices.com/gary/gst_a311d_h264dec.txt
<gbisson> as you can see, not much happens during the first 9 seconds
<gbisson> Then up until 35s there are some frames here and there but not great
<gbisson> Only then it works ok
<ndufresne> this one is weird, but likely not related: Failed to enumerate frame sizes for pixelformat NM12 (Invalid argument)
repk_ has quit [Quit: WeeChat 3.3]
repk has joined #linux-amlogic
<ndufresne> gbisson: looks like this is a ring buffer based driver, and it does not throttle the number of pending frames
<ndufresne> and 652 frames left
<gbisson> ndufresne: v4l2-dbg reports the proper format though
<ndufresne> that's the number of pending (submitted) frames that have been sent to the decoder
<chewitt> @ndufresne "Failed to enumerate frame sizes for pixelformat NM12 (Invalid argument)" sounds similar to what @JC (Pi dev) was talking about in LE slack yesterday
<ndufresne> for the "invalid argument" thing we need to trace the ioctl and see what the arguments are
<chewitt> "it is more of a disagreement about how to resolve (between ffmpeg & V4L2) what the video size/SAR/interlace etc. is at the start of time and if it ever changes."
<ndufresne> gstreamer falls back to try_fmt with min/max resolution and keeps going here
<ndufresne> ah, well, GStreamer code predates the API to handle that properly
<ndufresne> but there is a MR that I will merge when the next dev cycle starts, it's from NXP, https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/1381
Guest3214 has quit [Remote host closed the connection]
Guest3214 has joined #linux-amlogic
<ndufresne> it's not ideal, but it is following the spec
<ndufresne> chewitt: remains that in that trace, the format guess worked and decoding started
<ndufresne> it just hung at some point
<ndufresne> I don't even know if the driver has enough space to remember 650 timestamps
<ndufresne> cause of course you need to pass over the TS to the respective capture buffer
<ndufresne> anyway, ffmpeg implementation will not allow this to happen iirc
<ndufresne> cause it's single threaded
<ndufresne> this all looks familiar / typical issue with Amlogic ring buffer handling
<ndufresne> gbisson: for now it looks like a driver issue
<gbisson> ndufresne: ok, so for the enum issue, the patch from NXP will fix it, correct?
<ndufresne> gbisson: no idea ;-P, whatever happens really need investigation
<gbisson> ndufresne: as for the pending frames issues, it's a threading issue, most likely because of the driver missing some locks?
<ndufresne> with the NXP patch, if the driver implements the SRC_CH event, then GStreamer will be able to fix any wrong format on the capture queue, if it changed at run-time
<gbisson> ok
zkrx has quit [Quit: zkrx]
<ndufresne> but it is suboptimal cause this implementation will pre-allocate toward a guessed resolution, which may not be right/hw specific, and then re-allocate if something changed
<ndufresne> with NXP Amphion driver, the driver would just sit there waiting forever for GStreamer to reallocate, but in your case, the driver did start decoding, it just got stuck and stopped producing output
<ndufresne> so it does not look identical to me
<gbisson> ok, I might try to rebuild gst myself with that patch to see how it goes
<ndufresne> the extremely large amount of pending frames (652) is clearly a red flag
zkrx has joined #linux-amlogic
<ndufresne> buffering that much will make seeks pretty much unusable
<ndufresne> imho, drivers should limit their internal frame queue ...
<ndufresne> CODA is ring buffer based and does not seem to behave like this ...
<ndufresne> as an alternative, you could enhance GStreamer, and add userland throttling, see if it helps, the driver reports a 10 frame backlog (10 frames before first frame out), so in theory we have the info to throttle
<gbisson> ok, I believe it'd be best to have the same behavior between all the stateful drivers
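As a rough illustration of the userland throttling idea, here is a generic Python sketch that caps the number of submitted but not yet returned frames with a bounded semaphore. It is not GStreamer code: the class name, the `demo` helper and the margin of 4 on top of the reported 10 frame backlog are all assumptions made for the example.

```python
import queue
import threading


class PendingFrameThrottle:
    """Caps how many frames are submitted to a decoder but not yet returned."""

    def __init__(self, max_pending: int):
        self._slots = threading.BoundedSemaphore(max_pending)
        self._lock = threading.Lock()
        self.pending = 0
        self.peak = 0

    def submit(self) -> None:
        # Blocks the feeding thread once max_pending frames are in flight.
        self._slots.acquire()
        with self._lock:
            self.pending += 1
            self.peak = max(self.peak, self.pending)

    def frame_done(self) -> None:
        # Called when the decoder hands a frame back; frees one slot.
        with self._lock:
            self.pending -= 1
        self._slots.release()


def demo(total_frames: int = 200, backlog: int = 10, margin: int = 4):
    """Feed frames from one thread and drain them from another."""
    throttle = PendingFrameThrottle(backlog + margin)  # margin is an assumption
    in_flight = queue.Queue()

    def feeder():
        for n in range(total_frames):
            throttle.submit()
            in_flight.put(n)
        in_flight.put(None)  # end-of-stream marker

    def drainer():
        while in_flight.get() is not None:
            throttle.frame_done()

    threads = [threading.Thread(target=feeder), threading.Thread(target=drainer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return throttle.peak, throttle.pending


if __name__ == "__main__":
    peak, pending = demo()
    print(f"peak pending frames: {peak}, pending at end: {pending}")
```

With a cap like this the feeder can never get hundreds of frames ahead of the output, whatever the driver itself allows.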
<gbisson> I have plenty of i.MX devices so I guess I could compare with CODA more deeply
<gbisson> thanks for all your input
<chewitt> ndufresne what you're describing with the huge number of frames .. is exactly what the pi devs described as a problem before
<chewitt> when they started to work on v4l2_m2m for their H264 IP block, I started tracking their work
<chewitt> up to a point, the changes improved behaviour with the Amlogic vdec in addition to RPi4
<chewitt> then they reached a point where driver changes were needed (all about queue sizes)
<chewitt> and from that point forwards Amlogic regressed heavily
<chewitt> (using their ffmpeg branch)
<gbisson> chewitt: so no-one looked into the kernel side yet correct?
<chewitt> to be clear, playback start always worked
<chewitt> but seeking also started to work well, but then regressed badly after they started making the driver changes
<chewitt> @gbisson correct, nobody looked at the driver since
<chewitt> this is all with ffmpeg of course (Kodi is just a big wrapper around it) .. but I think the underlying issues are common and are driver
<ndufresne> I don't recall RPi H264 driver being reported as having issues in this regard, they have been using/shipping gst for a while ...
<ndufresne> perhaps I would need more context
<ndufresne> gst threaded design is the only way we could achieve lower latency with good throughput, but it relies on the driver not over-committing
<chewitt> the possible difference being .. pi devs make driver changes so queues are sensible, and it's not an issue
<ndufresne> which isn't specified in the spec of course
<ndufresne> yeah, but Amlogic has a real ring buffer
<ndufresne> fixed size regardless of the compression ratio
<ndufresne> this is much trickier to tweak
<ndufresne> I've been there with an Endless (some downstream v4l2 m2m decoder), but can't remember what they did on kernel side to limit the queue
<ndufresne> but as soon as they figured out how to limit the queue, both GStreamer and Chromium (these were our target) started to work well and also be responsive
<chewitt> @narmstrong ^ any of this sound familiar?
<ndufresne> otherwise, after each seek, the CPU would go to 100% for 30s, making the system sluggish
<ndufresne> (was on amlogic old gen 805 , whatever they are called)
<chewitt> Meson8 .. S805/S802/S812
<ndufresne> we need a new Max J. ;-D
<chewitt> indeed
<narmstrong> yes it's the exact issue I have now with the android v4l2_codec2
<ndufresne> I managed to get things fixed mainline on many HW through my work, but for some reason, Amlogic mainline has never been part of my work
<narmstrong> there is a new h264 decoder based on the same HW as VP9 and HEVC that doesn't have this limitation/whatever, but maxime never finished it
<ndufresne> we have strong feeling that the VP9/HEVC core is some derivative of Hantro G2
<ndufresne> this came up last week from one of my colleagues
<narmstrong> perhaps
<ndufresne> reversing the firmware that parses would tell us ;-P
<ndufresne> but these accelerators are entirely frame based, no ring buffer
<ndufresne> actually, VP9 would require a payload like ivf to be stored in a ring buffer
<chewitt> two people had a look at the firmware for me
<chewitt> @jernej (Allwinner LE dev) said it wasn't possible to figure out the ISA
<chewitt> and IIRC @cyrozap came to similar conclusions
<chewitt> and when I asked Amlogic people they declined to answer :)
<ndufresne> I guess reversing the accelerator (basically the binary produced by the parser) and turning this into a stateless driver might be "easier"
<gbisson> I gotta go, thanks all for the support, I'll look into it more next week.
camus has quit [Quit: camus]
<cyrozap> chewitt, ndufresne, gbisson
<cyrozap> oops
<chewitt> Hi @cyrozap
<cyrozap> chewitt, ndufresne, gbisson: I don't believe I said it was impossible, just that it would be kind of a pain and I was more interested in MediaTek's video hardware. So long as the state of the AMRISC core can be dumped (preferably registers, but even just the data memory may suffice if you can figure out what instructions write registers to memory) and custom firmware uploaded to it (which should be
<cyrozap> possible since the firmware binaries appear to be unsigned), then it should be possible to (eventually) figure out the ISA.
<chewitt> I meant that it wasn't immediately possible to figure out the ISA .. bad wording
<chewitt> where there's a will, there's always a way :)
<cyrozap> Yeah, it would be way easier if a compiler existed, like it does for MediaTek's MD32.
<cyrozap> But without a compiler, the process basically amounts to: 1) diff the older and newer versions of binaries to make some educated guesses about which instructions do what, 2) fill a binary with the instruction sequences you want to test, 3) dump data memory, 4) load and execute the binary, 5) dump data memory
<cyrozap> 6) diff the data memory dumps
vagrantc has joined #linux-amlogic
<cyrozap> Essentially, the CPU just needs to be treated as a machine that takes instructions and external state (data memory) as input, modifies internal and external state (registers and data memory) based on the instructions, and then outputs external state.
vagrantc has quit [Client Quit]
pulpoff has quit [Quit: Ping timeout (120 seconds)]
<cyrozap> With enough tests and data, you can treat it like a system of equations and solve for the CPU operations.
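The dump-and-diff step of that process can be sketched in a few lines of Python. This only illustrates comparing two data memory dumps; the function name and the sample bytes are made up, and nothing here talks to real AMRISC hardware.

```python
def diff_dumps(before: bytes, after: bytes) -> dict:
    """Return {offset: (old_byte, new_byte)} for every byte that changed."""
    if len(before) != len(after):
        raise ValueError("dumps must cover the same memory range")
    return {
        offset: (old, new)
        for offset, (old, new) in enumerate(zip(before, after))
        if old != new
    }


if __name__ == "__main__":
    # Hypothetical 16-byte data memory, dumped before and after a test binary ran.
    before = bytes(16)
    after = bytearray(before)
    after[4] = 0x2A  # pretend the instruction under test stored 0x2a at offset 4

    for offset, (old, new) in diff_dumps(before, bytes(after)).items():
        print(f"offset {offset:#04x}: {old:#04x} -> {new:#04x}")
```

Repeating this over many small test binaries gives the table of observed effects that the "system of equations" approach would then be solved against.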
<ndufresne> cyrozap: didn't know there was interest in RE MTK video HW (codec or other ?)
<ndufresne> in fact, I generally don't even know how to get access to hackable MTK HW ...
<ndufresne> gbisson: it looks like the patch I gave may help, I see a SRC_CHANGE event implementation in the h264 decoder
<ndufresne> gbisson: if you could add couple of printk and check if after stall the driver is in sess->status = STATUS_NEEDS_RESUME; I think we would be set
<cyrozap> ndufresne: Yeah, I'm interested, at least. And for hackable hardware, there's the MT8183-based ChromeOS devices, though my "preferred" platforms are the Orange Pi 4G-IoT and some Amazon devices because they're cheap, I have a bunch of them, and I can get my own code running on all of them (the Orange Pi doesn't have any signature checks, and the signature checks on the Amazon devices can be easily
<cyrozap> bypassed with some simple hardware mods).
<ndufresne> cool, might have a look at some point ;-P
<cyrozap> I wrote a Ghidra processor module (incomplete) for the MD32, which is the CPU the VPU firmware runs on: https://github.com/cyrozap/ghidra-md32
<ndufresne> is there a specific goal toward taking control of that co-processor ?
<ndufresne> or perhaps this is security research ?
vagrantc has joined #linux-amlogic
<cyrozap> Well, that's where its VPU firmware runs, and all the VPU firmware does is parse bitstreams and interact with the video hardware (which can also be accessed by the main CPU cores), so if we can figure out the video hardware from reverse engineering the VPU firmware, then we could have blob-free video encoder/decoder operation.
<ndufresne> cool cool, so blob free is the goal, I like it
<ndufresne> I'm personally curious on what type of HW accelerator these blobs hide
<ndufresne> I bet the firmware on 8183 is likely simpler, since the codecs are stateless from this version
<cyrozap> So, my main goal is really just to get rid of the unnecessary blobs, but it's also a fun little CPU you can do other stuff with. It has full access to memory and peripherals, so it can interact with any hardware the ARM cores can, so you could use it to offload arbitrary functions. And from a security perspective, if there are any flaws in the firmware that get you arbitrary code exec inside the
<cyrozap> MD32, you could use that to take over the whole SoC.
<cyrozap> Technically the codec hardware has always been stateless--they just modified the firmware to support stateless operation.
<ndufresne> yes, but in the domain, I know two types of base HW, well 3: frame based, slice based and macroblock based
<ndufresne> for the third, the firmware code will be about entropy decoding
<ndufresne> there is also some hybrid solution, on some Qualcomm I have been told that HEVC decoding is partially done on DSP (basically software decoding)
<ndufresne> but I'm also curious if we will find out known chips, like hantro/vsi, amphion, allegro, etc.
<ndufresne> It sounds like the MD32 is on top of everything, so if there is DRM it will sit there
<cyrozap> Interestingly, I think the non-ChromeOS MediaTek devices don't use the MD32 to operate the VPU, so you can actually get some details on the hardware from the released kernel sources: https://github.com/freedomtan/kernel-3.18-X20-96-board/tree/9e9d9fcb0b567bbedd409f7c79779caa233cbace/drivers/misc/mediatek/videocodec
<cyrozap> I don't know enough about video codec hardware to really understand it, though.
<ndufresne> just stumbled across "/* Add one line comment for avoid kernel coding style, WARNING:BRACES: */" what the hell does that mean ...
<cyrozap> trying to avoid kernel code style checks, probably
<ndufresne> ok, so this is a bunch of ref counts and wait_isr(), I guess the userland will bang in devmem to program the registers
<ndufresne> it's kind of worse than the VSI ref driver
<ndufresne> but I think that resembles the RPi4 HEVC ref driver
<ndufresne> though quite unlikely related, the chip company that makes the RPi4 HEVC has been owned by Broadcom for a long time now
<ndufresne> (owned merged into)
<ndufresne> That's how the cedrus driver was born, passing known bitstreams, and finding back where each stream param got written by the userland blob
<ndufresne> (later some code was leaked, but yeah)
tdebrouw has joined #linux-amlogic
vagrantc has quit [Quit: leaving]
<mcirsta> I am interested in the HW vdec of the Amlogic SOCs but my knowledge of these things is close to 0
<mcirsta> ndufresne I was just setting up my env to be able to build and test what was done so far and I'm hoping to somehow figure out how to adapt the rpi work in ffmpeg
<mcirsta> I think it's really interesting to actually open source the binary firmware too but given the interest so far I doubt it will be easy
<mcirsta> if the Amlogic provided firmware works well enough I think it's a reasonable compromise to just use that
pulpoff has joined #linux-amlogic
<pulpoff> hi chewitt
mcirsta has quit [Quit: Connection closed]
<warpme__> amlogic is a stateful decoder so proper state machine sync between the provider (kernel driver + hw decoder) internal state machine & the consumer (ffmpeg/player) state machine is required.
<warpme__> imho the major complication here is that state machines may differ between vendors - so besides the api design (already done in v4l2_m2m) we also need to design an in-driver state machine abstraction (common to all drivers).
<warpme__> IMHO this should be: a unified driver state machine exposed to the consumer + state translation in the kernel driver to the hw vendor specific state machine.
<warpme__> Sounds complicated - but without this we will have the "spaghetti effect" well known in software: you fix the driver for vendor A and break vendor B. Plan B can be going with per-vendor quirks in the consumer (ffmpeg/player).
<warpme__> given the lack of knowledge about per-SoC state machine specifics - the quirks approach will probably be most reasonable. I would narrow SoCs to just 2: brcm & amlogic. The rest like coda or venus are too exotic (imho). Key issue of the quirk approach: it will never be upstreamed :-(
<warpme__> we have quite well working decode of h264 on rpi & amlogic on mainline 5.15. The show stopper is seek. There is example code for ffmpeg doing a buffer flush at seek - but it puts hw decoders into a hang (due to state machine de-sync). So - if we fix this (in drivers for rpi & aml) - we may have something worth showing to users...
tdebrouw has quit [Quit: Leaving.]