#linux-amlogic on 2021-11-26 — irc logs at libera.irclog.whitequark.org

2021-06-01 12:24 narmstrong changed the topic of #linux-amlogic to: Amlogic mainline kernel development discussion - our wiki http://linux-meson.com/ - ml linux-amlogic@lists.infradead.org - official channel moved from Freenode - publicly logged on https://libera.irclog.whitequark.org/linux-amlogic

00:06 <pulpoff> i have updated regulator values in dtb file

00:06 <pulpoff> but still same error

00:06 <pulpoff> regulator-vddcpu-b {

00:06 <pulpoff> compatible = "pwm-regulator";

00:06 <pulpoff> regulator-name = "VDDCPU_B";

00:06 <pulpoff> regulator-min-microvolt = <0xa6040>;

00:06 <pulpoff> regulator-max-microvolt = <0xfde80>;

00:06 <pulpoff> pwm-supply = <0x49>;

00:06 <pulpoff> pwms = <0x4c 0x01 0x5dc 0x00>;

00:06 <pulpoff> pwm-dutycycle-range = <0x64 0x00>;

00:06 <pulpoff> regulator-boot-on;

00:06 <pulpoff> regulator-always-on;

00:06 <pulpoff> phandle = <0x44>;

00:06 <pulpoff> };
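
For reference, the node above comes from a decompiled dtb, so every cell is printed in hex. A minimal Python sketch that converts the quoted values to decimal follows. The property strings are copied from the paste; the helper name `decode_cells` is made up for illustration. Reading the third `pwms` cell as a period in nanoseconds assumes the usual three-cell PWM specifier (channel, period, flags) and is not confirmed in the log.

```python
# Property values copied verbatim from the regulator node quoted above.
node = {
    "regulator-min-microvolt": "<0xa6040>",
    "regulator-max-microvolt": "<0xfde80>",
    "pwms": "<0x4c 0x01 0x5dc 0x00>",
    "pwm-dutycycle-range": "<0x64 0x00>",
}

def decode_cells(value):
    """Turn a '<0x.. 0x..>' cell list into a list of Python ints (hypothetical helper)."""
    return [int(cell, 16) for cell in value.strip("<>").split()]

decoded = {name: decode_cells(value) for name, value in node.items()}
for name, cells in decoded.items():
    print(name, cells)
```

The minimum and maximum decode to 680000 uV and 1040000 uV, so the maximum is the 1040000uV figure that comes up later in the channel.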

00:18 pulpoff has quit [Quit: Client closed]

00:30 pulpoff has joined #linux-amlogic

00:31 pulpoff has quit [Client Quit]

00:31 pulpoff has joined #linux-amlogic

01:43 sputnik__ has joined #linux-amlogic

01:44 sputnik__ has quit [Remote host closed the connection]

01:44 sputnik has joined #linux-amlogic

01:44 sputnik is now known as Guest5462

01:45 Guest5462 has quit [Remote host closed the connection]

01:45 sputnik has joined #linux-amlogic

01:45 sputnik is now known as Guest939

02:23 Guest939 has quit [Read error: Connection reset by peer]

02:24 sputnik has joined #linux-amlogic

02:24 sputnik is now known as Guest3214

03:57 chewitt_ has joined #linux-amlogic

04:00 chewitt has quit [Ping timeout: 268 seconds]

04:44 camus has joined #linux-amlogic

05:04 chewitt_ is now known as chewitt

05:08 chewitt_ has joined #linux-amlogic

05:10 chewitt has quit [Ping timeout: 260 seconds]

05:19 vagrantc has joined #linux-amlogic

06:27 <xdarklight> pulpoff: I don't understand your question, we're not using any OPP with 1040000uV on G12B (S922X). if your board has the same regulator as meson-g12b-w400.dtsi (according to meson-g12b-gtking-pro.dts it has since it's not overriding any value) then I doubt that the regulator can output more than 1022000uV (sure, one can put any value in .dts - but it doesn't mean that this represents the reality)

06:39 chewitt_ has quit [Read error: Connection reset by peer]

06:39 chewitt has joined #linux-amlogic

06:58 lyudess has quit [Quit: WeeChat 3.2]

07:08 vagrantc has quit [Quit: leaving]

07:26 Lyude has joined #linux-amlogic

07:37 tdebrouw has joined #linux-amlogic

08:19 warpme__ has joined #linux-amlogic

09:31 chewitt has quit [Read error: Connection reset by peer]

09:31 chewitt_ has joined #linux-amlogic

10:50 Darkmatter66 has quit [Quit: ZNC 1.8.2 - https://znc.in]

10:53 Darkmatter66 has joined #linux-amlogic

11:47 cottsay08 has joined #linux-amlogic

11:48 Stricted- has joined #linux-amlogic

11:51 JerryXia1 has joined #linux-amlogic

11:55 orkid_ has joined #linux-amlogic

11:55 cottsay has quit [Ping timeout: 260 seconds]

11:55 orkid has quit [Ping timeout: 260 seconds]

11:55 Stricted has quit [Ping timeout: 260 seconds]

11:55 mcirsta has quit [Ping timeout: 260 seconds]

11:56 rektide has quit [Ping timeout: 260 seconds]

11:56 JerryXiao has quit [Ping timeout: 260 seconds]

11:56 cottsay08 is now known as cottsay

11:56 rektide_ has joined #linux-amlogic

12:30 GNUtoo has quit [Ping timeout: 276 seconds]

12:31 GNUtoo has joined #linux-amlogic

12:43 tdebrouw has quit [Quit: Leaving.]

13:06 Darkmatter66 has quit [Quit: ZNC 1.8.2 - https://znc.in]

13:07 Darkmatter66 has joined #linux-amlogic

13:31 camus has quit [Ping timeout: 256 seconds]

13:40 camus has joined #linux-amlogic

13:54 camus has quit [Quit: camus]

14:33 Darkmatter66 has quit [Ping timeout: 265 seconds]

14:34 Darkmatter66 has joined #linux-amlogic

14:36 mcirsta has joined #linux-amlogic

14:55 Darkmatter66 has quit [Ping timeout: 245 seconds]

14:55 Darkmatter66 has joined #linux-amlogic

15:09 buzzmarshall has joined #linux-amlogic

15:25 camus has joined #linux-amlogic

15:42 vagrantc has joined #linux-amlogic

16:22 <gbisson> is anyone having issues with gstreamer video decoding on a311d/s922x?

16:22 <gbisson> What i see on 5.14 is that it works great with ffmpeg, but gstreamer is problematic

16:22 <gbisson> So I wonder if someone knows where the difference is ;)

16:23 <chewitt_> @ndufresne ^

16:23 chewitt_ is now known as chewitt

16:24 <chewitt> might be a little late for him being around .. it's POETS day after all

16:25 <ndufresne> threads I would say

16:25 <gbisson> Note that the kernel only throws errors like this: meson-vdec ff620000.video-decoder: VIFIFO usage (16780194) > VIFIFO size (16777216)

16:25 <ndufresne> gstreamer stateful decoder is threaded, so that might cause issue in fragile decoders, but for me it worked on S905

16:25 <ndufresne> well "only" this is pretty fatal, your bitstream will be corrupted
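
To put numbers on the kernel message quoted above, here is a small Python sketch that pulls the two values out of the line and reports the overshoot. The log line is copied from the channel; the regular expression and the function name `vififo_overflow` are illustrative assumptions, not part of any driver or tool.

```python
import re

# Kernel message copied from the log above.
LOG_LINE = ("meson-vdec ff620000.video-decoder: "
            "VIFIFO usage (16780194) > VIFIFO size (16777216)")

PATTERN = re.compile(r"VIFIFO usage \((\d+)\) > VIFIFO size \((\d+)\)")

def vififo_overflow(line):
    """Return (usage, size, excess) in bytes, or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    usage, size = int(match.group(1)), int(match.group(2))
    return usage, size, usage - size

usage, size, excess = vififo_overflow(LOG_LINE)
print(f"ring buffer size: {size // (1024 * 1024)} MiB, over by {excess} bytes")
```

The reported size is exactly 16 MiB, and the usage exceeds it by 2978 bytes.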

16:26 <ndufresne> gbisson: question is which codec ? cause they are not equal, or similar

16:26 <gbisson> ha ok, problem is that it shows that, then it's stuck for some time, then it goes fine

16:26 <gbisson> h.264

16:26 <ndufresne> that looks like the effect of stream corruption, it hangs till the next IDR

16:27 <ndufresne> what gst version ?

16:28 <gbisson> let me boot the board back up

16:28 <gbisson> it's debian bullseye with 5.14 kernel basically

16:28 <gbisson> 1.18.4-3

16:29 <gbisson> I tried 2 different streams:

16:30 <gbisson> http://linode.boundarydevices.com/videos/Hobbit-1080p.mov

16:30 <gbisson> http://linode.boundarydevices.com/videos/trailer_1080p_h264_mp3.avi

16:30 <gbisson> same result (both H.264)

16:30 <gbisson> Tried different sinks (glimagesink, waylandsink, kmssink) => same result

16:41 <gbisson> actually the error message doesn't always appear, but the stream getting stuck happens every single time, I can record what I see if it helps

16:47 vagrantc has quit [Quit: leaving]

16:58 <ndufresne> gbisson: we can investigate further, e.g. you can trace with GST_DEBUG="v4l2*:7" and share the output ?

16:58 <ndufresne> the goal is as usual, to spot if the driver has gone nuts, or if it's inside gstreamer

17:00 <ndufresne> my env is too outdated for me to be able to repro myself today (though I'd be testing on git master)

17:04 <gbisson> ndufresne: thanks for your time on this, here is the output: http://linode.boundarydevices.com/gary/gst_a311d_h264dec.txt

17:04 <gbisson> as you can see, not much happens during the first 9 seconds

17:05 <gbisson> Then up until 35s there are some frames here and there but not great

17:05 <gbisson> Only then it works ok

17:05 <ndufresne> this one is weird, but likely not related: Failed to enumerate frame sizes for pixelformat NM12 (Invalid argument)

17:09 repk_ has quit [Quit: WeeChat 3.3]

17:09 repk has joined #linux-amlogic

17:09 <ndufresne> gbisson: looks like this is a ring buffer based driver, and it does not throttle the number of pending frames

17:09 <ndufresne> and 652 frames left

17:09 <gbisson> ndufresne: v4l2-dbg reports the proper format though

17:10 <ndufresne> that's the number of pending (submitted) frames that have been sent to the decoder

17:10 <chewitt> @ndufresne "Failed to enumerate frame sizes for pixelformat NM12 (Invalid argument)" sounds similar to what @JC (Pi dev) was talking about in LE slack yesterday

17:11 <ndufresne> for the "invalid argument" thing we need to trace the ioctl and see what the arguments are

17:11 <chewitt> "it is more of a disagreement about how to resolve (between ffmpeg & V4L2) what the video size/SAR/interlace etc. is at the start of time and if it ever changes."

17:11 <ndufresne> gstreamer falls back to try_fmt with min/max resolution and keeps going here

17:12 <ndufresne> ah, well, GStreamer code predates the API to handle that properly

17:13 <ndufresne> but there is a MR that I will merge at the start of the next dev cycle, it's from NXP, https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/1381

17:13 Guest3214 has quit [Remote host closed the connection]

17:13 Guest3214 has joined #linux-amlogic

17:13 <ndufresne> it's not ideal, but it is following the spec

17:13 <ndufresne> chewitt: remains that in that trace, the format guess worked and decoding started

17:14 <ndufresne> it just hung at some point

17:14 <ndufresne> I don't even know if the driver has enough space to remember 650 timestamps

17:14 <ndufresne> cause of course you need to pass over the TS to the respective capture buffer

17:15 <ndufresne> anyway, ffmpeg implementation will not allow this to happen iirc

17:15 <ndufresne> cause it's single threaded

17:15 <ndufresne> this all looks familiar / typical issue with Amlogic ring buffer handling

17:16 <ndufresne> gbisson: for now it looks like a driver issue

17:16 <gbisson> ndufresne: ok, so for the enum issue, the patch from NXP will fix it, correct?

17:16 <ndufresne> gbisson: no idea ;-P, whatever happens really needs investigation

17:17 <gbisson> ndufresne: as for the pending frames issues, it's a threading issue, most likely because of the driver missing some locks?

17:17 <ndufresne> with the NXP patch, if the driver implements the SRC_CH event, then GStreamer will be able to fix any wrong format on the capture queue, if it changed at run-time

17:17 <gbisson> ok

17:18 zkrx has quit [Quit: zkrx]

17:18 <ndufresne> but it is suboptimal cause this implementation will pre-allocate toward a guessed resolution, which may not be right/hw specific, and then re-allocate if something changed

17:19 <ndufresne> with NXP Amphion driver, the driver would just sit there waiting forever for GStreamer to reallocate, but in your case, the driver did start decoding, it just got stuck and stopped producing output

17:19 <ndufresne> so it does not look identical to me

17:20 <gbisson> ok, I might try to rebuild gst myself with that patch to see how it goes

17:20 <ndufresne> the extremely large amount of pending frames (652) is clearly a red flag

17:20 zkrx has joined #linux-amlogic

17:20 <ndufresne> buffering that much will make seeks pretty much unusable

17:21 <ndufresne> imho, drivers should limit their internal frame queue ...

17:21 <ndufresne> CODA is ring buffer based and does not seem to behave like this ...

17:22 <ndufresne> as an alternative, you could enhance GStreamer and add userland throttling, see if it helps. the driver reports a 10 frame backlog (10 frames before first frame out), so in theory we have the info to throttle
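
The userland throttling idea above can be modelled in a few lines. This is a plain Python simulation, not GStreamer or V4L2 code: the class `PendingFrameThrottle`, the `run` helper and the one-frame-out-per-step decoder model are all assumptions made for illustration. The limit of 10 is the backlog figure quoted above and 652 is the pending-frame count seen in the trace.

```python
from collections import deque

class PendingFrameThrottle:
    """Caps how many submitted frames may be waiting inside the decoder (hypothetical)."""

    def __init__(self, max_pending):
        self.max_pending = max_pending
        self.pending = deque()
        self.peak = 0  # highest backlog observed, for inspection

    def submit(self, frame_id):
        """Queue a frame; return False when the backlog limit is reached."""
        if len(self.pending) >= self.max_pending:
            return False
        self.pending.append(frame_id)
        self.peak = max(self.peak, len(self.pending))
        return True

    def frame_decoded(self):
        """Model the decoder handing back its oldest pending frame."""
        return self.pending.popleft()

def run(total_frames, max_pending):
    throttle = PendingFrameThrottle(max_pending)
    decoded = []
    next_frame = 0
    while len(decoded) < total_frames:
        # Feed the decoder only until the backlog limit is hit.
        while next_frame < total_frames and throttle.submit(next_frame):
            next_frame += 1
        # Wait for one frame to come out before submitting more.
        decoded.append(throttle.frame_decoded())
    return decoded, throttle.peak

decoded, peak = run(total_frames=652, max_pending=10)
print(f"decoded {len(decoded)} frames, peak backlog {peak}")
```

In this model the backlog never grows past the configured limit, whereas an unthrottled producer would have queued all 652 frames at once.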

17:23 <gbisson> ok, I believe it'd be best to have the same behavior between all the stateful drivers

17:24 <gbisson> I have plenty of i.MX devices so I guess I could compare with CODA more deeply

17:25 <gbisson> thanks for all your input

17:25 <chewitt> ndufresne what you're describing with the huge number of frames .. is exactly what the pi devs described as a problem before

17:26 <chewitt> when they started to work on v4l2_m2m for their H264 IP block, I started tracking their work

17:27 <chewitt> up to a point, the changes improved behaviour with the Amlogic vdec in addition to RPi4

17:27 <chewitt> then they reached a point where driver changes were needed (all about queue sizes)

17:27 <chewitt> and from that point forwards Amlogic regressed heavily

17:27 <chewitt> (using their ffmpeg branch)

17:29 <gbisson> chewitt: so no-one looked into the kernel side yet correct?

17:29 <chewitt> to be clear, playback start always worked

17:30 <chewitt> but seeking also started to work well, but then regressed badly after they started making the driver changes

17:30 <chewitt> @gbisson correct, nobody looked at the driver since

17:31 <chewitt> this is all with ffmpeg of course (Kodi is just a big wrapper around it) .. but I think the underlying issues are common and are driver

17:32 <ndufresne> I don't recall the RPi H264 driver being reported as having issues in this regard, they have been using/shipping gst for a while ...

17:33 <ndufresne> perhaps I would need more context

17:33 <ndufresne> gst threaded design is the only way we could achieve lower latency with good throughput, but it relies on the driver not over-committing

17:34 <chewitt> the possible difference being .. pi devs make driver changes so queues are sensible, and it's not an issue

17:34 <ndufresne> which isn't specified in the spec of course

17:34 <ndufresne> yeah, but Amlogic has a real ring buffer

17:34 <ndufresne> fixed size regardless of the compression ratio

17:34 <ndufresne> this is much trickier to tweak

17:35 <ndufresne> I've been there with Endless (some downstream v4l2 m2m decoder), but can't remember what they did on the kernel side to limit the queue

17:35 <ndufresne> but as soon as they figured out how to limit the queue, both GStreamer and Chromium (these were our targets) started to work well and also be responsive

17:36 <chewitt> @narmstrong ^ any of this sound familiar?

17:36 <ndufresne> otherwise, after each seek, the CPU would go to 100% for 30s, making the system sluggish

17:37 <ndufresne> (was on amlogic old gen 805 , whatever they are called)

17:37 <chewitt> Meson8 .. S805/S802/S812

17:37 <ndufresne> we need a new Max J. ;-D

17:37 <chewitt> indeed

17:38 <narmstrong> yes it's the exact issue I have now with the android v4l2_codec2

17:38 <ndufresne> I managed to get things fixed in mainline on many HW through my work, but for some reason, Amlogic mainline has never been part of my work

17:39 <narmstrong> there is a new h264 decoder based on the same HW as VP9 and HEVC that doesn't have this limitation/whatever, but maxime never finished it

17:40 <ndufresne> we have a strong feeling that the VP9/HEVC core is some derivative of Hantro G2

17:40 <ndufresne> this came up last week from one of my colleagues

17:40 <narmstrong> perhaps

17:40 <ndufresne> reversing the firmware that parses would tell us ;-P

17:41 <ndufresne> but these accelerators are entirely frame based, no ring buffer

17:42 <ndufresne> actually, VP9 would require a payload like ivf to be stored in a ring buffer

17:42 <chewitt> two people had a look at the firmware for me

17:42 <chewitt> @jernej (Allwinner LE dev) said it wasn't possible to figure out the ISA

17:42 <chewitt> and IIRC @cyrozap came to similar conclusions

17:43 <chewitt> and when I asked Amlogic people they declined to answer :)

17:44 <chewitt> https://github.com/cyrozap/amlogic-video-codec-re/blob/master/Notes.md

17:48 <ndufresne> I guess reversing the accelerator (basically the binary produced by the parser) and turning this into a stateless driver might be "easier"

17:53 <gbisson> I gotta go, thanks all for the support, I'll look into it more next week.

17:55 camus has quit [Quit: camus]

18:06 <cyrozap> chewitt, ndufresne, gbisson

18:06 <cyrozap> oops

18:09 <chewitt> Hi @cyrozap

18:10 <cyrozap> chewitt, ndufresne, gbisson: I don't believe I said it was impossible, just that it would be kind of a pain and I was more interested in MediaTek's video hardware. So long as the state of the AMRISC core can be dumped (preferably registers, but even just the data memory may suffice if you can figure out what instructions write registers to memory) and custom firmware uploaded to it (which should be

18:10 <cyrozap> possible since the firmware binaries appear to be unsigned), then it should be possible to (eventually) figure out the ISA.

18:12 <chewitt> I meant that it wasn't immediately possible to figure out the ISA .. bad wording

18:12 <chewitt> where there's a will, there's always a way :)

18:13 <cyrozap> Yeah, it would be way easier if a compiler existed, like it does for MediaTek's MD32.

18:18 <cyrozap> But without a compiler, the process basically amounts to: 1) diff the older and newer versions of binaries to make some educated guesses about which instructions do what, 2) fill a binary with the instruction sequences you want to test, 3) dump data memory, 4) load and execute the binary, 5) dump data memory

18:18 <cyrozap> 6) diff the data memory dumps
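
Steps 3, 5 and 6 of the procedure above boil down to comparing two snapshots of data memory. A minimal Python sketch of that comparison follows. The function name `diff_dumps` and the sample byte values are made up for illustration, and how the dumps are actually obtained from the AMRISC core is outside its scope.

```python
def diff_dumps(before: bytes, after: bytes):
    """Return (offset, old_byte, new_byte) for every byte that changed."""
    if len(before) != len(after):
        raise ValueError("dumps must cover the same memory range")
    return [
        (offset, old, new)
        for offset, (old, new) in enumerate(zip(before, after))
        if old != new
    ]

# Made-up example: 16 bytes of zeroed data memory, two of which the
# test instruction sequence is imagined to have written.
before = bytes(16)
after = bytearray(before)
after[4] = 0x2A
after[5] = 0x01

for offset, old, new in diff_dumps(before, bytes(after)):
    print(f"offset 0x{offset:04x}: 0x{old:02x} -> 0x{new:02x}")
```

Repeating this over many test binaries gives the input/output pairs that the later "system of equations" remark relies on.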

18:20 vagrantc has joined #linux-amlogic

18:20 <cyrozap> Essentially, the CPU just needs to be treated as a machine that takes instructions and external state (data memory) as input, modifies internal and external state (registers and data memory) based on the instructions, and then outputs external state.

18:21 vagrantc has quit [Client Quit]

18:22 pulpoff has quit [Quit: Ping timeout (120 seconds)]

18:23 <cyrozap> With enough tests and data, you can treat it like a system of equations and solve for the CPU operations.

18:30 <ndufresne> cyrozap: didn't know there was interest in RE MTK video HW (codec or other ?)

18:30 <ndufresne> in fact, I generally don't even know how to get access to hackable MTK HW ...

18:35 <ndufresne> gbisson: it looks like the patch I gave may help, I see a SRC_CHANGE event implementation in the h264 decoder

18:37 <ndufresne> gbisson: if you could add a couple of printk and check if after the stall the driver is in sess->status = STATUS_NEEDS_RESUME; I think we would be set

18:41 <cyrozap> ndufresne: Yeah, I'm interested, at least. And for hackable hardware, there's the MT8183-based ChromeOS devices, though my "preferred" platforms are the Orange Pi 4G-IoT and some Amazon devices because they're cheap, I have a bunch of them, and I can get my own code running on all of them (the Orange Pi doesn't have any signature checks, and the signature checks on the Amazon devices can be easily

18:41 <cyrozap> bypassed with some simple hardware mods).

18:47 <ndufresne> cool, might have a look at some point ;-P

18:48 <cyrozap> I wrote a Ghidra processor module (incomplete) for the MD32, which is the CPU the VPU firmware runs on: https://github.com/cyrozap/ghidra-md32

18:48 <ndufresne> is there a specific goal toward taking control of that co-processor ?

18:49 <ndufresne> or perhaps this is security research ?

18:50 vagrantc has joined #linux-amlogic

18:52 <cyrozap> Well, that's where its VPU firmware runs, and all the VPU firmware does is parse bitstreams and interact with the video hardware (which can also be accessed by the main CPU cores), so if we can figure out the video hardware from reverse engineering the VPU firmware, then we could have blob-free video encoder/decoder operation.

18:54 <ndufresne> cool cool, so blob free is the goal, I like it

18:55 <ndufresne> I'm personally curious about what type of HW accelerator these blobs hide

18:56 <ndufresne> I bet the firmware on 8183 is likely simpler, since the codecs are stateless from this version

18:56 <cyrozap> So, my main goal is really just to get rid of the unnecessary blobs, but it's also a fun little CPU you can do other stuff with. It has full access to memory and peripherals, so it can interact with any hardware the ARM cores can, so you could use it to offload arbitrary functions. And from a security perspective, if there are any flaws in the firmware that get you arbitrary code exec inside the

18:56 <cyrozap> MD32, you could use that to take over the whole SoC.

18:57 <cyrozap> Technically the codec hardware has always been stateless--they just modified the firmware to support stateless operation.

18:58 <ndufresne> yes, but in the domain, I know two types of base HW, well 3: frame based, slice based and macroblock based

18:59 <ndufresne> for the third, the firmware code will be about entropy decoding

19:00 <ndufresne> there are also some hybrid solutions, on some Qualcomm I have been told that HEVC decoding is partially done on DSP (basically software decoding)

19:01 <ndufresne> but I'm also curious if we will find known chips, like hantro/vsi, amphion, allegro, etc.

19:03 <ndufresne> It sounds like the MD32 is on top of everything, for if there is DRM it will sit there

19:10 <cyrozap> Interestingly, I think the non-ChromeOS MediaTek devices don't use the MD32 to operate the VPU, so you can actually get some details on the hardware from the released kernel sources: https://github.com/freedomtan/kernel-3.18-X20-96-board/tree/9e9d9fcb0b567bbedd409f7c79779caa233cbace/drivers/misc/mediatek/videocodec

19:10 <cyrozap> I don't know enough about video codec hardware to really understand it, though.

19:13 <ndufresne> just stumbled across "/* Add one line comment for avoid kernel coding style, WARNING:BRACES: */" what the hell does that mean ...

19:14 <cyrozap> trying to avoid kernel code style checks, probably

19:18 <ndufresne> ok, so this is a bunch of ref counts, and wait_isr(), I guess the userland will bang in devmem to program the registers

19:19 <ndufresne> it's kind of worse than the VSI ref driver

19:19 <ndufresne> but I think that resembles the RPi4 HEVC ref driver

19:20 <ndufresne> though quite unlikely related, the chip company that makes the RPi4 HEVC has been owned by Broadcom for a long time now

19:20 <ndufresne> (owned merged into)

19:21 <ndufresne> That's how the cedrus driver was born, passing known bitstreams, and finding back where each stream param got written by the userland blob

19:24 <ndufresne> (later some code was leaked, but yeah)

19:40 tdebrouw has joined #linux-amlogic

21:09 vagrantc has quit [Quit: leaving]

22:12 <mcirsta> I am interested in the HW vdec of the Amlogic SOCs but my knowledge of these things is close to 0

22:13 <mcirsta> ndufresne I was just setting up my env to be able to build and test what was done so far and I'm hoping to somehow figure out how to adapt the rpi work in ffmpeg

22:14 <mcirsta> I think it's really interesting to actually open source the binary firmware too but given the interest so far I doubt it will be easy

22:15 <mcirsta> if the Amlogic provided firmware works well enough I think it's a reasonable compromise to just use that

22:15 pulpoff has joined #linux-amlogic

22:15 <pulpoff> hi chewitt

22:52 mcirsta has quit [Quit: Connection closed]

23:27 <warpme__> amlogic is a stateful decoder so proper state machine synchro between provider (kernel driver + hw decoder) internal state machine & consumer (ffmpeg/player) state machine is required.

23:28 <warpme__> imho the major complication here is that state machines between vendors may differ - so besides the api design (already done in v4l2_m2m) we also need to design an in-driver state machine abstraction (common to all drivers).

23:28 <warpme__> IMHO this should be: a driver-unified state machine exposed to the consumer + state translation in the kernel driver to the hw vendor specific state machine.

23:29 <warpme__> Sounds complicated - but without this we will have the well known in software "spaghetti effect": you fix the driver for vendor A and break vendor B. PlanB can be going with per-vendor quirks in the consumer (ffmpeg/player).

23:36 <warpme__> given lack of knowledge about per-SOC state machine specifics - quirks approach probably will be most reasonable. I would narrow SoCs to just 2: brcm & amlogic. Rest like coda or venus are too exotic (imho). Key issue of quirk approach: it will never be upstreamed :-(

23:42 <warpme__> we have quite well working decode of h264 on rpi & amlogic on mainline 5.15. Show stopper is seek. There is exemplary code for ffmpeg doing a buffer flush at seek - but it puts the hw decoders into a hang (due to state machine de-sync). So - if we fix this (in drivers for rpi & aml) - we may have something worth showing to users...

23:51 tdebrouw has quit [Quit: Leaving.]