<DC-IRC> <microlinux> AFAIK, broadcom abandoned the TV box market
<DC-IRC> <microlinux> So, rpi5 will be based on what's left
<DC-IRC> <microlinux> Probably a bit better lithography, cortex a72 x4 with av1 codec, something like that I suppose
<DC-IRC> <clever___> an rpi engineer has stated, that the vc6 core wasnt finished when the 2711 came out
<DC-IRC> <clever___> so the 2711, is a vc4 core, with random bits of vc6 bolted on the side
<DC-IRC> <microlinux> Something like that
<DC-IRC> <microlinux> You mean that what they advertise as vc6 isn't really a vc6?
<DC-IRC> <clever___> yeah, its just vc4, with some vc6 features
<DC-IRC> <tenkawa42> Doubtful..
<DC-IRC> <tenkawa42> Not cost efficient enough
<DC-IRC> <microlinux> Ohh, amazing, isn't that called false advertising? Hahah
<DC-IRC> <clever___> vc6 is whatever broadcom decides to call vc6
<DC-IRC> <tenkawa42> Not in the US market anyway
<DC-IRC> <tenkawa42> Not with BCM's typical cost model
<DC-IRC> <microlinux> A rpi5 should have a more powerful gpu for sure, even doubling the performance they will stay quite behind
<DC-IRC> <clever___> that kinda goes counter to the original point of rpi, which they have stated on the blog before
<DC-IRC> <clever___> moores law says that you can get more compute for the same price, as things get better
<DC-IRC> <clever___> but the other side, says you can get the same computer, for a cheaper price
<DC-IRC> <tenkawa42> yeah.. the RPI5 has to stay on par with the 4 in cost, etc,
<DC-IRC> <clever___> so they can give you a computer thats good enough to learn coding, for less money, which makes it easier for more people to enter the market
<DC-IRC> <tenkawa42> otherwise they might as well not bother
<DC-IRC> <microlinux> Yeah, that's what I mean, for the prices, it's not competitive (if they place 4xa72 and the same videocore)
<DC-IRC> <tenkawa42> Yeah
<DC-IRC> <microlinux> I would recommend any other brand, if price remains similar
<DC-IRC> <microlinux> Unless.. you need gpio/csi
<DC-IRC> <tenkawa42> That might be why its already been this long
<DC-IRC> <microlinux> Bc gpio/csi stuff is far better supported/documented on rpi
<DC-IRC> <tenkawa42> Yeah
<DC-IRC> <tenkawa42> The dev IO is their strength for sure
<DC-IRC> <microlinux> For example.. vpu was literally deprecated on rk3399.
<DC-IRC> <microlinux> On legacy
<DC-IRC> <microlinux> So, you can only use it from android
<DC-IRC> <clever___> is it the same VPU as the rpi VPU?
<DC-IRC> <microlinux> Nono, I am saying that rpi can do vpu on "mainline"
<DC-IRC> <microlinux> It's supported and easy to use with many csi cameras
<DC-IRC> <clever___> and VPU is?
<DC-IRC> <microlinux> No other brand offers that as of today, not on mainline, not even close
<DC-IRC> <microlinux> I don't know the exact name of rockchip vpu, but who cares about the name. There is a vpu driver for mainline forever in progress
<DC-IRC> <microlinux> It should kinda work now, but only for decoding
<DC-IRC> <clever___> because on the rpi, VPU is a dual-core cpu, with both scalar and vector opcodes
<DC-IRC> <microlinux> And with software patches
<DC-IRC> <clever___> its not something you have drivers for, its a completely self-contained cpu cluster
<DC-IRC> <microlinux> Okay, but.. it works on mainline
<DC-IRC> <microlinux> That's the difference
<DC-IRC> <microlinux> I mean, rpi vpu
<DC-IRC> <clever___> that statement means nothing however
<DC-IRC> <clever___> the VPU is required for the system to even have working dram
<DC-IRC> <microlinux> Okay, the vpu decoding/encoding features
<DC-IRC> <microlinux> If you want it like that
<DC-IRC> <clever___> h264 encode/decode goes thru the VCE
<DC-IRC> <tenkawa42> @microlinux in the rpi the vpu is the cpu
<DC-IRC> <microlinux> Yes, I got that clear. It sounds confusing when comparing it with other brands, but by vpu I mean the blocks that do enc/dec
<DC-IRC> <microlinux> And that works on rpi on mainline
<DC-IRC> <clever___> yeah, thats why i asked "is it the same VPU as the rpi VPU?"
<DC-IRC> <tenkawa42> the vpu doesn't do any hw enc/dec
<DC-IRC> <tenkawa42> right @clever___ ?
<DC-IRC> <microlinux> How so??
<DC-IRC> <tenkawa42> thats another component I thought
<DC-IRC> <clever___> on the rpi, the VPU only runs the drivers for the VCE
<DC-IRC> <clever___> and the VCE does the h264/vc1/mpeg2 encode/decode
<DC-IRC> <microlinux> Well, it's what we call a vpu outside rpi, so, no idea why it's so confusing to say the same on rpi
<DC-IRC> <tenkawa42> Yeah
<DC-IRC> <clever___> and then linux talks to the VPU over an RPC channel
<DC-IRC> <tenkawa42> I thought the hardware part was that VCE part
<DC-IRC> <clever___> yep
<DC-IRC> <clever___> for license/patent reasons, broadcom will never tell us how to drive the VCE directly
<DC-IRC> <microlinux> With vpu, vision processing unit, I refer to enc/dec blocks, if it has any other name on rpi, sorry
<DC-IRC> <clever___> so the VPU must be the middle-man
<DC-IRC> <tenkawa42> Yeah BCM and their fun licensing....
<DC-IRC> <clever___> patent fees also get involved
<DC-IRC> <tenkawa42> Indeed
<DC-IRC> <clever___> if broadcom wants to sell a device that is capable of mpeg2 hw decode, they must pay a per-device fee
<DC-IRC> <clever___> but if they disable it in software, they dont have to pay
<DC-IRC> <clever___> and then they can sell a license key to the end-user, to re-enable it
<DC-IRC> <microlinux> But well, the point is the same, for many industrial use cases, you need a solid setup on mainline with csi cameras and hw enc capabilities. That's only available on rpi as of today.
<DC-IRC> <clever___> if we knew how to manage the VCE, removing such a check would be trivial
<DC-IRC> <clever___> yeah
<DC-IRC> <microlinux> Even though I wouldn't recommend rpi5 as a learning/affordable desktop, bc other socs offer better hw for the same price, it will be the king of those csi/gpio/maker use cases.
<DC-IRC> <microlinux> @clever___ you think that's too far away from rpi5??
<DC-IRC> <clever___> ive not seen any hints, but ive also not been decompiling the latest firmware lately
<DC-IRC> <tenkawa42> Kind of having similar discussions in the Risc-V space right now too.. trying to get people to understand we need to stabilize things before getting too aggressive with some of the mainline use cases
<DC-IRC> <microlinux> 4xa72 at a decent lithography is decent cpu power indeed. An improved vc6 and av1 would be a great deal
<DC-IRC> <clever___> h265/hevc on the pi4 is also weird
<DC-IRC> <microlinux> @clever, why wasn't rpi able to use the same hw decoders for aarch64?
<DC-IRC> <clever___> its decode only, and they skipped the closed drivers
<DC-IRC> <microlinux> It works great on armhf, but.. aarch64...
<DC-IRC> <clever___> the RPC between linux and the VPU was using 32bit slots for userland pointers, so when a reply comes back, the userland code can find its own state
<DC-IRC> <clever___> a 64bit pointer wont fit in a 32bit field
<DC-IRC> <clever___> its the same as scsi/msd using a 32bit tag on every request, so when replies come back out of order, you can match things up
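The tag idea described here can be sketched in a few lines. This is only an illustration of the general technique (hand out a small 32-bit tag, keep the real 64-bit state in a table on the Linux side); the `TagTable` class and its method names are hypothetical and are not taken from the rpi kernel driver or firmware.

```python
class TagTable:
    """Hypothetical helper: maps 32-bit tags to arbitrary per-request state."""

    def __init__(self):
        self._next_tag = 1      # 0 is kept free to mean "no tag"
        self._pending = {}      # tag -> state (can hold a 64-bit pointer)

    def register(self, state):
        """Store state and return a tag that fits in a 32-bit field."""
        tag = self._next_tag
        # Skip 0 and any tag that is still waiting for its reply.
        while tag == 0 or tag in self._pending:
            tag = (tag + 1) & 0xFFFFFFFF
        self._next_tag = (tag + 1) & 0xFFFFFFFF
        self._pending[tag] = state
        return tag

    def complete(self, tag):
        """Look up and forget the state for a reply carrying this tag."""
        return self._pending.pop(tag)


table = TagTable()
first = table.register({"ptr": 0x7FFFDEADBEEF0000})   # too wide for 32 bits
second = table.register({"ptr": 0x7FFFDEADBEEF1000})

# Replies may come back out of order; the tag still finds the right state.
print(hex(table.complete(second)["ptr"]))
print(hex(table.complete(first)["ptr"]))
```

Only the tag crosses the 32-bit field, so the side that owns the table can stay 64-bit without the other side changing its message layout.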
<DC-IRC> <microlinux> And broadcom didn't offer any 64 bit alternative?
<DC-IRC> <clever___> they didnt want to rewrite the VPU side to support 64bit slots
<DC-IRC> <clever___> they dont want to maintain the closed firmware more than they have to
<DC-IRC> <clever___> so, they are hiding all of this mess within the linux kernel
<DC-IRC> <microlinux> Bc every time an rpi guy says rpi os 64 bit is much better than 32 bit rpi os, I always disagree bc of this same point
<DC-IRC> <clever___> and exposing hw encode/decode over v4l mem2mem encoder/decoder blocks
<DC-IRC> <clever___> then its mainline's job to define the userland api, which is already done
<DC-IRC> <microlinux> Performance wise it wasn't as good as the 32 bit counterpart
<DC-IRC> <microlinux> It was a diff implementation of course
<DC-IRC> <clever___> yeah, the closed api was amazing
<DC-IRC> <clever___> one min
<DC-IRC> <microlinux> Indeed, 4k hevc runs like a champ on 32 bit
<DC-IRC> <clever___> 4k hevc has proper linux drivers
<DC-IRC> <clever___> so it should behave identically on 32bit and 64bit
<DC-IRC> <microlinux> So, it only affects h264?
<DC-IRC> <clever___> and vc1 and mpeg2
<DC-IRC> <microlinux> Okaa
<DC-IRC> <clever___> this is the omx/openmax api
<DC-IRC> <clever___> if you want to take a video file and play that, you create instances of:
<DC-IRC> <clever___> video_decode
<DC-IRC> <clever___> video_scheduler
<DC-IRC> <clever___> video_render
<DC-IRC> <clever___> audio_decode
<DC-IRC> <clever___> audio_render
<DC-IRC> <clever___> and clock
<DC-IRC> <clever___> you then wire up all of the matching colored ports
<DC-IRC> <clever___> 131 on `video_decode` provides raw yuv frames, wire that to 10 on `video_scheduler`
<DC-IRC> <clever___> then `11` on scheduler goes to 90 on `video_render`
<DC-IRC> <clever___> `audio_decode` goes to `audio_render`
<DC-IRC> <clever___> and `clock` goes to both `audio_render` and `video_scheduler`
<DC-IRC> <clever___> then you just feed raw compressed video into `video_decode`
<DC-IRC> <clever___> and raw compressed audio into `audio_decode`
<DC-IRC> <clever___> and the openmax stack (within the VPU) does everything
<DC-IRC> <clever___> linux/arm never has to touch a single byte of decoded video
<DC-IRC> <clever___> so you're not wasting clock cycles on copying finished frames
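The wiring described above can be written down as a small graph. This is a plain-Python model, not the OpenMAX C API: the `OmxGraph` class is hypothetical, and only the port numbers mentioned in the chat (131, 10, 11, 90) are filled in. The audio and clock ports were not given, so they are left as `None` rather than guessed.

```python
from dataclasses import dataclass, field


@dataclass
class OmxGraph:
    """Hypothetical model of components joined by port-to-port tunnels."""
    components: set = field(default_factory=set)
    tunnels: list = field(default_factory=list)

    def add(self, *names):
        self.components.update(names)

    def tunnel(self, src, src_port, dst, dst_port):
        # Refuse to wire up a component that was never created.
        for name in (src, dst):
            if name not in self.components:
                raise KeyError(f"unknown component: {name}")
        self.tunnels.append((src, src_port, dst, dst_port))

    def downstream(self, name):
        """Components fed directly by the named component."""
        return [dst for src, _, dst, _ in self.tunnels if src == name]


graph = OmxGraph()
graph.add("video_decode", "video_scheduler", "video_render",
          "audio_decode", "audio_render", "clock")

# Video path, using the port numbers quoted in the chat.
graph.tunnel("video_decode", 131, "video_scheduler", 10)
graph.tunnel("video_scheduler", 11, "video_render", 90)

# Audio and clock links; port numbers not stated, so left unset.
graph.tunnel("audio_decode", None, "audio_render", None)
graph.tunnel("clock", None, "audio_render", None)
graph.tunnel("clock", None, "video_scheduler", None)

for src, src_port, dst, dst_port in graph.tunnels:
    print(f"{src}:{src_port} -> {dst}:{dst_port}")
```

In the real stack each of these links would be set up on the VPU side, which is why decoded frames never have to pass through the arm.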
<DC-IRC> <clever___> but, the move towards more open apis has led to worse overall handling, the open stuff has to play catch-up
<DC-IRC> <clever___> v4l mem2mem decoders, are returning each video frame to vlc as a `dma_buf` containing a yuv frame i think
<DC-IRC> <clever___> but x11 cant display yuv
<DC-IRC> <clever___> so then the arm cpu has to do a costly yuv->rgb conversion
<DC-IRC> <clever___> but if you full-screen vlc, it can switch to drm leasing, and then it can do direct yuv playback
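To give a feel for the per-pixel work the arm ends up doing in the windowed case, here is a sketch of a yuv->rgb conversion for one pixel. It assumes BT.601 limited-range YCbCr with the commonly quoted coefficients; the function name is made up for illustration, and a real player would use an optimized (often SIMD or GPU) path rather than code like this.

```python
def yuv_to_rgb_bt601_limited(y, cb, cr):
    """Convert one 8-bit YCbCr pixel to 8-bit RGB.

    Assumption: BT.601, limited range (Y in 16..235, Cb/Cr in 16..240).
    """
    c = y - 16
    d = cb - 128
    e = cr - 128

    r = 1.164 * c + 1.596 * e
    g = 1.164 * c - 0.392 * d - 0.813 * e
    b = 1.164 * c + 2.017 * d

    def clamp(value):
        # Round to the nearest integer and keep it inside 0..255.
        return max(0, min(255, round(value)))

    return clamp(r), clamp(g), clamp(b)


print(yuv_to_rgb_bt601_limited(16, 128, 128))    # limited-range black
print(yuv_to_rgb_bt601_limited(235, 128, 128))   # limited-range white
```

That is several multiplies, adds and clamps for every pixel of every frame, which is the cost that direct yuv playback avoids.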
<DC-IRC> <clever___> @microlinux does that all make sense?
<DC-IRC> <clever___> as a test, i opened vlc on a random `h264 1280x528 24.000fps` file
<DC-IRC> <clever___> its using 17-20% while playing in a maximized window, with the fkms driver (oops)
<DC-IRC> <clever___> but that actually lets me test a diff codepath
<DC-IRC> <clever___> ```
<DC-IRC> <clever___> pi@pi400:~ $ vcgencmd dispmanx_list
<DC-IRC> <clever___> display:2 format:XRGB8888 transform:0 layer:-127 1280x1024 src:0,0,1280,1024 dst:0,0,1280,1024 cost:801 lbm:0
<DC-IRC> <clever___> pi@pi400:~ $
<DC-IRC> <clever___> ```
<DC-IRC> <clever___> fkms is also called firmware kms or fake kms
<DC-IRC> <clever___> linux is exposing a kms/drm api to userland, but its just RPC'ing everything back to the VPU
<DC-IRC> <clever___> and with `dispmanx_list`, you can see that the VPU is rendering a single RGBx8888 image, at 1:1 scale
<DC-IRC> <microlinux> Ohh, that explains a lot
<DC-IRC> <clever___> if the mouse pointer is visible, a second sprite appears
<DC-IRC> <clever___> `display:2 format:ARGB8888 transform:0 layer:1 64x64 src:0,2,64,62 dst:734,0,64,62 cost:125 lbm:0 `
<DC-IRC> <clever___> thats an ARGB sprite, so the per-pixel alpha lets the pointer draw over things
<DC-IRC> <clever___> hw composition overlays it
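The two `dispmanx_list` lines pasted above can be picked apart with a small parser. The token layout (`key:value` pairs plus one bare `WxH` size) is inferred only from those two lines, so treat `parse_dispmanx_line` as a hypothetical helper, not a documented format.

```python
def parse_dispmanx_line(line):
    """Parse one `vcgencmd dispmanx_list` line into a dict.

    Assumption: tokens are space separated, either key:value or a bare WxH.
    """
    layer = {}
    for token in line.split():
        if ":" in token:
            key, value = token.split(":", 1)
            if "," in value:
                # Rectangles such as src:0,0,1280,1024
                layer[key] = tuple(int(v) for v in value.split(","))
            elif value.lstrip("-").isdigit():
                layer[key] = int(value)
            else:
                layer[key] = value
        elif "x" in token:
            width, height = token.split("x")
            layer["size"] = (int(width), int(height))
    return layer


desktop = parse_dispmanx_line(
    "display:2 format:XRGB8888 transform:0 layer:-127 1280x1024 "
    "src:0,0,1280,1024 dst:0,0,1280,1024 cost:801 lbm:0"
)
pointer = parse_dispmanx_line(
    "display:2 format:ARGB8888 transform:0 layer:1 64x64 "
    "src:0,2,64,62 dst:734,0,64,62 cost:125 lbm:0"
)

# Higher layer number draws on top; a leading "A" means per-pixel alpha.
for sprite in sorted((desktop, pointer), key=lambda s: s["layer"]):
    has_alpha = sprite["format"].startswith("A")
    print(sprite["layer"], sprite["size"], sprite["format"], has_alpha)
```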
<DC-IRC> <clever___> but, the arm core is responsible for putting the video into this single RGB 1280x1024 frame
<DC-IRC> <clever___> ```
<DC-IRC> <clever___> > 1280*1024*4*24
<DC-IRC> <clever___> 125,829,120
<DC-IRC> <clever___> ```
<DC-IRC> <clever___> so the arm is having to write 125mb/sec to ram, just to populate this with the video
<DC-IRC> <clever___> and because this is a 1280x528 video, maximized, it happens to not need scaling
<DC-IRC> <clever___> ```
<DC-IRC> <clever___> > 1280*528*4*24
<DC-IRC> <clever___> 64,880,640
<DC-IRC> <clever___> ```
<DC-IRC> <clever___> half the load, but you still have to deal with tearing or knowing when the buffers in double-buffering are out of sync with the non-video elements
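The two calculations above are the same formula with different sizes. A minimal sketch, assuming 4 bytes per pixel and one full write of the region per video frame (the function name is made up for illustration):

```python
def fill_rate_bytes_per_sec(width, height, fps, bytes_per_pixel=4):
    """Bytes written per second to repaint a width x height region every frame."""
    return width * height * bytes_per_pixel * fps


full_frame = fill_rate_bytes_per_sec(1280, 1024, 24)   # whole desktop image
video_only = fill_rate_bytes_per_sec(1280, 528, 24)    # just the video area

print(f"{full_frame:,}")                  # 125,829,120
print(f"{video_only:,}")                  # 64,880,640
print(f"{video_only / full_frame:.1%}")   # 51.6%
```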
<DC-IRC> <clever___> now i'll switch to kms (also called full kms), and repeat.....
<DC-IRC> <clever___> ```
<DC-IRC> <clever___> root@pi400:~# vcgencmd dispmanx_list
<DC-IRC> <clever___> root@pi400:~#
<DC-IRC> <clever___> ```
<DC-IRC> <clever___> yep, dispmanx is no longer in control
<DC-IRC> <clever___> ```
<DC-IRC> <clever___> root@pi400:/sys/kernel/debug/dri/0# less state
<DC-IRC> <clever___> plane[70]: plane-3
<DC-IRC> <clever___> crtc=crtc-3
<DC-IRC> <clever___> fb=226
<DC-IRC> <clever___> allocated by = Xorg
<DC-IRC> <clever___> refcount=2
<DC-IRC> <clever___> format=XR24 little-endian (0x34325258)
<DC-IRC> <clever___> modifier=0x0
<DC-IRC> <clever___> size=1280x1024
<DC-IRC> <clever___> layers:
<DC-IRC> <clever___> size[0]=1280x1024
<DC-IRC> <clever___> pitch[0]=5120
<DC-IRC> <clever___> offset[0]=0
<DC-IRC> <clever___> obj[0]:
<DC-IRC> <clever___> name=0
<DC-IRC> <clever___> refcount=4
<DC-IRC> <clever___> start=00100281
<DC-IRC> <clever___> size=5242880
<DC-IRC> <clever___> imported=no
<DC-IRC> <clever___> crtc-pos=1280x1024+0+0
<DC-IRC> <clever___> src-pos=1280.000000x1024.000000+0.000000+0.000000
<DC-IRC> <clever___> rotation=1
<DC-IRC> <clever___> normalized-zpos=0
<DC-IRC> <clever___> color-encoding=ITU-R BT.601 YCbCr
<DC-IRC> <clever___> color-range=YCbCr limited range
<DC-IRC> <clever___> ```
<DC-IRC> <clever___> linux is now managing a single 1280x1024 image, still RGBx8888
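The numbers in that `state` dump can be cross-checked by hand. The sketch below decodes the fourcc value and recomputes pitch and buffer size, assuming 4 bytes per pixel and no padding at the end of each row; both helper names are made up for illustration.

```python
import struct


def fourcc_to_str(code):
    """Decode a 32-bit DRM fourcc into its four ASCII characters."""
    # The characters are packed least-significant byte first.
    return struct.pack("<I", code).decode("ascii")


def fb_layout(width, height, bytes_per_pixel=4):
    """Return (pitch, total size) in bytes, assuming no row padding."""
    pitch = width * bytes_per_pixel
    return pitch, pitch * height


print(fourcc_to_str(0x34325258))   # XR24
print(fb_layout(1280, 1024))       # (5120, 5242880)
```

Both results line up with `format=XR24`, `pitch[0]=5120` and `size=5242880` in the dump.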
<DC-IRC> <clever___> to my surprise, vlc is now using 48% cpu
<DC-IRC> <clever___> this should effectively be identical performance to before, will need to investigate more
<DC-IRC> <clever___> and if i full-screen, the video breaks
<DC-IRC> <clever___> it did try to switch to drm leasing, but i have an ancient install of raspi-os