This repository has been archived by the owner on Mar 6, 2024. It is now read-only.

Open Test Thread #14

hbiyik opened this issue Jul 22, 2023 · 144 comments

@hbiyik
Owner

hbiyik commented Jul 22, 2023

@avafinger

A lot has been added, especially the HEVC and VP8 encoders with scaling support:

https://github.com/hbiyik/FFmpeg/wiki

It should be stable enough to test, if you are interested.

@nyanmisaka

Is there an option to force *_rkmpp_decoder to output drm_prime hw frames?

@hbiyik
Owner Author

hbiyik commented Jul 22, 2023

Set the FFMPEG_RKMPP_PIXFMT=DRMPRIME environment variable.

@nyanmisaka

HEVC/VP9/AV1 10-bit to H264/HEVC 8bit transcoding can cause system crashes. Sometimes I have to cut the power to reset.

FFMPEG_RKMPP_PIXFMT=DRMPRIME ./ffmpeg -c:v hevc_rkmpp_decoder -i hevc_10bit.mp4 -an -sn -c:v h264_rkmpp_encoder -rc_mode VBR -b:v 6M -maxrate 6M -bufsize 12M -profile:v high -level 4.1 -g:v 120 -f null -
[  771.797702] rk_vcodec: mpp_translate_reg_address:1838: reg[  0]: 0xffffffff fd -1 failed
[  771.797713] rk_vcodec: mpp_task_dump_mem_region:2025: --- dump mem region ---
[  771.797724] mpp_rkvenc2 fdbd0000.rkvenc-core: no memory region mapped
[  771.797737] rk_vcodec: mpp_process_task_default:630: alloc_task failed.
[  771.797747] rkvenc2_wait_result:1995: session 00000000736d7b74 pending list is empty!
[  771.797753] rk_vcodec: mpp_msgs_wait:1634: session 3 wait result ret -5
[  772.019465] rkvdec2_ccu_timeout_work:1643: fdc38100.rkvdec-core, task timeout
[  772.019515] rkvdec2_ccu_timeout_work:1643: fdc48100.rkvdec-core, task timeout
[  772.019586] mpp_rkvdec2 fdc48100.rkvdec-core: resetting...
[  772.019782] mpp_rkvdec2 fdc48100.rkvdec-core: reset done
[  772.019794] mpp_rkvdec2 fdc38100.rkvdec-core: resetting...
[  772.019896] mpp_rkvdec2 fdc38100.rkvdec-core: reset done

Also it seems the post RGA -width 1280 -height 720 doesn't accept 10-bit hw frames.

[h264_rkmpp_encoder @ 0xaaaae92c1dc0] Scaling is only supported for NV12,NV16,YUV420P,YUV422P. drm_prime requested

@hbiyik
Owner Author

hbiyik commented Jul 22, 2023

Can you try without forcing DRMPRIME? You don't actually need DRM PRIME; soft frames are zero-copy.

Scaling is only possible for YUV420/422 planar and semi-planar formats, but I noticed that when the output frame is DRM PRIME I don't check it correctly. I'll fix that...

@nyanmisaka

You don't actually need DRM PRIME; soft frames are zero-copy.

Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), nv12(tv, progressive), 1920x1080
But the encoder's input still suggests nv12, which usually implies non-zero-copy.

After removing the env variable, I hit what looks like the RGA3 alignment issue, or maybe the DMA32 buffer issue on my 16 GB RAM board.

Stream mapping:
  Stream #0:5 -> #0:0 (hevc (hevc_rkmpp_decoder) -> h264 (h264_rkmpp_encoder))
Press [q] to stop, [?] for help
[hevc_rkmpp_decoder @ 0xaaaaf9f9af50] Decoder noticed an info change
[hevc_rkmpp_decoder @ 0xaaaaf9f9af50] 10bit NV15 plane will be downgraded to 8bit nv12.
rga_api version 1.8.1_[4]
err hs[0,1088,1080]
Error srcRect
[RgaBlit,782]Error srcRect

fd-vir-phy-hnd-format[12, (nil), (nil), 0, 8192]
rect[0, 0, 1920, 1088, 2816, 1080, 8192, 0]
f-blend-size-rotation-col-log-mmu[8192, 0, 0, 0, 0, 0, 1]
fd-vir-phy-hnd-format[20, (nil), (nil), 0, 2560]
rect[0, 0, 1920, 1088, 1920, 1088, 2560, 0]
f-blend-size-rotation-col-log-mmu[2560, 0, 0, 0, 0, 0, 1]
This output the user patamaters when rga call blit fail
[hevc_rkmpp_decoder @ 0xaaaaf9f9af50] RGA failed falling back to soft conversion
[hevc_rkmpp_decoder @ 0xaaaaf9f9af50] RGA failed to convert NV15 -> NV12. No Soft Conversion Possible
[hevc_rkmpp_decoder @ 0xaaaaf9f9af50] Failed set frame buffer (code = -1)
[hevc_rkmpp_decoder @ 0xaaaaf9f9af50] Decoder Failed to get frame (code = -1)
Error while decoding stream #0:5: Operation not permitted

And for the HEVC 8-bit 1080p input + post RGA downscaling to 720p.
https://test-videos.co.uk/vids/bigbuckbunny/mp4/h265/1080/Big_Buck_Bunny_1080_10s_30MB.mp4

./ffmpeg -stream_loop -1 -c:v hevc_rkmpp_decoder -i Big_Buck_Bunny_1080_10s_30MB.mp4 -an -sn -c:v h264_rkmpp_encoder -rc_mode CBR -b:v 6M -maxrate 6M -bufsize 12M -profile:v high -level 4.1 -g:v 120 -width 1280 -height 720 -f null -
rga_api version 1.8.1_[4]
err hs[0,1088,1080]
Error srcRect
[RgaBlit,782]Error srcRect

fd-vir-phy-hnd-format[12, (nil), (nil), 0, 2560]
rect[0, 0, 1920, 1088, 2304, 1080, 2560, 0]
f-blend-size-rotation-col-log-mmu[2560, 0, 0, 0, 0, 0, 1]
fd-vir-phy-hnd-format[37, (nil), (nil), 0, 2560]
rect[0, 0, 1280, 720, 1280, 720, 2560, 0]
f-blend-size-rotation-col-log-mmu[2560, 0, 0, 0, 0, 0, 1]
This output the user patamaters when rga call blit fail
[h264_rkmpp_encoder @ 0xaaaaf24ec9e0] RGA failed falling back to soft conversion
[h264_rkmpp_encoder @ 0xaaaaf24ec9e0] Error applying Post RGA

@hbiyik
Owner Author

hbiyik commented Jul 22, 2023

OK, I'll take a detailed look at it tonight.

@hbiyik
Owner Author

hbiyik commented Jul 22, 2023

The 2nd issue is quite weird.

rect[0, 0, 1920, 1088, 2304, 1080, 2560, 0]

The hstride is given as 2304 here, and this is received directly from MPP. For this NV12 frame I would expect 1920 (with a vstride of 1080, or 1088 with alignment), so where 2304 comes from and why MPP reports it is interesting. It could be an issue in MPP that I need to dig into.
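For reference, the stride numbers in this comment are dimensions rounded up to some alignment. Below is a minimal Python sketch of that arithmetic. `align_up` is a hypothetical helper, and the alignment values 16 and 256 are assumptions chosen for illustration; nothing in this thread confirms which alignments MPP or librga actually apply.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    return (value + alignment - 1) // alignment * alignment


# Alignments below are assumptions for illustration only.
width, height = 1920, 1080
print(align_up(height, 16))       # 1088
print(align_up(width, 16))        # 1920
# Plain arithmetic on the reported value: 2304 is a multiple of 256.
print(2304 % 256, 2304 // 256)    # 0 9
```

A helper like this makes it easy to compare what a decoder reports against what a downstream consumer expects before handing the buffer over.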

Update:

  1. Forcing DRMPRIME crashed everything: fixed in ed61669.
  2. Forcing DRMPRIME with scaling gave the wrong limitation error: fixed in bf96d57.
  3. Converting Big_Buck_Bunny_1080_10s_30MB.mp4 crashes the RGA: this is most likely an MPP bug. I understand exactly why it happens, and I do not want to patch around it in ffmpeg because MPP is obviously providing the wrong stride. I raised a bug, "mpp provides wrong hstride when decoding this hevc file" (rockchip-linux/mpp#422), and I expect a simple bugfix from MPP.

@nyanmisaka

Thx for the update. With the fixes:

  Stream #0:0: Video: hevc (Main) (hev1 / 0x31766568), drm_prime(tv, progressive), 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 6000 kb/s, 23.98 fps, 24k tbn (default)
    Metadata:
      BPS-eng         : 5794040
      DURATION-eng    : 01:30:06.401000000
      NUMBER_OF_FRAMES-eng: 129624
      NUMBER_OF_BYTES-eng: 3915613217
      _STATISTICS_WRITING_APP-eng: mkvmerge v42.0.0 ('Overtime') 64-bit
      _STATISTICS_WRITING_DATE_UTC-eng: 2020-01-14 15:39:31
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
      encoder         : Lavc60.3.100 hevc_rkmpp_encoder
rga_api version 1.8.1_[4]
[hevc_rkmpp_encoder @ 0xaaaaefd216c0] Reconfigured with w=1280, h=720, format=nv12.
Video
ID                             : 1
Format                         : HEVC
Format/Info                    : High Efficiency Video Coding
Format profile                 : [email protected]@Main
Codec ID                       : hev1
Codec ID/Info                  : High Efficiency Video Coding
Duration                       : 23 s 524 ms
Bit rate                       : 4 409 kb/s
Maximum bit rate               : 6 000 kb/s
Width                          : 1 920 pixels
Original width                 : 1 280 pixels
Height                         : 1 080 pixels
Original height                : 720 pixels
Display aspect ratio           : 16:9
Frame rate mode                : Constant
Frame rate                     : 23.976 (24000/1001) FPS
Color space                    : YUV
Chroma subsampling             : 4:2:0
Bit depth                      : 8 bits
Scan type                      : Progressive
Bits/(Pixel*Frame)             : 0.089
Stream size                    : 12.4 MiB (100%)
Menus                          : 2
Codec configuration box        : hvcC
  • forcing DRMPRIME + 8-bit nv12 input + w/ and w/o post RGA => corrupted output, TODO?
    [Screenshot: hevc_8b]

@hbiyik
Owner Author

hbiyik commented Jul 23, 2023

This should all work; however, I switched to librga and it is being a bi**h. I am working on it.
Anything else in the meantime?

@hbiyik
Owner Author

hbiyik commented Jul 23, 2023

Ah, one question: which player are you using to test the DRM PRIME output? Is it Kodi or something else?

@nyanmisaka

I can't figure out why librga exists independently of mpp. This resulted in developers having to maintain compatibility between them.

sw frames seem to work fine in the encoder. Video quality looks better than AMD graphics cards, albeit slower.

I've been using the command line to test the encoder. Kodi might be a good option for testing ffmpeg as a library. Or refer to some tools of the author of rpi-ffmpeg such as https://github.com/jc-kynesim/hello_drmprime.

@hbiyik
Owner Author

hbiyik commented Jul 23, 2023

All of the issues you reported today and yesterday, including the one I thought was MPP-related, should be fixed in 1d57c70.

There is still something fishy with MPP as far as I understand; however, it was not the root cause of this issue.

About SW frames: even when so-called non-DRM-PRIME planes are decoded with rkmpp_decoder, they are still hardware planes most of the time, simply mmapped to the DRM device. Especially when the NV12 plane is used, there is no copy at all. So I would call them hybrid, and in transcoding scenarios they are mainly NV12, that is, hardware frames mapped to an AVFrame. It is kind of tricky, but this is the actual reason why I started this fork in the first place.

In short, forcing DRMPRIME when transcoding makes no difference in terms of performance; you can verify this from the throughput and resource usage. This is why I had not tested transcoding with DRMPRIME forced, so thanks for that :). Lots of issues were found and (hopefully) fixed. I had tested DRM PRIME only with kmsgrab input, which gives DRM PRIME bgr0 frames.

@nyanmisaka

Overall works great after the latest changes, with some exceptions.

  • In post RGA cases, avctx->{width,height} should be finalized before calling rkmpp_config(), otherwise MPP_ENC_GET_HDR_SYNC cannot generate the correct extradata for the HEVC encoder. Also, with forcing DRMPRIME enabled, the changes to avctx->{width,height} are ignored. You can check the resolution of the encoded file with ffprobe.

  • The default level of the HEVC encoder is an invalid value: 255. It would be better to add a new option "auto=0" and let the MPP runtime handle the default level of the H26x encoders.

It seems that my experience with desktop GPUs doesn't fully apply to Arm/Rockchip. As you said, forcing DRMPRIME does not improve performance. I also couldn't find any encoder preset to trade off between speed and quality. So the claimed encoding speed of 8k30 cannot be equivalent to single 4k120 or 1080p480. Maybe in parallel encoding it will work.
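To make the comparison concrete, here is a small Python sketch of the raw pixel rates involved. The resolutions used for "8k", "4k" and "1080p" are the common 16:9 ones and are an assumption; `pixel_rate` is a hypothetical helper. The three modes have the same raw pixel rate on paper, which is exactly why the observation above (that they are not interchangeable on this hardware) is worth noting.

```python
def pixel_rate(width: int, height: int, fps: int) -> int:
    """Raw luma pixels per second for a given mode."""
    return width * height * fps


# Assumed 16:9 resolutions for each nominal mode.
modes = {
    "8k30": (7680, 4320, 30),
    "4k120": (3840, 2160, 120),
    "1080p480": (1920, 1080, 480),
}

for name, mode in modes.items():
    # All three modes print 995328000.
    print(name, pixel_rate(*mode))
```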

As for the HEVC encoder of rk3588, it doesn't support NV15 input, which means it cannot encode Main 10 profile/10-bit video, maybe they will add it in the next generation of HW. But for now it's best to remove it so as not to confuse users.

@hbiyik
Owner Author

hbiyik commented Jul 24, 2023

Thanks for the detailed feedback, everything is fixed in the latest commit, cf6e176.

About level 8.5:
The value 255 is the so-called level 8.5 for HEVC, a special level that enforces no limitations. I was sceptical about the h265 parser in MPP so I used that one, but I have now changed the default to 0.

My understanding is that the only way to trade off speed is to reduce the input size. I think there is a constant pixel-per-cycle processing rate, but I am not sure; I have not yet benchmarked the encoder performance. The best way to do that would be to first compare against mpi_enc_test and the other tools that ship with MPP.

Open question: do you know any way to produce NV24, YUV444P, NV16, YUV422P, BGR24, YUYV422, UYVY422, BGRA, BGR0, NV12, YUV420P formatted DRM PRIME frames with ffmpeg, so that I can push them to the encoder?

Currently I can only test NV12, NV16, YUV420P (rkmpp_decoder) and BGR0 (kmsgrab) DRM PRIME frames; the rest is untested due to lack of input.

@veldspar

alright, I checked out your git, I did modify the configure line a little from your wiki however:

./configure --enable-rkmpp --enable-version3 --enable-libdrm --enable-nonfree --enable-gpl --enable-version3 --enable-libx264 --enable-librtmp --enable-shared --enable-static --enable-libx265 --enable-libmp3lame --enable-libpulse --enable-openssl --enable-libopus --enable-libvorbis --enable-libaom --enable-libass --enable-libdav1d --enable-libx265 --enable-libvpx

I used that configure line because I also used it for jjm2473's fork of rkmpp-enabled ffmpeg, so I have a comparison. The build went through cleanly on the first try with no errors, although there was a warning during linking.

After the build, it complained about a missing libavdevice.so.60 on the first run (ffmpeg --encoder); I found it in the libavdevice subfolder of your git checkout. After that it complained about the rest of the common libs being missing, one by one. My simple guess is that this is because I didn't install ffmpeg. After adjusting my LD_LIBRARY_PATH for testing, ffmpeg loads.

So far so good. ffmpeg -encoders | grep rk lists the h264, hevc and vp8 encoders.
ffmpeg -decoders | grep rk lists the h263, h264, hevc, mpeg1/2/4, vp8 and vp9 hardware decoders. I'm starting to like this.

Now let's give it a try. The goal is to transcode.
./ffmpeg -i in.mkv -c:v hevc -c:a copy /extern/nn.hevc.mp4

and I get a segfault. The same happens when trying to encode to h264.

EDIT: I tried a clean build with the configure line from the repo's wiki, but I still get a segfault when trying to transcode a video, whether I transcode to h264 or hevc. The input in all cases was h264 Full HD.

As for the linker warning, this is what I get:
LD ffprobe_g
/usr/bin/ld: /lib/aarch64-linux-gnu/libtirpc.so.3: warning: common of `rpc_createerr@@GLIBC_2.17' overridden by definition from /lib/aarch64-linux-gnu/libc.so.6
/usr/bin/ld: /lib/aarch64-linux-gnu/libtirpc.so.3: warning: common of `rpc_createerr@@GLIBC_2.17' overridden by definition from /lib/aarch64-linux-gnu/libc.so.6
/usr/bin/ld: /lib/aarch64-linux-gnu/libtirpc.so.3: warning: common of `rpc_createerr@@GLIBC_2.17' overridden by definition from /lib/aarch64-linux-gnu/libc.so.6
STRIP ffplay

Small addendum: I had installed a current MPP, but it went into /usr/local/lib and the system MPP was used instead. It turns out your ffmpeg isn't compatible with the MPP that comes from Radxa's repo. A simple override using LD_LIBRARY_PATH fixed that, and the segfault is gone. It might be worth adding that to the wiki.
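For anyone hitting the same thing, here is a rough Python sketch of why the override helps: directories listed in LD_LIBRARY_PATH are searched before the system defaults, so the first copy found wins. This is a deliberately simplified model (it ignores RPATH/RUNPATH and the ld.so cache). The library file name `librockchip_mpp.so.1` and the default directory list are assumptions, and both helpers are hypothetical.

```python
import os
from typing import Iterable, List, Optional


def search_order(env_value: str, defaults: Iterable[str]) -> List[str]:
    """LD_LIBRARY_PATH entries first, then the (assumed) default directories."""
    env_dirs = [d for d in env_value.split(":") if d]
    return env_dirs + list(defaults)


def first_match(lib_name: str, search_dirs: Iterable[str]) -> Optional[str]:
    """Return the first directory entry that contains lib_name, if any."""
    for directory in search_dirs:
        candidate = os.path.join(directory, lib_name)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # Assumed defaults and file name; adjust to your distribution.
    dirs = search_order(os.environ.get("LD_LIBRARY_PATH", ""),
                        ["/usr/lib", "/usr/local/lib"])
    print(first_match("librockchip_mpp.so.1", dirs))
```

On a real system, `ldd ./ffmpeg` shows which copy the dynamic linker actually resolves.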

@hbiyik
Owner Author

hbiyik commented Jul 24, 2023

I have made the statement in the wiki more explicit:

Please use latest versions of those libraries, especially mpp, and rga which are not very backwards compatible.

@nyanmisaka

open question: do you know any way to produce NV24, YUV444P, NV16, YUV422P, BGR24, YUYV422, UYVY422, BGRA, BGR0, NV12, YUV420P formatted DRM PRIME frames with ffmpeg so that i can push them to encoder.

currently i can only test NV12, NV16, YUV420P (rkmpp_decoder) & BGR0 (kmsgrab) drm prime frames and rest is not tested due to lack of input.

This might require extra effort in libavutil/hwcontext_drm. Currently it is only a skeleton and, unlike other hwcontexts, does not contain a frame allocator. I remember someone has a patch for this and put it somewhere.

Once complete, the command line should look like this:

./ffmpeg -init_hw_device drm=dr:/dev/dri/renderD128 -filter_hw_device dr -f lavfi -i testsrc2=s=1280x720,format=bgra -vf hwupload,format=drm_prime -f null -

The ideal situation is that we can have a separate hwcontext_mpp as a sub-device of the hwcontext_drm.

@nyanmisaka

@hbiyik
Owner Author

hbiyik commented Jul 24, 2023

Amazing, that helps a lot.

@hbiyik
Owner Author

hbiyik commented Jul 24, 2023

Hmm, on a second review I am not so sure about this:

This patch gets the hstride from the linesize of the picture descriptor; however, in my experience and testing, ffmpeg and DRM do not always have the same definition of hstride and plane count.
https://github.com/Consti10/rv1126_ohd_sushi/blob/853fa1fc2d50f0e4f9b5eea71d1ff2657c9a2765/buildroot/package/ffmpeg/0005-hwcontext_drm-internal-frame-allocation.patch#L310C1-L311C1

ie: for ffmpeg AV_PIX_FMT_BGR0 is 1 plane format with
hstride = 4 * width
vstride = height
size = hstride * vstride

but for drm it is 4 1 plane format with
hstride = width
vstride = height
size = 4 * hstride * vstride

I learned this by testing the kmsgrab DRM plane format bgr0.

if(isrgb)

May be it is better to look at libdrm in detail how the size and strides and planes are defined.

Update: after careful examination, the hstride multiplier is based on the char_per_block definition rather than the plane size, so this should work.
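The two conventions described above can be written out as a short Python sketch. It assumes a packed 4-bytes-per-pixel format such as BGR0; the function names are hypothetical and this is not taken from ffmpeg, libdrm or the patch. The point is only that a stride counted in bytes and a stride counted in pixels give the same buffer size, as long as the bytes-per-pixel factor is applied exactly once.

```python
def size_byte_stride(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Stride counted in bytes: hstride = bpp * width, size = hstride * vstride."""
    hstride = bytes_per_pixel * width
    vstride = height
    return hstride * vstride


def size_pixel_stride(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Stride counted in pixels: hstride = width, size = bpp * hstride * vstride."""
    hstride = width
    vstride = height
    return bytes_per_pixel * hstride * vstride


print(size_byte_stride(1920, 1080))   # 8294400
print(size_pixel_stride(1920, 1080))  # 8294400
```

Mixing the two conventions (for example passing a byte stride where a pixel stride is expected) would over- or under-count the size by the bytes-per-pixel factor.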

@hbiyik
Owner Author

hbiyik commented Jul 24, 2023

OK, I was confusing plane count with char_per_block.

@veldspar

Quick question: is the -level option for the hevc encoder like the presets from x265? If so, does a lower level mean a slower preset, i.e. better quality? Sorry to bother you here, but I couldn't find anything about this on the web.

@nyanmisaka

Nope. It refers to coding constraints. There's no x265 CRF equivalent.
https://en.wikipedia.org/wiki/High_Efficiency_Video_Coding_tiers_and_levels
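To illustrate what "coding constraints" means here, the Python sketch below picks the lowest HEVC level whose picture-size and sample-rate limits fit a given resolution and frame rate. The limit values are reproduced from the HEVC level table as summarized on the page linked above, so please double-check them before relying on them. The sketch ignores bitrate, tier and the other per-level limits, and `min_hevc_level` is a hypothetical helper, not an ffmpeg or MPP API.

```python
from typing import Optional

# (level, max luma picture size in samples, max luma sample rate in samples/s)
HEVC_LEVELS = [
    ("1", 36_864, 552_960),
    ("2", 122_880, 3_686_400),
    ("2.1", 245_760, 7_372_800),
    ("3", 552_960, 16_588_800),
    ("3.1", 983_040, 33_177_600),
    ("4", 2_228_224, 66_846_720),
    ("4.1", 2_228_224, 133_693_440),
    ("5", 8_912_896, 267_386_880),
    ("5.1", 8_912_896, 534_773_760),
    ("5.2", 8_912_896, 1_069_547_520),
    ("6", 35_651_584, 1_069_547_520),
    ("6.1", 35_651_584, 2_139_095_040),
    ("6.2", 35_651_584, 4_278_190_080),
]


def min_hevc_level(width: int, height: int, fps: float) -> Optional[str]:
    """Lowest level whose size and rate limits both hold (simplified check)."""
    picture_size = width * height
    sample_rate = picture_size * fps
    for level, max_size, max_rate in HEVC_LEVELS:
        if picture_size <= max_size and sample_rate <= max_rate:
            return level
    return None


print(min_hevc_level(1920, 1080, 30))  # 4
print(min_hevc_level(1920, 1080, 60))  # 4.1
```

So a level describes what a decoder must be able to handle; it is not a speed/quality preset.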

@hbiyik
Owner Author

hbiyik commented Jul 24, 2023

Meanwhile, one interesting feature could be the so-called "split mode", to reduce latency in live-streaming apps. Any experience with it? Does it really make sense?

@nyanmisaka

Sounds like it should help moonlight. Btw have you figured out the patch yet?

@hbiyik
Owner Author

hbiyik commented Jul 24, 2023

Ah, I see. Yeah, cloud gaming is quite popular and hip lately, so latency could matter. I cannot work on the patch right now; I will when I get home in the evening.

@nyanmisaka

The original author left a little flaw. This should save you some time.

0001-lavu-hwcontext_drm-Add-internal-frame-allocation.patch

FFmpeg cli works as expected but there are some issues with certain formats.

./ffmpeg -init_hw_device drm=dr:/dev/dri/renderD128 -filter_hw_device dr -f lavfi -i testsrc2=s=1280x720,format=yuv444p -vf hwupload,format=drm_prime -c:v h264_rkmpp_encoder -rc_mode VBR -b:v 4M -maxrate 4M -bufsize 8M -y /tmp/out.mp4

[Screenshot: yuv444p_fail]

@hbiyik
Owner Author

hbiyik commented Jul 24, 2023

So, thanks a lot for the patch...

After several fixes, NV12, NV16, NV24, BGR24, YUYV422, UYVY422 and BGRA work.

YUV420P, YUV422P and YUV444P are not giving red channels. Considering those are YUV formats, something is obviously wrong with the source, not the encoder. I don't know what yet.

BGR0 also does not work. I see that the size of the DRM plane is 3/4 of what it is supposed to be; I think that is because the last X (or 0) part of the plane is not allocated, to save space. But this is not OK for MPP: if you give it such strides with a smaller size, MPP will try to read the non-existent last quarter of the plane, where there was supposed to be a bunch of 0s, and it will segfault and crash the kernel. I don't know which one to blame, but considering kmsgrab also gives a full-size plane with the bgr0 format, it seems something is wrong with these generated DRM PRIME frames.
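The size mismatch described above is the kind of thing a cheap sanity check can catch before a buffer is handed to hardware. The Python sketch below is only an illustration of that check; `plane_size_ok` is a hypothetical helper, the 1280x720 dimensions are an example, and nothing here reflects how MPP or the encoder wrapper actually validates buffers.

```python
def plane_size_ok(buffer_size: int, hstride_bytes: int, vstride: int) -> bool:
    """True if the buffer covers every row implied by the strides."""
    return buffer_size >= hstride_bytes * vstride


# Example: a 4-bytes-per-pixel plane at 1280x720.
width, height = 1280, 720
full_size = 4 * width * height

print(plane_size_ok(full_size, 4 * width, height))           # True
print(plane_size_ok(full_size * 3 // 4, 4 * width, height))  # False
```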

But I could already test planar planes and bgr0 from other sources, so I think the test planes from this new patch give good enough coverage.


@veldspar

so another issue - the build generally works fine, however I noticed when converting DVD Source material(I tried one source with a 352x576 SAR 24:11 and another 720x576 source) ffmpeg complains about duplicating frames. Sample out:

   encoder         : Lavc60.3.100 ac3
[vost#0:0/hevc_rkmpp_encoder @ 0x55a0eedb50] More than 1000 frames duplicatedts/s dup=981 drop=418 speed=15.9x    
[dvd @ 0x55a0eedea0] buffer underflow st=0 bufi=10966 size=20148ate= 753.7kbits/s dup=3275 drop=1367 speed=  16x    
[dvd @ 0x55a0eedea0] buffer underflow st=0 bufi=12990 size=20148
[dvd @ 0x55a0eedea0] buffer underflow st=0 bufi=15014 size=20148
[dvd @ 0x55a0eedea0] buffer underflow st=0 bufi=17038 size=20148
[dvd @ 0x55a0eedea0] buffer underflow st=0 bufi=19062 size=20148
[vost#0:0/hevc_rkmpp_encoder @ 0x55a0eedb50] More than 10000 frames duplicateds/s dup=9936 drop=3975 speed=13.6x    
^C[out#0/dvd @ 0x55a0dba0b0] Error writing trailer: Immediate exit requestedits/s dup=21884 drop=8468 speed=14.6x    
frame=51408 fps=314 q=-0.0 Lsize=  197120kB time=00:28:35.32 bitrate= 941.4kbits/s dup=21939 drop=8491 speed=10.5x    
video:177251kB audio:14838kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.619029%
Exiting normally, received signal 2.
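To quantify how bad the duplication is, the dup/drop counters can be pulled out of ffmpeg's progress line. Below is a minimal Python sketch; the regular expression is written against the line format seen in the log above and may not match other ffmpeg versions, and `parse_progress` is a hypothetical helper.

```python
import re
from typing import Dict, Optional

# Matches the frame, dup and drop counters in one progress line.
_STATS = re.compile(r"frame=\s*(\d+).*?dup=\s*(\d+)\s+drop=\s*(\d+)")


def parse_progress(line: str) -> Optional[Dict[str, int]]:
    """Extract frame/dup/drop counters, or None if the line has none."""
    match = _STATS.search(line)
    if match is None:
        return None
    frame, dup, drop = (int(group) for group in match.groups())
    return {"frame": frame, "dup": dup, "drop": drop}


sample = ("frame=51408 fps=314 q=-0.0 Lsize=  197120kB time=00:28:35.32 "
          "bitrate= 941.4kbits/s dup=21939 drop=8491 speed=10.5x")
stats = parse_progress(sample)
print(stats)  # {'frame': 51408, 'dup': 21939, 'drop': 8491}
```

In this sample a large share of the output frames are duplicates, which would be consistent with the stuttering described below.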

this happens regardless of whether I try the iso image as source or the individual VOB files. The resulting file is a stuttering output that plays a couple frames, then hangs for a fraction of a second and then continues playing.

line used in above example was:
ffmpeg -fflags +genpts -i dvd.iso -c:v hevc -b:v 1000k -c:a ac3 -rematrix_maxval 1.0 -ac 2 -f dvd out.mp4

This works fine in software on my desktop computer. (I need to encode to h264 in both cases, with your ffmpeg on the Rock 5 or my ffmpeg on the desktop. For some reason, when I choose hevc I get a non-playing video for this second source, and VLC claims it is an MPEG-2 stream, but that might be a bug upstream in the ffmpeg pipeline.)

@nyanmisaka

Command lines like this can turn off the blue led.
sudo sh -c "echo none > /sys/class/leds/blue\:status/trigger"

But AFAIK the green led on 5B cannot be controlled by software.

@hbiyik
Owner Author

hbiyik commented Nov 16, 2023

v10+panthor-v3+32b+g310+egl15+afrc+afbc
v10+reg-cache

I tested both of the above branches from Boris's Mesa, but gnome/gdm crashes at start.

I also had to apply this patch to compile it properly:

diff --git a/src/egl/main/egldriver.h b/src/egl/main/egldriver.h
index 6c8ea71..51c111e 100644
--- a/src/egl/main/egldriver.h
+++ b/src/egl/main/egldriver.h
@@ -227,7 +227,7 @@
    /* for EGL_EXT_surface_compression */
    EGLBoolean (*QuerySupportedCompressionRatesEXT)(_EGLDisplay *disp,
                                                    _EGLConfig *config,
-                                                   const EGLint *attr_list,
+                                                   const EGLAttrib *attr_list,
                                                    EGLint *rates,
                                                    EGLint rate_size,
                                                    EGLint *num_rates);

Otherwise it does not compile and complains about a long-to-int pointer mismatch.

 [LNK] Process 1470 (gnome-shell) of user 1000 dumped core.
                                              
Stack trace of thread 1470:
#0  0x0000ffff62d1aaec panfrost_resource_set_damage_region (rockchip_dri.so + 0xd0aaec)
#1  0x0000ffff620f5384 dri_st_framebuffer_validate (rockchip_dri.so + 0xe5384)
#2  0x0000ffff621b8a18 st_framebuffer_validate (rockchip_dri.so + 0x1a8a18)
#3  0x0000ffff621b9680 st_api_make_current (rockchip_dri.so + 0x1a9680)
#4  0x0000ffff620f4ed4 dri_make_current (rockchip_dri.so + 0xe4ed4)
#5  0x0000ffff620f8c18 driBindContext (rockchip_dri.so + 0xe8c18)
#6  0x0000ffff7c25683c dri3_bind_context (libGLX_mesa.so.0 + 0x4683c)
#7  0x0000ffff7c245a08 MakeContextCurrent (libGLX_mesa.so.0 + 0x35a08)
#8  0x0000ffff87b233d0 n/a (libGLX.so.0 + 0x33d0)
#9  0x0000ffff87b23e90 n/a (libGLX.so.0 + 0x3e90)
#10 0x0000ffff87b25370 n/a (libGLX.so.0 + 0x5370)
#11 0x0000ffff8bff02e0 n/a (libmutter-cogl-13.so.0 + 0x802e0)
#12 0x0000ffff8bf9acdc cogl_display_setup (libmutter-cogl-13.so.0 + 0x2acdc)
#13 0x0000ffff8bf99c44 cogl_renderer_check_onscreen_template (libmutter-cogl-13.so.0 + 0x29c44)
#14 0x0000ffff8c7df188 n/a (libmutter-13.so.0 + 0x14f188)
#15 0x0000ffff8ca84324 n/a (libmutter-clutter-13.so.0 + 0x64324)
#16 0x0000ffff8cac1548 clutter_context_new (libmutter-clutter-13.so.0 + 0xa1548)
#17 0x0000ffff8c730874 n/a (libmutter-13.so.0 + 0xa0874)
#18 0x0000ffff8d0197c4 g_initable_new_valist (libgio-2.0.so.0 + 0x997c4)
#19 0x0000ffff8d0198d4 g_initable_new (libgio-2.0.so.0 + 0x998d4)
#20 0x0000ffff8c7ad544 n/a (libmutter-13.so.0 + 0x11d544)
#21 0x0000ffff8c7ac4e8 n/a (libmutter-13.so.0 + 0x11c4e8)
#22 0x0000ffff8c7ae5c0 meta_context_setup (libmutter-13.so.0 + 0x11e5c0)
#23 0x0000aaaab1eb23f0 n/a (gnome-shell + 0x23f0)
#24 0x0000ffff8c4c7b80 __libc_start_call_main (libc.so.6 + 0x27b80)
#25 0x0000ffff8c4c7c60 __libc_start_main_impl (libc.so.6 + 0x27c60)
#26 0x0000aaaab1eb2c70 n/a (gnome-shell + 0x2c70)

@hbiyik
Owner Author

hbiyik commented Nov 16, 2023

But the good news is that kmscube is working, so I can use mpv to render directly over GBM. I think that's good enough for me to start working on porting mpp.

Below are the PKGBUILDs, in case anyone is interested:

https://github.com/7Ji-PKGBUILDs/linux-panthor/blob/master/PKGBUILD
https://github.com/7Ji-PKGBUILDs/mesa-panthor/blob/master/PKGBUILD

And finally, thermal values, but with a hot NVMe:

bigcore0_thermal (/sys/devices/virtual/thermal/thermal_zone1/hwmon1)
    temp1: 50.85C
bigcore1_thermal (/sys/devices/virtual/thermal/thermal_zone2/hwmon2)
    temp1: 50.85C
center_thermal (/sys/devices/virtual/thermal/thermal_zone4/hwmon5)
    temp1: 50.85C
gpu_thermal (/sys/devices/virtual/thermal/thermal_zone5/hwmon6)
    temp1: 49.92C
littlecore_thermal (/sys/devices/virtual/thermal/thermal_zone3/hwmon4)
    temp1: 50.85C
npu_thermal (/sys/devices/virtual/thermal/thermal_zone6/hwmon7)
    temp1: 49.92C
pwmfan (/sys/devices/platform/pwm-fan/hwmon/hwmon3)
    pwm1: 100.00%
regulators
    dcdc-reg1 (vdd_gpu_s0): 0.68V (min: 0.55V, max: 0.95V, state: enabled, num_users: 2)
    dcdc-reg10 (vcc_1v8_s3): 1.80V (min: 1.80V, max: 1.80V, state: enabled, num_users: 1)
    dcdc-reg2 (vdd_cpu_lit_s0): 0.75V (min: 0.55V, max: 0.95V, state: enabled, num_users: 1)
    dcdc-reg3 (vdd_log_s0): 0.75V (min: 0.68V, max: 0.75V, state: enabled, num_users: 1)
    dcdc-reg4 (vdd_vdenc_s0): 0.75V (min: 0.55V, max: 0.95V, state: enabled, num_users: 1)
    dcdc-reg5 (vdd_ddr_s0): 0.85V (min: 0.68V, max: 0.90V, state: enabled, num_users: 1)
    dcdc-reg6 (vdd2_ddr_s3): 0.50V (state: enabled, num_users: 1)
    dcdc-reg7 (vdd_2v0_pldo_s3): 2.00V (min: 2.00V, max: 2.00V, state: enabled, num_users: 4)
    dcdc-reg8 (vcc_3v3_s3): 3.30V (min: 3.30V, max: 3.30V, state: enabled, num_users: 2)
    dcdc-reg9 (vddq_ddr_s0): 0.50V (state: enabled, num_users: 1)
    nldo-reg1 (vdd_0v75_s3): 0.75V (min: 0.75V, max: 0.75V, state: disabled, num_users: 1)
    nldo-reg2 (vdd_ddr_pll_s0): 0.85V (min: 0.85V, max: 0.85V, state: disabled, num_users: 1)
    nldo-reg3 (avdd_0v75_s0): 0.75V (min: 0.75V, max: 0.75V, state: disabled, num_users: 1)
    nldo-reg4 (vdd_0v85_s0): 0.85V (min: 0.85V, max: 0.85V, state: disabled, num_users: 1)
    nldo-reg5 (vdd_0v75_s0): 0.75V (min: 0.75V, max: 0.75V, state: disabled, num_users: 1)
    pldo-reg1 (avcc_1v8_s0): 1.80V (min: 1.80V, max: 1.80V, state: disabled, num_users: 1)
    pldo-reg2 (vcc_1v8_s0): 1.80V (min: 1.80V, max: 1.80V, state: disabled, num_users: 1)
    pldo-reg3 (avdd_1v2_s0): 1.20V (min: 1.20V, max: 1.20V, state: disabled, num_users: 1)
    pldo-reg4 (vcc_3v3_s0): 3.30V (min: 3.30V, max: 3.30V, state: disabled, num_users: 1)
    pldo-reg5 (vccio_sd_s0): 3.30V (min: 1.80V, max: 3.30V, state: disabled, num_users: 1)
    pldo-reg6 (pldo6_s3): 1.80V (min: 1.80V, max: 1.80V, state: disabled, num_users: 1)
    regulator@42 (vdd_cpu_big0_s0): 0.80V (min: 0.55V, max: 1.05V, state: enabled, num_users: 1)
    regulator@43 (vdd_cpu_big1_s0): 0.80V (min: 0.55V, max: 1.05V, state: enabled, num_users: 1)
    vcc-1v1-nldo-s3-regulator (vcc_1v1_nldo_s3): 1.10V (num_users: 6)
    vcc12v-dcin-regulator (vcc12v_dcin): 12.00V (num_users: 2)
    vcc3v3-pcie2x1l0-regulator (vcc3v3_pcie2x1l0): 3.30V (state: enabled, num_users: 2)
    vcc3v3-pcie2x1l2-regulator (vcc3v3_pcie2x1l2): 3.30V (num_users: 2)
    vcc3v3-pcie30-regulator (vcc3v3_pcie30): 3.30V (state: enabled, num_users: 1)
    vcc5v0-host-regulator (vcc5v0_host): 5.00V (state: enabled, num_users: 3)
    vcc5v0-sys-regulator (vcc5v0_sys): 5.00V (num_users: 23)
soc_thermal (/sys/devices/virtual/thermal/thermal_zone0/hwmon0)
    temp1: 50.85C
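A listing like the one above can be approximated with a few lines of Python that read the standard Linux thermal sysfs files, where each `thermal_zoneN` directory exposes a `type` name and a `temp` value in millidegrees Celsius. This is a sketch under that assumption, not the tool that produced the output above; `read_thermal_zones` is a hypothetical helper.

```python
import glob
import os
from typing import Dict


def read_thermal_zones(root: str = "/sys/class/thermal") -> Dict[str, float]:
    """Map each thermal zone's type name to its temperature in degrees Celsius."""
    readings: Dict[str, float] = {}
    for zone in sorted(glob.glob(os.path.join(root, "thermal_zone*"))):
        try:
            with open(os.path.join(zone, "type")) as type_file:
                name = type_file.read().strip()
            with open(os.path.join(zone, "temp")) as temp_file:
                millidegrees = int(temp_file.read().strip())
        except (OSError, ValueError):
            # Skip zones that cannot be read or report no valid value.
            continue
        readings[name] = millidegrees / 1000.0
    return readings


if __name__ == "__main__":
    for name, celsius in read_thermal_zones().items():
        print(f"{name}: {celsius:.2f}C")
```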

@hbiyik
Owner Author

hbiyik commented Nov 18, 2023

@nyanmisaka

first steps on the moon

[alarm@alarm ~]$ uname -a
Linux alarm 6.7.0-rc1-panthor-g4a610b4a1e08 #1 SMP PREEMPT Sat Nov 18 19:44:55 UTC 2023 aarch64 GNU/Linux
[alarm@alarm ~]$ ls /dev/mpp_service 
/dev/mpp_service

@nyanmisaka

That’s awesome 👏 One day we should package it as a DKMS.

I don't know if mpp has a hard requirement for their custom dma-buf-heaps driver. Currently in 6.1 bsp this driver is broken, and the android ion driver has been removed by upstream linux, so the only available mem allocator is drm.

I experimented with the 6.1 bsp kernel, and with some fixes, it already has a high degree of completion. Backporting the panthor to it is also valuable.

Yesterday I also encountered a system freeze in 6.7-rc1 caused by the hdmi screen being turned off.

@hbiyik
Owner Author

hbiyik commented Nov 18, 2023

it works!! :) (you need to get latest, i pushed some dts fixes.)

[alarm@alarm ~]$ sudo dmesg | grep mpp
[sudo] password for alarm: 
[    0.083389] mpp_service mpp-srv: c785ffa2ad90 author: boogie 2023-11-18 mpp: rkvenc2, remove devfreq support
[    0.083398] mpp_service mpp-srv: probe start
[    0.084611] mpp_service mpp-srv: probe success
[    0.358183] mpp_vdpu2 fdb50400.vdpu: Adding to iommu group 1
[    0.359279] mpp_vdpu2 fdb50400.vdpu: probe device
[    0.359853] mpp_vdpu2 fdb50400.vdpu: reset_group->rw_sem_on=0
[    0.360375] mpp_vdpu2 fdb50400.vdpu: reset_group->rw_sem_on=0
[    0.361343] mpp_vdpu2 fdb50400.vdpu: probing finish
[alarm@alarm ~]$ uname -a
Linux alarm 6.7.0-rc1-panthor-g4a610b4a1e08 #1 SMP PREEMPT Sat Nov 18 19:44:55 UTC 2023 aarch64 GNU/Linux

[alarm@alarm ~]$ ffmpeg -loglevel info -i h264_1080p_30_sdr.mp4 -an -sn -f null -
....
  Stream #0:0[0x1](und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(progressive), 1920x1080 [SAR 1:1 DAR 16:9], 24559 kb/s, 30 fps, 30 tbr, 15360 tbn (default)
    Metadata:
      handler_name    : VideoHandler
      vendor_id       : [0][0][0][0]
[h264_rkmpp_decoder @ 0xaaaaea88d0b0] Picture format is nv12.
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (h264_rkmpp_decoder) -> wrapped_avframe (native))
Press [q] to stop, [?] for help
[h264_rkmpp_decoder @ 0xaaaaea88d0b0] Decoder noticed an info change
....

[alarm@alarm ~]$ journalctl -f 
Nov 18 22:23:56 alarm mpp[1566]: mpp_info: mpp version: 3b278438 author: Yandong Lin   2023-11-01 fix[hal_h265e_vepu541]: fix roi buffer variables incorrect use

@hbiyik
Owner Author

hbiyik commented Nov 18, 2023

but my panthor is completely broken.

@hbiyik
Owner Author

hbiyik commented Nov 18, 2023

That’s awesome 👏 One day we should package it as a DKMS.

I am not sure it is easy, because mpp uses a lot of extra exported symbols from iommu_rockchip and some other variants of the Rockchip drivers. This should never be allowed in mainline. One option would be to use existing kernel interfaces to do similar operations, but I am not that good at kernel development. I think the reason mpp does this is that, since it is controlled by user space, it sometimes needs to reset itself and some other subsystems like the MMU. You can see those operations in the history log, where I have cherry-picked some commits from the vendor kernel.

I don't know if mpp has a hard requirement for their custom dma-buf-heaps driver. Currently in 6.1 bsp this driver is broken, and the android ion driver has been removed by upstream linux, so the only available mem allocator is drm.

I do not think so. i think the drm device is just a wrapper around the dma_buf_heap device, so the drm device should be ok.

Backporting the panthor to it is also valuable.

Should be, because mainline is not the best when it comes to opp_select and pvtm, and especially the clock driver. The clock driver was very different from the vendor one; i am not sure whether it is better or worse, but even the commit history was completely different.

Some notes:
I have disabled devfreq, qos, and dmc_lock operations in the kernel. i do not know the exact impact, but yeah, this is only a proof of concept at the moment. i will stop here i guess, unless there is a working panthor.

Next target would be porting RGA, and i guess/hope it would be easier than mpp. But you know, nothing is easy with RGA.

@kyak

kyak commented Nov 19, 2023

@nyanmisaka @hbiyik can you guys please enlighten me about what's going on with panthor? Is it better than panfork? How is it related to panfrost? :)

I understand you are doing something important for Rock 5. Will it equally work for Orange Pi 5?

@nyanmisaka

@hbiyik

but my panthor is completely broken.

Do you mean adding these mpp related code prevents panthor from running?

I am not sure it is easy, because mpp uses a lot of extra exported symbols from iommu_rockchip and some other variants of the rockchip drivers. This should never be allowed in mainline. One option would be to use existing kernel interfaces to do similar operations, but i am not that good at kernel development. I think the reason mpp does this is that, since it is controlled by user space, it sometimes needs to reset itself and some other systems like the mmu. You can see those operations in the history log, where i have cherry-picked some commits from the vendor kernel.

Indeed. We shouldn't waste time on DKMS at this early stage.

Should be, because mainline is not the best when it comes to opp_select and pvtm, and especially the clock driver. The clock driver was very different from the vendor one; i am not sure whether it is better or worse, but even the commit history was completely different.

Mainline is not mature in frequency scaling, even though Collabora has a custom cpufreq driver. An example is that the lowest CPU freq in bsp can be reduced to 400MHz versus 800MHz in mainline.
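
One rough way to compare the two kernels is to read the minimum frequency from sysfs. This is only a minimal sketch, assuming the standard Linux cpufreq sysfs layout (`/sys/devices/system/cpu/cpufreq/policy*/cpuinfo_min_freq`, values in kHz). `khz_to_mhz` is a helper made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: convert a kHz value (as reported by cpufreq) to MHz.
khz_to_mhz() {
    echo $(( $1 / 1000 ))
}

# Print the lowest frequency each cpufreq policy allows.
for policy in /sys/devices/system/cpu/cpufreq/policy*; do
    f="$policy/cpuinfo_min_freq"
    [ -r "$f" ] || continue   # skip when cpufreq is not available
    echo "$(basename "$policy"): $(khz_to_mhz "$(cat "$f")") MHz"
done
```

Running it once on a bsp kernel and once on mainline should make the difference in the lowest operating point visible per policy.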

@nyanmisaka

@nyanmisaka @hbiyik can you guys please enlighten me about what's going on with panthor? Is it better than panfork? How is it related to panfrost? :)

I understand you are doing something important for Rock 5. Will it equally work for Orange Pi 5?

Panthor is still WIP and may take a few months to complete. We are trying it out to prepare for porting MPP and RGA to mainline.

JM: job manager based scheduling
CSF: firmware based scheduling

kernel mode: (linux kernel)

  • panfrost/JM: for mali v9 and older
  • panthor/CSF: for mali v10 (G3xx/G6xx/G7xx) and newer (formerly named as pancsf)

user mode: (mesa)

  • panfrost: for all mali gpus
  • panfork: a fork of mesa/panfrost that added v10/csf support without using a new kernel driver.
    The side effect is that it cannot bring out the full performance of mali v10.

These efforts are not made for a specific board, but for all rk3588 based platforms.
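
The kernel-mode split above can be written down as a tiny lookup. This is just a sketch of the rule as listed (v9 and older on panfrost/JM, v10 and newer on panthor/CSF). `mali_kernel_driver` is a hypothetical helper, not part of mesa or the kernel.

```shell
#!/bin/sh
# Hypothetical helper: map a Mali architecture version number to the
# kernel driver named in the list above.
mali_kernel_driver() {
    arch_version="$1"
    if [ "$arch_version" -ge 10 ]; then
        echo "panthor"    # CSF: firmware based scheduling
    else
        echo "panfrost"   # JM: job manager based scheduling
    fi
}

mali_kernel_driver 10   # the v10 case discussed in this thread
```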

@nyanmisaka

nyanmisaka/linux-rockchip@7e91bfc

It seems the PCIe ASPM is controlled by setting supports-clkreq for pcie3x4: pcie@fe150000.
@hbiyik This may help lowering the temperature of the NVME drive.
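
To see which ASPM policy the kernel is using after such a change, something like the following may help. It is a sketch, assuming the kernel was built with PCIe ASPM support so that `/sys/module/pcie_aspm/parameters/policy` exists. `active_aspm_policy` is a hypothetical helper that picks out the bracketed (active) entry.

```shell
#!/bin/sh
# Hypothetical helper: extract the bracketed entry from a policy line such as
# "default performance [powersave] powersupersave".
active_aspm_policy() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

policy_file=/sys/module/pcie_aspm/parameters/policy
if [ -r "$policy_file" ]; then
    echo "active ASPM policy: $(active_aspm_policy "$(cat "$policy_file")")"
else
    echo "pcie_aspm policy not exposed on this kernel"
fi
```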

@hbiyik

hbiyik commented Nov 19, 2023

@nyanmisaka i enabled ASPM but this thing still hovers around 45~55 +-5C. I think this HW is designed to run hot. i also salvaged it from a dead laptop, and it (the NVME) was half burned anyway, so it is ok, let it burn, i dont care :)

Do you mean adding these mpp related code prevents panthor from running?

This i do not know, but i dont expect so. Therefore i will wait for a stable panthor to work on so i can troubleshoot efficiently, but it seems that it is not too hard to port mpp or rga to mainline.

@kyak

kyak commented Nov 20, 2023

Just a heads up: kodi seems to be broken (horizontal lines/tear when scrolling) after this pull request: xbmc/xbmc#23921

Can someone confirm?

@hbiyik

hbiyik commented Nov 20, 2023

@kyak i will check. can you send a screenshot for me to compare?

Actually this pr in kodi was supposed to make things faster, i was following it.

@kyak

kyak commented Nov 20, 2023

@kyak i will check. can you send a screenshot for me to compare?

Actually this pr in kodi was supposed to make things faster, i was following it.

Not sure how to make screenshot (technically).

But this problem is not screenshotable anyway. Tearing and lines happen when scrolling, then it's back to normal.

@hbiyik

hbiyik commented Nov 20, 2023

@kyak fixed in panfork, 7Ji-PKGBUILDs/mesa-panfork-git@bc05b90. i think this answers why we are trying panthor :) panfork is good, but people pissed off its developer, and he is no longer maintaining it.

@kyak

kyak commented Nov 21, 2023

@kyak fixed in panfork, 7Ji-PKGBUILDs/mesa-panfork-git@bc05b90. i think this answers why we are trying panthor :) panfork is good, but people pissed off its developer, and he is no longer maintaining it.

Thanks a lot! I confirm that the problem is fixed.

@sielicki

sielicki commented Dec 4, 2023

@nyanmisaka

first steps on the moon

[alarm@alarm ~]$ uname -a
Linux alarm 6.7.0-rc1-panthor-g4a610b4a1e08 #1 SMP PREEMPT Sat Nov 18 19:44:55 UTC 2023 aarch64 GNU/Linux
[alarm@alarm ~]$ ls /dev/mpp_service 
/dev/mpp_service

branch no longer exists, can you possibly post patches and/or repush this branch? I'm interested in headless decodes and want to play around with this.

@kyak

kyak commented Jan 27, 2024

@hbiyik i'm getting black screen on all videos with exp_refactor_all branch. It started after a recent update, but i can't pinpoint exact packages.. Do you have any idea?

There has been your recent commit to the master branch. Maybe it also needs to go to exp_refactor_all branch?

@hbiyik

hbiyik commented Jan 27, 2024

Yeah i know the reason why, i'll push a fix tonight, but you'd better switch to ffmpeg-rockchip

@kyak

kyak commented Jan 27, 2024

Yeah i know the reason why, i'll push a fix tonight, but you'd better switch to ffmpeg-rockchip

Should I switch to ffmpeg-rockchip even though I'm using kernel 5.10?

@kyak

kyak commented Jan 27, 2024

I've tried ffmpeg-rockchip and it displays green screen on SD videos. Just like ffmpeg-mpp master (without the exp-refactor-all patches).

@hbiyik

hbiyik commented Jan 27, 2024

U should use direct rendering to plane but those also require some kernel and kodi patches

@kyak

kyak commented Jan 27, 2024

Indeed, cherry-picking 65f9032 on top of exp_refactor_all has fixed the black screen issue.

@kyak

kyak commented Jan 27, 2024

U should use direct rendering to plane but those also require some kernel and kodi patches

I use direct rendering to plane, but kernel is 5.10.160-r1080192-695850f9bde2-ced0156-1-aarch64-orangepi5-git+ and kodi is Git:20240127-41c4a59fc31 built with external ffmpeg (using exp_refactor_all as external ffmpeg by LD_LIBRARY_PATH).

@hbiyik

hbiyik commented Jan 27, 2024

Yep that was the fix

@kyak

kyak commented Mar 6, 2024

@hbiyik what is the recommended combination of ffmpeg, kodi and kernel packages for Orange Pi 5+ right now?

I'm currently at linux-aarch64-orangepi5-git (from 7Ji repo), kodi-ext-git (building myself) and ffmpeg with exp_refactor_all branch (building myself). I also point kodi at this custom ffmpeg. I also have ffmpeg-rockchip-git (from repo), but that's not used by Kodi.

It feels like I'm missing a chance to provide feedback to you while you are playing around with kernels and ffmpeg. I'm staying with something obsolete and irrelevant.

Please suggest what users need to do so that developers have the best feedback possible.

@hbiyik

hbiyik commented Mar 6, 2024

@kyak

thanks for bringing this up, definitely use

https://github.com/nyanmisaka/ffmpeg-rockchip

maybe this is the last message in this repo, and i will make it read only.

I will also add a message to the repo front page explaining why you should not use this anymore.

Please also have a look at https://github.com/hbiyik/ffmpeg-rockchip/wiki/Rendering

7ji repo, agr, arch, debian based things are all moving to the ffmpeg-rockchip and i will be contributing there.

As always, feedback makes the project stronger and is always welcome.

See you on the other repo.
