
[RFC] HwFramesContext config method support #1

Closed · wants to merge 1 commit

Conversation

@hbiyik commented Jan 1, 2024

This is a first attempt to provide a way for mpv and Kodi to work with the decoder.

With the above changes mpv works; however, there still seem to be copies involved and it is slower than expected.

I have some ideas about what is going on and will update once I have a proper understanding.

@nyanmisaka (Owner)

Can MPV and Kodi handle DRM_PRIME (NV15) natively?

@hbiyik (Author) commented Jan 1, 2024

I think this is more of a task for the EGL interface (Mesa). What both Kodi and mpv do is receive a DRM PRIME frame and import it into the EGL interface using attribs, so they should both support it as long as Mesa does. Current mesa-panfork does not; maybe panthor will.

On the other hand, Kodi can bypass EGL and render directly on DRM planes through the kernel, but that is always buggy, so I don't use it.
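
(For reference, a minimal sketch of the import path being described, using only the standard EGL_EXT_image_dma_buf_import attribs; this is not code from mpv or Kodi, and the fd/offset/pitch values are assumed to come from an AVDRMFrameDescriptor plane.)

#include <stdint.h>
#include <EGL/egl.h>
#include <EGL/eglext.h>

/* Import one dma-buf plane as an EGLImage. This succeeds only if the driver
 * (Mesa) advertises the given fourcc -- which is exactly where NV15
 * currently fails on panfork. */
static EGLImage import_plane(EGLDisplay dpy, int fd, int width, int height,
                             uint32_t fourcc, int offset, int pitch)
{
    const EGLAttrib attribs[] = {
        EGL_WIDTH,                     width,
        EGL_HEIGHT,                    height,
        EGL_LINUX_DRM_FOURCC_EXT,      fourcc,   /* e.g. DRM_FORMAT_NV12 */
        EGL_DMA_BUF_PLANE0_FD_EXT,     fd,
        EGL_DMA_BUF_PLANE0_OFFSET_EXT, offset,
        EGL_DMA_BUF_PLANE0_PITCH_EXT,  pitch,
        EGL_NONE
    };
    return eglCreateImage(dpy, EGL_NO_CONTEXT, EGL_LINUX_DMA_BUF_EXT,
                          NULL, attribs);
}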

@nyanmisaka (Owner)

Then the following options are available:

  • Modify MPV and Kodi to allow them to use RGA filters.
  • Glue RGA and MPP_DEC together in FFmpeg.
  • Add proper RGA support in MPP libs, allowing MPP_DEC to output converted images, just like IEP de-interlacing.
  • Wait for panthor to support NV15.

@hbiyik (Author) commented Jan 1, 2024

Then the following options are available:

  • Modify MPV and Kodi to allow them to use RGA filters.

Not maintainable.

  • Glue RGA and MPP_DEC together in FFmpeg.

I already tried this; it works to a certain extent, but only with spaghetti code.

  • Add proper RGA support in MPP libs, allowing MPP_DEC to output converted images, just like IEP de-interlacing.

I doubt this will ever happen.

  • Wait for panthor to support NV15.

I think this is the only reasonable, future-proof way forward at the moment. Meanwhile, I can use the HwFramesContext config method to take advantage of NV12/NV16 DRM PRIME frames.

@hbiyik (Author) commented Jan 1, 2024

What I am trying to do in this PR:

When you define the HWAccel of the decoder with the AV_CODEC_HW_CONFIG_METHOD_HW_DEVICE_CTX flag, the client is responsible for creating the device and providing its context. Since the RKMPP device type is newly invented, no client supports this.

However, at least for the decoder (not the RGA-related parts), the RKMPP device is really a DRM device with its own quirks. There are two ways to achieve compliance.

  1. Provide multiple HWDeviceConfigs:

.hwconfigs is actually an array, so one decoder can expose multiple hwaccel devices. It would therefore be possible to add another hwaccel entry here with .device_type = AV_HWDEVICE_TYPE_DRM. However, I think FFmpeg has some issues there, i.e.:

ff_get_format only checks the first matching HWAccel, not the whole array, and quits here. I think this is either a bug, or multiple hwaccel devices are simply not common practice, so I did not go down this path.

  2. Allow the client to use HWFramesCtx.

When the decoder claims that it supports the AV_CODEC_HW_CONFIG_METHOD_HW_FRAMES_CTX method, the client does not necessarily need to provide the device context; instead, the codec creates its own hwdevice context and the client receives DRM PRIME frames over hw_frames_ctx. This fits this decoder better, and it is what I am trying to implement.
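
(To make the intent concrete, here is a minimal client-side sketch using only public FFmpeg API; it is not code from this PR. It probes whether a decoder offers DRM PRIME output via the HW_FRAMES_CTX method and opts into it from get_format, without supplying any hw_device_ctx.)

#include <libavcodec/avcodec.h>

/* Does this decoder offer AV_PIX_FMT_DRM_PRIME through an internally
 * created hw_frames_ctx, i.e. without a client-provided device context? */
static int offers_drm_prime_frames_ctx(const AVCodec *codec)
{
    for (int i = 0;; i++) {
        const AVCodecHWConfig *cfg = avcodec_get_hw_config(codec, i);
        if (!cfg)
            return 0;
        if (cfg->pix_fmt == AV_PIX_FMT_DRM_PRIME &&
            (cfg->methods & AV_CODEC_HW_CONFIG_METHOD_HW_FRAMES_CTX))
            return 1;
    }
}

/* get_format callback: pick DRM PRIME if offered, otherwise fall back. */
static enum AVPixelFormat pick_drm_prime(AVCodecContext *avctx,
                                         const enum AVPixelFormat *fmts)
{
    for (const enum AVPixelFormat *p = fmts; *p != AV_PIX_FMT_NONE; p++)
        if (*p == AV_PIX_FMT_DRM_PRIME)
            return *p;   /* decoder will populate avctx->hw_frames_ctx itself */
    return fmts[0];
}

A client would then set avctx->get_format = pick_drm_prime and read DRM PRIME frames straight from the decoder.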

@nyanmisaka (Owner)

ff_get_format only checks the first matching HWAccel, not the whole array, and quits here. I think this is either a bug, or multiple hwaccel devices are simply not common practice, so I did not go down this path.

Generally speaking, whenever a new hardware acceleration is added in FFmpeg, a corresponding HW pix_fmt is also added, and frames_derive_to/from is used to import/export to other underlying HW pix_fmts. So it is a rare usage to add a hwaccel for rkmppdec while still borrowing AV_PIX_FMT_DRM_PRIME.
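
(For context, the pattern described above usually looks like the sketch below on the client side, built on the public av_hwframe_ctx_create_derived()/av_hwframe_map() helpers. The native format here is hypothetical — there is no AV_PIX_FMT_RKMPP upstream — and error handling and the exact mapping flags would need tuning.)

#include <libavutil/error.h>
#include <libavutil/frame.h>
#include <libavutil/hwcontext.h>

/* Export a frame held in a (hypothetical) native HW format to DRM PRIME by
 * deriving a DRM frames context and mapping the frame into it (zero copy). */
static int export_to_drm_prime(AVFrame *native, AVBufferRef *drm_device_ref,
                               AVFrame *out)
{
    AVBufferRef *derived = NULL;
    int ret = av_hwframe_ctx_create_derived(&derived, AV_PIX_FMT_DRM_PRIME,
                                            drm_device_ref,
                                            native->hw_frames_ctx, 0);
    if (ret < 0)
        return ret;

    out->format        = AV_PIX_FMT_DRM_PRIME;
    out->hw_frames_ctx = av_buffer_ref(derived);
    if (!out->hw_frames_ctx) {
        av_buffer_unref(&derived);
        return AVERROR(ENOMEM);
    }

    ret = av_hwframe_map(out, native, AV_HWFRAME_MAP_READ);
    av_buffer_unref(&derived);
    return ret;
}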

@nyanmisaka (Owner) commented Jan 1, 2024

Manually merged in 6584518

@nyanmisaka closed this Jan 1, 2024
@hbiyik (Author) commented Jan 1, 2024

Oops, please do not merge yet; this hardly works as of now.

@hbiyik (Author) commented Jan 1, 2024

One question about 49cc93e:

Is this a good idea? Those headers are exported to the system, and this change would force all other DRM-based software to be recompiled; it is basically an API change for the DRM interface.

@hbiyik (Author) commented Jan 1, 2024

Maybe use one layer with a specific format like FOURCC("MBUF"), assign it to some object index, and get the ptr from the MppBuffer? Still a hack, but a non-breaking one.
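
(Purely to illustrate the hack proposed here — hypothetical, not endorsed and not in the repo: an extra layer tagged with a private FOURCC carries the MppBuffer pointer, leaving AVDRMObjectDescriptor untouched.)

#include <stdint.h>
#include <libavutil/common.h>         /* MKTAG */
#include <libavutil/hwcontext_drm.h>

/* Smuggle the MppBuffer pointer through a spare layer entry. Assumes
 * desc->nb_layers < AV_DRM_MAX_PLANES and a 64-bit host, so that the
 * pointer fits into the ptrdiff_t offset field. */
static void attach_mpp_buffer(AVDRMFrameDescriptor *desc, void *mpp_buffer)
{
    AVDRMLayerDescriptor *extra = &desc->layers[desc->nb_layers++];

    extra->format    = MKTAG('M', 'B', 'U', 'F');   /* private marker, not a real DRM format */
    extra->nb_planes = 1;
    extra->planes[0].object_index = 0;
    extra->planes[0].offset = (ptrdiff_t)(intptr_t)mpp_buffer;
    extra->planes[0].pitch  = 0;
}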

@hbiyik (Author) commented Jan 1, 2024

Or add the new data to the end of the structure, so the old interface variables are not repositioned.

UPDATE:
Moving the new fields to the end of AVDRMObjectDescriptor will not help; the new fields would have to go at the end of AVDRMFrameDescriptor, because it holds multiple AVDRMObjectDescriptors inline, and an oversized AVDRMObjectDescriptor would shift the .nb_layers member that follows (and everything after it).

Even if ptr and opaque were members of AVDRMObjectDescriptor, that would only work when AVDRMObjectDescriptor is not used as an array. But that is unconventional; a cleaner way might be to use layers...
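
(For reference, the relevant upstream layout from libavutil/hwcontext_drm.h, abridged. Because objects[] is an inline array inside AVDRMFrameDescriptor, enlarging AVDRMObjectDescriptor shifts every field that follows it — which is exactly the breakage described above.)

typedef struct AVDRMObjectDescriptor {
    int      fd;                /* DRM PRIME fd for the object */
    size_t   size;              /* total size of the object */
    uint64_t format_modifier;   /* e.g. an AFBC modifier */
} AVDRMObjectDescriptor;

typedef struct AVDRMFrameDescriptor {
    int nb_objects;
    AVDRMObjectDescriptor objects[AV_DRM_MAX_PLANES];  /* inline array: growing the
                                                          element type moves everything below */
    int nb_layers;
    AVDRMLayerDescriptor layers[AV_DRM_MAX_PLANES];
} AVDRMFrameDescriptor;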

@nyanmisaka (Owner)

Yeah, it’s indeed a breaking change for upstreaming. So, either append a new array after layers to correspond to the objects, or create a new format AV_PIX_FMT_RKMPP instead.

@nyanmisaka (Owner) commented Jan 1, 2024

It would be nice if they could reserve some bits when defining AVDRMObjectDescriptor. It's not very intuitive to separate ptr and opaque from it.

@hbiyik (Author) commented Jan 1, 2024

With some code cleanup there will be no need for the MppBuffer object; the only thing necessary is to check whether the fd is mmapped or not (that is why you need ptr, I think).

I remember there was a way to do this without any flags, just with mmap, but I need to dig in a little bit.

In particular, the part below: does it really have to check desc->objects[i].ptr?

static int rkmpp_map_frame(AVHWFramesContext *hwfc,
                           AVFrame *dst, const AVFrame *src, int flags)
{
 ....
    av_assert0(desc->nb_objects <= AV_DRM_MAX_PLANES);
    for (i = 0; i < desc->nb_objects; i++) {
        if (desc->objects[i].ptr) {
            addr = desc->objects[i].ptr;
            map->unmap[i] = 0;

I think you are doing an optimization above to prevent unnecessary mmaps and munmaps.

In that case, maybe we can introduce a format modifier to flag whether the descriptor is mmapped or not.
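
(For comparison, the usual fallback when an object has no pre-mapped pointer is to mmap() the dma-buf fd directly, roughly as the generic DRM mapping code does; this is a sketch, not the exact code from this repo.)

#include <errno.h>
#include <sys/mman.h>
#include <libavutil/error.h>
#include <libavutil/hwcontext_drm.h>

/* Map object i of the frame descriptor; the caller must munmap() it later. */
static void *map_object(const AVDRMFrameDescriptor *desc, int i, int *err)
{
    void *addr = mmap(NULL, desc->objects[i].size, PROT_READ | PROT_WRITE,
                      MAP_SHARED, desc->objects[i].fd, 0);
    if (addr == MAP_FAILED) {
        *err = AVERROR(errno);
        return NULL;
    }
    *err = 0;
    return addr;
}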

@hbiyik (Author) commented Jan 1, 2024

OK, I created a separate PR for this; I think it is the cleanest way, with no hacks involved.

@hbiyik (Author) commented Jan 2, 2024

This works pretty well. Here are the commands to get mpv working:

--vo=gpu-next is optional, but it seems to work rather better
--swapchain-depth is optional, but it helps to sync the buffers better

Simple DRM PRIME output:
mpv --swapchain-depth=8 --vo=gpu-next --hwdec=rkmpp pathtofile

Average HEVC performance:
8-bit: 8K@30fps
10-bit: not supported

AFBC -> non-AFBC (10-bit to 8-bit):
mpv --swapchain-depth=8 --vo=gpu-next --vd-lavc-o=buf_mode=ext,afbc=on --vf=scale_rkrga=format=nv12 --hwdec=rkmpp pathtofile

Average HEVC performance:
8-bit AFBC -> 8-bit: 8K@60fps
10-bit AFBC -> 8-bit: 8K@60fps

AFBC -> non-AFBC (HDR to HDR):
mpv --swapchain-depth=8 --vo=gpu-next --vd-lavc-o=buf_mode=ext,afbc=on --vf=scale_rkrga=format=p010 --hwdec=rkmpp pathtofile

Average HEVC performance:
10-bit AFBC -> 10-bit: 8K@30fps

mpv requires the patch below to play P010 DRM frames.

diff --git a/video/out/hwdec/hwdec_drmprime.c b/video/out/hwdec/hwdec_drmprime.c
index 290f11c..9c63ab4 100644
--- a/video/out/hwdec/hwdec_drmprime.c
+++ b/video/out/hwdec/hwdec_drmprime.c
@@ -119,6 +119,7 @@
     MP_TARRAY_APPEND(p, p->formats, num_formats, IMGFMT_NV12);
     MP_TARRAY_APPEND(p, p->formats, num_formats, IMGFMT_420P);
     MP_TARRAY_APPEND(p, p->formats, num_formats, pixfmt2imgfmt(AV_PIX_FMT_NV16));
+    MP_TARRAY_APPEND(p, p->formats, num_formats, pixfmt2imgfmt(AV_PIX_FMT_P010));
     MP_TARRAY_APPEND(p, p->formats, num_formats, 0); // terminate it
 
     p->hwctx.hw_imgfmt = IMGFMT_DRMPRIME;

Maybe it would be a good idea to add a format option like 8bit to scale_rkrga so that it auto-selects the correct 8-bit variant of a 10-bit format, i.e. NV15->NV12 and NV20->NV16.

@nyanmisaka (Owner)

Maybe it would be a good idea to add a format option like 8bit to scale_rkrga so that it auto-selects the correct 8-bit variant of a 10-bit format, i.e. NV15->NV12 and NV20->NV16.

The swscale format filter of FFmpeg can accept multiple pixel formats and choose the appropriate one, like this: format=nv12|nv16. So it is possible to modify the rkrga filter options and parse them manually.
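
(A minimal sketch of parsing such a pipe-separated list into the first acceptable pixel format, using only public libavutil helpers; the supported() predicate is an assumption, not the actual rkrga filter code.)

#include <libavutil/avstring.h>
#include <libavutil/mem.h>
#include <libavutil/pixdesc.h>
#include <libavutil/pixfmt.h>

/* Parse e.g. "nv12|nv16" and return the first format the caller accepts. */
static enum AVPixelFormat parse_format_list(const char *list,
                                            int (*supported)(enum AVPixelFormat))
{
    enum AVPixelFormat out = AV_PIX_FMT_NONE;
    char *saveptr = NULL;
    char *buf = av_strdup(list);

    if (!buf)
        return AV_PIX_FMT_NONE;

    for (char *tok = av_strtok(buf, "|", &saveptr); tok;
         tok = av_strtok(NULL, "|", &saveptr)) {
        enum AVPixelFormat fmt = av_get_pix_fmt(tok);
        if (fmt != AV_PIX_FMT_NONE && supported(fmt)) {
            out = fmt;
            break;
        }
    }

    av_free(buf);
    return out;
}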

@hbiyik (Author) commented Jan 2, 2024

fmt1|fmt2 looks cleaner.

Another thing: Mesa EGL requires 32-byte-aligned frames, but the decoder outputs 16-byte-aligned frames. This causes DRM PRIME frames not to be imported properly, and the screen outputs green or blue.

Is it possible to add an hstride option to the scale_rkrga filter, so that after the RGA chain the frame can be imported into EGL properly?

@nyanmisaka (Owner)

Is it possible to add an hstride option to the scale_rkrga filter, so that after the RGA chain the frame can be imported into EGL properly?

Sure.

layer->planes[i].offset =
    layer->planes[i-1].offset +
    layer->planes[i-1].pitch * (hwfc->height >> (i > 1 ? pixdesc->log2_chroma_h : 0));

Will this bypass the panfork issue?

layer->planes[i].offset =
    layer->planes[i-1].offset +
    layer->planes[i-1].pitch * (FFALIGN(hwfc->height, 64) >> (i > 1 ? pixdesc->log2_chroma_h : 0));

@hbiyik (Author) commented Jan 2, 2024

No, I just tested it. Having a different pitch value for the buffer only changes the buffer size and offsets; those initial values are overwritten by mpp, like here.

To my knowledge mpp does not support preconfigured strides/pitches. If it did, maybe setting the hstride of the MppFrame before decode_get_frame, by manually allocating the input MppFrame, would do the trick with the decoder alone. But that is just a theory; I was working around it by realigning to a 64-byte hstride with RGA in the next chain stage.
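
(The theory above would look roughly like the sketch below, assuming MPP honours a pre-set horizontal stride via MPP_DEC_SET_FRAME_INFO — which is exactly the open question. The names come from the MPP headers, but the behaviour is unverified.)

#include <rockchip/rk_mpi.h>
#include <rockchip/mpp_frame.h>

/* Unverified: ask the decoder to produce frames with a 64-byte-aligned
 * horizontal stride so the output suits RGA3/EGL without a realign pass.
 * Stride computation is simplified; real code must account for bit depth. */
static int preset_hor_stride(MppCtx ctx, MppApi *mpi, int width, int height)
{
    MppFrame frame = NULL;
    int ret = mpp_frame_init(&frame);
    if (ret)
        return ret;

    mpp_frame_set_width(frame, width);
    mpp_frame_set_height(frame, height);
    mpp_frame_set_hor_stride(frame, (width + 63) & ~63);  /* 64-byte alignment */

    ret = mpi->control(ctx, MPP_DEC_SET_FRAME_INFO, frame);
    mpp_frame_deinit(&frame);
    return ret;
}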

@hbiyik (Author) commented Jan 3, 2024

I think this is better handled by mpp. I asked them whether they can support a decoding flow like this, maybe through some deccfg:

rockchip-linux/mpp#507

@nyanmisaka (Owner)

I think this is better handled by mpp. I asked them whether they can support a decoding flow like this, maybe through some deccfg:

rockchip-linux/mpp#507

We really need this. For the H264 High10 use case, MPP is using hor_align=16 for both NV12 and NV15, while RGA3 requires hor_align=64 for NV15.

nyanmisaka/mpp@a986997

@hbiyik (Author) commented Jan 18, 2024

@nyanmisaka I tested this extensively; from my POV it is ready to be merged. There are no side effects that I could see so far.

@nyanmisaka (Owner)

Merged in d0a1484

@nyanmisaka closed this Jan 18, 2024
@hbiyik deleted the ffmpeg-rockchip-hwframes branch January 19, 2024 15:04
nyanmisaka pushed a commit that referenced this pull request Apr 17, 2024
In close_output(), a dummy frame is created with format NONE passed
to enc_open(), which isn't prepared for it. The NULL pointer
dereference happened at
av_pix_fmt_desc_get(enc_ctx->pix_fmt)->comp[0].depth.

When fgt.graph is NULL, skip fg_output_frame() since there is
nothing to output.

frame #0: 0x0000005555bc34a4 ffmpeg_g`enc_open(opaque=0xb400007efe2db690, frame=0xb400007efe2d9f70) at ffmpeg_enc.c:235:44
frame #1: 0x0000005555bef250 ffmpeg_g`enc_open(sch=0xb400007dde2d4090, enc=0xb400007e4e2daad0, frame=0xb400007efe2d9f70) at ffmpeg_sched.c:1462:11
frame #2: 0x0000005555bee094 ffmpeg_g`send_to_enc(sch=0xb400007dde2d4090, enc=0xb400007e4e2daad0, frame=0xb400007efe2d9f70) at ffmpeg_sched.c:1571:19
frame #3: 0x0000005555bee01c ffmpeg_g`sch_filter_send(sch=0xb400007dde2d4090, fg_idx=0, out_idx=0, frame=0xb400007efe2d9f70) at ffmpeg_sched.c:2154:12
frame #4: 0x0000005555bcf124 ffmpeg_g`close_output(ofp=0xb400007e4e2d85b0, fgt=0x0000007d1790eb08) at ffmpeg_filter.c:2225:15
frame #5: 0x0000005555bcb000 ffmpeg_g`fg_output_frame(ofp=0xb400007e4e2d85b0, fgt=0x0000007d1790eb08, frame=0x0000000000000000) at ffmpeg_filter.c:2317:16
frame #6: 0x0000005555bc7e48 ffmpeg_g`filter_thread(arg=0xb400007eae2ce7a0) at ffmpeg_filter.c:2836:15
frame #7: 0x0000005555bee568 ffmpeg_g`task_wrapper(arg=0xb400007d8e2db478) at ffmpeg_sched.c:2200:21

Signed-off-by: Zhao Zhili <[email protected]>