rkmppdec: Allow to change AFBC mode from env #4
If I understand correctly, the pipeline is: rkmppdec -> drm_prime -> drmModeSetting -> VOP2 -> hdmi/dp -> screen. Here's an interesting commit: Joshua-Riek/rockchip-kernel@558c82b. And these two are worth a look: |
Yes the pipeline is correct, [rant on] but no please not another new modifier as if there is enough support for the existing ones. [rant off] |
Output of [drm_info](https://gitlab.freedesktop.org/emersion/drm_info) tool
I think there is a serious problem, see above log: Each plane that Vop2 supports has maximum 4k resolution support, how is that supposed to render 8k input/output?
Kernel also complains in case 8k data is written:
|
I think max_input and max_output are hardcoded, but if the connected width is > 4096 (4k) then the driver allows double the width = 8192. This might suffice for the width, but the height will still be less than the standard 8k resolution (8192x4096 vs. 7680x4320). So I am not sure whether it is required to connect an 8k device to get the input plane to support 8k, but it seems so. Even if so, I don't really get why. Power consumption? Heat dissipation? Who knows. Even if it works that way, I am still not sure it will satisfy the height of 4320. So Rockchip, thanks for making things unnecessarily complicated again. |
.max_input = { 4096, 4320 },
max_input_w = vop2_data->max_input.width; // 4096
max_input_h = vop2_data->max_input.height; // 4320
max_input_w <<= 1; // 8192
// isn't it 8192x4320? Perhaps it is because 8k is not natively supported by a single hardware unit, but is achieved through multiple hardware units using the "splice mode" of rk3588. Also, 8k is far less commonly used than 4k, and enabling it even requires overclocking VOP2, so the default is not 8k. RK dev @andyshrk should know more about this. |
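The width doubling quoted above can be sketched as a small helper. This is only an illustration, assuming the rule is exactly as described in this thread (the width limit doubles once the display mode is wider than 4096); `plane_limits` and `effective_max_input` are hypothetical names and not the driver's own.

```c
#include <stdint.h>

/* Hypothetical container for a plane's input size limit. */
struct plane_limits {
    uint32_t width;
    uint32_t height;
};

/*
 * Assumption taken from the discussion, not from the driver source:
 * the base limit is 4096x4320 and the width limit is doubled when the
 * active display mode is wider than 4096 pixels.
 */
static struct plane_limits effective_max_input(uint32_t mode_hdisplay)
{
    struct plane_limits lim = { 4096, 4320 };

    if (mode_hdisplay > 4096)
        lim.width <<= 1; /* 8192 */

    return lim;
}
```

Under this assumption a 3840-wide mode keeps the limit at 4096x4320, so a 7680x4320 framebuffer would not fit on a plane, while a 7680-wide mode raises the limit to 8192x4320.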
Yeah, I mixed up w & h; it makes sense now. But for compatibility reasons, having max_input support up to 8k is very reasonable even though the output is not 8k, so that drm planes can be used without issues. |
Andy is using the same w&h values in upstream linux. But as for how to support 8k, it has not yet been finalized. |
4487c02: with this commit all broken images are gone. I also raised an mpp bug about it: rockchip-linux/mpp#509. Tested AFBC mode on NV12, NV16, NV15, NV20 with h264, hevc, vp9 and av1 up to 8k. All work flawlessly, no hiccups whatsoever. So there are 2 issues left. The 1st is the 8k scaling issue in the drm plane; I seriously think this is a bug, since even rk3288 has 8k plane input support. |
Well... Another magic number on rockchip. That's where the |
Hi:
On 2024-01-05 02:00:15, "Hüseyin BIYIK" ***@***.***> wrote:
yeah i mixed up w&h, it makes sense now, but for compatibility reasons at least having max_input supporting to 8k is very reasonable, even though the output is not 8k, so that drm planes can be used without issues.
To support an 8K plane, VOP2 needs two hardware planes working in splice mode. Even for an 8K input, 4K output mode we still need two hardware planes; one hardware plane does not have the ability/performance to scale down an 8K input to a 4K output.
What's more, for an 8K output we also need the video ports to work in splice mode.
|
Hello @andyshrk, thanks for the clarification.
But for both VPs to work in splice mode (on 3588, vp0+vp1), do you need the attached adapter to have > 4096px width? You should still be able to input 8k to 2 different planes in splice mode even if the attached device is negotiated to 4k, right? |
When you say "the attached adapter", do you mean a monitor? A monitor can work in any display mode it supports; it can be 1080P, 4K, or 8K. We switch splice mode dynamically according to the width of the display mode[0].
Yes, but it is really very difficult for a low level driver to grab a plane that may be used by another userspace application. |
@andyshrk Thanks again, this is very helpful. When rock5b is set to 8k mode, dts is configured as
or
when it is in 4k mode dts is:
On opi5 it is similar, since it has only 1 hdmi port. If the connected monitor can do >4k then the logic works, but if not, we lose the scaling functionality (8k->4k) which may be available according to the dts config. Instead of dynamically deciding splicing according to the modeset width, I think it should have been possible to check the dts first: if the splice-capable VPs are connected to only 1 display interface, then splice should be activated, because in the cases above vp1 is always unused. This might work for 3588, but in case it creates problems for other vop2 variants, maybe it is an even better idea to describe splice capability in a DTS property and have the driver act on this property instead of dynamically checking the modeset resolution. I hope I make sense. |
@nyanmisaka I think the offset in the drm descriptor is not as simple as pix * stride for AFBC. That calculation yields a byte offset into the AFBC-compressed frame, whereas the actual offset is a y pixel offset that applies after AFBC decompression. I do not think it is possible to point to a byte offset in the compressed frame that corresponds to the start of the nth row of y pixels, since this varies with the compression, and there is also an AFBC header at the beginning of the frame. Instead, I think the offset may be represented in pixels in the AFBC descriptor, but the renderer must offset the plane, not the frame buffer. I am still not sure whether this can be achieved on a per-frame basis, because we know that in the AV1 case we have a dynamic offset on each frame. So this is another challenge to tackle. |
@hbiyik
/**
* @offsets: Offset from buffer start to the actual pixel data in bytes,
* per buffer. For userspace created object this is copied from
* drm_mode_fb_cmd2.
*
* Note that this is a linear offset and does not take into account
* tiling or buffer layout per @modifier. It is meant to be used when
* the actual pixel data for this framebuffer plane starts at an offset,
* e.g. when multiple planes are allocated within the same backing
* storage buffer object. For tiled layouts this generally means its
* @offsets must at least be tile-size aligned, but hardware often has
* stricter requirements.
*
* This should not be used to specifiy x/y pixel offsets into the buffer
* data (even for linear buffers). Specifying an x/y pixel offset is
* instead done through the source rectangle in &struct drm_plane_state.
*/
unsigned int offsets[DRM_FORMAT_MAX_PLANES];
Instead, such x/y pixel offsets should be specified in drm_plane.h->drm_plane_state:
/**
* @src_y: upper position of visible portion of plane within plane (in
* 16.16 fixed point).
*/
uint32_t src_y;
Therefore, the existing |
Ugh, i was referring to |
Maybe you can try asking in FFmpeg IRC or ffmpeg-devel. The author of hwcontext_drm is still active too. https://github.com/fhvwy There have been no use cases for AFBC before. |
Thanks for the hint. Let me check one last thing: whether it is really not possible to point to an exact byte offset in an AFBC frame. I think those offsets come from the decoder's alignment requirements, but the importing drm device might not have those requirements. My favorite guy icecream95 has some ninja code to decode some parts of AFBC. Edit: Quickly disproved myself. Tiles are 16px * 16px in AFBC, so a pixel offset of <16 lands inside an existing tile. It may be possible to find the byte offset for multiples of 16 pixels, but not for less than 16. So this does not help in the rkmpp case. |
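The tile argument above can be made concrete with a small sketch. It assumes only what the comment states (16x16 tiles); `tile_offset` and `split_y_offset` are hypothetical helper names, and nothing here touches the real AFBC header layout.

```c
#include <stdint.h>

#define AFBC_TILE_SIZE 16u /* 16x16 tiles, as stated in the comment above */

/* Hypothetical result type: a y offset split along tile boundaries. */
struct tile_offset {
    uint32_t tile_rows; /* whole rows of tiles that could be skipped */
    uint32_t remainder; /* pixel rows that fall inside the next tile */
};

/*
 * Split a y pixel offset into whole tile rows plus a remainder.
 * Only the whole-tile part could ever map to a position in the
 * compressed buffer; a non-zero remainder lies inside a compressed
 * tile and has to be handled as a crop by the consumer.
 */
static struct tile_offset split_y_offset(uint32_t offset_y)
{
    struct tile_offset t;

    t.tile_rows = offset_y / AFBC_TILE_SIZE;
    t.remainder = offset_y % AFBC_TILE_SIZE;
    return t;
}
```

An offset of 8 rows, for instance, gives zero whole tile rows and a remainder of 8, which is the case that cannot be expressed as a buffer offset.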
Yes, from the information I have obtained from my communication with the IC team, the splice function will be redesigned in future SoCs. |
@andyshrk Thanks for the great news.
I am confused about one thing. Is enabling splice only when the mode is > 4k a driver restriction or a vop2 hardware limitation? I had the impression that this was a driver limitation, but that is me guessing. If this is a hardware limitation then I guess it would mean that existing SoCs won't receive such an improvement. |
From the hardware side, each plane/window -> CRTC/VP path only supports max 4K input -> output. In splice mode (mode > 4k), for example Cluster0 + Cluster1 spliced for an 8K plane, Cluster0 should be attached to VP0 and Cluster1 should be attached to VP1. When mode < 4K, VP1 is not working, so if we want to use Cluster0 + Cluster1 for splice, we have to move Cluster1 away from VP1, but it is very difficult to move a plane from one CRTC to another on rk356x/rk3588 due to the hardware design. So this is a little different from splice when mode > 4k. And from the drm side, it is rare to see a low level driver grab a plane from one crtc for another; binding which plane to which crtc is always done by userspace. So this is a software thing, and it is also a hardware limitation. Anyway, we will have a try: try to give VP0 an 8K input if VP1 is disabled and it has a free plane.
|
I think I somehow got it. The hardware actually expects >4k to splice the VPs, but what you will try is to maybe manually activate splicing when <4k, organize the planes manually in the driver and feed them to the spliced vp0+vp1, and maybe give another plane to userspace to get the actual 8k plane input. Or something like this. That's why you are saying that to really address the issue the vop2 core needs to be updated. |
@andyshrk I am rendering on So the primary plane is on top of the cursor plane with zpos. The rest of the planes are disabled, CRTC_ID=0, FB_ID=0. In this case, the transparent parts of A/XRGB2101010 are not blended and are shown as a black area. When I set the format to A/XRGB8888 the transparency blends and I can see the background video layer. I have tested with the rkr4.1 branch, directly rendering with KMS; this is the case with Kodi. PS: The same issue happens when I use 2 primary planes but no cursor plane as well. Ie: |
baf41c5 to 493328a
libavcodec/rkmppdec.c
Outdated
@@ -515,7 +515,8 @@ static int rkmpp_export_frame(AVCodecContext *avctx, AVFrame *frame, MppFrame mp
layer->format = rkmpp_get_drm_afbc_format(mpp_fmt);
layer->nb_planes = 1;
layer->planes[0].pitch = mpp_frame_get_hor_stride(mpp_frame) * 3 / 2;
layer->planes[0].offset = 0;
frame->crop_top = mpp_frame_get_offset_y(mpp_frame);
We were too focused on AVDRMFrameDescriptor
and ignored AVFrame
itself 🤣
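For a consumer that programs a KMS plane, the crop fields would end up in the plane's source rectangle, which the kernel documentation quoted earlier describes as 16.16 fixed point. The following is a rough sketch of that conversion, not code from this PR: `src_rect`, `to_fixed_16_16` and `src_rect_from_crop` are hypothetical names, and the parameters merely mirror the AVFrame width/height and crop_* fields.

```c
#include <stdint.h>

/* Hypothetical mirror of the src_* fields of a plane state (16.16 fixed point). */
struct src_rect {
    uint32_t src_x, src_y, src_w, src_h;
};

/* Convert whole pixels to 16.16 fixed point; valid for values below 65536. */
static uint32_t to_fixed_16_16(uint32_t pixels)
{
    return pixels << 16;
}

/*
 * Build the visible source rectangle from a frame size and its crop
 * values. The caller is assumed to pass crops that are smaller than
 * the frame dimensions.
 */
static struct src_rect src_rect_from_crop(uint32_t width, uint32_t height,
                                          uint32_t crop_left, uint32_t crop_top,
                                          uint32_t crop_right, uint32_t crop_bottom)
{
    struct src_rect r;

    r.src_x = to_fixed_16_16(crop_left);
    r.src_y = to_fixed_16_16(crop_top);
    r.src_w = to_fixed_16_16(width - crop_left - crop_right);
    r.src_h = to_fixed_16_16(height - crop_top - crop_bottom);
    return r;
}
```

The point of the design is that the framebuffer itself stays untouched (offset 0), and only the scanout window moves.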
Yeah, but I think there may still be one theoretical problem: if in the future a decoder provides frames with multiple layers and a different offset on each layer, then the crop fields cannot be used. But I am not sure if there is such a case, i.e. the nv12m format has multiple planes and they are represented as multiple planes of one layer, not as different layers with one plane each. In any case this is not today's problem so I don't care :)
please show the output of : and if you can write your XRGB2101010 data to a file, please also upload. |
libavcodec/rkmppdec.c
Outdated
@@ -514,8 +514,8 @@ static int rkmpp_export_frame(AVCodecContext *avctx, AVFrame *frame, MppFrame mp
DRM_FORMAT_MOD_ARM_AFBC(AFBC_FORMAT_MOD_SPARSE | AFBC_FORMAT_MOD_BLOCK_SIZE_16x16);
layer->format = rkmpp_get_drm_afbc_format(mpp_fmt);
layer->nb_planes = 1;
layer->planes[0].pitch = mpp_frame_get_hor_stride(mpp_frame);
layer->planes[0].offset = mpp_frame_get_offset_y(mpp_frame) * layer->planes[0].pitch;
layer->planes[0].pitch = mpp_frame_get_hor_stride(mpp_frame) * 3 / 2;
Did you get that value 1.5 from here?
I found that value by testing.
But I think it is more a matter of definition: mpp defines the stride as if this were a semi-planar format with multiple planes, bases the stride calculation on that, and reports a MPP_FRAME_YUV420SP format with align(width).
But DRM expects the stride to represent the whole plane, and since an AFBC buffer is a single plane holding Y + UV, the stride is 3/2 of the original stride for 420 subsampling and *2 for 422 subsampling. (I actually forgot to distinguish between the subsampling modes here, since it is a workaround.)
On the other hand, I am not sure it is that simple for the 4:2:0 case. Assume that you have a width of n*16, your alignment is 16 (most cases in mpp), and n is an odd number.
Ie: DVD resolution: 720x480
where n=45 (odd number): 45*16 = 720px.
For NV12 afbc your mpp stride is 720, but the drm stride is 720*3/2 = 1080, which is no longer 16 aligned but 8 aligned.
Since the consumer of these frames is drm, this also depends on the alignment requirement of VOP2 or whatever the consumer is, but for encoder cases and whatever future cases this is a risk, so I consider this an mpp bug.
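The arithmetic in this comment can be checked with a short sketch. It encodes only the workaround described here (3/2 of the mpp stride for 4:2:0, 2x for 4:2:2); `chroma_mode`, `afbc_drm_pitch` and `is_aligned` are hypothetical names, not mpp or FFmpeg API.

```c
#include <stdint.h>

/* Hypothetical selector for the chroma subsampling of the AFBC frame. */
enum chroma_mode { CHROMA_420, CHROMA_422 };

/*
 * Derive the single-plane pitch from the stride mpp reports for the
 * Y plane, following the workaround in this thread:
 * 4:2:0 -> stride * 3 / 2, 4:2:2 -> stride * 2.
 */
static uint32_t afbc_drm_pitch(uint32_t mpp_hor_stride, enum chroma_mode mode)
{
    if (mode == CHROMA_420)
        return mpp_hor_stride * 3 / 2;
    return mpp_hor_stride * 2;
}

/* Non-zero when v is a multiple of a. */
static int is_aligned(uint32_t v, uint32_t a)
{
    return v % a == 0;
}
```

With the 720x480 example, `afbc_drm_pitch(720, CHROMA_420)` yields 1080, which passes `is_aligned(..., 8)` but fails `is_aligned(..., 16)`; this is the alignment loss the comment warns about.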
Entry is here
rockchip-linux/mpp#509
Tagging @FumasterLin @JeffyCN @HermanChen
but for encoder cases and some whatever future case this is a risk
I think I may have encountered something similar, but the width of my resolution is not 16-aligned, for example 2900x2160 (slightly different from 2880x2160, 4:3). Neither the MPP encoder nor the RGA can handle it correctly.
The MPP HEVC decoder (AFBC) gives a stride of 2944 (184x16 or 46x64)
2944*3/2=4416 (aligned with 8/16/64, but the broken image suggests a wrong stride)
I think this is 64 aligned. I have created a test hevc nv12 file at 2900x2160; it decodes correctly and gives a 2944*3/2 = 4416 stride to the layer descriptor.
However, I think mpp needs to scale this down by 2/3 again, and RGA as well, because they expect only Y plane strides regardless of whether the picture is AFBC or not. I have not PRed that yet, in case you are using this patchset.
Ps: it is possible I did not understand your comment correctly...
For AFBC input in RGA and MPP encoders, they don't care about the given stride, but use the given width and the hardcoded 16 (h264e, h265e) to calculate stride.
Try this branch
./ffmpeg -f lavfi -i testsrc=s=2900x2160,format=yuv420p -c:v hevc_rkmpp -b:v 6M -vframes 100 -y /tmp/testsrc_2900x2160_hevc_8bit.mp4
./ffmpeg -hwaccel rkmpp -hwaccel_output_format drm_prime -afbc 1 -i /tmp/testsrc_2900x2160_hevc_8bit.mp4 -c:v hevc_rkmpp -b:v 6M -y /tmp/broken1.mp4
./ffmpeg -hwaccel rkmpp -hwaccel_output_format drm_prime -afbc 1 -i /tmp/testsrc_2900x2160_hevc_8bit.mp4 -vf scale_rkrga=format=nv12:afbc=1 -c:v hevc_rkmpp -b:v 6M -y /tmp/broken2.mp4
FYI, the afbc nv12/nv16/nv15/nv20 output from rga3 is not compatible with vop.
@andyshrk
Thx for the info. This is optional for our use case.
But can you elaborate on this?
Also, is this kernel commit related?
FYI, the afbc nv12/nv16/nv15/nv20 output from rga3 is not compatible with vop.
@andyshrk Thx for the info. This is optional for our use case.
But can you elaborate on this?
Also, is this kernel commit related?
No, it has nothing to do with this commit. It's because of the rga IC design, but I don't have the detailed information. I just heard this from other people.
No, it has nothing to do with this commit. It's because of the rga IC design, but I don't have the detailed information. I just heard this from other people.
Got it. Thx anyway.
A side topic: Do you think this hack here could be related to this wrong definition.
This works now with the new NV20.
Perfect, because as next step i was planning on implementing NEON acceleration on plane copy, and this hack was making things uglier.
Perfect, because as next step i was planning on implementing NEON acceleration on plane copy, and this hack was making things uglier.
Done in #17
below is the NV12 video + AR30 osd on top
this is the YU08 AFBC video and AR30 osd on top
I have tried to dump the ar30 buffer but couldn't find a way to do it. For your information, I'm testing this using mesa-panfork with Kodi's GBM interface. So mesa should be irrelevant, I hope. |
Our BSP driver has an entry to do the dump. Run this command at the point where you display AR30 + video: And be careful, CONFIG_ROCKCHIP_DRM_DEBUG is for debug only; you should disable it in a real product. |
Sorry, I remembered this issue when I talked to my colleague: alpha is not supported for AR30 on rk3588. We should only report the XR30 format to user space. And another thing: XR30 is only supported in AFBC format, not linear.
|
That's interesting, but with these plane settings there is actually no afbc in ar30, yet it displays correctly. Only the alpha channel was missing. The rest of the images display well on the plane. |
From the dri/summary you dump, the AR30 is afbc |
Hmm thats also weird because it can not be, those graphics are osd/gui graphics generated by Kodi application. Kodi can not generate afbc textures. I will check that. |
You can dump the plane data as I said before, and I can check whether the data is afbc or not. |
@andyshrk I think I understood what's going on. Kodi is using EGL to create the textures according to the formats and modifiers supported by the plane. Luckily their logic happened to match XR30 with the AFBC modifier, so they created the texture with AFBC. So it makes sense. However, I am confused about one thing:
Does that mean both XR30 and AR30 are unsupported in Esmart windows? Esmart does not have AFBC modifier support on 3588.
Is this the case for both Esmart and Cluster windows, so only XR30 is supported on both ESMART and CLUSTER? |
@andyshrk I have also noticed that both in the vop2 driver code and in the 3588 TRM there is a performance bottleneck when scaling down spliced Cluster planes: a maximum scale-down factor of 1.2 is supported. So enabling splice at <4k is not a good way to benefit from the increased MAX_INPUT of planes, because the output of the plane will be <4k in this case and more scaling will be needed. This will simply suffer in performance. I do not know how conventional this idea is, but maybe you can leverage the RGA3 cores for scaling in VOP2 to overcome this. Of course this would introduce more delay and a lot more complexity in the driver, and I am not sure it is even possible. |
There is no X/ARGB30 in Esmart format list |
Yes, large scale-downs often encounter performance issues. We often use RGA or GPU to handle such large scaling, but this is not done in the vop driver; it is done in userspace (using the librga api or gles) before committing a plane to drm. |
This allows decoder options to be overridden via ENV where the client has no support for changing the decoder options. Additionally AVOptions are printed as VERBOSE on infochange.
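The general idea of letting an environment variable override a configured option can be sketched as below. This is not the code from this PR: `parse_int_or` and `option_from_env` are hypothetical helpers, the variable name is left to the caller because the names used by the patch are not shown in this thread, and the real patch presumably goes through the AVOption machinery rather than parsing integers by hand.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse a decimal integer, returning `fallback` when the string is
 * missing, empty, malformed, or out of range for int.
 */
static int parse_int_or(const char *s, int fallback)
{
    char *end = NULL;
    long v;

    if (!s || !*s)
        return fallback;

    errno = 0;
    v = strtol(s, &end, 10);
    if (errno || *end != '\0' || v < INT_MIN || v > INT_MAX)
        return fallback;

    return (int)v;
}

/*
 * Return the value of the environment variable `env_name` if it holds
 * a valid integer, otherwise keep the value the client configured.
 */
static int option_from_env(const char *env_name, int configured)
{
    return parse_int_or(getenv(env_name), configured);
}
```

A player that never sets the option would then still pick up a value exported in its environment, while an unset or malformed variable leaves the configured value untouched.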
493328a to b987b0c
@nyanmisaka This PR is tested and ready to be merged from my POV. I changed the functionality to be more generic and cleaner. Tested to be working fine. If you think it is ok, this can go in. |
Merged in 99ea69d |
rk3588 hw only supports XRGB2101010 in AFBC mode and does not support ARGB2101010 at all. The current drm driver falsely advertises support for both, in both linear and AFBC modes. The current VOP2 driver has no mechanism to distinguish support per modifier, so it is not possible to advertise XRGB2101010 with AFBC only without refactoring the driver. Therefore, this patch disables both XRGB2101010 and ARGB2101010 until Rockchip resolves the problem with a sustainable fix. If not applied, Kodi with GBM will display a black screen. Reference from Rockchip: nyanmisaka/ffmpeg-rockchip#4 (comment)
In close_output(), a dummy frame is created with format NONE passed to enc_open(), which isn't prepared for it. The NULL pointer dereference happened at av_pix_fmt_desc_get(enc_ctx->pix_fmt)->comp[0].depth. When fgt.graph is NULL, skip fg_output_frame() since there is nothing to output.
frame #0: 0x0000005555bc34a4 ffmpeg_g`enc_open(opaque=0xb400007efe2db690, frame=0xb400007efe2d9f70) at ffmpeg_enc.c:235:44
frame #1: 0x0000005555bef250 ffmpeg_g`enc_open(sch=0xb400007dde2d4090, enc=0xb400007e4e2daad0, frame=0xb400007efe2d9f70) at ffmpeg_sched.c:1462:11
frame #2: 0x0000005555bee094 ffmpeg_g`send_to_enc(sch=0xb400007dde2d4090, enc=0xb400007e4e2daad0, frame=0xb400007efe2d9f70) at ffmpeg_sched.c:1571:19
frame #3: 0x0000005555bee01c ffmpeg_g`sch_filter_send(sch=0xb400007dde2d4090, fg_idx=0, out_idx=0, frame=0xb400007efe2d9f70) at ffmpeg_sched.c:2154:12
frame #4: 0x0000005555bcf124 ffmpeg_g`close_output(ofp=0xb400007e4e2d85b0, fgt=0x0000007d1790eb08) at ffmpeg_filter.c:2225:15
frame #5: 0x0000005555bcb000 ffmpeg_g`fg_output_frame(ofp=0xb400007e4e2d85b0, fgt=0x0000007d1790eb08, frame=0x0000000000000000) at ffmpeg_filter.c:2317:16
frame #6: 0x0000005555bc7e48 ffmpeg_g`filter_thread(arg=0xb400007eae2ce7a0) at ffmpeg_filter.c:2836:15
frame #7: 0x0000005555bee568 ffmpeg_g`task_wrapper(arg=0xb400007d8e2db478) at ffmpeg_sched.c:2200:21
Signed-off-by: Zhao Zhili <[email protected]>
This helps video players which do not support AVOptions (e.g. Kodi) to use AFBC mode.
With this change and this PR in Kodi, I initially got AFBC output. However, I think there are some problems:
Output of AV1:
Output of H264 & Hevc
Output of VP9:
With VP9 you can see it is almost correct except some stride issue, i think it should be divided by 4.
However with H264, HEVC and AV1 it seems that there are some other issues, i suspect some of those modifiers might differ from decoder to decoder.
As always please do not merge yet :)