Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: access violation in some certain conditions #110

Closed
guyi2000 opened this issue Jan 20, 2024 · 16 comments
Closed

Bug: access violation in some certain conditions #110

guyi2000 opened this issue Jan 20, 2024 · 16 comments

Comments

@guyi2000
Copy link
Contributor

guyi2000 commented Jan 20, 2024

Short description

Hello, thank you very much for your software. I believe that this is the best alternative to the QuickPar program.

During usage, I noticed that clicking on the "Settings" button causes the program to crash. Additionally, when the GPU acceleration switch is enabled, there is a prompt that par2j64.exe has stopped working while creating the Par2 file. After carefully reading the source code, I identified a potential access violation in source/par2j/lib_opencl.c.

In lines 235-245 of the source/par2j/lib_opencl.c file within the init_OpenCL() function:

	ret = fn_clGetPlatformIDs(MAX_DEVICE, platform_id, &num_platforms);
	if (ret != CL_SUCCESS)
		return (ret << 8) | 10;
	if (OpenCL_method & 0x200){	// Modify the selection order and initial value
		gpu_power = INT_MIN;
	} else {
		gpu_power = 0;
	}
	alloc_max = 0;

	for (i = 0; i < (int)num_platforms,; i++){

Please note that fn_clGetPlatformIDs(MAX_DEVICE, platform_id, &num_platforms) is used to retrieve the number of platforms, and it is limited by MAX_DEVICE. However, in reality, this limitation does not take effect. In my computer environment, num_platforms = 5, whereas the default MAX_DEVICE is 3, resulting in an out-of-bounds error in the subsequent for loop.

Fixing this is straightforward by modifying the condition of the for loop to i < (int)min(num_platforms, MAX_DEVICE).

I will submit a pull request to address this issue. Thank you!

Environment

The presence of five OpenCL platforms on my computer is the main cause of this issue.

Here is the output of clinfo

Number of platforms                               5

  Platform Name                                   NVIDIA CUDA
Number of devices                                 1
  Device Name                                     NVIDIA GeForce MX150

  Platform Name                                   Intel(R) OpenCL HD Graphics
Number of devices                                 1
  Device Name                                     Intel(R) UHD Graphics 620

  Platform Name                                   Intel(R) OpenCL
Number of devices                                 1
  Device Name                                     Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz

  Platform Name                                   Intel(R) FPGA Emulation Platform for OpenCL(TM)
Number of devices                                 1
  Device Name                                     Intel(R) FPGA Emulation Device

  Platform Name                                   Intel(R) FPGA SDK for OpenCL(TM)
Number of devices                                 0

Additional information

As I do not have access to the source code of the MultiPar GUI, I am unable to modify the behavior of the "Settings" button. However, based on your description, I suspect that it is also caused by the same issue. I apologize for the inconvenience, but I hope you proceed with the necessary fix.

@Yutaka-Sawada
Copy link
Owner

Thank you for bug report and solution. Your OpenCL runtime seems to have a bug in clGetPlatformIDs. Though I adoped your pull request, I don't know that it works well always. If the listed order is bad, it may not find GPU device. Should I change MAX_DEVICE to 5 ?

@guyi2000
Copy link
Contributor Author

guyi2000 commented Jan 20, 2024

Thank you for your response. It is worth noting that this is not a bug in the OpenCL platform, but rather the behavior of clGetPlatformIDs() itself.

According to the documentation provided at https://man.opencl.org/clGetPlatformIDs.html, the parameter num_platforms returns the number of available OpenCL platforms. If num_platforms is NULL, this argument is disregarded. This implies that regardless of the value of num_entries, num_platforms will always be equal to the number of available OpenCL platforms, which leads to the aforementioned bug.

Based on the practice of the OpenCL SDK Context.c, it is recommended not to use a fixed value for MAX_DEVICE, but rather to use the clGetPlatformIDs function twice.

Here is the Context.c Line 20~24:

OCLERROR_RET(clGetPlatformIDs(0, NULL, &num_platforms), err, end);
MEM_CHECK(platforms = (cl_platform_id *)malloc(sizeof(cl_platform_id)
                                               * num_platforms),
          err, end);
OCLERROR_RET(clGetPlatformIDs(num_platforms, platforms, NULL), err, plat);

First, use clGetPlatformIDs(0, NULL, &num_platforms) to retrieve the number of available platforms. Then allocate memory accordingly. Finally, use clGetPlatformIDs(num_platforms, platforms, NULL) to obtain the platforms.

If you would like me to modify the code to align with the mentioned practice, please let me know, and I can submit another pull request. If you find it too complex, although not the best practice, you can consider changing MAX_DEVICES to a larger constant (such as 5 you mentioned) to avoid this issue. It depends on your preference.

@Yutaka-Sawada
Copy link
Owner

Thank you for detailed explanation. I misunderstood the behavior of clGetPlatformIDs function. So, I need to fix another possible bug at clGetDeviceIDs, too.

If you would like me to modify the code to align with the mentioned practice, please let me know, and I can submit another pull request.

I'm OK. Thanks. Changing MAX_DEVICES would be easy. I will modify my code myself.

@guyi2000
Copy link
Contributor Author

guyi2000 commented Jan 20, 2024

Yes, however, typically a home PC has one or zero OpenCL devices corresponding to an OpenCL platform (similar to the environment of my laptop that I provided above). As a result, it is highly unlikely to encounter the occurrence of access violation with the clGetDeviceIDs function. Thus, the issue with clGetDeviceIDs is not as severe as that with clGetPlatformIDs.

Thank you for your response. I believe we can close this issue once the modifications are completed.

@Yutaka-Sawada
Copy link
Owner

I modified my source code to use more stack memory (several tens bytes), because I'm lazy to write many code for dynamic allocation. It's simple, and it will work mostly. I posted the sample .EXE files on GitHub. If you have time, please test them. Thanks many advices.

@guyi2000
Copy link
Contributor Author

Thank you very much. I have tested the program. par2j.exe and par2j64.exe run as expected. But MultiPar GUI still crashes when clicking on the "Settings" button. I have debugged it using WinDbg Preview. I found that there is still an access violation issue, here are the logs and screenshots

(f7c.4784): Unknown exception - code c06d007e (first chance)
(f7c.4784): Unknown exception - code c06d007e (!!! second chance !!!)
eax=00cfb768 ebx=00000000 ecx=00000001 edx=00000000 esi=00000000 edi=00000000
eip=76cd9392 esp=00cfb768 ebp=00cfb7c4 iopl=0         nv up ei pl nz ac po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000212
KERNELBASE!RaiseException+0x62:
76cd9392 8b4c2454        mov     ecx,dword ptr [esp+54h] ss:002b:00cfb7bc=66b36ab7
0:000> g
(f7c.4784): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000000 ebx=00fe2c28 ecx=eff50db2 edx=00000000 esi=05984248 edi=00000001
eip=00000000 esp=00cfb834 ebp=00cfb85c iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010246
00000000 ??              ???
0:000> g
(f7c.4784): Access violation - code c0000005 (!!! second chance !!!)
eax=00000000 ebx=00fe2c28 ecx=eff50db2 edx=00000000 esi=05984248 edi=00000001
eip=00000000 esp=00cfb834 ebp=00cfb85c iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010246
00000000 ??              ???

image

Since I do not have the debugging symbols for the GUI interface, I am unsure of the source code for the error. However, based on the function call stack frame, it seems that the issue lies with clGetPlatformIDs. This could be due to incomplete modifications in the related code.

@Yutaka-Sawada
Copy link
Owner

Thank you for test. I forgot to change a value in my source code. It failed to exit loop, when there are 4 or more OpenCL platforms. It might freeze. I fixed the bug, and updated the GitHub alpha version. I'm sorry for trouble some times.

@guyi2000
Copy link
Contributor Author

Thank you, but I tested it and found that the above problem is still not solved (the same behavior), is it possible that there are still some of the codes that have not been changed?

@Yutaka-Sawada
Copy link
Owner

is it possible that there are still some of the codes that have not been changed?

I'm not sure where is wrong near clGetPlatformIDs. The function to check OpenCL devices is like below;

#include <opencl.h>
typedef cl_int (CL_API_CALL *API_clGetPlatformIDs)(cl_uint, cl_platform_id *, cl_uint *);
typedef cl_int (CL_API_CALL *API_clGetDeviceIDs)(cl_platform_id, cl_device_type, cl_uint, cl_device_id *, cl_uint *);
typedef cl_int (CL_API_CALL *API_clGetDeviceInfo)(cl_device_id, cl_device_info, size_t, void *, size_t *);

#define MAX_DEVICE 8

static unsigned int check_cpu(void)
{
	unsigned int CPUInfo[4], cpu_flag = 0;
	HMODULE hLibOpenCL = NULL;

	__cpuid(CPUInfo, 1);
	cpu_flag |= (CPUInfo[3] & (1 << 26)) >> 19;
	cpu_flag |= (CPUInfo[2] & (1 << 9)) >> 9;
	cpu_flag |= (CPUInfo[2] & (1 << 19)) >> 18;
	cpu_flag |= (CPUInfo[2] & (1 << 20)) >> 18;
	cpu_flag |= (CPUInfo[2] & (1 << 1)) << 2;
	if ((CPUInfo[2] & (1 << 28)) != 0){
		if (IsWindows7OrGreater()){
			__cpuid(CPUInfo, 0);
			if (CPUInfo[0] >= 7){
				__cpuidex(CPUInfo, 7, 0);
				cpu_flag |= (CPUInfo[1] & (1 << 5)) >> 1;
			}
		}
	}

	hLibOpenCL = LoadLibraryA("OpenCL.DLL");
	if (hLibOpenCL != NULL){
		API_clGetPlatformIDs fn_clGetPlatformIDs;
		API_clGetDeviceIDs fn_clGetDeviceIDs;
		API_clGetDeviceInfo fn_clGetDeviceInfo;
		int i, j;
		size_t group_size;
		cl_int ret;
		cl_uint num_platforms = 0, num_devices = 0, param_value;
		cl_platform_id platform_id[MAX_DEVICE];
		cl_device_id device_id[MAX_DEVICE];

		do {
			fn_clGetPlatformIDs = (API_clGetPlatformIDs)GetProcAddress(hLibOpenCL, "clGetPlatformIDs");
			if (fn_clGetPlatformIDs == NULL)
				break;
			fn_clGetDeviceIDs = (API_clGetDeviceIDs)GetProcAddress(hLibOpenCL, "clGetDeviceIDs");
			if (fn_clGetDeviceIDs == NULL)
				break;
			fn_clGetDeviceInfo = (API_clGetDeviceInfo)GetProcAddress(hLibOpenCL, "clGetDeviceInfo");
			if (fn_clGetDeviceInfo == NULL)
				break;

			ret = fn_clGetPlatformIDs(MAX_DEVICE, platform_id, &num_platforms);
			if (ret != CL_SUCCESS)
				break;
			if (num_platforms > MAX_DEVICE)
				num_platforms = MAX_DEVICE;
			for (i = 0; i < (int)num_platforms; i++){
				if (fn_clGetDeviceIDs(platform_id[i], CL_DEVICE_TYPE_GPU, MAX_DEVICE, device_id, &num_devices) != CL_SUCCESS)
					continue;
				if (num_devices > MAX_DEVICE)
					num_devices = MAX_DEVICE;
				for (j = 0; j < (int)num_devices; j++){
					ret = fn_clGetDeviceInfo(device_id[j], CL_DEVICE_AVAILABLE, sizeof(cl_uint), &param_value, NULL);
					if ((ret != CL_SUCCESS) || (param_value == CL_FALSE))
						continue;
					ret = fn_clGetDeviceInfo(device_id[j], CL_DEVICE_COMPILER_AVAILABLE, sizeof(cl_uint), &param_value, NULL);
					if ((ret != CL_SUCCESS) || (param_value == CL_FALSE))
						continue;

					ret = fn_clGetDeviceInfo(device_id[j], CL_DEVICE_MAX_WORK_GROUP_SIZE, sizeof(size_t), &group_size, NULL);
					if (ret != CL_SUCCESS)
						continue;
					if (group_size >= 256){
						cpu_flag |= 0x10000;
						i = MAX_DEVICE;
						break;
					}
				}
			}
		} while (0);

		FreeLibrary(hLibOpenCL);
	}

	return cpu_flag;
}

@guyi2000
Copy link
Contributor Author

Thank you for providing the code snippet. This particular section of code appears to be fine and I have personally tested it successfully. I believe the issue might lie elsewhere. To facilitate a more accurate problem localization, if convenient for you, you may consider compiling a debug version of the program and sharing the debugging symbols on OneDrive. This will allow me to debug the issue further. Alternatively, you could consider removing the section where clGetPlatformIDs is called, and then I can test whether it passes. This way, we can determine if the issue lies within that particular section. Nevertheless, I apologize for any inconvenience caused and thank you for your understanding.

@Yutaka-Sawada
Copy link
Owner

Alternatively, you could consider removing the section where clGetPlatformIDs is called, and then I can test whether it passes.

Thank you for suggestion and help. There might be another bug somewhere. To compare error, I made a sample without checking GPU. I just commented out OpenCL section in the check_cpu function. I put the sample package (MultiPar_sample_2024-01-21.zip) in "MultiPar_sample" folder on OneDrive. MultiPar_CheckGPU.exe is same as one of GitHub alpha version. MultiPar_NoCheck.exe is the one of commented out. If MultiPar_NoCheck.exe fails also, there would be more bug in another function.

@guyi2000
Copy link
Contributor Author

Oh, MultiPar_NoCheck.exe works very well. It appears that I need to reexamine the code snippet mentioned above as it seems that the issue lies within that code.

@guyi2000
Copy link
Contributor Author

This is confusing me. It seems that there might be a problem with the check_cpu() function since the MultiPar_CheckGPU.exe is not working while the MultiPar_NoCheck.exe is working fine. However, based on the code snippet provided earlier, I implemented a small executable program that correctly returns cpu_flag=65695 (0x1009f) without any error, indicating that check_cpu() is functioning properly.

Here is my code

// OpenCL 関係
#include <stdio.h>
#include <windows.h>
#include <VersionHelpers.h>
#include <opencl.h>

typedef cl_int(CL_API_CALL* API_clGetPlatformIDs)(cl_uint, cl_platform_id*, cl_uint*);
typedef cl_int(CL_API_CALL* API_clGetDeviceIDs)(cl_platform_id, cl_device_type, cl_uint, cl_device_id*, cl_uint*);
typedef cl_int(CL_API_CALL* API_clGetDeviceInfo)(cl_device_id, cl_device_info, size_t, void*, size_t*);

#define MAX_DEVICE 8

// CPU の拡張機能と GPU の OpenCL driver の存在を調べる
static unsigned int check_cpu(void)
{
	unsigned int CPUInfo[4], cpu_flag = 0;
	HMODULE hLibOpenCL = NULL;

	// CPU の拡張機能を調べる
	__cpuid(CPUInfo, 1);
	cpu_flag |= (CPUInfo[3] & (1 << 26)) >> 19;	// SSE2 対応か
	cpu_flag |= (CPUInfo[2] & (1 << 9)) >> 9;	// SSSE3 対応か
	cpu_flag |= (CPUInfo[2] & (1 << 19)) >> 18;	// SSE4.1 対応か
	cpu_flag |= (CPUInfo[2] & (1 << 20)) >> 18;	// SSE4.2 対応か
	cpu_flag |= (CPUInfo[2] & (1 << 1)) << 2;	// CLMUL 対応か
	if ((CPUInfo[2] & (1 << 28)) != 0) {	// AVX 対応なら
		if (IsWindows7OrGreater()) {	// Windows 7 以降なら AVX2 の判定をする
			__cpuid(CPUInfo, 0);
			if (CPUInfo[0] >= 7) {	// AVX2 用の基本命令領域があるなら
				__cpuidex(CPUInfo, 7, 0);
				cpu_flag |= (CPUInfo[1] & (1 << 5)) >> 1;	// AVX2 対応か
			}
		}
	}

	// GPU を調べるけど、32-bit版しかチェックできない・・・
	hLibOpenCL = LoadLibraryA("OpenCL.DLL");
	if (hLibOpenCL != NULL) {
		API_clGetPlatformIDs fn_clGetPlatformIDs;
		API_clGetDeviceIDs fn_clGetDeviceIDs;
		API_clGetDeviceInfo fn_clGetDeviceInfo;
		int i, j;
		size_t group_size;
		cl_int ret;
		cl_uint num_platforms = 0, num_devices = 0, param_value;
		cl_platform_id platform_id[MAX_DEVICE];
		cl_device_id device_id[MAX_DEVICE];

		do {
			fn_clGetPlatformIDs = (API_clGetPlatformIDs)GetProcAddress(hLibOpenCL, "clGetPlatformIDs");
			if (fn_clGetPlatformIDs == NULL)
				break;
			fn_clGetDeviceIDs = (API_clGetDeviceIDs)GetProcAddress(hLibOpenCL, "clGetDeviceIDs");
			if (fn_clGetDeviceIDs == NULL)
				break;
			fn_clGetDeviceInfo = (API_clGetDeviceInfo)GetProcAddress(hLibOpenCL, "clGetDeviceInfo");
			if (fn_clGetDeviceInfo == NULL)
				break;

			// OpenCL 環境の数
			ret = fn_clGetPlatformIDs(MAX_DEVICE, platform_id, &num_platforms);
			if (ret != CL_SUCCESS)
				break;
			if (num_platforms > MAX_DEVICE)
				num_platforms = MAX_DEVICE;
			for (i = 0; i < (int)num_platforms; i++) {
				// 環境内の OpenCL 対応機器の数
				if (fn_clGetDeviceIDs(platform_id[i], CL_DEVICE_TYPE_GPU, MAX_DEVICE, device_id, &num_devices) != CL_SUCCESS)
					continue;
				if (num_devices > MAX_DEVICE)
					num_devices = MAX_DEVICE;
				for (j = 0; j < (int)num_devices; j++) {
					// デバイスが利用可能か確かめる
					ret = fn_clGetDeviceInfo(device_id[j], CL_DEVICE_AVAILABLE, sizeof(cl_uint), &param_value, NULL);
					if ((ret != CL_SUCCESS) || (param_value == CL_FALSE))
						continue;
					ret = fn_clGetDeviceInfo(device_id[j], CL_DEVICE_COMPILER_AVAILABLE, sizeof(cl_uint), &param_value, NULL);
					if ((ret != CL_SUCCESS) || (param_value == CL_FALSE))
						continue;

					// WORK_GROUP_SIZE が 256以上ないとテーブルを作れない
					ret = fn_clGetDeviceInfo(device_id[j], CL_DEVICE_MAX_WORK_GROUP_SIZE, sizeof(size_t), &group_size, NULL);
					if (ret != CL_SUCCESS)
						continue;
					if (group_size >= 256) {
						cpu_flag |= 0x10000;
						i = MAX_DEVICE;
						break;
					}
				}
			}
		} while (0);

		FreeLibrary(hLibOpenCL);
	}

	return cpu_flag;
}

int main() {
	printf("%u", check_cpu());
	return 0;
}

@guyi2000
Copy link
Contributor Author

guyi2000 commented Jan 21, 2024

Very well, I have identified the root cause of the issue. There is no issue with your modifications to the code. The problem lies with me. Firstly, this issue is only encountered on x86. That is also why I mentioned earlier that there were no issues detected during compilation, as I was using the default x64 compilation method.

The primary concern lies in the loading mechanism of the OpenCL.dll, which utilizes the OpenCL ICD loading mechanism. This mechanism redirects the OpenCL.dll located in SysWOW64 to intelocl32_emu.dll (provided by Intel OneAPI).

However, it has been reported that the intelocl32_emu.dll itself has defects, as mentioned in the reference https://community.intel.com/t5/OpenCL-for-CPU/Crash-32-bit-application-with-Intel-CPU-Runtime-for-OpenCL-22/m-p/1369747. The clGetPlatformIDs function will result in an error.

After deleting the registry key value HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Khronos\OpenCL\Vendors\*, everything started running smoothly. LOL, the crashes of par2j64.exe and Multipar GUI are caused by different reasons.

Therefore, if someone happens to have both Intel OneAPI and this program installed, they may encounter the same issue. The solution is to unlink the OpenCL ICD reference. I'm not sure if there is a better method, but this is one approach to consider.

P.s. ANOTHER SOLUTION: If a version of MultiPar GUI that is compatible with x64 could be provided, it would be able to address this issue.

Indeed, our previous discussion was valuable as it successfully addressed the issue of access violation caused by having too many OpenCL platforms.

Thank you very much for your kind words and appreciation. I'm glad that I could assist you. If you believe that we have addressed the issue and no further assistance is required, then we can consider closing the issue. I also apologize for taking up your valuable time. If you have any more questions or need assistance in the future, feel free to reach out.

@Yutaka-Sawada
Copy link
Owner

Oh, I see. I'm glad to know that the problem was solved. My MultiPar.exe is 32-bit application to support old OSes. When someone sees the same error in future, this thread will be helpful. Thank you for fixing the rare bug. If you see another bug while using MultiPar, please report again.

@guyi2000
Copy link
Contributor Author

Thank you very much for your great software.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants