Message boards : Number crunching : All work is ending in Error
Author | Message |
---|---|
Dingo Send message Joined: 30 Jan 17 Posts: 11 Credit: 71,598,438 RAC: 851 |
Almost all my tasks on both Windows and Linux are ending in error. This is an example: https://sech.me/boinc/Amicable/result.php?resultid=16886548 This is the output: Task 16886548 Name amicable_10_20_27977_1536444902.738474_524_1 Workunit 7600724 Created 9 Sep 2018, 2:57:59 UTC Sent 12 Sep 2018, 15:19:24 UTC Report deadline 15 Sep 2018, 15:19:24 UTC Received 12 Sep 2018, 15:21:04 UTC Server state Over Outcome Computation error Client state Compute error Exit status -1073741819 (0xC0000005) STATUS_ACCESS_VIOLATION Computer ID 45990 Run time CPU time Validate state Invalid Credit 0.00 Device peak FLOPS 31.47 GFLOPS Application version Amicable Numbers up to 10^20 v2.16 (mt) windows_x86_64 Stderr output <core_client_version>7.12.1</core_client_version> <![CDATA[ <message> (unknown error) - exit code -1073741819 (0xc0000005)</message> <stderr_txt> Unhandled Exception Detected... - Unhandled Exception Record - Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x00007FF737059BC7 read attempt to address 0x00003190 Engaging BOINC Windows Runtime Debugger... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x00007FF737059BC7 read attempt to address 0x00003190 Engaging BOINC Windows Runtime Debugger... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x00007FF737059BC7 read attempt to address 0x00003190 Engaging BOINC Windows Runtime Debugger... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x00007FF737059BC7 read attempt to address 0x00003190 Engaging BOINC Windows Runtime Debugger... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x00007FF737059BC7 read attempt to address 0x00003190 Engaging BOINC Windows Runtime Debugger...
******************** BOINC Windows Runtime Debugger Version 7.9.0 Dump Timestamp : 09/13/18 01:19:24 Install Directory : C:\Program Files\BOINC\ Data Directory : C:\ProgramData\BOINC Project Symstore : Reason: Access Violation (0xc0000005) at address 0x00007FF737059BC7 read attempt to address 0x00003190 Engaging BOINC Windows Runtime Debugger... LoadLibraryA( C:\ProgramData\BOINC\dbghelp.dll ): GetLastError = 126 Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x00007FF737059BC7 read attempt to address 0x00003190 Engaging BOINC Windows Runtime Debugger... Loaded Library : dbghelp.dll LoadLibraryA( C:\ProgramData\BOINC\symsrv.dll ): GetLastError = 126 LoadLibraryA( symsrv.dll ): GetLastError = 126 LoadLibraryA( C:\ProgramData\BOINC\srcsrv.dll ): GetLastError = 126 LoadLibraryA( srcsrv.dll ): GetLastError = 126 LoadLibraryA( C:\ProgramData\BOINC\version.dll ): GetLastError = 126 Loaded Library : version.dll Debugger Engine : 4.0.5.0 Symbol Search Path: C:\ProgramData\BOINC\slots\3;C:\ProgramData\BOINC\projects\sech.me_boinc_Amicable ModLoad: 0000000037050000 0000000000d67000 C:\ProgramData\BOINC\projects\sech.me_boinc_Amicable\Amicable_v_2_16.exe (-nosymbols- Symbols Loaded) Linked PDB Filename : C:\Temp\Amicable-boinc-version-128-bit\x64\Release\Amicable.pdb ModLoad: 0000000022330000 00000000001e1000 C:\WINDOWS\SYSTEM32\ntdll.dll (6.2.17134.254) (-exported- Symbols Loaded) Linked PDB Filename : ntdll.pdb File Version : 10.0.17134.228 (WinBuild.160101.0800) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 10.0.17134.228 ModLoad: 0000000021130000 00000000000b2000 C:\WINDOWS\System32\KERNEL32.DLL (6.2.17134.1) (-exported- Symbols Loaded) Linked PDB Filename : kernel32.pdb File Version : 10.0.17134.228 (WinBuild.160101.0800) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 10.0.17134.228 
ModLoad: 000000001eb50000 0000000000273000 C:\WINDOWS\System32\KERNELBASE.dll (6.2.17134.165) (-exported- Symbols Loaded) Linked PDB Filename : kernelbase.pdb File Version : 10.0.17134.228 (WinBuild.160101.0800) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 10.0.17134.228 ModLoad: 0000000020fa0000 0000000000190000 C:\WINDOWS\System32\USER32.dll (6.2.17134.1) (-exported- Symbols Loaded) Linked PDB Filename : user32.pdb File Version : 10.0.17134.1 (WinBuild.160101.0800) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 10.0.17134.1 ModLoad: 000000001f5b0000 0000000000020000 C:\WINDOWS\System32\win32u.dll (6.2.17134.1) (-exported- Symbols Loaded) Linked PDB Filename : win32u.pdb File Version : 10.0.17134.1 (WinBuild.160101.0800) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 10.0.17134.1 ModLoad: 0000000021400000 0000000000028000 C:\WINDOWS\System32\GDI32.dll (6.2.17134.1) (-exported- Symbols Loaded) Linked PDB Filename : gdi32.pdb File Version : 10.0.17134.1 (WinBuild.160101.0800) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 10.0.17134.1 ModLoad: 000000001e9b0000 0000000000192000 C:\WINDOWS\System32\gdi32full.dll (6.2.17134.112) (-exported- Symbols Loaded) Linked PDB Filename : gdi32full.pdb File Version : 10.0.17134.112 (WinBuild.160101.0800) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 10.0.17134.112 ModLoad: 000000001e8b0000 000000000009f000 C:\WINDOWS\System32\msvcp_win.dll (6.2.17134.137) (-exported- Symbols Loaded) Linked PDB Filename : msvcp_win.pdb File Version : 10.0.17134.137 (WinBuild.160101.0800) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 10.0.17134.137 ModLoad: 000000001e7b0000 00000000000fa000 
C:\WINDOWS\System32\ucrtbase.dll (6.2.17134.254) (-exported- Symbols Loaded) Linked PDB Filename : ucrtbase.pdb File Version : 10.0.17134.254 (WinBuild.160101.0800) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 10.0.17134.254 ModLoad: 0000000021820000 00000000000a1000 C:\WINDOWS\System32\ADVAPI32.dll (6.2.17134.1) (-exported- Symbols Loaded) Linked PDB Filename : advapi32.pdb File Version : 10.0.17134.1 (WinBuild.160101.0800) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 10.0.17134.1 ModLoad: 0000000022250000 000000000009e000 C:\WINDOWS\System32\msvcrt.dll (7.0.17134.1) (-exported- Symbols Loaded) Linked PDB Filename : msvcrt.pdb File Version : 7.0.17134.1 (WinBuild.160101.0800) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 7.0.17134.1 ModLoad: 0000000021c00000 000000000005b000 C:\WINDOWS\System32\sechost.dll (6.2.17134.1) (-exported- Symbols Loaded) Linked PDB Filename : sechost.pdb File Version : 10.0.17134.1 (WinBuild.160101.0800) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 10.0.17134.1 ModLoad: 0000000020e70000 0000000000124000 C:\WINDOWS\System32\RPCRT4.dll (6.2.17134.112) (-exported- Symbols Loaded) Linked PDB Filename : rpcrt4.pdb File Version : 10.0.17134.1 (WinBuild.160101.0800) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 10.0.17134.1 ModLoad: 0000000021dc0000 000000000002d000 C:\WINDOWS\System32\IMM32.DLL (6.2.17134.1) (-exported- Symbols Loaded) Linked PDB Filename : imm32.pdb File Version : 10.0.17134.1 (WinBuild.160101.0800) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 10.0.17134.1 ModLoad: 000000001e6e0000 0000000000011000 C:\WINDOWS\System32\kernel.appcore.dll (6.2.17134.112) 
(-exported- Symbols Loaded) Linked PDB Filename : Kernel.Appcore.pdb File Version : 10.0.17134.112 (WinBuild.160101.0800) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 10.0.17134.112 ModLoad: 0000000019f30000 00000000001c9000 C:\WINDOWS\SYSTEM32\dbghelp.dll (6.2.17134.1) (-exported- Symbols Loaded) Linked PDB Filename : dbghelp.pdb File Version : 10.0.17134.1 (WinBuild.160101.0800) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 10.0.17134.1 ModLoad: 000000001ab60000 000000000000a000 C:\WINDOWS\SYSTEM32\version.dll (6.2.17134.1) (-exported- Symbols Loaded) Linked PDB Filename : version.pdb File Version : 10.0.17134.1 (WinBuild.160101.0800) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 10.0.17134.1 *** Dump of the Process Statistics: *** - I/O Operations Counters - Read: 7, Write: 577, Other 69 - I/O Transfers Counters - Read: 17519, Write: 604, Other 6394 - Paged Pool Usage - QuotaPagedPoolUsage: 113648, QuotaPeakPagedPoolUsage: 113648 QuotaNonPagedPoolUsage: 9104, QuotaPeakNonPagedPoolUsage: 9232 - Virtual Memory Usage - VirtualSize: 529408000, PeakVirtualSize: 625639424 - Pagefile Usage - PagefileUsage: 529408000, PeakPagefileUsage: 529416192 - Working Set Size - WorkingSetSize: 98172928, PeakWorkingSetSize: 98172928, PageFaultCount: 24901 *** Dump of thread ID 18012 (state: Initialized): *** - Information - Status: Base Priority: Normal, Priority: Normal, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 0.000000 - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x00007FF737059BC7 read attempt to address 0x00003190 *** Dump of thread ID 11556 (state: Initialized): *** - Information - Status: Base Priority: Normal, Priority: Normal, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 0.000000 - Unhandled Exception Record - Reason: Access 
Violation (0xc0000005) at address 0x00007FF737059BC7 read attempt to address 0x00003190 *** Dump of thread ID 32761 (state: Initialized): *** - Information - Status: Base Priority: Normal, Priority: Unknown, , Kernel Time: 37.000000, User Time: 0.000000, Wait Time: 4242696448.000000 *** Dump of thread ID 30689963 (state: Unknown): *** - Information - Status: Base Priority: Normal, Priority: Unknown, , Kernel Time: 4294967296.000000, User Time: 21474836480.000000, Wait Time: 156250.000000 *** Dump of thread ID 5 (state: Initialized): *** - Information - Status: Base Priority: Normal, Priority: Unknown, , Kernel Time: 51584956.000000, User Time: 140707994009600.000000, Wait Time: 9468.000000 *** Dump of thread ID 1 (state: Initialized): *** - Information - Status: Base Priority: Unknown, Priority: Unknown, , Kernel Time: 131812391695417344.000000, User Time: 51584956.000000, Wait Time: 5604.000000 *** Debug Message Dump **** *** Foreground Window Data *** Window Name : Window Class : Window Process ID: 0 Window Thread ID : 0 Exiting... </stderr_txt> The Linux output is different: <core_client_version>7.4.23</core_client_version> <![CDATA[ <message> process exited with code 193 (0xc1, -63) </message> <stderr_txt> SIGSEGV: segmentation violation Stack trace (10 frames): [0x4384d0] [0x458bb0] [0x416f5f] [0x414e7e] [0x41bee3] [0x4141dc] [0x405ba3] [0x49bbcf] [0x44f095] [0x59fecb] Exiting... </stderr_txt> |
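For anyone comparing the two reports: the numbers in the logs above are the same exit status shown as signed and unsigned values. A minimal Python sketch of the conversion follows. The helper names `to_unsigned` and `to_signed` are made up for illustration and are not part of BOINC or the Amicable app.

```python
def to_unsigned(value: int, bits: int) -> int:
    """Reinterpret a signed integer as an unsigned value of the given width."""
    return value & ((1 << bits) - 1)


def to_signed(value: int, bits: int) -> int:
    """Reinterpret an unsigned integer as a two's-complement signed value."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):  # top bit set means the value is negative
        return value - (1 << bits)
    return value


# Windows: the 32-bit exit status from the task page.
print(hex(to_unsigned(-1073741819, 32)))  # 0xc0000005
print(to_signed(0xC0000005, 32))          # -1073741819

# Linux: "process exited with code 193 (0xc1, -63)" is an 8-bit value.
print(hex(193), to_signed(193, 8))        # 0xc1 -63
```

This only shows how the values in the logs relate to each other. It says nothing about what caused the access violation or the segmentation fault.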
Sergei Chernykh Project administrator Project developer Send message Joined: 5 Jan 17 Posts: 534 Credit: 72,451,573 RAC: 0 |
I can confirm it started crashing on the latest work units, but I don't know why yet. I'll try to fix it today. |
Sergei Chernykh Project administrator Project developer Send message Joined: 5 Jan 17 Posts: 534 Credit: 72,451,573 RAC: 0 |
I've found the bug and will update the CPU version later today. |
Sergei Chernykh Project administrator Project developer Send message Joined: 5 Jan 17 Posts: 534 Credit: 72,451,573 RAC: 0 |
It's fixed now. |
Dingo Send message Joined: 30 Jan 17 Posts: 11 Credit: 71,598,438 RAC: 851 |
|
SoNic1967 Send message Joined: 8 Sep 18 Posts: 13 Credit: 23,954,022 RAC: 0 |
Errors stopped on my PC. Good job! |
XAVER Send message Joined: 17 Jul 18 Posts: 1 Credit: 17,999,698 RAC: 0 |
GPU WUs are still ending in error. |
corris Send message Joined: 23 Apr 17 Posts: 4 Credit: 186,151,574 RAC: 0 |
Yep, GPU tasks on Linux have suddenly started erroring. They were running OK, but not anymore. |
Sergei Chernykh Project administrator Project developer Send message Joined: 5 Jan 17 Posts: 534 Credit: 72,451,573 RAC: 0 |
Sorry, will fix it ASAP. |
corris Send message Joined: 23 Apr 17 Posts: 4 Credit: 186,151,574 RAC: 0 |
Not all of them, Sergei. Maybe something over 50% in the past hour or so. They error out (NVIDIA 1060, Linux) at 18 seconds. If you are on the case, then no worries. |
Sergei Chernykh Project administrator Project developer Send message Joined: 5 Jan 17 Posts: 534 Credit: 72,451,573 RAC: 0 |
I've updated the Windows and Linux OpenCL versions. Can you check that they run fine? The macOS version will follow soon. P.S. The macOS OpenCL version is now updated. |
Sergei Chernykh Project administrator Project developer Send message Joined: 5 Jan 17 Posts: 534 Credit: 72,451,573 RAC: 0 |
It looks like it's fixed now. The new GPU versions haven't given any errors so far, and almost 100 WUs are already finished. |
Dingo Send message Joined: 30 Jan 17 Posts: 11 Credit: 71,598,438 RAC: 851 |
I am getting a different error now on most of my tasks: Task 17126533 Name amicable_10_20_8598_1537187402.097611_151_0 Workunit 7699398 Created 17 Sep 2018, 12:30:41 UTC Sent 17 Sep 2018, 13:47:25 UTC Report deadline 20 Sep 2018, 13:47:25 UTC Received 17 Sep 2018, 13:51:59 UTC Server state Over Outcome Computation error Client state Compute error Exit status -1 (0xFFFFFFFF) Unknown error code Computer ID 45990 Run time 1 min 1 sec CPU time 1 sec Validate state Invalid Credit 0.00 Device peak FLOPS 11,792.29 GFLOPS Application version Amicable Numbers up to 10^20 v2.17 (opencl_nvidia) windows_x86_64 Peak working set size 151.53 MB Peak swap size 1,413.85 MB Peak disk usage 0.01 MB Stderr output <core_client_version>7.12.1</core_client_version> <![CDATA[ <message> (unknown error) - exit code -1 (0xffffffff)</message> <stderr_txt> Initializing prime tables...done c:\temp\amicable-boinc-opencl-version-128-bit\amicable\opencl.cpp, line 294: Preferences: <project_preferences> <max_jobs>0</max_jobs> <max_cpus>0</max_cpus> <kernel_size_amd>21</kernel_size_amd> <kernel_size_nvidia>23</kernel_size_nvidia> </project_preferences> c:\temp\amicable-boinc-opencl-version-128-bit\amicable\opencl.cpp, line 307: Kernel size for NVIDIA GPU has been set to 23 Initializing prime tables...done c:\temp\amicable-boinc-opencl-version-128-bit\amicable\opencl.cpp, line 294: Preferences: <project_preferences> <max_jobs>0</max_jobs> <max_cpus>0</max_cpus> <kernel_size_amd>21</kernel_size_amd> <kernel_size_nvidia>23</kernel_size_nvidia> </project_preferences> c:\temp\amicable-boinc-opencl-version-128-bit\amicable\opencl.cpp, line 307: Kernel size for NVIDIA GPU has been set to 23 c:\temp\amicable-boinc-opencl-version-128-bit\amicable\opencl.cpp, line 1130: clGetEventInfo returned error -58 23:50:43 (17116): called boinc_finish(-1) </stderr_txt> ]]> |
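The last useful line in that stderr is `clGetEventInfo returned error -58`. In the standard OpenCL header, -58 is `CL_INVALID_EVENT`. Below is a rough Python sketch for pulling such lines out of a stderr dump and naming the code. The table is only a small subset of the header, and the function name `describe_opencl_errors` is made up for this example.

```python
import re

# Small subset of the status codes defined in the Khronos OpenCL header (cl.h).
OPENCL_ERRORS = {
    0: "CL_SUCCESS",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -30: "CL_INVALID_VALUE",
    -36: "CL_INVALID_COMMAND_QUEUE",
    -58: "CL_INVALID_EVENT",
}

# Matches the message format seen in the stderr above.
PATTERN = re.compile(r"(cl[A-Z]\w+) returned error (-?\d+)")


def describe_opencl_errors(stderr_text):
    """Return (call, code, symbolic name) for each OpenCL error line found."""
    results = []
    for call, code in PATTERN.findall(stderr_text):
        name = OPENCL_ERRORS.get(int(code), "unknown (not in this subset)")
        results.append((call, int(code), name))
    return results


sample = "opencl.cpp, line 1130: clGetEventInfo returned error -58"
print(describe_opencl_errors(sample))
# [('clGetEventInfo', -58, 'CL_INVALID_EVENT')]
```

The symbolic name does not explain why the app ended up with an invalid event. It just makes the log line easier to search for.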
Sergei Chernykh Project administrator Project developer Send message Joined: 5 Jan 17 Posts: 534 Credit: 72,451,573 RAC: 0 |
Try reducing "Kernel size for NVIDIA GPU"; 23 might be too high. |
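After changing the setting, one way to confirm that a new task picked it up is to read the `<project_preferences>` block that the app echoes into stderr. Here is a minimal Python sketch, assuming the block looks like the one in the log above. The function name `read_project_preferences` is hypothetical, and the thread does not say what the kernel size value controls internally.

```python
import re
import xml.etree.ElementTree as ET


def read_project_preferences(stderr_text):
    """Return the first echoed <project_preferences> block as a tag -> text dict."""
    match = re.search(
        r"<project_preferences>.*?</project_preferences>",
        stderr_text,
        re.DOTALL,
    )
    if match is None:
        return {}
    root = ET.fromstring(match.group(0))
    return {child.tag: (child.text or "").strip() for child in root}


# Fragment copied from the stderr in this thread.
stderr_text = (
    "Preferences: <project_preferences> <max_jobs>0</max_jobs> "
    "<max_cpus>0</max_cpus> <kernel_size_amd>21</kernel_size_amd> "
    "<kernel_size_nvidia>23</kernel_size_nvidia> </project_preferences>"
)

prefs = read_project_preferences(stderr_text)
print(prefs["kernel_size_nvidia"])  # 23
```

If a later task still reports 23 here, the new preference has not reached the client yet.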
©2024 Sergei Chernykh