Message boards : Number crunching : GPU version: kernel size tuning and less UI lags
Author | Message |
---|---|
Sergei Chernykh Project administrator Project developer Send message Joined: 5 Jan 17 Posts: 519 Credit: 72,451,573 RAC: 0 |
Starting with version 1.11 it is possible to set the kernel size for your GPUs on the preferences page. The default kernel size is 20, which is also what version 1.10 used. Available kernel sizes range from 16 to 21. A smaller kernel size reduces UI lag but increases OpenCL API overhead and is therefore slower; decrease the kernel size if your user interface is laggy or unusable when running the GPU version. A larger kernel size increases UI lag but uses the GPU more efficiently; increase the kernel size if GPU load is below 100% and you don't have UI lag, or if the GPU is not used for display output at all. The minimum kernel size (16) has been tested with an AMD Radeon HD 5450, the slowest GPU I could find, and it works fine with very little lag. The maximum kernel size (21) brings 98-100% load even to a GeForce GTX 1080, at the cost of some UI lag. Setting it to 21 makes sense only for very powerful GPUs because it also uses quite a bit more memory. You can use this setting to fine-tune GPU version performance on your computers. |
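To give a rough picture of why this trade-off exists, here is a hypothetical sketch (not the project's actual code), assuming a kernel size K means roughly 2^K work-items are submitted per kernel launch. Smaller launches mean more OpenCL API calls but shorter GPU bursts, so a GPU that also drives a display can refresh the screen in between; larger launches keep the GPU busy longer per call.

```cpp
// Hypothetical sketch, not the project's actual code: assume kernel size K means
// each clEnqueueNDRangeKernel call covers 2^K work-items of the search range.
#include <CL/cl.h>
#include <algorithm>
#include <cstdint>

void run_chunked(cl_command_queue queue, cl_kernel kernel,
                 uint64_t total_work, unsigned kernel_size)
{
    const uint64_t chunk = uint64_t(1) << kernel_size;       // work-items per launch
    for (uint64_t offset = 0; offset < total_work; offset += chunk)
    {
        const cl_ulong start = offset;                        // where this chunk begins
        const size_t global = size_t(std::min(chunk, total_work - offset));
        clSetKernelArg(kernel, 0, sizeof(start), &start);
        clEnqueueNDRangeKernel(queue, kernel, 1, nullptr, &global, nullptr,
                               0, nullptr, nullptr);
        clFinish(queue);  // the gap between launches is what keeps the desktop responsive
    }
}
```

Under this assumption, K = 16 issues 32 times as many launches as K = 21 for the same range, which is where the extra API overhead of small kernel sizes comes from.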
Peppernrino Send message Joined: 30 Jan 17 Posts: 17 Credit: 486,804,901 RAC: 0 |
I am using a Radeon R9 270 with kernel size 21 and was getting some crashes while computing in tandem with Citizen Science Grid tasks. After installing new GPU drivers, I noticed the monitor was set to shut off every 15 minutes (default settings left over from the last install), and turning this off seems to have prevented the problem entirely. I think Amicable has trouble starting the GPU at a high kernel size while the GPU is sleeping... which is probably why you warned not to use size 21 if the GPU is used for display output. It all makes sense now. lol. :D My favourite project for GPU. |
hsdecalc Send message Joined: 2 Mar 17 Posts: 1 Credit: 140,126,153 RAC: 280 |
On my 1080 Ti there is no change when using kernel size 21 (Amicable_OpenCL_v_1_18.exe). I still get only 33% GPU usage and more than one CPU core in use. I can run 3 WUs simultaneously; they then need about 520 sec instead of 450 sec. |
Sergei Chernykh Project administrator Project developer Send message Joined: 5 Jan 17 Posts: 519 Credit: 72,451,573 RAC: 0 |
Yes, you need to run multiple WUs because they're CPU-heavy now: https://sech.me/boinc/Amicable/forum_thread.php?id=72 |
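For anyone who wants to try this, the standard BOINC way to run several tasks on one GPU is an app_config.xml placed in the project's folder inside the BOINC data directory. The snippet below is an example of the general mechanism only, not an official recommendation; the application name is a placeholder, so check client_state.xml for the name the project actually uses.

```xml
<!-- Example only: gpu_usage 0.33 lets the client schedule 3 GPU tasks at once.
     The app name below is a placeholder; look up the real one in client_state.xml. -->
<app_config>
  <app>
    <name>amicable_gpu</name>
    <gpu_versions>
      <gpu_usage>0.33</gpu_usage>
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

After saving the file, use "Options → Read config files" in BOINC Manager to apply it without restarting the client.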
Tex1954 Send message Joined: 4 Feb 17 Posts: 4 Credit: 24,049,346 RAC: 0 |
I'm running 3 tasks at the same time on a GTX 980 with Linux and still only get 54-62% GPU load even when not running anything else... Maybe I will try 4 tasks... and yes, kernel size set to 21... 8-) EDIT: It is better with 4 tasks... |
Peppernrino Send message Joined: 30 Jan 17 Posts: 17 Credit: 486,804,901 RAC: 0 |
I went into the BIOS and disabled some settings that I think crunchers should generally know about, but that aren't talked about in enough places. So here goes: turn off C1E, turn off the C6 state, turn off Turbo, turn off Cool'n'Quiet, and turn off APM. There was another timing-related setting that I turned off too; I'll figure out its name after a restart and post it here. Anyway, I've got everything back to full blast at 100% with kernel size 21 and have had NO errors since. |
MeMoo Send message Joined: 23 Sep 17 Posts: 1 Credit: 1,015,707,973 RAC: 0 |
I went into the BIOS and disabled some settings that I think crunchers should generally know about, but that aren't talked about in enough places. So here goes: Thanks for this tip. Will disabling all these settings generally work on all computers? |
justsomeguy Send message Joined: 26 May 17 Posts: 2 Credit: 190,590,934 RAC: 0 |
It should help on all of them... primarily, these are the throttling (Cool'n'Quiet) and power management (APM) settings. I've been turning most of these off by default on all my equipment for years. Good luck! |
[AF>Amis des Lapins] Bipleouf Send message Joined: 24 Jan 17 Posts: 10 Credit: 11,968,152 RAC: 0 |
Hi, if I increase my kernel size to 21, 22, or 23, do I compute a bigger WU (a larger range of numbers), or is it just an optimization that makes my GPU go faster? Naz |
Sergei Chernykh Project administrator Project developer Send message Joined: 5 Jan 17 Posts: 519 Credit: 72,451,573 RAC: 0 |
You get the same WU. A bigger kernel size just means there is less overhead on the GPU because it switches to the next kernel less often. But it also makes the UI laggy if you use the same GPU for your monitor. |
[AF>Amis des Lapins] Bipleouf Send message Joined: 24 Jan 17 Posts: 10 Credit: 11,968,152 RAC: 0 |
Thanks for the quick response. |
[AF>Amis des Lapins] Bipleouf Send message Joined: 24 Jan 17 Posts: 10 Credit: 11,968,152 RAC: 0 |
I think it would be interesting to create a table listing, for each GPU, the value to use so that the GPU is immediately effective. |
Sergei Chernykh Project administrator Project developer Send message Joined: 5 Jan 17 Posts: 519 Credit: 72,451,573 RAC: 0 |
I think you got it a bit wrong. You can increase the kernel size as long as you have enough GPU RAM (monitor the usage with GPU-Z or similar tools). A bigger kernel size means faster processing of WUs on any GPU that has enough memory (but more UI lag if it's attached to a monitor). Kernel size = 23 should be enough for any GPU, even a 1080 Ti. The way to set this option is to try different sizes and pick the largest number that still gives you a responsive UI and 100% GPU load (you can check it in GPU-Z). Once you have 100% GPU load, increasing the kernel size further will not speed things up, because it's already at 100%. |
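If you just want to know how much memory a card has without installing GPU-Z, a small standalone OpenCL query works too. The sketch below is generic and not part of the project's tooling, and it reports total global memory per device rather than live usage.

```cpp
// Standalone sketch (not part of the project): list each OpenCL GPU device
// and its total global memory, as a rough guide before raising the kernel size.
#include <CL/cl.h>
#include <cstdio>
#include <vector>

int main()
{
    cl_uint np = 0;
    clGetPlatformIDs(0, nullptr, &np);
    std::vector<cl_platform_id> platforms(np);
    clGetPlatformIDs(np, platforms.data(), nullptr);

    for (cl_platform_id p : platforms)
    {
        cl_uint nd = 0;
        if (clGetDeviceIDs(p, CL_DEVICE_TYPE_GPU, 0, nullptr, &nd) != CL_SUCCESS)
            continue;   // no GPU devices on this platform
        std::vector<cl_device_id> devices(nd);
        clGetDeviceIDs(p, CL_DEVICE_TYPE_GPU, nd, devices.data(), nullptr);

        for (cl_device_id d : devices)
        {
            char name[256] = {};
            cl_ulong mem = 0;
            clGetDeviceInfo(d, CL_DEVICE_NAME, sizeof(name), name, nullptr);
            clGetDeviceInfo(d, CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(mem), &mem, nullptr);
            std::printf("%s: %llu MB global memory\n",
                        name, (unsigned long long)(mem >> 20));
        }
    }
    return 0;
}
```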
[AF>Amis des Lapins] Bipleouf Send message Joined: 24 Jan 17 Posts: 10 Credit: 11,968,152 RAC: 0 |
Thank you for the idea of using GPU-Z! Anyway, L'Alliance Francophone adores this project; soon we will significantly increase the number of crunchers in order to finish the search up to 10^20 as soon as possible :-) |
Viking69 Send message Joined: 4 Mar 18 Posts: 3 Credit: 8,032,527 RAC: 0 |
So, what am I doing wrong? Too many failures, and no explanation from the app. Here is an excerpt from the last failure: <core_client_version>7.9.2</core_client_version> <![CDATA[ <message> (unknown error) - exit code -1 (0xffffffff)</message> <stderr_txt> Initializing prime tables...done OpenCL.cpp, line 294: Preferences: <project_preferences> <max_jobs>2</max_jobs> <max_cpus>6</max_cpus> <kernel_size_amd>20</kernel_size_amd> <kernel_size_nvidia>20</kernel_size_nvidia> </project_preferences> OpenCL.cpp, line 307: Kernel size for NVIDIA GPU has been set to 20 OpenCL.cpp, line 440: clEnqueueWriteBuffer returned error -4 08:40:14 (2712): called boinc_finish(-1) </stderr_txt> ]]> |
Sergei Chernykh Project administrator Project developer Send message Joined: 5 Jan 17 Posts: 519 Credit: 72,451,573 RAC: 0 |
There was not enough GPU memory. Try to close all other programs. Work units that are being issued now require less memory than before, so there is a good chance you won't have this error now. P.S. You can also try to reduce kernel size. |
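For reference, the error -4 returned by clEnqueueWriteBuffer in the log above is the out-of-memory code defined in the standard OpenCL headers, which matches the explanation above:

```cpp
// From the standard OpenCL headers (CL/cl.h):
#define CL_MEM_OBJECT_ALLOCATION_FAILURE  -4   // the requested GPU buffer could not be allocated
```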
vseven Send message Joined: 15 Mar 18 Posts: 12 Credit: 587,338,410 RAC: 0 |
Are there any plans to increase the maximum kernel size allowed? Running a Tesla V100 SXM2 16 GB: kernel size 18 = 160 sec, 20 = 92 sec, 21 = 85 sec, 22 = 81 sec, 23 = 79 sec. Not sure if 24 would be faster, but logic would say yes, maybe slightly. Also, the new RTX 2080 was just released, so those might be more capable than current cards. |
Jozef J Send message Joined: 24 Jan 17 Posts: 20 Credit: 1,193,014,322 RAC: 1 |
Just to compare: the best 1080 and 1080 Ti cards do a task in about 380 sec, averaging 420-440 sec, while the Tesla V100 SXM2 16 GB is a stable 80 sec... that is an insane increase in performance (the price of a Tesla V100 is also insane). Meanwhile the Titan V averages 440 sec. I don't expect much of an increase for RTX cards; nearly half of the GPU die is taken up by those "tensor" cores. So I don't know... how do Amicable Numbers tasks depend on tensor/CUDA cores, or how does that work on this project? Because the Titan V still has the same performance as a 1080 Ti. Second, increasing the maximum kernel size is maybe worth it, just for testing... or develop a better app for AMD cards. https://devblogs.nvidia.com/programming-tensor-cores-cuda-9/ |
vseven Send message Joined: 15 Mar 18 Posts: 12 Credit: 587,338,410 RAC: 0 |
Yeah, I ran a Titan V for about a month and compared it to the Tesla V100, which spec-wise is very similar; in some projects it was close (within 10%), but in others it was way off (more than double the time). I ended up selling the Titan because for the price it wasn't worth the added processing power over a 1080 Ti. But I agree, maybe a 24 or 25 setting just for testing. |
vseven Send message Joined: 15 Mar 18 Posts: 12 Credit: 587,338,410 RAC: 0 |
I don't expect much of an increase for RTX cards; nearly half of the GPU die is taken up by those "tensor" cores. Just installed my RTX 2080... averaging 134 seconds a task. :) |