Message boards : Number crunching : GPU version: kernel size tuning and less UI lags
Sergei Chernykh Project administrator Project developer Joined: 5 Jan 17 Posts: 534 Credit: 72,451,573 RAC: 0
Starting with version 1.11, you can set the kernel size for your GPUs on the preferences page. The default kernel size is 20, which is also what version 1.10 used. Available kernel sizes range from 16 to 21.

A smaller kernel size reduces UI lag but increases OpenCL API overhead and is therefore slower. Decrease the kernel size if your user interface is laggy or unusable while the GPU version is running. A larger kernel size increases UI lag but uses the GPU more efficiently. Increase the kernel size if GPU load is below 100% and you have no UI lag, or if the GPU is not used for display output at all.

The minimal kernel size (16) has been tested on an AMD Radeon HD 5450, the slowest GPU I could find, and it works fine with very little lag. The maximal kernel size (21) brings 98-100% load even to a GeForce GTX 1080, at the cost of some UI lag. Setting it to 21 makes sense only for very powerful GPUs because it also uses quite a bit more memory. You can use this setting to fine-tune GPU version performance on your computers.
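For anyone curious what the setting is actually trading off: the post does not spell out the internals, but a setting like this typically behaves as the log2 of the amount of work submitted per OpenCL kernel launch. Below is a minimal host-side sketch under that assumption; the function name, the kernel argument layout, and the chunking scheme are illustrative, not the project's actual OpenCL.cpp code.

```cpp
// Illustrative sketch only: shows how a "kernel size" setting of this kind
// typically works. The search range is split into chunks of 2^kernel_size work
// items, and each chunk is one clEnqueueNDRangeKernel call. A larger kernel size
// means fewer, longer launches (less API overhead, higher GPU load, but the
// display cannot refresh while a launch is running); a smaller one means more,
// shorter launches and a more responsive desktop.
#include <CL/cl.h>
#include <algorithm>
#include <cstdint>

cl_int run_range_chunked(cl_command_queue queue, cl_kernel kernel,
                         uint64_t total_items, unsigned kernel_size /* e.g. 16..21 */)
{
    const uint64_t chunk = uint64_t(1) << kernel_size;   // work items per launch
    for (uint64_t offset = 0; offset < total_items; offset += chunk)
    {
        const size_t global_size = size_t(std::min(chunk, total_items - offset));

        // Hypothetical kernel argument: where this chunk starts in the search range.
        cl_int err = clSetKernelArg(kernel, 0, sizeof(offset), &offset);
        if (err != CL_SUCCESS) return err;

        err = clEnqueueNDRangeKernel(queue, kernel, 1, nullptr,
                                     &global_size, nullptr, 0, nullptr, nullptr);
        if (err != CL_SUCCESS) return err;

        // Waiting here lets other work (e.g. the desktop compositor) use the GPU
        // between chunks; this gap is what keeps the UI responsive.
        err = clFinish(queue);
        if (err != CL_SUCCESS) return err;
    }
    return CL_SUCCESS;
}
```

In this picture, doubling the kernel size roughly halves the number of launches per work unit, which is why the overhead shrinks while the pauses available to the display driver get rarer.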
Peppernrino Joined: 30 Jan 17 Posts: 17 Credit: 486,804,901 RAC: 0
I am using a Radeon R9 270 with kernel size 21 and was getting some crashes while computing in tandem with Citizen Science Grid tasks. After installing new GPU drivers, I noticed the monitor was set to shut off every 15 minutes (dummy settings from the last install), and turning this off seems to have prevented the problem entirely. I think Amicable has trouble starting the GPU with a high kernel size while the GPU is sleeping, which is probably why you warned not to use size 21 if it's the display GPU. It all makes sense now. lol. :D My favourite project for GPU.
hsdecalc Joined: 2 Mar 17 Posts: 1 Credit: 145,499,401 RAC: 197,912
On my 1080 Ti there is no change when using kernel size 21 (Amicable_OpenCL_v_1_18.exe). I still get only 33% GPU usage and more than one CPU core in use. I can run 3 WUs simultaneously; they then take about 520 seconds instead of 450.
Sergei Chernykh Project administrator Project developer Joined: 5 Jan 17 Posts: 534 Credit: 72,451,573 RAC: 0
Yes, you need to run multiple WUs because they're CPU heavy now: https://sech.me/boinc/Amicable/forum_thread.php?id=72
Tex1954 Joined: 4 Feb 17 Posts: 4 Credit: 24,049,346 RAC: 0
I'm running 3 tasks at the same time on a GTX 980 with Linux and still only get 54%-62% GPU load, even when not running anything else... Maybe I will try 4 tasks... and yes, kernel size is set to 21... 8-)
EDIT: It's better with 4 tasks...
Peppernrino Joined: 30 Jan 17 Posts: 17 Credit: 486,804,901 RAC: 0
I went into the BIOS and disabled some dummy settings that I think crunchers should generally know about, but that aren't talked about in enough places. So here goes:
- turn off C1E
- turn off C6 state
- turn off Turbo
- turn off Cool'n'Quiet
- turn off APM
There was another timing-related thing I turned off as well; I'll figure it out after a restart and post it here. Anyway, I've got everything back to full blast, 100% with kernel size 21, and have had NO errors since.
MeMoo Joined: 23 Sep 17 Posts: 1 Credit: 1,015,707,973 RAC: 0
> I went into the BIOS and disabled some dummy settings that I think crunchers should generally know about, but that aren't talked about in enough places. So here goes: ...

Thanks for this tip. Will disabling all these settings generally work on all computers?
justsomeguy Joined: 26 May 17 Posts: 2 Credit: 190,590,934 RAC: 0
It should help on all of them. Primarily, these are the throttling (Cool'n'Quiet) and power management (APM) settings. I've been turning most of these off by default on all my equipment for years. Good luck!
[AF>Amis des Lapins] Bipleouf Joined: 24 Jan 17 Posts: 10 Credit: 11,968,152 RAC: 0
Hi, if I increase the kernel size to 21, 22 or 23, do I compute bigger WUs (larger numbers), or is it just an optimization so my GPU runs faster? Naz
Sergei Chernykh Project administrator Project developer Joined: 5 Jan 17 Posts: 534 Credit: 72,451,573 RAC: 0
You get the same WUs. A bigger kernel size just means less overhead on the GPU, because it switches to the next kernel less often. But it also makes the UI laggy if you use the same GPU to drive your monitor.
[AF>Amis des Lapins] Bipleouf Joined: 24 Jan 17 Posts: 10 Credit: 11,968,152 RAC: 0
Thanks for the quick response.
[AF>Amis des Lapins] Bipleouf Joined: 24 Jan 17 Posts: 10 Credit: 11,968,152 RAC: 0
I think it would be interesting to create a table for each GPU, so that we would immediately know which value to set for the GPU to be fully effective.
Sergei Chernykh Project administrator Project developer Joined: 5 Jan 17 Posts: 534 Credit: 72,451,573 RAC: 0
I think you got it a bit wrong. You can increase the kernel size as long as you have enough GPU RAM (monitor the usage with GPU-Z or similar tools). A bigger kernel size means faster processing of WUs on any GPU that has enough memory (but more UI lag if it's attached to a monitor). Kernel size 23 should be enough for any GPU, even a 1080 Ti. The way to set this option is to try different sizes and pick the largest number that still gives you a responsive UI and 100% GPU load (you can check it in GPU-Z). Once you have 100% GPU load, increasing the kernel size further will not speed anything up, because the GPU is already at 100%.
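Sergei suggests GPU-Z for watching GPU memory. If you prefer to see what the OpenCL runtime itself reports, a small standalone query like the sketch below also works; note that it only prints the reported capacity and maximum single allocation, not live usage, and how much memory each kernel-size step actually needs is project-specific and not shown here. The code is illustrative, not part of the project.

```cpp
// Lists each OpenCL GPU with its reported global memory size and the largest
// single buffer allocation the driver allows.
#include <CL/cl.h>
#include <cstdio>
#include <vector>

int main()
{
    cl_uint num_platforms = 0;
    clGetPlatformIDs(0, nullptr, &num_platforms);
    std::vector<cl_platform_id> platforms(num_platforms);
    clGetPlatformIDs(num_platforms, platforms.data(), nullptr);

    for (cl_platform_id p : platforms)
    {
        cl_uint num_devices = 0;
        if (clGetDeviceIDs(p, CL_DEVICE_TYPE_GPU, 0, nullptr, &num_devices) != CL_SUCCESS)
            continue;
        std::vector<cl_device_id> devices(num_devices);
        clGetDeviceIDs(p, CL_DEVICE_TYPE_GPU, num_devices, devices.data(), nullptr);

        for (cl_device_id d : devices)
        {
            char name[256] = {};
            cl_ulong global_mem = 0, max_alloc = 0;
            clGetDeviceInfo(d, CL_DEVICE_NAME, sizeof(name), name, nullptr);
            clGetDeviceInfo(d, CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(global_mem), &global_mem, nullptr);
            clGetDeviceInfo(d, CL_DEVICE_MAX_MEM_ALLOC_SIZE, sizeof(max_alloc), &max_alloc, nullptr);
            std::printf("%s: %llu MB global memory, %llu MB max single allocation\n",
                        name,
                        (unsigned long long)(global_mem >> 20),
                        (unsigned long long)(max_alloc >> 20));
        }
    }
    return 0;
}
```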
[AF>Amis des Lapins] Bipleouf Joined: 24 Jan 17 Posts: 10 Credit: 11,968,152 RAC: 0
Thank you for the idea of using GPU-Z! Anyway, l'Alliance Francophone adores this project; we will soon significantly increase our number of crunchers so that the search up to 10^20 is finished as soon as possible :-)
Viking69 Joined: 4 Mar 18 Posts: 3 Credit: 8,032,527 RAC: 0
So, what am I doing wrong? Too many failures, and no explanation from the app. Here is an excerpt from the last failure:

<core_client_version>7.9.2</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -1 (0xffffffff)
</message>
<stderr_txt>
Initializing prime tables...done
OpenCL.cpp, line 294: Preferences:
<project_preferences>
<max_jobs>2</max_jobs>
<max_cpus>6</max_cpus>
<kernel_size_amd>20</kernel_size_amd>
<kernel_size_nvidia>20</kernel_size_nvidia>
</project_preferences>
OpenCL.cpp, line 307: Kernel size for NVIDIA GPU has been set to 20
OpenCL.cpp, line 440: clEnqueueWriteBuffer returned error -4
08:40:14 (2712): called boinc_finish(-1)
</stderr_txt>
]]>
Sergei Chernykh Project administrator Project developer Joined: 5 Jan 17 Posts: 534 Credit: 72,451,573 RAC: 0
There was not enough GPU memory. Try to close all other programs. Work units that are being issued now require less memory than before, so there is a good chance you won't have this error now. P.S. You can also try to reduce kernel size.
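For reference, the "error -4" in the log above is CL_MEM_OBJECT_ALLOCATION_FAILURE in CL/cl.h: the driver could not allocate or populate the buffer in device memory, which matches the out-of-GPU-memory diagnosis here. A small sketch of turning such a raw code into a readable name (the helper function is illustrative, not part of the project's code):

```cpp
// OpenCL error codes are small negative constants defined in CL/cl.h.
#include <CL/cl.h>
#include <cstdio>

const char* describe_cl_error(cl_int err)
{
    switch (err)
    {
    case CL_SUCCESS:                       return "CL_SUCCESS";                       //  0
    case CL_DEVICE_NOT_FOUND:              return "CL_DEVICE_NOT_FOUND";              // -1
    case CL_DEVICE_NOT_AVAILABLE:          return "CL_DEVICE_NOT_AVAILABLE";          // -2
    case CL_COMPILER_NOT_AVAILABLE:        return "CL_COMPILER_NOT_AVAILABLE";        // -3
    case CL_MEM_OBJECT_ALLOCATION_FAILURE: return "CL_MEM_OBJECT_ALLOCATION_FAILURE"; // -4
    case CL_OUT_OF_RESOURCES:              return "CL_OUT_OF_RESOURCES";              // -5
    case CL_OUT_OF_HOST_MEMORY:            return "CL_OUT_OF_HOST_MEMORY";            // -6
    default:                               return "other OpenCL error";
    }
}

// Usage sketch (queue, buffer and data are assumed to exist already):
// cl_int err = clEnqueueWriteBuffer(queue, buffer, CL_TRUE, 0, size, data, 0, nullptr, nullptr);
// if (err != CL_SUCCESS)
//     std::fprintf(stderr, "clEnqueueWriteBuffer returned %d (%s)\n", err, describe_cl_error(err));
```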
vseven Joined: 15 Mar 18 Posts: 12 Credit: 587,338,410 RAC: 0
Are there any plans to increase the maximum allowed kernel size? Running a Tesla V100 SXM2 16 GB:
- Kernel size 18 = 160 sec
- Kernel size 20 = 92 sec
- Kernel size 21 = 85 sec
- Kernel size 22 = 81 sec
- Kernel size 23 = 79 sec
Not sure whether 24 would be faster, but logic says yes, maybe slightly. Also, the new RTX 2080 was just released, so those cards might be more capable than current ones.
Jozef J Joined: 24 Jan 17 Posts: 20 Credit: 1,193,014,322 RAC: 0
Just to compare: the best 1080 and 1080 Ti cards do a task in about 380 seconds at best, 420-440 seconds on average, while the Tesla V100 SXM2 16 GB is a stable 80 seconds... that is an insane performance increase (the price of a Tesla V100 is also insane), while the Titan V averages 440 seconds. I don't expect much of an increase for the RTX cards, since nearly half of the GPU die is taken up by the Tensor cores. So I don't know: how do Amicable Numbers tasks depend on Tensor vs. CUDA cores, and how does that work on this project? Because the Titan V still has the same performance as a 1080 Ti. Also, increasing the kernel size is maybe worth it, just for testing, or else developing a better app for AMD cards. https://devblogs.nvidia.com/programming-tensor-cores-cuda-9/
vseven Joined: 15 Mar 18 Posts: 12 Credit: 587,338,410 RAC: 0
Yeah, I ran a Titan V for about a month and compared it to the Tesla V100, which spec-wise is very similar; in some projects it was close (within 10%), but in others it was way off (more than double the time). I ended up selling the Titan because, for the price, it wasn't worth the added processing power over a 1080 Ti. But I agree, maybe add a 24 or 25 setting just for testing.
vseven Joined: 15 Mar 18 Posts: 12 Credit: 587,338,410 RAC: 0
> I don't expect much of an increase for the RTX cards, since nearly half of the GPU die is taken up by the Tensor cores.

Just installed my RTX 2080... averaging 134 seconds per task. :)