State: Waiting for memory

jari

Joined: 23 Jun 22
Posts: 2
Credit: 4,716,974
RAC: 4,477
   
Message 1595 - Posted: 25 Jun 2022, 4:46:41 UTC
Last modified: 25 Jun 2022, 4:48:16 UTC

The GPU task does not run properly and stops in the state "Waiting for memory".
The GPU also seems to be crashing, as seen in the nvidia-smi output below:

Thu Jun 23 20:04:03 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.73.05    Driver Version: 510.73.05    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:02:00.0 Off |                  N/A |
| N/A   60C    P8    N/A /  N/A |      4MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     11294      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+
Thu Jun 23 20:04:08 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.73.05    Driver Version: 510.73.05    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:02:00.0 Off |                  N/A |
| N/A   62C    P0    N/A /  N/A |     35MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     11294      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A    208003      C   ...le/amicable_OpenCL_v_3_02       29MiB |
+-----------------------------------------------------------------------------+
GPU 00000000:02:00.0: Detected Critical Xid Error


The CPU work units seem to be running normally, but the GPU work units do not complete.
System info:

CPU:       Info: Quad Core model: Intel Core i7-8565U bits: 64 type: MT MCP L2 cache: 8 MiB 
           Speed: 2200 MHz min/max: 400/4600 MHz Core speeds (MHz): 1: 2200 2: 2200 3: 2200 4: 2200 5: 2200 6: 2200 7: 2200 
           8: 2200 
Graphics:  Device-1: Intel WhiskeyLake-U GT2 [UHD Graphics 620] driver: i915 v: kernel 
           Device-2: NVIDIA GP108M [GeForce MX250] driver: nvidia v: 515.48.07 
           Display: x11 server: X.Org 1.20.11 driver: loaded: modesetting,nvidia unloaded: fbdev,nouveau,vesa resolution: 
           1: 1920x1080~60Hz 2: 1920x1080~60Hz 
           OpenGL: renderer: Mesa Intel UHD Graphics 620 (WHL GT2) v: 4.6 Mesa 20.3.5


I think the GPU may be overheating and getting dropped from the bus, but I am not sure why this happens.
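One way to check the overheating guess against the pasted nvidia-smi output is to pull the temperature, performance state, and memory use out of each snapshot. The sketch below is only an illustration: the regular expression assumes the table layout shown in this post, and the names `ROW_RE` and `parse_snapshots` are made up for the example.

```python
import re

# Matches the per-GPU status row of the nvidia-smi table as pasted above, e.g.
# "| N/A   60C    P8    N/A /  N/A |      4MiB /  4096MiB | ..."
# Assumption: the column layout is the one shown in this post.
ROW_RE = re.compile(
    r"\|\s*\S+\s+(?P<temp>\d+)C\s+(?P<perf>P\d+)\s+[^|]*"
    r"\|\s*(?P<used>\d+)MiB\s*/\s*(?P<total>\d+)MiB"
)


def parse_snapshots(text):
    """Return (temp_c, perf_state, used_mib, total_mib) for each status row."""
    rows = []
    for line in text.splitlines():
        match = ROW_RE.search(line)
        if match:
            rows.append((
                int(match.group("temp")),
                match.group("perf"),
                int(match.group("used")),
                int(match.group("total")),
            ))
    return rows


# The two status rows copied from the output in this post.
SAMPLE = """\
| N/A   60C    P8    N/A /  N/A |      4MiB /  4096MiB |      0%      Default |
| N/A   62C    P0    N/A /  N/A |     35MiB /  4096MiB |      0%      Default |
"""

if __name__ == "__main__":
    for temp, perf, used, total in parse_snapshots(SAMPLE):
        print(f"{temp} C, state {perf}, {used}/{total} MiB in use")
```

In the two snapshots above, the last reading before the Xid error is 62C with 35MiB of 4096MiB in use, so this output on its own does not show a high temperature or full GPU memory at the moment of the crash.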
jari

Joined: 23 Jun 22
Posts: 2
Credit: 4,716,974
RAC: 4,477
   
Message 1596 - Posted: 26 Jun 2022, 3:00:11 UTC - in response to Message 1595.  
Last modified: 26 Jun 2022, 3:42:41 UTC

Sun Jun 26 10:54:50 2022 | Amicable Numbers | Computation for task amicable_10_21_4584_1655960702.514059_799_1 finished
Sun Jun 26 10:54:50 2022 | Amicable Numbers | Starting task amicable_10_21_4584_1655960702.514059_800_0
Sun Jun 26 10:54:52 2022 | Amicable Numbers | Started upload of amicable_10_21_4584_1655960702.514059_799_1_r1275436014_0
Sun Jun 26 10:54:54 2022 | Amicable Numbers | Finished upload of amicable_10_21_4584_1655960702.514059_799_1_r1275436014_0
Sun Jun 26 10:56:31 2022 | Amicable Numbers | Aborting task amicable_10_21_4584_1655960702.514059_800_0: working set size > client RAM limit: 7866.50MB > 7866.34MB
Sun Jun 26 10:56:32 2022 | Amicable Numbers | Computation for task amicable_10_21_4584_1655960702.514059_800_0 finished
Sun Jun 26 10:56:32 2022 | Amicable Numbers | Starting task amicable_10_21_16704_1655526602.848309_766_2
Sun Jun 26 10:56:34 2022 | Amicable Numbers | Started upload of amicable_10_21_4584_1655960702.514059_800_0_r1962825675_0
Sun Jun 26 10:56:36 2022 | Amicable Numbers | Finished upload of amicable_10_21_4584_1655960702.514059_800_0_r1962825675_0
Sun Jun 26 10:58:22 2022 | Amicable Numbers | Aborting task amicable_10_21_16704_1655526602.848309_766_2: working set size > client RAM limit: 7973.66MB > 7866.34MB
Sun Jun 26 10:58:23 2022 | Amicable Numbers | Computation for task amicable_10_21_16704_1655526602.848309_766_2 finished
Sun Jun 26 10:58:23 2022 | Amicable Numbers | Starting task amicable_10_21_4584_1655960702.514059_868_1

The GPU tasks keep aborting like this.

It looks like this was caused by the memory limit being too low, so I increased the
Options->Memory setting to 90% (from the earlier 50%).

After that it worked for about 3-5 minutes, until the GPU overheated and the system was forced to shut down.

So it does not look possible to get GPU tasks working on this machine.
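For anyone who wants to check their own event log for the same aborts, here is a minimal sketch that extracts the working set size and the RAM limit from lines like the ones above. The message format is taken from this log only. The `limit_at_fraction` helper rests on an assumption, namely that the client's limit scales linearly with the memory percentage set in the options; both function names are made up for the example.

```python
import re

# Abort message format as it appears in the event log above.
ABORT_RE = re.compile(
    r"Aborting task (?P<task>\S+): working set size > client RAM limit: "
    r"(?P<used>[\d.]+)MB > (?P<limit>[\d.]+)MB"
)


def parse_aborts(log_text):
    """Return (task, working_set_mb, limit_mb) for every memory-limit abort."""
    aborts = []
    for line in log_text.splitlines():
        match = ABORT_RE.search(line)
        if match:
            aborts.append((
                match.group("task"),
                float(match.group("used")),
                float(match.group("limit")),
            ))
    return aborts


def limit_at_fraction(limit_mb, old_fraction, new_fraction):
    """Estimate the limit at a new memory percentage.

    Assumption: the limit is proportional to the configured percentage.
    """
    return limit_mb / old_fraction * new_fraction


# The two abort lines copied from the log in this post.
SAMPLE_LOG = """\
Sun Jun 26 10:56:31 2022 | Amicable Numbers | Aborting task amicable_10_21_4584_1655960702.514059_800_0: working set size > client RAM limit: 7866.50MB > 7866.34MB
Sun Jun 26 10:58:22 2022 | Amicable Numbers | Aborting task amicable_10_21_16704_1655526602.848309_766_2: working set size > client RAM limit: 7973.66MB > 7866.34MB
"""

if __name__ == "__main__":
    for task, used, limit in parse_aborts(SAMPLE_LOG):
        print(f"{task}: over the limit by {used - limit:.2f} MB")
    estimate = limit_at_fraction(7866.34, 0.50, 0.90)
    print(f"Estimated limit at 90%: {estimate:.0f} MB")
```

Both aborts in the log exceed the limit by only a small margin, which fits with the tasks starting to run once the percentage was raised.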
Sascha Harper-Noorthoorn van der Kruyff
Joined: 16 Jun 22
Posts: 8
Credit: 1,045,938
RAC: 0
  
Message 1615 - Posted: 13 Aug 2022, 15:41:25 UTC - in response to Message 1596.  
Last modified: 13 Aug 2022, 16:13:22 UTC

Jari? Mine says 'waiting for memory' all the time (well, OK, not "all" the time, but a lot), and then you just go do something else, and it starts to work again. I would not worry about it if I were you. (Just suppose it was a bad thing... then how come I have already 'discovered' 6 amicable pairs?)
Sascha Harper-Noorthoorn van der Kruyff
Joined: 16 Jun 22
Posts: 8
Credit: 1,045,938
RAC: 0
  
Message 1623 - Posted: 8 Sep 2022, 13:28:41 UTC - in response to Message 1596.  

Jari... I find that there are always two or three different Amicable Numbers projects going on in my BOINC Manager. What helps me, and maybe you too, is to check up on their progress every so often, and sometimes switch to a project that 'moves forward' instead of sticking with the one you were busy with that "got stuck" because it 'waits for memory' or 'the CPU is busy' or whatever it says.
