Posts by jari

1) Message boards : Getting started : State: Waiting for memory (Message 1596)
Posted 26 Jun 2022 by jari
Post:
Sun Jun 26 10:54:50 2022 | Amicable Numbers | Computation for task amicable_10_21_4584_1655960702.514059_799_1 finished
Sun Jun 26 10:54:50 2022 | Amicable Numbers | Starting task amicable_10_21_4584_1655960702.514059_800_0
Sun Jun 26 10:54:52 2022 | Amicable Numbers | Started upload of amicable_10_21_4584_1655960702.514059_799_1_r1275436014_0
Sun Jun 26 10:54:54 2022 | Amicable Numbers | Finished upload of amicable_10_21_4584_1655960702.514059_799_1_r1275436014_0
Sun Jun 26 10:56:31 2022 | Amicable Numbers | Aborting task amicable_10_21_4584_1655960702.514059_800_0: working set size > client RAM limit: 7866.50MB > 7866.34MB
Sun Jun 26 10:56:32 2022 | Amicable Numbers | Computation for task amicable_10_21_4584_1655960702.514059_800_0 finished
Sun Jun 26 10:56:32 2022 | Amicable Numbers | Starting task amicable_10_21_16704_1655526602.848309_766_2
Sun Jun 26 10:56:34 2022 | Amicable Numbers | Started upload of amicable_10_21_4584_1655960702.514059_800_0_r1962825675_0
Sun Jun 26 10:56:36 2022 | Amicable Numbers | Finished upload of amicable_10_21_4584_1655960702.514059_800_0_r1962825675_0
Sun Jun 26 10:58:22 2022 | Amicable Numbers | Aborting task amicable_10_21_16704_1655526602.848309_766_2: working set size > client RAM limit: 7973.66MB > 7866.34MB
Sun Jun 26 10:58:23 2022 | Amicable Numbers | Computation for task amicable_10_21_16704_1655526602.848309_766_2 finished
Sun Jun 26 10:58:23 2022 | Amicable Numbers | Starting task amicable_10_21_4584_1655960702.514059_868_1

The GPU tasks keep aborting, as the log above shows.
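To pick the abort lines out of a longer event log, something like the following might help. It is only a sketch: it assumes the line layout seen in the log above, and the names `ABORT_RE` and `find_memory_aborts` are made up for illustration, not part of BOINC.

```python
import re

# Assumed layout, taken from the log lines above:
# "... | Aborting task <name>: working set size > client RAM limit: <used>MB > <limit>MB"
ABORT_RE = re.compile(
    r"Aborting task (?P<task>\S+): working set size > client RAM limit: "
    r"(?P<used>[\d.]+)MB > (?P<limit>[\d.]+)MB"
)


def find_memory_aborts(log_text):
    """Return a (task, used_mb, limit_mb) tuple for every memory-limit abort line."""
    aborts = []
    for line in log_text.splitlines():
        match = ABORT_RE.search(line)
        if match:
            aborts.append(
                (match.group("task"), float(match.group("used")), float(match.group("limit")))
            )
    return aborts
```

Feeding it the log above should return the two aborted tasks together with their working set sizes and the client RAM limit.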

It looks like this is caused by the memory limit being too low, so I increased
Options->Memory to 90% (from the earlier 50%).
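A rough check of what that change means in megabytes. This is a sketch that assumes the client RAM limit is simply the memory percentage times the total RAM the client detects; the helper names are made up for illustration.

```python
def implied_total_ram_mb(limit_mb, fraction):
    """Total RAM implied by a limit, assuming limit = fraction * total RAM."""
    return limit_mb / fraction


def ram_limit_mb(total_mb, fraction):
    """RAM limit for a given memory fraction, under the same assumption."""
    return total_mb * fraction


# 7866.34 MB is the limit reported in the log while the setting was 50%.
total = implied_total_ram_mb(7866.34, 0.50)   # roughly 15733 MB, i.e. a 16 GB machine
new_limit = ram_limit_mb(total, 0.90)         # roughly 14159 MB at 90%

# Working set sizes of the two aborted tasks, taken from the log.
for working_set in (7866.50, 7973.66):
    print(working_set, "fits under new limit:", working_set < new_limit)
```

Under that assumption both aborted tasks would fit under the 90% limit with room to spare, which matches the tasks starting to run after the change.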

Then it worked for about 3-5 minutes until the GPU overheated and the system was forced to shut down.

So it does not look possible to get the GPU tasks working on this machine.

2) Message boards : Getting started : State: Waiting for memory (Message 1595)
Posted 25 Jun 2022 by jari
Post:
The GPU task does not run well and stopped in the state "Waiting for memory".
The GPU also seems to be crashing, as the nvidia-smi output below shows:

Thu Jun 23 20:04:03 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.73.05    Driver Version: 510.73.05    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:02:00.0 Off |                  N/A |
| N/A   60C    P8    N/A /  N/A |      4MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     11294      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+
Thu Jun 23 20:04:08 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.73.05    Driver Version: 510.73.05    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:02:00.0 Off |                  N/A |
| N/A   62C    P0    N/A /  N/A |     35MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     11294      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A    208003      C   ...le/amicable_OpenCL_v_3_02       29MiB |
+-----------------------------------------------------------------------------+
GPU 00000000:02:00.0: Detected Critical Xid Error


The CPU work units seem to be running normally, but the GPU work units are failing.
System info:

CPU:       Info: Quad Core model: Intel Core i7-8565U bits: 64 type: MT MCP L2 cache: 8 MiB 
           Speed: 2200 MHz min/max: 400/4600 MHz Core speeds (MHz): 1: 2200 2: 2200 3: 2200 4: 2200 5: 2200 6: 2200 7: 2200 
           8: 2200 
Graphics:  Device-1: Intel WhiskeyLake-U GT2 [UHD Graphics 620] driver: i915 v: kernel 
           Device-2: NVIDIA GP108M [GeForce MX250] driver: nvidia v: 515.48.07 
           Display: x11 server: X.Org 1.20.11 driver: loaded: modesetting,nvidia unloaded: fbdev,nouveau,vesa resolution: 
           1: 1920x1080~60Hz 2: 1920x1080~60Hz 
           OpenGL: renderer: Mesa Intel UHD Graphics 620 (WHL GT2) v: 4.6 Mesa 20.3.5


I think the GPU is possibly overheating and getting kicked off the bus, but I am not sure why this happens.
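One way to test the overheating theory would be to log the GPU temperature while a task runs. This is a minimal sketch: it relies on the `--query-gpu=temperature.gpu` option of nvidia-smi, and the 85 C threshold and the function names are assumptions chosen for illustration, not values from the driver or from BOINC.

```python
import subprocess


def parse_temperatures(output):
    """Parse one integer temperature (degrees C) per non-empty line of output."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def read_gpu_temperatures():
    """Ask nvidia-smi for the current temperature of each GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=temperature.gpu", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_temperatures(result.stdout)


def too_hot(temps, threshold_c=85):
    """True if any GPU is at or above the (assumed) threshold."""
    return any(t >= threshold_c for t in temps)
```

Calling `read_gpu_temperatures()` every few seconds and printing the result next to a timestamp would show whether the temperature climbs steadily before the Xid error appears.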



©2024 Sergei Chernykh