Auto-Submit/Resubmit Tasks
Lin Ziyue
These two scripts are highlighted separately because they are extremely useful in actual work:
* The auto-submit script can control the number of queued tasks;
* The resubmit script can monitor the runtime of submitted tasks and automatically cancel/resubmit when needed.
Below are the core concepts and sample code. The actual commands to execute should be filled in at the TODO locations.
Auto-Submit
The script does only two things:
- Monitor the scheduler queue until enough quota is available to continue;
- Check if local marker files meet conditions to decide whether to proceed.
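The two steps above can be sketched roughly as follows. This is a minimal illustration, not the original script: the names MAX_JOBS, SLEEP_SECONDS, MARKER_FILE and KEYWORD, their default values (other than the threshold of 21), and the rule "skip a directory whose marker file already contains the keyword" are all assumptions made for the example. Job directories are passed as arguments, and the real submit command still belongs at the TODO.

```shell
#!/usr/bin/env bash
# Sketch of an auto-submit loop (assumed names and defaults, see the text above).

MAX_JOBS=${MAX_JOBS:-21}              # queue threshold
SLEEP_SECONDS=${SLEEP_SECONDS:-60}    # polling interval (assumed value)
MARKER_FILE=${MARKER_FILE:-run.log}   # hypothetical marker file name
KEYWORD=${KEYWORD:-DONE}              # hypothetical keyword

# Count this user's jobs in the queue; tail -n +2 strips the squeue header.
count_jobs() {
    squeue -u "$USER" | tail -n +2 | wc -l
}

# Block until the number of queued jobs drops below the threshold.
wait_for_quota() {
    while [ "$(count_jobs)" -ge "$MAX_JOBS" ]; do
        sleep "$SLEEP_SECONDS"
    done
}

# Succeed if the marker file in directory $1 exists and contains the keyword.
should_skip() {
    [ -f "$1/$MARKER_FILE" ] && grep -q "$KEYWORD" "$1/$MARKER_FILE"
}

# Walk over the job directories given on the command line.
for dir in "$@"; do
    if should_skip "$dir"; then
        continue
    fi
    wait_for_quota
    # TODO: replace with the real submit command for this directory.
    echo "would submit: $dir"
done
```

Keeping the queue check and the marker check in separate functions means either condition can be changed without touching the loop.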
Key Points
- squeue -u "$USER" only checks your own jobs; replace it with the appropriate command if you are not using Slurm.
- tail -n +2 removes the header line so that wc -l returns a plain job count.
- grep -q is silent mode; it only checks whether the keyword matches.
- The threshold (21), sleep interval, keyword, and marker filename can be adjusted per project needs.
Resubmit Tasks
This script keeps only the following:
- Monitor SLURM queue task runtime;
- Filter tasks to process based on job name and working directory keywords;
- (Optional) Double-check via ps aux that the task belongs to the current account;
- Provide two choices: "only delete (scancel) timeout tasks / continue custom operations".
The actual "subsequent operations" are left in TODO placeholders for you to fill in as needed.
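A condensed sketch of this structure is given here for orientation. It is not the original script: the variable names NAME_KEY, DIR_KEY and LIMIT_SECONDS, the function names, and the cancel/custom mode argument are assumptions for illustration, and the optional ps aux check is left out. It assumes job names contain no "|" character, and it relies on the squeue format fields %i (job ID), %j (job name), %M (time used) and %Z (working directory).

```shell
#!/usr/bin/env bash
# Sketch of a timeout monitor (assumed names and defaults, see the text above).

NAME_KEY=${NAME_KEY:-Runscrip}           # job-name keyword
DIR_KEY=${DIR_KEY:-}                     # working-directory keyword (empty = any)
LIMIT_SECONDS=${LIMIT_SECONDS:-21600}    # 6 h

# Convert a squeue elapsed time ([D-][HH:]MM:SS) to seconds.
to_seconds() {
    echo "$1" | awk -F'[-:]' '{
        n = NF
        s = $n
        m = (n >= 2) ? $(n-1) : 0
        h = (n >= 3) ? $(n-2) : 0
        d = (n >= 4) ? $(n-3) : 0
        print d * 86400 + h * 3600 + m * 60 + s
    }'
}

# List this user's running jobs as "id|name|elapsed|workdir".
list_running() {
    squeue -u "$USER" -h -t RUNNING -o "%i|%j|%M|%Z"
}

# Read job lines on stdin; print "id workdir" for matching jobs over the limit.
find_timeouts() {
    while IFS='|' read -r id name elapsed workdir; do
        case $name in *"$NAME_KEY"*) ;; *) continue ;; esac
        case $workdir in *"$DIR_KEY"*) ;; *) continue ;; esac
        if [ "$(to_seconds "$elapsed")" -ge "$LIMIT_SECONDS" ]; then
            echo "$id $workdir"
        fi
    done
}

# Apply the chosen mode ($1) to each "id workdir" line on stdin.
handle() {
    while read -r id workdir; do
        case $1 in
            cancel) scancel "$id" ;;
            # TODO: custom follow-up, such as cancelling and resubmitting.
            custom) echo "TODO: custom action for job $id in $workdir" ;;
        esac
    done
}

# Run only when a mode (cancel or custom) is given as the first argument.
if [ $# -gt 0 ]; then
    list_running | find_timeouts | handle "$1"
fi
```

Saved under a hypothetical name such as resubmit.sh, running it with the argument cancel would scancel every matching job past the limit, while custom only reaches the TODO branch. Because find_timeouts reads plain lines from stdin, it can be checked against sample lines without touching the scheduler.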
Usage Examples
- Directly kill tasks running ≥ 6 h with job name Runscrip
- Detect first, then manually choose subsequent action
The script structure completely decouples "finding tasks & (optionally) canceling" from "subsequent custom operations".
To adapt it to your needs, add your own code in the # TODO sections.