56

In a Bash script, I would like to do something like:

app1 &
pidApp1=$!
app2 &
pidApp2=$!

timeout 60 wait $pidApp1 $pidApp2
kill -9 $pidApp1 $pidApp2

I.e., launch two applications in the background, and give them 60 seconds to complete their work. Then, if they don't finish within that interval, kill them.

Unfortunately, the above does not work, since timeout is an executable, while wait is a shell command. I tried changing it to:

timeout 60 bash -c wait $pidApp1 $pidApp2

But this still does not work, since wait can only be called on a PID launched within the same shell.

Any ideas?

imz -- Ivan Zakharyaschev
user1202136
  • Could you `sleep 60` instead? Not as efficient, but much simpler – Shahbaz Apr 05 '12 at 12:45
  • "60" has to be the maximum execution time. The actual runtime of the applications might be a lot lower. So no, it would be too inefficient for me. – user1202136 Apr 05 '12 at 12:51
  • If these programs really require you to use `kill -9`, they are broken. See also http://www.iki.fi/era/unix/award.html#kill – tripleee Aug 05 '15 at 10:15

6 Answers

66

Both your example and the accepted answer are overly complicated. Why not simply use timeout, since this is exactly its use case? The timeout command even has a built-in option (-k) that sends SIGKILL if the command is still running a given time after the initial signal (SIGTERM by default) was sent (see man timeout).

If the script doesn't need to wait and resume control flow afterwards, it's simply a matter of

timeout -k 60s 60s app1 &
timeout -k 60s 60s app2 &
# [...]

If it does, however, that's just as easy by saving the timeout PIDs instead:

pids=()
timeout -k 60s 60s app1 &
pids+=($!)
timeout -k 60s 60s app2 &
pids+=($!)
wait "${pids[@]}"
# [...]

E.g.

$ cat t.sh
#!/bin/bash

echo "$(date +%H:%M:%S): start"
pids=()
timeout 10 bash -c 'sleep 5; echo "$(date +%H:%M:%S): job 1 terminated successfully"' &
pids+=($!)
timeout 2 bash -c 'sleep 5; echo "$(date +%H:%M:%S): job 2 terminated successfully"' &
pids+=($!)
wait "${pids[@]}"
echo "$(date +%H:%M:%S): done waiting. both jobs terminated on their own or via timeout; resuming script"

Output:

$ ./t.sh
08:59:42: start
08:59:47: job 1 terminated successfully
08:59:47: done waiting. both jobs terminated on their own or via timeout; resuming script
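
The exit status that wait returns for each timeout process tells you whether that job finished or was cut off. A minimal sketch, assuming GNU timeout, which exits with 124 when the time limit is hit (the sleep and true commands are stand-ins for real applications):

```shell
#!/bin/bash
timeout 1 sleep 5 &   # stand-in for a job that is too slow
slow=$!
timeout 5 true &      # stand-in for a job that finishes in time
fast=$!

wait "$slow"; slow_status=$?
wait "$fast"; fast_status=$?

echo "slow job: $slow_status"   # 124: GNU timeout reports that the limit was hit
echo "fast job: $fast_status"   # 0: the command's own exit status
```

If the -k grace period also expires and SIGKILL has to be sent, GNU timeout reports 137 (128 + 9) instead of 124.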
Adrian Frühwirth
  • According to https://www.gnu.org/software/coreutils/manual/html_node/timeout-invocation.html, the `-k` should go before `60s` and in addition, you must specify a timeout for `-k`. So, for example, the first code example should be `timeout -k 60s 60s app1 &`. – pepan Dec 22 '14 at 15:07
  • @pepan Thanks, spot on! – Adrian Frühwirth Jan 08 '15 at 08:28
  • timeout does not appear to be on osx 10.12. – AnneTheAgile Sep 08 '17 at 23:47
  • @AnneTheAgile This was a Linux-specific question so that point is moot, especially since the question is built around `timeout` to begin with. However, it's easy to install GNU `timeout` on OSX/MacOS via e.g. [Homebrew](https://brew.sh/). – Adrian Frühwirth Dec 11 '17 at 11:16
  • "why do you not only use timeout since that is exactly its use case?" because they don't know how long a forked job should take, only that it should exit within N seconds after a certain point – jberryman Apr 06 '22 at 17:46
25

Write the PIDs to files and start the apps like this:

pidFile=...
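# Run the app in a subshell; remove the PID file once it has finished
( app ; rm "$pidFile" ; ) &
pid=$!
echo "$pid" > "$pidFile"
# Watchdog: after 60 s, kill the app if its PID file still exists
( sleep 60 ; if [[ -e $pidFile ]]; then killChildrenOf "$pid" ; fi ; ) &
killerPid=$!

wait "$pid"
kill "$killerPid"   # the app is done, so the watchdog is no longer needed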

That would create another process that sleeps for the timeout and kills the process if it hasn't completed so far.

If the process completes faster, the PID file is deleted and the killer process is terminated.

killChildrenOf is a script that lists all processes and kills every child of a given PID. See the answers to this question for different ways to implement this functionality: Best way to kill all child processes

If you want to step outside of Bash, you could write PIDs and timeouts into a directory and watch that directory. Every minute or so, read the entries and check which processes are still around and whether they have timed out.

EDIT If you want to know whether the process has actually died, you can use kill -0 $pid, which sends no signal and only tests whether the process still exists.
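
A minimal sketch of that check, with sleep standing in for the application (the variable names are made up for illustration):

```shell
#!/bin/bash
sleep 30 &            # stand-in for the application
pid=$!

# kill -0 delivers no signal; its exit status says whether the PID can be signalled
if kill -0 "$pid" 2>/dev/null; then before=running; else before=gone; fi

kill "$pid"
wait "$pid" 2>/dev/null   # reap the process so the PID really disappears

if kill -0 "$pid" 2>/dev/null; then after=running; else after=gone; fi
echo "before: $before, after: $after"
```

Keep in mind that kill -0 only tests the PID: if the number has been reused by an unrelated process in the meantime, the check still succeeds.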

EDIT2 Or you can try process groups. kevinarpe said: To get PGID for a PID(146322):

ps -fjww -p 146322 | tail -n 1 | awk '{ print $4 }'

In my case: 145974. Then PGID can be used with a special option of kill to terminate all processes in a group: kill -- -145974
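
Inside a script, background jobs normally share the script's own process group, so kill -- -PGID would take the script down as well. A minimal sketch of one way around this, assuming Bash job control is switched on with set -m so that each background job gets its own process group whose PGID equals $! (the bash -c command is a stand-in for an app that spawns a child):

```shell
#!/bin/bash
set -m                            # job control: one process group per background job

bash -c 'sleep 30 & sleep 30' &   # stand-in for an app with a child process
pid=$!                            # with set -m this is also the job's PGID

sleep 1                           # pretend the timeout expired
kill -- -"$pid"                   # negative PID: signal every process in the group
wait "$pid"; status=$?
echo "status: $status"            # 143 = 128 + 15 (SIGTERM)
```

Because the PGID is known up front, no ps lookup is needed in this variant.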

Aaron Digulla
  • This does not work. `wait` requires a pid to be a child of the current shell. I get the following error: "wait.sh: line 2: wait: pid 22603 is not a child of this shell". – user1202136 Apr 05 '12 at 12:54
  • I just noticed one problem with my approach: this only kills the shell that runs the app. Get a process list and look for a process with $pid as parent PID; that should be the app. – Aaron Digulla Apr 05 '12 at 13:00
  • Works like a charm. Thanks! I somewhat simplified the inner part: `( sleep 60 ; kill -9 $pids ) & killerPid=$!`. I think it is sufficient for my purpose. – user1202136 Apr 05 '12 at 13:08
  • Killing the parent does not always affect the children. If your app hangs (`kill` doesn't work while `kill -9` does), you will get zombie processes. – Aaron Digulla Apr 05 '12 at 13:11
  • This should help: http://stackoverflow.com/questions/392022/best-way-to-kill-all-child-processes – Aaron Digulla Apr 05 '12 at 13:12
  • [You shouldn't `kill -9`](http://mywiki.wooledge.org/ProcessManagement#I.27m_trying_to_kill_-9_my_job_but_blah_blah_blah...) – l0b0 Apr 05 '12 at 13:42
  • I would add a `sleep 5` or something after the `killChildrenOf $pid` to ensure that the `kill $killerPid` really does kill the process you think it's killing. – crudcore Feb 28 '13 at 18:47
  • @histumness: You can do that or try `kill -0 $pid` to check whether the process is still there. – Aaron Digulla Mar 01 '13 at 12:47
  • `I just noticed one problem with my approach...`: Did you consider a technique to use process group? Here is one message method to get PGID for a PID(146322): `ps -fjww -p 146322 | tail -n 1 | awk '{ print $4 }'`. (In my case: Outputs `145974`) Then PGID can be used with a special mode of kill to terminate all processes in a group: `kill -- -145974` – kevinarpe May 08 '17 at 09:07
  • A similar solution is suggested right in the question https://stackoverflow.com/q/50975596/94687 . (And I find your idea quite elegant and clever; I couldn't come up with this myself.) – imz -- Ivan Zakharyaschev Feb 09 '22 at 17:59
6

Here's a simplified version of Aaron Digulla's answer, which uses the kill -0 trick that Aaron Digulla leaves in a comment:

app &
pidApp=$!
( sleep 60 ; echo 'timeout'; kill $pidApp ) &
killerPid=$!

wait $pidApp
kill -0 $killerPid && kill $killerPid

In my case, I wanted to be both set -e -x safe and return the status code, so I used:

set -e -x
app &
pidApp=$!
( sleep 45 ; echo 'timeout'; kill $pidApp ) &
killerPid=$!

# Capture the status via || so that set -e does not abort before the cleanup
status=0
wait $pidApp || status=$?
(kill -0 $killerPid && kill $killerPid) || true

exit $status

An exit status of 143 indicates SIGTERM, almost certainly from our timeout.
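
The same watchdog pattern extends to the two applications from the question. A minimal sketch, with sleep commands standing in for app1 and app2 and a 2-second limit chosen only to keep the demo short:

```shell
#!/bin/bash
sleep 1 &  pidApp1=$!      # stand-in for app1: finishes in time
sleep 30 & pidApp2=$!      # stand-in for app2: runs too long

# Watchdog: after the limit, send SIGTERM to whatever is still running
( sleep 2; kill "$pidApp1" "$pidApp2" 2>/dev/null ) &
killerPid=$!

wait "$pidApp1"; status1=$?
wait "$pidApp2"; status2=$?
kill "$killerPid" 2>/dev/null   # stop the watchdog if it is still sleeping

echo "app1: $status1, app2: $status2"
```

Here app1 reports its own exit status, while app2 reports 143 because the watchdog terminated it.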

Bryan Larsen
  • A similar solution is suggested right in the question https://stackoverflow.com/q/50975596/94687 . (And I find this kind of solution to the problem quite elegant and clever; I couldn't come up with this myself.) – imz -- Ivan Zakharyaschev Feb 09 '22 at 17:59
1

I wrote a Bash function that waits until the given PIDs have finished or a timeout expires. On timeout it returns non-zero and prints the PIDs that have not finished.

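function wait_timeout {
  local limit=$1          # maximum number of seconds to wait
  local pids=("${@:2}")   # PIDs to watch
  local count=0 pid
  while true
  do
    local remaining=()
    for pid in "${pids[@]}"; do
      # kill -0 sends no signal; it only tests whether the PID still exists
      if kill -0 "$pid" &>/dev/null; then
        remaining+=("$pid")
      fi
    done
    pids=("${remaining[@]}")
    if (( ${#pids[@]} == 0 )); then
      return 0            # every process has finished
    elif (( count < limit )); then
      count=$(( count + 1 ))
      sleep 1
    else
      echo "${pids[@]}"   # timed out: print the PIDs still running
      return 1
    fi
  done
}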

To use it, just call wait_timeout $timeout $PID1 $PID2 ...

JonatasTeixeira
1

To put in my 2c, we can boil down Teixeira's solution to:

try_wait() {
    # Usage: [PID]...
    for ((i = 0; i < $#; i += 1)); do
        kill -0 $@ && sleep 0.001 || return 0
    done
    return 1 # timeout or no PIDs
} &>/dev/null

GNU sleep accepts fractional seconds, and 0.001 s = 1 ms (a 1 kHz polling rate) is plenty of time. However, UNIX has no loopholes when it comes to files and processes. try_wait accomplishes very little.

$ cat &
[1] 16574
$ try_wait %1 && echo 'exited' || echo 'timeout'
timeout
$ kill %1
$ try_wait %1 && echo 'exited' || echo 'timeout'
exited

We have to answer some hard questions to get further.

Why does wait have no timeout parameter? Maybe because the timeout, kill -0, wait and wait -n commands can tell the machine more precisely what we want.

Why is wait a Bash builtin in the first place, so that timeout wait PID does not work? Maybe only so Bash can implement proper signal handling.

Consider:

$ timeout 30s cat &
[1] 6680
$ jobs
[1]+    Running   timeout 30s cat &
$ kill -0 %1 && echo 'running'
running
$ # now meditate a bit and then...
$ kill -0 %1 && echo 'running' || echo 'vanished'
bash: kill: (NNN) - No such process
vanished

Whether in the material world or in machines, as we require some ground on which to run, we require some ground on which to wait too.

  • When kill fails you hardly know why. Unless you wrote the process, or its manual names the circumstances, there is no way to determine a reasonable timeout value.

  • When you have written the process, you can implement a proper TERM handler or even respond to "Auf Wiedersehen!" sent to it through a named pipe. Then you have some ground even for a spell like try_wait :-)
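
If you control the process, a TERM handler lets it shut down cleanly when the timeout fires. A minimal sketch, using a subshell as a stand-in for such a process; the exit status 42 is an arbitrary value picked for illustration:

```shell
#!/bin/bash
(
  trap 'exit 42' TERM            # hypothetical handler: tidy up, then exit with a known status
  while true; do sleep 0.1; done # stand-in for the real work
) &
pid=$!

sleep 0.5                        # give the worker time to install its handler
kill "$pid"                      # plain kill sends SIGTERM
wait "$pid"; status=$?
echo "worker exited with $status"
```

Because the worker chooses its own exit status, the caller can tell a clean shutdown apart from a process that was simply killed.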

Andreas Spindler
0
app1 &
app2 &
sleep 60 &

wait -n
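
Note that wait -n (Bash 4.3 or newer) returns as soon as any one of the three jobs exits, so a single call does not wait for both applications. A minimal sketch of a loop around it, with sleep commands as stand-ins and a 3-second timer chosen only to keep the demo short:

```shell
#!/bin/bash
sleep 1 &  pidApp1=$!      # stand-in for app1
sleep 30 & pidApp2=$!      # stand-in for app2
sleep 3 &  pidTimer=$!     # the timeout

remaining=2                # applications still running
timed_out=false
while (( remaining > 0 )); do
  wait -n                  # returns when any one background job exits
  if ! kill -0 "$pidTimer" 2>/dev/null; then
    timed_out=true         # the job that exited was the timer
    break
  fi
  remaining=$(( remaining - 1 ))
done

# Clean up whatever is left: overdue applications or the unused timer
kill "$pidApp1" "$pidApp2" "$pidTimer" 2>/dev/null
wait "$pidApp2"; status2=$?
echo "timed out: $timed_out, app2: $status2"
```

The loop ends either when both applications have exited or when the timer has, whichever comes first.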
user1931823