4

Imagine you have the resources for a job that consists of part 1 and part 2.

Part 2 depends on part 1, but both take considerable time.

Also, part 2 needs the output of part 1 only at the very end.

Would you start them both in parallel and let part 2 somehow wait for part 1 to be ready? (The chances are about 50/50 that part 1 will be ready by then.)

Ta Mu

2 Answers

4

The critical question is: can you teach job #2 to wait patiently for job #1 to complete when it reaches that dependency point, and then proceed or bail depending on job #1's result?

If you can't, then the answer is no: you'll have more trouble dealing with job #2 failures caused by an incomplete job #1, which adds too much noise to the process.

If you can (and if you have enough resources), then yes, by all means: your process will be faster. You might not even need special support from the CI system for such dependencies. It could be completely oblivious to them and would simply see the 2 jobs as parallel.
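
As a rough illustration of the "wait, then proceed or bail" behaviour, here is a minimal sketch using threads in a single Python process. Everything in it is an assumption made for the example: the names `part1`, `part2` and `run_pipeline`, the sleep durations, and the timeout. In a real CI setup the two jobs would usually be separate processes or agents, and the wait would more likely be a poll on an artifact or a job status.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def part1():
    """Hypothetical long-running job #1 that produces something job #2 needs."""
    time.sleep(0.2)
    return "artifact-from-part-1"


def part2(part1_future, wait_timeout=5.0):
    """Hypothetical job #2: works independently, then waits for job #1."""
    time.sleep(0.1)  # the bulk of the work, which does not need job #1
    try:
        # Dependency point: block until job #1 is done or the timeout expires.
        artifact = part1_future.result(timeout=wait_timeout)
    except FutureTimeout:
        return "part 2 bailed: part 1 not ready in time"
    except Exception as exc:  # job #1 itself failed
        return f"part 2 bailed: part 1 failed ({exc})"
    return f"part 2 finished using {artifact}"


def run_pipeline(part1_fn=part1, wait_timeout=5.0):
    """Start both parts at once and return the outcome of part 2."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        f1 = pool.submit(part1_fn)
        f2 = pool.submit(part2, f1, wait_timeout)
        return f2.result()


if __name__ == "__main__":
    print(run_pipeline())
```

The point of the sketch is that job #2 reports a clear "bailed" outcome instead of failing in some confusing way when job #1 is late or broken, which is what keeps the noise down.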

A middle-ground possibility, if your CI system supports it, would be to stagger the jobs: start job #2 with a delay sufficient to bring the chances of job #1 completing before job #2 needs it into a much more comfortable range (say 99%). The delay could be simply eyeballed or determined empirically, and either fixed or, in a more advanced approach, derived automatically from historical measurements/stats (if available).
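
One possible way to derive such a delay from past runs is sketched below. It is a deliberately conservative heuristic, not something a particular CI system provides: it takes a high percentile of job #1's historical durations and subtracts the shortest time job #2 has ever taken to reach its dependency point. The function names and the sample numbers are made up for illustration.

```python
import math


def percentile(samples, fraction):
    """Nearest-rank percentile: the smallest sample that has at least
    `fraction` of all samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(fraction * len(ordered)))
    return ordered[rank - 1]


def stagger_delay(part1_durations, part2_time_to_dependency, confidence=0.99):
    """Delay (in the same unit as the inputs) to apply before starting job #2.

    part1_durations: historical total run times of job #1.
    part2_time_to_dependency: historical times job #2 took to reach the
        point where it needs job #1's result.
    """
    slow_part1 = percentile(part1_durations, confidence)
    fast_part2 = min(part2_time_to_dependency)
    # No delay is needed if even a slow job #1 beats a fast job #2.
    return max(0.0, slow_part1 - fast_part2)


if __name__ == "__main__":
    # Hypothetical measurements, in seconds.
    history_part1 = [600, 620, 640, 700, 900]
    history_part2 = [500, 550, 580]
    print(stagger_delay(history_part1, history_part2))
```

With only a handful of samples the 99th percentile is simply the slowest run seen so far, so the estimate improves as more history accumulates.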

Dan Cornilescu
1

Yes, you should definitely run both parts in parallel if you have the resources. The job as a whole then takes only as long as its slowest part, which puts the emphasis on optimizing that part. The optimization might be something as simple as running some of part 1's processes ahead of time, before part 2 even begins.

Preston Martin