
I am writing a C++ OpenCV-based computer vision program. The basic idea of the program could be described as follows:

  1. Read an image from a camera.

  2. Do some magic to the image.

  3. Display the transformed image.

The implementation of the core logic of the program (step 2) boils down to a sequence of roughly 50 OpenCV function calls for image processing. Some temporary image objects are created to store intermediate results, but, apart from that, no additional entities are created. The functions from step 2 are used only once.

I am confused about organising this type of code (which feels more like a script). I used to create several classes for each logical step of the image processing. Say, here I could create 3 classes like ImagePreprocessor, ImageProcessor, and ImagePostprocessor, and split the above-mentioned 50 OpenCV calls and temporary images correspondingly between them. But it doesn't feel like a reasonable OOP design. The classes would be nothing more than a way to store the function calls.

The main() function would still just create a single object of each class and call their methods in sequence:

image_preprocessor.do_magic(img);
image_processor.do_magic(img);
image_postprocessor.do_magic(img);

Which is, to my impression, essentially the same thing as calling 50 OpenCV functions one by one.

I am starting to question whether this type of code requires an OOP design at all. After all, I can simply provide a function do_magic(), or three functions preprocess(), process(), and postprocess(). But this approach doesn't feel like good practice either: it is still just a pile of function calls, merely split across a few functions.
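For illustration, the non-OOP variant would be just something like this (the function names are placeholders; the real bodies are the ~50 OpenCV calls):

    void preprocess(cv::Mat& img)
    {
        cv::Mat tmp;  // temporary image for an intermediate result
        cv::GaussianBlur(img, tmp, cv::Size(5, 5), 0);
        cv::cvtColor(tmp, img, cv::COLOR_BGR2GRAY);
        // ... more OpenCV calls ...
    }

    void process(cv::Mat& img)     { /* ... */ }
    void postprocess(cv::Mat& img) { /* ... */ }

    void do_magic(cv::Mat& img)
    {
        preprocess(img);
        process(img);
        postprocess(img);
    }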

I wonder, are there some common practices for organising this script-like kind of code? And what would be the approach if this code were part of a large OOP system?

kazarey
  • Ok, I am giving a partial answer to this, because a complete solution will require a lot of work. For the video/image reader, create an abstract class. Whether you process video or images, processing is done one frame at a time, so create an abstract class (interface) ImageProcessor. There is no Pre and Post. For example, retinex processing is used as post-processing, but some applications can use it as pre-processing. A retinex processing class will implement ImageProcessor. – saurabheights Feb 26 '17 at 21:14
  • Next, how do you make a pipeline? Make a Pipeline class containing a linked list of PipelineElement objects. Each element will hold its ImageProcessor and the next PipelineElement object. The PipelineElement will pass a packet (of all data) to the ImageProcessor and receive the updated packet. The updated packet will be passed to the next element of the pipeline. Note that this pipeline is a serial pipeline, i.e. you cannot have two elements processing the same packet at the same time, but you can further generalize the linked list. – saurabheights Feb 26 '17 at 21:17
  • Each element can now process in parallel. The first element will process the nth frame, the second element the (n-1)th frame, and so on; but with this, issues such as pipeline bottlenecks and additional delays due to less multithreading in each element can cause severe real-time lag. Have been there, telling from experience :D. Overall, best of luck; this design is very good but requires very high expertise. On the other hand, each module can be easily added to or removed from the pipeline, just like working on a linked list. – saurabheights Feb 26 '17 at 21:19
  • Also, you can work on one ImageProcessor implementation internally without any need for changes to the whole pipeline. Finally, don't use Mat as the only data that is passed. Use a class like PipeLinePacket to store any and all information that should be passed. This will also allow you to store the results of processing done in an earlier ImageProcessor in the pipeline and let a later ImageProcessor use them. – saurabheights Feb 26 '17 at 21:24
  • Caveat Emptor!! :P and looks like I gave almost the full solution. Will add it as an answer later. – saurabheights Feb 26 '17 at 21:25
  • @saurabheights Please do. I definitely didn't come up with anything like this design, and it sounds like a solid solution. I need to thoroughly think it through. Is it a design from your experience or is it some commonly known design pattern? – kazarey Feb 26 '17 at 21:43
  • This pattern is followed by my previous organization. I can't provide you complete code since it is proprietary. I will add the solution in a day in a bit more structured manner. Not sure, but it uses the Strategy design pattern at the ImageProcessor level. However, this design is not (I think) a software design technique, but an architectural pattern to couple a pipeline with image processing modules. Of course, the coupling may also have a pattern name, but I don't think it's a software design pattern. For more, see: http://www.nyu.edu/classes/jcf/g22.2440-001_sp06/slides/session8/g22_2440_001_c82.pdf – saurabheights Feb 26 '17 at 22:34

2 Answers


This structure lends itself to the pipes and filters architecture (see Pattern-Oriented Software Architecture Volume 1: A System of Patterns by Frank Buschmann):

The Pipes and Filters architectural pattern provides a structure for systems that process a stream of data. Each processing step is encapsulated in a filter component. Data is passed through pipes between adjacent filters. Recombining filters allows you to build families of related systems.

See also this short description (with images) from the Enterprise Integration Patterns book.
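As a minimal C++ sketch of the idea (the filter names are the hypothetical ones from the question), each filter encapsulates one processing step and the "pipe" just forwards the data:

    #include <functional>
    #include <vector>
    #include <opencv2/core.hpp>

    using Filter = std::function<void(cv::Mat&)>;  // one encapsulated processing step

    void run_filters(const std::vector<Filter>& filters, cv::Mat& frame)
    {
        for (const auto& filter : filters)
            filter(frame);  // each filter transforms the frame in place
    }

Recombining filters then builds a family of related systems, e.g. run_filters({preprocess, process, postprocess}, img) versus some other ordering or subset.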

PsiX

Usually, in image processing, you have a pipeline of various image processing modules. The same applies to video processing, where each image is processed according to its timestamp order in the video.

Constraints to consider before designing such a pipeline:

  1. The order of execution of these modules is not always the same. Thus, the pipeline should be easily configurable.
  2. All modules of the pipeline should be executable in parallel with each other.
  3. Each module of the pipeline may also have a multithreaded operation. (Out of the scope of this answer, but a good idea when a single module becomes the bottleneck of the pipeline.)
  4. Each module should easily adhere to the design and have the flexibility of internal implementation changes without affecting other modules.
  5. The benefits of the processing of a frame by one module should be available to later modules.

Proposed Design

Video Pipeline

A video pipeline is a collection of modules. For now, assume a module is a class whose process method is called with some data. How each module is executed will depend on how such modules are stored in the VideoPipeline! To explain further, see the two categories below:

Here, let's say we have modules A, B, and C, which always execute in the same order. We will discuss the solution with a video of frames 1, 2 and 3.

a. Linked List: In a single-threaded application, frame 1 is first processed by A, then B, and then C. The process is repeated for the next frame and so on. So a linked list seems like an excellent choice for the single-threaded application.

For a multi-threaded application, speed is what matters. So, of course, you would want all your modules running on, say, a 128-core machine. This is where the Pipeline class comes into play. If each Pipeline object runs in a separate thread, the whole application, which may have 10 or 20 modules, starts running multithreaded. Note that the single-threaded/multithreaded approach can be made configurable.

b. Directed Acyclic Graph: The above linked-list implementation can be further improved when you have high processing power and want to reduce the lag between input and the response time of the pipeline. Such a case is when module C does not depend on B, but on A. In that case, any frame can be processed in parallel by module B and module C using a DAG-based implementation. However, I wouldn't recommend this, as the benefits are not so great compared to the increased complexity: the output from modules B and C then needs further management by, say, a module D that depends on B or C or both. The number of scenarios increases.

Thus, for simplicity's sake, let's use the linked-list-based design.

Pipeline

  1. Create a linked list of PipelineElements.
  2. Make the pipeline's process method call the process method of the first element.

PipelineElement

  1. First, the PipelineElement processes the information by calling its ImageProcessor (read below). The PipelineElement will pass a Packet (of all data, read below) to the ImageProcessor and receive the updated packet.
  2. If the next element is not null, call the next PipelineElement's process method and pass the updated packet.
  3. If the next element of a PipelineElement is null, stop. This element is special, as it has an Observer object. All other PipelineElements will have their Observer field set to null.
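A single-threaded C++ sketch of these classes (the member names are mine, chosen to match the description; Packet and ImageProcessor are detailed further below):

    #include <memory>

    struct Packet;  // all the data travelling through the pipeline, see below

    class ImageProcessor {
    public:
        virtual ~ImageProcessor() = default;
        virtual void process(Packet& packet) = 0;  // update the packet in place
    };

    class PipelineElement {
    public:
        PipelineElement(std::unique_ptr<ImageProcessor> processor,
                        std::unique_ptr<PipelineElement> next = nullptr)
            : processor_(std::move(processor)), next_(std::move(next)) {}
        virtual ~PipelineElement() = default;

        virtual void process(Packet& packet)
        {
            processor_->process(packet);  // 1. the processor updates the packet
            if (next_)
                next_->process(packet);   // 2. pass it to the next element, or stop
        }

    protected:
        PipelineElement() = default;      // for special elements such as the Observer below

    private:
        std::unique_ptr<ImageProcessor> processor_;
        std::unique_ptr<PipelineElement> next_;
    };

    class Pipeline {
    public:
        explicit Pipeline(std::unique_ptr<PipelineElement> head) : head_(std::move(head)) {}
        void process(Packet& packet) { head_->process(packet); }
    private:
        std::unique_ptr<PipelineElement> head_;
    };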

FrameReader (VideoReader/ImageReader)

For the video/image reader, create an abstract class. Whether you process a video, an image, or several of them, processing is done one frame at a time, so create an abstract class (interface) ImageProcessor.

  1. A FrameReader object stores a reference to the pipeline.
  2. For each frame, it pushes the information in by calling the process method of the Pipeline.
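A sketch of a concrete camera-backed reader (the abstract class would extract the frame source; this assumes the Packet below carries the image in a frame member):

    #include <opencv2/videoio.hpp>

    class FrameReader {
    public:
        explicit FrameReader(Pipeline& pipeline) : pipeline_(pipeline) {}

        void run(int camera_index = 0)
        {
            cv::VideoCapture capture(camera_index);
            cv::Mat frame;
            while (capture.read(frame)) {    // one frame at a time
                Packet packet;
                packet.frame = frame.clone();
                pipeline_.process(packet);   // push the frame into the pipeline
            }
        }

    private:
        Pipeline& pipeline_;
    };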

ImageProcessor

There is no Pre and Post ImageProcessor. For example, retinex processing is used as post-processing, but some applications can use it as pre-processing. A retinex processing class will simply implement ImageProcessor. Each PipelineElement will hold its ImageProcessor and the next PipelineElement object.
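For instance, a rough single-scale retinex module (heavily simplified, just to show the shape of a concrete processor) plugs in like any other:

    #include <opencv2/imgproc.hpp>

    class RetinexProcessor : public ImageProcessor {
    public:
        void process(Packet& packet) override
        {
            cv::Mat f, blurred, log_f, log_b, result;
            packet.frame.convertTo(f, CV_32F, 1.0 / 255.0, 1.0);  // +1 avoids log(0)
            cv::GaussianBlur(f, blurred, cv::Size(0, 0), 30.0);   // illumination estimate
            cv::log(f, log_f);
            cv::log(blurred, log_b);
            result = log_f - log_b;                               // reflectance estimate
            cv::normalize(result, result, 0, 255, cv::NORM_MINMAX);
            result.convertTo(packet.frame, CV_8U);
        }
    };

Whether this runs at the start or the end of the pipeline is purely a matter of where its PipelineElement sits in the list.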

Observer
A special class which extends PipelineElement and provides meaningful output via a GUI or disk.
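Sketched against the PipelineElement above (a GUI variant; a disk writer would look the same):

    #include <opencv2/highgui.hpp>

    class DisplayObserver : public PipelineElement {
    public:
        void process(Packet& packet) override
        {
            cv::imshow("pipeline output", packet.frame);  // meaningful output: a window
            cv::waitKey(1);                               // give HighGUI time to draw
        }
    };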

Multithreading
1. Make each PipelineElement's process method run in its own thread.
2. Each thread will poll messages from a BlockingQueue (of small size, like 2-3 frames) that acts as a buffer between two PipelineElements. Note: the queue helps average out the speed of each module. Thus, small jitters (a module taking too long on a frame) do not affect the video output rate, which provides smooth playback.
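A minimal bounded BlockingQueue for this, built on the standard library (nothing OpenCV-specific):

    #include <condition_variable>
    #include <mutex>
    #include <queue>

    template <typename T>
    class BlockingQueue {
    public:
        explicit BlockingQueue(std::size_t capacity) : capacity_(capacity) {}

        void push(T item)  // blocks while the buffer is full
        {
            std::unique_lock<std::mutex> lock(mutex_);
            not_full_.wait(lock, [this] { return queue_.size() < capacity_; });
            queue_.push(std::move(item));
            not_empty_.notify_one();
        }

        T pop()  // blocks while the buffer is empty
        {
            std::unique_lock<std::mutex> lock(mutex_);
            not_empty_.wait(lock, [this] { return !queue_.empty(); });
            T item = std::move(queue_.front());
            queue_.pop();
            not_full_.notify_one();
            return item;
        }

    private:
        std::size_t capacity_;
        std::queue<T> queue_;
        std::mutex mutex_;
        std::condition_variable not_empty_, not_full_;
    };

Each PipelineElement thread then loops: pop a Packet from its input queue, process it, and push it to its output queue.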

Packet
A packet will store all the information, such as the input or a Configuration class object. This way you can store intermediate calculations as well as observe in real time the effect of changing an algorithm's configuration via a Configuration Manager.
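A sketch with hypothetical field names (the extras map is one way to hold the intermediate results that later modules may reuse):

    #include <map>
    #include <string>
    #include <opencv2/core.hpp>

    struct Packet {
        cv::Mat frame;                          // the image currently being processed
        std::map<std::string, cv::Mat> extras;  // named intermediate results,
                                                // e.g. extras["FACE_ROI"]
        // Configuration config;                // hypothetical: live algorithm parameters
    };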

To conclude, each element can now process in parallel. The first element will process the nth frame, the second element the (n-1)th frame, and so on; but with this, a lot more issues, such as pipeline bottlenecks and additional delays due to less core power being available to each element, will pop up.

saurabheights
  • I had a bit of trouble with the Packet class. The input and output of an ImageProcessor may be of various classes (`Mat`, `Point`, `vector`, `Mat` **and** `vector` etc). I ended up having just a `map` inside the Packet, where `ICustomType` is a base class for a template class which can contain a value of any type and `string` is just the name of a desired object (e.g. "TARGET_IMAGE") which I look up in an ImageProcessor. This is simple enough for the scope of my project, but do you have some other suggestions on the universality of the Packet class design? – kazarey Mar 04 '17 at 10:30
  • Actually, this is not such a generic approach, since what information needs to be reused between modules will depend on those modules. You can try adding another class to hold the packet for each module, such as a RetinexPacket that inherits Packet, but you will again need a function to map values from the Packet object to the RetinexPacket object on input and from RetinexPacket back to Packet when giving output. This will reduce the amount of data needed in the Packet class and make it shorter, but it will be too much work. – saurabheights Mar 04 '17 at 11:27
  • Other benefits will be making each module independent and keeping their interfaces clean. For example, you can have RetinexInputPacket and RetinexOutputPacket. One can easily check RetinexInputPacket to see what information is required to start. However, we never had to go beyond this approach, as it kept our interface clean, and data passing was simple and efficient. Also, each module is independent. Note that the Packet, Pipeline, and PipelineElement classes will go in a separate library, and any processing module library, such as retinex processing, will depend on this library. – saurabheights Mar 04 '17 at 11:30
  • True, the solution with `map` is not generic and occasionally leads to some hardcoding (the names of those stored objects), but it felt like a reasonable trade-off for not over-engineering a comparatively simple system. On the other hand, I don't really see a way to achieve complete universality of the interfaces without depending on knowing the context of the pipeline, even with the suggested Packet and ConcretePacket approach. The mentioned Packet mapping function would still need to know the pipeline structure. – kazarey Mar 04 '17 at 13:20
  • Consider: you need an img of the edges of a human face. You have cached a `Mat` with the source img from a webcam in a Packet. Then you assemble a Pipeline such as (1)FaceDetector->(2)EdgeDetector. After (1), you also cache a `Mat` with the detected face. But, if universal, step (2) takes just a Mat for an input, and you need to tell it whether to take the source img or the detected face img. I don't see a way to do so without introducing some context-aware Observer (say, `Algorithm`), because the Packet itself has no means to do the mapping on its own. – kazarey Mar 04 '17 at 13:22
  • The other way could be restricting a `PipelineStep` to take an input only of the type which the output of the previous `PipelineStep` had, and to write the `ImageProcessors` correspondingly, but, in my hands, this approach grew clumsy very, very rapidly. – kazarey Mar 04 '17 at 13:25
  • Come to think of it, maybe it's OK to instruct each `PipelineElement` from which cell of the data cache (in `Packet`) it should read and to which cell it should write (a bit like in assembly programming), provided we know the number and the order of their input/output arguments. Not very different from my hardcoded solution, but it removes the hardcoding from the `ImageProcessors` themselves. – kazarey Mar 04 '17 at 14:45