2

I have a folder full of images that contain some metadata in the pixels, which I'm trying to extract with my program. The program loops through each file, extracting the data. The problem is that some of the images are corrupted, and trying to extract data from them results in a segmentation fault.

Is there any way to catch this SIGSEGV signal and use it to ignore the current corrupted image and continue to the next one?

A try-catch block will not help, as this is a signal, not an exception.

int Status::ExtractImageMetaData(cv::Mat src_in, int Index)
{
    unsigned char *Image = (unsigned char *)(src_in.data);
    unsigned char *StructByte = (unsigned char*) &ImageMetaData[Index];

    uint32_t MetaOffset = (src_in.total() * src_in.elemSize()) - sizeof(ImageMetaData_t);

    for(unsigned int i = 0; i < sizeof(ImageMetaData_t); i++)
    {
        StructByte[i] = Image[i + MetaOffset]; // SEGMENTATION FAULT HERE!
    }

    return 0;
}





int main(int argc, char const *argv[])
{

   if (imagePath == NULL){
        imagePath = "/Pictures/metadataExtractionTest/sqlTest/";
    }


// Iterate over all files in the "imagePath"
    for (const auto & p : fs::directory_iterator(imagePath)){

        string FullImageDir = p.path();
        Mat Img = imread(FullImageDir.c_str());

        int MetaStatus = SystemStatus.ExtractImageMetaData(Img, 0);     


        printf("done\n");

    } 


    return 0;

}
eirikaso
  • 2
    You shouldn't catch `SIGSEGV`. You should **_prevent_** it. If you have out of bounds array access, then check array size before you access it. Check for `null` as well. – Yksisarvinen Nov 07 '18 at 09:37
  • 1
    Suggestions to cope with a segfault notwithstanding, I must second @Yksisarvinen: the segfault probably occurs because you blindly accept data from the files without checking it. Properly vetting the data for plausibility and against specification and implementation limits will (1) probably allow you to tolerate some file corruption and still extract some image or metadata, and (2) make your program more secure. The latter matters more if you run it on third-party sources or even distribute it, because maliciously crafted files could actually attack your computer. – Peter - Reinstate Monica Nov 07 '18 at 12:37
  • 1
    OK guys. Followed your advice and checked for columns and rows in the input images before trying to extract metadata from them. That did the trick – eirikaso Nov 07 '18 at 12:42
  • Always nice to find the 'right' solution - i.e. one that fixes the underlying and properly understood problem. – Rags Nov 07 '18 at 13:15

3 Answers

3

An alternative solution that might be worth looking into (this assumes you are on Linux or some other Unix variant):

Before processing each image, fork the process into two, one parent and one child process (which will otherwise be identical, look at the man page for fork(2)). Let the child do the risky stuff that might crash while the parent waits for it to terminate. If the child crashes, the parent can detect this and is still in a good state, so it can move on to the next image. If the child succeeds, you will need a way to transfer the result from the child to the parent, perhaps by using a pipe to send data. By opening the pipe before the fork call, both processes can connect using the same pipe.
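The fork-and-pipe approach described above could look roughly like the sketch below. It is only an illustration under stated assumptions: `risky_extract` and `run_isolated` are hypothetical names, the "metadata" is reduced to a single `uint32_t`, and the crash is simulated with `raise(SIGSEGV)` rather than a real out-of-bounds read. Only POSIX calls (`pipe`, `fork`, `read`, `write`, `waitpid`) are used, and `std::optional` requires C++17.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <csignal>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the risky extraction step.
// A "corrupted" input is simulated by raising SIGSEGV in the calling process.
static std::uint32_t risky_extract(bool corrupted) {
    if (corrupted) {
        std::raise(SIGSEGV);  // default action: terminate the process
    }
    return 42;  // placeholder result
}

// Runs risky_extract in a child process and returns its result,
// or std::nullopt if the child crashed or anything else went wrong.
std::optional<std::uint32_t> run_isolated(bool corrupted) {
    int fds[2];
    if (pipe(fds) == -1) return std::nullopt;  // open the pipe before fork

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return std::nullopt;
    }

    if (pid == 0) {
        // Child: do the risky work and send the result through the pipe.
        close(fds[0]);
        std::uint32_t value = risky_extract(corrupted);
        ssize_t written = write(fds[1], &value, sizeof value);
        // _exit avoids flushing stdio buffers inherited from the parent.
        _exit(written == static_cast<ssize_t>(sizeof value) ? 0 : 1);
    }

    // Parent: read the result (if any), then check how the child ended.
    close(fds[1]);
    std::uint32_t value = 0;
    ssize_t got = read(fds[0], &value, sizeof value);
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) == -1) return std::nullopt;

    bool ok = WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
              got == static_cast<ssize_t>(sizeof value);
    if (!ok) return std::nullopt;  // WIFSIGNALED(status) is true after a crash
    return value;
}
```

In the loop over the directory, the parent would call something like `run_isolated` once per image and skip the file whenever no value comes back. Note that this only contains a crash; it does not make a wrong read that happens not to crash any safer.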

As has already been suggested, fixing the crashes is probably the best solution, but sometimes that is not possible and you have to get "creative" instead.

  • That is a very bad idea! A segfault is the lucky outcome when you write "somewhere" into memory. If the faulty access hits an address that is allowed but not intended, anything can happen. In fact, the code is broken by design. "Risky" means "running wrong code", and accepting that "anything can happen" is not a solution at all! – Klaus Nov 07 '18 at 11:56
2

Edited: SIGSEGV can't be caught as an exception with a try-catch block.

This page: Catch SIGSEV addresses the question.

Rags
1

You should never "ignore SIGSEGV and continue execution" at all.

If your program accidentally writes to some unknown memory address, the behavior is undefined. If the hardware detects the access to a disallowed memory region, you are lucky! The bad case is when the access goes through without a segfault: afterwards your memory is corrupted, so you can no longer rely on your code or its execution.

I believe the only thing you can do is write correct code! That means checking for out-of-bounds access, perhaps simply by using STL containers with checked access such as std::vector::at.
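As a minimal sketch of such a check, assuming the metadata sits in the last `sizeof(ImageMetaData_t)` bytes of the pixel buffer as in the question: the struct layout and the function name `extract_metadata` below are hypothetical, and a `std::vector<unsigned char>` stands in for `src_in.data` together with its size `src_in.total() * src_in.elemSize()`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical metadata layout; the real ImageMetaData_t is not shown in the question.
struct ImageMetaData_t {
    std::uint32_t frame;
    std::uint32_t exposure;
};

// Copies the trailing sizeof(ImageMetaData_t) bytes of the pixel buffer into out.
// Returns false instead of reading out of bounds when the buffer is too small,
// which covers the empty buffer produced by an image that failed to load.
bool extract_metadata(const std::vector<unsigned char>& pixels, ImageMetaData_t& out) {
    if (pixels.size() < sizeof(ImageMetaData_t)) {
        return false;  // the unsigned subtraction below would wrap around
    }
    const std::size_t offset = pixels.size() - sizeof(ImageMetaData_t);
    std::memcpy(&out, pixels.data() + offset, sizeof(ImageMetaData_t));
    return true;
}
```

With OpenCV, the same idea would mean testing `src_in.empty()` and comparing the byte count against `sizeof(ImageMetaData_t)` before computing `MetaOffset`, so the loop can skip the file and move on to the next image.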

Accepting that a program can write to every memory location is not a solution because a segfault is not the guaranteed result.

Klaus