
Camera setup: We have set up several stereo cameras in an apartment to monitor the indoor environment. The cameras use wide-angle lenses (3.5 mm) to cover a large volume.

The cameras are mounted about 2.8 meters above the floor. The objects (such as a mug, a bottle, or a telephone) are at least 3 meters away. For instance, an object measuring 75 mm x 102 mm located 3 meters from the camera occupies only about 15 x 20 px in the image, and objects farther away appear even smaller.
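To show how these sizes relate, here is a minimal sketch of the pinhole projection, assuming an ideal lens with no distortion. The pixel pitch of the sensor is not given above, so the sketch back-computes it from the 75 mm / 15 px figure. That value is only an implied estimate, and the function names are made up for illustration.

```python
def implied_pixel_pitch_mm(object_size_mm, distance_mm, focal_length_mm, size_px):
    """Pixel pitch (mm) implied by an observed object size in pixels.

    Pinhole model: size on sensor = focal_length * object_size / distance.
    Dividing that by the observed pixel count gives the pitch.
    """
    return focal_length_mm * object_size_mm / (distance_mm * size_px)


def projected_size_px(object_size_mm, distance_mm, focal_length_mm, pixel_pitch_mm):
    """Approximate size in pixels of an object seen by a pinhole camera."""
    return focal_length_mm * object_size_mm / (distance_mm * pixel_pitch_mm)


# Numbers from the setup: 3.5 mm lens, 75 mm wide object, 3 m away, 15 px wide.
# ASSUMPTION: the pitch is inferred from these figures, not taken from a datasheet.
pitch_mm = implied_pixel_pitch_mm(75, 3000, 3.5, 15)

# Consistency check with the other dimension (102 mm, reported as 20 px).
height_px = projected_size_px(102, 3000, 3.5, pitch_mm)

# The same object twice as far away covers half as many pixels per side.
width_px_at_6m = projected_size_px(75, 6000, 3.5, pitch_mm)
```

With these inputs the implied pitch comes out at roughly 5.8 µm and the 102 mm side at roughly 20.4 px, which agrees with the 15 x 20 px observation. At 6 meters the 75 mm side shrinks to about 7.5 px.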

There are around 30 different objects to recognize.

I do not use the depth information because it is not accurate enough; I use only the RGB image from a single camera. The image resolution is 1360 x 1024 pixels.

Approaches:
1. Point detectors/descriptors with matching (some models are stored in a database and checked for a match one by one)
2. Bag of visual words + SVM classification (5 object categories)
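For reference, the core step of each approach can be sketched in plain Python as below. This is only an illustration under assumptions: descriptors are plain lists of floats, the 0.75 ratio threshold is a commonly used value rather than one from my setup, the codebook is assumed to have been learned already (for example by k-means), and the function names are hypothetical. A real pipeline would use a library detector/descriptor and a trained SVM.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ratio_test_matches(query_desc, model_desc, ratio=0.75):
    """Approach 1: nearest-neighbour matching with a ratio test.

    A query descriptor is matched to its nearest model descriptor only if
    that distance is clearly smaller than the distance to the second nearest.
    Returns a list of (query_index, model_index) pairs.
    """
    matches = []
    for qi, q in enumerate(query_desc):
        dists = sorted(
            (math.sqrt(sq_dist(q, m)), mi) for mi, m in enumerate(model_desc)
        )
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((qi, dists[0][1]))
    return matches


def bovw_histogram(descriptors, codebook):
    """Approach 2: bag-of-visual-words histogram.

    Each descriptor votes for its nearest codebook entry (visual word).
    The normalized histogram is the feature vector passed to a classifier.
    """
    counts = [0.0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda k: sq_dist(d, codebook[k]))
        counts[nearest] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


# Toy 2-D "descriptors" purely for illustration.
toy_matches = ratio_test_matches([[0, 0], [5, 5]], [[0, 1], [10, 10], [0, 9]])
toy_hist = bovw_histogram([[1, 0], [0, 1], [9, 10], [1, 1]], [[0, 0], [10, 10]])
```

In the toy example the second query descriptor is equally close to two model descriptors, so the ratio test rejects it as ambiguous. With objects of only about 15 x 20 px, very few keypoints are detected per object, which limits both steps.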

I have some experience with Haar cascades, but I have not tried them on this problem.

Which methods or approaches should I investigate?

Thank you in advance,
