The Third EYE: Result and Conclusion

www.gunjangupta.net

Result

Various objects, living or non-living, can be detected. Using a Logitech C270 webcam and code built on OpenCV library functions, certain day-to-day activities can be tracked and turned into the required outputs. Face detection and tracking, and movement detection and tracking, run efficiently; they have been executed with almost 70% efficiency with respect to time delay.

Using the Microsoft Xbox 360 sensor, more detailed tasks can be carried out efficiently in terms of time delay. The sensor provides three image streams, namely RGB, IR and depth. Using all of these together with basic functions, many tasks can be carried out, such as head pose estimation and face detection and tracking. Apart from that, depth detection, distance measurement, camera use in low light, object detection and tracking, and motion detection are possible. The Microsoft Xbox 360 sensor also has a built-in direction-based microphone, which helps in detecting an object or person based on the type of sound. It also helps a blind person identify a person correctly and store that identity.

The other functions include converting text to speech and vice versa, and converting handwritten letters into speech. Here the handwritten letters are compared with built-in letters and numbers, and the most probable match is returned, which can then be turned into speech. This works well as long as the handwriting is clear and the detection is efficient. Text to speech also helps the blind person note a person’s name in the directory, where it can be saved as a file name.
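The comparison of a drawn character against built-in letters and numbers can be sketched as nearest-template matching. This is only an illustration under stated assumptions, not the project's actual recognizer: the tiny 3x5 bitmaps, the `TEMPLATES` table and the function names are hypothetical, and a real application would use larger images or learned features.

```python
# Hypothetical 3x5 binary templates (1 = ink, 0 = background).
# These stand in for the "built-in letters and numbers".
TEMPLATES = {
    "0": ["111", "101", "101", "101", "111"],
    "1": ["010", "110", "010", "010", "111"],
    "7": ["111", "001", "010", "010", "010"],
}


def hamming(a, b):
    """Count the cells in which two equally sized bitmaps differ."""
    return sum(
        cell_a != cell_b
        for row_a, row_b in zip(a, b)
        for cell_a, cell_b in zip(row_a, row_b)
    )


def recognise(bitmap, templates=TEMPLATES):
    """Return (label, score) for the closest template.

    score is the fraction of cells that match the chosen template,
    so 1.0 means a perfect match.
    """
    cells = len(bitmap) * len(bitmap[0])
    label = min(templates, key=lambda key: hamming(bitmap, templates[key]))
    score = 1 - hamming(bitmap, templates[label]) / cells
    return label, score


# A "7" with one stray pixel in the last row still maps to "7".
noisy_seven = ["111", "001", "010", "010", "011"]
label, score = recognise(noisy_seven)
```

The returned label is the "probable outcome" that would then be passed to the text-to-speech step; the score could be used to reject drawings that match no template well.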

Object detection is an advanced form of handwriting-to-speech detection. Here pre-built images are compared with real-time information about the object in front of the camera; the closest match is reported as the detected object, and its name is provided as sound output.

The following screenshot shows the final WebcamFaceRec project, including a small rectangle at the top-right corner highlighting the recognized person. Also notice the confidence bar that is next to the pre-processed face (a small face at the top-centre of the rectangle marking the face), which in this case shows roughly 70 percent confidence that it has recognized the correct person.

Fig R.1: Webcam Face Recognition

The next result is Text to Speech synthesis, which works well. The sound can be heard in practice, but in this document I show a snapshot of the program with a text file and the terminal command used to run it. This implementation of Text to Speech synthesis can be used in the WebCam Face Recognition program so that, when a face is recognized, it speaks the person's name with predefined words such as:

“Hey PandaBoard User, XYZ is in front of you!”

Fig R.2: Text to Speech Synthesis
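The spoken announcement for a recognized face can be sketched as follows. This is a minimal sketch, not the program shown in the figure: it assumes the `espeak` command-line synthesizer is installed on the Ubuntu system (the post does not name the engine used), and the function names are hypothetical.

```python
import shutil
import subprocess


def build_announcement(name, user="PandaBoard User"):
    """Build the predefined sentence for a recognized person."""
    return f"Hey {user}, {name} is in front of you!"


def speak(text):
    """Speak text through espeak (assumed engine) if it is installed.

    Returns True when the command was run, False when espeak is missing.
    """
    if shutil.which("espeak") is None:
        return False
    subprocess.run(["espeak", text], check=False)
    return True


message = build_announcement("XYZ")
```

Calling `speak(message)` after the face recognizer returns a name would produce the audio output; keeping the sentence builder separate makes the wording easy to change without touching the audio code.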

With Face Recognition and Text to Speech synthesis working, this project is almost complete. The last hurdle is that we cannot store the faces of every individual in advance, so we need to store the faces of new or unknown persons as and when they come in front of the camera. We considered using Speech to Text synthesis when a new person’s face is detected, but we could not find a ready-to-use Speech to Text software or algorithm for the Ubuntu operating system, and research in this area is still going on. So we are thinking out of the box and using Sound Marking instead.

In sound marking we record the voice of the unknown person standing in front of the device while he/she speaks his/her own name. For this we used a Kinect sensor and developed an application that records sound for 5 seconds; whatever is spoken in that time is saved in a file named after the date and time of that moment. At the same instant a picture of the unknown person is captured and saved under the same date-and-time name, so that his/her face is stored too. The next time that person comes in front of the camera, his/her face is detected and we get the filename of the detected face, from which we can play the recorded voice containing his/her name. In this way the problem of unknown faces can be solved.
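The pairing of a face image with its voice recording through a shared date-and-time name can be sketched like this. It is an illustration under assumptions: the timestamp format, the `.jpg` and `.wav` extensions and the function names are hypothetical, and the actual recording and image capture are left out.

```python
from datetime import datetime
from pathlib import Path

# Assumed naming scheme: one timestamp shared by the image and the audio.
STAMP_FORMAT = "%Y%m%d_%H%M%S"


def marking_paths(directory, when=None):
    """Return (face_path, audio_path) sharing one date-and-time name."""
    when = when or datetime.now()
    stem = when.strftime(STAMP_FORMAT)
    directory = Path(directory)
    return directory / f"{stem}.jpg", directory / f"{stem}.wav"


def audio_for_face(face_path):
    """Given the filename of a recognized face, locate its voice recording."""
    return Path(face_path).with_suffix(".wav")


face, audio = marking_paths("faces", datetime(2014, 6, 3, 10, 30, 5))
```

When the recognizer later reports the stored face file, `audio_for_face` gives the recording to play back, so no separate lookup table between faces and names is needed.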

We have developed the sound-recording application for Windows, a snapshot of which is shown below, but we are yet to develop the application for Ubuntu OS.

Fig R.3: Audio Recording for Sound Marking

Another application we developed is basic Optical Character Recognition, in which any number drawn by the user is detected by the application with an error rate of 11.00%. This could help blind users write or read digital letters and numbers.

Fig R.4: Optical Character Recognition

Conclusion

We have developed a device that can be used easily by the visually impaired community. The device measures 4.5 by 4.0 inches, the camera measures 3 x 8.2 x 6 inches, and the total weight is 332 grams. The approximate cost of the device is INR 16,000. Its salient features are as follows:

· Face Detection, Tracking and Person Identity Detection
· Face Tagging and Storing New Person’s Face
· Optical Character Recognition: Hand Written Text Recognition
· Text to Speech
· Object Detection, Tracking and Tagging
· Motion Detection and Tracking
· Capturing an Image using Hand Gesture and Uploading it to Internet
· Colour Detection and Generation of Different Audio for Different Colours
· Sound Marking for saving new faces and names

Nothing is perfect, so there is always room for improvement. The following future work can be done on this project:

· Integrating Face Recognition, Text to Speech synthesis and Sound Marking in a single application
· Developing Speech to Text synthesis
· Improving Number and Character Recognition accuracy
· Increasing the speed of video frame processing
· Loading the PandaBoard with only the needed applications and software
· Working without an operating system, for example using PuTTY

Tuesday, June 3, 2014