A Deep Learning-Based Real-Time Face Detection and Recognition System
This project implements a real-time face recognition system using a combination of FaceNet for face embedding extraction and MTCNN for accurate face detection. It leverages deep learning models to detect, extract, and recognize faces efficiently.
The project uses MTCNN to detect faces in an image or video stream. MTCNN is a pre-trained deep learning model that detects faces with high accuracy and outputs bounding box coordinates for each face.
For each detected face, MTCNN provides a bounding box, a detection confidence score, and five facial landmark keypoints (eyes, nose, and mouth corners).
Here is how MTCNN is applied to detect a face in an image:
import mtcnn

face_detector = mtcnn.MTCNN()
faces = face_detector.detect_faces(image_rgb)  # image_rgb: an RGB image as a NumPy array
box = faces[0]['box']  # Extract bounding box of the first detected face
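As noted above, each entry returned by detect_faces also carries a confidence score and facial keypoints; the key names below follow the mtcnn package's output format:
x, y, width, height = box              # bounding box origin and size
confidence = faces[0]['confidence']    # detection probability between 0 and 1
keypoints = faces[0]['keypoints']      # dict with 'left_eye', 'right_eye', 'nose', 'mouth_left', 'mouth_right'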
Once a face is detected, the project extracts face embeddings using the FaceNet model (Inception-ResNet-v1 architecture). FaceNet is a deep learning model that generates a compact 128-dimensional embedding for a face.
These embeddings represent unique facial features, enabling accurate comparison and recognition. The process involves resizing each detected face crop to 160x160 pixels, normalizing the pixel values, and passing the result through FaceNet. Snippet for embedding extraction:
import cv2
import numpy as np

face = cv2.resize(face, (160, 160))  # FaceNet expects 160x160 input
face = normalize(face)               # normalize(): the project's pixel-normalization helper
embeddings = face_encoder.predict(np.expand_dims(face, axis=0))  # 128-D embedding, shape (1, 128)
To recognize faces, the system creates an "encoding dictionary" of known faces. The training process involves reading each person's images from the Faces/ directory, detecting and encoding every face, and averaging the embeddings per person (with L2 normalization). The final embeddings are stored in a pickle file for fast retrieval during recognition.
import pickle
import numpy as np

encoding_dict[person_name] = l2_normalizer.transform(np.mean(encodes, axis=0).reshape(1, -1))[0]  # averaged, L2-normalized embedding
with open("encodings/encodings.pkl", "wb") as file:
    pickle.dump(encoding_dict, file)
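For context, here is a minimal sketch of the full training loop under the assumptions above (a Faces/{Person_Name}/ image layout and the face_detector and face_encoder objects from the earlier snippets); it is an illustration, not the repository's exact train_v2.py:
import os
import pickle
import cv2
import numpy as np
from sklearn.preprocessing import Normalizer

l2_normalizer = Normalizer(norm="l2")
encoding_dict = {}

for person_name in os.listdir("Faces"):
    person_dir = os.path.join("Faces", person_name)
    encodes = []
    for image_name in os.listdir(person_dir):
        image_bgr = cv2.imread(os.path.join(person_dir, image_name))
        image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
        results = face_detector.detect_faces(image_rgb)
        if not results:
            continue  # skip images where no face is found
        x, y, w, h = results[0]["box"]
        x, y = max(x, 0), max(y, 0)                 # MTCNN can return negative coordinates
        face = cv2.resize(image_rgb[y:y + h, x:x + w], (160, 160))
        face = (face - face.mean()) / face.std()    # pixel standardization (stand-in for normalize())
        encodes.append(face_encoder.predict(np.expand_dims(face, axis=0))[0])
    if encodes:
        mean_encode = np.mean(encodes, axis=0)
        encoding_dict[person_name] = l2_normalizer.transform(mean_encode.reshape(1, -1))[0]

with open("encodings/encodings.pkl", "wb") as file:
    pickle.dump(encoding_dict, file)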
The real-time recognition system uses a webcam feed. The process involves capturing frames, detecting faces with MTCNN, extracting their FaceNet embeddings, and comparing each embedding against the stored encoding dictionary (a complete loop is sketched below).
The cosine distance between two embeddings measures how similar they are. If the distance to a stored encoding is below a threshold (0.5), the face is recognized.
from scipy.spatial.distance import cosine

recognition_threshold = 0.5
distance = cosine(db_encode, detected_face_encode)  # cosine distance between stored and live embeddings
if distance < recognition_threshold:
    print(f"Recognized as: {name}")
Recognized faces are displayed in green bounding boxes, and unknown faces are shown in red.
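A minimal sketch of the webcam loop, assuming the face_detector, face_encoder, and encodings.pkl produced above (again an illustration rather than the repository's exact detect.py):
import pickle
import cv2
import numpy as np
from scipy.spatial.distance import cosine

with open("encodings/encodings.pkl", "rb") as file:
    encoding_dict = pickle.load(file)

recognition_threshold = 0.5
cap = cv2.VideoCapture(0)                        # default webcam

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    image_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    for result in face_detector.detect_faces(image_rgb):
        x, y, w, h = result["box"]
        x, y = max(x, 0), max(y, 0)
        face = cv2.resize(image_rgb[y:y + h, x:x + w], (160, 160))
        face = (face - face.mean()) / face.std()
        encode = face_encoder.predict(np.expand_dims(face, axis=0))[0]

        name, best_distance = "unknown", float("inf")
        for db_name, db_encode in encoding_dict.items():
            distance = cosine(db_encode, encode)
            if distance < recognition_threshold and distance < best_distance:
                name, best_distance = db_name, distance

        color = (0, 255, 0) if name != "unknown" else (0, 0, 255)   # green = known, red = unknown (BGR)
        cv2.rectangle(frame, (x, y), (x + w, y + h), color, 2)
        cv2.putText(frame, name, (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.8, color, 2)

    cv2.imshow("Face Recognition", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):        # press 'q' to exit the feed
        break

cap.release()
cv2.destroyAllWindows()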
The system performs real-time face recognition with impressive accuracy and low latency. Key Observations:
The cosine distance threshold (0.5) strikes a good balance between precision and recall in recognition.
To run the project:
1. Place each person's training images in the Faces/{Person_Name} directory.
2. Run train_v2.py to generate face encodings.
3. Run detect.py to start recognition; press 'q' to exit the webcam feed.
Access the complete project code and instructions here: GitHub Repository
Developed with a passion for computer vision and deep learning. Explore more projects on GitHub.