(Part-III) EyeAttend — Facial Recognition based Attendance System from scratch — A complete approach.

Keshav Tangri
Published in Analytics Vidhya
9 min read · Mar 21, 2021

Introduction

Welcome to Part III of our blog series, EyeAttend. Finally, after two stories on our vision and a general overview of the project, we will begin coding in this story. Without wasting much of your time, let’s dig in.

Recap

Before starting, let’s recap where we left off in the previous story. We discussed an overview of the deep learning models and briefly introduced the following models used in our project:

  1. keras-facenet (Transfer Learning)
  2. mask-detector
  3. real/fake detector

You can get familiar with the project by taking a quick glance at Part-I and Part-II of this series.

Data is the main resource for training a deep learning model. In our project, the first type of data required is the classroom data. In order to perform face matching, we need a database from which we will draw our image comparisons. So first, we will generate our classroom data, for which we require students’ face images. Since schools and colleges in our region were shut down due to the Covid-19 outbreak during the development of this project, we had to come up with a workaround to generate data.

Data Generation

1. Collection

We collected our group images, local trip photos and other images that we had in our galleries. Next, we put them in a folder for consideration. These images do include duplicate faces, but we needed to collect as many faces as possible to generate our classroom batch, so we favoured images containing many faces to reduce the collection effort. Storing them in a directory marks the completion of this phase. See the image attached below for reference.

Raw Data : Group photos for extracting faces

2. Extraction

Next, we need to extract the faces from these images. Some images have them in portrait mode, while some are clicked at an angle. Extracting faces manually is a tedious task. As mentioned previously, since we will be using transfer learning for face extraction and facial recognition, we will use the MTCNN (Multi-task Cascaded Convolutional Networks) model for face extraction. The MTCNN module can be installed from PyPI at [1]. You can install it on your machine by executing the following command in the terminal:

pip3 install mtcnn

Please note that you need pip3 installed on your machine before running this command.

After installing MTCNN, we will import the necessary libraries for face extraction:

import os
import numpy as np
from PIL import Image
from mtcnn import MTCNN

After importing the libraries, we will store our source and target paths in variables and initialize the MTCNN object. You can also use an argument parser to pass these input and output paths directly from the terminal, as sketched after the next code block.

detector = MTCNN()
folder_path = '/home/username/Eye_Attend/Eye_Attend/images'
output_path = '/home/username/Eye_Attend/Eye_Attend/FACES_224'
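
For reference, here is a minimal sketch of how argparse could replace the hard-coded paths; the flag names --input and --output are illustrative, not from the original project:

import argparse

parser = argparse.ArgumentParser(description="Extract faces from group photos")
parser.add_argument("--input", required=True, help="folder containing group photos")
parser.add_argument("--output", required=True, help="folder to save extracted faces")
args = parser.parse_args()

folder_path = args.input
output_path = args.output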

To understand how MTCNN works, consider the following example:

Example Image

If we run the MTCNN detector on this image (via the code below), we get the following output:

Code:

img = Image.open(folder_path+os.sep+"IMG_20171130_105137.jpg")
pixels = np.asarray(img)
results = detector.detect_faces(pixels)
print(len(results))
print(results)

Output:

6
[
  {'box': [602, 470, 358, 454], 'confidence': 1.0,
   'keypoints': {'left_eye': (724, 643), 'right_eye': (888, 662), 'nose': (816, 743), 'mouth_left': (718, 807), 'mouth_right': (863, 825)}},
  {'box': [1066, 401, 228, 305], 'confidence': 0.9999994039535522,
   'keypoints': {'left_eye': (1136, 518), 'right_eye': (1244, 523), 'nose': (1196, 571), 'mouth_left': (1137, 624), 'mouth_right': (1235, 632)}},
  {'box': [1542, 538, 226, 283], 'confidence': 0.9999988079071045,
   'keypoints': {'left_eye': (1591, 645), 'right_eye': (1692, 646), 'nose': (1625, 692), 'mouth_left': (1596, 755), 'mouth_right': (1671, 758)}},
  {'box': [1764, 844, 477, 588], 'confidence': 0.9999927282333374,
   'keypoints': {'left_eye': (1873, 1078), 'right_eye': (2096, 1067), 'nose': (1968, 1204), 'mouth_left': (1898, 1299), 'mouth_right': (2088, 1294)}},
  {'box': [1038, 951, 453, 523], 'confidence': 0.9999831914901733,
   'keypoints': {'left_eye': (1162, 1174), 'right_eye': (1351, 1109), 'nose': (1289, 1293), 'mouth_left': (1244, 1375), 'mouth_right': (1410, 1313)}},
  {'box': [-92, 769, 623, 850], 'confidence': 0.9803318977355957,
   'keypoints': {'left_eye': (86, 1089), 'right_eye': (375, 1070), 'nose': (268, 1253), 'mouth_left': (84, 1357), 'mouth_right': (392, 1330)}}
]

The output is a list of dictionaries, with each dictionary having ‘box’, ‘confidence’ and ‘keypoints’ as keys. The ‘keypoints’ key holds another dictionary with ‘left_eye’, ‘right_eye’, ‘nose’, ‘mouth_left’ and ‘mouth_right’ keys. The ‘box’ list consists of x1, y1, width and height. This refers to the bounding box of the face, and we can use it to extract the face from the image.
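
To see what these boxes and keypoints correspond to, you could draw them on the example image. This is a hedged sketch using Pillow’s ImageDraw (not part of the original pipeline), reusing the img and results variables from above:

from PIL import ImageDraw

draw = ImageDraw.Draw(img)
for result in results:
    # Convert the box from (x1, y1, width, height) to corner coordinates
    x1, y1, width, height = result['box']
    draw.rectangle([x1, y1, x1 + width, y1 + height], outline="red", width=3)
    # Mark each facial keypoint with a small dot
    for px, py in result['keypoints'].values():
        draw.ellipse([px - 4, py - 4, px + 4, py + 4], fill="blue")
img.show()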

In the same way as discussed above, we will iterate over the images in our source folder one by one, detect all the faces in the current image, extract their bounding box coordinates and then store the face images in the output folder. The following code extracts the faces from our input directory and saves them to the output path.

for image in os.listdir(folder_path):
    img = Image.open(folder_path + os.sep + image)
    pixels = np.asarray(img)
    # rsplit keeps the extension intact even if the filename contains extra dots
    filename, extension = image.rsplit(".", 1)
    results = detector.detect_faces(pixels)
    for i in range(len(results)):
        # abs() guards against slightly negative coordinates that MTCNN
        # can return for faces near the image border
        x1, y1, width, height = results[i]['box']
        x1, y1 = abs(x1), abs(y1)
        x2, y2 = x1 + width, y1 + height
        # Crop the face and resize it to 224x224 for the pretrained model used later
        face = pixels[y1:y2, x1:x2]
        face_image = Image.fromarray(face)
        face_image = face_image.resize((224, 224))
        face_image.save(
            output_path + os.sep + filename + "_" + str(i) + "." + extension
        )

On each iteration, the image is converted to a numpy array, which is passed to detector.detect_faces() to extract the face details as explained in the previous example. We also store the filename and extension of the image so that each face is saved with the same prefix as its source. We then iterate over every detected face and extract it from the numpy array via its bounding box coordinates. We resize it to 224x224 because at a later stage we will use a pretrained model with ImageNet weights that accepts 224x224 images. Finally, we save the resulting face in the output directory. Below is the result on our source folder.

Output Faces of a Classroom Batch

3. Segregation

After extracting all detected faces from the group images, we are left with some unwanted, distorted, blurred and duplicate faces (since the same individuals appear in multiple images). The next step is to pick out the quality images from this dataset; this is our segregation phase. We select faces that are clearly distinguishable and less pixelated than the others. These images are copied into a new directory and renamed to the roll number of the individual; the reason for this will be explained later on. The outcome is seen below:

Segregated Faces of Students

You might observe some roll numbers missing; that’s because we were not able to find images of a few individuals in our galleries. Our next aim is to export this data to the database so that it can be used in accordance with the flow of our application when required.

A question that comes to mind is: do we actually need image data for facial recognition? If you give it some thought, you will realize that we need the face data only to compare against the face data sent from the frontend. Thinking deeper, we are actually comparing the embeddings of the image received from the frontend with the embeddings of the faces of a particular batch from the backend database. So storing 60–80 images, retrieving each of them and then calculating their embeddings on every request would be a computationally expensive task.

A better way out is therefore to upload the embeddings per face, along with the other student details, to the MySQL database.
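
To make “comparing embeddings” concrete, here is a hedged sketch of how two embeddings might be matched using Euclidean distance; the threshold of 0.7 is illustrative, and the actual matching logic used for attendance appears in a later part of this series:

import numpy as np

def is_match(known_embedding, candidate_embedding, threshold=0.7):
    # A smaller Euclidean distance between embedding vectors means
    # the two faces are more likely to belong to the same person
    distance = np.linalg.norm(known_embedding - candidate_embedding)
    return distance <= threshold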

4. Storing in Database

We are using a MySQL database in this project, and the next move is to upload the student data to the DB. Along with the face embeddings, we will upload other details too, i.e., name, email, college email, batch, branch etc. To be clear, the above classroom data is for branch ‘cse’ and batch 2017–21.

Our next challenge is to store a numpy array in MySQL. We tackled this by converting the numpy array into bytes and storing it in a column of type LONGBLOB. This part was trickier, but we got it working. Let’s see the code for uploading the data to the MySQL table.
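
The idea behind the bytes conversion can be sketched as a simple round trip. The embed.dumps() call used later pickles the array; on newer NumPy versions ndarray.dumps() may no longer be available, in which case pickle.dumps(arr) is the portable equivalent (the array below is a stand-in, not a real embedding):

import pickle
import numpy as np

embedding = np.random.rand(1, 512).astype(np.float32)  # stand-in for a real face embedding

blob = pickle.dumps(embedding)   # bytes, suitable for a LONGBLOB column
restored = pickle.loads(blob)    # back to a numpy array

assert np.array_equal(embedding, restored)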

Student Details: Every classroom’s details, such as roll number, email id etc., are kept as an Excel record by the CR and class in-charge. Having been a CR for 2 years, I too had that student record list. We used the details in the excel sheet together with the face images (arranged in ascending order of roll number) and combined the two to automate the task of uploading the data to the DB.

Before importing the libraries, we have to install the pretrained Keras FaceNet model to be used for transfer learning.

pip3 install keras-facenet

After a successful installation, import the essential libraries and create an instance of FaceNet from keras-facenet [2]:

import os
import pickle
import numpy as np
import pandas as pd
from PIL import Image
import mysql.connector
from keras_facenet import FaceNet
from tensorflow.keras.preprocessing.image import img_to_array, array_to_img
embedder = FaceNet()

Once everything is imported, we will open our excel file with the student details. For privacy reasons, emails and personal details are omitted from the images. We use the pandas library to read the data.

df = pd.read_excel("StudentListEmail_ccet_gmail.xlsx").fillna("example@domain.com")
df.head(5)

The above image shows the output of a class’s student details. Using this together with the students’ face images, we will upload the data to the MySQL DB.

Our next task is to calculate the embedding for each face and store it in a list.

NOTE: The roll numbers in the excel file and the filenames of the faces are the same and equal in number, sorted in ascending order. There is no extra entry in the excel with a roll number that is not present in our classroom folder, and vice versa.

See the following code snippet for calculating embedding.

name = list(df['First Name '])
email = list(df['Email [Secondary]'])
ccet_email = list(df['Email Address [CCET]'])
roll = list(df['Roll Number'])
class_temp_path = "/home/username/Eye_Attend/Eye_Attend_Final/classroom_224"

temp_totalEmbeddings = []
# sorted() matters here: os.listdir() gives no order guarantee, and the
# embeddings must line up with the roll numbers in ascending order
for faces in sorted(os.listdir(class_temp_path)):
    face = Image.open(class_temp_path + os.sep + faces)
    face = img_to_array(face)
    face = np.expand_dims(face, axis=0)  # add a batch dimension: (1, 224, 224, 3)
    embedding = embedder.embeddings(face)
    temp_totalEmbeddings.append(embedding)

The temp_totalEmbeddings list holds the embedding of each face in our classroom. Each embedding is a 512-dimensional vector, as shown below:

print(temp_totalEmbeddings[0].shape)
print(len(temp_totalEmbeddings))
print(len(roll))
#Output
(1, 512)
40
40

This shows that we have 40 students in our excel, and 40 images corresponding to them (sorted by roll number, ascending) in the classroom folder.

Since each embedding is a 512-dimensional vector, our final matrix will be of dimension 40 x 512.
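
As a quick sanity check, the per-face (1, 512) arrays could be stacked into that matrix; a minimal sketch using the variables defined above:

embedding_matrix = np.vstack(temp_totalEmbeddings)  # stack the (1, 512) arrays
print(embedding_matrix.shape)  # expected: (40, 512)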

Next, we will look at the code that uploads all this data into the MySQL DB.

branch_ = 'cse'
batch_ = '2017'
conn = mysql.connector.connect(
    host="localhost",
    user="root",
    password="",
    database="eyeattend"
)
if conn.is_connected():
    print("Successfully connected")
    mycursor = conn.cursor()
    create_tab = "CREATE TABLE IF NOT EXISTS `batch_2017` ( `roll_no` varchar(10) NOT NULL, \
        `name` varchar(50) NOT NULL,\
        `branch` text NOT NULL,\
        `batch` text NOT NULL,\
        `email` varchar(50) NOT NULL,\
        `ccet_email` varchar(50) NOT NULL,\
        `photo_embedd` longblob NOT NULL,\
        PRIMARY KEY (`roll_no`) ) "

    mycursor.execute(create_tab)
    conn.commit()

    for i in range(len(roll)):
        r = str(roll[i])
        naam = str(name[i])
        ema = str(email[i])
        ccet = str(ccet_email[i])
        embed = temp_totalEmbeddings[i]
        # Pickle the embedding into bytes for the LONGBLOB column
        pickemb = embed.dumps()
        mycursor.execute(
            "INSERT INTO batch_2017 VALUES (%s,%s,%s,%s,%s,%s,%s)",
            (r, naam, branch_, batch_, ema, ccet, pickemb)
        )
        conn.commit()

    # Close the connection after all rows are inserted
    conn.close()
else:
    print('Error in connecting with database')

Here we utilise the mysql.connector package. branch_ and batch_ are variables that specify which branch and batch of student data will be uploaded to the DB.

First, we try to make a connection with our backend database. If the connection is successful, we print a success message and create a cursor object. Next, we run our CREATE TABLE query to create the table for the particular batch if it does not already exist. We then iterate over the student details and face embeddings one by one and insert them into the database. As mentioned earlier, to upload the embeddings we have to encode them as bytes, since our photo_embedd column accepts a LONGBLOB; for this, we use the dumps() function and upload the data to the DB. The output of this code in the DB is shown below:

DB Output

Data for other streams such as ECE, CIVIL, MECH, BIOTECH, EE etc. of the same batch can be uploaded in the same way. This is how we built our database for the project. In future blogs, you will see how we fetch only the roll numbers and embeddings for comparison and how we mark attendance via facial recognition; a preview of that fetch step is sketched below.
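
As a preview, here is a minimal hedged sketch of reading the stored embeddings back; the full fetch logic is covered in a later part. Since ndarray.dumps() pickles the array, pickle.loads() restores it:

import pickle
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="root",
                               password="", database="eyeattend")
cur = conn.cursor()
cur.execute("SELECT roll_no, photo_embedd FROM batch_2017")
# Map each roll number to its restored (1, 512) embedding array
batch_embeddings = {r: pickle.loads(blob) for r, blob in cur.fetchall()}
conn.close()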

Conclusion

This marks the conclusion of the blog. In this blog, you saw how we used the pretrained keras-facenet model to generate vector embeddings of the face images and how we created our classroom database in MySQL. In the next blog, we will build the mask detector model. Stay tuned, and we hope that you enjoyed this blog. Do like and share with your circle.

References

[1] https://pypi.org/project/mtcnn/

[2] https://pypi.org/project/keras-facenet/
