Python + OpenCV + HOG + SVM pedestrian detection

created at 06-29-2021

This article summarizes the HOG+SVM approach to pedestrian detection. The same pipeline also applies to vehicle detection and similar tasks; pedestrian detection is used here as the running example.

HOG (histogram of oriented gradients): the descriptor is determined by the window size, block size, block sliding stride, cell size and bin count (usually 9; the gradient orientation range is divided into 9 parts). As the block slides over the window, the gradient histograms of the cells inside each block position are computed and concatenated, which forms the feature vector. Its dimension is N = ((W - wb)/stride + 1) * ((H - hb)/stride + 1) * bins * n, where W and H are the width and height of the window, wb and hb are the width and height of the block, stride is the block sliding step, bins is the number of histogram bins per cell, and n is the number of cells contained in one block.
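As a quick check, plugging OpenCV's default HOGDescriptor parameters (64x128 window, 16x16 block, 8x8 block stride, 8x8 cell, 9 bins) into this formula gives the familiar 3780-dimensional descriptor:

# Descriptor length for OpenCV's default HOGDescriptor parameters
W, H = 64, 128                     # window width and height
wb, hb = 16, 16                    # block width and height
stride = 8                         # block sliding step
cell = 8                           # cell size
bins = 9                           # histogram bins per cell
n = (wb // cell) * (hb // cell)    # cells per block = 4

N = ((W - wb) // stride + 1) * ((H - hb) // stride + 1) * bins * n
print(N)  # 3780, the length of cv2.HOGDescriptor().compute() on a 64x128 window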

This article walks through the HOG+SVM pipeline with OpenCV. It is aimed at beginners who already know the principles of HOG and SVM; every project needs its own analysis, and some practical project notes are mentioned along the way.

Data set

INRIA Person Dataset: currently the most widely used static pedestrian detection database, providing the original images and the corresponding annotation files. The training set has 614 positive images (containing 2416 pedestrians) and 1218 negative images; the test set has 288 positive images (containing 1126 pedestrians) and 453 negative images. Most people in the images are standing and taller than 100 pixels, and a few annotations may be incorrect. The images come mainly from GRAZ-01, personal photos and Google, so they are of fairly high quality. Some training or test images cannot be displayed properly under Windows XP, but they can be read and shown normally with OpenCV.

In a real project, reaching a higher accuracy usually means building your own data set that matches the project: the scene, the pedestrians' poses, the size of the frame images, the purpose of the detection, and so on. Alternatively, you can add your own data to a public data set and train on the combined set.

Basic process

1. Extract the HOG features of the data set. The samples are crucial: collect the image data set from the actual environment of the project; do not assume that a handful of arbitrary pictures is enough for training.

2. Train an SVM on the positive and negative samples to obtain the model.

3. Use the trained model to generate a detector.

4. Run the detector on the negative sample images and collect the hard examples, i.e. the false detections: when the first trained classifier is run on the original negative images (which certainly contain no people), every rectangle it reports is a false positive. Save these false-positive rectangles as images and add them to the initial negative sample set; retraining the SVM on the enlarged set significantly reduces false positives. This method is called bootstrapping: train a model on an initial set of negative samples, collect the negatives misclassified by this initial model into a set of hard negatives, train a new model with the enlarged negative set, and repeat the process as many times as needed.

5. Extract the HOG features of the hard examples and add them to the features obtained in step 1 to retrain the model.

6. Recognition: there are two ways to run the model. If you only use a linear kernel, the HOGDescriptor's built-in setSVMDetector and detect/detectMultiScale are all you need. If you use an RBF kernel, you have to call the SVM's predict instead, but predict only answers "pedestrian or not" for a single window; if you also want the detection regions directly, use a linear model with setSVMDetector and detect/detectMultiScale (a rough sketch of the predict path follows this list).

7. Apply non-maximum suppression to merge overlapping detection regions.
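For step 6, the linear-kernel path (setSVMDetector + detectMultiScale) is exactly what the full code below uses. As a rough sketch of the RBF path, assume an RBF-kernel SVM has already been trained and saved as svm_rbf.xml (a hypothetical file name); each 64x128 window is then classified manually with predict, here at a single scale for brevity:

import cv2
import numpy as np

# Sketch only: classify each 64x128 window with svm.predict (RBF kernel).
# "svm_rbf.xml" is a hypothetical file name for a model trained with cv2.ml.SVM_RBF.
svm = cv2.ml.SVM_load("svm_rbf.xml")
hog = cv2.HOGDescriptor()

image = cv2.imread("1.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

detections = []
step = 8  # sliding-window stride in pixels
for y in range(0, gray.shape[0] - 128, step):
    for x in range(0, gray.shape[1] - 64, step):
        window = gray[y:y + 128, x:x + 64]
        feature = hog.compute(window).reshape(1, -1).astype(np.float32)
        _, result = svm.predict(feature)
        if result[0][0] > 0:  # positive response -> pedestrian window
            detections.append((x, y, x + 64, y + 128))

A real detector would also scan multiple scales (e.g. an image pyramid) and apply non-maximum suppression to the collected windows, as in step 7.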

Code (Python)

import cv2
import numpy as np
import random

#Load samples: read every image listed in the .lst file under dirName
def loadImageList(dirName, fileListPath):
    imageList = []
    file = open(dirName + r'/' + fileListPath)
    imageName = file.readline()
    while imageName != '':
        imageName = dirName + r'/' + imageName.split('/', 1)[1].strip('\n')
        imageList.append(cv2.imread(imageName))
        imageName = file.readline()
    return imageList

#Get positive samples: crop the (128, 64) region starting at (16, 16) from each normalized positive image
def getPosSample(imageList):
    posList = []
    for i in range(len(imageList)):
        roi = imageList[i][16:16 + 128, 16:16 + 64]
        posList.append(roi)
    return posList

#Get negative samples: randomly crop 10 regions of size (128, 64) from each picture without pedestrians
def getNegSample(imageList):
    negList = []
    random.seed(1)
    for i in range(len(imageList)):
        for j in range(10):
            y = int(random.random() * (len(imageList[i]) - 128))
            x = int(random.random() * (len(imageList[i][0]) - 64))
            negList.append(imageList[i][y:y + 128, x:x + 64])
    return negList

#Compute the HOG feature vector of every image in the list
def getHOGList(imageList):
    HOGList = []
    hog = cv2.HOGDescriptor()
    for i in range(len(imageList)):
        gray = cv2.cvtColor(imageList[i], cv2.COLOR_BGR2GRAY)
        HOGList.append(hog.compute(gray))
    return HOGList

#Get detector: build the single vector [w, -rho] expected by HOGDescriptor.setSVMDetector
#(w is the weight vector of the linear SVM, rho the bias from its decision function)
def getHOGDetector(svm):
    sv = svm.getSupportVectors()
    rho, _, _ = svm.getDecisionFunction(0)
    sv = np.transpose(sv)
    return np.append(sv, [[-rho]], 0)

#Get hard examples: run the first detector over the negative images; every
#detection is a false positive, so save it as an extra negative sample
def getHardExamples(negImageList, svm):
    hardNegList = []
    hog = cv2.HOGDescriptor()
    hog.setSVMDetector(getHOGDetector(svm))
    for i in range(len(negImageList)):
        rects, wei = hog.detectMultiScale(negImageList[i], winStride=(4, 4), padding=(8, 8), scale=1.05)
        for (x, y, w, h) in rects:
            hardExample = negImageList[i][y:y + h, x:x + w]
            hardNegList.append(cv2.resize(hardExample, (64, 128)))
    return hardNegList

#Non-maximum suppression
def fastNonMaxSuppression(boxes, sc, overlapThresh):
    # if there are no boxes, return an empty list
    if len(boxes) == 0:
        return []
    # if the bounding boxes are integers, convert them to floats --
    # this is important since we'll be doing a bunch of divisions
    if boxes.dtype.kind == "i":
        boxes = boxes.astype("float")

    # initialize the list of picked indexes
    pick = []

    # grab the coordinates of the bounding boxes
    x1 = boxes[:, 0]
    y1 = boxes[:, 1]
    x2 = boxes[:, 2]
    y2 = boxes[:, 3]
    scores = sc
    # compute the area of the bounding boxes and sort the bounding
    # boxes by the score of the bounding box
    area = (x2 - x1 + 1) * (y2 - y1 + 1)
    idxs = np.argsort(scores)

    # keep looping while some indexes still remain in the indexes
    # list
    while len(idxs) > 0:
        # grab the last index in the indexes list and add the
        # index value to the list of picked indexes
        last = len(idxs) - 1
        i = idxs[last]
        pick.append(i)

        # find the largest (x, y) coordinates for the start of
        # the bounding box and the smallest (x, y) coordinates
        # for the end of the bounding box
        xx1 = np.maximum(x1[i], x1[idxs[:last]])
        yy1 = np.maximum(y1[i], y1[idxs[:last]])
        xx2 = np.minimum(x2[i], x2[idxs[:last]])
        yy2 = np.minimum(y2[i], y2[idxs[:last]])

        # compute the width and height of the bounding box
        w = np.maximum(0, xx2 - xx1 + 1)
        h = np.maximum(0, yy2 - yy1 + 1)

        # compute the ratio of overlap
        overlap = (w * h) / area[idxs[:last]]

        idxs = np.delete(idxs, np.concatenate(([last],
                                               np.where(overlap > overlapThresh)[0])))

    # return only the bounding boxes that were picked using the
    # integer data type
    return boxes[pick]
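
# Illustrative sanity check of fastNonMaxSuppression (not part of the original
# pipeline, uncomment to try): boxes are in (x1, y1, x2, y2) format, one score each.
# boxes = np.array([[10, 10, 74, 138], [12, 12, 76, 140], [200, 50, 264, 178]])
# scores = np.array([0.9, 0.75, 0.8])
# print(fastNonMaxSuppression(boxes, scores, overlapThresh=0.3))
# -> keeps the highest-scoring box of the overlapping pair plus the separate box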

#Main program
labels = []
posImageList = []
posList = []
hosList = []
hardNegList = []
#Load pictures with pedestrians
posImageList = loadImageList(r"/home/ningshaohui/tfboy/INRIAPerson/train_64x128_H96", "pos.lst") 
print ("posImageList:", len(posImageList))
#Crop the picture, get the positive sample
posList = getPosSample(posImageList)
print ("posList", len(posList))
#Get the HOG of the positive sample
hosList = getHOGList(posList)
print ("hosList", len(hosList))
#Add the label corresponding to all positive samples
[labels.append(+1) for _ in range(len(posList))]

#Load pictures without pedestrians
negImageList = loadImageList(r"/home/ningshaohui/tfboy/INRIAPerson/train_64x128_H96", "neg.lst") 
print ("negImageList:", len(negImageList))
#Random crop to obtain negative samples
negList = getNegSample(negImageList)
print ("negList", len(negList))
#Get the negative sample HOG and add it to the overall HOG feature list
hosList.extend(getHOGList(negList))
print ("hosList", len(hosList))
#Add the label corresponding to all negative samples
[labels.append(-1) for _ in range(len(negList))]
print ("labels", len(labels))
####################So far, all the features and labels of SVM are obtained (excluding hard example)######################


#Create the SVM classifier and set its parameters
#Parameter reference (libsvm-style descriptions of the corresponding options):
#################################################################
#-d degree: degree of the kernel function (for the polynomial kernel) (default 3)
#-g gamma: gamma in the kernel function (for polynomial/RBF/sigmoid kernels) (default 1/k)
#-r coef0: coef0 in the kernel function (for polynomial/sigmoid kernels) (default 0)
#-c cost: the parameter C of C-SVC, epsilon-SVR and nu-SVR (default 1)
#-n nu: the parameter nu of nu-SVC, one-class SVM and nu-SVR (default 0.5)
#-p p: the epsilon in the loss function of epsilon-SVR (default 0.1)
#-m cachesize: cache memory size in MB (default 40)
#-e eps: tolerance of the termination criterion (default 0.001)
#-h shrinking: whether to use the shrinking heuristics, 0 or 1 (default 1)
#-wi weight: set the parameter C of class i to weight*C (for C-SVC) (default 1)
#-v n: n-fold cross-validation mode; n must be >= 2
#################################################################

svm = cv2.ml.SVM_create()
svm.setCoef0(0.0)
svm.setDegree(3)
criteria = (cv2.TERM_CRITERIA_MAX_ITER + cv2.TERM_CRITERIA_EPS, 1000, 1e-3)#Termination condition
svm.setTermCriteria(criteria)
svm.setGamma(0)
svm.setKernel(cv2.ml.SVM_LINEAR)
svm.setNu(0.5)
svm.setP(0.1)  # epsilon in the loss function of EPSILON_SVR
svm.setC(0.01)  # From paper, soft classifier 
svm.setType(cv2.ml.SVM_EPS_SVR)  # C_SVC # EPSILON_SVR # may be also NU_SVR # do regression task
svm.train(np.array(hosList), cv2.ml.ROW_SAMPLE, np.array(labels))
#Acquire hard example based on initial training results
hardNegList = getHardExamples(negImageList, svm)
hosList.extend(getHOGList(hardNegList))
print ("hosList=====", len(hosList))
[labels.append(-1) for _ in range(len(hardNegList))]
####################So far, all the features and labels of SVM are obtained (including hard example)######################
####################Adding a hard example to the actual measurement can greatly improve the accuracy of the detection#########################

#Add hard example, retrain
svm.train(np.array(hosList), cv2.ml.ROW_SAMPLE, np.array(labels))

#Save model
hog = cv2.HOGDescriptor()
hog.setSVMDetector(getHOGDetector(svm))
hog.save('myHogDector.bin')

#Pedestrian detection
#Reloading is not strictly necessary in the same script (hog already holds the detector), but is shown here for completeness
hog = cv2.HOGDescriptor()
hog.load('myHogDector.bin')
image = cv2.imread("1.jpg")
cv2.imshow("image", image)
cv2.waitKey(0)
rects, scores = hog.detectMultiScale(image, winStride=(4, 4),padding=(8, 8), scale=1.05)

#fastNonMaxSuppression first argument: convert boxes from (x, y, w, h) to (x1, y1, x2, y2)
for i in range(len(rects)):
    r = rects[i]
    rects[i][2] = r[0] + r[2]
    rects[i][3] = r[1] + r[3]

#fastNonMaxSuppression second argument: one confidence score per detection
sc = [score[0] for score in scores]
sc = np.array(sc)

print('rects_len', len(rects))
pick = fastNonMaxSuppression(rects, sc, overlapThresh=0.3)
print('pick_len =', len(pick))


#Draw the final detections, then display the result once
for (x, y, xx, yy) in pick:
    print(x, y, xx, yy)
    cv2.rectangle(image, (int(x), int(y)), (int(xx), int(yy)), (0, 0, 255), 2)
cv2.imshow('detections', image)
cv2.waitKey(0)