
Object detection with Yolo on Raspberry Pi

It’s possible to embed object recognition models such as YOLO on a Raspberry Pi. Of course, given its limited computing power compared with a desktop computer, real-time performance is lower. However, it is perfectly possible to develop algorithms that use object recognition for applications that don’t require real-time detection.

Hardware

Configuration

To use YOLO, you’ll need to install the 64-bit version of Raspberry Pi OS. This version is available in the Raspberry Pi Imager software under the Raspberry Pi OS (other) menu.

Compatible Python versions are >=3.9
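You can check both points from a terminal; on a 64-bit OS, uname reports an aarch64 architecture.

uname -m
python3 --version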

The libraries to install are:

pip install numpy imutils opencv-python
pip install ultralytics
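To check that the packages are installed correctly, you can run a quick import test (the version attributes used here are standard for these libraries):

python3 -c "import cv2, numpy, imutils, ultralytics; print(cv2.__version__, ultralytics.__version__)"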

Object detection code

The code for object detection with YOLO is the same as on a computer.

The YOLO model is initialized (the .pt file is downloaded automatically at the start of the program if it is not already present).

model = YOLO("yolov5nu.pt")

We then open a video stream (or a video or image file).

vs = VideoStream(src=0, resolution=(640, 640)).start()
#video_cap = cv2.VideoCapture(0)

We then run the model on each frame.

detections = model(frame)[0]

Finally, we plot the detection results on the image and display them.
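Note that the Ultralytics results object also provides a plot() method that draws the boxes and labels for you. A minimal sketch using it, as an alternative to the manual drawing in the full listing below, would be:

detections = model(frame)[0]
annotated = detections.plot()  # returns a BGR image with boxes and labels drawn
cv2.imshow("Frame", annotated)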

#!/usr/bin/env python
# -*- coding: utf-8 -*-
#

import datetime
from ultralytics import YOLO
import cv2
from imutils.video import VideoStream

# define some constants
CONFIDENCE_THRESHOLD = 0.7
GREEN = (0, 255, 0)

model = YOLO("yolov5nu.pt")
#model = YOLO("./runs/detect/yolov8n_v8_50e2/weights/best.pt") # test trained model

print(model.names)

# initialize the video capture object
vs = VideoStream(src=0, resolution=(640, 640)).start()
#video_cap = cv2.VideoCapture(0)
#video_cap = cv2.VideoCapture("datasets\\Splash - 23011.mp4")
    
while True:
	# start time to compute the fps
	start = datetime.datetime.now()

	# read the next frame from the video stream
	frame = vs.read(); ret = frame is not None
	#ret, frame = video_cap.read()

	# if there are no more frames to process, break out of the loop
	if not ret:
		break

	# run the YOLO model on the frame
	detections = model(frame)[0]
	
	# loop over the detections
	#for data in detections.boxes.data.tolist():
	for box in detections.boxes:
		#extract the label name
		label=model.names.get(box.cls.item())
		
		# extract the confidence associated with the detection
		data=box.data.tolist()[0]
		confidence = data[4]

		# filter out weak detections
		if float(confidence) < CONFIDENCE_THRESHOLD:
			continue

		# draw the bounding box on the frame
		xmin, ymin, xmax, ymax = int(data[0]), int(data[1]), int(data[2]), int(data[3])
		cv2.rectangle(frame, (xmin, ymin) , (xmax, ymax), GREEN, 2)

		#draw confidence and label
		y = ymin - 15 if ymin - 15 > 15 else ymin + 15
		cv2.putText(frame, "{} {:.1f}%".format(label,float(confidence*100)), (xmin, y), cv2.FONT_HERSHEY_SIMPLEX, 0.5, GREEN, 2)

	# end time to compute the fps
	end = datetime.datetime.now()
	# show the time it took to process 1 frame
	total = (end - start).total_seconds()
	print(f"Time to process 1 frame: {total * 1000:.0f} milliseconds")

	# calculate the frame per second and draw it on the frame
	fps = f"FPS: {1 / total:.2f}"
	cv2.putText(frame, fps, (50, 50),
				cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 0, 255), 2)
       
	# show the frame to our screen
	cv2.imshow("Frame", frame)
	
	#press key Q to exit
	if cv2.waitKey(1) == ord("q"):
		break

#video_cap.release()
vs.stop()
cv2.destroyAllWindows()

The results of this code show object detection at a speed of around one frame per second (1 FPS, compared with roughly 10 FPS on a computer).

Improved speed

Running an object recognition model on a Raspberry Pi is a real challenge. But there are ways of improving the results.

Our baseline result, obtained with the yolov5nu model, is detection at around 1 frame per second (1 FPS).

Changing boards

If you’re looking for higher performance, it’s best to turn to single-board computers designed for artificial intelligence workloads. Among other things, these boards have processors suited to this volume of computation (e.g. the Nvidia Jetson).

Changing language

The code presented here is written in Python, but there are also lightweight implementations written in C++ that considerably increase execution speed.

Reducing the number of frames processed

You can reduce the computational load by skipping frames, i.e. running the model only every few iterations of the loop; a fuller sketch of how this fits into the main loop follows the snippet below.

if img_counter % 5 == 0:  # img_counter is initialized to 0 before the loop
    detections = model(frame)[0]
img_counter += 1
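As an illustration, here is how frame skipping could fit into the main loop above; the previous detections are kept and redrawn on skipped frames. The names SKIP and img_counter are introduced for this sketch, and model, vs, CONFIDENCE_THRESHOLD and GREEN come from the full listing above.

SKIP = 5           # run the model on every 5th frame only
img_counter = 0
detections = None

while True:
    frame = vs.read()
    if frame is None:
        break

    # run YOLO only on every SKIP-th frame, reuse the last result otherwise
    if img_counter % SKIP == 0:
        detections = model(frame)[0]
    img_counter += 1

    # draw the most recent detections, even on skipped frames
    if detections is not None:
        for box in detections.boxes:
            data = box.data.tolist()[0]
            if data[4] < CONFIDENCE_THRESHOLD:
                continue
            xmin, ymin, xmax, ymax = map(int, data[:4])
            cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), GREEN, 2)

    cv2.imshow("Frame", frame)
    if cv2.waitKey(1) == ord("q"):
        break

vs.stop()
cv2.destroyAllWindows()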

