Ep#39: I love Yolo V8 – Training a model with Yolo V8

Written by:

In this post I would like to show you how to build a local dataset and train a model on it locally to detect objects (here, humans) using Yolo V8. As I explained in the video, I use the ultralytics package for Yolo V8.

To build your dataset you first need images, for example from a local Raspberry Pi camera. You can use the Python code below to grab an image from the camera stream every 5 seconds and save it to a local directory.

import cv2
import os
import time
import threading

class VideoStream:
    def __init__(self, url):
        self.cap = cv2.VideoCapture(url)
        self.ret, self.frame = self.cap.read()
        self.running = True
        self.thread = threading.Thread(target=self.update, args=())
        self.thread.start()

    def update(self):
        while self.running:
            self.ret, self.frame = self.cap.read()

    def read(self):
        return self.ret, self.frame

    def stop(self):
        self.running = False
        self.thread.join()
        self.cap.release()


# Set the directory where images will be saved
save_dir = r'C:\Users\49179\Documents\servolamp\servolamp2\Taken images'
os.makedirs(save_dir, exist_ok=True)

# Open the camera's MJPEG stream (replace the URL with your Raspberry Pi's address)
url = "http://192.168.x.x:8000/stream.mjpg"
stream = VideoStream(url)



# Set the image counter
img_counter = 898

print("Recording images every 5 seconds. Press 'q' to quit.")


while True:
        # Read a frame from the webcam
        ret, frame = stream.read()
        if not ret:
            print("Failed to grab frame")
            break
        # Display the frame (optional)
        cv2.imshow("Webcam", frame)

        # Define the image file name
        img_name = os.path.join(save_dir, f"image_{img_counter}.png")

        # Save the captured image
        cv2.imwrite(img_name, frame)
        print(f"{img_name} saved!")

        # Increment the image counter
        img_counter += 1

        # Wait up to 5 seconds before the next capture. waitKey keeps the preview
        # window responsive and returns early when a key is pressed.
        if cv2.waitKey(5000) & 0xFF == ord('q'):
            print("Quitting...")
            break



# Release the webcam and close the window
stream.stop()
cv2.destroyAllWindows()
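
The image counter in the script above is hard-coded to 898, so it has to be edited by hand before each new recording session or older images get overwritten. As a minimal sketch, assuming the files keep the image_<number>.png naming used above, the starting value could be derived from what is already in the folder. The function name next_image_counter and its parameters are my own and not part of the original script.

```python
import os
import re


def next_image_counter(save_dir, prefix="image_", ext=".png"):
    """Return one more than the highest index among files named <prefix><n><ext>.

    Returns 0 when the directory contains no matching file.
    """
    pattern = re.compile(rf"^{re.escape(prefix)}(\d+){re.escape(ext)}$")
    highest = -1
    for name in os.listdir(save_dir):
        match = pattern.match(name)
        if match:
            highest = max(highest, int(match.group(1)))
    return highest + 1
```

In the capture script this would replace the fixed value, for example img_counter = next_image_counter(save_dir).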

In the next step you need an annotation tool to label the objects. In our case the object is a human, so, as I explained in the video, you can use an online tool such as CVAT.AI for this job. After exporting the labeled dataset you get a zip file that contains three files and a folder.

The obj_train_data folder contains one text file for each labeled image.

If you open one of these text files, each line describes one labeled rectangle: first the class index, then the X and Y coordinates of the rectangle's center, then its width and height, all normalized to the image size. The leading 0 is the class, which here is human.
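
To make the label format concrete, here is a small sketch that converts one label line back into pixel coordinates. It assumes the usual YOLO text layout (class, center X, center Y, width, height, all normalized between 0 and 1). The function name and the example line are made up for illustration and are not taken from my dataset.

```python
def yolo_line_to_pixel_box(line, img_width, img_height):
    """Convert 'class x_center y_center width height' (normalized) to pixel corners."""
    parts = line.split()
    class_id = int(parts[0])
    x_c, y_c, w, h = (float(v) for v in parts[1:5])
    # The label stores the box center, so subtract or add half the size
    x_min = (x_c - w / 2) * img_width
    y_min = (y_c - h / 2) * img_height
    x_max = (x_c + w / 2) * img_width
    y_max = (y_c + h / 2) * img_height
    return class_id, x_min, y_min, x_max, y_max


# Hypothetical label line for a 640x480 image
print(yolo_line_to_pixel_box("0 0.5 0.5 0.25 0.5", 640, 480))
# (0, 240.0, 120.0, 400.0, 360.0)
```

This is handy for drawing a few boxes on your images to check that the export from the annotation tool looks right before you start training.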

To train your model you need a .yaml config file:

path: 'C:\Users\49179\Documents\servolamp\servolamp2\data' #dataset root directory
train: images\train
val: images\train


#classes
names:
   0: human
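
Note that in the config above val points to the same folder as train, so the validation numbers are computed on images the model has already seen. If you prefer a separate validation set, a minimal sketch for deciding which images go where could look like the one below. The function name, the 20% fraction and the idea of a separate images\val folder are my assumptions. The sketch only splits the file names; moving the images and their label files is left to you.

```python
import random


def split_train_val(image_names, val_fraction=0.2, seed=0):
    """Split file names into (train, val) lists in a reproducible way."""
    names = sorted(image_names)
    # A fixed seed gives the same split on every run
    random.Random(seed).shuffle(names)
    # Keep at least one validation image when the list is not empty
    n_val = max(1, int(len(names) * val_fraction)) if names else 0
    return names[n_val:], names[:n_val]


# Hypothetical file names following the naming from the capture script
train, val = split_train_val([f"image_{i}.png" for i in range(10)])
print(len(train), len(val))
```

The val entry in the config would then point to the folder that holds the validation images.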

You also need to run this code to start the training:

from ultralytics import YOLO


#load a model
model = YOLO("yolov8n.yaml")  #build a new model from scratch

#use the model
results = model.train(data="config.yaml", epochs=100)

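With epochs=100 the training takes some hours, depending on your hardware. I tested the snippet below to use the GPU instead of the CPU, but I did not see any change in my CPU usage, and I do not know why. One thing worth checking is whether torch.cuda.is_available() returns True on your machine; if it returns False, the snippet falls back to the CPU.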

import torch

# Select the device (use 0 for the first GPU or 'cpu' for the CPU)
device = 0 if torch.cuda.is_available() else 'cpu'

# Pass the device to train(); ultralytics places the model on it for training
results = model.train(data="config.yaml", epochs=100, device=device)

Last but not least, you can run the camera live and test your model on the live stream using the code below:

import os
from ultralytics import YOLO
import cv2

model_path = os.path.join(r'C:\Users\49179\PycharmProjects\face_recognitionv2', 'runs', 'detect', 'train5', 'weights', 'last.pt')
# Load a model
model = YOLO(model_path)  # load a custom model
auto_camera = "http://192.168.x.x:8000/stream.mjpg"
video_capture = cv2.VideoCapture(auto_camera)

while video_capture.isOpened():
    ret, frame = video_capture.read()  # Capture frame-by-frame from the camera
    if not ret:
        break

    # Perform inference on the frame
    results = model(frame)

    # Visualize the results (optional)
    annotated_frame = results[0].plot()  # YOLOv8 provides built-in visualization

    # Show the frame with detection
    cv2.imshow('YOLOv8 Detection', annotated_frame)

    # Press 'q' to quit the window
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the camera and close all windows
video_capture.release()
cv2.destroyAllWindows()

I hope you enjoyed this post. Please do not forget to subscribe and support me for more interesting blog posts.
