# Table of contents
1. [Introduction](#introduction)
2. [Aggregate Model Evaluation](#modelevaluation)
    1. [Loading the dataset](#modeload)
    2. [Perform detections](#modeldetect)
    3. [Save detections](#modeldetectionssave)
    4. [Evaluate detections against ground truth](#modeldetectionseval)
    5. [Calculate results and plot them](#modelshowresults)
    6. [View dataset in fiftyone](#modelfiftyonesession)
3. [YOLO Model Evaluation](#yoloevaluation)
    1. [Load OIDv6](#yololoadoid)
    2. [Export dataset for conversion](#yoloexportoid)
    3. [Merge labels into one](#yolomergelabels)
    4. [Load YOLOv5 dataset](#yololoadv5)
    5. [Perform detections](#yoloperformdetections)
    6. [Evaluate detections against ground truth](#yolodetectionseval)
    7. [Calculate results and plot them](#yoloshowresults)
    8. [View dataset in fiftyone](#yolofiftyonesession)

## Introduction <a name="introduction"></a>

This notebook loads the test dataset in YOLOv5 format from disk and evaluates the model's performance.

In [1]:
import fiftyone as fo
from PIL import Image
from detection import detect
from detection import detect_yolo_only

## Aggregate Model Evaluation <a name="modelevaluation"></a>

First, load the dataset from the directory containing the images and the labels in YOLOv5 format.

### Loading the dataset <a name="modeload"></a>

In [2]:
name = "dataset"
dataset_dir = "dataset"

# The splits to load
splits = ["val"]

# Load the dataset, using tags to mark the samples in each split
dataset = fo.Dataset(name)
for split in splits:
    dataset.add_dir(
        dataset_dir=dataset_dir,
        dataset_type=fo.types.YOLOv5Dataset,
        split=split,
        tags=split,
    )

classes = dataset.default_classes

ValueError: Dataset 'dataset' already exists; use `fiftyone.load_dataset()` to load an existing dataset

If a dataset with this name already exists because it was saved during an earlier run, creating it again raises the `ValueError` shown above. In that case, load the existing dataset from fiftyone's database instead.

In [3]:
dataset = fo.load_dataset('dataset')
classes = dataset.default_classes

### Perform detections <a name="modeldetect"></a>

Now we can call the aggregate model to run detections on the images contained in the dataset. The actual detection happens on line 6, where `detect()` is called. This function currently runs inference on the GPU via `onnxruntime-gpu`. All detections are saved to the `predictions` field of each sample. A sample is one image with potentially multiple detections.

In [6]:
# Do detections with model and save bounding boxes
with fo.ProgressBar() as pb:
    for sample in pb(dataset.view()):
        image = Image.open(sample.filepath)
        w, h = image.size
        pred = detect(sample.filepath, '../weights/yolo.onnx', '../weights/resnet.onnx')

        detections = []
        for _, row in pred.iterrows():
            xmin, xmax = int(row['xmin']), int(row['xmax'])
            ymin, ymax = int(row['ymin']), int(row['ymax'])
            rel_box = [
                xmin / w, ymin / h, (xmax - xmin) / w, (ymax - ymin) / h
            ]
            detections.append(
                fo.Detection(label=classes[int(row['cls'])],
                             bounding_box=rel_box,
                             confidence=int(row['cls_conf'])))

        sample["predictions"] = fo.Detections(detections=detections)
        sample.save()

 100% |█████████████████| 640/640 [6.2m elapsed, 0s remaining, 2.1 samples/s]      
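
The box arithmetic inside the loop above can be isolated into a small helper. The following is a minimal sketch; the function name `to_relative_box` is hypothetical and not part of the `detection` module or fiftyone. It converts absolute pixel corners into the `[top-left-x, top-left-y, width, height]` format, relative to the image size, that is passed as `bounding_box` to `fo.Detection`.

```python
def to_relative_box(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert absolute pixel corners to [x, y, width, height] in [0, 1]."""
    if img_w <= 0 or img_h <= 0:
        raise ValueError("image dimensions must be positive")
    return [
        xmin / img_w,
        ymin / img_h,
        (xmax - xmin) / img_w,
        (ymax - ymin) / img_h,
    ]


# A 200x100 px box whose top-left corner is at (100, 50) in a 400x200 px image
box = to_relative_box(100, 50, 300, 150, img_w=400, img_h=200)
print(box)  # [0.25, 0.25, 0.5, 0.5]
```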


### Save detections <a name="modeldetectionssave"></a>

We have to make sure that the predictions for each sample are kept in the dataset across sessions. fiftyone deletes non-persistent datasets when its database shuts down, so we set the `persistent` flag (again, just to make sure) and call `dataset.save()` manually.

In [16]:
dataset.persistent = True
dataset.save()

### Evaluate detections against ground truth <a name="modeldetectionseval"></a>

Having saved the predictions, we can now evaluate them by cross-checking with the ground truth labels. If we specify an `eval_key`, true positives, false positives and false negatives will be saved under that key.

In [4]:
results = dataset.view().evaluate_detections(
    "predictions",
    gt_field="ground_truth",
    eval_key="eval",
    compute_mAP=True,
)

Evaluating detections...
 100% |█████████████████| 640/640 [2.1s elapsed, 0s remaining, 307.0 samples/s]      
Performing IoU sweep...
 100% |█████████████████| 640/640 [2.3s elapsed, 0s remaining, 280.1 samples/s]      
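
To make the matching criterion concrete: a prediction is matched to a ground truth box by their intersection over union (IoU), and it only counts as a true positive if the IoU reaches a threshold (0.50 by default in fiftyone's COCO-style evaluation, to my knowledge). The "IoU sweep" in the output above repeats the matching for the COCO thresholds 0.50 to 0.95 in steps of 0.05 to obtain the mAP. The sketch below only illustrates the idea and is not fiftyone's implementation; the function name `iou` and the two example boxes are hypothetical, and boxes use the relative `[x, y, width, height]` format.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x, y, width, height] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping rectangle (0 if the boxes are disjoint)
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


# IoU thresholds used by the COCO mAP metric: 0.50, 0.55, ..., 0.95
coco_thresholds = [round(0.50 + 0.05 * i, 2) for i in range(10)]

ground_truth = [0.0, 0.0, 0.5, 0.5]
prediction = [0.1, 0.0, 0.5, 0.5]  # same size, shifted slightly to the right
overlap = iou(ground_truth, prediction)
passed = [t for t in coco_thresholds if overlap >= t]
print(f"IoU = {overlap:.2f}, counts as a match at thresholds {passed}")
```

With an IoU of roughly 0.67, this prediction would be a true positive at the lower thresholds only, which is why the mAP is a stricter number than precision at a single threshold.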


### Calculate results and plot them <a name="modelshowresults"></a>

Now the performance of the model is stored in the `results` variable, and we can extract various metrics from it. Here we print a simple report with the precision and recall values of all classes, as well as the mAP as defined by the COCO metric. Next, a confusion matrix is plotted for the classes (in our case two, _Healthy_ and _Stressed_). Finally, we show the precision vs. recall curves for a specified IoU threshold.

In [5]:
# Print a classification report for all classes
results.print_report()

print(results.mAP())

# Plot confusion matrix
matrix = results.plot_confusion_matrix(classes=classes)
matrix.show()

pr_curves = results.plot_pr_curves(classes=classes, iou_thresh=0.95)
pr_curves.show()

              precision    recall  f1-score   support

     Healthy       0.82      0.74      0.78       662
    Stressed       0.71      0.78      0.74       488

   micro avg       0.77      0.76      0.76      1150
   macro avg       0.77      0.76      0.76      1150
weighted avg       0.77      0.76      0.77      1150

0.6225848121901868
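
The report values can be cross-checked by hand. The sketch below assumes a particular reading of the confusion matrix counts that appear in the plot output of this cell (an interpretation on our part, chosen because it reproduces the report): 493 and 382 true positives, 105 and 158 false positives, and 169 and 106 missed ground truth boxes for _Healthy_ and _Stressed_ respectively. The helper name `prf` is hypothetical.

```python
def prf(tp, fp, fn):
    """Precision, recall and F1 score from raw detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# (true positives, false positives, false negatives) per class, as assumed above
counts = {"Healthy": (493, 105, 169), "Stressed": (382, 158, 106)}

for label, (tp, fp, fn) in counts.items():
    p, r, f1 = prf(tp, fp, fn)
    # tp + fn is the number of ground truth boxes (the "support" column)
    print(f"{label:>9}  {p:.2f}  {r:.2f}  {f1:.2f}  {tp + fn}")

# Micro average: pool the counts of all classes before computing the metrics
micro = prf(*(sum(c[i] for c in counts.values()) for i in range(3)))
print("micro avg  " + "  ".join(f"{v:.2f}" for v in micro))
```

Under this reading there is no confusion between the two classes; all errors are either spurious detections or missed ground truth boxes.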






[Figure: confusion matrix (plotly `FigureWidget`, output truncated) for the classes Healthy and Stressed; heatmap values: [[105, 158, 0], [0, 382, 106], [493, 0, 169]]]

[Figure: precision vs. recall curves (plotly `FigureWidget`, output truncated) for the classes Healthy and Stressed at an IoU threshold of 0.95]

### View dataset in fiftyone <a name="modelfiftyonesession"></a>

We can launch a fiftyone session in a new tab to explore the dataset and the results.

In [8]:
session = fo.launch_app(dataset, auto=False)
session.view = dataset.view()
session.plots.attach(matrix)
session.open_tab()

Session launched. Run `session.show()` to open the App in a cell output.

In [6]:
# Write final dataset to disk
def export_dataset(dataset, export_dir, classes=("Healthy", "Stressed")):
    """Export the ground truth labels of the `val` split in YOLOv5 format."""
    label_field = "ground_truth"

    # The splits to export
    splits = ["val"]

    # Export each split, selecting its samples by tag
    for split in splits:
        split_view = dataset.match_tags(split)
        split_view.export(
            export_dir=export_dir,
            dataset_type=fo.types.YOLOv5Dataset,
            label_field=label_field,
            split=split,
            classes=list(classes),
        )

## YOLO Model Evaluation <a name="yoloevaluation"></a>

In this section we look at the object detection model in detail by evaluating it separately from the classification model. The object detection model was trained on the Open Images Dataset v6 on the two classes _Plant_ and _Houseplant_ which come with the dataset. 

### Load OIDv6 <a name="yololoadoid"></a>

Since we are only interested in evaluating the model, we load only the _test_ split of the dataset. The only classes of interest to us are _Plant_ and _Houseplant_, and we do not want to load other label types such as segmentation masks, which is why we specify the `label_types` parameter. This selection yields 12106 images from the test split.

In [7]:
import fiftyone as fo
import fiftyone.zoo as foz
oid = foz.load_zoo_dataset(
    "open-images-v6",
    split="test",
    classes=["Plant", "Houseplant"],
    label_types=["detections"],
    shuffle=True,
)


Downloading split 'test' to '/home/zenon/fiftyone/open-images-v6/test' if necessary
Necessary images already downloaded
Existing download of split 'test' is sufficient
Loading 'open-images-v6' split 'test'
 100% |█████████████| 12106/12106 [1.0m elapsed, 0s remaining, 161.8 samples/s]       
Dataset 'open-images-v6-test' created


### Export dataset for conversion <a name="yoloexportoid"></a>

Unfortunately, the OID dataset does not adhere to the YOLOv5 label format understood by the object detection model. That is why we export the dataset as a `YOLOv5Dataset` using fiftyone's converter. The target directory will contain the proper folder structure as well as a `.yaml` file pointing to the images and labels. Note that the exported files require around 4.2 GB of space.

In [9]:
import os

# The directory to which to write the exported dataset
export_dir = "/home/zenon/testdir"

# Only export if export_dir doesn't exist already
if not os.path.isdir(export_dir):
    # The name of the sample field containing the label that you wish to export
    # Used when exporting labeled datasets (e.g., classification or detection)
    label_field = "detections"  # for example

    # The type of dataset to export
    # Any subclass of `fiftyone.types.Dataset` is supported
    dataset_type = fo.types.YOLOv5Dataset  # for example

    # Export the dataset
    oid.export(
        export_dir=export_dir,
        dataset_type=dataset_type,
        label_field=label_field,
        classes=['Plant', 'Houseplant']
    )
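
For reference, each line of a YOLOv5 ground truth label file has the form `class x_center y_center width height`, with all coordinates normalized to the image size. The sketch below parses one such line and converts it into the top-left based `[x, y, width, height]` box used for the predictions earlier in this notebook. The function name `parse_yolo_line` and the sample line are made up for illustration.

```python
def parse_yolo_line(line):
    """Parse 'class x_center y_center width height' into (class_id, [x, y, w, h])."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    class_id = int(fields[0])
    x_center, y_center, width, height = map(float, fields[1:])
    # Move the reference point from the box center to its top-left corner
    return class_id, [x_center - width / 2, y_center - height / 2, width, height]


class_id, box = parse_yolo_line("1 0.5 0.5 0.5 0.25")
print(class_id, box)  # 1 [0.25, 0.375, 0.5, 0.25]
```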

### Merge labels into one <a name="yolomergelabels"></a>

The label files contain a 0 at the beginning of each line if the ground truth specifies a plant and a 1 if it specifies a houseplant. We do not care about the distinction between the two and only want to detect plants in general. That means we have to change every 1 at the beginning of a line in each label file into a 0. The YOLOv5 format requires that class indices start at 0 and not at 1, which is why 1s are changed to 0s and not vice versa. To accomplish this task, we use a simple bash script in the labels directory:
```bash
for file in test/*
do
    sed -i 's/^./0/' "$file"
done
```
This script calls `sed` to change the first character of each line in each file to a 0, which works because both class indices are a single digit. It performs the conversion in place (`-i` flag). For this script to work, the `val` directories inside `images` and `labels` must be renamed to `test` and the path to the directory changed in the `data.yaml` file. I believe `val` is the wrong name for a test dataset and it should be named accordingly.
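
The same merge can also be done from Python if no shell is at hand. This is a sketch rather than the method used for this notebook; the function name `merge_classes` is hypothetical, and the path in the usage comment is an assumption based on the export directory above. Unlike the `sed` command, it rewrites the whole first field, so it would also cope with class indices of more than one digit.

```python
from pathlib import Path


def merge_classes(labels_dir, target_class=0):
    """Set the class index (first field) of every label line to target_class.

    Returns the number of lines whose class index was changed.
    """
    changed = 0
    for label_file in sorted(Path(labels_dir).glob("*.txt")):
        new_lines = []
        for line in label_file.read_text().splitlines():
            fields = line.split()
            if not fields:
                continue  # drop empty lines
            if fields[0] != str(target_class):
                changed += 1
            fields[0] = str(target_class)
            new_lines.append(" ".join(fields))
        # Write the file back in place, keeping a trailing newline if non-empty
        label_file.write_text("\n".join(new_lines) + ("\n" if new_lines else ""))
    return changed


# Usage (the path is an assumption):
# merge_classes("/home/zenon/testdir/labels/test")
```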

### Load YOLOv5 dataset <a name="yololoadv5"></a>

Now that the labels are in the correct format and we only have one class to deal with, we can import the dataset into the variable `yolo`.

In [19]:
yolo_dataset_dir = export_dir

# The type of the dataset being imported
dataset_type = fo.types.YOLOv5Dataset

# Import the dataset
yolo = fo.Dataset.from_dir(
    dataset_dir=yolo_dataset_dir,
    dataset_type=dataset_type,
)
yolo.name = 'yolo'
yolo.persistent = True

 100% |█████████████| 12106/12106 [14.9s elapsed, 0s remaining, 804.0 samples/s]      


If the `yolo` dataset already exists because it was saved earlier, we can simply load it from fiftyone's database.

In [28]:
yolo = fo.load_dataset('yolo')
fo.list_datasets()

['dataset', 'yolo']

### Perform detections <a name="yoloperformdetections"></a>

We can proceed as before by calling the model and saving the detections to the `predictions` field of each sample. Note that line 7 does not call `detect()` but `detect_yolo_only()`. Running the detections on all 12106 images takes around 1.2 hours on a GTX 750 Ti.

In [2]:
# Do detections with model and save bounding boxes
yolo_view = yolo.view()
with fo.ProgressBar() as pb:
    for sample in pb(yolo_view):
        image = Image.open(sample.filepath)
        w, h = image.size
        pred = detect_yolo_only(sample.filepath, '../weights/yolo.onnx')

        detections = []
        for _, row in pred.iterrows():
            xmin, xmax = int(row['xmin']), int(row['xmax'])
            ymin, ymax = int(row['ymin']), int(row['ymax'])
            rel_box = [
                xmin / w, ymin / h, (xmax - xmin) / w, (ymax - ymin) / h
            ]
            detections.append(
                fo.Detection(label='Plant',
                             bounding_box=rel_box,
                             confidence=float(row['box_conf'])))

        sample["predictions"] = fo.Detections(detections=detections)
        sample.save()

 100% |█████████████| 12106/12106 [1.2h elapsed, 0s remaining, 2.7 samples/s]      


### Evaluate detections against ground truth <a name="yolodetectionseval"></a>

Having saved the predictions, we can now evaluate them by cross-checking with the ground truth labels. If we specify an `eval_key`, true positives, false positives and false negatives will be saved under that key.

In [29]:
results = yolo.evaluate_detections("predictions", gt_field="ground_truth", eval_key="eval", compute_mAP=True)

Evaluating detections...
 100% |█████████████| 12106/12106 [42.8s elapsed, 0s remaining, 294.8 samples/s]      
Performing IoU sweep...
 100% |█████████████| 12106/12106 [43.5s elapsed, 0s remaining, 300.3 samples/s]      


### Calculate results and plot them <a name="yoloshowresults"></a>

Now the performance of the model is stored in the `results` variable, and we can extract various metrics from it. Here we print a simple report with the precision and recall values of all classes, as well as the mAP as defined by the COCO metric. Next, a confusion matrix is plotted for each class (in our case only one, _Plant_). Finally, we show the precision vs. recall curve for a specified IoU threshold.

In [30]:
# Print a classification report for all classes
results.print_report()

print(results.mAP())

# Plot confusion matrix
matrix = results.plot_confusion_matrix(classes=['Plant'])
matrix.show()

pr_curves = results.plot_pr_curves(classes=['Plant'], iou_thresh=0.9)
print(pr_curves)
pr_curves.show()

              precision    recall  f1-score   support

       Plant       0.52      0.54      0.53     22535

   micro avg       0.52      0.54      0.53     22535
   macro avg       0.52      0.54      0.53     22535
weighted avg       0.52      0.54      0.53     22535

0.3623395579880134
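
To illustrate what the precision vs. recall curve shows: detections are ranked by confidence, and precision and recall are recomputed each time the confidence threshold is lowered far enough to admit the next detection. The following is a simplified sketch of that idea, not fiftyone's implementation; the function name `pr_points` and the toy data are hypothetical.

```python
def pr_points(detections, num_ground_truth):
    """Precision/recall pairs obtained by sweeping the confidence threshold.

    detections: list of (confidence, is_true_positive) tuples.
    num_ground_truth: total number of ground truth boxes.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    points = []
    tp = fp = 0
    # Visit detections from most to least confident
    for _, is_tp in sorted(detections, key=lambda d: d[0], reverse=True):
        if is_tp:
            tp += 1
        else:
            fp += 1
        points.append((tp / (tp + fp), tp / num_ground_truth))
    return points


# Toy data: four detections, three ground truth boxes
toy = [(0.9, True), (0.8, False), (0.7, True), (0.6, True)]
for precision, recall in pr_points(toy, num_ground_truth=3):
    print(f"precision={precision:.2f} recall={recall:.2f}")
```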






[Figure: confusion matrix (plotly `FigureWidget`, output truncated) for the class Plant; heatmap values: [[11496, 0], [12237, 10298]]]

<fiftyone.core.plots.plotly.PlotlyNotebookPlot object at 0x7f509b1bcb10>






[Figure: precision vs. recall curve (plotly `FigureWidget`, output truncated) for the class Plant at an IoU threshold of 0.9]

### View dataset in fiftyone <a name="yolofiftyonesession"></a>

We can launch a fiftyone session in a new tab to explore the dataset and the results.

In [11]:
session = fo.launch_app(yolo_view, auto=False)
session.plots.attach(matrix)
session.open_tab()

Session launched. Run `session.show()` to open the App in a cell output.