# Table of contents
1. [Introduction](#introduction)
2. [Aggregate Model Evaluation](#modelevaluation)
    1. [Loading the dataset](#modeload)
    2. [Perform detections](#modeldetect)
    3. [Evaluate detections against ground truth](#modeldetectionseval)
    4. [Calculate results and plot them](#modelshowresults)
    5. [View dataset in fiftyone](#modelfiftyonesession)

## Introduction <a name="introduction"></a>

This notebook loads the test dataset in YOLOv5 format from disk and evaluates the model's performance.

In [1]:
import fiftyone as fo
from PIL import Image
from detection import detect
from detection import detect_yolo_only

## Aggregate Model Evaluation <a name="modelevaluation"></a>

First, load the dataset from the directory containing the images and the labels in YOLOv5 format.

### Loading the dataset <a name="modeload"></a>

In [None]:
name = "dataset"
dataset_dir = "dataset"

# The splits to load
splits = ["val"]

# Load the dataset, using tags to mark the samples in each split
dataset = fo.Dataset(name)
for split in splits:
    dataset.add_dir(
        dataset_dir=dataset_dir,
        dataset_type=fo.types.YOLOv5Dataset,
        split=split,
        tags=split,
    )

dataset.persistent = True
classes = dataset.default_classes

If a dataset was already saved under the same name, load it by name instead of importing it from disk again.

In [3]:
dataset = fo.load_dataset('dataset')
classes = dataset.default_classes

### Perform detections <a name="modeldetect"></a>

Now we can call the aggregate model to run detections on the images in the dataset. The actual detection happens at line 6, where `detect()` is called. This function currently runs inference on the GPU via `onnxruntime-gpu`. All detections are saved to the `predictions` field of each sample. A sample is one image with potentially multiple detections.

> **_NOTE:_** If the dataset already existed beforehand (you used `load_dataset()`), the detections are likely already saved in the dataset and you can skip the next step.

In [4]:
# Do detections with model and save bounding boxes
with fo.ProgressBar() as pb:
    for sample in pb(dataset.view()):
        with Image.open(sample.filepath) as image:  # closes the file handle once the size is read
            w, h = image.size
        pred = detect(sample.filepath, '../weights/yolo.onnx', '../weights/resnet.onnx')

        detections = []
        for _, row in pred.iterrows():
            xmin, xmax = int(row['xmin']), int(row['xmax'])
            ymin, ymax = int(row['ymin']), int(row['ymax'])
            rel_box = [
                xmin / w, ymin / h, (xmax - xmin) / w, (ymax - ymin) / h
            ]
            detections.append(
                fo.Detection(label=classes[int(row['cls'])],
                             bounding_box=rel_box,
                             confidence=int(row['cls_conf'])))

        sample["predictions"] = fo.Detections(detections=detections)
        sample.save()

 100% |█████████████████| 640/640 [6.2m elapsed, 0s remaining, 2.1 samples/s]      
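The box conversion inside the loop above can be isolated into a small helper. This is a minimal sketch, not part of the notebook's `detection` module: the name `to_relative_box` and the example pixel values are hypothetical, and it only assumes what the cell already does, namely that boxes are stored as `[x, y, width, height]` relative to the image size.

```python
def to_relative_box(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert absolute pixel corners to a relative [x, y, width, height] box.

    Hypothetical helper mirroring the arithmetic in the detection loop:
    the top-left corner and the box size are divided by the image size.
    """
    if img_w <= 0 or img_h <= 0:
        raise ValueError("image dimensions must be positive")
    return [
        xmin / img_w,           # left edge as a fraction of image width
        ymin / img_h,           # top edge as a fraction of image height
        (xmax - xmin) / img_w,  # box width as a fraction of image width
        (ymax - ymin) / img_h,  # box height as a fraction of image height
    ]


# Illustrative values only: a 200x200 px box in a 400x500 px image
rel_box = to_relative_box(100, 50, 300, 250, img_w=400, img_h=500)
print(rel_box)  # [0.25, 0.1, 0.5, 0.4]
```

Keeping the conversion in one function makes it easy to check in isolation before running the full detection pass over the dataset.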


### Evaluate detections against ground truth <a name="modeldetectionseval"></a>

Having saved the predictions, we can now evaluate them by cross-checking with the ground truth labels. If we specify an `eval_key`, true positives, false positives and false negatives will be saved under that key.

In [5]:
results = dataset.view().evaluate_detections(
    "predictions",
    gt_field="ground_truth",
    eval_key="eval",
    compute_mAP=True,
)

Evaluating detections...
 100% |█████████████████| 640/640 [2.1s elapsed, 0s remaining, 305.3 samples/s]      
Performing IoU sweep...
 100% |█████████████████| 640/640 [2.3s elapsed, 0s remaining, 274.2 samples/s]      
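The evaluation decides whether a prediction matches a ground truth box based on their intersection over union (IoU). The sketch below shows how IoU can be computed for two boxes in the relative `[x, y, width, height]` format used above. It illustrates the metric only and is not fiftyone's internal matching code; the function name `iou` and the example boxes are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x, y, width, height] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b

    # Width and height of the overlapping region (zero if the boxes are disjoint)
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h

    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


# Illustrative boxes: the second is the first one shifted by a quarter of the image
prediction = [0.0, 0.0, 0.5, 0.5]
ground_truth = [0.25, 0.25, 0.5, 0.5]
print(f"IoU = {iou(prediction, ground_truth):.3f}")
```

A prediction with a high IoU against a ground truth box of the same class would count as a true positive, an unmatched prediction as a false positive, and an unmatched ground truth box as a false negative.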


### Calculate results and plot them <a name="modelshowresults"></a>

The `results` variable now holds the model's performance, and we can extract various metrics from it. Here we print a simple report of the precision and recall values of all classes, as well as the mAP computed with the metric employed by [COCO](https://cocodataset.org/#detection-eval). Next, we plot a confusion matrix over all classes. Finally, we show the precision vs. recall curves for a specified IoU threshold.

In [6]:
# Print a classification report for all classes
results.print_report()

print(results.mAP())

# Plot confusion matrix
matrix = results.plot_confusion_matrix(classes=classes)
matrix.show()

pr_curves = results.plot_pr_curves(classes=classes, iou_thresh=0.95)
pr_curves.show()

              precision    recall  f1-score   support

     Healthy       0.82      0.74      0.78       662
    Stressed       0.71      0.78      0.74       488

   micro avg       0.77      0.76      0.76      1150
   macro avg       0.77      0.76      0.76      1150
weighted avg       0.77      0.76      0.77      1150

0.6225848121901868






[Figure: confusion matrix heatmap (interactive plot output)]

[Figure: precision-recall curves per class (interactive plot output)]
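The precision, recall and f1-score columns of the report follow from the true positive, false positive and false negative counts of each class. The following is a minimal sketch of those formulas, with a hypothetical function name and made-up counts that are not taken from the evaluation above.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from detection counts.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Made-up counts for illustration: 6 correct detections, 2 spurious, 4 missed
p, r, f1 = precision_recall_f1(tp=6, fp=2, fn=4)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Precision drops as spurious detections increase, recall drops as objects are missed, and F1 is the harmonic mean of the two.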

### View dataset in fiftyone <a name="modelfiftyonesession"></a>

We can launch a fiftyone session in a new tab to explore the dataset and the results.

In [7]:
session = fo.launch_app(dataset, auto=False)
session.view = dataset.view()
session.plots.attach(matrix)
session.open_tab()

Session launched. Run `session.show()` to open the App in a cell output.



In [8]:
session.close()