
Methods for In-Situ Fish Length Measurement with a Single Laser and Camera to Support Citizen Science

Frontmatter

Dr. Seuss

Abstract

Introduction

Preamble

Marine ecosystems provide humankind with many resources on which we depend. These ecosystems are an important source of food worldwide, and approximately 3 billion people depend on seafood as a significant source of protein \cite{WWF}. Data that can be used to evaluate the health of our oceans are therefore essential for understanding the potential impacts of overfishing and climate change.

Fish in particular are a significant part of this – fish populations are known to be in decline as a result of overfishing and climate change, which has dire consequences for both food stocks and marine ecosystems as a whole \cite{Hutchings2004,Myers2003}. To quantify these effects, we require a metric with which we can study fish health at the population level.

One such metric is the fish length distribution \cite{Heppell2012, Stock2021}. By examining such distributions over time, scientists can identify whether a population is being overfished or is recovering. There are two primary applications for these data. The first is fish population conservation – if policies are put into place to assist with the recovery of a fish population, it is in the interest of scientists to document the success of these protective measures. Our collaborators have studied a particular population of Nassau Grouper, and have done this fairly successfully by looking at annual spawning aggregations \cite{Heppell2012}. The second is fishery management. Fisheries want to extract as much catch as possible without irreparably damaging their stock, meaning that detailed, real-time information about population health is vital, especially where other data are unavailable \cite{Balde2019}.

Current Methods for Fish Length Studies

Current global fish length data often come from catch and release programs. In California, the California Collaborative Fisheries Research Program (CCFRP) is one such program, which, as of 2024, consists of over 1800 volunteer anglers. These anglers catch fish using hook-and-line procedures, measure length directly with a tape measure, and release the fish. This length information is combined with metrics like catch per unit effort to estimate the biomass of a population. However, fishing takes time – in their most recent study, gathering data at each examined location took 3 days \cite{Ziegler2024}. In addition, stress caused by barotrauma and air exposure can leave fish less able to find food or evade predators, which reduces their life expectancy \cite{Campbell2010}. In summary, these expeditions are logistically difficult to set up and are harmful to the fish being studied \cite{Ramsay2009,Campbell2010}.

Catch and release fishing also poses an issue for marine protected areas (MPAs), within which fishing may be restricted to certain species or prohibited outright. To study populations within these areas, adjacent fishing activities are sometimes used to infer the abundance of a particular species \cite{Ziegler2024}. Thus, to better understand populations within MPAs, a non-invasive method is desirable.

A non-invasive alternative is the roving diver survey, where a team of divers conducts a visual census of fish species and their lengths within the MPA. Divers sometimes use a T-bar as a length reference. However, measurements obtained purely from visual estimation are naturally imprecise – humans are only capable of estimating to within 20% of a fish’s true length \cite{Harvey2001}. Visual estimation is also a skill that must be honed, requiring retraining as frequently as once every 6 months \cite{Bell1985}.

In summary, state-of-the-art approaches require substantial resources for data collection, whether personnel, time, money, or specialized equipment. To aid in efforts to study fish populations, we aim to develop a more accessible data collection method.

Citizen Science

Our approach seeks to take advantage of efforts from the general public to gather as much data as possible. This is known as citizen science. Due to the overhead of the procedures and equipment required to study fish in particular, we know of no current methods that are accessible to the general public, a gap we hope to address with this work.

This approach to data collection has been tried and tested in other fields. Foldit\footnote{https://fold.it/} is one such example that has been used to identify new protein structures, using a video-game-like interface to get players to solve puzzles related to protein folding. Citizen scientists using Foldit have been shown to be able to recreate existing protein structures, and even provide new ways of creating proteins that professional scientists may not consider \cite{Koepnick2019}.

An example related to species identification is iNaturalist, a smartphone application that enables its users to capture images of terrestrial flora and fauna and upload them to an online platform. A community of naturalists is then able to view these images and identify the species of the subject. This community-sourced dataset is showing promise in studies of individual taxa \cite{Rosa2022}, discoveries of new species \cite{Winterton2020}, and distributions of exotic species \cite{Vendetti2018}. Because iNaturalist is a smartphone app, these data primarily consist of terrestrial life, or life that is otherwise only visible above water.

We hope both to provide additional data for fish specifically and to provide length data that can be analyzed. For such an approach to be useful to citizen scientists, it must be accessible at relatively low expense and require minimal training to use properly. Subsequent sections detail how we achieve this.

Related Technologies

<sec:existing-technologies>

For our approach to require minimal training, we turn to technology. Training recreational divers to visually estimate fish lengths is impractical, so we aim to develop a tool that they can carry underwater. Here we review the relevant technologies that currently exist.

The effort described within this work relates to our solution to the aforementioned problem – a device that we call FishSense Lite. FishSense Lite relies on a single laser as a depth reference to measure the length of a fish from a captured image, and due to the simplicity and accessibility of the hardware, it can be used by recreational divers to achieve much larger data coverage. The primary contributions of this thesis are the algorithms behind this device, i.e. how this single laser is used to measure fish length. First, however, we review the existing state of the art. A summary of these technological approaches, including FishSense Lite, can be seen in Table \ref{tab:comparison}.

Acoustic Methods

The primary related technology in this category is sonar, which involves actively sending sound waves and using the echoed signal to construct an image of the scene. As we are interested in studying particular fish populations, it is also important that we are able to identify the species of fish being imaged, a relatively recent development when it comes to acoustic methods. Technologies like Dual-frequency Identification Sonar (DIDSON) \cite{Belcher2002} and Adaptive Resolution Imaging Sonar (ARIS) \cite{Jones2021} both take advantage of many individual sonic beams and high frequencies to construct a high-fidelity sonic image of the scene. Both DIDSON and ARIS have been used to measure fish lengths \cite{Burwen2010,Cook2019} and identify fish species \cite{Langkau2012,Jones2021}. Sonar is far more effective than camera-based techniques when visibility conditions are poor. However, it is by far the most expensive option listed here. A system in 2006 was reported to have cost around 20,000 USD \cite{Mueller2006}. We were unable to find specific prices for modern imaging sonar units, though prices have ranged from tens of thousands to hundreds of thousands of US dollars.

Passive techniques – monitoring sounds that come from the fish themselves – have also more recently been used to study fish populations \cite{Fornshell2013}. These have been shown to be an effective way to study species distribution \cite{VanHoeck2021} or to record sightings of a targeted species \cite{Bolgan2023}, though we are unaware of any existing literature that attempts to estimate size using this technology.

Stereo Camera Systems

The approach that is easiest for citizen scientists to carry out is to first capture an image of the fish, using one or more cameras, then use the distance to the fish to scale its apparent size appropriately. The main problem associated with these approaches is finding the distance to the fish.

One way of doing so involves stereo video technology \cite{Mallet2014}. To measure the actual size of objects in an image, an additional camera can be added to create a stereo camera setup – provided the relationship between the two cameras is known, apparent size can be converted to actual size by leveraging the disparity in pixel location between the same features in both images. These are typically diver-operated (known as stereo diver-operated video or stereo-DOV) \cite{Goetze2019} or placed in baited remote underwater video systems \cite{Mallet2014}. While stereo-DOV is more cost-effective than deploying a remote system, the current state of the art still requires purchasing proprietary hardware and software \cite{Goetze2019}, which can be prohibitively expensive for a citizen scientist at a minimum of 4,600 USD \cite{SeaGIS} for a scientist-grade stereo video system. In addition, stereo video generates a large amount of data that requires significant effort to store and process \cite{Tueller2021}.

Commercial stereo video solutions include the AQ1 AM100 \cite{Shafait2017} and the AKVA Vicass HD \cite{Churnside2012}, intended for use in aquaculture. Such systems are also costly and require a tether to a surface-side computer with proprietary software that must be used to manage the system. This limits the regions of the world where the data can be collected as it requires scientists to interact with the system. The tether also limits the depths at which the data can be collected.

Our team has also attempted to use proprietary stereo camera units to build a FishSense “Pro” device \cite{Paxson2022, Tueller2021}. These attempts used an Intel RealSense D455 – a depth camera that uses both stereo cameras and structured light to infer depth information. This camera was paired with an NVIDIA Jetson TX2 for onboard video processing, and was fully contained in a waterproof enclosure with onboard power and storage. Since the RealSense was up against a flat acrylic port, Snell’s law distortions caused both the incoming images and the structured infrared light to refract, which yielded erroneous depth maps. On top of this, some of the infrared light reflected back into the lens, adding color artifacts to the images produced and rendering species identification impossible. Traditional stereo camera calibration procedures are also difficult, if not impossible, to perform underwater \cite{Wong2022}. We concluded that until we are able to model Snell’s law distortions, an approach using proprietary stereo cameras would be infeasible.

Laser Calipers

Another solution for length measurement uses laser calipers – two parallel lasers placed a known distance apart. When calibrated correctly, the distance between the two laser dots can be used as a reference length to measure the entire fish \cite{Rohner2011, Heppell2012}. For these measurements to be accurate, both lasers must be perfectly parallel with each other and with the camera axis. Depending on manufacturing tolerances, such a requirement may mean that lasers must be carefully selected. These systems are calibrated by measuring the distance between the two laser dots at a large distance before a dive \cite{Heppell2012}. Lengths are then calculated using the known distance between the two points and the projection of the fish onto the camera. While length estimation becomes simple using this method, the cost of two lasers and the time spent on minute readjustments waste valuable resources for researchers. The dual laser mechanism also requires that the object be larger than the separation between the two lasers, meaning fish smaller than the offset cannot be measured.
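To sketch the underlying geometry (with hypothetical symbols not used elsewhere in this work, and assuming the dots and the fish lie at the same depth): if the parallel beams are separated by $s$, their dots appear $m$ pixels apart in the image, and the fish spans $n$ pixels, then similar triangles give \[ L = \frac{n}{m} s. \]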

This system has been used by our collaborators at the Reef Environmental Education Foundation (REEF) to document the recovery of a population of Nassau Grouper over 7 years, in order to quantify the effectiveness of recovery efforts \cite{Heppell2012, Stock2021}. While such a system was demonstrated to be more effective than visual estimation, it requires manufacturing a custom-machined aluminum mount, which is not easily accessible for most divers. Calibration of the system also involves verifying that the beams remain the same distance apart up to 15m from the source, which can be challenging in a field setting.

Laser Rangefinding

Our system uses only a single laser to measure distance, which removes the need to calibrate two lasers simultaneously and keep them parallel. This technique is similar to a light-projection-based triangulation rangefinder \cite{Parthasarathy1982}, as it uses spatial information about the laser dot to determine the depth of the subject. This method can be extremely accurate with the right combination of laser and image sensor – up to 10 micrometers \cite{Ebrahim2015, Cavedo2016}. Such sensors have been explored as a low-cost solution for robot localization \cite{Nguyen1995}, quality assurance in manufacturing \cite{Cavedo2016}, and 3D scanning \cite{Baba2001}.

Single laser range finding also has precedent in animal size studies \cite{Jaquet2006,Monkman2019,Breuer2007}. The primary benefit of this approach is that it is less expensive than other solutions and requires less training to operate \cite{Monkman2019}. Jaquet and Breuer et al. utilize a range finder as a separate module from a regular digital camera \cite{Jaquet2006,Breuer2007}. Data from both modules must be combined and processed manually to obtain lengths \cite{Jaquet2006, Monkman2019}, which our platform improves upon.

| Technique          | Est. Cost (USD)        | Max. Relative Error           | Ease of Use  | Range                    |
|--------------------+------------------------+-------------------------------+--------------+--------------------------|
| Acoustic Methods   | 20k \cite{Mueller2006} | 1.1% - 35.2% \cite{Mueller2006} | Hard         | 1-16m \cite{Mueller2006} |
| Stereo Video       | 4,600 \cite{SeaGIS}    | 2.5% \cite{Harvey2001}        | Intermediate | 2-10m \cite{Mallet2014}  |
| Laser Caliper      | 600 \cite{BERGERON2007} | 12% \cite{Stock2021}          | Intermediate | 2-5m \cite{Stock2021}    |
| Laser Rangefinding | 1,200                  | 15%                           | Easy         | 2-5m                     |

Outline

The rest of this work will be focused on the algorithms behind the length extraction for FishSense Lite, specifically how the laser parameters are determined, how these are used to calculate the coordinates of the laser dot in physical space, and how that in turn is used to get the fish length. Chapter sec:system-overview will give a brief overview of FishSense Lite as a whole. Chapter sec:algorithms will provide an in-depth description of the laser algorithms. Chapter sec:testing will show how we have tested our system and where sources of error come from. Finally, Chapter sec:conclusion will summarize the content of this work and future work that needs to be done on the system.

System Overview

<sec:system-overview>

Preamble

This chapter contains an overview of FishSense Lite. Our primary goals with this system are to reduce the cost to gather fish length data, and to make the practice more accessible to recreational divers. Our system relies on the diver taking images with the laser beam pointed on the fish. With just the image, as well as a priori knowledge of the laser beam parameters, we can estimate the distance between the camera and the fish, and in turn use this to get the length of the fish.

Accuracy Bounds

The primary method that our system aims to improve upon is human visual estimation. While humans can collectively estimate the mean length of a population quite accurately, the precision with which this is done can vary wildly between divers \cite{Harvey2002}. Within the study conducted by Harvey et al., divers could be incorrect about the length of a fish by up to 20%. As our device is meant to supplement a similar experimental setup, we aim to perform at least as well as this.

Hardware Setup

Overview

This is the physical device that divers use to take measurements. It is intended to be composed of materials that many divers already own, which is why we use off-the-shelf parts. The only new component is a 3D-printed mount. Figure \ref{fig:fsl} shows one device we have built and tested.

images/fishsense-lite-system.jpg

Camera

For our own testing, we use an Olympus TG-6, a camera commonly owned by divers for underwater photography. The camera itself is waterproof, though we use an additional protective housing around it.

This is just one example of a camera that can be used for this system – theoretically any camera would work.

Wide Angle Lens

<sec:backscatter>

With just the waterproof housing, Snell’s law distorts images in such a way that a relationship between lengths in the image and lengths in the real world is difficult to obtain \cite{Agrawal2012}.

To mitigate these effects, we use a corrective optic, which allows us to use the simple camera model described in Section sec:pinhole. Developed by Backscatter, this corrective optic is designed to attach to the Olympus’s underwater housing, and is typically used to widen the field of view for underwater photographers. Figure \ref{fig:checkerboard-comparison} shows a checkered pattern both with and without this corrective optic installed – note that without the optic, distortion causes lines to curve, which makes it more difficult to measure lengths. Future work is required to model these Snell’s law distortions well enough that the optic can ultimately be removed.

images/checkerboard-comparison.JPG


Laser and Mount

<sec:laser-mount> The laser pointer is another off-the-shelf component. Many divers own laser pointers in order to gesture to others, which further reduces the cost of the overall hardware. This laser is mounted to the camera’s housing with a 3D-printed polylactic acid (PLA) mount. Due to manufacturing defects and small perturbations that may occur in transit, the laser moves enough to cause significant variability in the measurements. Thus a calibration algorithm is required to determine initial parameters relating to the exact geometry of the system, described in more detail in Section sec:laser-calibration. The color of the laser also has an effect, discussed in Section sec:laser-comparison.

Operation

Overview

This section details how the device would be used in the field.

Calibration Procedures

Camera Calibration

<sec:camera-calibration>

To correlate apparent lengths to actual lengths, we use a simple camera model, explained in more detail in Section sec:pinhole. This model is only an approximation of the camera’s actual behavior, as the lens dynamics are complex and cannot be easily correlated with the model parameters. Therefore, the parameters of the model must be calibrated. This is done using Zhang’s procedure \cite{Zhang2000}, where many images of a checkerboard are taken, and parameters are found such that the known dimensions and regularity of the checkerboard are satisfied. This procedure need only be performed once during the device’s lifetime.

Currently, this procedure must be done underwater. OpenCV’s calibrateCamera function provides a straightforward way to obtain camera intrinsics and lens distortion parameters.
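As a minimal sketch of this procedure (the directory path and checkerboard dimensions below are hypothetical and must match the actual target):

#+BEGIN_SRC python
import glob
import cv2
import numpy as np

PATTERN = (9, 6)       # inner corners per row and column (example values)
SQUARE_SIZE = 0.025    # checkerboard square side length in meters

# 3D corner coordinates in the checkerboard's own frame (z = 0).
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE_SIZE

obj_points, img_points = [], []
for path in glob.glob("calibration_images/*.png"):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# Zhang's procedure: recover the intrinsic matrix K and distortion coefficients.
rms, K, dist, _, _ = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("RMS reprojection error (px):", rms)
#+END_SRC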

Laser Calibration

As discussed in Section sec:laser-mount, since our laser mount is not stable enough to guarantee that laser parameters stay consistent between dives, we require a laser calibration at the beginning of every dive. More detail about this procedure can be found in Section sec:slate-calibration.

Diver Operation

In the field, after calibrating the laser, operating the camera is fairly straightforward. The diver will take a picture of nearby fish as normal, with the restriction that the laser dot must be trained on the fish. Note that we expect the diver to ensure that two conditions are met:

  1. The head and tail points of the fish are visible in the image.
  2. The laser is both present and visible on the fish.

Excerpts from the field manual are provided in Appendix sec:user-manual.

Data Offloading

After the data have been collected, the operator must offload them to be processed externally. The processing software is still in development; the process is described in the next section.

Software

Preamble

Once the data have been collected, they must be processed to extract the lengths of the captured fish. A summary of the process is as follows. The locations of both the fish and the laser dot in the image are currently identified manually; we are developing techniques to do this automatically using machine learning. The laser’s location in 3D space is determined from the location of the laser dot in the image, along with prior knowledge of the laser’s position with respect to the camera. The fish segmentation mask is used to create a polygon of the fish’s outline, after which PCA is performed to identify the axis of symmetry of the fish. The intersections of this axis with the outline are used to find the head and tail of the fish, and the depth obtained from the laser dot is used to determine the locations of the head and tail points in 3D space, as described in more detail in Section sec:distance-finding. The distance between these points is used to calculate the length of the fish. We plan to create a web service that handles all of this processing, and intend for citizen scientists to upload their data to it.

A flowchart of the processing pipeline is included in Figure \ref{fig:software-flowchart}. Subsequent subsections will go into more detail regarding the individual stages of this pipeline.

images/software_flowchart.pdf

Raw Image Processing

Anecdotally, we have observed that the camera performs some distortion correction internally, and as such, raw images appear to be more distorted than the JPEGs that the camera generates. In order to ensure that we have control over all aspects of the image processing pipeline, we convert the raw camera images to a PNG instead of working with JPEGs processed by the camera.

To convert from raw images to PNG, we must first apply a debayering algorithm. Debayering is the process of using the Bayer pattern, shown in Figure \ref{fig:bayer-pattern}, to approximate each pixel’s “true” color. Pixels on an image sensor cannot inherently detect color, so colored filters are placed in front of digital camera sensors. The raw image therefore consists of cells that only detect red, green, or blue light, and algorithms must be employed to approximate the actual color of each pixel.

images/bayer-pattern.png
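As a hedged sketch, OpenCV can perform this interpolation in one call; the file name, resolution, and Bayer layout below are hypothetical, and the pattern constant must match the actual sensor:

#+BEGIN_SRC python
import cv2
import numpy as np

# Hypothetical 16-bit raw mosaic: one color sample per pixel.
raw = np.fromfile("frame.raw", dtype=np.uint16).reshape(3000, 4000)

# Demosaic: interpolate a full RGB value at every pixel. The constant
# (BayerRG, BayerBG, ...) encodes the sensor's color filter layout.
rgb = cv2.cvtColor(raw, cv2.COLOR_BayerRG2BGR)
#+END_SRC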

After the image is debayered, we must apply color correction to ensure that the fish being photographed are still identifiable. Typically, this requires a known “white point”, which may change in environments with different lighting.

To circumvent this issue, we use the grey world assumption \cite{Buchsbaum1980}. This assumes that the color of the image can be adjusted by scaling each color by a uniform factor, for which we use the average of each channel, as described in the transformation below: \begin{equation*} (α R, β G, γ B) → \left( \frac{ α R }{ \frac{α}{n} ∑_i R_i }, \frac{ β G }{ \frac{ β }{ n } ∑_i G_i } , \frac{ γ B }{ \frac{ γ }{ n } ∑_i B_i } \right) \end{equation*}
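A minimal NumPy sketch of this correction follows (the function name is ours, and an 8-bit RGB image is assumed):

#+BEGIN_SRC python
import numpy as np

def gray_world(img: np.ndarray) -> np.ndarray:
    """Gray-world white balance: scale each channel so its mean equals
    the global mean intensity (equivalent, up to an overall brightness
    factor, to dividing each channel by its own average as above)."""
    img = img.astype(np.float64)
    channel_means = img.reshape(-1, 3).mean(axis=0)  # per-channel averages
    gains = channel_means.mean() / channel_means     # uniform per-channel scale
    return np.clip(img * gains, 0, 255).astype(np.uint8)
#+END_SRC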

A comparison with this form of color correction on a sample image is displayed in Figure \ref{fig:jpg-vs-png}.

images/jpg-vs-png.JPG

Note that within this work, unless stated otherwise, the JPEG images are displayed instead of the PNGs, as they are generally more visually pleasing. However, the PNGs are what our algorithms operate on.

Finding the Laser Dot

To identify where the laser dot is in the image, we usually hand-label it. Automatic approaches are still under development. The naive approach of simply picking the brightest spot in the image does not always work: the laser beam may attenuate so much underwater that the laser dot is no longer the brightest spot in the image, or sunlight may reflect off a bubble into the camera. Our current attempt relies on a convolutional neural network (CNN) trained on $20×20$ tiles within the image. When searching for the laser dot, the entire image is broken up into tiles, and the tile with the highest probability of containing the laser dot is selected.

We can further constrain the tiles we need to feed into the network by relying on the laser parameters to select a smaller region of the image. Since we know the laser is restricted to a single line in 3D space, it is also restricted to a single line in the image.
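As a sketch of this constraint (function and parameter names are ours, assuming calibrated laser parameters and a 3×3 intrinsic matrix $K$ from camera calibration):

#+BEGIN_SRC python
import numpy as np

def laser_search_line(l, d, K, t_range=(0.3, 10.0), n=200):
    """Project sample points of the 3D laser ray p(t) = l + t*d into pixel
    coordinates, yielding the image-space line along which to search for
    the dot. l, d: laser origin and unit direction in the optical frame."""
    ts = np.linspace(*t_range, n)
    pts = l[None, :] + ts[:, None] * d[None, :]  # points along the beam
    uvw = (K @ pts.T).T                          # pinhole projection
    return uvw[:, :2] / uvw[:, 2:3]              # normalize to pixel coords
#+END_SRC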

Finding Fish Head and Tail Points

These points are also usually hand-labeled, and all data presented within this work are hand-labeled. For automatic detection, we currently rely on an open source network called Fishial \cite{Fishial2023} to segment the fish. This is a Mask R-CNN that both identifies the species of the fish and draws a polygon surrounding the fish’s outline. Once we have the outline, we perform Principal Component Analysis (PCA) to identify the line along which the outline points vary the most, which is usually along the body of the fish. The intersections of this line with the fish outline are chosen as the head and tail points.
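A minimal sketch of the PCA step follows (the function name is ours; as a simplification, it takes the outline points at the extremes of the principal axis rather than intersecting the axis with the polygon):

#+BEGIN_SRC python
import numpy as np

def head_tail_from_outline(outline: np.ndarray):
    """Estimate head/tail pixels from a fish outline polygon of shape (N, 2)."""
    centered = outline - outline.mean(axis=0)
    # Principal axis = eigenvector of the covariance with largest eigenvalue.
    cov = centered.T @ centered / len(outline)
    eigvals, eigvecs = np.linalg.eigh(cov)
    axis = eigvecs[:, np.argmax(eigvals)]
    proj = centered @ axis                # signed position along the axis
    return outline[np.argmin(proj)], outline[np.argmax(proj)]
#+END_SRC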

Thus far we have described the system setup. The next chapter will describe the main contribution of this work – the algorithms underlying distance estimation that utilize the laser.

Acknowledgements

The author would like to acknowledge the work of Christopher Crutchfield, Vivaswat Suresh, Ana Perez, Hamish Grant and Kyle Tran for assisting in the development of the software processing pipeline.

This chapter contains material that is being prepared for publication. Christopher L. Crutchfield, Kyle S. Hu, Vivaswat Suresh, Ana I. Perez, Avik Ghosh, Hamish J. Grant, Kyle Tran, Samantha Prestrelski, Ronan Wallace, Nathan Hui, Jack Elstner, Dylan Heppell, Jennifer Loch, Alli Candelmo, Brice X. Semmens, Curt Schurgers, and Ryan Kastner. The thesis author was one of the primary investigators and authors of this material.

Algorithm Details

<sec:algorithms>

Preamble

The specific contributions detailed in this work are the algorithms relating to the laser pointer – both how the parameters of the laser with respect to the camera are determined, and, given those parameters, how the location of the intersection of the laser and the fish can be determined.

Pinhole Camera Model

<sec:pinhole>

All of the calculations in this chapter assume that our camera follows the pinhole camera assumption. In air, this assumption is extremely common and is used in stereo cameras for object triangulation.

The pinhole camera model assumes that all light that comes into the camera passes through a single point known as the focal point, or optical center, of the camera. All incoming rays fall onto the image sensor, which is a plane perpendicular to the camera’s axis, at a fixed distance from the focal point known as the focal length. These parameters must be calibrated for, and we follow the procedure described in Section sec:camera-calibration.

The image projected onto the sensor is a flipped version of the real-life scene, as the light rays are inverted after passing through the optical center. Instead of working with the image on the image sensor, it is conventional to mirror the image sensor plane in front of the optical center, such that the distance from the focal point to this new plane is also the focal length. This plane is designated the image plane. This convention allows us to more easily translate from pixel coordinates to coordinates in the image plane, as we need only know the spacing between individual pixels on the sensor to map to physical units, without needing to flip the image. A pictorial representation of this is shown in Figure \ref{fig:pinhole-general}.

images/pinhole-explanation.png

Underwater, it is more difficult to assume that imaging systems follow the pinhole camera model. An image taken with a camera in an underwater housing is distorted, as the flat port of the housing causes incoming light to refract. We therefore use the corrective optic mentioned in Section sec:backscatter, which allows the pinhole camera assumption to hold.

Quantities and Conventions

Our axis conventions in this chapter are as follows: the $x$ axis points to the right in the image, the $y$ axis points downward in the image, and the $z$ axis points forward, away from the camera. The $z$ axis is commonly known as the optical axis or camera axis. This system has the added bonus that the $x$ and $y$ axes coincide with the directions of pixel coordinates. The origin will be defined at the camera’s optical center, where all light rays intersect. This coordinate system is commonly known as the “optical frame”, and any mention of 3D coordinates in this work will be within this coordinate system.

We will make use of the following quantities: $n$ represents the length of the fish in pixels, $w$ represents the “pixel pitch” (distance between two pixels on the image sensor), and $f$ represents the focal length of the camera. A diagram of the image sensor is shown in Figure \ref{fig:image-sensor}. Note that the product $nw$ can be interpreted as the length of the fish on the image plane. Figure \ref{fig:pinhole-fish} shows a diagram of both the live fish and its projection onto the image plane.

images/image-sensor.png

images/pinhole-fish.png

If the apparent length of the fish is known, using similar triangles we are able to obtain the true length using the following relationship: \begin{equation} L = \frac{Dnw}{f} \label{eq:length-from-depth} \end{equation} Our only unknown on the right hand side is $D$, the distance of the fish from the camera. The key part of our system is the mechanism with which we obtain this $D$.
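As a hypothetical worked example of Equation \ref{eq:length-from-depth} (the numbers are illustrative, not measured from our hardware): with focal length $f = 4.5$mm, pixel pitch $w = 1.5$µm, and a fish spanning $n = 500$ pixels at distance $D = 2$m, \[ L = \frac{(2)(500)(1.5×10^{-6})}{4.5×10^{-3}} ≈ 0.33\text{m}. \]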

Note that Equation \ref{eq:length-from-depth} makes the following assumptions:

  1. The fish is parallel with the image plane.
  2. The laser dot is at the same depth as head and tail points on the fish.

In practice, neither of these holds exactly, and the implications of this are discussed further in Chapter sec:testing. However, both tend to be reasonable assumptions for most fish.

We can define parameters $\vb{l}$ and $\vb{d}$ that describe the laser’s position and orientation with respect to the camera, where $\vb{l}$ is the 3D vector from the origin to the laser beam in the $xy$ plane, and $\vb{d}$ is a 3D vector of unit length that points in the direction of the laser beam. These are shown in Figure \ref{fig:known-quantities}. Note that our restriction that $\vb{l}$ lies purely in the $xy$ plane means that the $z$ component of $\vb{l}$ is always zero.

images/known-quantities.png

Finding the Distance to the Fish

<sec:distance-finding>

In order to find the distance between the camera and fish, we must assume that $\vb{l}$ and $\vb{d}$ are known. How these are obtained is explained in the next section.

We assume that this laser beam intersects the fish at an unknown point $\vb{p}$. As imaged by the camera, this laser dot will have known pixel coordinates $\mathfrak{p}$. Note that the origin of these pixel coordinates is not the top left corner of the image, but the center or principal point of the image, which is also found during camera calibration. The point $\mathfrak{p}$ corresponds to a point $\vb{p}_\text{image}$ in the optical frame, defined as

\[ \vb{p}_\text{image} = \begin{bmatrix} \mathfrak{p}_x \\ \mathfrak{p}_y \\ f \end{bmatrix} \]

We arbitrarily decide to scale this vector to be of unit length, defining \[ \vb{v} = \frac{\vb{p}_\text{image}}{\lVert \vb{p}_\text{image}\rVert} \]

We can identify parameters $λ_1$ and $λ_2$ such that when the laser and $\vb{v}$ are scaled out, they intersect, or more formally, \[ \vb{l} + λ_1\vb{d} = λ_2\vb{v} \]

We can refactor the above relationship to be the form

\begin{equation} \begin{bmatrix} \vb{d} & -\vb{v} \end{bmatrix} \begin{bmatrix} λ_1 \\ λ_2 \end{bmatrix} = -\vb{l} \label{eq:find-laser} \end{equation}

In practice, however, this relationship does not necessarily hold, because the laser dot as observed in the image may be in a particular pixel that does not match up perfectly with where $λ_2 \vb{v}$ intersects with the image plane. In three dimensions, this often causes the rays to come very close to each other, but not touch, meaning Equation \ref{eq:find-laser} has no solution.

We can, however, obtain $λ_1$ and $λ_2$ values that minimize the difference between the left and right hand sides – this is the least squares solution, i.e. the solution to the minimization problem \[ \text{argmin}_{\vb{x}} \lVert A\vb{x} - \vb{b} \rVert^2, \]

where $A ∈ \mathbb{R}^{m×n}$, $m > n$, $\vb{x} ∈ \mathbb{R}^n$, and $\vb{b} ∈ \mathbb{R}^m$. In our case, we define the following: \begin{align*} A &= \begin{bmatrix} \vb{d} & -\vb{v} \end{bmatrix} \\ \vb{x} &= \begin{bmatrix} λ_1 \\ λ_2 \end{bmatrix} \\ \vb{b} &= -\vb{l} \end{align*}

In general, we can solve a least squares problem with the following formula: \[ \vb{x} = (A^TA)^{-1}A^T \vb{b} \]

Using this, we can obtain a closed form solution for both $λ_1$ and $λ_2$. We only require one of them, so we use $λ_2$, which has the closed form \[ λ_2 = \frac{-(\vb{d}^T\vb{l})(\vb{d}^T\vb{v}) + \vb{v}^T \vb{l}}{1 - (\vb{d}^T\vb{v})^2} \]

A derivation of the above is included in Appendix sec:scale-factor.

Obtaining $λ_1$ or $λ_2$ allows us to obtain $\vb{p}$, and the z-component of $\vb{p}$, along with Equation \ref{eq:length-from-depth}, gives us the corresponding fish length.
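The following is a minimal NumPy sketch of this computation (function and variable names are ours; it assumes pixel coordinates are already expressed relative to the principal point, and uses a generic least-squares solver in place of the closed form above):

#+BEGIN_SRC python
import numpy as np

def laser_dot_depth(pix, f, w, l, d):
    """Recover the 3D laser dot p and its depth D in the optical frame.
    pix: dot pixel coords relative to the principal point; f: focal length;
    w: pixel pitch; l, d: calibrated laser origin and unit direction."""
    p_image = np.array([pix[0] * w, pix[1] * w, f])
    v = p_image / np.linalg.norm(p_image)    # unit ray through the dot
    # Least-squares solution of [d  -v] [lam1, lam2]^T = -l.
    A = np.column_stack([d, -v])
    lams, *_ = np.linalg.lstsq(A, -l, rcond=None)
    p = lams[1] * v                          # 3D laser dot = lam2 * v
    return p, p[2]                           # the point and its depth D
#+END_SRC

The fish length then follows from the returned depth via Equation \ref{eq:length-from-depth}.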

Finding Laser Parameters

<sec:laser-calibration> So far, we have assumed that the parameters of the laser $\vb{d}$ and $\vb{l}$ are known. These cannot be measured directly for two reasons:

  1. Precise measurements relative to the optical center of the camera are hard to obtain, since in reality the location of the optical center of the camera is not known.
  2. Since the laser mount is not perfectly stable, and can change in between dives and over time, the parameters need to be recalculated.

Thus, we must use a calibration procedure. This involves capturing $k$ total images, where from the $i$-th image we obtain a 3D laser point $\vb{p}_i$. Details on how this point is obtained are discussed in Section sec:slate-calibration.

Here we describe two possible algorithms to leverage this information to determine $\vb{d}$ and $\vb{l}$. A comparison of the two methods is presented in Section sec:math-testing.

Method 1: Gauss Newton Optimization

We leverage the fact that the parameterized laser beam must intersect with the laser dot point, giving the following series of equations: \begin{equation} \vb{p}_i = \lVert \vb{p}_i - \vb{l} \rVert\vb{d} + \vb{l} \label{eq:problem} \end{equation} The above equation states that the point $\vb{p}_i$ should be recoverable by scaling the laser ray out from $\vb{l}$ by a factor equal to the distance between $\vb{p}_i$ and $\vb{l}$.

We stack these points into a single vector, defining the following: \[ P = \begin{bmatrix} \vb{p}_1 \\ \vb{p}_2 \\ \vdots \\ \vb{p}_k \end{bmatrix} \]

We then once again define a parameter vector $\vb{x}$ that contains all of our parameters: \[ \vb{x} = \begin{bmatrix} \vb{d} \\ l_x \\ l_y \end{bmatrix} \]

We also define a function in terms of our parameters for the right hand side: \begin{align*} \vb{g}_i(\vb{x}) &= \lVert \vb{p}_i - \vb{l} \rVert \vb{d} + \vb{l} \\ G(\vb{x}) &= \begin{bmatrix} \vb{g}_1(\vb{x}) \\ \vb{g}_2(\vb{x}) \\ \vdots \\ \vb{g}_k(\vb{x}) \end{bmatrix} \end{align*}

We then formulate this as the following optimization problem: \begin{align} &\text{argmin}_{\vb{x}}\lVert \vb{r}(\vb{x})\rVert, \nonumber \\ &\vb{r}(\vb{x}) = P - G(\vb{x}) \label{eq:residual} \end{align}

We have a relationship that relates our known quantities $\vb{p}_i$ and unknown quantities $\vb{d}$ and $\vb{l}$, though in this case the relationship is non-linear. We must therefore choose a non-linear optimization method to find the best candidates for $\vb{d}$ and $\vb{l}$ that satisfy this.

The method we currently choose is the Gauss-Newton method, which involves the following steps:

  1. Find the Jacobian $J_{\vb{r}}$ of the minimizing function with respect to $\vb{x}$.
  2. Take iterative steps of the following form:

\[ \vb{x}^{(k+1)} = \vb{x}^{(k)} - (J_{\vb{r}}^TJ_{\vb{r}})^{-1}J_{\vb{r}}^T \vb{r}(\vb{x}^{(k)}) \]

According to Equation \ref{eq:residual}, only the second term depends on our parameters, so, since $J_{\vb{r}} = -J_G$, we can rework our iterative step into \[ \vb{x}^{(k+1)} = \vb{x}^{(k)} + (J_G^TJ_G)^{-1}J_G^T\vb{r}(\vb{x}^{(k)}) \]

Here we set $\vb{x}^{(0)}$ to a rough estimate of where the laser is with respect to the camera, measured with a ruler. For the particular test system described in this work, the laser’s starting point is assumed to be -4cm in the $x$ direction and -11cm in the $y$ direction, with the laser parallel to the camera axis.

There are many other methods of this kind that we could have used, such as the Levenberg-Marquardt algorithm, though from our experiments this calibration method has been sufficient.

The Jacobian of $G(\vb{x})$ is given by the following, with the full derivation detailed in Appendix sec:jacobian-derivation:

\begin{align} J_G &= \begin{bmatrix} J_{G,\vb{d}}^1 & J_{G,\vb{l}}^1 \\ J_{G,\vb{d}}^2 & J_{G,\vb{l}}^2 \\ \vdots & \vdots \\ J_{G,\vb{d}}^k & J_{G,\vb{l}}^k \end{bmatrix} ∈ \mathbb{R}^{3k×5} \nonumber \\ J_{G,\vb{d}}^i &= \lVert \vb{p}_i - \vb{l}\rVert I ∈ \mathbb{R}^{3×3} \nonumber \\ J_{G,\vb{l}}^i &= \left(I_{3×3} - \vb{d}\frac{(\vb{p}_i - \vb{l})^T}{\lVert \vb{p}_i - \vb{l}\rVert }\right) \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} ∈ \mathbb{R}^{3×2} \end{align}
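A compact NumPy sketch of this method under the definitions above follows (the function name and iteration count are ours; as a pragmatic extra not part of the derivation, $\vb{d}$ is renormalized to unit length after each step):

#+BEGIN_SRC python
import numpy as np

def calibrate_laser_gauss_newton(points, x0, iters=50):
    """Recover laser direction d and origin (l_x, l_y) from 3D laser-dot
    points. points: (k, 3) array of p_i; x0: initial guess [d, l_x, l_y]."""
    x = np.asarray(x0, dtype=float)
    E = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])  # selects (l_x, l_y)
    for _ in range(iters):
        d, l = x[:3], np.array([x[3], x[4], 0.0])
        diffs = points - l                         # p_i - l
        norms = np.linalg.norm(diffs, axis=1)      # ||p_i - l||
        # Residual r(x) = P - G(x), stacked over all k points.
        r = (points - (norms[:, None] * d + l)).ravel()
        # Block Jacobian of G as derived above.
        J = np.zeros((3 * len(points), 5))
        for i in range(len(points)):
            J[3*i:3*i+3, :3] = norms[i] * np.eye(3)
            J[3*i:3*i+3, 3:] = (np.eye(3) - np.outer(d, diffs[i]) / norms[i]) @ E
        # Iterative step: x <- x + (J^T J)^{-1} J^T r.
        x = x + np.linalg.solve(J.T @ J, J.T @ r)
        x[:3] /= np.linalg.norm(x[:3])             # keep d unit length
    return x[:3], np.array([x[3], x[4], 0.0])
#+END_SRC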

Method 2: Averaging

Once again assuming we have $k$ images, with image $i$ corresponding to laser dot point $\vb{p}_i$, our laser trajectory $\vb{d}$ can be obtained from a normalized sum over the differences between all points: \[ \vb{d} = \frac{∑_{i=1}^{k}∑_{j > i} (\vb{p}_j - \vb{p}_i)}{\lVert ∑_{i=1}^{k}∑_{j > i} (\vb{p}_j - \vb{p}_i) \rVert} \] Note that here we take all pairwise differences where $j > i$: without loss of generality, we assume that the points are ordered in ascending distance from the optical center, so we only sum up differences that point away from the laser origin.

Once $\vb{d}$ is obtained, we take the centroid of all of our points, which we define as $μ$, and set the intersection of the ray defined by $μ$ and $\vb{d}$ with the $xy$ plane to be $\vb{l}$. Formally, this is defined as \[ \vb{l} = μ - \frac{μ_z}{d_z}\vb{d} \]
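A sketch of this method follows (the function name is ours; the input points are assumed ordered by increasing distance from the optical center):

#+BEGIN_SRC python
import numpy as np

def calibrate_laser_averaging(points):
    """Recover laser direction d and origin l from 3D laser-dot points,
    given as a (k, 3) array."""
    k = len(points)
    acc = np.zeros(3)
    for i in range(k):
        for j in range(i + 1, k):
            acc += points[j] - points[i]   # pairwise differences, j > i
    d = acc / np.linalg.norm(acc)
    mu = points.mean(axis=0)               # centroid of the dot points
    l = mu - (mu[2] / d[2]) * d            # intersect the ray with the xy plane
    return d, l
#+END_SRC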

This method can be performed much faster than the former, though while developing our device we have stuck to the Gauss-Newton method, since its constraints and minimizing function are more explicit.

Calibration Procedure

<sec:slate-calibration>

As mentioned in Section sec:laser-calibration, to calibrate the laser, we assume that we know the optical frame coordinates of each laser dot $\vb{p}_i$. While the laser parameters $\vb{d}$ and $\vb{l}$ are still unknown, the vector $\vb{v}$ is always known, since this only depends on the location of the laser dot in the image $\mathfrak{p}$. By intersecting $\vb{v}$ with a known plane, we can precisely determine the location of each $\vb{p}_i$.

To create such a known plane, we require a calibration target that is both flat and possesses distinct features. If an image is taken of this calibration target, since the relationships between features on the target are known, a plane can be defined that best fits the calibration object.

The same checkerboard pattern mentioned in Section sec:camera-calibration is one example of a plane that is ideal for this. This method is convenient because OpenCV provides functionality for the two key steps: detecting the corners of the checkerboard pattern, and calculating a 3D transformation given object points and corresponding points in an image. However, requiring that divers carry a large and heavy checkerboard on every dive is a large ask.

We have developed a procedure that does not utilize a checkerboard, and can instead be done with a dive slate – something that most divers will carry. An example of one of these is shown in Figure \ref{fig:slate-img}. We add pieces of duct tape in an arbitrary pattern on one side to make the slate more featureful. We also make the assumption that the slate remains close to parallel with the image plane.

images/slate-calibration.JPG

We assume that we have a scanned copy of the dive slate, and hence the physical measurements of the duct tape pattern. Assuming that a correspondence can be drawn between the corners in the scan and the corners in the image, a transformation between the two can be found. This is known as a Perspective-n-Point (PnP) problem, and OpenCV has a solvePnP function that makes solving it relatively straightforward. The same problem exists when using the checkerboard, but OpenCV’s findChessboardCorners function abstracts it away from the developer by taking advantage of the specific structure of the checkerboard (number of squares per side, side length of a square).
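A minimal sketch of the slate-based recovery of a single calibration point follows (the correspondences are hypothetical and hand-picked; K and dist are the intrinsics and distortion coefficients from camera calibration, and v is the unit ray through the laser-dot pixel defined earlier):

#+BEGIN_SRC python
import cv2
import numpy as np

# Hypothetical tape-corner coordinates measured on the scanned slate
# (meters, on the slate's z = 0 plane) and their pixel locations.
object_pts = np.array([[0.00, 0.00, 0.0], [0.10, 0.02, 0.0],
                       [0.04, 0.15, 0.0], [0.12, 0.18, 0.0]])
image_pts = np.array([[812.0, 604.0], [1410.0, 688.0],
                      [1020.0, 1496.0], [1534.0, 1640.0]])

# Solve the PnP problem for the slate's pose in the optical frame.
ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, dist)

# The slate plane: its normal is the rotated z axis, passing through tvec.
R, _ = cv2.Rodrigues(rvec)
normal, origin = R[:, 2], tvec.ravel()

# Intersect the unit ray v through the laser-dot pixel with the plane
# to obtain the 3D calibration point p_i.
t = (normal @ origin) / (normal @ v)
p_i = t * v
#+END_SRC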

Results demonstrating this method compared to the checkerboard are shown in Section sec:field-calibration-testing.

Conclusion


In this chapter, we discussed the assumptions we make for this system to work, and outlined the algorithms for obtaining the laser parameters and using them to obtain the length of a fish. In the next chapter, we discuss the experiments done to test the device.

Acknowledgements

The author would like to thank Christopher Crutchfield, Nathan Hui and Nikolay Atanasov for assistance with the math in this chapter.

This chapter contains material that is being prepared for publication. Christopher L. Crutchfield, Kyle S. Hu, Vivaswat Suresh, Ana I. Perez, Avik Ghosh, Hamish J. Grant, Kyle Tran, Samantha Prestrelski, Ronan Wallace, Nathan Hui, Jack Elstner, Dylan Heppell, Jennifer Loch, Alli Candelmo, Brice X. Semmens, Curt Schurgers, and Ryan Kastner. The thesis author was one of the primary investigators and authors of this material.

Experiments

<sec:testing> In this chapter we describe the testing that we have done so far to quantify how well the system works. A total of 24 pool tests and 2 field deployments were conducted in the process of developing FishSense Lite.

Pool Experiments

Preamble

The vast majority of our experiments have been performed in swimming pools, both for convenience and because pools let us control as many variables as possible. A full list of pool tests is included in Appendix sec:pool-test-list. All data in this section were taken on November 17th, 2023 and May 9th, 2024.

In order to test the system, we follow variations on the following procedure:

  1. If calibration parameters do not exist or we do not trust the current parameters, calibrate the camera.
  2. Calibrate the laser by taking pictures of one or both of the planar objects mentioned in Section sec:slate-calibration.
  3. Take images of objects with a known length.
  4. Analyze how the estimated lengths compare with the true length of the object.


Reference Objects

<sec:reference-objects> The experiments described in this section involve one of three measurement objects, described below.

Checkerboard

The checkerboard itself is useful for the same reason that makes it a good object to calibrate the laser with – it is relatively straightforward to identify the corners of a checkerboard in an image. In our case, we use the checkerboard as a convenient way to measure the orientation of the object when we measure lengths from it, while also using it to get many different measurements from a single image.

Box

Historically we have also used an aluminum box, with a 15cm section of tape being used as the reference length that we measure. The box is shown in Figure \ref{fig:box}.

images/box.jpg

Fake Fish

In order to fully simulate the purpose for which the device was intended, we use a dummy rainbow trout as a reference object. Each dummy is made from the same model with the same dimensions, though different techniques have been used to make them slightly negatively buoyant. Three generations of dummy fish, named Fred, George, and Ginny, have been used for these tests\footnote{If you would like to purchase a fish of your own, go to https://www.loftus.com/items/LF-0167.}. Fred is shown in Figure \ref{fig:fred}. Henceforth in this section, we will refer to all three of them as just “the fake fish”.

images/fred.jpg

Laser Calibration Algorithm

<sec:math-testing>

We test the accuracy of the calibration by comparing reference length measurements under different laser calibration parameters. The most common way in which we will show these measurements is a plot of distance versus estimated length, as pictures of each subject were taken over a range of distances. This allows us to observe where errors occur, as will be shown in Section sec:length-measurements.

Figure \ref{fig:laser-calibration-comparison} shows the results obtained from both calibration methods – as can be seen, they are functionally identical. Future work still needs to be done to formally verify that the two methods are equivalent.

images/laser-calibration-comparison.pdf

Distance Measurements

<sec:distance-test> To evaluate purely the distance estimates we can obtain from the system, we take images of objects of known length (in this case, all three of the reference objects listed in Section sec:reference-objects) at known distances, and compare each known distance with the distance produced by the algorithm described in Section sec:distance-finding. These data were collected by taking pictures at fixed increments – in our case, we taped marks at 20cm intervals along a long fishing rod, and took images at each mark. There were originally 19 marks along the rod, though many of the pieces of tape came off underwater. Results are shown in Figure \ref{fig:length-reference}, which demonstrates that the actual distance and the measured distance are well correlated.

images/distance-measurements.pdf

Note that this is heavily dependent on the laser calibration being accurate – the data in Figure \ref{fig:length-reference} follow two different trends, each of which comes from a different set of laser parameters. Both sets of parameters were obtained from images in the same test – one calibration run was performed before and the other after the length measurements. In this particular case, the incorrect laser parameters were obtained before all measurements were taken, and the correct parameters were obtained afterward. This implies that the mount or the laser itself was moved slightly before measurements began.

In practice, there is no way that the diver would be able to tell if the mount has been moved or not, which makes verifying the measurements difficult. Section sec:length-measurements will contain further discussion on how calibration parameters affect our measurements.

Length Measurements

<sec:length-measurements> Figure \ref{fig:measurements} demonstrates measurements of our fake fish and the box during a pool test. We can see that across varying distances, the estimate remains relatively consistent, though several factors cause error. These are discussed in the subsequent subsections.

images/length-measurements.pdf

Object Thickness

If the object is not flat, as is the case with most fish, then the laser dot may be at a different distance from the camera than the head and tail points. This causes closeup measurements of the object to be shorter than those in images taken further away, because the difference in distance between the laser dot and the head/tail points makes up a smaller percentage of the total distance as the camera moves further away. This effect can be seen in Figure \ref{fig:measurements}, as the fish’s measurements trend upward until around 1m away. In field use, there are unlikely to be many measurements taken that close to the fish anyway, as that may scare the fish away.

Deviations in Laser Calibration

Section sec:distance-test showed that an error in the laser calibration parameters can cause deviation in measured distance. This creates a deviation in measured length that increases the further away the object is from the camera. Figure \ref{fig:incorrect-laser-calibration} shows the effect that these incorrect parameters have on object length measurements.

images/length-with-incorrect-calibration.png

Notably, a change in the laser orientation $\vb{d}$ results in much more error than a change in the laser location $\vb{l}$, especially as the laser dot moves further away from the camera.

Labelling Errors

All the data in this chapter were hand labeled, and the effects of labeling error are especially noticeable at longer distances from the camera. Figure \ref{fig:measurements} has particularly obvious instances of large amounts of noise at longer distances. When the subject is farther away, small deviations in the labeled points matter much more in the final length measurement, meaning that with a hand-labeled approach, the effect of noise increases with distance.

Deviations in Camera Parameters

An incorrect calibration of the camera’s internal parameters can also cause a deviation in length. Specifically, if the focal length $f$ is off by a certain percentage, this affects both the process that determines distance, as described in Section sec:distance-finding, and the conversion of distance into length using Equation \ref{eq:length-from-depth}. Consequently, as shown in Figure \ref{fig:incorrect-camera-calibration}, the relationship between focal length and measurement error is complex. Concrete examples of this occurring are given in Section sec:fw-vs-sw.

images/length-with-incorrect-calibration.pdf

This means that for now, camera calibration must be done carefully and verified to ensure the obtained parameters are as accurate as possible.

Object Tilt

An important case we must account for is when the fish is not perfectly perpendicular to the camera axis. If the fish is rotated around the $y$ axis (recall that this corresponds with the vertical axis in the image plane), the fish may look shorter within the image than it actually is, and as such the measurements of this fish would be more inaccurate. An experiment that quantifies this is described in the next section.

Contributions of Tilt to Error

<sec:tilt-experiment>

To evaluate the system’s accuracy with respect to different tilt angles, we used the following experimental setup. We captured a total of 183 images of a checkerboard pattern with the laser dot on it, with varying tilt angles and depths. A subset of these images was used to obtain the laser position and orientation. From each image we extracted 13 lengths, as shown in Figure \ref{fig:checkerboard-experiment-setup}. Figures \ref{fig:checkerboard-errors-by-length} and \ref{fig:checkerboard-errors-by-distance} show the results of this experiment.

images/checkerboard-experiment-setup.png

images/checkerboard_errors_by_length.png

images/checkerboard_errors_by_distance.png

Both figures show that the relative error in measurement decreases dramatically when the angle drops below 15 degrees, which supports previous studies \cite{Heppell2012}. Since our system is diver operated, we expect the vast majority of images taken will meet this criterion, as humans are quite good at observing this threshold.

Figure \ref{fig:variances} shows the variances of the percentage measurement errors in all images, where each data point represents a single image. We see that the variance in measurements is much higher when the tilt angle is higher and the checkerboard is close to the camera. Our current hypothesis is that when the calibration board is closer to the camera, the depth range spanned by the tilted checkerboard is a more significant percentage of the overall distance, which makes the error in the measured distance larger.

images/checkerboard_error_variances.png

Laser Attenuation

<sec:laser-comparison> The bulk of our experiments have been with two different laser pointers: the Innovative Scuba Concepts laser pointer\footnote{https://dealer.innovativescuba.com/tc-101-aluminum-underwater-laser-pointer.html}, and the Shark Laser\footnote{https://waterprooflaser.com/contents/en-us/d22\_TO-ORDER.html}. Both lasers are rated for less than 5mW of power. The former model emits light at roughly 700nm, and the latter at roughly 532nm.

Figure \ref{fig:laser-test} shows an example of a comparative range test with the Shark laser and three different red lasers, of which the Innovative Scuba laser is second from the left. Our goal with this test was to qualitatively understand how the attenuation of different laser colors would affect usage. In pool water, we were able to observe the green laser from almost the entire width of the pool (roughly 15m), while the red laser was effectively unusable at around 5m. This does not account for the fact that the diver was sometimes unable to keep the camera steady enough to perceive the red laser dot at distances of roughly 4m.

images/lasers-close.JPG

A more detailed discussion of light attenuation in water is included in Appendix sec:light-attenuation.

Field Experiments

There are two main locations in which FishSense Lite modules have been field tested. The first test was conducted off the La Jolla coast. Water conditions in La Jolla are turbid, so this was an opportunity to test how our system would fare in low visibility. The main goal of this test was to evaluate the red and green lasers, looking both at attenuation and at how fish would react.

Second, in August 2023, we deployed six of our FishSense Lite units (FSL-01 to FSL-06) to be tested by REEF staff in the Florida Keys. The water conditions in this region were much clearer than in the La Jolla kelp forests, so despite red lasers attenuating over shorter distances, the data from this test were predominantly generated using red lasers. These deployments allowed us to observe how the system would fare during longer-term use, in the hands of real recreational divers.

Salt Water Measurements

<sec:fw-vs-sw>

During the La Jolla deployment, we compared fake fish measurements taken in fresh water (pool water in our case) and salt water. Much like in our pool tests, the laser was first calibrated, then a fake fish was measured and lengths were calculated in post-processing. Three sets of measurements were taken: one before the ocean test, one during the ocean test, and one after the ocean test. Both the camera calibration parameters and laser calibration parameters were the same across all three measurement sets.

As seen in Figure \ref{fig:freshwater-vs-saltwater}, the measurements taken before and after the ocean test were similar to each other, while the salt water measurements exceeded both freshwater measurement sets. Since the cameras themselves are calibrated in fresh water, the change in refractive index in salt water is likely what causes the discrepancy. However, this is still within our accepted error margin of 20%, as defined earlier.

images/ocean-test.pdf

We currently choose to calibrate the camera in pool water despite this error, as asking divers to calibrate in the field would be an extremely tedious process. We recognize that to minimize as much error as we can, we will need to be able to obtain the camera calibration parameters in salt water, and are currently looking into methods of doing so that do not involve using a checkerboard.

Laser Calibration With A Slate

<sec:field-calibration-testing>

We have been using the checkerboard method as the standard for calibration, as it can be easily automated. However, we want to assess how effective slate calibration is in practice, as it would be a huge asset to people using the device in the field. Fortunately, the divers at REEF were able to test slate calibration with a slate of their own construction. The methodology was as follows: each of their seven FishSense Lite units was calibrated using the same slate, and the length of the slate, shown in Figure \ref{fig:slate-calibration-measurement}, was then measured using the newly obtained laser parameters. Several measurements were taken per image to create a box and whisker plot, shown in Figure \ref{fig:slate-calibration}. The measurements are fairly consistent, which demonstrates that the calibration method is indeed viable.

images/slate-calibration-measurement.png

images/slate-calibration.png
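A plot like Figure \ref{fig:slate-calibration} can be reproduced with a few lines of matplotlib; the sketch below uses hypothetical placeholder numbers, not REEF’s actual measurements:

#+begin_src python
import matplotlib.pyplot as plt

# Hypothetical per-unit slate length estimates in cm (placeholders only,
# not REEF's data): one list of repeated measurements per unit.
measurements = {
    "FSL-01": [30.1, 29.8, 30.4, 30.0],
    "FSL-02": [29.6, 30.2, 30.0, 29.9],
    "FSL-03": [30.3, 30.1, 29.7, 30.2],
}

plt.boxplot(list(measurements.values()))
plt.xticks(range(1, len(measurements) + 1), list(measurements.keys()))
plt.ylabel("Measured slate length (cm)")
plt.show()
#+end_src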

Fish Behavior Study

<sec:fish-behavior> During our testing in the La Jolla kelp forest, divers observed that fish tended to avoid the green laser. Our main hypothesis for why this occurs is that fish are able to see green light far better than red light. In especially turbid water, added scattering made the beam itself visible even when it was not pointed directly at a fish’s eyes; an example of this is shown in Figure \ref{fig:sheephead}. As a result, fish in La Jolla were especially skittish around the green laser. We noticed this because when the divers pointed the green laser at fish during a safety stop, farther off the bottom, the fish seemed less skittish.

images/sheephead.JPG

For all pool tests we currently use a green laser, since scaring fish away is not a concern during testing. Green lasers in the field have been shown to scare fish away in turbid water, though REEF is currently testing whether a green laser is viable in the Florida Keys, where the water is relatively clear.

Points of Mechanical Failure

<sec:mechanical-failure>

During field testing, we identified two main points where the device would tend to mechanically fail, listed below.

Laser Mount

<sec:mount-failure>

The laser mounts have been shown to break after relatively brief periods of use (several dives). This could be due to a combination of water absorption, degradation from exposure to ultraviolet light, and the large amount of stress the screws place on certain parts. Figure \ref{fig:mount-crack} shows the most common stress point at which the mount fails. This cracking is caused by the two screws used to keep the laser secured to the mount.

images/mount-failure.png

Current work is being done to create these mounts out of a different material which does not absorb water and resists corrosion.

Laser Pointer

The laser pointers themselves have also been known to fail. There are two main reasons for this:

  1. The O-rings fail due to lack of proper maintenance or debris on the sealing surfaces, flooding the battery enclosure.
  2. The activation mechanism fails mechanically: wear from repeatedly switching the laser on and off eventually causes it to chip.

Discussions with Backscatter are in the works to create a custom solution that does not need to be unscrewed, and that could also include multiple lasers. Multiple lasers would additionally allow us to estimate the angle of the fish relative to the camera, further improving the accuracy of our estimates.

Summary of Results

Thus far, we have shown that in ideal scenarios the measurements of our device can be quite accurate. However, the “ideal” scenario seems to be rare – in practice, the mount easily goes out of calibration, whether due to weakness in the mount material or the laser being twisted or shifted out of alignment during operation. There is currently no way to know for sure when this has happened, which makes the results from this device less reliable than they should be. Our next priority is to create a mount of sturdier construction, so that it is less likely to move out of calibration during use and fewer calibration runs are required.

While this system still has many issues with durability and reliability, we have nonetheless shown that it is capable of obtaining fish lengths within our desired error margins.

Acknowledgements

The author would like to thank Nathan Hui and Jack Elstner for testing our system off the La Jolla coast. The author would also like to thank Alli Candelmo, Jen Loch, and Dylan Heppell from REEF for testing our system extensively in the Florida Keys.

This chapter contains material that is being prepared for publication. Christopher L. Crutchfield, Kyle S. Hu, Vivaswat Suresh, Ana I. Perez, Avik Ghosh, Hamish J. Grant, Kyle Tran, Samantha Prestrelski, Ronan Wallace, Nathan Hui, Jack Elstner, Dylan Heppell, Jennifer Loch, Alli Candelmo, Brice X. Semmens, Curt Schurgers, and Ryan Kastner. The thesis author was one of the primary investigators and authors of this material.

Conclusion

<sec:conclusion>

Summary

In this work we have discussed the construction and inner workings of FishSense Lite. We described the main assumption – the pinhole camera assumption – which allows us to draw simple relationships between apparent and actual size. We established that the distance to the fish is necessary to calculate its actual length. We discussed procedures both for obtaining the laser parameters and for obtaining the distance from the camera to the laser dot once those parameters are known. We presented results verifying that this system’s measurements fall within 20% of a fish’s true length.

Future Work

There is much work to be done to improve the system. We are working towards removing some of the current requirements on the subject being imaged. To remove the requirement that the fish be parallel to the image plane, we are looking into either using multiple lasers or relying on a machine learning method to estimate the fish’s orientation.

We also aim to build a sturdier mount so that less frequent calibration is required, and so that our device will be able to better survive the elements. Current work is being done to manufacture a mount from aluminum.

One area we are especially interested in advancing is removing the corrective optic, which will become possible once we can properly model the distortions caused by the port of the underwater housing. This will reduce the cost of our current device significantly, and will also be a significant step toward using stereo cameras in the presence of Snell’s-law distortions.
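To illustrate the core geometric operation such a model requires, below is a minimal sketch of Snell’s-law refraction of a ray at a flat interface, in a generic vector form; the function and example values are our own illustration, not the thesis implementation:

#+begin_src python
import numpy as np

def refract(incident, normal, n1, n2):
    """Refract a unit ray crossing a flat interface from medium n1 to n2
    via Snell's law (vector form). `normal` is the unit interface normal
    pointing toward the incident side. Returns None on total internal
    reflection. Generic illustration, not the thesis implementation."""
    cos_i = -np.dot(incident, normal)
    ratio = n1 / n2
    sin2_t = ratio**2 * (1.0 - cos_i**2)
    if sin2_t > 1.0:
        return None  # total internal reflection
    cos_t = np.sqrt(1.0 - sin2_t)
    return ratio * incident + (ratio * cos_i - cos_t) * normal

# Example: a ray in air hitting a flat port into water (n ~ 1.33);
# the refracted ray bends toward the interface normal.
ray = np.array([0.3, 0.0, 1.0])
ray /= np.linalg.norm(ray)
print(refract(ray, np.array([0.0, 0.0, -1.0]), 1.0, 1.33))
#+end_src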

Lastly, we also need to refine some of the automatic detection algorithms for both the fish and the calibration object, and continue to build the infrastructure to support fully automatic data processing. To ensure that citizen scientists can easily use our platform, all processing must eventually be done automatically. One of our long term goals is to also be able to perform identification, so that scientists can use these data to draw conclusions about specific populations of a certain species or even specific individuals.

Supplementary Derivations

Derivation of Scale Factor

<sec:scale-factor>

Recall from Section sec:distance-finding that we have \begin{equation*} A\vb{x} = -\vb{l}, \end{equation*} where we define the following matrices: \begin{align*} A &= \begin{bmatrix} d_x & -v_x \\ d_y & -v_y \\ d_z & -v_z \end{bmatrix} & \vb{x} &= \begin{bmatrix} λ_1 \\ λ_2 \end{bmatrix} \end{align*}

Multiplying both sides on the left by $A^T$, we get \[ A^TA\vb{x} = -A^T\vb{l}. \]

We also note the following (recalling that the second column of $A$ is $-\vb{v}$, which fixes the sign of the second entry): \begin{align*} A^TA &= \begin{bmatrix} \lVert \vb{d}\rVert_2^2 & -\vb{d}^T\vb{v} \\ -\vb{d}^T\vb{v} & \lVert \vb{v}\rVert_2^2 \end{bmatrix} & -A^T\vb{l} &= \begin{bmatrix} -\vb{d}^T\vb{l} \\ \vb{v}^T\vb{l} \end{bmatrix} \end{align*}

Since this is a system of two linear equations, we can leverage standard row reduction methods to solve for $λ_2$. First we create an augmented matrix: \begin{equation*} \left[ \begin{array}{cc|c} \lVert \vb{d} \rVert_2^2 & -\vb{d}^T \vb{v} & -\vb{d}^T\vb{l} \\ -\vb{d}^T \vb{v} & \lVert\vb{v}\rVert^2_2 & \vb{v}^T\vb{l} \end{array} \right] \end{equation*}

Then we apply row reduction: \begin{equation*} \left[ \begin{array}{cc|c} \lVert\vb{d}\rVert_2^2 & -\vb{d}^T\vb{v} & -\vb{d}^T \vb{l} \\ 0 & 1 & \left(-\vb{d}^T\vb{l} + \frac{\vb{v}^T\vb{l} \lVert\vb{d}\rVert_2^2}{\vb{d}^T\vb{v}} \right) \Big/ \left( \frac{\lVert\vb{v}\rVert_2^2 \lVert\vb{d}\rVert_2^2}{\vb{d}^T\vb{v}} - \vb{d}^T\vb{v} \right) \end{array} \right] \end{equation*} The above provides the solution for $λ_2$: \[ λ_2 = \frac{-\vb{d}^T\vb{l}\,\vb{d}^T\vb{v} + \vb{v}^T\vb{l}\,\lVert\vb{d}\rVert_2^2}{\lVert\vb{v}\rVert_2^2\lVert\vb{d}\rVert_2^2 - (\vb{d}^T\vb{v})^2} \] Since $\vb{v}$ and $\vb{d}$ are both unit vectors, this simplifies to \[ λ_2 = \frac{-\vb{d}^T\vb{l}\,\vb{d}^T\vb{v} + \vb{v}^T\vb{l}}{1 - (\vb{d}^T\vb{v})^2}. \]

Derivation Of Jacobian

<sec:jacobian-derivation> Recall the function $G(\vb{x})$: \begin{equation*} G(\vb{x}) = \begin{bmatrix} \lVert \vb{p}_1 - \vb{l}\rVert\vb{d} + \vb{l} \\ \lVert \vb{p}_2 - \vb{l}\rVert\vb{d} + \vb{l} \\ \vdots \\ \lVert \vb{p}_k - \vb{l}\rVert\vb{d} + \vb{l} \end{bmatrix}, \end{equation*} where $\vb{p}_i$ is the laser dot in optical coordinates from calibration image $i$.

We can split $J_G$ into two block columns representing the Jacobians with respect to the two parameter vectors: \begin{equation*} J_G = \begin{bmatrix} J_G^{\vb{d}} & J_G^{\vb{l}} \end{bmatrix} \end{equation*} Calculating $J_G^{\vb{d}}$ is fairly trivial, as the function is linear in $\vb{d}$: \begin{equation*} J_G^{\vb{d}} = \begin{bmatrix} \lVert \vb{p}_1 - \vb{l}\rVert I_{3×3} \\ \lVert \vb{p}_2 - \vb{l}\rVert I_{3×3} \\ \vdots \\ \lVert \vb{p}_k - \vb{l}\rVert I_{3×3} \end{bmatrix} \end{equation*}

We can take the Jacobian with respect to $\vb{l}$ by taking the Jacobian of each row individually. For a particular image $i$ we have: \begin{align*} \vb{g}_i(\vb{d},\vb{l}) &= \lVert \vb{p}_i - \vb{l} \rVert \vb{d} + \vb{l} \\ \frac{∂}{∂ \vb{l}}\vb{g}_i(\vb{d},\vb{l}) &= I - \frac{1}{\lVert \vb{p}_i - \vb{l}\rVert}\left[\begin{smallmatrix} (p_x - l_x)d_x & (p_y - l_y)d_x & (p_z - l_z)d_x \\ (p_x - l_x)d_y & (p_y - l_y)d_y & (p_z - l_z)d_y \\ (p_x - l_x)d_z & (p_y - l_y)d_z & (p_z - l_z)d_z \end{smallmatrix}\right] \\ &= I - \frac{1}{\lVert \vb{p}_i - \vb{l} \rVert}\vb{d}(\vb{p}_i - \vb{l})^T \end{align*}

We only want the first two columns of this, which corresponds to right-multiplying by $\begin{bmatrix} I_{2×2} \\ 0 \end{bmatrix}$: \begin{equation*} J_G^{i,\vb{l}} = \left(I_{3×3} - \frac{1}{\lVert \vb{p}_i - \vb{l} \rVert}\vb{d}(\vb{p}_i - \vb{l})^T\right)\begin{bmatrix} I_{2×2} \\ 0 \end{bmatrix} ∈ \mathbb{R}^{3×2} \end{equation*}
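These block expressions translate directly into code. The sketch below is our own illustration (assuming, as in the derivation above, that only the first two components of $\vb{l}$ are free parameters), not the thesis codebase:

#+begin_src python
import numpy as np

def jacobian_blocks(d, l, p):
    """Per-image Jacobian blocks from the derivation above.
    d: unit laser direction, l: laser origin, p: laser dot in
    optical coordinates (all 3-vectors)."""
    r = p - l
    nr = np.linalg.norm(r)
    J_d = nr * np.eye(3)                             # ∂g_i/∂d, 3x3
    J_l = (np.eye(3) - np.outer(d, r) / nr)[:, :2]   # first two columns of ∂g_i/∂l, 3x2
    return J_d, J_l

def stack_jacobian(d, l, points):
    """Stack per-image blocks into the full (3k x 5) Jacobian J_G."""
    rows = [np.hstack(jacobian_blocks(d, l, p)) for p in points]
    return np.vstack(rows)
#+end_src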

Pool Test List

<sec:pool-test-list> Table \ref{tab:test-list} shows the full list of pool tests conducted while testing FishSense Lite.

| Date       | Purpose                                                                                 |
|------------+-----------------------------------------------------------------------------------------|
| 2023-02-24 | Lens calibration of first TG6                                                           |
| 2023-03-03 | Evaluate accuracy of OpenCV’s calibration (Olympus vs. Go Pro, housing vs. no housing)  |
| 2023-03-10 | Test attenuation of red vs. green laser                                                 |
| 2023-03-20 | First attempt of laser calibration                                                      |
| 2023-04-05 | Repeat of both camera calibration and laser calibration                                 |
| 2023-04-12 | Quantify errors of length measurement from both object tilt and thickness              |
| 2023-04-19 | Collect length-referenced data for ground-truth comparison                              |
| 2023-05-03 | Evaluate deviation in laser calibration parameters after travel                         |
| 2023-05-08 | Obtain laser parameters with new mount                                                  |
| 2023-07-13 | Calibrate FSL-01 camera and laser                                                       |
| 2023-07-29 | Determine whether FSL-01 has maintained calibration after ocean test                    |
| 2023-08-07 | Camera calibrations for FSL-02, 03, 04, 05, 06 and 07                                   |
| 2023-08-14 | Laser calibrations for FSL-02, 03, 04, 05, 06 and 07                                    |
| 2023-08-18 | Recalibration of laser mounts due to faulty mechanical design; first testing of slate calibration |
| 2023-10-28 | Flat port data                                                                          |
| 2023-11-10 | Trial for obtaining more fine-grain length data for different tilt angles               |
| 2023-11-17 | Second trial for obtaining more fine-grain length data for different tilt angles        |
| 2024-02-12 | Testing if rougher edges and corners affect the quality of slate calibration            |
| 2024-02-26 | Laser calibration and length measurements with flat port                                |
| 2024-03-04 | Data for testing automatic slate calibration                                            |
| 2024-04-25 | Determine accuracy of range measurements                                                |
| 2024-05-02 | Better determine accuracy of range measurements with more data points                   |
| 2024-05-09 | Redo the range accuracy test; attempt new camera calibration                            |

FishSense Lite User Manual

<sec:user-manual>

Setup

  1. Take the camera underwater and burp the lens. When reattaching the lens, ensure that it is tightly attached. The lens needs to remain in the same position in which it was originally calibrated.
  2. Turn on the laser by turning the back end clockwise.
  3. Take a validation photo using the method specified in Section

Use

  1. When the laser is in use, aim it towards the center of the fish’s body mass. Ensure that the laser is never pointed directly at a person.
  2. In order to image fish, it is not necessary to look through the camera’s viewport. If the laser is visible on the fish, it should also be visible in the camera.
  3. Refer to the appropriate camera manual for photo and video collection procedures. Take photos with the fish as flat in the image as possible; otherwise, the length measurement may be inaccurate.

Teardown

  1. When data collection is complete, turn off the laser.
  2. Offload data to a hard disk if necessary.

Guidelines

  1. Do not use the zoom function on the camera. This changes the focal length, rendering our calibration useless.
  2. Avoid touching or moving the laser directly. Do not use the laser as a handle. The laser must be rigidly attached and unmoved for calibration values to hold true.
  3. The laser must be pointed directly at the fish, as close to its center of body mass as possible.
  4. The camera’s optical axis should form approximately a right angle with the vertical axis of the fish.
  5. Ensure the fish is located within the effective range of 0.5 to 5 meters from the camera, with 3 meters being the ideal distance.
  6. The fish must be illuminated.

Camera Settings

The following settings are the ones with which the camera was tested; to keep the data consistent, we recommend that you use them as well. Unless otherwise specified, all of these settings can be found by pressing the “OK” button.

  • Program auto (set the wheel to P)
  • Flash off
  • ISO-A 1600
  • Exposure bias 0
  • Burst mode
  • Underwater shallow white balance
  • ESP metering
  • Face priority off
  • Accessory off
  • JPEG mode underwater
  • Autofocus
  • Aspect ratio 4000 x 3000
  • Output format LF + raw
  • Still image stabilization on

Light Attenuation In Water

<sec:light-attenuation>

The optical properties of water create some unique challenges in underwater optical imaging. Attenuation of light underwater is influenced by two main factors: absorption, whereby the water itself absorbs light, and scattering, whereby water redirects incident light in other directions. Formally, the attenuation coefficient $c$ of a medium is defined as the sum of the absorption coefficient $a$ and the scattering coefficient $b$, all of which are functions of the wavelength of light $λ$. \begin{equation} c(λ) = a(λ) + b(λ) \label{eq:attenuation} \end{equation} All of the above quantities are in units of $\mathrm{m}^{-1}$, and represent the fraction of incident power dissipated per meter.
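In practice, $c(λ)$ determines an exponential decay of beam intensity with distance traveled, per the Beer–Lambert law: \begin{equation*} I(z) = I_0 e^{-c(λ) z}, \end{equation*} where $I_0$ is the initial intensity and $I(z)$ is the intensity remaining after the light travels a distance $z$ through the water.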

Figure \ref{fig:water-absorption} shows the absorption coefficients of pure water at different wavelengths of light – the least absorbed portion of the spectrum falls within the visible region.

images/water-absorption.png

images/water-absorption-visible.png

In particular, water permits green and blue light to pass up to an order of magnitude more easily than red light. Divers experience this as they descend – as sunlight is attenuated by ocean water, red objects begin to appear black, while blue and green objects retain their color. Underwater photographers must bring filters or lights with them to ensure that the scene they photograph reflects its true colors. Figure \ref{fig:underwater-colors} demonstrates the difference in color as sunlight is absorbed. Also note that ultraviolet and near-infrared wavelengths are absorbed significantly more than the visible spectrum – this is why underwater imaging is almost never done outside the visible spectrum.

Referring back to the laser comparison in Section sec:laser-comparison, the red laser sits within the 650–700 nm wavelength range, and the green laser within the 510–540 nm range. As can be seen from Figure \ref{fig:water-absorption-visible}, the green laser is absorbed around 10 times less than the red laser.
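As a back-of-the-envelope illustration, consider the fraction of laser power remaining after a round trip to the target and back. The absorption coefficients below are rough, assumed order-of-magnitude values for pure water, chosen for illustration only, not measurements from our tests:

#+begin_src python
import math

# Assumed, order-of-magnitude absorption coefficients for pure water (1/m);
# illustrative only -- real values vary with wavelength and water quality.
a_green = 0.05  # ~532 nm
a_red = 0.5     # ~650-700 nm

for dist in (5.0, 15.0):
    # The laser light travels to the target and back to the camera (2 * dist).
    t_green = math.exp(-2 * dist * a_green)
    t_red = math.exp(-2 * dist * a_red)
    print(f"{dist:>4} m: green {t_green:.1%}, red {t_red:.3%} of power remains")
#+end_src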

images/underwater-colors.png

The coefficients in Equation \ref{eq:attenuation} also change with dissolved or particulate matter in the water, which generally increases both $a$ and $b$.

For a detailed guide on the optical properties of water and how they are studied, the Ocean Optics Web Book\footnote{https://www.oceanopticsbook.info/} is a great resource.

Backmatter

\bibliographystyle{plain} \bibliography{fishsense}