The Hough Transform

Benj Jensen
Jun 3, 2021 · 9 min read


Red sinusoids on black background, example image of accumulator matrix.

What is the Hough Transform?

In many computer vision tasks there arises a need to detect boundary lines (such as detecting lanes for autonomous vehicles). The Hough Transform is a common framework that allows for the detection of these boundary lines and is popular due to its ability to detect lines even in the presence of small gaps or occlusions. The secret to this transform is a “democratic” algorithm involving something known as the parameter space.

(Note that while the Hough Transform can be generalized to circles, ellipses, and many other parameterizable shapes, this article will focus on straight lines.)

What is the Parameter Space?

Different spaces are often used in mathematics to take something that is complicated and represent it in a way that is easier to work with, and transformations are the way to convert back and forth between spaces. Although it may sound complicated, a simple example can help motivate the use of transforms and spaces.

Most of us are likely familiar with something known as the Cartesian coordinate system, which involves a pair of axes (usually called x and y) and a grid system. This provides an easy, intuitive way to plot points and make graphs — at least, in some scenarios. When it comes to representing boxes, rectangles, and straight lines, this coordinate system excels. Unfortunately, once circles are introduced, the Cartesian coordinate system suddenly stumbles.

For example, imagine working with the circle shown below.

Fig. 1 — Comparison between a curve in Cartesian (left) and polar (right) plots.

Were we to need to find points on this circle, we would need to find values that satisfy the equation x² + y² = 4, which would result in values such as (2, 0), (1.5, 1.3228), (1.0, 1.732), etc. By comparison, in the polar coordinate system, which represents points as (θ, ρ) pairs (corresponding to the degrees of rotation from the horizontal axis and the radial distance from the origin), this circle would simply be represented as ρ = 2, and the points around the circle could easily be determined as (θ, 2), where 0 ≤ θ < 360. Remember that these represent actual solutions to our equations; which system would you rather work with?
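A quick numerical sketch makes the difference concrete: in polar form we can generate points on the circle trivially and convert back to Cartesian coordinates to check them (the sample angles here are arbitrary illustrative choices).

```python
import math

# Points on the circle x^2 + y^2 = 4, generated trivially in polar form
# (rho = 2, any theta) and converted back to Cartesian coordinates.
rho = 2
for theta_deg in [0, 41.4, 60]:
    t = math.radians(theta_deg)
    x, y = rho * math.cos(t), rho * math.sin(t)
    # every generated point satisfies the Cartesian equation
    print(round(x, 4), round(y, 4), round(x**2 + y**2, 4))
```

Note that θ ≈ 41.4° recovers the (1.5, 1.3228) solution mentioned above, with no square roots to solve by hand.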

This leads us to the parameter space. Consider the equation y = mx + b, a common way to parameterize straight lines. In this parameterization, given the parameters (m, b), any pair of values (x, y) for which the equation holds is a valid solution. The Hough transform allows us to take a straight line from an image (in the image space) and convert it to a single point in the parameter space. Instead of solving for the pairs of (x, y) values that fulfill the equation y = mx + b with m and b given, we reverse the problem and solve for the parameters (m, b) that fulfill the equation y - mx = b for every given (x, y) solution.

As an example, take the line y = 2 x + 1. Simple algebra will show that (0, 1), (1,3), and (2,5) are among the infinite solutions to this problem, which, when plotted, result in a straight line. This is a problem that is likely familiar to most of us. However, the Hough transform looks at the problem in reverse, noting that given the infinite set of solutions to the problem, there can only be one set of parameters for which all the given solutions are true.

Fig. 2 — Comparison of the line y = 2x + 1 in the Image Space and the Parameter Space.
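To make this concrete, here is a short Python sketch of the reversal: each sample point (x, y) from the example above defines its own line b = y - mx in the parameter space, and all of those lines agree at the true parameters (m, b) = (2, 1).

```python
# Each image-space point (x, y) on y = 2x + 1 defines a line b = y - m*x
# in the (m, b) parameter space. All of those lines pass through (2, 1).
points = [(0, 1), (1, 3), (2, 5)]

def param_line(x, y, m):
    """b as a function of m for the parameter-space line of point (x, y)."""
    return y - m * x

# Evaluate every point's parameter-space line at m = 2:
bs = [param_line(x, y, m=2) for x, y in points]
print(bs)  # every line gives b = 1, the true intercept
```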

In this way, given a set of points on a line in the image space, we can reconstruct the parameters of the line. This may be a fairly simple concept, but it is the basis for a variety of important computer vision tasks.

In the same way that a line in the image space results in a point in the parameter space (i.e., only one set of parameters can generate a given line), a point in the image space transforms into a line in the parameter space (infinitely many lines pass through a single point, and the (m, b) pair of each one lies along a common line in the parameter space).

Fig. 3 — Comparison of the point (-1, 0.5) in Image Space and Parameter Space.

So how does this apply to image processing? Let’s use the following image as an example, and say we are trying to detect the blue line.

Fig. 4 — Example image with line to detect highlighted in blue.

By running some preprocessing functions to determine which pixels are on the boundary between the ‘M’ and the black background, we can determine that the points (338, 88), (350, 114), (362, 140), (373, 164), and (379, 176) are among the many pixels on this edge. Transforming those points into the parameter space, we get the following.

Fig. 5 — Comparison between line in Image space and corresponding parameter pair in the Parameter Space.

Because each point (x, y) in the image space corresponds to a line of possible (m, b) pairs, in the parameter space we get five straight lines, and the point at which they all intersect corresponds to the true parameters (or at least the best estimate) of the line in the image space. This process can be run over all the edge pixels, with the intersection points in the parameter space giving the parameters of each line in the image space.

It is worth noting that in a perfect world, finding the true line would be simple: just select any two points on the line and solve. However, when working with discretized or noisy data, the line through two arbitrarily chosen points may not be accurate, which is why the Hough transform is necessary.

The Accumulator Matrix

In practice, rather than using a plot to represent the parameter space as shown in Fig. 1 and Fig. 2, something known as an accumulator matrix is used instead. This is, essentially, a way to take the continuous plot and discretize it into a usable format for computers. The domain and range of the parameter space are split into prespecified bins, with each bin represented as a cell in a matrix (e.g., all values of m such that 0.00 ≤ m < 0.05 are grouped into bin 1, 0.05 ≤ m < 0.10 into bin 2, etc.). Then, any time a viable (m, b) pair is generated, the corresponding cell of the matrix is incremented.

Fig. 6 — Comparison between lines in Image space, corresponding parameter pairs, and representation in an accumulator matrix. Note that the accumulator matrix has been flipped vertically in order to match the other two plots.

Increasing the discretization level (shrinking the bin width) results in an accumulator matrix with greater fidelity, but also increases the amount of storage needed, which can quickly become a problem. Thus, this is a factor that must be tailored to the needs of each project individually.
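As a rough sketch of this voting procedure (the bin widths, ranges, and toy points here are arbitrary illustrative choices, not from any real image), the line y = 2x + 1 from earlier can be recovered as the most-voted cell of a small accumulator:

```python
# Hedged sketch of an accumulator matrix for the (m, b) parameterization.
points = [(0, 1), (1, 3), (2, 5)]          # toy points on y = 2x + 1
m_min, m_max, m_step = 0.0, 4.0, 0.1
b_min, b_max, b_step = -5.0, 5.0, 0.1
n_m = int(round((m_max - m_min) / m_step))
n_b = int(round((b_max - b_min) / b_step))
acc = [[0] * n_b for _ in range(n_m)]

for x, y in points:
    for i in range(n_m):                   # sweep over every m bin ("voting")
        m = m_min + i * m_step
        b = y - m * x                      # the b this point implies for that m
        # nearest-bin assignment avoids floating-point edge effects
        j = int(round((b - b_min) / b_step))
        if 0 <= j < n_b:
            acc[i][j] += 1

# The cell with the most votes is the best line estimate.
best_i, best_j = max(
    ((i, j) for i in range(n_m) for j in range(n_b)),
    key=lambda ij: acc[ij[0]][ij[1]],
)
best_m = m_min + best_i * m_step
best_b = b_min + best_j * b_step
print(best_m, best_b)  # ≈ (2.0, 1.0)
```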

A More Efficient Parameter Space

As the careful reader may have noticed, the parameter space mentioned so far has two major problems. The first is that the y = mx + b representation we have been using cannot deal with vertical lines due to a pesky divide-by-zero error in the equation for slope, which means that any vertical lines (or approximately vertical lines, depending on the level of discretization) will not be detected. The second is that neither m nor b is bounded, so the accumulator matrix can grow very large very quickly.

Both of these problems can be overcome by switching parameterizations to one known as the normal form: x cos(θ) + y sin(θ) = ρ, which is parameterized by the variables θ and ρ. Although this form looks much more intimidating (Aah, sines!), it allows for some handy functionality.

Before we get into the benefits, let's look at how the normal form works. This parameterization relies on the fact that every straight line has exactly one point that is closest to the origin, and describes the imaginary segment that connects the origin to this closest point; this segment is perpendicular (i.e., normal) to the line, hence the name. The length of this segment is represented by ρ, and the angle it makes with the horizontal axis is represented by θ, as shown below in Fig. 7.

Fig. 7 — Example of normal form representation of a line.

This line passes through the point (2, 2) and could be represented as y = 4 - x, but it could also be represented as x cos(45°) + y sin(45°) = 2√2 (θ = 45°, ρ = 2√2).
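We can check this numerically; here is a short sketch using Python's math module, evaluating the normal form at a few points on y = 4 - x:

```python
import math

# The line y = 4 - x in normal form: theta = 45 deg, rho = 2*sqrt(2).
# Every point on the line satisfies x*cos(theta) + y*sin(theta) = rho.
theta = math.radians(45)
rho = 2 * math.sqrt(2)

for x, y in [(2, 2), (0, 4), (4, 0)]:   # three points on y = 4 - x
    print(x * math.cos(theta) + y * math.sin(theta))  # ≈ 2.8284 each time
```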

Some additional examples are shown below as references.

Fig. 8 — Additional examples of lines in normal form.

(Note that we are allowing any θ such that 0 ≤ θ < 360°, which means ρ is always nonnegative. You could also restrict θ to be between 0° and 180° (i.e., 0 ≤ θ < 180°) and allow for negative ρ, which would change (b) and (c) to (14°, -√17) and (135°, -√8), respectively.)

As you can see from example (a), vertical lines are no longer a problem. Additionally, θ is restricted to between 0° and 360° (or 0° and 180° if negative ρ values are allowed), and ρ can never exceed the length of the image diagonal, which keeps the accumulator matrix at a reasonable size.

With this new parameterization, we can now update our transformation rules. Using our previous parameterization, a point in the image space mapped to a line in the parameter space, and vice versa. With the normal form parameterization, points now map to sinusoids (due to the cyclical nature of the sine and cosine functions).

Fig. 9 — Example of line in Image and Parameter spaces using normal form parameterization.

As with the previous parameterization, the sinusoids in the parameter space intersect at the true parameters for the line in the image space (in this case, θ = 45°, ρ = 8). There is also a second intersection point at θ = 225°, ρ = -8; this is a redundant solution that describes the same line (which is why we typically restrict ρ to be nonnegative or θ to lie between 0° and 180°). And again, in practice, the continuous plot in the parameter space is replaced with a discretized accumulator matrix.
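A small sketch of this mapping (the sample points are hypothetical, chosen to lie on the line with θ = 45° and ρ = 8): each point traces the sinusoid ρ(θ) = x cos(θ) + y sin(θ), and the sinusoids agree at both the true solution and its redundant twin.

```python
import math

# Each image point (x, y) maps to the sinusoid rho(theta) = x*cos(theta) + y*sin(theta).
# These illustrative points lie on the line with theta = 45 deg, rho = 8
# (i.e., x + y = 8*sqrt(2)), so their sinusoids all meet at (45 deg, 8).
def rho(x, y, theta_deg):
    t = math.radians(theta_deg)
    return x * math.cos(t) + y * math.sin(t)

pts = [
    (8 * math.sqrt(2), 0),
    (0, 8 * math.sqrt(2)),
    (4 * math.sqrt(2), 4 * math.sqrt(2)),
]
print([rho(x, y, 45) for x, y in pts])   # ≈ [8.0, 8.0, 8.0]
# The redundant solution: at theta = 225 deg every sinusoid gives rho = -8.
print([rho(x, y, 225) for x, y in pts])  # ≈ [-8.0, -8.0, -8.0]
```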

Image preprocessing:

Now that we understand how the Hough transform works, the last step (first step?) is to understand how to process the image before running the Hough transform on it. The Hough transform as described above involves (1) iterating over each of the edge pixels in an image and converting them to the parameter space, (2) storing the results in an accumulator matrix, then (3) finding the intersection points and (4) returning the corresponding lines. However, this requires us to know where the edge pixels are. Fortunately, this is fairly easy: you can simply convolve a pair of derivative filters over the image and implement non-maximal suppression (AKA use the Canny edge detection function from OpenCV, and possibly throw in some dilations and erosions if needed). Note that this works best on grayscale images. An example of an image before and after edge detection is shown below.

Fig. 10 — Example image before and after Canny edge detection.
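For illustration, here is a bare-bones, pure-Python sketch of the derivative-filter idea on a hypothetical tiny image (the image and threshold are made-up toy values); a real pipeline would just call cv2.Canny, which adds smoothing, non-maximal suppression, and hysteresis thresholding on top of this.

```python
# A minimal sketch of the "pair of derivative filters" idea: approximate
# horizontal/vertical gradients on a tiny grayscale image and threshold
# the gradient magnitude to mark edge pixels.
image = [
    [0, 0, 0, 255, 255],
    [0, 0, 0, 255, 255],
    [0, 0, 0, 255, 255],
    [0, 0, 0, 255, 255],
]

def edge_pixels(img, threshold=128):
    h, w = len(img), len(img[0])
    edges = []
    for y in range(h):
        for x in range(w):
            # simple central-difference derivative filters [-1, 0, 1],
            # clamped at the image borders
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges.append((x, y))
    return edges

edges = edge_pixels(image)
print(edges)  # pixels along the vertical boundary between dark and bright
```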

Once this has been done, each pixel can be run through the algorithm to get the corresponding lines.

An example is shown in Fig. 11 below.

Fig. 11 — Example of Hough transform running in real time. Output of Canny edge detection is shown on the left with input pixels shown in green. The parameter space representation is shown in the middle, with the brightest points corresponding to line parameters. Finally, the original image is shown on the right along with lines corresponding to the top parameter sets.

Summary of Line Detection Process:

1. Convert to grayscale (optional)

2. Determine edge pixels (e.g., through a Canny edge detection algorithm)

3. For each pixel, convert to the parameter space and increment the appropriate spaces in the accumulator matrix

4. Implement non-maximal suppression on the accumulator matrix to avoid near-duplicate lines caused by quantization (optional)

5. Take the top n intersection points in the parameter space and convert back to image space
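Putting steps 3 and 5 together, here is a minimal pure-Python sketch of the voting loop using the normal form (the bin counts, ρ range, and test points are illustrative choices; in practice you would reach for cv2.HoughLines):

```python
import math

# Sketch of the normal-form Hough voting loop: edge points vote for
# (theta, rho) bins, and the top-voted bins are returned as lines.
def hough_lines(edge_points, n_theta=180, rho_step=1.0, rho_max=200.0, top_n=1):
    n_rho = int(2 * rho_max / rho_step)
    acc = [[0] * n_rho for _ in range(n_theta)]
    for x, y in edge_points:
        for t in range(n_theta):                 # theta bins over [0, 180)
            theta = math.radians(t)
            r = x * math.cos(theta) + y * math.sin(theta)
            j = int(round((r + rho_max) / rho_step))
            if 0 <= j < n_rho:
                acc[t][j] += 1
    # rank cells by vote count and convert bin indices back to (theta, rho)
    cells = sorted(
        ((acc[t][j], t, j) for t in range(n_theta) for j in range(n_rho)),
        reverse=True,
    )
    return [(t, j * rho_step - rho_max) for _, t, j in cells[:top_n]]

# Points on the vertical line x = 10: expect theta = 0 deg, rho = 10.
result = hough_lines([(10, y) for y in range(50)])
print(result)
```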

Based on class notes from a course taught by Kris Kitani at Carnegie Mellon University
