Exploring Sense-and-Avoid Systems for Autonomous Vehicles

A question we often get is, why doesn’t Vahana have a pilot?

It’s simple: supply and demand. We seek to provide a system capable of delivering more than 1 billion flight hours every year. To do this would require more than half a million pilots. Today, the total number of active commercial helicopter pilots in the US is about 24,000, and each of them needs 1,200 hours of flight time (and several years) to secure a commercial airline transport pilot certificate. Even under the most optimistic assumptions about the availability of active helicopter pilots for an air taxi operation at relevant locations, we’d have less than 1% of the number of pilots we’d need. Self-piloting is the key to scalability for Vahana and similar air taxis.

To do this, our self-piloted system must be capable of taking on many of the tasks assigned to the pilot-in-command, including sensing and avoiding hazards around us. Considering that we’re operating in high-density urban airspace (see figures 1, 2, and 3 for examples), we’ll have to navigate around ground infrastructure and share airspace with other Vahanas, drones, aircraft, birds, and more. Such a system must:

  1. Ensure the aircraft is able to analyze, understand, and interact with its environment during takeoff, flight, and landing operations as a self-piloted vehicle.
  2. Ensure the safety of the aircraft during takeoff, flight, and landing as well as the safety of everyone — and everything — near the vehicle and its passenger at any stage of its operation.
  3. Ensure safe operation with respect to collision avoidance, airspace, and ground safety, including the in-flight sense-and-avoid function.

Figure 1: Vahana approaching vertiport. Image source: A³ by Airbus

Figure 2: Passenger entry at vertiport. Image source: A³ by Airbus

Figure 3: Flight over urban environment. Image source: A³ by Airbus

Such a system relies on a wide range of sensors and software, many of which are currently being developed for the automotive industry. There are several distinguishing factors, however, that separate an airborne system from a ground-based one. See figure 4 for a high-level description of the Vahana flight phases.

Figure 4: Vahana flight phases. Image source: A³ by Airbus

Unlike autonomous cars, where the majority of requirements are driven by the legacy of an existing transportation system, we are developing a brand new transportation system in an environment that is currently sparsely used and constrained by a system very different from the one on the ground. We plan to shape a completely new form of urban mobility that will positively impact millions of people.

One of the cutting-edge technologies we are building is the ability of our aircraft to detect and avoid obstacles it may encounter, commonly referred to as “sense-and-avoid” (see figures 5 and 6).

Figure 5: Vahana encountering an obstacle. Image source: A³ by Airbus

Figure 6: Vahana avoiding an obstacle. Image source: A³ by Airbus

Sense-and-avoid can be broken down into the following tasks (a minimal sketch of this loop follows the list):

  1. Detect obstacles (e.g. large bird, aircraft, or drone)
  2. Classify obstacle types
  3. Identify collision risks
  4. Find strategy to avoid collision
  5. Change own trajectory and execute avoidance path
  6. Return to the original path
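
To make the flow concrete, here is a minimal, hypothetical sketch of that loop in Python. Every name in it (Obstacle, detect, classify, collision_risk, plan_avoidance) is a placeholder introduced for illustration, not part of Vahana’s actual software.

```python
from dataclasses import dataclass

@dataclass
class Obstacle:
    position: tuple       # (x, y, z) in metres, vehicle frame
    velocity: tuple       # (vx, vy, vz) in m/s
    kind: str = "unknown"

def sense_and_avoid_step(sensor_frame, vehicle_state, nominal_path,
                         detect, classify, collision_risk, plan_avoidance):
    """One pass through tasks 1-6; the callables are supplied by the caller."""
    obstacles = detect(sensor_frame)                                      # 1. detect obstacles
    for obs in obstacles:
        obs.kind = classify(obs)                                          # 2. classify obstacle types
    threats = [o for o in obstacles if collision_risk(vehicle_state, o)]  # 3. identify collision risks
    if threats:
        return plan_avoidance(vehicle_state, threats, nominal_path)       # 4./5. find and fly avoidance path
    return nominal_path                                                   # 6. resume the original path
```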

When obstacles are closing in at upwards of 100 m/s (about 225 mph), these tasks need to happen very fast. A vehicle equipped with a sensor that “sees” 500 m (1,640 ft) ahead would have at most 5 seconds before impact with such an obstacle if no evasive action is taken. And don’t forget: in the air, obstacle avoidance is a three-dimensional proposition. It becomes even more complex when Vahana encounters several obstacles with different characteristics all at once.
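
The arithmetic behind that time budget is simple, as the back-of-the-envelope sketch below shows (the 2 km figure anticipates the sensing-range requirement discussed later):

```python
def time_to_impact(sensing_range_m: float, closure_speed_mps: float) -> float:
    """Seconds available before impact if neither party maneuvers."""
    return sensing_range_m / closure_speed_mps

print(time_to_impact(500.0, 100.0))    # 5.0 s with a 500 m sensing range
print(time_to_impact(2000.0, 100.0))   # 20.0 s with a ~2 km sensing range
```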

Imagine a case where a drone, a bird, and an aircraft approach at the same time. Each has different kinematics (speed, heading, maneuverability) and poses a different collision risk. The sense-and-avoid system decides on an avoidance maneuver, but each of the three obstacles may also decide to change course. This presents a fascinating detection and decision-making challenge for our system. It needs to incorporate obstacle detection and identification, as well as collision risk analysis. Additionally, it must perform this analysis not once but continuously as obstacles change course of their own volition. The system also needs to account for Vahana-specific considerations, such as passenger comfort, energy consumption, and flight path restrictions.
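
As a toy illustration of why this is a continuous, multi-object problem, the sketch below ranks several simultaneous obstacles by their time to closest approach under a straight-line motion assumption. All numbers are invented, and a real system would rerun this kind of analysis every time an obstacle maneuvers.

```python
import numpy as np

def time_to_closest_approach(rel_pos, rel_vel):
    """Time (s) at which two constant-velocity objects are closest to each other."""
    rel_pos, rel_vel = np.asarray(rel_pos, float), np.asarray(rel_vel, float)
    speed_sq = rel_vel @ rel_vel
    if speed_sq < 1e-9:                              # essentially no relative motion
        return np.inf
    return max(0.0, -(rel_pos @ rel_vel) / speed_sq)

# Relative position (m) and relative velocity (m/s) of three simultaneous obstacles:
encounters = {
    "drone":    ((300.0,   50.0,   0.0), (-20.0, -3.0,  0.0)),
    "bird":     ((150.0,  -20.0,   5.0), (-15.0,  2.0,  0.0)),
    "aircraft": ((1800.0,   0.0, 100.0), (-90.0,  0.0, -5.0)),
}
ranked = sorted(encounters, key=lambda name: time_to_closest_approach(*encounters[name]))
print(ranked)   # obstacles ordered from most to least urgent
```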

Our sensors need to be able to accurately see and understand the environment. We require 3D coverage around the vehicle to provide situational awareness, and we need to see obstacles about 2 km (roughly 1.2 miles) away. We can’t have any blind spots around our vehicle when flying or when on the ground.

We have decided on a redundant, multimodal sensing setup (see figure 7):

  1. Cameras: The workhorse of almost all autonomous vehicles, cameras are “passive” sensors that measure incoming light on a 2D sensor array.
  2. Lidar: An “active” sensor that emits laser light and provides high-resolution 3D representations of the environment.
  3. Radar: Another “active” sensor, which uses radio waves for obstacle detection and distance estimation.

Figure 7: Rendering of example 3D sensor coverage. Image source: A³ by Airbus
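
To show why the three modalities complement each other, here is a toy fusion sketch: the camera contributes the object class and bearing, the lidar a precise 3D position, and the radar the closing speed. All values and field names are invented for illustration.

```python
# Invented single-object detections from the three sensor modalities.
camera_det = {"label": "drone", "bearing_deg": 12.4, "elevation_deg": 3.1}
lidar_det  = {"position_m": (410.0, 90.0, 22.0)}           # x, y, z in the vehicle frame
radar_det  = {"range_m": 420.5, "range_rate_mps": -38.0}   # negative range rate = closing

def fuse(camera, lidar, radar):
    """Merge the complementary measurements into a single track."""
    return {
        "label": camera["label"],                        # class comes from the camera
        "position_m": lidar["position_m"],               # geometry comes from the lidar
        "closing_speed_mps": -radar["range_rate_mps"],   # dynamics come from the radar
    }

print(fuse(camera_det, lidar_det, radar_det))
```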

To illustrate the need for 3D sensor coverage, consider one of the many operational corner cases: a construction crane arm is turned away from the vertiport when Vahana lands, but swings over Vahana before it takes off again. Our system must be able to sense and adapt its mission to this kind of rapidly changing environment.

Another essential component of successful sense-and-avoid systems is data collection. For algorithm development and verification, we require a large training data set that includes all of the obstacles we want to avoid. It is not easy to get 100,000 relevant images of Canada geese or of a particular Cessna model! To be useful, images must be taken from the viewpoints relevant to our aircraft (e.g. head-on) and include additional metadata such as kinematic properties (speed, size, heading, etc.). To cover all cases, we need data for each object against differing backgrounds and in varied weather and light conditions. The goal is to train the system to properly perform the sense-and-avoid tasks in all possible flight conditions.
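
A sketch of the per-image metadata described above might look like the following; the field names and category values are our own hypothetical examples, not the actual data schema.

```python
from dataclasses import dataclass

@dataclass
class TrainingSample:
    image_path: str        # e.g. one 12 MP frame from the observation aircraft
    obstacle_class: str    # "canada_goose", "cessna", "hexacopter", ...
    viewpoint: str         # "head_on", "crossing", "overtaking"
    speed_mps: float       # obstacle speed
    size_m: float          # characteristic dimension (wingspan, rotor diameter)
    heading_deg: float     # obstacle heading
    background: str        # "open_sky", "urban_clutter", "horizon"
    weather: str           # "clear", "overcast", "haze"
    light: str             # "midday", "dawn_dusk", "backlit"
```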

We decided to collect our own data using drones and hexacopters as surrogate vehicles. We created a systematic and repeatable setup for data collection and algorithm verification, with an observation drone emulating the flight path of Vahana and a target drone emulating the obstacles Vahana may encounter (see figures 8, 9, 10, and 11).

Figure 8: Data collection in an urban environment. Image source: A³ by Airbus

Figure 9: Observation drone. Image source: Velodyne Lidar

Figure 10: Example for flight path way points. Image source: Google Maps

Figure 11: Team preparing for a test flight. Image source: A³ by Airbus

Data collection is nothing without data annotation. In addition to creating our own data collection environment, we also had to build a domain-specific solution to annotate hundreds of thousands of images with relevant information, such as obstacle type and ground/sky regions (see figures 12, 13, and 14).

Figure 12: 12 MP image recording against complex background. Image source: A³ by Airbus

Figure 13: Data annotation example (drone, bird, horizon). Image source: A³ by Airbus

Figure 14: Data annotation example (drone, horizon). Image source: A³ by Airbus
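
For illustration, an annotated frame of the kind shown in figures 13 and 14 might be stored roughly like this; the format, coordinates, and labels below are hypothetical, not our actual annotation schema.

```python
# One hypothetical annotation record for a single 12 MP frame.
annotation = {
    "image": "flight_042/frame_001873.png",
    "objects": [
        {"class": "drone", "bbox_px": [1812, 904, 1868, 948]},    # x1, y1, x2, y2
        {"class": "bird",  "bbox_px": [2301, 1110, 2330, 1131]},
    ],
    "horizon_px": [[0, 1502], [4000, 1466]],   # two points defining the horizon line
}
```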

Proper data annotation is essential because the detection accuracy of the algorithms trained on the annotated data is strongly correlated with the accuracy of the annotations themselves.

Our team therefore boosted the quality of the annotation data by developing our own tools and a rigorous, iterative process. See figure 15 for an example of the output of such a tool.

Figure 15: Example plot to evaluate the distribution of observed locations of the target drone for a particular flight. The X and Y coordinates correspond to relative drone position in the image frame. The blue markers indicate usable drone positions, while the red markers indicate drone positions not used for training. The gray horizontal lines indicate the horizon as observed during the flight. Image source: A³ by Airbus
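
As a rough illustration of this kind of diagnostic, the sketch below plots synthetic (randomly generated) target positions and splits them with a made-up acceptance rule; figure 15 shows the real tooling and real flight data.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.uniform(0, 4000, 500)     # target drone x position in the image (px)
y = rng.uniform(0, 3000, 500)     # target drone y position in the image (px)
horizon_y = 1500                  # assumed horizon row for this synthetic "flight"

usable = y < horizon_y + 200      # made-up acceptance rule for illustration only
plt.scatter(x[usable], y[usable], s=4, c="tab:blue", label="used for training")
plt.scatter(x[~usable], y[~usable], s=4, c="tab:red", label="rejected")
plt.axhline(horizon_y, color="gray", label="observed horizon")
plt.gca().invert_yaxis()          # image coordinates: y grows downward
plt.xlabel("image x (px)")
plt.ylabel("image y (px)")
plt.legend()
plt.show()
```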

The critical operation in sense-and-avoid systems is perception: the ability to identify and characterize objects in an environment based on raw sensor data inputs. The basic perception requirements for Vahana are:

  1. It must be robust against a variety of environmental conditions (e.g. illumination, background, foreground, clutter).
  2. It must be robust against variations in object conditions (e.g. viewpoints, variance in object geometry and appearance).
  3. It must be very fast at object detection (the goal being real-time).
  4. It must fit a resource-constrained, on-board solution.

Figure 16 exemplifies a particularly challenging perception task, where our system is required to detect a drone against a cluttered background with a complicated horizon, a mix of artificial and natural objects, and objects of very similar appearance.

Figure 16: Perception in the wild. Image source: A³ by Airbus

Because the data produced by a vision system is very complex, many techniques have been developed to simplify its processing. A standard approach to processing an image is to first look for simple “features” in the image and then use those features to extract useful information. For example, edge features in the image can be identified, and the arrangement and distribution of edges can be used to determine that the image shows a particular object.
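
The classic edge-based pipeline looks roughly like the following OpenCV sketch; this is generic computer vision code for illustration, not our perception stack, and the thresholds and test image are arbitrary.

```python
import cv2

image = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)       # any test image
edges = cv2.Canny(image, threshold1=100, threshold2=200)    # binary edge map
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# Downstream logic would analyze the arrangement and distribution of these
# edges/contours to decide what object, if any, the image shows.
print(f"{len(contours)} edge contours found")
```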

In “traditional” computer vision, object features are selected manually and analyzed using highly tuned, application-specific methods. For this reason, traditional computer vision methods are rarely generalizable and usually lack robustness to variations found in real-world situations such as variable illumination conditions, object rotation and scaling, and intra-class variation (e.g., the same airplane painted a different color). Deep Learning, on the other hand, is a specific Artificial Intelligence method that has been shown to be robust not only to real-world variations, but also across domains.

Deep Learning utilizes artificial neural networks as its main component for processing information. Neural networks consist of interconnected groups of nodes, with each node processing information based on a defined function. Deep neural networks (DNNs) are neural networks with a deep and complex layering of such nodes. Rather than relying on the hand-tuned features of traditional vision techniques, DNNs are trained with large amounts of data to robustly perform classification tasks.

For our application, we developed specific classes of Convolutional Neural Networks for Deep Learning. Training of such networks is performed on multi-GPU, high-performance computers optimized for Deep Learning training on massive data sets.
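
To make the idea concrete, here is a deliberately tiny convolutional classifier in PyTorch; the real networks described here are far larger, trained on multi-GPU hardware, and their architecture is not public, so treat this only as a sketch of what a CNN looks like.

```python
import torch
import torch.nn as nn

class TinyObstacleClassifier(nn.Module):
    def __init__(self, num_classes: int = 4):   # e.g. drone, bird, aircraft, background
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes),
        )

    def forward(self, x):                        # x: (batch, 3, H, W) image tensor
        return self.head(self.features(x))

model = TinyObstacleClassifier()
logits = model(torch.randn(1, 3, 224, 224))      # one random RGB "image"
print(logits.shape)                              # torch.Size([1, 4])
```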

We train a network that we then port to an on-board compute platform. The specific challenge is to optimize our deployed Deep Learning network for high detection accuracy while providing real-time output, given the very large images generated by our high-resolution cameras. Our image sizes are significantly larger than what is currently used in autonomous cars. The output of the perception system includes the class of the detected object (e.g. drone, bird, plane), its position, its trajectory, and its distance from the sensor (and therefore from our aircraft); see figure 17.

Figure 17: Example of algorithm development. Image source: A³ by Airbus, Nvidia
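
Put as a data structure, the per-object output described above might look like the following sketch; the field names, types, and units are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PerceivedObject:
    label: str               # detected class: "drone", "bird", "plane", ...
    confidence: float        # classifier score in [0, 1]
    position_m: tuple        # (x, y, z) relative to the aircraft
    velocity_mps: tuple      # estimated trajectory as a velocity vector
    range_m: float           # distance from the sensor (and hence the aircraft)

detection = PerceivedObject("drone", 0.93, (410.0, 90.0, 22.0), (-35.0, 4.0, 0.0), 420.5)
print(detection)
```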

In the example below, the system was tasked with detecting multiple birds in the distance against an open sky with dense cloud coverage. It had to detect several small objects in parallel despite the large background variance introduced by the clouds (see figures 18 and 19).

Figure 18: Original image from 12 MP camera. Image source: A³ by Airbus

Figure 19: Positions of detected birds superimposed. Image source: A³ by Airbus

Recently we shared our initial results at the NVIDIA GTC conference. By far the most rewarding feedback was, “I didn’t know what you showed was even possible!” We also discussed self-flying air taxis in greater detail on their AI Podcast with host Michael Copeland. Take a listen here and here.

The team is now focused on integrating the sense-and-avoid functions with the full-scale Vahana vehicle for flight demonstrations, in parallel with continued development of deep learning algorithms and data collection systems.

Figure 20: Autonomous Systems crew. Image source: A³ by Airbus

We are always looking for top engineers to join the team who bring their experience in areas such as robotics, deep learning, perception, and decision making, and who share our excitement to make Vahana and Urban Air Mobility a reality! Check out our current job postings here.

- Arne Stoschek