Design Food Delivery System like Uber Eats (Mock Interview)

Design Food Delivery System like Uber Eats (Mock Interview) was originally published on Exponent.

In this system design mock interview, we asked Neeraj, an Engineering Manager at eBay, to describe the design process for Uber Eats, a food delivery app.

Functional requirements

Many components are required in a system that delivers food:

  • restaurants,
  • customers,
  • delivery drivers,
  • and the orders themselves.

The first step is to define our functional requirements.

A functional requirement is an operation our software must be able to perform, like adding a restaurant or removing a menu item, for example.

The system should allow restaurants to easily add, update, or remove menu items and information.

On the customer side, users should be able to view, search, and filter restaurants based on distance, types of food, and estimated delivery time.

Non-functional requirements

Non-functional requirements define a system’s attributes. In our example, this would be:

  • scalability,
  • data consistency,
  • security,
  • availability,
  • and operational latency.

Some non-functional requirements are more critical than others.

For example, updates to restaurant details or menu items don’t need to appear immediately.

If a restaurant releases a new menu item, it’s fine if our system takes a few seconds to update.

High availability, on the other hand, is a top priority.

It’s better to have a functional platform with slightly outdated data than a system that crashes on users.

Security is also an important consideration.

We don’t want users to access or manipulate information they don’t own. We could use identity and access management to prevent unauthorized access, and logging to detect and audit it.

Lastly, we must address latency. Important metrics include the time it takes to load a restaurant page or conduct a search.

Fast response times are desired, such as:

  • under 150 milliseconds for viewing pages,
  • and under 400 milliseconds for conducting restaurant searches.

Operations such as uploading images or new menu items are less critical, and it’s ok if they have a slightly higher latency.

Understanding the scale of our data

For a system design to be effective, we need to know how much data we’re working with.

Let’s assume our Uber Eats system adds 100 restaurants and 10,000 customers daily. Over 20 years, the system would have roughly 730,000 restaurants (100 × 365 × 20) and 73 million customers (10,000 × 365 × 20).

That’s a fair estimate for the amount of data our system must be able to support.

Lastly, let’s also estimate restaurant views and searches at around 100 million per day.
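
The growth assumptions above (100 restaurants and 10,000 customers added per day for 20 years, plus 100 million views and searches per day) can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch; the variable names are illustrative, and it assumes 365-day years and traffic spread evenly across the day.

```python
DAYS_PER_YEAR = 365
YEARS = 20
SECONDS_PER_DAY = 24 * 60 * 60

restaurants_per_day = 100
customers_per_day = 10_000
reads_per_day = 100_000_000  # restaurant views plus searches

total_restaurants = restaurants_per_day * DAYS_PER_YEAR * YEARS
total_customers = customers_per_day * DAYS_PER_YEAR * YEARS

# Average load only; real traffic peaks around meal times.
average_reads_per_second = reads_per_day / SECONDS_PER_DAY

print(f"{total_restaurants:,} restaurants")
print(f"{total_customers:,} customers")
print(f"{average_reads_per_second:,.0f} reads per second on average")
```

The average works out to a little over a thousand reads per second, so the system should be sized for a peak that is some multiple of that.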

Defining data models

Each component (such as a menu, restaurant, order, etc.) in our system has data fields. A data model lists those fields and describes how each component interacts with other components.

For example, a menu item would have fields such as:

  • item name,
  • currency,
  • price,
  • and image.

A restaurant would have:

  • name,
  • unique ID,
  • address,
  • city and location,
  • and distance from a user’s location.

🗺️

The distance can be calculated using latitude and longitude data points.
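
One common way to turn two latitude/longitude pairs into a distance is the haversine formula, sketched below. The function name and the use of a mean Earth radius of 6,371 km are assumptions for illustration. The result is the straight-line ("as the crow flies") distance, not the distance a driver would actually travel by road.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant)


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)

    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of latitude is roughly 111 km.
one_degree = haversine_km(0.0, 0.0, 1.0, 0.0)
```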

Customers would have fields such as:

  • name,
  • delivery address,
  • country,
  • latitude,
  • and longitude.
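
As a rough sketch, the fields listed above could be written down as Python dataclasses. The class and field names are illustrative, not a fixed schema. Two assumptions are made here: the price is stored as an integer in minor currency units (cents) to avoid floating-point rounding, and the image is stored as a URL rather than raw bytes. The distance from a user is left out because it depends on who is asking, so it would be computed per request rather than stored.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MenuItem:
    name: str
    currency: str           # e.g. "USD"
    price_minor_units: int  # assumption: 1299 means 12.99
    image_url: str          # assumption: a link to the image, not the image itself


@dataclass
class Restaurant:
    restaurant_id: str
    name: str
    address: str
    city: str
    latitude: float
    longitude: float
    menu: List[MenuItem] = field(default_factory=list)


@dataclass
class Customer:
    name: str
    delivery_address: str
    country: str
    latitude: float
    longitude: float


# Hypothetical sample data.
pizzeria = Restaurant("r-1", "Sample Pizzeria", "1 Main St", "Springfield", 39.80, -89.64)
pizzeria.menu.append(MenuItem("Margherita", "USD", 1299, "https://example.com/margherita.jpg"))
```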

Calculating distances

A common task in location-specific applications is calculating delivery times between two locations. Here, we need to calculate delivery times between restaurants and customers.

One way to do this is with GeoHashing. This method divides the world map into small grids and expresses each location as a short alphanumeric string.

We would use a longer alphanumeric string to represent a smaller (more specific) geographic area. Conversely, a larger geographic area can be represented with a shorter string.

Using this method, we can group multiple restaurants in a small area, such as a shopping center, and treat them as a single data unit. The delivery time will be relatively constant whether it’s from the first shop or the tenth. This simplifies our design.
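
To make the idea concrete, here is a minimal sketch of the standard geohash encoding. It repeatedly halves the longitude and latitude ranges, records one bit per split, and packs every five bits into one base-32 character. The function name and the default precision are illustrative choices; in practice you would more likely use an existing library.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat: float, lon: float, precision: int = 7) -> str:
    """Encode a latitude/longitude pair as a geohash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude

    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits = bits << 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits = bits << 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # five bits make one base-32 character
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0

    return "".join(chars)


coarse = geohash_encode(57.64911, 10.40744, precision=5)
fine = geohash_encode(57.64911, 10.40744, precision=9)
```

Because each extra character only refines the previous cell, the shorter hash is always a prefix of the longer one. Restaurants in the same shopping center will typically share a long common prefix, which is what lets us group them, although two neighbors that sit on opposite sides of a cell boundary can end up with different prefixes.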

Database design

We could use a relational database with separate tables for customers, menu items, restaurants, etc. We could also create multiple read replicas to speed up reads.

Databases typically receive more read requests than write requests. A read replica is a copy of a primary database dedicated to serving read requests.

Since the number of users is high in our example (~73 million), a NoSQL database such as Cassandra may be more appropriate, as it supports automatic sharding.

🏢

Sharding is the process of storing a large database across multiple machines. It allows you to scale your database by providing increased read/write throughput, storage capacity, and high availability.
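
The simplest way to picture sharding is to hash each record’s key and use the result to pick a machine. The sketch below does this with a plain modulo; the function name and the key format are made up for illustration. This is deliberately simplified: with modulo hashing, changing the number of shards remaps most keys, which is why systems such as Cassandra use consistent hashing instead.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index in the range [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # A stable hash is needed: Python's built-in hash() varies between runs.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical customer IDs spread over four shards.
counts = [0, 0, 0, 0]
for customer_number in range(1000):
    counts[shard_for(f"customer-{customer_number}", 4)] += 1
```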

Designing specific services

In designing our system, we could start by creating a layer that orchestrates services across different platforms, such as iOS, Android, mobile web, and desktop web. This will be useful for sharing back-end resources.

Next, we could implement a service for restaurants.

This would have a load balancer on top to distribute traffic across multiple machines.
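
As an illustration of what “distributing traffic” means, here is a minimal round-robin sketch that hands each request to the next machine in turn. The class name and host names are hypothetical. A real load balancer would also track server health and might weigh servers by load, which this sketch ignores.

```python
import itertools
from typing import Iterable


class RoundRobinBalancer:
    """Cycle through a fixed list of backends, one per request."""

    def __init__(self, backends: Iterable[str]) -> None:
        backend_list = list(backends)
        if not backend_list:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(backend_list)

    def next_backend(self) -> str:
        return next(self._cycle)


balancer = RoundRobinBalancer(["restaurant-svc-1", "restaurant-svc-2", "restaurant-svc-3"])
first_four = [balancer.next_backend() for _ in range(4)]
```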

Handling Images

When storing data, we must be careful with specific fields such as menu images. Sending images over the network and storing them in a database may be too resource-intensive, so it’s better to use object storage like S3 for efficiency.

We could also have an image service with a moderation policy. This policy would use machine learning models to ensure all uploaded images are accurate and meet quality standards.

Many trained models for image classification already exist and can be leveraged. These models use a classification algorithm to determine whether an image is appropriate.

The model’s performance can be determined by testing it on a sample data set and calculating the precision and recall values.

Imagine you go fishing. You use a wide net and catch 80 out of 100 fish in a lake. That’s 80% recall. But you also get 80 rocks in your net. You have 50% precision because your net has 80 rocks and 80 fish.

You could use a smaller net and target an area with many fish and no rocks, but you might catch only 20 fish in exchange for 0 rocks. That’s 20% recall and 100% precision.
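
The fishing example maps directly onto the usual definitions: fish caught are true positives, rocks caught are false positives, and fish left in the lake are false negatives. A small sketch, with illustrative function names:

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Share of everything we caught that was actually a fish."""
    caught = true_positives + false_positives
    return true_positives / caught if caught else 0.0


def recall(true_positives: int, false_negatives: int) -> float:
    """Share of all the fish in the lake that we caught."""
    total = true_positives + false_negatives
    return true_positives / total if total else 0.0


# Wide net: 80 fish caught, 80 rocks caught, 20 fish missed.
wide_net = (precision(80, 80), recall(80, 20))

# Small net: 20 fish caught, 0 rocks caught, 80 fish missed.
small_net = (precision(20, 0), recall(20, 80))
```

For image moderation, the trade-off is the same: a strict model rejects some acceptable images (lower recall on good images), while a lenient one lets some inappropriate images through (lower precision).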

If our image validation process is slow, one solution could be to optimize the API or add more computing resources, such as GPUs or multiple CPUs, to increase speed.

We can also speed up data retrieval by caching frequently requested information, using an eviction policy such as LRU (Least Recently Used) to decide what to discard when the cache is full, and by compressing the data we store and send.
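
For reference, an LRU cache keeps a fixed number of entries and, when full, discards the one that has gone unused the longest. Below is a minimal in-memory sketch built on `collections.OrderedDict`; the class name and the sample keys are illustrative, and a production system would more likely rely on a dedicated cache service.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-size cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop the least recently used


cache = LRUCache(capacity=2)
cache.put("restaurant:1", "menu A")
cache.put("restaurant:2", "menu B")
cache.get("restaurant:1")            # restaurant:1 is now the most recent
cache.put("restaurant:3", "menu C")  # evicts restaurant:2
```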

Search Functionality

Our system will need a search service. The search service identifies the closest restaurants to a user based on location.

For distance-based searches, such as finding nearby restaurants, we can use Elasticsearch, which provides different geo-queries. A restaurant’s coordinates can be stored in Elasticsearch, and the user’s coordinates can be used to run a geo-query to determine nearby restaurants.
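
A nearby-restaurant search might look roughly like the request body below, built here as a plain Python dictionary. It uses Elasticsearch’s `geo_distance` query to keep only restaurants within a radius and a `_geo_distance` sort to return the closest first. The field name `location` (which would need to be mapped as a `geo_point`), the helper name, and the default page size are assumptions for this sketch.

```python
def nearby_restaurants_query(lat: float, lon: float, radius_km: float, size: int = 20) -> dict:
    """Build a search body for restaurants within radius_km of the user."""
    user_location = {"lat": lat, "lon": lon}
    return {
        "size": size,
        "query": {
            "bool": {
                "filter": {
                    "geo_distance": {
                        "distance": f"{radius_km}km",
                        "location": user_location,  # assumed geo_point field
                    }
                }
            }
        },
        "sort": [
            {
                "_geo_distance": {
                    "location": user_location,
                    "order": "asc",
                    "unit": "km",
                }
            }
        ],
    }


body = nearby_restaurants_query(40.7128, -74.0060, radius_km=5)
```

The dictionary would then be sent as the body of a search request against the restaurant index.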

💡

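Recent versions of Elasticsearch index geo points with a BKD tree (a block k-dimensional tree) rather than the geohash-based encoding used in earlier versions, which makes these location queries more efficient.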

We also need a way to filter restaurants by delivery time.

We can create a polygon map that defines the region a restaurant can deliver to within a specific period and store it in a database such as DynamoDB.

This allows users to view restaurants that can deliver to them within a specified time.
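
Checking whether a customer falls inside a restaurant’s delivery polygon can be done with a standard point-in-polygon test. The sketch below uses ray casting: it counts how many polygon edges a ray from the point crosses, and an odd count means the point is inside. The function name and the sample zone are hypothetical, and the code treats latitude and longitude as flat coordinates, which is a reasonable approximation only for city-sized regions that do not cross the antimeridian.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (latitude, longitude)


def point_in_polygon(lat: float, lon: float, polygon: List[Point]) -> bool:
    """Return True if (lat, lon) lies inside the polygon given by its vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which this edge crosses the point's latitude.
            crossing_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < crossing_lon:
                inside = not inside
    return inside


# Hypothetical 30-minute delivery zone, drawn as a simple square.
delivery_zone = [(0.0, 0.0), (0.0, 10.0), (10.0, 10.0), (10.0, 0.0)]
customer_in_zone = point_in_polygon(5.0, 5.0, delivery_zone)
customer_out_of_zone = point_in_polygon(15.0, 5.0, delivery_zone)
```

With one polygon stored per restaurant and time window, filtering by delivery time becomes a matter of running this test (or the database’s own geo-shape query) against the customer’s coordinates.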