**Problem**:

“Locating a lost bag in a NYC Medallion taxi”

In the weekend, you traveled in a taxi in the Manhattan area. While seated in the rear seat, you noticed the cab driver’s license — it was a long name but you recalled at least the last name — it was SINGH. You got down from the taxi at the intersection of 37th street and 5th Avenue. You paid the fare and the taxi left. All of a sudden, you realized that you left your bag (containing some valuables) behind in the taxi.

Naturally, you reported this in the local police station and also tried to track the taxi by calling the Taxi and Limousine commission. They said that in order to locate the taxi driver, they need the Medallion number. Unfortunately, you never noted down the Medallion number.

Feeling really bad about the loss, you are contemplating on what to do. An idea crosses your mind. You search the Internet and are able to find the most updated list of Medallion taxi drivers. You are also able to locate the Medallion numbers of all the cabs that traveled in Manhattan that day. Your job is to locate the driver in the most efficient manner.

There are about 42000 Medallion taxi drivers and about 4500 cabs that traveled in Manhattan that day.

**Hints**:

Let’s say, there are n Medallion cab drivers and m cabs that traveled in Manhattan. There are k drivers with the last name as SINGH. Clearly, n > m > k.

Here are some possibilities to approach the problem:

1. For every cab that traveled in Manhattan that day, search for the matching Medallion number and driver’s last name. This will cost you O(m * n).

2. Create a list (of length k) of all drivers with the last name SINGH. This will cost you O(n). Now, go through the cabs that traveled in Manhattan and try to match the Medallion number in the list you just created. This will cost you O(m * k). Total cost = O(n + m*k).

3. Create a list (of length k) of all drivers with the last name SINGH and sort the list. This will cost you O(n + k^2). Now, do a binary search for the Medallion number in the sorted list. Total cost = O(n + k^2 + m * log k). This can be reduced further to O(n + (k + m) * log k) by choosing a better sorting algorithm. Let’s shoot for O(n + k^2 + m * log k).

**Starter files**:

Copy the starter files from here:

/comp/15/public_html/labs/lab5/problem/

**Expected Output**:

Total number of legal taxi drivers = 41666

———————

Total matching drivers = 1061

———————

Total number of taxis traveling in Manhattan = 4496

———————

Match 1:

<MEDALLION NUM>: <LAST NAME>, <FIRST NAME>

**Solution**:

The solution files are here.