Zomato Recommendation System

Context

I have always been fascinated by the food culture of Bengaluru. Restaurants from all over the world can be found here: from the United States to Japan and Russia, you get all types of cuisine. Delivery, dine-out, pubs, bars, drinks, buffets, desserts: you name it and Bengaluru has it. Bengaluru is the best place for foodies. The number of restaurants is increasing day by day and currently stands at approximately 12,000. Even with such a high number, the industry hasn't been saturated yet, and new restaurants open every day. However, it has become difficult for them to compete with already established restaurants. The key issues that continue to challenge them include high real estate costs, rising food costs, a shortage of quality manpower, a fragmented supply chain and over-licensing. This Zomato data aims at analysing the demography of each location. Most importantly, it will help new restaurants decide their theme, menu, cuisine, cost, etc. for a particular location. It also aims at finding similarities between neighbourhoods of Bengaluru on the basis of food. The dataset also contains reviews for each restaurant, which will help in finding the overall rating for the place.

In this notebook I will try to analyze the business problem of Zomato and create a practical recommendation system for users.


What is a Recommendation System?

The rapid growth of data collection has led to a new era of information. Data is being used to create more efficient systems, and this is where recommendation systems come into play. Recommendation systems are a type of information filtering system: they improve the quality of search results and provide items that are more relevant to the search item or related to the user's search history. They are active information filtering systems that personalize the information reaching a user based on the user's interests, the relevance of the information, etc. Recommender systems are widely used for recommending movies, articles, restaurants, places to visit, items to buy, and so on.

There are basically three types of recommender systems:

  • Demographic Filtering: offers generalized recommendations to every user, based on popularity and/or category (for movies, popularity and genre). The system recommends the same items to users with similar demographic features.

  • Content-Based Filtering: suggests similar items based on a particular item. The system uses item metadata (for movies: genre, director, description, actors, etc.) to make these recommendations.

  • Collaborative Filtering: matches people with similar interests and provides recommendations based on this matching. Collaborative filters do not require item metadata, unlike their content-based counterparts.

Here I will be using Content-Based Filtering.

Content-Based Filtering: This method uses only the descriptions and attributes of the items a user has previously consumed to model that user's preferences. In other words, these algorithms try to recommend items that are similar to those a user liked in the past (or is examining in the present). In particular, candidate items are compared with items previously rated by the user, and the best-matching items are recommended.
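
To make the idea of "best-matching items" concrete, here is a minimal plain-Python sketch of content-based matching using cosine similarity over raw word counts. The item names and descriptions are hypothetical and are not taken from the Zomato data; `cosine_similarity` and `most_similar` are illustrative helper names, not part of any library used in this notebook.

```python
import math
from collections import Counter

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between two texts using raw word counts."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical item descriptions (invented for illustration).
items = {
    "Cafe A": "coffee cake dessert cozy",
    "Cafe B": "coffee sandwich cozy quiet",
    "Grill C": "kebab biryani spicy grill",
}

def most_similar(liked: str, catalogue: dict) -> list:
    """Rank every other item by similarity to the liked item, best first."""
    scores = [(name, cosine_similarity(catalogue[liked], text))
              for name, text in catalogue.items() if name != liked]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

ranking = most_similar("Cafe A", items)
print(ranking)
```

Here "Cafe B" ranks first because it shares two of its four words with "Cafe A", while "Grill C" shares none.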

This data set consists of restaurants in Bangalore, India, collected from Zomato.

My aim is to create a content-based recommender system: given a restaurant name, the system looks at the reviews of the other restaurants, recommends those with similar reviews, and sorts them from the highest rated.
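
The goal described above could be sketched as follows, assuming TF-IDF vectors over the review text and `linear_kernel` for similarity (both are imported later in this notebook). The toy DataFrame, its column names and the `recommend` helper are hypothetical and only illustrate the flow; they are not the notebook's final implementation.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

# Hypothetical toy data; names, reviews and ratings are invented for illustration.
toy = pd.DataFrame({
    "name": ["Cafe A", "Cafe B", "Grill C", "Cafe D"],
    "reviews": [
        "great coffee and cozy ambience",
        "cozy place with great coffee and cake",
        "spicy kebab and smoky grill platter",
        "good coffee but slow service",
    ],
    "rating": [4.1, 3.9, 4.5, 4.3],
}).set_index("name")

# TF-IDF rows are L2-normalised by default, so the dot product equals cosine similarity.
tfidf = TfidfVectorizer(stop_words="english")
matrix = tfidf.fit_transform(toy["reviews"])
similarity = linear_kernel(matrix, matrix)

def recommend(name: str, top_n: int = 2) -> pd.DataFrame:
    """Return the top_n restaurants most similar to `name`, sorted by rating."""
    idx = toy.index.get_loc(name)
    scores = pd.Series(similarity[idx], index=toy.index).drop(name)
    candidates = scores.nlargest(top_n).index
    return toy.loc[candidates, ["rating"]].sort_values("rating", ascending=False)

result = recommend("Cafe A")
print(result)
```

Similarity picks the candidates and the rating only orders them, so a highly rated but unrelated restaurant ("Grill C" here) is not recommended.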

Breakdown of this notebook:

  1. Loading the dataset: Load the data and import the libraries.
  2. Data Cleaning:
    • Deleting redundant columns.
    • Renaming the columns.
    • Dropping duplicates.
    • Cleaning individual columns.
    • Removing the NaN values from the dataset.
    • Applying some transformations.
  3. Text Preprocessing
    • Cleaning unnecessary words in the reviews
    • Removing links and other unnecessary items
    • Removing symbols
  4. Recommendation System

Importing Libraries

In [1]:
#Importing Libraries
import numpy as np
import pandas as pd
import seaborn as sb
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LogisticRegression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import r2_score
import warnings
warnings.filterwarnings('ignore')  # suppress library warnings in the notebook output
import re
from nltk.corpus import stopwords
from sklearn.metrics.pairwise import linear_kernel
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfVectorizer

Loading the dataset

In [2]:
#reading the dataset
zomato_real=pd.read_csv("../input/zomato-bangalore-restaurants/zomato.csv")
zomato_real.head() # shows the first 5 rows of the DataFrame (the default)
Out[2]:
url address name online_order book_table rate votes phone location rest_type dish_liked cuisines approx_cost(for two people) reviews_list menu_item listed_in(type) listed_in(city)
0 https://www.zomato.com/bangalore/jalsa-banasha... 942, 21st Main Road, 2nd Stage, Banashankari, ... Jalsa Yes Yes 4.1/5 775 080 42297555\r\n+91 9743772233 Banashankari Casual Dining Pasta, Lunch Buffet, Masala Papad, Paneer Laja... North Indian, Mughlai, Chinese 800 [('Rated 4.0', 'RATED\n A beautiful place to ... [] Buffet Banashankari
1 https://www.zomato.com/bangalore/spice-elephan... 2nd Floor, 80 Feet Road, Near Big Bazaar, 6th ... Spice Elephant Yes No 4.1/5 787 080 41714161 Banashankari Casual Dining Momos, Lunch Buffet, Chocolate Nirvana, Thai G... Chinese, North Indian, Thai 800 [('Rated 4.0', 'RATED\n Had been here for din... [] Buffet Banashankari
2 https://www.zomato.com/SanchurroBangalore?cont... 1112, Next to KIMS Medical College, 17th Cross... San Churro Cafe Yes No 3.8/5 918 +91 9663487993 Banashankari Cafe, Casual Dining Churros, Cannelloni, Minestrone Soup, Hot Choc... Cafe, Mexican, Italian 800 [('Rated 3.0', "RATED\n Ambience is not that ... [] Buffet Banashankari
3 https://www.zomato.com/bangalore/addhuri-udupi... 1st Floor, Annakuteera, 3rd Stage, Banashankar... Addhuri Udupi Bhojana No No 3.7/5 88 +91 9620009302 Banashankari Quick Bites Masala Dosa South Indian, North Indian 300 [('Rated 4.0', "RATED\n Great food and proper... [] Buffet Banashankari
4 https://www.zomato.com/bangalore/grand-village... 10, 3rd Floor, Lakshmi Associates, Gandhi Baza... Grand Village No No 3.8/5 166 +91 8026612447\r\n+91 9901210005 Basavanagudi Casual Dining Panipuri, Gol Gappe North Indian, Rajasthani 600 [('Rated 4.0', 'RATED\n Very good restaurant ... [] Buffet Banashankari
In [3]:
zomato_real.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 51717 entries, 0 to 51716
Data columns (total 17 columns):
 #   Column                       Non-Null Count  Dtype 
---  ------                       --------------  ----- 
 0   url                          51717 non-null  object
 1   address                      51717 non-null  object
 2   name                         51717 non-null  object
 3   online_order                 51717 non-null  object
 4   book_table                   51717 non-null  object
 5   rate                         43942 non-null  object
 6   votes                        51717 non-null  int64 
 7   phone                        50509 non-null  object
 8   location                     51696 non-null  object
 9   rest_type                    51490 non-null  object
 10  dish_liked                   23639 non-null  object
 11  cuisines                     51672 non-null  object
 12  approx_cost(for two people)  51371 non-null  object
 13  reviews_list                 51717 non-null  object
 14  menu_item                    51717 non-null  object
 15  listed_in(type)              51717 non-null  object
 16  listed_in(city)              51717 non-null  object
dtypes: int64(1), object(16)
memory usage: 6.7+ MB
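
The non-null counts above can be turned into a share of missing values per column, which helps when deciding what to drop. This sketch simply redoes the arithmetic on the numbers printed by `zomato_real.info()`. The notebook does not state why particular columns are dropped, so treat the link to `dish_liked` as an observation rather than the author's reasoning.

```python
# Non-null counts copied from the zomato_real.info() output above.
total_rows = 51717
non_null = {
    "rate": 43942,
    "phone": 50509,
    "location": 51696,
    "rest_type": 51490,
    "dish_liked": 23639,
    "cuisines": 51672,
    "approx_cost(for two people)": 51371,
}

# Share of missing values per column, largest first.
missing_share = {col: 1 - count / total_rows for col, count in non_null.items()}
for col, share in sorted(missing_share.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{col:<30} {share:6.1%}")
```

`dish_liked` is missing in roughly 54% of rows and `rate` in roughly 15%; the remaining columns are each missing in under 3% of rows.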

Data Cleaning and Feature Engineering

In [4]:
#Deleting Unnecessary Columns
zomato=zomato_real.drop(['url','dish_liked','phone'],axis=1) #Dropping the columns "url", "dish_liked" and "phone" and saving the result as "zomato"
In [5]:
#Removing the Duplicates
zomato.duplicated().sum()
zomato.drop_duplicates(inplace=True)
In [6]:
#Remove the NaN values from the dataset
zomato.isnull().sum()
zomato.dropna(how='any',inplace=True)
zomato.info() #.info() function is used to get a concise summary of the dataframe
<class 'pandas.core.frame.DataFrame'>
Int64Index: 43499 entries, 0 to 51716
Data columns (total 14 columns):
 #   Column                       Non-Null Count  Dtype 
---  ------                       --------------  ----- 
 0   address                      43499 non-null  object
 1   name                         43499 non-null  object
 2   online_order                 43499 non-null  object
 3   book_table                   43499 non-null  object
 4   rate                         43499 non-null  object
 5   votes                        43499 non-null  int64 
 6   location                     43499 non-null  object
 7   rest_type                    43499 non-null  object
 8   cuisines                     43499 non-null  object
 9   approx_cost(for two people)  43499 non-null  object
 10  reviews_list                 43499 non-null  object
 11  menu_item                    43499 non-null  object
 12  listed_in(type)              43499 non-null  object
 13  listed_in(city)              43499 non-null  object
dtypes: int64(1), object(13)
memory usage: 5.0+ MB
In [7]:
#Reading Column Names
zomato.columns
Out[7]:
Index(['address', 'name', 'online_order', 'book_table', 'rate', 'votes',
       'location', 'rest_type', 'cuisines', 'approx_cost(for two people)',
       'reviews_list', 'menu_item', 'listed_in(type)', 'listed_in(city)'],
      dtype='object')
In [8]:
#Changing the column names
zomato = zomato.rename(columns={'approx_cost(for two people)':'cost','listed_in(type)':'type',
                                  'listed_in(city)':'city'})
zomato.columns
Out[8]:
Index(['address', 'name', 'online_order', 'book_table', 'rate', 'votes',
       'location', 'rest_type', 'cuisines', 'cost', 'reviews_list',
       'menu_item', 'type', 'city'],
      dtype='object')
In [9]:
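#Some Transformations
zomato['cost'] = zomato['cost'].astype(str) #Changing the cost to string
# Replacing ',' with '.' so the string can be parsed as a float.
# Note: this turns a value such as '1,200' into 1.2, which is why small values like 1.2 appear
# in the cost column; use x.replace(',', '') instead to keep it as 1200.
zomato['cost'] = zomato['cost'].apply(lambda x: x.replace(',','.'))
zomato['cost'] = zomato['cost'].astype(float) # Changing the cost to Float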
zomato.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 43499 entries, 0 to 51716
Data columns (total 14 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   address       43499 non-null  object 
 1   name          43499 non-null  object 
 2   online_order  43499 non-null  object 
 3   book_table    43499 non-null  object 
 4   rate          43499 non-null  object 
 5   votes         43499 non-null  int64  
 6   location      43499 non-null  object 
 7   rest_type     43499 non-null  object 
 8   cuisines      43499 non-null  object 
 9   cost          43499 non-null  float64
 10  reviews_list  43499 non-null  object 
 11  menu_item     43499 non-null  object 
 12  type          43499 non-null  object 
 13  city          43499 non-null  object 
dtypes: float64(1), int64(1), object(12)
memory usage: 5.0+ MB
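
How the comma in the cost strings is handled changes the resulting numbers. The sketch below compares the two options on a hypothetical sample of raw strings, assuming the comma in the source data is a thousands separator (which the presence of values like 1.35 in the converted column suggests). The variable names are illustrative.

```python
import pandas as pd

# Hypothetical sample of raw cost strings in the same style as the dataset.
raw_cost = pd.Series(["800", "1,200", "300", "1,500"])

as_decimal = raw_cost.str.replace(",", ".", regex=False).astype(float)
as_thousands = raw_cost.str.replace(",", "", regex=False).astype(float)

print(as_decimal.tolist())    # comma treated as a decimal point
print(as_thousands.tolist())  # comma treated as a thousands separator
```

With the first option a cost of 1,200 ends up smaller than a cost of 800, which would distort any later comparison or filtering on cost.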
In [10]:
#Reading Rate of dataset
zomato['rate'].unique()
Out[10]:
array(['4.1/5', '3.8/5', '3.7/5', '3.6/5', '4.6/5', '4.0/5', '4.2/5',
       '3.9/5', '3.1/5', '3.0/5', '3.2/5', '3.3/5', '2.8/5', '4.4/5',
       '4.3/5', 'NEW', '2.9/5', '3.5/5', '2.6/5', '3.8 /5', '3.4/5',
       '4.5/5', '2.5/5', '2.7/5', '4.7/5', '2.4/5', '2.2/5', '2.3/5',
       '3.4 /5', '-', '3.6 /5', '4.8/5', '3.9 /5', '4.2 /5', '4.0 /5',
       '4.1 /5', '3.7 /5', '3.1 /5', '2.9 /5', '3.3 /5', '2.8 /5',
       '3.5 /5', '2.7 /5', '2.5 /5', '3.2 /5', '2.6 /5', '4.5 /5',
       '4.3 /5', '4.4 /5', '4.9/5', '2.1/5', '2.0/5', '1.8/5', '4.6 /5',
       '4.9 /5', '3.0 /5', '4.8 /5', '2.3 /5', '4.7 /5', '2.4 /5',
       '2.1 /5', '2.2 /5', '2.0 /5', '1.8 /5'], dtype=object)
In [11]:
#Dropping the 'NEW' and '-' placeholders, then removing '/5' from the ratings
zomato = zomato.loc[zomato.rate !='NEW']
zomato = zomato.loc[zomato.rate !='-'].reset_index(drop=True)
# isinstance(x, str) replaces the np.str alias, which was removed in NumPy 1.24
remove_slash = lambda x: x.replace('/5', '') if isinstance(x, str) else x
zomato.rate = zomato.rate.apply(remove_slash).str.strip().astype('float')
zomato['rate'].head()
Out[11]:
0    4.1
1    4.1
2    3.8
3    3.7
4    3.8
Name: rate, dtype: float64
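
An alternative way to clean the ratings is sketched below with `pd.to_numeric(..., errors="coerce")`, which turns any non-numeric label into NaN in one step instead of filtering each placeholder by name. The sample values are hypothetical but follow the formats shown by `zomato['rate'].unique()`.

```python
import pandas as pd

# Hypothetical sample covering the formats seen in zomato['rate'].unique().
raw_rate = pd.Series(["4.1/5", "3.8 /5", "NEW", "-", "2.9/5"])

# Strip the '/5' suffix and whitespace, then coerce non-numeric labels to NaN.
numeric = pd.to_numeric(
    raw_rate.str.replace("/5", "", regex=False).str.strip(),
    errors="coerce",
)
clean_rate = numeric.dropna().reset_index(drop=True)
print(clean_rate.tolist())
```

The trade-off is that coercion silently drops any unexpected label, whereas the explicit filters above would raise an error on an unknown value when converting to float.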
In [12]:
# Title-case the restaurant names and convert the Yes/No flags to booleans
zomato['name'] = zomato['name'].apply(lambda x: x.title())
zomato['online_order'] = zomato['online_order'].map({'Yes': True, 'No': False})
zomato['book_table'] = zomato['book_table'].map({'Yes': True, 'No': False})
zomato.cost.unique()
Out[12]:
array([800.  , 300.  , 600.  , 700.  , 550.  , 500.  , 450.  , 650.  ,
       400.  , 900.  , 200.  , 750.  , 150.  , 850.  , 100.  ,   1.2 ,
       350.  , 250.  , 950.  ,   1.  ,   1.5 ,   1.3 , 199.  ,   1.1 ,
         1.6 , 230.  , 130.  ,   1.7 ,   1.35,   2.2 ,   1.4 ,   2.  ,
         1.8 ,   1.9 , 180.  , 330.  ,   2.5 ,   2.1 ,   3.  ,   2.8 ,
         3.4 ,  50.  ,  40.  ,   1.25,   3.5 ,   4.  ,   2.4 ,   2.6 ,
         1.45,  70.  ,   3.2 , 240.  ,   6.  ,   1.05,   2.3 ,   4.1 ,
       120.  ,   5.  ,   3.7 ,   1.65,   2.7 ,   4.5 ,  80.  ])
In [13]:
zomato.head()
Out[13]:
address name online_order book_table rate votes location rest_type cuisines cost reviews_list menu_item type city
0 942, 21st Main Road, 2nd Stage, Banashankari, ... Jalsa True True 4.1 775 Banashankari Casual Dining North Indian, Mughlai, Chinese 800.0 [('Rated 4.0', 'RATED\n A beautiful place to ... [] Buffet Banashankari
1 2nd Floor, 80 Feet Road, Near Big Bazaar, 6th ... Spice Elephant True False 4.1 787 Banashankari Casual Dining Chinese, North Indian, Thai 800.0 [('Rated 4.0', 'RATED\n Had been here for din... [] Buffet Banashankari
2 1112, Next to KIMS Medical College, 17th Cross... San Churro Cafe True False 3.8 918 Banashankari Cafe, Casual Dining Cafe, Mexican, Italian 800.0 [('Rated 3.0', "RATED\n Ambience is not that ... [] Buffet Banashankari
3 1st Floor, Annakuteera, 3rd Stage, Banashankar... Addhuri Udupi Bhojana False False 3.7 88 Banashankari Quick Bites South Indian, North Indian 300.0 [('Rated 4.0', "RATED\n Great food and proper... [] Buffet Banashankari
4 10, 3rd Floor, Lakshmi Associates, Gandhi Baza... Grand Village False False 3.8 166 Basavanagudi Casual Dining North Indian, Rajasthani 600.0 [('Rated 4.0', 'RATED\n Very good restaurant ... [] Buffet Banashankari
In [14]:
zomato['city'].unique()
Out[14]:
array(['Banashankari', 'Bannerghatta Road', 'Basavanagudi', 'Bellandur',
       'Brigade Road', 'Brookefield', 'BTM', 'Church Street',
       'Electronic City', 'Frazer Town', 'HSR', 'Indiranagar',
       'Jayanagar', 'JP Nagar', 'Kalyan Nagar', 'Kammanahalli',
       'Koramangala 4th Block', 'Koramangala 5th Block',
       'Koramangala 6th Block', 'Koramangala 7th Block', 'Lavelle Road',
       'Malleshwaram', 'Marathahalli', 'MG Road', 'New BEL Road',
       'Old Airport Road', 'Rajajinagar', 'Residency Road',
       'Sarjapur Road', 'Whitefield'], dtype=object)
In [15]:
zomato.head()
Out[15]:
address name online_order book_table rate votes location rest_type cuisines cost reviews_list menu_item type city
0 942, 21st Main Road, 2nd Stage, Banashankari, ... Jalsa True True 4.1 775 Banashankari Casual Dining North Indian, Mughlai, Chinese 800.0 [('Rated 4.0', 'RATED\n A beautiful place to ... [] Buffet Banashankari
1 2nd Floor, 80 Feet Road, Near Big Bazaar, 6th ... Spice Elephant True False 4.1 787 Banashankari Casual Dining Chinese, North Indian, Thai 800.0 [('Rated 4.0', 'RATED\n Had been here for din... [] Buffet Banashankari
2 1112, Next to KIMS Medical College, 17th Cross... San Churro Cafe True False 3.8 918 Banashankari Cafe, Casual Dining Cafe, Mexican, Italian 800.0 [('Rated 3.0', "RATED\n Ambience is not that ... [] Buffet Banashankari
3 1st Floor, Annakuteera, 3rd Stage, Banashankar... Addhuri Udupi Bhojana False False 3.7 88 Banashankari Quick Bites South Indian, North Indian 300.0 [('Rated 4.0', "RATED\n Great food and proper... [] Buffet Banashankari
4 10, 3rd Floor, Lakshmi Associates, Gandhi Baza... Grand Village False False 3.8 166 Basavanagudi Casual Dining North Indian, Rajasthani 600.0 [('Rated 4.0', 'RATED\n Very good restaurant ... [] Buffet Banashankari
In [16]:
## Checking Null values
zomato.isnull().sum()
Out[16]:
address         0
name            0
online_order    0
book_table      0
rate            0
votes           0
location        0
rest_type       0
cuisines        0
cost            0
reviews_list    0
menu_item       0
type            0
city            0
dtype: int64
In [17]:
## Computing Mean Rating
restaurants = list(zomato['name'].unique())

# Average the rating over all rows that share a restaurant name.
# groupby/transform gives the same result as looping over the names, without chained assignment.
zomato['Mean Rating'] = zomato.groupby('name')['rate'].transform('mean')
In [18]:
zomato.head()
Out[18]:
address name online_order book_table rate votes location rest_type cuisines cost reviews_list menu_item type city Mean Rating
0 942, 21st Main Road, 2nd Stage, Banashankari, ... Jalsa True True 4.1 775 Banashankari Casual Dining North Indian, Mughlai, Chinese 800.0 [('Rated 4.0', 'RATED\n A beautiful place to ... [] Buffet Banashankari 4.118182
1 2nd Floor, 80 Feet Road, Near Big Bazaar, 6th ... Spice Elephant True False 4.1 787 Banashankari Casual Dining Chinese, North Indian, Thai 800.0 [('Rated 4.0', 'RATED\n Had been here for din... [] Buffet Banashankari 4.100000
2 1112, Next to KIMS Medical College, 17th Cross... San Churro Cafe True False 3.8 918 Banashankari Cafe, Casual Dining Cafe, Mexican, Italian 800.0 [('Rated 3.0', "RATED\n Ambience is not that ... [] Buffet Banashankari 3.800000
3 1st Floor, Annakuteera, 3rd Stage, Banashankar... Addhuri Udupi Bhojana False False 3.7 88 Banashankari Quick Bites South Indian, North Indian 300.0 [('Rated 4.0', "RATED\n Great food and proper... [] Buffet Banashankari 3.700000
4 10, 3rd Floor, Lakshmi Associates, Gandhi Baza... Grand Village False False 3.8 166 Basavanagudi Casual Dining North Indian, Rajasthani 600.0 [('Rated 4.0', 'RATED\n Very good restaurant ... [] Buffet Banashankari 3.800000
In [19]:
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range = (1,5))

zomato[['Mean Rating']] = scaler.fit_transform(zomato[['Mean Rating']]).round(2)

zomato.sample(3)
Out[19]:
address name online_order book_table rate votes location rest_type cuisines cost reviews_list menu_item type city Mean Rating
34037 110-A, Westminister Building, Cunningham Road,... Brownie Heaven True False 4.2 34 Cunningham Road Dessert Parlor Desserts, Fast Food 300.0 [('Rated 4.0', "RATED\n So we wanted to have ... [] Delivery MG Road 3.96
12444 Royal Orchid Central, Level 10,Manipal Centre,... Ging - Royal Orchid Central True True 4.3 226 MG Road Bar, Casual Dining Asian 2.0 [('Rated 5.0', "RATED\n In love with this pla... [] Pubs and bars Frazer Town 4.28
28326 45/1, 45/2, Bangalore Central, 45th Cross, 9th... The Chocolate Room True False 3.9 155 JP Nagar Food Court Cafe, Desserts 400.0 [('Rated 4.0', "RATED\n Just came back home. ... [] Delivery Koramangala 7th Block 3.53
In [20]:
zomato.head()
Out[20]:
address name online_order book_table rate votes location rest_type cuisines cost reviews_list menu_item type city Mean Rating
0 942, 21st Main Road, 2nd Stage, Banashankari, ... Jalsa True True 4.1 775 Banashankari Casual Dining North Indian, Mughlai, Chinese 800.0 [('Rated 4.0', 'RATED\n A beautiful place to ... [] Buffet Banashankari 3.99
1 2nd Floor, 80 Feet Road, Near Big Bazaar, 6th ... Spice Elephant True False 4.1 787 Banashankari Casual Dining Chinese, North Indian, Thai 800.0 [('Rated 4.0', 'RATED\n Had been here for din... [] Buffet Banashankari 3.97
2 1112, Next to KIMS Medical College, 17th Cross... San Churro Cafe True False 3.8 918 Banashankari Cafe, Casual Dining Cafe, Mexican, Italian 800.0 [('Rated 3.0', "RATED\n Ambience is not that ... [] Buffet Banashankari 3.58
3 1st Floor, Annakuteera, 3rd Stage, Banashankar... Addhuri Udupi Bhojana False False 3.7 88 Banashankari Quick Bites South Indian, North Indian 300.0 [('Rated 4.0', "RATED\n Great food and proper... [] Buffet Banashankari 3.45
4 10, 3rd Floor, Lakshmi Associates, Gandhi Baza... Grand Village False False 3.8 166 Basavanagudi Casual Dining North Indian, Rajasthani 600.0 [('Rated 4.0', 'RATED\n Very good restaurant ... [] Buffet Banashankari 3.58
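
The rescaling applied by `MinMaxScaler(feature_range=(1, 5))` can be checked by hand with the min-max formula x' = 1 + 4 (x - min) / (max - min). The plain-Python sketch below assumes that the smallest and largest per-restaurant means are 1.8 and 4.9, the extremes of the `rate` values listed earlier. That is an inference from the outputs, not something the notebook states. `rescale`, `LOW` and `HIGH` are illustrative names.

```python
def rescale(value: float, low: float, high: float,
            new_low: float = 1.0, new_high: float = 5.0) -> float:
    """Min-max rescaling of `value` from [low, high] to [new_low, new_high]."""
    return new_low + (value - low) * (new_high - new_low) / (high - low)

# Assumption: the smallest and largest per-restaurant means are 1.8 and 4.9,
# the extremes of the 'rate' values listed earlier.
LOW, HIGH = 1.8, 4.9

for mean_rating in (4.118182, 4.1, 3.8, 3.7):
    print(mean_rating, "->", round(rescale(mean_rating, LOW, HIGH), 2))
```

Under that assumption the four means shown in Out[18] map to 3.99, 3.97, 3.58 and 3.45, which matches the first four rows of Out[20].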
In [21]:
## Text Preprocessing

Some of the common text preprocessing / cleaning steps are:

  • Lower casing
  • Removal of punctuation
  • Removal of Stopwords
  • Removal of URLs
  • Spelling correction
In [22]:
# 5 examples of these columns before text processing:
zomato[['reviews_list', 'cuisines']].sample(5)
Out[22]:
reviews_list cuisines
19536 [('Rated 3.0', 'RATED\n Saturday morning we h... Cafe, Continental, American, Italian
7500 [('Rated 5.0', 'RATED\n Lovely place, a must ... Chinese, Continental, Burger, Pizza
13756 [('Rated 5.0', 'RATED\n The place has very go... Continental, Italian, North Indian, Mexican
34930 [('Rated 1.0', "RATED\n Drinks and food price... North Indian, Chinese
23798 [('Rated 4.0', 'RATED\n 3.5/5.\nOrdered tangd... North Indian, Biryani
In [23]:
## Lower Casing
zomato["reviews_list"] = zomato["reviews_list"].str.lower()
zomato[['reviews_list', 'cuisines']].sample(5)
Out[23]:
reviews_list cuisines
14083 [('rated 1.0', 'rated\n i wanted order food f... Biryani, North Indian, Mangalorean, Seafood
21375 [('rated 1.0', "rated\n ordered a chicken sub... Arabian, Sandwich, Rolls, Burger
15003 [('rated 3.0', 'rated\n been here for dinner ... North Indian, South Indian, Chinese, Seafood
36463 [('rated 1.0', 'rated\n the place was empty a... North Indian, Chinese
30443 [('rated 5.0', 'rated\n natural ice cream, st... Ice Cream, Beverages
In [24]:
## Removal of Punctuation
import string
PUNCT_TO_REMOVE = string.punctuation
def remove_punctuation(text):
    """Remove every character in string.punctuation from the text."""
    return text.translate(str.maketrans('', '', PUNCT_TO_REMOVE))

zomato["reviews_list"] = zomato["reviews_list"].apply(remove_punctuation)
zomato[['reviews_list', 'cuisines']].sample(5)
Out[24]:
reviews_list cuisines
2242 rated 30 ratedn pongal is my favourite here a... South Indian, Fast Food, Street Food
23495 rated 30 ratedn chettys corner is a great pla... Fast Food, Burger
17679 rated 40 ratedn happened to visit this place ... South Indian, Beverages
4170 rated 10 ratedn placed an online order for ve... Chinese, Momos
18459 rated 50 ratedn it was a saturday afternoon a... Continental, Sandwich, Burger, Italian, Salad,...
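
The `ratedn` tokens in the output above appear because each cell stores a literal backslash followed by `n`: stripping punctuation removes the backslash and leaves the `n`. One possible way to avoid this is to parse each cell back into a list of (rating, text) tuples before cleaning. The sketch below uses `ast.literal_eval` and assumes the cell is a valid Python literal, which may not hold for every row in the real data. The sample cell and the `extract_review_texts` helper are hypothetical.

```python
import ast

# Hypothetical cell value in the same shape as the 'reviews_list' column:
# the string representation of a list of (rating, text) tuples.
raw_cell = "[('Rated 4.0', 'RATED\\n  A beautiful place to dine in.'), ('Rated 3.0', 'RATED\\n  Food was okay.')]"

def extract_review_texts(cell: str) -> list:
    """Parse the cell and return review texts without the 'RATED' prefix."""
    texts = []
    for _rating, text in ast.literal_eval(cell):
        texts.append(text.replace("RATED\n", "").strip())
    return texts

reviews = extract_review_texts(raw_cell)
print(reviews)
```

Parsing first also keeps the numeric rating separate from the text, so tokens such as `rated 40` would not end up in the vocabulary.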
In [25]:
## Removal of Stopwords
# Requires the NLTK stopword corpus: run nltk.download('stopwords') once if it is missing
from nltk.corpus import stopwords
STOPWORDS = set(stopwords.words('english'))
def remove_stopwords(text):
    """Drop English stopwords from a whitespace-tokenised text."""
    return " ".join([word for word in str(text).split() if word not in STOPWORDS])

zomato["reviews_list"] = zomato["reviews_list"].apply(remove_stopwords)
In [26]:
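## Removal of URLs
# Note: punctuation was already stripped above, so '://' and '.' no longer occur in the reviews
# and this pattern can no longer match; it is effective only when applied before remove_punctuation.
URL_PATTERN = re.compile(r'https?://\S+|www\.\S+')
def remove_urls(text):
    """Remove http(s):// and www. links from the text."""
    return URL_PATTERN.sub('', text)

zomato["reviews_list"] = zomato["reviews_list"].apply(remove_urls)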
In [27]:
zomato[['reviews_list', 'cuisines']].sample(5)
Out[27]:
reviews_list cuisines
12110 rated 20 ratedn piece shit customer service wo... Ice Cream, Desserts
25865 rated 10 ratedn ordered chicken fried rice chi... Bengali, North Indian, Chinese
555 rated 40 ratedn perfect place burger coke frie... Burger, Fast Food
14033 rated 40 ratedn place needs introduction locat... Bakery, Cafe, Italian, Desserts
37162 rated 50 ratedn located city market cant miss ... Healthy Food, Juices
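
The cleaning steps can also be combined into a single function, with URL removal placed before punctuation removal so the URL pattern still has `://` and `.` to match. This is a minimal sketch rather than the notebook's pipeline: `STOPWORDS_SAMPLE` is a small hand-picked set standing in for NLTK's English list, and `clean_review` and the sample sentence are hypothetical.

```python
import re
import string

# Assumption: a tiny hand-picked stopword set stands in for NLTK's English list.
STOPWORDS_SAMPLE = {"the", "a", "is", "and", "at", "to", "was", "it"}
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)

def clean_review(text: str) -> str:
    """Lowercase, drop URLs, strip punctuation, then remove stopwords."""
    text = text.lower()
    text = URL_PATTERN.sub("", text)       # before punctuation removal
    text = text.translate(PUNCT_TABLE)
    return " ".join(w for w in text.split() if w not in STOPWORDS_SAMPLE)

sample = "The biryani at https://example.com/menu is GREAT, and it was cheap!"
cleaned = clean_review(sample)
print(cleaned)
```

On a DataFrame this could be applied with `zomato["reviews_list"].apply(clean_review)`, after swapping the sample set for the full stopword list.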
In [28]:
# RESTAURANT NAMES:
restaurant_names = list(zomato['name'].unique())
restaurant_names
Out[28]:
['Jalsa',
 'Spice Elephant',
 'San Churro Cafe',
 'Addhuri Udupi Bhojana',
 'Grand Village',
 'Timepass Dinner',
 'Rosewood International Hotel - Bar & Restaurant',
 'Onesta',
 'Penthouse Cafe',
 'Smacznego',
 'Cafã\x83Â\x83Ã\x82Â\x83Ã\x83Â\x82Ã\x82Â\x83Ã\x83Â\x83Ã\x82Â\x82Ã\x83Â\x82Ã\x82© Down The Alley',
 'Cafe Shuffle',
 'The Coffee Shack',
 'Caf-Eleven',
 'Cafe Vivacity',
 'Catch-Up-Ino',
 "Kirthi'S Biryani",
 'T3H Cafe',
 '360 Atoms Restaurant And Cafe',
 'The Vintage Cafe',
 'Woodee Pizza',
 'Cafe Coffee Day',
 'My Tea House',
 'Hide Out Cafe',
 'Cafe Nova',
 'Coffee Tindi',
 'Sea Green Cafe',
 'Cuppa',
 "Srinathji'S Cafe",
 'Redberrys',
 'Foodiction',
 'Sweet Truth',
 'Ovenstory Pizza',
 'Faasos',
 'Behrouz Biryani',
 'Fast And Fresh',
 'Szechuan Dragon',
 'Empire Restaurant',
 'Maruthi Davangere Benne Dosa',
 'Chaatimes',
 'Havyaka Mess',
 "Mcdonald'S",
 "Domino'S Pizza",
 'Hotboxit',
 'Kitchen Garden',
 'Recipe',
 'Beijing Bites',
 'Tasty Bytes',
 'Petoo',
 'Shree Cool Point',
 'Corner House Ice Cream',
 'Biryanis And More',
 'Roving Feast',
 'Freshmenu',
 'Banashankari Donne Biriyani',
 'Wamama',
 'Five Star Chicken',
 'Xo Belgian Waffle',
 'Peppy Peppers',
 'Goa 0 Km',
 'Chinese Kitchen',
 '1947',
 'Cake Of The Day',
 'Kabab Magic',
 "Namma Brahmin'S Idli",
 'Gustoes Beer House',
 'Sugar Rush',
 'Burger King',
 'The Good Bowl',
 'The Biryani Cafe',
 'Lsd Cafe',
 'Rolls On Wheels',
 'Sri Guru Kottureshwara Davangere Benne Dosa',
 'Devanna Dum Biriyani Centre',
 'Kolbeh',
 'Upahar Sagar',
 'Kadalu Sea Food Restaurant',
 'Frozen Bottle',
 'Parimala Sweets',
 'Vaishali Deluxe',
 'The Big O Bakes',
 'Meghana Foods',
 'Krishna Sagar',
 'Dessert Rose',
 'Chickpet Donne Biryani House',
 "Thanco'S Natural Ice Creams",
 'Nandhini Deluxe',
 "Vi Ra'S Bar And Restaurant",
 'Kaggis',
 'Ayda Persian Kitchen',
 'Chatar Patar',
 'Polar Bear',
 "Kidambi'S Kitchen",
 'Mane Thindi',
 'Kotian Karavali Restaurant',
 'Floured-Baked With Love',
 'Cakes & Slices',
 'Spice 9',
 'Naveen Kabab & Biriyani Mane',
 'Katriguppe Donne Biryani',
 'Atithi Point Ande Ka Funda',
 'Just Bake',
 'K27 - The Pub',
 'Bengaluru Coffee House',
 'New Mangalore Lunch Home',
 'Coffee Bytes',
 'Parjanya Chat Zone',
 "Kwality Wall'S Swirl'S Happiness Station",
 'Ruchi Maayaka',
 'Anna Kuteera',
 'Darbar',
 'Vijayalakshmi',
 'Sri Udupi Food Hub',
 'Udupi Upahar',
 'House Of Kebabs',
 'Roll N Rock',
 'Box8- Desi Meals',
 'Kfc',
 'Roll Over',
 'Imperial Restaurant',
 'Lassi Shop',
 'The Fortuna',
 'Wahab',
 'Al Diwan Biryanis And More',
 "New Gowda'S Fried Chicken",
 'Canton',
 'Diabetics Dezire Sugarless Sweets And Bakes',
 'The Blue Wagon - Kitchen',
 'Hot Coffee',
 'Patio 805',
 'Lassi Corner',
 'Sagar Deluxe',
 'Kanti Sweets',
 'Vegetalia',
 'Aramane Donne Biryani',
 'Ande Ka Funda',
 'Cake Ghar',
 'Energy Addaa',
 'Bhattara Bhojana',
 'Tandoori Knight',
 'Dev Sagar - Food Street',
 'Mitraa Da Pizza',
 'Paradise Premium',
 'Grazers',
 'Shakes Theory',
 '@Italy',
 'Chilli Flakes',
 'Calcutta Cafe',
 'Old Mumbai Ice Cream',
 'Donne Biriyani House',
 'By 2 Coffee',
 "Kedia'S Fun Food",
 'Davangere Butter Dosa Hotel',
 'Just Shawarma',
 'Mini Punjabi Dhaba',
 'Mulabagilu Dosa',
 'Gokul Veg',
 'Olive - Era',
 'Udupi Ruchi',
 'Madhappa Hindu Military Hotel',
 'Gama Gama',
 'Pizza Hut',
 'Mangalore Pearl',
 'Asha Sweets Centre',
 'Twiststick House',
 'Cool Corner',
 'Pizza Mane',
 'Dal Tadkaa',
 'Chutney Chang',
 'Mystique Palate',
 'Thamboola',
 'Castle Rock',
 'Vietnamese Kitchen',
 'C Corner',
 'Paratha Merchant',
 'North Rasoe',
 'Toscano',
 'Lord Of The Kitchen',
 'Stoned Monkey',
 'Central Jail Restaurant',
 'Bella',
 'Vennela Andhra Meals',
 'New Prashanth Hotel',
 'The Grillo',
 'Re Malnad Nati Style Hotel',
 'Karma Kaapi',
 "Tiwari'S",
 'Pizza Stop',
 "Biggies Burger 'N' More",
 "Kollapuri'S",
 'Kadamba Classic',
 'Rs Shiv Sagar',
 'Kholi Mane',
 'Harshi Super Sandwich',
 'Cake Yard',
 'Sri Udupi Veg',
 'Cake Art',
 'Potato House',
 'Matru Sagar',
 'Ugadi',
 'Sri Krishna Sweets',
 'In Time Cane Juice',
 'Subway',
 'Daal Roti',
 'The Lassi Park',
 'A2B - Adyar Ananda Bhavan',
 'Srikrishna Bhavan',
 'Green Gardenia',
 'J Spice',
 'Karavali Family Restaurant',
 'Karavali Lunch Home',
 'Hatti Kaapi',
 "Kolkata Kathi Roll'S",
 'Upahara Darshini',
 'The Chaat Shop',
 'Anda Ka Funda',
 'Shri Vinayaka Ice N Juice',
 'Ibaco',
 'Jalaram Sweets',
 'Samskruti - Sanman Gardenia',
 'Bendakaluru Bytes',
 'Cocoa Bakes',
 'Chumma Delicious',
 'Dwaraka Grand',
 'Gufha - The President Hotel',
 "Tanna'S Kitchen",
 'Shree Mandarathi Grand',
 'Mojo Pizza - 2X Toppings',
 'Iceberg Icecreams',
 'South Kitchen',
 'Chung Wah',
 'Shanthi Sagar',
 'Millet Mama',
 'Bangarpet Chat Express',
 'Prems Graama Bhojanam',
 'Java City',
 'Kamat Bugle Rock',
 'Puliyogare Point',
 'Bangalore Agarwal Bhavan',
 'Rustic Stove',
 'Udupi Ruchi Grand',
 'De Thaali',
 'Just Thindi',
 'Vasanth Vihar - Since 1965',
 'Sea Spice By 7 Star',
 'Desi Dawat',
 'Cafe Aira',
 'Mast Punjabi',
 'South Grand',
 'The Pizzeria',
 'Cafe Zone',
 'Ruchis Point',
 'Firangi Bake',
 'The Royal Corner - Pai Viceroy',
 'Hara Fine Dine',
 'Chocoberry',
 'Dakshin Grand',
 'Sandwichwallas',
 'Chai Mane',
 'Slv Upachar',
 'Waffle-A-Go Go',
 'Cross Road Cafe',
 'Anand Donne Biriyani',
 'Seven Star',
 "Stop 'N' Joy",
 'Sri Yaksha Shiv Sagar',
 'Fudge',
 'Just Bunny',
 'Bakers Town',
 'Shree Mahalakshmi Sweets',
 'Kaapi Kendra',
 'Ivy',
 'The Airos',
 'Chai Kraft',
 'South Point Pub',
 'The Trundle',
 'The Krishna Grand Xpress',
 'Sri Krishna Darshini',
 'Shringar Sweets & Snacks',
 'Udupi Grand',
 'Ganesh Grand',
 'Shreyas Upahar & Burger Point',
 'Little Cafe',
 'Mumbai Badam Milk & Lassi Center',
 'Sri Krishna Sagar',
 'Mylari Biryani Family Restaurant',
 '1980S Games Cafe',
 'Kadala Tarangaa',
 'Namma Biryani',
 "Nandhanu'S Rasoi",
 'Hotel Pork Paradise',
 'As On Fire',
 'J K Fish Land',
 'Curry Leaves',
 'Magic Meals',
 'Desi Cream Junction',
 'Drunken Monkey',
 "Ruchi'S Corner",
 'Tandoori Bytes',
 'Bangalore Donne Biriyani',
 'Sgs Non Veg - Gundu Pulav',
 'Keventers',
 'Navi Food Point',
 'Shawarma Inc',
 'The Cafe Nuts And Bolts',
 'Am Wow',
 'Lalbagh Grand',
 'Funjabi Curries',
 'Eat Repeat',
 'Salut',
 'Nammura Donne Biriyani',
 'Lassi Stop',
 'Dakshin Kitchen',
 'Sri Sai Cafe',
 'Sri Sai 99 Variety Dosa',
 'Utsav Restaurant',
 'Crunch And Munch',
 'Food Geek',
 'Sri Krishna Aramane',
 'Great Indian Rolls',
 'Chats Point',
 'Food Point',
 'Swarga Ruchi',
 'Kolkata King',
 "Sandwich Mama'S And Frozen Monster",
 'New Rajadhani Spicy',
 'Big Mishra Pedha',
 'Vegeatz',
 'Foodizo',
 'Food Springs',
 "Chung'S Chinese Corner",
 'Eurasia Pasta And Barbeque By Little Italy',
 'Goli Vada Pav No. 1',
 "Ragoo'S",
 'Pure & Natural',
 'Maiyas',
 'Sip N Dine',
 'Cafe Mondo',
 'Jalpaan',
 "Kataria'S Pakwan",
 'Juicy Momos',
 'Amande Patisserie',
 'Anand Sweets And Savouries',
 'Chicken Hunt',
 'Swadesh Tadka',
 'Cane-O-La',
 'New Biryani Mane',
 'Karnataka Bhel House',
 'Sreeraj Lassi Bar',
 'Juice Junction Food Court',
 "Bunt'S Biriyani Palace",
 'Chai Point',
 'Janahaar',
 'Utsav',
 'Meat And Eat',
 'Snacks Bite',
 'The Spice Saga',
 'Dakshin Cafe',
 'Kaulige Millet Corner',
 'Ifruit Ice Cream',
 'Havmor',
 'Grand Food',
 'Dining Hut',
 'Desi Swaad',
 'Yummy Momos',
 'Masale Daan',
 'Grow Fit',
 'Sandwich Hub',
 'Udupi Sri Krishna Cafe',
 'Starlite Bakery & Fastfood',
 'Veganbreak24X7',
 'Aadhya Hotel',
 'Dodda Mane Baaduta',
 'Swadd Kitchen',
 'Happy Chopsticks',
 'Udupi Upachar',
 'Davanagere Benne Dose Hut',
 'Subz',
 'Mint And Mustard',
 'Chinese Square',
 'Punjabi Raswada',
 "Vinny'S",
 "Chetty'S Corner",
 "Kapoor'S Cafe",
 'Donne Biryani & Kabab Corner',
 'Sukh Sagar',
 'S M V Snacks Corner',
 'Andhra Ruchulu',
 'Steaming Mugs',
 'Rajathadri Food Fort',
 'Brew Meister',
 'Rasoi',
 'Mr. Singh Da Dhaba',
 'Kababs N Biryani',
 'Ayodhya Upachar',
 'Biriyani Mane',
 'Cafe Ajfan',
 'Brundhavana Food Point',
 'Slv Corner Restaurant',
 'Hotel New Karavali',
 'The Krishna Grand',
 'Roti Ghar',
 'Kettle & Kegs',
 'Baisakhi',
 'Poonam Sweets',
 "Amma'S Pastries",
 'The Lassi And Juice Park',
 'Corner Stone',
 'Arun Ice Cream',
 'Sweet N Swirl',
 'Sri Venkateshwara Sweet Meat Stall',
 'Baskin Robbins',
 'Srinivasa Brahmins Bakery',
 'Jain Bakes',
 'Kc Das - Sweet Spot',
 'A2B Veg - Adyar Ananda Bhavan',
 'Jcubez',
 'Blue Wings Bar & Restaurant',
 'New Imperial Restaurant',
 'Karavali Fish Center',
 "Iyer'S Tiffin Centre",
 'Kydz Adda',
 'Food Box Cafe',
 'Sri Ganesh Juice And Chats',
 'Manifest Cafe',
 'Taaza Thindi',
 'Sri Laxmi Venkateshwara Coffee Bar',
 'Messy Bowl',
 'Brahmin Cafe',
 'Hotel Mangala',
 'Simple Thindies',
 'Slv Swadishta',
 'New Sagar Fast Food',
 'Parama Ruchi',
 'Thrilok Restaurant',
 'Hanumanthanagar Biryani Junction',
 'Slv Refreshment',
 'Svkp Daily Fresh',
 'Srinagar Kabab Corner',
 'Sri Venkateshwara Chat Centre',
 'Vinay Bhel Corner',
 'Nandi Chats And Juice',
 'Food Adda',
 'Hotel Nisarga',
 'Yo Roll Corner',
 'New Quality Dum Biryani',
 'Sri Lakshmi Kabab Center',
 'Tasty Bites',
 'Raams Chicken',
 'Panchami',
 'Kavali',
 'Ranganna Military Hotel',
 'Vidyarthi Bhavan',
 'Bharjari Oota',
 'Bridgeway',
 'Soho Bar & Grill',
 'Bhavani Restaurant',
 'Zephyr',
 'Hotel Dwarka',
 'Nisarga Garden Restaurant',
 'Ma-Arya Family Restaurant',
 'Udupi Sri Krishna Bhavan',
 'Ice Thunder',
 'Mahalaxmi Tiffin Room',
 'Basavanagudi Mylari',
 'Shrinidhi Military Hotel',
 'Pramukh Family Restaurant',
 'Vybhava',
 'Shree Venkateshwara North Karnataka Hotel',
 'Anand Sagar Inn',
 'Sangam Military Hotel',
 'Belly Squad Food Truck',
 '50-50 Eating House',
 'Om Shiva Shakthi Chats Centre',
 'Rolls Corner',
 'New Ambur Hot Dum Biryani',
 'Deja Vu Resto Bar',
 'Fattoush',
 'Abhiruchi Hotel',
 'Tulips',
 'Barbeque Nation',
 'Sattvam',
 '24Th Main',
 'Zaitoon',
 'Mango Greens',
 'Oye Amritsar',
 'Melt - Eden Park',
 'Spice Code',
 'The Onyx - The Hhi Select Bengaluru',
 'The Pavillion',
 'Sankranthi Veg Restaurant',
 'Tisano Cafe',
 'Cafe Kabana',
 'Butterly',
 'Black Mug Cafe',
 '#Refuel',
 'Wafl',
 'Vaho Kafe & Pressery',
 'Dreamcatcher',
 'Cafe Arabica',
 'Starbucks',
 'Smoor',
 'Kalmane Koffees',
 'Shee-Sha Cafe',
 'Brews N Bites',
 'D2V Cafe',
 'Cafe Talk',
 'Cafe Choco Craze',
 'Slate Cafe',
 'Dialogues',
 'Mudpipe Cafe',
 'Tab - Take A Break',
 'Cafe Potpourri',
 'De Oxford Cafe',
 'Vinaya Coffee Moments',
 'Brew Point',
 'The Cravery',
 "Anju'S Cafe",
 'Skytouch Le Cafe',
 'Hearts Unlock Cafe',
 'Eat.Fit',
 'Sai Abhiruchi',
 'Capsicum Family Restaurant',
 'Box Magic',
 "Maa'R Rannaghor",
 'Easy Bites',
 'Hiyar Majhe Kolkata',
 'Dabba Gosht',
 'Punjabites',
 'Sri Lakshmi Dhaba',
 'Swadista Aahar',
 'Vegetarea',
 'Al-Bek',
 "Aniram'S",
 'Punjabi Nawabi',
 'Yummrajj',
 'Swad Punjab Da',
 'Roti Wala',
 'Midnight Mania',
 'Kitchens@Jp Nagar',
 'Krishna Kuteera',
 'Apna Punjab',
 'Paratha Junction',
 'Nellore Bhojanam',
 'Momoz',
 'Kalingas',
 'Kanteen The Eatery',
 'Kullad Cafe',
 'Litti Twist',
 'Cakebuy',
 'Delight Food',
 'Andhra Kitchen',
 'Veg By Nature',
 'Chicken Magic',
 'Swathi Restaurant',
 'Fresh Kitchen',
 'Hind Ka Chulah',
 'Kuttanad',
 'New Mahesh Friends Food Corner',
 'Bohra Bohra Café',
 'Shree Krishna Sannidhi',
 'Bingejoy!',
 'Shiv Sai Hotel',
 'Mra',
 'Burj Hotel',
 'Shaadi Ki Biryani',
 'Madeena Hotel',
 'Biryani Durbar',
 'Mahesh Friends Food Center',
 'Juice Shop',
 "Dadi'S Dum Biryani",
 'Krishna Kuteera South',
 'Alankrutha',
 'Paradise',
 'Kabab Mehal',
 'Sri Punjabi Dhaba',
 'Arabian Mexico',
 'Cakezone',
 'Fujian Express',
 'Indian Food',
 'Tandoori Paradise',
 'Kolkata Kathi Rolls',
 'Adithya',
 'Cheesiano Pizza',
 'Nati Palace',
 "Dande'S Hyderabad Biryani",
 'Upahara Bhavan',
 'Sher-E-Punjab',
 'Shuddh Desi Khana',
 'Karama Restaurant',
 'Jaganath Hotel And Restaurant',
 'Aramane Donne Biriyani',
 "Mani'S Dum Biryani",
 'Amontron',
 'A M Biryani Hotel',
 'Birinz',
 'Hyderabadi Bawarchi',
 'Fish Chain',
 'Prasiddhi Food Corner',
 'Biriyani Bhatti',
 'Hyderabad Biryaani House',
 "Galito'S",
 'C. K. Mega Hot Food',
 'Red Chilli Restaurant',
 'Rss Donne Biryani',
 'Rajdhani Thali Restaurant',
 'Phew (Play Hard Eat Wild)',
 'New Kabab Zone',
 'Bawarchi Paradise',
 'Shree Udupi Grand',
 'Chicken County Grand',
 'Darjeeling Momos & Fast Food Center',
 'Veruthe Oru Thattukada',
 'Savoury - Sea Shell Restaurant',
 'Warm Oven',
 'Kundana',
 'Food Ka Masti',
 'The Shawarma Shop',
 'Momo Junction',
 'Antilla Aromas',
 'Punjabi Food Corner',
 'Mealer.In',
 'Pathaan Sir',
 'Cold Stone Creamery',
 'Hari Super Sandwich',
 'Amritsari Kulcha Land',
 'Chokha Chowka',
 'Gorbandh',
 'Grills & Rolls',
 'Bathinda Junction',
 'Stories',
 "New Gongura'S",
 'Sagar Grand',
 'Ubq By Barbeque Nation',
 'Agarwal Food Service',
 "Daniyal'S",
 'Seasons',
 'Chef Delicacies',
 'Indiana Burgers',
 'Moksha',
 'Marwa Restaurant',
 'Shanghai Court',
 'Akshaya Donne Biriyani',
 'Bhojohori Manna',
 'Richie Rich',
 'Hunger Bee',
 'Yum In My Tum',
 'Maggi N Maggi House',
 'Fresh Dinner',
 'B.M.W Bhookh Mitaane Wala',
 'Biryani Miya',
 'Krispy Kreme',
 'Paani Kum Chai',
 'Chulha Chauki Da Dhaba',
 "Magix'S Parattha Roll",
 'Elegant Dining',
 'Aliensip',
 'Waffle Head',
 'Samruddhi Biryani',
 'Basmati Delights',
 "Charlee'S Chicken",
 'Samosa Singh',
 'Cravings',
 'Nagas',
 'Matka',
 'Punjabi Swag',
 'Taco Bell',
 'Ambur Star Dum Biryani',
 "Mother'S Rasoi",
 'Dosa Bazaar',
 'Babu Moshai',
 'The Bong Palate',
 'Gowdru Mane Oota',
 'Banashankari Nati Style',
 "Chandrima'S Kitchen",
 'Bikaner Jn',
 'Crunch Pizzas',
 'Lassi Berg',
 'Kakal-Kai Ruchi',
 'Manchu Cafe',
 'Calorie Express',
 'Bangalore Box',
 'Hotel Khaaja',
 'The Gujarat Express',
 'Vishal Foods',
 'Lassi Darbar',
 'Chavadi',
 'Nanna Munna Paratha House',
 'Cucumber Town',
 'Kolkata Famous Kati Roll',
 'Nellore Ruchulu',
 'Brewsky',
 'Chefeana',
 'Bangaliana',
 'Gud Dhani',
 'The Hunger Room',
 'Parisar Veg Restaurant',
 'Deejas Kitchen',
 'Pancuzzi',
 "Zhang'S - Chinese Restaurant",
 'Shagun Sweets & Foods',
 'Eat Well',
 "Dev'S Gugababa",
 'Oogway Express',
 'Balaji Bombay Vada Pav Gujrati Dalebi',
 'The Cake Ville',
 'The Egg Factory',
 'Chow San',
 'New Udupi Grand',
 'New Karawali Lunch Home',
 'Sr Choco Station',
 'Momo Jojo',
 'Simply Indian',
 'Delhi Biryani Cafe',
 "Rithika'S Kitchen",
 'Punjabi Times',
 'Cravy Wings',
 'Funky Punjab',
 'A1 Garam Masala',
 'Punjabi Corner',
 'Guru Palace',
 'Zeeshan',
 'Ambara Gardenia',
 'Layerbite',
 'Bhavani Chats',
 'More Pizza',
 'Crafted Plate',
 "Shetty'S Kitchen",
 'Machali Port',
 'Naati Manae',
 'Mad Over Biryani',
 'Garma Garam',
 'Sambram Biriyani Paradise',
 'Mr. Meetharam',
 'The Coastal Crew By Fujian On 24Th',
 'Bib - Breakfast In The Box',
 'Angel Restaurant',
 'Wow Momo',
 'Pizza Da Dhaba',
 'Kalpavruksha',
 'Juice Land',
 'Tandoor And Spice',
 'Rock Stone Ice Cream Factory',
 'Paratha Plaza',
 'Krishna Vijayashree',
 'Soup Station',
 'Tasty Point',
 'Natural Mumbai Kulfi',
 'Namma Kudla',
 'Triveni',
 'Wangs Kitchen',
 'Rk Fresh Food',
 'Alif Restaurant',
 'Dine One One Restaurant',
 'Pallavas Veg Cuisine',
 'The Food Cottage',
 'Brundhavana Pure Veg',
 'Spice Up',
 'Suryawanshi',
 'Tempteys',
 'Delibox.In',
 'Frybies',
 'Crumb Together',
 'The Foodyz',
 "Bean D'Er Cafe",
 'Chaat Chatore',
 'Taste Of Kolkata',
 'Y Not Restaurant',
 '24/7 Food Service',
 'Modern Restaurant',
 'Burrito Boys',
 'Trippy Paradise',
 'Biryani Magic',
 'Nagarjuna',
 'Laddoos',
 'Waffle Stories',
 'Late Night',
 'Waffle Walle',
 'Shake It Off',
 'Bombay Kulfi',
 'Chaat Central',
 'Two Friends Cauldron',
 'Bun Town',
 'Garden Fresh',
 'Dhabeliwala',
 "Rayaan'S Bbq",
 'Chaat Junction',
 'Asharfilal',
 'Street Foods By Punjab Grill',
 "Tiwari'S Ghee Paratha And Chats",
 'Spurthi Foods',
 'Melting Pot At Woodrose',
 'Pot Biryani',
 "Thanco'S Natural Ice Cream",
 'Mango Tree- The Beer Garden',
 'Dessi Cuppa',
 'Halli Mane Mudde Oota',
 'Wow Vada Pav',
 'The Jade Kitchen',
 'Krishna Vaibhava',
 'Nandhanus Rasoi',
 'Donne Biriyani Mane',
 'Fish And Dish',
 'Miss Momo',
 'Intalia',
 'Chicken County Restaurant',
 'Velvet Kitchen',
 'Hotel Aradhana',
 'The Chervil',
 'Sri Rajasthani Foods',
 'New Royal Treat',
 'I Cool',
 'Mahek Pizza',
 'Kc Das- Sweet World',
 'Panther Cafe',
 'Roots N Fruits',
 'Kaizen Wellness Kitchen',
 'Mad Over Donuts',
 'Kolkata Famous Kati Rolls',
 'Vinaya Cafe',
 'Rbp Greens Garden',
 'Dazu Momoz',
 'Desi Dhaba',
 'World Of Waffles',
 "Pika'S Kitchen",
 'Lassi Cafe',
 'New Mogul Empire',
 'Atithi Biryani Corner',
 'Oottupura Family Restaurant',
 'Yumme Veg',
 'Shree Guru Raghavendra Chats Chintamani Special',
 'Vyanjan',
 'Sri Sankara Cafe',
 'Bhaijaan Barbeques',
 'Chatter Platter',
 'Bikaneri Sweets',
 "Watson'S",
 'Sri Nandi Grand',
 'Chicken Man',
 "Chef Baker'S",
 'Sree Ganesha Fruits & Juice',
 'Nandhana Palace',
 'Biryani Nights',
 'Thejas Bhavan',
 'Graffitea',
 'Gappe',
 'On The Nose',
 'Juicy & Spicy',
 'Eat Repeat Express',
 'Tfi - The Fresh Ice Cream',
 'Mirch Masala',
 'Just Chill',
 "Bengalooru Tiffany'S",
 'Udupi Thaja Thindi',
 'Baba Ka Dhaba',
 'Hot Burg',
 'Northern Bites',
 'Hotel Smile',
 'Roll Wala',
 'Natural Ice Cream',
 'F3 Food Fun Fiesta',
 "Churchill'S",
 "Kulkarni'S New Uttara Karnataka Food Speciality Stores",
 'The Hungers Zone',
 'The Chocolate Room',
 'Aramane Restaurant',
 'Hunger2Eat',
 'Bbq Ride 46',
 'Bawarchi Inn',
 "Ruh'S Cafe",
 'Eatery Have U Been',
 'Upahar Banashree',
 'Me And My Cake',
 'Bhojon Tripti',
 'New South Corner',
 'Khana Khazana',
 'China Tang',
 'California Burrito',
 'Tandoor Garden',
 'Caffe Pascucci',
 'Freeze It',
 'Meal Square',
 'Chaitanya Cafe',
 'Tasty Jigarthanda',
 'Berrylicious',
 'Rcs Kitchen',
 'Pappu Da Dhaba',
 'Smoke - The Sizzler House',
 'Sandwich Shop',
 'Joon Restaurant',
 'Snack Knack',
 'Chung Wah Opus',
 'Chef & Dine',
 'Once In Nature',
 'Manjushree Upahara',
 'Hotel Annapoorna',
 'Dream A Dozen',
 'Bread Crumbs Bakery',
 'Nite Out',
 'Cupcake Bliss',
 'Armaani Caffe',
 'Muddhe Bytes',
 'Bombay Kulfis',
 'Willys Top Cafe',
 'Mist N Creams',
 'Papacream',
 "Woody'S",
 'Jp Fish Land',
 'Pepper Crown Restaurant',
 'Pizza Paradise',
 'The Curry Hut Plus',
 'Malabar Mess',
 'Malabar Cafe',
 'Tangra Indo - Chinese Cuisine',
 'Juice Junction',
 'Xpress Chai',
 'Zhangs Classic',
 'Guru Greens',
 'D View Cafe',
 'The Cuboidal',
 'Delifusion - Hunger Sorted',
 'Flavour Of China',
 'Gujrati Mess',
 'Cake-O-Mania',
 'The Park Inn Restaurant',
 'Eagle Ridge',
 'Vidya Cafe',
 'Kabab Treat',
 'Limra Hotel',
 'Kafe Nook',
 'Kumbha Bhojanam',
 'Yummerica Fries',
 'Casa Piccosa',
 'Bake Addiction',
 'Tongue Twisters',
 'Hot Rolls & House Of Kebabs',
 'Aami Kolkata',
 'Vaathsalya Millet Cafe',
 'Hungry Buddies',
 'Thericebowl.In',
 'Guru Garden',
 'Cakes By M',
 'Brewz Coffee',
 'Fritz Haber',
 'Athithi',
 'Smooth Blender',
 'Sri Bhagya Grand',
 "Shree Guru Juice 'N' Ice",
 'Bhavya Military Hotel',
 'Neals Cafe',
 'Flavorsome Bakes',
 'Bake-Ooh',
 'The Yummy Tummy',
 'Mast Kalandar',
 "Sultan'S Biryani",
 'Halli Sogasu',
 'Crepe Nation',
 'Rice Bar',
 'Kanika Biryani Paradise',
 'Resto',
 'New Tandoori Point',
 'Shiv Sagar',
 'Little Shangai',
 "Nadella'S Kitchen",
 "Uncle'S Kitchen",
 'Coastal Inn',
 "Mom'S Momos",
 'Aishwarya Parkland',
 'Slice Of Spice',
 'Bao And Rolls',
 'Andhra Bhojanam',
 'Kanchan Dhaba',
 'Encyclofoodia',
 "Bella'S Kitchen",
 'Food Feast Multicusine Restaurant',
 'Bangalir Rannaghar',
 'Sai Samosa & Chat Corner',
 "Lalchee'S Rasoi",
 'Kailash Parbat',
 "Dream'S Kitchen",
 'Tandoor Hut',
 'The Chocolatiers',
 'Ballava',
 'Zaika Take Away',
 'Snackiey',
 'Sandwich Mamas',
 'The Shake Factory Originals',
 'Hotel Lakshmi Paradise',
 'Desi Bites',
 'Shiv Sai',
 'The Juicy',
 'Feast And Burp',
 'Zengi Pub & Restaurant',
 'Cafe Indiana',
 'Multi Cakes',
 'Bhukkad',
 'Malhar Maharashtrian Cuisine',
 'Mandi',
 'Pallavi North Indian Veg Restaurant',
 'Restro Cafe',
 ...]
In [29]:
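def get_top_words(column, top_nu_of_words, nu_of_word):
    # column: iterable of text documents; top_nu_of_words: how many n-grams to return;
    # nu_of_word: (min_n, max_n) tuple passed to CountVectorizer as ngram_range
    
    vec = CountVectorizer(ngram_range=nu_of_word, stop_words='english')
    
    # Document-term count matrix, one row per document
    bag_of_words = vec.fit_transform(column)
    
    # Total count of each n-gram across all documents
    sum_words = bag_of_words.sum(axis=0)
    
    words_freq = [(word, sum_words[0, idx]) for word, idx in vec.vocabulary_.items()]
    
    # Most frequent n-grams first
    words_freq = sorted(words_freq, key=lambda x: x[1], reverse=True)
    
    return words_freq[:top_nu_of_words]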
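As a quick check of what get_top_words returns, here is a minimal sketch that runs the same logic on three made-up review strings. The toy data and the names toy_reviews and top_bigrams are assumptions for illustration only and are not part of the Zomato dataset. The function is repeated so the snippet runs on its own.

```python
from sklearn.feature_extraction.text import CountVectorizer


def get_top_words(column, top_nu_of_words, nu_of_word):
    # Count every n-gram in the requested range, ignoring English stop words
    vec = CountVectorizer(ngram_range=nu_of_word, stop_words='english')
    bag_of_words = vec.fit_transform(column)
    sum_words = bag_of_words.sum(axis=0)
    words_freq = [(word, sum_words[0, idx]) for word, idx in vec.vocabulary_.items()]
    words_freq = sorted(words_freq, key=lambda x: x[1], reverse=True)
    return words_freq[:top_nu_of_words]


# Hypothetical reviews, invented for this example
toy_reviews = [
    "good food good service",
    "good food slow service",
    "bad food",
]

# (2, 2) asks for bigrams only; the most frequent one here is 'good food' (it occurs twice)
top_bigrams = get_top_words(toy_reviews, 3, (2, 2))
print(top_bigrams)
```

Each element of the result is a (term, count) tuple, so the output can be passed straight to a DataFrame or a bar plot.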
In [30]:
zomato.head()
Out[30]:
address name online_order book_table rate votes location rest_type cuisines cost reviews_list menu_item type city Mean Rating
0 942, 21st Main Road, 2nd Stage, Banashankari, ... Jalsa True True 4.1 775 Banashankari Casual Dining North Indian, Mughlai, Chinese 800.0 rated 40 ratedn beautiful place dine inthe int... [] Buffet Banashankari 3.99
1 2nd Floor, 80 Feet Road, Near Big Bazaar, 6th ... Spice Elephant True False 4.1 787 Banashankari Casual Dining Chinese, North Indian, Thai 800.0 rated 40 ratedn dinner family turned good choo... [] Buffet Banashankari 3.97
2 1112, Next to KIMS Medical College, 17th Cross... San Churro Cafe True False 3.8 918 Banashankari Cafe, Casual Dining Cafe, Mexican, Italian 800.0 rated 30 ratedn ambience good enough pocket fr... [] Buffet Banashankari 3.58
3 1st Floor, Annakuteera, 3rd Stage, Banashankar... Addhuri Udupi Bhojana False False 3.7 88 Banashankari Quick Bites South Indian, North Indian 300.0 rated 40 ratedn great food proper karnataka st... [] Buffet Banashankari 3.45
4 10, 3rd Floor, Lakshmi Associates, Gandhi Baza... Grand Village False False 3.8 166 Basavanagudi Casual Dining North Indian, Rajasthani 600.0 rated 40 ratedn good restaurant neighbourhood ... [] Buffet Banashankari 3.58
In [31]:
zomato.sample(5)
Out[31]:
address name online_order book_table rate votes location rest_type cuisines cost reviews_list menu_item type city Mean Rating
6656 1, Nirmala Mansion, 4th Cross, 17th A Main, 60... Nuts Over Salads Cafe True True 4.4 197 Koramangala 5th Block Cafe Cafe, Salad 800.0 rated 40 ratedn went sunday eveningnnfood 45 e... [] Cafes BTM 4.39
39393 334, Doddkanhalli, Near WIPRO Corporate Office... Mughal Treat True False 3.6 116 Sarjapur Road Casual Dining Mughlai, North Indian, Chinese 700.0 rated 50 ratedn oh man chicken jalfrezi butter... [] Delivery Sarjapur Road 3.53
39833 Beside JDA Software, Near Mantri Espana Apartm... Coastal Fish Land True False 3.5 26 Bellandur Casual Dining Mangalorean 700.0 rated 30 ratedn food average fish isnãƒx83ã‚x8... ['Chicken Dum Biryani', 'Chicken Masala', 'Chi... Dine-out Sarjapur Road 3.19
14274 19, 1st Cross, Domlur Village, HAL Old Airport... Meat And Eat True False 3.1 73 Domlur Quick Bites Fast Food, Burger 500.0 rated 20 ratedn dropped meat eat picked chicke... [] Delivery Indiranagar 2.98
35054 Ballal Residency, 74/4, 3 Cross, Residency Roa... The Charlton Bar False False 3.3 4 Residency Road Bar Chinese, North Indian 1.0 rated 30 ratedn enjoyable place drinks great p... [] Pubs and bars MG Road 2.94
In [32]:
zomato.shape
Out[32]:
(41237, 15)
In [33]:
zomato.columns
Out[33]:
Index(['address', 'name', 'online_order', 'book_table', 'rate', 'votes',
       'location', 'rest_type', 'cuisines', 'cost', 'reviews_list',
       'menu_item', 'type', 'city', 'Mean Rating'],
      dtype='object')
In [34]:
zomato=zomato.drop(['address','rest_type', 'type', 'menu_item', 'votes'],axis=1)
In [35]:
import pandas as pd

# Randomly sample 50% of the dataframe
df_percent = zomato.sample(frac=0.5)
In [36]:
df_percent.shape
Out[36]:
(20618, 10)
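Because the sample is drawn without a seed, a different half of the restaurants is selected on every run, so a given name may or may not be available later on. A small sketch of a reproducible draw, assuming a fixed seed is acceptable (the value 42 and the toy frame are arbitrary choices for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the zomato dataframe
toy = pd.DataFrame({"name": [f"r{i}" for i in range(10)], "cost": range(10)})

# The same random_state yields the same rows on every run
sample_a = toy.sample(frac=0.5, random_state=42)
sample_b = toy.sample(frac=0.5, random_state=42)

print(sample_a.shape)                        # half of the 10 rows
print(sample_a.index.equals(sample_b.index))
```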

Term Frequency-Inverse Document Frequency¶

We now compute Term Frequency-Inverse Document Frequency (TF-IDF) vectors for each document, where a document is the review text of one restaurant. This gives a matrix in which each column represents a word in the review vocabulary (all the words that appear in at least one document) and each row represents a restaurant.

TF-IDF is a statistical method for evaluating how significant a word is to a given document within a collection of documents.

TF: Term frequency (tf) refers to how many times a given term appears in a document.

IDF: Inverse document frequency (idf) measures how informative a word is across the whole collection, i.e. whether the word is common or rare across all documents. The intuition behind TF-IDF is that terms appearing in many documents are less informative than terms appearing in only a few, so a term gets a high weight when it is frequent within one document but rare in the rest. Fortunately, scikit-learn gives you a built-in TfidfVectorizer class that produces the TF-IDF matrix quite easily.
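To make the weighting concrete, here is a minimal sketch on three invented one-line reviews. The toy strings and variable names are assumptions for illustration; the vectorizer is used with its default settings, which differ from the ones used on the real reviews in this notebook.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical reviews, invented for this example
toy_reviews = [
    "tasty biryani tasty kebab",
    "tasty dosa",
    "tasty coffee",
]

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(toy_reviews)  # rows = documents, columns = vocabulary terms

vocab = vectorizer.vocabulary_  # term -> column position
idf = vectorizer.idf_           # one IDF value per column

# "tasty" occurs in every review, so it gets the lowest IDF;
# "biryani" occurs in a single review, so its IDF is higher
print(idf[vocab["tasty"]], idf[vocab["biryani"]])

# In the second review the rare word "dosa" outweighs the common word "tasty"
print(matrix[1, vocab["dosa"]], matrix[1, vocab["tasty"]])
```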

In [37]:
df_percent.set_index('name', inplace=True)
In [38]:
indices = pd.Series(df_percent.index)
In [39]:
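# Creating tf-idf matrix over unigrams and bigrams of the review text
# (min_df=1 keeps every term; newer scikit-learn releases validate parameters and may reject an integer min_df of 0)
tfidf = TfidfVectorizer(analyzer='word', ngram_range=(1, 2), min_df=1, stop_words='english')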
tfidf_matrix = tfidf.fit_transform(df_percent['reviews_list'])
In [40]:
cosine_similarities = linear_kernel(tfidf_matrix, tfidf_matrix)
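A note on why linear_kernel is used here: TfidfVectorizer L2-normalises every row by default (norm='l2'), so the plain dot product of two rows already equals their cosine similarity, and the extra normalisation step of cosine_similarity is not needed. The sketch below checks this on invented data (the toy strings and names are assumptions for illustration). Keep in mind that the result is a dense n x n array: for the 20,618 sampled rows that is roughly 3.4 GB as float64, which is one practical reason to work on a sample.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel

# Hypothetical reviews, invented for this example
toy_reviews = [
    "great biryani friendly staff",
    "friendly staff slow delivery",
    "great coffee quiet place",
]

toy_matrix = TfidfVectorizer().fit_transform(toy_reviews)  # rows are unit length by default

dot = linear_kernel(toy_matrix, toy_matrix)      # plain dot products
cos = cosine_similarity(toy_matrix, toy_matrix)  # normalises first, then dot products

# Both give the same matrix because the rows are already normalised
same = np.allclose(dot, cos)
print(same)
```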
In [41]:
def recommend(name, cosine_similarities = cosine_similarities):
    
    # Find the positional index of the restaurant entered (the first match if the name repeats)
    idx = indices[indices == name].index[0]
    
    # Order all restaurants by their cosine similarity to the query, from biggest to smallest
    score_series = pd.Series(cosine_similarities[idx]).sort_values(ascending=False)
    
    # Extract the 31 highest-scoring positions: normally the query itself plus its 30 nearest neighbours
    top30_indexes = list(score_series.iloc[0:31].index)
    
    # Names of those restaurants
    recommend_restaurant = [df_percent.index[each] for each in top30_indexes]
    
    # For each name, take one matching row with the columns to display
    # (pd.concat replaces DataFrame.append, which was removed in pandas 2.0)
    df_new = pd.concat([df_percent.loc[df_percent.index == each, ['cuisines', 'Mean Rating', 'cost']].sample()
                        for each in recommend_restaurant])
    
    # Drop every row that appears more than once (keep=False) and keep the 10 highest-rated of the rest
    df_new = df_new.drop_duplicates(subset=['cuisines','Mean Rating', 'cost'], keep=False)
    df_new = df_new.sort_values(by='Mean Rating', ascending=False).head(10)
    
    print('TOP %s RESTAURANTS LIKE %s WITH SIMILAR REVIEWS: ' % (str(len(df_new)), name))
    
    return df_new
In [42]:
# HERE IS A RANDOM RESTAURANT. LET'S SEE THE DETAILS ABOUT THIS RESTAURANT:
df_percent[df_percent.index == 'Pai Vihar'].head()
Out[42]:
online_order book_table rate location cuisines cost reviews_list city Mean Rating
name
Pai Vihar True False 2.8 Vasanth Nagar South Indian, Street Food, Chinese, Fast Food 400.0 rated 30 ratedn 12 rate herenneven though tast... Frazer Town 2.48
Pai Vihar True False 2.8 Vasanth Nagar South Indian, Street Food, Chinese, Fast Food 400.0 rated 20 ratedn nice place vasanthnagar extrem... Brigade Road 2.48
Pai Vihar True False 2.8 Vasanth Nagar South Indian, Street Food, Chinese, Fast Food 400.0 rated 30 ratedn 12 rate hereãƒx83ã‚x83ãƒx82ã‚x... MG Road 2.48
Pai Vihar True False 2.8 Vasanth Nagar South Indian, Street Food, Chinese, Fast Food 400.0 rated 30 ratedn 12 rate herenneven though tast... Frazer Town 2.48
Pai Vihar False False 3.3 City Market South Indian, Street Food, Chinese, Fast Food 400.0 rated 20 ratedn food dry bland dont understand... Lavelle Road 2.48
In [43]:
recommend('Pai Vihar')
TOP 7 RESTAURANTS LIKE Pai Vihar WITH SIMILAR REVIEWS: 
Out[43]:
cuisines Mean Rating cost
Cinnamon North Indian, Chinese, Biryani 3.62 550.0
Prasiddhi Food Corner Fast Food, North Indian, South Indian 3.45 200.0
Shrusti Coffee Cafe, South Indian 3.45 150.0
Shanthi Sagar South Indian, North Indian, Chinese 3.44 400.0
Shanthi Sagar South Indian, North Indian, Chinese, Juices 3.44 250.0
Mayura Sagar Chinese, North Indian, South Indian 3.32 250.0
Container Coffee South Indian 3.11 200.0