Commit cfe66d6d authored by Terézia Slanináková

Update README.md

parent ce52b16f
@@ -4,4 +4,19 @@ Project for PV254 (Fall 2019). Recommends a city based on your travel history.
## Dataset
The dataset, found in `data/trips.csv`, is scraped from user pages of [nomadlist.com](www.nomadlist.com). Overall, we were able to get data from over 72k trips and 3700 users.
The dataset, found in the `data/` folder, is scraped from user pages of [nomadlist.com](www.nomadlist.com). Overall, we were able to get data from over 72k trips, 3700 users and 599 cities.
### Loading the dataset with pandas
Some boilerplate code is needed to load the dataset properly. Example:
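```
import pandas as pd

PATH = "..\\data"
# The separator is a regex that strips whitespace around commas;
# regex separators require the Python parser engine.
df_trips = pd.read_csv(f"{PATH}\\trips.csv", sep=r'\s*,\s*', encoding='utf-8', engine='python')

# Assumption: df_cities has been loaded beforehand (that step is missing here),
# indexed by city name, with a `cities` column holding one dict of attributes per city.
dict_df = {'city': [], 'hospital_score': [], 'english_speaking': [], 'nightlife': [], 'female_friendly': [], 'racial_tolerance': [], 'peace_score': []}
# Skip cities with no attributes, then copy each attribute into its own column.
for (i, row) in df_cities[df_cities.cities != {}].iterrows():
    for key in dict_df.keys():
        if key != 'city':
            dict_df[key].append(row.values[0][key])
    dict_df['city'].append(row.name)
df_cities = pd.DataFrame.from_dict(dict_df)
```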
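
The flattening step can also be tried on a small toy frame without the real data. The sketch below is only an illustration: the toy rows, the city names and the `flatten_cities` helper are hypothetical, and it assumes the cities data is indexed by city name with a single `cities` column of attribute dicts, as the loop above implies.

```python
import pandas as pd

# Hypothetical toy data standing in for the real cities file.
raw = pd.DataFrame(
    {"cities": [
        {"hospital_score": 4, "nightlife": 3},
        {},  # a city without attributes, which should be skipped
        {"hospital_score": 2, "nightlife": 5},
    ]},
    index=["Brno", "Nowhere", "Lisbon"],
)

def flatten_cities(df, keys):
    """Turn a column of per-city dicts into one column per key (hypothetical helper)."""
    # bool({}) is False, so this keeps only cities that have attributes.
    non_empty = df[df["cities"].map(bool)]
    records = [
        {"city": city, **{k: attrs[k] for k in keys}}
        for city, attrs in non_empty["cities"].items()
    ]
    return pd.DataFrame(records, columns=["city", *keys])

flat = flatten_cities(raw, ["hospital_score", "nightlife"])
print(flat)
```

Building a list of records first and constructing the frame once keeps the column order explicit and avoids maintaining a separate list per column.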
\ No newline at end of file