Datasets
Sherpa.ai has created five datasets for testing the spatial context, profiling, and prebuilt recommendation functionalities of the API.
The datasets simulate a user’s route, points of interest, and time transitions (the amount of time a user spends at a point of interest, arrival time, and departure time). That way, the points of interest that the user frequents can be detected, based on their location, the date, and the time.
The datasets simulate a user’s routine over the course of one month (February 2020), in the following cities:
- San Francisco
- New York
- Madrid
- London
- Singapore
See the Next Place Tutorial to understand how context awareness is simulated.
POST /v2/tracking/uploadDataset
Set the url field of the request body to the dataset URL for the city you want to use:
- San Francisco: https://sh-ia-data.s3-eu-west-1.amazonaws.com/sherpa_data/synthetic_users/SanFrancisco_user_10001.json
- New York: https://sh-ia-data.s3-eu-west-1.amazonaws.com/sherpa_data/synthetic_users/NewYork_user_10002.json
- Madrid: https://sh-ia-data.s3-eu-west-1.amazonaws.com/sherpa_data/synthetic_users/Madrid_user_10003.json
- London: https://sh-ia-data.s3-eu-west-1.amazonaws.com/sherpa_data/synthetic_users/London_user_10004.json
- Singapore: https://sh-ia-data.s3-eu-west-1.amazonaws.com/sherpa_data/synthetic_users/Singapore_user_10005.json
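The URLs above share one base path and a City_user_ID.json file name, so the lookup can be sketched in Python. This is a minimal sketch: the names BASE_URL, DATASET_USER_IDS, and dataset_url are hypothetical helpers for illustration and are not part of the Sherpa.ai API.

```python
# Base path shared by every dataset URL listed above.
BASE_URL = "https://sh-ia-data.s3-eu-west-1.amazonaws.com/sherpa_data/synthetic_users"

# File-name prefix -> synthetic user ID, as they appear in the listed URLs.
DATASET_USER_IDS = {
    "SanFrancisco": 10001,
    "NewYork": 10002,
    "Madrid": 10003,
    "London": 10004,
    "Singapore": 10005,
}


def dataset_url(city: str) -> str:
    """Return the dataset URL for a city name such as "San Francisco".

    Hypothetical helper: it only reproduces the URLs listed in this section.
    """
    key = city.replace(" ", "")  # "San Francisco" -> "SanFrancisco"
    if key not in DATASET_USER_IDS:
        raise ValueError(f"No dataset available for city: {city!r}")
    return f"{BASE_URL}/{key}_user_{DATASET_USER_IDS[key]}.json"
```

Calling dataset_url("New York") yields the New York URL from the list; an unlisted city raises ValueError rather than producing a URL that does not exist.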
Curl example:
curl -X POST "https://api.sherpa.ai/v2/tracking/uploadDataset" \
-H "Authorization: Basic XXXX-SHERPA-TOKEN-XXXX" \
-H "Accept-Language: en-US" \
-H "Content-Type: application/json" \
-d "{\"url\":\"https://sh-ia-data.s3-eu-west-1.amazonaws.com/sherpa_data/synthetic_users/SanFrancisco_user_10001.json\"}"
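The same request can be sketched in Python with the standard library only. The function name build_upload_request is a hypothetical helper, and the token is the same placeholder used in the curl example. The sketch builds the request without sending it.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
UPLOAD_ENDPOINT = "https://api.sherpa.ai/v2/tracking/uploadDataset"


def build_upload_request(token: str, dataset_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"url": dataset_url}).encode("utf-8")
    return urllib.request.Request(
        UPLOAD_ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Accept-Language": "en-US",
            "Content-Type": "application/json",
        },
    )


request = build_upload_request(
    "XXXX-SHERPA-TOKEN-XXXX",  # placeholder, replace with a real token
    "https://sh-ia-data.s3-eu-west-1.amazonaws.com/sherpa_data/"
    "synthetic_users/SanFrancisco_user_10001.json",
)
# To actually send it: urllib.request.urlopen(request)
```

Letting json.dumps produce the body avoids the manual quote escaping that the shell command needs.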
Response
Code | Name |
---|---|
202 | Accepted |
400 | Bad Request |
401 | Unauthorized |
500 | Internal Server Error |
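A client might translate these codes into a readable outcome as in the sketch below. The names UPLOAD_STATUS_NAMES and describe_upload_status are hypothetical, and treating 202 as the only success code is an assumption based on the table above.

```python
# Status codes and names copied from the response table above.
UPLOAD_STATUS_NAMES = {
    202: "Accepted",
    400: "Bad Request",
    401: "Unauthorized",
    500: "Internal Server Error",
}


def describe_upload_status(code: int) -> str:
    """Return a short description of an uploadDataset status code."""
    name = UPLOAD_STATUS_NAMES.get(code)
    if name is None:
        # Codes outside the table are not documented in this section.
        return f"{code}: undocumented status"
    outcome = "success" if code == 202 else "failure"
    return f"{code} {name} ({outcome})"
```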
The following box presents an example of the JSON format used to describe the geographical fingerprint of a user within the city. For practical purposes, we have limited the number of GPS points to two.
The user JSON is composed of two parts: a head (which comprises the file information), and a body (which contains the GPS points). Although the JSON is displayed across several lines in the box, the actual file contains no separating whitespace and no line breaks.
Here is an example of the San Francisco dataset JSON:
{
"idUsuario":10001,
"totalPoints":2,
"measuresCompleteness":0,
"pois":[],
"allClustersOnlyAccuracy":[],
"allClusters":[],
"points":[
{"stamp":"2020-02-01 00:00:01","latitude":37.786924,"longitude":-122.416205,"timeZone":"America/Los_Angeles","accuracy":10.5436244339217,"speed":0},
{"stamp":"2020-02-01 00:00:31","latitude":37.7868555014314,"longitude":-122.416105024042,"timeZone":"America/Los_Angeles","accuracy":9.72294784057885,"speed":0}
]}
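As a rough sketch, the example above can be loaded and checked in Python. The variable names are hypothetical, and the consistency check assumes that totalPoints is meant to equal the length of the points array, as it does in this example.

```python
import json

# The two-point San Francisco example from above, written as a single line.
RAW = (
    '{"idUsuario":10001,"totalPoints":2,"measuresCompleteness":0,"pois":[],'
    '"allClustersOnlyAccuracy":[],"allClusters":[],"points":['
    '{"stamp":"2020-02-01 00:00:01","latitude":37.786924,'
    '"longitude":-122.416205,"timeZone":"America/Los_Angeles",'
    '"accuracy":10.5436244339217,"speed":0},'
    '{"stamp":"2020-02-01 00:00:31","latitude":37.7868555014314,'
    '"longitude":-122.416105024042,"timeZone":"America/Los_Angeles",'
    '"accuracy":9.72294784057885,"speed":0}]}'
)

dataset = json.loads(RAW)
points = dataset["points"]

# Assumed invariant: the head's point count matches the body.
points_match_head = dataset["totalPoints"] == len(points)

# Serialize without spaces after separators and without line breaks.
minified = json.dumps(dataset, separators=(",", ":"))
```

Passing separators=(",", ":") makes json.dumps omit the spaces it would otherwise add after commas and colons.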
The data contains the following fields. idUsuario and totalPoints belong to the head; the remaining fields describe each GPS point in the body:
Field | Type | Description |
---|---|---|
idUsuario | Long | Unique user ID |
totalPoints | Long | Number of GPS points |
stamp | String | Time stamp of the GPS point, e.g. 2020-02-01 00:00:01 |
latitude | Double | Latitude |
longitude | Double | Longitude |
timeZone | String | Time zone name, e.g. America/Los_Angeles |
accuracy | Double | Accuracy |
speed | Double | Speed |
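The point fields in the table could be mapped onto a typed record as sketched below. The class GpsPoint and its attribute names are hypothetical, and the stamp format string is an assumption inferred from the example values. The time stamp is parsed as a naive datetime, without applying the timeZone field.

```python
from dataclasses import dataclass
from datetime import datetime

# Assumed time stamp layout, inferred from "2020-02-01 00:00:01".
STAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


@dataclass(frozen=True)
class GpsPoint:
    stamp: datetime
    latitude: float
    longitude: float
    time_zone: str
    accuracy: float
    speed: float

    @classmethod
    def from_dict(cls, raw: dict) -> "GpsPoint":
        """Convert one entry of the points array into a typed record."""
        return cls(
            stamp=datetime.strptime(raw["stamp"], STAMP_FORMAT),
            latitude=float(raw["latitude"]),
            longitude=float(raw["longitude"]),
            time_zone=raw["timeZone"],
            accuracy=float(raw["accuracy"]),
            speed=float(raw["speed"]),  # the example stores 0 as an integer
        )


first = GpsPoint.from_dict({
    "stamp": "2020-02-01 00:00:01", "latitude": 37.786924,
    "longitude": -122.416205, "timeZone": "America/Los_Angeles",
    "accuracy": 10.5436244339217, "speed": 0,
})
second = GpsPoint.from_dict({
    "stamp": "2020-02-01 00:00:31", "latitude": 37.7868555014314,
    "longitude": -122.416105024042, "timeZone": "America/Los_Angeles",
    "accuracy": 9.72294784057885, "speed": 0,
})

# Time between the two example points, in seconds.
elapsed_seconds = (second.stamp - first.stamp).total_seconds()
```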