The biggest event for me this year was completely outside of work and had nothing to do with statistics or R: I got married. We technically met at the Open Stats meetup and I did build our wedding website with RMarkdown, so R was still involved. We just returned from our around-the-world honeymoon so I thought the best way to track our travels would be with maps and globes using leaflet and threejs.

Before we get to any code, the following packages were used in making this post.

This was an extensive trip that, in addition to traditional vacation activities, included a few visits to clients and speaking and a few conferences and meetups. In all, we visited, London, Singapore, Hong Kong, Auckland, Queenstown, Bora Bora, Tahiti, Moorea, San Jose and San Francisco, with a connection or two in between.

The airport/ferry codes for our trip were the following.

Origin Destination Airline
JFK LGW Norwegian Air
LHR SIN Singapore Airlines
SIN HKG Singapore Airlines
HKG AKL Cathay Pacific
AKL ZQN Air New Zealand
ZQN AKL Air New Zealand
AKL PPT Air New Zealand
PPT BOB Air Tahiti
BOB MOZ Air Tahiti
MOZ PPT Terevau
PPT LAX Air Tahiti Nui
LAX SJC Alaska Airlines
SFO JFK JetBlue

Converting these to latitude and longitude is easy thanks to Open Flights.

# read in the data
airports <- readr::read_csv('https://raw.githubusercontent.com/jpatokal/openflights/master/data/airports-extended.dat',
                            # give it good column names since the data are headerless
                            col_names=c('ID', 'Name', 'City', 'Country', 
                                        'IATA', 'ICAO', 'Latitude', 'Longitude', 
                                        'Altitude', 'Timezone', 'DST', 'Tz', 
                                        'Type', 'Source'))

We then use filter to get just the ports we visited. Notice how we use a second tbl inside filter.

visited <- airports %>% 
    select(Name, City, Country, IATA, Latitude, Longitude) %>% 
    filter(IATA %in% (codes %>% select(Origin, Destination) %>% unlist))
DT::datatable(visited, elementId='AirportsTable',
              rownames=FALSE,
              extensions=c('FixedHeader', 'Scroller'),
              options=list(
                  dom='<"top"f>rt<"bottom"i><"clear">'
                  ,
                  scrollY=200,
                  scroller=TRUE
              )
) %>% 
    DT::formatRound(columns=c('Latitude', 'Longitude'), digits=2)

We then manually reorder the airports so that edges can be drawn nicely between them. This is akin to creating an edgelist of airport-pairs. This is not the most robust way of creating this list, but suffices for our purposes.

visitedOrdered <- visited %>% 
    slice(c(12, 1, 2, 8, 7, 5, 6, 5, 13, 3, 4, 13, 10, 11, 12))

For the first visualization let’s create a map using leaflet.

# initialize the widget
leaflet(data=visitedOrdered) %>% 
    # overlay map tiles
    addTiles() %>% 
    # plot lines from one point to another
    addPolylines(lng=~Longitude, lat=~Latitude) %>% 
    # add markers with city names
    addMarkers(lng=~Longitude, lat=~Latitude, popup=~City)

Unfortunately, this doesn’t quite capture the directions of the flights as it makes it look like we flew back west to get to Papeete. So let’s try a globe instead using threejs.

We augment the edgelist of airports so that it has the latitude and longitude of the origin and destination airports for each flight.

flightPaths <- codes %>% 
    left_join(visited %>% select(IATA, Longitude, Latitude), by=c('Origin'='IATA')) %>% 
    rename(oLong=Longitude, oLat=Latitude) %>% 
    left_join(visited %>% select(IATA, Longitude, Latitude), by=c('Destination'='IATA')) %>% 
    rename(dLong=Longitude, dLat=Latitude)
DT::datatable(visited, elementId='FlightPathLatLong',
              rownames=FALSE,
              extensions=c('FixedHeader', 'Scroller'),
              options=list(
                  dom='<"top"f>rt<"bottom"i><"clear">'
                  ,
                  scrollY=200,
                  scroller=TRUE
              )
) %>% 
    DT::formatRound(columns=c('Latitude', 'Longitude'), digits=2)

Now we can provide that data to threejs. We first specify an image to overlay on the globe. Then we specify the latitude and longitude of visited airports. After that, we provide the origin and destination latitudes and longitudes of our flights. The rest of the arguments are cosmetic.

globejs(
    # the image to overlay on the globe
    img="http://eoimages.gsfc.nasa.gov/images/imagerecords/73000/73909/world.topo.bathy.200412.3x5400x2700.jpg",
    # lat/long of visited airports
    lat=visited$Latitude, long=visited$Longitude,
    # lat/long of origin and destination
    arcs=flightPaths %>% select(oLat, oLong, dLat, dLong),
    # cosmetic adjustments
    arcsHeight=.4, arcsLwd=7, arcsColor="red", arcsOpacity=.95,
    atmosphere=FALSE, fov=30, rotationlat=0.3, rotationlong=.8*pi)

We now calculate the total distance traveled (not including car trips) using Haversine Distance to account for the curvature of the Earth.

distHaversine(visitedOrdered %>% select(Longitude, Latitude), r=3959) %>% sum
## [1] 28660.52

So we traveled 3,760 more miles than the circumference of the Earth.

Beyond the epic proportions of our travel, this honeymoon was outstanding from the sheer length, to the vastly different places we visited, to the food we ate and the sights we saw, to the activities we participated in, to the great people along the way. And, of course, it’s amazing to spend a month traveling with your favorite person.

Related Posts



Jared Lander is the Chief Data Scientist of Lander Analytics a New York data science firm, Adjunct Professor at Columbia University, Organizer of the New York Open Statistical Programming meetup and the New York and Washington DC R Conferences and author of R for Everyone.

Posted in R.



So far this year I have logged many miles in the air and on the rails. In between trips to Minneapolis and Boston I spent about a month traveling through India and Southeast Asia, mainly to conduct R courses in Singapore and Kuala Lumpur for the likes of Intel, Micron, Celcom, Maxis, DBS and other similar companies. The training courses were organized through Revolution Analytics’ Singapore office. Given the success of the classes, there will be more opportunities this spring or summer in Singapore, Kuala Lumpur and also in Australia.

Quite a lot of material was covered based on the offerings of my company, Lander Analytics and the content of my R for Everyone.

Day 1 – Basics

  • Getting and installing R
  • The RStudio Environment
  • The basics of R
    • Variables
    • Data Types
    • Reading data
    • Calling functions
    • Missing Data
  • Basic Math
  • Advanced Data Structures
    • data.frames
    • lists
    • matrices
    • arrays
  • Reading Data into R
    • read.table
    • RODBC
    • Binary data
  • Matrix Calculations
  • Data Munging
  • Writing functions
  • Conditionals
  • Loops
  • String manipulation and regular expressions
  • Visualization

Day 2 – Modeling

Day 3 – Machine Learning

Day 4 – Data Presentation and Portability

  • Reproducible reports using knitr
  • Basic Introduction to Markdown
  • Using knitr to automatically generate reports with embedded analytics
  • Using Markdown and knitr to automatically generate websites with embedded analytics
  • Using Markdown and knitr to make HTML5 slideshows with embedded analytics
  • Advanced plotting
  • Building R Packages
  • Shiny Overview

Day 5 – High Performance Computing with R

Given my extensive time abroad I thought it would be good to look at it all on a map using the Leaflet package in R.

Using the Google Maps API we can look up the latitude and longitude of the visited cities.

library(XML)
library(plyr)

cities <- c('Hong Kong', 'Haripal, India', 'Kolkata, India', 'Jaipur, India', 'Agra, India', 'Delhi, India', 
            'Singapore', 'Kuala Lumpur, Malaysia', 'Geroge Town, Malaysia')
lat.long <- function(place)
{
    theURL <- sprintf('http://maps.google.com/maps/api/geocode/xml?sensor=false&address=%s', place)
    doc <- xmlToList(theURL)
    data.frame(Place=place, Latitude=as.numeric(doc$result$geometry$location$lat), Longitude=as.numeric(doc$result$geometry$location$lng), stringsAsFactors=FALSE)
}

places <- adply(cities, 1, lat.long)
knitr::kable(places[, -1], digits=3, row.names=FALSE)
Place Latitude Longitude
Hong Kong 22.396 114.109
Haripal, India 22.817 88.105
Kolkata, India 22.573 88.364
Jaipur, India 26.912 75.787
Agra, India 27.177 78.008
Delhi, India 28.614 77.209
Singapore 1.352 103.820
Kuala Lumpur, Malaysia 3.139 101.687
Geroge Town, Malaysia 5.415 100.330

Now that we have the coordinates we use Leaflet to plot them.

library(leaflet)
leaflet(data=places) %>% addTiles() %>% setView(90, 15, zoom=4) %>% addPopups(lng=~Longitude, lat=~Latitude, popup=~Place) %>% addPolylines(~Longitude, ~Latitude, data=places[c(1, 3, 2:9, 1), ]) %>% addMarkers(lng=~Longitude, lat=~Latitude, popup=~Place, icon=JS("L.icon({iconUrl: 'https://www.jaredlander.com/images/jaredlanderfavicon.png', iconSize: [20, 20]})"))

Calculating all the miles traveled could be as simple as looking it up on TripIt, or we could do some quick Haversine distance calculations with the geosphere package.

First, we get the coordinates for New York, Minneapolis and Boston to have a complete picture of the distance.

newCities <- adply(c('New York, NY', 'Minneapolis, MN', 'Boston, MA'), 1, lat.long)
allPlaces <- rbind(newCities[c(1, 2, 1), ], places[c(1, 3, 2:9, 1), ], newCities[c(1, 3, 1), ])

Then in order to use distHaversine we need to set up a to and from relationship between the places. The easiest way will be to just shift the columns.

library(useful)
## Loading required package: ggplot2
shiftedPlaces <- shift.column(data=allPlaces, columns=names(places)[-1], newNames=c('To', 'Lat2', 'Long2'))

Now we can calculate the distance. This assumes that all trips followed a great circle, which might not be the case, especially for the car and rail portions of the trip.

library(geosphere)
## Loading required package: sp
shiftedPlaces$Distance <- distHaversine(shiftedPlaces[, c("Longitude", "Latitude")], shiftedPlaces[, c("Long2", "Lat2")], r=3959)

In total this led to 25,727 miles traveled.

knitr::kable(shiftedPlaces[, -1], digits=c(1, 3, 3, 1, 3, 3, 0), row.names=FALSE)
Place Latitude Longitude To Lat2 Long2 Distance
New York, NY 40.713 -74.006 Minneapolis, MN 44.978 -93.265 1016
Minneapolis, MN 44.978 -93.265 New York, NY 40.713 -74.006 1016
New York, NY 40.713 -74.006 Hong Kong 22.396 114.109 8046
Hong Kong 22.396 114.109 Kolkata, India 22.573 88.364 1642
Kolkata, India 22.573 88.364 Haripal, India 22.817 88.105 24
Haripal, India 22.817 88.105 Kolkata, India 22.573 88.364 24
Kolkata, India 22.573 88.364 Jaipur, India 26.912 75.787 844
Jaipur, India 26.912 75.787 Agra, India 27.177 78.008 138
Agra, India 27.177 78.008 Delhi, India 28.614 77.209 111
Delhi, India 28.614 77.209 Singapore 1.352 103.820 2574
Singapore 1.352 103.820 Kuala Lumpur, Malaysia 3.139 101.687 192
Kuala Lumpur, Malaysia 3.139 101.687 Geroge Town, Malaysia 5.415 100.330 183
Geroge Town, Malaysia 5.415 100.330 Hong Kong 22.396 114.109 1491
Hong Kong 22.396 114.109 New York, NY 40.713 -74.006 8046
New York, NY 40.713 -74.006 Boston, MA 42.360 -71.059 190
Boston, MA 42.360 -71.059 New York, NY 40.713 -74.006 190
leaflet(data=allPlaces) %>% addTiles() %>% setView(80, 20, zoom = 3) %>% addPolylines(~Longitude, ~Latitude) %>% addMarkers(lng=~Longitude, lat=~Latitude, popup=~Place, icon=JS("L.icon({
    iconUrl: 'https://www.jaredlander.com/images/jaredlanderfavicon.png', iconSize: [20, 20]})"))


Related Posts



Jared Lander is the Chief Data Scientist of Lander Analytics a New York data science firm, Adjunct Professor at Columbia University, Organizer of the New York Open Statistical Programming meetup and the New York and Washington DC R Conferences and author of R for Everyone.

Posted in R.