Show Me the Data; Season 2 Episode 4

THINGS THAT MOVE! WHERE CAN MOBILITY DATA HAVE THE MOST IMPACT?

CONVERSATION WITH KEVIN HARPER AND DR TOM VERHELST - 29 minutes 44 seconds

This episode explores the utility of mobility data to drive positive and sustainable change. Mobility data are extremely valuable when working to solve complex social problems.

By collecting and analysing data on how, when and where people move around a neighbourhood, city or region, governments and organisations can gain valuable insights into human behaviour, transportation use and incident rates, which can be used to inform interventions promoting greater equity, sustainability, and safety in our communities. Mobility data can also play a vital role in disaster response efforts, helping to ensure that resources are deployed efficiently and effectively to those in need. DSpark are leaders in processing large geo-spatial temporal mobility data to deliver intelligence on people and places using the highest data privacy standards. Understanding how people move, where they go and what they do enables their partner organisations to map their strategy around where people live, work and play. Join us as we discuss the importance of fostering a data-friendly and curious work culture, and why it’s hard for some organisations to take that calculated risk and trust in what the data say, even though, if they do so, they can reap the benefits of being an insight-rich organisation.

EPISODE TRANSCRIPT

Rhetta Chappell (host): Hi, and welcome to Show Me the Data, a podcast where we discuss evidence-based decision making and the ways in which our lives interact with and create data. I’m Rhetta, your host for today, and I’m a Data Scientist at Griffith University. Show Me the Data acknowledges the Jagera peoples who are the traditional custodians of the land on which we are recording today. And we pay respect to the Elders past, present and emerging.

Hello and good day to all, and welcome to another episode of Show Me the Data. Today in studio we have a fellow Canadian turned Aussie, Kevin Harper. Kevin is the Partnerships & Enterprise Consulting Director at DSpark. DSpark’s work and data help organisations drive data-driven strategy planning and cutting-edge digital transformation across Australia and abroad. Today I also have Dr. Tom Verhelst with me in studio. Tom is the Director of the Griffith Data Trust and the Director of the Relational Insights Data Lab. Both Tom and I have had the pleasure of working closely with Kevin and DSpark on a pivotal, all-of-government mobility data ecosystem pilot. We could have easily talked Kevin’s ears off for hours. So, we hope you find this conversation as entertaining as we did. Let’s get started.

Hello, and welcome. Thanks for being with me today, Kevin and Tom. I wanted to start by asking you Kevin, could you please explain what DSpark is and where you’re headed as a company?

Kevin Harper: Yes, we are a mobile data analytics company that’s built on a non-PII dataset that’s bringing in a few different data sources, mainly around telco, but also a few other sources as well, to combine into a fairly accurate movement pattern of population aggregates across Australia and some other parts in Asia Pacific, including Singapore, Philippines, Thailand, etc. To be able to understand the power of knowing where people have moved, how many people have come to a specific location, how long they stayed, where they went to after, kind of the general understanding of how people will have moved in and around an area to be able to measure things like the value of a visit, as an example, or how many people are travelling along a transit corridor. Those sorts of questions are kind of what we do on a day-to-day basis. The other side of our business is focused on telco enhancement as well. So, as I mentioned earlier, we are primarily based on telco data and by doing that, we’ve also learned a lot about telco in the same breath. So, we can also help the telco run more efficient processes, help clients get better coverage, new rollout recommendations, enhancing of CapEx to understand where to best upgrade the network, etc. So, in terms of where things are going with the company, I think there’s probably a couple of different avenues. One is that we’re focusing quite heavily on telco. We’re part of the SingTel family and what we’re doing is supporting basically a lot of the group of SingTel telcos on how to do all the things that I was just mentioning. But what we’re also looking at is monetisation of that same data. So, what we’re saying is, in order to understand your business better, we’re also able to look at how that can also help their customers do things like process improvement and save costs and bring additional revenues, etc. So, we’re really focusing on more of a regional expansion, but also doubling down on some of the work that we’ve done over the last seven or eight years in region to enhance the data that we currently have.

Rhetta: Yeah, that’s excellent. And you kind of mentioned in there, like, there’s obviously a wide range of applications for the DSpark data and the different parts of the company and where you’re focusing there. But are there any future applications of the DSpark data that you’re particularly excited about? And maybe, if you can, give us one government or public example, and maybe one commercial industry example?

Kevin: Sure. I would say that the future applications that I’m particularly excited about, anyways, are almost exclusively on spend. So, what we’re looking at right now is how people move in and around a region. So, we can say how many people came from Australia to Singapore as an example, or vice versa, but we can’t really explain the nature of that trip to a certain degree. We can certainly say they’re visiting and there’s some regions that are more of a hotbed for tourism activity than others, and by country as well, and even local and regional tourism as well that plays within each of the countries that we’re in. But the spend part is really exciting, because now we can start to value that trip to a region, to understand why tourism is important, or is this particular event paying off the way that we think it is. We can start to look at and really understand how that has an impact for a local or regional economy, with us even looking as far as GDP growth predictions, etc. So, it’s pretty far reaching, very exciting, it plays into almost every vertical that we’re playing in, specifically around tourism. I think, as I mentioned before, it’s quite a lot of what we do in the government space. We’re working with most of the state transport agencies as well. But I think overall, from top down, and with Australia as one of the main focus areas, tourism is a real interesting thing for a lot of reasons because it really talks about not just the people moving to a region and did that thing work, but also what the value and what the economic impacts were, as well as the environmental impacts.

I think that’s part… and I know that was one of the things that we worked on with you, Rhetta, was understanding the eco-tourism aspect, and also potentially even taking that a little bit further as to the footprint of tourism overall, and how do we make sustainable tourism work for a region. Those metrics are, by and large, not really available to the governments that are setting up these programmes in the first place. So, without a lot of expensive studies and a lot of history to show this type of data, we can actually look at a spot check to say, “Here’s how it’s changed over time” and “Here’s how we can affect the now to hopefully improve things for the future” as well. So, I think overall, anything when it comes to decisions in government is typically where we’re playing quite a lot of the time. On the commercial side, which you also mentioned, we’re heavily involved with kind of a shopping centre analytics view. So, what we’re looking at is very similar to a tourism example, again, but we’re looking at where do people travel to get to a particular shopping area? How long did they spend in the region? Are they going to other locations as well? What is the impact of a particular shopping centre for the local economy? Quite a lot of add-on benefits for not only the property owner, but also the area that that property is situated in. And so, I think it’s pretty exciting that there’s multi-stakeholder benefits to being able to understand the movement at that kind of level.

Dr. Tom Verhelst: It’s very interesting, especially for Brisbane, the first one. So what you described is you can basically quantify how much the Olympics are worth, if they’re worth it or not. You could say if we are gonna be in Barcelona or Tokyo.

Rhetta: With the value aspect, are you linking to other data like transactional data? Or are there other ways that you’re deriving that value, and that spend?

Kevin: At current, we are looking at other publicly available data sets, like the NVS and IVS, which are typical ones, right. So the National Visitor Survey and International Visitor Survey that Tourism Research Australia runs. We’re able to look at things like average spend for a visit to the Gold Coast as an example. And I know that the council there has done quite a lot of work in that space to understand, we know that people on average are spending x dollars per night, DSpark data also shows that there’s a four night stay. So we can value that type of mass movement, migration or change as an actual dollar figure. So now, it actually starts to make sense when we say that we’re up or down in terms of local economy. Now, the downside to that is the NVS and IVS are typically done as a spot check. I don’t know if any of your listeners have been called randomly, I certainly have myself. So I’ve been through the survey once, for interest’s sake, mostly, but they’re really looking at a very small slice of time: in the last 30 days, have you travelled anywhere? And how much did you spend? And that’s really basing a lot of that information on memory. There’s a lot of things that you might miss out in details, and certainly with the ability to have a better data set that can give more granularity and fill in the gaps and also the people that don’t answer that survey, which is a pretty large number, we can start to actually get a really confident view of actual spend that we can extrapolate up to the population. So that next dataset is coming very shortly, where we can actually start to look at a true provable number, not just an estimation by one metric that we extrapolate up.
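
The arithmetic Kevin describes here amounts to multiplying a visitor count by the average length of stay and the average spend per night. A minimal Python sketch of that calculation follows. The function name `estimate_visit_value` and all of the figures are hypothetical and chosen only for illustration; they are not DSpark, NVS or IVS values.

```python
def estimate_visit_value(visitors, avg_nights, avg_spend_per_night):
    """Estimate total visitor spend as visitors x nights x spend per night."""
    if visitors < 0 or avg_nights < 0 or avg_spend_per_night < 0:
        raise ValueError("inputs must be non-negative")
    return visitors * avg_nights * avg_spend_per_night


# Hypothetical inputs: 1,000 visitors, a four night stay (as in the example
# above), and an assumed survey average of 150 dollars per night.
total = estimate_visit_value(visitors=1000, avg_nights=4, avg_spend_per_night=150.0)
print(f"Estimated spend: ${total:,.0f}")
```

A single average spend figure is exactly the "one metric that we extrapolate up" limitation raised in the conversation, so a sketch like this is only as good as the survey average fed into it.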

Tom: Cool.

Rhetta: Yeah, that is really cool. And especially with the trip sequencing that you mentioned, as well, I feel like that’s like so exciting. I wish we still had access to your data.

Kevin: We can talk after.

Tom: Good. So, with the switch to 5G, will this improve, or will this negatively impact, the DSpark data, do you think?

Kevin: I actually am really excited about 5G, it’s certainly a lot of the work that I was doing in Canada before I came over to Australia a few years back. 5G rollouts are typically done in a different way than typical cellular history. And the reason why is because a lot of the new network is based on smaller cells. So, we’re looking at smaller footprints, smaller serving areas, lower power, they’re not the massive towers that we’ve all come to know. Quite a lot of this new infrastructure is based on almost the size of a laptop, and we put them on street posts and on top of buildings, inside parking facilities, etc. So the ability for us to look at a smaller area, by the very nature of how this is getting rolled out, is really powerful for our data. The second part about 5G is it’s extremely fast and almost near real time. I mean, some of the use cases that are coming out around the world are really focused around almost basically zero latency type scenarios, where it’s edge compute, where they’re enabling things like autonomous vehicle corridors, etc., where the vehicles are talking to these 5G towers pretty much 24/7, as well as other vehicles. So the ability to have pretty much low latency on most devices, as they will connect to 5G and 6G and onwards in the future. I think the power to understand not just this mobile device anymore, we’re now looking at things like vehicles and all kinds of other things that are connecting to this network, I think it’s gonna be a pretty amazing revolution in terms of the mobility analytics world.

Rhetta: With that kind of coming on board, and with the data ecosystem thing that we tested with Health and Wellbeing Queensland, and as you mentioned, we did that eco-tourism, as well as we worked with Queensland Fire and Emergency Services around using the mobility data for that. Moving beyond that, and given the possibilities and directions that DSpark data are going and the opportunities, have you felt like there’s much more of an appetite to access data like that in that kind of collaborative way?

Kevin: Yes, absolutely. I think it makes a lot of sense to everybody that has seen this, in operation in Queensland in particular, that the removal of these silos from governments or universities alike has profound benefits that we haven’t even tapped. I mean, in terms of not just our mobility data, but just data sharing in general in terms of all the different departments in academia and industry, as well partnering on solving some of these complex problems that sometimes we don’t even know the questions to yet. I think having the ability to not only have access for a particular dataset, like ours or others out there, certainly also has a benefit, when you start to understand well, what else is available? And how can I link these together, so that I can create some new insights?

You’re talking about emergency and fire, looking at where vehicles are placed. The real time nature of this data, as I’ve alluded to, it can be potentially lifesaving, if you start to look at where the people are, not just where the infrastructure is, and where the fire is, and where the fire could go, that could potentially save quite a lot of lives. If we start to say, how many vehicles do we need to rescue people? How many workers do we need? How many beds do we even need for an emergency shelter? Are they actually using the emergency shelter that’s been set up? Or are they moving past that to go somewhere else? All those types of pretty much real time decisioning needs to be in the hands of the first responders and the planners around these types of events. And I think that that’s starting to get unlocked with some of these pilots that you’re referring to. And I suspect that once that has a positive effect, I think all the rest of the departments will say “Oh, I actually have some interesting data.” And “I’ve got flood history data from two years ago.” and we can start predicting some of these things a little bit more. So I think overall, the all of government or all of organisations that can pair into this ecosystem, I think is going to have a pretty amazing benefit for those that partake.

Tom: Very interesting and a very strong point towards more collaboration across government and across universities and government.

Rhetta: Yeah, I guess kind of leading off of that, like if you could kind of snap your fingers or wave a magic wand and you could use the DSpark data to kind of solve any social or kind of public problem, what would it be and why? Like if you could kind of get your data into the hands of anyone.

Kevin: I would definitely say more into the emergency services space. The ability for government and first responders to have access to this data at the tip of their fingers is essential. I mean, the climate around the world is changing. There’s no denying that we’re not going to be going back to the way things are or were. Hopefully, soon, but it will probably continue with floods, fires, droughts, etc. And I would suspect that because we have this information in a real time scenario that this could have a dramatic impact on the lives of Australians and others around the world. So, if I could snap my fingers, I would have this data in everybody’s hands tomorrow. I think that the challenges and what’s preventing that, there’s probably a couple of things. One is around budget, so, because nobody has spent money on it before, that’s another big challenge that I go through on a day-to-day basis as to why they should change the way that they’ve done things for a long time. Budget is one but I think the granularity, I think, is a secondary view. Because of the sensitivity of a real time data feed, there are extra protections that we have in place that certainly limit the
“This is John Smith”, and I can’t ever say that that’s that person because we don’t have that information in our data. However, the first responders that we have talked to have said, “If I can know that that is that person, then I can contact that person.” So, I think getting over that hump of saying, this is going to be a satellite view, maybe not a five-storey view, that we can look at generalised population locations, that there’s still value in that. And I think that that’s a hurdle that we have to get across as well. But finally, there’s just a lot of other emergency services companies out there as well, that don’t necessarily talk about the people side, but talk about the fire and the smoke and all of the other effects of that, or flood as an example. So, there’s quite a lot of, like, property data, there’s things out there, but there’s not necessarily the people there. So, I think integrating into that system so that we can start to use this data for good in that scenario, I think would be a game changer, in my view.

Tom: It’s only we’re talking about things, especially like if you’re directly exposed to the fire, but if you’re exposed to the smoke, you will see quite a difference in the ways people are behaving. And just people running or walking out. Yeah, very interesting.

Kevin: Exactly. Yeah.

Tom: So I mean, it’s a sort of similar question, but slightly different, I suppose. So, where do you think that the DSpark data might have the greatest impact?

Kevin: I think it’s probably similar to the answer I gave around the people that can trust in the results. I think the bravery to change something, it’s risky. And people just don’t know the outcome. I think the ones that can trust that the concept makes sense, even though it’s never been done or not done in a lot of frequency, I think those are the ones that have changed the game around the world. I think having the perseverance to follow through on those decisions, I think is really helpful, especially if it didn’t come out the right way the first time. I think those organisations will certainly see the greatest impact. It does take a certain sense of a leap of faith, if you will, to say, “I think this is going to work, I don’t know if it’s going to work. And the budget certainly is difficult to unlock, because I don’t know that it’s a guaranteed result.” But the ones that we kind of work with the most are the ones that have that vision to say, “I think this is going to work”, and when it pays off, and it almost always typically does, that’s a really amazing feeling for both sides, both DSpark and our end clients. And anybody that works on the data as well, to be able to say “I knew that this was going to work and we actually saw 10 other cool things that we didn’t even think we were going to see, let’s go unpack all those things as well.” And then for an organisation to get as excited about the data as we are, to say, “Okay, well, let’s go unpack all those 10 things. And what does this mean? Can we impact this even more, because of this initial leap of faith?”, I think those are kind of the enthusiastic clients that we like to work with.

Rhetta: With that, and you kind of spoke to this with regards to how you are already linking into a lot of other data sources. And that might not be DSpark doing that necessarily, it could be the clients or partners that you’re working with in terms of linking that with financial data, or maybe it’s like transport data or whatever else it is. But if, as DSpark, you could have access to any data set in the world, and it’s more of a thought experiment, we’re kind of parking our morals, our ethics and everything at the door. If you could have access to any other data set to really, I guess, further enhance the insights and impact and value of the DSpark data, what would it be and why?

Kevin: This is an interesting one. So, I would say there’s probably two parts to my answer. One would be anything that moves, I want to know everything.

Rhetta: Tracking devices on everyone.

Kevin: Anything that moves, I want to know: cycling, I want to know any kind of different types of mode of transportation, aircraft, boats, everything. Like, anything that I can get my hands on that moves, I think is 100% the kind of sweet spot for DSpark overall. I think we aren’t experts in absolutely everything and I don’t think that we claim that we’re going to expand and own the whole data analytics market, but I think what we do really well is understanding how things and people move. So, looking at it from that standpoint, if we had telematics data, and cycling data, as a start, I think that would be an amazing thing to have a look at. Because then we can start to look at freight vehicles as an example, with clear data to show that this is heavy freight versus light freight, etc. We looked at things like cabs versus rideshare. That would also be a really interesting dataset. Cycling is a big one for us, and one that is universally challenging from my experience, because in our data, it could look like a pedestrian or it could look like a vehicle, but we just don’t know, depending on the speed that they’re going. So I think cycling, especially as cities expand their green initiatives and add bike lanes, etc., I think that would be a pretty exciting one for us to get our hands on.

But I think the second part of my answer is a little bit more general. And I think it’s really just anything that we can easily join at a level of granularity that’s interesting for us. So, part of the challenge that our data science team will always have, and every data science team when we’re linking data sets together is in the same boat, is how do you say this equals that. And I think if we have a level of granularity that we can make that link as close to realistic as possible, that starts to really enhance the understanding of the movement in the first place, and we can contextualise it and we can understand what types of people or things are moving around. I think that challenge leads onto your privacy and ethics side of things. I think the power of this data is lost if we lose the ethics. And I think that there’s been a lot of questions over my career, certainly, about can we just unpack just for this one use case? And I think it’s a dangerous spiral. I’ve debated amongst myself and my colleagues quite a lot over the years on how can we do this without opening Pandora’s box, basically. So, I still think ethics, privacy by design, etc., are extremely important, because I think we should still be able to do this type of thing in a privacy-compliant way that doesn’t bend the rules or infringe on anybody’s individual data. Because I think there’s a lot of power here that can get uncomfortable if we start to bend those rules slightly. However, the happy medium between that and opening Pandora’s box is a middle layer that I’ve always envisioned, and I think I’ve seen a few organisations around the world start to do this. But what that really is, is a double-blind system where I’ve got my data set, and somebody else has their data set. But there has to be a way that we can have a middle layer that doesn’t really understand either dataset, but can somehow link those two together and get the insights derived out of that to us or to the other organisation without us matching. So, I think if we can solve that, that is a major, major step forward for analytics in the world. I think we’re kind of getting there now. But it’s certainly very difficult. And it’s a lot of bespoke work, from what I’ve seen. But overall, I think the middle joining layer is probably the most exciting thing for us. And certainly something that, if we don’t find it, we’ll have to develop, I think, in the future.
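
One simple way to picture the "middle layer" idea is keyed-hash tokenisation: each data owner replaces its join keys with an HMAC computed under a secret that the two owners share but the middle layer never sees, and the middle layer matches the opaque tokens and hands back aggregates only. The Python sketch below is an illustration under those assumptions and is not a description of how DSpark or any other organisation actually does this. The function names, the `area-00x` keys, the shared secret and all figures are hypothetical.

```python
import hashlib
import hmac


def tokenise(records, secret_key):
    """Replace each join key with a keyed hash so raw keys never leave the owner."""
    return {
        hmac.new(secret_key, key.encode("utf-8"), hashlib.sha256).hexdigest(): value
        for key, value in records.items()
    }


def middle_layer_join(tokens_a, tokens_b):
    """Hypothetical middle layer: match on opaque tokens, return aggregates only."""
    matched = tokens_a.keys() & tokens_b.keys()
    return {
        "matched_keys": len(matched),
        "total_a": sum(tokens_a[t] for t in matched),
        "total_b": sum(tokens_b[t] for t in matched),
    }


# Hypothetical secret agreed between the two data owners, not the middle layer.
shared_key = b"hypothetical-secret-agreed-by-both-parties"

# Hypothetical data: visit counts held by one party, spend held by another.
visits = {"area-001": 1200, "area-002": 800, "area-003": 450}
spend = {"area-002": 96000.0, "area-003": 40500.0, "area-004": 12000.0}

summary = middle_layer_join(tokenise(visits, shared_key), tokenise(spend, shared_key))
print(summary)
```

A sketch like this leaves out most of what makes the real problem hard, such as guessable join keys, key management and small-group disclosure, which fits Kevin’s remark that the work is difficult and largely bespoke.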

Tom: Thank you. The privacy one is an interesting one, because it’s sort of, you’re right, it highly depends on the context, right? It is Pandora’s box. But if you’re in need, and the first responder is responding to you, and your phone signal can help him get to you faster, you don’t really care about your privacy at that point. You just want to be found and helped.

Kevin: I think so, but then there are other ways. Yeah, if you do expand that for that use case, then others will say “Well, we did it there and also national security” or threats and all that kind of stuff, right? I think the ability for us to at least keep guard on that level of privacy protection, I think is helpful for the general public and the future of humanity, if you will, because I think looking at this type of macro data, and definitely getting inferences and not actual people, can still be as powerful as looking at an individual. So, I think, from a privacy standpoint, we want to make sure that that is not possible. And even from the DSpark side, we’re not at the level where we know who it is and then we mask it. We’ve actually designed it so that we don’t know who it is, because upstream it’s getting masked from us. And we did that by design so that even within our own organisation, we have no ability to kind of break the rules, so to speak. So, I think that is something that’s pretty passionate for myself as well, to be able to do ethical, privacy-compliant things, by still looking at other data sets in a granular enough way that we can get enough information. And I think that’s kind of the future and how I see it.

Rhetta: And it’s always that trade-off between useful and anonymous data, isn’t it? And I think that to have the kind of public buy-in and trust to get more people using your valuable data, you really do have to tread that line carefully.

Kevin: Exactly.

Tom: But it’s also, if you would throw this forward a lot, if you turn it the other way around, if you would start breaking this, people would start leaving their phone at home or would start doing things so they can’t be tracked. I mean, for most things in the Western world, you need the buy-in of the public to go along. Otherwise, your data won’t make any sense. It’s the same with the census, right? If we wouldn’t trust the census, if 10% would fill them in? Well, it would be a problem.

Kevin: Exactly. I think it’s an implied trust. I mean, regulations across the world are starting to come in to protect those types of use cases, I guess, like GDPR, etc. And I think the ability for the world to still understand and derive value from these types of insights will be necessary to play at a level that’s not below the sheet, so to speak. I think we have to be able to understand that there is a limitation to how close we can get with this type of data. But because we’re doing it that way, I think, and certainly some of the governments and even just individual citizens I’ve talked to in many different countries, I think they’re on board with that level of detail. You can see that that trust has an impact, even for companies like Apple, where they’re using it as a marketing tool almost to say, we’re not selling your data, nobody knows where you are, all that kind of thing. I think the ability for groups like us and others out there is certainly much more powerful if we are above board, and we have those limitations in place.

Rhetta: Yes, definitely.

Tom: Thank you.

Rhetta: Yeah, I think it’s been absolutely fascinating talking to you, Kevin, and I will be respectful of your time. So I think we could keep on asking you more and more questions. But thank you for being here. And thank you for improving my Canadian to non-Canadian ratio today. And also, you saw we had another podcast guest asked if we say “day-ta” or “da-ta”, you say “da-ta”, I change, and I think you say “day-ta” (towards Tom). So anyways, that was a bit of an interesting fun for the audience.

Kevin: Yeah, I’m all over. I think I used to say “day-ta”, but then the Australians converted me over to “da-ta”. So, I think that’s where I’m sitting now.

Rhetta: Well, we’ll have some fun with that. But thank you so much, Kevin. It was really lovely to speak with you. And it was fun working with you guys. And I hope we get an opportunity to do so again in the future.

Kevin: Likewise, thank you for having me, it was an honour. Thank you.

Rhetta: To listen to more episodes of Show Me the Data, head to your favourite podcast provider, or visit our website, RIDL.com.au, and look for the podcast. We hope that by sharing these conversations about data and evidence-based decision making, we can help to inform a more inclusive, ethical and forward-thinking future. Making data matter is what we’re all about. And we’d love to hear why data matters to you. To get in touch, you can tweet us @G_RIDL, send us an email or, if you prefer, just send us a letter by carrier pigeon. Thank you for listening, and that’s it till next time. Take care and stay safe.