Seeing through the customer’s eyes: the emerging science of photo analytics

Photo: Lu Lacerda, CC 2.0

I recently had the opportunity to moderate a panel at Oracle Data Cloud Summit 2015 and to deliver a speech at SDL Innovate. The themes, respectively: “Listening to the Customer’s Voice” and “Unlocking the Value of Social Data.” But it’s not just the customer’s voice we need to care about. We also need to care about, and better understand, the customer’s vision.

This past weekend, the Washington Post ran a story about painter and photographer Richard Prince, whose slightly reconfigured blowups of Instagram users’ photos were recently shown (and sold) at the Frieze Art Fair in New York, for a cool $90,000 each.

The Post article focused on ownership of the Instagram photos themselves, and on the flexibility of copyright law around images. But ownership isn’t the only question images raise; there’s also the challenge of interpreting what they mean, so we can determine what action to take, if any. Some use cases include:

  • Image-based UGC/Native advertising
  • Content marketing
  • eCommerce (increasing lift)
  • Risk management (copyright)
  • Risk management (brand)

The Emerging Science of Photo Analytics

When we think about understanding “the customer’s voice” through social media, we generally think about text. A tweet such as the one below presents a few interesting challenges for a human (not to mention a machine) to interpret: the sentiment is mixed (love the watch, hate myself), the person may or may not be an owner (did he buy it?), and there isn’t much other behavioral data to interpret (he tried one on and wrote about it).

[Screenshot of the tweet in question]

Beyond that, we can look at the metadata to try to determine whether and how much it was shared, the profile of the poster, and so on. But these 14 words don’t tell us a whole lot; a listening tool will tell us that it is a brand mention, in English, with mixed (not neutral!) sentiment. That’s not a bad thing; it simply illustrates the challenges that we–not to mention machines–have with the 140-character format, and with human language as a whole.
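To make those limits concrete, here’s a toy sketch of what a naive, rules-based listening pass might do with a tweet like that one. The brand terms, keyword lists and sample text are invented stand-ins for illustration; no real listening tool is this simple.

```python
# Toy "listening" pass over a single tweet. The brand terms, keyword lists
# and sample text below are invented stand-ins, not a real tool's rules.

BRAND_TERMS = {"apple watch", "applewatch"}
POSITIVE = {"love", "great", "amazing"}
NEGATIVE = {"hate", "awful", "terrible"}

def naive_listen(tweet: str) -> dict:
    text = tweet.lower()
    mentioned = any(term in text for term in BRAND_TERMS)
    pos = sum(word in text for word in POSITIVE)
    neg = sum(word in text for word in NEGATIVE)
    if pos and neg:
        sentiment = "mixed"
    elif pos:
        sentiment = "positive"
    elif neg:
        sentiment = "negative"
    else:
        sentiment = "neutral"
    return {"brand_mention": mentioned, "sentiment": sentiment}

print(naive_listen("Tried on an Apple Watch today. Love the watch, hate myself."))
# {'brand_mention': True, 'sentiment': 'mixed'}
```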

Contrast that tweet with something like this image I found on Flickr:

[Flickr image: a Coca-Cola can refashioned into a butterfly]

A human can detect that the focus of the photo is a butterfly-shaped object that started its life as a can of soda. A photo analytics tool such as Ditto could likely detect that it contains a partial Coca-Cola logo. If it were shared in Instagram or Pinterest, and contained enough useful metadata, Piqora might return it in a list of search results.
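For the image side, here’s a rough sketch of one of the simplest techniques behind logo identification: template matching with OpenCV. The file names are placeholders, and commercial tools such as Ditto rely on far more robust, trained models that can handle partial, rotated or distorted logos.

```python
# Rough sketch of naive logo detection via template matching with OpenCV.
# File names are placeholders; production services use trained models that
# cope with partial, rotated or distorted logos far better than this.
import cv2

scene = cv2.imread("flickr_photo.jpg", cv2.IMREAD_GRAYSCALE)  # the UGC image
logo = cv2.imread("coke_logo.png", cv2.IMREAD_GRAYSCALE)      # reference logo

result = cv2.matchTemplate(scene, logo, cv2.TM_CCOEFF_NORMED)
_, max_score, _, max_loc = cv2.minMaxLoc(result)

# A crude confidence threshold; real systems return calibrated scores.
if max_score > 0.7:
    print(f"Possible logo match at {max_loc} (score {max_score:.2f})")
else:
    print("No confident logo match")
```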

But the real question, if I’m a brand marketer at Coke, is this: what do I do now?

  • Is it a brand mention?
  • Is it positive or negative?
  • Can I tell the identity of the poster?
  • Is that person the same as the creator of the image? Of the butterfly?
  • How else do I categorize it?
  • Can I use it in native advertising, commerce or other types of campaigns?
  • Is it a brand risk?
  • Is it actionable in some other way?

This is particularly critical as brands incorporate user-generated content into campaigns, and as automated campaigns become more popular. In February, Coke launched, then pulled, its online #MakeItHappy campaign when a user added the hashtag to a tweet containing white supremacist content. Gawker then created a Twitter bot to test whether it could make the account inadvertently tweet lines from Hitler’s Mein Kampf. Short answer: it could, and it did, and Coke shut down the campaign immediately.

This is where brands must balance optimism with sometimes harsh reality: optimism that involving the community can be beneficial, and the reality that someone, somewhere will appropriate that content for their own uses, whether artistic, competitive, financial, political, comic, or just plain nasty. Clearly, malicious intent is more of an issue on some platforms than others. Instagram and Pinterest tend to be, as Sharad Verma, CEO of Piqora, says, “happy platforms.” Twitter and others tend to have more problems with abusive content, as has been widely reported.

As you can imagine, it would be fairly easy to get lost in analysis paralysis with digital images, especially since we are still so new at the science of understanding and using them. So let’s start with a few basic principles:

  • Images are brand mentions
  • The science involved in interpreting them is still very new. The main approaches focus on interpreting the image itself, the metadata associated with it, or (ideally) a combination of both; see the sketch after this list.
  • A few things are possible today with image recognition: logo identification and some (very basic) sentiment analysis. Given metadata, search becomes more feasible and useful.
  • Popular social platforms such as Instagram, Tumblr, Snapchat (and others yet to be invented) are highly image-centric.
  • Images are more universal than language, though there are always exceptions.
  • Automated campaigns will always hold some element of risk if they use or even simply suggest the use of UGC.
  • You’re going to need to figure this out sooner rather than later, because GIFs, video, spherical video (coming from Facebook), augmented reality and virtual reality are here, and they’re going to be a lot more complex from an analytical point of view.
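To illustrate the “combination of both” point above, here’s a hedged sketch of how a triage step might join an image-recognition score with platform metadata to decide what a brand team does next. The field names, thresholds and routing rules are my own assumptions, not any vendor’s schema.

```python
# Illustrative triage: combine an image-recognition result with post
# metadata to decide what a brand team does next. Field names, thresholds
# and routing rules are assumptions for this sketch only.

def triage(recognition: dict, metadata: dict) -> str:
    has_logo = recognition.get("logo_score", 0.0) > 0.7
    if not has_logo:
        return "ignore"
    # Reach and risk signals come from platform metadata, not the pixels.
    high_reach = metadata.get("followers", 0) > 10_000
    flagged_terms = bool(metadata.get("risky_hashtags"))
    if flagged_terms:
        return "escalate_to_brand_risk"
    if high_reach:
        return "candidate_for_ugc_campaign"
    return "log_as_brand_mention"

post_metadata = {"followers": 25_000, "risky_hashtags": []}
print(triage({"logo_score": 0.83}, post_metadata))  # candidate_for_ugc_campaign
```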

I’d love to hear your thoughts on use of images and image analytics; what you’re using, looking at, thinking about, terrified of and excited about.



Data Experience: You’ve Gotta Have Trust

Photo: Nic Taylor, CC 2.0

A few years ago, I started thinking about how data informs the customer experience. The catalyst was simple; I was frustrated with my fitness tracker, and felt deluged by a stream of numbers that weren’t particularly helpful (the dashboards sure were pretty though).

Part of the issue was that all the design energy had gone into the physical device. For the time, it was cool and sleek. But syncing it was clunky, and it featured a proprietary metric that may have been useful for branding purposes but did nothing for me personally. So if insight was supposed to be core to the product, the product had failed, at least for me. But wearable devices aren’t the only products in which data experience is critical. Insight has become an expectation, an essential part of the customer experience overall.

And it’s not just the production of data that forms the experience; the way the product consumes data shapes our experience too. We need to trust that our data is being used thoughtfully and ethically, or any insight that it provides is meaningless.

When that relationship is out of balance, people find workarounds. Designers and developers have a love/hate relationship with workarounds. They hate them because they expose flaws, and love them (one hopes) for the same reason.

If you doubt that trust is part of the customer experience of data, consider these fascinating workarounds:

  • A recent story in Alternet introduced a group of fashion designers and developers committed to helping consumers block facial recognition technology and confuse drones, among other things.
  • The makeup tutorial of your dreams (or rather, nightmares): artist and filmmaker Jillian Mayer presents this YouTube video on how to apply makeup to hide from facial recognition apps and surveillance cameras.

If you dismiss these as edge cases, you’re well within your rights. But maybe that’s just today.

Isn’t it worth considering whether and how customers might be signaling distrust of your brand’s data experience? If I were Samsung’s Smart TV division, I’d look at how many people are disabling voice commands. If I were Facebook, I might look at photo upload rates over time, tagging behavior and changes to privacy controls.
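As a thought experiment, here’s a minimal sketch of one such trust signal: the weekly share of active users who tighten a privacy control. The numbers and column names are invented; the point is the trend, not the tool.

```python
import pandas as pd

# Hypothetical weekly aggregates; the data and column names are invented.
events = pd.DataFrame({
    "week": ["2015-05-04", "2015-05-11", "2015-05-18"],
    "active_users": [1_000_000, 1_010_000, 1_005_000],
    "privacy_tightened": [12_000, 19_000, 31_000],
})

# Share of active users who tightened a privacy control that week.
events["opt_down_rate"] = events["privacy_tightened"] / events["active_users"]
print(events[["week", "opt_down_rate"]])
# A sustained upward trend here would be one signal of eroding trust.
```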

What would you look at? As always, I appreciate your thoughts and comments.


What the end of the Twitter-DataSift partnership means for customers

Photo: Bob Kelly, cc 2.0

Disclosures

Let’s get this out of the way right up front. DataSift is a client of mine at Altimeter Group. I am closely connected to Twitter via my role as a board member of the Big Boulder Initiative, of which Chris Moody, VP Data Strategy at Twitter, is chair. 

On Friday, Zach Hofer-Shall, head of Twitter’s ecosystem, published a post on the Gnip/Twitter blog entitled “Working Directly With the Twitter Data Ecosystem.” Shortly thereafter, Nick Halstead, CEO of DataSift, published a post (since updated) entitled “Twitter Ends its Partnership with DataSift – Firehose Access Expires on August 13, 2015.” The next day, Tim Barker, DataSift’s Chief Product Officer, added another take: “Data Licensing vs. Data Processing.”

In a nutshell, this means that Twitter has ended its reseller agreements with third parties, and, going forward, will be the sole distributor of Twitter data.

To be clear, this does not mean that Twitter is discontinuing firehose access; in the future, Twitter will control its own data relationships rather than licensing data through resellers. 

These posts have sparked a flurry of commentary across a wide spectrum, from support to vilification to philosophizing on the meaning of platforms, analytics and ecosystems. I’ve included links to a few at the end of this post.

Anyone watching Twitter go public, and subsequently disclose the revenue from its direct data business, could have anticipated that Twitter would see this as an area ripe for a significant strategy shift. And it was a short hop from there to conclude that DataSift’s (and possibly others’) days of reselling data received via the Twitter firehose might be numbered.

It also hasn’t been a surprise to see Twitter enhance its analytics, re-evaluate the short-and-long-term value of its data and announce strategic partnerships such as the ones with IBM and Dataminr as it seeks to build its partner strategy and revenue potential.

Meanwhile, DataSift has continued to execute on its own strategy, which includes broadening its data sources far beyond social data, announcing VEDO, its data categorization platform, and developing its privacy-first PYLON technology (see its announcement with Facebook on how they are providing privacy-safe topic data).

Long story short: no one was shocked at the news. But the reaction to it has been polarizing in the extreme. What seems to have fanned the flames are a few different dynamics:

  • The context of the announcement, including Twitter’s recent moves against Meerkat and a historically fraught relationship with its ecosystem
  • The fact that Gnip and DataSift were competitors before the acquisition, complicating DataSift’s relationship with Twitter
  • DataSift’s strong public reaction to the news, and to some extent the timing of the announcement, which came late on a Friday, a classic (though ineffective) news-burial tactic.

It also doesn’t help that Twitter has been (except for the original post) all but silent this past week. But that shouldn’t come as a surprise to anyone either. It’s a public company and, as such, required to comply with regulatory requirements governing communications with the public. According to Nasdaq, Twitter is expected to report earnings on April 28. So it’s in a quiet period and, as will surprise no one, won’t talk about confidential negotiations between parties.

As a privately-held company, however, DataSift has more leeway to comment publicly. I’m not going to repeat their position here; it is clearly stated in several posts on the DataSift blog.

But none of this gets at the most important issue: the impact of this decision on customers and users of Twitter data. Here are a few constituencies to consider:

1. Companies that sell social technology

To gauge the impact, you need to consider how these companies gained access to Twitter data before now:

  • Directly from Twitter, via the firehose or public API. Impact: no change. Implications: none.
  • From Gnip (and therefore now from Twitter). Impact: no change. Implications: none.
  • From third-party resellers such as DataSift. Impact: firehose access ends August 13, 2015. Implications: these companies must reassess how they migrate from DataSift to Twitter.

But this requires a bit of context.

Before the acquisition, there was a reason companies—social technology or enterprise—would select Gnip or DataSift (or, before its acquisition by Apple, Topsy) if they wanted direct access to Twitter data: they had different value propositions.

DataSift positioned itself as a platform in which the access to social and other types of data came with value-adds such as advanced filtering, enrichments, taxonomies, machine-learning and other sophisticated data processing capabilities to enable users to derive insights from the data.

Gnip, on the other hand, was a simpler, less expensive option: they offered enrichments, but the value proposition was historically more about simplicity and reliability than sophisticated data processing. This tended to be an easy calculation for a lot of social tech companies who wanted to add their own capabilities to the data.

So, speaking broadly, analytics or social technology companies (even brands) that could handle raw data, or wanted to, would have been better suited to Gnip. Those who wanted a more plug-and-play product that processed data consistently across multiple sources would more likely favor DataSift. Once Twitter acquired Gnip, it didn’t take a team of data scientists to conclude that Twitter had bigger plans for its data business, and that a lot of that development would happen under the waterline, as these things tend to do.

But that doesn’t eliminate the very real issue that migration is going to be highly disruptive for customers of DataSift.

But there is another data point that’s important to consider.

Unlike with most industry shifts, it’s been very difficult to get any social analytics companies to talk on the record about this news. On background, some stated that they were never quite sure whether DataSift intended to be a partner or a competitor, because it wasn’t a pure reseller; the platform, with its ability to perform sentiment analysis, apply enrichments, and provide classifiers and taxonomies, pushed it, for some, uncomfortably into analytics territory.

Some said they’re concerned about Twitter’s plans as well. Now that Twitter has discontinued licensing through resellers, what will it monetize next? Will it take more control of analytics, or develop its own? If not, what then?

This is unsettling for some in the social analytics community, who are also being buffeted by business intelligence and marketing/enterprise cloud companies (think Oracle, Salesforce, Adobe) eager to wrap social data into a broader insight offering. It’s a time of shifting strategies and shifting alliances.

2. End users of technology (brands and enterprise)

For the most part, end users of Twitter data don’t have much to worry about, unless they are current or potential DataSift customers and can’t (or don’t want to) ingest the firehose in its raw form. If they are, they’ll need to migrate to Twitter, and assess the extent to which Twitter is currently (or via roadmap) able and willing to provide the type of processing they need.

If enterprises get their Twitter and other social data from social analytics providers, they are more insulated from this news. The question I would ask is whether and how Twitter intends to normalize data from other social media platforms. Will users have a clear sightline across multiple social (and other) data sources? Will Twitter analyze more than its own data? Will it handle that through partnerships (and if so, with whom)? The ability to normalize data across sources has been a clear value proposition for DataSift, less so for Gnip, especially since the acquisition by Twitter. And of course we can’t discount the fact that Twitter likely has more up its sleeve than it’s able to disclose right now.
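To illustrate why normalization matters, here’s a minimal sketch of mapping a raw tweet payload and a post from a hypothetical second source into one common schema, the kind of plumbing customers either get from a platform or now have to own themselves. The Twitter fields follow the public tweet JSON of the time; the second source’s fields are invented.

```python
# Sketch of normalizing posts from different sources into one schema.
# The Twitter fields follow the public tweet JSON of the time; the
# second source's fields are invented for illustration.

def normalize_tweet(raw: dict) -> dict:
    """Map a raw tweet payload to a common schema."""
    return {
        "source": "twitter",
        "id": raw["id_str"],
        "text": raw["text"],
        "author": raw["user"]["screen_name"],
        "created_at": raw["created_at"],
    }

def normalize_other_source(raw: dict) -> dict:
    """Map a post from a hypothetical second source; field names invented."""
    return {
        "source": "other",
        "id": str(raw["post_id"]),
        "text": raw["body"],
        "author": raw["author_name"],
        "created_at": raw["published"],
    }
```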

3. Agencies, consultancies and other professional services firms

Agencies can be affected in any number of ways, based upon their social (and other) analytics strategy and business model. Those who offer their own bespoke analytics would do well to learn more about Twitter’s data roadmap: how much will be product, and how much service. Those who use a mix of analytics tools would likely be less affected.

As for professional services firms, there is a tremendous amount of opportunity in custom analytics for the enterprise. The challenge is that 1) data processing isn’t core to that business, 2) developing custom analytics doesn’t scale well, and 3) Twitter data, especially in the context of other social, enterprise and external data, is extremely complex. As a result, professional services firms will need to approach Twitter to better understand what the company will and won’t offer in the future, and where the synergies lie. Either way, it’s going to be a delicate dance.

For all DataSift customers, Tim Barker’s post is a shot across the bow for Twitter; even if Twitter disagrees with his assessment of the impact of terminating the reseller agreement, it’s a starting point for customers to begin their conversations with Twitter and suss out what exactly a shift to a direct relationship might entail.

One other option is for customers to bring Twitter data into DataSift via a Gnip connector. DataSift is working on that now.

A few last thoughts

In the end, a lot is still unknown, and Twitter’s silence is enabling people to fill the void with speculation, which creates uncertainty and doubt among those whose businesses depend to some extent on Twitter. But in my opinion, all of this is an inevitable if painful blip for both companies: DataSift will move on, and Twitter will continue to build out its data business, which will likely create even more uncertainty in the social data ecosystem for some time to come.

But, as one person who spoke off the record rather philosophically commented, “In a world where you’re dealing with third-party data, you can never be completely comfortable.”

Other points of view


Who will win the cloud wars? (Hint: wrong question)

Photo: Nick Kenrick, CC 2.0

Like many analysts, I’ve spent the last month or so crisscrossing the country looking at clouds*: marketing clouds, sales clouds, service clouds. I’ve been on buses, at dinners, in keynote speeches and presentations and one-on-one meetings. I’ve collected more badges than I did in my first (and only) year in the Girl Scouts.

One of the things that stands out throughout all of these presentations and conversations is how tactically similar yet philosophically different each approach turns out to be. Salesforce, as always, is most forward in its messaging. Adobe is doing a lot of work under the waterline to deepen the integration (and therefore the value proposition) of its marketing cloud. Oracle is using its expertise in data to paint the picture of a customer-centric enterprise. And yes, each approach has its standout strengths and clear weaknesses.

The fundamentals—whether it’s the presentation, the messaging, the suite-versus-point-solution positioning, the discussions of acquisitions and integrations and features—are consistent. And so it tempts us to rate one over the other. Who will win the cloud war? Oracle, Salesforce, Adobe? Will Sprinklr be the dark horse? Someone else?

Except while we argue over the finer points of Adobe’s cloud strategy versus Salesforce’s versus Oracle’s, we overlook one basic fact. The war is over. And not in the “congratulations to the winner; please pick up your statuette” sense. It’s just that looking at it as a “war” doesn’t really cut it anymore.

Paul Greenberg incisively argues in a recent post that we’ve moved past the suite wars and into an age of ecosystems for CRM. Consider the collapse of what we used to call the marketing funnel: awareness, consideration, conversion. Leaving aside that human beings have never been that linear to begin with, the Internet and social web have conspired in the past 20-odd years to upend the “funnel” model.

Instead of a single workflow that culminates in “victory” (conversion), we at Altimeter think about an “Influence Loop,” which lays out a more detailed version of the customer’s path and, more importantly, includes the experience after purchase: the point at which he or she is most literally invested in the product or service. But another fundamental difference between the funnel and the loop is this: customers can and do communicate with each other, in public, at every stage. Sellers no longer have the information advantage.


As Daniel Pink put it in his keynote at Oracle’s Customer Experience World last week (my paraphrase):

Information asymmetry is about ‘buyer beware’. Buyers have to beware because they can be ripped off. We no longer live in a world of information asymmetry; we live in a world where buyers have lots of information about sellers.

But this isn’t the only playing field that’s being leveled. Sellers now have more information about each other. They can integrate tools more easily, build products more quickly, gain access to open data. They can build ecosystems more effectively. Granted, none of this is easy, but it’s easier. And becoming easier still.

And so the dated, us-versus-them ethos of the 90s and 00s no longer applies. It’s not which cloud approach is better or will win. In fact, that old model actually damages each player’s position. Why?

If we’re fighting to beat our competitors, we’re not fighting for our customers. 

Instead, we should be asking:

  1. What is our strategy for serving customers in the cloud economy?
  2. How robust is our ecosystem?
  3. How strong are our offerings in sales, marketing, customer service, and commerce? Where they’re not strong, is improvement on our roadmap, and do we make it easy to integrate other tools?
  4. What is our data and analytics strategy? Does it promote real insight?
  5. Is our offering siloed, or does it facilitate the customer’s sightline across the customer journey, and the business?
  6. Are we spending more time on our competition than on our customers?
  7. Are we thinking past the horizon?

And the most important one:

Do we make it easier for customers to understand and serve their customers?

Let’s be honest; nobody gets a perfect score on all of these questions. Each vendor has strong marks in some categories and weaker ones in others, and it’s always changing. Some have standout infrastructure and weak tools. Others have the opposite. And, yes, we still need to make decisions in a world in which 1,000-plus marketing vendors (never mind web, commerce, service, mobile, sales) vie for our attention.

This isn’t going to get any easier, so we need to start to think about technology selection a bit differently than we did in the past. If we don’t, given the speed of development and the pace of change in general, we risk entering into an infinitely spiraling arms race.

We also need to do some soul-searching. If we really are committed to (in that overused phrase) trying to understand the customer journey, are we willing to give up some short-term budget/power/control/ease-of-use to do so? What does that mean for performance management, bonuses, culture, leadership?

My colleagues at Altimeter and I will be exploring these issues in more depth in upcoming reports. In the meantime, I’d be grateful for your thoughts; what questions should we be asking as we assess cloud vendors? What are the great examples and untold stories? What’s keeping you up at night?

* Thanks and apologies to Joni Mitchell, who I hope is recovering quickly.


SXSW15 Redux: What happens at SX spreads everywhere

Photo: Alan Cleaver, CC 2.0

First, let’s get this out of the way: every year, SXSW regulars say the festival has jumped the shark. It’s too big, there are too many panels, and they’re poorly curated. It’s impossible to get anywhere. Too many lines, too many wristbands, too few taxis, too damn many cards, pens, pins, stickers that will inevitably end up in landfill. Breakfast tacos become a temporary food group. And there’s always a contingent who mistake the festival for Spring Break and leave their trash, noise and bodily fluids everywhere.

At the same time, certain things happen at SXSW that rarely happen elsewhere: hallway, street and barbecue-line conversations that build or change businesses; serendipitous combinations of technologies; new and old friends settling into the Driskill or the Four Seasons or a corner at a party somewhere at 1:00 am to plan out the innovations that will drive next year’s technology agenda.

From the conversations I had, it was generally agreed to be a transitional year. Social media, long the darling of SX, was significantly less prominent, replaced by the maker movement, collaborative economy, IoT, privacy and surveillance, cognitive computing/AI, digital ethics, and data, data, everywhere.

Sure, people were Yik Yakking and Meerkating away, but the tone of the conversation was a bit more sober, at least among the people I interacted with. Meerkat in particular raised the spectre of Google Glass, specifically because of its privacy implications. The Beacon sensors throughout the conference logged the movements of thousands of people in an effort to better understand attendee traffic patterns and preferences, although a lot of people I spoke with were unaware that they were being tracked, a sign of cognitive dissonance if there ever was one.

I counted 21 panels that featured “privacy” in the title (121 that included it in the description) and five with “ethics” (93 in the description). “Surveillance” clocked in at five, with 38 total results. The panel on DARPA was so overfull that many, many people were turned away, while other panels (as usual) had barely enough attendees to fill the first row.

Overall, it felt as though the zeitgeist was catching up to Albert Einstein’s assertion, so many decades ago, that “it has become appallingly obvious that our technology has exceeded our humanity.” I wasn’t able to attend as many sessions as I wanted (who is?), but the ones I attended were terrific. Parry Aftab, CEO of WiredTrust, and Mat Honan, bureau chief of Buzzfeed, proposed a framework for privacy by design that seeks to embed privacy into business practices and operational processes. Not sexy, not even cool (yet), but so, so needed. In that panel, Ann Cavoukian (via video) rejected the notion of privacy as a zero-sum game: we don’t have to trade competitiveness for ethics. To me, that was a breath of fresh air.

The panel I participated in, “Emerging Issues in Digital Ethics” (hashtag: #ethicscode), was moderated by the brilliant Don Heider, Founding Dean and Professor at the School of Communication at Loyola University Chicago, and founder of the Center for Digital Ethics and Policy there. My co-panelists, Brian Abamont from State Farm (speaking on his own behalf) and Erin Reilly, Managing Director & Research Fellow at the USC Annenberg School for Communication and Journalism, covered everything from privacy to cyberbullying, Gamergate to doxxing to scraping. There was so much ground to cover, and yet the panel started to feel more like an open conversation than a transfer of “knowledge” (at least to me). The audience was as fluent with these topics as we were, or more so; we’re all still figuring it out.

A final thought: every year I insist that it’s my last SXSW, and every year I break down, pack my comfiest shoes and attend. If there’s any takeaway this year, it’s that SX continues to be a pretty good indicator of the tech zeitgeist. I’d love to see some of my data science friends go through the schedule and do an analysis of trending topics on the schedule from year to year. With all our obsession about data, wouldn’t that be an interesting benchmark to have?
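If any of those data science friends want a head start, the mechanics are simple. Here’s a minimal sketch that assumes the schedule has been scraped into a hypothetical CSV with year, title and description columns.

```python
# Minimal sketch of year-over-year keyword counts for a conference schedule.
# Assumes a hypothetical sxsw_schedule.csv with year, title and description.
import csv
from collections import Counter

KEYWORDS = ["privacy", "ethics", "surveillance", "data"]

counts = Counter()
with open("sxsw_schedule.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        text = (row["title"] + " " + row["description"]).lower()
        for keyword in KEYWORDS:
            if keyword in text:
                counts[(row["year"], keyword)] += 1

for (year, keyword), n in sorted(counts.items()):
    print(year, keyword, n)
```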

 

 


Three Implications of ‘POPI’: South Africa’s New Legislation on Data Hoarding

Photo: BuzzFarmers, cc 2.0

There probably won’t be a TLC reality show about it anytime soon, but the concept of “data hoarding” is real, and it’s about to be declared illegal in South Africa, according to an article last week by ITWeb SA. The story concerns the imminent implementation of the Protection of Personal Information (POPI) Act, which stipulates that “data may only be processed for as long as there are clear and defined business purposes to do so” (italics mine).

If you haven’t heard the term before, “data hoarding” is, according to Michiel Jonker, director of IT advisory at Grant Thornton:

“The gathering of data without a clear business reason or security strategy to protect the underlying information.”

Implication #1: Data hoarding legislation could spread to other countries

This is big news for organizations doing business in South Africa, as well as data-watchers outside the country: marketers, data scientists, strategists and anyone whose business depends on collecting, processing, using and storing data.

This means you.

Says Jonker, “We are all data hoarders. Data is hoarded in electronic and non-electronic formats and, with the emergence of the Internet of Things, machines are also creating data. People also have a tendency to multiply data by sharing it, processing it and storing it.” “The problem with data hoarding,” he says, “is it attracts ‘flies.’ As data is being referred to as the new currency, big data also attracts criminals.”

I asked Judy Selby, Partner, Baker Hostetler and an expert in data privacy law, whether legislation such as POPI could ever be adopted in the United States. She believes that it could. “Some of our privacy laws have criminal penalties, so it’s not unheard of. In the context of data hoarding, especially involving a data broker, I suspect if there’s a big privacy or security incident associated with the data, some of the more active states in this space (such as California, for example) might make a move in that direction.”

Implication #2: Data hoarding legislation and risk avoidance put pressure on data strategy

This piece of legislation gets at a particularly thorny issue for data scientists, ethicists, marketers, and really anyone interested in balancing the twin imperatives of extracting insight and fostering trust. Extracting the most useful insights, developing the most personalized services, and running the most effective and efficient campaigns and organizations all require data, and lots of it. It’s not always possible to anticipate what will be needed, so the natural impulse is to store it until it comes in handy.

But to protect privacy, and reduce what Jonker refers to as a company’s “risk surface,” we actually need to collect as little data as is practically necessary, and only for uses that we can define today. POPI lays down the law for that decision in South Africa.
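One hedged way to operationalize that principle is a periodic audit that flags any dataset lacking a documented business purpose or held past its retention date. The inventory format below is my own assumption for illustration, not anything POPI prescribes.

```python
from datetime import date

# Hypothetical data inventory; POPI does not prescribe this structure.
inventory = [
    {"name": "loyalty_signups", "purpose": "email campaigns", "retain_until": date(2016, 1, 1)},
    {"name": "old_clickstream", "purpose": "", "retain_until": date(2014, 6, 30)},
]

def flag_for_review(datasets, today=None):
    """Flag datasets with no documented purpose or past their retention date."""
    today = today or date.today()
    return [
        d["name"] for d in datasets
        if not d["purpose"] or d["retain_until"] < today
    ]

print(flag_for_review(inventory, today=date(2015, 3, 1)))  # ['old_clickstream']
```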

Implication #3: Organizations should address security and define use cases now

Organizations should look closely at the two main tenets of the POPI legislation (clear and defined business reasons, and a security strategy) for leading indicators of issues that may crop up in other geographies.

Both tenets are challenging, partly because of the potential multitude of business cases for data, and because of the many and disparate data types available. While security strategy may be the most obvious (albeit challenging) first step, we would also recommend early thinking on future uses of big data, including IoT (sensor) data.

My colleague Jessica Groopman’s research report, entitled “Customer Experience in the Internet of Things,” published today, offers excellent examples of how organizations are using IoT today, and how they may do so in the future. Reading it is a terrific first step toward envisioning how such data might be used in the enterprise.

We’ll be watching this space closely for future developments, and suggest you do the same.
