In ancient Greece,
在古希腊,
when anyone from slaves to soldiers, poets and politicians,
从奴隶到士兵,从诗人到政治家,
needed to make a big decision on life's most important questions,
都需要对人生中最重要的问题做决定,
like, "Should I get married?"
比如,我该结婚吗?
or "Should we embark on this voyage?"
这次出海我该不该去?
or "Should our army advance into this territory?"
我们该不该向那片区域进军?
they all consulted the oracle.
他们纷纷去请教先知。
So this is how it worked:
过程是这样的:
you would bring her a question and you would get on your knees,
你问她一个问题,然后跪在她面前,
and then she would go into this trance.
之后她会进入一种恍惚的状态。
It would take a couple of days,
也许持续几天,
and then eventually she would come out of it,
最终她会恢复清醒状态,
giving you her predictions as your answer.
给出她的预测,回答你的问题。
From the oracle bones of ancient China
从古代中国用骨头占卜,
to ancient Greece to Mayan calendars,
到古希腊,再到玛雅历法,
people have craved prophecy
人们祈求能得到预言,
in order to find out what's going to happen next.
从而知道未来会发生什么。
And that's because we all want to make the right decision.
因为我们都想做出正确的决定。
We don't want to miss something.
我们不想忽略什么。
The future is scary,
未来是可怕的,
so it's much nicer knowing that we can make a decision
因此若我们 在做决定时多多少少
with some assurance of the outcome.
能预知结果,会更好。
Well, we have a new oracle,
如今我们有了新的先知,
and its name is big data,
它的名字叫大数据,
or we call it "Watson" or "deep learning" or "neural net."
或者叫它“沃森”或者 “深度学习”或者“神经网络”。
And these are the kinds of questions we ask of our oracle now,
以下就是我们问这位先知的问题。
like, "What's the most efficient way to ship these phones
“要把这些手机从中国运到瑞典,
from China to Sweden?"
怎么做最高效?”
Or, "What are the odds
或者“我的孩子出生时
of my child being born with a genetic disorder?"
患遗传病的几率是多少?”
Or, "What is the sales volume we can predict for this product?"
或者“这件产品的预计销量是多少?”
I have a dog. Her name is Elle, and she hates the rain.
我养了一只狗,名叫艾尔, 她讨厌下雨。
And I have tried everything to untrain her.
我想了很多办法来帮她。
But because I have failed at this,
但是因为我失败了,
I also have to consult an oracle, called Dark Sky,
因此每次准备遛狗时,
every time before we go on a walk,
我都会求助一位先知,叫Dark Sky,
for very accurate weather predictions in the next 10 minutes.
来获得未来10分钟精准的天气预报。
She's so sweet.
小狗真可爱。
So because of all of this, our oracle is a $122 billion industry.
因此,“先知”大数据是 一项价值1220亿美元的产业。
Now, despite the size of this industry,
但尽管产业规模大,
the returns are surprisingly low.
投资回报却出奇地低。
Investing in big data is easy,
投资大数据很简单,
but using it is hard.
但利用它却很难。
Over 73 percent of big data projects aren't even profitable,
超过73%的大数据项目都不赚钱,
and I have executives coming up to me saying,
有经理来找我说,
"We're experiencing the same thing.
“我们的情况也是如此。
We invested in some big data system,
我们投资了一些大数据系统,
and our employees aren't making better decisions.
但雇员们并未因此做出更好的决策。
And they're certainly not coming up with more breakthrough ideas."
更别说提出突破性的想法了。”
So this is all really interesting to me,
我觉得这个现象很有意思,
because I'm a technology ethnographer.
因为我是一名技术人类学家。
I study and I advise companies
我研究人们使用技术的模式,
on the patterns of how people use technology,
并据此为企业提供建议,
and one of my interest areas is data.
数据是我感兴趣的领域之一。
So why is having more data not helping us make better decisions,
为什么更多的数据不能 帮我们更好的决策呢?
especially for companies who have all these resources
尤其是那些资源丰富,
to invest in these big data systems?
能投资大数据系统的公司。
Why isn't it getting any easier for them?
为什么对他们而言, 事情并未变得简单?
So, I've witnessed the struggle firsthand.
我亲眼见过这种困境。
In 2009, I started a research position with Nokia.
2009年,我跟诺基亚 开始进行一项研究。
And at the time,
在当时,
Nokia was one of the largest cell phone companies in the world,
诺基亚是全球最大的 手机生产商之一,
dominating emerging markets like China, Mexico and India --
在中国、墨西哥和印度等 新兴市场占有巨大份额,
all places where I had done a lot of research
我在上述国家进行了大量的研究,
on how low-income people use technology.
看低收入人群是如何使用技术的。
And I spent a lot of extra time in China
我在中国花了大量时间
getting to know the informal economy.
去了解当地的街头经济。
So I did things like working as a street vendor
我当过街边小贩,
selling dumplings to construction workers.
卖饺子给建筑工人。
Or I did fieldwork,
我还泡过网吧,
spending nights and days in internet cafés,
在那里连续待上几天,
hanging out with Chinese youth, so I could understand
跟中国年轻人 混在一起,来了解
how they were using games and mobile phones
他们如何玩游戏和使用手机,
and using them as they moved from the rural areas to the cities.
如何在从农村来到城市时使用。
Through all of this qualitative evidence that I was gathering,
通过搜集到的这些 高质量的例证,
I was starting to see so clearly
我开始清晰地看到
that a big change was about to happen among low-income Chinese people.
在中国低收入人群中 将发生巨大的变革。
Even though they were surrounded by advertisements for luxury products
尽管奢华产品的广告随处可见,
like fancy toilets -- who wouldn't want one? --
比如高级马桶——谁不想要?
and apartments and cars,
还有房子和车子,
through my conversations with them,
聊天过程中,
I found out that the ads that actually enticed them the most
我发现最吸引他们的广告,
were the ones for iPhones,
是iPhone的广告,
promising them this entry into this high-tech life.
因为感觉可以将他们 带入高科技生活。
And even when I was living with them in urban slums like this one,
跟他们一起住在 这样的城中村里,
I saw people investing over half of their monthly income
我看到有人花掉超过 半个月的收入
into buying a phone,
去买一部手机,
and increasingly, they were "shanzhai,"
“山寨”越来越多,
which are affordable knock-offs of iPhones and other brands.
就是苹果和其他品牌的 廉价仿冒品。
They're very usable.
它们也能用。
Does the job.
基本功能都有。
And after years of living with migrants and working with them
多年来,我跟这些外地人 一起工作和生活,
and just really doing everything that they were doing,
跟他们做着同样的事情,
I started piecing all these data points together --
我开始把很多数据联系起来,
from the things that seem random, like me selling dumplings,
从随机事件,比如卖饺子,
to the things that were more obvious,
到比较直观的东西,
like tracking how much they were spending on their cell phone bills.
比如看他们会花多少钱买手机。
And I was able to create this much more holistic picture
我更全面地了解了
of what was happening.
发生的事。
And that's when I started to realize
此时我开始意识到,
that even the poorest in China would want a smartphone,
即使是中国最穷的人, 也会想拥有一部智能手机,
and that they would do almost anything to get their hands on one.
而为此他们几乎愿意付出一切。
You have to keep in mind,
别忘了,
iPhones had just come out, it was 2009,
那是2009年, iPhone才刚刚出现,
so this was, like, eight years ago,
差不多是8年前,
and Androids had just started looking like iPhones.
而安卓手机刚开始 长得像iPhone。
And a lot of very smart and realistic people said,
很多聪明而务实的人断言,
"Those smartphones -- that's just a fad.
“这些智能手机,只会昙花一现。
Who wants to carry around these heavy things
谁会愿意拿着这么重的手机,
where batteries drain quickly and they break every time you drop them?"
电量掉得那么快,一摔就坏。”
But I had a lot of data,
但我有数据,
and I was very confident about my insights,
我对自己的见解很自信,
so I was very excited to share them with Nokia.
于是我非常兴奋地告诉诺基亚。
But Nokia was not convinced,
但是诺基亚不为所动,
because it wasn't big data.
因为我给的不是大数据。
They said, "We have millions of data points,
他们说,“我们有几百万的数据,
and we don't see any indicators of anyone wanting to buy a smartphone,
没有数据显示会 有人愿意买智能手机,
and your data set of 100, as diverse as it is, is too weak
而你的数据量只有几百, 还如此分散,毫无说服力,
for us to even take seriously."
根本不值一提。”
And I said, "Nokia, you're right.
我说,“诺基亚,你是对的。
Of course you wouldn't see this,
你当然看不到这些,
because you're sending out surveys assuming that people don't know
因为你在调查时就假定
what a smartphone is,
人们不了解智能手机,
so of course you're not going to get any data back
因此当然得不到数据来了解
about people wanting to buy a smartphone in two years.
2年之内想买智能手机的人。
Your surveys, your methods have been designed
因为你们的调查和方法,
to optimize an existing business model,
目的都是优化现有的商业模式,
and I'm looking at these emergent human dynamics
而我看到的,是前所未有的
that haven't happened yet.
人类新动向。
We're looking outside of market dynamics
我们看的是市场动态之外的东西,
so that we can get ahead of it."
因此可以领先一步。”
Well, you know what happened to Nokia?
都知道诺基亚的结局吧?
Their business fell off a cliff.
他们的生意一落千丈。
This -- this is the cost of missing something.
这就是忽略某些事情的代价。
It was unfathomable.
就是那么难以想象。
But Nokia's not alone.
而诺基亚并非个案。
I see organizations throwing out data all the time
我看到许多组织总是 对数据视而不见,
because it didn't come from a quant model
因为这些数据并非 来自某种数据模型,
or it doesn't fit in one.
或跟模型不符。
But it's not big data's fault.
大数据本身并没有错。
It's the way we use big data; it's our responsibility.
是我们使用不当,错在我们。
Big data's reputation for success
大数据的声名鹊起
comes from quantifying very specific environments,
是因为它能量化特定环境,
like electricity power grids or delivery logistics or genetic code,
比如电网、物流或者基因编码,
when we're quantifying in systems that are more or less contained.
帮我们量化一定程度上 可控的体系。
But not all systems are as neatly contained.
然而并非所有的体系 都有很好的可控性。
When you're quantifying and systems are more dynamic,
对一个动态的体系进行量化,
especially systems that involve human beings,
尤其是牵涉到人时,
forces are complex and unpredictable,
各种因素复杂多变,
and these are things that we don't know how to model so well.
有些因素并没有很好的模型。
Once you predict something about human behavior,
对人的行为进行预测时,
new factors emerge,
会出现新的因素,
because conditions are constantly changing.
因为条件是在不断变化的。
That's why it's a never-ending cycle.
因此这是个永远的循环。
You think you know something,
你以为已经懂了,
and then something unknown enters the picture.
结果新的未知情况又出现了。
And that's why just relying on big data alone
因此,仅仅依靠大数据,
increases the chance that we'll miss something,
反而会使我们更容易 忽略一些事实,
while giving us this illusion that we already know everything.
却给了我们已经掌握一切的错觉。
And what makes it really hard to see this paradox
要看清这样一个矛盾,
and even wrap our brains around it
哪怕仅仅去认真思考它,
is that we have this thing that I call the quantification bias,
也是困难重重, 原因在于我们偏爱量化,
which is the unconscious belief of valuing the measurable
比起不能量化的, 总是不自觉地相信
over the immeasurable.
能够量化的。
And we often experience this at our work.
这在工作中很常见。
Maybe we work alongside colleagues who are like this,
也许我们的同事是这样,
or even our whole entire company may be like this,
甚至整个公司都是这样,
where people become so fixated on that number,
大家都盯着数字,
that they can't see anything outside of it,
而忽略了其他东西,
even when you present them evidence right in front of their face.
即便你将证据摆在他们面前。
And this is a very appealing message,
这一点很有意思,
because there's nothing wrong with quantifying;
因为量化本身并没有什么错,
it's actually very satisfying.
甚至会让人愉悦。
I get a great sense of comfort from looking at an Excel spreadsheet,
看Excel表格时我就感觉挺好的,
even very simple ones.
哪怕表格很简单。
(Laughter)
(笑声)
It's just kind of like,
那感觉就是,
"Yes! The formula worked. It's all OK. Everything is under control."
“好!这个公式能用。 都没问题,一切尽在掌握!”
But the problem is
但问题在于,
that quantifying is addictive.
量化会让人上瘾。
And when we forget that
一旦忘记这点,
and when we don't have something to kind of keep that in check,
又没有什么纠错的机制,
it's very easy to just throw out data
就很容易舍弃
because it can't be expressed as a numerical value.
无法变成数值的信息。
It's very easy just to slip into silver-bullet thinking,
人们很容易执迷于一招鲜,
as if some simple solution existed.
好像总有简单的解决方法。
Because this is a great moment of danger for any organization,
对任何组织来说这都很要命,
because oftentimes, the future we need to predict --
因为通常我们需要预测的未来,
it isn't in that haystack,
不是这干草垛,
but it's that tornado that's bearing down on us
而是谷仓外向我们袭来的
outside of the barn.
龙卷风。
There is no greater risk
最危险的莫过于
than being blind to the unknown.
忽略未知事物。
It can cause you to make the wrong decisions.
这会让你做出错误的决定,
It can cause you to miss something big.
忽略重要的事情。
But we don't have to go down this path.
但我们并非别无选择。
It turns out that the oracle of ancient Greece
其实古希腊的先知们
holds the secret key that shows us the path forward.
已经掌握了解决问题的关键。
Now, recent geological research has shown
最近的地质研究表明,
that the Temple of Apollo, where the most famous oracle sat,
最著名的先知 所在的阿波罗神庙
was actually built over two earthquake faults.
正建在两个地震断层之间。
And these faults would release these petrochemical fumes
断层不断从地下释放出
from underneath the Earth's crust,
石油化学气体。
and the oracle literally sat right above these faults,
先知们恰好坐在这些断层上,
inhaling enormous amounts of ethylene gas, these fissures.
吸入了从断层中 逸出的大量乙烯,
(Laughter)
(笑声)
It's true.
是真的。
(Laughter)
(笑声)
It's all true, and that's what made her babble and hallucinate
没骗你们,因此她才会 产生幻觉,开始呢喃,
and go into this trance-like state.
变得神情恍惚,
She was high as a kite!
她正“飘”着呢!
(Laughter)
(笑声)
So how did anyone --
所以怎么可能——
How did anyone get any useful advice out of her
这种情况下,怎么可能
in this state?
从先知那里得到有用的建议?
Well, you see those people surrounding the oracle?
看到先知身旁的人了吗?
You see those people holding her up,
他们扶着她,
because she's, like, a little woozy?
因为她已经有点晕了。
And you see that guy on your left-hand side
你看左手边那位老兄,
holding the orange notebook?
手里拿着橙色的本子。
Well, those were the temple guides,
他们是神庙向导,
and they worked hand in hand with the oracle.
跟先知一起合作的。
When inquisitors would come and get on their knees,
当求问者跪在先知面前时,
that's when the temple guides would get to work,
神庙向导就要开始介入了,
because after they asked her questions,
求问者提问后,
they would observe their emotional state,
向导开始观察他们的精神状态,
and then they would ask them follow-up questions,
并且问进一步的问题,
like, "Why do you want to know this prophecy? Who are you?
比如,“你为什么想问这个?你是谁?
What are you going to do with this information?"
你要用这个答案来做什么?”
And then the temple guides would take this more ethnographic,
神庙向导利用这些与人更相关的
this more qualitative information,
更有实质意义的信息,
and interpret the oracle's babblings.
来对先知的呢喃进行解释。
So the oracle didn't stand alone,
所以先知并不是孤立的,
and neither should our big data systems.
大数据也不应如此。
Now to be clear,
别误会,
I'm not saying that big data systems are huffing ethylene gas,
我不是说大数据吸了乙烯,
or that they're even giving invalid predictions.
或者大数据的预测没有用。
The total opposite.
完全不是。
But what I am saying
我想说的是,
is that in the same way that the oracle needed her temple guides,
正如先知需要神庙向导们一样,
our big data systems need them, too.
大数据系统也需要协助。
They need people like ethnographers and user researchers
需要人类学家和用户研究人员,
who can gather what I call thick data.
搜集所谓的“厚数据”。
This is precious data from humans,
这是来源于人类的宝贵信息,
like stories, emotions and interactions that cannot be quantified.
比如故事、情感和交流 等不能被量化的东西。
It's the kind of data that I collected for Nokia
像我曾为诺基亚搜集的,
that comes in in the form of a very small sample size,
它们来自很小的样本量,
but delivers incredible depth of meaning.
却能传达意义重大的信息。
And what makes it so thick and meaty
而“厚数据”内涵丰富是因为
is the experience of understanding the human narrative.
其中包含了理解人类生活的过程。
And that's what helps to see what's missing in our models.
这能帮助我们看清 模型中缺失的东西。
Thick data grounds our business questions in human questions,
“厚数据”将商业问题 落实到人类生活,
and that's why integrating big and thick data
因此将大数据和厚数据相结合
forms a more complete picture.
能得到更全面的认识。
Big data is able to offer insights at scale
大数据能在数量级上提供视角,
and leverage the best of machine intelligence,
最大限度利用机器智能,
whereas thick data can help us rescue the context loss
而厚数据能补充 在利用大数据时
that comes from making big data usable,
缺失的情境信息,
and leverage the best of human intelligence.
充分利用人类智慧。
And when you actually integrate the two, that's when things get really fun,
两者结合起来时就很有意思了,
because then you're no longer just working with data
因为这样你不只是在使用
you've already collected.
搜集到的数据。
You get to also work with data that hasn't been collected.
你还能利用尚未搜集到的数据。
You get to ask questions about why:
你可能会问:
Why is this happening?
为什么会这样?
Now, when Netflix did this,
Netflix这么做之后,
they unlocked a whole new way to transform their business.
他们找到了全新的方式 来进行商业转型。
Netflix is known for their really great recommendation algorithm,
Netflix以出色的 推荐算法而闻名,
and they had this $1 million prize for anyone who could improve it.
他们设立了100万美元的奖金, 寻找可以改进它的人。
And there were winners.
有人获奖了。
But Netflix discovered the improvements were only incremental.
但Netflix发现改进太慢。
So to really find out what was going on,
为了彻底弄清原因,
they hired an ethnographer, Grant McCracken,
他们雇了一位人类学家: 格兰特·麦克拉肯,
to gather thick data insights.
来搜集分析厚数据。
And what he discovered was something that they hadn't seen initially
他发现了在一开始的数据分析中
in the quantitative data.
没发现的东西。
He discovered that people loved to binge-watch.
他发现人们喜欢连续看片。
In fact, people didn't even feel guilty about it.
事实上人们才不会内疚。
They enjoyed it.
大家乐在其中。
(Laughter)
(笑声)
So Netflix was like, "Oh. This is a new insight."
于是Netflix觉得, “噢,这是个新见解。”
So they went to their data science team,
于是他们找来数据科学团队,
and they were able to scale this thick data insight
将基于厚数据的观点
in with their quantitative data.
跟量化数据进行对比。
And once they verified it and validated it,
这一观点得到验证后,
Netflix decided to do something very simple but impactful.
Netflix决定采取 简单却有效的措施。
They said, instead of offering the same show from different genres
他们不再把同一节目 做成不同体裁,
or more of the different shows from similar users,
也不再给同一类用户 推荐不同节目,
we'll just offer more of the same show.
而是提供同一节目,
We'll make it easier for you to binge-watch.
便于连续观看。
And they didn't stop there.
不仅如此,
They did all these things
他们还想尽一切办法
to redesign their entire viewer experience,
重新规划用户体验,
to really encourage binge-watching.
引导用户连续观看。
It's why people and friends disappear for whole weekends at a time,
于是大家在周末集体消失,
catching up on shows like "Master of None."
都在追《无为大师》这样的剧。
By integrating big data and thick data, they not only improved their business,
通过结合大数据和厚数据, 他们不仅发展了业务,
but they transformed how we consume media.
还转变了人们消费媒体的方式。
And now their stocks are projected to double in the next few years.
他们的股价预计会在 未来几年内翻番。
But this isn't just about watching more videos
但这不只是关于看更多的视频,
or selling more smartphones.
或者卖更多的智能手机。
For some, integrating thick data insights into the algorithm
对某些人而言,将厚数据的观点 整合到算法中,
could mean life or death,
关乎生死,
especially for the marginalized.
尤其是被边缘化的人群。
All around the country, police departments are using big data
全国各地的警察部门都在将大数据
for predictive policing,
用于预防性警务,
to set bond amounts and sentencing recommendations
规划牢房数量, 提供量刑建议,
in ways that reinforce existing biases.
这种的方法更是强化了已有偏见。
NSA's Skynet machine learning algorithm
国安局的天网机器学习算法
has possibly aided in the deaths of thousands of civilians in Pakistan
可能间接导致了几千 巴基斯坦平民丧生,
from misreading cellular device metadata.
因为误读了他们的 蜂窝移动设备的元数据。
As all of our lives become more automated,
随着我们的生活变得更加自动化,
from automobiles to health insurance or to employment,
从汽车到健康保险到就业,
it is likely that all of us
所有人都可能
will be impacted by the quantification bias.
会受量化偏见的负面影响。
Now, the good news is that we've come a long way
不过好消息是,我们已经 有了很大进步,
from huffing ethylene gas to make predictions.
不再吸入乙烯气体, 而是真正做出预测。
We have better tools, so let's just use them better.
我们有了更好的工具, 那就让我们用好它。
Let's integrate the big data with the thick data.
让我们将大数据和 厚数据结合起来,
Let's bring our temple guides with the oracles,
为先知配上神庙向导,
and whether this work happens in companies or nonprofits
无论是在公司、非营利性机构,
or government or even in the software,
还是在政府或者软件公司,
all of it matters,
都很重要,
because that means we're collectively committed
因为这意味着我们共同承诺
to making better data,
提供更好的数据,
better algorithms, better outputs
更好的算法,更好的结果,
and better decisions.
并做出更好的决定。
This is how we'll avoid missing that something.
这样我们才不会忽略重要信息。
(Applause)
(掌声)