04 [Original Audio] Chapter 1 | The Amazing Power of Big Data and AI


Part 1: Big Data and AI Is Everywhere

00:00/19:10 

Hello, listeners of Himalaya, and welcome to this course on Artificial Intelligence, Big Data and Us, in which I will discuss the amazing power of Artificial Intelligence (or AI for short), how it works, and how it will affect all sectors of our economy and all phases of our lives. In this first episode, we kick things off by looking at AI’s unexpected trajectory.

 

When I was a school kid in the 1970s and early 1980s, researchers around the world talked about artificial intelligence. Labs at elite universities, such as MIT and Stanford, produced intriguing but very limited demonstrations. National governments, for instance in Japan, announced massive amounts of public money to accelerate AI breakthroughs. There was much talk about AI, but nowhere in my young life could I see AI actually happening. By the time the 1990s had arrived, talk about the impending AI revolution had all but stopped. AI was dead, and even conventional data processing was boring. Everything cool was happening on a new global network of communication and information: the Internet.

 

How much has changed in a quarter century! Today, data is at the core of a new AI revolution. Today’s AI is completely different in concept and structure from the 20th-century attempts. It is successful. And most importantly: it is everywhere. There is hardly a sector, hardly a domain, hardly a context in which big data analytics and AI have not been applied, and they are utilized not just by a few researchers in their ivory tower labs, but by people like you and me.

 

We may not be conscious of it, but it is there. Just think of searching online: Internet search uses AI to provide users with more context-relevant results, especially on mobile smartphones. When you type something into the search field, predictive analytics suggests auto-completed terms. So, for instance, if you type “compu”, the system automatically suggests “computer” as a search term. It works well quite often. Sometimes the suggestions are surprising, but they reflect what most people have been searching for.
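
To make that concrete, here is a minimal Python sketch of how prefix-based auto-completion can be ranked by past search frequency. The query list and counts are invented for illustration; real search engines combine far richer signals, such as context, location, and freshness.

# Minimal sketch of prefix-based query suggestion (illustrative only).
query_counts = {
    "computer": 9200,
    "computer science": 4100,
    "compute engine": 870,
    "composer": 650,
}

def suggest(prefix, k=3):
    """Return the k most frequent past queries starting with the prefix."""
    candidates = [(q, n) for q, n in query_counts.items() if q.startswith(prefix)]
    candidates.sort(key=lambda item: item[1], reverse=True)
    return [q for q, _ in candidates[:k]]

print(suggest("compu"))  # ['computer', 'computer science', 'compute engine']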

 

AI also helps mapping and navigation applications offer us a seamless image of the world that we can explore down to the street level, even in three dimensions. AI is the key to the machine translation and voice recognition that are rapidly lowering language barriers around the world. Just as an example: for many years, Google relied on statistical inferences gleaned from data for machine translation; it meant that a word in one language would be translated into the statistically most likely word in the other language. That worked okay. But more recently they switched to AI-based deep machine learning and increased comprehension and accuracy very substantially. Machine-assisted driving as well as self-driving cars are built on elaborate models of data-driven machine learning; they would be impossible without AI. Or take medicine and health care: not only in the context of the SARS-CoV-2 pandemic, but more generally with gene sequencing, protein folding, generative chemistry, semantic access to research, and global research collaboration, AI is crucial at every twist and present at every turn. And these are just a few examples.
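
As a small illustration of the word-by-word statistical idea just described, here is a toy Python sketch that maps each source word to its statistically most likely translation. The probability table is invented; real statistical translation systems learned phrase tables and language models from millions of sentence pairs.

# Toy sketch of word-by-word statistical translation (illustrative only).
translation_probs = {
    "haus": {"house": 0.62, "home": 0.30, "building": 0.08},
    "grosses": {"big": 0.55, "large": 0.40, "great": 0.05},
}

def translate_word(word):
    options = translation_probs.get(word.lower())
    if options is None:
        return word  # pass unknown words through unchanged
    return max(options, key=options.get)  # pick the most likely translation

print(" ".join(translate_word(w) for w in ["grosses", "Haus"]))  # big house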

 

Five decades ago, AI was a pipedream, nowhere to be found. But today, Big Data and the new AI are everywhere. And we may not realize it, but in so many ways we are all the beneficiaries of this development.

 

 

Part 2: Why Did Big Data and AI Happen Now?

04:50/19:10 

So if AI has been around for decades, why are we talking about it today? Why is it only becoming popular now? It’s because we changed how we do AI.

 

In the old days of AI in the 20th century, the core idea was to implant into computers general rules and abstract concepts, with which the computer would then be able to reason. The heavy mental lifting, so to speak, would be done by humans: experts would distil and summarize the knowledge of their domain in a few dozen, or a few hundred such rules, which then would be entered into the computer. So for example, a doctor might enter a rule into an AI system for medical diagnostics that states: if sudden chest pain, then likelihood of heart attack.
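
To show what such a rule might look like in code, here is a minimal Python sketch of the rule-based approach, using the chest-pain example. The rules and symptom names are made up for illustration; classical expert systems chained hundreds of hand-written rules like these.

# Minimal sketch of a rule-based diagnostic system (illustrative only).
rules = [
    ({"sudden chest pain"}, "possible heart attack"),
    ({"fever", "cough"}, "possible flu"),
]

def diagnose(symptoms):
    """Fire every rule whose conditions are all present among the symptoms."""
    return [conclusion for conditions, conclusion in rules
            if conditions <= symptoms]

print(diagnose({"sudden chest pain", "sweating"}))  # ['possible heart attack']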

 

Two problems very quickly developed. First, it is difficult to represent complex domain knowledge in clear and concise rules that do not logically contradict each other. What, to stay with our example, exactly are “chest pains”, and how strong do they need to be? It turns out that even seemingly straightforward expert knowledge is far less clear and simple than it may seem at first glance. Simply sorting and writing down such domain expertise in concise, human-readable terms (we are not even talking about computer code yet) already entails a lot of work.

 

Second, pretty much any domain knowledge rests on a huge amount of general knowledge and insights that are very obvious to humans but would also need to be represented through clear rules. For example, how can a system know that yesterday came before tomorrow, or that my mother’s child is my sibling? Large projects to capture general knowledge in such rules were undertaken, but they failed. Such general knowledge is not only vast, but also amorphous and sometimes even outright contradictory. Humans have developed pretty good cognitive mechanisms to deal with such general knowledge, but reducing it to simple rules is practically impossible.

 

Taken together, these two obstacles ended classical rule-based AI.

 

Since at least the 1970s, an alternative way to capture knowledge has existed; some of the foundational work had been undertaken, and some of the early tools had been developed. So-called neural networks are the most prominent example of this category. The core idea is to let computers rather than humans do the heavy lifting of knowledge extraction, by letting them learn from data.
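
Here is a minimal Python sketch of that core idea: a single artificial neuron (a perceptron, the simplest building block of a neural network) adjusts its weights from labeled examples instead of being given hand-written rules. The tiny dataset is invented; it merely teaches the neuron the logical AND of two inputs.

# Minimal sketch of learning from data with one artificial neuron.
training_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights = [0.0, 0.0]
bias = 0.0
learning_rate = 0.1

def predict(x):
    s = weights[0] * x[0] + weights[1] * x[1] + bias
    return 1 if s > 0 else 0

for _ in range(20):                      # go over the examples a few times
    for x, target in training_data:
        error = target - predict(x)      # how wrong was the neuron?
        weights[0] += learning_rate * error * x[0]
        weights[1] += learning_rate * error * x[1]
        bias += learning_rate * error

print([predict(x) for x, _ in training_data])  # [0, 0, 0, 1]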

 

The idea was hugely promising, and the mathematical models invented were quite suitable. Unfortunately, the technology wasn’t. These were the 1970s. Such machine learning requires both enormous amounts of training data and a huge amount of processing power. Neither was available then – at least not at reasonable cost.

 

But around the turn of the 21st century, our ability to collect data and to process it had increased dramatically. Just think about it: a single average smartphone today has more computing power than the NASA computers that calculated the trajectories for the first Moon landing in 1969! So when humanity’s capability to gather data and to compute on that data reached the inflection point of availability at relatively low cost, data-driven machine learning was dusted off, rejigged, and put into practice. This is what we call Big Data and AI today. And it continues to improve by leaps and bounds.

 

Intriguingly, the amazing progress we have witnessed in the past decade or so is also the result of a technical hurdle. Since roughly the 1960s, raw computing power had doubled about every eighteen months. This observation has been called Moore’s Law, named after Gordon Moore, the co-founder of chipmaker Intel, who made the observation in a brief paper almost six decades ago. But beginning around the year 2000, the doubling of processing power began to slow, and computer makers started to worry.

 

In response, chip designers started adding additional processing units. The idea was that if one can’t increase the speed of a single processing unit, adding more processing units will at least partially make up for it. That way, multiple tasks can be processed in parallel. Think of a large garden that needs to be tended: you can have multiple gardeners each tend to smaller areas rather than a single gardener tending the entire lot. But this only works if tasks can indeed easily be broken up. It doesn’t work well when assembling a car, in which one task builds on the other.

 

Graphics processing is one task that is amenable to parallelism (as it is called), because every pixel on a computer screen can be calculated separately from the others. And so graphics processors have come on the market with dozens and even hundreds of processing units that can work in parallel. This gave us amazing computer graphics at a very affordable price, and many kids and adults are playing amazing computer games as a result.
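
As a small illustration of this kind of parallelism, here is a Python sketch that splits a per-pixel calculation across several worker processes, much like the gardeners sharing the garden. The per-pixel formula is a made-up placeholder; real graphics and machine-learning workloads run thousands of such independent calculations on GPU cores.

# Minimal sketch of parallel per-pixel computation (illustrative only).
from concurrent.futures import ProcessPoolExecutor

WIDTH, HEIGHT = 640, 480

def pixel_value(index):
    """Stand-in for an independent per-pixel calculation."""
    x, y = index % WIDTH, index // WIDTH
    return (x * y) % 256

if __name__ == "__main__":
    indices = range(WIDTH * HEIGHT)
    with ProcessPoolExecutor() as pool:   # several "gardeners" working at once
        image = list(pool.map(pixel_value, indices, chunksize=10_000))
    print(len(image), image[:5])          # 307200 [0, 0, 0, 0, 0]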

 

But importantly for our context, researchers discovered ways to repurpose strongly parallel graphics processors for machine learning. As it turns out, classical neural networks, deep machine learning, and a host of similar techniques for machines to learn from data run blindingly fast on these heavily parallel computing units. Add to this mix the rapid growth of so-called cloud computing, i.e. huge data centres offering scalable computing power at commodity prices, and we have one of the necessary ingredients for the rise of AI.

 

In addition to computing power, we also need data to learn from. And, fortunately, that data has been forthcoming as well. In 2020, an estimated 44 zettabytes of data existed in digital storage around the world; that is 44 followed by 21 zeros, an enormous amount. In part, this is because sensors that collect data have gotten cheap, and so has data storage. That means, from a cost-benefit perspective, when costs plummet, even as yet unclear economic benefits are sufficient reason to keep data stored rather than get rid of it. But arguably the biggest reason for the explosion of data being collected in the world is the availability of small and relatively affordable computers comprising powerful sensors, abundant processing capabilities, and sufficient storage. Think not just of PCs, but of tablets, smartphones, and smartwatches, as well as cars, planes, trains, and countless machines on factory floors, all connected by a global digital network, the Internet, that enables rich and diverse flows of data among all of these devices.

 

With this technical enablement thanks to digital technologies in place, the data-driven approach to AI that was conceptually sketched out in the 1970s and 1980s finally took off and, propelled by quick successes, swiftly found its footing. Attracting excellent talent around the world, research continued to make progress, evolving and tuning the data-driven concepts and reaching new heights in machine learning. That is where we are today: a remarkable vantage point from which to celebrate our successes, but also to look into the future.

 

 

Part 3: What Might the Future Hold?

14:49/19:10 

So, I have explained what AI can do and how an idea that has been around for decades was finally able to take off and change the world. But where does this lead us? Have you ever wondered what the age of AI will bring? Me too!

 

But we need to be careful. Every peek into the future is in danger of being overtaken by reality. A hundred years ago, futurists envisioned faster cars, trains, and planes, and powerful new machines, but few of them foresaw the general-purpose computers that are ubiquitous today. We humans are good at predicting the continuation of trends we see, but not at imagining what does not yet exist. That is what makes venturing into future-gazing so difficult, challenging, and dangerous.

 

And yet, in order to know how much we should focus on Big Data and AI, letting our gaze wander into tomorrow is necessary. Otherwise, we fall into the trap of being utterly unprepared. We may be quite wrong in predicting what will happen in ten years, but pretty good at forecasting what will happen in three. That way, our short- and medium-term forecasts can inform our decisions in the present. So here we go with our big-picture view of the coming years:

 

With data-driven AI, the availability of even more, and more valuable, data in the coming years will surely improve the ability of machines to learn from it. Already, self-driving cars are very good, as we shall see in more detail in a future episode. But looking into a future of self-driving cars isn’t about cars; it’s far more about better, smarter, and more sustainable ways of mobility. Data-driven AI is about empowerment: with self-driving cars, people who cannot drive can still get around. Similarly, the future of machine translation and voice recognition isn’t primarily about the quality of translating languages, but about breaking down language hurdles that have plagued access to knowledge (and thus informational power) for centuries. It is a tool, for example, for those who do not speak English to access huge treasure troves of insight and expertise, and a great leveller that will also diminish the relative might English native speakers derive from the simple fact that they speak English so well. As we break down language barriers that have held us back, we enhance the power and increase the opportunities of those who speak other languages. Here, too, data-driven AI empowers.

 

Much has been written about the shift in power in the world, the end of a unipolar world led by the US, and the advent of a multi-polar world, with China and other nations playing crucial roles. Data-driven AI will only accelerate this process. It is an engine that can propel China forward and hold other regions, like Europe, back. It will almost certainly further reduce America’s superpower status, and make the world a much more diverse and, if you ask me, interesting place. It will be intriguing to watch the future as it unfolds. Come along and join me on a path to discover AI, and us!
