There Was a Time before Mathematica…
Stephen Wolfram, June 6, 2013
http://blog.stephenwolfram.com/2013/06/there-was-a-time-before-mathematica/

In a few weeks it’ll be 25 years ago: June 23, 1988—the day Mathematica was launched.

Late the night before we were still duplicating floppy disks and stuffing product boxes. But at noon on June 23 there I was at a conference center in Santa Clara starting up Mathematica in public for the first time:

Mathematica v1.0 on Macintosh

(Yes, that was the original startup screen, and yes, Mathematica 1.0 ran on Macs and various Unix workstation computers; PCs weren’t yet powerful enough.)

People were pretty excited to see what Mathematica could do. And there were pretty nice speeches about the promise of Mathematica from a spectrum of computer industry leaders, including Steve Jobs (then at NeXT), who was kind enough to come even though he hadn’t appeared in public for a while. And someone at the event had the foresight to get all the speakers to sign a copy of the book, which had just gone on sale that day at bookstores all over the country:

Signatures of speakers at release of Mathematica v1.0

So much has happened with Mathematica in the quarter century since then. What began with Mathematica 1.0 has turned into the vast system that is Mathematica today. And as I look at the 25th Anniversary Scrapbook, it makes me proud to see how many contributions Mathematica has made to invention, discovery and education:

The Mathematica Story: A Scrapbook

But to me what’s perhaps most satisfying is how the fundamental principles on which I built Mathematica have stood the test of time. And how the core ideas and language that were in Mathematica 1.0 persist today (and yes, most Mathematica 1.0 code will still run unchanged today).

But, OK, where did Mathematica come from? How did it come to be the way it is? It’s a long story, really. Deeply entwined with my own personal story. But particularly as I look to the future, I find it interesting to understand how things have evolved from all that history.

Perhaps the first faint glimmering of an orientation toward something like Mathematica came when I was about 6 years old—and realized that I could “automate” those tedious addition sums I was being given, by creating an “addition slide rule” out of two rulers. I never liked calculational math, and was never good at it. But starting around the age of 10, I became increasingly interested in physics—and doing physics required doing math.

Electronic calculators arrived on the scene when I was 12—and I immediately became an enthusiast. And around the same time, I started using my first computer—an object the size of a large desk, with 8 kilowords of 18-bit memory, programmed mostly in assembler using paper tape. I tried doing physics with it, to no great success. But by the time I was 16, I had published a few physics papers, left high school, and was working at a British government lab. “Real” theoretical physicists basically didn’t use computers in those days. But I did. Alternating between an HP desk calculator (with a plotter!) and an IBM mainframe programmed in Fortran.

I was basically just doing numerics, though. But in the physics I wanted to do, there was all sorts of algebra. And not just a little algebra. Huge amounts. Expressions from Feynman diagrams with hundreds or thousands of terms, all of which had to be precisely right if one was going to get the right answer.

I wondered what to do. I imagined spending my life chasing minus signs and factors of 2. But then I started thinking about using a computer to help. And right then someone told me that other people had had that idea too. There were three programs that I found out about—all as it turned out started some 14 years earlier from a single conversation at CERN in 1962: Reduce (written in LISP), Ashmedai (written in Fortran) and Schoonschip (written in CDC 6000 assembler).

The programs were specialized, and it wasn’t clear how many people other than their authors had ever used them seriously. They were pretty clunky to use: typically you’d submit a deck of cards, and then some time later get back a result—or more often a cryptic error message. But I managed to start doing physics with them.

Then in the summer of 1977 I discovered the ARPANET, or what’s now the internet. There were only 256 hosts on it back then. And @O 236 went to an open computer at MIT that ran a program called Macsyma, which did algebra and could be used interactively. I was amazed so few people used it. But it wasn’t long before I was spending most of my days on it. I developed a certain way of working: going back and forth with the machine, trying things out and seeing what happened. And routinely doing weird things like enumerating different algebraic forms for an integral, then just “experimentally” seeing which differentiated correctly.

My physics papers started containing all sorts of amazing formulas. And not imagining that I could be using a computer, people started thinking that I must be some kind of great human algebraic calculator. I got more and more ambitious, trying to do more and more with Macsyma. Pretty soon I think I was its largest user. But sometime in 1979 I hit the edge; I’d outgrown it.

And then it was November 1979. I was 20 years old, and I’d just gotten my PhD in physics. I was spending a few weeks at CERN, planning my future in (as I believed) physics. And one thing I concluded was that to do physics well, I’d need something better than Macsyma. And after a little while I decided that the only way I’d really have a chance to get what I wanted was if I built it myself.

And so it was that I embarked on what would become SMP (the “Symbolic Manipulation Program”). I had a pretty broad knowledge of other computer languages of the time, both the “ordinary” ALGOL-like procedural ones, and ones like LISP and APL. At first as I sketched out SMP, my designs looked a lot like what I’d seen in those languages. But gradually, as I understood more about how different SMP had to be, I started just trying to invent everything myself.

I think I had some pretty good ideas. And actually even some of my early SMP design documents have a remarkably Mathematica-like flavor to them:

Early SMP design documents

Looking back at its documentation, SMP was quite an impressive system, especially given that I was only 20 years old when I started designing it. But needless to say, not every idea in SMP was good. And as a long-time connoisseur of language design, I can’t resist at the bottom of this post mentioning a few of my “favorite” mistakes.

Even in my early designs, SMP was a big system. But for whatever reason, I didn’t find that at all daunting. I just wanted to go ahead and implement it. I wanted to make sure I did everything as well as possible. And I remember thinking: “I don’t officially know computer science; I’d better learn it”. So I went to the bookstore, and bought every book I could find on computer science—the whole half shelf of them. And proceeded to read them all.

I was working at Caltech back then. And I invited everyone I could find from around the world who’d worked on any related system to come give a talk. I put together a little “working group” at Caltech—which for a while included Richard Feynman. And I started recruiting people from around the campus to work on the “SMP Project”.

A big early decision was what language SMP should be written in. Macsyma was written in LISP, and lots of people said LISP was the only possibility. But a young physics graduate student named Rob Pike convinced me that C was the “language of the future”, and the right choice. (Rob went on to do all sorts of things, like invent the Go language.) And so it was that early in 1980, the first lines of C code for SMP were written.

The group that worked on SMP was an interesting one. My first recruit was Chris Cole, who’d worked at IBM and become an APL enthusiast, and went on to found a rather successful company called Peregrine Systems. Then there were students with a variety of different skills, and a programming-enthusiast professor who’d been a collaborator of mine on some physics papers. There was some eccentricity along the way, of course. Like the person who wrote very efficient code, all on one line, with functions colorfully named so their combinations would read as little jokes. Or the quite brilliant undergraduate who worked so hard on the project that he failed all his classes, then promised he wouldn’t touch a computer—but was soon found dictating code to someone else.

I wrote lots of code for SMP myself (about 1000 lines/day). I did the design. And I wrote most of the documentation. I’d never managed a large project before. But somehow that part never seemed very difficult. And sure enough, by June 1981, SMP Version 1 was running—and even looking a bit like Mathematica:

Output from SMP

For its time, SMP was a very big software system (though its executable was just under a megabyte). Its original purpose was to do mathematical computation. But along the way I realized that even to do that well, I had to create a whole, rather general, symbolic language. I suppose I saw it as being a bit like physics—but instead of dealing with elementary particles, I was trying to find the elementary components of computation. I developed a kind of aesthetic: always try to pack the largest capability into the smallest number of primitives. Sometimes I would puzzle for weeks about how to do something—but in the end I’d come up with a design, then implement it.

I understood the idea that everything could be represented by symbolic expressions. Although the whole business of symbolically indexed lists prevented SMP from having the notion of “expression heads” that’s so clean in Mathematica. And there was definitely some funkiness in the internal implementation of symbolic expressions—most notably bizarre ideas about storing all numbers in floating point. (Tini Veltman, author of Schoonschip, and later winner of a physics Nobel Prize, had told me that storing numbers in floating point was one of the best decisions he ever made, because FPUs were so much faster at arithmetic than ALUs.)

Before SMP, I’d written lots of code for systems like Macsyma, and I’d realized that something I was always trying to do was to say “if I have an expression that looks like this, I want to transform it into one that looks like this”. So in designing SMP, transformation rules for families of symbolic expressions represented by patterns became one of the central ideas. It wasn’t nearly as clean as in Mathematica, and there were definitely some funky and far-out ideas. But a lot of the core elements were already there.

And in the end, the table of contents from the SMP Version 1.0 documentation from 1981 had a fair degree of modernity:

Table of contents from SMP v1.0

Yes, “graphical output” is relegated to a small section, alongside “memory management”. And there are the charming “programming impasses” (i.e. system hangs), as well as “statistical expression generation” (i.e. making random expressions). But “parallel processing” is already there, along with “program construction” (i.e. code generation). (SMP even had a way of creating C code, compiling it, and, very scarily, dynamically linking it into the running SMP executable.) And there were lots of mathematical functions, and mathematical operations—though vastly less powerful than in Mathematica.

But, OK. So SMP 1.0 was running. What should be done with it? It was pretty clear there were lots of people who would find it useful. It only ran on quite big computers—so-called “minicomputers”, like the VAX, that was the size of several large refrigerators, and cost a few hundred thousand dollars. But still, I knew there were plenty of research and engineering organizations that had such machines.

I really didn’t know anything about companies or business at the time. But I did understand that it cost money to pay people to work on SMP, and it seemed pretty obvious that a good way to get that money was to sell copies of SMP. My first idea was to go to what would now be called the “technology transfer office” at Caltech, and see if they could help. At the time, the office essentially consisted of one pleasant old chap. But after a few attempts, it became clear he really didn’t know what to do. I asked him how this could be, given that I assumed similar things must come up all the time at Caltech. “Well”, he said, “the thing is that faculty members mostly just go off and start companies themselves, so we never get involved”. “Oh”, I said, “can I do that?”. And he leafed through the bylaws of the university and said: “Software is copyrightable, the university doesn’t claim ownership of copyrights—so, yes, you can”.

And so off I went to start a company. But it wasn’t as simple as that. Because a little while later the university administration suddenly decided that, no, it wasn’t OK. It got very weird—and scurrilous (“give me a cut, and I’ll sign off on this”, etc.). Richard Feynman and Murray Gell-Mann interceded on my behalf. The president of the university didn’t seem to know what to do. And for a while everything was completely stuck. But eventually we agreed that the university would license whatever rights they might have—even though they were (very foolishly, as it later turned out when they tried to recruit computer science faculty) changing their bylaws about software.

As it happened there was one “last problem” though, raised by the then-provost of the university. He claimed that having a license in place between the university and the company created a conflict of interest if I worked at the university and owned part of the company. “OK”, I said, “that’s easy to resolve: I’ll quit the university”. That seemed to come as a big surprise. But quit I did, and moved to the Institute for Advanced Study in Princeton, where, as the then-director pointed out, they’d “given away the computer” when John von Neumann died, so they couldn’t really be too worried about intellectual property.

For years, I’d wondered what had actually been going on at Caltech. And as it happens, just a couple of weeks ago, I agreed to visit Caltech again (to get a “distinguished alumnus award”), and having lunch at the faculty club there—I discovered that at the next table was none other than the former provost of Caltech, now about to turn 95. I was very impressed at his immediate and deep recall of what he called “the Wolfram Affair” (was he “warned”?), and the conversation we had finally explained things a bit better.

Frankly, it was more bizarre than I could have possibly imagined. The story in a sense began in the 1930s, when Arnold Beckman was at Caltech, invented the pH meter, and left to found Beckman Instruments. By 1981, Beckman was a major donor to Caltech, and the chairman of its board of trustees. Meanwhile, the chairman of its biology department (Lee Hood) was inventing the gene sequencer. He’s told me he tried many times to interest Beckman Instruments in it, but failed, and so started his own company (Applied Biosystems), which became very successful. At some moment, I’m told, Arnold Beckman got upset, and told the administration that they needed to “stop IP walking off campus”. Well, it turned out that the only thing of relevance happening on campus right then was none other than my SMP project. Which the then-provost said he thought he had a duty to “deal with”. (Well, he was also a chemist, who Feynman and Gell-Mann, as physicists, claimed had a “thing about physicists”, etc.)

But notwithstanding this whole adventure, the company that I named Computer Mathematics Corporation got started. At the time, I still thought of myself as a young academic, and didn’t imagine that I’d know how to run a company. So I brought in a CEO, who happened to be about twice my age. And at the behest of the CEO and some venture capitalists, the company arranged to merge with a startup that was doing what they thought was going to be really hot artificial intelligence R&D.

Meanwhile, SMP began to be sold under the banner of “mathematics by computer”:

Mathematics by Computer

There were horrible missteps. CEO: “Let’s build a workstation computer to run SMP”; me: “No, we’re a software company, and I’ve seen this Stanford University Network (SUN) system that’s going to be better than anything we can build”. And then there were the charmingly misguided agency-created ads:

Agency-created ads for SMP

And pretty soon I decided the whole thing was too frustrating. SMP remained something of a cash cow, and although the CEO wasn’t good at making money, he was good at raising it, going through a dizzying number of investment rounds—until there was finally an undistinguished IPO many years later.

I was meanwhile having a terrific time doing basic science, and discovering things that laid the foundations for A New Kind of Science. And in fact SMP turned out to be a crucial precursor to what I did. Because it was my success in inventing computational primitives for the language of SMP that got me thinking about inventing computational primitives for nature—and building a science from studying the consequences of those primitives.

You might ask what happened to SMP. It continued to be sold until sometime after Mathematica was released. None of its code was ever used for Mathematica. But occasionally I used to start it up, just to see how it “felt” compared to Mathematica. As time went by, it became harder to find computers that would run SMP. And perhaps 15 years ago, the last computer we had that could run SMP stopped working.

Well, I thought, I’d always been sent a personal copy of the SMP source code—though I hadn’t looked at it for ages. So now why not just recompile it on a modern system? But then I remembered: I’d had this “great” idea that we should keep the source code encrypted. But what was the key? I asked everyone I could think of. But nobody remembered.

It’s been years now, and I’d really like to see SMP run again. So here’s a challenge. This is the source for a C program encrypted like the SMP source code. Actually, it’s the source for the program that did the encryption: a version of the circa-1981 Unix crypt utility, “cleverly” modified by changing parameters etc. Can someone break the encryption? And finally free SMP from the strange digital time safe in which it’s been locked for so long. (Here’s what Wolfram|Alpha Pro has to say if one just uploads this raw file)

Wolfram|Alpha Pro results on C program encrypted like the SMP source

But back to the main story. I stopped working on SMP in 1983, and began alternating between basic science, software projects, and my (wonderfully educational) “hobby” of doing technology and strategy consulting. I used SMP a bit, but mostly I ended up writing lots and lots of C code, usually gluing together algorithms and graphics and interfaces.

The science that I’d started was going very well—and it was clear that there were lots of important things to do. But instead of trying to do it all myself, I decided I should try to get other people involved. And as part of that, I resolved to start a research institute—and got what amounted to bids from different universities for it. The University of Illinois was the winner, and so in August 1986 off I went there to start the Center for Complex Systems Research.

But by this point I was already getting concerned that my scheme of “other people doing the science” wasn’t so good. And within just a few weeks of arriving in Illinois I’d come up with plan B: build the best tools I could, and the best personal environment I could, and then try to do as much science as I could myself. And since I was pretty well plugged into the computer industry, I knew that powerful software systems would soon be able to run on the zillions of personal computers that were starting to appear. So I knew that if I could build something good, there’d be a good market for it, that would support an interesting company and environment.

And so it was that late in August 1986, I decided to try to build my ultimate computation system—that could do all the computations I wanted, or could imagine I would ever want.

And of course the result was Mathematica.

I knew a lot about what to do (and not do) from SMP and my other software experiences. But it was refreshing to be able to start from scratch, just trying to get the design right, without prior constraints. In SMP, algebraic computation had been the central goal. But in Mathematica, I wanted to cover lots of other areas too—numerics, graphics, programming, interfaces, whatever. I thought a lot about the foundations for the system, wondering for example whether things like the cellular automata I’d studied in my basic science could be relevant. But I just kept on coming back to the basic paradigm I’d already developed for SMP. Symbolic expressions and transformations for them seemed exactly right as a high-level, yet general, representation for computation.

If it hadn’t been for SMP, I would certainly have made a lot of mistakes. But SMP pretty much showed me what was important and what was not, and where the issues were. Looking through my archives today, I can see the painstaking process of puzzling through problems that I knew from SMP. And one by one coming up with solutions.

Meanwhile, just as for SMP, I’d assembled a team, and started the actual implementation of Mathematica. I’d also started a company—this time with me as CEO. Every day I’d write lots of code. (And to my chagrin, quite a bit of that code is still running in Mathematica today, especially in the pattern matcher and evaluator.) But my biggest focus was design. And following a practice I’d started with SMP, I wrote documentation as I developed the design. I figured if I couldn’t explain something clearly in documentation, nobody was ever going to understand it, and it probably wasn’t designed right. And once something was in the documentation, we knew both what to implement, and why we were doing it.

The first code for Mathematica was written in October 1986. And by the middle of 1987 Mathematica was beginning to come to life. I’d decided that the documentation should be published as a book, and hundreds of pages were already written. And I estimated that Mathematica 1.0 would be ready by April 1988.

My original plan for our company was to concentrate on R&D, and to distribute Mathematica primarily through computer manufacturers. Steve Jobs was the first to take Mathematica on, making a deal to bundle it with every one of his as-yet-unreleased NeXT computers. Deals with Sun, Silicon Graphics, IBM and a sequence of other companies followed. We started sending out a few beta copies of Mathematica. And—even though this was long before the web—word of its existence began to spread. Some media coverage started up too (I still like that kind of ice cream):

Media coverage

Sometime in the spring of 1988, we officially set June 23 as the release date for Mathematica (without Wolfram|Alpha, I didn’t know it was Alan Turing’s birthday, etc.). There was a lot to get ready. In those days releasing software didn’t just involve flipping a switch. Like I remember we were right down to the wire in getting The Mathematica Book printed. So I flew to Canada with a hard disk and personally babysat a phototypesetting machine for a long weekend, handing the box of film it produced to a person who met me at the airport in Boston and rushed it to the printer. But despite adventures like that, shortly before June 23 off were mailed some mysterious invitations:

1988 Launch Party Invitation

And at noon on June 23 the room had filled, and we were ready to launch Mathematica into the world.

Mathematica v1.0 box

It’s been a great 25 years since then. The foundations that we laid in Mathematica 1.0—greatly informed by my earlier experiences—have proved incredibly robust, and we’ve been able to just build and build on them. My “plan B” of developing Mathematica, then using it to do science, worked out just great, and led to A New Kind of Science. And from Mathematica, we’ve been able to build a great company, as well as build things like Wolfram|Alpha. And over the course of 25 years, we’ve had the pleasure and privilege of seeing Mathematica contribute in all sorts of ways to many things in the world.

Addendum: Lessons from SMP

What was SMP like? Here are a few examples of SMP programs that I wrote for the SMP documentation:

SMP programs written for documentation

In some ways these look quite similar to Mathematica programs—complete with [...] for functions, {...} for lists and -> for rules. But somehow the readability that’s a hallmark of Mathematica isn’t there, and instead the SMP programs seem quite cryptic and obscure.

One of the most obvious problems is that SMP code is littered with $ and % characters—appearing respectively as prefixes for pattern and local variables. In SMP, I hadn’t had the Mathematica idea of separating pattern constructs (such as _) from names (such as x). And I thought it was important to emphasize which variables were local—but didn’t have a subtle cue like color to do it with.

In SMP I’d already had the (good) idea of distinguishing immediate (=) and delayed (:=) assignment. But in a nod to languages like ALGOL, I indicated them by the rather obscure : and :: instead. (For rules, -> was the immediate form, as it is in Mathematica, while --> was the analog of :> and S[...] was the analog of /.)
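For readers who have not met the immediate/delayed distinction, here is a minimal sketch of it in Python. It is only an analogy, not how SMP or Mathematica implement assignment, and the names `counter`, `immediate` and `delayed` are made up for the illustration.

```python
import itertools

# A source of changing values, standing in for any expression whose
# value depends on when it is evaluated.
counter = itertools.count(1)

# Immediate assignment (= in Mathematica, : in SMP):
# the right-hand side is evaluated once, at definition time.
immediate = next(counter)

# Delayed assignment (:= in Mathematica, :: in SMP):
# the right-hand side is kept unevaluated and re-evaluated on every use.
def delayed():
    return next(counter)

immediate_reads = [immediate, immediate, immediate]
delayed_reads = [delayed(), delayed(), delayed()]
```

Reading `immediate` always gives the value captured at definition time, while each call to `delayed()` evaluates the right-hand side afresh.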

In SMP, just like in Mathematica, I indicated built-in functions with capital letters (at the time it was a fairly new thing to distinguish upper and lowercase at all on a computer). But while Mathematica typically uses English words for function names, SMP used short—and often cryptic—abbreviations. When I was working on SMP, I was quite taken with the design of Unix, and wanted to emulate its practice of having short function names. That might have been OK if SMP had just a few functions. But with hundreds of functions with names like Ps, Mei and Uspb things began to get pretty unreadable. Of course, back then, there was another issue: lots of users couldn’t type quickly—so that provided a motivation to have short function names.

SMP programs written for documentation

It’s interesting to look at the SMP documentation today. SMP had plenty of good ideas, most of which I used again in Mathematica. But it also had some quite bad ideas, which happily aren’t part of Mathematica. One example of a bad idea (one that even sounds bad as soon as one hears it) is “chameleonic symbols”: symbols that change their name whenever they’re used. (These were an attempt at localizing things like dummy variables, a bit like an over-automated form of Module.)

There were some much more subtle mistakes too. Like here’s one that in a sense came from trying to go too far in unifying the system. Like Mathematica, SMP had a notion of lists, like {a,b,c}. It also had functions, like f[x]. And in my effort to achieve the maximum possible unification, I thought that perhaps one could combine the notion of lists and functions.

Let’s say one has a list v={a,b,c}. (In SMP assignment was done with :, so this would have been written v:{a,b,c}.) Then for example in SMP v[2] would extract the second element in the list. But this notation looks a lot like asking for the value of a function v when its argument is 2. And this gave me the idea that perhaps one could generalize lists—to have not just integer-indexed elements, but elements with arbitrary symbolic indices.

In SMP, pattern variables (x_ in Mathematica) were written $x. So when one defined a function f[$x] : $x^2 one could imagine that this was just defining f itself to have a value that was a symbolically indexed list: {[$x]: $x^2}. If you wanted to find out how a function was defined, you just typed its name—like f. And the value that came back would be the symbolically indexed list that represented the definition.

An ordinary vector-type list could be thought of as something like {[1]:a, [2]:b, [3]:c}. And one could mix in symbolic indices: {[1]: 1, [$x]:$x f[$x-1]}. There was also a certain unification with part numbering in general symbolic expressions. And at some level it all seemed rather nice. And to describe my unified concept of functions and lists, I called the f in f[x] a “projection”, and x a “filter”. (There were jokes about lists of definitions being “optical benches”.)
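To make the idea concrete, here is a rough Python sketch of a symbolically indexed list. It is a toy model under simplifying assumptions (one pattern entry at most, literal indices always taking priority), and the class `IndexedList` and its methods are hypothetical names, not anything from SMP.

```python
class IndexedList:
    """Toy model of a symbolically indexed list (all names hypothetical)."""

    def __init__(self):
        self.literal = {}    # entries such as [1]: a
        self.pattern = None  # a single entry such as [$x]: body

    def set_literal(self, index, value):
        self.literal[index] = value

    def set_pattern(self, body):
        # body takes (the list itself, x) so it can refer back to the list
        self.pattern = body

    def __getitem__(self, index):
        # Specific indices win over the general pattern.
        if index in self.literal:
            return self.literal[index]
        if self.pattern is not None:
            return self.pattern(self, index)
        raise KeyError(index)


# An ordinary vector: {[1]:a, [2]:b, [3]:c}
v = IndexedList()
for i, item in enumerate(["a", "b", "c"], start=1):
    v.set_literal(i, item)

# A "function" stored the same way: {[1]: 1, [$x]: $x f[$x-1]}
f = IndexedList()
f.set_literal(1, 1)
f.set_pattern(lambda self, x: x * self[x - 1])
```

Here `v[2]` picks out a list element and `f[5]` computes a factorial, using exactly the same lookup mechanism, which is the unification the text describes.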

But gradually cracks started appearing. It got pretty weird, for example, when one started making definitions like v[2]:b, v[3]:c. According to SMP’s conventions for assignments v would then have value {[3]:c, [2]:b}. But what if one made a definition like v[1]:a? Well, then v suddenly had to reorder itself as {a, b, c}.

It got even weirder when one started dealing with multi-argument functions. It was quite nice that one could define a matrix with m:{{a,b},{c,d}}, then m[1] would be {a,b}, and either m[1,1] or m[1][1] would be a. But what if one had a function with several arguments? Would f[x, y] be the same as f[x][y]? Well, sometimes one wanted it that way, and sometimes not. So I had to come up with a property (“attribute” in Mathematica)—that I called Tier—to say for every function which way it should work. (Today more people might have heard of “currying”, but in those days this kind of distinction was really obscure.)
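The choice that Tier had to make can be imitated in Python. The sketch below is only an illustration of the two calling conventions, not SMP's mechanism; `tiered` and `element` are hypothetical helpers.

```python
def tiered(func, arity):
    """Return a version of func for which f(x, y) and f(x)(y) agree.

    Hypothetical helper that imitates the behavior the text says Tier
    selected; it is not SMP's actual implementation.
    """
    def collect(*args):
        if len(args) >= arity:
            return func(*args)
        # Not enough arguments yet: return something that waits for more.
        return lambda *more: collect(*args, *more)
    return collect


def element(i, j):
    """Look up an entry of the matrix {{a,b},{c,d}} with 1-based indices."""
    matrix = [["a", "b"], ["c", "d"]]
    return matrix[i - 1][j - 1]


m = tiered(element, 2)
```

With the wrapper, `m(1, 1)` and `m(1)(1)` give the same result, as for the matrix in the text. Without it, `element(1)` is simply an error in Python, which corresponds to the other setting: a function whose arguments must all arrive together.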

Symbolically indexed lists in SMP had some really powerful and elegant features. But in the end, when the whole system was built, there were just too many weirdnesses. And so when I designed Mathematica I decided not to use them. Over the years, though, I’ve kept thinking about them. And as it happens, right now, more than 30 years after SMP, I’m working on some very interesting new functionality in Mathematica that’s closely related to symbolically indexed lists.

I learned a huge amount designing SMP—and then seeing how the design played out. One particularly memorable moment for me was this. Like Mathematica, SMP had pure functions. But unlike Mathematica, it didn’t have a syntax like & to indicate them. And that meant that it needed a special object called a “mark” (written `) to indicate when a pure function was supposed to give a literal, constant, value. Well, about 5 years after SMP was released, I was looking at one of its training manuals. And out jumped at me the sentence: “Marks are the enigma of SMP”. And in that moment I realized: that’s what a language design mistake looks like.

SMP was in many ways a very radical system—a kind of extreme experiment in programming language design. It had only grudging support for most of what were then familiar programming constructs. And instead almost everything in it revolved around the idea of transformation rules for symbolic expressions. In some ways I think SMP went too far into the unfamiliar. Because in a sense what a programming language has to do is to connect the human conception of a computation to an actual computation that a computer can execute. And however powerful a language is, it doesn’t do much good if humans don’t have enough context to be able to understand it. Which is why in Mathematica, I’ve always tried to make things familiar when I can, limiting the unfamiliar to places where it’s really needed in supporting things that are fundamentally new.

One of the things about designing a system is knowing what’s going to end up being important. In SMP, we spent a lot of effort on what we called “semantic pattern matching”. Let’s say one made a definition like f[$x+$y, $x, $y] := {$x, $y}. It’s pretty clear that this would match f[a+b, a, b]. But what about f[7, 3, 4]? In SMP, that would match—even though the 7 isn’t structurally of the form $x+$y. It took lots of effort to make this work. And it was neat to see in simple examples. But in the end, it just didn’t come up very often—and when it did, it was usually something to avoid, because it typically made the operation of programs really hard to understand.
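The difference between structural and semantic matching can be sketched in a few lines of Python. This is only a toy under stated assumptions: expressions are plain numbers or tuples like `("+", "a", "b")`, the function names are invented, and nothing here reflects how SMP actually implemented the feature.

```python
def structural_match(first, second, third):
    """Match only if `first` is literally the sum node ("+", second, third)."""
    return first == ("+", second, third)


def semantic_match(first, second, third):
    """Also accept a plain number that happens to equal second + third."""
    if structural_match(first, second, third):
        return True
    numbers = (int, float)
    if all(isinstance(value, numbers) for value in (first, second, third)):
        return first == second + third
    return False


# f[a+b, a, b]: matches either way, because the first argument is a sum.
symbolic_case = structural_match(("+", "a", "b"), "a", "b")

# f[7, 3, 4]: 7 is not structurally a sum, but it equals 3 + 4.
numeric_structural = structural_match(7, 3, 4)
numeric_semantic = semantic_match(7, 3, 4)
```

Even in this toy form, the semantic version has to know arithmetic in order to decide whether a pattern applies, which hints at why the behavior was hard to predict in larger programs.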

There was something similar with recursion control. I thought it was bad to have f[$x] : $x f[$x-1] (with no end condition for f[1]) go into an infinite loop trying to evaluate f[-1], f[-2], etc. Because after all, at some point there’s multiplication by 0. So why not just give 0? Well, in SMP the default was to give 0. Because instead of running all the way down the evaluation of each branch of the recursion tree, SMP would repeatedly stop and try to simplify all the unevaluated branches. It was neat and clever. But by the time one started parametrizing this behavior it was just too hard for people to understand, and nobody ended up using it.

And then there was user-defined syntax. Allowing users for example to set “U” (say, for “union”) to be an infix operator. Which worked great until one wanted to type a function with a “U” in its name. Or until one completely trapped oneself in one’s syntax, diverting the parsing of any form of escape.

SMP was a great learning experience for me. And Mathematica wouldn’t be nearly as good if I hadn’t done SMP first. And as I reflect now on “mistakes” in SMP, one thing I find quite satisfying is that I don’t think I’d make any of them today. Between SMP and 25 years of Mathematica design, most of them would now fall into the category of “easy issues” for me.

It’s funny, though, how often variations of some of the not-so-good ideas in SMP seem to come up. And actually I’m very curious with my modern design sensibilities how exactly I’d feel about them if I ran SMP today. Which is part of the reason I’m keen to release SMP from its “digital time safe”, and get it running again. Which I hope someone out there is going to help me make possible.

Dropping In on Gottfried Leibniz http://blog.stephenwolfram.com/2013/05/dropping-in-on-gottfried-leibniz/ Tue, 14 May 2013 17:57:05 +0000 Stephen Wolfram

I’ve been curious about Gottfried Leibniz for years, not least because he seems to have wanted to build something like Mathematica and Wolfram|Alpha, and perhaps A New Kind of Science as well—though three centuries too early. So when I took a trip recently to Germany, I was excited to be able to visit his archive in Hanover.

Leafing through his yellowed (but still robust enough for me to touch) pages of notes, I felt a certain connection—as I tried to imagine what he was thinking when he wrote them, and tried to relate what I saw in them to what we now know after three more centuries:

Page of Gottfried Leibniz's notes

Some things, especially in mathematics, are quite timeless. Like here’s Leibniz writing down an infinite series for √2 (the text is in Latin):

Example of Leibniz writing down an infinite series for Sqrt[2]

Or here’s Leibniz trying to calculate a continued fraction—though he got the arithmetic wrong, even though he wrote it all out (the Π was his earlier version of an equal sign):

Leibniz calculating a continued fraction

Or here’s a little summary of calculus, that could almost be in a modern textbook:

Summary of calculus from Leibniz

But what was everything else about? What was the larger story of his work and thinking?

I have always found Leibniz a somewhat confusing figure. He did many seemingly disparate and unrelated things—in philosophy, mathematics, theology, law, physics, history, and more. And he described what he was doing in what seem to us now as strange 17th century terms.

But as I’ve learned more, and gotten a better feeling for Leibniz as a person, I’ve realized that underneath much of what he did was a core intellectual direction that is curiously close to the modern computational one that I, for example, have followed.

Gottfried Leibniz was born in Leipzig in what’s now Germany in 1646 (four years after Galileo died, and four years after Newton was born). His father was a professor of philosophy; his mother’s family was in the book trade. Leibniz’s father died when Leibniz was 6—and after a 2-year deliberation on its suitability for one so young, Leibniz was allowed into his father’s library, and began to read his way through its diverse collection of books. He went to the local university at age 15, studying philosophy and law—and graduated in both of them at age 20.

Even as a teenager, Leibniz seems to have been interested in systematization and formalization of knowledge. There had been vague ideas for a long time—for example in the semi-mystical Ars Magna of Ramon Llull from the 1300s—that one might be able to set up some kind of universal system in which all knowledge could be derived from combinations of signs drawn from a suitable (as Descartes called it) “alphabet of human thought”. And for his philosophy graduation thesis, Leibniz tried to pursue this idea. He used some basic combinatorial mathematics to count possibilities. He talked about decomposing ideas into simple components on which a “logic of invention” could operate. And, for good measure, he put in an argument that purported to prove the existence of God.

As Leibniz himself said in later years, this thesis—written at age 20—was in many ways naive. But I think it began to define Leibniz’s lifelong way of thinking about all sorts of things. And so, for example, Leibniz’s law graduation thesis about “perplexing legal cases” was all about how such cases could potentially be resolved by reducing them to logic and combinatorics.

Leibniz was on a track to become a professor, but instead he decided to embark on a life working as an advisor for various courts and political rulers. Some of what he did for them was scholarship, tracking down abstruse—but politically important—genealogy and history. Some of it was organization and systematization—of legal codes, libraries and so on. Some of it was practical engineering—like trying to work out better ways to keep water out of silver mines. And some of it—particularly in earlier years—was “on the ground” intellectual support for political maneuvering.

One such activity in 1672 took Leibniz to Paris for four years—during which time he interacted with many leading intellectual lights. Before then, Leibniz’s knowledge of mathematics had been fairly basic. But in Paris he had the opportunity to learn all the latest ideas and methods. And for example he sought out Christiaan Huygens, who agreed to teach Leibniz mathematics—after he succeeded in passing the test of finding the sum of the reciprocals of the triangular numbers.
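The test has a neat answer. The n-th triangular number is n(n+1)/2, so its reciprocal is 2/(n(n+1)) = 2(1/n − 1/(n+1)), and the sum telescopes: the first N terms add up to 2N/(N+1), which approaches 2. A small Python check with exact fractions (the function names are just for illustration):

```python
from fractions import Fraction


def triangular(n):
    """n-th triangular number: 1, 3, 6, 10, ..."""
    return n * (n + 1) // 2


def partial_sum(n_terms):
    """Exact value of 1/T_1 + 1/T_2 + ... + 1/T_n."""
    return sum(Fraction(1, triangular(k)) for k in range(1, n_terms + 1))


# The telescoping argument predicts a partial sum of exactly 2N/(N+1).
for n_terms in (1, 2, 3, 4, 10, 100):
    assert partial_sum(n_terms) == Fraction(2 * n_terms, n_terms + 1)
```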

Over the years, Leibniz refined his ideas about the systematization and formalization of knowledge, imagining a whole architecture for how knowledge would—in modern terms—be made computational. He saw the first step as being the development of an ars characteristica—a methodology for assigning signs or symbolic representations to things, and in effect creating a uniform “alphabet of thought”. And he then imagined—in remarkable resonance with what we now know about computation—that from this uniform representation it would be possible to find “truths of reason in any field… through a calculus, as in arithmetic or algebra”.

He talked about his ideas under a variety of rather ambitious names like scientia generalis (“general method of knowledge”), lingua philosophica (“philosophical language”), mathematique universelle (“universal mathematics”), characteristica universalis (“universal system”) and calculus ratiocinator (“calculus of thought”). He imagined applications ultimately in all areas—science, law, medicine, engineering, theology and more. But the one area in which he had clear success quite quickly was mathematics.

To me it’s remarkable how rarely in the history of mathematics notation has been viewed as a central issue. It happened at the beginning of modern mathematical logic in the late 1800s with the work of people like Gottlob Frege and Giuseppe Peano. And in recent times it’s happened with me in my efforts to create Mathematica and the Wolfram Language. But it also happened three centuries ago with Leibniz. And I suspect that Leibniz’s successes in mathematics were in no small part due to the effort he put into notation, and the clarity of reasoning about mathematical structures and processes that it brought.

When one looks at Leibniz’s papers, it’s interesting to see his notation and its development. Many things look quite modern. Though there are charming dashes of the 17th century, like the occasional use of alchemical or planetary symbols for algebraic variables:

Example of Leibniz's use of alchemical or planetary symbols for algebraic variables

There’s Π as an equals sign instead of =, with the slightly hacky idea of having it be like a balance, with a longer leg on one side or the other indicating less than (“<”) or greater than (“>”):

Example of Leibniz using Pi as an equal sign instead of =

There are overbars to indicate grouping of terms—arguably a better idea than parentheses, though harder to type, and typeset:

Leibniz used overbars to indicate grouping of terms

We do use overbars for roots today. But Leibniz wanted to use them in integrals too. Along with the rather nice “tailed d”, which reminds me of the double-struck “differential d” that we invented for representing integrals in Mathematica.

Showing Leibniz's use of overbars in integrals

Particularly in solving equations, it’s quite common to want to use ±, and it’s always confusing how the grouping is supposed to work, say in a±b±c. Well, Leibniz seems to have found it confusing too, but he invented a notation to handle it—which we actually should consider using today too:

Leibniz example of a +- notation

I’m not sure what some of Leibniz’s notation means. Though those overtildes are rather nice-looking:

As are these things with dots:

One example of Leibniz's notation using dots

Or this interesting-looking diagrammatic form:

Diagrammatic form made by Leibniz

Of course, Leibniz’s most famous notations are his integral sign (long “s” for “summa”) and d, here summarized in the margin for the first time, on November 11th, 1675 (the “5” in “1675” was changed to a “3” after the fact, perhaps by Leibniz):

Leibniz’s most famous notations summarized in the margin for the first time

I find it interesting that despite all his notation for “calculational” operations, Leibniz apparently did not invent similar notation for logical operations. “Or” was just the Latin word vel, “and” was et, and so on. And when he came up with the idea of quantifiers (modern ∀ and ∃), he just represented them by the Latin abbreviations U.A. and P.A.:

Leibniz's notation for logical operations

It’s always struck me as a remarkable anomaly in the history of thought that it took until the 1930s for the idea of universal computation to emerge. And I’ve often wondered if lurking in the writings of Leibniz there might be an early version of universal computation—maybe even a diagram that we could now interpret as a system like a Turing machine. But with more exposure to Leibniz, it’s become clearer to me why that’s probably not the case.

One big piece, I suspect, is that he didn’t take discrete systems quite seriously enough. He referred to results in combinatorics as “self-evident”, presumably because he considered them directly verifiable by methods like arithmetic. And it was only “geometrical”, or continuous, mathematics that he felt needed to have a calculus developed for it. In describing things like properties of curves, Leibniz came up with something like continuous functions. But he never seems to have applied the idea of functions to discrete mathematics—which might for example have led him to think about universal elements for building up functions.

Leibniz recognized the success of his infinitesimal calculus, and was keen to come up with similar “calculi” for other things. And in another “near miss” with universal computation, Leibniz had the idea of encoding logical properties using numbers. He thought about associating every possible attribute of a thing with a different prime number, then characterizing the thing by the product of the primes for its attributes—and then representing logical inference by arithmetic operations. But he only considered static attributes—and never got to an idea like Gödel numbering where operations are also encoded in numbers.
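The prime-number scheme is easy to sketch in Python. The attribute names and the particular primes assigned to them below are chosen purely for illustration, as are the function names; the point is only that "has every attribute of" turns into divisibility.

```python
# Illustrative assignment of one prime per attribute (an assumption, not a quotation).
ATTRIBUTE_PRIMES = {"animal": 2, "rational": 3, "mortal": 5}


def encode(attributes):
    """Characterize a thing by the product of the primes for its attributes."""
    product = 1
    for name in attributes:
        product *= ATTRIBUTE_PRIMES[name]
    return product


def has_all(thing, concept):
    """True if every attribute of `concept` is also an attribute of `thing`."""
    return thing % concept == 0


human = encode(["animal", "rational"])  # 2 * 3
animal = encode(["animal"])             # 2

# "Every human is an animal" becomes: the number for animal divides the number for human.
every_human_is_animal = has_all(human, animal)
every_animal_is_human = has_all(animal, human)
```

The numbers here only ever describe fixed sets of attributes. Nothing in the encoding represents an operation or a rule, which is the step that Gödel numbering later added.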

But even though Leibniz did not get to the idea of universal computation, he did understand the notion that computation is in a sense mechanical. And indeed quite early in life he seems to have resolved to build an actual mechanical calculator for doing arithmetic. Perhaps in part it was because he wanted to use it himself (always a good reason to build a piece of technology!). For despite his prowess at algebra and the like, his papers are charmingly full of basic (and sometimes incorrect) school-level arithmetic calculations written out in the margin—and now preserved for posterity:

Example of basic school-level arithmetic calculations written out in the margin by Leibniz

There were scattered examples of mechanical calculators being built in Leibniz’s time, and when he was in Paris, Leibniz no doubt saw the addition calculator that had been built by Blaise Pascal in 1642. But Leibniz resolved to make a “universal” calculator, that could for the first time do all four basic functions of arithmetic with a single machine. And he wanted to give it a simple “user interface”, where one would for example turn a handle one way for multiplication, and the opposite way for division.

In Leibniz’s papers there are all sorts of diagrams about how the machine should work:

Leibniz's diagrams about how an arithmetic machine should work

Leibniz imagined that his calculator would be of great practical utility—and indeed he seems to have hoped that he would be able to turn it into a successful business. But in practice, Leibniz struggled to get the calculator to work at all reliably. For like other mechanical calculators of its time, it was basically a glorified odometer. And just like in Charles Babbage’s machines nearly 200 years later, it was mechanically difficult to make many wheels move at once when a cascade of carries occurred.
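The carry problem can be seen in a toy odometer model. This Python sketch is not a model of Leibniz's actual mechanism; it just counts how many decimal wheels have to turn at once when 1 is added, with the function name `add_one` invented for the example.

```python
def add_one(wheels):
    """Advance a list of decimal wheels (least significant first) by one.

    Returns the new wheel positions and how many wheels had to turn.
    """
    wheels = list(wheels)
    moved = 0
    for i in range(len(wheels)):
        moved += 1
        if wheels[i] < 9:
            wheels[i] += 1
            return wheels, moved
        wheels[i] = 0  # this wheel rolls over and pushes the next one
    return wheels, moved  # every wheel rolled over (overflow)


# 0003 + 1: only the units wheel turns.
easy_case = add_one([3, 0, 0, 0])

# 0999 + 1: the carry cascades through all four wheels in one go.
hard_case = add_one([9, 9, 9, 0])
```

In the second case a single turn of the crank has to drive four wheels at once, and the mechanical load grows with every additional digit.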

Leibniz at first had a wooden prototype of his machine built, intended to handle just 3 or 4 digits. But when he demoed this to people like Robert Hooke during a visit to London in 1673 it didn’t go very well. But he kept on thinking he’d figured everything out—for example in 1679 writing (in French) of the “last correction to the arithmetic machine”:

1679 writing (in French) of the last correction to the arithmetic machine

Notes from 1682 suggest that there were more problems, however:

Notes from 1682 suggesting that there were more problems with the arithmetic machine

But Leibniz had plans drafted up from his notes—and contracted an engineer to build a brass version with more digits:

Plans drafted up from Leibniz's notes

It’s fun to see Leibniz’s “marketing material” for the machine:

Leibniz's "marketing material" for the machine

As well as parts of the “manual” (with 365×24 as a “worked example”):

Usage diagrams of the machine

Complete with detailed usage diagrams:

Detailed usage diagram of the machine

But despite all this effort, problems with the calculator continued. And in fact, for more than 40 years, Leibniz kept on tweaking his calculator—probably altogether spending (in today’s currency) more than a million dollars on it.

So what actually happened to the physical calculator? When I visited Leibniz’s archive, I had to ask. “Well”, my hosts said, “we can show you”. And there in a vault, along with shelves of boxes, was Leibniz’s calculator, looking as good as new in a glass case—here captured by me in a strange juxtaposition of ancient and modern:

Leibniz’s calculator

All the pieces are there. Including a convenient wooden carrying box. Complete with a cranking handle. And, if it worked right, the ability to do any basic arithmetic operation with a few minutes of cranking:

Leibniz’s calculator with the cranking handle

Leibniz clearly viewed his calculator as a practical project. But he still wanted to generalize from it, for example trying to make a general “logic” to describe geometries of mechanical linkages. And he also thought about the nature of numbers and arithmetic. And was particularly struck by binary numbers.

Bases other than 10 had been used in recreational mathematics for several centuries. But Leibniz latched on to base 2 as having particular significance—and perhaps being a key bridge between philosophy, theology and mathematics. And he was encouraged in this by his realization that binary numbers were at the core of the I Ching, which he’d heard about from missionaries to China, and viewed as related in spirit to his characteristica universalis.

Leibniz worked out that it would be possible to build a calculator based on binary. But he appears to have thought that only base 10 could actually be useful.

It’s strange to read what Leibniz wrote about binary numbers. Some of it is clear and practical—and still seems perfectly modern. But some of it is very 17th century—talking for example about how binary proves that everything can be made from nothing, with 1 being identified with God, and 0 with nothing.

Almost nothing was done with binary for a couple of centuries after Leibniz: in fact, until the rise of digital computing in the last few decades. So when one looks at Leibniz’s papers, his calculations in binary are probably what seem most “out of his time”:

Leibniz's calculations in binary

With binary, Leibniz was in a sense seeking the simplest possible underlying structure. And no doubt he was doing something similar when he talked about what he called “monads”. I have to say that I’ve never really understood monads. And usually when I think I almost have, there’s some mention of souls that just throws me completely off.

Still, I’ve always found it tantalizing that Leibniz seemed to conclude that the “best of all possible worlds” is the one “having the greatest variety of phenomena from the smallest number of principles”. And indeed, in the prehistory of my work on A New Kind of Science, when I first started formulating and studying one-dimensional cellular automata in 1981, I considered naming them “polymones”—but at the last minute got cold feet when I got confused again about monads.

There’s always been a certain mystique around Leibniz and his papers. Kurt Gödel—perhaps displaying his paranoia—seemed convinced that Leibniz had discovered great truths that had been suppressed for centuries. But while it is true that Leibniz’s papers were sealed when he died, it was his work on topics like history and genealogy—and the state secrets they might entail—that was the concern.

Leibniz’s papers were unsealed long ago, and after three centuries one might assume that every aspect of them would have been well studied. But the fact is that even after all this time, nobody has actually gone through all of the papers in full detail. It’s not that there are so many of them. Altogether there are only about 200,000 pages—filling perhaps a dozen shelving units (and only a little larger than my own personal archive from just the 1980s). But the problem is the diversity of material. Not only lots of subjects. But also lots of overlapping drafts, notes and letters, with unclear relationships between them.

Leibniz’s archive contains a bewildering array of documents. From the very large:

Very large document from Leibniz's archive

To the very small (Leibniz’s writing got smaller as he got older and more near-sighted):

Very small document from Leibniz's archive

Most of the documents in the archive seem very serious and studious. But despite the high cost of paper in Leibniz’s time, one still finds preserved for posterity the occasional doodle (is that Spinoza, by any chance?):

Documents from the archive with a doodle by Leibniz

Leibniz exchanged mail with hundreds of people—famous and not-so-famous—all over Europe. So now, 300 years later, one can find in his archive “random letters” from the likes of Jacob Bernoulli:

Letter to Leibniz from Jacob Bernoulli

What did Leibniz look like? Here he is, both in an official portrait, and without his rather oversized wig (that was mocked even in his time), that he presumably wore to cover up a large cyst on his head:

Official portrait and statue of Leibniz

As a person, Leibniz seems to have been polite, courtierly and even-tempered. In some ways, he may have come across as something of a nerd, expounding at great depth on all manner of topics. He seems to have taken great pains—as he did in his letters—to adapt to whoever he was talking to, emphasizing theology when he was talking to a theologian, and so on. Like quite a few intellectuals of his time, Leibniz never married, though he seems to have been something of a favorite with women at court.

In his career as a courtier, Leibniz was keen to climb the ladder. But not being into hunting or drinking, he never quite fit in with the inner circles of the rulers he worked for. Late in his life, when George I of Hanover became king of England, it would have been natural for Leibniz to join his court. But Leibniz was told that before he could go, he had to start writing up a history project he’d supposedly been working on for 30 years. Had he done so before he died, he might well have gone to England and had a very different kind of interaction with Newton.

At Leibniz’s archive, there are lots of papers, his mechanical calculator, and one more thing: a folding chair that he took with him when he traveled, and that he had suspended in carriages so he could continue to write as the carriage moved:

Folding chair that Leibniz took with him when he traveled

Leibniz was quite concerned about status (he often styled himself “Gottfried von Leibniz”, though nobody quite knew where the “von” came from). And as a form of recognition for his discoveries, he wanted to have a medal created to commemorate binary numbers. He came up with a detailed design, complete with the tag line omnibus ex nihilo ducendis; sufficit unum (“everything can be derived from nothing; all that is needed is 1”). But nobody ever made the medal for him.

In 2007, though, I wanted to come up with a 60th birthday gift for my friend Greg Chaitin, who has been a long-time Leibniz enthusiast. And so I thought: why not actually make Leibniz’s medal? So we did. Though on the back, instead of the picture of a duke that Leibniz proposed, we put a Latin inscription about Greg’s work.

And when I visited the Leibniz archive, I made sure to bring a copy of the medal, so I could finally put a real medal next to Leibniz’s design:

Leibniz’s medal with the original design

It would have been interesting to know what pithy statement Leibniz might have had on his grave. But as it was, when Leibniz died at the age of 70, his political fates were at a low ebb, and no elaborate memorial was constructed. Still, when I was in Hanover, I was keen to see his grave—which turns out to carry just the simple Latin inscription “bones of Leibniz”:

Leibniz's grave

Across town, however, there’s another commemoration of a sort—an outlet store for cookies that carry the name “Leibniz” in his honor:

Outlet store for cookies that carry the name "Leibniz" in his honor

So what should we make of Leibniz in the end? Had history developed differently, there would probably be a direct line from Leibniz to modern computation. But as it is, much of what Leibniz tried to do stands isolated—to be understood mostly by projecting backward from modern computational thinking to the 17th century.

And with what we know now, it is fairly clear what Leibniz understood, and what he did not. He grasped the concept of having formal, symbolic, representations for a wide range of different kinds of things. And he suspected that there might be universal elements (maybe even just 0 and 1) from which these representations could be built. And he understood that from a formal symbolic representation of knowledge, it should be possible to compute its consequences in mechanical ways—and perhaps create new knowledge by an enumeration of possibilities.

Some of what Leibniz wrote was abstract and philosophical—sometimes maddeningly so. But at some level Leibniz was also quite practical. And he had sufficient technical prowess to often be able to make real progress. His typical approach seems to have been to start by trying to create a formal structure to clarify things—with formal notation if possible. And after that his goal was to create some kind of “calculus” from which conclusions could systematically be drawn.

Realistically he only had true success with this in one specific area: continuous “geometrical” mathematics. It’s a pity he never tried more seriously in discrete mathematics, because I think he might have been able to make progress, and might conceivably even have reached the idea of universal computation. He might well also have ended up starting to enumerate possible systems in the kind of way I have done in the computational universe.

One area where he did try his approach was with law. But in this he was surely far too early, and it is only now—300 years later—that computational law is beginning to seem realistic.

Leibniz also tried thinking about physics. But while he made progress with some specific concepts (like kinetic energy), he never managed to come up with any sort of large-scale “system of the world”, of the kind that Newton in effect did in his Principia.

In some ways, I think Leibniz failed to make more progress because he was trying too hard to be practical, and—like Newton—to decode the operation of actual physics, rather than just looking at related formal structures. For had Leibniz tried to do at least the basic kinds of explorations that I did in A New Kind of Science, I don’t think he would have had any technical difficulty—but I think the history of science could have been very different.

And I have come to realize that when Newton won the PR war against Leibniz over the invention of calculus, it was not just credit that was at stake; it was a way of thinking about science. Newton was in a sense quintessentially practical: he invented tools then showed how these could be used to compute practical results about the physical world. But Leibniz had a broader and more philosophical view, and saw calculus not just as a specific tool in itself, but as an example that should inspire efforts at other kinds of formalization and other kinds of universal tools.

I have often thought that the modern computational way of thinking that I follow is somehow obvious—and somehow an inevitable feature of thinking about things in formal, structured, ways. But it has never been very clear to me whether this apparent obviousness is just the result of modern times, and of our experience with modern practical computer technology. But looking at Leibniz, we get some perspective. And indeed what we see is that some core of modern computational thinking was possible even long before modern times. But the ambient technology and understanding of past centuries put definite limits on how far the thinking could go.

And of course this leads to a sobering question for us today: how much are we failing to realize from the core computational way of thinking because we do not have the ambient technology of the distant future? For me, looking at Leibniz has put this question in sharper focus. And at least one thing seems fairly clear.

In Leibniz’s whole life, he basically saw less than a handful of computers, and all they did was basic arithmetic. Today there are billions of computers in the world, and they do all sorts of things. But in the future there will surely be far, far more computers (made easier to create by the Principle of Computational Equivalence). And no doubt we’ll get to the point where basically everything we make will explicitly be made of computers at every level. And the result is that absolutely everything will be programmable, down to atoms. Of course, biology has in a sense already achieved a restricted version of this. But we will be able to do it completely and everywhere.

At some level we can already see that this implies some merger of computational and physical processes. But just how may be as difficult for us to imagine as things like Mathematica and Wolfram|Alpha would have been for Leibniz.

Leibniz died on November 14, 1716. In 2016 that’ll be 300 years ago. And it’ll be a good opportunity to make sure everything we have from Leibniz has finally been gone through—and to celebrate after three centuries how many aspects of Leibniz’s core vision are finally coming to fruition, albeit in ways he could never have imagined.

Data Science of the Facebook World http://blog.stephenwolfram.com/2013/04/data-science-of-the-facebook-world/ Wed, 24 Apr 2013 18:25:47 +0000 Stephen Wolfram

More than a million people have now used our Wolfram|Alpha Personal Analytics for Facebook. And as part of our latest update, in addition to collecting some anonymized statistics, we launched a Data Donor program that allows people to contribute detailed data to us for research purposes.

A few weeks ago we decided to start analyzing all this data. And I have to say that if nothing else it’s been a terrific example of the power of Mathematica and the Wolfram Language for doing data science. (It’ll also be good fodder for the Data Science course I’m starting to create.)

We’d always planned to use the data we collect to enhance our Personal Analytics system. But I couldn’t resist also trying to do some basic science with it.

I’ve always been interested in people and the trajectories of their lives. But I’ve never been able to combine that with my interest in science. Until now. And it’s been quite a thrill over the past few weeks to see the results we’ve been able to get. Sometimes confirming impressions I’ve had; sometimes showing things I never would have guessed. And all along reminding me of phenomena I’ve studied scientifically in A New Kind of Science.

So what does the data look like? Here are the social networks of a few Data Donors—with clusters of friends given different colors. (Anyone can find their own network using Wolfram|Alpha—or the SocialMediaData function in Mathematica.)

social networks

So a first quantitative question to ask is: How big are these networks usually? In other words, how many friends do people typically have on Facebook? Well, at least for our users, that’s easy to answer. The median is 342—and here’s a histogram showing the distribution (there’s a cutoff at 5000 because that’s the maximum number of friends for a personal Facebook page):

distribution of number of friends for our users

But how typical are our users? In most respects—so far as we can tell—they seem pretty typical. But there are definitely some differences. Like here’s the distribution of the number of friends not just for our users, but also for their friends (there’s a mathematical subtlety in deriving this that I’ll discuss later):

distribution of number of friends for users+friends

And what we see is that in this broader Facebook population, there are significantly more people who have almost no Facebook friends. Whether such people should be included in samples one takes is a matter of debate. But so long as one looks at appropriate comparisons, aggregates, and so on, they don’t seem to have a huge effect. (The spike at 200 friends probably has to do with Facebook’s friend recommendation system.)

So, OK. Let’s ask for example how the typical number of Facebook friends varies with a person’s age. Of course all we know are self-reported “Facebook ages”. But let’s plot how the number of friends varies with that age. The solid line is the median number of friends; successive bands show successive octiles of the distribution.

number of friends vs. age

After a rapid rise, the number of friends peaks for people in their late teenage years, and then declines. Why is this? I suspect it’s partly a reflection of people’s intrinsic behavior, and partly a reflection of the fact that Facebook hasn’t yet been around very long. Assuming people don’t drop friends much once they’ve added them, one might expect that the number of friends would simply grow with age. And for sufficiently young people that’s basically what we see. But there’s a limit to the growth, because there’s a limit to the number of years people have been on Facebook. And assuming that’s roughly constant across ages, what the plot suggests is that people add friends progressively more slowly with age.

But what friends do they add? Given a person of a particular age, we can for example ask what the distribution of ages of the person’s friends is. Here are some results (the jaggedness, particularly at age 70, comes from the limited data we have):

friend ages for people of different ages

And here’s an interactive version, generated from CDF:

 

The first thing we see is that the ages of friends always peak at or near the age of the person themselves—which is presumably a reflection of the fact that in today’s society many friends are made in age-based classes in school or college. For younger people, the peak around the person’s age tends to be pretty sharp. For older people, the distribution gets progressively broader.

We can summarize what happens by plotting the distribution of friend ages against the age of a person (the solid line is the median age of friends):

median age of friends vs. age

There’s an anomaly for the youngest ages, presumably because of kids under 13 misreporting their ages. But apart from that, we see that young people tend to have friends who are remarkably close in age to themselves. The broadening as people get older is probably associated with people making non-age-related friends in their workplaces and communities. And as the array of plots above suggests, by people’s mid-40s, there start to be secondary peaks at younger ages, presumably as people’s children become teenagers, and start using Facebook.

So what else can one see about the trajectory of people’s lives? Here’s the breakdown according to reported relationship status as a function of age:

relationship status fractions vs. age

And here’s more detail, separating out fractions for males and females (“married+” means “civil union”, “separated”, “widowed”, etc. as well as “married”):

relationship status fractions vs. age

There’s some obvious goofiness at low ages with kids (slightly more often girls than boys) misreporting themselves as married. But in general the trend is clear. The rate of getting married starts going up in the early 20s—a couple of years earlier for women than for men—and decreases again in the late 30s, with about 70% of people by then being married. The fraction of people “in a relationship” peaks around age 24, and there’s a small “engaged” peak around 27. The fraction of people who report themselves as married continues to increase roughly linearly with age, gaining about 5% between age 40 and age 60. Meanwhile, the fraction of people who report themselves as single continues to increase for women but decreases for men.

I have to say that as I look at the plots above, I’m struck by their similarity to plots for physical processes like chemical reactions. It’s as if all those humans, with all the complexities of their lives, still behave in aggregate a bit like molecules—with certain “reaction rates” to enter into relationships, marry, etc.

Of course, what we’re seeing here is just for the “Facebook world”. So how does it compare to the world at large? Well, at least some of what we can measure in the Facebook world is also measured in official censuses. And so for example we can see how our results for the fraction of people married at a given age compare with results from the official US Census:

fraction married vs. age

I’m amazed at how close the correspondence is. Though there are clearly some differences. Like below age 20 kids on Facebook are misreporting themselves as married. And on the older end, widows are still considering themselves married for purposes of Facebook. For people in their 20s, there’s also a small systematic difference—with people on Facebook on average getting married a couple of years later than the Census would suggest. (As one might expect, if one excludes the rural US population, the difference gets significantly smaller.)

Talking of the Census, we can ask in general how our Facebook population compares to the US population. And for example, we find, not surprisingly, that our Facebook population is heavily weighted toward younger people:

population vs. age

OK. So we saw above how the typical number of friends a person has depends on age. What about gender? Perhaps surprisingly, if we look at all males and all females, there isn’t a perceptible difference in the distributions of number of friends. But if we instead look at males and females as a function of age, there is a definite difference:

number of friends vs. age

Teenage boys tend to have more friends than teenage girls, perhaps because they are less selective in who they accept as friends. But after the early 20s, the difference between genders rapidly dwindles.

What effect does relationship status have? Here’s the male and female data as a function of age:

median number of friends vs. age

In the older set, relationship status doesn’t seem to make much difference. But for young people it does. With teenagers who (mis)report themselves as “married” on average having more friends than those who don’t. And with early teenage girls who say they’re “engaged” (perhaps to be able to tag a BFF) typically having more friends than those who say they’re single, or just “in a relationship”.

Another thing that’s fairly reliably reported by Facebook users is location. And it’s common to see quite a lot of variation by location. Like here are comparisons of the median number of friends for countries around the world (ones without enough data are left gray), and for states in the US:

median number of friends by location

There are some curious effects. Countries like Russia and China have low median friend counts because Facebook isn’t widely used for connections between people inside those countries. And perhaps there are lower friend counts in the western US because of lower population densities. But quite why there are higher friend counts for our Facebook population in places like Iceland, Brazil and the Philippines—or Mississippi—I don’t know. (There is of course some “noise” from people misreporting their locations. But with the size of the sample we have, I don’t think this is a big effect.)

In Facebook, people can list both a “hometown” and a “current city”. Here’s how the probability that these are in the same US state varies with age:

percentage who moved states vs. age

What we see is pretty much what one would expect. For some fraction of the population, there’s a certain rate of random moving, visible here for young ages. Around age 18, there’s a jump as people move away from their “hometowns” to go to college and so on. Later, some fraction move back, and progressively consider wherever they live to be their “hometown”.

One can ask where people move to and from. Here are plots showing the number of people in our Facebook population moving between different US states, and between different countries:

migration between US states

migration between countries

There’s a huge range of demographic questions we could ask. But let’s come back to social networks. It’s a common observation that people tend to be friends with people who are like them. So to test this we might for example ask whether people with more friends tend to have friends who have more friends. Here’s a plot of the median number of friends that our users’ friends have, as a function of the number of friends that the users themselves have:

median friend count vs. friend count

And the result is that, yes, on average people with more friends tend to have friends with more friends. Though we also notice that people with lots of friends tend to have friends with fewer friends than themselves.

And seeing this gives me an opportunity to discuss a subtlety I alluded to earlier. The very first plot in this post shows the distribution of the number of friends that our users have. But what about the number of friends that their friends have? If we just average over all the friends of all our users, this is how what we get compares to the original distribution for our users themselves:

distribution of number of friends

It seems like our users’ friends always tend to have more friends than our users themselves. But actually from the previous plot we know this isn’t true. So what’s going on? It’s a slightly subtle but general social-network phenomenon known as the “friendship paradox”. The issue is that when we sample the friends of our users, we’re inevitably sampling the space of all Facebook users in a very non-uniform way. In particular, if our users represent a uniform sample, any given friend will be sampled at a rate proportional to how many friends they have—with the result that people with more friends are sampled more often, so the average friend count goes up.

It’s perfectly possible to correct for this effect by weighting friends in inverse proportion to the number of friends they have—and that’s what we did earlier in this post. And by doing this we determine that in fact the friends of our users do not typically have more friends than our users themselves; instead their median number of friends is actually 229 instead of 342.
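To make the sampling argument concrete, here is a minimal Python sketch on a tiny made-up network. It is not the analysis used for this post: the edge list is hypothetical, every person is treated as a "user", and it compares means rather than the medians quoted above, simply because the arithmetic is easier to follow. The function names are illustrative.

```python
from collections import defaultdict

def degrees(edges):
    """Count friends per person from an undirected edge list."""
    deg = defaultdict(int)
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return dict(deg)

def friend_samples(edges, deg):
    """Friend counts seen when listing every friend of every person.

    Each friendship (a, b) shows b in a's friend list and a in b's,
    so a person with d friends appears d times in the sample.
    """
    samples = []
    for a, b in edges:
        samples.append(deg[b])
        samples.append(deg[a])
    return samples

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical toy network: one well-connected "hub" and four others.
edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"), ("a", "b")]
deg = degrees(edges)
samples = friend_samples(edges, deg)

direct_mean = sum(deg.values()) / len(deg)   # average over people
naive_mean = sum(samples) / len(samples)     # average over sampled friends
# Weight each sampled friend by 1 / (their friend count) to undo the bias.
corrected_mean = weighted_mean(samples, [1 / s for s in samples])

print(direct_mean, naive_mean, corrected_mean)
```

In this toy case the naive average over friends comes out higher than the average over people, and the inverse weighting brings it back to the direct value. With real data the population is only partly observed, so the correction gives an estimate rather than an exact match.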

It’s worth mentioning that if we look at the distribution of number of friends that we deduce for the Facebook population, it’s a pretty good fit to a power law, with exponent -2.8. And this is a common form for networks of many kinds—which can be understood as the result of an effect known as “preferential attachment”, in which as the network grows, nodes that already have many connections preferentially get more connections, leading to a limiting “scale-free network” with power-law features.
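As a rough illustration of preferential attachment, here is a small Python sketch of the simplest version of the idea: each new node makes one connection, choosing its target with probability proportional to the target's current number of connections. This is a toy model under stated assumptions, not a fit to the Facebook data. The function name, network size and seed are arbitrary, and this basic model is known to approach an exponent near -3 rather than the -2.8 quoted above.

```python
import random
from collections import Counter

def preferential_attachment_degrees(n_nodes, seed=0):
    """Grow a network one node at a time; return node -> number of connections.

    `endpoints` lists every edge endpoint, so a node with d connections
    appears d times. Picking uniformly from it selects a node with
    probability proportional to its current number of connections.
    """
    rng = random.Random(seed)
    endpoints = [0, 1]  # start from a single edge between nodes 0 and 1
    for new_node in range(2, n_nodes):
        target = rng.choice(endpoints)
        endpoints.extend([new_node, target])
    return Counter(endpoints)

degree = preferential_attachment_degrees(2000)
histogram = Counter(degree.values())  # connections -> how many nodes have that many

# Most nodes end up with very few connections, and a few with very many.
for k in sorted(histogram)[:5]:
    print(k, histogram[k])
print("largest number of connections:", max(degree.values()))
```

Plotting `histogram` on log-log axes is the usual way to check by eye whether the tail looks like a power law.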

But, OK. Let’s look in more detail at the social network of an individual user. I’m not sufficiently diligent on Facebook for my own network to be interesting. But my 15-year-old daughter Catherine was kind enough to let me show her network:

social network

There’s a dot for each of Catherine’s Facebook friends, with connections between them showing who’s friends with whom. (There’s no dot for Catherine herself, because she’d just be connected to every other dot.) The network is laid out to show clusters or “communities” of friends (using the Wolfram Language function FindGraphCommunities). And it’s amazing the extent to which the network “tells a story”. With each cluster corresponding to some piece of Catherine’s life or history.

Here’s a whole collection of networks from our Data Donors:

social networks

No doubt each of these networks tells a different story. But we can still generate overall statistics. Like, for example, here is a plot of how the number of clusters of friends varies with age (there’d be less noise if we had more data):

mean number of clusters vs. age

Even at age 13, people typically seem to have about 3 clusters (perhaps school, family and neighborhood). As they get older, go to different schools, take jobs, and so on, they accumulate another cluster or so. Right now the number saturates above about age 30, probably in large part just because of the limited time Facebook has been around.

How big are typical clusters? The largest one is usually around 100 friends; the plot below shows the variation of this size with age:

median size of largest cluster vs. age

And here’s how the size of the largest cluster as a fraction of the whole network varies with age:

relative size of largest cluster vs. age

What about more detailed properties of networks? Is there a kind of “periodic table” of network structures? Or a classification scheme like the one I made long ago for cellular automata?

The first step is to find some kind of iconic summary of each network, which we can do for example by looking at the overall connectivity of clusters, ignoring their substructure. And so, for example, for Catherine (who happened to suggest this idea), this reduces her network to the following “cluster diagram”:

cluster diagram of social network

Doing the same thing for the Data Donor networks shown above, here’s what we get:

mini social networks

In making these diagrams, we’re keeping every cluster with at least 2 friends. But to get a better overall view, we can just drop any cluster with, say, less than 10% of all friends—in which case for example Catherine’s cluster diagram becomes just:

cluster diagram after clusters with less than 10% of friends were dropped
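The reduction described above can be sketched in a few lines of Python. This is a minimal sketch, assuming the clusters have already been found (the post uses FindGraphCommunities for that step). The function name, the cluster labels and the friend data are all hypothetical.

```python
from collections import Counter

def cluster_diagram(edges, cluster_of, min_fraction=0.0, min_size=2):
    """Collapse a friend network to its clusters.

    edges        -- friend-to-friend pairs
    cluster_of   -- friend -> cluster label (assumed precomputed)
    min_fraction -- drop clusters holding less than this share of all friends
    min_size     -- drop clusters with fewer friends than this
    Returns (kept clusters, links between kept clusters).
    """
    sizes = Counter(cluster_of.values())
    total = len(cluster_of)
    kept = {
        c for c, size in sizes.items()
        if size >= min_size and size >= min_fraction * total
    }
    links = set()
    for a, b in edges:
        ca, cb = cluster_of[a], cluster_of[b]
        # Ignore substructure: only connections between different clusters count.
        if ca != cb and ca in kept and cb in kept:
            links.add(frozenset((ca, cb)))
    return kept, links

# Hypothetical network: 21 friends in three clusters.
cluster_of = {}
for name, size in [("school", 12), ("family", 7), ("camp", 2)]:
    for i in range(size):
        cluster_of[f"{name}{i}"] = name

edges = [
    ("school0", "school1"),
    ("school0", "family0"),
    ("camp0", "camp1"),
    ("camp0", "school3"),
]

all_clusters, all_links = cluster_diagram(edges, cluster_of)
major_clusters, major_links = cluster_diagram(edges, cluster_of, min_fraction=0.10)
print(sorted(all_clusters), len(all_links))
print(sorted(major_clusters), len(major_links))
```

With the 10% threshold the two-person "camp" cluster drops out, leaving two major clusters joined by a single link.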

And now for example we can count the relative numbers of different types of structures that appear in all the Data Donor networks:

Bar chart of different types of clustered social networks

And we can look at how the fractions of each of these structures vary with age:

community graph makeup vs. age

What do we learn? The most common structures consist of either two or three major clusters, all of them connected. But there are also structures in which major clusters are completely disconnected—presumably reflecting facets of a person’s life that for reasons of geography or content are also completely disconnected.

For everyone there’ll be a different detailed story behind the structure of their cluster diagram. And one might think this would mean that there could never be a general theory of such things. At some level it’s a bit like trying to find a general theory of human history, or a general theory of the progression of biological evolution. But what’s interesting now about the Facebook world is that it gives us so much more data from which to form theories.

And we don’t just have to look at things like cluster diagrams, or even friend networks: we can dig almost arbitrarily deep. For example, we can analyze the aggregated text of posts people make on their Facebook walls, say classifying them by topics they talk about (this uses a natural-language classifier written in the Wolfram Language and trained using some large corpora):

topics discussed on Facebook

Each of these topics is characterized by certain words that appear with high frequency:

word clouds for topics discussed on Facebook

And for each topic we can analyze how its popularity varies with (Facebook) age:

topics discussed on Facebook

It’s almost shocking how much this tells us about the evolution of people’s typical interests. People talk less about video games as they get older, and more about politics and the weather. Men typically talk more about sports and technology than women—and, somewhat surprisingly to me, they also talk more about movies, television and music. Women talk more about pets+animals, family+friends, relationships—and, at least after they reach child-bearing years, health. The peak time for anyone to talk about school+university is (not surprisingly) around age 20. People get less interested in talking about “special occasions” (mostly birthdays) through their teens, but gradually gain interest later. And people get progressively more interested in talking about career+money in their 20s. And so on. And so on.

Some of this is rather depressingly stereotypical. And most of it isn’t terribly surprising to anyone who’s known a reasonable diversity of people of different ages. But what to me is remarkable is how we can see everything laid out in such quantitative detail in the pictures above—kind of a signature of people’s thinking as they go through life.

Of course, the pictures above are all based on aggregate data, carefully anonymized. But if we start looking at individuals, we’ll see all sorts of other interesting things. And for example personally I’m very curious to analyze my own archive of nearly 25 years of email—and then perhaps predict things about myself by comparing to what happens in the general population.

Over the decades I’ve been steadily accumulating countless anecdotal “case studies” about the trajectories of people’s lives—from which I’ve certainly noticed lots of general patterns. But what’s amazed me about what we’ve done over the past few weeks is how much systematic information it’s been possible to get all at once. Quite what it all means, and what kind of general theories we can construct from it, I don’t yet know.

But it feels like we’re starting to be able to train a serious “computational telescope” on the “social universe”. And it’s letting us discover all sorts of phenomena. That have the potential to help us understand much more about society and about ourselves. And that, by the way, provide great examples of what can be achieved with data science, and with the technology I’ve been working on developing for so long.

http://blog.stephenwolfram.com/2013/04/data-science-of-the-facebook-world/feed/ 39
Talking about the Computational Future at SXSW 2013 http://blog.stephenwolfram.com/2013/03/talking-about-the-computational-future-at-sxsw-2013/ http://blog.stephenwolfram.com/2013/03/talking-about-the-computational-future-at-sxsw-2013/#comments Tue, 19 Mar 2013 14:31:54 +0000 Stephen Wolfram http://blog.internal.stephenwolfram.com/?p=5218 Last week I gave a talk at SXSW 2013 in Austin about some of the things I’m thinking about these days—including quite a few that I’ve never talked publicly about before. Here’s a video, and a slightly edited transcript:


Well, this is a pretty exciting time for me. Because it turns out that a whole bunch of things that I’ve been working on for more than 30 years are all finally converging, in a very nice way. And what I’d like to do here today is tell you a bit about that, and about some things I’ve figured out recently—and about what it all means for our future.

This is going to be a bit of a wild talk in some ways. It’s going to go from pretty intellectual stuff about basic science and so on, to some really practical technology developments, with a few sneak peeks at things I’ve never shown before.

Let’s start from some science. And you know, a lot of what I’ll say today connects back to what I thought at first was a small discovery that I made about 30 years ago. Let me tell you the story.

I started out at a pretty young age as a physicist. Diligently doing physics pretty much the way it had been done for 300 years. Starting from this-or-that equation, and then doing the math to figure out predictions from it. That worked pretty well in some cases. But there were too many cases where it just didn’t work. So I got to wondering whether there might be some alternative; a different approach.

At the time I’d been using computers as practical tools for quite a while—and I’d even created a big software system that was a forerunner of Mathematica. And what I gradually began to think was that actually computers—and computation—weren’t just useful tools; they were actually the main event. And that one could use them to generalize how one does science: to think not just in terms of math and equations, but in terms of arbitrary computations and programs.

So, OK, what kind of programs might nature use? Given how complicated the things we see in nature are, we might think the programs it’s running must be really complicated. Maybe thousands or millions of lines of code. Like programs we write to do things.

But I thought: let’s start simple. Let’s find out what happens with tiny programs—maybe a line or two of code long. And let’s find out what those do. So I decided to do an experiment. Just set up programs like that, and run them. Here’s one of the ones I started with. It’s called a cellular automaton. It consists of a line of cells, each one either black or not. And it runs down the page computing the new color of each cell using the little rule at the bottom there.

Rule 254

OK, so there’s a simple program, and it does something simple. But let’s point our computational telescope out into the computational universe and just look at all simple programs that work like the one here.

Cellular automata rules

Well, we see a bunch of things going on. Often pretty simple. A repeating pattern. Sometimes a fractal. But you don’t have to go far before you see much stranger stuff.

This is a program I call “rule 30”. What’s it doing? Let’s run it a little longer.

Rule 30

That’s pretty complicated. And if we just saw this somewhere out there, we’d probably figure it was pretty hard to make. But actually, it all comes just from that tiny program at the bottom. That’s it. And when I first saw this, it was my sort of little modern “Galileo moment”. I’d seen something through my computational telescope that eventually made me change my whole world view. And made me realize that computation—even as done by a tiny program like the one here—is vastly more powerful and important than I’d ever imagined.
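For readers who want to try this, here is a minimal Python sketch of the kind of program being described: a line of cells, each black or white, updated from its own color and its two neighbors’ colors according to a rule number from 0 to 255. The function names, the row width and the wraparound at the edges are choices made for this illustration.

```python
def step(cells, rule):
    """Apply an elementary cellular automaton rule once (edges wrap around).

    The three cells (left, center, right) form a number from 0 to 7;
    that bit of the rule number gives the cell's new color.
    """
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

def run(rule, width=31, steps=15):
    """Start from a single black cell in the middle and record every row."""
    cells = [0] * width
    cells[width // 2] = 1
    rows = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        rows.append(cells)
    return rows

def render(rows):
    return "\n".join("".join("#" if c else "." for c in row) for row in rows)

print(render(run(254)))  # a simple, uniformly growing pattern
print()
print(render(run(30)))   # the same kind of rule, much more complicated behavior
```

Changing the single number passed to `run` is all it takes to move from one program to another.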

Cellular automata

Well, I’ve spent the past few decades working through the consequences of this. And it’s led me to build a new kind of science, to create all sorts of practical technology, and to think about almost everything in a different way. I published a big book about the science about ten years ago. And at the time when the book came out, there was quite a bit of “paradigm shift turbulence”. But looking back it’s really nice to see how well the science has taken root.

Stephen Wolfram—A New Kind of Science

Academic papers making use of NKS

And for example there are models based on my kinds of simple programs showing up everywhere. After 300 years of being dominated by Newton-style equations and math, the frontiers are definitely now going to simple programs and the new kind of science.

But there’s still one ultimate app out there to be done: to figure out the fundamental theory of physics—to figure out how our whole universe works. It’s kind of tantalizing. We see these very simple programs, with very complex behavior.

Cellular automaton

It makes one think that maybe there’s a simple program for our whole universe. And that even though physics seems to involve more and more complicated equations, somewhere underneath it all there might just be a tiny little program. We don’t know if things work that way. But if out there in the computational universe of possible programs, the program for our universe is just sitting there waiting to be found, it seems embarrassing not to be looking for it.

Now if there is indeed a simple program for our universe, it’s sort of inevitable that it has to operate kind of underneath our standard notions like space and time and so on. Maybe it’s a little like this.

Network

A giant network of nodes, that make up space a bit like molecules make up the air in this room. Well, you can start just trying possible programs that create such things. Each one is in a sense a candidate universe.

Collection of universes

And when you do this, you can pretty quickly say most of them can’t be our universe. Time stops after an instant. There are an infinite number of dimensions. There can’t be particles or matter. Or other pathologies.

But what surprised me is that you don’t have to go very far in this universe of possible universes before you start finding ones that are very plausible. And that for example seem like they’ll show the standard laws of gravity, and even some features of quantum mechanics. At some level it turns out to be irreducibly hard to work out what some of these candidate universes will do. But it’s quite possible that already caught in our net is the actual program for our universe. The whole thing. All of reality.

Well, if you’d asked me a few years ago what I thought I’d be doing now, I’d probably have said “hunting for our universe”. But fortunately or unfortunately, I got seriously sidetracked. Because I realized that once one starts to understand the idea of computation, there’s just an incredible amount of technology one can build—that’s to me quite fascinating, and that I think is also pretty important for the world. And in fact, right off the bat, there’s a whole new methodology one can use for creating technology.

I mean, we’re used to doing traditional engineering—where we build things up step by step. But out there in the computational universe, we now know that there are all these programs lying around that already do amazing things. So all we have to do is to go out and mine them, and find ones that fit whatever technological purpose we’re trying to achieve.

And actually we’ve been using this kind of automated algorithm discovery for quite some time now. By now Mathematica and Wolfram|Alpha are full of algorithms and programs that no human would ever have come up with, but were just found by systematically searching the computational universe. There’s a lot that can be done like this. Not just for algorithms, but for art, like this, and for physical structures and devices too.

WolframTones

Here’s an important point that comes from the basic science. 75 years ago Alan Turing gave us the idea of universal computation. Which is what showed that software was possible, and eventually launched the whole computer revolution. Well, from the science I’ve done comes what I call the Principle of Computational Equivalence. Which among other things implies that not only are universal computers possible; they’re actually really common out there in the computational universe. Like this is the simplest cellular automaton that we know is a universal computer—with that tiny little rule at the bottom there.

Rule 110

And from a very successful piece of crowdscience that we did a few years ago, we know this is the simplest possible universal Turing machine.

The Wolfram 2,3 Turing Machine Research Prize

Tiny things. That we can reasonably expect exist all over the natural world. But that are computationally just as powerful as any computer we can build, or any brain, for example. Which explains, by the way, why so much of nature seems so hard for us to decode.

And actually, this starts to get at some big old questions. Like free will. Or like the nature of intelligence. And one of the things that comes out of the Principle of Computational Equivalence is that there really can’t be something special that is intelligence—it’s all just computation. And that has important consequences for thinking about extraterrestrial intelligence. And also for thinking about artificial intelligence.

For me it was this philosophical breakthrough that led to a very practical piece of technology: Wolfram|Alpha. Ever since I was a kid I’d been interested in seeing how to take as much of the knowledge that’s been accumulated in our civilization as possible and make it computable. Somehow make it so that if there’s a question that can be answered on the basis of this knowledge, it can be done automatically.

For years I thought that doing that would require building something like a brain. And every decade or so I would ask myself if it was time yet, and I would conclude that it was just too hard. But finally from the Principle of Computational Equivalence I realized that, no, it all had to be doable just with computation. And that’s how I came to start building Wolfram|Alpha.

I hope you’ve mostly seen Wolfram|Alpha—on the web, in Siri, in apps, or wherever.

life of pi box office vs die hard

The idea is: you ask a question, in natural language, and Wolfram|Alpha tries to compute the answer, and generate a report, using knowledge that it has. At some level, this is an insanely difficult thing to make work. And if we hadn’t managed to do it, I might have thought it was pretty much impossible.

First, you’ve got to get all that data, on all sorts of things in the world. And no, you can’t just forage it from the web. You have to actually go interact with all the primary sources. Really understand the data, with actual human experts. And curate it to the point where it can reliably be used to compute from. And by now I think we’ve got more bytes of raw data inside Wolfram|Alpha than there is meaningful text content on the whole web.

But that’s only the beginning. Most questions people have aren’t answered just by retrieving a piece of data. They need some kind of computation. And for that we’ve had to take all those methods and models and algorithms that come from science and engineering and financial analysis and whatever and implement them. And by now it’s more than ten million lines of very high-level Mathematica code.

So we can compute lots of things. But now we’ve got to know what to compute. And the only realistic way for humans to interface with something this broad is through humans’ natural language. It’s not just keywords; it’s actual pieces of structured language, written or spoken. And understanding that stuff is a classic hard problem.

But we have two secret weapons. First, a bunch of methods from my new kind of science. And second, actual underlying knowledge, a bit like us humans have, that lets us decode and disambiguate.

Over the 3 years since Wolfram|Alpha launched I’m pleased at how far we’ve managed to get. It’s hard work, but now more than 90% of the queries that come to our website we can completely understand. We’ve really cracked the natural language problem, at least for these small snippets.

So once we’ve understood the input, what do we do? Well, what we’ve found is that people almost never want just one answer—42 or whatever. They want a whole custom report built for them. And we’ve developed a methodology now for automatically figuring out what information to present, and how to present it.

Many millions of people use this every day. A few web tourists. An awful lot of students, and professionals, and people wanting to figure all kinds of things out. It’s kind of nice to see how few of the queries we get are things that you can just search for on the web. People are asking us fresh, new, questions whose answers have never been written down before. So the only way to get those answers would be to find a human expert to ask—or to have Wolfram|Alpha compute them. It’s a huge project that I personally expect to keep working on forever.

It’s fascinating of course. Combining all these different areas of human knowledge. Figuring out things like how to curate and make computable human anatomy, or the 3 million or so theorems that exist in the literature of mathematics. I’m quite proud of how far we’ve got already, and how much faster we’re getting at doing things.

Wolfram|Alpha examples

And, you know, it’s not just about public knowledge. We’re also now able to bring in uploaded material, and use our algorithms and knowledge to analyze it. We can bring in a picture. And Wolfram|Alpha will tell us things about it.

Image upload with Wolfram|Alpha Pro

And we could explicitly tell Wolfram|Alpha to do some image computation. It works really nicely on a phone. Or we could upload a spreadsheet. And Wolfram|Alpha can use its linguistics to decode what’s in it, and then automatically generate a report about what’s interesting in the data.

Or we could get data from some internal database and ask natural language questions about it. And get custom reports automatically generated that can use external data as well as internal data. It’s incredibly powerful. And actually we have quite a business going building custom versions of Wolfram|Alpha for companies and other organizations.

It’s gradually getting more and more automated, and actually we’re planning to spin off a company specifically to do this kind of thing.

And you know, given the Wolfram|Alpha technology stack, there are so many places to go. Like having Wolfram|Alpha not just generate information, but actually do things too. You tell it something in natural language. And it uses algorithms and knowledge to figure out what to do.

Here’s a sophisticated case. As part of our high-end business, last year we released Wolfram SystemModeler.

Wolfram SystemModeler

Which is a tool for letting one design and simulate complex devices with tens of thousands of components. Like airplanes or turbines. Well, hooking this up to Wolfram|Alpha, we’ll be able to just ask questions to Wolfram|Alpha, and have it go to SystemModeler to automatically simulate a device, and then figure out how to do something.

Wolfram SystemModeler

Here’s a different direction: set Wolfram|Alpha loose on something like a document, where it can use our natural language technology to automatically add computation.

You know, today Wolfram|Alpha operates as an on-demand system: you say something to it, and it’ll respond. But in the future, it’s increasingly going to be used in a preemptive way. It’s going to sense or see something, and it’s automatically going to show you what it thinks you should know. Right now, the main issue that we see in people using Wolfram|Alpha is that they don’t understand all the things it can do. But in this preemptive mode, there’s no issue with that kind of discovery. Wolfram|Alpha is just going to automatically be figuring out what to show people. And once the hardware for augmented reality is there, this is going to be really neat. I mean, within Mathematica we now have what I think is the world’s most powerful image computation system. And combining this with Wolfram|Alpha capabilities, we’re going to be able to do a lot.

Wolfram Mathematica 9

I mentioned Mathematica here. It’s sort of our secret weapon. It’s how we’ve managed to do everything we’ve done. Including build that outrageously complex thing that is Wolfram|Alpha. Many of you I hope have heard of Mathematica. This June it’ll be the 25th anniversary of the original release of Mathematica. And I’m proud of how many inventions and discoveries have now been made in the world using Mathematica over that period of time. As well as how many students have been educated with it.

You know, I originally built Mathematica for a kind of selfish reason: I wanted to have it myself. And my goal was to make it broad enough that it could handle sort of any kind of computation I’d ever want to do. My approach was kind of a typical natural-science one. Think about all those different kinds of computations, drill down and try to understand the primitives that lie beneath them, and then implement those primitives in the system. And in a sense my plan was ultimately just to implement anything systematic and algorithmic that could be implemented.

Now I had a very important principle right from the beginning: as the system grew, it must always remain consistent and unified. Every new capability that was added must coherently fit into the structure of the system. And it was a huge amount of work to maintain that kind of design discipline. But I have to say that particularly in the last 10 years or so, it’s unbelievably paid off. Certainly it’s important in letting people learn what’s now a very big system. But even more important is that it’s allowed us to have a very powerful kind of recursive development process, in which anything we add now can “for free” use those huge blocks of functionality that we’ve already built.

The result is that we’ve been covering huge algorithmic areas incredibly fast, and with much more powerful algorithms than have ever been possible before. Actually, a lot of the time we’re really building not just algorithms, but meta-algorithms. Because another big principle we have is that everything should be as automated as possible.

You as a human want to just tell Mathematica what task you’re trying to perform. And there might be 200 different algorithms that could in principle be used. But it’s up to Mathematica to figure out automatically what the best one is. Internally, Mathematica is using very sophisticated algorithms—many of which we’ve invented. But the great thing is that a user doesn’t have to know anything about the details; that’s all handled automatically.
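As a rough illustration of what a "meta-algorithm" means here, the sketch below picks a sorting method from simple features of its input. It is written in Python rather than the Wolfram Language, it is not how Mathematica actually selects algorithms, and the names `auto_sort`, `counting_sort` and the `max_range` threshold are all made up for the example.

```python
from collections import Counter


def counting_sort(values):
    """Sort integers from a small range by tallying how often each value occurs."""
    counts = Counter(values)
    result = []
    for v in range(min(values), max(values) + 1):
        result.extend([v] * counts[v])  # counts[v] is 0 for values that never occur
    return result


def auto_sort(values, max_range=1000):
    """Choose a sorting method automatically and report which one was used.

    Returns (sorted_list, method_name). The max_range threshold is an
    arbitrary illustrative choice, not a tuned value.
    """
    values = list(values)
    if not values:
        return [], "trivial"
    all_ints = all(isinstance(v, int) and not isinstance(v, bool) for v in values)
    if all_ints and max(values) - min(values) <= max_range:
        return counting_sort(values), "counting"
    # Fall back to the general-purpose comparison sort.
    return sorted(values), "comparison"


print(auto_sort([3, 1, 2, 3]))        # small integer range
print(auto_sort([2.5, -1.0, 7.25]))   # floats use the general method
```

The caller only states the task ("sort this"); which method runs is decided inside the function, which is the point being made above on a vastly larger scale.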

You know, Mathematica has by far the largest set of interconnected algorithmic capabilities that’s ever existed. And it’s not just algorithms that are built in; it’s also knowledge. Because all the knowledge in Wolfram|Alpha is directly accessible, and progressively more closely integrated, in Mathematica. It’s really quite a transformational thing. I call it knowledge-based computing. Whether you’re using the Wolfram|Alpha API or Mathematica, you’re able to do computing in which you can in effect start from the knowledge of the world, and then build from there.

I have to say that I’ve increasingly realized that Mathematica has been rather undersold. People think of it as that great tool for doing math. Which it certainly is. But it’s so much more than that. It was designed that way from the beginning, and as the years go by “math” becomes a smaller and smaller fraction of what the capabilities of Mathematica are about.

Really there are several parts to Mathematica. The most fundamental is the language that Mathematica embodies. It’s ultimately based on the idea that everything can be represented as a symbolic expression. Whether it’s an array of data, an image, a document, a program, an interface, whatever. This is an idea that I had more than 25 years ago—and over the years I’ve gradually realized just how powerful it is: having a small set of primitives that can seamlessly handle all those different kinds of things, and that provides in a sense an elegant “fusion” of many popular modern programming paradigms.
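To make the "everything is a symbolic expression" idea a little more concrete, here is a minimal Python sketch (not Wolfram Language, and not its actual implementation) in which a formula, a list of data and an interface element are all the same kind of tree: a head plus arguments. The class `Expr` and the helpers `full_form` and `leaf_count` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """A symbolic expression: a head applied to zero or more arguments."""
    head: str
    args: tuple = ()


def full_form(e):
    """Render any expression uniformly as head[arg1, arg2, ...]."""
    if isinstance(e, Expr):
        return e.head + "[" + ", ".join(full_form(a) for a in e.args) + "]"
    return str(e)  # atoms (numbers, names, strings) render as themselves


def leaf_count(e):
    """Count heads and atoms; the same function works on every kind of expression."""
    if isinstance(e, Expr):
        return 1 + sum(leaf_count(a) for a in e.args)
    return 1


formula = Expr("Plus", ("x", Expr("Times", (2, "y"))))          # x + 2 y
data = Expr("List", (1, 2, 3))                                  # an array of data
button = Expr("Button", ("Click me", Expr("Print", ("hi",))))   # an interface element

for e in (formula, data, button):
    print(full_form(e), leaf_count(e))
```

Because all three objects share one representation, a single generic function can traverse, print or transform any of them.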

In addition to the symbolic character of the language, there’s another key point. Essentially every other computer language has just a small set of built-in operations. Yes, it has all sorts of mechanisms for handling in a sense the “infrastructure” of programming. But when it comes to algorithms and so on, there’s very little there. Maybe there are libraries, but they’re not unified, and they’re not really part of the language. Well, the point in our language is that all those algorithms are actually built right into the language. And that’s not all, there’s actual knowledge and data also built into the language.

It’s really a new kind of language. Something very different than others. And something incredibly productive for people who use it. But I have to say, in a sense I think it’s been rather hidden all these years. Not that there aren’t millions of people using the language through Mathematica. But there really should be a lot more, including lots who wouldn’t be caught dead doing anything that anyone might think had “math” in it.

Really anyone who’s doing anything algorithmic or computational should be using it. Because it’s inevitably just much more efficient than anything else—because it has so much already built in. So one of the new things that we’re doing is to break out the language that Mathematica is based on, and give it a separate life. We’ve been thinking about this for more than 20 years. But now it’s finally going to happen.

We agonized for a long time about what to call the language. We came up with all kinds of names (clever, whimsical, whatever), and actually just recently on my blog I asked people for their comments and suggestions. And I suppose the result was a little embarrassing. Because after all the effort we put in, by far the most common response about the name we should use was the most obvious and straightforward one. We should call it the Wolfram Language.

So that’s what it’ll be. The language we’ve built for Mathematica, with that huge network of built-in algorithms and knowledge, will be called the Wolfram Language. It’ll use .wolf files, and of course that means its icon has to be something like this:

Wolfram Language logo

What’s going to happen with this language? Well, here’s where things really get interesting. The language was originally built for the desktop platform that’s the current way most people use Mathematica. But in Wolfram|Alpha, for example, the language is running on a large scale in the cloud. And what’s going to be happening over the next few months is that we’ll be releasing a full cloud version. And not only that, there’ll also be a version running locally on mobile, first under iOS.

Why is that important? Well, it really opens up the language, both its use and its deployment. So, for example, we’re going to have the Wolfram Programming Cloud, in which you can freely write code in the language—anything from a pithy one-liner to something giant—right there in the cloud. And then immediately deploy in all sorts of ways.

If you wanted, you could just run it in an interactive session, like in standard Mathematica. But you can also generate an instant API. That you can call from anywhere, to just seamlessly run code in our cloud. Or you can embed the code in a page, or have the code just run in the background, periodically generating reports or whatever. And then you can take the exact same code, and deploy it on mobile too.

Now something else that we’ve built and refined over the years in Mathematica is our dynamic interface, that uses symbolic expressions to represent controls and interactivity. Not every use of the Wolfram Language uses that interface. But what’s happening is that we’re reinterpreting the interface to optimize it not just for the desktop, but also for the cloud and for mobile.

One place the interface is used big time is in what we call “CDF”: our computable document format. We introduced this a couple of years ago. Underneath it’s Wolfram Language code. On top, it’s a dynamic interactive interface that one can use to make reports and presentations and interactive documents of any kind. Right now, they can be in a plugin in a browser, or they can be standalone on a desktop. What’s happening now is that they can also be on mobile, or, with cloud CDF, they can operate in a pure web page, with no plugin, but just sending every computation to the cloud.

It might sound a bit abstract here. But I think the whole deployment of the Wolfram Language is going to be quite a revolution in programming. There’ve been seeds of this in Mathematica for a quarter of a century. But it’s a kind of convergence of cloud and mobile technology—and frankly our own understanding of the power of what we have—that’s making all this happen now.

You know, the fact that it’s so easy to get so much done in the language is not only important for professional programmers; it’s also really important for kids and anyone else who’s learning to program. Because you don’t have to type much in, and you’re immediately doing serious stuff. And, by the way, you get to learn all those state-of-the-art programming and algorithm concepts right there. And also: there’s an on-ramp that’s easier than anyone’s ever had before, with free-form natural language courtesy of the Wolfram|Alpha engine. It really seems to work very well for this purpose—as we’ve seen in our Mathematica Summer Camp for high-school kids, and our new after-school initiative for middle-school kids.

Maybe I should actually show a demo of all this stuff.

CountryData["SouthAmerica"]

{Argentina, Bolivia, Brazil, Chile, Colombia, Ecuador, FalklandIslands, FrenchGuiana, Guyana, Paraguay, Peru, Suriname, Uruguay, Venezuela}

CountryData[#, "Flag"] & /@ %

Flags of South American countries

EdgeDetect /@ %

Edges of flags of South American countries
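For readers unfamiliar with the notation in the demo inputs: `f /@ list` applies `f` to every element of a list, `%` stands for the previous output, and `... # ... &` is an anonymous function whose argument is `#`. The Python sketch below mirrors that pipeline shape only. The functions `country_data` and `edge_detect` are hypothetical stand-ins that return text labels, not real flag images or real image processing.

```python
def country_data(name, prop):
    """Hypothetical stand-in for a knowledge lookup; returns a label, not real data."""
    return f"{prop} of {name}"


def edge_detect(image):
    """Hypothetical stand-in for an image operation; just wraps its input in a label."""
    return f"edges({image})"


# First few entries of the country list shown in the demo output.
countries = ["Argentina", "Bolivia", "Brazil"]

# Analogue of: CountryData[#, "Flag"] & /@ %
# An anonymous function is applied to each element of the previous result.
flags = list(map(lambda c: country_data(c, "Flag"), countries))

# Analogue of: EdgeDetect /@ %
# A named function is applied to each element of the previous result.
edges = list(map(edge_detect, flags))

print(flags)
print(edges)
```

Each step consumes the previous step's output, which is why the three demo inputs can be so short.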

There is a whole mechanism for deploying these dynamic things using CDF.

One application area that’s fun—and topical these days—is using algorithmic processes to make things that one can 3D-print.

3D printing example

That was the Wolfram Language on the desktop, and CDF. Here it is in the Programming Cloud.

Wolfram Programming Cloud

That’s cloud CDF. This also works on iOS, though the controls look a bit different.

In the next little while, you’ll be seeing a variety of platforms based on our technology. The Document Platform, for creating CDF documents, in the cloud or elsewhere. The Presentation Platform, for creating full computable interactive presentations. The Discovery Platform, optimized for the workflow of discovering things with our technologies.

Many of these involve not just the pure language, but also CDF and our dynamic interface technology. But one important thing that’s just happening now is that the Wolfram Language, with all its capabilities, is starting to fit in some very cheap hardware. Like Raspberry Pi. For years if you wanted to embed algorithms into some device, you’d have to carefully compile them into some low-level language or some such. But here’s the great thing: for the first time, this year, embeddable processors are powerful enough that you can just run the whole Wolfram Language, right on them. So you can be doing your image processing, or your control theory computation, right there, with all the power of everything we’ve built in Mathematica.

By the way, I might say something about devices. The whole landscape of sensors and devices is changing, with everything getting more diverse and more ubiquitous. And one important thing we’re doing is making a general Connections Hub for sensors and devices. In effect we’re curating sensors and devices, and working with lots of manufacturers. So that the data that comes from their systems can seamlessly flow into Wolfram|Alpha, or into anything based on the Wolfram Language. We’re building a generic analytics system that anyone can plug into. It can be used in a fully automatic way, like in Wolfram|Alpha Pro. And it can be arbitrarily customized and programmed, using the Wolfram Language.

By the way, another component of this, primarily for researchers, is that we’re building a general Data Repository. What’s neat here is that because of our Wolfram|Alpha linguistic capabilities, we can automatically read and align data. And then of course we can do analysis. When you read a research paper today, if you’re lucky there’ll be some URL listed where you can find data in some raw form. But with our Data Repository people are going to be able to have genuinely “data-backed papers”. Where anyone can immediately do comparisons or new analysis.

Talking of data, I’ve been a big collector of it personally for a long time. Last year here I showed for the first time some of my nearly 25-year time series of personal analytics data. Here’s the new version.

Plot of every email sent

That’s every email I sent, including this year.

Plot of keystrokes

That’s keystrokes.

Daily rhythms

And that’s my whole average daily rhythm over the past year.

Oh, and here’s something useful I built actually right after South by Southwest last year, that I was embarrassed I didn’t have before: the time series of the number of pending and unanswered emails I have. (It’s computing in real time here in our cloud platform.)

Time series of the number of pending and unanswered emails over the last 30 days

It’s sort of a proxy for busyness level. Which is pretty useful in managing my schedule and so on.

Well, bizarre as it seems to me, I may be the human who’s ended up collecting the most long-term data on themselves of anyone.

But nowadays everyone’s got lots of data on themselves. Like on Facebook, for example. And so in Wolfram|Alpha we recently released Personal Analytics for Facebook. It’ll be coming out in an app soon too. So you can just go to Wolfram|Alpha and ask for a Facebook report, and it’ll generate actually a whole little book about you, combining analysis of your Facebook data with public computational knowledge.

My personal Facebook is a mess, but here’s what the system does on it:

Stephen Wolfram's Facebook report

When we first released our Personal Analytics for Facebook we were absolutely draconian about not keeping any data. And no doubt we destroyed some great sociometric science in the making. But a month or so ago we started keeping some anonymized data, and started a Data Donor program, which has been very successful. So now we can explore quite a few things. Like here are a few friend graphs.

Facebook friend graphs

There’s a huge diversity. Each one tells a story. Both about personality and circumstances.

But let’s look at some aggregate information. Like here’s the distribution of the number of friends that people have.

Distribution of the number of Facebook friends that people have

And this shows the distribution of ages of friends for a person of a particular age.

Distribution of ages of friends for a person of a particular age

The distribution gets broader with age. Actually, after about age 25, there’s some sort of new law of nature one discovers: that at any age about half of people’s friends are between 25% younger and 25% older.
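To spell out the arithmetic of that rule of thumb, here is a small Python sketch. It assumes "between 25% younger and 25% older" means ages from 0.75 to 1.25 times the person's own age, which is one reading of the statement. The friend ages in the example are made up purely to exercise the function and are not taken from the Facebook data.

```python
def age_band(age, spread=0.25):
    """Return the (low, high) ages that are `spread` younger and older than `age`."""
    return (age * (1 - spread), age * (1 + spread))


def fraction_in_band(age, friend_ages, spread=0.25):
    """Fraction of friends whose age falls inside the band around `age`."""
    if not friend_ages:
        return 0.0
    low, high = age_band(age, spread)
    inside = sum(1 for a in friend_ages if low <= a <= high)
    return inside / len(friend_ages)


# Made-up friend ages for a hypothetical 40-year-old.
friend_ages = [28, 33, 38, 41, 45, 52, 57, 61]

print(age_band(40))                       # band for a 40-year-old: 30 to 50
print(fraction_in_band(40, friend_ages))  # share of these friends inside the band
```

Because the band scales with age (30 to 50 at age 40, but 45 to 75 at age 60), a fixed one-half share inside it is consistent with the distribution getting broader as people get older.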

By the way, in Mathematica and the Wolfram Language there’s also now direct access to social media data for Facebook, LinkedIn, Twitter and so on. So you can do all kinds of interesting analysis and visualization.

Actually, talking of Personal Analytics, here’s a new dimension. I’ve been walking around South by Southwest for a couple of days wearing this cute Memoto camera, which takes a picture every 30 seconds. And last night my 14-year-old was kind enough to write a bit of code to analyze what I got. Here’s what he came up with.

Memoto camera data

You know, it’s pretty neat to see how our big technology stack makes all this possible. I mean, even just to read stuff properly from Facebook we’ve got to be able to understand free-form input. Which of course we can with the Wolfram|Alpha Engine. And then to say interesting things we’ve got to use knowledge and algorithms. Then we’ve got to have good automated visualization. And it helps to have state-of-the-art large-graph-manipulation algorithms in the Wolfram Language Engine. And also to have CDF and our Dynamic Interface to generate complete reports.

To me it’s exciting—if a little overwhelming—to see how many things can be moved forward with our technology stack. One big one is education. Of course Wolfram|Alpha and Mathematica are extremely widely used—and well known—in education. And they’re used as central tools in endless courses and so on.

But with our upcoming Cloud Platform lots of new things are going to become possible. And as my way to understand that, I’ve decided it’s time for me to actually make a course or two myself. You know, I was a professor once, before I was a CEO. But it’s been 25 years. Still, I decided the first course to do was one on Data Science. An Introduction to Data Science. I’m having a great time.

Data Science Course example

Data Science is a terrific topic. Really in the modern world everyone should learn it. It’s both immediately useful, and a great way to teach programming, as well as general computational and quantitative thinking.

Between our Cloud Platform and the Wolfram Language, we have a great way to set up the actual course. Here’s the basic setup. Below the video there’s a window where you can just immediately play with all the code that’s shown. And because it’s just very high-level Wolfram Language code it’s realistic to learn in effect just by immersion.

And when it comes to setting up exercises and so on, it’s pretty interesting when you have Wolfram|Alpha-style natural language understanding capabilities and so on. I hope the Data Science course will be ready to test in a limited number of months. And, needless to say, it’s all being built with a very automated authoring system, that’ll allow lots of people to make courses like this. I’m thinking about trying to do a math course, for example.

We get asked about math education a lot, of course. And actually we have a non-profit spinoff called Computer-Based Math that’s been trying to create what we see as being a modern computer-informed math curriculum. You see, the current math curriculum was mostly set a century ago, when the world was very different. Two things have changed today: first, we’ve got computers that can automate the mechanical doing of math. And second, there are lots of new and different ways that math gets used in the world at large.

Computer-Based Math

It’s going to be a long process modernizing math education, around the world. We’d been wondering what the first country really to commit to Computer-Based Math would be. Turns out it’s Estonia, which signed up a few weeks ago.

So we’re slowly moving toward people being educated in this kind of computational paradigm. Which is good, because the way I see it, computation is going to become central to almost every field. Let’s talk about two examples from the classic professions: law and medicine. It’s funny, when Leibniz was first thinking about computation at the end of the 1600s, the thing he wanted to do was to build a machine that would effectively answer legal questions. It was too early then. But now we’re almost ready, I think, for computational law. Where for example contracts become computational. They explicitly become algorithms that decide what’s possible and what’s not.

You know, some pieces of this have already happened. Like with financial derivatives, like options and futures. In the past these used to just be natural language contracts. But then they got codified and parametrized. So they’re really just algorithms, which of course one can do meta-computations on, which is what has launched a thousand hedge funds, and so on.
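As a small illustration of a contract that is "really just an algorithm", the Python sketch below writes the standard payoff at expiry of a call and a put option as functions, and then does a trivial computation on top of them. The scenario prices are made up, and `average_payoff` is a hypothetical helper for the example, not a pricing model.

```python
def call_payoff(spot, strike):
    """Payoff at expiry of a call option: the right to buy at `strike`."""
    return max(spot - strike, 0.0)


def put_payoff(spot, strike):
    """Payoff at expiry of a put option: the right to sell at `strike`."""
    return max(strike - spot, 0.0)


def average_payoff(payoff, strike, scenarios):
    """A simple computation *about* a contract: its mean payoff over given scenarios."""
    return sum(payoff(s, strike) for s in scenarios) / len(scenarios)


# Made-up prices of the underlying asset at expiry.
scenarios = [80.0, 100.0, 120.0, 140.0]

print(call_payoff(120.0, 100.0))
print(average_payoff(call_payoff, 100.0, scenarios))
print(average_payoff(put_payoff, 100.0, scenarios))
```

Once the terms are parametrized like this, the contract can be passed around as data, which is what makes further analysis on top of it possible.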

Well, eventually one’s going to be able to make computational all sorts of legal things, from mortgages to tax codes to perhaps even patents. Now to actually achieve that, one has to have ways to represent many aspects of the real world, in all its messiness. Which is what the whole knowledge-based computing of Wolfram|Alpha is about.

sore throat + cough

How about medicine? To me probably the single most important short-term target in medicine is diagnosis. If you get a diagnosis wrong—and an awful lot are wrong in practice—then all the effort and money you spend is going to be wasted, and is often even going to be harmful. Now diagnosis is a difficult thing for humans. And as more is discovered in medicine—and medicine gets more specialized—it gets even more difficult. But I suspect that in fact diagnosis is in some sense not so hard for computers. But it’s a big project to make a credible automated diagnosis system. Because you have to cover everything: it’s no good just doing one particular kind of disease, because then all you’re going to do is say that everyone has it.

By the way, the whole area of diagnosis is about to change—as a result of the arrival of sensor-based medicine. It used to be that you could ask a question or do a test, and the result would be one bit, or one number. But now it’s routine to be able to get lots and lots of data. And if we’re really going to use that data, we’ve got to use computers; humans just don’t deal with that kind of thing. It’s an ambitious project with many pieces, but I think that using our technology stack—and some ideas from science I’ve developed—we know how to do automated medical diagnosis. And we’re actually spinning off a company to do this.

You know, it’s interesting to think about the broad theory of diagnosis. And I think an interesting model for medical diagnosis is software diagnosis—figuring out what’s going wrong with a large running software system. In medicine we have all these standard diagnosis codes. For an operating system one might imagine having things like “diseases of the memory management system” or “diseases of the keyboard driver”. In medicine, we’re starting to be able to measure more and more. But in software we can in principle monitor almost everything. But we need methodologies to interpret what we’re seeing.

By the way, even though I think diagnosis is in the short term a critical point in medicine, I think in the long term it’s simply going to go away. In fact, from my science—as well as the software analogy—I think it’s clear that the idea of discrete diseases is just wrong. Of course, today we have just a few thousand drugs and surgeries we can use. But I think more and more we’ll be using algorithmic treatments. Whether it’s medical devices that behave according to algorithms, or whether it’s even programmable drugs that effectively do a computation at the molecular scale to work out how to act. And once the treatments are algorithmic, we’re really going to want to go directly from data on symptoms to working out the treatment, often adaptively in real time.

My guess is it’s going to end up a bit like a financial portfolio. You watch what the stocks do, and you have algorithms to decide how to respond. And you don’t really need to have a verbal description—like the technical trader’s “head and shoulders” pattern or something—of what the stock chart is doing.

By the way, when you start thinking about medicine in fundamentally computational terms, it gives you a different view of human mortality. It’s like the operating system that’s running, and over the course of time has various kinds of trauma and infections, starts running slower, and eventually crashes, and dies. If we’re going to avoid mortality, we need to understand how to intervene to keep the operating system—or the human—up and running. There are lots of interim steps. Taking over more and more biological functions with technology. And figuring out how to reprogram pieces of the molecular machine that is our body. And figuring out if necessary how to “hit the pause button” to freeze things, presumably with cryonics.

By the way, it’s bizarre how few people work on this. Because I’m sure that, just like cloning, there’s just going to be a wacky procedure that makes it possible—and once we know it, we’re just going to be able to do it quite routinely, and it’s going to be societally very important. But in the end, we want to solve the problem of keeping all the complexity that is a human running indefinitely. There are some fascinating basic science problems here. Connected to concepts like computational irreducibility, and a bit to the traditional halting problem. But I have no doubt that eventually it’ll be solved, and we’ll achieve effective human immortality. And when that happens I expect it’ll be the single biggest discontinuity in human history.

Cellular automata

You know, as one thinks about such things, one can’t help wondering about the general future of the human condition. And here’s something someone like me definitely thinks about. I’m spending my life trying to automate things. Trying to make it possible to do automatically with computation things that humans used to have to do themselves.

Now, if we look at the arc of human history, the biggest systematic change through time is the arrival of more and more technology, and the automation of more and more kinds of tasks. So here’s a question: what if we succeed in automating everything? What will happen then? What will the humans do? There’s an ultimate—almost philosophical—version of this question. And there’s also a practical next-few-decades version.

Let’s start with the ultimate version. As we go on and build more and more technology, what will the end point be? We might assume that we could somehow go on forever, achieving more and more. But the Principle of Computational Equivalence tells us that we cannot. Once we have reached a certain level, everything is already in a sense possible. And even though our current engineering has not yet reached this point, the Principle of Computational Equivalence also tells us that this maximal level of computational sophistication is not particularly rare. Indeed it happens in many places in the physical world, as well as in systems like simple cellular automata.

Cellular automata

And it’s not too hard to see that as we improve our technology, getting down to the smallest scales, and removing everything that seems redundant, we might wind up with something that looks just like a physical process that already happens in nature. So does this mean that in the ultimate future, with all that great automation and technology, all we’ll achieve is just to produce something that’s indistinguishable from zillions of things that already exist in nature?

In some sense, yes. It’s a sort of ultimate Copernicanism: not only is our Earth not the center of the universe, and our bodies not made of something physically unique. But also, what we can achieve and create with our intelligence is not in a fundamental sense different from what nature is already doing.

So is there any meaningful ultimate future for us? The answer is yes. But it’s not about doing some kind of scientific utopian thing, and achieving some ultimate perfect state that’s independent of our history. Rather, it’s about doing things that depend on all those messy details of us humans and our history.

Here’s a way to understand this. Imagine our technology has got us a complete AI sitting in a box on a desk. It can do all sorts of incredible things; all sorts of sophisticated computations. The question is: what will it choose to do? It has no intrinsic way to decide. It needs some kind of goal, some kind of purpose, imposed on it. And that’s where we humans and our history come in. I mean, for humans, there is again no absolute purpose abstractly defined. We get our notion of purpose from the details of our existence and our history. And to achieve ultimate technology is in a sense empty unless purposes are defined for it, and that’s where we humans come in.

We can begin to see this pretty well even right now. In the past, our technology was such that we typically had to define quite explicitly what systems we build should do, say by writing code that defines each step they should take. But today we’ve increasingly got much more capable systems, that can do all kinds of different things. And we interact with them in a sense by injecting purpose. We define a purpose or a goal, and then the system figures out how it can best achieve that goal.

Well, of course, human purposes have evolved quite a bit over the course of human history. And often their evolution is connected to the arrival of technology that makes more things possible. So it’s not too clear what the limit of this kind of co-evolving system will be, and whether it will turn out to be wonderful or terrible. But in the nearer term, we can ask what effect increasing automation will have on people and society. And actually, as I was thinking about this recently, I thought I’d pull together some data about what’s happened with this historically. So here are some plots over the past 150 years of what fractions of people in the US have been in different kinds of occupations. Blue for males; pink for females.

Fractions of people in the US who have been in different kinds of occupations over the last 150 years

There are lots of interesting details here, like the pretty obvious direct and indirect effects of larger government over the last 50 years. But there’s also a clear signature of automation, with a variety of kinds of occupations simply going away. And this will continue. And indeed my expectation is that over the coming years a remarkable fraction of today’s occupations will successfully be automated. In the past, there’ve always been new occupations that took the place of ones that were automated away. And my guess, or perhaps hope, is that for most people some hybrid of avocation and occupation will emerge.

Which brings me to something I’ve been thinking about quite a lot recently. I’m mostly a science, technology and ideas guy. But I happen also to be very interested in people. And over the years I’ve had the good fortune to work with—and mentor—a great many very talented people. But here’s something I’ve noticed. Many people—and young people in particular—have an incredibly difficult time picking a good occupation—or avocation—for themselves. It’s a bit of a puzzle. People have certain sets of talents and interests. And there are certain niches that exist in the world at any given time. The problem is to match a given person with a niche.

Now sometimes people—and I was an example—pick out a pretty clear niche by the time they’re early teenagers. But an awful lot of people don’t. Usually there are two problems. First, people don’t really identify their skills and interests. And second, people don’t know what’s possible to do in the world. And in the end, an awful lot of people pick directions—almost at random—that aren’t in fact very good for them. And I suspect in terms of wasted resources in the world, this is pretty high up there.

You know, I have a kind of optimistic theory—that’s supported by a lot of personal observation—that for almost every person, there’s at least one really good thing they could be doing, that they will find really fulfilling. They may be lucky or unlucky about what value the world places on that thing at a given time in history. But if they can find that thing—and it often isn’t so easy—then it’s great.

Well, needless to say, I’ve been thinking what can be done. I’ve personally worked on the problem many times. With many great results. Although I have to say that almost always I’ve been dealing with highly capable individuals in good circumstances. And I do want to figure out how to generalize, to younger folk and less good circumstances. But whatever happens, there’s a puzzle to solve. A little like medical diagnosis. Requiring understanding the current situation. Then knowing what’s possible. And one of the practical challenges is knowing enough about how the world is evolving, and what new occupations and ways to operate in the world are emerging.

I’m hoping to do more in this direction. I’m also thinking a bunch about the structure of education. If people have an idea what they might like to do, how do they develop in that direction? The current system with college and so on is pretty inflexible. But I think there are better alternatives, that involve effectively doing diverse mentored projects. Which is something we’ve seen very successfully in the summer schools we’ve done over the past decade.

But anyway, with all this discussion about what people should do: that’s a big challenge for someone like me too. Because I’m in this situation where I’ve been building things for 30 years, and now there are just an absurd number of things that what I’ve built makes possible. We’re pursuing a lot of things at our company. But we only have 700 people, which isn’t enough for everything we want to do. I made a decision long ago to have a simple private company, so we could concentrate on the long term, and on what we really wanted to do. And I’m happy to say that for the last quarter century that’s worked out very well. And it’s made possible things like Wolfram|Alpha—that probably nobody but me would ever have been crazy enough to put money into.

But now we’ve just got too many opportunities, and I’ve decided we’re just leaving too many great ideas—and great technology prototypes—on the table. So we’ve been learning how to spin off companies to develop these things. And actually, we have a whole scheme now for setting up an outside fund to invest in spinoffs that we’re doing.

I’ve been used to architecting technical systems. But architecting these kinds of business structures is also pretty interesting. Sort of trying to extend the machine I’ve built for turning ideas into reality. You know, I like to operate by having a whole portfolio of long-range ideas. Which I carry around with me for a long time. Like for Wolfram|Alpha it was more than 30 years. Gradually waiting for the circumstances and the right time to pursue them. And as I said earlier, I would probably be doing my physics project now, if technology opportunities hadn’t got in the way.

Though I have to say that the architecture of that project is tricky too. Because it’s not clear how to fit it into the world. I mean, lots of people, including myself, are incredibly curious about it. But for the physics community it’s a scary, paradigm-breaking, proposition. And it’s going to be an uphill story there.

And the issue for someone like me is: how much does the world really want something like the fundamental theory of physics done? It’s always great feedback for me doing projects where people really like the results. I don’t know about this one. I’ve been thinking about trying to find out by putting up a Kickstarter project or something for finding the fundamental theory of physics. It’s kind of funny how one goes from that level of practicality, to thinking about the structure of our whole universe. It’s fun, and to me it’s invigorating.

Well, there are lots more things it’d be fun to talk about. But let me stop here, and hope that you’ve enjoyed hearing a little about what’s going on these days in my small corner of the world.

What Should We Call the Language of Mathematica? http://blog.stephenwolfram.com/2013/02/what-should-we-call-the-language-of-mathematica/ http://blog.stephenwolfram.com/2013/02/what-should-we-call-the-language-of-mathematica/#comments Tue, 12 Feb 2013 19:35:13 +0000 Stephen Wolfram http://blog.internal.stephenwolfram.com/?p=4997

At the core of Mathematica is a language. A very powerful symbolic language. Built up with great care over a quarter of a century, and now incorporating a huge swath of knowledge and computation.

Millions and millions of lines of code have been written in this language, for all sorts of purposes. And today—particularly with new large-scale deployment options made possible through the web and the cloud—the language is poised to expand dramatically in usage.

But there’s a problem. And it’s a problem that—embarrassingly enough—I’ve been thinking about for more than 20 years. The problem is: what should the language be called?

Usually on this blog when I discuss our activities as a company, I talk about progress we’ve made, or problems we’ve solved. But today I’m going to make an exception, and talk instead about a problem we haven’t solved, but need to solve.

You might say, “How hard can it be to come up with one name?” In my experience, some names are easy to come up with. But others are really really hard. And this is an example of a really really hard one. (And perhaps the very length of this post communicates some of that difficulty…)

Let’s start by talking a little about names in general. There are names like, say, “quark”, that are in effect just random words. And that have to get all their meaning “externally”, by having it explicitly described. But there are others, like “website” for example, that already give a sense of their meaning just from the words or word roots they contain.

I’ve named all sorts of things in my time. Science concepts. Technologies. Products. Mathematica functions. I’ve used different approaches in different cases. In a few cases, I’ve used “random words” (and have long had a Mathematica-based generator of ones that sound good). But much more often I’ve tried to start with a familiar word or words that capture the essence of what I’m naming.

And after all, when we’re naming things related to our company, we already have a “random” base word: “wolfram”. For a while I was a bit squeamish about using it, being that it’s my last name. But in recent years it’s increasingly been the “lexical glue” that holds together the names of most of the things we’re doing.

And so, for example, we have products like Wolfram Finance Platform or Wolfram SystemModeler for professional markets that have that “random” wolfram word, but otherwise try to say more or less directly what they are and what they do.

Wolfram|Alpha is aimed at a much broader audience, and is a more complex case. Because in a short name we need to capture an almost completely new concept. We describe Wolfram|Alpha as a “computational knowledge engine”. But how do we shorten that to a name?

I spent a very long time thinking about it, and eventually decided that we couldn’t really communicate the concept in the name, and instead we should just communicate some of the sense and character of the system. And that was how we ended up with “alpha”: with “alphabet simplicity”, a connection to language, a technical character, a tentative software step, and the first, the top. And I’m happy to say the name has worked out very well.

OK. So what about the language that we’re trying to name? What should it be called?

Well, I’m pretty sure the word “language” should appear in the name, or at least be able to be tacked onto the name. Because if nothing else, what we’ve got really is quintessentially a language: a set of constructs that can be strung together to represent an infinite range of meanings.

Our language, though, works in a somewhat different way from ordinary human natural language—most importantly, because it’s completely executable: as soon as we express something in the language, that immediately gives us a specification for a unique sequence of computational actions that should be taken.

And in this respect, our language is like a typical computer language. But there is a crucial difference, both practical and philosophical. Typical computer languages (like C or Java or Python) have a small collection of simple built-in operations, and then concentrate on ways to organize those operations to build up programs. But in our language—built right into the language—is a huge amount of computation capability and knowledge.

In a typical computer language, there might be libraries that exist for different kinds of computations. But they’re not part of the language, and there’s no guarantee they fit together or can be built on. But in our language, the concept from the very beginning has been to build as much as possible in, to have a coherent structure in which as much is automated as possible. And in practice this means that our language has thousands of carefully designed functions and structures that automate a vast range of computations and deliver knowledge in immediately usable ways.

So while in some aspects of its basic mode of operation our language is similar to typical computer languages, its breadth and content is much more reminiscent of human languages—and in a sense it generalizes and deepens both concepts of language.

But OK, what should it be called? Well, I first started thinking about this outrageously long ago—actually in 1990. The software world was different then, and there were different ways we might have deployed the language back then. But despite having put quite a bit of software engineering work into it, we in the end never released it at all. And the single largest reason for that, embarrassingly enough, was that we just couldn’t come up with a name for it that we liked.

The “default name” that we used in the development process was the M Language, with M presumably short for Mathematica. But I never liked this. It seemed too much like C—a language which I’d used a lot, but whose character and capabilities were utterly different from our language. And particularly given the name “C”, M seemed to suggest a language somehow based on “math”. Yet even at that time—and to a vastly greater extent today—the language is about much much more than math. Yes, it can do math really well. But it’s broad and deep, and can do an immense range of other algorithmic and computational things—and also an increasing range of things related to built-in knowledge.

One might ask why Mathematica is named as it is. Well, that was a difficult naming process too. The original development name for Mathematica was Omega (and there are still filetype registrations for Mathematica based on that). Then there was a brief moment when it was renamed Polymath. Then Technique. And then there were a whole collection of possibilities, captured in this old list:

Possible names for Mathematica

But finally, at the urging of Steve Jobs, we settled on a name that we had originally rejected for being too long: Mathematica. My original conception of the system—as well as the foundations we built for it—went far beyond math. But math was the first really obvious application area—which is why, when Mathematica was first released, we described it as “a system for doing mathematics by computer”.

I’ve always liked Mathematica as a name. And back in 1988 when Mathematica was launched, it introduced in many ways a new type of name for a computer system, with a certain classical stylishness. In the years since, the name Mathematica has been widely imitated (think Modelica, for example). But it’s become clear that for Mathematica itself the name “Mathematica” is in some sense much too narrow—because it gives the idea that all that Mathematica does is math.

For our language we don’t want to have the same kind of problem. We want a name that communicates the generality and breadth of the language, and is not tied to one particular application area or type of usage. We want a name that makes sense when the language is used to do tiny pieces of interactive work, or to create giant enterprise applications, and to be used by seasoned software engineers, or by casual script tweakers, or by kids getting their first introduction to programming.

My personal analytics data show that I’ve been thinking about the problem of naming our language for 23 years—with episodic bursts of activity. As I mentioned, the original internal name was the M Language. More recently the default internal name has been the Wolfram Language.

Back in the early 1990s, one of my favorite ideas was Lingua—the Latin for language (as well, unfortunately, as tongue), analogous to the Latin character of Mathematica. But Lingua just sounded too weird, and the “gwa” was unpronounceable by too many people whose native languages don’t contain that sound. There was some brief enthusiasm for Express (think “expression”, as well as “express train”), but it died quickly.

There were early suggestions from the MathGroup Mathematica community, like Principia, Harmony, Unity and Tongue (in the latter case, a wag pointed out that bugs could be “slips of the tongue”). One summer intern who worked on the language in 1993 was Sergey Brin (later of Google fame); he suggested the name Thema: “the heart of mathematica” (“ma-thema-tica”). My own notes from that time record rather classical-sounding name ideas like Radix, Plurum, Practica and Programos. And in addition to thinking a lot about it myself, I asked linguists, classicists, marketers and poets, as well as a professional naming expert. But somehow every name either said too little or too much, was too “heavy” or too “light”, or for some reason or another just sounded silly. And after more than 20 years, we still don’t have a name we like.

But now, with all the new opportunities that exist for it, we just have to release the language—and to do that we have to solve the problem of its name. Which is why I’ve been thinking hard about it again.

So, what do we want to communicate about the language? First and foremost, as I explained above, it’s not like other languages. In a sense, it’s a new kind of language. It’s computational, but it’s also got intrinsic content: broad knowledge, structures and algorithms built in. It’s a language that’s highly scalable: good for programs ranging from the absolutely tiny to the huge. It’s a very general language, useful for a great many different kinds of domains. It’s a symbolic language with very clear principles, that can describe arbitrary structures as well as arbitrary data. It’s a fusion of many styles of programming, notably functional and pattern based. It’s interactive. And it prides itself on coherence of design, and tries to automate as much as possible of what it does.

At this point, we pretty much have to have “wolfram”—or at least some hint of it—in the name. But it would be nice if there was a good short name or nickname too. We want to communicate that the language is something that we as a company take responsibility for, but also that it will be very widely and often freely available—and not some kind of rare expensive thing.

All right. So an obvious first question is: how are languages typically named? Well, in Wolfram|Alpha, we have data on more than 16,000 human languages, current and former. And, for example, of the 100 with the most speakers, 13% end in -ese (think Japanese), 11% in -ic (think Arabic), 8% in -ian (think Russian), 5% in -ish (think English) and 3% in -ali (think Bengali). (If one looks at more languages, -ian becomes more common, and -an and -yi start to appear often too.) So should our language be called Wolframese, Wolframic, Wolframian, Wolframish or Wolframaic? Or perhaps Wolfese, Wolfic or Wolfish? Or Wolfian or Wolfan or Wolfatic, or the exotic Wolfari or Wolfala? Or a variant like Wolvese or Wolvic? There are some interesting words here, but to me they all sound a bit too much like obscure tribal languages.
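The suffix tally described above is easy to sketch. What follows is a minimal Python illustration, not the Wolfram|Alpha query behind the quoted percentages: the ten-name `SAMPLE_LANGUAGES` list is a hand-picked assumption, and the function names are hypothetical, so the shares it produces will not match the figures in the text.

```python
from collections import Counter

# Illustrative sample only (an assumption); NOT the Wolfram|Alpha data cited in the text.
SAMPLE_LANGUAGES = [
    "Japanese", "Chinese", "Vietnamese", "Arabic", "Russian",
    "Italian", "English", "Spanish", "Turkish", "Bengali",
]

# The endings named in the text.
SUFFIXES = ["ese", "ic", "ian", "ish", "ali"]


def suffix_shares(names, suffixes):
    """Return the fraction of names that end in each suffix."""
    counts = Counter()
    for name in names:
        for suffix in suffixes:
            if name.lower().endswith(suffix):
                counts[suffix] += 1
                break  # count each name at most once
    return {suffix: counts[suffix] / len(names) for suffix in suffixes}


def name_candidates(stem, suffixes):
    """Mechanically attach each suffix to a stem such as 'Wolfram' or 'Wolf'."""
    return [stem + suffix for suffix in suffixes]


if __name__ == "__main__":
    print(suffix_shares(SAMPLE_LANGUAGES, SUFFIXES))
    print(name_candidates("Wolfram", SUFFIXES))
    print(name_candidates("Wolf", SUFFIXES))
```

Purely mechanical concatenation gives forms like Wolframese and Wolfish directly; smoother variants such as Wolvese still need a human ear.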

OK. So what about computer languages? Well, there’s quite a diversity of names. In rough order of their introduction, some notable languages have been: Fortran, LISP, Algol, COBOL, APL, Simula, SNOBOL, BASIC, PL/1, Logo, Pascal, Forth, C, Smalltalk, Prolog, ML, Scheme, C++, Ada, Erlang, Perl, Haskell, Python, Ruby, Java, JavaScript, PHP, C#, .NET, Clojure, Go.

So how are these names constructed? Some—particularly earlier ones—are abbreviations, like Fortran (“Formula Translation”) and APL (“A Programming Language”). Others are names of people (like Pascal, Ada and Haskell). Others are named for companies, like Erlang (“Ericsson language”) and Go (“Google”). And still others are named in whimsical sequences, like BCPL to B to C (“sea”) to shell to Perl (“pearl”) to Ruby—or just plain whimsically, like Python (“Monty Python”). And these naming trends just continue if one looks at less well-known languages.

There are two important points here: first, it seems like computer languages can be called pretty much anything; unlike for most human languages (which are usually derived from place names), no special linguistic indicator seems to have emerged for computer languages. And second, the names of computer languages only rarely seem immediately to communicate the special features or aspirations of a given language. Sometimes they refer to computer-language history, but often they just seem like quite random words.

So for us, this suggests that perhaps we should just use our existing “random word”, and call our language the Wolfram Language, or WL—or conceivably in short form just Wolfram.

Or we could start from our “random word” wolfram, and go more whimsical. One possibility that has generated some enthusiasm internally is Wolf. Unfortunately wolves tend to have scary associations—but at least the name Wolf immediately suggests an obvious idea for an icon. And we even already have a possible form for it. Because when we introduced special-character fonts for Mathematica in the mid-1990s, we included a \[Wolf] character that was based on a little iconic drawing of mine. Dressing this up could give quite a striking language icon—that could even appear as a single character in a piece of text.

Wolf logo

There are variants, like WolframCode or WolframScript (or Wolfcode or Wolfscript), but these sound either too obscure or too lightweight. Then there’s the somewhat inelegant WolframLang, or its shorter forms WolfLang and WolfLan, which sound too much like Wolfgang. Then there are names like WolframX and WolfX, but it’s not clear the “X” adds much. Same with WolframQ or WolframL. There’s also WolframPlus (Wolfram+), WolframStar (Wolfram*) or WolframDot. Or Wolfram1 (when’s 2?), WolframCore (remember core memory?) or WolframBase. There are also Greek-letter suffixes, Wolfram|Alpha-style, like Wolfram Omega or Wolfram Lambda (“wolf”, “ram” and “lamb”: too many animals!). Or one could go shorter, like the W Language, but that sounds too much like C.

Of course, if one’s into “wolf whimsical”, there are all kinds of places to go. Wolf backwards is Flow, though that hardly seems appropriate for a language so far from simple flowcharts. And then there are names like Howl and Growl which I can’t take too seriously. If one goes into wolf folklore, there are plenty of words and names—but they seem more suited to the Middle Ages than the future.

One can go classical, but the Latin word for wolf is Lupus, which is also the name of a disease. And the Greek is Lukos [λυκος], which just seems like a random word to modern ears. With different case endings, one gets “differently styled” words. But none of the alternate cases or variants of these words (like Lupum, Lupa or Lukon) are too promising either—though at least I get to use my knowledge of Latin and Greek from when I was a kid to determine that. (And English forms like Lupine are amusing, but don’t make it.)

And in the direction of whimsical, there are also words like Tungsten, the common English name for element 74, whose symbol W stands for “wolfram”, and whose most common ore is wolframite. (And no, it was not discovered by an ancestor of mine.)

How about doing something more scientific? Like searching a space of all possible names, “NKS style”. For example, one can just try adding all possible single letters to “wolfram”, giving such unpromising names as Wolframa, Wolframz and Wolframé. With two letters, one gets things like Wolframos, Wolframix and WolframUp. One can try just appending all possible short words, to get things like WolframHow, WolframWay and WolframArt. And it’s a single line of code in our unnamed language (or Mathematica) to find the distribution of, say, what follows “am” in typical English words, yielding ideas like Wolframsu, Wolframity or the truly unfortunate Wolframble.
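To make the search concrete, here is a minimal Python sketch of two of the steps above: appending single letters, and tallying what follows “am” in English words. It is not the one-line version in our language that the paragraph mentions. The small `SAMPLE_WORDS` list is an assumption standing in for a real dictionary, and the function names are hypothetical.

```python
from collections import Counter
import string

# Hypothetical stand-in for a real English word list (an assumption).
SAMPLE_WORDS = [
    "amble", "ramble", "scramble", "gamble", "bramble",
    "family", "camera", "sample", "example", "champion",
    "diameter", "amity", "calamity", "dynamic", "stamp",
]


def single_letter_variants(base="Wolfram"):
    """Append every lowercase ASCII letter to the base word."""
    return [base + letter for letter in string.ascii_lowercase]


def continuations_after(fragment, words, length=3):
    """Count the strings of up to `length` characters that follow `fragment`."""
    counts = Counter()
    for word in words:
        start = word.find(fragment)
        while start != -1:
            end = start + len(fragment)
            tail = word[end:end + length]
            if tail:  # skip words that simply end in the fragment
                counts[tail] += 1
            start = word.find(fragment, start + 1)
    return counts


def candidate_names(base, words, top=3, length=3):
    """Extend the base with the most common continuations of its last two letters."""
    fragment = base[-2:].lower()
    common = continuations_after(fragment, words, length).most_common(top)
    return [base + tail for tail, _ in common]


if __name__ == "__main__":
    print(single_letter_variants()[:3])
    print(candidate_names("Wolfram", SAMPLE_WORDS))
```

With a full dictionary in place of the sample list, the same tally would surface many more continuations, most of them just as unpromising.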

But what about going in the other direction, and trying to find word forms that actually relate to what we’re trying to communicate about the language? A common way to make up new but suggestive forms is to go back to classical or Indo-European roots, and then try to build novel combinations or variants of these. And of course if we use an actual word form from a language, we at least know that it survived the natural selection of linguistic evolution.

There was a time in the past when one could have taken almost any Latin or Greek root, and expected it to be understood in educated company (as perhaps cyber- was when it was introduced from the Greek [κυβερνητης] for steersman or rudder). But in today’s world we pretty much have to limit ourselves to roots which are already at least somewhat familiar from existing words.

And in fact, in the relevant area of “semantic space”, “lexical space” is awfully crowded with rather common words. “Language”, for example, is lingua (“linguistics”) or sermo (“sermon”) in Latin, and glossa [γλωσσα] (“glossary”) or phone [φωνη] (“telephone”) in Greek. “Computation” is computatio in Latin, and arithmos [αριθμος] (“arithmetic”) or logismos [λογισμος] (“logistics”) in Greek. “Knowledge” is scientia (“science”) or cognitio (“cognition”) in Latin, and episteme [επιστημη] (“epistemology”), mathesis [μαθησις] (“mathematics”) or gnosis [γνωσις] (“diagnosis”) in Greek. “Reasoning” is ratio (“rational”) in Latin, and logos [λογος] (“-ology”) in Greek. And so on.

But what can we form from these kinds of roots? I haven’t been able to find anything terribly appealing. Typically the names are either ugly, or immediately suggest a meaning that is clearly wrong (like Wolframology or Wolfgloss).

One can look at other languages, and indeed if you just type “translate word” into Wolfram|Alpha (and then press More a few times), you can see translations for as many as a few hundred languages. But typically, beyond Indo-European languages, most of the forms that appear seem random to an English speaker. (Bizarrely, for example, the standard transliteration of the word for “wolf” in Chinese is “lang”.)

So where can we go from here? One possible direction is this. We’ve been trying to find a name by modifying or supplementing the word “wolfram”, and expecting that the word “language” will just be added as a suffix. But we need to remember that what we have is really a new kind of language—so perhaps it’s the word “language” that we should be thinking of modifying.

But how? There are various prefixes—usually Greek or Latin—that get added, for example, to scientific words to indicate some kind of extension or “beyondness”: ana-, alto-, dia-, epi-, exa-, exo-, holo-, hyper-, macro-, mega-, meta-, multi-, neo-, omni-, pan-, pleni-, praeter-, poly-, proto-, super-, uber-, ultra- and so on. And from these Wolfram hyperlanguage (WHL?) is perhaps the nicest possibility—though inevitably it sounds a little “hypey”, and is perhaps too reminiscent of hypertext and hyperlinks. (Layering on the Greek and Latin there’s Hyperlingua too.)

Wolfram superlanguage, Wolfram omnilanguage and Wolfram megalanguage all sound strangely “last century”. Wolfram ultralanguage and Wolfram uberlanguage both seem to be “trying a bit too hard”, though Wolfram Ultra (without the “language” at all) is a bit better. Wolfram exolanguage pleasantly shortens to Wolfex, but means the wrong thing (think “exoplanet”). Wolfram epilanguage (or just Wolfram Epi) does better in terms of meaning (think “epistemology”), but sounds very technical.

A rather frustrating case is Wolfram metalanguage (WML). It sounds nice, and in Greek even means more or less the correct thing. But “metalanguage” has already come to have a meaning in English (a language about another language)—and it’s not the meaning we want. Wolfram Meta might be better, but has the same problem.

So, OK, if we can’t make a prefix to the word “language” work, how about just adding a word or phrase between “wolfram” and “language”? Obviously the resulting name is going to be long. But perhaps it’ll have a nice abbreviation or shortening.

One immediate idea is Wolfram Knowledge Language (WKL), but this has the problem of sounding like it might just be a knowledge representation language, not a language that actually incorporates lots of knowledge (as well as algorithms, etc.). More accurate would be Wolfram Knowledge-Based Language (Wolfram KBL), and perhaps whatever the name, “knowledge-based language” could be used as a description.

Another direction is to insert the word “programming”. There’s of course Wolfram Programming Language (WPL). But perhaps better is to start by describing the new kind of programming that our language makes possible—which one might call “hyperprogramming”, or conceivably “metaprogramming”. (“Macroprogramming” might have been nice, but it’s squashed by the old concept of “macros”.) And so conceivably one could have Wolfram Hyperprogramming Language (WolframHL, WolframHPL or WHL) or Wolfram Metaprogramming Language (WML)—or at least one can use “hyperprogramming language” or “metaprogramming language” as description.

OK, so what’s the conclusion? I suppose the most obvious metaconclusion is that getting a name for our language is hard. And the maddening thing is that once we do get a name, my whole 20-year quest will be over incredibly quickly. Perhaps the final name will be one we’ve already considered, but just weren’t thinking about correctly (that’s basically what happened with the name Mathematica). Or perhaps some flash of inspiration will lead to a new great name (which is basically what happened with Wolfram|Alpha).

What should the name be? I’m hoping to get feedback on the ideas I’ve discussed here, as well as to get new suggestions. I must say that as I was writing this post, I was sort of hoping that in the end it would be a waste, and that by explaining the problem, I would solve it myself. But that hasn’t happened. Of course, I’ll be thrilled if someone else just outright suggests a great name that we can use. But as I’ve described, there are many constraints, and what I think is more realistic is for people to suggest frameworks and concepts from which we’ll get an idea that will lead to the final name.

I’m very proud of the language we’ve built over all these years. And I want to make sure that it has a name worthy of it. But once we have a name, we will finally be ready to finish the process of bringing the language to the world—and I’ll be very excited to see all the things that makes possible.

Remembering Richard Crandall (1947-2012) http://blog.stephenwolfram.com/2012/12/remembering-richard-crandall-1947-2012/ http://blog.stephenwolfram.com/2012/12/remembering-richard-crandall-1947-2012/#comments Sun, 30 Dec 2012 17:49:09 +0000 Stephen Wolfram http://blog.internal.stephenwolfram.com/?p=4866

Richard Crandall liked to call himself a “computationalist”. For though he was trained in physics (and served for many years as a physics professor at Reed College), computation was at the center of his life. He used it in physics, in engineering, in mathematics, in biology… and in technology. He was a pioneer in experimental mathematics, and was associated for many years with Apple and with Steve Jobs, and was proud of having invented “at least 5 algorithms used in the iPhone”. He was also an extremely early adopter of Mathematica, and a well-known figure in the Mathematica community. And when he died just before Christmas at the age of 64 he was hard at work on his latest, rather different, project: an “intellectual biography” of Steve Jobs that I had suggested he call “Scientist to Mr. Jobs”.

I first met Richard Crandall in 1987, when I was developing Mathematica, and he was Chief Scientist at Steve Jobs’s company NeXT. Richard had pioneered using Pascal on Macintoshes to teach scientific computing. But as soon as he saw Mathematica, he immediately adopted it, and for a quarter of a century used it to produce a wonderful range of discoveries and inventions.

He also contributed greatly to Mathematica and its usage. Indeed, even before Mathematica 1.0 in 1988, he insisted on visiting our company to contribute his expertise in numerical evaluation of special functions (his favorites were polylogarithms and zeta-like functions). And then, after the NeXT computer was released, he wrote what may have been the first-ever Mathematica-based app: a “supercalculator” named Gourmet that he said “eats other calculators for breakfast”. A couple of years later he wrote a book entitled Mathematica for the Sciences, that pioneered the use of Mathematica programs as a form of exposition.

Over the years, I interacted with Richard about a great many things. Usually it would start with a “call me” message. And I would get on the phone, never knowing what to expect. And Richard would be talking about his latest result in number theory. Or the latest Apple GPU. Or his models of flu epidemiology. Or the importance of running Mathematica on iOS. Or a new way to multiply very long integers. Or his latest achievements in image processing. Or a way to reconstruct fractal brain geometries.

Richard made contributions—from highly theoretical to highly practical—to a wide range of fields. He was always a little too original to be in the mainstream, with the result that there are few fields where he is widely known. In recent years, however, he was beginning to be recognized for his pioneering work in experimental mathematics, particularly as applied to primes and functions related to them. But he always knew that his work with the greatest immediate significance for the world at large was what he did for Apple behind closed doors.

Richard was born in Ann Arbor, Michigan, in 1947. His father was an actuary who became a sought-after expert witness on complex corporate insurance-fraud cases, and who, Richard told me, taught him “an absolute lack of fear of large numbers”. Richard grew up in Los Angeles, studying first at Caltech (where he encountered Richard Feynman), then at Reed College in Oregon. From there he went to MIT, where he studied the mathematical physics of high-energy particle scattering (Regge theory), and got his PhD in 1973. On the side he became an electronics entrepreneur, working particularly on security systems, and inventing (and patenting) a new type of operational amplifier and a new form of alarm system. After his PhD these efforts led him to New York City, where he designed a computerized fire safety and energy control system used in skyscrapers. As a hobby he worked on quantum physics and number theory—and after moving back to Oregon to work for an electronics company there, he was hired in 1978 at Reed College as a physics professor.

Steve Jobs had ended his short stay at Reed some years earlier, but through his effort to get Reed computerized, Richard got connected to him, and began a relationship that would last the rest of Steve’s life. I don’t know even a fraction of what Richard worked on for NeXT and Apple. For a while he was Apple’s Chief Cryptographer—notably inventing a fast form of elliptic curve encryption. And later on, he was also involved in compression, image processing, touch detection, and many other things.

Through most of this, Richard continued as a practicing physics professor. Early on, he won awards for creating minimal physics experiments (“measure the speed of light on a tabletop with $10 of equipment”). By the mid-1980s, he began to concentrate on using computers for teaching—and increasingly for research. One particular direction that Richard had pursued for many years was to use computers to study properties of numbers, and for example search for primes of particular types. And particularly once he had Mathematica, he got involved in more and more sophisticated number theoretical mathematics, particularly around primes, among other things co-authoring the (Mathematica-assisted) definitive textbook Prime Numbers: A Computational Perspective.

He invented faster methods for doing arithmetic with very long integers, that were instrumental, for example, in early crowdsourced prime discoveries, and that are in fact used today in modified form in Mathematica. And by doing experimental mathematics with Mathematica he discovered a wonderful collection of zeta-function-related results and identities worthy of Ramanujan. He was particularly proud of his algorithms for the fast evaluation of various zeta-like functions (notably polylogarithms and Madelung sums), and indeed earlier this year he sent me the culmination of his 20 years of work on the subject, in the form of a paper dedicated to Jerry Keiper, the founder of the numerics group at Wolfram Research, who died in an accident in 1995, but with whom Richard had worked at length.

Richard was always keen on presentation, albeit in his own somewhat unique way. Through his “industrial algorithm” company Perfectly Scientific, he published a new poster every time a Mersenne prime was discovered. The price of the poster increased with the number of digits, and for convenience his company also sold a watchmaker’s loupe to allow people to read the digits on the posters.

Richard always had a certain charming personal ponderousness to him, his conversation peppered with phrases like “let me commend to your attention”. And indeed as I write this, I find a classic example of over-the-top Richardness in the opening to his Mathematica for the Sciences: “It has been said that the evolution of humankind took a substantial, discontinuous swerve about the time when our forepaws left the ground. Once in the air, our hands were free for ‘other things’. Toolmaking. …”, and eventually, as he explains after his “admittedly conjectural rambling”, computers and Mathematica.

Richard regularly visited Steve Jobs and his family, with his last visit being just a few days before Steve died. He was always deeply impressed by Steve, and frustrated that he felt people didn’t understand the strength of Steve’s intellect. He was disappointed by Walter Isaacson’s highly successful biography of Steve, and had embarked on writing his own “intellectual biography” of Steve. He had years of interesting personal anecdotes about Steve and his interactions with him, but he was adamant that his book should tell “the real story”, about ideas and technology, and should at all costs avoid what he at least considered “gossip”. At first, he was going to try to take himself completely out of the story, but I think I successfully convinced him that with his unique role as “scientist to Steve Jobs”, he had no choice but to be in the story, and indeed to tell his own story along the way.

Richard was in many ways a rather solitary individual. But he always liked talking about his now-15-year-old daughter, whom he would invariably refer to rather formally as “Ellen Crandall”. He had theories about many things, including child rearing, and considered one of his signature quotes to be “the most efficient way to raise an atheist kid is to have a priest for a father”. And indeed as part of the last exchange I had with him just a few weeks before he died, he marveled that his daughter from a “pure blank, white start” … “has suddenly taken up filling giant white poster boards with minutely detailed drawing”.

While his overall health was not perfect, Richard was in many ways still in the prime of his life. He had ambitious plans for the future, in mathematics, in science and in technology, not to mention in writing his biography of Steve Jobs. But a few weeks ago, he suddenly fell ill, and within ten days he died. A life cut off far too soon. But a unique life in which much was invented that would likely never have existed otherwise.

I shall miss Richard’s flow of wonderfully eccentric ideas, as well as the mysterious “call me” messages, and, of late, the practically monthly encouragement speech about the importance of having Mathematica on the iPhone. (I’m so sorry, Richard, that we didn’t get it done in time.)

Richard was always imagining what might be possible, then in his unique way doggedly trying to build towards it. Around the world at any time of day or night millions of people are using their iPhones. And unknown to them, somewhere inside, algorithms are running that one can imagine represent a little piece of the soul of that interesting and creative human being named Richard Crandall, now cast in the form of code.

]]>
http://blog.stephenwolfram.com/2012/12/remembering-richard-crandall-1947-2012/feed/ 12
Welcome, National Museum of Mathematics http://blog.stephenwolfram.com/2012/12/welcome-national-museum-of-mathematics/ http://blog.stephenwolfram.com/2012/12/welcome-national-museum-of-mathematics/#comments Mon, 17 Dec 2012 18:57:54 +0000 Stephen Wolfram http://blog.internal.stephenwolfram.com/?p=4464 I was just in New York City for the grand opening of the National Museum of Mathematics. Yes, there is now a National Museum of Mathematics, right in downtown Manhattan. And it’s really good—a unique and wonderful place. Which I’m pleased to say I’ve been able to help in various ways in bringing into existence over the past 3 years.

Museum of Mathematics logo

Of all companies, ours is probably the one that has been most involved in bringing math to the world (Mathematica, Wolfram|Alpha, Wolfram Demonstrations Project, MathWorld, Computer-Based Math, Wolfram Foundation, …). And for a long time I’ve thought how nice it would be if there were a substantial, physical, “museum of mathematics” somewhere. But until recently I’d sort of assumed that if such a thing were going to exist, I’d have to be the one to make it happen.

A little more than 3 years ago, though, my older daughter picked out of my mail a curious folding geometrical object—which turned out to be an invitation to an event about the creation of a museum of mathematics. At first, it wasn’t clear what kind of museum this was supposed to be. But as soon as we arrived at the event, it started to be much clearer: this was “math as physical experience”. With the centerpiece of the event, for example, being a square-wheeled tricycle that one could ride on a cycloidal “road”—a mathematical possibility that, as it happens, was the subject of some early Mathematica demonstrations.

Behind the museum was a small group led by Glen Whitney, an energetic math PhD and recently “retired” hedge fund quant, who I had never met, but with whom I turned out to have quite a few connections in common. Soon I was involved as a trustee of the fledgling museum, and four other people from our company were on its advisory council. Needless to say, there were many questions and issues. But a prominent early one was what the logo for the museum should be.

What iconic image would best capture mathematics and its character, and be lively enough to connect with the youthful target group for the museum? Our company has had a strong graphic design tradition, and is of course deeply involved in math. So it was natural for me to suggest that perhaps we could make an early contribution to the museum by trying to develop a logo.

I posed the problem at our company, and quickly got a response from Chris Carlson in our User Interfaces group. Chris has been at our company for more than 18 years, and has long had an interest in computable forms (and in fact his PhD in architecture from Carnegie Mellon was on this subject). So perhaps it was not surprising that Chris suggested not just a logo, but a computable “meta-logo”—an infinite family of possible logos, generated by simple rules.

His idea was ultimately quite simple: pick a mathematical symbol and apply a sequence of symmetry transformations to it. But as is so often the case with simple rules, the results can be elaborate and striking:

Museum of Mathematics logo array
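To give a rough feel for the idea, here is a minimal Python sketch. It is not the Mathematica code behind the actual logos. It assumes the symmetry transformations are the rotations and reflections of a dihedral group, and it uses a small triangle as a stand-in for a font glyph; the names `rotate`, `reflect_x` and `dihedral_copies` are made up for this illustration.

```python
import math

def rotate(point, angle):
    """Rotate a 2D point about the origin by `angle` radians."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (x * c - y * s, x * s + y * c)

def reflect_x(point):
    """Mirror a 2D point in the x axis."""
    x, y = point
    return (x, -y)

def dihedral_copies(shape, n):
    """Return the 2n images of `shape` under n rotations and n reflected rotations."""
    copies = []
    for k in range(n):
        angle = 2 * math.pi * k / n
        copies.append([rotate(p, angle) for p in shape])
        copies.append([rotate(reflect_x(p), angle) for p in shape])
    return copies

# Stand-in "symbol": a small asymmetric triangle placed off-centre (hypothetical data).
symbol = [(1.0, 0.2), (2.0, 0.2), (1.0, 0.8)]
logo = dihedral_copies(symbol, 6)
print(len(logo))  # 12
```

Changing the symbol, the number of rotations, or the family of transformations gives a different member of the family, which is the sense in which one simple rule can stand for an unlimited number of logos.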

His original proposal, developed with members of our Design group (including our longtime art director Jeremy Davis) addressed some of the new possibilities—and issues—presented by the meta-logo. The museum didn’t have to have a single official logo; it could have logos created all the time by its visitors. And the logos didn’t have to be static; they could animate too. Different projects or events could have their own special logos. And the logo itself could serve as a simple puzzle (“what symbol is that?”). And so on. Of course there were issues too. Like would a meta-logo, with its infinite variations, still be recognizable?

But after some discussion at the museum, the decision was made that, yes, the meta-logo was going to work. And indeed its very variability and structure seemed to capture remarkably well some of the most important features of mathematics. Gradually all sorts of lovely possibilities emerged for how a mass-customizable meta-logo could be deployed—from personalized business cards for the staff, to “logo IDs” for visitors, to being able to decorate almost anything with multiple variations of the logo.

And almost three years later there I was a few days ago walking down 26th St. from Fifth Avenue in New York, and I look up to see:

MoMath flags with logos

I reach the museum, and of course the (temporary) main entrance is full of logos:

MoMath temporary entrance with logos

It’s a few hours before the opening gala event. And of course inside it’s a hive of activity. The first thing I see is the logo generator station for visitors. But oops… there are no logos there yet, just lots of Mathematica code on the screen:

Last-minute Mathematica code for MoMath logos

A basic interactive system for generating logos is a tiny amount of Mathematica code. But getting everything exactly right turns out to be quite tricky, and to involve some quite sophisticated mathematics. It’s easy to apply symmetry operations to regions representing font characters. The issue is rendering them. The most obvious thing is to layer them in the order they’re generated. But if the regions overlap, then doing that can break symmetry. And the only way to guarantee to preserve symmetry is to do some intricate computational geometry, breaking up the regions just right.
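The layering problem can be seen in a toy model. The Python sketch below is an illustration under simplifying assumptions, not the museum's code: it supposes n rotated copies arranged in a ring, each overlapping only its successor, and it records which copy ends up on top in each overlap when the copies are painted in the order they were generated. The function names are hypothetical.

```python
def top_copy_painter(i, j):
    """Painter's order: the copy drawn later (larger index) ends up on top."""
    return max(i, j)

def overlap_winners(n):
    """For n rotated copies where copy k overlaps copy (k + 1) % n, report for
    each overlap whether copy k (0) or its successor (1) is on top."""
    winners = []
    for k in range(n):
        nxt = (k + 1) % n
        top = top_copy_painter(k, nxt)
        winners.append(0 if top == k else 1)
    return winners

def is_rotationally_symmetric(winners):
    """The picture keeps its n-fold symmetry only if every overlap looks the same."""
    return len(set(winners)) == 1

winners = overlap_winners(4)
print(winners)                             # [1, 1, 1, 0]
print(is_rotationally_symmetric(winners))  # False
```

In this model every overlap has the later copy on top except the one where the last copy meets the first, so the rendered picture is no longer invariant under rotation. No drawing order fixes this for a ring of overlaps, which suggests why the regions have to be broken up rather than simply stacked.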

And with hours to spare, the final touches were finished, and the logo generator was ready. And so was the rest of the museum.

The MoMath logo generator in situ at the opening gala

It’s an impressive two floors of exhibits. Full of inventive ideas about how to make math tangible to all, from middle school (or below) on. I’m quite a connoisseur of both mathematics and museums. But what impresses me most about MoMath is how different its exhibits are from what I’ve seen before. Each one has a unique idea. That I’ve typically never seen before. But that elegantly illustrates some fundamental mathematical principle. And does it in a way that only a physical exhibit can. Like nesting in the “human tree”:

Human tree

Often there are computers involved in the exhibits. But they’re controlled and used in very physical ways, which makes it clear why this has to be a physical museum, not some kind of virtual entity on the web. To me, it’s also very cool that most of the exhibits can be understood at several levels. There’s the basic math that’s being illustrated. And then there’s the “how does that actually work?”, and the “how does one build something to do this?”. Like the non-holographic 3D images (that can’t be captured in a photograph):

3D non-holograms

It wasn’t cheap to make the museum. And to me it’s impressive how much money was raised for the cause of math. To be able to put the museum in the center of Manhattan, and to create such beautifully made exhibits, with elegantly machined pieces—that are nevertheless designed to be tough enough to withstand the onslaught of young users.

When I was a toddler (long, long ago!) I was fortunate enough to live near the Science Museum in London. Which at the time was just opening its first hands-on gallery. That I insisted on being taken to visit almost every day for a whole summer.

I don’t know what my early experiences with those exhibits imprinted on me. But it’s remarkable to see—half a century later—how the advance of technology and all the creativity that’s been put into MoMath has led to such vastly richer exhibits.

In the coming days and weeks, lots of kids—and adults—will get their first taste of the first museum of mathematics ever to exist in the US. No doubt MoMath will become quite a destination for students, tourists—and, I suspect, events of all sorts. People will learn a lot—and have fun. And remember for a long time their ride on the square-wheeled tricycle—much as I still remember a ride I had on a Galapagos tortoise at the London Zoo nearly 50 years ago. (At the MoMath gala, my younger daughter happened to be captured moments after this picture by a photographer from The Wall Street Journal…)

Square-wheeled tricycle

There could have been a Museum of Mathematics in the US long long ago. But there wasn’t. And it’s only now, through the remarkable efforts of a small number of people, that MoMath exists. It’s going to be a great institution, and it’s one more step in the effort that we’ve been so involved with to bring the heritage and promise of mathematics to the 21st century.

]]>
http://blog.stephenwolfram.com/2012/12/welcome-national-museum-of-mathematics/feed/ 1
“What Are You Going to Do Next?” Introducing the Predictive Interface http://blog.stephenwolfram.com/2012/12/what-are-you-going-to-do-next-introducing-the-predictive-interface/ http://blog.stephenwolfram.com/2012/12/what-are-you-going-to-do-next-introducing-the-predictive-interface/#comments Fri, 07 Dec 2012 00:29:17 +0000 Stephen Wolfram http://blog.internal.stephenwolfram.com/?p=4600 There aren’t very many qualitatively different types of computer interfaces in use in the world today. But with the release of Mathematica 9 I think we have the first truly practical example of a new kind—the computed predictive interface.

If one’s dealing with a system that has a small fixed set of possible actions or inputs, one can typically build an interface out of elements like menus or forms. But if one has a more open-ended system, one typically has to define some kind of language. Usually this will be basically textual (as it is for the most part for Mathematica); sometimes it may be visual (as for Wolfram SystemModeler).

The challenge is then to make the language broad and powerful, while keeping it as easy as possible for humans to write and understand. And as a committed computer language designer for the past 30+ years, I have devoted an immense amount of effort to this.

But with Wolfram|Alpha I had a different idea. Don’t try to define the best possible artificial computer language, that humans then have to learn. Instead, use natural language, just like humans do among themselves, and then have the computer do its best to understand this. At first, it was not at all clear that such an approach was going to work. But one of the big things we’ve learned from Wolfram|Alpha is that with enough effort (and enough built-in knowledge), it can. And indeed two years ago in Mathematica 8 we used what we’d done with Wolfram|Alpha to add to Mathematica the capability of taking free-form natural language input, and automatically generating from it precise Mathematica language code.

But let’s say one’s just got some output from Mathematica. What should one do next? One may know the appropriate Mathematica language input to give. Or at least one may be able to express what one wants to do in free-form natural language. But in both cases there’s a kind of creative act required: starting from nothing one has to figure out what to say.

So can we make this easier? The answer, I think, is yes. And that’s what we’ve now done with the Predictive Interface in Mathematica 9.

The concept of the Predictive Interface is to take what you’ve done so far, and from it predict a few possibilities for what you’re likely to want to do next.

Predictive interface

In Mathematica 9, the way this works is that when you have an output in focus, a Suggestions Bar appears below it, with a list of buttons for possible actions to take next. What buttons appear is determined by a computation that is run in real time when you request the Suggestions Bar.

There are two kinds of inputs to this computation. The first is what actions are common given the structure of the output, and the earlier history of your session. And the second is what actions would lead to useful results.

Over time, the Predictive Interface will be able to learn from the actions people take in it. But to get started, we’ve used several large sources. The first is the collection of carefully tuned heuristic algorithms that Wolfram|Alpha uses to determine what pods to output given a particular input. The second is the billions of actual queries that we’ve seen in the Wolfram|Alpha query stream. The third is published Mathematica code—for example the huge number of examples in the Wolfram Documentation Center and the Wolfram Demonstrations Project. And the fourth is our very large internal sample of Mathematica code from the source code of both Mathematica itself and Wolfram|Alpha.

From the query stream and heuristic algorithms in Wolfram|Alpha we learn many a priori probabilities for actions to be taken on different kinds of objects, as well as a certain amount about sequences of actions. From samples of Mathematica code we learn what kinds of actions occur together—for example what the probabilities are for different functions to appear applied to different kinds of arguments.

In a first approximation, what happens in the Predictive Interface is that possible outputs are categorized into hundreds of different general types, each represented by a symbolic expression that encodes certain properties and attributes. Then, informed by our various sources, large numbers of probabilistic rules are set up to determine conceivable actions that might be suggested for particular combinations of types. Which actions actually make sense to suggest will then depend on which ones lead to useful results. So typically what the Predictive Interface does is to try out candidate actions—or tests based on them—and then use heuristics to assess the utility of the results they would give.

The Predictive Interface will often internally find quite a large number of possible actions to suggest. These then have to be ranked so that the best choices can be presented first. And the way this is done is through a fairly elaborate system of scores, that combine specific heuristic algorithms with probabilistic information and assessments of the utility of results.
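The overall flow just described (classify the output, look up candidate actions, try them, score the results, rank) can be sketched in a few lines of Python. This is only an illustrative toy, not how Mathematica implements it: the single "integer" type, the prior probabilities, the utility heuristics, and the choice to multiply prior by utility are all assumptions invented for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

def is_prime(n: int) -> bool:
    """Trial-division primality test (fine for small illustrative inputs)."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def factorize(n: int) -> List[int]:
    """Prime factors of n with multiplicity, by trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

@dataclass
class Candidate:
    label: str                       # text shown on the button
    prior: float                     # made-up a priori probability of the action
    run: Callable[[Any], Any]        # the computation the button would perform
    utility: Callable[[Any], float]  # made-up heuristic: how useful is the result?

# Hypothetical rule table: output type -> candidate actions.
RULES: Dict[str, List[Candidate]] = {
    "integer": [
        Candidate("prime factors", 0.5, factorize,
                  lambda r: 1.0 if len(r) > 1 else 0.1),
        Candidate("test for primality", 0.3, is_prime,
                  lambda r: 1.0 if r else 0.2),
        Candidate("binary digits", 0.2, lambda n: bin(n)[2:],
                  lambda r: 0.5),
    ],
}

def classify(output: Any) -> str:
    """Assign the output to a general type (only one type in this toy)."""
    if isinstance(output, int) and not isinstance(output, bool):
        return "integer"
    return "unknown"

def suggest(output: Any, top: int = 2) -> List[str]:
    """Run every candidate, score it as prior * utility, return the best labels."""
    scored = []
    for c in RULES.get(classify(output), []):
        result = c.run(output)
        scored.append((c.prior * c.utility(result), c.label))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [label for _, label in scored[:top]]

print(suggest(12))  # ['prime factors', 'binary digits']
print(suggest(13))  # ['test for primality', 'binary digits']
```

The point of the toy is that the ranking depends on actually running the candidate computations: the same rule table gives different top suggestions for a composite and for a prime.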

But after all this sophisticated computation, what the user ultimately sees is just a simple list of buttons for possible actions to take.

Let’s look at an example:

Predictive Interface example

The Predictive Interface is suggesting a few things to do with this integer. And they seem pretty sensible. But actually they’re even better than we might expect. Because the Predictive Interface is using information it gets by actually doing the computations it’s suggesting. Let’s try another integer:

Predictive Interface example

Again the suggestions seem pretty sensible. But they’re different. And the reason is that the Predictive Interface knows that this particular integer is a prime. So it can tell for example that a primality test will be particularly worth doing on it. Sometimes it can be quite uncanny how prescient the Predictive Interface manages to be. But what I’ve found is that when one gets used to this, it’s surprisingly useful not only in its primary purpose of guiding one through what one can do, but also in giving implicit hints about features of what one’s seeing.

It’s interesting to compare the concept of the Predictive Interface with the way Wolfram|Alpha generates reports. In Wolfram|Alpha, given a particular input—like an integer, for example—Wolfram|Alpha will generate a sequence of “pods” which it displays down the page, giving results of computations that its heuristic algorithms determine are interesting. In the Predictive Interface there’s also selection of computations going on, but now not for the purpose of actually displaying the results of these computations. Instead, after all sorts of internal work, all that’s actually displayed is a sequence of buttons, presented for the human user to pick what to do next.

Of course, one can combine these approaches, and when the Predictive Interface determines that it’s going to be useful, it “lights up” the Wolfram|Alpha logo in the Suggestions Bar. If you then press this, you’ll get the whole Wolfram|Alpha result—from which you can pick an individual pod to generate a specific new Mathematica input.

Predictive Interface example

The Predictive Interface makes suggestions that lead to many different kinds of actions. But what we’ve found is that it’s best to present every suggestion in a more or less uniform way—by having one or more plain English words on what is effectively a button. Sometimes pressing the button may just apply some individual Mathematica function, like Solve (for equation solving), or FactorInteger (for integer factorization). And in this case, the next input makes it immediately obvious what was done:

Predictive Interface example

Sometimes the Predictive Interface may end up composing pieces of code on the fly, which again can be applied with individual buttons:

Predictive Interface example

Often these pieces of code are simple enough that it is not distracting to display them completely. But sometimes it’s better to “hide” them by default, with an “opener” provided if one wants to actually look at the code:

Predictive Interface example

And sometimes the code is actually an invocation of Wolfram|Alpha:

Predictive Interface example

The Predictive Interface ranks suggestions, and immediately displays the top few. If you press “more…”, you’ll get a panel, which will typically show many more suggestions, now arranged in categories:

Predictive Interface example

Sometimes there’s just one form of a particular suggestion. But quite often there are alternatives or options. When there are a fairly small number of easy-to-understand choices, the Predictive Interface just lists them by name in a pull-down:

Predictive Interface example

Or, if what’s going on is more complicated, it shows previews of what the result would be for different choices:

Predictive interface example

In many cases, there isn’t just a list of possible choices; instead one may have to “fill in a form” to say what one wants:

Predictive Interface example

In general, the Predictive Interface can present an essentially arbitrary user interface. Internally, it’s just generating a symbolic Mathematica expression, which can then make use of anything in the Mathematica interface language—for example to give a custom-created “wizard-like” panel:

Predictive Interface example

In simple cases, the Predictive Interface just gives a simple row of suggestion buttons. But it’s fairly common for there to be very different kinds of suggestions depending on what a particular output is supposed to mean. And in such cases we’ve found that instead of mixing suggestions based on different meanings, it’s much better just to pick a default meaning, then offer other meanings as explicit alternatives—analogous to the way Wolfram|Alpha’s “assuming” mechanism works:

Predictive Interface example

When one uses the Predictive Interface, it’s pretty common to end up with a whole string of inputs and outputs. The way the Predictive Interface is set up, each new input contains the previous output. Here’s a sequence, with the explicit Predictive Interface Suggestions Bars for each line added in:

Predictive Interface example

If you want to repeat the sequence of computations here, you typically want to “roll them up” into a single line—which is what the spiral icon in the Suggestions Bar does:

Predictive Interface example
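One way to picture the "roll up" is as function composition: if each step applies a function to the previous output, the whole chain collapses into one nested expression. The Python sketch below builds such an expression as a string. It is only an illustration of the idea, assuming each step is a single-argument function; the helper `roll_up` is hypothetical and the real mechanism in Mathematica may work quite differently.

```python
def roll_up(start, steps):
    """Nest single-argument function names around a starting expression,
    in the order the steps were applied."""
    expr = start
    for f in steps:
        expr = f"{f}[{expr}]"
    return expr

# Three separate inputs (Range[10], then Total, then FactorInteger) as one line.
print(roll_up("Range[10]", ["Total", "FactorInteger"]))
# FactorInteger[Total[Range[10]]]
```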

Right now, the Predictive Interface concentrates on making suggestions for single outputs—though it often makes use of context and previous history. In the future, we’re planning to do a lot more on multi-input suggestions, and various forms of refactoring, as well as on full Mathematica programs.

When we started building the Predictive Interface it was not at all clear it was going to end up working out. Previous examples of “suggestions” interfaces had generally not been well received (think for example Microsoft’s “Clippy” intelligent paperclip—which I have to say I always found charming, if not especially useful). But I suspected the problem was not the general idea of providing contextual suggestions, but the way they were being generated and presented.

And the key idea that’s led to our Predictive Interface for Mathematica 9 is to put real computation into figuring out what to suggest. There’s never going to be a complete, precise, algorithm to determine what a person is going to want to do next (though people are often much more predictable than one might expect). And instead, what one needs to do is to have a whole collection of heuristic algorithms that come as close as possible to being able to make a computer “do what I mean”.

I have to say that in past years I was always skeptical about heuristics. Because I thought people would find them very frustrating. When one has a precise language and system like Mathematica, the essence of good design is to make everything completely consistent, so people can readily predict what the system will do in a particular case. But heuristics go in the opposite direction, trying to cover common cases well, but not worrying at all about overall consistency.

But here’s a key point I’ve learned from creating Wolfram|Alpha: if heuristics are done well, with serious computation and knowledge behind them, they actually do work, and people like them very much. Wolfram|Alpha is absolutely full of heuristics: for understanding free-form linguistic input, for deciding what output to generate, etc. And—as it is so often with computer systems—so long as everything “just works”, people never think about the heuristics, never try to deconstruct them, and never notice or get confused by the lack of ultimate consistency.

The Predictive Interface is technically rather different from Wolfram|Alpha. But the notion of having a whole web of heuristics based on serious computation and knowledge is the same. In practice with the Predictive Interface it’s also important that it presents itself with the appropriate level of emphasis: it’s there, and easy to get at if one wants it, but understated enough that it doesn’t visually get in the way if one doesn’t need it.

And I have to say that in my own use of Mathematica it seems to work well. If I know what to do after I get a particular output, I just type the next input without getting distracted by the Predictive Interface. But as soon as I pause even for a moment, I tend to glance at the Predictive Interface. And often it jogs my thinking, and gives me exactly what I want to do next.

I’ve now had a chance to watch a few Mathematica beginners use the Predictive Interface. It seems to work really nicely. It both gives them the satisfying experience of making progress more quickly than ever before, and it gently exposes them to a wider range of capabilities in Mathematica that they might not otherwise discover for a long time.

Mathematica is a big enough system that I don’t think anyone (even myself) can immediately remember everything it does. So that means that particularly if one is using a somewhat unfamiliar part of the system, the Predictive Interface is highly useful. And even in areas of the system that I know well, it’s just faster to press a Predictive Interface button than to type an input.

The Predictive Interface consists of a kind of infinite web of suggestions. And it’s rather fun to try starting with something simple, and seeing just how far one can go simply by following Predictive Interface suggestions. It’s remarkable how quickly one can end up doing some pretty sophisticated things.

One of the big things I’ve always tried to do in Mathematica (and in Wolfram|Alpha and so on) is to automate as much as possible: to make it so that whatever the computer can automatically handle it does automatically handle. In the past, we’ve done a lot on automating the selection of algorithms, automating the way output and visualizations are presented, or interfaces are built, and, in Wolfram|Alpha, automating the way input is interpreted. With the Predictive Interface, we’re attacking yet another “automation frontier”: automating how one chooses what to do next.

From the point of view of interface design, I am finding the Predictive Interface extremely interesting. Previous interfaces—like menus or forms or computer languages or free-form linguistics—make certain kinds of things easy. The Predictive Interface makes a new set of things easy. And as I work on future design for Mathematica—as well as other products of ours—I can already see my thinking changing as a result of the Predictive Interface.

In language design one typically wants to have a minimal number of names and concepts for people to remember. In free-form linguistic input one wants to support whatever people will immediately think of—and it’s barely worth covering things that people will never “think to ask”. But with the Predictive Interface one has a new mechanism. Once people are going in a particular general direction, one can present to them suggestions that let them discover things that they’d never think were there, or were even possible.

This is particularly important for a system like Mathematica that is deep and broad. It’s too easy for people to spend years using just a small part of the system, and never get the benefit of its wider capabilities. But now the Predictive Interface constantly leads people to other parts of the system that they can immediately use and become familiar with.

The Predictive Interface in Mathematica 9 is just the beginning. In time, it will become possible to do still much richer and more sophisticated predictions. Making use not just of information from the current session, but all sorts of history, data and analytics about the user. The direction is in a sense maximal automation. To have the user define a goal, but then have the computer figure out as much as possible about how to achieve that goal. Sometimes the user will be able to specify the goal using natural language or computer language. But often they will not have formulated it completely enough to do so. And instead they’ll just be able to state a general direction to go. At which point the Predictive Interface can take over, making suggestions and letting the user guide the computer in the direction they want to go.

Today the Predictive Interface is available in Mathematica 9. We already have other products in the works that make use of it. And in the future I expect to see sophisticated computed predictive interfaces show up all over the place—defining in a sense a new paradigm for interacting with computers.

]]>
http://blog.stephenwolfram.com/2012/12/what-are-you-going-to-do-next-introducing-the-predictive-interface/feed/ 3
Mathematica 9 Is Released Today!
http://blog.stephenwolfram.com/2012/11/mathematica-9-is-released-today/
Wed, 28 Nov 2012 16:21:03 +0000, Stephen Wolfram

I’m excited to be able to announce that today we’re releasing Mathematica 9—and it’s big! A whole array of new ideas and new application areas… and major advances along a great many algorithmic frontiers.

Next year Mathematica will be 25 years old (and all sorts of festivities are planned!). And in that quarter century we’ve just been building and building. The core principles that we began with have been validated over and over again. And with them we’ve created a larger and larger stack of technology, that allows us to do more and more, and reach further and further.

From the beginning, our goal has been an ambitious one: to cover and automate every area of computational and algorithmic work. Having built the foundations of the Mathematica language, we started a quarter century ago attacking core areas of mathematics. And over the years since then, we have been expanding outward at an ever-increasing pace, conquering one area after another.

As with Wolfram|Alpha, we’ll never be finished. But as the years go by, the scope of what we’ve done becomes more and more immense. And with Mathematica 9 today we are taking yet another huge step.

New in Mathematica 9

So what’s new in Mathematica 9? Lots and lots of important things. An amazing range—something for almost everyone. And actually just the very size of it already represents an important challenge. Because as Mathematica grows bigger and bigger, it becomes more and more difficult for one to grasp everything that’s in it.

But in Mathematica 9 there’s an important new idea. We call it the Wolfram Predictive Interface™, and what it does is to automate the process of suggesting at every step what to do next. At the most basic level, when you type, there’s a context-sensitive Input Assistant that knows about all the functions and options of Mathematica. But more important, when you get output, there’s a Suggestions Bar that’s generated, with a series of buttons for top actions you might want to take next. Sometimes these buttons apply individual Mathematica functions, and sometimes they do more complex things, bringing up interactive panels if need be.

Predictive Interface

Experienced software users may be skeptical. They may be thinking: “I’ve seen these kinds of heuristic let-me-help-you systems before; typically they just get in the way”. Well, I’m happy to say that I think with the Predictive Interface in Mathematica 9 we’ve made a breakthrough. Of course, it helps that we have all that experience—and all those query logs—from Wolfram|Alpha. But the result is that even for an experienced Mathematica user like myself, the Predictive Interface really does well, and it makes my use of Mathematica substantially better. And for people new to Mathematica, I think it’ll be a game-changer. Never again will they be left in a “so what do I do next?” state; they’ll always be given suggestions about how to move forward, as well as automatically be shown what’s possible in Mathematica.

There are all sorts of other interface enhancements in Mathematica 9 too. But what about the computational capabilities of Mathematica? What’s new there in Mathematica 9?

Here’s an immediate “crowd pleaser”: social network analysis. There’s now a function in Mathematica that lets you instantly get data from the APIs of popular social networks—that you can then immediately visualize and analyze using all the capabilities of Mathematica, including many new graph theoretical and statistical functions especially added in Mathematica 9 for social networks. A couple of months ago we introduced Wolfram|Alpha Personal Analytics for Facebook—which has become a highly popular service. Now we’re introducing general, programmable, social network analysis in Mathematica—which promises to be very valuable not only for professional data scientists, but also for math and computer science students who want to jump immediately to the frontiers of one of the hottest current areas.

Social network analysis

Over the last 25 years there have been a few requests for new features in Mathematica that just keep on coming in over and over again. One of those is for Mathematica to support units—like centimeters and gigabytes. Early on we created an add-on package that did just fine for simple cases, and that many people have been happily using for ages. But try as we might, we never figured out how to do a true “Mathematica-class” job of supporting units—and so we never built them into the core of the system.

Well, one feature of Wolfram|Alpha is that it includes by far the most complete handling of units ever. I used to think units were comparatively simple. But I now know they’re messy and complicated, not least because to be at all usable in practice, people have to be able to refer to them with all sorts of weird short notations. Now here’s the great thing we realized in Mathematica 9: we can just use Wolfram|Alpha-based free-form linguistics to let people enter units however they feel like. But then we can turn the units into precise symbolic expressions—that we can then support throughout Mathematica, not just in simple arithmetic, but in calculus, visualization, data analysis, and much more. And so, after all these years, in Mathematica 9 we finally have units built in—not as some kind of hack, but in a really clean, streamlined and long-term way.
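To make the idea of "units as precise symbolic expressions" a little more concrete, here is a minimal sketch in Python. It is not Mathematica code and says nothing about how Mathematica actually implements units; the `Quantity` class, the tiny `UNITS` table and the two-slot (length, time) dimension tuple are all hypothetical, chosen only to show why carrying dimensions along with a number lets arithmetic stay consistent.

```python
from dataclasses import dataclass
from fractions import Fraction

# Hypothetical unit table: name -> (factor to the SI base unit, (length, time) exponents).
UNITS = {
    "m": (Fraction(1), (1, 0)),
    "cm": (Fraction(1, 100), (1, 0)),
    "km": (Fraction(1000), (1, 0)),
    "s": (Fraction(1), (0, 1)),
    "h": (Fraction(3600), (0, 1)),
}


@dataclass(frozen=True)
class Quantity:
    magnitude: Fraction  # value expressed in SI base units
    dims: tuple          # (length exponent, time exponent)

    @staticmethod
    def of(value, unit):
        """Build a quantity from a number and a unit name in UNITS."""
        factor, dims = UNITS[unit]
        return Quantity(Fraction(value) * factor, dims)

    def __add__(self, other):
        # Addition only makes sense for quantities of the same dimension.
        if self.dims != other.dims:
            raise ValueError("incompatible units")
        return Quantity(self.magnitude + other.magnitude, self.dims)

    def __truediv__(self, other):
        # Division subtracts dimension exponents (length / time -> speed).
        dims = tuple(a - b for a, b in zip(self.dims, other.dims))
        return Quantity(self.magnitude / other.magnitude, dims)

    def to(self, unit):
        """Return the numeric value of this quantity in the given unit."""
        factor, dims = UNITS[unit]
        if dims != self.dims:
            raise ValueError("incompatible units")
        return self.magnitude / factor
```

With this sketch, `Quantity.of(150, "cm") + Quantity.of(2, "m")` can be read back with `.to("m")`, while adding a length to a time raises an error instead of silently producing a meaningless number.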

Units in Mathematica

Mathematica has a vast web of interconnected computational capabilities. And in each new version, we build on what is already there to reach and cover still more areas. And to me a remarkable aspect of this is just how much needs to already exist in Mathematica to be able to reach and successfully cover these new areas. Sometimes there may be some fairly easy way to implement simple examples of a new capability. But to get really good thorough coverage needs the whole stack that we’ve spent 25 years building up.

And whenever we tackle a new area in Mathematica, we try to do it to full depth and breadth. Usually this means we have to figure out all sorts of new ideas and algorithms. And often a whole new way of looking at the area. That typically dramatically clarifies the area, and both makes it accessible to a much wider class of people, and allows it to be successfully used as a long-term building block in the further development of Mathematica.

In Mathematica 9, we’ve covered quite a collection of new areas.

One example, from a “classic” Mathematica area, concerns differential equations. In Wolfram SystemModeler, we handle all sorts of systems described by differential equations. In Mathematica 9, we’ve now built in capabilities for solving differential equations with discontinuities (e.g. for a ball bouncing on a surface), hybrid discrete/continuous equations, and parametric and eigenvalue differential equations. Back in the 1970s I remember writing a Fortran program to solve an eigenvalue version of the Schrödinger equation; now finally in 2012 it’s little more than a one-liner in Mathematica 9.
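For readers who have not met differential equations with discontinuities, the bouncing ball can be sketched in a few lines of Python. This is only an illustration of the general technique (integrate smoothly, detect an event, apply a jump, continue); it is not the algorithm Mathematica uses, and the function name `bouncing_ball` and its parameters (`restitution`, `dt`, `t_end`) are assumptions of the sketch.

```python
import math


def bouncing_ball(height, restitution=0.9, g=9.81, dt=1e-3, t_end=5.0):
    """Integrate y'' = -g, reversing and damping the velocity whenever y hits 0.

    Returns the list of impact times found before t_end.
    """
    t, y, v = 0.0, height, 0.0
    bounce_times = []
    while t < t_end:
        # Free-flight update over one step (exact for constant acceleration).
        y_new = y + v * dt - 0.5 * g * dt * dt
        v_new = v - g * dt
        if y_new < 0.0:
            # Event: solve y + v*s - g*s^2/2 = 0 for the impact time s within the step.
            s = (v + math.sqrt(v * v + 2.0 * g * y)) / g
            v_impact = v - g * s
            # Discontinuity: the velocity jumps at the moment of impact.
            v_after = -restitution * v_impact
            # Finish the remainder of the step in free flight from the ground.
            rest = dt - s
            y_new = v_after * rest - 0.5 * g * rest * rest
            v_new = v_after - g * rest
            bounce_times.append(t + s)
            if y_new < 0.0:
                # The ball has effectively come to rest on the surface.
                y_new, v_new = 0.0, 0.0
        t, y, v = t + dt, y_new, v_new
    return bounce_times
```

Dropping the ball from 4.905 m with the default settings gives a first impact at very nearly t = 1 s, since the fall time is sqrt(2h/g). The point of the sketch is the structure: the smooth equation and the jump rule are stated separately, and the integrator has to locate the moment where one hands over to the other.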

There is a whole collection of new capabilities in Mathematica 9 around statistical systems and statistical modeling. We’ve been gradually building towards these for several versions. In Mathematica 7 we introduced all sorts of statistical distributions, and all sorts of methods for fitting data. Then in Mathematica 8 we introduced a very clean symbolic formalism for handling probabilities and probability distributions—as well as filling out a wide range of statistical analysis capabilities. And in Mathematica 9 we’re now extending from probability distributions to a full range of random processes.

We’re covering time series, Markov chains, queues, reliability, survival, stochastic differential equations, random graphs, and more. It’s all very clean and very unified. In each case, there’s a symbolic way to represent a model. And then it’s quite beautiful how everything fits together. Say you’ve got some data. In Mathematica 8, you could fit it to some statistical distribution. Now in Mathematica 9 you can use the exact same functions to fit it to a time series model or a differential equation, or whatever.
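The design idea here, one symbolic representation per model and one fitting function for all of them, can be sketched outside Mathematica too. The Python below is a hedged illustration only: `NormalModel`, `AR1Model` and `estimate` are hypothetical names, the estimators are deliberately simple (sample moments and least squares), and none of this reflects the functions or algorithms Mathematica actually provides.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass
class NormalModel:
    """A distribution model: independent samples with a mean and a variance."""
    mean: float = 0.0
    variance: float = 1.0

    def fitted(self, data):
        m = fmean(data)
        var = fmean((x - m) ** 2 for x in data)  # population variance
        return NormalModel(m, var)


@dataclass
class AR1Model:
    """A time series model: x[t] = phi * x[t-1] + noise."""
    phi: float = 0.0

    def fitted(self, data):
        # Least-squares estimate of phi from consecutive pairs.
        num = sum(a * b for a, b in zip(data[1:], data[:-1]))
        den = sum(b * b for b in data[:-1])
        return AR1Model(num / den)


def estimate(model, data):
    """Fit any model object to data through the same entry point."""
    return model.fitted(list(data))
```

The same call, `estimate(NormalModel(), data)` or `estimate(AR1Model(), data)`, works for either kind of model; adding a new model type means adding a new class, not a new fitting function.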

25 years ago, Mathematica didn’t put much emphasis on statistics. But over the years, we’ve steadily been building out extremely strong and often beyond-state-of-the-art statistical capabilities. And at this point, we’ve covered with great depth and robustness what’s needed for the vast majority of areas where statistical methods are used.

Of course, Mathematica is not an island. We’ve put a lot of emphasis on making sure that it can import and export an immense range of formats. And also that it can communicate with many external programs and systems. So in Mathematica 9, one thing we’ve added is built-in integration with the R statistics language. It’s pretty cool: I think it’s fair to say that it’s now easier to use R from within Mathematica than directly in the R system itself. So if there’s a package that’s been written in R for some specialized statistical task, you can immediately and seamlessly use it inside Mathematica.

Over the last few years, Mathematica has become a major player in the emerging field of data science. And to support that, we’ve been steadily expanding the types of data for which Mathematica has strong built-in support. We first added image processing in Mathematica 7, then enhanced it in Mathematica 8. Now in Mathematica 9, we’re making Mathematica’s already very complete image processing system do still more. There’s a convenient interactive Image Assistant, there’s feature tracking and face detection, there’s support for HDR and color profiles and there’s the ability to do out-of-core processing on very large images.

But probably the single most striking feature of image processing in Mathematica 9 is that it can handle not only 2D images, but also 3D volumetric ones. And in typical Mathematica style, most functions that work on 2D images now just seamlessly work on 3D images too. Beginning back in the early 1980s I used to try to use 3D volume rendering to visualize 3D cellular automata. And finally now what has always been elaborate and painful has become a Mathematica 9 one-liner that executes instantaneously. I’ve also over the years tried quite a few times to manipulate 3D DICOM-style data from MRIs and the like—and it’s always been quite challenging. But now in Mathematica 9 it’s become incredibly easy—and one’s immediately able not just to do visualization, but also to use all sorts of sophisticated analysis methods.

3D cellular automata

Another area that’s come of age in Mathematica 9 is signal processing—now with hundreds of functions for efficiently analyzing and filtering signals. It’s pretty impressive how smoothly it fits into Mathematica. Whether operating on a standard time-domain signal, or audio, or 2D or 3D images—and whether one’s doing visualization, applying continuous or discrete calculus, or doing high-precision or exact computations. And because Mathematica is a symbolic language, it’s immediately possible to represent filters for signal processing in a symbolic way—so they can be designed and manipulated with the full power of the Mathematica system, as well, for example, as being shared with Wolfram SystemModeler.

Mathematica 8 began the introduction of built-in control theory capabilities in Mathematica. Mathematica 9 fills this out, adding PID controllers, time delays and full support for descriptor systems. And of course, all this is fully integrated with signal processing, visualization and everything else in Mathematica, and Wolfram SystemModeler.

The list of new frontiers and new capabilities in Mathematica 9 is long. Two more that have been long in the making are built-in support for vector analysis and for symbolic tensors. In both cases, there have been deep challenges both in algorithms and in design. Indeed, for example, for more than 20 years I’ve been thinking about how to conveniently fit traditional vector analysis notation—with its often implicit reference to coordinate systems—into Mathematica. And it’s interesting that what it’s taken to solve this problem is in a sense a deeper understanding of just what coordinate systems really mean. But the result is that it’s now easy in Mathematica 9 to deal with symbolic vector expressions and with vector calculus in all standard named coordinate systems.

In working with symbolic tensors, I myself have a long history. Indeed, the first large-scale package that I ever wrote for symbolic computation—in 1978—dealt with symbolic tensors. But it’s taken until now for us to understand at the level needed for Mathematica just how really to work with them. A key problem is how to canonicalize products of tensors with contracted indices. I always suspected that there might be really powerful algorithms for this. And indeed there are, based on graph theory. And they’re now fully implemented in Mathematica 9. With the result that computations in general relativity that even recently seemed like major research projects now happen in mere seconds.

Looking down the complete list of what’s new in Mathematica 9, it’s pretty impressive. In addition to major new areas, there are countless extensions and enhancements throughout the system. Whether it’s responding to suggestions from math teachers to conveniently support real-valued cube roots. Or to allow giant data arrays limited only by 64-bit addressing. Or to support programmatic access to password-protected websites, synchronously or asynchronously. Or to support business dates in a streamlined way. Or to add elegantly designed interactive gauges to use in dashboards or for controls. Or to have a systematic framework for adding legends to any kind of plot. Or to make it possible to do enterprise-level distribution of CDF documents that make use of live data.
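The cube root request is a nice small example of why such things matter. In many systems a fractional power of a negative number gives the principal complex root, which is mathematically sensible but not what a student plotting a cube root expects. The Python sketch below shows the distinction; `real_cube_root` is a hypothetical helper written for this illustration, not a Mathematica function.

```python
import math


def real_cube_root(x: float) -> float:
    """Return the real cube root of x, keeping the sign of x."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


# The principal value of a negative number raised to 1/3 is complex in Python 3.
principal = (-8) ** (1 / 3)
```

Here `real_cube_root(-8.0)` is (to rounding) -2, while `principal` is a complex number with a nonzero imaginary part.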

It’s been two years since we released Mathematica 8, and to me it’s very impressive how many new things have been finished in just those two years. Back with Mathematica 6, we built a framework that allowed us to start growing Mathematica at a much faster rate. And it’s interesting to see the effect of that in the plot below of the growth of the number of functions built into Mathematica. Today as we add higher and higher levels of automation to Mathematica, we are increasingly dealing with “superfunctions” that each cover larger and larger areas of functionality. But even so, we see that the dramatic growth in total number of functions has continued with Mathematica 9.

Mathematica functions over time

I’ve spent nearly half my life so far overseeing the design of Mathematica. And so for me it’s particularly interesting to see two major developments in Mathematica 9 that relate to design. The first is the increasing use of Wolfram|Alpha ideas and functionality in Mathematica, for example in the handling of units. And the second is the arrival of the Predictive Interface, which provides a new level of automation and discoverability in using Mathematica. Already in Mathematica 9 these are important directions. But I expect in the future they’ll give us the flexibility and the new ways of thinking that we need to unlock a whole sequence of spectacular possibilities.

One might think that after so many years and so many versions, it wouldn’t feel much different to be using Mathematica 9 compared to Mathematica 8. But it does. From the very beginning, whether it’s the new updated design, or the Predictive Interface, it’s very clear that Mathematica 9 is something fundamentally sleeker and stronger than anything that’s come before. And for me, what has happened with Mathematica 9—as with previous new versions of Mathematica—is that I quickly start being able to do more things, more quickly. Old programs that took many lines I can now replace with single, more general and easier-to-understand, Mathematica 9 functions. And things that I never got around to doing before I now do, because they’ve become so easy in Mathematica 9.

Being the kind of software company CEO that I am, I’m always using the latest test versions of all our products while they’re under development. But especially with Mathematica, it’s only close to the end that one can typically see the full vision of a new version emerge from all the threads of development that it involves. And so it has been with Mathematica 9. But what we have now is exciting, ground-breaking, and a great pleasure to use. And I am proud to be able to announce that as of today it is available to everyone.

]]>
http://blog.stephenwolfram.com/2012/11/mathematica-9-is-released-today/feed/ 9
Latest Perspectives on the Computation Age
http://blog.stephenwolfram.com/2012/10/latest-perspectives-on-the-computation-age/
Thu, 11 Oct 2012 16:24:13 +0000, Stephen Wolfram

This is an edited version of a short talk I gave last weekend at The Nantucket Project—a fascinatingly eclectic event held on an island that I happen to have been visiting every summer for the past dozen years.

Lots of things have happened in the world in the past 100 years. But I think in the long view of history one thing will end up standing out among all others: this has been the century when the idea of computation emerged.

We’ve seen all sorts of things “get computerized” over the last few decades—and by now a large fraction of people in the world have at least some form of computational device. But I think we’re still only at the very beginning of absorbing the implications of the idea of computation. And what I want to do here today is to talk about some things that are happening, and that I think are going to happen, as a result of the idea of computation.

Word cloud

I’ve been working on this stuff since I was a teenager—which is now about a third of a century. And I think I’ve been steadily understanding more and more.

Our computational knowledge engine, Wolfram|Alpha, which was launched on the web about three years ago now, is one of the latest fruits of this understanding.

What it does—many millions of times every day—is to take questions people ask, and try to use the knowledge that it has inside it to compute answers to them. If you’ve used Siri on the iPhone, or a bunch of other services, you’ll probably have seen Wolfram|Alpha answers.

Here’s the basic idea of Wolfram|Alpha: we want to take all the systematic knowledge that’s been accumulated in our civilization, and make it computable. So that if there’s a question that can in principle be answered on the basis of that knowledge, we can just compute the answer.

So how do we do that? Well, one starts off from data about the world. And we’ve been steadily accumulating data from primary sources about thousands of different kinds of things. Cities. Foods. Movies. Spacecraft. Species. Companies. Diseases. Sports. Chemicals. Whatever.

We’ve got a lot of data now, with more flowing in every second. And actually by now our collection of raw structured data is about as big in bytes as the text of all the human-written pages that one can find on the web.

But even all that data on its own isn’t enough. Because most questions require one not just to have the data, but to compute some specific answer from it. You want to know when some satellite is going to fly overhead? Well, we may have recent data about the satellite’s orbit. But we still have to do a bunch of physics and so on to figure out when it’s going to be over us.

And so in Wolfram|Alpha a big thing we’ve done is to try to take all those models and methods and algorithms—from science, and technology, and other areas—and just implement them all.

You might be thinking: there’s got to be some trick, some master algorithm, that you’re using. Well, no, there isn’t. It’s a huge project. And it involves experts from a zillion different areas. Giving us their knowledge, so we can make it computable.

Actually, even having the knowledge and being able to compute from it isn’t enough. Because we still have to solve the problem of how we communicate with the system. And when one’s dealing with, sort of, any kind of knowledge, any question, there’s only one practical way: we have to use human natural language.

So another big problem we’ve had to solve is how to take those ugly messy utterances that humans make, and turn them into something computable. Actually, I thought this might be just plain impossible. But it turned out that particularly as a result of some science I did—that I’ll talk about a bit later—we made some big breakthroughs.

The result is that when you type to Wolfram|Alpha, or talk to Siri… if you say something that humans could understand, there’s a really good chance we’ll be able to understand it too.

So we can communicate to our system with language. How does it communicate back to us?

What we want to do is to take whatever you ask, and generate the best report we can about it. Don’t just give you one answer, but contextualize that. Organize the information in a way that’s optimized for humans to understand.

All of this, as I say, happens many millions of times every day. And I’m really excited about what it means for the democratization of knowledge.

It used to be that if you wanted to answer all these kinds of questions, you’d have to go find an expert, and have them figure out the answer. But now in a sense we’ve automated a lot of those experts. So that means anyone, anywhere, anytime, can immediately get answers.

People are used to being able to search for things on the web. But this is something quite different.

We’re not finding web pages where what you’ve asked for was already written down by someone. We’re taking your specific question, and computing for you a specific answer. And in fact most of the questions we see every day never appear on the web; they’re completely new and fresh.

When you search the web, it’s like asking a librarian a question, and having them hand you a pile of books—well, in this case, links to web pages—to read. What we’re trying to do is to give you an automated research analyst, who’ll instantly generate a complete research report about your question, complete with custom-created charts and graphs and so on.

OK. So this all seems like a pretty huge project. What’s made it possible?

Actually, I’d been thinking about basically this project since I was a kid. But at the beginning I had no idea what decade—or even century—it would become possible. And actually it was a big piece of basic science I did—that I’ll talk about a bit later—that convinced me that actually it might be possible.

I’ve been involved in some big technology projects over the years. But Wolfram|Alpha as a practical matter is by far the most complicated, with the largest number of different kinds of moving parts inside it.

And actually, it builds on something I’ve been working on for 25 years. Which is a system called Mathematica. Which is a computer language. That I guess one could say is these days by far the most algorithmically sophisticated computer language that exists.

Mathematica is the language that Wolfram|Alpha is implemented in. And the point is that in Mathematica, doing something like solving a differential equation is just one command. That’s how we manage to implement all those methods and models and so on. We’re starting from this very sophisticated language we already have.

Wolfram|Alpha is still about 15 million lines of code—in the Mathematica language—though.

Wolfram|Alpha is about knowing everything it can about the world—with all its messiness—and letting humans interact with it quickly using natural language.

Mathematica is about creating a precise computer language, that has built in to it, in a very coherent way, all the kinds of algorithmic functionality that we know about.

Over the past 25 years, Mathematica has become very widely used. There’s broad use on essentially all large university campuses, and all sophisticated corporate R&D operations around the world. And lots and lots of things have been discovered and invented with Mathematica.

In a sense, I see Mathematica as the implementation language for the idea of computation. Wolfram|Alpha is where that idea intersects with the sort of collective accumulation of knowledge that’s happened in our civilization.

So where does one go from here? Lots and lots of places.

First, Wolfram|Alpha is using public knowledge. What happens when we use internal knowledge of some kind?

Over the last couple of years there’ve been lots of custom versions of Wolfram|Alpha created, that take internal knowledge of some company or other organization, combine it with public knowledge, and compute answers.

What’s emerging is something pretty interesting. There’s lots of talk of “big data”. But what about “big answers”?

What one needs to do is to set things up so one makes all that data computable. So it’s possible to just ask a question in natural language, and automatically get answers, and automatically generate the most useful possible reports.

So far this is something that we’ve done as a custom thing for a limited number of large organizations. But we know how to generalize this, and in a sense provide a general way to automatically get analytics done, from data. We actually introduced the first step toward this a few months ago in Wolfram|Alpha.

You can not only ask Wolfram|Alpha questions, but you can also upload data to it. You can upload all kinds of data. Like a spreadsheet, or even an image. And then Wolfram|Alpha’s goal is to automatically tell you something interesting about that data. Or, if you ask a specific question, be able to give a report about the answer.

Right now what we have works rather nicely for decently small lumps of data. We’re gradually increasing to huge quantities of data.

Here’s a kind of fun example that I did. It relates to personal analytics—or what’s sometimes called “quantified self”. I’ve been a data-oriented guy for a long time. So I’ve been collecting all kinds of data about myself. Every email for 23 years. Every keystroke for a dozen years. Every walking step for a bunch of years. And so on. I’ve found these things pretty useful in sort of keeping my life organized and productive.

Earlier this year I thought I’d take all this data I’ve accumulated, and feed it to Mathematica and Wolfram|Alpha. And pretty soon I’m getting all these plots and analyses and so on. Sort of my automated personal historian, showing me all these events and trends in my life and so on.

I have to say that I thought there must be lots of people who were collecting all sorts of data about themselves. But when I wrote about this stuff earlier this year—and it got picked up in all the usual media places—I was pretty surprised to realize that nobody came out and said “I’ve got more data than you”.

So, a little bit embarrassingly, I think I have to conclude that for now, I might be the data-nerdiest—or maybe the most computable—human around. Though we’re working to change that.

Just a few weeks ago, for example, we released Wolfram|Alpha Personal Analytics for Facebook. So people can connect their Facebook accounts to Wolfram|Alpha and immediately get all this analytics about themselves and their friends and so on.

And so far a few million have done this. It’s kind of fun to see people’s lives made computable like this. There are all these different friend networks for example. Each one tells a story. And tells one some psychology too.

So we’re talking about making things computable. What can we really make computable? What about a city?

There’s all this data in a city, collected by all sorts of municipal agencies. There’s permits, there’s reports, there’s GIS data. And so on. And if you’re a sophisticated city, you’ve got lots of this data on the web somehow. But it’s in raw form. Where really only an expert can use it.

Well, what if we were able to feed it through the Wolfram|Alpha technology stack? If there’s a question that could be answered about the city on the basis of the data that exists, it’d be able to be answered.

What electric power line gets closest to such-and-such a building? What’s the voltage drop between some point and the nearest substation? Imagine just being able to ask those questions to a mobile phone, and having it automatically compute the answers.

Well, there are a lot of details about actually setting this up in the world, but we now have the technology to do it. To make a computable city. Or, for that matter, to make a computable country. Where all the government data that’s being generated can be set up so we can automatically answer questions from it. Either for the citizens of the country, or for the people who run it. It’ll be interesting what the first computable country is… but from a technology—and workflow—point of view, we’re now ready to do this.

So what else can be computable like this?

Here’s another example: large engineering systems. These days there’s a language called Modelica—yes, it was a Mathematica-inspired name—that’s an open standard for people who create large engineering systems. There used just to be spec sheets for engineering components. Now there are effectively little algorithms that describe each component.

We acquired a company recently that had been using Mathematica for many years to do large-scale systems engineering. And we just a couple of months ago released an integrated systems modeling product, which allows one to take, say, 50,000 components in an airplane, represent them in computable form, and then automatically compute how they’ll behave in some particular situation.

We haven’t yet assembled it all, but we now have the technology stack to do the following: you’ve got some big engineering system in front of you, and maybe it’s sent sensor data back to our servers. Now you talk to your mobile phone and you say “If I push it to 300 rpm, what will happen?” We understand the query, then run a model of the system, then tell you the answer; say “That wouldn’t be a very good idea” (preferably not a HAL voice or something).

So that’s about the operation of engineering systems. What about design?

Well, with everything being computable, it’s easy to run optimization algorithms on designs, or even to search a large space of possible designs. And increasingly what we’ll be doing is stating some design goal, then having the computer automatically figure out how to achieve that goal. It’ll know for example what components are available, with what specifications, and at what cost, and it’ll then figure out how to assemble what’s needed to achieve the design goal. Actually, there’s a much more everyday example of this that will come soon.

In Wolfram|Alpha, for example, we’ve been working with retailers to get data on consumer products. And the future will be to just ask in natural language for some product that meets some requirements, and then automatically to figure out what that is.

Or, more interestingly, to say: “I’m doing such-and-such a piece of home improvement. Figure out how much of what products I need to get to do that.” And the result should be an automatically generated bill of materials, and then the instructions about what to do with them.

There are just all these areas ripe to be made computable. Here’s another one: law.

Actually, back 300 years Leibniz was thinking about that when he first invented some precursors to the modern idea of computation. He imagined having some encoding of human laws, set up so one can ask a machine in effect to automatically figure out: “Is this legal or not?”

Well, today, there are some kinds of contracts that have already been “made computable”. Like contracts for derivative financial instruments and so on. But what if we could make the tax code computable? Or a mortgage computable? Or, more extremely, a patent?

Actually, some contracts like service-level agreements are beginning to become computable, of necessity, because in effect they have to be interpreted by computers in real time. And of course once things become computable, they can be applied in a much more democratized way, without all the experts needed, and so on.

Here’s a completely different area that I think is going to become computable, and actually that we’re planning to spin off a company to do. And that’s medical diagnosis.

When I look at the medical world, and the healthcare system, diagnosis is really a central problem. I mean, if you don’t have the right diagnosis, all the wonderful and expensive treatment in the world isn’t going to help, and actually it’s probably going to hurt.

Well, diagnosis is really hard for humans. I actually think it’s going to turn out not to be so hard for computers. It’s a lot easier for them to know more, and to not get confused about probabilities, and so on.

Of course, it’s a big project. You start off by encoding all those specialized decision trees and so on. But then you go on and grind up the medical literature and figure out what’s in there. Then you get lots of actual patient records—probably at first realistically from outside the US—and start doing analysis on those. There’s a lot about the getting of histories, and the delivery of diagnoses, that actually becomes a lot easier in an automated system.

But, you know, there’s actually something that’s inevitably going to disrupt existing medical diagnosis, and that’s sensor-based medicine. These days there are only a handful of consumer-level medical sensors, like thermometers and things. Very soon there are going to be lots. And—a little bit like the personal analytics I was talking about earlier—people are going to be routinely recording all sorts of medical information about themselves.

And the question is: how is this going to be used in diagnosis? Because when you come in with 10 megabytes of time series, that’s not just a “were you sweating a lot” question. That’s something that will have to be analyzed with an algorithm.

Actually, I think the whole medical process is going to end up being very algorithmic. Because you’ll be analyzing symptoms with algorithms, but then the treatment will also be specified by some algorithm. In fact, even though right now diagnosis is really important, I think in the end that’s sort of going to go away. One will be going straight from the observed data to the algorithm for treatment.

It’s sort of like in finance. You observe some behavior of some stock in the market. And, yes, there are technical traders who’ll start telling you that’s a “head and shoulders pattern” or something. But mostly—at least in the quant world—you’ll just be using an algorithm to decide what to do, and one doesn’t care about the sort of “descriptive diagnosis” of what’s happening.

And in medicine, I expect that the whole computation idea will extend all the way down to the molecules we use as drugs. Today drugs tend to just be molecules that do one particular thing. In the future, I think we’re going to have molecules that each act like little computers, looking around at cells they encounter, and effectively running algorithms to decide how to act.

You know, there are some very basic questions about medical diagnosis. I like to think of the analogy of software diagnosis. You have a computer. It’s running an operating system. Things happen to it. Eventually all kinds of crud builds up, it starts running slower, and eventually it crashes; it dies.

And of course you can restart it—from the same program, effectively the same “genetic material” giving you the next generation. That’s all pretty analogous to biology. But it’s much less advanced. I mean, we have all those codes for medical conditions; there’s nothing analogous for software.

But in software, unlike in biology, in principle we can monitor every single bit of what’s happening. And we’ve just started doing some experiments trying to understand in a general way, sort of what’s optimal to monitor to do “software diagnosis”, or more interestingly, what do you have to fix on an ongoing basis to effectively “extend the lifespan” of the running program.

OK. So I’m going through lots of areas and talking about how computation affects them. Here’s another completely different one: journalism.

We’re in the interesting position now with Wolfram|Alpha of having by quite a large margin more data feeds—whose meaning we understand—coming into our system than anyone has ever had before. In other words, we sort of have this giant sensory system connected to lots of things in the world.

Now the question is: what’s interesting that’s going on in the world? We see all this data coming in. What’s unexpected? What’s newsworthy? In a sense what we want to create is computational journalism: automatically finding each hour what the “most interesting things happening in the world” are.
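One simple way to make “unexpected” concrete is to score the latest value of each data feed by how far it sits from that feed’s own recent history. The sketch below is only an illustration of that idea, not a description of how Wolfram|Alpha works; the function names, the feed names, and the choice of a z-score are all assumptions made for the example.

```python
from statistics import mean, pstdev

def surprise(history, latest):
    """Distance of `latest` from the mean of `history`, in standard deviations."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is maximally surprising.
        return 0.0 if latest == mu else float("inf")
    return abs(latest - mu) / sigma

def most_newsworthy(feeds, top=3):
    """Rank feeds by how surprising their newest value is.

    `feeds` maps a (hypothetical) feed name to a list of numbers,
    oldest first; the last number is treated as the newest reading.
    """
    scored = [(surprise(values[:-1], values[-1]), name)
              for name, values in feeds.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top]]

# Made-up example data, purely for illustration.
example_feeds = {
    "river_level": [10, 10, 11, 9, 10, 10],
    "traffic":     [5, 5, 5, 5, 5, 50],
    "temperature": [1, 2, 3, 4, 5, 6],
}
print(most_newsworthy(example_feeds, top=2))
```

A real system would need something much richer than this (seasonality, context, what people care about), but the basic shape is the same: turn each feed into a number that says how surprising it is right now, then sort.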

You know, in addition to algorithms to just monitor what’s going on, there are also algorithms to predict consequences. It might be solving the equations for the propagation of a tsunami across an ocean. I think we can pretty much do those. Or it might be—and this I’m much less sure will work—figuring out some economic or supply chain model, in kind of the same way that we figure out behavior of large engineering systems. So that we don’t just see raw news, but also compute consequences.

So that’s computation in journalism. What about computation in books? How can those be computational?

Well, actually it’s rather easy. In fact, we started a company a couple of years ago that’s effectively making computational books. It’s called Touch Press. Our first book was an interactive tour of the chemical elements, that conveniently came out the day the iPad shipped, and that showed up in lots and lots of iPad ads. I’m actually surprised there aren’t lots more entrants here. But Touch Press has become by far the most successful publisher of highly interactive ebooks—in effect computational books. And, yes, underneath it’s using pieces of our technology stack, like Mathematica and Wolfram|Alpha. And producing books on all sorts of things. The most recent two being Egyptian pyramids, and Shakespeare’s sonnets.

And from Mathematica we’ve actually built what we call CDF, the Computable Document Format, which lets one systematically define computable documents: documents where there’s interaction and computation going on right in the document.

And from CDF—in addition to all sorts of corporate reports—we’re beginning to see a generation of textbooks that can interactively illustrate their points, perhaps pulling in real-time data too, and that can interactively let students try things out, or test themselves.

There’s actually a lot more to say about how computation relates to the future of education, both in form and content. We’ve been working to define a computer-based math curriculum that reflects what’s actually worth teaching in the 21st century, now that, for example, a large fraction of US students routinely use Wolfram|Alpha to do their homework every day. It’s actually exciting how much more we can teach now that knowledge and computation have been so much more democratized.

We’re also realizing—particularly with Mathematica—how much it’s possible to teach about computation, and programming, even at very early stages in education.

Some other time, perhaps, I can talk about the thinking we’ve done about how to change the structure of education—in certain ways to de-institutionalize it.

Before I finish I’d like to make sure I say just a tiny bit about what computation means not just about all the practical things I’ve been discussing, but also at a sort of deeper intellectual level. Like in science. Some of you may know that I’ve spent a great many years—in a sense as a user of Mathematica—doing basic science.

My main idea was to depart from essentially 300 years of scientific tradition, that had to do with using mathematical equations to describe the natural world, and instead sort of generalize them to arbitrary computer programs.

Well, my big discovery was that in the universe of possible computer programs, it takes only a very simple program to get incredibly complex behavior.
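To make that concrete, here is a minimal sketch of an elementary cellular automaton: a row of black and white cells, where each cell’s next color depends only on itself and its two neighbors. The talk does not name a specific rule; rule 30 is picked here as a well-known illustration, and the function names, grid width, and wraparound edges are assumptions of this example.

```python
def step(cells, rule):
    """Apply one step of an elementary cellular automaton (wraparound edges).

    Each cell looks at (left, self, right), reads that triple as a number
    from 0 to 7, and takes the corresponding bit of `rule` as its new value.
    """
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

def evolve(rule, width=63, steps=30):
    """Start from a single black cell in the middle and return all rows."""
    cells = [0] * width
    cells[width // 2] = 1
    rows = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        rows.append(cells)
    return rows

def render(rows):
    """Draw black cells as '#' and white cells as '.'."""
    return "\n".join("".join("#" if c else "." for c in row) for row in rows)

print(render(evolve(30)))
```

The whole “program” is the single number 30, yet the printed pattern is irregular and hard to summarize. Changing the number passed to `evolve` gives a quick tour of how different the behavior of equally tiny rules can be.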

And I think that’s very important in understanding many systems in nature. Maybe even in giving us a fundamental theory of physics for our whole universe. It also gives us new ways of thinking about things. For philosophy. For understanding systems, and organizations and so on.

Newtonian science gave us notions like momentum and forces and integrals. That we talk about nowadays in all kinds of contexts. The new kind of science gives us notions like computational irreducibility, and computational equivalence, that give us new ways to think about things.

There are also some very practical implications. Like in technology. In a sense, technology is all about taking what exists in the world, and seeing how to harness it for human purposes. Figuring out what good a magnetic material, or a certain kind of gas, is.

In the computational universe, we’ve got all these little programs and algorithms that do all these remarkable things. And now there’s a new kind of technology that we can do. Where we define some goal. Then we search this computational universe for a program that achieves it. In a sense, what this does is to make invention free.
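As a toy version of “define a goal, then search”, the sketch below enumerates all 256 elementary cellular automaton rules and keeps the ones that meet a stated goal. Everything here is an assumption made for illustration: the tiny search space, the function names, and the particular goal (reaching a given target row after two steps from a single black cell). Real searches of this kind cover far larger spaces and far less trivial goals.

```python
def step(cells, rule):
    """One step of an elementary cellular automaton with wraparound edges."""
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

def run(rule, width, steps):
    """Run `rule` from a single black cell and return the final row."""
    cells = [0] * width
    cells[width // 2] = 1
    for _ in range(steps):
        cells = step(cells, rule)
    return cells

def search(goal, width, steps):
    """Return every rule number (0..255) whose final row satisfies `goal`."""
    return [rule for rule in range(256) if goal(run(rule, width, steps))]

# Hypothetical goal: end up with exactly this row after two steps.
target = [0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0]
matches = search(lambda row: row == target, width=11, steps=2)
print(matches)
```

The point of the design is that the goal is just a function that says yes or no; nothing in `search` knows how the goal might be achieved. One states what is wanted, and the programs that happen to deliver it are found rather than constructed.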

Actually, we’ve used this for example for creating music, and other people have used it in areas like architecture. Using a computer to do creative work. And, if one wants, to do it very efficiently. Making it economical, for example, to do mass customization.

At a very practical level, for more than a decade now we’ve routinely been creating technology not by having human engineers build it up step by step, but instead by searching the computational universe—and finding all this stuff out there that we can harness for technology. It’s pretty interesting. Sometimes what one finds is readily understandable to a human. Sometimes one can verify it works, but it’s really a very non-human solution. Something that no human on their own would have come up with. But something that one just finds out there in the computational universe.

Well, I think this methodology of algorithm discovery (and related methodologies for finding actual structures, for mechanical devices, molecules, and so on) will inevitably grow in importance. In fact, I’m guessing that within a few decades we’re going to find that there’s more new technology being created by those methods than by all existing traditional engineering methods put together.

Today, we tend to create in a sense using only simplified computations—because that’s what our existing methods let us work with. But in the future we’re going to be seeing in every aspect of our world much much more that’s visibly doing sophisticated computation.

I want to leave you with the thought that even after everything that’s happened with computers over the past 50 years, we haven’t seen anything yet. Computation is a much stronger concept—and actually my guess is it’s going to be the defining concept for much of the future of human history.

]]>
http://blog.stephenwolfram.com/2012/10/latest-perspectives-on-the-computation-age/feed/ 2