*Two weeks ago I spoke at SXSW Interactive in Austin, TX. Here’s a slightly edited transcript (it’s the “speaker’s cut”, including some demos I had to abandon during the talk):*

Well, I’ve got a lot planned for this hour.

Basically, I want to tell you a story that’s been unfolding for me for about the last 40 years, and that’s just coming to fruition in a really exciting way. And by *just* coming to fruition, I mean pretty much today. Because I’m planning to show you today a whole lot of technology that’s the result of that 40-year story—that I’ve never shown before, and that I think is going to be pretty important.

I always like to do live demos. But today I’m going to be pretty extreme. Showing you a lot of stuff that’s very very fresh. And I hope at least a decent fraction of it is going to work.

OK, here’s the big theme: taking computation seriously. Really understanding the idea of computation. And then building technology that lets one inject it everywhere—and then seeing what that means.

I’ve pretty much been chasing this idea for 40 years. I’ve been kind of alternating between science and technology—and making these bigger and bigger building blocks. Kind of making this taller and taller stack. And every few years I’ve been able to see a bit farther. And I think making some interesting things. But in the last couple of years, something really exciting has happened. Some kind of grand unification—which is leading to a kind of Cambrian explosion of technology. Which is what I’m going to be showing you pieces of for the first time here today.

But just for context, let me tell you a bit of the backstory. Forty years ago, I was a 14-year-old kid who’d just started using a computer—which was then about the size of a desk. I was using it not so much for its own sake, but instead to try to figure out things about physics, which is what I was really interested in. And I actually figured out a few things—which even still get used today. But in retrospect, I think the most important thing I figured out was kind of a meta thing. That the better the tools one uses, the further one can get. Like I was never good at doing math by hand, which in those days was a problem if you wanted to be a physicist. But I realized one could do math by computer. And I started building tools for that. And pretty soon I, with my tools, was better than almost anyone at doing math for physics.

And back in 1981—somewhat shockingly in those days for a 21-year-old professor type—I turned that into my first product and my first company. And one important thing is that it made me realize that products can really drive intellectual thinking. I needed to figure out how to make a language for doing math by computer, and I ended up figuring out these fundamental things about computation to be able to do that. Well, after that I dived back into basic science again, using my computer tools.

And I ended up deciding that while math was fine, the whole idea of it really needed to be generalized. And I started looking at the whole universe of possible formal systems—in effect the whole computational universe of possible programs. I started doing little experiments. Kind of pointing my computational telescope into this computational universe, and seeing what was out there. And it was pretty amazing. Like here are a few simple programs.

Some of them do simple things. But some of them—well, they’re not simple at all.

This is my all-time favorite, because it’s the first one like this that I saw. It’s called rule 30, and I still have it on the back of my business cards 30 years later.

Trivial program. Trivial start. But it does something crazy. It sort of just makes complexity from nothing. Which is a pretty interesting phenomenon. That I think, by the way, captures a big secret of how things work in nature. And, yes, I’ve spent years studying this, and it’s really interesting.
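
The on-screen code didn’t survive into this transcript; in the Wolfram Language the rule 30 demo is essentially a one-liner, roughly:

```wolfram
(* rule 30, started from a single black cell, run for 100 steps *)
ArrayPlot[CellularAutomaton[30, {{1}, 0}, 100]]
```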

But when I was first studying it, the big thing I realized was: I need better tools. And basically that’s why I built *Mathematica*. It’s sort of ironic that *Mathematica* has math in its name. Because in a sense I built it to get beyond math. In *Mathematica* my original big idea was to kind of drill down below all the math and so on that one wanted to do—and find the computational bedrock that it could all be built on.
And that’s how I ended up inventing the language that’s in *Mathematica*. And over the years, it’s worked out really well. We’ve been able to build ever more and more on it.

And in fact *Mathematica* celebrated its 25th anniversary last year—and in those 25 years it’s gotten used to invent and discover and learn a zillion things—in pretty much all the universities and big companies and so on around the world. And I myself managed to carve out a decade to actually use *Mathematica* to do science. And I ended up discovering lots of things—scientific, technological and philosophical—and wrote this big book about them.

Well, OK, back when I was a kid something I was always interested in was systematizing information. And I had this idea that one day it should be possible to automate answering questions about basically anything. I figured out a lot about how to answer questions about math computations. But somehow I imagined that to do this in general, one would need some kind of general artificial intelligence—some sort of brain-like AI. And that seemed very hard to make.

And every decade or so I would revisit that. And conclude that, yes, that was still hard to make. But doing the science I did, I realized something. I realized that if one even just runs a tiny program, it can end up doing something of sort of brain-like complexity.

There really isn’t ultimately a distinction between brain-like intelligence, and this. And that’s got lots of implications for things like free will versus determinism, and the search for extraterrestrial intelligence. But it also made me realize that you shouldn’t need a brain-like AI to be able to answer all those questions about things. Maybe all you need is just computation. Like the kind we’d spent years building in *Mathematica*.

I wasn’t sure if it was the right decade, or even the right century. But I guess that’s the advantage of having a simple private company and being in charge; I just decided to do the experiment anyway. And, I’m happy to say, it turned out it was possible. And we built Wolfram|Alpha.

You type stuff in, in natural language. And it uses all the curated data and knowledge and methods and algorithms that we’ve put into it, to basically generate a report about what you asked. And, yes, if you’re a Wolfram|Alpha user, you might notice that Wolfram|Alpha on the web just got a new spiffier look yesterday. Wolfram|Alpha knows about all sorts of things. Thousands of domains, covering a really broad area. Trillions of pieces of data.

And indeed, every day many millions of people ask it all sorts of things—directly on the website, or through its apps or things like Siri that use it.

Well, OK, so we have *Mathematica*, which has this kind of bedrock language for describing computations—and for doing all sorts of technical computations. And we also have Wolfram|Alpha—which knows a lot about the world—and which people interact with in this sort of much messier way through natural language. Well, *Mathematica* has been growing for more than 25 years, Wolfram|Alpha for nearly 5. We’ve continually been inventing ways to take the basic ideas of these systems further and further.
But now something really big and amazing has happened. And actually for me it was catalyzed by another piece: the cloud.

Now I didn’t think the cloud was really an intellectual thing. I thought it was just sort of a utility. But I was wrong. Because I finally understood how it’s the missing piece that lets one take kind of the two big approaches to computation in *Mathematica* and in Wolfram|Alpha and make something just dramatically bigger from them.

Now, I’ve got to tell you that what comes out of all of this is pretty intellectually complicated. But it’s also very very directly practical. I always like these situations. Where big ideas let one make actually really useful new products. And that’s what’s happened here. We’ve taken one big idea, and we’re making a bunch of products—that I hope will be really useful. And at some level each product is pretty easy to explain. But the most exciting thing is what they all mean together. And that’s what I’m going to try to talk about here. Though I’ll say up front that even though I think it’s a really important story, it’s not an easy story to tell.

But let’s start. At the core of pretty much everything is what we call the Wolfram Language. Which is something we’re just starting to release now.

The core of the Wolfram Language has been sort of incubating in *Mathematica* for more than 25 years. It’s kind of been proven there. But what just happened is that we got all these new ideas and technology from Wolfram|Alpha, and from the Cloud. And they’ve let us make something that’s really qualitatively different. And that I’m very excited about.

So what’s the idea? It’s really to make a language that’s knowledge based. A language where built right into the language is huge amounts of knowledge about computation and about the world. You see, most computer languages kind of stay close to the basic operations of the machine. They give you lots of good ways to manage code you build. And maybe they have add-on libraries to do specific things.

But our idea with the Wolfram Language is kind of the opposite. It’s to make a language that has as much built in as possible. Where the language itself does as much as possible. To make everything as automated as possible for the programmer.

OK. Well let’s give it a try.

You can use the Wolfram Language completely interactively, using the notebook interface we built for *Mathematica*.

OK, that’s good. Let’s do something a little harder:

Yup, that’s a big number. Kind of looks like a bunch of random digits. Might be like 60,000 data points of sensor data.

How do we analyze it? Well, the Wolfram Language has all that stuff built in.

So like here’s the mean:

And the skewness:

Or hundreds of other statistical tests. Or visualizations.
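
The on-screen inputs are missing from this transcript; a sketch of the sequence, assuming the “big number” was something like 2^100000 (the number in the actual demo may have differed):

```wolfram
data = IntegerDigits[2^100000];  (* ~30,000 digits, standing in for sensor data *)
N[Mean[data]]                    (* the mean *)
N[Skewness[data]]                (* the skewness *)
Histogram[data]                  (* one of the many built-in visualizations *)
```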

That’s kind of weird actually. But let me not get derailed trying to figure out why it looks like that.

OK. Here’s something completely different. Let’s have the Wolfram Language go to some kind volunteer’s Facebook account and pull out their friend network:

OK. So that’s a network. The Wolfram Language knows how to deal with those. Like let’s compute how that breaks into communities:
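
The code for this step isn’t in the transcript; a sketch, assuming the demo used the built-in Facebook service connection (which requires authorizing access, and whose availability has changed since this talk):

```wolfram
(* pull the friend network as a Graph, then color it by detected communities *)
g = SocialMediaData["Facebook", "FriendNetwork"];
CommunityGraphPlot[g]
```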

Let’s try something different. Let’s get an image from this little camera:

OK. Well now let’s do something to that. We can just take that image and feed it to a function:

So now we’ve gotten the image broken into little pieces. Let’s make that dynamic:

Let’s rotate those around:

Let’s like even sort them. We can make some funky stuff:

OK. That’s kind of cool. Why don’t we tweet it?
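
A compressed sketch of that image pipeline (it assumes a connected camera, and a Twitter service authorization for the last step; the on-screen code itself isn’t in the transcript):

```wolfram
img = CurrentImage[];                 (* grab a frame from the camera *)
pieces = ImagePartition[img, 40];     (* break it into 40x40-pixel tiles *)
funky = ImageAssemble[Map[Sort, pieces]];  (* sort the tiles within each row *)
SendMessage["Twitter", funky]         (* tweet the result *)
```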

OK. So the whole point is that the Wolfram Language just intrinsically knows a lot of stuff. It knows how to analyze networks. It knows how to deal with images—doing all the fanciest image processing. But it also knows about the world. Like we could ask it when the sun rose this morning here:

Or the time from sunrise to sunset today:

Or we could get the current recorded air temperature here:

Or the time series for the past day:
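
Roughly, those four queries look like this in the Wolfram Language (the exact on-screen forms aren’t in the transcript):

```wolfram
Sunrise[Here, Today]                         (* when the sun rose this morning *)
Sunset[Here, Today] - Sunrise[Here, Today]   (* time from sunrise to sunset *)
AirTemperatureData[]                         (* current recorded air temperature *)
DateListPlot[AirTemperatureData[Here, {Now - Quantity[1, "Days"], Now}]]
```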

OK. Here’s a big thing. Based on what we’ve done for Wolfram|Alpha, we can understand lots of natural language. And what’s really powerful is that we can use that to refer to things in the real world.

Let’s just type `control-= nyc`:

And that just gives us the entity of New York City. So now we can find the temperature difference between here and New York City:
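
A sketch of what that computation looks like once `control-= nyc` has produced the entity:

```wolfram
nyc = Entity["City", {"NewYork", "NewYork", "UnitedStates"}];  (* from control-= nyc *)
AirTemperatureData[] - AirTemperatureData[nyc]   (* temperature difference *)
```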

OK. Let’s do some more:

Let’s find the lengths of those borders:

Let’s put that in a grid:

Or maybe let’s make a word cloud out of that:

Or we could find all the former Soviet countries:

And let’s find their flags:

And let’s like find which is closest to the French flag:

Pretty neat, eh?

Or let’s take the first few former Soviet republics. And generate maps of their capital cities. With 10-mile discs marked:
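
A hedged reconstruction of those last few steps (the `CountryData` group name here is assumed, not confirmed by the transcript):

```wolfram
soviet = CountryData["FormerSovietRepublics"];   (* group name assumed *)
flags = CountryData[#, "Flag"] & /@ soviet;
(* the flag closest to the French flag, by image distance *)
First[SortBy[flags, ImageDistance[#, CountryData["France", "Flag"]] &]]
(* capital-city maps with 10-mile discs marked *)
GeoGraphics[GeoDisk[CountryData[#, "CapitalCity"], Quantity[10, "Miles"]]] & /@
 Take[soviet, 3]
```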

I think it’s pretty amazing that you can do that kind of thing right from inside a programming language, with just a line of code.

And, you know, there’s a huge amount of knowledge built into the Wolfram Language. We’ve been building this for more than a quarter of a century.

There’s knowledge about algorithms. And about the world.

There are two big principles here. The first is maximum automation: automate as much as possible. You define what you want the language to do, then it’s up to it to figure out how to do it. There might be hundreds of algorithms for doing different cases of something. But what we want to do is to make a meta-algorithm that selects the best way to do it. So kind of all the human has to do is to define their goal, then it’s up to the system to do things in the way that’s fastest, most accurate, best looking.

Like here’s an example. There’s a function `Classify` that tries to classify things. You just type `Classify`.
Like here’s a very small training set of handwritten digits:

And this makes a classifier.

Which we can then apply to something we draw:
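
The handwritten-digit images from the demo can’t be reproduced in text; a stand-in with string data shows the same `Classify` shape:

```wolfram
(* train on labeled examples; the talk used images of handwritten digits instead *)
c = Classify[{"apple" -> "fruit", "banana" -> "fruit",
    "carrot" -> "vegetable", "potato" -> "vegetable"}];
c["pear"]   (* apply the resulting ClassifierFunction to something new *)
```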

OK, well here’s another big thing about the Wolfram Language: coherence. Unification. We want to make everything in the language fit together. Even though it’s a huge system, if you’re doing something over here with geographic data, we want to make sure it fits perfectly with what you’re doing over there with networks.

I’ve spent a decent fraction of the last 25 years of my life implementing the kind of design discipline that’s needed. It’s been fascinating, but it’s been hard work. Spending all that time to make things obvious. To make it so it’s easy for people to learn and remember and guess. But you know, having all these building blocks fit together: that’s also where the most powerful new algorithms come from. And we’ve had a great time inventing tons and tons of new algorithms that are really only possible in our language—where we have all these different areas integrated.

And there’s actually a really fundamental reason that we can do this kind of integration. It’s because the Wolfram Language has this very fundamental feature of being symbolic. If you just type `x` into the language, it doesn’t give some error about *x* being undefined. `x` is just a thing—symbolic `x`—that the language can deal with. Of course that’s very nice for math.

But as far as I am concerned, one of the big discoveries is that this idea of a symbolic language is incredibly powerful for zillions of other things too. Everything in our language is symbolic. Math expressions.

Or entities, like Austin, TX:

Or like a piece of graphics. Here’s a sphere:

Here are a bunch of cylinders:

And because everything is just a symbolic expression, we could pick this up, and, like, do image processing on it:

You know, everything is just a symbolic expression. Like another example is interfaces. Here’s a symbolic slider:

Here’s a whole array of sliders:

You know, once everything is symbolic, there’s just a whole lot you can do. Here’s nesting some purely symbolic function *f*:

Here’s nesting, like, a function that makes a frame:

And here’s symbolically nesting, like, an interface element:

My gosh, it’s a fractal interface!
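
Since the on-screen inputs are missing here, this is roughly what that nesting sequence looks like:

```wolfram
NestList[f, x, 3]   (* -> {x, f[x], f[f[x]], f[f[f[x]]]} *)
Nest[Framed, "x", 5]                          (* nested frames *)
Nest[Column[{Slider[], #}] &, Slider[], 3]    (* nested interface elements *)
```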

You know, once things are symbolic, it’s really easy to hook everything up. Like here’s a plot:

And now it’s trivial to make it interactive:

You can do that with anything:
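
Wrapping any plot in `Manipulate` is all it takes; for instance:

```wolfram
(* an interactive plot: drag the slider to change n *)
Manipulate[Plot[Sin[n x], {x, 0, 10}], {n, 1, 5}]
```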

OK. Here’s another thing that can be made symbolic: documents.

The document I’m typing into here is just another symbolic expression. And you can create whatever you want in it symbolically.

Like here’s some text. We could twirl it around if we want to:

All just symbolic expressions.

OK. So here’s yet another thing that’s a symbolic expression: code. Every piece of code in the Wolfram Language is just a symbolic expression, that can be picked up and manipulated, and passed around, and run, wherever you want. That’s incredibly important for programming. Because it means you can build things in a really modular way. Every piece can stand on its own.

It’s also important for another reason: it’s a great way to deal with the cloud, sort of treating it as a giant active repository for symbolic lumps of computation. And in fact we’ve built this whole infrastructure for that, that I’m going to demo for the first time here today.

Well, let’s say we have a symbolic expression:

Now we can just deploy it to the Cloud like this:

And we’ve got a symbolic `CloudObject`, with a URL we can go to from anywhere. And there’s our material.
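
A minimal sketch of that deployment step (the deployed content in the talk was different):

```wolfram
(* deploy an expression to the cloud; the result is a CloudObject with a URL *)
obj = CloudDeploy[Plot[Sin[x], {x, 0, 10}], Permissions -> "Public"]
```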

Now let’s make this not static content, but an actual program. And on the web, a good way to do that is to have an API. But with our whole notion of everything being symbolic, we can represent that as just another symbolic expression:

And now we can deploy that to the Cloud:

And we’ve got an Instant API. Now we can just fill in an API parameter `?size=150` and we can run this from anywhere on the web:

And every time what’ll happen is that you’ll be calling that piece of Wolfram Language code in the Wolfram Cloud, and getting the result back. OK.
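
A sketch of such an Instant API, with a hypothetical stand-in for the talk’s actual function:

```wolfram
(* an API with one integer parameter, referenced as #size in the body *)
api = APIFunction[{"size" -> "Integer"},
   Style["Hello from the cloud", FontSize -> #size] &];
CloudDeploy[api, Permissions -> "Public"]
(* appending ?size=150 to the CloudObject URL runs the code and returns the result *)
```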

Here’s another thing to do: make a form. Just change the `APIFunction` to a `FormFunction`:

Now what we’ve got is a form:

Let’s add a feature:

Now let’s fill some values into the form:

And when we press Submit, here’s the result:

OK. Let’s try a different case. Here’s a form that takes two cities, and draws a map of the path between them:

Let’s deploy it in the Cloud:

Now let’s fill in the form:

And when we press Submit, here’s what we get:
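
A hedged sketch of that two-city form (the actual path styling in the demo isn’t recorded in the transcript):

```wolfram
(* "City" makes each field a smart field interpreted via natural language *)
form = FormFunction[{"city1" -> "City", "city2" -> "City"},
   GeoGraphics[{Thick, Red, GeoPath[{#city1, #city2}]}] &];
CloudDeploy[form, Permissions -> "Public"]
```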

One line of code and an actual little web app! It’s got quite a bit of technology inside it. Like you see these fields. They’re what we call smart fields. That leverage our natural language understanding stack:

If you don’t give a city, here’s what happens:

When you do give a city, the system is automatically interpreting the inputs as city entities. Let me show you what happens inside. Let’s just define a form that just returns a list of its inputs:

Now if we enter cities, we just get Wolfram Language symbolic entity objects. Which of course we can then compute with:
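
That diagnostic form is roughly:

```wolfram
(* a form that just returns its interpreted inputs *)
FormFunction[{"city1" -> "City", "city2" -> "City"}, {#city1, #city2} &]
(* submitted text comes back as Entity["City", ...] objects, so something
   like GeoDistance[#city1, #city2] would work on them directly *)
```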

All right, let’s try something else.

Let’s do a sort of modern programming example. Let’s make a silly app that shows us pictures through the eyes of a cat or a dog. OK, let’s build the framework:

Now let’s pull in an actual algorithm for dog vision. Color channels, and acuity.
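
The actual algorithm from the talk isn’t in the transcript; here is a crude stand-in that captures the two ingredients mentioned (reduced color channels and reduced acuity):

```wolfram
(* crude sketch: treat dogs as blue/yellow dichromats with lower acuity *)
dogVision[img_Image] := Module[{r, g, b, yellow},
  {r, g, b} = ColorSeparate[img];
  yellow = ImageMultiply[ImageAdd[r, g], 0.5];  (* collapse red and green *)
  Blur[ColorCombine[{yellow, yellow, b}], 5]]   (* blur for reduced acuity *)
```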

OK. Let’s deploy with that:

Now we can send that over as an app. But first let’s build an icon for it:

And now let’s deploy it as a public app:

Now let’s go to the Wolfram Cloud app on an iPad:

And there’s the app we just published:

Now we click that icon—and there we have it: a mobile app running against the Wolfram Language in the Cloud:

And we can just use the iPad camera to input a picture, and then run the app on it:

Pretty neat, eh?

OK, but there’s more. Actually, let me tell you about the first product that’s coming out of our Wolfram Language technology stack. It should be available very soon. We call it the Wolfram Programming Cloud.

It’s all the stuff I’m showing you, but all happening in the Cloud. Including the programming. And, yes, there’s a desktop version too.

OK, so here’s the Programming Cloud:

Deploy from the Cloud. Define a function and just use `CloudDeploy[]`:

Or use the GUI:

Oh, another thing is to take CDF and deploy it to run in the Cloud.

Let’s take some code from the Wolfram Demonstrations Project. Actually, as it happens, this was the very first Demonstration I wrote when we were originally building that site:

Now here’s the deployed Cloud CDF:

It just needs a web browser. And gives arbitrary interactivity by running against the Wolfram Engine in the Cloud.

OK, well, using this technology, another product we’re building is our Data Science Platform.

And the idea is that data comes in, from all sorts of sources. And then we have all these automatic ways to analyze it. Using sort of a giant meta-algorithm. As well as using all the knowledge of the actual world that we have.

Well, then you can program whatever you want with the Wolfram Language. And in the end you can make reports. On demand, like from an API or an app. Or just on a schedule. And we can use our whole CDF symbolic documents to set up these reports.

Like here’s a template for a report on the state of my email inbox. It’s just defined as a symbolic document. That I go ahead and edit.

And then programmatically generate reports from:

You know, there are some really spectacular things we can do with data using our whole symbolic language technology stack. And actually just recently we realized that we can use it to make a very clean unification and generalization of SQL and NoSQL databases. And we’re implementing that in sort of four transparent levels. In memory. In files. In databases. And distributed.

But OK. Another thing is that we’ve got a really good way to represent individual pieces of data. We call it WDF—the Wolfram Data Framework.

And basically what it is, is taking the kind of algorithmic ontology that we built for Wolfram|Alpha—and that we know works—and exposing that. And using our natural language understanding to be able to take unstructured data, and automatically convert it to something that’s structured and computable. And that for example our Data Science Platform can do really good things with.

Well, OK. Here’s another thing. A rapidly increasing source of data out there in the world are connected devices. And we’ve been pretty deeply involved with those. And actually one thing I wanted to do recently was just to find out what devices there are out there. So we started our Connected Devices Project, to just curate the devices out there—just like we curate all sorts of other things in Wolfram|Alpha.

We have about 2500 devices in here now, growing every day. And, yes, we’re using WDF to organize this, and, yes, all this data is available from Wolfram|Alpha.

Well, OK. So there are all these devices. And they measure things and do things. And at some point they typically make web contact. And one thing we’re doing—with our Data Science Platform and everything—is to create a really smooth infrastructure for handling things from there on. For visualizing and analyzing and computing everything that comes from that Internet of Things.

You know, even for devices that haven’t yet made web contact, it can be a bit messier, but we’ve got a framework for handling those too. Like here’s an accelerometer connected to an Arduino:

Let’s see if we can get that data into the Wolfram Language. It’s not too hard:

And now we can immediately plot this:

So that’s connecting a device to the Wolfram Language. But there’s something else coming too. And that’s actually putting the Wolfram Language onto devices. And this is where 25 years of tight software engineering pays off. Because as soon as devices run things like Linux, we can run the Wolfram Language on them. And actually there’s now a preliminary version of the Wolfram Language bundled with the standard operating system for every Raspberry Pi.

It’s pretty neat being able to have little $25 devices that persistently run the Wolfram Language. And connect to sensors and actuators and things. And every little computer out there just gets represented as yet another symbolic object in the Wolfram Language. And, like, it’s trivial to use the built-in parallel computation capabilities of the Wolfram Language to pull data from lots of such machines.
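
One plausible shape for that, assuming remote kernels have already been configured on the Pis (the hostnames here are hypothetical):

```wolfram
Needs["SubKernels`RemoteKernels`"]
LaunchKernels[RemoteMachine["raspberrypi1.local"]]   (* hostname hypothetical *)
LaunchKernels[RemoteMachine["raspberrypi2.local"]]
ParallelEvaluate[$MachineName]   (* evaluate on every connected machine *)
```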

And going forward, you can expect to see the Wolfram Language running on lots of embedded processors. There’s another kind of embedding we’re interested in too. And that’s software embedding. We want to have a Universal Deployment System for the Wolfram Language.

Given a Wolfram Language program, there are lots of ways to deploy it.

Here’s one: being able to call Wolfram Language code from other languages.

And we have a really easy way to do that. There’s a GUI, but in the Wolfram Language, you can just take an API function, and say: create embed code for this for Python. Or Java. Or whatever.

And you can then just insert that code in your external program, and it’ll call the Wolfram Cloud to get a computation done. Actually, there are going to be ways to do this from inside IDEs, like Wolfram *Workbench*.

This is really easy to set up, and as I said, it just calls the Wolfram Cloud to run Wolfram Language code. But there’s even another concept. There’s an Embedded Wolfram Engine that you can run locally too. And essentially the same code will then work. But now you’re running on your local machine, not in the Cloud. And things get pretty interesting, being able to put Embedded Wolfram Engines inside all kinds of software, to immediately add all that knowledge-based capability, and all those algorithms, and natural language and so on. Here’s what the Embedded Wolfram Engine looks like inside the Unity Game Engine IDE:

Well, talking of embedding, let me mention yet another part of our technology stack. The Wolfram Language is supposed to describe the world. And so what about describing devices and machines and so on.

Well, conveniently enough we have a product related to our *Mathematica* business called *SystemModeler*, which does large-scale system modeling and simulation:

And now that’s all getting integrated into the Wolfram Language too.

So here’s a representation of a rectifier circuit:

And this is all it takes to simulate this device:

And to plot parameters from the simulation:

And here’s yet another thing. We’re taking the natural language understanding capabilities that we created for Wolfram|Alpha, and we’re setting them up to be customizable. Now of course that’s big when one’s querying databases, or controlling devices. It’s also really interesting when one’s interacting with simulations. Looking at some machine out in the field, and being able to figure out things about it by talking to one’s mobile device, and then getting a simulation done in the Cloud.

There are lots of possibilities. But OK, so how can people actually use these things? Well, in the next couple of weeks there’ll be an open sandbox on the web for people to use the Wolfram Language. We’ve got a gallery of examples that gives good places to start.

Oh, as well as 100,000 live examples in the Wolfram Language documentation.

And, OK, the Wolfram Programming Cloud is also coming very soon. And it’ll be completely free to start developing with it, and even to do small-scale deployments.

So what does this mean?

Well, I think it’s pretty exciting. Because I think we just really changed the economics of going from algorithmic ideas to deployed products. If you come by our booth at the South By trade show, we’ll be doing a bunch of live coding there. And perhaps we’ll even be able to create little products for people right there. But I think our Programming Cloud is going to open up a surge of algorithmic startups. And I’ll be really interested to see what comes out.

OK. Here’s another thing that’s going to change I think: programming education. I think the Wolfram Language is sort of uniquely good for education. Because it’s a language where you get to do real things incredibly easily. You get to see computation at work in an incredibly powerful way. And, by the way, rather effortlessly see a bunch of modern computer science ideas… and immediately connect to the real world.

And the natural language aspect makes it really easy to get started. For serious programmers, I think having snippets of natural language programming, particularly in places where one’s connecting to the real world, is very powerful. But for people getting started, it’s really nice to be able to create things just with natural language.

Like here we can just say:

And have the code generated automatically.

We’re really interested in all the educational possibilities here. Certainly there’s the raw material for a zillion great hackathon projects.

You know, every summer for the past dozen years we’ve done a very successful summer school about the new kind of science I’ve worked on:

Where we’re effectively doing real-time science. We’ve also for a few years had a summer camp for high-school students:

And we’re using our experience here to build out a bunch of ways to use the Wolfram Language for programming education. You know, we’ve been involved in education for a long time—more than 25 years. *Mathematica* is incredibly widely used there. Wolfram|Alpha I’m happy to say has become sort of a universal tool for students.

There’s more and more coming.

Like here’s a version of Wolfram|Alpha in Chinese that’s coming soon:

Here’s a Problem Generator created with the Wolfram Language and available through Wolfram|Alpha Pro:

And we’re going to be doing all sorts of elaborate educational analytics and things through our Cloud system. You know, there are just so many possibilities. Like we have our CDF—Computable Document Format—that people have used for quite a few years to make interactive Demonstrations.

In fact here’s our site with nearly 10,000 of them:

And now with our Cloud system we can just run all of these directly in a web browser, using Cloud CDF, so they become easy to integrate into web learning environments. Like here’s an example that just got done by Versal:

Well, OK, at kind of the other end of things from education, there’s a lot going on in the corporate area. We’ve been doing large-scale custom deployments of Wolfram|Alpha for several years. But now with our Data Science Platform coming, we’ve got a kind of infinitely customizable version of that. And of course everything is integrated between cloud and desktop. And we’re going to have private clouds too.

But all this is just the beginning. Because what we’ve got with the whole Wolfram Language stack is a kind of universal platform for creating products. And we’ve got a whole sequence of products in the pipeline. It’s an exciting feeling having all this stuff that we’ve been doing for more than a quarter of a century come together like this.

Of course, it’s a big challenge dealing with all the possibilities. I mean, we’re just a little private company with about 700—admittedly very talented—people.

We’ve started spinning off companies. Like Touch Press which makes iPad ebooks.

And we’ll be doing more of that, though we need more entrepreneurs. And we might even take investors.

But, OK, what about the broader future?

I think about that a fair amount. I don’t have time to say much here. But let me say just a few things. In what we’ve done with computation and knowledge, we’re trying to take the knowledge of our civilization, and put it in computable form. So we can essentially inject it everywhere. In something like Wolfram|Alpha, we’re essentially doing on-demand computation. You ask for something, and Wolfram|Alpha will do it.

Increasingly, we’re going to have preemptive computation. We’re building towards that a lot with the Wolfram Language. Being able to model the world, and make predictions about what’s going to happen. Being able to tell you what you might want to do next. In fact, whenever you use the Wolfram Language interactively, you’ll see this little Suggestions Bar that’s using some fairly fancy computation to suggest what to do next.

But the real way to have that work is to use knowledge about you. I’ve been an enthusiast of personal analytics for a long time. Like here’s a 25-year history of my diurnal email rhythm:

And as we have more sensors and outsource more of our memory, our machines will be better and better at telling us what to do. And at some level the machines take over just because the humans tend to follow the auto-suggests they make.

But OK. Here’s something I realized recently. I’m interested in history, and I was visiting the archives of Gottfried Leibniz, who lived about 300 years ago, and had a lot of rather modern ideas about computing. But in his time he had only one—very primitive—proto-computer that he built:

Today we have billions of computers. So I was thinking about the extrapolation. And I realized that one day there won’t just be lots more computers—everything will actually be made of computers.

Biology has already figured out a little bit of this idea. But one day it won’t be worth making anything out of dumb materials; instead everything will be made out of stuff that’s completely programmable.

So what does that mean? Well, of course it really blurs the distinction between hardware and software. And it means that these languages we create sort of become what everything is made of. You know, I’ve been interested for a long time in the fundamental theory of physics. And in fact with a bunch of science I’ve done, I think there’s a real possibility that we’ve finally got a new way to find such a theory. In effect a way to find our physical universe out in the computational universe of all possible universes.

But here’s the funny thing: once everything is made of computers, even though it’ll be really cool to find the fundamental theory of physics—and I still want to do it—it’s not going to matter so much. Because in effect that physics is just the machine code for the universe. But everything we deal with is on top of a layer that we can program however we want.

Well, OK, what does that mean for us humans? No doubt we’ll get to deploy in that sort of much-more-than-biology-programmable world. Where in effect you can just build any universe for yourself. I sort of imagine this moment where there’s a box of a trillion souls. Running in whatever pieces of the computational universe they want.

And what happens? Well, there’s lots of computation going on. But from the science I’ve done—and particularly the Principle of Computational Equivalence—I think it’s sort of a very Copernican situation. I don’t think there’s anything fundamentally different about that computation, from what goes on all over the universe, and even in rather simple programs.

And at some level the only thing that’s special about that particular box of a trillion souls is that it’s based on our particular history. Now, you know, I deal with all this tech stuff. But I happen to like people; I guess that’s why I’ve liked building a company, and mentoring lots of people. And in a sense seeing how much is possible, and how much can sort of be generalized and virtualized with technology, actually makes me think people are more important rather than less. Because when everything is possible, what matters is just what one wants or chooses to do.

It’s sort of a big version of what we’re doing with the Wolfram Language. Humans define the goals, then technology automatically tries to achieve them. And the more we can inject computation into everything, the more this becomes possible. And, you know, I happen to think that the injection of computation into everything will be a defining feature—perhaps the defining feature—of this time in history.

And I have to say I’m personally pleased to have lived at the right time to make some contribution to this. It’s a great privilege. And I’m very pleased to have been able to tell you a little bit about it here today.

Thank you very much.

Here’s a short video demo I just made. It’s amazing to me how much of this is based on things I hadn’t even thought of just a few months ago. Knowledge-based programming is going to be much bigger than I imagined…

In the end, we want every type of connected device to be seamlessly integrated with the Wolfram Language. And this will have all sorts of important consequences. But as we work toward this, there’s an obvious first step: we have to know what types of connected devices there actually are.

So to have a way to answer that question, today we’re launching the Wolfram Connected Devices Project—whose goal is to work with device manufacturers and the technical community to provide a definitive, curated source of systematic knowledge about connected devices.

We have a couple of thousand devices (from about 300 companies) included as of today—and we expect this number to grow quite rapidly in the months ahead. For each device, there is a certain amount of structured information:

Whenever possible, this information is set up to be computable, so that it can for example be used in Wolfram|Alpha:

Soon you’ll be able to make all sorts of complex queries about devices, very much like the queries you can make now about consumer products:

We’re working hard to make the Wolfram Connected Devices Project an important and useful resource in its own right. But in the end our goal is not just to deal with information about devices, but actually be able to connect to the devices, and get data from them—and then do all sorts of things with that data.

But first—at least if we expect to do a good job—we must have a good way to represent all the kinds of data that can come out of a device. And, as it turns out, we have a great solution coming for this: WDF, the Wolfram Data Framework. In a sense, what WDF does is to take everything we’ve learned about representing data and the world from Wolfram|Alpha, and make it available to use on data from anywhere.

There’s a lot to say about WDF. But in terms of devices, it provides an immediate way to represent not just raw numbers from a device, but, say, images or geopositions—or actual measured physical quantities.

In Wolfram|Alpha we’ve, of necessity, assembled the world’s most complete system for handling physical quantities and their units. We’ve got a couple of thousand physical quantities built in (like length, or torque, or tensile strength, or clicks per impression), as well as nearly 10,000 units of measure (like inches, or meters per second, or katals, or micropascals per square root hertz). And in WDF we immediately get to use this whole setup.
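To give a taste of what this machinery looks like in the Wolfram Language itself, `Quantity` and `UnitConvert` carry structured physical quantities directly—here is a minimal sketch, with the particular values chosen purely as examples:

```
(* A measured value, carried as a structured physical quantity, not a bare number *)
speed = Quantity[60., "MilesPerHour"];

(* Unit conversion is built in *)
UnitConvert[speed, "MetersPerSecond"]   (* 26.8224 m/s *)

(* Quantities combine dimensionally in ordinary arithmetic *)
Quantity[3., "Meters"]/Quantity[2., "Seconds"]   (* 1.5 m/s *)
```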

So once we can get data out of a device, WDF provides a great way to represent it. And given the WDF form, there are lots of things we can do with the data.

For researchers, we’re building the Wolfram Data Repository, that lets people publish data—from devices or otherwise—using WDF in an immediately computable form.

We’re also building the Wolfram Data Science Platform, that lets people visualize and analyze data using all the sophistication of the Wolfram Language—and then generate complete interactive reports from the data, that can be deployed on the web, on mobile, offline, and so on.

But how can one actually interact with the device? Well, within the Wolfram Language we’ve been building a powerful framework for this. From a user’s point of view, there’s a symbolic representation of each device. Then there is a standard set of Wolfram Language functions like `DeviceRead`, `DeviceExecute`, `DeviceReadBuffer` and `DeviceReadTimeSeries` that perform operations related to the device.

Ultimately, this is implemented by having a Wolfram Language driver for each device. But the idea is that the end user never has to know about this. The appropriate driver is just automatically retrieved from the Wolfram Cloud when it’s needed. And then the general Wolfram Language framework goes from the low-level operations in the driver to all the various higher-level symbolic device functions. Like `DeviceReadTimeSeries`, which samples a series of data points from the device, then returns them in a symbolic `TimeSeries` object which can immediately be used for further visualization, analysis, etc.
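In user-level code, the flow looks roughly like the following sketch (the device name `"Demo"` and the sampling parameters here are just illustrative):

```
(* Open a symbolic connection to a device; the device name is illustrative *)
dev = DeviceOpen["Demo"];

(* Read a single measurement *)
DeviceRead[dev]

(* Sample for 10 seconds at 0.5-second intervals, getting back a TimeSeries *)
ts = DeviceReadTimeSeries[dev, {10, 0.5}];

(* The symbolic TimeSeries plugs straight into visualization and analysis *)
DateListPlot[ts]

(* Close the connection when done *)
DeviceClose[dev]
```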

There is another issue here, though: How does one actually make the connection to a particular device? It depends on the device. Some devices automatically connect to the cloud, perhaps through an intermediate mobile device. And in those cases, one typically just has to connect to an API exposed in the cloud.

But at least right now, many more devices connect in various kinds of wired or wireless ways to a specific local computer. Sometimes one may then want to interact with the data directly on that local computer.

But more often one either wants to have something autonomous happen with the data on the local computer. Or one wants to get the data into the cloud. For example so one can systematically have people or machines query it, generate reports from it, and so on.

And in both these cases, it’s often really convenient to have the basic device connect to some kind of small embeddable computer system. Like the Raspberry Pi, a $25 Linux computer on which, conveniently enough, the Wolfram Language is bundled as part of the standard system software.

And if one’s running the Wolfram Language on the local machine connected to the device, there are mechanisms built into the language that allow both for immediate discovery, and for communication with the cloud. And more than that, with this setup there’s a symbolic representation of the device immediately accessible to the Wolfram Language in the cloud. Which means, for example, that parallel computation operations in the language can be used to aggregate data from networks of devices, and so on.

But, OK, so what are the kinds of devices one will be able to do all this with? Well, that’s what the Wolfram Connected Devices Project is intended to answer.

It’s certainly a very diverse list. Yes, there are lots of acceleration- and/or heart-rate-based health devices, and lots of GPS-based devices. But there are lots of other kinds of devices too, measuring scores of different physical quantities.

The devices range from tiny and cheap to huge and expensive. In the current list, about 2/3 of the devices are basically standalone, and 1/3 require continuous physical connectivity. The border of what counts as a “device”, as opposed to, for example, a component, is a bit fuzzy. Our operational definition for the Wolfram Connected Devices Project is that something can be considered a “connected device” if it measures some physical quantity, and can be connected to a general-purpose computer using some standard connector or connection technology.

For now, at least, we’ve excluded objects that in effect have complex custom electrical connectivity—for example, sensors that have the form factor of integrated circuits with lots of “legs” that have to be plugged into something. We’ve included, though, objects that have just a few wires coming out, that can for example immediately be plugged into GPIO ports, say on Raspberry Pi—or into analog ports on something like an Arduino connected to a Raspberry Pi.

The case of a device whose “interface” is just a few wires is usually one of the more straightforward ones. Things usually get more complicated when serial connections, USB, Bluetooth and so on are involved. Sometimes devices make use of slightly higher-level protocols (like ANT+ or Bluetooth LE). But our experience so far is that ultimately there’s very little that’s truly standard. Each device requires custom work to create a driver, map properly to WDF, and so on.

The good news, of course, is that with the Wolfram Language we have an incredibly rich toolset for creating such drivers. Whether it’s by making use of the hundreds of import and export formats built into the language. Or all the mechanisms for calling external programs. Or the ways of handling time or place information. Or the algorithms for doing signal processing and time series analysis.

We’ve been interacting with many device manufacturers over the past year or so. And it’s been very encouraging. Because it seems as if the technology stack we’ve been building all these years is exactly what people need.

Countless times we’ve heard the same thing. “We’re building this great device; now we want to do great things with the data from it—analyzing it, delivering it to customers, and so on.” Well, that’s exactly what we’re going to be set up to do. And we have both the deep technical capabilities that are needed, and the practical infrastructure.

The first step is to get a Wolfram Language driver for the device. Once that’s done, everything flows from it. Whether it’s just storing computable versions of data in the Wolfram Data Repository. Or doing analysis or reporting through the Wolfram Data Science Platform. Or creating dashboards. Or exposing the data through an API. Or an app. Or producing alerts from the data. Or aggregating lots of data. Or, for that matter, combining data from multiple devices—for example in effect to create “synthetic sensors”.

There are lots of possibilities. One can use Wolfram *SystemModeler* to have a model for a device that can be used to run a simulation in real time. Or one can use the control systems functions in the Wolfram Language to create a controller with the device. Or in a quite different direction, one can use our Wolfram|Alpha-style linguistic capabilities to let end users make natural language or voice queries about data coming from a device.

There are several common end results that manufacturers of devices typically want. One is just that it should be possible to take data from the device and flow it into the Wolfram Data Science Platform, or *Mathematica*, or some other Wolfram Language system, for some kind of processing. Another is that the whole user infrastructure around the device is built using our technology. Say creating a portal or dashboard on the web, or on a mobile device, for every single user of a particular type of device. That can use either our cloud, or a private cloud. And instead of a dashboard, one can have a query mechanism. Say through natural language for humans—or through some structured API for machines or programs.

In some ways the situation with connected devices right now is probably something of a transient. Because we’re mostly thinking about connecting devices to computers, and having those run the Wolfram Language. But in the future, the Wolfram Language is going to be running on increasingly small and ubiquitous embedded computers. And I expect that more and more connected devices are just going to end up having the computer power to run the Wolfram Language inside—so that they can do all sorts of Wolfram Language processing completely internally.

Of course, even in this case there is still going to have to be Wolfram Language code that reads raw data from sensors and so on. So there’s no getting around building drivers, just like for the current way most connected devices are set up.

We’ve had the experience now of building quite a few drivers. For simple devices, it’s a quick process. But as devices start to have more commands, and can generate more sophisticated data, it takes longer. In many ways, it feels like a curation task. Given all the Wolfram Language tools we have, it’s rarely about the details of manipulating data. Rather it’s about knowing what the data means, and knitting it into the whole WDF and Wolfram Language framework.

We’re going to have a service for manufacturers to work with us to connect their devices to our system. We’re also planning to run a sequence of hackathon-like events where students and others can work with devices to set up connections (and often get free devices at the end!).

The goal is to get seamless integration of as many kinds of devices as possible. And the more kinds of devices we have, the more interesting things are going to get. Because it means we can connect to more and more aspects of the physical world, and be in a position to compute more and more about it.

Within the Wolfram Language we have a rich symbolic way to represent the world. And with connected devices we have a way to attach this representation to real things in the world. And to make the Wolfram Language become a complete language for the Internet of Things.

But today we’re taking a first step. Launching the Wolfram Connected Devices Project to start the process of curating just what things exist so far in the current generation of the Internet of Things.

Visit the Wolfram Connected Devices Project »

I often like to write notes on the cards I send. And when I was sending out paper cards, that was straightforward to do. But what about with e-cards?

Well, it’d be easy to type messages and have them printed on the e-cards. But that seems awfully impersonal. And anyway, I rather like having at least one time each year when I do a bunch of actual writing by hand—not least so my handwriting doesn’t atrophy completely.

So there’s an obvious solution: handwritten e-cards. Which is exactly what I did this year:

The background image was created by our (frequently award-winning!) company art department. (This year, the white “Spikey” is 25-pointed, celebrating the 25th anniversary of *Mathematica*.) But how did we get the handwriting onto it?

With the Wolfram Language, it was really easy.

First, I got together the list of email addresses I wanted to send e-cards to. Then we had code to print out pieces of paper like this:

Then I actually did my Christmas thing, and went through and wrote all the messages I wanted to write:

Then we took this stack of pages, and ran them through a scanner, getting a bunch of image files. And now we can go to work.

First, import the file:

Then pick out the part of the image corresponding to the handwritten message (the numbers are an approximation found using an interactive tool):

Now we image-process the message, and make it the right size:

Here are the elements of the actual card we’re trying to assemble:

Now we create a version of the card with the right amount of “internal padding” to have space to insert the particular message:

And then we’re ready to use `ImageCompose` to assemble the final image:

OK, so that’s the card we want to send. Now, who do we want to send it to? To get that, we just have to use `TextRecognize` to do OCR on the original scan:

And finally, just use `SendMail` to send the card to the address we’ve got.
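The code itself didn’t survive in this transcript, but putting the steps above together, the whole pipeline looks roughly like this. The file names, crop coordinates and compositing positions below are placeholders standing in for values that were found interactively, and `background` is assumed to hold the card artwork:

```
(* Import one scanned page *)
scan = Import["scan-page-1.png"];

(* Pick out the handwritten message; the row and column ranges are placeholders *)
message = ImageTake[scan, {250, 900}, {100, 1500}];

(* Clean up the message and size it for the card *)
message = ImageResize[Binarize[message], 800];

(* Composite the message onto the card background at a chosen position *)
card = ImageCompose[background, message, {400, 300}];

(* Recover the recipient's printed email address from the top of the page via OCR *)
address = TextRecognize[ImageTake[scan, {1, 80}]];

(* And send the finished card *)
SendMail[<|"To" -> address, "Subject" -> "Happy Holidays", "Body" -> card|>]
```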

And that’s it. Handwritten e-cards. Of course, since I have a lot of techie friends, there were quite a few responses along the lines of, “How did you do that?”

Well, now it’s not “my secret” any more. And by next holiday season, the Wolfram Cloud will let one make this a service anyone can use. And maybe I’ll have to come up with another little innovation for my own cards…


In effect, this is a technology preview: it’s an early, unfinished, glimpse of the Wolfram Language. Quite soon the Wolfram Language is going to start showing up in lots of places, notably on the web and in the cloud. But I’m excited that the timing has worked out so that we’re able to give the Raspberry Pi community—with its emphasis on education and invention—the very first chance to put the Wolfram Language into action.

I’m a great believer in the importance of programming as a central component of education. And I’m excited that with the Wolfram Language I think we finally have a powerful programming language worthy of the next generation. We’ve got a language that’s not mostly concerned with the details of computers, but is instead about being able to understand and create things on the basis of huge amounts of built-in computational ability and knowledge.

It’s tremendously satisfying—and educational. Writing a tiny program, perhaps not even a line long, and already having something really interesting happen. And then being able to scale up larger and larger. Making use of all the powerful programming paradigms that are built into the Wolfram Language.

And with Raspberry Pi there’s something else too: immediately being able to interact with the outside world. Being able to take pure code, and connect it to sensors and devices that do things.

I think it’s pretty amazing that we’re now at the point where all the knowledge and computation in the Wolfram Language can run in a $25 computer. And I think that it’s the beginning of something very important. Because it means that going forward it’s going to be technically possible to embed the Wolfram Language in pretty much any new machine or system. In effect immediately injecting high-level intelligence and capabilities.

I’ve waited a long time for this. Back in 1988 when *Mathematica* was first released, it could only just fit in a high-end Mac of the time, but not yet a PC. A decade later—even though it had grown a lot—it could run well on pretty much any newly sold personal computer. But embedded computers were a different story—where one expected that only specially compiled simple code could run.

But I knew that one day what would become the Wolfram Language would be able to run in its complete form on an embedded computer. And now it’s clear that finally that day has come: with the Raspberry Pi, we’ve passed the threshold for being able to run the Wolfram Language on an embedded computer anywhere.

To be clear, the Raspberry Pi is perhaps 10 to 20 times slower at running the Wolfram Language than a typical current-model laptop (and sometimes even slower when it’s lacking architecture-specific internal libraries). But for many things, the speed of the Raspberry Pi is just fine. And for example, my old test of computing 1989^1989 that used to take many seconds on the computers that existed when *Mathematica* was young now runs in an immeasurably short time on the Raspberry Pi.

From a software engineering point of view, what’s being bundled with the Raspberry Pi is a pilot version of our new Wolfram Engine. Then there are two applications on the Pi powered by this engine. The first is a command-line version of the Wolfram Language. And the second is *Mathematica* with its notebook user interface, providing in effect a rich document-based way of interacting with the Wolfram Language.

The command-line Wolfram Language is quite zippy on the Raspberry Pi. The full notebook interface to *Mathematica*—requiring as it does the whole X Window stack—can be a trifle sluggish by modern standards (and we had to switch a few things off by default, like our new Predictive Interface, because they just slowed things down too much). But it’s still spectacular: the first time *Mathematica* has been able to run at all on anything like a $25 computer.

And it’s the whole system. Nothing is left out. All 5000+ Wolfram Language functions. All capabilities of *Mathematica* and its notebook interface.

For me, one of the most striking things about having all this on the Raspberry Pi is how it encourages me to try a new style of real-world-connected computing. For a start, it’s easy to connect devices to a Pi. And a Pi is small and cheap enough that I can put it almost anywhere. And if I start a Wolfram Language program on it, it’s reliable enough that I can expect it to pretty much go on running forever—analyzing and uploading sensor data, controlling an autonomous system, analyzing and routing traffic, or whatever.

Building in as much automation as possible has been a longstanding principle of mine for the Wolfram Language. And when it comes to external devices, this means consistently curating properties of devices, and then setting up general symbolic functions for interacting with them.

Here’s how one would take this whole technology stack and use it to switch on LEDs by setting voltages on GPIO pins:
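(The original code isn’t reproduced in this transcript; here’s a minimal sketch of the kind of thing involved. The pin numbers are illustrative—they depend on how the LEDs are actually wired to the Pi’s GPIO header:)

```
(* Set three GPIO pins high to switch the LEDs on; pin numbers are illustrative *)
DeviceWrite["GPIO", {4 -> 1, 17 -> 1, 27 -> 1}];
Pause[2];

(* And set them low again to switch the LEDs off *)
DeviceWrite["GPIO", {4 -> 0, 17 -> 0, 27 -> 0}];
```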

And here’s some image analysis on a selfie taken by a RaspiCam:
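(Again the original code is missing from this transcript; a minimal sketch of the idea, using the symbolic camera device and a couple of built-in image-analysis functions:)

```
(* Grab a frame from the Raspberry Pi camera *)
img = DeviceRead["RaspiCam"];

(* A couple of simple built-in image-analysis operations on the captured frame *)
EdgeDetect[img]
DominantColors[img]
```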

Something we’re releasing alongside the Raspberry Pi bundle is a Remote Development Kit, that allows one to develop code and maintain a notebook interface on a standard laptop or other computer, while seamlessly executing code on a networked remote Raspberry Pi. The current RDK connects to a copy of *Mathematica* (such as *Mathematica Student Edition*) running on any Mac, PC or Linux machine; soon there will be other options, for example on the web.

Within the Wolfram Language there’s actually a whole emerging structure for symbolically representing remote running language instances—and for collecting results, dispatching commands, doing computations in parallel, and so on. We’re also going to have WolframLink (derived from the *MathLink* protocol that we’ve used for nearly 25 years), that’ll let one exchange code, data or anything else in a very flexible way.

I’m very excited to see what kinds of things people invent with the Wolfram Language on the Raspberry Pi—and I look forward to reading about some of them in the Wolfram+Raspberry Pi section on Wolfram Community, as well as on the Raspberry Pi Foundation website.

In the next few months, it’s all going to get more and more interesting. What we’re releasing today on the Raspberry Pi is just the first pilot for the Wolfram Language. There’ll be many updates, particularly as we approach the first production release of the language.

As with Wolfram|Alpha on the web, the Wolfram Language (and *Mathematica*) on the Raspberry Pi are going to be free for anyone to use for personal purposes. (There’s also going to be a licensing mechanism for commercial uses, other Linux ARM systems, and so on.)

As a footnote to history, I might mention that the Raspberry Pi is only the second computer ever on which *Mathematica* has been bundled for free use. (Not counting, of course, all the computers at universities with site licenses, etc.) The first was Steve Jobs’s NeXT computer in 1988.

I still regularly run into people today who tell me how important *Mathematica* on the NeXT was for them. Not to mention the gaggle of NeXT computers that were bought by CERN for physicists to run *Mathematica*—but ended up being diverted to invent the web.

What will be done with the millions of instances of the Wolfram Language that are bundled on Raspberry Pi computers around the world? Maybe some amazing and incredibly important invention will be made with them. Maybe some kid somewhere will be inspired, and will go on to change the world.

But one thing is clear: with the Wolfram Language on Raspberry Pi we’ve got a new path for learning programming—and connecting it to the real world—that a great many people are going to be able to benefit from. And I am very pleased to have been able to do my part to make this happen.


But recently something amazing has happened. We’ve figured out how to take all these threads, and all the technology we’ve built, to create something at a whole different level. The power of what is emerging continues to surprise me. But already I think it’s clear that it’s going to be profoundly important in the technological world, and beyond.

At some level it’s a vast unified web of technology that builds on what we’ve created over the past quarter century. At some level it’s an intellectual structure that actualizes a new computational view of the world. And at some level it’s a practical system and framework that’s going to be a fount of incredibly useful new services and products.

I have to admit I didn’t entirely see it coming. For years I have gradually understood more and more about what the paradigms we’ve created make possible. But what snuck up on me is a breathtaking new level of unification—that lets one begin to see that all the things we’ve achieved in the past 25+ years are just steps on a path to something much bigger and more important.

I’m not going to be able to explain everything in this blog post (let’s hope it doesn’t ultimately take something as long as *A New Kind of Science* to do so!). But I’m excited to begin to share some of what’s been happening. And over the months to come I look forward to describing some of the spectacular things we’re creating—and making them widely available.

It’s hard to foresee the ultimate consequences of what we’re doing. But the beginning is to provide a way to inject sophisticated computation and knowledge into everything—and to make it universally accessible to humans, programs and machines, in a way that lets all of them interact at a vastly richer and higher level than ever before.

A crucial building block of all this is what we’re calling the Wolfram Language.

In a sense, the Wolfram Language has been incubating inside *Mathematica* for more than 25 years. It’s the language of *Mathematica*, and CDF—and the language used to implement Wolfram|Alpha. But now—considerably extended, and unified with the knowledgebase of Wolfram|Alpha—it’s about to emerge on its own, ready to be at the center of a remarkable constellation of new developments.

We call it the Wolfram Language because it is a language. But it’s a new and different kind of language. It’s a general-purpose knowledge-based language. That covers all forms of computing, in a new way.

There are plenty of existing general-purpose computer languages. But their vision is very different—and in a sense much more modest—than the Wolfram Language. They concentrate on managing the structure of programs, keeping the language itself small in scope, and relying on a web of external libraries for additional functionality. In the Wolfram Language my concept from the very beginning has been to create a single tightly integrated system in which as much as possible is included right in the language itself.

And so in the Wolfram Language, built right into the language, are capabilities for laying out graphs or doing image processing or creating user interfaces or whatever. Inside there’s a giant web of algorithms—by far the largest ever assembled, and many invented by us. And there are then thousands of carefully designed functions set up to use these algorithms to perform operations as automatically as possible.

Over the years, I’ve put immense effort into the design of the language. Making sure that all the different pieces fit together as smoothly as possible. So that it becomes easy to integrate data analysis here with document generation there, with mathematical optimization somewhere else. I’m very proud of the results—and I know the language has been spectacularly productive over the course of a great many years for a great many people.

But now there’s even more. Because we’re also integrating right into the language all the knowledge and data and algorithms that are built into Wolfram|Alpha. So in a sense inside the Wolfram Language we have a whole computable model of the world. And it becomes trivial to write a program that makes use of the latest stock price, computes the next high tide, generates a street map, shows an image of a type of airplane, or a zillion other things.
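For instance, here are a few illustrative one-liners using long-standing built-in data functions; the particular ticker and place names are just examples:

```
(* Latest available stock price for a ticker symbol *)
FinancialData["AAPL"]

(* Current temperature at a named place, resolved against the built-in knowledgebase *)
WeatherData["Austin", "Temperature"]

(* Country-level knowledge is a function call away *)
CountryData["France", "Population"]
```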

We’re also getting the free-form natural language of Wolfram|Alpha. So when we want to specify a date, or a place, or a song, we can do it just using natural language. And we can even start to build up programs with nothing more than natural language.

There are so many pieces. It’s quite an array of different things.

But what’s truly remarkable is how they assemble into a unified whole.

Partly that’s the result of an immense amount of work—and discipline—in the design process over the past 25+ years. But there’s something else too. There’s a fundamental idea that’s at the foundation of the Wolfram Language: the idea of symbolic programming, and the idea of representing everything as a symbolic expression. It’s been an embarrassingly gradual process over the course of decades for me to understand just how powerful this idea is. That there’s a completely general and uniform way to represent things, and that at every level that representation is immediately and fluidly accessible to computation.

It can be an array of data. Or a piece of graphics. Or an algebraic formula. Or a network. Or a time series. Or a geographic location. Or a user interface. Or a document. Or a piece of code. All of these are just symbolic expressions which can be combined or manipulated in a very uniform way.

But in the Wolfram Language, there’s not just a framework for setting up these different kinds of things. There’s immense built-in curated content and knowledge in each case, right in the language. Whether it’s different types of visualizations. Or different geometries. Or actual historical socioeconomic time series. Or different forms of user interface.

I don’t think any description like this can do the concept of symbolic programming justice. One just has to start experiencing it. Seeing how incredibly powerful it is to be able to treat code like data, interspersing little programs inside a piece of graphics, or a document, or an array of data. Or being able to put an image, or a user interface element, directly into the code of a program. Or having any fragment of any program immediately be runnable and meaningful.

In most languages there’s a sharp distinction between programs, data, and the output of programs. Not so in the Wolfram Language. It’s all completely fluid. Data becomes algorithmic. Algorithms become data. There’s no distinction needed between code and data. And everything becomes both intrinsically scriptable and intrinsically interactive. And there’s both a new level of interoperability, and a new level of modularity.
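As a minimal illustration of that fluidity, here's a sketch assuming the released Wolfram Language:

```wolfram
(* Everything, including code, is a symbolic expression with one uniform structure *)
FullForm[{1, x^2, Red}]       (* List[1, Power[x, 2], RGBColor[1, 0, 0]] *)

(* Code as data: hold a computation, transform it like data, then run it *)
expr = Hold[2 + 3];
ReleaseHold[expr /. 3 -> 30]  (* the 3 becomes 30 before evaluation: 32 *)
```

The same `/.` replacement that rewrites data here rewrites a piece of unevaluated code, because both are just expressions.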

So what does all this mean? The idea of universal computation implies that in principle any computer language can do the same as any other. But not in practice. And indeed any serious experience of using the Wolfram Language is dramatically different from using any other language. Because there’s just so much already there, and the language is immediately able to express so much about the world. Which means that it’s immeasurably easier to actually achieve some piece of functionality.

I’ve put a big emphasis over the years on automation. So that the Wolfram Language does things automatically whenever you want it to. Whether it’s selecting an optimal algorithm for something. Or picking the most aesthetic layout. Or parallelizing a computation efficiently. Or figuring out the semantic meaning of a piece of data. Or, for that matter, predicting what you might want to do next. Or understanding input you’ve given in natural language.

Fairly recently I realized there’s another whole level to this. Which has to do with the actual deployment of programs, and connectivity between programs and devices and so on. You see, like everything else, you can describe the infrastructure for deploying programs symbolically—so that, for example, the very structure and operation of the cloud becomes data that your programs can manipulate.

And this is not just a theoretical idea. Thanks to endless layers of software engineering that we’ve done over the years—and lots of automation—it’s absolutely practical, and spectacular. The Wolfram Language can immediately describe its own deployment. Whether it’s creating an instant API, or putting up an interactive web page, or creating a mobile app, or collecting data from a network of embedded programs.
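Here's a hedged sketch of what "describing its own deployment" looks like, using `APIFunction` and `CloudDeploy` as they exist in the released language; at the time of this talk they were still unreleased:

```wolfram
(* Symbolic deployment: the API itself is just an expression to manipulate *)
api = APIFunction[{"x" -> "Number"}, #x^2 &];
CloudDeploy[api, "square"]
(* returns a CloudObject; requesting its URL with ?x=5 yields 25 *)
```

The `api` value can be stored, transformed, or passed around like any other expression before being deployed.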

And what’s more, it can do it transparently across desktop, cloud, mobile, enterprise and embedded systems.

It’s been quite an amazing thing seeing this all start to work. And being able to create tiny programs that deploy computation across different systems in ways one had never imagined before.

This is an incredibly fertile time for us. In a sense we’ve got a new paradigm for computation, and every day we’re inventing new ways to use it. It’s satisfying, but more than a little disorienting. Because there’s just so much that is possible. That’s the result of the unique convergence of the different threads of technology that we’ve been developing for so long.

Between the Wolfram Language—with all its built-in computation and knowledge, and ways to represent things—and our Universal Deployment System, we have a new kind of universal platform of incredible power. And part of the challenge now is to find the best ways to harness it.

Over the months to come, we’ll be releasing a series of products that support particular ways of using the Wolfram Engine and the Universal Platform that our language and deployment system make possible.

There’ll be the Wolfram Programming Cloud, which allows one to create Wolfram Language programs, then instantly deploy them in the cloud through an instant API, or a form-based app, or whatever. Or deploy them in a private cloud, or, for example, through a Function Call Interface, deploy them standalone in desktop programs and embedded systems. And have a way to go from an idea to a fully deployed realization in an absurdly short time.

There’ll be the Wolfram Data Science Platform, which allows one to connect to all sorts of data sources, then use the kind of automation seen in Wolfram|Alpha Pro, then pick out and modify Wolfram Language programs to do data science—and then use CDF to set up reports to generate automatically, on a schedule, through an API, or whatever.

There’ll be the Wolfram Publishing Platform that lets you create documents, then insert interactive elements using the Wolfram Language and its free-form linguistics—and then deploy the documents, on the web using technologies like CloudCDF, that instantly support interactivity in any web browser, or on mobile using the Wolfram Cloud App.

And we’ll be able to advance *Mathematica* a lot too. Like there’ll be *Mathematica* Online, in which a whole *Mathematica* session runs on the cloud through a web browser. And on the desktop, there’ll be seamless integration with the Wolfram Cloud, letting one have things like persistent symbolic storage, and instant large-scale parallelism.

And there’s still much more; the list is dauntingly long.

Here’s another example. Just as we curate all sorts of data and algorithms, so also we’re curating devices and device connections. So that built into the Wolfram Language, there’ll be mechanisms for communicating with a very wide range of devices. And with our Wolfram Embedded Computation Platform, we’ll have the Wolfram Language running on all sorts of embedded systems, communicating with devices, as well as with the cloud and so on.

At the center of everything is the Wolfram Language, and we intend to make this as widely accessible to everyone as possible.

The Wolfram Language is a wonderful first language to learn (and we’ve done some very successful experiments on this). And we’re planning to create a Programming Playground that lets anyone start to use the language—and through the Programming Cloud even step up to make some APIs and so on for free.

We’ve also been building the Wolfram Course Authoring Platform, which does major automation of the process of going from a script to all the elements of an online course—then lets one deploy the course in the cloud, so that students can have immediate access to a Wolfram Language sandbox, to be able to explore the material in the course, do exercises, and so on. And of course, since it’s all based on our unified system, it’s for example immediate that data from the running of the course can go into the Wolfram Data Science Platform for analysis.

I’m very excited about all the things that are becoming possible. As the Wolfram Language gets deployed in all these different places, we’re increasingly going to be able to have a uniform symbolic representation for everything. Computation. Knowledge. Content. Interfaces. Infrastructure. And every component of our systems will be able to communicate with full semantic fidelity, exchanging Wolfram Language symbolic expressions.

Just as the lines between data, content and code blur, so too will the lines between programming and mere input. Everything will become instantly programmable—by a very wide range of people, either by using the Wolfram Language directly, or by using free-form natural language.

There was a time when every computer was in a sense naked—with just its basic CPU. But then came things like operating systems. And then various built-in languages and application programs. What we have now is a dramatic additional step in this progression. Because with the Wolfram Language, we can in effect build into our computers a vast swath of existing knowledge about computation and about the world.

If we’re forming a kind of global brain with all our interconnected computers and devices, then the Wolfram Language is the natural language for it. Symbolically representing both the world and what can be created computationally. And, conveniently enough, being efficient and understandable for both computers and humans.

The foundations of all of this come from decades spent on *Mathematica*, and Wolfram|Alpha, and *A New Kind of Science*. But what’s happening now is something new and unexpected. The emergence, in effect, of a new level of computation, supported by the Wolfram Language and the things around it.

So far I can see only the early stages of what this will lead to. But already I can tell that what’s happening is our most important technology project yet. It’s a lot of hard work, but it’s incredibly exciting to see it all unfold. And I can’t wait to go from “Coming Soon” to actual systems that people everywhere can start to use…

Today it’s exactly a quarter of a century since we launched *Mathematica* 1.0 on June 23, 1988. Much has come and gone in the world of computing since that time. But I’m pleased to say that through all of it *Mathematica* has just kept getting stronger and stronger.

A quarter century ago I worked very hard to lay the best possible foundations for *Mathematica*—and to define principles and structures on which I thought *Mathematica* could be grown far into the future. And looking back now I have to say that all that effort paid off far better than I could ever have imagined.

I have always insisted that *Mathematica* be a system without compromises: a system where everything is designed and built right. Often there have been immense technical challenges in doing this. And sometimes it’s taken years. But the result has been the creation of an ever more impressive and unique structure, able to grow and advance without bound.

Twenty-five years is a long time in the history of technology. And it’s extremely rare for a technology project to maintain a clear and consistent direction for that long. No doubt part of what’s made it possible for *Mathematica* is that I’ve personally been able to continue to provide leadership over all those years. But what’s also been critical is that we’ve been able to build Wolfram Research into a long-term company that can focus on long-term goals.

One might have thought that any technology that existed 25 years ago would by now look old and clunky. But not *Mathematica*. The interface to *Mathematica* 1.0 had to deal with dial-up connections and one-megabyte memory limitations. But from the very beginning I tried hard to make sure that the underlying language—what we now call “the Wolfram Language”—was pure and timeless. And while the Wolfram Language has grown and broadened immeasurably over the last quarter century, the core of it is still what was in *Mathematica* 1.0—and it looks as fresh and modern as ever.

My goal in building *Mathematica* was an ambitious one: to create once and for all a single unified computational system that could eventually handle all forms of algorithmic work. Mathematics was an early target area (hence the name “Mathematica”). But the real goal—and the core design of the system—was much broader. And over the past 25 years the system has grown at an accelerating rate to encompass more and more.

Throughout everything we’ve followed a principle that I defined right at the beginning: that whatever is added, everything across the whole system must always fit together in a coherent way. Often it’s taken great effort to achieve this. But the payoff has been spectacular. Not only has it kept an ever-larger system easy to learn and use; it’s also made possible a kind of combinatorial growth in the power of the whole system—with each new part routinely able to combine capabilities from every other part.

It’s been exciting to watch the growth of *Mathematica* over the past 25 years. To see so many new areas covered in each successive version. And to see what is now an immense algorithmic edifice emerge: a giant interconnected web of algorithms and capabilities quite unlike anything even imagined before.

Some of the algorithms in *Mathematica* are ones that were already known. But increasingly they’re ones we’ve invented. Sometimes by making use of sophisticated functionality from other parts of *Mathematica*. And sometimes by using methods like automated algorithm discovery from *A New Kind of Science*. But increasingly what we need are not just algorithms, but meta-algorithms—that automatically select between different algorithms based on a host of criteria from efficiency to aesthetics.

Automation has always been a guiding principle of *Mathematica*. Users define what they want to achieve. Then the idea is that it’s up to *Mathematica* to figure out—as automatically as possible—how best to achieve it. At first it wasn’t clear how far automation could go. But with every new version of *Mathematica*, we’ve found ways to automate more and more. In effect making *Mathematica* a higher and higher level system.
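Two small examples of that automation, as it works in the released system:

```wolfram
(* No method, precision, or sampling settings are specified;
   Mathematica chooses them automatically *)
NIntegrate[Sin[Sin[x]], {x, 0, 2}]   (* numerical method selected automatically *)
Plot[Sin[x^2], {x, 0, 5}]            (* adaptive sampling chosen automatically *)
```

In both cases the user states only the *what*; the *how* is decided internally, and can be overridden with explicit options if needed.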

Back when *Mathematica* was young there was sometimes a tension: should one use *Mathematica* because it’s easy and general, or should one use some special-purpose system that’s specifically optimized for a particular kind of work? Well, over the past decade or so, even when it comes to efficiency, special-purpose systems have lost their edge. Because with all its capabilities *Mathematica* can just implement vastly better algorithms.

Looking at the development of *Mathematica* over the past 25 years, I see a mixture of inexorable progress, and surprises. Often pieces of ever more sophisticated functionality simply have to be built layer-by-layer over the course of many years. And sometimes it takes advances in hardware and interfaces—or in people’s familiarity with some concept or another—to make progress possible. But then there are surprises. Where given what exists, one can suddenly see some new possibility that one never imagined before.

*Mathematica* will never be truly finished. There will always be more to add to it, more to automate, more intellectual structures to discover. Often over the years I’ll be thinking about some area or another and wonder whether it’ll ever be possible to integrate it into *Mathematica*. But the remarkable experience that I’ve had over and over again is that eventually the answer will turn out to be yes. Sometimes it’ll require significant conceptual breakthroughs. But usually the result is that by building on the principles and foundations of *Mathematica* one can create something uniquely clear and powerful—that often for the first time “consumerizes” a particular area to the point where it can routinely be used.

As a software engineering achievement, *Mathematica* can undoubtedly be considered one of the great codebases of our time. But more than that, I think it stands as a major intellectual achievement: a unique representation and clarification of the scope and concept of computation.

*Mathematica* long outgrew its math-related name. And today in fact it stands at a turning point. Its content and capabilities make it relevant to a huge range of applications. And now the ambient technology of our times—cloud, mobile, and more—will finally allow it to be conveniently deployed in a quite ubiquitous way. Already *Mathematica* is at the core of CDF, Wolfram|Alpha and everything that is done with them. But there is vastly more to come.

It has been wonderful to see over the past 25 years so many ways that *Mathematica* has contributed to invention, discovery and education in the world. But I suspect that all that has happened so far will pale in comparison to what the future holds. We have spent more than a quarter of a century building up the unique structure that is *Mathematica* today. And in some ways it has taken the better part of 25 years to realize just how strong and important what we have is—and to get to the point where it can begin to realize its full potential.

For me and our team *Mathematica* is much more than a product. It is a mission. To create the means to bring the power of computation and computational knowledge to as much of our world as possible. And to create something that will stand as a broad and enduring contribution to civilization.

I am proud of what we have been able to achieve with *Mathematica* in the last 25 years. And it is a source of great encouragement to read remarks from people whose lives *Mathematica* has touched. But as I look to the future, I realize that in many ways we are just getting started. And humbling though it is after spending nearly half my life so far devoted to *Mathematica*, it is inevitable that in time the quarter century just passed will seem like just a small part of the development of *Mathematica*.

But today I am pleased to celebrate the first 25 years of *Mathematica*. And, yes, in recognition of the many seemingly impossible challenges that we have overcome in the development of *Mathematica*, the object at the top is what might seem like an impossible geometrical object: a 25-pointed “spikey”.

It’s been a great first 25 years for *Mathematica*. It’s been a pleasure and privilege to be part of it. And I look forward with great anticipation to the years to come.

In a few weeks it’ll be 25 years ago: June 23, 1988—the day *Mathematica* was launched.

Late the night before we were still duplicating floppy disks and stuffing product boxes. But at noon on June 23 there I was at a conference center in Santa Clara starting up *Mathematica* in public for the first time:

(Yes, that was the original startup screen, and yes, *Mathematica* 1.0 ran on Macs and various Unix workstation computers; PCs weren’t yet powerful enough.)

People were pretty excited to see what *Mathematica* could do. And there were pretty nice speeches about the promise of *Mathematica* from a spectrum of computer industry leaders, including Steve Jobs (then at NeXT), who was kind enough to come even though he hadn’t appeared in public for a while. And someone at the event had the foresight to get all the speakers to sign a copy of the book, which had just gone on sale that day at bookstores all over the country:

So much has happened with *Mathematica* in the quarter century since then. What began with *Mathematica* 1.0 has turned into the vast system that is *Mathematica* today. And as I look at the 25th Anniversary Scrapbook, it makes me proud to see how many contributions *Mathematica* has made to invention, discovery and education:

But to me what’s perhaps most satisfying is how the fundamental principles on which I built *Mathematica* have stood the test of time. And how the core ideas and language that were in *Mathematica* 1.0 persist today (and yes, most *Mathematica* 1.0 code will still run unchanged today).

But, OK, where did *Mathematica* come from? How did it come to be the way it is? It’s a long story, really. Deeply entwined with my own personal story. But particularly as I look to the future, I find it interesting to understand how things have evolved from all that history.

Perhaps the first faint glimmering of an orientation toward something like *Mathematica* came when I was about 6 years old—and realized that I could “automate” those tedious addition sums I was being given, by creating an “addition slide rule” out of two rulers. I never liked calculational math, and was never good at it. But starting around the age of 10, I became increasingly interested in physics—and doing physics required doing math.

Electronic calculators arrived on the scene when I was 12—and I immediately became an enthusiast. And around the same time, I started using my first computer—an object the size of a large desk, with 8 kilowords of 18-bit memory, programmed mostly in assembler using paper tape. I tried doing physics with it, to no great success. But by the time I was 16, I had published a few physics papers, left high school, and was working at a British government lab. “Real” theoretical physicists basically didn’t use computers in those days. But I did. Alternating between an HP desk calculator (with a plotter!) and an IBM mainframe programmed in Fortran.

I was basically just doing numerics, though. But in the physics I wanted to do, there was all sorts of algebra. And not just a little algebra. Huge amounts. Expressions from Feynman diagrams with hundreds or thousands of terms, all of which had to be precisely right if one was going to get the right answer.

I wondered what to do. I imagined spending my life chasing minus signs and factors of 2. But then I started thinking about using a computer to help. And right then someone told me that other people had had that idea too. There were three programs that I found out about—all of which, as it turned out, had started some 14 years earlier from a single conversation at CERN in 1962: Reduce (written in LISP), Ashmedai (written in Fortran) and Schoonschip (written in CDC 6000 assembler).

The programs were specialized, and it wasn’t clear how many people other than their authors had ever used them seriously. They were pretty clunky to use: typically you’d submit a deck of cards, and then some time later get back a result—or more often a cryptic error message. But I managed to start doing physics with them.

Then in the summer of 1977 I discovered the ARPANET, or what’s now the internet. There were only 256 hosts on it back then. And @O 236 went to an open computer at MIT that ran a program called Macsyma—that did algebra, and could be used interactively. I was amazed so few people used it. But it wasn’t long before I was spending most of my days on it. I developed a certain way of working—going back and forth with the machine, trying things out and seeing what happened. And routinely doing weird things like enumerating different algebraic forms for an integral—then just “experimentally” seeing which differentiated correctly.

My physics papers started containing all sorts of amazing formulas. And not imagining that I could be using a computer, people started thinking that I must be some kind of great human algebraic calculator. I got more and more ambitious, trying to do more and more with Macsyma. Pretty soon I think I was its largest user. But sometime in 1979 I hit the edge; I’d outgrown it.

And then it was November 1979. I was 20 years old, and I’d just gotten my PhD in physics. I was spending a few weeks at CERN, planning my future in (as I believed) physics. And one thing I concluded was that to do physics well, I’d need something better than Macsyma. And after a little while I decided that the only way I’d really have a chance to get what I wanted was if I built it myself.

And so it was that I embarked on what would become SMP (the “Symbolic Manipulation Program”). I had a pretty broad knowledge of other computer languages of the time, both the “ordinary” ALGOL-like procedural ones, and ones like LISP and APL. At first as I sketched out SMP, my designs looked a lot like what I’d seen in those languages. But gradually, as I understood more about how different SMP had to be, I started just trying to invent everything myself.

I think I had some pretty good ideas. And actually even some of my early SMP design documents have a remarkably *Mathematica*-like flavor to them:

Looking back at its documentation, SMP was quite an impressive system, especially given that I was only 20 years old when I started designing it. But needless to say, not every idea in SMP was good. And as a long-time connoisseur of language design, I can’t resist at the bottom of this post mentioning a few of my “favorite” mistakes.

Even in my early designs, SMP was a big system. But for whatever reason, I didn’t find that at all daunting. I just wanted to go ahead and implement it. I wanted to make sure I did everything as well as possible. And I remember thinking: “I don’t officially know computer science; I’d better learn it”. So I went to the bookstore, and bought every book I could find on computer science—the whole half shelf of them. And proceeded to read them all.

I was working at Caltech back then. And I invited everyone I could find from around the world who’d worked on any related system to come give a talk. I put together a little “working group” at Caltech—which for a while included Richard Feynman. And I started recruiting people from around the campus to work on the “SMP Project”.

A big early decision was what language SMP should be written in. Macsyma was written in LISP, and lots of people said LISP was the only possibility. But a young physics graduate student named Rob Pike convinced me that C was the “language of the future”, and the right choice. (Rob went on to do all sorts of things, like invent the Go language.) And so it was that early in 1980, the first lines of C code for SMP were written.

The group that worked on SMP was an interesting one. My first recruit was Chris Cole, who’d worked at IBM and become an APL enthusiast, and went on to found a rather successful company called Peregrine Systems. Then there were students with a variety of different skills, and a programming-enthusiast professor who’d been a collaborator of mine on some physics papers. There was some eccentricity along the way, of course. Like the person who wrote very efficient code, all on one line, with functions colorfully named so their combinations would read as little jokes. Or the quite brilliant undergraduate who worked so hard on the project that he failed all his classes, then promised he wouldn’t touch a computer—but was soon found dictating code to someone else.

I wrote lots of code for SMP myself (about 1000 lines/day). I did the design. And I wrote most of the documentation. I’d never managed a large project before. But somehow that part never seemed very difficult. And sure enough, by June 1981, SMP Version 1 was running—and even looking a bit like *Mathematica*:

For its time, SMP was a very big software system (though its executable was just under a megabyte). Its original purpose was to do mathematical computation. But along the way I realized that even to do that well, I had to create a whole, rather general, symbolic language. I suppose I saw it as being a bit like physics—but instead of dealing with elementary particles, I was trying to find the elementary components of computation. I developed a kind of aesthetic: always try to pack the largest capability into the smallest number of primitives. Sometimes I would puzzle for weeks about how to do something—but in the end I’d come up with a design, then implement it.

I understood the idea that everything could be represented by symbolic expressions, although the whole business of symbolically indexed lists prevented SMP from having the notion of “expression heads” that’s so clean in *Mathematica*. And there was definitely some funkiness in the internal implementation of symbolic expressions—most notably bizarre ideas about storing all numbers in floating point. (Tini Veltman, author of Schoonschip, and later winner of a physics Nobel Prize, had told me that storing numbers in floating point was one of the best decisions he ever made, because FPUs were so much faster at arithmetic than ALUs.)

Before SMP, I’d written lots of code for systems like Macsyma, and I’d realized that something I was always trying to do was to say “if I have an expression that looks like this, I want to transform it into one that looks like this”. So in designing SMP, transformation rules for families of symbolic expressions represented by patterns became one of the central ideas. It wasn’t nearly as clean as in *Mathematica*, and there were definitely some funky and far-out ideas. But a lot of the core elements were already there.
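In today's Wolfram Language, the descendant of that idea looks like this (a classic textbook-style example, not code from SMP itself):

```wolfram
(* "If it looks like this, transform it into this": a rule with patterns *)
log[x_ y_] := log[x] + log[y]

log[a b c]   (* the pattern matches repeatedly: log[a] + log[b] + log[c] *)
```

The pattern `x_ y_` describes a whole family of expressions (any product), and the definition says how every member of that family should be transformed.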

And in the end, the table of contents from the SMP Version 1.0 documentation from 1981 had a fair degree of modernity:

Yes, “graphical output” is relegated to a small section, alongside “memory management”. And there are the charming “programming impasses” (i.e. system hangs), as well as “statistical expression generation” (i.e. making random expressions). But “parallel processing” is already there, along with “program construction” (i.e. code generation). (SMP even had a way of creating C code, compiling it, and, very scarily, dynamically linking it into the running SMP executable.) And there were lots of mathematical functions, and mathematical operations—though vastly less powerful than in *Mathematica*.

But, OK. So SMP 1.0 was running. What should be done with it? It was pretty clear there were lots of people who would find it useful. It only ran on quite big computers—so-called “minicomputers”, like the VAX, which was the size of several large refrigerators, and cost a few hundred thousand dollars. But still, I knew there were plenty of research and engineering organizations that had such machines.

I really didn’t know anything about companies or business at the time. But I did understand that it cost money to pay people to work on SMP, and it seemed pretty obvious that a good way to get that money was to sell copies of SMP. My first idea was to go to what would now be called the “technology transfer office” at Caltech, and see if they could help. At the time, the office essentially consisted of one pleasant old chap. But after a few attempts, it became clear he really didn’t know what to do. I asked him how this could be, given that I assumed similar things must come up all the time at Caltech. “Well”, he said, “the thing is that faculty members mostly just go off and start companies themselves, so we never get involved”. “Oh”, I said, “can I do that?”. And he leafed through the bylaws of the university and said: “Software is copyrightable, the university doesn’t claim ownership of copyrights—so, yes, you can”.

And so off I went to start a company. But it wasn’t as simple as that. Because a little while later the university administration suddenly decided that, no, it wasn’t OK. It got very weird—and scurrilous (“give me a cut, and I’ll sign off on this”, etc.). Richard Feynman and Murray Gell-Mann interceded on my behalf. The president of the university didn’t seem to know what to do. And for a while everything was completely stuck. But eventually we agreed that the university would license whatever rights they might have—even though they were (very foolishly, as it later turned out when they tried to recruit computer science faculty) changing their bylaws about software.

As it happened there was one “last problem” though, raised by the then-provost of the university. He claimed that having a license in place between the university and the company created a conflict of interest if I worked at the university and owned part of the company. “OK”, I said, “that’s easy to resolve: I’ll quit the university”. That seemed to come as a big surprise. But quit I did, and moved to the Institute for Advanced Study in Princeton, where, as the then-director pointed out, they’d “given away the computer” when John von Neumann died, so they couldn’t really be too worried about intellectual property.

For years, I’d wondered what had actually been going on at Caltech. And as it happens, just a couple of weeks ago, I agreed to visit Caltech again (to get a “distinguished alumnus award”), and having lunch at the faculty club there—I discovered that at the next table was none other than the former provost of Caltech, now about to turn 95. I was very impressed at his immediate and deep recall of what he called “the Wolfram Affair” (was he “warned”?), and the conversation we had finally explained things a bit better.

Frankly, it was more bizarre than I could have possibly imagined. The story in a sense began in the 1930s, when Arnold Beckman was at Caltech, invented the pH meter, and left to found Beckman Instruments. By 1981, Beckman was a major donor to Caltech, and the chairman of its board of trustees. Meanwhile, the chairman of its biology department (Lee Hood) was inventing the gene sequencer. He’s told me he tried many times to interest Beckman Instruments in it, but failed, and so started his own company (Applied Biosystems), which became very successful. At some moment, I’m told, Arnold Beckman got upset, and told the administration that they needed to “stop IP walking off campus”. Well, it turned out that the only thing of relevance happening on campus right then was none other than my SMP project. Which the then-provost said he thought he had a duty to “deal with”. (Well, he was also a chemist, who Feynman and Gell-Mann, as physicists, claimed had a “thing about physicists”, etc.)

But notwithstanding this whole adventure, the company that I named Computer Mathematics Corporation got started. At the time, I still thought of myself as a young academic, and didn’t imagine that I’d know how to run a company. So I brought in a CEO, who happened to be about twice my age. And at the behest of the CEO and some venture capitalists, the company arranged to merge with a startup that was doing what they thought was going to be really hot artificial intelligence R&D.

Meanwhile, SMP began to be sold under the banner of “mathematics by computer”:

There were horrible missteps. CEO: “Let’s build a workstation computer to run SMP”; me: “No, we’re a software company, and I’ve seen this Stanford University Network (SUN) system that’s going to be better than anything we can build”. And then there were the charmingly misguided agency-created ads:

And pretty soon I decided the whole thing was too frustrating. SMP remained something of a cash cow, and although the CEO wasn’t good at making money, he was good at raising it, going through a dizzying number of investment rounds—until there was finally an undistinguished IPO many years later.

I was meanwhile having a terrific time doing basic science, and discovering things that laid the foundations for *A New Kind of Science*. And in fact SMP turned out to be a crucial precursor to what I did. Because it was my success in inventing computational primitives for the language of SMP that got me thinking about inventing computational primitives for nature—and building a science from studying the consequences of those primitives.

You might ask what happened to SMP. It continued to be sold until sometime after *Mathematica* was released. None of its code was ever used for *Mathematica*. But occasionally I used to start it up, just to see how it “felt” compared to *Mathematica*. As time went by, it became harder to find computers that would run SMP. And perhaps 15 years ago, the last computer we had that could run SMP stopped working.

Well, I thought, I’d always been sent a personal copy of the SMP source code—though I hadn’t looked at it for ages. So now why not just recompile it on a modern system? But then I remembered: I’d had this “great” idea that we should keep the source code encrypted. But what was the key? I asked everyone I could think of. But nobody remembered.

It’s been years now, and I’d really like to see SMP run again. So here’s a challenge. This is the source for a C program encrypted like the SMP source code. Actually, it’s the source for the program that did the encryption: a version of the circa-1981 Unix crypt utility, “cleverly” modified by changing parameters etc. Can someone break the encryption? And finally free SMP from the strange digital time safe in which it’s been locked for so long. (Here’s what Wolfram|Alpha Pro has to say if one just uploads this raw file)

But back to the main story. I stopped working on SMP in 1983, and began alternating between basic science, software projects, and my (wonderfully educational) “hobby” of doing technology and strategy consulting. I used SMP a bit, but mostly I ended up writing lots and lots of C code, usually gluing together algorithms and graphics and interfaces.

The science that I’d started was going very well—and it was clear that there were lots of important things to do. But instead of trying to do it all myself, I decided I should try to get other people involved. And as part of that, I resolved to start a research institute—and got what amounted to bids from different universities for it. The University of Illinois was the winner, and so in August 1986 off I went there to start the Center for Complex Systems Research.

But by this point I was already getting concerned that my scheme of “other people doing the science” wasn’t so good. And within just a few weeks of arriving in Illinois I’d come up with plan B: build the best tools I could, and the best personal environment I could, and then try to do as much science as I could myself. And since I was pretty well plugged into the computer industry, I knew that powerful software systems would soon be able to run on the zillions of personal computers that were starting to appear. So I knew that if I could build something good, there’d be a good market for it, that would support an interesting company and environment.

And so it was that late in August 1986, I decided to try to build my ultimate computation system—that could do all the computations I wanted, or could imagine I would ever want.

And of course the result was *Mathematica*.

I knew a lot about what to do (and not do) from SMP and my other software experiences. But it was refreshing to be able to start from scratch, just trying to get the design right, without prior constraints. In SMP, algebraic computation had been the central goal. But in *Mathematica*, I wanted to cover lots of other areas too—numerics, graphics, programming, interfaces, whatever. I thought a lot about the foundations for the system, wondering for example whether things like the cellular automata I’d studied in my basic science could be relevant. But I just kept on coming back to the basic paradigm I’d already developed for SMP. Symbolic expressions and transformations for them seemed exactly right as a high-level, yet general, representation for computation.

If it hadn’t been for SMP, I would certainly have made a lot of mistakes. But SMP pretty much showed me what was important and what was not, and where the issues were. Looking through my archives today, I can see the painstaking process of puzzling through problems that I knew from SMP. And one by one coming up with solutions.

Meanwhile, just as for SMP, I’d assembled a team, and started the actual implementation of *Mathematica*. I’d also started a company—this time with me as CEO. Every day I’d write lots of code. (And to my chagrin, quite a bit of that code is still running in *Mathematica* today, especially in the pattern matcher and evaluator.) But my biggest focus was design. And following a practice I’d started with SMP, I wrote documentation as I developed the design. I figured if I couldn’t explain something clearly in documentation, nobody was ever going to understand it, and it probably wasn’t designed right. And once something was in the documentation, we knew both what to implement, and why we were doing it.

The first code for *Mathematica* was written in October 1986. And by the middle of 1987 *Mathematica* was beginning to come to life. I’d decided that the documentation should be published as a book, and hundreds of pages were already written. And I estimated that *Mathematica* 1.0 would be ready by April 1988.

My original plan for our company was to concentrate on R&D, and to distribute *Mathematica* primarily through computer manufacturers. Steve Jobs was the first to take *Mathematica* on, making a deal to bundle it with every one of his as-yet-unreleased NeXT computers. Deals with Sun, Silicon Graphics, IBM and a sequence of other companies followed. We started sending out a few beta copies of *Mathematica*. And—even though this was long before the web—word of its existence began to spread. Some media coverage started up too (I still like that kind of ice cream):

Sometime in the spring of 1988, we officially set June 23 as the release date for *Mathematica* (without Wolfram|Alpha, I didn’t know it was Alan Turing’s birthday, etc.). There was a lot to get ready. In those days releasing software didn’t just involve flipping a switch. Like I remember we were right down to the wire in getting *The Mathematica Book* printed. So I flew to Canada with a hard disk and personally babysat a phototypesetting machine for a long weekend, handing the box of film it produced to a person who met me at the airport in Boston and rushed it to the printer. But despite adventures like that, shortly before June 23 off were mailed some mysterious invitations:

And at noon on June 23 the room had filled, and we were ready to launch *Mathematica* into the world.

It’s been a great 25 years since then. The foundations that we laid in *Mathematica* 1.0—greatly informed by my earlier experiences—have proved incredibly robust, and we’ve been able to just build and build on them. My “plan B” of developing *Mathematica*, then using it to do science, worked out just great, and led to *A New Kind of Science*. And from *Mathematica*, we’ve been able to build a great company, as well as build things like Wolfram|Alpha. And over the course of 25 years, we’ve had the pleasure and privilege of seeing *Mathematica* contribute in all sorts of ways to many things in the world.

What was SMP like? Here are a few examples of SMP programs that I wrote for the SMP documentation:

In some ways these look quite similar to *Mathematica* programs—complete with `[...]` for functions, `{...}` for lists and `->` for rules. But somehow the readability that’s a hallmark of *Mathematica* isn’t there, and instead the SMP programs seem quite cryptic and obscure.

One of the most obvious problems is that SMP code is littered with `$` and `%` characters—appearing respectively as prefixes for pattern and local variables. In SMP, I hadn’t had the *Mathematica* idea of separating pattern constructs (such as `_`) from names (such as `x`). And I thought it was important to emphasize which variables were local—but didn’t have a subtle cue like color to do it with.

In SMP I’d already had the (good) idea of distinguishing immediate (=) and delayed (:=) assignment. But in a nod to languages like ALGOL, I indicated them by the rather obscure `:` and `::`. (For rules, `->` was the immediate form, as it is in *Mathematica*, while `-->` was the analog of `:>` and `S[...]` was the analog of `/.`)

In SMP, just like in *Mathematica*, I indicated built-in functions with capital letters (at the time it was a fairly new thing to distinguish upper and lowercase at all on a computer). But while *Mathematica* typically uses English words for function names, SMP used short—and often cryptic—abbreviations. When I was working on SMP, I was quite taken with the design of Unix, and wanted to emulate its practice of having short function names. That might have been OK if SMP had just a few functions. But with hundreds of functions with names like `Ps`, `Mei` and `Uspb`, things began to get pretty unreadable. Of course, back then, there was another issue: lots of users couldn’t type quickly—so that provided a motivation to have short function names.

It’s interesting to look at the SMP documentation today. SMP had plenty of good ideas—most of which I used again in *Mathematica*. But it also had some quite bad ideas—which happily aren’t part of *Mathematica*. One example of a bad idea—that even sounds bad as soon as one hears it—is “chameleonic symbols”: symbols that change their name whenever they’re used. (These were an attempt at localizing things like dummy variables, a bit like an over-automated form of `Module`.)

There were some much more subtle mistakes too. Like here’s one that in a sense came from trying to go too far in unifying the system. Like *Mathematica*, SMP had a notion of lists, like `{a,b,c}`. It also had functions, like `f[x]`. And in my effort to achieve the maximum possible unification, I thought that perhaps one could combine the notion of lists and functions.

Let’s say one has a list `v={a,b,c}`. (In SMP assignment was done with `:`, so this would have been written `v:{a,b,c}`.) Then for example in SMP `v[2]` would extract the second element in the list. But this notation looks a lot like asking for the value of a function `v` when its argument is 2. And this gave me the idea that perhaps one could generalize lists—to have not just integer-indexed elements, but elements with arbitrary symbolic indices.

In SMP, pattern variables (`x_` in *Mathematica*) were written `$x`. So when one defined a function `f[$x] : $x^2`, one could imagine that this was just defining `f` itself to have a value that was a symbolically indexed list: `{[$x]: $x^2}`. If you wanted to find out how a function was defined, you just typed its name—like `f`. And the value that came back would be the symbolically indexed list that represented the definition.

An ordinary vector-type list could be thought of as something like `{[1]:a, [2]:b, [3]:c}`. And one could mix in symbolic indices: `{[1]: 1, [$x]:$x f[$x-1]}`. There was also a certain unification with part numbering in general symbolic expressions. And at some level it all seemed rather nice. And to describe my unified concept of functions and lists, I called the `f` in `f[x]` a “projection”, and `x` a “filter”. (There were jokes about lists of definitions being “optical benches”.)
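
In modern terms, a symbolically indexed list is close to what other languages now call an associative array or dictionary. Here’s a rough sketch of the idea in Python (my reconstruction purely for illustration—SMP’s actual syntax and semantics were of course different):

```python
# A "symbolically indexed list" unifies positional lists with function
# definitions. A rough Python analogy (not actual SMP semantics):

# The ordinary list {a, b, c} as explicit index -> value pairs,
# i.e. {[1]:a, [2]:b, [3]:c}:
v = {1: 'a', 2: 'b', 3: 'c'}
assert v[2] == 'b'              # like SMP's v[2]

# Mixing a literal index with a general "symbolic index" rule, as in
# {[1]: 1, [$x]: $x f[$x-1]} -- the $x rule becomes a fallback:
literal_cases = {1: 1}

def f(x):
    if x in literal_cases:      # the [1]: 1 entry
        return literal_cases[x]
    return x * f(x - 1)         # the [$x]: $x f[$x-1] entry

assert f(4) == 24               # factorial, driven by the two "entries"
```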

But gradually cracks started appearing. It got pretty weird, for example, when one started making definitions like `v[2]:b, v[3]:c`. According to SMP’s conventions for assignments, `v` would then have value `{[3]:c, [2]:b}`. But what if one made a definition like `v[1]:a`? Well, then `v` suddenly had to reorder itself as `{a, b, c}`.

It got even weirder when one started dealing with multi-argument functions. It was quite nice that one could define a matrix with `m:{{a,b},{c,d}}`; then `m[1]` would be `{a,b}`, and either `m[1,1]` or `m[1][1]` would be `a`. But what if one had a function with several arguments? Would `f[x, y]` be the same as `f[x][y]`? Well, sometimes one wanted it that way, and sometimes not. So I had to come up with a property (“attribute” in *Mathematica*)—that I called `Tier`—to say for every function which way it should work. (Today more people might have heard of “currying”, but in those days this kind of distinction was really obscure.)
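
The distinction that `Tier` controlled is exactly what’s now called currying. A hypothetical Python rendering of the two readings (SMP itself looked nothing like this):

```python
# Two readings of a multi-argument call, which SMP's "Tier" property
# chose between (hypothetical Python rendering, not SMP code):

def f_flat(x, y):
    """Uncurried: f[x, y] takes both arguments at once."""
    return ('f', x, y)

def f_curried(x):
    """Curried: f[x] returns a function still awaiting y."""
    return lambda y: ('f', x, y)

# Sometimes one wants the two forms to mean the same thing...
assert f_curried(1)(2) == f_flat(1, 2)

# ...as with a matrix stored as nested lists, where m[1][1] is the
# natural way to reach an element (1-indexed here, as in SMP):
m = {1: {1: 'a', 2: 'b'}, 2: {1: 'c', 2: 'd'}}   # m:{{a,b},{c,d}}
assert m[1][1] == 'a'
```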

Symbolically indexed lists in SMP had some really powerful and elegant features. But in the end, when the whole system was built, there were just too many weirdnesses. And so when I designed *Mathematica* I decided not to use them. Over the years, though, I’ve kept thinking about them. And as it happens, right now, more than 30 years after SMP, I’m working on some very interesting new functionality in *Mathematica* that’s closely related to symbolically indexed lists.

I learned a huge amount designing SMP—and then seeing how the design played out. One particularly memorable moment for me was this. Like *Mathematica*, SMP had pure functions. But unlike *Mathematica*, it didn’t have a syntax like `&` to indicate them. And that meant that it needed a special object called a “mark” (written `` ` ``) to indicate when a pure function was supposed to give a literal, constant, value. Well, about 5 years after SMP was released, I was looking at one of its training manuals. And out jumped at me the sentence: “Marks are the enigma of SMP”. And in that moment I realized: that’s what a language design mistake looks like.

SMP was in many ways a very radical system—a kind of extreme experiment in programming language design. It had only grudging support for most of what were then familiar programming constructs. And instead almost everything in it revolved around the idea of transformation rules for symbolic expressions. In some ways I think SMP went too far into the unfamiliar. Because in a sense what a programming language has to do is to connect the human conception of a computation to an actual computation that a computer can execute. And however powerful a language is, it doesn’t do much good if humans don’t have enough context to be able to understand it. Which is why in *Mathematica*, I’ve always tried to make things familiar when I can, limiting the unfamiliar to places where it’s really needed in supporting things that are fundamentally new.

One of the things about designing a system is knowing what’s going to end up being important. In SMP, we spent a lot of effort on what we called “semantic pattern matching”. Let’s say one made a definition like `f[$x+$y, $x, $y] := {$x, $y}`. It’s pretty clear that this would match `f[a+b, a, b]`. But what about `f[7, 3, 4]`? In SMP, that *would* match—even though the 7 isn’t structurally of the form `$x+$y`. It took lots of effort to make this work. And it was neat to see in simple examples. But in the end, it just didn’t come up very often—and when it did, it was usually something to avoid, because it typically made the operation of programs really hard to understand.
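
The difference between structural and semantic matching can be sketched like this—a toy Python reconstruction of the behavior described above, handling just the one rule `f[$x+$y, $x, $y]`:

```python
# Toy contrast between structural and semantic pattern matching, for
# the single rule f[$x+$y, $x, $y] -> {$x, $y} (illustration only;
# SMP's general matcher was far more elaborate).

def match_f(args):
    """Match f[s, x, y] where s must equal x + y."""
    s, x, y = args
    # Semantic match: the integer 7 matches $x+$y with x=3, y=4,
    # even though 7 is not structurally a sum:
    if all(isinstance(v, int) for v in (s, x, y)) and s == x + y:
        return [x, y]
    # Structural match: s is literally the expression x + y:
    if isinstance(s, tuple) and s == ('+', x, y):
        return [x, y]
    return None                  # no match

assert match_f([('+', 'a', 'b'), 'a', 'b']) == ['a', 'b']  # structural
assert match_f([7, 3, 4]) == [3, 4]                        # semantic
assert match_f([8, 3, 4]) is None
```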

There was something similar with recursion control. I thought it was bad to have `f[$x] : $x f[$x-1]` (with no end condition for `f[1]`) go into an infinite loop trying to evaluate `f[-1]`, `f[-2]`, etc. Because after all, at some point there’s multiplication by 0. So why not just give 0? Well, in SMP the default was to give 0. Because instead of running all the way down the evaluation of each branch of the recursion tree, SMP would repeatedly stop and try to simplify all the unevaluated branches. It was neat and clever. But by the time one started parametrizing this behavior it was just too hard for people to understand, and nobody ended up using it.
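
That default can be modeled like this (a toy Python illustration of the evaluation strategy just described, not SMP’s actual implementation): instead of recursing depth-first forever, expand one step at a time and simplify, so a literal factor of 0 cuts the recursion off.

```python
# Toy model of SMP's recursion control (illustration, not SMP code).
# Expressions: ('f', n) is an unevaluated call f[n]; ('*', k, e) is k*e.

def expand_once(expr):
    """Rewrite the innermost f[n] one step: f[n] -> n * f[n-1]."""
    if isinstance(expr, tuple) and expr[0] == 'f':
        n = expr[1]
        return ('*', n, ('f', n - 1))
    if isinstance(expr, tuple) and expr[0] == '*':
        return ('*', expr[1], expand_once(expr[2]))
    return expr

def simplify(expr):
    """Collapse any product with a literal 0 factor to 0."""
    if isinstance(expr, tuple) and expr[0] == '*':
        if expr[1] == 0:
            return 0
        rest = simplify(expr[2])
        return 0 if rest == 0 else ('*', expr[1], rest)
    return expr

def has_unevaluated(expr):
    if isinstance(expr, tuple):
        return expr[0] == 'f' or has_unevaluated(expr[2])
    return False

def smp_style_eval(expr, max_steps=100):
    """Alternate expansion and simplification instead of plain recursion."""
    for _ in range(max_steps):
        expr = simplify(expand_once(expr))
        if not has_unevaluated(expr):
            return expr
    return expr                  # gave up: a genuine infinite loop

# f[3] expands to 3*2*1*0*f[-1], and the 0 factor simplifies it away:
assert smp_style_eval(('f', 3)) == 0
```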

And then there was user-defined syntax. Allowing users for example to set “`U`” (say, for “`union`”) to be an infix operator. Which worked great until one wanted to type a function with a “U” in its name. Or until one completely trapped oneself in one’s syntax, leaving the parser no means of escape.

SMP was a great learning experience for me. And *Mathematica* wouldn’t be nearly as good if I hadn’t done SMP first. And as I reflect now on “mistakes” in SMP, one thing I find quite satisfying is that I don’t think I’d make any of them today. Between SMP and 25 years of *Mathematica* design, most of them would now fall into the category of “easy issues” for me.

It’s funny, though, how often variations of some of the not-so-good ideas in SMP seem to come up. And actually I’m very curious with my modern design sensibilities how exactly I’d feel about them if I ran SMP today. Which is part of the reason I’m keen to release SMP from its “digital time safe”, and get it running again. Which I hope someone out there is going to help me make possible.

Leafing through his yellowed (but still robust enough for me to touch) pages of notes, I felt a certain connection—as I tried to imagine what he was thinking when he wrote them, and tried to relate what I saw in them to what we now know after three more centuries:

Some things, especially in mathematics, are quite timeless. Like here’s Leibniz writing down an infinite series for √2 (the text is in Latin):

Or here’s Leibniz trying to calculate a continued fraction—though he got the arithmetic wrong, even though he wrote it all out (the Π was his earlier version of an equals sign):

Or here’s a little summary of calculus, that could almost be in a modern textbook:

But what was everything else about? What was the larger story of his work and thinking?

I have always found Leibniz a somewhat confusing figure. He did many seemingly disparate and unrelated things—in philosophy, mathematics, theology, law, physics, history, and more. And he described what he was doing in what seem to us now as strange 17th century terms.

But as I’ve learned more, and gotten a better feeling for Leibniz as a person, I’ve realized that underneath much of what he did was a core intellectual direction that is curiously close to the modern computational one that I, for example, have followed.

Gottfried Leibniz was born in Leipzig in what’s now Germany in 1646 (four years after Galileo died, and four years after Newton was born). His father was a professor of philosophy; his mother’s family was in the book trade. Leibniz’s father died when Leibniz was 6—and after a 2-year deliberation on its suitability for one so young, Leibniz was allowed into his father’s library, and began to read his way through its diverse collection of books. He went to the local university at age 15, studying philosophy and law—and graduated in both of them at age 20.

Even as a teenager, Leibniz seems to have been interested in systematization and formalization of knowledge. There had been vague ideas for a long time—for example in the semi-mystical *Ars Magna* of Ramon Llull from the 1300s—that one might be able to set up some kind of universal system in which all knowledge could be derived from combinations of signs drawn from a suitable (as Descartes called it) “alphabet of human thought”. And for his philosophy graduation thesis, Leibniz tried to pursue this idea. He used some basic combinatorial mathematics to count possibilities. He talked about decomposing ideas into simple components on which a “logic of invention” could operate. And, for good measure, he put in an argument that purported to prove the existence of God.

As Leibniz himself said in later years, this thesis—written at age 20—was in many ways naive. But I think it began to define Leibniz’s lifelong way of thinking about all sorts of things. And so, for example, Leibniz’s law graduation thesis about “perplexing legal cases” was all about how such cases could potentially be resolved by reducing them to logic and combinatorics.

Leibniz was on a track to become a professor, but instead he decided to embark on a life working as an advisor for various courts and political rulers. Some of what he did for them was scholarship, tracking down abstruse—but politically important—genealogy and history. Some of it was organization and systematization—of legal codes, libraries and so on. Some of it was practical engineering—like trying to work out better ways to keep water out of silver mines. And some of it—particularly in earlier years—was “on the ground” intellectual support for political maneuvering.

One such activity in 1672 took Leibniz to Paris for four years—during which time he interacted with many leading intellectual lights. Before then, Leibniz’s knowledge of mathematics had been fairly basic. But in Paris he had the opportunity to learn all the latest ideas and methods. And for example he sought out Christiaan Huygens, who agreed to teach Leibniz mathematics—after he succeeded in passing the test of finding the sum of the reciprocals of the triangular numbers.
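
Huygens’ test problem has a tidy answer: since the *n*th triangular number is *n*(*n*+1)/2, each reciprocal is 2(1/*n* − 1/(*n*+1)), so the series telescopes to exactly 2. A quick check in Python:

```python
# The sum of reciprocals of triangular numbers T_n = n(n+1)/2 telescopes:
# 1/T_n = 2/(n(n+1)) = 2*(1/n - 1/(n+1)), so the infinite sum is 2.
from fractions import Fraction

N = 1000
partial = sum(Fraction(2, n * (n + 1)) for n in range(1, N + 1))

# Telescoping gives the partial sum 2 - 2/(N+1) exactly:
assert partial == 2 - Fraction(2, N + 1)
```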

Over the years, Leibniz refined his ideas about the systematization and formalization of knowledge, imagining a whole architecture for how knowledge would—in modern terms—be made computational. He saw the first step as being the development of an *ars characteristica*—a methodology for assigning signs or symbolic representations to things, and in effect creating a uniform “alphabet of thought”. And he then imagined—in remarkable resonance with what we now know about computation—that from this uniform representation it would be possible to find “truths of reason in any field… through a calculus, as in arithmetic or algebra”.

He talked about his ideas under a variety of rather ambitious names like *scientia generalis* (“general method of knowledge”), *lingua philosophica* (“philosophical language”), *mathematique universelle* (“universal mathematics”), *characteristica universalis* (“universal system”) and *calculus ratiocinator* (“calculus of thought”). He imagined applications ultimately in all areas—science, law, medicine, engineering, theology and more. But the one area in which he had clear success quite quickly was mathematics.

To me it’s remarkable how rarely in the history of mathematics notation has been viewed as a central issue. It happened at the beginning of modern mathematical logic in the late 1800s with the work of people like Gottlob Frege and Giuseppe Peano. And in recent times it’s happened with me in my efforts to create *Mathematica* and the Wolfram Language. But it also happened three centuries ago with Leibniz. And I suspect that Leibniz’s successes in mathematics were in no small part due to the effort he put into notation, and the clarity of reasoning about mathematical structures and processes that it brought.

When one looks at Leibniz’s papers, it’s interesting to see his notation and its development. Many things look quite modern. Though there are charming dashes of the 17th century, like the occasional use of alchemical or planetary symbols for algebraic variables:

There’s Π as an equals sign instead of =, with the slightly hacky idea of having it be like a balance, with a longer leg on one side or the other indicating less than (“<”) or greater than (“>”):

There are overbars to indicate grouping of terms—arguably a better idea than parentheses, though harder to type, and typeset:

We do use overbars for roots today. But Leibniz wanted to use them in integrals too. Along with the rather nice “tailed d”, which reminds me of the double-struck “differential d” that we invented for representing integrals in *Mathematica*.

Particularly in solving equations, it’s quite common to want to use ±, and it’s always confusing how the grouping is supposed to work, say in *a*±*b*±*c*. Well, Leibniz seems to have found it confusing too, but he invented a notation to handle it—which we actually should consider using today too:

I’m not sure what some of Leibniz’s notation means. Though those overtildes are rather nice-looking:

As are these things with dots:

Or this interesting-looking diagrammatic form:

Of course, Leibniz’s most famous notations are his integral sign (long “s” for “summa”) and d, here summarized in the margin for the first time, on November 11th, 1675 (the “5” in “1675” was changed to a “3” after the fact, perhaps by Leibniz):

I find it interesting that despite all his notation for “calculational” operations, Leibniz apparently did not invent similar notation for logical operations. “Or” was just the Latin word *vel*, “and” was *et*, and so on. And when he came up with the idea of quantifiers (modern ∀ and ∃), he just represented them by the Latin abbreviations U.A. and P.A.:

It’s always struck me as a remarkable anomaly in the history of thought that it took until the 1930s for the idea of universal computation to emerge. And I’ve often wondered if lurking in the writings of Leibniz there might be an early version of universal computation—maybe even a diagram that we could now interpret as a system like a Turing machine. But with more exposure to Leibniz, it’s become clearer to me why that’s probably not the case.

One big piece, I suspect, is that he didn’t take discrete systems quite seriously enough. He referred to results in combinatorics as “self-evident”, presumably because he considered them directly verifiable by methods like arithmetic. And it was only “geometrical”, or continuous, mathematics that he felt needed to have a calculus developed for it. In describing things like properties of curves, Leibniz came up with something like continuous functions. But he never seems to have applied the idea of functions to discrete mathematics—which might for example have led him to think about universal elements for building up functions.

Leibniz recognized the success of his infinitesimal calculus, and was keen to come up with similar “calculi” for other things. And in another “near miss” with universal computation, Leibniz had the idea of encoding logical properties using numbers. He thought about associating every possible attribute of a thing with a different prime number, then characterizing the thing by the product of the primes for its attributes—and then representing logical inference by arithmetic operations. But he only considered static attributes—and never got to an idea like Gödel numbering where operations are also encoded in numbers.
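
Leibniz’s prime-number scheme is easy to sketch in modern terms. Here’s a Python illustration of the idea (the particular attributes and prime assignments are my own example choices):

```python
# Leibniz's idea, sketched in Python: give each primitive attribute a
# prime, characterize a concept by the product of its attributes'
# primes, and turn logical inference into arithmetic on those numbers.

primes = {'animal': 2, 'rational': 3, 'mortal': 5}   # example attributes

def concept(*attrs):
    n = 1
    for a in attrs:
        n *= primes[a]
    return n

human = concept('animal', 'rational', 'mortal')      # 2 * 3 * 5 = 30
beast = concept('animal', 'mortal')                  # 2 * 5 = 10

def has_attribute(c, attr):
    return c % primes[attr] == 0     # "c has attr" becomes divisibility

assert has_attribute(human, 'rational')
assert not has_attribute(beast, 'rational')

# "All the attributes of beast are attributes of human" is also just
# divisibility: beast's number divides human's number.
assert human % beast == 0
```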

But even though Leibniz did not get to the idea of universal computation, he did understand the notion that computation is in a sense mechanical. And indeed quite early in life he seems to have resolved to build an actual mechanical calculator for doing arithmetic. Perhaps in part it was because he wanted to use it himself (always a good reason to build a piece of technology!). For despite his prowess at algebra and the like, his papers are charmingly full of basic (and sometimes incorrect) school-level arithmetic calculations written out in the margin—and now preserved for posterity:

There were scattered examples of mechanical calculators being built in Leibniz’s time, and when he was in Paris, Leibniz no doubt saw the addition calculator that had been built by Blaise Pascal in 1642. But Leibniz resolved to make a “universal” calculator, that could for the first time do all four basic functions of arithmetic with a single machine. And he wanted to give it a simple “user interface”, where one would for example turn a handle one way for multiplication, and the opposite way for division.

In Leibniz’s papers there are all sorts of diagrams about how the machine should work:

Leibniz imagined that his calculator would be of great practical utility—and indeed he seems to have hoped that he would be able to turn it into a successful business. But in practice, Leibniz struggled to get the calculator to work at all reliably. For like other mechanical calculators of its time, it was basically a glorified odometer. And just like in Charles Babbage’s machines nearly 200 years later, it was mechanically difficult to make many wheels move at once when a cascade of carries occurred.
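
The carry problem is easy to see in a ripple-carry model of an odometer-style register (a Python sketch of my own, just to make the mechanical difficulty concrete):

```python
# Why carries were hard: in an odometer-style register each wheel only
# nudges its neighbor, so adding 1 to 9999 must move every wheel at
# once -- mechanically, a cascade. A sketch of the ripple:

def increment(wheels):
    """wheels[0] is the least significant digit, base 10."""
    moved = 0
    for i in range(len(wheels)):
        wheels[i] = (wheels[i] + 1) % 10
        moved += 1
        if wheels[i] != 0:       # no carry: the ripple stops here
            break
    return moved                 # number of wheels that had to turn

w = [9, 9, 9, 9]                 # the register reading 9999
assert increment(w) == 4         # a single +1 turns all four wheels
assert w == [0, 0, 0, 0]
```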

Leibniz at first had a wooden prototype of his machine built, intended to handle just 3 or 4 digits. But when he demoed this to people like Robert Hooke during a visit to London in 1673 it didn’t go very well. But he kept on thinking he’d figured everything out—for example in 1679 writing (in French) of the “last correction to the arithmetic machine”:

Notes from 1682 suggest that there were more problems, however:

But Leibniz had plans drafted up from his notes—and contracted an engineer to build a brass version with more digits:

It’s fun to see Leibniz’s “marketing material” for the machine:

As well as parts of the “manual” (with 365×24 as a “worked example”):

Complete with detailed usage diagrams:

But despite all this effort, problems with the calculator continued. And in fact, for more than 40 years, Leibniz kept on tweaking his calculator—probably altogether spending (in today’s currency) more than a million dollars on it.

So what actually happened to the physical calculator? When I visited Leibniz’s archive, I had to ask. “Well”, my hosts said, “we can show you”. And there in a vault, along with shelves of boxes, was Leibniz’s calculator, looking as good as new in a glass case—here captured by me in a strange juxtaposition of ancient and modern:

All the pieces are there. Including a convenient wooden carrying box. Complete with a cranking handle. And, if it worked right, the ability to do any basic arithmetic operation with a few minutes of cranking:

Leibniz clearly viewed his calculator as a practical project. But he still wanted to generalize from it, for example trying to make a general “logic” to describe geometries of mechanical linkages. And he also thought about the nature of numbers and arithmetic. And was particularly struck by binary numbers.

Bases other than 10 had been used in recreational mathematics for several centuries. But Leibniz latched on to base 2 as having particular significance—and perhaps being a key bridge between philosophy, theology and mathematics. And he was encouraged in this by his realization that binary numbers were at the core of the *I Ching*, which he’d heard about from missionaries to China, and viewed as related in spirit to his *characteristica universalis*.

Leibniz worked out that it would be possible to build a calculator based on binary. But he appears to have thought that only base 10 could actually be useful.

It’s strange to read what Leibniz wrote about binary numbers. Some of it is clear and practical—and still seems perfectly modern. But some of it is very 17th century—talking for example about how binary proves that everything can be made from nothing, with 1 being identified with God, and 0 with nothing.

Almost nothing was done with binary for a couple of centuries after Leibniz: in fact, until the rise of digital computing in the last few decades. So when one looks at Leibniz’s papers, his calculations in binary are probably what seem most “out of his time”:

With binary, Leibniz was in a sense seeking the simplest possible underlying structure. And no doubt he was doing something similar when he talked about what he called “monads”. I have to say that I’ve never really understood monads. And usually when I think I almost have, there’s some mention of souls that just throws me completely off.

Still, I’ve always found it tantalizing that Leibniz seemed to conclude that the “best of all possible worlds” is the one “having the greatest variety of phenomena from the smallest number of principles”. And indeed, in the prehistory of my work on *A New Kind of Science*, when I first started formulating and studying one-dimensional cellular automata in 1981, I considered naming them “polymones”—but at the last minute got cold feet when I got confused again about monads.

There’s always been a certain mystique around Leibniz and his papers. Kurt Gödel—perhaps displaying his paranoia—seemed convinced that Leibniz had discovered great truths that had been suppressed for centuries. But while it is true that Leibniz’s papers were sealed when he died, it was his work on topics like history and genealogy—and the state secrets they might entail—that was the concern.

Leibniz’s papers were unsealed long ago, and after three centuries one might assume that every aspect of them would have been well studied. But the fact is that even after all this time, nobody has actually gone through all of the papers in full detail. It’s not that there are so many of them. Altogether there are only about 200,000 pages—filling perhaps a dozen shelving units (and only a little larger than my own personal archive from just the 1980s). But the problem is the diversity of material. Not only lots of subjects. But also lots of overlapping drafts, notes and letters, with unclear relationships between them.

Leibniz’s archive contains a bewildering array of documents. From the very large:

To the very small (Leibniz’s writing got smaller as he got older and more near-sighted):

Most of the documents in the archive seem very serious and studious. But despite the high cost of paper in Leibniz’s time, one still finds preserved for posterity the occasional doodle (is that Spinoza, by any chance?):

Leibniz exchanged mail with hundreds of people—famous and not-so-famous—all over Europe. So now, 300 years later, one can find in his archive “random letters” from the likes of Jacob Bernoulli:

What did Leibniz look like? Here he is, both in an official portrait, and without his rather oversized wig (that was mocked even in his time), that he presumably wore to cover up a large cyst on his head:

As a person, Leibniz seems to have been polite, courtly and even-tempered. In some ways, he may have come across as something of a nerd, expounding at great depth on all manner of topics. He seems to have taken great pains—as he did in his letters—to adapt to whoever he was talking to, emphasizing theology when he was talking to a theologian, and so on. Like quite a few intellectuals of his time, Leibniz never married, though he seems to have been something of a favorite with women at court.

In his career as a courtier, Leibniz was keen to climb the ladder. But not being into hunting or drinking, he never quite fit in with the inner circles of the rulers he worked for. Late in his life, when George I of Hanover became king of England, it would have been natural for Leibniz to join his court. But Leibniz was told that before he could go, he had to start writing up a history project he’d supposedly been working on for 30 years. Had he done so before he died, he might well have gone to England and had a very different kind of interaction with Newton.

At Leibniz’s archive, there are lots of papers, his mechanical calculator, and one more thing: a folding chair that he took with him when he traveled, and that he had suspended in carriages so he could continue to write as the carriage moved:

Leibniz was quite concerned about status (he often styled himself “Gottfried von Leibniz”, though nobody quite knew where the “von” came from). And as a form of recognition for his discoveries, he wanted to have a medal created to commemorate binary numbers. He came up with a detailed design, complete with the tag line *omnibus ex nihilo ducendis; sufficit unum* (“everything can be derived from nothing; all that is needed is 1”). But nobody ever made the medal for him.

In 2007, though, I wanted to come up with a 60th birthday gift for my friend Greg Chaitin, who has been a long-time Leibniz enthusiast. And so I thought: why not actually make Leibniz’s medal? So we did. Though on the back, instead of the picture of a duke that Leibniz proposed, we put a Latin inscription about Greg’s work.

And when I visited the Leibniz archive, I made sure to bring a copy of the medal, so I could finally put a real medal next to Leibniz’s design:

It would have been interesting to know what pithy statement Leibniz might have had on his grave. But as it was, when Leibniz died at the age of 70, his political fortunes were at a low ebb, and no elaborate memorial was constructed. Still, when I was in Hanover, I was keen to see his grave—which turns out to carry just the simple Latin inscription “bones of Leibniz”:

Across town, however, there’s another commemoration of a sort—an outlet store for cookies that carry the name “Leibniz” in his honor:

So what should we make of Leibniz in the end? Had history developed differently, there would probably be a direct line from Leibniz to modern computation. But as it is, much of what Leibniz tried to do stands isolated—to be understood mostly by projecting backward from modern computational thinking to the 17th century.

And with what we know now, it is fairly clear what Leibniz understood, and what he did not. He grasped the concept of having formal, symbolic, representations for a wide range of different kinds of things. And he suspected that there might be universal elements (maybe even just 0 and 1) from which these representations could be built. And he understood that from a formal symbolic representation of knowledge, it should be possible to compute its consequences in mechanical ways—and perhaps create new knowledge by an enumeration of possibilities.

Some of what Leibniz wrote was abstract and philosophical—sometimes maddeningly so. But at some level Leibniz was also quite practical. And he had sufficient technical prowess to often be able to make real progress. His typical approach seems to have been to start by trying to create a formal structure to clarify things—with formal notation if possible. And after that his goal was to create some kind of “calculus” from which conclusions could systematically be drawn.

Realistically he only had true success with this in one specific area: continuous “geometrical” mathematics. It’s a pity he never tried more seriously in discrete mathematics, because I think he might have been able to make progress, and might conceivably even have reached the idea of universal computation. He might well also have ended up starting to enumerate possible systems in the kind of way I have done in the computational universe.

One area where he did try his approach was with law. But in this he was surely far too early, and it is only now—300 years later—that computational law is beginning to seem realistic.

Leibniz also tried thinking about physics. But while he made progress with some specific concepts (like kinetic energy), he never managed to come up with any sort of large-scale “system of the world”, of the kind that Newton in effect did in his *Principia*.

In some ways, I think Leibniz failed to make more progress because he was trying too hard to be practical, and—like Newton—to decode the operation of actual physics, rather than just looking at related formal structures. For had Leibniz tried to do at least the basic kinds of explorations that I did in *A New Kind of Science*, I don’t think he would have had any technical difficulty—but I think the history of science could have been very different.

And I have come to realize that when Newton won the PR war against Leibniz over the invention of calculus, it was not just credit that was at stake; it was a way of thinking about science. Newton was in a sense quintessentially practical: he invented tools then showed how these could be used to compute practical results about the physical world. But Leibniz had a broader and more philosophical view, and saw calculus not just as a specific tool in itself, but as an example that should inspire efforts at other kinds of formalization and other kinds of universal tools.

I have often thought that the modern computational way of thinking that I follow is somehow obvious—and somehow an inevitable feature of thinking about things in formal, structured, ways. But it has never been very clear to me whether this apparent obviousness is just the result of modern times, and of our experience with modern practical computer technology. But looking at Leibniz, we get some perspective. And indeed what we see is that some core of modern computational thinking was possible even long before modern times. But the ambient technology and understanding of past centuries put definite limits on how far the thinking could go.

And of course this leads to a sobering question for us today: how much are we failing to realize from the core computational way of thinking because we do not have the ambient technology of the distant future? For me, looking at Leibniz has put this question in sharper focus. And at least one thing seems fairly clear.

In Leibniz’s whole life, he basically saw less than a handful of computers, and all they did was basic arithmetic. Today there are billions of computers in the world, and they do all sorts of things. But in the future there will surely be far far more computers (made easier to create by the Principle of Computational Equivalence). And no doubt we’ll get to the point where basically everything we make will explicitly be made of computers at every level. And the result is that absolutely everything will be programmable, down to atoms. Of course, biology has in a sense already achieved a restricted version of this. But we will be able to do it completely and everywhere.

At some level we can already see that this implies some merger of computational and physical processes. But just how may be as difficult for us to imagine as things like *Mathematica* and Wolfram|Alpha would have been for Leibniz.

Leibniz died on November 14, 1716. In 2016 that’ll be 300 years ago. And it’ll be a good opportunity to make sure everything we have from Leibniz has finally been gone through—and to celebrate after three centuries how many aspects of Leibniz’s core vision are finally coming to fruition, albeit in ways he could never have imagined.

A few weeks ago we decided to start analyzing all this data. And I have to say that if nothing else it’s been a terrific example of the power of *Mathematica* and the Wolfram Language for doing data science. (It’ll also be good fodder for the Data Science course I’m starting to create.)

We’d always planned to use the data we collect to enhance our Personal Analytics system. But I couldn’t resist also trying to do some basic science with it.

I’ve always been interested in people and the trajectories of their lives. But I’ve never been able to combine that with my interest in science. Until now. And it’s been quite a thrill over the past few weeks to see the results we’ve been able to get. Sometimes confirming impressions I’ve had; sometimes showing things I never would have guessed. And all along reminding me of phenomena I’ve studied scientifically in *A New Kind of Science*.

So what does the data look like? Here are the social networks of a few Data Donors—with clusters of friends given different colors. (Anyone can find their own network using Wolfram|Alpha—or the `SocialMediaData` function in *Mathematica*.)

So a first quantitative question to ask is: How big are these networks usually? In other words, how many friends do people typically have on Facebook? Well, at least for our users, that’s easy to answer. The median is 342—and here’s a histogram showing the distribution (there’s a cutoff at 5000 because that’s the maximum number of friends for a personal Facebook page):

But how typical are our users? In most respects—so far as we can tell—they seem pretty typical. But there are definitely some differences. Like here’s the distribution of the number of friends not just for our users, but also for their friends (there’s a mathematical subtlety in deriving this that I’ll discuss later):

And what we see is that in this broader Facebook population, there are significantly more people who have almost no Facebook friends. Whether such people should be included in samples one takes is a matter of debate. But so long as one looks at appropriate comparisons, aggregates, and so on, they don’t seem to have a huge effect. (The spike at 200 friends probably has to do with Facebook’s friend recommendation system.)

So, OK. Let’s ask for example how the typical number of Facebook friends varies with a person’s age. Of course all we know are self-reported “Facebook ages”. But let’s plot how the number of friends varies with that age. The solid line is the median number of friends; successive bands show successive octiles of the distribution.
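The bands in such a plot come straight from a quantile computation. Here’s a minimal sketch, in Python rather than the Wolfram Language used for the actual analysis, and with made-up friend counts standing in for the real data:

```python
import statistics

# Hypothetical friend-count samples for two age groups (illustrative only).
friend_counts_by_age = {
    20: [150, 275, 295, 310, 330, 395, 402, 480, 510, 620],
    40: [75, 95, 120, 155, 180, 210, 220, 260, 300, 340],
}

for age, counts in friend_counts_by_age.items():
    # statistics.quantiles with n=8 returns the 7 cut points dividing the
    # sample into octiles; the 4th cut point is the median.
    cuts = statistics.quantiles(counts, n=8)
    print(age, "median:", cuts[3], "octile cuts:", [round(c, 1) for c in cuts])
```

Plotting the median line with shaded regions between successive cut points gives exactly the kind of banded chart shown here.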

After a rapid rise, the number of friends peaks for people in their late teenage years, and then declines thereafter. Why is this? I suspect it’s partly a reflection of people’s intrinsic behavior, and partly a reflection of the fact that Facebook hasn’t yet been around very long. Assuming people don’t drop friends much once they’ve added them, one might expect that the number of friends would simply grow with age. And for sufficiently young people that’s basically what we see. But there’s a limit to the growth, because there’s a limit to the number of years people have been on Facebook. And assuming that’s roughly constant across ages, what the plot suggests is that people add friends progressively more slowly with age.

But what friends do they add? Given a person of a particular age, we can for example ask what the distribution of ages of the person’s friends is. Here are some results (the jaggedness, particularly at age 70, comes from the limited data we have):

And here’s an interactive version, generated from CDF:

The first thing we see is that the ages of friends always peak at or near the age of the person themselves—which is presumably a reflection of the fact that in today’s society many friends are made in age-based classes in school or college. For younger people, the peak around the person’s age tends to be pretty sharp. For older people, the distribution gets progressively broader.

We can summarize what happens by plotting the distribution of friend ages against the age of a person (the solid line is the median age of friends):

There’s an anomaly for the youngest ages, presumably because of kids under 13 misreporting their ages. But apart from that, we see that young people tend to have friends who are remarkably close in age to themselves. The broadening as people get older is probably associated with people making non-age-related friends in their workplaces and communities. And as the array of plots above suggests, by people’s mid-40s, there start to be secondary peaks at younger ages, presumably as people’s children become teenagers, and start using Facebook.

So what else can one see about the trajectory of people’s lives? Here’s the breakdown according to reported relationship status as a function of age:

And here’s more detail, separating out fractions for males and females (“married+” means “civil union”, “separated”, “widowed”, etc. as well as “married”):

There’s some obvious goofiness at low ages with kids (slightly more often girls than boys) misreporting themselves as married. But in general the trend is clear. The rate of getting married starts going up in the early 20s—a couple of years earlier for women than for men—and decreases again in the late 30s, with about 70% of people by then being married. The fraction of people “in a relationship” peaks around age 24, and there’s a small “engaged” peak around 27. The fraction of people who report themselves as married continues to increase roughly linearly with age, gaining about 5% between age 40 and age 60—while the fraction of people who report themselves as single continues to increase for women, while decreasing for men.

I have to say that as I look at the plots above, I’m struck by their similarity to plots for physical processes like chemical reactions. It’s as if all those humans, with all the complexities of their lives, still behave in aggregate a bit like molecules—with certain “reaction rates” to enter into relationships, marry, etc.

Of course, what we’re seeing here is just for the “Facebook world”. So how does it compare to the world at large? Well, at least some of what we can measure in the Facebook world is also measured in official censuses. And so for example we can see how our results for the fraction of people married at a given age compare with results from the official US Census:

I’m amazed at how close the correspondence is. Though there are clearly some differences. Like below age 20 kids on Facebook are misreporting themselves as married. And on the older end, widows are still considering themselves married for purposes of Facebook. For people in their 20s, there’s also a small systematic difference—with people on Facebook on average getting married a couple of years later than the Census would suggest. (As one might expect, if one excludes the rural US population, the difference gets significantly smaller.)

Talking of the Census, we can ask in general how our Facebook population compares to the US population. And for example, we find, not surprisingly, that our Facebook population is heavily weighted toward younger people:

OK. So we saw above how the typical number of friends a person has depends on age. What about gender? Perhaps surprisingly, if we look at all males and all females, there isn’t a perceptible difference in the distributions of number of friends. But if we instead look at males and females as a function of age, there is a definite difference:

Teenage boys tend to have more friends than teenage girls, perhaps because they are less selective in who they accept as friends. But after the early 20s, the difference between genders rapidly dwindles.

What effect does relationship status have? Here’s the male and female data as a function of age:

In the older set, relationship status doesn’t seem to make much difference. But for young people it does. With teenagers who (mis)report themselves as “married” on average having more friends than those who don’t. And with early teenage girls who say they’re “engaged” (perhaps to be able to tag a BFF) typically having more friends than those who say they’re single, or just “in a relationship”.

Another thing that’s fairly reliably reported by Facebook users is location. And it’s common to see quite a lot of variation by location. Like here are comparisons of the median number of friends for countries around the world (ones without enough data are left gray), and for states in the US:

There are some curious effects. Countries like Russia and China have low median friend counts because Facebook isn’t widely used for connections between people inside those countries. And perhaps there are lower friend counts in the western US because of lower population densities. But quite why there are higher friend counts for our Facebook population in places like Iceland, Brazil and the Philippines—or Mississippi—I don’t know. (There is of course some “noise” from people misreporting their locations. But with the size of the sample we have, I don’t think this is a big effect.)

In Facebook, people can list both a “hometown” and a “current city”. Here’s how the probability that these are in the same US state varies with age:

What we see is pretty much what one would expect. For some fraction of the population, there’s a certain rate of random moving, visible here for young ages. Around age 18, there’s a jump as people move away from their “hometowns” to go to college and so on. Later, some fraction move back, and progressively consider wherever they live to be their “hometown”.

One can ask where people move to and from. Here’s a plot showing the number of people in our Facebook population moving between different US states, and different countries:

There’s a huge range of demographic questions we could ask. But let’s come back to social networks. It’s a common observation that people tend to be friends with people who are like them. So to test this we might for example ask whether people with more friends tend to have friends who have more friends. Here’s a plot of the median number of friends that our users have, as a function of the number of friends that they themselves have:

And the result is that, yes, on average people with more friends tend to have friends with more friends. Though we also notice that people with lots of friends tend to have friends with fewer friends than themselves.

And seeing this gives me an opportunity to discuss a subtlety I alluded to earlier. The very first plot in this post shows the distribution of the number of friends that our users have. But what about the number of friends that their friends have? If we just average over all the friends of all our users, this is how what we get compares to the original distribution for our users themselves:

It seems like our users’ friends always tend to have more friends than our users themselves. But actually from the previous plot we know this isn’t true. So what’s going on? It’s a slightly subtle but general social-network phenomenon known as the “friendship paradox”. The issue is that when we sample the friends of our users, we’re inevitably sampling the space of all Facebook users in a very non-uniform way. In particular, if our users represent a uniform sample, any given friend will be sampled at a rate proportional to how many friends they have—with the result that people with more friends are sampled more often, so the average friend count goes up.

It’s perfectly possible to correct for this effect by weighting friends in inverse proportion to the number of friends they have—and that’s what we did earlier in this post. Doing so, we find that the friends of our users do not typically have more friends than our users themselves: their median number of friends is actually 229, compared to 342 for our users.
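The bias, and its correction, are easy to see on a toy network. Here’s a small sketch in Python (not our actual analysis pipeline; the network is made up): sampling the friends of every user over-counts high-degree people, and weighting each sampled friend by the inverse of their friend count exactly undoes that.

```python
from collections import defaultdict

# Toy network: one well-connected "hub" (node 0) plus mostly peripheral people.
edges = [(0, i) for i in range(1, 9)] + [(1, 2), (3, 4)]

adj = defaultdict(set)
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)
degree = {v: len(nbrs) for v, nbrs in adj.items()}

# Ordinary average number of friends, over all users.
mean_friends = sum(degree.values()) / len(degree)

# Sampling the friends of every user counts each person once per friendship
# they appear in, i.e. in proportion to their own friend count.
samples = [degree[f] for v in adj for f in adj[v]]
mean_friends_of_friends = sum(samples) / len(samples)

# Weighting each sampled friend by 1/degree removes that bias exactly.
corrected = len(samples) / sum(1 / d for d in samples)

print(round(mean_friends, 2))             # → 2.22
print(round(mean_friends_of_friends, 2))  # → 4.2
print(round(corrected, 2))                # → 2.22
```

The corrected value is a weighted mean with weights 1/degree, which algebraically reduces to the harmonic-mean form above—and it recovers the ordinary average exactly, since each person then contributes exactly once however many friendships they appear in.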

It’s worth mentioning that if we look at the distribution of number of friends that we deduce for the Facebook population, it’s a pretty good fit to a power law, with exponent -2.8. And this is a common form for networks of many kinds—which can be understood as the result of an effect known as “preferential attachment”, in which as the network grows, nodes that already have many connections preferentially get more connections, leading to a limiting “scale-free network” with power-law features.

But, OK. Let’s look in more detail at the social network of an individual user. I’m not sufficiently diligent on Facebook for my own network to be interesting. But my 15-year-old daughter Catherine was kind enough to let me show her network:

There’s a dot for each of Catherine’s Facebook friends, with connections between them showing who’s friends with whom. (There’s no dot for Catherine herself, because she’d just be connected to every other dot.) The network is laid out to show clusters or “communities” of friends (using the Wolfram Language function `FindGraphCommunities`). And it’s amazing the extent to which the network “tells a story”. With each cluster corresponding to some piece of Catherine’s life or history.

Here’s a whole collection of networks from our Data Donors:

No doubt each of these networks tells a different story. But we can still generate overall statistics. Like, for example, here is a plot of how the number of clusters of friends varies with age (there’d be less noise if we had more data):

Even at age 13, people typically seem to have about 3 clusters (perhaps school, family and neighborhood). As they get older, go to different schools, take jobs, and so on, they accumulate another cluster or so. Right now the number saturates above about age 30, probably in large part just because of the limited time Facebook has been around.

How big are typical clusters? The largest one is usually around 100 friends; the plot below shows the variation of this size with age:

And here’s how the size of the largest cluster as a fraction of the whole network varies with age:

What about more detailed properties of networks? Is there a kind of “periodic table” of network structures? Or a classification scheme like the one I made long ago for cellular automata?

The first step is to find some kind of iconic summary of each network, which we can do for example by looking at the overall connectivity of clusters, ignoring their substructure. And so, for example, for Catherine (who happened to suggest this idea), this reduces her network to the following “cluster diagram”:

Doing the same thing for the Data Donor networks shown above, here’s what we get:

In making these diagrams, we’re keeping every cluster with at least 2 friends. But to get a better overall view, we can just drop any cluster with, say, less than 10% of all friends—in which case for example Catherine’s cluster diagram becomes just:
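The reduction itself is mechanical. Here’s a rough Python sketch (with a made-up network and cluster assignment standing in for the output of a community-detection step): contract each cluster to a single node, link two clusters whenever any cross-edge exists between them, and optionally drop clusters below a size threshold.

```python
from collections import defaultdict

# Hypothetical friend network and a cluster label for each friend.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3), (6, 7)]
cluster_of = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B", 6: "C", 7: "C"}

def cluster_diagram(edges, cluster_of, min_fraction=0.0):
    """Contract clusters to single nodes, keeping only clusters that contain
    at least min_fraction of all friends."""
    sizes = defaultdict(int)
    for c in cluster_of.values():
        sizes[c] += 1
    keep = {c for c, s in sizes.items() if s >= min_fraction * len(cluster_of)}
    links = set()
    for a, b in edges:
        ca, cb = cluster_of[a], cluster_of[b]
        if ca != cb and ca in keep and cb in keep:
            links.add(tuple(sorted((ca, cb))))
    return sorted(keep), sorted(links)

print(cluster_diagram(edges, cluster_of))
# → (['A', 'B', 'C'], [('A', 'B')])
print(cluster_diagram(edges, cluster_of, min_fraction=0.3))
# → (['A', 'B'], [('A', 'B')])
```

With the threshold applied, the isolated small cluster drops out, just as the small clusters are dropped from the simplified diagrams here.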

And now for example we can count the relative numbers of different types of structures that appear in all the Data Donor networks:

And we can look at how the fractions of each of these structures vary with age:

What do we learn? The most common structures consist of either two or three major clusters, all of them connected. But there are also structures in which major clusters are completely disconnected—presumably reflecting facets of a person’s life that for reasons of geography or content are also completely disconnected.

For everyone there’ll be a different detailed story behind the structure of their cluster diagram. And one might think this would mean that there could never be a general theory of such things. At some level it’s a bit like trying to find a general theory of human history, or a general theory of the progression of biological evolution. But what’s interesting now about the Facebook world is that it gives us so much more data from which to form theories.

And we don’t just have to look at things like cluster diagrams, or even friend networks: we can dig almost arbitrarily deep. For example, we can analyze the aggregated text of posts people make on their Facebook walls, say classifying them by topics they talk about (this uses a natural-language classifier written in the Wolfram Language and trained using some large corpora):

Each of these topics is characterized by certain words that appear with high frequency:
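The classifier itself was written in the Wolfram Language, but the word lists behind a summary like this amount to a simple frequency computation. Here’s a sketch in Python, with made-up posts (the topics and text are purely hypothetical):

```python
from collections import Counter

# Hypothetical aggregated wall-post text, already grouped by topic.
posts_by_topic = {
    "sports": "game team game win game team game team win today",
    "weather": "rain snow rain cold rain snow rain snow cold today",
}

def top_words(text, top=3):
    """The most frequent words in a topic's aggregated text."""
    return [w for w, _ in Counter(text.split()).most_common(top)]

for topic, text in posts_by_topic.items():
    print(topic, top_words(text))
# → sports ['game', 'team', 'win']
# → weather ['rain', 'snow', 'cold']
```

In real use one would of course strip stop words first, and rank by how much more frequent a word is in one topic than in the corpus overall, rather than by raw counts.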

And for each topic we can analyze how its popularity varies with (Facebook) age:

It’s almost shocking how much this tells us about the evolution of people’s typical interests. People talk less about video games as they get older, and more about politics and the weather. Men typically talk more about sports and technology than women—and, somewhat surprisingly to me, they also talk more about movies, television and music. Women talk more about pets+animals, family+friends, relationships—and, at least after they reach child-bearing years, health. The peak time for anyone to talk about school+university is (not surprisingly) around age 20. People get less interested in talking about “special occasions” (mostly birthdays) through their teens, but gradually gain interest later. And people get progressively more interested in talking about career+money in their 20s. And so on. And so on.

Some of this is rather depressingly stereotypical. And most of it isn’t terribly surprising to anyone who’s known a reasonable diversity of people of different ages. But what to me is remarkable is how we can see everything laid out in such quantitative detail in the pictures above—kind of a signature of people’s thinking as they go through life.

Of course, the pictures above are all based on aggregate data, carefully anonymized. But if we start looking at individuals, we’ll see all sorts of other interesting things. And for example personally I’m very curious to analyze my own archive of nearly 25 years of email—and then perhaps predict things about myself by comparing to what happens in the general population.

Over the decades I’ve been steadily accumulating countless anecdotal “case studies” about the trajectories of people’s lives—from which I’ve certainly noticed lots of general patterns. But what’s amazed me about what we’ve done over the past few weeks is how much systematic information it’s been possible to get all at once. Quite what it all means, and what kind of general theories we can construct from it, I don’t yet know.

But it feels like we’re starting to be able to train a serious “computational telescope” on the “social universe”. And it’s letting us discover all sorts of phenomena. That have the potential to help us understand much more about society and about ourselves. And that, by the way, provide great examples of what can be achieved with data science, and with the technology I’ve been working on developing for so long.
