I’m pleased to announce that as of today, the Wolfram Data Repository is officially launched! It’s been a long road. I actually initiated the project a decade ago—but it’s only now, with all sorts of innovations in the Wolfram Language and its symbolic ways of representing data, as well as with the arrival of the Wolfram Cloud, that all the pieces are finally in place to make a true computable data repository that works the way I think it should.
The Wolfram Cloud is coming out of beta soon (yay!), and right now I’m spending much of my time working to make it as good as possible (and, by the way, it’s getting to be really great!). Mostly I concentrate on defining high-level function and strategy. But I like to understand things at every level, and as a CEO, one’s ultimately responsible for everything. And at the beginning of March I found myself diving deep into something I never expected…
Here’s the story. As a serious production system that lots of people will use to do things like run businesses, the Wolfram Cloud should be as fast as possible. Our metrics were saying that typical speeds were good, but subjectively when I used it something felt wrong. Sometimes it was plenty fast, but sometimes it seemed way too slow.
We’ve got excellent software engineers, but months were going by, and things didn’t seem to be changing. Meanwhile, we’d just released the Wolfram Data Drop. So I thought, why don’t I just run some tests myself, maybe collecting data in our nice new Wolfram Data Drop?
A great thing about the Wolfram Language is how friendly it is for busy people: even if you only have time to dash off a few lines of code, you can get real things done. And in this case, I only had to run three lines of code to find a problem.
Where should data from the Internet of Things go? We’ve got great technology in the Wolfram Language for interpreting, visualizing, analyzing, querying and otherwise doing interesting things with it. But the question is, how should the data from all those connected devices and everything else actually get to where good things can be done with it? Today we’re launching what I think is a great solution: the Wolfram Data Drop.
When I first started thinking about the Data Drop, I viewed it mainly as a convenience—a means to get data from here to there. But now that we’ve built the Data Drop, I’ve realized it’s much more than that. And in fact, it’s a major step in our continuing efforts to integrate computation and the real world.
So what is the Wolfram Data Drop? At a functional level, it’s a universal accumulator of data, set up to get—and organize—data coming from sensors, devices, programs, or for that matter, humans or anything else. And to store this data in the cloud in a way that makes it completely seamless to compute with. Continue reading
Last week I wrote about our large-scale plan to use new technology we’re building to inject sophisticated computation and knowledge into everything. Today I’m pleased to announce a step in that direction: working with the Raspberry Pi Foundation, effective immediately there’s a pilot release of the Wolfram Language—as well as Mathematica—that will soon be bundled as part of the standard system software for every Raspberry Pi computer.
A few weeks ago we decided to start analyzing all this data. And I have to say that if nothing else it’s been a terrific example of the power of Mathematica and the Wolfram Language for doing data science. (It’ll also be good fodder for the Data Science course I’m starting to create.)
We’d always planned to use the data we collect to enhance our Personal Analytics system. But I couldn’t resist also trying to do some basic science with it.
I’ve always been interested in people and the trajectories of their lives. But I’ve never been able to combine that with my interest in science. Until now. And it’s been quite a thrill over the past few weeks to see the results we’ve been able to get. Sometimes confirming impressions I’ve had; sometimes showing things I never would have guessed. And all along reminding me of phenomena I’ve studied scientifically in A New Kind of Science.
So what does the data look like? Here are the social networks of a few Data Donors—with clusters of friends given different colors. (Anyone can find their own network using Wolfram|Alpha—or the SocialMediaData function in Mathematica.)
Last week I gave a talk at SXSW 2013 in Austin about some of the things I’m thinking about these days—including quite a few that I’ve never talked publicly about before. Here’s a video, and a slightly edited transcript:
Well, this is a pretty exciting time for me. Because it turns out that a whole bunch of things that I’ve been working on for more than 30 years are all finally converging, in a very nice way. And what I’d like to do here today is tell you a bit about that, and about some things I’ve figured out recently—and about what it all means for our future.
This is going to be a bit of a wild talk in some ways. It’s going to go from pretty intellectual stuff about basic science and so on, to some really practical technology developments, with a few sneak peeks at things I’ve never shown before.
Let’s start from some science. And you know, a lot of what I’ll say today connects back to what I thought at first was a small discovery that I made about 30 years ago. Let me tell you the story.
I started out at a pretty young age as a physicist. Diligently doing physics pretty much the way it had been done for 300 years. Starting from this-or-that equation, and then doing the math to figure out predictions from it. That worked pretty well in some cases. But there were too many cases where it just didn’t work. So I got to wondering whether there might be some alternative; a different approach. Continue reading
Note added: Since this blog was written, Facebook has modified their API to make much less information available about Facebook friends. While I think adding privacy controls is a good idea, what Facebook has done reduces the richness of the results that Wolfram|Alpha Personal Analytics can give for Facebook users.
After I wrote about doing personal analytics with data I’ve collected about myself, many people asked how they could do similar things themselves.
Now of course most people haven’t been doing the kind of data collecting that I’ve been doing for the past couple of decades. But these days a lot of people do have a rich source of data about themselves: their Facebook histories.
And today I’m excited to announce that we’ve developed a first round of capabilities in Wolfram|Alpha to let anyone do personal analytics with Facebook data. Wolfram|Alpha knows about all kinds of knowledge domains; now it can know about you, and apply its powers of analysis to give you all sorts of personal analytics. And this is just the beginning; over the months to come, particularly as we see about how people use this, we’ll be adding more and more capabilities.
If you’re doing this for the first time, you’ll be prompted to authenticate the Wolfram Connection app in Facebook, and then sign in to Wolfram|Alpha (yes, it’s free). And as soon as you’ve done that, Wolfram|Alpha will immediately get to work generating a personal analytics report from the data it can get about you through Facebook.
Here’s the beginning of the report I get today when I do this:
Yes, it was my birthday yesterday. And yes, as my children are fond of pointing out, I’m getting quite ancient… Continue reading
One day I’m sure everyone will routinely collect all sorts of data about themselves. But because I’ve been interested in data for a very long time, I started doing this long ago. I actually assumed lots of other people were doing it too, but apparently they were not. And so now I have what is probably one of the world’s largest collections of personal data.
Every day—in an effort at “self awareness”—I have automated systems send me a few emails about the day before. But even though I’ve been accumulating data for years—and always meant to analyze it—I’ve never actually gotten around to doing it. But with Mathematica and the automated data analysis capabilities we just released in Wolfram|Alpha Pro, I thought now would be a good time to finally try taking a look—and to use myself as an experimental subject for studying what one might call “personal analytics”.
Let’s start off talking about email. I have a complete archive of all my email going back to 1989—a year after Mathematica was released, and two years after I founded Wolfram Research. Here’s a plot with a dot showing the time of each of the third of a million emails I’ve sent since 1989:
It’s a sad but true fact that most data that’s generated or collected—even with considerable effort—never gets any kind of serious analysis. But in a sense that’s not surprising. Because doing data science has always been hard. And even expert data scientists usually have to spend lots of time wrangling code and data to do any particular analysis.
I myself have been using computers to work with data for more than a third of a century. And over that time my tools and methods have gradually evolved. But this week—with the release of Wolfram|Alpha Pro—something dramatic has happened, that will forever change the way I approach data.
The key idea is automation. The concept in Wolfram|Alpha Pro is that I should just be able to take my data in whatever raw form it arrives, and throw it into Wolfram|Alpha Pro. And then Wolfram|Alpha Pro should automatically do a whole bunch of analysis, and then give me a well-organized report about my data. And if my data isn’t too large, this should all happen in a few seconds.
And what’s amazing to me is that it actually works. I’ve got all kinds of data lying around: measurements, business reports, personal analytics, whatever. And I’ve been feeding it into Wolfram|Alpha Pro. And Wolfram|Alpha Pro has been showing me visualizations and coming up with analyses that tell me all kinds of useful things about the data.
The precursors of what we’re trying to do with computable data in Wolfram|Alpha in many ways stretch back to the very dawn of human history—and in fact their development has been fascinatingly tied to the whole progress of civilization.
Last year we invited the leaders of today’s great data repositories to our Wolfram Data Summit—and as a conversation piece we assembled a timeline of the historical development of systematic data and computable knowledge.
The story the timeline tells is a fascinating one: of how, in a multitude of steps, our civilization has systematized more and more areas of knowledge—collected the data associated with them, and gradually made them amenable to automation. Continue reading
The creation of large data repositories has been a key historical indicator of social and intellectual development—and indeed perhaps one of the defining characteristics of the whole progress of civilization.
And through our work on Wolfram|Alpha—with its insatiable appetite for systematic data—we have gained a uniquely broad view of the many great data repositories that exist in the world today.
Some of these repositories are maintained by national or international agencies, some by companies and other organizations, and some by individuals. A few of the repositories are quite new, but many date back 40 or more years, and some well over a century. But there is one thing in common across essentially every great data repository: a core of diligent and committed people who have carefully shepherded its development.
Curiously, though, few of these people have ever met their counterparts in other domains of data. And in our work on Wolfram|Alpha we are almost certainly the first group ever to have had the pleasure of getting to know such a broad range of leaders of great data repositories.
And one of the things that we have discovered is that there is much in common in both the methods used and the issues faced by these data repositories. So as part of our contribution to the worldwide data community we have decided to sponsor a data summit to bring together for the first time the leaders of today’s great data repositories.