Category: Data Science

Scientific Bug Hunting in the Cloud: An Unexpected CEO Adventure

The Wolfram Cloud Needs to Be Perfect

The Wolfram Cloud is coming out of beta soon (yay!), and right now I’m spending much of my time working to make it as good as possible (and, by the way, it’s getting to be really great!). Mostly I concentrate on defining high-level function and strategy. But I like to understand things at every level, and as a CEO, one’s ultimately responsible for everything. And at the beginning of March I found myself diving deep into something I never expected…

Here’s the story. As a serious production system that lots of people will use to do things like run businesses, the Wolfram Cloud should be as fast as possible. Our metrics were saying that typical speeds were good, but subjectively when I used it something felt wrong. Sometimes it was plenty fast, but sometimes it seemed way too slow.

We’ve got excellent software engineers, but months were going by, and things didn’t seem to be changing. Meanwhile, we’d just released the Wolfram Data Drop. So I thought, why don’t I just run some tests myself, maybe collecting data in our nice new Wolfram Data Drop?

A great thing about the Wolfram Language is how friendly it is for busy people: even if you only have time to dash off a few lines of code, you can get real things done. And in this case, I only had to run three lines of code to find a problem.

First, I deployed a web API for a trivial Wolfram Language program to the Wolfram Cloud:

In[1]:= CloudDeploy[APIFunction[{}, 1 &]]

The Wolfram Data Drop Is Live!


Where should data from the Internet of Things go? We’ve got great technology in the Wolfram Language for interpreting, visualizing, analyzing, querying and otherwise doing interesting things with it. But the question is, how should the data from all those connected devices and everything else actually get to where good things can be done with it? Today we’re launching what I think is a great solution: the Wolfram Data Drop.

Wolfram Data Drop

When I first started thinking about the Data Drop, I viewed it mainly as a convenience—a means to get data from here to there. But now that we’ve built the Data Drop, I’ve realized it’s much more than that. And in fact, it’s a major step in our continuing efforts to integrate computation and the real world.

So what is the Wolfram Data Drop? At a functional level, it’s a universal accumulator of data, set up to get—and organize—data coming from sensors, devices, programs, or for that matter, humans or anything else. And to store this data in the cloud in a way that makes it completely seamless to compute with.

Putting the Wolfram Language (and Mathematica) on Every Raspberry Pi

Last week I wrote about our large-scale plan to use new technology we’re building to inject sophisticated computation and knowledge into everything. Today I’m pleased to announce a step in that direction: working with the Raspberry Pi Foundation, effective immediately there’s a pilot release of the Wolfram Language—as well as Mathematica—that will soon be bundled as part of the standard system software for every Raspberry Pi computer.

Wolfram Language and Mathematica now free on Raspberry Pi

Data Science of the Facebook World

More than a million people have now used our Wolfram|Alpha Personal Analytics for Facebook. And as part of our latest update, in addition to collecting some anonymized statistics, we launched a Data Donor program that allows people to contribute detailed data to us for research purposes.

A few weeks ago we decided to start analyzing all this data. And I have to say that if nothing else it’s been a terrific example of the power of Mathematica and the Wolfram Language for doing data science. (It’ll also be good fodder for the Data Science course I’m starting to create.)

We’d always planned to use the data we collect to enhance our Personal Analytics system. But I couldn’t resist also trying to do some basic science with it.

I’ve always been interested in people and the trajectories of their lives. But I’ve never been able to combine that with my interest in science. Until now. And it’s been quite a thrill over the past few weeks to see the results we’ve been able to get. Sometimes confirming impressions I’ve had; sometimes showing things I never would have guessed. And all along reminding me of phenomena I’ve studied scientifically in A New Kind of Science.

So what does the data look like? Here are the social networks of a few Data Donors—with clusters of friends given different colors. (Anyone can find their own network using Wolfram|Alpha—or the SocialMediaData function in Mathematica.)

social networks


Talking about the Computational Future at SXSW 2013

Last week I gave a talk at SXSW 2013 in Austin about some of the things I’m thinking about these days—including quite a few that I’ve never talked publicly about before. Here’s a video, and a slightly edited transcript:

Well, this is a pretty exciting time for me. Because it turns out that a whole bunch of things that I’ve been working on for more than 30 years are all finally converging, in a very nice way. And what I’d like to do here today is tell you a bit about that, and about some things I’ve figured out recently—and about what it all means for our future.

This is going to be a bit of a wild talk in some ways. It’s going to go from pretty intellectual stuff about basic science and so on, to some really practical technology developments, with a few sneak peeks at things I’ve never shown before.

Let’s start from some science. And you know, a lot of what I’ll say today connects back to what I thought at first was a small discovery that I made about 30 years ago. Let me tell you the story.

I started out at a pretty young age as a physicist. Diligently doing physics pretty much the way it had been done for 300 years. Starting from this-or-that equation, and then doing the math to figure out predictions from it. That worked pretty well in some cases. But there were too many cases where it just didn’t work. So I got to wondering whether there might be some alternative; a different approach.

Wolfram|Alpha Personal Analytics for Facebook

Note added: Since this blog post was written, Facebook has modified their API to make much less information available about Facebook friends. While I think adding privacy controls is a good idea, what Facebook has done reduces the richness of the results that Wolfram|Alpha Personal Analytics can give for Facebook users.


After I wrote about doing personal analytics with data I’ve collected about myself, many people asked how they could do similar things themselves.

Now of course most people haven’t been doing the kind of data collecting that I’ve been doing for the past couple of decades. But these days a lot of people do have a rich source of data about themselves: their Facebook histories.

And today I’m excited to announce that we’ve developed a first round of capabilities in Wolfram|Alpha to let anyone do personal analytics with Facebook data. Wolfram|Alpha knows about all kinds of knowledge domains; now it can know about you, and apply its powers of analysis to give you all sorts of personal analytics. And this is just the beginning; over the months to come, particularly as we see how people use this, we’ll be adding more and more capabilities.

It’s pretty straightforward to get your personal analytics report: all you have to do is type “facebook report” into the standard Wolfram|Alpha website.

If you’re doing this for the first time, you’ll be prompted to authenticate the Wolfram Connection app in Facebook, and then sign in to Wolfram|Alpha (yes, it’s free). And as soon as you’ve done that, Wolfram|Alpha will immediately get to work generating a personal analytics report from the data it can get about you through Facebook.

Here’s the beginning of the report I get today when I do this:

Facebook report

Yes, it was my birthday yesterday. And yes, as my children are fond of pointing out, I’m getting quite ancient…

The Personal Analytics of My Life

One day I’m sure everyone will routinely collect all sorts of data about themselves. But because I’ve been interested in data for a very long time, I started doing this long ago. I actually assumed lots of other people were doing it too, but apparently they were not. And so now I have what is probably one of the world’s largest collections of personal data.

Every day—in an effort at “self awareness”—I have automated systems send me a few emails about the day before. But even though I’ve been accumulating data for years—and always meant to analyze it—I’ve never actually gotten around to doing it. But with Mathematica and the automated data analysis capabilities we just released in Wolfram|Alpha Pro, I thought now would be a good time to finally try taking a look—and to use myself as an experimental subject for studying what one might call “personal analytics”.

Let’s start off talking about email. I have a complete archive of all my email going back to 1989—a year after Mathematica was released, and two years after I founded Wolfram Research. Here’s a plot with a dot showing the time of each of the third of a million emails I’ve sent since 1989:

Plot with a dot showing the time of each of the third of a million pieces of email

Launching a Democratization of Data Science

It’s a sad but true fact that most data that’s generated or collected—even with considerable effort—never gets any kind of serious analysis. But in a sense that’s not surprising. Because doing data science has always been hard. And even expert data scientists usually have to spend lots of time wrangling code and data to do any particular analysis.

I myself have been using computers to work with data for more than a third of a century. And over that time my tools and methods have gradually evolved. But this week—with the release of Wolfram|Alpha Pro—something dramatic has happened, that will forever change the way I approach data.

The key idea is automation. The concept in Wolfram|Alpha Pro is that I should just be able to take my data in whatever raw form it arrives, and throw it into Wolfram|Alpha Pro. And then Wolfram|Alpha Pro should automatically do a whole bunch of analysis, and then give me a well-organized report about my data. And if my data isn’t too large, this should all happen in a few seconds.

And what’s amazing to me is that it actually works. I’ve got all kinds of data lying around: measurements, business reports, personal analytics, whatever. And I’ve been feeding it into Wolfram|Alpha Pro. And Wolfram|Alpha Pro has been showing me visualizations and coming up with analyses that tell me all kinds of useful things about the data.

Data input

Advance of the Data Civilization: A Timeline

The precursors of what we’re trying to do with computable data in Wolfram|Alpha in many ways stretch back to the very dawn of human history—and in fact their development has been fascinatingly tied to the whole progress of civilization.

Last year we invited the leaders of today’s great data repositories to our Wolfram Data Summit—and as a conversation piece we assembled a timeline of the historical development of systematic data and computable knowledge.

This year, as we approach the Wolfram Data Summit 2011, we’ve taken the comments and suggestions we got, and we’re making available a five-foot-long (1.5-meter) printed poster of the timeline—as well as having the basic content on the web.

Historical data timeline

The story the timeline tells is a fascinating one: of how, in a multitude of steps, our civilization has systematized more and more areas of knowledge—collected the data associated with them, and gradually made them amenable to automation.

Making the World’s Data Computable

Stephen Wolfram at the Wolfram Data Summit

Keynote talk given at the Wolfram Data Summit in Washington, DC on Thursday, September 9, 2010.

Well, I should start off by admitting one thing.

This Data Summit was my idea.

And I have to say that the #1 reason I wanted to have it was so I could have a chance to meet you all.

So… thanks for coming, and I hope I will have a chance to meet you-all!

Well, I’ve been a collector and an enthusiast of systematic data for about as long as I can remember.

But in the last few years, I’ve launched into what one might think of as the ultimate extreme data project.

It’s actually something I’ve been thinking about since I was a kid.

The idea is: take all the systematic knowledge—and data—that our civilization has accumulated, and somehow make it computable.

Make it so that given any specific question one wants to ask, one can just compute the answer on the basis of that knowledge and data.

Well, every so often I’d think about this again. And it’d always just seem too big and too difficult. And like it was at least decades in the future.

But two things happened in my life.

Announcing the Wolfram Data Summit

(This post was originally published on the Wolfram Blog.)

The creation of large data repositories has been a key historical indicator of social and intellectual development—and indeed perhaps one of the defining characteristics of the whole progress of civilization.

And through our work on Wolfram|Alpha—with its insatiable appetite for systematic data—we have gained a uniquely broad view of the many great data repositories that exist in the world today.

Some of these repositories are maintained by national or international agencies, some by companies and other organizations, and some by individuals. A few of the repositories are quite new, but many date back 40 or more years, and some well over a century. But there is one thing in common across essentially every great data repository: a core of diligent and committed people who have carefully shepherded its development.

Curiously, though, few of these people have ever met their counterparts in other domains of data. And in our work on Wolfram|Alpha we are almost certainly the first group ever to have had the pleasure of getting to know such a broad range of leaders of great data repositories.

And one of the things that we have discovered is that there is much in common in both the methods used and the issues faced by these data repositories. So as part of our contribution to the worldwide data community we have decided to sponsor a data summit to bring together for the first time the leaders of today’s great data repositories.

The Wolfram Data Summit 2010 will be held in Washington, DC on September 9–10.