The Personal Analytics of My Life

March 8, 2012

One day I’m sure everyone will routinely collect all sorts of data about themselves. But because I’ve been interested in data for a very long time, I started doing this long ago. I actually assumed lots of other people were doing it too, but apparently they were not. And so now I have what is probably one of the world’s largest collections of personal data.

Every day—in an effort at “self awareness”—I have automated systems send me a few emails about the day before. But even though I’ve been accumulating data for years—and always meant to analyze it—I’ve never actually gotten around to doing it. But with Mathematica and the automated data analysis capabilities we just released in Wolfram|Alpha Pro, I thought now would be a good time to finally try taking a look—and to use myself as an experimental subject for studying what one might call “personal analytics”.

Let’s start off talking about email. I have a complete archive of all my email going back to 1989—a year after Mathematica was released, and two years after I founded Wolfram Research. Here’s a plot with a dot showing the time of each of the third of a million emails I’ve sent since 1989:

Plot with a dot showing the time of each of the third of a million pieces of email

The first thing one sees from this plot is that, yes, I’ve been busy. And for more than 20 years, I’ve been sending emails throughout my waking day, albeit with a little dip around dinner time. The big gap each day comes from when I was asleep. And for the last decade, the plot shows I’ve been pretty consistent, going to sleep around 3am ET, and getting up around 11am (yes, I’m something of a night owl). (The stripe in summer 2009 is a trip to Europe.)

But what about the 1990s? Well, that was when I spent a decade as something of a hermit, working very hard on A New Kind of Science. And the plot makes it very clear why in the late 1990s when one of my children was asked for an example of “being nocturnal” they gave me. The rather dramatic discontinuity in 2002 is the moment when A New Kind of Science was finally finished, and I could start leading a different kind of life.

So what about other features of the plot? Some line up with identifiable events and trends in my life, sometimes reflected in my online scrapbook or timeline. Others at first I don’t understand at all—until a quick search of my email archive jogs my memory. It’s very convenient that I can always drill down and read a raw email. Because as with essentially any long-timescale data project, there are all kinds of glitches (here like misformatted email headers, unset computer clocks, and untagged automated mailings) that have to be found and systematically corrected for before one has consistent data to analyze. And before, in this case, I can trust that any dots in the middle of the night are actually times I woke up and sent email (which is nowadays very rare).
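
As a rough illustration of the cleanup step described above, here is a minimal Python sketch (not the Mathematica code behind the plots) that turns raw `Date` headers into the (date, time-of-day) points of a dot plot and sets aside headers it cannot trust. The function name `diurnal_points` and the fixed UTC-5 offset standing in for Eastern time are assumptions made for this example.

```python
from datetime import timezone, timedelta
from email.utils import parsedate_to_datetime

# Assumption: a fixed UTC-5 offset stands in for US Eastern time (daylight saving is ignored).
EASTERN = timezone(timedelta(hours=-5))


def diurnal_points(date_headers):
    """Turn raw Date headers into (calendar date, hour-of-day) pairs for a dot plot.

    Returns the list of points and the number of headers that were skipped.
    """
    points = []
    skipped = 0
    for raw in date_headers:
        try:
            sent = parsedate_to_datetime(raw)
        except (TypeError, ValueError):
            skipped += 1  # misformatted header: leave it for manual review
            continue
        if sent.tzinfo is None:
            skipped += 1  # no usable time zone, so the local hour is unknown
            continue
        local = sent.astimezone(EASTERN)
        hour = local.hour + local.minute / 60 + local.second / 3600
        points.append((local.date(), hour))
    return points, skipped
```

Counting the skipped headers rather than silently dropping them makes it easier to see how much of the archive still needs manual correction.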

The plot above suggests that there’s been a progressive increase in my email volume over the years. One can see that more explicitly if one just plots the total number of emails I’ve sent as a function of time:

Daily outgoing emails and monthly outgoing emails

Again, there are some life trends visible. The gradual decrease in the early 1990s reflects me reducing my involvement in day-to-day management of our company to concentrate on basic science. The increase in the 2000s is me jumping back in, and driving more and more company projects. And the peak in early 2009 reflects the final preparations for the launch of Wolfram|Alpha. (The individual spikes, including the all-time winner August 27, 2006, are mostly weekend or travel days specifically spent “grinding down” email backlogs.)

The plots above seem to support the idea that “life’s complicated”. But if one aggregates the data a bit, it’s easy to end up with plots that seem like they could just be the result of some simple physics experiment. Like here’s the distribution of the number of emails I’ve sent per day since 1989:

Distribution of emails per day

What is this distribution? Is there a simple model for it? I don’t know. Wolfram|Alpha Pro tells us that the best fit it finds is to a geometric distribution. But it officially rejects that fit. Still, at least the tail seems—as so often—to follow a power law. And perhaps that’s telling me something about myself, though I have to say I don’t know what.
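
To make the idea of a geometric fit concrete, here is a small Python sketch. It is not what Wolfram|Alpha Pro does internally; it assumes the simplest form of the model, P(k) = p(1 - p)^k for k = 0, 1, 2, ... emails per day, for which the maximum-likelihood estimate is p = 1/(1 + mean). The function names are hypothetical.

```python
from collections import Counter


def fit_geometric(daily_counts):
    """Maximum-likelihood estimate of p for P(k) = p * (1 - p)**k, k = 0, 1, 2, ..."""
    mean = sum(daily_counts) / len(daily_counts)
    return 1.0 / (1.0 + mean)


def compare_to_fit(daily_counts):
    """Return rows of (k, observed number of days, expected number of days under the fit)."""
    p = fit_geometric(daily_counts)
    n = len(daily_counts)
    observed = Counter(daily_counts)
    return [
        (k, observed[k], n * p * (1 - p) ** k)
        for k in range(max(daily_counts) + 1)
    ]
```

Putting the observed and expected columns side by side is one simple way to see where a fit like this breaks down, for example in the tail.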

Monthly distinct email recipients

The vast majority of these recipients are people or mailgroups within our company. And I suspect the overall growth is a reflection of both the increasing number of people at the company, and the increasing number of projects in which I and our company are involved. The peaks are often associated with intense early-stage projects, where I am directly interacting with lots of people, and there isn’t yet a well-organized management structure in place. I don’t quite understand the recent decrease, considering that the number of projects is at an all-time high. I’m just hoping it reflects better organization and management…

OK, so all of that is about email I’ve sent. What about email I’ve received? Here’s a plot comparing my incoming and outgoing email:

Average daily emails

The peaks in 1996 and 2009 are both associated with the later phases of big projects (Mathematica 3 and the launch of Wolfram|Alpha) where I was watching all sorts of details, often using email-based automated systems.

OK. So email is one kind of data I’ve systematically archived. And there’s a huge amount that can be learned from that. Another kind of data that I’ve been collecting is keystrokes. For many years, I’ve captured every keystroke I’ve typed—now more than 100 million of them:

Diurnal plot of keystrokes

Daily keystrokes, averaged by month

There are all kinds of detailed facts to extract: like that the average fraction of keys I type that are backspaces has consistently been about 7% (I had no idea it was so high!). Or how my habits in using different computers and applications have changed. And looking at the daily totals, I can see spikes of writing activity—typically associated with creating longer documents (including blog posts). But at least at an overall level things like the plots above look similar for keystrokes and email.

What about other measures of activity? My automated systems have been quietly archiving lots of them for years. And for example this shows the times of events that have appeared in my calendar:

Diurnal plot of calendar events

The changes over the years reflect quite directly things going on in my life. Before 2002 I was doing a lot of solitary work, particularly on A New Kind of Science, and having only a few scheduled meetings. But then as I initiated more and more new projects at our company, and took a more and more structured approach to managing them, one can see more and more meetings getting filled in. Though my “family dinner stripe” remains clearly visible.

Here’s a plot of the daily average total number of meetings (and other calendar events) that I’ve done over the years:

Average events per day

The trend is pretty clear. And it reflects the fact that in the past decade or so I’ve gradually learned to work better “in public”, efficiently figuring things out while interacting with groups of people—which I’ve discovered makes me much more effective both at using other people’s expertise and at delegating things that have to be done.

It often surprises people when I tell them this, but since 1991 I’ve been a remote CEO, interacting with my company almost exclusively just by email and phone (usually with screensharing). (No, I don’t find videoconferencing with the company very useful, and the telepresence robot I got recently has mostly been standing idle.)

So phone calls are another source of data for me. And here’s a plot of the times of calls I’ve made (the gray regions are missing data):

Diurnal plot of phone calls

Yes, I spend many hours on the phone each day:

Daily hours on the phone and monthly hours on the phone

And this shows how the probability to find me on the phone varies during the day:

On-phone probability

This is averaged over all days for the last several years, and in fact I’m guessing that the “peak weekday probability” would actually be even higher than 70% if the average excluded days when I’m away for one reason or another.
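
One way such a curve could be computed is sketched below in Python, under stated assumptions: calls are given as (start, end) `datetime` pairs, `total_days` counts every day in the period (including days with no calls at all), and the 15-minute bin size is an arbitrary choice for the example.

```python
from datetime import datetime, timedelta


def on_phone_probability(calls, total_days, bin_minutes=15):
    """Fraction of days on which each time-of-day bin overlaps at least one call.

    Assumes bin_minutes divides 60 evenly.
    """
    bins_per_day = 24 * 60 // bin_minutes
    days_busy = [set() for _ in range(bins_per_day)]
    step = timedelta(minutes=bin_minutes)
    for start, end in calls:
        # Begin at the start of the bin that contains the call's first moment.
        t = start - timedelta(
            minutes=start.minute % bin_minutes,
            seconds=start.second,
            microseconds=start.microsecond,
        )
        while t < end:
            index = (t.hour * 60 + t.minute) // bin_minutes
            days_busy[index].add(t.date())
            t += step
    return [len(days) / total_days for days in days_busy]
```

Because the denominator is all days rather than only days with calls, days away from the phone pull the peak down, which is the effect described above.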

Here’s another way to look at the data—this shows the probability for calls to start at a given time:

Call start times

There’s a curious pattern of peaks—near hours and half-hours. And of course those occur because many phone calls are scheduled at those times. Which means that if one plots meeting start times and phone call start times one sees a strong correlation:

Calls and meetings

Differences between meeting and phone call start times

I was curious just how strong this correlation is: in effect just how scheduled all those calls are. And looking at the data I found that at least for my external phone meetings, at least half of them do indeed start within 2 minutes of their appointed times. For internal meetings—which tend to involve more people, and which I normally have scheduled back-to-back—there’s a somewhat broader distribution, shown in the plot above.
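
A comparison like this could be sketched in Python as follows. The pairing rule (match each call to the nearest scheduled start) is an assumption made for the example, as are the function names; the actual matching may well have used calendar entries directly.

```python
from datetime import datetime


def start_offsets_minutes(call_starts: list[datetime], meeting_starts: list[datetime]) -> list[float]:
    """Offset of each call from the nearest scheduled start, in minutes (positive means late)."""
    offsets = []
    for call in call_starts:
        nearest = min(meeting_starts, key=lambda m: abs((call - m).total_seconds()))
        offsets.append((call - nearest).total_seconds() / 60)
    return offsets


def fraction_within(offsets: list[float], tolerance_minutes: float = 2) -> float:
    """Share of calls that started within the given tolerance of a scheduled time."""
    return sum(abs(o) <= tolerance_minutes for o in offsets) / len(offsets)
```

A histogram of the offsets gives the kind of distribution discussed above, and `fraction_within` gives the "within 2 minutes" figure.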

Call durations

When one looks at the distribution of call durations one sees a kind of “physics-like” background shape, but on top of that there’s the “obviously human” peak at the 1-hour mark, associated with meetings that are scheduled to be an hour long.

So far everything we’ve talked about has measured intellectual activity. But I’ve also got data on physical activity. Like for the past couple of years I’ve been wearing a little digital pedometer that measures every step I take:

Diurnal plot of steps taken

Daily steps averaged by month

And once again, this shows quite a bit of consistency. I take about the same number of steps every day. And many of them are taken in a block early in my day (typically coinciding with the first couple of meetings I do). There’s no mystery to this: years ago I decided I should take some exercise each day, so I set up a computer and phone to use while walking on a treadmill. (Yes, with the correct ergonomic arrangement one can type and use a mouse just fine while walking on a treadmill, at least up to—for me—a speed of about 2.5 mph.)

OK, so let’s put all this together. Here are my “average daily rhythms” for the past decade (or in some cases, slightly less):

Graphs of incoming emails, outgoing emails, keystrokes, meetings and events, calls, and steps as a function of time

The overall pattern is fairly clear. It’s meetings and collaborative work during the day, a dinner-time break, more meetings and collaborative work, and then in the later evening more work on my own. I have to say that looking at all this data I am struck by how shockingly regular many aspects of it are. But in general I am happy to see it. For my consistent experience has been that the more routine I can make the basic practical aspects of my life, the more I am able to be energetic—and spontaneous—about intellectual and other things.

And for me one of the objectives is to have ideas, and hopefully good ones. So can personal analytics help me measure the rate at which that happens?

It might seem very difficult. But as a simple approximation, one can imagine seeing at what rate one starts using new concepts, by looking at when one starts using new words or other linguistic constructs. Inevitably there are tricky issues in identifying genuine new “words” etc. (though for example I have managed to determine that when it comes to ordinary English words, I’ve typed about 33,000 distinct ones in the past decade). If one restricts to a particular domain, things become a bit easier, and here for example is a plot showing when names of what are now Mathematica functions first appeared in my outgoing email:

First email appearance of Mathematica functions

The spike at the beginning is an artifact, reflecting pre-existing functions showing up in my archived email. And the drop at the end reflects the fact that one doesn’t yet know future Mathematica names.  But it’s interesting to see elsewhere in the plot little “bursts of creativity”, mostly but not always correlated with important moments in Mathematica history—as well as a general increase in density in recent times.
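
The underlying bookkeeping is simple, and a Python sketch of it might look like this. It assumes messages are available as (ISO date string, body text) pairs and that a name counts as "appearing" when it shows up as a whole, case-sensitive token; both are assumptions for illustration, not a description of the original analysis.

```python
import re


def first_appearances(messages, names):
    """Map each name to the date of the earliest message that mentions it.

    `messages` is an iterable of (iso_date_string, body_text) pairs.
    Names that never appear are left out of the result.
    """
    wanted = set(names)
    first_seen = {}
    for sent_date, body in sorted(messages, key=lambda m: m[0]):
        # Whole alphanumeric tokens only, so "Plot" does not match inside "ListPlot".
        for token in set(re.findall(r"[A-Za-z][A-Za-z0-9]*", body)):
            if token in wanted and token not in first_seen:
                first_seen[token] = sent_date
    return first_seen
```

Binning the resulting dates by month would give a plot of the kind shown above, including the initial spike from names that already existed when the archive begins.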

As a quite different measure of creative progress, here’s a plot of when I modified the text of chapters in A New Kind of Science:

Plot of when chapters were modified in A New Kind of Science

I don’t have data readily at hand from the beginning of the project. And in 1995 and 1996 I continued to do research, but stopped editing text, because I was pulled away to finish Mathematica 3 (and the book about it). But otherwise one sees inexorable progress, as I systematically worked out each chapter and each area of the science. One can see the time it took to write each chapter (Chapter 12 on the Principle of Computational Equivalence took longest, at almost 2 years), and which chapters led to changes in which others. And with enough effort, one could drill down to find out when each discovery was made (it’s easier with modern Mathematica automatic history recording). But in the end—over the course of a decade—from all those individual keystrokes and file modifications there gradually emerged the finished A New Kind of Science.

It’s amazing how much it’s possible to figure out by analyzing the various kinds of data I’ve kept. And in fact, there are many additional kinds of data I haven’t even touched on in this post. I’ve also got years of curated medical test data (as well as my not-yet-very-useful complete genome), GPS location tracks, room-by-room motion sensor data, endless corporate records—and much much more.

And as I think about it all, I suppose my greatest regret is that I did not start collecting more data earlier. I have some backups of my computer filesystems going back to 1980. And if I look at the 1.7 million files in my current filesystem, there’s a kind of archeology one can do, looking at files that haven’t been modified for a long time (the earliest is dated June 29, 1980).

Here’s a plot of the latest modification times of all my current files:

Modification dates of all current files

The colors represent different file types. In the early years, there’s a mixture of plain text files (blue dots) and C language files (green). But gradually there’s a transition to Mathematica files (red)—with a burst of page layout files (orange) from when I was finishing A New Kind of Science. And once again the whole plot is a kind of engram—now of more than 30 years of my computing activities.

So what about things that were never on a computer? It so happens that years ago I also started keeping paper documents, pretty much on the theory that it was easier just to keep everything than to worry about what specifically was worth keeping. And now I’ve got about 230,000 pages of my paper documents scanned, and when possible OCR’ed. And as just one example of the kind of analysis one can do, here’s a plot of the frequency with which different 4-digit “date-like sequences” occur in all these documents:

Occurrence of years in scanned documents

Of course, not all these 4-digit sequences refer to dates (especially for example “2000”)—but many of them do. And from the plot one can see the rather sudden turnaround in my use of paper in 1984—when I turned the corner to digital storage.
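
Counting such sequences in OCR'ed text takes only a few lines. The Python sketch below assumes "date-like" means a standalone 4-digit number from 1800 to 2099; that range, the pattern, and the function name are choices made for this example.

```python
import re
from collections import Counter

# Assumption: "date-like" means 1800-2099, not embedded in a longer run of digits.
YEAR_LIKE = re.compile(r"(?<!\d)(1[89]\d\d|20\d\d)(?!\d)")


def year_like_counts(pages):
    """Count occurrences of each year-like 4-digit sequence across page texts."""
    counts = Counter()
    for text in pages:
        counts.update(int(match) for match in YEAR_LIKE.findall(text))
    return counts
```

As noted above, a count like this cannot tell a year from a price or a quantity, so round numbers such as 2000 will be overrepresented.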

What is the future for personal analytics? There is so much that can be done. Some of it will focus on large-scale trends, some of it on identifying specific events or anomalies, and some of it on extracting “stories” from personal data.

And in time I’m looking forward to being able to ask Wolfram|Alpha all sorts of things about my life and times—and have it immediately generate reports about them. Not only being able to act as an adjunct to my personal memory, but also to be able to do automatic computational history—explaining how and why things happened—and then making projections and predictions.

As personal analytics develops, it’s going to give us a whole new dimension to experiencing our lives. At first it all may seem quite nerdy (and certainly as I glance back at this blog post there’s a risk of that). But it won’t be long before it’s clear how incredibly useful it all is—and everyone will be doing it, and wondering how they could have ever gotten by before. And wishing they had started sooner, and hadn’t “lost” their earlier years.


Comment added April 5:

Thanks for all the great comments and suggestions, both here and in separate messages!

I’d like to respond to a few common questions that have been asked:

How can I do the same kind of analysis you did?
Eventually I hope the answer will be very simple: just upload your data to Wolfram|Alpha Pro, and it’ll all be automatic. But for now, you can do it using Mathematica programs. We just posted a blog explaining part of the analysis, and linking to the source for the Mathematica programs that you’ll need. To use them, of course, you’ll still have to get your data into some kind of readable form.

What systems did you use to collect all the data?
Different ones at different times, and on different computer systems. For keystroke data, for example, I used several different keyloggers—mostly rather shadowy pieces of software marketed primarily for surreptitious uses. For the phone call data, all my landline phones have always been connected to our company phone system (originally a PBX, now a VoIP system), so I was able to use its built-in logging capabilities. For email, I had a script set up as part of our company email system back in 1989 that forks off a copy of all my messages, and sends them to an archive. This script has had to be updated quite a few times over the years when we’ve changed email systems.

How does your treadmill setup work?
It’s pretty straightforward. I have a keyboard mounted on a board that attaches to the two side rails of the treadmill. I’ve carefully adjusted the height of the keyboard, and I’ve put a gel strip in front of it, to rest my wrists on. I have the mouse on a little platform at the side of the treadmill. And I have two displays mounted in front of me. I’ve sometimes thought about developing some kind of kit to let other people “computerize” their treadmills… but it’s seemed too far from my usual business. (And when I first had the treadmill set up, I was still a bit embarrassed about my impending middle age, and need for exercise.)

With everything you have going on, do you find time for your family?
Happily, very much so. It’s helped a great deal that I’ve always worked at home, so when I’m not actively in the middle of working, I can spend time with my family. It’s also helped that I’ve been very consistent for a long time in taking an extended dinner break with my family (that’s the 2.5 hour gap visible in the early evening in most of my plots). In the blog, I concentrated on work-related personal analytics; I have quite a lot more that’s family oriented, but I didn’t include this in the blog.

Join the Discussion

  1.  

    What is this distribution? Is there a simple model for it? I don’t know.

    Yes, there is. The Skewness ( http://en.wikipedia.org/wiki/Skewness ) may be easily presented by a Weibull Distribution etc.

    Max
  2.  

    This is a great example of what others have been calling the quantified self movement: http://quantifiedself.com/. As with many things, it looks like you were a pioneer here too. Looks like Mathematica is a great tool for it.

  3.  

    Thanks for sharing! Amazing and inspiring.

    If it’s not too personal to ask, I’m curious: What happened in 2002 on the first plot?

  4.  

    “I actually assumed lots of other people were doing it too” — you’re not alone! I logged my own keystrokes for a year and found some interesting patterns as well. http://flickr.com/photos/kylemcdonald/4724722714/ One thing I discovered is that my “natural” cycle is about 5 minutes longer than a day, so the time is always increasing.

    I also released the data for anyone else interested in experimenting with visualizations https://code.google.com/p/keytweeter/downloads/detail?name=keytweeter-archive.zip&can=2&q=

  5.  

    I have an inexplicable affinity to time series data. Time is a natural ordering that comforts us. I am addicted to fantasy sports (the data from sports I have not played or cared much about). Part of the allure is that time-ordered data yield clear or illusory patterns to which we ascribe meaning.

    Will the capability to collect and visualize untold personal data make us incrementally more introspective? I wonder if we will routinely recognize false patterns in our data that reinforce our biases. I also wonder if we will routinely pick and choose data that jibe with the personal narrative our egos continually burnish.

  6.  

    Your graphs look great. Calls are a very useful item to keep track of. What about tracking who you are talking with, and what you talk about? That’s what I do. http://quantifiedself.com/2011/08/bryan-bishop-on-meetlog/

    Another interesting project is fenn lipkowitz’s every-moment-tracking: quantifiedself.com/2011/12/fenn-lipkowitzs-amazing-lifelog/ or http://quantifiedself.com/2010/02/a-remarkable-life-logging-proj/

    On a related note, according to my ‘meetlog’, I’ve talked with Stephen before, something about a singularity-related conference it seems.

    - Bryan
    http://heybryan.org/

  7.  

    Amazing post. These are exactly the sort of insights into life we are aspiring to allow anyone to gain with the product we are building at brainpage.com. Inspiring to see this, indeed!

    Our basic thesis is that companies are using all sorts of personal data to better serve *their* needs. That data should better serve *our* needs – let’s see what we are actually doing and set goals to live our lives closer to what we aspire to.

    It’s the second point I’m curious about. Has seeing/analyzing this data caused any changes in what you are doing? How are you using this data?

  8.  

    Impressive.

    Montaigne is the best analogue to this perhaps whimsical but landmark project

    Though Feltron is famous for this type of tracking these days

  9.  

    I have email from the last 4 years in my portable Thunderbird installation.
    How are you coping with and keeping your email from the last 20+ years? Can you give us some technical insights?

    Mike
  10.  

    This comment is for Mr. Wolfram if he ever sees it (which I dearly hope he does). I’m a high school student looking at future careers, specifically one in Quantum Computing. I wanted to get Mr. Wolfram’s personal opinion on this field since it is close to his area of expertise.

    Daniel
  11.  

    “What is this distribution? Is there a simple model for it? I don’t know.”

    Another candidate is the gamma distribution (http://en.wikipedia.org/wiki/Gamma_distribution#Applications), which allows for skewness and is easier to work with than the Weibull distribution.

    Justin
  12.  

    As someone who’s trying to optimize his life better, what strikes me the most is this conclusion:

    For my consistent experience has been that the more routine I can make the basic practical aspects of my life, the more I am able to be energetic—and spontaneous—about intellectual and other things.

    This reminds me of the book Uncertainty that I’m reading. Very interesting indeed.

  13.  

    I’d love if there were seasonal effects to your patterns and rhythms. I definitely think they’d emerge if I were able to do this analysis on my own life.

    This is pretty stunning!

  14.  

    Mr Wolfram,

    I would like a (private and free) platform to be built so that I can input this data for my own life. I am 28 years old and the only tool I have to do such things is Facebook but that data is not owned by me and I would like a platform in which I would own my own data so that I could run my own analytic software on it.

    Is there such a thing? Or would you and your team be willing to build such a thing for us data fans :)

    “Data is the lifeblood of our world”

    - Renzo

    Oscar Renzo Guiulfo
  15.  

    An excellent article.

    I think the popularisation of collection and dissemination of personal data is on the brink of hitting the big time. Facebook’s new Timeline UI, inspired by Nicholas Felton, is completely targeted at achieving this, and if anyone can push something into mainstream culture it’s the company with 850 million users!

  16.  

    One may try to fit the Gamma distribution for a skewed non-negative bell-like curve. It is covered by a GENMOD proc in SAS, for example.

  17.  

    @StatsTrade – looks like a move west to me.

    Joseph Peters
  18.  

    You’re kind of a nutjob .. in a good way :)

    Jim
  19.  

    I can’t even comprehend the attention to detail to log _every_ keystroke for such a long timeframe. That in itself is an amazing feat and something that I’m sure very few humans could claim. I almost wish I could go back in time and start a personal analytics data collection on myself, but I do feel as if by the time the data becomes useful I might have changed too much as an individual to make use of it. What say you when it comes to the usefulness of personal analytics for fine tuning one’s life?

  20.  

    Could you elaborate on some of the technology you used to collect this data? For example: how have you scanned 230,000 pages of paper? Have you used the same email client and format over this period of time, or have you migrated old formats forward? How do you keep your keystroke data secure?

  21.  

    +1 to the others who have thanked you for sharing. Personal data mining is certainly quite an interesting niche.

    If you don’t mind sharing, how have you set up your treadmill desk? I’ve been interested in pursuing this as well, but haven’t yet seen a good enough example to justify the time/energy.

  22.  

    Interesting delete key ratio. Do you count key repeats as one? (holding down the delete key to erase a whole word for example) Using CTRL+backspace as a matter of practice for deleting an entire word would not inflate the key ratio.

  23.  

    Can you do an analysis and see how many times you’ve hit Control + Alt + Delete since 2002?

  24.  

    I don’t mean to be flippant, but what valuable insight did you gain from this exercise? You probably already knew that you don’t send emails when you’re sleeping.

  25.  

    This is very interesting and inspiring. Could you please speak to how you have captured and recorded all of this data? I ask because I assume in the span of times that are represented the devices and methods used must have evolved several times.

    Andrew Hanlon
  26.  

    @StatsTrade just read the article: in 2002 he finished his book.

    Sævar
  27.  

    @StatsTrade: “The rather dramatic discontinuity in 2002 is the moment when A New Kind of Science was finally finished, and I could start leading a different kind of life.”

    Like David Clausen said above, those interested in this should check out the Quantified Self movement. Even better, the next stage of its evolution: Programmable Self. http://radar.oreilly.com/2012/01/programmable-self-motivation-hacks-digital-data.html

  28.  

    I am one of those “other people”. My data set isn’t quite as extensive as yours, since it only goes back to the late 1990s. See http://sluggish.dyndns.org/wiki/Emailgraph

    Valence graphs can show up significant life events fairly well, and it would be interesting to do other sorts of semantic analysis, such as word frequency, average word length, past vs future tense and so on.

  29.  

    This is fascinating and it’s obviously just the beginning. I’d love to read more in a series on this, particularly when you do get to more of the projections/predictions. It’s not clear to me how far you can reliably take that aspect of it. And without the drive to affect the future it’s not clear to me that any of this analysis would be worth it even if I had this data for my own activities.

    I’m curious: what happens if you divide the plot in the third row of your “average daily rhythms” histograms by the second row? With some simplifying assumptions, it looks like you tend to write shorter e-mails in the morning.

    Brooke
  30.  

    Relevant: http://daytum.com/

    Shaun Culver
  31.  

    Hi Stephen,

    This is really amazing. I’ve tried to keep a record of all the emails I’ve written and sent since about 1998. I mostly did it in case I needed at some point to go back and retrieve an email. If one looks at how our behaviour has changed over time, how technology is influencing and changing our behaviour, and what it means when we write and blog more or less, comparing it to events in our lives could be really interesting. Could it be predictive? I suspect yes. The scary part is if our online patterns are understood over a million or 100 million people. Facebook certainly already has a lot of this data and they know it. If they can predict a breakup, what else might they be able to predict? We might be at the beginnings of really understanding the human computer program and we might soon realize how to make it work better for us.

  32.  

    Whoa. This is brilliant. !

    Saad Bhamla
  33.  

    I built a tool to do exactly this for the first Android Developer’s Challenge back in 2008. It generates exactly the same graphs as a large number of the graphs shown in this blog post, right down to the heatmap visualizations, and several more. It also provided a completely generic data normalization and plotting framework so you could incorporate either data streams available on the phone, or other data streams for the “quantified self” types (such as FitBit, which I think didn’t exist back then). I didn’t place in the ADC, and then the code sat neglected on a hard drive somewhere. This blog post has inspired me to dust it off again though. I called it “LifeScope”, although it looks like that name is taken by multiple companies now. I’ll try to get my code out there soon though so it can be useful to people. Thanks for the inspiration to do so.

  34.  

    All these graphs seem to be pointing to one thing – somewhere around mid 2009, you were using amphetamines.

    Seriously, I think you’re crazy for keeping tabs on yourself for so long, but – it’s a great read and I wish I had done one myself (although I’m not about to start now).

    In a few years, Facebook will probably provide graphs just like these (about us).

    Joshua Davison
  35.  

    I was experiencing intellectual euphoria reading this post!!!
    –Ed

  36.  

    I, too, am curious about what happened in 2002 on the first plot. Time zone shift? Also, can you explain how you logged all this data?

    Sarabande Go
  37.  

    Excellent all around; thanks so much for posting this.
    I was wondering whether you would mind sharing some specific details on the setups, e.g. how you logged phone calls, what pedometer you use (all I know simply count steps, not log times), etc.

    GustavoG
  38.  

    I am fully impressed by a lifestyle that involves tons of night work! By the way, does the 6pm to 8pm break that appears in most of the graphs indicate that you are having dinner and some quality time with your family?

  39.  

    I’d love to know the tools you employed to log all this data. I’d be very interested in doing the same thing. To a lesser scale I’ve been running an app called OS track which keeps track of system resources and what applications are using them, and Finch which tracks time spent in applications… great post.

  40.  

    “What is this distribution? Is there a simple model for it?”

    Maybe a negative binomial distribution. The model would be that each email is a trial, and once a threshold of cumulative successes or failures is reached, you go to sleep.

    http://www.math.uah.edu/stat/applets/NegativeBinomialExperiment.html
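
The model described above can be tried out with a small simulation. This is only a sketch under stated assumptions: the function names, the threshold of 5, and the success probability of 0.1 are all hypothetical and are not fitted to the data in the post. Each email is treated as an independent trial, and the day ends once a fixed number of "successes" has accumulated, so the daily email count follows the trial-count form of the negative binomial distribution.

```python
import random


def emails_until_sleep(threshold, p_success, rng):
    """Count emails (trials) sent until `threshold` successes accumulate.

    `threshold` and `p_success` are hypothetical model parameters.
    """
    if threshold < 1 or not 0 < p_success <= 1:
        raise ValueError("need threshold >= 1 and 0 < p_success <= 1")
    emails = 0
    successes = 0
    while successes < threshold:
        emails += 1  # one more email sent
        if rng.random() < p_success:
            successes += 1  # this trial counted toward the threshold
    return emails


def simulate_days(n_days, threshold, p_success, seed=0):
    """Simulate the number of emails sent on each of `n_days` days."""
    rng = random.Random(seed)
    return [emails_until_sleep(threshold, p_success, rng) for _ in range(n_days)]


if __name__ == "__main__":
    counts = simulate_days(10_000, threshold=5, p_success=0.1)
    mean = sum(counts) / len(counts)
    # Theory: the mean number of trials is threshold / p_success.
    print(f"simulated mean emails per day: {mean:.1f} (theory: {5 / 0.1:.1f})")
```

Comparing a histogram of the simulated counts with the observed emails-per-day distribution would be one rough way to check whether the model is plausible.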

    Rajan Lukose
  41.  

    How is the data pre-processed? I am assuming Mathematica doesn’t just snoop into one’s email and collect the data, so maybe some other software prepares it and formats it into an easier-to-access file. Any comments on this aspect, Professor?
    I think it’s not too late for the rest of us to start doing this to learn more about ourselves as we look back years from now. It is truly visionary to collect this kind of data years before adequate software was available. Great post!

    Fernando Sanchez
  42.  

    Stephen, thanks for this! Finally, more examples of QS meets cognitive tracking. I am extremely curious about how I grow – how anyone grows – as a person (human mind) over time.

    The most interesting metric, in my opinion, is the analysis of your adoption of new words. How do language and its evolution into phrases, concepts, and ideas reflect the changing capability of a mind to interact with itself, others, and the world at large?

    I’ve been trying to discover what activities lead to developments in side projects or yield new ideas. Topically, say I study the biochemistry of the sense of taste and then think something new about how the body works. What is that, how does it impact the future when I think about that topic again?

    Imagine if you had data of categories like “most interesting thing I learned today” or “most beautiful” — I want to track everything.. it seems so difficult to capture the abstract. Imagine a library where qs’ers could post, explore, interact with each other and learn from the data sets and self analytics of others.. I wonder. Successful post! It’s made me think :)

    You’re right, this does give a new dimension to life. Aren’t you excited to experience what unfolds as more people begin to explore it?

    Cheers
    Amy

  43.  

    Fascinating! Are there any particular programs that you recommend for collecting the data?

    josh
  44.  

    Amazing and insane! Wow, I’m kinda blown away.

    This is astonishing for me, mostly because I’m a heavily future-focused person, and typically don’t focus on the past, or sometimes even remember large chunks of it (dates, some events).

    Vive la difference :-)

  45.  

    This is really very amazing!

    Can you please tell me what tool you used to log your activities? I want to do that very seriously too, so that I know exactly what I am doing in my life and how I am doing it (or did it, to see the past from the future).

    Gautam
  46.  

    Very interesting.

    How about mouse travel? Application time? I took a year or two observing my computer usage (early 2000s), and noticed a significant increase in time spent with different browsers. I also recorded my mouse going around the world :P

    MH
  47.  

    This is a brilliant post, thanks for sharing! I wish I had started compiling such an enormous amount of data long ago. If we think about it, it’s actually surprising that nobody does: we measure every nickel and dime in our businesses, yet when it comes to personal analytics, nobody bothers.

    On a smaller scale, I’ve tried for many years to measure a lot of things as well. I’ve written up some results in a blog post (it’s in German, however): http://gefahrgutblog.de/2012/02/22/ich-messe-alles-ein-komplex/

    Thanks again for those great insights!

  48.  

    You may be interested in Nick Felton (www.feltron.com), who’s been collecting a similarly impressive amount of data about himself (though probably slightly different kinds of data), and cataloging and presenting it in some interesting ways.

    Rory
  49.  

    In fact, it looks like he just linked to this post on his blog…

    Rory
  50.  

    Judging from the third-to-last graph, he finished A New Kind of Science in 2002. That accounts for the change in sleep schedule in the first graph and the shift to earlier meetings after publishing.

    steve b