I had an idea the other day while walking the dog: I could get a sense of how often I make major progress by checking the times at which documents in my home directory get saved. Most of my working time falls during the standard work day. I started a policy a while ago during grad school that I would work 9-5 without fail, although recently that’s been upended a bit, but that’s post-doc life.
Anyway, this took only about 30 minutes of coding to get off the ground, and resulted in this code:
library(plyr)
library(ggplot2)
all.files
# This cleans some bulk import files with common timestamps, they screw up the analysis!
all.files
colnames(out.time) <- 'saved'
ggplot(aes(x=saved), data=out.time) +
  geom_rect(aes(xmin=9, xmax=17, ymin=0, ymax=3500), fill=I('red'), alpha=0.02) +
  geom_histogram() +
  xlab('File Save Time')
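The data-gathering step can be sketched in base R roughly as follows. This is a guess at the general approach, not the original code: the function name `save_hours`, its `max_shared` threshold for dropping bulk imports, and the choice of `file.info()` modification times are all assumptions made for illustration.

```r
# Hypothetical sketch: collect the hour of day at which each file under `dir`
# was last modified. All names and defaults here are assumptions.
save_hours <- function(dir = "~", max_shared = 5) {
  # Every file below `dir`, with full paths so file.info() can find them.
  files <- list.files(dir, recursive = TRUE, full.names = TRUE)
  mtimes <- file.info(files)$mtime
  mtimes <- mtimes[!is.na(mtimes)]

  # Assumed cleaning rule: a timestamp (to the second) shared by more than
  # `max_shared` files is treated as a bulk import and dropped.
  key <- format(mtimes, "%Y-%m-%d %H:%M:%S")
  counts <- table(key)
  keep <- as.vector(counts[key]) <= max_shared
  mtimes <- mtimes[keep]

  # Express each save time as a fractional hour of the day (local time).
  lt <- as.POSIXlt(mtimes)
  data.frame(saved = lt$hour + lt$min / 60)
}
```

Calling `out.time <- save_hours("~")` gives a data frame with a single `saved` column on a 0 to 24 hour scale, which is the shape the `ggplot` call above expects. The threshold of 5 is arbitrary and would need tuning for a real home directory.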
The data show what I would expect. During the day I work progressively more, taking a break right before lunchtime (when I do lots of saving!), and then again at 3pm. There’s a bit of a tail going to midnight and one o’clock, which is presumably evidence of some late-night cramming. That is probably more from my grad studies than my post-doc, although there have definitely been some all-nighters.
I think this is actually a pretty interesting dataset that might be useful for broader temporal modelling. Adding information about the day of the week would be pretty cool, and it would be a great introductory dataset for students. They all have this data, it is fairly easy to code (the whole thing is really two lines, with a bit of cleaning), and they’d have personal knowledge of the dataset. It would also be pretty noisy, so there would be a good opportunity to talk about noise to some degree.
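As a rough illustration of the day-of-the-week idea, here is a minimal base R sketch that cross-tabulates save times by weekday and hour. The function name `save_table` is hypothetical, and it assumes the input is a vector of `POSIXct` modification times such as `file.info(files)$mtime`.

```r
# Hypothetical sketch: count saves by weekday and hour of day.
# `mtimes` is assumed to be a POSIXct vector of file modification times.
save_table <- function(mtimes) {
  lt <- as.POSIXlt(mtimes)
  # POSIXlt stores the weekday as 0 (Sunday) to 6 (Saturday).
  days <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")
  weekday <- factor(days[lt$wday + 1], levels = days)
  # Fix the levels so hours with no saves still appear as zero counts.
  hour <- factor(lt$hour, levels = 0:23)
  table(weekday, hour)
}
```

The result is a 7 by 24 table of counts, which could be drawn as a heat map or split into one histogram per day. Using fixed English day labels rather than `weekdays()` keeps the output independent of the locale.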
So, what else do you think could be done with the dataset? If you try it yourself, can you think of other ways to display the data that would be interesting?