Big Data has been a Thing™ for over a decade now. Google published the GFS paper in 2003 and the MapReduce paper in 2004. Hadoop started a few years later. Interestingly, though, the Google Trends curve for “Big Data” didn’t start to take off until 2011:
Why is that? Well, one clue is that Big Data grows along with, and is even eclipsed by, the rise of AWS, as this trend comparison shows (AWS in red):
That makes sense: Big Data took off once cloud infrastructure became cheap and flexible enough for the masses.
That said, not all data is Big Data. Some data is, well, Small Data. The Web abounds with thought leadership jockeying over the “big vs small data” question. For my purposes, the difference will be considered self-evident.
For me, Twitter is a perfect source of small data to analyze. It lets me reflect on my behavior in a structured, quantitative way. Such an analysis is complementary to the off-the-cuff reasoning that happens inside my head.
In 2017, I tweeted a whopping 21 times. In hindsight, I’d say my main reason for not tweeting much is what Facebook calls “meaningful interaction”, or lack thereof, in my case. Facebook just tweaked its News Feed algorithm again precisely to address this. I think they’re spot on.
With that hypothesis on the table, one straightforward solution suggests itself, before even looking at any data: spend more time engaging with my friends, and try to engage in tweet discussions around my interests, such as coding and design.
But let’s look at the data to be sure. First, I manually categorized my 2017 tweets by topic:
Date        Topic
-------------------------
01/07/2017  Design
01/08/2017  News
01/09/2017  Hamburger Eyes
01/13/2017  Design
01/20/2017  Startups
02/07/2017  Misc
02/14/2017  Hamburger Eyes
02/15/2017  Video Games
03/03/2017  Startups
04/10/2017  Apple Music
04/11/2017  Apple Music
04/13/2017  Misc
04/19/2017  Apple Music
04/27/2017  Apple Music
05/11/2017  Apple Music
05/17/2017  Apple Music
05/30/2017  Startups
06/22/2017  Music
06/28/2017  Music
07/25/2017  Apple Music
08/22/2017  Tech
Then, I used my blunderingly naive pandas and matplotlib skills to munge and plot the data:
import matplotlib.pyplot as plt
import pandas as pd

# Omitting a bunch of data and plot munging for brevity. Left as an
# exercise for the reader, or ping me on Twitter for advice.
data = pd.read_csv("2017_tweets.csv")

# tweet_dates is not defined here; it is a product of the omitted munging,
# presumably the tweets with a "Month" column derived from each date.
month_groups = tweet_dates.groupby("Month")
counts_by_month = month_groups.agg("count")
counts_by_month.plot(kind="bar")
plt.show()

# The "Count" column plotted below also depends on the omitted munging.
topic_groups = data.groupby("Topic")
counts_by_topic = topic_groups.agg("count")
counts_by_topic.plot(kind="pie", y="Count")
plt.show()
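For readers who want to fill in the omitted munging, here is a minimal sketch. It swaps pandas for the standard library, which is plenty for 21 rows, and it assumes the rows from the table above are pasted inline rather than read from 2017_tweets.csv. The names TWEETS, count_by_month, and count_by_topic are made up for illustration and are not from the original script.

```python
from collections import Counter
from datetime import datetime

# The 21 categorized tweets from the table above, as (date, topic) pairs.
TWEETS = [
    ("01/07/2017", "Design"),
    ("01/08/2017", "News"),
    ("01/09/2017", "Hamburger Eyes"),
    ("01/13/2017", "Design"),
    ("01/20/2017", "Startups"),
    ("02/07/2017", "Misc"),
    ("02/14/2017", "Hamburger Eyes"),
    ("02/15/2017", "Video Games"),
    ("03/03/2017", "Startups"),
    ("04/10/2017", "Apple Music"),
    ("04/11/2017", "Apple Music"),
    ("04/13/2017", "Misc"),
    ("04/19/2017", "Apple Music"),
    ("04/27/2017", "Apple Music"),
    ("05/11/2017", "Apple Music"),
    ("05/17/2017", "Apple Music"),
    ("05/30/2017", "Startups"),
    ("06/22/2017", "Music"),
    ("06/28/2017", "Music"),
    ("07/25/2017", "Apple Music"),
    ("08/22/2017", "Tech"),
]


def count_by_month(tweets):
    """Return a Counter mapping month number (1-12) to tweet count."""
    return Counter(
        datetime.strptime(date, "%m/%d/%Y").month for date, _topic in tweets
    )


def count_by_topic(tweets):
    """Return a Counter mapping topic name to tweet count."""
    return Counter(topic for _date, topic in tweets)


if __name__ == "__main__":
    by_month = count_by_month(TWEETS)
    # A Counter returns 0 for missing keys, so silent months print empty bars.
    for month in range(1, 13):
        print(f"{month:02d} {'#' * by_month[month]}")

    for topic, n in count_by_topic(TWEETS).most_common():
        print(f"{topic:<15} {n}")
```

Printing each month as a row of # characters gives a rough text version of the bar chart, with no plotting library required.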
Here is the histogram by month:
Those results are somewhat funny. The January high of 5 is due to my renewed efforts at the start of every year to try and engage more on social media. Clearly, that effort loses steam as the year progresses. I’d guess the matching high in April was due to being done with taxes. Thus relieved and relaxed, I started listening to more music again. When I’m stressed, I tend to pick an album and listen to it on repeat.
Why no Twitter in the last 4 months of the year? Not sure. There are a lot of circumstantial factors. I changed teams at work, for example. But nothing adds up convincingly. Let’s leave it as a small mystery for now. It’s more fun that way, anyways.
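The two highs and the quiet stretch at the end of the year can be checked with a few lines of arithmetic. This is only a sketch: the monthly counts are tallied by hand from the date table above, and the names TWEETS_PER_MONTH, peak_months, and trailing_silent_months are hypothetical.

```python
# Tweets per month in 2017, tallied by hand from the date/topic table.
# Months with no tweets are simply absent.
TWEETS_PER_MONTH = {1: 5, 2: 3, 3: 1, 4: 5, 5: 3, 6: 2, 7: 1, 8: 1}


def peak_months(counts):
    """Return the months that share the highest tweet count."""
    top = max(counts.values())
    return sorted(month for month, n in counts.items() if n == top)


def trailing_silent_months(counts, months_in_year=12):
    """Count the months after the last month that has any tweets."""
    last_active = max(month for month, n in counts.items() if n > 0)
    return months_in_year - last_active


print(peak_months(TWEETS_PER_MONTH))             # [1, 4]
print(trailing_silent_months(TWEETS_PER_MONTH))  # 4
```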
Here is the pie chart for the topic breakdown:
(Due to the height of my laptop screen, the pie ended up being an oblate spheroid #lol).
The biggest topic is Apple Music, which covers the times I share the song I am listening to straight from the Music app. None of these tweets drive interactions, but I like them as a form of self-expression, so posting them still satisfies me.
Another notable topic is Hamburger Eyes, a fantastic photography magazine I discovered over a decade ago. I should retweet their work more often. Also there is Design. I like having armchair design opinions, even though I do not practice.
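To put numbers on the topic breakdown, the sketch below converts per-topic counts into percentage shares. The counts are tallied by hand from the table of categorized tweets, and TOPIC_COUNTS and topic_shares are illustrative names, not part of the original script.

```python
# Tweets per topic in 2017, tallied by hand from the date/topic table.
TOPIC_COUNTS = {
    "Apple Music": 7,
    "Startups": 3,
    "Design": 2,
    "Hamburger Eyes": 2,
    "Misc": 2,
    "Music": 2,
    "News": 1,
    "Tech": 1,
    "Video Games": 1,
}


def topic_shares(counts):
    """Return (topic, percent) pairs, largest share first, ties by name."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(topic, round(100 * n / total, 1)) for topic, n in ranked]


for topic, percent in topic_shares(TOPIC_COUNTS):
    print(f"{topic:<15} {percent:5.1f}%")
```

Apple Music comes out at 7 of 21 tweets, or one third of the year's output.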
What’s missing? Well, I had no tweets about coding. I like coding, more or less, but I haven’t really found discussing it on Twitter worthwhile. I prefer an in-depth book or article. As an experiment, I am now following more coding folks to see if this year will be any different. Also, no interactions with friends. Noted; will make a better effort this year.
Finally, my favorite tweet of last year is:
I'm the obscene slang kicker with no parental sticker / Advisin' y'all that wise words is much slicker -GZA
— Peter Skirko (@pskirko) June 22, 2017
So there you have it.