I’ve been doing some simple, useful, and fun #DataAnalysis and #DataViz for The Hardanger Fiddle Association of America (HFAA). The HFAA is a nonprofit organization of which I am a member and a board member.
The Hardanger fiddle is the national folk instrument of Norway.
The HFAA is deciding whether to host its annual workshop in person in 2022. This would be the first in-person workshop since 2019 due to the pandemic. This is a complex decision that the board takes very seriously. I’ve been helping with the decision by performing some simple analysis on the HFAA's data.
Sample questions include:
Would the necessary travel during a pandemic reduce attendance? The workshop is located in a remote and rural area of Wisconsin, yet typically attracts attendees from all over the USA and internationally. I compared the historical geographic distribution of workshop attendees to the geographic distribution of the general membership. Not surprisingly, attendance is markedly highest among those within a day’s driving distance of the venue. Air travel is both a financial barrier and a safety barrier to workshop attendance, and it must be considered.
How many teachers would we need to create smaller/safer class sizes? Classes typically range from 5-15 students in one fiddle class to 50 students in one dance class. Often, these classes are held in a small room or the shared space of the banquet hall. By comparing the ratios of students across interests and levels, we can decide whether we’d need to hire additional teachers, and how many, in order to create smaller class “pods” as a way to increase safety.
What are the most popular classes? Typical classes include hardingfele/fiddle, dance, kveding/singing, seljefløyte/reed flute, munnharpa/mouth harp, luthiery, and cultural lectures. If we decide to present a hybrid workshop, we'd only be able to stream one class at a time from the main classroom, because it is the only space with the WiFi capabilities to stream content. Which topics would be most in demand, and how much class time should we plan for each?
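To make the class-size question concrete, here is a minimal Python sketch of the arithmetic. The class sizes are just the typical upper figures mentioned above, and the pod cap of 8 students is a made-up assumption for illustration, not a number the board has chosen.

```python
import math


def teachers_needed(students: int, max_pod_size: int) -> int:
    """Return how many pods (one teacher each) keep every pod at or under max_pod_size."""
    if max_pod_size <= 0:
        raise ValueError("max_pod_size must be positive")
    return math.ceil(students / max_pod_size)


# Illustrative sizes only: the top of the typical fiddle range and a typical dance class.
class_sizes = {"fiddle": 15, "dance": 50}

# Hypothetical safety cap; the real value would be a board decision.
MAX_POD_SIZE = 8

pods = {name: teachers_needed(size, MAX_POD_SIZE) for name, size in class_sizes.items()}
print(pods)
```

Rounding up matters here: a class that is even one student over a multiple of the cap needs another pod, and so another teacher.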
Unfortunately, I had to do A LOT of data cleaning before I could get into my analysis. Several different volunteers have managed the membership database and past workshop registration lists over the years. (Bless their hearts for their hard work!) Naturally, everyone having their own "system" for their volunteer duties created very messy data. Ultimately, I had to do a lot of data cleaning manually... but fortunately, it was a small data set with only ~500 rows. Here are some examples of cleaning needs I encountered:
I figured out what an “x” meant -- a yes or a no? -- and made that a binary variable across many fields.
I cleaned address data because of discrepancies in address formats and fields across countries.
As someone who’s been involved with the organization for almost a decade, I was able to fill in a lot of missing data from what I knew. For example, is someone predominantly a dancer or a fiddler, or what state do they live in?
I decided how to handle fields with inconsistent values or null values.
I found and corrected small typos that made joins difficult.
Because I was working with personally identifiable data, I deleted unnecessary and sensitive fields from the Excel sheet I used in my analysis.
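For anyone who would rather script a couple of these steps than do them by hand, here is a minimal Python sketch. The records, field names, and helper functions are all made up for illustration, and it assumes that an "x" means yes and a blank means no, which may not hold for every field in a real sheet.

```python
def mark_to_bool(value) -> bool:
    """Treat an 'x' (any case, spaces ignored) as yes; anything else, including blank or None, as no."""
    return isinstance(value, str) and value.strip().lower() == "x"


def normalize_key(name: str) -> str:
    """Collapse extra whitespace and ignore case so near-identical names join on the same key."""
    return " ".join(name.split()).casefold()


# Hypothetical rows standing in for a membership list and a registration list.
members = [{"name": "Anna  Olsen ", "fiddler": "x", "dancer": ""}]
registrations = [{"name": "anna olsen", "year": 2019}]

# Index members by their normalized name, then look up each registration.
members_by_key = {normalize_key(m["name"]): m for m in members}

joined = []
for reg in registrations:
    member = members_by_key.get(normalize_key(reg["name"]))
    if member is None:
        continue  # no match: a candidate for manual review
    joined.append({
        "name": " ".join(member["name"].split()),
        "year": reg["year"],
        "fiddler": mark_to_bool(member["fiddler"]),
        "dancer": mark_to_bool(member["dancer"]),
    })

print(joined)
```

Normalizing only whitespace and case will not catch true misspellings, so unmatched rows still need a human look.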
Using Excel and Tableau, I produced statistics and data visualizations. I will share these results with the board in advance of our next meeting and take a few moments in the meeting to show and talk through each. Unfortunately, I could not analyze many of the things I was curious about because the data the HFAA collects is so simple. This is not surprising given the limited resources and reliance on volunteer time that are common in the nonprofit world. For example, the HFAA has no demographic information, such as age or gender, about its members. I will suggest that we begin collecting this with membership renewals. As the organization continues to grow and evolve to meet changing interests and needs, I can see that information becoming more and more useful. Today I had limited fields to work with, and I had to make a few inferences, but I managed to uncover some helpful results.
As for whether we should host an in-person workshop, I have formed my personal opinions and ideas based on the results of my analysis. I am curious to get the rest of the board's interpretation once I show them my work. Our treasurer is putting together a financial analysis addressing questions such as typical workshop expenses and the minimum attendance needed to break even on workshop costs. I am very interested in adding that context!
Having some hard numbers and understandable visualizations will help the board form a consensus and make a wise decision. This was a fun side project that only took a few hours of my free time. While it is very simple in the context of what I can do for analysis and visualization, I am proud of myself for my hard work manually cleaning the dataset and for the insights I was able to uncover from very simple data. I know my results will provide great value to the HFAA, which was my goal, and I enjoyed doing it!
Here is one of my favorite vizzes... and NOT because of the pie charts... This is my favorite because of the story it tells. Looking at the progression among these pie charts, it is clear that many new beginner fiddlers (light blue) start each year. As they learn the instrument, they progress through each level (dark blue to green), and the volume of intermediate and advanced fiddlers grows through the years. The HFAA has a steady stream of new fiddlers eager to begin, which is evidence that outreach programs are working! The growing volume of intermediate and advanced fiddlers is evidence that new players enjoy learning the instrument and being members of the HFAA community so much that they continue!