6 min read

Who are the Philly R-Ladies? Analyzing Data from Google Forms

Why do a new member survey?

One of the goals of R-Ladies Philly is for the group to be focused on the needs and interests of its members. With this goal in mind, we created a survey to query members about meeting logistics and R experience level. This survey also provided a mechanism to collect emails in order to invite new members to our Slack channel.

If you want to join us and fill out the survey, you can find it here: http://bit.ly/rladies-survey

I thought it would be fun to share the results. Because this is R-Ladies, I also wanted to analyze and visualize the data in R!

Step One: Get the data from google to R

There are probably many ways to accomplish getting the data from a google form into R. The first things that came to my mind were:

  1. Download a file from google to some local place on my computer
  2. Use an R package to pull the data from google docs to R

I decided to try the googlesheets package: https://github.com/jennybc/googlesheets

I found this package very user-friendly and easy to figure out. You will see below that a couple commands get the job done.

What I did here is not fully automated. I did visit the survey on docs.google.com and click the option to “View responses in Sheets”. In the resulting sheet, I removed email addresses to anonymize the respondents.

library(googlesheets)
suppressPackageStartupMessages(library(tidyverse))

The first step is to authenticate with google and see what sheets you have available. The following will open a browser window and ask you to tell google you want to look at your data.

my_sheets <- gs_ls()

I had made a sheet called, “survey_responses_03_09_2018”, that holds the survey data. You can access it through the key: “1g3-M2fMF7_7pLGv7MttZ0DFjr9jd8jr6zQFxJAccEnw”

You can create the googlesheets objects with gs_key() and then pipe this to a data frame with gs_read().

rsurvey <- gs_key("1g3-M2fMF7_7pLGv7MttZ0DFjr9jd8jr6zQFxJAccEnw") %>%
  gs_read()

As of March 9, there were 39 responses from the survey.

Step Two: Make some plots

For each question, let’s look at the results.

How would you describe your current R usage?

question <- "How would you describe your current R usage?"
rsurvey[[question]] <- gsub('(.{1,30})(\\s|$)', '\\1\n', rsurvey[[question]])

ggplot(rsurvey, aes(x=`How would you describe your current R usage?`)) +
  geom_bar() +
  coord_flip() + 
  ggtitle(question) + 
  xlab("") + 
  theme_minimal() 

What location do you prefer for future meetings?

question <- "What location do you prefer for future meetings?"

responses <- rsurvey[, question, drop=FALSE] %>%
  separate(question, sep = ", ", into = c("first", "second", "third"), extra="drop") %>%
  gather(choice, response, na.rm = TRUE)

ggplot(responses, aes(x=response)) +
  geom_bar() +
  coord_flip() + 
  ggtitle(question) + 
  xlab("") + 
  theme_minimal()

What time do you prefer?

question <- "What time do you prefer? (planning on one/month)" 

responses <- as.data.frame(table(rsurvey[[question]], exclude=NULL)) %>%
  arrange(desc(Freq)) %>%
  rename(Response = Var1)
knitr::kable(responses,
             caption = question)
Table 1: What time do you prefer? (planning on one/month)
Response Freq
Weekdays, 6-8pm 18
Saturday, Sunday 10
Weekdays, 6-8pm, Saturday 3
Weekdays, 6-8pm, Saturday, Sunday 2
Saturday, Thursdays after 6 pm 1
Tuesday, 6-8pm 1
Weekdays, 6-8pm, Saturday, NOT Mondays, Thursdays or Sundays. 1
Weekdays, 6-8pm, Saturday, Sunday, Tuesdays and Wednesdays! 1
Weekdays, 6-8pm, Sunday 1
Weekdays, 6-8pm, Sunday, It really depends on my schedule… I work in Delaware, but if I know enough in advance, I can work from home that day. 1

Are you interested in planning a future event or leading an event?

question <- "Are you interested in planning a future event or leading an event?"

ggplot(rsurvey, aes(x=`Are you interested in planning a future event or leading an event?`)) +
  geom_bar() +
  coord_flip() +
  ggtitle(question) + 
  theme_minimal() + 
  xlab("") 

What types of events would you be interested in attending?

question <- "What types of events would you be interested in attending (choose as many responses as you like)?"
# We need to transform the data to see how many times we got each response
responses <- rsurvey[[question]]
choices <- c("Introduction to R workshops",
             "Events with advice on careers in statistics and data science/career panels",
             "Presentations of research that uses R",
             "Community project",
             "Tutorials on advanced R topics",
             "Workshops on specific packages",
             "General networking",
             "Mentoring",
             "Paired code review")
response_table <- as.data.frame(sapply(choices, function(x) length(grep(x, responses)))) %>%
  rownames_to_column()
colnames(response_table) <- c("Response","Freq")
response_table <- arrange(response_table, desc(Freq))
  
knitr::kable(response_table, caption = question)
Table 2: What types of events would you be interested in attending (choose as many responses as you like)?
Response Freq
Workshops on specific packages 31
Presentations of research that uses R 28
Tutorials on advanced R topics 28
Events with advice on careers in statistics and data science/career panels 27
Community project 26
Introduction to R workshops 22
Mentoring 20
General networking 18
Paired code review 15

If you are interested in giving a talk or leading a workshop on something, what would it be?

question <- "If you are interested in giving a talk or leading a workshop on something, what would it be?"

nonempty_responses <- na.omit(rsurvey[[question]])
print_this <- sapply(nonempty_responses, function(x) cat('+ ',x,'\n'))
  • Git
  • I could do an intro to R (basics, functions and loops). But would love to co-teach with someone else.
  • Not sure yet, but this is something I would be interested in doing in the future
  • Statistics-regression, GLM, machine learning, unsupervised/supervised methods, hypothesis testing, very basic text analytics
  • As of now, I guess I’d do something on the American Community Survey package, if folks would find it useful.
  • I am afraid I don’t know enough to give a talk.
  • Factor analysis, Conjoint analysis, hierarchical linear regression with R

My takeaways

Thank you to all the Philly R-Ladies who took the survey!

Our group represents a variety of experience levels - some have never used R and some are R experts. We even had some people volunteer to help with planning and content for future events, which I am really excited about!

Importantly, the survey confirmed that we will continue to meet in center city or university city (thank you to MakeOffices and Drexel who have hosted us).

As far as when we meet, weekdays appear to be the preference, but there is also a lot of interest in a weekend meetup. Our previous events have been on Wednesdays, but we plan to mix it up so that people with conflicts every Wednesday can attend. One practical consideration I have encountered is that it is harder to find a venue for a weekend meetup. Maybe someone wants to email us at philly@rladies.org with location ideas?

This process got me thinking… let’s plan to do another survey at the end of the year to see how the group has evolved.