Text Mining, Diversity Mission Statements Across Several Colleges

So this is my first time using Voyant, and I’m pleasantly surprised by how intuitive it is and how easily I was able to make use of some of its cool features. For my assignment, I wanted to reflect on the current and former academic institutions where I’ve had experience, both professionally and academically.

Throughout my academic journey, I’ve noticed that the term “Diversity” varies greatly depending on the needs and values each institution embodies. Depending on our individual perspectives, the term can be quite broad, making its application in college settings more difficult to track. Going into this project, some guiding questions I pondered were: Would there be differences in the marketed (term) values of diversity between private and public colleges? How would I narrow down the list of colleges? What were my own definitions of diversity, and what aspects are most valued in my own perceived ideal of what diversity means in a college setting?

To begin, I narrowed down my college list to six institutions:
(1) Cuyahoga Community College
(2) Smith College
(3) LaGuardia Community College
(4) Amherst College
(5) CUNY Graduate Center
(6) Hunter College

After entering the URLs of each college’s “Diversity Mission Statement” into Voyant, the first image of “trends” appeared:

To dive in deeper, I made a separate chart of the top 5 terms from each institution’s Diversity Statement, excluding the glaring number of times each institution referenced itself, as seen in the large text of “Smith” & “Amherst” in the image above. While the number of times each college felt the need to reference itself was intriguing in its own right, I wanted to focus on terminology other than the college names, so I excluded them from the following charts. Aside from the college names, here are the Top 5 terms associated with the Diversity Statements of each college:

Reminder: (1) Cuyahoga CC (2) Smith (3) LaGuardia CC (4) Amherst (5) CUNY GC (6) Hunter

As shown above, the top 5 trends are Diversity, Student, Academic, College, and Community. While the obvious mention of diversity did not surprise me, the remaining terms did. So in response, I next wanted to compare these top trends with the five terms (values) I thought were most important in identifying what diversity should mean in a college setting:

Values: Inclusion, Equity, Disabilities, Gender, & Race

As demonstrated in the chart above, these 5 terms were mentioned less than half as often as the top trending terms. Personally, it was greatly disheartening to see that Race, which I had perceived to be the most important aspect of diversity, was mentioned the least of the terms I had selected.
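For anyone who wants to double-check this kind of comparison outside of Voyant, here is a minimal Python sketch of counting a chosen list of terms in each document. It is only an illustration under some assumptions: the `sample_docs` texts are made-up placeholders (not the actual Diversity Statements), the function names are my own, and the matching is exact, so a word like “racial” would not count toward “race”.

```python
import re
from collections import Counter

# The five value terms chosen in this post, lowercased for matching.
VALUE_TERMS = ["inclusion", "equity", "disabilities", "gender", "race"]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def term_counts(text, terms=VALUE_TERMS):
    """Return (raw count, count per 1,000 words) for each term in one text."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty text
    return {t: (counts[t], 1000 * counts[t] / total) for t in terms}


# Made-up placeholder texts, NOT real diversity statements.
sample_docs = {
    "College A": "Equity and inclusion guide our community. Equity matters.",
    "College B": "We value race, gender, and inclusion across our community.",
}

for name, text in sample_docs.items():
    print(name, term_counts(text))
```

Using a rate per 1,000 words alongside the raw count keeps a long statement from looking more committed to a term just because it has more words overall.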

I also thought it was worth explaining why I chose Inclusion & Equity as separate categories. Inclusion can be thought of as being granted permission, or in college terms, “acceptance” into an academic setting, while Equity is what it means to be valued in a space without conforming to the standards and values of that institution. It was through this distinction that one of the most compelling differences between private and public institutional values emerged. According to the chart above, Inclusion was mentioned most at (4) Amherst & (2) Smith, and Equity most at (1) Cuyahoga CC & (2) Smith. While Equity overlaps at Smith College, Inclusion is mentioned there at a remarkably higher rate, especially in comparison to the remaining public colleges, where either term is almost nonexistent.

Another important aspect of the data circles back to Race. While still the most underrepresented term within this category, it was mentioned the most at (3) LaGuardia CC, (5) CUNY GC, and (6) Hunter College, in comparison to (4) Amherst and (2) Smith College, where it was almost invisible on the chart.

I could’ve delved much further into this project, but felt it could easily become overwhelming in Voyant to distinguish the demographics of each college and compare how they might be reflected in the terms prioritized in each Diversity Statement. Still, this is an intriguing indicator of how different colleges determine what terminology best encompasses their missions of diversity. In many ways, marketing diversity is a huge advertisement that entices students from all walks of life with an expected experience, rather than a declaration of Equality within spaces in higher education. Affirmative action and other policies created in an attempt to equalize higher education can be easily lost in the growing definition of what it means to be a diverse space. I appreciated playing around with Voyant as a sort-of “reality check” on how diversity is constantly manipulated in ways that can result in its impact and original meaning being lost in an ever-growing perception of #colorblindness in our nation.

Lastly (if you’re still reading at this point), I thought it would be a nice bonus to include this chart Voyant suggested for me:

As titled, these distinctive words are the terms that stand out in each college’s statement compared to the rest of the corpus. What else can we draw from these valued terms in each college’s mission? And how do our own perceptions affect how we decide to gather data with this type of text-mining software?
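To give a rough sense of how “distinctive” words can be scored, here is a small Python sketch using TF-IDF, one common way to rank words that are frequent in a single document but rare across the rest of a corpus. Whether this matches Voyant’s exact calculation is an assumption on my part, and the function names and example texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def distinctive_words(docs, top_n=3):
    """Rank each document's words by TF-IDF against the whole corpus."""
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    result = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        total = len(tokens) or 1
        # Term frequency times inverse document frequency; a word used
        # in every document scores 0 because log(1) == 0.
        scores = {
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result[name] = ranked[:top_n]
    return result


# Made-up placeholder texts, NOT real diversity statements.
example = {
    "A": "access access community",
    "B": "community belonging",
}
print(distinctive_words(example, top_n=1))
```

A word like “community” that shows up in every statement scores zero here, which is why shared vocabulary drops out and each college’s own emphasis rises to the top.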




4 thoughts on “Text Mining, Diversity Mission Statements Across Several Colleges”

  1. Dax Oliver

    Very interesting! When I looked through the text in Voyant, I noticed that Amherst and Smith tended to use “students of color” a lot. Do you think that might have changed your results?

    1. Raven Gomez Post author

      Perhaps. I definitely thought about that, but Voyant wasn’t so great at categorizing a phrase of multiple words, so I couldn’t define it.

      Regardless, I don’t think it would’ve changed the results too drastically after looking over the documents individually! (-:

  2. Anthony Wheeler (he/him)


    This is awesome! I am interested in how public institutions address race more than private universities. I wonder what the logistics are behind it? There are many things that come into play, but how does this reflect in the mission of each establishment?

    1. Raven Gomez Post author

      Hey Anthony, thank you! Personally I feel it has so much to do with the demographics of a current institution and catering to the needs of the dominant prospective student population. At private universities, diversity has become such a strong selling point for students who feel this idea of “diversity” in their community is a necessary part of the college “lifestyle” experience but in that process the definition of the word itself becomes sort of lost or overly broadened.

      The root of distinguishing diversity in higher education stems from Affirmative Action and the attempt to federally bridge the gaps of accessibility in higher ed. But although it was initially intended for students of color historically denied access to college, it has actually benefited other demographics more (such as white women) in academia. Similar to social services, there is always this warped misconception about who the majority tapping into public services are, and I wonder how that continues to blur the reality of who and what these terms are really meant to serve? What you might consider a “diverse” community might be drastically different from what someone else does, similar to the continuous conversations we’ve had in class regarding how the definitions of Digital Humanities vary greatly depending on where these conversations happen and who is guiding them. If that makes sense…
