Author Archives: Patricia Accarino

Text Analysis with Voyant

I initially thought that I’d compare the inaugural speeches of Presidents Obama and Trump (which I did), and then thought I’d look at the first and second inaugural speeches of Presidents Bush (43) and Obama.  While I found these comparisons somewhat interesting, I didn’t think that they were as enlightening as I’d hoped.  I was actually surprised by the way that Trump’s speech appeared, in that the visualization did not portray the somewhat bleak picture of the nation that I thought his speech conveyed.

(Somehow I was not able to display these slides here, so I added the links.)

https://voyant-tools.org/?corpus=063848fbc1f9d4e357b0bace3c0ea0f4

I thought that comparing speeches of Presidents was an obvious choice, so then I decided to compare the opening statements of Brett Kavanaugh and Dr. Blasey Ford during the recent Senate hearing.  Here again, while the results were interesting, I was somewhat disappointed by what the software displayed.

Kavanaugh’s Opening Statement:

https://voyant-tools.org/?corpus=ee640979b2271bc5b58bcb570983468d

If one were to rely on the links and word trends screens, they are accurate in that he was speaking of his high school years, and the word frequencies paint the picture of an adolescent’s focus on friends, parties, beer, and boys and girls.  Sounds innocent enough, right?  But what did not get conveyed was his adamant denial of the allegations, the outright partisan statements that he made, and the overall impression he left, which led many to question his judicial temperament and his ability to be impartial and balanced.

Dr. Ford’s Opening Statement:

https://voyant-tools.org/?corpus=cace51eabd9b1f94a2ebaa28c083ec30

The word links and word trends windows for Dr. Ford’s statement appear to be a more accurate representation of her testimony.  One could infer that it is more specific and focused on the event.  But again, what is missing here is the general impression that the speaker imparts to the world, which is conveyed in tone of voice, cadence, emphasis, and overall demeanor.

I also want to temper my impressions here: this is the first time that I’m using this tool, and I could be missing something, such as parsing the data properly, removing certain common words that could skew the results, or any number of other techniques that would make the results more meaningful.  Which is to say that with any tool, understanding HOW to use it correctly, and WHEN, is critical.
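To make the point about removing words concrete, here is a minimal Python sketch of how filtering out common words changes a frequency list.  This is only an illustration under my own assumptions: the stopword list and the sample sentence are made up, and it is not how Voyant itself is implemented.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (for illustration only)
STOPWORDS = {"the", "and", "of", "to", "a", "in", "i", "was", "my", "with"}

def word_frequencies(text, stopwords=frozenset()):
    """Count lowercase word tokens, skipping any token found in `stopwords`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)

# Made-up sample sentence, not taken from any of the statements above
sample = "I was with my friends and the friends of my friends in the summer."

raw = word_frequencies(sample)                  # every word counted
filtered = word_frequencies(sample, STOPWORDS)  # common words removed

print(raw.most_common(3))
print(filtered.most_common(3))
```

In the unfiltered count, words like "the" and "my" crowd the top of the list; once they are removed, only the content words remain.  Which words to remove is a judgment call, and that choice shapes what the visualization appears to say.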

I then looked at some tutorials on YouTube to learn more.  The question that I find interesting is: what are the best uses of this tool for providing data and insights that are not evident from reading specific texts?  I stumbled upon a presentation given by Stéfan Sinclair (McGill), one of the developers of Voyant.  There was one slide that I found very interesting:

https://www.youtube.com/watch?v=fYmngzBtrLI

At 13:00 minutes: this was a comparison of the texts from advertising for toys for boys and for girls.  These two images are striking in that they really say it all, and no other explanation is needed to illustrate the gendered stereotypes that are still reinforced by advertising to children.

If I were to draw a conclusion, perhaps this tool is more instructive when looking at a body of work, as opposed to evaluating a single text.  Then patterns that cannot easily be gleaned from reading individual documents or transcripts might be teased out by the analysis that this type of tool provides.

Data for Mapping Workshop

This workshop provided a definition and general overview of GIS (Geographic Information Systems), presented by Javier Otero Peña and Olivia Ildefonso, two of the GC’s Digital Fellows with expertise in this area.  Their presentation was very well-organized, and they both provided examples and useful tips along the way.

GIS are tools that enable one to manipulate and represent data spatially on maps.  While these tools can be complex and are very powerful, Javier and Olivia provided an introduction to the way that maps are organized (vector or raster layers).  Vector layers contain data from files, which can be in many different formats.  It is these data layers that can enrich a map by providing a spatial representation of the data in visual form (as opposed to reading a table with rows and columns).

I did a few assignments using a GIS tool this past summer, and while those projects were focused on how to use the tool, there was little discussion of how one gets the data (it was already provided in the exercises).  I remember struggling when searching for data to add to my map:  how did I know which database was reliable, what format to use, how to search for the correct fields?  These were the questions I had, and I did my best to muddle through them.

I appreciated that this workshop was focused on how to search for different data sets and load them into the GIS mapping program.  The whole point of using GIS is to marry the data with the map, and I suspect that this critical step is often not touched on in GIS tutorials.

For this session, the intention was to walk the group through a mapping exercise using Carto, an open-source mapping program.  There was an unanticipated change in the software, so we were unable to open accounts and log into Carto.  No matter, as we were able to focus on the main point of the session.

Given that the subject of this workshop was locating the data and importing it into the program, we were able to focus on this (and not be distracted with creating a map at this point).  I thought that both Javier and Olivia did a great job of walking us through each step, and offering tips and strategies for saving files, naming fields, etc.  We searched for the US Census data, chose a table, and then narrowed down the fields that we needed and saved the file.  Then we searched for a shapefile for the census tracts, and “joined” the information from the table with the shapefile (using a common field, in this case ‘state’).
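For anyone who wants to see what that “join” amounts to, here is a rough Python sketch of the idea.  It is not what we ran in the workshop: the rows, the population numbers, and the placeholder geometry values are all made up, and a real GIS program handles actual shapefile geometry rather than plain dictionaries.

```python
# Hypothetical rows from a census table (values are made up)
table_rows = [
    {"state": "NY", "population": 100},
    {"state": "NJ", "population": 50},
]

# Hypothetical shapefile records; the geometry is just a placeholder string
shapes = [
    {"state": "NY", "geometry": "polygon-1"},
    {"state": "NJ", "geometry": "polygon-2"},
    {"state": "CT", "geometry": "polygon-3"},
]

def join_on(shapes, rows, key):
    """Attach table fields to each shape whose `key` value matches a row.

    Every shape is kept; shapes with no matching row gain no extra fields.
    """
    lookup = {row[key]: row for row in rows}
    joined = []
    for shape in shapes:
        match = lookup.get(shape[key], {})
        joined.append({**match, **shape})
    return joined

joined = join_on(shapes, table_rows, "state")
for record in joined:
    print(record)
```

The common field does all the work here: if the values in the table and the shapefile are spelled or coded differently, the rows will not match up, which is one reason the tips on naming fields were so useful.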

The slides were very clear, and Javier emailed the PowerPoint slides to us afterwards, which now serve as a mini-tutorial for us to replicate on our own.

Javier and Olivia both knew their stuff and were very effective at tailoring their presentation to the group’s level.  I thought that this was just enough for an intro to the topic, and I’m definitely interested in a follow-up that delves deeper into finding and evaluating sources of data.