This repository analyzes the text of the speeches, conferences, and interviews of the current president of Mexico. Its aim is educational: this document has no political purpose, and you are free to interpret the data in your own way. I personally think that formalizing this kind of practice helps us follow up on the political promises of Latin American presidents and could help us make decisions in advance for our countries. However, what I intend to do here is show you a basic text-analysis workflow using Python, visualize aggregated data, and gain insight at every step.
The official AMLO website publishes a stenographic version of each speech, a precise and faithful transcription of what was said. We will use this URL for the experiments:
https://lopezobrador.org.mx/
Today AMLO’s site groups the speeches into 667 sections, and this number grows weekly. Each section holds a group of speeches by date. My scraping technique targets each element of a group, which represents one speech; see the image below:
The amlo_analysis.ipynb notebook contains all the analysis code. Here we explain each step taken and interpret each visualization.
Here we can see the speeches listed in each box of each URL, along with their dates, which is what I want.
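As a rough illustration of targeting each element of a group, here is a minimal sketch that uses only Python's standard library. The `<article class="speech">` container, the `SpeechLinkParser` and `extract_speeches` names, and the sample HTML are all assumptions made for this example; the real markup of the site and the notebook's actual scraper may differ. It collects only titles and URLs, not dates.

```python
from html.parser import HTMLParser


class SpeechLinkParser(HTMLParser):
    """Collect (title, url) pairs from <a> tags inside a hypothetical
    <article class="speech"> container. The tag and class names are
    assumptions for illustration, not the real markup of the site."""

    def __init__(self):
        super().__init__()
        self.in_speech = False   # True while inside the assumed container
        self.current_url = None  # href of the <a> being read, if any
        self.current_text = []   # text fragments of that <a>
        self.speeches = []       # collected (title, url) pairs

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "article" and "speech" in (attrs.get("class") or "").split():
            self.in_speech = True
        elif tag == "a" and self.in_speech and attrs.get("href"):
            self.current_url = attrs["href"]
            self.current_text = []

    def handle_data(self, data):
        # Only keep text that belongs to a link we are currently reading
        if self.current_url is not None:
            self.current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.current_url is not None:
            title = " ".join("".join(self.current_text).split())
            self.speeches.append((title, self.current_url))
            self.current_url = None
        elif tag == "article":
            self.in_speech = False


def extract_speeches(html):
    """Return the list of (title, url) pairs found in an HTML string."""
    parser = SpeechLinkParser()
    parser.feed(html)
    return parser.speeches


# Made-up page fragment standing in for one section of the site
sample_html = """
<section>
  <article class="speech"><a href="https://example.org/speech-1">Speech one</a></article>
  <article class="speech"><a href="https://example.org/speech-2">Speech two</a></article>
  <a href="https://example.org/about">About</a>
</section>
"""
speeches = extract_speeches(sample_html)
```

Links outside the assumed container, such as the "About" link in the sample, are ignored, so each collected pair corresponds to one speech box.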
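import pandas as pd

# Load the scraped speeches and keep only the columns used in the analysis
df = pd.read_csv('C:/Users/ramse/Downloads/amlo_speechs.csv')
columns = ['id_speech', 'date', 'title', 'url', 'content']
df = df[columns]
# Prepend the title to the body; rows where the result is missing are dropped
df['content'] = df['title'] + ' ' + df['content']
df = df[df['content'].notna()]
df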
The code above is used to generate the visualization below. Three colors split the chart vertically into the three periods of AMLO's career: he began his campaign in 2011 and lost to Peña Nieto in the 2012 election; he remained active during President Peña Nieto's mandate, which began in December 2012; and in 2018 he won the Mexican presidential election. After that, the number of words he used per speech increased, as did the vocabulary used.
This could help us understand what his plans and priorities were in each period, before and during his presidential term.
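To make the two measures concrete, here is a minimal sketch of how words per speech and vocabulary size could be computed and labeled by period. It uses plain Python rather than the notebook's DataFrame. The helper names `period_of` and `speech_stats`, the period labels, the toy rows, and the choice of 1 December 2012 and 1 December 2018 as period boundaries are assumptions for illustration; the notebook may draw the boundaries differently.

```python
import re
from datetime import date

# Period boundaries are assumptions for illustration: the dates Peña Nieto
# and López Obrador took office (1 December 2012 and 1 December 2018).
PENA_NIETO_START = date(2012, 12, 1)
AMLO_START = date(2018, 12, 1)


def period_of(day):
    """Label a speech date with one of the three periods."""
    if day < PENA_NIETO_START:
        return "campaign"
    if day < AMLO_START:
        return "peña nieto term"
    return "presidential term"


def speech_stats(text):
    """Return (number of words, number of distinct words) for one speech."""
    words = re.findall(r"\w+", text.lower())
    return len(words), len(set(words))


# Toy rows standing in for the 'date' and 'content' columns
rows = [
    (date(2012, 3, 1), "el pueblo y el cambio"),
    (date(2019, 5, 2), "el pueblo manda y el pueblo decide"),
]
stats = [(period_of(d), *speech_stats(text)) for d, text in rows]
```

Each entry of `stats` holds the period label, the word count, and the number of distinct words for one speech, which are the quantities the chart compares across periods.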
Although there are NLP techniques for detecting topics, we will not use them here, because this study involves terms that are very common in the political environment of each country. Instead, I will manually choose the topics and the terms related to each one.
Now we will see the frequency of each word per year within each topic.
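A manual topic list can be as simple as a dictionary from topic name to terms. The sketch below is one possible shape, not necessarily the one used in the notebook: the topic names and the `topics_in` helper are hypothetical, while the terms are ones discussed in this write-up.

```python
import re

# Topic names are hypothetical labels; the terms are those discussed in this
# write-up. The real notebook may group them differently.
TOPICS = {
    "economy": ["economía", "impuesto"],
    "parties": ["morena", "prd", "pri"],
    "institutions": ["cfe", "imss", "cndh"],
    "education": ["educación", "escuela"],
}


def topics_in(text):
    """Return the sorted topic names whose terms appear in the text."""
    words = set(re.findall(r"\w+", text.lower()))
    return sorted(topic for topic, terms in TOPICS.items()
                  if words & set(terms))


found = topics_in("La educación y la economía son prioridad para Morena")
```

Keeping the mapping explicit makes it easy to review and adjust which terms count toward each topic, which is the point of choosing them by hand.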
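As a sketch of what counting per year could look like, the snippet below tallies a topic's terms for each year with `collections.Counter`. The `term_counts_by_year` function, the "economy" label, and the toy speeches are assumptions for illustration only.

```python
import re
from collections import Counter

# Hypothetical topic label with terms taken from this write-up
TOPIC_TERMS = {"economy": ["economía", "impuesto"]}


def term_counts_by_year(speeches, terms):
    """Count occurrences of each term per year.

    speeches: iterable of (year, text) pairs.
    Returns a dict mapping year -> Counter of term -> occurrences.
    """
    wanted = set(terms)
    counts = {}
    for year, text in speeches:
        words = re.findall(r"\w+", text.lower())
        counts.setdefault(year, Counter()).update(w for w in words if w in wanted)
    return counts


# Toy (year, content) pairs standing in for the real speeches
speeches = [
    (2012, "La economía del país"),
    (2019, "La economía crece y el impuesto baja"),
    (2019, "Economía popular"),
]
by_year = term_counts_by_year(speeches, TOPIC_TERMS["economy"])
```

Note that this matches exact single-word tokens only, so a plural such as "impuestos" or a two-word term would need extra handling (for example stemming or phrase matching).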
Apparently the word “economía” was already relevant in his speeches before his presidential term, but its use is stronger during his current term. Also take a look at the word “impuesto”.
The political party he mentioned most before his presidential term was “Morena”; at the same time, he mentioned rival parties such as “PRD” and “PRI”.
CFE, IMSS, and CNDH are the most mentioned words related to general institutions.
“educación” has been in use since the beginning of his campaign, and now in his presidential term “escuela” is the most relevant word related to education.
During the first years of Peña Nieto’s mandate, AMLO used the words “petróleo” and “energética” over a long period.
The word “extranjeros” has been common from the beginning of his campaign until today; lately, however, “migración” and “frontera” appear more strongly.
Veracruz and Tabasco
“CDMX” is the most mentioned area in the center of the country, but there is a sharp spike in the word “oaxaca” in the last months of 2019.
The words “pobreza” and “alimentación” have been very common since the beginning of his presidential campaign, and both are now stronger.
From the last months of 2020 until today, he has been using the word “vacuna”; it was his priority.
“corrupción” is one of his favorite words.
From the beginning of his presidential campaign, “Peña nieto” was the target.
Since the beginning of his presidential term, “ebrard” and “sheinbaum” have been the political allies he has mentioned the most.
“trump” was very relevant, and now “biden” is; the dates make sense.
His other favorite word is “pueblo” but it is also wor