Sentiment Analysis: Supervised Learning with SVM and Apache Spark

“Humans aren’t as good as we should be in our capacity to empathize with feelings and thoughts of others, be they humans or other animals on Earth. So maybe part of our formal education should be training in empathy. Imagine how different the world would be if, in fact, that were ‘reading, writing, arithmetic, empathy.’” – Neil deGrasse Tyson Abstract The objective is two-class discrimination (positive or negative opinion) of movie reviews using data from the IMDB database (50,000 reviews).

Terrorism around the World - Study with R

“Everybody’s worried about stopping terrorism. Well, there is a really easy way: stop participating in it.” – Noam Chomsky Abstract According to Wikipedia, the English word “terror”, just like the French “terreur”, derives from the Latin word “terrere” and means fright, alarm, anguish, fear, panic. Indeed, we all fear terrorism, which is more and more a part of our lives. But do we understand the global picture? Who attacks whom, where and why? Why do we see more and more suicide attacks? This study focuses on answering these questions…

The impact of individual characteristics on the length of life in India – Oaxaca-Blinder decomposition, Logit model

Abstract This study estimates the impact of social status, education and average living standards on the length of life by analyzing India’s mortality statistics in 2009 for two states, Uttarakhand and Bihar. Using several estimation methods such as OLS, GLM and Logit regressions, as well as the Oaxaca-Blinder decomposition, we find that education, electricity and access to a toilet significantly raise the length of life. We also find that members of the scheduled tribes live shorter lives, and that this difference cannot be explained by differences between the average values of the two groups’ characteristics.
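
For readers unfamiliar with the decomposition named above, here is a sketch of its standard twofold form for a linear model. The notation is an assumption made for illustration and is not taken from the study: A and B denote the two groups, \bar{Y} the mean length of life, \bar{X} the vector of mean characteristics, and \hat{\beta} the coefficients estimated separately for each group. The study's exact specification (for example, its Logit variant) may differ.

```latex
\bar{Y}_A - \bar{Y}_B
  = \underbrace{(\bar{X}_A - \bar{X}_B)^{\top} \hat{\beta}_A}_{\text{explained by characteristics}}
  + \underbrace{\bar{X}_B^{\top} (\hat{\beta}_A - \hat{\beta}_B)}_{\text{unexplained}}
```

In this reading, a gap that "cannot be explained by differences in characteristics" corresponds to a small first term and a large second term.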

Multiple correspondence analysis, Clustering and Tandem Analysis through a basic income analysis example

Abstract Several basic income experiments were launched in 2017. Finland started a two-year experiment by giving 2,000 unemployed citizens approximately $600 a month. In Silicon Valley, Y Combinator announced in mid-2016 that it would begin paying out monthly salaries between $1,000 and $2,000 to 100 families in Oakland, while in Utrecht, Netherlands, 250 Dutch citizens will receive about $1,100 per month. These are just three of the already launched experiments, and their aim is to measure how basic income could provide a new structure for social security and to see how people’s…

Principal Component Analysis through the Happiness Index example

Abstract What determines happiness? Why are some countries happier (or less happy) than others? In 2017, Norway tops the global happiness ranking, published annually by the United Nations Sustainable Development Solutions Network. In this article, we use their data to show the correlations among the variables used in this index, and we analyse the countries with the help of the Principal Component Analysis technique.

Web Scraping

Abstract This article shows a simple program written in Python to do basic web scraping. As an exercise, we get the titles of YouTube videos and their number of views, then we store this information in a Pandas DataFrame.
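
To give a flavour of the approach, here is a minimal sketch that is not the article's own program. It parses a small inline HTML snippet with Python's standard `html.parser` module; the markup, the class names `title` and `views`, and the helper `extract_videos` are all hypothetical, since real YouTube pages are structured differently and change often.

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration only; not the real YouTube page layout.
SAMPLE_HTML = """
<div class="video"><span class="title">Intro to R</span><span class="views">1,234 views</span></div>
<div class="video"><span class="title">PCA explained</span><span class="views">56,789 views</span></div>
"""


class VideoParser(HTMLParser):
    """Collect the text of <span class="title"> and <span class="views"> elements."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one dict per video
        self._field = None  # name of the field currently being read, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("title", "views"):
            self._field = cls

    def handle_data(self, data):
        if self._field == "title":
            # A title starts a new row.
            self.rows.append({"title": data.strip()})
        elif self._field == "views":
            # Keep only the digits, so "1,234 views" becomes 1234.
            digits = "".join(ch for ch in data if ch.isdigit())
            self.rows[-1]["views"] = int(digits)

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def extract_videos(html):
    """Return a list of {"title": ..., "views": ...} dicts found in `html`."""
    parser = VideoParser()
    parser.feed(html)
    return parser.rows


rows = extract_videos(SAMPLE_HTML)
print(rows)
```

A list of dictionaries like `rows` can be passed directly to `pandas.DataFrame(rows)` to obtain a table with `title` and `views` columns.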

Simple Graphs in R

Abstract A well-made graph gives us valuable insight to better understand and analyse data. This post gives a very simple introduction to graphs in R with ggplot2 through examples with code.