MLIT

Equity codes prediction using Naive Bayesian Classifier with scikit-learn

Introduction

The aim of this article is to give an introduction to Naive Bayesian classification using scikit-learn.
Naive Bayesian classification is a simple probabilistic classification method based on Bayes’ theorem, with a strong (so-called naive) assumption that the features are independent of one another. In this article, we will use it to build a basic text prediction system. We will predict Equity codes in a search-form fashion (i.e. prediction starts as soon as the user starts typing).
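To give a flavour of the idea, here is a minimal sketch of a Naive Bayes classifier that guesses a code from a partially typed name. It is not the article's code: the article relies on scikit-learn, whereas this sketch is written from scratch with the standard library only, so that the mechanics are visible. The class `TinyNaiveBayes`, the helper `char_ngrams`, the choice of character bigrams as features and the four-row training set are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=2):
    """Split lowercased text into overlapping character n-grams."""
    text = text.lower()
    if len(text) < n:
        return [text] if text else []
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class TinyNaiveBayes:
    """Hypothetical multinomial Naive Bayes over character bigrams."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                          # Laplace smoothing constant
        self.class_counts = Counter()               # training examples per label
        self.feature_counts = defaultdict(Counter)  # bigram counts per label
        self.vocab = set()                          # every bigram seen in training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for gram in char_ngrams(text):
                self.feature_counts[label][gram] += 1
                self.vocab.add(gram)
        return self

    def log_scores(self, text):
        """Return log prior + sum of log likelihoods for each label."""
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            total = sum(counts.values())
            score = math.log(n_docs / total_docs)
            # Naive assumption: each bigram contributes independently.
            for gram in char_ngrams(text):
                score += math.log(
                    (counts[gram] + self.alpha) / (total + self.alpha * vocab_size)
                )
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy training data (illustrative only): company name -> code.
names = ["apple inc", "microsoft corp", "amazon com", "alphabet inc"]
codes = ["AAPL", "MSFT", "AMZN", "GOOGL"]

model = TinyNaiveBayes().fit(names, codes)
prediction = model.predict("micro")  # a partially typed query
```

With a realistic list of names, the same scoring loop can rank all codes after each keystroke. A very short prefix carries little evidence, so the first one or two characters should not be expected to give a meaningful guess.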
Keep on reading!

Forecasting recessions with economic indicators

Abstract

This study compares three economic indicators often used in forecasting recessions: the Yield Spread, the Chicago Index and the Leading Index. We find that the latter two predict recessions well one and two quarters ahead, but fail over longer horizons. In contrast, the Yield Spread performs better when forecasting recessions four and six quarters ahead.

Read More

Sentiment Analysis: Supervised Learning with SVM and Apache Spark


“Humans aren’t as good as we should be in our capacity to empathize with feelings and thoughts of others, be they humans or other animals on Earth. So maybe part of our formal education should be training in empathy. Imagine how different the world would be if, in fact, that were ‘reading, writing, arithmetic, empathy.’” – Neil deGrasse Tyson

Abstract

The objective is to classify movie reviews into two classes (positive or negative opinion), using 50,000 reviews from the IMDB database.


Read More

Terrorism around the World: A Study with R

https://www.cartoonmovement.com/p/7403

“Everybody’s worried about stopping terrorism. Well, there is a really easy way: stop participating in it.”  –  Noam Chomsky

Abstract

According to Wikipedia, the English word “terror”, like the French “terreur”, derives from the Latin “terrere” and is associated with fright, alarm, anguish, fear and panic. Indeed, we all fear terrorism, which is increasingly part of our lives. But do we understand the global picture? Who attacks whom, where and why? Why do we see more and more suicide attacks? This study addresses these questions through an Exploratory Data Analysis, semi-supervised learning and a supervised Logit model.


Keep on reading!

The impact of individual characteristics on the length of life in India – Oaxaca-Blinder decomposition, Logit model


Abstract

This study estimates the impact of social status, education and average living standards on the length of life by analyzing India’s mortality statistics in 2009 for two states, Uttarakhand and Bihar. Using several estimation methods, including OLS, GLM and Logit regressions as well as the Oaxaca-Blinder decomposition, we find that education, electricity and access to a toilet significantly increase the length of life. We also find that members of the Scheduled Tribes have shorter lives, and that this difference cannot be explained by differences in the average characteristics of the two groups.


Keep on reading!

Multiple correspondence analysis, Clustering and Tandem Analysis through a basic income analysis example


Abstract

Several basic income experiments were launched in 2017. Finland started a two-year experiment, giving 2,000 unemployed citizens approximately $600 a month. In Silicon Valley, Y Combinator announced in mid-2016 that it would begin paying monthly salaries of between $1,000 and $2,000 to 100 families in Oakland, while in Utrecht, the Netherlands, 250 Dutch citizens will receive about $1,100 per month. These are just three of the experiments already launched; their aim is to measure how basic income could provide a new structure for social security and to see how people’s productivity changes when they receive a guaranteed salary.

But what do people think about basic income? Are we supportive of it, or do we fear it? Who is most likely to vote for it? Do people’s views for or against the idea differ according to their education or job status? This study aims to answer these questions by using a semi-supervised approach, Clustering and a Tandem analysis to classify people according to their characteristics and their opinion of basic income.


Keep on reading!

Principal Component Analysis through the Happiness Index example


Abstract

What determines happiness? Why are some countries happier (or less happy) than others? In 2017, Norway tops the global happiness ranking, published annually by the United Nations Sustainable Development Solutions Network. In this article, we use their data to show the correlations between the variables used in this Index, and we analyse the countries with the help of the Principal Component Analysis technique.


Keep on reading!

Web Scraping


Abstract

This article shows a simple program written in Python that does some basic web scraping. As an exercise, we get the titles of YouTube videos and their number of views, then we store this information in a Pandas DataFrame.
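As a rough illustration of the extraction step, the sketch below parses a small hard-coded HTML snippet with Python's standard `html.parser` module and collects one record per video. It is not the article's program: the markup in `SAMPLE_HTML`, its class names (`video`, `title`, `views`) and the `VideoParser` class are invented for the example, and real YouTube pages are structured differently. No network request is made.

```python
from html.parser import HTMLParser

# Invented markup standing in for a downloaded page (assumption).
SAMPLE_HTML = """
<ul>
  <li class="video"><span class="title">Intro to pandas</span><span class="views">1200</span></li>
  <li class="video"><span class="title">Scraping basics</span><span class="views">845</span></li>
</ul>
"""


class VideoParser(HTMLParser):
    """Collect a {'title': ..., 'views': ...} dict for each video item."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one dict per <li class="video">
        self._field = None  # name of the span currently being read

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "video":
            self.rows.append({})
        elif tag == "span" and css_class in ("title", "views"):
            self._field = css_class

    def handle_data(self, data):
        # Text outside a tracked span (e.g. whitespace) is ignored.
        if self._field and self.rows:
            value = data.strip()
            if self._field == "views":
                value = int(value)
            self.rows[-1][self._field] = value

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


parser = VideoParser()
parser.feed(SAMPLE_HTML)
rows = parser.rows
```

A list of dictionaries like `rows` can be passed directly to `pandas.DataFrame(rows)` to obtain a table with one column per key.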


Keep on reading!

Simple Graphs in R


Abstract

A well-made graph gives us valuable insight that helps us better understand and analyse data. This post gives a very simple introduction to graphs in R with ggplot2, through examples accompanied by their code.


Keep on reading!