This summer, I attended the Zemanta Data Science Summer School. For the second practical project, we had to predict the click rate of advertisements. My two teammates and I did some light feature selection and tested five different algorithms: LightGBM, Naive Bayes, factorization machines, and two different logistic regressions. I tested the Vowpal Wabbit logistic regression and the factorization machines.
Since I cannot publish the data we were using (and probably not any of the analysis we did either), I decided to redo my two algorithms with a different dataset. Besides, I deleted the datasets on the last day of the summer school.
For at least two of my classes (Artificial Intelligence and Data Mining), I picked the problem of predicting personality from text. Since I still had some data from that time lying around, I decided to simply try these algorithms on this data as well.
The examples of the algorithms can be found on my GitHub.