Data Science

Text Summarization in Python

Summarization is useful whenever you need to condense a large number of documents into shorter texts. Anyone who has browsed scientific papers knows the value of abstracts – unfortunately, most documents don’t share this structure. This article is an overview of some text summarization methods in Python.

Kaggle Days 2018

This year saw the first edition of Kaggle Days. The event was a nice opportunity for new Kagglers to break in – it consisted of two days with Kaggle masters. Presentation Day: the first day consisted of two parallel tracks – presentations and workshops. One of the first presentations, given by Mikhail Trofimov, overviewed on …

PyData Warsaw 2017

PyData conferences are organized by NumFOCUS, a nonprofit supporting open source scientific computing (they support NumPy, pandas, scikit-learn and Jupyter, among other things). The Warsaw conference took place at the Copernicus Science Centre. The conference spanned three days – one workshop day and two conference days. I didn’t attend the workshops. Some of them seemed pretty basic, and others …

Spark Summit Europe 2016 review

Spark Summit Europe took place in Brussels, Belgium, just about a week ago. I had the pleasure of being there for the conference days, where I mostly attended the Data Science track, as this is our bread and butter at Semantive. This summit could be summarized in a couple of words: Spark 2.0 and Streaming, as it …

