Ari Bajo

Hello! I’m Ari. I build data products with Python. I have worked as a Data Engineer and Data Scientist at four French startups.

Blog
June 09, 2020 Ari Bajo
What did I Learn about CI/CD for Machine Learning

Most software development teams have adopted continuous integration and delivery (CI/CD) to iterate faster. However, a machine learning model depends not only on the code but also on the data and hyperparameters, so releasing a new model to production is more complex than in traditional software development.

March 31, 2020 Ari Bajo
Classifying 4M Reddit posts in 4k subreddits: an end-to-end machine learning pipeline

Finding the right subreddit to submit your post can be tricky, especially for people new to Reddit. There are thousands of active subreddits with overlapping content. If it is no easy task for a human, I didn’t expect it to be easier for a machine. Currently, redditors can ask for suitable subreddits in a special subreddit: r/findareddit.

January 28, 2020 Ari Bajo
Production Machine Learning Pipeline for Text Classification with fastText

When doing machine learning in production, the choice of the model is just one of the many important criteria. Equally important are the definition of the problem, gathering high-quality data and the architecture of the machine learning pipeline.

November 19, 2019 Ari Bajo
Scaling Apache Airflow for Machine Learning Workflows

Apache Airflow is a popular platform to create, schedule and monitor workflows in Python. It has more than 15k stars on GitHub and is used by data engineers at companies like Twitter, Airbnb and Spotify.
