Ethan de Villiers
AI/ML Cloud Data Engineer
Data Systems Architect

MegaBytten

Collection of open-source data tools and projects

Tools and Services

Open Source

YouTube Keyword Scraper

Elevate your content with data-driven titles

Obtain keywords for your video titles: YouTubeKWS allows you to visualise the frequency and engagement rates of keywords from the top 100 YouTube videos on any given topic.
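As a rough illustration of the idea, the sketch below counts how often each keyword appears across a set of video titles and averages an engagement rate per keyword. It is not the YouTubeKWS implementation: the sample records, the stopword list, the `keyword_stats` helper, and the engagement definition (likes plus comments, divided by views) are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical sample records; a real run would cover the top 100 videos on a topic.
videos = [
    {"title": "Python Tutorial for Beginners", "views": 1000, "likes": 100, "comments": 20},
    {"title": "Advanced Python Tips", "views": 500, "likes": 20, "comments": 5},
    {"title": "Cooking Tutorial", "views": 2000, "likes": 50, "comments": 10},
]

# Small illustrative stopword list (assumption, not taken from the tool).
STOPWORDS = {"for", "the", "a", "an", "and", "to", "of", "in"}


def keyword_stats(videos):
    """Return {keyword: (number of titles containing it, mean engagement rate)}."""
    rates = defaultdict(list)
    for video in videos:
        # Assumed engagement definition: (likes + comments) / views.
        views = video["views"]
        rate = (video["likes"] + video["comments"]) / views if views else 0.0
        # Count each keyword once per title, ignoring stopwords.
        words = set(re.findall(r"[a-z0-9']+", video["title"].lower())) - STOPWORDS
        for word in words:
            rates[word].append(rate)
    return {word: (len(r), sum(r) / len(r)) for word, r in rates.items()}


stats = keyword_stats(videos)
```

Sorting `stats` by frequency or by mean engagement gives the two rankings that a chart of this kind would plot.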

Open Source

Convolutional Neural Network (CNN) for Cervical Cancer Prediction

Machine Learning Model

Deployed Scikit-Learn Model on AWS

An inference API for a logistic regression model that predicts whether houses are multi- or single-tenancy based on motion sensor data. The API is deployed on serverless, scalable AWS infrastructure.
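A minimal sketch of what a serverless inference handler for such a model might look like is shown here. To stay dependency-free, it hard-codes hypothetical coefficients and applies the logistic function directly instead of loading a fitted scikit-learn model. The feature names, coefficient values, and response shape are assumptions, not details of the deployed API.

```python
import json
import math

# Hypothetical feature names and coefficients standing in for a fitted model.
FEATURES = ["motion_events_per_hour", "active_rooms"]
COEFFICIENTS = [0.8, 1.2]
INTERCEPT = -4.0


def predict_proba(features):
    """Logistic regression: sigmoid of the linear combination of the features."""
    z = INTERCEPT + sum(c * x for c, x in zip(COEFFICIENTS, features))
    return 1.0 / (1.0 + math.exp(-z))


def handler(event, context=None):
    """Lambda-style entry point that expects a JSON body holding the feature values."""
    try:
        body = json.loads(event.get("body") or "{}")
        features = [float(body[name]) for name in FEATURES]
    except (KeyError, TypeError, ValueError):
        # Missing features, non-numeric values, or malformed JSON.
        return {"statusCode": 400, "body": json.dumps({"error": "invalid input"})}
    probability = predict_proba(features)
    label = "multi-tenancy" if probability >= 0.5 else "single-tenancy"
    return {
        "statusCode": 200,
        "body": json.dumps({"label": label, "probability": probability}),
    }
```

In a real deployment the coefficients would come from the trained model artifact, loaded once outside the handler so that warm invocations can reuse it.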

Machine Learning Model

End-to-End Crossflow Filtration Model

A full-stack, end-to-end model for predicting crossflow filtration experimental setups. The model is exposed as an inference API via the associated web page.

Access is gated with the username "megabytten" and the password "password". Feel free to play around with the model.

Open Source

Twitter Scraper and Sentiment Analysis

Run distributed Twitter scrapers to build your own dataset

A multi-purpose tool. It contains automation-friendly, single-threaded and distributable Playwright-based Python scripts configured for command-line usage. It also contains a Python script that uses Natural Language Processing (NLP) libraries to conduct sentiment analysis on the extracted Twitter data. Sample data is available to show what the scripts extract, along with an example sentiment analysis graph built from data extracted with the scripts.
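To give a feel for the sentiment analysis step, here is a small lexicon-based scorer written with the standard library only. It is a simplified stand-in for the NLP libraries the project uses, not the project's own method. The word lists, the `sentiment_score` and `sentiment_label` helpers, and the sample tweets are all hypothetical.

```python
import re

# Tiny illustrative lexicons (assumption); real NLP libraries ship far larger ones.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "sad"}


def sentiment_score(text):
    """Return a score in [-1, 1]: (positive - negative) / matched words."""
    words = re.findall(r"[a-z']+", text.lower())
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    matched = positive + negative
    return 0.0 if matched == 0 else (positive - negative) / matched


def sentiment_label(score):
    """Map a score to a coarse label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical tweets standing in for scraped data.
tweets = ["I love this great phone", "Terrible, awful service", "Love it but bad battery"]
labels = [sentiment_label(sentiment_score(tweet)) for tweet in tweets]
```

Counting the labels per day or per search term would produce the kind of series that a sentiment graph plots.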

Get in Touch

Send an enquiry
