Ethan de Villiers
Data Engineer/Scientist
AWS Solutions Architect Associate

MegaBytten

Collection of open-source data tools and services to boost your business

Tools and Services

Open Source

YouTube Keyword Scraper

Elevate your content with data-driven titles

Obtain keywords for your video titles: YouTubeKWS allows you to visualise the frequency and engagement rates of keywords from the top 100 YouTube videos on any given topic.
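As a rough illustration of the kind of statistic described above, the sketch below counts how many video titles contain each keyword and averages an engagement rate per keyword. It is not YouTubeKWS's actual code: the `keyword_stats` function, the `title`/`views`/`likes`/`comments` fields, the stop-word list, and the definition of engagement rate as (likes + comments) / views are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; a real tool would likely use a fuller one.
STOP_WORDS = {"a", "an", "the", "to", "of", "in", "for", "and", "how", "with"}


def keyword_stats(videos):
    """Return {keyword: (frequency, mean_engagement_rate)}.

    Each video is a dict with hypothetical keys: title, views, likes, comments.
    Engagement rate is assumed here to be (likes + comments) / views.
    """
    counts = defaultdict(int)
    engagement_sum = defaultdict(float)
    for video in videos:
        views = video["views"]
        rate = (video["likes"] + video["comments"]) / views if views else 0.0
        # Count each keyword once per title, so frequency means "number of videos".
        words = set(re.findall(r"[a-z0-9']+", video["title"].lower())) - STOP_WORDS
        for word in words:
            counts[word] += 1
            engagement_sum[word] += rate
    return {w: (counts[w], engagement_sum[w] / counts[w]) for w in counts}
```

Sorting the result by frequency or by mean engagement rate would give the ranking that a chart could then visualise.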

In Development

DataSphere

Empower your business with Sales Analytics

Use our custom reporting dashboard to extract business insights and trends from your sales data. Upload existing CRM or POS data or create and house your own custom data.

Open Source

Convolutional Neural Network (CNN) for Cervical Cancer Prediction

Machine Learning Model

Deployed Scikit-Learn Model on AWS

An inference API for a logistic regression model that predicts whether a house is multi- or single-tenancy from motion sensor data. The API is deployed on serverless, scalable AWS infrastructure.
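To give a feel for what a serverless inference endpoint of this kind might look like, here is a minimal sketch of an AWS Lambda-style handler behind an API Gateway proxy integration. This is an assumption about the deployment, not a description of it: the page does not say which AWS services are used. The feature names, coefficients, and intercept are hypothetical stand-ins for a trained scikit-learn model's `coef_` and `intercept_`, and the logistic function is written out by hand so the example has no dependencies.

```python
import json
import math

# Hypothetical values standing in for a trained model's coef_ and intercept_.
COEFFICIENTS = {"events_per_hour": 0.8, "active_rooms": 1.2}
INTERCEPT = -3.0


def predict_proba(features):
    """Logistic regression: sigmoid of the weighted feature sum plus intercept."""
    z = INTERCEPT + sum(
        COEFFICIENTS[name] * float(features[name]) for name in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(-z))


def handler(event, context=None):
    """Lambda-style entry point: expects a JSON string in event["body"]."""
    try:
        features = json.loads(event["body"])
        probability = predict_proba(features)
    except (KeyError, TypeError, ValueError) as exc:
        # Missing body, malformed JSON, or missing/non-numeric features.
        return {"statusCode": 400, "body": json.dumps({"error": str(exc)})}
    label = "multi-tenancy" if probability >= 0.5 else "single-tenancy"
    return {
        "statusCode": 200,
        "body": json.dumps({"prediction": label, "probability": probability}),
    }
```

In a real deployment the model would more likely be loaded once at cold start from a serialized file and called through its `predict` or `predict_proba` method.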

Machine Learning Model

End-to-End Crossflow Filtration Model

A full-stack, end-to-end model for predicting crossflow filtration experimental setups. The model is exposed as an inference API via the associated web page.

Access is gated with the username "megabytten" and password "password". Feel free to play around with the model.

Open Source

Twitter Scraper and Sentiment Analysis

Run distributed Twitter scrapers to build your own dataset

A multi-purpose tool. It contains automation-friendly, single-threaded and distributable Playwright-based Python scripts configured for command-line usage. It also contains a Python script that uses Natural Language Processing (NLP) libraries to run sentiment analysis on the extracted Twitter data. Sample data is available to show what the scripts extract, along with an example sentiment analysis graph built from that data.

In Development

Coming Soon...

Get in Touch

Send an enquiry

Have an idea you want to develop? Need advice on how to extract, wrangle, or store data? Want to work together to develop a product?

Reach out any time: send me a free-of-charge enquiry and I'll get back to you as soon as I can.