Youtube Keyword Scraper

(YoutubeKWS)

Search for keywords in any given topic

[Box and whisker plot of search query results]

What your results mean

Graph Axes

From left to right along the bottom line (X axis) are the 50 most frequently occurring words from the titles of the top 100 videos returned for your query. Some words are likely used together in common phrases.

From bottom to top along the left line (Y axis) is the engagement rate, expressed as a decimal. To convert to a percentage (%), multiply the value by 100. Engagement rate is calculated as (0.5 * like count + 0.5 * comment count) / views, so an engagement rate of 1 (100%) would mean every viewer both liked and commented.
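Here is a minimal Python sketch of that calculation, using made-up counts for a single hypothetical video:

```python
# Hypothetical counts for one video; the real numbers come from the
# YouTube Data API's statistics for each video.
views, likes, comments = 12_000, 480, 35

# Engagement rate = (0.5 * like count + 0.5 * comment count) / views
engagement = (0.5 * likes + 0.5 * comments) / views

print(f"{engagement:.4f} as a decimal, {engagement * 100:.2f}% as a percentage")
```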

Frequency of words (n)

Seen next to every word, n= indicates the number of video titles that word has occurred in out of the top 100 videos. The maximum would be (n=100), indicating that the word was used in every video title on the first two pages of YouTube results for your query.

Generally, the smaller the sample size (n, the number of occurrences), the less reliable the data (engagement rates).
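The real counting happens in the Python Lambda, but here is an illustrative sketch of how a count like n could be produced, using a few invented titles standing in for the 100 returned by your query:

```python
from collections import Counter

# Invented titles standing in for the 100 titles returned by the query.
titles = [
    "Build a REST API in Python",
    "API tutorial for beginners",
    "How to get a free API key",
]

counter = Counter()
for title in titles:
    # Count each word at most once per title, so n can never exceed
    # the number of videos.
    counter.update(set(title.lower().split()))

# The 50 most common words become the X axis of the plot.
for word, n in counter.most_common(50):
    print(f"{word} (n={n})")
```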

Data distributions

Each column extending vertically corresponds to a single word. The blue box and lines, called a box and whiskers, map out the distribution of data points for that word. For example, "api (n=98)" tells us there are 98 videos out of the 100 that include the word "api".

Every video has an engagement rate, so here we can see the engagement rates from all 98 videos that use the word "api" - that gives us 98 data points for "api"! Visualising 98 dots crowded into the same space becomes very messy, so instead of plotting each dot itself, we can plot the distribution of the data, either as a distribution curve (bell curve) or as a box and whisker plot.
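The site's plot is drawn in R, but for illustration here is a minimal Python/matplotlib sketch of the same idea, using randomly generated engagement rates in place of the scraped ones:

```python
import random
import matplotlib.pyplot as plt

# Randomly generated engagement rates standing in for the scraped values.
rates = {
    "api": [random.uniform(0.0, 0.02) for _ in range(98)],
    "how": [random.uniform(0.0, 0.02) for _ in range(55)],
}

labels = [f"{word} (n={len(values)})" for word, values in rates.items()]
plt.boxplot(list(rates.values()), labels=labels)
plt.ylabel("Engagement rate (decimal)")
plt.savefig("boxplot.png")
```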

Spreads and Ranges

The whiskers and boxes represent the distribution of values (engagement rates) from videos including a certain word. The longer/taller the whisker or box, the more variety the values have - meaning the larger the ranges.

For example, the distribution belonging to "how" (n=55) shows that the lowest 75% of the data covers roughly the same range as the top 25%: the lowest 75% of videos sit between 0% and 1% engagement rate, while the highest 25% sit between 1% and 2% engagement rate.

Conversely, the "key" data distribution has a much smaller range, and videos using this word are likely to have below 1% engagement.

Documentation

How it works

How does my web browser get all this data?
  1. By typing in a query and clicking submit, your browser submits it to a script I have developed.
  2. This data submission occurs through an API gateway - which secures and manages incoming requests to make sure my scripts don't get overrun.
  3. The API Gateway is hosted on AWS, and it routes all incoming requests to a script I have sitting idly, known as a Lambda function.
  4. This Lambda function is not actually running until it receives a request; it can only be "invoked", or started, by my API Gateway.
  5. This Lambda function contains all the Python and R code necessary to take your search query, request data from the YouTube Data API v3 (an API, just like my API Gateway!), process it, and make a nice graph.
  6. But getting the data back to you is quite difficult! If my Lambda function could just upload code, files, or data onto your computer, every computer in the world would be infested with malware.
  7. Instead, my Lambda function uploads it to a publicly accessible storage service on AWS - Amazon Simple Storage Service (Amazon S3) - with a unique identifier so we know that data is yours!
  8. My Lambda function then returns an OK to your browser, along with the unique identifier.
  9. The JavaScript running on this website receives the response from the Lambda function, and sends ANOTHER request to Amazon S3, using your unique identifier to display the publicly available image and data directly on the website, without touching your computer.
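To make steps 5-8 more concrete, here is a hypothetical, simplified sketch of what the Python side of such a Lambda handler could look like. The bucket name and payload shape are illustrative placeholders, not the real ones:

```python
import json
import time

import boto3

s3 = boto3.client("s3")

# Illustrative placeholder, not the real bucket name.
RESULTS_BUCKET = "example-public-results-bucket"


def lambda_handler(event, context):
    """Sketch of the flow above: read the query from the API Gateway
    request, build the result, store it in S3, and return the unique
    identifier so the browser can fetch it."""
    query = json.loads(event["body"])["query"]

    # ... call the YouTube Data API and render the graph here ...
    graph_bytes = b"placeholder for the rendered PNG"

    # Query-and-timestamp identifier so the browser can find its own result.
    identifier = f"{query}-{int(time.time())}.png"
    s3.put_object(Bucket=RESULTS_BUCKET, Key=identifier, Body=graph_bytes)

    return {
        "statusCode": 200,
        "body": json.dumps({"identifier": identifier}),
    }
```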

How to interpret results

Statistics, oh boy what have I done?
  1. If statistics and analytics aren't your thing, fret not!
  2. The most important considerations revolve around the completeness of the data: does the data give you the full picture?
  3. The answer is no. Although this tool gives you insight into some keywords that are commonly used in popular videos, we cannot assume the relationship extends past that.
  4. This means we cannot assume that using these keywords statistically increases your views or engagement rate. Establishing that would take more advanced statistical analysis, and a lot more data!
  5. One of the reasons this doesn't work is that we don't account for other things influencing the engagement rate. For example:
    • Channels with more subscribers get more views, so any keyword they use in titles will seem "better"
    • Some topics/niches have higher engagement or views than others
    • This data only looks at the highest-ranking videos within a topic, so we know what might work, but what about low-performing videos - what doesn't work?
    • A difference that looks big on the graph may not be big in practice! The percentage differences in engagement rates are often very small, and won't make much of a difference.
    • Similarly, although a 2% difference might look big, we have no idea whether the value is statistically significant, or "important"
  6. Take these readings with a pinch of salt. If you don't have any idea about what keywords to use in your titles, this might give you some direction! But if you are seeing success with your current strategy, I would use this more as a tool for ideas.

Data Architecture

The technical infrastructure underlying the magic
  • Lambda function was used for cost-effectiveness and ease of deployment through API Gateway
  • The initial query is passed from API Gateway to an initial Python Lambda, which recursively makes API calls to the YouTube Data API (see the first sketch after this list):
    • The script uses the GET method on the search resource to obtain the first 50 results, ordered by relevance to the query
    • The script stores the IDs and titles of the videos, using the IDs to make a second GET request on the videos resource to obtain statistics about each video
    • This is done recursively, as the YouTube Data API allows a maximum of 50 items (maxResults) per request
    • Currently my script iterates twice, to produce 100 videos worth of data. This can easily be scaled up or down.
  • After extraction and processing with Pandas, a semi-cleaned output CSV is stored in a private S3 One Zone-IA bucket for cost-effective medium-term storage. The CSV is named by query and timestamp
  • The Python Lambda then calls a second Lambda in an R environment (see the second sketch after this list), which finalises cleaning, produces the dynamic boxplot, and will, in the future, perform ANOVA or multiple linear modelling for association testing.
  • This final graph is then saved to a publicly available S3 bucket, time- and query-stamped for unique, short-term retrieval
  • The R Lambda returns a status 200 along with the unique identifier, after which the website's JavaScript retrieves the graph and loads the data.
  • Future steps include: creating S3 object lifecycle rules to ensure a short TTL and minimised costs, and CloudWatch alarms for S3 storage with automated clearance/removal of old data.
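First sketch: a hypothetical, simplified Python version of the recursive extraction step. The API key is a placeholder and the field handling is trimmed down, but the search-then-videos two-step and the 50-item page size follow the YouTube Data API v3 flow described above:

```python
import requests

API_KEY = "YOUR_API_KEY"  # placeholder
SEARCH_URL = "https://www.googleapis.com/youtube/v3/search"
VIDEOS_URL = "https://www.googleapis.com/youtube/v3/videos"


def fetch_top_videos(query, pages=2):
    """Fetch up to pages * 50 videos: search.list for IDs and titles,
    then videos.list for per-video statistics, 50 items per request."""
    videos, page_token = [], None
    for _ in range(pages):
        params = {
            "part": "snippet",
            "q": query,
            "type": "video",
            "order": "relevance",
            "maxResults": 50,
            "key": API_KEY,
        }
        if page_token:
            params["pageToken"] = page_token
        search = requests.get(SEARCH_URL, params=params).json()

        ids = [item["id"]["videoId"] for item in search.get("items", [])]
        stats = requests.get(VIDEOS_URL, params={
            "part": "snippet,statistics",
            "id": ",".join(ids),
            "key": API_KEY,
        }).json()

        for item in stats.get("items", []):
            s = item["statistics"]
            videos.append({
                "title": item["snippet"]["title"],
                "views": int(s.get("viewCount", 0)),
                "likes": int(s.get("likeCount", 0)),
                "comments": int(s.get("commentCount", 0)),
            })

        page_token = search.get("nextPageToken")
        if not page_token:
            break
    return videos
```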
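Second sketch: a hypothetical version of the hand-off from the Python Lambda to the R Lambda. The bucket and function names are illustrative placeholders, not the real resources:

```python
import io
import json
import time

import boto3
import pandas as pd

s3 = boto3.client("s3")
lambda_client = boto3.client("lambda")

# Illustrative placeholders, not the real bucket or function names.
CSV_BUCKET = "example-private-csv-bucket"
R_FUNCTION = "example-r-plotting-lambda"


def store_and_plot(videos, query):
    """Write the semi-cleaned CSV to the private bucket, then invoke the
    R Lambda that finishes cleaning and draws the boxplot."""
    key = f"{query}-{int(time.time())}.csv"

    buffer = io.StringIO()
    pd.DataFrame(videos).to_csv(buffer, index=False)
    s3.put_object(Bucket=CSV_BUCKET, Key=key, Body=buffer.getvalue())

    # Synchronous invocation: wait for the R Lambda to return the graph's
    # unique identifier in the public bucket.
    response = lambda_client.invoke(
        FunctionName=R_FUNCTION,
        InvocationType="RequestResponse",
        Payload=json.dumps({"bucket": CSV_BUCKET, "key": key}),
    )
    return json.loads(response["Payload"].read())
```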

Contact Me

Feel free to contact me with any questions or requests you may have. Data can be provided in .csv format upon request, and further explanation and interpretation of the graphs can be offered if needed.

omegabytten@gmail.com