Or how to play pokemon the right way

Recently, a fellow data enthusiast, https://towardsdatascience.com/@anis.ayari, shared a tutorial on data visualization using the Kaggle pokemon dataset. This made me wonder whether there is a way to beat the game using some basic data analysis. I’ll present an analysis of the dataset for the Red and Blue versions, work out which starter is the best, and check whether or not Ash actually stood a chance.

Setting up the scene

I’ll be using the following datasets: /kaggle/input/pokemon-moveset/All_Moves.csv and /kaggle/input/pokemon/pokemon.csv. I’ll also limit this study to the first generation of pokemon because, let’s face it, it is the best of all. Moreover, the Red and Blue versions of the game use only those.
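As a minimal sketch of that filtering step, assuming pandas and relying on the 'generation' column listed further down, restricting the data to the first generation could look like this. The helper name keep_first_generation and the three-row stand-in frame are mine, only there so the snippet runs without the Kaggle files.

```python
import pandas as pd

POKEMON_CSV = "/kaggle/input/pokemon/pokemon.csv"  # path quoted in the text


def keep_first_generation(pokemon: pd.DataFrame) -> pd.DataFrame:
    """Return only the rows whose 'generation' column equals 1."""
    return pokemon[pokemon["generation"] == 1].reset_index(drop=True)


# Tiny stand-in frame (hypothetical sample, not the real file).
sample = pd.DataFrame(
    {
        "name": ["Bulbasaur", "Mewtwo", "Chikorita"],
        "generation": [1, 1, 2],
    }
)
gen1 = keep_first_generation(sample)
print(gen1["name"].tolist())

# On Kaggle, the same helper would be applied to the real file:
# gen1 = keep_first_generation(pd.read_csv(POKEMON_CSV))
```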

Exploring the dataset

The pokemon.csv dataset contains the following columns: 'abilities', 'against_bug', 'against_dark', 'against_dragon',
'against_electric', 'against_fairy', 'against_fight', 'against_fire',
'against_flying', 'against_ghost', 'against_grass', 'against_ground',
'against_ice', 'against_normal', 'against_poison', 'against_psychic',
'against_rock', 'against_steel', 'against_water', 'attack',
'base_egg_steps', 'base_happiness', 'base_total', 'capture_rate',
'classfication', 'defense', 'experience_growth', 'height_m', 'hp',
'japanese_name', 'name', 'percentage_male', 'pokedex_number',
'sp_attack', 'sp_defense', 'speed', 'type1', 'type2', 'weight_kg',
'generation', 'is_legendary'.

Here, I’ll drop abilities, base_egg_steps, base_happiness, experience_growth, pokedex_number, percentage_male and capture_rate, as they add nothing to the rest of the analysis.

I’ll keep only the categorical classifications, basic morphological information, power and defense values, vulnerabilities and speed.
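A possible way to write that clean-up with pandas is sketched below. The column names come from the list above. The helper name drop_unused_columns and the one-row toy frame are assumptions made for illustration.

```python
import pandas as pd

# Columns named in the text as not useful for the rest of the analysis.
DROPPED_COLUMNS = [
    "abilities",
    "base_egg_steps",
    "base_happiness",
    "experience_growth",
    "pokedex_number",
    "percentage_male",
    "capture_rate",
]


def drop_unused_columns(pokemon: pd.DataFrame) -> pd.DataFrame:
    """Return a copy without the unused columns.

    errors="ignore" keeps the call safe if a column is already gone.
    """
    return pokemon.drop(columns=DROPPED_COLUMNS, errors="ignore")


# Hypothetical one-row frame with a few of the real column names.
toy = pd.DataFrame(
    {
        "name": ["Bulbasaur"],
        "abilities": ["['Overgrow', 'Chlorophyll']"],
        "capture_rate": ["45"],
        "speed": [45],
    }
)
cleaned = drop_unused_columns(toy)
print(cleaned.columns.tolist())
```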

Study of the type

This is the distribution of the primary types of pokemon in the game for the first generation. It is interesting to see that the water type, which has a lot of advantages over the others, is so dominant.

Here is the composite type, the concatenation of the primary and secondary types. Again, it is interesting to see that some combinations are very under-represented, but this may be due to limiting the study to the first generation only.
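For readers who want to rebuild the composite type, here is a small sketch assuming pandas and the 'type1' and 'type2' columns of pokemon.csv. The underscore separator and the helper name composite_type are arbitrary choices of mine, and single-type pokemon simply keep their primary type.

```python
import pandas as pd


def composite_type(pokemon: pd.DataFrame) -> pd.Series:
    """Concatenate type1 and type2; single-type pokemon keep type1 only."""
    secondary = pokemon["type2"].fillna("")
    combined = pokemon["type1"].str.cat(secondary, sep="_")
    # Remove the trailing separator left when there is no secondary type.
    return combined.str.rstrip("_")


sample = pd.DataFrame(
    {
        "name": ["Bulbasaur", "Charmander", "Squirtle", "Ivysaur"],
        "type1": ["grass", "fire", "water", "grass"],
        "type2": ["poison", None, None, "poison"],
    }
)
composite = composite_type(sample)
counts = composite.value_counts()
print(counts)
```

Plotting counts as a bar chart would then give the distribution discussed above.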

Here is the distribution by classification (the 'classfication' column). It mostly serves to add some lore to the pokemon universe, but it is funny to see how many mice can be found in it.

Study of the correlation of the variables

As expected, weight and height are strongly correlated, and both are correlated with defense. I guess that the heavier or bigger the pokemon, the higher the level and therefore the defense value (for instance, Snorlax, Blastoise and Onix are all big and heavy with a high defense). On the other hand, speed is strongly anti-correlated with weight: the heavier, the slower, which makes sense.

The ‘against_xxx’ features give the damage multiplier taken from a specific type of attack. This correlation matrix shows co-occurring weaknesses, i.e., if you are weak against bug, you are likely to be weak against ghost, poison and psychic, and if you are weak against fire, you are probably weak against flying too. This could prove useful to tune your team against a gym leader. You want to take on a psychic gym but you don’t have any ghost or dark pokemon? Go for bugs then.
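Both correlation matrices discussed in this section can be computed along the following lines. This is only a sketch: it assumes pandas and its default Pearson correlation, the helper name correlation_matrix is mine, and the toy values are made up so that the expected signs are obvious.

```python
import pandas as pd


def correlation_matrix(pokemon: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Pearson correlation between the selected numeric columns."""
    return pokemon[columns].corr()


# Made-up toy values: heavier means taller and slower.
toy = pd.DataFrame(
    {
        "weight_kg": [10.0, 20.0, 30.0, 40.0],
        "height_m": [0.5, 1.0, 1.5, 2.0],
        "speed": [90, 70, 50, 30],
    }
)
corr = correlation_matrix(toy, ["weight_kg", "height_m", "speed"])
print(corr.round(2))

# For the weaknesses, the same helper would be fed the 'against_' columns:
# against = [c for c in gen1.columns if c.startswith("against_")]
# weaknesses = correlation_matrix(gen1, against)
```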

What makes a legendary pokemon

There are 5 main legendaries: Moltres for fire, Zapdos for electric, Articuno for ice, and Mew and Mewtwo for psychic. The first three being birds, they also have the flying type.

Moltres the fiery

Boxplot for non-legendary fire pokemon

For fire pokemon, defense and speed are relatively low compared to attack. Moltres is in the top tier in terms of defense, high in attack and speed, and balanced overall.
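The boxplots in this section compare a legendary with the non-legendary pokemon sharing its type. The same comparison can be condensed into one number per stat: the share of peers that the legendary beats. The sketch below assumes pandas and the 'type1', 'type2' and 'is_legendary' columns. The helper name share_of_peers_below and every value in the toy frame are invented for illustration, so they say nothing about the real Moltres.

```python
import pandas as pd


def share_of_peers_below(
    pokemon: pd.DataFrame, name: str, ptype: str, stat: str
) -> float:
    """Fraction of non-legendary pokemon of a type with a lower stat."""
    has_type = (pokemon["type1"] == ptype) | (pokemon["type2"] == ptype)
    peers = pokemon[has_type & (pokemon["is_legendary"] == 0)]
    target = pokemon.loc[pokemon["name"] == name, stat].iloc[0]
    return float((peers[stat] < target).mean())


# Invented toy data: four ordinary fire pokemon and one legendary.
toy = pd.DataFrame(
    {
        "name": ["peer_a", "peer_b", "peer_c", "peer_d", "toy_legend"],
        "type1": ["fire", "fire", "fire", "bug", "fire"],
        "type2": [None, None, "flying", "fire", "flying"],
        "is_legendary": [0, 0, 0, 0, 1],
        "attack": [50, 60, 70, 80, 75],
    }
)
share = share_of_peers_below(toy, "toy_legend", "fire", "attack")
print(share)
```

A share close to 1 means the legendary sits above almost every pokemon of its type for that stat, which is what the top of a boxplot shows visually.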

Zapdos the sparky

Boxplot for non-legendary electric pokemon

Zapdos may not be the fastest electric type (speed being their main stat), but it is still around the median and compensates with high stats in the other fields.

Articuno the cool one

Boxplot for non-legendary ice pokemon

Except for speed, Articuno is the best in its class, especially in defense, though the low number of ice pokemon helps.

Compared to birds

Boxplot for non-legendary flying pokemon

The average value of each stat for the three legendary birds is 92 in defense, attack and speed, which ranks them among the most defensive flying pokemon, quite fast though not the most powerful.
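The figure of 92 can be checked by hand. The snippet below takes the base stats usually listed for the three birds (an assumption that is worth verifying against pokemon.csv) and averages each stat, rounding to the nearest integer.

```python
from statistics import mean

# Base stats as commonly listed for the three birds; treat them as an
# assumption and double-check them against pokemon.csv.
BIRDS = {
    "Articuno": {"attack": 85, "defense": 100, "speed": 85},
    "Zapdos": {"attack": 90, "defense": 85, "speed": 100},
    "Moltres": {"attack": 100, "defense": 90, "speed": 90},
}

# Average each stat over the three birds and round to an integer.
averages = {
    stat: round(mean(bird[stat] for bird in BIRDS.values()))
    for stat in ("attack", "defense", "speed")
}
print(averages)
```

Each stat sums to 275 over the three birds, so all three averages land on the same rounded value.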

Mew and Mewtwo

Boxplot for non-legendary psychic pokemon

Mew and Mewtwo are clearly outliers in terms of raw power, not only compared to the other pokemon of their type but also among all pokemon.

Who is the best starter

In the game, the first two gym battles are against Brock with his Geodude and Onix, and then against Misty with Staryu and Starmie.

As players, we can pick one of:

  • Electric type with Pikachu
  • Fire type with Charmander
  • Water type with Squirtle
  • Grass and poison with Bulbasaur

Let’s see how they perform against those gym opponents

Pikachu and Charmander are very bad against the ground and rock pokemon, and Charmander is also weak against water, even if Pikachu could be useful there.
On the other hand, Squirtle’s water type would be very effective in the first gym battle but not very good against Misty.
Finally, Bulbasaur, whose grass type is highly effective against ground and rock and also against water pokemon, is more versatile and therefore the best pokemon early in the game.
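The reasoning above boils down to multiplying type multipliers: an attack against a dual-type pokemon is scaled by the product of the multipliers against each of its types. The sketch below hand-copies only the pairs needed here from the standard type chart and considers each starter's primary attack type only, which is a simplification of mine. The names CHART, GYMS, multiplier and gym_scores are illustrative.

```python
from math import prod

# (attack type, defender type) -> multiplier; unlisted pairs count as 1.0.
CHART = {
    ("electric", "ground"): 0.0,
    ("electric", "water"): 2.0,
    ("fire", "rock"): 0.5,
    ("fire", "water"): 0.5,
    ("water", "rock"): 2.0,
    ("water", "ground"): 2.0,
    ("water", "water"): 0.5,
    ("grass", "rock"): 2.0,
    ("grass", "ground"): 2.0,
    ("grass", "water"): 2.0,
}

STARTERS = {
    "Pikachu": "electric",
    "Charmander": "fire",
    "Squirtle": "water",
    "Bulbasaur": "grass",
}

GYMS = {
    "Brock": {"Geodude": ("rock", "ground"), "Onix": ("rock", "ground")},
    "Misty": {"Staryu": ("water",), "Starmie": ("water", "psychic")},
}


def multiplier(attack_type: str, defender_types: tuple) -> float:
    """Product of the multipliers against each type of the defender."""
    return prod(CHART.get((attack_type, d), 1.0) for d in defender_types)


def gym_scores(attack_type: str) -> dict:
    """Multiplier against every pokemon of each gym leader."""
    return {
        leader: [multiplier(attack_type, types) for types in team.values()]
        for leader, team in GYMS.items()
    }


scores = {name: gym_scores(t) for name, t in STARTERS.items()}
for name, result in scores.items():
    print(name, result)
```

Only Bulbasaur ends up above 1 in both gyms, which matches the conclusion drawn in the text.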

Studying the moves

There are two types of moves: attack moves, which have an actual damage-dealing power, and stat-modifying moves, which change a pokemon’s stats instead of dealing damage.
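One simple way to separate the two families is to look at the power of each move. The sketch below is written under assumptions: I assume the moves file has 'Name' and 'Power' columns and that moves without damage carry a non-numeric placeholder such as "-", so the real column names and placeholder in All_Moves.csv should be checked first. The helper names is_damaging and split_moves are mine.

```python
def is_damaging(move: dict) -> bool:
    """A move counts as damaging when its power is a positive number."""
    power = str(move.get("Power", "")).strip()
    return power.isdigit() and int(power) > 0


def split_moves(moves: list) -> tuple:
    """Return (damage-dealing moves, stat-modifying moves)."""
    damaging = [m for m in moves if is_damaging(m)]
    modifying = [m for m in moves if not is_damaging(m)]
    return damaging, modifying


# Small hand-written sample; "-" is the assumed placeholder for no power.
moves = [
    {"Name": "Ember", "Power": "40"},
    {"Name": "Growl", "Power": "-"},
    {"Name": "Water Gun", "Power": "40"},
    {"Name": "Tail Whip", "Power": "-"},
]
damaging, modifying = split_moves(moves)
print([m["Name"] for m in damaging])
print([m["Name"] for m in modifying])
```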

Stats modifying moves by type
Damage dealing moves by type

Is it possible to win the game with Ash’s team?

This is based only on type advantages, not on the power of the moves nor on stat-modifying moves.

Gym #1 — Pewter City Gym

In his first battle, Ash fought against Geodude and Onix using Pikachu and Pidgeotto
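To make the method concrete for this first gym, the sketch below looks up, for each opponent, the 'against_<type>' multiplier of every attack type available in Ash's team and keeps the best one. The dictionaries stand in for the relevant cells of pokemon.csv, with values filled in from the standard type chart for rock and ground pokemon rather than read from the file, and the helper name best_option is mine.

```python
# Attack types available to each member of Ash's team in this battle.
ASH_TEAM = {"Pikachu": ["electric"], "Pidgeotto": ["normal", "flying"]}

# Stand-in for the 'against_<type>' cells of the two opponents
# (rock/ground pokemon: immune to electric, resistant to normal and flying).
GYM_1 = {
    "Geodude": {
        "against_electric": 0.0,
        "against_normal": 0.5,
        "against_flying": 0.5,
    },
    "Onix": {
        "against_electric": 0.0,
        "against_normal": 0.5,
        "against_flying": 0.5,
    },
}


def best_option(team: dict, opponent_row: dict) -> tuple:
    """Return (multiplier, team member, attack type) with the top multiplier."""
    options = [
        (opponent_row[f"against_{attack_type}"], member, attack_type)
        for member, types in team.items()
        for attack_type in types
    ]
    return max(options, key=lambda option: option[0])


best = {name: best_option(ASH_TEAM, row) for name, row in GYM_1.items()}
for name, (value, member, attack_type) in best.items():
    print(f"{name}: best is {member} ({attack_type}) at x{value}")
```

With a best multiplier below 1 against both opponents, this team has no type advantage at all in Pewter City. The same lookup can be repeated for the other gyms.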

Gym #2 — Cerulean City Gym

In this battle, Ash faces Starmie and Staryu with Butterfree and Pidgeotto

Gym #3 — Vermilion City Gym

There he faces Voltorb, Pikachu and Raichu with his own Pikachu

Gym #4 — Celadon City Gym

This match pits Victreebel, Tangela and Vileplume against Pikachu, Bulbasaur and Charmander

Gym #5 — Fuchsia City Gym

There, he fights Koffing, Muk and Weezing with Pidgeotto and Charmander

Gym #6 — Saffron City Gym

This one is Kadabra, Venomoth, Alakazam and Mr. Mime against Pikachu

Gym #7 — Cinnabar Island Gym

This fire battle pits Growlithe, Ponyta, Rapidash and Arcanine against Charizard

Gym #8 — Viridian City Gym

Finally, he faces Rhyhorn, Dugtrio, Nidoqueen, Nidoking and Rhydon with Bulbasaur, Squirtle, Pikachu and Pidgeotto

Ash mostly made very poor strategic choices when picking his pokemon, except for Squirtle and Bulbasaur, and even those work mainly thanks to the coverage of the grass, poison and water types, so it was probably more dumb luck than anything else.


There are still many things that could be said about pokemon, but it is interesting to see that underneath a seemingly childish game there is actually a high level of strategy, and also a way to beat the game.

Engineer passionate about technology, data processing and AI at large, doing my best to help in the machine uprising https://elbichon.github.io/