Here's how I got started with my first machine learning project and you can too.
Let's take a look.

(This thread will take you from zero to hero in machine learning, trust me)

(1 / 22)
🧵👇

Getting started with your first machine learning project might actually be much easier than it seems. If I can do it, anyone can.

I did not use:
- Any Math
- An expensive computer
- Complex programming concepts

(2 / 22)
Here's what I did use:

- A free GPU on Google Colab
- Python
- TensorFlow
- Numpy
- Pandas
- Kaggle
- Scikit-Learn
- Google
- StackOverFlow

(3 / 22)
This project was actually a Kaggle challenge based on the MNIST dataset, which is a collection of 70,000 images of handwritten digits.

You can find the dataset here👇
🔗//kaggle.com/c/digit-recognizer

(4 / 22)
Before we go over the code of this project, it is highly recommended that you complete this free course on YouTube👇

Machine Learning foundations course
🔗//youtu.be/_Z9TRANg4c0

Code👇
🔗//colab.research.google.com/github/PrasoonPratham/Kaggle/blob/main/MNIST.ipynb

(5 / 22)
Now let's look at the code.

We'll first download the dataset for this project using the Kaggle API for Python.
Keep in mind that you'll have to provide an API key for this code to work.

(6 / 22)
There are some issues with the names of the files, so we'll rename them and then unzip them using the zip library.

(7 / 22)
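The unzip step could look roughly like the sketch below. It uses Python's built-in zipfile module, which is an assumption on my part about what "the zip library" refers to. The helper name extract_archive and the file names are made up for illustration.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, out_dir):
    """Extract every file in zip_path into out_dir and return the sorted file names."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
        return sorted(archive.namelist())


# Hypothetical usage, assuming the download was saved as "digit-recognizer.zip":
# extract_archive("digit-recognizer.zip", "data")
```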
We'll end up with 3 files. We can discard sample_submission.csv as we won't need it; test.csv and train.csv are what we are interested in.

train.csv will be used for training our neural network, and test.csv will be used for making predictions.
The predictions will be sent to Kaggle.

(8 / 22)

Using pandas we can load both of them as dataframes, which basically turns the .csv data (Excel-like tables) into Python array-like objects so that we can feed them to our neural network.
We'll also import TensorFlow and NumPy while we are here.

(9 / 22)
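Loading a CSV into a dataframe can be sketched like this. To keep it runnable without the real files, it reads a tiny made-up stand-in for train.csv with only 2 pixel columns instead of 784. With the real data you would pass the file path instead, for example pd.read_csv("train.csv").

```python
import io

import pandas as pd

# Made-up stand-in for train.csv: a label column plus 2 pixel columns
demo_csv = io.StringIO("label,pixel0,pixel1\n7,0,255\n2,128,64\n")

train = pd.read_csv(demo_csv)  # with the real file: pd.read_csv("train.csv")
print(train.shape)             # (rows, columns)
print(train.head())            # first few rows, like peeking at a spreadsheet
```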
Let's look at the data. train.csv is what we are interested in.

The training data set (train.csv) has 785 columns. The first column, called "label", is the digit that was drawn by the user. The rest of the columns contain the pixel values of the associated image.

(10 / 22)
Each image is 28 pixels in height and 28 pixels in width, for a total of 28 x 28 = 784 pixels. Each pixel has a single pixel value associated with it, indicating the lightness or darkness of that pixel, with higher numbers meaning darker.

(11 / 22)
This pixel value is an integer between 0 and 255, inclusive. We will pass the pixel values into our neural net and exclude the label. We don't want it to know what number is in the image! It'll have to learn that on its own.

(12 / 22)
This code "drops" the label column and stores it in the Y_train variable. We will also divide each pixel value by 255 to bring it into the 0-1 range, as neural networks tend to perform better with values like these. Finally, we "reshape" the values that go into our neural net.

(13 / 22)
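Here is a minimal sketch of those three steps (separate the label, scale to 0-1, reshape). The function name prepare_training_data and the 28 x 28 x 1 target shape are my assumptions for a convolutional network, and the dataframe is random fake data rather than the real train.csv.

```python
import numpy as np
import pandas as pd


def prepare_training_data(train):
    """Split a train.csv-style dataframe into scaled images and labels."""
    Y_train = train["label"].to_numpy()                        # what the net should predict
    pixels = train.drop(columns=["label"]).to_numpy(dtype="float32")
    X_train = pixels / 255.0                                   # 0-255 becomes 0-1
    X_train = X_train.reshape(-1, 28, 28, 1)                   # rows of 784 become 28x28 images
    return X_train, Y_train


# Fake data with the same layout as train.csv: 1 label column + 784 pixel columns
rng = np.random.default_rng(0)
fake = pd.DataFrame(
    rng.integers(0, 256, size=(5, 784)),
    columns=[f"pixel{i}" for i in range(784)],
)
fake.insert(0, "label", [3, 1, 4, 1, 5])

X_train, Y_train = prepare_training_data(fake)
print(X_train.shape, Y_train.shape)  # (5, 28, 28, 1) (5,)
```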
Remember how I said we'll only use train.csv for training our neural net? We'll split train.csv into 2 parts, one for the actual training and the other for validating how well our neural net did at the end of each iteration. This doesn't make the model more accurate by itself, but it shows how well it handles images it hasn't trained on, which helps us tune it.

(14 / 22)
Note that here I have chosen 10% of our dataset for validation and the rest for training. You can experiment with these values yourself if you wish.

(15 / 22)
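A 90/10 split like this is commonly done with Scikit-Learn's train_test_split. Whether the notebook does it exactly this way is an assumption. The arrays below are placeholders with the right shapes, and random_state is only there to make the split repeatable.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 100 blank "images" and their labels
X = np.zeros((100, 28, 28, 1), dtype="float32")
Y = np.arange(100) % 10

# test_size=0.1 keeps 10% aside for validation; change it to experiment
X_train, X_val, Y_train, Y_val = train_test_split(
    X, Y, test_size=0.1, random_state=42
)
print(X_train.shape, X_val.shape)  # (90, 28, 28, 1) (10, 28, 28, 1)
```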
Here comes the fun part: we'll now define our neural network. The images will pass through its layers and our model will be trained. We'll be using two things called "convolution" and "pooling".

(16 / 22)
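To give a feel for what such a definition looks like in TensorFlow, here is a small sketch. The number of layers, the filter counts, the optimizer and the loss are all illustrative guesses on my part and not necessarily what the notebook uses.

```python
import tensorflow as tf


def build_model():
    """Small convolutional network for 28x28 grayscale digits (illustrative sizes)."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(28, 28, 1)),
        tf.keras.layers.Conv2D(32, (3, 3), activation="relu"),   # convolution
        tf.keras.layers.MaxPooling2D((2, 2)),                    # pooling
        tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),         # one output per digit 0-9
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # labels are plain integers 0-9
        metrics=["accuracy"],
    )
    return model


model = build_model()
model.summary()
```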
A convolution, in simple terms, is like applying a filter (like you do on Instagram) to a photo. It brings out some of the details in the images and helps improve our neural network's accuracy.

(17 / 22)
Pooling does a similar thing by taking the most prominent pixel in an area and throwing out the others.

I found this amazing thread on convolutions by @aumbark which you should definitely check out👇

https://t.co/CnsQvMpHZE

(18 / 22)
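If you want to see convolution and pooling without any deep learning library, here is a toy NumPy sketch. The helper names, the 8x8 image and the edge-detecting filter are invented for illustration. As in most deep learning libraries, the "convolution" here slides the filter without flipping it.

```python
import numpy as np


def convolve2d(image, kernel):
    """Slide kernel over image (no padding, stride 1) and sum the elementwise products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def max_pool2d(image, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    h = image.shape[0] // size * size
    w = image.shape[1] // size * size
    blocks = image[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))


image = np.zeros((8, 8))
image[:, 4:] = 1.0                              # left half 0, right half 1
edge_kernel = np.array([[-1.0, 0.0, 1.0]] * 3)  # reacts where values jump from left to right

features = convolve2d(image, edge_kernel)  # 6x6, large only around the jump
pooled = max_pool2d(features)              # 3x3, keeps the strongest response per block
print(features)
print(pooled)
```

The filter output is non-zero only near the boundary between the two halves, and pooling shrinks that map while keeping the strongest response.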
Now we simply pass our data 15 times (aka epochs) through our neural network and validate it each time using the validation data we made earlier.

(19 / 22)
You'll notice that we get certain metrics in the output. At the end you should see a loss and accuracy similar to the ones in the photo.

(20 / 22)
Congrats!🥳 You've trained the neural network. Now you can make predictions on the test data and store them in a csv file, which we'll submit to Kaggle. (You can use the file icon on the left to browse through the files)

(21 / 22)
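Turning predictions into a submission file might look like the sketch below. The function name make_submission is hypothetical and the probabilities are fake. In practice they would come from something like model.predict on the prepared test images. The ImageId and Label columns follow the competition's sample_submission.csv.

```python
import numpy as np
import pandas as pd


def make_submission(probabilities):
    """Turn per-digit probabilities (one row per image) into a submission table."""
    labels = np.argmax(probabilities, axis=1)  # most likely digit for each image
    return pd.DataFrame({
        "ImageId": np.arange(1, len(labels) + 1),  # image ids start at 1
        "Label": labels,
    })


# Fake probabilities for 3 images: the net is "certain" they are 2, 0 and 9
fake_probs = np.eye(10)[[2, 0, 9]]

submission = make_submission(fake_probs)
print(submission.to_csv(index=False))  # pass a file name such as "submission.csv" to save it
```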
You've learnt a lot by this point, be proud of yourself and dive deeper into machine learning, good luck! 🙌

(22 / 22 🎉)

More from Pratham Prasoon 🚀
