Kumar_praveen96 Authors Simone Scardapane
*Reproducible Deep Learning*
The first two exercises are out!
We start quickly and easily, with some simple manipulation of Git branches, scripting, audio classification, and configuration with @Hydra_Framework.
Small thread with all information 🙃 /n
Reproducibility is usually associated with production environments and MLOps, but today it is also a major concern in the research community.
My biased introduction to the issue is here: https://t.co/PqWH6uL5eT
The local setup is on the repository: https://t.co/9mhtZoJhE9
The use case for the course is a small audio classification model trained on event detection with the awesome @PyTorchLightnin library.
Feel free to check the notebook if you are unfamiliar with the task. /n
I spent some time understanding how to make the course as modular and "reproducible" as possible.
My solution is to split each exercise into a separate Git branch containing all the instructions, and a separate branch with the solution.
Two branches for now (Git and Hydra). /n
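A minimal sketch of that branch-per-exercise layout, assuming a hypothetical naming scheme (`exerciseN_topic` for the instructions, with a `_solution` suffix for the solution). The actual branch names in the repository may differ, and the helper only builds the Git commands without running them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exercise:
    number: int
    topic: str

    @property
    def branch(self) -> str:
        # Hypothetical naming scheme, not taken from the course repository.
        return f"exercise{self.number}_{self.topic}"

    @property
    def solution_branch(self) -> str:
        # The solution lives on its own branch next to the instructions.
        return f"{self.branch}_solution"


def checkout_command(branch: str) -> list[str]:
    # `git checkout <branch>` switches the working tree to that branch.
    return ["git", "checkout", branch]


exercises = [Exercise(1, "git"), Exercise(2, "hydra")]
for ex in exercises:
    print(" ".join(checkout_command(ex.branch)))
    print(" ".join(checkout_command(ex.solution_branch)))
```

Keeping instructions and solution on separate branches means you can compare your own work against the solution with an ordinary Git diff.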
How well do you *really* know Git? The more I learn, the more I find it incredible.
I summarized most of the information on a separate set of slides: https://t.co/6dSmK3IfWB
Be sure to check them out before continuing! /n
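For the configuration exercise mentioned at the start of the thread, the core idea is a default configuration that can be overridden from the command line with dotted keys such as `trainer.max_epochs=10`. The sketch below imitates that idea in plain Python. It does not use Hydra's API, and every key and value in the configuration is a made-up example.

```python
import copy
import json


def apply_overrides(config: dict, overrides: list[str]) -> dict:
    """Return a copy of `config` with dotted `key=value` overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, raw_value = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            # Create intermediate sections on demand.
            node = node.setdefault(name, {})
        try:
            # Parse numbers, booleans and null; fall back to a plain string.
            node[leaf] = json.loads(raw_value)
        except json.JSONDecodeError:
            node[leaf] = raw_value
    return result


# Hypothetical default configuration for the audio classifier.
defaults = {
    "data": {"sample_rate": 8000, "batch_size": 32},
    "trainer": {"max_epochs": 5, "seed": 42},
}

config = apply_overrides(defaults, ["trainer.max_epochs=10", "data.batch_size=64"])
print(config["trainer"]["max_epochs"], config["data"]["batch_size"])
```

The defaults are never modified in place, so every run can log the exact configuration it used, which is the part that matters for reproducibility.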

*Reproducible Deep Learning*
Lectures 3 and 4 are out!
With code versioning out of the way, it is time to look at data versioning (@DVCorg) and environment isolation (@Docker).
All information in a small thread. 👇 /n
If you know Git, you (almost) know @DVCorg!
A fantastic tool for backing up your data to a number of remotes, or for creating "data repositories" from which you can pull folders and artifacts directly.
My intro to DVC: https://t.co/2m3cXGAPN6
/n
For the course, I created a simple exercise tasking you with initializing DVC on the repository, and syncing the data locally and remotely.
To simulate an S3-like interface, we use a small https://t.co/91bFj7KSPG server and boto3.
Code: https://t.co/KDSX80aqJs
/n
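To give a feeling for what data versioning does under the hood, here is a rough sketch loosely modeled on the idea of tracking files by content hash: the data goes into a cache keyed by its checksum, and only a small pointer file would be committed to Git. The cache layout, the `.pointer` extension, and the file names are assumptions made for illustration; DVC's real formats differ in detail.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_md5(path: Path) -> str:
    """Hash a file in chunks so large datasets do not need to fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def add_to_cache(path: Path, cache_dir: Path) -> str:
    """Copy `path` into a content-addressed cache and return its checksum."""
    checksum = file_md5(path)
    target = cache_dir / checksum[:2] / checksum[2:]
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():
        shutil.copy2(path, target)
    # Hypothetical pointer file: small enough to be versioned with Git.
    pointer = path.parent / (path.name + ".pointer")
    pointer.write_text(f"md5: {checksum}\npath: {path.name}\n")
    return checksum


with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    audio = workspace / "clip.wav"
    audio.write_bytes(b"fake audio bytes")
    checksum = add_to_cache(audio, workspace / "cache")
    cached = workspace / "cache" / checksum[:2] / checksum[2:]
    # The cached copy has the same content hash as the original file.
    print(checksum == file_md5(cached))
```

Because the pointer only stores the checksum, checking out an old Git commit tells you exactly which version of the data to fetch from the cache or from a remote.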
Next up, it is time to "dockerize" your environment!
Docker has become almost a de facto standard, and knowing it is practically indispensable today.
A very quick introduction, glossing over a number of details: https://t.co/XSrUZNhd3g
/n
In the corresponding exercise, you will learn about creating a working environment in Docker, packaging the entire training loop, and pushing/pulling an image from the Hub.
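The round trip described above (build an image, run it, push it, pull it back) can be sketched as a sequence of Docker CLI calls. The image name `user/reprodl` and the tag are placeholders, and the helper only assembles the commands rather than executing them, so it works without Docker installed.

```python
import shlex


def docker_round_trip(image: str, tag: str = "latest", context: str = ".") -> list[list[str]]:
    """Build the CLI commands for a build / run / push / pull cycle."""
    reference = f"{image}:{tag}"
    return [
        # Build an image from the Dockerfile found in `context`.
        ["docker", "build", "-t", reference, context],
        # Run the image's default command and remove the container afterwards.
        ["docker", "run", "--rm", reference],
        # Upload the image to the registry, then fetch it again elsewhere.
        ["docker", "push", reference],
        ["docker", "pull", reference],
    ]


# Placeholder image name and tag, chosen only for illustration.
commands = docker_round_trip("user/reprodl", tag="0.1")
for command in commands:
    print(shlex.join(command))
```

To actually execute the steps, each list could be passed to `subprocess.run(command, check=True)` on a machine with Docker installed and a registry login.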
Code is here: