Hey folks, today is a Mindblowing Monday 🤯!

Today I want to tell you about Language Models, the type of machine learning model behind most of the recent hype in natural language processing.

❓ Want to know more about them?

🧵👇 1/15

A language model is a computational representation of human language that models which sentences are more likely to appear in a given language.

🎩 Formally, a language model is a probability distribution over the sentences in a language.

❓ What are they used for?

👇 2/15
⚙️ Language models allow computers to understand and manipulate language at least to some degree. They are used in machine translation, speech to text, optical character recognition, text generation, and many more applications!

They come in many flavors:

👇 3/15
⭐ The simplest language model is the *unigram model*, also called a *bag of words* (BOW).

In BOW, each word is assigned a probability, and the probability of a sentence is computed assuming all its words are independent. But of course, this isn't true.

👇 4/15
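To make this concrete, here's a minimal bag-of-words sketch in Python. The corpus is a toy assumption, purely for illustration:

```python
from collections import Counter

# Toy corpus; a real model would be estimated from millions of sentences.
corpus = "the cat sat on the mat the dog sat on the rug".split()
counts = Counter(corpus)
total = sum(counts.values())

def unigram_prob(sentence):
    """P(sentence) under BOW: the product of the individual word probabilities."""
    p = 1.0
    for word in sentence.split():
        p *= counts[word] / total
    return p

# Word order is ignored, which is exactly the model's weakness:
print(unigram_prob("the cat sat") == unigram_prob("sat the cat"))  # True
```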
For example, "water" is a more commonly used word than "philosophy", but the phrase "philosophy is the mother of science" is arguably much more likely than the phrase "water is the mother of science".

💡 The likelihood of a phrase depends upon all its words.

👇 5/15
⭐ This dependency can be modelled with an *n-gram model*, in which the likelihood of each word is computed with respect to the words that precede it in the phrase.

💡 If we start a phrase with "philosophy", we're more likely to see the word "science" than "shark".

👇 6/15
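A bigram (n=2) version of the same sketch, conditioning each word on the one before it (toy corpus again, purely illustrative):

```python
from collections import Counter

corpus = ("philosophy is the mother of science . "
          "water is wet . the mother of the dog barked .").split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def bigram_prob(sentence):
    """P(sentence) ~ P(w1) * product of P(w_i | w_{i-1})."""
    words = sentence.split()
    p = unigrams[words[0]] / len(corpus)
    for prev, cur in zip(words, words[1:]):
        p *= bigrams[(prev, cur)] / unigrams[prev]
    return p

# "mother of" occurs in the corpus, "mother barked" doesn't:
print(bigram_prob("the mother of") > bigram_prob("the mother barked"))  # True
```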
☝️ The problem with n-gram models is that the number of parameters you need to store grows exponentially with n.

If you want to capture phrases of length n=10, you need on the order of N^10 numbers, where N is the number of words in the language!

👇 7/15
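The blow-up is easy to see with back-of-the-envelope numbers, assuming a (made-up) vocabulary of 50,000 words:

```python
N = 50_000  # assumed vocabulary size, just for illustration

for n in (1, 2, 3, 10):
    print(f"n={n}: up to {N ** n:.1e} parameters")
# n=10 gives 50,000^10, roughly 9.8e46 -- far more than you could ever store.
```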
⭐ Neural language models (aka continuous space models) are a solution to this exponential explosion.

They try to jointly learn a vector representation for every word (aka an embedding) and a mathematical operation on those vectors that approximates the likelihood.

👇 8/15
⚙️ Neural language models are built by training a neural network to predict some relationships between words and the phrases in which they appear.

The most popular neural language model is *word2vec*, which is trained to predict a word given a small window of words around it.

👇 9/15
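A sketch of how word2vec's skip-gram training data is built: each word is paired with its neighbors inside a small window (a simplification; the real model then trains a network on these pairs):

```python
def context_pairs(sentence, window=2):
    """(center, context) training pairs, as in skip-gram word2vec."""
    words = sentence.split()
    pairs = []
    for i, center in enumerate(words):
        lo, hi = max(0, i - window), min(len(words), i + window + 1)
        pairs.extend((center, words[j]) for j in range(lo, hi) if j != i)
    return pairs

print(context_pairs("philosophy is the mother of science", window=1))
```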
👉 Modern neural language models have more complex neural network architectures.

Popular examples are BERT and the family of GPT models, of which GPT-3 recently took the Internet by storm with its ability to speak nonstop about anything, often without much sense.

👇 10/15
😇 The nice thing about language models is that they can be trained independently of any NLP problem and then used inside specific applications with a little fine-tuning.

👇 11/15
😇 They also improve efficiency. A big company (like OpenAI or Google) can train a big language model, and then the rest of us mortals can use it without having to pay millions in GPU training time.

⚠️ But they don't come without issues!

👇 12/15
🤔 Language models encode "common" language use, so all the human biases in that use are implicitly stored in them.

For example, the phrase "boy is a programmer" may be considered more likely than "girl is a programmer", simply because the Internet has more examples of the first.

👇 13/15
☝️ If used without care, these language models will introduce subtle biases in your application that are very hard to discover and debug. Understanding and fixing these biases is one of the most exciting and important issues in AI safety!

👇 14/15
👋 And that's it for today.

If you'd like to talk about language models, reply in this thread or @ me at any time.

Feel free to ❤️ like and 🔁 retweet if you think someone else could benefit from knowing this stuff.

⚓ 15/15
