I’m excited to share our new paper on HyperTransformers, a novel architecture for few-shot learning able to generate the weights of a CNN directly from a given support set. 🧵👇

📜: https://t.co/vcm67G6P6t with Andrey Zhmoginov and Mark Sandler.

2) We train a transformer model to “convert” a few-shot task description into a small CNN specialized in solving it on new images.
3) This effectively decouples a high-capacity transformer generator from a much smaller inference model. This differs from most existing methods, e.g. MAML, where the generator and the executing model share the same architecture.
4) CNN weights are generated layer by layer from a combination of a layer embedding (features from the last generated layer) and image w/ class embeddings (features directly from the data). The final weights are extracted from the output of the self-attention (similar to [CLS] tokens).
5) What is cool is that we can also add unlabeled samples from the support set into the mix, effectively allowing for semi-supervised few-shot learning!
6) HyperTransformers are comparable in performance to many competing methods on the miniImageNet and tieredImageNet datasets.
7) But our method especially shines for small target CNN architectures, where the large capacity of the transformer model is the most useful and noticeable. For the 8-channel model we are seeing a 5-10% improvement over MAML++!
8) As it turns out, for small target models, where every neuron matters, it is important to generate the whole network from a given support set. For larger target models, generating only the last logits layer appears to be sufficient.
9) We are really excited about the direction of using Transformers to guide the construction and performance of smaller specialized models, e.g. in low-power settings. This has a lot of applications in areas where high-performance, compact, personalized networks are used.
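A minimal NumPy sketch of the idea in 2)-4), under loud assumptions: single-head self-attention, random untrained parameters, one placeholder token per output channel, and made-up sizes. The names (`generate_layer_weights`, `weight_tokens`) are hypothetical and not taken from the paper's code. It only shows how the weights of one conv layer could be read off the placeholder-token outputs after attending over the support set.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, wq, wk, wv):
    # Single-head attention with no positional encoding (an assumption).
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def generate_layer_weights(sample_feats, labels, n_classes,
                           weight_tokens, params, out_shape):
    """Hypothetical helper: turn a support set into weights for one layer."""
    # Sample tokens: per-image features concatenated with a one-hot class embedding.
    one_hot = np.eye(n_classes)[labels]
    sample_tokens = np.concatenate([sample_feats, one_hot], axis=1)
    # Prepend placeholder weight tokens (playing a role similar to [CLS]).
    tokens = np.concatenate([weight_tokens, sample_tokens], axis=0)
    out = self_attention(tokens, *params)
    # Keep only the outputs at the placeholder positions and reshape them.
    n_w = weight_tokens.shape[0]
    return out[:n_w].reshape(out_shape)

# Toy 5-way 1-shot task; every size below is an assumption for illustration.
rng = np.random.default_rng(0)
n_classes, d_feat = 5, 16
feats = rng.normal(size=(n_classes, d_feat))   # stand-in for image features
labels = np.arange(n_classes)
d = d_feat + n_classes                         # token width
out_ch, in_ch, ksize = 8, 3, 3                 # target conv layer: 8 channels, 3x3
d_out = in_ch * ksize * ksize
params = (rng.normal(size=(d, 32)),            # wq
          rng.normal(size=(d, 32)),            # wk
          rng.normal(size=(d, d_out)))         # wv
weight_tokens = rng.normal(size=(out_ch, d))   # would be learned in a real model

kernel = generate_layer_weights(feats, labels, n_classes,
                                weight_tokens, params,
                                (out_ch, in_ch, ksize, ksize))
print(kernel.shape)
```

In a trained model the projection matrices and placeholder tokens would be learned end to end, and the generated kernel would be plugged into the small CNN that classifies the query images; none of that training is shown here.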

More from All

🌺Shri Garuda Purana - A Brief Description🌺

The Garuda Purana, one of the 18 Puranas of Hinduism, holds great importance in the Hindu religion. It explains the attainment of a good state (sadgati) after death. The presiding deity of this Purana is Lord Vishnu, which is why it is a Vaishnava Purana.


According to the Garuda Purana, we receive the fruits of our actions during our lifetime, but after death too we receive the results of our good and bad deeds accordingly. For this reason, the time after the death of a family member has been set aside for acquiring the knowledge contained in this Purana...

...so that at that time we can learn all the truths connected with life and death, and the grief over the member lost to death can be lessened.
The Garuda Purana describes devotion to Vishnu and his avatars in detail, in the same way as the Bhagavata Purana. At the beginning come the creation of the world from Manu and the story of Dhruva.


After that, the mantras of the Sun and Moon planets, the Shiva-Parvati mantra, the mantras related to Indra, the Saraswati mantra, and the nine Shaktis are described in detail.
This Purana is said to contain nineteen thousand shlokas and is told in two parts.
The first part covers devotion to Vishnu and methods of worship.

There is a provision for listening to the Garuda Purana after a death.
The second part of the Purana gives a detailed description of the 'Pretakalpa' and an account of the soul's fall into the hells. What becomes of a person after death, in what kinds of births (yonis) they are reborn, the means of liberation from the preta state...
https://t.co/6cRR2B3jBE
Viruses and other pathogens are often studied as stand-alone entities, despite that, in nature, they mostly live in multispecies associations called biofilms—both externally and within the host.

https://t.co/FBfXhUrH5d


Microorganisms in biofilms are enclosed by an extracellular matrix that confers protection and improves survival. Previous studies have shown that viruses can secondarily colonize preexisting biofilms, and viral biofilms have also been described.


...we raise the perspective that CoVs can persistently infect bats due to their association with biofilm structures. This phenomenon potentially provides an optimal environment for nonpathogenic & well-adapted viruses to interact with the host, as well as for viral recombination.


Biofilms can also enhance virion viability in extracellular environments, such as on fomites and in aquatic sediments, allowing viral persistence and dissemination.
How can we use language supervision to learn better visual representations for robotics?

Introducing Voltron: Language-Driven Representation Learning for Robotics!

Paper: https://t.co/gIsRPtSjKz
Models: https://t.co/NOB3cpATYG
Evaluation: https://t.co/aOzQu95J8z

🧵👇(1 / 12)


Videos of humans performing everyday tasks (Something-Something-v2, Ego4D) offer a rich and diverse resource for learning representations for robotic manipulation.

Yet an underused part of these datasets is the rich natural language annotations accompanying each video. (2/12)

The Voltron framework offers a simple way to use language supervision to shape representation learning, building off of prior work in representations for robotics like MVP (https://t.co/Pb0mk9hb4i) and R3M (https://t.co/o2Fkc3fP0e).

The secret is *balance* (3/12)

Starting with a masked autoencoder over frames from these video clips, make a choice:

1) Condition on language and improve our ability to reconstruct the scene.

2) Generate language given the visual representation and improve our ability to describe what's happening. (4/12)

By trading off *conditioning* and *generation* we show that we can learn 1) better representations than prior methods, and 2) explicitly shape the balance of low and high-level features captured.

Why is the ability to shape this balance important? (5/12)
