👩🏽‍🍳 Gather those heirloom family recipes
The holidays are a great time to cook some of those deeply meaningful recipes. Now you can dictate by voice or transcribe a handwritten notecard to keep them in one central, easy-to-access place.
This week we’re turning to food and its central role in the celebration of holidays around the world.
One thing they all have in common? Special family recipes, passed down from one generation to the next, modified, shared, loved.
🥘 Food is my love language.
It’s an uncontested fact that my mum is an incredible cook. Her meals have always been flavorful, but they also benefit from the magic that comes from care. She does everything with an intention and precision you can taste in every bite. It’s no secret that everywhere we’ve lived, we’ve hosted a dinner party during my mum’s visits, to heavily influence our candidacy as friends.
I inherited this love of cooking and the deep cultural role that food plays. And while I’ve developed dishes I can call my own, I’ve never been able to touch the ones my mum makes.
Now though, she’s been spending more time with my 9yo - passing down her skills to those eager little hands. And I’ve been itching to get those “recipes” out of her head and into something we can use and practice at home.
But anyone who has tried to get a parent or grandparent to write down a recipe they know by heart knows it’s easier said than done. Plus, these “heirlooms” deserve to be captured on more than a random notecard lost in the wilds of our cupboards.
This week’s helper focuses on where the practical meets the magic - helping to gather 10 family recipes into a collection that you can download or turn into a cookbook.
Kick off this helper with “cookbook” - you can capture up to 10 recipes.
Share the recipe by 💬 texting it, 📸 sending an image (handwritten notecard, cookbook page, etc.), or 📞 calling Milo to dictate the recipe by voice.
Milo will ask for any modifications, notes, or the story of why this recipe is special.
Milo will share a link to the recipe card that you can download, or save the link to the cookbook to access anytime.
Give it a whirl and let me know what you think.
We hope we can help lighten even a little of the load this holiday season 💕
xo,
-avni + milo team
🔦 Technical spotlight
This week, we’re tackling vision - here’s a quick look behind the scenes.
👀 Multimodal LLMs OR How can GPT See?
💬 Traditional LLMs: Let's start with a quick recap of how traditional text-based LLMs, like GPT, work. These models process text by breaking it down into "tokens." Tokens can be words, parts of words, or even punctuation. The model has been trained on a vast amount of text data, so it learns the patterns and structures of language. When you ask a question or give a command, the LLM predicts the most likely next token based on the tokens it has already seen. This is done sequentially to form complete sentences and coherent responses.
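That "predict the most likely next token" loop can be sketched with a toy bigram model - a hypothetical, drastically simplified stand-in for a real LLM, using whole words as tokens:

```python
from collections import Counter, defaultdict

# "Train" a toy bigram model: count which token follows which.
corpus = "add the flour then add the sugar then add the butter".split()
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def next_token(prev):
    # Predict the single most likely next token - a real LLM does a
    # far richer version of this over its whole context window.
    return following[prev].most_common(1)[0][0]

# Generate sequentially, one token at a time, feeding each
# prediction back in - just like the description above.
tokens = ["add"]
for _ in range(4):
    tokens.append(next_token(tokens[-1]))
print(" ".join(tokens))  # → "add the flour then add"
```

A real model predicts from the entire preceding sequence (not just one prior token) and outputs a probability distribution rather than a single guess, but the sequential token-by-token generation works the same way.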
🔎 Tokens for Images? Images can't be fed directly into a text LLM, because these models are designed to handle sequences of text tokens. So we need ways to get an image into a form the model can work with -
Base64 Encoding: Before an image ever reaches the model, it's typically converted from binary data into plain text characters using Base64 encoding. This is really a transport step: it lets the image travel inside an ordinary text-based API request, right alongside your prompt.
Image Tokens: On the model side, a vision encoder splits the image into small patches and converts each patch into an embedding that the model treats much like a text token. That's how pixels become something an LLM can "read."
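Here's a minimal sketch of that Base64 transport step (the bytes and the data-URL shape are illustrative; exact request formats vary by API):

```python
import base64

# A few fake bytes standing in for the binary contents of an
# image file, e.g. a photo of a handwritten recipe card.
image_bytes = b"\x89PNG\r\n\x1a\nfake-pixel-data"

# Base64 turns arbitrary binary into plain ASCII characters,
# so the image can ride inside a text-based API request.
encoded = base64.b64encode(image_bytes).decode("ascii")

# Many vision APIs accept the result wrapped in a data URL:
data_url = f"data:image/png;base64,{encoded}"

# Decoding recovers the exact original bytes - nothing is lost.
assert base64.b64decode(encoded) == image_bytes
```

Note that Base64 only changes the *representation*; the model still needs its vision encoder (the patch-embedding step above) to actually understand what's in the picture.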
✨ Putting it together - Vision LLMs: Models like GPT-4 with vision combine these pieces. They're trained on paired text and images, learning to correlate textual descriptions with visual content (CLIP pioneered this kind of pairing; DALL-E works in the other direction, generating images from text). When you send a photo of a handwritten recipe card along with a prompt, the model turns the image into those patch tokens and reasons over them and your text together to produce its answer.