
1.1 Fundamentals and key concepts of generative AI

To test MagicFridge's functionalities effectively with AI, you first need to understand what sits under the hood. AI is not a monolithic block but the result of a long technological evolution.

1.1.1 The AI spectrum: a technological evolution

The history of artificial intelligence can be broken down into four major stages, all of which appear, to varying degrees, in our application.

1. Symbolic AI

This is the historical form, based on strict logical rules coded by humans. It does not learn; it applies procedures.

Red thread: MagicFridge

In MagicFridge's legacy code, this is the reliable module that compares today's date with the expiration date of your yogurts: IF the date has passed, THEN an alert is triggered. The behavior is binary and fully predictable.
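
A minimal sketch of such a rule in Python (the function name expiry_alert is illustrative, not MagicFridge's actual code):

```python
from datetime import date

def expiry_alert(expiration_date, today=None):
    """Rule-based check: IF the expiration date has passed, THEN alert."""
    today = today or date.today()
    return today > expiration_date

# Binary and fully predictable behavior:
print(expiry_alert(date(2025, 1, 1), today=date(2025, 1, 2)))  # True: trigger the alert
print(expiry_alert(date(2025, 1, 3), today=date(2025, 1, 2)))  # False: still fresh
```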

2. Classical Machine Learning (ML)

Here, the paradigm changes: we no longer code explicit rules; we feed the machine data so it learns to classify or predict.

Red thread: MagicFridge

This technology powers the app's recommendation engine. By analyzing the purchase history of thousands of users, the system detects trends and predicts that a specific customer has a high probability of enjoying a curry recipe, simply by "learning" from their past consumption habits.
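
A minimal sketch of this idea with scikit-learn; the features and all numbers are invented purely for illustration, not MagicFridge's real data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy purchase-history features per user:
# [curry_pastes_bought, fresh_herbs_bought, frozen_pizzas_bought]
X = np.array([
    [5, 4, 0],   # users who cook a lot of spiced dishes...
    [4, 5, 1],
    [0, 1, 6],   # ...versus users who mostly reheat pizza
    [1, 0, 5],
])
y = np.array([1, 1, 0, 0])  # 1 = enjoyed curry recipes in the past

model = LogisticRegression().fit(X, y)

new_user = np.array([[3, 4, 1]])
proba = model.predict_proba(new_user)[0, 1]
print(f"Probability this user enjoys curry: {proba:.0%}")
```

The key point: no rule about curry was ever written by hand; the prediction comes entirely from patterns in the data.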

3. Deep Learning (DL)

Inspired by the neural structure of the human brain, this method excels at recognizing complex patterns such as images.

Red thread: MagicFridge

This is how the "Scan" feature works: when the user photographs a crumpled receipt, a multi-layered neural network analyzes the pixels to decipher the word "Eggplant", where a classical approach would have failed.
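
A minimal sketch of such a multi-layered network in PyTorch, here classifying single character crops cut out of a receipt (the architecture and sizes are illustrative, not the real Scan model):

```python
import torch
import torch.nn as nn

class CharNet(nn.Module):
    """Tiny convolutional network classifying 28x28 character images."""
    def __init__(self, n_classes=36):  # e.g. A-Z + 0-9
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                            # 28x28 -> 14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                            # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(32 * 7 * 7, n_classes)

    def forward(self, x):
        x = self.features(x)               # successive layers extract patterns
        return self.classifier(x.flatten(1))

net = CharNet()
fake_crop = torch.randn(1, 1, 28, 28)      # one grayscale character crop
logits = net(fake_crop)
print(logits.shape)                        # torch.Size([1, 36]): one score per character
```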

4. Generative AI (GenAI)

This is the technological breakthrough at the heart of this certification. Unlike previous stages used to classify or predict, GenAI uses models pre-trained on massive volumes of data to create new content.

Red thread: MagicFridge

This is the flagship feature of the new version: when a user asks for a dinner idea with random leftovers, the AI does not look up a response in a database. It invents, word by word, a unique and coherent recipe that existed nowhere before, thus moving from the status of analyst to that of creator.
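
A minimal sketch of this word-by-word generation using the Hugging Face transformers library, with the small open model gpt2 standing in for MagicFridge's production LLM (the prompt and model choice are illustrative):

```python
from transformers import pipeline, set_seed

set_seed(42)  # fix the random sampling so the run is reproducible
generator = pipeline("text-generation", model="gpt2")

prompt = "Dinner idea using leftover rice, half a zucchini and two eggs:"
result = generator(prompt, max_new_tokens=60, do_sample=True)
print(result[0]["generated_text"])  # a freshly generated continuation, not a lookup
```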


1.1.2 How LLMs work: tokenization, embedding and context

The engines of this textual generative AI are called LLMs (Large Language Models) and rely on a neural network architecture named the Transformer. To understand LLMs properly, we must grasp three fundamental mechanisms.

Tokenization

Unlike us, LLMs do not read whole words. They break text down into smaller units called tokens, which can be whole words, word fragments, or single characters.

Red thread: MagicFridge

When a MagicFridge user types the instruction "Cook a pie", the model actually receives a numerical sequence corresponding to fragments such as [Cook], [a], [pi], [e] (the exact split depends on the tokenizer).

Note for the tester: this nuance is crucial because AI input limits are counted in tokens, not in words.
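
A minimal sketch, assuming the open-source tiktoken library (MagicFridge's actual tokenizer may split text differently):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")    # one common tokenizer encoding

text = "Cook a pie"
tokens = enc.encode(text)                     # a short list of integer token IDs
print(tokens)
print([enc.decode([t]) for t in tokens])      # the text fragment behind each ID

# For the tester: input limits count tokens, not words.
print(f"{len(text.split())} words -> {len(tokens)} tokens")
```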

Embeddings

Embeddings transform tokens into lists of numbers (vectors) so the AI can capture their deeper meaning and relationships. In this mathematical space, words with similar meanings end up positioned close to each other.

Red thread: MagicFridge

If the user asks for a recipe with "Zucchini" but has run out, GUS knows mathematically that the word "Cucumber" is a very close neighbor (similar vector) while "Tire" is very far away. Thanks to embeddings, it can suggest a relevant replacement without ever having been taught an explicit list of synonyms.
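
A minimal sketch of this geometry with toy 4-dimensional vectors (real embeddings have hundreds of dimensions; all numbers here are invented purely to illustrate the idea):

```python
import numpy as np

def cosine_similarity(a, b):
    """Similarity of direction between two vectors: close to 1.0 = close in meaning."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

zucchini = np.array([0.8, 0.6, 0.1, 0.0])
cucumber = np.array([0.7, 0.7, 0.2, 0.0])
tire     = np.array([0.0, 0.1, 0.9, 0.8])

print(cosine_similarity(zucchini, cucumber))  # high: very close neighbors
print(cosine_similarity(zucchini, tire))      # low: very far away
```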

The Context Window

It represents the model's "working memory," i.e., the maximum amount of information it can process simultaneously.

Analogy: The Order Rail

Imagine this context window as the order rail in a kitchen, where the chef clips the ticket instructions. This rail has a fixed length: it can only hold a limited number of tickets.

If the conversation drags on and the rail fills up, the chef must unclip and discard the oldest ticket to make room for each new one.

Consequence: The AI continues cooking, but it might suggest a dangerous recipe simply because the crucial information "I am allergic to peanuts", written on the very first ticket now thrown away, has fallen out of its active memory.
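
The order rail can be sketched in a few lines with a fixed-size deque; the ticket texts and the rail size are illustrative:

```python
from collections import deque

rail = deque(maxlen=4)  # the rail holds at most 4 tickets

conversation = [
    "I am allergic to peanuts",   # the crucial first ticket
    "I have rice and zucchini",
    "Make it vegetarian",
    "Under 30 minutes please",
    "Actually, add a dessert",    # this pushes the allergy off the rail
]
for ticket in conversation:
    rail.append(ticket)           # when full, the oldest ticket is silently dropped

print(list(rail))
print("Allergy still in memory?", any("peanuts" in t for t in rail))  # False
```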


1.1.3 Model categories

Not all models behave the same way. The syllabus distinguishes three categories based on their specialization, summarized below and in the prompt sketch that follows.

  • Base model: trained for raw next-word prediction, with no further alignment. Powerful but unpredictable. MagicFridge example: if the user writes "My list", it might simply complete it with "of groceries:" (plain completion).
  • Instruction-tuned model: fine-tuned to understand and execute orders; ideal for interactions. MagicFridge example: given the query "Shopping list", it understands the intent and replies with a relevant list of ingredients.
  • Reasoning model: uses "Chain of Thought" to break complex problems down. MagicFridge example: for a wedding menu with a tight budget and multiple restrictions, it first works through costs and constraints before generating the menu.
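
To make the distinction concrete, here is an illustrative sketch of how the same user need might be phrased for each category (the prompts are invented; "Think step by step" is a common cue for eliciting Chain of Thought):

```python
# Illustrative prompts only; no API call is made here.
prompts = {
    "base": "My list",  # a base model just continues the text: "of groceries: ..."
    "instruction-tuned": "Write me a shopping list for a vegetarian curry for 4 people.",
    "reasoning": (
        "Plan a wedding menu for 80 guests with a budget of 1,500 euros, "
        "including 3 vegan and 2 gluten-free guests. "
        "Think step by step: first list the constraints and costs, then propose the menu."
    ),
}
for category, prompt in prompts.items():
    print(f"[{category}] {prompt}")
```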

1.1.4 Multimodality and vision-language

One of the major developments in AI is the multimodal LLM. AI is no longer limited to text; it can process images, audio, and video as well.

Red thread: MagicFridge

This technology allows the user to take a simple photo of the inside of their open fridge. The vision-language model visually identifies yogurt pots and vegetables, associates these objects with their textual concepts, and generates a recipe consistent with what it "saw".

Testing challenge: we must verify not only the quality of the text but also the accuracy of the visual interpretation.
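
A minimal sketch with an open vision-language model (BLIP) from the transformers library standing in for MagicFridge's production model; fridge_photo.jpg is a placeholder path:

```python
from transformers import pipeline

# Image-to-text pipeline: a vision-language model that captions pictures.
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

result = captioner("fridge_photo.jpg")  # a photo of the open fridge
print(result[0]["generated_text"])      # a caption naming what the model "saw"
```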


Syllabus point (summary K1/K2)

To conclude this section, here are the key concepts to remember for the exam:

  • Generative AI (GenAI): branch of AI creating new content (text, image, code) via pre-trained models.
  • LLM (Large Language Model): model based on the Transformer architecture, trained on vast textual data.
  • Tokenization: process of breaking down data (text) into elementary units (tokens).
  • Context window: limit of the model's short-term memory during an interaction.
  • Embeddings: numerical representations (vectors) of tokens enabling the AI to understand their semantic and contextual relationships.



Is this course useful?

This content is 100% free. If this explanation on MagicFridge helped you:

Buy Me a Coffee at ko-fi.com
I truly appreciate your generosity! 🤗