Mistral AI Releases Forge

(mistral.ai)

165 points | by pember 5 hours ago

12 comments

  • gpubridge 1 minute ago
    Interesting to see Mistral positioning Forge as the "all-in-one" platform. The trend is clear: every model provider is building their own platform layer.

    The question for developers is whether they want lock-in to a single provider's platform or a middleware approach that abstracts across providers. If Mistral changes pricing or deprecates a model, you're migrating your entire workflow — not just swapping an API call.

    The most resilient architecture: use provider-specific platforms for experimentation, but run production through an abstraction layer that lets you swap providers without code changes.
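
    To make the abstraction-layer idea concrete, here is a minimal sketch. The provider classes and the `complete` method are hypothetical stand-ins, not any vendor's real SDK; the point is only that production code depends on an interface, so swapping vendors is a config change rather than a migration.

```python
from typing import Protocol


class ChatProvider(Protocol):
    """Interface production code depends on; adapters are swappable."""
    def complete(self, prompt: str) -> str: ...


class MistralProvider:
    # Hypothetical adapter: in practice this would wrap the vendor's API client.
    def complete(self, prompt: str) -> str:
        return f"[mistral] {prompt}"


class OtherProvider:
    # Stand-in for any second vendor behind the same interface.
    def complete(self, prompt: str) -> str:
        return f"[other] {prompt}"


def run_workflow(provider: ChatProvider, prompt: str) -> str:
    # Callers never import a vendor SDK directly, so deprecation or
    # repricing by one provider doesn't force a rewrite of the workflow.
    return provider.complete(prompt)


print(run_workflow(MistralProvider(), "hello"))  # [mistral] hello
print(run_workflow(OtherProvider(), "hello"))    # [other] hello
```

    Real adapters would also have to normalize differences in prompting, tool-calling, and error handling, which is where the middleware approach earns (or loses) its keep.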

  • mark_l_watson 3 hours ago
    I am rooting for Mistral with their different approach: not competing on the largest and most advanced models, but instead doing custom engineering for customers and generally serving the needs of EU customers.
    • jerrygoyal 26 minutes ago
      their ocr model is goated
    • w4yai 1 hour ago
      Go Mistral !
  • roxolotl 3 hours ago
    Mistral has been releasing some cool stuff. Definitely behind on frontier models, but they are working a different angle. Was just talking at work about how hard model training is for a small company, so we'd probably never do it. But with tools like this, and the new unsloth release, training feels more in reach.
  • dmix 1 hour ago
    This is definitely the smart path for making $$ in AI. I noticed MongoDB is also going into this market with https://www.voyageai.com/ targeting business RAG applications and offering consulting for company-specific models.
  • csunoser 3 hours ago
    Huh. I initially thought this was just another fine-tuning endpoint. But apparently they are partnering with customers on the pretraining side as well. And RL too? Jeez, RL envs are really hard to get right. Best wishes, I guess.
  • ryeguy_24 1 hour ago
    How many proprietary use cases truly need pre-training or even fine-tuning, as opposed to a RAG approach? And at what point does it make sense to pre-train/fine-tune? Curious.
    • baby 1 hour ago
      RAG is dead
      • charcircuit 1 hour ago
        Using tools and skills to retrieve data or files is anything but dead.
      • loeg 1 hour ago
        Is it??
      • bigyabai 1 hour ago
        In what, X's hype circles? Embeddings are used in production constantly.
      • CharlesW 1 hour ago
        And yet your blog says you think NFTs are alive. Curious.

        But seriously, RAG/retrieval is thriving. It'll be part of the mix alongside long context, reranking, and tool-based context assembly for the foreseeable future.
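
        To illustrate the retrieval step that all of this builds on, here is a toy sketch. It uses bag-of-words counts in place of a real embedding model (an assumption purely for self-containment): score documents by cosine similarity to the query, then hand the top hit to the model as context.

```python
import math
from collections import Counter


def vectorize(text: str) -> Counter:
    """Toy stand-in for an embedding model: bag-of-words term counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


docs = [
    "mistral forge supports custom pretraining",
    "nfts are digital collectibles on a blockchain",
]
query = vectorize("what does forge do for pretraining")

# Rank documents by similarity; the best match becomes model context.
best = max(docs, key=lambda d: cosine(query, vectorize(d)))
print(best)  # the Forge/pretraining doc ranks first
```

        Production systems replace the bag-of-words step with learned embeddings and usually add a reranking pass over the top-k hits, but the retrieve-then-assemble shape is the same.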

        • strongly-typed 57 minutes ago
          Wait, what do NFTs have to do with RAG?
          • panarky 48 minutes ago
            I, for one, find NFT-shilling to be a strong signal that I should downgrade my trust in everything else a person says.
          • LoganDark 50 minutes ago
            Nothing, I think they're just pointing out a seeming lack of awareness of what really is or isn't dead.
  • rorylawless 1 hour ago
    The fine-tuning endpoint is deprecated according to the API docs. Is this the replacement?

    https://docs.mistral.ai/api/endpoint/deprecated/fine-tuning

    • aavci 38 minutes ago
      Interesting to see. I thought they were promoting fine-tuning.
  • andai 1 hour ago
    They mention pretraining too, which surprises me. I thought that was prohibitively expensive?

    It's feasible for small models, but I thought small models were not reliable for factual information?

    • simsla 11 minutes ago
      Typical stages of training for these models are:

      Foundational:

      - Pretraining
      - Mid/post-training (SFT)
      - RLHF or alignment post-training (RL)

      And sometimes...

      - Some more customer-specific fine-tuning.

      Note that any supervised fine-tuning after the pretraining stage is just swapping the dataset and maybe tweaking some of the optimiser settings. Presumably they're talking about that kind of pre-RL fine-tuning, not about redoing the pretraining stage entirely.
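
      The "same loop, new dataset" point can be shown with a deliberately tiny toy (plain gradient descent on a 1-D linear model, nothing like a real LLM stack): pretraining and fine-tuning call the identical training function, differing only in the data and learning rate.

```python
def train(w: float, data: list, lr: float = 0.1, steps: int = 200) -> float:
    """Fit y = w*x by gradient descent on mean squared error."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


pretrain_data = [(1.0, 2.0), (2.0, 4.0)]   # "general" data: y = 2x
finetune_data = [(1.0, 3.0), (2.0, 6.0)]   # "customer" data: y = 3x

w = train(0.0, pretrain_data)              # pretraining from scratch
w = train(w, finetune_data, lr=0.05)       # fine-tuning: same loop, new data
print(round(w, 2))  # ≈ 3.0 — the fine-tuned weight tracks the new dataset
```

      Real SFT additionally changes things like the loss masking and data mixture, but structurally it reuses the pretraining machinery, which is why offering it alongside pretraining is a smaller step than it sounds.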

  • aavci 35 minutes ago
    How does this compare to fine tuning?
  • bsjshshsb 1 hour ago
    Is training or FT > context? Anyone have experience?

    Is it possible to retrain daily or hourly as info changes?
