Discover your next Milestone.

Choose from industry-vetted challenges. Build locally, push to GitHub, and earn cryptographic proof of your engineering skills.

Startup Idea Validator

fullstack | Advanced | 365d access
199 onwards

Describe a startup idea. The app runs a full AI pipeline — market sizing, live competitor scan with web search tools, SWOT analysis, and a landing page draft. All streamed.

  • Design and implement a multi-stage LLM pipeline with distinct agent responsibilities
  • Use LangChain agents with Tavily web search to do live competitor research
  • Stream multiple independent pipeline stages to the frontend simultaneously over SSE
  • Build a pipeline progress tracker that visualizes each stage's completion status
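As a rough illustration of the SSE streaming mentioned above, one way to carry several pipeline stages over a single stream is to tag each message with an `event:` field so the frontend can route chunks per stage. This is only a sketch: the `format_sse` helper and the stage names are hypothetical and not taken from the challenge spec.

```python
import json


def format_sse(stage: str, payload: dict) -> str:
    """Serialize one Server-Sent Events message.

    The `event:` field carries the (hypothetical) stage name; the blank
    line at the end terminates the message, as the SSE format requires.
    """
    data = json.dumps(payload)  # single-line JSON, safe for one `data:` field
    return f"event: {stage}\ndata: {data}\n\n"


# Interleaved chunks from two stages sharing one stream.
chunks = [
    ("market_sizing", {"text": "TAM estimate...", "done": False}),
    ("swot", {"text": "Strength: ...", "done": False}),
    ("market_sizing", {"text": "", "done": True}),
]
stream = "".join(format_sse(stage, payload) for stage, payload in chunks)
print(stream)
```

On the client, a listener per event name could then update the matching stage in the progress tracker when a payload arrives with `done` set to true.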

AI Interview Prep Platform

fullstack | Advanced | 365d access
199 onwards

End-to-end interview coach: upload a job description (JD), generate role-specific questions, conduct a mock voice interview, and get scored feedback with improvement suggestions.

  • Design a complex multi-service architecture before writing any code
  • Generate role-specific interview questions from a job description using an LLM
  • Transcribe voice answers using Whisper and deliver questions via TTS
  • Evaluate free-text interview answers across multiple quality dimensions using LangGraph

Transformer Encoder from Scratch

data science & ml | Advanced | 365d access
199 onwards

Implement a full Transformer encoder — multi-head attention, positional encoding, layer norm — in PyTorch from scratch. Train on a classification task. No HuggingFace shortcuts.

  • Implement scaled dot-product attention and multi-head attention from scratch in PyTorch
  • Build sinusoidal positional encoding and understand why position matters in Transformers
  • Assemble a complete Transformer encoder block with residual connections and layer norm
  • Train an encoder classifier end-to-end on a real text classification dataset
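To give a feel for the sinusoidal positional encoding named above, here is a minimal plain-Python sketch (the challenge itself targets PyTorch; this version uses only the standard library so the arithmetic is easy to follow). It assumes the standard formulation, with sine on even dimensions and cosine on odd dimensions; the function name is illustrative.

```python
import math


def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> list[list[float]]:
    """Return a seq_len x d_model table: sin on even dims, cos on odd dims."""
    table = []
    for pos in range(seq_len):
        row = []
        for dim in range(d_model):
            # Each (sin, cos) pair at dims (2i, 2i+1) shares the
            # frequency 1 / 10000^(2i / d_model).
            pair_index = dim // 2
            angle = pos / (10000 ** (2 * pair_index / d_model))
            row.append(math.sin(angle) if dim % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


pe = sinusoidal_positional_encoding(seq_len=4, d_model=8)
print(pe[0])  # position 0: every sin term is 0.0 and every cos term is 1.0
```

Because self-attention is order-agnostic on its own, a table like this is typically added to the token embeddings so that each position gets a distinct signature.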

Fine-tuning Data Preparation Pipeline

data science & ml | Advanced | 365d access
199 onwards

Scrape content, clean it, auto-generate instruction-response pairs using an LLM, score quality with an evaluator model, and output a production-ready JSONL dataset.

  • Build an async web scraping pipeline using httpx and BeautifulSoup
  • Clean, deduplicate, and validate raw text content at scale
  • Auto-generate instruction-response training pairs using an LLM
  • Score dataset quality using an LLM judge and apply rule-based filters
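The cleaning and filtering steps above could look roughly like the following sketch, which removes exact duplicates after light normalization, applies one rule-based filter, and writes JSONL. The field names (`instruction`, `response`) and the `min_chars` threshold are assumptions for illustration; the LLM-judge scoring step is not shown.

```python
import hashlib
import json
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants hash identically."""
    return re.sub(r"\s+", " ", text).strip().lower()


def dedupe_and_filter(pairs: list[dict], min_chars: int = 20) -> list[dict]:
    """Drop too-short responses and exact duplicates (after normalization)."""
    seen = set()
    kept = []
    for pair in pairs:
        if len(pair["response"].strip()) < min_chars:
            continue  # rule-based filter: response too short to be useful
        key = hashlib.sha256(
            normalize(pair["instruction"] + "\n" + pair["response"]).encode("utf-8")
        ).hexdigest()
        if key in seen:
            continue  # same content already kept
        seen.add(key)
        kept.append(pair)
    return kept


def to_jsonl(pairs: list[dict]) -> str:
    """One JSON object per line."""
    return "\n".join(json.dumps(p, ensure_ascii=False) for p in pairs)


raw = [
    {"instruction": "Explain BM25.", "response": "BM25 is a ranking function based on term frequency."},
    {"instruction": "explain  BM25.", "response": "BM25 is a ranking function based on term frequency. "},
    {"instruction": "Define RAG.", "response": "Too short."},
]
clean = dedupe_and_filter(raw)
print(to_jsonl(clean))
```

Hash-based deduplication only catches exact matches after normalization; near-duplicate detection would need a different technique.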

Production RAG with Hybrid Search

genai | Advanced | 365d access
199 onwards

Combine dense vector search with BM25 keyword search. Add query rewriting, Hypothetical Document Embeddings (HyDE), streaming responses, and a full eval suite.

  • Combine dense vector search and BM25 keyword search using Reciprocal Rank Fusion
  • Implement query rewriting to improve retrieval quality for ambiguous queries
  • Apply HyDE (Hypothetical Document Embeddings) to boost semantic search precision
  • Build a streaming FastAPI + Streamlit layer on top of a production RAG system
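Reciprocal Rank Fusion, mentioned above, merges ranked lists by summing 1 / (k + rank) for each document across the lists. The sketch below assumes the commonly used constant k = 60 and made-up document IDs; it is an illustration, not the challenge's reference solution.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists: each doc scores sum(1 / (k + rank)) over the lists."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense_hits = ["doc_a", "doc_b", "doc_c"]  # hypothetical vector-search ranking
bm25_hits = ["doc_b", "doc_d", "doc_a"]   # hypothetical keyword-search ranking
fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Since only ranks are used, the dense and BM25 scores never need to be put on a common scale.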

Fine-tune a Small LLM on Custom Data

genai | Advanced | 365d access
199 onwards

Prepare an instruction-tuning dataset, fine-tune Phi-2 or Mistral 7B using LoRA/QLoRA on free Colab GPUs, and rigorously evaluate the fine-tuned model against the base model.

  • Curate and format a high-quality instruction-tuning dataset in JSONL format
  • Understand LoRA and QLoRA — how parameter-efficient fine-tuning works
  • Fine-tune a small open-source LLM (Phi-2 or Mistral 7B) on Google Colab for free
  • Evaluate a fine-tuned model rigorously against its base model using an LLM judge

LLM Observability & Tracing System

backend | Advanced | 365d access
199 onwards

Log every LLM call with latency, token usage, and model output. Build a query layer surfacing slow calls, expensive prompts, error rates, and cost trends over time.

  • Instrument every LLM call with latency, token usage, and cost tracking
  • Group related LLM calls into traces for end-to-end session visibility
  • Write analytical SQL queries for performance monitoring (slow calls, cost-by-model)
  • Build a live observability dashboard using Chart.js
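The analytical queries listed above (slow calls, cost-by-model) might be sketched as follows with an in-memory SQLite database. The `llm_calls` schema, the model names, and the numbers are invented for illustration; a real project would define its own tables and thresholds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE llm_calls (
        id INTEGER PRIMARY KEY,
        trace_id TEXT,          -- groups related calls into one trace
        model TEXT,
        latency_ms REAL,
        prompt_tokens INTEGER,
        completion_tokens INTEGER,
        cost_usd REAL
    )
    """
)
rows = [
    ("t1", "model-small", 420.0, 120, 80, 0.0004),
    ("t1", "model-large", 2300.0, 900, 400, 0.0210),
    ("t2", "model-large", 1800.0, 700, 350, 0.0170),
    ("t2", "model-small", 390.0, 100, 60, 0.0003),
]
conn.executemany(
    "INSERT INTO llm_calls (trace_id, model, latency_ms, prompt_tokens,"
    " completion_tokens, cost_usd) VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)

# Slow calls: anything above a (hypothetical) 1000 ms threshold.
slow_calls = conn.execute(
    "SELECT trace_id, model, latency_ms FROM llm_calls"
    " WHERE latency_ms > ? ORDER BY latency_ms DESC",
    (1000,),
).fetchall()

# Cost and average latency per model, most expensive first.
cost_by_model = conn.execute(
    "SELECT model, COUNT(*) AS calls, ROUND(SUM(cost_usd), 4) AS total_cost,"
    " ROUND(AVG(latency_ms), 1) AS avg_latency_ms"
    " FROM llm_calls GROUP BY model ORDER BY total_cost DESC"
).fetchall()

print(slow_calls)
print(cost_by_model)
```

Results shaped like `cost_by_model` are the kind of series a dashboard chart could plot directly.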

Multi-Agent Orchestration Backend

backend | Advanced | 365d access
199 onwards

Coordinate multiple specialized AI agents — planner, researcher, writer — passing context and managing state between them. Return a streamed unified result to the client.

  • Design a multi-agent architecture with clearly defined agent roles
  • Implement a stateful agent graph using LangGraph
  • Stream intermediate agent progress to clients using WebSockets
  • Implement per-agent timeouts and graceful fallback strategies
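Per-agent timeouts with a graceful fallback, as listed above, can be approximated with `asyncio.wait_for`. In this sketch the agents are stand-in coroutines that only sleep; the agent names, timeout values, and fallback strings are assumptions, and a LangGraph-based solution would structure this differently.

```python
import asyncio


async def run_with_timeout(agent, prompt: str, timeout_s: float, fallback: str) -> str:
    """Run one agent; return `fallback` if it exceeds its time budget."""
    try:
        return await asyncio.wait_for(agent(prompt), timeout=timeout_s)
    except asyncio.TimeoutError:
        return fallback


async def planner(prompt: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for a fast LLM call
    return f"plan for: {prompt}"


async def researcher(prompt: str) -> str:
    await asyncio.sleep(1.0)  # stand-in for a call that runs too long
    return f"research on: {prompt}"


async def main() -> list[str]:
    # Agents run concurrently; each has its own timeout and fallback.
    return await asyncio.gather(
        run_with_timeout(planner, "launch", 0.2, "[planner unavailable]"),
        run_with_timeout(researcher, "launch", 0.2, "[researcher timed out]"),
    )


results = asyncio.run(main())
print(results)  # ['plan for: launch', '[researcher timed out]']
```

Returning a fallback value instead of raising lets downstream agents continue with partial context rather than failing the whole request.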

12k+ Verified Developers
150+ Active Projects
450+ Companies Hiring
14 Days Avg. Completion

Got questions?

Every challenge includes detailed documentation, technical constraints, and automated evaluation scripts to ensure you have everything you need to succeed.