QuickStart Guides - Get up to Speed Ai/ML/LLM

Discussion in 'Artificial Intelligence' started by Sprout, Jan 29, 2025.

  1. Sprout

    Sprout

    Curious what folks have come across

    My favs:
    “What is ChatGPT Doing… and Why Does It Work?” by Stephen Wolfram is one of the most concise explanations I’ve come across.

    “Transformers in Excel” by Tom Yeh


    “3D Animated version”
    https://bbycroft.net/llm

    “The Prompt Report”
    https://arxiv.org/abs/2406.06608

    #PromptDesign
    #PromptEngineering
    #PromptTaxonomy
     
    Last edited: Jan 29, 2025
  2. Sprout

    Sprout

    Cut through AI/ML/LLM hype with Andrej Karpathy. A co-founder of OpenAI, he made the breakthrough for Tesla's FSD tech and is now focused on education (at which he is adeptly skilled). He breaks down how LLMs work, their inherent strengths and weaknesses, the areas that can be improved, and the roadmap of future developments. Excellent, highly recommend.

    https://en.wikipedia.org/wiki/Andrej_Karpathy

     
  3. Sprout

    Sprout

    Inspired, I noodled around for GPU vs CPU comparisons and came across this epic educational one. The production quality and content are amazingly top tier.

    Definitely levels above the popular MythBusters one.

    For those in a cave who haven't seen MythBusters:
     
  4. Sprout

    Sprout


    Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems
    https://arxiv.org/abs/2504.01990
     
  5. Kithara

    Kithara

    Paper-wise, all my AI bets this week are because of "Scaling Laws for Neural Language Models"
    https://arxiv.org/abs/2001.08361

    As a Sonnet fanboy who has listened to as much Dario as I could find on YouTube, I eventually had to get to that paper. Dario is so adamant on the scaling laws holding that he has to understand something about D, data, that the market doesn't understand. The scaling "law" is nothing really other than being able to hold compute, data, and model size in line.
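    The headline result boils down to one formula. Here is a minimal sketch of the combined loss law from the paper (going from my recollection of its Eq. 1.5; the constants are the paper's approximate fits, so treat the numbers as illustrative only):

    Code:
    # Sketch of the combined scaling law L(N, D) from "Scaling Laws for Neural
    # Language Models" (Kaplan et al. 2020). Constants are the paper's approximate
    # fits as I recall them; illustrative only.
    ALPHA_N = 0.076   # how fast loss falls with model size
    ALPHA_D = 0.095   # how fast loss falls with dataset size
    N_C = 8.8e13      # parameter-count scale constant
    D_C = 5.4e13      # token-count scale constant

    def loss(n_params: float, n_tokens: float) -> float:
        """Predicted cross-entropy loss for a model with n_params non-embedding
        parameters trained on n_tokens tokens."""
        return ((N_C / n_params) ** (ALPHA_N / ALPHA_D) + D_C / n_tokens) ** ALPHA_D

    # "Holding compute/data/model size in line": growing one factor without the
    # other quickly runs into the other term's floor.
    print(loss(1e9, 1e10))    # 1B params, 10B tokens
    print(loss(1e10, 1e10))   # 10x the params, same data: the gain is capped by the data term
    print(loss(1e10, 1e11))   # scale the data as well and the predicted loss keeps dropping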

    I handicap it 80/20: 80% that Dario understands something about the scaling laws and data that is not priced in, vs 20% that he is just talking his book.

    I have read great reviews of Co-Intelligence: Living and Working with AI by Ethan Mollick, but to me the book falls really flat. I would really love to understand how people are interacting with the models, because I have almost never gained anything from reading other people's prompts.

    Karpathy and GPT-2 I just don't care about. It is too Donald Knuth or assembly language for me.

    At a true foundation level, I think listening to Ian Goodfellow on Lex Fridman has value. Ian came up with the idea of a GAN after getting drunk. I don't think Ian wants any spotlight, though, or everyone would know he is the godfather of generative AI.


    With that said, for me Sonnet 3.7 and Perplexity deep research + DeepSeek are not even in the domain of AI. They are in the domain of cybernetics. These tools turn me into a genius cyborg on the level of Richard Feynman, but in terms of productivity, automation, and GDP?

    Most of the US population cannot leverage these tools.

    Westerners who worship the values of Confucius, though, will do quite well.
     
    Last edited: Apr 10, 2025
    Sprout likes this.
  6. Sprout

    Sprout

    Deep dive into DeepSeek
     
  7. Sprout

    Sprout

    It's important to consider the context of when Co-Intelligence was published.

    While you're not a fan, Karpathy's "How I use LLMs" has some gems, e.g. asking an LLM questions as you read a book to improve memory retention; same with digesting papers from unfamiliar disciplines.


    I'm in the same camp of tool use, Sonnet 3.7 & Perplexity are two that I interact with daily.

    My present focus is in Prompt Design: integrating Software Design Patterns with "A Pattern Language" by Christopher Alexander.
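    As a toy illustration of the direction (the pattern name and fields below are my own invention, not from either book): a prompt treated as a named, reusable pattern with explicit slots, loosely following Alexander's context/problem/solution format.

    Code:
    from dataclasses import dataclass
    from string import Template

    # Toy sketch of a prompt "design pattern": a named, reusable template with
    # explicit slots, loosely following Alexander's context/problem/solution format.
    # The pattern name and fields are invented for this example.
    @dataclass
    class PromptPattern:
        name: str
        context: str        # when this pattern applies
        problem: str        # the recurring problem it addresses
        template: Template  # the reusable prompt body, with $slots

        def render(self, **slots: str) -> str:
            return self.template.substitute(**slots)

    persona_critic = PromptPattern(
        name="Persona Critic",
        context="You have a draft and want structured, role-specific feedback.",
        problem="Unstructured 'review this' prompts produce vague, generic feedback.",
        template=Template(
            "You are a $persona. Review the text below.\n"
            "List exactly $n_issues concrete issues, each with a suggested fix.\n\n"
            "$text"
        ),
    )

    print(persona_critic.render(persona="senior technical editor",
                                n_issues="3",
                                text="LLMs always tell the truth."))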

    Prompt Engineering by Lee Boonstra from Google
    https://www.kaggle.com/whitepaper-prompt-engineering

    Anthropic's Prompt Engineering Interactive Course
    https://github.com/anthropics/prompt-eng-interactive-tutorial

    Also worthy of note are the works of the fairly unknown Carlos Perez. Unfortunately, the copy has avoidable mistakes, but his general thesis is interesting.
     
  8. Sprout

    Sprout

    Last edited: Apr 12, 2025
    themickey likes this.
  9. Sprout

    Sprout

    From Epoch AI:
    ---
    How quickly are AI supercomputers scaling, where are they, and who owns them?

    Our new dataset covers 500+ of the largest AI supercomputers (aka GPU clusters or AI data centers) over the last six years.

    Here is what we found:
    Performance has grown drastically – FLOP/s of leading AI supercomputers have doubled every 9 months. Driven by:
    - Deploying more chips (1.6x/year)
    - Higher performance per chip (1.6x/year)
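
    (Quick back-of-the-envelope check on how two 1.6x factors compound into a 9-month doubling; my arithmetic, not Epoch's methodology.)

    Code:
    import math

    # Two independent ~1.6x/year factors compound to ~2.56x/year, which works out
    # to a doubling time of roughly nine months.
    chips_growth = 1.6   # more chips deployed per year
    perf_growth = 1.6    # more FLOP/s per chip per year
    total_growth = chips_growth * perf_growth  # ~2.56x per year

    doubling_time_years = math.log(2) / math.log(total_growth)
    print(f"{total_growth:.2f}x/year -> doubling every {doubling_time_years * 12:.1f} months")
    # prints ~8.9 months, consistent with the 9-month doubling reported above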

    Systems with 10,000 AI chips were rare in 2019. Now, leading companies have clusters 10x that size.

    As they grew in performance, AI supercomputers got exponentially more expensive. The upfront hardware cost of leading AI supercomputers doubled roughly every year (1.9x/year). We estimate the hardware for xAI's Colossus cost about $7 billion.

    Power requirements are following a similar trajectory, doubling every year. Today's most powerful system requires 300 MW, equivalent to about 250,000 households.

    AI supercomputers are improving in energy efficiency, but not quickly enough to offset overall power growth. FLOP/s per watt has increased by 1.34x/year, almost entirely due to AI supercomputers using more energy-efficient chips.

    Ownership has rapidly shifted from the public to the private sector. In 2019, the two sectors had roughly equal compute shares. We estimate that companies now control over 80% of global AI computing capacity, while the share of governments and academia has fallen below 20%.

    Geographically, the US dominates with 75% of global AI supercomputer performance in our dataset. China is in second place with 15%, while Europe plays only a small role. (Our data only covers ~15% of global AI compute, so there’s some uncertainty.)

    Why did AI supercomputers grow so quickly?

    They transitioned from research tools to industrial machines delivering economic value. Bigger AI Supercomputers -> more capable models -> more investment -> bigger AI supercomputers

    If current trends continue, the leading AI supercomputer in 2030 will require 2 million AI chips, cost $200 billion, and need 9 GW of power, equivalent to 9 nuclear reactors.
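
    (Rough sanity check, compounding today's quoted baselines at the growth rates above; my own extrapolation, not Epoch's model.)

    Code:
    # Extrapolate today's leading system to 2030 using the growth rates quoted above.
    # Baselines (300 MW power, ~$7B hardware cost) come from this post; the
    # compounding is a back-of-the-envelope illustration, not Epoch's actual model.
    years = 5  # roughly 2025 -> 2030

    power_gw = 300 * 2 ** years / 1000   # power doubling every year from 300 MW
    cost_billion = 7 * 1.9 ** years      # hardware cost growing 1.9x/year from ~$7B

    print(f"~{power_gw:.1f} GW, ~${cost_billion:.0f}B")
    # ~9.6 GW and ~$173B -- the same ballpark as the 9 GW / $200 billion projection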

    Power constraints may ultimately force companies to adopt decentralized training approaches, e.g., training across nine 1GW data centers rather than one 9GW data center.
    This project was led by @KonstantinPilz together with @james_s48, @robi_rahman, and @ohlennart. Find a summary with interactive figures here
    Trends in AI Supercomputers: AI supercomputers double in performance every 9 months, cost billions of dollars, and require as much power as mid-sized cities. Companies now own 80% of all AI supercomputers, while governments’ shar…
    https://epoch.ai/blog/trends-in-ai-supercomputers
    @KonstantinPilz @james_s48 @robi_rahman @ohlennart Find our full paper below. We’ll publish our full dataset & documentation in a few weeks, and we’ll continue tracking AI supercomputers as part of our Data on AI hub. Stay tuned!
    Trends in AI Supercomputers
     
    themickey likes this.
  10. Sprout

    Sprout

    Tbh, one of the best podcasts I’ve heard in a while