Battle of the AIs

Discussion in 'Artificial Intelligence' started by MarkBrown, Aug 18, 2025 at 4:59 PM.

  1. MarkBrown

    MarkBrown

    I have been using all the mainstream AIs, and it's been a fight between Grok, GPT and Claude for me.

    GPT 5 is currently 1st, Claude 2nd and Grok last...
    This changes every other week; it's crazy how much they can fluctuate between good and bad.
     
  2. Businessman

    Businessman

  3. MarkBrown

    MarkBrown

    Yes, just brilliant, @Businessman. Your posts are continuing to descend into crap.
     
  4. RedDuke

    RedDuke

    Claude is number one for coding. I use Windsurf as my IDE.
     
    MarkBrown likes this.
  5. MacR

    MacR

    Mark, in the past I have used "Perplexity" for a few different things. I feel they use her as an AI search engine. I'm sure many have heard of her. Maybe the reason my wife perplexes me at times is the reason I have felt it is a her.
    I have only used GPT5 and "Perplexity", each for different reasons.
     
    zdreg and MarkBrown like this.
  6. Sekiyo

    Sekiyo

    Did you give DeepSeek a try?
    Claude is more like GPT, and it doesn't beat DeepSeek.

    I'll check out Windsurf. Still using VS Code.

    PS: I am neither Chinese nor Communist xD
    Just relying on my own experience around them.

    Feels like Claude & GPT don't read the docs.
    Sometimes they'll write pseudocode that won't run.
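
    One cheap guard against generated code that won't run is to check, before calling, whether the function really accepts the keyword it was given. A minimal Python sketch under assumptions: `accepts_keyword` is a hypothetical helper name (not from any library), and `statistics.quantiles` is only a stand-in target. It returns None when it can't tell, for example when the callable takes `**kwargs` or exposes no signature.

```python
import inspect
import statistics


def accepts_keyword(func, name):
    """Return True/False if `func` clearly does/doesn't take keyword `name`, else None."""
    try:
        params = inspect.signature(func).parameters
    except (TypeError, ValueError):
        return None  # no introspectable signature (some C-implemented callables)
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return None  # **kwargs accepts any name, so the check is inconclusive
    param = params.get(name)
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    return param is not None and param.kind in keyword_kinds


# "n" is a real keyword-only parameter of statistics.quantiles;
# "bins" is a made-up name of the kind an LLM might invent.
print(accepts_keyword(statistics.quantiles, "n"))
print(accepts_keyword(statistics.quantiles, "bins"))
```

    This only catches wrong parameter names, not wrong behaviour, so it doesn't replace actually running the code.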

    "Oh yeah, those params don't even exist for that built in function !"
     
    Last edited: Aug 18, 2025 at 8:21 PM
    zdreg and MarkBrown like this.
  7. Sekiyo

    Sekiyo

    StackOverflow Survey on LLMs.
     
    Last edited: Aug 18, 2025 at 8:35 PM
  8. MarkBrown

    MarkBrown


    Perplexity did a good job.

    To determine which of the two trading models performs best, let’s compare their overall performance using the following key metrics from the data provided in 0001.csv and 0002.csv:

    • TotalProfit: Accumulated profit over all trades

    • ProfitFactor: Ratio of gross profit to gross loss; a value >1 is considered profitable

    • WinningTrades and LosingTrades: Count of profitable and losing trades

    • AvgProfitPerContract: Average profit made per individual contract

    Performance Comparison Table

    Metric                  Model 0001.csv   Model 0002.csv
    TotalProfit (final)     $14,212.50       $14,212.50
    ProfitFactor (final)    1.87             1.86
    WinningTrades (final)   27               25
    LosingTrades (final)    44               45
    AvgProfit/Contract      $200.18          $203.04

    Key Insights
    • Total Profit: Both models reach the same final TotalProfit, $14,212.50.

    • Profit Factor: Model 0001.csv has a marginally higher ProfitFactor (1.87 vs. 1.86), suggesting slightly better risk-adjusted returns.

    • Winning Trades: Model 0001.csv achieved 2 more winning trades than Model 0002.csv.

    • Avg Profit per Contract: Model 0002.csv very slightly outperformed (by $2.86 per contract).

    • Consistency: Model 0001.csv has a lower number of losing trades and a marginally better win/loss ratio.

    Conclusion
    Model 0001.csv performs the best overall due to:

    • Slightly higher profit factor (better reward-to-risk)

    • More winning trades and fewer losses

    • Identical total profit as Model 0002.csv with better consistency

    While the average profit per contract is slightly higher in Model 0002.csv, the difference is minimal compared to the consistency and win ratio advantage of Model 0001.csv.
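
    For anyone who wants to sanity-check numbers like these rather than take the AI's word for it, here is a rough Python sketch of how the listed metrics are usually computed from a trade list. The `Trade` fields and the sample trades are made up for illustration (the real layout of 0001.csv and 0002.csv isn't shown in this thread), and AvgProfitPerContract is assumed to mean total profit divided by total contracts.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    profit: float   # net P&L in dollars, negative for a loss (hypothetical field)
    contracts: int  # contracts traded (hypothetical field)


def summarize(trades):
    """Compute the summary metrics named in the comparison above."""
    total_profit = sum(t.profit for t in trades)
    gross_profit = sum(t.profit for t in trades if t.profit > 0)
    gross_loss = -sum(t.profit for t in trades if t.profit < 0)
    total_contracts = sum(t.contracts for t in trades)
    return {
        "TotalProfit": total_profit,
        # Ratio of gross profit to gross loss; infinite if there are no losses.
        "ProfitFactor": gross_profit / gross_loss if gross_loss else float("inf"),
        "WinningTrades": sum(1 for t in trades if t.profit > 0),
        "LosingTrades": sum(1 for t in trades if t.profit < 0),
        # Assumption: total profit divided by total contracts traded.
        "AvgProfitPerContract": total_profit / total_contracts if total_contracts else 0.0,
    }


def win_rate(wins, losses):
    """Share of trades that were winners."""
    return wins / (wins + losses)


# Made-up sample trades, only to show the function running.
sample = [Trade(500.0, 2), Trade(-200.0, 1), Trade(300.0, 1), Trade(-100.0, 1)]
stats = summarize(sample)
print(stats)

# Win rates from the win/loss counts in the table above.
print(f"0001.csv: {win_rate(27, 44):.1%}   0002.csv: {win_rate(25, 45):.1%}")
```

    With the counts from the table, that works out to a win rate of about 38.0% for 0001.csv versus 35.7% for 0002.csv, which matches the "marginally better" wording.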
     
    Sekiyo likes this.
  9. nz_melon

    nz_melon

    For coding, my personal preferred tools and LLMs are as follows:

    1) Cursor CLI (Sonnet 4)
    2) VS Code / Augment AI with Sonnet 4 and GPT 5
    3) VS Code / GitHub Copilot with Sonnet 4 and GPT 5

    So, in my estimate, Anthropic currently offers the best coding assistant for both backend and frontend development.
     
  10. nz_melon

    nz_melon

    Couldn't disagree more. I have tried them all, and I program every day for about 7-8 hours. For coding, DeepSeek beats neither GPT nor Claude, not even Sonnet 4. I use MCP servers for documentation, and both GPT and Sonnet 4 consult the Context7 documentation frequently and follow best practices to the letter when asked. I love the competition, though.

     
    EllisWyatt and Sekiyo like this.