Battle of the AIs

MarkBrown · Aug 18, 2025 at 4:59 PM

I have been using all the mainstream AIs, and it's been a fight between Grok, GPT and Claude for me.

GPT 5 is currently 1st, Claude 2nd and Grok last...
Every other week this changes; it's crazy how much they can fluctuate between good and bad.

Businessman · Aug 18, 2025 at 6:02 PM

MarkBrown · Aug 18, 2025 at 6:25 PM

Yes, just brilliant @Businessman, your posts are continuing to descend into crap.

RedDuke · Aug 18, 2025 at 6:47 PM

Claude is number one for coding. I use Windsurf as my IDE.

MacR · Aug 18, 2025 at 6:59 PM

Mark, in the past I have used "Perplexity" for a few different things. I feel they use her as an AI search engine. I'm sure many have heard of her. Maybe the fact that my wife perplexes me at times is the reason I have felt it is a her.
I have only used GPT5 and "Perplexity", for different reasons.

Sekiyo · Aug 18, 2025 at 8:21 PM

RedDuke said:
Claude is number one for coding. I use Windsurf as my IDE.

Did you give DeepSeek a try?
Claude is more like GPT and it doesn't beat DeepSeek.

I'll check Windsurf. Still using VS Code.

PS: I am not Chinese nor Communist xD
Just relying on my own experience with them.

Feels like Claude & GPT don't read the docs.
Sometimes they'll write pseudocode that won't run.

"Oh yeah, those params don't even exist for that built-in function!"

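One cheap guard against the "params don't even exist" problem is to check suggested keyword arguments against the real function signature before running anything. The thread doesn't say which language was involved, so the sketch below assumes Python; `unknown_kwargs` is a hypothetical helper name, and `overwrite` stands in for an invented parameter.

```python
import inspect
import shutil


def unknown_kwargs(func, kwargs):
    """Return the keyword names in `kwargs` that `func` does not accept."""
    params = inspect.signature(func).parameters
    # A function that declares **kwargs accepts any keyword at call time,
    # so the signature alone cannot rule anything out.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    accepted = {
        name
        for name, p in params.items()
        if p.kind in (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                      inspect.Parameter.KEYWORD_ONLY)
    }
    return sorted(name for name in kwargs if name not in accepted)


# `follow_symlinks` is a real keyword of shutil.copyfile; `overwrite` is the
# made-up one, used here only as an illustration.
suggested = {"follow_symlinks": False, "overwrite": True}
print(unknown_kwargs(shutil.copyfile, suggested))
```

Note that `inspect.signature` can raise `ValueError` for some functions implemented in C, so this check works best on pure-Python library code.
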
Sekiyo · Aug 18, 2025 at 8:35 PM

Sekiyo said:
Did you give DeepSeek a try?
Claude is more like GPT and it doesn't beat DeepSeek.

I'll check Windsurf. Still using VS Code.

PS: I am not Chinese nor Communist xD
Just relying on my own experience with them.

Feels like Claude & GPT don't read the docs.
Sometimes they'll write pseudocode that won't run.

"Oh yeah, those params don't even exist for that built-in function!"

StackOverflow Survey on LLMs.

MarkBrown · Aug 18, 2025 at 8:38 PM

MacR said:
Mark, in the past I have used "Perplexity" for a few different things. I feel they use her as an AI search engine. I'm sure many have heard of her. Maybe the fact that my wife perplexes me at times is the reason I have felt it is a her.
I have only used GPT5 and "Perplexity", for different reasons.

Perplexity did a good job..

To determine which of the two trading models performs best, let’s compare their overall performance using the following key metrics from the data provided in 0001.csv and 0002.csv:

TotalProfit: Accumulated profit over all trades

ProfitFactor: Ratio of gross profit to gross loss; a value >1 is considered profitable

WinningTrades and LosingTrades: Count of profitable and losing trades

AvgProfitPerContract: Average profit made per individual contract

Performance Comparison Table

Metric                  Model 0001.csv   Model 0002.csv
TotalProfit (final)     $14,212.50       $14,212.50
ProfitFactor (final)    1.87             1.86
WinningTrades (final)   27               25
LosingTrades (final)    44               45
AvgProfit/Contract      $200.18          $203.04

Key Insights

Total Profit: Both models reach the same final TotalProfit, $14,212.50.

Profit Factor: Model 0001.csv has a marginally higher ProfitFactor (1.87 vs. 1.86), suggesting slightly better risk-adjusted returns.

Winning Trades: Model 0001.csv achieved 2 more winning trades than Model 0002.csv.

Avg Profit per Contract: Model 0002.csv very slightly outperformed (by $2.86 per contract).

Consistency: Model 0001.csv has a lower number of losing trades and a marginally better win/loss ratio.

Conclusion
Model 0001.csv performs the best overall due to:

Slightly higher profit factor (better reward-to-risk)

More winning trades and fewer losses

Identical total profit as Model 0002.csv with better consistency

While the average profit per contract is slightly higher in Model 0002.csv, the difference is minimal compared to the consistency and win ratio advantage of Model 0001.csv.
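
For anyone who wants to sanity-check a comparison like this by hand, here is a minimal Python sketch of the metrics as defined above. The per-trade list `example_pnls` is made up for illustration (the contents of 0001.csv and 0002.csv are not shown in this thread), and `summarize` and `win_rate` are hypothetical helper names. The win rate ignores breakeven trades, which is an assumption.

```python
def summarize(trade_pnls):
    """Compute headline metrics from a list of per-trade profits and losses."""
    wins = [p for p in trade_pnls if p > 0]
    losses = [p for p in trade_pnls if p < 0]
    gross_profit = sum(wins)
    gross_loss = -sum(losses)  # stored as a positive number
    return {
        "TotalProfit": gross_profit - gross_loss,
        # Ratio of gross profit to gross loss; > 1 means profitable.
        "ProfitFactor": gross_profit / gross_loss if gross_loss else float("inf"),
        "WinningTrades": len(wins),
        "LosingTrades": len(losses),
    }


def win_rate(winning_trades, losing_trades):
    """Fraction of trades that were winners (breakeven trades ignored)."""
    return winning_trades / (winning_trades + losing_trades)


# Made-up trades, only to show the calculation.
example_pnls = [300.0, -100.0, 250.0, -150.0, -50.0]
print(summarize(example_pnls))

# Win rates from the trade counts in the table above.
print(round(win_rate(27, 44), 3))  # Model 0001.csv
print(round(win_rate(25, 45), 3))  # Model 0002.csv
```

With the counts from the table, both models win well under half of their trades, so the profit factor above 1 has to come from winners that are larger on average than losers.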

nz_melon · Aug 18, 2025 at 9:02 PM

For coding, my personally preferred tools and LLMs are as follows:

1) Cursor CLI (Sonnet 4)
2) VS Code / Augment AI with Sonnet 4 and GPT 5
3) VS Code / GitHub Copilot with Sonnet 4 and GPT 5

So, in my estimate, Anthropic currently offers the best coding assistant for both backend and frontend development.

nz_melon · Aug 18, 2025 at 9:06 PM

Couldn't disagree more. I have tried them all and I program for about 7-8 hours every day. For coding, DeepSeek beats neither GPT nor Claude, not even Sonnet 4. I use MCP servers for documentation, and both GPT and Sonnet 4 consult the Context7 documentation frequently and follow best practices to the letter when asked. I love the competition, though.

Sekiyo said:
Did you give DeepSeek a try?
Claude is more like GPT and it doesn't beat DeepSeek.

I'll check Windsurf. Still using VS Code.

PS: I am not Chinese nor Communist xD
Just relying on my own experience with them.

Feels like Claude & GPT don't read the docs.
Sometimes they'll write pseudocode that won't run.

"Oh yeah, those params don't even exist for that built-in function!"