I recently settled on Grok. It costs more than ChatGPT, but its ability and accuracy with coding are the best I have found so far. It has saved me a lot of time by not having to get things corrected and/or understood.
The general consensus is that Anthropic’s Sonnet 3.5 holds the top spot for coding. Early reports about 3.7 are that 3.5 is better. While I’m sure there are cases like that, due to the nature of scaling, some of the degradation you’ve witnessed has more to do with older hardware being used in rapid deployment during high-demand times. There are also system prompts that constrain the model by limiting compute time. This was discussed on a recent podcast, though I forget which one. More often than not, iteratively refining one’s prompts yields better Assistant responses.