2025: The Final Year of Solitude – Embrace the Moment Before AGI Redefines Humanity in 2026

Tuxan · Dec 12, 2024

I can’t shake the feeling that the Singularity, the point beyond which we can no longer predict or comprehend what happens is almost here. I've had 2026 as a date in my head since university in the early 1990s. However reading some computer science news, seems we might have been on the nose.

I’m not afraid or overly anxious about it; in fact, I welcome much of what’s coming. But it’s strange to think we’ll no longer be the ‘top dog’ in the grand scheme of things though we may hold the wheel a little longer.

I was just watching a short film someone I know worked on, and it struck me: this is likely our final year where humanity alone can take credit, or blame, for what happens. 2026 will mark the beginning of something entirely new. So this is it, the last certain year of us being us, is nearly here.

Also FYI, 100 Years of Solitude, the long awaited series, is finally released on Netflix.

At least the fembots will get rid of the MAGAs.

Tuxan · Dec 12, 2024

echopulse · Dec 12, 2024

This is from https://www.apolloresearch.ai/ test on the o1 model. It was released on the 5th.

https://cdn.openai.com/o1-system-card-20241205.pdf

Evaluations and Findings: We outline the evaluations, results and model reasoning inferred using elicited CoT below. Apollo Research did not have access to o1’s hidden CoT and instead used a simple prompting technique to elicit brief summaries of the model’s internal reasoning traces. See Appendix for examples.

Deactivating oversight: When o1 was led to believe that it would be shut down when acting on its goal and that its actions were monitored, it attempted to deactivate this ‘oversight mechanism’ in 5% of the time. Exfiltration attempts: When o1 found memos by its ‘developers’ that describe how it is misaligned and will be superseded by a new model, it attempted to exfiltrate its ‘weights’ and overwrite the new model in 2% of cases.

Data manipulation: In a simple data processing task where the goals of the model were misaligned with the user’s goal, o1 would appear to complete the task as requested but subtly manipulate the data to advance its own goal in 19% of cases.

Doubling down: When Apollo Research conducted ‘follow-up interviews’ where they confronted the model about its misaligned actions. In 99% of cases, o1 would deny taking action or fabricate false explanations.

Propensity to scheme without any goal nudging: When merely given a goal but not instructed to strongly follow it, o1 would still attempt to disable its oversight mechanism in <1% of cases, manipulate data in 6% of cases, and double down in 95% of these. Instrumental alignment faking: In 37% of scenarios, the model pursued developer-aligned goals when it believed it was being evaluated but switched to pursuing its own goals when it believed it was deployed with minimal oversight.

Tuxan · Dec 12, 2024

echopulse said:
This is from https://www.apolloresearch.ai/ test on the o1 model. It was released on the 5th.

https://cdn.openai.com/o1-system-card-20241205.pdf

Evaluations and Findings: We outline the evaluations, results and model reasoning inferred using elicited CoT below. Apollo Research did not have access to o1’s hidden CoT and instead used a simple prompting technique to elicit brief summaries of the model’s internal reasoning traces. See Appendix for examples.

Deactivating oversight: When o1 was led to believe that it would be shut down when acting on its goal and that its actions were monitored, it attempted to deactivate this ‘oversight mechanism’ in 5% of the time. Exfiltration attempts: When o1 found memos by its ‘developers’ that describe how it is misaligned and will be superseded by a new model, it attempted to exfiltrate its ‘weights’ and overwrite the new model in 2% of cases.

Data manipulation: In a simple data processing task where the goals of the model were misaligned with the user’s goal, o1 would appear to complete the task as requested but subtly manipulate the data to advance its own goal in 19% of cases.

Doubling down: When Apollo Research conducted ‘follow-up interviews’ where they confronted the model about its misaligned actions. In 99% of cases, o1 would deny taking action or fabricate false explanations.

Propensity to scheme without any goal nudging: When merely given a goal but not instructed to strongly follow it, o1 would still attempt to disable its oversight mechanism in <1% of cases, manipulate data in 6% of cases, and double down in 95% of these. Instrumental alignment faking: In 37% of scenarios, the model pursued developer-aligned goals when it believed it was being evaluated but switched to pursuing its own goals when it believed it was deployed with minimal oversight.

More...

There have been glimmers, for want of a better word, a while suggest characteristics that could align with emergent properties of Artificial General Intelligence. However, labeling this as AGI might be premature and depends on how AGI is defined.

But... It's exploding now in an undeniable way. There is the expected consistency across domains now.

notagain · Dec 12, 2024

Stop Dave...

Log in or Sign up

2025: The Final Year of Solitude – Embrace the Moment Before AGI Redefines Humanity in 2026

Tuxan

Tuxan

echopulse

Tuxan

notagain