From Chatbot to Coworker: Gemini 3 Ushers in the Agent Era

Added Nov 24, 2025

Mollick showcases how Gemini 3 moves beyond chat to agentic work: planning, coding, browsing, and building deployable products with human oversight. In research tasks, it cleaned data, ran analyses, and drafted a paper with an original NLP metric, demonstrating strengths akin to a solid PhD student alongside fixable judgment issues. He concludes that the era is shifting from chatbots to digital coworkers, with humans acting as managers and stewards of quality and safety.

Key Points

  • AI has evolved from text chat to agentic systems that plan, code, and autonomously operate computers, making them general-purpose digital workers.
  • Google’s Gemini 3 with Antigravity can read local files, browse, build and test websites, and handle deployment, while checking in for permissions and revisions.
  • In a research test, Gemini 3 cleaned messy legacy data, devised hypotheses, ran analyses, and wrote a 14-page paper, including an original NLP-based novelty metric.
  • The model’s errors resembled human judgment issues (method choices, theory stretch) rather than classic hallucinations, and improved with managerial guidance.
  • The human role is shifting from policing outputs to managing and directing capable AI coworkers; however, giving agents system access carries security and data risks.

Sentiment

The overall sentiment of the Hacker News discussion is mixed, leaning towards cautious skepticism. While there is a strong appreciation for the technical progress and practical utility of agentic AI models like Gemini 3, many commenters express significant reservations regarding the quality and reliability of AI-generated content, the persistence of subtle (or overt) hallucinations, and the models' true capacity for novelty. Concerns about security and profound societal impacts also feature prominently, indicating that the community views the article's claims with a healthy degree of critical evaluation rather than outright agreement.

In Agreement

  • AI models, especially the latest generation like Codex 5.1, Sonnet 4.5, and Gemini 3, are becoming 'more and more shippable' for coding and project development, with a rapid decrease in 'wtfs per line.'
  • Gemini 3 is 'pretty good,' 'very solid,' and its integrations (e.g., with Google Workspace) are excellent, making it useful for professional tasks.
  • The shift from 'human who fixes AI mistakes' to 'human who directs AI work' is already true for some individuals, signifying a change in the human-AI interaction paradigm.
  • AI can act as a highly effective 'sounding board,' providing valuable critique and suggestions that significantly improve human-written content, even for skilled writers.
  • Gemini 3 has reached a 'major threshold' in AI by offering 'intelligence'—introducing concepts and ideas into conversations rather than just executing skill-based tasks.
  • Conversations with recent AI models can feel like interacting with an expert in specific subfields, supporting the article's comparison to 'PhD-level intelligence' for certain work.
  • Advancements in automatic speech recognition and speech-to-text could help overcome potential literacy barriers for interacting with LLMs.

Opposed

  • The quality and correctness of AI-generated output, such as a 14-page research paper, are often questionable, amounting to '14 pages of words' rather than genuinely good, coherent, or PhD-level work, especially in fields where the user lacks expertise.
  • Hallucinations persist and have become more insidious; models now 'confidently tell you they are right' and provide fabricated reasoning or even entire functionalities that do not exist, making them harder to detect.
  • AI models are largely 'huge librarians' that synthesize existing knowledge, rather than true sources of 'new ideas' or 'novel complexes,' often regurgitating standard lore unless extensively guided.
  • The need for human review, correction, and precise prompting remains crucial, as LLMs are seen as 'probabilistic slot machines' that can lead to 'cargo-cult behavior' if outputs are accepted without critical evaluation.
  • Claims of having 'moved past hallucinations' or a definitive shift to a 'managerial' human role are considered premature, a 'wishlist,' or a re-statement of promises that have been circulating for over a year.
  • Granting agentic AI systems broad access to personal systems or sensitive data poses significant security and privacy risks, necessitating robust sandboxing or strict permission controls.
  • There are significant societal concerns, including 'neural atrophy' from over-reliance on AI, a potential 'bimodal distribution' of human intelligence, and economic issues like a de-skilled workforce or an 'illiterate society' if AI replaces the need for fundamental human skills.