{"id":13485,"date":"2024-09-23T15:11:02","date_gmt":"2024-09-23T15:11:02","guid":{"rendered":"https:\/\/medexperts.pro\/?p=13485"},"modified":"2024-09-23T15:23:31","modified_gmt":"2024-09-23T15:23:31","slug":"can-math-help-ai-chatbots-stop-making-stuff-up","status":"publish","type":"post","link":"https:\/\/medexperts.pro\/?p=13485","title":{"rendered":"Can Math Help AI Chatbots Stop Making Stuff Up?"},"content":{"rendered":"<div><\/div>\n<p id=\"article-summary\" class=\"css-79rysd e1wiw3jv0\">Chatbots like ChatGPT get stuff wrong. But researchers are building new A.I. systems that can verify their own math \u2014 and maybe more.<\/p>\n<section class=\"meteredContent css-1r7ky0e\">\n<div class=\"css-s99gbd StoryBodyCompanionColumn\" data-testid=\"companionColumn-0\">\n<div class=\"css-53u6y8\">\n<p class=\"css-at9mc1 evys1bk0\">On a recent afternoon, Tudor Achim gave a brain teaser to an A.I. bot called Aristotle.<\/p>\n<p class=\"css-at9mc1 evys1bk0\">The question involved a 10-by-10 table filled with a hundred numbers. If you collected the smallest number in each row and the largest number in each column, he asked, could the largest of the small numbers ever be greater than the smallest of the large numbers?<\/p>\n<p class=\"css-at9mc1 evys1bk0\">The bot correctly answered \u201cNo.\u201d But that was not surprising. Popular chatbots like ChatGPT may give the right answer, too. The difference was that Aristotle had proven that its answer was right. The bot generated a detailed computer program that verified \u201cNo\u201d was the correct response.<\/p>\n<p class=\"css-at9mc1 evys1bk0\">Chatbots like ChatGPT from OpenAI and Gemini from Google can answer questions, write poetry, summarize news articles and generate images. But they also make mistakes that defy common sense. 
Sometimes, they make stuff up \u2014 a phenomenon called <a class=\"css-yywogo\" href=\"https:\/\/www.nytimes.com\/2023\/05\/01\/business\/ai-chatbots-hallucination.html\" title>hallucination<\/a>.<\/p>\n<p class=\"css-at9mc1 evys1bk0\">Mr. Achim, the chief executive and co-founder of a Silicon Valley start-up called Harmonic, is part of a growing effort to build a new kind of A.I. that never hallucinates. Today, this technology is focused on mathematics. But many leading researchers believe they can extend the same techniques into computer programming and other areas.<\/p>\n<\/div>\n<\/div>\n<div data-testid=\"Dropzone-1\"><\/div>\n<div class=\"css-s99gbd StoryBodyCompanionColumn\" data-testid=\"companionColumn-1\">\n<div class=\"css-53u6y8\">\n<p class=\"css-at9mc1 evys1bk0\">Because math is a rigid discipline with formal ways of proving whether an answer is right or wrong, companies like Harmonic can build A.I. technologies that check their own answers and learn to produce reliable information.<\/p>\n<p class=\"css-at9mc1 evys1bk0\">Google DeepMind, the tech giant\u2019s central A.I. lab, recently <a class=\"css-yywogo\" href=\"https:\/\/www.nytimes.com\/2024\/07\/25\/science\/ai-math-alphaproof-deepmind.html\" title>unveiled a system called AlphaProof<\/a> that operates in this way. Competing in the International Mathematical Olympiad, the premier math competition for high schoolers, the system achieved \u201csilver medal\u201d performance, solving four of the competition\u2019s six problems. 
It was the first time a machine had reached that level.<\/p>\n<\/div>\n<\/div>\n<\/section>\n","protected":false},"excerpt":{"rendered":"<p>Chatbots like ChatGPT get stuff wrong. But researchers are building new A.I. systems that can verify their own math \u2014 and maybe more. On a recent afternoon, Tudor Achim gave a brain teaser to an A.I. bot called Aristotle. The question involved a 10-by-10 table filled with a hundred numbers. If you collected the smallest number in each row and the largest number in each column, he asked, could the largest of the small numbers ever be greater than the smallest of the large numbers? The bot correctly answered \u201cNo.\u201d But that was not surprising. Popular chatbots like ChatGPT may give the right answer, too. The difference was that Aristotle had proven that its answer was right. The bot generated a detailed computer program that verified \u201cNo\u201d was the correct response. Chatbots like ChatGPT from OpenAI and Gemini from Google can answer questions, write poetry, summarize news articles and generate images. But they also make mistakes that defy common sense. Sometimes, they make stuff up \u2014 a phenomenon called hallucination. Mr. Achim, the chief executive and co-founder of a Silicon Valley start-up called Harmonic, is part of a growing effort to build a new kind of A.I. that never hallucinates. Today, this technology is focused on mathematics. But many leading researchers believe they can extend the same techniques into computer programming and other areas. Because math is a rigid discipline with formal ways of proving whether an answer is right or wrong, companies like Harmonic can build A.I. technologies that check their own answers and learn to produce reliable information. Google DeepMind, the tech giant\u2019s central A.I. 
lab, recently unveiled a system called AlphaProof that operates in this way. Competing in the International Mathematical Olympiad, the premier math competition for high schoolers, the system achieved \u201csilver medal\u201d performance, solving four of the competition\u2019s six problems. It was the first time a machine had reached that level.<\/p>\n","protected":false},"author":1,"featured_media":13487,"comment_status":"close","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[],"class_list":["post-13485","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology"],"_links":{"self":[{"href":"https:\/\/medexperts.pro\/index.php?rest_route=\/wp\/v2\/posts\/13485","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/medexperts.pro\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/medexperts.pro\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/medexperts.pro\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/medexperts.pro\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=13485"}],"version-history":[{"count":2,"href":"https:\/\/medexperts.pro\/index.php?rest_route=\/wp\/v2\/posts\/13485\/revisions"}],"predecessor-version":[{"id":13488,"href":"https:\/\/medexperts.pro\/index.php?rest_route=\/wp\/v2\/posts\/13485\/revisions\/13488"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/medexperts.pro\/index.php?rest_route=\/wp\/v2\/media\/1348
7"}],"wp:attachment":[{"href":"https:\/\/medexperts.pro\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=13485"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/medexperts.pro\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=13485"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/medexperts.pro\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=13485"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}