
Only One LLM Can Successfully Fly a Drone

Hacker News · 7h ago
3 min read

Key Facts

  • SnapBench is a new benchmark designed to test how well large language models can fly drones using visual data.
  • GPT-4o was the only model tested that successfully completed the drone flight challenge.
  • The benchmark highlights a significant gap between AI's reasoning abilities and its ability to perform physical tasks.
  • These findings suggest that current LLMs are not yet ready for widespread use in autonomous robotics applications.

The Drone Challenge

A new benchmark has revealed a surprising limitation in current artificial intelligence: only one large language model has demonstrated the ability to fly a drone successfully. The findings come from SnapBench, a new testing framework designed to evaluate how well AI systems can interpret visual data and carry out physical tasks.

The benchmark was recently shared on Hacker News, where it sparked discussion about AI's readiness for robotics applications. Although LLMs have shown impressive capabilities in text generation and reasoning, their performance in the physical world remains a major obstacle. This latest test provides concrete evidence of that gap.

Inside SnapBench

SnapBench moves beyond traditional text-based benchmarks to test real-world applications. The framework gives models a specific challenge: interpret visual snapshots and issue commands to navigate a drone through a course. This requires a combination of visual understanding, spatial reasoning, and precise instruction generation.

The test is designed to be rigorous, simulating the kind of dynamic decision-making that autonomous robotics requires. Unlike static problems, flying a drone demands continuous adaptation to changing conditions. The benchmark results indicate that most current models fail to bridge the gap between abstract knowledge and practical execution.

Key aspects of the benchmark include:

  • Real-time visual processing requirements
  • Complex spatial navigation tasks
  • Continuous command generation
  • Safety and precision constraints
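
The snapshot-to-command loop described above can be illustrated with a minimal sketch. The article does not describe SnapBench's actual interface, so everything here is an assumption made for illustration: the `Snapshot` type, the four-word command vocabulary, the grid world, and the `greedy_policy` function that stands in for a model call. The sketch shows only the general shape of such an evaluation: observe, ask the policy for a command, validate it, apply it, and stop on success or when the step budget runs out.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical command vocabulary; the article does not list SnapBench's real commands.
MOVES = {
    "forward": (0, 1),
    "back": (0, -1),
    "left": (-1, 0),
    "right": (1, 0),
}


@dataclass
class Snapshot:
    """Stand-in for a camera frame: just the drone and target grid positions."""
    drone: Tuple[int, int]
    target: Tuple[int, int]


# A policy maps one snapshot to one command string (the role an LLM would play).
Policy = Callable[[Snapshot], str]


def greedy_policy(snap: Snapshot) -> str:
    """Placeholder for a model call: step along the axis with the larger gap."""
    dx = snap.target[0] - snap.drone[0]
    dy = snap.target[1] - snap.drone[1]
    if abs(dx) >= abs(dy) and dx != 0:
        return "right" if dx > 0 else "left"
    if dy != 0:
        return "forward" if dy > 0 else "back"
    return "forward"  # already on target; run_episode stops before asking


def run_episode(
    policy: Policy,
    start: Tuple[int, int],
    target: Tuple[int, int],
    max_steps: int = 20,
) -> Tuple[bool, List[str]]:
    """Run the observe/command/apply loop and return (success, command log)."""
    pos = start
    log: List[str] = []
    for _ in range(max_steps):
        if pos == target:
            return True, log
        command = policy(Snapshot(drone=pos, target=target))
        if command not in MOVES:
            return False, log  # an unrecognized command ends the run as a failure
        log.append(command)
        dx, dy = MOVES[command]
        pos = (pos[0] + dx, pos[1] + dy)
    return pos == target, log


if __name__ == "__main__":
    success, commands = run_episode(greedy_policy, start=(0, 0), target=(2, 1))
    print(success, commands)
```

Swapping `greedy_policy` for a function that sends an image to a multimodal model and parses its reply would keep the rest of the loop unchanged. The loop treats any reply outside the command vocabulary as a failed run, which is one simple way to enforce a precision constraint.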

"Only 1 LLM can fly a drone"

- SnapBench findings

The Only Success Story

Of all the models tested, GPT-4o emerged as the only successful candidate. Its ability to process visual input and generate accurate flight commands set it apart from competitors. The result highlights the model's multimodal understanding and its potential for robotics integration.

That only one model succeeded underscores how difficult the task is. Many LLMs excel at language tasks, but translating that ability into physical action requires a deeper level of understanding. GPT-4o's performance suggests it has made significant progress in this area, yet the fact that it stood alone shows how challenging the domain remains.

The starkness of the claim that only one LLM can fly a drone reflects the current state of AI in robotics. Progress is being made, but the path toward autonomous AI agents in the physical world is still in its early stages.

Implications for AI

The SnapBench results have significant implications for the future of robotic AI. They suggest that simply scaling language models may not be enough to solve complex physical tasks. Instead, new approaches that integrate visual, spatial, and motor-control capabilities may be needed.

The finding is particularly relevant for industries exploring automation, from logistics to defense. AI that can operate drones reliably could transform many sectors, but the technology is not yet mature enough for widespread deployment. The benchmark serves as a reality check, tempering expectations while also providing a clear metric for improvement.

Areas that will require attention include:

  • Improved visual-spatial reasoning
  • Integration of sensory feedback loops
  • Safety protocols for physical autonomy
  • Training on diverse real-world scenarios

The Road Ahead

The conversation around SnapBench and drone flight is part of a broader discussion about the limitations of AI. As benchmarks like this one become more common, developers will have better tools for measuring progress and identifying weaknesses. This iterative process is crucial to advancing the field.

The current results may seem disappointing, but they provide a valuable baseline. Future models can be designed with these specific challenges in mind, potentially leading to advances in how AI understands and interacts with the physical world. GPT-4o's success offers a glimpse of what is possible, while the failure of the other models highlights the work that remains.

Key Takeaways

The SnapBench drone test shows that current AI technology has a long way to go before it can reliably handle complex physical tasks. Only one model, GPT-4o, completed the challenge successfully, indicating that most LLMs lack the necessary integration of visual and motor skills.

For the robotics industry, this is both a challenge and an opportunity. The clear performance gap gives direction to future research and development. As AI continues to evolve, benchmarks like SnapBench will be essential for tracking progress toward truly autonomous systems.

Frequently Asked Questions

What is the main finding of the SnapBench test?

The main finding is that only one large language model, GPT-4o, was able to fly a drone successfully based on visual input. All the other models tested failed to complete the task, revealing a major limitation in current AI technology.

Why is this significant for AI development?

It shows that although LLMs are good at language tasks, they struggle with the complex integration of visual data and physical execution that robotics requires. It highlights a critical area where AI must improve before it can be used reliably in real-world autonomous systems.

What does this mean for the future of AI in robotics?

The results suggest that new approaches are needed to close the gap between AI reasoning and physical action. Future development will likely focus on better integration of visual-spatial reasoning and motor control, using benchmarks like SnapBench to measure progress.
