We have arguably passed Alan Turing’s test for high-grade artificial intelligence, which asks whether a machine can give a human the impression of conversing with another human. Most companies have also moved past other academic benchmarks, which, like leaderboards in a video game, provided a useful guideline but did not necessarily wow the general public. To some extent, these notions may prove antiquated, as most people now hold AI to loftier goals: broad tasks that go beyond precision and recall over some corpus.
Emerging Use Cases – Ask Our Bot About the News
In our use case, we have deployed our own fine-tuned model, which draws on our past news stories and offers a general Chicano perspective on current events. This is both necessary and quite fun for us, since it gives us an avenue to combine our interests in news, culture and technology. Surely we will not be the only ones to do so, and in the near future this should become more common.
- Stephen Colbert’s Interview with James Talarico Goes Viral After CBS Pulls Out Amid FCC Scrutiny
- Bad Bunny Leads Star-Studded Cast in Historical Drama Porto Rico
- Meta CEO Mark Zuckerberg Testifies in Landmark Trial on Social Media’s Impact on Youth
- Ricardo Salinas: Multimillion-Dollar Tax Fraud Continues to Defame Mexico’s Government
- Trump and Marco Rubio’s Sanctions Trigger a Deeper Humanitarian Crisis in Cuba
- Google’s Gemini AI Launches Music Generation Tool Amid Mixed Reactions
- OpenClaw Marks the Arrival of Open Agents with Drastic Security Concerns
- Rev. Jesse Jackson, Civil Rights Icon, Passes Away at 84
Thus, what we have shown is that we can have a reasonably high-performing help agent that interfaces with the content we prioritize. Similarly, a company could let its customers interact more dynamically with its terms of agreement, service conditions or product offerings. Essentially, we’ve created value where there would otherwise be none, because the agent can engage with and opine on content in a way no human realistically would. We cannot afford a 24/7 human operator to tell you about the news!
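To make the idea of an agent "interfacing with content" a bit more concrete, here is a minimal sketch in Python. It is not how our deployed bot works (that is a fine-tuned model); it assumes a much simpler approach where past headlines are ranked by word overlap with the reader's question and then pasted into a prompt. The names `STORIES`, `rank_stories` and `build_prompt`, the stopword list and the prompt wording are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical sample of past headlines; a real deployment would load full stories.
STORIES = [
    "Rev. Jesse Jackson, Civil Rights Icon, Passes Away at 84",
    "Bad Bunny Leads Star-Studded Cast in Historical Drama Porto Rico",
    "Google's Gemini AI Launches Music Generation Tool Amid Mixed Reactions",
]

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "an", "the", "in", "on", "at", "of", "to", "is",
             "what", "who", "about", "and", "did"}


def tokenize(text):
    """Lowercase the text and return its word tokens, minus stopwords."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOPWORDS]


def rank_stories(question, stories, top_k=2):
    """Return up to top_k stories that share a term with the question."""
    q_terms = Counter(tokenize(question))
    scored = []
    for story in stories:
        # Counter intersection keeps the minimum count of each shared term.
        overlap = sum((q_terms & Counter(tokenize(story))).values())
        if overlap > 0:
            scored.append((overlap, story))
    # Highest overlap first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [story for _, story in scored[:top_k]]


def build_prompt(question, stories):
    """Assemble a plain-text prompt from the best-matching stories."""
    context = "\n".join(f"- {s}" for s in rank_stories(question, stories))
    return f"Past coverage:\n{context}\n\nReader question: {question}"


if __name__ == "__main__":
    print(build_prompt("Who was Jesse Jackson?", STORIES))
```

A company doing the same with its terms of service would swap the headline list for its own documents. Word overlap is only a stand-in here; it was chosen because it runs with the standard library alone.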
New Benchmarks
Earlier, I referenced how benchmarking and leaderboards are somewhat out of place now. These were narrowly focused on engineering goals, not human interactions. Now, full-on human social functions are being attributed to conversational agents powered by large language models. These products are measured not only with traditional benchmarks, but also with the professional aptitude tests associated with official credentials, like the MCAT or LSAT. In a sense, the bar is now higher because we are comparing LLMs to human performance, not to past models. At the same time, we are devaluing the ability to memorize content and make inferences from that information base.
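The contrast between the two kinds of measurement can be sketched in a few lines of Python. The first function computes the traditional precision and recall over a corpus; the second scores an exam-style answer sheet so it can be set next to a human score. All document IDs, answers and the human baseline below are invented toy values, and treating an exam score as a simple fraction correct is a simplification rather than how any real credentialing test is scored.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def exam_accuracy(answers, answer_key):
    """Fraction of questions in the answer key that were answered correctly."""
    if not answer_key:
        return 0.0
    correct = sum(1 for q, a in answer_key.items() if answers.get(q) == a)
    return correct / len(answer_key)


if __name__ == "__main__":
    # Old-style benchmark: how well did the system pick documents from a corpus?
    p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d5"])
    print(f"precision={p:.2f} recall={r:.2f}")

    # Exam-style benchmark: compare against a person, not a previous model.
    key = {1: "A", 2: "C", 3: "B", 4: "D"}           # toy answer key
    model_answers = {1: "A", 2: "C", 3: "D", 4: "D"}  # toy model answers
    human_answers = {1: "A", 2: "B", 3: "B", 4: "A"}  # toy human answers
    print("model:", exam_accuracy(model_answers, key))
    print("human:", exam_accuracy(human_answers, key))
```

The first number only makes sense relative to a fixed corpus and to other systems; the second is meaningful to anyone who has sat an exam, which is part of why it lands better with the general public.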
Human performance at times is just memory recall. That lack of creativity makes many professional aspects of work less meaningful and leaves them open to this kind of high-level automation. We should embrace these changes, as they free us up for more creative and macro-level decision making.

