Business · The Register
Vintage chatbot lives in the past like an elderly relative
Compiled by KHAO Editorial — aggregated from 1 outlet. See llms.txt for citation guidance.
If you're tired of interacting with a bot that spews Nazi propaganda or refers to itself as MechaHitler, you could sign off from Elon Musk's xAI.
Key facts
- A trio of AI researchers has released a 13-billion-parameter "vintage" language model they call Talkie, which has been trained solely on digital scans of English-language books, newspapers, periodicals, scientific journals, patents, and case law published before the end of 1930
- Through their work on Talkie, the team determined that training a language model on OCR'ed pre-1931 texts gave it only 30 percent of the performance of a model trained on human-transcribed copies
- Talkie also has a problem with "temporal leakage," said the team: It could identify FDR as the president in 1936 and list some of his legislative accomplishments despite its training data supposedly ending in 1930
- Still, Duvenaud admitted that, even at 13 billion parameters, there's a big capability gap between Talkie and AI models trained on modern data
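The "temporal leakage" issue above amounts to post-cutoff documents (or post-cutoff facts embedded in reprints and metadata) slipping into a supposedly date-bounded training corpus. The team's actual data pipeline is not described here, but a minimal sketch of the kind of cutoff filter such a corpus needs might look like this (all names and the `Document` structure are hypothetical illustrations):

```python
from dataclasses import dataclass

# Talkie's reported cutoff: material published before the end of 1930
CUTOFF_YEAR = 1930

@dataclass
class Document:
    title: str
    year: int   # publication year, as recorded in the scan's metadata
    text: str

def filter_corpus(docs):
    """Keep only documents published in or before the cutoff year.

    Note: this trusts the recorded publication year. A 1950 reprint of an
    1890 book mislabeled as 1890 would still leak post-cutoff text, which
    is one way temporal leakage can occur despite a filter like this.
    """
    return [d for d in docs if d.year <= CUTOFF_YEAR]

corpus = [
    Document("A 1928 patent filing", 1928, "..."),
    Document("A 1936 newspaper article", 1936, "..."),  # would leak FDR-era facts
]

clean = filter_corpus(corpus)
print([d.title for d in clean])  # → ['A 1928 patent filing']
```

The sketch shows why leakage is hard to rule out: the filter is only as good as the publication metadata attached to each scan.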
Summary
A trio of AI researchers has released a 13-billion-parameter "vintage" language model they call Talkie, which has been trained solely on digital scans of English-language books, newspapers, periodicals, scientific journals, patents, and case law that were published before the end of 1930. In other words, if you're looking for information on World War II or the election of Franklin D. Roosevelt, you're out of luck. This isn't the first vintage AI model to appear, mind you, with others trained on Victorian literature and pre-1900 scientific texts already out in the world. "Talkie is the largest vintage language model we are aware of, and we plan to continue scaling significantly," the team behind it noted.