Galactica AI is the latest large language model developed by Meta. This large-scale model is specifically designed to store, collate, understand, and reason about scientific knowledge. Galactica can process and understand mathematical relations, scientific code, and questions & answers on different scientific subjects. It is supposed to provide accurate explanations, source information, and pertinent details about any scientific topic, from black holes to MATLAB assignment help.
However, things went awry, and Galactica AI was pulled down within a few days of the demo being made public. So, what went wrong? What kind of trouble did an AI search engine, touted as the next step in the evolution of search engine technology, stir up?
To understand everything clearly, we first need to understand a bit about large language models in AI.
A Bit About Large Language Models
Large language models are ground-breaking advancements in AI. They use natural language processing techniques and deep neural networks to perform a wide variety of operations using learned knowledge.
Large language models (LLMs) are challenging to design, hard to maintain, and nigh inaccessible to small and medium enterprises. At the same time, they are incredibly versatile and powerful, and can carry out many different kinds of language-related tasks, such as:
- Text Generation
- Summarization
- Automated Translation
- Real-Time Intelligent Response Generation
- Coding
- Image Generation
- Knowledge Answering
LLMs lie at the heart of all NLP-powered applications.
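To make the idea concrete, here is a minimal sketch of one of these tasks (text generation) using a small, publicly available model through the Hugging Face transformers library. The model and generation settings are illustrative placeholders, not what any particular production system uses.

```python
# Minimal sketch: text generation with a small, publicly available LLM (GPT-2).
# Requires: pip install transformers torch
from transformers import pipeline

# Load a pretrained text-generation pipeline; GPT-2 is a tiny stand-in for
# much larger models like GPT-3 or Galactica.
generator = pipeline("text-generation", model="gpt2")

prompt = "Large language models are"
outputs = generator(prompt, max_new_tokens=40, num_return_sequences=1)

print(outputs[0]["generated_text"])
```

The same pipeline interface covers summarization, translation, and question answering by swapping the task name and model.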
GPT-3, one of the most well-known LLMs, was developed by OpenAI; it has 175 billion parameters and was trained on roughly 570 gigabytes of text. The training data was cleaned and normalized, and then underwent reversible tokenization. Sparse attention transformers, a type of neural network, were then used to generate remarkably accurate and relevant output. GPT-3 and similar LLMs are now increasingly used as chatbot systems on websites such as MATLAB assignment writing help services, MOOCs, retail portals, etc.
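Reversible tokenization simply means the original text can be reconstructed exactly from its tokens. Here is a small illustration using the GPT-2 byte-pair tokenizer from Hugging Face; it is not GPT-3's exact tokenizer, but it belongs to the same family.

```python
# Illustration of reversible (lossless) tokenization with a GPT-2-style BPE tokenizer.
# Requires: pip install transformers
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

text = "Black holes emit Hawking radiation."
token_ids = tokenizer.encode(text)       # text -> integer token IDs
recovered = tokenizer.decode(token_ids)  # token IDs -> text

print(token_ids)
print(recovered == text)  # True: nothing is lost in the round trip
```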
A surprising aspect of LLMs like GPT-3 was that simply scaling up the model's parameter count and increasing the training data size was enough to boost their abilities. OpenAI's research on GPT-3 also revealed that LLMs are few-shot learners: the model can pick up a task after being exposed to just a few examples presented in context.
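Few-shot prompting looks something like the sketch below: a handful of worked examples are packed into the prompt, and the model is expected to continue the pattern. The translation demo mirrors the style of example used in the GPT-3 paper; the small GPT-2 model used here as a stand-in will handle it far less gracefully.

```python
# Minimal sketch of few-shot prompting: the model sees a few worked examples
# inside the prompt and is expected to continue the pattern for a new input.
# Requires: pip install transformers torch
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # small stand-in for GPT-3

few_shot_prompt = (
    "Translate English to French.\n\n"
    "English: sea otter\nFrench: loutre de mer\n\n"
    "English: cheese\nFrench: fromage\n\n"
    "English: black hole\nFrench:"
)

# A model as small as GPT-2 will do this poorly; large LLMs such as GPT-3
# complete the pattern correctly from just these few in-context examples.
completion = generator(few_shot_prompt, max_new_tokens=5)
print(completion[0]["generated_text"])
```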
Besides OpenAI's GPT-3, numerous other LLMs exist, each remarkably successful in its own applications. For example, there's BLOOM, XLNet, Google's BERT, Facebook's RoBERTa, Microsoft's DeBERTa, and Meta's Galactica.
Galactica AI was trained on a gigantic corpus of 48 million research papers, 2 million scripts, 8 million lecture notes, and myriad textbooks. With billions of parameters and trained on billions of data points, it was supposed to be the ultimate search engine for scientific research and studies. Meta offered access to a public demo on 15th November 2022.
Meta's AI uses a transformer architecture like GPT-3, with significant modifications such as Gaussian Error Linear Unit (GeLU) activations, a fixed context window, no bias terms, learned positional embeddings (just like Google's BERT), and a massive context-specific vocabulary generated from a portion of the training data.
The following section offers a quick overview of Galactica’s design.
A Brief Look Into Galactica’s Design
One of the primary purposes behind the development of Galactica AI was to act as the ultimate search engine for scientific knowledge. The scientific prowess of the model stems from the incredible amount of pre-processing, cleaning, and normalization applied to its data set.
Galactica AI's dataset was created by tokenizing information from millions of scientific papers, scripts, books, and the like. The AI model uses a decoder-only transformer architecture with specific modifications such as (see the sketch after this list):
- GeLU Activations for different model sizes
- A 2048-length context window for all models
- No bias terms in the layer norms or the dense kernels of the transformer
- Learned positional embeddings
- A giant vocabulary of around 50,000 tokens
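To make those design choices concrete, here is a minimal PyTorch sketch of a decoder-only block with GeLU activations, bias-free dense layers, and learned positional embeddings. It is not Meta's code; all sizes are illustrative placeholders, and the real model is orders of magnitude larger.

```python
# Minimal PyTorch sketch of the design choices above: GeLU activations, a fixed
# context window, bias-free dense layers, and learned positional embeddings.
# This is NOT Meta's implementation; the sizes below are illustrative placeholders.
import torch
import torch.nn as nn

VOCAB_SIZE = 50_000   # "a giant vocabulary of around 50,000 tokens"
CONTEXT_LEN = 2048    # fixed context window
D_MODEL = 512         # placeholder width (real models are far wider)
N_HEADS = 8

class DecoderBlock(nn.Module):
    def __init__(self):
        super().__init__()
        # (The paper also drops LayerNorm biases; standard nn.LayerNorm keeps them.)
        self.ln1 = nn.LayerNorm(D_MODEL)
        self.attn = nn.MultiheadAttention(D_MODEL, N_HEADS, bias=False, batch_first=True)
        self.ln2 = nn.LayerNorm(D_MODEL)
        self.mlp = nn.Sequential(
            nn.Linear(D_MODEL, 4 * D_MODEL, bias=False),
            nn.GELU(),                                    # GeLU activation
            nn.Linear(4 * D_MODEL, D_MODEL, bias=False),
        )

    def forward(self, x):
        # Causal mask: each position may only attend to itself and earlier positions.
        seq_len = x.size(1)
        mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out
        x = x + self.mlp(self.ln2(x))
        return x

class TinyDecoderLM(nn.Module):
    def __init__(self, n_layers=2):
        super().__init__()
        self.tok_emb = nn.Embedding(VOCAB_SIZE, D_MODEL)
        self.pos_emb = nn.Embedding(CONTEXT_LEN, D_MODEL)  # learned positional embeddings
        self.blocks = nn.ModuleList([DecoderBlock() for _ in range(n_layers)])
        self.head = nn.Linear(D_MODEL, VOCAB_SIZE, bias=False)

    def forward(self, token_ids):
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        x = self.tok_emb(token_ids) + self.pos_emb(positions)
        for block in self.blocks:
            x = block(x)
        return self.head(x)  # next-token logits

model = TinyDecoderLM()
dummy = torch.randint(0, VOCAB_SIZE, (1, 16))  # batch of 1, 16 tokens
print(model(dummy).shape)                      # torch.Size([1, 16, 50000])
```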
To handle the variation in how scientific content is expressed and to allow optimal learning, Galactica uses enhanced tokenization steps to identify mathematical characters, code, and various other kinds of sequences.
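As a rough illustration (not Galactica's actual preprocessing rules or vocabulary), special kinds of content can be wrapped in dedicated marker tokens before tokenization so the model learns to treat them differently. The marker names in the sketch below are hypothetical.

```python
# Illustrative sketch of "enhanced tokenization": wrapping special content
# (inline LaTeX math and inline code) in marker tokens before tokenizing.
# The [MATH]/[CODE] marker names are hypothetical, made up for this example.
import re

def wrap_special_spans(text: str) -> str:
    # Wrap inline LaTeX math ($...$) in [MATH] ... [/MATH] markers.
    text = re.sub(r"\$(.+?)\$", r"[MATH]\1[/MATH]", text)
    # Wrap inline code (`...`) in [CODE] ... [/CODE] markers.
    text = re.sub(r"`(.+?)`", r"[CODE]\1[/CODE]", text)
    return text

sample = "Mass-energy equivalence is $E = mc^2$, or `E = m * c**2` in code."
print(wrap_special_spans(sample))
# -> Mass-energy equivalence is [MATH]E = mc^2[/MATH], or [CODE]E = m * c**2[/CODE] in code.
```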
(Image: a neural network transformer model similar to the one used by Meta.)
Meta's researchers found that their AI search engine outperformed contemporary models such as BLOOM and GPT-3, racing ahead of them on a range of scientific benchmarking tests.
It isn't feasible to cover all the mathematical and technical intricacies of Galactica's design in this article. You can check out the entire research paper here:
So, despite all these advancements, why was Galactica pulled down?
What Went Wrong?
Biased data, misinformation, racist responses, pseudo-science: Galactica drew strong criticism from all corners for all these things and much more. Researchers and testers labelled its capabilities as overhyped and the model itself as not ready for public use.
So, instead of being a tool for synthesizing and distributing accurate and reliable scientific information, it began spewing nonsense when faced with complex questions. And it didn't stop there!
- Galactica reportedly struggled even with the most basic mathematical problems, delivering error-riddled answers, wildly inaccurate responses to scientific questions, incorrect lecture notes, and fabricated citation and referencing information.
Carl Bergstrom, a Professor of Biology at the University of Washington, called it a random bullshit generator. The model, of course, does not do this with any specific purpose; it has no idea that it is generating nonsensical information. Galactica presents information in a confident, authoritative manner, but that information is frequently incorrect.
- Meta paused and then withdrew the demo within 48 hours. Later, a spokesperson from the AI research department at Meta released a statement pointing out that Galactica is not a source of truth but just an experiment that used machine learning to process and disseminate summarized information. He also added that the research and development behind Galactica had short-term goals with no significant long-term implications.
- The intense criticism and vitriol that Galactica faced could have been avoided if Meta had chosen not to tout it as the next evolution in AI-powered search engines and a significant competitor to popular large language models. The demo ended up being a glorified sentence auto-completer.
Dan Hendrycks, an AI safety researcher at the University of California, Berkeley, pointed out the potential risks associated with an AI-powered knowledge dispenser and noted the lack of a safety team at Meta's AI division.
Such problems may seem trivial at first but could pose severe risks under different circumstances. Beyond the criticism, many researchers suggested adding filters and control mechanisms before generated information is dispensed to users.
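One simple, purely illustrative example of such a control mechanism is checking generated citations against a trusted reference list before they are shown to users. The data and helper below are hypothetical, not something Meta or the critics specifically proposed.

```python
# Hypothetical sketch of a citation filter: only display generated references
# that can be verified against a trusted database of known papers.
KNOWN_PAPERS = {
    "attention is all you need",
    "language models are few-shot learners",
}

def filter_citations(generated_citations):
    verified, flagged = [], []
    for title in generated_citations:
        bucket = verified if title.lower() in KNOWN_PAPERS else flagged
        bucket.append(title)
    return verified, flagged

ok, suspect = filter_citations([
    "Attention Is All You Need",
    "A Totally Real Paper That Was Never Written",
])
print("verified:", ok)           # shown to the user
print("needs review:", suspect)  # withheld or flagged for human review
```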
In Conclusion
The astounding failure of Meta's Galactica AI assistant was a lesson for the entire AI research community. Issues of biased and outright wrong information, presented in a grammatically flawless and seemingly sensible manner, indicate an intrinsic flaw in the system's design. However, Meta defended Galactica's performance and blamed the fiasco on the inherent flaws of large-scale language models.
The most apparent thing Meta could have done was spend more time with Galactica AI, as OpenAI did with GPT-3, before making it public. And it should never have labelled it as the next stage in the evolution of AI-enabled search engines.
Well, that's about it for this write-up. I hope it was an informative read for one and all. If AI, machine learning, deep learning, neural networks, and NLP interest you, then get started with an online course and start brushing up on your mathematics and your skills in Python, MATLAB, and R.
And if you need help, look for professional assistance from reputable online services that offer higher mathematics, machine learning, Python, and MATLAB assignment help.
All the best!