A smart combination of quantization and sparsity allows BitNet LLMs to become even faster and more compute- and memory-efficient ...
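The teaser does not spell out the mechanism, but the general idea behind BitNet-style efficiency is ternary (roughly 1.58-bit) weight quantization, which an activation-sparsity mask can compound by letting kernels skip zeroed inputs. Below is a minimal NumPy sketch under those assumptions; the function names, the absmean scaling, and the top-k threshold are illustrative, not the paper's actual kernels.

```python
import numpy as np

def ternary_quantize(W, eps=1e-8):
    """Quantize weights to {-1, 0, +1} with a per-tensor absmean scale,
    in the spirit of BitNet-style ternary weights (illustrative only)."""
    scale = np.mean(np.abs(W)) + eps
    Wq = np.clip(np.round(W / scale), -1, 1)
    return Wq.astype(np.int8), scale

def sparsify_activations(x, keep_ratio=0.5):
    """Keep only the largest-magnitude activations (a simple top-k mask);
    the zeros let a kernel skip the corresponding multiply-accumulates."""
    k = max(1, int(keep_ratio * x.size))
    threshold = np.partition(np.abs(x).ravel(), -k)[-k]
    return np.where(np.abs(x) >= threshold, x, 0.0)

def bitnet_like_matmul(x, Wq, scale):
    """With ternary weights, the matmul reduces to signed additions of the
    surviving non-zero activations, followed by one rescale."""
    return (x @ Wq.astype(np.float32)) * scale

# Toy usage: one linear layer with quantized weights and a sparsified input.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))   # dense full-precision weights
x = rng.normal(size=(1, 8))    # one activation vector
Wq, s = ternary_quantize(W)
xs = sparsify_activations(x, keep_ratio=0.5)
y = bitnet_like_matmul(xs, Wq, s)
print(y.shape)  # (1, 16)
```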
Ilya Sutskever now says that LLM scaling has plateaued and that it is time for discovery again. AI labs are working on ...
Large-scale, unstructured data preparation and handling functionality accelerates generative and predictive AI development and deployment. DataRobot, the provider of AI that ...
For years now, many AI industry watchers have looked at the quickly growing capabilities of new AI models and mused about exponential performance increases continuing well into the future. Recently, ...
This includes everything from real-world reliability to the amount of energy and resources required to construct an LLM. That ...
Initially, datasets were openly shared, allowing the public to examine the content used for training. Today, however, LLM companies tightly guard their data sources, leading to new intellectual ...
ECE Assoc. Prof. Zheng Zhang leads one of four potentially high-impact projects seeking to solve critical energy-efficiency challenges; the projects have been awarded more than $240,000 in cumulative funding related to ...
In this Disrupt Roundtable session, Siddharth Mall, Ian McDiarmid, and Kaushik PS from TELUS Digital dove deep into the ...
Now, let’s look at OLMo 1B SFT DPO’s performance. AMD ran several benchmarks and compared the results with other open-source ...
With over 1 billion parameters, trained on trillions of tokens using a cluster of AMD's Instinct GPUs, OLMo aims to challenge ...
AMD develops its own 1B-parameter OLMo large language model, trained on Instinct MI250 GPUs, for a wide variety of applications.
Presented in a recent paper, Spirit LM enables the creation of pipelines that mix spoken and written text to integrate ...
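The snippet only hints at how such mixing works. One plausible reading is a single token stream that alternates written text with discrete speech units, each span prefixed by a modality marker. The sketch below assumes hypothetical "[TEXT]"/"[SPEECH]" markers and a placeholder speech tokenizer; it is not Meta's actual Spirit LM tokenization.

```python
from typing import List, Tuple, Union

def speech_to_units(audio_ids: List[int]) -> List[str]:
    """Stand-in for a speech tokenizer mapping audio frames to discrete units."""
    return [f"<unit_{i}>" for i in audio_ids]

def interleave(segments: List[Tuple[str, Union[str, List[int]]]]) -> List[str]:
    """Build one token stream alternating written text and spoken units,
    prefixing each span with a modality marker so one LM can model both."""
    stream: List[str] = []
    for modality, payload in segments:
        if modality == "text":
            stream.append("[TEXT]")
            stream.extend(payload.split())
        elif modality == "speech":
            stream.append("[SPEECH]")
            stream.extend(speech_to_units(payload))
    return stream

# Toy usage: a sentence whose middle span is spoken rather than written.
tokens = interleave([
    ("text", "the weather today is"),
    ("speech", [12, 7, 7, 31]),   # placeholder discrete speech units
    ("text", "so bring an umbrella"),
])
print(tokens)
```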