Introduction
In the world of large language models (LLMs), the cost of computation can be a significant barrier, especially for large-scale projects. I recently embarked on a project that required running 4,000,000 prompts with an average input length of 1,000 tokens and an average output length of 200 tokens. That's nearly 5 billion tokens! The standard approach of paying per token, as is common with models like GPT-3.5 and GPT-4, would have resulted in a hefty bill. However, I discovered that by leveraging open source LLMs, I could shift the pricing model to paying per hour of compute time, leading to substantial savings. This article will detail the approaches I took and compare and contrast each of them. Please note that while I share my experience with pricing, prices are subject to change and may differ depending on your region and specific circumstances. The key takeaway here is the potential cost savings of leveraging open source LLMs and renting a GPU per hour, rather than the exact prices quoted. If you plan on using my recommended solutions for your project, I've left a couple of affiliate links at the end of this article.
ChatGPT API
I conducted an initial test using GPT-3.5 and GPT-4 on a small subset of my prompt input data. Both models demonstrated commendable performance, but GPT-4 consistently outperformed GPT-3.5 in the majority of cases. To give you a sense of the cost, running all 4 million prompts through the OpenAI API would look something like this:
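Here is a minimal sketch of the arithmetic. The per-1K-token rates are my assumption, based on OpenAI's published pricing at the time (roughly $0.0015 input / $0.002 output for GPT-3.5 Turbo and $0.03 / $0.06 for GPT-4), and they have likely changed since:

# Back-of-the-envelope estimate of the total API cost for the full job.
# NOTE: the per-1K-token rates below are assumed from OpenAI's pricing
# at the time of writing and may no longer be current.
PROMPTS = 4_000_000
AVG_INPUT_TOKENS = 1_000   # average input length per prompt
AVG_OUTPUT_TOKENS = 200    # average output length per prompt

# model: (input $/1K tokens, output $/1K tokens) -- assumed rates
PRICING = {
    "gpt-3.5-turbo": (0.0015, 0.002),
    "gpt-4": (0.03, 0.06),
}

for model, (in_rate, out_rate) in PRICING.items():
    cost = PROMPTS * (AVG_INPUT_TOKENS / 1000 * in_rate
                      + AVG_OUTPUT_TOKENS / 1000 * out_rate)
    print(f"{model}: ${cost:,.0f}")

# gpt-3.5-turbo: $7,600   (the figure cited below)
# gpt-4: $168,000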
While GPT-4 did offer some performance benefits, its cost was disproportionately high compared to the incremental quality it added to my outputs. Conversely, GPT-3.5 Turbo, although more affordable, fell short in terms of performance, making noticeable errors on 2–3% of my prompt inputs. Given these factors, I wasn't prepared to invest $7,600 in a project that was…