Now we have all of the ingredients we need to check if a piece of text is AI-generated. Here's everything we need:
- The text (sentence or paragraph) we want to check.
- The tokenized version of this text, tokenized using the same tokenizer that was used to tokenize the training dataset for this model.
- The trained language model.
Using 1, 2, and 3 above, we can compute the following (a rough sketch of these computations follows the list):
- The per-token probability as predicted by the model.
- The per-token perplexity, derived from the per-token probability.
- The total perplexity for the entire sentence.
- The perplexity of the model on the training dataset.
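Here is a minimal sketch of these computations. The helper name, the use of PyTorch, and the assumption that the model is a plain module returning raw logits of shape (1, seq_len, vocab_size) are ours, not the article's; the model's training perplexity (item 4) would come from averaging the same quantity over the training corpus.

import torch
import torch.nn.functional as F

def per_token_perplexity(model, token_ids):
    """Per-token and sentence perplexity for an autoregressive language model.
    Assumes `model` maps a (1, seq_len) tensor of token ids to raw logits of
    shape (1, seq_len, vocab_size); adapt the forward call if your model
    returns an object with a `.logits` field instead."""
    input_ids = torch.tensor(token_ids, dtype=torch.long).unsqueeze(0)
    with torch.no_grad():
        logits = model(input_ids)                      # (1, seq_len, vocab_size)
    # The prediction at position i is for the token at position i + 1,
    # so the first token never gets a probability of its own.
    log_probs = F.log_softmax(logits[0, :-1, :], dim=-1)
    targets = input_ids[0, 1:]
    token_log_probs = log_probs[torch.arange(targets.numel()), targets]
    token_ppx = torch.exp(-token_log_probs)            # per-token perplexity
    sentence_ppx = torch.exp(-token_log_probs.mean())  # whole-sentence perplexity
    return token_ppx, sentence_ppx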
To check if a text is AI-generated, we need to compare the sentence perplexity with the model's perplexity scaled by a fudge-factor, alpha. If the sentence perplexity is higher than the model's perplexity with scaling, then it's probably human-written text (i.e. not AI-generated). Otherwise, it's probably AI-generated. The reasoning is that we expect the model not to be perplexed by text it would generate itself, so if it encounters text that it wouldn't generate, there's reason to believe the text isn't AI-generated. If the perplexity of the sentence is less than or equal to the model's training perplexity with scaling, then it's likely that it was generated using this language model, but we can't be very sure. That's because it's possible for a human to have written that text, and it just happens to be something the model could also have generated. After all, the model was trained on a lot of human-written text, so in some sense the model represents an "average human's writing".
ppx(x) in the formula above means the perplexity of the input "x".
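Expressed as code, the check described above boils down to something like this minimal sketch; the function name is ours, and the alpha value of 1.1 is an assumption chosen to match the scaling used in the visualization code later in this article.

ALPHA = 1.1  # assumed fudge-factor, matching the 1.1 scaling used in the HTML code below

def looks_ai_generated(sentence_ppx, model_train_ppx, alpha=ALPHA):
    """ppx(x) <= alpha * ppx(training set)  ->  possibly AI-generated.
    ppx(x) >  alpha * ppx(training set)  ->  probably human-written."""
    return sentence_ppx <= alpha * model_train_ppx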
Next, let's take a look at examples of human-written vs. AI-generated text.
Examples of AI-generated vs. human-written text
We've written some Python code that colors each token in a sentence based on its perplexity relative to the model's perplexity. The first token is always colored black because we don't consider its perplexity. Tokens with a perplexity that is less than or equal to the model's perplexity with scaling are colored red, indicating that they may be AI-generated, while tokens with a higher perplexity are colored green, indicating that they were definitely not AI-generated.
The numbers in the square brackets before the sentence indicate the perplexity of the sentence as computed using the language model. Note that some words are half red and half green. This is due to the fact that we used a subword tokenizer.
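To see why a single word can end up in two colors, here is a small illustration using the Hugging Face tokenizers library; the tokenizer file name and the example output shown in the comments are assumptions for illustration only, not the article's actual tokenizer.

from tokenizers import Tokenizer

# Hypothetical: load whatever tokenizer was trained alongside the model.
tok = Tokenizer.from_file("tokenizer.json")
enc = tok.encode("Extraordinarily long words get split into subwords.")
print(enc.tokens)
# A BPE tokenizer might split the first word into several tokens, e.g.
# ['Ex', 'tra', 'ordin', 'arily', ...] -- each subword token gets its own
# perplexity, and therefore its own color in the HTML output.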
Here's the code that generates the HTML above.
def get_html_for_token_perplexity(tok, sentence, tok_ppx, model_ppx):
    tokens = tok.encode(sentence).tokens
    ids = tok.encode(sentence).ids
    cleaned_tokens = []
    for word in tokens:
        m = list(map(ord, word))
        # 288 is the code point of 'Ġ', which BPE tokenizers use to mark the
        # start of a new word; replace it with a plain space for display.
        m = list(map(lambda x: x if x != 288 else ord(' '), m))
        m = list(map(chr, m))
        m = ''.join(m)
        cleaned_tokens.append(m)
    # The first token has no perplexity of its own, so it stays uncolored.
    html = [
        f"<span>{cleaned_tokens[0]}</span>",
    ]
    for ct, ppx in zip(cleaned_tokens[1:], tok_ppx):
        color = "black"
        if ppx.item() >= 0:
            if ppx.item() <= model_ppx * 1.1:
                color = "red"
            else:
                color = "green"
        html.append(f"<span style='color:{color};'>{ct}</span>")
    return "".join(html)
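For reference, a hypothetical invocation might look like the sketch below. The variable names (model, model_ppx) and the per_token_perplexity helper from the earlier sketch are assumptions, not part of the original code.

# Hypothetical usage; adjust names to match your own model and tokenizer.
sentence = "The quick brown fox jumps over the lazy dog."
token_ppx, sentence_ppx = per_token_perplexity(model, tok.encode(sentence).ids)
html = get_html_for_token_perplexity(tok, sentence, token_ppx, model_ppx)
with open("colored_tokens.html", "w") as f:
    # The number in square brackets is the sentence-level perplexity.
    f.write(f"<p>[{sentence_ppx.item():.2f}] {html}</p>")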
As we can see from the examples above, if the model detects some text as human-generated, it's definitely human-generated, but if it detects the text as AI-generated, there's a chance that it's not actually AI-generated. So why does this happen? Let's take a look next!
False positives
Our language model is trained on a LOT of text written by humans. It's generally hard to detect if something was written (digitally) by a specific person. The model's training inputs comprise many, many different styles of writing, likely written by a large number of people. This causes the model to learn many different writing styles and kinds of content. It's very likely that your writing style closely matches the writing style of some text the model was trained on. This is the cause of false positives, and it is why the model can't be sure that some text is AI-generated. However, the model can be sure that some text was human-generated.
OpenAI: OpenAI recently announced that it will discontinue its tools for detecting AI-generated text, citing a low accuracy rate (Source: Hindustan Times).
The original version of the AI classifier tool had certain limitations and inaccuracies from the outset. Users were required to manually enter at least 1,000 characters of text, which OpenAI then analyzed to classify as either AI- or human-written. Unfortunately, the tool's performance fell short, as it correctly identified only 26 percent of AI-generated content and mistakenly labeled human-written text as AI about 9 percent of the time.
Here's the blog post from OpenAI. It seems they used a different approach compared to the one discussed in this article.
Our classifier is a language model fine-tuned on a dataset of pairs of human-written text and AI-written text on the same topic. We collected this dataset from a variety of sources that we believe to be written by humans, such as the pretraining data and human demonstrations on prompts submitted to InstructGPT. We divided each text into a prompt and a response. On these prompts, we generated responses from a variety of different language models trained by us and other organizations. For our web app, we adjust the confidence threshold to keep the false positive rate low; in other words, we only mark text as likely AI-written if the classifier is very confident.
GPTZero: Another popular AI-generated text detection tool is GPTZero. It seems that GPTZero uses perplexity and burstiness to detect AI-generated text. "Burstiness refers to the phenomenon where certain words or phrases appear in bursts within a text. In other words, if a word appears once in a text, it's likely to appear again in close proximity" (source).
GPTZero claims to have a very high success rate. According to the GPTZero FAQ, "At a threshold of 0.88, 85% of AI documents are classified as AI, and 99% of human documents are classified as human."
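As a toy illustration of the burstiness idea quoted above, the sketch below measures how far apart repeated words occur; this is emphatically not GPTZero's actual metric, just a minimal way to see the concept of words appearing in bursts.

from collections import defaultdict

def word_gap_stats(text):
    """For each word that repeats, list the gaps (in word positions) between
    consecutive occurrences. Small gaps suggest the word appears in bursts."""
    positions = defaultdict(list)
    for i, word in enumerate(text.lower().split()):
        positions[word].append(i)
    return {
        word: [b - a for a, b in zip(pos, pos[1:])]
        for word, pos in positions.items()
        if len(pos) > 1
    }

print(word_gap_stats("the cat sat on the mat and the cat slept"))
# {'the': [4, 3], 'cat': [7]}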
The generality of this method
The method discussed in this article doesn't generalize well. What we mean by this is that if you have three language models, for example GPT-3, GPT-3.5, and GPT-4, then you need to run the input text through all three models and check its perplexity on all of them to see if the text was generated by any of them. This is because each model generates text slightly differently, and each one needs to independently evaluate the text to see whether it could have generated it. A sketch of what that multi-model check might look like is shown below.
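The sketch reuses the hypothetical per_token_perplexity helper from earlier; the dictionary layout and the alpha value are assumptions, not part of the article's code.

def flagged_by_models(sentence, models, alpha=1.1):
    """Check a sentence against several (tokenizer, model, training-perplexity)
    triples and return the names of the models whose perplexity check says
    'possibly AI-generated'."""
    flagged = []
    for name, (tok, model, model_ppx) in models.items():
        ids = tok.encode(sentence).ids
        _, sentence_ppx = per_token_perplexity(model, ids)
        if sentence_ppx.item() <= alpha * model_ppx:
            flagged.append(name)
    return flagged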
With the proliferation of large language models in the world as of August 2023, it seems unlikely that anyone can check an arbitrary piece of text against every language model in existence.
In fact, new models are being trained every day, and trying to keep up with this rapid progress seems hard at best.
The example below shows the result of asking our model to predict whether sentences generated by ChatGPT are AI-generated or not. As you can see, the results are mixed.
There are many reasons why this can happen.
- Training corpus size: Our model is trained on very little text, whereas ChatGPT was trained on terabytes of text.
- Data distribution: Our model is trained on a different data distribution compared to ChatGPT.
- Fine-tuning: Our model is just a GPT model, whereas ChatGPT was fine-tuned for chat-like responses, making it generate text in a slightly different tone. If you had a model that generates legal text or medical advice, our model would perform poorly on text generated by those models as well.
- Model size: Our model is very small (fewer than 100M parameters, compared to more than 200B parameters for ChatGPT-like models).
It's clear that we need a better approach if we hope to provide a reasonably high-quality answer to whether any given text is AI-generated.
Next, let's take a look at some misinformation about this topic circulating around the internet.