Scikit-LLM: Power Up Your Text Analysis in Python Using LLMs within scikit-learn Framework | by Esmaeil Alizadeh

One of the features of Scikit-LLM is the ability to perform zero-shot text classification. Scikit-LLM provides two classes for this purpose:

ZeroShotGPTClassifier: used for single-label classification (e.g. sentiment analysis),
MultiLabelZeroShotGPTClassifier: used for multi-label classification tasks.
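
To make "zero-shot" concrete: the model is given no worked examples, only the candidate labels and the text to classify. The sketch below builds such a prompt in plain Python. The function name build_zero_shot_prompt and the prompt wording are hypothetical and chosen purely for illustration; they are not Scikit-LLM's actual internals.

```python
def build_zero_shot_prompt(text, candidate_labels):
    """Return an illustrative zero-shot prompt (hypothetical wording, not the library's)."""
    labels = ", ".join(candidate_labels)
    return (
        f"Classify the text into exactly one of these categories: {labels}.\n"
        f"Text: {text}\n"
        "Category:"
    )


# Only the labels and the text are supplied; no labelled examples are included.
prompt = build_zero_shot_prompt(
    "The movie was fantastic!", ["positive", "negative", "neutral"]
)
print(prompt)
```

A multi-label variant would differ mainly in asking for several categories instead of exactly one.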

Let’s run a sentiment analysis on some movie reviews. For training purposes, we define the sentiment of each review in a variable called movie_review_labels. We train the model with these reviews and labels so that we can predict the sentiment of new movie reviews using the trained model. Because the classifier is zero-shot, this step does not fine-tune the underlying LLM; the labels mainly tell the classifier which categories it may choose from.

The sample dataset for the movie reviews is given below:

movie_reviews = [
"This movie was absolutely wonderful. The storyline was compelling and the characters were very realistic.",
"I really loved the film! The plot had a few unexpected twists which kept me engaged till the end.",
"The movie was alright. Not great, but not bad either. A decent one-time watch.",
"I didn't enjoy the film that much. The plot was quite predictable and the characters lacked depth.",
"This movie was not to my taste. It felt too slow and the storyline wasn't engaging enough.",
"The film was okay. It was neither impressive nor disappointing. It was just fine.",
"I was blown away by the movie! The cinematography was excellent and the performances were top-notch.",
"I didn't like the movie at all. The story was uninteresting and the acting was mediocre at best.",
"The movie was decent. It had its moments but was not consistently engaging."
]

movie_review_labels = [
"positive", 
"positive", 
"neutral", 
"negative", 
"negative", 
"neutral", 
"positive", 
"negative", 
"neutral"
]
new_movie_reviews = [
# A positive review
"The movie was fantastic! I was captivated by the storyline from beginning to end.",
# A negative review
"I found the film to be quite boring. The plot moved too slowly and the acting was subpar.",
# A neutral review
"The movie was okay. Not the best I've seen, but certainly not the worst."
]

Let’s train the model and then check what it predicts for each new review.

from skllm import ZeroShotGPTClassifier

# Assumes the OpenAI API key has already been configured for Scikit-LLM
# Initialize the classifier with the OpenAI model
clf = ZeroShotGPTClassifier(openai_model="gpt-3.5-turbo")

# Train the model
clf.fit(X=movie_reviews, y=movie_review_labels)

# Use the trained classifier to predict the sentiment of the new reviews
predicted_movie_review_labels = clf.predict(X=new_movie_reviews)

for review, sentiment in zip(new_movie_reviews, predicted_movie_review_labels):
    print(f"Review: {review}\nPredicted Sentiment: {sentiment}\n\n")

Review: The movie was fantastic! I was captivated by the storyline from beginning to end.
Predicted Sentiment: positive

Review: I found the film to be quite boring. The plot moved too slowly and the acting was subpar.
Predicted Sentiment: negative

Review: The movie was okay. Not the best I've seen, but certainly not the worst.
Predicted Sentiment: neutral

As can be seen above, the model predicted the sentiment of each movie review correctly.
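
With more than a handful of reviews, checking predictions by eye becomes impractical. A minimal sketch of a programmatic check is given below. It assumes the expected sentiments are the ones noted in the comments of new_movie_reviews, and it hardcodes the three predictions from this example so that it runs without an API call; the names expected_labels and predicted_labels are introduced here for illustration.

```python
# Expected sentiments, taken from the comments in new_movie_reviews.
expected_labels = ["positive", "negative", "neutral"]

# Predictions from this example, hardcoded for illustration;
# in practice, use the list returned by clf.predict.
predicted_labels = ["positive", "negative", "neutral"]

# Compare element by element and report the fraction that agree.
matches = [e == p for e, p in zip(expected_labels, predicted_labels)]
accuracy = sum(matches) / len(matches)
print(f"Accuracy: {accuracy:.2f}")  # prints "Accuracy: 1.00"
```

Three reviews are far too few to say anything about the classifier in general; the same comparison applies unchanged to a larger labelled test set.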

In the previous section, we had a single-label classifier (["positive", "negative", "neutral"]). Here, we are going to use the MultiLabelZeroShotGPTClassifier estimator to assign multiple labels to a list of restaurant reviews.

restaurant_reviews = [
"The food was delicious and the service was excellent. A wonderful dining experience!",
"The restaurant was in a great location, but the food was just average.",
"The service was very slow and the food was cold when it arrived. Not a good experience.",
"The restaurant has a beautiful ambiance, and the food was superb.",
"The food was great, but I found it to be a bit overpriced.",
"The restaurant was conveniently located, but the service was poor.",
"The food was not as expected, but the restaurant ambiance was really nice.",
"Great food and quick service. The location was also very convenient.",
"The prices were a bit high, but the food quality and the service were excellent.",
"The restaurant offered a wide variety of dishes. The service was also very quick."
]

restaurant_review_labels = [
["Food", "Service"],
["Location", "Food"],
["Service", "Food"],
["Atmosphere", "Food"],
["Food", "Price"],
["Location", "Service"],
["Food", "Atmosphere"],
["Food", "Service", "Location"],
["Price", "Food", "Service"],
["Food Variety", "Service"]
]
new_restaurant_reviews = [
"The food was excellent and the restaurant was located in the heart of the city.",
"The service was slow and the food was not worth the price.",
"The restaurant had a wonderful ambiance, but the variety of dishes was limited."
]

Let’s train the model and then predict the labels for the new reviews.

from skllm import MultiLabelZeroShotGPTClassifier

# Assumes the OpenAI API key has already been configured for Scikit-LLM
# Initialize the classifier, allowing at most three labels per review
clf = MultiLabelZeroShotGPTClassifier(max_labels=3)

# Train the model
clf.fit(X=restaurant_reviews, y=restaurant_review_labels)

# Use the trained classifier to predict the labels of the new reviews
predicted_restaurant_review_labels = clf.predict(X=new_restaurant_reviews)

for review, labels in zip(new_restaurant_reviews, predicted_restaurant_review_labels):
    print(f"Review: {review}\nPredicted Labels: {labels}\n\n")

Review: The food was excellent and the restaurant was located in the heart of the city.
Predicted Labels: ['Food', 'Location']

Review: The service was slow and the food was not worth the price.
Predicted Labels: ['Service', 'Price']

Review: The restaurant had a wonderful ambiance, but the variety of dishes was limited.
Predicted Labels: ['Atmosphere', 'Food Variety']

The predicted labels for each review are spot-on.
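
Two properties of multi-label output are easy to verify mechanically: every predicted label should come from the set of labels seen during training, and no review should receive more than max_labels labels. The sketch below checks both in plain Python. The predictions are hardcoded from this example so that it runs without an API call, and the names candidate_labels, predicted_labels, and MAX_LABELS are introduced here for illustration.

```python
# Training labels from the restaurant example.
restaurant_review_labels = [
    ["Food", "Service"],
    ["Location", "Food"],
    ["Service", "Food"],
    ["Atmosphere", "Food"],
    ["Food", "Price"],
    ["Location", "Service"],
    ["Food", "Atmosphere"],
    ["Food", "Service", "Location"],
    ["Price", "Food", "Service"],
    ["Food Variety", "Service"],
]

# Predictions from this example, hardcoded for illustration;
# in practice, use the value returned by clf.predict.
predicted_labels = [
    ["Food", "Location"],
    ["Service", "Price"],
    ["Atmosphere", "Food Variety"],
]

MAX_LABELS = 3  # same limit passed to the classifier as max_labels

# Flatten the nested training labels into a sorted list of unique candidates.
candidate_labels = sorted(
    {label for labels in restaurant_review_labels for label in labels}
)
print("Candidate labels:", candidate_labels)

for labels in predicted_labels:
    unknown = set(labels) - set(candidate_labels)
    within_limit = len(labels) <= MAX_LABELS
    print(labels, "| only known labels:", not unknown, "| within limit:", within_limit)
```

Both checks pass for the three predictions in this example.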

Scikit-LLM: Power Up Your Text Analysis in Python Using LLMs within scikit-learn Framework | by Esmaeil Alizadeh | Jun, 2023
