| model_id | dataset_id | columns | seed | sample_idx | sentence_prefix | predicted_token | actual_token | probability | num_tokens |
|---|---|---|---|---|---|---|---|---|---|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 2
|
The detrimental effects that burning fossil fuels has on the environment, such as climate change and air
|
pollution
|
pollution
| 0.94
| 20
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 2
|
The importance of investing in renewable energy sources, such as solar and wind power, to reduce our dependency on fossil
|
fuels
|
fuels
| 0.99
| 24
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 6
|
An effective time management strategy should include setting clear and realistic goals, planning ahead, breaking tasks down into smaller chunks, being organized, prioritizing tasks, and staying focused on the task at
|
hand
|
hand
| 1
| 39
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 13
|
Eventually, their collective hope paid
|
off
|
off
| 0.96
| 8
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 13
|
On one fateful day, the sun slowly began to emerge from the darkness like a phoenix rising from the
|
ashes
|
ashes
| 0.93
| 23
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 24
|
Gradient descent is an optimization algorithm used in machine learning to find a set of parameters that minimizes a given cost
|
function
|
function
| 0.99
| 24
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 24
|
Gradient descent is used in many machine learning algorithms and is one of the key techniques used in deep
|
learning
|
learning
| 0.97
| 21
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 31
|
ARPA stands for Advanced Research Projects
|
Agency
|
Agency
| 0.97
| 9
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 41
|
It is caused by the change in speed and wavelength of light as it goes from one medium to
|
another
|
another
| 0.93
| 21
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 65
|
Moreover, they are able to bind with the substrates of the reaction, allowing the substrates to remain in close proximity and enhancing the rate of the
|
reaction
|
reaction
| 0.95
| 32
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 68
|
The death of John F. Kennedy forever changed the course of history, leaving a lasting legacy of hope and sorrow in its
|
wake
|
wake
| 0.98
| 26
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 70
|
For example, if you have a class called “Car”, you can create a new class called “SportCar” and extend the “Car”
|
class
|
class
| 0.96
| 31
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 97
|
Not only is it beneficial for communication, leisure, convenience, and productivity, but technology is now an important factor in improving education, health, security and overall quality of
|
life
|
life
| 0.97
| 35
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 97
|
From being able to access the internet from virtually anywhere to the rise of smart phones that allow us to stay connected to the people, services and information we need, technology is becoming a part of our lives that we cannot do away
|
with
|
with
| 0.97
| 47
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 97
|
With the advances in technology and the potential it holds, we are only seeing the beginning of its potential to improve our quality of
|
life
|
life
| 0.98
| 27
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 101
|
Replacing current fossil fuel-powered energy sources with renewable forms of energy is an important step in combating climate
|
change
|
change
| 0.99
| 21
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 104
|
Public transportation is an invaluable resource for communities and cities around the
|
world
|
world
| 0.93
| 14
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 104
|
It reduces the number of cars on the road, resulting in decreased air pollution, fewer greenhouse gas emissions, and improved public
|
health
|
health
| 0.93
| 26
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 106
|
We are committed to making sure that this issue does not occur again in the
|
future
|
future
| 0.92
| 17
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 111
|
The duo must fight off the shadows and find their way to the changing light of the morning, all while keeping their fears at
|
bay
|
bay
| 1
| 27
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 119
|
The United States Declaration of Independence promised life, liberty, and the pursuit of
|
happiness
|
happiness
| 0.98
| 17
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 137
|
Your wisdom and guidance were invaluable and I am thankful for your dedication and hard
|
work
|
work
| 0.99
| 17
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 148
|
The industrial revolution had a number of impacts, both good and
|
bad
|
bad
| 0.96
| 14
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 148
|
However, it also resulted in increased pollution, exploitation of workers, and a widening gap between the rich and the
|
poor
|
poor
| 0.99
| 24
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 152
|
Transfer the batter to the prepared pan and bake until a wooden pick inserted into the center comes out
|
clean
|
clean
| 0.9
| 21
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 156
|
Despite their reserved attitude, cats are very loving and affectionate towards the humans they bond
|
with
|
with
| 0.98
| 19
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 161
|
This event happened several years
|
ago
|
ago
| 0.95
| 7
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 162
|
It is also important to eat a balanced diet rich in fruits and vegetables, lean protein, and healthy
|
fats
|
fats
| 0.93
| 22
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 162
|
Make sure to stretch before and after workouts and get at least 7-8 hours of sleep every
|
night
|
night
| 0.94
| 22
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 164
|
The two most influential people of the twentieth century are Mahatma Gandhi and Nelson
|
Mandela
|
Mandela
| 0.99
| 18
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 164
|
Gandhi led India's successful struggle for independence from the British Empire and his philosophy of nonviolent resistance served as an inspiration for civil rights movements around the
|
world
|
world
| 0.97
| 33
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 175
|
Immigration to the United States has both pros and
|
cons
|
cons
| 0.99
| 12
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 177
|
Supervised learning is used for tasks such as classification, regression, and forecasting, while unsupervised learning is useful for tasks such as clustering and dimensionality
|
reduction
|
reduction
| 1
| 33
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 178
|
A recommender system is a type of information filtering system that uses user's past actions or preferences to suggest new items that the user may be interested
|
in
|
in
| 1
| 31
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 179
|
Player 1 and Player 2 decide who will go
|
first
|
first
| 0.97
| 13
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 179
|
Player 1 and Player 2 make a gesture (rock, paper, or scissors) at the same
|
time
|
time
| 0.98
| 23
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 183
|
The average life expectancy of a cat is around 12 to 15
|
years
|
years
| 0.96
| 18
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 188
|
Ransomware: malicious software that encrypts data, locking the user out of their system until a ransom is
|
paid
|
paid
| 0.99
| 24
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 211
|
User experience should be carefully considered to ensure the app is as intuitive and user-friendly as
|
possible
|
possible
| 1
| 19
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 212
|
The flight time from Orlando, FL to Boston, MA is approximately 3 hours and 18
|
minutes
|
minutes
| 0.99
| 22
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 213
|
The surface area of a sphere with radius 5 is 314.1592653589793 square
|
units
|
units
| 0.93
| 32
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 223
|
The UI interface for a grocery store checkout system should be intuitive and user-friendly, making it easy for customers to quickly and accurately check
|
out
|
out
| 0.93
| 28
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 230
|
The solar system is comprised of eight planets orbiting the Sun, along with dwarf planets, asteroids, comets, and other objects such as natural
|
satellites
|
satellites
| 0.98
| 31
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 230
|
The eight planets are Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and
|
Neptune
|
Neptune
| 0.99
| 22
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 236
|
Mia: Hey John, I'm thinking of taking up a new
|
hobby
|
hobby
| 0.93
| 16
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 236
|
Mia: But I want to do something that you'll actually be interested
|
in
|
in
| 0.96
| 17
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 240
|
def sum_numbers(x, y):
"""
Returns the sum of two
|
numbers
|
numbers
| 0.9
| 17
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 241
|
A healthy diet should include a balance of fruits, vegetables, whole grains, low-fat dairy, lean proteins, and healthy
|
fats
|
fats
| 0.94
| 26
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 241
|
Eating a variety of foods is important to receive all types of nutrients, vitamins, and
|
minerals
|
minerals
| 0.93
| 20
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 246
|
ATT will affect the advertising industry by requiring transparency on how this data is tracked, how it used, and who it's shared
|
with
|
with
| 0.92
| 27
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 246
|
By limiting targeted advertising, users will be able to choose the ads they see, promote a healthier online environment, and enforce greater data privacy regulations if need
|
be
|
be
| 0.96
| 32
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 250
|
It could be due to a number of reasons, such as a misconfiguration, server overload due to high traffic, issues with the application’s coding and logic, or a problem with the server’s hardware or operating
|
system
|
system
| 0.96
| 44
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 263
|
Artificial Intelligence (AI) is a form of technology that is revolutionizing the way we interact with the world around
|
us
|
us
| 1
| 25
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 264
|
I had the pleasure of dining at Panera Bread recently, and it was one of the best restaurant experiences I've ever
|
had
|
had
| 0.96
| 26
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 264
|
I would highly recommend Panera Bread to any discerning diner looking for a top-notch dining
|
experience
|
experience
| 0.94
| 20
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 275
|
This quote by Nelson Mandela speaks to the resilience, courage and determination of the human
|
spirit
|
spirit
| 0.94
| 18
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 275
|
Such rising ensures that we learn from our mistakes, grow in the face of adversity, and allows us to become the best, and most resilient, versions of
|
ourselves
|
ourselves
| 0.96
| 33
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 276
|
For energy production, we should focus on shifting away from fossil fuel-based sources and transitioning to renewable sources such as wind and
|
solar
|
solar
| 0.96
| 26
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 286
|
Customer: The product isn't working properly when I try to use
|
it
|
it
| 0.92
| 15
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 292
|
Later that year, the United States used two atomic bombs over Hiroshima and Nagasaki, Japan, on August 6 and 9 respectively, resulting in the surrender of the Japanese Empire and the official end of World War
|
II
|
II
| 0.99
| 46
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 297
|
It can provide you with relaxed vacations, business opportunities, and a way to explore cultures around the
|
world
|
world
| 0.94
| 21
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 298
|
Questions pertaining to customer experience at a retail store should seek to assess customer satisfaction with the overall experience, including customer service, product selection, checkout process, store atmosphere, and any other factors that may have impacted the shopping
|
experience
|
experience
| 0.97
| 45
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 303
|
Actions speak louder than words",
"A piece of cake",
"Piece by piece",
"Cut to the chase",
"Cost an arm and a
|
leg
|
leg
| 0.96
| 30
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 312
|
Mountain climbing requires a lot of practice and hard
|
work
|
work
| 0.97
| 11
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 332
|
John is an engineer living in California who loves running and reading in his spare
|
time
|
time
| 1
| 17
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 344
|
A zero-sum game is a type of game where one person's gain is another person's
|
loss
|
loss
| 1
| 20
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 344
|
This means that the total benefit or reward of the game is static, and that any gain to one player will result in a corresponding loss for all other
|
players
|
players
| 0.99
| 32
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 357
|
One day, he was presented with a peculiar maze that he couldn't solve, no matter how hard he
|
tried
|
tried
| 0.94
| 23
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 369
|
A three paragraph essay usually consists of an introductory paragraph, a body paragraph, and a concluding
|
paragraph
|
paragraph
| 0.99
| 20
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 371
|
Leaderboards can create a sense of competition and allow players to compete with each
|
other
|
other
| 1
| 17
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 376
|
Additionally, she is a role model for young girls and women, inspiring them to be brave and stand up for what is
|
right
|
right
| 0.97
| 26
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 383
|
For these reasons, it is important that we exercise our right to vote and have our voices
|
heard
|
heard
| 0.96
| 20
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 386
|
While both machine learning and deep learning are subfields of artificial intelligence, there are differences between the
|
two
|
two
| 0.92
| 21
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 386
|
Additionally, deep learning models have recently achieved impressive results on many tasks such as object recognition and natural language
|
processing
|
processing
| 0.9
| 22
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 398
|
The sentence "I am sitting" is a declarative
|
sentence
|
sentence
| 0.94
| 13
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 400
|
According to a report published by the World Health Organization (WHO), air pollution is the leading environmental cause of premature death worldwide, and smog is a major factor in air
|
pollution
|
pollution
| 0.95
| 36
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 400
|
Long-term exposure to smog can increase the risk of developing respiratory and cardiovascular diseases, stroke, and lung
|
cancer
|
cancer
| 0.97
| 23
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 400
|
This is due to the presence of fine particles and other compounds that are released into the air from burning fossil
|
fuels
|
fuels
| 0.94
| 23
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 401
|
Poverty is linked to poor health for a variety of
|
reasons
|
reasons
| 0.95
| 13
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 402
|
Apple has been empowering people through technology for over four
|
decades
|
decades
| 0.99
| 12
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 404
|
His mission this time was to uncover the truth about a mad scientist's invention, a mysterious device that could travel back in
|
time
|
time
| 0.99
| 26
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 404
|
John stepped onto the device and traveled back in
|
time
|
time
| 0.98
| 11
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 408
|
The Cat in the
|
Hat
|
Hat
| 0.98
| 6
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 422
|
The function f(x) = 2x is an even
|
function
|
function
| 1
| 14
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 425
|
Additionally, investing in crime prevention measures such as improved education and community initiatives could help to reduce the overall crime
|
rate
|
rate
| 0.92
| 23
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 429
|
The moon shone eerily through the trees, casting long shadows on the forest
|
floor
|
floor
| 0.96
| 18
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 429
|
As they went deeper into the woods, they could hear the faint sounds of owls and nightjars, and the howling of wolves in the
|
distance
|
distance
| 0.92
| 32
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 434
|
The oldest living tree on Earth is a Bristlecone Pine tree located in the White Mountains of California and it is estimated to be over 5,000 years
|
old
|
old
| 0.99
| 37
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 438
|
I would also include relevant hashtags so that the message reaches a wider
|
audience
|
audience
| 0.96
| 15
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 439
|
Support levels are where the price falls and struggles to move below, while resistance levels are where the price rises and struggles to move
|
above
|
above
| 0.92
| 27
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 442
|
Additionally, make sure to stay hydrated and take time to relax with hobbies you
|
enjoy
|
enjoy
| 0.97
| 17
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 454
|
Solitude is the peacefulness that one experiences when they are alone, either enjoying their own thoughts or meditating on the world around
|
them
|
them
| 0.96
| 28
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 455
|
Jack and Linda were childhood friends that had gone their separate ways after college, but when Linda learned that Jack had taken on a difficult task she wanted to help him in any way she
|
could
|
could
| 0.98
| 38
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 457
|
The Fahrenheit scale is based on 32 degrees for the freezing point of water and 212 degrees for its boiling
|
point
|
point
| 1
| 27
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 457
|
The Celsius scale is based on 0 degrees for the freezing point of water and 100 degrees for its boiling
|
point
|
point
| 1
| 26
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 458
|
The access control policies for a cloud-based application can include policy-based access control, role-based access control, attribute-based access control, identity-based access control, and authentication-based access
|
control
|
control
| 1
| 37
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 458
|
Attribute-based access control allows administrators to define certain attributes and grant or deny access according to those
|
attributes
|
attributes
| 0.91
| 20
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 460
|
The total cost of buying 10 cinema tickets that cost 6 euros each is 60
|
euros
|
euros
| 0.98
| 22
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 462
|
Eat a balanced diet to ensure your body and brain get the nutrition they
|
need
|
need
| 0.98
| 16
|
Qwen/Qwen2.5-0.5B
|
tatsu-lab/alpaca
|
[
"output"
] | 42
| 473
|
He had been training for years and was finally ready to take his game to the next
|
level
|
level
| 0.97
| 19
|
High-Probability Sentence Predictions Dataset
Dataset Description
This dataset contains sentences from tatsu-lab/alpaca for which the model Qwen/Qwen2.5-0.5B predicts the token immediately before the final period with ≥90% probability.
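The selection rule can be illustrated with a minimal sketch. It assumes the model's next-token scores are available as a plain dictionary of logits; the helper names (`softmax`, `top_prediction`, `passes_threshold`) and the toy logits are hypothetical and are not taken from the extraction script. Whether the script also requires the top prediction to equal the actual token is not stated here, so the sketch checks only the threshold.

```python
import math


def softmax(logits):
    """Convert a {token: logit} mapping into {token: probability}."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}


def top_prediction(logits):
    """Return the most likely token and its probability."""
    probs = softmax(logits)
    token = max(probs, key=probs.get)
    return token, probs[token]


def passes_threshold(logits, threshold=0.9):
    """True when the top prediction reaches the probability threshold."""
    _, probability = top_prediction(logits)
    return probability >= threshold


# Toy logits for the position before the final period (illustrative values only).
toy_logits = {" pollution": 6.0, " quality": 2.0, " travel": 1.0}
token, prob = top_prediction(toy_logits)
print(token, round(prob, 2), passes_threshold(toy_logits))
```

In a real run the logits would come from the language model at the position preceding the final period; the threshold of 0.9 matches the value listed under Extraction Parameters.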
Source Dataset Attribution
This dataset is derived from tatsu-lab/alpaca and inherits its license terms (cc-by-nc-4.0). Please cite the original dataset when using this data.
Extraction Parameters
| Parameter | Value |
|---|---|
| Source Dataset | tatsu-lab/alpaca |
| Model | Qwen/Qwen2.5-0.5B |
| Probability Threshold | 0.9 |
| Seed | 42 |
| Source Columns | output |
| Extraction Date | 2025-12-15 |
| Total Samples | 10,000 |
Schema
| Field | Type | Description |
|---|---|---|
| model_id | string | Model used for prediction |
| dataset_id | string | Source dataset identifier |
| columns | list[string] | Source columns extracted from |
| seed | int64 | Random seed used for reproducibility |
| sample_idx | int64 | Index in source dataset |
| sentence_prefix | string | Text before predicted token |
| predicted_token | string | Model's top prediction |
| actual_token | string | Ground truth token |
| probability | float64 | Prediction confidence (0-1) |
| num_tokens | int32 | Token count in sentence |
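For downstream code, the schema can be mirrored as a typed record. The sketch below is one possible representation, not part of the dataset tooling: the class name `PredictionRow` and its range check are assumptions, and the example values are copied from the first row of the preview.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PredictionRow:
    """One dataset row, following the field names in the schema table."""

    model_id: str
    dataset_id: str
    columns: List[str]
    seed: int
    sample_idx: int
    sentence_prefix: str
    predicted_token: str
    actual_token: str
    probability: float
    num_tokens: int

    def __post_init__(self):
        # The schema documents probability as a value in the range 0-1.
        if not 0.0 <= self.probability <= 1.0:
            raise ValueError(f"probability out of range: {self.probability}")


# Values taken from the first preview row.
first_row = {
    "model_id": "Qwen/Qwen2.5-0.5B",
    "dataset_id": "tatsu-lab/alpaca",
    "columns": ["output"],
    "seed": 42,
    "sample_idx": 2,
    "sentence_prefix": (
        "The detrimental effects that burning fossil fuels has on the "
        "environment, such as climate change and air"
    ),
    "predicted_token": "pollution",
    "actual_token": "pollution",
    "probability": 0.94,
    "num_tokens": 20,
}
row = PredictionRow(**first_row)
print(row.predicted_token, row.probability)
```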
Usage
from datasets import load_dataset
dataset = load_dataset("ermiaazarkhalili/alpaca-high-prob-qwen-0.5b-10k")
print(dataset["train"][0])
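Once rows are loaded, typical follow-up steps are to keep only the most confident predictions or to group sentences by their source example. The sketch below works on any iterable of dict-like rows, so it runs without a download; the helper names are hypothetical, and the three sample rows are abbreviated from the preview.

```python
from collections import defaultdict


def filter_by_probability(rows, min_probability):
    """Keep rows whose probability is at least min_probability."""
    return [r for r in rows if r["probability"] >= min_probability]


def group_by_sample(rows):
    """Map each sample_idx to the sentence prefixes extracted from it."""
    grouped = defaultdict(list)
    for r in rows:
        grouped[r["sample_idx"]].append(r["sentence_prefix"])
    return dict(grouped)


# Three rows abbreviated from the preview (prefixes shortened for brevity).
rows = [
    {"sample_idx": 2, "sentence_prefix": "The detrimental effects ...",
     "predicted_token": "pollution", "probability": 0.94},
    {"sample_idx": 2, "sentence_prefix": "The importance of investing ...",
     "predicted_token": "fuels", "probability": 0.99},
    {"sample_idx": 6, "sentence_prefix": "An effective time management ...",
     "predicted_token": "hand", "probability": 1.0},
]

confident = filter_by_probability(rows, 0.99)
by_sample = group_by_sample(rows)
print(len(confident), sorted(by_sample))
```

Iterating over `dataset["train"]` yields dict rows, so the same helpers should apply to the loaded split; for large splits, the `filter` method of the `datasets` library with the same predicate is likely the more efficient choice.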
Citation
@dataset{high_prob_sentences_2025,
title = {High-Probability Sentence Predictions from tatsu-lab/alpaca},
year = {2025},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/datasets/ermiaazarkhalili/alpaca-high-prob-qwen-0.5b-10k}},
note = {Derived from tatsu-lab/alpaca, model: Qwen/Qwen2.5-0.5B}
}
License
This dataset inherits the license from the source dataset: cc-by-nc-4.0
See tatsu-lab/alpaca for full license terms.
Reproducibility
To reproduce this dataset extraction:
python scripts/extract_high_prob_sentences.py \
--dataset "tatsu-lab/alpaca" \
--model "Qwen/Qwen2.5-0.5B" \
--threshold 0.9 \
--seed 42 \
--columns output \
--output data/output.parquet