FLUF Test: A Framework for Critical Evaluation of Content Generated with Artificial Intelligence © 2023 by Dr. Jennifer L. Parker is licensed under CC BY-NC-SA 4.0.
What is the FLUF Test?
Everyone is talking about artificial intelligence. With tools like ChatGPT and Copilot at our fingertips, generating content is as easy as running a Google search. That raises a question: if we are going to embrace the power of AI, how do we critically evaluate AI-generated content? Enter the FLUF Test.
The FLUF Test is a framework for critically evaluating content generated through artificial intelligence. The framework encourages the user to look critically at format, language, usability, and fanfare when considering AI-generated results, which can be flowery, inaccurate, or repetitive, for example. Using these tools well means becoming educated about their power, usefulness, and shortcomings.
In 2023, I developed the FLUF Test, based on many years as a teacher of information and digital media literacy skills across PK-20 environments. With the FLUF Test, you look specifically at format, language, usability, and fanfare, and at the indicators for each. The goal is a result with zero FLUF, that is, zero infractions in the generated results.
The FLUF Test in Action
The template guides your journey as you use the FLUF indicators to write prompts, critically evaluate results, and repeat the process until you reach zero FLUF. Explore the framework (pages 1-12) or check out the sample scenario of the FLUF Test in Action (pages 13-21).
Watch this video to learn how it works.
How to FLUF Test Results Generated with Artificial Intelligence
Format Look For's
In the area of F for Format, you are analyzing layout or length. Analyze your result: is it too long? Does it use MLA when you wanted APA format? If there is an issue, score the rubric with a “plus,” indicating an infraction. A plus means the result needs to be regenerated or tweaked.
- Layout - If it doesn't follow the formal writing patterns or format, it gets a +.
- Length - If there are lots of extra words to extend the word count, it gets a +.
Language Look For's
Looking at L for Language, you are examining tone, phrasing, and repetition. If the tone sounds too business-like, for example, the score is a plus and the result needs to be regenerated.
- Tone - If it lacks personal style, tone, or human elements, it gets a +.
- Phrasing - If the syntax or semantics are off, or it presents information in an awkward way, it gets a +.
- Repetition - If the passage lacks succinct presentation of ideas, or has run-on sentences or repetition of ideas, thoughts, and/or phrases, it gets a +.
Usability Look For's
U is for Usability, where you want to know whether the information is valid and reliable, so you are looking for credibility and consistency.
- Consistency - If there are inconsistencies in content, it gets a +.
- Credibility - If you cannot determine whether there are credible references, information cannot be authenticated or validated, or it lacks citations/sources for documentation, then it gets a +.
Fanfare Look For's
Finally, we have F for Fanfare. Ordinarily, we would look for visual appeal and layout, but with words we are looking for academic writing that is free from anecdotes and jargon. This doesn't mean we don't want technical vocabulary, but it does mean we don't want cliches. Here, you are looking for the writing to present information in an informative and specific way, in the style you intend, and to make connections across content. Do the results deepen the conversation by comparing or contrasting information, presenting metaphors or analogies, or sharing examples, stories, cases, or scenarios?
Depending on the intention of the writing:
- Anecdotes - The absence of human examples, analogies, metaphors, or comparisons gets a +.
- Jargon - Jargon is another form of repetition, cliché, or assumption and can come across as a condescending tone. If jargon is presented in this way, without specificity, it gets a +.
Any infractions (pluses) indicate the need to regenerate or re-prompt the results with the AI generator. Starting with a good prompt helps eliminate infractions and get to "Zero FLUF".
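The rubric above can be pictured as a simple score sheet. The following is a minimal sketch, not part of the official FLUF template: the `FlufScore` class and its method names are hypothetical, and it only records the pluses a human reviewer assigns. It does not judge the text itself.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

# The four FLUF categories and their indicators, as listed in the rubric.
FLUF_INDICATORS = {
    "Format": ["Layout", "Length"],
    "Language": ["Tone", "Phrasing", "Repetition"],
    "Usability": ["Consistency", "Credibility"],
    "Fanfare": ["Anecdotes", "Jargon"],
}


@dataclass
class FlufScore:
    """Hypothetical score sheet holding the indicators marked with a plus."""

    pluses: Set[Tuple[str, str]] = field(default_factory=set)

    def mark(self, category: str, indicator: str) -> None:
        """Record a plus (an infraction) for one indicator."""
        if indicator not in FLUF_INDICATORS.get(category, []):
            raise ValueError(f"Unknown indicator: {category}/{indicator}")
        self.pluses.add((category, indicator))

    def total(self) -> int:
        """Number of distinct indicators that received a plus."""
        return len(self.pluses)

    def by_category(self) -> Dict[str, int]:
        """Count of pluses per FLUF category."""
        return {
            category: sum(1 for ind in indicators if (category, ind) in self.pluses)
            for category, indicators in FLUF_INDICATORS.items()
        }

    def is_zero_fluf(self) -> bool:
        """True when no infractions remain."""
        return self.total() == 0


# Example: a reviewer finds padding and missing sources in a draft.
score = FlufScore()
score.mark("Format", "Length")
score.mark("Usability", "Credibility")
print(score.total())         # 2
print(score.by_category())   # {'Format': 1, 'Language': 0, 'Usability': 1, 'Fanfare': 0}
print(score.is_zero_fluf())  # False
```

Any total above zero signals that the result should be regenerated or re-prompted.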
Create an AI Prompt with FLUF
Use the 2-page template to create a prompt guided by the FLUF indicators for your AI tool, then use the FLUF Test to evaluate results.
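As a rough illustration of a prompt guided by the four indicators, here is a minimal sketch. It does not reproduce the 2-page template: the `build_fluf_prompt` function, its parameters, and the wording of each line are assumptions made for this example.

```python
def build_fluf_prompt(task: str, layout: str, length: str,
                      tone: str, sources: str, style: str) -> str:
    """Assemble a prompt with one line per FLUF category (hypothetical wording)."""
    lines = [
        task.strip(),
        f"Format: {layout}; length: {length}.",
        f"Language: use a {tone} tone, avoid awkward phrasing, and do not repeat ideas.",
        f"Usability: {sources}; keep claims consistent throughout.",
        f"Fanfare: {style}; avoid cliches and unexplained jargon.",
    ]
    return "\n".join(lines)


# Example values are illustrative only.
prompt = build_fluf_prompt(
    task="Summarize the causes of the Dust Bowl for a ninth-grade class.",
    layout="three paragraphs in APA style",
    length="about 250 words",
    tone="conversational but academic",
    sources="cite at least two sources that can be verified",
    style="include one concrete example or comparison",
)
print(prompt)
```

Stating each expectation up front gives you something specific to check the results against when you run the FLUF Test.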
Explore the FLUF Experience (Prompt to Critical Evaluation)
How to Use the FLUF Test
- Review FLUF indicators – format, language, usability, fanfare
- Create a prompt using FLUF guidance
- Generate results and FLUF test
- Update prompt; regenerate; FLUF test
- Repeat until happy with results and zero FLUF
- Combine AI Results & Human Creativity and Critique to Generate a Final Product
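The steps above form a loop: generate, FLUF test, update the prompt, and repeat. The sketch below shows that control flow under stated assumptions. The `fluf_loop` function and the three stand-in functions are hypothetical; in practice, generation is done by your AI tool and the evaluation by a human reviewer using the rubric.

```python
from typing import Callable, List, Tuple


def fluf_loop(prompt: str,
              generate: Callable[[str], str],
              evaluate: Callable[[str], List[str]],
              revise: Callable[[str, List[str]], str],
              max_rounds: int = 5) -> Tuple[str, List[str]]:
    """Repeat generate -> FLUF test -> update prompt until zero pluses or max_rounds."""
    result = ""
    pluses: List[str] = []
    for _ in range(max_rounds):
        result = generate(prompt)
        pluses = evaluate(result)   # list of indicators that earned a plus
        if not pluses:              # zero FLUF reached
            break
        prompt = revise(prompt, pluses)
    return result, pluses


# Stand-ins so the sketch runs without an AI tool or a human reviewer.
def fake_generate(prompt: str) -> str:
    # Writes a long, padded draft unless the prompt asks for brevity.
    return "short answer" if "brief" in prompt else "very " * 50 + "long answer"


def fake_evaluate(text: str) -> List[str]:
    # Gives a Length plus when the draft runs past 20 words.
    return ["Length"] if len(text.split()) > 20 else []


def fake_revise(prompt: str, pluses: List[str]) -> str:
    # Tightens the prompt in response to a Length plus.
    if "Length" in pluses:
        return prompt + " Keep it brief."
    return prompt


final, remaining = fluf_loop("Explain photosynthesis.", fake_generate,
                             fake_evaluate, fake_revise)
print(final)      # short answer
print(remaining)  # []
```

The loop stops at zero FLUF or after `max_rounds` attempts. The last step in the list, combining the AI results with human creativity and critique, stays outside the loop because it is the reviewer's own work.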
Underpinnings for this Work
There are currently no critical evaluation frameworks around artificial intelligence. Drawing on a background in library science, I centered the literature review on information literacy and frameworks for critical evaluation of online information.
Here are some of the underpinnings for this work:
The 80/20 Rule (Pareto Principle)
- The balance between human insight and technological capability
- 80% research and regeneration of online sources
- 20% human critique, creativity, and culmination to create a final output
Frameworks for Critical Evaluation of Online Resources
- CRAAP (Blakeslee, 2004) - currency, relevance, authority, accuracy, purpose
- CARRDSS (Valenza, 2004) - credibility, accuracy, reliability, relevance, date, sources, scope
- SIFT (Caulfield, 2019) - stop, investigate, find, trace
- 5 Key Questions (Thorman & Jolls, 2003) - creator, techniques, perceptions, bias, purpose
The FLUF Test is presented to encourage users to expand their media and information literacy and to extend their critical evaluation to all information, no matter how it is obtained. For an overview of the rationale leading to the FLUF Test framework and a detailed walkthrough of the FLUF experience, view the presentation.
For more information about the FLUF Test, templates, or a consultation, contact me.