Content provided by Philip Mastroianni. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Philip Mastroianni or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.

AI LLM Prompting Tests - My Results on Prompt Engineering

10:43
 

Manage episode 377209197 series 3311112


I discuss my experience testing prompts across different AI systems, including Google Bard, OpenAI GPT-4 and GPT-3.5, Anthropic Claude 2, Meta Llama 2, and Jasper, to generate location-specific content. Most of this draws on the last 18 months of building out prompts, now tested against models released in the last 4-6 weeks.


Google Bard

  • Released major update on July 13, 2023
  • Prompt strategy: Long paragraphs, numbered tasks, multiple iterations
  • Could not produce high-quality content without heavy editing
  • Struggled to follow instructions and needed frequent reminders


OpenAI GPT-4

  • Works well with a conversational, transcribed prompt
  • Follows directions and produces high-quality content
  • No few-shot prompting needed

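The zero-shot, conversational style described above can be sketched as a plain chat-format request: one system message framing the task and one user message written the way you would speak, with no examples. The wording, location, and service here are illustrative placeholders, not the author's actual prompt.

```python
def build_messages(city: str, service: str) -> list[dict]:
    """Assemble a chat-format request: a system message with the task
    framing and a single conversational user message, no examples."""
    system = (
        "You are an experienced local copywriter. Write accurate, "
        "location-specific content and follow formatting instructions exactly."
    )
    user = (
        f"I'm putting together a service page for {service} in {city}. "
        "Walk the reader through what makes this area unique, mention "
        "nearby neighborhoods naturally, and keep the tone friendly. "
        "Return three paragraphs followed by a bulleted FAQ."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

messages = build_messages("Portland, OR", "roof repair")
```

The messages list can then be passed to whichever chat-completion API you use.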

OpenAI GPT-3.5

  • Uses a revised GPT-4 prompt plus a follow-up prompt to enforce formatting
  • Content is production-ready after the second prompt
  • Quality is close to GPT-4 when additional data/content is provided

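The GPT-3.5 workflow above is a two-turn pattern: run the content prompt first, then send a short follow-up that enforces the output format. A minimal sketch, where `model_reply` is a hypothetical stand-in for whatever chat API call you use and the format instructions are illustrative:

```python
FORMAT_FIXUP = (
    "Reformat your previous answer: use H2 headings for each section, "
    "keep paragraphs under 60 words, and output valid Markdown only."
)

def two_pass(conversation: list[dict], model_reply) -> str:
    """First pass generates the content; second pass enforces formatting."""
    draft = model_reply(conversation)  # pass 1: content generation
    conversation = conversation + [
        {"role": "assistant", "content": draft},
        {"role": "user", "content": FORMAT_FIXUP},
    ]
    return model_reply(conversation)   # pass 2: formatting cleanup
```

Because the draft is fed back as an assistant turn, the second prompt only has to fix presentation, not regenerate the content.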

Anthropic Claude 2

  • No API access yet, so testing used the text interface
  • Required significantly revising the prompt structure
  • Wrapping data types in XML tags improves context handling
  • Built-in prompt diagnosis/suggestions are helpful
  • A single prompt can produce high-quality output

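One way to apply the XML-tagging idea: wrap each kind of input data in a named tag so the model can tell instructions, reference data, and style rules apart. The tag names and contents below are illustrative, not the author's actual prompt.

```python
def xml_wrap(tag: str, body: str) -> str:
    """Wrap a block of prompt data in a named XML-style tag."""
    return f"<{tag}>\n{body}\n</{tag}>"

prompt = "\n\n".join([
    xml_wrap("instructions", "Write a 300-word overview of the city below."),
    xml_wrap("city_data", "Name: Austin, TX\nPopulation: ~979,000\nKnown for: live music"),
    xml_wrap("style_guide", "Second person, no superlatives, US spelling."),
])
```

Keeping each data type in its own tag also makes it easy to swap in new city data without rewriting the rest of the prompt.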

Meta Llama 2

  • Free for commercial use if you have the hardware to run it
  • Expected behavior similar to GPT-3.5
  • The GPT-4 prompt worked well
  • Quality is closer to GPT-3.5, but self-hosting offers better privacy
  • Could be refined with prompt chaining
  • Struggled to follow instructions precisely

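"Prompt chaining" here means feeding each step's output into the next prompt, so a weaker model can refine its own draft. A sketch under stated assumptions: `model` is a hypothetical callable standing in for a Llama 2 call, and the step templates are illustrative.

```python
def run_chain(seed: str, steps: list[str], model) -> str:
    """Apply each prompt template in order, inserting the previous output."""
    text = seed
    for template in steps:
        text = model(template.format(previous=text))
    return text

steps = [
    "List the key facts about this location:\n{previous}",
    "Draft a short article using only these facts:\n{previous}",
    "Tighten the draft and fix any instruction drift:\n{previous}",
]
```

A final cleanup step like the last one is where chaining helps most with a model that drifts from instructions.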

Jasper API

  • API access is useful for building AI tools
  • Supports long prompts
  • Appears to use GPT-4 or a variant of it
  • Zero-shot performance on par with GPT-4
  • Produces high-quality content easily


Conclusion

  • GPT-4 and Jasper produce quality results most easily
  • Pleasantly surprised by Claude 2's quality and its prompt-formatting guidance
  • Llama 2 needs refinement to reach GPT-4's level
  • Curious which prompt strategies transfer across models

Full show notes: https://opinionatedseo.com/2023/07/ai-prompting/


40 episodes
