Lessons

Auto Tweeting with OpenAI GPT-3 & Python

A few weeks ago there was a lot of internet attention around GPT-3, which had recently opened for beta. Generative Pre-trained Transformer 3 (GPT-3) is an autoregressive language model that uses deep learning to produce human-like text. It is the third-generation language prediction model in the GPT-n series created by OpenAI, a for-profit San Francisco-based artificial intelligence research laboratory. [1]

I spent a few minutes using the playground they offer and generating text output. One of my early ideas was to create new Tweets from historic Tweets, so I ended up writing a short program to fetch Tweets from a series of accounts. This lesson will be less structured than usual and more of a blog post on how to play with GPT-3. I am going to show you how to fetch Twitter data and generate new tweets using the GPT-3 playground. I am assuming you are in the OpenAI beta program, you have a Twitter developer application, and you have Python installed on your computer.

Twitter App Setup

Head over to developer.twitter.com, then to the apps dashboard. If you do not have an app, you will need to create one. Hit "Details" on the app.

Screenshot of the Twitter apps dashboard

You are going to need to head over to "Keys and Tokens". You need the consumer API key and the consumer API secret key.

Screenshot of the app's Keys and Tokens page

Write those secrets down in a safe space. You will need them a little later.

Python Script for Gathering Tweets

I'm going to provide the full script and explain it using reference numbers. Create a file called get-tweets.py:

# pip3 install twitter-application-only-auth
import json
import os
import random
import re
import sys
import time

from application_only_auth import Client


def write_to_file(path, content):
    with open(path, 'w+') as fp:
        fp.write(content)


def append_to_file(path, content):
    with open(path, 'a') as fp:
        fp.write(content)


def get_json(path):
    with open(path, 'r') as fp:
        return json.load(fp)


def write_to_json(path, data):
    with open(path, 'w+') as fp:
        json.dump(data, fp, sort_keys=True, indent=4)


def main():
    # Ref 1
    # To register an app visit https://dev.twitter.com/apps/new
    CONSUMER_KEY = os.getenv('CONSUMER_KEY', None)
    CONSUMER_SECRET = os.getenv('CONSUMER_SECRET', None)

    if CONSUMER_KEY is None or CONSUMER_SECRET is None:
        print('Error: CONSUMER_KEY or CONSUMER_SECRET is not set.')
        sys.exit(1)

    client = Client(CONSUMER_KEY, CONSUMER_SECRET)

    # Ref 2
    # Pretty print of a tweet payload
    # tweet = client.request('https://api.twitter.com/1.1/statuses/show.json?id=316683059296624640')
    # print(json.dumps(tweet, sort_keys=True, indent=4, separators=(',', ':')))

    screen_names_to_strip_urls = ['raydalio']  # Ref 3
    screen_names = ['wealth_theory']  # Ref 4

    for screen_name in screen_names:
        print('-------------------------------------------------------')
        print('----------- Processing :: [' + screen_name + '] ------')
        print('-------------------------------------------------------')

        # Ref 5
        historic_path = './tweets/' + screen_name + '.json'
        historic_data = {}
        since_id = None
        tweets_as_strs = []
        tweet_data = []
        tweetIDs = []

        # Ref 6
        if os.path.exists(historic_path):
            historic_data = get_json(historic_path)
            # Generates a random number between 0 and 2. If 0, fetches live data.
            if random.randint(0, 2) != 0:
                since_id = historic_data.get('lastTweetID', None)
            tweet_data = historic_data.get('tweetData', [])
            tweetIDs = historic_data.get('tweetIDs', [])

        for i in range(0, 5):  # Ref 7
            url = ('https://api.twitter.com/1.1/statuses/user_timeline.json'
                   '?screen_name=' + screen_name +
                   '&exclude_replies=true&include_rts=false')
            if since_id is not None:
                url += '&max_id=' + since_id  # Ref 8

            tweets = client.request(url)
            # print(json.dumps(tweets, sort_keys=True, indent=4, separators=(',', ':')))

            if len(tweets) == 0:
                break

            for tweet in tweets:
                tweet_text = tweet.get('text')
                tweet_id = tweet.get('id_str')

                if screen_name in screen_names_to_strip_urls:  # Ref 9
                    tweet_text = re.sub(r'^https?:\/\/.*[\r\n]*', '', tweet_text, flags=re.MULTILINE)

                # Ref 10
                if (not tweet.get('retweeted')
                        and '@' not in tweet_text
                        and 'http://' not in tweet_text
                        and 'https://' not in tweet_text
                        and 'happy new year' not in tweet_text.lower()
                        and tweet_id not in tweetIDs):
                    if tweet_text not in tweets_as_strs:
                        print(tweet_text)
                        print('')
                        tweets_as_strs.append(tweet_text)
                        tweetIDs.append(tweet_id)
                        formatted_tweet_text = tweet_text.replace('\'', '')
                        formatted_tweet_text = formatted_tweet_text.replace('"', '')
                        tweet_data.append({
                            'id': tweet_id,
                            'text': formatted_tweet_text
                        })
                since_id = tweet_id

            # Show rate limit status for this application
            status = client.rate_limit_status()
            print(status['resources']['search'])
            time.sleep(5)  # Ref 11

        # Save historic data
        historic_data['tweetIDs'] = tweetIDs
        historic_data['tweetData'] = tweet_data
        historic_data['lastTweetID'] = since_id
        write_to_json(historic_path, historic_data)  # Ref 12

        # Prep file of tweets
        raw_tweets_path = './tweets/' + screen_name + '.txt'
        if not os.path.exists(raw_tweets_path):
            write_to_file(raw_tweets_path, '')

        # Append new tweets to the file
        append_to_file(raw_tweets_path, '\n'.join(tweets_as_strs))


main()

Ref 1 :: Grab the CONSUMER_KEY and CONSUMER_SECRET from the environment variables. You will set these as you run the script or via a .env file. You would run it like this:

CONSUMER_KEY=<CONSUMER_API_KEY> CONSUMER_SECRET=<CONSUMER_API_SECRET_KEY> python3 get-tweets.py
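If you go the .env route instead, a minimal sketch using the python-dotenv package (my assumption; any .env loader works) would look like this:

# pip3 install python-dotenv
# Hypothetical .env file in the project root:
#   CONSUMER_KEY=<CONSUMER_API_KEY>
#   CONSUMER_SECRET=<CONSUMER_API_SECRET_KEY>
import os

from dotenv import load_dotenv

load_dotenv()  # loads the .env file into the process environment

CONSUMER_KEY = os.getenv('CONSUMER_KEY', None)
CONSUMER_SECRET = os.getenv('CONSUMER_SECRET', None)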

Ref 2 :: We will use a Twitter client library. You can install it using pip3 install twitter-application-only-auth.
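The library handles Twitter's application-only bearer token flow for you. As a quick sanity check, you can fetch a single tweet, which is the same call that is commented out in the script at Ref 2:

import json
import os

from application_only_auth import Client

client = Client(os.getenv('CONSUMER_KEY'), os.getenv('CONSUMER_SECRET'))

# Fetch one tweet and pretty print the payload
tweet = client.request('https://api.twitter.com/1.1/statuses/show.json?id=316683059296624640')
print(json.dumps(tweet, sort_keys=True, indent=4, separators=(',', ':')))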

Ref 3 :: This was an additional list I added. Ray Dalio almost always includes a URL in his Tweets; this list marks accounts whose Tweets should have URLs stripped.
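To see what the re.sub call does with that list, here is a quick illustration with a made-up tweet:

import re

# Hypothetical tweet with a link on its own line
tweet_text = 'https://t.co/abc123\nPrinciples are ways of successfully dealing with reality.'

# Removes any line that starts with http:// or https://
stripped = re.sub(r'^https?:\/\/.*[\r\n]*', '', tweet_text, flags=re.MULTILINE)
print(stripped)  # Principles are ways of successfully dealing with reality.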

Ref 4 :: This is the list of Twitter users whose tweets you want to retrieve.

Ref 5 :: The path for historic Tweets.

Ref 6 :: Grab the historic tweets and find the last Tweet fetched. This will be used to avoid retrieving the same Tweets twice.
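For reference, the historic file ends up shaped roughly like this (the ID and text are made up):

# Example contents of ./tweets/wealth_theory.json
{
    "lastTweetID": "1285000000000000000",
    "tweetData": [
        {
            "id": "1285000000000000000",
            "text": "Cash flow is the foundation of a strong financial plan."
        }
    ],
    "tweetIDs": [
        "1285000000000000000"
    ]
}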

Ref 7 :: We will go through 5 pages worth of Tweets.

Ref 8 :: This is where we pass the last Tweet ID as max_id so each request pages backwards through the timeline. Note that max_id is inclusive, so the oldest Tweet from the previous page comes back again; the tweetIDs check filters out the duplicate.

Ref 9 :: This is the check to determine if we need to remove a URL from the Tweet.

Ref 10 :: At this point, we filter out retweets, mentions, tweets with URLs, tweets we have already seen, and holiday greetings like "happy new year".
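Pulled out on its own, that condition amounts to a predicate like this (a restatement for readability, not a separate function in the actual script):

def is_usable_tweet(tweet, tweet_text, seen_ids):
    # Skip retweets, mentions, links, holiday greetings, and already-seen IDs
    return (not tweet.get('retweeted')
            and '@' not in tweet_text
            and 'http://' not in tweet_text
            and 'https://' not in tweet_text
            and 'happy new year' not in tweet_text.lower()
            and tweet.get('id_str') not in seen_ids)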

Ref 11 :: We added a 5 second sleep to help avoid throttling. This process doesn't require an immediate response, so we can let it gather tweets over a few minutes.
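If you want to be more precise than a flat sleep, the same client exposes Twitter's rate limit status, so you could sleep only when the window is exhausted. A sketch, assuming the standard v1.1 rate limit payload:

import time

status = client.rate_limit_status()
window = status['resources']['statuses']['/statuses/user_timeline']
if window['remaining'] == 0:
    # 'reset' is a Unix timestamp for when the rate limit window reopens
    time.sleep(max(0, window['reset'] - time.time()) + 1)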

Ref 12 :: We will save a file with all Tweets found. This is to avoid duplicate work.

Running

Time to run the code.

mkdir tweets
pip3 install twitter-application-only-auth
CONSUMER_KEY=<CONSUMER_API_KEY> CONSUMER_SECRET=<CONSUMER_API_SECRET_KEY> python3 get-tweets.py

Screenshot of the script running in the terminal

Screenshot of fetched tweets printing in the terminal

Screenshot of the new files in the tweets folder

The output on the terminal will be a little unclear; it mostly tells you the state and progress. After it completes, check the tweets folder. For this example, there will be two new files: tweets/wealth_theory.json and tweets/wealth_theory.txt. We will put tweets/wealth_theory.txt into the GPT-3 playground. The nice aspect of this script is that it grabs 5 pages worth of Tweets; if you want more, you can just run it again. The script will use the historic data as a starting point and continue working backwards from there. You can also increase the number of pages.

GPT-3 Playground

Start by signing into your OpenAI beta account.

Screenshot of the OpenAI beta sign-in page

Screenshot of the OpenAI beta dashboard

In the top navigation bar, you will see Playground. Click it to open the playground page.

Screenshot of the GPT-3 Playground page

Next, open the tweets text output (tweets/wealth_theory.txt) and copy it into the playground.

Screenshot of the tweets pasted into the playground

Hit Submit. You may see a content warning like the one below.

Screenshot of the content warning

You should see new tweets generated from the existing data.

Screenshot of the generated tweets in the playground

With the Wealth Theory tweets, it created these three potential tweets:

If you are not using leverage, you are missing out on the wealth creation.

A financial plan is a good idea for anyone.

Cash flow is the foundation of a strong financial plan. The ability to convert cash flow into other forms of wealth is the key to moving up on the wealth

You can find more details about GPT-3 in the introduction docs. For example, davinci is the model we picked. Since I started playing around with GPT-3 in July, they've added sliders to the side of the playground. If you hover over them, they provide more information.

Screenshot of the playground settings sliders
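If you would rather call the API than paste into the playground, here is a sketch using OpenAI's Python client against the beta-era completion endpoint. The file path is the one our script produced; the parameter values are just a starting point:

# pip3 install openai
import openai

openai.api_key = 'YOUR_OPENAI_API_KEY'

# Use the gathered tweets as the prompt
with open('./tweets/wealth_theory.txt', 'r') as fp:
    prompt = fp.read()

response = openai.Completion.create(
    engine='davinci',   # the model mentioned above
    prompt=prompt,
    max_tokens=64,
    temperature=0.9
)
print(response['choices'][0]['text'])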

Conclusion

Thanks for reading! This was fairly high level and meant to show you a really simple example. As mentioned in the introduction, I just wanted to show how to fetch the Twitter data and how to plug it into the playground.

Source

  • https://en.wikipedia.org/wiki/GPT-3