Using Tweepy to scrape tweets from Twitter using the API: complete
Twitter developer account (skip to the end for the scraper and logger)
To begin with, you will need a Twitter developer account. Don't worry: the account is free and easily available for personal use and research purposes. The developer link can be hard to find, but here it is: twitter developer.
Next, make a new application, filling in your name, description, website, agree to their terms, do the captcha, and create the application.
Once submitted successfully, you should be presented with a page where you can see your consumer key and consumer secret. Now you need an access token, so scroll down and click on "create my access token."
After a few moments, refresh, and you should be able to see the access token and access token secret. Once you have those, you're going to need to get Tweepy, which is a Python module for streaming Twitter tweets.
1. Sign in to Twitter.
2. Create a developer account.
3. Create an application.
4. Go to Tokens and Keys and generate a token and key for your application.
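Once the four values from step 4 exist, one option is to keep them out of the script itself and read them from environment variables. The sketch below is only an illustration of that idea: the variable names (TWITTER_CKEY and so on) and the load_credentials helper are made up for this example and are not part of Tweepy or Twitter.

```python
import os

# Hypothetical environment variable names; any names work as long as
# the shell and the script agree on them.
REQUIRED = ("TWITTER_CKEY", "TWITTER_CSECRET", "TWITTER_ATOKEN", "TWITTER_ASECRET")


def load_credentials(env=os.environ):
    """Return (ckey, csecret, atoken, asecret) read from a mapping such as os.environ."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        # Fail early with a readable message instead of a later auth error.
        raise KeyError("missing credentials: " + ", ".join(missing))
    return tuple(env[name] for name in REQUIRED)
```

With the variables set in your shell, ckey, csecret, atoken, asecret = load_credentials() would give the same four names used in the scripts further down.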
Install tweepy
OK, if you have used pip before, just open cmd and type:
>>pip install tweepy
If you have not used pip yet and want a more traditional approach:
1.Browse https://github.com/tweepy/tweepy .
2.Download and extract the zip to, say, a folder called tweepy.
3.Open cmd and change to that specific folder.
4.Run python setup.py install.
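Whichever route you take, a quick way to check that the install worked is to ask Python whether it can find the module. This is a small sketch using only the standard library; is_installed is a helper name invented for this example.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate a top-level module with this name."""
    return importlib.util.find_spec(module_name) is not None


# Prints True once tweepy is available to this Python interpreter.
print("tweepy installed:", is_installed("tweepy"))
```

If it prints False, the install most likely went to a different Python installation than the one running the script.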
Code 1: Print tweets to the console
# Written for Tweepy 3.x: StreamListener was removed in Tweepy 4.0.
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener

# ckey is the consumer key, csecret is the consumer secret,
# atoken is the access token, asecret is the access token secret.
# You will find your keys in the Tokens and Keys subsection of your Twitter app page.
ckey = "fsdfasdfsafsfff"
csecret = "asdfsadfsaddsadf"
atoken = "asdf-aassdft"
asecret = "asdfsadfsdwfsdafs"


class listener(StreamListener):
    def on_data(self, data):
        # data is the raw JSON string for each incoming message.
        print(data)
        return True

    def on_error(self, status):
        # status is the HTTP status code returned by Twitter.
        print(status)


auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)
twitterStream = Stream(auth, listener())
# Stream tweets that mention "dogs"; this call blocks until you interrupt it.
twitterStream.filter(track=["dogs"])
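The on_error method above only prints the status code. In Tweepy 3.x, returning False from on_error disconnects the stream, and Twitter's streaming API uses status 420 to signal rate limiting, so a common pattern is to stop on 420 rather than keep retrying. Here is a minimal sketch of that decision as a standalone function; keep_streaming is a hypothetical helper name, not something Tweepy provides.

```python
# Status code the streaming API returns when a client is being rate limited.
RATE_LIMITED = 420


def keep_streaming(status_code):
    """Return False to disconnect on rate limiting, True to let the stream carry on."""
    if status_code == RATE_LIMITED:
        return False
    return True


# Inside a listener it could be used like this (assumes Tweepy 3.x):
#     def on_error(self, status):
#         print(status)
#         return keep_streaming(status)
```

Stopping on 420 is the cautious choice, since repeated reconnect attempts while rate limited can lengthen the wait.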
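Code 2: Scrape and log to a text file
# Written for Tweepy 3.x: StreamListener was removed in Tweepy 4.0.
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import json

# Placeholder credentials; replace them with the keys from your Twitter app page.
ckey = "fsdfasdfsafsfff"
csecret = "asdfsadfsaddsadf"
atoken = "asdf-aassdft"
asecret = "asdfsadfsdwfsdafs"

LOG_FILE = "logga.txt"

# Start with an empty log file on every run.
open(LOG_FILE, "w", encoding="utf-8").close()


class listener(StreamListener):
    def on_data(self, data):
        all_data = json.loads(data)
        # Skip messages that are not tweets (they have no "text" field).
        if "text" not in all_data:
            return True
        tweet = all_data["text"]
        username = all_data["user"]["screen_name"]
        # Append one "username,tweet" record per line; the file is closed automatically.
        with open(LOG_FILE, "a", encoding="utf-8") as f:
            f.write("{},{}\n".format(username, tweet))
        print(username)
        return True

    def on_error(self, status):
        print(status)


auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)
twitterStream = Stream(auth, listener())
twitterStream.filter(track=["dogs"])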
#Run the code either from cmd or from any Python console you use.
#The code is for Python 3; some edits are needed to make it Python 2 compatible.
#You will have a file called logga.txt with all the tweet logs for dogs in the directory your console is running in.
#The code will do the trick. More stuff on scraping will be coming soon!
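One thing to watch with the username,tweet format: tweets often contain commas and line breaks, which makes the log hard to split back into records. A possible alternative, shown here only as a sketch, is to let Python's csv module handle the quoting. The helper names append_record and read_records are invented for this example, and nothing here talks to Twitter.

```python
import csv


def append_record(path, username, tweet):
    """Append one (username, tweet) row; csv quotes commas and newlines for us."""
    with open(path, "a", encoding="utf-8", newline="") as f:
        csv.writer(f).writerow([username, tweet])


def read_records(path):
    """Read the log back as a list of (username, tweet) tuples."""
    with open(path, encoding="utf-8", newline="") as f:
        return [tuple(row) for row in csv.reader(f)]
```

In the logger, the f.write(...) line in on_data could be swapped for append_record("logga.txt", username, tweet), and read_records("logga.txt") would then return the rows for later analysis.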