Twitter

Sarah New edited this page May 21, 2018 · 17 revisions

Overview

The Twitter layer contains tweets related to opioid use and recovery. Tweets about opioid use were collected with the Python Twitter Tools API (https://pypi.org/project/twitter/) and stored in JSON files.
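As a rough illustration of the collect-and-store step, here is a minimal sketch. The function name `save_search`, the output path, and the `count` default are hypothetical and not taken from the original scripts. It assumes the search callable returns a dict with a "statuses" list, which is the shape of the Twitter v1.1 search response.

```python
import json


def save_search(search, query, out_path, count=100):
    """Run one search and write the returned tweets to a JSON file.

    `search` is any callable that accepts q= and count= and returns a dict
    with a "statuses" list (assumed response shape).
    """
    response = search(q=query, count=count)
    tweets = response.get("statuses", [])
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(tweets, f, ensure_ascii=False, indent=2)
    return len(tweets)


# With Python Twitter Tools the callable would typically be built like this
# (all credentials are placeholders):
#   from twitter import Twitter, OAuth
#   api = Twitter(auth=OAuth(token, token_secret, consumer_key, consumer_secret))
#   save_search(api.search.tweets, query, "tweets.json")
```

Passing the search callable in as an argument keeps the file-writing logic separate from authentication, so it can be tried without live credentials.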

First Set of Tweets

We developed several sets of keywords for search queries.

Opioid use and behavior

  • need OR want OR needing OR wanting
  • too many OR two OR three OR double OR too much OR overdose OR crash OR strong enough OR max
  • pop OR popping OR not enough OR another OR popped
  • buy OR sell OR trade OR share OR spend OR bring OR steal

Street names of opioids

  • Oxys OR OxyCotton OR oxy OR roxies OR roxys OR oxycotin OR lortab OR tabbers OR Hydros OR Perc OR Percs OR Ercs OR Greenies OR dillies OR painkiller OR painkillers OR pain killer OR pain killers OR pain pills OR pain pill OR smack OR dope OR skag

Because the Python Twitter Tools API appears to have a character limit for queries, the street names list was broken into two sections when running queries. Each set of 'behavior' keywords was run with each section of the street names list, like so:

  • (need OR want OR needing OR wanting) AND (Oxys OR OxyCotton OR oxy OR roxies OR roxys OR oxycotin OR lortab OR tabbers OR Hydros OR Perc OR Percs OR Ercs)
  • (need OR want OR needing OR wanting) AND (Greenies OR dillies OR painkiller OR painkillers OR pain killer OR pain killers OR pills OR smack OR skag)
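The pairing of behavior keywords with street-name sections can be sketched in a few lines. The helper names `or_group` and `build_queries` are hypothetical. Only two of the behavior sets are shown, and the street-name sections are copied from the example queries above.

```python
from itertools import product

# Two of the behavior keyword sets from the lists above.
BEHAVIOR_SETS = [
    ["need", "want", "needing", "wanting"],
    ["pop", "popping", "not enough", "another", "popped"],
]

# The street names list, split into the two sections used in the example queries.
STREET_NAME_SECTIONS = [
    ["Oxys", "OxyCotton", "oxy", "roxies", "roxys", "oxycotin",
     "lortab", "tabbers", "Hydros", "Perc", "Percs", "Ercs"],
    ["Greenies", "dillies", "painkiller", "painkillers", "pain killer",
     "pain killers", "pills", "smack", "skag"],
]


def or_group(terms):
    """Join terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_queries(behavior_sets, name_sections):
    """Pair every behavior set with every street-name section."""
    return [
        f"{or_group(behavior)} AND {or_group(names)}"
        for behavior, names in product(behavior_sets, name_sections)
    ]


queries = build_queries(BEHAVIOR_SETS, STREET_NAME_SECTIONS)
for q in queries:
    print(len(q), q)  # the length shows how close each query is to a limit
```

Printing the length of each query makes it easy to see whether a section needs to be split further.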

Second Set of Tweets

More keywords were generated.

For tweets using the pharmaceutical names

  • need OR want OR max OR too many OR pop OR popping OR not enough OR another OR buy OR sell OR trade OR share OR spend OR bring OR steal OR popped
  • opiod OR opioids OR narcan OR opiates OR suboxone OR methadone OR hydromorphone OR dilaudid OR overdose

For tweets around rehabilitation and recovery

  • overdoes OR opioidcrisis OR recovery OR formeraddict OR relapse OR safeinjectionsites OR dependency

Geocoding

To geocode the Twitter layer, a Python script was written that loops over the tweets and uses an if/elif chain to choose a location source for each one. The script adds some random variability to the city location of each tweet to prevent tweets from being stacked on top of each other.

For each tweet in the JSON files, the loop first checks whether the tweet already has precise coordinates. If not, it looks for an identified city and uses the city's existing bounding box to generate a random but central coordinate for that city. If no city is identified, it uses the user's self-identified profile location with geopy's geocoding function to get coordinates. Finally, if no location data is available, it generates a random coordinate in Maryland.
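The fallback order described above can be sketched as follows. This is not the original script. The function names, the `spread` parameter that keeps city points near the center, and the Maryland bounding box values are all assumptions made for illustration. The field names follow the standard tweet JSON layout (`coordinates`, `place.bounding_box`, `user.location`). The geocoder is passed in as a callable; with geopy this could be `Nominatim(user_agent="...").geocode`.

```python
import random

# Approximate Maryland bounding box (assumed values, not from the original script).
MARYLAND_BBOX = (-79.49, 37.89, -75.05, 39.72)  # (min_lon, min_lat, max_lon, max_lat)


def random_point(bbox, spread=1.0, rng=random):
    """Return a random (lat, lon) in bbox; spread < 1 keeps it near the center."""
    min_lon, min_lat, max_lon, max_lat = bbox
    center_lon = (min_lon + max_lon) / 2
    center_lat = (min_lat + max_lat) / 2
    half_w = (max_lon - min_lon) / 2 * spread
    half_h = (max_lat - min_lat) / 2 * spread
    return (rng.uniform(center_lat - half_h, center_lat + half_h),
            rng.uniform(center_lon - half_w, center_lon + half_w))


def place_bbox(place):
    """Flatten a place bounding box into (min_lon, min_lat, max_lon, max_lat)."""
    ring = place["bounding_box"]["coordinates"][0]
    lons = [pt[0] for pt in ring]
    lats = [pt[1] for pt in ring]
    return (min(lons), min(lats), max(lons), max(lats))


def locate(tweet, geocode=None, rng=random):
    """Return (lat, lon, source) for one tweet, trying the best source first."""
    place = tweet.get("place")
    profile = (tweet.get("user") or {}).get("location")
    if tweet.get("coordinates"):
        lon, lat = tweet["coordinates"]["coordinates"]  # GeoJSON order is lon, lat
        return lat, lon, "exact"
    elif place and place.get("place_type") == "city" and place.get("bounding_box"):
        lat, lon = random_point(place_bbox(place), spread=0.5, rng=rng)
        return lat, lon, "city"
    elif profile and geocode is not None:
        hit = geocode(profile)  # geopy returns None when nothing matches
        if hit is not None:
            return hit.latitude, hit.longitude, "profile"
    # No usable location data: fall back to a random point in Maryland.
    lat, lon = random_point(MARYLAND_BBOX, rng=rng)
    return lat, lon, "fallback"
```

Returning the source label alongside the coordinates makes it possible to tell precise points from randomly generated ones when the layer is mapped.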
