
Checking Your Daily Spending via SMS with Python, Plaid and Twilio


Your bank may let you set up SMS alerts for various triggers. It might even give you the option of receiving periodic spending summaries via SMS (mine doesn’t though!). But what about a daily SMS summary of your spending across all your accounts? That summary is harder to come by, but thankfully you can roll your own by combining Plaid, an easy-to-use financial services API, with Twilio SMS and a bit of Python 3. Let’s get going!

Setting Up

We’ll begin creating our app by lining up some of the basic building blocks. Here’s all the code in one place for those following along at home who want to save some typing.

Start by nabbing a sandbox account from Plaid and putting your credentials into the PLAID_CLIENT_ID, PLAID_SECRET, and PLAID_PUBLIC_KEY environment variables. While you’re at it, ask for access to the development API (it’ll take a few days to get approved). For now, though, create an environment variable named PLAID_ENV and set it to ‘sandbox’. This will give us some sample data to work with.

export PLAID_CLIENT_ID='somechars1234'
export PLAID_PUBLIC_KEY='somemorechars1234'
export PLAID_SECRET='somesecretchars1234'
export PLAID_ENV='sandbox'

Next, clone the Plaid quickstart, install its dependencies with the pip install -r requirements.txt command, then edit the server.py code to include those environment variables.

PLAID_CLIENT_ID = os.getenv('PLAID_CLIENT_ID')
PLAID_SECRET = os.getenv('PLAID_SECRET')
PLAID_PUBLIC_KEY = os.getenv('PLAID_PUBLIC_KEY')
PLAID_ENV = os.getenv('PLAID_ENV')

Run server.py with the python server.py command (unfortunately this part of the example code only works with Python 2 😦) and open http://127.0.0.1:5000 in your browser.


Click “open link”, then log into Chase with the test credentials (“user_good” and “pass_good” as of 5/26/2017) and the application will print an access token to your terminal window. 


Grab the token and put it into a CHASE_ACCESS_TOKEN environment variable. Repeat this for Bank of America and put that access token into BOFA_ACCESS_TOKEN:

export CHASE_ACCESS_TOKEN='access-sandbox-someprettysecretchars1234'
export BOFA_ACCESS_TOKEN='access-sandbox-somemoreprettysecretchars1234'

Grab your Twilio credentials from the Console (sign up for a free account if you don’t already have one), a Twilio phone number and your own phone number, then add those as environment variables.

export TWILIO_SID='somechars1234'
export TWILIO_TOKEN='somesecretchars1234'
export MY_TWILIO_NUM='+11111111111'
export MY_CELL='+12222222222'

Finally, install a couple of project dependencies:

$ pip install python-plaid twilio

With the basic building blocks in place we can start coding our app.

Obtaining Transactions

A quick way to start working with Plaid is to grab some account transactions and explore the result.  Make a new file called get_some_transactions.py.  In that file create a plaid.Client instance and a new function named get_some_transactions which accepts an access token and the start and end dates of the range of transactions you want to get. Inside the function, call the Plaid client’s Transactions.get function with those parameters. The following code will accomplish these steps.

import os
from pprint import pprint
from typing import List

# twilio.rest has a Client too, so let's avoid a namespace collision
from plaid import Client as PlaidClient

plaid_client = PlaidClient(client_id=os.getenv('PLAID_CLIENT_ID'), secret=os.getenv('PLAID_SECRET'),
                           public_key=os.getenv('PLAID_PUBLIC_KEY'), environment=os.getenv('PLAID_ENV'))


def get_some_transactions(access_token: str, start_date: str, end_date: str) -> List[dict]:
    return plaid_client.Transactions.get(access_token, start_date, end_date)

With that code in place we can start exploring what information Plaid returns. In the same script, call get_some_transactions by adding the following two lines to the end of the file. These pass in the Chase access token and a wide date range.

some_transactions = get_some_transactions(os.getenv('CHASE_ACCESS_TOKEN'), '1972-01-01', '2017-05-26')
print(f"there are {some_transactions['total_transactions']} total transactions between those dates.”)

Run the script using the python get_some_transactions.py command.

When run, this code outputs there are 338 total transactions between those dates. How many transactions were returned from our Plaid API call?

print(f"get_some_transactions returned {len(some_transactions['transactions'])} transactions.)

get_some_transactions returned 100 transactions.  Why only 100?  It seems that’s the default value for count, an optional parameter for Transactions.get, as seen in the Plaid API documentation.
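We could ask for a bigger page by passing count explicitly. This is just a quick sketch reusing the plaid_client from above; per the API documentation referenced later in this post, 500 is the maximum page size:

some_transactions = plaid_client.Transactions.get(os.getenv('CHASE_ACCESS_TOKEN'), '1972-01-01', '2017-05-26',
                                                  count=500)  # ask for up to 500 transactions per call
print(f"now we have {len(some_transactions['transactions'])} transactions.")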

What sort of data are in a transaction?

pprint(some_transactions['transactions'][0].keys())

This code outputs dict_keys(['account_id', 'account_owner', 'amount', 'category', 'category_id', 'date', 'location', 'name', 'payment_meta', 'pending', 'pending_transaction_id', 'transaction_id', 'transaction_type']).  For our purposes, amount seems to be all we need, but what about category?  We’re not building a full-fledged budget bot, but are there any transactions that would muddy the waters of our spending summary?

print({category
       for transaction in some_transactions['transactions'] if transaction['category'] is not None
       for category in transaction['category']})

This code gives us {'Food and Drink', 'Travel', 'Transfer', 'Airlines and Aviation Services', 'Payment', 'Credit Card', 'Coffee Shop', 'Fast Food', 'Restaurants', 'Deposit'}.  This will be useful for creating a choosier get_some_transactions function.  For example, I would argue that “Transfer” transactions don’t belong in our daily summaries, since they don’t qualify as spending. Before we refactor, though, let’s see what sort of accounts we’re dealing with, and whether there are any we should exclude.

pprint(some_transactions['accounts'])

Leaving out the less relevant fields, this execution results in:

[{'account_id': 'qwp96Z11b5IBKVMl8XvLSkJXjgj6ZxIXX3o79',
  'name': 'Plaid Checking',
  'subtype': 'checking'
  ...},
 {'account_id': 'Kk9ZL7NN4wSX3lR9evV8f9P4GVGk3BF33QnAM',
  'name': 'Plaid Saving',
  'subtype': 'savings'
  ...},
 {'account_id': 'rEy96MWWgXukrnBW4yVphv7yl3lznosBBzo6n',
  'name': 'Plaid CD',
  'subtype': 'cd'
  ...},
 {'account_id': '9rNKomMMdWTvVL4X9RP6UKb4qEqng1uJJ6nQw',
  'name': 'Plaid Credit Card',
  'subtype': 'credit'
  ...}]

There’s some low-hanging fruit here too: with any luck, we won’t be spending out of our savings or investment accounts — at worst, we’d be doing transfers — so let’s get refactoring!

Getting the Right Transactions

We know we want to exclude transactions with a category of “Transfer”.  “Credit Card”, “Payment” and “Deposit” aren’t going to be useful in gleaning spending activity either, so we’ll refactor our get_some_transactions function to skip transactions with those categories.  As stated earlier, we also want to skip accounts with a subtype of “savings” or “cd”.

While we’re at it, let’s also make sure to get all available transactions by using pagination, not just the default first 100, and home in on just the transactions item. Modify get_some_transactions.py with the following code (which can also be found in get_some_transactions.py on GitHub).

import math
import os
from pprint import pprint
from typing import List

# twilio.rest has a Client too, so let's avoid a namespace collision
from plaid import Client as PlaidClient

plaid_client = PlaidClient(client_id=os.getenv('PLAID_CLIENT_ID'), secret=os.getenv('PLAID_SECRET'),
                           public_key=os.getenv('PLAID_PUBLIC_KEY'), environment=os.getenv('PLAID_ENV'))


# https://plaid.com/docs/api/#transactions
MAX_TRANSACTIONS_PER_PAGE = 500
OMIT_CATEGORIES = ["Transfer", "Credit Card", "Deposit", "Payment"]
OMIT_ACCOUNT_SUBTYPES = ['cd', 'savings']


def get_some_transactions(access_token: str, start_date: str, end_date: str) -> List[dict]:
    account_ids = [account['account_id'] for account in plaid_client.Accounts.get(access_token)['accounts']
                   if account['subtype'] not in OMIT_ACCOUNT_SUBTYPES]

    num_available_transactions = plaid_client.Transactions.get(access_token, start_date, end_date,
                                                               account_ids=account_ids)['total_transactions']
    num_pages = math.ceil(num_available_transactions / MAX_TRANSACTIONS_PER_PAGE)
    transactions = []

    for page_num in range(num_pages):
        transactions += [transaction
                         for transaction in plaid_client.Transactions.get(access_token, start_date, end_date,
                                                                          account_ids=account_ids,
                                                                          offset=page_num * MAX_TRANSACTIONS_PER_PAGE,
                                                                          count=MAX_TRANSACTIONS_PER_PAGE)['transactions']
                         if transaction['category'] is None
                         or not any(category in OMIT_CATEGORIES
                                    for category in transaction['category'])]

    return transactions

some_transactions = get_some_transactions(os.getenv('CHASE_ACCESS_TOKEN'), '1972-01-01', '2017-05-26')
print(f"there are {len(some_transactions)} transactions")

Execute the code and it says that there are 265 transactions. Are any of them negative?

pprint([transaction for transaction in some_transactions if transaction['amount'] < 0])

Several are, in fact, negative:

[{'amount': -500,
  'category': ['Travel', 'Airlines and Aviation Services'],
  'name': 'United Airlines',
  'transaction_type': 'special',
  ...},
  ...]

Okay, that seems legit: an airfare refund, I guess. All the transactions with negative amounts are similar to this, so let’s keep them in.

Pulling It All Together

Now let’s get all the transactions from yesterday, making sure to pull them from both accounts.  Create a new file named get_yesterdays.py and add this code:

import datetime
import os
from typing import List

from get_some_transactions import get_some_transactions


def get_yesterdays_transactions() -> List[dict]:
    yesterday = ('2017-05-16' if os.getenv('PLAID_ENV') == 'sandbox'
                 else (datetime.date.today() - datetime.timedelta(days=1)).strftime('%Y-%m-%d'))

    transactions = []

    for access_id in [os.getenv('CHASE_ACCESS_TOKEN'), os.getenv('BOFA_ACCESS_TOKEN')]:
        transactions += get_some_transactions(access_id, yesterday, yesterday)

    return transactions

As of 5/26/2017, the most recent transactions available in these sandbox accounts are from 5/16/2017, hence the hardcoded yesterday value above.
Let’s send an SMS to ourselves with the total spent yesterday!  Create another new file named send_summary.py and add this code to it:

import os
from typing import List

from twilio.rest import Client as TwilioClient

from get_yesterdays import get_yesterdays_transactions

twilio_client = TwilioClient(os.getenv('TWILIO_SID'), os.getenv('TWILIO_TOKEN'))


def send_summary(transactions: List[dict]) -> None:
    total_spent = sum(transaction['amount'] for transaction in transactions)
    message = f'You spent ${total_spent} yesterday. 💸'
    twilio_client.api.account.messages.create(to=os.getenv('MY_CELL'), from_=os.getenv('MY_TWILIO_NUM'), body=message)


if __name__ == "__main__":
    send_summary(get_yesterdays_transactions())

Run the code using the python send_summary.py command and voila!


Wrapping It Up

We’ve now created an app that aggregates all spending across disparate credit and bank accounts, then pushes that total spend to our phone.  Here’s all the code in one place.  To deploy it, you could create a cron job somewhere to run it every day at a certain time, and never again have to deal with separate alerts/summaries from each spending account.
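For instance, a hypothetical crontab entry (the path below is a placeholder) could fire the summary every morning at 8:00. Keep in mind that cron doesn’t read your shell profile, so the environment variables we exported earlier would need to be defined in the crontab itself or in a wrapper script:

0 8 * * * python3 /path/to/send_summary.py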

But that’s just the tip of the iceberg of what you could build with this spending data. I don’t know about you, but having built this proof of concept makes me want a bot that tracks my spending, pinging me to categorize items it can’t categorize itself.  Spreadsheets are a blast and all, but do I need the unpaid part-time job of maintaining one?  Ditto for the existing web apps.

A simpler extension of this app would be to set it up for your grandma and see if you can build a meaningful voice interaction into it.

Put your ideas and questions in the comments below!



How I Hacked My University’s Registration System with Python and Twilio


University students know the pain of trying to register for a class only to realize it’s full. At my university we didn’t even have a waitlist system for most classes. We had to resort to logging in and checking the site multiple times a day. This seemed like something a computer could do, so I set out to automate it with a bit of Python and the Twilio API.

Getting Started

Because the university’s course registration system is behind a password login, we’re going to use a simplified site I set up. For the purposes of this demo, it alternates each minute between having no open seats in CS 101 and having one open seat.

We’re going to use a few libraries to help us with this project. Assuming you already have pip installed, go ahead and install them by running the following pip command:

pip install requests==2.17.3 beautifulsoup4==4.6.0 redis==2.10.5 twilio==6.3.0 Flask==0.12.2 

We’ll dive into using each one of these libraries as we get further along.

Scraping the Registration System

We need to write a program that can determine whether there are seats available in a given course. To do this, we’ll use a technique called web scraping, in which we download a page from the internet and find the important bits. Two popular libraries that make this easy are Requests and BeautifulSoup. Requests makes it easy to get a web page, and BeautifulSoup can help us find the parts of that page that are important to us.

# scraper.py
import requests
from bs4 import BeautifulSoup

URL = 'http://courses.project.samueltaylor.org/'
COURSE_NUM_NDX = 0
SEATS_NDX = 1

def get_open_seats():
    r = requests.get(URL)
    soup = BeautifulSoup(r.text, 'html.parser')
    courses = {}

    for row in soup.find_all('tr'):
        cols = [e.text for e in row.find_all('td')]
        if cols:
            courses[cols[COURSE_NUM_NDX]] = int(cols[SEATS_NDX])
    return courses

The meat here is in the get_open_seats function. In this function, we use requests.get to download a page’s HTML source, then we parse it with BeautifulSoup. We use find_all('tr') to get all the rows in the table, updating the courses dictionary to indicate the number of seats available in a given course. find_all can be used more powerfully than this simple example, so check out the documentation if you’re interested in learning more. Finally, we return the courses dictionary so that our program can look up how many seats are in a given course (i.e. courses['CS 101'] is the number of seats available in CS 101).
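To give a taste of that extra power, here are a couple of hedged examples; the CSS class name below is hypothetical and won’t match our demo page:

# Match only rows carrying a specific CSS class (hypothetical class name).
open_rows = soup.find_all('tr', class_='open-seats')

# Cap the number of matches returned.
first_two_rows = soup.find_all('tr', limit=2)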

Hooray, now we can determine whether a course has open seats. A great way to test this function out is in the Python interpreter. Save this code into a file called scraper.py, then run the script and drop into interactive mode to see what this function does:

$ python -i scraper.py
>>> get_open_seats()
{'CS 101': 1, 'CS 201': 0}

While this is great, we’re not quite to a solution; we still need some way to notify users when a seat opens up. Twilio SMS to the rescue!

Getting Updates via SMS

When building a user interface, we want simple things to be simple. In this case, users want to get notified when seats in a course open up. The simplest way for them to communicate that intent to us is sharing the course number. Let’s implement a subscription functionality by setting up and handling a webhook. I’m choosing to use Redis (a tool which provides data structures that can be accessed from multiple processes) to store subscriptions.

# sms_handler.py
from flask import Flask, request
import redis
 
twilio_account_sid = 'ACXXXXX'
redis_client = redis.StrictRedis(host='localhost', port=6379, db=0)
app = Flask(__name__)
 
@app.route('/sms', methods=['POST'])
def handle_sms():
    user = request.form['From']
    course = request.form['Body'].strip().upper()
 
    redis_client.sadd(course, user.encode('utf-8'))
    # Return an empty response for now; we'll add real feedback below.
    return ''
 
if __name__ == '__main__':
    app.run(debug=True)

Here we use a web framework for Python called Flask to create a little service that handles SMS messages. After some initial setup, we indicate that requests to the /sms endpoint should be handled by the handle_sms function. In this function, we grab the user’s phone number and the course they were looking for, and store them in a set named after the course.
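If you want to sanity-check what’s being stored, you can inspect the set from a separate Python shell. This quick sketch assumes Redis is running locally as configured above, and the phone number shown is made up:

import redis

redis_client = redis.StrictRedis(host='localhost', port=6379, db=0)
# smembers returns every subscriber stored for the course.
print(redis_client.smembers('CS 101'))  # e.g. {b'+15555555555'}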

This is great as far as capturing the subscriptions, but it is a frustrating user interface because it doesn’t provide any feedback to users. We want to get back to users and tell them whether we’re able to service their request as soon as possible. To do that, we’ll provide a TwiML response. The additional lines needed for that are highlighted below.

# sms_handler.py
from flask import Flask, request
import redis
from twilio.twiml.messaging_response import MessagingResponse
 
twilio_account_sid = 'ACXXXXX'
my_number = '+1XXXXXXXXXX'
valid_courses = {'CS 101', 'CS 201'}
 
redis_client = redis.StrictRedis(host='localhost', port=6379, db=0)
app = Flask(__name__)
 
def respond(user, body):
    response = MessagingResponse()
    response.message(body=body)
    return str(response)
 
@app.route('/sms', methods=['POST'])
def handle_sms():
    user = request.form['From']
    course = request.form['Body'].strip().upper()
    if course not in valid_courses:
        return respond(user, body="Hm, that doesn't look like a valid course. Try something like 'CS 101'.")
 
    redis_client.sadd(course, user.encode('utf-8'))
    return respond(user, body=f"Sweet action. We'll let you know when there are seats available in {course}")
 
if __name__ == '__main__':
    app.run(debug=True)

We’ve made two major changes in the code above. First, we validate that the user is asking for a valid course. Second, we respond to users when they ask for updates. In the respond function, we construct a TwiML response containing a given message.

Make sure to install Redis and start it up with the redis-server command. Save the above code into a file called sms_handler.py and then run python sms_handler.py.

Admittedly, the response messages here are a bit silly, but I was surprised to see how much users enjoyed them. In some contexts a personal touch can make for a better user experience.

Let’s extend our earlier scraping script to actually notify those people now that we know who wants to be notified of a course opening up.

# scraper.py
import redis
from twilio.rest import Client

# Fill these in with your own credentials and numbers.
twilio_account_sid = 'ACXXXXX'
twilio_auth_token = 'your_auth_token'
my_number = '+1XXXXXXXXXX'

client = Client(twilio_account_sid, twilio_auth_token)
redis_client = redis.StrictRedis(host='localhost', port=6379, db=0)

def message(recipient, body):
    client.messages.create(to=recipient, from_=my_number, body=body)


if __name__ == '__main__':
    courses = get_open_seats()
    for course, seats in courses.items():
        if seats == 0:
            continue

        to_notify = redis_client.smembers(course)
        for user in to_notify:
            message(user.decode('utf-8'),
                    body=f"Good news! Spots opened up in {course}. "
                         "We'll stop bugging you about this one now.")
            redis_client.srem(course, user)

We can run this scraper on a one-off basis to test it by running python scraper.py.

Keeping Tabs on Courses with a Cron Job

While simplifying the process of checking the course registration site into a single script is nice, we want the script to automatically run every few minutes. This problem is easily solved by using Cron. We can add a task to run every three minutes by running crontab -e and adding the following line:

*/3 * * * * python3 /path/to/scraper.py

With that code in place, the Cron daemon will run our scraper every three minutes. We can see the scheduled tasks by running crontab -l. And that’s it! We can subscribe to updates for a class and get back to the important things in life. As a fun side benefit, your friends will be very appreciative when you get them into that packed “Rest and Relaxation” course. While getting into the classes I wanted was plenty of reward for the work, it also ended up helping around a dozen people get their ideal schedules.

Using the techniques from this post, you can set up notifications for a wide variety of things. For instance, someone used Ruby and Twilio to track craft beer availability. To get all the code from this post, check out this gist.

Disclaimer: Make sure to check that setting up notifications will not violate your university’s student system terms of service. When in doubt, ask someone who knows!


How to Build a Serverless API with Amazon Web Services’ API Gateway


It’s easy to use the Twilio API to send and receive SMS using Python and Bottle. What if we convert that traditional web application into a serverless application using Amazon Web Services’ API Gateway and AWS Lambda?

Interested in learning how to create a serverless Python application? Great, because we’re going to build a serverless API for Twilio SMS throughout this post.

What Problem Are We Trying to Solve?

If we were to build a production-ready service for our next million-dollar startup, a Bottle web server running locally on the laptop with ngrok tunneling would not scale. That’s a no-brainer.

But if a startup were to allow multiple users, say Sales and Marketing, to use the same number, how would they control access to the API calls? They’d want to monitor and throttle usage for each department so they don’t break the bank.

They would need a service that allows multiple users to call the same Twilio service that is scalable, secure and hopefully easy to manage. As a bonus, we can enable them to control the usage for each user. Let’s build it.

Old Way of Solving this Problem

Traditionally, this is what we’d need to solve the problem:

  1. A web server such as Nginx or Apache.
  2. A web framework to expose the endpoints (what the Bottle framework was doing in the original SMS using Python and Bottle post). In the traditional Model-View-Controller paradigm we don’t really need the View part, but the framework comes as a bundle, so we just throw away the parts that are not needed.
  3. Functions to define what each endpoint should do.
  4. Optionally, it would probably be easiest if the server could be accessed publicly, like the ngrok tunneling in Matt’s post. This means the server needs a public IP with DNS set up.
  5. Some type of authentication mechanism to control access, depending on which framework was used.
  6. Optionally, we should monitor the API usage.

Even after all that work, that is just one server. When the business grows beyond a single server, more servers are needed. We can put the service on AWS EC2 instances with auto-scaling, but that would only solve #4 above, still leaving the rest of the tasks to be worked on.

There is a Better Way

Luckily, we live in the age of cloud computing. By combining the AWS API Gateway service with Lambda, we can solve most of the problems above. Here is an overview of the flow:

Each component above represents a service from AWS and not servers or VM’s. Here is the workflow:

  1. The API gateway will create publicly accessible URI endpoints.
  2. Each user will be assigned a key to identify the user and enforce the proper usage.
  3. The API request will be passed on to AWS Lambda service.
  4. The Lambda function will in turn make a call to Twilio API and sends the text message.

I will show you how to set up this flow by extending Matt’s code into Lambda. Truth be told, both the API Gateway and Lambda services have extensive features that we are only scratching the surface of. Luckily, they are both well documented; here are some links that can be helpful if you want to learn more: Amazon API Gateway, AWS Lambda.

Both services can work in conjunction with Twilio’s subaccount services, which allows you to separate the billing and API keys per subaccount.

Tools Needed

  1. Your AWS account, https://aws.amazon.com/. If you don’t already have one, it is highly recommended to register for a free account. Many of the services have a free tier (see the note about Lambda below) that you can use for free.
  2. Twilio account, API key that you have set up from the post, Getting Started with Python, Bottle, and Twilio SMS / MMS.

Step 0. Before We Begin:

  1. If you are brand new to AWS, here is a quick start guide broken down into solutions for each industry that you might find interesting, https://aws.amazon.com/quickstart/.
  2. If you are brand new to Lambda, it might be beneficial to watch the introduction video about the serverless model on https://aws.amazon.com/lambda/.
  3. If you are brand new to the Amazon API Gateway service, it might be helpful to quickly glance through the high-level points here: https://aws.amazon.com/api-gateway/.
  4. At the time of publishing in June 2017, there is a free tier for Lambda of up to 1 million requests per month, which is more than enough for getting started with the service. Please check the pricing page, https://aws.amazon.com/free/, for the latest AWS free tier details.
  5. The Amazon API Gateway service is currently NOT offered in the free tier. However, there are no minimum fees or upfront commitments with this usage-based service. Pricing is based on usage and your region of choice. For example, in US-East, the price is $3.50 per million API calls received, plus the cost of data transfer out: https://aws.amazon.com/api-gateway/pricing/. My total cost of API Gateway services when I built this use case was $0.01, but your mileage might vary.
  6. In AWS, not all regions are created equal. We will be using the us-east-1 region in N. Virginia to build both services, so make sure you select it in the top right-hand corner. This is sometimes overlooked and causes issues down the line, since the services have to exist in the same region.

Step 1. Create IAM Role

For security best practices, we will create a new role using Amazon Identity and Access Management for the API call that will later be assigned to the Lambda function.

  1. Create a free account or sign in to your AWS console via https://aws.amazon.com/. The link should lead you to a page where you can create a new account or log in to your existing account.

2. Once logged in, click on “Services” in the top left-hand corner and choose IAM under Security, Identity & Compliance.

3. Choose the Roles option on the left panel.

4. Create a new role and give it a name.

5. Select AWS Lambda for the role type.

6. Click on Attach Policy and attach the following pre-made policies to the role.

Step 2. Create New Amazon API Endpoint

In our simple design, we will use a single API endpoint: POST to /sms. In the body of the POST message, we will construct three JSON key-value pairs: to_number, from_number, and message.

1. From Services drop down, choose API Gateway under Application Services.

2. If this is your first time on the API Gateway page, click on the “Get Started” button or click on ‘Create API’ if you have existing API gateways.

4. If there are pre-populated sample code, close it and choose new API. Name it twilioTestAPI as the name. You can put anything you want for the description.

5. Use the Action drop down to create a resource, name it SMS.

6. Click on action again to create a method, then from the drop down menu create a POST method.

7. Select Lambda Function as the integration type. A menu to select Lambda Region and name of the function will appear.

8. At this point we need to go create the Lambda function before we can link the API to the correct resource. So let’s do that. I find it easier to leave this tab open in the browser so we can come back to it, but it is up to your personal preference.

Step 3. Create Twilio Lambda Function

  1. Make a new directory somewhere on your computer for the code you will write for the Lambda function.
  2. Change to the newly created directory and install the Twilio helper library locally via pip in the same directory. It is important to install any helper libraries you need into this single directory; in a later step we will zip all the modules in this directory into one zip file and upload it to Lambda.

 2016-09-15 11:25:54 ☆  MacBook-Air in ~/Twilio/AWS_API_Demo
○ → pip install twilio==5.5.0 -t .
Collecting twilio
  Downloading twilio-5.5.0-py2.py3-none-any.whl (262kB)
    100% |████████████████████████████████| 266kB 998kB/s
Collecting pytz (from twilio)
  Using cached pytz-2016.6.1-py2.py3-none-any.whl
Collecting six (from twilio)
  Using cached six-1.10.0-py2.py3-none-any.whl
Collecting httplib2>=0.7 (from twilio)
Installing collected packages: pytz, six, httplib2, twilio
Successfully installed httplib2 pytz-2015.7 six-1.10.0 twilio

Note: If you are using a Python version that was installed with Homebrew, instead of pip -t you would need to use pip install --prefix. For more information please see https://github.com/pypa/pip/pull/4103. In general, we have seen some issues with Homebrew-installed Python due to installation location issues. If you can, use a standard Python install for these examples.

3. Create a new file named lambda_function.py with the code below. Replace the placeholders with your Twilio Account SID and Auth Token, which you can get from the Twilio Console. The lambda_function.lambda_handler() function will be our entry point when the function is called.

 2016-09-15 11:48:21 ☆  MacBook-Air in ~/Twilio/AWS_API_Demo
○ → cat lambda_function.py

from __future__ import print_function
from twilio import twiml
from twilio.rest import TwilioRestClient
import json

client = TwilioRestClient("<your twilio account sid>", "<your auth token>")

def lambda_handler(event, context):
    print("this is the event passed to lambda_handler: "   json.dumps(event))
    print("parameters"   event['to_number'], event['from_number'], event["message"])
    client.messages.create(to=event['to_number'], from_=event['from_number'], body=event['message'])
    return "message sent"

4. Zip everything in the directory into a file called twilioLambda.zip.

zip -r twilioLambda.zip *

5. Go back to your browser for the AWS portal, click on Services and choose Lambda under Compute.

6. Create a new function.

7. Choose Blank Function.

8. Use the drop down box to choose API Gateway as your trigger.

9. Choose twilioTestAPI as the API name.


10. Add a name for your function, such as twilioAPILambda, choose Python 2.7 as the Runtime and ‘Upload a .ZIP file’ for Code entry type. Click on ‘Upload’ and upload the previously created zip file.

11. Scroll down to Lambda function handler and role. Leave the handler name as lambda_function.lambda_handler, choose to use an existing role, and select the role you created in Step 1.

12. Review and Choose Create Function.

13. Please note, you might get this trigger error, but that is ok. We will wire it up in the next step.

Step 4. Wire up the API Service with Lambda Function

  1. Head back to the AWS API Gateway browser tab you left open in Step 2, refresh the page and choose Lambda Function, the us-east-1 region (or another region if that is where you created your Lambda function), and fill in the name of the Lambda function that you just created. Then click on Save.

2. Click on ‘Ok’ when prompted to grant access from API to the Lambda function.

3. At this point you will see them wired up. You can click on the Test button to test the connection.

4. Enter your to_number, from_number, and message according to your account in the request body and click on test. Below is an example of the JSON request body:

{
    "to_number": "",
    "from_number": "",
    "message": "hello from API"
}

5. You should get a 200 status and a ‘message sent’ in the response body. If it does not work, delete the resource and recreate it.

6. You can also test it from the Lambda function itself on the corresponding Lambda page on AWS; use the test button and fill in the same fields.

7. You should now see the message on your phone.

Step 5. Create Security Authentication and a Usage Plan

  1. Create an API key. After creation, take note of the API key.

2. Create a Usage Plan and associate it back to the API.

3. Set API Key Required to True for the POST method.

4. From the Actions drop down, choose Deploy API.

5. Take note of the Invoke URL.

6. Associate the Usage Plan with the API stage that you just deployed.

Step 6. At Last, Test out the Public API

Here is a quick Python script for testing the URI. Notice that I changed the message to indicate it was sent from the Python script:

import requests, json
 
url = "https://enjvbt2bvj.execute-api.us-east-1.amazonaws.com/prod/sms"
auth_header = {'content-type': 'application/json', 'x-api-key': '<your api key>'}
data = {
    "to_number": "",
    "from_number": "",
    "message": "Hello from Python Requests Test to Prod!"
}
 
r = requests.post(url, data=json.dumps(data), headers=auth_header)
print(str(r.status_code) + " " + r.content)

Here is the result when we execute the above code:

○ → python pythonTest.py
200 "message sent"

Let’s try again with some bogus key in the header:

auth_header = {'content-type': 'application/json', 'x-api-key': 'bogus key'}

As expected, here is the forbidden message:

○ → python pythonTest.py
403 {"message":"Forbidden"}
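If you prefer testing from the command line, an equivalent curl call might look like the following sketch; the URL and key are placeholders for your own Invoke URL and API key:

curl -X POST \
  -H "Content-Type: application/json" \
  -H "x-api-key: <your api key>" \
  -d '{"to_number": "+1XXXXXXXXXX", "from_number": "+1XXXXXXXXXX", "message": "hello from curl"}' \
  https://<your-api-id>.execute-api.us-east-1.amazonaws.com/prod/sms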

Conclusion

Whew, we did it! Hopefully you now have a sense of the power of the serverless setup. I believe the combination of AWS Lambda and the Amazon API Gateway service is one of those disruptive technologies that allows for quick and secure service deployments that previously were not possible.

Please feel free to reach out to me on Twitter @ericchou or by email eric@pythonicneteng.com if you have any questions regarding this post or my new book, Mastering Python Networking.


Image recognition in Python with the Clarifai API and Twilio MMS


Image recognition can seem like a pretty daunting technical challenge. Scraping images to use as training data for a machine learning model stresses me out. That’s where Clarifai comes in. This API is great for implementing image recognition so you can focus on the core functionality of what you are building.

Let’s build a Flask application in Python with Twilio MMS to receive picture messages over a phone number and respond with relevant keywords from Clarifai’s image recognition API.

Setting up your environment

Before moving on, make sure your Python environment is set up. Getting everything working correctly, especially with respect to virtual environments, is important for isolating your dependencies if you have multiple projects running on the same machine.

You can also run through this guide to make sure you’re good to go before moving on.

Installing dependencies

Now that your environment is set up, you’re going to need to install the libraries we’ll use for this app. Navigate to the directory where you want this code to live and run the following command in your terminal with your virtual environment activated to install these dependencies:

pip install flask==0.12.2 twilio==6.4.2 clarifai==2.0.29

Image recognition with Clarifai

Let’s start by writing a module to interact with the Clarifai API. Before being able to use the Clarifai API, you’ll have to make an account. Once you have an account, you’ll need to create an application so you have an API key to use. You can name your application whatever you want.


Once you have an application, you need to set the following environment variable so the Clarifai Python module can use your API key to authenticate:

export CLARIFAI_API_KEY=*your API key*

Now you can start writing some code. Create a file called tags.py and enter the following code:

from clarifai.rest import ClarifaiApp


app = ClarifaiApp()


def get_relevant_tags(image_url):
    response_data = app.tag_urls([image_url])

    tag_urls = []
    for concept in response_data['outputs'][0]['data']['concepts']:
        tag_urls.append(concept['name'])

    return tag_urls

What we’re doing here is defining a function that hits the Clarifai API with an image URL and returns all of the tags or “concepts” associated with that image. Try running this code with an image of your choice from the Internet and see what happens. I’m going to use this picture from a show my old band played in Philly back in 2012.


Append the line print('\n'.join(get_relevant_tags('*image_url*'))) to your code and run it:

python tags.py

You should see some results printing to your terminal. Here’s some of what I got. It’s interesting that Clarifai was basically able to figure out that this picture was of a group of people taken at a concert.


You can delete that line you just added before moving on.

Setting up your Twilio account

Before being able to respond to picture messages, you’ll need a Twilio phone number. You can buy a phone number here.

Your Flask app will need to be visible from the internet in order for Twilio to send requests to it. We will use ngrok for this, which you’ll need to install if you don’t have it. In your terminal run the following command:

ngrok http 5000

This provides us with a publicly accessible URL to the Flask app. Configure your phone number’s messaging webhook to point at this ngrok URL.

You are now ready to send a text message to your new Twilio number.

Building the Flask app

Now that you have a Twilio number and are able to grab relevant tags and keywords associated with an image, you want to allow users to text a phone number with their own pictures for your code to analyze.

Let’s create our Flask app. Open a new file called app.py and add the following code:

from flask import Flask, request
from twilio.twiml.messaging_response import MessagingResponse

from tags import get_relevant_tags


app = Flask(__name__)


@app.route('/sms', methods=['POST'])
def sms_reply():
    # Create a MessagingResponse object to generate TwiML.
    resp = MessagingResponse()

    # See if the number of images in the text message is greater than zero.
    if request.form['NumMedia'] != '0':

        # Grab the image URL from the request body.
        image_url = request.form['MediaUrl0']
        relevant_tags = get_relevant_tags(image_url)
        resp.message('\n'.join(relevant_tags))
    else:
        resp.message('Please send an image.')

    return str(resp)


if __name__ == '__main__':
    app.run()

We only need one route on this app: /sms to handle incoming text messages.

Run your code with the following terminal command:

python app.py

Now take a selfie (or any picture) and send it to your Twilio phone number to see if it will recognize what’s inside!

How does this all work?

With this app running on port 5000, sitting behind our public ngrok URL, Twilio can see your application. Upon receiving a text message:

  1. Twilio will send a POST request to /sms.
  2. The sms_reply function will be called.
  3. The URL of the image in the text message is passed to our tags module.
  4. A request to the Clarifai API will be made, receiving a response with keywords associated with our image.
  5. Your /sms route responds to Twilio’s request telling Twilio to send a message back with the tags we received from the Clarifai API.
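Before texting a real picture, you can simulate Twilio’s webhook request locally with curl; the image URL below is just a stand-in for any publicly accessible image:

curl -X POST http://127.0.0.1:5000/sms \
  -d "NumMedia=1" \
  -d "MediaUrl0=https://example.com/some-image.jpg"

The response printed back is the TwiML that Twilio would turn into an SMS.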

For more Clarifai related fun, check out this post on how to hack your gift giving. If you’re interested in trying your hand at image recognition on your own with OpenCV, you can check out this blog post written by Megan Speir.

Feel free to reach out if you have any questions or comments or just want to show off the cool stuff you’ve built.


Reading Excel Spreadsheets with Python, Flask, and Openpyxl


 


Data stored in Excel spreadsheets can be hard to read with anything other than Excel, and it’s especially tough to compare two specific datasets within all that data. One possible solution is Python. It can do the dirty work of finding the information for us while also being pretty fun.

In this post we will read NBA statistics from a Microsoft Excel sheet using the Openpyxl library. How will we know which statistics to look for and return? Text a Twilio phone number two players and a type of basketball statistic (like total points or three-point-shot percentage) and the SMS response will look up the statistics of the corresponding players.

Want to skip the tutorial and jump right into the code?  No problem, head over to the complete code.

Otherwise, let’s get started.

Getting the Data

This post uses data for each player from the 2016 season on the NBA website here. To get the data, I specified the season, season type (playoffs versus regular season), quantifying data type (game average versus total for the season), which dates to look at, and the stats this post uses, like age, games played, wins, losses, minutes, points, field goal percentage, and three-point shot percentage, among others. You can see the data in full here, and export it as an Excel file like this.

Setup your Developer Environment

Before we dig into the code and Openpyxl, make sure your Python and Flask development environment is set up. If you’re new to Python and Flask, this handy guide is a great place to get started. If you’re already familiar with Flask, go ahead and stand up a new empty Flask application.

Your Flask app will need to be visible from the internet so Twilio can send requests to it. Ngrok lets us do this, so install it if you haven’t already. Once that’s done, run the following command in your terminal in the directory you’ll put your code in.

ngrok http 5000

This gives us a publicly-accessible URL to the Flask app so we can configure our Twilio phone number.

You’ll also need a Twilio phone number to send and receive SMS messages. If you need to get one you can do that here, and make sure its messaging webhook is configured with your ngrok URL.

Parsing Data with Openpyxl

Once you have your environment set up and have acquired a Twilio phone number you can start building the app by installing Openpyxl, an open source Python library that reads and writes Microsoft Excel .xlsx files.

Type the following into your terminal in your project directory.

pip install twilio==6.4.2
pip install flask==0.12.2
pip install openpyxl==2.4.8

Open up a new file and save it as main.py. At the very top, include the following imports.

from flask import Flask, request
from twilio.twiml.messaging_response import MessagingResponse
from openpyxl import load_workbook, Workbook

Let’s use two separate lists to store the data we read from the file: one of players and one of their corresponding statistics we want to search (i.e. games played, wins, losses, minutes, points, field goal percentage, etc.). These will be used later on to make a dictionary where the keys are the players’ names and the values are their corresponding statistics. Now let’s create a function called parse_data_into_dict in main.py and put in the following code. This code maps the statistics we’re interested in to different columns of the Excel sheet, represented by letters in stat_dict:

def parse_data_into_dict(data):
    list_of_players = []
    list_of_stats = []
    stat_dict = {
        "age": "B", "gp": "C", "w": "D", "l": "E", "min": "F", "pts": "G",
        "fgm": "H", "fga": "I", "fg%": "J",
        "3pm": "K", "3pa": "L", "ftm": "M", "fta": "N",
        "ft%": "O", "oreb": "P", "dreb": "Q", "reb": "R", "ast": "S", "tov": "T", "stl": "U",
        "blk": "V", "pf": "W", "dd2": "X", "td3": "Y"
    }

Next, let’s fill these lists with the data in the spreadsheet. Start by loading the Excel file using the load_workbook function and reading the existing worksheet. Our parse_data_into_dict function should now look like this:

def parse_data_into_dict(data):
    list_of_players = []
    list_of_stats = []
    stat_dict = {
        "age": "B", "gp": "C", "w": "D", "l": "E", "min": "F", "pts": "G",
        "fgm": "H", "fga": "I", "fg%": "J",
        "3pm": "K", "3pa": "L", "ftm": "M", "fta": "N",
        "ft%": "O", "oreb": "P", "dreb": "Q", "reb": "R", "ast": "S", "tov": "T", "stl": "U",
        "blk": "V", "pf": "W", "dd2": "X", "td3": "Y"
    }
    excelfile = 'nbastats.xlsx'
    wb = load_workbook(excelfile)
    ws = wb[wb.sheetnames[0]]
    for row in range(1, ws.max_row+1):  # need +1 to get the last row!

More complex apps and data may have different worksheets within a single workbook. With the data we have, one could be Regular Season while another could be for the Playoffs. We just want the worksheet at index zero since we only have one worksheet.

After we’ve loaded the Excel worksheet, let’s iterate through each cell in the worksheet. Each column of the worksheet is represented by a letter (which is why we made the dictionary above so that the values match the columns in the Excel sheets). Knowing this, let’s add on to our parse_data_into_dict function.

def parse_data_into_dict(data):
    list_of_players = []
    list_of_stats = []
    stat_dict = {
        "age": "B", "gp": "C", "w": "D", "l": "E", "min": "F", "pts": "G",
        "fgm": "H", "fga": "I", "fg%": "J",
        "3pm": "K", "3pa": "L", "ftm": "M", "fta": "N",
        "ft%": "O", "oreb": "P", "dreb": "Q", "reb": "R", "ast": "S", "tov": "T", "stl": "U",
        "blk": "V", "pf": "W", "dd2": "X", "td3": "Y"
    }
    excelfile = 'nbastats.xlsx'
    wb = load_workbook(excelfile)
    ws = wb[wb.sheetnames[0]]
    for row in range(1, ws.max_row+1):  # need +1 to get the last row!
        # Column A holds the players' names.
        cell_name = "{}{}".format("A", row)
        list_of_players.append(ws[cell_name].value.lower())
        # Look up the column letter of whatever statistic was requested.
        cell_name = "{}{}".format(stat_dict[data], row)
        list_of_stats.append(ws[cell_name].value)
    return dict(zip(list_of_players, list_of_stats))

The “A” column contains the players’ names, so each player in that column is added to the list_of_players list. Then, using the stat_dict dictionary, we look up the column containing the statistic we are interested in and add each of its values to the separate list_of_stats list.

These two lists are then zipped together into one dictionary with players as keys and the corresponding statistic numbers as values. This dictionary will be returned and used in a send_sms function, which we will now write.
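To make that zip step concrete, here is a tiny sketch with made-up numbers:

players = ['lebron james', 'kevin durant']
points = [1954, 1555]
print(dict(zip(players, points)))
# {'lebron james': 1954, 'kevin durant': 1555}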

Building the Flask App and Sending SMS

This send_sms function provides the heart of the code: it’s the route for the Flask app and checks if the body of the incoming SMS is in our dictionary and sends an outbound SMS accordingly.

app = Flask(__name__)

@app.route('/', methods=['GET', 'POST'])
def send_sms():
    msg = request.form['Body'].lower()  # convert to lowercase
    typomsg = "send 1st + last names of 2 players followed by a stat (GP,W,L,MIN,PTS,FG%,3P%,FT%,REB,AST,STL,BLK). Check for typos!"
    player_and_stat = msg.split()  # split the message on whitespace

    if len(player_and_stat) == 5:  # check input: 2 players + stat
        player1 = player_and_stat[0] + " " + player_and_stat[1]
        player2 = player_and_stat[2] + " " + player_and_stat[3]
        stat = player_and_stat[4]
        player_stat_map = parse_data_into_dict(stat)
        if player1 in player_stat_map and player2 in player_stat_map:
            if player_stat_map[player1] > player_stat_map[player2]:
                ret = MessagingResponse().message(player1 + "'s total " + str(player_stat_map[player1]) + " is higher than " + player2 + "'s " + str(player_stat_map[player2]))
            else:
                ret = MessagingResponse().message(player2 + "'s total " + str(player_stat_map[player2]) + " is higher than " + player1 + "'s " + str(player_stat_map[player1]))
        else:  # one or both names weren't found
            ret = MessagingResponse().message("check both players' names (first and last!)")
    else:  # the message didn't have exactly five words
        ret = MessagingResponse().message(typomsg)
    return str(ret)


if __name__ == "__main__":
    app.run(debug=True)

Let’s break this down. request.form['Body'].lower() looks at the incoming SMS and converts the Body to lowercase so it’s easier to compare. Then, we break it up by whitespace and add each piece to a list of strings. If that list has a length of five (which is what we expect, because the input should be the first and last names of two players followed by a statistic), then we save those two players and the statistic as variables.

Next, we call parse_data_into_dict and use the dictionary it returns to check that it contains the two players and the statistic the message asks for. If it does, we check whether the data of one player is greater than the other’s. Depending on that, a different message is returned. If one or both of the players are not in the dictionary, we return an error message.

Run the following command on the command line to run our Flask app.

python main.py

Now try out the app.  Text your Twilio number two players (with first and last names) and a statistic like “pts.” It could look something like this.

Conclusion

Wow! You just used Openpyxl to read an Excel spreadsheet. What’s next? You can use Openpyxl for financial, baseball, or any other sort of data. Here are some more Openpyxl resources and tutorials you may find interesting that go even more in depth:

  1. Openpyxl docs
  2. pythonexcel.com tutorial
  3. tutsplus  

The completed code is on GitHub. Questions or comments on the data or code used? Let me know!


Getting Started on Geospatial Analysis with Python, GeoJSON and GeoPandas


As a native New Yorker, I would be a mess without Google Maps every single time I go anywhere outside the city. We take products like Google Maps for granted, but they’re an important convenience. Products like Google or Apple Maps are built on foundations of geospatial technology. At the center of these technologies are locations, their interactions and roles in a greater ecosystem of location services.

This field is referred to as geospatial analysis. Geospatial analysis applies statistical analysis to data that has geographical or geometrical components. In this tutorial, we’ll use Python to learn the basics of acquiring geospatial data, handling it, and visualizing it. More specifically, we’ll do some interactive visualizations of the United States!

Environment Setup

This guide was written in Python 3.6. If you haven’t already, download Python and pip. Next, you’ll need to install several packages that we’ll use throughout this tutorial. You can do this by opening a terminal or command prompt on your operating system:

pip3 install shapely==1.5.17.post1
pip3 install geopandas==0.2.1
pip3 install geojsonio==0.0.3

Since we’ll be working with Python interactively, using the Jupyter Notebook is the best way to get the most out of this tutorial. Following this installation guide, once you have your notebook up and running, go ahead and download all the data for this post here. Make sure you have the data in the same directory as your notebook and then we’re good to go!

A Quick Note on Jupyter

For those of you who are unfamiliar with Jupyter notebooks, I’ve provided a brief review of the functions that will be particularly useful to move along with this tutorial.

In the image below, you’ll see three buttons labeled 1-3 that will be important for you to get a grasp of: the save button (1), the add cell button (2), and the run cell button (3).

The first button is the button you’ll use to save your work as you go along (1). Feel free to choose when to save your work.

Next, we have the “add cell” button (2). Cells are blocks of code that you can run together. These are the building blocks of Jupyter Notebook because they provide the option of running code incrementally without having to run all your code at once. Throughout this tutorial, you’ll see lines of code blocked off. Each block of code should correspond to a cell.

Lastly, there’s the “run cell” button (3). Jupyter Notebook doesn’t automatically run your code for you; you have to tell it when by clicking this button. As with the add button, once you’ve written each block of code in this tutorial into a cell, you should then run it to see the output (if any). If any output is expected, note that it will also be shown in this tutorial so you know what to expect. Make sure to run your code as you go along, because many blocks of code in this tutorial rely on previous cells.

Introduction

Data typically comes in the form of a few fundamental data types: strings, floats, integers, and booleans. Geospatial data, however, uses a different set of data types for its analyses. Using the shapely module, we’ll review what these different data types look like.

shapely has a module called geometry that contains different geometric objects. Using this module, we’ll import the needed data types:

from shapely.geometry import Point, Polygon

The simplest data type in geospatial analysis is the Point data type. Points are objects representing a single location in a two-dimensional space, or simply put, XY coordinates. In Python, we use the point class with x and y as parameters to create a point object:

p1 = Point(0,0)
print(p1)

POINT (0 0)

Notice that when we print p1, the output is POINT (0 0). This indicates that the object returned isn’t a built-in Python data type. We can check this by asking Python to interpret whether or not the point is equivalent to the tuple (0, 0):

print(p1 == (0,0))

False

The above code returns False because of the difference in type. If we print the type of p1, we get a shapely Point object:

print(type(p1))

Next we have the Polygon, a two-dimensional surface that’s stored as a sequence of points defining its exterior. Because a polygon is composed of multiple points, the shapely Polygon object takes a list of tuples as a parameter.

polygon = Polygon([(0,0),(1,1),(1,0)])

Oddly enough, the shapely Polygon object will not take a list of shapely Points as a parameter. If we incorrectly pass in Points, we’ll get an error message reminding us of the lack of support for this data type.
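If you do find yourself holding Point objects, one simple workaround is to unpack their coordinates into tuples first. A minimal sketch:

points = [Point(0, 0), Point(1, 1), Point(1, 0)]
# Each Point exposes its coordinates as .x and .y, so we can rebuild the tuples.
polygon = Polygon([(p.x, p.y) for p in points])
print(polygon)  # POLYGON ((0 0, 1 1, 1 0, 0 0))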

Data Structures

GeoJSON is a format for representing geographic objects. It’s different from regular JSON because it supports geometry types such as Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, and GeometryCollection.

GeoJSON makes visualizations easier, as you’ll see in a later section, primarily because it allows us to store collections of geometric data types in one central structure.
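For reference, here is a minimal GeoJSON Feature pairing a geometry with arbitrary properties; the coordinates are the same Manhattan point we’ll display at the end of this post:

{
    "type": "Feature",
    "geometry": {
        "type": "Point",
        "coordinates": [-73.9617, 40.8067]
    },
    "properties": {
        "name": "an example point in Manhattan"
    }
}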

GeoPandas is a Python module that makes working with geospatial data in Python easier by extending the datatypes used by the pandas module to allow spatial operations on geometric types. If you’re unfamiliar with pandas, check out these tutorials here.

Typically, GeoPandas is imported as gpd and is used to read GeoJSON data into a DataFrame. Below you can see that we’ve printed out the first five rows of a GeoJSON DataFrame:

import geopandas as gpd
states = gpd.read_file('states.geojson')
print(states.head())


 adm1_code          featurecla  \
0  USA-3514  Admin-1 scale rank   
1  USA-3515  Admin-1 scale rank   
2  USA-3516  Admin-1 scale rank   
3  USA-3517  Admin-1 scale rank   
4  USA-3518  Admin-1 scale rank   

                                            geometry id  scalerank  
0  POLYGON ((-89.59940899999999 48.010274, -89.48...  0          2  
1  POLYGON ((-111.194189 44.561156, -111.291548 4...  1          2  
2  POLYGON ((-96.601359 46.351357, -96.5389080000...  2          2  
3  (POLYGON ((-155.93665 19.05939, -155.90806 19....  3          2  
4  POLYGON ((-111.049728 44.488163, -111.050245 4...  4          2

Just as with regular JSON and pandas DataFrames, GeoJSON and GeoPandas have functions that allow you to easily convert one to the other. Using the example dataset from above, we can convert the DataFrame to a GeoJSON object using the to_json function:

states = states.to_json()
print(states)

{"type": "FeatureCollection", "features": [...]}

Being able to easily convert GeoJSON from one format to another gives us more freedom as to what we can do with our data, whether that be analyzing, visualizing, or manipulating.

Next we will review geojsonio, a tool used for visualizing GeoJSON on the browser. Using the states dataset above, we’ll visualize the United States as a series of Polygons with geojsonio’s display function:

import geojsonio
geojsonio.display(states)

Once this code is run, a link will open in the browser, displaying an interface as shown below:

On the left of the page, you can see the GeoJSON displayed and available for editing. If you zoom in and select a geometric object, you’ll see that you also have the option to customize it:

And perhaps most importantly, geojsonio has multiple options for sharing your content. There is the option to share a link directly:

And, conveniently, the option to save to GitHub, GitHub Gist, GeoJSON, CSV, and various other formats gives developers plenty of flexibility when deciding how to share or host content.

In the example before, we used GeoPandas to pass GeoJSON to the display function. If no manipulation of the geospatial data needs to be performed, we can treat the file as any other and assign its contents to a variable:

contents = open('map.geojson').read()
print(contents)


{
    "type": "Point",
    "coordinates": [
        -73.9617,
        40.8067
    ]
}

The contents are still a suitable parameter for the display function because what we read in is simply a GeoJSON string. Again, the main difference from using GeoPandas is whether or not any manipulation needs to be done.

This example is simply a point, so besides reading in the JSON nothing has to be done; we’ll just pass the GeoJSON string directly:

geojsonio.display(contents)

And once again, a link is opened in the browser and we have this beautiful visualization of a location in Manhattan.

And That’s a Wrap

That wraps up an introduction to performing geospatial analysis with Python. Most of these techniques are interchangeable in R, but Python is one of the most suitable languages for geospatial analysis. Its modules and tools are built with developers in mind, making the transition into geospatial analysis much easier.

In this tutorial, we visualized a map of the United States, as well as plotted a coordinate data point in Manhattan. There are multiple ways in which you can expand on these exercises; state outlines, for example, are crucial to many visualizations created to compare results between states.

Moving forward from this tutorial, not only can you create this sort of visualization, but you can combine the techniques we used to plot coordinates throughout multiple states. To learn more about geospatial analysis, check the resources below:

If you liked what you did here, follow @lesleyclovesyou on Twitter for more content, data science ramblings, and most importantly, retweets of super cute puppies.

JSON Serialization in Python using serpy

Serialization is the process of transforming objects of complex data types (custom-defined classes, object-relational mappers, datetime, etc.) to native data types so that they can then be easily converted to JSON notation.

In this blog post, we will use a neat little library called serpy to see how such transformations work. We will then integrate this code in a tornado-based web server for a quick demo of how we can write APIs returning JSON data.

Step 1: Define the Data Type

Let’s assume that we are working on an API which returns details of people, like an ID, their name, and their birthdate. For the scope of this blog post, we could say that the API has access to a database of people and that requesting /person/42 will return a JSON representation of the person with ID 42. Before we begin, let’s quickly define the data type.

class Person(object):
    def __init__(self, id, name, birth_date):
        self.id = id
        self.name = name
        self.birth_date = birth_date

    @property
    def sidekick(self):
        return sidekick_for(self)

Assume for now that the function sidekick_for returns the sidekick for the given person (also a Person object). We’ll add a proper function definition later.

Now, JSON only accepts native data types like integers, strings, booleans, and so on. It’s pretty clear that the Python json.dumps function on a Person object won’t work. Instead, we need a representation that only uses native data types before we can pass it to a JSON encoding function.

Approach #1 – Straightforward

We could add a to_json function to the Person class that returns a dictionary of the Person details. That would look something like the following:

def to_json(self):
    return {
        'id': self.id,
        'name': self.name,
        'birth_date': self.birth_date.isoformat(),
        'sidekick': self.sidekick.to_json() if self.sidekick else None,
    }

We return the values of the attributes which make up a Person object, and since they’re all native types, we can pass this dictionary into an encoding function to finally get the JSON notation. Looks good!
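
To round the example off, we can hand that dictionary to json.dumps. A minimal sketch, assuming person is a Person instance whose sidekick chain eventually ends in None:

import json

# sketch: `person` is assumed to be a Person whose sidekick chain ends in None
print(json.dumps(person.to_json()))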

Note that we need to call isoformat on self.birth_date to get a string back since a Python datetime object is not a native datatype. Also note that we’re recursively calling the to_json function on self.sidekick to get its JSON representation. If we don’t, that variable will end up being a Person object which can’t be converted directly to JSON.

While this works, there are a few issues here. For one, we can’t define the field types. So if some code is consuming this JSON representation and we encounter a boolean value for id, the calling code would be confused. Ideally we would like to handle such cases already at serialization time. Additionally, some use cases might require that the returned value be different based on some context. As an example, consider a web application that allows chat rooms where two or more users can talk to each other. In such cases, the number of unread messages for the same chat room would be different based on which user is requesting the value.

The simplest thing to do here would be to separate the serializer definition from the original class definition. This is where serpy comes in. If serpy is not yet installed, type pip install serpy==0.1.1 on the command line, and let’s see how we can use it!

Approach #2: Define a Serializer

from serpy import Serializer, IntField, StrField, MethodField


class PersonSerializer(Serializer):
    id = IntField(required=True)
    name = StrField(required=True)
    birth_date = MethodField('serialize_birth_date')
    sidekick = MethodField('serialize_sidekick')

    def serialize_birth_date(self, person):
        return person.birth_date.isoformat()

    def serialize_sidekick(self, person):
        if not person.sidekick:
            return None
        return PersonSerializer(person.sidekick).data

What just happened? We defined a PersonSerializer, which is a class that defines how Person objects should be serialized. This is good because we now know the field types, which means returning a boolean value for id is considered an error, and returning no value is also an error because id is marked as a required field.
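
As a quick check, serializing an instance returns a plain dictionary of native types. Assuming a Person like the batman instance we’ll define in the full example below (key order may vary):

PersonSerializer(batman).data

{'id': 1, 'name': 'Batman', 'birth_date': '1980-01-01T00:00:00', 'sidekick': {'id': 2, 'name': 'Robin', 'birth_date': '1980-01-01T00:00:00', 'sidekick': None}}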

That’s not all. We also achieved separation of concerns by moving the to_json function into a separate serializer class. In case we have to perform some code surgery in the future, this is fantastic.

What’s also cool is that with a few more lines of code, we can add context to the serializer as well, which would then enable us to change the value of a given field depending on what the context is. Alas, that’s a topic for a separate blog post, or perhaps an exercise for you, the reader. :)

Putting it all together

Let’s write a small API server in tornado (run pip install tornado==4.5.1 on the command line) that combines all the code we wrote in this post. Save the following code in a file called server.py in the current directory.

from datetime import datetime

from serpy import Serializer, IntField, StrField, MethodField
from tornado.escape import json_encode
from tornado.ioloop import IOLoop
from tornado.web import RequestHandler, Application

class Person(object):
    def __init__(self, id, name, birth_date):
        self.id = id
        self.name = name
        self.birth_date = birth_date

    @property
    def sidekick(self):
        return sidekick_for(self)

class PersonSerializer(Serializer):
    id = IntField(required=True)
    name = StrField(required=True)
    birth_date = MethodField('serialize_birth_date')
    sidekick = MethodField('serialize_sidekick')

    def serialize_birth_date(self, person):
        return person.birth_date.isoformat()

    def serialize_sidekick(self, person):
        if not person.sidekick:
            return None
        return PersonSerializer(person.sidekick).data

batman = Person(1, 'Batman', datetime(year=1980, month=1, day=1))
robin = Person(2, 'Robin', datetime(year=1980, month=1, day=1))

def sidekick_for(person):
    return robin if person == batman else None

class BatmanHandler(RequestHandler):
    def get(self):
        self.write(json_encode(PersonSerializer(batman).data))

if __name__ == '__main__':
    Application([(r'/batman', BatmanHandler)]).listen(8888)
    print('Listening on port 8888')

    IOLoop.current().start()

Run it by executing python server.py on the command line in the current working directory. Python should start a web server on port 8888, and visiting the URL /batman should show you the JSON representation of the Person object. It works!
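
For instance, requesting the endpoint with curl should return something like this (key order may vary):

$ curl http://localhost:8888/batman
{"id": 1, "name": "Batman", "birth_date": "1980-01-01T00:00:00", "sidekick": {"id": 2, "name": "Robin", "birth_date": "1980-01-01T00:00:00", "sidekick": null}}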

Conclusion

In this post we explored how to serialize Python objects to JSON. An important thing to keep in mind is that serialization is not limited to JSON. There are plenty of other data formats (XML, for instance) which could use some help. Either way, the basic concept remains the same.
An interesting follow-up exercise would be to try dumping data into different data formats and trying out other libraries (like marshmallow). Happy serializing!

If you have any questions, suggestions, or feedback, feel free to find me online. I’d love to hear from you if this post helped you build something cool!

Analyzing Messy Data Sentiment with Python and nltk

Sentiment analysis uses computational tools to determine the emotional tone behind words. This approach can be important because it allows you to gain an understanding of the attitudes, opinions, and emotions of the people in your data.
At a higher level, sentiment analysis involves natural language processing and artificial intelligence by taking the text element, transforming it into a format that a machine can read, and using statistics to determine the actual sentiment.
In this tutorial, we’ll use the natural language processing module, nltk, to determine the sentiment of tweets from Twitter.

Sentiment analysis on text

Sentiment analysis isn’t a new concept. There are thousands of labeled datasets out there, with labels varying from simple positive and negative to more complex systems that grade how positive or negative a given text is. Because there’s so much ambiguity in how textual data is labeled, there’s no single way of building a sentiment analysis classifier.

I’ve selected a pre-labeled set of data consisting of tweets from Twitter already labeled as positive or negative. Using this data, we’ll build a sentiment analysis model with nltk.

Environment Setup

This guide was written in Python 3.6. If you haven’t already, download Python and Pip. Next, you’ll need to install the nltk package that we’ll use throughout this tutorial:

pip3 install nltk==3.2.4

We will use datasets that are already well established and widely used for our textual analysis. To gain access to these datasets, enter the following command into your command line (note that this might take a few minutes):

sudo python3 -m nltk.downloader all

Using Jupyter Notebook is the best way to get the most out of this tutorial by using its interactive prompts. When you have your notebook up and running, you can download the data we’ll be working with in this example. You can find this in the repo as neg_tweets.txt and pos_tweets.txt. Make sure you have the data in the same directory as your notebook and then we are good to go.

A Quick Note on Jupyter

For those of you who are unfamiliar with Jupyter notebooks, I’ve provided a brief review of which functions will be particularly useful to move along with this tutorial.
In the image below, you’ll see three buttons labeled 1-3 that will be important for you to get a grasp of — the save button (1), add cell button (2), and run cell button (3).

The first button is the button you’ll use to save your work as you go along (1).

Next, we have the “add cell” button (2). Cells are blocks of code that you can run together. These are the building blocks of Jupyter Notebook because it provides the option of running code incrementally without having to run all your code at once. Throughout this tutorial, you’ll see lines of code blocked off; each one should correspond to a cell.

Lastly, there’s the “run cell” button (3). Jupyter Notebook doesn’t automatically run your code for you; you have to tell it when to do it by clicking “run cell”. As with the add button, once you’ve written each block of code in this tutorial onto your cell, you should then run it to see the output (if any). If any output is expected, note that it will also be shown in this tutorial so you know what to expect. Make sure to run your code as you go along because many blocks of code in this tutorial rely on previous cells.

Preparing the Data

We’ll now use nltk to build a sentiment analysis model on the same dataset. nltk requires a different data format, which is why I’ve implemented the function below:

import nltk

def format_sentence(sent):
    return({word: True for word in nltk.word_tokenize(sent)})

print(format_sentence("The cat is very cute"))

Which produces

{'The': True, 'cat': True, 'is': True, 'very': True, 'cute': True}

format_sentence changes each tweet into a dictionary of words mapped to True booleans. Though not obvious from this function alone, this will eventually allow us to train our prediction model by splitting the text into its tokens, i.e. tokenizing the text.
Format the positively and negatively labeled data using the data we downloaded from the GitHub repository.

pos = []
with open("./pos_tweets.txt") as f:
    for i in f: 
        pos.append([format_sentence(i), 'pos'])

neg = []
with open("./neg_tweets.txt") as f:
    for i in f: 
        neg.append([format_sentence(i), 'neg'])

# next, split labeled data into the training and test data
training = pos[:int((.8)*len(pos))] + neg[:int((.8)*len(neg))]
test = pos[int((.8)*len(pos)):] + neg[int((.8)*len(neg)):]

Building a Classifier

All nltk classifiers work with feature structures, which can be simple dictionaries mapping a feature name to a feature value. In this example, we use the Naive Bayes Classifier, which makes predictions based on the word frequencies associated with each label of positive or negative.

from nltk.classify import NaiveBayesClassifier

classifier = NaiveBayesClassifier.train(training)

We can call the function show_most_informative_features to see which words are the highest indicators of a positive or negative label because the Naive Bayes Classifier is based entirely off of the frequencies associated with each label for a given word:

classifier.show_most_informative_features()

Most Informative Features
         no = True              neg : pos    =     19.4 : 1.0
       love = True              pos : neg    =     19.0 : 1.0
    awesome = True              pos : neg    =     17.2 : 1.0
   headache = True              neg : pos    =     16.2 : 1.0
         Hi = True              pos : neg    =     12.7 : 1.0
        fan = True              pos : neg    =      9.7 : 1.0
  beautiful = True              pos : neg    =      9.7 : 1.0
      Thank = True              pos : neg    =      9.7 : 1.0
        New = True              pos : neg    =      9.7 : 1.0
       haha = True              pos : neg    =      9.3 : 1.0

Notice that there are three columns. Column 1 contains the word features, which is why we used format_sentence to map each word to a True value; the classifier counts the number of occurrences of each word under both labels to compute the ratio between the two, which is what column 3 represents. Column 2 lets us know which label occurs more frequently: the label on the left is the label most associated with the corresponding word.

Classification

Let’s try the classifier out with a positive example to see how our model works:

example1 = "Cats are awesome!"

print(classifier.classify(format_sentence(example1)))

Outputs:

pos

Now try out an example we’d expect to get a negative label:

example2 = "I don’t like cats."

print(classifier.classify(format_sentence(example2)))

We get the output:

neg

What happens when we mix words of different sentiment labels? Take a look at this example:

example3 = "I have no headache!"

print(classifier.classify(format_sentence(example3)))

Output:

neg

We’ve found a mislabel! Naive Bayes doesn’t consider the relationship between words, which is why it wasn’t able to catch the fact that “no” acted as a negator to the word headache. Instead, it read two negative indicators and classified it as such.
Given that, we can probably expect a less than perfect accuracy rate.

Accuracy

nltk has a built-in method that computes the accuracy rate of our model:

from nltk.classify.util import accuracy
print(accuracy(classifier, test))

The result:

0.8308457711442786

nltk specializes in natural language processing tasks, so we should expect it to do fairly well with uncleaned and unnormalized data. But why did this well-established model only reach an accuracy rate of ~83%? It goes back to the lack of processing beforehand.

If you look at the actual data, you’ll see that the data is kind of messy – there are typos, abbreviations, grammatical errors of all sorts. There’s no general format to every tweet aside from the fact that each tweet is, well, a tweet. So what can we do about this?

For now, we’re stuck with our 83% accurate classifier.

If you liked what you did here, follow me @lesleyclovesyou on Twitter for more content, data science ramblings, and most importantly, retweets of super cute puppies.

Request Signature Authentication for IVRs Built with Python

For many APIs it is desirable to authenticate requests made to an endpoint. For an interactive voice response (IVR) system API which returns TwiML, the only entity that should likely be allowed access in production is Twilio. This post will cover implementation of request signature validation in a Python IVR web application built with the Pyramid web framework.

Allowing Twilio Access To localhost

Of the many ways to create a public URL to serve TwiML to Twilio, ngrok is one of the simplest. ngrok will be used for this tutorial and is available here. Once ngrok is downloaded and unzipped, the following command should be run from the directory where it is located:

./ngrok http 0.0.0.0:8080

When the above command is run in the terminal, this output should appear:

ngrok terminal output
For this demo the HTTPS link will be used. Copy it, and we’ll paste it in the next section.

Setting Up The Twilio Account

To follow along with this demo a Twilio account is required, as the auth token associated with the account is an integral part of request validation. A Twilio account may be created here.
Once logged into a Twilio account, navigate to the phone numbers section and select the phone number that will be used for this demo. Then set the incoming webhook on its configuration page like so and ensure the trailing slash is included:

Twilio phone numbers

The “Common Stumbling Blocks” section below explains the trailing slash inclusion.
Proceed to the console dashboard to obtain the account’s auth token, which is circled below.

Twilio console dashboard

Setting Up And Serving The Demo App

Now it’s time to setup and serve the demo app with the following commands:

cd /tmp
curl -LOk https://github.com/patrickyevsukov/twilio_demo_pyramid_auth/archive/master.zip
unzip master.zip
mv twilio_demo_pyramid_auth* twilio_demo_pyramid_auth
cd twilio_demo_pyramid_auth
pip install virtualenv
virtualenv --clear venv
. ./venv/bin/activate
pip install -e .
export TWILIO_AUTH_TOKEN=XXXX
./tools/serve -c examples/config.ini

Note that the use of a TWILIO_AUTH_TOKEN environment variable is to prevent inclusion of this sensitive information in a config file or source code, where it may be accidentally version controlled. In production systems, a more robust solution for safely including the auth token in a Twilio app may be desirable.

The app should now be available to Twilio and calling the phone number configured in the previous section should result in the following output in the terminal serving the Pyramid app:

1970-01-01 00:00:00,001 INFO  [twilio_demo_pyramid_auth.security] Authenticating GET /
1970-01-01 00:00:00,002 INFO  [twilio_demo_pyramid_auth.security] Authentication SUCCESS
1970-01-01 00:00:00,003 INFO  [twilio_demo_pyramid_auth.security] Authenticating POST /
1970-01-01 00:00:00,004 INFO  [twilio_demo_pyramid_auth.security] Authentication SUCCESS
1970-01-01 00:00:00,005 INFO  [twilio_demo_pyramid_auth.security] Authenticating GET /accessible
1970-01-01 00:00:00,006 INFO  [twilio_demo_pyramid_auth.security] Authentication SUCCESS

For additional information on building applications in Python with the Twilio Voice API, see the quickstart documentation.

How To Spot A Fake Request

When the webhook configured to handle incoming phone calls receives a request from Twilio, the request header will contain an X-Twilio-Signature. This value, along with the account’s auth token, is all that is necessary to determine if a request actually originated from Twilio.
Note: The process by which Twilio generates this header value is delineated here.
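For context, here is a minimal sketch of that algorithm; the RequestValidator used later in this post implements it for us, so treat this as illustrative only:

import base64
import hashlib
import hmac

def compute_twilio_signature(auth_token, url, post_params):
    # Per Twilio's docs: append each POST variable name and value,
    # sorted alphabetically by name, to the full request URL, then
    # compute an HMAC-SHA1 over the result keyed with the account's
    # auth token, and base64-encode the digest.
    data = url + ''.join(k + post_params[k] for k in sorted(post_params))
    digest = hmac.new(auth_token.encode(), data.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()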
In determining if a request to a certain endpoint should be allowed, Pyramid takes two concepts into account: authentication and authorization. Authentication may be thought of as checking if the requester is who they claim to be, and authorization may be thought of as checking if the requester has permission to do what they are attempting.
Verifying request signatures is an authentication concern and should be handled via the Pyramid app’s authentication policy. Pyramid includes no authentication policies which cleanly allow for only validation of Twilio request signatures. They all center around a userid and facilitate the creation and usage of sessions. Pyramid’s AuthTktCookieHelper comes close to a good fit, but it provides a lot of extra functionality that is not necessary for simple, stateless request validation.
Thankfully, Pyramid makes it easy to define and include a custom authentication policy. A canonical authentication policy should implement the IAuthenticationPolicy interface and define all of its methods; however, doing so will result in a far more fully featured policy than is necessary for this demo.
Note: Twilio also supports HTTP basic authentication and Pyramid includes a BasicAuthAuthenticationPolicy out-of-the-box. This tutorial focuses only on digest authentication which may be preferable as it does not require the inclusion of credentials in every request URL where they may be susceptible to interception.
The custom TwilioSignatureAuthenticationPolicy defined for this demo contains an effective_principals method definition. This is the only method Pyramid requires for an authentication policy to be compatible with the ACLAuthorizationPolicy.
A Pyramid app will call the authentication policy’s effective_principals method for each HTTP request to one of its endpoints, making this a convenient location to include the Twilio RequestValidator.

import logging
import os

from pyramid import security as pyramid_security
from twilio.request_validator import RequestValidator

logger = logging.getLogger(__name__)

# The Twilio principal is a simple marker defined in the demo app's security
# module; the string below is assumed here so the snippet stands alone.
Twilio = 'twilio'


class TwilioSignatureAuthenticationPolicy(object):

    def _is_authentic_twilio_request(self, request):
        logger.info("Authenticating {} {}".format(request.method, request.path))

        twilio_auth_key = os.environ["TWILIO_AUTH_TOKEN"]
        request_validator = RequestValidator(twilio_auth_key)

        twilio_signature = request.headers.get("X-Twilio-Signature", "")
        is_authentic = request_validator.validate(
            request.url,
            request.POST,
            twilio_signature,
        )
        if is_authentic:
            logger.info("Authentication SUCCESS")
            return is_authentic

        logger.info("Authentication FAILURE")
        return is_authentic

    def effective_principals(self, request):
        principals = [pyramid_security.Everyone]

        if self._is_authentic_twilio_request(request):
            principals.append(Twilio)

        return principals

The access control list of the root context defined for this demo app’s views only allows the "view" permission to the Twilio security group:

from pyramid.security import Allow

# Twilio is the same principal used by the authentication policy above


class RootContext(object):

    __name__ = ""
    __parent__ = None

    def __init__(self, request):
        pass

    @property
    def __acl__(self):
        return (
            (Allow, Twilio, "view"),
        )

If the signature generated by the RequestValidator does not match the X-Twilio-Signature attached to the request, then the authentication policy will not include the Twilio principal in its list of effective principals and the requester will be denied access to any endpoint requiring “view” permission.

Common Stumbling Blocks

The URL passed to the RequestValidator.validate method must be identical to the one used by Twilio. Concerns such as trailing slashes and request URI schemes – either http or https in the case of Twilio – must be taken into account.
By default, the Pyramid request.url method will generate a URL ending in a trailing slash for the root URL, whereas ngrok displays the root URL without one:

ngrok terminal output

Copying and pasting this ngrok URL exactly as it appears will result in request validation failure, so ensure a trailing slash is appended when configuring the call handler webhook:

Twilio phone numbers
If the Pyramid application that has been configured to serve as the call handler sits behind a load balancer, or any service which terminates TLS, it is critical that these lines appear in the app’s configuration file. Failure to include them will result in calls to Pyramid’s various URL generation methods returning URLs with the incorrect scheme (i.e. http://ivr.example.com instead of https://ivr.example.com).

[app:main]
use = egg:twilio_demo_pyramid_auth
filter-with = prefix

pyramid.includes = pyramid_exclog

support_number = 000-000-0000

# The below filter config will ensure that the `X-Forwarded-Proto` header
# is respected when pyramid generates URLs. This is critical if your application
# sits behind an ELB which terminates TLS.
#
# docs.pylonsproject.org/projects/waitress/en/latest/#using-paste-s-prefixmiddleware-to-set-wsgi-url-scheme
#
[filter:prefix]
use = egg:PasteDeploy#prefix

Another point to remember for GET requests is that Twilio passes various query string parameters when hitting the webhook; for POST requests, however, this information resides in the request body. This may be a stumbling block for Pyramid users, as the widely used request.params property merges the query string params with the request body for convenience. Ensure that for validation purposes the request.POST property is used instead. This property will return an empty dictionary-like object on GET requests, avoiding interference with validation. Below is the behavior of request.POST during an HTTP GET:

>>> request.POST
<NoVars: Not a form request>

>>> type(request.POST)
<class 'webob.multidict.NoVars'>

>>> dict(request.POST)
{}

Error Handling

In the event that a problem arises with the API attempting to serve TwiML either due to request validation failure or some other reason, Twilio will read a stock error message to the caller and end the call. By default the caller hears:
“We’re sorry, an application error has occurred. Goodbye.”

It may be desirable to handle errors in a custom manner and, for example, forward callers to a call center in the event of an exception with the API. Pyramid provides a set of view decorators to make catching and handling errors fairly simple. In this example, custom TwiML will be returned and the caller will be redirected to a call center for assistance:

from pyramid import httpexceptions
from pyramid.view import (
    exception_view_config,
    forbidden_view_config,
    notfound_view_config,
    view_defaults,
)
from twilio.twiml.voice_response import VoiceResponse

# BaseViews is the demo app's own base view class, defined elsewhere in the repo


@view_defaults(
    renderer="xml",
)
class ExceptionViews(BaseViews):

    def _handle_exception(self):
        response = VoiceResponse()

        self.request.response.status_int = 200

        message = (
            "My apologies. We seem to be experiencing technical difficulties. "
            "You will now be redirected to our call center for assistance."
        )
        response.say(
            message,
            voice="woman",
            language="en",
        )
        response.dial(self.request.registry.settings["support_number"])

        return response

    @notfound_view_config()
    def notfound(self):
        return self._handle_exception()

    @forbidden_view_config()
    def forbidden(self):
        return self._handle_exception()

    @exception_view_config(httpexceptions.HTTPServerError)
    def exception(self):
        return self._handle_exception()

The above code defines views to handle 403, 404, and 500 errors. In the event one of these errors occurs, the custom message defined above will be read to the user. It is important that the status code is set to 200, as a failure to do so will result in Twilio ignoring the TwiML we send and defaulting to the stock error message.

Closing Notes

For additional information on building applications in Python with the Twilio Voice API, see the quickstart documentation.
If any issues are encountered while attempting this tutorial, please note them in the GitHub repo issue tracker so they can be resolved in a timely manner.
My contact information is available on https://patrick.yevsukov.com/ and
my GitHub profile is https://github.com/patrickyevsukov/.

How to Build A Boba Tea Shop Finder with Python, Google Maps and GeoJSON

If you plant me anywhere in Manhattan, I can confidently tell you where the nearest bubble tea place is located. This may be  because I have a lot of them memorized, but for the times my memory betrays me, luckily I have the boba map on my data blog. In this tutorial, we’ll use a combination of Python, the Google Maps API, and geojsonio to create what can only be described as the most important tool in the world: a boba map.

Environment & Dependencies

We have to set our environment up before we start coding. This guide was written in Python 3.6. If you haven’t already, download Python and Pip. Next, you’ll need to install several packages that we’ll use throughout this tutorial on the command line in our project directory:

pip3 install googlemaps==2.4.6
pip3 install geocoder==1.22.4
pip3 install geojsonio==0.0.3
pip3 install pandas==0.20.1
pip3 install geopandas==0.2.1
pip3 install Shapely==1.5.17.post1

We’ll use the Google Maps API, so make sure to generate an API key. Since we’ll be working with Python throughout, using the Jupyter Notebook is the best way to get the most out of this tutorial. Once you have your notebook up and running, you can download all the data for this post from Github. Make sure you have the data in the same directory as your notebook and then we’re good to go!

For this task, we’re going to take an object-oriented programming approach. We’ll create a class called BubbleTea to take care of the processing and methods we’ll need for our map. To accomplish this we’ll begin by using the googlemaps API module to initialize our authentication and pandas, a nice data analytics library, to read in the CSV.

import pandas as pd 
import googlemaps

class BubbleTea(object):

    # authentication initialized
    gmaps = googlemaps.Client(key='[your-own-key]')

    def __init__(self, filename):
        self.boba = pd.read_csv(filename)

In the code sample above, the googlemaps client is initialized as a class attribute, before the constructor, since the API key shouldn’t change between instances. In the constructor, however, we take the filename of the boba places as a parameter so we can use pandas to read it in as a DataFrame.

Just so that we know what we’re working with let’s take a look at the file containing bubble tea places:

import pandas as pd
pd.read_csv("./boba.csv").head()

                  Name                            Address
0            Boba Guys   11 Waverly Pl New York, NY 10002
1  Bubble Tea & Crepes    251 5th Ave, New York, NY 10016
2           Bubbly Tea   55B Bayard St New York, NY 10013
3            Cafe East  2920 Broadway, New York, NY 10027
4      Coco Bubble Tea   129 E 45th St New York, NY 10017

As you can see, it’s just a simple DataFrame containing two columns, one with the name of the bubble tea place and another one with its address.
To visualize each bubble tea place as a point on a map we have to convert the addresses into coordinates. Eventually, we’ll use these coordinates to create shapely Point geospatial objects.

Let’s review how these coordinates are obtained. Because we don’t have the latitude or the longitude we’ll use the geocoder and googlemaps modules to request the coordinates. Below you can see the API request with geocoder.google(). As a parameter, we provide the address which will be used to create the geospatial object. For this example I’ve used the address of a building at Columbia University.

import googlemaps
import geocoder

gmaps = googlemaps.Client(key='your-key')

geocoder.google("2920 Broadway, New York, NY 10027")

Which displays the following output:

<[OK] Google - Geocode [Alfred Lerner Hall, 2920 Broadway, New York, NY 10027, USA]>

This geospatial object has multiple attributes you can utilize. For the purpose of this tutorial, we’ll be using the lat and lng attributes.

geocoder.google("2920 Broadway, New York, NY 10027").lat
geocoder.google("2920 Broadway, New York, NY 10027").lng

Outputs:

40.8069421
-73.9639939

Let’s use the code we’ve reviewed above to add three columns to our boba CSV DataFrame: Latitude, Longitude, and Coordinates. This function will create the longitude and latitude columns and then use these columns to create the Point geospatial object with shapely, a library that lets us manipulate geometric objects.

import pandas as pd 
import geocoder 
import googlemaps
from shapely.geometry import Point
from geopandas import GeoDataFrame
from geojsonio import display


class BubbleTea(object):

    # authentication initialized
    gmaps = googlemaps.Client(key='your-key')

    # filename: file with list of bubble tea places and addresses
    def __init__(self, filename):
        # initalizes csv with list of bubble tea places to dataframe
        self.boba = pd.read_csv(filename)

    # new code here
    def calc_coords(self): 
        self.boba['Lat'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lat)
        self.boba['Longitude'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lng)
        self.boba['Coordinates'] = [Point(xy) for xy in zip(self.boba.Longitude, self.boba.Lat)]

The final step for this project is to visualize the geospatial data using geojsonio. But to use geojsonio, we need to convert the DataFrame above into geojson format. At first glance you may be worried since our original data was in a CSV format. Never fear, however, for we can convert this with a few lines of code. More specifically we’ll create three get methods for our visualize function to work.

The function get_geo returns the coordinates as a list:

import pandas as pd 
import geocoder 
import googlemaps
from shapely.geometry import Point
from geopandas import GeoDataFrame
from geojsonio import display


class BubbleTea(object):

    # authentication initialized
    gmaps = googlemaps.Client(key='your-key')

    # filename: file with list of bubble tea places and addresses
    def __init__(self, filename):
        # initalizes csv with list of bubble tea places to dataframe
        self.boba = pd.read_csv(filename)


    def calc_coords(self): 
        self.boba['Lat'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lat)
        self.boba['Longitude'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lng)
        self.boba['Coordinates'] = [Point(xy) for xy in zip(self.boba.Longitude, self.boba.Lat)]

    # new code below
    def get_geo(self):
        return(list(self.boba['Coordinates']))

The get_names() function returns the Name column as a series.

import pandas as pd 
import geocoder 
import googlemaps
from shapely.geometry import Point
from geopandas import GeoDataFrame
from geojsonio import display


class BubbleTea(object):

    # authentication initialized
    gmaps = googlemaps.Client(key='your-key')

    # filename: file with list of bubble tea places and addresses
    def __init__(self, filename):
        # initalizes csv with list of bubble tea places to dataframe
        self.boba = pd.read_csv(filename)


    def calc_coords(self): 
        self.boba['Lat'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lat)
        self.boba['Longitude'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lng)
        self.boba['Coordinates'] = [Point(xy) for xy in zip(self.boba.Longitude, self.boba.Lat)]


    def get_geo(self):
        return(list(self.boba['Coordinates']))


    # new code below
    def get_names(self):
        return(self.boba['Name'])

And finally, get_gdf converts all the data into a GeoDataFrame and returns it. This is where we utilize the two previous functions, since the first parameter requires a series (the names) and the geometry parameter requires a list (the coordinates).

from geopandas import GeoDataFrame
import pandas as pd 
import geocoder 
import googlemaps
from shapely.geometry import Point
from geojsonio import display


class BubbleTea(object):

    # authentication initialized
    gmaps = googlemaps.Client(key='your-key')

    # filename: file with list of bubble tea places and addresses
    def __init__(self, filename):
        # initalizes csv with list of bubble tea places to dataframe
        self.boba = pd.read_csv(filename)

    def calc_coords(self): 
        self.boba['Lat'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lat)
        self.boba['Longitude'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lng)
        self.boba['Coordinates'] = [Point(xy) for xy in zip(self.boba.Longitude, self.boba.Lat)]

    def get_geo(self):
        return(list(self.boba['Coordinates']))

    def get_names(self):
        return(self.boba['Name'])


    # new code below
    def get_gdf(self):
        crs = {'init': 'epsg:4326'}
        return(GeoDataFrame(self.get_names(), crs=crs, geometry=self.get_geo()))

Great! Now let’s use geojsonio for some boba fun! Now that we have all our helper functions implemented, we can use them to deploy our visualization with geojsonio’s display function.

from geopandas import GeoDataFrame
import pandas as pd 
import geocoder 
import googlemaps
from shapely.geometry import Point
from geojsonio import display


class BubbleTea(object):

    # authentication initialized
    gmaps = googlemaps.Client(key='your-key')

    # filename: file with list of bubble tea places and addresses
    def __init__(self, filename):
        # initalizes csv with list of bubble tea places to dataframe
        self.boba = pd.read_csv(filename)

    def calc_coords(self): 
        self.boba['Lat'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lat)
        self.boba['Longitude'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lng)
        self.boba['Coordinates'] = [Point(xy) for xy in zip(self.boba.Longitude, self.boba.Lat)]

    def get_geo(self):
        return(list(self.boba['Coordinates']))

    def get_names(self):
        return(self.boba['Name'])

    def get_gdf(self):
        crs = {'init': 'epsg:4326'}
        return(GeoDataFrame(self.get_names(), crs=crs, geometry=self.get_geo()))


    # new code here
    def visualize(self):
        self.boba['Coordinates'] = [Point(xy) for xy in zip(self.boba.Longitude, self.boba.Lat)]
        updated = self.get_gdf()
        display(updated.to_json())

And we’ve done it, our BubbleTea object is finished and ready to be used. Continuing on in the same Jupyter notebook, we’ll use the code we just created to build the map. We initialize the class with our boba file in a new Jupyter notebook cell. Remember, this only initializes what’s in the constructor, so as of now we only have a pandas DataFrame; the GeoDataFrame has not yet been created.

boba = BubbleTea("./boba.csv")

Next we call the calc_coords method. Recall that this function makes API calls to Google Maps for the latitude and longitude and then takes these two columns to convert to a shapely Point object.

Because of the many Google Maps API calls, expect this to take a while. In the meantime, spend some time reading this awesome post.

boba.calc_coords()

The longest part is over! Now we’re ready for our awesome boba map:

boba.visualize()

This gets us a beautiful interactive map we can then use for whatever purposes. I chose to include it in a Github Gist, but geojsonio has a lot of different ways of sharing your content so feel free to choose whatever fits your needs.

And that’s a wrap! If you’d like to learn more about geospatial analysis, check out the following resources:
GeoJSON 
OpenStreetMap 
CartoDB

If you liked what you did here, follow me @lesleyclovesyou on Twitter for more content, data science ramblings, and most importantly, retweets of super cute puppies.

Never Forget A Friend’s Birthday with Python, Flask and Twilio

Have you ever forgotten a friend’s birthday? It happens to the best of us. After the frustration of checking Facebook every day for birthdays, I wanted a better push notification system with better filters.

I wrote an article, Building a Simple Birthday App with Flask-SQLAlchemy, showing a way to export your Facebook birthday calendar to an .ics file and import it into a DB with Flask and Flask-SQLAlchemy.

After talking to Twilio at PyCon we thought it would be cool to extend this app by adding SMS notifications and a possibility to send birthday messages via SMS, so I signed up for a Twilio account and got coding.  

In this post we will build a simple birthday app with Python, Flask and Twilio’s Programmable SMS service API so you never miss a birthday again: 

app-printscreen
The complete code for this project is on GitHub.

Setup instructions

Start by cloning the Git repository:

$ git clone https://github.com/pybites/bday-app

Next, make a virtual environment and install the dependencies:

$ cd bday-app
$ python3 -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt

Create a Twilio account, get a phone number, and grab your API key (SID) and auth token.

Copy the settings template in place:

$ cp env-example.conf env.conf

Update the env.conf environment variables file with the correct settings:

  • flask – secret = set this to a hard to guess string (Flask docs)
  • twilio_api – sid token = obtained in step 3
  • phones – twilio = obtained in step 3
  • phones – admin = your (verified) mobile phone number, where you want to receive notification messages
  • login – user password = your login credentials for the Flask app
  • server – url = unchanged if running locally, update to base URL if deployed elsewhere

NOTE: make sure you use E.164 number formatting for phone numbers (e.g. +34666555444, +442071838750). See Twilio’s support article: Formatting International Phone Numbers.

The app uses configparser to load these settings in. 
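
Loading these settings looks roughly like this (a sketch; the section and option names are assumed from the list above):

from configparser import ConfigParser

config = ConfigParser()
config.read('env.conf')

# section/option names assumed from the settings list above
twilio_sid = config.get('twilio_api', 'sid')
twilio_token = config.get('twilio_api', 'token')
admin_phone = config.get('phones', 'admin')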

Import your FB birthday calendar into the local SQLite database:

  • Export your birthday calendar from Facebook and save it as cal.ics in the app’s top level directory.
  • Run model.py to import the birthdays into the DB. Here I am using the -t option to strip out real names using Faker. For real use drop the -t:

$ python model.py -t

import_birthdays.gif

How to run it

Make sure you have your virtualenv enabled. This app has two parts: a Flask front-end and a daily cron script on the backend.

The front-end is a Flask app which you can invoke with: python app.py.
At this point go to 127.0.0.1:5000 and you can see your friends’ birthdays. As we’ll see in a bit, you need to add a phone number for each friend you want to receive a notification for.

The notifier is a daily cron script that checks for active birthdays with a phone number set. It sends SMS notifications to the admin phone you configured in your settings. This needs to run in the background so use a tool like nohup:

$ nohup ./notify.py &

How the Flask App Works

All Flask app code is in app.py. As you probably want to host this in the cloud, authentication is there from the start.

This is done by adding the login_required decorator to all private routes. When you go to the app you first have to login. I am using ngrok here to test the local app:

nfbd-login.png
Once logged in, you can see your imported friends and their birthdays. Some simple CSS emphasizes today’s birthday(s):

nfbd-current-bday.png

You can navigate friends by tabs and via the search box:

nfbd-navigate.png

The app uses Flask-SQLAlchemy to interface with the database, with a single table to store the birthdays. The model is defined in model.py which we used before to load in the data. Here is the part that defines the table structure:

class Birthday(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(120))
    bday = db.Column(db.DateTime)
    phone = db.Column(db.String(20))
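
As a quick illustration, adding a friend by hand would look like this (a sketch assuming the module layout from the repo):

from datetime import datetime

from model import db, Birthday  # assumed module layout

friend = Birthday(name='Ada', bday=datetime(1990, 12, 10), phone='+34666555444')
db.session.add(friend)
db.session.commit()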

Adding Phone Numbers

To avoid numerous SMS messages, the cron job only looks at friends with a phone number set. Clicking the pencil button at the right, you get a form to add a phone number:

nfbd-add-phone.png
NOTE: to be able to send messages to your friends using the free Twilio version, you need to verify their phone numbers first. You can lift this limitation by upgrading your account. You can find more info here.
Various validation rules are set up in the corresponding /friends/<int:friendid> route. We cannot have the same phone number twice, for example.

With the phone number added you will get an SMS when it’s their birthday.

nfbd-with-phone.png

Notifications

The cron job is coded in notify.py using the schedule package to notify about birthdays occurring that day.

It queries the DB using Flask-SQLAlchemy getting the birthdays for the current day:

def job():
    bdays = Birthday.query.filter(and_(
            extract('day', Birthday.bday) == TODAY.day,
            extract('month', Birthday.bday) == TODAY.month,
            Birthday.phone != None)).all()  # noqa E711
    ...

Then, it sends an SMS to your configured admin phone. Twilio makes this very easy as you can see in sms.py:

def send_sms(message, media=None, to_phone=ADMIN_PHONE):
    message = CLIENT.messages.create(
        from_=FROM_PHONE,
        to=to_phone,
        body=message,
        media_url=media,
    )
    return message.sid

Running the cron job manually:

nfbd-sms-notify.png

As this will run on a remote server, the current version of the app uses logging instead of printing to stdout. As stated before, you want to use nohup to run it in the background. The schedule module does the rest:

...

schedule.every().day.at('00:05').do(job)

while True:
    schedule.run_pending()
    time.sleep(1)

And here is the notification SMS for today’s birthday:

nfbd-notification-sms.png

The link is the entry point into the Flask app to send a birthday message or SMS card which we will see in the next section.

Send Birthday Messages and Cards

This is the second part of the app. There are two ways to get to the send message feature: follow the link in the SMS, or click the phone icon at the left of your friend’s name from the homepage.

You are presented with the following form:

nfbd-send-msg.png

Some validation is in place to only be able to send a message to the person that has a birthday:

nfbd-only-for-bday.png
You can now send a text message and include an image link:

nfbd-include-img-link.png
If you include an image link, the Pillow library is used to put the text on top of the image making a simple birthday card. The code for this is in the text_on_image.py module.

When you click verify you can check if you are happy with the result:

nfbd-confirm.png

If so you can send the card via MMS:
nfbd-sent.png

Gary would receive an SMS like this:

nfbd-card-sms.png

This is on an iPhone. On Android it did not display the image inline. For this reason it sends the text alongside the image.

Conclusion and Learning

Having a birthday app managed by SMS is cool. Twilio’s API makes it very easy. The Pillow image integration was the hardest part of building this app.

Splitting the code into various modules helped and will help manage complexity. Moving forward it would be good to add a regression test suite and possibly automate end-to-end testing with a tool like Selenium.

And finally, nothing beats building practical apps when it comes to honing your programming skills. Apart from the Twilio API, building this app taught me more Flask and how to integrate various interesting modules.

Practice Yourself

Join our weekly PyBites code challenge to build your own automated texting app and other awesome applications.

Feel free to reach out if you have any questions or comments:

Contact info

I am Bob Belderbos from PyBites, you can reach out to me by:

How to Receive and Respond to Text Messages in Python with Django and Twilio

You’re building a Django app and want to be able to respond to SMS messages? Let’s walk through the well written Django tutorial and add Twilio SMS to the canonical basic Django app.

Setting up your environment

Before moving on, make sure to have your environment setup. Getting everything working correctly, especially with respect to virtual environments is important for isolating your dependencies if you have multiple projects running on the same machine.

You can also run through this guide to make sure you’re good to go before moving on.

Installing dependencies

Now that your environment is set up, you’re going to need to install the libraries needed for this app. The code in this post will run on both Python 2 and 3. We’re going to use Django and the Twilio Python helper library.

Navigate to the directory where you want this code to live and run the following command in your terminal with your virtual environment activated to install these dependencies:

pip install django==1.11.5 twilio==6.7.0

Building a basic Django app

First, generate a barebones Django starter project with the following terminal command in the directory where you want your project to live:

django-admin startproject mysite

This will generate a new Django project from scratch and create a mysite directory in your current directory. Before moving on, run the following command if you want to make sure everything works (don’t worry if you see an error message about unapplied migrations):

cd mysite
python manage.py runserver

Visit http://localhost:8000 and you should see the following:

Screen Shot 2017-10-10 at 1.16.13 PM.png
Now create a Django app within the project. Instead of the polls web app from the Django tutorial, we will make one that responds to text messages. Run the following command from the same directory as manage.py:

python manage.py startapp sms

As in the Django tutorial, let’s create a view for this app. If you’re familiar with the Model-View-Controller pattern, Django views might confuse you at first because they behave more similarly to Controllers in the traditional MVC app. Open sms/views.py and add the following code:

from django.http import HttpResponse


def sms_response(request):
    return HttpResponse("Hello, world.")

Now add a route to this view by creating the file, sms/urls.py and adding the following code:

from django.conf.urls import url

from . import views


urlpatterns = [
    url(r'^$', views.sms_response, name='sms'),
]

As in the Django tutorial, the next step is to point the root url at the sms.urls module. In mysite/urls.py, add an import for django.conf.urls.include and insert an include() in the urlpatterns list, so you have:

from django.conf.urls import include, url
from django.contrib import admin


urlpatterns = [
    url(r'^sms/', include('sms.urls')),
    url(r'^admin/', admin.site.urls),
]

Now run the project and visit http://localhost:8000/sms/ to see “Hello, World.” on the screen.

Setting up your Twilio account

Before being able to respond to messages, you’ll need a Twilio phone number. You can buy a phone number here (it’s free if you’re using the number to test your code during development). Your Django app will need to be visible from the Internet in order for Twilio to send requests to it. We will use ngrok for this, which you’ll need to install if you don’t have it. In your terminal run the following command:

ngrok http 8000

Screen Shot 2017-10-03 at 4.47.29 PM.png

This provides us with a publicly accessible URL to the Django app. Configure your phone number as seen in this image by adding your ngrok URL with a /sms/ route appended to it to the “Messaging” section (make sure you don’t forget the extra slash at the end):

Screen Shot 2017-10-03 at 4.49.02 PM.png

You are now ready to receive a text message to your new Twilio number.

Adding SMS to your Django app

Now that you have a Twilio number you want to allow users to send a text message to it and get a response.

We only need one route on this app: /sms/ to handle incoming text messages. Let’s replace the code in sms/views.py with the following:

from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt

from twilio.twiml.messaging_response import MessagingResponse


@csrf_exempt
def sms_response(request):
    # Start our TwiML response
    resp = MessagingResponse()

    # Add a text message
    msg = resp.message("Check out this sweet owl!")

    # Add a picture message
    msg.media("https://demo.twilio.com/owl.png")

    return HttpResponse(str(resp))
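
When Twilio requests this view, the response body is plain TwiML. It should look roughly like this (formatted here for readability; the actual response is a single line):

<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Message>Check out this sweet owl!
    <Media>https://demo.twilio.com/owl.png</Media>
  </Message>
</Response>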

If you’re confused by the @csrf_exempt decorator, their documentation explains things pretty well.

Before being able to respond to messages, you’ll have to add ngrok.io to your ALLOWED_HOSTS in your Django site’s settings. Open mysite/settings.py, find the line where the ALLOWED_HOSTS are set and replace it with the following:

ALLOWED_HOSTS = ['.ngrok.io']

Run your code again:

python manage.py runserver

Now text your Twilio number and you should get a response!

What just happened?

With this app running on port 8000, sitting behind a public ngrok URL, Twilio can see your application. Upon receiving a text message:

  1. Twilio will send a POST request to /sms/.
  2. The sms_response view function will be called.
  3. Your view responds to Twilio’s request with TwiML telling Twilio to send a message back in response (containing this sweet owl pic).

If you want to learn more thoroughly about Django and Twilio SMS check out the Server Notifications tutorial or Appointment Reminders in Django.

Feel free to reach out if you have any questions or comments or just want to show off the cool stuff you’ve built.

Basic Statistics in Python with NumPy and Jupyter Notebook

While not all data science relies on statistics, a lot of the exciting topics like machine learning and analysis rely on statistical concepts. In this tutorial, we’ll learn how to calculate introductory statistics in Python.

What is Statistics?

Statistics is a discipline that uses data to support claims about populations. These “populations” are what we refer to as “distributions.” Most statistical analysis is based on probability, which is why these pieces are usually presented together. More often than not, you’ll see courses labeled “Intro to Probability and Statistics” rather than separate intro to probability and intro to statistics courses. This is because probability is the study of random events, or the study of how likely it is that some event will happen.

Environment Setup

Let’s use Python to show how different statistical concepts can be applied computationally. We’ll work with NumPy, a scientific computing module in Python.
This guide was written in Python 3.6. If you haven’t already, download Python and Pip. Next, you’ll need to install the numpy module that we’ll use throughout this tutorial:

pip3 install numpy==1.12.1
pip3 install jupyter==1.0.0

Since we'll be working with Python interactively, using Jupyter Notebook is the best way to get the most out of this tutorial. You already installed it with pip3 up above, now you just need to get it running. Open up your terminal or command prompt and enter the following command:

jupyter notebook

And BOOM! It should have opened up in your default browser. Now we’re ready to go.

A Quick Note on Jupyter

For those of you who are unfamiliar with Jupyter notebooks, I've provided a brief review of the functions that will be particularly useful as you move along with this tutorial.
In the image below, you’ll see three buttons labeled 1-3 that will be important for you to get a grasp of — the save button (1), add cell button (2), and run cell button (3).

The first button is the button you'll use to save your work as you go along (1). I won't give you directions as to when you should do this — that's up to you!
Next, we have the "add cell" button (2). Cells are blocks of code that you can run together. These are the building blocks of Jupyter Notebook because they provide the option of running code incrementally without having to run all your code at once. Throughout this tutorial, you'll see lines of code blocked off — each one should correspond to a cell.
Lastly, there's the "run cell" button (3). Jupyter Notebook doesn't automatically run your code for you; you have to tell it when by clicking this button. As with the add button, once you've written each block of code in this tutorial into a cell, you should run it to see the output (if any). Where output is expected, it's also shown in this tutorial so you know what to expect. Make sure to run your code as you go along because many blocks of code in this tutorial rely on previous cells.

Descriptive vs Inferential Statistics

Generally speaking, statistics is split into two subfields: descriptive and inferential. The difference is subtle, but important. Descriptive statistics refer to the portion of statistics dedicated to summarizing a total population. Inferential statistics, on the other hand, allows us to make inferences about a population from a sample. Unlike descriptive statistics, inferential statistics are never 100% accurate because they are computed from a sample rather than the total population.

Descriptive Statistics

Once again, to review, descriptive statistics refers to the statistical tools used to summarize a dataset. One of the first operations often used to get a sense of what a given dataset looks like is the mean operation.

Mean

You know what the mean is; you've heard it every time your computer science professor handed your midterms back and announced that the average, or mean, was a disappointingly low 59. Whoops.
With that said, the “average” is just one of many summary statistics you might choose to describe the typical value or the central tendency of a sample. You can find the formal mathematical definition below. Note that μ is the symbol we use for mean.
\mu = \frac{1}{n} \sum_{i=1}^{n} x_i
Computing the mean isn’t a fun task, especially if you have hundreds, even thousands or millions of data points to compute the mean for. You definitely don’t want to do this by hand, right?
Right. In Python, you can either implement your own mean function, or you can use NumPy. We’ll begin with our own implementation so you can get a thorough understanding of how these sorts of functions are implemented.
Below, t is a list of data points. Each of the elements in that list is one of the x_i's in the equation above, which expresses the mean as a summation of those values. In Python, that summation is the built-in function sum(). From there, we take care of the 1/n by dividing the summation by the total number of points, which we get from the built-in function len.

def mean(t):
    return(float(sum(t)) / len(t))
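
A quick sanity check with a small list:

mean([1, 4, 3, 2, 6, 4, 4, 3, 2, 6])   # 3.5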

Luckily, Python developers before us knew how often the mean needs to be computed, so NumPy already provides this function in its package. Just like our function above, NumPy's mean function takes a list of elements as an argument.

import numpy as np
np.mean([1,4,3,2,6,4,4,3,2,6])

Returns the output:

3.5

Variance

In the same way that the mean is used to describe the central tendency, variance is intended to describe the spread.
\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2
The x_i − μ term is called the "deviation from the mean", so the variance is the mean of the squared deviations. Its square root, σ, is called the standard deviation.
Using the mean function we created above, we’ll write up a function that calculates the variance:

def var(t, mu):
    dev2 = [(x - mu)**2 for x in t]
    var = mean(dev2)
    return(var)
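
Since var takes the mean as its second argument, it pairs naturally with the mean function we wrote above:

t = [1, 3, 3, 6, 3, 2, 7, 5, 9, 1]
print(var(t, mean(t)))   # 6.4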

Once again, you can use built in functions from NumPy instead:

print(np.var([1,3,3,6,3,2,7,5,9,1]))

Returns:

6.4

Distributions

Remember those “populations” we talked about before? Those are distributions, and they’ll be the focus of this section.
While summary statistics are concise and easy, they can be dangerous metrics because they obscure the data. An alternative is to look at the distribution of the data, which describes how often each value appears.

Histograms

The most common representation of a distribution is a histogram, which is a graph that shows the frequency or probability of each value.
Let’s say we have the following list:

t = [1,2,2,3,1,2,3,2,1,3,3,3,3]

To get the frequencies, we can represent this with a dictionary:

hist = {}
for x in t:
    hist[x] = hist.get(x,0) + 1
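
As an aside, the standard library can build the same frequency table for you; collections.Counter behaves like the dictionary above:

from collections import Counter

hist = Counter(t)   # Counter({3: 6, 2: 4, 1: 3})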

Now, if we want to convert these frequencies to probabilities, we divide each frequency by n, where n is the size of our original list. This process is called normalization.

n = float(len(t))
pmf = {}
for x, freq in hist.items():
    pmf[x] = freq / n
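
Printing pmf for our list t shows probabilities that sum to 1:

print(pmf)   # {1: 0.230..., 2: 0.307..., 3: 0.461...}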

Why is normalization important? You might have heard this term before. To normalize your data is to consider it in context. Let's take an example:
Let's say we have a comma-delimited dataset that contains the names of several universities, the number of students, and the number of professors.

Dartmouth, 5000 students, 300 professors
Columbia, 11000 students, 500 professors
Brown, 8000 students, 400 professors
Cornell, 16000 students, 650 professors

You might look at this and say, "Woah, Cornell has so many professors!" And while 650 is more than the number of professors at the other universities, when you take into consideration the much larger number of students, you'll realize that Cornell's professor count isn't actually as impressive as it first looks.

So how can we account for the number of students? This is what we refer to as normalizing a dataset. In this case, normalizing means dividing each university's number of students by its number of professors, which gets us:

Dartmouth: 16.67
Columbia: 22
Brown: 20
Cornell: 24.61

It turns out that Cornell actually has the worst student-to-professor ratio. While it seemed like the best because of its higher number of professors, the fact that those professors have to handle so many more students tells a different story.
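
As a quick check, you can compute those ratios in a few lines:

schools = {'Dartmouth': (5000, 300), 'Columbia': (11000, 500),
           'Brown': (8000, 400), 'Cornell': (16000, 650)}
for name, (students, professors) in schools.items():
    print(name, round(students / professors, 2))   # matches the table above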

This normalized histogram is called a PMF, “probability mass function”, which is a function that maps values to probabilities.

As we mentioned previously, it's common to make wrong assumptions based on summary statistics when they're used in the wrong context. Statistical concepts like PMFs provide a much more accurate view of what a dataset's distribution actually looks like.

That’s the Very Basics of Stats, Folks

I could go on forever about statistics and the different ways in which NumPy serves as a wonderful resource for anyone interested in data science. While the concepts we reviewed might seem trivial, they can be expanded into powerful topics in predictive analysis.

If you liked what we did here, follow @lesleyclovesyou on Twitter for more content, data science ramblings, and most importantly, retweets of super cute puppies.


Building TwilioQuest with Twilio Sync, Django, and Vue.js


TwilioQuest is our developer training curriculum disguised as a retro-style video game. While you learn valuable skills for your day job, you get to earn XP and wacky loot to equip on your 8-bit avatar.

Today we’ll pull back the curtain and show the code that the Developer Education team wrote to create TwilioQuest.

Meet Wagtail, a Python & Django Based CMS

TwilioQuest is full of content. A lot of content.

There are missions for nearly all of Twilio’s products, with each mission containing many different objectives. To manage all this content, we needed a content management system (CMS). Luckily, the Twilio documentation site is built on a Python & Django-based CMS called Wagtail, so we already had a tool we were familiar with and ready to build on.

We did have a few experienced Python & Django developers on the team, but others were completely new to the stack (such as your humble author, a .NET developer for 15 years). Wagtail looks like a fairly typical CMS on the surface, complete with all the user-friendly content editing features one comes to expect from a professional grade CMS. However, underneath the hood, it is a developer’s delight.

In Wagtail, you expose your content types via standard Django models. The killer feature, however, is StreamFields, which allow you to compose various "blocks" of content in any conceivable combination. Blocks can be baked-in things such as a rich text editor, or you can build your own blocks (like we did!) for things like code samples or standard design elements.

Here’s an example block we use for adding a “warning” or “danger” box to any page.

from wagtail.wagtailcore import blocks

class WarningDangerBlock(blocks.StructBlock):
   """A custom block type for displaying a Warning or a Danger message."""

   LEVEL_WARNING = 'warning'
   LEVEL_DANGER = 'danger'
   LEVELS = (
       (LEVEL_WARNING, 'Warning'),
       (LEVEL_DANGER, 'Danger')
   )

   level = blocks.ChoiceBlock(required=True, choices=LEVELS, default=LEVEL_WARNING)
   text = blocks.RichTextBlock(icon='edit', template='core/richtext.html')

   class Meta:
       icon = 'warning'
       label = 'Warning Danger Box'
       template = 'core/warning_danger.html'

Sprinkle in Some Django REST Framework

We wanted to build TwilioQuest as a single page application (SPA) to enhance the game’s responsiveness. Our team knew we needed a flexible, JSON-based REST API to pair with the SPA that would work with our Django models on the backend.

We selected the popular Django REST Framework (DRF). DRF is built to work with Django and uses Django models and views to expose API endpoints. Below is an example view.

class CharacterDetailView(generics.RetrieveUpdateAPIView):

   queryset = models.Character.objects.all()
   serializer_class = serializers.CharacterSerializer
   permission_classes = (
       permissions.BetaFeaturePermission,
       permissions.CharacterPermission
   )
   authentication_classes = (CsrfExemptSessionAuthentication,)

Notice the reference to a serializer_class. Think of DRF’s serializers as similar to Django’s forms. The serializer is responsible for creating a JSON representation of your model as well as for validating incoming data. Here’s what a serializer looks like.

class CharacterSerializer(serializers.HyperlinkedModelSerializer):
   equipped_items = serializers.SerializerMethodField()
   experience_points = serializers.SerializerMethodField()
   items_url = SubResourceListUrlField(additional_url_fields=('character_id', ),
                                       view_name='character-items', read_only=True)
   missions_url = SubResourceListUrlField(additional_url_fields=('character_id', ),
                                          view_name='character-missions', read_only=True)
   rank = serializers.SerializerMethodField(read_only=True)
   getting_started_credit_eligible = serializers.SerializerMethodField(read_only=True)

   class Meta:
       model = models.Character
       fields = ('url', 'id', 'username', 'display_name', 'public_profile',
                 'avatar_image', 'experience_points', 'equipped_items', 'items_url',
                 'missions_url', 'theme', 'rank')

   def get_equipped_items(self, character):
       items = models.CharacterItem.objects.filter(
           character_id=character.id, equipped_slot__isnull=False)
       serializer = CharacterItemSerializer(
           instance=items, many=True, context=self.context)
       return serializer.data

   def get_experience_points(self, character):
       return character.get_experience_points()

   def get_rank(self, character):
       return {'name': character.rank.name, 'image': character.rank.image}

DRF can handle automatic serialization of most data types that are part of your model, but you can override this handling as well as provide computed data elements.
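
For completeness, hooking a view like CharacterDetailView into Django's URL configuration takes only a couple of lines. This is a sketch; the app module and URL pattern here are hypothetical, not TwilioQuest's actual routing:

from django.conf.urls import url

from quest import views  # hypothetical app module

urlpatterns = [
    url(r'^api/characters/(?P<pk>[0-9]+)/$',
        views.CharacterDetailView.as_view(),
        name='character-detail'),
]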

Working with DRF isn't all roses, however. You have to do some spelunking in the DRF docs to figure out how to implement tasks that look like they should be straightforward. Still, there was never a scenario that DRF couldn't handle after some digging into the docs and tweaking our code.

Build an 8-bit Game Frontend with Vue.js

We selected Vue.js for the frontend of the Single Page App. There are a lot of great frameworks out there, and we had some on-team experience with Angular and React, but we selected Vue.js after prototyping a few versions of the game.

Vue.js has been billed as a lighter weight SPA framework and we mostly found that to be true. It is simple to get started with and there are many concepts that will translate for developers coming from other frameworks.

Putting together a frontend toolchain is always challenging. (Yak shaving, anyone?) This has given rise to many of the frameworks providing a CLI tool to help you scaffold new apps. Vue.js has a CLI but we ended up not using it as we wanted to use some of the same tools that we were familiar with in building the frontend for the Twilio docs.

For our toolchain, we landed on grunt, browserify, babel, karma and jshint. We use Vue.js's single-file components. Within each .vue file, we use pug for our templates and ES2015 for the JavaScript code. Below is an example of one of these component files.

<template lang="pug">
 transition(name="popup")
   .congratulation(v-if="mission_status && mission_status.completed && !congratsDismissed")
     .star
     p Congratulations! You have successfully completed this objective!
     p You have earned {{ objective.experience_points }} XP
     p
       a.button(v-on:click="congratsDismissed = true") OK!
</template>

<script>
module.exports = {
 name: 'mission-objective',
 props: {
   mission: null,
   objective: null,
   parent_completed: false
 },
 data() {
   return {
     isOpen: false
   };
 },
 computed: {
   objectiveRoute() {
     return {
       name: 'mission_objective',
       params: {
         missionId: this.mission.id,
         objectiveId: this.objective.id
       }
     };
   },
   prerequisiteRoute() {
     return {
       name: 'mission_objective',
       params: {
         missionId: this.mission.id,
         objectiveId: this.objective.prerequisite.id
       }
     };
   },
   locked() {
     const self = this;
     return self.objective.prerequisite && !self.objective.prerequisite.completed;
   }
 },
 methods: {
   getItemClasses() {
     const self = this;
     return {
        'objective-list__item--in-person': !self.objective.autocomplete,
       'is-completed': self.objective.completed,
       'is-locked': self.locked,
       'is-selected': self.isOpen
     };
   },
   openAccordion() {
     this.isOpen = !this.isOpen;
   }
 }
};
</script>

We also initially placed SCSS in the component files, but soon pulled the SCSS out into separate files in order to facilitate our theming feature. You can choose between a dark and a light theme in-game. Spoiler alert: soon we'll add a "boss mode" to make TQ look like a regular productivity app.

We’ve been quite happy with Vue.js and the stack we chose. Our team has yet to run into something that was too awkward to implement within Vue’s recommended practices.

Sync Those Real-Time Events

A lot of actions in TwilioQuest happen asynchronously. For example, when you provide a phone number to check the TwiML for a mission objective, TwilioQuest has to:

  1. look up the phone number in Twilio's internal service called "yellow pages" to find your webhook URL,
  2. invoke another Twilio service to make a call to your webhook handler, and
  3. analyze the response from your webhook to ensure it matches the victory conditions for the mission objective.

The phone number is sent via an AJAX call to the TwilioQuest REST API, but the API returns a 204 (No Content) response immediately and the verification process happens asynchronously in the background. How does the Vue.js app know when the verification is complete? How does it know if it succeeded or failed (and, if it failed, why)?

These situations are where having a real-time state synchronization service in our product suite comes in handy. Twilio Sync allows us to keep a single JSON document object per player that we can use to push data from the server to the browser in real time over a WebSocket connection.

In addition, we use a global Sync list object for our TwilioQuest Scoreboard that we use at events such as hackathons and Superclass. We can update the scoreboard from the Python server code like so:

list_id = 'Leaderboard'

try:
   sync_list = sync_service.sync_lists(list_id).fetch()
except TwilioRestException:
   sync_list = sync_service.sync_lists.create(list_id)

data = {
   'event_name': 'completion',
   'character': {
       'id': objective.character_id,
       'display_name': objective.character.display_name,
       'username': objective.character.username,
       'avatar_url': '/quest/avatar/{}'.format(objective.character.username),
       'experience_points': objective.character.get_experience_points()
   } if objective.character.public_profile else None,
   'mission': {
       'id': objective.mission_objective.mission_id,
       'title': objective.mission_objective.mission.title,
       'icon':
           objective.mission_objective.mission.icon.get_rendition('original').url
   },
   'mission_objective': {
       'id': objective.mission_objective_id,
       'title': objective.mission_objective.title,
       'experience_points': objective.mission_objective.experience_points
   }
}
sync_list.sync_list_items.create(data=json.dumps(data))

Our scoreboard HTML page uses JavaScript to monitor this list to update the scoreboard.

// Wire up anonymous Sync list watcher
utils.ajax('/quest/api/sync/token-anon/', {
 success: (data) => {
   self.syncClient = new Twilio.SyncClient(data.token);
   if(self.syncClient) {
     self.syncClient.list('Leaderboard').then((list) => {
       self.syncList = list;
       self.syncList.on('itemAdded', (item) => {
         self.updateLeaderboard(item.data.value);
       });
     });
   }
 }
});

All instances of the scoreboard web page are instantly notified whenever someone completes a mission objective.

Going Serverless with Twilio Functions

Our Developer Education team also wanted to figure out how to incorporate Twilio Functions because serverless is what all the cool kids are doing.

We added the ability for anyone in Twilio to extend TwilioQuest via webhooks (using the same Twilio infrastructure that invokes your webhooks for phone calls and SMS messages). This provided flexibility to do some interesting things down the road by just writing a few lines of JavaScript in the Twilio Console.

One interesting integration we added was our @TwilioQuest Twitter feed. Every time someone (with a public profile) completes a mission objective, the Django app fires the webhooks. We wrote the Twilio Function below to automatically send out a congratulatory tweet.

const Twitter = require('twitter');

exports.handler = function(context, event, callback) {
  const client = new Twitter({
    consumer_key: context.twitter_consumer_key,
    consumer_secret: context.twitter_consumer_secret,
    access_token_key: context.twitter_access_token_key,
    access_token_secret: context.twitter_access_token_secret
  });

  let userName = event.CharacterDisplayName
  if(event.CharacterTwitterUsername) {
    if(event.CharacterTwitterUsername.startsWith('@')) {
      userName = event.CharacterTwitterUsername
    } else {
      userName = `@${event.CharacterTwitterUsername}`
    }
  }

  const objectiveName = event.ObjectiveTitle
  const userExperiencePoints = event.CharacterExperiencePoints
  const missionName = event.MissionTitle
  const tweetBody = `${userName} just completed '${objectiveName}' in the ${missionName} mission. You're up to ${userExperiencePoints} XP!`

  client.post('statuses/update', {status: tweetBody}, (error, tweet, response) => {
    if(error) {
      console.log(error);
      callback(error)
    } else {
      callback()
    }
  });
}

Those items we're pulling out of the context variable are secrets that we configure in the Environment Variables for our Functions. On the same configuration page, we also added the twitter v1.7.1 npm package.

What’s Next for TwilioQuest

We are incorporating Behave with Selenium and Sauce Labs to build up a comprehensive integration test suite. The hope for this is to eliminate the rounds of manual testing that are normally required after refactoring the code or introducing other impactful changes.

We are anxiously looking forward to feedback from the community to learn what we should build next!

The Team

TwilioQuest began as the vision of Kevin Whinnery, who now works with the Twilio.org team helping to pair nonprofits with developers. Kevin's vision was furthered by the Developer Education team at Twilio, consisting of Developer Educators Andrew Baker, Paul Kamp, Kat King, Jen Aprahamian, and, yours truly, David Prothero. Supporting our development efforts was our extended engineering team, based in Quito, Ecuador: Wellington Mendoza, Hector Ortega, Samuel Mendes, Jose Oliveros, Agustin Camino, and Orlando Hidalgo.

It’s an amazing team and I am fortunate to be a part of it. Beyond the team, so many colleagues within Twilio (Twilions) helped out with TwilioQuest. From the Platform team (the team that brings you 5 9’s of Twilio API availability) who helped us find all the right internal services where we needed to interface, to the product managers who tested out the various missions for their products, to the devangelists who brought TwilioQuest to meetups, hackathons, and conferences to get valuable community feedback, we are indebted to so many amazing people.

Of course, we have to also call out the retro artwork. Credit for the amazing pixel art for the avatars and loot goes to Kevin Whinnery and Luiggi Hidalgo. The retro design and TwilioQuest logo were designed by Jamie Wilson, Sean McBride, and Nathan Sharp.

Start Your Epic Journey

TwilioQuest has been a labor of love for us and, honestly, such a joy to work on. None of us ever thought we’d ship a video game when we started working here. It’s been our epic journey and now it’s time to begin yours.

Sign up and start playing TwilioQuest. You’ll be having so much fun, you’ll forget you’re learning something new!


Making Sentiment Analysis Easy With Scikit-Learn


Sentiment analysis uses computational tools to determine the emotional tone behind words. Python has a bunch of handy libraries for statistics and machine learning so in this post we’ll use Scikit-learn to learn how to add sentiment analysis to our applications.

Sentiment analysis isn't a new concept. There are thousands of labeled datasets out there, with labels varying from simple positive and negative to more complex systems that determine how positive or negative a given text is.

For this post, we’ll use a pre-labeled dataset consisting of Twitter tweets that are already labeled as positive or negative. Using this data, we’ll build a model that categorizes any tweet as either positive or negative with Scikit-learn.

Scikit-learn is a Python module with built-in machine learning algorithms. In this tutorial, we’ll specifically use the Logistic Regression model, which is a linear model commonly used for classifying binary data.

Environment Setup

This guide was written in Python 3.6. If you haven’t already, download Python and Pip. Next, you’ll need to install Scikit-learn, a commonly used module in machine learning, that we’ll use throughout this tutorial. Open up your terminal and type in:

pip3 install scikit-learn==0.19.0
pip3 install jupyter==1.0.0

Since we'll be working with Python interactively, using Jupyter Notebook is the best way to get the most out of this tutorial. You already installed it with pip3 up above, now you just need to get it running. With that said, open up your terminal or command prompt and enter the following command:

jupyter notebook

And BOOM! It should have opened up in your default browser. Now you can go ahead and download the data we’ll be working with in this example. You can find this in the repo as negative_tweets and positive_tweets. Make sure you have the data in the same directory as your notebook and then we’re good to go!

A Quick Note on Jupyter

If you are unfamiliar with Jupyter notebooks, here's a review of the functions that will be particularly useful as you move along with this tutorial. If you are familiar with Jupyter, you can skip to the next section.
In the image below, you’ll see three buttons labeled 1-3 that will be important for you to get a grasp of — the save button (1), add cell button (2), and run cell button (3).

The first button is the button you'll use to save your work as you go along (1). I won't give you directions as to when you should do this — that's up to you!
Next, we have the "add cell" button (2). Cells are blocks of code that you can run together. These are the building blocks of Jupyter Notebook because they provide the option of running code incrementally without having to run all your code at once. Throughout this tutorial, you'll see lines of code blocked off — each one should correspond to a cell.
Lastly, there's the "run cell" button (3). Jupyter Notebook doesn't automatically run your code for you; you have to tell it when by clicking this button. As with the add button, once you've written each block of code in this tutorial into a cell, you should run it to see the output (if any). Where output is expected, it's also shown in this tutorial so you know what to expect. Make sure to run your code as you go along because many blocks of code in this tutorial rely on previous cells.

Preparing the Data

Before we implement our classifier, we need to format the Twitter data. Using sklearn.feature_extraction.text.CountVectorizer, we will convert the tweets to a matrix, or two-dimensional array, of word counts. Ultimately, the classifier will use these vector counts to train.
First, we import all the needed modules:

from sklearn.feature_extraction.text import CountVectorizer

Next, we must import the data we'll be working with. Each file is a text file with one tweet per line. We will use the built-in open function to read each file line by line and build up two lists: one for tweets and one for their labels. We chose this format so that we can check how accurate the model we build is. To do this, we test the classifier on unlabeled data, since feeding in the labels, which you can think of as the "answers", would be "cheating".

data = []
data_labels = []
with open("./pos_tweets.txt") as f:
    for i in f: 
        data.append(i) 
        data_labels.append('pos')

with open("./neg_tweets.txt") as f:
    for i in f: 
        data.append(i)
        data_labels.append('neg')

Next, we initialize a scikit-learn vectorizer with the CountVectorizer class. Because the tweets could be in any format, we'll set lowercase to False (you could also pass stop_words='english' to exclude common words such as "the" or "and", though we don't here). This vectorizer will transform our data into vectors of features; in this case, the features are counts of the words that occur in our dataset. Once the CountVectorizer is initialized, we fit it on the data above and convert the result to an array for easy usage.

vectorizer = CountVectorizer(
    analyzer = 'word',
    lowercase = False,
)
features = vectorizer.fit_transform(
    data
)
features_nd = features.toarray() # for easy usage

As a final step, we'll split the training data to get an evaluation set using Scikit-learn's built-in train_test_split function (it lives in the model_selection module; older tutorials import it from the deprecated cross_validation module). All we need to do is provide the data and assign a training percentage (in this case, 80%).

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test  = train_test_split(
        features_nd, 
        data_labels,
        train_size=0.80, 
        random_state=1234)

Linear Classifier

We can now build the classifier for this dataset. As mentioned before, we’ll be using the LogisticRegression class from Scikit-learn, so we start there:

from sklearn.linear_model import LogisticRegression
log_model = LogisticRegression()

Once the model is initialized, we have to train it to our specific dataset, so we use Scikit-learn’s fit method to do so. This is where our machine learning classifier actually learns the underlying functions that produce our results.

log_model = log_model.fit(X=X_train, y=y_train)

And finally, we use log_model to label the evaluation set we created earlier:

y_pred = log_model.predict(X_test)

Accuracy

Now just for our own fun, let’s take a look at some of the classifications our model makes. We’ll choose a random set of tweets from our test data and then call our model on each.

import random
j = random.randint(0,len(X_test)-7)
for i in range(j,j+7):
    print(y_pred[i])
    ind = features_nd.tolist().index(X_test[i].tolist())
    print(data[ind].strip())

 
Your output may be different, but here’s the random set that my code generated:

neg
”@RubyRose1 awww wish i could go! but its in sydney "
neg
"Waiting for him. Hopefully he gets on facebook soon. Something is wrong though, some people can't write on my wall. Hope it's fixed soon. "
neg
"@michelletripp Don't be too bummed. Saw it @ IMAX Sydney (largest in the world) "; felt it was too big. Action seqs were all a blur to me "
neg
"Just listening to my ipod the climb. wel it just ran out of batry "
neg
"using my new app p-twit for psp and i love it! snitter and p-twit are the best! go and try it yourself.. "
neg
"Noooooooo!!! There are clips missing on youtube "
neg
"Should really stop bricking his iPhone. OS 3 jailbreak seems to need restored regularly if Cydia crashes during an update. Annoying! "

Just glancing over the examples above, it’s pretty obvious there are some misclassifications. But we want to do more than just ‘eyeball’ the data, so let’s use Scikit-learn to calculate an accuracy score.

After all, how can we trust a machine learning algorithm if we have no idea how it performs? This is why we left some of the dataset for testing purposes. In Scikit-learn, there is a function called sklearn.metrics.accuracy_score which calculates what percentage of tweets are classified correctly. Using this, we see that this model has an accuracy of about 80%.

from sklearn.metrics import accuracy_score
print(accuracy_score(y_test, y_pred))

The result should be:

0.800498753117

Yikes. 80% is better than randomly guessing, but still pretty low as far as classification accuracy goes. With that said, we just built a classifier with less than 50 lines of Python code and no math. That, my friends, is pretty awesome. Even though we don't have the best results, scikit-learn has provided us with a solid model, which we can improve if we tune some of the parameters we saw throughout this post. For example, maybe the model needs more training data? Maybe we should have selected 90% of the data for training instead of 80%? Maybe we should have cleaned the data by checking for misspellings?

These are all important questions to ask yourself as you utilize powerful machine learning modules like Scikit-learn.
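
One concrete next step is to look past a single accuracy number. scikit-learn's classification_report breaks precision and recall down per class, using the same y_test and y_pred arrays from above:

from sklearn.metrics import classification_report
print(classification_report(y_test, y_pred))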

If you liked what you did here, check out my GitHub (@lesley2958) and Twitter (@lesleyclovesyou) for more content!



Build a Serverless Remote-Controlled Lego Robot with Twilio Sync and Runtime


The world of Internet connected devices is exploding and there are billions of things already online. Today we’ll skip the smart thermostats and fridges and move straight to programmable droids.

Building Your Droid

We use Twilio Sync for IoT in this project, which is currently in Developer Preview. Sign up for the preview, and the team will get you on-boarded.

To keep this project entertaining, repurposable, and extensible, we are going to use the Lego Mindstorms EV3 kit, a popular platform for basic robotics builds. We’ll build an internet controlled lego robot with a real-time web dashboard and even demo touch control.

Key ingredients include:

  • Lego Mindstorms EV3 kit
  • A compatible WiFi USB dongle (one from this list; we used the Edimax EW-7811UN.)
  • 2GB (or larger) MicroSD card to install EV3dev image
  • Spare AA batteries

Assemble the Hardware

Follow Lego’s online instructions to assemble the EV3D4 droid. Once you’re done, plug in the WiFi USB dongle to enable direct connectivity to the Internet.

When you’ve got everything assembled, you should end up with something like this:

As you can see, this droid has the following peripherals:

  • Output port A: small motor to pivot head
  • Output port B: right large motor to propel and steer the droid
  • Output port C: left large motor
  • Input port 1: switch sensor
  • Input port 3: color sensor to recognize objects
  • Input port 4: ultrasonic proximity sensor to measure distance

Install a Custom OS on Your Lego Robot

We are going to use EV3dev in order to deploy custom applications to your droid. EV3dev is a Debian Linux based operating system that turns Mindstorms bricks into a flexible application platform. Don’t worry, we are leaving your original EV3 firmware untouched. You’ll be able to get back to normal mode of operation anytime by removing the bootable microSD card with the EV3dev image.

Follow the step-by-step instructions on EV3dev’s site to download and install the OS image onto your microSD card. Boot it for the first time and connect to your local WiFi network using Brickman’s “Wireless and Networks > Wi-Fi” menu.

Deploy the Application

Once booted up and connected to WiFi, try logging in to your droid via SSH. The default username is robot and the password is maker; feel free to change it.

$ ssh robot@192.168.11.23

Our first application is going to be a basic Python script that connects to Twilio Sync for IoT, drives motors, and captures input from sensors using EV3 Python language bindings.

First, clone our client.py script from GitHub to your machine. On your development host (not on the droid), execute the following to fetch the code and copy it to your robot:

$ git clone https://github.com/dr0nius/twilio-mindstorms
$ scp twilio-mindstorms/ev3-client/client.py robot@192.168.11.23:~/

Then, install dependencies. Most of the things we'll need are already installed, but we'll need a compact MQTT client in order to connect to the Twilio cloud. While connected to ev3dev, execute the following:

$ sudo apt-get update
$ sudo apt-get install python3-pip
$ sudo pip3 install paho-mqtt

EV3 isn't exactly a blazing fast Linux platform, as it's running on a fairly low-power microcontroller. Give the steps above some time to complete and have a coffee or tea while you wait. The good news is that this only needs to be done once.

Connecting Your Droid

In order to get the droid online and operable, we are going to need to use a few Twilio tools and services.

  • First, you’ll need a Twilio account. Follow this link to create one if you haven’t done it already, otherwise log into the Twilio Console.
  • For data synchronization, you’ll use Twilio Sync:
    • Sync Document object to control the state of droid motors.
    • Sync Message Stream to receive droid sensor updates.
  • Instead of a server, you’ll use Twilio Runtime:
    • Runtime Assets to host the operator’s controller web application.
    • Runtime Functions to generate an authentication token for browser access.

Up until this point, we didn’t have a way to securely identify and trust the connected droid within our remote control application. Let’s fix that by creating its unique identity and a client certificate for authentication purposes.

Create an Identity with Sync

Navigate to Sync for IoT console and click on “Visit the Device Manager”. We have an empty fleet automatically provisioned for you, called “Default Fleet”.

  • Under the default fleet, click “Create a Device” button.
  • Provide a friendly name, e.g. “EV3D4 droid” and make sure “Enabled” is checked.
  • Click “Create” to finish.

Authenticate the Device

Now that we have established the droid’s identity, let’s add a client certificate in order to authenticate with the backend.

When logged in to ev3dev, go to the home directory where the client.py script resides and generate a new private key. Store it in a file named ev3d4.key.pem. Keep this key secret — it should never leave the device.

$ openssl ecparam -genkey -name prime256v1 -out ev3d4.key.pem

Then, while in the same directory, generate a self-signed certificate based on the above key, and store it in the ev3d4.cert.pem file. When prompted for certificate attributes, enter whatever you like; it doesn't make a difference for device management.

$ openssl req -new -x509 -sha256 -days 3650 -key ev3d4.key.pem -out ev3d4.cert.pem
...
$ cat ev3d4.cert.pem
-----BEGIN CERTIFICATE-----
MIIB0TCCAXegAwIBAgIJAIPO90JDKJJjMAoGCCqGSM49BAMCMEUxCzAJBgNVBAYT
...
-----END CERTIFICATE-----

This part of the certificate is public, and we are going to copy it to Sync for IoT device manager. Go back to your “EV3D4 droid” device in Twilio console.

  • Under the “EV3D4 droid” menu, pick “Certificates” and click “Create a Certificate”.
  • Name it (e.g. “Droid certificate”) in the friendly name field.
  • Leave the device SID unchanged.
  • Click “Create”.

Build an Online Robot Controller

Our controller application is going to be browser based, enabling bidirectional communication between a human operator and the remote droid. We are going to use a couple of Twilio Sync objects to deliver state updates both ways.

  1. A Sync Document called "motors". As soon as the controller stick starts moving, the motors are driven forward or backwards, and their desired speed is stored in the JSON document. Here is an example of the motor state snapshot:

     {
       "l1": -64,
       "l2": -120
     }

  2. A Sync Message Stream called "sensors". Once started, the ev3dev Python application will begin reporting readings from the infrared, touch and color sensors. Since these reports are periodic and ephemeral by nature, we don't want them to persist. Here is an example of a JSON message posted to the stream:

     {
       "ir": 100,
       "touch": 0,
       "color": 36
     }

Publish Assets

The JavaScript controller application is packaged into a single HTML file in our GitHub repository. It relies on the twilio-sync.js SDK to do all of the low-level state replication and websocket messaging work so that the application is kept simple and lean.

We are also going to make the controller application serverless and execute it using Twilio Assets and Functions. First, upload the HTML/JS code and make it an Asset.

Generate an Access Token

We need one last thing to make our droid controller fully functional. Twilio authenticates your browser application running on an arbitrary machine using a JWT access token, and authorizes access to Sync resources using grants. We are going to construct a Sync token generator and host it as a Function.

  • Navigate to the Runtime Functions console and click “Create a Function”.
  • Pick “Blank” as the template and click “Create”.
  • Function name: type “Sync Token Generator”.
  • Path: type “/token”
  • Copy & paste the content of the twilio-mindstorms/token-generator/token.js file (a sketch of what this file contains follows below this list).
  • Click “Save” and observe the toast notifications. The last one should say “Your Function has successfully been deployed”.
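
In case you're curious before opening the repo, here's a minimal sketch of what such a Sync token generator Function can look like. This assumes the Twilio Node.js helper library that Functions expose as the global Twilio object; the identity value here is a placeholder:

exports.handler = function(context, event, callback) {
  const AccessToken = Twilio.jwt.AccessToken;
  const SyncGrant = AccessToken.SyncGrant;

  // Sign the token with the API key pair we configure as environment variables below
  const token = new AccessToken(
    context.ACCOUNT_SID,
    context.TWILIO_API_KEY,
    context.TWILIO_API_SECRET
  );

  // Grant the browser access to the default Sync service
  token.addGrant(new SyncGrant({ serviceSid: 'default' }));
  token.identity = 'droid-operator';  // placeholder identity

  callback(null, { token: token.toJwt() });
};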


In order for the token generator to apply a valid signature, we will also need an API signing key.

  • Navigate to Sync Tools console and click “Create new API Key”.
  • Friendly name: type “Droid controller key”.
  • Key type: leave as “Standard”.
  • Note the resulting SID and Secret fields. We are going to need to use them in the token generator.

  • Navigate to Functions Configuration console.
  • Under “Environment Variables”, click the “(+)” button and set TWILIO_API_KEY as the name. For its value, copy & paste the API key SID that we just generated.
  • Add another variable called TWILIO_API_SECRET, and copy & paste our API key secret as its value.
  • Click “Save” and look for a confirming toast notification.

Fire Up the Robot

We’re almost there — let’s put all the components together and make the ultimate test run.

  1. On your development machine, open the Asset URL in your browser. This will automatically create the Sync document and message stream used to interact with the droid.
  2. When connected to ev3dev, execute the client application.
    $ python3 client.py
    Connected with result code 0
  3. Now, press and hold in the “Control stick pad” area to drive your droid around. Observe the proximity gauge so you can change course if there is an obstacle in front of it.

It also works with a trackpad or touchscreen device! Open it on your phone and observe motor and sensor states changing live in all applications.

Liberate IoT Devices with Twilio Sync and Runtime

While this project was ostensibly about building a droid, you did so much more. You built a web dashboard and addressed generic remote control and state synchronization issues. Now, you can extend it even further. For example, consider the following tweaks:

  • Restore the missing color sensor to the controller UI and start tracking it live.
  • Add a second “stick” handler to the control pad and pivot the droid’s head using multi-touch.
  • Attach a special trigger to the touch sensor and hook it up to another Function using webhooks; make it send an SMS or ring a phone.
  • Deploy a fleet of droids and build a dashboard to monitor them all in real time.

Many IoT projects have the same basic problems to solve. They need to enable reliable, secure, low-latency information flow between heterogeneous endpoints: embedded devices, server backends, and first-person clients such as browsers and mobile phones. Twilio simplifies the development of your IoT applications by providing a set of basic building blocks similar to Legos. Combining these blocks and APIs allows you to reduce your costs and move faster from concept to prototype to production application.

We can’t wait to see what you build next!

Andrei Birjukov is a technical lead at Twilio, living and working in Tallinn, Estonia. A long time gearhead and system software enthusiast, Andrei was formerly an engineer and development lead at Skype and Microsoft. You can reach him via email at andrei@twilio.com.


Building Facebook Messenger Bots with Python in less than 60 minutes


Chatbots are magical. Bots can be amazing products that let people create new experiences, from reporting personal news to delivering women's healthcare information. When I first learned about bots, I never imagined I would be able to make one on my own. However, I quickly dug into the Facebook Messenger documentation and began learning how, with a bit of Python 3 and Flask, one could get a bot up and running in no time.

We'll cover everything from the basics of how bots work to building our own basic Facebook Messenger bot. Specifically, we'll be making a basic version of Black Girl Magic Bot, a Facebook Messenger bot that sends users images, playlists, and generally uplifting messages to remind them just how amazing they are. If you're interested in digging into the code for the bot, you can fork it and play with it via GitHub.

Development Environment Setup

To make this bot, you need to make sure you have a few things installed:

  • Python 3.6 (you can download this here)
  • Pip (you can download this here)

Once you have downloaded the above files, you need to install the following libraries:

pip3 install Flask==0.12.2
pip3 install pymessenger==0.0.7.0

Coding Our Bot

Using Flask, we can create an endpoint – a fancy way of referring to a website url. For example, in http://twilio.com/try-twilio the endpoint is “/try-twilio.” When a user sends us a message, Facebook will send that data to our endpoint, where we will send a response back to Facebook to show the user.

To begin, we’ll create a basic Flask app called app.py. If you have not used Flask web framework before, you should look at their introduction to understand how the framework works.

from flask import Flask, request

app = Flask(__name__)
@app.route('/', methods=['GET', 'POST'])
def receive_message():
    return "Hello World!"


if __name__ == '__main__':
    app.run()

When the above code is run from the command line by typing python3 app.py, you will get a message that states something similar to this:

* Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)

If you navigate to the link given from running the app (in this example http://127.0.0.1:5000/) in a browser, you will see a page load that says “Hello World!” With just these few lines of code, we’ve created a web application that displays “Hello World” to any user who goes to the specified link. To build this bot, we will build off of this basic structure in order to handle a user’s request and return a response to them.

From Basic Flask app to Bot

To send messages back to a user who communicates with our bot, we'll be using the PyMessenger library.

Since we now have the necessary Python libraries installed, it’s time to write our bot.
For the purposes of the bot we’ll make in this guide, we’ll stick to using a small Python list with a few responses.

To make the bot, we first need to handle two types of requests, GET and POST. In our case, we will use GET requests when Facebook checks the bot’s verify token. Expanding on our basic Flask app, we will go to our receive_message function in app.py and add the following lines of code:

if request.method == 'GET':
    # Before allowing people to message your bot, Facebook has implemented a verify token
    # that confirms all requests that your bot receives came from Facebook. 
    token_sent = request.args.get("hub.verify_token")
    return verify_fb_token(token_sent)

In this section, you might be wondering: what exactly is “hub.verify_token”? This refers to a token we will make up and also provide to Facebook that they will use to verify the bot only responds to requests sent from Messenger. We will discuss later in this article how to set up this variable.

If the bot is not receiving a GET request, it is likely receiving a POST request where Facebook is sending your bot a message sent by a user. For this purpose, we will follow the if statement from above with an else that will take the data sent by Facebook and give us the message the user sent us:

# if the request was not GET, it must be POST, so we can proceed with
# sending a message back to the user
else:
    # get whatever message a user sent the bot
    output = request.get_json()
    for event in output['entry']:
        messaging = event['messaging']
        for message in messaging:
            if message.get('message'):
                # Facebook Messenger ID for user so we know where to send the response
                recipient_id = message['sender']['id']
                if message['message'].get('text'):
                    response_sent_text = get_message()
                    send_message(recipient_id, response_sent_text)
                # if user sends us a GIF, photo, video, or any other non-text item
                if message['message'].get('attachments'):
                    response_sent_nontext = get_message()
                    send_message(recipient_id, response_sent_nontext)
return "Message Processed"

With these initial steps written, we move on to handle verifying a message from Facebook as well as generating and sending a response back to the user. Facebook requires that your bot have a verify token that you also provide to them in order to ensure all requests your bot receives originated from them:

def verify_fb_token(token_sent):
    #take token sent by facebook and verify it matches the verify token you sent
    #if they match, allow the request, else return an error 
    if token_sent == VERIFY_TOKEN:
        return request.args.get("hub.challenge")
    return 'Invalid verification token'

Once we know what we are sending back to the user, we need to write a method that actually sends this message to the user. The PyMessenger library makes this easier for us by handling the POST requests per the Messenger API.

def send_message(recipient_id, response):
    #sends user the text message provided via input response parameter
    bot.send_text_message(recipient_id, response)
    return "success"

Now that we have all these code fragments, we can put them all together to make our bot.

 
#Python libraries that we need to import for our bot
import random
from flask import Flask, request
from pymessenger.bot import Bot

app = Flask(__name__)
ACCESS_TOKEN = 'ACCESS_TOKEN'
VERIFY_TOKEN = 'VERIFY_TOKEN'
bot = Bot(ACCESS_TOKEN)

#We will receive messages that Facebook sends our bot at this endpoint 
@app.route("/", methods=['GET', 'POST'])
def receive_message():
    if request.method == 'GET':
        """Before allowing people to message your bot, Facebook has implemented a verify token
        that confirms all requests that your bot receives came from Facebook.""" 
        token_sent = request.args.get("hub.verify_token")
        return verify_fb_token(token_sent)
    #if the request was not GET, it must be POST, so we can proceed with sending a message back to the user
    else:
        # get whatever message a user sent the bot
        output = request.get_json()
        for event in output['entry']:
            messaging = event['messaging']
            for message in messaging:
                if message.get('message'):
                    #Facebook Messenger ID for user so we know where to send response back to
                    recipient_id = message['sender']['id']
                    if message['message'].get('text'):
                        response_sent_text = get_message()
                        send_message(recipient_id, response_sent_text)
                    #if user sends us a GIF, photo, video, or any other non-text item
                    if message['message'].get('attachments'):
                        response_sent_nontext = get_message()
                        send_message(recipient_id, response_sent_nontext)
    return "Message Processed"


def verify_fb_token(token_sent):
    #take token sent by facebook and verify it matches the verify token you sent
    #if they match, allow the request, else return an error 
    if token_sent == VERIFY_TOKEN:
        return request.args.get("hub.challenge")
    return 'Invalid verification token'


#chooses a random message to send to the user
def get_message():
    sample_responses = ["You are stunning!", "We're proud of you.", "Keep on being you!", "We're grateful to know you :)"]
    # return selected item to the user
    return random.choice(sample_responses)

#uses PyMessenger to send response to user
def send_message(recipient_id, response):
    #sends user the text message provided via input response parameter
    bot.send_text_message(recipient_id, response)
    return "success"

if __name__ == "__main__":
    app.run()
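
Before involving Facebook, you can simulate its webhook verification handshake locally with curl (assuming you've replaced the VERIFY_TOKEN placeholder with, say, "TESTINGTOKEN"). The app should echo the challenge value back:

$ curl "http://127.0.0.1:5000/?hub.verify_token=TESTINGTOKEN&hub.challenge=12345"
12345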

Make your Bot on Facebook

We’ve written most of the code for the bot, but now we need to connect it to Facebook and make it available publicly. In order for people to message your bot, you need to create a Facebook page (and Facebook account if you do not already have one). Once you’ve made a page, go to the Facebook for Developers website and make a developer account.

Next, click “Add a new App” from the pane on the top right of the page and choose a name for your App (ex. BlackGirlMagicBot) and provide your email.
 

Next, when prompted on the next page for the type of product you are building, click the “Set Up” button on the Messenger option.
Go to your app's settings page on the left-hand side and fill out the Basic Information in the Settings tab.

 Next, we’ll get the information we need for our bot to follow Facebook’s guidelines. Generate the Access Token for the Facebook Page in the Messenger tab.

The tab can be found in the left-hand corner of the page. Once there select your page from the dropdown menu and a page access token will automatically be generated.
 

Go back to the app.py file and supply the access token where the current ACCESS_TOKEN placeholder text is located.

Hosting

Now that we have our code written and the required Facebook sections filled out, we need to host our code somewhere. For this tutorial we’ll be using ngrok, a nifty tool that allows us to run code on our computer locally but make it accessible to anyone. This link will work as long as we keep the program and ngrok running on our computer. It is important to note that ngrok is meant for basic testing and should not be used to host your program when released publicly.

To get started with ngrok, follow the instructions here.

Now, in order to get our bot running publicly with ngrok, we need to first run the app — open a Terminal window and run it with python3 app.py. Once your Flask app begins running, note the port number at the end of the link that it prints.

Now, open a second terminal window or tab and type ngrok http [port], where [port] is the number at the end of the link Flask generated (in this example, where the link provided by Flask is http://127.0.0.1:5000/, you would type ngrok http 5000). Once you do this, a screen will appear with a link in the "Forwarding" section — make sure to copy the link that begins with https. This is the link we can provide to Facebook so it knows where to deliver messages sent to the bot.

Go back to the Facebook developer screen and supply this link so that when our page receives a message, Facebook knows where to send it. Click the Webhooks tab and click "Edit Subscription."

For the Callback URL, copy and paste the link created by ngrok into the field.

Remember the VERIFY_TOKEN placeholder we currently have in our app.py file? To protect your bot, Facebook requires you to have a verify token. When a user messages your bot, Facebook will send your bot the message along with this verify token for your Flask app to check, verifying the message is an authentic request sent by Facebook. Choose a string you want to use for your verify token and replace the placeholder in the app.py file with it (e.g. "TESTINGTOKEN" could be your verify token, but I'd recommend something harder to guess), then place the same token (minus the quotation marks) in the Verify Token field.

For the subscription fields, be sure to check the messages, messaging_postbacks, message_deliveries, messaging_pre_checkouts boxes.

When you’re finished, click “Verify and Save.”

Now, on the same Messenger Settings page, we need to hook the webhook to our Facebook page. Select your page and then click “Subscribe” to finish the process.

After this step is the last part of making our bot… testing it.

Testing the Bot

Now that we have finished writing code and setting up our Facebook app, it’s time to test this bot. If you visit your Facebook page, you’ll be able to test the bot by sending it a message. If you have everything correctly set up, you should get a message back from your page.

Here, the Black Girl Magic bot correctly replies with one of the messages we added whenever a user sends it a message.

What’s next?

Congrats! You’ve made your first Facebook Messenger bot. Now you can start creating bots to help users with a variety of tasks. While in this tutorial we ran the bot locally using ngrok, in the next tutorial we’ll discuss hosting the bot via Heroku. Once you’re ready, you can finish the Facebook Messenger approval process and get your bot approved to send messages to all users. The possibilities for bots are endless; I’ve been excited to see how they can help users and I look forward to what people build next.

I hope you’ve enjoyed learning how to make a bot in Python. If you enjoyed this post, you can follow me on GitHub @wessilfie.

Building Facebook Messenger Bots with Python in less than 60 minutes

Embedding Maps with Python & Plotly


Data Visualization is an art form. Whether it be a simple line graph or complex objects like wordclouds or sunbursts, there are countless tools across different programming languages and platforms. The field of geospatial analysis is no exception. In this tutorial we’ll build a map visualization of the United States Electoral College using Python’s plotly module and a Jupyter Notebook.

Python Visualization Environment Setup

This guide was written in Python 3.6. If you haven’t already, download Python and Pip. Next, you’ll need to install the plotly module that we’ll use throughout this tutorial. You can do this by running the following in the terminal or command prompt on your operating system:

pip3 install plotly==2.0.9
pip3 install jupyter==1.0.0

Since we’ll be working with Python interactively, using the Jupyter Notebook is the best way to get the most out of this tutorial. You installed it with pip3 above and now you just need to get it running. Open up your terminal or command prompt and run the following command:

jupyter notebook

And BOOM! It should have opened up in your default browser.

A Quick Note on Jupyter

For those of you who are unfamiliar with Jupyter Notebook, I’ve provided a brief review of the functions that will be particularly useful for moving along with this tutorial.

In the image below you’ll see buttons labeled 1-3 that will be important for you to get a grasp of:

  1. Save button
  2. Add cell button
  3. Run cell button

Jupyter Notebook button descriptions
The first button is the one you’ll use to save your work as you go along (1). I won’t give you directions on when to do this; that’s up to you!

Next, we have the add cell button (2). Cells are blocks of code that you can run together. These are the building blocks of Jupyter Notebook because they let you run code incrementally without having to run all your code at once. Throughout this tutorial, you’ll see lines of code blocked off; each one should correspond to a cell.

Lastly, there’s the run cell button (3). Jupyter Notebook doesn’t automatically run your code for you; you have to tell it when by clicking this button. As with the add button, once you’ve written each block of code from this tutorial into a cell, you should run it to see the output (if any). Any expected output is also shown in this tutorial so you know what to look for. Make sure to run your code as you go along, because many blocks of code in this tutorial rely on previous cells.

Setting Up Plotly

In order to use Plotly, you’ll need an account with your own API key. Click here to set up an account with Facebook, Twitter, GitHub, Google Plus or your email address.

Plotly create a new account screen

Once you’ve logged in, look for your username in the upper right corner of the page. If you hover over it, four options pop up in a rectangular box. Select the Settings option, highlighted in the picture below:
Plotly changing settings example

This will redirect you to your settings page, where you’ll find six options. Next choose API Keys, highlighted in the image below:
Plotly API Key menu item

Finally, you have the page with all the information needed for this tutorial! In the image below, you’ll see two text boxes: one with your username and the other with your API key. Since this is your first time using Plotly, go ahead and click the Regenerate Key button circled below, then copy and paste your username and API key somewhere safe for later. Treat the API key like a password and keep it protected.

Save your Plotly API Key

Formatting Data in Plotly

First, as always, we import the needed modules. Then we initialize our session by signing into Plotly, using the username and API key we just got.

import plotly.plotly as py
from plotly.graph_objs import *
py.sign_in('YOUR_USERNAME', 'YOUR_API_KEY')
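As a side note, if you’d rather not paste credentials into every notebook, the plotly release we installed can also persist them to a local credentials file, an alternative to calling sign_in in each session:

import plotly

# Writes your credentials to ~/.plotly/.credentials for future sessions
plotly.tools.set_credentials_file(username='YOUR_USERNAME', api_key='YOUR_API_KEY')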

Plotly supports three types of maps: choropleths, atlas maps, and satellite maps. Using data from the Electoral College, we’ll plot a choropleth map of the United States with a color scale: the darker the color, the greater the number of votes.

First, we load in the electoral college data. Each number in the list of votes will correspond to a state label, which we’ll incorporate soon.

data = Data([
    Choropleth(
        z=[3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 4.0, 4.0, 4.0, 4.0, 4.0, 5.0, 5.0, 5.0, 6.0, 6.0, 6.0, 6.0, 6.0, 6.0, 7.0, 7.0, 7.0, 8.0, 8.0, 9.0, 9.0, 9.0, 10.0, 10.0, 10.0, 10.0, 11.0, 11.0, 11.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 16.0, 18.0, 20.0, 20.0, 29.0, 29.0, 38.0, 55.0],
        autocolorscale=False,
        colorbar=ColorBar(
            title='Votes'
        ),

In the same Data call, we include the colorscale parameter that acts as our sliding scale. This determines how light or dark a given state will be.

        colorscale=[[0.0, 'rgb(242,240,247)'], [0.2, 'rgb(218,218,235)'], [0.4, 'rgb(188,189,220)'], [0.6, 'rgb(158,154,200)'], [0.8, 'rgb(117,107,177)'], [1.0, 'rgb(84,39,143)']],
        hoverinfo='location+z',

Now we’re ready to add labels for each state, making sure to keep the order in place so that states aren’t mislabeled.

        locationmode='USA-states',
        locations=['DE', 'VT', 'ND', 'SD', 'MT', 'WY', 'AK', 'DC', 'NH', 'RI', 'ME', 'ID', 'HI', 'WV', 'NE', 'NM', 'MS', 'AR', 'IA', 'KS', 'NV', 'UT', 'CT', 'OR', 'OK', 'KY', 'LA', 'SC', 'AL', 'CO', 'MD', 'MO', 'WI', 'MN', 'MA', 'TN', 'IN', 'AZ', 'WA', 'VA', 'NJ', 'NC', 'GA', 'MI', 'OH', 'PA', 'IL', 'NY', 'FL', 'TX', 'CA'],
        marker=Marker(
            line=Line(
                color='rgb(255,255,255)',
                width=2
            )
        ),
        text=['DE', 'VT', 'ND', 'SD', 'MT', 'WY', 'AK', 'DC', 'NH', 'RI', 'ME', 'ID', 'HI', 'WV', 'NE', 'NM', 'MS', 'AR', 'IA', 'KS', 'NV', 'UT', 'CT', 'OR', 'OK', 'KY', 'LA', 'SC', 'AL', 'CO', 'MD', 'MO', 'WI', 'MN', 'MA', 'TN', 'IN', 'AZ', 'WA', 'VA', 'NJ', 'NC', 'GA', 'MI', 'OH', 'PA', 'IL', 'NY', 'FL', 'TX', 'CA']
    )
])

Altogether, we have:

data = Data([
    Choropleth(
        z=[3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 4.0, 4.0, 4.0, 4.0, 4.0, 5.0, 5.0, 5.0, 6.0, 6.0, 6.0, 6.0, 6.0, 6.0, 7.0, 7.0, 7.0, 8.0, 8.0, 9.0, 9.0, 9.0, 10.0, 10.0, 10.0, 10.0, 11.0, 11.0, 11.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 16.0, 18.0, 20.0, 20.0, 29.0, 29.0, 38.0, 55.0],
        autocolorscale=False,
        colorbar=ColorBar(
            title='Votes'
        ),
        colorscale=[[0.0, 'rgb(242,240,247)'], [0.2, 'rgb(218,218,235)'], [0.4, 'rgb(188,189,220)'], [0.6, 'rgb(158,154,200)'], [0.8, 'rgb(117,107,177)'], [1.0, 'rgb(84,39,143)']],
        hoverinfo='location+z',
        locationmode='USA-states',
        locations=['DE', 'VT', 'ND', 'SD', 'MT', 'WY', 'AK', 'DC', 'NH', 'RI', 'ME', 'ID', 'HI', 'WV', 'NE', 'NM', 'MS', 'AR', 'IA', 'KS', 'NV', 'UT', 'CT', 'OR', 'OK', 'KY', 'LA', 'SC', 'AL', 'CO', 'MD', 'MO', 'WI', 'MN', 'MA', 'TN', 'IN', 'AZ', 'WA', 'VA', 'NJ', 'NC', 'GA', 'MI', 'OH', 'PA', 'IL', 'NY', 'FL', 'TX', 'CA'],
        marker=Marker(
            line=Line(
                color='rgb(255,255,255)',
                width=2
            )
        ),
        text=['DE', 'VT', 'ND', 'SD', 'MT', 'WY', 'AK', 'DC', 'NH', 'RI', 'ME', 'ID', 'HI', 'WV', 'NE', 'NM', 'MS', 'AR', 'IA', 'KS', 'NV', 'UT', 'CT', 'OR', 'OK', 'KY', 'LA', 'SC', 'AL', 'CO', 'MD', 'MO', 'WI', 'MN', 'MA', 'TN', 'IN', 'AZ', 'WA', 'VA', 'NJ', 'NC', 'GA', 'MI', 'OH', 'PA', 'IL', 'NY', 'FL', 'TX', 'CA']
    )
])

Now we’ve formatted our data. But of course, our code needs to know how to display it. Luckily, we can just use plotly’s Layout class. We’ll pass in color schemes and labels as parameters.

layout = Layout(
    geo=dict(
        lakecolor='rgb(255, 255, 255)',
        projection=dict(
            type='albers usa'
        ),
        scope='usa',
        showlakes=True
    ),
    title='2016 Electoral College Votes'
)

Displaying the Plotly Map

We’re almost done! The last step is to construct the map and use plotly to actually display it.

fig = Figure(data=data, layout=layout)
plot_url = py.plot(fig)

This should open up a link in your browser with our visualization on display. It should look like the following:

Plotly visualization example colorized from Python

(You can also see my version of the visualization here.)
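As an aside, if you’d rather not publish the figure to the Plotly website at all, the plotly version we installed also includes an offline mode that writes a standalone HTML file locally. A quick sketch:

from plotly.offline import plot

# Renders the same figure to a local HTML file instead of uploading it
plot(fig, filename='electoral-college.html')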

Now that we’ve generated our map on the Plotly website, let’s explore the different display options available. Take a look at the upper right corner of the visualization. You’ll notice a camera icon which, if you hover over it, says “Download plot as a png.” This lets you download a static picture of your visualization, which you can then embed as you would any other photo.

Next, scroll to the bottom of the page. Notice the five icons in the lower right corner: Facebook, Google Plus, and Twitter logos, followed by a chain link graphic and “</>”. Click on that final one, “</>”.
Button to embed a Plotly visualization

It should have opened a modal that looks like this:

Plotly embed dialog

Now you can copy and paste that HTML snippet into any HTML file, and it will elegantly display the visualization. For example, the embed markup for my graph looks like this:

<div>
    <a href="https://plot.ly/~lc2958/36/" target="_blank" title="plot from API (15)" style="display: block; text-align: center;"><img src="https://plot.ly/~lc2958/36.png" alt="plot from API (15)" style="max-width: 100%;width: 600px;"  width="600" onerror="this.onerror=null;this.src='https://plot.ly/404.png';" /></a>
    <script data-plotly="lc2958:36" src="https://plot.ly/embed.js" async></script>
</div>

plotly is a useful Python module for displaying almost any type of data, whether that be numerical, text, or geospatial. In this tutorial, we focused on geospatial mapping, but if you’re interested in expanding your visualization toolkit, you can look here for more ideas.

If you liked what you did here, check out my GitHub (@lesley2958) and Twitter (@lesleyclovesyou) for more content!

Embedding Maps with Python & Plotly

Mock it Til’ You Make It: Test Django with mock and httpretty


In this tutorial, we’ll learn how to use Python’s mock and httpretty libraries to test the parts of a Django app that communicate with external services. These could be third-party APIs, websites you scrape, or essentially any resource that you don’t control and that sits behind a network boundary. In addition, we’ll also take a brief look at Django’s RequestFactory class and learn how we can use it to test our Django views. 🕵

In order to illustrate how all these pieces fit together, we’ll write a simple Django app, “Hacker News Hotline”. It will fetch today’s headlines from Hacker News and serve them behind a TwiML endpoint, so anyone calling our Twilio number can hear the latest headlines on Hacker News.

Right, sounds like a plan. Let’s do this!

Drake wants to use mock and httpretty for Django testing

Python Environment Setup

Before we get our hands dirty, there are a few software packages we need to install.

1) First, we’ll need Python 2.7. Unfortunately, as of this writing, httpretty doesn’t officially support Python 3.x 😢. You can follow the steps outlined in this StackOverflow post to install version 2.7. If you already have Python installed, you can quickly check which version you have by typing the command below in your terminal. Any 2.x version will work for following this tutorial.

python -V

2) I recommend using virtualenv to keep your development machine tidy. Virtualenv gives you an isolated Python environment into which all project dependencies can be installed.

We’ll also need pip. Pip is a package manager for Python. You can follow this tutorial to set up both virtualenv and pip.

3) Ngrok is a reverse proxy solution, or in simple terms, software which exposes your development server to the outside world. You can install it here.

Last but not least, you’ll also need a Twilio account and a Twilio number with voice capability to test the project. You can sign up for a free trial here.

Django Project Setup

If you have the above environment setup done, let’s move into setting up our Django project.

First, let’s create a directory called twilio-project, then activate virtualenv and install all the dependencies we’ll need: django, httpretty, mock and twilio.

After that, let’s start a Django project called twilio_voice and add an app called hackernews_calling to our project. We’ll also need to apply the initial database migrations for our Django app.

Open up your terminal and let’s start hacking. 👩‍💻👨‍💻

mkdir twilio-project && cd twilio-project
virtualenv env
source env/bin/activate
pip install django==1.11
pip install httpretty
pip install mock
pip install twilio
django-admin.py startproject twilio_voice .
cd twilio_voice/
django-admin.py startapp hackernews_calling
python manage.py migrate

Great, now we can start writing some code. 👌

Fetching Hacker News Top Stories with Python

We’ll begin by writing a module that fetches top headlines from the Hacker News API. Within the hackernews_calling directory, create a file called hackernews.py with the following code:

"""
This Module Talks to Hacker News API
to fetch latest headlines
"""
import json
import urllib2

def get_headlines(no_of_headlines):
    """
    gets the titles of top stories
    """
    top_story_ids = urllib2.urlopen("https://hacker-news.firebaseio.com/v0/topstories.json").read()

    ids = json.loads(top_story_ids)[:no_of_headlines]
    headlines = []
    for story_id in ids:
        story_url = "https://hacker-news.firebaseio.com/v0/item/{0}.json".format(story_id)
        story = urllib2.urlopen(story_url).read()
        story_json = json.loads(story)
        headlines.append(story_json["title"])
    return headlines

The above module will do the following:

  1. Fetch topstories.json file from remote host
  2. Read a slice of the top_story_ids list
  3. Fetch individual story details
  4. Return a list of headlines

Simple.
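Before writing any tests, you can sanity-check the module from a Python shell. The output below is illustrative; live headlines will obviously vary:

>>> import hackernews
>>> hackernews.get_headlines(2)
[u'Some headline', u'Another headline']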

Mocking HTTP Requests with Python

So how do we go about testing this module without firing HTTP requests every time we run our tests? Enter httpretty, a library which monkey patches Python’s core socket module. It’s perfect for mocking requests and responses with whichever request library you’re using!

By default, Django creates a tests.py where you put all your test cases in one place. I prefer working with a directory structure containing multiple test files instead. It helps keep your test files granular and concise.

Within the twilio_voice/hackernews_calling directory, apply the following bash magic. 🎩 🐰

rm tests.py 
mkdir tests/ 
touch tests/__init__.py

Let’s create a test module called test_hackernews.py within the tests directory.

# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.test import TestCase
import httpretty
import re # native python regex parsing module
from twilio_voice.hackernews_calling import hackernews

class TestHackerNewsService(TestCase):

    @httpretty.activate
    def test_get_headlines(self):
        # mock for top stories
        httpretty.register_uri(
            httpretty.GET,
            "https://hacker-news.firebaseio.com/v0/topstories.json",
            body="[1,2,3,4,5,6,7,8,9,10]")

        # mock for individual story item
        httpretty.register_uri(
            httpretty.GET,
            re.compile("https://hacker-news.firebaseio.com/v0/item/(\w+).json"),
            body='{"title": "some story title"}')

        headlines = hackernews.get_headlines(5)
        self.assertEqual(len(headlines), 5)
        self.assertEqual(headlines[0], 'some story title')
        last_request = httpretty.last_request()
        self.assertEqual(last_request.method, 'GET')
        self.assertEqual(last_request.path, '/v0/item/5.json')

Now let’s investigate our test module closely 🔬.

The first thing you’ll probably notice is the @httpretty.activate decorator wrapped around our test method. This decorator replaces the functionality of Python’s core socket module and restores it to its original definition once our test method finishes executing. This technique is also known as “monkey patching”. Pretty neat, right?
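Under the hood, the decorator is roughly equivalent to calling httpretty’s enable and disable functions yourself. Here’s a hand-written sketch of the same lifecycle:

import httpretty

def test_get_headlines(self):
    httpretty.enable()       # monkey patch the socket module
    try:
        # register mocks and run assertions here, as in the test above
        pass
    finally:
        httpretty.disable()  # restore the original socket module
        httpretty.reset()    # clear all registered URIs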

Using httpretty we can register URIs to be mocked and define default return values for testing.

httpretty.register_uri(
    httpretty.GET,
    "https://hacker-news.firebaseio.com/v0/topstories.json",
    body="[1,2,3,4,5,6,7,8,9,10]")

Here we are mocking the /v0/topstories.json endpoint to return a list of story ids from 1 to 10.

It’s also possible to mock URIs using regular expressions with httpretty. We leverage this feature to mock fetching individual story details.

re.compile("https://hacker-news.firebaseio.com/v0/item/(\w+).json")

Httpretty also lets us investigate the last request made. We use this feature to verify that the last request made was to fetch the 5th story item’s details.

last_request = httpretty.last_request()
self.assertEqual(last_request.method, 'GET')
self.assertEqual(last_request.path, '/v0/item/5.json')

Let’s run it. Jump back to the project root directory and run the tests.

python manage.py test --keepdb

You should see that the test run was successful. 🎉

.
———————————————————————————————————
Ran 1 test in 0.090s

TwiML: Talking the Twilio Talk

TwiML is the markup language used to orchestrate Twilio services. With TwiML, you can define how to respond to texts and calls received by your Twilio number. We will generate a TwiML document to narrate the Hacker News data we fetch. When someone calls our Twilio number, they’ll hear the top headlines of the day! 🗣📱

TwiML is simply an XML document with Twilio-specific grammar. For example, to make your Twilio number speak upon receiving a call, you could use the following XML:

<?xml version="1.0" encoding="UTF-8"?> 
<Response> 
    <Say voice="woman">Hello World</Say> 
</Response>

To make things easier we can use the official Twilio Python module to generate TwiML syntax.
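For instance, the “Hello World” document above can be generated in a couple of lines:

from twilio.twiml.voice_response import VoiceResponse

resp = VoiceResponse()
resp.say('Hello World', voice='woman')
print(str(resp))
# <?xml version="1.0" encoding="UTF-8"?><Response><Say voice="woman">Hello World</Say></Response>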

Let’s create a new Django view (endpoint) that narrates Hacker News headlines using TwiML, returning the top story headlines in TwiML format. Our hackernews_calling/views.py should look like this:

# -*- coding: utf-8 -*-
from __future__ import unicode_literals
from django.http import HttpResponse
from django.views import View
import hackernews
from twilio.twiml.voice_response import VoiceResponse

class HackerNewsStories(View):
    """
    Hacker News Stories View
    """

    def get(self, request):
        """
        Return top Hacker News story headlines in TwiML format
        """
        headlines = hackernews.get_headlines(5)
        resp = VoiceResponse()
        for headline in headlines:
            resp.say(headline, voice='woman', language='en-gb')

        twiml_str = str(resp)
        return HttpResponse(twiml_str, content_type='text/xml')

To start serving the TwiML document, we’ll need to register the new endpoint so our Django app serves it. Change the hackernews_calling/urls.py module to the following:

from django.conf.urls import url
from django.contrib import admin
from twilio_voice.hackernews_calling.views import HackerNewsStories

urlpatterns = [
    url(r'^admin/', admin.site.urls),
    url(r'^headlines', HackerNewsStories.as_view()),
]

Mocking It: Testing Django Views with Mock and RequestFactory

If we were to test the view we have just written, every test run would make HTTP requests, since the view relies on the Hacker News service to fetch its data.

We could use httpretty again to fake requests at the network level, but there is a better solution in this scenario: the mock library we installed earlier. (Since Python 3.3, mock has also been part of the standard library as unittest.mock.)

Let’s write our test module and investigate it afterwards. Create a module called test_headlines_view.py under the tests directory with the following content:

# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.test import TestCase, RequestFactory
import mock
from twilio_voice.hackernews_calling.views import HackerNewsStories

class TestHeadlinesView(TestCase):

    @mock.patch('twilio_voice.hackernews_calling.hackernews.get_headlines', return_value=['headline 1'])
    def test_headlines_xml(self, get_headlines_patched_func):
        request = RequestFactory()
        endpoint = 'headlines'
        get_headlines = request.get(endpoint)
        twiml_response = HackerNewsStories.as_view()(get_headlines)
        self.assertEqual(get_headlines_patched_func.call_count, 1)
        self.assertEqual(twiml_response.status_code, 200)
        expected_content = '<?xml version="1.0" encoding="UTF-8"?><Response><Say language="en-gb" voice="woman">headline 1</Say></Response>' 
        self.assertEqual(twiml_response.content, expected_content)

# snippet provided so reader can follow below explanation easily
    # see file test_headlines_view.py
    @mock.patch('twilio_voice.hackernews_calling.hackernews.get_headlines', return_value=['headline 1'])
    def test_headlines_xml(self, get_headlines_patched_func):
        # lines redacted
        self.assertEqual(get_headlines_patched_func.call_count, 1)

You probably noticed the @mock.patch decorator. This decorator monkey patches the function available at the path provided. Similar to how httpretty works, the mocked function is restored to its original state after the test method executes. Always use the full, absolute import path when working with the mock module, as relative paths won’t work!

The return_value argument provided to the decorator is the value the mocked function returns when called. You may also have noticed we are passing the get_headlines_patched_func parameter into our test function. This acts as a spy, letting us interrogate whether the mocked function was called, how many times it was called, with what arguments it was called, and so forth.
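Beyond call_count, the spy exposes several other assertion helpers. A couple of illustrative lines you could add inside the test method:

# Assert the patched function was called exactly once, with the argument 5
get_headlines_patched_func.assert_called_once_with(5)
# Inspect the arguments of the most recent call
print(get_headlines_patched_func.call_args)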

# snippet provided so reader can follow below explanation easily
    # see file test_headlines_view.py
    request = RequestFactory()
    endpoint = 'headlines'
    get_headlines = request.get(endpoint)
    twiml_response = HackerNewsStories.as_view()(get_headlines)
    self.assertEqual(twiml_response.status_code, 200)
    expected_content = 'xml response redacted'
    self.assertEqual(twiml_response.content, expected_content)

Now let’s look into how we use Django’s internal RequestFactory. This class lets us fabricate a Django HTTP request object that can be passed into a view class. Using this approach we can test our views without actually hitting any endpoints.

This approach is useful for testing any sorting, filtering, pagination or authorization logic within your views. It’s also possible to bake additional request headers into the faked request object, such as authorization headers for restricted views. In our case, we’re simply mocking a GET request to the headlines URI; the status code and response returned are checked afterwards.
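As a hypothetical sketch of the headers trick, extra keyword arguments to RequestFactory.get() become entries in request.META. RestrictedView below is a made-up name standing in for any view of yours that checks an Authorization header:

from django.test import RequestFactory

factory = RequestFactory()
# HTTP_AUTHORIZATION lands in request.META, just like a real header would
authed_request = factory.get('restricted', HTTP_AUTHORIZATION='Token some-secret-token')
response = RestrictedView.as_view()(authed_request)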

Once again, jump back to the project root and run tests again. Everything should still be groovy. 👌

$ python manage.py test --keepdb
..
———————————————————————————————————
Ran 2 tests in 0.098s

Putting Our Hacker News App Together

Now that we have our Django app set up and working, let’s make it work with a Twilio number. (If you haven’t already signed up for Twilio, get a trial account now.)

We’ll be using Twilio’s “Managing an Incoming Call” flow:
Call infrastructure flow for Python Hacker News headline app
To let Twilio talk to our local Django server, we’ll need to use a reverse proxy such as ngrok. (You can download ngrok here.)

After installing ngrok, start it on port 8000 so the app is publicly available:

$ ngrok http 8000

Pay attention to the output of ngrok, copy the URL provided and add it to the ALLOWED_HOSTS list within settings.py.

Ngrok forwarding URL for Hacker News headline app

ALLOWED_HOSTS = [u'6b40f2a5.ngrok.io']

Now we can run the Django app:

$ python manage.py runserver

After starting the Django app, go to Twilio Dashboard > Phone Numbers > Active Numbers and click on a Twilio number to set up the webhook URL. Copy and paste the secure forwarding URL (https) provided by ngrok and append /headlines to it. The HTTP action should be set to GET.

Twilio webhook example callback entry

Give your Twilio number a call to listen to today’s Hacker News headlines. Voila! 🎉 🙌

Wrap Up: Hacker News Headlines with Django and Twilio

Congratulations if you’ve made it this far – give yourself a pat on the back! We’ve covered how to use the mock, httpretty and RequestFactory modules to easily test Django. You can use mock to replace function bodies at runtime and httpretty to mock HTTP requests at the network level. Both modules leverage monkey patching, so mocked functions are restored to their original definitions after the test run. Last but not least, we also used RequestFactory to fake requests when testing our Django views. You can find the finished project on GitHub.

If you would like to learn more about when you should use mocks, I recommend reading Eric Elliott’s “Mocking is a code smell” post here.

Thanks for reading, please do let me know if you have any questions in the comments.

Ersel Aker is a full-stack developer based in the UK. He has been working with FinTech startups using Python, Node.js and React. He is also the author of a Spotify terminal client library and a contributor to various open source projects. You can find him on Twitter, GitHub, Medium or at erselaker.com.

Mock it Til’ You Make It: Test Django with mock and httpretty

How to Host a Python and Flask Facebook Messenger Bot on Heroku


With the rise of bots, developers can create new tools that help make people’s lives easier. In the first part of this bot series, we discussed how to make a Facebook Messenger bot using Python, Flask, and ngrok. However, for a bot in production, having all the requests processed on your personal computer won’t work well. That’s where a hosting service like Heroku comes in. Services like Heroku let you host your program on their servers, making it available for people to access 24/7. In this guide, we will talk about how to migrate a Facebook Messenger bot from running on ngrok to Heroku. For the purposes of this guide, we’ll use Heroku’s free hobby tier.

Register an Account with Heroku

Get started with Heroku by first making an account. Once you’ve verified your email address, log into the Heroku dashboard and make a new app.

Animated gif showing the account creation flow with Heroku

You can enter your own App Name or leave it blank and have Heroku randomly assign one to your app. In either case, the app name also represents the subdomain that you can use to access your app once it is deployed to Heroku’s servers. For example, if you name your app blackgirlmagic, once deployed, you can access your creation at blackgirlmagic.herokuapp.com.

Prerequisites to Deploy a Flask App on Heroku

In order to deploy a Flask app on Heroku, there are three specific files we need to include: a requirements.txt, a Procfile, and a runtime.txt. For this guide, we’ll use the code for Black Girl Magic Bot, which can be found here.

For our requirements.txt file, we need to tell Heroku which Python libraries our program depends on: every library your program uses that is not part of the Python standard library. To quickly generate the file, type “pip3 freeze > requirements.txt” into the command line; this writes the list of installed libraries and their versions into the file for you. The result should look similar to the code below:

Flask==0.12.2
pymessenger==0.0.7.0
gunicorn==19.6.0

For our Procfile, we need to tell Heroku what to do in order to run the program. Be sure the file is just titled Procfile (it should have no file extension) and drop in this bit of configuration, which tells Heroku to serve the app object from our app.py with the gunicorn web server. If you’re interested in learning more about how these files work, visit Heroku’s guide on them.

web: gunicorn app:app --log-file=-

And finally, we need to tell Heroku that our bot is written in Python 3 by creating a runtime.txt file. All it needs to contain is the Python version you want Heroku to use. In this example, we’ll use

python-3.6.4

to tell Heroku that it should use this version of Python when compiling and running the code.

Hook up Version Control to Heroku

Next, we want to put our code under version control, a way to save each iteration of edits to our files (called a commit). To do this, we will commit our code to GitHub. You can make a GitHub account here and learn how to commit your code to GitHub here. For the purposes of this article, we will be working to deploy the Black Girl Magic bot. The code for it can be found via this link and you can use this guide to learn how to clone it.

Once you’ve cloned it, go to the Heroku Deploy tab and connect your app to your GitHub repository so that Heroku will be able to pull and deploy your desired code. You will need to grant Heroku access to your GitHub account so that it can search all of your repositories. After connecting your account, search for your repository’s name and then click the “Connect” button to connect it to your Heroku app.

Connect your GitHub account to Heroku.

Once you’ve connected the repo to your app, you should see a screen like this:

Successful connection of a GitHub account and Heroku

Optional: If you want Heroku to automatically deploy updates to your bot when you make a commit to your GitHub repo, make sure to click the “Enable Automatic Deploys” button.

Enable automatic deploys button in Heroku

Now that we have connected our GitHub repository to our Heroku app, we need to deploy it.

Deploying from a branch to Heroku

Set Environment Variables in Heroku

If you are following this article using the Black Girl Magic bot, you will need to replace the ACCESS_TOKEN and VERIFY_TOKEN placeholders with actual values. For information on how to obtain these, see the “Make your Bot on Facebook” section of my earlier blog post about bots. However, leaving these tokens in your repo is bad practice and makes it easy for people to use your tokens for their own purposes. Thankfully, Heroku can help with this by allowing you to use config variables, which let you securely keep the tokens on Heroku and out of your code.

To do this, go to the “Settings” tab of your Heroku app and then click “Reveal Config Vars.” You’ll need to enter a key/value pair for each of the access and verify tokens and add them to make sure they are saved. Below is a good example of what this would look like:
Environment variable demonstration in Heroku

Additionally, if you’re following the example code we’re using in this guide, you’ll need to modify lines 6 and 7 of the app.py file so the tokens are read from the environment:

ACCESS_TOKEN = os.environ['ACCESS_TOKEN']
VERIFY_TOKEN = os.environ['VERIFY_TOKEN']

Note: Using this setup requires you to import the Python os library. To do this, at the top of your file include:

import os
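If you’d also like the app to start locally, where these variables may be unset, one option is os.environ.get with an explicit fallback. A sketch; the placeholder defaults below are made up:

import os

# Fall back to obvious placeholders so a missing variable is easy to spot locally
ACCESS_TOKEN = os.environ.get('ACCESS_TOKEN', 'missing-access-token')
VERIFY_TOKEN = os.environ.get('VERIFY_TOKEN', 'missing-verify-token')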

Deploy Your Heroku Flask App and View Any Logs

Now that we have added these tokens using config variables, click the “Deploy Branch” button at the bottom of the page. Heroku will deploy your code and your site will start running publicly.

To track the general status of your application and how it’s processing requests, you can use the Heroku CLI or view the logs online by clicking the “View logs” tab at the top of the deploy page.

How to view logs in Heroku

After Heroku deploys your app, you’ll be able to visit it at https://[yourappname].herokuapp.com. If you’ve followed the steps correctly you should see a page that looks like this:

Invalid token page for a Facebook Messenger bot.

If you have not built the bot discussed in the earlier post, read the last section about verify tokens and Facebook’s general API around bots. In order to make a Messenger Bot, you need to provide Facebook with an endpoint to send all messages to as well as a verification token so your site only processes requests sent by Facebook. When you go directly to the link created by Heroku, you’re not providing the verification token, hence you see the “Invalid verification token” screen.

To set up our Messenger bot to communicate with our Heroku app, we need to visit the Facebook for Developers page for our application and revisit the Webhooks tab. Click “Edit Subscriptions,” change the Callback URL to https://[yourappname].herokuapp.com, and update the verify token.

Set the Facebook Messenger callback URL

Testing the Facebook Messenger Bot

We’re almost done with the steps needed in order to host our bot on Heroku. All that’s left is to test that our bot is working as expected. Go to the Facebook page you made for your bot and send it a message to verify that it still works.
Black Girl Magic Facebook Messenger bot

What’s Next?

Woo! You’ve gone from making your first Facebook Messenger bot to hosting it on Heroku. Your bot can now handle messages from multiple users 24/7, since it is no longer hosted locally on your computer. With all of this done, you can finish the Facebook Messenger approval process and get your bot approved to send messages to all users.

I hope you’ve enjoyed learning how to make and host a Messenger bot in Python. If you enjoyed this post, you can follow me on GitHub @wessilfie.

How to Host a Python and Flask Facebook Messenger Bot on Heroku
