refresh gitignore

11 Apr 2017

Retroactively add a .gitignore, and remove the files from your repo that you forgot to ignore in the first place.

ht: http://stackoverflow.com/questions/1139762/ignore-files-that-have-already-been-committed-to-a-git-repository

basically it’s just:

# commit any outstanding work first
git add .
git commit -m 'committing all my stuff'

# wipe the index (files stay on disk), then re-add everything;
# anything matched by .gitignore gets dropped from tracking
git rm -r --cached .
git add .
git commit -m 'holy crow, my ignored files will be deleted now!'

excel csv hell

9 Apr 2017

Just when you think Excel couldn’t suck more, it does - and on the most mundane of all things, too: exporting a goddamn sheet as a csv.

I’m using Excel 2016 for this example, but I’ve seen this for a bunch of versions of Excel and I can’t imagine the brilliant engineers over at Microsoft have ever looked into making their products compatible with, well, anything.

reading a csv in python

Reading actual csv files in python is usually a fairly trivial affair (using python 2.7):

import csv

with open('my_file.csv') as f:
    reader = csv.reader(f)
    rows = [row for row in reader]

Google Sheets

For example, if I have the following data in Google Sheets, an actual working program built out by an excellent engineering organization that takes its customers’ needs into account:

id  name
0   ryan
1   padmé

I can go ahead and read it in just fine (those gnarly trailing characters on padmé are just the UTF-8 byte representation of the accented e):

In [24]: with open('sheets_output.csv') as f:
    ...:     reader = csv.reader(f)
    ...:     rows = [row for row in reader]
    ...:

In [25]: rows
Out[25]: [['id', 'name'], ['0', 'ryan'], ['1', 'padm\xc3\xa9']]
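
Just to confirm those trailing bytes really are the accented e, decode them (continuing the same python 2 session):

In [26]: print(rows[2][1].decode('utf-8'))
padmé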

Excel

Now let’s say I have literally the same exact data in Excel.

I then select ‘Save As…’ and see ‘CSV UTF-8’ as a file option (sounds about right, of course). Save the file, and then the fun begins.

In [29]: with open('excel_output.csv') as f:
    ...:     reader = csv.reader(f)
    ...:     rows = [row for row in reader]
    ...:
---------------------------------------------------------------------------
Error                                     Traceback (most recent call last)
<ipython-input-29-0a959f4a7dde> in <module>()
      1 with open('excel_output.csv') as f:
      2     reader = csv.reader(f)
----> 3     rows = [row for row in reader]
      4

Error: new-line character seen in unquoted field - do you need to open the file in universal-newline mode?

Gross. No idea what that means. I thought I was just exporting some tabular data as a csv, not signing up for eternal damnation to hell.

So some googling indicates I need to change up my syntax a bit to account for universal-newline mode:

In [30]: with open('excel_output.csv', 'rU') as f:
    ...:     reader = csv.reader(f)
    ...:     rows = [row for row in reader]
    ...:

In [31]: rows
Out[31]: [['\xef\xbb\xbfid', 'name'], ['0', 'ryan'], ['1', 'padm\xc3\xa9']]

Okay, that’s starting to look at least a bit better, but what in the flying fuck is that gnarly thing prepended to my id column!?

It turns out this is a Byte Order Mark (BOM), which Excel bakes into its shitty csv output to explicitly flag the file as UTF-8 so that its equally shitty csv import can read it back. This stack overflow link provides more information than I would ever care to know about this: http://stackoverflow.com/questions/155097/microsoft-excel-mangles-diacritics-in-csv-files
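
You can see the BOM for yourself by peeking at the first few raw bytes of the file (a quick sketch against the same excel_output.csv):

import codecs

with open('excel_output.csv', 'rb') as f:
    head = f.read(3)

print(repr(head))               # '\xef\xbb\xbf' - the UTF-8 BOM
print(head == codecs.BOM_UTF8)  # True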

Thankfully someone wrote a python module for handling crazy csv inputs, appropriately called unicodecsv. It solves the issue by reading everything in as unicode and letting you specify the encoding so it can strip out the garbage. It’s basically a drop-in wrapper around the builtin csv module:

In [54]: import unicodecsv

In [55]: with open('excel_output.csv', 'rU') as f:
    ...:     reader = unicodecsv.reader(f, encoding='utf-8-sig')
    ...:     rows = [row for row in reader]
    ...:

In [56]: rows
Out[56]: [[u'id', u'name'], [u'0', u'ryan'], [u'1', u'padm\xe9']]

Link to the module: https://github.com/jdunck/python-unicodecsv

And it turns out we can use this exact code for reading the sheets_output.csv file without issue too!

encoding details

The file command will show you high-level details about a file, including its encoding and line terminators.

⚡  ~/Desktop ⚡  file excel_output.csv
excel_output.csv: UTF-8 Unicode (with BOM) text, with CR line terminators
⚡  ~/Desktop ⚡  file sheets_output.csv
sheets_output.csv: UTF-8 Unicode text, with CRLF line terminators

python 3

I’m a bit late to the python 3 party, but apparently the csv module now generally handles this stuff alright. You still need to open the file with the right encoding to strip the BOM, but in python 3 the builtin open() takes an encoding argument directly, so the codecs module isn’t even necessary:

>>> import csv
>>> with open('excel_output.csv', encoding='utf-8-sig', newline='') as f:
...     reader = csv.reader(f)
...     rows = [row for row in reader]
...
>>> rows
[['id', 'name'], ['0', 'ryan'], ['1', 'padmé']]

postgres alter

27 Mar 2017

move schema

alter table old_schema.my_table set schema new_schema;

rename table

alter table my_table rename to new_table_name;

rename column

alter table my_table rename column old_col to new_col;

iot

4 Mar 2017

Particle provides a nice web IDE for flashing programs to your devices, but I’d prefer to use the command line, and there were some kinks to work out.

I’m working off my repo for smart home / iot stuff:

https://github.com/ryantuck/asmartment

installing packages

I wanted to include FastLED in my .ino file, which works fine from the web interface but not so much on my machine.

🔥  ~/src/asmartment/things/strip 🔥  particle compile photon .

Compiling code for photon

Including:
    strip.ino
attempting to compile firmware
Compile failed. Exiting.
strip.cpp:1:29: fatal error: FastLED/FastLED.h: No such file or directory
 #include <FastLED/FastLED.h>
                             ^
compilation terminated.
make[1]: *** [../build/target/user/platform-6strip.o] Error 1
make: *** [user] Error 2

So I cloned FastLED into that working directory since the file has:

#include "FastLED/FastLED.h"

And got that to work, but now it’s complaining about not being able to find <avr/io.h>. Turns out this can be resolved by installing via brew:

brew tap osx-cross/avr
brew install avr-libc

And it takes god damn forever to install.

n choose k

25 Feb 2017

Stumbled upon this and thought it was cool. There’s probably just some math from high school that I’m forgetting that explains this in very simple terms.

If I have a set of N values:

[0,1,2,3]

I can figure out the number of possible ways to choose k values from them using ‘n-choose-k’:

n_choose_k = N! / ( k! (N-k)! )
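
For example, choosing k=2 values from those N=4 gives 4! / (2! 2!) = 6 possibilities, which is easy to sanity-check with itertools:

from itertools import combinations
from math import factorial

values = [0, 1, 2, 3]
n, k = len(values), 2

# enumerate the actual choices
pairs = list(combinations(values, k))
print(pairs)       # [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(len(pairs))  # 6

# and check against the formula
print(factorial(n) // (factorial(k) * factorial(n - k)))  # 6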

Computing this for every k from 0 to N gives rise to Pascal’s triangle:

from scipy.special import comb
ns = range(10)
combos = [[int(comb(n, i)) for i in range(n + 1)] for n in ns]

combos:

[[1],
 [1, 1],
 [1, 2, 1],
 [1, 3, 3, 1],
 [1, 4, 6, 4, 1],
 [1, 5, 10, 10, 5, 1],
 [1, 6, 15, 20, 15, 6, 1],
 [1, 7, 21, 35, 35, 21, 7, 1],
 [1, 8, 28, 56, 70, 56, 28, 8, 1],
 [1, 9, 36, 84, 126, 126, 84, 36, 9, 1]]

For another problem, I thought I needed to sum up all of these possible combinations for a given N:

from scipy.special import comb
ns = range(10)
# exact=True makes comb return ints instead of floats
combos = [sum(comb(n, i, exact=True) for i in range(n + 1)) for n in ns]

And holy crow, that’s just 2^N:

[1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
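
Which makes sense in hindsight: summing over every k counts all of the subsets of an N-element set, and since each element is either in or out of a subset, there are 2^N of them. Brute force agrees:

from itertools import combinations

n = 5
values = range(n)

# count the subsets of every size k = 0..n
total = sum(len(list(combinations(values, k))) for k in range(n + 1))
print(total)   # 32
print(2 ** n)  # 32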

postgres row numbers

25 Feb 2017

To get just one row per group, you can use row_number().

In this case, I want to see one example detail value for each distinct id in the table:

select * from (
    select
        id,
        detail,
        -- without an 'order by' inside over(), which row you get
        -- per id is arbitrary
        row_number() over (partition by id) as row_num
    from
        my_table
) x where row_num = 1;

web cli

17 Feb 2017

Because switching to Chrome and creating a new tab is just too much.

#!/bin/bash
# save as an executable script named 'web' somewhere on your PATH
# (open is macOS; on linux, swap in xdg-open)
open "https://$1.com"

usage:

web google
web github

sqlalchemy and stored procs

15 Feb 2017

Ran into a weird issue having to do with calling a stored procedure in postgres from python.

tldr - you have to explicitly begin and commit transactions that call stored procs: the library sees a plain select, assumes there’s nothing to commit, and the proc’s writes get rolled back. You don’t hit this in psql because it runs in autocommit mode, so each statement is committed as soon as it completes.

example

Here’s a silly stored proc:

create function do_stuff()
returns void as $$
begin

truncate table my_table;

insert into my_table values
    (1,2),
    (3,4)
;

end;
$$ language plpgsql;

To test if it works:

  1. truncate table my_table;
  2. run python script.
  3. select * from my_table; and check whether the rows were inserted.

won’t work

import dataset

# connect() with no url falls back to the DATABASE_URL env var
db = dataset.connect()
db.query('select do_stuff();')

results:

select * from my_table; -- 0 rows

will work

import dataset

db = dataset.connect()
db.begin()
db.query('select do_stuff();')
db.commit()

or, dataset lets you wrap all that up as a transaction:

import dataset

db = dataset.connect()
with db as tx:
    tx.query('select do_stuff();')

results:

select * from my_table; -- 2 rows (as expected)
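
Since dataset is just a wrapper around sqlalchemy, the same fix applies if you’re using sqlalchemy directly. A minimal sketch - the connection url here is made up:

import sqlalchemy

# hypothetical connection url - swap in your own
engine = sqlalchemy.create_engine('postgresql://localhost/my_db')

# engine.begin() opens a transaction and commits it on exit,
# so the proc's writes actually stick
with engine.begin() as conn:
    conn.execute(sqlalchemy.text('select do_stuff();'))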

gspread

14 Feb 2017

So you wanna pull some google sheets data into python?

gspread is a really handy python module for doing just that. There’s some authentication setup involved but it’s pretty straightforward.

Once you’ve authenticated, you can check out the readme for basic usage for reading/writing to and from sheets over at https://github.com/burnash/gspread

setup

creds

You need to set up your credentials so you can access your google sheet.

The gspread documentation has these instructions documented as well, but the tl;dr is that you need to head over to the API Console Credentials section of the google developers portal.

  1. Select ‘create a project’. Give it a name. Yatta yatta yatta.
  2. Select ‘create credentials’ > ‘Service Account Key’. I think this is the easiest. It’ll basically give you a file to download.
  3. Select ‘new service account’ and give it the ‘Owner’ role, along with some name.
  4. Select json key type.
  5. Hit ‘create’. This should download a .json file to your computer.
  6. Move that credentials file somewhere (let’s say ~/Desktop).

share a sheet

You’ll need to share your sheet with the service account you just created. From the credentials page, you should see a link for ‘Manage Service Accounts’. Check that out and you should see the ID for your new service account. It’ll look something like

rt-testing@123456.iam.gserviceaccount.com

Share your sheet with that account. Boom.

code

install stuff

pip install gspread oauth2client PyOpenSSL

pull sheet

Now pull it! gspread has great documentation for using the module but I’ve found setup to be a big pain in the ass, so I hope this post helps.

import gspread
from oauth2client.service_account import ServiceAccountCredentials

# that filename path is the creds file you downloaded
credentials = ServiceAccountCredentials.from_json_keyfile_name(
    filename='/Users/ryan/Desktop/rt-testing-12345678.json',
    scopes='https://spreadsheets.google.com/feeds',
)

gc = gspread.authorize(credentials)

worksheet = gc.open('My Workbook Title').worksheet('some sheet title')

records = worksheet.get_all_records()
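
get_all_records() returns a list of dicts keyed on the header row. A couple of other handy worksheet calls, with made-up cell references:

# every cell as a list of lists, header row included
values = worksheet.get_all_values()

# read and write a single cell by A1 notation
print(worksheet.acell('B1').value)
worksheet.update_acell('B1', 'new value')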

jq

14 Feb 2017

Examples using jq with the aws cli.

aws sns list-subscriptions | jq .Subscriptions | jq '.[] | select(.Endpoint == "my@emailaddress.com")'
aws lambda list-functions | jq .Functions | jq '.[] | .FunctionName'
aws ec2 describe-security-groups --group-ids <group-id> | jq '.SecurityGroups | .[] | .IpPermissions | .[0] | .IpRanges | .[] | .CidrIp' | sed 's/"//g' | sed 's/\/32//g' | sort
aws lambda list-functions | jq '.Functions | .[] | select(.FunctionName == "<fn-name>")'
aws lambda list-functions | jq '.Functions | .[] | select(.FunctionName | startswith("fn-name-start"))'