Work From Home Friday: Don’t be Afraid of Rebasing

Little late today; I was busy working/running around/watching paint dry, so I’ll keep this brief. I’ve mentioned before that git can be like a game save: it protects your butt if things go wrong and you need to get back to a safer place. But I also like to use git commits to tell a well-crafted story, or at least that’s what I want to present to the outside world. Having a lot of commits named “why doesn't this work” or “it does the things” or “rainbows and ponies” (a real PR of mine at a job at one point) doesn’t really mean anything to anyone but me. And unfortunately it probably doesn’t mean much to me either if I go back and look at it even a day later. So what’s a girl to do? Tell her own story, of course.

First let me say, “rewriting history” in git is a bit of a controversial subject for some people. But at multiple jobs, and in my own preference, I don’t have a problem with it as long as you’re not rewriting master or other people’s history. So what does that leave us? The ability to craft a really well documented pull request. I can walk you through the choices I made in my work. I can easily split out frontend and backend changes if different people need to look at different parts. I can make changes easier to find if everything that happens to one file lives in one commit.

I’ve gotten lazier about this at my current job, but my goal is to get back in the habit. It’s good hygiene and leaves my mental health intact.

For better words than mine, check out Nathan LeClaire’s Don’t Be Afraid of Git; a good visual tutorial can be found on the amazing Thoughtbot blog.

Cat Herding 101: The Things That Really Go Wrong For Beginning Programmers

We had a super awesome intern working with us over the summer (I often referred to her as “my” intern, which wasn’t 100% true, but as the one on the team who had done a lot of mentoring before, I felt very mamma bear about the situation). She did some cool things and had done some cool work on her own before working with us but when she needed help and was frustrated out of her mind with a problem (like many of us still often are) it usually ended up being easy enough for a fresh set of eyes to spot the problem.


These are the times when having a mentor, or even just a partner in crime who’s willing to give a quick code review, makes all the difference. Unfortunately not all of us get the luxury of a second set of eyes, so we fumble along adding and deleting things until we just throw it all away and start over (been there) or change so many things that one of them magically works, but we don’t actually learn anything because what the hell was the problem in the first place (also been there). So for my fresh coder friends, I thought I’d throw together a quick cheat sheet of the things I see most often in my own and my mentees’ code. I bet with any time spent coding you can probably guess a few.

 

Spelling

This one hits where it hurts, because it is a stupid mistake. A variable named “container” is not the same variable as “contanier” (and is also an awful, very vague variable name, but that’s a different lesson). This will drive you nuts, I promise – it’s currently driving my spellchecker nuts as I type. Everyone does this. Everyone. Promise. Even that awful person you just had a horrible first interview with and now will never have the job of your dreams, yep, he’s misspelled “first” I bet.
The Quick Fix: If you’re using an IDE or a decent editor (think Sublime Text, or Atom, or whatever the kids are using these days), it’s pretty easy to spot. When you go to autocomplete a variable starting with “cont” you will see two options, and one of them will not be what you want. Go ahead and search for that errant misspelling, see where else it’s been used in the codebase, and change it (unless some jerk really did have a variable named “contanier” in some obscure file; then change it and mock him for the next year).
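
To make the pain concrete, here’s a tiny JavaScript sketch of the two ways a misspelling tends to bite. Everything in it (container, contanier, settings) is a made-up name for illustration, not code from a real project.

```javascript
// Hypothetical names, invented for this example.
const container = { items: 3 };

// Case 1: reading a misspelled variable fails loudly.
function readMisspelledVariable() {
  try {
    return contanier.items; // typo: "contanier" was never declared
  } catch (err) {
    return err.name; // the runtime names the kind of error for you
  }
}

// Case 2: a misspelled property fails silently, which is worse.
function writeMisspelledProperty() {
  const settings = { container: "main" };
  settings.contanier = "sidebar"; // typo quietly creates a brand new property
  return settings;
}

console.log(readMisspelledVariable());  // ReferenceError
console.log(writeMisspelledProperty()); // container is still "main"
```

The first case at least yells at you. The second is the one that eats an afternoon, which is why searching the codebase for the misspelling is worth the thirty seconds.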

 

Conditional flaws

This can technically mean a lot of things, but I’m talking about the “if not this or that do this” variety. Unsurprisingly, when you spell it out “If not A and not B” sounds much different from “If not A or not B”, but when quickly writing code it’s easy to forget.
The Quick Fix: Say it out loud! Write out your logic flow using words! Do something that’s easier for your brain to see flaws in than the foreign language that is your code.
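
If saying it out loud isn’t enough, a truth table settles it. This is a minimal sketch with two made-up flags (a and b) that prints both versions of the condition side by side, plus their De Morgan equivalents.

```javascript
// a and b are hypothetical flags, invented purely for illustration.
function compareConditions(a, b) {
  return {
    neither: !a && !b,      // "not A and not B": true only when both are false
    notBoth: !a || !b,      // "not A or not B": true unless both are true
    neitherAlt: !(a || b),  // De Morgan: always equal to !a && !b
    notBothAlt: !(a && b),  // De Morgan: always equal to !a || !b
  };
}

// Walk the whole truth table so the difference is visible.
for (const a of [true, false]) {
  for (const b of [true, false]) {
    console.log(a, b, compareConditions(a, b));
  }
}
```

The two versions only agree when a and b have the same value. Whenever exactly one of them is true, “and” says no and “or” says yes, and that’s usually where the bug is hiding.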

 

Syntax errors

Blech, straightforward and an oldie but a goodie. Syntax errors are common. Everything from a typo to a misunderstanding of how a function works can lead to them. Thankfully most languages have been nice enough to also make them the easiest to solve.
The Quick Fix: Learn how to read those stupid error messages. Each language is usually a little different, but most of them follow a pattern. They generally have a complicated error message and a line number. It’s like they’re giving you the answer for free! Check out that line number. Does something look funny? Fix it and try again. Does it all look A-OK? Start googling that esoteric error message. I promise you’re not the first person to accidentally divide by 0.
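
Here’s a small JavaScript sketch of what “reading the error” looks like in practice. The helper name parseOrExplain is made up, and the exact wording of the message differs between runtimes, so treat the output as a shape rather than a promise.

```javascript
// parseOrExplain is a hypothetical helper, invented for this example.
function parseOrExplain(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    // err.name is the kind of error, err.message is the human-readable part,
    // and err.stack (not returned here) lists where it happened.
    return { ok: false, kind: err.name, message: err.message };
  }
}

console.log(parseOrExplain('{"votes": 1}'));  // parses fine
console.log(parseOrExplain('{"votes": 1,}')); // trailing comma is not valid JSON
```

The kind tells you which family of problem you’re in, and the message is the bit to paste into a search engine when the line itself looks innocent.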

 

Incorrectly nested divs

This one is a bit frontend-specific, but you are a lucky person if you’ve never accidentally had this madness in your codebase. It’s especially frustrating with things like Angular scope. Missing that end div tag or adding one too many in the wrong place can be a nightmare.
The Quick Fix: Always always use good HTML hygiene, indent for every new block.

<div class="outside">
  <div class="inside">
  </div>
</div>

Doing too much

This is a bit of a catch-all and a desperate plea to learn some git basics. Sometimes I’ll go wandering down a rabbit hole and not come up for air until I’ve changed at least four files. This is Not. Good. Do as I say and not as I do kids. This will lead to massive frustration when one part is making the entire thing fail or not compile. But shoot, you just made changes to four major files. So now it will take you at least 4x longer than it took to write to go back and debug. Make incremental changes, save progress with git. It’s like a video game – no one wants to do the boss battle again!


We can all make these same mistakes. In fact, I’m positive we all do, but for those of us who’ve been doing this a few years we can just shrug and groan and move on with our day. And this is probably that single point where I see newer engineers go off the rails. They feel dumb and like they’ll never be as smart as their coworkers or mentors. I know this because I remember it and I’ve seen it in people I mentor. My favorite way of taking the edge off, especially when they come up immediately on the defensive with an “I know this is probably stupid, but..” is to help them but then share a time when I’d done the exact same thing. We all had to learn this shit, some of us have just been learning it for longer (and have the dyed grey hairs to show for it).

Clojure Makes Me Feel Stupid

…but I’m kinda in love with it. Thanks to a friend/coworker I’m cautiously dipping my toe into the world of “functional programming”. I’m still not 100% certain I get the full gist of functional programming but, to my newb mind, I’d explain it as programming without consequences (side effects, to use the proper term). Functions only depend on their inputs, and external forces/state cannot muck with your function.

Apparently this makes it easier to predict what’s going on, but I still have issues reading what I just wrote. Parentheses all the way down, dudes. Also, prefix notation is technically useful, but it’s warping my poor little brain.

So 1 + 1 in Clojure is actually written as (+ 1 1). Makes sense, right? Sorta, except for the years and years of basic math classes that NEVER LOOKED LIKE THIS. Oh my brain. But it’s kinda cool that instead of 1 + 2 + 3 + 4 I only need to write (+ 1 2 3 4).

But seriously. This is a function in Clojure:

1
2
3
4
5
(defn adder 
  [x y]
  (+ x y))
 
(adder 3 4) ;; 7

Wat? So the first line defines the function name. The second line is the parameters that you give to the function, and the third line adds those two parameters together. I then call the function on line 5 with the arguments 3 and 4. The semicolons denote a comment, which I use to show what the output of the function is… 7.

As a super ridiculous aside. Parameters vs arguments? Parameters are the things that you define with the function. So x and y on line 2 are parameters. Arguments are what you pass to the function when you want to actually use it. So 3 and 4 on line 5 are arguments. Now go forth and be awesome!

So how do you be Clojure learners too? Currently I’m running through the clojurebridge curriculum on my own. Clojure in 15 minutes looks like a decent-y rundown of most of my syntax options. If you don’t want to put anything on your system yet, or just want to mess around with the syntax, there’s always Try Clojure which lets you program from your browser. And I’ve bookmarked Clojure for the Brave and True mostly for the title, but I haven’t really read any of it yet.

How to Git Ignore without .gitignore

So I had this problem at work. I’m running a virtualenv inside our main git repo for all my Python packages. The repo’s shared .gitignore obviously doesn’t know about it the way it knows to filter out node modules and other fun local bits, and I sure as heck don’t want to add my own one-line fix to the .gitignore everyone shares.

So what’s a girl to do? Does she just ignore that one annoying untracked file line every time she does a git status from the terminal? Nope, she uses git’s exclude file.

$GIT_DIR/info/exclude

So for me this meant I had to create an info folder in my .git folder in the repo. Then I created a file called exclude (no file extension). The syntax in that file is exactly like the .gitignore file; it’s just very, very local (to your computer and only for that repo).
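
If you’d rather script it than click around in a hidden folder, here’s a minimal Node sketch of the same steps. The function name addLocalExclude and the venv/ pattern are my own inventions for illustration, and it assumes .git is a plain directory at the top of the repo (which isn’t true for every setup, worktrees and submodules being the usual exceptions).

```javascript
const fs = require("node:fs");
const path = require("node:path");

// addLocalExclude is a hypothetical helper, not part of git.
// It appends a pattern to <repoRoot>/.git/info/exclude, creating the
// info folder and the exclude file if they don't exist yet.
function addLocalExclude(repoRoot, pattern) {
  const infoDir = path.join(repoRoot, ".git", "info");
  const excludeFile = path.join(infoDir, "exclude");

  fs.mkdirSync(infoDir, { recursive: true });

  const existing = fs.existsSync(excludeFile)
    ? fs.readFileSync(excludeFile, "utf8")
    : "";

  // Don't add the same pattern twice.
  const lines = existing.split("\n").map((line) => line.trim());
  if (lines.includes(pattern)) {
    return false;
  }

  // Make sure the new pattern starts on its own line.
  const separator = existing === "" || existing.endsWith("\n") ? "" : "\n";
  fs.appendFileSync(excludeFile, `${separator}${pattern}\n`);
  return true;
}

// Example usage (assumes you run it from the top of the repo):
// addLocalExclude(process.cwd(), "venv/");
```

It returns true when it added the pattern and false when the pattern was already there, so running it twice is harmless.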

Advanced Filtering with Angular.js

Everyone (who’s used Angular or seen an Angular tutorial) has seen the awesome realtime Angular search. It’s built with Angular’s filter, but there are more complex things you can do with filtering. On my Hack Reactor hackathon project I used filtering for just about everything, from picking out individual objects to finding just the right combinations, and I learned some awesome tricks and some pitfalls to avoid.

One of the coolest things you can do is filter by lots of things together. The filter ends up looking a lot like a JavaScript object:

 filter:{TEXT: text, notes: noted, category: category}

The way this works best is if you have a number of searches all on one page to pull together. Say a regular text search (text above) that needs to interact with a drop down menu filter (category). Just throw all the variables into one object and filter on that.
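
To show what combining searches means, here’s a plain JavaScript sketch of my rough mental model of an object filter: every property in the criteria has to match, using a case-insensitive “contains” check. It is not Angular’s actual implementation, and the helper names, bookmark data, and title property are all made up for illustration.

```javascript
// matchesAll and filterBy are hypothetical helpers, not Angular APIs.
function matchesAll(item, criteria) {
  return Object.entries(criteria).every(([prop, wanted]) => {
    // An empty search box or unset dropdown adds no constraint.
    if (wanted === undefined || wanted === "") return true;
    const actual = String(item[prop] ?? "").toLowerCase();
    return actual.includes(String(wanted).toLowerCase());
  });
}

function filterBy(items, criteria) {
  return items.filter((item) => matchesAll(item, criteria));
}

// Invented sample data.
const bookmarks = [
  { title: "D3 rollups", notes: "charts and data", category: "javascript" },
  { title: "Flask API", notes: "scraping data", category: "python" },
  { title: "Angular filters", notes: "search tricks", category: "javascript" },
];

// Text search on notes combined with a category dropdown.
console.log(filterBy(bookmarks, { notes: "DATA", category: "javascript" }));
// Dropdown only: the empty notes search is ignored.
console.log(filterBy(bookmarks, { notes: "", category: "javascript" }));
```

The first call leaves only the “D3 rollups” bookmark, and the second leaves the two JavaScript ones. That’s the appeal of the object form: each input on the page owns one property, and the list narrows as more of them get filled in.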

Another neat thing you can do is filter by IDs. For example, if you want to have individual pages for each item, you can normally just call to it based on the $index of that item. But what if you have searched and filtered your list of data into a more manageable grouping? Index doesn’t work! It’s pulling the index of the new filtered list and not the actual index that data point has in the entire data structure.

My fun workaround for this was to send the selected object itself to a function in my Angular controller for that page and find the data that way.

  $scope.add = function(website){
    $rootScope.selected = $rootScope.selected || [];
    if($rootScope.selected.indexOf(website) === -1){
      $rootScope.selected.push(website);
    }
  };

Of course this requires something in the $rootScope, but that also benefited me because I wanted to have access on a separate page to all of the data points a user chose. That way I could fill out a detailed report of the chosen bookmarks with user defined notes all on one screen.
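
The index pitfall is easy to reproduce without Angular at all. This plain JavaScript sketch uses an invented websites array (the ids and names are made up) to show the position in the filtered list pointing at the wrong item in the full list, and why passing the object itself avoids the problem.

```javascript
// Invented sample data for illustration.
const websites = [
  { id: 101, name: "alpha" },
  { id: 102, name: "beta" },
  { id: 103, name: "gamma" },
];

// Pretend the user searched for "m": only "gamma" survives the filter.
const filtered = websites.filter((site) => site.name.includes("m"));

// The user clicks the first (and only) visible row.
const indexInFiltered = 0;
const picked = filtered[indexInFiltered];

// Wrong: reusing the filtered index against the full list.
const wrongItem = websites[indexInFiltered];

// Right: work from the object itself, or from a stable id.
const realIndex = websites.indexOf(picked);
const foundById = websites.find((site) => site.id === picked.id);

console.log(wrongItem.name); // alpha
console.log(realIndex);      // 2
console.log(foundById.name); // gamma
```

Passing the object works because the filtered list holds references to the very same objects as the full list, so indexOf can find it again. A stable id does the same job if the data gets copied somewhere along the way.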

Custom APIs and Web Scraping for Science

So my team’s most recent application, Helix, involved genome visualization. We integrated it with the 23andme API, but still needed a way to find out interesting information about specific RSIDs (used by researchers and databases to refer to specific base pairs of DNA). By far the most useful and open source repository of genetic information is SNPedia, but I needed access to lots of information and to integrate calls to specific SNPs. Basically I needed an API. So being ever resourceful, I decided to make my own.

Tools for the task were an easy choice. I needed a small, fast server that I could implement a web scraper on. I have always wanted a reason to use BeautifulSoup, and since it’s a Python library I knew it would be easier to build a Python server to run the API endpoints. I chose Flask because of its lightweight nature and how much it reminds me of a Node/Express server at times.

Thankfully there are some really good tutorials for both Flask and BeautifulSoup; my favorites (and the ones I referenced when I hit weirdness) were Designing a RESTful API and Website Scraping with BeautifulSoup. Both of these tutorials said a lot of things better than I could have myself.

For access to my SNPedia API and information on how to use it, check out my project on GitHub.

D3.js Rollups

Do you have all the data and none of the visuals? Do you just want a pretty, fast way to compare lots of data that centers around maybe just a handful of moments?

D3.js can help you tame all of your data, and the rollup that comes with d3.nest is especially useful if you have lots of data that you need to combine into just a couple of data points. All it takes is a couple of (pretty long) lines of code and you will have an awesome visual that’s very customizable.

Let’s start with a really straightforward example of a rollup. In all of these examples, I’m using code straight from my HeatVote project, which requires me to pull voting data from our server API that I receive as a JSON blob. Here’s an example entry:

{ video_id: 'T-D1KVIuvjA',
  timestamp: 2,
  vote: 1,
  id: 1,
  createdAt: Sat Dec 21 2013 14:55:42 GMT-0800 (PST),
  updatedAt: Sat Dec 21 2013 14:55:42 GMT-0800 (PST) }

Now obviously there are a bunch of these, and technically there are easier ways to do this, but to show off the structure of a rollup, let’s count how many entries we have in our database using a d3 rollup!

var total = d3.nest()
  .rollup(function(d){
    return d.length;
  })
  .entries(data);

Remember, data here is my array of JSON entries, so in our rollup function the d is just shorthand for all of the data. This isn’t a very interesting example though, so let’s take a look at something that really shows off the beauty of a d3 rollup.

var averages = d3.nest()
  .key(function(d) {
    return d.timestamp;
  })
  .sortKeys(d3.ascending)
  .rollup(function(d) {
    return d3.mean(d, function(g) {
      return +g.vote;
    });
  })
  .entries(data);

Now there is a lot going on in these very compact few lines, so we’ll go through them one by one, but the result is that averages is equal to an array of objects with the properties key (that is equal to each unique timestamp) and value (that is equal to the mean of all votes at that timestamp).

So let’s break it down:

  • .key(...) is just used to tell the function what our keys are, only grabbing unique values of that property.
  • .sortKeys is just a prettiness thing, it sorts my keys into an order (when they’re pulled off the server the only order is by the time they were created on the database).
  • and finally our lovely .rollup(...). Now instead of d being an array of the whole data, it’s now an array of only the data for each individual key (so all of the data with the same timestamp). The inner function d3.mean takes a specific property from all of the data for each key and averages them up.
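
If the chained version still feels like magic, here’s the same three steps (group by key, sort the keys, roll each group up into a mean) written out in plain JavaScript with no D3 at all. The function name and vote data are invented for illustration. One deliberate difference: I sort the keys numerically here, which is a choice of this sketch and not necessarily what sortKeys does with string keys.

```javascript
// Invented sample data shaped like the entries above (extra fields omitted).
const votes = [
  { timestamp: 10, vote: -1 },
  { timestamp: 2, vote: 1 },
  { timestamp: 5, vote: 1 },
  { timestamp: 2, vote: 1 },
  { timestamp: 5, vote: -1 },
];

// averageVotesByTimestamp is a hypothetical helper, not a D3 function.
function averageVotesByTimestamp(entries) {
  // Step 1 (.key): bucket the entries by their timestamp.
  const groups = new Map();
  for (const entry of entries) {
    const key = String(entry.timestamp);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(entry);
  }

  // Step 2 (.sortKeys): put the keys in order (numerically, in this sketch).
  const sortedKeys = [...groups.keys()].sort((a, b) => Number(a) - Number(b));

  // Step 3 (.rollup): collapse each bucket into a single mean value.
  return sortedKeys.map((key) => {
    const group = groups.get(key);
    const sum = group.reduce((total, g) => total + Number(g.vote), 0);
    return { key: key, value: sum / group.length };
  });
}

console.log(averageVotesByTimestamp(votes));
// [ { key: '2', value: 1 }, { key: '5', value: 0 }, { key: '10', value: -1 } ]
```

Seeing it spelled out makes it clearer what the rollup callback receives: one bucket at a time, never the whole array.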

And that is d3 rollup in a nutshell. It’s really lovely at coaxing relationships out of your raw data, and you can obviously do a lot more with it than just averaging things. The d3 nest docs are probably the next best place to look to get your hands dirty (.rollup is a method of nest).