Archive for category Code

Philly PUG talk last week

Last Tuesday I was lucky enough to be invited by Philly PUG to give a co-talk with Spencer Russell. Spencer talked about making an easy-to-use command line console, and I talked about posting a project to the Python Package Index (PyPI).

It was a great talk to a packed house. More than 70 people showed up, with Bu Logics, where Spencer and I work, footing the bill for some snacks during the break.

We also shared our projects on GitHub. If you missed the talk, check out Spencer’s talk repo or my talk repo for a quick review of what you missed!


Excellent Hacks: Pathfinding

Finding the right feature to cut during an 11th-hour rush to shipping is a trick, and a meta-problem that can result in amazingly rewarding (and time-saving) hacks. Code of Honor has a great post on a pathfinding hack to get StarCraft out on time.

TL;DR: They removed collision between harvester units to avoid pathfinding failures. A great hack that causes no gameplay problem, and makes the whole system simpler to manage.


Fast Data Structures in Python

A blatant repost, mostly so I can easily find this excellent overview of fast Python data structures later.


What we believe about software development

I came across an excellent (and old) talk by Greg Wilson today, covering what we believe about software development. It’s a good talk about how little science we use in deciding and reviewing how we plan software, and it highlights how many bad or guesstimated numbers computer scientists and engineers throw around when talking about their profession. Watch the video.

The best developers are 28x more effective than the worst

Along with the bad statistics, this claim isn’t backed by good data. The studies were done long ago with small sample sets, and they really don’t apply to modern conditions. Common sense? Not especially. There is also the statistical problem of *ever* comparing the best to the worst. It’s a useless measure, since the ‘worst’ could be someone taught to code an hour ago. Comparing against the mean/median/mode is probably a lot smarter.

SCRUM and Sprints keep software from being late

Another good point. As far as I’m concerned, working in a sprint system is what got MakerWare out the door on such a tight schedule: iterative design, culling features, and allowing for error and error correction during the process. But the plural of anecdote isn’t data; the plural of anecdote is rumor. Is there really smarter planning going on? Smarter throwing away of features? What is the actual process that makes SCRUM seem, or be, better than planning up front?

Until I watched the video, I had not recognized how many software process decisions we as a culture make from anecdote-based suggestions or beer-fueled discussions. If you have time, watch the talk, or add his blog to your RSS reader.


Testing Python Packages less painfully

If you want to test a Python package before you share it with the universe, you can (in theory) use the Python Package Index test server (testpypi). In practice, using testpypi is painful, and the hints on the pypi front page don’t give enough details about how to do it. This post is my log of how I got testpypi working for me: it will walk you through getting your new module up on testpypi and making sure it works right before you post it to the world+cat on the Python Package Index (pypi).

setup.py

This post assumes you already have an awesome setup.py file, and already know how to build a basic package using distribute or setuptools. That work is already covered by some tutorials. Read those guides first if you need a setup.py intro, and come back here when you are done.
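For context, here is a bare-bones sketch of the kind of setup.py this post assumes. The project name, author details, and URL are placeholders, not a real package:

from setuptools import setup, find_packages

setup(
    name='mypackage',            # placeholder project name
    version='0.1.0',
    description='A short description of what the package does',
    author='Your Name',
    author_email='you@example.com',
    url='http://example.com/mypackage',
    packages=find_packages(),    # pick up every package under this directory
)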

Please use distribute

Welcome back. Ok, I lied. I would like to suggest one change to your setup.py file. My request is that you kindly please use distribute_setup. It’s the new hotsauce:

from distribute_setup import use_setuptools
use_setuptools()

That will use the newer distribute tool to set up your project.

Register a username on pypi and testpypi

Register for a username and package name at testpypi. I suggest you also grab a matching username at the main pypi site. You will need one of these to log in and update projects you want to post.

Manually create your config file

In theory, running python setup.py register will create a proper config file for setup. This failed for me many times, especially when trying to use some -r options to post to testpypi. Instead, it’s easier to manually create a .pypirc configuration file. On Mac/Linux, it lives at ~/.pypirc. If you know where it lives on Windows, please leave a comment. To use testpypi you mostly need to define it as an index server. For speed you can just use the config below, which defines the main pypi server as well as the test server. Edit it to use your own username instead of ‘yourlogin‘.

[distutils]
index-servers =
    pypi
    pypitest

[pypi]
username: yourlogin

[pypitest]
repository: http://testpypi.python.org/pypi
username: yourlogin

You can also add to each index server a line containing password:yourpassword. That is, if you believe passwords in plaintext files are a good thing!

Register your module name

Almost there. You can now run setup.py commands with the remote option specifying an index server named in your config file, for example by adding -r pypitest. Note that you have to put the -r option after each command that contacts a server in the params chain. It’s annoying, but that is what it takes. You can now register your project with pypitest by running:

python setup.py sdist  # create your distribution
python setup.py register -r pypitest  # register your package name

Upload your module

Now, you can (finally) upload your module to the testpypi server. In this example, the command line to do that would be:
python setup.py sdist upload -r pypitest
Of course, once you have done it step-by-step to sort out errors, you can do an all-at-once command of
python setup.py register -r pypitest sdist upload -r pypitest

Test downloading your module

Module uploaded successfully? No errors, failures, or insanities? Can you see it in the list of available testpypi packages? Gratz! But you can’t go home yet. You need to test installing the module from the index server.

The best way to test is to fire up a virtualenv to download your test module into. This will isolate it from messing up your system if something is broken. If you need to learn to use virtualenv, read this virtualenv tutorial and then come back. Once your virtualenv is running, just run pip with the --index-url option pointing at testpypi to grab your test package. The command for that (in this example, for pypeople) looks like:

pip install --verbose --index-url http://testpypi.python.org/pypi/ pypeople

If you are testing several times in a row, you will need to add --upgrade to force a re-install. Otherwise pip will skip a package that is already installed at that version.
pip install --verbose --upgrade --index-url http://testpypi.python.org/pypi/ pypeople

Success!

Finally, bugs are squashed, and your virtualenv install works. Gratz! You have tested posting your project as a module, and you have tested downloading your module. Now it’s time to push it over to pypi, and post it for the world to see/share.

If you want to read more, check out this great post at foobar.lu.


cltool: Example command line tool in Python

I have been hacking at a quick command line tool in Python for managing my address book. Being a bit of an over-sharer, I’ve been planning to distribute it via the Python Package Index (PyPI), in case other folks find it handy. Along the way I’ve run into several problems distributing it, ‘natch.

Rather than dirty up my address book workspace with testing, I created (and distributed) cltool, an example Python command line tool. The cltool project is a simple demo of how distribute and setuptools work to create an installer. Here’s a quick overview of it.

setup.py

All of the good sauce of the project is in the setup.py script. Python’s distribute tools are pretty powerful, except for the lack of a good uninstall. My first attempt used just the scripts entry to list what scripts get installed. However, after some extra digging, I discovered that you can create an entry point called console_scripts to point directly at Python functions to execute.
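To show the difference, here is a sketch of a setup.py using a console_scripts entry point instead of the scripts list. The cltool.main:main path is an illustrative guess at a layout, not copied from the repo:

from setuptools import setup, find_packages

setup(
    name='cltool',
    version='0.1.0',
    packages=find_packages(),
    # instead of listing shell scripts to copy into place...
    # scripts=['bin/cltool'],
    # ...point an entry point directly at a python function:
    entry_points={
        'console_scripts': [
            'cltool = cltool.main:main',   # hypothetical module:function path
        ],
    },
)

When the package is installed, setuptools generates a cltool wrapper script on the PATH that imports and calls that function.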

TESTING.md and README.md

These two files contain a detailed overview of what the project does, and get into some of the key points of how setup works. TESTING.md also includes recommendations for local testing of your new package.

If you ever want to deliver a command line tool for easy_install or pip, cltool is a good template to start from.



Git commit hooks

Git commit hooks can seem like magic, and are hella useful, whether they are running some unit tests or scanning for coding standards. If you know git and don’t use git commit hooks, you are missing some awesome free help.

What is a git hook?
Git hooks are amazingly simple tools. They are scripts that run when you do certain git actions, like after the repository receives a push. In each git project there is a .git directory, full of magic. Along with metadata on the project, one of the folders in there is .git/hooks/. Check it out:

$ ls example/.git/hooks
applypatch-msg  post-commit   post-update     pre-commit  update
commit-msg      post-receive  pre-applypatch  pre-rebase

Each of those is simply a script that runs when you perform the related git action. Part of the magic is that if the script exits with an error, the action doesn’t complete.

Why a commit hook?
Commit hooks are easily the most useful hooks. Want to scan your project for pep8 compliance during check-in? Easy. Want to run a spell checker during a fresh pull of markdown files from a sp3ll-fail prone colleague? Also easy!

Here’s a quick example script for running pep8 on *changed files only* when doing a commit. By stashing this into the file .git/hooks/pre-commit and setting that file executable, a developer can’t commit until pep8 succeeds.

#!/usr/bin/env python
import re
import subprocess
import sys

def system(*args):
    # run a command and return its stdout (ignoring the exit code)
    proc = subprocess.Popen(args, stdout=subprocess.PIPE)
    return proc.communicate()[0]

# find files git lists as changed for this commit
modified = re.compile(r'^[AM]+\s+(?P<name>.*\.py)', re.MULTILINE)
files = system('git', 'status', '--porcelain')
files = modified.findall(files)
# run pep8 on those files to check for problems
for name in files:
    output = system('pep8', name)
    # if pep8 reports anything, show the errors and abort the commit
    if output:
        print output,
        sys.exit(1)

It can be used for more than just work, though. Want to play a frustrating prank on a friend? Just update one of their .git/hooks/pre-commit files to contain:

#!/usr/bin/env python
import sys
sys.exit(1)


They won’t be able to commit to that repository, and (probably) will get driven a bit crazy figuring out what is failing.


Sprint Planning, now with disposable side-tasks!

TL;DR: When planning sprints, don’t pack your ‘nice to have’ features at the end. Pack 20% of each sprint with ‘nice to have’ or ‘candy’ features, so you can throw them away.

Planning sprints is a mixture of an art and a science. It’s hard to get all of the features you need (and some of the ones you want) stuffed into a development cycle. One of the tools I use to keep a project on track is making sure I have sacrificial features (or tasks) in every sprint, usually accounting for about 20% of the sprint goals.

Let’s say you have a project of 12 sprints, with 20 ‘dev points’ per sprint. A pretty normal breakdown would be to have some core features (red, 60%), some should-have features (yellow, 25%), and some nice-to-have features (green, 10%). Oh, and a little candy (blue, 2%) as well as some tools (purple, 3%) to build. I know this is a bit of a simplification, but if you know software, you get the gist. So go with it for a minute.
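To make that budget concrete, here is the arithmetic for the example project. The split is just the example above, not a rule:

total_points = 12 * 20   # 12 sprints at 20 dev points each = 240 points

split = [
    ('core (red)',           0.60),
    ('should-have (yellow)', 0.25),
    ('nice-to-have (green)', 0.10),
    ('candy (blue)',         0.02),
    ('tools (purple)',       0.03),
]

for category, share in split:
    print('%-22s %5.1f points' % (category, total_points * share))

So the nice-to-have and candy buckets in this example add up to roughly 29 points of luggage you can throw overboard.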

Naive Sprint Packing

Now, your newbie SCRUM master will try to plan out a set of sprints for success. Naturally, she wants to get ‘Core’ done first, then tinker on nice-to-haves. So she plans the project as:

Looks nice, right? Get the hard stuff done first! If that seems like a plan for success, you probably haven’t run many sprints. To begin with, you can’t build all core features in a block. Any time you have interoperation, you have two choices: define everything up-front (and fail) or make a basic design and evolve (and fail less). See also ‘Waterfall Model Considered Harmful’. Core features are not as parallelizable as ‘side’ features.

So, that red block of the project will creep, or get delayed, just by the nature of development. Once that happens, every update to management, and every time you check your timeline, it will end with ‘and we are behind on Must-Haves A, B, and C’. Every. Update. For weeks. As a hacker, you may know that is not so bad, and it’s just lag from the front end, but to management, and to the not-as-logical-as-you’d-like parts of your brain, what they hear is ‘FAILSAUCE’.

Intermediate Sprint Packing

A more experienced scrum master will break things out a little more, and do something like this:

It’s getting better. Now when things drift, they are more likely to be yellow items, so it’s not such a failure, and you can cast those off. But still, I bet three months’ pay that 99% of projects (yours included) will be behind by sprint 8. Instead of red items, yellow items are failing, but still, there is fail.

Advanced Sprint Packing

A more experienced scrum master is going to spread things out even more, and make sure, every step of the way, there are tasks that can ‘fall off’ a sprint without killing the final deliverable.

Every sprint will go bad. Unless you are both awesome and amazingly lucky, almost every sprint some core task will go over estimate. Those red tasks are going to be on the plate almost every sprint. And every sprint you will have some blue/green tasks to throw away. Why? Because your estimates will be wrong, mistakes will happen, and you want to throw some luggage overboard and make the product skinnier, instead of having late items the whole project long.

When you pack your ship, keep some extra luggage around to throw overboard during the journey!

Actual Programmer Productivity 2

A few days ago I wrote about Actual Programmer Productivity, inspired by a post by Matt Rogish. Today I’d like to look at how that productivity changes over time.

One of the points I made is that hours 40 to 60 of a work-week can produce useful code in the first week of crunch-time (aka overtime). That last clause is the most important, since most people will extrapolate that those hours can generate good code in week n+1 of crunch-time, which is not the case. Each week of overtime wears down a developer (or designer, or whatnot) and causes their baseline productivity to drop. It also takes several weeks of refresh time for programmers to get back to their pre-crunch baseline productivity. A rough graph of it looks something like this:

Programmer Productivity under a run of Crunchtime

That is not the most scientific graph, but it’s a good start based on experience on a bunch of projects. These numbers are similar across game developers, Ford assembly workers in 1909, and modern developers in a crazy cool office.

See Also:
8017_b from AWCI.org (pdf)
Rules of Productivity
Working Too Much Is Stupid


Actual Programmer Productivity

A bunch of websites and tech evangelists have been talking in the last few months about how bad working too much is. Not just ‘hey, it’s not good for your health’ bad, but ‘it’s not good for your codebase’ bad. For the past 3-4 years, that has been one of my pet peeves, and it’s awesome to hear other people get on board. Matt Rogish has a good graph (cc’d here) that is his view of productivity, and how it takes a dive. I wanted to echo his graph, and add a little data of my own.

Matt’s Graph:

I think this is a good baseline, but there is some more information I want to add to that.

When I think about the productivity of code, I think it can be broken down into a couple of sections. One number can abstract it, but some background before that number is useful.

  • Ideas are a new design or a good way of implementing something. This is the stuff that happens at the whiteboard or discussion phase.
  • Plans are the ‘I will build X, then Y, then test X & Y, and after that run Z’. They are the skill of breaking a design into a flow.
  • Functions are just that: chunks of code that are a function, be they good or bad.
  • Lines of Code (LOC) are also just that: lines of code, again good, bad, or total failsauce.
  • Lines of Usable Code (LOCU) are lines of code that do what you want, i.e. mostly bug-free code that doesn’t need to be rewritten and doesn’t contribute directly to technical debt. We can also call this Technical Capital.

Adding all that together, I came up with my own graph of how I think productivity goes for a week:

Far’s Graph


LOCU (Lines of Usable Code) is very low. Deal with it. Most developers think they can write 1K to 3K lines of code in a day. That is possible, but only on the best of best days, and it is usually followed by a week of tweaks and updates. Few people reliably get above 10 usable LOC per hacker-day. Period. If this number seems low, crunch some stats on any project you can get source code for, and you will see it’s true.
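If you want to crunch those stats yourself, here is a rough sketch of mine (not from the original post) to run inside a git checkout. It counts raw lines added per author-day, which is only a crude proxy; it says nothing about how many of those lines are *usable*:

# crude proxy: raw lines added per author-day from git history
import subprocess
from collections import defaultdict

log = subprocess.Popen(
    ['git', 'log', '--numstat', '--date=short', '--pretty=%ad|%an'],
    stdout=subprocess.PIPE).communicate()[0].decode('utf-8', 'replace')

added = defaultdict(int)
bucket = None
for line in log.splitlines():
    if '|' in line and '\t' not in line:
        bucket = line.strip()            # one bucket per author per day
    elif '\t' in line and bucket:
        adds = line.split('\t')[0]
        if adds.isdigit():               # skip binary files ('-')
            added[bucket] += int(adds)

if added:
    average = sum(added.values()) / float(len(added))
    print('average raw lines added per author-day: %.0f' % average)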

Hours 40 to 60 can and do generate useful code, in Week 1. The problem with that is, each week onward, your baseline hacker productivity drops. I’ll cover this more in a follow-up post.

By 70 hours, you are just hurting yourself. If you are still writing code 70 hours into the week, you are simply laying code-landmines to walk into later. Or worse, design landmines to walk into later. By the 70th hour coding you are doing little to no good, and just doubling down on damage. Walk away, or fall asleep in place. Just stop coding.

I nearly added a straight GUD FEELIEZ line for felt productivity. Part of the problem is that most programmers type quickly, and feel productive even when they are spitting out garbage code that will just need to be deleted the next day. In a lot of ways, developers (myself included) have little or no internal feedback system to tell them when they are writing crap code, or when they are typing nonsense. Some of the least productive programmers I know can scramble 60 wpm for hours and hours on end, feel super-productive… and still have zero functional code. It’s working functionality that ships code, not GUD FEELIEZ.

Based on 10 years or so of development (at home and at work), as well as a lot of experience leading teams, I think Matt Rogish has a good (though, literally, one-dimensional) graph of how productivity works.

Following this, I’m going to get a bit into week-over-week productivity as well.
