Quotable quotes

It would be arrogant of me to think that I have solved the problem of large-scale software development:

It is widely acknowledged that coordination of large scale software development is an extremely difficult and persistent problem.

Source: “Splitting the Organization and Integrating the Code: Conway’s Law Revisited,” Herbsleb and Grinter, 1999 (PDF).

One of the most common antipatterns of commercial software development:

In this evil, but extremely common, mirror universe, developers branch to create features. This branch stays isolated for a long time. Meanwhile, other developers are creating other branches. When it comes close to release time, all the branches get merged into trunk.

At this point, with a couple of weeks to go, the entire testing team that has been basically twiddling their thumbs finding the odd bug on trunk suddenly has a whole release worth of integration and system-level bugs to discover, as well as all the feature-level bugs which have not yet been found because nobody bothered to have the testers check the branches properly before they got integrated.

Source: Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation, Humble and Farley, 2010.

This view is too optimistic for me, given what I know about human nature and our sinful condition:

Technological optimists believe that technology makes life better. According to this view, we live longer, have more freedom, enjoy more leisure. Technology enriches us with artifacts, knowledge, and potential. Coupled with capitalism, technology has become this extraordinary tool for human development. At this point, it is central to mankind’s mission. While technology does bring some unintended consequences, innovation itself will see us through.

Source: “The Moral Character of Cryptographic Work,” Rogaway, 2015 (PDF).

This article was how I first came across the work of Erik Dietrich. Since then I’ve had a lot of fun reading his articles on software development and office politics.

In the sense of skills acquisition, one generally realizes arrested development and remains at a static skill level due to one of two reasons: maxing out on aptitude or some kind willingness to cease meaningful improvement. … [L]et’s discard the first possibility (since most professional programmers wouldn’t max out at or before bare minimum competence) and consider an interesting, specific instance of the second: voluntarily ceasing to improve because of a belief that expert status has been reached and thus further improvement is not possible. This opting into indefinite mediocrity is the entry into an oblique phase in skills acquisition that I will call “Expert Beginner.”

Source: “How Developers Stop Learning: Rise of the Expert Beginner,” Erik Dietrich.

Computer security reading list (part 2)

I came across Bruce Schneier’s “CRYPTO-GRAM” many years ago when it was an email-only newsletter, and ever since then I’ve been interested in security. Taking St. Cloud State’s Computing Ethics class (CSCI 332) showed me more of the ways security affects computing.

Continuing my last post, Computer security reading list (part 1), this post includes some resources on computer security. Not that I’ve read everything linked to here, but I’ve read enough to be drawn in and informed. I hope that you may find these useful as well.

Blogs (continued)

Google Project Zero: Project Zero is a team at Google that looks at different systems and finds and documents zero-day vulnerabilities.

Articles

Anatomy of an exploit – inside the CVE-2013-3893 Internet Explorer zero-day – Part 1 “Create 10,000 items in the current web page, giving each one a title string of ‘3333….3333’.” This reminds me of the tweet about the QA Engineer.
So Long, And No Thanks for the Externalities: The Rational Rejection of Security Advice by Users (PDF), Cormac Herley, 2010. “Users’ rejection of the security advice they receive is entirely rational from an economic perspective.”
Introducing the “Secure Account Management Fundamentals” course on Pluralsight “I mean I’m going to use SHA1 with a salt so yeah, I’m going to hash it.”
Wikimedia Foundation MediaWiki: Application Penetration Test (PDF) “iSEC identified a total of fourteen issues, including two of high severity. Most of the high and medium severity vulnerabilities are related to data validation and allow for various common attacks including XSS, DoS, and CSRF.”

Lectures

Offensive Computer Security (Florida State University) “Hacking vs. penetration testing, what’s the difference? PERMISSION.”

Computer security reading list (part 1)

Blogs

Schneier on Security: Bruce Schneier writes about the privacy and policy implications of security.
Krebs on Security: Brian Krebs focuses on security that targets the consumer, whether through fraud or data breaches.
Troy Hunt: Troy is a Microsoft MVP for Developer Security. He writes about the dos and don’ts of good application security from a developer’s perspective.
Matthew Green’s “A Few Thoughts on Cryptographic Engineering.”

Books

Security Engineering: A Guide To Building Dependable Distributed Systems, Ross Anderson. “Security engineering is about building systems to remain dependable in the face of malice, error, or mischance.”
Future Crimes: Everything Is Connected, Everyone Is Vulnerable and What We Can Do About It, Marc Goodman

Talks

TED: The security mirage, Bruce Schneier. This one is suitable for a general audience. Good explanation of modeling threats, risk perception. Not just computer security.
TED: Fighting viruses, defending the net, Mikko Hypponen

Reports

Verizon 2015 Data Breach Investigations Report: Click “Get Full Report,” then find the small “Download Only” link in the corner to avoid giving up your contact information.
Mandiant APT1 Report (PDF) “Exposing One of China’s Cyber Espionage Units”
OWASP Top Ten “The OWASP Top 10 provides a list of the 10 most critical web application security risks.”

Finding concurrency bugs in Objective C

Introduction to concurrency in Objective C

We can introduce concurrency to a process by creating threads. Within a process, threads share the same memory space and the same program code. Each thread has its own position within that code (its program counter), its own register state, and its own call stack. A thread is in one of three main states: running, ready, or blocked.

In OS X 10.6, Apple introduced Grand Central Dispatch (GCD), a thread pool technology that simplifies the use of threads. GCD introduces the concept of queues and tasks. A process can have multiple queues, and each queue holds zero or more blocks. A block is a task for execution on that queue. From a programming languages perspective, a block is a closure (a function plus a binding environment). As a user of the queue API, you simply dispatch blocks on a queue, and GCD manages running them.

Behind the scenes, GCD will create and destroy threads as needed in order to handle the workload of the queues the user has created.

In the case of a serial queue, one block runs at a time. In the case of a concurrent queue, GCD may concurrently execute blocks placed on that queue on different threads.

There are two ways to place a block on a queue.

dispatch_sync places a block on a queue and waits until that block completes
dispatch_async places a block on a queue and continues immediately
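
To make the difference concrete, here is a toy, single-threaded model of a serial queue. It is written in JavaScript rather than Objective C so it can run anywhere, and it is not real GCD: the SerialQueue class and its dispatchAsync and dispatchSync methods are hypothetical stand-ins. In real GCD, a dispatch_async block may also start on another thread before the caller's next line runs, which this model does not capture.

```javascript
// Toy single-threaded model of a serial queue (NOT real GCD).
// SerialQueue, dispatchAsync, and dispatchSync are hypothetical names.
class SerialQueue {
  constructor() {
    this.blocks = []; // pending blocks, in FIFO order
  }

  // Like dispatch_async: enqueue the block and return immediately.
  dispatchAsync(block) {
    this.blocks.push(block);
  }

  // Like dispatch_sync: enqueue the block, then wait until it has run.
  // In this model, "waiting" means running everything ahead of it first.
  dispatchSync(block) {
    this.blocks.push(block);
    this.drain();
  }

  // Run pending blocks one at a time, oldest first.
  drain() {
    while (this.blocks.length > 0) {
      const next = this.blocks.shift();
      next();
    }
  }
}

const events = [];
const queue = new SerialQueue();

queue.dispatchAsync(() => events.push("async block ran"));
events.push("caller continued after dispatchAsync");

queue.dispatchSync(() => events.push("sync block ran"));
events.push("caller continued after dispatchSync");

console.log(events.join("\n"));
```

In this model, the caller moves on right after dispatchAsync, but does not get past dispatchSync until the sync block (and everything queued before it) has finished.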

Finding concurrency bugs in Objective C

In code that uses multiple queues, one common concurrency bug is deadlock. Deadlock can happen when there is contention between two queues.

One example of deadlock is when blocks on two separate queues make dispatch_sync calls to each other.

Let’s say we have two queues, the sending queue and the manager queue. The sending queue has a method called beginTransaction, and the manager queue, which oversees both sending and receiving, has methods for managing crypto keys, including one called updateCryptoKey.

beginTransaction looks like this:

// make sure the transaction is valid
dispatch_async(sendingQueue, ^{
    // ensure no other transactions are in progress
    // get crypto keys for this transaction
    // put the transaction in our data structure
    // start the transaction
});

updateCryptoKey looks like this:

// make sure the key being submitted is valid
dispatch_sync(managerQueue, ^{
    // is there an existing key?
    // if not, add the key
    // if yes, remove the existing key and add the new key
});

There is a potential deadlock between the sending queue and the manager queue. Each block has a dependency on the other queue. For the block in beginTransaction, the step of getting the crypto keys will do a dispatch_sync to the manager queue. Meanwhile, for the block in the manager queue, the step of removing the existing keys will do a dispatch_sync to the sending queue as part of cancelling any existing transactions under the old key.

These queue dependencies aren’t immediately obvious, because they are hidden behind function calls.

When finding deadlock between queues, it doesn’t matter how a block got placed on a queue. Here we have one block that was placed using dispatch_async, and one that was placed using dispatch_sync. What matters is what happens once that block is running. Since these two blocks are on separate queues, they can be running simultaneously. Then one of them does a dispatch_sync to the other, and it waits, because that queue is not free at the moment. Then the other does a dispatch_sync back to the first, and it waits too, because that queue isn’t free either. This is a deadlock.
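
One way to make these hidden dependencies visible is to write down, for each queue, which other queues its blocks may dispatch_sync to, and then look for a cycle. The sketch below does that in JavaScript rather than Objective C. The syncCalls table is filled in by hand from the example above, and findSyncCycle is a hypothetical helper, not part of GCD or any Apple tool.

```javascript
// Hypothetical table: each queue -> queues it may dispatch_sync to
// while one of its blocks is running (taken from the example above).
const syncCalls = {
  sending: ["manager"], // beginTransaction's block fetches crypto keys
  manager: ["sending"], // updateCryptoKey's block cancels transactions
};

// Depth-first search that returns one cycle of queue names, or null.
function findSyncCycle(graph) {
  const visiting = new Set(); // queues on the current search path
  const done = new Set(); // queues fully explored with no cycle found

  function visit(queue, path) {
    if (visiting.has(queue)) {
      // Reached a queue already on the path: slice out the cycle.
      return path.slice(path.indexOf(queue)).concat(queue);
    }
    if (done.has(queue)) return null;
    visiting.add(queue);
    for (const target of graph[queue] || []) {
      const cycle = visit(target, path.concat(queue));
      if (cycle) return cycle;
    }
    visiting.delete(queue);
    done.add(queue);
    return null;
  }

  for (const queue of Object.keys(graph)) {
    const cycle = visit(queue, []);
    if (cycle) return cycle;
  }
  return null;
}

const cycle = findSyncCycle(syncCalls);
console.log(cycle ? "possible deadlock: " + cycle.join(" -> ") : "no cycle found");
```

A cycle only shows that a deadlock is possible, not that it will happen on every run. The table is only as good as the reading of the code that produced it.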

Three weeks with a smartphone

Three weeks ago, a shiny new Galaxy S4 arrived in the mail. I purchased the Samsung Galaxy S4 GT-I9500 32GB (unlocked) on eBay. The phone was released about a year ago, but I found a vendor who was selling it new. When it arrived, it had Android 4.3 on it. Recently, my phone received the update to Android 4.4, an OS which by now is 6 months old.

Initial setup

There were a lot of questions and informational boxes, both from Samsung and Google. I was a little overwhelmed. Later on, I went back into the different apps and looked more carefully at the settings.

Many apps I used on my iPod were also available on Android. I installed Facebook, Twitter, Dropbox, ESV Bible, Olive Tree, Springpad, Pleco, Webster’s dictionary, Google Authenticator, MPR Radio, and CrashPlan. A few apps I had not been able to use on the iPod, because they only supported iOS 6 and higher. This list included Amazon MP3 and Bandcamp, though I don’t use either of them often. Apps I installed that were not on iOS include Firefox and F-Droid, which I’ll talk more about later.

Comparison with iOS

Before owning the Galaxy S4, I owned a 3rd generation iPod touch for 2 years. It wasn’t a phone, and it didn’t have a camera, but it did a lot of the things a smartphone does, provided I was in wifi range. I installed a number of apps and used Apple’s included apps as well. When I bought the S4, I had to get used to the differences in the user interface.

Some physical differences that stuck out to me:

Don’t rest your thumb on the bottom edge of the phone, or you’ll bump the back button. I had to get used to holding my Android phone differently than my iPod touch.
The sleep button is on the side instead of on top, which means that there’s no side to rest the phone on when it is horizontal. Which buttons do you want to mash if watching a video in widescreen?
The sleep button also becomes a shutdown menu if you long press it (same as the iPod), but the definition of “long” is much shorter on Android. I’ve seen that menu way too many times.

Some software differences:

Rearranging icons on the home screen is much different.
Launching apps and going home feels slower on the Galaxy S4. This is a little disconcerting, considering the S4’s hardware is 4 years newer.
Two different photo apps: Photos and Gallery. Apparently Google is trying to shift over to Photos.
The lock screen clock widget is inaccurate half the time. The little numbers in the upper right are always right, but the big numbers aren’t always right. This is unacceptable.
I like not having to install iTunes in order to transfer music on the phone.
Apps live in two places: In the Apps list (shows all your apps), and possibly on your home screen (where you can add widgets, arrange to your heart’s content)

Overall, iOS feels more integrated. Apple has had more time to refine their product. On Android, the unreliable lock screen clock widget and apps living in two places are good examples of the rough edges. Another example is system updates. For Android, when you get an update depends on the device you’re using and which carrier you have. For iOS, you get the update when it comes out.

On the other hand, Android is much more customizable. I installed F-Droid, a 3rd party app store with only open source apps in it. So far I’ve installed KeePassDroid, SatStat, and FillUp from it. It makes my inner free software person happy. You can also install custom keyboards, root your phone, and even replace the firmware, things I haven’t yet been adventurous enough to do.

If my readers have any thoughts about iPhone vs. Android, I would welcome them in the comments.

Being objective about Objective C

(Sorry, couldn’t resist the pun.) A few months ago, I started a new job that required me to learn Objective C and Cocoa/Cocoa Touch development on the Apple platform. In this post, I’ll relate a few concepts that I learned in other languages that crossed over to the Apple platform.

MVC architecture

In web app development, the view is (arguably) the DOM. The DOM is loaded at a certain point in time: when the browser is finished parsing the initial payload, a load event is generated, and as a developer you can specify a handler for this event. This allows you to do things when that view appears, like populate it with data.

In iOS, the closest equivalent to the load event is viewWillAppear. A UIViewController subclass (or an NSViewController subclass on OS X) can override this method to perform custom behavior when the view is about to appear, such as updating the UI based on the contents of some property.

Connecting the view to the controller

In Cocoa/Cocoa Touch, outlets and actions are used to connect objects in the view with code in the controller. Outlets are for linking objects in the view to a property in the controller. This allows the controller to manipulate that object, for example, setting the title of a button object. Actions are triggered when an event happens to an object. For example, an action responds to a click on a button. Action methods are passed the sender of that action.

I would compare outlets to DOM objects, and I would compare actions to DOM event handlers. Xcode makes creating outlets and actions a drag-and-drop operation, but in JavaScript, you use document.getElementById or jQuery. DOM event handlers are attached to an object and specify a callback function, which can also identify the sender (through the this keyword).

Event loops

An event loop handles all incoming events and allows the application to respond appropriately. More generally, an event loop takes a queue of messages and processes them one at a time. An example of a message would be “call this callback function”. Processing that message means actually executing the callback. For example, in JavaScript, you specify a callback for the click event. When the click happens, the message to call that callback is placed on the message queue. The loop processes messages one at a time, so if you have another JavaScript function that is currently processing a table (maybe shading the rows), the click callback message will not be processed until that function finishes.
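
The idea can be sketched with a toy event loop in JavaScript. This is a simplified model, not how a browser or Cocoa implements its loop: createEventLoop, post, and run are hypothetical names, and the click is simulated by posting a message while the table-shading callback is still running.

```javascript
// A toy event loop: a FIFO queue of messages, each carrying a callback.
// createEventLoop, post, and run are hypothetical names for this sketch.
function createEventLoop() {
  const queue = [];
  const log = [];
  return {
    log,
    // Place a "call this callback" message on the queue.
    post(name, callback) {
      queue.push({ name, callback });
    },
    // Process messages one at a time until the queue is empty.
    run() {
      while (queue.length > 0) {
        const message = queue.shift();
        log.push("start " + message.name);
        message.callback();
        log.push("end " + message.name);
      }
    },
  };
}

const loop = createEventLoop();

loop.post("shadeTable", () => {
  // Simulate the user clicking while the table is still being shaded.
  loop.post("click", () => {});
});

loop.run();
console.log(loop.log.join("\n"));
```

The click message is queued in the middle of shadeTable, but its callback does not start until shadeTable has ended.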

The Cocoa main event loop, GLib main event loop, and JavaScript event loop are conceptually the same.

Node.js uses a single-threaded JavaScript event loop to great advantage. Using Node, it is fairly easy to implement a scalable, event-driven web application.

Cocoa extends the main event loop, allowing you to create other queues. Queues are event loops that may or may not run on another thread. However, messages dispatched on the same serial queue will always be executed one at a time. This allows you to protect variables that aren’t thread safe by only accessing them from a particular queue.

The GLib event loop is used by a GUI library such as Gtk+ to make a Linux application. I’ve used these libraries while fixing a few GnuCash bugs and while writing a Twitter client.

Conclusion

My experience with Gtk+ and JavaScript allowed me to quickly pick up core Cocoa concepts and design patterns. Now if I could only remember to do all UI stuff on the main queue!

Yet more website tweaks

Improve server response time

In my original post about hosting my own blog, I mentioned that Google PageSpeed Insights was complaining about server response time. After some research, I realized that my home page was quite large. It was over 1 MB, mainly because of a particular post which contained some large images. (I was complaining about the way graphics are configured on Windows, and included some large screenshots.) The fix had two parts. First, I cropped two of the screenshots to bring the size down a bit more. Second, I added the “More” tag, which makes users click a “Continue reading” link from the home page if they want to see the whole post.

If you want to measure the page load size on your own blog, clear your browser cache, then open up developer tools. Reload the page. On the “Network” tab (“Net” in Firebug), there is a summary of the number of HTTP requests, the amount of data downloaded, and the time it took.
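
If you would rather script the same summary, here is a rough sketch in JavaScript. The summarizeEntries function is a hypothetical helper. In a browser console you could pass it the result of performance.getEntriesByType("resource"), but the sample data below is made up so that the sketch runs anywhere. Note that transferSize can be 0 for cached or cross-origin resources, so the total may be lower than what the Network tab reports.

```javascript
// Summarize resource timing entries: request count and total bytes.
// `entries` is expected to look like the objects returned by
// performance.getEntriesByType("resource") in a browser.
function summarizeEntries(entries) {
  let totalBytes = 0;
  for (const entry of entries) {
    // transferSize can be 0 (or missing) for cached or cross-origin resources.
    totalBytes += entry.transferSize || 0;
  }
  return { requests: entries.length, totalBytes };
}

// Made-up sample data so the sketch runs outside a browser.
const sample = [
  { name: "style.css", transferSize: 12000 },
  { name: "script.js", transferSize: 48000 },
  { name: "cached.png", transferSize: 0 },
];

const summary = summarizeEntries(sample);
console.log(summary.requests + " requests, " + summary.totalBytes + " bytes");
```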

Add LaTeX and YouTube to WordPress

Recently I found another post that needed some special care. In 2012 I posted about my computer animation project and included a YouTube video and some math equations. These used features unique to WordPress.com: by simply pasting the link, it will embed a YouTube video, and by using a special tag, you can include math equations using the popular typesetting language LaTeX. One way to bring these into my WordPress.org blog would be to use WordPress.com’s plugin Jetpack. Jetpack brings a lot of WordPress.com features to WordPress.org, but I didn’t want all of them. Instead, I opted for two small plugins, WP-Latex and YouTube Embed Plus.

For people setting up their own blog, I’d recommend either Jetpack or a combination of smaller plugins to enable these features.

From the Garden to the City book review

From the Garden to the City: The Redeeming and Corrupting Power of Technology
John Dyer

John Dyer is a web programmer and a theologian. I’ve played with his JavaScript app Browser Bible and I’ve followed his blog Don’t Eat the Fruit for some time. In 2011 he published From the Garden to the City. I read it last year, and am now digging out my notes to share some of the ways it influenced me.

Technology is not neutral

I had always thought that technology was neutral. When I was growing up, my mom would try to get me to stop spending so much time on the computer. The computer is bad, she would say. It didn’t seem particularly bad to me. I knew that it could be used in moral or immoral ways. For example, I could use it to watch porn, or I could use it to do my homework. But if I spent too much time using it, surely that wasn’t because the computer was bad. Maybe I just needed to go play in the park every once in a while.

My neutral view of technology was too simplistic. The computer is not so much neutral as it is transformative. It changes me, it changes my family, it changes society. It brings tendencies, for example, a tendency to be used by only one person at a time. It brings a tendency toward distraction because the computer can be downloading a file, installing an update, and playing music all while I’m trying to write a research paper. It tends to make communication easier, bringing people closer through social networks. Saying that a computer is neutral would wrongly ignore its more nuanced characteristics.

In Dyer’s book, he gives three examples of technologies that are not neutral. The first example is a simple one: a shovel. A shovel is a tool used to make holes in the ground. It can be used for good purposes, say to plant a tree, or for bad purposes, say to conceal stolen treasure. But regardless of the purpose it is used for, the shovel has (1) made digging holes easier, and (2) given somebody blisters in the process, and maybe even a sore back. It makes it easier to shape the ground, and may inspire somebody to plant a tree.

The second example is Twitter, a digital communication tool used to share 140-character messages with the public. A person can choose what tweets they want to read. So one person follows Seth Godin, and another person follows Justin Bieber. Obviously there is a difference in what benefit they will get out of the content of the tweets. But another, more subtle difference, is that the more tweets they read, the more they train their mind to process information in very short snippets.

The third example is a book, a communication tool which became common about 500 years ago with the invention of the printing press. Again, a person can choose what books they read. One person reads Richard Dawkins, and another person reads Wayne Grudem. Obviously both people get different benefits out of the content that they read. But the technology of the printed book has also changed the way their mind processes information. After reading a few books, their mind will become adept at processing information in multiple chapters and pages, and understanding complex arguments. If reading tweets trains the mind to process information in short snippets, then reading a book does the opposite.

Redemption through technology

One section of the book that inspired me was the way technology is involved in redemption. Technology can be used by God and humans to temporarily overcome the effects of the fall. God uses it for his grand purposes of redemption, and it points to the Redeemer who will make all things new.

One example is the technology of written words. Prior to written words, people remembered by telling stories. Memory was important, and the person with the most memory had authority and wisdom. Written words have a different sort of authority. If someone reads aloud the written words, they obviously are not the authority. The authority comes from the person who originally recorded those words.

God used this technology when he recorded the law to his people on Mount Sinai. The recorded commandments indicated God’s authority, so that the person who read them (and obeyed them!) had to acknowledge their original source. What’s more, God used this technology right when it came out! Around the time of the Exodus, other civilizations were beginning to adopt alphabets as a way to write down their spoken languages.

A contemporary example of technology having a redemptive effect is medical technology. We are able to partially reverse the effects of the fall by aiding the body in the process of healing.

Conclusion

Another story from my years of growing up is my affinity for “how stuff works” books. My favorite by far is the one that featured cartoon woolly mammoths (The New Way Things Work). These whimsical mammoths would appear in the cutaway drawings of different gadgets and gizmos. For example, on the page about how the electric guitar works, there was a mammoth doing a tightrope walk down the guitar string.

Mammoths aside, the history of science in the last few hundred years is fascinating. This invention led to that one. That scientist made this discovery. However, the story of technology is more than a historical timeline of inventions. It is the story of technology influencing society and vice versa, and God influencing them both, directing them to his glorious purposes.

Add tracking code to static content

By using a combination of WordPress, Piwik, and the WP-Piwik plugin, I’m able to track analytics on all the pages of my blog. However, my site is more than my blog, and I wanted to track visits to my static pages, namely my portfolio and my Post Voting app. One idea is to paste in the tracking code and just have it be versioned like my other static content. I see two downsides to this:

The tracking code could change, and then I’d be updating it across all the pages, polluting the version history and causing a mess of confusion.
The tracker would track my own visits that happen as I do web development on the various pages.

The solution is to use Server Side Includes to include the tracking code. This addresses the two concerns above:

The tracking code is stored as a separate file, so it can be versioned independently.
The include can be conditional on whether this is a development or production server.

As a further feature, I wanted the tracking code source file to be hidden from anybody that tries to access it directly. (Not really essential, but it helps me learn about configuring .htaccess.)

Running into snags

I ran into two snags that helped me learn a lot more than I had originally intended. First, while trying to get the syntax of the conditional include right, I was reading the Apache documentation. However, I failed to realize that my development machine had Apache 2.2, and my web host has Apache 2.4. I was reading the documentation for Apache 2.4 ap_expr syntax. I was stumped as to why the code worked on my web host, but gave an Invalid expression error on my development machine. The solution was to create a new virtual machine using the same version of Apache as my web host. The lesson learned was to ensure that development and production environments are as close in configuration as possible.

The second snag happened when I restricted access to my tracking code piwik.html using .htaccess. I realized that this also restricted mod_include from including it! The solution came from reading the Apache 2.4 documentation for mod_rewrite. The NS flag prevents a rule from applying to an internal subrequest.

The solution

In the portfolio and postvoting folders, I renamed index.html to index.shtml. The following include code was added to each page:

<!--#if expr="%{SERVER_NAME} == 'bobbyratliff.nfshost.com'" -->
<!--#include virtual="piwik.html" -->
<!--#endif -->

You can view the current version of the Piwik tracking code on GitHub.

The .htaccess file has the following line added to it:

RewriteRule ^/?piwik\.html$ - [F,L,NS]

That’s all there is to it. Just use the static deploy method and your site will be updated.
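
One detail worth knowing about RewriteRule patterns: they are regular expressions, so an unescaped dot matches any single character, while an escaped dot matches only a literal dot. The sketch below illustrates the difference using JavaScript regular expressions, which treat the dot the same way for this purpose. The test paths are made up, and this is only an illustration of the pattern syntax, not of how Apache evaluates rewrite rules.

```javascript
// Unescaped dot: "." matches any single character.
const loosePattern = /^\/?piwik.html$/;
// Escaped dot: only a literal "." matches.
const strictPattern = /^\/?piwik\.html$/;

// Made-up request paths to compare the two patterns.
const candidates = ["piwik.html", "/piwik.html", "piwikxhtml", "other.html"];

const results = candidates.map((path) => ({
  path,
  loose: loosePattern.test(path),
  strict: strictPattern.test(path),
}));

for (const r of results) {
  console.log(r.path + " loose=" + r.loose + " strict=" + r.strict);
}
```

Both patterns match the real file name. Only the loose one also matches a name like piwikxhtml.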

Website maintenance and tweaks

Keeping WordPress up to date

I originally installed WordPress using Subversion. This provides a really easy way for me to update WordPress:

svn switch http://core.svn.wordpress.org/tags/3.4.1

Replace 3.4.1 with the latest version number. Then I visit the admin panel of the blog, and it redirects me to perform any necessary database upgrades.

Use .htaccess to prevent access to the .svn directory:

# Prevent access to .svn directory
# From http://codex.wordpress.org/Installing/Updating_WordPress_with_Subversion
RewriteEngine On
RewriteRule ^(.*/)?\.svn/ - [F,L]
ErrorDocument 403 "Access Forbidden"

Compression

I’ve learned that content can be categorized into static content and dynamic content. There is an Apache module, mod_deflate, that can compress both types of content seamlessly. It only requires a configuration change in .htaccess, and no changes will need to be made to the application. However, it is inefficient because it recompresses the same content every time someone requests it. For this reason, my web host does not support mod_deflate. Instead, they recommend different tactics for each type of content.

On my blog, an example of dynamic content is the home page, http://www.rratliff.com/. This test or this test can test whether the home page is compressed. At time of writing, I have not found a good way to compress dynamic content.

On my blog, there are several static content files that are typically requested, for example, the CSS and JS files that WordPress includes in every page. Google PageSpeed Insights is a tool that tests compression of every resource needed for loading my blog’s home page, both dynamic and static. I’m looking for a reliable way to compress the static content files that Google PageSpeed Insights finds.

Deploying static content

I now have two sections of static content on my website, my Post Voting App and my Portfolio. I’ve adopted a simple solution to keep these sections up to date. Each section is maintained in a GitHub repository. I have a matching repository on my own computer where I make changes, commit, and then push to the GitHub repository. Then, to update the content on my website, I SSH to the host, cd to the directory, and do a git pull.

I reuse the .htaccess code above in order to prevent access to the .git subdirectory. See the .htaccess file for an example.

Backups

I created two scripts to back up the files and the database in my NearlyFreeSpeech site. The scripts aren’t fancy; they just contain one command each.

For the database, I created a non-privileged backup user who has the permissions necessary to do a mysqldump on all the tables in my database. Here’s the gist of it. (It can be written as one line. The trailing backslashes continue the command across lines for display purposes.)

ssh user@host mysqldump --user=nonprivilegeduser \
    --password=password --host=mysql_host \
    --all-databases | gzip > backup-file-name-$(date +%Y%m%d).sql.gz

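Just change the placeholder parts (user, host, nonprivilegeduser, password, mysql_host, and backup-file-name). The $(date +%Y%m%d) command substitution adds the current date, creating a filename like this: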

backup-file-name-20131029.sql.gz
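
The same naming scheme can be sketched in JavaScript, for anyone scripting backups outside the shell. The backupFileName function is a hypothetical helper. It uses the local date, as the date command does by default.

```javascript
// Build a name like backup-file-name-20131029.sql.gz from a Date,
// mirroring the shell's $(date +%Y%m%d) in local time.
// backupFileName is a hypothetical helper for this sketch.
function backupFileName(prefix, date) {
  const year = String(date.getFullYear());
  const month = String(date.getMonth() + 1).padStart(2, "0"); // months are 0-based
  const day = String(date.getDate()).padStart(2, "0");
  return `${prefix}-${year}${month}${day}.sql.gz`;
}

// October 29, 2013 (month index 9), constructed in local time.
const exampleName = backupFileName("backup-file-name", new Date(2013, 9, 29));
console.log(exampleName);
```

Zero-padding the month and day keeps the names the same length, so they sort chronologically in a directory listing.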

For the files, I use rsync with the options -aAXE --delete.
