My Experience as a Contract Developer while being Employed

In the fall of 2016, I took on a contract web development project while I was actively working as a full-time W2 employee, and I want to share my experience through all of it. Whether you’re considering doing something like this yourself, or you’re a seasoned developer who’s already worked on many contract side projects – I hope this helps show you another perspective. This was my first big contract job that I’d been paid for on the side, and I learned a lot from it – especially that the lifestyle of combining full-time employment with outside work just wasn’t a good fit for me. But I’m getting ahead of myself; we’ll get to all that. For now, I just want to walk you through what the project was like: how it all got started, development, design updates, and my mentality through it all. Let’s get to it.

Phase 1: Project Setup

It was back in August of 2016 that I was contacted by my old boss at a previous employer to see if I’d be interested in building a web site for one of their clients. The project would be similar to every other project I did while with this previous employer, so I knew I already had the skills to take it on; in case you’re curious, the project was a fully custom-built site using Craft CMS. It would be about 80% HTML, CSS, and front-end JavaScript – and 20% configuration on the backend for the CMS. The thought of making a little extra money on the side excited me, and I trusted this client since I already had a relationship with them (I worked there for multiple years). So far, everything seemed pretty awesome: I knew how to do it, and I felt like I didn’t have to worry about any issues over actually getting paid. I checked with my current employer to verify that they were cool with it, as well as my girlfriend Layla (since I’d be spending evening and weekend hours on it) – and everybody encouraged me to take it on. Things were looking as perfect as could be.

Within a week, I had lunch with the client (my old boss) as well as the designer on the project, and I got the design comps for the website so I could start building things. After spending a couple hours setting up my work environment to run this CMS, I was ready to rock and roll.

I was told that they’d love to have a staging site up in about a month (mid-September), and were planning to take the site live the first week of October. So far, this project was continuing on its perfect streak, because I felt really good about hitting those deadlines. I won’t disclose my rate, but suffice it to say that I kept it low compared to what I’ve heard other developers charge – simply because this was my first contract job. I wasn’t too concerned about getting burned financially; if it happened, then it happened, but I felt fair about my estimated hours. I quoted 50 hours for the project in total: 40 hours of development, and 10 hours of updates based on the client’s needs. I charged a lump sum to be paid after the project was done, so if I came in under my quoted hours, then I would have technically been paid “extra,” but if I went over, then I wouldn’t get paid for that extra time.

Phase 2: Development

I gave myself 40 hours to develop the site and get it pushed up to staging, and everything honestly went really smoothly and methodically in this phase. I built a home page and styled it, built the navigation, tied all of that to the CMS, and then started working on sub page components. I spent about 2-3 hours every evening on this project for a little over a week, and I got a lot done pretty quickly. My favorite feeling was that while I was early in my hours (< 15), I truly felt like I could spend the time to architect the site and the CMS in a way that made me proud. I built accessibility features, semantic HTML, CSS that followed BEM and SMACSS, etc. I was really happy with it.

There were a few times that I needed to communicate with the client to ask for clarification on design, but that always went well. We communicated entirely through Slack (we were both a part of a local developer community Slack group). After I hit about 20 hours, things were looking good; I had a lot done, but still had a good chunk more to build. It was about this time that the project started slowly wearing on me. I started thinking, “I’m spending these evening hours working, when I could be enjoying them with my family, writing blog posts, playing video games, or just relaxing.” I kept pressing on, but I kept coming back to the thought that these hours I was spending on this contract project weren’t like normal hours I’d spend during a workday. They were prime hours – hours that I only get perhaps 4-5 of a day (time at home after work, before bed, and not spent at the gym). These are my most valuable hours of the day, and this was the only time of day that I could really do things for me. I really started thinking that this might not be how I wanted to spend my free time – or at the very least, I might want to do this very sparingly.

I want to reiterate that the project itself had nothing to do with how I was feeling. The project was 100% perfect, and I truly enjoyed what I was building as well as the tools I was using to build it. Everything was great from that standpoint. It was just the lifestyle that was getting to me.

Regardless of my thoughts, I barreled through the development phase of this project and got a staging site up right at 36 hours. I came in 4 hours under my quoted development time, which was awesome. That gave me 14 hours to handle any updates from the client – and from my experience being previously employed with this client, that was more than enough. Metaphorically, it seemed like I had just rounded 3rd base and was sprinting home! Or so I thought.

Phase 3: Updates from the Client

It was this phase that really broke my desire to ever do contract work again while being employed. I took about a week off after I got the staging site up, which was really nice. I enjoyed my free time relaxing during that week – but there was always this reminder that the project wasn’t done yet, which sat like a constant weight on my shoulders. After a few days, I got word from the client that they were really happy and had some updates (which I was expecting), and they shared them with me via a Google Doc. When I first saw the updates, there was much more than I was expecting – but nothing I couldn’t handle. I spent about 5-6 hours on that round of updates, which was perfectly fine because I was still under my quoted hours (at 42 out of 50 at that point).

After I finished those updates, I should have been done, right? Wrong. That was only the first wave of updates in what seemed to be an endless sea of updates for a good month. Within a few days, I got new updates to work on from the client, and when I’d finish those, I would get more soon after. Some updates were small, but some took several hours to work on. For about the whole month of September, I was pretty unhappy working on this project. It turned from something that I enjoyed building to something that I just wanted to be done. It was this constant cycle of thinking that these updates would be the last ones, and there always seemed to be more after that – and that right there wore me out more than anything.

It became this running meme in my household that whenever I’d plan to have some free time, I would always spend it making updates to this project. Literally all of my free time was eaten up. I’d occasionally think things like “I just got the Witcher 3 for my birthday, maybe I’ll play that?” – soon followed by the reminder that I still needed to work on this project. Layla was super supportive throughout this whole process and kept helping me to focus on the end goal – but I know that even she was ready for me to be done with this project.

Every update I made was within the scope of the project, so please don’t think that the client was being too demanding. My job was to give them a perfect site, and I absolutely wanted to follow through on that. After everything was said and done, I spent over 26 hours on updates. I had only planned to spend 10 hours in this phase initially – and I even thought that was a lot. I’d never spent this amount of time on client updates before – but every project has the potential to be different, and this one certainly was in that regard.

Phase 4: Done!

In early to mid-October (about 2 months after I got the contract), the site finally went live, and there really weren’t any updates after that. I finally started feeling relieved, like I could truly start spending my free time the way I used to.

All in all, I spent 62.5 hours on this project, and I had quoted a price for 50 hours. That means I spent 25% more time on this project than I was being paid for, so I politely explained the situation to the client and asked if I could increase my payment by a set amount to get closer to the time I actually spent on the project.

The client was really awesome about my request, and granted me the extra money. This was absolutely something they didn’t have to do – and I was aware of that, since we had an agreement before I started – and I was super thankful about it. I’m glad I asked, and if you’re in a similar situation, then I encourage you to do the same. The worst that can happen is that you’re told “No.”

I soon sent an invoice to the client, and they paid me within a month. This project now had officially been put to rest; I was happy, my client was happy, and their client was happy (the one who the site was actually built for).

My Overall Thoughts

I want to make it absolutely clear that this project was as perfect as it could possibly get as far as contract work goes. My initial quoted price was immediately accepted (and increased when I asked!); I used to work with this client, and thus trusted them to not swindle me out of getting paid, or treat me poorly; I was very confident in my ability to perform this contract, and there really weren’t any unknowns; the deadline was a ways away, so I had more than enough time to build the project; and lastly, it truly was a fun site to build.

It sounds silly to say, but this experience was seriously valuable to me because I know now without a doubt that I hate contract work. I do a pretty good job of leaving my full-time job stress at work, and am able to really enjoy my free time otherwise – but something about contract work just continuously stressed me out during the 2-month period that I was working on it. That’s definitely a personality trait of mine – and I completely understand that everyone’s different in this way. I just wasn’t very good with time management, I suppose. I think it’s because with my full-time job, I know that I’ll be there 40 hours every single week to get done what I need to, and that’s a lot of time. With contract work, I really didn’t have any clue as to how much time I’d have to dedicate to it during any given week – and I think that stressed me out a lot. I would often wait until Layla fell asleep to work on it, because I hated the idea of taking away time that we could be spending together – but that meant that I’d be working late into the night when I’d rather be sleeping, and normally I’d only work a couple hours a night. You can only do so much in that small time frame, and the next day I’d be back to stressing out about when I’d have time to work on it again. I really struggled with that. I’m sure a seasoned pro at this would dedicate set times every week to work on contract projects – but I just had no idea what times I’d consistently be free.

Here’s my advice to you: if you’re thinking of doing contract work while being fully employed, be 100% transparent about it. Inform your current employer, your family, your friends – anyone important in your life. It’s much better for all of your relationships if you’re open about it. If your employer isn’t cool with it and you can’t convince them otherwise, then it’s best to just say no to contract work rather than go behind their back. Trust me. That same advice goes for your family too. If it’s going to cause any problems at all, it’s not worth it.

I hope my experience working as a contract developer while being employed as a full-time developer helped you in some way. It only took one contract job to show me that this isn’t the lifestyle I want to lead – but if you’re able to handle it just fine, then I encourage you to keep it up! Everyone’s different, and that’s something I really learned through all this. I found that I truly value my spare time much more than any hourly rate could provide me, and thus I decided to hang up my contract developer hat.

Tax Season Update: After filing 2016 taxes, nearly half of the total amount of money I made from this project was taken away. Yikes! I was expecting 30-35%, sure – but I was unaware of the additional self-employment taxes that added to it. Definitely something to keep in mind if you’re planning to do contract work!


P.S. If you made it this far, then you deserve to see the actual project I’m talking about in this post. I hope you enjoy it! I’m really proud of it.

Presenting: Guernsey

Design Patterns: Dependency Injection

If you’re a developer, you may have heard the phrase dependency injection (DI) before as a possible design pattern you can use. It’s been around for a long time, and many popular frameworks such as AngularJS use it by default. In standard code, it’s common to declare a dependency in the same lexical scope where you actually plan to use that dependency. Nothing sounds crazy about that, right? DI flips this on its head – and for good reason too. The core concept of DI is to invert the control of managing dependencies so that instead of the client (i.e. the scope where the code actually exists) having to manage its own dependencies, you delegate this responsibility to the code which actually calls your client, typically passing in dependencies as arguments to that client. This is where the name “dependency injection” comes from – you inject the dependencies into your client code during the execution of that code.

If you’re familiar with DI – then you haven’t learned anything new yet, but if this is your first go at understanding this design pattern, then surely you have some red flags popping up right now. This just seems to convolute how I would write my code, why would I do this? What are the benefits of DI? Is it difficult to implement? We’ll get to all of this. Keep following along.

Benefits of DI

Applications built with DI boast a fair number of benefits – and while there’s more than this, here’s a list of some of my favorites:

Loose coupling.

With DI, your code is by default more loosely coupled which makes it easier to “plug-and-play” throughout your application; for example, you don’t have to worry about using a dependency that was potentially declared in an external scope compared to where you’re actually using it. All your code needs to worry about is what it actually does – and not about what exists around it.

Taking loose coupling even further, DI is very functional in nature too, in the sense that it helps your functions stay pure. Including dependencies from outside of the immediate scope means that the state of your client code could change at any given time – and while using DI doesn’t force you to write pure functions, it helps guide you down that path more than other design patterns do.

Testing is very simple.

Imagine you want to test a function which makes a request to a third-party JSON API, and you need certain data to return from that service in order for it to execute properly. This is very difficult to test because not only do external HTTP requests take a significant amount of time compared to the rest of your test’s execution, it’s also most likely not feasible or reliable for you to be making HTTP requests during testing. What if the third-party service goes down? What if you have a request quota? What if the service takes a few seconds to respond? There are a ton of reasons why this might be an issue.

With DI, you would pass in this particular request library as an argument to your client code – but since you’re passing it in from your test code, it’s very simple for you to build a mock of this request library that simulates real behavior; instead of making an HTTP request, it could just immediately respond with test data that you would expect to get back as a response, and then continue on executing the rest of your client code in your test.

Here’s an example of how this library might be used with DI (and JavaScript’s new async/await keywords):
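A minimal sketch of what that could look like – the requestLib parameter and getUser function are hypothetical names, and the injected library is assumed to expose an axios-style get method:

```javascript
// The request library is injected as an argument instead of being imported
// inside the function, so getUser never cares which library it's handed.
async function getUser(requestLib, userId) {
  const response = await requestLib.get(`https://api.example.com/users/${userId}`);
  return response.data;
}

// In application code, the caller injects the real library, e.g.:
// const user = await getUser(axios, 42);
```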

And here’s a simple unit test we could write for this function:
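Here’s a rough sketch using Jest-style syntax; the important part is that the mock we inject never touches the network:

```javascript
// We inject a hand-rolled mock of the request library that immediately
// resolves with canned data -- no HTTP request ever happens.
test('getUser returns data from the injected request library', async () => {
  const mockRequestLib = {
    get: async () => ({ data: { id: 42, name: 'Ada' } }),
  };

  const user = await getUser(mockRequestLib, 42);

  expect(user).toEqual({ id: 42, name: 'Ada' });
});
```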

Single source of declaration.

You don’t need to require the same files multiple times in a project – with DI, you only have to do this once. Requiring a file multiple times could needlessly increase the total size of your application – and even though most languages guard against this so that the same file is only pulled in once, it’s still cleaner and easier to debug when that requirement lives in just one spot.

Implementing DI

You can implement DI in a number of different ways, but there are 3 simple patterns of doing so if you’re using a class-based object-oriented language: the constructor, setter, and interface patterns. All of them revolve around the concept of setting each dependency as an instance variable on an object so that you can access them just about anywhere.

Here’s a simple example of code without DI:
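Something along these lines – the factory module path here is a made-up placeholder:

```javascript
// "factory" lives in the outer scope of this file, so SomeClass is tightly
// coupled to it -- there's no way to swap it out from the outside.
const factory = require('./factory');

class SomeClass {
  constructor() {
    this.data = factory.getObject();
  }
}
```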

Here, factory is a dependency defined in the external lexical scope of this file. This is nice and simple – but what if you want to build a unit test, and factory.getObject is a very hard function to handle during your test? This is where DI really shines, and here’s a simple way you can transform this example to use it:
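A sketch of the same class using constructor injection:

```javascript
// The factory is now injected through the constructor and stored as an
// instance variable, so SomeClass no longer cares where it came from.
class SomeClass {
  constructor(factory) {
    this.factory = factory;
    this.data = this.factory.getObject();
  }
}

// The caller decides what to inject -- the real factory in production,
// or a tiny mock in a unit test:
// const instance = new SomeClass(realFactory);
```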

Here, we pass in a dependency and set it equal to an instance variable – and now we can use this dependency anywhere we see fit with this property. We’ve transformed the SomeClass constructor into a pure function which solely depends on the arguments passed in when it’s called. That, my friend, is loose coupling.

Using an IoC Container

DI is a wonderful concept and is rather easy to implement on a small scale, but it can quickly get messy if you start needing to inject dependencies all over the place in various files. This is where using an IoC (inversion of control) container – also known as a DI container – comes into play. The purpose of an IoC container is to handle setting up all the necessary dependencies so that you don’t have to duplicate convoluted instantiation code across your project; the IoC container is the only place you would write that.

Imagine code that looks like this:
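Something like the following sketch, where every class name is a hypothetical stand-in:

```javascript
// Hypothetical dependencies, stubbed out so the sketch stands on its own.
class Logger { constructor(opts) { this.level = opts.level; } }
class ApiClient { constructor(baseUrl, logger) { this.baseUrl = baseUrl; this.logger = logger; } }
class MemoryCache { constructor(opts) { this.maxEntries = opts.maxEntries; } }
class FooService {
  constructor(apiClient, cache, logger) {
    this.apiClient = apiClient;
    this.cache = cache;
    this.logger = logger;
  }
}

// The messy part: every place that needs a FooService has to repeat this wiring.
const logger = new Logger({ level: 'info' });
const apiClient = new ApiClient('https://api.example.com', logger);
const cache = new MemoryCache({ maxEntries: 1000 });
const fooService = new FooService(apiClient, cache, logger);
```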

There’s nothing logically wrong here – we’re following proper DI principles – but it’s still very messy. The real danger here is that if we wanted to ever instantiate an object of class FooService again, then we would need to duplicate all of this code, and that seems like a code smell.

Now imagine we’re using an IoC container. Our code could potentially look like this:
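Here’s a toy container to illustrate the idea, reusing the hypothetical classes from the previous sketch (real libraries such as InversifyJS or Awilix are far more featureful, but the shape is the same):

```javascript
// Each dependency is registered once; the container knows how to build a
// FooService, so nobody else has to.
const container = {
  registrations: new Map(),
  register(name, factoryFn) { this.registrations.set(name, factoryFn); },
  resolve(name) { return this.registrations.get(name)(this); },
};

container.register('logger', () => new Logger({ level: 'info' }));
container.register('apiClient', (c) => new ApiClient('https://api.example.com', c.resolve('logger')));
container.register('cache', () => new MemoryCache({ maxEntries: 1000 }));
container.register('fooService', (c) =>
  new FooService(c.resolve('apiClient'), c.resolve('cache'), c.resolve('logger')));

// Anywhere in the app, one line replaces all of the manual wiring above:
const fooService = container.resolve('fooService');
```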

Here, we haven’t lost any of our logic – we’ve just delegated the instantiation of a FooService object to our IoC container, which handles creating this object just like our code before did; our benefit now is just that if we need to duplicate this behavior across our project, we just delegate that responsibility to our IoC container instead of our client code. Our IoC container becomes the single source for handling all of our dependencies – and that’s pretty nice.

Detriments of DI

While we’ve shown the benefits so far, DI isn’t without its faults. Here are a couple of valid reasons that might make DI less appealing, depending on your situation.

More difficult to trace.

When you’re debugging code that’s using DI, if the error stems from a dependency, then you may need to follow your stack trace a little bit further to see where the error actually occurs. Because dependencies no longer exist in the same file and/or class as where your logic is happening, you need to know exactly what called the code in question to understand where the problem may lie.

On top of this, learning these types of traversal concepts may be more difficult for developers who are just joining a project for the first time.

More upfront development.

In almost all cases, building a project with the DI pattern will take more upfront development time than a traditional project. Most of this has to do with understanding how your project’s architecture should work, what constitutes a dependency, and potentially building an IoC container.

In the long run, however, DI could save you a lot of development time and headaches as you begin to add on more components to your project and also need to test those components.

Final Thoughts

DI is a nice design pattern and it’s helped me tremendously in the applications where I’ve used it. For the most part, my favorite use case for DI is how simple it is to test every component of your project. If there’s a third-party dependency that makes it difficult to test the rest of my logic, then I can easily mock that dependency and stub out any functionality it has.

But – it’s more complex than non-DI code, and that may be a turn off for many developers out there. Whether you decide to implement DI into some of your projects is always your decision – but if you want my opinion, give it a shot sometime. If it works out – great, you’ve found a nice design pattern you can really start using; if not, then at least you still hopefully learned something in the process!

Building a JSON API with Rails – Part 6: The JSON API Spec, Pagination, and Versioning

Throughout this series so far, we’ve built a really solid JSON API that handles serialization and authentication – two core concepts that any serious API will need. With everything we’ve learned, you could easily build a stable API that accomplishes everything you need for phase 1 of your project – but if you’re building an API that’s gonna be consumed by a large number of platforms and/or by a complex front-end, then you’ll probably run into some road blocks before too long. You might have questions like “what’s the best strategy to serialize data?,” or “how about pagination or versioning – should I be concerned that I haven’t implemented any of that yet?” Those are all good questions that we’re going to address in this post – so keep following along!

The JSON API Spec

Active Model Serializers – my go-to Rails serialization gem of choice – makes it so simple to control what data your API returns in the body (check out my post on Rails API serialization to learn more about this topic). By default, however, there’s very little structure as to how your data is returned – and that’s on purpose; AMS isn’t meant to be opinionated – it just grants you, the developer, the power to manipulate what your Rails API is returning. This sounds pretty awesome, but when you start needing to serialize several resources, you might start wanting to follow a common JSON response format to give your API a little more structure as well as making documentation easier.

You can always create your own API response structure that fits your project’s needs – but then you’d have to go through and document why things are the way they are so that other developers can use the API and/or develop on it. This isn’t terrible – but it’s a pain that can easily be avoided because this need has already been addressed via the JSON API Spec.

The JSON API spec is a best-practice specification for building JSON APIs, and as of right now, it’s definitely the most commonly-used and most-documented format for how you should return data from your API. It was started in 2013 by Yehuda Katz (former core Rails team member) as he was continuing to help build Ember.js, and it officially hit a stable 1.0 release in May of 2015.

If you take a look at the actual spec, you’ll notice that it’s pretty in-depth and might look difficult to implement just right. Luckily, AMS has got our back by making it stupid-simple to abide by the JSON API spec. AMS determines JSON structure based on an adapter, and by default, it uses what’s called the “attributes adapter.” This is the simplest adapter and puts your raw data as high up in the JSON hierarchy as it can, without thinking about any sort of structure other than what you have set in the serializer file. For a simple API, this works; but for a complex API, we should use the JSON API spec.

To get AMS to use the JSON API spec, we literally have to add one line of code, and then we’ll automatically be blessed with some super sweet auto-formatting. You just need to create an initializer, add the following line, and restart your server:
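With AMS 0.10.x, the initializer is roughly this one-liner:

```ruby
# config/initializers/active_model_serializers.rb
ActiveModelSerializers.config.adapter = :json_api
```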

Let’s do a quick show-and-tell, in case you want to see it in action before you try it. Assuming we have the following serializer for a post:
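A bare-bones serializer, just as a stand-in for this example:

```ruby
# app/serializers/post_serializer.rb
class PostSerializer < ActiveModel::Serializer
  attributes :id, :title, :body
end
```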

Then our response will go from this:
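With the default attributes adapter, a single post renders as a flat object (sample data, of course):

```json
{
  "id": 1,
  "title": "My First Post",
  "body": "Hello, world!"
}
```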

to this!
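Under the JSON API adapter, the same record gets wrapped in the spec’s data/type/attributes structure:

```json
{
  "data": {
    "id": "1",
    "type": "posts",
    "attributes": {
      "title": "My First Post",
      "body": "Hello, world!"
    }
  }
}
```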

The JSON API spec also sets a precedent for how paginated resource queries should be structured in the url – which we’re getting to next!

Pagination

Pagination prevents a JSON response from returning every single record in a resource’s response all at once, and instead allows the client to request a filtered response that it can continue querying on as it needs more data. Pagination is one of those things where every project seems to do it differently; there’s very little standard across the board – but there is in fact a best practice way to do it in a JSON API. A paginated resource on the server should always at a minimum tell the client the total number of records that exist, the number of records returned in the current request, and the current page number of data returned. Better paginated resources will also create and return the paginated links that the client can use (i.e. first page, last page, previous page, next page), but they tend to do that in the response body – and that’s not good. The reason this is frowned upon is because while dumping pagination links in the response body may be easy, it really has nothing to do with the actual JSON payload that the client is requesting. Is it valuable information? Certainly – but it’s not raw data. It’s meta-data – and RFC 5988 created a perfect place to put such paginated links: the HTTP Link header.

Here’s an example of a link header:
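Something along these lines – it’s a single header (wrapped here for readability), and the URLs are placeholders:

```
Link: <https://api.example.com/posts?page=1&per_page=25>; rel="first",
      <https://api.example.com/posts?page=2&per_page=25>; rel="prev",
      <https://api.example.com/posts?page=4&per_page=25>; rel="next",
      <https://api.example.com/posts?page=10&per_page=25>; rel="last"
```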

That might seem like a large HTTP header – but it’s blatantly obvious what’s going on, and we’re keeping our response body clean in the process. Now, just like with the JSON API spec, you might be asking if you have to manually add these links in when returning any paginated response – and the answer is no! There are gems out there that do this automatically for you while following best practices! Let’s get into the code.

To start with, we’ll need to use one of the two most popular pagination libraries in Rails: will_paginate or kaminari. It literally doesn’t matter which we pick, and here’s why: both libraries take care of pagination – but they’re really geared towards paginating the older styles of Rails apps that also return server-side rendered HTML views, instead of JSON. On top of that, neither of them follow the best practice of returning paginated links in the Link header. So, are we out of luck? No! There’s a wonderful gem that sits on top of either of these gems called api-pagination that takes care of what we need. Api-pagination doesn’t try to reinvent the wheel and create another implementation of pagination; instead, it uses either will_paginate or kaminari to do the actual logic behind pagination, and then it just automatically sets the Link header (as well as making the code changes that you as the developer have to make much, much simpler).

We’ll use will_paginate with api-pagination in this example. For starters, add this to your Gemfile:
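The two gems go in side by side:

```ruby
# Gemfile
gem 'will_paginate'
gem 'api-pagination'
```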

Next, install them and restart your server:
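From the project root:

```bash
bundle install
rails server
```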

Let’s update our Post controller to add in pagination. Just like with the JSON API spec above, we only have to make a single line change. Update the post_controller’s index action from this:
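A typical index action before the change might look something like this (assuming a conventional PostsController):

```ruby
# app/controllers/posts_controller.rb
def index
  render json: Post.all
end
```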

to this:
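With api-pagination, paginate takes the place of render:

```ruby
# app/controllers/posts_controller.rb
def index
  paginate json: Post.all
end
```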

Do you see what we did? We just removed the render function call and instead added the paginate function call that api-pagination gives us. That’s literally it! Now if you query the following route, then you’ll receive a paginated response:
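For example, using will_paginate’s default parameter names:

```
GET /posts?page=2&per_page=25
```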

Bonus

You’ll notice that after all my babbling about putting paginated links in the HTTP header instead of the response body, they still managed to find themselves in the response body! This is a neat feature of AMS if you’re using the JSON API adapter; it will recognize if you’re using either will_paginate or kaminari, and will automatically build the right pagination links and set them in the response body. While it’s not a best practice to do this – I’m not too worried about removing them because we’re still setting the HTTP Link header. We’re sort of in this transition period where many APIs are still placing paginated links in the response body – and if the AMS gem wants to place them in there while requiring no effort from the developer, then be my guest. It may help ease the burden of having new clients transition to parsing the Link header.

Now, here’s a little caveat. The JSON API spec has a preferred way of querying paginated resources, and it uses the page query object to do so, like in this example:
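The same request in the JSON API spec’s page[] style looks like this:

```
GET /posts?page[number]=2&page[size]=25
```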

This query is identical to our query above; we just swapped out per_page for page[size], and page for page[number]. By default, the links that AMS creates follow this new pattern, but api-pagination by default doesn’t know how to parse that. Don’t worry though, it’s as easy as just adding a simple initializer to allow api-pagination to handle both methods of querying for paginated resources:
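Here’s a sketch of that initializer, assuming your version of api-pagination exposes the page_param / per_page_param configuration blocks described in its README – double-check the exact API against the version you’re running:

```ruby
# config/initializers/api_pagination.rb
ApiPagination.configure do |config|
  # Accept both ?page[number]= and plain ?page=
  config.page_param do |params|
    params[:page].respond_to?(:key?) ? params[:page][:number] : params[:page]
  end

  # Accept both ?page[size]= and plain ?per_page=
  config.per_page_param do |params|
    params[:page].respond_to?(:key?) ? params[:page][:size] : params[:per_page]
  end
end
```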

And voilà – add this initializer, restart your server, and now your API can handle paginated query params passed in as either page/per_page or page[number]/page[size]!

Versioning

The last best practice topic we’ll be covering here is how to properly version your API. The concept of versioning an API becomes important when you need to make non-backwards-compatible changes; ideally, an API will be used by various client applications – and it’s infeasible to update them all at the same time, which is why your API needs to be able to support multiple versions simultaneously. Because you don’t really need a solid versioning system early on in the development phase, this is an often-overlooked topic – but I really implore you to start thinking about it early because it becomes increasingly more difficult to implement down the road. Spend the mental effort now on a plan to version your API, and save yourself a good deal of technical debt later.

Now that I’ve got my soap box out of the way, let’s get down to the best practices of implementing a versioning system. If you Google around, you’ll find that there are two predominant methodologies to how you can go about it:

  • Version in your URLs (e.g. /v1/posts)
  • Version via the HTTP Accept header

Versioning through your URLs is the easier of the two to understand, and it’s got a big benefit: it’s much easier to test. I can send you a link to a v1 path as well as a v2 path – and you can check them both out instantaneously. The drawback, however – and the reason this way isn’t a best practice – is that the path in your URL should be completely representative of the resource you’re requesting (think /posts, /users/1, etc.), and which version of the API you’re using doesn’t really fit into that. It’s important – sure – but there’s a better place to put that information: the HTTP Accept header.

The Accept header specifies which media types (aka MIME types) are acceptable for the response; this is a perfect use-case for specifying which version of the API you want to hit, because responses from that version are the only ones that you’ll accept!

For our demo, we’re going to specify the version in a custom media type that looks like this:
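For example, with a made-up vendor name of myapi:

```
Accept: application/vnd.myapi.v1+json
```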

Here, you can easily see how we set the version to v1 (If you’d like to know how we got this format of media type, check out how MIME vendor trees work). If we want to query v2, then we’ll just swap out the last part of that media type.

Let’s get to some implementation. We won’t need any new gems, but there are a couple of things we do need to do first:

  • Move all of the files in our app/controllers directory into a v1 directory. So the full path of our controllers would then be app/controllers/v1.
  • Move all of the code in our controllers into a V1 module (shown in the sketch after this list).

  • Wrap all of our routes in a scope function call, and utilize an instantiated object from a new ApiConstraints class that we’ll add in (this will filter our routes based on the Accept header).
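Here’s a rough sketch of both of those changes – the controller and resource names are placeholders, and the vendor media type matches the vnd.myapi example above:

```ruby
# app/controllers/v1/posts_controller.rb
module V1
  class PostsController < ApplicationController
    # ...same actions as before, just wrapped in the V1 module...
  end
end
```

```ruby
# config/routes.rb
require 'api_constraints' # adjust the require if lib/ isn't on your load path

Rails.application.routes.draw do
  # Routes in this scope answer v1 requests, and act as the default
  # when no version is specified in the Accept header.
  scope module: :v1, constraints: ApiConstraints.new(version: 1, default: true) do
    resources :posts
  end
end
```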

We still need to add in the code for our ApiConstraints class, but you can kind of see what’s going on here. We’re specifying that this set of routes will specifically handle any v1 calls – as well as being the default routes, in case a version isn’t specified.

The constraints option in the scope function is powerful and it works in a very specific way: it accepts any sort of object that can respond to a method called matches?, which it uses to determine if the constraint passes and allows access to those routes. Now for the last step; let’s add the logic for ApiConstraints. To do this, we’re going to add a file in the /lib directory called api_constraints.rb:
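A sketch of that class, following the RailsCast-style approach mentioned below; the vendor name again matches the vnd.myapi placeholder:

```ruby
# lib/api_constraints.rb
class ApiConstraints
  def initialize(options)
    @version = options[:version]
    @default = options[:default]
  end

  # Rails hands us the request; accept the route if this constraint is the
  # default, or if the Accept header names our versioned media type.
  def matches?(req)
    @default || req.headers['Accept'].to_s.include?("application/vnd.myapi.v#{@version}+json")
  end
end
```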

You can see here that all this class does is handle the matches? method. In a nutshell, it parses the Accept header to see if the version matches the one you passed in – or it will just return true if the default option was set.

If you liked this neat little constraint – then I’m glad, but I take zero credit for this logic. Ryan Bates did a really great RailsCast over versioning an API a few years ago, and this is by-the-books his recommendation about how to parse the Accept header.

You’re now all set up with the best practice of specifying an API version via the Accept header! When you need to add a new version, you’ll create new controllers inside of a version directory, as well as add new routes that are wrapped in a versioned constraint. You don’t need to version models.

Final Thoughts

We covered a lot, but I hope it wasn’t too exhausting. If there’s one common goal towards building a best-practice JSON API, it’s to use HTTP as it’s meant to be used. It’s easy to dump everything in your response body in an unorganized manner – but we can do better than that. Just do your best to follow RESTful practices, and if you have any questions about what you’re doing, then don’t be afraid to look it up; the Internet will quickly guide you down the right path.

Programming Concepts: Garbage Collection

Continuing on in this series, today we’re going to talk about garbage collection (GC) – what it is, how it works, and what some of the algorithms behind it are. Let me just say now that there are people way smarter than me who can give you nitty-gritty details about how specific languages implement GC, what libraries alter it from the norm, etc. What I’m trying to accomplish here is to give you a bird’s eye view of this whole facet of development in the hopes that you learn something you didn’t know before – and if it genuinely interests you, then I hope you continue Googling to find those posts which dig a mile deep into specific GC implementations. Here, we’ll stick to a few feet deep – so let’s start digging.

What is Garbage Collection?

At its core, GC is a process of automated memory management so that you as a developer have one less thing to worry about. When you allocate memory – like by creating a variable – that memory is allocated to either the stack or the heap (check out my post on the stack vs. the heap if you want to learn more about these two). You allocate to the stack when you’re defining things in a local scope where you know exactly the memory block size you need, such as primitive data types, arrays of a set size, etc. The stack is a self-managing memory store that you don’t have to worry about – it’s super fast at allocating and clearing memory all by itself. For other memory allocations, such as objects, buffers, strings, or global variables, you allocate to the heap.

Compared to the stack, the heap is not self-managing. Memory allocated to the heap will sit there throughout the duration of the program and can change state at any point in time as you manually allocate/deallocate to it. The garbage collector is a tool that removes the burden of manually managing the heap. Most modern languages such as Java, the .NET framework, Python, Ruby, Go, etc. are all garbage collected languages; C and C++, however, are not – and in languages such as these, manual memory management by the developer is an extremely important concern.

Why Do We Need It?

GC helps save the developer from several memory-related issues – the foremost being memory leaks. As you allocate more and more memory to the heap, if the program doesn’t consistently release that memory as it becomes unneeded, memory usage will keep climbing – eventually exhausting the heap. Even if heap memory is diligently managed by the developer, all it takes is one variable that’s consistently left undeleted to result in a memory leak, which is bad.

Even if there are no memory leaks, what happens if you are attempting to reference a memory location which has already been deleted or reallocated? This is called a dangling pointer; the best case scenario here is that you would get back gibberish, and hopefully throw or cause a validation error soon after when that variable is used – but there’s nothing stopping that memory location from being overwritten with new data which could respond with seemingly valid (but logically incorrect) data. You’d have no idea what would be going on, and it’s these types of errors – memory errors – that are often times the most difficult to debug.

That’s why we need GC. It helps with all of this. It’s not perfect – it does use up extra resources on your machine to work and it’s normally not as efficient as proper manual memory management – but the problems it saves you from make it more than worth it.

How and When does the Garbage Collector Run?

This depends entirely on the algorithm used for GC. There isn’t one hard and fast way to do it, and just like compilers and interpreters, GC mechanisms get better over time. Sometimes the garbage collector will run at pre-determined time intervals, and sometimes it waits for certain conditions to arise before it will run. The garbage collector will just about always run on a separate thread in tandem with your program – and depending on the language’s implementation of GC, it can either stall your program (i.e. stop-the-world GC) to sweep out all the garbage at once, run incrementally to remove small batches, or run concurrently with your program.

It’s difficult to get deeper than this without getting into specific languages’ implementations of GC, so let’s move onto the common GC algorithms.

Garbage Collection Algorithms

There’s a bunch of different GC algorithms out there – but here are some of the most common ones you’ll come across. It’s interesting to note how many of these common algorithms build on one another.

Reference Counting

Reference counting is perhaps the most basic form of GC, and the easiest to implement on your own. The way it works is that anytime you reference a memory location on the heap, a counter for that particular memory location increments by 1. Every time a reference to that location is deleted, the counter decrements by 1. When that counter gets to 0, then that particular memory location is garbage collected.
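To make the bookkeeping concrete, here’s a toy sketch in JavaScript with made-up names – real collectors do this down at the allocator/runtime level, not in user code:

```javascript
// Toy reference counting: retain() bumps the count, release() drops it,
// and the resource is "freed" the moment the count hits zero.
class RefCounted {
  constructor(resource, onFree) {
    this.resource = resource;
    this.onFree = onFree;
    this.count = 1; // the creator holds the first reference
  }

  retain() {
    this.count += 1;
    return this.resource;
  }

  release() {
    this.count -= 1;
    if (this.count === 0) {
      this.onFree(this.resource); // nothing references it anymore
    }
  }
}

const buffer = new RefCounted({ bytes: 1024 }, () => console.log('freed'));
buffer.retain();  // count: 2
buffer.release(); // count: 1
buffer.release(); // count: 0 -> "freed"
```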

One of the big benefits of GC by reference counting is that it can immediately tell if there is garbage (when a counter hits 0). However, there are some major problems with reference counting; circular references just flat out can’t be garbage collected – meaning that if object A has a reference to object B, and object B has a reference back to object A, then neither of these objects can ever be garbage collected according to reference counting. On top of this, reference counting is very inefficient because of the constant writes to the counters for each memory location.

Because of these problems, other algorithms (or at least refined versions of reference counting) are more commonly used in modern GC.

Mark-Sweep

Mark-sweep – as well as just about all modern GC algorithms other than reference counting – is a form of a tracing GC algorithm, which involves tracing which objects are reachable from one or multiple “roots” in order to find unreachable (and thus unused) memory locations. Unlike reference counting, this form of GC is not constantly checking and it can theoretically run at any point in time.

The most basic form of mark-sweep is the naïve mark-sweep; it works by using a special bit on each allocated memory block that’s reserved specifically for GC, and running through all memory currently allocated on the heap twice: the first pass to mark the memory locations that are still reachable (live) by setting that special bit, and the second pass to sweep (i.e. deallocate) the locations that were left unmarked – the garbage.

Mark-sweep is more efficient than reference counting because it doesn’t need to keep track of counters; it also solves the issue of not being able to remove circularly referenced memory locations. However, naïve mark-sweep is a prime example of stop-the-world GC because the entire program must be suspended while it’s running (non-naïve tracing algorithms can run incrementally or concurrently). Because tracing GC can happen at any point in time, you don’t ever have a good idea of when one of these stalls will happen. Heap memory is also iterated over twice – which slows down your program even more. On top of that, in mark-sweep there’s no handling of fragmented memory; to give you a visual representation of this, imagine drawing a full grid representing all of your heap memory – mark-sweep GC would make that grid look like a very bad game of Tetris. This fragmentation almost always leads to less efficient allocation of memory on the heap. So – we continue to optimize our algorithms.

Mark-Compact

Mark-compact algorithms take the logic from mark-sweep and add on at least one more iteration over the marked memory region in an effort to compact those blocks – thus defragmenting them. This addresses the fragmentation caused by mark-sweep, which leads to significantly more efficient future allocations via the use of a “bump” allocator (similar to how a stack works), but adds extra time and processing while GC is running because of the extra iteration(s).

Copying

Copying (also known as Cheney’s algorithm) is similar in spirit to mark-compact, but instead of iterating potentially multiple times over a single memory region, you just copy the “surviving” memory blocks of the region into an entirely new, empty region after the mark phase – which compacts them by default. After the copying is completed, the old memory region is deallocated, and all existing references to surviving memory point to the new memory region. This relieves the GC of a lot of processing, and brings the cost down to something even quicker than mark-sweep, since the separate sweep phase is eliminated.

While you’ve increased speed though, you now have an extra requirement of needing an entirely available region of memory that is at least as large as the size of all surviving memory blocks. Additionally, if most of your initial memory region includes surviving memory, then you’ll be copying a lot of data – which is inefficient. This is where GC tuning becomes important.

Generational

Generational GC takes concepts from copying algorithms, but instead of copying all surviving members to a new memory region, it instead splits up memory into generational regions based on how old the memory is. The rationale behind generational GC is that normally, young memory is garbage collected much more frequently than older memory – so therefore the younger memory region is scanned to check for unreferenced memory much more frequently than older memory regions. If done properly, this saves both time and CPU processing because the goal is to scan only the necessary memory.

Older memory regions are certainly still scanned – but not as often as younger memory regions. If a block of memory in a younger memory region continues to survive, then it can be promoted to an older memory region and will be scanned less often.

Final Thoughts

GC isn’t the easiest topic to fully understand, and it’s something that you really don’t even need to understand when developing with modern languages – but just because you don’t need to know it doesn’t give you a good excuse for not learning about it. While it doesn’t affect much of the code you write, it’s an integral part of every language implementation, and the algorithm behind an implementation’s garbage collector is often times a large reason why people tend to like or dislike certain implementations. If you stuck with me this far, then I’m glad – and I hope you learned something. If this interested you, I encourage you to continue looking into GC – and here’s a fun resource you can start off with that shows you some animated GIFs of how different GC algorithms visually work.

Interestingly, while researching this topic, the vast majority of posts I came across talk about how GC works specifically to the main implementation of Java. GC certainly isn’t exclusive to Java, but I imagine the reason for this is because Java is often times heavily compared to C++ which isn’t garbage collected. Hopefully over time, more posts will become popular over how GC works in other languages – but for now, we’ll take what we can get!

What Meta Tags Your Site Should be Using

Whenever you’re building a new site, you probably pay more attention to the HTML that’s in the <body> tag (i.e. the actual content) than what’s in the <head> tag – and that’s a good thing! If your page doesn’t have rich, valuable content – then it probably shouldn’t be there, but that doesn’t mean that you should put everything else on the backburner. There are tons of tags you should be placing within the <head> tag that can make your site more valuable and accessible, and help showcase it on social media platforms before people even click on links to your site. In this post, we’re going to go through which tags you should absolutely be placing in the <head> tag of your site if you want to get the maximum exposure and shareability possible. All of these are <meta> tags – with the exception of one – and the majority of them are related to how links to your site will render when shared on various social media platforms. I’m gonna group these into a few different categories:

  • General
  • Open Graph (i.e. Facebook)
  • Twitter

Ready? Let’s get to it!

tl;dr

Before we get into the explanations of all the meta tags, if you just want a quick example of what core meta tags I recommend every page should have (and the ones we’ll be discussing), then here it is:
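Roughly, that block of tags could look like the sketch below – every value here is a placeholder to swap for your own page’s details:

```html
<head>
  <!-- Placeholder values; swap in your own page's details -->
  <title>Page Title – Site Name</title>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <meta name="description" content="A short, accurate description of this page.">

  <!-- Open Graph -->
  <meta property="og:title" content="Page Title">
  <meta property="og:type" content="website">
  <meta property="og:url" content="https://example.com/page">
  <meta property="og:image" content="https://example.com/images/share.png">
  <meta property="og:description" content="A short, accurate description of this page.">

  <!-- Twitter -->
  <meta name="twitter:card" content="summary_large_image">
  <meta name="twitter:site" content="@site_account">
  <meta name="twitter:creator" content="@author_account">
</head>
```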

You’ll find real versions of each of these in this very page’s source. Now that we have that out of the way, I hope you’ll continue reading to see what these tags actually do and why they’re important!

General

These are gonna be the tags that every site should have, regardless of how you plan to use it.

Title

This one’s pretty easy to understand and it’s absolutely the most important tag you should place within your <head> tag. It’s also the only tag we’ll be talking about that’s not explicitly a <meta> tag – this one gets a tag all to itself. You probably already knew this, but the <title> tag sets the title of the page. This will be the title that you see in your browser tab, your bookmark menu, Google results, and practically anywhere that your site is shared. It’s a must to set this.

Viewport

Next to the title, the viewport meta tag is extremely important to have in your site because without it, your site won’t render properly on smaller screen sizes such as mobile phones and tablets. The viewport meta tag gives the browser instructions on how to control the page’s dimensions and scaling. On smaller devices by default, browsers will try to scale down the entire web page width to fit on your screen just like it would on a desktop monitor; you’ve probably seen this if you’ve viewed websites that haven’t been built recently, and you have to zoom in to actually read the content. With modern responsive websites, we don’t want that default behavior. We have the power to build websites that break down properly for smaller screen sizes, and in order to render these sites properly, we need the viewport tag.

Here’s an example of a basic viewport meta tag:
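This is the de facto standard form, and the same one used in the tl;dr block above:

```html
<meta name="viewport" content="width=device-width, initial-scale=1">
```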

This tag sets the width of the page to the device-width (i.e. your viewport width), and the initial scale attribute sets the zoom level to 1, so that you’re not viewing a zoomed-in or zoomed-out version of the page.

Character Set

This meta tag sets the character set of your website. Browsers need to know which character set your site uses in order to render your content properly.  UTF-8 is the default character set for all HTML5 sites – but you still should be explicit about setting it because in HTML4, the default is ISO-8859-1. If you’re using HTML5 (which you should be), then this tag looks like this:
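It’s just this one attribute:

```html
<meta charset="utf-8">
```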

But if you’re stuck using earlier versions of HTML, then you’d use the http-equiv property to set the character set:
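The older http-equiv form (shown here declaring UTF-8, even though ISO-8859-1 was the HTML4 default):

```html
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
```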

Description

The meta description tag sets a 255-character-max block of text that accurately describes the page you’re on. This tag has been the standard for services like Google, Facebook, Slack, and many others to pull in your page’s description for others to see, which makes it very important.

The limit that any service will pull from your meta description is typically 255 characters, so make sure you stay concise with it!

Open Graph

You probably read “Open Graph” above in the intro and may have thought, “what the heck is that?” Open Graph is a protocol that Facebook created that allows any web page links to become rich objects in a social graph. Whenever you paste a link in Facebook (along with many other services) and it automatically creates a clickable block with a title, description, and/or image from that site – it’s using these Open Graph meta tags to do that. Before I knew what was going on here, it always seemed like magic to me when this happened – but it’s all just from simple meta tags! The Open Graph protocol is abbreviated to og when used in HTML.

I’m gonna display a chunk of Open Graph meta tags here, and then we’ll talk about them.
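These are representative placeholders for the tags covered below – note the og:type of article rather than website:

```html
<!-- All values below are placeholders -->
<meta property="og:title" content="What Meta Tags Your Site Should be Using">
<meta property="og:type" content="article">
<meta property="og:url" content="https://example.com/blog/meta-tags">
<meta property="og:image" content="https://example.com/images/meta-tags-share.png">
<meta property="og:description" content="A rundown of the meta tags every site should be using.">
```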

You’ll notice that some of these tags seem redundant compared to the other tags we’ve added so far – and truth be told, I agree. But the Facebook Open Graph debugger throws warnings if you don’t have an og:title or og:description, so it’s best to include them for maximum accessibility.

og:title

The purpose of this meta tag is similar to the <title> tag that we discussed above, but it’s strictly used when sharing a link to your web page. It won’t be used for browser tabs, bookmarks, or Google search results like the actual <title> tag.

og:type

This describes what type of content you’re linking to. More often than not, this will be set to website, but as you see in the example, it doesn’t have to be. Check out the Open Graph docs for the various values that this meta tag can be set to.

og:url

This is the canonical URL that the Open Graph object will reference when shared, and it should 99% of the time be set to the URL of the page you’re linking. The only other value this should really be set to would be something like a home page, in case the current page (e.g. a 404 page, unauthorized page, etc.) really isn’t something you want shared.

og:image

[Image: Open Graph object example]

This is probably the one that you’ve come across most often as a user, and the one that Open Graph really pioneered: an image meta tag. This tag links to an image file, and if it exists, it will display that image when shared on many social media platforms. While Open Graph was originally built by Facebook, several other services such as Slack, LinkedIn, Google+, etc. all use this to pull in an image when you share a web page.

Typically only JPEGs and PNGs are supported, but it’s really up to the platform you’re sharing it on. If they want to render gifs or svgs, then they can do that. When choosing an image size, there are a couple of recommendations.

1 – The image should be reasonably sized. Facebook and other services typically limit it to 8 MB, but you really should never have an image that big on the web. My personal goal is to keep all images under 500 kB.

2 – This is Facebook specifically, but they recommend an aspect ratio of 1.91 to 1, and further recommend images to be 600 x 315 or 1200 x 630 pixels. You can choose an image with any aspect ratio, but abiding by these guidelines will make sure that parts of your images don’t get cropped out.

og:description

Just like og:title is a doppelgänger to the title tag, og:description is similar to your meta description tag.

That covers the basic Open Graph meta tags, but as I mentioned earlier, there are more than just these if you want to get nitty-gritty with your site’s content. Let’s move on to our final category.

Twitter

Twitter has its own protocol suite for meta tags, which involves attaching a “card” to your tweets that looks just like an Open Graph object. In fact – Twitter will actually use Open Graph meta tags that you already have to help render your cards, which is nice so that you don’t have to duplicate any meta tag content. Here’s a base example of what meta tags you should use for Twitter:
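A minimal set, with placeholder handles:

```html
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:site" content="@your_site">
<meta name="twitter:creator" content="@your_handle">
```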

There are more meta tags that Twitter supports such as image, title, description, etc., but the tags shown here are the important ones that are unique to Twitter. You can add those other meta tags – but as mentioned earlier, if they aren’t present, Twitter will go ahead and use the data provided by your Open Graph tags – which is what I prefer.

twitter:card

[Image: Twitter summary large image card example]

This is the most important Twitter meta tag, and it’s required if you want to render a card at all. The value can be one of “summary”, “summary_large_image”, “app”, or “player” – all of which you can read about here. The default value should be “summary”, unless you want to showcase a featured image, in which case you would use “summary_large_image.”

twitter:site

This meta tag describes the twitter username for the website used in the card footer, and is required if you want to track attributions to this username through Twitter Card Analytics.

twitter:creator

This meta tag describes the twitter username for the content creator/author.

That wraps it up for the must-have Twitter meta tags. As mentioned, there are plenty more, and if you’re interested you can read up on them here.

Final Thoughts

Did we cover every meta tag out there? Absolutely not – but we covered a lot of the ones that are pretty important. Some of the ones we missed include a whole suite of meta tags dedicated to deep linking, where you can do things like tell an operating system (such as iOS, Android, or Windows Phone) to open up an app when you land on the webpage instead of rendering the webpage itself. You’ve probably seen this type of action happen when you click on a Twitter, Instagram, or Amazon link. We didn’t cover the author meta tag either, or different things you can do with the http-equiv attribute, or the keywords meta tag – and that last one’s for good reason; the keywords meta tag has become pretty unimportant, and if any SEO “gurus” try to tell you that it is important – then run. Run away, because that’s a bald-faced lie.

Now that you know the purpose of some of the various meta tags and how to use them, you can go update some of your projects to make them more shareable! I hope you enjoyed this post and learned a little bit more about how to power up the HTML in your web pages.

Core Functional Programming Concepts

If you’re a developer like me, then you probably grew up learning about Object-Oriented Programming and how that whole paradigm works. You may have messed with Java or C++ – or been lucky enough to use neat languages like Ruby, Python, or C# as your first true language – so chances are that you’re at least mildly comfortable with terms such as classes, objects, instance variables, static methods, etc. What you’re probably not as comfortable with are the core concepts behind this weird paradigm called functional programming – which is pretty drastically different from not only object-oriented programming, but also procedural, prototypal, and a slew of other common paradigms out there.

Functional programming is becoming a pretty hot topic – and for very good reason. The paradigm is hardly new, either; Haskell is arguably the most purely functional language out there and has existed since 1990. Other languages such as Erlang, Scala, and Clojure also fall into the functional category – and they all have a solid following. One of the major benefits of functional programming is the ability to write programs that run concurrently and do it properly (check out my post on concurrency if you need a refresher on what that means) – meaning that common concerns such as deadlock, starvation, and thread-safety really aren’t an issue. Concurrency in procedural languages is a nightmare because state can change at any given moment: objects have state that can change, and practically any function can change any variable as long as it’s in lexical scope (or dynamic scope, for the few languages that use it). That’s very powerful, but very bad for keeping tabs on state.

Functional programming touts many benefits – but the ability to take advantage of all of a CPU’s cores via concurrent behavior is what makes it really shine compared to the other popular programming languages today – so I want to go over some of the core concepts that power this language paradigm.


Foreword: All of these concepts are language-agnostic (in fact, many functional languages don’t even fully abide by them), but if you had to associate them with any one language, it would most likely fit best with Haskell, since Haskell most strictly abides by core functional concepts. The following 5 concepts are strictly theory-driven and help define the functional paradigm in the purest sense.

1. Functions are Pure

This is easily the foremost rule of functional programming. All functions are pure in the sense that they abide by two restrictions:

  1. A function called multiple times with the same arguments will always return the same value. Always.
  2. No side effects occur throughout the function’s execution.

The first rule is relatively simple to understand – if I call the function sum(2, 3), then it should return the same value every single time. Where this breaks down in more procedural code is when you rely on state that the function doesn’t control, such as global variables or any sort of randomized activity. As soon as you throw in a rand() call, or access a variable not defined in the function, the function loses its purity – and that can’t happen in functional programming.

The second rule – no side effects – is a little bit more broad in nature. A side effect is basically a state change in something other than the function that’s currently executing. Modifying a variable defined outside the function, printing out to the console, raising an exception, and reading data from a file are all examples of side effects which prevent a function from being pure. At first, this might seem like a big constraint for functional programming – but think about it. If you know for sure that a function won’t modify any sort of state outside the function itself, then you have full confidence that you can call this function in any scenario. This opens so many doors for concurrent programming and multi-threaded applications.
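To make that concrete, here’s a quick Java sketch – the global offset variable is just a stand-in for any external state:

```java
public class PurityExample {
    static int someGlobalOffset = 1; // external, mutable state

    // Pure: the same arguments always produce the same result, and nothing outside is touched
    static int sum(int a, int b) {
        return a + b;
    }

    // Impure: the result depends on state the function doesn't control,
    // and printing to the console is a side effect
    static int impureSum(int a, int b) {
        int result = a + b + someGlobalOffset; // reads external state
        System.out.println(result);            // side effect
        return result;
    }
}
```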

2. Functions are First-Class and can be Higher-Order

This concept isn’t exclusive to functional programming (it’s used pretty heavily in Javascript, PHP, and other languages too) – but it is a requirement of being functional. In fact, there’s a whole Wikipedia article over the concept of first-class functions. For a function to be first-class, you just have to be able to assign it to a variable. That’s it. This allows you to handle the function as if it were a normal data type (such as an integer or string), yet still execute it at some other point during runtime.

Higher-order functions build off of this concept of “functions as first-class citizens” and are defined as functions that either accept another function as an argument, or return a function themselves. Common examples of higher-order functions are map functions, which iterate over a list, modify its data based on a passed-in function, and return a new list; and filter functions, which accept a function specifying how elements of a list should be selected, and return a new list with just those selections.
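Here’s a small Java sketch of both, using streams – map and filter each take a function as an argument:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class HigherOrderExample {
    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);

        // map applies the passed-in function to every element and returns a new list
        List<Integer> doubled = numbers.stream()
                .map(n -> n * 2)
                .collect(Collectors.toList());

        // filter takes a function that decides which elements to keep
        List<Integer> evens = numbers.stream()
                .filter(n -> n % 2 == 0)
                .collect(Collectors.toList());

        System.out.println(doubled); // [2, 4, 6, 8, 10]
        System.out.println(evens);   // [2, 4]
    }
}
```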

3. Variables are Immutable

This one’s pretty simple. In functional programming, you can’t modify a variable after it’s been initialized. You just can’t. You can create new variables just fine – but you can’t modify existing ones, and this makes it far easier to reason about state throughout the runtime of a program. Once you create a variable and set its value, you can have full confidence knowing that the value of that variable will never change.

4. Functions have Referential Transparency

Referential transparency is a tricky term to pin down, and if you ask 5 different developers, you’re bound to get 5 different answers. The most accurate definition I’ve come across (and the one I agree with) is this: if you can replace a function call with its return value everywhere it’s called and the state of the program stays the same, then the function is referentially transparent. This might seem obvious – but let me give you an example.

Let’s say we have a function in Java that just adds 3 and 5 together:
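```java
// A minimal version of the function described above
public static int addNumbers() {
    return 3 + 5;
}
```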

It’s pretty obvious that anywhere I call the addNumbers() function, I can easily replace that whole function call with the return value of 8 – so this function is referentially transparent. Here’s an example of one that’s not:
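```java
// The method name here is just for illustration
public static void printSum() {
    System.out.println(3 + 5); // printing to the console is a side effect
}
```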

This is a void function, so it doesn’t return anything when called – so for the function to be referentially transparent, we should be able to replace the function call with nothing as well – but that obviously doesn’t work. The function changes the state of the console by printing out to it – so it’s not referentially transparent.

This is a tricky topic to get, but once you do, it’s a pretty powerful way to understand how functions really work.

5. Functional Programming is Based on Lambda Calculus

Functional programming is heavily rooted in a mathematical system called lambda calculus. I’m not a mathematician, and I certainly don’t pretend to be, so I won’t go into the nitty-gritty details about this field of math – but I do want to review the two core concepts of lambda calculus that really shaped the structure of how functional programming works:

  1. In lambda calculus, all functions can be written anonymously without a name – because the only portion of a function header that affects its execution is the list of arguments. In case you ever wondered, this is where lambda (or anonymous) functions get their name in modern-day programming – because of lambda calculus. *Brain explosion*.
  2. When invoked, all functions go through a process called currying. What this means is that a function with multiple arguments is applied to just one argument at a time: each application fills in a single parameter and returns a new function that takes one fewer argument, and that new function is immediately invoked with the next argument. This happens recursively until the function has been fully applied, and then a final result is returned. Because functions are pure in functional programming, this works; if state changes were a concern, currying could produce unsafe results (see the sketch just below this list).
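Here’s a small Java sketch of the idea, writing a two-argument addition as a chain of one-argument functions:

```java
import java.util.function.Function;

public class CurryingExample {
    public static void main(String[] args) {
        // Two-argument addition expressed as a chain of one-argument functions
        Function<Integer, Function<Integer, Integer>> add = a -> b -> a + b;

        // Apply one argument at a time, just like currying describes
        Function<Integer, Integer> addFive = add.apply(5); // first argument applied
        int result = addFive.apply(3);                     // second argument applied

        System.out.println(result); // prints 8
    }
}
```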

As I mentioned earlier, there’s much more to lambda calculus than just this – but I wanted to review where some of the core concepts in functional programming came from. At the very least, you can bring up the phrase lambda calculus when talking about functional programming, and everyone else will think you’re really smart.

Final Thoughts

Functional programming involves a significantly different train of thought than what you’re probably used to – but it’s really powerful, and I personally think this topic is going to come up again and again with CPUs these days offering more cores to handle processes instead of just one or two beefed-up cores per unit. While I mentioned Haskell as one of the more purely functional languages out there, there are a handful of other popular languages that are classified as functional: Erlang, Clojure, Scala, and Elixir are just a few of them, and I highly encourage you to check one (or more) of them out. Thanks for sticking with me this long, and I hope you learned something!

How Daemons, the Init Process, and Process Forking Work

If you’ve ever worked with Unix-based systems, then you’re bound to have heard the term daemon (pronounced dee-mon) before. My goal here is to explain exactly what they are and how they work, especially since the name makes them seem more convoluted than they actually are.

At its surface, a daemon is nothing difficult to understand – it’s just a background process that’s not attached to the terminal in which it was spawned. But how do they get created, how are they related to other processes, and how do they actually work? That’s what we’re gonna get into today, but before we start really talking about daemons, we need to learn about how the init process and process forking both work.

How the Init Process Works

To start off, we need to talk about the init process – also known as PID 1, because it always has the process ID of 1. The init process is the very first process created when you boot a Unix-based machine, which means that every other process can somehow trace its ancestry back to it.

The init process is started by the kernel at boot time, and its configuration traditionally lives in a file such as /etc/rc or /etc/inittab – though the exact location varies by OS. Normally this process sets the path, checks the file system, initializes serial ports, sets the clock, and more. Finally, the last thing the init process handles is starting up all the other background processes necessary for your operating system to run properly – and it runs them as daemons. Typically, these daemon scripts live in /etc/init.d/. It’s conventional to end daemon executables with the letter d (such as httpd, sshd, mysqld, etc.), so you might think the directory is named for that reason, but it’s actually just a common Unix convention to give directories that hold multiple configuration files a .d suffix. Great – so init starts the daemons – but we still haven’t answered how it does that. The init process starts daemons by forking its own process to create new processes, which leads us to how process forking works.

How Process Forking Works

Traditionally in Unix, the only way to create a process is to create a copy of an existing process and go from there. This practice – known as process forking – involves duplicating the existing process to create a child process, and then making an exec system call to start another program. We get the phrase “process forking” because fork is an actual function in the Unix standard library that handles creating new processes in this manner. The process that calls fork is considered the parent process of the newly created child process. The child process is nearly identical to the parent process, with a few differences such as different process IDs and parent process IDs, no shared memory locks, no shared async I/O, and more.

In today’s Unix and Linux distributions, there are other manners in which you can create a process instead of using fork (such as posix_spawn), but this is still how the vast majority of processes are created.

Now that you know a little bit about the traditional use of the term “fork” in computer science, it probably makes more sense why on GitHub you clone somebody else’s repo by forking it. But I digress – back to daemons!

Finally, How Daemons Work

Schematic of Maxwell’s Demon

Before we get into how daemons work, I want to mention where the name comes from. The term daemon was first used by MIT’s Project MAC, who in turn took the name from Maxwell’s Demon – an imaginary being from a thought experiment that constantly works in the background, sorting molecules (see image). The spelling comes from the Greek daemon, a supernatural being that operates in the background of everyday life and is neither good nor evil in nature – unlike the always-evil way we normally view demons. So as weird as it may sound, the Unix daemon takes its name from that older, neutral sense of a being quietly working in the background, not from the evil demon we picture today.

Daemons are background processes that run separately from the controlling terminal and just about always have the init process as their parent (though they’re not required to); they typically handle things such as network requests, hardware activity, and other wait-and-watch type tasks. They differ from simple background processes spawned in a terminal because those are typically bound to that terminal session: when the session ends, it sends the SIGHUP signal to all of its background processes, which normally terminates them. Because daemons are detached from the terminal and are normally children of the init process, they don’t get terminated that way.

Daemons are spawned in one of two ways: either the init process forks and creates them directly – like we mentioned above in the init process segment – or some other process forks itself to create a child process, and then the parent process immediately exits. The first case seems pretty straightforward – the init process forks to create a daemon – but how does that second case work, and how does the init process end up becoming the parent of these daemons?

When you fork a process to create a child process and then immediately kill the parent process, the child becomes an orphaned process – a running process with no parent (not to be confused with a zombie process, which is a child process that has terminated but is waiting on its parent to read its exit status). By default, if a child process gets orphaned, the init process will automatically adopt it and become its parent. This is a key concept to understand, because this is normally how daemons that you start after boot up relate to the init process. And that’s about all that makes daemons unique from normal background processes – see, not too bad!
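Here’s a minimal C sketch of that fork-and-exit pattern – the extra steps a real daemon takes, like calling setsid() and redirecting its standard streams, are only hinted at in the comments:

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();              /* duplicate the current process */

    if (pid < 0) {                   /* fork failed */
        perror("fork");
        return 1;
    }

    if (pid > 0) {
        /* Parent: exit immediately, orphaning the child */
        return 0;
    }

    /* Child: with its parent gone, it gets adopted by init (PID 1).
     * A real daemon would also call setsid() and redirect stdin/stdout/stderr,
     * but this shows the fork-and-exit idea described above. */
    while (1) {
        sleep(60);                   /* stand-in for real background work */
    }
}
```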

Final Thoughts

All in all, daemons are a pretty simple concept to understand – but in order to fully grok them, we needed to go into what the init process is and how process forking works. Now go impress your friends, and tell them to start pronouncing it correctly too! Dee-mon instead of day-mon.

Optimizing Your Web Page for Speed

We’ve all had it happen – that web page you navigate to where you can hardly interact with anything for a full 10 seconds because images are still loading, or you can’t scroll down because Javascript is still executing. These are what we call unoptimized web sites, and they’re a scourge on the internet. The good news is that it’s relatively simple to optimize your web page so it loads practically instantaneously – or at the very least, doesn’t hamper interaction for your users while larger files finish downloading. Keep following along – I’m about to show you how to do it.


Note: This post covers shrinking your web page’s overall payload so that it loads quicker, and nothing related to Search Engine Optimization. If you’re looking for SEO – the internet has a plethora of other posts about this topic.

Optimizing Images

Shrinking the payload of your images is the biggest way that you can help to optimize your web site. Let’s say that you snap a photo with your mobile phone and you want to put it online. That image from your phone easily sits at about 4MB initially – and there’s no way you can put that on your website (especially if it just needs to be a thumbnail!). You might be thinking “but that’s what I do with Facebook and Instagram!” – but they have image optimization built into their services that fires when you upload the image, because they don’t want to house those large images either.

This image was a 2 MB screenshot at first – now it’s just 20 KB!

Another thing you might be thinking is that you don’t want to degrade the quality of your images by shrinking their size – and truth be told, that’s just not a real concern. It’s true that when you shrink your image sizes, you will lower the quality of your images, but if you’re uploading a 3000 x 4000 pixel image to your website and your site naturally shrinks it down to 300 x 400 pixels anyway – then you’re losing quality already without saving yourself any of that payload size.

To optimize your images, there are 3 things you can do:

  • Crop your image to a size that you actually plan to show it at on your site
  • Re-save the image at about 60–70% quality (you won’t notice the difference) using a tool like Photoshop or GIMP
  • Use a lossless image optimization tool such as ImageOptim

By following these 3 procedures, you can easily bring a 4MB image down to under 150KB – or even far less if the image is smaller on your site!

Minify & Concatenate your CSS & JS

If you’re unfamiliar with these concepts, minifying means running your CSS and JS through a tool that collapses your code onto as few lines as possible, removes extra white space, shortens variable names, and applies a slew of other optimization techniques to shrink your file size. Minifying your files can easily cut their payload in half – and it doesn’t affect how your users interact with your site at all.

You only want to minify your production files – because developing against minified files is nearly impossible. To help with this, I encourage you to look into a build automation tool such as Gulp or Grunt to make your life easier. For JS you can use gulp-uglify, and for CSS there’s gulp-minify-css (similar libraries are available for Grunt and other build automation systems). Going further for CSS, there’s also gulp-uncss, which will strip out any CSS in your production files that’s not actually used on your web site. It doesn’t get much better than that!

On top of minification, you should concatenate your CSS and JS files so that your users’ browsers only need to download the minimum number of files for your site. Using Sass makes CSS concatenation nice and simple because you can include all your other Sass files in one main file via the @import command, but native JS concatenation is a little more difficult. The new ES6 spec supports Javascript modules so that you can include all of your JS files in a main file, just like we talked about with Sass, but ES6 doesn’t have nearly enough browser support yet. You can get around this with Babel, or if you want to stick with pure ES5, you can use Browserify, which allows you to write CommonJS-style modules.

Still – none of this accounts for external libraries you include, such as jQuery or Lodash, which are normally just loaded into the global scope of a web page. The most straightforward way to concatenate your JS files (and your CSS files, if you’re not using Sass) is to use a build automation plugin such as gulp-concat, where you specify exactly which files you want to concatenate and it appends the code one after another into a new file.
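To tie the minification and concatenation steps together, here’s a minimal gulpfile sketch using the plugins mentioned above – the src/ and dist/ paths are just examples:

```js
// A minimal sketch of a gulpfile using the plugins mentioned above.
// Adjust the src/ and dist/ paths for your own project.
var gulp = require('gulp');
var concat = require('gulp-concat');
var uglify = require('gulp-uglify');
var minifyCss = require('gulp-minify-css');

// Concatenate all JS into one file, then minify it
gulp.task('scripts', function () {
  return gulp.src('src/js/**/*.js')
    .pipe(concat('app.js'))
    .pipe(uglify())
    .pipe(gulp.dest('dist/js'));
});

// Minify the CSS (Sass @import can already handle the concatenation)
gulp.task('styles', function () {
  return gulp.src('src/css/**/*.css')
    .pipe(minifyCss())
    .pipe(gulp.dest('dist/css'));
});
```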

Easy peasy.

Limit Web Fonts

Just like images, CSS, and JS, fonts are also a resource that counts toward your page’s overall size – and if you overload your site with fonts, then you might have a problem. Services like Google Fonts and Adobe Typekit have become pretty traditional ways of adding fonts to your website – and each of them allows you to select certain versions of fonts to use, such as bold, italic, semibold, bolditalic, light, etc. Each version of a font has to be downloaded by the browser, and the vast majority of the time you don’t need every version of a font. I strongly, strongly encourage you to select exactly which font forms you need instead of selecting them all. Being choosy with your fonts could mean the difference between adding 50KB and 500KB of extra weight to your page from fonts.
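For example, with Google Fonts you can request only the weights you actually use – the font family here is just an example:

```html
<!-- Only regular (400) and bold (700), instead of every available weight -->
<link href="https://fonts.googleapis.com/css?family=Open+Sans:400,700" rel="stylesheet">
```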

Use Caching

Last but not least, make sure you establish a cache policy for your website. This can be handled a number of different ways depending on whether you use nginx or apache, and if you’re serving up a dynamic site, then there’s a good chance your CMS or language framework supports forms of both client-side and server-side caching (such as WP Super Cache for WordPress).

This topic can get pretty extensive, and if you want to dig deeper into how caching actually works and how to start establishing a stable cache policy, I encourage you to check out my post over How Browser Caching Works.

Final Thoughts

Building an optimized web page these days is incredibly important – especially with mobile browsing becoming much more prevalent; after all, you can imagine how annoying it must be for your users to sit and watch while your web page loads – especially if you know you can make it better. There are also users (especially in developing countries) who have very low monthly data caps, and if your web site alone eats up 10% of that cap – then that’s just a big no-no.

If it’s your first time really thinking about page optimization, then don’t rush things. Go through this post slowly and build a development process for yourself that you can use for all future projects you work on. After you build one or two sites following these practices, it becomes second nature to optimize everything on your site as you build it – and you may even find ways to optimize your site that we didn’t discuss here (such as uploading videos on YouTube and embedding them on your site, instead of playing them directly through your website).

Thanks for reading, and I hope this post helped to open the doors for you on how you can start optimizing your web pages!

ARIA Roles and Attributes: How to Actually Use Them

If you’re a web developer, then there’s a chance that you’ve heard of ARIA roles and attributes before. WAI-ARIA – a protocol suite for building Accessible Rich Internet Applications (hence the name) – lays down some rules to help developers build websites that are accessible to all users. A lot of the time when we think of accessibility, we only think of blind users – but there are many other types of disabilities people may have, such as color blindness, motor impairment, missing limbs, auditory issues, cognitive issues, “crisis” moments, etc. Using some of the core ARIA concepts will not only help you build websites that enhance the experience for users with disabilities – it will also help you architect your HTML better and make it more semantic, and doing things like that will help you become a better developer.

ARIA by no means makes up the entirety of accessibility concerns for web development, and if you’d like to learn how else you can build your website for accessibility, I suggest you hop on over to my post about Developing for Accessibility. In this post, we’ll specifically be sticking with ARIA roles and attributes, and how you can actually use them.

What I mean by “actually use them” is that I’m going to show you how to take your first simple steps implementing ARIA concepts into your HTML. If you google around for ARIA, you’ll likely find two kinds of resources:

  1. On one end of the spectrum, you’ll find overwhelming documentation covering every single ARIA role and attribute (and there are a ton) to the point where your eyes glaze over just scrolling down the page
  2. Or, you’ll find some small posts and/or videos about accessibility that basically say “I’m not going to go over ARIA too much, but here are some of the roles you can put in your HTML to help with accessibility.”

Both of these options suck – because they’re not effective at teaching you. I want to provide a middle ground between the two. I’m going to show you exactly how you can use ARIA roles and attributes in your HTML today with real examples – but I’m not going to throw a book of documentation at you. We won’t go over everything – in fact, we’ll probably cover less than 30% of the full WAI-ARIA spec – but we’ll cover enough of the important parts that you can actually use and remember them.

Ready? Let’s get to it.

How ARIA Roles and Attributes Work

Before we get to some examples, I want to explain what ARIA roles and attributes are and how they work. ARIA defines attributes that you apply to HTML elements just like an href or class attribute. As a user with few or no disabilities browsing the web, you won’t ever notice ARIA roles or attributes, because they don’t affect the visual design of a site – they’re strictly used by screen readers and other assistive technologies.

Browsers build an accessibility tree for each website you visit so that assistive technologies can navigate it more easily. ARIA roles and attributes help fill in the gaps about what certain elements or groups of elements are for, and how an element is supposed to be used.

Here’s an example of an unordered list aided with ARIA roles and attributes:
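```html
<!-- The item labels are just placeholders -->
<ul role="menu" aria-expanded="false">
  <li role="menuitem">Dashboard</li>
  <li role="menuitem">Settings</li>
  <li role="menuitem">Log Out</li>
</ul>
```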

Just by looking at it, this type of semantic HTML probably makes sense to you. Without the help of ARIA, this would just look like a list of items – but now you can tell that this is supposed to be a menu and with the aria-expanded state set to false, you know that this menu isn’t showing the individual menu items yet.

Rules of ARIA Use

There are a few core rules to keep in mind when using ARIA:

  1. If you can semantically build your website using native elements (like <nav>, <header>, <aside>, <button>, etc.), then you should always do that instead of relying on ARIA roles or attributes. Use ARIA roles or attributes when the HTML isn’t obviously stating the purpose of an element or group of elements.
  2. Don’t take away or change the native semantic meaning of an element with ARIA roles or attributes.
  3. All interactive controls such as a button, sliding control, or drag-and-drop widget must be usable by the keyboard.
  4. There are 2 ways to hide information from the accessibility tree, which should be used very sparingly for situations where content is unimportant or meant to be hidden. You can do this either with role=”presentation” or aria-hidden=”true”. You should never use these on an element that is visible and can be focused with the keyboard, such as an input field or a link. Defining a presentation role is more strict than an aria-hidden=”true” state – and we’ll see an example of this down below.
  5. Lastly, all interactive elements such as form fields should have a name associated with them. Something like a <label> is perfect, and with ARIA, you can even specify that a certain element is labelled by or described by another element.

Great – we’ve now gotten all of the introductory ARIA stuff out of the way – let’s get to some examples of how you can use ARIA roles and attributes in your HTML today.

Using ARIA Roles and Attributes

ARIA breaks down into 3 categories: roles, properties, and states. Roles define the purpose of an element, properties help better describe what an element can do, and states are like properties that are designed to change – normally with the help of Javascript. An element can only have one ARIA role at a time, but can have as many properties and states as necessary.

Let’s start off simple.

Define your main header, content, and footer

Each page normally has an identifiable header, main content, and footer – and there are specific ARIA roles designed to help express these elements.
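In practice, that can be as simple as this:

```html
<!-- Content omitted for brevity -->
<header role="banner">...</header>
<main role="main">...</main>
<footer role="contentinfo">...</footer>
```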

The banner, main, and contentinfo roles are meant to be used only one time per page, and they help screen readers figure out how a page is laid out on a high-level.

See, using ARIA roles is easy! Let’s get a little deeper.

Label and Describe Elements

If an element seems rather vague, but could either be given a title or described by another element, then you can define that relationship using ARIA. There are 3 different ARIA properties that can help with this:

aria-label is a property that defines a short title for an element; aria-labelledby references the ID of another element that serves as a short title for this one; and aria-describedby is just like aria-labelledby, but is meant for longer descriptions instead of short titles. Here’s an example using a button’s tooltip:
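```html
<!-- The ID and text here are just placeholders -->
<button aria-describedby="save-tooltip">Save</button>
<div id="save-tooltip" role="tooltip">Saves your draft without publishing it.</div>
```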

For shorter labels of important elements, such as a lightbox that contains a larger version of the image you clicked on, you can use the aria-label property:
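```html
<!-- The class name and label text are just placeholders -->
<div class="lightbox" aria-label="Larger version of the selected photo">
  ...
</div>
```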

Now it’s important to remember that we don’t need to label everything, especially if there’s already a predefined way of labelling an element such as a <figcaption>, title attribute, or an image’s alt attribute. We only need to label something if the HTML doesn’t clearly indicate the purpose of an important element.

Navigation

This topic’s going to be a bit extensive, but that’s because navigation is one of the areas of a site that you really want to get right since people need it to, well, navigate around. Normally this involves <nav>, <ul>, <li>, and <a> elements. Let me give you an example of a solid nav bar set up with ARIA roles and attributes, and then we’ll talk about it:
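```html
<!-- Menu item labels are placeholders – the roles and states are what matter -->
<nav role="navigation">
  <ul role="menubar">
    <li role="menuitem">Home</li>
    <li role="menuitem" aria-haspopup="true">
      Services
      <ul role="menu" aria-hidden="true">
        <li role="menuitem">Web Design</li>
        <li role="menuitem">Development</li>
      </ul>
    </li>
    <li role="menuitem">Contact</li>
  </ul>
</nav>
```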

Lots of roles and attributes, right? Like I said, navigation is one of the most important parts of a website, and that’s why making sure the accessibility tree can build it properly is so important too. In this example, we defined the navigation with a navigation role, and its child unordered list as being a menubar. This means that the navigation is visually presented as a horizontal menu bar as opposed to a vertical menu (which instead would use a menu role). Beneath that, we have our list of menuitems. When we get to a menuitem that has a sub-menu that pops up, then we give it an ARIA property of aria-haspopup=”true”. We give the sub-menu a role of menu because this is a vertical submenu, as well as an ARIA state of aria-hidden=”true”. The reason this is a state is because the sub-menu is initially hidden from view, but when you hover over the parent menuitem, the sub-menu would appear, and then hide again when you aren’t interacting with it. With Javascript, you could change the state to be aria-hidden=”false” while the sub-menu is visible, and then back to true again when it’s not.

ARIA rule #4 above said to be hesitant about using aria-hidden=”true” – but this is a perfect example of how to use it properly. The aria-hidden state deals with whether an element is supposed to be visible to a user at a certain time, while the presentation role straight up removes the element from the accessibility tree – which we certainly don’t want to do for navigation.

This same type of structure works for lists that aren’t necessarily menus – but instead of menu and menuitem roles, you would use list and listitem roles. Everything else such as properties and states remains exactly the same.

I know there are a lot of ARIA roles and attributes here – but you can reasonably assume that just about every nav – regardless of exact HTML structure – will follow an ARIA architecture similar to this example.

Tab Lists

Another common way you can use ARIA labels and descriptions is when you build a tab widget on your page, where you click tabs to reveal different content. On top of ARIA labels though, we have some other neat tab-specific ARIA roles and properties I want to show you. Specifically, they are:

  • tab – a clickable tab which reveals content
  • tablist – the container which groups the clickable tabs
  • tabpanel – the actual content of the tab
  • aria-controls – a property that’s not tab-specific, but helps indicate that an element controls another element

Tab lists are one of those things that really require a lot of visual acuity to understand, and without semantic HTML elements specific to tab architecture, it’s difficult to make tabs accessible by default. That’s why it’s so important to build them accessibly with ARIA roles and attributes. In the example below, we’re doing a lot of different things:

  • Setting ARIA roles for the tablist, tabs, and tabpanels
  • Stating which tab controls which tabpanel
  • Stating which tab labels each tabpanel
  • Handling the aria-hidden state to indicate which tabpanel is visible at any given time
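Here’s roughly what that can look like – the IDs and labels are just placeholders:

```html
<ul role="tablist">
  <li id="tab-1" role="tab" aria-controls="panel-1">First Tab</li>
  <li id="tab-2" role="tab" aria-controls="panel-2">Second Tab</li>
</ul>

<div id="panel-1" role="tabpanel" aria-labelledby="tab-1" aria-hidden="false">
  Content for the first tab
</div>
<div id="panel-2" role="tabpanel" aria-labelledby="tab-2" aria-hidden="true">
  Content for the second tab
</div>
```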

This, my friend, is proper and accessible HTML architecture.

Forms

Last, and perhaps most importantly, it’s absolutely essential that you make the interactive portions of a website as accessible as possible – and usually that means your forms. There are a lot of ARIA roles and attributes that can be applied to forms, so I just want to highlight some of the important ones to include:

  • form – pretty simple, just the landmark role for a <form>
  • search – the role for a form with the primary function of searching data
  • aria-required – property indicating whether a field is required
  • aria-invalid – property indicating that the value of an input field is invalid (wait until after form submission to add this)

On top of ARIA roles, there are a couple important things to consider when building accessible forms.

  1. It’s incredibly important that each form field has a valid <label> associated with it which either wraps the form field or references it with the for attribute. If this isn’t possible, then you can use the ARIA labelling methods discussed above. You cannot substitute the placeholder attribute for a label because it’s not meant to be handled as a label; a placeholder is meant to simply be an example of what you’re supposed to enter in that field.
  2. Forms are often tabbed through via the keyboard, so it’s important that the tab order makes sense. Normally this isn’t a concern, but if you position or hide certain input fields via CSS/Javascript, then the tab order might become unintuitive. When this happens, you can set the tabindex attribute of an element to make sure the tab order is what you expect.

Here’s an example form with proper markup:
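```html
<!-- Field names and labels are placeholders -->
<form role="form">
  <label for="name">Name</label>
  <input type="text" id="name" name="name" aria-required="true">

  <div role="radiogroup" aria-labelledby="contact-label">
    <span id="contact-label">Preferred contact method</span>
    <label><input type="radio" name="contact" value="email"> Email</label>
    <label><input type="radio" name="contact" value="phone"> Phone</label>
  </div>

  <label for="message">Message</label>
  <textarea id="message" name="message" aria-multiline="true" aria-required="true"></textarea>

  <button type="submit">Send</button>
</form>
```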

I threw in a couple extra ARIA roles and attributes such as radiogroup and aria-multiline – but that’s just to show how specific you can get with them. Notice how we didn’t add a radio role to the radio buttons (which is a valid ARIA role) – that’s because a radio input field itself semantically expresses how that element is supposed to work, and we don’t need to express that again with ARIA. However, because the wrapper of those fields is just a <div>, we still went ahead and gave it a radiogroup role.

Mostly, I just wanted to show the importance of labelling your input fields and how you can flag certain fields as required via ARIA attributes. If any field were invalid during the submission, then we would add an aria-invalid=”true” state onto each invalid field, and remove that state when the field becomes valid again.

Final Thoughts

We went over a lot of examples, and there’s still many more ARIA roles and attributes that we didn’t talk about – so feel free to check out the ARIA docs if you want to learn more.

I love building accessible websites because it really feels like the right thing to do, but I like it for another reason too: I’m huge into code architecture and organization, and using ARIA roles and attributes helps me architect my HTML much more semantically – and I love that. I hate using un-semantic elements such as <div>, <span>, and sometimes even <ul> – but if I can add an ARIA role such as contentinfo, menu, treeitem, status, and more, then I’m infinitely happier because I’ve appropriately defined via HTML what this element is supposed to be. Taking things even further with ARIA attributes such as aria-expanded, aria-hidden, and aria-invalid makes it even more semantic and meaningful.

If you don’t already, then I encourage you to start applying some of the ARIA principles into your web sites today – and as I mentioned in the intro, if you’d like to learn other ways that you can build your site accessibly, then you can check out my post over Developing for Accessibility. I hope I’ve proven that it’s not too difficult to get started – and if you want more information, then the docs can answer any question you may have about them.