
Author: Jarlskov

My name is Jesper Jarlskov and I am an engineer (M.Sc.) and software developer focused on development for the web, social, and mobile technologies. Welcome to my blog about technology, software development, and what goes on in my life. Among other things, the blog will cover projects I follow or work on, as well as various things I find interesting and worth sharing. I hope you find something of interest here. I also blog about beer and food. You can follow me on: Google+ or Twitter.

April-2017.php

April is almost over, and it’s time for another monthly roundup of interesting articles and links.

During the month I've read some interesting articles covering a pretty good spread, from introductions to JavaScript tooling, through best practices for working developers, to a deep dive into the PHP engine.

The world of JavaScript tooling and frameworks seems to be an ever-moving target; even so, the purpose of these tools is to make the everyday life of developers easier. Npm currently seems to be the de facto standard for JavaScript package management. Sitepoint posted a beginner's guide to npm, which gives a bit more background for people who want to know a bit more than how to write npm install.

At our company we recently switched to the Webpack module bundler. It's been working quite well, but it was still interesting to read Tutorialzine's learn Webpack in 15 minutes and gain a few more insights into how it all works.

JavaScript as a language is also a moving target. The latest version is ECMAScript 2015 (aka ES2015, aka ES6); if you want to know more about what's new in this version, you can Learn ES2015 with Babeljs.

On a more general development note, TechBeacon published an article called 35 bad programming habits that make your code smell. Despite the clickbaitish title, the article contains some good points about bad programming habits worth keeping in mind when doing your development.

Developers often complain about having to maintain other people's code, and sometimes you get the impression that only greenfield projects are fun to work on. Tobias Schlitt from Qafoo has some very interesting points about the advantages of improving existing code bases in his Loving Legacy Code. I think the article presents some really interesting advantages that are often forgotten when we get too focused on the code itself.

I've written a few deployment scripts myself, but it's always interesting to learn about other people's experiences. Tim MacDonald has some interesting points in his Writing a Zero Downtime Deployment Script article.

It's always interesting to know how our tools actually work internally. Even though it's a bit too low-level for me, I always find Nikita Popov's PHP internals articles interesting, and the same goes for his new deep dive into the PHP 7 Virtual Machine.

As a last thing I'd like to share a PHP library that I'd like to play around with: Spatie's Crawler, a PHP library for crawling websites. I could imagine this working well together with Fabien Potencier's Goutte web scraper library. I currently use Goutte for a small "secret" side project I call Taplist, a web scraper that scrapes the websites of beer bars in Copenhagen to collect the lists of what's currently on tap in one place.


March-2017.php

The month of March is ending, so it's time for this month's roundup of recommended reads. This month consists of three main topics: PHP, JavaScript, and project management.

Even though I work a lot with PHP, there's always something new to learn. I've previously dived a bit into how Composer works. Earlier in the month my attention was drawn to the Composer documentation, namely the part about autoloader optimization, which features hints on how to make the Composer autoloader faster in your production environment.
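For reference, the gist of the optimization is a couple of flags on the standard Composer commands; a minimal sketch (see the linked documentation for the trade-offs of each level):

    # level 1: generate a class map up front
    composer dump-autoload --optimize

    # level 2/A: additionally skip filesystem checks for classes
    # not found in the class map
    composer dump-autoload --optimize --classmap-authoritative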

I'm working a lot with JavaScript these days, especially with the Vue framework. A lot is currently happening in the JavaScript space, and I'm struggling to get updated and stay at least partly on top of what's going on.

JavaScript ES6, aka ECMAScript 2015, is the newest version of the ECMAScript standard and comes with a bunch of new features. Sitepoint has a nice article about 10 ES6 native features that you previously needed external libraries to get. They also have a quick rundown of 3 JavaScript libraries to keep your eyes on in 2017, including the aforementioned Vue.

"Awesome" lists, curated lists of links somebody finds noteworthy, seem to be the big thing on Github these days. So much so that there are even awesome awesomeness meta lists popping up. Any respectable (and every other) technology seems to have at least one dedicated awesome list, and of course there is also an awesome Vue list.

When working with a framework and a related template engine, there is usually a standard way of sharing data between the application and the frontend. But as the distance between the front-end and the back-end of an application grows, with more functionality moving to the browser, sharing data gets a bit more complex. Jesse Schutt has a nice article about different strategies for sharing data between Laravel and Vue applications, but despite the name, the strategies are applicable to any application where the back-end and front-end are separated and need to share data.

On a less technology-focused note, PHPStan author Ondřej Mirtes wrote a nice article on How PHPStan got to 1000 stars on Github. Despite the rather clickbaitish title, the article has some really nice points on how to keep your focus while building, launching, and growing an open source project and the community around it. Most of the points are relevant to any development process, not only open source projects, and they're definitely worth being aware of.

On the same note of project building, I stumbled upon an article about the self-limiting habits of entrepreneurs, which is also worth a read for anybody trying to build a project, open source or otherwise.


February-2017.php

February is coming to a close, and it’s time for a monthly round-up.

Even though February is the shortest month of the year, I doubt it will be the least eventful. The security scene in particular has been on fire this month, and I doubt we've seen the last of the debris.

The month gave us two major security related findings from Google.

First, they announced the first practical way to create SHA-1 hash collisions, putting the final nail in the coffin for SHA-1 usage in any security context.

Later in the month, Google's security research team, Project Zero, announced how Cloudflare's reverse proxies would, in certain cases, return private data from memory, a bug which came to be known as Cloudbleed. The Google researchers worked with Cloudflare to stop the leak, but according to Cloudflare's incident report, the issue had been open for a while.

On a slightly different note: Laravel is a popular PHP framework. Articles online about the framework seem to be equal parts hype and belittlement. Earlier this month a critical analysis of Laravel was making its rounds in the Twittersphere. I believe it provides a nice description of the pros and cons of Laravel without falling for either the hype or the hatred that is often displayed in framework discussions in general, and Laravel discussions in particular.

As a lead developer, I spend a lot of time thinking about and making decisions on software architecture, so it's always nice to get some inspiration and new ideas. Even though it's a rather old article by now, I believe Uncle Bob has some nice points in Screaming Architecture, where he points out that the architecture of a piece of software should make it obvious what the software does, rather than which framework it's built upon.

Developers seem to find incredible performance gains when upgrading to PHP 7, from Tumblr reporting more than 50% performance improvement to Badoo saving one million dollars per year in hosting and server costs. For the nerds out there, PHP core contributor Julien Pauli did a deep dive into the technical side of PHP 7's performance improvements.

On the topic of performance, I found Sitespeed.io, a collection of open source performance testing/monitoring tools that I'd like to look more into.

Want to know more about what’s going on in the PHP community? Here is a nice curated list of PHP podcasts.


Laravel file logging based on severity

By default, anything logged in a Laravel application is logged to the file storage/logs/laravel.log. This is fine for getting started, but as your application grows you might want something that's a bit easier to work with.

Laravel comes with a couple of different loggers:

single: everything is logged to the same file.
daily: everything is logged to the same file, but the file is rotated daily so you have a fresh file every day.
syslog: log to syslog, letting the OS handle the logging.
errorlog: pass log entries on to the web server's error log.

Having different handlers included provides some flexibility, but sometimes you need something more specific to your situation. Luckily Laravel has outsourced its logging needs to Monolog, which provides all the flexibility you could want.

In our case, logging to files was enough, but having all of our log entries in one file made it impossible to find anything. Monolog implements the PSR-3 specification, which defines 8 severity levels, so we decided to split our logging into one dedicated file per severity level.

To add your own Monolog configuration, you just call Illuminate\Foundation\Application::configureMonologUsing() from your bootstrap/app.php.

To add a file destination for each severity level, we simply loop through all available severity levels and assign a StreamHandler to each level.
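A minimal sketch of what that can look like (the file names and paths are my own choice here, not something Laravel prescribes):

    <?php
    // bootstrap/app.php

    use Monolog\Logger;
    use Monolog\Handler\StreamHandler;

    $app->configureMonologUsing(function (Logger $monolog) {
        // Logger::getLevels() returns ['DEBUG' => 100, 'INFO' => 200, ...].
        // With $bubble set to false, each record is written only to the
        // file matching its own severity level.
        foreach (Logger::getLevels() as $name => $level) {
            $monolog->pushHandler(new StreamHandler(
                storage_path('logs/' . strtolower($name) . '.log'),
                $level,
                false // don't bubble on to the other handlers
            ));
        }
    });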

Simple as that.

Later we decided that all errors should also be sent to a Slack channel, so we could keep an eye on things and react quickly if required. Luckily Monolog ships with a SlackHandler, so this is easy as well.
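A sketch of what this could look like, inside the same configureMonologUsing() callback; the token and channel are placeholders, and the SlackHandler constructor arguments are worth double-checking against your Monolog version:

    // still inside the configureMonologUsing() closure
    if (! config('app.debug')) {
        $monolog->pushHandler(new \Monolog\Handler\SlackHandler(
            'xoxp-your-slack-api-token', // placeholder token
            '#errors',                   // placeholder channel name
            'logbot',                    // bot username shown in Slack
            true,                        // use message attachments
            null,                        // no icon emoji
            Logger::ERROR                // only errors and above
        ));
    }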

So with just a few registered handlers, we've tailored our logging to a level that suits our current needs.

Notice how we only log to Slack when we’re not in debug mode, to filter out any errors thrown while developing.


HTTPS and HTTP/2 with Letsencrypt on Debian and NGINX

Introduction

Most web servers today run HTTP/1.1, but HTTP/2 support is growing. I've written more in-depth about the advantages of HTTP/2. Most browsers that support HTTP/2 only support it over encrypted HTTPS connections. In this article, I'll go through setting up NGINX to serve pages over HTTPS and HTTP/2. I'll also talk a bit about tightening your HTTPS setup to prevent common exploits.

Letsencrypt

HTTPS utilises public-key cryptography to provide end-to-end encryption and authentication for connections. The certificates are signed by a certificate authority, which can then verify that the certificate holder is who they claim to be.

Letsencrypt is a free and automated certificate authority that provides free certificate signing, which has historically been a costly affair.

Signing and renewing Letsencrypt certificates is done using their certbot tool. The tool is available in most package managers, which means it's easy to install. On Debian Jessie, just install certbot like anything else with apt:
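    # certbot was shipped via jessie-backports at the time; on newer
    # releases a plain `apt-get install certbot` should suffice
    sudo apt-get install certbot -t jessie-backports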

Using certbot you can start generating new signed certificates for the domains of your hosted websites:
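The general form looks something like this; I'm assuming the webroot plugin here, a common choice when an existing web server is already running:

    certbot certonly --webroot -w /path/to/webroot -d yourdomain.tld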

For example
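    # domain and webroot path are placeholders, of course
    certbot certonly --webroot -w /var/www/example.com -d example.com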

Letsencrypt certificates are valid for 90 days, after which they must be renewed by running
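    certbot renew

The renew command only renews certificates that are close to expiry, so it is safe to run it as often as you like.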

To automate this process, you can add this to your crontab, and make it run daily or weekly or whatever you prefer. In my setup it runs twice daily:
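Something like this should do; the exact schedule is a matter of taste:

    # /etc/crontab
    17 5,17 * * * root certbot renew --quiet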

Note that Letsencrypt currently doesn’t support wildcard certificates, so if you’re serving your website from both example.com and www.example.com (you probably shouldn’t), you need to generate a certificate for each.

NGINX

NGINX is a free open source asynchronous HTTP server and reverse proxy. It’s pretty easy to get up and running with Letsencrypt and HTTP/2, and it will be the focus of this guide.

HTTPS

To have NGINX serve encrypted data using our previously created Letsencrypt certificate, we have to ask it to listen for HTTPS connections on port 443 and tell it where to find our certificates.

Open your site config, usually found in /etc/nginx/sites-available/<site>, here you’ll probably see that NGINX is currently listening for HTTP-connections on port 80:
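A typical default server block looks something like this (example.com being a placeholder):

    server {
        listen 80;

        # ... the rest of your site configuration

        server_name example.com;
    }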

So to start listening on port 443 as well, we just add another line:
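    server {
        listen 80;
        listen 443 ssl;

        # ... the rest of your site configuration

        server_name example.com;
    }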

The domain at the end would, of course, be the domain that your server is hosting.

Notice that we’ve added the ssl statement in there as well.

Next, we need to tell NGINX where to look for our new certificates, so it knows how to encrypt the data. We do this by adding:
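    # inside the same server block
    ssl_certificate     /path/to/fullchain.pem;
    ssl_certificate_key /path/to/privkey.pem;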

With the default Letsencrypt settings this would translate into something like:
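    ssl_certificate     /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;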

And that’s really all there is to it.

HTTP/2

Enabling HTTP/2 support in NGINX is even easier: just add an http2 statement to each listen line in your site config. So:
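    listen 443 ssl;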

Turns into:
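    listen 443 ssl http2;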

Now test out your new, slightly more secure, setup:
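    # checks the configuration files for syntax errors
    sudo nginx -t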

If all tests pass, restart NGINX to publish your changes.
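    sudo service nginx restart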

Now your website should also be available using the https:// scheme… unless the port is blocked in your firewall (that could happen to anybody). If your browser supports HTTP/2, your website should also be served over this new protocol.

Improving HTTPS security

With the current setup, we’re running over HTTPS, which is a good start, but a lot of exploits have been discovered in various parts of the implementation, so there are a few more things we can do to harden the security. We do this by adding some additional settings to our NGINX site config.

Firstly, old versions of TLS are insecure, so we should force the server not to fall back to them:
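    # only allow modern TLS versions; at the time of writing
    # that means TLS 1.2
    ssl_protocols TLSv1.2;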

If you need to support older browsers like IE10, you need to allow older versions of TLS:
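    # less strict: also allow TLS 1.0 and 1.1 for older browsers
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;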

In effect this only turns off SSL encryption; it's not optimal, but sometimes you need to strike a balance, and it's still better than the NGINX default settings. You can see which browsers support which TLS versions on caniuse.

When establishing a connection over HTTPS, the server and the client negotiate which encryption cipher to use. This has been exploited in some cases, like in the BEAST exploit. To lessen the risks, we disable certain old, insecure ciphers:
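    # a starting point rather than a definitive list; curated,
    # regularly updated cipher lists exist, e.g. from Mozilla's
    # SSL configuration generator
    ssl_prefer_server_ciphers on;
    ssl_ciphers 'HIGH:!aNULL:!eNULL:!MD5:!RC4:!3DES:!DES';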

Normally HTTPS certificates are verified by the client contacting the certificate authority. We can tighten this up a bit by having the server download the authority's response and supply it to the client together with the certificate. This saves the client the roundtrip to the certificate authority, speeding up the process. This is called OCSP stapling, and it is easily enabled in NGINX:
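    ssl_stapling on;
    ssl_stapling_verify on;
    # NGINX needs a resolver to reach the OCSP responder;
    # this one is Google's public DNS
    resolver 8.8.8.8;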

Enabling HTTPS is all well and good, but if a man-in-the-middle (MitM) attack actually occurs, the perpetrator can decrypt the connection from the server and relay it to the client over an unencrypted connection. This can't really be prevented, but it is possible to instruct the client that it should only accept encrypted connections from the domain; this mitigates the problem on any visit after the first one. This is called Strict Transport Security:
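    # max-age is one year, in seconds
    add_header Strict-Transport-Security "max-age=31536000";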

BE CAREFUL with this last setting: it will prevent clients from connecting to your site for one full year if you decide to turn off HTTPS, or if an error in your setup causes HTTPS to fail.

Again, we test our setup:
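    sudo nginx -t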

If all tests pass, restart NGINX to publish your changes (again, consider if you’re actually ready to enable Strict Transport Security).

Now, to test the setup. First we run an SSL test to check the security of our protocol and cipher choices; the changes mentioned here improved this domain to an A+ grade.

We can also check our HTTPS header security. Since I still haven't set up Content Security Policies or HTTP Public Key Pinning, I'm sadly stuck at a B grade, leaving room for improvement.

Further reading


January-2017.php

As a new feature this year, I'd like to start doing monthly summaries of the interesting articles I read throughout the month that I feel are worth returning to. The lists will include a bunch of different articles revolving around web development, but will probably mainly focus on PHP. It might also show that I work with Laravel in my day job.

I will include anything that pops up on my radar during the month, even if some of it might be old news, as long as I feel it's still relevant.

I believe one of the best ways to learn is to look at how others do the job you're trying to become better at. I've had a hard time finding complete open source Laravel projects, but OpenLaravel showcases just that.

Besides diving into full project code bases, reading about other people's experiences diving into specific areas of a code base can also be highly valuable; that's why I've written about diving into Laravel Eloquent's getter and setter magic. In the same vein, I found it interesting to read a breakdown of hacking a Concrete5 plugin.

Since we’re already on the topic of security, the people from Paragon IE wrote a PHP security pocket guide.

I like to consider myself a pragmatic programmer, who focuses on getting the right things to work in a proper way, which means not necessarily trying to solve every imaginable scenario that might be a problem someday. In other words, I try to practice YAGNI.

Shawn McCool has gathered a tonne of interesting information regarding the Active Record design pattern (the basis for Laravel's Eloquent ORM) in his giant all things Active Record article.

The last link for this month is a short story about why a bug fix is called a patch.


Exporting from MySQL to CSV file

We often have to export data from MySQL to other applications; this could be to further analyse the data, to gather user emails for a newsletter, or similar. Whatever the receiving application, CSV is probably the most common format for data exports like this.

Luckily SQL select output, with its rows and columns, is well suited for CSV output.

You can easily export straight from the MySQL client to a CSV file, by appending the expected CSV format to the end of the query:
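The general form looks something like this:

    SELECT <columns>
    FROM <table>
    INTO OUTFILE '<file>'
    FIELDS TERMINATED BY ','
    ENCLOSED BY '"'
    LINES TERMINATED BY '\n';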

For example:
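    -- table, columns, and file name are just for illustration
    SELECT name, email
    FROM users
    INTO OUTFILE '/var/lib/mysql-files/users.csv'
    FIELDS TERMINATED BY ','
    ENCLOSED BY '"'
    LINES TERMINATED BY '\n';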

The last three lines can be customised based on your needs.

NOTE: The default MySQL settings only allow writing files to the /var/lib/mysql-files/ directory.

If you do not have permission to write files from the MySQL client, the same can be accomplished from the command line:
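The general form pipes the tab-separated query output through sed to turn it into CSV (the sed expression matches the breakdown below):

    mysql -B -e "<query>" <database> | sed "s/'/\'/;s/\t/\",\"/g;s/^/\"/;s/$/\"/;s/\n//g" > <file>.csv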

For example:
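    # user, database, query, and file name are placeholders
    mysql -u someuser -p -B -e "SELECT name, email FROM users" mydb | sed "s/'/\'/;s/\t/\",\"/g;s/^/\"/;s/$/\"/;s/\n//g" > users.csv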

The regex is, of course, courtesy of Stackoverflow.

Regex Explanation:

s/// means substitute what's between the first // with what's between the second //
the "g" at the end is a modifier that means "all instances, not just the first"
^ (in this context) means beginning of line
$ (in this context) means end of line

So, putting it all together:

s/'/\'/ replaces ' with \'
s/\t/\",\"/g replaces all \t (tab) with ","
s/^/\"/ places a " at the beginning of the line
s/$/\"/ places a " at the end of the line
s/\n//g replaces all \n (newline) with nothing


Lazy loading, eager loading and the n+1 problem

In this article I'll look into a few data loading strategies, namely:

lazy loading
eager loading
lazy-eager loading

I'll talk a bit about the pros and cons of each strategy and common issues developers might run into, especially the n+1 problem and how to mitigate it.

The examples in the post are written using PHP and Laravel syntax, but the idea is the same for any language.

Lazy Loading

The first loading strategy is probably the most basic one. Lazy loading means postponing loading data until you actually need it.

Lazy loading has the advantage that you will only load the data you actually need.
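A minimal sketch in Eloquent terms; the User model and its posts relation are hypothetical:

    // one query to fetch all users
    $users = App\User::all();

    foreach ($users as $user) {
        // each access of $user->posts triggers one additional query
        echo $user->posts->count();
    }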

The n+1 problem

The problem with lazy loading is what is known as the n+1 problem. Because the data is loaded for each user independently, n+1 database queries are required, where n is the number of users. If you only have a few users this will likely not be a problem, but it means that performance will degrade quickly as the number of users grows.

Eager Loading

In the eager loading strategy, data loading is moved to an earlier part of the code.
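A sketch with the same hypothetical User/posts relation; the related data is fetched up front, so the number of queries stays constant regardless of the number of users:

    // users and their posts are loaded up front
    $users = App\User::with('posts')->get();

    foreach ($users as $user) {
        // no additional queries are triggered here
        echo $user->posts->count();
    }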

This will solve the n+1 problem. Since all data is fetched using a single query, the number of queries is independent of the number of items fetched.

One problem with eager loading is that you might end up loading more data than you actually need.

Often data is fetched in a controller and used in a view; in this case, the two become very tightly coupled, and every time the data requirements of the view change, the controller needs to change as well, requiring maintenance in several different places.

Lazy-eager loading

Lazy-eager loading combines both of the strategies above: loading of data is postponed until it is required, but it is still prepared in bulk beforehand. Let's see an example.
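A sketch of the idea, using Eloquent's load() method on an already fetched collection:

    // fetch the users first, e.g. in a controller
    $users = App\User::all();

    // later, when we know the posts will be needed,
    // load them for all users in one go
    $users->load('posts');

    foreach ($users as $user) {
        echo $user->posts->count();
    }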

As usual, a simple example like this only shows part of the story, but the $users list would usually be loaded in a controller, away from the iterator.

In this case, we're keeping related code close together, which means you'll likely have fewer places to go to handle maintenance. But since the data is often required in templates, we might be introducing logic into our templates, giving the templates more responsibilities. This can make both maintenance and performance testing harder.

Which strategy to choose?

This article has introduced three different data loading strategies: lazy loading, eager loading, and lazy-eager loading.

I've tried to outline some pros and cons of each strategy, as well as some of the issues related to each of them.

Which strategy to choose depends on the problem you're trying to solve, since each strategy makes sense in its own set of cases.

As with most other problems, I'd usually start by implementing whichever strategy is simplest and fastest to implement, to create a PoC of a solution to my problem. Afterwards I'd run a range of performance tests to see where the bottlenecks in my solution appear, and then look into which alternate strategy would solve those bottlenecks.


HTTP/2 – What and why?

What is HTTP?

HTTP, HyperText Transfer Protocol, is an application layer network protocol for distributing hypermedia data. It is mainly used for serving websites in HTML format for browser consumption.

HTTP/2 and SPDY

The newest version of HTTP is HTTP/2. The protocol originated at Google under the name SPDY. The specification work and maintenance have since moved to the IETF.

The purpose of HTTP/2 was to develop a better-performing protocol while maintaining high-level backward compatibility with previous versions of the HTTP protocol. This means keeping the same HTTP methods, status codes, etc.

New in HTTP/2

HTTP/2 maintains backward compatibility in the sense that applications served over the protocol do not require any changes to work with the new version. But the protocol contains a range of new performance-enhancing features that applications can adopt on a case-by-case basis.

Header compression

HTTP/2 supports most of the headers supported by earlier versions of HTTP. As something new, HTTP/2 also supports compressing these headers to minimize the amount of data that has to be transferred.

Request pipelining

In earlier versions of HTTP, one TCP connection equaled one HTTP connection. In HTTP/2 several HTTP requests can be sent over the same TCP connection.

This allows HTTP/2 to bypass some of the issues in previous versions of the protocol, like the maximum connection limit. It also means that all of the formalities of TCP, like handshakes and path MTU discovery, only have to be done once.

No HOL blocking

Head-of-line (HOL) blocking occurs when incoming data must be handled in a specific order and the first piece of data takes a long time, forcing all of the following data to wait for the first to be processed.

In HTTP/1, HOL blocking happens because multiple requests over the same TCP connection have to be handled in the order they were received. So if a client makes two requests to the same server, the second request won't be handled before the first request has been completed. If the first request takes a long time to complete, this will hold up the second request as well.

HTTP/2 allows pipelining requests over a single TCP connection, allowing the second request to be received at the same time, or even before, the first request. This means the server is able to handle the second request even while it’s still receiving the first request.

Server Push

In earlier versions, a typical web page was rendered in a sequential manner. First the browser would request the page to be rendered. The server would then respond with the main content of the page, typically in HTML format. The browser would then start rendering the HTML from the top down. Every time the browser encountered an external resource, like a CSS file, an external JavaScript file, or an image, a new request would be made to the server to fetch the additional resource.

HTTP/2 offers a feature called server push. Server push allows the server to respond to a single request with several different resources. This allows the server to push required external CSS and JavaScript files to the user on a normal page request, thus allowing the client to have the resources at hand during rendering, preventing the render-blocking additional requests that would otherwise have to be made during page rendering.

HTTPS-only

The HTTP/2 specification, like earlier versions, allows HTTP to function over a normal unencrypted connection, but also specifies a method for transferring over a TLS-encrypted connection, namely HTTPS. The major browser vendors, though, have decided to force higher security standards by only accepting HTTP/2 over encrypted HTTPS connections.


PHP regular expression functions causing segmentation fault

We recently had an issue where generating large exports for users to download would suddenly just stop for no apparent reason. Because of the size of the exports, the first thought was that the process was timing out, but the server didn't return a timeout error to the browser, and the process didn't really run long enough to hit the time limit set on the server.

Looking through the Laravel and Apache vhost logs where the errors would normally be logged didn’t provide any hints as to what the issue was. Nothing was logged. After some more digging I found out that I could provoke an error in the general Apache log file.
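The message looked something like the typical Apache segfault notice; I'm reproducing the general shape from memory, not the exact log line:

    [core:notice] [pid 1234] AH00052: child pid 5678 exit signal Segmentation fault (11)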

It wasn’t a lot to go on, but at least I had a reproducible error message. A segmentation fault (or segfault) means that a process tries to access a memory location that it isn’t allowed to access, and for that reason it’s killed by the operating system.

After following the code along to try to identify when the segfault actually happened, I concluded it was caused by a call to preg_match().

Finally I had something concrete to start debugging (aka Googling) for, and I eventually found the Stackoverflow question that contained the answer.

In short, the problem happens because the preg_* functions in PHP build upon the PCRE library. In PCRE, certain regular expressions are matched using a lot of recursive calls, which use up a lot of stack space. It is possible to set a limit on the number of recursions allowed, but in PHP this limit defaults to 100,000, which is more than fits in the stack. Setting a more realistic limit for pcre.recursion_limit, as suggested in the Stackoverflow thread, solved my issue, and if the limit should prove too low for my system's requirements, at least I will now get a proper error message to work from.
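The fix is a one-liner in php.ini; the value below is illustrative, and the Stackoverflow answer derives a suitable limit from the process stack size (roughly the stack size divided by 500 bytes per recursion):

    ; php.ini
    ; with an 8 MB stack: 8 * 1024 * 1024 / 500 is roughly 16000
    pcre.recursion_limit = 16000

With a limit that fits the stack, a runaway regex makes preg_match() return false, and preg_last_error() reports PREG_RECURSION_LIMIT_ERROR instead of the process segfaulting.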

The Stackoverflow answer contains a more in-depth explanation of the problem, the solution, and other related issues. Definitely worth a read.
