Sitepoint PHP

Learn CSS | HTML5 | JavaScript | Wordpress | Tutorials-Web Development | Reference | Books and More

Improving Performance Perception with Pingdom and GTmetrix

Fri, 06/22/2018 - 20:00

This article is part of a series on building a sample application --- a multi-image gallery blog --- for performance benchmarking and optimizations. (View the repo here.)

In this article, we'll analyze our gallery application using the tools we explained in the previous guide, and we'll look at possible ways to further improve its performance.

As per the previous post, please set up Ngrok and pipe to the locally hosted app through it, or host the app on a demo server of your own. This static URL will enable us to test our app with external tools like GTmetrix and Pingdom Tools.

We scanned our website with GTmetrix to see how we could improve it. The results, albeit not catastrophically bad, still left room for improvement.

The first tab --- PageSpeed --- contains a list of recommendations by Google. The first item under the PageSpeed tab --- a warning about a consistent URL --- pertains to our application outputting the images randomly, so that is an item we will skip. The next thing we can do something about is browser caching.

Browser Caching

We see that there is a main.css file that needs its Expires headers set, and the images in the gallery need the same thing. Now, the first idea for these static files would be to set this in our Nginx configuration:

location ~* \.(?:ico|css|js|gif|jpe?g|png)$ {
    expires 14d;
}

We can simply put this inside our server block and leave it to Nginx, right?

Well, not really. This will take care of our static files, like CSS, but the /raw images we are being warned about aren't really that static. So this snippet in our Nginx configuration won't exactly fix this issue so easily. For our images, we have an actual controller that creates these on the fly, so it would be ideal if we could set our response headers right there, in the controller. For some reason, these weren't being set properly by Glide.

Maybe we could set our Nginx directive in a way to include the raw resources, but we felt the controller approach to be more future-proof. This is because we aren't sure what other content may end up with a raw suffix eventually --- maybe some videos, or even audio files.

So, we opened /src/ImageController.php in our image gallery app, and dropped these two lines inside of our serveImageAction(), just before the line return $response:

// cache for 2 weeks
$response->setSharedMaxAge(1209600);

// (optional) set a custom Cache-Control directive
$response->headers->addCacheControlDirective('must-revalidate', true);

This will modify our dynamic image responses by adding the proper Cache-Control and Expires headers.

Symfony has more comprehensive options for the caching of responses, as documented here.
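For a quick, hedged illustration of a few of those options, the snippet below applies them to the same Response object as above; the concrete values are arbitrary, and $request and $imagePath are assumed to be available in the controller.

// expiration-based caching
$response->setPublic();              // allow shared caches (proxies, CDNs) to store the response
$response->setMaxAge(3600);          // browsers may reuse it for an hour
$response->setSharedMaxAge(1209600); // shared caches may reuse it for two weeks

// validation-based caching
$response->setEtag(md5($response->getContent()));
$response->setLastModified((new \DateTime())->setTimestamp(filemtime($imagePath)));

// if the client's cached copy is still valid, short-circuit with a 304 Not Modified
if ($response->isNotModified($request)) {
    return $response;
}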

Having restarted Nginx, we re-tested our app in GTmetrix, and lo and behold:

The post Improving Performance Perception with Pingdom and GTmetrix appeared first on SitePoint.

Categories: IT News

MySQL Performance Boosting with Indexes and Explain

Thu, 06/21/2018 - 20:00

Techniques to improve application performance can come from a lot of different places, but normally the first thing we look at --- the most common bottleneck --- is the database. Can it be improved? How can we measure and understand what needs and can be improved?

One very simple yet very useful tool is query profiling. Enabling profiling is a simple way to get a more accurate time estimate of running a query. This is a two-step process. First, we have to enable profiling. Then, we call show profiles to actually get the query running time.

Let's imagine we have the following insert in our database (and let's assume User 1 and Gallery 1 are already created):

INSERT INTO `homestead`.`images` (`id`, `gallery_id`, `original_filename`, `filename`, `description`)
VALUES
    (1, 1, 'me.jpg', 'me.jpg', 'A photo of me walking down the street'),
    (2, 1, 'dog.jpg', 'dog.jpg', 'A photo of my dog on the street'),
    (3, 1, 'cat.jpg', 'cat.jpg', 'A photo of my cat walking down the street'),
    (4, 1, 'purr.jpg', 'purr.jpg', 'A photo of my cat purring');

Obviously, this amount of data will not cause any trouble, but let's use it to do a simple profile. Let's consider the following query:

SELECT * FROM `homestead`.`images` AS i WHERE i.description LIKE '%street%';

This query is a good example of one that can become problematic in the future if we get a lot of photo entries.

To get an accurate running time on this query, we would use the following SQL:

set profiling = 1;
SELECT * FROM `homestead`.`images` AS i WHERE i.description LIKE '%street%';
show profiles;

The result would look like the following:

Query_Id  Duration    Query
1         0.00016950  SHOW WARNINGS
2         0.00039200  SELECT * FROM homestead.images AS i WHERE i.description LIKE '%street%' LIMIT 0, 1000
3         0.00037600  SHOW KEYS FROM homestead.images
4         0.00034625  SHOW DATABASES LIKE 'homestead'
5         0.00027600  SHOW TABLES FROM homestead LIKE 'images'
6         0.00024950  SELECT * FROM homestead.images WHERE 0=1
7         0.00104300  SHOW FULL COLUMNS FROM homestead.images LIKE 'id'

As we can see, the show profiles; command gives us times not only for the original query but also for all the other queries that are made. This way we can accurately profile our queries.

But how can we actually improve them?

We can either rely on our knowledge of SQL and improvise, or we can rely on the MySQL explain command and improve our query performance based on actual information.

Explain is used to obtain a query execution plan, or how MySQL will execute our query. It works with SELECT, DELETE, INSERT, REPLACE, and UPDATE statements, and it displays information from the optimizer about the statement execution plan. The official documentation does a pretty good job of describing how explain can help us:

With the help of EXPLAIN, you can see where you should add indexes to tables so that the statement executes faster by using indexes to find rows. You can also use EXPLAIN to check whether the optimizer joins the tables in an optimal order.
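To make this concrete, here is an illustrative aside rather than code from the project: we simply prefix the earlier query with EXPLAIN, and, since a leading-wildcard LIKE cannot use an ordinary B-tree index, one common remedy is a FULLTEXT index (the index name below is made up):

EXPLAIN SELECT * FROM `homestead`.`images` AS i WHERE i.description LIKE '%street%';

-- a leading '%' defeats B-tree indexes, so full-text search is one alternative
ALTER TABLE `homestead`.`images` ADD FULLTEXT INDEX `idx_images_description` (`description`);
SELECT * FROM `homestead`.`images` WHERE MATCH(description) AGAINST('street');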

The post MySQL Performance Boosting with Indexes and Explain appeared first on SitePoint.

Categories: IT News

PHP-level Performance Optimization with Blackfire

Wed, 06/20/2018 - 20:00

Throughout the past few months, we've introduced Blackfire and ways in which it can be used to detect application performance bottlenecks. In this post, we'll apply it to our freshly started project to try and find the low-points and low-hanging fruit which we can pick to improve our app's performance.

If you're using Homestead Improved (and you should be), Blackfire is already installed. Blackfire should only ever be installed in development, not in production, so it's fine to only have it there.

Note: Blackfire can be installed in production, as it doesn't really trigger for users unless they manually initiate it with the installed Blackfire extension. However, it's worth noting that defining profile triggers on certain actions or users that don't need the extension will incur a performance penalty for the end user. When Blackfire-testing live, make the test sessions short and effective, and avoid doing so under heavy load.

While it's useful to be introduced to Blackfire before diving into this, applying the steps in this post won't require any prior knowledge; we'll start from zero.


The following are useful terms when evaluating graphs produced by Blackfire.

  • Reference Profile: We usually need to run our first profile as a reference profile. This profile will be the performance baseline of our application. We can compare any profile with the reference, to measure the performance achievements.

  • Exclusive Time: The amount of time spent on a function/method to be executed, without considering the time spent for its external calls.

  • Inclusive Time: The total time spent to execute a function including all the external calls.

  • Hot Paths: Hot Paths are the parts of our application that were most active during the profile. These could be the parts that consumed more memory or took more CPU time.

The first step is registering for an account at Blackfire. The account page will have the tokens and IDs which need to be placed into Homestead.yaml after cloning the project. There's a placeholder for all those values at the bottom:

# blackfire:
#     - id: foo
#       token: bar
#       client-id: foo
#       client-token: bar

After uncommenting the rows and replacing the values, we need to install the Chrome companion.

The Chrome companion is useful only when needing to trigger profiling manually --- which will be the majority of your use cases. There are other integrations available as well, a full list of which can be found here.
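If you prefer the terminal, the Blackfire CLI installed with the agent can trigger profiles as well; the URL and script name below are placeholders:

# profile an HTTP request
blackfire curl

# profile a CLI script
blackfire run php some-script.php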

Optimization with Blackfire

We'll test the home page: the landing page is arguably the most important part of any website, and if that takes too long to load, we're guaranteed to lose our visitors. They'll be gone before Google Analytics can kick in to register the bounce! We could test pages on which users add images, but read-only performance is far more important than write performance, so we'll focus on the former.

This version of the app loads all the galleries and sorts them by age.

Testing is simple. We open the page we want to benchmark, click the extension's button in the browser, and select "Profile!".

Here's the resulting graph:

In fact, we can see here that the inclusive-to-exclusive time ratio is 100% for the PDO execution. Specifically, this means that the whole dark pink part is spent inside this function and that this function in particular is not waiting for any other function. This is the function being waited on. Other method calls might have light pink bars far bigger than PDO's, but those light pink parts are a sum of all the smaller light pink parts of dependent functions, which means that, looked at individually, those functions aren't the problem. The dark ones need to be handled first; they are the priority.

Also, switching to RAM mode reveals that while the whole call used almost a whopping 40MB of RAM, the vast majority is in the Twig rendering, which makes sense: it is showing a lot of data, after all.

In the diagram, hot paths have thick borders and generally indicate bottlenecks. Intensive nodes can be part of the hot path, but also be completely outside it. Intensive nodes are nodes a lot of time is spent in for some reason, and can be indicative of problems just as much.

By looking at the most problematic methods and clicking around on relevant nodes, we can identify that PDOExecute is the most problematic bottleneck, while unserialize uses the most RAM relative to other methods. If we apply some detective work and follow the flow of methods calling each other, we'll notice that both of these problems are caused by the fact that we're loading the whole set of galleries on the home page. PDOExecute takes forever in memory and wall time to find them and sort them, and Doctrine takes ages and endless CPU cycles to turn them into renderable entities with unserialize to loop through them in a twig template. The solution seems simple --- add pagination to the home page!

By adding a PER_PAGE constant into the HomeController and setting it to something like 12, and then using that pagination constant in the fetching procedure, we limit the first fetch to the newest 12 galleries:

$galleries = $this->em->getRepository(Gallery::class)->findBy([], ['createdAt' => 'DESC'], self::PER_PAGE);

We'll trigger a lazy load when the user reaches the end of the page when scrolling, so we need to add some JS to the home view:

{% block javascripts %}
    {{ parent() }}

    <script>
        $(function () {
            var nextPage = 2;
            var $galleriesContainer = $('.home__galleries-container');
            var $lazyLoadCta = $('.home__lazy-load-cta');

            function onScroll() {
                var y = $(window).scrollTop() + $(window).outerHeight();
                if (y >= $('body').innerHeight() - 100) {
                    $(window).off('scroll.lazy-load');
                    $;
                }
            }

            $lazyLoadCta.on('click', function () {
                var url = "{{ url('home.lazy-load') }}";
                $.ajax({
                    url: url,
                    data: {page: nextPage},
                    success: function (data) {
                        if (data.success === true) {
                            $galleriesContainer.append(;
                            nextPage++;
                            $(window).on('scroll.lazy-load', onScroll);
                        }
                    }
                });
            });

            $(window).on('scroll.lazy-load', onScroll);
        });
    </script>
{% endblock %}

Since annotations are being used for routes, it's easy to just add a new method into the HomeController to lazily load our galleries when triggered:

/**
 * @Route("/galleries-lazy-load", name="home.lazy-load")
 */
public function homeGalleriesLazyLoadAction(Request $request)
{
    $page = $request->get('page', null);
    if (empty($page)) {
        return new JsonResponse([
            'success' => false,
            'msg'     => 'Page param is required',
        ]);
    }

    $offset = ($page - 1) * self::PER_PAGE;
    $galleries = $this->em->getRepository(Gallery::class)->findBy([], ['createdAt' => 'DESC'], 12, $offset);

    $view = $this->twig->render('partials/home-galleries-lazy-load.html.twig', [
        'galleries' => $galleries,
    ]);

    return new JsonResponse([
        'success' => true,
        'data'    => $view,
    ]);
}

The post PHP-level Performance Optimization with Blackfire appeared first on SitePoint.

Categories: IT News

Building an Image Gallery Blog with Symfony Flex: Data Testing

Tue, 06/19/2018 - 20:00

In the previous article, we demonstrated how to set up a Symfony project from scratch with Flex, and how to create a simple set of fixtures and get the project up and running.

The next step on our journey is to populate the database with a somewhat realistic amount of data to test application performance.

Note: if you did the “Getting started with the app” step in the previous post, you've already followed the steps outlined in this post. If that's the case, use this post as an explainer on how it was done.

As a bonus, we'll demonstrate how to set up a simple PHPUnit test suite with basic smoke tests.

More Fake Data

Once your entities are polished, and you've had your "That's it! I'm done!" moment, it's a perfect time to create a more significant dataset that can be used for further testing and preparing the app for production.

Simple fixtures like the ones we created in the previous article are great for the development phase, where loading ~30 entities is done quickly, and it can often be repeated while changing the DB schema.

Testing app performance, simulating real-world traffic and detecting bottlenecks requires bigger datasets (i.e. a larger amount of database entries and image files for this project). Generating thousands of entries takes some time (and computer resources), so we want to do it only once.

We could try increasing the COUNT constant in our fixture classes and seeing what will happen:

// src/DataFixtures/ORM/LoadUsersData.php
class LoadUsersData extends AbstractFixture implements ContainerAwareInterface, OrderedFixtureInterface
{
    const COUNT = 500;
    ...
}

// src/DataFixtures/ORM/LoadGalleriesData.php
class LoadGalleriesData extends AbstractFixture implements ContainerAwareInterface, OrderedFixtureInterface
{
    const COUNT = 1000;
    ...
}

Now, if we run bin/, after some time we'll probably get a not-so-nice message like PHP Fatal error: Allowed memory size of N bytes exhausted.

Apart from slow execution, every error would result in an empty database because EntityManager is flushed only at the very end of the fixture class. Additionally, Faker is downloading a random image for every gallery entry. For 1,000 galleries with 5 to 10 images per gallery that would be 5,000 - 10,000 downloads, which is really slow.

There are excellent resources on optimizing Doctrine and Symfony for batch processing, and we're going to use some of these tips to optimize fixtures loading.

First, we'll define a batch size of 100 galleries. After every batch, we'll flush and clear the EntityManager (i.e., detach persisted entities) and tell the garbage collector to do its job.

To track progress, let's print out some meta information (batch identifier and memory usage).

Note: After calling $manager->clear(), all persisted entities are now unmanaged. The entity manager doesn't know about them anymore, and you'll probably get an "entity-not-persisted" error.

The key is to merge the entity back into the manager: $entity = $manager->merge($entity);

Without the optimization, memory usage is increasing while running a LoadGalleriesData fixture class:

> loading [200] App\DataFixtures\ORM\LoadGalleriesData
100 Memory usage (currently) 24MB / (max) 24MB
200 Memory usage (currently) 26MB / (max) 26MB
300 Memory usage (currently) 28MB / (max) 28MB
400 Memory usage (currently) 30MB / (max) 30MB
500 Memory usage (currently) 32MB / (max) 32MB
600 Memory usage (currently) 34MB / (max) 34MB
700 Memory usage (currently) 36MB / (max) 36MB
800 Memory usage (currently) 38MB / (max) 38MB
900 Memory usage (currently) 40MB / (max) 40MB
1000 Memory usage (currently) 42MB / (max) 42MB

Memory usage starts at 24 MB and increases by 2 MB for every batch (100 galleries). If we tried to load 100,000 galleries, we'd need 24 MB + 999 (999 batches of 100 galleries, 99,900 galleries) * 2 MB = ~2 GB of memory.

After adding $manager->flush() and gc_collect_cycles() for every batch, removing SQL logging with $manager->getConnection()->getConfiguration()->setSQLLogger(null) and removing entity references by commenting out $this->addReference('gallery' . $i, $gallery);, memory usage becomes somewhat constant for every batch.

// Define batch size outside of the for loop
$batchSize = 100;

...

for ($i = 1; $i <= self::COUNT; $i++) {
    ...

    // Save the batch at the end of the for loop
    if (($i % $batchSize) == 0 || $i == self::COUNT) {
        $currentMemoryUsage = round(memory_get_usage(true) / 1024);
        $maxMemoryUsage     = round(memory_get_peak_usage(true) / 1024);
        echo sprintf("%s Memory usage (currently) %dKB/ (max) %dKB \n", $i, $currentMemoryUsage, $maxMemoryUsage);

        $manager->flush();
        $manager->clear();

        // here you should merge entities you're re-using with the $manager
        // because they aren't managed anymore after calling $manager->clear();
        // e.g. if you've already loaded category or tag entities
        // $category = $manager->merge($category);

        gc_collect_cycles();
    }
}

As expected, memory usage is now stable:

> loading [200] App\DataFixtures\ORM\LoadGalleriesData
100 Memory usage (currently) 24MB / (max) 24MB
200 Memory usage (currently) 26MB / (max) 28MB
300 Memory usage (currently) 26MB / (max) 28MB
400 Memory usage (currently) 26MB / (max) 28MB
500 Memory usage (currently) 26MB / (max) 28MB
600 Memory usage (currently) 26MB / (max) 28MB
700 Memory usage (currently) 26MB / (max) 28MB
800 Memory usage (currently) 26MB / (max) 28MB
900 Memory usage (currently) 26MB / (max) 28MB
1000 Memory usage (currently) 26MB / (max) 28MB

Instead of downloading random images every time, we can prepare 15 random images and update the fixture script to randomly choose one of them instead of using Faker's $faker->image() method.

Let's take 15 images from Unsplash and save them in var/demo-data/sample-images.

Then, update the LoadGalleriesData::generateRandomImage method:

private function generateRandomImage($imageName)
{
    $images = [
        'image1.jpeg', 'image10.jpeg', 'image11.jpeg', 'image12.jpg', 'image13.jpeg',
        'image14.jpeg', 'image15.jpeg', 'image2.jpeg', 'image3.jpeg', 'image4.jpeg',
        'image5.jpeg', 'image6.jpeg', 'image7.jpeg', 'image8.jpeg', 'image9.jpeg',
    ];

    $sourceDirectory = $this->container->getParameter('kernel.project_dir') . '/var/demo-data/sample-images/';
    $targetDirectory = $this->container->getParameter('kernel.project_dir') . '/var/uploads/';

    $randomImage               = $images[rand(0, count($images) - 1)];
    $randomImageSourceFilePath = $sourceDirectory . $randomImage;
    $randomImageExtension      = explode('.', $randomImage)[1];
    $targetImageFilename       = sha1(microtime() . rand()) . '.' . $randomImageExtension;

    copy($randomImageSourceFilePath, $targetDirectory . $targetImageFilename);

    $image = new Image(
        Uuid::getFactory()->uuid4(),
        $randomImage,
        $targetImageFilename
    );

    return $image;
}

It's a good idea to remove old files in var/uploads when reloading fixtures, so I'm adding an rm var/uploads/* command to the bin/ script, immediately after dropping the DB schema.

Loading 500 users and 1000 galleries now takes ~7 minutes and ~28 MB of memory (peak usage).

Dropping database schema...
Database schema dropped successfully!
ATTENTION: This operation should not be executed in a production environment.
Creating database schema...
Database schema created successfully!
> purging database
> loading [100] App\DataFixtures\ORM\LoadUsersData
300 Memory usage (currently) 10MB / (max) 10MB
500 Memory usage (currently) 12MB / (max) 12MB
> loading [200] App\DataFixtures\ORM\LoadGalleriesData
100 Memory usage (currently) 24MB / (max) 26MB
200 Memory usage (currently) 26MB / (max) 28MB
300 Memory usage (currently) 26MB / (max) 28MB
400 Memory usage (currently) 26MB / (max) 28MB
500 Memory usage (currently) 26MB / (max) 28MB
600 Memory usage (currently) 26MB / (max) 28MB
700 Memory usage (currently) 26MB / (max) 28MB
800 Memory usage (currently) 26MB / (max) 28MB
900 Memory usage (currently) 26MB / (max) 28MB
1000 Memory usage (currently) 26MB / (max) 28MB

Take a look at the fixture classes source: LoadUsersData.php and LoadGalleriesData.php.

The post Building an Image Gallery Blog with Symfony Flex: Data Testing appeared first on SitePoint.

Categories: IT News

Building an Image Gallery Blog with Symfony Flex: the Setup

Mon, 06/18/2018 - 20:00

This post begins our journey into Performance Month's zero-to-hero project. In this part, we'll set our project up so we can fine tune it throughout the next few posts, and bring it to a speedy perfection.

Now and then you have to create a new project repository, run that git init command locally and kick off a new awesome project. I have to admit I like the feeling of starting something new; it's like going on an adventure!

Lao Tzu said:

The journey of a thousand miles begins with one step

We can think about the project setup as the very first step of our thousand miles (users!) journey. We aren't sure where exactly we are going to end up, but it will be fun!

We also should keep in mind the advice from prof. Donald Knuth:

Premature optimization is the root of all evil (or at least most of it) in programming.

Our journey towards a stable, robust, high-performance web app will start with the simple but functional application --- the so-called minimum viable product (MVP). We'll populate the database with random content, do some benchmarks and improve performance incrementally. Every article in this series will be a checkpoint on our journey!

This article will cover the basics of setting up the project and organizing files for our Symfony Flex project. I'll also show you some tips, tricks and helper scripts I'm using for speeding up the development.

What Are We Building?

Before starting any project, you should have a clear vision of the final destination. Where are you headed? Who will be using your app and how? What are the main features you're building? Once you have that knowledge, you can prepare your environment, third-party libraries, and dive into developing the next big thing.

In this series of articles, we'll be building a simple image gallery blog where users can register or log in, upload images, and create simple public image galleries with descriptions written in Markdown format.

We'll be using the new Symfony Flex and Homestead (make sure you've read tutorials on them, as we're not going to cover them here). We picked Flex because Symfony 4 is just about to come out (if it hasn't already, by the time you're reading this), because it's infinitely lighter than the older version and lends itself perfectly to step-by-step optimization, and it's also the natural step in the evolution of the most popular enterprise PHP framework out there.

All the code referenced in this article is available at the GitHub repo.

We're going to use the Twig templating engine, Symfony forms, and Doctrine ORM with UUIDs as primary keys.

Entities and routes will use annotations; we'll have simple email/password based authentication, and we'll prepare data fixtures to populate the database.

Getting Started with the app

To try out the example we've prepared, do the following:

  • Set up an empty database called "blog".
  • Clone the project repository from GitHub.
  • Run composer install.
  • If you now open the app in your browser, you should see an exception regarding missing database tables. That's fine, since we haven't created any tables so far.
  • Update the .env file in your project root directory with valid database connection string (i.e., update credentials).
  • Run the database init script ./bin/ and wait until it generates some nice image galleries.
  • Open the app in your browser and enjoy!

After executing bin/ you should be able to see the home page of our site:

You can log in to the app with credentials and password 123456. See LoadUserData fixture class for more details regarding generated users.

Starting from scratch

In this section, we'll describe how to set up a new project from scratch. Feel free to take a look at the sample app codebase and see the details.

After creating a new project based on symfony/skeleton by executing the command

composer create-project "symfony/skeleton:^3.3" multi-user-gallery-blog

… we can first set minimum stability to "dev" because of some cutting edge packages:

composer config minimum-stability dev

… and then require additional packages (some of them are referenced by their aliases, the new feature brought by Flex):

composer req annotations security orm template asset validator ramsey/uuid-doctrine

Dependencies used only in the dev environment are required with the --dev flag:

composer req --dev fzaninotto/faker doctrine/Doctrine-Fixtures-Bundle

Flex is doing some serious work for us behind the scenes, and most of the libraries (or bundles) are already registered and configured with good-enough defaults! Check the config directory. You can check all the dependencies used in this project in the composer.json file.

Routes are defined by annotations, so the following will be automatically added into config/routes.yaml:

controllers:
    resource: ../src/Controller/
    type: annotation

Database, Scripts and Fixtures

Configure the DATABASE_URL environment variable (for example, by editing the .env file) to set up a working DB connection. If you're using our own Homestead Improved (recommended), you've got a database set up called homestead with the user / pass homestead / secret. A DB schema can be generated from existing entities by executing:

./bin/console doctrine:schema:create

If this doesn't run, try executing the console by invoking the PHP binary, like so:

php bin/console doctrine:schema:create

If this step executed fine in the "Getting Started with the app" section above, you should be able to see newly created tables in the database (for Gallery, Image and User entities).

If you want to drop the database schema, you can run:

./bin/console doctrine:schema:drop --full-database --force

Fake it 'til you make it!

I can't imagine developing an app today without having data fixtures (i.e., scripts for seeding the DB). With a few simple scripts, you can populate your database with realistic content, which is useful when it comes to rapid app development and testing, but it's also a requirement for a healthy CI pipeline.

I find the Doctrine Fixtures Bundle to be an excellent tool for handling data fixtures as it supports ordered fixtures (i.e., you can control the order of execution), sharing objects (via references) between scripts, and accessing the service container.

Default Symfony services configuration doesn't allow public access to services, as best practice is to inject all dependencies. We'll need some services in our fixtures, so I'm going to make all services in App\Service publicly available by adding the following to config/services.yaml:

App\Service\:
    resource: '../src/Service/*'
    public: true

I'm also using Faker to get random but realistic data (names, sentences, texts, images, addresses, …).

Take a look at the script for seeding galleries with random images to get a feeling of how cool this combination is.
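To give a rough feel for that combination, here is a simplified, hypothetical sketch of such a fixture loop; the Gallery constructor and setters are stand-ins rather than the project's actual entity API:

$faker = \Faker\Factory::create();

for ($i = 1; $i <= self::COUNT; $i++) {
    // hypothetical entity API, shown for illustration only
    $gallery = new Gallery(Uuid::getFactory()->uuid4());
    $gallery->setName($faker->sentence(3));
    $gallery->setDescription($faker->paragraphs(2, true));
    $gallery->setCreatedAt($faker->dateTimeThisYear());

    $manager->persist($gallery);
    $this->addReference('gallery' . $i, $gallery);
}

$manager->flush();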

Usually, I combine commands for dropping the existing DB schema, creating the new DB schema, loading data fixtures, and other repetitive tasks into a single shell script --- for example, bin/ --- so I can easily regenerate the schema and load dummy data:

# Drop schema
./bin/console doctrine:schema:drop --full-database --force

# Create schema
./bin/console doctrine:schema:create

# Load fixtures
./bin/console doctrine:fixtures:load -n --fixtures src/DataFixtures/ORM

# Install assets
./bin/console assets:install --symlink

# Clear cache
./bin/console cache:clear

Make sure you restrict execution of this script on production, or you're going to have some serious fun at one point.

One can argue that randomly generated data can't reproduce different edge cases, so your CI can sometimes fail or pass depending on the data generation. It's true, and you should make sure all edge cases are covered with your fixtures.

Every time you find an edge case causing a bug, make sure you add it to data fixtures. This will help you build a more robust system and prevent similar errors in the future.

The post Building an Image Gallery Blog with Symfony Flex: the Setup appeared first on SitePoint.

Categories: IT News

Apache vs Nginx Performance: Optimization Techniques

Wed, 06/13/2018 - 20:00

Some years ago, the Apache Foundation's web server, known simply as "Apache", was so ubiquitous that it became synonymous with the term "web server". Its daemon process on Linux systems has the name httpd (meaning simply http process) --- and comes preinstalled in major Linux distributions.

It was initially released in 1995, and, to quote Wikipedia, "it played a key role in the initial growth of the World Wide Web". It is still the most-used web server software according to W3techs. However, according to those reports which show some trends of the last decade and comparisons to other solutions, its market share is decreasing. The reports given by Netcraft and Builtwith differ a bit, but all agree on a trending decline of Apache's market share and the growth of Nginx.

Nginx --- pronounced engine x --- was released in 2004 by Igor Sysoev, with the explicit intent to outperform Apache. Nginx's website has an article worth reading which compares these two technologies. At first, it was mostly used as a supplement to Apache, mostly for serving static files, but it has been steadily growing, as it has been evolving to deal with the full spectrum of web server tasks.

It is often used as a reverse proxy, load balancer, and for HTTP caching. CDNs and video streaming providers use it to build their content delivery systems where performance is critical.

Apache has been around for a long time, and it has a big choice of modules. Managing Apache servers is known to be user-friendly. Dynamic module loading allows for different modules to be compiled and added to the Apache stack without recompiling the main server binary. Oftentimes, modules will be in Linux distro repositories, and after installing them through system package managers, they can be gracefully added to the stack with commands like a2enmod. This kind of flexibility has yet to be seen with Nginx. When we look at a guide for setting up Nginx for HTTP/2, modules are something Nginx needs to be built with --- configured for at build-time.

One other feature that has contributed to Apache's market rule is the .htaccess file. It is Apache's silver bullet, which made it a go-to solution for the shared hosting environments, as it allows controlling the server configuration on a directory level. Every directory on a server served by Apache can have its own .htaccess file.

Nginx not only has no equivalent solution, but discourages such usage due to performance hits.

Server vendors market share 1995–2005. Data by Netcraft

LiteSpeed, or LSWS, is one server contender that has a level of flexibility that can compare to Apache, while not sacrificing performance. It supports Apache-style .htaccess, mod_security and mod_rewrite, and it's worth considering for shared setups. It was planned as a drop-in replacement for Apache, and it works with cPanel and Plesk. It's been supporting HTTP/2 since 2015.

LiteSpeed has three license tiers, OpenLiteSpeed, LSWS Standard and LSWS Enterprise. Standard and Enterprise come with an optional caching solution comparable to Varnish, LSCache, which is built into the server itself, and can be controlled, with rewrite rules, in .htaccess files (per directory). It also comes with some DDOS-mitigating "batteries" built in. This, along with its event-driven architecture, makes it a solid contender, targeting primarily performance-oriented hosting providers, but it could be worth setting up even for smaller servers or websites.

Hardware Considerations

When optimizing our system, we cannot emphasize enough giving due attention to our hardware setup. Whichever of these solutions we choose for our setup, having enough RAM is critical. When a web server process, or an interpreter like PHP, doesn't have enough RAM, it starts swapping, and swapping effectively means using the hard disk to supplement RAM. The effect of this is increased latency every time this memory is accessed. This takes us to the second point --- the hard disk space. Using fast SSD storage is another critical factor of our website speed. We also need to mind the CPU availability, and the physical distance of our server's data centers to our intended audience.

To dive in deeper into the hardware side of performance tuning, Dropbox has a good article.


One practical way to monitor our current server stack performance, per process in detail, is htop, which works on Linux, Unix and macOS, and gives us a colored overview of our processes.

Other monitoring tools are New Relic, a premium solution with a comprehensive set of tools, and Netdata, an open-source solution which offers great extensibility, fine-grained metrics and a customizable web dashboard, suitable for both little VPS systems and monitoring a network of servers. It can send alarms for any application or system process via email, Slack, pushbullet, Telegram, Twilio etc.

Monit is another, headless, open-source tool which can monitor the system, and can be configured to alert us, or restart certain processes, or reboot the system when some conditions are met.

Testing the System

AB --- Apache Benchmark --- is a simple load-testing tool by the Apache Foundation, and Siege is another load-testing program. This article explains how to set them both up, and here we have some more advanced tips for AB, while an in-depth look at Siege can be found here.
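For orientation, typical invocations look something like the following; the URL, request counts and durations are placeholders, not recommended values:

# ab: 1000 requests in total, 50 concurrently
ab -n 1000 -c 50

# siege: 50 concurrent simulated users for one minute
siege -c 50 -t 1M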

If you prefer a web interface, there is Locust, a Python-based tool that comes in very handy for testing website performance.

After we install Locust, we need to create a locustfile in the directory from which we will launch it:

from locust import HttpLocust, TaskSet, task

class UserBehavior(TaskSet):
    @task(1)
    def index(self):
        self.client.get("/")

    @task(2)
    def shop(self):
        self.client.get("/?page_id=5")

    @task(3)
    def page(self):
        self.client.get("/?page_id=2")

class WebsiteUser(HttpLocust):
    task_set = UserBehavior
    min_wait = 300
    max_wait = 3000

Then we simply launch it from the command line:

locust --host=

One warning with these load-testing tools: they have the effect of a DDoS attack, so it's recommended you limit testing to your own websites.

Tuning Apache Apache's mpm modules

Apache dates to 1995 and the early days of the internet, when an accepted way for servers to operate was to spawn a new process on each incoming TCP connection and to reply to it. If more connections came in, more worker processes were created to handle them. The costs of spawning new processes were high, and Apache developers devised a prefork mode, with a pre-spawned number of processes. Embedded dynamic language interpreters within each process (like mod_php) were still costly, and server crashes with Apache's default setups became common. Each process was only able to handle a single incoming connection.

This model is known as mpm_prefork_module within Apache's MPM (Multi-Processing Module) system. According to Apache's website, this mode requires little configuration, because it is self-regulating, and most important is that the MaxRequestWorkers directive be big enough to handle as many simultaneous requests as you expect to receive, but small enough to ensure there's enough physical RAM for all processes.
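For reference, a prefork configuration (on Debian/Ubuntu usually in mods-available/mpm_prefork.conf) revolves around a handful of directives like the ones below; the numbers are illustrative and have to be sized against the available RAM:

<IfModule mpm_prefork_module>
    StartServers            5
    MinSpareServers         5
    MaxSpareServers        10
    MaxRequestWorkers     150
    MaxConnectionsPerChild  0
</IfModule>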

A small Locust load test that shows spawning of huge number of Apache processes to handle the incoming traffic.

We may add that this mode is maybe the biggest cause of Apache's bad name. It can get resource-inefficient.

Version 2 of Apache brought another two MPMs that try to solve the issues that prefork mode has. These are worker module, or mpm_worker_module, and event module.

Worker module is not process-based anymore; it's a hybrid process-thread based mode of operation. Quoting Apache's website,

a single control process (the parent) is responsible for launching child processes. Each child process creates a fixed number of server threads as specified in the ThreadsPerChild directive, as well as a listener thread which listens for connections and passes them to a server thread for processing when they arrive.

This mode is more resource efficient.

Version 2.4 of Apache brought us the third MPM --- the event module. It is based on the worker MPM, and added a separate listening thread that manages dormant keepalive connections after the HTTP request has completed. It's a non-blocking, asynchronous mode with a smaller memory footprint. More about version 2.4 improvements here.

We have loaded a testing WooCommerce installation with around 1200 posts on a virtual server and tested it on Apache 2.4 with the default, prefork mode, and mod_php.

First we tested it with libapache2-mod-php7 and mpm_prefork_module.

Then, we went for testing the event MPM module.

We had to add multiverse to our /etc/apt/sources.list:

deb xenial main restricted universe multiverse
deb xenial-updates main restricted universe multiverse
deb xenial-security main restricted universe multiverse
deb xenial partner

Then we did sudo apt-get update and installed libapache2-mod-fastcgi and php-fpm:

sudo apt-get install libapache2-mod-fastcgi php7.0-fpm

Since php-fpm is a service separate from Apache, it needed a restart:

sudo service php7.0-fpm start

Then we disabled the prefork module, and enabled the event mode and proxy_fcgi:

sudo a2dismod php7.0 mpm_prefork
sudo a2enmod mpm_event proxy_fcgi

We added this snippet to our Apache virtual host:

<FilesMatch "\.php$">
    SetHandler "proxy:fcgi://"
</FilesMatch>

This port needs to be consistent with php-fpm configuration in /etc/php/7.0/fpm/pool.d/www.conf. More about the php-fpm setup here.

Then we tuned the mpm_event configuration in /etc/apache2/mods-available/mpm_event.conf, keeping in mind that our mini-VPS resources for this test were constrained --- so we merely reduced some default numbers. Details about every directive can be found on Apache's website, and tips specific to the event MPM here. Keep in mind that started servers consume an amount of memory regardless of how busy they are. The MaxRequestWorkers directive sets the limit on the number of simultaneous requests allowed. Setting MaxConnectionsPerChild to a value other than zero is important, because it prevents a possible memory leak.

<IfModule mpm_event_module>
    StartServers            1
    MinSpareThreads        30
    MaxSpareThreads        75
    ThreadLimit            64
    ThreadsPerChild        30
    MaxRequestWorkers      80
    MaxConnectionsPerChild 80
</IfModule>

Then we restarted the server with sudo service apache2 restart (if we change some directives, like ThreadLimit, we will need to stop and start the service explicitly, with sudo service apache2 stop; sudo service apache2 start).

Our tests on Pingdom now showed page load time reduced by more than half:

The post Apache vs Nginx Performance: Optimization Techniques appeared first on SitePoint.

Categories: IT News

Making Your Website Faster and Safer with Cloudflare

Tue, 06/12/2018 - 20:00

Cloudflare is an industry leader in the content-delivery space, reducing load and speeding up millions of websites.

What is peculiar about this provider is that it didn't start as a speed-up/performance tool, but was instead born from Project Honeypot, which was conceived as a spam and hacking protection service. To this day, this is one of Cloudflare's major selling points: DDoS detection and protection. Their algorithms take note of visitors' IP addresses, payloads, resources requested, and request frequency to detect malicious visitors.

Because it sits as a proxy between websites and all incoming traffic, Cloudflare is able to reduce strain on servers significantly, so much so that DDoS attacks won't even reach the origin websites, as explained in this introduction. Cloudflare also provides the Always Online option, which caches a version of the user's website and serves a limited version of it in case of origin server outage --- when the original website returns 5xx or 4xx errors. It also features a full-fledged page cache.

These features can be a huge advantage: they can salvage a struggling web server under heavy load, and in case of server errors, can give some breathing room to developers to figure things out.

It's also available free. There are premium tiers, of course, and there are things (like additional page rules) that require paying, but the scope of Cloudflare's free tier alone makes it worthwhile to learn its ins and outs.

Comparison benchmarks put Cloudflare somewhere in the middle in regard to speed, but it would be hard to argue that it is the best value CDN on the market.

Setting Up Cloudflare

Setting a site up with Cloudflare is very straightforward. After registering at, we can add a new website. While the system scans for the given domain's IP and other details, we're offered an introductory video. Upon completion, we're given new nameservers to set up with our registrar.

We need to register these nameservers with our registrar and wait for changes to propagate across the internet. It may take up to 24 hours.

This change means giving all control over our domain to Cloudflare. This also means that, if we have email on this domain (MX records), we need to transfer these records to Cloudflare. If we have any subdomains, they also need to be set up as respective A records in Cloudflare's dashboard.

All existing domain records set up with our domain registrar or hosting provider need to be moved/copied to Cloudflare.

Some managed hosting providers may simplify/automate this transition process even more.

For each of our domain records, we can decide to simply let all the traffic pass through directly to our servers --- which means we can set exceptions for certain subdomains --- or we can turn off all Cloudflare functionality --- for example, while we're making some changes on the website.

Once we've set the domain up, that's basically all the work required outside of Cloudflare's dashboard. There's nothing more to do on the website itself, or the origin server. All further tuning is done on the Cloudflare website.

Setting up Encryption

An SSL certificate is part of the free plan on Cloudflare. There are four options for SSL setup, and we can find them under the Crypto tab in the dashboard.

  • OFF - this is self-explanatory. All traffic will be redirected to unsecured protocol (http)
  • FLEXIBLE - regardless of the protocol of our server, and whether we have an existing SSL certificate on it or not, Cloudflare will serve all our pages to end-visitors over https. Connections from Cloudflare to the origin server will go over an unsecured connection.
  • FULL - Cloudflare will communicate to your server via https, but won't validate certificates on the origin. Traffic from Cloudflare to visitors is served over https.
  • FULL STRICT - Cloudflare will require valid (not self-signed) SSL certificates on the origin server. Traffic from Cloudflare to visitors is served over https.

With these settings, we need to make sure the setup is sensible because we have two layers between our end users and our server content, so omissions here can result in a redirect loop, or too many redirections which can end up slowing the website.

Cloudflare also offers the option to buy a custom certificate, and for premium users who require extra safety or care about their market image, it gives the option of uploading custom/premium certificates. This is a part of premium plans.

Securing the Website

This is one area where Cloudflare shines: it gives unprecedented value for free. Across the hosting landscape, DDoS protection is a premium service, and not always provided, even for paying customers.

Cloudflare offers unmetered DDoS protection on the free tier, together with some other, rather sophisticated tools that protect websites on an infrastructure level before malicious traffic even reaches it. It offers rate limiting --- throttling of visits
according to user-defined, customizable rules. It offers smart firewall rules, country blocks, browser integrity checks, captcha protections, and more.

Today, when botnets rule the internet and freshly installed websites or servers are sometimes drowned in brute-force break-in attempts within minutes of going online, when spammers automatize web comments, and referrer spam is rampant even without any break-ins, POST attacks and slow attacks utilizing unorthodox means are not rare. This kind of protection can make or break smaller- or medium-sized websites.

There's also scraping protection, denying certain resources to certain visitor profiles, or obfuscating emails.

Premium tiers offer even more options.

The post Making Your Website Faster and Safer with Cloudflare appeared first on SitePoint.

Categories: IT News

What Is a CDN and How Does It Work?

Mon, 06/11/2018 - 20:00

CDN - you keep seeing the acronym. Maybe in URLs, maybe on landing pages, but it never quite clicked - what are Content Delivery Networks, what do they do exactly?

We'll explain in this overview article, and demonstrate on two popular ones in followup posts.

CDN Basics

A CDN is a network of computers that delivers content.

More specifically, it's a bunch of servers geographically positioned between the origin server of some web content, and the user requesting it, all with the purpose of delivering the content faster by reducing latency. This is their primary purpose.

These geographically closer servers, also called PoPs or Points of Presence, also cache the cacheable content which removes a lot of the load from the origin server. There are different types of CDNs offering different kinds of services, and they can have differing network topology: scattered CDNs aim to have as many servers scattered around the world as possible. Akamai is one such CDN. Consolidated CDNs have fewer points, but bigger ones built for network performance, throughput, and DDoS resistance.

Types of CDNs

We said that their primary purpose was to reduce latency and speed up rendering. But in the modern world of 2MB images and 500kb JavaScript libraries that take 3 minutes to boot up on websites, this latency matters little. But there are other purposes to CDNs, too, which evolved over time.

Content-oriented CDNs

Initially, CDNs were just for static content (JS, CSS, HTML). You had to push content to them as you created/uploaded it (they didn't know they needed to update their cache with your content, not even as someone requested it).

Then, they added origin pulling, making things more automatic - this meant that a user requested the CDN's URL, and then the CDN requested the origin website's URL automatically, caching whatever it got back. Additionally, availability became an important factor. Many CDNs now cache a website's "last alive" state so that if the origin goes down, the CDNed content is still accessible to users, creating the illusion of stability until things return to normal.

Additionally, modern CDNs often offer auto-optimization layers which will automagically resize images and save them for future use based on the image size requested. This means that if your site has a 2MB header image and someone requests it on a 300px wide screen, the CDN will make a copy that's 30kb in size and 300px wide and serve that in the future to all mobile users, automatically making the site faster.

Security-oriented CDNs

The final layer of practicality added to CDNs was DDoS and bot protection. CDNs like Incapsula specialize in this.

As the CDN is the outermost layer of a website's infrastructure and the first recipient of traffic, it can detect DDoS attacks early and block them with special DDoS protection servers called scrubbers without them ever reaching the origin server and crashing it.

The post What Is a CDN and How Does It Work? appeared first on SitePoint.

Categories: IT News

HTTP/2: Background, Performance Benefits and Implementations

Fri, 06/08/2018 - 20:00

On top of the infrastructure of the internet --- or the physical network layers --- sits the Internet Protocol, as part of the TCP/IP, or transport layer. It's the fabric underlying all or most of our internet communications.

A higher level protocol layer that we use on top of this is the application layer. On this level, various applications use different protocols to connect and transfer information. We have SMTP, POP3, and IMAP for sending and receiving emails, IRC and XMPP for chatting, SSH for remote server access, and so on.

The best-known protocol among these, which has become synonymous with the use of the internet, is HTTP (hypertext transfer protocol). This is what we use to access websites every day. It was devised by Tim Berners-Lee at CERN as early as 1989. The specification for version 1.0 was released in 1996 (RFC 1945), and 1.1 in 1999.

The HTTP specification is maintained by the World Wide Web Consortium.

The first generation of this protocol --- versions 1 and 1.1 --- dominated the web up until 2015, when HTTP/2 was released and the industry --- web servers and browser vendors --- started adopting it.


HTTP is a stateless protocol, based on a request-response structure, which means that the client makes requests to the server, and these requests are atomic: any single request isn't aware of the previous requests. (This is why we use cookies --- to bridge the gap between multiple requests in one user session, for example, to be able to serve an authenticated version of the website to logged in users.)

Transfers are typically initiated by the client --- meaning the user's browser --- and the servers usually just respond to these requests.

We could say that the current state of HTTP is pretty "dumb", or better, low-level, with lots of "help" that needs to be given to the browsers and to the servers on how to communicate efficiently. Changes in this arena are not that simple to introduce, with so many existing websites whose functioning depends on backward compatibility with any introduced changes. Anything being done to improve the protocol has to be done in a seamless way that won't disrupt the internet.

In many ways, the current model has become a bottleneck with this strict request-response, atomic, synchronous model, and progress has mostly taken the form of hacks, spearheaded often by the industry leaders like Google, Facebook etc. The usual scenario, which is being improved on in various ways, is for the visitor to request a web page, and when their browser receives it from the server, it parses the HTML and finds other resources necessary to render the page, like CSS, images, and JavaScript. As it encounters these resource links, it stops loading everything else, and requests specified resources from the server. It doesn't move a millimeter until it receives this resource. Then it requests another, and so on.

The number of requests needed to load the world's biggest websites is often in the hundreds.

This includes a lot of waiting, and a lot of round trips during which our visitor sees only a white screen or a half-rendered website. These are wasted seconds. A lot of available bandwidth is just sitting there unused during these request cycles.

CDNs can alleviate a lot of these problems, but even they are nothing but hacks.

As Daniel Stenberg (one of the people working on HTTP/2 standardization) from Mozilla has pointed out, the first version of the protocol is having a hard time fully leveraging the capacity of the underlying transport layer, TCP.
Users who have been working on optimizing website loading speeds know this often requires some creativity, to put it mildly.

Over time, internet bandwidth speeds have drastically increased, but HTTP/1.1-era infrastructure didn't utilize this fully. It still struggled with issues like HTTP pipelining --- pushing more resources over the same TCP connection. Client-side support in browsers has been dragging the most, with Firefox and Chrome disabling it by default, or not supporting it at all, like IE, Firefox version 54+, etc.
This means that even small resources require opening a new TCP connection, with all the bloat that goes with it --- TCP handshakes, DNS lookups, latency… And due to head-of-line blocking, the loading of one resource results in blocking all other resources from loading.

A synchronous, non-pipelined connection vs a pipelined one, showing possible savings in load time.

Some of the optimization sorcery web developers have to resort to under the HTTP/1 model to optimize their websites include image sprites, CSS and JavaScript concatenation, sharding (distributing visitors' requests for resources over more than one domain or subdomain), and so on.

The improvement was due, and it had to solve these issues in a seamless, backward-compatible way so as not to interrupt the workings of the existing web.


In 2009, Google announced a project that would become a draft proposal of a new-generation protocol, SPDY (pronounced speedy), adding support to Chrome, and pushing it to all of its web services in subsequent years. Then followed Twitter and server vendors like Apache, nginx with their support, Node.js, and later came Facebook, and most CDN providers.

SPDY introduced multiplexing --- sending multiple resources in parallel, over a single TCP connection. Connections are encrypted by default, and data is compressed. The first preliminary tests in the SPDY white paper, performed on the top 25 sites, showed speed improvements from 27% to over 60%.

After it proved itself in production, SPDY version 3 became the basis for the first draft of HTTP/2, made by the Hypertext Transfer Protocol working group httpbis in 2015.

HTTP/2 aims to address the issues ailing the first version of the protocol --- latency issues --- by:

  • introducing compression of HTTP headers
  • implementing server push
  • multiplexing requests and responses over a single connection

It also aims to solve head-of-line blocking. The data it transfers is in binary format, improving its efficiency, and it requires encryption by default (or at least, this is a requirement imposed by major browsers).

Header compression is performed with the HPACK algorithm, solving the vulnerability in SPDY, and reducing web request sizes by half.

Server push is one of the features that aims to solve wasted waiting time, by serving resources to the visitor's browser before the browser requires it. This reduces the round trip time, which is a big bottleneck in website optimization.

Due to all these improvements, the difference in loading time that HTTP/2 brings to the table can be seen on this example page.

Savings in loading time become more apparent the more resources a website has.

The post HTTP/2: Background, Performance Benefits and Implementations appeared first on SitePoint.

Categories: IT News

The Complete Guide to WordPress Performance Optimization

Thu, 06/07/2018 - 20:00

According to, WordPress holds close to 50% of the CMS share of the world's top 1,000,000 websites. As for the ecommerce sphere, we're at 33% with WooCommerce. And if we cast a wider net, percentages go higher. Although we may complain that WordPress can get bloated, resource-heavy, and its data model leaves a lot to be desired, there is no denying that WordPress is everywhere.

WordPress can thank its simplicity and a low barrier to entry for this pervasiveness. It's easy to set up, and requires next to no technical knowledge. Hosting for WordPress can be found for as little as a couple of dollars per month, and the basic setup takes just a half hour of clicking. Free themes for WordPress abound, some with included WYSIWYG page builders.

Many look down on it, but in many ways we can thank WordPress for the growth of the internet and PHP, and many internet professionals have WP's gentle learning curve to thank for their careers.

But this ease of entry comes at a cost. Plenty of websites that proudly wear the WordPress badge were not done by professionals but by the cheapest developers. And often, it shows. Professional look and professional performance were afterthoughts.

One of the main points of feedback the owner of an aspiring high-quality website will get from a grudging professional is that performance and a professional look and feel shouldn't be afterthoughts. You can't easily paint or stick them over a website. Professional websites should be premeditated.

Above, a famous UK used car dealer, Ling's Cars, tried a unique way to make a kitsch marketing punchline. Unless you're REALLY sure about what you're doing, DO NOT try this at home

And this starts with…

Choice of Hosting

Typically, new users will go with products that are on the low-cost side, with most of the beginner-friendly bells and whistles. Considering the shady business practices by some big industry players in this arena, and the complaints and demands for site migration professionals coming from clients, this is a part of website setup that requires due attention.

We can divide WordPress hosting vendors into a few tiers.

Premium, WordPress-dedicated vendors like Kinsta whose plans start at $100 per month, or even higher-grade managed hosting like WordPress VIP by Automattic, may be worth their salt, but also may be out of reach for many website owners.

In the medium tier, Flywheel, A2 Hosting, SiteGround and Pantheon are among those considered reliable and performance-oriented, offering acceptable speed and a managed hosting service for the more price-conscious. Users here may get a bit less hand-holding, but these services usually strike an acceptable balance between a solid setup, price, and options for more advanced users. Not to forget, there is Cloudways, which is a hybrid between VPS and managed hosting. Those with their audience in Europe may look into Pilvia, as it offers a performant server stack and is pretty affordable.

There's an interesting survey of customer satisfaction with more prominent hosting vendors, published by Codeinwp.

For those of us not scared of the command line, there are VPS and dedicated-server vendors like Digital Ocean, Vultr, Linode, Amazon's Lightsail, Hetzner in Europe, and OVH. Hetzner is a German vendor known for its quality physical servers on offer, somewhat above the price of virtual servers, while OVH offers very cost-efficient virtual servers. For the price-conscious, OVH's subsidiary Kimsufi in Europe and Canada also offers bargain physical dedicated servers, and Host US has very affordable virtual servers.

With managed hosting, things to look for are a good server stack, good CDN integration, and of course SSD storage. Guaranteed resources, like with A2, are a big plus. The next thing to look for is SSH access. Tech-savvy users may also benefit from WP-CLI availability.

When choosing a VPS, the thing to look for is XEN or KVM virtualization over OpenVZ, because it mitigates the overselling of resources, helping guarantee that the resources you bought are really yours. It also provides better security.

EasyEngine is software that can make your entire VPS/WordPress installation a one-hour job.

Regarding the server stack, Nginx is preferred to Apache if we're pursuing performance, and PHP 7 is a must. If we really need Apache, using Nginx as a reverse proxy is a plus, but this setup can get complex.
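If you do go that route, a minimal sketch of Nginx proxying to an Apache backend (the port and server name are assumptions) might look like this:

server {
    listen 80;
    server_name example.com;

    location / {
        # Apache listens locally on 8080 and does the actual PHP work
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}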

Tests performed give PHP 7 a big edge over the previous version. According to

WordPress 4.1 executed 95% more requests per second on PHP 7 compared to PHP 5.6.

When choosing your hosting, be aware of negative experiences with some providers that have become notorious.

Software Considerations

Things that usually slow down WordPress websites are bulky, bloated front ends with a lot of static resources and database queries. These issues arise from the choice of theme (and its page builders, huge sliders, etc), which not only slows down initial loading due to many requests and overall size, but often slows down the browser itself with lots of JavaScript and content to render, making it unresponsive.

The golden rule here is: don't use it unless there's a good reason to.

This may seem like a rule coming from the mouth of Homer Simpson, but if you can skip any of the bells and whistles, do so. Be conservative. If you must add some shiny functionality or JS eye candy, always prefer those tailored and coded as specifically as possible for your exact need. If you're a skilled coder, and the project justifies the effort, code it yourself with minimalism in mind.

Review all the plugins your website can't live without --- and remove the others.

And most importantly: back up your website before you begin pruning!

Data model

If you have a theme where you use a lot of custom posts or fields, be warned that a lot of these will slow down your database queries. Keep your data model as simple as possible; if you can't, consider that WordPress' original intended purpose was as a blogging engine. If you need a lot more than that, you may want to consider one of the MVC web frameworks out there that will give you greater control over your data model and the choice of database.

In WordPress we can build a rich custom data model by using custom post types, custom taxonomies and custom fields, but be conscious of performance and complexity costs.

If you know your way around the code, inspect your theme to find unnecessary database queries. Every individual database trip spends precious milliseconds of your TTFB, and megabytes of your server's memory. Remember that secondary loops can be costly --- so be warned when using widgets and plugins that show extra posts, like in sliders or widget areas. If you must use them, consider fetching all the required posts in a single query (see the sketch below), where multiple queries might otherwise slow down your website. There's a GitHub repo for those not wanting to code this from scratch.
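As a rough illustration (the query arguments are only an example), a widget that needs a handful of extra posts can fetch them in one go and skip the caches it doesn't need:

$featured = new WP_Query( [
    'post_type'              => 'post',
    'posts_per_page'         => 5,
    'no_found_rows'          => true,  // skip the expensive "count all matching rows" step
    'update_post_term_cache' => false, // skip term caching if terms aren't displayed
    'update_post_meta_cache' => false, // skip meta caching if custom fields aren't displayed
] );

while ( $featured->have_posts() ) {
    $featured->the_post();
    the_title( '<h3>', '</h3>' );
}
wp_reset_postdata();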

Meta queries can be expensive

Using custom fields to fetch posts by some criteria can be a great tool to develop sophisticated things with WP. Fetching posts this way is known as a meta query, and you can find elaboration on its costs elsewhere. The summary: post meta wasn't built for filtering; taxonomies were.

get_post_meta is a function typically called to fetch custom fields, and it can be called with just the post ID as an argument, in which case it fetches all the post's meta fields in an array, or it can have a custom field's name as a second argument, in which case it returns just the specified field.

If you use get_post_meta() for a certain post multiple times on a single page or request (for multiple custom fields), be aware that this won't incur extra database cost, because the first time this function is called, all of the post's meta gets cached.
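A quick sketch of how that caching plays out (the post ID and field names are hypothetical):

// The first call primes WordPress' meta cache for post 42 with a single query.
$all_meta = get_post_meta( 42 );

// These follow-up calls are served from that cache --- no extra database trips.
$price = get_post_meta( 42, 'price', true );
$sku   = get_post_meta( 42, 'sku', true );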

Database hygiene

Installing and deleting various plugins, and changing different themes over the lifetime of your website, often clutters your database with a lot of data that isn't needed. It's completely possible to discover --- upon inspecting why a WordPress website is sluggish, or why it won't even load, due to exhausted server memory --- that the database has grown to hundreds and hundreds of megabytes, or over a gigabyte, with no content that explains it.

The wp_options table is where a lot of orphaned data usually gets left behind. This includes, but is not limited to, various transients (this post warns of best practices regarding deletion of transients in plugins). Transients are a form of cache, but as with any other caching, if misused they can do more harm than good. If your server environment provides it, WP-CLI has a command set dedicated to transient management, including deletion --- see the commands below. If not, there are plugins in the WordPress plugin repo that can delete expired transients, but which offer less control.
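For example, assuming WP-CLI is available on the server:

# delete only transients that have already expired
wp transient delete --expired

# or delete all transients (they'll be regenerated as needed)
wp transient delete --all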

If deleting transients still leaves us with a bloated database without any tangible cause, WP-Sweep is an excellent free tool that can do the job of cleaning up the database. Another one to consider is WP Optimize.

Before doing any kind of database cleanup, it's strongly recommended that you back up your database!

One of the plugins that comes in very handy for profiling of the whole WordPress request lifecycle is Debug Objects. It offers an inspection of all the transients, shortcodes, classes, styles and scripts, templates loaded, db queries, and hooks.

After ensuring a sane, performance-oriented setup --- considering our server stack in advance, eliminating the possible bloat created by theme choice and plugins and widgets overload --- we should try to identify bottlenecks.

If we test our website in a tool like Pingdom Speed Test, we'll get a waterfall chart of all the resources loaded in the request:

This gives us details about the request-response lifecycle, which we can analyze for bottlenecks. For instance:

  • If the pink DNS time above is too big, it could mean we should consider caching our DNS records for a longer period. This is done by increasing the TTL setting in our domain management/registrar dashboard.
  • If the SSL part is taking too long, we may want to consider enabling HTTP/2 to benefit from ALPN, adjusting our cache-control headers, and finally switching to a CDN service. “Web Performance in a Nutshell: HTTP/2, CDNs and Browser Caching” is a thorough article on this topic, as is “Analyzing HTTPS Performance Overhead” by KeyCDN.
  • Connect, Send, and Receive parts usually depend on network latency, so these can be improved by hosting close to your intended audience, making sure your host has a fast uplink, and using a CDN. For these items, you may want to consider a ping tool too (not to be confused with the Pingdom tools mentioned above), to make sure your server is responsive.
  • The Wait part --- the yellow part of the waterfall --- is the time your server infrastructure takes to produce or return the requested website. If this part takes too much time, you may want to return to our previous topics of optimizing the server, WordPress installation, and database stack. Or you may consider various layers of caching.

For more extensive testing and hands-on tips on improving the website, there's a little command-line utility called webcoach. In an environment with Node.js and npm installed (like Homestead Improved), installing it is simple:

npm install webcoach -g

After it's been installed, we can get detailed insights and advice on how to improve our website's various aspects, including performance:

The post The Complete Guide to WordPress Performance Optimization appeared first on SitePoint.

Kategóriák: IT Hírek

An Introduction to MongoDB

h, 05/28/2018 - 21:00

MongoDB is an open-source, document-oriented, NoSQL database program. If you’ve been involved with the traditional, relational databases for long, the idea of a document-oriented, NoSQL database might indeed sound peculiar. “How can a database not have tables?”, you might wonder. This tutorial introduces you to some of the basic concepts of MongoDB and should help […]

Continue reading %An Introduction to MongoDB%

Kategóriák: IT Hírek

How to Fix Magento Login Issues with Cookies and Sessions

h, 05/21/2018 - 08:00

This article was created in partnership with Ktree. Thank you for supporting the partners who make SitePoint possible.

In this article we're looking at how Magento cookies can create issues with the login functionality of both the customer-facing front end and the admin back end, why this occurs, and how it should be resolved.

This is also known as the looping issue, as the screen redirects itself to the same screen, even though the username and password are correct.

A script is provided at the end of the article which can help detect a few of the issues. Feel free to use and modify as per your needs.

What is a Cookie?

A cookie is a small piece of text that a web server can store on a user's hard drive and later retrieve. Magento uses cookies in its cart and backend admin functionality, and they may be the source of a few problems when you're unable to log in to Magento.

What is a Session?

A session is an array variable on the server side, which stores information to be used across multiple pages. For example, items added to the cart are typically saved in sessions, and when the user browses the checkout page they are read from the session.

Sessions are identified by a unique ID. Its name changes depending on the programming language --- in PHP it's called the 'PHP session ID'. As you might have guessed, the same PHP session ID needs to be stored as a cookie in the client browser so the server can relate subsequent requests to the session.
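As a plain-PHP sketch of that relationship (outside Magento, purely to illustrate the mechanism):

<?php
// Starts (or resumes) a session and sends a Set-Cookie header
// containing the session ID (PHPSESSID by default).
session_start();

// Stored server-side, keyed by that session ID.
$_SESSION['cart'][] = 'SKU-123';

// On the next request the browser sends the cookie back,
// and PHP loads the matching $_SESSION data again.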

Magento's storage of Sessions

Magento can store sessions via multiple session providers, and this is configured in the Magento config file at app/etc/local.xml, where the session provider is chosen. For example, file-based sessions:


<session_save><![CDATA[files]]></session_save>
<session_save_path>
    <![CDATA[/tmp/session]]>
</session_save_path>


Allowing sessions to store themselves in the database is done in /app/etc/local.xml by adding <session_save><![CDATA[db]]></session_save>.

Magento then stores the sessions in the core_session table.


Storing sessions in Redis is configured like this:

<session_save>db</session_save>
<redis_session>
    <host></host>
    <port>6379</port>
</redis_session>


And storing them in Memcache:

<session_save><![CDATA[memcache]]></session_save>
<session_save_path>
    <![CDATA[tcp://localhost:11211?persistent=1&weight=2&timeout=10&retry_interval=10]]>
</session_save_path>

Magento Usage

Magento uses two different cookies, named 'frontend' and 'adminhtml'. The first one is created when any page is browsed and is updated whenever the customer logs in; the second one is created when a backend user logs in. You can check whether the cookies have been created via Inspect Element > Application, as in the picture below (from Chrome):

Cookies are configured in Magento via the Configuration admin menu - System > Configuration > General > Web.

Problem: Login Fails & Redirects to Login Page

If you haven't experienced this problem, then you haven't worked with Magento long enough!

This is how it typically happens: when you log in by entering your username and password, you're redirected back to the same login page and URL, with a nonce ID appended to the URL in your browser. This happens for both the customer front-end and the Magento back-end login.

Let's look at a few reasons why this happens, and how we should resolve those issues.

Reason #1: Cookie domain does not match server domain

Let's say the domain your Magento site is served from is not the same as the cookie domain configured in Magento.

In this scenario, both Magento cookies will have their Domain value set to the configured cookie domain, but when validating the session Magento will consider the domain through which the site was actually accessed. Since it won't be able to find an active session whose domain value matches, it will redirect the user to the login page even when valid credentials are provided.


After login or logout, the Magento system will regenerate the session using the following script:

public function renewSession()
{
    $this->getCookie()->delete($this->getSessionName());
    $this->regenerateSessionId();

    $sessionHosts = $this->getSessionHosts();
    $currentCookieDomain = $this->getCookie()->getDomain();
    if (is_array($sessionHosts)) {
        foreach (array_keys($sessionHosts) as $host) {
            // Delete cookies with the same name for parent domains
            if (strpos($currentCookieDomain, $host) > 0) {
                $this->getCookie()->delete($this->getSessionName(), null, $host);
            }
        }
    }

    return $this;
}


Magento will validate the session for every request with the following method:

public function init($namespace, $sessionName = null)
{
    if (!isset($_SESSION)) {
        $this->start($sessionName);
    }
    if (!isset($_SESSION[$namespace])) {
        $_SESSION[$namespace] = array();
    }

    $this->_data = &$_SESSION[$namespace];

    $this->validate();
    $this->revalidateCookie();

    return $this;
}

You'll typically see this when you migrate your Magento instance from one domain to another --- for example from production to staging --- and forget to change the cookie domain.

Note: you can run the provided cookieTest.php script, which validates what the server cookie domain is, and what is set in the Magento config.


Change the Cookie Domain via the Configuration admin menu. Go to System > Configuration > General > Web, as per the screenshot.

Alternatively you can change this by running these SQL queries.

For validating the cookie domain use this select query to get the configuration:

SELECT * FROM core_config_data WHERE path = 'web/cookie/cookie_domain';

After executing this query, verify that the 'value' column matches your domain, and update the value if it doesn't.

To update the cookie domain, use this query:

UPDATE core_config_data SET VALUE = "" WHERE path = 'web/cookie/cookie_domain';

Reason #2: Multiple subdomains used and Magento's cookie configuration is incorrect

Let's say your site is Logging into works fine.

But on your staging/QA site, for example, you are unable to login without deleting all cookies. The system may allow logins to, but when we login again to, your next click on kicks you back to the login page. Similar behavior is experienced for customers using the front-end login as well.

Solution 1

Option A: If your main domain and subdomains are hosted on the same server

Continue reading %How to Fix Magento Login Issues with Cookies and Sessions%

Kategóriák: IT Hírek

8 Tips for Improving Bootstrap Accessibility

h, 02/12/2018 - 17:00

A few years ago, I wrote about my experiences on developing a Bootstrap version 3 project to be fully accessible for people with disabilities. This focussed mostly on how accessible it is in terms of front-end design. (It didn’t cover accessibility in terms of screen readers, as that’s a whole other story.)

While I could see that the developers behind Bootstrap were making an effort, there were a few areas where this popular UI library fell short. I could also see that there were issues raised on the project that showed they were actively improving — which is fantastic, considering that approximately 3.6% of websites use Bootstrap.

Recently, Bootstrap version 4 was released, so let’s take a look and see if any of the issues I had in the past have improved.

What We’re Looking For with Design Accessibility

There are a few things to consider when designing a website with accessibility in mind. I believe these improve the user experience for everyone and will likely cover a lot of points you would consider anyway.


One way to achieve accessibility is by having a clean, easy-to-use layout that looks good on all devices, as well as looking good at a high zoom level. Up to 200% is a good guide.

Bonus points: having front-end code that matches the layout is also good for users who access the Web with a screen reader or by using a keyboard instead of a mouse.

This allows people to use your website easily irrespective of how they’re viewing it.

Continue reading %8 Tips for Improving Bootstrap Accessibility%

Kategóriák: IT Hírek

PHP-FPM tuning: Using ‘pm static’ for Max Performance

sze, 11/29/2017 - 18:00

Let's take a very quick look at how best to set up PHP-FPM for high throughput, low latency, and more stable use of CPU and memory. By default, most setups have PHP-FPM’s PM (process manager) string set to dynamic, and there’s also the common advice to use ondemand if you suffer from memory availability issues. However, let's compare these two management options based on the documentation, and also compare my favorite setting for high-traffic setups --- pm static:

  • pm = dynamic: the number of child processes is set dynamically based on the following directives: pm.max_children, pm.start_servers, pm.min_spare_servers, pm.max_spare_servers.

  • pm = ondemand: the processes spawn on demand when requested, as opposed to dynamic, where pm.start_servers are started when the service is started.

  • pm = static: the number of child processes is fixed by pm.max_children.

See the full list of global php-fpm.conf directives for further details.

PHP-FPM Process Manager (PM) Similarities to CPUFreq Governor

Now, this may seem a bit off topic, but I hope to tie it back into our PHP-FPM tuning topic. Okay, we’ve all had slow CPU issues at some point, whether it be a laptop, VM or dedicated server. Remember CPU frequency scaling? (CPUFreq governor.) These settings, available on both *nix and Windows, can improve the performance and system responsiveness by changing the CPU governor setting from ondemand to performance. This time, let's compare the descriptions and look for similarities:

  • Governor = ondemand: scales CPU frequency dynamically according to current load. Jumps to the highest frequency and then scales down as the idle time increases.

  • Governor = conservative: scales the frequency dynamically according to current load. Scales the frequency more gradually than ondemand.

  • Governor = performance: always run the CPU at the maximum frequency.

See the full list of CPUFreq governor options for further details.

Notice the similarities? I wanted to use this comparison first, with the aim of finding the best way to write an article which recommends using pm static for PHP-FPM as your first choice.

With CPU governor, the performance setting is a pretty safe performance boost because it’s almost entirely dependent on your server CPU’s limit. The only other factors would be things such as heat, battery life (laptop) and other side effects of clocking your CPU frequency to 100% permanently. Once set to performance, it is indeed the fastest setting for your CPU. For example read about the ‘force_turbo’ setting on Raspberry Pi, which forces your RPi board to use the performance governor where performance improvement is more noticeable due to the low CPU clock speeds.
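Assuming a Linux box with the cpupower utility installed (package names vary by distribution), switching governors looks roughly like this:

# check the currently active cpufreq policy/governor
cpupower frequency-info --policy

# switch all cores to the performance governor
sudo cpupower frequency-set -g performance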

Using ‘pm static’ to Achieve Your Server’s Max Performance

The PHP-FPM pm static setting depends heavily on how much free memory your server has. Basically, if you are suffering from low server memory, then pm ondemand or dynamic may be better options. On the other hand, if you have the memory available, you can avoid much of the PHP process manager (PM) overhead by setting pm static to the max capacity of your server. In other words, when you do the math, pm.max_children should be set to the maximum number of PHP-FPM processes that can run without creating memory availability or cache pressure issues --- but also not so high as to overwhelm the CPU(s) and leave a pile of pending PHP-FPM operations.

In the screenshot above, this server has pm = static and pm.max_children = 100, which uses a max of around 10GB of the 32GB installed. Take note of the self-explanatory highlighted columns. During that screenshot there were about 200 ‘active users’ (past 60 seconds) in Google Analytics. At that level, about 70% of PHP-FPM children are still idle.

This means PHP-FPM is always set to the max capacity of your server’s resources, regardless of current traffic. Idle processes stay online, waiting for traffic spikes and responding immediately, rather than having to wait on the pm to spawn children and then kill them off once pm.process_idle_timeout expires.

I have pm.max_requests set extremely high, because this is a production server with no PHP memory leaks. You can use pm.max_requests = 0 with static if you have 110% confidence in your current and future PHP scripts. However, it’s recommended to restart scripts over time. Set the number of requests to a high number, since the point is to avoid pm overhead --- for example at least pm.max_requests = 1000, depending on your number of pm.max_children and number of requests per second.
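As a rough sketch of a pool configuration along those lines (the path, child count and request count are illustrative --- do the memory math for your own server):

; pool configuration, e.g. /etc/php/7.2/fpm/pool.d/www.conf (path varies by distro and PHP version)
[www]
pm = static
; children = RAM you can dedicate to PHP-FPM / average memory per child process
pm.max_children = 100
; recycle each child after this many requests to guard against slow memory leaks
pm.max_requests = 1000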

The screenshot uses Linux top filtered by ‘u’ (user) option and the name of the PHP-FPM user. The number of processes displayed are only the ‘top’ 50 or so (didn’t count), but basically top displays the top stats which fit in your terminal window --- in this case, sorted by %CPU. To view all 100 PHP-FPM processes you can use something like:

top -bn1 | grep php-fpm

Continue reading %PHP-FPM tuning: Using ‘pm static’ for Max Performance%

Kategóriák: IT Hírek

23 Development Tools for Boosting Website Performance

k, 11/28/2017 - 18:00

When dealing with performance, it's hard to remember all the tools that might help you out during development. For that purpose, we've compiled a list of 23 performance tools for your reference. Some you'll have heard of, others probably not. Some have been covered in detail in our performance month, others are yet to be covered in future articles; but all are very useful and should be part of your arsenal.

Client-side Performance Tools

1. Test your Mobile Speed with Google

Google’s Test My Site is an online tool offered by Google and powered by the popular website performance tool

You can either view your report on the site or have it emailed to you.

The tool gives you your website loading time (or Speed Index) calculated using a Chrome browser on a Moto G4 device within a 3G network. It also gives you the estimated percentage of visitors lost due to loading time. Among other things it also:

  • compares your site speed with the top-performing sites in your industry
  • gives you top fixes that can help you speed up your website loading time.
2. is an open-source tool --- or a set of tools --- that can help you measure your website performance and improve it.

Image source:

  • Coach: gives you performance advice and fixes for your website based on best practices.
  • Browsertime: collects metrics and HAR files from your browser.
  • Chrome-HAR: helps you compare HAR files.
  • PageXray: extracts different metrics (from HAR files) such as size, number of requests, and so on.

You can install these tool(s) using npm:

npm install -g --help

Or Docker:

docker run --shm-size=1g --rm -v "$(pwd)":/ sitespeedio/ --video --speedIndex

3. Lighthouse by Google

Lighthouse is an open-source tool for running audits to improve web page quality. It's integrated into Chrome's DevTools and can also be installed as a Chrome extension or CLI-based tool. It's an indispensable tool for measuring, debugging and improving the performance of modern, client-side apps (particularly PWAs).

You can find the extension from the Chrome Web Store.

Or you can install Lighthouse, from npm, on your system with:

npm install -g lighthouse

Then run it with:

lighthouse <url>

You can use Lighthouse programmatically to build your own performance tool or for continuous integration.

Make sure to check these Lighthouse-based tools:

  • webpack-lighthouse-plugin: a Lighthouse plugin for Webpack
  • treo: Lighthouse as a service with a personal free plan.
  • calibreapp: a paid service, based on Lighthouse, that helps you track, understand and improve performance metrics using real Google Chrome instances.
  • lighthouse-cron: a module which can help you track your Lighthouse scores and metrics over time.

We've got an in-depth look at Lighthouse in our PWA performance month post.

4. Lightcrawler

You can use Lightcrawler to crawl your website then run each page found through Lighthouse.

Start by installing the tool via npm:

npm install --save-dev lightcrawler

Then run it from the terminal by providing the target URL and a JSON configuration file:

lightcrawler --url <url> --config lightcrawler-config.json

The configuration file can be something like:

{ "extends": "lighthouse:default", "settings": { "crawler": { "maxDepth": 2, "maxChromeInstances": 5 }, "onlyCategories": [ "Performance", ], "onlyAudits": [ "accesskeys", "time-to-interactive", "user-timings" ] } } 5. YSlow

YSlow is a JavaScript bookmarklet that can be added to your browser and invoked on any visited web page. This tool analyzes web pages and helps you discover the reasons for slowness based on Yahoo's rules for high-performance websites.

Image source:

You can install YSlow by dragging and dropping the bookmarklet to your browser’s bookmark bar. Find more information here.

6. GTmetrix

GTmetrix is an online tool that gives you insights into your website performance (fully loaded time, total page size, number of requests etc.) and also practical recommendations on how to optimize it.

7. Page Performance

Page performance is a Chrome extension that can be used to run a quick performance analysis. If you have many tabs open, the extension will be invoked on the active tab.

8. The AMP Project

The AMP (Accelerated Mobile Pages) project is an open-source project that aims to make the web faster. The AMP project enables developers to create websites that are fast, high-performing and with great user experiences across all platforms (desktop browsers and mobile devices).

The AMP project is essentially three core components:

  • AMP HTML: it's HTML but with some restrictions to guarantee reliable performance.
  • AMP JS: a JavaScript library that takes care of rendering AMP HTML.
  • AMP Cache: a content delivery network for caching and delivering valid AMP pages. You can use tools such as AMP Validator or amphtml-validator to check if your pages are valid AMP pages.

Once you add AMP markup to your pages, Google will discover them automatically and cache them to deliver them through the AMP CDN. You can learn from here how to create your first AMP page.

Continue reading %23 Development Tools for Boosting Website Performance%

Kategóriák: IT Hírek

Case Study: Optimizing CommonMark Markdown Parser with

cs, 11/23/2017 - 07:12

As you may know, I am the author and maintainer of the PHP League's CommonMark Markdown parser. This project has three primary goals:

  1. fully support the entire CommonMark spec
  2. match the behavior of the JS reference implementation
  3. be well-written and super-extensible so that others can add their own functionality.

This last goal is perhaps the most challenging, especially from a performance perspective. Other popular Markdown parsers are built using single classes with massive regex functions. As you can see from this benchmark, it makes them lightning fast:

Library                     Avg. Parse Time    File/Class Count
Parsedown 1.6.0             2 ms               1
PHP Markdown 1.5.0          4 ms               4
PHP Markdown Extra 1.5.0    7 ms               6
CommonMark 0.12.0           46 ms              117

Unfortunately, because of the tightly-coupled design and overall architecture, it's difficult (if not impossible) to extend these parsers with custom logic.

For the League's CommonMark parser, we chose to prioritize extensibility over performance. This led to a decoupled object-oriented design which users can easily customize. This has enabled others to build their own integrations, extensions, and other custom projects.

The library's performance is still decent --- the end user probably can't differentiate between 42ms and 2ms (you should be caching your rendered Markdown anyway). Nevertheless, we still wanted to optimize our parser as much as possible without compromising our primary goals. This blog post explains how we used Blackfire to do just that.

Profiling with Blackfire

Blackfire is a fantastic tool from the folks at SensioLabs. You simply attach it to any web or CLI request and get this awesome, easy-to-digest performance trace of your application's request. In this post, we'll be examining how Blackfire was used to identify and optimize two performance issues found in version 0.6.1 of the league/commonmark library.

Let's start by profiling the time it takes league/commonmark to parse the contents of the CommonMark spec document:

Later on we'll compare this benchmark to our changes in order to measure the performance improvements.

Quick side-note: Blackfire adds overhead while profiling things, so the execution times will always be much higher than usual. Focus on the relative percentage changes instead of the absolute "wall clock" times.

Optimization 1

Looking at our initial benchmark, you can easily see that inline parsing with InlineParserEngine::parse() accounts for a whopping 43.75% of the execution time. Clicking this method reveals more information about why this happens:

Here we see that InlineParserEngine::parse() is calling Cursor::getCharacter() 79,194 times --- once for every single character in the Markdown text. Here's a partial (slightly-modified) excerpt of this method from 0.6.1:

public function parse(ContextInterface $context, Cursor $cursor)
{
    // Iterate through every single character in the current line
    while (($character = $cursor->getCharacter()) !== null) {
        // Check to see whether this character is a special Markdown character
        // If so, let it try to parse this part of the string
        foreach ($matchingParsers as $parser) {
            if ($res = $parser->parse($context, $inlineParserContext)) {
                continue 2;
            }
        }

        // If no parser could handle this character, then it must be a plain text character
        // Add this character to the current line of text
        $lastInline->append($character);
    }
}

Blackfire tells us that parse() is spending over 17% of its time checking every. single. character. one. at. a. time. But most of these 79,194 characters are plain text which don't need special handling! Let's optimize this.

Instead of adding a single character at the end of our loop, let's use a regex to capture as many non-special characters as we can:

public function parse(ContextInterface $context, Cursor $cursor)
{
    // Iterate through every single character in the current line
    while (($character = $cursor->getCharacter()) !== null) {
        // Check to see whether this character is a special Markdown character
        // If so, let it try to parse this part of the string
        foreach ($matchingParsers as $parser) {
            if ($res = $parser->parse($context, $inlineParserContext)) {
                continue 2;
            }
        }

        // If no parser could handle this character, then it must be a plain text character
        // NEW: Attempt to match multiple non-special characters at once.
        // We use a dynamically-created regex which matches text from
        // the current position until it hits a special character.
        $text = $cursor->match($this->environment->getInlineParserCharacterRegex());

        // Add the matching text to the current line of text
        $lastInline->append($text);
    }
}

Once this change was made, I re-profiled the library using Blackfire:

Okay, things are looking a little better. But let's actually compare the two benchmarks using Blackfire's comparison tool to get a clearer picture of what changed:

This single change resulted in 48,118 fewer calls to that Cursor::getCharacter() method and an 11% overall performance boost! This is certainly helpful, but we can optimize inline parsing even further.

Continue reading %Case Study: Optimizing CommonMark Markdown Parser with

Kategóriák: IT Hírek

How to Optimize Docker-based CI Runners with Shared Package Caches

k, 11/21/2017 - 18:00

At Unleashed Technologies we use Gitlab CI with Docker runners for our continuous integration testing. We've put significant effort into speeding up the build execution speeds. One of the optimizations we made was to share a cache volume across all the CI jobs, allowing them to share files like package download caches.

Continue reading %How to Optimize Docker-based CI Runners with Shared Package Caches%

Kategóriák: IT Hírek

How to Optimize SQL Queries for Faster Sites

h, 11/20/2017 - 18:00

This article was originally published on the Delicious Brains blog, and is republished here with permission.

You know that a fast site == happier users, improved ranking from Google, and increased conversions. Maybe you even think your WordPress site is as fast as it can be: you've looked at site performance, from the best practices of setting up a server, to troubleshooting slow code, and offloading your images to a CDN, but is that everything?

With dynamic, database-driven websites like WordPress, you might still have one problem on your hands: database queries slowing down your site.

In this post, I’ll take you through how to identify the queries causing bottlenecks, how to understand the problems with them, along with quick fixes and other approaches to speed things up. I’ll be using an actual query we recently tackled that was slowing things down on the customer portal of


The first step in fixing slow SQL queries is to find them. Ashley has sung the praises of the debugging plugin Query Monitor on the blog before, and it’s the database queries feature of the plugin that really makes it an invaluable tool for identifying slow SQL queries. The plugin reports on all the database queries executed during the page request. It allows you to filter them by the code or component (the plugin, theme or WordPress core) calling them, and highlights duplicate and slow queries:

If you don’t want to install a debugging plugin on a production site (maybe you’re worried about adding some performance overhead), you can opt to turn on the MySQL Slow Query Log, which logs all queries that take longer than a certain amount of time to execute. This is relatively simple to configure, including where to log the queries to. As this is a server-level tweak, the performance hit will be less than that of a debugging plugin on the site, but it should be turned off when not in use.
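As a sketch, the slow query log can be switched on at runtime like this (the one-second threshold and log path are illustrative; add the equivalent settings to my.cnf so they survive a restart):

-- enable the slow query log for the running server
SET GLOBAL slow_query_log = 'ON';
-- log any query taking longer than 1 second
SET GLOBAL long_query_time = 1;
-- where to write the log
SET GLOBAL slow_query_log_file = '/var/log/mysql/mysql-slow.log';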


Once you have found an expensive query that you want to improve, the next step is to try to understand what is making the query slow. Recently, during development on our site, we found a query that was taking around 8 seconds to execute!

SELECT
    l.key_id, l.order_id, l.activation_email, l.licence_key, l.software_product_id,
    l.software_version, l.activations_limit, l.created, l.renewal_type, l.renewal_id,
    l.exempt_domain, s.next_payment_date, s.status,
    pm2.post_id AS 'product_id', pm.meta_value AS 'user_id'
FROM oiz6q8a_woocommerce_software_licences l
    INNER JOIN oiz6q8a_woocommerce_software_subscriptions s ON s.key_id = l.key_id
    INNER JOIN oiz6q8a_posts p ON p.ID = l.order_id
    INNER JOIN oiz6q8a_postmeta pm ON pm.post_id = p.ID AND pm.meta_key = '_customer_user'
    INNER JOIN oiz6q8a_postmeta pm2 ON pm2.meta_key = '_software_product_id' AND pm2.meta_value = l.software_product_id
WHERE p.post_type = 'shop_order'
    AND pm.meta_value = 279
ORDER BY s.next_payment_date

We use WooCommerce and a customized version of the WooCommerce Software Subscriptions plugin to run our plugins store. The purpose of this query is to get all subscriptions for a customer where we know their customer number. WooCommerce has a somewhat complex data model, in that even though an order is stored as a custom post type, the id of the customer (for stores where each customer gets a WordPress user created for them) is not stored as the post_author, but instead as a piece of post meta data. There are also a couple of joins to custom tables created by the software subscriptions plugin. Let’s dive in to understand the query more.

MySQL is your Friend

MySQL has a handy statement DESCRIBE which can be used to output information about a table’s structure such as its columns, data types, defaults. So if you execute DESCRIBE wp_postmeta; you will see the following results:

Field        Type                  Null    Key    Default    Extra
meta_id      bigint(20) unsigned   NO      PRI    NULL       auto_increment
post_id      bigint(20) unsigned   NO      MUL    0
meta_key     varchar(255)          YES     MUL    NULL
meta_value   longtext              YES            NULL

That’s cool, but you may already know about it. But did you know that the DESCRIBE statement prefix can actually be used on SELECT, INSERT, UPDATE, REPLACE and DELETE statements? This is more commonly known by its synonym EXPLAIN and will give us detailed information about how the statement will be executed.

Here are the results for our slow query:

id   select_type   table   type     possible_keys              key        key_len   ref                           rows    Extra
1    SIMPLE        pm2     ref      meta_key                   meta_key   576       const                         28      Using where; Using temporary; Using filesort
1    SIMPLE        pm      ref      post_id,meta_key           meta_key   576       const                         37456   Using where
1    SIMPLE        p       eq_ref   PRIMARY,type_status_date   PRIMARY    8                                       1       Using where
1    SIMPLE        l       ref      PRIMARY,order_id           order_id   8                                       1       Using index condition; Using where
1    SIMPLE        s       eq_ref   PRIMARY                    PRIMARY    8         deliciousbrainsdev.l.key_id   1       NULL

At first glance, this isn’t very easy to interpret. Luckily the folks over at SitePoint have put together a comprehensive guide to understanding the statement.

The most important column is type, which describes how the tables are joined. If you see ALL, that means MySQL is reading the whole table from disk, increasing I/O rates and putting load on the CPU. This is known as a “full table scan” (more on that later).

The rows column is also a good indication of what MySQL is having to do, as this shows how many rows it has looked in to find a result.

Explain also gives us more information we can use to optimize. For example, for the pm2 table (wp_postmeta), it is telling us we are Using filesort, because we are asking for the results to be sorted using an ORDER BY clause on the statement. If we were also grouping the query, we would be adding even more overhead to the execution.

Visual Investigation

MySQL Workbench is another handy, free tool for this type of investigation. For databases running on MySQL 5.6 and above, the results of EXPLAIN can be outputted as JSON, and MySQL Workbench turns that JSON into a visual execution plan of the statement:

It automatically draws your attention to issues by coloring parts of the query by cost. We can see straight away that the join to the wp_woocommerce_software_licences table (alias l) has a serious issue.

Continue reading %How to Optimize SQL Queries for Faster Sites%

Kategóriák: IT Hírek

How to Read Big Files with PHP (Without Killing Your Server)

cs, 11/16/2017 - 19:00

It’s not often that we, as PHP developers, need to worry about memory management. The PHP engine does a stellar job of cleaning up after us, and the web server model of short-lived execution contexts means even the sloppiest code has no long-lasting effects.

There are rare times when we may need to step outside of this comfortable boundary --- like when we're trying to run Composer for a large project on the smallest VPS we can create, or when we need to read large files on an equally small server.

It’s the latter problem we'll look at in this tutorial.

The code for this tutorial can be found on GitHub.

Measuring Success

The only way to be sure we’re making any improvement to our code is to measure a bad situation and then compare that measurement to another after we’ve applied our fix. In other words, unless we know how much a “solution” helps us (if at all), we can’t know if it really is a solution or not.

There are two metrics we can care about. The first is CPU usage. How fast or slow is the process we want to work on? The second is memory usage. How much memory does the script take to execute? These are often inversely proportional --- meaning that we can offload memory usage at the cost of CPU usage, and vice versa.

In an asynchronous execution model (like with multi-process or multi-threaded PHP applications), both CPU and memory usage are important considerations. In traditional PHP architecture, these generally become a problem when either one reaches the limits of the server.

It's impractical to measure CPU usage inside PHP. If that’s the area you want to focus on, consider using something like top, on Ubuntu or macOS. For Windows, consider using the Linux Subsystem, so you can use top in Ubuntu.

For the purposes of this tutorial, we’re going to measure memory usage. We’ll look at how much memory is used in “traditional” scripts. We’ll implement a couple of optimization strategies and measure those too. In the end, I want you to be able to make an educated choice.

The methods we’ll use to see how much memory is used are:

// formatBytes is taken from the documentation

memory_get_peak_usage();

function formatBytes($bytes, $precision = 2) {
    $units = array("b", "kb", "mb", "gb", "tb");

    $bytes = max($bytes, 0);
    $pow = floor(($bytes ? log($bytes) : 0) / log(1024));
    $pow = min($pow, count($units) - 1);

    $bytes /= (1 << (10 * $pow));

    return round($bytes, $precision) . " " . $units[$pow];
}

We’ll use these functions at the end of our scripts, so we can see which script uses the most memory at one time.

What Are Our Options?

There are many approaches we could take to read files efficiently. But there are also two likely scenarios in which we could use them. We could want to read and process data all at the same time, outputting the processed data or performing other actions based on what we read. We could also want to transform a stream of data without ever really needing access to the data.

Let’s imagine, for the first scenario, that we want to be able to read a file and create separate queued processing jobs every 10,000 lines. We’d need to keep at least 10,000 lines in memory, and pass them along to the queued job manager (whatever form that may take).

For the second scenario, let’s imagine we want to compress the contents of a particularly large API response. We don’t care what it says, but we need to make sure it’s backed up in a compressed form.

In both scenarios, we need to read large files. In the first, we need to know what the data is. In the second, we don’t care what the data is. Let’s explore these options…

Reading Files, Line By Line

There are many functions for working with files. Let’s combine a few into a naive file reader:

// from memory.php
function formatBytes($bytes, $precision = 2) {
    $units = array("b", "kb", "mb", "gb", "tb");

    $bytes = max($bytes, 0);
    $pow = floor(($bytes ? log($bytes) : 0) / log(1024));
    $pow = min($pow, count($units) - 1);

    $bytes /= (1 << (10 * $pow));

    return round($bytes, $precision) . " " . $units[$pow];
}

print formatBytes(memory_get_peak_usage());

// from reading-files-line-by-line-1.php
function readTheFile($path) {
    $lines = [];
    $handle = fopen($path, "r");

    while (!feof($handle)) {
        $lines[] = trim(fgets($handle));
    }

    fclose($handle);
    return $lines;
}

readTheFile("shakespeare.txt");

require "memory.php";

We’re reading a text file containing the complete works of Shakespeare. The text file is about 5.5MB, and the peak memory usage is 12.8MB. Now, let’s use a generator to read each line:

// from reading-files-line-by-line-2.php
function readTheFile($path) {
    $handle = fopen($path, "r");

    while (!feof($handle)) {
        yield trim(fgets($handle));
    }

    fclose($handle);
}

readTheFile("shakespeare.txt");

require "memory.php";

The text file is the same size, but the peak memory usage is 393KB. This doesn’t mean anything until we do something with the data we’re reading. Perhaps we can split the document into chunks whenever we see two blank lines. Something like this:

// from reading-files-line-by-line-3.php
$iterator = readTheFile("shakespeare.txt");

$buffer = "";

foreach ($iterator as $iteration) {
    preg_match("/\n{3}/", $buffer, $matches);

    if (count($matches)) {
        print ".";
        $buffer = "";
    } else {
        $buffer .= $iteration . PHP_EOL;
    }
}

require "memory.php";

Any guesses how much memory we’re using now? Would it surprise you to know that, even though we split the text document up into 1,216 chunks, we still only use 459KB of memory? Given the nature of generators, the most memory we’ll use is that which we need to store the largest text chunk in an iteration. In this case, the largest chunk is 101,985 characters.

I’ve already written about the performance boosts of using generators and Nikita Popov’s Iterator library, so go check that out if you’d like to see more!

Generators have other uses, but this one is demonstrably good for performant reading of large files. If we need to work on the data, generators are probably the best way.

Piping Between Files

In situations where we don’t need to operate on the data, we can pass file data from one file to another. This is commonly called piping (presumably because we don’t see what’s inside a pipe except at each end … as long as it's opaque, of course!). We can achieve this by using stream methods. Let’s first write a script to transfer from one file to another, so that we can measure the memory usage:

// from piping-files-1.php
file_put_contents(
    "piping-files-1.txt",
    file_get_contents("shakespeare.txt")
);

require "memory.php";

Unsurprisingly, this script uses slightly more memory to run than the text file it copies. That’s because it has to read (and keep) the file contents in memory until it has written to the new file. For small files, that may be okay. When we start to use bigger files, not so much…

Let’s try streaming (or piping) from one file to another:

// from piping-files-2.php
$handle1 = fopen("shakespeare.txt", "r");
$handle2 = fopen("piping-files-2.txt", "w");

stream_copy_to_stream($handle1, $handle2);

fclose($handle1);
fclose($handle2);

require "memory.php";

This code is slightly strange. We open handles to both files, the first in read mode and the second in write mode. Then we copy from the first into the second. We finish by closing both files again. It may surprise you to know that the memory used is 393KB.

That seems familiar. Isn’t that what the generator code used to store when reading each line? That’s because the second argument to fgets specifies how many bytes of each line to read (and defaults to -1 or until it reaches a new line).

The third argument to stream_copy_to_stream is exactly the same sort of parameter (with exactly the same default). stream_copy_to_stream is reading from one stream, one line at a time, and writing it to the other stream. It skips the part where the generator yields a value, since we don’t need to work with that value.

Piping this text isn’t useful to us, so let’s think of other examples which might be. Suppose we wanted to output an image from our CDN, as a sort of redirected application route. We could illustrate it with code resembling the following:

// from piping-files-3.php
file_put_contents(
    "piping-files-3.jpeg",
    file_get_contents(
        ""
    )
);

// ...or write this straight to stdout, if we don't need the memory info

require "memory.php";

Imagine an application route brought us to this code. But instead of serving up a file from the local file system, we want to get it from a CDN. We may substitute file_get_contents for something more elegant (like Guzzle), but under the hood it’s much the same.

The memory usage (for this image) is around 581KB. Now, how about we try to stream this instead?

// from piping-files-4.php
$handle1 = fopen(
    "",
    "r"
);

$handle2 = fopen(
    "piping-files-4.jpeg",
    "w"
);

// ...or write this straight to stdout, if we don't need the memory info

stream_copy_to_stream($handle1, $handle2);

fclose($handle1);
fclose($handle2);

require "memory.php";

The memory usage is slightly less (at 400KB), but the result is the same. If we didn’t need the memory information, we could just as well print to standard output. In fact, PHP provides a simple way to do this:

$handle1 = fopen(
    "",
    "r"
);

$handle2 = fopen(
    "php://stdout",
    "w"
);

stream_copy_to_stream($handle1, $handle2);

fclose($handle1);
fclose($handle2);

// require "memory.php";

Other Streams

There are a few other streams we could pipe and/or write to and/or read from:

  • php://stdin (read-only)
  • php://stderr (write-only, like php://stdout)
  • php://input (read-only) which gives us access to the raw request body
  • php://output (write-only) which lets us write to an output buffer
  • php://memory and php://temp (read-write) are places we can store data temporarily. The difference is that php://temp will store the data in the file system once it becomes large enough, while php://memory will keep storing in memory until that runs out (see the short sketch below).
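Here's a minimal sketch of php://temp used as a scratch buffer (the maxmemory threshold is optional; without it a default of around 2MB applies):

// Keeps data in memory, spilling over to a temporary file beyond ~2MB.
$scratch = fopen("php://temp/maxmemory:2097152", "r+");

fwrite($scratch, "intermediate data we don't want to keep in a real file");

rewind($scratch);
print stream_get_contents($scratch);

fclose($scratch);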

Continue reading %How to Read Big Files with PHP (Without Killing Your Server)%

Kategóriák: IT Hírek

Your First PHP Code

k, 10/31/2017 - 18:00

The following is a short extract from our new book, PHP & MySQL: Novice to Ninja, 6th Edition, written by Tom Butler and Kevin Yank. It's the ultimate beginner's guide to PHP. SitePoint Premium members get access with their membership, or you can buy a copy in stores worldwide.

Now that you have your virtual server up and running, it’s time to write your first PHP script. PHP is a server-side language. This concept may be a little difficult to grasp, especially if you’ve only ever designed websites using client-side languages like HTML, CSS, and JavaScript.

A server-side language is similar to JavaScript in that it allows you to embed little programs (scripts) into the HTML code of a web page. When executed, these programs give you greater control over what appears in the browser window than HTML alone can provide. The key difference between JavaScript and PHP is the stage of loading the web page at which these embedded programs are executed.

Client-side languages like JavaScript are read and executed by the web browser after downloading the web page (embedded programs and all) from the web server. In contrast, server-side languages like PHP are run by the web server, before sending the web page to the browser. Whereas client-side languages give you control over how a page behaves once it’s displayed by the browser, server-side languages let you generate customized pages on the fly before they’re even sent to the browser.

Once the web server has executed the PHP code embedded in a web page, the result takes the place of the PHP code in the page. All the browser sees is standard HTML code when it receives the page, hence the name “server-side language.” Let’s look at a simple example of some PHP that generates a random number between 1 and 10 and then displays it on the screen:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Random Number</title>
  </head>
  <body>
    <p>Generating a random number between 1 and 10:
      <?php echo rand(1, 10); ?>
    </p>
  </body>
</html>

Most of this is plain HTML. Only the line between <?php and ?> is PHP code. <?php marks the start of an embedded PHP script and ?> marks its end. The web server is asked to interpret everything between these two delimiters and convert it to regular HTML code before it sends the web page to the requesting browser. If you right-click inside your browser and choose View Source (the text may be different depending on the browser you’re using) you can see that the browser is presented with the following:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Random Number</title>
  </head>
  <body>
    <p>Generating a random number between 1 and 10:
      5
    </p>
  </body>
</html>

Notice that all signs of the PHP code have disappeared. In its place the output of the script has appeared, and it looks just like standard HTML. This example demonstrates several advantages of server-side scripting …

  • No browser compatibility issues. PHP scripts are interpreted by the web server alone, so there’s no need to worry about whether the language features you’re using are supported by the visitor’s browser.
  • Access to server-side resources. In the example above, we placed a random number generated by the web server into the web page. If we had inserted the number using JavaScript, the number would be generated in the browser and someone could potentially amend the code to insert a specific number. Granted, there are more impressive examples of the exploitation of server-side resources, such as inserting content pulled out of a MySQL database.
  • Reduced load on the client. JavaScript can delay the display of a web page significantly (especially on mobile devices!) as the browser must run the script before it can display the web page. With server-side code, this burden is passed to the web server, which you can make as beefy as your application requires (and your wallet can afford).
  • Choice. When writing code that’s run in the browser, the browser has to understand how to run the code given to it. All modern browsers understand HTML, CSS and JavaScript. To write some code that’s run in the browser, you must use one of these languages. By running code on the server that generates HTML, you have a choice of many languages—one of which is PHP.
Basic Syntax and Statements

PHP syntax will be very familiar to anyone with an understanding of JavaScript, C, C++, C#, Objective-C, Java, Perl, or any other C-derived language. But if these languages are unfamiliar to you, or if you’re new to programming in general, there’s no need to worry about it.

Continue reading %Your First PHP Code%

Kategóriák: IT Hírek