After a couple of months spent using and editing this blog’s engine I can say that Heroku’s cloud is not heaven.
Wordpress works like a glance but there are many problems and some of these aren’t easy to solve.

First of all: Heroku is SLOW! I know, I use the free plan but response time is too high. Also the 10$ hosting I used for many years is faster. I think it’s a problem related to Heroku stack. It works well when you use a lot of workers and you have a lot of traffic but when you have less traffic is awfully slow.

Anyway solve this problem is easy: you can use a CDN. Contents are cached and delivered through multiple servers located all over the world. It’s easy and transparent because it works at DNS level. I choose CloudFlare because has a free plan (and we widely use it at @thefool_it). In front of my blog is able to speed-up response time from about five second to less than one.

Another problem comes when you use plugins that try to “activate” itself like Jetpack or Google Analytics plugin. Heroku filters many server-to-server connections and activation through OAuth-like handshake fails. Some plugins offer a “manual” activation wizard, for everything else there’s not solution: you can edit and maintain the plugin’s code or drop it and use another one. Not fun.

Clouds are not heaven (yet).

ActiveRecord is an incredibly powerful tool but the Rails Guides doesn’t cover every possible situation and the ActiveRecord’s official documentation is huge. Find something you are looking for can be hard. If you have to do something strange and you have no time to search you have to hope someone had got the same problem and posted it on StackOverflow or on its own blog.

Recently I have to modelize a relation where a resource belongs to an entity and contemporary is related to N other entities.

The One-to-Many relation is easy: use belongs_to and has_many. The other part is harder because you need to use a connection table (HABTM doesn’t work) and you need to rename relation because its name is already taken.

You can use a connection table using through attribute:

has_many :connection_table
has_many :items, through: :connection_table

and rename a has_many through relation using source attribute:

has_many :related_items, through: :connection_table, source: :items

Problem solved:

class Resource < ActiveRecord::Base
belongs_to :entity
has_many :connections
has_many :related_entities,
through: :connections, source: :entity


class Entity < ActiveRecord::Base
has_many :resources
has_many :connections
has_many :related_resources,
through: :connections, source: :resource


class Connection < ActiveRecord::Base
belongs_to :entity
belongs_to :resource

Thanks to @olinicola for the advises 🙂

If you run a commercial webapp, probably you have to track access.

CloudFlare helps you to manage more connection but hides from you many informations about the client. If you try to log the IP address you always get the CloudFlare’s ones.

Common headers which nginx uses to forward original IP (X-Forwarded-For and X-Real-IP) contain the CloudFlare’s IP. The correct header where to look is HTTP_CF_CONNECTING_IP.

/* PHP */
# Rack

During last years I had to develop projects containing up to hundred of million of objects. Now I need to move ahead and scale up to several billions of objects, reaching the limit of “big-data” definition. The common implementation of relational model which I always used isn’t enough anymore.

We know that a standard single-machine instance of MySQL (which all web developers have used at least once) show its limit over the 100 millions of rows. I need to scale horizontally and also need most specific features to easily manage a huge amount of data.

This is not a limit of relational model. Other implementations (like PostgreSQL or Oracle) can easily scale over that limit. Unfortunately many operations you usually do on data (like joins and set operations) aren’t so fast to run with billion of records. I need something else.

So called “NoSQL databases” offer you more data model (document-oriented, columnar, key-value, graph and more) where you can store your data in an more efficient way. They also offer features like sharding, replication, caching and indexing out of the box.

I’m not a NoSQL expert so I can’t advise you if choose a DBMS instead of another is a good choice or not. I’m entering this world just now like many other developers but I think that polyglot persistence is the future. Store your data using more than one DBMS to fit your requirements and take advantage of features of each one is a smart choice.

Big-data and polyglot persistence are interesting topics. I found some interest books about these topics. They can be a high quality introduction.

Seven Databases in Seven Weeks
by Eric Redmond and Jim R. Wilson

Contains an overview about different kinds of data model with real-world example for each one: PostgreSQL (RDBMS), Riak and Redis (Key-Value), HBase (Column-oriented), MongoDB and CouchDB (Document-oriented) and Neo4j (Graph).

NoSQL Distilled
by Pramod J.Sadalage and Martin Fowler

Similarly to the previous one this book starts with overview about the NoSQL world. The first part analyze how different softwares implement key-features: data-modeling, distribution (to scaling horizontally) and replication (to keep if safe and analyzable) of data.

The second part focus on each different typology of DBMS and analyze how they implement concepts exposed in previous part.

Big Data Glossary
by Pete Warden

Big data is more that persistence. There are many other operations you can do on your data and many way to analyze results. If you aren’t familiar with concepts like MapReduce, Natural Language Processing and Machine Learning this book explain you the basics.

First 5 chapters are about storing big-data, other 6 chapters are about processing and refining data with focus on high-specific topics.

I’m a developer and I want to start a blog. There are many different engine I can chose: Jekill, Tumblr, Posterous and counting. I choose WordPress because it’s more user friendly IMHO. The problem is: self-hosted or managed?

As usual every choice has pros and cons.

If you choose a self-hosted solution you have to pay for it. Most of times developers can access to a friend’s VPS and deploy there but this require you to setup the environment (nginx, php-fpm, MySQL and so on) and keep everything up to date to avoid intrusions.

If you chose the managed solution you have to pay. offers a free plan but includes advertising and use of plugins is forbidden. The premium plan is better but is 99$ a year.

I tried both solutions. I deployed a version of WordPress on a VPS (thanks @dani_viga). Then you had to install nginx, install php-fpm, configure virtual host, fight against configuration problems, fight against permissions problems, fight against incompatible plugins and finally your new self-hosted WordPress blog is online (and you have a big headache). I’m a developer, not a Sysadmin. I don’t like to refine configuration, keep everything up to date. I just want something easier.

So I tried because I couldn’t find another provider with a free plan. Beautiful site, great wizard for setup blog, everything is up and running in five minutes. I start to write my first post but i didn’t like the syntax highlighter. There is no way to change it. I want to include a custom social streamer. There is no way to do it. I want to customize my template. There is no way to do it… I closed my account. I’m a developer: i don’t like stuff I can’t edit.

One minute before giving up I decided to try one last solution: Heroku. The Cedar stack supports PHP. I searched for a WordPress version ready to deploy on Heroku and I found this. It uses PostgreSQL and, after five minutes of setup, works like a glance. The only problem is the file upload but I solved using the WPRO Plugin (WordPress Read-Only) to upload file directly to Amazon S3.

Now my blog runs in the clouds 🙂