ActiveRecord is an incredibly powerful tool, but the Rails Guides don’t cover every possible situation and ActiveRecord’s official documentation is huge. Finding what you are looking for can be hard. If you have to do something unusual and you have no time to search, you have to hope someone has had the same problem and posted about it on StackOverflow or on their own blog.

Recently I had to model a relation where a resource belongs to one entity and, at the same time, is related to N other entities.

The one-to-many relation is easy: use belongs_to and has_many. The other part is harder because you need a connection table (HABTM doesn’t work) and you need to rename the relation because its name is already taken.

You can use a connection table with the through option:

has_many :connection_table
has_many :items, through: :connection_table

and rename a has_many through relation with the source option:

has_many :related_items, through: :connection_table, source: :items

Problem solved:

class Resource < ActiveRecord::Base
  belongs_to :entity
  has_many :connections
  has_many :related_entities,
           through: :connections, source: :entity
end

class Entity < ActiveRecord::Base
  has_many :resources
  has_many :connections
  has_many :related_resources,
           through: :connections, source: :resource
end

class Connection < ActiveRecord::Base
  belongs_to :entity
  belongs_to :resource
end
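
The models above need a connection table behind them. Here is a minimal sketch of a migration for it, assuming the conventional table name (connections) and foreign key columns (entity_id, resource_id). The class name CreateConnections and the unique index are my own additions for illustration, not something the models require.

```ruby
# Hypothetical migration for the Connection model.
# On Rails 5 and later the superclass needs a version suffix,
# e.g. ActiveRecord::Migration[7.0].
class CreateConnections < ActiveRecord::Migration
  def change
    create_table :connections do |t|
      t.integer :entity_id, null: false    # used by Connection belongs_to :entity
      t.integer :resource_id, null: false  # used by Connection belongs_to :resource
      t.timestamps
    end

    # Optional (an assumption): prevents linking the same pair twice.
    add_index :connections, [:entity_id, :resource_id], unique: true
  end
end
```

The resources table also needs its own entity_id column for the belongs_to side. With both in place, resource.entity should return the owner, while resource.related_entities returns the N entities linked through connections.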

Thanks to @olinicola for the advice 🙂

If you run a commercial web app, you probably have to track access.

CloudFlare helps you handle more connections but hides a lot of information about the client from you. If you try to log the IP address, you always get one of CloudFlare’s.

The common headers nginx uses to forward the original IP (X-Forwarded-For and X-Real-IP) contain CloudFlare’s IP. The correct header to look at is CF-Connecting-IP, which shows up in the server environment as HTTP_CF_CONNECTING_IP.

/* PHP */
$_SERVER['HTTP_CF_CONNECTING_IP']

# Rack
request.headers["HTTP_CF_CONNECTING_IP"]
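
If you want a fallback for requests that don’t pass through CloudFlare, the lookup can be wrapped in a small helper. This is only a sketch: client_ip is a hypothetical name, and falling back to REMOTE_ADDR is my assumption about what you want when the header is missing or invalid.

```ruby
require "ipaddr"

# Hypothetical helper: returns the visitor's IP from a Rack env hash.
# It prefers the CloudFlare header and falls back to REMOTE_ADDR.
# Values that are not valid IP addresses are skipped.
def client_ip(env)
  [env["HTTP_CF_CONNECTING_IP"], env["REMOTE_ADDR"]].each do |value|
    next if value.nil?

    begin
      return IPAddr.new(value.strip).to_s
    rescue IPAddr::Error
      next # not a valid address, try the next candidate
    end
  end

  nil # no usable address found
end
```

In a Rack or Rails app you would call it as client_ip(request.env). Keep in mind that any client can send this header, so it is only trustworthy when your server accepts traffic exclusively from CloudFlare.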

Over the last few years I had to develop projects containing up to hundreds of millions of objects. Now I need to move ahead and scale up to several billion objects, reaching the threshold of the “big data” definition. The common implementation of the relational model that I have always used isn’t enough anymore.

We know that a standard single-machine instance of MySQL (which every web developer has used at least once) starts to show its limits beyond 100 million rows. I need to scale horizontally, and I also need more specific features to manage a huge amount of data easily.

This is not a limit of the relational model. Other implementations (like PostgreSQL or Oracle) can easily scale beyond that limit. Unfortunately, many operations you usually run on data (like joins and set operations) aren’t fast enough with billions of records. I need something else.

So-called “NoSQL databases” offer more data models (document-oriented, columnar, key-value, graph and more) that let you store your data in a more efficient way. They also offer features like sharding, replication, caching and indexing out of the box.

I’m not a NoSQL expert, so I can’t tell you whether choosing one DBMS over another is a good idea. I’m entering this world just now, like many other developers, but I think polyglot persistence is the future. Storing your data in more than one DBMS to fit your requirements and take advantage of the features of each one is a smart choice.

Big data and polyglot persistence are interesting topics. I found some interesting books about them. They can be a high-quality introduction.

Seven Databases in Seven Weeks
by Eric Redmond and Jim R. Wilson

Contains an overview of different kinds of data models with a real-world example for each one: PostgreSQL (RDBMS), Riak and Redis (key-value), HBase (column-oriented), MongoDB and CouchDB (document-oriented) and Neo4j (graph).

NoSQL Distilled
by Pramod J. Sadalage and Martin Fowler

Like the previous one, this book starts with an overview of the NoSQL world. The first part analyzes how different systems implement key features: data modeling, distribution (to scale horizontally) and replication (to keep data safe and analyzable).

The second part focuses on each type of DBMS and analyzes how it implements the concepts presented in the first part.

Big Data Glossary
by Pete Warden

Big data is more than persistence. There are many other operations you can run on your data and many ways to analyze the results. If you aren’t familiar with concepts like MapReduce, Natural Language Processing and Machine Learning, this book explains the basics.

The first 5 chapters are about storing big data; the other 6 chapters are about processing and refining data, with a focus on highly specific topics.

I’m a developer and I want to start a blog. There are many different engines I can choose from: Jekyll, Tumblr, Posterous and more. I chose WordPress because it’s more user friendly IMHO. The problem is: self-hosted or managed?

As usual every choice has pros and cons.

If you choose a self-hosted solution you have to pay for it. Most of the time developers can get access to a friend’s VPS and deploy there, but this requires you to set up the environment (nginx, php-fpm, MySQL and so on) and keep everything up to date to avoid intrusions.

If you choose the managed solution you have to pay as well. WordPress.com offers a free plan, but it includes advertising and plugins are not allowed. The premium plan is better but costs $99 a year.

I tried both solutions. I deployed a version of WordPress on a VPS (thanks @dani_viga). You have to install nginx, install php-fpm, configure the virtual host, fight configuration problems, fight permission problems, fight incompatible plugins, and finally your new self-hosted WordPress blog is online (and you have a big headache). I’m a developer, not a sysadmin. I don’t like tweaking configuration and keeping everything up to date. I just want something easier.

So I tried WordPress.com because I couldn’t find another provider with a free plan. Beautiful site, great setup wizard, everything up and running in five minutes. I started to write my first post but I didn’t like the syntax highlighter. There is no way to change it. I wanted to include a custom social streamer. There is no way to do it. I wanted to customize my template. There is no way to do it… I closed my account. I’m a developer: I don’t like stuff I can’t edit.

One minute before giving up I decided to try one last solution: Heroku. The Cedar stack supports PHP. I searched for a WordPress version ready to deploy on Heroku and I found this. It uses PostgreSQL and, after five minutes of setup, works like a charm. The only problem is file upload, but I solved it using the WPRO plugin (WordPress Read-Only) to upload files directly to Amazon S3.

Now my blog runs in the clouds 🙂