Mass Assignment and Security in Ruby on Rails

The topic of security vulnerabilities exposed through attributes= in ActiveRecord::Base recently came up on the SF Ruby Meetup list. The related blog entry can be found here. It’s an important question about Rails, and one that a little information can help clarify what to worry about and what not to worry about.

First off, for anyone not familiar enough with Rails, what is attributes=?. It’s the magic method that allows mass assignment of attributes from a hash. It’s what lets you do things like the following with your active record classes:

@foo = Foo.new params[:foo]
@foo.update_attributes params[:foo]
# or, of course
@foo.attributes = params[:foo]

The attributes= method takes each key in the hash, and as long as your model provides a mutator method for that attribute (responds to “key=”, and, of course, each column in your table gets one of these mutators created by default). However, before attributes= begins processing the hash, it first passes it to remove_attributes_protected_from_mass_assignment (unless you’ve called attributes= with guard_protected_attributes set to false). By default, this will exclude the primary key column, the inheritance column (e.g. type), and the column named id (if your primary key has another name). Everything else is fair game.

Protecting Attributes

For simple models, the default behavior may be just fine. You don’t have to worry about your id or type columns getting changed, and anything else can be easily used with update_attributes. Things can get tricky, though, once you start adding in some of the extra goodies in Rails. Let’s say, for example, that I have a blog_posts table. I have a simple model that just inherits from ActiveRecord::Base and I’ve decided to add created_at and updated_at fields to get some automatic time stamping behavior. Let’s say the update action in my controller looks like this:

class BlogPostController < ApplicationController

  def update
    @blog_post = BlogPost.find params[:id]
    @blog_post.update_attributes(params[:blog_post])
  end

end

Here, any HTTP client can pass me a parameter named blog_post[created_at] and my creation time will be set to whatever value is passed in (any value passed in for updated_at will be overridden when the record is saved, so it’s not exposed in this particular scenario). That’s probably not what I want. I probably only want to allow the creation time to be set when the record is first saved.

There are a few different ways to protect my created_at attribute. ActiveRecord::Base provides two mechanisms for controlling which attributes are weeded out by remove_attributes_protected_from_mass_assignment. You can use attr_protected to specify a list of additional attributes which may not be used with attributes= (a black list). Alternatively, you can use attr_accessible to specify an exclusive list of attributes that may be used with attributes= (a white list). If you use attr_accessible, any attribute not in the list you specify cannot be used with mass assignment. Lastly, in this case we could also just mark created_at as a read-only attribute. Read-only attributes may be mass assigned when calling new, but not through other means. So, either one of these two additions to our model will prevent surprises happening to created_at:

class BlogPost < ActiveRecord::Base
  attr_protected :created_at
end

# or:

class BlogPost < ActiveRecord::Base
  attr_readonly :created_at
end

Why doesn’t Rails just protect these magical attributes by default? In my view, what protection you need for the time stamp fields, if any, is really application-specific. While most applications probably don’t want to allow these fields to be set outside of the model itself, that’s not always the case. In the case of my BlogPost example, if I were to add functionality allowing posts to be imported from other blogs, there might be a legitimate case where I want to allow created_at to be explicitly set (to preserve the value from the imported post). Leaving this decision to the application developer ultimately provides the most flexibility, but it does require some awareness of the consequences when using mass assignment.

Associations

A forged created_at value isn’t terribly dire for most applications, but plugins and associations can add new assignment methods to your model, some of which may cause much bigger security problems. Associations in particular can expose some security vulnerabilities for the naïve programmer. Coming back to my blog post example, let’s say I want to track the author of each post. I have a User model, and, for simplicity, I’m just going to let the author be tracked as the user property on a blog post:

class BlogPost < ActiveRecord::Base
  attr_readonly :created_at
  belongs_to :user
end

With no modification to my controller, I could wind up with a very nasty surprise if an HTTP client were to pass in a blog_post[user_id] parameter. In effect, it could allow anyone to spoof the author of a post or corrupt the data in my table by specifying an invalid user id. In this example, there’s really no case where I want to allow unchecked mass assignment to the user_id property, so changing my model to prohibit that will provide some added protection:

class BlogPost < ActiveRecord::Base
  attr_readonly :created_at
  attr_protected :user_id
  belongs_to :user
end

That leaves me with the issue of how to legitimately set the author on a blog post. For the sake of simplicity, let’s say I have the current user id stored in the session. We’ll imagine my application controller takes care of preventing users that haven’t logged in from accessing my controller, so I don’t have to worry about the id not being set. (Perhaps not the best implementation, but, again, it keeps this example simple.) Lastly, in my application I consider the author to be the user who creates the post, not any subsequent users who edit it, so I’ll do this work in the create method:

class BlogPostController < ApplicationController

  def create
    @blog_post = BlogPost.new(params[:blog_post])
    @blog_post.user = User.find session[:user_id]
    @blog_post.save
  end

  # update method remains unchanged ...

end

Here I’m actually using the user assignment method created by the association. I could assign directly to the user_id attribute, but for the cost of an additional database call, I’ll get an exception if I try to assign an invalid id. Whether that’s really worth doing or not depends and involves using some of your own judgement. What gets highlighted here, though, is yet another assignment method we’ve acquired. Do we need to worry about protecting the user= method as well? You certainly can protect it the same way. However, that method will throw an AssociationTypeMismatch exception if you try to assign anything other than a User model instance, which provides some protection from the kind of attack that user_id is vulnerable to. The one exception to this kind of type-based protection is classes which have association collections (reference many associated objects of the same type). The _ids methods that are added by has_many and has_and_belongs_to_many could be assigned to through mass assignment using HTTP parameters, so some thought should be put into whether or not to protect them in your application.

Other than using attr_protected, there is another way you could potentially safeguard user_idfrom malicious HTTP clients, and it builds on the kind of precaution in the previous controller example. You could simply validate the attribute. Here’s what that might look like:

class BlogPost < ActiveRecord::Base
  attr_readonly :created_at
  belongs_to :user

  protected

  def validate()
    errors.add(:user_id) unless (user_id.nil? || User.exists?(user_id))
  end
end

This, of course, doesn’t prevent an HTTP client from setting the user_id (unless the controller always explicitly assigns the user after a mass assignment), but it does prevent assigning an invalid user id, whether by the HTTP client or application code using the model, and it some applications that may be sufficient. It also has the added draw of containing the safeguards specific to a model within the model instead of letting them leak out into the controller.

At about this point in the discussion, I should point out that one blog entry referenced in the original thread has a very different take. The author of that post argues for using attr_accessible for every model, always. That’s certainly available in ActiveRecord for you to do that if you choose, but I fundamentally disagree with that as an approach and I feel it runs counter to the spirit of Rails. It solves the problem of potentially exposing an attribute for mass assignment without intending to do so, but at the cost of quite a bit of tedium on any sizable application. You are forced to enumerate all columns which can be used with mass assignment (which may well be most) for all tables and to update the list for every schema change. That’s highly error prone, and part of the whole point of the ActiveRecord approach is not configuring your entire schema in both the database and the model.

Context Sensitive Attributes

In many real-world applications it’s not just a matter of whether or not to allow an attribute to be assigned to, but when and who is allowed. In my blog example, I may decide that administrators are allowed to post as other users and can choose the author from a drop-down list. Naturally, I wouldn’t want to allow non-administrators to do so. In other words, the context of the assignment matters. In fact, this was the underlying issue of the original message in the thread. Easily enough you could limit mass assignment to the subset of attributes which can always be assigned to in any context and then manually deal with the exceptions in the controller. In my example that might look something like:

class BlogPostController < ApplicationController

  def update
    @blog_post = BlogPost.find params[:id]
    @blog_post.attributes = params[:blog_post]
    @blog_post.user_id = params[:blog_post][:user_id] if current_user.admin?
    @blog_post.save
  end

end

I’m assuming here that I’ve added a current_user method to the application controller that returns the user object for the currently logged in user. This handles the issue, but it’s not particularly elegant, and can quickly get unwieldy if, as in actual applications, many of these exceptions occur in many different places.

The temptation is to try to somehow make all the magic happen within the model. Another list member wound up blogging about this very issue here. I agree with his assessment that the model really isn’t a place to put that kind of contextual knowledge. Contextual information belongs in the controller. He suggests an approach that allows lists of sensitive parameters to be declared in the model and then selectively used to filter parameters by the controller. I think an ideal situation would allow the user to declare multiple filtering lists in the model and then selectively choose them from the controller, possibly applying them as part of a mass assignment call. And, of course, validation should still be applied for sensitive bits like foreign keys. Unfortunately, as far as I know, no plugin yet exists to do all of that magically.

So, to summarize a bit, I really see the question that spawned all of this to be touching on two slightly different issues: general security/data integrity issues associated with mass assignment and how to more elegantly handle filtering parameters for mass assignment based on context. Unfortunately, there really isn’t a single best practice for the latter, but I’d certainly be interested in hearing how others address it. I don’t, as some would seem to argue, feel that either attr_protected or attr_accessible have outlived their usefulness. Even in an application that strictly adheres to a strong division of responsibilities between the model and controller, they make sense for attributes that the model itself may wish to exclude from mass assignment, such as time stamps or other attributes which may have special meaning or behavior.

Comments

Thanks for your comments, and thanks for referencing my blog post. I added a comment on my blog referencing yours ;)

Your post is very well written, and describes the problem perfectly.

I'm a real stickler about a lot of things. Hence, I'm really a fan of being closed by default. Unless I explicitly say otherwise, things like @blog_post.user_id should never be updatable by attributes=.

That's why I really like the idea of using ActiveRecord::Base.send(:attr_accessible, nil). That makes it so that if any other engineer makes a mistake, he gets a bug (i.e. a field that isn't being updated) instead of a security vulnerability.

I'm definitely up for hearing about other approaches to solving the attributes= problem, but I really want things to be closed by default.

For instance, one of my only complaints about Rails is that you have to remember to call h() in your templates. The problem is that people aren't perfect. One thing I really liked about Genshi (which is another templating engine in Python) is that it would automatically escape HTML by default, unless it knew the HTML was coming from another template.

Similarly, most modern Linux distributions have all ports closed by default. The OpenBSD guys are some of the most security conscious guys on the planet, but even they have occasionally made mistakes. That's why it's better to not have sshd listening by default unless you decide you really need to, and you're going to make sure the system stays up to date.

Getting back to your post, I guess I'd rather have " @blog_post.user_id = params[:blog_post][:user_id] if current_user.admin?" in my controller than risk someone inadvertently leaving a backdoor open. Personally, I don't even find it all that ugly. It just says, "If you're an admin, you're also allowed to set the :user_id when you use this action."

It's okay if we disagree, though. Engineers generally do ;)

Comments are now closed for this post.