时间: 2012-05-20 / 分类: 软件技巧 / 浏览次数: / 0个评论 发表评论

Three things you should never put in your database

As I’ve said in a few talks, the best way to improve your systems is by first notdoing "dumb things". I don’t mean you or your development staff is "dumb", it’seasy to overlook the implications of these types of decisions and not realize howbad they are
for maintainability let alone scaling. As a consultant I see this stuffall of the time and I have yet to ever see it work out well for anyone.

Images, files, and binary data

Your database supports BLOBs so it must be a good idea to shove your files in thereright? No it isn’t! Hell it isn’t even very convenient to use with many DB languagebindings.

There are a few of problems with storing files in your database:

  • read/write to a DB is always slower than a filesystem
  • your DB backups grow to be huge and more time consuming
  • access to the files now requires going through your app and DB layers

The last two are the real killers. Storing your thumbnail images in your database? Greatnow you can’t use nginx or another lightweight web server to serve them up.

Do yourself a favor and store a simple relative path to your files on disk in the database oruse something like S3 or any CDN instead.

Ephemeral data

Usage statistics, metrics, GPS locations, session data anything that is only useful to youfor a short period of time or frequently changes. If you find yourself DELETEing an hour,day, or weeks worth of some table with a cron job, you’re using the wrong tool
for the job.

Use redis,
anything elsethat is better suited to that type of work load. The same advice goes for aggregations of ephemeral data that doesn’t live for very long.

Sure it’s possible to use a backhoe to plant sometomatoes in the garden, but it’s far faster to grab the shovel in the garage than scheduletime with a backhoe and have it arrive at your place and dig. Use
the right tool(s) for thejob at hand.


This one seems ok on the surface and the "I might need to use a complex query on them atsome point in the future" argument seems to win people over. Storing your logs in a databaseisn’t a HORRIBLE idea, but storing them in the same database as your other
production datais.

Maybe you’re conservative with your logging and only emit one log line per web request normally.That is still generating a log INSERT for every action on your site that is competing for resourcesthat your users could be using. Turn up your logging to a verbose
or debug level and watch yourproduction database catch on fire!

Instead use something like Splunk,
or plain old rotating flat files for your logs. The few times you need to inspect them in odd ways,even to the point of having to write a bit of code to find your answers, is easily outweighed by the constant resources it puts on your system.

But wait, you’re a unique snowflake and your problem is SO different that it’s ok for you to doone of these three.
No you aren’t and no it really isn’t. Trust me.




我已经在很多演讲里说过,改进你的系统的最好的方法是先避免做“蠢事”。我并不是说你或你开发的东西“蠢”,只是有些决定很容易被人们忽略掉其暗含 的牵连,认识不到这样做对系统维护尤其是系统升级带来多大的麻烦。作为一个顾问,像这样的事情我到处都能见到,我还从来没有见过做出这样的决定的人有过好 的结果的。




  • 对数据库的读/写的速度永远都赶不上文件系统处理的速度
  • 数据库备份变的巨大,越来越耗时间
  • 对文件的访问需要穿越你的应用层和数据库层













你必须 登录后 才能留言!