Here we provide the details of the Linux VM on which we deploy PostgreSQL.


Context

About Us

We ...

  • have little Linux admin expertise
  • are Linux users (bash, vi, sed, awk, grep, etc. do not scare us)
  • have few resources (money, people, time)

...

  • simple is important
  • automation is important (we use bash and python)

About the Application

The application is...

  • not huge
  • not complex
  • not trivial either
  • not particularly resource intensive

Our next release will be version 1.1.
We are using PG 8.3.7.
Christophe will be upgrading us to 9.0 for our 1.1 release.

You should also know that...

  • overnight ETL jobs load about 300 MB of data
  • overnight map tile regeneration is read intensive on the DB

...

  • will update and insert an average of 10-100 "street address records" per day
  • will issue 1K to 100K? reads per day (search, browse, report)
  • does a modest amount of spatial processing (nearest streets, point in polygon, nearest addresses)
  • has no known unresolved performance issues (we did have a Django ORM performance issue)

Security
  • the data is not sensitive
  • the city's network does contain sensitive information
  • the application is a business critical system

Downtime

Although the application is business critical, our downtime requirements are fairly lightweight.
Generally we can be down for an hour without people getting very excited.
If the application is down, the city will not be able to do things like issue building permits.
But let me be clear - we do not want people to get excited.
Downtime is scheduled off hours and communicated to our users well in advance.

Failover

Here again, the application is business critical but our requirements are fairly lightweight.
We are allowed to lose no more than 24 hours' worth of data.
PG Experts has recommended we accomplish failover via log file shipping.
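As a starting point, log shipping is driven by WAL archiving settings in postgresql.conf on the primary. A minimal sketch; the archive destination and timeout below are illustrative assumptions on our part, not PG Experts' recommendation:

```
# postgresql.conf on the primary (sketch; path and timeout are assumptions)
archive_mode = on
archive_command = 'rsync %p standby:/var/lib/pgsql/wal_archive/%f'
archive_timeout = 3600   # force a WAL segment switch at least hourly,
                         # comfortably inside our 24-hour data-loss window
```

The standby would then replay the shipped segments; PG Experts should confirm the exact mechanism for our 8.3/9.0 versions.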

Facilities

We have 2 data centers (DC): San Diego (SD) and San Francisco (SF).
You'll need VPN credentials to access SF.
SD is "secondary", and is operated by Carinet.
SF is primary and is operated by the city.
We are running VMware 4.?

...

1.0
We are using CentOS 5.5.
We use Nagios for monitoring.

Now, About that VM...

We want a small core VM (< 1 GB?) so that VM copy/clone operations are network-friendly and reasonably fast.

...

OS

For consistency we would like to use CentOS 5.5.

Disk Partitions

We would like separate disk partitions for:

  • pg data
  • transaction log files
  • backups
  • os

PG Experts: we want a minimal Linux install (no desktop software, etc.).

Disk Partition Sizes

PG Experts shall recommend a size for each partition. For reference, when we do a pg_dump, e.g.
pg_dump --host localhost --port 5432 --username postgres --format custom --blobs --verbose --file eas_20110408_1409.dmp eas_qa
the resulting file is about 200 MB.
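Since we do not yet track how that dump size changes over time, a small helper like the following could be run after each dump to start collecting history. This is a sketch; the file names and paths in the example are made up for illustration:

```shell
# Append "YYYY-MM-DD <size in KB>" for a dump file to a log, so we can
# chart dump growth over time. Paths in the example are assumptions.
record_dump_size() {
  local dump="$1" log="$2"
  local size_kb
  size_kb=$(du -k "$dump" | cut -f1)
  echo "$(date +%Y-%m-%d) ${size_kb}" >> "$log"
}

# Example usage (hypothetical paths):
# record_dump_size /backup/eas_20110408_1409.dmp /backup/dump_size.log
```

A few months of this would give PG Experts real growth numbers instead of guesses.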

Monitoring

...

Transaction Logs

...

Backups

...

OS

...

PGDATA

...

I will naively propose 20 GB as the size we want for PGDATA.
Why?
When I look at disk usage under PGDATA, I see we are using between 2.9 GB (PROD) and 4.6 GB (DEV).
I do not have any data growth information.
But we know that street addresses do not change much in the city.
We do have one area of the application that (ab)uses audit tables, but I would be a little surprised if this became an issue.
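The arithmetic behind the 20 GB proposal, as a sketch; the headroom factor is my assumption, not a PG Experts recommendation:

```shell
# Back-of-the-envelope sizing: round today's worst-case usage up and
# multiply by a headroom factor for growth, bloat, and the audit tables.
current_gb=5        # ~4.6 GB on DEV today, rounded up
headroom_factor=4   # assumed headroom; PG Experts may advise otherwise
echo "proposed PGDATA size: $((current_gb * headroom_factor)) GB"
```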