Here we provide the details of the Linux VM on which we deploy PostgreSQL.
Context
We ...
- have little Linux admin expertise
- are Linux users (bash, vi, sed, awk, grep, etc. do not scare us)
- have few resources (money, people, time)
Therefore...
- simple is important
- automation is important (we use bash and python)
The application is...
- not huge
- not complex
- not trivial either
- not particularly resource intensive
You should know that...
- overnight ETL jobs load about 300 MB of data
- overnight map tile regeneration is read intensive on the DB
The application itself...
- will update and insert an average of 10-100 "street address records" per day
- will issue 1K to 100K? reads per day (search, browse, report)
- does a modest amount of spatial processing (nearest streets, point in polygon, nearest addresses); see the sketch after this list
- has no known unresolved performance issues (we did have a Django ORM performance issue)
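To make "modest spatial processing" concrete, a nearest-streets query might look roughly like this (the streets table, the street_name and geom columns, and the coordinates are illustrative assumptions, not our actual schema):
psql --host localhost --port 5432 --username postgres --dbname eas_qa --command "SELECT street_name FROM streets ORDER BY ST_Distance(geom, ST_SetSRID(ST_MakePoint(-122.42, 37.77), 4326)) LIMIT 5;"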
Security
- the data is not sensitive
- the city's network does contain sensitive information
- the application is a business-critical system
Facilities
We have 2 data centers (DC): San Diego (SD) and San Francisco (SF).
SF is the primary; SD is the secondary.
We are running VMware 4.x (exact version to be confirmed).
Paul will pick up here tomorrow
We want a small core VM (< 1 GB?) so the VM copy/clone operations are network friendly and reasonably fast.
So I think a 10 GB PGDATA partition would be plenty for now.
We do not expect much growth since addresses do not change often.
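To sanity-check that estimate against the current footprint, something like the following reports the live database size (database name taken from the pg_dump example below; connection settings are assumptions):
psql --host localhost --port 5432 --username postgres --command "SELECT pg_size_pretty(pg_database_size('eas_qa'));"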
We are expecting to upgrade to PG 9.0 for our 1.1 release.
Failover will be via log file shipping.
If that proves problematic in any way (not expected), it is acceptable to lose 24 hours' worth of data.
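A rough sketch of what log file shipping could look like (the sd-standby hostname and archive path are assumptions; a real setup would also need pg_standby or standby_mode configured on the standby). On the primary's postgresql.conf:
archive_mode = on
archive_command = 'rsync -a %p sd-standby:/var/lib/pgsql/wal_archive/%f'
And on the standby's recovery.conf:
restore_command = 'cp /var/lib/pgsql/wal_archive/%f %p'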
We have room to build this VM at either data center.
We can give you access to the hypervisor, storage allocation, everything you need.
If we use the SD DC, you can get started right away.
If you use the SF DC, you'll need to wait for the VPN creds.
We (city staff) will want to look over your shoulder in web meetings to learn how to fish.
OS
For consistency we would like to use CentOS 5.5.
Disk Partitions
We would like separate disk partitions for:
- pg data
- transaction log files
- backups
- os
PGExperts shall recommend a size for each.
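As an illustrative starting point only (mount points and sizes below are our assumptions, pending that recommendation):
/                        8 GB   (OS)
/var/lib/pgsql/data     10 GB   (PGDATA)
/var/lib/pgsql/wal       5 GB   (transaction log files / WAL archive)
/backups                20 GB   (pg_dump files and archived WAL)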
When we do a pg_dump, e.g.
pg_dump --host localhost --port 5432 --username postgres --format custom --blobs --verbose --file eas_20110408_1409.dmp eas_qa
the file size is about 200 MB.
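For completeness, restoring that custom-format dump would look roughly like the following (the --clean flag and the target database being the same eas_qa are our assumptions):
pg_restore --host localhost --port 5432 --username postgres --dbname eas_qa --clean --verbose eas_20110408_1409.dmp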
Monitoring
We would like to implement monitoring that works the same way using the same technology in both SD and SF.
SF Ops have a new monitoring tool (name?) that they would prefer to use.
Will Ops still support the use of Nagios?
What does carinet have for monitoring?
(Ask Hema) If we need to simplify, can we cut corners on monitoring in SD (i.e., not monitor there)?
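Whichever tool wins, the actual check can be identical in both DCs. A minimal liveness-check sketch in bash (connection settings and database name are assumptions), callable from Nagios or cron:
#!/bin/bash
# Minimal PostgreSQL liveness check: exit 0 (OK) if the server answers a
# trivial query, exit 2 (CRITICAL in Nagios terms) otherwise.
if psql --host localhost --port 5432 --username postgres --dbname eas_qa \
        --tuples-only --command "SELECT 1;" >/dev/null 2>&1; then
    echo "OK - PostgreSQL is answering queries"
    exit 0
else
    echo "CRITICAL - PostgreSQL is not answering"
    exit 2
fi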