Introduction
In order to do minimal changes to standard postgresql deploy (like install postGIS) you must be sure you have control of where you get your packages. This is not as straight forward as it should be. I encourage you to take a few minutes to read this through. First, your "disto repository" is not likely to be up to date with respect to postgresql packages. In other words, if you blindly type "yum install postgresql-devel", you may have trouble getting everything installed properly. I've seen errors regarding missing libraries and the like. To avoid the fuss, when you are installing postgresql packages, use the postgresql repositories instead of your ditsro repsository.
Here is how you do that. You should read this from top to bottom.
- http://yum.pgsqlrpms.org/howtoyum.php
Also read this; you can safley skip sections on this page. - http://www.postgresonline.com/journal/index.php?/archives/45-An-Almost-Idiots-Guide-to-PostgreSQL-YUM.html
And this is good too:
Section | |
---|---|
|
Include Page | ||||
---|---|---|---|---|
|
Pre-Installation
Before you install, you should make sure you have a clean slate.
First, look for other postgres installs. I found that my centos 5 comes with postgres 8.1 already installed.
I suggest that we remove these older versions before we install the newer ones.
If you really need both versions to work side by side - this wiki page won't help you.
Code Block |
---|
yum list postgres*
yum uninstall postgresql
yum uninstall postgresql_XXX
yum uninstall postgresql_YYY
|
If you installed from source you can go the the build directory and type
Code Block |
---|
make uninstall
|
Installation
While we developed EAS under postgresql 8.3.7, installing 8.3.7 was giving us trouble. 8.3.11 was available via YUM and once we used YUM the install went smoothly.Using YUM is the way to go.
I followed these instructions:
Once you have YUM configured to use only the postgres respositories and not the Centos repos (see above), the following steps provide a pain free installation.
Code Block |
---|
yum install postgresql
yum install postgresql-devel
yum install postgresql-server
yum install postgis
yum install postgresql-contrib
|
Additional Links
Initializing Database
Our Oracle trained DBAs decided to intialize the DB here
Code Block |
---|
/ur01/pgsql/data
|
Then we want to create 2 databases.
Code Block |
---|
createdb mad --owner postgres
createdb sfmaps --owner postgres
|
At this point, you can use pg_restore.
For more on that read
http://eas.googlecode.com/svn/trunk/releases/1_0_prod/README
Links
- http://www.postgresql.org/docs/8.3/static/creating-cluster.html
- http://www.postgresql.org/docs/8.3/static/runtime-config-file-locations.html
- http://archives.postgresql.org/pgsql-admin/2007-05/msg00104.php
Performance Tuning
The EAS databases (mad, sfmaps) need to support 3 sorts of activities
- relatively high I/O because it's geospatial
- light OLTP actvity from single django web application
- most connections from single django web application (no connection pooling)
- small number of simultaneous connections (estimate max of 50)
- heavy write activity during overnight ETL processes
- heavy read processes from geoserver to produce web maps, especially during cache seeding
The host VMs for the DB have different amounts of RAM as shown in the table below.
Here I discuss the settings in the postgresql.conf file.
The default settings are for low resource environments and will result in relatively poor performance.
These settings are my "best guess" based on some reading (see links below).
We'll need to do some monitoring in PROD to tune these values.
parameter | value | value | value | links (below) |
---|---|---|---|---|
VM GB RAM | 4 | 6 | 6 |
|
shared_buffers | 1GB | 1.5GB | 1.5GB | 1,2 |
work_mem | 16MB | 16MB | 16MB | 2 |
maintenance_work_mem | 128MB | 128MB | 128MB | 2 |
wal_buffers | 1MB | 1MB | 1MB | 2 |
checkpoint_segments | 6 | 6 | 6 | 2 |
random_page_cost | 2 | 2 | 2 | 2 |
seq_page_cost | 1 | 1 | 1 | 2 |
effective_cache_size | 2GB | 3GB | 3GB | 1 |
max_connections | 100 | 100 | 100 | default |
A restart is needed for these configurations to take affect.
Links
1 - http://www.westnet.com/~gsmith/content/postgresql/pg-5minute.htm
2 - http://workshops.opengeo.org/postgis-intro/tuning.html
3 - http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
4 - http://postgis.refractions.net/docs/
...
Did you do your reading?
Errors
Errors we've seen
error: ch06.html
Security - Quick Start
This is my best guess on security and users - feel free to recommend alternatives.
For the time being, I have modified
- hba.conf
- postgresql.conf.
so as to allow the postgres user to connect to any database from any of
- localhost
- geoserver
- web server
This needs to be substantially refined for production.
The following section should address this in complete detail.
Security - Production
The dataserver should be accessible only from the following:
- web server
- geo server
- localhost
We will use postgres's hba.conf to control this access.
Remember that each environment (DEV, QA, PROD) will have it's own trio of severs:
- web server
- geoserver
- dataserver
See This diagram for an example.
Get the IPs for the hba.conf from Henry or Paul.
We want to be as strict as is reasonable - not as strict as possible.
Database Accounts
I know we'll want at least 2 users
- postgres
- eas_user
Let's discuss if there is any value in having additional users such as:
- geoserver user
- etl_user
Postgres User
The postgres user will be
- used mostly by DBAs
- be used by developers in DEV and QA
- used in emergency cases by developers in PROD
EAS User
The eas_user is for users of the EAS application...
- needs to be able to read and write to all tables the mad and sfmaps databases
- should we use a group or role here?
- to make things simple, we'll use eas_user for access from geoserver
For any new users, we'll need to set the correct access privileges.
Until now, I have been using the postgres user everywhere (sorry!).
Installation Errors
error:
Code Block |
---|
configure: error: no acceptable C compiler found in $PATH |
...
Code Block |
---|
yum install zlib.x86_64 zlib-devel.x86_64
|
During pg_restore you may see the following:
Code Block |
---|
pg_restore --host localhost --port 5432 --username postgres --dbname mad ./mad-1_0-beta-2.backup pg_restore: [archiver (db)] Error while PROCESSING TOC: pg_restore: [archiver (db)] Error from TOC entry 734; 1255 116672554 FUNCTION st_simplifypreservetopology(geometry, double precision) postgres pg_restore: [archiver (db)] could not execute query: ERROR: could not find function "topologypreservesimplify" in file "/usr/lib64/pgsql/liblwgeom.so" Command was: CREATE FUNCTION st_simplifypreservetopology(geometry, double precision) RETURNS geometry AS '$libdir/liblwgeom', 'topolo... pg_restore: [archiver (db)] could not execute query: ERROR: function public.st_simplifypreservetopology(geometry, double precision) does not exist Command was: ALTER FUNCTION public.st_simplifypreservetopology(geometry, double precision) OWNER TO postgres; WARNING: errors ignored on restore: 2 [postgres@CentOSDB17882 ~]$ |
solution
Don't worry about these errors.
This is because we upgraded postgres and postgis.
Moreover, we do not use these functions.